1
Zeng Y, Zhang X, Wang J, Usui A, Ichiji K, Bukovsky I, Chou S, Funayama M, Homma N. Inconsistency between Human Observation and Deep Learning Models: Assessing Validity of Postmortem Computed Tomography Diagnosis of Drowning. J Imaging Inform Med 2024;37:1-10. [PMID: 38336949] [PMCID: PMC11169324] [DOI: 10.1007/s10278-024-00974-6] [Received: 08/28/2023] [Revised: 10/18/2023] [Accepted: 11/17/2023] [Indexed: 02/12/2024]
Abstract
Drowning diagnosis is a complicated part of the autopsy process, even with the assistance of autopsy imaging and on-site information from where the body was found. Previous studies have developed well-performing deep learning (DL) models for drowning diagnosis. However, the validity of these models was not assessed, raising doubts about whether the learned features accurately represent the medical findings observed by human experts. In this paper, we assessed the medical validity of DL models that had achieved high classification performance for drowning diagnosis. This retrospective study included autopsy cases aged 8-91 years who underwent postmortem computed tomography between 2012 and 2021 (153 drowning and 160 non-drowning cases). We first trained three DL models from a previous work and generated saliency maps that highlight important features in the input. To assess the validity of the models, pixel-level annotations were created by four radiological technologists and quantitatively compared with the saliency maps. All three models demonstrated high classification performance, with areas under the receiver operating characteristic curve of 0.94, 0.97, and 0.98, respectively. On the other hand, the assessment revealed unexpected inconsistency between the annotations and the models' saliency maps: the three models had around 30%, 40%, and 80% irrelevant area in their saliency maps, respectively, suggesting that the predictions of the DL models may be unreliable. These results underscore the need for careful assessment of DL tools, even those with high classification performance.
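The validity assessment above hinges on quantifying how much of a model's saliency falls outside the expert-annotated regions. A minimal sketch of such an "irrelevant area" measure; the function name, the top-fraction threshold, and the metric definition are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def irrelevant_saliency_fraction(saliency, annotation, top_frac=0.1):
    """Fraction of the top `top_frac` most-salient pixels that fall
    outside the expert-annotated region (hypothetical metric)."""
    k = max(1, int(top_frac * saliency.size))
    top_idx = np.argsort(saliency.ravel())[-k:]   # indices of top-k pixels
    inside = annotation.ravel()[top_idx] > 0      # True where annotated
    return 1.0 - inside.mean()
```

For a model whose saliency concentrates inside the annotations this approaches 0; a value near 0.8 would correspond to the worst model reported above.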
Affiliation(s)
- Yuwen Zeng
- Department of Radiological Imaging and Informatics, Tohoku University Graduate School of Medicine, Sendai, Japan
- Xiaoyong Zhang
- National Institute of Technology, Sendai College, Sendai, Japan
- Jiaoyang Wang
- Department of Intelligent Biomedical System Engineering, Graduate School of Biomedical Engineering, Tohoku University, Sendai, Japan
- Akihito Usui
- Department of Radiological Imaging and Informatics, Tohoku University Graduate School of Medicine, Sendai, Japan
- Kei Ichiji
- Department of Radiological Imaging and Informatics, Tohoku University Graduate School of Medicine, Sendai, Japan
- Ivo Bukovsky
- Faculty of Science, University of South Bohemia in Ceske Budejovice, Ceske Budejovice, Czech Republic
- Mechanical Engineering, Czech Technical University in Prague, Prague, Czech Republic
- Shuoyan Chou
- Department of Industrial Management, National Taiwan University of Science and Technology, Taipei, Taiwan
- Masato Funayama
- Department of Radiological Imaging and Informatics, Tohoku University Graduate School of Medicine, Sendai, Japan
- Noriyasu Homma
- Department of Radiological Imaging and Informatics, Tohoku University Graduate School of Medicine, Sendai, Japan
2
Venkatesh K, Mutasa S, Moore F, Sulam J, Yi PH. Gradient-Based Saliency Maps Are Not Trustworthy Visual Explanations of Automated AI Musculoskeletal Diagnoses. J Imaging Inform Med 2024. [PMID: 38710971] [DOI: 10.1007/s10278-024-01136-4] [Received: 02/08/2024] [Revised: 04/30/2024] [Accepted: 05/01/2024] [Indexed: 05/08/2024]
Abstract
Saliency maps are popularly used to "explain" decisions made by modern machine learning models, including deep convolutional neural networks (DCNNs). While the resulting heatmaps purportedly indicate important image features, their "trustworthiness," i.e., utility and robustness, has not been evaluated for musculoskeletal imaging. The purpose of this study was to systematically evaluate the trustworthiness of saliency maps used in disease diagnosis on upper extremity X-ray images. The underlying DCNNs were trained using the Stanford MURA dataset. We studied four trustworthiness criteria, namely (1) localization accuracy of abnormalities, (2) repeatability, (3) reproducibility, and (4) sensitivity to underlying DCNN weights, across six gradient-based saliency methods: Grad-CAM (GCAM), gradient explanation (GRAD), integrated gradients (IG), SmoothGrad (SG), smooth IG (SIG), and XRAI. Ground truth was defined by the consensus of three fellowship-trained musculoskeletal radiologists who each placed bounding boxes around abnormalities on a holdout saliency test set. Compared to radiologists, all saliency methods showed inferior localization (AUPRCs: 0.438 (SG)-0.590 (XRAI); average radiologist AUPRC: 0.816), repeatability (IoUs: 0.427 (SG)-0.551 (IG); average radiologist IoU: 0.613), and reproducibility (IoUs: 0.250 (SG)-0.502 (XRAI); average radiologist IoU: 0.613) on abnormalities such as fractures, orthopedic hardware insertions, and arthritis. Five methods (GCAM, GRAD, IG, SG, XRAI) passed the sensitivity test. Ultimately, no saliency method met all four trustworthiness criteria; therefore, we recommend caution and rigorous evaluation of saliency maps prior to their clinical use.
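Two of the criteria above, repeatability and reproducibility, reduce to overlap between saliency maps from repeated runs. A hedged sketch of IoU over the top-salient pixels; the quantile-thresholding choice and function name are assumptions, not necessarily the paper's exact protocol:

```python
import numpy as np

def saliency_iou(map_a, map_b, top_frac=0.1):
    """IoU of the top-`top_frac` pixels of two saliency maps, as a
    simple repeatability/reproducibility check."""
    def top_mask(m):
        return m >= np.quantile(m, 1.0 - top_frac)
    a, b = top_mask(map_a), top_mask(map_b)
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 1.0
```

Identical maps score 1.0; fully disjoint salient regions score 0.0, bracketing the 0.25-0.55 range reported above.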
Affiliation(s)
- Kesavan Venkatesh
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Simukayi Mutasa
- University of Maryland Medical Intelligent Imaging (UM2ii) Center, Department of Diagnostic Radiology and Nuclear Medicine, University of Maryland School of Medicine, 520 W Lombard St, Baltimore, MD, USA
- Fletcher Moore
- University of Maryland Medical Intelligent Imaging (UM2ii) Center, Department of Diagnostic Radiology and Nuclear Medicine, University of Maryland School of Medicine, 520 W Lombard St, Baltimore, MD, USA
- Jeremias Sulam
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Paul H Yi
- University of Maryland Medical Intelligent Imaging (UM2ii) Center, Department of Diagnostic Radiology and Nuclear Medicine, University of Maryland School of Medicine, 520 W Lombard St, Baltimore, MD, USA
3
Choukali MA, Amirani MC, Valizadeh M, Abbasi A, Komeili M. Pseudo-class part prototype networks for interpretable breast cancer classification. Sci Rep 2024;14:10341. [PMID: 38710757] [DOI: 10.1038/s41598-024-60743-x] [Received: 06/19/2023] [Accepted: 04/26/2024] [Indexed: 05/08/2024]
Abstract
Interpretability in machine learning has become increasingly important as machine learning is used in more and more applications, including those with high-stakes consequences such as healthcare, where interpretability has been regarded as a key to the successful adoption of machine learning models. However, the use of confounding or irrelevant information by deep learning models, even interpretable ones, poses critical challenges to their clinical acceptance. This has recently drawn researchers' attention to issues beyond the mere interpretation of deep learning models. In this paper, we first investigate the application of an inherently interpretable prototype-based architecture, known as ProtoPNet, to breast cancer classification in digital pathology and highlight its shortcomings in this application. We then propose a new method that uses more medically relevant information and makes more accurate and interpretable predictions. Our method leverages the clustering concept and implicitly increases the number of classes in the training dataset. The proposed method learns more relevant prototypes without any pixel-level annotated data. For a more holistic assessment, in addition to classification accuracy, we define a new metric for assessing the degree of interpretability based on the comments of a group of skilled pathologists. Experimental results on the BreakHis dataset show that the proposed method improves classification accuracy and interpretability by 8% and 18%, respectively. The proposed method can therefore be seen as a step toward implementing interpretable deep learning models for the detection of breast cancer using histopathology images.
Affiliation(s)
- Mehdi Chehel Amirani
- Department of Electrical and Computer Engineering, Urmia University, Urmia, Iran
- Morteza Valizadeh
- Department of Electrical and Computer Engineering, Urmia University, Urmia, Iran
- Ata Abbasi
- Cellular and Molecular Research Center, Cellular and Molecular Medicine Research Institute, Urmia University of Medical Sciences, Urmia, Iran
- Department of Pathology, Faculty of Medicine, Urmia University of Medical Sciences, Urmia, Iran
- Majid Komeili
- School of Computer Science, Carleton University, Ottawa, Canada
4
Mercolli L, Rominger A, Shi K. Towards quality management of artificial intelligence systems for medical applications. Z Med Phys 2024;34:343-352. [PMID: 38413355] [PMCID: PMC11156774] [DOI: 10.1016/j.zemedi.2024.02.001] [Received: 10/17/2022] [Revised: 02/05/2024] [Accepted: 02/06/2024] [Indexed: 02/29/2024]
Abstract
The use of artificial intelligence systems in clinical routine is still hampered by the necessity of medical device certification and/or by the difficulty of implementing these systems in a clinic's quality management system. In this context, the key questions for a user are how to ensure robust model predictions and how to appraise the quality of a model's results on a regular basis. In this paper we discuss conceptual foundations for the clinical implementation of a machine learning system and argue that both vendors and users should take certain responsibilities, as is already common practice for high-risk medical equipment. We propose the methodology from the AAPM Task Group 100 report No. 283 as a conceptual framework for developing a risk-driven quality management program for a clinical process that encompasses a machine learning system, and illustrate it with an example of a clinical workflow. Our analysis shows how the risk evaluation in this framework can accommodate artificial intelligence based systems independently of their robustness evaluation or the user's in-house expertise. In particular, we highlight how the degree of interpretability of a machine learning system can be systematically accounted for within the risk evaluation and in the development of a quality management system.
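The TG-100 methodology referenced above rests on failure modes and effects analysis (FMEA), where each failure mode is scored for occurrence, severity, and lack of detectability, and the product gives a risk priority number used to rank quality-management effort. A minimal illustration; the 1-10 scale follows common FMEA practice and this is not code from the paper:

```python
def risk_priority_number(occurrence, severity, detectability):
    """FMEA risk priority number: product of occurrence, severity, and
    (lack of) detectability scores, each on a 1-10 scale.
    Higher RPN = higher priority for quality-management attention."""
    for score in (occurrence, severity, detectability):
        if not 1 <= score <= 10:
            raise ValueError("each score must lie on the 1-10 scale")
    return occurrence * severity * detectability
```

For example, `risk_priority_number(4, 7, 5)` yields 140; failure modes are then addressed in descending RPN order.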
Affiliation(s)
- Lorenzo Mercolli
- Department of Nuclear Medicine, Inselspital, Bern University Hospital, University of Bern, Freiburgstrasse 18, CH-3010 Bern, Switzerland
- Axel Rominger
- Department of Nuclear Medicine, Inselspital, Bern University Hospital, University of Bern, Freiburgstrasse 18, CH-3010 Bern, Switzerland
- Kuangyu Shi
- Department of Nuclear Medicine, Inselspital, Bern University Hospital, University of Bern, Freiburgstrasse 18, CH-3010 Bern, Switzerland
5
Cerekci E, Alis D, Denizoglu N, Camurdan O, Ege Seker M, Ozer C, Hansu MY, Tanyel T, Oksuz I, Karaarslan E. Quantitative evaluation of saliency-based explainable artificial intelligence (XAI) methods in deep learning-based mammogram analysis. Eur J Radiol 2024;173:111356. [PMID: 38364587] [DOI: 10.1016/j.ejrad.2024.111356] [Received: 08/23/2023] [Revised: 12/10/2023] [Accepted: 02/02/2024] [Indexed: 02/18/2024]
Abstract
BACKGROUND Explainable Artificial Intelligence (XAI) is prominent in the diagnostics of opaque deep learning (DL) models, especially in medical imaging. Saliency methods are commonly used, yet there's a lack of quantitative evidence regarding their performance. OBJECTIVES To quantitatively evaluate the performance of widely utilized saliency XAI methods in the task of breast cancer detection on mammograms. METHODS Three radiologists drew ground-truth boxes on a balanced mammogram dataset of women (n = 1496 cancer-positive and negative scans) from three centers. A modified, pre-trained DL model was employed for breast cancer detection, using MLO and CC images. Saliency XAI methods, including Gradient-weighted Class Activation Mapping (Grad-CAM), Grad-CAM++, and Eigen-CAM, were evaluated. We utilized the Pointing Game to assess these methods, determining if the maximum value of a saliency map aligned with the bounding boxes, representing the ratio of correctly identified lesions among all cancer patients, with a value ranging from 0 to 1. RESULTS The development sample included 2,244 women (75%), with the remaining 748 women (25%) in the testing set for unbiased XAI evaluation. The model's recall, precision, accuracy, and F1-Score in identifying cancer in the testing set were 69%, 88%, 80%, and 0.77, respectively. The Pointing Game Scores for Grad-CAM, Grad-CAM++, and Eigen-CAM were 0.41, 0.30, and 0.35 in women with cancer and marginally increased to 0.41, 0.31, and 0.36 when considering only true-positive samples. CONCLUSIONS While saliency-based methods provide some degree of explainability, they frequently fall short in delineating how DL models arrive at decisions in a considerable number of instances.
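The Pointing Game evaluation described above can be sketched directly: a case counts as a hit when the saliency maximum falls inside a ground-truth box, and the score is the hit ratio over cases. A minimal illustration; the box format and function names are assumptions:

```python
import numpy as np

def pointing_game_hit(saliency, boxes):
    """One Pointing Game trial: hit if the saliency maximum lies inside
    any ground-truth box, given as (y0, x0, y1, x1)."""
    y, x = np.unravel_index(np.argmax(saliency), saliency.shape)
    return any(y0 <= y <= y1 and x0 <= x <= x1 for y0, x0, y1, x1 in boxes)

def pointing_game_score(saliency_maps, boxes_per_case):
    """Hit ratio over all cases; ranges from 0 (never) to 1 (always)."""
    hits = [pointing_game_hit(s, b) for s, b in zip(saliency_maps, boxes_per_case)]
    return sum(hits) / len(hits)
```

A score of 0.41 for Grad-CAM, as reported above, means the saliency maximum landed inside an annotated lesion in fewer than half of the cancer cases.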
Affiliation(s)
- Esma Cerekci
- Sisli Hamidiye Etfal Training and Research Hospital, Department of Radiology, Istanbul, Turkey
- Deniz Alis
- Acibadem Mehmet Ali Aydinlar University, School of Medicine, Department of Radiology, Istanbul, Turkey
- Nurper Denizoglu
- Acibadem Healthcare Group, Department of Radiology, Istanbul, Turkey
- Ozden Camurdan
- Acibadem Healthcare Group, Department of Radiology, Istanbul, Turkey
- Mustafa Ege Seker
- Acibadem Mehmet Ali Aydinlar University, School of Medicine, Istanbul, Turkey
- Caner Ozer
- Istanbul Technical University, Department of Computer Engineering, Istanbul, Turkey
- Muhammed Yusuf Hansu
- Istanbul Technical University, Department of Electronics and Communication Engineering, Istanbul, Turkey
- Toygar Tanyel
- Istanbul Technical University, Department of Biomedical Engineering, Istanbul, Turkey
- Ilkay Oksuz
- Istanbul Technical University, Department of Computer Engineering, Istanbul, Turkey
- Ercan Karaarslan
- Acibadem Mehmet Ali Aydinlar University, School of Medicine, Department of Radiology, Istanbul, Turkey
6
Kang E, Heo DW, Lee J, Suk HI. A Learnable Counter-Condition Analysis Framework for Functional Connectivity-Based Neurological Disorder Diagnosis. IEEE Trans Med Imaging 2024;43:1377-1387. [PMID: 38019623] [DOI: 10.1109/tmi.2023.3337074] [Indexed: 12/01/2023]
Abstract
To understand the biological characteristics of neurological disorders with functional connectivity (FC), recent studies have widely utilized deep learning-based models to identify disease and conducted post-hoc analyses via explainable models to discover disease-related biomarkers. Most existing frameworks consist of three stages, namely feature selection, feature extraction for classification, and analysis, where each stage is implemented separately. However, if the results at any stage lack reliability, they can cause misdiagnosis and incorrect analysis in subsequent stages. In this study, we propose a novel unified framework that systematically integrates diagnosis (i.e., feature selection and feature extraction) and explanation. Notably, we devised an adaptive attention network as a feature selection approach to identify individual-specific disease-related connections. We also propose a functional network relational encoder that summarizes the global topological properties of FC by learning inter-network relations without pre-defined edges between functional networks. Finally, our framework provides novel explanatory power for neuroscientific interpretation, termed counter-condition analysis: we simulated FC that reverses the diagnostic information (i.e., counter-condition FC), converting a normal brain to an abnormal one and vice versa. We validated the effectiveness of our framework on two large resting-state functional magnetic resonance imaging (fMRI) datasets, Autism Brain Imaging Data Exchange (ABIDE) and REST-meta-MDD, and demonstrated that our framework outperforms other competing methods for disease identification. Furthermore, we analyzed the disease-related neurological patterns based on counter-condition analysis.
7
Kim C, Gadgil SU, DeGrave AJ, Omiye JA, Cai ZR, Daneshjou R, Lee SI. Transparent medical image AI via an image-text foundation model grounded in medical literature. Nat Med 2024;30:1154-1165. [PMID: 38627560] [DOI: 10.1038/s41591-024-02887-x] [Received: 06/09/2023] [Accepted: 02/27/2024] [Indexed: 04/21/2024]
Abstract
Building trustworthy and transparent image-based medical artificial intelligence (AI) systems requires the ability to interrogate data and models at all stages of the development pipeline, from training models to post-deployment monitoring. Ideally, the data and associated AI systems could be described using terms already familiar to physicians, but this requires medical datasets densely annotated with semantically meaningful concepts. In the present study, we present a foundation model approach, named MONET (medical concept retriever), which learns how to connect medical images with text and densely scores images on concept presence to enable important tasks in medical AI development and deployment such as data auditing, model auditing and model interpretation. Dermatology provides a demanding use case for the versatility of MONET, due to the heterogeneity in diseases, skin tones and imaging modalities. We trained MONET based on 105,550 dermatological images paired with natural language descriptions from a large collection of medical literature. MONET can accurately annotate concepts across dermatology images as verified by board-certified dermatologists, competitively with supervised models built on previously concept-annotated dermatology datasets of clinical images. We demonstrate how MONET enables AI transparency across the entire AI system development pipeline, from building inherently interpretable models to dataset and model auditing, including a case study dissecting the results of an AI clinical trial.
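Dense concept scoring in an image-text model of this kind typically reduces to cosine similarity between an image embedding and text embeddings of concept phrases. A generic CLIP-style sketch under that assumption, not MONET's exact procedure:

```python
import numpy as np

def concept_scores(image_emb, concept_embs):
    """Concept-presence scores as cosine similarity between one image
    embedding (d,) and a matrix of concept text embeddings (n, d).
    Generic image-text scoring sketch; not MONET's exact procedure."""
    img = image_emb / np.linalg.norm(image_emb)
    txt = concept_embs / np.linalg.norm(concept_embs, axis=1, keepdims=True)
    return txt @ img   # one score per concept, in [-1, 1]
```

Ranking or thresholding these scores is what enables the data- and model-auditing uses described above (e.g., finding all images where a given dermatological concept is present).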
Affiliation(s)
- Chanwoo Kim
- Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA, USA
- Soham U Gadgil
- Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA, USA
- Alex J DeGrave
- Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA, USA
- Medical Scientist Training Program, University of Washington, Seattle, WA, USA
- Jesutofunmi A Omiye
- Department of Dermatology, Stanford School of Medicine, Stanford, CA, USA
- Department of Biomedical Data Science, Stanford School of Medicine, Stanford, CA, USA
- Zhuo Ran Cai
- Program for Clinical Research and Technology, Stanford University, Stanford, CA, USA
- Roxana Daneshjou
- Department of Dermatology, Stanford School of Medicine, Stanford, CA, USA
- Department of Biomedical Data Science, Stanford School of Medicine, Stanford, CA, USA
- Su-In Lee
- Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA, USA
|
Wang R, Kuo PC, Chen LC, Seastedt KP, Gichoya JW, Celi LA. Drop the shortcuts: image augmentation improves fairness and decreases AI detection of race and other demographics from medical images. EBioMedicine 2024;102:105047. [PMID: 38471396] [PMCID: PMC10945176] [DOI: 10.1016/j.ebiom.2024.105047] [Received: 12/04/2023] [Revised: 02/15/2024] [Accepted: 02/21/2024] [Indexed: 03/14/2024]
Abstract
BACKGROUND It has been shown that AI models can learn race from medical images, leading to algorithmic bias. Our aim in this study was to enhance the fairness of medical image models by eliminating bias related to race, age, and sex. We hypothesised that models learn demographics via shortcut learning and sought to counteract this with image augmentation. METHODS This study included 44,953 patients who identified as Asian, Black, or White (mean age, 60.68 years ±18.21; 23,499 women) for a total of 194,359 chest X-rays (CXRs) from the MIMIC-CXR database. The included CheXpert images comprised 45,095 patients (mean age 63.10 years ±18.14; 20,437 women) for a total of 134,300 CXRs, which were used for external validation. We also collected 1195 3D brain magnetic resonance imaging (MRI) scans from the ADNI database, from 273 participants (mean age 76.97 years ±14.22; 142 women). DL models were trained on either non-augmented or augmented images and assessed using disparity metrics. The features learned by the models were analysed using task transfer experiments and model visualisation techniques. FINDINGS In the detection of radiological findings, training a model on augmented CXR images reduced disparities in error rate among racial groups (-5.45%), age groups (-13.94%), and sexes (-22.22%). For AD detection, the model trained on augmented MRI images showed 53.11% and 31.01% reductions in error-rate disparity across age and sex groups, respectively. Image augmentation reduced the models' ability to identify demographic attributes and led the models trained for clinical purposes to incorporate fewer demographic features. INTERPRETATION The model trained on augmented images was less likely to be influenced by demographic information when detecting image labels. These results demonstrate that the proposed augmentation scheme could enhance the fairness of interpretations by DL models when dealing with data from patients with different demographic backgrounds. FUNDING National Science and Technology Council (Taiwan), National Institutes of Health.
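The intervention here is training-time image augmentation. A generic, dependency-free sketch of such a pipeline (random horizontal flip, random crop padded back to size, mild pixel noise); the study's actual augmentation recipe may differ:

```python
import numpy as np

def augment(image, rng):
    """Training-time augmentation sketch for a 2D grayscale image:
    random horizontal flip, random crop zero-padded back to the
    original size, and mild Gaussian pixel noise."""
    if rng.random() < 0.5:
        image = image[:, ::-1]                    # horizontal flip
    h, w = image.shape
    dy = int(rng.integers(0, h // 8 + 1))         # random crop margins
    dx = int(rng.integers(0, w // 8 + 1))
    cropped = image[dy:h - dy, dx:w - dx]
    out = np.zeros((h, w), dtype=float)           # zero-pad back to size
    out[:cropped.shape[0], :cropped.shape[1]] = cropped
    return out + rng.normal(0.0, 0.01, (h, w))    # mild pixel noise
```

The hypothesis motivating this is that such perturbations disrupt the low-level textural cues a model would otherwise exploit as demographic shortcuts, while preserving the anatomy relevant to the clinical label.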
Affiliation(s)
- Ryan Wang
- Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan
- Po-Chih Kuo
- Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan
- Li-Ching Chen
- Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan
- Kenneth Patrick Seastedt
- Department of Surgery, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA; Department of Thoracic Surgery, Roswell Park Comprehensive Cancer Center, Buffalo, NY, USA
- Leo Anthony Celi
- Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA, USA; Division of Pulmonary Critical Care and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
9
Pai S, Bontempi D, Hadzic I, Prudente V, Sokač M, Chaunzwa TL, Bernatz S, Hosny A, Mak RH, Birkbak NJ, Aerts HJWL. Foundation model for cancer imaging biomarkers. Nat Mach Intell 2024;6:354-367. [PMID: 38523679] [PMCID: PMC10957482] [DOI: 10.1038/s42256-024-00807-9] [Received: 06/09/2023] [Accepted: 02/08/2024] [Indexed: 03/26/2024]
Abstract
Foundation models in deep learning are characterized by a single large-scale model trained on vast amounts of data serving as the foundation for various downstream tasks. Foundation models are generally trained using self-supervised learning and excel in reducing the demand for training samples in downstream applications. This is especially important in medicine, where large labelled datasets are often scarce. Here, we developed a foundation model for cancer imaging biomarker discovery by training a convolutional encoder through self-supervised learning using a comprehensive dataset of 11,467 radiographic lesions. The foundation model was evaluated in distinct and clinically relevant applications of cancer imaging-based biomarkers. We found that it facilitated better and more efficient learning of imaging biomarkers and yielded task-specific models that significantly outperformed conventional supervised and other state-of-the-art pretrained implementations on downstream tasks, especially when training dataset sizes were very limited. Furthermore, the foundation model was more stable to input variations and showed strong associations with underlying biology. Our results demonstrate the tremendous potential of foundation models in discovering new imaging biomarkers that may extend to other clinical use cases and can accelerate the widespread translation of imaging biomarkers into clinical settings.
Collapse
Affiliation(s)
- Suraj Pai
- Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Harvard Institutes of Medicine, Boston, MA USA
- Radiology and Nuclear Medicine, CARIM and GROW, Maastricht University, Maastricht, the Netherlands
- Department of Radiation Oncology, Brigham and Women’s Hospital, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA USA
| | - Dennis Bontempi
- Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Harvard Institutes of Medicine, Boston, MA USA
- Radiology and Nuclear Medicine, CARIM and GROW, Maastricht University, Maastricht, the Netherlands
- Department of Radiation Oncology, Brigham and Women’s Hospital, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA USA
| | - Ibrahim Hadzic
- Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Harvard Institutes of Medicine, Boston, MA USA
- Radiology and Nuclear Medicine, CARIM and GROW, Maastricht University, Maastricht, the Netherlands
- Department of Radiation Oncology, Brigham and Women’s Hospital, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA USA
| | - Vasco Prudente
- Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Harvard Institutes of Medicine, Boston, MA USA
- Radiology and Nuclear Medicine, CARIM and GROW, Maastricht University, Maastricht, the Netherlands
- Department of Radiation Oncology, Brigham and Women’s Hospital, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA USA
| | - Mateo Sokač
- Department of Molecular Medicine, Aarhus University Hospital, Aarhus, Denmark
- Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
| | - Tafadzwa L. Chaunzwa
- Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Harvard Institutes of Medicine, Boston, MA USA
- Department of Radiation Oncology, Brigham and Women’s Hospital, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA USA
| | - Simon Bernatz
- Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Harvard Institutes of Medicine, Boston, MA USA
- Department of Radiation Oncology, Brigham and Women’s Hospital, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA USA
| | - Ahmed Hosny
- Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Harvard Institutes of Medicine, Boston, MA USA
- Department of Radiation Oncology, Brigham and Women’s Hospital, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA USA
| | - Raymond H. Mak
- Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Harvard Institutes of Medicine, Boston, MA USA
- Radiology and Nuclear Medicine, CARIM and GROW, Maastricht University, Maastricht, the Netherlands
| | - Nicolai J. Birkbak
- Department of Molecular Medicine, Aarhus University Hospital, Aarhus, Denmark
- Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
| | - Hugo J. W. L. Aerts
- Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Harvard Institutes of Medicine, Boston, MA USA
- Radiology and Nuclear Medicine, CARIM and GROW, Maastricht University, Maastricht, the Netherlands
- Department of Radiation Oncology, Brigham and Women’s Hospital, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA USA
- Department of Radiology, Brigham and Women’s Hospital, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA USA
| |
Collapse
|
10
Donnelly J, Moffett L, Barnett AJ, Trivedi H, Schwartz F, Lo J, Rudin C. AsymMirai: Interpretable Mammography-based Deep Learning Model for 1-5-year Breast Cancer Risk Prediction. Radiology 2024;310:e232780. [PMID: 38501952] [DOI: 10.1148/radiol.232780] [Indexed: 03/20/2024]
Abstract
Background Mirai, a state-of-the-art deep learning-based algorithm for predicting short-term breast cancer risk, outperforms standard clinical risk models. However, Mirai is a black box, risking overreliance on the algorithm and incorrect diagnoses. Purpose To identify whether bilateral dissimilarity underpins Mirai's reasoning process; create a simplified, intelligible model, AsymMirai, using bilateral dissimilarity; and determine if AsymMirai may approximate Mirai's performance in 1-5-year breast cancer risk prediction. Materials and Methods This retrospective study involved mammograms obtained from patients in the EMory BrEast imaging Dataset, known as EMBED, from January 2013 to December 2020. To approximate 1-5-year breast cancer risk predictions from Mirai, another deep learning-based model, AsymMirai, was built with an interpretable module: local bilateral dissimilarity (localized differences between left and right breast tissue). Pearson correlation coefficients were computed between the risk scores of Mirai and those of AsymMirai. Subgroup analysis was performed in patients for whom AsymMirai's year-over-year reasoning was consistent. AsymMirai and Mirai risk scores were compared using the area under the receiver operating characteristic curve (AUC), and 95% CIs were calculated using the DeLong method. Results Screening mammograms (n = 210 067) from 81 824 patients (mean age, 59.4 years ± 11.4 [SD]) were included in the study. Deep learning-extracted bilateral dissimilarity produced similar risk scores to those of Mirai (1-year risk prediction, r = 0.6832; 4-5-year prediction, r = 0.6988) and achieved similar performance as Mirai. For AsymMirai, the 1-year breast cancer risk AUC was 0.79 (95% CI: 0.73, 0.85) (Mirai, 0.84; 95% CI: 0.79, 0.89; P = .002), and the 5-year risk AUC was 0.66 (95% CI: 0.63, 0.69) (Mirai, 0.71; 95% CI: 0.68, 0.74; P < .001). 
In a subgroup of 183 patients for whom AsymMirai repeatedly highlighted the same tissue over time, AsymMirai achieved a 3-year AUC of 0.92 (95% CI: 0.86, 0.97). Conclusion Localized bilateral dissimilarity, an imaging marker for breast cancer risk, approximated the predictive power of Mirai and was a key to Mirai's reasoning. © RSNA, 2024. Supplemental material is available for this article. See also the editorial by Freitas in this issue.
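The "local bilateral dissimilarity" module described above compares left and right breast tissue region by region. As a rough, hypothetical illustration of that idea (not the authors' AsymMirai code, which operates on learned deep features rather than raw pixels), a minimal numpy sketch might mirror one side and score patch-wise differences:

```python
import numpy as np

def local_bilateral_dissimilarity(left, right, patch=4):
    """Toy 'local bilateral dissimilarity' score: mirror the right image,
    split both images into patches, and return the largest patch-wise L2
    difference together with its location."""
    mirrored = right[:, ::-1]                      # align left/right anatomy
    h, w = left.shape
    best, loc = -1.0, (0, 0)
    for i in range(0, h - patch + 1, patch):
        for j in range(0, w - patch + 1, patch):
            d = np.linalg.norm(left[i:i+patch, j:j+patch]
                               - mirrored[i:i+patch, j:j+patch])
            if d > best:
                best, loc = d, (i, j)
    return best, loc

rng = np.random.default_rng(0)
img = rng.random((16, 16))
left = img.copy()
right = img[:, ::-1].copy()                        # perfectly symmetric pair
right[4:8, 4:8] += 1.0                             # inject an asymmetric "lesion"
score, where = local_bilateral_dissimilarity(left, right)
```

For the symmetric pair the score is exactly zero; the injected asymmetry produces a nonzero score localized at the mirrored lesion position, which is the intuition behind using asymmetry as a risk marker.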
Affiliation(s)
- Jon Donnelly, Luke Moffett, Alina Jade Barnett, Hari Trivedi, Fides Schwartz, Joseph Lo, Cynthia Rudin
- From the Departments of Computer Science (J.D., L.M., A.J.B., C.R.) and Electrical and Computer Engineering (C.R.), Duke University, 308 Research Dr, LSRC Building D101, Duke Box 90129, Durham, NC 27708; Department of Radiology and Imaging Services, Emory University, Atlanta, Ga (H.T.); Department of Radiology, Harvard University, Cambridge, Mass (F.S.); and Department of Radiology, Duke University School of Medicine, Durham, NC (J.L.)
11
Scott I, Connell D, Moulton D, Waters S, Namburete A, Arnab A, Malliaras P. An automated method for tendon image segmentation on ultrasound using grey-level co-occurrence matrix features and hidden Gaussian Markov random fields. Comput Biol Med 2024; 169:107872. [PMID: 38160500 DOI: 10.1016/j.compbiomed.2023.107872] [Received: 04/10/2023] [Revised: 12/07/2023] [Accepted: 12/17/2023] [Indexed: 01/03/2024]
Abstract
BACKGROUND Despite knowledge of the qualitative changes that occur on ultrasound in tendinopathy, there is currently no objective and reliable means to quantify the severity or prognosis of tendinopathy on ultrasound. OBJECTIVE The primary objective of this study is to produce a quantitative and automated means of inferring potential structural changes in tendinopathy by developing and implementing an algorithm which performs a texture-based segmentation of tendon ultrasound (US) images. METHOD A model-based segmentation approach is used which combines Gaussian mixture models, Markov random field theory and grey-level co-occurrence matrix (GLCM) features. The algorithm is trained and tested on 49 longitudinal B-mode ultrasound images of the Achilles tendon, labelled as tendinopathic (24) or healthy (25). Hyperparameters are tuned, using a training set of 25 images, to optimise a decision-tree-based classification of the images from texture class proportions. We segment and classify the remaining test images using the decision tree. RESULTS Our approach successfully detects a difference in the texture profiles of tendinopathic and healthy tendons, with 22/24 of the test images accurately classified based on a simple texture-proportion cut-off threshold. Results for the tendinopathic images are also collated to gain insight into the topology of the structural changes that occur with tendinopathy. Distinct textures, predominantly present in tendinopathic tendons, appear most commonly near the transverse boundary of the tendon, though there was large variability among diseased tendons. CONCLUSION The GLCM-based segmentation of tendons under ultrasound resulted in distinct segmentations between healthy and tendinopathic tendons and provides a potential tool to objectively quantify damage in tendinopathy.
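The GLCM features underlying this segmentation can be illustrated with a minimal sketch. The toy numpy implementation below (a simplification for illustration, not the authors' pipeline) builds a normalized co-occurrence matrix for one pixel offset and computes the classic Haralick contrast feature, which separates uniform from striped textures:

```python
import numpy as np

def glcm(img, dx=1, dy=0, levels=4):
    """Grey-level co-occurrence matrix for one pixel offset (dx, dy),
    normalized to a joint probability table over grey-level pairs."""
    m = np.zeros((levels, levels))
    h, w = img.shape
    for y in range(h - dy):
        for x in range(w - dx):
            m[img[y, x], img[y + dy, x + dx]] += 1
    return m / m.sum()

def contrast(p):
    """Haralick contrast: sum over (i, j) of (i - j)^2 * p(i, j)."""
    i, j = np.indices(p.shape)
    return float(((i - j) ** 2 * p).sum())

flat = np.zeros((8, 8), dtype=int)          # uniform texture: contrast 0
stripes = np.tile([0, 3], (8, 4))           # alternating grey levels 0 and 3
c_flat = contrast(glcm(flat))
c_stripes = contrast(glcm(stripes))
```

Every horizontal pair in the striped image has |i - j| = 3, so its contrast is 9, while the uniform image scores 0; in practice libraries such as scikit-image provide optimized GLCM routines.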
Affiliation(s)
- Isabelle Scott
- Mathematical Institute, University of Oxford, Oxford, United Kingdom; Orygen, The National Centre of Excellence in Youth Mental Health, University of Melbourne, Parkville, Melbourne, Australia
- Derek Moulton
- Mathematical Institute, University of Oxford, Oxford, United Kingdom
- Sarah Waters
- Mathematical Institute, University of Oxford, Oxford, United Kingdom
- Ana Namburete
- Oxford Machine Learning in Neuroimaging laboratory, OMNI, Department of Computer Science, University of Oxford, Oxford, United Kingdom
- Peter Malliaras
- Imaging at Olympic Park, Melbourne, Australia; Department of Physiotherapy, Monash University, Melbourne, Australia
12
Fuchs M, Gonzalez C, Frisch Y, Hahn P, Matthies P, Gruening M, Pinto Dos Santos D, Dratsch T, Kim M, Nensa F, Trenz M, Mukhopadhyay A. Closing the loop for AI-ready radiology. ROFO-FORTSCHR RONTG 2024; 196:154-162. [PMID: 37582385 DOI: 10.1055/a-2124-1958] [Indexed: 08/17/2023]
Abstract
BACKGROUND In recent years, AI has made significant advances in medical diagnosis and prognosis. However, the incorporation of AI into clinical practice is still challenging and under-appreciated. We aim to demonstrate a possible vertical integration approach to close the loop for AI-ready radiology. METHOD This study highlights the importance of two-way communication for AI-assisted radiology. As a key part of the methodology, it demonstrates the integration of AI systems into clinical practice with structured reports and AI visualization, giving more insight into the AI system. By integrating cooperative lifelong learning into the AI system, we ensure the long-term effectiveness of the AI system while keeping the radiologist in the loop. RESULTS We demonstrate the use of lifelong learning for AI systems by incorporating AI visualization and structured reports. We evaluate the Memory Aware Synapses and rehearsal approaches and find that both work in practice. Furthermore, we see an advantage in lifelong learning algorithms that do not require storing or maintaining samples from previous datasets. CONCLUSION Incorporating AI into the clinical routine of radiology requires a two-way communication approach and seamless integration of the AI system, which we achieve with structured reports and visualization of the insight gained by the model. Closing the loop for radiology leads to successful integration, enabling lifelong learning for the AI system, which is crucial for sustainable long-term performance. KEY POINTS · AI systems can be integrated into the clinical routine with structured reports and AI visualization. · Two-way communication between AI and radiologists is necessary to keep the radiologist in the loop. · Closing the loop enables lifelong learning, which is crucial for long-term, high-performing AI in radiology.
Affiliation(s)
- Maximilian Gruening
- Interorganisational Informationssystems, Georg-August-Universität Göttingen, Goettingen, Germany
- Daniel Pinto Dos Santos
- Institute for Diagnostic and Interventional Radiology, Uniklinik Köln, Germany
- Institute for Diagnostic and Interventional Radiology, Universitätsklinikum Frankfurt, Frankfurt am Main, Germany
- Thomas Dratsch
- Institute for Diagnostic and Interventional Radiology, Uniklinik Köln, Germany
- Moon Kim
- Institute for Diagnostic and Interventional Radiology and Neuroradiology, Universitätsklinikum Essen, Germany
- Institute for Artificial Intelligence in Medicine, Universitätsklinikum Essen, Germany
- Felix Nensa
- Institute for Diagnostic and Interventional Radiology and Neuroradiology, Universitätsklinikum Essen, Germany
- Institute for Artificial Intelligence in Medicine, Universitätsklinikum Essen, Germany
- Manuel Trenz
- Interorganisational Informationssystems, Georg-August-Universität Göttingen, Goettingen, Germany
13
Bourazana A, Xanthopoulos A, Briasoulis A, Magouliotis D, Spiliopoulos K, Athanasiou T, Vassilopoulos G, Skoularigis J, Triposkiadis F. Artificial Intelligence in Heart Failure: Friend or Foe? Life (Basel) 2024; 14:145. [PMID: 38276274 PMCID: PMC10817517 DOI: 10.3390/life14010145] [Received: 11/15/2023] [Revised: 01/08/2024] [Accepted: 01/17/2024] [Indexed: 01/27/2024]
Abstract
In recent times, there have been notable changes in cardiovascular medicine, propelled by swift advances in artificial intelligence (AI). The present work provides an overview of the current applications and challenges of AI in the field of heart failure. It emphasizes the "garbage in, garbage out" issue, where AI systems can produce inaccurate results from skewed data. The discussion covers issues in heart failure diagnostic algorithms, particularly discrepancies between existing models. Concerns about the reliance on left ventricular ejection fraction (LVEF) for classification and treatment are highlighted, showcasing differences in current scientific perceptions. This review also delves into challenges in implementing AI, including variable considerations and biases in training data. It underscores the limitations of current AI models in real-world scenarios and the difficulty of interpreting their predictions, which contributes to limited physician trust in AI-based models. The overarching suggestion is that AI can be a valuable tool in clinicians' hands for treating heart failure patients, provided that existing medical inaccuracies are addressed before AI is integrated into these frameworks.
Affiliation(s)
- Angeliki Bourazana
- Department of Cardiology, University Hospital of Larissa, 41110 Larissa, Greece
- Andrew Xanthopoulos
- Department of Cardiology, University Hospital of Larissa, 41110 Larissa, Greece
- Alexandros Briasoulis
- Division of Cardiovascular Medicine, Section of Heart Failure and Transplantation, University of Iowa, Iowa City, IA 52242, USA
- Dimitrios Magouliotis
- Department of Cardiothoracic Surgery, University of Thessaly, 41110 Larissa, Greece
- Kyriakos Spiliopoulos
- Department of Cardiothoracic Surgery, University of Thessaly, 41110 Larissa, Greece
- Thanos Athanasiou
- Department of Surgery and Cancer, Imperial College London, St Mary’s Hospital, London W2 1NY, UK
- George Vassilopoulos
- Department of Hematology, University Hospital of Larissa, University of Thessaly Medical School, 41110 Larissa, Greece
- John Skoularigis
- Department of Cardiology, University Hospital of Larissa, 41110 Larissa, Greece
14
Yanagawa M, Sato J. Seeing Is Not Always Believing: Discrepancies in Saliency Maps. Radiol Artif Intell 2024; 6:e230488. [PMID: 38166327 PMCID: PMC10831517 DOI: 10.1148/ryai.230488] [Received: 10/31/2023] [Revised: 11/09/2023] [Accepted: 11/17/2023] [Indexed: 01/04/2024]
Affiliation(s)
- Masahiro Yanagawa, Junya Sato
- From the Department of Radiology, Osaka University Graduate School of Medicine, 2-2 Yamadaoka, Suita-city, Osaka 565-0871, Japan
15
Zhang J, Chao H, Dasegowda G, Wang G, Kalra MK, Yan P. Revisiting the Trustworthiness of Saliency Methods in Radiology AI. Radiol Artif Intell 2024; 6:e220221. [PMID: 38166328 PMCID: PMC10831523 DOI: 10.1148/ryai.220221] [Received: 10/19/2022] [Revised: 10/04/2023] [Accepted: 10/23/2023] [Indexed: 01/04/2024]
Abstract
Purpose To determine whether saliency maps in radiology artificial intelligence (AI) are vulnerable to subtle perturbations of the input, which could lead to misleading interpretations, using prediction-saliency correlation (PSC) for evaluating the sensitivity and robustness of saliency methods. Materials and Methods In this retrospective study, locally trained deep learning models and a research prototype provided by a commercial vendor were systematically evaluated on 191 229 chest radiographs from the CheXpert dataset and 7022 MR images from a human brain tumor classification dataset. Two radiologists performed a reader study on 270 chest radiograph pairs. A model-agnostic approach for computing the PSC coefficient was used to evaluate the sensitivity and robustness of seven commonly used saliency methods. Results The saliency methods had low sensitivity (maximum PSC, 0.25; 95% CI: 0.12, 0.38) and weak robustness (maximum PSC, 0.12; 95% CI: 0.0, 0.25) on the CheXpert dataset, as demonstrated by leveraging locally trained model parameters. Further evaluation showed that the saliency maps generated from a commercial prototype could be irrelevant to the model output, without knowledge of the model specifics (area under the receiver operating characteristic curve decreased by 8.6% without affecting the saliency map). The human observer studies confirmed that it is difficult for experts to identify the perturbed images; the experts identified them correctly less than 44.8% of the time. Conclusion Popular saliency methods scored low PSC values on the two datasets of perturbed chest radiographs, indicating weak sensitivity and robustness. The proposed PSC metric provides a valuable quantification tool for validating the trustworthiness of medical AI explainability. Keywords: Saliency Maps, AI Trustworthiness, Dynamic Consistency, Sensitivity, Robustness. Supplemental material is available for this article. © RSNA, 2023. See also the commentary by Yanagawa and Sato in this issue.
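The PSC metric correlates how much a model's prediction changes with how much its saliency map changes under input perturbations: a trustworthy saliency method should shift when the prediction shifts. The sketch below is a toy illustration of that idea only, using a hypothetical linear-logistic "model" with an analytic gradient saliency; the paper's method is model-agnostic and applied to deep networks:

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w = rng.normal(size=16)          # toy linear "model" weights (hypothetical)
x = rng.normal(size=16)          # one "image" as a flat feature vector

def predict(x):
    return sigmoid(w @ x)

def saliency(x):
    """Vanilla gradient of the output with respect to the input."""
    p = predict(x)
    return p * (1.0 - p) * w

# Across random perturbations, correlate |change in prediction|
# with the norm of the change in the saliency map.
dp, ds = [], []
for _ in range(200):
    noise = 0.1 * rng.normal(size=16)
    dp.append(abs(predict(x + noise) - predict(x)))
    ds.append(np.linalg.norm(saliency(x + noise) - saliency(x)))
psc = np.corrcoef(dp, ds)[0, 1]
```

A saliency method that ignores the model entirely would drive this correlation toward zero, which is the failure mode the study quantifies.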
Affiliation(s)
- Jiajin Zhang, Hanqing Chao, Giridhar Dasegowda, Ge Wang, Mannudeep K. Kalra, Pingkun Yan
- From the Department of Biomedical Engineering, Center for Biotechnology and Interdisciplinary Studies, Rensselaer Polytechnic Institute, 110 8th St, Biotech 4231, Troy, NY 12180 (J.Z., H.C., G.W., P.Y.); and Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Mass (G.D., M.K.K.)
16
Morales MA, Manning WJ, Nezafat R. Present and Future Innovations in AI and Cardiac MRI. Radiology 2024; 310:e231269. [PMID: 38193835 PMCID: PMC10831479 DOI: 10.1148/radiol.231269] [Received: 05/17/2023] [Revised: 10/21/2023] [Accepted: 10/26/2023] [Indexed: 01/10/2024]
Abstract
Cardiac MRI is used to diagnose and treat patients with a multitude of cardiovascular diseases. Despite the growth of clinical cardiac MRI, complicated image prescriptions and long acquisition protocols limit the specialty and restrain its impact on the practice of medicine. Artificial intelligence (AI)-the ability to mimic human intelligence in learning and performing tasks-will impact nearly all aspects of MRI. Deep learning (DL) primarily uses an artificial neural network to learn a specific task from example data sets. Self-driving scanners are increasingly available, where AI automatically controls cardiac image prescriptions. These scanners offer faster image collection with higher spatial and temporal resolution, eliminating the need for cardiac triggering or breath holding. In the future, fully automated inline image analysis will most likely provide all contour drawings and initial measurements to the reader. Advanced analysis using radiomic or DL features may provide new insights and information not typically extracted in the current analysis workflow. AI may further help integrate these features with clinical, genetic, wearable-device, and "omics" data to improve patient outcomes. This article presents an overview of AI and its application in cardiac MRI, including in image acquisition, reconstruction, and processing, and opportunities for more personalized cardiovascular care through extraction of novel imaging markers.
Affiliation(s)
- Manuel A. Morales, Warren J. Manning, Reza Nezafat
- From the Department of Medicine, Cardiovascular Division (M.A.M., W.J.M., R.N.), and Department of Radiology (W.J.M.), Beth Israel Deaconess Medical Center and Harvard Medical School, 330 Brookline Ave, Boston, MA 02215
17
Li MD, Jaremko JL. Personalizing Short-term Fracture Prevention After Hip Fracture: CT-based AI Risk Stratification. Radiology 2024; 310:e233396. [PMID: 38289218 DOI: 10.1148/radiol.233396] [Indexed: 02/01/2024]
Affiliation(s)
- Matthew D Li, Jacob L Jaremko
- From the Department of Radiology and Diagnostic Imaging, Faculty of Medicine and Dentistry, University of Alberta Hospital, 8440 112 St NW, 2A2.41 WMC, Edmonton, AB, Canada T6G 2B7
18
Sigut J, Fumero F, Estévez J, Alayón S, Díaz-Alemán T. In-Depth Evaluation of Saliency Maps for Interpreting Convolutional Neural Network Decisions in the Diagnosis of Glaucoma Based on Fundus Imaging. SENSORS (BASEL, SWITZERLAND) 2023; 24:239. [PMID: 38203101 PMCID: PMC10781365 DOI: 10.3390/s24010239] [Received: 10/25/2023] [Revised: 12/14/2023] [Accepted: 12/29/2023] [Indexed: 01/12/2024]
Abstract
Glaucoma, a leading cause of blindness, damages the optic nerve, making early diagnosis challenging because there are no initial symptoms. Fundus eye images taken with a non-mydriatic retinograph help diagnose glaucoma by revealing structural changes, including in the optic disc and cup. This research aims to thoroughly analyze saliency maps in interpreting convolutional neural network decisions for diagnosing glaucoma from fundus images. These maps highlight the most influential image regions guiding the network's decisions. Various network architectures were trained and tested on 739 optic nerve head images, using nine saliency methods. Some other popular datasets were also used for further validation. The results reveal disparities among saliency maps, with some consensus between the folds corresponding to the same architecture. Concerning the significance of optic disc sectors, there is generally a lack of agreement with standard medical criteria. The background, nasal, and temporal sectors emerge as particularly influential for neural network decisions, with a likelihood of being the most relevant ranging from 14.55% to 28.16% on average across all evaluated datasets. We conclude that saliency maps are usually difficult to interpret, and even the areas indicated as most relevant can be very unintuitive. Their usefulness as an explanatory tool may therefore be compromised, at least in problems such as the one addressed in this study, where the features defining the model prediction are not consistently reflected in relevant regions of the saliency maps and cannot always be related to standard medical criteria.
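One simple way to quantify the kind of disparity among saliency maps reported above is to measure how many of the top-k most-salient pixels two maps share. The helper below is a hypothetical sketch of such a comparison, not the evaluation protocol used in the study:

```python
import numpy as np

def topk_agreement(map_a, map_b, k):
    """Fraction of the k most-salient pixels shared by two saliency maps
    (compared by flattened pixel index)."""
    a = set(np.argsort(map_a, axis=None)[-k:])
    b = set(np.argsort(map_b, axis=None)[-k:])
    return len(a & b) / k

rng = np.random.default_rng(2)
m1 = rng.random((8, 8))                        # saliency map from method 1
m2 = m1 + 0.01 * rng.standard_normal((8, 8))   # nearly identical map
m3 = rng.random((8, 8))                        # unrelated map
same = topk_agreement(m1, m2, k=8)
diff = topk_agreement(m1, m3, k=8)
```

An identical pair scores 1.0, while unrelated maps score near k divided by the number of pixels; low scores between methods on the same image are one symptom of the inconsistency described in the abstract.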
Affiliation(s)
- Jose Sigut, Francisco Fumero, José Estévez, Silvia Alayón
- Department of Computer Science and Systems Engineering, Universidad de La Laguna, Camino San Francisco de Paula, 19, La Laguna, 38203 Santa Cruz de Tenerife, Spain
- Tinguaro Díaz-Alemán
- Department of Ophthalmology, Hospital Universitario de Canarias, Carretera Ofra S/N, La Laguna, 38320 Santa Cruz de Tenerife, Spain
19
Kang DW, Park GH, Ryu WS, Schellingerhout D, Kim M, Kim YS, Park CY, Lee KJ, Han MK, Jeong HG, Kim DE. Strengthening deep-learning models for intracranial hemorrhage detection: strongly annotated computed tomography images and model ensembles. Front Neurol 2023; 14:1321964. [PMID: 38221995 PMCID: PMC10784380 DOI: 10.3389/fneur.2023.1321964] [Received: 10/15/2023] [Accepted: 12/11/2023] [Indexed: 01/16/2024]
Abstract
Background and purpose Multiple attempts at intracranial hemorrhage (ICH) detection using deep-learning techniques have been plagued by clinical failures. We aimed to compare the performance of a deep-learning algorithm for ICH detection trained on strongly and weakly annotated datasets, and to assess whether a weighted ensemble model that integrates separate models trained on datasets with different ICH subtypes improves performance. Methods We used brain CT scans from the Radiological Society of North America (27,861 CT scans, 3,528 ICHs) and AI-Hub (53,045 CT scans, 7,013 ICHs) for training. DenseNet121, InceptionResNetV2, MobileNetV2, and VGG19 were trained on strongly and weakly annotated datasets and compared using independent external test datasets. We then developed a weighted ensemble model combining separate models trained on all ICH, subdural hemorrhage (SDH), subarachnoid hemorrhage (SAH), and small-lesion ICH cases. The final weighted ensemble model was compared to four well-known deep-learning models. After external testing, six neurologists reviewed 91 ICH cases that were difficult for AI and humans. Results The InceptionResNetV2, MobileNetV2, and VGG19 models performed better when trained on strongly annotated datasets. A weighted ensemble model combining models trained on SDH, SAH, and small-lesion ICH had a higher AUC than a model trained on all ICH cases only. This model outperformed four deep-learning models (AUC [95% C.I.]: Ensemble model, 0.953 [0.938-0.965]; InceptionResNetV2, 0.852 [0.828-0.873]; DenseNet121, 0.875 [0.852-0.895]; VGG19, 0.796 [0.770-0.821]; MobileNetV2, 0.650 [0.620-0.680]; p < 0.0001). In addition, the case review showed that a better understanding and management of difficult cases may facilitate clinical use of ICH detection algorithms. Conclusion We propose a weighted ensemble model for ICH detection, trained on large-scale, strongly annotated CT scans, as no single model can capture all aspects of complex tasks.
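A weighted ensemble of the kind described combines per-model probabilities with fixed weights. The sketch below is a hedged toy version: the probabilities and weights are invented for illustration, whereas the study learned its models and weights from annotated CT data:

```python
import numpy as np

def weighted_ensemble(probs, weights):
    """Combine per-model ICH probabilities (one row per model, one column
    per case) into a single score per case, normalizing the weights."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return np.asarray(probs).T @ w          # shape: (n_cases,)

# Hypothetical probabilities from models specialized on all-ICH, SDH,
# SAH, and small-lesion cases, for three scans.
p = [[0.90, 0.10, 0.55],    # all-ICH model
     [0.80, 0.05, 0.60],    # SDH model
     [0.85, 0.20, 0.40],    # SAH model
     [0.95, 0.15, 0.70]]    # small-lesion model
ensemble = weighted_ensemble(p, weights=[0.4, 0.2, 0.2, 0.2])
```

Subtype-specialized members can lift the combined score on cases (such as small lesions) where a single all-purpose model is weakest, which is the motivation the abstract gives for the ensemble.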
Affiliation(s)
- Dong-Wan Kang
- Department of Public Health, Seoul National University Bundang Hospital, Seongnam, Republic of Korea
- Department of Neurology, Gyeonggi Provincial Medical Center, Icheon Hospital, Icheon, Republic of Korea
- Department of Neurology, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, Republic of Korea
- Gi-Hun Park
- JLK Inc., Artificial Intelligence Research Center, Seoul, Republic of Korea
- Wi-Sun Ryu
- JLK Inc., Artificial Intelligence Research Center, Seoul, Republic of Korea
- Dawid Schellingerhout
- Department of Neuroradiology and Imaging Physics, The University of Texas M.D. Anderson Cancer Center, Houston, TX, United States
- Museong Kim
- Department of Neurosurgery, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, Republic of Korea
- Hospital Medicine Center, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, Republic of Korea
- Yong Soo Kim
- Department of Neurology, Nowon Eulji Medical Center, Eulji University School of Medicine, Seoul, Republic of Korea
- Chan-Young Park
- Department of Neurology, Chung-Ang University Hospital, Seoul, Republic of Korea
- Keon-Joo Lee
- Department of Neurology, Korea University Guro Hospital, Seoul, Republic of Korea
- Moon-Ku Han
- Department of Neurology, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, Republic of Korea
- Han-Gil Jeong
- Department of Neurology, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, Republic of Korea
- Department of Neurosurgery, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, Republic of Korea
- Dong-Eog Kim
- Department of Neurology, Dongguk University Ilsan Hospital, Goyang, Republic of Korea
- National Priority Research Center for Stroke, Goyang, Republic of Korea
20
Avula V, Wu KC, Carrick RT. Clinical Applications, Methodology, and Scientific Reporting of Electrocardiogram Deep-Learning Models: A Systematic Review. JACC. ADVANCES 2023; 2:100686. [PMID: 38288263 PMCID: PMC10824530 DOI: 10.1016/j.jacadv.2023.100686] [Indexed: 01/31/2024]
Abstract
BACKGROUND The electrocardiogram (ECG) is one of the most common diagnostic tools available to assess cardiovascular health. The advent of advanced computational techniques such as deep learning has dramatically expanded the breadth of clinical problems that can be addressed using ECG data, leading to the increasing popularity of ECG deep-learning models aimed at predicting clinical endpoints. OBJECTIVES The purpose of this study was to define the current landscape of clinically relevant ECG deep-learning models and examine practices in the scientific reporting of these studies. METHODS We performed a systematic review of the PubMed and EMBASE databases to identify clinically relevant ECG deep-learning models published through July 1, 2022. RESULTS We identified 44 manuscripts including 53 unique, clinically relevant ECG deep-learning models. The rate of publication of ECG deep-learning models is increasing rapidly. The most common clinical application of ECG deep learning was identification of cardiomyopathy (14/53 [26%]), followed by arrhythmia detection (9/53 [17%]). Methodologic reporting varied; while 33/44 (75%) publications included model architecture diagrams, complete information required to reproduce these models was provided in only 10/44 (23%). Saliency analysis was performed in 20/44 (46%) of publications. Only 18/53 (34%) models were tested within external validation cohorts. Model code or resources allowing for model implementation by external groups were available for only 5/44 (11%) publications. CONCLUSIONS While ECG deep-learning models are increasingly clinically relevant, their reporting is highly variable, and few publications provide sufficient detail for methodologic reproduction or model validation by external groups. The field of ECG deep learning would benefit from adherence to a set of standardized scientific reporting guidelines.
Affiliation(s)
- Vennela Avula: Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Katherine C. Wu: Division of Cardiology, Johns Hopkins University Department of Medicine, Baltimore, Maryland, USA
- Richard T. Carrick: Division of Cardiology, Johns Hopkins University Department of Medicine, Baltimore, Maryland, USA
21. Pertuz S, Ortega D, Suarez É, Cancino W, Africano G, Rinta-Kiikka I, Arponen O, Paris S, Lozano A. Saliency of breast lesions in breast cancer detection using artificial intelligence. Sci Rep 2023; 13:20545. PMID: 37996504; PMCID: PMC10667547; DOI: 10.1038/s41598-023-46921-3.
Abstract
The analysis of mammograms using artificial intelligence (AI) has shown great potential for assisting breast cancer screening. We use saliency maps to study the role of breast lesions in the decision-making process of AI systems for breast cancer detection in screening mammograms. We retrospectively collected mammograms from 191 women with screen-detected breast cancer and 191 healthy controls matched by age and mammographic system. Two radiologists manually segmented the breast lesions in the mammograms from CC and MLO views. We estimated the detection performance of four deep learning-based AI systems using the area under the ROC curve (AUC) with a 95% confidence interval (CI). We used automatic thresholding on saliency maps from the AI systems to identify the areas of interest on the mammograms. Finally, we measured the overlap between these areas of interest and the segmented breast lesions using Dice's similarity coefficient (DSC). The detection performance of the AI systems ranged from low to moderate (AUCs from 0.525 to 0.694). The overlap between the areas of interest and the breast lesions was low for all the studied methods (median DSC from 4.2% to 38.0%). The AI system with the highest cancer detection performance (AUC = 0.694, CI 0.662-0.726) showed the lowest overlap (DSC = 4.2%) with breast lesions. The areas of interest found by saliency analysis of the AI systems showed poor overlap with breast lesions. These results suggest that AI systems with the highest performance do not solely rely on localized breast lesions for their decision-making in cancer detection; rather, they incorporate information from large image regions. This work contributes to the understanding of the role of breast lesions in cancer detection using AI.
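The overlap measure used in this study, Dice's similarity coefficient (DSC) between a thresholded saliency map and a lesion segmentation, is straightforward to compute. A minimal NumPy sketch (the function name, threshold, and toy arrays are illustrative, not taken from the paper):

```python
import numpy as np

def dice_similarity(saliency, lesion_mask, threshold=0.5):
    """Dice's similarity coefficient between a thresholded saliency map
    and a binary lesion mask: DSC = 2|A ∩ B| / (|A| + |B|), in [0, 1]."""
    area_of_interest = saliency >= threshold  # binarize the saliency map
    intersection = np.logical_and(area_of_interest, lesion_mask).sum()
    total = area_of_interest.sum() + lesion_mask.sum()
    if total == 0:
        return 1.0  # both regions empty: treat as perfect agreement
    return 2.0 * intersection / total

# Toy 4x4 case: the model attends to the left half, the lesion is the top half.
saliency = np.array([[0.9, 0.8, 0.1, 0.0],
                     [0.9, 0.7, 0.2, 0.1],
                     [0.8, 0.6, 0.0, 0.0],
                     [0.7, 0.9, 0.1, 0.2]])
lesion = np.zeros((4, 4), dtype=bool)
lesion[:2, :] = True
print(dice_similarity(saliency, lesion))  # 0.5: only half the attended area overlaps
```

A low DSC, as reported here, means the thresholded area of interest and the expert-drawn lesion share little area even when both are present.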
Affiliation(s)
- Said Pertuz: Escuela de Ingenierías Eléctrica Electrónica y de Telecomunicaciones, Universidad Industrial de Santander, Bucaramanga, Colombia
- David Ortega: Escuela de Ingenierías Eléctrica Electrónica y de Telecomunicaciones, Universidad Industrial de Santander, Bucaramanga, Colombia
- Érika Suarez: Escuela de Ingenierías Eléctrica Electrónica y de Telecomunicaciones, Universidad Industrial de Santander, Bucaramanga, Colombia
- William Cancino: Escuela de Ingenierías Eléctrica Electrónica y de Telecomunicaciones, Universidad Industrial de Santander, Bucaramanga, Colombia
- Gerson Africano: Escuela de Ingenierías Eléctrica Electrónica y de Telecomunicaciones, Universidad Industrial de Santander, Bucaramanga, Colombia
- Irina Rinta-Kiikka: Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland; Department of Radiology, Tampere University Hospital, Tampere, Finland
- Otso Arponen: Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland; Department of Radiology, Tampere University Hospital, Tampere, Finland
- Sara Paris: Departamento de Imágenes Diagnósticas, Universidad Nacional de Colombia, Bogotá, Colombia
- Alfonso Lozano: Departamento de Imágenes Diagnósticas, Universidad Nacional de Colombia, Bogotá, Colombia
22. Kelly BS, Mathur P, Plesniar J, Lawlor A, Killeen RP. Using deep learning-derived image features in radiologic time series to make personalised predictions: proof of concept in colonic transit data. Eur Radiol 2023; 33:8376-8386. PMID: 37284869; PMCID: PMC10244854; DOI: 10.1007/s00330-023-09769-9.
Abstract
OBJECTIVES Siamese neural networks (SNN) were used to classify the presence of radiopaque beads as part of a colonic transit time study (CTS). The SNN output was then used as a feature in a time series model to predict progression through a CTS. METHODS This retrospective study included all patients undergoing a CTS in a single institution from 2010 to 2020. Data were partitioned in an 80/20 train/test split. Deep learning models based on an SNN architecture were trained and tested to classify images according to the presence, absence, and number of radiopaque beads and to output the Euclidean distance between the feature representations of the input images. Time series models were used to predict the total duration of the study. RESULTS In total, 568 images of 229 patients (143, 62% female, mean age 57) were included. For the classification of the presence of beads, the best performing model (Siamese DenseNet trained with a contrastive loss with unfrozen weights) achieved an accuracy, precision, and recall of 0.988, 0.986, and 1.0. A Gaussian process regressor (GPR) trained on the outputs of the SNN outperformed both a GPR using only the number of beads and basic statistical exponential curve fitting, with MAE of 0.9 days compared to 2.3 and 6.3 days (p < 0.05), respectively. CONCLUSIONS SNNs perform well at the identification of radiopaque beads in CTS. For time series prediction, our methods were superior at identifying progression through the time series compared to statistical models, enabling more accurate personalised predictions. CLINICAL RELEVANCE STATEMENT Our radiologic time series model has potential clinical application in use cases where change assessment is critical (e.g. nodule surveillance, cancer treatment response, and screening programmes) by quantifying change and using it to make more personalised predictions. KEY POINTS • Time series methods have improved, but application to radiology lags behind computer vision. Colonic transit studies are a simple radiologic time series measuring function through serial radiographs. • We successfully employed a Siamese neural network (SNN) to compare radiographs at different points in time and then used the output of the SNN as a feature in a Gaussian process regression model to predict progression through the time series. • This novel use of features derived from a neural network on medical imaging data to predict progression has potential clinical application in more complex use cases where change assessment is critical, such as in oncologic imaging, monitoring for treatment response, and screening programmes.
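The contrastive objective mentioned for the Siamese DenseNet can be illustrated in a few lines of NumPy: similar pairs are pulled together in embedding space, dissimilar pairs are pushed at least a margin apart. The embeddings, margin, and function name below are toy values for illustration, not the paper's implementation:

```python
import numpy as np

def contrastive_loss(z1, z2, same_class, margin=1.0):
    """Pairwise contrastive loss on two embeddings: d²/2 for similar
    pairs, max(0, margin - d)²/2 for dissimilar pairs (d = Euclidean
    distance, which is also the feature fed to the downstream model)."""
    d = np.linalg.norm(z1 - z2)
    if same_class:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2

z_a = np.array([0.2, 0.4])
z_b = np.array([0.2, 0.4])   # identical embedding: distance 0
z_c = np.array([1.2, 0.4])   # distance 1.0 from z_a

print(contrastive_loss(z_a, z_b, True))    # 0.0: similar pair, zero distance
print(contrastive_loss(z_a, z_c, False))   # 0.0: dissimilar pair exactly at the margin
print(contrastive_loss(z_a, z_c, True))    # 0.5: similar pair penalized by d²/2
```

The same Euclidean distance between embeddings is what the study then feeds into the Gaussian process regressor as a change feature.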
Affiliation(s)
- Brendan S Kelly: Department of Radiology, St Vincent's University Hospital, Dublin, Ireland; Insight Centre for Data Analytics, UCD, Dublin, Ireland; School of Medicine, University College Dublin, Dublin, Ireland
- Jan Plesniar: School of Medicine, University College Dublin, Dublin, Ireland
- Ronan P Killeen: Department of Radiology, St Vincent's University Hospital, Dublin, Ireland; School of Medicine, University College Dublin, Dublin, Ireland
23. Zech JR, Jaramillo D, Altosaar J, Popkin CA, Wong TT. Artificial intelligence to identify fractures on pediatric and young adult upper extremity radiographs. Pediatr Radiol 2023; 53:2386-2397. PMID: 37740031; DOI: 10.1007/s00247-023-05754-y.
Abstract
BACKGROUND Pediatric fractures are challenging to identify given the different response of the pediatric skeleton to injury compared to adults, and most artificial intelligence (AI) fracture detection work has focused on adults. OBJECTIVE Develop and transparently share an AI model capable of detecting a range of pediatric upper extremity fractures. MATERIALS AND METHODS In total, 58,846 upper extremity radiographs (finger/hand, wrist/forearm, elbow, humerus, shoulder/clavicle) from 14,873 pediatric and young adult patients were divided into train (n = 12,232 patients), tune (n = 1,307), internal test (n = 819), and external test (n = 515) splits. Fracture was determined by manual inspection of all test radiographs and the subset of train/tune radiographs whose reports were classified fracture-positive by a rule-based natural language processing (NLP) algorithm. We trained an object detection model (Faster Region-based Convolutional Neural Network [R-CNN]; "strongly-supervised") and an image classification model (EfficientNetV2-Small; "weakly-supervised") to detect fractures using train/tune data and evaluate on test data. AI fracture detection accuracy was compared with accuracy of on-call residents on cases they preliminarily interpreted overnight. RESULTS A strongly-supervised fracture detection AI model achieved overall test area under the receiver operating characteristic curve (AUC) of 0.96 (95% CI 0.95-0.97), accuracy 89.7% (95% CI 88.0-91.3%), sensitivity 90.8% (95% CI 88.5-93.1%), and specificity 88.7% (95% CI 86.4-91.0%), and outperformed a weakly-supervised model (AUC 0.93, 95% CI 0.92-0.94, P < 0.0001). AI accuracy on cases preliminarily interpreted overnight was higher than resident accuracy (AI 89.4% vs. 85.1%, 95% CI 87.3-91.5% vs. 82.7-87.5%, P = 0.01). CONCLUSION An object detection AI model identified pediatric upper extremity fractures with high accuracy.
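The accuracy, sensitivity, and specificity figures reported above are simple functions of the confusion matrix. A minimal NumPy sketch (the helper name and the toy labels are illustrative, not the study's evaluation code):

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity (recall on positives), and specificity
    (recall on negatives) from binary ground truth and predictions."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_true & y_pred)    # fractures correctly flagged
    tn = np.sum(~y_true & ~y_pred)  # normals correctly cleared
    fp = np.sum(~y_true & y_pred)   # false alarms
    fn = np.sum(y_true & ~y_pred)   # missed fractures
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

# Toy example: 10 radiographs, 5 with a fracture; one miss and one false alarm.
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
m = binary_metrics(y_true, y_pred)  # accuracy 0.8, sensitivity 0.8, specificity 0.8
```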
Affiliation(s)
- John R Zech: Department of Radiology, Columbia University Irving Medical Center, 622 W. 168th St., New York, NY, 10032, USA
- Diego Jaramillo: Department of Radiology, Columbia University Irving Medical Center, 622 W. 168th St., New York, NY, 10032, USA
- Charles A Popkin: Department of Orthopedic Surgery, Columbia University Irving Medical Center, New York, NY, USA
- Tony T Wong: Department of Radiology, Columbia University Irving Medical Center, 622 W. 168th St., New York, NY, 10032, USA
24. Li MD, Little BP. Appropriate Reliance on Artificial Intelligence in Radiology Education. J Am Coll Radiol 2023; 20:1126-1130. PMID: 37392983; DOI: 10.1016/j.jacr.2023.04.019.
Abstract
Users of artificial intelligence (AI) can become overreliant on AI, negatively affecting the performance of human-AI teams. For a future in which radiologists use interpretive AI tools routinely in clinical practice, radiology education will need to evolve to provide radiologists with the skills to use AI appropriately and wisely. In this work, we examine how overreliance on AI may develop in radiology trainees and explore how this problem can be mitigated, including through the use of AI-augmented education. Radiology trainees will still need to develop the perceptual skills and mastery of knowledge fundamental to radiology to use AI safely. We propose a framework for radiology trainees to use AI tools with appropriate reliance, drawing on lessons from human-AI interactions research.
Affiliation(s)
- Matthew D Li: Department of Radiology and Diagnostic Imaging, Faculty of Medicine & Dentistry, University of Alberta, Edmonton, Alberta, Canada
- Brent P Little: Mayo Clinic College of Medicine and Science, Department of Radiology, Division of Cardiothoracic Imaging, Mayo Clinic Florida, Florida; Committee Member, ACR Appropriateness Criteria Thoracic Imaging
25. Pai S, Bontempi D, Prudente V, Hadzic I, Sokač M, Chaunzwa TL, Bernatz S, Hosny A, Mak RH, Birkbak NJ, Aerts HJWL. Foundation Models for Quantitative Biomarker Discovery in Cancer Imaging. medRxiv [Preprint] 2023:2023.09.04.23294952. PMID: 37732237; PMCID: PMC10508804; DOI: 10.1101/2023.09.04.23294952.
Abstract
Foundation models represent a recent paradigm shift in deep learning, where a single large-scale model trained on vast amounts of data can serve as the foundation for various downstream tasks. Foundation models are generally trained using self-supervised learning and excel in reducing the demand for training samples in downstream applications. This is especially important in medicine, where large labeled datasets are often scarce. Here, we developed a foundation model for imaging biomarker discovery by training a convolutional encoder through self-supervised learning using a comprehensive dataset of 11,467 radiographic lesions. The foundation model was evaluated in distinct and clinically relevant applications of imaging-based biomarkers. We found that they facilitated better and more efficient learning of imaging biomarkers and yielded task-specific models that significantly outperformed their conventional supervised counterparts on downstream tasks. The performance gain was most prominent when training dataset sizes were very limited. Furthermore, foundation models were more stable to input and inter-reader variations and showed stronger associations with underlying biology. Our results demonstrate the tremendous potential of foundation models in discovering novel imaging biomarkers that may extend to other clinical use cases and can accelerate the widespread translation of imaging biomarkers into clinical settings.
Affiliation(s)
- Suraj Pai: Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Harvard Institutes of Medicine, 77 Avenue Louis Pasteur, Boston, MA 02115, USA; Radiology and Nuclear Medicine, CARIM & GROW, Maastricht University, Universiteitssingel 40, 6229 ER Maastricht, The Netherlands; Department of Radiation Oncology, Brigham and Women's Hospital, Dana-Farber Cancer Institute, Harvard Medical School, 75 Francis Street and 450 Brookline Avenue, Boston, MA 02115, USA
- Dennis Bontempi: Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Boston, MA, USA; Radiology and Nuclear Medicine, CARIM & GROW, Maastricht University, Maastricht, The Netherlands; Department of Radiation Oncology, Brigham and Women's Hospital, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Vasco Prudente: Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Boston, MA, USA; Radiology and Nuclear Medicine, CARIM & GROW, Maastricht University, Maastricht, The Netherlands; Department of Radiation Oncology, Brigham and Women's Hospital, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Ibrahim Hadzic: Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Boston, MA, USA; Radiology and Nuclear Medicine, CARIM & GROW, Maastricht University, Maastricht, The Netherlands; Department of Radiation Oncology, Brigham and Women's Hospital, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Mateo Sokač: Department of Molecular Medicine, Aarhus University Hospital, 8200 Aarhus, Denmark; Department of Clinical Medicine, Aarhus University, 8200 Aarhus, Denmark
- Tafadzwa L. Chaunzwa: Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Boston, MA, USA; Department of Radiation Oncology, Brigham and Women's Hospital, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Simon Bernatz: Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Boston, MA, USA; Department of Radiation Oncology, Brigham and Women's Hospital, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Ahmed Hosny: Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Boston, MA, USA; Department of Radiation Oncology, Brigham and Women's Hospital, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Raymond H Mak: Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Boston, MA, USA; Radiology and Nuclear Medicine, CARIM & GROW, Maastricht University, Maastricht, The Netherlands
- Nicolai J Birkbak: Department of Molecular Medicine, Aarhus University Hospital, 8200 Aarhus, Denmark; Department of Clinical Medicine, Aarhus University, 8200 Aarhus, Denmark
- Hugo JWL Aerts: Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Boston, MA, USA; Radiology and Nuclear Medicine, CARIM & GROW, Maastricht University, Maastricht, The Netherlands; Department of Radiation Oncology and Department of Radiology, Brigham and Women's Hospital, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
26. Yahyatabar M, Jouvet P, Cheriet F. Joint classification and segmentation for an interpretable diagnosis of acute respiratory distress syndrome from chest x-rays. J Med Imaging (Bellingham) 2023; 10:054504. PMID: 37854097; PMCID: PMC10581023; DOI: 10.1117/1.jmi.10.5.054504.
Abstract
Purpose Acute respiratory distress syndrome (ARDS) is a life-threatening condition that can cause a dramatic drop in blood oxygen levels due to widespread lung inflammation. Chest radiography is widely used as a primary modality to detect ARDS due to its crucial role in diagnosing the syndrome, and the x-ray images can be obtained promptly. However, despite the extensive literature on chest x-ray (CXR) image analysis, there is limited research on ARDS diagnosis due to the scarcity of ARDS-labeled datasets. Additionally, many machine learning-based approaches achieve high performance in pulmonary disease diagnosis, but their decisions are often not easily interpretable, which can hinder their clinical acceptance. This work aims to develop a method for detecting signs of ARDS in CXR images that can be clinically interpretable. Approach To achieve this goal, an ARDS-labeled dataset of chest radiography images is gathered and annotated for training and evaluation of the proposed approach. The proposed deep classification-segmentation model, Dense-Ynet, provides an interpretable framework for automatically diagnosing ARDS in CXR images. The model takes advantage of lung segmentation in diagnosing ARDS. By definition, ARDS causes bilateral diffuse infiltrates throughout the lungs. To consider the local involvement of lung areas, each lung is divided into upper and lower halves, and our model classifies the resulting lung quadrants. Results The quadrant-based classification strategy yields an area under the receiver operating characteristic curve of 95.1% (95% CI 93.5 to 96.1), which provides a reference for the model's predictions. In terms of segmentation, the model accurately identifies lung regions in CXR images even when lung boundaries are unclear in abnormal images. Conclusions This study provides an interpretable decision system for diagnosing ARDS by following the definition used by clinicians for the diagnosis of ARDS from CXR images.
Affiliation(s)
- Mohammad Yahyatabar: Polytechnique Montréal, Department of Computer and Software Engineering, Montreal, Quebec, Canada
- Philippe Jouvet: University of Montréal, Department of Pediatrics, Faculty of Medicine, Montréal, Quebec, Canada
- Farida Cheriet: Polytechnique Montréal, Department of Computer and Software Engineering, Montreal, Quebec, Canada
27. Tan TF, Dai P, Zhang X, Jin L, Poh S, Hong D, Lim J, Lim G, Teo ZL, Liu N, Ting DSW. Explainable artificial intelligence in ophthalmology. Curr Opin Ophthalmol 2023; 34:422-430. PMID: 37527200; DOI: 10.1097/icu.0000000000000983.
Abstract
PURPOSE OF REVIEW Despite the growing scope of artificial intelligence (AI) and deep learning (DL) applications in the field of ophthalmology, most have yet to reach clinical adoption. Beyond model performance metrics, there has been an increasing emphasis on the need for explainability of proposed DL models. RECENT FINDINGS Several explainable AI (XAI) methods have been proposed and increasingly applied in ophthalmological DL applications, predominantly in medical imaging analysis tasks. SUMMARY We provide an overview of the key concepts and categorize examples of commonly employed XAI methods. Specific to ophthalmology, we explore XAI from a clinical perspective: enhancing end-user trust, assisting clinical management, and uncovering new insights. Finally, we discuss its limitations and future directions for strengthening XAI for application to clinical practice.
Affiliation(s)
- Ting Fang Tan: Artificial Intelligence and Digital Innovation Research Group; Singapore National Eye Centre, Singapore General Hospital
- Peilun Dai: Institute of High Performance Computing, A*STAR
- Xiaoman Zhang: Duke-National University of Singapore Medical School, Singapore
- Liyuan Jin: Artificial Intelligence and Digital Innovation Research Group; Duke-National University of Singapore Medical School, Singapore
- Stanley Poh: Singapore National Eye Centre, Singapore General Hospital
- Dylan Hong: Artificial Intelligence and Digital Innovation Research Group
- Joshua Lim: Singapore National Eye Centre, Singapore General Hospital
- Gilbert Lim: Artificial Intelligence and Digital Innovation Research Group
- Zhen Ling Teo: Artificial Intelligence and Digital Innovation Research Group; Singapore National Eye Centre, Singapore General Hospital
- Nan Liu: Artificial Intelligence and Digital Innovation Research Group; Duke-National University of Singapore Medical School, Singapore
- Daniel Shu Wei Ting: Artificial Intelligence and Digital Innovation Research Group; Singapore National Eye Centre, Singapore General Hospital; Duke-National University of Singapore Medical School, Singapore; Byers Eye Institute, Stanford University, Stanford, California, USA
28. Crincoli E, Sacconi R, Querques G. Reshaping the Use of Artificial Intelligence in Ophthalmology: Sometimes You Need to Go Backwards. Retina 2023; 43:1429-1432. PMID: 37343295; DOI: 10.1097/iae.0000000000003878.
Affiliation(s)
- Emanuele Crincoli: Department of Ophthalmology, University Vita-Salute, IRCCS San Raffaele Scientific Institute, Milan, Italy
29. Storås AM, Andersen OE, Lockhart S, Thielemann R, Gnesin F, Thambawita V, Hicks SA, Kanters JK, Strümke I, Halvorsen P, Riegler MA. Usefulness of Heat Map Explanations for Deep-Learning-Based Electrocardiogram Analysis. Diagnostics (Basel) 2023; 13:2345. PMID: 37510089; PMCID: PMC10378376; DOI: 10.3390/diagnostics13142345.
Abstract
Deep neural networks are complex machine learning models that have shown promising results in analyzing high-dimensional data such as those collected from medical examinations. Such models have the potential to provide fast and accurate medical diagnoses. However, the high complexity makes deep neural networks and their predictions difficult to understand. Providing model explanations can be a way of increasing the understanding of "black box" models and building trust. In this work, we applied transfer learning to develop a deep neural network to predict sex from electrocardiograms. Using the visual explanation method Grad-CAM, heat maps were generated from the model in order to understand how it makes predictions. To evaluate the usefulness of the heat maps and determine if the heat maps identified electrocardiogram features that could be recognized to discriminate sex, medical doctors provided feedback. Based on the feedback, we concluded that, in our setting, this mode of explainable artificial intelligence does not provide meaningful information to medical doctors and is not useful in the clinic. Our results indicate that improved explanation techniques that are tailored to medical data should be developed before deep neural networks can be applied in the clinic for diagnostic purposes.
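Grad-CAM, the heat-map method evaluated in this study, reduces to a channel-weighted sum of convolutional feature maps followed by a ReLU. A minimal NumPy sketch of that computation (shapes and random inputs are illustrative; a real implementation would pull activations and gradients from the trained network):

```python
import numpy as np

def grad_cam(activations, gradients):
    """Minimal Grad-CAM: weight each feature map by the spatial mean of
    its gradient, combine, ReLU, and normalize to [0, 1] for display.

    activations: (C, H, W) feature maps from the chosen conv layer
    gradients:   (C, H, W) gradients of the target score w.r.t. them
    """
    weights = gradients.mean(axis=(1, 2))             # one importance weight per channel
    cam = np.tensordot(weights, activations, axes=1)  # weighted sum -> (H, W)
    cam = np.maximum(cam, 0.0)                        # ReLU keeps positive evidence only
    if cam.max() > 0:
        cam = cam / cam.max()                         # rescale for heat-map display
    return cam

rng = np.random.default_rng(0)
acts = rng.random((8, 4, 4))      # stand-in activations
grads = rng.random((8, 4, 4))     # stand-in gradients
heatmap = grad_cam(acts, grads)   # (4, 4) map in [0, 1]
```

The resulting map is what clinicians were shown here; the study's finding is that such maps, even when computed correctly, did not convey clinically meaningful information.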
Affiliation(s)
- Andrea M Storås: Department of Holistic Systems, Simula Metropolitan Center for Digital Engineering, 0167 Oslo, Norway; Department of Computer Science, Oslo Metropolitan University, 0130 Oslo, Norway
- Ole Emil Andersen: Department of Public Health, Aarhus University, 8000 Aarhus, Denmark; Steno Diabetes Center, Aarhus University, 8000 Aarhus, Denmark
- Sam Lockhart: Wellcome Trust-Medical Research Council Institute of Metabolic Science, University of Cambridge, Cambridge CB2 0QQ, UK
- Roman Thielemann: Novo Nordisk Foundation Center for Basic Metabolic Research, University of Copenhagen, 2200 Copenhagen, Denmark
- Filip Gnesin: Department of Cardiology, North Zealand Hospital, 3400 Hillerød, Denmark
- Vajira Thambawita: Department of Holistic Systems, Simula Metropolitan Center for Digital Engineering, 0167 Oslo, Norway
- Steven A Hicks: Department of Holistic Systems, Simula Metropolitan Center for Digital Engineering, 0167 Oslo, Norway
- Jørgen K Kanters: Department of Biomedical Sciences, University of Copenhagen, 2200 Copenhagen, Denmark
- Inga Strümke: Department of Computer Science, Norwegian University of Science and Technology, 7491 Trondheim, Norway
- Pål Halvorsen: Department of Holistic Systems, Simula Metropolitan Center for Digital Engineering, 0167 Oslo, Norway; Department of Computer Science, Oslo Metropolitan University, 0130 Oslo, Norway
- Michael A Riegler: Department of Holistic Systems, Simula Metropolitan Center for Digital Engineering, 0167 Oslo, Norway; Department of Computer Science, UiT The Arctic University of Norway, 9037 Tromsø, Norway
30. Decoodt P, Liang TJ, Bopardikar S, Santhanam H, Eyembe A, Garcia-Zapirain B, Sierra-Sosa D. Hybrid Classical-Quantum Transfer Learning for Cardiomegaly Detection in Chest X-rays. J Imaging 2023; 9:128. PMID: 37504805; PMCID: PMC10381726; DOI: 10.3390/jimaging9070128.
Abstract
Cardiovascular diseases are among the major health problems that are likely to benefit from promising developments in quantum machine learning for medical imaging. The chest X-ray (CXR), a widely used modality, can reveal cardiomegaly, even when performed primarily for a non-cardiological indication. Based on pre-trained DenseNet-121, we designed hybrid classical-quantum (CQ) transfer learning models to detect cardiomegaly in CXRs. Using Qiskit and PennyLane, we integrated a parameterized quantum circuit into a classic network implemented in PyTorch. We mined the CheXpert public repository to create a balanced dataset with 2436 posteroanterior CXRs from different patients distributed between cardiomegaly and the control. Using k-fold cross-validation, the CQ models were trained using a state vector simulator. The normalized global effective dimension allowed us to compare the trainability of the CQ models run on Qiskit. For prediction, ROC AUC scores up to 0.93 and accuracies up to 0.87 were achieved for several CQ models, rivaling the classical-classical (CC) model used as a reference. A trustworthy Grad-CAM++ heatmap with a hot zone covering the heart was visualized more often with the CQ option than with the CC option (94% vs. 61%, p < 0.001), which may boost the rate of acceptance by health professionals.
Affiliation(s)
- Pierre Decoodt: Cardiologie, Centre Hospitalo-Universitaire Brugmann, Faculté de Médecine, Université Libre de Bruxelles, Place Van Gehuchten 4, 1020 Brussels, Belgium
- Tan Jun Liang: School of Computer Science, Digital Health and Innovations Impact Lab, Taylor's University, Subang Jaya 47500, Selangor, Malaysia; qBraid Co., Chicago, IL 60615, USA
- Soham Bopardikar: Department of Electronics and Telecommunication Engineering, College of Engineering Pune, Pune 411005, India
- Hemavathi Santhanam: Faculty of Graduate Studies and Research, Saint Mary's University, 923 Robie Street, Halifax, NS B3H 3C3, Canada
- Alfaxad Eyembe: Faculty of Engineering, Kyoto University of Advanced Science (KUAS), Ukyo-ku, Kyoto 615-8577, Japan
- Daniel Sierra-Sosa: Computer Science and Information Technologies Department, Hood College, 401 Rosemont Ave., Frederick, MD 21702, USA
31. Bradshaw TJ, McCradden MD, Jha AK, Dutta J, Saboury B, Siegel EL, Rahmim A. Artificial Intelligence Algorithms Need to Be Explainable-or Do They? J Nucl Med 2023; 64:976-977. PMID: 37116913; PMCID: PMC10885777; DOI: 10.2967/jnumed.122.264949.
Affiliation(s)
- Abhinav K Jha: Washington University in St. Louis, St. Louis, Missouri
- Joyita Dutta: University of Massachusetts Amherst, Amherst, Massachusetts
- Eliot L Siegel: University of Maryland School of Medicine, Baltimore, Maryland
- Arman Rahmim: University of British Columbia, Vancouver, British Columbia, Canada
32
Dolezal JM, Wolk R, Hieromnimon HM, Howard FM, Srisuwananukorn A, Karpeyev D, Ramesh S, Kochanny S, Kwon JW, Agni M, Simon RC, Desai C, Kherallah R, Nguyen TD, Schulte JJ, Cole K, Khramtsova G, Garassino MC, Husain AN, Li H, Grossman R, Cipriani NA, Pearson AT. Deep learning generates synthetic cancer histology for explainability and education. NPJ Precis Oncol 2023; 7:49. [PMID: 37248379] [PMCID: PMC10227067] [DOI: 10.1038/s41698-023-00399-4]
Abstract
Artificial intelligence methods including deep neural networks (DNN) can provide rapid molecular classification of tumors from routine histology with accuracy that matches or exceeds human pathologists. Discerning how neural networks make their predictions remains a significant challenge, but explainability tools help provide insights into what models have learned when corresponding histologic features are poorly defined. Here, we present a method for improving explainability of DNN models using synthetic histology generated by a conditional generative adversarial network (cGAN). We show that cGANs generate high-quality synthetic histology images that can be leveraged for explaining DNN models trained to classify molecularly-subtyped tumors, exposing histologic features associated with molecular state. Fine-tuning synthetic histology through class and layer blending illustrates nuanced morphologic differences between tumor subtypes. Finally, we demonstrate the use of synthetic histology for augmenting pathologist-in-training education, showing that these intuitive visualizations can reinforce and improve understanding of histologic manifestations of tumor biology.
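The class blending described above conditions the generator on an interpolation between two subtype representations. A minimal sketch of that interpolation step, assuming linear blending of class-condition embedding vectors (names and shapes are illustrative; the paper's cGAN conditioning details are not reproduced here):

```python
def blend_class_embeddings(e_a, e_b, t):
    """Linearly interpolate two class-condition embedding vectors.

    t in [0, 1]: 0 gives pure class A, 1 pure class B; intermediate
    values condition the generator on a morphologic blend of the two.
    """
    return [(1.0 - t) * a + t * b for a, b in zip(e_a, e_b)]

# Sweeping t from 0 to 1 traces a continuum between the two subtypes
# when the blended embedding is fed to the conditional generator.
print(blend_class_embeddings([1.0, 0.0], [0.0, 1.0], 0.5))  # [0.5, 0.5]
```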
Affiliation(s)
- James M Dolezal: Section of Hematology/Oncology, Department of Medicine, University of Chicago Medicine, Chicago, IL, USA
- Rachelle Wolk: Department of Pathology, University of Chicago Medicine, Chicago, IL, USA
- Hanna M Hieromnimon: Section of Hematology/Oncology, Department of Medicine, University of Chicago Medicine, Chicago, IL, USA
- Frederick M Howard: Section of Hematology/Oncology, Department of Medicine, University of Chicago Medicine, Chicago, IL, USA
- Siddhi Ramesh: Section of Hematology/Oncology, Department of Medicine, University of Chicago Medicine, Chicago, IL, USA
- Sara Kochanny: Section of Hematology/Oncology, Department of Medicine, University of Chicago Medicine, Chicago, IL, USA
- Jung Woo Kwon: Department of Pathology, University of Chicago Medicine, Chicago, IL, USA
- Meghana Agni: Department of Pathology, University of Chicago Medicine, Chicago, IL, USA
- Richard C Simon: Department of Pathology, University of Chicago Medicine, Chicago, IL, USA
- Chandni Desai: Department of Pathology, University of Chicago Medicine, Chicago, IL, USA
- Raghad Kherallah: Department of Pathology, University of Chicago Medicine, Chicago, IL, USA
- Tung D Nguyen: Department of Pathology, University of Chicago Medicine, Chicago, IL, USA
- Jefree J Schulte: Department of Pathology and Laboratory Medicine, University of Wisconsin at Madison, Madison, WI, USA
- Kimberly Cole: Department of Pathology, University of Chicago Medicine, Chicago, IL, USA
- Galina Khramtsova: Department of Pathology, University of Chicago Medicine, Chicago, IL, USA
- Marina Chiara Garassino: Section of Hematology/Oncology, Department of Medicine, University of Chicago Medicine, Chicago, IL, USA
- Aliya N Husain: Department of Pathology, University of Chicago Medicine, Chicago, IL, USA
- Huihua Li: Department of Pathology, University of Chicago Medicine, Chicago, IL, USA
- Robert Grossman: University of Chicago, Center for Translational Data Science, Chicago, IL, USA
- Nicole A Cipriani: Department of Pathology, University of Chicago Medicine, Chicago, IL, USA
- Alexander T Pearson: Section of Hematology/Oncology, Department of Medicine, University of Chicago Medicine, Chicago, IL, USA
33
Ozcan BB, Patel BK, Banerjee I, Dogan BE. Artificial Intelligence in Breast Imaging: Challenges of Integration Into Clinical Practice. J Breast Imaging 2023; 5:248-257. [PMID: 38416888] [DOI: 10.1093/jbi/wbad007]
Abstract
Artificial intelligence (AI) in breast imaging is a rapidly developing field with promising results. Despite the large number of recent publications in this field, unanswered questions have limited the implementation of AI into the daily clinical practice of breast radiologists. This paper provides an overview of the key limitations of AI in breast imaging, including the small number of FDA-approved algorithms, the scarcity of annotated data sets with histologic ground truth, concerns surrounding data privacy, security, algorithm transparency, and bias, and broader ethical issues. Ultimately, successful implementation of AI into clinical care will require thoughtful action to address these challenges, along with transparency and the sharing of AI implementation workflows, limitations, and performance metrics within the breast imaging community and with other end users.
Affiliation(s)
- B Bersu Ozcan: The University of Texas Southwestern Medical Center, Department of Radiology, Dallas, TX, USA
- Imon Banerjee: Mayo Clinic, Department of Radiology, Scottsdale, AZ, USA
- Basak E Dogan: The University of Texas Southwestern Medical Center, Department of Radiology, Dallas, TX, USA
34
Yan J, Zhao J, Cai Y, Wang S, Qiu X, Yao X, Tian Y, Zhu Y, Cao W, Zhang X. Improving multi-scale detection layers in the deep learning network for wheat spike detection based on interpretive analysis. Plant Methods 2023; 19:46. [PMID: 37179312] [PMCID: PMC10183117] [DOI: 10.1186/s13007-023-01020-2]
Abstract
BACKGROUND Detecting and counting wheat spikes is essential for predicting and measuring wheat yield. However, current wheat spike detection research often applies new network structures directly; few studies incorporate prior knowledge of wheat spike size characteristics to design a suitable detection model, and it remains unclear whether the complex detection layers of the network play their intended role. RESULTS This study proposes an interpretive analysis method for quantitatively evaluating the role of the three-scale detection layers in a deep learning-based wheat spike detection model. The attention scores in each detection layer of the YOLOv5 network are calculated using the Gradient-weighted Class Activation Mapping (Grad-CAM) algorithm, which compares the prior labeled wheat spike bounding boxes with the attention areas of the network. By refining the multi-scale detection layers using the attention scores, a better wheat spike detection network is obtained. Experiments on the Global Wheat Head Detection (GWHD) dataset show that the large-scale detection layer performs poorly, while the medium-scale detection layer performs best among the three. Consequently, the large-scale detection layer is removed, a micro-scale detection layer is added, and the feature extraction ability of the medium-scale detection layer is enhanced. The refined model increases detection accuracy and reduces network complexity by decreasing the number of network parameters. CONCLUSION The proposed interpretive analysis method evaluates the contribution of different detection layers in a wheat spike detection network and provides a principled network improvement scheme. The findings offer a useful reference for future applications of deep network refinement in this field.
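An attention score comparing labeled spike boxes with a layer's Grad-CAM map can be sketched as the share of heatmap mass falling inside the boxes. This is an illustrative formulation under that assumption; the paper's exact scoring may differ:

```python
def attention_score(heatmap, boxes):
    """Share of a detection layer's Grad-CAM mass inside labeled boxes.

    heatmap: 2-D list of non-negative Grad-CAM values for one layer.
    boxes:   list of (x1, y1, x2, y2) spike bounding boxes in pixel coords.
    """
    h, w = len(heatmap), len(heatmap[0])
    inside = [[False] * w for _ in range(h)]
    for x1, y1, x2, y2 in boxes:
        for y in range(y1, y2):
            for x in range(x1, x2):
                inside[y][x] = True
    total = sum(sum(row) for row in heatmap)
    in_box = sum(heatmap[y][x]
                 for y in range(h) for x in range(w) if inside[y][x])
    return in_box / total if total else 0.0

# Toy map: all attention inside the labeled box scores 1.0.
heatmap = [[1.0 if y < 5 and x < 5 else 0.0 for x in range(10)]
           for y in range(10)]
print(attention_score(heatmap, [(0, 0, 5, 5)]))  # 1.0
```

Computed per detection layer, a low score would flag a layer (here, the large-scale one) whose attention falls mostly outside the labeled spikes.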
Affiliation(s)
- Jiawei Yan: National Engineering and Technology Center for Information Agriculture, Nanjing Agricultural University, Nanjing, 210095, China; Key Laboratory for Crop System Analysis and Decision Making, Ministry of Agriculture and Rural Affairs, Nanjing, 210095, China
- Jianqing Zhao: National Engineering and Technology Center for Information Agriculture, Nanjing Agricultural University, Nanjing, 210095, China; Key Laboratory for Crop System Analysis and Decision Making, Ministry of Agriculture and Rural Affairs, Nanjing, 210095, China
- Yucheng Cai: National Engineering and Technology Center for Information Agriculture, Nanjing Agricultural University, Nanjing, 210095, China; Key Laboratory for Crop System Analysis and Decision Making, Ministry of Agriculture and Rural Affairs, Nanjing, 210095, China
- Suwan Wang: National Engineering and Technology Center for Information Agriculture, Nanjing Agricultural University, Nanjing, 210095, China; Key Laboratory for Crop System Analysis and Decision Making, Ministry of Agriculture and Rural Affairs, Nanjing, 210095, China
- Xiaolei Qiu: National Engineering and Technology Center for Information Agriculture, Nanjing Agricultural University, Nanjing, 210095, China; Key Laboratory for Crop System Analysis and Decision Making, Ministry of Agriculture and Rural Affairs, Nanjing, 210095, China
- Xia Yao: National Engineering and Technology Center for Information Agriculture, Nanjing Agricultural University, Nanjing, 210095, China; Key Laboratory for Crop System Analysis and Decision Making, Ministry of Agriculture and Rural Affairs, Nanjing, 210095, China; Jiangsu Key Laboratory for Information Agriculture, Nanjing, 210095, China
- Yongchao Tian: National Engineering and Technology Center for Information Agriculture, Nanjing Agricultural University, Nanjing, 210095, China; Jiangsu Collaborative Innovation Center for Modern Crop Production, Nanjing, 210095, China
- Yan Zhu: National Engineering and Technology Center for Information Agriculture, Nanjing Agricultural University, Nanjing, 210095, China; Key Laboratory for Crop System Analysis and Decision Making, Ministry of Agriculture and Rural Affairs, Nanjing, 210095, China
- Weixing Cao: National Engineering and Technology Center for Information Agriculture, Nanjing Agricultural University, Nanjing, 210095, China; Key Laboratory for Crop System Analysis and Decision Making, Ministry of Agriculture and Rural Affairs, Nanjing, 210095, China
- Xiaohu Zhang: National Engineering and Technology Center for Information Agriculture, Nanjing Agricultural University, Nanjing, 210095, China; Key Laboratory for Crop System Analysis and Decision Making, Ministry of Agriculture and Rural Affairs, Nanjing, 210095, China; Jiangsu Collaborative Innovation Center for Modern Crop Production, Nanjing, 210095, China
35
Affective Design Analysis of Explainable Artificial Intelligence (XAI): A User-Centric Perspective. Informatics 2023. [DOI: 10.3390/informatics10010032]
Abstract
Explainable Artificial Intelligence (XAI) has successfully addressed the black-box paradox of Artificial Intelligence (AI). By providing human-level insights into AI, it allows users to understand its inner workings even with limited knowledge of the underlying machine learning algorithms. As a result, the field grew and development flourished. However, concerns have been raised that current techniques are limited in whom they serve and how their effect can be leveraged. Most XAI techniques to date have been designed by developers. Though needed and valuable, XAI is even more critical for end users, since transparency bears directly on trust and adoption. This study aims to understand and conceptualize end-user-centric XAI to fill this gap in end-user understanding. Building on recent related findings, it focuses on design conceptualization and affective analysis. Data from 202 participants were collected through an online survey to identify the vital XAI design components, and testbed experimentation explored the affective and trust changes per design configuration. The results show that affect is a viable route for trust calibration in XAI. In terms of design, the explanation form, communication style, and presence of supplementary information are the components users look for in an effective XAI. Lastly, anxiety about AI, incidental emotion, perceived AI reliability, and experience using the system significantly moderate the trust calibration process for end users.
36
Chen JS, Baxter SL, van den Brandt A, Lieu A, Camp AS, Do JL, Welsbie DS, Moghimi S, Christopher M, Weinreb RN, Zangwill LM. Usability and Clinician Acceptance of a Deep Learning-Based Clinical Decision Support Tool for Predicting Glaucomatous Visual Field Progression. J Glaucoma 2023; 32:151-158. [PMID: 36877820] [PMCID: PMC9996451] [DOI: 10.1097/ijg.0000000000002163]
Abstract
PRCIS We updated a clinical decision support tool that integrates predicted visual field (VF) metrics from an artificial intelligence (AI) model and assessed clinician perceptions of the predicted VF metric in this usability study. PURPOSE To evaluate clinician perceptions of a prototyped clinical decision support (CDS) tool that integrates VF metric predictions from AI models. METHODS Ten ophthalmologists and optometrists from the University of California San Diego reviewed 6 cases from 6 patients (11 eyes in total) uploaded to a CDS tool ("GLANCE", designed to help clinicians assess patients "at a glance"). For each case, clinicians answered questions about management recommendations and attitudes toward GLANCE, particularly regarding the utility and trustworthiness of the AI-predicted VF metrics and willingness to decrease VF testing frequency. MAIN OUTCOMES AND MEASURES Mean counts of management recommendations and mean Likert scale scores were calculated to assess overall management trends and attitudes toward the CDS tool for each case. In addition, system usability scale scores were calculated. RESULTS The mean Likert scores for trust in and utility of the predicted VF metric and for clinician willingness to decrease VF testing frequency were 3.27, 3.42, and 2.64, respectively (1 = strongly disagree, 5 = strongly agree). When stratified by glaucoma severity, all mean Likert scores decreased as severity increased. The system usability scale score across all responders was 66.1 ± 16.0 (43rd percentile). CONCLUSIONS A CDS tool can present AI model outputs in a useful, trustworthy manner that clinicians are generally willing to integrate into their clinical decision-making. Future work is needed to understand how best to develop explainable and trustworthy CDS tools integrating AI before clinical deployment.
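The system usability scale (SUS) score reported above follows the standard SUS computation: odd-numbered items contribute (response - 1), even-numbered items contribute (5 - response), and the sum is scaled by 2.5 to give a 0-100 score. A short sketch of that scoring (the sample responses are illustrative, not the study's data):

```python
def sus_score(responses):
    """System Usability Scale score from ten 1-5 Likert responses.

    Standard SUS scoring: odd-numbered items contribute (response - 1),
    even-numbered items contribute (5 - response); the sum is multiplied
    by 2.5 to yield a 0-100 score.
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 item responses")
    total = sum(r - 1 if i % 2 == 0 else 5 - r
                for i, r in enumerate(responses))  # i=0 is item 1 (odd)
    return total * 2.5

print(sus_score([3] * 10))    # neutral answers -> 50.0
print(sus_score([5, 1] * 5))  # best possible answers -> 100.0
```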
Affiliation(s)
- Jimmy S Chen: Division of Ophthalmology Informatics and Data Science, Viterbi Family Department of Ophthalmology and Shiley Eye Institute; UCSD Health Department of Biomedical Informatics, University of California San Diego, La Jolla, CA
- Sally L Baxter: Division of Ophthalmology Informatics and Data Science, Viterbi Family Department of Ophthalmology and Shiley Eye Institute; UCSD Health Department of Biomedical Informatics, University of California San Diego, La Jolla, CA
- Alexander Lieu: Division of Ophthalmology Informatics and Data Science, Viterbi Family Department of Ophthalmology and Shiley Eye Institute
- Andrew S Camp: Division of Ophthalmology Informatics and Data Science, Viterbi Family Department of Ophthalmology and Shiley Eye Institute
- Jiun L Do: Division of Ophthalmology Informatics and Data Science, Viterbi Family Department of Ophthalmology and Shiley Eye Institute
- Derek S Welsbie: Division of Ophthalmology Informatics and Data Science, Viterbi Family Department of Ophthalmology and Shiley Eye Institute
- Sasan Moghimi: Division of Ophthalmology Informatics and Data Science, Viterbi Family Department of Ophthalmology and Shiley Eye Institute
- Mark Christopher: Division of Ophthalmology Informatics and Data Science, Viterbi Family Department of Ophthalmology and Shiley Eye Institute
- Robert N Weinreb: Division of Ophthalmology Informatics and Data Science, Viterbi Family Department of Ophthalmology and Shiley Eye Institute
- Linda M Zangwill: Division of Ophthalmology Informatics and Data Science, Viterbi Family Department of Ophthalmology and Shiley Eye Institute
37
Bakrania A, Joshi N, Zhao X, Zheng G, Bhat M. Artificial intelligence in liver cancers: Decoding the impact of machine learning models in clinical diagnosis of primary liver cancers and liver cancer metastases. Pharmacol Res 2023; 189:106706. [PMID: 36813095] [DOI: 10.1016/j.phrs.2023.106706]
Abstract
Liver cancers are the fourth leading cause of cancer-related mortality worldwide. In the past decade, breakthroughs in the field of artificial intelligence (AI) have inspired development of algorithms in the cancer setting. A growing body of recent studies have evaluated machine learning (ML) and deep learning (DL) algorithms for pre-screening, diagnosis and management of liver cancer patients through diagnostic image analysis, biomarker discovery and predicting personalized clinical outcomes. Despite the promise of these early AI tools, there is a significant need to explain the 'black box' of AI and work towards deployment to enable ultimate clinical translatability. Certain emerging fields such as RNA nanomedicine for targeted liver cancer therapy may also benefit from application of AI, specifically in nano-formulation research and development given that they are still largely reliant on lengthy trial-and-error experiments. In this paper, we put forward the current landscape of AI in liver cancers along with the challenges of AI in liver cancer diagnosis and management. Finally, we have discussed the future perspectives of AI application in liver cancer and how a multidisciplinary approach using AI in nanomedicine could accelerate the transition of personalized liver cancer medicine from bench side to the clinic.
Affiliation(s)
- Anita Bakrania: Toronto General Hospital Research Institute, Toronto, ON, Canada; Ajmera Transplant Program, University Health Network, Toronto, ON, Canada; Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
- Xun Zhao: Toronto General Hospital Research Institute, Toronto, ON, Canada; Ajmera Transplant Program, University Health Network, Toronto, ON, Canada
- Gang Zheng: Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada; Institute of Biomedical Engineering, University of Toronto, Toronto, ON, Canada; Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada
- Mamatha Bhat: Toronto General Hospital Research Institute, Toronto, ON, Canada; Ajmera Transplant Program, University Health Network, Toronto, ON, Canada; Division of Gastroenterology, Department of Medicine, University Health Network and University of Toronto, Toronto, ON, Canada; Department of Medical Sciences, Toronto, ON, Canada
38
A Web-Based Platform for the Automatic Stratification of ARDS Severity. Diagnostics (Basel) 2023; 13:933. [PMID: 36900077] [PMCID: PMC10000955] [DOI: 10.3390/diagnostics13050933]
Abstract
Acute respiratory distress syndrome (ARDS), including severe pulmonary COVID infection, is associated with a high mortality rate. It is crucial to detect ARDS early, as a late diagnosis may lead to serious complications in treatment. One of the challenges in ARDS diagnosis is chest X-ray (CXR) interpretation. ARDS causes diffuse infiltrates through the lungs that must be identified using chest radiography. In this paper, we present a web-based platform leveraging artificial intelligence (AI) to automatically assess pediatric ARDS (PARDS) using CXR images. Our system computes a severity score to identify and grade ARDS in CXR images. Moreover, the platform provides an image highlighting the lung fields, which can be utilized for prospective AI-based systems. A deep learning (DL) approach is employed to analyze the input data. A novel DL model, named Dense-Ynet, is trained using a CXR dataset in which clinical specialists previously labelled the two halves (upper and lower) of each lung. The assessment results show that our platform achieves a recall rate of 95.25% and a precision of 88.02%. The web platform, named PARDS-CxR, assigns severity scores to input CXR images that are compatible with current definitions of ARDS and PARDS. Once it has undergone external validation, PARDS-CxR will serve as an essential component in a clinical AI framework for diagnosing ARDS.
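The recall (95.25%) and precision (88.02%) reported above follow the usual detection definitions, recall = TP / (TP + FN) and precision = TP / (TP + FP). A quick sketch of those formulas; the counts below are hypothetical, chosen only to produce rates of that order, since the paper reports rates rather than raw counts:

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from true-positive, false-positive,
    and false-negative detection counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical counts, for illustration only.
p, r = precision_recall(tp=881, fp=120, fn=44)
print(round(p, 4), round(r, 4))
```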
39
Saboury B, Bradshaw T, Boellaard R, Buvat I, Dutta J, Hatt M, Jha AK, Li Q, Liu C, McMeekin H, Morris MA, Scott PJH, Siegel E, Sunderland JJ, Pandit-Taskar N, Wahl RL, Zuehlsdorff S, Rahmim A. Artificial Intelligence in Nuclear Medicine: Opportunities, Challenges, and Responsibilities Toward a Trustworthy Ecosystem. J Nucl Med 2023; 64:188-196. [PMID: 36522184] [PMCID: PMC9902852] [DOI: 10.2967/jnumed.121.263703]
Abstract
Trustworthiness is a core tenet of medicine. The patient-physician relationship is evolving from a dyad to a broader ecosystem of health care. With the emergence of artificial intelligence (AI) in medicine, the elements of trust must be revisited. We envision a road map for the establishment of trustworthy AI ecosystems in nuclear medicine. In this report, AI is contextualized in the history of technologic revolutions. Opportunities for AI applications in nuclear medicine related to diagnosis, therapy, and workflow efficiency, as well as emerging challenges and critical responsibilities, are discussed. Establishing and maintaining leadership in AI require a concerted effort to promote the rational and safe deployment of this innovative technology by engaging patients, nuclear medicine physicians, scientists, technologists, and referring providers, among other stakeholders, while protecting our patients and society. This strategic plan was prepared by the AI task force of the Society of Nuclear Medicine and Molecular Imaging.
Affiliation(s)
- Babak Saboury: Department of Radiology and Imaging Sciences, Clinical Center, National Institutes of Health, Bethesda, Maryland
- Tyler Bradshaw: Department of Radiology, University of Wisconsin-Madison, Madison, Wisconsin
- Ronald Boellaard: Department of Radiology and Nuclear Medicine, Cancer Centre Amsterdam, Amsterdam University Medical Centres, Amsterdam, The Netherlands
- Irène Buvat: Institut Curie, Université PSL, INSERM, Université Paris-Saclay, Orsay, France
- Joyita Dutta: Department of Electrical and Computer Engineering, University of Massachusetts Lowell, Lowell, Massachusetts
- Mathieu Hatt: LaTIM, INSERM, UMR 1101, University of Brest, Brest, France
- Abhinav K Jha: Department of Biomedical Engineering and Mallinckrodt Institute of Radiology, Washington University, St. Louis, Missouri
- Quanzheng Li: Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts
- Chi Liu: Department of Radiology and Biomedical Imaging, Yale University, New Haven, Connecticut
- Helena McMeekin: Department of Clinical Physics, Barts Health NHS Trust, London, United Kingdom
- Michael A Morris: Department of Radiology and Imaging Sciences, Clinical Center, National Institutes of Health, Bethesda, Maryland
- Peter J H Scott: Department of Radiology, University of Michigan Medical School, Ann Arbor, Michigan
- Eliot Siegel: Department of Radiology and Nuclear Medicine, University of Maryland Medical Center, Baltimore, Maryland
- John J Sunderland: Departments of Radiology and Physics, University of Iowa, Iowa City, Iowa
- Neeta Pandit-Taskar: Department of Radiology, Memorial Sloan Kettering Cancer Center, New York, New York
- Richard L Wahl: Mallinckrodt Institute of Radiology, Washington University, St. Louis, Missouri
- Sven Zuehlsdorff: Siemens Medical Solutions USA, Inc., Hoffman Estates, Illinois
- Arman Rahmim: Departments of Radiology and Physics, University of British Columbia, Vancouver, British Columbia, Canada
40
Hadjiiski L, Cha K, Chan HP, Drukker K, Morra L, Näppi JJ, Sahiner B, Yoshida H, Chen Q, Deserno TM, Greenspan H, Huisman H, Huo Z, Mazurchuk R, Petrick N, Regge D, Samala R, Summers RM, Suzuki K, Tourassi G, Vergara D, Armato SG. AAPM task group report 273: Recommendations on best practices for AI and machine learning for computer-aided diagnosis in medical imaging. Med Phys 2023; 50:e1-e24. [PMID: 36565447] [DOI: 10.1002/mp.16188]
Abstract
Rapid advances in artificial intelligence (AI) and machine learning, and specifically in deep learning (DL) techniques, have enabled broad application of these methods in health care. The promise of the DL approach has spurred further interest in computer-aided diagnosis (CAD) development and applications using both "traditional" machine learning methods and newer DL-based methods. We use the term CAD-AI to refer to this expanded clinical decision support environment that uses traditional and DL-based AI methods. Numerous studies have been published to date on the development of machine learning tools for computer-aided, or AI-assisted, clinical tasks. However, most of these machine learning models are not ready for clinical deployment. It is of paramount importance to ensure that a clinical decision support tool undergoes proper training and rigorous validation of its generalizability and robustness before adoption for patient care in the clinic. To address these important issues, the American Association of Physicists in Medicine (AAPM) Computer-Aided Image Analysis Subcommittee (CADSC) is charged, in part, to develop recommendations on practices and standards for the development and performance assessment of computer-aided decision support systems. The committee has previously published two opinion papers on the evaluation of CAD systems and issues associated with user training and quality assurance of these systems in the clinic. With machine learning techniques continuing to evolve and CAD applications expanding to new stages of the patient care process, the current task group report considers the broader issues common to the development of most, if not all, CAD-AI applications and their translation from the bench to the clinic. The goal is to bring attention to the proper training and validation of machine learning algorithms that may improve their generalizability and reliability and accelerate the adoption of CAD-AI systems for clinical decision support.
Affiliation(s)
- Lubomir Hadjiiski: Department of Radiology, University of Michigan, Ann Arbor, Michigan, USA
- Kenny Cha: U.S. Food and Drug Administration, Silver Spring, Maryland, USA
- Heang-Ping Chan: Department of Radiology, University of Michigan, Ann Arbor, Michigan, USA
- Karen Drukker: Department of Radiology, University of Chicago, Chicago, Illinois, USA
- Lia Morra: Department of Control and Computer Engineering, Politecnico di Torino, Torino, Italy
- Janne J Näppi: 3D Imaging Research, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, USA
- Berkman Sahiner: U.S. Food and Drug Administration, Silver Spring, Maryland, USA
- Hiroyuki Yoshida: 3D Imaging Research, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, USA
- Quan Chen: Department of Radiation Medicine, University of Kentucky, Lexington, Kentucky, USA
- Thomas M Deserno: Peter L. Reichertz Institute for Medical Informatics of TU Braunschweig and Hannover Medical School, Braunschweig, Germany
- Hayit Greenspan: Department of Biomedical Engineering, Faculty of Engineering, Tel Aviv University, Tel Aviv, Israel; Department of Radiology, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Henkjan Huisman: Radboud Institute for Health Sciences, Radboud University Medical Center, Nijmegen, The Netherlands
- Zhimin Huo: Tencent America, Palo Alto, California, USA
- Richard Mazurchuk: Division of Cancer Prevention, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, USA
- Daniele Regge: Radiology Unit, Candiolo Cancer Institute, FPO-IRCCS, Candiolo, Italy; Department of Surgical Sciences, University of Turin, Turin, Italy
- Ravi Samala: U.S. Food and Drug Administration, Silver Spring, Maryland, USA
- Ronald M Summers: Radiology and Imaging Sciences, National Institutes of Health Clinical Center, Bethesda, Maryland, USA
- Kenji Suzuki: Institute of Innovative Research, Tokyo Institute of Technology, Tokyo, Japan
- Daniel Vergara: Department of Radiology, Yale New Haven Hospital, New Haven, Connecticut, USA
- Samuel G Armato: Department of Radiology, University of Chicago, Chicago, Illinois, USA
41
Chaddad A, Peng J, Xu J, Bouridane A. Survey of Explainable AI Techniques in Healthcare. Sensors (Basel) 2023; 23:634. [PMID: 36679430] [PMCID: PMC9862413] [DOI: 10.3390/s23020634]
Abstract
Artificial intelligence (AI) with deep learning models has been widely applied in numerous domains, including medical imaging and healthcare tasks. In the medical field, any judgment or decision is fraught with risk. A doctor will carefully judge whether a patient is sick before forming a reasonable explanation based on the patient's symptoms and/or an examination. Therefore, to be a viable and accepted tool, AI needs to mimic human judgment and interpretation skills. Specifically, explainable AI (XAI) aims to explain the information behind black-box deep learning models and reveal how their decisions are made. This paper provides a survey of the most recent XAI techniques used in healthcare and related medical imaging applications. We summarize and categorize the XAI types and highlight the algorithms used to increase interpretability in medical imaging topics. In addition, we focus on the challenging XAI problems in medical applications and provide guidelines for developing better interpretations of deep learning models using XAI concepts in medical image and text analysis. Finally, this survey provides directions to guide developers and researchers in prospective investigations on clinical topics, particularly applications involving medical imaging.
Affiliation(s)
- Ahmad Chaddad
- School of Artificial Intelligence, Guilin University of Electronic Technology, Jinji Road, Guilin 541004, China
- The Laboratory for Imagery Vision and Artificial Intelligence, Ecole de Technologie Superieure, 1100 Rue Notre Dame O, Montreal, QC H3C 1K3, Canada
- Jihao Peng
- School of Artificial Intelligence, Guilin University of Electronic Technology, Jinji Road, Guilin 541004, China
- Jian Xu
- School of Artificial Intelligence, Guilin University of Electronic Technology, Jinji Road, Guilin 541004, China
- Ahmed Bouridane
- Centre for Data Analytics and Cybersecurity, University of Sharjah, Sharjah 27272, United Arab Emirates
42
Marsh P, Radif D, Rajpurkar P, Wang Z, Hariton E, Ribeiro S, Simbulan R, Kaing A, Lin W, Rajah A, Rabara F, Lungren M, Demirci U, Ng A, Rosen M. A proof of concept for a deep learning system that can aid embryologists in predicting blastocyst survival after thaw. Sci Rep 2022; 12:21119. [PMID: 36477633 PMCID: PMC9729222 DOI: 10.1038/s41598-022-25062-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/24/2022] [Accepted: 11/24/2022] [Indexed: 12/12/2022] Open
Abstract
The ability to understand whether embryos survive the thaw process is crucial to transferring competent embryos that can lead to pregnancy. The objective of this study was to develop a proof-of-concept deep learning model capable of assisting embryologist assessment of the survival of thawed blastocysts prior to embryo transfer. A deep learning model was developed using 652 labeled time-lapse videos of freeze-thaw blastocysts. The model was evaluated against and alongside embryologists on a test set of 99 freeze-thaw blastocysts, using images obtained at 0.5 h increments from 0 to 3 h post-thaw. The model achieved AUCs of 0.869 (95% CI 0.789, 0.934) and 0.807 (95% CI 0.717, 0.886), and the embryologists achieved average AUCs of 0.829 (95% CI 0.747, 0.896) and 0.850 (95% CI 0.773, 0.908), at 2 h and 3 h, respectively. Combining embryologist predictions with model predictions resulted in a significant increase in AUC of 0.051 (95% CI 0.021, 0.083) at 2 h, and an equivalent increase in AUC of 0.010 (95% CI -0.018, 0.037) at 3 h. This study suggests that a deep learning model can predict in vitro blastocyst survival after thaw in aneuploid embryos. After correlation with clinical outcomes of transferred embryos, this model may help embryologists ascertain which embryos may have failed to survive the thaw process and increase the likelihood of pregnancy by preventing the transfer of non-viable embryos.
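The kind of human-plus-model ensembling reported in this abstract can be illustrated with a minimal sketch: compute ROC AUC via the Mann-Whitney statistic for a model, for a human rater, and for their simple average. The scores and labels below are invented for illustration and are not the study's data.

```python
# Illustrative sketch: AUC of two scorers and of their averaged predictions.

def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney U statistic."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Each positive-negative pair contributes 1 for a correct ranking,
    # 0.5 for a tie, and 0 otherwise.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = [1, 1, 1, 0, 0, 0]
model  = [0.9, 0.4, 0.8, 0.5, 0.2, 0.3]   # hypothetical model scores
human  = [0.6, 0.7, 0.5, 0.4, 0.6, 0.1]   # hypothetical rater scores
combo  = [(m + h) / 2 for m, h in zip(model, human)]

print(auc(model, labels), auc(human, labels), auc(combo, labels))
```

With these toy numbers the averaged scores rank every survivor above every non-survivor, so the combined AUC exceeds either scorer alone, mirroring the direction of the abstract's 2 h result.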
Affiliation(s)
- P. Marsh
- Center for Reproductive Health, Department of Medicine, University of California, San Francisco, USA
- D. Radif
- Department of Computer Science, Stanford University, Stanford, USA
- P. Rajpurkar
- Department of Computer Science, Stanford University, Stanford, USA
- Z. Wang
- Department of Computer Science, Stanford University, Stanford, USA
- E. Hariton
- Center for Reproductive Health, Department of Medicine, University of California, San Francisco, USA
- S. Ribeiro
- Center for Reproductive Health, Department of Medicine, University of California, San Francisco, USA
- R. Simbulan
- Center for Reproductive Health, Department of Medicine, University of California, San Francisco, USA
- A. Kaing
- Center for Reproductive Health, Department of Medicine, University of California, San Francisco, USA
- W. Lin
- Center for Reproductive Health, Department of Medicine, University of California, San Francisco, USA
- A. Rajah
- Center for Reproductive Health, Department of Medicine, University of California, San Francisco, USA
- F. Rabara
- Center for Reproductive Health, Department of Medicine, University of California, San Francisco, USA
- M. Lungren
- Center for Artificial Intelligence in Medicine & Imaging, Stanford University, Stanford, USA
- U. Demirci
- Canary Center for Cancer Early Detection, Stanford University, Stanford, USA
- A. Ng
- Department of Computer Science, Stanford University, Stanford, USA
- M. Rosen
- Center for Reproductive Health, Department of Medicine, University of California, San Francisco, USA
43
A Systematic Review on the Use of Explainability in Deep Learning Systems for Computer Aided Diagnosis in Radiology: Limited Use of Explainable AI? Eur J Radiol 2022; 157:110592. [DOI: 10.1016/j.ejrad.2022.110592] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2022] [Revised: 10/19/2022] [Accepted: 11/01/2022] [Indexed: 11/06/2022]
44
Khosravi B, Rouzrokh P, Faghani S, Moassefi M, Vahdati S, Mahmoudi E, Chalian H, Erickson BJ. Machine Learning and Deep Learning in Cardiothoracic Imaging: A Scoping Review. Diagnostics (Basel) 2022; 12:2512. [PMID: 36292201 PMCID: PMC9600598 DOI: 10.3390/diagnostics12102512] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2022] [Revised: 10/14/2022] [Accepted: 10/15/2022] [Indexed: 01/17/2023] Open
Abstract
Machine-learning (ML) and deep-learning (DL) algorithms are part of a group of modeling algorithms that grasp the hidden patterns in data through a training process, enabling them to extract complex information from the input data. In the past decade, these algorithms have been increasingly used for image processing, specifically in the medical domain. Cardiothoracic imaging was one of the early adopters of ML/DL research, and the COVID-19 pandemic brought further research focus to the feasibility and applications of ML/DL in cardiothoracic imaging. In this scoping review, we systematically searched the available peer-reviewed medical literature on cardiothoracic imaging and quantitatively extracted key data elements to form a broad picture of how ML/DL has been used in this rapidly evolving field. In this report, we provide insights on different applications of ML/DL and some nuances pertaining to this specific field of research. Finally, we provide general suggestions on how researchers can make their research more than just a proof of concept and move toward clinical adoption.
Affiliation(s)
- Bardia Khosravi
- Radiology Informatics Lab (RIL), Department of Radiology, Mayo Clinic, Rochester, MN 55905, USA
- Orthopedic Surgery Artificial Intelligence Laboratory (OSAIL), Department of Orthopedic Surgery, Mayo Clinic, Rochester, MN 55905, USA
- Pouria Rouzrokh
- Radiology Informatics Lab (RIL), Department of Radiology, Mayo Clinic, Rochester, MN 55905, USA
- Orthopedic Surgery Artificial Intelligence Laboratory (OSAIL), Department of Orthopedic Surgery, Mayo Clinic, Rochester, MN 55905, USA
- Shahriar Faghani
- Radiology Informatics Lab (RIL), Department of Radiology, Mayo Clinic, Rochester, MN 55905, USA
- Mana Moassefi
- Radiology Informatics Lab (RIL), Department of Radiology, Mayo Clinic, Rochester, MN 55905, USA
- Sanaz Vahdati
- Radiology Informatics Lab (RIL), Department of Radiology, Mayo Clinic, Rochester, MN 55905, USA
- Elham Mahmoudi
- Radiology Informatics Lab (RIL), Department of Radiology, Mayo Clinic, Rochester, MN 55905, USA
- Hamid Chalian
- Department of Radiology, Cardiothoracic Imaging, University of Washington, Seattle, WA 98195, USA
- Bradley J. Erickson
- Radiology Informatics Lab (RIL), Department of Radiology, Mayo Clinic, Rochester, MN 55905, USA
45
Benchmarking saliency methods for chest X-ray interpretation. Nat Mach Intell 2022. [DOI: 10.1038/s42256-022-00536-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Abstract
Saliency methods, which produce heat maps that highlight the areas of the medical image that influence model prediction, are often presented to clinicians as an aid in diagnostic decision-making. However, rigorous investigation of the accuracy and reliability of these strategies is necessary before they are integrated into the clinical setting. In this work, we quantitatively evaluate seven saliency methods, including Grad-CAM, across multiple neural network architectures using two evaluation metrics. We establish the first human benchmark for chest X-ray segmentation in a multilabel classification set-up, and examine under what clinical conditions saliency maps might be more prone to failure in localizing important pathologies compared with a human expert benchmark. We find that (1) while Grad-CAM generally localized pathologies better than the other evaluated saliency methods, all seven performed significantly worse compared with the human benchmark, (2) the gap in localization performance between Grad-CAM and the human benchmark was largest for pathologies that were smaller in size and had shapes that were more complex, and (3) model confidence was positively correlated with Grad-CAM localization performance. Our work demonstrates that several important limitations of saliency methods must be addressed before we can rely on them for deep learning explainability in medical imaging.
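The core evaluation idea in this abstract, scoring a saliency map against an expert segmentation, can be sketched as thresholding the heat map and computing intersection-over-union. The arrays below are toy flattened maps, not real Grad-CAM output or benchmark data, and the 0.5 threshold is an arbitrary choice for illustration.

```python
def iou(saliency, mask, thresh=0.5):
    """Binarize a flattened heat map and compare it with a binary expert mask."""
    binary = [s >= thresh for s in saliency]
    inter = sum(b and m for b, m in zip(binary, mask))  # pixels both mark
    union = sum(b or m for b, m in zip(binary, mask))   # pixels either marks
    return inter / union if union else 1.0

# Toy maps: the model's heat is high on pixels 0, 1 and 4,
# while the expert marked only pixels 0 and 1 as pathology.
heat = [0.9, 0.7, 0.2, 0.1, 0.6, 0.0]
mask = [1, 1, 0, 0, 0, 0]
print(iou(heat, mask))
```

A spurious high-saliency pixel (index 4 here) drags the score down even though the true pathology is fully covered, which is exactly the kind of localization failure such benchmarks quantify.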
46
Savjani RR, Lauria M, Bose S, Deng J, Yuan Y, Andrearczyk V. Automated Tumor Segmentation in Radiotherapy. Semin Radiat Oncol 2022; 32:319-329. [DOI: 10.1016/j.semradonc.2022.06.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022]
47
Choi W, Dahiya N, Nadeem S. CIRDataset: A Large-Scale Dataset for Clinically-Interpretable Lung Nodule Radiomics and Malignancy Prediction. Med Image Comput Comput Assist Interv 2022; 2022:13-22. [PMID: 36198166 PMCID: PMC9527770 DOI: 10.1007/978-3-031-16443-9_2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 05/28/2023]
Abstract
Spiculations/lobulations, sharp/curved spikes on the surface of lung nodules, are good predictors of lung cancer malignancy and hence, are routinely assessed and reported by radiologists as part of the standardized Lung-RADS clinical scoring criteria. Given the 3D geometry of the nodule and 2D slice-by-slice assessment by radiologists, manual spiculation/lobulation annotation is a tedious task and thus no public datasets exist to date for probing the importance of these clinically-reported features in the SOTA malignancy prediction algorithms. As part of this paper, we release a large-scale Clinically-Interpretable Radiomics Dataset, CIRDataset, containing 956 radiologist QA/QC'ed spiculation/lobulation annotations on segmented lung nodules from two public datasets, LIDC-IDRI (N=883) and LUNGx (N=73). We also present an end-to-end deep learning model based on multi-class Voxel2Mesh extension to segment nodules (while preserving spikes), classify spikes (sharp/spiculation and curved/lobulation), and perform malignancy prediction. Previous methods have performed malignancy prediction for LIDC and LUNGx datasets but without robust attribution to any clinically reported/actionable features (due to known hyperparameter sensitivity issues with general attribution schemes). With the release of this comprehensively-annotated CIRDataset and end-to-end deep learning baseline, we hope that malignancy prediction methods can validate their explanations, benchmark against our baseline, and provide clinically-actionable insights. Dataset, code, pretrained models, and docker containers are available at https://github.com/nadeemlab/CIR.
Affiliation(s)
- Wookjin Choi
- Department of Radiation Oncology, Thomas Jefferson University Hospital
- Navdeep Dahiya
- School of Electrical and Computer Engineering, Georgia Institute of Technology
- Saad Nadeem
- Department of Medical Physics, Memorial Sloan Kettering Cancer Center
48
Faghani S, Khosravi B, Zhang K, Moassefi M, Jagtap JM, Nugen F, Vahdati S, Kuanar SP, Rassoulinejad-Mousavi SM, Singh Y, Vera Garcia DV, Rouzrokh P, Erickson BJ. Mitigating Bias in Radiology Machine Learning: 3. Performance Metrics. Radiol Artif Intell 2022; 4:e220061. [PMID: 36204539 PMCID: PMC9530766 DOI: 10.1148/ryai.220061] [Citation(s) in RCA: 31] [Impact Index Per Article: 15.5] [Received: 03/25/2022] [Revised: 08/16/2022] [Accepted: 08/17/2022] [Indexed: 05/31/2023]
Abstract
The increasing use of machine learning (ML) algorithms in clinical settings raises concerns about bias in ML models. Bias can arise at any step of ML creation, including data handling, model development, and performance evaluation. Potential biases in the ML model can be minimized by implementing these steps correctly. This report focuses on performance evaluation and discusses model fitness, as well as a set of performance evaluation toolboxes: namely, performance metrics, performance interpretation maps, and uncertainty quantification. By discussing the strengths and limitations of each toolbox, our report highlights strategies and considerations to mitigate and detect biases during performance evaluations of radiology artificial intelligence models. Keywords: Segmentation, Diagnosis, Convolutional Neural Network (CNN) © RSNA, 2022.
49
Gozzi N, Giacomello E, Sollini M, Kirienko M, Ammirabile A, Lanzi P, Loiacono D, Chiti A. Image Embeddings Extracted from CNNs Outperform Other Transfer Learning Approaches in Classification of Chest Radiographs. Diagnostics (Basel) 2022; 12:2084. [PMID: 36140486 PMCID: PMC9497580 DOI: 10.3390/diagnostics12092084] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2022] [Revised: 08/21/2022] [Accepted: 08/24/2022] [Indexed: 11/16/2022] Open
Abstract
To identify the best transfer learning approach for the identification of the most frequent abnormalities on chest radiographs (CXRs), we used embeddings extracted from pretrained convolutional neural networks (CNNs). An explainable AI (XAI) model was applied to interpret black-box model predictions and assess its performance. Seven CNNs were trained on CheXpert. Three transfer learning approaches were thereafter applied to a local dataset. The classification results were ensembled using simple and entropy-weighted averaging. We applied Grad-CAM (an XAI model) to produce a saliency map. Grad-CAM maps were compared to manually extracted regions of interest, and the training time was recorded. The best transfer learning model was that which used image embeddings and random forest with simple averaging, with an average AUC of 0.856. Grad-CAM maps showed that the models focused on specific features of each CXR. CNNs pretrained on a large public dataset of medical images can be exploited as feature extractors for tasks of interest. The extracted image embeddings contain relevant information that can be used to train an additional classifier with satisfactory performance on an independent dataset, demonstrating it to be the optimal transfer learning strategy and overcoming the need for large private datasets, extensive computational resources, and long training times.
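The two ensembling rules named in this abstract, simple averaging and entropy-weighted averaging of per-model class probabilities, can be sketched as below. The paper's exact weighting scheme is not reproduced here; using one minus each model's normalized prediction entropy as its weight is an assumption, and the probability vectors are invented.

```python
import math

def entropy(p):
    """Shannon entropy of a probability vector (natural log)."""
    return -sum(x * math.log(x) for x in p if x > 0)

def simple_avg(preds):
    """Unweighted mean of the models' class-probability vectors."""
    return [sum(col) / len(preds) for col in zip(*preds)]

def entropy_weighted(preds):
    """Weight each model by 1 - normalized entropy, so confident
    (low-entropy) models contribute more, then renormalize."""
    n = len(preds[0])
    max_h = math.log(n)
    w = [1 - entropy(p) / max_h for p in preds]
    total = sum(w)
    return [sum(wi * p[k] for wi, p in zip(w, preds)) / total
            for k in range(n)]

preds = [[0.9, 0.1], [0.55, 0.45]]  # one confident model, one uncertain
print(simple_avg(preds), entropy_weighted(preds))
```

With these toy inputs the entropy-weighted ensemble leans toward the confident model, pushing the class-0 probability above the plain average.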
Affiliation(s)
- Noemi Gozzi
- IRCCS Humanitas Research Hospital, Via Manzoni 56, Rozzano, 20089 Milan, Italy
- Laboratory for Neuroengineering, Department of Health Sciences and Technology, Institute for Robotics and Intelligent Systems, ETH Zurich, 8092 Zurich, Switzerland
- Edoardo Giacomello
- Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Via Giuseppe Ponzio 34, 20133 Milan, Italy
- Martina Sollini
- IRCCS Humanitas Research Hospital, Via Manzoni 56, Rozzano, 20089 Milan, Italy
- Department of Biomedical Sciences, Humanitas University, Via Rita Levi Montalcini 4, Pieve Emanuele, 20090 Milan, Italy
- Correspondence: ; Tel.: +39-0282245614
- Margarita Kirienko
- Fondazione IRCCS Istituto Nazionale Tumori, Via G. Venezian 1, 20133 Milan, Italy
- Angela Ammirabile
- IRCCS Humanitas Research Hospital, Via Manzoni 56, Rozzano, 20089 Milan, Italy
- Department of Biomedical Sciences, Humanitas University, Via Rita Levi Montalcini 4, Pieve Emanuele, 20090 Milan, Italy
- Pierluca Lanzi
- Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Via Giuseppe Ponzio 34, 20133 Milan, Italy
- Daniele Loiacono
- Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Via Giuseppe Ponzio 34, 20133 Milan, Italy
- Arturo Chiti
- IRCCS Humanitas Research Hospital, Via Manzoni 56, Rozzano, 20089 Milan, Italy
- Department of Biomedical Sciences, Humanitas University, Via Rita Levi Montalcini 4, Pieve Emanuele, 20090 Milan, Italy
50
Blandford A, Abdi S, Aristidou A, Carmichael J, Cappellaro G, Hussain R, Balaskas K. Protocol for a qualitative study to explore acceptability, barriers and facilitators of the implementation of new teleophthalmology technologies between community optometry practices and hospital eye services. BMJ Open 2022; 12:e060810. [PMID: 35858730 PMCID: PMC9305899 DOI: 10.1136/bmjopen-2022-060810] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022] Open
Abstract
INTRODUCTION Novel teleophthalmology technologies have the potential to reduce unnecessary and inaccurate referrals between community optometry practices and hospital eye services and, as a result, improve patients' access to appropriate and timely eye care. However, little is known about the acceptability of, and the facilitators of and barriers to, the implementation of these technologies in real life. METHODS AND ANALYSIS A theoretically informed, qualitative study will explore patients' and healthcare professionals' perspectives on teleophthalmology and Artificial Intelligence Decision Support System models of care. A combination of situated observations in community optometry practices and hospital eye services, semistructured qualitative interviews with patients and healthcare professionals, and self-audiorecordings of healthcare professionals will be conducted. Participants will be purposively selected from 4-5 hospital eye services and 6-8 affiliated community optometry practices. The aim will be to recruit 30-36 patients and 30 healthcare professionals from hospital eye services and community optometry practices. All interviews will be audiorecorded, with participants' permission, and transcribed verbatim. Data from interviews, observations and self-audiorecordings will be analysed thematically, informed by normalisation process theory and an inductive approach. ETHICS AND DISSEMINATION Ethical approval has been received from the London-Bromley research ethics committee. Findings will be reported through academic journals and conferences in ophthalmology, health services research, management studies and human-computer interaction.
Affiliation(s)
- Ann Blandford
- UCL Interaction Centre, University College London, London, UK
- Sarah Abdi
- UCL Interaction Centre, University College London, London, UK
- Josie Carmichael
- UCL Interaction Centre, University College London, London, UK
- Moorfields Eye Hospital NHS Foundation Trust, London, UK
- Giulia Cappellaro
- School of Management, University College London, London, UK
- Department of Social and Political Sciences, Bocconi University, Milano, Italy
- Rima Hussain
- Moorfields Eye Hospital NHS Foundation Trust, London, UK
- Institute of Ophthalmology, UCL, London, UK
- Konstantinos Balaskas
- Moorfields Eye Hospital NHS Foundation Trust, London, UK
- Institute of Ophthalmology, UCL, London, UK