1
Jones WS, Farrow DJ. One-class support vector machines for detecting population drift in deployed machine learning medical diagnostics. Sci Rep 2025; 15:12157. [PMID: 40204747 PMCID: PMC11982198 DOI: 10.1038/s41598-025-94427-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2024] [Accepted: 03/13/2025] [Indexed: 04/11/2025] Open
Abstract
Machine learning (ML) models are increasingly being applied to diagnose and predict disease, but face technical challenges such as population drift, where the training and real-world deployed data distributions differ. This phenomenon can degrade model performance, risking incorrect diagnoses. Current detection methods are limited: they do not directly measure population drift and often require ground truth labels for new patient data. Here, we propose using a one-class support vector machine (OCSVM) to detect population drift. We trained an OCSVM on the Wisconsin Breast Cancer dataset and tested its ability to detect population drift on simulated data. Simulated data was offset at 0.4 standard deviations of the minimum and maximum values of the radius_mean variable, at three noise levels: 5%, 10% and 30% of the standard deviation; 10,000 records per noise level. We hypothesised that increased noise would correlate with more OCSVM-detected inliers, indicating a sensitivity to population drift. As noise increased, more inliers were detected: 5% (27 inliers), 10% (486), and 30% (851). Therefore, this approach could effectively alert to population drift, supporting safe ML diagnostics adoption. Future research should explore OCSVM monitoring on real-world data, enhance model transparency, investigate complementary statistical and ML methods, and extend applications to other data types.
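The experiment described in this abstract can be sketched roughly as follows. This is an illustrative reconstruction, not the authors' code, and it rests on several assumptions that the abstract does not confirm: scikit-learn's bundled copy of the Wisconsin breast cancer data is used, its "mean radius" column stands in for radius_mean, the OCSVM is trained on that single feature, half of the simulated records sit below the minimum and half above the maximum, the noise is Gaussian, and the kernel, nu and gamma settings are arbitrary choices. The function name count_inliers_under_drift is hypothetical.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM


def count_inliers_under_drift(noise_levels=(0.05, 0.10, 0.30), offset_sd=0.4,
                              n_records=10_000, seed=0):
    """Count simulated drifted records that a trained OCSVM still labels as inliers."""
    data = load_breast_cancer()
    # Assumption: scikit-learn's "mean radius" column corresponds to radius_mean.
    col = list(data.feature_names).index("mean radius")
    x_train = data.data[:, [col]]  # shape (n_samples, 1)
    sd = x_train.std()
    lo, hi = x_train.min(), x_train.max()

    # Assumption: RBF kernel with nu=0.05; the abstract gives no hyperparameters.
    model = make_pipeline(StandardScaler(),
                          OneClassSVM(kernel="rbf", nu=0.05, gamma="scale"))
    model.fit(x_train)

    rng = np.random.default_rng(seed)
    counts = {}
    for level in noise_levels:
        half = n_records // 2
        # Centres sit offset_sd standard deviations outside the observed range.
        centres = np.concatenate([
            np.full(half, lo - offset_sd * sd),
            np.full(n_records - half, hi + offset_sd * sd),
        ])
        simulated = centres + rng.normal(0.0, level * sd, size=n_records)
        preds = model.predict(simulated.reshape(-1, 1))  # +1 inlier, -1 outlier
        counts[level] = int((preds == 1).sum())
    return counts


if __name__ == "__main__":
    print(count_inliers_under_drift())
```

The counts this sketch produces depend on the assumed hyperparameters and simulation details, so they should not be expected to match the figures reported in the abstract.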
Affiliation(s)
- William S Jones
  - Centre of Excellence for Data Science, Artificial Intelligence and Modelling (DAIM), Faculty of Science and Engineering, University of Hull, Hull, UK
- Daniel J Farrow
  - Centre of Excellence for Data Science, Artificial Intelligence and Modelling (DAIM), Faculty of Science and Engineering, University of Hull, Hull, UK
2
Koçak B, Ponsiglione A, Stanzione A, Bluethgen C, Santinha J, Ugga L, Huisman M, Klontzas ME, Cannella R, Cuocolo R. Bias in artificial intelligence for medical imaging: fundamentals, detection, avoidance, mitigation, challenges, ethics, and prospects. Diagn Interv Radiol 2025; 31:75-88. [PMID: 38953330 PMCID: PMC11880872 DOI: 10.4274/dir.2024.242854] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2024] [Accepted: 06/11/2024] [Indexed: 07/04/2024]
Abstract
Although artificial intelligence (AI) methods hold promise for medical imaging-based prediction tasks, their integration into medical practice may present a double-edged sword due to bias (i.e., systematic errors). AI algorithms have the potential to mitigate cognitive biases in human interpretation, but extensive research has highlighted the tendency of AI systems to internalize biases within their model. This fact, whether intentional or not, may ultimately lead to unintended consequences in the clinical setting, potentially compromising patient outcomes. This concern is particularly important in medical imaging, where AI has been more progressively and widely embraced than in any other medical field. A comprehensive understanding of bias at each stage of the AI pipeline is therefore essential to contribute to developing AI solutions that are not only less biased but also widely applicable. This international collaborative review effort aims to increase awareness within the medical imaging community about the importance of proactively identifying and addressing AI bias to prevent its negative consequences from being realized later. The authors began with the fundamentals of bias by explaining its different definitions and delineating various potential sources. Strategies for detecting and identifying bias were then outlined, followed by a review of techniques for its avoidance and mitigation. Moreover, ethical dimensions, challenges encountered, and prospects were discussed.
Affiliation(s)
- Burak Koçak
  - University of Health Sciences Başakşehir Çam and Sakura City Hospital, Clinic of Radiology, İstanbul, Türkiye
- Andrea Ponsiglione
  - University of Naples Federico II Department of Advanced Biomedical Sciences, Naples, Italy
- Arnaldo Stanzione
  - University of Naples Federico II Department of Advanced Biomedical Sciences, Naples, Italy
- Christian Bluethgen
  - University of Zurich University Hospital Zurich, Diagnostic and Interventional Radiology, Zurich, Switzerland
- João Santinha
  - Digital Surgery LAB Champalimaud Research, Champalimaud Foundation; Instituto Superior Técnico, Universidade de Lisboa, Lisbon, Portugal
- Lorenzo Ugga
  - University of Naples Federico II Department of Advanced Biomedical Sciences, Naples, Italy
- Merel Huisman
  - Radboud University Medical Center Department of Radiology and Nuclear Medicine, Nijmegen, Netherlands
- Michail E. Klontzas
  - University of Crete School of Medicine, Department of Radiology; University Hospital of Heraklion, Department of Medical Imaging, Crete, Greece; Karolinska Institute, Department of Clinical Science Intervention and Technology (CLINTEC), Division of Radiology, Solna, Sweden
- Roberto Cannella
  - University of Palermo Department of Biomedicine, Neuroscience and Advanced Diagnostics, Section of Radiology, Palermo, Italy
- Renato Cuocolo
  - University of Salerno Department of Medicine, Surgery and Dentistry, Baronissi, Italy
3
Tschoellitsch T, Maletzky A, Moser P, Seidl P, Böck C, Tomic Mahečić T, Thumfart S, Giretzlehner M, Hochreiter S, Meier J. Machine learning prediction of unexpected readmission or death after discharge from intensive care: A retrospective cohort study. J Clin Anesth 2024; 99:111654. [PMID: 39405923 DOI: 10.1016/j.jclinane.2024.111654] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2024] [Revised: 09/03/2024] [Accepted: 10/08/2024] [Indexed: 11/26/2024]
Abstract
BACKGROUND: Intensive care units (ICUs) harbor the sickest patients with the utmost needs of medical care. Discharge from ICU needs to consider the reason for admission and stability after ICU care. Organ dysfunction or instability after ICU discharge constitute potentially life-threatening situations for patients.
METHODS: This is a single center, observational, retrospective cohort study conducted at ICUs at the Kepler University Hospital in Linz, Austria. Patients aged 18 years and above admitted to the study center's ICUs between 2010-01-01 and 2019-10-31 were included in the study. Patients transferred to another ICU, discharged to a different hospital or home, or that died during their ICU stay were excluded. We used machine learning (ML) models to predict unplanned ICU readmission or death using an internal dataset or MIMIC-IV as training data and compared the models with the Stability and Workload Index for Transfer (SWIFT) score. Further, we evaluated the influence of features on the models using Shapley Additive Explanations.
RESULTS: The best ML models achieved an area under the curve of the receiver operating characteristic (AUC-ROC) of 0.721 ± 0.029 and a high negative predictive value (NPV) of 0.990 ± 0.002. The most important features were heart rate, peripheral oxygen saturation and arterial blood pressure. Performance of the SWIFT score was worse than the ML models (best AUC-ROC 0.618 ± 0.011).
CONCLUSIONS: ML models were able to identify patients that will not need unplanned ICU readmission and will not die within 48 h after discharge.
Affiliation(s)
- Thomas Tschoellitsch
  - Department of Anesthesiology and Critical Care Medicine, Johannes Kepler University Linz and Kepler University Hospital, Linz, Austria
- Alexander Maletzky
  - Research Unit Medical Informatics, RISC Software GmbH, Hagenberg i. M., Austria
- Philipp Moser
  - Research Unit Medical Informatics, RISC Software GmbH, Hagenberg i. M., Austria
- Philipp Seidl
  - European Laboratory for Learning and Intelligent Systems Unit Linz, Linz Institute of Technology Artificial Intelligence Lab, Institute for Machine Learning, Johannes Kepler University, Linz, Austria
- Carl Böck
  - Institute of Signal Processing, Johannes Kepler University Linz, Austria
- Tina Tomic Mahečić
  - Clinic of Anaesthesiology and Intensive Care Medicine, University Hospital Centre Zagreb - Rebro, Croatia
- Stefan Thumfart
  - Research Unit Medical Informatics, RISC Software GmbH, Hagenberg i. M., Austria
- Michael Giretzlehner
  - Research Unit Medical Informatics, RISC Software GmbH, Hagenberg i. M., Austria
- Sepp Hochreiter
  - European Laboratory for Learning and Intelligent Systems Unit Linz, Linz Institute of Technology Artificial Intelligence Lab, Institute for Machine Learning, Johannes Kepler University, Linz, Austria
- Jens Meier
  - Department of Anesthesiology and Critical Care Medicine, Johannes Kepler University Linz and Kepler University Hospital, Linz, Austria
4
Zhang F, Kreuter D, Chen Y, Dittmer S, Tull S, Shadbahr T, Preller J, Rudd JH, Aston JA, Schönlieb CB, Gleadall N, Roberts M. Recent methodological advances in federated learning for healthcare. Patterns (New York, N.Y.) 2024; 5:101006. [PMID: 39005485 PMCID: PMC11240178 DOI: 10.1016/j.patter.2024.101006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/16/2024]
Abstract
For healthcare datasets, it is often impossible to combine data samples from multiple sites due to ethical, privacy, or logistical concerns. Federated learning allows for the utilization of powerful machine learning algorithms without requiring the pooling of data. Healthcare data have many simultaneous challenges, such as highly siloed data, class imbalance, missing data, distribution shifts, and non-standardized variables, that require new methodologies to address. Federated learning adds significant methodological complexity to conventional centralized machine learning, requiring distributed optimization, communication between nodes, aggregation of models, and redistribution of models. In this systematic review, we consider all papers on Scopus published between January 2015 and February 2023 that describe new federated learning methodologies for addressing challenges with healthcare data. We reviewed 89 papers meeting these criteria. Significant systemic issues were identified throughout the literature, compromising many methodologies reviewed. We give detailed recommendations to help improve methodology development for federated learning in healthcare.
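The cycle this abstract lists (distributed optimization, communication between nodes, aggregation of models, redistribution of models) can be illustrated with federated averaging (FedAvg). The abstract does not name a specific algorithm, so FedAvg is an illustrative choice here, and the linear least-squares model, the learning rate, and the function names local_update, federated_average and federated_round are all assumptions made for this sketch.

```python
import numpy as np


def local_update(weights, X, y, lr=0.1):
    """One gradient step on a site's own data; the raw data never leaves the site."""
    grad = 2.0 * X.T @ (X @ weights - y) / len(y)  # gradient of mean squared error
    return weights - lr * grad


def federated_average(site_weights, site_sizes):
    """Aggregate site models, weighting each by its number of samples."""
    stacked = np.stack([np.asarray(w, dtype=float) for w in site_weights])
    return np.average(stacked, axis=0,
                      weights=np.asarray(site_sizes, dtype=float))


def federated_round(global_weights, sites, lr=0.1):
    """Redistribute the global model, update locally, then aggregate."""
    updates = [local_update(global_weights.copy(), X, y, lr) for X, y in sites]
    sizes = [len(y) for _, y in sites]
    return federated_average(updates, sizes)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true_w = np.array([2.0, -1.0])
    # Three hypothetical sites of different sizes, each holding its own data.
    sites = []
    for n in (30, 50, 20):
        X = rng.normal(size=(n, 2))
        sites.append((X, X @ true_w))
    w = np.zeros(2)
    for _ in range(500):
        w = federated_round(w, sites)
    print(w)
```

Only model weights cross site boundaries in this sketch. Challenges the abstract mentions, such as class imbalance, missing data and distribution shift between sites, are deliberately left out.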
Affiliation(s)
- Fan Zhang
  - Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Cambridge, UK
- Daniel Kreuter
  - Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Cambridge, UK
- Yichen Chen
  - Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Cambridge, UK
- Sören Dittmer
  - Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Cambridge, UK
  - ZeTeM, University of Bremen, Bremen, Germany
- Samuel Tull
  - Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Cambridge, UK
- Tolou Shadbahr
  - Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland
- Jacobus Preller
  - Addenbrooke’s Hospital, Cambridge University Hospitals NHS Trust, Cambridge, UK
- James H.F. Rudd
  - Department of Medicine, University of Cambridge, Cambridge, UK
- John A.D. Aston
  - Department of Pure Mathematics and Mathematical Statistics, University of Cambridge, Cambridge, UK
- Carola-Bibiane Schönlieb
  - Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Cambridge, UK
- Michael Roberts
  - Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Cambridge, UK
  - Department of Medicine, University of Cambridge, Cambridge, UK
5
Stroh N, Stefanits H, Maletzky A, Kaltenleithner S, Thumfart S, Giretzlehner M, Drexler R, Ricklefs FL, Dührsen L, Aspalter S, Rauch P, Gruber A, Gmeiner M. Machine learning based outcome prediction of microsurgically treated unruptured intracranial aneurysms. Sci Rep 2023; 13:22641. [PMID: 38114635 PMCID: PMC10730905 DOI: 10.1038/s41598-023-50012-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 05/22/2023] [Accepted: 12/14/2023] [Indexed: 12/21/2023] Open
Abstract
Machine learning (ML) has revolutionized data processing in recent years. This study presents the results of the first prediction models based on a long-term monocentric data registry of patients with microsurgically treated unruptured intracranial aneurysms (UIAs) using a temporal train-test split. Temporal train-test splits make it possible to simulate prospective validation and therefore provide more accurate estimations of a model's predictive quality when applied to future patients. ML models for the prediction of the Glasgow outcome scale, modified Rankin Scale (mRS), and new transient or permanent neurological deficits (output variables) were created from all UIA patients that underwent microsurgery at the Kepler University Hospital Linz (Austria) between 2002 and 2020 (n = 466), based on 18 patient- and 10 aneurysm-specific preoperative parameters (input variables). Train-test splitting was performed with a temporal split for outcome prediction in microsurgical therapy of UIA. Moreover, an external validation was conducted on an independent external data set (n = 256) of the Department of Neurosurgery, University Medical Centre Hamburg-Eppendorf. In total, 722 aneurysms were included in this study. A postoperative mRS > 2 was best predicted by a quadratic discriminant analysis (QDA) estimator in the internal test set, with an area under the receiver operating characteristic curve (ROC-AUC) of 0.87 ± 0.03 and a sensitivity and specificity of 0.83 ± 0.08 and 0.71 ± 0.07, respectively. A Multilayer Perceptron predicted the post- to preoperative mRS difference > 1 with a ROC-AUC of 0.70 ± 0.02 and a sensitivity and specificity of 0.74 ± 0.07 and 0.50 ± 0.04, respectively. The QDA was the best model for predicting a permanent new neurological deficit with a ROC-AUC of 0.71 ± 0.04 and a sensitivity and specificity of 0.65 ± 0.24 and 0.60 ± 0.12, respectively. Furthermore, these models performed significantly better than the classic logistic regression models (p < 0.0001).
The present results showed good performance in predicting functional and clinical outcomes after microsurgical therapy of UIAs in the internal data set, especially for the main outcome parameters, mRS and permanent neurological deficit. The external validation showed poor discrimination with ROC-AUC values of 0.61, 0.53 and 0.58 respectively for predicting a postoperative mRS > 2, a pre- and postoperative difference in mRS > 1 point and a GOS < 5. Therefore, generalizability of the models could not be demonstrated in the external validation. A SHapley Additive exPlanations (SHAP) analysis revealed that this is due to the most important features being distributed quite differently in the internal and external data sets. The implementation of newly available data and the merging of larger databases to form more broad-based predictive models is imperative in the future.
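The temporal train-test split mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the study's pipeline: the record layout, the field name treated_on, the cutoff date and the function name temporal_train_test_split are all hypothetical.

```python
from datetime import date


def temporal_train_test_split(records, cutoff, date_key="treated_on"):
    """Train on cases before the cutoff and test on cases at or after it.

    Unlike a random split, no future patient leaks into the training set,
    which is what lets the test set stand in for prospective validation.
    """
    train = [r for r in records if r[date_key] < cutoff]
    test = [r for r in records if r[date_key] >= cutoff]
    return train, test


if __name__ == "__main__":
    # Hypothetical registry rows; only the date matters for the split.
    registry = [
        {"patient": "A", "treated_on": date(2004, 5, 1)},
        {"patient": "B", "treated_on": date(2012, 9, 17)},
        {"patient": "C", "treated_on": date(2018, 2, 3)},
        {"patient": "D", "treated_on": date(2019, 11, 30)},
    ]
    train, test = temporal_train_test_split(registry, cutoff=date(2017, 1, 1))
    print([r["patient"] for r in train], [r["patient"] for r in test])
```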
Affiliation(s)
- Nico Stroh
  - Department of Neurosurgery, Kepler University Hospital, Johannes Kepler University, Linz, Austria
- Harald Stefanits
  - Department of Neurosurgery, Kepler University Hospital, Johannes Kepler University, Linz, Austria
- Richard Drexler
  - Department of Neurosurgery, University Medical Centre Hamburg-Eppendorf, Hamburg, Germany
- Franz L Ricklefs
  - Department of Neurosurgery, University Medical Centre Hamburg-Eppendorf, Hamburg, Germany
- Lasse Dührsen
  - Department of Neurosurgery, University Medical Centre Hamburg-Eppendorf, Hamburg, Germany
- Stefan Aspalter
  - Department of Neurosurgery, Kepler University Hospital, Johannes Kepler University, Linz, Austria
- Philip Rauch
  - Department of Neurosurgery, Kepler University Hospital, Johannes Kepler University, Linz, Austria
- Andreas Gruber
  - Department of Neurosurgery, Kepler University Hospital, Johannes Kepler University, Linz, Austria
- Matthias Gmeiner
  - Department of Neurosurgery, Kepler University Hospital, Johannes Kepler University, Linz, Austria
6
Chadaga K, Prabhu S, Bhat V, Sampathila N, Umakanth S, Upadya P S. COVID-19 diagnosis using clinical markers and multiple explainable artificial intelligence approaches: A case study from Ecuador. SLAS Technol 2023; 28:393-410. [PMID: 37689365 DOI: 10.1016/j.slast.2023.09.001] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/21/2023] [Revised: 08/16/2023] [Accepted: 09/06/2023] [Indexed: 09/11/2023]
Abstract
The COVID-19 pandemic erupted at the beginning of 2020 and proved fatal, causing many casualties worldwide. Immediate and precise screening of affected patients is critical for disease control. COVID-19 is often confused with various other respiratory disorders since the symptoms are similar. As of today, the reverse transcription-polymerase chain reaction (RT-PCR) test is utilized for diagnosing COVID-19. However, this approach is sometimes prone to producing erroneous and false negative results. Hence, finding a reliable diagnostic method that can validate the RT-PCR test results is crucial. Artificial intelligence (AI) and machine learning (ML) applications in COVID-19 diagnosis have proven to be beneficial. Hence, clinical markers have been utilized for COVID-19 diagnosis with the help of several classifiers in this study. Further, five different explainable artificial intelligence techniques have been utilized to interpret the predictions. Among all the algorithms, the k-nearest neighbor obtained the best performance with an accuracy, precision, recall and f1-score of 84%, 85%, 84% and 84%, respectively. According to this study, the combination of clinical markers such as eosinophils, lymphocytes, red blood cells and leukocytes was significant in differentiating COVID-19. The classifiers can be utilized synchronously with the standard RT-PCR procedure, making diagnosis more reliable and efficient.
Affiliation(s)
- Krishnaraj Chadaga
  - Department of Computer Science and Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, India
- Srikanth Prabhu
  - Department of Computer Science and Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, India
- Vivekananda Bhat
  - Department of Computer Science and Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, India
- Niranjana Sampathila
  - Department of Biomedical Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, India
- Shashikiran Umakanth
  - Department of Medicine, Dr. TMA Hospital, Manipal Academy of Higher Education, Manipal, India
- Sudhakara Upadya P
  - Manipal School of Information Sciences, Manipal Academy of Higher Education, Manipal, India
7
Drukker K, Chen W, Gichoya J, Gruszauskas N, Kalpathy-Cramer J, Koyejo S, Myers K, Sá RC, Sahiner B, Whitney H, Zhang Z, Giger M. Toward fairness in artificial intelligence for medical image analysis: identification and mitigation of potential biases in the roadmap from data collection to model deployment. J Med Imaging (Bellingham) 2023; 10:061104. [PMID: 37125409 PMCID: PMC10129875 DOI: 10.1117/1.jmi.10.6.061104] [Citation(s) in RCA: 32] [Impact Index Per Article: 16.0] [Received: 01/30/2023] [Accepted: 04/03/2023] [Indexed: 05/02/2023] Open
Abstract
Purpose: To recognize and address various sources of bias essential for algorithmic fairness and trustworthiness and to contribute to a just and equitable deployment of AI in medical imaging, there is an increasing interest in developing medical imaging-based machine learning methods, also known as medical imaging artificial intelligence (AI), for the detection, diagnosis, prognosis, and risk assessment of disease with the goal of clinical implementation. These tools are intended to help improve traditional human decision-making in medical imaging. However, biases introduced in the steps toward clinical deployment may impede their intended function, potentially exacerbating inequities. Specifically, medical imaging AI can propagate or amplify biases introduced in the many steps from model inception to deployment, resulting in a systematic difference in the treatment of different groups.
Approach: Our multi-institutional team included medical physicists, medical imaging artificial intelligence/machine learning (AI/ML) researchers, experts in AI/ML bias, statisticians, physicians, and scientists from regulatory bodies. We identified sources of bias in AI/ML, mitigation strategies for these biases, and developed recommendations for best practices in medical imaging AI/ML development.
Results: Five main steps along the roadmap of medical imaging AI/ML were identified: (1) data collection, (2) data preparation and annotation, (3) model development, (4) model evaluation, and (5) model deployment. Within these steps, or bias categories, we identified 29 sources of potential bias, many of which can impact multiple steps, as well as mitigation strategies.
Conclusions: Our findings provide a valuable resource to researchers, clinicians, and the public at large.
Affiliation(s)
- Karen Drukker
  - The University of Chicago, Department of Radiology, Chicago, Illinois, United States
- Weijie Chen
  - US Food and Drug Administration, Division of Imaging, Diagnostics, and Software Reliability, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, Silver Spring, Maryland, United States
- Judy Gichoya
  - Emory University, Department of Radiology, Atlanta, Georgia, United States
- Nicholas Gruszauskas
  - The University of Chicago, Department of Radiology, Chicago, Illinois, United States
- Sanmi Koyejo
  - Stanford University, Department of Computer Science, Stanford, California, United States
- Kyle Myers
  - Puente Solutions LLC, Phoenix, Arizona, United States
- Rui C. Sá
  - National Institutes of Health, Bethesda, Maryland, United States
  - University of California, San Diego, La Jolla, California, United States
- Berkman Sahiner
  - US Food and Drug Administration, Division of Imaging, Diagnostics, and Software Reliability, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, Silver Spring, Maryland, United States
- Heather Whitney
  - The University of Chicago, Department of Radiology, Chicago, Illinois, United States
- Zi Zhang
  - Jefferson Health, Philadelphia, Pennsylvania, United States
- Maryellen Giger
  - The University of Chicago, Department of Radiology, Chicago, Illinois, United States
8
Sahiner B, Chen W, Samala RK, Petrick N. Data drift in medical machine learning: implications and potential remedies. Br J Radiol 2023; 96:20220878. [PMID: 36971405 PMCID: PMC10546450 DOI: 10.1259/bjr.20220878] [Citation(s) in RCA: 47] [Impact Index Per Article: 23.5] [Received: 09/14/2022] [Revised: 02/16/2023] [Accepted: 02/20/2023] [Indexed: 03/29/2023] Open
Abstract
Data drift refers to differences between the data used in training a machine learning (ML) model and that applied to the model in real-world operation. Medical ML systems can be exposed to various forms of data drift, including differences between the data sampled for training and used in clinical operation, differences between medical practices or context of use between training and clinical use, and time-related changes in patient populations, disease patterns, and data acquisition, to name a few. In this article, we first review the terminology used in ML literature related to data drift, define distinct types of drift, and discuss in detail potential causes within the context of medical applications with an emphasis on medical imaging. We then review the recent literature regarding the effects of data drift on medical ML systems, which overwhelmingly show that data drift can be a major cause for performance deterioration. We then discuss methods for monitoring data drift and mitigating its effects with an emphasis on pre- and post-deployment techniques. Some of the potential methods for drift detection and issues around model retraining when drift is detected are included. Based on our review, we find that data drift is a major concern in medical ML deployment and that more research is needed so that ML models can identify drift early, incorporate effective mitigation strategies and resist performance decay.
Affiliation(s)
- Berkman Sahiner
  - Center for Devices and Radiological Health, U.S. Food and Drug Administration 10903 New Hampshire Avenue, Silver Spring, MD 20993-0002
- Weijie Chen
  - Center for Devices and Radiological Health, U.S. Food and Drug Administration 10903 New Hampshire Avenue, Silver Spring, MD 20993-0002
- Ravi K. Samala
  - Center for Devices and Radiological Health, U.S. Food and Drug Administration 10903 New Hampshire Avenue, Silver Spring, MD 20993-0002
- Nicholas Petrick
  - Center for Devices and Radiological Health, U.S. Food and Drug Administration 10903 New Hampshire Avenue, Silver Spring, MD 20993-0002
9
Rahmani K, Thapa R, Tsou P, Casie Chetty S, Barnes G, Lam C, Foon Tso C. Assessing the effects of data drift on the performance of machine learning models used in clinical sepsis prediction. Int J Med Inform 2023; 173:104930. [PMID: 36893656 DOI: 10.1016/j.ijmedinf.2022.104930] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 05/20/2022] [Revised: 10/30/2022] [Accepted: 11/15/2022] [Indexed: 11/21/2022]
Abstract
BACKGROUND: Data drift can negatively impact the performance of machine learning algorithms (MLAs) that were trained on historical data. As such, MLAs should be continuously monitored and tuned to overcome the systematic changes that occur in the distribution of data. In this paper, we study the extent of data drift and provide insights about its characteristics for sepsis onset prediction. This study will help elucidate the nature of data drift for prediction of sepsis and similar diseases. This may aid with the development of more effective patient monitoring systems that can stratify risk for dynamic disease states in hospitals.
METHODS: We devise a series of simulations that measure the effects of data drift in patients with sepsis, using electronic health records (EHR). We simulate multiple scenarios in which data drift may occur, namely the change in the distribution of the predictor variables (covariate shift), the change in the statistical relationship between the predictors and the target (concept shift), and the occurrence of a major healthcare event (major event) such as the COVID-19 pandemic. We measure the impact of data drift on model performances, identify the circumstances that necessitate model retraining, and compare the effects of different retraining methodologies and model architecture on the outcomes. We present the results for two different MLAs, eXtreme Gradient Boosting (XGB) and Recurrent Neural Network (RNN).
RESULTS: Our results show that the properly retrained XGB models outperform the baseline models in all simulation scenarios, hence signifying the existence of data drift. In the major event scenario, the area under the receiver operating characteristic curve (AUROC) at the end of the simulation period is 0.811 for the baseline XGB model and 0.868 for the retrained XGB model. In the covariate shift scenario, the AUROC at the end of the simulation period for the baseline and retrained XGB models is 0.853 and 0.874 respectively. In the concept shift scenario and under the mixed labeling method, the retrained XGB models perform worse than the baseline model for most simulation steps. However, under the full relabeling method, the AUROC at the end of the simulation period for the baseline and retrained XGB models is 0.852 and 0.877 respectively. The results for the RNN models were mixed, suggesting that retraining based on a fixed network architecture may be inadequate for an RNN. We also present the results in the form of other performance metrics such as the ratio of observed to expected probabilities (calibration) and the normalized rate of positive predictive values (PPV) by prevalence, referred to as lift, at a sensitivity of 0.8.
CONCLUSION: Our simulations reveal that retraining periods of a couple of months or using several thousand patients are likely to be adequate to monitor machine learning models that predict sepsis. This indicates that a machine learning system for sepsis prediction will probably need less infrastructure for performance monitoring and retraining compared to other applications in which data drift is more frequent and continuous. Our results also show that in the event of a concept shift, a full overhaul of the sepsis prediction model may be necessary because it indicates a discrete change in the definition of sepsis labels, and mixing the labels for the sake of incremental training may not produce the desired results.
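The lift metric named in this abstract (PPV normalized by prevalence, at a sensitivity of 0.8) might be computed along the following lines. This is one plausible reading of the abstract's wording, not the authors' implementation: the thresholding rule (flag the highest-scoring patients until the target sensitivity is first reached), the neglect of tied scores, and the function name lift_at_sensitivity are assumptions.

```python
import numpy as np


def lift_at_sensitivity(y_true, scores, target_sensitivity=0.8):
    """PPV divided by prevalence at the first cutoff reaching the target sensitivity.

    Assumes at least one positive label and ignores ties between scores.
    """
    y_true = np.asarray(y_true).astype(bool)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores, kind="stable")  # highest risk first
    hits = np.cumsum(y_true[order])             # true positives among the top k
    sensitivity = hits / y_true.sum()
    k = int(np.argmax(sensitivity >= target_sensitivity)) + 1  # patients flagged
    ppv = hits[k - 1] / k
    prevalence = y_true.mean()
    return ppv / prevalence


if __name__ == "__main__":
    labels = [1, 1, 0, 1, 0, 1, 0, 0, 0, 1]
    risk = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
    print(lift_at_sensitivity(labels, risk))
```

A lift of 1 under this definition means the flagged group is no richer in positives than the population as a whole; larger values mean the model concentrates true cases among the patients it flags.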
Affiliation(s)
- Keyvan Rahmani
  - Dascena, Inc., 12333 Sowden Rd Ste B PMB 65148, Houston, TX 77080-2059, USA
- Rahul Thapa
  - Dascena, Inc., 12333 Sowden Rd Ste B PMB 65148, Houston, TX 77080-2059, USA
- Peiling Tsou
  - Dascena, Inc., 12333 Sowden Rd Ste B PMB 65148, Houston, TX 77080-2059, USA
- Satish Casie Chetty
  - Dascena, Inc., 12333 Sowden Rd Ste B PMB 65148, Houston, TX 77080-2059, USA
- Gina Barnes
  - Dascena, Inc., 12333 Sowden Rd Ste B PMB 65148, Houston, TX 77080-2059, USA
- Carson Lam
  - Dascena, Inc., 12333 Sowden Rd Ste B PMB 65148, Houston, TX 77080-2059, USA
- Chak Foon Tso
  - Dascena, Inc., 12333 Sowden Rd Ste B PMB 65148, Houston, TX 77080-2059, USA
10
Chadaga K, Prabhu S, Bhat V, Sampathila N, Umakanth S, Chadaga R. A Decision Support System for Diagnosis of COVID-19 from Non-COVID-19 Influenza-like Illness Using Explainable Artificial Intelligence. Bioengineering (Basel) 2023; 10:439. [PMID: 37106626 PMCID: PMC10135993 DOI: 10.3390/bioengineering10040439] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 02/08/2023] [Revised: 03/27/2023] [Accepted: 03/29/2023] [Indexed: 04/03/2023] Open
Abstract
The coronavirus pandemic emerged in early 2020 and turned out to be deadly, killing a vast number of people around the world. Fortunately, vaccines have been developed, and they seem effective in controlling the severe prognosis induced by the virus. The reverse transcription-polymerase chain reaction (RT-PCR) test is the current gold standard for diagnosing various infectious diseases, including COVID-19; however, it is not always accurate. It is therefore crucial to find an alternative diagnostic method that can support the results of the standard RT-PCR test. This study proposes a decision support system that uses machine learning and deep learning techniques to predict a patient's COVID-19 diagnosis from clinical, demographic and blood markers. The patient data used in this research were collected from two Manipal hospitals in India, and a custom-made, stacked, multi-level ensemble classifier was used to predict the COVID-19 diagnosis. Deep learning techniques such as deep neural networks (DNN) and one-dimensional convolutional networks (1D-CNN) were also used. Further, explainable artificial intelligence (XAI) techniques such as Shapley additive explanations (SHAP), ELI5, local interpretable model-agnostic explanations (LIME), and QLattice were used to make the models more precise and understandable. Among all of the algorithms, the multi-level stacked model obtained an excellent accuracy of 96%. The precision, recall, F1-score and AUC obtained were 94%, 95%, 94% and 98%, respectively. The models can be used as a decision support system for the initial screening of coronavirus patients and can also help ease the existing burden on medical infrastructure.
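As a quick consistency check on the reported metrics, the F1-score is the harmonic mean of precision and recall. The sketch below recomputes it from the rounded percentages in the abstract, so the result is only approximate; the function name is illustrative.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall; defined as 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values as reported in the abstract (94% precision, 95% recall).
f1 = f1_from_precision_recall(0.94, 0.95)
print(f"F1 from rounded precision and recall: {f1:.3f}")
```

The recomputed value is close to 0.945, which is compatible with the reported F1-score of 94% given that the inputs are themselves rounded.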
Affiliation(s)
- Krishnaraj Chadaga
- Department of Computer Science and Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal 576104, India
- Srikanth Prabhu
- Department of Computer Science and Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal 576104, India
- Vivekananda Bhat
- Department of Computer Science and Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal 576104, India
- Niranjana Sampathila
- Department of Biomedical Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal 576104, India
- Shashikiran Umakanth
- Department of Medicine, Dr. TMA Hospital, Manipal Academy of Higher Education, Manipal 576104, India
- Rajagopala Chadaga
- Department of Mechanical and Industrial Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal 576104, India
11
Lee YJ, Sun S, Kim YK, Jeoung JW, Park KH. Diagnostic ability of macular microvasculature with swept-source OCT angiography for highly myopic glaucoma using deep learning. Sci Rep 2023; 13:5209. [PMID: 36997639 PMCID: PMC10063664 DOI: 10.1038/s41598-023-32164-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 07/02/2022] [Accepted: 03/23/2023] [Indexed: 04/01/2023]
Abstract
Macular OCT angiography (OCTA) measurements have been reported to be useful for glaucoma diagnostics. However, research on highly myopic glaucoma is lacking, and the diagnostic value of macular OCTA measurements versus OCT parameters remains inconclusive. We aimed to evaluate the diagnostic ability of the macular microvasculature assessed with OCTA for highly myopic glaucoma and to compare it with that of macular thickness parameters, using deep learning (DL). A DL model was trained, validated and tested using 260 pairs of macular OCTA and OCT images from 260 eyes (203 eyes with highly myopic glaucoma, 57 eyes with healthy high myopia). The DL model achieved an AUC of 0.946 with the OCTA superficial capillary plexus (SCP) images, which was comparable to that with the OCT GCL+ (ganglion cell layer + inner plexiform layer; AUC, 0.982; P = 0.268) or OCT GCL++ (retinal nerve fiber layer + ganglion cell layer + inner plexiform layer) images (AUC, 0.997; P = 0.101), and significantly superior to that with the OCTA deep capillary plexus images (AUC, 0.779; P = 0.028). The DL model with macular OCTA SCP images demonstrated excellent and comparable diagnostic ability to that with macular OCT images in highly myopic glaucoma, which suggests macular OCTA microvasculature could serve as a potential biomarker for glaucoma diagnosis in high myopia.
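For readers less familiar with the AUC values compared above, the sketch below computes AUC as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half. This rank-based definition does not depend on class prevalence, which is relevant for an imbalanced cohort such as the 203 glaucomatous versus 57 healthy eyes described here. The scores are made up for illustration, and this is not the statistical procedure the authors used to compare AUCs.

```python
def auc_from_scores(pos_scores, neg_scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative score")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical model outputs for diseased and healthy eyes.
glaucoma_scores = [0.9, 0.8, 0.4]
healthy_scores = [0.7, 0.4, 0.1]

auc = auc_from_scores(glaucoma_scores, healthy_scores)
print(f"AUC on toy scores: {auc:.3f}")
```

The pairwise loop is quadratic in the number of cases, which is fine for a few hundred eyes; a rank-sum formulation would be preferable for large datasets.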
Affiliation(s)
- Yun Jeong Lee
- Department of Ophthalmology, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Korea
- Sukkyu Sun
- Biomedical Research Institute, Seoul National University Hospital, Seoul, Korea
- Young Kook Kim
- Department of Ophthalmology, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Korea
- Jin Wook Jeoung
- Department of Ophthalmology, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Korea
- Ki Ho Park
- Department of Ophthalmology, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Korea
12
Lee YJ, Choe S, Wy S, Jang M, Jeoung JW, Choi HJ, Park KH, Sun S, Kim YK. Demographics Prediction and Heatmap Generation From OCT Images of Anterior Segment of the Eye: A Vision Transformer Model Study. Transl Vis Sci Technol 2022; 11:7. [DOI: 10.1167/tvst.11.11.7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]
Affiliation(s)
- Yun Jeong Lee
- Department of Ophthalmology, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Korea
- Sooyeon Choe
- Department of Ophthalmology, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Korea
- Seoyoung Wy
- Department of Ophthalmology, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Korea
- Mirinae Jang
- Department of Ophthalmology, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Korea
- Jin Wook Jeoung
- Department of Ophthalmology, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Korea
- Hyuk Jin Choi
- Department of Ophthalmology, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Korea
- Department of Ophthalmology, Seoul National University Hospital Healthcare System Gangnam Center, Seoul National University College of Medicine, Seoul, Korea
- Ki Ho Park
- Department of Ophthalmology, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Korea
- Sukkyu Sun
- Biomedical Research Institute, Seoul National University Hospital, Seoul, Korea
- Young Kook Kim
- Department of Ophthalmology, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Korea
13
Maletzky A, Böck C, Tschoellitsch T, Roland T, Ludwig H, Thumfart S, Giretzlehner M, Hochreiter S, Meier J. Lifting Hospital Electronic Health Record Data Treasures: Challenges and Opportunities. JMIR Med Inform 2022; 10:e38557. [PMID: 36269654 PMCID: PMC9636533 DOI: 10.2196/38557] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2022] [Revised: 08/02/2022] [Accepted: 09/07/2022] [Indexed: 12/04/2022]
Abstract
Electronic health records (EHRs) have been successfully used in data science and machine learning projects. However, most of these data are collected for clinical use rather than for retrospective analysis. This means that researchers typically face many different issues when attempting to access and prepare the data for secondary use. We aimed to investigate how raw EHRs can be accessed and prepared in retrospective data science projects in a disciplined, effective, and efficient way. We report our experience and findings from a large-scale data science project analyzing routinely acquired retrospective data from the Kepler University Hospital in Linz, Austria. The project involved data collection from more than 150,000 patients over a period of 10 years. It included diverse data modalities, such as static demographic data, irregularly acquired laboratory test results, regularly sampled vital signs, and high-frequency physiological waveform signals. Raw medical data can be corrupted in many unexpected ways that demand thorough manual inspection and highly individualized data cleaning solutions. We present a general data preparation workflow, which was shaped in the course of our project and consists of the following 7 steps: obtain a rough overview of the available EHR data, define clinically meaningful labels for supervised learning, extract relevant data from the hospital’s data warehouses, match data extracted from different sources, deidentify them, detect errors and inconsistencies therein through a careful exploratory analysis, and implement a suitable data processing pipeline in actual code. Only a few of the data preparation issues encountered in our project were addressed by generic medical data preprocessing tools that have been proposed recently. Instead, highly individualized solutions for the specific data used in one’s own research seem inevitable. We believe that the proposed workflow can serve as guidance for practitioners, helping them to identify and address potential problems early and avoid some common pitfalls.
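Three of the seven workflow steps (matching sources, deidentification, and error detection) can be sketched on toy records. Everything below is hypothetical: the field names, the salt, the salted-hash pseudonymization, and the plausibility bounds are assumptions made for illustration and do not describe the authors' pipeline. Salted hashing on its own should not be read as sufficient deidentification.

```python
import hashlib

SALT = "project-secret"  # hypothetical; a real project would keep this out of source code


def match_sources(demographics, labs):
    """Join lab rows to demographic rows on patient_id; return (matched, unmatched)."""
    by_id = {row["patient_id"]: row for row in demographics}
    matched, unmatched = [], []
    for lab in labs:
        demo = by_id.get(lab["patient_id"])
        if demo is None:
            unmatched.append(lab)
        else:
            matched.append({**demo, **lab})
    return matched, unmatched


def deidentify(rows, salt=SALT):
    """Drop direct identifiers and replace patient_id with a salted-hash pseudonym."""
    cleaned = []
    for row in rows:
        kept = {k: v for k, v in row.items() if k not in ("patient_id", "name")}
        digest = hashlib.sha256((salt + str(row["patient_id"])).encode("utf-8"))
        kept["pseudo_id"] = digest.hexdigest()[:12]
        cleaned.append(kept)
    return cleaned


# Illustrative bounds only, not clinical guidance.
PLAUSIBLE_RANGES = {"heart_rate": (20, 300)}


def flag_implausible(rows, limits=PLAUSIBLE_RANGES):
    """Return (pseudo_id, field, value) for every value outside its plausible range."""
    flagged = []
    for row in rows:
        for field, (low, high) in limits.items():
            value = row.get(field)
            if value is not None and not (low <= value <= high):
                flagged.append((row["pseudo_id"], field, value))
    return flagged


# Toy records standing in for two hospital data sources.
demographics = [
    {"patient_id": 1, "name": "A", "birth_year": 1950},
    {"patient_id": 2, "name": "B", "birth_year": 1980},
]
labs = [
    {"patient_id": 1, "heart_rate": 72},
    {"patient_id": 2, "heart_rate": 720},  # likely a data-entry error
    {"patient_id": 3, "heart_rate": 65},   # no matching demographic record
]

matched, unmatched = match_sources(demographics, labs)
deidentified = deidentify(matched)
flagged = flag_implausible(deidentified)
print(len(matched), len(unmatched), flagged)
```

Returning unmatched rows and flagged values rather than silently dropping them keeps them available for the manual inspection that the abstract describes as unavoidable.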
Affiliation(s)
- Alexander Maletzky
- Research Department Medical Informatics, RISC Software GmbH, Hagenberg, Austria
- Carl Böck
- JKU LIT SAL eSPML Lab, Institute of Signal Processing, Johannes Kepler University, Linz, Austria
- Thomas Tschoellitsch
- Department of Anesthesiology and Critical Care Medicine, Kepler University Hospital GmbH, Johannes Kepler University, Linz, Austria
- Theresa Roland
- ELLIS Unit Linz, LIT AI Lab, Institute for Machine Learning, Johannes Kepler University, Linz, Austria
- Helga Ludwig
- ELLIS Unit Linz, LIT AI Lab, Institute for Machine Learning, Johannes Kepler University, Linz, Austria
- Stefan Thumfart
- Research Department Medical Informatics, RISC Software GmbH, Hagenberg, Austria
- Sepp Hochreiter
- ELLIS Unit Linz, LIT AI Lab, Institute for Machine Learning, Johannes Kepler University, Linz, Austria
- Jens Meier
- Department of Anesthesiology and Critical Care Medicine, Kepler University Hospital GmbH, Johannes Kepler University, Linz, Austria
14
Kistenev YV, Vrazhnov DA, Shnaider EE, Zuhayri H. Predictive models for COVID-19 detection using routine blood tests and machine learning. Heliyon 2022; 8:e11185. [PMID: 36311357 PMCID: PMC9595489 DOI: 10.1016/j.heliyon.2022.e11185] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 03/01/2022] [Revised: 03/25/2022] [Accepted: 10/16/2022] [Indexed: 11/06/2022]
Abstract
The need for accurate, fast, and inexpensive COVID-19 tests remains urgent. Standard COVID-19 tests require costly reagents and specialized laboratories with high safety requirements, and they are time-consuming. Using routine blood test data as the basis for detecting SARS-CoV-2 infection would allow testing at most ordinary medical facilities. However, blood tests give general information about a patient's state, which is not directly associated with COVID-19. COVID-19-specific features must therefore be selected from the list of standard blood characteristics, and decision-making software based on appropriate clinical data must be created. This review describes the possibilities for developing predictive models for COVID-19 detection using routine blood tests and machine learning.
Affiliation(s)
- Yury V. Kistenev
- Laboratory of Laser Molecular Imaging and Machine Learning, Tomsk State University, 36 Lenin Av., 634050 Tomsk, Russia
- Denis A. Vrazhnov
- Laboratory of Laser Molecular Imaging and Machine Learning, Tomsk State University, 36 Lenin Av., 634050 Tomsk, Russia
- Ekaterina E. Shnaider
- Laboratory of Laser Molecular Imaging and Machine Learning, Tomsk State University, 36 Lenin Av., 634050 Tomsk, Russia
- Hala Zuhayri
- Laboratory of Laser Molecular Imaging and Machine Learning, Tomsk State University, 36 Lenin Av., 634050 Tomsk, Russia