1
Yuan H, Zhu M, Yang R, Liu H, Li I, Hong C. Rethinking Domain-Specific Pretraining by Supervised or Self-Supervised Learning for Chest Radiograph Classification: A Comparative Study Against ImageNet Counterparts in Cold-Start Active Learning. Health Care Science 2025; 4:110-143. [PMID: 40241982 PMCID: PMC11997468 DOI: 10.1002/hcs2.70009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2024] [Revised: 01/05/2025] [Accepted: 01/26/2025] [Indexed: 04/18/2025]
Abstract
Objective: Deep learning (DL) has become the prevailing method in chest radiograph analysis, yet its performance heavily depends on large quantities of annotated images. To mitigate the cost, cold-start active learning (AL), comprising an initialization followed by subsequent learning, selects a small subset of informative data points for labeling. Recent advancements in pretrained models by supervised or self-supervised learning tailored to chest radiographs have shown broad applicability to diverse downstream tasks. However, their potential in cold-start AL remains unexplored.
Methods: To validate the efficacy of domain-specific pretraining, we compared two foundation models, supervised TXRV and self-supervised REMEDIS, with their general domain counterparts pretrained on ImageNet. Model performance was evaluated at both initialization and subsequent learning stages on two diagnostic tasks: pediatric pneumonia and COVID-19. For initialization, we assessed their integration with three strategies: diversity, uncertainty, and hybrid sampling. For subsequent learning, we focused on uncertainty sampling powered by different pretrained models. We also conducted statistical tests to compare the foundation models with ImageNet counterparts, investigate the relationship between initialization and subsequent learning, examine the performance of one-shot initialization against the full AL process, and investigate the influence of class balance in initialization samples on initialization and subsequent learning.
Results: First, domain-specific foundation models failed to outperform ImageNet counterparts in six out of eight experiments on informative sample selection. Both domain-specific and general pretrained models were unable to generate representations that could substitute for the original images as model inputs in seven of the eight scenarios. However, pretrained model-based initialization surpassed random sampling, the default approach in cold-start AL. Second, initialization performance was positively correlated with subsequent learning performance, highlighting the importance of initialization strategies. Third, one-shot initialization performed comparably to the full AL process, demonstrating the potential of reducing experts' repeated waiting during AL iterations. Last, a U-shaped correlation was observed between the class balance of initialization samples and model performance, suggesting that the class balance is more strongly associated with performance at middle budget levels than at low or high budgets.
Conclusions: In this study, we highlighted the limitations of medical pretraining compared to general pretraining in the context of cold-start AL. We also identified promising outcomes related to cold-start AL, including initialization based on pretrained models, the positive influence of initialization on subsequent learning, the potential for one-shot initialization, and the influence of class balance on middle-budget AL. Researchers are encouraged to improve medical pretraining for versatile DL foundations and explore novel AL methods.
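The uncertainty sampling step named in this abstract can be illustrated with a minimal sketch. The abstract does not state which uncertainty measure the authors used, so the entropy criterion, the function names, and the toy probabilities below are all assumptions made for illustration, not the study's implementation.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_most_uncertain(pred_probs, budget):
    """Return indices of the `budget` unlabeled samples with the highest entropy."""
    ranked = sorted(
        range(len(pred_probs)),
        key=lambda i: entropy(pred_probs[i]),
        reverse=True,
    )
    return ranked[:budget]

# Hypothetical predicted probabilities (negative vs. positive) for five unlabeled images.
pool = [[0.98, 0.02], [0.55, 0.45], [0.80, 0.20], [0.50, 0.50], [0.10, 0.90]]
chosen = select_most_uncertain(pool, budget=2)
print(chosen)  # indices of the two samples closest to a 50/50 prediction
```

In a full AL loop, the chosen samples would be sent for labeling, added to the training set, and the model retrained before the next selection round; one-shot initialization, as described above, would stop after a single selection.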
Affiliation(s)
- Han Yuan
  - Duke‐NUS Medical School, Centre for Quantitative Medicine, Singapore, Singapore
- Mingcheng Zhu
  - Duke‐NUS Medical School, Centre for Quantitative Medicine, Singapore, Singapore
  - Department of Engineering Science, University of Oxford, Oxford, UK
- Rui Yang
  - Duke‐NUS Medical School, Centre for Quantitative Medicine, Singapore, Singapore
- Han Liu
  - Department of Computer Science, Vanderbilt University, Nashville, Tennessee, USA
- Irene Li
  - Information Technology Center, University of Tokyo, Bunkyo‐ku, Japan
- Chuan Hong
  - Department of Biostatistics and Bioinformatics, Duke University, Durham, North Carolina, USA
2
Oh MY, Kim HS, Jung YM, Lee HC, Lee SB, Lee SM. Machine Learning-Based Explainable Automated Nonlinear Computation Scoring System for Health Score and an Application for Prediction of Perioperative Stroke: Retrospective Study. J Med Internet Res 2025; 27:e58021. [PMID: 40106818 PMCID: PMC11966079 DOI: 10.2196/58021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2024] [Revised: 03/24/2024] [Accepted: 10/30/2024] [Indexed: 03/22/2025]
Abstract
BACKGROUND: Machine learning (ML) has the potential to enhance performance by capturing nonlinear interactions. However, ML-based models have some limitations in terms of interpretability.
OBJECTIVE: This study aimed to develop and validate a more comprehensible and efficient ML-based scoring system using SHapley Additive exPlanations (SHAP) values.
METHODS: We developed and validated the Explainable Automated nonlinear Computation scoring system for Health (EACH) framework score. We developed a CatBoost-based prediction model, identified key features, and automatically detected the top 5 steepest slope change points based on SHAP plots. Subsequently, we developed a scoring system (EACH) and normalized the score. Finally, the EACH score was used to predict perioperative stroke. We developed the EACH score using data from the Seoul National University Hospital cohort and validated it using data from the Boramae Medical Center, which was geographically and temporally different from the development set.
RESULTS: When applied for perioperative stroke prediction among 38,737 patients undergoing noncardiac surgery, the EACH score achieved an area under the curve (AUC) of 0.829 (95% CI 0.753-0.892). In the external validation, the EACH score demonstrated superior predictive performance with an AUC of 0.784 (95% CI 0.694-0.871) compared with a traditional score (AUC=0.528, 95% CI 0.457-0.619) and another ML-based scoring generator (AUC=0.564, 95% CI 0.516-0.612).
CONCLUSIONS: The EACH score is a more precise, explainable ML-based risk tool, proven effective in real-world data. The EACH score outperformed the traditional scoring system and other prediction models based on different ML techniques in predicting perioperative stroke.
Affiliation(s)
- Mi-Young Oh
  - Department of Neurology, Sejong General Hospital, Bucheon-si, Republic of Korea
- Hee-Soo Kim
  - Department of Medical Informatics, School of Medicine, Keimyung University, Daegu, Republic of Korea
- Young Mi Jung
  - Department of Obstetrics and Gynecology, Seoul National University Bundang Hospital, Seongnam, Republic of Korea
- Hyung-Chul Lee
  - Department of Anesthesiology and Pain Medicine, Seoul National University College of Medicine, Seoul, Republic of Korea
  - Department of Anesthesiology and Pain Medicine, Seoul National University Hospital, Seoul, Republic of Korea
- Seung-Bo Lee
  - Department of Medical Informatics, School of Medicine, Keimyung University, Daegu, Republic of Korea
- Seung Mi Lee
  - Department of Obstetrics and Gynecology, College of Medicine, Seoul National University, Seoul, Republic of Korea
  - Department of Obstetrics and Gynecology, Seoul National University Hospital, Seoul, Republic of Korea
  - Innovative Medical Technology Research Institute, Seoul National University Hospital, Seoul, Republic of Korea
  - Institute of Reproductive Medicine and Population & Medical Big Data Research Center, Seoul National University, Seoul, Republic of Korea
3
Look CSJ, Teixayavong S, Djärv T, Ho AFW, Tan KBK, Ong MEH. Improved interpretable machine learning emergency department triage tool addressing class imbalance. Digit Health 2024; 10:20552076241240910. [PMID: 38708185 PMCID: PMC11067679 DOI: 10.1177/20552076241240910] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2023] [Accepted: 03/05/2024] [Indexed: 05/07/2024]
Abstract
Objective: The Score for Emergency Risk Prediction (SERP) is a novel mortality risk prediction score which leverages machine learning in supporting triage decisions. In its derivation study, SERP-2d, SERP-7d and SERP-30d demonstrated good predictive performance for 2-day, 7-day and 30-day mortality. However, the dataset used had significant class imbalance. This study aimed to determine if addressing class imbalance can improve SERP's performance, ultimately improving triage accuracy.
Methods: The Singapore General Hospital (SGH) emergency department (ED) dataset was used, which contains 1,833,908 ED records between 2008 and 2020. Records between 2008 and 2017 were randomly split into a training set (80%) and validation set (20%). The 2019 and 2020 records were used as test sets. To address class imbalance, we used random oversampling and random undersampling in the AutoScore-Imbalance framework to develop SERP+-2d, SERP+-7d, and SERP+-30d scores. The performance of SERP+, SERP, and the commonly used triage risk scores was compared.
Results: The developed SERP+ scores had five to six variables. The AUC of SERP+ scores (0.874 to 0.905) was higher than that of the corresponding SERP scores (0.859 to 0.894) on both test sets. This superior performance was statistically significant for SERP+-7d (2019: Z = -5.843, p < 0.001, 2020: Z = -4.548, p < 0.001) and SERP+-30d (2019: Z = -3.063, p = 0.002, 2020: Z = -3.256, p = 0.001). SERP+ outperformed SERP marginally on sensitivity, specificity, balanced accuracy, and positive predictive value measures. Negative predictive value was the same for SERP+ and SERP. Additionally, SERP+ showed better performance compared to the commonly used triage risk scores.
Conclusions: Accounting for class imbalance during training improved score performance for SERP+. Better stratification of even a small number of patients can be meaningful in the context of ED triage. Our findings reiterate the potential of machine learning-based scores like SERP+ in supporting accurate, data-driven triage decisions at the ED.
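The random oversampling and random undersampling mentioned in the Methods can be sketched generically as follows. This is not the AutoScore-Imbalance API (that framework is an R package whose interface is not shown in this listing); the function names, the fully balanced target ratio, and the toy cohort are assumptions chosen for illustration.

```python
import random
from collections import Counter

def _group_by_label(records, labels):
    """Collect records into a dict keyed by class label."""
    groups = {}
    for record, label in zip(records, labels):
        groups.setdefault(label, []).append(record)
    return groups

def random_oversample(records, labels, seed=0):
    """Duplicate randomly chosen records of smaller classes up to the largest class size."""
    rng = random.Random(seed)
    groups = _group_by_label(records, labels)
    target = max(len(group) for group in groups.values())
    resampled = []
    for label, group in groups.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        resampled.extend((record, label) for record in group + extra)
    rng.shuffle(resampled)
    return resampled

def random_undersample(records, labels, seed=0):
    """Randomly drop records of larger classes down to the smallest class size."""
    rng = random.Random(seed)
    groups = _group_by_label(records, labels)
    target = min(len(group) for group in groups.values())
    resampled = []
    for label, group in groups.items():
        resampled.extend((record, label) for record in rng.sample(group, target))
    rng.shuffle(resampled)
    return resampled

# Hypothetical toy cohort: 8 survivors (label 0) and 2 deaths (label 1).
records = list(range(10))
labels = [0] * 8 + [1] * 2
over_counts = Counter(label for _, label in random_oversample(records, labels))
under_counts = Counter(label for _, label in random_undersample(records, labels))
print(over_counts, under_counts)
```

Resampling of this kind is normally applied to the training split only, so that validation and test sets keep the real-world class ratio; this matches the abstract's wording that imbalance was accounted for during training.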
Affiliation(s)
- Clarisse SJ Look
  - Health Services and Systems Research, Duke-NUS Medical School, Singapore, Singapore
- Therese Djärv
  - Department of Medicine Solna, Karolinska Institute, Stockholm, Sweden
- Andrew FW Ho
  - Health Services and Systems Research, Duke-NUS Medical School, Singapore, Singapore
  - Department of Emergency Medicine, Singapore General Hospital, Singapore, Singapore
- Kenneth BK Tan
  - Department of Emergency Medicine, Singapore General Hospital, Singapore, Singapore
- Marcus EH Ong
  - Health Services and Systems Research, Duke-NUS Medical School, Singapore, Singapore
  - Department of Emergency Medicine, Singapore General Hospital, Singapore, Singapore
4
Xie F, Ning Y, Liu M, Li S, Saffari SE, Yuan H, Volovici V, Ting DSW, Goldstein BA, Ong MEH, Vaughan R, Chakraborty B, Liu N. A universal AutoScore framework to develop interpretable scoring systems for predicting common types of clinical outcomes. STAR Protoc 2023; 4:102302. [PMID: 37178115 PMCID: PMC10200969 DOI: 10.1016/j.xpro.2023.102302] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/31/2023] [Revised: 03/13/2023] [Accepted: 04/21/2023] [Indexed: 05/15/2023]
Abstract
The AutoScore framework can automatically generate data-driven clinical scores in various clinical applications. Here, we present a protocol for developing clinical scoring systems for binary, survival, and ordinal outcomes using the open-source AutoScore package. We describe steps for package installation, detailed data processing and checking, and variable ranking. We then explain how to iterate through steps for variable selection, score generation, fine-tuning, and evaluation to generate understandable and explainable scoring systems using data-driven evidence and clinical knowledge. For complete details on the use and execution of this protocol, please refer to Xie et al. (2020), Xie et al. (2022), Saffari et al. (2022), and the online tutorial https://nliulab.github.io/AutoScore/.
Affiliation(s)
- Feng Xie
  - Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore 169857, Singapore; Programme in Health Services and Systems Research, Duke-NUS Medical School, Singapore 169857, Singapore
- Yilin Ning
  - Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore 169857, Singapore
- Mingxuan Liu
  - Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore 169857, Singapore
- Siqi Li
  - Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore 169857, Singapore
- Seyed Ehsan Saffari
  - Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore 169857, Singapore; Programme in Health Services and Systems Research, Duke-NUS Medical School, Singapore 169857, Singapore
- Han Yuan
  - Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore 169857, Singapore
- Victor Volovici
  - Department of Neurosurgery, Erasmus MC University Medical Center, 3015 GD Rotterdam, the Netherlands; Department of Public Health, Erasmus MC, 3015 GD Rotterdam, the Netherlands
- Daniel Shu Wei Ting
  - Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore 169857, Singapore; Singapore Eye Research Institute, Singapore National Eye Centre, Singapore 168751, Singapore; SingHealth AI Office, Singapore Health Services, Singapore 168582, Singapore
- Benjamin Alan Goldstein
  - Programme in Health Services and Systems Research, Duke-NUS Medical School, Singapore 169857, Singapore; Department of Biostatistics and Bioinformatics, Duke University, Durham, NC 27710, USA
- Marcus Eng Hock Ong
  - Programme in Health Services and Systems Research, Duke-NUS Medical School, Singapore 169857, Singapore; Health Services Research Centre, Singapore Health Services, Singapore 169856, Singapore; Department of Emergency Medicine, Singapore General Hospital, Singapore 169608, Singapore
- Roger Vaughan
  - Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore 169857, Singapore
- Bibhas Chakraborty
  - Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore 169857, Singapore; Programme in Health Services and Systems Research, Duke-NUS Medical School, Singapore 169857, Singapore; Department of Biostatistics and Bioinformatics, Duke University, Durham, NC 27710, USA; Department of Statistics and Data Science, National University of Singapore, Singapore 117546, Singapore
- Nan Liu
  - Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore 169857, Singapore; Programme in Health Services and Systems Research, Duke-NUS Medical School, Singapore 169857, Singapore; SingHealth AI Office, Singapore Health Services, Singapore 168582, Singapore; Institute of Data Science, National University of Singapore, Singapore 117602, Singapore
5
Yu JY, Heo S, Xie F, Liu N, Yoon SY, Chang HS, Kim T, Lee SU, Hock Ong ME, Ng YY, Do shin S, Kajino K, Cha WC. Development and Asian-wide validation of the Grade for Interpretable Field Triage (GIFT) for predicting mortality in pre-hospital patients using the Pan-Asian Trauma Outcomes Study (PATOS). The Lancet Regional Health. Western Pacific 2023; 34:100733. [PMID: 37283981 PMCID: PMC10240358 DOI: 10.1016/j.lanwpc.2023.100733] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2022] [Revised: 01/24/2023] [Accepted: 02/19/2023] [Indexed: 03/07/2023]
Abstract
Background: Field triage is critical in injury patients as the appropriate transport of patients to trauma centers is directly associated with clinical outcomes. Several prehospital triage scores have been developed in Western and European cohorts; however, their validity and applicability in Asia remain unclear. Therefore, we aimed to develop and validate an interpretable field triage scoring system based on a multinational trauma registry in Asia.
Methods: This retrospective and multinational cohort study included all adult transferred injury patients from Korea, Malaysia, Vietnam, and Taiwan between 2016 and 2018. The outcome of interest was death in the emergency department (ED) after the patients' ED visit. Using these results, we developed the interpretable field triage score with the Korea registry using an interpretable machine learning framework and validated the score externally. The performance of each country's score was assessed using the area under the receiver operating characteristic curve (AUROC). Furthermore, a website for real-world application was developed using R Shiny.
Findings: The study population included 26,294, 9404, 673 and 826 transferred injury patients between 2016 and 2018 from Korea, Malaysia, Vietnam, and Taiwan, respectively. The corresponding rates of death in the ED were 0.30%, 0.60%, 4.0%, and 4.6%, respectively. Age and vital signs were found to be the significant variables for predicting mortality. External validation showed the accuracy of the model with an AUROC of 0.756-0.850.
Interpretation: The Grade for Interpretable Field Triage (GIFT) score is an interpretable and practical tool to predict mortality in field triage for trauma.
Funding: This research was supported by a grant from the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), funded by the Ministry of Health & Welfare, Republic of Korea (Grant Number: HI19C1328).
Affiliation(s)
- Jae Yong Yu
  - Department of Digital Health, Samsung Advanced Institute for Health Science & Technology (SAIHST), Sungkyunkwan University, Seoul, South Korea
  - Digital & Smart Health Office, Tan Tock Seng Hospital, Singapore
- Sejin Heo
  - Department of Digital Health, Samsung Advanced Institute for Health Science & Technology (SAIHST), Sungkyunkwan University, Seoul, South Korea
  - Department of Emergency Medicine, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, South Korea
- Feng Xie
  - Programme in Health Services and Systems Research, Duke–National University of Singapore Medical School, Singapore
  - Department of Biomedical Data Science, Stanford University, Stanford, USA
  - Department of Anesthesiology, Perioperative, and Pain Medicine, Stanford University, Stanford, USA
- Nan Liu
  - Programme in Health Services and Systems Research, Duke–National University of Singapore Medical School, Singapore
  - Health Service Research Centre, Singapore Health Services, Singapore
  - Institute of Data Science, National University of Singapore, Singapore
- Sun Yung Yoon
  - Department of Digital Health, Samsung Advanced Institute for Health Science & Technology (SAIHST), Sungkyunkwan University, Seoul, South Korea
- Han Sol Chang
  - Department of Digital Health, Samsung Advanced Institute for Health Science & Technology (SAIHST), Sungkyunkwan University, Seoul, South Korea
  - Department of Emergency Medicine, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, South Korea
- Taerim Kim
  - Department of Digital Health, Samsung Advanced Institute for Health Science & Technology (SAIHST), Sungkyunkwan University, Seoul, South Korea
  - Department of Emergency Medicine, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, South Korea
- Se Uk Lee
  - Department of Emergency Medicine, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, South Korea
- Marcus Eng Hock Ong
  - Programme in Health Services and Systems Research, Duke–National University of Singapore Medical School, Singapore
  - Department of Emergency Medicine, Singapore General Hospital, Singapore
- Yih Yng Ng
  - Digital & Smart Health Office, Tan Tock Seng Hospital, Singapore
- Sang Do Shin
  - Department of Emergency Medicine, Seoul National University College of Medicine, Seoul, South Korea
- Kentaro Kajino
  - Department of Emergency and Critical Care Medicine, Kansai Medical University, Moriguchi, Japan
- Won Chul Cha
  - Department of Digital Health, Samsung Advanced Institute for Health Science & Technology (SAIHST), Sungkyunkwan University, Seoul, South Korea
  - Department of Emergency Medicine, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, South Korea
  - Digital Innovation Center, Samsung Medical Center, Seoul, South Korea
6
Popovic D, Wertz M, Geisler C, Kaufmann J, Lähteenvuo M, Lieslehto J, Witzel J, Bogerts B, Walter M, Falkai P, Koutsouleris N, Schiltz K. Patterns of risk-Using machine learning and structural neuroimaging to identify pedophilic offenders. Front Psychiatry 2023; 14:1001085. [PMID: 37151966 PMCID: PMC10157073 DOI: 10.3389/fpsyt.2023.1001085] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2022] [Accepted: 03/27/2023] [Indexed: 05/09/2023]
Abstract
Background: Child sexual abuse (CSA) has become a focal point for lawmakers, law enforcement, and mental health professionals. With high prevalence rates around the world and far-reaching, often chronic, individual, and societal implications, CSA and its leading risk factor, pedophilia, have been well investigated. This has led to a wide range of clinical tools and actuarial instruments for diagnosis and risk assessment regarding CSA. However, the neurobiological underpinnings of pedosexual behavior, specifically regarding hands-on pedophilic offenders (PO), remain elusive. Such biomarkers for PO individuals could potentially improve the early detection of high-risk PO individuals and enhance efforts to prevent future CSA.
Aim: To use machine learning and MRI data to identify PO individuals.
Methods: From a single-center male cohort of 14 PO individuals and 15 matched healthy control (HC) individuals, we acquired diffusion tensor imaging data (anisotropy, diffusivity, and fiber tracking) in literature-based regions of interest (prefrontal cortex, anterior cingulate cortex, amygdala, and corpus callosum). We trained a linear support vector machine to discriminate between PO and HC individuals using these white matter (WM) microstructure data. Post hoc, we investigated the PO model decision scores with respect to sociodemographic (age, education, and IQ) and forensic characteristics (psychopathy, sexual deviance, and future risk of sexual violence) in the PO subpopulation. We assessed model specificity in an external cohort of 53 HC individuals.
Results: The classifier discriminated PO from HC individuals with a balanced accuracy of 75.5% (sensitivity = 64.3%, specificity = 86.7%, P₅₀₀₀ = 0.018) and an out-of-sample specificity to correctly identify HC individuals of 94.3%. The predictive brain pattern contained bilateral fractional anisotropy in the anterior cingulate cortex, diffusivity in the left amygdala, and structural prefrontal cortex-amygdala connectivity in both hemispheres. This brain pattern was associated with the number of previous child victims, the current stance on sexuality, and the professionally assessed risk of future sexually violent reoffending.
Conclusion: Aberrant white matter microstructure in the prefronto-temporo-limbic circuit could be a potential neurobiological correlate for PO individuals at high risk of reoffending with CSA. Although preliminary and exploratory at this point, our findings highlight the general potential of MRI-based biomarkers and particularly WM microstructure patterns for future CSA risk assessment and preventive efforts.
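The reported balanced accuracy can be checked against the reported sensitivity and specificity, since balanced accuracy is the mean of the two. The sketch below assumes the underlying counts were 9 of 14 PO and 13 of 15 HC individuals correctly classified; these counts are inferred from the rounded percentages and cohort sizes in the abstract, not stated in it, and the function name is illustrative.

```python
def balanced_accuracy(tp, fn, tn, fp):
    """Return (sensitivity, specificity, balanced accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of positives (PO) correctly identified
    specificity = tn / (tn + fp)  # share of negatives (HC) correctly identified
    return sensitivity, specificity, (sensitivity + specificity) / 2

# Counts inferred from the reported percentages (assumption): 9/14 PO, 13/15 HC.
sens, spec, bal_acc = balanced_accuracy(tp=9, fn=5, tn=13, fp=2)
print(round(sens * 100, 1), round(spec * 100, 1), round(bal_acc * 100, 1))
```

Under these assumed counts the three rounded values agree with the 64.3%, 86.7%, and 75.5% given in the Results. Balanced accuracy is a reasonable headline metric here because it weights both groups equally regardless of group size.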
Affiliation(s)
- David Popovic
  - Department of Psychiatry and Psychotherapy, Ludwig-Maximilians-University Munich, Munich, Germany
  - Department of Forensic Psychiatry, Ludwig-Maximilians-University Munich, Munich, Germany
  - International Max Planck Research School for Translational Psychiatry (IMPRS-TP), Munich, Germany
  - Max Planck Institute of Psychiatry, Munich, Germany
  - *Correspondence: David Popovic,
- Maximilian Wertz
  - Department of Psychiatry and Psychotherapy, Ludwig-Maximilians-University Munich, Munich, Germany
  - Department of Forensic Psychiatry, Ludwig-Maximilians-University Munich, Munich, Germany
- Carolin Geisler
  - Department of Dermatology, Venereology, and Allergology, Charité - Universitätsmedizin Berlin, Berlin, Germany
- Joern Kaufmann
  - Department of Neurology, Otto-von-Guericke-University, Magdeburg, Germany
- Markku Lähteenvuo
  - Department of Forensic Psychiatry, University of Eastern Finland, Niuvanniemi Hospital, Kuopio, Finland
  - Institute for Molecular Medicine FIMM, University of Helsinki, Helsinki, Finland
- Johannes Lieslehto
  - Department of Forensic Psychiatry, University of Eastern Finland, Niuvanniemi Hospital, Kuopio, Finland
- Joachim Witzel
  - Central State Forensic Psychiatric Hospital of Saxony-Anhalt, Uchtspringe, Germany
- Bernhard Bogerts
  - Salus Institut, Salus gGmbH, Magdeburg, Germany
  - Department of Psychiatry and Psychotherapy, Otto-von-Guericke-University, Magdeburg, Germany
- Martin Walter
  - Department of Psychiatry and Psychotherapy, Jena University Hospital, Jena, Germany
- Peter Falkai
  - Department of Psychiatry and Psychotherapy, Ludwig-Maximilians-University Munich, Munich, Germany
  - International Max Planck Research School for Translational Psychiatry (IMPRS-TP), Munich, Germany
  - Max Planck Institute of Psychiatry, Munich, Germany
- Nikolaos Koutsouleris
  - Department of Psychiatry and Psychotherapy, Ludwig-Maximilians-University Munich, Munich, Germany
  - International Max Planck Research School for Translational Psychiatry (IMPRS-TP), Munich, Germany
  - Max Planck Institute of Psychiatry, Munich, Germany
  - Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, United Kingdom
- Kolja Schiltz
  - Department of Psychiatry and Psychotherapy, Ludwig-Maximilians-University Munich, Munich, Germany
  - Department of Forensic Psychiatry, Ludwig-Maximilians-University Munich, Munich, Germany
7
Xie F, Zhou J, Lee JW, Tan M, Li S, Rajnthern LS, Chee ML, Chakraborty B, Wong AKI, Dagan A, Ong MEH, Gao F, Liu N. Benchmarking emergency department prediction models with machine learning and public electronic health records. Sci Data 2022; 9:658. [PMID: 36302776 PMCID: PMC9610299 DOI: 10.1038/s41597-022-01782-9] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 04/19/2022] [Accepted: 10/14/2022] [Indexed: 11/26/2022]
Abstract
The demand for emergency department (ED) services is increasing across the globe, particularly during the current COVID-19 pandemic. Clinical triage and risk assessment have become increasingly challenging due to the shortage of medical resources and the strain on hospital infrastructure caused by the pandemic. As a result of the widespread use of electronic health records (EHRs), we now have access to a vast amount of clinical data, which allows us to develop prediction models and decision support systems to address these challenges. To date, there is no widely accepted clinical prediction benchmark related to the ED based on large-scale public EHRs. An open-source benchmark data platform would streamline research workflows by eliminating cumbersome data preprocessing, and facilitate comparisons among different studies and methodologies. Based on the Medical Information Mart for Intensive Care IV Emergency Department (MIMIC-IV-ED) database, we created a benchmark dataset and proposed three clinical prediction benchmarks. This study provides future researchers with insights, suggestions, and protocols for managing data and developing predictive tools for emergency care.
Affiliation(s)
- Feng Xie
  - Centre for Quantitative Medicine and Programme in Health Services and Systems Research, Duke-NUS Medical School, Singapore, Singapore
- Jun Zhou
  - Institute of High Performance Computing, Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- Jin Wee Lee
  - Centre for Quantitative Medicine and Programme in Health Services and Systems Research, Duke-NUS Medical School, Singapore, Singapore
- Mingrui Tan
  - Institute of High Performance Computing, Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- Siqi Li
  - Centre for Quantitative Medicine and Programme in Health Services and Systems Research, Duke-NUS Medical School, Singapore, Singapore
- Logasan S/O Rajnthern
  - School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore, Singapore
- Marcel Lucas Chee
  - Faculty of Medicine, Nursing and Health Sciences, Monash University, Victoria, Australia
- Bibhas Chakraborty
  - Centre for Quantitative Medicine and Programme in Health Services and Systems Research, Duke-NUS Medical School, Singapore, Singapore
  - Department of Statistics and Data Science, National University of Singapore, Singapore, Singapore
  - Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, USA
- An-Kwok Ian Wong
  - Division of Pulmonary, Allergy, and Critical Care Medicine, Duke University, Durham, NC, USA
- Alon Dagan
  - Department of Emergency Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
  - MIT Critical Data, Laboratory for Computational Physiology, Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA
- Marcus Eng Hock Ong
  - Centre for Quantitative Medicine and Programme in Health Services and Systems Research, Duke-NUS Medical School, Singapore, Singapore
  - Department of Emergency Medicine, Singapore General Hospital, Singapore, Singapore
- Fei Gao
  - Institute of High Performance Computing, Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- Nan Liu
  - Centre for Quantitative Medicine and Programme in Health Services and Systems Research, Duke-NUS Medical School, Singapore, Singapore
  - SingHealth AI Health Program, Singapore Health Services, Singapore, Singapore
  - Institute of Data Science, National University of Singapore, Singapore, Singapore