1
Kuo KM, Chang CS. A meta-analysis of the diagnostic test accuracy of artificial intelligence predicting emergency department dispositions. BMC Med Inform Decis Mak 2025; 25:187. [PMID: 40375078 PMCID: PMC12082892 DOI: 10.1186/s12911-025-03010-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2024] [Accepted: 04/23/2025] [Indexed: 05/18/2025]
Abstract
BACKGROUND The rapid advancement of Artificial Intelligence (AI) has led to its widespread application across various domains, with encouraging outcomes. Many studies have used AI to forecast emergency department (ED) disposition, aiming to predict patient outcomes earlier and to allocate resources better; however, comprehensive reviews that quantitatively assess the performance of these predictive models are lacking. This study conducts a meta-analysis to assess the diagnostic accuracy of AI in predicting ED disposition, encompassing admission, critical care, and mortality. METHODS Multiple databases, including Scopus, Springer, ScienceDirect, PubMed, Wiley, Sage, and Google Scholar, were searched until December 31, 2023, to gather relevant literature. Risk of bias was assessed using the Prediction Model Risk of Bias Assessment Tool. Pooled estimates of sensitivity, specificity, and area under the receiver operating characteristic curve (AUROC) were calculated to evaluate AI's predictive performance. Sub-group analyses were performed to explore covariates affecting AI predictive model performance. RESULTS The study included 88 articles comprising 117 AI models, of which 39, 45, and 33 models predicted admission, critical care, and mortality, respectively. The reported sensitivity, specificity, and AUROC values are pooled summary measures derived from the component studies included in this meta-analysis. AI's summary sensitivity, specificity, and AUROC for predicting admission were 0.81 (95% Confidence Interval [CI] 0.74-0.86), 0.87 (95% CI 0.81-0.91), and 0.87 (95% CI 0.84-0.93), respectively. For critical care, the values were 0.86 (95% CI 0.79-0.91), 0.89 (95% CI 0.83-0.93), and 0.93 (95% CI 0.89-0.95), respectively, and for mortality, they were 0.85 (95% CI 0.80-0.89), 0.94 (95% CI 0.90-0.96), and 0.93 (95% CI 0.89-0.96), respectively. Emergent sample characteristics and AI techniques showed evidence of being significant covariates influencing the heterogeneity of AI predictive models for ED disposition. CONCLUSIONS The meta-analysis indicates promising performance of AI in predicting ED disposition, with some room for improvement, especially in sensitivity. Future research could explore advanced AI techniques such as ensemble learning and cross-validation with hyper-parameter tuning to enhance predictive model efficacy. TRIAL REGISTRATION This systematic review was not registered with PROSPERO or any other similar registry because the review was completed prior to the opportunity for registration, and PROSPERO currently does not accept registrations for reviews that are already completed. We are committed to transparency and have adhered to best practices in systematic review methodology throughout this study.
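As a reading aid for the pooled figures quoted in this abstract, the sketch below converts each pooled sensitivity/specificity pair into positive and negative likelihood ratios. This is not part of the meta-analysis itself: the function name `likelihood_ratios` is hypothetical, and ratios computed from pooled summary points are only a rough approximation of what a formal analysis would report.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a given sensitivity and specificity."""
    lr_pos = sensitivity / (1.0 - specificity)   # how much a positive result raises the odds
    lr_neg = (1.0 - sensitivity) / specificity   # how much a negative result lowers the odds
    return lr_pos, lr_neg

# Pooled (sensitivity, specificity) pairs as quoted in the abstract.
pooled = {
    "admission": (0.81, 0.87),
    "critical care": (0.86, 0.89),
    "mortality": (0.85, 0.94),
}

for outcome, (sens, spec) in pooled.items():
    lr_pos, lr_neg = likelihood_ratios(sens, spec)
    print(f"{outcome}: LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```

Under these assumptions, the high pooled specificity for mortality gives it the largest positive likelihood ratio of the three outcomes.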
Affiliation(s)
- Kuang-Ming Kuo
- Department of Business Management, National United University, No. 1, Lienda, Miaoli, 360301, Taiwan
- Chao Sheng Chang
- Department of Emergency Medicine, E-Da Hospital, Kaohsiung City, Taiwan.
- Department of Occupational Therapy, I-Shou University, Kaohsiung City, Taiwan.
2
Hong T, Huang J, Deng J, Kuang L, Sun M, Wang Q, Luo C, Zhao J, Liu X, Wang H. The Scoring Model to Predict ICU Stay and Mortality After Emergency Admissions in Atrial Fibrillation: A Retrospective Study of 30 366 Patients. Clin Cardiol 2025; 48:e70101. [PMID: 39976638 PMCID: PMC11841604 DOI: 10.1002/clc.70101] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2024] [Revised: 01/31/2025] [Accepted: 02/10/2025] [Indexed: 02/23/2025]
Abstract
BACKGROUND Rapid assessment of a patient's condition is crucial for the prognosis of atrial fibrillation (AF) patients admitted to the emergency department (ED). We aim to derive and validate a more accurate and simplified scoring model to optimize the triage of AF patients in the ED. MATERIALS AND METHODS We conducted a retrospective study using data from the Medical Information Mart for Intensive Care (MIMIC-IV) database and developed scoring models employing the Random Forest algorithm. The area under the receiver operating characteristic (ROC) curve (AUC) was used to measure the performance of the prediction for intensive care unit (ICU) stay and for the likelihood of death within 3, 7, and 30 days following ED admission. RESULTS The study included 30 366 AF patients, randomly divided into training, validation, and testing cohorts at a 7:1:2 ratio. The training set consisted of 21 257 patients, the validation set included 3036 patients, and the remaining 6073 patients formed the testing set. Among the cohorts, 9594 patients (32%) required ICU transfers, with mortality rates of 1% at 3 days, 3% at 7 days, and 6% at 30 days. In the testing set, the scoring models demonstrated strong discriminative ability with AUCs of 0.724 for ICU stay, 0.782 for 3-day mortality, 0.755 for 7-day mortality, and 0.767 for 30-day mortality. CONCLUSION We derived and validated novel simplified scoring models with good discriminative performance to predict the likelihood of ICU stay and of 3-day, 7-day, and 30-day death in AF patients after ED admission.
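The cohort sizes quoted in this abstract can be checked against the stated 7:1:2 ratio with a few lines of arithmetic. The sketch below is only a sanity check on the reported numbers (30 366 patients split into cohorts of 21 257, 3036, and 6073); the variable and function names are illustrative, and the cohort labels assume the usual training/validation/testing order.

```python
total = 30366
# Cohort sizes as reported; labels assume training / validation / testing order.
cohorts = {"training": 21257, "validation": 3036, "testing": 6073}
nominal = {"training": 0.7, "validation": 0.1, "testing": 0.2}

def split_fractions(cohorts, total):
    """Return each cohort's share of the total sample."""
    return {name: n / total for name, n in cohorts.items()}

fractions = split_fractions(cohorts, total)
sizes_add_up = sum(cohorts.values()) == total
max_deviation = max(abs(fractions[name] - nominal[name]) for name in nominal)

print("sizes add up:", sizes_add_up)
for name, frac in fractions.items():
    print(f"{name}: {frac:.4f} (nominal {nominal[name]})")
```

The three cohorts sum to the full sample, and each share deviates from its nominal fraction by well under 0.1 percentage points.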
Affiliation(s)
- Tao Hong
- Postgraduate College, Dalian Medical University, Dalian, China
- Department of Cardiovascular Surgery, General Hospital of Northern Theater Command, Shenyang, China
- Jian Huang
- Department of Diagnostic Ultrasound, Sir Run Run Shaw Hospital, Zhejiang University College of Medicine, Hangzhou, China
- Jiewen Deng
- Department of Neurosurgery, Xiushan People's Hospital, Chongqing, China
- Lirong Kuang
- Department of Ophthalmology, Wuhan Wuchang Hospital (Wuchang Hospital Affiliated to Wuhan University of Science and Technology), Wuhan, China
- Qianqian Wang
- College of Medical Informatics, Chongqing Medical University, Chongqing, China
- Chao Luo
- The People's Hospital of Shayang County, Jingmen, China
- Jikai Zhao
- Department of Cardiovascular Surgery, General Hospital of Northern Theater Command, Shenyang, China
- Xiaozhu Liu
- Emergency and Critical Care Medical Center, Beijing Shijitan Hospital, Capital Medical University, Beijing, China
- Huishan Wang
- Postgraduate College, Dalian Medical University, Dalian, China
- Department of Cardiovascular Surgery, General Hospital of Northern Theater Command, Shenyang, China
3
Martínez‐Licort R, Sahelices B, de la Torre I, Vegas J. Machine Learning Methods for Predicting Syncope Severity in the Emergency Department: A Retrospective Analysis. Health Sci Rep 2025; 8:e70477. [PMID: 39995795 PMCID: PMC11847648 DOI: 10.1002/hsr2.70477] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2024] [Revised: 01/10/2025] [Accepted: 01/27/2025] [Indexed: 02/26/2025]
Abstract
Background and Aims Syncope is a frequent reason for hospital emergency admissions, presenting significant challenges in determining its cause and associated risks. Despite its prevalence, research on using artificial intelligence (AI) to improve patient outcomes in this context has been limited. The main objective of the current study is to predict the severity of syncope cases using machine learning (ML) algorithms based on data collected during on-site treatment and ambulance transportation. Methods This study analyzed 572 records from five Spanish public hospitals (2018-2021), focusing on hospitalization, ICU admission, and mortality. A three-phase strategy was used: data preprocessing, model exploration, and model selection. In the exploration phase, three data transformation techniques were applied, and for each of them models were evaluated using stratified 10-fold cross-validation, optimizing AUC, accuracy, and recall, with emphasis on minimizing false negatives (FN). The top-performing models were fine-tuned and tested. The strategy was implemented using Python libraries, and a diverse set of ML classifiers was applied, including linear discriminant analysis (LDA), random forest (RF), dummy classifier (DC), and gradient boosting (GB). Results The RF classifier performed best for predicting hospitalization, reducing FN to 37% and achieving a true negative rate (TN) of 78%, with a recall of 0.63 and accuracy of 0.74. For ICU, DC showed FN = 29%, TN = 57%, recall = 0.625, and accuracy = 0.58. The LDA classifier excelled in predicting hospital mortality, with FN = 40%, TN = 89%, recall = 0.6, and accuracy = 0.88. These results indicate that RF performed best for predicting hospitalization, DC for ICU admission, and LDA for mortality. Conclusions This study provides an experimental foundation for the application of ML techniques in managing syncope in the ED. The intention is to stimulate AI research in this area, with a view to integrating these models into clinical workflows in the future.
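The FN and recall figures in this abstract are linked by a simple identity when FN is read as the false-negative rate among true positive cases: recall = 1 - FN rate. That reading is an assumption, since the abstract does not define the denominator. The sketch below applies the identity to the three reported pairs; the helper names are hypothetical.

```python
def recall_from_fn_rate(fn_rate):
    """Recall (sensitivity) is the complement of the false-negative rate."""
    return 1.0 - fn_rate

def is_consistent(fn_rate, recall, tol=0.005):
    """Check whether a reported recall matches 1 - FN rate within a rounding tolerance."""
    return abs(recall_from_fn_rate(fn_rate) - recall) <= tol

# (FN rate, reported recall) pairs as quoted in the abstract.
reported = {
    "hospitalization (RF)": (0.37, 0.63),
    "ICU (DC)": (0.29, 0.625),
    "mortality (LDA)": (0.40, 0.60),
}

for label, (fn_rate, recall) in reported.items():
    implied = recall_from_fn_rate(fn_rate)
    print(f"{label}: implied recall {implied:.3f}, reported {recall}, "
          f"consistent = {is_consistent(fn_rate, recall)}")
```

Under this reading, the hospitalization and mortality pairs satisfy the identity, while the ICU pair does not, so the ICU percentage may have been computed on a different basis.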
Affiliation(s)
- Benjamín Sahelices
- GCME Research Group, Department of Computer Science, University of Valladolid, Valladolid, Spain
- Isabel de la Torre
- Department of Signal Theory and Communications and Telematics Engineering, University of Valladolid, Valladolid, Spain
- Jesús Vegas
- Department of Computer Science, University of Valladolid, Valladolid, Spain
4
Porto BM. Improving triage performance in emergency departments using machine learning and natural language processing: a systematic review. BMC Emerg Med 2024; 24:219. [PMID: 39558255 PMCID: PMC11575054 DOI: 10.1186/s12873-024-01135-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2024] [Accepted: 11/11/2024] [Indexed: 11/20/2024]
Abstract
BACKGROUND In Emergency Departments (EDs), triage is crucial for determining patient severity and prioritizing care, typically using the Manchester Triage Scale (MTS). Traditional triage systems, reliant on human judgment, are prone to under-triage and over-triage, resulting in variability, bias, and incorrect patient classification. Studies suggest that Machine Learning (ML) and Natural Language Processing (NLP) could enhance triage accuracy and consistency. This review analyzes studies on ML and/or NLP algorithms for ED patient triage. METHODS Following Preferred Reporting Items for Systematic Review and Meta-Analysis (PRISMA) guidelines, we conducted a systematic review across five databases: Web of Science, PubMed, Scopus, IEEE Xplore, and ACM Digital Library, from the inception of each database to October 2023. The risk of bias was assessed using the Prediction model Risk of Bias Assessment Tool (PROBAST). Only articles employing at least one ML and/or NLP method for patient triage classification were included. RESULTS Sixty studies covering 57 ML algorithms were included. Logistic Regression (LR) was the most used model, while eXtreme Gradient Boosting (XGBoost), decision tree-based algorithms with Gradient Boosting (GB), and Deep Neural Networks (DNNs) showed superior performance. Frequent predictive variables included demographics and vital signs, with oxygen saturation, chief complaints, systolic blood pressure, age, and mode of arrival being the most retained. The ML algorithms showed a significant risk of bias, reflecting critical ratings in the bias assessment of the classification models. CONCLUSION NLP methods improved ML algorithms' classification capability using triage nursing and medical notes and structured clinical data compared to algorithms using only structured data. Feature engineering (FE) and class imbalance correction methods enhanced ML workflows' performance, but FE and eXplainable Artificial Intelligence (XAI) were underexplored in this field. 
Registration and funding. This systematic review has been registered (registration number: CRD42024604529) in the International Prospective Register of Systematic Reviews (PROSPERO) and can be accessed online at the following URL: https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=604529 . Funding for this work was provided by the National Council for Scientific and Technological Development (CNPq), Brazil.
Affiliation(s)
- Bruno Matos Porto
- Industrial Engineering Department, Federal University of Rio Grande do Sul, Av. Osvaldo Aranha 55, Porto Alegre, RS, Brazil.
5
Liu J, Duan X, Duan M, Jiang Y, Mao W, Wang L, Liu G. Development and external validation of an interpretable machine learning model for the prediction of intubation in the intensive care unit. Sci Rep 2024; 14:27174. [PMID: 39511328 PMCID: PMC11544239 DOI: 10.1038/s41598-024-77798-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2024] [Accepted: 10/25/2024] [Indexed: 11/15/2024]
Abstract
Given the limited capacity to accurately determine the necessity for intubation in intensive care unit settings, this study aimed to develop and externally validate an interpretable machine learning model capable of predicting the need for intubation among ICU patients. Seven widely used machine learning (ML) algorithms were employed to construct the prediction models. Adult patients from the Medical Information Mart for Intensive Care IV database who stayed in the ICU for longer than 24 h were included in the development and internal validation. The model was subsequently externally validated using the eICU-CRD database. In addition, the SHapley Additive exPlanations (SHAP) method was employed to interpret the influence of individual parameters on the predictions made by the model. A total of 11,988 patients were included in the final cohort for this study. The CatBoost model demonstrated the best performance (AUC: 0.881). In the external validation set, the efficacy of our model was also confirmed (AUC: 0.750), which suggests robust generalization capabilities. The Glasgow Coma Scale (GCS), body mass index (BMI), arterial partial pressure of oxygen (PaO2), respiratory rate (RR) and length of stay (LOS) before ICU were the top 5 features of the CatBoost model with the greatest impact. We developed an externally validated CatBoost model that accurately predicts the need for intubation in ICU patients within 24 to 96 h of admission; it facilitates clinical decision-making and has the potential to improve patient outcomes. The prediction model utilizes readily obtainable monitoring parameters and integrates the SHAP method to enhance interpretability, providing clinicians with clear insights into the factors influencing predictions.
Affiliation(s)
- Jianyuan Liu
- Emergency Medicine Clinical Research Center, Beijing Chao-Yang Hospital, Capital Medical University, Beijing, China
- Xiangjie Duan
- Department of Infectious Diseases, Department of Emergency Medicine, The First Affiliated Hospital of Jinan University, Guangzhou, China
- Minjie Duan
- Center for Artificial Intelligence in Medicine, Chinese PLA General Hospital, Beijing, China
- Yu Jiang
- Department of Respiratory and Critical Care Medicine, University-Town Hospital of Chongqing Medical University, Chongqing, China
- Wei Mao
- Department of Emergency and Critical Care Medicine, University-Town Hospital of Chongqing Medical University, Chongqing, China
- Lilin Wang
- Department of Emergency and Critical Care Medicine, University-Town Hospital of Chongqing Medical University, Chongqing, China
- Gang Liu
- Department of Emergency and Critical Care Medicine, University-Town Hospital of Chongqing Medical University, Chongqing, China.
6
Jawad BN, Shaker SM, Altintas I, Eugen-Olsen J, Nehlin JO, Andersen O, Kallemose T. Development and validation of prognostic machine learning models for short- and long-term mortality among acutely admitted patients based on blood tests. Sci Rep 2024; 14:5942. [PMID: 38467752 PMCID: PMC10928126 DOI: 10.1038/s41598-024-56638-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2023] [Accepted: 03/08/2024] [Indexed: 03/13/2024]
Abstract
Several scores predicting mortality at the emergency department have been developed. However, all have shortcomings: they are either simple and applicable in a clinical setting but perform poorly, or advanced and high-performing but clinically difficult to implement. This study aimed to explore whether machine learning algorithms could predict all-cause short- and long-term mortality based on the routine blood tests collected at admission. METHODS We analyzed data from a retrospective cohort study, including patients > 18 years admitted to the Emergency Department (ED) of Copenhagen University Hospital Hvidovre, Denmark between November 2013 and March 2017. The primary outcomes were 3-, 10-, 30-, and 365-day mortality after admission. PyCaret, an automated machine learning library, was used to evaluate the predictive performance of fifteen machine learning algorithms using the area under the receiver operating characteristic curve (AUC). RESULTS Data from 48,841 admissions were analyzed, of which 34,190 (70%) were randomly assigned to training data and 14,651 (30%) to test data. Eight machine learning algorithms achieved very good to excellent AUC results on test data, in a range of 0.85-0.93. In prediction of short-term mortality, lactate dehydrogenase (LDH), leukocyte counts and differentials, blood urea nitrogen (BUN) and mean corpuscular hemoglobin concentration (MCHC) were the best predictors, whereas prediction of long-term mortality was favored by age, LDH, soluble urokinase plasminogen activator receptor (suPAR), albumin, and BUN. CONCLUSION The findings suggest that measures of biomarkers taken from one blood sample during admission to the ED can identify patients at high risk of short- and long-term mortality following emergency admissions.
Affiliation(s)
- Baker Nawfal Jawad
- Department of Clinical Research, Copenhagen University Hospital Amager and Hvidovre, Hvidovre, Denmark.
- Department of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark.
- Izzet Altintas
- Department of Clinical Research, Copenhagen University Hospital Amager and Hvidovre, Hvidovre, Denmark
- Emergency Department, Copenhagen University Hospital Amager and Hvidovre, Hvidovre, Denmark
- Department of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark
- Jesper Eugen-Olsen
- Department of Clinical Research, Copenhagen University Hospital Amager and Hvidovre, Hvidovre, Denmark
- Jan O Nehlin
- Department of Clinical Research, Copenhagen University Hospital Amager and Hvidovre, Hvidovre, Denmark
- Ove Andersen
- Department of Clinical Research, Copenhagen University Hospital Amager and Hvidovre, Hvidovre, Denmark
- Emergency Department, Copenhagen University Hospital Amager and Hvidovre, Hvidovre, Denmark
- Department of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark
- Thomas Kallemose
- Department of Clinical Research, Copenhagen University Hospital Amager and Hvidovre, Hvidovre, Denmark
7
Rahmatinejad Z, Dehghani T, Hoseini B, Rahmatinejad F, Lotfata A, Reihani H, Eslami S. A comparative study of explainable ensemble learning and logistic regression for predicting in-hospital mortality in the emergency department. Sci Rep 2024; 14:3406. [PMID: 38337000 PMCID: PMC10858239 DOI: 10.1038/s41598-024-54038-4] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 09/14/2023] [Accepted: 02/07/2024] [Indexed: 02/12/2024]
Abstract
This study addresses the challenges associated with emergency department (ED) overcrowding and emphasizes the need for efficient risk stratification tools to identify high-risk patients for early intervention. While several scoring systems, often based on logistic regression (LR) models, have been proposed to indicate patient illness severity, this study aims to compare the predictive performance of ensemble learning (EL) models with LR for in-hospital mortality in the ED. A cross-sectional single-center study was conducted at the ED of Imam Reza Hospital in northeast Iran from March 2016 to March 2017. The study included adult patients with emergency severity index levels one to three. EL models using Bagging, AdaBoost, random forests (RF), Stacking and extreme gradient boosting (XGB) algorithms, along with an LR model, were constructed. The training and validation visits from the ED were randomly divided into 80% and 20%, respectively. After training the proposed models using tenfold cross-validation, their predictive performance was evaluated. Model performance was compared using the Brier score (BS), the area under the receiver operating characteristic curve (AUROC), the area under the precision-recall curve (AUCPR), the Hosmer-Lemeshow (H-L) goodness-of-fit test, precision, sensitivity, accuracy, F1-score, and Matthews correlation coefficient (MCC). The study included 2025 unique patients admitted to the hospital's ED, with a total percentage of hospital deaths at approximately 19%. In the training group and the validation group, 274 of 1476 (18.6%) and 152 of 728 (20.8%) patients died during hospitalization, respectively. According to the evaluation of the presented framework, EL models, particularly Bagging, predicted in-hospital mortality with the highest AUROC (0.839, CI (0.802-0.875)) and AUCPR = 0.64, comparable in terms of discrimination power with LR (AUROC (0.826, CI (0.787-0.864)) and AUCPR = 0.61). XGB achieved the highest precision (0.83), sensitivity (0.831), accuracy (0.842), F1-score (0.833), and the highest MCC (0.48). Additionally, the most accurate models in the unbalanced dataset belonged to RF with the lowest BS (0.128). Although all studied models overestimate mortality risk and have insufficient calibration (P > 0.05), stacking demonstrated relatively good agreement between predicted and actual mortality. EL models are not superior to LR in predicting in-hospital mortality in the ED. Both EL and LR models can be considered as screening tools to identify patients at risk of mortality.
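Two of the less familiar metrics named in this abstract, the Brier score and the Matthews correlation coefficient, can be written out in a few lines. The sketch below uses their standard definitions on small made-up inputs; the toy labels, probabilities, and confusion-matrix counts are hypothetical and are not data from the study.

```python
import math

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

def matthews_corrcoef(tp, fp, fn, tn):
    """MCC from confusion-matrix counts; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical toy example: 1 = died in hospital, 0 = survived.
y_true = [1, 0, 0, 1, 0]
y_prob = [0.9, 0.2, 0.1, 0.6, 0.4]
print(f"Brier score: {brier_score(y_true, y_prob):.3f}")  # lower is better

# Hypothetical confusion-matrix counts.
print(f"MCC: {matthews_corrcoef(tp=40, fp=10, fn=20, tn=130):.3f}")  # ranges from -1 to 1
```

The Brier score rewards well-calibrated probabilities, whereas MCC summarizes a thresholded confusion matrix and stays informative when classes are imbalanced.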
Affiliation(s)
- Zahra Rahmatinejad
- Department of Medical Informatics, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
- Toktam Dehghani
- Department of Medical Informatics, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
- Toos Institute of Higher Education, Mashhad, Iran
- Benyamin Hoseini
- Pharmaceutical Research Center, Pharmaceutical Technology Institute, Mashhad University of Medical Sciences, Mashhad, Iran
- Fatemeh Rahmatinejad
- Department of Medical Informatics, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
- Aynaz Lotfata
- Department of Pathology, Microbiology, and Immunology, School of Veterinary Medicine, University of California, Davis, CA, USA
- Hamidreza Reihani
- Department of Emergency Medicine, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran.
- Saeid Eslami
- Department of Medical Informatics, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran.
- Pharmaceutical Research Center, Pharmaceutical Technology Institute, Mashhad University of Medical Sciences, Mashhad, Iran.
- Department of Medical Informatics, Amsterdam UMC - Location AMC, University of Amsterdam, Amsterdam, The Netherlands.
8
Stoessel D, Fa R, Artemova S, von Schenck U, Nowparast Rostami H, Madiot PE, Landelle C, Olive F, Foote A, Moreau-Gaudry A, Bosson JL. Early prediction of in-hospital mortality utilizing multivariate predictive modelling of electronic medical records and socio-determinants of health of the first day of hospitalization. BMC Med Inform Decis Mak 2023; 23:259. [PMID: 37957690 PMCID: PMC10644472 DOI: 10.1186/s12911-023-02356-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/21/2023] [Accepted: 10/27/2023] [Indexed: 11/15/2023]
Abstract
BACKGROUND In France, an average of 4% of hospitalized patients die during their hospital stay. To aid medical decision making and the attribution of resources, it is essential to identify patients at high risk of dying in hospital within a few days of admission. METHODS We used de-identified routine patient data available in the first 2 days of hospitalization in a French University Hospital (between 2016 and 2018) to build models predicting in-hospital mortality (at ≥ 2 and ≤ 30 days after admission). We tested nine different machine learning algorithms with repeated 10-fold cross-validation. Models were trained with 283 variables including age, sex, socio-determinants of health, laboratory test results, procedures (Classification of Medical Acts), medications (Anatomical Therapeutic Chemical code), hospital department/unit and home address (urban, rural etc.). The models were evaluated using various performance metrics. The dataset contained 123,729 admissions, of which the outcome for 3542 was all-cause in-hospital mortality and 120,187 admissions (no death reported within 30 days) were controls. RESULTS The support vector machine, logistic regression and XGBoost algorithms demonstrated high discrimination, with a balanced accuracy of 0.81 (95%CI 0.80-0.82), 0.82 (95%CI 0.80-0.83) and 0.83 (95%CI 0.80-0.83) and an AUC of 0.90 (95%CI 0.88-0.91), 0.90 (95%CI 0.89-0.91) and 0.90 (95%CI 0.89-0.91), respectively. The most predictive variables for in-hospital mortality in all three models were older age (greater risk), and admission with a confirmed appointment (reduced risk). CONCLUSION We propose three highly discriminating machine-learning models that could improve clinical and organizational decision making for adult patients at hospital admission.
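The class counts quoted in this abstract (3542 deaths versus 120,187 controls) illustrate why balanced accuracy is a more informative headline metric than plain accuracy here; that the authors chose it for this reason is an assumption. The sketch below scores a hypothetical classifier that always predicts "no death"; the function names are illustrative.

```python
def accuracy(tp, fp, fn, tn):
    """Share of all cases classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)

def balanced_accuracy(tp, fp, fn, tn):
    """Mean of sensitivity and specificity, so each class counts equally."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2

# Class counts as reported in the abstract.
deaths, controls = 3542, 120187

# Hypothetical baseline that predicts "no death" for every admission.
always_negative = dict(tp=0, fp=0, fn=deaths, tn=controls)

print(f"accuracy: {accuracy(**always_negative):.4f}")
print(f"balanced accuracy: {balanced_accuracy(**always_negative):.2f}")
```

The trivial baseline reaches roughly 97% accuracy while its balanced accuracy is 0.5, which puts the reported balanced accuracies of 0.81 to 0.83 in context.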
Affiliation(s)
- Daniel Stoessel
- Life Science Analytics, Clinical Solutions, Elsevier, Berlin, Germany
- Rui Fa
- Elsevier Health Analytics, London, UK
- Svetlana Artemova
- Public Health Department, CHU Grenoble Alpes, Grenoble, F-38000, France
- Caroline Landelle
- Public Health Department, CHU Grenoble Alpes, Grenoble, F-38000, France
- TIMC CNRS UMR5525, Université Grenoble Alpes, Grenoble, F-38000, France
- Fréderic Olive
- Public Health Department, CHU Grenoble Alpes, Grenoble, F-38000, France
- Alison Foote
- Public Health Department, CHU Grenoble Alpes, Grenoble, F-38000, France
- Alexandre Moreau-Gaudry
- Public Health Department, CHU Grenoble Alpes, Grenoble, F-38000, France
- TIMC CNRS UMR5525, Université Grenoble Alpes, Grenoble, F-38000, France
- Jean-Luc Bosson
- Public Health Department, CHU Grenoble Alpes, Grenoble, F-38000, France.
- TIMC CNRS UMR5525, Université Grenoble Alpes, Grenoble, F-38000, France.
9
Choi A, Choi SY, Chung K, Chung HS, Song T, Choi B, Kim JH. Development of a machine learning-based clinical decision support system to predict clinical deterioration in patients visiting the emergency department. Sci Rep 2023; 13:8561. [PMID: 37237057 DOI: 10.1038/s41598-023-35617-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/22/2023] [Accepted: 05/21/2023] [Indexed: 05/28/2023]
Abstract
This study aimed to develop a machine learning-based clinical decision support system for emergency departments based on the decision-making framework of physicians. We extracted 27 fixed and 93 observation features using data on vital signs, mental status, laboratory results, and electrocardiograms during emergency department stay. Outcomes included intubation, admission to the intensive care unit, inotrope or vasopressor administration, and in-hospital cardiac arrest. The eXtreme gradient boosting algorithm was used to learn and predict each outcome. Specificity, sensitivity, precision, F1 score, area under the receiver operating characteristic curve (AUROC), and area under the precision-recall curve were assessed. We analyzed 303,345 patients with 4,787,121 input data, resampled into 24,148,958 1 h-units. The models displayed a discriminative ability to predict outcomes (AUROC > 0.9), and the model with lagging 6 and leading 0 displayed the highest value. The AUROC curve of in-hospital cardiac arrest had the smallest change, with increased lagging for all outcomes. With inotropic use, intubation, and intensive care unit admission, the range of AUROC curve change with the leading 6 was the highest according to different amounts of previous information (lagging). In this study, a human-centered approach to emulate the clinical decision-making process of emergency physicians has been adopted to enhance the use of the system. Machine learning-based clinical decision support systems customized according to clinical situations can help improve the quality of care.
Affiliation(s)
- Arom Choi
- Department of Emergency Medicine, Yonsei University College of Medicine, 50 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea
- Institute for Innovation in Digital Healthcare, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea
- So Yeon Choi
- Department of Emergency Medicine, Yonsei University College of Medicine, 50 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea
- Kyungsoo Chung
- Division of Pulmonary and Critical Care Medicine, Department of Internal Medicine, Yonsei University College of Medicine, 50 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea
- Institute for Innovation in Digital Healthcare, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea
- Hyun Soo Chung
- Department of Emergency Medicine, Yonsei University College of Medicine, 50 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea
- Taeyoung Song
- LG Electronics, 128 Yeoui-daero, Yeongdeungpo-gu, Seoul, 07336, Republic of Korea
- Byunghun Choi
- LG Electronics, 128 Yeoui-daero, Yeongdeungpo-gu, Seoul, 07336, Republic of Korea
- Ji Hoon Kim
- Department of Emergency Medicine, Yonsei University College of Medicine, 50 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea.
- Institute for Innovation in Digital Healthcare, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea.
10
Li H, Tao X, Liang T, Jiang J, Zhu J, Wu S, Chen L, Zhang Z, Zhou C, Sun X, Huang S, Chen J, Chen T, Ye Z, Chen W, Guo H, Yao Y, Liao S, Yu C, Fan B, Liu Y, Lu C, Hu J, Xie Q, Wei X, Fang C, Liu H, Huang C, Pan S, Zhan X, Liu C. Comprehensive AI-assisted tool for ankylosing spondylitis based on multicenter research outperforms human experts. Front Public Health 2023; 11:1063633. [PMID: 36844823 PMCID: PMC9947660 DOI: 10.3389/fpubh.2023.1063633] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 10/07/2022] [Accepted: 01/18/2023] [Indexed: 02/11/2023]
Abstract
Introduction The diagnosis and treatment of ankylosing spondylitis (AS) is a difficult task, especially in less developed countries without access to experts. To address this issue, a comprehensive artificial intelligence (AI) tool was created to help diagnose and predict the course of AS. Methods In this retrospective study, a dataset of 5389 pelvic radiographs (PXRs) from patients treated at a single medical center between March 2014 and April 2022 was used to create an ensemble deep learning (DL) model for diagnosing AS. The model was then tested on an additional 583 images from three other medical centers, and its performance was evaluated using the area under the receiver operating characteristic curve, accuracy, precision, recall, and F1 scores. Furthermore, clinical prediction models for identifying high-risk patients and triaging patients were developed and validated using clinical data from 356 patients. Results The ensemble DL model demonstrated impressive performance in a multicenter external test set, with precision, recall, and area under the receiver operating characteristic curve values of 0.90, 0.89, and 0.96, respectively. This performance surpassed that of human experts, and the model also significantly improved the experts' diagnostic accuracy. Furthermore, the model's diagnosis results based on smartphone-captured images were comparable to those of human experts. Additionally, a clinical prediction model was established that accurately categorizes patients with AS into high- and low-risk groups with distinct clinical trajectories. This provides a strong foundation for individualized care. Discussion In this study, an exceptionally comprehensive AI tool was developed for the diagnosis and management of AS in complex clinical scenarios, especially in underdeveloped or rural areas that lack access to experts. This tool is highly beneficial in providing an efficient and effective system of diagnosis and management.
Affiliation(s)
- Hao Li
- The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China
| | - Xiang Tao
- The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China
| | - Tuo Liang
- The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China
| | - Jie Jiang
- The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China
| | - Jichong Zhu
- The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China
| | - Shaofeng Wu
- The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China
| | - Liyi Chen
- The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China
| | - Zide Zhang
- The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China
| | - Chenxing Zhou
- The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China
| | - Xuhua Sun
- The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China
| | - Shengsheng Huang
- The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China
| | - Jiarui Chen
- The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China
| | - Tianyou Chen
- The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China
| | - Zhen Ye
- The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China
| | - Wuhua Chen
- The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China
| | - Hao Guo
- The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China
| | - Yuanlin Yao
- The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China
| | - Shian Liao
- The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China
| | - Chaojie Yu
- The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China
| | - Binguang Fan
- The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China
| | - Yihong Liu
- Guangxi Medical University, Nanning, Guangxi, China
| | - Chunai Lu
- Guangxi Medical University, Nanning, Guangxi, China
| | - Junnan Hu
- Guangxi Medical University, Nanning, Guangxi, China
| | - Qinghong Xie
- Guangxi Medical University, Nanning, Guangxi, China
| | - Xiao Wei
- Guangxi Medical University, Nanning, Guangxi, China
| | - Cairen Fang
- Guangxi Medical University, Nanning, Guangxi, China
| | - Huijiang Liu
- Orthopaedics of The First People's Hospital of Nanning, Nanning, Guangxi, China
| | - Chengqian Huang
- Orthopaedics of People's Hospital of Baise, Baise, Guangxi, China
| | - Shixin Pan
- Orthopaedics of Wuzhou Red Cross Hospital, Wuzhou, Guangxi, China
| | - Xinli Zhan
- The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China
| | - Chong Liu
- The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China (*Correspondence: Chong Liu)
| |
|
11
|
Petersson L, Vincent K, Svedberg P, Nygren JM, Larsson I. Ethical considerations in implementing AI for mortality prediction in the emergency department: Linking theory and practice. Digit Health 2023; 9:20552076231206588. [PMID: 37829612 PMCID: PMC10566278 DOI: 10.1177/20552076231206588] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Accepted: 09/21/2023] [Indexed: 10/14/2023] Open
Abstract
Background Artificial intelligence (AI) is predicted to be a solution for improving healthcare, increasing efficiency, and saving time and resources. A lack of ethical principles for the use of AI in practice has been highlighted by several stakeholders due to the recent attention given to it. Research has shown an urgent need for more knowledge regarding the ethical implications of AI applications in healthcare. However, fundamental ethical principles may not be sufficient to describe ethical concerns associated with implementing AI applications. Objective The aim of this study is twofold: (1) to use the implementation of AI applications to predict patient mortality in emergency departments as a setting to explore healthcare professionals' perspectives on ethical issues in relation to ethical principles and (2) to develop a model to guide ethical considerations in AI implementation in healthcare based on ethical theory. Methods Semi-structured interviews were conducted with 18 participants. The abductive approach used to analyze the empirical data consisted of four steps alternating between inductive and deductive analyses. Results Our findings provide an ethical model demonstrating the need to address six ethical principles (autonomy, beneficence, non-maleficence, justice, explicability, and professional governance) in relation to ethical theories defined as virtue, deontology, and consequentialism when AI applications are to be implemented in clinical practice. Conclusions Ethical aspects of AI applications are broader than the prima facie principles of medical ethics and the principle of explicability. Ethical aspects thus need to be viewed from a broader perspective to cover different situations that healthcare professionals, in general, and physicians, in particular, may face when using AI applications in clinical practice.
Affiliation(s)
- Lena Petersson
- School of Health and Welfare, Halmstad University, Halmstad, Sweden
| | - Kalista Vincent
- School of Health and Welfare, Halmstad University, Halmstad, Sweden
| | - Petra Svedberg
- School of Health and Welfare, Halmstad University, Halmstad, Sweden
| | - Jens M Nygren
- School of Health and Welfare, Halmstad University, Halmstad, Sweden
| | - Ingrid Larsson
- School of Health and Welfare, Halmstad University, Halmstad, Sweden
| |
|
12
|
Establishment of ICU Mortality Risk Prediction Models with Machine Learning Algorithm Using MIMIC-IV Database. Diagnostics (Basel) 2022; 12:diagnostics12051068. [PMID: 35626224 PMCID: PMC9139972 DOI: 10.3390/diagnostics12051068] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/18/2022] [Revised: 04/21/2022] [Accepted: 04/22/2022] [Indexed: 12/10/2022] Open
Abstract
Objective: The mortality rate of critically ill patients in ICUs is relatively high. In order to evaluate patients’ mortality risk, different scoring systems are used to help clinicians assess prognosis in ICUs, such as the Acute Physiology and Chronic Health Evaluation III (APACHE III) and the Logistic Organ Dysfunction Score (LODS). In this research, we aimed to establish and compare multiple machine learning models with physiology subscores of APACHE III—namely, the Acute Physiology Score III (APS III)—and LODS scoring systems in order to obtain better performance for ICU mortality prediction. Methods: A total of 67,748 patients from the Medical Information Mart for Intensive Care (MIMIC-IV) were enrolled, including 7055 deceased patients, and the same number of surviving patients were selected by the random downsampling technique, for a total of 14,110 patients included in the study. The enrolled patients were randomly divided into a training dataset (n = 9877) and a validation dataset (n = 4233). Fivefold cross-validation and grid search procedures were used to find and evaluate the best hyperparameters in different machine learning models. Taking the subscores of LODS and the physiology subscores that are part of the APACHE III scoring systems as input variables, four machine learning methods of XGBoost, logistic regression, support vector machine, and decision tree were used to establish ICU mortality prediction models, with AUCs as metrics. AUCs, specificity, sensitivity, positive predictive value, negative predictive value, and calibration curves were used to find the best model. Results: For the prediction of mortality risk in ICU patients, the AUC of the XGBoost model was 0.918 (95%CI, 0.915–0.922), and the AUCs of logistic regression, SVM, and decision tree were 0.872 (95%CI, 0.867–0.877), 0.872 (95%CI, 0.867–0.877), and 0.852 (95%CI, 0.847–0.857), respectively. The calibration curves of logistic regression and support vector machine performed better than the other two models in the ranges 0–40% and 70%–100%, respectively, while XGBoost performed better in the range of 40–70%. Conclusions: The mortality risk of ICU patients can be better predicted by the characteristics of the Acute Physiology Score III and the Logistic Organ Dysfunction Score with XGBoost in terms of ROC curve, sensitivity, and specificity. The XGBoost model could assist clinicians in judging in-hospital outcome of critically ill patients, especially in patients with a more uncertain survival outcome.
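The cohort construction described in the Methods (random downsampling of survivors to match the 7055 deceased patients, followed by a random training/validation split) can be illustrated with a minimal sketch. This is not the authors' code: the function name `downsample_and_split`, the seed, and the synthetic label list are hypothetical, and the 70/30 fraction is inferred from the reported sizes (9877 and 4233 of 14,110). Model fitting, grid search, and cross-validation are not shown.

```python
import random


def downsample_and_split(labels, train_fraction=0.7, seed=0):
    """Balance a binary cohort by random downsampling, then split it.

    labels: list of 0/1 outcomes (1 = deceased), one entry per patient.
    Returns (train_indices, validation_indices) into `labels`.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    positives = [i for i, y in enumerate(labels) if y == 1]
    negatives = [i for i, y in enumerate(labels) if y == 0]

    # Keep every patient from the smaller class and draw an equally sized
    # random sample (without replacement) from the larger class.
    minority, majority = sorted((positives, negatives), key=len)
    balanced = minority + rng.sample(majority, len(minority))

    # Shuffle so that both classes are mixed before the split.
    rng.shuffle(balanced)
    n_train = round(len(balanced) * train_fraction)
    return balanced[:n_train], balanced[n_train:]


# Synthetic labels that reproduce only the cohort sizes quoted in the abstract.
labels = [1] * 7055 + [0] * (67748 - 7055)
train_idx, valid_idx = downsample_and_split(labels)
print(len(train_idx), len(valid_idx))  # 9877 4233
```

Because downsampling changes the outcome prevalence from roughly 10% to 50%, predicted probabilities from a model trained on such a cohort would generally need recalibration before being read as absolute risks in the original population.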
|