1
Chen M, Qian Q, Pan X, Li T. An investigation into the impact of temporality on COVID-19 infection and mortality predictions: new perspective based on Shapley Values. BMC Med Res Methodol 2025; 25:111. [PMID: 40275181 PMCID: PMC12020040 DOI: 10.1186/s12874-025-02572-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2024] [Accepted: 04/16/2025] [Indexed: 04/26/2025] Open
Abstract
INTRODUCTION: Machine learning models have been employed to predict COVID-19 infections and mortality, but many models were built on training and testing sets from different periods. The purpose of this study is to investigate the impact of temporality, i.e., the temporal gap between training and testing sets, on model performance for predicting COVID-19 infections and mortality. Furthermore, this study seeks to understand the causes of the impact of temporality.
METHODS: This study used a COVID-19 surveillance dataset collected in Brazil in 2020, 2021 and 2022, and built prediction models for COVID-19 infections and mortality using random forest and logistic regression, with 20 model features. Models were trained and tested on data from different years and from the same year, to examine the impact of temporality. To further explain the impact of temporality and its driving factors, Shapley values were employed to quantify individual feature contributions to model predictions.
RESULTS: For the infection model, we found that the temporal gap had a negative impact on prediction accuracy. On average, the loss in accuracy was 0.0256 for logistic regression and 0.0436 for random forest when there was a temporal gap between the training and testing sets. For the mortality model, the loss in accuracy was 0.0144 for logistic regression and 0.0098 for random forest, which means the impact of temporality was not as strong as in the infection model. Shapley values uncovered the reason behind such differences between the infection and mortality models.
CONCLUSIONS: Our study confirmed the negative impact of temporality on model performance for predicting COVID-19 infections, but it did not find such a negative impact for predicting COVID-19 mortality. Shapley values revealed that a fixed set of four features made the predominant contributions to the mortality model across all three years (2020-2022), while the infection model had no such fixed set of features across years.
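One way to make the "loss in accuracy" reported in this abstract concrete is sketched below. It assumes, since the abstract does not say, that the loss is the mean accuracy of same-year train/test pairs minus the mean accuracy of cross-year pairs. The function name and the accuracy values are hypothetical illustrations, not figures from the study.

```python
from statistics import mean


def temporal_accuracy_loss(acc):
    """Mean same-year accuracy minus mean cross-year accuracy.

    `acc` maps (train_year, test_year) -> accuracy. This averaging scheme
    is an assumption made for illustration, not the paper's stated method.
    """
    same = [a for (train, test), a in acc.items() if train == test]
    gap = [a for (train, test), a in acc.items() if train != test]
    if not same or not gap:
        raise ValueError("need at least one same-year and one cross-year pair")
    return mean(same) - mean(gap)


# Hypothetical accuracies, invented purely to exercise the function.
acc = {
    (2020, 2020): 0.80,
    (2021, 2021): 0.82,
    (2020, 2021): 0.76,
    (2021, 2020): 0.78,
}
loss = temporal_accuracy_loss(acc)
print(round(loss, 4))
```

A positive value means models did worse when the test data came from a different year than the training data.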
Affiliation(s)
- Mingming Chen
  - Academy of Pharmacy, Xi'an Jiaotong-Liverpool University, 111 Ren'ai Road, Suzhou, 215123, Jiangsu, P.R. China
  - Institute of Population Health, Faculty of Health & Life Sciences Waterhouse Building, University of Liverpool, Liverpool, England
- Qihang Qian
  - School of Computer Science and Technology, Zhejiang University of Technology, No. 18 Chaowang Road, Hangzhou, Zhejiang, 310014, P.R. China
- Xiang Pan
  - School of Computer Science and Technology, Zhejiang University of Technology, No. 18 Chaowang Road, Hangzhou, Zhejiang, 310014, P.R. China
- Tenglong Li
  - Academy of Pharmacy, Xi'an Jiaotong-Liverpool University, 111 Ren'ai Road, Suzhou, 215123, Jiangsu, P.R. China
  - Institute of Population Health, Faculty of Health & Life Sciences Waterhouse Building, University of Liverpool, Liverpool, England
2
Orhan F, Kurutkan MN. Predicting total healthcare demand using machine learning: separate and combined analysis of predisposing, enabling, and need factors. BMC Health Serv Res 2025; 25:366. [PMID: 40075408 PMCID: PMC11900254 DOI: 10.1186/s12913-025-12502-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2025] [Accepted: 02/28/2025] [Indexed: 03/14/2025] Open
Abstract
OBJECTIVE: Predicting healthcare demand is essential for effective resource allocation and planning. This study applies Andersen's Behavioral Model of Health Services Use, focusing on predisposing, enabling, and need factors, using data from the 2022 Turkey Health Survey by TUIK. Machine learning methods provide a powerful approach to analyze these factors and their combined impact on healthcare utilization, offering valuable insights for health policy.
METHODS: Seven machine learning models (Decision Tree, Random Forest, Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Logistic Regression, XGBoost, and Gradient Boosting) were utilized. Feature selection was conducted to identify the most significant factors influencing healthcare demand. The models were evaluated for accuracy and generalization ability using performance metrics such as recall, precision, F1 score, and ROC AUC.
RESULTS: The study identified key features affecting healthcare demand. For predisposing factors, gender, educational level, and age group were significant. Enabling factors included treatment costs, community interest, and payment difficulties. Need factors were influenced by smoking status, chronic diseases, and overall health status. The models demonstrated high recall (approximately 0.90) and strong F1 scores (ranging from 0.87 to 0.88), indicating a balanced performance between precision and recall. Among the models, Gradient Boosting, XGBoost, and Logistic Regression consistently outperformed others, achieving the highest predictive accuracy. Random Forest and SVM also performed well, showing robust classification capability.
CONCLUSIONS: The findings highlight the effectiveness of machine learning methods in predicting healthcare demand, providing valuable insights for health policy and resource allocation. Gradient Boosting, XGBoost, and Logistic Regression emerged as the most reliable models, demonstrating superior generalization and classification performance. Understanding the separate and combined effects of predisposing, enabling, and need factors on healthcare demand can contribute to more efficient and data-driven healthcare planning, facilitating strategic decision-making in resource allocation and service delivery.
Affiliation(s)
- Fatih Orhan
- University of Health Sciences, Gülhane Vocational School of Health, Ankara, Turkey.
3
Alzahrani SI, Yafooz WMS, Aljamaan IA, Alwaleedi A, Al-Hariri M, Saleh G. AI-driven health analysis for emerging respiratory diseases: A case study of Yemen patients using COVID-19 data. Mathematical Biosciences and Engineering: MBE 2025; 22:554-584. [PMID: 40083282 DOI: 10.3934/mbe.2025021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/16/2025]
Abstract
In low-income and resource-limited countries, distinguishing COVID-19 from other respiratory diseases is challenging due to similar symptoms and the prevalence of comorbidities. In Yemen, acute comorbidities further complicate the differentiation between COVID-19 and other infectious diseases. We explored the use of AI-powered predictive models and classifiers to enhance healthcare preparedness by forecasting respiratory disease trends using COVID-19 data. We developed mathematical models based on autoregressive (AR), moving average (MA), ARMA, and machine and deep learning algorithms to predict daily confirmed deaths. Statistical models were trained on 80% of the data and tested on the remaining 20%, with predicted results compared to actual values. The ARMA model demonstrated promising performance. Additionally, eight machine learning (ML) classifiers and deep learning (DL) models were utilized to identify COVID-19 severity indicators. Among the ML classifiers, the Decision Tree (DT) achieved the highest accuracy at 74.70%, followed closely by Random Forest (RF) at 74.66%. DL models showed comparable accuracy scores, around 70%. In terms of AUC-ROC, the kernel Support Vector Machine (SVM) outperformed others, achieving 71% accuracy, with precision, recall, F-measure, and area under the curve values of 0.7, 0.75, 0.59, and 0.72, respectively. These findings underscore the potential of AI-driven health analysis to optimize resource allocation and enhance forecasting for respiratory diseases.
Affiliation(s)
- Saleh I Alzahrani
  - Biomedical Engineering Department, College of Engineering, Imam Abdulrahman Bin Faisal University, PO box 1982, Dammam 31451, Saudi Arabia
- Wael M S Yafooz
  - Computer Science Department, Taibah University, Saudi Arabia
- Ibrahim A Aljamaan
  - Biomedical Engineering Department, College of Engineering, Imam Abdulrahman Bin Faisal University, PO box 1982, Dammam 31451, Saudi Arabia
- Ali Alwaleedi
  - Department of Epidemiology and Public Health, College of Medicine, Aden University, Aden, Yemen
- Mohammed Al-Hariri
  - Department of Physiology, College of Medicine, Imam Abdulrahman Bin Faisal University, PO box 1982, Dammam 31451, Saudi Arabia
- Gameel Saleh
  - Biomedical Engineering Department, College of Engineering, Imam Abdulrahman Bin Faisal University, PO box 1982, Dammam 31451, Saudi Arabia
4
Rahman A, Rahman MH. Explore the factors related to the death of offspring under age five and appraise the hazard of child mortality using machine learning techniques in Bangladesh. BMC Public Health 2025; 25:360. [PMID: 39881228 PMCID: PMC11776272 DOI: 10.1186/s12889-025-21460-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2023] [Accepted: 01/14/2025] [Indexed: 01/31/2025] Open
Abstract
BACKGROUND: Child mortality is a reliable and significant indicator of a nation's health. Although the child mortality rate in Bangladesh is declining over time, it still needs to drop further to meet the Sustainable Development Goals (SDGs). Machine learning models are among the best tools for making more accurate and efficient forecasts and gaining in-depth knowledge. A deeper understanding is crucial for significantly reducing child mortality rates. Accurate predictions using machine learning models can empower authorities to implement timely interventions and raise awareness. The study therefore aimed to explore the factors related to child mortality and assess the efficacy of various machine learning models in predicting child mortality in Bangladesh.
METHODS AND MATERIALS: About 42,000 observations, excluding those with missing values, were extracted from the Bangladesh Demographic and Health Survey (BDHS) conducted in 2017-18. The survey used a two-stage stratified sampling method, selecting 675 enumeration areas (250 in urban settings and 425 in rural areas), resulting in effective data collection from 672 clusters and 20,160 households. The chi-square test and recursive feature elimination (RFE) were used to identify the relevant risk factors of child mortality among the candidate factors. Six ML-based algorithms were implemented for predicting child mortality: Naïve Bayes, Classification and Regression Trees, Random Forest, C5.0 Classification, Gradient Boosting Machine, and Logistic Regression. Model performance was evaluated using accuracy, specificity, sensitivity, negative predictive value, positive predictive value, F1 score, area under the curve (AUC), and k-fold cross-validation.
RESULTS AND DISCUSSION: The child mortality rate is 8.2%, according to the data. The bivariate analysis showed that the child mortality rate was higher among children whose mothers were uneducated, impoverished, underweight, aged 35-49, or gave birth before age 20. Families' water sources and religious connections had no statistically significant impact on child mortality. The prediction of child mortality using machine learning models is the main objective of this study. None of the machine learning models correctly classified the death cases. Therefore, this study conducted over-sampling and under-sampling analyses. Approximately 76,727 and 6,910 observations were sampled for the over-sampling and under-sampling techniques, respectively. On the over-sampled data, the Random Forest outperformed all the other models in overall performance on the training and testing sets, with an accuracy of 70%. The k-fold cross-validation approach confirmed the Random Forest model's superior performance, and it achieved the highest AUC (0.701). On the other hand, the Gradient Boosting Machine performed best for predicting child mortality in the under-sampling analysis. The k-fold cross-validation also illustrated the better performance of the Gradient Boosting Machine.
CONCLUSION: The Gradient Boosting Machine and Random Forest produce the best predictive power for classifying child mortality and may help to improve policy decision-making in this regard.
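The over-sampling and under-sampling step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say which resampling method the authors used; plain random resampling to equal class sizes is assumed here, and the function name and toy data are hypothetical.

```python
import random
from collections import Counter


def resample(rows, labels, mode, seed=0):
    """Balance classes by random over- or under-sampling (illustrative only)."""
    if mode not in ("over", "under"):
        raise ValueError("mode must be 'over' or 'under'")
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    sizes = [len(group) for group in by_class.values()]
    # Over-sampling grows every class to the largest class size;
    # under-sampling shrinks every class to the smallest class size.
    target = max(sizes) if mode == "over" else min(sizes)
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        if mode == "over":
            # Keep every row, then draw the shortfall with replacement.
            picked = group + [rng.choice(group) for _ in range(target - len(group))]
        else:
            # Draw `target` rows without replacement.
            picked = rng.sample(group, target)
        out_rows.extend(picked)
        out_labels.extend([label] * len(picked))
    return out_rows, out_labels


# Hypothetical toy data: 9 surviving children (0) and 1 death (1).
rows = list(range(10))
labels = [0] * 9 + [1]
_, over_labels = resample(rows, labels, "over")
_, under_labels = resample(rows, labels, "under")
print(Counter(over_labels), Counter(under_labels))
```

Over-sampling keeps all the original rows but repeats minority cases, while under-sampling discards majority cases, which is why the two resampled datasets differ so much in size.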
Affiliation(s)
- Ashikur Rahman
  - Department of Statistics and Data Science, Jahangirnagar University, Dhaka, 1342, Bangladesh
- Md Habibur Rahman
  - Department of Statistics and Data Science, Jahangirnagar University, Dhaka, 1342, Bangladesh
5
Schniederova M, Bobcakova A, Grendar M, Markocsy A, Ceres A, Cibulka M, Dobrota D, Jesenak M. Lymphocyte Inhibition Mechanisms and Immune Checkpoints in COVID-19: Insights into Prognostic Markers and Disease Severity. Medicina (Kaunas, Lithuania) 2025; 61:189. [PMID: 40005306 PMCID: PMC11857393 DOI: 10.3390/medicina61020189] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2024] [Revised: 01/13/2025] [Accepted: 01/17/2025] [Indexed: 02/27/2025]
Abstract
Background and Objectives: Immune checkpoint inhibitors such as PD-1 and TIM-3 play an important role in regulating the host immune response and are proposed as potential prognostic markers and therapeutic targets in severe cases of COVID-19. We evaluated the expression of PD-1 and TIM-3 on T cells, as well as the concentration of sPD-1 in plasma, to clarify the role of these molecules in patients infected with SARS-CoV-2. Materials and Methods: In this retrospective observational study, we analysed the expression of PD-1 and TIM-3 on CD4+ and CD8+ T cells upon admission and after 7 days of hospitalisation in 770 adult patients. We also evaluated sPD-1 levels in the plasma of 145 patients at different stages of COVID-19 and of 11 control subjects. Molecules were determined using conventional flow cytometry and ELISA and the data were statistically processed. Results: We observed a significantly higher expression of PD-1 on CD4+ cells in deceased patients than in those with mild-to-moderate disease. All patients with COVID-19 exhibited a significantly higher expression of TIM-3 on both CD4+ and CD8+ T cells compared to controls. After 1 week of hospitalisation, there was no significant change in PD-1 or TIM-3 expression on CD4+ or CD8+ T cells across the studied groups. sPD-1 concentrations were not significantly different between survivors and non-survivors. Plasma sPD-1 levels did not correlate with PD-1 expression on T cells, but a significant correlation was observed between CD4+ PD-1 and CD8+ PD-1. Using machine-learning algorithms, we supported our observations and confirmed immunological variables capable of predicting survival, with AUC = 0.786. Conclusions: Analysis of the immune response may be useful for monitoring and predicting the course of COVID-19 upon admission. However, it is essential to evaluate complex immune parameters in conjunction with other key clinical and laboratory indicators.
Affiliation(s)
- Martina Schniederova
  - Institute of Clinical Immunology and Medical Genetics, Jessenius Faculty of Medicine in Martin, Comenius University in Bratislava, Martin University Hospital, 03659 Martin, Slovakia
- Anna Bobcakova
  - Institute of Clinical Immunology and Medical Genetics, Jessenius Faculty of Medicine in Martin, Comenius University in Bratislava, Martin University Hospital, 03659 Martin, Slovakia
  - Department of Pulmonology and Phthisiology, Jessenius Faculty of Medicine in Martin, Comenius University in Bratislava, Martin University Hospital, 03659 Martin, Slovakia
  - Department of Paediatrics and Adolescent Medicine, Jessenius Faculty of Medicine in Martin, Comenius University in Bratislava, Martin University Hospital, 03659 Martin, Slovakia
- Marian Grendar
  - Biomed—Centre for Biomedicine, Jessenius Faculty of Medicine in Martin, Comenius University in Bratislava, 03659 Martin, Slovakia
- Adam Markocsy
  - Institute of Clinical Immunology and Medical Genetics, Jessenius Faculty of Medicine in Martin, Comenius University in Bratislava, Martin University Hospital, 03659 Martin, Slovakia
  - Department of Pulmonology and Phthisiology, Jessenius Faculty of Medicine in Martin, Comenius University in Bratislava, Martin University Hospital, 03659 Martin, Slovakia
- Andrej Ceres
  - Institute of Clinical Immunology and Medical Genetics, Jessenius Faculty of Medicine in Martin, Comenius University in Bratislava, Martin University Hospital, 03659 Martin, Slovakia
- Michal Cibulka
  - Institute of Clinical Immunology and Medical Genetics, Jessenius Faculty of Medicine in Martin, Comenius University in Bratislava, Martin University Hospital, 03659 Martin, Slovakia
  - Department of Medical Biochemistry, Jessenius Faculty of Medicine in Martin, Comenius University in Bratislava, 03659 Martin, Slovakia
- Dusan Dobrota
  - Department of Medical Biochemistry, Jessenius Faculty of Medicine in Martin, Comenius University in Bratislava, 03659 Martin, Slovakia
  - Department of Clinical Biochemistry, Jessenius Faculty of Medicine in Martin, Comenius University in Bratislava, Martin University Hospital, 03659 Martin, Slovakia
- Milos Jesenak
  - Institute of Clinical Immunology and Medical Genetics, Jessenius Faculty of Medicine in Martin, Comenius University in Bratislava, Martin University Hospital, 03659 Martin, Slovakia
  - Department of Pulmonology and Phthisiology, Jessenius Faculty of Medicine in Martin, Comenius University in Bratislava, Martin University Hospital, 03659 Martin, Slovakia
  - Department of Paediatrics and Adolescent Medicine, Jessenius Faculty of Medicine in Martin, Comenius University in Bratislava, Martin University Hospital, 03659 Martin, Slovakia
6
Daramola O, Kavu TD, Kotze MJ, Marnewick JL, Sarumi OA, Kabaso B, Moser T, Stroetmann K, Fwemba I, Daramola F, Nyirenda M, van Rensburg SJ, Nyasulu PS. Predictive modelling and identification of critical variables of mortality risk in COVID-19 patients. Sci Rep 2025; 15:2184. [PMID: 39820088 PMCID: PMC11739597 DOI: 10.1038/s41598-023-46712-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2023] [Accepted: 11/03/2023] [Indexed: 01/19/2025] Open
Abstract
South Africa was the most affected country in Africa by the coronavirus disease 2019 (COVID-19) pandemic, where over 4 million confirmed cases of COVID-19 and over 102,000 deaths have been recorded since 2019. Aside from clinical methods, artificial intelligence (AI)-based solutions such as machine learning (ML) models have been employed in treating COVID-19 cases. However, limited application of AI for COVID-19 in Africa has been reported in the literature. This study aimed to investigate the performance and interpretability of several ML algorithms, including deep multilayer perceptron (Deep MLP), support vector machine (SVM) and Extreme gradient boosting trees (XGBoost) for predicting COVID-19 mortality risk with an emphasis on the effect of cross-validation (CV) and principal component analysis (PCA) on the results. For this purpose, a dataset with 154 features from 490 COVID-19 patients admitted into the intensive care unit (ICU) of Tygerberg Hospital in Cape Town, South Africa, during the first wave of COVID-19 in 2020 was retrospectively analysed. Our results show that Deep MLP had the best overall performance (F1 = 0.92; area under the curve (AUC) = 0.94) when CV and the synthetic minority oversampling technique (SMOTE) were applied without PCA. By using the Shapley Additive exPlanations (SHAP) model to interpret the mortality risk predictions, we identified the Length of stay (LOS) in the hospital, LOS in the ICU, Time to ICU from admission, days discharged alive or death, D-dimer (blood clotting factor), and blood pH as the six most critical variables for mortality risk prediction. Also, Age at admission, Pf ratio (PaO2/FiO2 ratio), troponin T (TropT), ferritin, ventilation, C-reactive protein (CRP), and symptoms of acute respiratory distress syndrome (ARDS) were associated with the severity and fatality of COVID-19 cases. 
The study reveals how ML could assist medical practitioners in making informed decisions on handling critically ill COVID-19 patients with comorbidities. It also offers insight into the combined effect of CV, PCA, and SMOTE on the performance of ML models for COVID-19 mortality risk prediction, which has been little explored.
Affiliation(s)
- Olawande Daramola
  - Department of Information Technology, Cape Peninsula University of Technology, Cape Town, South Africa
- Tatenda Duncan Kavu
  - Department of Information Technology, Cape Peninsula University of Technology, Cape Town, South Africa
- Maritha J Kotze
  - Division of Chemical Pathology, Department of Pathology, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
- Jeanine L Marnewick
  - Applied Microbial and Health Biotechnology Institute, Cape Peninsula University of Technology, Cape Town, South Africa
- Oluwafemi A Sarumi
  - Department of Mathematics and Computer Science, Philipps University of Marburg, Hans-Meerwein Str. 6 D-35032, Marburg, Germany
- Boniface Kabaso
  - Department of Information Technology, Cape Peninsula University of Technology, Cape Town, South Africa
- Thomas Moser
  - St Pölten University of Applied Sciences, St Pölten, Austria
- Karl Stroetmann
  - School of Health Information Science, University of Victoria, Victoria, BC, Canada
- Isaac Fwemba
  - Division of Epidemiology and Biostatistics, Faculty of Medicine, and Health Sciences, Stellenbosch University, Cape Town, South Africa
- Fisayo Daramola
  - Division of Epidemiology and Biostatistics, Faculty of Medicine, and Health Sciences, Stellenbosch University, Cape Town, South Africa
- Martha Nyirenda
  - Division of Epidemiology and Biostatistics, Faculty of Medicine, and Health Sciences, Stellenbosch University, Cape Town, South Africa
- Susan J van Rensburg
  - Division of Chemical Pathology, Department of Pathology, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
- Peter S Nyasulu
  - Division of Epidemiology and Biostatistics, Faculty of Medicine, and Health Sciences, Stellenbosch University, Cape Town, South Africa
7
Tsanakas AT, Mueller YM, van de Werken HJG, Pujol Borrell R, Ouzounis CA, Katsikis PD. An explainable machine learning model for COVID-19 severity prognosis at hospital admission. Informatics in Medicine Unlocked 2025; 52:101602. [DOI: 10.1016/j.imu.2024.101602] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/02/2025] Open
8
Barreto TDO, Farias FLDO, Veras NVR, Cardoso PH, Silva GJPC, Pinheiro CDO, Medina MVB, Fernandes FRDS, Barbalho IMP, Cortez LR, dos Santos JPQ, de Morais AHF, de Souza GF, Machado GM, Lucena MJNR, Valentim RADM. Artificial intelligence applied to bed regulation in Rio Grande do Norte: Data analysis and application of machine learning on the "RegulaRN Leitos Gerais" platform. PLoS One 2024; 19:e0315379. [PMID: 39775276 PMCID: PMC11684685 DOI: 10.1371/journal.pone.0315379] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2024] [Accepted: 11/25/2024] [Indexed: 01/11/2025] Open
Abstract
Bed regulation within Brazil's National Health System (SUS) plays a crucial role in managing care for patients in need of hospitalization. In Rio Grande do Norte, Brazil, the RegulaRN Leitos Gerais platform was the information system developed to register requests for bed regulation for COVID-19 cases. However, the platform was expanded to cover a range of diseases that require hospitalization. This study explored different machine learning models in the RegulaRN database, from October 2021 to January 2024, totaling 47,056 regulations. From the data obtained, 12 features were selected from the 24 available. After that, blank and inconclusive data were removed, as well as the outcomes that had values other than discharge and death, rendering a binary classification. Data was also correlated, balanced, and divided into training and test portions for application in machine learning models. The results showed better accuracy (87.77%) and recall (87.77%) for the XGBoost model, and higher precision (87.85%) and F1-Score (87.56%) for the Random Forest and Gradient Boosting models, respectively. As for Specificity (82.94%) and ROC-AUC (82.13%), the Multilayer Perceptron with SGD optimizer obtained the highest scores. The results evidenced which models could adequately assist medical regulators during the decision-making process for bed regulation, enabling even more effective regulation and, consequently, greater availability of beds and a decrease in waiting time for patients.
Affiliation(s)
- Tiago de Oliveira Barreto
  - Laboratory of Technological Innovation in Health (LAIS), Federal University of Rio Grande do Norte (UFRN), Natal, Rio Grande do Norte, Brazil
- Fernando Lucas de Oliveira Farias
  - Laboratory of Technological Innovation in Health (LAIS), Federal University of Rio Grande do Norte (UFRN), Natal, Rio Grande do Norte, Brazil
- Nicolas Vinícius Rodrigues Veras
  - Laboratory of Technological Innovation in Health (LAIS), Federal University of Rio Grande do Norte (UFRN), Natal, Rio Grande do Norte, Brazil
  - Advanced Nucleus of Technological Innovation (NAVI), Federal Institute of Rio Grande do Norte (IFRN), Natal, Rio Grande do Norte, Brazil
- Pablo Holanda Cardoso
  - Laboratory of Technological Innovation in Health (LAIS), Federal University of Rio Grande do Norte (UFRN), Natal, Rio Grande do Norte, Brazil
  - Advanced Nucleus of Technological Innovation (NAVI), Federal Institute of Rio Grande do Norte (IFRN), Natal, Rio Grande do Norte, Brazil
- Felipe Ricardo dos Santos Fernandes
  - Laboratory of Technological Innovation in Health (LAIS), Federal University of Rio Grande do Norte (UFRN), Natal, Rio Grande do Norte, Brazil
- Ingridy Marina Pierre Barbalho
  - Laboratory of Technological Innovation in Health (LAIS), Federal University of Rio Grande do Norte (UFRN), Natal, Rio Grande do Norte, Brazil
- Lyane Ramalho Cortez
  - Laboratory of Technological Innovation in Health (LAIS), Federal University of Rio Grande do Norte (UFRN), Natal, Rio Grande do Norte, Brazil
  - Secretary of Public Health of Rio Grande do Norte, Natal, Rio Grande do Norte, Brazil
- João Paulo Queiroz dos Santos
  - Laboratory of Technological Innovation in Health (LAIS), Federal University of Rio Grande do Norte (UFRN), Natal, Rio Grande do Norte, Brazil
  - Advanced Nucleus of Technological Innovation (NAVI), Federal Institute of Rio Grande do Norte (IFRN), Natal, Rio Grande do Norte, Brazil
- Antonio Higor Freire de Morais
  - Laboratory of Technological Innovation in Health (LAIS), Federal University of Rio Grande do Norte (UFRN), Natal, Rio Grande do Norte, Brazil
  - Advanced Nucleus of Technological Innovation (NAVI), Federal Institute of Rio Grande do Norte (IFRN), Natal, Rio Grande do Norte, Brazil
- Gustavo Fontoura de Souza
  - Laboratory of Technological Innovation in Health (LAIS), Federal University of Rio Grande do Norte (UFRN), Natal, Rio Grande do Norte, Brazil
  - Advanced Nucleus of Technological Innovation (NAVI), Federal Institute of Rio Grande do Norte (IFRN), Natal, Rio Grande do Norte, Brazil
9
Al Meslamani AZ, Sobrino I, de la Fuente J. Machine learning in infectious diseases: potential applications and limitations. Ann Med 2024; 56:2362869. [PMID: 38853633 PMCID: PMC11168216 DOI: 10.1080/07853890.2024.2362869] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/29/2024] [Accepted: 05/02/2024] [Indexed: 06/11/2024] Open
Abstract
Infectious diseases are a major threat to human and animal health worldwide. Combined Artificial Intelligence (AI) algorithms, including Machine Learning (ML) and Big Data analytics, have emerged as a potential solution for analysing diverse datasets and facing the challenges posed by infectious diseases. In this commentary we explore the potential applications and limitations of ML in the management of infectious diseases. We examine challenges in key areas such as outbreak prediction, pathogen identification, drug discovery, and personalized medicine. We propose potential solutions to mitigate these hurdles and applications of ML to identify biomolecules for effective treatment and prevention of infectious diseases. In addition to the use of ML for management of infectious diseases, potential applications are based on catastrophic evolution events for the identification of biomolecular targets to reduce risks for infectious diseases, and on vaccinomics for the discovery and characterization of vaccine protective antigens using intelligent Big Data analytics techniques. These considerations set a foundation for developing effective strategies for managing infectious diseases in the future.
Affiliation(s)
- Ahmad Z. Al Meslamani
  - College of Pharmacy, Al Ain University, Abu Dhabi, United Arab Emirates
  - AAU Health and Biomedical Research Center, Al Ain University, Abu Dhabi, United Arab Emirates
- Isidro Sobrino
  - SaBio, Instituto de Investigación en Recursos Cinegéticos (IREC), Consejo Superior de Investigaciones Científicas (CSIC), Universidad de Castilla-La Mancha (UCLM)-Junta de Comunidades de Castilla-La Mancha (JCCM), Ciudad Real, Spain
- José de la Fuente
  - SaBio, Instituto de Investigación en Recursos Cinegéticos (IREC), Consejo Superior de Investigaciones Científicas (CSIC), Universidad de Castilla-La Mancha (UCLM)-Junta de Comunidades de Castilla-La Mancha (JCCM), Ciudad Real, Spain
  - Department of Veterinary Pathobiology, Center for Veterinary Health Sciences, OK State University, Stillwater, Oklahoma, USA
10
Cao S, Yang S, Chen B, Chen X, Fu X, Tang S. Establishing a differential diagnosis model between primary membranous nephropathy and non-primary membranous nephropathy by machine learning algorithms. Ren Fail 2024; 46:2380752. [PMID: 39039848 PMCID: PMC11268222 DOI: 10.1080/0886022x.2024.2380752] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2024] [Accepted: 07/11/2024] [Indexed: 07/24/2024] Open
Abstract
CONTEXT Four deep learning classification algorithms with relatively balanced complexity and accuracy were selected for the differential diagnosis of primary membranous nephropathy (PMN). OBJECTIVE This study sought to identify the most suitable classification algorithm for PMN identification and to provide reference data for PMN diagnosis research. METHODS A total of 500 patients referred to Luo-he Central Hospital from 2019 to 2021 were included. All patients had primary glomerular disease confirmed by renal biopsy, comprising 322 cases of PMN and 178 cases of non-PMN. Decision tree, random forest, support vector machine, and extreme gradient boosting (XGBoost) algorithms were used to establish a differential diagnosis model for PMN and non-PMN. The best-performing model was chosen based on the true positive rate, true negative rate, false-positive rate, false-negative rate, accuracy, and area under the receiver operating characteristic curve (AUC). RESULTS Based on these evaluation indicators, the XGBoost model performed best, diagnosing PMN with a sensitivity of 92% and a specificity of 96%. CONCLUSIONS A differential diagnosis model for PMN was established successfully, and the XGBoost model performed best. It could be used for the clinical diagnosis of PMN.
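The evaluation indicators named in this abstract all derive from the four cells of a binary confusion matrix. The following is a minimal sketch of that arithmetic, not the authors' code: the `ConfusionCounts` and `diagnostic_metrics` names are hypothetical, and any counts passed in are illustrative only, since the abstract reports rates rather than raw cell counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a binary confusion matrix (hypothetical container)."""
    tp: int  # positives correctly flagged
    fp: int  # negatives wrongly flagged
    tn: int  # negatives correctly cleared
    fn: int  # positives missed


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN instead of raising when a class is absent from the data.
    return numerator / denominator if denominator else float("nan")


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Compute the rates listed in the abstract from raw counts."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "tpr": _ratio(c.tp, c.tp + c.fn),  # sensitivity
        "tnr": _ratio(c.tn, c.tn + c.fp),  # specificity
        "fpr": _ratio(c.fp, c.fp + c.tn),  # 1 - specificity
        "fnr": _ratio(c.fn, c.fn + c.tp),  # 1 - sensitivity
        "accuracy": _ratio(c.tp + c.tn, total),
    }
```

Because the true positive rate and false negative rate share a denominator, as do the true negative rate and false positive rate, each pair sums to one, so reporting sensitivity and specificity fixes the other two rates.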
Affiliation(s)
- Shangmei Cao
  - Department of Science and Technology Innovation Center, Luohe Central Hospital, The First Affiliated Hospital of Luohe Medical College, Henan Key Laboratory of Fertility Protection and Aristogenesis, Luohe, China
- Shaozhe Yang
  - Department of Science and Technology Innovation Center, Luohe Central Hospital, The First Affiliated Hospital of Luohe Medical College, Henan Key Laboratory of Fertility Protection and Aristogenesis, Luohe, China
- Bolin Chen
  - Department of Science and Technology Innovation Center, Luohe Central Hospital, The First Affiliated Hospital of Luohe Medical College, Henan Key Laboratory of Fertility Protection and Aristogenesis, Luohe, China
- Xixia Chen
  - Division of Nephrology, First Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou, Guangdong Province, China
- Xiuhong Fu
  - Department of Science and Technology Innovation Center, Luohe Central Hospital, The First Affiliated Hospital of Luohe Medical College, Henan Key Laboratory of Fertility Protection and Aristogenesis, Luohe, China
- Shuifu Tang
  - Division of Nephrology, First Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou, Guangdong Province, China
11
Luo L, Gao P, Yang C, Yu S. Predictive modeling of COVID-19 mortality risk in chronic kidney disease patients using multiple machine learning algorithms. Sci Rep 2024; 14:26979. [PMID: 39506019 PMCID: PMC11541900 DOI: 10.1038/s41598-024-78498-w] [Received: 04/29/2024] [Accepted: 10/31/2024] [Indexed: 11/08/2024] Open
Abstract
Coronavirus disease 2019 (COVID-19) has had a significant impact on the global population, particularly on individuals with chronic kidney disease (CKD). COVID-19 patients with CKD face a considerably higher risk of mortality than the general population. This study developed a predictive model for assessing mortality in COVID-19-affected CKD patients, providing personalized risk prediction to optimize clinical management and reduce mortality rates. We developed machine learning algorithms to retrospectively analyze the clinical laboratory test data of 219 patients. The performance of each model was assessed using a calibration curve, decision curve analysis, and receiver operating characteristic (ROC) curve. The LightGBM model showed the most satisfactory performance, with an area under the ROC curve of 0.833, sensitivity of 0.952, and specificity of 0.714. Prealbumin, neutrophil percent, respiratory index in arterial blood, half-saturated pressure of oxygen, carbon dioxide in serum, glucose, neutrophil count, and uric acid were the top 8 significant variables in the prediction model. Validation in 46 patients demonstrated acceptable accuracy. This model can serve as a powerful tool for screening CKD patients at high risk of COVID-19-related mortality and for providing decision support to clinical staff, enabling efficient allocation of resources and facilitating timely and targeted management for those who urgently need intervention.
Affiliation(s)
- Lin Luo
  - Department of Clinical Laboratory, Second Affiliated Hospital of Dalian Medical University, No.467, Zhongshan Road, Shahekou District, Dalian, 116027, Liaoning, China
- Peng Gao
  - Department of Clinical Laboratory, Second Affiliated Hospital of Dalian Medical University, No.467, Zhongshan Road, Shahekou District, Dalian, 116027, Liaoning, China
- Chunhui Yang
  - Department of Clinical Laboratory, Second Affiliated Hospital of Dalian Medical University, No.467, Zhongshan Road, Shahekou District, Dalian, 116027, Liaoning, China
- Sha Yu
  - Department of Clinical Laboratory, Second Affiliated Hospital of Dalian Medical University, No.467, Zhongshan Road, Shahekou District, Dalian, 116027, Liaoning, China
12
Zhu L, Yao Y. Prediction of the risk of mortality in older patients with coronavirus disease 2019 using blood markers and machine learning. Front Immunol 2024; 15:1445618. [PMID: 39555074 PMCID: PMC11563789 DOI: 10.3389/fimmu.2024.1445618] [Received: 06/07/2024] [Accepted: 10/11/2024] [Indexed: 11/19/2024] Open
Abstract
Introduction The mortality rate among older people infected with severe acute respiratory syndrome coronavirus 2 is alarmingly high. This study aimed to explore the predictive value of a novel model for assessing the risk of death in this vulnerable cohort. Methods We enrolled 199 older patients with coronavirus disease 2019 (COVID-19) from Zhejiang Provincial Hospital of Chinese Medicine (Hubin) between 16 December 2022 and 17 January 2023. Additionally, 90 patients from two other centers (Qiantang and Xixi) formed an external independent testing cohort. Univariate and multivariate analyses were used to identify the risk factors for mortality. Least absolute shrinkage and selection operator (LASSO) regression analysis was used to select variables associated with COVID-19 mortality. Nine machine-learning algorithms were used to predict mortality risk in older patients, and their performance was assessed using receiver operating characteristic curves, area under the curve (AUC), calibration curve analysis, and decision curve analysis. Results LASSO regression identified neutrophil-monocyte ratio, neutrophil-lymphocyte ratio, C-reactive protein, interleukin 6, and D-dimer as relevant factors associated with the risk of COVID-19-related death. The Gaussian naive Bayes model was the best-performing model. In the validation cohort, the model had an AUC of 0.901, whereas in the testing cohort, the model had an AUC of 0.952. The calibration curve showed a good correlation between the actual and predicted probabilities, and the decision curve indicated a strong clinical benefit. Furthermore, the model had an AUC of 0.873 in an external independent testing cohort. Discussion In this study, a predictive machine-learning model was developed with an online prediction tool designed to assist clinicians in evaluating mortality risk factors and devising targeted and effective treatments for older patients with COVID-19, potentially reducing mortality rates.
Affiliation(s)
- Yimin Yao
  - Medical Laboratory, The First Affiliated Hospital of Zhejiang Chinese Medical University (Zhejiang Provincial Hospital of Chinese Medicine), Hangzhou, China
13
Alie MS, Negesse Y, Kindie K, Merawi DS. Machine learning algorithms for predicting COVID-19 mortality in Ethiopia. BMC Public Health 2024; 24:1728. [PMID: 38943093 PMCID: PMC11212371 DOI: 10.1186/s12889-024-19196-0] [Received: 10/23/2023] [Accepted: 06/19/2024] [Indexed: 07/01/2024] Open
Abstract
BACKGROUND Coronavirus disease 2019 (COVID-19), a global public health crisis, continues to pose challenges despite preventive measures. The daily rise in COVID-19 cases is concerning, and the testing process is both time-consuming and costly. While several models have been created to predict mortality in COVID-19 patients, only a few have shown sufficient accuracy. Machine learning algorithms offer a promising approach to data-driven prediction of clinical outcomes, surpassing traditional statistical modeling. Leveraging machine learning (ML) algorithms could potentially provide a solution for predicting mortality in hospitalized COVID-19 patients in Ethiopia. Therefore, the aim of this study is to develop and validate machine-learning models for accurately predicting mortality in COVID-19 hospitalized patients in Ethiopia. METHODS Our study involved analyzing electronic medical records of COVID-19 patients who were admitted to public hospitals in Ethiopia. Specifically, we developed seven different machine learning models to predict COVID-19 patient mortality. These models included J48 decision tree, random forest (RF), k-nearest neighbor (k-NN), multi-layer perceptron (MLP), Naïve Bayes (NB), eXtreme gradient boosting (XGBoost), and logistic regression (LR). We then compared the performance of these models using data from a cohort of 696 patients through statistical analysis. To evaluate the effectiveness of the models, we utilized metrics derived from the confusion matrix such as sensitivity, specificity, precision, and receiver operating characteristic (ROC). RESULTS The study included a total of 696 patients, with a higher number of females (440 patients, accounting for 63.2%) compared to males. The median age of the participants was 35.0 years old, with an interquartile range of 18-79.
After different feature selection procedures, 23 features were examined and identified as predictors of mortality; gender, intensive care unit (ICU) admission, and alcohol drinking/addiction were the top three predictors of COVID-19 mortality. On the other hand, loss of smell, loss of taste, and hypertension were identified as the three lowest predictors of COVID-19 mortality. The experimental results revealed that the k-nearest neighbor (k-NN) algorithm outperformed the other machine learning algorithms, achieving an accuracy of 95.25%, sensitivity of 95.30%, precision of 92.7%, specificity of 93.30%, F1 score of 93.98%, and a receiver operating characteristic (ROC) score of 96.90%. These findings highlight the effectiveness of the k-NN algorithm in predicting COVID-19 outcomes based on the selected features. CONCLUSION Our study has developed an innovative model that utilizes hospital data to accurately predict the mortality risk of COVID-19 patients. The main objective of this model is to prioritize early treatment for high-risk patients and optimize strained healthcare systems during the ongoing pandemic. By integrating machine learning with comprehensive hospital databases, our model effectively classifies patients' mortality risk, enabling targeted medical interventions and improved resource management. Among the various methods tested, the k-nearest neighbor (k-NN) algorithm demonstrated the highest accuracy, allowing for early identification of high-risk patients. Through k-NN feature identification, we identified 23 predictors that significantly contribute to predicting COVID-19 mortality. The top five predictors are gender (female), intensive care unit (ICU) admission, alcohol drinking, smoking, and symptoms of headache and chills. This advancement holds great promise in enhancing healthcare outcomes and decision-making during the pandemic.
By providing services and prioritizing patients based on the identified predictors, healthcare facilities and providers can improve the chances of survival for individuals. This model provides valuable insights that can guide healthcare professionals in allocating resources and delivering appropriate care to those at highest risk.
Affiliation(s)
- Melsew Setegn Alie
  - Department of Public Health, School of Public Health, College of Medicine and Health Science, Mizan-Tepi University, Mizan-Aman, Ethiopia
- Yilkal Negesse
  - Department of Public Health, College of Medicine and Health Science, Debre Markos University, Gojjam, Ethiopia
- Kassa Kindie
  - Department of Nursing, College of Medicine and Health Science, Mizan-Tepi University, Mizan-Aman, Ethiopia
- Dereje Senay Merawi
  - Department of Information Technology, Faculty of Technology, Debre Tabor University, Gonder, Ethiopia
14
Premeaux TA, Bowler S, Friday CM, Moser CB, Hoenigl M, Lederman MM, Landay AL, Gianella S, Ndhlovu LC. Machine learning models based on fluid immunoproteins that predict non-AIDS adverse events in people with HIV. iScience 2024; 27:109945. [PMID: 38812553 PMCID: PMC11134891 DOI: 10.1016/j.isci.2024.109945] [Received: 10/30/2023] [Revised: 03/12/2024] [Accepted: 05/06/2024] [Indexed: 05/31/2024] Open
Abstract
Despite the success of antiretroviral therapy (ART), individuals with HIV remain at risk for experiencing non-AIDS adverse events (NAEs), including cardiovascular complications and malignancy. Several surrogate immune biomarkers in blood have shown value in predicting NAEs; however, composite panels generated using machine learning may provide more accurate monitoring and discrimination of NAEs. In a nested case-control study, we aimed to develop machine learning models to discriminate cases (experienced an event) and matched controls using demographic and clinical characteristics alongside 49 plasma immunoproteins measured prior to and post-ART initiation. We generated support vector machine (SVM) classifier models for high-accuracy discrimination of individuals aged 30-50 years who experienced non-fatal NAEs at pre-ART and one-year post-ART. Extreme gradient boosting generated a high-accuracy model at pre-ART, while K-nearest neighbors performed poorly all around. SVM modeling may offer guidance to improve disease monitoring and elucidate potential therapeutic interventions.
Affiliation(s)
- Thomas A. Premeaux
  - Division of Infectious Diseases, Department of Medicine, Weill Cornell Medicine, New York, NY, USA
- Scott Bowler
  - Division of Infectious Diseases, Department of Medicine, Weill Cornell Medicine, New York, NY, USA
- Courtney M. Friday
  - Division of Infectious Diseases, Department of Medicine, Weill Cornell Medicine, New York, NY, USA
- Carlee B. Moser
  - Center for Biostatistics in AIDS Research in the Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Martin Hoenigl
  - Division of Infectious Diseases, Department of Medicine, University of California San Diego, San Diego, CA, USA
  - Division of Infectious Diseases, Department of Internal Medicine, Medical University of Graz, Graz, Austria
- Michael M. Lederman
  - Department of Medicine, Division of Infectious Diseases and HIV Medicine, Case Western Reserve University, Cleveland, OH, USA
- Alan L. Landay
  - Department of Internal Medicine, Rush University Medical Center, Chicago, IL, USA
- Sara Gianella
  - Division of Infectious Diseases, Department of Medicine, University of California San Diego, San Diego, CA, USA
- Lishomwa C. Ndhlovu
  - Division of Infectious Diseases, Department of Medicine, Weill Cornell Medicine, New York, NY, USA
15
Sethi S, Shakyawar S, Reddy AS, Patel JC, Guda C. A Machine Learning Model for the Prediction of COVID-19 Severity Using RNA-Seq, Clinical, and Co-Morbidity Data. Diagnostics (Basel) 2024; 14:1284. [PMID: 38928699 PMCID: PMC11202902 DOI: 10.3390/diagnostics14121284] [Received: 04/17/2024] [Revised: 05/29/2024] [Accepted: 06/12/2024] [Indexed: 06/28/2024] Open
Abstract
The premise for this study emanated from the need to understand SARS-CoV-2 infections at the molecular level and to develop predictive tools for managing COVID-19 severity. With the varied clinical outcomes observed among infected individuals, creating a reliable machine learning (ML) model for predicting the severity of COVID-19 became paramount. Despite the availability of large-scale genomic and clinical data, previous studies have not effectively utilized multi-modality data for disease severity prediction using data-driven approaches. Our primary goal is to predict COVID-19 severity using a machine-learning model trained on a combination of patients' gene expression, clinical features, and co-morbidity data. Employing various ML algorithms, including Logistic Regression (LR), XGBoost (XG), Naïve Bayes (NB), and Support Vector Machine (SVM), alongside feature selection methods, we sought to identify the best-performing model for disease severity prediction. The results highlighted XG as the superior classifier, with 95% accuracy and a 0.99 AUC (Area Under the Curve), for distinguishing severity groups. Additionally, the SHAP analysis revealed vital features contributing to prediction, including several genes such as COX14, LAMB2, DOLK, SDCBP2, RHBDL1, and IER3-AS1. Notably, two clinical features, the absolute neutrophil count and Viremia Categories, emerged as top contributors. Integrating multiple data modalities has significantly improved the accuracy of disease severity prediction compared to using any single modality. The identified features could serve as biomarkers for COVID-19 prognosis and patient care, allowing clinicians to optimize treatment strategies and refine clinical decision-making processes for enhanced patient outcomes.
Affiliation(s)
- Sahil Sethi
  - Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE 68105, USA
- Sushil Shakyawar
  - Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE 68105, USA
- Athreya S. Reddy
  - Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA
- Jai Chand Patel
  - Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE 68105, USA
- Chittibabu Guda
  - Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE 68105, USA
16
Ahn H, Lee H. Predicting the transmission trends of COVID-19: an interpretable machine learning approach based on daily, death, and imported cases. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2024; 21:6150-6166. [PMID: 38872573 DOI: 10.3934/mbe.2024270] [Indexed: 06/15/2024]
Abstract
COVID-19 is caused by the SARS-CoV-2 virus, which has produced variants and raised concerns about a potential resurgence since the pandemic outbreak in 2019. Predicting infectious disease outbreaks is crucial for effective prevention and control. This study aims to predict the transmission patterns of COVID-19 using machine learning methods, such as support vector machine, random forest, and XGBoost, applied separately to confirmed cases, death cases, and imported cases. The study categorizes the transmission trends into three groups: L0 (decrease), L1 (maintain), and L2 (increase). We develop a risk index function to quantify changes in the transmission trends, which is then used for the machine learning classification. High accuracy is achieved when estimating the transmission trends for the confirmed cases (91.5-95.5%), death cases (85.6-91.8%), and imported cases (77.7-89.4%). Notably, the confirmed cases yield higher accuracy than the death and imported cases. L2 predictions outperformed L0 and L1 in all cases. Predicting L2 is important because an increasing trend can lead to new outbreaks. Thus, this robust L2 prediction is crucial for the timely implementation of control policies for the management of transmission dynamics.
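The three trend classes in this abstract can be illustrated with a small labelling routine. The abstract does not state the form of the risk index function, so the sketch below substitutes an assumed stand-in: the relative change in mean case counts between two consecutive windows, compared against a hypothetical `tolerance`. The `trend_label` name, the window inputs, and the 10% default are all illustrative assumptions, not the authors' method.

```python
from typing import Sequence


def trend_label(previous: Sequence[float],
                current: Sequence[float],
                tolerance: float = 0.1) -> str:
    """Label a transmission trend as L0 (decrease), L1 (maintain) or L2 (increase).

    Assumed stand-in for the paper's risk index: relative change in the
    mean case count from the previous window to the current one.
    """
    if not previous or not current:
        raise ValueError("both windows must contain at least one value")

    prev_mean = sum(previous) / len(previous)
    cur_mean = sum(current) / len(current)

    if prev_mean == 0:
        # No baseline to divide by: any cases at all count as an increase.
        return "L2" if cur_mean > 0 else "L1"

    change = (cur_mean - prev_mean) / prev_mean
    if change > tolerance:
        return "L2"
    if change < -tolerance:
        return "L0"
    return "L1"
```

Labels produced this way could serve as targets for the classifiers named in the abstract, with the choice of window length and tolerance determining how often the L1 class occurs.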
Affiliation(s)
- Hyeonjeong Ahn
  - Department of Statistics, Kyungpook National University, Daegu 41566, Republic of Korea
- Hyojung Lee
  - Department of Statistics, Kyungpook National University, Daegu 41566, Republic of Korea
17
Kumari S, Tripathy S, Nayak S, Rajasimman AS. Machine learning-aided algorithm design for prediction of severity from clinical, demographic, biochemical and immunological parameters: Our COVID-19 experience from the pandemic. J Family Med Prim Care 2024; 13:1937-1943. [PMID: 38948617 PMCID: PMC11213376 DOI: 10.4103/jfmpc.jfmpc_1752_23] [Received: 10/30/2023] [Revised: 12/11/2023] [Accepted: 12/12/2023] [Indexed: 07/02/2024] Open
Abstract
Background The severity of laboratory and imaging findings was found to be inconsistent with clinical symptoms in COVID-19 patients, thereby increasing casualties. Compared with conventional biomarkers, machine learning algorithms can learn nonlinear and complex interactions and thus improve prediction accuracy. This study aimed to evaluate the role of machine learning algorithms based on biochemical and immunological parameters for severity indexing in COVID-19. Methods Laboratory biochemical results of 5715 COVID-19 patients, including 509 admitted to the COVID-19 ICU, were mined from electronic records. Random Forest Classifier (RFC), Support Vector Machine (SVM), Naive Bayesian Classifier (NBC) and K-Nearest Neighbours (KNN) classifier models were used. Lasso regression helped in identifying the most influential parameter. A decision tree was built for a subdivided data set, based on randomization. Results SVM had the highest accuracy (94.18%), followed by RFC (94.04%). SVM had the highest PPV (1.00), and NBC had the highest NPV (0.95). QUEST modelling ignored age, urea and total protein, and only C-reactive protein and lactate dehydrogenase were included in the decision-tree algorithm. The overall percentage of correct classification by the algorithm was 78.31%, with a sensitivity of 87.95% and an AUC of 0.747. Conclusion Because C-reactive protein and lactate dehydrogenase are routinely performed tests in clinical laboratories in peripheral setups, this algorithm could be an effective predictive tool. SVM and RFC models showed significant accuracy in predicting COVID-19 severity and could be useful for future pandemics.
Affiliation(s)
- Suchitra Kumari
  - Department of Biochemistry, All India Institute of Medical Sciences, Bhubaneswar, Odisha, India
- Swagata Tripathy
  - Department of Anesthesiology, All India Institute of Medical Sciences, Bhubaneswar, Odisha, India
- Saurav Nayak
  - Department of Biochemistry, All India Institute of Medical Sciences, Bhubaneswar, Odisha, India
- Aishvarya S. Rajasimman
  - Department of Radiodiagnosis, All India Institute of Medical Sciences, Bhubaneswar, Odisha, India
18
Seyedtabib M, Najafi-Vosough R, Kamyari N. The predictive power of data: machine learning analysis for Covid-19 mortality based on personal, clinical, preclinical, and laboratory variables in a case-control study. BMC Infect Dis 2024; 24:411. [PMID: 38637727 PMCID: PMC11025285 DOI: 10.1186/s12879-024-09298-w] [Received: 12/22/2023] [Accepted: 04/05/2024] [Indexed: 04/20/2024] Open
Abstract
BACKGROUND AND PURPOSE The COVID-19 pandemic has presented unprecedented public health challenges worldwide. Understanding the factors contributing to COVID-19 mortality is critical for effective management and intervention strategies. This study aims to unlock the predictive power of data collected from personal, clinical, preclinical, and laboratory variables through machine learning (ML) analyses. METHODS A retrospective study was conducted in 2022 in a large hospital in Abadan, Iran. Data were collected and categorized into demographic, clinical, comorbid, treatment, initial vital signs, symptoms, and laboratory test groups. The collected data were subjected to ML analysis to identify predictive factors associated with COVID-19 mortality. Five algorithms were used to analyze the data set, and the latent predictive power of the variables was derived from Shapley additive explanations (SHAP) values. RESULTS Results highlight key factors associated with COVID-19 mortality, including age, comorbidities (hypertension, diabetes), specific treatments (antibiotics, remdesivir, favipiravir, vitamin zinc), and clinical indicators (heart rate, respiratory rate, temperature). Notably, specific symptoms (productive cough, dyspnea, delirium) and laboratory values (D-dimer, ESR) also play a critical role in predicting outcomes. This study highlights the importance of feature selection and the impact of data quantity and quality on model performance. CONCLUSION This study highlights the potential of ML analysis to improve the accuracy of COVID-19 mortality prediction and emphasizes the need for a comprehensive approach that considers multiple feature categories. It highlights the critical role of data quality and quantity in improving model performance and contributes to our understanding of the multifaceted factors that influence COVID-19 outcomes.
Affiliation(s)
- Maryam Seyedtabib
  - Department of Biostatistics and Epidemiology, School of Health, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Iran
- Roya Najafi-Vosough
  - Research Center for Health Sciences, Hamadan University of Medical Sciences, Hamadan, Iran
- Naser Kamyari
  - Department of Biostatistics and Epidemiology, School of Health, Abadan University of Medical Sciences, Abadan, Iran
19
Elmitwalli S, Mehegan J. Sentiment analysis of COP9-related tweets: a comparative study of pre-trained models and traditional techniques. Front Big Data 2024; 7:1357926. [PMID: 38572292 PMCID: PMC10987730 DOI: 10.3389/fdata.2024.1357926] [Received: 12/19/2023] [Accepted: 03/04/2024] [Indexed: 04/05/2024] Open
Abstract
Introduction Sentiment analysis has become a crucial area of research in natural language processing in recent years. The study aims to compare the performance of various sentiment analysis techniques, including lexicon-based, machine learning, Bi-LSTM, BERT, and GPT-3 approaches, using two commonly used datasets, IMDB reviews and Sentiment140. The objective is to identify the best-performing technique for an exemplar dataset, tweets associated with the WHO Framework Convention on Tobacco Control Ninth Conference of the Parties in 2021 (COP9). Methods A two-stage evaluation was conducted. In the first stage, various techniques were compared on standard sentiment analysis datasets using standard evaluation metrics such as accuracy, F1-score, and precision. In the second stage, the best-performing techniques from the first stage were applied to partially annotated COP9 conference-related tweets. Results In the first stage, BERT achieved the highest F1-scores (0.9380 for IMDB and 0.8114 for Sentiment 140), followed by GPT-3 (0.9119 and 0.7913) and Bi-LSTM (0.8971 and 0.7778). In the second stage, GPT-3 performed the best for sentiment analysis on partially annotated COP9 conference-related tweets, with an F1-score of 0.8812. Discussion The study demonstrates the effectiveness of pre-trained models like BERT and GPT-3 for sentiment analysis tasks, outperforming traditional techniques on standard datasets. Moreover, the better performance of GPT-3 on the partially annotated COP9 tweets highlights its ability to generalize well to domain-specific data with limited annotations. This provides researchers and practitioners with a viable option of using pre-trained models for sentiment analysis in scenarios with limited or no annotated data across different domains.
Affiliation(s)
- Sherif Elmitwalli
  - Tobacco Control Research Group, Department for Health, University of Bath, Bath, United Kingdom
20
Sagar D, Dwivedi T, Gupta A, Aggarwal P, Bhatnagar S, Mohan A, Kaur P, Gupta R. Clinical Features Predicting COVID-19 Severity Risk at the Time of Hospitalization. Cureus 2024; 16:e57336. [PMID: 38690475 PMCID: PMC11059179 DOI: 10.7759/cureus.57336] [Accepted: 03/31/2024] [Indexed: 05/02/2024] Open
Abstract
The global spread of COVID-19 has led to significant mortality and morbidity worldwide. Early identification of COVID-19 patients who are at high risk of developing severe disease can help in improved patient management, care, and treatment, as well as in the effective allocation of hospital resources. Severity prediction at the time of hospitalization can be extremely helpful in deciding the treatment of COVID-19 patients. To this end, this study presents an interpretable artificial intelligence (AI) model, named COVID-19 severity predictor (CoSP), that predicts COVID-19 severity using the clinical features at the time of hospital admission. We utilized a dataset comprising 64 demographic and laboratory features of 7,416 confirmed COVID-19 patients that were collected at the time of hospital admission. The proposed hierarchical CoSP model performs four-class COVID severity risk prediction into asymptomatic, mild, moderate, and severe categories. Compared to the other popular ML methods, CoSP yielded better performance on COVID severity prediction, with good interpretability as observed via Shapley analysis, achieving an area under the receiver operating characteristic curve (AUC-ROC) of 0.95, an area under the precision-recall curve (AUPRC) of 0.91, and a weighted F1-score of 0.83. Out of 64 initial features, 19 features were inferred as predictive of the severity of COVID-19 disease by the CoSP model. Therefore, an AI model predicting COVID-19 severity may be helpful for early intervention, optimizing resource allocation, and guiding personalized treatments, potentially enabling healthcare professionals to save lives and allocate resources effectively in the fight against the pandemic.
Collapse
Affiliation(s)
- Dikshant Sagar
- Computer Science, Indraprastha Institute of Information Technology - Delhi, Delhi, IND
- Computer Science, Calfornia State University, Los Angeles, Los Angeles, USA
| | - Tanima Dwivedi
- Oncology, Dr. B.R.A Institute-Rotary Cancer Hospital, All India Institute of Medical Sciences, New Delhi, IND
| | - Anubha Gupta
- Centre of Excellence in Healthcare, Indraprastha Institute of Information Technology - Delhi, Delhi, IND
| | - Priya Aggarwal
- Electronics and Communication Engineering, Indraprastha Institute of Information Technology - Delhi, Delhi, IND
| | - Sushma Bhatnagar
- Onco-Anaesthesia and Palliative Medicine, Dr. B.R.A Institute-Rotary Cancer Hospital, All India Institute of Medical Sciences, New Delhi, IND
| | - Anant Mohan
- Pulmonary, Critical Care and Sleep Medicine, All India Institute of Medical Sciences, New Delhi, IND
| | - Punit Kaur
- Biophysics, All India Institute of Medical Sciences, New Delhi, IND
| | - Ritu Gupta
- Oncology, Dr. B.R.A Institute-Rotary Cancer Hospital, All India Institute of Medical Sciences, New Delhi, IND
| |
|
21
|
Mahadhika CK, Aldila D. A deterministic transmission model for analytics-driven optimization of COVID-19 post-pandemic vaccination and quarantine strategies. Mathematical Biosciences and Engineering: MBE 2024; 21:4956-4988. [PMID: 38872522 DOI: 10.3934/mbe.2024219] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2024]
Abstract
This study developed a deterministic transmission model for the coronavirus disease of 2019 (COVID-19), considering various factors such as vaccination, awareness, quarantine, and treatment resource limitations for infected individuals in quarantine facilities. The proposed model comprised five compartments: susceptible, vaccinated, quarantined, infected, and recovered. It also considered awareness and limited resources by using a saturated function. Dynamic analyses, including equilibrium points, control reproduction numbers, and bifurcation analyses, were conducted in this research, employing analytics to derive insights. Our results indicated the possibility of an endemic equilibrium even if the reproduction number for control was less than one. Using incidence data from West Java, Indonesia, we estimated our model parameter values to calibrate them with the real situation in the field. Elasticity analysis highlighted the crucial role of contact restrictions in reducing the spread of COVID-19, especially when combined with community awareness. This emphasized the analytics-driven nature of our approach. We transformed our model into an optimal control framework due to budget constraints. Leveraging Pontryagin's maximum principle, we formulated and solved our optimal control problem using the forward-backward sweep method. Our experiments underscored the pivotal role of vaccination in infection containment. Vaccination effectively reduces the risk of infection among vaccinated individuals, leading to a lower overall infection rate. However, combining vaccination and quarantine measures yields even more promising results than vaccination alone. A second crucial finding emphasized the need for early intervention during outbreaks rather than delayed responses. Early interventions significantly reduce the number of preventable infections, underscoring their importance.
Affiliation(s)
- C K Mahadhika
- Department of Mathematics, Faculty of Mathematics and Natural Sciences, Universitas Indonesia, Depok 16424, Indonesia
| | - Dipo Aldila
- Department of Mathematics, Faculty of Mathematics and Natural Sciences, Universitas Indonesia, Depok 16424, Indonesia
| |
|
22
|
Elmitwalli S, Mehegan J, Wellock G, Gallagher A, Gilmore A. Topic prediction for tobacco control based on COP9 tweets using machine learning techniques. PLoS One 2024; 19:e0298298. [PMID: 38358979 PMCID: PMC10868820 DOI: 10.1371/journal.pone.0298298] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 06/27/2023] [Accepted: 01/23/2024] [Indexed: 02/17/2024] Open
Abstract
The prediction of tweets associated with specific topics offers the potential to automatically focus on and understand online discussions surrounding these issues. This paper introduces a comprehensive approach that centers on the topic of "harm reduction" within the broader context of tobacco control. The study leveraged tweets from the period surrounding the ninth Conference of the Parties to review the Framework Convention on Tobacco Control (COP9) as a case study to pilot this approach. By using Latent Dirichlet Allocation (LDA)-based topic modeling, the study successfully categorized tweets related to harm reduction. Subsequently, various machine learning techniques were employed to predict these topics, achieving a prediction accuracy of 91.87% using the Random Forest algorithm. Additionally, the study explored correlations between retweets and sentiment scores. It also conducted a toxicity analysis to understand the extent to which online conversations lacked neutrality. Understanding the topics, sentiment, and toxicity of Twitter data is crucial for identifying public opinion and its formation. By specifically focusing on the topic of "harm reduction" in tweets related to COP9, the findings offer valuable insights into online discussions surrounding tobacco control. This understanding can aid policymakers in effectively informing the public and garnering public support, ultimately contributing to the successful implementation of tobacco control policies.
Affiliation(s)
- Sherif Elmitwalli
- Tobacco Control Research Group, Department for Health, University of Bath, Bath, United Kingdom
| | - John Mehegan
- Tobacco Control Research Group, Department for Health, University of Bath, Bath, United Kingdom
| | - Georgie Wellock
- Tobacco Control Research Group, Department for Health, University of Bath, Bath, United Kingdom
| | - Allen Gallagher
- Tobacco Control Research Group, Department for Health, University of Bath, Bath, United Kingdom
| | - Anna Gilmore
- Tobacco Control Research Group, Department for Health, University of Bath, Bath, United Kingdom
| |
|
23
|
Rahmatinejad Z, Dehghani T, Hoseini B, Rahmatinejad F, Lotfata A, Reihani H, Eslami S. A comparative study of explainable ensemble learning and logistic regression for predicting in-hospital mortality in the emergency department. Sci Rep 2024; 14:3406. [PMID: 38337000 PMCID: PMC10858239 DOI: 10.1038/s41598-024-54038-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2023] [Accepted: 02/07/2024] [Indexed: 02/12/2024] Open
Abstract
This study addresses the challenges associated with emergency department (ED) overcrowding and emphasizes the need for efficient risk stratification tools to identify high-risk patients for early intervention. While several scoring systems, often based on logistic regression (LR) models, have been proposed to indicate patient illness severity, this study aims to compare the predictive performance of ensemble learning (EL) models with LR for in-hospital mortality in the ED. A cross-sectional single-center study was conducted at the ED of Imam Reza Hospital in northeast Iran from March 2016 to March 2017. The study included adult patients with one to three levels of emergency severity index. EL models using Bagging, AdaBoost, random forests (RF), Stacking and extreme gradient boosting (XGB) algorithms, along with an LR model, were constructed. The training and validation visits from the ED were randomly divided into 80% and 20%, respectively. After training the proposed models using tenfold cross-validation, their predictive performance was evaluated. Model performance was compared using the Brier score (BS), the area under the receiver operating characteristic curve (AUROC), the area under the precision-recall curve (AUCPR), the Hosmer-Lemeshow (H-L) goodness-of-fit test, precision, sensitivity, accuracy, F1-score, and Matthews correlation coefficient (MCC). The study included 2025 unique patients admitted to the hospital's ED, with a total percentage of hospital deaths at approximately 19%. In the training group and the validation group, 274 of 1476 (18.6%) and 152 of 728 (20.8%) patients died during hospitalization, respectively. According to the evaluation of the presented framework, EL models, particularly Bagging, predicted in-hospital mortality with the highest AUROC (0.839, CI (0.802-0.875)) and AUCPR = 0.64, comparable in terms of discrimination power with LR (AUROC (0.826, CI (0.787-0.864)) and AUCPR = 0.61). XGB achieved the highest precision (0.83), sensitivity (0.831), accuracy (0.842), F1-score (0.833), and the highest MCC (0.48). Additionally, the most accurate models in the unbalanced dataset belonged to RF with the lowest BS (0.128). Although all studied models overestimate mortality risk and have insufficient calibration (P > 0.05), stacking demonstrated relatively good agreement between predicted and actual mortality. EL models are not superior to LR in predicting in-hospital mortality in the ED. Both EL and LR models can be considered as screening tools to identify patients at risk of mortality.
Affiliation(s)
- Zahra Rahmatinejad
- Department of Medical Informatics, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Toktam Dehghani
- Department of Medical Informatics, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
- Toos Institute of Higher Education, Mashhad, Iran
| | - Benyamin Hoseini
- Pharmaceutical Research Center, Pharmaceutical Technology Institute, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Fatemeh Rahmatinejad
- Department of Medical Informatics, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Aynaz Lotfata
- Department of Pathology, Microbiology, and Immunology, School of Veterinary Medicine, University of California, Davis, CA, USA
| | - Hamidreza Reihani
- Department of Emergency Medicine, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran.
| | - Saeid Eslami
- Department of Medical Informatics, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran.
- Pharmaceutical Research Center, Pharmaceutical Technology Institute, Mashhad University of Medical Sciences, Mashhad, Iran.
- Department of Medical Informatics, Amsterdam UMC - Location AMC, University of Amsterdam, Amsterdam, The Netherlands.
| |
|
24
|
Viderman D, Kotov A, Popov M, Abdildin Y. Machine and deep learning methods for clinical outcome prediction based on physiological data of COVID-19 patients: a scoping review. Int J Med Inform 2024; 182:105308. [PMID: 38091862 DOI: 10.1016/j.ijmedinf.2023.105308] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Revised: 11/20/2023] [Accepted: 12/03/2023] [Indexed: 01/07/2024]
Abstract
INTRODUCTION Since the beginning of the COVID-19 pandemic, numerous machine and deep learning (MDL) methods have been proposed in the literature to analyze patient physiological data. The objective of this review is to summarize various aspects of these methods and assess their practical utility for predicting various clinical outcomes. METHODS We searched PubMed, Scopus, and Cochrane Library, screened and selected the studies matching the inclusion criteria. The clinical analysis focused on the characteristics of the patient cohorts in the studies included in this review, the specific tasks in the context of the COVID-19 pandemic that machine and deep learning methods were used for, and their practical limitations. The technical analysis focused on the details of specific MDL methods and their performance. RESULTS Analysis of the 48 selected studies revealed that the majority (∼54 %) of them examined the application of MDL methods for the prediction of survival/mortality-related patient outcomes, while a smaller fraction (∼13 %) of studies also examined applications to the prediction of patients' physiological outcomes and hospital resource utilization. 21 % of the studies examined the application of MDL methods to multiple clinical tasks. Machine and deep learning methods have been shown to be effective at predicting several outcomes of COVID-19 patients, such as disease severity, complications, intensive care unit (ICU) transfer, and mortality. MDL methods also achieved high accuracy in predicting the required number of ICU beds and ventilators. CONCLUSION Machine and deep learning methods have been shown to be valuable tools for predicting disease severity, organ dysfunction and failure, patient outcomes, and hospital resource utilization during the COVID-19 pandemic. The discovered knowledge and our conclusions and recommendations can also be useful to healthcare professionals and artificial intelligence researchers in managing future pandemics.
Affiliation(s)
- Dmitriy Viderman
- Department of Surgery, School of Medicine, Nazarbayev University, Astana, Kazakhstan; Department of Anesthesiology, Intensive Care, and Pain Medicine, National Research Oncology Center, Astana, Kazakhstan.
| | - Alexander Kotov
- Department of Computer Science, College of Engineering, Wayne State University, Detroit, USA.
| | - Maxim Popov
- Department of Computer Science, School of Engineering and Digital Sciences, Nazarbayev University, Astana, Kazakhstan.
| | - Yerkin Abdildin
- Department of Mechanical and Aerospace Engineering, School of Engineering and Digital Sciences, Nazarbayev University, Astana, Kazakhstan.
| |
|
25
|
Agnello L, Vidali M, Padoan A, Lucis R, Mancini A, Guerranti R, Plebani M, Ciaccio M, Carobene A. Machine learning algorithms in sepsis. Clin Chim Acta 2024; 553:117738. [PMID: 38158005 DOI: 10.1016/j.cca.2023.117738] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2023] [Revised: 12/20/2023] [Accepted: 12/20/2023] [Indexed: 01/03/2024]
Abstract
Sepsis remains a significant global health challenge due to its high mortality and morbidity, compounded by the difficulty of early detection given its variable clinical manifestations. The integration of machine learning (ML) into laboratory medicine for timely sepsis identification and outcome forecasting is an emerging field of interest. This comprehensive review assesses the current body of research on ML applications for sepsis within the realm of laboratory diagnostics, detailing both their strengths and shortcomings. An extensive literature search was performed by two independent investigators across PubMed and Scopus databases, employing the keywords "Sepsis," "Machine Learning," and "Laboratory" without publication date limitations, culminating in January 2023. Each selected study was evaluated for various aspects, including its design, intent (diagnostic or prognostic), clinical environment, demographics, sepsis criteria, data gathering period, and the scope and nature of features, in addition to the ML methodologies and their validation procedures. Out of 135 articles reviewed, 39 fulfilled the criteria for inclusion. Among these, the majority (30 studies) were focused on devising ML algorithms for diagnosis, fewer (8 studies) on prognosis, and one study addressed both aspects. The dissemination of these studies across an array of journals reflects the interdisciplinary engagement in the development of ML algorithms for sepsis. This analysis highlights the promising role of ML in the early diagnosis of sepsis while drawing attention to the need for uniformity in validating models and defining features, crucial steps for ensuring the reliability and practicality of ML in clinical settings.
Affiliation(s)
- Luisa Agnello
- Institute of Clinical Biochemistry, Clinical Molecular Medicine and Clinical Laboratory Medicine, Department of Biomedicine, Neurosciences and Advanced Diagnostics, University of Palermo, Palermo, Italy
| | - Matteo Vidali
- Clinical Pathology Unit, Fondazione IRCCS Ca' Granda Ospedale Maggiore Policlinico, Milano, Italy
| | - Andrea Padoan
- Department of Medicine-DIMED, University of Padova, Padova, Italy; Laboratory Medicine Unit, University-Hospital of Padova, Padova, Italy; QI.LAB.MED., Spin-off of the University of Padova, Padova, Italy
| | - Riccardo Lucis
- Department of Medicine (DAME), University of Udine, 33100, Udine, Italy; Microbiology and Virology Unit, Department of Laboratory Medicine, Azienda Sanitaria Friuli Occidentale (ASFO), Santa Maria degli Angeli Hospital, 33170, Pordenone, Italy
| | - Alessio Mancini
- School of Biosciences and Veterinary Medicine, University of Camerino, Camerino, Italy; Operative Unit of Clinical Pathology, AST2 Ancona, Senigallia, Italy
| | - Roberto Guerranti
- Department of Medical Biotechnologies, University of Siena, Siena, Italy; Clinical Pathology Unit, Innovation, Experimentation and Clinical and Translational Research Department, University Hospital of Siena, Siena, Italy
| | - Mario Plebani
- Department of Medicine-DIMED, University of Padova, Padova, Italy; Laboratory Medicine Unit, University-Hospital of Padova, Padova, Italy; QI.LAB.MED., Spin-off of the University of Padova, Padova, Italy; Clinical Biochemistry and Clinical Molecular Biology, School of Medicine, University of Padova, Padova, Italy
| | - Marcello Ciaccio
- Institute of Clinical Biochemistry, Clinical Molecular Medicine and Clinical Laboratory Medicine, Department of Biomedicine, Neurosciences and Advanced Diagnostics, University of Palermo, Palermo, Italy; Department of Laboratory Medicine, University Hospital "P. Giaccone", Palermo, Italy.
| | - Anna Carobene
- IRCCS San Raffaele Scientific Institute, Milan, Italy
| |
|
26
|
Kanchan S, Ogden E, Kesheri M, Skinner A, Miliken E, Lyman D, Armstrong J, Sciglitano L, Hampikian G. COVID-19 hospitalizations and deaths predicted by SARS-CoV-2 levels in Boise, Idaho wastewater. The Science of the Total Environment 2024; 907:167742. [PMID: 37852488 DOI: 10.1016/j.scitotenv.2023.167742] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 07/22/2023] [Revised: 09/22/2023] [Accepted: 10/09/2023] [Indexed: 10/20/2023]
Abstract
The viral load of COVID-19 in untreated wastewater from Idaho's capital city Boise, ID (Ada County) has been used to predict changes in hospital admissions (statewide in Idaho) and deaths (Ada County) using distributed fixed lag modeling and artificial neural networks (ANN). The wastewater viral counts were used to determine the lag time between peaks in wastewater viral counts and COVID-19 hospitalizations as well as deaths (14 and 23 days, respectively). Quantitative measurement of SARS-CoV-2 viral RNA counts in the untreated wastewater was determined three times a week using RT-qPCR over a span of 13 months. To mitigate the effects of PCR inhibitors in wastewater, a series of dilution tests were conducted, and the 1/4 dilution was used to generate the most successful model. Wastewater SARS-CoV-2 viral RNA counts and hospitalization from June 7, 2021 to December 29, 2021 were used as training data to predict hospitalizations; and wastewater SARS-CoV-2 viral RNA counts and deaths from June 7, 2021 to December 20, 2021 were used as training data to predict deaths. These training data were used to make predictive ANN models for future hospitalizations and deaths. To the best of our knowledge, this is the first report of prediction of deaths from COVID-19 based on wastewater SARS-CoV-2 viral RNA counts using machine learning-based multilayered ANN. The applied modeling demonstrates that wastewater surveillance data can be combined with hospitalizations and death data to generate machine learning-based ANN models that predict future COVID-19 hospital admissions and deaths, providing an early warning for medical response teams and healthcare policymakers.
Affiliation(s)
- Swarna Kanchan
- Department of Biological Sciences, Boise State University, Boise, Idaho, 83725, United States of America; Department of Biomedical Sciences, Joan C. Edwards School of Medicine, Marshall University, Huntington, West Virginia, 25701, United States of America
| | - Ernie Ogden
- Department of Biological Sciences, Boise State University, Boise, Idaho, 83725, United States of America
| | - Minu Kesheri
- Department of Biological Sciences, Boise State University, Boise, Idaho, 83725, United States of America; Department of Biomedical Sciences, Joan C. Edwards School of Medicine, Marshall University, Huntington, West Virginia, 25701, United States of America
| | - Alexis Skinner
- Department of Biological Sciences, Boise State University, Boise, Idaho, 83725, United States of America
| | - Erin Miliken
- Department of Biological Sciences, Boise State University, Boise, Idaho, 83725, United States of America
| | - Devyn Lyman
- Department of Biological Sciences, Boise State University, Boise, Idaho, 83725, United States of America
| | - Jacob Armstrong
- Department of Biological Sciences, Boise State University, Boise, Idaho, 83725, United States of America
| | - Lawrence Sciglitano
- Department of Biological Sciences, Boise State University, Boise, Idaho, 83725, United States of America
| | - Greg Hampikian
- Department of Biological Sciences, Boise State University, Boise, Idaho, 83725, United States of America.
| |
|
27
|
Moumen A, Shafqat A, Alraqad T, Alshawarbeh ES, Saber H, Shafqat R. Divorce prediction using machine learning algorithms in Ha'il region, KSA. Sci Rep 2024; 14:502. [PMID: 38177210 PMCID: PMC10766631 DOI: 10.1038/s41598-023-50839-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2023] [Accepted: 12/26/2023] [Indexed: 01/06/2024] Open
Abstract
The application of artificial intelligence (AI) in predictive analytics is growing in popularity. It has the power to offer ground-breaking solutions for a range of social problems and real-world societal difficulties. It is helpful in addressing some of the social issues that today's world seems incapable of solving. One of the most significant phenomena affecting people's lives is divorce. The goal of this paper is to study the use of machine learning algorithms to determine the effectiveness of the divorce predictor scale (DPS) and identify the reasons that usually lead to divorce in the Ha'il region, KSA. For this purpose, the DPS, based on Gottman couples therapy, was used to predict divorce by applying different machine learning algorithms. The 54 items of the DPS were used as features or attributes for data collection. In addition to the DPS, a personal information form was utilized to gather participants' personal data in order to conduct this study in a more structured and traditional manner. Of 148 participants, 116 were married and 32 were divorced. The effectiveness of the DPS was examined using artificial neural network (ANN), naïve Bayes (NB), and random forest (RF) algorithms. The correlation-based feature selection method was used to identify the top six features from the same dataset, and the highest accuracy rate was 91.66% with RF. The results show that the DPS can predict divorce. This scale can help family counselors and therapists in the case formulation and intervention plan development process. Additionally, it may be argued that the Ha'il region, KSA sampling confirmed the Gottman couples treatment predictors.
Affiliation(s)
- Abdelkader Moumen
- Department of Mathematics, College of Science, University of Ha'il, Ha'il, 55473, Saudi Arabia.
| | - Ayesha Shafqat
- Department of Education, The University of Lahore, Sargodha, 40100, Pakistan
| | - Tariq Alraqad
- Department of Mathematics, College of Science, University of Ha'il, Ha'il, 55473, Saudi Arabia
| | - Etaf Saleh Alshawarbeh
- Department of Mathematics, College of Science, University of Ha'il, Ha'il, 55473, Saudi Arabia
| | - Hicham Saber
- Department of Mathematics, College of Science, University of Ha'il, Ha'il, 55473, Saudi Arabia
| | - Ramsha Shafqat
- Department of Mathematics and Statistics, The University of Lahore, Sargodha, 40100, Pakistan
| |
|
28
|
Huang B, Hu S, Liu Z, Lin CL, Su J, Zhao C, Wang L, Wang W. Challenges and prospects of visual contactless physiological monitoring in clinical study. NPJ Digit Med 2023; 6:231. [PMID: 38097771 PMCID: PMC10721846 DOI: 10.1038/s41746-023-00973-x] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 07/02/2023] [Accepted: 11/21/2023] [Indexed: 12/17/2023] Open
Abstract
The monitoring of physiological parameters is a crucial topic in promoting human health and an indispensable approach for assessing physiological status and diagnosing diseases. In particular, it holds significant value for patients who require long-term monitoring or have underlying cardiovascular disease. To this end, Visual Contactless Physiological Monitoring (VCPM) is capable of using videos recorded by a consumer camera to monitor blood volume pulse (BVP) signal, heart rate (HR), respiratory rate (RR), oxygen saturation (SpO2) and blood pressure (BP). Recently, deep learning-based pipelines have attracted numerous scholars and achieved unprecedented development. Although VCPM is still an emerging digital medical technology and presents many challenges and opportunities, it has the potential to revolutionize clinical medicine, digital health, telemedicine as well as other areas. The VCPM technology presents a viable solution that can be integrated into these systems for measuring vital parameters during video consultation, owing to its merits of contactless measurement, cost-effectiveness, user-friendly passive monitoring and the sole requirement of an off-the-shelf camera. In fact, studies of VCPM technologies, particularly AI-based approaches, have increased rapidly in recent years, but few are employed in clinical settings. Here we provide a comprehensive overview of the applications, challenges, and prospects of VCPM from the perspective of clinical settings and AI technologies for the first time. The thorough exploration and analysis of clinical scenarios will provide profound guidance for the research and development of VCPM technologies in clinical settings.
Affiliation(s)
- Bin Huang
- AI Research Center, Hangzhou Innovation Institute, Beihang University, 99 Juhang Rd., Binjiang Dist., Hangzhou, Zhejiang, China.
- School of Automation Science and Electrical Engineering, Beihang University, Beijing, China.
| | - Shen Hu
- Department of Obstetrics, The Second Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Department of Epidemiology, The Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Zimeng Liu
- School of Automation Science and Electrical Engineering, Beihang University, Beijing, China
| | - Chun-Liang Lin
- College of Electrical Engineering and Computer Science, National Chung Hsing University, 145 Xingda Rd., South Dist., Taichung, Taiwan.
| | - Junfeng Su
- Department of General Intensive Care Unit, The Second Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Key Laboratory of Early Warning and Intervention of Multiple Organ Failure, China National Ministry of Education, Hangzhou, Zhejiang, China
| | - Changchen Zhao
- AI Research Center, Hangzhou Innovation Institute, Beihang University, 99 Juhang Rd., Binjiang Dist., Hangzhou, Zhejiang, China
| | - Li Wang
- Department of Rehabilitation Medicine, The First Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
| | - Wenjin Wang
- Department of Biomedical Engineering, Southern University of Science and Technology, 1088 Xueyuan Ave, Nanshan Dist., Shenzhen, Guangdong, China.
| |
|
29
|
Barreto TDO, Veras NVR, Cardoso PH, Fernandes FRDS, Medeiros LPDS, Bezerra MV, de Andrade FMQ, Pinheiro CDO, Sánchez-Gendriz I, Silva GJPC, Rodrigues LF, de Morais AHF, dos Santos JPQ, Paiva JC, de Andrade IGM, Valentim RADM. Artificial intelligence applied to analyzes during the pandemic: COVID-19 beds occupancy in the state of Rio Grande do Norte, Brazil. Front Artif Intell 2023; 6:1290022. [PMID: 38145230 PMCID: PMC10748397 DOI: 10.3389/frai.2023.1290022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2023] [Accepted: 11/17/2023] [Indexed: 12/26/2023] Open
Abstract
The COVID-19 pandemic is already considered one of the biggest global health crises. In Rio Grande do Norte, a Brazilian state, the RegulaRN platform was the health information system used to regulate beds for patients with COVID-19. This article explored machine learning and deep learning techniques with RegulaRN data in order to identify the best models and parameters to predict the outcome of a hospitalized patient. A total of 25,366 bed regulations for COVID-19 patients were analyzed. The data analyzed come from the RegulaRN Platform database from April 2020 to August 2022. From these data, the nine most pertinent characteristics were selected from the twenty available, and blank or inconclusive data were excluded. The subsequent steps were data pre-processing, database balancing, training, and testing. The results showed better performance in terms of accuracy (84.01%), precision (79.57%), and F1-score (81.00%) for the Multilayer Perceptron model with the Stochastic Gradient Descent optimizer. The best results for recall (84.67%), specificity (84.67%), and ROC-AUC (91.6%) were achieved by Root Mean Squared Propagation. This study compared different computational methods of machine and deep learning whose objective was to classify bed regulation data for patients with COVID-19 from the RegulaRN Platform. The results have made it possible to identify the best model to help health professionals during the process of regulating beds for patients with COVID-19. The scientific findings of this article demonstrate that the computational methods used, applied through a digital health solution, can assist in the decision-making of medical regulators and government institutions in situations of public health crisis.
Affiliation(s)
- Tiago de Oliveira Barreto
- Laboratory of Technological Innovation in Health (LAIS), Federal University of Rio Grande do Norte (UFRN), Natal, Rio Grande do Norte, Brazil
| | - Nícolas Vinícius Rodrigues Veras
- Laboratory of Technological Innovation in Health (LAIS), Federal University of Rio Grande do Norte (UFRN), Natal, Rio Grande do Norte, Brazil
- Advanced Nucleus of Technological Innovation (NAVI), Federal Institute of Rio Grande do Norte (IFRN), Natal, Rio Grande do Norte, Brazil
| | - Pablo Holanda Cardoso
- Laboratory of Technological Innovation in Health (LAIS), Federal University of Rio Grande do Norte (UFRN), Natal, Rio Grande do Norte, Brazil
- Advanced Nucleus of Technological Innovation (NAVI), Federal Institute of Rio Grande do Norte (IFRN), Natal, Rio Grande do Norte, Brazil
- Felipe Ricardo dos Santos Fernandes
- Laboratory of Technological Innovation in Health (LAIS), Federal University of Rio Grande do Norte (UFRN), Natal, Rio Grande do Norte, Brazil
- Maria Valéria Bezerra
- Secretary of Public Health of Rio Grande do Norte, Natal, Rio Grande do Norte, Brazil
- Ignacio Sánchez-Gendriz
- Laboratory of Technological Innovation in Health (LAIS), Federal University of Rio Grande do Norte (UFRN), Natal, Rio Grande do Norte, Brazil
- Gleyson José Pinheiro Caldeira Silva
- Laboratory of Technological Innovation in Health (LAIS), Federal University of Rio Grande do Norte (UFRN), Natal, Rio Grande do Norte, Brazil
- Advanced Nucleus of Technological Innovation (NAVI), Federal Institute of Rio Grande do Norte (IFRN), Natal, Rio Grande do Norte, Brazil
- Leandro Farias Rodrigues
- Brazilian Company of Hospital Services (EBSERH), University Hospital of Pelotas, Federal University of Pelotas (UFPel), Pelotas, Rio Grande do Sul, Brazil
- Antonio Higor Freire de Morais
- Laboratory of Technological Innovation in Health (LAIS), Federal University of Rio Grande do Norte (UFRN), Natal, Rio Grande do Norte, Brazil
- Advanced Nucleus of Technological Innovation (NAVI), Federal Institute of Rio Grande do Norte (IFRN), Natal, Rio Grande do Norte, Brazil
- João Paulo Queiroz dos Santos
- Laboratory of Technological Innovation in Health (LAIS), Federal University of Rio Grande do Norte (UFRN), Natal, Rio Grande do Norte, Brazil
- Advanced Nucleus of Technological Innovation (NAVI), Federal Institute of Rio Grande do Norte (IFRN), Natal, Rio Grande do Norte, Brazil
- Jailton Carlos Paiva
- Laboratory of Technological Innovation in Health (LAIS), Federal University of Rio Grande do Norte (UFRN), Natal, Rio Grande do Norte, Brazil
- Advanced Nucleus of Technological Innovation (NAVI), Federal Institute of Rio Grande do Norte (IFRN), Natal, Rio Grande do Norte, Brazil
- Ion Garcia Mascarenhas de Andrade
- Laboratory of Technological Innovation in Health (LAIS), Federal University of Rio Grande do Norte (UFRN), Natal, Rio Grande do Norte, Brazil
|
30
|
Ciaccio M, Schneiderman C, Pandey A, Fowler R, Chiou K, Koeller G, Hallett D, Krueger W, Raskin L. A time-course prediction model of global COVID-19 mortality. Front Public Health 2023; 11:1232531. [PMID: 38192563 PMCID: PMC10773778 DOI: 10.3389/fpubh.2023.1232531] [Received: 05/31/2023] [Accepted: 11/20/2023] [Indexed: 01/10/2024] Open
Abstract
Introduction The COVID-19 pandemic has caused over 6 million deaths worldwide and is a significant cause of mortality. Mortality dynamics vary significantly by country due to pathogen, host, social and environmental factors, in addition to vaccination and treatments. However, there is limited data on the relative contribution of different explanatory variables, which may explain changes in mortality over time. We, therefore, created a predictive model using orthogonal machine learning techniques to attempt to quantify the contribution of static and dynamic variables over time. Methods A model was created using Partial Least Squares Regression trained on data from 2020 to rank order the significance and effect size of static variables on mortality per country. This model enables the prediction of mortality levels for countries based on demographics alone. Partial Least Squares Regression was then used to quantify how dynamic variables, including weather and non-pharmaceutical interventions, contributed to the overall mortality in 2020. Finally, mortality levels for the first 60 days of 2021 were predicted using rolling-window Elastic Net regression. Results This model allowed prediction of deaths per day and quantification of the degree of influence of included variables, accounting for timing of occurrence or implementation. We found that the most parsimonious model could be reduced to six variables; three policy-related variables - COVID-19 testing policy, canceled public events policy, workplace closing policy; in addition to three environmental variables - maximum temperature per day, minimum temperature per day, and the dewpoint temperature per day. 
Conclusion Country and population-level static and dynamic variables can be used to predict COVID-19 mortality, providing an example of how broad temporal data can inform a preparation and mitigation strategy for both COVID-19 and future pandemics and assist decision-makers by identifying population-level contributors, including interventions, that have the greatest influence in mitigating mortality, and optimizing the health and safety of populations.
Affiliation(s)
- Robert Fowler
- Sunnybrook Health Sciences Center, Toronto, ON, Canada
- Kevin Chiou
- Meta Reality Labs, Burlingame, CA, United States
- Gage Koeller
- Tulane University School of Public Health and Tropical Medicine, New Orleans, LA, United States
- Leon Raskin
- AbbVie Inc., North Chicago, IL, United States
|
31
|
Reina-Reina A, Barrera J, Maté A, Trujillo J, Valdivieso B, Gas ME. Developing an interpretable machine learning model for predicting COVID-19 patients deteriorating prior to intensive care unit admission using laboratory markers. Heliyon 2023; 9:e22878. [PMID: 38125502 PMCID: PMC10731083 DOI: 10.1016/j.heliyon.2023.e22878] [Received: 05/27/2023] [Revised: 11/15/2023] [Accepted: 11/22/2023] [Indexed: 12/23/2023] Open
Abstract
Coronavirus disease (COVID-19) remains a significant global health challenge, prompting a transition from emergency response to comprehensive management strategies. Furthermore, the emergence of new variants of concern, such as BA.2.286, underscores the need for early detection and response to new variants, which continues to be a crucial strategy for mitigating the impact of COVID-19, especially among the vulnerable population. This study aims to anticipate patients requiring intensive care or facing elevated mortality risk throughout their COVID-19 infection while also identifying laboratory predictive markers for early diagnosis of patients. Therefore, haematological, biochemical, and demographic variables were retrospectively evaluated in 8,844 blood samples obtained from 2,935 patients before intensive care unit admission using an interpretable machine learning model. Feature selection techniques were applied using precision-recall measures to address data imbalance and evaluate the suitability of the different variables. The model was trained using stratified cross-validation with k=5 and internally validated, achieving an accuracy of 77.27%, sensitivity of 78.55%, and area under the receiver operating characteristic (AUC) of 0.85; successfully identifying patients at increased risk of severe progression. From a medical perspective, the most important features of the progression or severity of patients with COVID-19 were lactate dehydrogenase, age, red blood cell distribution standard deviation, neutrophils, and platelets, which align with findings from several prior investigations. In light of these insights, diagnostic processes can be significantly expedited through the use of laboratory tests, with a greater focus on key indicators. This strategic approach not only improves diagnostic efficiency but also extends its reach to a broader spectrum of patients. 
In addition, it allows healthcare professionals to take early preventive measures for those most at risk of adverse outcomes, thereby optimising patient care and prognosis.
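The stratified cross-validation with k=5 mentioned in this abstract can be illustrated with a minimal standard-library sketch. The function name `stratified_k_fold` and the toy labels are assumptions made for illustration, not the authors' code; the abstract also does not say whether samples from the same patient were kept in the same fold, so that is not modelled here.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs that roughly preserve class ratios."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold receives its share.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)

    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(j for f, fold in enumerate(folds) if f != i for j in fold)
        yield train_idx, test_idx


# Hypothetical imbalanced labels: 20 positives and 80 negatives.
labels = [1] * 20 + [0] * 80
splits = list(stratified_k_fold(labels, k=5))
```

Each test fold here holds 20 samples, 4 of them positive, which matches the 20% prevalence of the full toy set.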
Affiliation(s)
- A. Reina-Reina
- Lucentia Research. Department of Software and Computing System, University of Alicante, Carretera San Vicente del Raspeig s/n, 03690, Alicante, Spain
- Lucentia Lab, Av. Pintor Pérez Gil, 16, 03540, Alicante, Spain
- J.M. Barrera
- Lucentia Research. Department of Software and Computing System, University of Alicante, Carretera San Vicente del Raspeig s/n, 03690, Alicante, Spain
- Lucentia Lab, Av. Pintor Pérez Gil, 16, 03540, Alicante, Spain
- A. Maté
- Lucentia Research. Department of Software and Computing System, University of Alicante, Carretera San Vicente del Raspeig s/n, 03690, Alicante, Spain
- Lucentia Lab, Av. Pintor Pérez Gil, 16, 03540, Alicante, Spain
- J.C. Trujillo
- Lucentia Research. Department of Software and Computing System, University of Alicante, Carretera San Vicente del Raspeig s/n, 03690, Alicante, Spain
- Lucentia Lab, Av. Pintor Pérez Gil, 16, 03540, Alicante, Spain
- B. Valdivieso
- The University and Polytechnic La Fe Hospital of Valencia, Avenida Fernando Abril Martorell, 106 Torre H 1st floor, 46026, Valencia, Spain
- The Medical Research Institute of Hospital La Fe, Avenida Fernando Abril Martorell, 106 Torre F 7th floor, 46026, Valencia, Spain
- María-Eugenia Gas
- The Medical Research Institute of Hospital La Fe, Avenida Fernando Abril Martorell, 106 Torre F 7th floor, 46026, Valencia, Spain
|
32
|
Panç K, Hürsoy N, Başaran M, Yazici MM, Kaba E, Nalbant E, Gündoğdu H, Gürün E. Predicting COVID-19 Outcomes: Machine Learning Predictions Across Diverse Datasets. Cureus 2023; 15:e50932. [PMID: 38249212 PMCID: PMC10800012 DOI: 10.7759/cureus.50932] [Accepted: 12/21/2023] [Indexed: 01/23/2024] Open
Abstract
Background The COVID-19 infection has spread rapidly since its emergence and has affected a large part of the global population. With the increasing number of cases, researchers are trying to predict the prognosis of patients by using different data with artificial intelligence methods such as machine learning (ML). In this study, we aimed to predict mortality risk in COVID-19 patients using ML algorithms with different datasets. Methodology In this retrospective study, we evaluated the fever, oxygen saturation, laboratory results, thorax computed tomography (CT) findings, and comorbid diseases at admission to the hospital of 404 patients whose diagnosis was confirmed by the reverse transcription polymerase chain reaction test. Different datasets were created by combining the data. The Synthetic Minority Oversampling Technique was used to reduce the imbalance in the dataset. K-nearest neighbors, support vector machine, stochastic gradient descent, random forest, neural network, naive Bayes, logistic regression, gradient boosting, XGBoost, and AdaBoost models were used to create the ML algorithm, and the accuracy rates of mortality prediction were compared. Results When the dataset was created with CT parenchyma score, pulmonary artery and inferior vena cava diameters, and laboratory results, mortality was predicted with an accuracy of 98.4% with the gradient boosting model. Conclusions The study demonstrates that patient prognosis can be accurately predicted using simple measurements from thorax CT scans and laboratory findings.
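The Synthetic Minority Oversampling Technique named in this abstract creates new minority-class points by interpolating between an existing minority sample and one of its nearest minority neighbours. The sketch below shows that core step only; the function name `smote`, the neighbour count, and the toy data are illustrative assumptions, and the abstract does not describe the study's actual settings.

```python
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Brute-force neighbour search; adequate for a small illustration.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, other)))
    return synthetic


# Hypothetical 2-D minority class (for example, two scaled laboratory values).
minority = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.4), (0.9, 2.1)]
new_points = smote(minority, n_new=6)
```

Oversampling of this kind is normally applied to the training portion only, so that synthetic points derived from test cases do not leak into evaluation; the abstract does not state how this was handled.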
Affiliation(s)
- Kemal Panç
- Radiology, Recep Tayyip Erdoğan Education and Research Hospital, Rize, TUR
- Nur Hürsoy
- Radiology, Recep Tayyip Erdoğan Education and Research Hospital, Rize, TUR
- Mustafa Başaran
- Radiology, Recep Tayyip Erdoğan Education and Research Hospital, Rize, TUR
- Mümin Murat Yazici
- Emergency Medicine, Recep Tayyip Erdoğan Education and Research Hospital, Rize, TUR
- Esat Kaba
- Radiology, Recep Tayyip Erdoğan Education and Research Hospital, Rize, TUR
- Hasan Gündoğdu
- Radiology, Recep Tayyip Erdoğan Education and Research Hospital, Rize, TUR
- Enes Gürün
- Radiology, Samsun University, Samsun, TUR
|
33
|
Zhuang Z, Qi Y, Yao Y, Yu Y. A predictive model for disease severity among COVID-19 elderly patients based on IgG subtypes and machine learning. Front Immunol 2023; 14:1286380. [PMID: 38106427 PMCID: PMC10723829 DOI: 10.3389/fimmu.2023.1286380] [Received: 08/31/2023] [Accepted: 11/15/2023] [Indexed: 12/19/2023] Open
Abstract
Objective Due to the increased likelihood of progression of severe pneumonia, the mortality rate of the elderly infected with coronavirus disease 2019 (COVID-19) is high. However, there is a lack of models based on immunoglobulin G (IgG) subtypes to forecast the severity of COVID-19 in elderly individuals. The objective of this study was to create and verify a new algorithm for distinguishing elderly individuals with severe COVID-19. Methods In this study, laboratory data were gathered from 103 individuals who had confirmed severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection using a retrospective analysis. These individuals were split into training (80%) and testing cohort (20%) by using random allocation. Furthermore, 22 COVID-19 elderly patients from the other two centers were divided into an external validation cohort. Differential indicators were analyzed through univariate analysis, and variable selection was performed using least absolute shrinkage and selection operator (LASSO) regression. The severity of elderly patients with COVID-19 was predicted using a combination of five machine learning algorithms. Area under the curve (AUC) was utilized to evaluate the performance of these models. Calibration curves, decision curves analysis (DCA), and Shapley additive explanations (SHAP) plots were utilized to interpret and evaluate the model. Results The logistic regression model was chosen as the best machine learning model with four principal variables that could predict the probability of COVID-19 severity. In the training cohort, the model achieved an AUC of 0.889, while in the testing cohort, it obtained an AUC of 0.824. The calibration curve demonstrated excellent consistency between actual and predicted probabilities. According to the DCA curve, it was evident that the model provided significant clinical advantages. Moreover, the model performed effectively in an external validation group (AUC=0.74). 
Conclusion The present study developed a model that can distinguish between severe and non-severe patients of COVID-19 in the elderly, which might assist clinical doctors in evaluating the severity of COVID-19 and reducing the bad outcomes of elderly patients.
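The AUC values quoted in this abstract can be read as the probability that a randomly chosen severe case receives a higher predicted score than a randomly chosen non-severe case. A minimal sketch of that pairwise definition follows; the function name `auc` and the toy scores are assumptions for illustration, not data from the study.

```python
def auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly,
    counting ties as one half (the Mann-Whitney formulation)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical predicted probabilities and true severity labels.
scores = [0.9, 0.8, 0.7, 0.4, 0.35, 0.1]
labels = [1, 1, 0, 1, 0, 0]
value = auc(scores, labels)  # 8 of the 9 positive/negative pairs are ordered correctly
```

With a cohort as small as the 22-patient external validation group, each mis-ordered pair moves the estimate noticeably, so an AUC from such a group is best read with its uncertainty in mind.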
Affiliation(s)
- Zhenchao Zhuang
- Department of Laboratory Medicine, The First Affiliated Hospital of Zhejiang Chinese Medical University (Zhejiang Provincial Hospital of Chinese Medicine), Hangzhou, China
- Yuxiang Qi
- School of Medical Technology and Information Engineering, Zhejiang Chinese Medical University, Hangzhou, China
- Yimin Yao
- Department of Laboratory Medicine, The First Affiliated Hospital of Zhejiang Chinese Medical University (Zhejiang Provincial Hospital of Chinese Medicine), Hangzhou, China
- Ying Yu
- Department of Laboratory Medicine, The First Affiliated Hospital of Zhejiang Chinese Medical University (Zhejiang Provincial Hospital of Chinese Medicine), Hangzhou, China
|
34
|
Benjamin R. Reproduction number projection for the COVID-19 pandemic. Advances in Continuous and Discrete Models 2023; 2023:46. [DOI: 10.1186/s13662-023-03792-2] [Received: 11/03/2022] [Accepted: 11/10/2023] [Indexed: 01/02/2025]
Abstract
The recently derived Hybrid-Incidence Susceptible-Transmissible-Removed (HI-STR) prototype is a deterministic compartment model for epidemics and an alternative to the Susceptible-Infected-Removed (SIR) model. The HI-STR predicts that pathogen transmission depends on host population characteristics including population size, population density and social behaviour common within that population. The HI-STR prototype is applied to the ancestral Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV2) to show that the original estimates of the Coronavirus Disease 2019 (COVID-19) basic reproduction number $\mathcal{R}_{0}$ for the United Kingdom (UK) could have been projected onto the individual states of the United States of America (USA) prior to being detected in the USA. The Imperial College London (ICL) group’s estimate of $\mathcal{R}_{0}$ for the UK is projected onto each USA state. The difference between these projections and the ICL’s estimates for USA states is either not statistically significant on the paired Student t-test or not epidemiologically significant. The SARS-CoV2 Delta variant’s $\mathcal{R}_{0}$ is also projected from the UK to the USA to prove that projection can be applied to a Variant of Concern (VOC). Projection provides both a localised baseline for evaluating the implementation of an intervention policy and a mechanism for anticipating the impact of a VOC before local manifestation.
|
35
|
Moulaei K, Sharifi H, Bahaadinbeigy K, Haghdoost AA, Nasiri N. Machine learning for prediction of viral hepatitis: A systematic review and meta-analysis. Int J Med Inform 2023; 179:105243. [PMID: 37806178 DOI: 10.1016/j.ijmedinf.2023.105243] [Received: 08/22/2023] [Revised: 09/21/2023] [Accepted: 10/01/2023] [Indexed: 10/10/2023]
Abstract
BACKGROUND Lack of accurate and timely diagnosis of hepatitis poses obstacles to effective treatment, disease progression prevention, complication reduction, and life-saving interventions of patients. Utilizing machine learning can greatly enhance the achievement of timely and precise disease diagnosis. Therefore, we carried out this systematic review and meta-analysis to explore the performance of machine learning algorithms in predicting viral hepatitis. METHODS Using an extensive literature search in PubMed, Scopus, and Web of Science databases until June 15, 2023, English publications on hepatitis prediction using machine learning algorithms were included. Two authors independently extracted pertinent information from the selected studies. The PRISMA 2020 checklist was followed for study selection and result reporting. The risk of bias was checked using the International Journal of Medical Informatics (IJMEDI) checklist. Data were analyzed using the 'metandi' command in Stata 17. RESULTS Twenty-one original studies were included, covering 82 algorithms. Sixteen studies utilized five algorithms to predict hepatitis B. Ten studies used five algorithms for hepatitis C prediction. For hepatitis B prediction, the SVM algorithms demonstrated the highest sensitivity (90.0%; 95% confidence interval (CI): 77.0%-96.0%), specificity (94%; 95% CI: 90.0%-97.0%), and a diagnostic odds ratio (DOR) of 145 (95% CI: 37.0-559.0). In the case of hepatitis C, the KNN algorithms exhibited the highest sensitivity (80%; 95% CI:30.0%-97.0%), specificity (95%; 95% CI: 58.0%-99.0%), and DOR (72; 95% CI: 3.0-1644.0) for prediction. CONCLUSION SVM and KNN demonstrated superior performance in predicting hepatitis. The proper algorithm along with clinical practice could improve hepatitis prediction and management.
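The diagnostic odds ratio (DOR) reported in this abstract is related to sensitivity and specificity by the standard definition DOR = LR+ / LR-, where LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The sketch below plugs the abstract's pooled point estimates into that definition; the function name is an assumption, and because the pooled DORs in the abstract (145 and 72) come from the meta-analytic model rather than from this plug-in calculation, the two are expected to agree only approximately.

```python
def diagnostic_odds_ratio(sensitivity, specificity):
    """Standard definition: ratio of the positive to the negative likelihood ratio."""
    positive_lr = sensitivity / (1.0 - specificity)
    negative_lr = (1.0 - sensitivity) / specificity
    return positive_lr / negative_lr


# Point estimates quoted in the abstract.
svm_hbv = diagnostic_odds_ratio(0.90, 0.94)  # SVM, hepatitis B: about 141
knn_hcv = diagnostic_odds_ratio(0.80, 0.95)  # KNN, hepatitis C: about 76
```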
Affiliation(s)
- Khadijeh Moulaei
- Department of Health Information Technology, Faculty of Paramedical, Ilam University of Medical Sciences, Ilam, Iran
- Hamid Sharifi
- HIV/STI Surveillance Research Center, and WHO Collaborating Center for HIV Surveillance, Institute for Futures Studies in Health, Kerman University of Medical Sciences, Kerman, Iran
- Ali Akbar Haghdoost
- Modeling in Health Research Center, Institute for Futures Studies in Health, Kerman University of Medical Sciences, Kerman, Iran
- Naser Nasiri
- School of Public Health, Jiroft University of Medical Sciences, Jiroft, Kerman, Iran.
|
36
|
Chimbunde E, Sigwadhi LN, Tamuzi JL, Okango EL, Daramola O, Ngah VD, Nyasulu PS. Machine learning algorithms for predicting determinants of COVID-19 mortality in South Africa. Front Artif Intell 2023; 6:1171256. [PMID: 37899965 PMCID: PMC10600470 DOI: 10.3389/frai.2023.1171256] [Received: 03/21/2023] [Accepted: 08/15/2023] [Indexed: 10/31/2023] Open
Abstract
Background COVID-19 has strained healthcare resources, necessitating efficient prognostication to triage patients effectively. This study quantified COVID-19 risk factors and predicted COVID-19 intensive care unit (ICU) mortality in South Africa based on machine learning algorithms. Methods Data for this study were obtained from 392 COVID-19 ICU patients enrolled between 26 March 2020 and 10 February 2021. We used an artificial neural network (ANN) and random forest (RF) to predict mortality among ICU patients and a semi-parametric logistic regression with nine covariates, including a grouping variable based on K-means clustering. Further evaluation of the algorithms was performed using sensitivity, accuracy, specificity, and Cohen's K statistics. Results From the semi-parametric logistic regression and ANN variable importance, age, gender, cluster, presence of severe symptoms, being on the ventilator, and comorbidities of asthma significantly contributed to ICU death. In particular, the odds of mortality were six times higher among asthmatic patients than non-asthmatic patients. In univariable and multivariate regression, advanced age, PF1 and 2, FiO2, severe symptoms, asthma, oxygen saturation, and cluster 4 were strongly predictive of mortality. The RF model revealed that intubation status, age, cluster, diabetes, and hypertension were the top five significant predictors of mortality. The ANN performed well with an accuracy of 71%, a precision of 83%, an F1 score of 100%, Matthew's correlation coefficient (MCC) score of 100%, and a recall of 88%. In addition, Cohen's k-value of 0.75 verified the most extreme discriminative power of the ANN. In comparison, the RF model provided a 76% recall, an 87% precision, and a 65% MCC. Conclusion Based on the findings, we can conclude that both ANN and RF can predict COVID-19 mortality in the ICU with accuracy. The proposed models accurately predict the prognosis of COVID-19 patients after diagnosis. 
The models can be used to prioritize COVID-19 patients with a high mortality risk in resource-constrained ICUs.
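Several of the metrics listed in this abstract are arithmetically linked. Under the standard definition, the F1 score is the harmonic mean of precision and recall, so it cannot exceed the larger of the two. The short sketch below applies that definition to the precision (83%) and recall (88%) quoted for the ANN; the function name is an assumption, and the result is only what the standard formula implies, which is worth checking against the full paper given the 100% F1 stated in the abstract.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as quoted for the ANN in the abstract.
implied_f1 = f1_score(0.83, 0.88)  # roughly 0.854
```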
Affiliation(s)
- Emmanuel Chimbunde
- Division of Epidemiology and Biostatistics, Department of Global Health, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
- Lovemore N. Sigwadhi
- Division of Epidemiology and Biostatistics, Department of Global Health, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
- Jacques L. Tamuzi
- Division of Epidemiology and Biostatistics, Department of Global Health, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
- Olawande Daramola
- Department of Information Technology, Faculty of Informatics and Design, Cape Peninsula University of Technology, Cape Town, South Africa
- Veranyuy D. Ngah
- Division of Epidemiology and Biostatistics, Department of Global Health, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
- Peter S. Nyasulu
- Division of Epidemiology and Biostatistics, Department of Global Health, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
- Division of Epidemiology and Biostatistics, School of Public Health, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
|
37
|
Alkhammash EH, Assiri SA, Nemenqani DM, Althaqafi RMM, Hadjouni M, Saeed F, Elshewey AM. Application of Machine Learning to Predict COVID-19 Spread via an Optimized BPSO Model. Biomimetics (Basel) 2023; 8:457. [PMID: 37887588 PMCID: PMC10604133 DOI: 10.3390/biomimetics8060457] [Received: 08/26/2023] [Revised: 09/21/2023] [Accepted: 09/21/2023] [Indexed: 10/28/2023] Open
Abstract
During the coronavirus disease (COVID-19) pandemic, statistics showed that the number of affected cases differed from one country to another and also from one city to another. Therefore, in this paper, we provide an enhanced model for predicting COVID-19 samples in different regions of Saudi Arabia (high-altitude and sea-level areas). The model was developed in several stages and was trained and tested using two datasets collected from Taif city (high-altitude area) and Jeddah city (sea-level area) in Saudi Arabia. Binary particle swarm optimization (BPSO) is used in this study for feature selection with three different machine learning models, i.e., the random forest model, gradient boosting model, and naive Bayes model. Several evaluation metrics, including accuracy, training score, testing score, F-measure, recall, precision, and the receiver operating characteristic (ROC) curve, were calculated to verify the performance of the three machine learning models on these datasets. The experimental results demonstrated that the gradient boosting model gives better results than the random forest and naive Bayes models, with an accuracy of 94.6% on the Taif city dataset. For the Jeddah city dataset, the random forest model outperforms the gradient boosting and naive Bayes models, with an accuracy of 95.5%. In terms of accuracy, the enhanced model achieved better results on the Jeddah city dataset than on the Taif city dataset.
Affiliation(s)
- Eman H. Alkhammash
- Department of Computer Science, College of Computers and Information Technology, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia
- Sara Ahmad Assiri
- Otolaryngology-Head and Neck Surgery Department, King Faisal Hospital, P.O. Box 11099, Taif 21944, Saudi Arabia
- Dalal M. Nemenqani
- College of Medicine, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia
- Raad M. M. Althaqafi
- College of Medicine, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia
- Myriam Hadjouni
- Department of Computer Sciences, College of Computer and Information Science, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
- Faisal Saeed
- DAAI Research Group, Department of Computing and Data Science, School of Computing and Digital Technology, Birmingham City University, Birmingham B4 7XG, UK
- Ahmed M. Elshewey
- Faculty of Computers and Information, Computer Science Department, Suez University, Suez 43533, Egypt
|
38
|
Zakariaee SS, Naderi N, Ebrahimi M, Kazemi-Arpanahi H. Comparing machine learning algorithms to predict COVID‑19 mortality using a dataset including chest computed tomography severity score data. Sci Rep 2023; 13:11343. [PMID: 37443373 PMCID: PMC10345104 DOI: 10.1038/s41598-023-38133-6] [Received: 03/02/2023] [Accepted: 07/04/2023] [Indexed: 07/15/2023] Open
Abstract
Since the beginning of the COVID-19 pandemic, new and non-invasive digital technologies such as artificial intelligence (AI) have been introduced for mortality prediction of COVID-19 patients. The prognostic performance of machine learning (ML)-based models for predicting clinical outcomes of COVID-19 patients has mainly been evaluated using demographics, risk factors, clinical manifestations, and laboratory results. There is a lack of information about the prognostic role of imaging manifestations in combination with demographics, clinical manifestations, and laboratory predictors. The purpose of the present study is to develop an efficient ML prognostic model based on a more comprehensive dataset including chest CT severity score (CT-SS). Fifty-five primary features in six main classes were retrospectively reviewed for 6854 suspected cases. The Chi-square test of independence was used to determine the most important features in the mortality prediction of COVID-19 patients. The most relevant predictors were used to train and test ML algorithms. The predictive models were developed using eight ML algorithms: the J48 decision tree (J48), support vector machine (SVM), multi-layer perceptron (MLP), k-nearest neighbours (k-NN), Naïve Bayes (NB), logistic regression (LR), random forest (RF), and eXtreme gradient boosting (XGBoost). The performance of the predictive models was evaluated using accuracy, precision, sensitivity, specificity, and area under the ROC curve (AUC) metrics. After applying the exclusion criteria, the final sample comprised 815 RT-PCR-positive patients, of whom 54.85% were male; the mean age of the study population was 57.22 ± 16.76 years. The RF algorithm had the best performance, with an accuracy of 97.2%, sensitivity of 100%, precision of 94.8%, specificity of 94.5%, F1-score of 97.3%, and AUC of 99.9%.
The other ML algorithms, with AUC ranging from 81.2 to 93.9%, also performed well in predicting COVID-19 mortality. Results showed that timely and accurate risk stratification of COVID-19 patients could be performed using ML-based predictive models fed by routine data. The proposed algorithm with the more comprehensive dataset including CT-SS could efficiently predict the mortality of COVID-19 patients. This could lead to promptly targeting high-risk patients on admission, the optimal use of hospital resources, and an increased probability of survival of patients.
Affiliation(s)
- Negar Naderi
- Department of Midwifery, Ilam University of Medical Sciences, Ilam, Iran
- Mahdi Ebrahimi
- Department of Emergency Medicine, Tehran University of Medical Sciences, Tehran, Iran
- Hadi Kazemi-Arpanahi
- Department of Health Information Technology, Abadan University of Medical Sciences, Abadan, Iran.
|
39
|
Kablan R, Miller HA, Suliman S, Frieboes HB. Evaluation of stacked ensemble model performance to predict clinical outcomes: A COVID-19 study. Int J Med Inform 2023; 175:105090. [PMID: 37172507 PMCID: PMC10165871 DOI: 10.1016/j.ijmedinf.2023.105090] [Received: 01/16/2023] [Revised: 04/17/2023] [Accepted: 05/03/2023] [Indexed: 05/15/2023]
Abstract
BACKGROUND The application of machine learning (ML) to analyze clinical data with the goal to predict patient outcomes has garnered increasing attention. Ensemble learning has been used in conjunction with ML to improve predictive performance. Although stacked generalization (stacking), a type of heterogeneous ensemble of ML models, has emerged in clinical data analysis, it remains unclear how to define the best model combinations for strong predictive performance. This study develops a methodology to evaluate the performance of "base" learner models and their optimized combination using "meta" learner models in stacked ensembles to accurately assess performance in the context of clinical outcomes. METHODS De-identified COVID-19 data was obtained from the University of Louisville Hospital, where a retrospective chart review was performed from March 2020 to November 2021. Three differently-sized subsets using features from the overall dataset were chosen to train and evaluate ensemble classification performance. The number of base learners chosen from several algorithm families coupled with a complementary meta learner was varied from a minimum of 2 to a maximum of 8. Predictive performance of these combinations was evaluated in terms of mortality and severe cardiac event outcomes using area-under-the-receiver-operating-characteristic (AUROC), F1, balanced accuracy, and kappa. RESULTS The results highlight the potential to accurately predict clinical outcomes, such as severe cardiac events with COVID-19, from routinely acquired in-hospital patient data. Meta learners Generalized Linear Model (GLM), Multi-Layer Perceptron (MLP), and Partial Least Squares (PLS) had the highest AUROC for both outcomes, while K-Nearest Neighbors (KNN) had the lowest. Performance trended lower in the training set as the number of features increased, and exhibited less variance in both training and validation across all feature subsets as the number of base learners increased. 
CONCLUSION This study offers a methodology to robustly evaluate ensemble ML performance when analyzing clinical data.
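Two of the metrics named in this abstract, balanced accuracy and kappa, can be computed directly from a binary confusion matrix. The sketch below is illustrative only: it assumes Cohen's kappa is the kappa meant, and the counts are hypothetical rather than taken from the study.

```python
def balanced_accuracy(tp, fp, tn, fn):
    """Mean of sensitivity and specificity for a binary outcome."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def cohen_kappa(tp, fp, tn, fn):
    """Agreement between predictions and labels, corrected for chance."""
    n = tp + fp + tn + fn
    observed = (tp + tn) / n
    # Chance agreement from the marginal totals of each class.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical counts, not taken from the study.
counts = dict(tp=40, fp=10, tn=130, fn=20)
print(round(balanced_accuracy(**counts), 3))
print(round(cohen_kappa(**counts), 3))
```

Both metrics correct for class imbalance in different ways, which is one plausible reason to report them alongside AUROC and F1 when the outcome is rare.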
Affiliation(s)
- Rianne Kablan
  - Department of Bioengineering, University of Louisville, Louisville, KY, USA
- Hunter A Miller
  - Department of Bioengineering, University of Louisville, Louisville, KY, USA
- Hermann B Frieboes
  - Department of Bioengineering, University of Louisville, Louisville, KY, USA; James Graham Brown Cancer Center, University of Louisville, Louisville, KY, USA; Center for Predictive Medicine, University of Louisville, Louisville, KY, USA.
|
40
|
Yazdani A, Bigdeli SK, Zahmatkeshan M. Investigating the performance of machine learning algorithms in predicting the survival of COVID-19 patients: A cross section study of Iran. Health Sci Rep 2023; 6:e1212. [PMID: 37064314 PMCID: PMC10099201 DOI: 10.1002/hsr2.1212] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/10/2022] [Revised: 03/23/2023] [Accepted: 03/30/2023] [Indexed: 04/18/2023] Open
Abstract
Background and Aims Like early diagnosis, predicting the survival of patients with Coronavirus Disease 2019 (COVID-19) is of great importance. Survival prediction models help doctors take more care in treating patients who are at high risk of dying because of their medical conditions. This study aims to predict the survival of hospitalized patients with COVID-19 by comparing the accuracy of machine learning (ML) models. Methods This cross-sectional study was performed in 2022 in Fasa, Iran. The research data set covers the period from February 18, 2020 to February 10, 2021, and contains the records of 2442 hospitalized patients with 84 features. The efficiency of five ML algorithms in predicting survival was compared: Naive Bayes (NB), K-nearest neighbors (KNN), random forest (RF), decision tree (DT), and multilayer perceptron (MLP). Modeling was done in Python in the Anaconda Navigator 3 environment. Results Our findings show that the NB algorithm performed better than the others, with accuracy, precision, recall, F-score, and area under the receiver operating characteristic curve of 97%, 96%, 96%, 96%, and 97%, respectively. Based on the analysis of factors affecting survival, heart disease, pulmonary diseases and blood-related disease were the most important diseases related to death. Conclusion The development of software systems based on NB will be effective in predicting the survival of COVID-19 patients.
Affiliation(s)
- Azita Yazdani
  - Department of Health Information Management, School of Health Management and Information Sciences, Shiraz University of Medical Sciences, Shiraz, Iran
  - Clinical Education Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
  - Health Human Resources Research Center, School of Health Management and Information Sciences, Shiraz University of Medical Sciences, Shiraz, Iran
- Somayeh Kianian Bigdeli
  - Health Information Management Department, School of Allied Medical Sciences, Tehran University of Medical Sciences, Tehran, Iran
- Maryam Zahmatkeshan
  - Noncommunicable Diseases Research Center, Fasa University of Medical Sciences, Fasa, Iran
  - School of Allied Medical Sciences, Fasa University of Medical Sciences, Fasa, Iran
|
41
|
Buttia C, Llanaj E, Raeisi-Dehkordi H, Kastrati L, Amiri M, Meçani R, Taneri PE, Ochoa SAG, Raguindin PF, Wehrli F, Khatami F, Espínola OP, Rojas LZ, de Mortanges AP, Macharia-Nimietz EF, Alijla F, Minder B, Leichtle AB, Lüthi N, Ehrhard S, Que YA, Fernandes LK, Hautz W, Muka T. Prognostic models in COVID-19 infection that predict severity: a systematic review. Eur J Epidemiol 2023; 38:355-372. [PMID: 36840867 PMCID: PMC9958330 DOI: 10.1007/s10654-023-00973-x] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 03/29/2022] [Accepted: 01/28/2023] [Indexed: 02/26/2023]
Abstract
Current evidence on COVID-19 prognostic models is inconsistent and clinical applicability remains controversial. We performed a systematic review to summarize and critically appraise the available studies that have developed, assessed and/or validated prognostic models of COVID-19 predicting health outcomes. We searched six bibliographic databases to identify published articles that investigated univariable and multivariable prognostic models predicting adverse outcomes in adult COVID-19 patients, including intensive care unit (ICU) admission, intubation, high-flow nasal therapy (HFNT), extracorporeal membrane oxygenation (ECMO) and mortality. We identified and assessed 314 eligible articles from more than 40 countries, with 152 of these studies presenting mortality, 66 progression to severe or critical illness, 35 mortality and ICU admission combined, 17 ICU admission only, while the remaining 44 studies reported prediction models for mechanical ventilation (MV) or a combination of multiple outcomes. The sample size of included studies varied from 11 to 7,704,171 participants, with a mean age ranging from 18 to 93 years. There were 353 prognostic models investigated, with area under the curve (AUC) ranging from 0.44 to 0.99. A great proportion of studies (61.5%, 193 out of 314) performed internal or external validation or replication. In 312 (99.4%) studies, prognostic models were reported to be at high risk of bias due to uncertainties and challenges surrounding methodological rigor, sampling, handling of missing data, failure to deal with overfitting and heterogeneous definitions of COVID-19 and severity outcomes. While several clinical prognostic models for COVID-19 have been described in the literature, they are limited in generalizability and/or applicability due to deficiencies in addressing fundamental statistical and methodological concerns. 
Future large, multi-centric and well-designed prognostic prospective studies are needed to clarify remaining uncertainties.
Affiliation(s)
- Chepkoech Buttia
  - Institute of Social and Preventive Medicine, University of Bern, Bern, Switzerland
  - Emergency Department, Inselspital, Bern University Hospital, University of Bern, Freiburgstrasse 16C, 3010 Bern, Switzerland
  - Epistudia, Bern, Switzerland
- Erand Llanaj
  - Department of Molecular Epidemiology, German Institute of Human Nutrition Potsdam-Rehbrücke, Nuthetal, Germany
  - ELKH-DE Public Health Research Group of the Hungarian Academy of Sciences, Department of Public Health and Epidemiology, Faculty of Medicine, University of Debrecen, Debrecen, Hungary
  - Epistudia, Bern, Switzerland
  - German Center for Diabetes Research (DZD), München-Neuherberg, Germany
- Hamidreza Raeisi-Dehkordi
  - Institute of Social and Preventive Medicine, University of Bern, Bern, Switzerland
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Lum Kastrati
  - Institute of Social and Preventive Medicine, University of Bern, Bern, Switzerland
  - Graduate School for Health Sciences, University of Bern, Bern, Switzerland
  - Department of Diabetes, Endocrinology, Nutritional Medicine and Metabolism, Inselspital, Bern University Hospital, University of Bern, Bern, Switzerland
- Mojgan Amiri
  - Department of Epidemiology, Erasmus MC University Medical Center, Rotterdam, The Netherlands
- Renald Meçani
  - Department of Pediatrics, “Mother Teresa” University Hospital Center, Tirana, University of Medicine, Tirana, Albania
  - Division of Endocrinology and Diabetology, Department of Internal Medicine, Medical University of Graz, Graz, Austria
- Petek Eylul Taneri
  - Institute of Social and Preventive Medicine, University of Bern, Bern, Switzerland
  - HRB-Trials Methodology Research Network College of Medicine, Nursing and Health Sciences University of Galway, Galway, Ireland
- Peter Francis Raguindin
  - Institute of Social and Preventive Medicine, University of Bern, Bern, Switzerland
  - Swiss Paraplegic Research, Nottwil, Switzerland
  - Faculty of Health Sciences, University of Lucerne, Lucerne, Switzerland
- Faina Wehrli
  - Institute of Social and Preventive Medicine, University of Bern, Bern, Switzerland
- Farnaz Khatami
  - Institute of Social and Preventive Medicine, University of Bern, Bern, Switzerland
  - Graduate School for Health Sciences, University of Bern, Bern, Switzerland
  - Department of Community Medicine, Tehran University of Medical Sciences, Tehran, Iran
- Octavio Pano Espínola
  - Institute of Social and Preventive Medicine, University of Bern, Bern, Switzerland
  - Department of Preventive Medicine and Public Health, University of Navarre, Pamplona, Spain
  - Navarra Institute for Health Research, IdiSNA, Pamplona, Spain
- Lyda Z. Rojas
  - Research Group and Development of Nursing Knowledge (GIDCEN-FCV), Research Center, Cardiovascular Foundation of Colombia, Floridablanca, Santander, Colombia
- Fadi Alijla
  - Institute of Social and Preventive Medicine, University of Bern, Bern, Switzerland
  - Graduate School for Health Sciences, University of Bern, Bern, Switzerland
- Beatrice Minder
  - Public Health and Primary Care Library, University Library of Bern, University of Bern, Bern, Switzerland
- Alexander B. Leichtle
  - University Institute of Clinical Chemistry, Inselspital, Bern University Hospital, and Center for Artificial Intelligence in Medicine (CAIM), University of Bern, Bern, Switzerland
- Nora Lüthi
  - Emergency Department, Inselspital, Bern University Hospital, University of Bern, Freiburgstrasse 16C, 3010 Bern, Switzerland
- Simone Ehrhard
  - Emergency Department, Inselspital, Bern University Hospital, University of Bern, Freiburgstrasse 16C, 3010 Bern, Switzerland
- Yok-Ai Que
  - Department of Intensive Care Medicine, Inselspital, Bern University Hospital, University of Bern, Bern, Switzerland
- Laurenz Kopp Fernandes
  - Deutsches Herzzentrum Berlin (DHZB), Berlin, Germany
  - Charité Universitätsmedizin Berlin, Berlin, Germany
- Wolf Hautz
  - Emergency Department, Inselspital, Bern University Hospital, University of Bern, Freiburgstrasse 16C, 3010 Bern, Switzerland
- Taulant Muka
  - Institute of Social and Preventive Medicine, University of Bern, Bern, Switzerland
  - Epistudia, Bern, Switzerland
|
42
|
Barough SS, Safavi-Naini SAA, Siavoshi F, Tamimi A, Ilkhani S, Akbari S, Ezzati S, Hatamabadi H, Pourhoseingholi MA. Generalizable machine learning approach for COVID-19 mortality risk prediction using on-admission clinical and laboratory features. Sci Rep 2023; 13:2399. [PMID: 36765157 PMCID: PMC9911952 DOI: 10.1038/s41598-023-28943-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2022] [Accepted: 01/27/2023] [Indexed: 02/12/2023] Open
Abstract
We aimed to propose a mortality risk prediction model using on-admission clinical and laboratory predictors. We used a dataset of confirmed COVID-19 patients admitted to three general hospitals in Tehran. Clinical and laboratory values were gathered on admission. Six different machine learning models and two feature selection methods were used to assess the risk of in-hospital mortality. The proposed model was selected using the area under the receiver operating characteristic curve (AUC). Furthermore, a dataset from an additional hospital was used for external validation. A total of 5320 hospitalized COVID-19 patients were enrolled in the study, with a mortality rate of 17.24% (N = 917). Among 82 features, ten laboratory and 27 clinical features were selected by LASSO. All methods showed acceptable performance (AUC > 80%), except for K-nearest neighbor. Our proposed deep neural network on features selected by LASSO showed AUC scores of 83.4% and 82.8% in internal and external validation, respectively. Furthermore, our imputer worked efficiently when two out of ten laboratory parameters were missing (AUC = 81.8%). We worked closely with healthcare professionals to provide a tool that can solve real-world needs. Our model confirmed the potential of machine learning methods for use in clinical practice as a decision-support system.
Affiliation(s)
- Siavash Shirzadeh Barough
  - Basic and Molecular Epidemiology of Gastrointestinal Disorders Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Seyed Amir Ahmad Safavi-Naini
  - Basic and Molecular Epidemiology of Gastrointestinal Disorders Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Fatemeh Siavoshi
  - Basic and Molecular Epidemiology of Gastrointestinal Disorders Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Atena Tamimi
  - Basic and Molecular Epidemiology of Gastrointestinal Disorders Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Saba Ilkhani
  - Department of Surgery, Center for Surgery and Public Health, Brigham and Women's Hospital, Harvard Medical School and Harvard T.H Chan School of Public Health, Boston, MA, USA
- Setareh Akbari
  - Basic and Molecular Epidemiology of Gastrointestinal Disorders Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Sadaf Ezzati
  - Basic and Molecular Epidemiology of Gastrointestinal Disorders Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Hamidreza Hatamabadi
  - Department of Emergency Medicine, School of Medicine, Safety Promotion and Injury Prevention Research Center, Imam Hossein Hospital, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Mohamad Amin Pourhoseingholi
  - Basic and Molecular Epidemiology of Gastrointestinal Disorders Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran.
|
43
|
Cavallazzi R, Bradley J, Chandler T, Furmanek S, Ramirez JA. Severity of Illness Scores and Biomarkers for Prognosis of Patients with Coronavirus Disease 2019. Semin Respir Crit Care Med 2023; 44:75-90. [PMID: 36646087 DOI: 10.1055/s-0042-1759567] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/18/2023]
Abstract
The spectrum of disease severity and the insidiousness of clinical presentation make it difficult to recognize patients with coronavirus disease 2019 (COVID-19) at higher risk of worse outcomes or death when they are seen in the early phases of the disease. There are now well-established risk factors for worse outcomes in patients with COVID-19. These should be factored in when assessing the prognosis of these patients. However, a more precise prognostic assessment in an individual patient may warrant the use of predictive tools. In this manuscript, we conduct a literature review on the severity of illness scores and biomarkers for the prognosis of patients with COVID-19. Several COVID-19-specific scores have been developed since the onset of the pandemic. Some of them are promising and can be integrated into the assessment of these patients. We also found that the well-known pneumonia severity index (PSI) and CURB-65 (confusion, uremia, respiratory rate, BP, age ≥ 65 years) are good predictors of mortality in hospitalized patients with COVID-19. While neither the PSI nor the CURB-65 should be used for the triage of outpatient versus inpatient treatment, they can be integrated by a clinician into the assessment of disease severity and can be used in epidemiological studies to determine the severity of illness in patient populations. Biomarkers also provide valuable prognostic information and, importantly, may depict the main physiological derangements in severe disease. We, however, do not advocate the isolated use of severity of illness scores or biomarkers for decision-making in an individual patient. Instead, we suggest the use of these tools on a case-by-case basis with the goal of enhancing clinician judgment.
Affiliation(s)
- Rodrigo Cavallazzi
  - Division of Pulmonary, Critical Care Medicine, and Sleep Disorders, University of Louisville, Norton Healthcare, Louisville, Kentucky
- James Bradley
  - Division of Pulmonary, Critical Care Medicine, and Sleep Disorders, University of Louisville, Norton Healthcare, Louisville, Kentucky
- Thomas Chandler
  - Norton Infectious Diseases Institute, Norton Healthcare, Louisville, Kentucky
- Stephen Furmanek
  - Norton Infectious Diseases Institute, Norton Healthcare, Louisville, Kentucky
- Julio A Ramirez
  - Norton Infectious Diseases Institute, Norton Healthcare, Louisville, Kentucky
|
44
|
Donat N, Mellati N, Frumento T, Cirodde A, Gette S, Guitard PG, Hoffmann C, Veber B, Leclerc T. Validation of a pre-established triage protocol for critically ill patients in a COVID-19 outbreak under resource scarcity: A retrospective multicenter cohort study. PLoS One 2023; 18:e0285690. [PMID: 37167306 PMCID: PMC10174588 DOI: 10.1371/journal.pone.0285690] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2022] [Accepted: 04/28/2023] [Indexed: 05/13/2023] Open
Abstract
INTRODUCTION In case of COVID-19 related scarcity of critical care resources, an early French triage algorithm categorized critically ill patients by probability of survival based on medical history and severity, with four priority levels for initiation or continuation of critical care: P1 (high priority), P2 (intermediate priority), P3 (not needed), P4 (not appropriate). This retrospective multi-center study aimed to assess its classification performance and its ability to help save lives under capacity saturation. METHODS ICU patients admitted for severe COVID-19 without triage in spring 2020 were retrospectively included from three hospitals. Demographic data, medical history and severity items were collected. Priority levels were retrospectively allocated at ICU admission and on ICU day 7-10. Mortality rate, cumulative incidence of death and of alive ICU discharge, length of ICU stay and of mechanical ventilation were compared between priority levels. Calculated mortality and survival were compared between full simulated triage and no triage. RESULTS A total of 225 patients were included, aged 63.1±11.9 years. Median SAPS2 was 40 (IQR 29-49). At the end of follow-up, 61 (27%) had died, 26 were still in ICU, and 138 had been discharged. Following retrospective initial priority allocation, mortality rate was 53% among P4 patients (95% CI 34-72%) versus 23% among all P1 to P3 patients (95% CI 17-30%, chi-squared p = 5.2e-4). The cumulative incidence of death consistently increased in the order P3, P1, P2 and P4 both at admission (Gray's test p = 3.1e-5) and at reassessment (p = 8e-5), and conversely for that of alive ICU discharge. Reassessment strengthened consistency. Simulation under saturation showed that this two-step triage protocol could have saved 28 to 40 more lives than no triage. 
CONCLUSION Although it cannot eliminate potentially avoidable deaths, this triage protocol proved able to adequately prioritize critical care for patients with highest probability of survival, hence to save more lives if applied.
Affiliation(s)
- Nicolas Donat
  - Burn Treatment Center and COVID-19 ICU, Percy Military Teaching Hospital, Clamart, France
- Nouchan Mellati
  - ICU, Mercy Regional Hospital, Metz, France
  - Legouest Military Teaching Hospital, Metz, France
- Audrey Cirodde
  - Burn Treatment Center and COVID-19 ICU, Percy Military Teaching Hospital, Clamart, France
- Clément Hoffmann
  - Burn Treatment Center and COVID-19 ICU, Percy Military Teaching Hospital, Clamart, France
- Benoît Veber
  - ICU, Rouen University Hospital, Rouen, France
  - Faculty of Medicine, Rouen University, Rouen, France
- Thomas Leclerc
  - Burn Treatment Center and COVID-19 ICU, Percy Military Teaching Hospital, Clamart, France
  - Val-de-Grâce Military Medical Academy, Paris, France
|
45
|
Zakariaee SS, Abdi AI, Naderi N, Babashahi M. Prognostic significance of chest CT severity score in mortality prediction of COVID-19 patients, a machine learning study. The Egyptian Journal of Radiology and Nuclear Medicine 2023; 54:73. [PMCID: PMC10116092 DOI: 10.1186/s43055-023-01022-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/19/2022] [Accepted: 04/13/2023] [Indexed: 04/05/2024] Open
Abstract
Background The high mortality rate of COVID-19 makes early identification of high-risk patients with poor prognoses necessary. Although an association between the chest CT severity score (CT-SS) and mortality of COVID-19 patients has been reported, its prognostic significance in combination with other prognostic parameters has not yet been evaluated. Methods This retrospective single-center study reviewed a total of 6854 suspected patients referred to Imam Khomeini hospital, Ilam city, west of Iran, from February 9, 2020 to December 20, 2020. The prognostic performances of k-Nearest Neighbors (kNN), Multilayer Perceptron (MLP), Support Vector Machine (SVM), and J48 decision tree algorithms were evaluated based on the most important and relevant predictors. The metrics derived from the confusion matrix were used to determine the performance of the ML models. Results After applying exclusion criteria, 815 hospitalized cases were entered into the study. Of these, 447 (54.85%) were male and the mean (± SD) age of participants was 57.22 (± 16.76) years. The results showed that the performances of the ML algorithms improved when they were fed the dataset with CT-SS data. The kNN model, with an accuracy of 94.1%, sensitivity of 100.0%, precision of 89.5%, specificity of 88.3%, and AUC of about 97.2%, outperformed the other three ML techniques. Conclusions The integration of CT-SS data with demographics, risk factors, clinical manifestations, and laboratory parameters improved the prognostic performances of the ML algorithms. An ML model with a comprehensive collection of predictors could identify high-risk patients more efficiently and lead to the optimal use of hospital resources.
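The confusion-matrix metrics quoted in this abstract follow standard definitions, sketched below. The counts are hypothetical and were chosen only to resemble the reported pattern (no false negatives, a few false positives); they are not the study's data.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # also called recall
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
    }


# Hypothetical counts: every positive case is caught (fn = 0),
# while a few negatives are flagged as positive (fp = 6).
metrics = confusion_metrics(tp=50, fp=6, tn=44, fn=0)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With no false negatives, sensitivity is 100% while precision and specificity stay below 100%, which matches the shape of the figures reported for the kNN model.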
Affiliation(s)
- Seyed Salman Zakariaee
  - Department of Medical Physics, Faculty of Paramedical Sciences, Ilam University of Medical Sciences, Ilam, Iran
- Aza Ismail Abdi
  - Department of Radiology, Erbil Medical Technical Institute, Erbil Polytechnic University, Erbil, Iraq
- Negar Naderi
  - Department of Midwifery, Faculty of Nursing and Midwifery, Ilam University of Medical Sciences, Ilam, Iran
- Mashallah Babashahi
  - Department of Pathology, Faculty of Paramedical Sciences, Ilam University of Medical Sciences, Ilam, Iran
|
46
|
Sievering AW, Wohlmuth P, Geßler N, Gunawardene MA, Herrlinger K, Bein B, Arnold D, Bergmann M, Nowak L, Gloeckner C, Koch I, Bachmann M, Herborn CU, Stang A. Comparison of machine learning methods with logistic regression analysis in creating predictive models for risk of critical in-hospital events in COVID-19 patients on hospital admission. BMC Med Inform Decis Mak 2022; 22:309. [PMID: 36437469 PMCID: PMC9702742 DOI: 10.1186/s12911-022-02057-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 02/21/2022] [Accepted: 11/17/2022] [Indexed: 11/29/2022] Open
Abstract
BACKGROUND Machine learning (ML) algorithms have been trained for early prediction of critical in-hospital events from COVID-19 using patient data at admission, but little is known about how their performance compares with each other and/or with statistical logistic regression (LR). This prospective multicentre cohort study compares the performance of an LR and five ML models on the contribution of influencing predictors and predictor-to-event relationships on the prediction model's performance. METHODS We used 25 baseline variables of 490 COVID-19 patients admitted to 8 hospitals in Germany (March-November 2020) to develop and validate (75/25 random-split) 3 linear (L1 and L2 penalty, elastic net [EN]) and 2 non-linear (support vector machine [SVM] with radial kernel, random forest [RF]) ML approaches for predicting critical events defined by intensive care unit transfer, invasive ventilation and/or death (composite end-point: 181 patients). Models were compared for performance (area-under-the-receiver-operating characteristic-curve [AUC], Brier score) and predictor importance (performance-loss metrics, partial-dependence profiles). RESULTS Models performed similarly, with a small benefit for LR (utilizing restricted cubic splines for non-linearity) and RF (AUC means: 0.763-0.731 [RF-L1]; Brier scores: 0.184-0.197 [LR-L1]). Top ranked predictor variables (consistently highest importance: C-reactive protein) were largely identical across models, except creatinine, which exhibited marginal (L1, L2, EN, SVM) or high/non-linear effects (LR, RF) on events. CONCLUSIONS Although the LR and ML models analysed showed no strong differences in performance and the most influencing predictors for COVID-19-related event prediction, our results indicate a predictive benefit from accounting for non-linear predictor-to-event relationships and effects. 
Future efforts should focus on leveraging data-driven ML technologies from static towards dynamic modelling solutions that continuously learn and adapt to changes in data environments during the evolving pandemic. TRIAL REGISTRATION NUMBER NCT04659187.
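The Brier score used here alongside AUC is the mean squared difference between predicted event probabilities and observed binary outcomes, so lower is better. A minimal sketch follows; the probabilities and outcomes are made-up illustration values, not data from the study.

```python
def brier_score(probabilities, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(probabilities) != len(outcomes):
        raise ValueError("probabilities and outcomes must have equal length")
    squared_errors = [(p - y) ** 2 for p, y in zip(probabilities, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Made-up predictions for four patients (1 = critical event occurred).
predicted = [0.9, 0.2, 0.6, 0.1]
observed = [1, 0, 1, 0]
print(round(brier_score(predicted, observed), 3))

# A model that always predicts 0.5 scores 0.25 regardless of the outcomes.
print(brier_score([0.5] * 4, observed))
```

Unlike AUC, which only ranks patients, the Brier score also penalizes poorly calibrated probabilities, which is presumably why the two are reported together.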
Affiliation(s)
- Peter Wohlmuth
  - Semmelweis University, Asklepios Campus Hamburg, Budapest, Hungary; Asklepios Proresearch, Research Institute, Hamburg, Germany
- Nele Geßler
  - Semmelweis University, Asklepios Campus Hamburg, Budapest, Hungary; Asklepios Proresearch, Research Institute, Hamburg, Germany; Department of Cardiology and Intensive Care Medicine, Asklepios Hospital St. Georg, Hamburg, Germany
- Melanie A Gunawardene
  - Department of Cardiology and Intensive Care Medicine, Asklepios Hospital St. Georg, Hamburg, Germany
- Klaus Herrlinger
  - Department of Internal Medicine, Asklepios Hospital Nord-Heidberg, Hamburg, Germany; Asklepios Tumorzentrum, Hamburg, Germany
- Berthold Bein
  - Department of Anesthesiology and Intensive Care Medicine, Asklepios Hospital St. Georg, Hamburg, Germany
- Dirk Arnold
  - Asklepios Tumorzentrum, Hamburg, Germany; Department of Hematology, Oncology, Palliative Care and Rheumatology, Asklepios Hospital Altona, Hamburg, Germany
- Martin Bergmann
  - Department of Internal Medicine, Cardiology, and Pneumology, Asklepios Hospital Wandsbek, Hamburg, Germany
- Lorenz Nowak
  - Department of Intensive Care and Ventilation Medicine, Asklepios Hospital München-Gauting, Gauting, Germany
- Christian Gloeckner
  - Department of Internal Medicine, Asklepios Hospital Oberviechtach, Oberviechtach, Germany
- Ina Koch
  - Biobank for Pulmonary Diseases, Asklepios Hospital München-Gauting, Gauting, Germany
- Martin Bachmann
  - Department of Intensive Care and Ventilatory Medicine, Asklepios Hospital Harburg, Hamburg, Germany
- Christoph U Herborn
  - Semmelweis University, Asklepios Campus Hamburg, Budapest, Hungary; Asklepios Hospitals GmbH & Co. KGaA, Hamburg, Germany
- Axel Stang
  - Semmelweis University, Asklepios Campus Hamburg, Budapest, Hungary; Asklepios Tumorzentrum, Hamburg, Germany; Department of Hematology, Oncology and Palliative Care Medicine, Asklepios Hospital Barmbek, Rübenkamp 220, 22291, Hamburg, Germany.
|
47
|
Ebrahimi A, Wiil UK, Naemi A, Mansourvar M, Andersen K, Nielsen AS. Identification of clinical factors related to prediction of alcohol use disorder from electronic health records using feature selection methods. BMC Med Inform Decis Mak 2022; 22:304. [PMID: 36424597 PMCID: PMC9686074 DOI: 10.1186/s12911-022-02051-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2021] [Accepted: 11/16/2022] [Indexed: 11/25/2022] Open
Abstract
BACKGROUND High dimensionality in electronic health records (EHR) causes a significant computational problem for any systematic search for predictive, diagnostic, or prognostic patterns. Feature selection (FS) methods have been reported to be effective in feature reduction as well as in identifying risk factors related to prediction of clinical disorders. This paper examines the prediction of patients with alcohol use disorder (AUD) using machine learning (ML) and attempts to identify risk factors related to the diagnosis of AUD. METHODS We used an FS framework consisting of two operational levels: base selectors and ensemble selectors. The first level consists of five FS methods: three filter methods, one wrapper method, and one embedded method. Base selector outputs are aggregated to develop four ensemble FS methods. The outputs of the FS methods were then fed into three ML algorithms: support vector machine (SVM), K-nearest neighbor (KNN), and random forest (RF) to compare and identify the best feature subset for the prediction of AUD from EHRs. RESULTS In terms of feature reduction, the embedded FS method could significantly reduce the number of features from 361 to 131. In terms of classification performance, RF based on 272 features selected by our proposed ensemble method (Union FS) achieved the highest accuracy in predicting patients with AUD (96%) and outperformed all other models in terms of AUROC, AUPRC, Precision, Recall, and F1-Score. Considering the limitations of embedded and wrapper methods, the best overall performance was achieved by our proposed Union Filter FS, which reduced the number of features to 223 and improved Precision, Recall, and F1-Score in RF from 0.77, 0.65, and 0.71 to 0.87, 0.81, and 0.84, respectively. 
Our findings indicate that, besides gender, age, and length of stay at the hospital, diagnoses related to digestive organs, bones, muscles and connective tissue, and the nervous system are important clinical factors related to the prediction of patients with AUD. CONCLUSION Our proposed FS method could improve the classification performance significantly. It could identify clinical factors related to prediction of AUD from EHRs, thereby effectively helping clinical staff to identify and treat AUD patients and improving medical knowledge of the AUD condition. Moreover, the diversity of features among female and male patients as well as gender disparity were investigated using FS methods and ML techniques.
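The aggregation step described in this abstract, where base selector outputs are combined into ensemble selectors, can be sketched with plain set operations. The abstract names only union-based ensembles (Union FS, Union Filter FS); the intersection and majority-vote variants below are assumptions added for contrast, and all feature names are hypothetical.

```python
from collections import Counter


def union_fs(selections):
    """Keep every feature chosen by at least one base selector."""
    return set().union(*selections)


def intersection_fs(selections):
    """Keep only features chosen by every base selector (assumed variant)."""
    return set.intersection(*(set(s) for s in selections))


def majority_vote_fs(selections):
    """Keep features chosen by more than half of the selectors (assumed variant)."""
    votes = Counter(f for s in selections for f in set(s))
    return {f for f, count in votes.items() if count > len(selections) / 2}


# Hypothetical outputs of three base selectors.
base_outputs = [
    {"age", "sex", "length_of_stay"},
    {"age", "length_of_stay", "dx_digestive"},
    {"age", "dx_nervous"},
]
print(sorted(union_fs(base_outputs)))
print(sorted(intersection_fs(base_outputs)))
print(sorted(majority_vote_fs(base_outputs)))
```

A union keeps the most features, which is consistent with the Union FS subset reported above (272 features) being larger than the embedded method's subset (131).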
Affiliation(s)
- Ali Ebrahimi
  - SDU Health Informatics and Technology, The Maersk Mc-Kinney Moller Institute, University of Southern Denmark, Odense, Denmark
- Uffe Kock Wiil
  - SDU Health Informatics and Technology, The Maersk Mc-Kinney Moller Institute, University of Southern Denmark, Odense, Denmark
- Amin Naemi
  - SDU Health Informatics and Technology, The Maersk Mc-Kinney Moller Institute, University of Southern Denmark, Odense, Denmark
- Marjan Mansourvar
  - Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
- Kjeld Andersen
  - Unit for Clinical Alcohol Research, Clinical Institute, University of Southern Denmark, Odense, Denmark
- Anette Søgaard Nielsen
  - Unit for Clinical Alcohol Research, Clinical Institute, University of Southern Denmark, Odense, Denmark
48
|
Saadatmand S, Salimifard K, Mohammadi R, Kuiper A, Marzban M, Farhadi A. Using machine learning in prediction of ICU admission, mortality, and length of stay in the early stage of admission of COVID-19 patients. Annals of Operations Research 2022; 328:1-29. [PMID: 36196268 PMCID: PMC9521862 DOI: 10.1007/s10479-022-04984-x] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Accepted: 09/06/2022] [Indexed: 05/19/2023]
Abstract
The recent COVID-19 pandemic has affected health systems across the world. In particular, Intensive Care Units (ICUs) have played a pivotal role in the treatment of critically ill patients. At the same time, however, the increasing number of admissions due to the high prevalence of the virus has caused several problems for ICU wards, such as overburdening of staff and shortages of medical resources. These issues might have affected the quality of the healthcare services provided, directly impacting a patient's survival. The objective of this research is to leverage Machine Learning (ML) on hospital data in order to support hospital managers and practitioners with the treatment of COVID-19 patients. This is accomplished by providing more detailed inference about a patient's likelihood of ICU admission, mortality and, in case of hospitalization, the length of stay (LOS). In this pursuit, the outcome variables are predicted in three separate models by five different ML algorithms: eXtreme Gradient Boosting (XGB), K-Nearest Neighbor (KNN), Random Forest (RF), bagged-CART (b-CART), and LogitBoost (LB). With the exception of KNN, the studied models show good predictive capabilities when evaluating relevant accuracy scores, such as area under the curve. By implementing an ensemble stacking approach (either a Neural Net or a General Linear Model) on top of the aforementioned ML algorithms, the performance is further boosted. Ultimately, for the prediction of admission to the ICU, the ensemble stacking via a Neural Net achieved the best result with an accuracy of over 95%. For mortality at the ICU, the vanilla XGB performed slightly better (1% difference with the meta-model). To predict long lengths of stay, both ensemble stacking approaches yield comparable results. Besides its direct implications for managing COVID-19 patients, the approach presented serves as an example of how data can be employed in future pandemics or crises.
Affiliation(s)
- Sara Saadatmand
- Computational Intelligence and Intelligent Optimization Research Group, Persian Gulf University, Bushehr, 75169 Iran
- Khodakaram Salimifard
- Computational Intelligence and Intelligent Optimization Research Group, Persian Gulf University, Bushehr, 75169 Iran
- Reza Mohammadi
- Section Business Analytics, Amsterdam Business School, University of Amsterdam, Amsterdam, The Netherlands
- Alex Kuiper
- Section Business Analytics, Amsterdam Business School, University of Amsterdam, Amsterdam, The Netherlands
- Maryam Marzban
- Department of Public Health, School of Public Health, Bushehr University of Medical Science, Bushehr, Iran
- Akram Farhadi
- The Persian Gulf Tropical Medicine Research Center, The Persian Gulf Biomedical Science Research Institute, Bushehr University of Medical Science, Bushehr, Iran
|
49
|
Yazdani A, Zahmatkeshan M, Ravangard R, Sharifian R, Shirdeli M. Supervised Machine Learning Approach to COVID-19 Detection Based on Clinical Data. Med J Islam Repub Iran 2022; 36:110. [PMID: 36447543 PMCID: PMC9700415 DOI: 10.47176/mjiri.36.110] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/07/2021] [Indexed: 09/10/2024] Open
Abstract
Background: The new coronavirus has been spreading since the beginning of 2020, and many efforts have been made to develop vaccines to help patients recover. It is now clear that the world needs a rapid solution to curb the spread of COVID-19 worldwide with non-clinical approaches such as artificial intelligence techniques. These approaches can be effective in reducing the burden on the health care system and in providing the best possible way to diagnose COVID-19. This study was conducted to use Machine Learning (ML) algorithms for the early detection of COVID-19 in patients. Methods: This retrospective study used data from hospitals affiliated with Shiraz University of Medical Sciences in Iran. The dataset was collected from March to October 2020 and contained 10055 cases with 63 features. We selected and compared six algorithms: C4.5, support vector machine (SVM), Naive Bayes, Logistic Regression (LR), Random Forest, and K-Nearest Neighbor, using RapidMiner software. The performance of the algorithms was measured using evaluation metrics such as precision, recall, accuracy, and F-measure. Results: The results of the study show that, among the classification methods used in the diagnosis of coronavirus, SVM (93.41% accuracy) and C4.5 (91.87% accuracy) achieved the highest performance. According to the C4.5 decision tree, "contact with a person who has COVID-19" was considered the most important diagnostic criterion based on the Gini index. Conclusion: We found that ML approaches enable a reasonable level of accuracy in the diagnosis of COVID-19.
Affiliation(s)
- Azita Yazdani
- Department of Health Information Management, Clinical Education Research Center, Health Human Resources Research Center, School of Health Management and Information Sciences, Shiraz University of Medical Sciences, Shiraz, Iran
- Maryam Zahmatkeshan
- Noncommunicable Diseases Research Center, Fasa University of Medical Sciences, Fasa, Iran
- School of Allied Medical Sciences, Fasa University of Medical Sciences, Fasa, Iran
- Ramin Ravangard
- Department of Health Services Management, Health Human Resources Research Center, School of Health Management and Information Sciences, Shiraz University of Medical Sciences, Shiraz, Iran
- Roxana Sharifian
- Department of Health Information Management, Health Human Resources Research Center, School of Health Management and Information Sciences, Shiraz University of Medical Sciences, Shiraz, Iran
- Mohammad Shirdeli
- Department of Health Information Management, Student Research Committee, Health Human Resources Research Center, School of Health Management and Information Sciences, Shiraz University of Medical Sciences, Shiraz, Iran
|
50
|
Smadi AA, Abugabah A, Al-Smadi AM, Almotairi S. SEL-COVIDNET: An intelligent application for the diagnosis of COVID-19 from chest X-rays and CT-scans. Informatics in Medicine Unlocked 2022; 32:101059. [PMID: 36033909 PMCID: PMC9398554 DOI: 10.1016/j.imu.2022.101059] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2022] [Revised: 08/17/2022] [Accepted: 08/17/2022] [Indexed: 11/06/2022] Open
Abstract
COVID-19 detection from medical imaging is a difficult challenge that has piqued the interest of experts worldwide. Chest X-rays and computed tomography (CT) scanning are the essential imaging modalities for diagnosing COVID-19. Researchers have focused their efforts on developing viable methods and rapid treatment procedures for this pandemic. Fast and accurate automated detection approaches have been devised to alleviate the need for medical professionals. Deep Learning (DL) technologies have successfully recognized COVID-19 cases. This paper proposes a set of nine deep learning models for diagnosing COVID-19, based on transfer learning and implemented in a novel architecture (SEL-COVIDNET). We include a global average pooling layer, flattening, and two fully connected dense layers. The model’s effectiveness is evaluated using balanced and unbalanced COVID-19 radiography datasets. After that, our model’s performance is analyzed using six evaluation measures: accuracy, sensitivity, specificity, precision, F1-score, and Matthews correlation coefficient (MCC). Experiments demonstrated that the proposed SEL-COVIDNET with tuned DenseNet121, InceptionResNetV2, and MobileNetV3Large models outperformed comparable state-of-the-art (SOTA) results for multi-class classification (COVID-19 vs. No-finding vs. Pneumonia) in terms of accuracy (98.52%), specificity (98.5%), sensitivity (98.5%), precision (98.7%), F1-score (98.7%), and MCC (97.5%). For the COVID-19 vs. No-finding classification, our method had an accuracy of 99.77%, a specificity of 99.85%, a sensitivity of 99.85%, a precision of 99.55%, an F1-score of 99.7%, and an MCC of 99.4%. The proposed model offers an accurate approach for detecting COVID-19 patients, which aids in the containment of the COVID-19 pandemic.
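The six evaluation measures named in the abstract above all follow from a confusion matrix. As a rough sketch for the binary case only (for example COVID-19 vs. No-finding), they might be computed as below. The function name and the counts are hypothetical and are not results from the paper, and the abstract does not state how the measures were averaged in the three-class setting.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Compute six common measures from binary confusion-matrix counts.

    Assumes each class has at least one true and one predicted sample,
    so none of the ratio denominators is zero.
    """
    sensitivity = tp / (tp + fn)   # recall on the positive class
    specificity = tn / (tn + fp)   # recall on the negative class
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only.
metrics = binary_metrics(tp=90, fp=5, fn=10, tn=95)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

MCC uses all four cells of the confusion matrix, which is one reason it is often reported alongside accuracy when a dataset is unbalanced, as one of the radiography datasets in the abstract is.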
Affiliation(s)
- Ahmad Al Smadi
- School of Artificial Intelligence, Xidian University, No. 2 South Taibai Road, Xian, 710071, China; College of Technological Innovation, Zayed University, Abu Dhabi Campus, UAE
- Ahed Abugabah
- College of Technological Innovation, Zayed University, Abu Dhabi Campus, UAE
- Ahmad Mohammad Al-Smadi
- Department of Computer Science, Al-Balqa Applied University, Ajloun University College, Jordan
- Sultan Almotairi
- Faculty of Community College, Majmaah University, Al Majma'ah, Saudi Arabia
|