1
Loo WK, Voon W, Suhaimi A, Teh CSJ, Tee YK, Hum YC, Hasikin K, Teo K, Ong HC, Lai KW. Predictive Modeling of COVID-19 Readmissions: Insights from Machine Learning and Deep Learning Approaches. Diagnostics (Basel) 2024; 14:1511. [PMID: 39061647; PMCID: PMC11275856; DOI: 10.3390/diagnostics14141511] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2024] [Revised: 06/15/2024] [Accepted: 07/03/2024] [Indexed: 07/28/2024] Open
Abstract
This project employs artificial intelligence, including machine learning and deep learning, to assess COVID-19 readmission risk in Malaysia. It offers tools to mitigate healthcare resource strain and enhance patient outcomes. This study outlines a methodology for classifying COVID-19 readmissions. It starts with dataset description and pre-processing; the data were then balanced using Random Oversampling (ROS), Borderline SMOTE (BSMOTE), and Adaptive Synthetic Sampling (ADASYN). Nine machine learning and ten deep learning techniques are applied, with stratified five-fold cross-validation for evaluation. Optuna is used for hyperparameter selection, while the training hyperparameters are kept consistent. Evaluation metrics encompass accuracy, AUC, and training/inference times. Results are reported separately for each data-balancing method. Notably, CatBoost consistently excelled in accuracy and AUC across all tables. Using ROS, CatBoost achieved the highest accuracy (0.9882 ± 0.0020) with an AUC of 1.0000 ± 0.0000. CatBoost maintained its superiority with BSMOTE and ADASYN as well. Deep learning approaches performed well, with SAINT leading under ROS and TabNet leading under BSMOTE and ADASYN. Decision-tree ensembles such as Random Forest and XGBoost consistently showed strong performance.
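The combination of random oversampling and stratified five-fold cross-validation described above can be sketched with the standard library alone. The abstract does not say whether balancing was applied before splitting or inside each fold; this sketch assumes the latter, resampling only the training portion so that duplicated samples never reach the test fold. The function names and the toy label vector are hypothetical.

```python
import random
from collections import defaultdict

def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds with similar class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold gets a similar share.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds

def random_oversample(indices, labels, seed=0):
    """Duplicate minority-class indices (with replacement) until classes match."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx in indices:
        by_class[labels[idx]].append(idx)
    target = max(len(members) for members in by_class.values())
    balanced = []
    for members in by_class.values():
        balanced.extend(members)
        balanced.extend(rng.choices(members, k=target - len(members)))
    return balanced

# Toy labels (hypothetical): 90 not readmitted (0), 10 readmitted (1).
labels = [0] * 90 + [1] * 10
folds = stratified_folds(labels, k=5)

splits = []
for i, test_idx in enumerate(folds):
    train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
    # Oversample the training portion only; the test fold stays untouched.
    train_balanced = random_oversample(train_idx, labels, seed=i)
    splits.append((train_balanced, test_idx))
```

Each `(train_balanced, test_idx)` pair could then be handed to any classifier; BSMOTE and ADASYN would replace `random_oversample` with a step that synthesises new minority samples rather than duplicating existing ones.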
Affiliation(s)
- Wei Kit Loo
  - Department of Biomedical Engineering, Faculty of Engineering, Universiti Malaya, Kuala Lumpur 50603, Malaysia
- Wingates Voon
  - Lee Kong Chian Faculty of Engineering and Science, Universiti Tunku Abdul Rahman, Kajang 43000, Malaysia
- Anwar Suhaimi
  - Department of Rehabilitation Medicine, Faculty of Medicine, Universiti Malaya, Kuala Lumpur 50603, Malaysia
- Cindy Shuan Ju Teh
  - Department of Medical Microbiology, Faculty of Medicine, Universiti Malaya, Kuala Lumpur 50603, Malaysia
- Yee Kai Tee
  - Lee Kong Chian Faculty of Engineering and Science, Universiti Tunku Abdul Rahman, Kajang 43000, Malaysia
- Yan Chai Hum
  - Lee Kong Chian Faculty of Engineering and Science, Universiti Tunku Abdul Rahman, Kajang 43000, Malaysia
- Khairunnisa Hasikin
  - Department of Biomedical Engineering, Faculty of Engineering, Universiti Malaya, Kuala Lumpur 50603, Malaysia
- Kareen Teo
  - Department of Biomedical Engineering, Faculty of Engineering, Universiti Malaya, Kuala Lumpur 50603, Malaysia
- Hang Cheng Ong
  - Infectious Diseases Unit, Department of Medicine, Faculty of Medicine, Universiti Malaya, Kuala Lumpur 56300, Malaysia
- Khin Wee Lai
  - Department of Biomedical Engineering, Faculty of Engineering, Universiti Malaya, Kuala Lumpur 50603, Malaysia
2
Chen TY, Huang TY, Chang YC. Using a clinical narrative-aware pre-trained language model for predicting emergency department patient disposition and unscheduled return visits. J Biomed Inform 2024; 155:104657. [PMID: 38772443; DOI: 10.1016/j.jbi.2024.104657] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/11/2024] [Revised: 04/07/2024] [Accepted: 05/18/2024] [Indexed: 05/23/2024]
Abstract
The increasing prevalence of overcrowding in Emergency Departments (EDs) threatens the effective delivery of urgent healthcare. Mitigation strategies include the deployment of monitoring systems capable of tracking and managing patient disposition to facilitate appropriate and timely care, which subsequently reduces patient revisits, optimizes resource allocation, and enhances patient outcomes. This study used ∼ 250,000 emergency department visit records from Taipei Medical University-Shuang Ho Hospital to develop a natural language processing model using BlueBERT, a biomedical domain-specific pre-trained language model, to predict patient disposition status and unplanned readmissions. Data preprocessing and the integration of both structured and unstructured data were central to our approach. Compared to other models, BlueBERT outperformed due to its pre-training on a diverse range of medical literature, enabling it to better comprehend the specialized terminology, relationships, and context present in ED data. We found that translating Chinese-English clinical narratives into English and textualizing numerical data into categorical representations significantly improved the prediction of patient disposition (AUROC = 0.9014) and 72-hour unscheduled return visits (AUROC = 0.6475). The study concludes that the BlueBERT-based model demonstrated superior prediction capabilities, surpassing the performance of prior patient disposition predictive models, thus offering promising applications in the realm of ED clinical practice.
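To make the two reported AUROC figures easier to read, here is a minimal sketch of how AUROC can be computed from labels and predicted scores. It uses the pairwise-ranking definition (the share of positive/negative pairs in which the positive case receives the higher score, with ties counted as half). The function name and the toy inputs are hypothetical and are not taken from the study.

```python
def auroc(y_true, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))

# Toy example (hypothetical labels and scores).
toy_auc = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

Under this reading, a value near 0.90 means the model ranks a positive case above a negative one in about nine of ten such pairs, while a value near 0.65 is much closer to the 0.5 expected from random ranking.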
Affiliation(s)
- Tzu-Ying Chen
  - Graduate Institute of Data Science, Taipei Medical University, Taipei City, Taiwan
- Ting-Yun Huang
  - Shuang-Ho Hospital, Taipei Medical University, New Taipei City, Taiwan
- Yung-Chun Chang
  - Graduate Institute of Data Science, Taipei Medical University, Taipei City, Taiwan; Clinical Big Data Research Center, Taipei Medical University Hospital, Taipei City, Taiwan
3
Caterson J, Lewin A, Williamson E. The application of explainable artificial intelligence (XAI) in electronic health record research: A scoping review. Digit Health 2024; 10:20552076241272657. [PMID: 39493635; PMCID: PMC11528818; DOI: 10.1177/20552076241272657] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2024] [Accepted: 07/09/2024] [Indexed: 11/05/2024] Open
Abstract
Machine Learning (ML) and Deep Learning (DL) models show potential in surpassing traditional methods, including generalised linear models, for healthcare predictions, particularly with large, complex datasets. However, low interpretability hinders practical implementation. To address this, Explainable Artificial Intelligence (XAI) methods are proposed, but a comprehensive evaluation of their effectiveness is currently limited. The aim of this scoping review is to critically appraise the application of XAI methods in ML/DL models using Electronic Health Record (EHR) data. In accordance with PRISMA scoping review guidelines, the study searched PubMed and OVID/MEDLINE (including EMBASE) for publications related to tabular EHR data that employed ML/DL models with XAI. Out of 3220 identified publications, 76 were included. The selected publications, published between February 2017 and June 2023, showed an exponential increase over time. Extreme Gradient Boosting and Random Forest models were the most frequently used ML/DL methods, with 51 and 50 publications, respectively. Among XAI methods, Shapley Additive Explanations (SHAP) was predominant, appearing in 63 out of 76 publications, followed by partial dependence plots (PDPs) in 11 publications, and Local Interpretable Model-Agnostic Explanations (LIME) in 8 publications. Despite the growing adoption of XAI methods, their applications varied widely and lacked critical evaluation. This review identifies the increasing use of XAI in tabular EHR research and highlights a deficiency in the reporting of methods and a lack of critical appraisal of validity and robustness. The study emphasises the need for further evaluation of XAI methods and underscores the importance of cautious implementation and interpretation in healthcare settings.
Affiliation(s)
- Alexandra Lewin
  - London School of Hygiene and Tropical Medicine, Bloomsbury, UK
4
Pierri F, Scotti F, Bonaccorsi G, Flori A, Pammolli F. Predicting economic resilience of territories in Italy during the COVID-19 first lockdown. Expert Systems with Applications 2023; 232:120803. [PMID: 37363270; PMCID: PMC10281035; DOI: 10.1016/j.eswa.2023.120803] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2023] [Revised: 04/19/2023] [Accepted: 06/08/2023] [Indexed: 06/28/2023]
Abstract
This paper aims to predict the economic resilience to crises of territories based on local pre-existing socioeconomic characteristics. Specifically, we consider the case of Italian municipalities during the first wave of the COVID-19 pandemic, leveraging a large-scale dataset of cardholders performing transactions in Point-of-Sales. Based on a set of machine learning classifiers, we show that network-based measures and variables related to the social, economic, demographic and environmental dimensions are relevant predictors of the economic resilience of Italian municipalities to the crisis. In particular, we find accurate classification performance both in balanced and un-balanced scenarios, as well as in the case we restrict the analysis to specific geographical areas. Our analysis predicts that territories with larger income per capita, soil consumption, concentration of real estate activities and commuting network centrality in terms of closeness and Pagerank constitute the set of most affected areas, experiencing the strongest reduction of economic activities during the COVID-19 pandemic. Overall, we provide an application of an early-warning system able to provide timely evidence to policymakers about the detrimental effects generated by natural disasters and severe crisis episodes, thus contributing to optimize public decision support systems.
Affiliation(s)
- Francesco Pierri
  - Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milano, Italy
- Francesco Scotti
  - Department of Management, Economics and Industrial Engineering, Politecnico di Milano, Milano, Italy
- Giovanni Bonaccorsi
  - Department of Management, Economics and Industrial Engineering, Politecnico di Milano, Milano, Italy
- Andrea Flori
  - Department of Management, Economics and Industrial Engineering, Politecnico di Milano, Milano, Italy
- Fabio Pammolli
  - Department of Management, Economics and Industrial Engineering, Politecnico di Milano, Milano, Italy
5
Carvantes-Barrera A, Díaz-González L, Rosales-Rivera M, Chávez-Almazán LA. Risk Factors Associated with COVID-19 Lethality: A Machine Learning Approach Using Mexico Database. J Med Syst 2023; 47:90. [PMID: 37597034; DOI: 10.1007/s10916-023-01979-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/03/2023] [Accepted: 07/21/2023] [Indexed: 08/21/2023]
Abstract
Identifying risk factors associated with COVID-19 lethality is crucial in combating the ongoing pandemic. In this study, we developed lethality predictive models for each epidemiological wave and for the overall dataset using the Extreme Gradient Boosting technique and analyzed them using Shapley values to determine the contribution levels of various features, including demographics, comorbidities, medical units, and recent medical information from confirmed COVID-19 cases in Mexico between February 23, 2020, and April 15, 2022. The results showed that pneumonia and advanced age were the most important factors predicting patient death in all cohorts. Additionally, the medical unit where the patient received care acted as a risk or protective factor. IMSS medical units were identified as high-risk factors in all cohorts, except in wave four, while SSA medical units generally were moderate protective factors. We also found that intubation was a high-risk factor in the first epidemiological wave and a moderate-risk factor in the following waves. Female gender was a protective factor of moderate-high importance in all cohorts, while being between 18 and 29 years old was a moderate protective factor and being between 50 and 59 years old was a moderate risk factor. Additionally, diabetes (all cohorts), obesity (third wave), and hypertension (fourth wave) were identified as moderate risk factors. Finally, residing in municipalities with the lowest Human Development Index level represented a moderate risk factor. In conclusion, this study identified several significant risk factors associated with COVID-19 lethality in Mexico, which could aid policymakers in developing targeted interventions to reduce mortality rates.
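As a rough illustration of what a Shapley value attributes to each feature, the sketch below computes exact Shapley values for a toy scoring function by averaging each feature's marginal contribution over every feature ordering. The toy model, its coefficients, and the feature indices are invented for illustration and are not derived from the study. Brute-force enumeration is only feasible for a handful of features; analyses of boosted-tree models normally rely on faster tree-specific algorithms.

```python
from itertools import permutations
from math import factorial

def shapley_values(value_fn, n_features):
    """Exact Shapley values: average marginal contribution over all orderings."""
    phi = [0.0] * n_features
    for order in permutations(range(n_features)):
        present = set()
        prev = value_fn(frozenset(present))
        for feat in order:
            present.add(feat)
            cur = value_fn(frozenset(present))
            phi[feat] += cur - prev   # marginal contribution of `feat`
            prev = cur
    n_orders = factorial(n_features)
    return [p / n_orders for p in phi]

# Hypothetical toy model on three binary features x0, x1, x2, with an
# interaction between x0 and x1. Coefficients are made up.
def toy_model(x0, x1, x2):
    return 0.30 * x0 + 0.20 * x1 - 0.10 * x2 + 0.10 * x0 * x1

instance = (1, 1, 1)   # the case being explained
baseline = (0, 0, 0)   # value used for features treated as "absent"

def value_fn(present):
    """Model output when only the features in `present` take the instance's values."""
    masked = [instance[i] if i in present else baseline[i] for i in range(3)]
    return toy_model(*masked)

phi = shapley_values(value_fn, 3)
```

In this toy case the interaction term is split evenly between x0 and x1, the negative value for x2 reads as a protective contribution, and the three values sum to the difference between the model output for the instance and for the baseline.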
Affiliation(s)
- Alejandro Carvantes-Barrera
  - Maestría en Optimización y Cómputo Aplicado, Universidad Autónoma del Estado de Morelos, Cuernavaca, 62209, Morelos, México
- Lorena Díaz-González
  - Centro de Investigación en Ciencias, Universidad Autónoma del Estado de Morelos, Cuernavaca, 62209, Morelos, México
- Mauricio Rosales-Rivera
  - Instituto de Biotecnología, Universidad Nacional Autónoma de México, Cuernavaca, Morelos, 62209, México
- Luis A Chávez-Almazán
  - Unidad de Innovación Clínica y Epidemiológica del Estado de Guerrero, Acapulco, Guerrero, 39715, México
6
Abu Khurma R, Albashish D, Braik M, Alzaqebah A, Qasem A, Adwan O. An augmented Snake Optimizer for diseases and COVID-19 diagnosis. Biomed Signal Process Control 2023; 84:104718. [PMID: 36811003; PMCID: PMC9935299; DOI: 10.1016/j.bspc.2023.104718] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 10/22/2022] [Revised: 01/21/2023] [Accepted: 02/14/2023] [Indexed: 02/19/2023]
Abstract
Feature Selection (FS) techniques extract the most recognizable features for improving the performance of classification methods for medical applications. In this paper, two intelligent wrapper FS approaches based on a new metaheuristic algorithm named the Snake Optimizer (SO) are introduced. The binary SO, called BSO, is built based on an S-shape transform function to handle the binary discrete values in the FS domain. To improve the exploration of the search space by BSO, three evolutionary crossover operators (i.e., one-point crossover, two-point crossover, and uniform crossover) are incorporated and controlled by a switch probability. The two newly developed FS algorithms, BSO and BSO-CV, are implemented and assessed on a real-world COVID-19 dataset and 23 disease benchmark datasets. According to the experimental results, the improved BSO-CV significantly outperformed the standard BSO in terms of accuracy and running time in 17 datasets. Furthermore, it shrinks the COVID-19 dataset's dimension by 89% as opposed to the BSO's 79%. Moreover, the adopted operator on BSO-CV improved the balance between exploitation and exploration capabilities in the standard BSO, particularly in searching and converging toward optimal solutions. The BSO-CV was compared against the most recent wrapper-based FS methods; namely, the hyperlearning binary dragonfly algorithm (HLBDA), the binary moth flame optimization with Lévy flight (LBMFO-V3), the coronavirus herd immunity optimizer with greedy crossover operator (CHIO-GC), as well as four filter methods with an accuracy of more than 90% in most benchmark datasets. These optimistic results reveal the great potential of BSO-CV in reliably searching the feature space.
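The two building blocks named in this abstract, an S-shaped transfer function and the three crossover operators, can be sketched as follows. The abstract gives no formulas, so several choices here are assumptions: the sigmoid is used as the S-shaped function, a feature is selected when a uniform random draw falls below the sigmoid output, and the switch probability decides whether a randomly chosen operator is applied at all. All function names are hypothetical, and this is not the authors' BSO-CV implementation.

```python
import math
import random

def s_shape(x):
    """Sigmoid transfer function (numerically stable for large |x|)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def binarize(position, rng):
    """Map a real-valued position to a 0/1 feature mask (assumed sampling rule)."""
    return [1 if rng.random() < s_shape(x) else 0 for x in position]

def one_point(a, b, rng):
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def two_point(a, b, rng):
    # Needs at least three features so two distinct cut points exist.
    i, j = sorted(rng.sample(range(1, len(a)), 2))
    return a[:i] + b[i:j] + a[j:]

def uniform(a, b, rng):
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]

def crossover(a, b, rng, switch_prob=0.5):
    """With probability switch_prob, apply one randomly chosen operator (assumed rule)."""
    if rng.random() >= switch_prob:
        return list(a)
    op = rng.choice([one_point, two_point, uniform])
    return op(a, b, rng)

rng = random.Random(0)
mask_a = binarize([rng.uniform(-4, 4) for _ in range(8)], rng)
mask_b = binarize([rng.uniform(-4, 4) for _ in range(8)], rng)
child = crossover(mask_a, mask_b, rng)
```

In a wrapper method, each resulting mask would be scored by training a classifier on the selected features only, which is where the reported accuracy and dimensionality-reduction figures would come from.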
Affiliation(s)
- Ruba Abu Khurma
  - Computer Science Department, Faculty of Information Technology, Al-Ahliyya Amman University, Amman, Jordan
- Dheeb Albashish
  - Computer Science Department, Prince Abdullah bin Ghazi Faculty of Information and Communication Technology, Al-Balqa Applied University, Salt, Jordan
- Malik Braik
  - Computer Science Department, Prince Abdullah bin Ghazi Faculty of Information and Communication Technology, Al-Balqa Applied University, Salt, Jordan
- Abdullah Alzaqebah
  - Computer Science Department, Faculty of Information Technology, The World Islamic Sciences & Education University, Amman, Jordan
- Ashwaq Qasem
  - School of Electrical Engineering and Artificial Intelligence, Xiamen University Malaysia, Sepang, Malaysia
- Omar Adwan
  - Computer Science Department, Faculty of Information Technology, Al-Ahliyya Amman University, Amman, Jordan
  - Department of Computer Science, University of Jordan, Amman, Jordan
7
Paul SG, Saha A, Biswas AA, Zulfiker MS, Arefin MS, Rahman MM, Reza AW. Combating Covid-19 using machine learning and deep learning: Applications, challenges, and future perspectives. Array 2023; 17:100271. [PMID: 36530931; PMCID: PMC9737520; DOI: 10.1016/j.array.2022.100271] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/28/2022] [Revised: 12/05/2022] [Accepted: 12/07/2022] [Indexed: 12/14/2022] Open
Abstract
COVID-19 is a worldwide pandemic that has affected many people, and thousands of individuals have died from it during the last two years. Because of its benefits in X-ray image interpretation, sound analysis, diagnosis, patient monitoring, and CT image identification, Artificial Intelligence (AI) has been researched further in medical science during the COVID-19 period. This study assesses the performance of different machine learning (ML) and deep learning (DL) approaches, as well as combinations of ML, DL, and AI approaches, that recent studies have applied to diverse data formats to address problems arising from the COVID-19 pandemic. It then compares stand-alone ML- and DL-based research on COVID-19 with research that combines ML, DL, and AI. After in-depth analysis and comparison, this study responds to the proposed research questions and presents future research directions in this context. This review will guide research groups in developing viable applications based on ML, DL, and AI models, and will also guide healthcare institutes, researchers, and governments by showing how these techniques can ease the process of tackling COVID-19.
Affiliation(s)
- Showmick Guha Paul
  - Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh
- Arpa Saha
  - Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh
- Al Amin Biswas
  - Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh (corresponding author)
- Md. Sabab Zulfiker
  - Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh
- Mohammad Shamsul Arefin
  - Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh
  - Department of Computer Science and Engineering, Chittagong University of Engineering and Technology, Chittagong, Bangladesh
- Md. Mahfujur Rahman
  - Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh
- Ahmed Wasif Reza
  - Department of Computer Science and Engineering, East West University, Dhaka, Bangladesh
8
Kistenev YV, Vrazhnov DA, Shnaider EE, Zuhayri H. Predictive models for COVID-19 detection using routine blood tests and machine learning. Heliyon 2022; 8:e11185. [PMID: 36311357; PMCID: PMC9595489; DOI: 10.1016/j.heliyon.2022.e11185] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 03/01/2022] [Revised: 03/25/2022] [Accepted: 10/16/2022] [Indexed: 11/06/2022] Open
Abstract
The need for accurate, fast, and inexpensive COVID-19 tests remains urgent. Standard COVID-19 tests need high-cost reagents and specialized laboratories with high safety requirements, and they are time-consuming. Using routine blood test data as the basis for detecting SARS-CoV-2 infection allows most practical medical facilities to be used. However, blood tests give general information about a patient's state, which is not directly associated with COVID-19. COVID-19-specific features should therefore be selected from the list of standard blood characteristics, and decision-making software based on appropriate clinical data should be created. This review describes the possibilities for developing predictive models for COVID-19 detection using routine blood tests and machine learning.
Affiliation(s)
- Yury V. Kistenev
  - Laboratory of Laser Molecular Imaging and Machine Learning, Tomsk State University, 36 Lenin Av., 634050 Tomsk, Russia
- Denis A. Vrazhnov
  - Laboratory of Laser Molecular Imaging and Machine Learning, Tomsk State University, 36 Lenin Av., 634050 Tomsk, Russia
- Ekaterina E. Shnaider
  - Laboratory of Laser Molecular Imaging and Machine Learning, Tomsk State University, 36 Lenin Av., 634050 Tomsk, Russia
- Hala Zuhayri
  - Laboratory of Laser Molecular Imaging and Machine Learning, Tomsk State University, 36 Lenin Av., 634050 Tomsk, Russia
9
Shanbehzadeh M, Yazdani A, Shafiee M, Kazemi-Arpanahi H. Predictive modeling for COVID-19 readmission risk using machine learning algorithms. BMC Med Inform Decis Mak 2022; 22:139. [PMID: 35596167; PMCID: PMC9122247; DOI: 10.1186/s12911-022-01880-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 12/31/2021] [Accepted: 05/18/2022] [Indexed: 12/15/2022] Open
Abstract
Introduction: The COVID-19 pandemic overwhelmed healthcare systems with severe shortages in hospital resources such as ICU beds, specialized doctors, and respiratory ventilators. In this situation, reducing COVID-19 readmissions could potentially maintain hospital capacity. By employing machine learning (ML), we can predict the likelihood of COVID-19 readmission risk, which can assist in the optimal allocation of restricted resources to seriously ill patients.
Methods: In this retrospective single-center study, the data of 1225 COVID-19 patients discharged between January 9, 2020, and October 20, 2021 were analyzed. First, the most important predictors were selected using the horse herd optimization algorithms. Then, three classical ML algorithms, including decision tree, support vector machine, and k-nearest neighbors, and a hybrid algorithm, namely water wave optimization (WWO) as a precise metaheuristic evolutionary algorithm combined with a neural network, were used to construct predictive models for COVID-19 readmission. Finally, the performance of prediction models was measured, and the best-performing one was identified.
Results: The ML algorithms were trained using 17 validated features. Among the four selected ML algorithms, the WWO had the best average performance in tenfold cross-validation (accuracy: 0.9705, precision: 0.9729, recall: 0.9869, specificity: 0.9259, F-measure: 0.9795).
Conclusions: Our findings show that the WWO algorithm predicts the risk of readmission of COVID-19 patients more accurately than other ML algorithms. The models developed herein can inform frontline clinicians and healthcare policymakers to manage and optimally allocate limited hospital resources to seriously ill COVID-19 patients.
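The five metrics reported above all follow from the four cells of a binary confusion matrix. A minimal sketch of those standard definitions is given below; the function name and the counts in the usage line are hypothetical and are not the study's data.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)       # of predicted positives, share that is correct
    recall = tp / (tp + fn)          # of actual positives, share that is found
    specificity = tn / (tn + fp)     # of actual negatives, share that is found
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f_measure": f_measure,
    }

# Hypothetical counts for a single validation fold.
fold_metrics = classification_metrics(tp=8, fp=2, tn=6, fn=4)
```

When such metrics are averaged over cross-validation folds, the averaged F-measure need not equal the F-measure computed from the averaged precision and recall, so small differences between the two are expected.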
Affiliation(s)
- Mostafa Shanbehzadeh
  - Department of Health Information Technology, School of Paramedical, Ilam University of Medical Sciences, Ilam, Iran
- Azita Yazdani
  - Clinical Education Research Center, Health Human Resources Research Center, Department of Health Information Management, School of Health Management and Information Sciences, Shiraz University of Medical Sciences, Shiraz, Iran
- Mohsen Shafiee
  - Department of Nursing, Abadan University of Medical Sciences, Abadan, Iran
- Hadi Kazemi-Arpanahi
  - Department of Health Information Technology, Abadan University of Medical Sciences, Abadan, Iran
  - Department of Student Research Committee, Abadan University of Medical Sciences, Abadan, Iran