1
Hu H, Zhao Y, Sun C, Wu Q, Deng Y, Liu J. Enhancing readmission prediction model in older stroke patients by integrating insight from readiness for hospital discharge: Prospective cohort study. Int J Med Inform 2025; 197:105845. [PMID: 40015152 DOI: 10.1016/j.ijmedinf.2025.105845] [Received: 11/23/2024] [Revised: 01/27/2025] [Accepted: 02/19/2025] [Indexed: 03/01/2025]
Abstract
BACKGROUND The 30-day hospital readmission rate is a key indicator of healthcare quality and system efficiency. This study aimed to develop machine-learning (ML) models to predict unplanned 30-day readmissions in older patients with ischemic stroke (IS) using a prospective cohort design. METHODS Patients were divided into two datasets: dataset I (January 2020-December 2021) for model development and dataset II (January 2022-December 2023) for validation. A diffusion model was applied to address data imbalance. Eleven machine-learning methods, including Random Forest (RF), Logistic Regression, CatBoost, eXtreme Gradient Boosting, Light Gradient Boosting Machine, K-Nearest Neighbors, Support Vector Machine, Multi-Layer Perceptron, and Gaussian Naive Bayes, and 2 ensemble learning models, were constructed to predict readmissions. Bayesian optimization was used to fine-tune the hyperparameters of these models. Model performance was primarily evaluated using the area under the receiver operating characteristic curve (AUC). Shapley Additive Explanations (SHAP) were used to identify and interpret the significance of predictive variables. RESULTS Dataset I included 489 patients, while dataset II comprised 418 patients, with readmission rates of 15.3 % and 16.0 %, respectively. The RF model achieved the highest predictive performance (AUC = 0.9116, sensitivity = 0.8806, specificity = 0.7806). SHAP analysis identified readiness for hospital discharge as the most significant predictor of readmission. CONCLUSION The RF model shows promise for predicting unplanned 30-day readmissions in older patients with IS. Multi-center studies with larger sample sizes are needed to validate these findings.
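The abstract above reports AUC, sensitivity, and specificity. As a reading aid, the following is a minimal sketch of how these three metrics can be computed from labels and predicted scores. It is not the authors' code; the function names, the toy data, and the 0.5 threshold are assumptions chosen for illustration.

```python
def auc_score(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(y_true, scores, threshold=0.5):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP) at a fixed threshold."""
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: 1 = readmitted, 0 = not readmitted.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]

print("AUC:", auc_score(y_true, scores))
print("sensitivity, specificity:", sensitivity_specificity(y_true, scores))
```

The pairwise AUC is quadratic in sample size, which is acceptable for a sketch but not for large cohorts.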
Affiliation(s)
- Huixiu Hu
  - Department of Nursing, Beijing Hospital, National Center of Gerontology, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, PR China
- Yajie Zhao
  - Department of Cardiology, Beijing Hospital, National Center of Gerontology, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, PR China
- Chao Sun
  - Department of Nursing, Beijing Hospital, National Center of Gerontology, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, PR China.
- Quanying Wu
  - Department of Nursing, Beijing Hospital, National Center of Gerontology, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, PR China
- Ying Deng
  - Department of Neurosurgery, Beijing Hospital, National Center of Gerontology, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, PR China
- Jie Liu
  - Department of Neurology, Beijing Hospital, National Center of Gerontology, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, PR China
2
Hwang YS, Kim S, Yim I, Park Y, Kang S, Jo HS. Predicting the likelihood of readmission in patients with ischemic stroke: An explainable machine learning approach using common data model data. Int J Med Inform 2025; 195:105754. [PMID: 39755003 DOI: 10.1016/j.ijmedinf.2024.105754] [Received: 10/20/2024] [Revised: 12/03/2024] [Accepted: 12/04/2024] [Indexed: 01/06/2025]
Abstract
BACKGROUND Ischemic stroke affects 15 million people worldwide, causing five million deaths annually. Despite declining mortality rates, stroke incidence and readmission risks remain high, highlighting the need for preventing readmission to improve the quality of life of survivors. This study developed a machine-learning model to predict 90-day stroke readmission using electronic medical records converted to the common data model (CDM) from the Regional Accountable Care Hospital in Gangwon state in South Korea. METHODS We retrospectively analyzed data from 1,136 patients with ischemic stroke admitted between August 2003 and August 2021 after excluding cases with missing blood test values. Demographics, blood test results, treatments, and comorbidities were used as key features. Six machine learning models and three deep learning models were used to predict 90-day readmission using the synthetic minority over-sampling technique to address class imbalance. Models were evaluated using threefold cross-validation, and SHapley Additive exPlanations (SHAP) values were calculated to interpret feature importance. RESULTS Among 1,136 patients, 196 (17.2 %) were readmitted within 90 days. Male patients were significantly more likely to experience readmission (p = 0.02). LightGBM achieved an area under the curve of 0.94, demonstrating that analyzing stroke and stroke-related conditions provides greater predictive accuracy than predicting stroke alone or all-cause readmissions. SHAP analysis highlighted renal and metabolic variables, including creatinine, blood urea nitrogen, calcium, sodium, and potassium, as key predictors of readmission. CONCLUSION Machine-learning models using electronic health record-based CDM data demonstrated strong predictive performance for 90-day stroke readmission. These results support personalized post-discharge management and lay the groundwork for future multicenter studies.
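The abstract above names the synthetic minority over-sampling technique (SMOTE) as its remedy for class imbalance. The sketch below shows the core idea under simplifying assumptions: each synthetic sample is a random interpolation between a minority-class point and one of its nearest minority-class neighbours. The function name `smote`, its parameters, and the toy points are hypothetical, and this is not the implementation used in the study.

```python
import random


def smote(minority, n_synthetic, k=3, seed=0):
    """Generate n_synthetic points by interpolating between minority samples
    and their k nearest minority neighbours (squared Euclidean distance)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # Candidate neighbours: every other minority sample, nearest first.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical two-feature minority class (e.g. scaled laboratory values).
minority = [[1.0, 2.0], [2.0, 3.0], [3.0, 1.0], [2.5, 2.5]]
for point in smote(minority, n_synthetic=3, k=2, seed=42):
    print(point)
```

When oversampling is combined with cross-validation, it is usually applied to the training folds only, so that synthetic points derived from a held-out sample do not leak into its evaluation.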
Affiliation(s)
- Yu Seong Hwang
  - Department of Health Policy and Management, School of Medicine, Kangwon National University, 510 School of Medicine Building #1 (N414), 1, Kangwondaehak-gil, Chuncheon-si, Gangwon-do 24341, Republic of Korea
- Seongheon Kim
  - Department of Neurology, Kangwon National University Hospital, 156 Baengnyeong-ro, Chuncheon-si, Gangwon-do 24289, Republic of Korea
- Inhyeok Yim
  - Department of Family Medicine, Kangwon National University Hospital, Kangwon National University School of Medicine, 156 Baengnyeong-ro, Chuncheon-si, Gangwon-do 24289, Republic of Korea
- Yukyoung Park
  - Department of Preventive Medicine, Kangwon National University Hospital, 156 Baengnyeong-ro, Chuncheon-si, Gangwon-do 24289, Republic of Korea
- Seonguk Kang
  - Department of Convergence Security, Kangwon National University Hospital, 156 Baengnyeong-ro, Chuncheon-si, Gangwon-do 24289, Republic of Korea
- Heui Sug Jo
  - Department of Health Policy and Management, School of Medicine, Kangwon National University, 510 School of Medicine Building #1 (N414), 1, Kangwondaehak-gil, Chuncheon-si, Gangwon-do 24341, Republic of Korea; Department of Preventive Medicine, Kangwon National University Hospital, 156 Baengnyeong-ro, Chuncheon-si, Gangwon-do 24289, Republic of Korea; Team of Public Medical Policy Development, Gangwon State Research Institute for People's Health, 880 Baksa-ro, Seo-myeon, Chuncheon-si, Gangwon-do 24461, Republic of Korea.
3
Cun W, Xu K, Chai Q, Duan L. Factors Affecting the Readmission of Patients with Stroke. World Neurosurg 2025; 194:123572. [PMID: 39701519 DOI: 10.1016/j.wneu.2024.123572] [Received: 10/10/2024] [Accepted: 12/08/2024] [Indexed: 12/21/2024]
Abstract
BACKGROUND The incidence of stroke is increasing annually in China, and the readmission rate of patients with stroke remains high. METHODS In total, 441 patients were enrolled in this study. We described the incidence of stroke readmissions. Furthermore, we used the Andersen-Gill model to explore the factors affecting all-cause readmission and cardio-cerebrovascular-related readmission of patients with stroke. Identification of these predictors can help reduce the readmission rate of patients with stroke. RESULTS 1) In total, 441 patients with stroke were included. Among them, 163 (40%) had readmission records. Of these, 44 patients were readmitted due to cardiovascular and cerebrovascular diseases, accounting for 10.70%. 2) The modified Rankin Scale (mRS) score affected all-cause readmission of patients with stroke. Patients with stroke and a score of 5 were 5.46 times more likely to be readmitted than those with a score of 0 (HR = 5.46, 95% CI: 1.59∼18.7, P < 0.1). 3) Patients with a college degree or above were 2.48 times more likely to have cardio-cerebrovascular-related readmission than those with junior high school education or below (HR = 2.48, 95% CI: 1.11∼5.54, P < 0.1). Patients with chronic diseases were 3.68 times more likely to have cardio-cerebrovascular-related readmission than those without chronic diseases (HR = 3.68, 95% CI: 1.61∼8.39, P < 0.1). CONCLUSIONS The readmission of patients with stroke may be related to their physical activity function, chronic diseases, and socioeconomic status. When considering the factors predicting the readmission of patients with stroke, we cannot blindly draw on the results of relevant foreign studies.
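The hazard ratios above are reported with 95% confidence intervals. The following sketch shows one way to relate an HR and its interval to a log-hazard coefficient, a standard error, and a two-sided p-value. It assumes the interval is a symmetric Wald interval on the log scale, which the abstract does not state; the function name `wald_from_hr` is hypothetical.

```python
import math


def wald_from_hr(hr, ci_low, ci_high, z_crit=1.959964):
    """Back out beta = ln(HR), its standard error, the Wald z statistic and a
    two-sided p-value, ASSUMING the CI is exp(beta +/- z_crit * SE)."""
    beta = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return beta, se, z, p


# Values taken from the abstract: mRS score 5 vs. 0, all-cause readmission.
beta, se, z, p = wald_from_hr(5.46, 1.59, 18.7)
print(f"beta={beta:.3f} se={se:.3f} z={z:.2f} p={p:.4f}")
```

A calculation like this can serve as a consistency check between a reported interval and a reported significance threshold, provided the Wald assumption holds.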
Affiliation(s)
- Wei Cun
  - Evidence-Based Nursing Research Laboratory, West China Hospital, Sichuan University/West China School of Nursing, Sichuan University, Chengdu, China
- Ke Xu
  - Evidence-Based Nursing Research Laboratory, West China Hospital, Sichuan University/West China School of Nursing, Sichuan University, Chengdu, China
- Qi Chai
  - Integrated Care Management Center, Institute of Respiratory Health, West China Hospital, Sichuan University, Chengdu, Sichuan, China
- Lijuan Duan
  - Department of Neurosurgery, West China Hospital, Sichuan University/West China School of Nursing, Sichuan University, Chengdu, China.
4
Arif U, Zhang C, Hussain S, Abbasi AR. An efficient interpretable stacking ensemble model for lung cancer prognosis. Comput Biol Chem 2024; 113:108248. [PMID: 39426256 DOI: 10.1016/j.compbiolchem.2024.108248] [Received: 07/24/2024] [Revised: 09/29/2024] [Accepted: 10/09/2024] [Indexed: 10/21/2024]
Abstract
Lung cancer significantly contributes to global cancer mortality, posing challenges in clinical management. Early detection and accurate prognosis are crucial for improving patient outcomes. This study develops an interpretable stacking ensemble model (SEM) for lung cancer prognosis prediction and identifies key risk factors. Using a Kaggle dataset of 1000 patients with 22 variables, the model classifies prognosis into Low, Medium, and High-risk categories. The bootstrap method was employed for evaluation metrics, while SHAP (Shapley Additive Explanations) and LIME (Local Interpretable Model-agnostic Explanations) assessed model interpretability. Results showed SEM's superior interpretability over traditional models, such as Random Forest, Logistic Regression, Decision Tree, Gradient Boosting Machine, Extreme Gradient Boosting Machine, and Light Gradient Boosting Machine. SEM achieved an accuracy of 98.90 %, precision of 98.70 %, F1 score of 98.85 %, sensitivity of 98.77 %, specificity of 95.45 %, Cohen's kappa value of 94.56 %, and an AUC of 98.10 %. The SEM demonstrated robust performance in lung cancer prognosis, revealing chronic lung cancer and genetic risk as major factors.
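Among the metrics listed above, Cohen's kappa is the least familiar: it measures agreement between predicted and true classes after correcting for agreement expected by chance. A minimal sketch for the three-class setting (Low, Medium, High) follows; the function name and the toy labels are assumptions for illustration, not data from the study.

```python
from collections import Counter


def cohens_kappa(y_true, y_pred):
    """kappa = (p_o - p_e) / (1 - p_e), where p_o is observed agreement and
    p_e is the agreement expected from the marginal class frequencies."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        return 1.0  # degenerate case: a single class on both sides
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical labels: one Medium case is misclassified as High.
y_true = ["Low", "Low", "Medium", "Medium", "High", "High"]
y_pred = ["Low", "Low", "Medium", "High", "High", "High"]
print(cohens_kappa(y_true, y_pred))  # p_o = 5/6, p_e = 1/3, so kappa is 0.75
```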
Affiliation(s)
- Umair Arif
  - School of Mathematics and Statistics, Xi'an Jiaotong University, Xian, Shaanxi 710049, China.
- Chunxia Zhang
  - School of Mathematics and Statistics, Xi'an Jiaotong University, Xian, Shaanxi 710049, China.
- Sajid Hussain
  - School of Mathematics and Statistics, Xi'an Jiaotong University, Xian, Shaanxi 710049, China.
- Abdul Rauf Abbasi
  - Department of Statistics, COMSATS University Islamabad, Lahore Campus, Lahore 5400, Pakistan.
5
Yang C, Hu R, Xiong S, Hong Z, Liu J, Mao Z, Chen M. Development of machine learning-based models for predicting risk factors in acute cerebral infarction patients: a clinical retrospective study. BMC Neurol 2024; 24:306. [PMID: 39217304 PMCID: PMC11365171 DOI: 10.1186/s12883-024-03818-6] [Received: 09/19/2023] [Accepted: 08/22/2024] [Indexed: 09/04/2024] Open
Abstract
OBJECTIVES The aim of this study was to develop machine learning-based models for predicting acute cerebral infarction (ACI) in patients. METHODS We extracted the data of ACI patients and non-ACI patients (as controls) from two hospitals. The Lasso algorithm was employed to select the most crucial features associated with ACI. Models based on five machine learning algorithms were trained with 10-fold cross-validation. Then, the area under the receiver operating characteristic curve (AUC), accuracy, and F1-score were calculated for the trained models. Accordingly, the trained model with excellent performance was selected as the final predictive model. The relative importance of variables was analyzed and ranked. RESULTS A total of 150 patients were diagnosed with ACI (50.00%), with a higher proportion of males than among the non-ACI patients (70.67% vs. 44.00%). The logistic regression model exhibited good performance in predicting ACI in the training set, as evidenced by its highest AUC, accuracy, sensitivity, and F1-score. Furthermore, feature importance analysis showed that blood glucose, gender, smoking history, serum homocysteine, folic acid, and C-reactive protein were the top six crucial variables of the logistic regression. CONCLUSIONS In our work, the ACI risk prediction model developed by logistic regression exhibited excellent performance. This could contribute to the identification of risk variables for ACI patients and enable clinicians to deliver timely and effective interventions.
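The methods above rely on 10-fold cross-validation. As an illustration only, the sketch below generates the train/test index splits for k folds; model fitting is left out. The function name `k_fold_indices` and the sample count of 300 are assumptions (the abstract reports 150 ACI cases as 50.00% of the cohort), and the study's own splitting procedure is not described.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k folds after one shuffle.
    Every sample lands in exactly one test fold."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # near-equal fold sizes
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


for fold_no, (train, test) in enumerate(k_fold_indices(300, k=10), start=1):
    print(f"fold {fold_no}: {len(train)} train / {len(test)} test")
```

Feature selection steps such as Lasso are typically repeated inside each training fold so that the held-out fold does not influence which features are kept.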
Affiliation(s)
- Changqing Yang
  - Department of Hematology, Affiliated Hospital 6 of Nantong University, 02 Xinduxi Road, Yancheng, 224000, China
  - Department of Hematology, Yancheng Third People's Hospital, 02 Xinduxi Road, Yancheng, 224000, China
- Renlin Hu
  - Department of Internal Medicine Neurology, Wuhan Fifth Hospital, 122 Xianzheng Street, Wuhan, 430050, China
- Shilan Xiong
  - Department of Neurology, Affiliated Hospital 6 of Nantong University, 02 Xinduxi Road, Yancheng, 224000, China
  - Department of Neurology, Yancheng Third People's Hospital, 02 Xinduxi Road, Yancheng, 224000, China
- Zhou Hong
  - Department of Internal Medicine Neurology, Wuhan Fifth Hospital, 122 Xianzheng Street, Wuhan, 430050, China
- Jiaqi Liu
  - School of Medicine of Nantong University, 19 Qixiu Road, Nantong, 226000, China
- Zhuqing Mao
  - Department of Neurology, Fushun Central Hospital, 05 Xincheng Road, Jinzhou, 113000, China.
- Mingzhu Chen
  - Department of Neurology, Affiliated Hospital 6 of Nantong University, 02 Xinduxi Road, Yancheng, 224000, China.
  - Department of Neurology, Yancheng Third People's Hospital, 02 Xinduxi Road, Yancheng, 224000, China.
6
Xiaoxue W, Zijun W, Shichen C, Mukun Y, Yi C, Linqing M, Wenpei B. Risk prediction model of metabolic syndrome in perimenopausal women based on machine learning. Int J Med Inform 2024; 188:105480. [PMID: 38754284 DOI: 10.1016/j.ijmedinf.2024.105480] [Received: 04/04/2023] [Revised: 11/24/2023] [Accepted: 05/08/2024] [Indexed: 05/18/2024]
Abstract
INTRODUCTION Metabolic syndrome (MetS) is considered an important parameter of cardio-metabolic health and contributes to the development of atherosclerosis and type 2 diabetes. The incidence of MetS increases significantly in postmenopausal women; therefore, the perimenopausal period is considered a critical phase for prevention. We aimed to use four machine learning methods to predict whether perimenopausal women will develop MetS within 2 years. METHODS Women aged 45-55 years who underwent 2 consecutive years of physical examinations at the Ninth Clinical College of Peking University between January 2021 and December 2022 were included. We extracted 26 features from physical examinations and used a backward selection method to select the top 10 features with the largest area under the receiver operating characteristic curve (AUC). Extreme gradient boosting (XGBoost), random forest (RF), multilayer perceptron (MLP) and logistic regression (LR) were used to establish the models. Their performance was measured by AUC, accuracy, precision, recall and F1 score. SHapley Additive exPlanation (SHAP) values were used to identify risk factors affecting perimenopausal MetS. RESULTS A total of 8,700 women had physical examination records, and 2,254 women finally met the inclusion criteria. For predicting MetS events, RF and XGBoost had the highest AUC (0.96 and 0.95, respectively). XGBoost had the highest F1 value (F1 = 0.77), followed by RF, LR and MLP. SHAP values suggested that the top 5 variables affecting MetS in this study were waist circumference, fasting blood glucose, high-density lipoprotein cholesterol, triglycerides and diastolic blood pressure. CONCLUSION We have developed a targeted MetS risk prediction model for perimenopausal women using health examination data. This model enables early identification of high MetS risk in this group, offering significant benefits for individual health management and wider socio-economic health initiatives.
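The abstract above ranks models by F1 as well as AUC. The sketch below shows how precision, recall, and F1 follow from confusion-matrix counts; the function name and the counts are hypothetical and are not taken from the study.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN);
    F1 is their harmonic mean, equal to 2TP / (2TP + FP + FN)."""
    if tp + fp == 0 or tp + fn == 0:
        raise ValueError("precision or recall is undefined for these counts")
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return precision, recall, f1


# Hypothetical counts: 8 true positives, 2 false positives, 4 false negatives.
precision, recall, f1 = precision_recall_f1(tp=8, fp=2, fn=4)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Because F1 ignores true negatives, a model can have a high AUC and a modest F1 when the positive class is rare, which may help explain why the two rankings above differ.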
Affiliation(s)
- Wang Xiaoxue
  - Department of Obstetrics and Gynecology, Peking University Ninth School of Clinical Medicine, Beijing Shijitan Hospital, Beijing 100038, China
- Wang Zijun
  - Department of Obstetrics and Gynecology, Peking University Ninth School of Clinical Medicine, Beijing Shijitan Hospital, Beijing 100038, China
- Chen Shichen
  - School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
- Yang Mukun
  - Department of Obstetrics and Gynecology, Peking University Ninth School of Clinical Medicine, Beijing Shijitan Hospital, Beijing 100038, China
- Chen Yi
  - Department of Obstetrics and Gynecology, Peking University Ninth School of Clinical Medicine, Beijing Shijitan Hospital, Beijing 100038, China
- Miao Linqing
  - Beijing Advanced Innovation Center for Intelligent Robots and Systems, Beijing Institute of Technology, Beijing 100081, China
- Bai Wenpei
  - Department of Obstetrics and Gynecology, Peking University Ninth School of Clinical Medicine, Beijing Shijitan Hospital, Beijing 100038, China.
7
Lee CC, Su SY, Sung SF. Machine learning-based survival analysis approaches for predicting the risk of pneumonia post-stroke discharge. Int J Med Inform 2024; 186:105422. [PMID: 38518677 DOI: 10.1016/j.ijmedinf.2024.105422] [Received: 01/14/2024] [Revised: 02/25/2024] [Accepted: 03/19/2024] [Indexed: 03/24/2024]
Abstract
BACKGROUND Post-stroke pneumonia (PSP) is common among stroke patients. PSP occurring after hospital discharge continues to increase the risk of poor functional outcomes and death among stroke survivors. Currently, there is no prediction model specifically designed to predict the occurrence of PSP beyond the acute stage of stroke. This study aimed to explore the use of machine learning (ML) methods in predicting the risk of PSP after hospital discharge. METHODS This study analyzed data from 5,754 hospitalized stroke patients. The dataset was randomly divided into a training set and a holdout test set, with a ratio of 80:20. Several clinical and laboratory variables were utilized as predictors and different ML algorithms were employed to model time-to-event data. The ML model's predictive performance was compared to existing risk-scoring systems. A model-agnostic method based on Shapley additive explanations was utilized to interpret the ML model. RESULTS The study found that 5.7% of the study patients experienced pneumonia within one year after discharge. Based on repeated 5-fold cross-validation on the training set, the random survival forest (RSF) model had the highest C-index among the various ML algorithms and traditional Cox regression analysis. The final RSF model achieved a C-index of 0.787 (95% confidence interval: 0.737-0.840) on the holdout test set, outperforming five existing risk-scoring systems. The top three important predictors were the Glasgow Coma Scale score, age, and length of hospital stay. CONCLUSIONS The RSF model demonstrated superior discriminative ability compared to other ML algorithms and traditional Cox regression analysis, suggesting a non-linear relationship between predictors and outcomes. The developed ML model can be integrated into the hospital information system to provide personalized risk assessments.
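The C-index reported above is the survival-analysis counterpart of AUC. The following is a minimal sketch of Harrell's concordance index for right-censored data: among pairs where the earlier time is an observed event, it counts how often the earlier event has the higher predicted risk. The function name and toy data are assumptions, and the study's exact estimator is not specified in the abstract.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: a pair (i, j) is comparable when i had the event and
    times[i] < times[j]; it is concordant when i has the higher risk score.
    Tied risk scores count as 0.5."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up in days; event 1 = pneumonia observed, 0 = censored.
times = [5, 10, 15, 20]
events = [1, 0, 1, 0]
risk_scores = [0.9, 0.4, 0.05, 0.1]
print(concordance_index(times, events, risk_scores))  # 3 of 4 pairs concordant
```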
Affiliation(s)
- Chang-Ching Lee
  - Division of Pulmonary Medicine, Department of Internal Medicine, Ditmanson Medical Foundation Chia-Yi Christian Hospital, Chiayi City, Taiwan
- Sheng-You Su
  - Clinical Medicine Research Center, Department of Medical Research, Ditmanson Medical Foundation Chia-Yi Christian Hospital, Chiayi City, Taiwan
- Sheng-Feng Sung
  - Division of Neurology, Department of Internal Medicine, Ditmanson Medical Foundation Chia-Yi Christian Hospital, Chiayi City, Taiwan; Department of Beauty & Health Care, Min-Hwei Junior College of Health Care Management, Tainan, Taiwan.
8
Bakris G, Lin P(P, Xu C, Chen C, Ashton V, Singhal M. Prediction of cardiovascular and renal risk among patients with apparent treatment-resistant hypertension in the United States using machine learning methods. J Clin Hypertens (Greenwich) 2024; 26:500-513. [PMID: 38523465 PMCID: PMC11088433 DOI: 10.1111/jch.14791] [Received: 09/29/2023] [Revised: 02/08/2024] [Accepted: 02/11/2024] [Indexed: 03/26/2024]
Abstract
Apparent treatment-resistant hypertension (aTRH), defined as blood pressure (BP) that remains uncontrolled despite unconfirmed concurrent treatment with three antihypertensives, is associated with an increased risk of developing cardiovascular and renal complications compared with controlled hypertension. We aimed to identify the characteristics of aTRH patients with an elevated risk of major adverse cardiovascular events plus (MACE+; defined as stroke, myocardial infarction, or heart failure hospitalization) and end stage renal disease (ESRD). This retrospective cohort study included aTRH patients (BP ≥140/90 mmHg and taking ≥3 antihypertensives) from the United States-based Optum® de-identified Electronic Health Record dataset and used machine learning models to identify risk factors of MACE+ or ESRD. Patients had claims for ≥3 antihypertensive classes within 30 days between January 1, 2015 and June 30, 2021, and two office BP measures recorded 1-90 days apart within 30 days to 11 months after the index regimen date. Of a total 18 797 070 patients identified with any hypertension, 71 100 patients had aTRH. During the study period (mean 25.5 months), 4944 (7.0%) patients had a MACE+ and 2403 (3.4%) developed ESRD. In total, 22 risk factors were included in the MACE+ model and 16 in the ESRD model, and most were significantly associated with study outcomes. The risk factors with the largest impact on MACE+ risk were congestive heart failure, stages 4 and 5 chronic kidney disease (CKD), age ≥80 years, and living in the Southern region of the United States. The risk factors with the largest impact on ESRD risk, other than pre-existing CKD, were anemia, congestive heart failure, and type 2 diabetes. The overall study cohort had a 5-year predicted MACE+ risk of 13.4%; this risk was increased in those in the top 50% and 25% high-risk groups (21.2% and 29.5%, respectively). 
The overall study cohort had a predicted 5-year risk of ESRD of 6.8%, which was increased in the top 50% and 25% high-risk groups (10.9% and 17.1%, respectively). We conclude that risk models developed in our study can reliably identify patients with aTRH at risk of MACE+ and ESRD based on information available in electronic health records; such models may be used to identify aTRH patients at high risk of adverse outcomes who may benefit from novel treatment interventions.
Affiliation(s)
- Chang Xu
  - Janssen Scientific Affairs, LLC, Titusville, New Jersey, USA
- Cindy Chen
  - Janssen Scientific Affairs, LLC, Titusville, New Jersey, USA
- Mukul Singhal
  - Janssen Scientific Affairs, LLC, Titusville, New Jersey, USA
9
Lensky A, Lueck C, Suominen H, Jones B, Vlieger R, Ahluwalia T. Explaining predictors of discharge destination assessed along the patients' acute stroke journey. J Stroke Cerebrovasc Dis 2024; 33:107514. [PMID: 38104492 DOI: 10.1016/j.jstrokecerebrovasdis.2023.107514] [Received: 09/09/2023] [Revised: 11/15/2023] [Accepted: 11/26/2023] [Indexed: 12/19/2023] Open
Abstract
INTRODUCTION Accurate prediction of outcome destination at an early stage would help manage patients presenting with stroke. This study assessed the predictive ability of three machine learning (ML) algorithms to predict outcomes at four different stages as well as compared the predictive power of stroke scores. METHODS Patients presenting with acute stroke to the Canberra Hospital between 2015 and 2019 were selected retrospectively. 16 potential predictors and one target variable (discharge destination) were obtained from the notes. k-Nearest Neighbour (kNN) and two ensemble-based classification algorithms (Adaptive Boosting and Bootstrap Aggregation) were employed to predict outcomes. Predictive accuracy was assessed at each of the four stages using both overall and per-class accuracy. The contribution of each variable to the prediction outcome was evaluated by the ensemble-based algorithm and using the Relief feature selection algorithm. Various combinations of stroke scores were tested using the aforementioned models. RESULTS Of the three ML models, Adaptive Boosting demonstrated the highest accuracy (90%) at Stage 4 in predicting death while the highest overall accuracy (81.7%) was achieved by kNN (k=2/City-block distance). Feature importance analysis has shown that the most important features are the 24-hour Scandinavian Stroke Scale (SSS) and 24-hour National Institutes of Health Stroke Scale (NIHSS) scores, dyslipidaemia, hypertension and premorbid mRS score. For the initial and 24-hour scores, there was a higher correlation (0.93) between SSS scores than for NIHSS scores (0.81). Reducing the overall four scores to InitSSS/24hrNIHSS increased accuracy to 95% in predicting death (Adaptive Boosting) and overall accuracy to 85.4% (kNN). Accuracies at Stage 2 (pre-treatment, 11 predictors) were not far behind those at Stage 4. 
CONCLUSION Our findings suggest that even in the early stages of management, a clinically useful prediction regarding discharge destination can be made. Adaptive Boosting might be the best ML model, especially when it comes to predicting death. The predictors' importance analysis also showed that dyslipidaemia and hypertension contributed to the discharge outcome even more than expected. Further, and surprisingly, using mixed score systems might also lead to higher prediction accuracies.
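The best overall accuracy above came from k-Nearest Neighbour with k = 2 and city-block (Manhattan) distance. The sketch below illustrates that classifier under stated assumptions: with k = 2 a tied vote is possible, and here the tie is broken in favour of the closest neighbour, which is a choice made for this example and not one reported in the abstract. The feature values and the labels "home" and "rehab" are hypothetical.

```python
from collections import Counter


def manhattan(a, b):
    """City-block distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_X, train_y, query, k=2):
    """Majority vote among the k nearest training points; a tied vote goes to
    the label of the closest neighbour (assumed tie-break rule)."""
    ranked = sorted(zip(train_X, train_y), key=lambda pair: manhattan(pair[0], query))
    neighbours = ranked[:k]
    votes = Counter(label for _, label in neighbours)
    top = max(votes.values())
    for _, label in neighbours:  # neighbours are ordered nearest first
        if votes[label] == top:
            return label


# Hypothetical two-feature examples (e.g. two stroke scale scores).
train_X = [[1, 1], [2, 1], [8, 9], [9, 8]]
train_y = ["home", "home", "rehab", "rehab"]
print(knn_predict(train_X, train_y, [1.5, 1]))
print(knn_predict(train_X, train_y, [8, 8]))
```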
Affiliation(s)
- Artem Lensky
  - School of Engineering and Technology, The University of New South Wales, Canberra ACT 2600, Australia; School of Biomedical Engineering, The University of Sydney, NSW, Australia.
- Christian Lueck
  - School of Medicine and Psychology, The Australian National University, ACT, Australia
- Hanna Suominen
  - School of Medicine and Psychology, The Australian National University, ACT, Australia; School of Computing, The Australian National University, ACT, Australia; Department of Computing, University of Turku, Finland
- Brett Jones
  - Department of Neurology, Canberra Hospital, ACT, Australia
- Robin Vlieger
  - School of Computing, The Australian National University, ACT, Australia
- Tina Ahluwalia
  - Department of Neurology, Canberra Hospital, ACT, Australia
10
Wu M, Yu K, Zhao Z, Zhu B. Knowledge structure and global trends of machine learning in stroke over the past decade: A scientometric analysis. Heliyon 2024; 10:e24230. [PMID: 38288018 PMCID: PMC10823080 DOI: 10.1016/j.heliyon.2024.e24230] [Received: 04/18/2023] [Revised: 11/23/2023] [Accepted: 01/04/2024] [Indexed: 01/31/2024] Open
Abstract
Objective Machine learning (ML) models have been widely applied in stroke prediction, diagnosis, treatment, and prognosis assessment. We aimed to conduct a comprehensive scientometric analysis of studies related to ML in stroke and reveal its current status, knowledge structure, and global trends. Methods All documents related to ML in stroke were retrieved from the Web of Science database on March 15, 2023. We refined the documents by including only original articles and reviews in the English language. The literature published over the past decade was imported into scientometrics software for influence detection and collaborative network analysis. Results A total of 2389 related publications were included. The annual publication output demonstrated explosive growth, with an average growth rate of 63.99 %. Among the 90 countries/regions involved, the United States (729 articles) and China (636 articles) were the most productive countries. Frontiers in Neurology was the most prolific journal with 94 articles. In all, 234 highly cited articles, each with more than 31 citations, were detected. Keyword analysis revealed a total of 5333 keywords, with a predominant focus on the application of ML models in the early diagnosis, classification, and prediction of "acute ischemic stroke" and "atrial fibrillation-related stroke". The keyword "classification" had the first and longest burst, spanning from 2013 to 2018. "Support vector machine" had the strongest burst strength (6.2). Keywords such as "mechanical thrombectomy", "expression", and "prognosis" experienced bursts in 2022 and have continued to be prominent. Conclusion The applications of ML in stroke are increasingly diverse and extensive, with researchers showing growing interest over the past decade. However, the clinical application of ML in stroke is still in its early stages, and several limitations and challenges need to be addressed for its widespread adoption in clinical practice.
Affiliation(s)
- Mingfen Wu
- Department of Pharmacy, Beijing Tiantan Hospital, Capital Medical University, Beijing, 100070, China
- Kefu Yu
- Department of Pharmacy, Beijing Tiantan Hospital, Capital Medical University, Beijing, 100070, China
- Zhigang Zhao
- Department of Pharmacy, Beijing Tiantan Hospital, Capital Medical University, Beijing, 100070, China
- Bin Zhu
- Department of Pharmacy, Beijing Tiantan Hospital, Capital Medical University, Beijing, 100070, China
|
11
|
Caterson J, Lewin A, Williamson E. The application of explainable artificial intelligence (XAI) in electronic health record research: A scoping review. Digit Health 2024; 10:20552076241272657. [PMID: 39493635 PMCID: PMC11528818 DOI: 10.1177/20552076241272657] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/03/2024] [Accepted: 07/09/2024] [Indexed: 11/05/2024] Open
Abstract
Machine Learning (ML) and Deep Learning (DL) models show potential in surpassing traditional methods, including generalised linear models, for healthcare predictions, particularly with large, complex datasets. However, low interpretability hinders practical implementation. To address this, Explainable Artificial Intelligence (XAI) methods have been proposed, but comprehensive evaluation of their effectiveness is currently limited. The aim of this scoping review is to critically appraise the application of XAI methods in ML/DL models using Electronic Health Record (EHR) data. In accordance with PRISMA scoping review guidelines, the study searched PUBMED and OVID/MEDLINE (including EMBASE) for publications related to tabular EHR data that employed ML/DL models with XAI. Out of 3220 identified publications, 76 were included. The selected publications, published between February 2017 and June 2023, showed an exponential increase in number over time. Extreme Gradient Boosting and Random Forest models were the most frequently used ML/DL methods, with 51 and 50 publications, respectively. Among XAI methods, Shapley Additive Explanations (SHAP) was predominant, appearing in 63 out of 76 publications, followed by partial dependence plots (PDPs) in 11 publications and Locally Interpretable Model-Agnostic Explanations (LIME) in 8 publications. Despite the growing adoption of XAI methods, their applications varied widely and lacked critical evaluation. This review identifies the increasing use of XAI in tabular EHR research and highlights a deficiency in the reporting of methods and a lack of critical appraisal of validity and robustness. The study emphasises the need for further evaluation of XAI methods and underscores the importance of cautious implementation and interpretation in healthcare settings.
Affiliation(s)
- Alexandra Lewin
- London School of Hygiene and Tropical Medicine, Bloomsbury, UK
|
12
|
Zhuo X, Lv J, Chen B, Liu J, Luo Y, Liu J, Xie X, Lu J, Zhao N. Combining conventional ultrasound and ultrasound elastography to predict HER2 status in patients with breast cancer. Front Physiol 2023; 14:1188502. [PMID: 37501928 PMCID: PMC10369848 DOI: 10.3389/fphys.2023.1188502] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/17/2023] [Accepted: 06/30/2023] [Indexed: 07/29/2023] Open
Abstract
Introduction: Identifying the HER2 status of breast cancer patients is important for selecting treatment options. Previous studies have shown that ultrasound features are closely related to the subtype of breast cancer. Methods: In this study, we used features of conventional ultrasound and ultrasound elastography to predict HER2 status. Results and Discussion: The performance (AUROC) of the model using features of both conventional ultrasound and ultrasound elastography was higher than that of the model using features of conventional ultrasound alone (0.82 vs. 0.53). The SHAP method was used to explore the interpretability of the models. Compared with HER2- tumors, HER2+ tumors usually have greater elastic modulus parameters and more microcalcifications. Therefore, we concluded that combining the features of conventional ultrasound with ultrasound elastography could improve the accuracy of predicting HER2 status.
Affiliation(s)
- Xiaoying Zhuo
- Ultrasound Medicine Department of the Affiliated Hospital of Xuzhou Medical University, Xuzhou, China
- Medical Imaging College of Xuzhou Medical University, Xuzhou, China
- Ji Lv
- Emergency Medicine Department of the Affiliated Hospital of Xuzhou Medical University, Xuzhou, China
- College of Computer Science and Technology, Jilin University, Changchun, China
- Binjie Chen
- Emergency Medicine Department of the Affiliated Hospital of Xuzhou Medical University, Xuzhou, China
- Jia Liu
- Pathology Department of the Affiliated Hospital of Xuzhou Medical University, Xuzhou, China
- Yujie Luo
- Ultrasound Medicine Department of the Affiliated Hospital of Xuzhou Medical University, Xuzhou, China
- Jie Liu
- Ultrasound Medicine Department of the Affiliated Hospital of Xuzhou Medical University, Xuzhou, China
- Xiaowei Xie
- Ultrasound Medicine Department of the Affiliated Hospital of Xuzhou Medical University, Xuzhou, China
- Jiao Lu
- Ultrasound Medicine Department of the Affiliated Hospital of Xuzhou Medical University, Xuzhou, China
- Ningjun Zhao
- Emergency Medicine Department of the Affiliated Hospital of Xuzhou Medical University, Xuzhou, China
- Laboratory of Emergency Medicine, Second Clinical Medical College of Xuzhou Medical University, Xuzhou, China
|