Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Abedi V, Li J, Shivakumar MK, Avula V, Chaudhary DP, Shellenberger MJ, Khara HS, Zhang Y, Lee MTM, Wolk DM, Yeasin M, Hontecillas R, Bassaganya-Riera J, Zand R. Increasing the Density of Laboratory Measures for Machine Learning Applications. J Clin Med 2020;10:E103. [PMID: 33396741 DOI: 10.3390/jcm10010103] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/25/2020] [Revised: 12/23/2020] [Accepted: 12/25/2020] [Indexed: 12/12/2022] Open

Number

Cited by Other Article(s)

Abedi V, Misra D, Chaudhary D, Avula V, Schirmer CM, Li J, Zand R. Machine Learning-Based Prediction of Stroke in Emergency Departments. Ther Adv Neurol Disord 2024;17:17562864241239108. [PMID: 38572394 PMCID: PMC10989051 DOI: 10.1177/17562864241239108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/09/2023] [Accepted: 02/07/2024] [Indexed: 04/05/2024] Open

Abedi V, Lambert C, Chaudhary D, Rieder E, Avula V, Hwang W, Li J, Zand R. Defining the Age of Young Ischemic Stroke Using Data-Driven Approaches. J Clin Med 2023;12:jcm12072600. [PMID: 37048683 PMCID: PMC10095415 DOI: 10.3390/jcm12072600] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/22/2023] [Revised: 03/15/2023] [Accepted: 03/21/2023] [Indexed: 04/03/2023] Open

Abstract Introduction: The cut-point for defining the age of young ischemic stroke (IS) is clinically and epidemiologically important, yet it is arbitrary and differs across studies. In this study, we leveraged electronic health records (EHRs) and data science techniques to estimate an optimal cut-point for defining the age of young IS. Methods: Patient-level EHRs were extracted from 13 hospitals in Pennsylvania, and used in two parallel approaches. The first approach included ICD9/10, from IS patients to group comorbidities, and computed similarity scores between every patient pair. We determined the optimal age of young IS by analyzing the trend of patient similarity with respect to their clinical profile for different ages of index IS. The second approach used the IS cohort and control (without IS), and built three sets of machine-learning models—generalized linear regression (GLM), random forest (RF), and XGBoost (XGB)—to classify patients for seventeen age groups. After extracting feature importance from the models, we determined the optimal age of young IS by analyzing the pattern of comorbidity with respect to the age of index IS. Both approaches were completed separately for male and female patients. Results: The stroke cohort contained 7555 ISs, and the control included 31,067 patients. In the first approach, the optimal age of young stroke was 53.7 and 51.0 years in female and male patients, respectively. In the second approach, we created 102 models, based on three algorithms, 17 age brackets, and two sexes. The optimal age was 53 (GLM), 52 (RF), and 54 (XGB) for female, and 52 (GLM and RF) and 53 (RF) for male patients. Different age and sex groups exhibited different comorbidity patterns. Discussion: Using a data-driven approach, we determined the age of young stroke to be 54 years for women and 52 years for men in our mainly rural population, in central Pennsylvania. Future validation studies should include more diverse populations. Collapse

Abedi V, Razavi SM, Khan A, Avula V, Tompe A, Poursoroush A, Vafaei Sadr A, Li J, Zand R. Artificial Intelligence: A Shifting Paradigm in Cardio-Cerebrovascular Medicine. J Clin Med 2021;10:5710. [PMID: 34884412 DOI: 10.3390/jcm10235710] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/15/2021] [Accepted: 12/02/2021] [Indexed: 12/21/2022] Open

Li J, Yan XS, Chaudhary D, Avula V, Mudiganti S, Husby H, Shahjouei S, Afshar A, Stewart WF, Yeasin M, Zand R, Abedi V. Imputation of missing values for electronic health record laboratory data. NPJ Digit Med 2021;4:147. [PMID: 34635760 DOI: 10.1038/s41746-021-00518-0] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/01/2021] [Accepted: 09/13/2021] [Indexed: 11/08/2022] Open

Abedi V, Avula V, Razavi SM, Bavishi S, Chaudhary D, Shahjouei S, Wang M, Griessenauer CJ, Li J, Zand R. Predicting short and long-term mortality after acute ischemic stroke using EHR. J Neurol Sci 2021;427:117560. [PMID: 34218182 DOI: 10.1016/j.jns.2021.117560] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/22/2021] [Revised: 06/21/2021] [Accepted: 06/25/2021] [Indexed: 12/14/2022]

Darabi N, Hosseinichimeh N, Noto A, Zand R, Abedi V. Machine Learning-Enabled 30-Day Readmission Model for Stroke Patients. Front Neurol 2021;12:638267. [PMID: 33868147 PMCID: PMC8044392 DOI: 10.3389/fneur.2021.638267] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/05/2020] [Accepted: 03/08/2021] [Indexed: 11/18/2022] Open

Abstract

Background and Purpose: Hospital readmissions impose a substantial burden on the healthcare system. Reducing readmissions after stroke could lead to improved quality of care especially since stroke is associated with a high rate of readmission. The goal of this study is to enhance our understanding of the predictors of 30-day readmission after ischemic stroke and develop models to identify high-risk individuals for targeted interventions. Methods: We used patient-level data from electronic health records (EHR), five machine learning algorithms (random forest, gradient boosting machine, extreme gradient boosting-XGBoost, support vector machine, and logistic regression-LR), data-driven feature selection strategy, and adaptive sampling to develop 15 models of 30-day readmission after ischemic stroke. We further identified important clinical variables. Results: We included 3,184 patients with ischemic stroke (mean age: 71 ± 13.90 years, men: 51.06%). Among the 61 clinical variables included in the model, the National Institutes of Health Stroke Scale score above 24, insert indwelling urinary catheter, hypercoagulable state, and percutaneous gastrostomy had the highest importance score. The Model's AUC (area under the curve) for predicting 30-day readmission was 0.74 (95%CI: 0.64-0.78) with PPV of 0.43 when the XGBoost algorithm was used with ROSE-sampling. The balance between specificity and sensitivity improved through the sampling strategy. The best sensitivity was achieved with LR when optimized with feature selection and ROSE-sampling (AUC: 0.64, sensitivity: 0.53, specificity: 0.69). Conclusions: Machine learning-based models can be designed to predict 30-day readmission after stroke using structured data from EHR. Among the algorithms analyzed, XGBoost with ROSE-sampling had the best performance in terms of AUC while LR with ROSE-sampling and feature selection had the best sensitivity. Clinical variables highly associated with 30-day readmission could be targeted for personalized interventions. Depending on healthcare systems' resources and criteria, models with optimized performance metrics can be implemented to improve outcomes.

Collapse

Abedi V, Avula V, Chaudhary D, Shahjouei S, Khan A, Griessenauer CJ, Li J, Zand R. Prediction of Long-Term Stroke Recurrence Using Machine Learning Models. J Clin Med 2021;10:1286. [PMID: 33804724 DOI: 10.3390/jcm10061286] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/30/2021] [Revised: 03/15/2021] [Accepted: 03/16/2021] [Indexed: 01/01/2023] Open

Abstract

Background: The long-term risk of recurrent ischemic stroke, estimated to be between 17% and 30%, cannot be reliably assessed at an individual level. Our goal was to study whether machine-learning can be trained to predict stroke recurrence and identify key clinical variables and assess whether performance metrics can be optimized. Methods: We used patient-level data from electronic health records, six interpretable algorithms (Logistic Regression, Extreme Gradient Boosting, Gradient Boosting Machine, Random Forest, Support Vector Machine, Decision Tree), four feature selection strategies, five prediction windows, and two sampling strategies to develop 288 models for up to 5-year stroke recurrence prediction. We further identified important clinical features and different optimization strategies. Results: We included 2091 ischemic stroke patients. Model area under the receiver operating characteristic (AUROC) curve was stable for prediction windows of 1, 2, 3, 4, and 5 years, with the highest score for the 1-year (0.79) and the lowest score for the 5-year prediction window (0.69). A total of 21 (7%) models reached an AUROC above 0.73 while 110 (38%) models reached an AUROC greater than 0.7. Among the 53 features analyzed, age, body mass index, and laboratory-based features (such as high-density lipoprotein, hemoglobin A1c, and creatinine) had the highest overall importance scores. The balance between specificity and sensitivity improved through sampling strategies. Conclusion: All of the selected six algorithms could be trained to predict the long-term stroke recurrence and laboratory-based variables were highly associated with stroke recurrence. The latter could be targeted for personalized interventions. Model performance metrics could be optimized, and models can be implemented in the same healthcare system as intelligent decision support for targeted intervention.

Collapse

Misra D, Avula V, Wolk DM, Farag HA, Li J, Mehta YB, Sandhu R, Karunakaran B, Kethireddy S, Zand R, Abedi V. Early Detection of Septic Shock Onset Using Interpretable Machine Learners. J Clin Med 2021;10:301. [PMID: 33467539 DOI: 10.3390/jcm10020301] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/12/2020] [Revised: 12/31/2020] [Accepted: 01/12/2021] [Indexed: 02/06/2023] Open

Abstract

BACKGROUND

Developing a decision support system based on advances in machine learning is one area for strategic innovation in healthcare. Predicting a patient's progression to septic shock is an active field of translational research. The goal of this study was to develop a working model of a clinical decision support system for predicting septic shock in an acute care setting for up to 6 h from the time of admission in an integrated healthcare setting.

METHOD

Clinical data from Electronic Health Record (EHR), at encounter level, were used to build a predictive model for progression from sepsis to septic shock up to 6 h from the time of admission; that is, T = 1, 3, and 6 h from admission. Eight different machine learning algorithms (Random Forest, XGBoost, C5.0, Decision Trees, Boosted Logistic Regression, Support Vector Machine, Logistic Regression, Regularized Logistic, and Bayes Generalized Linear Model) were used for model development. Two adaptive sampling strategies were used to address the class imbalance. Data from two sources (clinical and billing codes) were used to define the case definition (septic shock) using the Centers for Medicare & Medicaid Services (CMS) Sepsis criteria. The model assessment was performed using Area under Receiving Operator Characteristics (AUROC), sensitivity, and specificity. Model predictions for each feature window (1, 3 and 6 h from admission) were consolidated.

RESULTS

Retrospective data from April 2005 to September 2018 were extracted from the EHR, Insurance Claims, Billing, and Laboratory Systems to create a dataset for septic shock detection. The clinical criteria and billing information were used to label patients into two classes-septic shock patients and sepsis patients at three different time points from admission, creating two different case-control cohorts. Data from 45,425 unique in-patient visits were used to build 96 prediction models comparing clinical-based definition versus billing-based information as the gold standard. Of the 24 consolidated models (based on eight machine learning algorithms and three feature windows), four models reached an AUROC greater than 0.9. Overall, all the consolidated models reached an AUROC of at least 0.8820 or higher. Based on the AUROC of 0.9483, the best model was based on Random Forest, with a sensitivity of 83.9% and specificity of 88.1%. The sepsis detection window at 6 h outperformed the 1 and 3-h windows. The sepsis definition based on clinical variables had improved performance when compared to the sepsis definition based on only billing information.

CONCLUSION

This study corroborated that machine learning models can be developed to predict septic shock using clinical and administrative data. However, the use of clinical information to define septic shock outperformed models developed based on only administrative data. Intelligent decision support tools can be developed and integrated into the EHR and improve clinical outcomes and facilitate the optimization of resources in real-time.

Collapse