1
Silva NCD, Albertini MK, Backes AR, Pena GDG. Machine learning for hospital readmission prediction in pediatric population. Computer Methods and Programs in Biomedicine 2024; 244:107980. [PMID: 38134648 DOI: 10.1016/j.cmpb.2023.107980] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/10/2023] [Revised: 10/31/2023] [Accepted: 12/12/2023] [Indexed: 12/24/2023]
Abstract
BACKGROUND AND OBJECTIVE Pediatric readmissions are a burden on patients, families, and the healthcare system. To identify patients at higher readmission risk, more accurate techniques, such as machine learning (ML), could be a good strategy to expand the knowledge in this area. The aim of this study was to develop predictive models capable of identifying children and adolescents at high risk of potentially avoidable 30-day readmission using ML. METHODS A retrospective cohort study was carried out with 9,080 patients under 18 years old admitted to a tertiary university hospital. Demographic, clinical, and biochemical data were collected from electronic databases. We randomly divided the dataset into training (75%) and testing (25%) sets, applied downsampling and repeated cross-validation with five folds and ten repetitions, and optimized the hyperparameters of each technique using a grid search via racing with ANOVA models. We applied six ML classification algorithms to build the predictive models, including classification and regression tree (CART), random forest (RF), gradient boosting machine (GBM), extreme gradient boosting (XGBoost), decision tree and logistic regression (LR). The area under the receiver operating characteristic curve (AUC), sensitivity, specificity, Youden's J-index and accuracy were used to evaluate the performance of each model. RESULTS The avoidable 30-day hospital readmission rate was 9.5%. Some algorithms, such as XGBoost, RF, GBM and CART, presented similar AUC in both the training and the testing datasets. Considering Youden's J-index, the algorithm that presented the best index was XGBoost with bagging imputation, with an AUC of 0.814 (J-index of 0.484). Cancer diagnosis, age, red blood cells, leukocytes, red cell distribution width and sodium levels, elective admission, and multimorbidity were the most important characteristics to classify between readmission and non-readmission groups.
CONCLUSION Machine learning approaches, especially XGBoost, can predict potentially avoidable 30-day pediatric hospital readmission in tertiary care. If implemented in the hospital computer system, our model can help in the early and more accurate identification of patients at readmission risk, targeting strategic health interventions.
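Note on the J-index used in the abstract above: Youden's J is sensitivity plus specificity minus one. The following is a minimal Python sketch of that definition and of picking the score threshold that maximizes it. The function names `youden_j` and `best_threshold` and the toy data are illustrative assumptions and are not taken from the cited study.

```python
def youden_j(tp, fn, tn, fp):
    """Youden's J = sensitivity + specificity - 1, from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1.0


def best_threshold(scores, labels):
    """Return (threshold, J) for the candidate threshold with the highest J.

    A case is predicted positive when its score is >= threshold.
    Assumes labels contain both classes (1 = readmitted, 0 = not readmitted).
    Ties keep the lowest threshold.
    """
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        j = youden_j(tp, fn, tn, fp)
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


# Hypothetical predicted risks and outcomes, for illustration only.
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
threshold, j_index = best_threshold(toy_scores, toy_labels)
```

J ranges from -1 to 1; a value of 0 corresponds to a classifier no better than chance at that threshold.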
Affiliation(s)
- Nayara Cristina da Silva
- Graduate Program in Health Sciences, Federal University of Uberlandia, Uberlandia, Minas Gerais, Brazil, Pará Av, 1720, Campus Umuarama, Uberlândia, Minas Gerais 38400-902, Brazil
- Marcelo Keese Albertini
- School of Computer Science, Federal University of Uberlandia, Uberlandia, Minas Gerais 38408-100, Brazil
- André Ricardo Backes
- Department of Computing, Federal University of Sao Carlos, Sao Carlos, São Paulo 13565-905, Brazil
- Geórgia das Graças Pena
- Graduate Program in Health Sciences, Federal University of Uberlandia, Uberlandia, Minas Gerais, Brazil, Pará Av, 1720, Campus Umuarama, Uberlândia, Minas Gerais 38400-902, Brazil.
2
Goodman DM, Casale MT, Rychlik K, Carroll MS, Auger KA, Smith TL, Cartland J, Davis MM. Development and Validation of an Integrated Suite of Prediction Models for All-Cause 30-Day Readmissions of Children and Adolescents Aged 0 to 18 Years. JAMA Netw Open 2022; 5:e2241513. [PMID: 36367725 PMCID: PMC9652755 DOI: 10.1001/jamanetworkopen.2022.41513] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open
Abstract
IMPORTANCE Readmission is often considered a hospital quality measure, yet no validated risk prediction models exist for children. OBJECTIVE To develop and validate a tool identifying patients before hospital discharge who are at risk for subsequent readmission, applicable to all ages. DESIGN, SETTING, AND PARTICIPANTS This population-based prognostic analysis used electronic health record-derived data from a freestanding children's hospital from January 1, 2016, to December 31, 2019. All-cause 30-day readmission was modeled using 3 years of discharge data. Data were analyzed from June 1 to November 30, 2021. MAIN OUTCOMES AND MEASURES Three models were derived as a complementary suite to include (1) children 6 months or older with 1 or more prior hospitalizations within the last 6 months (recent admission model [RAM]), (2) children 6 months or older with no prior hospitalizations in the last 6 months (new admission model [NAM]), and (3) children younger than 6 months (young infant model [YIM]). Generalized mixed linear models were used for all analyses. Models were validated using an additional year of discharges. RESULTS The derivation set contained 29 988 patients with 48 019 hospitalizations; 50.1% of these admissions were for children younger than 5 years and 54.7% were boys. In the derivation set, 4878 of 13 490 admissions (36.2%) in the RAM cohort, 2044 of 27 531 (7.4%) in the NAM cohort, and 855 of 6998 (12.2%) in the YIM cohort were followed within 30 days by a readmission. In the RAM cohort, prior utilization, current or prior procedures indicative of severity of illness (transfusion, ventilation, or central venous catheter), commercial insurance, and prolonged length of stay (LOS) were associated with readmission. In the NAM cohort, procedures, prolonged LOS, and emergency department visit in the past 6 months were associated with readmission. In the YIM cohort, LOS, prior visits, and critical procedures were associated with readmission. 
The area under the receiver operating characteristics curve was 83.1 (95% CI, 82.4-83.8) for the RAM cohort, 76.1 (95% CI, 75.0-77.2) for the NAM cohort, and 80.3 (95% CI, 78.8-81.9) for the YIM cohort. CONCLUSIONS AND RELEVANCE In this prognostic study, the suite of 3 prediction models had acceptable to excellent discrimination for children. These models may allow future improvements in tailored discharge preparedness to prevent high-risk readmissions.
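Note on the three-model suite described above: the abstract defines the cohorts by age and by hospitalizations in the prior 6 months. A minimal Python sketch of that routing rule follows. The function name `assign_cohort` and its parameter names are hypothetical, and the study's actual implementation is not described in the abstract.

```python
def assign_cohort(age_months, prior_admissions_6mo):
    """Route a discharge to one of the three models named in the abstract.

    YIM: children younger than 6 months.
    RAM: children 6 months or older with 1 or more hospitalizations
         within the last 6 months.
    NAM: children 6 months or older with no hospitalizations
         in the last 6 months.
    """
    if age_months < 6:
        return "YIM"
    if prior_admissions_6mo >= 1:
        return "RAM"
    return "NAM"


# Hypothetical patients, for illustration only.
examples = [assign_cohort(3, 2), assign_cohort(24, 1), assign_cohort(24, 0)]
```

Because the age check comes first, a young infant with prior admissions is still routed to the young infant model, which matches the cohort definitions as stated.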
Affiliation(s)
- Denise M. Goodman
- Division of Critical Care Medicine, Ann & Robert H. Lurie Children’s Hospital of Chicago, Chicago, Illinois
- Department of Pediatrics, Northwestern University Feinberg School of Medicine, Chicago, Illinois
- Mia T. Casale
- Data Analytics and Reporting, Ann & Robert H. Lurie Children’s Hospital of Chicago, Chicago, Illinois
- Karen Rychlik
- Department of Pediatrics, Northwestern University Feinberg School of Medicine, Chicago, Illinois
- Biostatistics Research Core, Stanley Manne Children’s Research Institute, Ann & Robert H. Lurie Children’s Hospital of Chicago, Chicago, Illinois
- Currently serving as an independent consultant
- Michael S. Carroll
- Department of Pediatrics, Northwestern University Feinberg School of Medicine, Chicago, Illinois
- Data Analytics and Reporting, Ann & Robert H. Lurie Children’s Hospital of Chicago, Chicago, Illinois
- Katherine A. Auger
- Division of Hospital Medicine, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio
- Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, Ohio
- Tracie L. Smith
- Data Analytics and Reporting, Ann & Robert H. Lurie Children’s Hospital of Chicago, Chicago, Illinois
- Jenifer Cartland
- Department of Pediatrics, Northwestern University Feinberg School of Medicine, Chicago, Illinois
- Data Analytics and Reporting, Ann & Robert H. Lurie Children’s Hospital of Chicago, Chicago, Illinois
- Mary Ann & J. Milburn Smith Child Health Outcomes, Research, and Evaluation Center, Stanley Manne Children’s Research Institute, Ann & Robert H. Lurie Children’s Hospital of Chicago, Chicago, Illinois
- Currently retired
- Matthew M. Davis
- Department of Pediatrics, Northwestern University Feinberg School of Medicine, Chicago, Illinois
- Mary Ann & J. Milburn Smith Child Health Outcomes, Research, and Evaluation Center, Stanley Manne Children’s Research Institute, Ann & Robert H. Lurie Children’s Hospital of Chicago, Chicago, Illinois
- Division of Advanced General Pediatrics and Primary Care, Ann & Robert H. Lurie Children’s Hospital of Chicago, Chicago, Illinois
- Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois
- Department of Medical Social Sciences, Northwestern University Feinberg School of Medicine, Chicago, Illinois
- Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois
3
Wang S, Zhu X. Predictive Modeling of Hospital Readmission: Challenges and Solutions. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:2975-2995. [PMID: 34133285 DOI: 10.1109/tcbb.2021.3089682] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/12/2023]
Abstract
Hospital readmission prediction is a study to learn models from historical medical data to predict probability of a patient returning to hospital in a certain period, e.g. 30 or 90 days, after the discharge. The motivation is to help health providers deliver better treatment and post-discharge strategies, lower the hospital readmission rate, and eventually reduce the medical costs. Due to inherent complexity of diseases and healthcare ecosystems, modeling hospital readmission is facing many challenges. By now, a variety of methods have been developed, but existing literature fails to deliver a complete picture to answer some fundamental questions, such as what are the main challenges and solutions in modeling hospital readmission; what are typical features/models used for readmission prediction; how to achieve meaningful and transparent predictions for decision making; and what are possible conflicts when deploying predictive approaches for real-world usages. In this paper, we systematically review computational models for hospital readmission prediction, and propose a taxonomy of challenges featuring four main categories: (1) data variety and complexity; (2) data imbalance, locality and privacy; (3) model interpretability; and (4) model implementation. The review summarizes methods in each category, and highlights technical solutions proposed to address the challenges. In addition, a review of datasets and resources available for hospital readmission modeling also provides firsthand materials to support researchers and practitioners to design new approaches for effective and efficient hospital readmission prediction.
4
Abdulaal MJ, Mehedi IM, Aljohani AJ, Milyani AH, Mahmoud M, Abusorrah AM, Jannat R. Separation of Different Blogs from Skin Disease Data using Artificial Intelligence. Computational Intelligence and Neuroscience 2022; 2022:7538643. [PMID: 36052051 PMCID: PMC9427218 DOI: 10.1155/2022/7538643] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2022] [Revised: 07/20/2022] [Accepted: 07/25/2022] [Indexed: 11/23/2022]
Abstract
A combination of environmental conditions may cause skin illness everywhere on the earth, and it is one of the most dangerous diseases that can develop as a result. A major goal in the selection of characteristics is to produce predictions about skin disease instances in connection with influencing variables, which is one of the most important tasks. As a consequence of the widespread usage of sensors, the amount of data collected in the health industry is disproportionately large when compared to data collected in other sectors. In the past, researchers have used a variety of machine learning algorithms to determine the relationship between illnesses and other disorders. Forecasting is a procedure that involves many steps, the most important of which are the preprocessing of any scenario and the selection of forecasting features. A major disadvantage of working in the health industry is a lack of data availability, which is particularly problematic when data is provided in an unstructured format. Filling in missing values and converting between various types of data take somewhat more than 70% of the total time. When dealing with missing data in machine learning applications, the mean, average, and median, as well as the stand mechanism, may all be employed to solve the problem. Previous research has shown that the characteristics chosen for a model may influence its overall performance. One of the primary goals of this study is to develop an intelligent algorithm for identifying relevant traits in models while simultaneously eliminating nonsignificant attributes that have an impact on model performance. To present a full view of the data, artificial intelligence techniques such as SVM, decision tree, and logistic regression models were used in conjunction with three separate feature combination methodologies, each of which was developed independently.
As a consequence of this, their accuracy, F-measure, and precision are all raised by a factor of ten, respectively. We then have a list of the most important features, together with the weights that have been allocated to each of them.
Affiliation(s)
- Mohammed J. Abdulaal
- Department of Electrical and Computer Engineering (ECE), King Abdulaziz University, Jeddah, Saudi Arabia
- Center of Excellence in Intelligent Engineering Systems (CEIES), King Abdulaziz University, Jeddah, Saudi Arabia
- Ibrahim M. Mehedi
- Department of Electrical and Computer Engineering (ECE), King Abdulaziz University, Jeddah, Saudi Arabia
- Center of Excellence in Intelligent Engineering Systems (CEIES), King Abdulaziz University, Jeddah, Saudi Arabia
- Abdulah Jeza Aljohani
- Department of Electrical and Computer Engineering (ECE), King Abdulaziz University, Jeddah, Saudi Arabia
- Center of Excellence in Intelligent Engineering Systems (CEIES), King Abdulaziz University, Jeddah, Saudi Arabia
- Ahmad H. Milyani
- Department of Electrical and Computer Engineering (ECE), King Abdulaziz University, Jeddah, Saudi Arabia
- Mohamed Mahmoud
- Electrical and Engineering Department, Tennessee Technological University, Cookeville, TN, USA
- Abdullah M. Abusorrah
- Department of Electrical and Computer Engineering (ECE), King Abdulaziz University, Jeddah, Saudi Arabia
- Rahtul Jannat
- Department of Electrical and Electronic Engineering, BRAC University, Dhaka, Bangladesh
5
A novel method for prediction of skin disease through supervised classification techniques. Soft Comput 2022. [DOI: 10.1007/s00500-022-07435-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/15/2022]
6
Gopukumar D, Ghoshal A, Zhao H. A Machine Learning Approach for Predicting Readmission Charges Billed by Hospitals. JMIR Med Inform 2022; 10:e37578. [PMID: 35896038 PMCID: PMC9472041 DOI: 10.2196/37578] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 02/25/2022] [Revised: 05/02/2022] [Accepted: 07/26/2022] [Indexed: 11/29/2022] Open
Abstract
Background The Centers for Medicare and Medicaid Services projects that health care costs will continue to grow over the next few years. Rising readmission costs contribute significantly to increasing health care costs. Multiple areas of health care, including readmissions, have benefited from the application of various machine learning algorithms in several ways. Objective We aimed to identify suitable models for predicting readmission charges billed by hospitals. Our literature review revealed that this application of machine learning is underexplored. We used various predictive methods, ranging from glass-box models (such as regularization techniques) to black-box models (such as deep learning–based models). Methods We defined readmissions as readmission with the same major diagnostic category (RSDC) and all-cause readmission category (RADC). For these readmission categories, 576,701 and 1,091,580 individuals, respectively, were identified from the Nationwide Readmission Database of the Healthcare Cost and Utilization Project by the Agency for Healthcare Research and Quality for 2013. Linear regression, lasso regression, elastic net, ridge regression, eXtreme gradient boosting (XGBoost), and a deep learning model based on multilayer perceptron (MLP) were the 6 machine learning algorithms we tested for RSDC and RADC through 10-fold cross-validation. Results Our preliminary analysis using a data-driven approach revealed that within RADC, the subsequent readmission charge billed per patient was higher than the previous charge for 541,090 individuals, and this number was 319,233 for RSDC. The top 3 major diagnostic categories (MDCs) for such instances were the same for RADC and RSDC. The average readmission charge billed was higher than the previous charge for 21 of the MDCs in the case of RSDC, whereas it was only for 13 of the MDCs in RADC. We recommend XGBoost and the deep learning model based on MLP for predicting readmission charges. 
The following performance metrics were obtained for XGBoost: (1) RADC (mean absolute percentage error [MAPE]=3.121%; root mean squared error [RMSE]=0.414; mean absolute error [MAE]=0.317; root relative squared error [RRSE]=0.410; relative absolute error [RAE]=0.399; normalized RMSE [NRMSE]=0.040; mean absolute deviation [MAD]=0.031) and (2) RSDC (MAPE=3.171%; RMSE=0.421; MAE=0.321; RRSE=0.407; RAE=0.393; NRMSE=0.041; MAD=0.031). The performance obtained for MLP-based deep neural networks are as follows: (1) RADC (MAPE=3.103%; RMSE=0.413; MAE=0.316; RRSE=0.410; RAE=0.397; NRMSE=0.040; MAD=0.031) and (2) RSDC (MAPE=3.202%; RMSE=0.427; MAE=0.326; RRSE=0.413; RAE=0.399; NRMSE=0.041; MAD=0.032). Repeated measures ANOVA revealed that the mean RMSE differed significantly across models with P<.001. Post hoc tests using the Bonferroni correction method indicated that the mean RMSE of the deep learning/XGBoost models was statistically significantly (P<.001) lower than that of all other models, namely linear regression/elastic net/lasso/ridge regression. Conclusions Models built using XGBoost and MLP are suitable for predicting readmission charges billed by hospitals. The MDCs allow models to accurately predict hospital readmission charges.
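Note on the error metrics reported above: several of them have standard textbook definitions, sketched below in Python. This is an illustrative sketch and not the authors' code. The function name `regression_errors` and the toy values are assumptions, and NRMSE and MAD are omitted because the abstract does not state which normalization or deviation definition was used.

```python
import math


def regression_errors(actual, predicted):
    """Compute common regression error metrics from paired values.

    MAE  = mean |a - p|
    RMSE = sqrt(mean (a - p)^2)
    MAPE = 100 * mean |a - p| / |a|   (requires nonzero actual values)
    RAE  = sum |a - p| / sum |a - mean(a)|
    RRSE = sqrt(sum (a - p)^2 / sum (a - mean(a))^2)
    """
    n = len(actual)
    mean_actual = sum(actual) / n
    abs_err = [abs(a - p) for a, p in zip(actual, predicted)]
    sq_err = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return {
        "MAE": sum(abs_err) / n,
        "RMSE": math.sqrt(sum(sq_err) / n),
        "MAPE": 100.0 * sum(e / abs(a) for e, a in zip(abs_err, actual)) / n,
        "RAE": sum(abs_err) / sum(abs(a - mean_actual) for a in actual),
        "RRSE": math.sqrt(
            sum(sq_err) / sum((a - mean_actual) ** 2 for a in actual)
        ),
    }


# Hypothetical charges and predictions, for illustration only.
metrics = regression_errors([2.0, 4.0, 6.0], [3.0, 4.0, 5.0])
```

RAE and RRSE compare the model against a baseline that always predicts the mean of the actual values, so values below 1 indicate an improvement over that baseline.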
Affiliation(s)
- Deepika Gopukumar
- Department of Health and Clinical Outcomes Research, School of Medicine, Saint Louis University, SALUS Center, 3545 Lafayette Ave., 4th floor, Room 409 B, St. Louis, US
- Abhijeet Ghoshal
- Department of Business Administration, Gies College of Business, University of Illinois Urbana-Champaign, Champaign, US
- Huimin Zhao
- Sheldon B. Lubar College of Business, University of Wisconsin-Milwaukee, Milwaukee, US
7
Xie J, Zhang B, Ma J, Zeng D, Lo-Ciganic J. Readmission Prediction for Patients with Heterogeneous Medical History: A Trajectory-Based Deep Learning Approach. ACM Transactions on Management Information Systems 2022. [DOI: 10.1145/3468780] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/24/2022]
Abstract
Hospital readmission refers to the situation where a patient is re-hospitalized with the same primary diagnosis within a specific time interval after discharge. Hospital readmission causes $26 billion preventable expenses to the U.S. health systems annually and often indicates suboptimal patient care. To alleviate those severe financial and health consequences, it is crucial to proactively predict patients’ readmission risk. Such prediction is challenging because the evolution of patients’ medical history is dynamic and complex. The state-of-the-art studies apply statistical models which use static predictors in a period, failing to consider patients’ heterogeneous medical history. Our approach, Trajectory-BAsed DEep Learning (TADEL), is motivated to tackle the deficiencies of the existing approaches by capturing dynamic medical history. We evaluate TADEL on a five-year national Medicare claims dataset including 3.6 million patients per year over all hospitals in the United States, reaching an F1 score of 87.3% and an AUC of 88.4%. Our approach significantly outperforms all the state-of-the-art methods. Our findings suggest that health status factors and insurance coverage are important predictors for readmission. This study contributes to IS literature and analytical methodology by formulating the trajectory-based readmission prediction problem and developing a novel deep-learning-based readmission risk prediction framework. From a health IT perspective, this research delivers implementable methods to assess patients’ readmission risk and take early interventions to avoid potential negative consequences.
Affiliation(s)
- Jiaheng Xie
- Lerner College of Business & Economics, University of Delaware, Newark, DE, USA
- Bin Zhang
- Eller College of Management, University of Arizona, Tucson, AZ, USA
- Jian Ma
- University of Colorado, Colorado Springs, Colorado Springs CO, USA
- Daniel Zeng
- Institute of Automation, Chinese Academy of Sciences, Beijing, China
- Jenny Lo-Ciganic
- Department of Pharmaceutical Outcomes & Policy, University of Florida, FL
8
Safaei N, Safaei B, Seyedekrami S, Talafidaryani M, Masoud A, Wang S, Li Q, Moqri M. E-CatBoost: An efficient machine learning framework for predicting ICU mortality using the eICU Collaborative Research Database. PLoS One 2022; 17:e0262895. [PMID: 35511882 PMCID: PMC9070907 DOI: 10.1371/journal.pone.0262895] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/24/2021] [Accepted: 01/09/2022] [Indexed: 11/19/2022] Open
Abstract
Improving the Intensive Care Unit (ICU) management network and building cost-effective and well-managed healthcare systems are high priorities for healthcare units. Creating accurate and explainable mortality prediction models helps identify the most critical risk factors in the patients' survival/death status and early detect the most in-need patients. This study proposes a highly accurate and efficient machine learning model for predicting ICU mortality status upon discharge using the information available during the first 24 hours of admission. The most important features in mortality prediction are identified, and the effects of changing each feature on the prediction are studied. We used supervised machine learning models and illness severity scoring systems to benchmark the mortality prediction. We also implemented a combination of SHAP, LIME, partial dependence, and individual conditional expectation plots to explain the predictions made by the best-performing model (CatBoost). We proposed E-CatBoost, an optimized and efficient patient mortality prediction model, which can accurately predict the patients' discharge status using only ten input features. We used eICU-CRD v2.0 to train and validate the models; the dataset contains information on over 200,000 ICU admissions. The patients were divided into twelve disease groups, and models were fitted and tuned for each group. The models' predictive performance was evaluated using the area under a receiver operating curve (AUROC). The AUROC scores were 0.86 [std:0.02] to 0.92 [std:0.02] for CatBoost and 0.83 [std:0.02] to 0.91 [std:0.03] for E-CatBoost models across the defined disease groups; if measured over the entire patient population, their AUROC scores were 7 to 18 and 2 to 12 percent higher than the baseline models, respectively. 
Based on SHAP explanations, we found age, heart rate, respiratory rate, blood urea nitrogen, and creatinine level to be the most critical cross-disease features in mortality predictions.
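Note on the AUROC figures reported in these abstracts: the area under the ROC curve equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The Python sketch below computes it directly from that pairwise definition. The function name `auroc` and the toy data are illustrative assumptions, and the quadratic loop is meant for clarity, not for datasets of the size used in the studies.

```python
def auroc(scores, labels):
    """AUROC via pairwise comparison of positive and negative scores.

    Counts 1 for each (positive, negative) pair where the positive case
    scores higher and 0.5 for each tie, then divides by the number of pairs.
    Assumes labels contain at least one 1 and at least one 0.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical risk scores and outcomes, for illustration only.
toy_auc = auroc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

Some abstracts in this list report the same quantity on a 0 to 100 scale (for example 83.1) and others on a 0 to 1 scale (for example 0.814); the function above returns the 0 to 1 form.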
Affiliation(s)
- Nima Safaei
- Department of Business Analytics and Information Systems, Tippie College of Business, University of Iowa, Iowa City, IA, United States of America
- Babak Safaei
- Civil and Environmental Engineering Department, Michigan State University, East Lansing, MI, United States of America
- Seyedhouman Seyedekrami
- Department of Computer Science and Engineering, University of Nevada, Reno, NV, United States of America
- Arezoo Masoud
- Department of Business Analytics and Information Systems, Tippie College of Business, University of Iowa, Iowa City, IA, United States of America
- Shaodong Wang
- Department of Industrial and Manufacturing Systems Engineering, Iowa State University, Ames, IA, United States of America
- Qing Li
- Department of Industrial and Manufacturing Systems Engineering, Iowa State University, Ames, IA, United States of America
- Mahdi Moqri
- Department of Information Systems and Business Analytics, Ivy College of Business, Iowa State University, Ames, IA, United States of America
9
Niehaus IM, Kansy N, Stock S, Dötsch J, Müller D. Applicability of predictive models for 30-day unplanned hospital readmission risk in paediatrics: a systematic review. BMJ Open 2022; 12:e055956. [PMID: 35354615 PMCID: PMC8968996 DOI: 10.1136/bmjopen-2021-055956] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/11/2022] Open
Abstract
OBJECTIVES To summarise multivariable predictive models for 30-day unplanned hospital readmissions (UHRs) in paediatrics, describe their performance and completeness in reporting, and determine their potential for application in practice. DESIGN Systematic review. DATA SOURCE CINAHL, Embase and PubMed up to 7 October 2021. ELIGIBILITY CRITERIA English or German language studies aiming to develop or validate a multivariable predictive model for 30-day paediatric UHRs related to all-cause, surgical conditions or general medical conditions were included. DATA EXTRACTION AND SYNTHESIS Study characteristics, risk factors significant for predicting readmissions and information about performance measures (eg, c-statistic) were extracted. Reporting quality was addressed by the 'Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis' (TRIPOD) adherence form. The study quality was assessed by applying six domains of potential biases. Due to expected heterogeneity among the studies, the data were qualitatively synthesised. RESULTS Based on 28 studies, 37 predictive models were identified, which could potentially be used for determining individual 30-day UHR risk in paediatrics. The number of study participants ranged from 190 children to 1.4 million encounters. The two most common significant risk factors were comorbidity and (postoperative) length of stay. 23 models showed a c-statistic above 0.7 and are primarily applicable at discharge. The median TRIPOD adherence of the models was 59% (P25-P75, 55%-69%), ranging from a minimum of 33% to a maximum of 81%. Overall, the quality of many studies was moderate to low in all six domains. CONCLUSION Predictive models may be useful in identifying paediatric patients at increased risk of readmission. To support the application of predictive models, more attention should be placed on completeness in reporting, particularly for those items that may be relevant for implementation in practice.
Affiliation(s)
- Ines Marina Niehaus
- Department of Business Administration and Health Care Management, University of Cologne, Cologne, Germany
- Nina Kansy
- Department of Business Administration and Health Care Management, University of Cologne, Cologne, Germany
- Stephanie Stock
- Institute for Health Economics and Clinical Epidemiology, University of Cologne, Cologne, Germany
- Jörg Dötsch
- Department of Paediatrics and Adolescent Medicine, University Hospital Cologne, Cologne, Germany
- Dirk Müller
- Institute for Health Economics and Clinical Epidemiology, University of Cologne, Cologne, Germany
10
Miswan NH, Chan CS, Ng CG. Predictive modelling of hospital readmission: Evaluation of different preprocessing techniques on machine learning classifiers. Intell Data Anal 2021. [DOI: 10.3233/ida-205468] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
Hospital readmission is a major cost for healthcare systems worldwide. If patients with a higher potential of readmission could be identified at the start, existing resources could be used more efficiently, and appropriate plans could be implemented to reduce the risk of readmission. Therefore, it is important to predict the right target patients. Medical data is usually noisy, incomplete, and inconsistent. Hence, before developing a prediction model, it is crucial to preprocess the data efficiently so that improved predictive performance is achieved. The current study aims to analyse the impact of different preprocessing methods on the performance of different machine learning classifiers. The preprocessing approaches applied by previous hospital readmission studies were compared, and the most common ones were highlighted, such as missing value imputation, feature selection, data balancing, and feature scaling. The hyperparameters were selected using Bayesian optimisation. The different preprocessing pipelines were assessed using various performance metrics and computational costs. The results indicated that the preprocessing approaches helped improve the model’s prediction of hospital readmission.
Affiliation(s)
- Nor Hamizah Miswan
- Centre of Image and Signal Processing, Department of Artificial Intelligence, Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia
- Department of Mathematical Sciences, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, UKM Bangi, Selangor, Malaysia
- Chee Seng Chan
- Centre of Image and Signal Processing, Department of Artificial Intelligence, Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia
- Chong Guan Ng
- Department of Psychological Medicine, Faculty of Medicine, University of Malaya, Kuala Lumpur, Malaysia
11
Garnica O, Gómez D, Ramos V, Hidalgo JI, Ruiz-Giardín JM. Diagnosing hospital bacteraemia in the framework of predictive, preventive and personalised medicine using electronic health records and machine learning classifiers. EPMA J 2021; 12:365-381. [PMID: 34484472 PMCID: PMC8405861 DOI: 10.1007/s13167-021-00252-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 05/11/2021] [Accepted: 07/30/2021] [Indexed: 12/12/2022]
Abstract
Background Bacteraemia prediction is relevant because sepsis is one of the most important causes of morbidity and mortality. Bacteraemia prognosis primarily depends on a rapid diagnosis. Predicting bacteraemia could shorten the diagnosis by up to 6 days and, in conjunction with individual patient variables, should be considered when deciding on the early administration of personalised antibiotic treatment and medical services, the choice of specific diagnostic techniques and the determination of additional treatments, such as surgery, that would prevent subsequent complications. Machine learning techniques could help physicians make these informed decisions by predicting bacteraemia using the data already available in electronic hospital records. Objective This study presents the application of machine learning techniques to these records to predict the blood culture's outcome, which would reduce the lag in starting a personalised antibiotic treatment and the medical costs associated with erroneous treatments due to conservative assumptions about blood culture outcomes. Methods Six supervised classifiers were created using three machine learning techniques, Support Vector Machine, Random Forest and K-Nearest Neighbours, on the electronic health records of hospital patients. The best approach to handle missing data was chosen and, for each machine learning technique, two classification models were created: the first uses the features known at the time of blood extraction, whereas the second uses four extra features revealed during the blood culture. Results The six classifiers were trained and tested using a dataset of 4357 patients with 117 features per patient. In the best case, the models reach a state-of-the-art accuracy of 85.9%, a sensitivity of 87.4% and an AUC of 0.93.
Conclusions Our results provide cutting-edge metrics of interest in predictive medical models with values that exceed the medical practice threshold and previous results in the literature using classical modelling techniques in specific types of bacteraemia. Additionally, the consistency of results is reasserted because the three classifiers' importance ranking shows similar features that coincide with those that physicians use in their manual heuristics. Therefore, the efficacy of these machine learning techniques confirms their viability to assist in the aims of predictive and personalised medicine once the disease presents bacteraemia-compatible symptoms and to assist in improving the healthcare economy.
Affiliation(s)
- Oscar Garnica
- Departamento de Arquitectura de Computadores, Universidad Complutense de Madrid, Madrid, Spain
- Diego Gómez
- Universidad Complutense de Madrid, Madrid, Spain
- Víctor Ramos
- Universidad Complutense de Madrid, Madrid, Spain
- J. Ignacio Hidalgo
- Departamento de Arquitectura de Computadores, Universidad Complutense de Madrid, Madrid, Spain
- José M. Ruiz-Giardín
- Departamento de Medicina Interna, Hospital Universitario de Fuenlabrada, Madrid, Spain
|
12
|
Zhou H, Albrecht MA, Roberts PA, Porter P, Della PR. Using machine learning to predict paediatric 30-day unplanned hospital readmissions: a case-control retrospective analysis of medical records, including written discharge documentation. AUST HEALTH REV 2021; 45:328-337. [PMID: 33840419 DOI: 10.1071/ah20062] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 04/14/2020] [Accepted: 06/18/2020] [Indexed: 11/23/2022]
Abstract
Objectives To assess whether adding clinical information and written discharge documentation variables improves prediction of paediatric 30-day same-hospital unplanned readmission compared with predictions based on administrative information alone. Methods A retrospective matched case-control study audited the medical records of patients discharged from a tertiary paediatric hospital in Western Australia (WA) between January 2010 and December 2014. A random selection of 470 patients with unplanned readmissions (out of 3330) was matched to 470 patients without readmissions based on age, sex, and principal diagnosis at the index admission. The prediction utility of three groups of variables (administrative; administrative and clinical; and administrative, clinical and written discharge documentation) was assessed using standard logistic regression and machine learning. Results Inclusion of written discharge documentation variables significantly improved prediction of readmission compared with models that used only administrative and/or clinical variables in standard logistic regression analysis (χ²(17) = 29.4, P = 0.03). Highest prediction accuracy was obtained using a gradient boosted tree model (C-statistic = 0.654), followed closely by random forest and elastic net modelling approaches. Variables highlighted as important for prediction included patients' social history (legal custody or patient was under the care of the Department for Child Protection), languages spoken other than English, completeness of nursing admission and discharge planning documentation, and timing of issuing discharge summary. Conclusions The variables of significant social history, low English language proficiency, incomplete discharge documentation, and delay in issuing the discharge summary add value to prediction models. What is known about the topic? 
Despite written discharge documentation playing a critical role in the continuity of care for paediatric patients, limited research has examined its association with, and ability to predict, unplanned hospital readmissions. Machine learning approaches have been applied to various health conditions and demonstrated improved predictive accuracy. However, few published studies have used machine learning to predict paediatric readmissions. What does this paper add? This paper presents the findings of the first known study in Australia to assess and report that written discharge documentation and clinical information improves unplanned rehospitalisation prediction accuracy in a paediatric cohort compared with administrative data alone. It is also the first known published study to use machine learning for the prediction of paediatric same-hospital unplanned readmission in Australia. The results show improved predictive performance of the machine learning approach compared with standard logistic regression. What are the implications for practitioners? The identified social and written discharge documentation predictors could be translated into clinical practice through improved discharge planning and processes, to prevent paediatric 30-day all-cause same-hospital unplanned readmission. The predictors identified in this study include significant social history, low English language proficiency, incomplete discharge documentation, and delay in issuing the discharge summary.
Affiliation(s)
- Huaqiong Zhou
- General Surgical Ward, Princess Margaret Hospital for Children, Perth, WA 6008, Australia; and School of Nursing, Curtin University, GPO Box U 1987, Perth, WA 6845, Australia
- Matthew A Albrecht
- School of Nursing, Curtin University, GPO Box U 1987, Perth, WA 6845, Australia
- Pamela A Roberts
- School of Nursing, Curtin University, GPO Box U 1987, Perth, WA 6845, Australia
- Paul Porter
- School of Nursing, Curtin University, GPO Box U 1987, Perth, WA 6845, Australia; and Joondalup Health Campus, Joondalup, WA 6027, Australia
- Philip R Della
- School of Nursing, Curtin University, GPO Box U 1987, Perth, WA 6845, Australia; and Visiting Professor, College of Nursing, The First Affiliated Hospital of Sun Yat-sen University, Guangzhou, China; and Corresponding author.
|
13
|
gbt-HIPS: Explaining the Classifications of Gradient Boosted Tree Ensembles. APPLIED SCIENCES-BASEL 2021. [DOI: 10.3390/app11062511] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 01/22/2023]
Abstract
This research presents Gradient Boosted Tree High Importance Path Snippets (gbt-HIPS), a novel, heuristic method for explaining gradient boosted tree (GBT) classification models by extracting a single classification rule (CR) from the ensemble of decision trees that make up the GBT model. This CR contains the most statistically important boundary values of the input space as antecedent terms. The CR represents a hyper-rectangle of the input space inside which the GBT model is, very reliably, classifying all instances with the same class label as the explanandum instance. In a benchmark test using nine data sets and five competing state-of-the-art methods, gbt-HIPS offered the best trade-off between coverage (0.16–0.75) and precision (0.85–0.98). Unlike competing methods, gbt-HIPS is also demonstrably guarded against under- and over-fitting. A further distinguishing feature of our method is that, unlike much prior work, our explanations also provide counterfactual detail in accordance with widely accepted recommendations for what makes a good explanation.
|
14
|
De Silva K, Mathews N, Teede H, Forbes A, Jönsson D, Demmer RT, Enticott J. Clinical notes as prognostic markers of mortality associated with diabetes mellitus following critical care: A retrospective cohort analysis using machine learning and unstructured big data. Comput Biol Med 2021; 132:104305. [PMID: 33705995 DOI: 10.1016/j.compbiomed.2021.104305] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 12/24/2020] [Revised: 02/23/2021] [Accepted: 02/27/2021] [Indexed: 12/14/2022]
Abstract
BACKGROUND Clinical notes are ubiquitous resources offering potential value in optimizing critical care via data mining technologies. OBJECTIVE To determine the predictive value of clinical notes as prognostic markers of 1-year all-cause mortality among people with diabetes following critical care. MATERIALS AND METHODS Mortality of diabetes patients was predicted using three cohorts of clinical text in a critical care database, written by physicians (n = 45253), nurses (n = 159027), and both (n = 204280). Natural language processing was used to pre-process text documents, and LASSO-regularized logistic regression models were trained and tested. Confusion matrix metrics of each model were calculated and AUROC estimates between models were compared. All predictive words and corresponding coefficients were extracted. The outcome probability associated with each text document was estimated. RESULTS Models built on clinical text of physicians, nurses, and the combined cohort predicted mortality with AUROC of 0.996, 0.893, and 0.922, respectively. Predictive performance of the models differed significantly from one another, whereas inter-rater reliability ranged from substantial to almost perfect across them. The numbers of predictive words with non-zero coefficients were 3994, 8159, and 10579, respectively, in the models of physicians, nurses, and the combined cohort. Physicians' and nursing notes, both individually and when combined, strongly predicted 1-year all-cause mortality among people with diabetes following critical care. CONCLUSION Clinical notes of physicians and nurses are strong and novel prognostic markers of diabetes-associated mortality in critical care, offering potentially generalizable and scalable applications. Clinical text-derived personalized risk estimates of prognostic outcomes such as mortality could be used to optimize patient care.
Affiliation(s)
- Kushan De Silva
- Monash Centre for Health Research and Implementation, School of Public Health and Preventive Medicine, Faculty of Medicine, Nursing, and Health Sciences, Monash University, Clayton, 3168, Australia
- Noel Mathews
- Monash Centre for Health Research and Implementation, School of Public Health and Preventive Medicine, Faculty of Medicine, Nursing, and Health Sciences, Monash University, Clayton, 3168, Australia
- Helena Teede
- Monash Centre for Health Research and Implementation, School of Public Health and Preventive Medicine, Faculty of Medicine, Nursing, and Health Sciences, Monash University, Clayton, 3168, Australia
- Andrew Forbes
- Biostatistics Unit, Division of Research Methodology, School of Public Health and Preventive Medicine, Faculty of Medicine, Nursing, and Health Sciences, Monash University, Melbourne, 3004, Australia
- Daniel Jönsson
- Department of Periodontology, Faculty of Odontology, Malmö University, Malmö, 21119, Sweden; Swedish Dental Service of Skane, Lund, 22647, Sweden
- Ryan T Demmer
- Division of Epidemiology and Community Health, School of Public Health, University of Minnesota, Minneapolis, MN, USA; Mailman School of Public Health, Columbia University, New York, USA
- Joanne Enticott
- Monash Centre for Health Research and Implementation, School of Public Health and Preventive Medicine, Faculty of Medicine, Nursing, and Health Sciences, Monash University, Clayton, 3168, Australia
|
15
|
Lupton-Smith C, Stuart EA, McGinty EE, Dalcin AT, Jerome GJ, Wang NY, Daumit GL. Determining Predictors of Weight Loss in a Behavioral Intervention: A Case Study in the Use of Lasso Regression. Front Psychiatry 2021; 12:707707. [PMID: 35185628 PMCID: PMC8850776 DOI: 10.3389/fpsyt.2021.707707] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/10/2021] [Accepted: 12/29/2021] [Indexed: 01/26/2023] Open
Abstract
OBJECTIVE This study investigates predictors of weight loss among individuals with serious mental illness participating in an 18-month behavioral weight loss intervention, using Lasso regression to select the most powerful predictors. METHODS Data were analyzed from the intervention group of the ACHIEVE trial, an 18-month behavioral weight loss intervention in adults with serious mental illness. Lasso regression was employed to identify predictors of at least five-pound weight loss across the intervention time span. Once predictors were identified, classification trees were created to show examples of how to classify participants into having likely outcomes based on characteristics at baseline and during the intervention. RESULTS The analyzed sample contained 137 participants. Seventy-one (51.8%) individuals had a net weight loss of at least five pounds from baseline to 18 months. The Lasso regression selected weight loss from baseline to 6 months as a primary predictor of at least five pound 18-month weight loss, with a standardized coefficient of 0.51 (95% CI: -0.37, 1.40). Three other variables were also selected in the regression but added minimal predictive ability. CONCLUSIONS The analyses in this paper demonstrate the importance of tracking weight loss incrementally during an intervention as an indicator for overall weight loss, as well as the challenges in predicting long-term weight loss with other variables commonly available in clinical trials. The methods used in this paper also exemplify how to effectively analyze a clinical trial dataset containing many variables and identify factors related to desired outcomes.
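Note: the way Lasso "selects" a few predictors and drops the rest can be seen from its shrinkage operator. The sketch below assumes the simplest textbook setting (squared-error loss with an orthonormal design, where the Lasso solution is the soft-thresholded least-squares coefficient); it is not the fitting procedure used in the study, and the variable names, coefficient values, and penalty are hypothetical.

```python
def soft_threshold(z, lam):
    """Lasso shrinkage operator: sign(z) * max(|z| - lam, 0)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

# Hypothetical unpenalized coefficients for three standardized predictors.
ols_coefs = {"early_weight_loss": 0.9, "age": 0.15, "baseline_bmi": -0.05}
lam = 0.2  # hypothetical penalty strength

lasso_coefs = {name: soft_threshold(value, lam) for name, value in ols_coefs.items()}
selected = [name for name, value in lasso_coefs.items() if value != 0.0]
print(selected)  # only coefficients larger than lam in absolute value survive
```

Coefficients whose magnitude falls below the penalty are set exactly to zero, which is why a Lasso fit can return a short list of selected variables; the surviving coefficients are also shrunk toward zero.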
Affiliation(s)
- Carly Lupton-Smith
- Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, United States
- Elizabeth A Stuart
- Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, United States
- Emma E McGinty
- Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, United States
- Arlene T Dalcin
- Johns Hopkins School of Medicine, Baltimore, MD, United States
- Gerald J Jerome
- Department of Kinesiology, Towson University, Towson, MD, United States
- Nae-Yuh Wang
- Johns Hopkins School of Medicine, Baltimore, MD, United States
- Gail L Daumit
- Johns Hopkins School of Medicine, Baltimore, MD, United States
|
16
|
Hatwell J, Gaber MM, Atif Azad RM. Ada-WHIPS: explaining AdaBoost classification with applications in the health sciences. BMC Med Inform Decis Mak 2020; 20:250. [PMID: 33008388 PMCID: PMC7531148 DOI: 10.1186/s12911-020-01201-2] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 12/13/2019] [Accepted: 07/23/2020] [Indexed: 12/12/2022] Open
Abstract
BACKGROUND Computer Aided Diagnostics (CAD) can support medical practitioners to make critical decisions about their patients' disease conditions. Practitioners require access to the chain of reasoning behind CAD to build trust in the CAD advice and to supplement their own expertise. Yet, CAD systems might be based on black box machine learning models and high dimensional data sources such as electronic health records, magnetic resonance imaging scans, cardiotocograms, etc. These foundations make interpretation and explanation of the CAD advice very challenging. This challenge is recognised throughout the machine learning research community. eXplainable Artificial Intelligence (XAI) is emerging as one of the most important research areas of recent years because it addresses the interpretability and trust concerns of critical decision makers, including those in clinical and medical practice. METHODS In this work, we focus on AdaBoost, a black box model that has been widely adopted in the CAD literature. We address the challenge - to explain AdaBoost classification - with a novel algorithm that extracts simple, logical rules from AdaBoost models. Our algorithm, Adaptive-Weighted High Importance Path Snippets (Ada-WHIPS), makes use of AdaBoost's adaptive classifier weights. Using a novel formulation, Ada-WHIPS uniquely redistributes the weights among individual decision nodes of the internal decision trees of the AdaBoost model. Then, a simple heuristic search of the weighted nodes finds a single rule that dominated the model's decision. We compare the explanations generated by our novel approach with the state of the art in an experimental study. We evaluate the derived explanations with simple statistical tests of well-known quality measures, precision and coverage, and a novel measure stability that is better suited to the XAI setting. 
RESULTS Experiments on 9 CAD-related data sets showed that Ada-WHIPS explanations consistently generalise better (mean coverage 15%-68%) than the state of the art while remaining competitive for specificity (mean precision 80%-99%). A very small trade-off in specificity is shown to guard against over-fitting which is a known problem in the state of the art methods. CONCLUSIONS The experimental results demonstrate the benefits of using our novel algorithm for explaining CAD AdaBoost classifiers widely found in the literature. Our tightly coupled, AdaBoost-specific approach outperforms model-agnostic explanation methods and should be considered by practitioners looking for an XAI solution for this class of models.
Affiliation(s)
- Julian Hatwell
- Birmingham City University, Curzon Street, Birmingham, B5 5JU UK
|
17
|
Abstract
Modern machine learning methods typically produce “black box” models that are opaque to interpretation. Yet, their demand has been increasing in the Human-in-the-Loop processes, that is, those processes that require a human agent to verify, approve or reason about the automated decisions before they can be applied. To facilitate this interpretation, we propose Collection of High Importance Random Path Snippets (CHIRPS); a novel algorithm for explaining random forest classification per data instance. CHIRPS extracts a decision path from each tree in the forest that contributes to the majority classification, and then uses frequent pattern mining to identify the most commonly occurring split conditions. Then a simple, conjunctive form rule is constructed where the antecedent terms are derived from the attributes that had the most influence on the classification. This rule is returned alongside estimates of the rule’s precision and coverage on the training data along with counter-factual details. An experimental study involving nine data sets shows that classification rules returned by CHIRPS have a precision at least as high as the state of the art when evaluated on unseen data (0.91–0.99) and offer a much greater coverage (0.04–0.54). Furthermore, CHIRPS uniquely controls against under- and over-fitting solutions by maximising novel objective functions that are better suited to the local (per instance) explanation setting.
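Note: the two quality measures quoted in this abstract, precision and coverage of a conjunctive rule, can be computed as in the minimal sketch below. It is not the CHIRPS algorithm itself; the feature names, the rule, and the toy data are hypothetical, and the labels stand for the class assigned by the model being explained.

```python
def rule_metrics(instances, labels, rule, target):
    """Return (precision, coverage) of a conjunctive rule.

    coverage  = share of all instances that satisfy every condition of the rule
    precision = share of those covered instances whose label equals the target
    """
    covered = [y for x, y in zip(instances, labels) if all(cond(x) for cond in rule)]
    coverage = len(covered) / len(instances)
    precision = sum(1 for y in covered if y == target) / len(covered) if covered else 0.0
    return precision, coverage

# Hypothetical instances and model-predicted classes.
instances = [
    {"age": 60, "sodium": 130},
    {"age": 70, "sodium": 134},
    {"age": 55, "sodium": 140},
    {"age": 40, "sodium": 128},
    {"age": 65, "sodium": 132},
]
labels = [1, 1, 0, 0, 0]

# Hypothetical rule: IF age > 50 AND sodium <= 135 THEN class 1.
rule = [lambda x: x["age"] > 50, lambda x: x["sodium"] <= 135]

precision, coverage = rule_metrics(instances, labels, rule, target=1)
print(precision, coverage)
```

Here the rule covers three of the five instances and two of those three carry the target class, which illustrates the trade-off the abstract reports: adding conditions tends to raise precision while shrinking coverage.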
|
18
|
Bradshaw S, Buenning B, Powell A, Teasley S, Olney A, Lee B. Retrospective Chart Review: Readmission Prediction Ability of the High Acuity Readmission Risk Pediatric Screen (HARRPS) Tool. J Pediatr Nurs 2020; 51:49-56. [PMID: 31887721 DOI: 10.1016/j.pedn.2019.12.008] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/11/2019] [Revised: 12/13/2019] [Accepted: 12/14/2019] [Indexed: 11/28/2022]
Abstract
BACKGROUND Nurse Case Managers utilize adult based readmission risk tools upon admission to identify readmission risk. An evidence-based pediatric readmission tool could not be identified to replicate in the pediatric space, therefore the High Acuity Readmission Risk Pediatric Screen (HARRPS) Tool was developed to fill this gap. The research aim was to develop a risk score algorithm that accurately predicts pediatric readmissions and provide a predictive validation of the HARRPS Tool. METHOD This was a single-centered, retrospective chart review study which compared pediatric patients with thirty-day readmissions to those without thirty-day readmissions over a twelve-month period. Sample size ratio of 1:2 was determined via power analysis with an overall sample size of 5371. Each category from the HARRPS Tool was appropriately weighted based upon data from this study to then produce an overall, patient-level risk score, which was summed [allowable range: 0, 14] across all components. Cross validation was used to ascertain the readmission risk predictability. RESULTS Of the 5306 patients included in the final analysis, 1343 (25.3%) had a thirty-day readmission. Out of nine risk components analyzed, eight were consistent with the literature review findings. Patients with a score of seven or higher had a 54.9% predicted probability of a thirty-day readmission, compared to 13.6% for patients with a risk score of zero. The c-statistic score of the HARRPS Tool was determined to be 0.68 [95% CI, 0.67, 0.69]. Overall, the HARRPS Tool was favorable and provides initial credibility of the tool's predictive power for the general pediatric population.
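Note: the c-statistic reported for a risk score such as this one is the probability that a randomly chosen readmitted patient has a higher score than a randomly chosen non-readmitted patient, with ties counted as one half. The sketch below computes it by direct pairwise comparison; the scores and outcomes are hypothetical toy values, not data from the study.

```python
def c_statistic(scores, outcomes):
    """Concordance statistic (equivalent to the ROC AUC) by pairwise comparison."""
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0   # case scored higher than non-case
            elif p == n:
                concordant += 0.5   # tie counts as half
    return concordant / (len(pos) * len(neg))

# Hypothetical integer risk scores on a 0-14 scale and 30-day readmission outcomes.
scores = [7, 2, 5, 0, 5, 9]
outcomes = [1, 0, 0, 0, 1, 1]

c = c_statistic(scores, outcomes)
print(round(c, 3))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. The pairwise loop is quadratic in the number of patients, so rank-based formulas are usually preferred for large cohorts.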
Affiliation(s)
- Anita Powell
- Children's Mercy Hospital, United States of America
- Brian Lee
- Children's Mercy Hospital, United States of America
|
19
|
Radovanović S, Delibašić B, Jovanović M, Vukićević M, Suknović M. A Framework for Integrating Domain Knowledge in Logistic Regression with Application to Hospital Readmission Prediction. INT J ARTIF INTELL T 2019. [DOI: 10.1142/s0218213019600066] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/18/2022]
Abstract
It is commonly understood that machine learning algorithms discover and extract knowledge based on data at hand. However, a huge amount of knowledge is available which is in machine-readable format and ready for inclusion in machine learning algorithms and models. In this paper, we propose a framework that integrates domain knowledge in form of ontologies/hierarchies into logistic regression using stacked generalization. Namely, relations from ontology/hierarchy are used in stacking manner in order to obtain higher, more abstract concepts. Obtained concepts are further used for prediction. The problem we solved is unplanned 30-days hospital readmission, which is considered as one of the major problems in healthcare. Proposed framework yields better results compared to Ridge, Lasso, and Tree Lasso Logistic Regression. Results suggest that the proposed framework improves AUC by up to 9.5% on pediatric datasets and up to 4% on morbidly obese patients’ datasets and also improves AUPRC by up to 5.7% on pediatric datasets and up to 2.6% on morbidly obese patients’ datasets on average. This indicates that the inclusion of domain knowledge improves the predictive performance of Logistic Regression.
Affiliation(s)
- Sandro Radovanović
- University of Belgrade, Faculty of Organizational Sciences, Jove Ilića 154, Belgrade, Serbia
- Boris Delibašić
- University of Belgrade, Faculty of Organizational Sciences, Jove Ilića 154, Belgrade, Serbia
- Miloš Jovanović
- University of Belgrade, Faculty of Organizational Sciences, Jove Ilića 154, Belgrade, Serbia
- Milan Vukićević
- University of Belgrade, Faculty of Organizational Sciences, Jove Ilića 154, Belgrade, Serbia
- Milija Suknović
- University of Belgrade, Faculty of Organizational Sciences, Jove Ilića 154, Belgrade, Serbia
|
20
|
Gilvary C, Madhukar N, Elkhader J, Elemento O. The Missing Pieces of Artificial Intelligence in Medicine. Trends Pharmacol Sci 2019; 40:555-564. [DOI: 10.1016/j.tips.2019.06.001] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 02/13/2019] [Revised: 06/03/2019] [Accepted: 06/04/2019] [Indexed: 12/22/2022]
|
21
|
Deschepper M, Eeckloo K, Vogelaers D, Waegeman W. A hospital wide predictive model for unplanned readmission using hierarchical ICD data. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2019; 173:177-183. [PMID: 30777619 DOI: 10.1016/j.cmpb.2019.02.007] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 03/31/2018] [Revised: 01/22/2019] [Accepted: 02/12/2019] [Indexed: 06/09/2023]
Abstract
BACKGROUND AND OBJECTIVE Hospitals already acquire a large amount of data, mainly for administrative, billing and registration purposes. Tapping these already available data for additional purposes, aimed at improving care, is possible without significant incremental effort and cost. This potential of secondary patient data is explored through modeling administrative and billing data, as well as the hierarchical structure of pathology codes of the International Classification of Diseases (ICD), in the prediction of unplanned readmissions, a clinically relevant outcome parameter that can be influenced by a quality improvement program. METHODS In this single-center, hospital-wide observational cohort study, we included all adult patients discharged in 2016 after applying an exclusion protocol (n = 29,702). In addition to administrative variables, such as age and length of stay, structured pathology data were taken into account in predictive models. As a first research question, we compared logistic regression against penalized logistic regression, gradient boosting and Random Forests to predict unplanned readmission. As a second research goal, we investigated the level of hierarchy within the pathology data needed to achieve the best accuracy. Finally, we investigated which prediction variables play a prominent role in predicting hospital readmission. The performance of all models was evaluated using the Area Under the ROC Curve (AUC) measure. RESULTS All models have the best predictive results using Random Forests. An added value of 7% is observed compared to a baseline method such as logistic regression. The best model, based on Random Forests, achieved an AUC of 0.77, using the diagnosis category and procedure code as lowest level of the hierarchical pathology data. CONCLUSIONS The most accurate model to predict hospital wide unplanned readmission is based on Random Forests and includes the ICD hierarchy, especially diagnosis category. 
Such an approach lowers the number of predictor variables and yields a higher interpretability than a model based on a detailed diagnosis. The performance of the model proved high enough to be used as a decision support tool.
Affiliation(s)
- M Deschepper
- Strategic Policy Cell at Ghent University Hospital, C. Heymanslaan 10, 9000 Ghent, Belgium
- K Eeckloo
- Strategic Policy Cell at Ghent University Hospital, C. Heymanslaan 10, 9000 Ghent, Belgium; Department of Public Health and Primary Care, Ghent University, C. Heymanslaan 10, 9000 Ghent, Belgium
- D Vogelaers
- General Internal Medicine, Ghent University Hospital, C. Heymanslaan 10, 9000 Ghent, Belgium; Department of Internal Medicine, Ghent University, C. Heymanslaan 10, 9000 Ghent, Belgium
- W Waegeman
- Department of Data Analysis and Mathematical Modelling, Ghent University, Coupure Links 653, 9000, Ghent, Belgium
|
22
|
Predicting hospital associated disability from imbalanced data using supervised learning. Artif Intell Med 2019; 95:88-95. [DOI: 10.1016/j.artmed.2018.09.004] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 05/16/2018] [Revised: 07/30/2018] [Accepted: 09/28/2018] [Indexed: 12/11/2022]
|
23
|
Janjua MB, Reddy S, Samdani AF, Welch WC, Ozturk AK, Price AV, Weprin BE, Swift DM. Predictors of 90-Day Readmission in Children Undergoing Spinal Cord Tumor Surgery: A Nationwide Readmissions Database Analysis. World Neurosurg 2019; 127:e697-e706. [PMID: 30947001 DOI: 10.1016/j.wneu.2019.03.245] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 12/23/2018] [Revised: 03/22/2019] [Accepted: 03/23/2019] [Indexed: 10/27/2022]
Abstract
OBJECTIVE A fair number of hospital admissions occur after 30 days; thus, the true readmission rate could have been underestimated. Therefore, we hypothesized that the 90-day readmission rate might better characterize the factors contributing to readmission for pediatric patients undergoing spinal tumor resection. METHODS The Nationwide Readmissions Database was used to study the patient demographic data, comorbidities, admissions, hospital course, spinal tumor behavior (malignant vs. benign), complications, revisions, and 30- and 90-day readmissions. RESULTS Of the 397 patients included in the 30-day cohort, 43 (10.8%) had been readmitted. In comparison, the 90-day readmission rate was significantly greater; 52 of 325 patients were readmitted (16.0%; P < 0.04). Patients aged 16-20 constituted the largest subgroup. However, the highest readmission rate was observed for patients aged <5 years (30-day, 21.7%; 90-day, 26.4%). Medicaid patients were more likely to be readmitted than were private insurance patients (30-day odds ratio [OR], 3.3 [P < 0.001]; 90-day OR, 2.29 [P < 0.02]). In both cohorts, patients with malignant tumors required readmission more often than did those with benign tumors (30-day OR, 2.78 [P < 0.02]; 90-day OR, 1.92 [P = 0.08]). In the 90-day cohort, the patients had been readmitted 26.4 days after discharge versus 10.6 days in the 30-day cohort. Within the 90-day cohort, 18.6% of the readmissions were for spinal reoperation, 28.3% for chemotherapy or hematologic complications, and 25.6% for other central nervous system disorders. The median charges for each readmission were ∼$50,000 and ∼$40,000 for the 30- and 90-day cohorts, respectively. Medicaid insurance, malignant tumors, and younger age were significant predictors of readmission in the 90-day cohort. CONCLUSIONS The prevalence and charges associated with unplanned hospital readmissions after spinal tumor resection were remarkably high. 
Younger age, Medicaid insurance, malignant tumors, and complications during the initial admission were significant predictors of 90-day readmission.
Affiliation(s)
- M Burhan Janjua: Department of Pediatric Neurosurgery, UT Southwestern Medical Center, Dallas, Texas, USA; Department of Neurosurgery, University of Pennsylvania Hospital, Philadelphia, Pennsylvania, USA
- Sumanth Reddy: Department of Pediatric Neurosurgery, UT Southwestern Medical Center, Dallas, Texas, USA
- Amer F Samdani: Division of Pediatric Spine, Department of Neurosurgery, Shriners Hospital for Children - Philadelphia, Philadelphia, Pennsylvania, USA
- William C Welch: Department of Neurosurgery, University of Pennsylvania Hospital, Philadelphia, Pennsylvania, USA
- Ali K Ozturk: Department of Neurosurgery, University of Pennsylvania Hospital, Philadelphia, Pennsylvania, USA
- Angela V Price: Department of Pediatric Neurosurgery, UT Southwestern Medical Center, Dallas, Texas, USA
- Bradley E Weprin: Department of Pediatric Neurosurgery, UT Southwestern Medical Center, Dallas, Texas, USA
- Dale M Swift: Department of Pediatric Neurosurgery, UT Southwestern Medical Center, Dallas, Texas, USA
|
24
|
Carmen R, Yom-Tov GB, Van Nieuwenhuyse I, Foubert B, Ofran Y. The role of specialized hospital units in infection and mortality risk reduction among patients with hematological cancers. PLoS One 2019; 14:e0211694. [PMID: 30893320 PMCID: PMC6426175 DOI: 10.1371/journal.pone.0211694] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 06/07/2018] [Accepted: 01/18/2019] [Indexed: 11/18/2022]
Abstract
MOTIVATION Patients with hematological malignancies are susceptible to life-threatening infections after chemotherapy. The current study aimed to evaluate whether management of such patients in dedicated inpatient and emergency wards could provide superior infection prevention and outcome. METHODS We developed an approach for retrieving infection-related information from the unstructured electronic medical records of a tertiary center. Data on 2,330 adults receiving 13,529 chemotherapy treatments for hematological malignancies were identified and assessed. Infection and mortality hazard rates were calculated with multivariate models. Patients were randomly divided into 80:20 training and validation cohorts. To develop patient-tailored risk-prediction models, several machine-learning methods were compared using area under the curve (AUC). RESULTS Of the tested algorithms, the probit model was found to most accurately predict the evaluated hazards and was implemented in an online calculator. The infection-prediction model identified risk factors for infection based on patient characteristics, treatment and history. Observation of patients with a high predicted infection risk in general wards appeared to increase their infection hazard (p = 0.009) compared to similar patients observed in hematology units. The mortality-risk model demonstrated that for infection events starting at home, admission through hematology services was associated with a lower mortality hazard compared to admission through the general emergency department (p = 0.007). Both models show that dedicated hematological facilities and emergency services improve patient outcome post-chemotherapy. The calculated numbers needed to treat were 30.27 and 31.08 for the dedicated emergency and observation facilities, respectively. Infection hazard risks were found to be non-monotonic in time. 
CONCLUSIONS The accuracy of the proposed mortality and infection risk-prediction models was high, with the AUC of 0.74 and 0.83, respectively. Our results demonstrate that temporal assessment of patient risks is feasible. This may enable physicians to move from one-point decision-making to a continuous dynamic observation, allowing a more flexible and patient-tailored admission policy.
Affiliation(s)
- Raïsa Carmen: Department of Decision Sciences and Information Management, Faculty of Business and Economics, KU Leuven, Brussels Campus, Brussel, Belgium
- Galit B. Yom-Tov: Faculty of Industrial Engineering and Management, Technion, Haifa, Israel
- Bram Foubert: Department of Marketing and Supply Chain Management, Maastricht University, Maastricht, The Netherlands
- Yishai Ofran: Department of Hematology and Bone Marrow Transplantation, Rambam Health Care Campus and Bruce Rappaport Faculty of Medicine, Technion, Haifa, Israel
|
25
|
Cui S, Wang D, Wang Y, Yu PW, Jin Y. An improved support vector machine-based diabetic readmission prediction. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2018; 166:123-135. [PMID: 30415712 DOI: 10.1016/j.cmpb.2018.10.012] [Citation(s) in RCA: 38] [Impact Index Per Article: 6.3] [Received: 05/30/2018] [Revised: 10/07/2018] [Accepted: 10/12/2018] [Indexed: 06/09/2023]
Abstract
BACKGROUND AND OBJECTIVE In healthcare systems, the cost of unplanned readmission accounts for a large proportion of total hospital payment. Hospital-specific readmission rates have become a critical issue around the world. Quantification and early identification of unplanned readmission risks will improve the quality of care during hospitalization and reduce the occurrence of readmission. In clinical practice, medical workers generally use the LACE score method to evaluate patient readmission risks, but this method usually performs poorly. With this in mind, this study presents a novel method combining a support vector machine and a genetic algorithm to build the risk prediction model, which simultaneously involves feature selection and the processing of imbalanced data. This model aims to provide decision support for clinicians during the discharge management of patients with diabetes. METHOD The experiments were conducted on a set of 8756 medical records with 50 different features about diabetic readmission. After preprocessing the data, an effective SMOTE-based method was proposed to solve the imbalanced data problem. Further, in order to improve prediction performance, a hybrid feature selection mechanism was devised to select the important features. Subsequently, an improved support vector machine-based (SVM-based) method was developed and the genetic algorithm was used to tune the sensitive parameter of the algorithm. Finally, the five-fold cross-validation method was applied to compare the performance of the proposed method with other methods (LACE score, logistic regression, naïve Bayes, decision tree and feed-forward neural networks). RESULTS Experimental results indicate that the proposed SVM-based method achieves an accuracy of 81.02%, a sensitivity of 82.89%, a specificity of 79.23%, and outperforms other popular algorithms in identifying diabetic patients who may be readmitted. 
CONCLUSIONS Our research can improve the performance of clinical decision support systems for diabetic readmission, by which the readmission possibility as well as the waste of medical resources can be reduced.
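The accuracy, sensitivity, and specificity quoted in this abstract are standard confusion-matrix ratios. The following is a minimal Python sketch of how such figures are derived; the function name and the example counts are hypothetical and are not taken from the study.

```python
def classification_metrics(true_pos, false_neg, true_neg, false_pos):
    """Return sensitivity, specificity, and accuracy from confusion-matrix counts."""
    # Sensitivity: share of truly readmitted patients that the model flags.
    sensitivity = true_pos / (true_pos + false_neg)
    # Specificity: share of non-readmitted patients that the model clears.
    specificity = true_neg / (true_neg + false_pos)
    # Accuracy: share of all patients classified correctly.
    total = true_pos + false_neg + true_neg + false_pos
    accuracy = (true_pos + true_neg) / total
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = classification_metrics(true_pos=80, false_neg=20, true_neg=90, false_pos=10)
print(metrics)
```

On imbalanced readmission data, accuracy alone can look good while sensitivity is poor, which is presumably why the abstract reports all three.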
Affiliation(s)
- Shaoze Cui: School of Management Science and Engineering, Dalian University of Technology, Dalian 116023, PR China
- Dujuan Wang: Business School of Sichuan University, Chengdu 610064, China
- Yanzhang Wang: School of Management Science and Engineering, Dalian University of Technology, Dalian 116023, PR China
- Pay-Wen Yu: Department of Physical Education, Fu Jen Catholic University, New Taipei City 24205, Taiwan
- Yaochu Jin: School of Management Science and Engineering, Dalian University of Technology, Dalian 116023, PR China; Department of Computer Science, University of Surrey, Guildford, Surrey GU2 7XH, United Kingdom
|
26
|
Brittan MS, Martin S, Anderson L, Moss A, Torok MR. An Electronic Health Record Tool Designed to Improve Pediatric Hospital Discharge has Low Predictive Utility for Readmissions. J Hosp Med 2018; 13:779-782. [PMID: 30156576 DOI: 10.12788/jhm.3043] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 11/20/2022]
Abstract
We developed an electronic health record tool to improve pediatric hospital discharge. This tool flags children with three components that might complicate discharge: home health, polypharmacy (greater than or equal to 6 medications), or a non-English-speaking caregiver. The tool tallies components and displays them as a composite score of 0-3 points. We describe the tool's development, implementation, and an evaluation of its predictive utility for 30-day unplanned readmissions in 29,542 discharged children. Of these children, 28% had a composite score of 1, 8% a score greater than or equal to 2, and 4% were readmitted. The odds of readmission were significantly higher in children with a composite score of 1 versus 0 (odds ratio [OR]: 1.7; 95% CI, 1.5-2) and greater than or equal to 2 versus 0 (OR 4.2; 95% CI 3.6-4.9). The C-statistic for this model was 0.6259. Despite the positive association of the score with readmission, the tool's discriminatory performance is low. Additional research is needed to evaluate its practical benefit for improving the quality of hospital discharge. This study was supported by an institutional Clinical and Operational Effectiveness and Patient Safety Small Grants Program.
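The composite score described in this abstract is a simple tally of three flags. A minimal Python sketch of that tally is given below; the function and parameter names are hypothetical, and the abstract does not say how the actual tool derives each flag from the record.

```python
def discharge_complexity_score(has_home_health, medication_count, caregiver_speaks_english):
    """Tally the three discharge-complicating components into a 0-3 score."""
    # Polypharmacy is defined in the abstract as six or more medications.
    has_polypharmacy = medication_count >= 6
    # Each component that is present contributes one point.
    return (
        int(has_home_health)
        + int(has_polypharmacy)
        + int(not caregiver_speaks_english)
    )


# Hypothetical patient: home health ordered, 7 medications, English-speaking caregiver.
print(discharge_complexity_score(True, 7, True))
```

A tally like this weights every component equally, which may partly explain the modest C-statistic reported above.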
Affiliation(s)
- Mark S Brittan: Department of Pediatrics, Children's Hospital Colorado, Aurora, Colorado, USA; Adult and Child Consortium for Health Outcomes Research and Delivery Science, University of Colorado School of Medicine and Children's Hospital Colorado, Aurora, Colorado, USA
- Sara Martin: Department of Clinical Application Services, Children's Hospital Colorado, Aurora, Colorado, USA
- Leslie Anderson: Manager of Case Management, Children's Hospital Colorado, Aurora, Colorado, USA
- Angela Moss: Adult and Child Consortium for Health Outcomes Research and Delivery Science, University of Colorado School of Medicine and Children's Hospital Colorado, Aurora, Colorado, USA
- Michelle R Torok: Department of Pediatrics, Children's Hospital Colorado, Aurora, Colorado, USA; Adult and Child Consortium for Health Outcomes Research and Delivery Science, University of Colorado School of Medicine and Children's Hospital Colorado, Aurora, Colorado, USA
|
27
|
Zhao C, Jiang J, Guan Y, Guo X, He B. EMR-based medical knowledge representation and inference via Markov random fields and distributed representation learning. Artif Intell Med 2018; 87:49-59. [PMID: 29691122 DOI: 10.1016/j.artmed.2018.03.005] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 05/22/2017] [Revised: 02/28/2018] [Accepted: 03/29/2018] [Indexed: 01/09/2023]
Abstract
OBJECTIVE Electronic medical records (EMRs) contain medical knowledge that can be used for clinical decision support (CDS). Our objective is to develop a general system that can extract and represent knowledge contained in EMRs to support three CDS tasks (test recommendation, initial diagnosis, and treatment plan recommendation) given the condition of a patient. METHODS We extracted four kinds of medical entities from records and constructed an EMR-based medical knowledge network (EMKN), in which nodes are entities and edges reflect their co-occurrence in a record. Three bipartite subgraphs (bigraphs) were extracted from the EMKN, one to support each task. One part of the bigraph was the given condition (e.g., symptoms), and the other was the condition to be inferred (e.g., diseases). Each bigraph was regarded as a Markov random field (MRF) to support the inference. We proposed three graph-based energy functions and three likelihood-based energy functions. Two of these functions are based on knowledge representation learning and can provide distributed representations of medical entities. Two EMR datasets and three metrics were utilized to evaluate the performance. RESULTS As a whole, the evaluation results indicate that the proposed system outperformed the baseline methods. The distributed representation of medical entities does reflect similarity relationships with respect to knowledge level. CONCLUSION Combining EMKN and MRF is an effective approach for general medical knowledge representation and inference. Different tasks, however, require individually designed energy functions.
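The network construction described in METHODS (entities as nodes, co-occurrence within a record as edges, then one bipartite subgraph per task) can be sketched roughly as follows. This is an illustrative Python sketch only: the entity types, the toy records, and both function names are assumptions, and the MRF energy functions from the paper are not reproduced.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence_edges(records):
    """Count how often each pair of entities appears together in one record."""
    edges = Counter()
    for record in records:
        # Each entity is a (type, name) tuple; duplicates within a record count once.
        for first, second in combinations(sorted(set(record)), 2):
            edges[(first, second)] += 1
    return edges


def extract_bigraph(edges, given_type, inferred_type):
    """Keep only edges that link a given-type entity to an inferred-type entity."""
    bigraph = Counter()
    for (first, second), weight in edges.items():
        if first[0] == given_type and second[0] == inferred_type:
            bigraph[(first, second)] += weight
        elif second[0] == given_type and first[0] == inferred_type:
            bigraph[(second, first)] += weight
    return bigraph


# Toy records, invented purely for illustration.
records = [
    [("symptom", "fever"), ("symptom", "cough"), ("disease", "influenza")],
    [("symptom", "fever"), ("disease", "influenza")],
    [("symptom", "cough"), ("disease", "bronchitis")],
]
edges = build_cooccurrence_edges(records)
diagnosis_bigraph = extract_bigraph(edges, "symptom", "disease")
print(diagnosis_bigraph[(("symptom", "fever"), ("disease", "influenza"))])
```

Edges between two symptoms are dropped by the extraction step, leaving only the symptom-to-disease links that a diagnosis task would use.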
Affiliation(s)
- Chao Zhao: School of Computer Science and Technology, Harbin, Heilongjiang 150001, China
- Jingchi Jiang: School of Computer Science and Technology, Harbin, Heilongjiang 150001, China
- Yi Guan: School of Computer Science and Technology, Harbin, Heilongjiang 150001, China
- Xitong Guo: School of Management, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
- Bin He: School of Computer Science and Technology, Harbin, Heilongjiang 150001, China
|
28
|
Abstract
Despite a newfound wealth of data and information, the healthcare sector is lacking in actionable knowledge. This is largely because healthcare data, though plentiful, tends to be inherently complex and fragmented. Health data analytics, with an emphasis on predictive analytics, is emerging as a transformative tool that can enable more proactive and preventative treatment options. This review considers the ways in which predictive analytics has been applied in the for-profit business sector to generate well-timed and accurate predictions of key outcomes, with a focus on key features that may be applicable to healthcare-specific applications. Published medical research presenting assessments of predictive analytics technology in medical applications is reviewed, with particular emphasis on how hospitals have integrated predictive analytics into their day-to-day healthcare services to improve quality of care. This review also highlights the numerous challenges of implementing predictive analytics in healthcare settings and concludes with a discussion of current efforts to implement healthcare data analytics in Saudi Arabia, a developing country.
Affiliation(s)
- Hana Alharthi: Department of Health Information Management and Technology, College of Public Health, Imam Abdulrahman Bin Faisal University (IAU), formerly known as University of Dammam (UoD), P.O. Box 2435, Dammam, 31441, Saudi Arabia
|
29
|
Chen Y, Chu CW, Chen MIC, Cook AR. The utility of LASSO-based models for real time forecasts of endemic infectious diseases: A cross country comparison. J Biomed Inform 2018; 81:16-30. [PMID: 29496631 PMCID: PMC7185473 DOI: 10.1016/j.jbi.2018.02.014] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 09/11/2017] [Revised: 01/19/2018] [Accepted: 02/24/2018] [Indexed: 01/09/2023]
Abstract
A LASSO-based forecast model for endemic infectious diseases is proposed. Predictions at 4 weeks achieve desirable accuracy. Models predict outbreaks but may struggle to predict outbreak size.
Introduction Accurate and timely prediction of endemic infectious diseases is vital for public health agencies to plan and carry out control methods at an early stage of disease outbreaks. Climatic variables have been identified as important predictors in models for infectious disease forecasts. Various approaches have been proposed in the literature to produce accurate and timely predictions and potentially improve public health response. Methods We assessed how the machine learning LASSO method may be useful in providing forecasts for different pathogens in countries with different climates. Separate LASSO models were constructed for each disease/country/forecast window, with model complexity varied by including different sets of predictors to assess the importance of different predictors under various conditions. Results There was a more apparent cyclicity for both climatic variables and incidence in regions further away from the equator. For most diseases, predictions made beyond 4 weeks ahead were increasingly discrepant from the actual scenario. Prediction models were more accurate in capturing the occurrence of an outbreak than in predicting its size. In different situations, climatic variables have different levels of importance for prediction accuracy. Conclusions For LASSO models used for prediction, including different sets of predictors has varying effects in different situations. Short-term predictions generally perform better than longer-term predictions, suggesting public health agencies may need the capacity to respond at short notice to early warnings.
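The property of LASSO that makes it useful for comparing predictor sets is that the L1 penalty shrinks weak coefficients exactly to zero. The sketch below illustrates that mechanism with a plain coordinate-descent solver in Python. It is not the authors' forecasting pipeline: the function names, the no-intercept objective, and the toy data are all assumptions made for illustration.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty, returning zero inside the band."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(rows, targets, penalty, sweeps=100):
    """Minimise (1/2n) * sum((y - X b)^2) + penalty * sum(|b|), without intercept."""
    n_samples = len(rows)
    n_features = len(rows[0])
    coefs = [0.0] * n_features
    for _ in range(sweeps):
        for j in range(n_features):
            correlation = 0.0
            squared_norm = 0.0
            for row, target in zip(rows, targets):
                # Residual when feature j is left out of the current fit.
                partial_fit = sum(
                    coefs[k] * row[k] for k in range(n_features) if k != j
                )
                correlation += row[j] * (target - partial_fit)
                squared_norm += row[j] ** 2
            correlation /= n_samples
            squared_norm /= n_samples
            if squared_norm > 0:
                coefs[j] = soft_threshold(correlation, penalty) / squared_norm
            else:
                coefs[j] = 0.0
    return coefs


# Toy data: the first predictor is strongly related to the target, the second weakly.
rows = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
targets = [2.0, -2.0, 0.1, -0.1]
print(lasso_coordinate_descent(rows, targets, penalty=0.2))
```

With the penalty set to zero the solver returns the ordinary least-squares fit on this toy data; raising the penalty removes the weak predictor entirely and shrinks the strong one.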
Affiliation(s)
- Yirong Chen: Saw Swee Hock School of Public Health, National University of Singapore and National University Health System, Tahir Foundation Building, 12 Science Drive 2, 117549, Singapore
- Collins Wenhan Chu: Genome Institute of Singapore, 60 Biopolis Street, Genome, 138672, Singapore
- Mark I C Chen: Saw Swee Hock School of Public Health, National University of Singapore and National University Health System, Tahir Foundation Building, 12 Science Drive 2, 117549, Singapore; Department of Clinical Epidemiology, Communicable Disease Centre, Tan Tock Seng Hospital, Singapore, Moulmein Road, 308433, Singapore
- Alex R Cook: Saw Swee Hock School of Public Health, National University of Singapore and National University Health System, Tahir Foundation Building, 12 Science Drive 2, 117549, Singapore
|