1
Jaotombo F, Pauly V, Fond G, Orleans V, Auquier P, Ghattas B, Boyer L. Machine-learning prediction for hospital length of stay using a French medico-administrative database. Journal of Market Access & Health Policy 2022; 11:2149318. [PMID: 36457821 PMCID: PMC9707380 DOI: 10.1080/20016689.2022.2149318] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/18/2022] [Revised: 10/17/2022] [Accepted: 11/16/2022] [Indexed: 06/17/2023]
Abstract
INTRODUCTION Prolonged hospital length of stay (PLOS) is an indicator of reduced efficiency in quality of care. One goal of public health management is to reduce PLOS by identifying its most relevant predictors. The objective of this study is to explore which machine learning (ML) models best predict PLOS.
METHODS The dataset was drawn from the French medico-administrative database (PMSI) for a retrospective cohort study of all discharges in 2015 from a large university hospital in France (APHM). The study outcome was LOS, dichotomized into long vs. short LOS at the 90th percentile (14 days). Logistic regression (LR), classification and regression trees (CART), random forest (RF), gradient boosting (GB) and neural networks (NN) were applied to the collected data. The predictive performance of the models was evaluated using the area under the ROC curve (AUC).
RESULTS The analysis included 73,182 hospitalizations, of which 7,341 (10.0%) led to PLOS. The GB classifier was the best-performing model, with the highest AUC (0.810), superior to all other models (all p-values <0.0001). The RF, GB and NN models (AUC from 0.808 to 0.810) outperformed the LR model (AUC = 0.795; all p-values <0.0001). In turn, LR outperformed CART (AUC = 0.786; p < 0.0001). The variable most predictive of PLOS was discharge to another institution after hospitalization. The typical clinical profile of these patients (17.5% of the sample) was an elderly patient admitted as an emergency for trauma or for a neurological or cardiovascular pathology, more often institutionalized, and with more comorbidities, notably mental health problems, dementia and hemiplegia.
DISCUSSION The integration of ML, particularly the GB algorithm, may help health-care professionals and bed managers better identify patients at risk of PLOS. These findings underscore the need to strengthen hospitals through targeted allocation to meet the needs of an aging population.
Affiliation(s)
- Franck Jaotombo
  - Aix-Marseille University, EA 3279 - Public Health, Chronic Diseases and Quality of Life - Research Unit, La Timone Medical University, Marseille, France
  - I2M, CNRS, UMR, Aix-Marseille University, Marseille, France
  - Operations Data and Artificial Intelligence, EM Lyon Business School, Ecully, France
- Vanessa Pauly
  - Aix-Marseille University, EA 3279 - Public Health, Chronic Diseases and Quality of Life - Research Unit, La Timone Medical University, Marseille, France
  - Service d’Information Médicale, Public Health Department, La Conception Hospital, Assistance Publique - Hôpitaux de Marseille, Marseille, France
- Guillaume Fond
  - Aix-Marseille University, EA 3279 - Public Health, Chronic Diseases and Quality of Life - Research Unit, La Timone Medical University, Marseille, France
- Veronica Orleans
  - Service d’Information Médicale, Public Health Department, La Conception Hospital, Assistance Publique - Hôpitaux de Marseille, Marseille, France
- Pascal Auquier
  - Aix-Marseille University, EA 3279 - Public Health, Chronic Diseases and Quality of Life - Research Unit, La Timone Medical University, Marseille, France
- Badih Ghattas
  - I2M, CNRS, UMR, Aix-Marseille University, Marseille, France
- Laurent Boyer
  - Aix-Marseille University, EA 3279 - Public Health, Chronic Diseases and Quality of Life - Research Unit, La Timone Medical University, Marseille, France
  - Service d’Information Médicale, Public Health Department, La Conception Hospital, Assistance Publique - Hôpitaux de Marseille, Marseille, France
2
Explaining predictive factors in patient pathways using autoencoders. PLoS One 2022; 17:e0277135. [DOI: 10.1371/journal.pone.0277135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2022] [Accepted: 10/20/2022] [Indexed: 11/12/2022]
Abstract
This paper introduces an end-to-end methodology that uses autoencoders to predict a pathway-related outcome and identify predictive factors. A formal description of autoencoders for explainable binary predictions is presented, along with two objective functions that allow negative examples to be filtered or inverted during training. A methodology to model and transform complex medical event logs is also proposed; it preserves the pathway information in terms of events and time, as well as the hierarchy information carried in medical codes. A case study is presented in which short-term mortality after the implantation of an Implantable Cardioverter-Defibrillator is predicted. The proposed methodologies have been tested and compared to other predictive methods, both explainable and non-explainable. Results show that the method is competitive in terms of performance, particularly when a Variational Auto Encoder is used with an inverse objective function. Finally, the explainability of the method has been demonstrated, allowing the identification of interesting predictive factors, which were validated using relative risks.
3
Karboub K, Tabaa M. A Machine Learning Based Discharge Prediction of Cardiovascular Diseases Patients in Intensive Care Units. Healthcare (Basel) 2022; 10:healthcare10060966. [PMID: 35742018 PMCID: PMC9222879 DOI: 10.3390/healthcare10060966] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/14/2022] [Revised: 05/03/2022] [Accepted: 05/09/2022] [Indexed: 01/12/2023]
Abstract
This paper addresses the major challenge of allocating medical resources effectively in intensive care units (ICUs). We trained multiple regression models using the Medical Information Mart for Intensive Care III (MIMIC III) database, recorded between 2001 and 2012. The training and validation dataset included recorded data from patients with pneumonia, sepsis, congestive heart failure, hypotension, chest pain, coronary artery disease, fever, respiratory failure, acute coronary syndrome, shortness of breath, seizure and transient ischemic attack, and aortic stenosis. We then tested the models on unseen data from patients diagnosed with coronary artery disease, congestive heart failure or acute coronary syndrome. We included the admission characteristics, clinical prescriptions, physiological measurements, and discharge characteristics of those patients. We assessed the models’ performance using mean residuals and running times as metrics. We ran multiple experiments to study the impact of the data partition on the learning phase. The total running time of our best-evaluated model is 123,450.9 ms. The best model gives an average accuracy of 98% and highlights the location of discharge, initial diagnosis, location of admission, drug therapy, length of stay and internal transfers as the most influential factors in deciding a patient’s readiness for discharge.
Affiliation(s)
- Kaouter Karboub
  - FRDISI, Hassan II University Casablanca, Casablanca 20000, Morocco
  - LRI-EAS, ENSEM, Hassan II University Casablanca, Casablanca 20000, Morocco
  - LGIPM, Lorraine University, 57000 Metz, France
  - Correspondence: (K.K.); (M.T.); Tel.: +212-661-943-174 (M.T.)
- Mohamed Tabaa
  - LPRI, EMSI, Casablanca 23300, Morocco
  - Correspondence: (K.K.); (M.T.); Tel.: +212-661-943-174 (M.T.)
4
Crabb BT, Hamrick F, Campbell JM, Vignolles-Jeong J, Magill ST, Prevedello DM, Carrau RL, Otto BA, Hardesty DA, Couldwell WT, Karsy M. Machine Learning-Based Analysis and Prediction of Unplanned 30-Day Readmissions After Pituitary Adenoma Resection: A Multi-Institutional Retrospective Study With External Validation. Neurosurgery 2022; 91:263-271. [PMID: 35384923 DOI: 10.1227/neu.0000000000001967] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/21/2021] [Accepted: 02/05/2022] [Indexed: 11/19/2022]
Abstract
BACKGROUND Unplanned readmission after transsphenoidal resection of pituitary adenoma can occur in up to 10% of patients but is unpredictable.
OBJECTIVE To develop a reliable system for predicting unplanned readmission and create a validated method for stratifying patients by risk.
METHODS Data sets were retrospectively collected from the National Surgical Quality Improvement Program and 2 tertiary academic medical centers. Eight machine learning classifiers were fit to the National Surgical Quality Improvement Program data, optimized using Bayesian parameter optimization and evaluated on the external data. Permutation analysis identified the relative importance of predictive variables, and a risk stratification system was built using the trained machine learning models.
RESULTS Readmissions were accurately predicted by several classification models, with an area under the receiver operating characteristic curve of 0.76 (95% CI 0.68-0.83) on the external data set. Permutation analysis identified the most important variables for predicting readmission as preoperative sodium level, return to the operating room, and total operation time. High-risk and medium-risk patients, as identified by the proposed risk stratification system, were more likely to be readmitted than low-risk patients, with relative risks of 12.2 (95% CI 5.9-26.5) and 4.2 (95% CI 2.3-8.7), respectively. Overall risk stratification showed high discriminative capability, with a C-statistic of 0.73.
CONCLUSION In this multi-institutional study with external validation, unplanned readmissions after pituitary adenoma resection were accurately predicted using machine learning techniques. The features identified in this study and the risk stratification system developed could guide clinical and surgical decision making, reduce healthcare costs, and improve the quality of patient care by better identifying high-risk patients for closer perioperative management.
Affiliation(s)
- Brendan T Crabb
  - Department of Neurosurgery, University of Utah, Salt Lake City, Utah, USA
- Forrest Hamrick
  - Department of Neurosurgery, University of Utah, Salt Lake City, Utah, USA
- Justin M Campbell
  - Department of Neurosurgery, University of Utah, Salt Lake City, Utah, USA
- Stephen T Magill
  - Department of Neurosurgery, The Ohio State University, Columbus, Ohio, USA
- Ricardo L Carrau
  - Department of Neurosurgery, The Ohio State University, Columbus, Ohio, USA
- Bradley A Otto
  - Department of Neurosurgery, The Ohio State University, Columbus, Ohio, USA
- Douglas A Hardesty
  - Department of Neurosurgery, The Ohio State University, Columbus, Ohio, USA
- Michael Karsy
  - Department of Neurosurgery, University of Utah, Salt Lake City, Utah, USA
5
Abstract
Health is often qualitatively defined as a status free from disease and its quantitative definition requires finding the boundary separating health from pathological conditions. Since many complex diseases have a strong genetic component, substantial efforts have been made to sequence large-scale personal genomes; however, we are not yet able to effectively quantify health status from personal genomes. Since mutational impacts are ultimately manifested at the protein level, we envision that introducing a panoramic proteomic view of complex diseases will allow us to mechanistically understand the molecular etiologies of human diseases. In this perspective article, we will highlight key proteomic approaches to identify pathogenic mutations and map their convergent pathways underlying disease pathogenesis and the integration of omics data at multiple levels to define the borderline between health and disease.
Affiliation(s)
- Mara Zilocchi
  - Department of Biochemistry, University of Regina, Regina, Saskatchewan S4S 0A2, Canada
- Cheng Wang
  - The Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, the Bakar Computational Health Sciences Institute, the Parker Institute for Cancer Immunotherapy, and the Department of Neurology, School of Medicine, University of California, San Francisco, CA, USA
- Mohan Babu
  - Department of Biochemistry, University of Regina, Regina, Saskatchewan S4S 0A2, Canada
- Jingjing Li
  - The Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, the Bakar Computational Health Sciences Institute, the Parker Institute for Cancer Immunotherapy, and the Department of Neurology, School of Medicine, University of California, San Francisco, CA, USA