1
Sulaiman WA, Stylianides C, Nikolaou A, Antoniou Z, Constantinou I, Palazis L, Vavlitou A, Kyprianou T, Kyriacou E, Kakas A, Pattichis MS, Panayides AS, Pattichis CS. Leveraging machine learning and rule extraction for enhanced transparency in emergency department length of stay prediction. Front Digit Health 2025; 6:1498939. [PMID: 40012602 PMCID: PMC11861435 DOI: 10.3389/fdgth.2024.1498939] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2024] [Accepted: 12/16/2024] [Indexed: 02/28/2025]
Abstract
This study aims to address the critical issue of emergency department (ED) overcrowding, which negatively affects patient outcomes, wait times, and resource efficiency. Accurate prediction of ED length of stay (LOS) can streamline operations and improve care delivery. We utilized the MIMIC-IV-ED dataset, comprising over 400,000 patient records, to classify ED LOS into short (≤4.5 hours) and long (>4.5 hours) categories. Using machine learning models, including Gradient Boosting (GB), Random Forest (RF), Logistic Regression (LR), and Multilayer Perceptron (MLP), we identified GB as the best-performing model, with an AUC of 0.730, accuracy of 69.93%, sensitivity of 88.20%, and specificity of 40.95% on the original dataset. On the balanced dataset, GB had an AUC of 0.729, accuracy of 68.86%, sensitivity of 75.39%, and specificity of 58.59%. To enhance interpretability, a novel rule extraction method for the GB model was implemented using the most important predictors, such as triage acuity, comorbidity scores, and arrival methods. By combining predictive analytics with interpretable rule-based methods, this research provides actionable insights for optimizing patient flow and resource allocation. The findings highlight the importance of transparency in machine learning applications for healthcare, paving the way for future improvements in model performance and clinical adoption.
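The relationship between the 4.5-hour threshold and the reported accuracy, sensitivity, and specificity can be sketched as follows. This is only an illustration: the function names and the toy data are hypothetical, and treating "long stay" as the positive class is an assumption, since the abstract does not say which class the authors treated as positive.

```python
from typing import Dict, Sequence

LONG_STAY_THRESHOLD_HOURS = 4.5  # threshold stated in the abstract


def label_long_stay(los_hours: float) -> int:
    """Return 1 for a long stay (> 4.5 h) and 0 for a short stay (<= 4.5 h)."""
    return 1 if los_hours > LONG_STAY_THRESHOLD_HOURS else 0


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, sensitivity and specificity from binary labels.

    Assumes class 1 (long stay) is the positive class and that both
    classes occur at least once in y_true.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of long stays that were caught
        "specificity": tn / (tn + fp),  # share of short stays that were caught
    }


# Toy data, invented for illustration only (not taken from MIMIC-IV-ED).
los_hours = [2.0, 4.5, 5.0, 9.0, 3.0, 6.5]
y_true = [label_long_stay(h) for h in los_hours]
y_pred = [0, 1, 1, 1, 0, 0]  # hypothetical model output
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

A model with high sensitivity but low specificity, as reported for the original dataset, finds most cases of the positive class at the cost of many false alarms; balancing the data trades some of that sensitivity for specificity.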
Affiliation(s)
- Waqar A. Sulaiman
  - Department of Computer Science and Biomedical Engineering Research Centre, University of Cyprus, Nicosia, Cyprus
- Andria Nikolaou
  - Department of Computer Science and Biomedical Engineering Research Centre, University of Cyprus, Nicosia, Cyprus
  - CYENS Centre of Excellence, Nicosia, Cyprus
- Lakis Palazis
  - Department of Intensive Care Medicine, Limassol General Hospital, State Health Services Organisation, Nicosia, Cyprus
- Anna Vavlitou
  - Department of Intensive Care Medicine, Limassol General Hospital, State Health Services Organisation, Nicosia, Cyprus
- Theodoros Kyprianou
  - Department of Critical Care and Emergency Medicine, Medical School, University of Nicosia, Nicosia, Cyprus
  - Department of Critical Care, St Thomas's Hospital NHS, London, United Kingdom
- Efthyvoulos Kyriacou
  - Department of Electrical Engineering, Computer Engineering and Informatics, Cyprus University of Technology, Limassol, Cyprus
- Antonis Kakas
  - Department of Computer Science and Biomedical Engineering Research Centre, University of Cyprus, Nicosia, Cyprus
  - CYENS Centre of Excellence, Nicosia, Cyprus
- Marios S. Pattichis
  - Department of Electrical and Computer Engineering, University of New Mexico, Albuquerque, NM, United States
- Constantinos S. Pattichis
  - Department of Computer Science and Biomedical Engineering Research Centre, University of Cyprus, Nicosia, Cyprus
  - CYENS Centre of Excellence, Nicosia, Cyprus
2
Zaboli A, Turcato G, Brigiari G, Massar M, Ziller M, Sibilio S, Brigo F. Emergency Departments in Contemporary Healthcare: Are They Still for Emergencies? An Analysis of over 1 Million Attendances. Healthcare (Basel) 2024; 12:2426. [PMID: 39685048 DOI: 10.3390/healthcare12232426] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2024] [Revised: 11/18/2024] [Accepted: 11/30/2024] [Indexed: 12/18/2024]
Abstract
BACKGROUND Over the past few decades, emergency departments (EDs) have experienced an increasing workload. However, the variation in the types of patient accesses to these departments remains poorly understood. OBJECTIVE To evaluate the 5-year temporal trend in the volume of patients attending EDs based on the urgency of their conditions. METHODS This multicenter observational retrospective study was conducted from 1 January 2019, to 31 December 2023, across seven Italian EDs located within the same province. All patients accessing the EDs during the study period were included, totaling 1,282,735 patients. The triage code was used as an urgency index; non-urgent patients were defined as those who received a code 4 or 5 in triage, while urgent patients were defined as those who received a code 3, 2, or 1 in triage. Temporal analyses of admissions were conducted, also evaluating individual age groups to understand behavior over time. RESULTS From 2019 to 2023, there was a significant 10% increase in ED attendances by non-urgent patients. This increase was observed during both daytime and nighttime shifts. Notably, all age groups showed an increase in non-urgent patients, except for pediatric patients aged 0 to 14. CONCLUSIONS Over the past 5 years, there has been a consistent upward trend in ED attendances by non-urgent patients. Healthcare policies should consider implementing strategies to manage or mitigate the overload in EDs, particularly related to non-urgent patient accesses.
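The urgency definition in the METHODS above (triage codes 4 and 5 as non-urgent, codes 1 to 3 as urgent) can be sketched in a few lines. The function names and the per-year toy data are hypothetical and are not taken from the study; the sketch only shows how a yearly share of non-urgent attendances might be derived from triage codes.

```python
from typing import Dict, Iterable, List

NON_URGENT_CODES = {4, 5}  # as defined in the abstract
URGENT_CODES = {1, 2, 3}   # as defined in the abstract


def urgency_group(triage_code: int) -> str:
    """Map a triage code to the urgency group used in the abstract."""
    if triage_code in NON_URGENT_CODES:
        return "non-urgent"
    if triage_code in URGENT_CODES:
        return "urgent"
    raise ValueError(f"unexpected triage code: {triage_code}")


def non_urgent_share(triage_codes: Iterable[int]) -> float:
    """Return the fraction of attendances classified as non-urgent."""
    groups = [urgency_group(code) for code in triage_codes]
    if not groups:
        raise ValueError("no attendances supplied")
    return groups.count("non-urgent") / len(groups)


# Toy triage codes per year, invented for illustration only.
attendances_by_year: Dict[int, List[int]] = {
    2019: [1, 3, 4, 5, 2, 4],
    2023: [4, 5, 4, 3, 5, 4],
}
shares = {year: non_urgent_share(codes) for year, codes in attendances_by_year.items()}
print(shares)
```

Comparing such shares across years, overall and within age groups, is the kind of temporal analysis the abstract describes; the statistical testing used by the authors is not specified here and is not reproduced.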
Affiliation(s)
- Arian Zaboli
  - Innovation, Research and Teaching Service (SABES-ASDAA), Teaching Hospital of the Paracelsus Medical Private University (PMU), 39100 Bolzano, Italy
- Gianni Turcato
  - Department of Internal Medicine, Intermediate Care Unit, Hospital Alto Vicentino (AULSS-7), 36014 Santorso, Italy
- Gloria Brigiari
  - Unit of Biostatistics, Epidemiology and Public Health, Department of Cardiac, Thoracic, Vascular Sciences and Public Health, University of Padova, 35129 Padova, Italy
- Magdalena Massar
  - Innovation, Research and Teaching Service (SABES-ASDAA), Teaching Hospital of the Paracelsus Medical Private University (PMU), 39100 Bolzano, Italy
- Marta Ziller
  - Cardiology Department, Hospital of Bolzano, 39100 Bolzano, Italy
- Serena Sibilio
  - Department Public Health, Institute of Nursing Science, Universität Basel, 4051 Basel, Switzerland
- Francesco Brigo
  - Innovation, Research and Teaching Service (SABES-ASDAA), Teaching Hospital of the Paracelsus Medical Private University (PMU), 39100 Bolzano, Italy
3
Farimani RM, Karim H, Atashi A, Tohidinezhad F, Bahaadini K, Abu-Hanna A, Eslami S. Models to predict length of stay in the emergency department: a systematic literature review and appraisal. BMC Emerg Med 2024; 24:54. [PMID: 38575857 PMCID: PMC10996208 DOI: 10.1186/s12873-024-00965-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2023] [Accepted: 03/11/2024] [Indexed: 04/06/2024]
Abstract
INTRODUCTION Prolonged Length of Stay (LOS) in the Emergency Department (ED) has been associated with poor clinical outcomes. Prediction of ED LOS may help optimize resource utilization, clinical management, and benchmarking. This study aims to systematically review models for predicting ED LOS and to assess the reporting and methodological quality of these models. METHODS The online databases PubMed, Scopus, and Web of Science were searched (10 Sep 2023) for English language articles that reported prediction models of LOS in the ED. Identified titles and abstracts were independently screened by two reviewers. All original papers describing either development (with or without internal validation) or external validation of a prediction model for LOS in the ED were included. RESULTS Of 12,193 uniquely identified articles, 34 studies were included (29 describe the development of new models and five describe the validation of existing models). Different statistical and machine learning methods were applied in the papers. On the 39-point reporting score and 11-point methodological quality score, the highest reporting scores for development and validation studies were 39 and 8, respectively. CONCLUSION Various studies on prediction models for ED LOS have been published, but they are fairly heterogeneous and suffer from methodological and reporting issues. Model development studies were associated with a poor to fair level of methodological quality in terms of the predictor selection approach, the sample size, reproducibility of the results, the missing-data imputation technique, and avoiding dichotomizing continuous variables. Moreover, it is recommended that future investigators use the confirmed checklist to improve the quality of reporting.
Affiliation(s)
- Hesam Karim
  - Department of Health Information Management, Faculty of Allied Medical Sciences, Tehran University of Medical Sciences, Tehran, Iran
- Alireza Atashi
  - E-Health Department, Virtual School, Tehran University of Medical Sciences, Tehran, Iran
- Fariba Tohidinezhad
  - Department of Medical Informatics, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
- Kambiz Bahaadini
  - Department of Medical Informatics, Kerman University of Medical Sciences, Kerman, Iran
- Ameen Abu-Hanna
  - Medical Informatics, UMC Location University of Amsterdam, Meibergdreef, Amsterdam, The Netherlands
  - Amsterdam Public Health, Amsterdam, The Netherlands
- Saeid Eslami
  - Department of Medical Informatics, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
  - Medical Informatics, UMC Location University of Amsterdam, Meibergdreef, Amsterdam, The Netherlands
  - Pharmaceutical Research Center, School of Pharmacy, Mashhad University of Medical Sciences, Mashhad, Iran
4
Ricciardi C, Marino MR, Trunfio TA, Majolo M, Romano M, Amato F, Improta G. Evaluation of different machine learning algorithms for predicting the length of stay in the emergency departments: a single-centre study. Front Digit Health 2024; 5:1323849. [PMID: 38259256 PMCID: PMC10800466 DOI: 10.3389/fdgth.2023.1323849] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2023] [Accepted: 12/15/2023] [Indexed: 01/24/2024]
Abstract
Background Recently, crowding in emergency departments (EDs) has become a recognised critical factor impacting global public healthcare, resulting from both the rising supply/demand mismatch in medical services and the paucity of hospital beds available in inpatient units and EDs. The length of stay in the ED (ED-LOS) has been found to be a significant indicator of ED bottlenecks. The time a patient spends in the ED is quantified by measuring the ED-LOS, which can be influenced by inefficient care processes and results in increased mortality and health expenditure. Therefore, it is critical to understand the major factors influencing the ED-LOS through forecasting tools enabling early improvements. Methods The purpose of this work is to predict ED-LOS using a limited set of features, related both to patient characteristics and to ED workflow. Different factors were chosen (age, gender, triage level, time of admission, arrival mode) and analysed. Then, machine learning (ML) algorithms were employed to predict ED-LOS. ML procedures were implemented on a dataset of patients obtained from the ED database of the "San Giovanni di Dio e Ruggi d'Aragona" University Hospital (Salerno, Italy) for the period 2014-2019. Results For the years considered, 496,172 admissions were evaluated and 143,641 of them (28.9%) revealed a prolonged ED-LOS. Considering the complete data (48.1% female vs. 51.9% male), 51.7% of patients with prolonged ED-LOS were male and 47.3% were female. Regarding the age groups, the patients most affected by prolonged ED-LOS were over 64 years. The Random Forest algorithm achieved the best evaluation metrics, with the highest accuracy (74.8%), precision (72.8%), and recall (74.8%) in predicting ED-LOS. Conclusions Different variables, referring to patients' personal and clinical attributes and to the ED process, have a direct impact on the value of ED-LOS. The suggested prediction model has encouraging results; thus, it may be applied to anticipate and manage ED-LOS, preventing crowding and optimising effectiveness and efficiency of the ED.
Affiliation(s)
- Carlo Ricciardi
  - Department of Electrical Engineering and Information Technology, University of Naples “Federico II”, Naples, Italy
- Teresa Angela Trunfio
  - Department of Advanced Biomedical Sciences, University of Naples “Federico II”, Naples, Italy
- Massimo Majolo
  - Department of Public Health, University of Naples “Federico II”, Naples, Italy
- Maria Romano
  - Department of Electrical Engineering and Information Technology, University of Naples “Federico II”, Naples, Italy
- Francesco Amato
  - Department of Electrical Engineering and Information Technology, University of Naples “Federico II”, Naples, Italy
- Giovanni Improta
  - Department of Public Health, University of Naples “Federico II”, Naples, Italy
  - Interdepartmental Center for Research in Healthcare Management and Innovation in Healthcare (CIRMIS), University of Naples “Federico II”, Naples, Italy
5
Zeleke AJ, Palumbo P, Tubertini P, Miglio R, Chiari L. Machine learning-based prediction of hospital prolonged length of stay admission at emergency department: a Gradient Boosting algorithm analysis. Front Artif Intell 2023; 6:1179226. [PMID: 37588696 PMCID: PMC10426288 DOI: 10.3389/frai.2023.1179226] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 03/06/2023] [Accepted: 07/10/2023] [Indexed: 08/18/2023]
Abstract
Objective This study aims to develop and compare different models to predict the Length of Stay (LoS) and the Prolonged Length of Stay (PLoS) of inpatients admitted through the emergency department (ED) in general patient settings. The aim is not to promote any specific model but rather to suggest a decision-supporting tool (i.e., a prediction framework). Methods We analyzed a dataset of patients admitted through the ED to the "Sant'Orsola Malpighi" University Hospital of Bologna, Italy, between January 1 and October 26, 2022. PLoS was defined as any hospitalization with LoS longer than 6 days. We deployed six classification algorithms for predicting PLoS: Random Forest (RF), Support Vector Machines (SVM), Gradient Boosting (GB), AdaBoost, K-Nearest Neighbors (KNN), and logistic regression (LoR). We evaluated the performance of these models with the Brier score, the area under the ROC curve (AUC), accuracy, sensitivity (recall), specificity, precision, and F1-score. We further developed eight regression models for LoS prediction: Linear Regression (LR), including the penalized linear models Least Absolute Shrinkage and Selection Operator (LASSO), Ridge and Elastic-net regression, Support vector regression, RF regression, KNN, and eXtreme Gradient Boosting (XGBoost) regression. The model performances were measured by their mean square error, mean absolute error, and mean relative error. The dataset was randomly split into a training set (70%) and a validation set (30%). Results A total of 12,858 eligible patients were included in our study, of whom 60.88% had a PLoS. The GB classifier best predicted PLoS (accuracy 75%, AUC 75.4%, Brier score 0.181), followed by the LoR classifier (accuracy 75%, AUC 75.2%, Brier score 0.182). These models were also shown to be adequately calibrated. Ridge and XGBoost regressions best predicted LoS, with the smallest total prediction error. The overall prediction error is between 6 and 7 days, meaning there is a 6-7 day mean difference between actual and predicted LoS. Conclusion Our results demonstrate the potential of machine learning-based methods to predict LoS and provide valuable insights into the risks behind prolonged hospitalizations. In addition to physicians' clinical expertise, the results of these models can be utilized as input to make informed decisions, such as predicting hospitalizations and enhancing the overall performance of a public healthcare system.
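The Brier score reported above is the mean squared difference between predicted probabilities and observed binary outcomes, so lower values indicate better-calibrated and more accurate probabilities. A minimal sketch, assuming the PLoS definition from the abstract (LoS longer than 6 days); the function names, stays, and predicted probabilities are invented for illustration and do not come from the study.

```python
from typing import Sequence

PLOS_THRESHOLD_DAYS = 6  # PLoS definition stated in the abstract


def is_prolonged(los_days: float) -> int:
    """Return 1 if the stay counts as prolonged (LoS > 6 days), else 0."""
    return 1 if los_days > PLOS_THRESHOLD_DAYS else 0


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(y_true) != len(y_prob) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Toy stays and hypothetical predicted probabilities of PLoS.
los_days = [3, 7, 10, 6]
y_true = [is_prolonged(d) for d in los_days]
y_prob = [0.2, 0.7, 0.9, 0.4]
score = brier_score(y_true, y_prob)
print(round(score, 3))
```

For reference, a model that always predicts a probability of 0.5 scores 0.25, which gives some context for the reported values of about 0.18.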
Affiliation(s)
- Addisu Jember Zeleke
  - Department of Electrical, Electronic, and Information Engineering Guglielmo Marconi, University of Bologna, Bologna, Italy
- Pierpaolo Palumbo
  - Department of Electrical, Electronic, and Information Engineering Guglielmo Marconi, University of Bologna, Bologna, Italy
- Paolo Tubertini
  - Enterprise Information Systems for Integrated Care and Research Data Management, Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Azienda Ospedaliero-Universitaria di Bologna, Bologna, Italy
- Rossella Miglio
  - Department of Statistical Sciences, University of Bologna, Bologna, Italy
- Lorenzo Chiari
  - Department of Electrical, Electronic, and Information Engineering Guglielmo Marconi, University of Bologna, Bologna, Italy
  - Health Sciences and Technologies Interdepartmental Center for Industrial Research (CIRI SDV), University of Bologna, Bologna, Italy
6
Saleem SG, Ali S, Akhtar A, Khatri A, Ashraf N, Jamal I, Maroof Q, Aziz T, Mukhtar S. Impact of sequential capacity building on emergency department organisational flow during COVID-19 pandemic: a quasi-experimental study in a low-resource, tertiary care centre. BMJ Open 2023; 13:e063413. [PMID: 37474172 PMCID: PMC10360418 DOI: 10.1136/bmjopen-2022-063413] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/22/2023]
Abstract
INTRODUCTION A quasi-experimental study was conducted to estimate the impact of sequential emergency department (ED) capacity building interventions on key performance indicators such as patients' length of stay (LOS) and wait time (WT) during the COVID-19 pandemic. This was achieved through augmenting personnel education and head count, space restructuring and workflow reorganisation. SETTING AND PARTICIPANTS This study included 268 352 patients presenting from January 2019 to December 2020 at Indus Hospital and Health network Karachi, a philanthropic tertiary healthcare facility in a city of 20 million residents. A follow-up study was undertaken from January to December 2021 with 123 938 participants. PRIMARY AND SECONDARY OUTCOME MEASURES These included mean and median ED-LOS and WT for participants presenting in different cohorts. The results of the pre-COVID-19 year 2019 (phase 0) were compared with that of the COVID-19 year, 2020 (phases 1-3 corresponding to peaks, and phase 4 corresponding to reduction in caseloads). The follow-up was conducted in 2021 to see the sustainability of the sequential capacity building. RESULTS Phases 1, 2 and 3 had a lower mean adjusted LOS (4.42, 3.92 and 4.40 hours) compared with phase 0 (4.78 hours, p<0.05) with the lowest numbers seen in phase 2. The same held true for WT with 45.1, 23.8 and 30.4 min in phases 1-3 compared with 49.9 in phase 0. However, phase 4 had a higher LOS but a lower WT when compared with phase 0 with a p<0.05. CONCLUSION Sequential capacity building and improving the operational flow through stage appropriate interventions can be used to off-load ED patients and improve process flow metrics. This shows that models created during COVID-19 can be used to develop sustainable solutions and investment is needed in ideas such as ED-based telehealth to improve patient satisfaction and outcomes.
Affiliation(s)
- Saima Ali
  - Indus Hospital & Health Network, Karachi, Pakistan
- Ahwaz Akhtar
  - Indus Hospital & Health Network, Karachi, Pakistan
- Adeel Khatri
  - Indus Hospital & Health Network, Karachi, Pakistan
- Imran Jamal
  - Indus Hospital & Health Network, Karachi, Pakistan
- Tariq Aziz
  - Indus Hospital & Health Network, Karachi, Pakistan
- Sama Mukhtar
  - Indus Hospital & Health Network, Karachi, Pakistan
7
Rahman MA, Moayedikia A, Wiil UK. Editorial: Data-driven technologies for future healthcare systems. Frontiers in Medical Technology 2023; 5:1183687. [PMID: 37293511 PMCID: PMC10244758 DOI: 10.3389/fmedt.2023.1183687] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/10/2023] [Accepted: 05/15/2023] [Indexed: 06/10/2023]
Affiliation(s)
- Md Anisur Rahman
  - School of Computing, Mathematics and Engineering, Charles Sturt University, Bathurst, NSW, Australia
- Alireza Moayedikia
  - Department of Business Technology and Entrepreneurship, Swinburne University of Technology, Hawthorn, VIC, Australia
- Uffe Kock Wiil
  - SDU Health Informatics and Technology, University of Southern Denmark, Odense, Denmark
8
Developing a machine learning model to predict patient need for computed tomography imaging in the emergency department. PLoS One 2022; 17:e0278229. [PMID: 36520785 PMCID: PMC9754219 DOI: 10.1371/journal.pone.0278229] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 08/03/2022] [Accepted: 11/13/2022] [Indexed: 12/23/2022]
Abstract
Overcrowding is a well-known problem in hospitals and emergency departments (ED) that can negatively impact patients and staff. This study aims to present a machine learning model to detect a patient's need for a Computed Tomography (CT) exam in the emergency department at the earliest possible time. The data for this work was collected from the ED at Thunder Bay Regional Health Sciences Centre over one year (05/2016-05/2017) and contained administrative triage information. The target outcome was whether or not a patient required a CT exam. Multiple combinations of text embedding methods, machine learning algorithms, and data resampling methods were experimented with to find the optimal model for this task. The final model was trained with 81,118 visits and tested on a hold-out test set of 9,013 visits. The best model achieved a ROC AUC score of 0.86 and had a sensitivity of 87.3% and specificity of 70.9%. The most important factors that led to a CT scan order were found to be chief complaint, treatment area, and triage acuity. The proposed model was able to successfully identify patients needing a CT using administrative triage data that is available at the initial stage of a patient's arrival. By determining that a CT scan is needed early in the patient's visit, the ED can allocate resources to ensure these investigations are completed quickly and patient flow is maintained to reduce overcrowding.
9
Gurazada SG, Gao SC, Burstein F, Buntine P. Predicting Patient Length of Stay in Australian Emergency Departments Using Data Mining. Sensors 2022; 22:s22134968. [PMID: 35808458 PMCID: PMC9269793 DOI: 10.3390/s22134968] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/06/2022] [Revised: 06/14/2022] [Accepted: 06/28/2022] [Indexed: 02/01/2023]
Abstract
Length of Stay (LOS) is an important performance metric in Australian Emergency Departments (EDs). Recent evidence suggests that an LOS in excess of 4 h may be associated with increased mortality, but despite this, the average LOS continues to remain greater than 4 h in many EDs. Previous studies have found that Data Mining (DM) can be used to help hospitals to manage this metric and there is continued research into identifying factors that cause delays in ED LOS. Despite this, there is still a lack of specific research into how DM could use these factors to manage ED LOS. This study adds to the emerging literature and offers evidence that it is possible to predict delays in ED LOS to offer Clinical Decision Support (CDS) by using DM. Sixteen potentially relevant factors that impact ED LOS were identified through a literature survey and subsequently used as predictors to create six Data Mining Models (DMMs). An extract based on the Victorian Emergency Minimum Dataset (VEMD) was used to obtain relevant patient details and the DMMs were implemented using the Weka Software. The DMMs implemented in this study were successful in identifying the factors that were most likely to cause ED LOS > 4 h and also identify their correlation. These DMMs can be used by hospitals, not only to identify risk factors in their EDs that could lead to ED LOS > 4 h, but also to monitor these factors over time.
Affiliation(s)
- Sai Gayatri Gurazada
  - Faculty of Information Technology, Monash University, Clayton, Melbourne, VIC 3800, Australia
- Shijia Caddie Gao
  - Faculty of Information Technology, Monash University, Clayton, Melbourne, VIC 3800, Australia
- Frada Burstein
  - Faculty of Information Technology, Monash University, Clayton, Melbourne, VIC 3800, Australia
- Paul Buntine
  - Eastern Health Clinical School Monash University, Box Hill, Melbourne, VIC 3128, Australia
10
Douthit BJ, Walden RL, Cato K, Coviak CP, Cruz C, D'Agostino F, Forbes T, Gao G, Kapetanovic TA, Lee MA, Pruinelli L, Schultz MA, Wieben A, Jeffery AD. Data Science Trends Relevant to Nursing Practice: A Rapid Review of the 2020 Literature. Appl Clin Inform 2022; 13:161-179. [PMID: 35139564 PMCID: PMC8828453 DOI: 10.1055/s-0041-1742218] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/18/2023]
Abstract
BACKGROUND The term "data science" encompasses several methods, many of which are considered cutting edge and are being used to influence care processes across the world. Nursing is an applied science and a key discipline in health care systems in both clinical and administrative areas, making the profession increasingly influenced by the latest advances in data science. The greater informatics community should be aware of current trends regarding the intersection of nursing and data science, as developments in nursing practice have cross-professional implications. OBJECTIVES This study aimed to summarize the latest (calendar year 2020) research and applications of nursing-relevant patient outcomes and clinical processes in the data science literature. METHODS We conducted a rapid review of the literature to identify relevant research published during the year 2020. We explored the following 16 topics: (1) artificial intelligence/machine learning credibility and acceptance, (2) burnout, (3) complex care (outpatient), (4) emergency department visits, (5) falls, (6) health care-acquired infections, (7) health care utilization and costs, (8) hospitalization, (9) in-hospital mortality, (10) length of stay, (11) pain, (12) patient safety, (13) pressure injuries, (14) readmissions, (15) staffing, and (16) unit culture. RESULTS Of 16,589 articles, 244 were included in the review. All topics were represented by literature published in 2020, ranging from 1 article to 59 articles. Numerous contemporary data science methods were represented in the literature including the use of machine learning, neural networks, and natural language processing. CONCLUSION This review provides an overview of the data science trends that were relevant to nursing practice in 2020. Examinations of such literature are important to monitor the status of data science's influence in nursing practice.
Affiliation(s)
- Brian J. Douthit
  - Tennessee Valley Healthcare System, U.S. Department of Veterans Affairs; Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, United States
- Rachel L. Walden
  - Annette and Irwin Eskind Family Biomedical Library, Vanderbilt University, Nashville, Tennessee, United States
- Kenrick Cato
  - Department of Emergency Medicine, Columbia University School of Nursing, New York, New York, United States
- Cynthia P. Coviak
  - Professor Emerita of Nursing, Grand Valley State University, Allendale, Michigan, United States
- Christopher Cruz
  - Global Health Technology and Informatics, Chevron, San Ramon, California, United States
- Fabio D'Agostino
  - Department of Medicine and Surgery, Saint Camillus International University of Health Sciences, Rome, Italy
- Thompson Forbes
  - College of Nursing, East Carolina University, Greenville, North Carolina, United States
- Grace Gao
  - Department of Nursing, St Catherine University, Saint Paul, Minnesota, United States
- Theresa A. Kapetanovic
  - College of Nursing, East Carolina University, Greenville, North Carolina, United States
- Mikyoung A. Lee
  - College of Nursing, Texas Woman's University, Denton, Texas, United States
- Lisiane Pruinelli
  - School of Nursing, University of Minnesota, Minneapolis, Minnesota, United States
- Mary A. Schultz
  - Department of Nursing, California State University, San Bernardino, California, United States
- Ann Wieben
  - School of Nursing, University of Wisconsin-Madison, Wisconsin, United States
- Alvin D. Jeffery
  - School of Nursing, Vanderbilt University; Tennessee Valley Healthcare System, U.S. Department of Veterans Affairs, Nashville, Tennessee, United States
  - Address for correspondence: Alvin D. Jeffery, PhD, RN-BC, CCRN-K, FNP-BC, 461 21st Avenue South, Nashville, TN 37240, United States
11
Different Data Mining Approaches Based Medical Text Data. Journal of Healthcare Engineering 2021; 2021:1285167. [PMID: 34912530 PMCID: PMC8668297 DOI: 10.1155/2021/1285167] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 08/02/2021] [Accepted: 11/18/2021] [Indexed: 12/15/2022]
Abstract
The amount of medical text data is increasing dramatically. Medical text data record the progress of medicine and contain a large amount of medical knowledge. As natural language, they are semistructured, high-dimensional, and high in data volume and semantic content, and they cannot participate directly in arithmetic operations. Therefore, extracting useful knowledge or information from the total available data is a very important task, and various data mining techniques can be used for this purpose. In the current study, we reviewed different approaches to medical text data mining. The advantages and shortcomings of each technique across the different processes of medical text data mining were analyzed. We also explored the applications of algorithms for providing insights to users and enabling them to use the resources for specific challenges in medical text data. Further, the main challenges in medical text data mining were discussed. The findings of this paper can help researchers choose suitable techniques for mining medical text data and present the main challenges they face in medical text data mining.
|
12
|
Rahman MA, Duradoni M, Guazzini A. Identification and prediction of phubbing behavior: a data-driven approach. Neural Comput Appl 2021. [DOI: 10.1007/s00521-021-06649-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 10/19/2022]
|
13
|
Naemi A, Schmidt T, Mansourvar M, Ebrahimi A, Wiil UK. Quantifying the impact of addressing data challenges in prediction of length of stay. BMC Med Inform Decis Mak 2021; 21:298. [PMID: 34749708 PMCID: PMC8576901 DOI: 10.1186/s12911-021-01660-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/19/2021] [Accepted: 10/17/2021] [Indexed: 12/23/2022] Open
Abstract
BACKGROUND Prediction of length of stay (LOS) at admission time can provide physicians and nurses insight into the illness severity of patients and aid them in avoiding adverse events and clinical deterioration. It also assists hospitals with more effectively managing their resources and manpower. METHODS In this field of research, there are some important challenges, such as missing values and LOS data skewness. Moreover, various studies use a binary classification which puts a wide range of patients with different conditions into one category. To address these shortcomings, first multivariate imputation techniques are applied to fill incomplete records, then two proper resampling techniques, namely Borderline-SMOTE and SMOGN, are applied to address data skewness in the classification and regression domains, respectively. Finally, machine learning (ML) techniques including neural networks, extreme gradient boosting, random forest, support vector machine, and decision tree are implemented for both approaches to predict LOS of patients admitted to the Emergency Department of Odense University Hospital between June 2018 and April 2019. The ML models are developed based on data obtained from patients at admission time, including pulse rate, arterial blood oxygen saturation, respiratory rate, systolic blood pressure, triage category, arrival ICD-10 codes, age, and gender. RESULTS The performance of predictive models before and after addressing missing values and data skewness is evaluated using four evaluation metrics namely receiver operating characteristic, area under the curve (AUC), R-squared score (R²), and normalized root mean square error (NRMSE). Results show that the performance of predictive models is improved on average by 15.75% for AUC, 32.19% for R² score, and 11.32% for NRMSE after addressing the mentioned challenges. Moreover, our results indicate that there is a relationship between the missing values rate, data skewness, and illness severity of patients, so it is clinically essential to take incomplete records of patients into account and apply proper solutions for interpolation of missing values. CONCLUSION We propose a new method comprised of three stages: missing values imputation, data skewness handling, and building predictive models based on classification and regression approaches. Our results indicated that addressing these challenges in a proper way enhanced the performance of models significantly, which led to a more valid prediction of LOS.
Affiliation(s)
- Amin Naemi
- Center for Health Informatics and Technology, The Maersk Mc-Kinney Institute, University of Southern Denmark, Odense, Denmark
| | - Thomas Schmidt
- Center for Health Informatics and Technology, The Maersk Mc-Kinney Institute, University of Southern Denmark, Odense, Denmark
| | - Marjan Mansourvar
- Department of Mathematics and Computer Science (IMADA), University of Southern Denmark, Odense, Denmark
| | - Ali Ebrahimi
- Center for Health Informatics and Technology, The Maersk Mc-Kinney Institute, University of Southern Denmark, Odense, Denmark
| | - Uffe Kock Wiil
- Center for Health Informatics and Technology, The Maersk Mc-Kinney Institute, University of Southern Denmark, Odense, Denmark
| |
|
14
|
Wang X, Blumenthal HJ, Hoffman D, Benda N, Kim T, Perry S, Franklin ES, Roth EM, Hettinger AZ, Bisantz AM. Modeling patient-related workload in the emergency department using electronic health record data. Int J Med Inform 2021; 150:104451. [PMID: 33862507 DOI: 10.1016/j.ijmedinf.2021.104451] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 11/07/2020] [Revised: 03/29/2021] [Accepted: 03/30/2021] [Indexed: 11/24/2022]
Abstract
INTRODUCTION Understanding and managing clinician workload is important for clinician (nurses, physicians and advanced practice providers) occupational health as well as patient safety. Efforts have been made to develop strategies for managing clinician workload by improving patient assignment. The goal of the current study is to use electronic health record (EHR) data to predict the amount of work that individual patients contribute to clinician workload (patient-related workload). METHODS One month of EHR data was retrieved from an emergency department (ED). A list of workload indicators and five potential workload proxies were extracted from the data. Linear regression and four machine learning classification algorithms were utilized to model the relationship between the indicators and the proxies. RESULTS Linear regression proved that the indicators explained a substantial amount of variance of the proxies (four out of five proxies were modeled with R² > 0.80). Classification algorithms also showed success in classifying a patient as having high or low task demand based on data from early in the ED visit (e.g., 80% accurate binary classification with data from the first hour). CONCLUSION The main contribution of this study is demonstrating the potential of using EHR data to predict patient-related workload automatically in the ED. The predicted workload can potentially help in managing clinician workload by supporting decisions around the assignment of new patients to providers. Future work should focus on identifying the relationship between workload proxies and actual workload, as well as improving prediction performance of regression and multi-class classification.
Affiliation(s)
| | - H Joseph Blumenthal
- National Center for Human Factors in Healthcare, MedStar Institute for Innovation, United States
| | - Daniel Hoffman
- National Center for Human Factors in Healthcare, MedStar Institute for Innovation, United States
| | - Natalie Benda
- National Center for Human Factors in Healthcare, MedStar Institute for Innovation, United States
| | - Tracy Kim
- National Center for Human Factors in Healthcare, MedStar Institute for Innovation, United States
| | | | - Ella S Franklin
- National Center for Human Factors in Healthcare, MedStar Institute for Innovation, United States
| | | | - A Zachary Hettinger
- National Center for Human Factors in Healthcare, MedStar Institute for Innovation, United States; Georgetown University School of Medicine, United States
| | | |
|
15
|
El-Bouri R, Taylor T, Youssef A, Zhu T, Clifton DA. Machine learning in patient flow: a review. Progress in Biomedical Engineering (Bristol, England) 2021; 3:022002. [PMID: 34738074 PMCID: PMC8559147 DOI: 10.1088/2516-1091/abddc5] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 10/30/2020] [Revised: 01/18/2021] [Accepted: 01/20/2021] [Indexed: 12/13/2022]
Abstract
This work is a review of the ways in which machine learning has been used in order to plan, improve or aid the problem of moving patients through healthcare services. We decompose the patient flow problem into four subcategories: prediction of demand on a healthcare institution, prediction of the demand and resource required to transfer patients from the emergency department to the hospital, prediction of potential resource required for the treatment and movement of inpatients and prediction of length-of-stay and discharge timing. We argue that there are benefits to both approaches of considering the healthcare institution as a whole as well as the patient by patient case and that ideally a combination of these would be best for improving patient flow through hospitals. We also argue that it is essential for there to be a shared dataset that will allow researchers to benchmark their algorithms on and thereby allow future researchers to build on that which has already been done. We conclude that machine learning for the improvement of patient flow is still a young field with very few papers tailor-making machine learning methods for the problem being considered. Future works should consider the need to transfer algorithms trained on a dataset to multiple hospitals and allowing for dynamic algorithms which will allow real-time decision-making to help clinical staff on the shop floor.
Affiliation(s)
- Rasheed El-Bouri
- Institute of Biomedical Engineering, University of Oxford, Oxford, United Kingdom
| | - Thomas Taylor
- Institute of Biomedical Engineering, University of Oxford, Oxford, United Kingdom
| | - Alexey Youssef
- Institute of Biomedical Engineering, University of Oxford, Oxford, United Kingdom
| | - Tingting Zhu
- Institute of Biomedical Engineering, University of Oxford, Oxford, United Kingdom
| | - David A Clifton
- Institute of Biomedical Engineering, University of Oxford, Oxford, United Kingdom
| |
|
16
|
Kolling ML, Furstenau LB, Sott MK, Rabaioli B, Ulmi PH, Bragazzi NL, Tedesco LPC. Data Mining in Healthcare: Applying Strategic Intelligence Techniques to Depict 25 Years of Research Development. International Journal of Environmental Research and Public Health 2021; 18:3099. [PMID: 33802880 PMCID: PMC8002654 DOI: 10.3390/ijerph18063099] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 01/29/2021] [Revised: 03/12/2021] [Accepted: 03/15/2021] [Indexed: 12/15/2022]
Abstract
In order to identify the strategic topics and the thematic evolution structure of data mining applied to healthcare, in this paper, a bibliometric performance and network analysis (BPNA) was conducted. For this purpose, 6138 articles were sourced from the Web of Science covering the period from 1995 to July 2020 and the SciMAT software was used. Our results present a strategic diagram composed of 19 themes, of which the 8 motor themes ('NEURAL-NETWORKS', 'CANCER', 'ELETRONIC-HEALTH-RECORDS', 'DIABETES-MELLITUS', 'ALZHEIMER'S-DISEASE', 'BREAST-CANCER', 'DEPRESSION', and 'RANDOM-FOREST') are depicted in a thematic network. An in-depth analysis was carried out in order to find hidden patterns and to provide a general perspective of the field. The thematic network structure is arranged thusly that its subjects are organized into two different areas, (i) practices and techniques related to data mining in healthcare, and (ii) health concepts and disease supported by data mining, embodying, respectively, the hotspots related to the data mining and medical scopes, hence demonstrating the field's evolution over time. Such results make it possible to form the basis for future research and facilitate decision-making by researchers and practitioners, institutions, and governments interested in data mining in healthcare.
Affiliation(s)
- Maikel Luis Kolling
- Graduate Program of Industrial Systems and Processes, University of Santa Cruz do Sul, Santa Cruz do Sul 96816-501, Brazil
| | - Leonardo B. Furstenau
- Department of Industrial Engineering, Federal University of Rio Grande do Sul, Porto Alegre 90035-190, Brazil
| | - Michele Kremer Sott
- Graduate Program of Industrial Systems and Processes, University of Santa Cruz do Sul, Santa Cruz do Sul 96816-501, Brazil
| | - Bruna Rabaioli
- Department of Medicine, University of Santa Cruz do Sul, Santa Cruz do Sul 96816-501, Brazil
| | - Pedro Henrique Ulmi
- Department of Computer Science, University of Santa Cruz do Sul, Santa Cruz do Sul 96816-501, Brazil
| | - Nicola Luigi Bragazzi
- Laboratory for Industrial and Applied Mathematics (LIAM), Department of Mathematics and Statistics, York University, Toronto, ON M3J 1P3, Canada
| | - Leonel Pablo Carvalho Tedesco
- Graduate Program of Industrial Systems and Processes, University of Santa Cruz do Sul, Santa Cruz do Sul 96816-501, Brazil
- Department of Computer Science, University of Santa Cruz do Sul, Santa Cruz do Sul 96816-501, Brazil
| |
|