1
Hama T, Alsaleh MM, Allery F, Choi JW, Tomlinson C, Wu H, Lai A, Pontikos N, Thygesen JH. Enhancing Patient Outcome Prediction Through Deep Learning With Sequential Diagnosis Codes From Structured Electronic Health Record Data: Systematic Review. J Med Internet Res 2025; 27:e57358. [PMID: 40100249 PMCID: PMC11962322 DOI: 10.2196/57358] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2024] [Revised: 12/14/2024] [Accepted: 02/18/2025] [Indexed: 03/20/2025]
Abstract
BACKGROUND: The use of structured electronic health records in health care systems has grown rapidly. These systems collect huge amounts of patient information, including diagnosis codes representing temporal medical history. Sequential diagnostic information has proven valuable for predicting patient outcomes. However, the extent to which these types of data have been incorporated into deep learning (DL) models has not been examined.
OBJECTIVE: This systematic review aims to describe the use of sequential diagnostic data in DL models, specifically to understand how these data are integrated, whether sample size improves performance, and whether the identified models are generalizable.
METHODS: Relevant studies published up to May 15, 2023, were identified using 4 databases: PubMed, Embase, IEEE Xplore, and Web of Science. We included all studies using DL algorithms trained on sequential diagnosis codes to predict patient outcomes. We excluded review articles and non-peer-reviewed papers. We evaluated the following aspects in the included papers: DL techniques, characteristics of the dataset, prediction tasks, performance evaluation, generalizability, and explainability. We also assessed the risk of bias and applicability of the studies using the Prediction Model Study Risk of Bias Assessment Tool (PROBAST). We used the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) checklist to report our findings.
RESULTS: Of the 740 identified papers, 84 (11.4%) met the eligibility criteria. Publications in this area increased yearly. Recurrent neural networks (and their derivatives; 47/84, 56%) and transformers (22/84, 26%) were the most commonly used architectures in DL-based models. Most studies (45/84, 54%) presented their input features as sequences of visit embeddings. Medications (38/84, 45%) were the most common additional feature. Of the 128 predictive outcome tasks, the most frequent was next-visit diagnosis (n=30, 23%), followed by heart failure (n=18, 14%) and mortality (n=17, 13%). Only 7 (8%) of the 84 studies evaluated their models in terms of generalizability. A positive correlation was observed between training sample size and model performance (area under the receiver operating characteristic curve; P=.02). However, 59 (70%) of the 84 studies had a high risk of bias.
CONCLUSIONS: The application of DL for advanced modeling of sequential medical codes has demonstrated remarkable promise in predicting patient outcomes. The main limitation of this study was the heterogeneity of methods and outcomes. However, our analysis found that using multiple types of features, integrating time intervals, and including larger sample sizes were generally related to an improved predictive performance. This review also highlights that very few studies (7/84, 8%) reported on challenges related to generalizability and less than half (38/84, 45%) of the studies reported on challenges related to explainability. Addressing these shortcomings will be instrumental in unlocking the full potential of DL for enhancing health care outcomes and patient care.
TRIAL REGISTRATION: PROSPERO CRD42018112161; https://tinyurl.com/yc6h9rwu.
Affiliation(s)
- Tuankasfee Hama
  - Institute of Health Informatics, University College London, London, United Kingdom
- Mohanad M Alsaleh
  - Institute of Health Informatics, University College London, London, United Kingdom
  - Department of Health Informatics, College of Applied Medical Sciences, Qassim University, Buraydah, Saudi Arabia
- Freya Allery
  - Institute of Health Informatics, University College London, London, United Kingdom
- Jung Won Choi
  - Institute of Health Informatics, University College London, London, United Kingdom
- Honghan Wu
  - Institute of Health Informatics, University College London, London, United Kingdom
- Alvina Lai
  - Institute of Health Informatics, University College London, London, United Kingdom
- Nikolas Pontikos
  - UCL Institute of Ophthalmology, University College London, London, United Kingdom
- Johan H Thygesen
  - Institute of Health Informatics, University College London, London, United Kingdom
2
Pham MK, Mai TT, Crane M, Ebiele M, Brennan R, Ward ME, Geary U, McDonald N, Bezbradica M. Forecasting Patient Early Readmission from Irish Hospital Discharge Records Using Conventional Machine Learning Models. Diagnostics (Basel) 2024; 14:2405. [PMID: 39518372 PMCID: PMC11545812 DOI: 10.3390/diagnostics14212405] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2024] [Revised: 09/27/2024] [Accepted: 10/23/2024] [Indexed: 11/16/2024]
Abstract
BACKGROUND/OBJECTIVES: Predicting patient readmission is an important task for healthcare risk management, as it can help prevent adverse events, reduce costs, and improve patient outcomes. In this paper, we compare various conventional machine learning models and deep learning models on a multimodal dataset of electronic discharge records from an Irish acute hospital.
METHODS: We evaluate the effectiveness of several widely used machine learning models that leverage patient demographics, historical hospitalization records, and clinical diagnosis codes to forecast future clinical risks. Our work focuses on addressing two key challenges in the medical field, data imbalance and the variety of data types, in order to boost the performance of machine learning algorithms. Furthermore, we employ SHapley Additive Explanations (SHAP) value visualization to interpret the model predictions and identify both the key data features and disease codes associated with readmission risks, identifying a specific set of diagnosis codes that are significant predictors of readmission within 30 days.
RESULTS: Through extensive benchmarking and the application of a variety of feature engineering techniques, we improved the area under the receiver operating characteristic curve (AUROC) from 0.628 to 0.7 across our models on the test dataset. We also revealed that specific diagnoses, including cancer, COPD, and certain social factors, are significant predictors of 30-day readmission risk. Conversely, bacterial carrier status appeared to have minimal impact due to lower case frequencies.
CONCLUSIONS: Our study demonstrates how routinely collected hospital data can be used to forecast patient readmission with conventional machine learning, while applying explainable AI techniques to explore the correlation between data features and patient readmission rate.
Affiliation(s)
- Minh-Khoi Pham
  - ADAPT Centre, D02 PN40 Dublin, Ireland
  - School of Computing, Dublin City University, D09 Y074 Dublin, Ireland
- Tai Tan Mai
  - ADAPT Centre, D02 PN40 Dublin, Ireland
  - School of Computing, Dublin City University, D09 Y074 Dublin, Ireland
- Martin Crane
  - ADAPT Centre, D02 PN40 Dublin, Ireland
  - School of Computing, Dublin City University, D09 Y074 Dublin, Ireland
- Malick Ebiele
  - ADAPT Centre, D02 PN40 Dublin, Ireland
  - School of Computer Science, University College Dublin, D04 V1W8 Dublin, Ireland
- Rob Brennan
  - ADAPT Centre, D02 PN40 Dublin, Ireland
  - School of Computer Science, University College Dublin, D04 V1W8 Dublin, Ireland
- Marie E. Ward
  - St James’s Hospital, D08 NHY1 Dublin, Ireland
- Una Geary
  - St James’s Hospital, D08 NHY1 Dublin, Ireland
- Nick McDonald
  - School of Psychology, Trinity College Dublin, D02 F6N2 Dublin, Ireland
- Marija Bezbradica
  - ADAPT Centre, D02 PN40 Dublin, Ireland
  - School of Computing, Dublin City University, D09 Y074 Dublin, Ireland
3
Penso M, Solbiati S, Moccia S, Caiani EG. Decision Support Systems in HF based on Deep Learning Technologies. Curr Heart Fail Rep 2022; 19:38-51. [PMID: 35142985 PMCID: PMC9023383 DOI: 10.1007/s11897-022-00540-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/20/2022] [Indexed: 11/26/2022]
Abstract
Purpose of Review: The application of deep learning (DL) has grown in recent years, especially in the healthcare domain. This review presents the current state of DL techniques applied to structured electronic health record data, physiological signals, and imaging modalities for the management of heart failure (HF), focusing in particular on diagnosis, prognosis, and re-hospitalization risk, to explore the level of maturity of DL in this field.
Recent Findings: DL allows better integration of different data sources to distill more accurate outcomes in HF patients, resulting in better performance than conventional evaluation methods. While applications in image and signal processing for HF diagnosis have reached very high performance, the application of DL to electronic health records and their multisource data for prediction could still be improved, despite already promising results.
Summary: In the current big data era, DL can improve performance compared to conventional techniques and machine learning approaches. DL algorithms have the potential to provide more efficient care and improve outcomes of HF patients, although further investigations are needed to overcome current limitations, including the generalizability of results and the transparency and explainability of the evidence supporting the process.
Affiliation(s)
- Marco Penso
  - Department of Electronics, Information and Biomedical Engineering, Politecnico Di Milano, P.zza L. da Vinci 32, 20133, Milan, Italy
  - Centro Cardiologico Monzino IRCCS, Milan, Italy
- Sarah Solbiati
  - Department of Electronics, Information and Biomedical Engineering, Politecnico Di Milano, P.zza L. da Vinci 32, 20133, Milan, Italy
  - Institute of Electronics, Information Engineering and Telecommunications (IEIIT), Italian National Research Council (CNR), Milan, Italy
- Sara Moccia
  - The BioRobotics Institute, Department of Excellence in Robotics and AI, Scuola Superiore Sant'Anna, Pisa, Italy
- Enrico G Caiani
  - Department of Electronics, Information and Biomedical Engineering, Politecnico Di Milano, P.zza L. da Vinci 32, 20133, Milan, Italy
  - Institute of Electronics, Information Engineering and Telecommunications (IEIIT), Italian National Research Council (CNR), Milan, Italy
4
Mensa E, Colla D, Dalmasso M, Giustini M, Mamo C, Pitidis A, Radicioni DP. Violence detection explanation via semantic roles embeddings. BMC Med Inform Decis Mak 2020; 20:263. [PMID: 33059690 PMCID: PMC7559980 DOI: 10.1186/s12911-020-01237-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 01/11/2020] [Accepted: 09/02/2020] [Indexed: 11/22/2022]
Abstract
Background: Emergency room reports pose specific challenges to natural language processing techniques. In this setting, episodes of violence against women, the elderly, and children are often under-reported. Categorizing textual descriptions as containing violence-related injuries (V) vs. non-violence-related injuries (NV) is thus a relevant task for devising alerting mechanisms to track (and prevent) violence episodes.
Methods: We present ViDeS (so dubbed after Violence Detection System), a system to detect episodes of violence from narrative texts in emergency room reports. It employs a deep neural network to categorize textual ER report data, and complements this output by making explicit which elements corroborate the interpretation of the record as reporting violence-related injuries. To this end we designed a novel hybrid technique for filling semantic frames that employs distributed representations of terms herein, along with syntactic and semantic information. The system has been validated on real data annotated with two sorts of information: the presence vs. absence of violence-related injuries, and some semantic roles that can be interpreted as major cues for violent episodes, such as the agent that committed violence, the victim, the body district involved, etc. The employed dataset contains over 150K records annotated with class (V, NV) information, and 200 records with finer-grained information on the aforementioned semantic roles.
Results: We used data from an Italian branch of the EU-Injury Database (EU-IDB) project, compiled by hospital staff. Categorization figures approach full precision and recall for negative cases, with 0.97 precision and 0.94 recall on positive cases. As regards the recognition of semantic roles, we recorded an accuracy varying from 0.28 to 0.90 according to the semantic roles involved. Moreover, the system allowed unveiling annotation errors committed by hospital staff.
Conclusions: Explaining systems’ results, so as to make their output more comprehensible and convincing, is today necessary for AI systems. Our proposal is to combine distributed and symbolic (frame-like) representations as a possible answer to this pressing request for interpretability. Although presently focused on the medical domain, the proposed methodology is general and, in principle, can be extended to further application areas and categorization tasks.
Affiliation(s)
- Enrico Mensa
  - Department of Computer Science, University of Turin, Corso Svizzera 185, Turin, 10149, Italy
- Davide Colla
  - Department of Computer Science, University of Turin, Corso Svizzera 185, Turin, 10149, Italy
- Marco Dalmasso
  - Servizio sovrazonale di Epidemiologia dell'ASL TO3 della Regione Piemonte, Via Sabaudia 164, Grugliasco (TO), 10095, Italy
- Marco Giustini
  - Reparto Epidemiologia ambientale e sociale Dipartimento Ambiente e Salute (DAMSA) Istituto Superiore di Sanità, Viale Regina Elena, 299, Roma, 00161, Italy
- Carlo Mamo
  - Servizio sovrazonale di Epidemiologia dell'ASL TO3 della Regione Piemonte, Via Sabaudia 164, Grugliasco (TO), 10095, Italy
- Alessio Pitidis
  - Reparto Epidemiologia ambientale e sociale Dipartimento Ambiente e Salute (DAMSA) Istituto Superiore di Sanità, Viale Regina Elena, 299, Roma, 00161, Italy
  - Data Analysis Services, B2C Innovation Inc. - Digital Services, Corso Magenta 69/A, Milan, PO Box 20123, Italy
- Daniele P Radicioni
  - Department of Computer Science, University of Turin, Corso Svizzera 185, Turin, 10149, Italy