1
Hao Q, Chen J, Chen H, Zhang J, Du Y, Cheng X. Comparing nSOFA, CRIB-II, and SNAPPE-II for predicting mortality and short-term morbidities in preterm infants ≤32 weeks gestation. Ann Med 2024; 56:2426752. [PMID: 39520140 PMCID: PMC11552290 DOI: 10.1080/07853890.2024.2426752] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2023] [Revised: 09/21/2024] [Accepted: 10/09/2024] [Indexed: 11/16/2024] Open
Abstract
BACKGROUND Neonatal illness severity scores have not been extensively studied for their ability to predict mortality or morbidity in preterm infants. The aim of this study was to compare the Neonatal Sequential Organ Failure Assessment (nSOFA), Clinical Risk Index for Babies-II (CRIB-II), and Score for Neonatal Acute Physiology with Perinatal extension-II (SNAPPE-II) for predicting mortality and short-term morbidities in preterm infants ≤32 weeks. METHODS In this retrospective study, infants born in 2017-2018 with gestational age (GA) ≤32 weeks were evaluated. nSOFA, CRIB-II, and SNAPPE-II scores were calculated for each patient, and the ability of these scores to predict mortality and morbidities was compared. The morbidities were categorized as moderate/severe bronchopulmonary dysplasia (BPD), necrotizing enterocolitis (NEC) requiring surgery, early-onset sepsis (EOS), late-onset sepsis (LOS), retinopathy of prematurity (ROP) requiring treatment, and severe intraventricular hemorrhage (IVH). The area under the curve (AUC) from receiver operating characteristic (ROC) curve analysis was calculated to assess and compare the predictive accuracy of the scoring systems. RESULTS A total of 759 preterm infants were enrolled, of whom 88 died. The median nSOFA, CRIB-II, and SNAPPE-II scores were 2 (0, 3), 6 (4, 8), and 13 (5, 26), respectively. Compared with infants who survived, all three scores were significantly higher in those who died (p < 0.05). For predicting mortality, the AUCs of nSOFA, SNAPPE-II, and CRIB-II were 0.90, 0.82, and 0.79, respectively. The nSOFA scoring system had a significantly higher AUC than CRIB-II and SNAPPE-II (p < 0.05). However, short-term morbidities were not strongly correlated with any of the three scoring systems. CONCLUSION In infants ≤32 weeks gestation, the nSOFA scoring system is more valuable in predicting mortality than SNAPPE-II and CRIB-II. However, further studies are required to assess the predictive power of neonatal illness severity scores for morbidity.
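For readers unfamiliar with the AUC values quoted in this abstract, the following is a minimal sketch of how an AUC can be computed from a severity score and a binary outcome. It uses the rank-based (Mann-Whitney) interpretation: the probability that a randomly chosen non-survivor has a higher score than a randomly chosen survivor, with ties counted as one half. The function name `auc_from_scores` and the toy data are hypothetical illustrations, not the study's code or data.

```python
def auc_from_scores(scores, labels):
    """Return the AUC: the fraction of (positive, negative) pairs in which
    the positive case has the higher score, counting ties as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical severity scores (NOT study data); label 1 = died, 0 = survived.
toy_scores = [1, 2, 2, 3, 5, 6, 8]
toy_labels = [0, 0, 0, 1, 0, 1, 1]
toy_auc = auc_from_scores(toy_scores, toy_labels)
print(round(toy_auc, 3))  # 11 of 12 pairs are ordered correctly -> 0.917
```

The pairwise loop is quadratic in the number of patients, which is acceptable for a sketch; a production analysis would more likely rely on an established statistics package and a formal test for comparing correlated AUCs.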
MESH Headings
- Humans
- Infant, Newborn
- Retrospective Studies
- Female
- Male
- Infant, Premature
- Gestational Age
- Organ Dysfunction Scores
- Bronchopulmonary Dysplasia/mortality
- Bronchopulmonary Dysplasia/epidemiology
- Infant, Premature, Diseases/mortality
- Infant, Premature, Diseases/diagnosis
- Infant, Premature, Diseases/epidemiology
- Retinopathy of Prematurity/mortality
- Retinopathy of Prematurity/diagnosis
- Retinopathy of Prematurity/epidemiology
- ROC Curve
- Severity of Illness Index
- Risk Assessment/methods
- Infant
- Enterocolitis, Necrotizing/mortality
- Enterocolitis, Necrotizing/epidemiology
- Enterocolitis, Necrotizing/diagnosis
- Infant Mortality
Affiliation(s)
- Qingfei Hao
- Department of Neonatology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Jing Chen
- Department of Neonatology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Haoming Chen
- Department of Neonatology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Jing Zhang
- Department of Neonatology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Yanna Du
- Department of Neonatology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Xiuyong Cheng
- Department of Neonatology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
2
Fung A, Loutet M, Roth DE, Wong E, Gill PJ, Morris SK, Beyene J. Clinical prediction models in children that use repeated measurements with time-varying covariates: a scoping review. Acad Pediatr 2024; 24:728-740. [PMID: 38561061 DOI: 10.1016/j.acap.2024.03.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2023] [Revised: 02/29/2024] [Accepted: 03/27/2024] [Indexed: 04/04/2024]
Abstract
BACKGROUND Emerging evidence suggests that clinical prediction models that use repeated (time-varying) measurements within each patient may have higher predictive accuracy than models that use patient information from a single measurement. OBJECTIVE To determine the breadth of the published literature reporting the development of clinical prediction models in children that use time-varying predictors. DATA SOURCES MEDLINE, EMBASE and Cochrane databases. ELIGIBILITY CRITERIA We included studies reporting the development of a multivariable clinical prediction model in children, with or without validation, to predict a repeatedly measured binary or time-to-event outcome and utilizing at least one repeatedly measured predictor. SYNTHESIS METHODS We categorized included studies by the method used to model time-varying predictors. RESULTS Of 99 clinical prediction model studies that had a repeated measurements data structure, only 27 (27%) used methods that incorporated the repeated measurements as time-varying predictors in a single model. Among these 27 time-varying prediction model studies, we grouped model types into nine categories: time-dependent Cox regression, generalized estimating equations, random effects model, landmark model, joint model, neural network, K-nearest neighbor, support vector machine and tree-based algorithms. Where there was comparison of time-varying models to single measurement models, using time-varying predictors improved predictive accuracy. CONCLUSIONS Various methods have been used to develop time-varying prediction models in children, but there is a paucity of pediatric time-varying models in the literature. Incorporating time-varying covariates in pediatric prediction models may improve predictive accuracy. Future research in pediatric prediction model development should further investigate whether incorporation of time-varying covariates improves predictive accuracy.
Affiliation(s)
- Alastair Fung
- Division of Paediatric Medicine (A Fung, DE Roth, and PJ Gill), Hospital for Sick Children, Toronto, Ontario, Canada; Dalla Lana School of Public Health (A Fung, M Loutet, DE Roth, PJ Gill, SK Morris, and J Beyene), University of Toronto, Toronto, Ontario, Canada; Centre for Global Child Health (A Fung, M Loutet, DE Roth, and SK Morris), Hospital for Sick Children, Toronto, Ontario, Canada.
- Miranda Loutet
- Dalla Lana School of Public Health (A Fung, M Loutet, DE Roth, PJ Gill, SK Morris, and J Beyene), University of Toronto, Toronto, Ontario, Canada; Centre for Global Child Health (A Fung, M Loutet, DE Roth, and SK Morris), Hospital for Sick Children, Toronto, Ontario, Canada
- Daniel E Roth
- Division of Paediatric Medicine (A Fung, DE Roth, and PJ Gill), Hospital for Sick Children, Toronto, Ontario, Canada; Dalla Lana School of Public Health (A Fung, M Loutet, DE Roth, PJ Gill, SK Morris, and J Beyene), University of Toronto, Toronto, Ontario, Canada; Centre for Global Child Health (A Fung, M Loutet, DE Roth, and SK Morris), Hospital for Sick Children, Toronto, Ontario, Canada; Temerty Faculty of Medicine (DE Roth, E Wong, PJ Gill, and SK Morris), University of Toronto, Toronto, Ontario, Canada; Child Health Evaluative Sciences (DE Roth, PJ Gill, and SK Morris), Hospital for Sick Children Research Institute, Toronto, Ontario, Canada
- Elliott Wong
- Temerty Faculty of Medicine (DE Roth, E Wong, PJ Gill, and SK Morris), University of Toronto, Toronto, Ontario, Canada
- Peter J Gill
- Division of Paediatric Medicine (A Fung, DE Roth, and PJ Gill), Hospital for Sick Children, Toronto, Ontario, Canada; Dalla Lana School of Public Health (A Fung, M Loutet, DE Roth, PJ Gill, SK Morris, and J Beyene), University of Toronto, Toronto, Ontario, Canada; Temerty Faculty of Medicine (DE Roth, E Wong, PJ Gill, and SK Morris), University of Toronto, Toronto, Ontario, Canada; Child Health Evaluative Sciences (DE Roth, PJ Gill, and SK Morris), Hospital for Sick Children Research Institute, Toronto, Ontario, Canada
- Shaun K Morris
- Dalla Lana School of Public Health (A Fung, M Loutet, DE Roth, PJ Gill, SK Morris, and J Beyene), University of Toronto, Toronto, Ontario, Canada; Centre for Global Child Health (A Fung, M Loutet, DE Roth, and SK Morris), Hospital for Sick Children, Toronto, Ontario, Canada; Temerty Faculty of Medicine (DE Roth, E Wong, PJ Gill, and SK Morris), University of Toronto, Toronto, Ontario, Canada; Child Health Evaluative Sciences (DE Roth, PJ Gill, and SK Morris), Hospital for Sick Children Research Institute, Toronto, Ontario, Canada; Division of Infectious Diseases (SK Morris), Hospital for Sick Children, Toronto, Ontario, Canada
- Joseph Beyene
- Dalla Lana School of Public Health (A Fung, M Loutet, DE Roth, PJ Gill, SK Morris, and J Beyene), University of Toronto, Toronto, Ontario, Canada; Department of Health Research Methods, Evidence and Impact (J Beyene), Faculty of Health Sciences, McMaster University, Hamilton, Ontario, Canada
3
Srivastava S, Rajan V. ExpertNet: A Deep Learning Approach to Combined Risk Modeling and Subtyping in Intensive Care Units. IEEE J Biomed Health Inform 2023; 27:5076-5086. [PMID: 37819834 DOI: 10.1109/jbhi.2023.3295751] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/13/2023]
Abstract
Risk models play a crucial role in disease prevention, particularly in intensive care units (ICUs). Diseases often have complex manifestations with heterogeneous subpopulations, or subtypes, that exhibit distinct clinical characteristics. Risk models that explicitly model subtypes have high predictive accuracy and facilitate subtype-specific personalization. Such models combine clustering and classification methods but do not effectively utilize the inferred subtypes in risk modeling. Their limitations include tendency to obtain degenerate clusters and cluster-specific data scarcity leading to insufficient training data for the corresponding classifier. In this article, we develop a new deep learning model for simultaneous clustering and classification, ExpertNet, with novel loss terms and network training strategies that address these limitations. The performance of ExpertNet is evaluated on the tasks of predicting risk of (i) sepsis and (ii) acute respiratory distress syndrome (ARDS), using two large electronic medical records datasets from ICUs. Our extensive experiments show that, in comparison to state-of-the-art baselines for combined clustering and classification, ExpertNet achieves superior accuracy in risk prediction for both ARDS and sepsis; and comparable clustering performance. Visual analysis of the clusters further demonstrates that the clusters obtained are clinically meaningful and a knowledge-distilled model shows significant differences in risk factors across the subtypes. By addressing technical challenges in training neural networks for simultaneous clustering and classification, ExpertNet lays the algorithmic foundation for the future development of subtype-aware risk models.
4
Silva Rocha ED, de Morais Melo FL, de Mello MEF, Figueiroa B, Sampaio V, Endo PT. On usage of artificial intelligence for predicting mortality during and post-pregnancy: a systematic review of literature. BMC Med Inform Decis Mak 2022; 22:334. [PMID: 36536413 PMCID: PMC9764498 DOI: 10.1186/s12911-022-02082-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/13/2022] [Accepted: 12/13/2022] [Indexed: 12/23/2022] Open
Abstract
BACKGROUND Care during pregnancy, childbirth and puerperium is fundamental to avoiding pathologies in the mother and her baby. However, health issues can occur during this period and lead to adverse outcomes such as the death of the fetus or neonate. Predictive models of fetal and infant deaths are important technological tools that can help to reduce mortality rates. The main goal of this work is to present a systematic review of literature focused on computational models to predict mortality, covering stillbirth, perinatal, neonatal, and infant deaths, highlighting their methodology and the description of the proposed computational models. METHODS We conducted a systematic review of literature, limiting the search to the last 10 years of publications and using the five main scientific databases as sources. RESULTS From 671 works, 18 were selected as primary studies for further analysis. We found that most works focus on the prediction of neonatal deaths using machine learning models (more specifically Random Forest). The five most common features used to train models are birth weight, gestational age, sex of the child, Apgar score and mother's age. Predictive models for preventing mortality during and post-pregnancy not only improve the mother's quality of life but can also be a powerful and low-cost tool to decrease mortality ratios. CONCLUSION Based on the results of this systematic review of literature (SRL), we can state that scientific efforts have been made in this area, but many research opportunities remain open to the community.
Affiliation(s)
- Elisson da Silva Rocha
- Programa de Pós-Graduação em Engenharia da Computação, Universidade de Pernambuco, Recife, Brazil
- Flavio Leandro de Morais Melo
- Programa de Pós-Graduação em Engenharia da Computação, Universidade de Pernambuco, Recife, Brazil
- Barbara Figueiroa
- Programa Mãe Coruja Pernambucana, Secretaria de Saúde do Estado de Pernambuco, Recife, Brazil
- Patricia Takako Endo
- Programa de Pós-Graduação em Engenharia da Computação, Universidade de Pernambuco, Recife, Brazil
5
McAdams RM, Kaur R, Sun Y, Bindra H, Cho SJ, Singh H. Predicting clinical outcomes using artificial intelligence and machine learning in neonatal intensive care units: a systematic review. J Perinatol 2022; 42:1561-1575. [PMID: 35562414 DOI: 10.1038/s41372-022-01392-8] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 11/26/2021] [Revised: 03/30/2022] [Accepted: 04/01/2022] [Indexed: 01/19/2023]
Abstract
BACKGROUND Advances in technology, data availability, and analytics have helped improve quality of care in the neonatal intensive care unit. OBJECTIVE To provide an in-depth review of artificial intelligence (AI) and machine learning techniques being utilized to predict neonatal outcomes. METHODS The PRISMA protocol was followed, and articles from established digital repositories were considered. Included articles were categorized based on predictions of: (a) major neonatal morbidities such as sepsis, bronchopulmonary dysplasia, intraventricular hemorrhage, necrotizing enterocolitis, and retinopathy of prematurity; (b) mortality; and (c) length of stay. RESULTS A total of 366 studies were considered; 68 studies were eligible for inclusion in the review. The current predictor models are primarily built on supervised learning, and most are regression models built on retrospective data. CONCLUSION With the availability of EMR data and data-sharing of NICU outcomes across neonatal research networks, machine learning algorithms have shown breakthrough performance in predicting neonatal disease.
Affiliation(s)
- Ryan M McAdams
- Department of Pediatrics, University of Wisconsin School of Medicine and Public Health, Madison, WI, USA
- Ravneet Kaur
- Child Health Imprints (CHIL) USA Inc, Madison, WI, USA
- Yao Sun
- Division of Neonatology, University of California San Francisco, San Francisco, CA, USA
- Su Jin Cho
- College of Medicine, Ewha Womans University Seoul, Seoul, Korea
6
Deng Y, Liu S, Wang Z, Wang Y, Jiang Y, Liu B. Explainable time-series deep learning models for the prediction of mortality, prolonged length of stay and 30-day readmission in intensive care patients. Front Med (Lausanne) 2022; 9:933037. [PMID: 36250092 PMCID: PMC9554013 DOI: 10.3389/fmed.2022.933037] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 04/30/2022] [Accepted: 09/01/2022] [Indexed: 11/14/2022] Open
Abstract
Background In-hospital mortality, prolonged length of stay (LOS), and 30-day readmission are common outcomes in the intensive care unit (ICU). Traditional scoring systems and machine learning models for predicting these outcomes usually ignore the time-series nature of ICU data. We aimed to use time-series deep learning models with the selective combination of three widely used scoring systems to predict these outcomes. Materials and methods A retrospective cohort study was conducted on 40,083 ICU patients from the Medical Information Mart for Intensive Care-IV (MIMIC-IV) database. Three deep learning models, namely, recurrent neural network (RNN), gated recurrent unit (GRU), and long short-term memory (LSTM) with attention mechanisms, were trained for the prediction of in-hospital mortality, prolonged LOS, and 30-day readmission with variables collected during the initial 24 h after ICU admission or the last 24 h before discharge. The inclusion of variables was based on three widely used scoring systems, namely, APACHE II, SOFA, and SAPS II, and the predictors consisted of time-series vital signs, laboratory tests, medication, and procedures. The patients were randomly divided into a training set (80%) and a test set (20%), which were used for model development and model evaluation, respectively. The area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and Brier scores were used to evaluate model performance. Variable significance was identified through attention mechanisms. Results A total of 33 variables were included, with 40,083 patients enrolled for mortality and prolonged LOS prediction and 36,180 for readmission prediction. The rates of occurrence of the three outcomes were 9.74%, 27.54%, and 11.79%, respectively. For each of the three outcomes, the performance of RNN, GRU, and LSTM did not differ greatly. Mortality prediction models, prolonged LOS prediction models, and readmission prediction models achieved AUCs of 0.870 ± 0.001, 0.765 ± 0.003, and 0.635 ± 0.018, respectively. The top significant variables co-selected by the three deep learning models were Glasgow Coma Scale (GCS), age, blood urea nitrogen, and norepinephrine for mortality; GCS, invasive ventilation, and blood urea nitrogen for prolonged LOS; and blood urea nitrogen, GCS, and ethnicity for readmission. Conclusion The prognostic prediction models established in our study achieved good performance in predicting common outcomes of patients in ICU, especially in mortality prediction. In addition, GCS and blood urea nitrogen were identified as the most important factors strongly associated with adverse ICU events.
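The abstract above reports sensitivity, specificity, and Brier scores alongside AUC. As a hedged illustration of what those metrics measure, the sketch below computes them from predicted probabilities and observed binary outcomes. The function names, the 0.5 decision threshold, and the toy data are assumptions made for this example; they are not taken from the study.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between predicted probability and outcome
    (0 is perfect; lower is better)."""
    if not probs or len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must be non-empty and equal length")
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def sensitivity_specificity(probs, outcomes, threshold=0.5):
    """Classify as positive when prob >= threshold (an assumed cut-off),
    then return (sensitivity, specificity)."""
    tp = fn = tn = fp = 0
    for p, y in zip(probs, outcomes):
        predicted = 1 if p >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative outcome")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical predictions (NOT study data); outcome 1 = event occurred.
toy_probs = [0.9, 0.8, 0.3, 0.2, 0.1]
toy_outcomes = [1, 0, 1, 0, 0]
toy_brier = brier_score(toy_probs, toy_outcomes)
toy_sens, toy_spec = sensitivity_specificity(toy_probs, toy_outcomes)
print(round(toy_brier, 3), toy_sens, round(toy_spec, 3))  # 0.238 0.5 0.667
```

Unlike AUC, sensitivity and specificity depend on the chosen threshold, and the Brier score additionally reflects calibration, which is one reason studies often report all of them together.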
Affiliation(s)
- Yuhan Deng
- School of Public Health, Peking University, Beijing, China
- Shuang Liu
- School of Public Health, Peking University, Beijing, China
- Ziyao Wang
- School of Public Health, Peking University, Beijing, China
- Yuxin Wang
- School of Public Health, Peking University, Beijing, China
- Yong Jiang
- Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, Beijing, China
- China National Clinical Research Center for Neurological Diseases, Beijing, China
- Baohua Liu
- School of Public Health, Peking University, Beijing, China
- *Correspondence: Baohua Liu
7
Machine Learning Models for Predicting Mortality in 7472 Very Low Birth Weight Infants Using Data from a Nationwide Neonatal Network. Diagnostics (Basel) 2022; 12:625. [PMID: 35328178 PMCID: PMC8947011 DOI: 10.3390/diagnostics12030625] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 02/03/2022] [Revised: 02/26/2022] [Accepted: 03/01/2022] [Indexed: 11/30/2022] Open
Abstract
Statistical and analytical methods using artificial intelligence approaches such as machine learning (ML) are increasingly being applied to the field of pediatrics, particularly to neonatology. This study compared representative ML methods with logistic regression (LR), a traditional statistical analysis method, for predicting mortality in very low birth weight infants (VLBWI). We included data on 7472 VLBWI from a nationwide Korean neonatal network. Eleven predictor variables (neonatal factors: male sex, gestational age, 5 min Apgar scores, body temperature, and resuscitation at birth; maternal factors: diabetes mellitus, hypertension, chorioamnionitis, premature rupture of membranes, antenatal steroid, and cesarean delivery) were selected based on clinical impact and statistical analysis. We compared the predicted mortality between ML methods (artificial neural network [ANN], random forest [RF], and support vector machine [SVM]) and LR, using a randomly selected training set (80%) and a test set (20%). The areas under the receiver operating characteristic curve (95% confidence interval) were 0.841 (0.811-0.872) for LR, 0.845 (0.815-0.875) for ANN, and 0.826 (0.795-0.858) for RF. The exception was SVM, at 0.631 (0.578-0.683). No statistically significant differences were observed between the performance of LR, ANN, and RF (i.e., p > 0.05). However, the SVM model performed significantly worse (p < 0.01). We suggest that VLBWI mortality prediction using ML methods would yield the same prediction rate as the traditional statistical LR method and may be suitable for predicting mortality. However, low prediction rates are observed with certain ML methods; hence, further research is needed on these limitations and on selecting an appropriate method.
8
Lavilla OC, Aziz KB, Lure AC, Gipson D, de la Cruz D, Wynn JL. Hourly Kinetics of Critical Organ Dysfunction in Extremely Preterm Infants. Am J Respir Crit Care Med 2022; 205:75-87. [PMID: 34550843 PMCID: PMC8865589 DOI: 10.1164/rccm.202106-1359oc] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Indexed: 01/03/2023] Open
Abstract
Rationale: Use of severity of illness scores to classify patients for clinical care and research is common outside of the neonatal ICU. Extremely premature (<29 weeks' gestation) infants with extremely low birth weight (<1,000 g) experience significant mortality and develop severe pathology during the protracted birth hospitalization. Objectives: To measure at high resolution the changes in organ dysfunction that occur from birth to death or discharge home by gestational age and time, and among extremely preterm infants with and without clinically meaningful outcomes using the neonatal sequential organ failure assessment score. Methods: A single-center, retrospective, observational cohort study of inborn, extremely preterm infants with extremely low birth weight admitted between January 2012 and January 2020. Neonatal sequential organ failure assessment scores were calculated every hour for every patient from admission until death or discharge. Measurements and Main Results: Longitudinal, granular scores from 436 infants demonstrated early and sustained discrimination of those who died versus those who survived to discharge. The discrimination for mortality by the maximum score was excellent (area under curve, 0.91; 95% confidence intervals, 0.88-0.94). Among survivors with and without adverse outcomes, most score variation occurred at the patient level. The weekly average score over the first 28 days was associated with the sum of adverse outcomes at discharge. Conclusions: The neonatal sequential organ failure assessment score discriminates between survival and nonsurvival on the first day of life. The major contributor to score variation occurred at the patient level. There was a direct association between scores and major adverse outcomes, including death.
Affiliation(s)
- Khyzer B. Aziz
- Department of Pediatrics, Johns Hopkins University, Baltimore, Maryland
- James L. Wynn
- Department of Pediatrics and Department of Pathology, Immunology, and Laboratory Medicine, University of Florida, Gainesville, Florida
9
Foroushani HM, Hamzehloo A, Kumar A, Chen Y, Heitsch L, Slowik A, Strbian D, Lee JM, Marcus DS, Dhar R. Accelerating Prediction of Malignant Cerebral Edema After Ischemic Stroke with Automated Image Analysis and Explainable Neural Networks. Neurocrit Care 2021; 36:471-482. [PMID: 34417703 DOI: 10.1007/s12028-021-01325-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 03/30/2021] [Accepted: 08/02/2021] [Indexed: 11/25/2022]
Abstract
BACKGROUND Malignant cerebral edema is a devastating complication of stroke, resulting in deterioration and death if hemicraniectomy is not performed prior to herniation. Current approaches for predicting this relatively rare complication often require advanced imaging and still suffer from suboptimal performance. We performed a pilot study to evaluate whether neural networks incorporating data extracted from routine computed tomography (CT) imaging could enhance prediction of edema in a large diverse stroke cohort. METHODS An automated imaging pipeline retrospectively extracted volumetric data, including cerebrospinal fluid (CSF) volumes and the hemispheric CSF volume ratio, from baseline and 24 h CT scans performed in participants of an international stroke cohort study. Fully connected and long short-term memory (LSTM) neural networks were trained using serial clinical and imaging data to predict those who would require hemicraniectomy or die with midline shift. The performance of these models was tested, in comparison with regression models and the Enhanced Detection of Edema in Malignant Anterior Circulation Stroke (EDEMA) score, using cross-validation to construct precision-recall curves. RESULTS Twenty of 598 patients developed malignant edema (12 required surgery, 8 died). The regression model provided 95% recall but only 32% precision (area under the precision-recall curve [AUPRC] 0.74), similar to the EDEMA score (precision 28%, AUPRC 0.66). The fully connected network did not perform better (precision 33%, AUPRC 0.71), but the LSTM model provided 100% recall and 87% precision (AUPRC 0.97) in the overall cohort and the subgroup with a National Institutes of Health Stroke Scale (NIHSS) score ≥ 8 (p = 0.0001 vs. regression and fully connected models). Features providing the most predictive importance were the hemispheric CSF ratio and NIHSS score measured at 24 h. CONCLUSIONS An LSTM neural network incorporating volumetric data extracted from routine CT scans identified all cases of malignant cerebral edema by 24 h after stroke, with significantly fewer false positives than a fully connected neural network, regression model, and the validated EDEMA score. This preliminary work requires prospective validation but provides proof of principle that a deep learning framework could assist in selecting patients for surgery prior to deterioration.
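To make the precision and recall figures in this abstract more concrete, the sketch below back-calculates the approximate patient counts they imply for an outcome that occurred in 20 patients. This is an illustrative approximation only: the reported percentages are rounded and were derived under cross-validation, so the exact confusion-matrix counts in the paper may differ. The helper name `implied_counts` is hypothetical.

```python
def implied_counts(n_events, recall, precision):
    """Approximate counts implied by recall = TP / n_events and
    precision = TP / (TP + FP), rounded to whole patients."""
    true_pos = round(recall * n_events)
    flagged = round(true_pos / precision)  # TP + FP
    return {
        "true_positives": true_pos,
        "flagged": flagged,
        "false_positives": flagged - true_pos,
    }


# Percentages as quoted in the abstract; 20 patients developed malignant edema.
regression = implied_counts(20, recall=0.95, precision=0.32)
lstm = implied_counts(20, recall=1.00, precision=0.87)
print(regression)  # about 19 true positives among about 59 flagged (about 40 false positives)
print(lstm)        # about 20 true positives among about 23 flagged (about 3 false positives)
```

Under these assumptions, the difference between 32% and 87% precision corresponds to roughly 40 versus 3 patients flagged unnecessarily, which is why precision matters so much for a rare outcome.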
Affiliation(s)
- Hossein Mohammadian Foroushani
- Department of Electrical and Systems Engineering, Washington University in St. Louis McKelvey School of Engineering, 1 Brookings Drive, St. Louis, MO, 63130-4899, USA
- Ali Hamzehloo
- Department of Neurology, Washington University in St. Louis School of Medicine, 660 S Euclid Avenue, Campus, Box 8111, St. Louis, MO, 63110, USA
- Atul Kumar
- Department of Neurology, Washington University in St. Louis School of Medicine, 660 S Euclid Avenue, Campus, Box 8111, St. Louis, MO, 63110, USA
- Yasheng Chen
- Department of Neurology, Washington University in St. Louis School of Medicine, 660 S Euclid Avenue, Campus, Box 8111, St. Louis, MO, 63110, USA
- Laura Heitsch
- Department of Emergency Medicine, Washington University in St. Louis School of Medicine, 660 S. Euclid Ave, Campus, Box 8072, St. Louis, MO, 63110, USA
- Agnieszka Slowik
- Department of Neurology, Jagiellonian University Medical College, Kraków, Poland
- Daniel Strbian
- Department of Neurology, Helsinki University Hospital, Helsinki, Finland
- Jin-Moo Lee
- Department of Neurology, Washington University in St. Louis School of Medicine, 660 S Euclid Avenue, Campus, Box 8111, St. Louis, MO, 63110, USA
- Daniel S Marcus
- Department of Radiology, Washington University in St. Louis School of Medicine, 525 Scott Ave, Campus, Box 8225, St. Louis, MO, 63110, USA
- Rajat Dhar
- Department of Neurology, Washington University in St. Louis School of Medicine, 660 S Euclid Avenue, Campus, Box 8111, St. Louis, MO, 63110, USA