1. Sunthankar SD, Zhao J, Wei WQ, Hill GD, Parra DA, Kohl K, McCoy A, Jayaram NM, Godown J. Machine Learning to Predict Interstage Mortality Following Single Ventricle Palliation: A NPC-QIC Database Analysis. Pediatr Cardiol 2023; 44:1242-1250. [PMID: 36820914 PMCID: PMC10627450 DOI: 10.1007/s00246-023-03130-z] [Received: 01/19/2023] [Accepted: 02/10/2023] [Indexed: 02/24/2023]
Abstract
There is a high risk of mortality between stage I and stage II palliation of single ventricle heart disease. This study aimed to leverage advanced machine learning algorithms to optimize risk-prediction models and identify features most predictive of interstage mortality. This study utilized retrospective data from the National Pediatric Cardiology Quality Improvement Collaborative and included all patients who underwent stage I palliation and survived to hospital discharge (2008-2019). Multiple machine learning models were evaluated, including logistic regression, random forest, gradient boosting trees, extreme gradient boosting trees, and light gradient boosting machines. A total of 3267 patients were included with 208 (6.4%) interstage deaths. Machine learning models were trained on 180 clinical features. Digoxin use at discharge was the most influential factor resulting in a lower risk of interstage mortality (p < 0.0001). Stage I surgery with Blalock-Taussig-Thomas shunt portended higher risk than Sano conduit (7.8% vs 4.4%, p = 0.0002). Non-modifiable risk factors identified with increased risk of interstage mortality included female sex, lower gestational age, and lower birth weight. Post-operative risk factors included the requirement of unplanned catheterization and more severe atrioventricular valve insufficiency at discharge. Light gradient boosting machines demonstrated the best performance with an area under the receiver operating characteristic curve of 0.642. Advanced machine learning algorithms highlight a number of modifiable and non-modifiable risk factors for interstage mortality following stage I palliation. However, model performance remains modest, suggesting the presence of unmeasured confounders that contribute to interstage risk.
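As a reading aid for the performance figure in this abstract, the area under the ROC curve can be read as the probability that a randomly chosen patient who died is scored above a randomly chosen survivor, so 0.642 means roughly 64% of such pairs are ranked correctly. The sketch below computes that pairwise quantity directly. The function name `auroc` and the toy scores are hypothetical illustrations, not the authors' code or registry data; only the counts 208 and 3267 come from the abstract.

```python
def auroc(scores, labels):
    """Rank-based AUROC: fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Hypothetical toy data, not drawn from the NPC-QIC registry.
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
toy_auc = auroc(toy_scores, toy_labels)

# Event rate reported in the abstract: 208 deaths among 3267 patients.
event_rate_pct = round(100 * 208 / 3267, 1)
```

With an event rate near 6%, a model can reach high accuracy by predicting survival for everyone, which is one reason a ranking metric such as AUROC is the more informative summary here.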
Affiliation(s)
- Sudeep D Sunthankar
  - Division of Pediatric Cardiology, Department of Pediatrics, Vanderbilt University Medical Center, Nashville, TN, 37232, USA
  - Thomas P. Graham Jr Division of Pediatric Cardiology, Department of Pediatrics, Monroe Carell Jr Children's Hospital at Vanderbilt, 2220 Children's Way, Suite 5230, Nashville, TN, 37232, USA
- Juan Zhao
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
- Wei-Qi Wei
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
- Garick D Hill
  - Division of Pediatric Cardiology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- David A Parra
  - Division of Pediatric Cardiology, Department of Pediatrics, Vanderbilt University Medical Center, Nashville, TN, 37232, USA
- Karen Kohl
  - Division of Pediatric Cardiology, Department of Pediatrics, Vanderbilt University Medical Center, Nashville, TN, 37232, USA
- Allison McCoy
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
- Natalie M Jayaram
  - Division of Pediatric Cardiology, Children's Mercy Hospital, Kansas City, MO, USA
- Justin Godown
  - Division of Pediatric Cardiology, Department of Pediatrics, Vanderbilt University Medical Center, Nashville, TN, 37232, USA
2. Nayebi A, Tipirneni S, Reddy CK, Foreman B, Subbian V. WindowSHAP: An efficient framework for explaining time-series classifiers based on Shapley values. J Biomed Inform 2023; 144:104438. [PMID: 37414368 PMCID: PMC10552726 DOI: 10.1016/j.jbi.2023.104438] [Received: 11/10/2022] [Revised: 06/29/2023] [Accepted: 07/03/2023] [Indexed: 07/08/2023]
Abstract
Unpacking and comprehending how black-box machine learning algorithms (such as deep learning models) make decisions has been a persistent challenge for researchers and end-users. Explaining time-series predictive models is useful in high-stakes clinical applications for understanding the behavior of prediction models, e.g., for determining how different variables and time points influence the clinical outcome. However, existing approaches to explain such models are frequently unique to architectures and data where the features do not have a time-varying component. In this paper, we introduce WindowSHAP, a model-agnostic framework for explaining time-series classifiers using Shapley values. We intend for WindowSHAP to mitigate the computational complexity of calculating Shapley values for long time-series data as well as improve the quality of explanations. WindowSHAP is based on partitioning a sequence into time windows. Under this framework, we present three distinct algorithms (Stationary, Sliding, and Dynamic WindowSHAP), each evaluated against baseline approaches, KernelSHAP and TimeSHAP, using perturbation and sequence analysis metrics. We applied our framework to clinical time-series data from both a specialized clinical domain (Traumatic Brain Injury - TBI) and a broad clinical domain (critical care medicine). The experimental results demonstrate that, based on the two quantitative metrics, our framework is superior at explaining clinical time-series classifiers, while also reducing the complexity of computations. We show that for time-series data with 120 time steps (hours), merging 10 adjacent time points can reduce the CPU time of WindowSHAP by 80% compared to KernelSHAP. We also show that our Dynamic WindowSHAP algorithm focuses more on the most important time steps and provides more understandable explanations. As a result, WindowSHAP not only accelerates the calculation of Shapley values for time-series data, but also delivers more understandable explanations with higher quality.
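The idea of partitioning a sequence into time windows can be illustrated with a minimal sketch for a single-variable series and fixed-length (stationary) windows. It treats each window as one "player" and computes exact Shapley values by enumerating coalitions of windows. This is a deliberate simplification: the abstract describes KernelSHAP as a baseline approach, whereas exact enumeration is swapped in here for clarity and is only feasible for a handful of windows. The names `stationary_windows`, `window_shapley`, and `toy_model`, and the all-zero baseline, are assumptions made for illustration and are not the authors' implementation.

```python
from itertools import combinations
from math import factorial


def stationary_windows(n_steps, window_len):
    """Split range(n_steps) into consecutive windows of at most window_len steps."""
    return [list(range(start, min(start + window_len, n_steps)))
            for start in range(0, n_steps, window_len)]


def window_shapley(model, x, baseline, window_len):
    """Exact Shapley value of each time window for model(x) versus model(baseline)."""
    windows = stationary_windows(len(x), window_len)
    m = len(windows)

    def value(coalition):
        # Steps inside coalition windows keep their observed value;
        # every other step is replaced by the baseline value.
        z = list(baseline)
        for w in coalition:
            for t in windows[w]:
                z[t] = x[t]
        return model(z)

    phi = [0.0] * m
    for i in range(m):
        others = [j for j in range(m) if j != i]
        for k in range(m):
            # Standard Shapley weight for coalitions of size k among m players.
            weight = factorial(k) * factorial(m - k - 1) / factorial(m)
            for subset in combinations(others, k):
                phi[i] += weight * (value(subset + (i,)) - value(subset))
    return windows, phi


def toy_model(z):
    # Hypothetical additive scorer standing in for a trained classifier.
    return 0.5 * z[0] + z[3] + z[4] + z[5]


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
baseline = [0.0] * len(x)
windows, phi = window_shapley(toy_model, x, baseline, window_len=2)
```

For the 120-step example in the abstract, `stationary_windows(120, 10)` yields 12 windows, so the explainer attributes over 12 players instead of 120 individual time points, which is where the reduction in computation comes from.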
Affiliation(s)
- Amin Nayebi
  - Department of Systems and Industrial Engineering, University of Arizona, AZ, USA
- Vignesh Subbian
  - Department of Systems and Industrial Engineering, University of Arizona, AZ, USA; Department of Biomedical Engineering, University of Arizona, AZ, USA