1
Zhao N, Yu JY, Bui T, Dzieciolowski K. Correcting Biases of Shapley Value Attributions for Informative Machine Learning Model Explanations. PROCEEDINGS OF THE 33RD ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT 2024:3331-3340. [DOI: 10.1145/3627673.3679846] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/03/2025]
Affiliation(s)
- Jia Yuan Yu
  - Concordia University, Montreal, Quebec, Canada
- Trang Bui
  - University of Waterloo, Waterloo, Ontario, Canada
2
De Ridder D, Ladoy A, Choi Y, Jacot D, Vuilleumier S, Guessous I, Joost S, Greub G. Environmental and geographical factors influencing the spread of SARS-CoV-2 over 2 years: a fine-scale spatiotemporal analysis. Front Public Health 2024; 12:1298177. [PMID: 38957202 PMCID: PMC11217542 DOI: 10.3389/fpubh.2024.1298177] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2023] [Accepted: 06/03/2024] [Indexed: 07/04/2024] Open
Abstract
Introduction: Since its emergence in late 2019, the SARS-CoV-2 virus has led to a global health crisis, affecting millions and reshaping societies and economies worldwide. Investigating the determinants of SARS-CoV-2 diffusion and their spatiotemporal dynamics at high spatial resolution is critical for public health and policymaking.
Methods: This study analyses 194,682 georeferenced SARS-CoV-2 RT-PCR tests performed between March 2020 and April 2022 in the canton of Vaud, Switzerland. We characterized five distinct pandemic periods using metrics of spatial and temporal clustering such as inverse Shannon entropy, the Hoover index, Lloyd's index of mean crowding, and the modified space-time DBSCAN algorithm. We assessed the demographic, socioeconomic, and environmental factors contributing to cluster persistence during each period using eXtreme Gradient Boosting (XGBoost) and SHapley Additive exPlanations (SHAP), to account for non-linear and spatial effects.
Results: Our findings reveal important variations in the spatial and temporal clustering of cases. Notably, areas with flatter epidemics had a higher total attack rate. Air pollution emerged as a factor showing a consistent positive association with higher cluster persistence, substantiated by both immission models and, to a lesser extent, tropospheric NO2 estimations. Factors including population density, testing rates, and geographical coordinates also showed important positive associations with higher cluster persistence. The socioeconomic index showed no significant contribution to cluster persistence, suggesting its limited role in the observed dynamics, which warrants further research.
Discussion: Overall, the determinants of cluster persistence remained consistent across the study periods. These findings highlight the need for effective air quality management strategies to mitigate air pollution's adverse impacts on public health, particularly in the context of respiratory viral diseases like COVID-19.
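As a rough illustration of two of the clustering metrics named in the Methods, the sketch below computes a Hoover index and a normalized Shannon entropy over per-area case counts. The function names, the toy inputs, and the normalization by log(n) are assumptions made for illustration; the abstract does not define the study's "inverse" Shannon entropy or its exact formulas, so this should not be read as the authors' implementation.

```python
import math


def hoover_index(cases, population):
    """Hoover index: half the summed absolute gap between each area's
    share of cases and its share of population (0 = cases spread exactly
    like the population; values near 1 = extreme concentration)."""
    total_cases = sum(cases)
    total_pop = sum(population)
    if total_cases <= 0 or total_pop <= 0:
        raise ValueError("totals must be positive")
    return 0.5 * sum(
        abs(c / total_cases - p / total_pop)
        for c, p in zip(cases, population)
    )


def normalized_shannon_entropy(counts):
    """Shannon entropy of the case distribution divided by log(n),
    so 1 means a perfectly even spread and 0 means a single hot spot.
    (Assumed normalization, chosen here for readability.)"""
    total = sum(counts)
    if total <= 0:
        raise ValueError("total must be positive")
    if len(counts) < 2:
        return 0.0
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(len(counts))


if __name__ == "__main__":
    # Hypothetical counts for four areas with equal population.
    cases = [40, 30, 20, 10]
    population = [1000, 1000, 1000, 1000]
    print(hoover_index(cases, population))
    print(normalized_shannon_entropy(cases))
```

Both functions take plain lists, so they can be applied per pandemic period by passing that period's counts.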
Affiliation(s)
- David De Ridder
  - Geographic Information Research and Analysis in Population Health (GIRAPH) Lab, Faculty of Medicine, University of Geneva (UNIGE), Geneva, Switzerland
  - Geospatial Molecular Epidemiology Group (GEOME), Laboratory for Biological Geochemistry (LGB), School of Architecture, Civil and Environmental Engineering (ENAC), École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
  - Division and Department of Primary Care Medicine, Geneva University Hospitals, Geneva, Switzerland
  - Faculty of Medicine, University of Geneva, Geneva, Switzerland
- Anaïs Ladoy
  - Geographic Information Research and Analysis in Population Health (GIRAPH) Lab, Faculty of Medicine, University of Geneva (UNIGE), Geneva, Switzerland
  - Geospatial Molecular Epidemiology Group (GEOME), Laboratory for Biological Geochemistry (LGB), School of Architecture, Civil and Environmental Engineering (ENAC), École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Yangji Choi
  - Institute of Microbiology, Lausanne University Hospital and University of Lausanne, Lausanne, Switzerland
- Damien Jacot
  - Institute of Microbiology, Lausanne University Hospital and University of Lausanne, Lausanne, Switzerland
- Séverine Vuilleumier
  - La Source School of Nursing, University of Applied Sciences and Arts Western Switzerland (HES-SO), Lausanne, Switzerland
- Idris Guessous
  - Geographic Information Research and Analysis in Population Health (GIRAPH) Lab, Faculty of Medicine, University of Geneva (UNIGE), Geneva, Switzerland
  - Division and Department of Primary Care Medicine, Geneva University Hospitals, Geneva, Switzerland
  - Faculty of Medicine, University of Geneva, Geneva, Switzerland
- Stéphane Joost
  - Geographic Information Research and Analysis in Population Health (GIRAPH) Lab, Faculty of Medicine, University of Geneva (UNIGE), Geneva, Switzerland
  - Geospatial Molecular Epidemiology Group (GEOME), Laboratory for Biological Geochemistry (LGB), School of Architecture, Civil and Environmental Engineering (ENAC), École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
  - Division and Department of Primary Care Medicine, Geneva University Hospitals, Geneva, Switzerland
  - La Source School of Nursing, University of Applied Sciences and Arts Western Switzerland (HES-SO), Lausanne, Switzerland
- Gilbert Greub
  - Institute of Microbiology, Lausanne University Hospital and University of Lausanne, Lausanne, Switzerland
  - Infectious Diseases Service, Lausanne University Hospital, Lausanne, Switzerland
3
Zhao N, Yu JY, Dzieciolowski K, Bui T. Error Analysis of Shapley Value-Based Model Explanations: An Informative Perspective. LECTURE NOTES IN COMPUTER SCIENCE 2024:29-48. [DOI: 10.1007/978-3-031-65112-0_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/03/2025]
4
Moradi H, Bunnell HT, Price BS, Khodaverdi M, Vest MT, Porterfield JZ, Anzalone AJ, Santangelo SL, Kimble W, Harper J, Hillegass WB, Hodder SL, on behalf of the National COVID Cohort Collaborative (N3C) Consortium. Assessing the effects of therapeutic combinations on SARS-CoV-2 infected patient outcomes: A big data approach. PLoS One 2023; 18:e0282587. [PMID: 36893086 PMCID: PMC9997963 DOI: 10.1371/journal.pone.0282587] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2022] [Accepted: 02/18/2023] [Indexed: 03/10/2023] Open
Abstract
BACKGROUND: The COVID-19 pandemic has demonstrated the need for efficient, comprehensive, and simultaneous assessment of multiple combined novel therapies for viral infection across the range of illness severity. Randomized Controlled Trials (RCTs) are the gold standard by which the efficacy of therapeutic agents is demonstrated. However, they are rarely designed to assess treatment combinations across all relevant subgroups. A big data approach to analyzing real-world impacts of therapies may confirm or supplement RCT evidence to further assess the effectiveness of therapeutic options for rapidly evolving diseases such as COVID-19.
METHODS: Gradient Boosted Decision Tree, Deep Neural Network, and Convolutional Neural Network classifiers were implemented and trained on the National COVID Cohort Collaborative (N3C) data repository to predict the patient outcome of death or discharge. Models used the patients' characteristics, the severity of COVID-19 at diagnosis, and the calculated proportion of days on different treatment combinations after diagnosis as features to predict the outcome. The most accurate model was then analyzed with eXplainable Artificial Intelligence (XAI) algorithms to provide insights into the learned impact of treatment combinations on the model's final outcome prediction.
RESULTS: Gradient Boosted Decision Tree classifiers achieved the highest prediction accuracy in identifying patient outcomes, with an area under the receiver operating characteristic curve of 0.90 and an accuracy of 0.81 for the outcomes of death or sufficient improvement to be discharged. The resulting model predicts that the treatment combination of anticoagulants and steroids is associated with the highest probability of improvement, followed by combined anticoagulants and targeted antivirals. In contrast, monotherapies of single drugs, including use of anticoagulants without steroids or antivirals, are associated with poorer outcomes.
CONCLUSIONS: By accurately predicting mortality, this machine learning model provides insights into the treatment combinations associated with clinical improvement in COVID-19 patients. Analysis of the model's components suggests a benefit to treatment with a combination of steroids, antivirals, and anticoagulant medication. The approach also provides a framework for simultaneously evaluating multiple real-world therapeutic combinations in future research studies.
Affiliation(s)
- Hamidreza Moradi
  - University of Mississippi Medical Center, Jackson, MS, United States of America
- Bradley S. Price
  - West Virginia University, Morgantown, WV, United States of America
- Maryam Khodaverdi
  - West Virginia Clinical and Translational Science Institute, Morgantown, WV, United States of America
- Michael T. Vest
  - Christiana Care Health System, Newark, DE, United States of America
- Alfred J. Anzalone
  - University of Nebraska Medical Center, Omaha, NE, United States of America
- Wesley Kimble
  - West Virginia Clinical and Translational Science Institute, Morgantown, WV, United States of America
- Jeremy Harper
  - Owl Health Works LLC, Indianapolis, IN, United States of America
- Sally L. Hodder
  - West Virginia Clinical and Translational Science Institute, Morgantown, WV, United States of America
5
Hilal W, Chislett MG, Snider B, McBean EA, Yawney J, Gadsden SA. Use of AI to assess COVID-19 variant impacts on hospitalization, ICU, and death. Front Artif Intell 2022; 5:927203. [DOI: 10.3389/frai.2022.927203] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2022] [Accepted: 11/07/2022] [Indexed: 12/05/2022] Open
Abstract
The rapid spread of COVID-19 and its variants has devastated communities worldwide, and as the highly transmissible Omicron variant became the dominant strain of the virus in late 2021, the need to characterize and understand the difference between the new variant and its predecessors has been an increasing priority for public health authorities. Artificial Intelligence has played a significant role in the analysis of various facets of COVID-19 since the early stages of the pandemic. This study proposes the use of AI, specifically an XGBoost model, to quantify the impact of various medical risk factors (or “population features”) on the possibility of a patient outcome resulting in hospitalization, ICU admission, or death. The results are compared between the Delta and Omicron COVID-19 variants. Results indicated that older age and unvaccinated status were most consistently the most significant population features contributing to all three scenarios (hospitalization, ICU, death). The top 15 features for each variant-outcome scenario were determined, which most frequently included diabetes, cardiovascular disease, chronic kidney disease, and complications of pneumonia as highly significant population features contributing to serious illness outcomes. The Delta/Hospitalization model returned the highest scores for the area under the receiver operating characteristic curve (AUROC), F1, and recall, while Omicron/ICU and Omicron/Hospitalization had the highest accuracy and precision values, respectively. Recall was above 0.60 in most cases (with only two exceptions), indicating that the total number of false negatives was generally minimized (accounting for more of the people who would theoretically require medical care).
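For readers comparing the metrics reported in this abstract, the following minimal sketch shows how recall, precision, and F1 follow from confusion-matrix counts. The function names and the example counts are hypothetical and are not taken from the study. Recall is TP / (TP + FN), so it rises as false negatives fall, while precision, TP / (TP + FP), is the metric that tracks false positives.

```python
def recall(tp, fn):
    """Share of truly positive patients the model catches: TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def precision(tp, fp):
    """Share of predicted positives that are correct: TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    p = precision(tp, fp)
    r = recall(tp, fn)
    return 2 * p * r / (p + r) if (p + r) else 0.0


if __name__ == "__main__":
    # Hypothetical counts: 60 true positives, 20 false positives,
    # 40 false negatives.
    tp, fp, fn = 60, 20, 40
    print("recall:", recall(tp, fn))
    print("precision:", precision(tp, fp))
    print("F1:", f1_score(tp, fp, fn))
```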
6
Jiang AZ, Nian F, Chen H, McBean EA. Passive Samplers, an Important Tool for Continuous Monitoring of the COVID-19 Pandemic. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2022; 29:32326-32334. [PMID: 35137317 PMCID: PMC9072756 DOI: 10.1007/s11356-022-19073-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/27/2021] [Accepted: 02/02/2022] [Indexed: 05/05/2023]
Abstract
The global pandemic caused by COVID-19 has resulted in major costs around the world, costs that touch every aspect of life, from people's daily living to the global economy. As the pandemic progresses, the virus evolves, and more vaccines become available, the 'battle against the virus' continues. As part of the battle, Wastewater-Based Epidemiology (WBE) technologies are being widely deployed in essential roles for SARS-CoV-2 detection and monitoring. While focusing on demonstrating the advantages of passive samplers as a tool in WBE, this review provides a holistic view of current WBE applications in monitoring SARS-CoV-2, integrating the most up-to-date data. A novel scenario example based on the Nanjing (China) outbreak of July 2021 is used to illustrate the potential benefits of using passive samplers to monitor COVID-19 and to facilitate effective control of future major outbreaks. The material presented on the application of passive samplers indicates that this technology can be beneficial at different levels, from building to community to regional. Countries and regions that have the pandemic well under control or have low positive case occurrences have the potential to benefit significantly from deploying passive samplers as a measure to identify and suppress outbreaks.
Affiliation(s)
- Albert Z. Jiang
  - School of Engineering, University of Guelph, 50 Stone Rd. E, Guelph, N1G 2W1 Canada
- Fulin Nian
  - Department of Digestive, Shanghai Pudong Hospital, Fudan University Affiliated Pudong Medical Center, 2800 Gongwei Road, Shanghai, 201399 China
- Han Chen
  - College of Environmental Science and Engineering/Sino-Canada Joint R&D Centre for Water and Environmental Safety, Nankai University, Tianjin, 300071 China
- Edward A. McBean
  - School of Engineering, University of Guelph, 50 Stone Rd. E, Guelph, N1G 2W1 Canada