1
Kang Q, Zhang B, Cao Y, Song X, Ye X, Li X, Wu H, Chen Y, Chen B. Causal prior-embedded physics-informed neural networks and a case study on metformin transport in porous media. WATER RESEARCH 2024; 261:121985. [PMID: 38968734 DOI: 10.1016/j.watres.2024.121985] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2023] [Revised: 05/17/2024] [Accepted: 06/20/2024] [Indexed: 07/07/2024]
Abstract
This study introduces a novel approach to transport modelling by integrating experimentally derived causal priors into neural networks. We illustrate this paradigm using a case study of metformin, a ubiquitous pharmaceutical emerging pollutant, and its transport behaviour in sandy media. Specifically, data from a sandy column transport experiment on metformin were used to estimate unobservable parameters through the physics-based model Hydrus-1D, followed by data augmentation to produce a more comprehensive dataset. A causal graph incorporating key variables was constructed, aiding in identifying impactful variables and estimating their causal dynamics, or "causal prior." The causal priors extracted from the augmented dataset included underexplored system parameters such as the type-1 sorption fraction F, the first-order reaction rate coefficient α, and the transport system scale. Their moderate impact on the transport process has been quantitatively evaluated (normalized causal effects of 0.0423, -0.1447 and -0.0351, respectively) with adequate confounders considered for the first time. The prior was later embedded into multilayer neural networks via two methods: causal weight initialization and causal prior regularization. Based on the results of AutoML hyperparameter tuning experiments, using the two embedding methods simultaneously emerged as the more advantageous practice, since our proposed causal weight initialization technique can enhance model stability, particularly when used in conjunction with causal prior regularization. Amongst the experiments utilizing both techniques, the R-squared values peaked at 0.881. This study demonstrates a balanced approach between expert knowledge and data-driven methods, providing enhanced interpretability in black-box models such as neural networks for environmental modelling.
Affiliation(s)
- Qiao Kang
  - The Northern Region Persistent Organic Pollution Control (NRPOP) Laboratory, Faculty of Engineering and Applied Science, Memorial University of Newfoundland, St. John's, Newfoundland, A1B 3X5, Canada
- Baiyu Zhang
  - The Northern Region Persistent Organic Pollution Control (NRPOP) Laboratory, Faculty of Engineering and Applied Science, Memorial University of Newfoundland, St. John's, Newfoundland, A1B 3X5, Canada
- Yiqi Cao
  - The Northern Region Persistent Organic Pollution Control (NRPOP) Laboratory, Faculty of Engineering and Applied Science, Memorial University of Newfoundland, St. John's, Newfoundland, A1B 3X5, Canada
- Xing Song
  - The Northern Region Persistent Organic Pollution Control (NRPOP) Laboratory, Faculty of Engineering and Applied Science, Memorial University of Newfoundland, St. John's, Newfoundland, A1B 3X5, Canada
- Xudong Ye
  - The Northern Region Persistent Organic Pollution Control (NRPOP) Laboratory, Faculty of Engineering and Applied Science, Memorial University of Newfoundland, St. John's, Newfoundland, A1B 3X5, Canada
- Xixi Li
  - The Northern Region Persistent Organic Pollution Control (NRPOP) Laboratory, Faculty of Engineering and Applied Science, Memorial University of Newfoundland, St. John's, Newfoundland, A1B 3X5, Canada
- Hongjing Wu
  - The Northern Region Persistent Organic Pollution Control (NRPOP) Laboratory, Faculty of Engineering and Applied Science, Memorial University of Newfoundland, St. John's, Newfoundland, A1B 3X5, Canada
- Yuanzhu Chen
  - School of Computing, Queen's University, Kingston, ON, K7L 2N8, Canada
- Bing Chen
  - The Northern Region Persistent Organic Pollution Control (NRPOP) Laboratory, Faculty of Engineering and Applied Science, Memorial University of Newfoundland, St. John's, Newfoundland, A1B 3X5, Canada
2
Rosen EM, Ritchey ME, Girman CJ. Can Weight of Evidence, Quantitative Bias, and Bounding Methods Evaluate Robustness of Real-world Evidence for Regulator and Health Technology Assessment Decisions on Medical Interventions? Clin Ther 2023; 45:1266-1276. [PMID: 37798219 DOI: 10.1016/j.clinthera.2023.09.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2022] [Revised: 06/07/2023] [Accepted: 09/12/2023] [Indexed: 10/07/2023]
Abstract
PURPOSE High-quality evidence is crucial for health care intervention decision-making. These decisions frequently use nonrandomized data, which can be more vulnerable to biases than randomized trials. Accordingly, methods to quantify biases and weigh available evidence could elucidate the robustness of findings, giving regulators more confidence in making approval and reimbursement decisions. METHODS We conducted an integrative literature review to identify methods for determining probability of causation, evaluating weight of evidence, and conducting quantitative bias analysis as related to health care interventions. Eligible studies were published from 2012 to 2021, were applicable to pharmacoepidemiology, and presented a method that met our objective. FINDINGS Twenty-two eligible studies were classified into 4 categories: (1) quantitative bias analysis; (2) weight of evidence methods; (3) Bayesian networks; and (4) miscellaneous. All of the methods have strengths, limitations, and situations in which they are better suited than others. Some methods seem to lend themselves more than others to applications of health care evidence on medical interventions. IMPLICATIONS To provide robust evidence for and improve confidence in regulatory or reimbursement decisions, we recommend applying multiple methods to triangulate associations of medical interventions, accounting for biases in different ways. This approach could lead to well-defined robustness assessments of study findings and appropriate science-driven decisions by regulators and payers for public health.
Affiliation(s)
- Emma M Rosen
  - Department of Epidemiology, University of North Carolina-Chapel Hill, Chapel Hill, North Carolina, USA; CERobs Consulting, LLC, Wrightsville Beach, North Carolina, USA
- Mary E Ritchey
  - CERobs Consulting, LLC, Wrightsville Beach, North Carolina, USA; Med Tech Epi, LLC; Philadelphia, Pennsylvania, USA; Center for Pharmacoepidemiology & Treatment Science, Rutgers University, New Brunswick, New Jersey, USA
- Cynthia J Girman
  - Department of Epidemiology, University of North Carolina-Chapel Hill, Chapel Hill, North Carolina, USA; CERobs Consulting, LLC, Wrightsville Beach, North Carolina, USA
3
Juhan N, Zubairi YZ, Mahmood Zuhdi AS, Mohd Khalid Z. Predictors on outcomes of cardiovascular disease of male patients in Malaysia using Bayesian network analysis. BMJ Open 2023; 13:e066748. [PMID: 37923353 PMCID: PMC10626862 DOI: 10.1136/bmjopen-2022-066748] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2022] [Accepted: 08/30/2023] [Indexed: 11/07/2023] [Open Access]
Abstract
OBJECTIVES Despite extensive advances in medical and surgical treatment, cardiovascular disease (CVD) remains the leading cause of mortality worldwide. Identifying the significant predictors will help clinicians with the prognosis of the disease and patient management. This study aims to identify and interpret the dependence structure between the predictors and health outcomes of male patients with ST-elevation myocardial infarction (STEMI) in a Malaysian setting. DESIGN Retrospective study. SETTING Malaysian National Cardiovascular Disease Database-Acute Coronary Syndrome (NCVD-ACS) registry years 2006-2013, which consists of 18 hospitals across the country. PARTICIPANTS 7180 male patients diagnosed with STEMI from the NCVD-ACS registry. PRIMARY AND SECONDARY OUTCOME MEASURES A graphical model based on the Bayesian network (BN) approach has been considered. A bootstrap resampling approach was integrated into the structural learning algorithm to estimate probabilistic relations between the studied features that have the strongest influence and support. RESULTS The relationships between 16 features in the domain of CVD were visualised. From the bootstrap resampling approach, out of 250, only 25 arcs are significant (strength value ≥0.85 and direction value ≥0.50). Age group, Killip class and renal disease were classified as the key predictors in the BN model for male patients, as they were the most influential variables directly connected to the outcome, which is the patient status. Widespread probabilistic associations between the key predictors and the remaining variables were observed in the network structure. High likelihood values are observed for patient status of alive (93.8%), Killip class I on presentation (66.8%), patients younger than 65 (81.1%), smokers (77.2%) and ethnic Malay patients (59.2%). The BN model has been shown to have good predictive performance. CONCLUSIONS The data visualisation analysis can be a powerful tool to understand the relationships between the CVD prognostic variables and can be useful to clinicians.
Affiliation(s)
- Nurliyana Juhan
  - Preparatory Centre for Science and Technology, Universiti Malaysia Sabah, Kota Kinabalu, Malaysia
- Yong Zulina Zubairi
  - Institute for Advanced Studies, University of Malaya, Kuala Lumpur, Malaysia
- Zarina Mohd Khalid
  - Department of Mathematical Sciences, Universiti Teknologi Malaysia, Skudai, Malaysia
4
Fung H, Sgaier SK, Huang VS. Discovery of interconnected causal drivers of COVID-19 vaccination intentions in the US using a causal Bayesian network. Sci Rep 2023; 13:6988. [PMID: 37193707 DOI: 10.1038/s41598-023-33745-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2022] [Accepted: 04/18/2023] [Indexed: 05/18/2023] [Open Access]
Abstract
Holistic interventions to overcome COVID-19 vaccine hesitancy require a system-level understanding of the interconnected causes and mechanisms that give rise to it. However, conventional correlative analyses do not easily provide such nuanced insights. We used an unsupervised, hypothesis-free causal discovery algorithm to learn the interconnected causal pathways to vaccine intention as a causal Bayesian network (BN), using data from a COVID-19 vaccine hesitancy survey in the US in early 2021. We identified social responsibility, vaccine safety and anticipated regret as prime candidates for interventions and revealed a complex network of variables that mediate their influences. Social responsibility's causal effect greatly exceeded that of other variables. The BN revealed that the causal impact of political affiliations was weak compared with more direct causal factors. This approach provides clearer targets for intervention than regression, suggesting it can be an effective way to explore multiple causal pathways of complex behavioural problems to inform interventions.
Affiliation(s)
- Henry Fung
  - Surgo Health, Washington, DC, USA
  - Surgo Ventures, Washington, DC, USA
- Sema K Sgaier
  - Surgo Health, Washington, DC, USA
  - Surgo Ventures, Washington, DC, USA
  - Department of Global Health, University of Washington, Seattle, WA, USA
- Vincent S Huang
  - Surgo Health, Washington, DC, USA
  - Surgo Ventures, Washington, DC, USA
5
Cao Y, Kang Q, Zhang B, Zhu Z, Dong G, Cai Q, Lee K, Chen B. Machine learning-aided causal inference for unraveling chemical dispersant and salinity effects on crude oil biodegradation. BIORESOURCE TECHNOLOGY 2022; 345:126468. [PMID: 34864175 DOI: 10.1016/j.biortech.2021.126468] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 10/24/2021] [Revised: 11/24/2021] [Accepted: 11/27/2021] [Indexed: 06/13/2023]
Abstract
Chemical dispersants have been widely applied to tackle oil spills, but their effects on oil biodegradation in global aquatic systems with different salinities are not well understood. Here, both experiments and advanced machine learning-aided causal inference analysis were applied to evaluate related processes. A halotolerant oil-degrading and biosurfactant-producing species was selected and characterized within the salinity range of 0-70 g/L NaCl. Notably, dispersant addition can relieve the biodegradation barriers caused by high salinities. To navigate the causal relationships behind the experimental data, a structural causal model was built to quantitatively estimate the strength of causal links among salinity, dispersant addition, cell abundance, biosurfactant productivity and oil biodegradation. The estimated causal effects were integrated into a weighted directed acyclic graph, which showed that the overall positive effect of dispersant addition on oil biodegradation arose mainly through the enrichment of cell abundance. These findings can benefit decision-making prior to dispersant application under different saline environments.
Affiliation(s)
- Yiqi Cao
  - The Northern Region Persistent Organic Pollution (NRPOP) Control Laboratory, Faculty of Engineering and Applied Science, Memorial University, St. John's, NL A1B 3X5, Canada
- Qiao Kang
  - The Northern Region Persistent Organic Pollution (NRPOP) Control Laboratory, Faculty of Engineering and Applied Science, Memorial University, St. John's, NL A1B 3X5, Canada
- Baiyu Zhang
  - The Northern Region Persistent Organic Pollution (NRPOP) Control Laboratory, Faculty of Engineering and Applied Science, Memorial University, St. John's, NL A1B 3X5, Canada
- Zhiwen Zhu
  - The Northern Region Persistent Organic Pollution (NRPOP) Control Laboratory, Faculty of Engineering and Applied Science, Memorial University, St. John's, NL A1B 3X5, Canada
- Guihua Dong
  - The Northern Region Persistent Organic Pollution (NRPOP) Control Laboratory, Faculty of Engineering and Applied Science, Memorial University, St. John's, NL A1B 3X5, Canada
- Qinhong Cai
  - National Research Council Canada, Energy, Mining and Environment Research Centre, Montreal, QC H4P 2R2, Canada
- Kenneth Lee
  - Fisheries and Oceans Canada, Ecosystem Science, Ottawa, ON K1A 0E6, Canada
- Bing Chen
  - The Northern Region Persistent Organic Pollution (NRPOP) Control Laboratory, Faculty of Engineering and Applied Science, Memorial University, St. John's, NL A1B 3X5, Canada
6
Kang Q, Song X, Xin X, Chen B, Chen Y, Ye X, Zhang B. Machine Learning-Aided Causal Inference Framework for Environmental Data Analysis: A COVID-19 Case Study. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2021; 55:13400-13410. [PMID: 34559516 DOI: 10.1021/acs.est.1c02204] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 06/13/2023]
Abstract
Links between environmental conditions (e.g., meteorological factors and air quality) and COVID-19 severity have been reported worldwide. However, the existing frameworks of data analysis are insufficient or inefficient to investigate the potential causality behind the associations involving multidimensional factors and complicated interrelationships. Thus, a causal inference framework equipped with the structural causal model and aided by machine learning methods was proposed and applied to examine the potential causal relationships between COVID-19 severity and 10 environmental factors (NO2, O3, PM2.5, PM10, SO2, CO, average air temperature, atmospheric pressure, relative humidity, and wind speed) in 166 Chinese cities. The cities were grouped into three clusters based on their socio-economic features. Time-series data from the cities in each cluster were analyzed in different pandemic phases. The robustness check refuted most of the estimated potential causal relationships (89 out of 90). Only one potential relationship, involving air temperature, passed the final test, with a causal effect of 0.041 under a specific cluster-phase condition. The results indicate that the environmental factors are unlikely to cause noticeable aggravation of the COVID-19 pandemic. This study also demonstrated the high value and potential of the proposed method in investigating causal problems with observational data in environmental or other fields.
Affiliation(s)
- Qiao Kang
  - Northern Region Persistent Organic Pollution Control (NRPOP) Laboratory, Faculty of Engineering and Applied Science, Memorial University of Newfoundland, St. John's A1B 3X5, Newfoundland and Labrador, Canada
- Xing Song
  - Northern Region Persistent Organic Pollution Control (NRPOP) Laboratory, Faculty of Engineering and Applied Science, Memorial University of Newfoundland, St. John's A1B 3X5, Newfoundland and Labrador, Canada
- Xiaying Xin
  - Northern Region Persistent Organic Pollution Control (NRPOP) Laboratory, Faculty of Engineering and Applied Science, Memorial University of Newfoundland, St. John's A1B 3X5, Newfoundland and Labrador, Canada
- Bing Chen
  - Northern Region Persistent Organic Pollution Control (NRPOP) Laboratory, Faculty of Engineering and Applied Science, Memorial University of Newfoundland, St. John's A1B 3X5, Newfoundland and Labrador, Canada
- Yuanzhu Chen
  - School of Computing, Queen's University, Kingston K7L 2N8, Ontario, Canada
- Xudong Ye
  - Northern Region Persistent Organic Pollution Control (NRPOP) Laboratory, Faculty of Engineering and Applied Science, Memorial University of Newfoundland, St. John's A1B 3X5, Newfoundland and Labrador, Canada
- Baiyu Zhang
  - Northern Region Persistent Organic Pollution Control (NRPOP) Laboratory, Faculty of Engineering and Applied Science, Memorial University of Newfoundland, St. John's A1B 3X5, Newfoundland and Labrador, Canada