1
Brantner CL, Nguyen TQ, Tang T, Zhao C, Hong H, Stuart EA. Comparison of methods that combine multiple randomized trials to estimate heterogeneous treatment effects. Stat Med 2024; 43:1291-1314. [PMID: 38273647 PMCID: PMC11086055 DOI: 10.1002/sim.9955] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2023] [Revised: 10/06/2023] [Accepted: 10/23/2023] [Indexed: 01/27/2024]
Abstract
Individualized treatment decisions can improve health outcomes, but using data to make these decisions in a reliable, precise, and generalizable way is challenging with a single dataset. Leveraging multiple randomized controlled trials allows for the combination of datasets with unconfounded treatment assignment to better estimate heterogeneous treatment effects. This article discusses several nonparametric approaches for estimating heterogeneous treatment effects using data from multiple trials. We extend single-study methods to a scenario with multiple trials and explore their performance through a simulation study, with data generation scenarios that have differing levels of cross-trial heterogeneity. The simulations demonstrate that methods that directly allow for heterogeneity of the treatment effect across trials perform better than methods that do not, and that the choice of single-study method matters based on the functional form of the treatment effect. Finally, we discuss which methods perform well in each setting and then apply them to four randomized controlled trials to examine effect heterogeneity of treatments for major depressive disorder.
Affiliation(s)
- Carly Lupton Brantner
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA
- Trang Quynh Nguyen
  - Department of Mental Health, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA
- Tengjie Tang
  - Department of Statistical Science, Duke University, Durham, North Carolina, USA
- Congwen Zhao
  - Department of Biostatistics and Bioinformatics, Duke University, Durham, North Carolina, USA
- Hwanhee Hong
  - Department of Biostatistics and Bioinformatics, Duke University, Durham, North Carolina, USA
- Elizabeth A. Stuart
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA
  - Department of Mental Health, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA
  - Department of Health Policy and Management, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA
2
Nestler S, Salditt M. Comparing type 1 and type 2 error rates of different tests for heterogeneous treatment effects. Behav Res Methods 2024:10.3758/s13428-024-02371-x. [PMID: 38509268 DOI: 10.3758/s13428-024-02371-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/13/2024] [Indexed: 03/22/2024]
Abstract
Psychologists are increasingly interested in whether treatment effects vary in randomized controlled trials. A number of tests have been proposed in the causal inference literature to test for such heterogeneity, which differ in the sample statistic they use (either using the variance terms of the experimental and control group, their empirical distribution functions, or specific quantiles), and in whether they make distributional assumptions or are based on a Fisher randomization procedure. In this manuscript, we present the results of a simulation study in which we examine the performance of the different tests while varying the amount of treatment effect heterogeneity, the type of underlying distribution, the sample size, and whether an additional covariate is considered. Altogether, our results suggest that researchers should use a randomization test to optimally control for type 1 errors. Furthermore, all tests studied are associated with low power in case of small and moderate samples even when the heterogeneity of the treatment effect is substantial. This suggests that current tests for treatment effect heterogeneity require much larger samples than those collected in current research.
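As a rough illustration of the kind of randomization test the abstract refers to, the sketch below compares outcome variances between arms by permutation. It is a simplified variant written for this listing, not one of the specific tests evaluated in the article: the function name, the absolute log variance ratio statistic, and the within-arm centering are all assumptions made for the example.

```python
import math
import random
import statistics


def variance_ratio_randomization_test(treated, control, n_perm=2000, seed=0):
    """Permutation test for unequal outcome variances between two arms.

    Hypothetical helper for illustration. Outcomes are centered within each
    arm first, so a constant (homogeneous) treatment effect does not by
    itself inflate the statistic. Returns (observed_statistic, p_value).
    """
    rng = random.Random(seed)
    mean_t = statistics.fmean(treated)
    mean_c = statistics.fmean(control)
    t_centered = [y - mean_t for y in treated]
    c_centered = [y - mean_c for y in control]

    def stat(a, b):
        # Absolute log variance ratio: 0 when both groups are equally spread.
        return abs(math.log(statistics.variance(a) / statistics.variance(b)))

    observed = stat(t_centered, c_centered)
    pooled = t_centered + c_centered
    n_t = len(t_centered)
    exceed = 0
    for _ in range(n_perm):
        # Re-assign arm labels at random and recompute the statistic.
        rng.shuffle(pooled)
        if stat(pooled[:n_t], pooled[n_t:]) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)
```

Calling `variance_ratio_randomization_test(treated_outcomes, control_outcomes)` on two lists of numeric outcomes returns the statistic and a permutation p-value; a small p-value suggests the arms differ in spread, which is one symptom of treatment effect heterogeneity.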
Affiliation(s)
- Steffen Nestler
  - University of Münster, Institut für Psychologie, Fliednerstr. 21, 48149, Münster, Germany
- Marie Salditt
  - University of Münster, Institut für Psychologie, Fliednerstr. 21, 48149, Münster, Germany
3
Brooks JM, Chapman CG, Chen BK, Floyd SB, Hikmet N. Assessing the properties of patient-specific treatment effect estimates from causal forest algorithms under essential heterogeneity. BMC Med Res Methodol 2024; 24:66. [PMID: 38481139 PMCID: PMC10935905 DOI: 10.1186/s12874-024-02187-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Accepted: 02/21/2024] [Indexed: 03/17/2024]
Abstract
BACKGROUND: Treatment variation from observational data has been used to estimate patient-specific treatment effects. Causal Forest Algorithms (CFAs) developed for this task have unknown properties when treatment effect heterogeneity from unmeasured patient factors influences treatment choice (essential heterogeneity).
METHODS: We simulated eleven populations with identical treatment effect distributions based on patient factors. The populations varied in the extent that treatment effect heterogeneity influenced treatment choice. We used the generalized random forest application (CFA-GRF) to estimate patient-specific treatment effects for each population. Average differences between true and estimated effects for patient subsets were evaluated.
RESULTS: CFA-GRF performed well across the population when treatment effect heterogeneity did not influence treatment choice. Under essential heterogeneity, however, CFA-GRF yielded treatment effect estimates that reflected true treatment effects only for treated patients and were on average greater than true treatment effects for untreated patients.
CONCLUSIONS: Patient-specific estimates produced by CFAs are sensitive to why patients in real-world practice make different treatment choices. Researchers using CFAs should develop conceptual frameworks of treatment choice prior to estimation to guide estimate interpretation ex post.
Affiliation(s)
- John M Brooks
  - Center for Effectiveness Research in Orthopaedics - Arnold School of Public Health Greenville, 915 Greene Street #302D, Columbia, SC, 29208-0001, USA
  - University of South Carolina Arnold School of Public Health, Health Services Policy & Management, Columbia, SC, USA
- Cole G Chapman
  - Department of Pharmacy Practice and Science, University of Iowa, Iowa City, Iowa, USA
  - Center for Effectiveness Research in Orthopaedics, Greenville, SC, USA
- Brian K Chen
  - University of South Carolina Arnold School of Public Health, Health Services Policy & Management, Columbia, SC, USA
  - Center for Effectiveness Research in Orthopaedics, Greenville, SC, USA
- Sarah B Floyd
  - Center for Effectiveness Research in Orthopaedics, Greenville, SC, USA
  - Clemson University College of Behavioral Social and Health Sciences, Public Health Sciences, Clemson, South Carolina, USA
- Neset Hikmet
  - Center for Effectiveness Research in Orthopaedics, Greenville, SC, USA
  - Department of Integrated Information Technology, Innovation Think Tank Lab @ USC, University of South Carolina College of Engineering and Computing, Columbia, SC, USA
4
Chen X, Harhay MO, Tong G, Li F. A Bayesian machine learning approach for estimating heterogeneous survivor causal effects: applications to a critical care trial. Ann Appl Stat 2024; 18:350-374. [PMID: 38455841 PMCID: PMC10919396 DOI: 10.1214/23-aoas1792] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/09/2024]
Abstract
Assessing heterogeneity in the effects of treatments has become increasingly popular in the field of causal inference and carries important implications for clinical decision-making. While extensive literature exists for studying treatment effect heterogeneity when outcomes are fully observed, there has been limited development in tools for estimating heterogeneous causal effects when patient-centered outcomes are truncated by a terminal event, such as death. Due to mortality occurring during study follow-up, the outcomes of interest are unobservable, undefined, or not fully observed for many participants in which case principal stratification is an appealing framework to draw valid causal conclusions. Motivated by the Acute Respiratory Distress Syndrome Network (ARDSNetwork) ARDS respiratory management (ARMA) trial, we developed a flexible Bayesian machine learning approach to estimate the average causal effect and heterogeneous causal effects among the always-survivors stratum when clinical outcomes are subject to truncation. We adopted Bayesian additive regression trees (BART) to flexibly specify separate mean models for the potential outcomes and latent stratum membership. In the analysis of the ARMA trial, we found that the low tidal volume treatment had an overall benefit for participants sustaining acute lung injuries on the outcome of time to returning home but substantial heterogeneity in treatment effects among the always-survivors, driven most strongly by biologic sex and the alveolar-arterial oxygen gradient at baseline (a physiologic measure of lung function and degree of hypoxemia). These findings illustrate how the proposed methodology could guide the prognostic enrichment of future trials in the field.
Affiliation(s)
- Xinyuan Chen
  - Department of Mathematics and Statistics, Mississippi State University
- Michael O. Harhay
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania
- Guangyu Tong
  - Department of Biostatistics, Yale School of Public Health
- Fan Li
  - Department of Biostatistics, Yale School of Public Health
5
Post RAJ, Petkovic M, van den Heuvel IL, van den Heuvel ER. Flexible Machine Learning Estimation of Conditional Average Treatment Effects: A Blessing and a Curse. Epidemiology 2024; 35:32-40. [PMID: 37889951 DOI: 10.1097/ede.0000000000001684] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/29/2023]
Abstract
Causal inference from observational data requires untestable identification assumptions. If these assumptions apply, machine learning methods can be used to study complex forms of causal effect heterogeneity. Recently, several machine learning methods were developed to estimate the conditional average treatment effect (ATE). If the features at hand cannot explain all heterogeneity, the individual treatment effects can seriously deviate from the conditional ATE. In this work, we demonstrate how the distributions of the individual treatment effect and the conditional ATE can differ when a causal random forest is applied. We extend the causal random forest to estimate the difference in conditional variance between treated and controls. If the distribution of the individual treatment effect equals that of the conditional ATE, this estimated difference in variance should be small. If they differ, an additional causal assumption is necessary to quantify the heterogeneity not captured by the distribution of the conditional ATE. The conditional variance of the individual treatment effect can be identified when the individual effect is independent of the outcome under no treatment given the measured features. Then, in the cases where the individual treatment effect and conditional ATE distributions differ, the extended causal random forest can appropriately estimate the variance of the individual treatment effect distribution, whereas the causal random forest fails to do so.
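The identification step described in the abstract can be written out as a short derivation. The notation here is an assumption introduced for illustration ($Y^{0}$ and $Y^{1}$ for the potential outcomes, $\tau = Y^{1} - Y^{0}$ for the individual treatment effect, $X$ for the measured features) and may not match the article's own symbols.

```latex
% Decompose the treated potential outcome as Y^1 = Y^0 + \tau.
\begin{align}
\operatorname{Var}(Y^{1} \mid X = x)
  &= \operatorname{Var}(Y^{0} \mid X = x)
   + \operatorname{Var}(\tau \mid X = x)
   + 2\operatorname{Cov}(Y^{0}, \tau \mid X = x). \\
\intertext{If $\tau$ is independent of $Y^{0}$ given $X$, the covariance term is zero, so}
\operatorname{Var}(\tau \mid X = x)
  &= \operatorname{Var}(Y^{1} \mid X = x)
   - \operatorname{Var}(Y^{0} \mid X = x).
\end{align}
```

Under the usual identification assumptions, the two conditional variances on the right-hand side correspond to the treated and control groups, so their difference is the quantity the abstract describes estimating. Without the independence assumption the covariance term remains, and the difference in variances no longer equals the variance of the individual effect.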
Affiliation(s)
- Richard A J Post
  - Department of Mathematics and Computer Science, Eindhoven University of Technology, the Netherlands
- Marko Petkovic
  - Department of Mathematics and Computer Science, Eindhoven University of Technology, the Netherlands
- Isabel L van den Heuvel
  - Department of Mathematics and Computer Science, Eindhoven University of Technology, the Netherlands
- Edwin R van den Heuvel
  - Department of Mathematics and Computer Science, Eindhoven University of Technology, the Netherlands
  - Department of Preventive Medicine and Epidemiology, School of Medicine, Boston University, Boston, MA
6
Salditt M, Eckes T, Nestler S. A Tutorial Introduction to Heterogeneous Treatment Effect Estimation with Meta-learners. Administration and Policy in Mental Health and Mental Health Services Research 2023:10.1007/s10488-023-01303-9. [PMID: 37922115 DOI: 10.1007/s10488-023-01303-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 09/12/2023] [Indexed: 11/05/2023]
Abstract
Psychotherapy has been proven to be effective on average, though patients respond very differently to treatment. Understanding which characteristics are associated with treatment effect heterogeneity can help to customize therapy to the individual patient. In this tutorial, we describe different meta-learners, which are flexible algorithms that can be used to estimate personalized treatment effects. More specifically, meta-learners decompose treatment effect estimation into multiple prediction tasks, each of which can be solved by any machine learning model. We begin by reviewing necessary assumptions for interpreting the estimated treatment effects as causal, and then give an overview of key concepts of machine learning. Throughout the article, we use an illustrative data example to show how the different meta-learners can be implemented in R. We also point out how current popular practices in psychotherapy research fit into the meta-learning framework. Finally, we show how heterogeneous treatment effects can be analyzed, and point out some challenges in the implementation of meta-learners.
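To make the idea of decomposing effect estimation into separate prediction tasks concrete, here is a minimal sketch of one common meta-learner, the T-learner. The tutorial itself works in R; this Python version, including the names `SimpleLinearModel` and `t_learner_cate` and the one-feature least-squares base learner, is a hypothetical illustration rather than the authors' code.

```python
import statistics


class SimpleLinearModel:
    """One-feature least-squares regression, standing in for any base learner."""

    def fit(self, x, y):
        mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
        self.slope = sxy / sxx
        self.intercept = mean_y - self.slope * mean_x
        return self

    def predict(self, x):
        return [self.intercept + self.slope * xi for xi in x]


def t_learner_cate(x, treatment, y, x_new, make_model=SimpleLinearModel):
    """T-learner: fit one outcome model per arm, then difference the predictions.

    `make_model` is any zero-argument factory returning an object with
    fit(x, y) and predict(x); swapping it changes the base learner.
    """
    treated = [(xi, yi) for xi, ti, yi in zip(x, treatment, y) if ti == 1]
    control = [(xi, yi) for xi, ti, yi in zip(x, treatment, y) if ti == 0]
    # Prediction task 1: outcome under treatment.
    model_1 = make_model().fit([p[0] for p in treated], [p[1] for p in treated])
    # Prediction task 2: outcome under control.
    model_0 = make_model().fit([p[0] for p in control], [p[1] for p in control])
    # Estimated conditional effect = difference of the two predictions.
    return [m1 - m0 for m1, m0 in zip(model_1.predict(x_new), model_0.predict(x_new))]
```

Because the base learner is passed in as a factory, the same function works with any model that exposes `fit` and `predict`, which is the flexibility the abstract attributes to meta-learners. The estimates are only interpretable as causal under the identification assumptions the tutorial reviews.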
Affiliation(s)
- Marie Salditt
  - Institut für Psychologie, University of Münster, Fliednerstr. 21, 48149, Münster, Germany
- Theresa Eckes
  - Institut für Psychologie, University of Münster, Fliednerstr. 21, 48149, Münster, Germany
- Steffen Nestler
  - Institut für Psychologie, University of Münster, Fliednerstr. 21, 48149, Münster, Germany
7
Carroll JM, Yeager DS, Buontempo J, Hecht C, Cimpian A, Mhatre P, Muller C, Crosnoe R. Mindset × Context: Schools, Classrooms, and the Unequal Translation of Expectations into Math Achievement. Monogr Soc Res Child Dev 2023; 88:7-109. [PMID: 37574937 DOI: 10.1111/mono.12471] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2021] [Revised: 06/08/2023] [Accepted: 06/23/2023] [Indexed: 08/15/2023]
Abstract
When do adolescents' dreams of promising journeys through high school translate into academic success? This monograph reports the results of a collaborative effort among sociologists and psychologists to systematically examine the role of schools and classrooms in disrupting or facilitating the link between adolescents' expectations for success in math and their subsequent progress in the early high school math curriculum. Our primary focus was on gendered patterns of socioeconomic inequality in math and how they are tethered to the school's peer culture and to students' perceptions of gender stereotyping in the classroom. To do this, this monograph advances Mindset × Context Theory. This orients research on educational equity to the reciprocal influence between students' psychological motivations and their school-based opportunities to enact those motivations. Mindset × Context Theory predicts that a student's mindset will be more strongly linked to developmental outcomes among groups of students who are at risk for poor outcomes, but only in a school or classroom context where there is sufficient need and support for the mindset. Our application of this theory centers on expectations for success in high school math as a foundational belief for students' math progress early in high school. We examine how this mindset varies across interpersonal and cultural dynamics in schools and classrooms. Following this perspective, we ask:
1. Which gender and socioeconomic identity groups showed the weakest or strongest links between expectations for success in math and progress through the math curriculum?
2. How did the school's peer culture shape the links between student expectations for success in math and math progress across gender and socioeconomic identity groups?
3. How did perceptions of classroom gender stereotyping shape the links between student expectations for success in math and math progress across gender and socioeconomic identity groups?
We used nationally representative data from about 10,000 U.S. public school 9th graders in the National Study of Learning Mindsets (NSLM) collected in 2015-2016, the most recent, national, longitudinal study of adolescents' mindsets in U.S. public schools. The sample was representative with respect to a large number of observable characteristics, such as gender, race, ethnicity, English Language Learners (ELLs), free or reduced price lunch, poverty, food stamps, neighborhood income and labor market participation, and school curricular opportunities. This allowed for generalization to the U.S. public school population and for the systematic investigation of school- and classroom-level contextual factors. The NSLM's complete sampling of students within schools also allowed for a comparison of students from different gender and socioeconomic groups with the same expectations in the same educational contexts. To analyze these data, we used the Bayesian Causal Forest (BCF) algorithm, a best-in-class machine-learning method for discovering complex, replicable interaction effects. Chapter IV examined the interplay of expectations, gender, and socioeconomic status (SES; operationalized with maternal educational attainment). Adolescents' expectations for success in math were meaningful predictors of their early math progress, even when controlling for other psychological factors, prior achievement in math, and racial and ethnic identities. Boys from low-SES families were the most vulnerable identity group. They were over three times more likely to not make adequate progress in math from 9th to 10th grade relative to girls from high-SES families. Boys from low-SES families also benefited the most from their expectations for success in math. Overall, these results were consistent with Mindset × Context Theory's predictions. Chapters V and VI examined the moderating role of school-level and classroom-level factors in the patterns reported in Chapter IV.
Expectations were least predictive of math progress in the highest-achieving schools and schools with the most academically oriented peer norms, that is, schools with the most formal and informal resources. School resources appeared to compensate for lower levels of expectations. Conversely, expectations most strongly predicted math progress in the low/medium-achieving schools with less academically oriented peers, especially for boys from low-SES families. This chapter aligns with aspects of Mindset × Context Theory. A context that was not already optimally supporting student success was where outcomes for vulnerable students depended the most on student expectations. Finally, perceptions of classroom stereotyping mattered. Perceptions of gender stereotyping predicted less progress in math, but expectations for success in math more strongly predicted progress in classrooms with high perceived stereotyping. Gender stereotyping interactions emerged for all sociodemographic groups except for boys from high-SES families. The findings across these three analytical chapters demonstrate the value of integrating psychological and sociological perspectives to capture multiple levels of schooling. It also drew on the contextual variability afforded by representative sampling and explored the interplay of lab-tested psychological processes (expectations) with field-developed levers of policy intervention (school contexts). This monograph also leverages developmental and ecological insights to identify which groups of students might profit from different efforts to improve educational equity, such as interventions to increase expectations for success in math, or school programs that improve the school or classroom cultures.
8
O'Flaherty M, Kalucza S, Bon J. Does Anyone Suffer From Teenage Motherhood? Mental Health Effects of Teen Motherhood in Great Britain Are Small and Homogeneous. Demography 2023; 60:707-729. [PMID: 37226980 DOI: 10.1215/00703370-10788364] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/26/2023]
Abstract
Teen mothers experience disadvantage across a wide range of outcomes. However, previous research is equivocal with respect to possible long-term mental health consequences of teen motherhood and has not adequately considered the possibility that effects on mental health may be heterogeneous. Drawing on data from the 1970 British Birth Cohort Study, this article applies a novel statistical machine-learning approach, Bayesian Additive Regression Trees, to estimate the effects of teen motherhood on mental health outcomes at ages 30, 34, and 42. We extend previous work by estimating not only sample-average effects but also individual-specific estimates. Our results show that sample-average mental health effects of teen motherhood are substantively small at all time points, apart from age 30 comparisons to women who first became mothers at age 25‒30. Moreover, we find that these effects are largely homogeneous for all women in the sample, indicating that there are no subgroups in the data who experience important detrimental mental health consequences. We conclude that there are likely no mental health benefits to policy and interventions that aim to prevent teen motherhood.
Affiliation(s)
- Martin O'Flaherty
  - Institute for Social Science Research and Australian Research Council Centre of Excellence for Children and Families Over the Life Course, The University of Queensland, Brisbane, Australia
- Sara Kalucza
  - Department of Sociology and Centre for Demographic and Ageing Research (CEDAR), Umeå University, Umeå, Sweden
- Joshua Bon
  - School of Mathematical Sciences and Centre for Data Science, Queensland University of Technology, Brisbane, Australia
9
Blette BS, Granholm A, Li F, Shankar-Hari M, Lange T, Munch MW, Møller MH, Perner A, Harhay MO. Causal Bayesian machine learning to assess treatment effect heterogeneity by dexamethasone dose for patients with COVID-19 and severe hypoxemia. Sci Rep 2023; 13:6570. [PMID: 37085591 PMCID: PMC10120498 DOI: 10.1038/s41598-023-33425-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 01/03/2023] [Accepted: 04/12/2023] [Indexed: 04/23/2023]
Abstract
The currently recommended dose of dexamethasone for patients with severe or critical COVID-19 is 6 mg per day (mg/d) regardless of patient features and variation. However, patients with severe or critical COVID-19 are heterogeneous in many ways (e.g., age, weight, comorbidities, disease severity, and immune features). Thus, it is conceivable that a standardized dosing protocol may not be optimal. We assessed treatment effect heterogeneity in the COVID STEROID 2 trial, which compared 6 mg/d to 12 mg/d, using a causal inference framework with Bayesian Additive Regression Trees, a flexible modeling method that detects interactive effects and nonlinear relationships among multiple patient characteristics simultaneously. We found that 12 mg/d of dexamethasone, relative to 6 mg/d, was probably associated with better long-term outcomes (days alive without life support and mortality after 90 days) among the entire trial population (i.e., no signals of harm), and probably more beneficial among those without diabetes mellitus, that were older, were not using IL-6 inhibitors at baseline, weighed less, or had higher level respiratory support at baseline. This adds more evidence supporting the use of 12 mg/d in practice for most patients not receiving other immunosuppressants and that additional study of dosing could potentially optimize clinical outcomes.
Affiliation(s)
- Bryan S Blette
  - Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
  - Clinical Trials Methods and Outcomes Lab, Palliative and Advanced Illness Research (PAIR) Center, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Anders Granholm
  - Department of Intensive Care, Rigshospitalet-Copenhagen University Hospital, Copenhagen, Denmark
  - Collaboration for Research in Intensive Care, Copenhagen, Denmark
- Fan Li
  - Department of Biostatistics, Yale University School of Public Health, New Haven, CT, USA
  - Center for Methods in Implementation and Prevention Science, Yale University School of Public Health, New Haven, CT, USA
- Manu Shankar-Hari
  - Centre for Inflammation Research, University of Edinburgh, Edinburgh, UK
- Theis Lange
  - Section of Biostatistics, Department of Public Health, University of Copenhagen, Copenhagen, Denmark
- Marie Warrer Munch
  - Department of Intensive Care, Rigshospitalet-Copenhagen University Hospital, Copenhagen, Denmark
  - Collaboration for Research in Intensive Care, Copenhagen, Denmark
- Morten Hylander Møller
  - Department of Intensive Care, Rigshospitalet-Copenhagen University Hospital, Copenhagen, Denmark
  - Collaboration for Research in Intensive Care, Copenhagen, Denmark
- Anders Perner
  - Department of Intensive Care, Rigshospitalet-Copenhagen University Hospital, Copenhagen, Denmark
  - Collaboration for Research in Intensive Care, Copenhagen, Denmark
- Michael O Harhay
  - Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
  - Clinical Trials Methods and Outcomes Lab, Palliative and Advanced Illness Research (PAIR) Center, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
  - Division of Pulmonary and Critical Care, Department of Medicine, Perelman School of Medicine, University of Pennsylvania, 304 Blockley Hall, 423 Guardian Drive, Philadelphia, PA, 19104-6021, USA
10
Josey KP, deSouza P, Wu X, Braun D, Nethery R. Estimating a Causal Exposure Response Function with a Continuous Error-Prone Exposure: A Study of Fine Particulate Matter and All-Cause Mortality. Journal of Agricultural, Biological, and Environmental Statistics 2023; 28:20-41. [PMID: 37063643 PMCID: PMC10103900 DOI: 10.1007/s13253-022-00508-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 09/27/2021] [Revised: 07/08/2022] [Accepted: 07/23/2022] [Indexed: 10/14/2022]
Abstract
Numerous studies have examined the associations between long-term exposure to fine particulate matter (PM2.5) and adverse health outcomes. Recently, many of these studies have begun to employ high-resolution predicted PM2.5 concentrations, which are subject to measurement error. Previous approaches for exposure measurement error correction have either been applied in non-causal settings or have only considered a categorical exposure. Moreover, most procedures have failed to account for uncertainty induced by error correction when fitting an exposure-response function (ERF). To remedy these deficiencies, we develop a multiple imputation framework that combines regression calibration and Bayesian techniques to estimate a causal ERF. We demonstrate how the output of the measurement error correction steps can be seamlessly integrated into a Bayesian additive regression trees (BART) estimator of the causal ERF. We also demonstrate how locally-weighted smoothing of the posterior samples from BART can be used to create a more accurate ERF estimate. Our proposed approach also properly propagates the exposure measurement error uncertainty to yield accurate standard error estimates. We assess the robustness of our proposed approach in an extensive simulation study. We then apply our methodology to estimate the effects of PM2.5 on all-cause mortality among Medicare enrollees in New England from 2000-2012.
Affiliation(s)
- Kevin P. Josey
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA
- Priyanka deSouza
  - Department of Urban and Regional Planning, University of Colorado, Denver, CO
- Xiao Wu
  - Department of Statistics, Stanford University, Stanford, CA
  - Stanford Data Science, Stanford University, Stanford, CA
- Danielle Braun
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA
  - Department of Data Science, Dana-Farber Cancer Institute, Boston, MA
- Rachel Nethery
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA
11
Ranjbar S, Salvati N, Pacini B. Estimating heterogeneous causal effects in observational studies using small area predictors. Comput Stat Data Anal 2023. [DOI: 10.1016/j.csda.2023.107742] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/19/2023]
12
Varga AN, Guevara Morel AE, Lokkerbol J, van Dongen JM, van Tulder MW, Bosmans JE. Dealing with confounding in observational studies: A scoping review of methods evaluated in simulation studies with single-point exposure. Stat Med 2023; 42:487-516. [PMID: 36562408 PMCID: PMC10107671 DOI: 10.1002/sim.9628] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 06/14/2021] [Revised: 11/22/2022] [Accepted: 12/01/2022] [Indexed: 12/24/2022]
Abstract
The aim of this article was to perform a scoping review of methods available for dealing with confounding when analyzing the effect of health care treatments with single-point exposure in observational data. We aim to provide an overview of methods and their performance assessed by simulation studies indexed in PubMed. We searched PubMed for simulation studies published until January 2021. Our search was restricted to studies evaluating binary treatments and binary and/or continuous outcomes. Information was extracted on the methods' assumptions, performance, and technical properties. Of 28,548 identified references, 127 studies were eligible for inclusion. Of them, 84 assessed 14 different methods (ie, groups of estimators that share assumptions and implementation) for dealing with measured confounding, and 43 assessed 10 different methods for dealing with unmeasured confounding. Results suggest that there are large differences in performance between methods and that the performance of a specific method is highly dependent on the estimator. Furthermore, the methods' assumptions regarding the specific data features also substantially influence the methods' performance. Finally, the methods result in different estimands (ie, target of inference), which can even vary within methods. In conclusion, when choosing a method to adjust for measured or unmeasured confounding it is important to choose the most appropriate estimand, while considering the population of interest, data structure, and whether the plausibility of the methods' required assumptions hold.
Affiliation(s)
- Anita Natalia Varga
  - Department of Health Sciences, Faculty of Science, Vrije Universiteit Amsterdam, Amsterdam Public Health Research Institute, The Netherlands
- Alejandra Elizabeth Guevara Morel
  - Department of Health Sciences, Faculty of Science, Vrije Universiteit Amsterdam, Amsterdam Public Health Research Institute, The Netherlands
- Joran Lokkerbol
  - Centre of Economic Evaluation, Trimbos Institute (Netherlands Institute of Mental Health), Utrecht, The Netherlands
- Johanna Maria van Dongen
  - Department of Health Sciences, Faculty of Science, Vrije Universiteit Amsterdam, Amsterdam Public Health Research Institute, The Netherlands
- Maurits Willem van Tulder
  - Department of Health Sciences, Faculty of Science, Vrije Universiteit Amsterdam, Amsterdam Public Health Research Institute, The Netherlands; Department Physiotherapy and Occupational Therapy, Aarhus University Hospital, Aarhus, Denmark
- Judith Ekkina Bosmans
  - Department of Health Sciences, Faculty of Science, Vrije Universiteit Amsterdam, Amsterdam Public Health Research Institute, The Netherlands
13
Choi Y, Gibson JR. The effect of COVID-19 on self-reported safety incidents in aviation: An examination of the heterogeneous effects using causal machine learning. JOURNAL OF SAFETY RESEARCH 2023; 84:393-403. [PMID: 36868668 PMCID: PMC9729650 DOI: 10.1016/j.jsr.2022.12.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2022] [Revised: 09/17/2022] [Accepted: 12/01/2022] [Indexed: 06/18/2023]
Abstract
INTRODUCTION Disruptions to aviation operations occur daily on a micro-level with negligible impacts beyond the inconvenience of rebooking and changing aircrew schedules. The unprecedented disruption in global aviation due to COVID-19 highlighted a need to evaluate emergent safety issues rapidly. METHOD This paper uses causal machine learning to examine the heterogeneous effects of COVID-19 on reported aircraft incursions/excursions. The analysis utilized self report data from NASA Aviation Safety Reporting System collected from 2018 to 2020. The report attributes include self identified group characteristics and expert categorization of factors and outcomes. The analysis identified attributes and subgroup characteristics that were most sensitive to COVID-19 in inducing incursions/excursions. The method included the generalized random forest and difference-in-difference techniques to explore causal effects. RESULTS The analysis indicates first officers are more prone to experiencing incursion/excursion events during the pandemic. In addition, events categorized with the human factors confusion, distraction, and the causal factor fatigue increased incursion/excursion events. PRACTICAL APPLICATIONS Understanding the attributes associated with the likelihood of incursion/excursion events provides policymakers and aviation organizations insights to improve prevention mechanisms for future pandemics or extended periods of reduced aviation operations.
Affiliation(s)
- Youngran Choi
  - David B. O'Maley College of Business, Embry-Riddle Aeronautical University, 1 Aerospace Boulevard Daytona Beach, FL 32114, United States.
- James R Gibson
  - College of Business, Embry-Riddle Aeronautical University, 1 Aerospace Boulevard Daytona Beach, FL 32114, United States.
14
Dorie V, Perrett G, Hill JL, Goodrich B. Stan and BART for Causal Inference: Estimating Heterogeneous Treatment Effects Using the Power of Stan and the Flexibility of Machine Learning. ENTROPY (BASEL, SWITZERLAND) 2022; 24:1782. [PMID: 36554187 PMCID: PMC9778579 DOI: 10.3390/e24121782] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2022] [Revised: 10/22/2022] [Accepted: 11/06/2022] [Indexed: 06/17/2023]
Abstract
A wide range of machine-learning-based approaches have been developed in the past decade, increasing our ability to accurately model nonlinear and nonadditive response surfaces. This has improved performance for inferential tasks such as estimating average treatment effects in situations where standard parametric models may not fit the data well. These methods have also shown promise for the related task of identifying heterogeneous treatment effects. However, the estimation of both overall and heterogeneous treatment effects can be hampered when data are structured within groups if we fail to correctly model the dependence between observations. Most machine learning methods do not readily accommodate such structure. This paper introduces a new algorithm, stan4bart, that combines the flexibility of Bayesian Additive Regression Trees (BART) for fitting nonlinear response surfaces with the computational and statistical efficiencies of using Stan for the parametric components of the model. We demonstrate how stan4bart can be used to estimate average, subgroup, and individual-level treatment effects with stronger performance than other flexible approaches that ignore the multilevel structure of the data as well as multilevel approaches that have strict parametric forms.
Affiliation(s)
- George Perrett
  - Department of Applied Statistics, Social Science, and the Humanities, New York University, New York, NY 10003, USA
- Jennifer L. Hill
  - Department of Applied Statistics, Social Science, and the Humanities, New York University, New York, NY 10003, USA
- Benjamin Goodrich
  - Department of Political Science, Columbia University, New York, NY 10025, USA
15
Emmert-Fees KMF, Capacci S, Sassi F, Mazzocchi M, Laxy M. Estimating the impact of nutrition and physical activity policies with quasi-experimental methods and simulation modelling: an integrative review of methods, challenges and synergies. Eur J Public Health 2022; 32:iv84-iv91. [PMID: 36444112 PMCID: PMC9706116 DOI: 10.1093/eurpub/ckac051] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/30/2022]
Abstract
BACKGROUND The promotion of healthy lifestyles has high priority on the global public health agenda. Evidence on the real-world (cost-)effectiveness of policies addressing nutrition and physical activity is needed. To estimate short-term policy impacts, quasi-experimental methods using observational data are useful, while simulation models can estimate long-term impacts. We review the methods, challenges and potential synergies of both approaches for the evaluation of nutrition and physical activity policies. METHODS We performed an integrative review applying purposive literature sampling techniques to synthesize original articles, systematic reviews and lessons learned from public international workshops conducted within the European Union Policy Evaluation Network. RESULTS We highlight data requirements for policy evaluations, discuss the distinct assumptions of instrumental variable, difference-in-difference, and regression discontinuity designs and describe the necessary robustness and falsification analyses to test them. Further, we summarize the specific assumptions of comparative risk assessment and Markov state-transition simulation models, including their extension to microsimulation. We describe the advantages and limitations of these modelling approaches and discuss future directions, such as the adequate consideration of heterogeneous policy responses. Finally, we highlight how quasi-experimental and simulation modelling methods can be integrated into an evidence cycle for policy evaluation. CONCLUSIONS Assumptions of quasi-experimental and simulation modelling methods in policy evaluations should be credible, rigorously tested and transparently communicated. Both approaches can be applied synergistically within a coherent framework to compare policy implementation scenarios and improve the estimation of nutrition and physical activity policy impacts, including their distribution across population sub-groups.
Affiliation(s)
- Karl M F Emmert-Fees
  - Correspondence: Karl M.F. Emmert-Fees, Institute of Health Economics and Health Care Management, Helmholtz Zentrum München, Ingolstädter Landstraße 1, 85764 Neuherberg, Germany, Tel: +49 89 3187-43709, e-mail:
- Sara Capacci
  - Department of Statistical Sciences, University of Bologna, Bologna, Italy
- Franco Sassi
  - Centre for Health Economics and Policy Innovation (CHEPI), Imperial College Business School, London, UK
16
Bargagli-Stoffi FJ, De Witte K, Gnecco G. Heterogeneous causal effects with imperfect compliance: A Bayesian machine learning approach. Ann Appl Stat 2022. [DOI: 10.1214/21-aoas1579] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
17
Brooks JM, Chapman CG, Floyd SB, Chen BK, Thigpen CA, Kissenberth M. Assessing the ability of an instrumental variable causal forest algorithm to personalize treatment evidence using observational data: the case of early surgery for shoulder fracture. BMC Med Res Methodol 2022; 22:190. [PMID: 35818028 PMCID: PMC9275148 DOI: 10.1186/s12874-022-01663-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/02/2022] [Accepted: 06/20/2022] [Indexed: 11/24/2022]
Abstract
Background Comparative effectiveness research (CER) using observational databases has been suggested to obtain personalized evidence of treatment effectiveness. Inferential difficulties remain using traditional CER approaches especially related to designating patients to reference classes a priori. A novel Instrumental Variable Causal Forest Algorithm (IV-CFA) has the potential to provide personalized evidence using observational data without designating reference classes a priori, but the consistency of the evidence when varying key algorithm parameters remains unclear. We investigated the consistency of IV-CFA estimates through application to a database of Medicare beneficiaries with proximal humerus fractures (PHFs) that previously revealed heterogeneity in the effects of early surgery using instrumental variable estimators. Methods IV-CFA was used to estimate patient-specific early surgery effects on both beneficial and detrimental outcomes using different combinations of algorithm parameters and estimate variation was assessed for a population of 72,751 fee-for-service Medicare beneficiaries with PHFs in 2011. Classification and regression trees (CART) were applied to these estimates to create ex-post reference classes and the consistency of these classes were assessed. Two-stage least squares (2SLS) estimators were applied to representative ex-post reference classes to scrutinize the estimates relative to known 2SLS properties. Results IV-CFA uncovered substantial early surgery effect heterogeneity across PHF patients, but estimates for individual patients varied with algorithm parameters. CART applied to these estimates revealed ex-post reference classes consistent across algorithm parameters. 2SLS estimates showed that ex-post reference classes containing older, frailer patients with more comorbidities, and lower utilizers of healthcare were less likely to benefit and more likely to have detriments from higher rates of early surgery. 
Conclusions IV-CFA provides an illuminating method to uncover ex-post reference classes of patients based on treatment effects using observational data with a strong instrumental variable. Interpretation of treatment effect estimates within each ex-post reference class using traditional CER methods remains conditional on the extent of measured information in the data. Supplementary Information The online version contains supplementary material available at 10.1186/s12874-022-01663-0.
Affiliation(s)
- John M Brooks
  - Center for Effectiveness Research in Orthopaedics - Arnold School of Public Health Greenville, 915 Greene Street #302D, 29208, Columbia, SC, 29208-0001, USA; Health Services Policy & Management, University of South Carolina Arnold School of Public Health, Columbia, USA.
- Cole G Chapman
  - Department of Pharmacy Practice and Science, University of Iowa, Iowa City, USA; Center for Effectiveness Research in Orthopaedics, Greenville, USA
- Sarah B Floyd
  - Center for Effectiveness Research in Orthopaedics, Greenville, USA; Clemson University College of Behavioral Social and Health Sciences, Public Health Sciences, Clemson, USA
- Brian K Chen
  - Health Services Policy & Management, University of South Carolina Arnold School of Public Health, Columbia, USA; Center for Effectiveness Research in Orthopaedics, Greenville, USA
- Charles A Thigpen
  - Center for Effectiveness Research in Orthopaedics, Greenville, USA; ATI Physical Therapy, Greenville, USA
- Michael Kissenberth
  - Center for Effectiveness Research in Orthopaedics, Greenville, USA; Prisma Health, Steadman Hawkins Clinic of the Carolinas, Greenville, USA
18
Shi J, Norgeot B. Learning Causal Effects From Observational Data in Healthcare: A Review and Summary. Front Med (Lausanne) 2022; 9:864882. [PMID: 35872797 PMCID: PMC9300826 DOI: 10.3389/fmed.2022.864882] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2022] [Accepted: 06/17/2022] [Indexed: 11/29/2022]
Abstract
Causal inference is a broad field that seeks to build and apply models that learn the effect of interventions on outcomes using many data types. While the field has existed for decades, its potential to impact healthcare outcomes has increased dramatically recently due to both advancements in machine learning and the unprecedented amounts of observational data resulting from electronic capture of patient claims data by medical insurance companies and widespread adoption of electronic health records (EHR) worldwide. However, there are many different schools of learning causality coming from different fields of statistics, some of them strongly conflicting. While the recent advances in machine learning greatly enhanced causal inference from a modeling perspective, it further exacerbated the fractured state in this field. This fractured state has limited research at the intersection of causal inference, modern machine learning, and EHRs that could potentially transform healthcare. In this paper we unify the classical causal inference approaches with new machine learning developments into a straightforward framework based on whether the researcher is most interested in finding the best intervention for an individual, a group of similar people, or an entire population. Through this lens, we then provide a timely review of the applications of causal inference in healthcare from the literature. As expected, we found that applications of causal inference in medicine were mostly limited to just a few technique types and lag behind other domains. In light of this gap, we offer a helpful schematic to guide data scientists and healthcare stakeholders in selecting appropriate causal methods and reviewing the findings generated by them.
19
Zhu J, Gallego B. Causal inference for observational longitudinal studies using deep survival models. J Biomed Inform 2022; 131:104119. [PMID: 35714819 DOI: 10.1016/j.jbi.2022.104119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2022] [Revised: 05/11/2022] [Accepted: 06/06/2022] [Indexed: 11/28/2022]
Abstract
OBJECTIVE Causal inference for observational longitudinal studies often requires the accurate estimation of treatment effects on time-to-event outcomes in the presence of time-dependent patient history and time-dependent covariates. MATERIALS AND METHODS To tackle this longitudinal treatment effect estimation problem, we have developed a time-variant causal survival (TCS) model that uses the potential outcomes framework with an ensemble of recurrent subnetworks to estimate the difference in survival probabilities and its confidence interval over time as a function of time-dependent covariates and treatments. RESULTS Using simulated survival datasets, the TCS model showed good causal effect estimation performance across scenarios of varying sample dimensions, event rates, confounding and overlapping. However, increasing the sample size was not effective in alleviating the adverse impact of a high level of confounding. In a large clinical cohort study, TCS identified the expected conditional average treatment effect and detected individual treatment effect heterogeneity over time. TCS provides an efficient way to estimate and update individualized treatment effects over time, in order to improve clinical decisions. DISCUSSION The use of a propensity score layer and potential outcome subnetworks helps correcting for selection bias. However, the proposed model is limited in its ability to correct the bias from unmeasured confounding, and more extensive testing of TCS under extreme scenarios such as low overlapping and the presence of unmeasured confounders is desired and left for future work. CONCLUSION TCS fills the gap in causal inference using deep learning techniques in survival analysis. It considers time-varying confounders and treatment options. Its treatment effect estimation can be easily compared with the conventional literature, which uses relative measures of treatment effect. 
We expect TCS will be particularly useful for identifying and quantifying treatment effect heterogeneity over time under the ever complex observational health care environment.
Affiliation(s)
- Jie Zhu
  - Centre for Big Data Research in Health (CBDRH), UNSW, Sydney, NSW 2052, Australia.
- Blanca Gallego
  - Centre for Big Data Research in Health (CBDRH), UNSW, Sydney, NSW 2052, Australia.
20
Generalizability of heterogeneous treatment effects based on causal forests applied to two randomized clinical trials of intensive glycemic control. Ann Epidemiol 2022; 65:101-108. [PMID: 34280545 PMCID: PMC8748294 DOI: 10.1016/j.annepidem.2021.07.003] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 02/17/2021] [Revised: 06/04/2021] [Accepted: 07/09/2021] [Indexed: 01/03/2023]
Abstract
Purpose Machine learning is an attractive tool for identifying heterogeneous treatment effects (HTE) of interventions but generalizability of machine learning derived HTE remains unclear. We examined generalizability of HTE detected using causal forests in two similarly designed randomized trials in type II diabetes patients. Methods We evaluated published HTE of intensive versus standard glycemic control on all-cause mortality from the Action to Control Cardiovascular Risk in Diabetes study (ACCORD) in a second trial, the Veterans Affairs Diabetes Trial (VADT). We then applied causal forests to VADT, ACCORD, and pooled data from both studies and compared variable importance and subgroup effects across samples. Results HTE in ACCORD did not replicate in similar subgroups in VADT, but variable importance was correlated between VADT and ACCORD (Kendall's tau-b 0.75). Applying causal forests to pooled individual-level data yielded seven subgroups with similar HTE across both studies, ranging from risk difference of all-cause mortality of -3.9% (95% CI -7.0, -0.8) to 4.7% (95% CI 1.8, 7.5). Conclusions Machine learning detection of HTE subgroups from randomized trials may not generalize across study samples even when variable importance is correlated. Pooling individual-level data may overcome differences in study populations and/or differences in interventions that limit HTE generalizability.
Key Words
- Generalizability, Glycemic control, Causal forests, Heterogeneous treatment effects
Abbreviations
- ACCORD, Action to Control Cardiovascular Risk in Diabetes Study
- BMI, Body mass index
- eGFR, Estimated glomerular filtration rate
- HbA1c, Hemoglobin A1c
- HGI, Hemoglobin glycation index
- HTE, Heterogeneous treatment effects
- VADT, Veterans Affairs Diabetes Trial
21
Meid AD, Wirbka L, Groll A, Haefeli WE. Can Machine Learning from Real-World Data Support Drug Treatment Decisions? A Prediction Modeling Case for Direct Oral Anticoagulants. Med Decis Making 2021; 42:587-598. [PMID: 34911402 PMCID: PMC9189725 DOI: 10.1177/0272989x211064604] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/16/2022]
Abstract
BACKGROUND Decision making for the "best" treatment is particularly challenging in situations in which individual patient response to drugs can largely differ from average treatment effects. By estimating individual treatment effects (ITEs), we aimed to demonstrate how strokes, major bleeding events, and a composite of both could be reduced by model-assisted recommendations for a particular direct oral anticoagulant (DOAC). METHODS In German claims data for the calendar years 2014-2018, we selected 29 901 new users of the DOACs rivaroxaban and apixaban. Random forests considered binary events within 1 y to estimate ITEs under each DOAC according to the X-learner algorithm with 29 potential effect modifiers; treatment recommendations were based on these estimated ITEs. Model performance was evaluated by the c-for-benefit statistics, absolute risk reduction (ARR), and absolute risk difference (ARD) by trial emulation. RESULTS A significant proportion of patients would be recommended a different treatment option than they actually received. The stroke model significantly discriminated patients for higher benefit and thus indicated improved decisions by reduced outcomes (c-for-benefit: 0.56; 95% confidence interval [0.52; 0.60]). In the group with apixaban recommendation, the model also improved the composite endpoint (ARR: 1.69 % [0.39; 2.97]). In trial emulations, model-assisted recommendations significantly reduced the composite event rate (ARD: -0.78 % [-1.40; -0.03]). CONCLUSIONS If prescribers are undecided about the potential benefits of different treatment options, ITEs can support decision making, especially if evidence is inconclusive, risk-benefit profiles of therapeutic alternatives differ significantly, and the patients' complexity deviates from "typical" study populations. 
In the exemplary case for DOACs and potentially in other situations, the significant impact could also become practically relevant if recommendations were available in an automated way as part of decision making.
Highlights
- It was possible to calculate individual treatment effects (ITEs) from routine claims data for rivaroxaban and apixaban, and the characteristics between the groups with recommendation for one or the other option differed significantly.
- ITEs resulted in recommendations that were significantly superior to usual (observed) treatment allocations in terms of absolute risk reduction, both separately for stroke and in the composite endpoint of stroke and major bleeding.
- When similar patients from routine data were selected (precision cohorts) for patients with a strong recommendation for one option or the other, those similar patients under the respective recommendation showed a significantly better prognosis compared with the alternative option.
- Many steps may still be needed on the way to clinical practice, but the principle of decision support developed from routine data may point the way toward future decision-making processes.
Affiliation(s)
- Andreas D Meid
  - Department of Clinical Pharmacology and Pharmacoepidemiology, University of Heidelberg, Heidelberg, Germany
- Lucas Wirbka
  - Department of Clinical Pharmacology and Pharmacoepidemiology, University of Heidelberg, Heidelberg, Germany
- Andreas Groll
  - Department of Statistics, TU Dortmund University, Dortmund, Germany
- Walter E Haefeli
  - Department of Clinical Pharmacology and Pharmacoepidemiology, University of Heidelberg, Heidelberg, Germany
22
Koutroulis G, Botler L, Mutlu B, Diwold K, Römer K, Kern R. KOMPOS: Connecting Causal Knots in Large Nonlinear Time Series with Non-Parametric Regression Splines. ACM T INTEL SYST TEC 2021. [DOI: 10.1145/3480971] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/19/2022]
Abstract
Recovering causality from copious time series data beyond mere correlations has been an important contributing factor in numerous scientific fields. Most existing works assume linearity in the data that may not comply with many real-world scenarios. Moreover, it is usually not sufficient to solely infer the causal relationships. Identifying the correct time delay of cause-effect is extremely vital for further insight and effective policies in inter-disciplinary domains. To bridge this gap, we propose KOMPOS, a novel algorithmic framework that combines a powerful concept from causal discovery of additive noise models with graphical ones. We primarily build our structural causal model from multivariate adaptive regression splines with inherent additive local nonlinearities, which render the underlying causal structure more easily identifiable. In contrast to other methods, our approach is not restricted to Gaussian or non-Gaussian noise due to the non-parametric attribute of the regression method. We conduct extensive experiments on both synthetic and real-world datasets, demonstrating the superiority of the proposed algorithm over existing causal discovery methods, especially for the challenging cases of autocorrelated and non-stationary time series.
Affiliation(s)
- Leo Botler
  - Graz University of Technology, Graz, Austria
- Konrad Diwold
  - Pro2Future GmbH and Graz University of Technology, Graz, Austria
- Kay Römer
  - Graz University of Technology, Graz, Austria
- Roman Kern
  - Graz University of Technology, Graz, Austria
23
Zhang Y, Sabbaghi A. The Designed Bootstrap for Causal Inference in Big Observational Data. JOURNAL OF STATISTICAL THEORY AND PRACTICE 2021. [DOI: 10.1007/s42519-021-00213-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/27/2022]
24
Weissler EH, Naumann T, Andersson T, Ranganath R, Elemento O, Luo Y, Freitag DF, Benoit J, Hughes MC, Khan F, Slater P, Shameer K, Roe M, Hutchison E, Kollins SH, Broedl U, Meng Z, Wong JL, Curtis L, Huang E, Ghassemi M. The role of machine learning in clinical research: transforming the future of evidence generation. Trials 2021; 22:537. [PMID: 34399832 PMCID: PMC8365941 DOI: 10.1186/s13063-021-05489-x] [Citation(s) in RCA: 48] [Impact Index Per Article: 16.0] [Received: 04/30/2021] [Accepted: 07/26/2021] [Indexed: 12/13/2022]
Abstract
Background Interest in the application of machine learning (ML) to the design, conduct, and analysis of clinical trials has grown, but the evidence base for such applications has not been surveyed. This manuscript reviews the proceedings of a multi-stakeholder conference to discuss the current and future state of ML for clinical research. Key areas of clinical trial methodology in which ML holds particular promise and priority areas for further investigation are presented alongside a narrative review of evidence supporting the use of ML across the clinical trial spectrum. Results Conference attendees included stakeholders, such as biomedical and ML researchers, representatives from the US Food and Drug Administration (FDA), artificial intelligence technology and data analytics companies, non-profit organizations, patient advocacy groups, and pharmaceutical companies. ML contributions to clinical research were highlighted in the pre-trial phase, cohort selection and participant management, and data collection and analysis. A particular focus was paid to the operational and philosophical barriers to ML in clinical research. Peer-reviewed evidence was noted to be lacking in several areas. Conclusions ML holds great promise for improving the efficiency and quality of clinical research, but substantial barriers remain, the surmounting of which will require addressing significant gaps in evidence.
Affiliation(s)
- E Hope Weissler
  - Duke Clinical Research Institute, Duke University School of Medicine, Box 2834, Durham, NC, 27701, USA.
- Rajesh Ranganath
  - Courant Institute of Mathematical Science, New York University, New York, NY, USA
- Olivier Elemento
  - Englander Institute for Precision Medicine, Weill Cornell Medical College, New York, NY, USA
- Yuan Luo
  - Northwestern University Clinical and Translational Sciences Institute, Northwestern University, Chicago, IL, USA
- Daniel F Freitag
  - Division Pharmaceuticals, Open Innovation and Digital Technologies, Bayer AG, Wuppertal, Germany
- James Benoit
  - University of Alberta, Edmonton, Alberta, Canada
- Michael C Hughes
  - Department of Computer Science, Tufts University, Medford, MA, USA
- Scott H Kollins
  - Duke Clinical Research Institute, Duke University School of Medicine, Box 2834, Durham, NC, 27701, USA
- Uli Broedl
  - Boehringer-Ingelheim, Burlington, Canada
- Lesley Curtis
  - Duke Clinical Research Institute, Duke University School of Medicine, Box 2834, Durham, NC, 27701, USA
- Erich Huang
  - Duke Clinical Research Institute, Duke University School of Medicine, Box 2834, Durham, NC, 27701, USA; Duke Forge, Durham, NC, USA
- Marzyeh Ghassemi
  - Vector Institute, University of Toronto, Toronto, Ontario, Canada; Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, Massachusetts, 02139, USA; Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, Massachusetts, 02139, USA; CIFAR AI Chair, Vector Institute, Toronto, Ontario, Canada
25
Meid AD, Ruff C, Wirbka L, Stoll F, Seidling HM, Groll A, Haefeli WE. Using the Causal Inference Framework to Support Individualized Drug Treatment Decisions Based on Observational Healthcare Data. Clin Epidemiol 2020; 12:1223-1234. [PMID: 33173350 PMCID: PMC7646479 DOI: 10.2147/clep.s274466] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/11/2020] [Accepted: 10/08/2020] [Indexed: 01/02/2023]
Abstract
When healthcare professionals have the choice between several drug treatments for their patients, they often experience considerable decision uncertainty because many decisions simply have no single “best” choice. The challenges are manifold and include that guideline recommendations focus on randomized controlled trials whose populations do not necessarily correspond to specific patients in everyday treatment. Further reasons may be insufficient evidence on outcomes, lack of direct comparison of distinct options, and the need to individually balance benefits and risks. All these situations will occur in routine care, its outcomes will be mirrored in routine data, and could thus be used to guide decisions. We propose a concept to facilitate decision-making by exploiting this wealth of information. Our working example for illustration assumes that the response to a particular (drug) treatment can substantially differ between individual patients depending on their characteristics (heterogeneous treatment effects, HTE), and that decisions will be more precise if they are based on real-world evidence of HTE considering this information. However, such methods must account for confounding by indication and effect measure modification, eg, by adequately using machine learning methods or parametric regressions to estimate individual responses to pharmacological treatments. The better a model assesses the underlying HTE, the more accurate are predicted probabilities of treatment response. After probabilities for treatment-related benefit and harm have been calculated, decision rules can be applied and patient preferences can be considered to provide individual recommendations. Emulated trials in observational data are a straightforward technique to predict the effects of such decision rules when applied in routine care. 
Prediction-based decision rules from routine data have the potential to efficiently supplement clinical guidelines and support healthcare professionals in creating personalized treatment plans using decision support tools.
Affiliation(s)
- Andreas D Meid
  - Department of Clinical Pharmacology and Pharmacoepidemiology, University of Heidelberg, Heidelberg 69120, Germany
- Carmen Ruff
  - Department of Clinical Pharmacology and Pharmacoepidemiology, University of Heidelberg, Heidelberg 69120, Germany
- Lucas Wirbka
  - Department of Clinical Pharmacology and Pharmacoepidemiology, University of Heidelberg, Heidelberg 69120, Germany
- Felicitas Stoll
  - Department of Clinical Pharmacology and Pharmacoepidemiology, University of Heidelberg, Heidelberg 69120, Germany
- Hanna M Seidling
  - Department of Clinical Pharmacology and Pharmacoepidemiology, University of Heidelberg, Heidelberg 69120, Germany
  - Cooperation Unit Clinical Pharmacy, University of Heidelberg, Heidelberg 69120, Germany
- Andreas Groll
  - Department of Statistics, TU Dortmund University, Dortmund 44227, Germany
- Walter E Haefeli
  - Department of Clinical Pharmacology and Pharmacoepidemiology, University of Heidelberg, Heidelberg 69120, Germany
  - Cooperation Unit Clinical Pharmacy, University of Heidelberg, Heidelberg 69120, Germany
|
26
|
Nethery RC, Mealli F, Sacks JD, Dominici F. Evaluation of the health impacts of the 1990 Clean Air Act Amendments using causal inference and machine learning. J Am Stat Assoc 2020; 1:1-12. [PMID: 33424062 PMCID: PMC7788006 DOI: 10.1080/01621459.2020.1803883] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/21/2019] [Revised: 07/13/2020] [Accepted: 07/20/2020] [Indexed: 10/23/2022]
Abstract
We develop a causal inference approach to estimate the number of adverse health events that were prevented due to changes in exposure to multiple pollutants attributable to a large-scale air quality intervention/regulation, with a focus on the 1990 Clean Air Act Amendments (CAAA). We introduce a causal estimand called the Total Events Avoided (TEA) by the regulation, defined as the difference in the number of health events expected under the no-regulation pollution exposures and the number observed with-regulation. We propose matching and machine learning methods that leverage population-level pollution and health data to estimate the TEA. Our approach improves upon traditional methods for regulation health impact analyses by formalizing causal identifying assumptions, utilizing population-level data, minimizing parametric assumptions, and collectively analyzing multiple pollutants. To reduce model-dependence, our approach estimates cumulative health impacts in the subset of regions with projected no-regulation features lying within the support of the observed with-regulation data, thereby providing a conservative but data-driven assessment to complement traditional parametric approaches. We analyze the health impacts of the CAAA in the US Medicare population in the year 2000, and our estimates suggest that large numbers of cardiovascular and dementia-related hospitalizations were avoided due to CAAA-attributable changes in pollution exposure.
Affiliation(s)
- Rachel C Nethery
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health
- Fabrizia Mealli
  - Department of Statistics, Computer Science, Applications, University of Florence
- Jason D Sacks
  - National Center for Environmental Assessment, Office of Research and Development, U.S. Environmental Protection Agency
|
27
|
Arbet J, Brokamp C, Meinzen-Derr J, Trinkley KE, Spratt HM. Lessons and tips for designing a machine learning study using EHR data. J Clin Transl Sci 2020; 5:e21. [PMID: 33948244 PMCID: PMC8057454 DOI: 10.1017/cts.2020.513] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 03/18/2020] [Revised: 06/18/2020] [Accepted: 07/13/2020] [Indexed: 02/08/2023]
Abstract
Machine learning (ML) provides the ability to examine massive datasets and uncover patterns within data without relying on a priori assumptions such as specific variable associations, linearity in relationships, or prespecified statistical interactions. However, the application of ML to healthcare data has been met with mixed results, especially when using administrative datasets such as the electronic health record. The black box nature of many ML algorithms contributes to an erroneous assumption that these algorithms can overcome major data issues inherent in large administrative healthcare data. As with other research endeavors, good data and analytic design is crucial to ML-based studies. In this paper, we will provide an overview of common misconceptions for ML, the corresponding truths, and suggestions for incorporating these methods into healthcare research while maintaining a sound study design.
Affiliation(s)
- Jaron Arbet
  - Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado-Denver Anschutz Medical Campus, Aurora, CO, USA
- Cole Brokamp
  - Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
  - Division of Biostatistics and Epidemiology, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH, USA
- Jareen Meinzen-Derr
  - Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
  - Division of Biostatistics and Epidemiology, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH, USA
- Katy E. Trinkley
  - Department of Clinical Pharmacy, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado, Aurora, CO, USA
  - Department of Medicine, School of Medicine, University of Colorado, Aurora, CO, USA
- Heidi M. Spratt
  - Department of Preventive Medicine and Population Health, University of Texas Medical Branch, Galveston, TX, USA
|
28
|
Yadlowsky S, Pellegrini F, Lionetto F, Braune S, Tian L. Estimation and Validation of Ratio-based Conditional Average Treatment Effects Using Observational Data. J Am Stat Assoc 2020; 116:335-352. [PMID: 33767517 PMCID: PMC7985957 DOI: 10.1080/01621459.2020.1772080] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 06/29/2019] [Revised: 04/20/2020] [Accepted: 05/16/2020] [Indexed: 10/24/2022]
Abstract
While sample sizes in randomized clinical trials are large enough to estimate the average treatment effect well, they are often insufficient for estimation of treatment-covariate interactions critical to studying data-driven precision medicine. Observational data from real world practice may play an important role in alleviating this problem. One common approach in trials is to predict the outcome of interest with separate regression models in each treatment arm, and estimate the treatment effect based on the contrast of the predictions. Unfortunately, this simple approach may induce spurious treatment-covariate interaction in observational studies when the regression model is misspecified. Motivated by the need of modeling the number of relapses in multiple sclerosis patients, where the ratio of relapse rates is a natural choice of the treatment effect, we propose to estimate the conditional average treatment effect (CATE) as the ratio of expected potential outcomes, and derive a doubly robust estimator of this CATE in a semiparametric model of treatment-covariate interactions. We also provide a validation procedure to check the quality of the estimator on an independent sample. We conduct simulations to demonstrate the finite sample performance of the proposed methods, and illustrate their advantages on real data by examining the treatment effect of dimethyl fumarate compared to teriflunomide in multiple sclerosis patients.
Affiliation(s)
- Steve Yadlowsky
  - Stanford University, Electrical Engineering, 1265 Welch Rd, Stanford, 94305-6104 United States
- Stefan Braune
  - NeuroTransData, Neurology, Neuburg an der Donau, Germany
- Lu Tian
  - Stanford University, Department of Biomedical Data Science, Stanford, 94305-6104 United States
|
29
|
Zhu J, Gallego B. Targeted estimation of heterogeneous treatment effect in observational survival analysis. J Biomed Inform 2020; 107:103474. [DOI: 10.1016/j.jbi.2020.103474] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 03/21/2020] [Revised: 05/29/2020] [Accepted: 06/02/2020] [Indexed: 10/24/2022]
|
30
|
Bica I, Alaa AM, Lambert C, van der Schaar M. From Real-World Patient Data to Individualized Treatment Effects Using Machine Learning: Current and Future Methods to Address Underlying Challenges. Clin Pharmacol Ther 2020; 109:87-100. [PMID: 32449163 DOI: 10.1002/cpt.1907] [Citation(s) in RCA: 60] [Impact Index Per Article: 15.0] [Received: 11/05/2019] [Accepted: 05/14/2020] [Indexed: 12/21/2022]
Abstract
Clinical decision making needs to be supported by evidence that treatments are beneficial to individual patients. Although randomized control trials (RCTs) are the gold standard for testing and introducing new drugs, due to the focus on specific questions with respect to establishing efficacy and safety vs. standard treatment, they do not provide a full characterization of the heterogeneity in the final intended treatment population. Conversely, real-world observational data, such as electronic health records (EHRs), contain large amounts of clinical information about heterogeneous patients and their response to treatments. In this paper, we introduce the main opportunities and challenges in using observational data for training machine learning methods to estimate individualized treatment effects and make treatment recommendations. We describe the modeling choices of the state-of-the-art machine learning methods for causal inference, developed for estimating treatment effects both in the cross-section and longitudinal settings. Additionally, we highlight future research directions that could lead to achieving the full potential of leveraging EHRs and machine learning for making individualized treatment recommendations. We also discuss how experimental data from RCTs and Pharmacometric and Quantitative Systems Pharmacology approaches can be used to not only improve machine learning methods, but also provide ways for validating them. These future research directions will require us to collaborate across the scientific disciplines to incorporate models based on RCTs and known disease processes, physiology, and pharmacology into these machine learning models based on EHRs to fully optimize the opportunity these data present.
Affiliation(s)
- Ioana Bica
  - University of Oxford, Oxford, UK
  - The Alan Turing Institute, London, UK
- Ahmed M Alaa
  - University of California - Los Angeles, Los Angeles, California, USA
- Craig Lambert
  - Clinical Pharmacology and Safety Sciences, Research and Development, AstraZeneca, Cambridge, UK
- Mihaela van der Schaar
  - The Alan Turing Institute, London, UK
  - University of California - Los Angeles, Los Angeles, California, USA
  - University of Cambridge, Cambridge, UK
|
31
|
Replicator degrees of freedom allow publication of misleading failures to replicate. Proc Natl Acad Sci U S A 2019; 116:25535-25545. [PMID: 31767750 PMCID: PMC6925985 DOI: 10.1073/pnas.1910951116] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Indexed: 01/20/2023]
Abstract
We show that commonly exercised flexibility at the experimental design and data analysis stages of replication testing makes it easy to publish false-negative replication results while maintaining the impression of methodological rigor. These findings have important implications for how the many ostensible nonreplications already in the literature should be interpreted and for how future replication tests should be conducted. In recent years, the field of psychology has begun to conduct replication tests on a large scale. Here, we show that “replicator degrees of freedom” make it far too easy to obtain and publish false-negative replication results, even while appearing to adhere to strict methodological standards. Specifically, using data from an ongoing debate, we show that commonly exercised flexibility at the experimental design and data analysis stages of replication testing can make it appear that a finding was not replicated when, in fact, it was. The debate that we focus on is representative, on key dimensions, of a large number of other replication tests in psychology that have been published in recent years, suggesting that the lessons of this analysis may be far reaching. The problems with current practice in replication science that we uncover here are particularly worrisome because they are not adequately addressed by the field’s standard remedies, including preregistration. Implications for how the field could develop more effective methodological standards for replication are discussed.
|