1
Chen S, Haziza D. A unified framework of multiply robust estimation approaches for handling incomplete data. Comput Stat Data Anal 2023; 179:107646. [PMID: 38736662 PMCID: PMC11087063 DOI: 10.1016/j.csda.2022.107646] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022]
Abstract
Missing data occur frequently in practice. Inverse probability weighting and imputation are regarded as two important approaches for handling missing data. However, the validity of these approaches depends on underlying model assumptions. A new general framework for multiply robust estimation procedures by combining multiple nonresponse and imputation models is proposed in the paper. The proposed method can be used to estimate both smooth and non-smooth parameters defined as the solution of some estimating equations. It includes population means, quantiles, and distribution functions as special cases. The asymptotic results of the proposed methods are established. The results of a simulation study and a real data application suggest that the proposed methods perform well in terms of bias and efficiency.
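As a simplified illustration of how a nonresponse model and an imputation model can be combined, the sketch below computes the classical augmented inverse probability weighted (doubly robust) estimate of a population mean. It is not the multiply robust procedure of the paper: it uses a single model of each kind, and the function names (`fit_logistic`, `aipw_mean`), the simulated data, and the logistic and linear working models are all assumptions made for the example.

```python
import numpy as np

def fit_logistic(X, r, n_iter=25):
    """Fit a logistic regression by Newton-Raphson; X includes an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        grad = X.T @ (r - p)
        hess = X.T @ (X * w[:, None])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def aipw_mean(y, r, X):
    """Augmented IPW estimate of E[Y] when y is observed only where r == 1."""
    # Nonresponse (propensity) model: P(R = 1 | X), hypothetical logistic form.
    pi_hat = 1.0 / (1.0 + np.exp(-X @ fit_logistic(X, r)))
    # Imputation (outcome) model: least squares fitted on respondents only.
    obs = r == 1
    gamma, *_ = np.linalg.lstsq(X[obs], y[obs], rcond=None)
    m_hat = X @ gamma
    # Zero out unobserved outcomes so they never enter the estimate.
    y_obs = np.where(obs, y, 0.0)
    return np.mean(m_hat + r * (y_obs - m_hat) / pi_hat)

# Simulated example (assumed data-generating process, true mean of y is 1.0).
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + rng.normal(size=n)
# Response depends on x only, so data are missing at random given x.
r = rng.binomial(1, 1.0 / (1.0 + np.exp(-(0.5 + x))))

estimate = aipw_mean(y, r, X)   # combines both working models
naive = y[r == 1].mean()        # respondent mean, biased under this design
```

In this design the respondent mean overstates the true mean because units with larger `x` respond more often, while the combined estimator remains consistent if either working model is correct. A multiply robust estimator in the sense of the abstract would accept several candidate models of each kind rather than one.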
Affiliation(s)
- Sixia Chen
  - Department of Biostatistics and Epidemiology, University of Oklahoma Health Sciences Center, 801 NE 13th ST, Oklahoma City, 73104, Oklahoma, USA
- David Haziza
  - Department of Mathematics and Statistics, University of Ottawa, 150 Louis-Pasteur Private, Ottawa, K1N 6N5, Ontario, Canada
2
Wen L, Marcus JL, Young JG. Intervention treatment distributions that depend on the observed treatment process and model double robustness in causal survival analysis. Stat Methods Med Res 2023; 32:509-523. [PMID: 36597699 PMCID: PMC9983057 DOI: 10.1177/09622802221146311] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/05/2023]
Abstract
The generalized g-formula can be used to estimate the probability of survival under a sustained treatment strategy. When treatment strategies are deterministic, estimators derived from the so-called efficient influence function (EIF) for the g-formula will be doubly robust to model misspecification. In recent years, several practical applications have motivated estimation of the g-formula under non-deterministic treatment strategies where treatment assignment at each time point depends on the observed treatment process. In this case, EIF-based estimators may or may not be doubly robust. In this paper, we provide sufficient conditions to ensure the existence of doubly robust estimators for intervention treatment distributions that depend on the observed treatment process for point treatment interventions and give a class of intervention treatment distributions dependent on the observed treatment process that guarantee model doubly and multiply robust estimators in longitudinal settings. Motivated by an application to pre-exposure prophylaxis (PrEP) initiation studies, we propose a new treatment intervention dependent on the observed treatment process. We show there exist (1) estimators that are doubly and multiply robust to model misspecification and (2) estimators that when used with machine learning algorithms can attain fast convergence rates for our proposed intervention. Finally, we explore the finite sample performance of our estimators via simulation studies.
Affiliation(s)
- Lan Wen
  - Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, ON, Canada
- Julia L Marcus
  - Department of Population Medicine, Harvard Medical School, Boston, MA, USA
- Jessica G Young
  - Department of Population Medicine, Harvard Medical School, Boston, MA, USA
3
Wen L, Hernán MA, Robins JM. Multiply robust estimators of causal effects for survival outcomes. Scand Stat Theory Appl 2022; 49:1304-1328. [PMID: 36033967 PMCID: PMC9401091 DOI: 10.1111/sjos.12561] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2020] [Accepted: 08/30/2021] [Indexed: 11/27/2022]
Abstract
Multiply robust estimators of the longitudinal g-formula have recently been proposed to protect against model misspecification better than the standard augmented inverse probability weighted estimator (Rotnitzky et al., 2017; Luedtke et al., 2018). These multiply robust estimators ensure consistency if one of the models for the treatment process or outcome process is correctly specified at each time point. We study the multiply robust estimators of Rotnitzky et al. (2017) in the context of a survival outcome. Specifically, we compare various estimators of the g-formula for survival outcomes in order to 1) understand how the estimators may be related to one another, 2) understand each estimator's robustness to model misspecification, and 3) construct estimators that can be more efficient than others in certain model misspecification scenarios. We propose a modification of the multiply robust estimators to gain efficiency under misspecification of the outcome model by using calibrated propensity scores over non-calibrated propensity scores at each time point. Theoretical results are confirmed via simulation studies, and a practical comparison of these estimators is conducted through an application to the US Veterans Aging Cohort Study.
Affiliation(s)
- Lan Wen
  - Department of Epidemiology, Harvard T. H. Chan School of Public Health
  - CAUSALab, Harvard T.H. Chan School of Public Health
- Miguel A Hernán
  - Department of Epidemiology, Harvard T. H. Chan School of Public Health
  - CAUSALab, Harvard T.H. Chan School of Public Health
  - Department of Biostatistics, Harvard T. H. Chan School of Public Health
- James M Robins
  - Department of Epidemiology, Harvard T. H. Chan School of Public Health
  - CAUSALab, Harvard T.H. Chan School of Public Health
  - Department of Biostatistics, Harvard T. H. Chan School of Public Health
4
Zhang S, Han P, Wu C. Calibration Techniques Encompassing Survey Sampling, Missing Data Analysis and Causal Inference. Int Stat Rev 2022. [DOI: 10.1111/insr.12518] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/15/2022]
Affiliation(s)
- Shixiao Zhang
  - Department of Mathematics and Statistics, University of Massachusetts Amherst, Amherst, Massachusetts, USA
- Peisong Han
  - Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, USA
- Changbao Wu
  - Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Ontario, Canada
5
Zhou X. Semiparametric estimation for causal mediation analysis with multiple causally ordered mediators. J R Stat Soc Series B Stat Methodol 2021. [DOI: 10.1111/rssb.12487] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/29/2022]
Affiliation(s)
- Xiang Zhou
  - Harvard University, Cambridge, Massachusetts, USA
6
Díaz I, Williams N, Hoffman KL, Schenck EJ. Nonparametric Causal Effects Based on Longitudinal Modified Treatment Policies. J Am Stat Assoc 2021. [DOI: 10.1080/01621459.2021.1955691] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 10/20/2022]
Affiliation(s)
- Iván Díaz
  - Division of Biostatistics, Department of Population Health Sciences, Weill Cornell Medicine, New York
- Nicholas Williams
  - Division of Biostatistics, Department of Population Health Sciences, Weill Cornell Medicine, New York
- Katherine L. Hoffman
  - Division of Biostatistics, Department of Population Health Sciences, Weill Cornell Medicine, New York
- Edward J. Schenck
  - Division of Pulmonary & Critical Care Medicine, Department of Medicine, Weill Cornell Medicine, New York
7
Mo W, Qi Z, Liu Y. Rejoinder: Learning Optimal Distributionally Robust Individualized Treatment Rules. J Am Stat Assoc 2021; 116:699-707. [PMID: 34177008 PMCID: PMC8221610 DOI: 10.1080/01621459.2020.1866581] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2020] [Accepted: 12/12/2020] [Indexed: 10/21/2022]
Abstract
We thank the opportunity offered by editors for this discussion and the discussants for their insightful comments and thoughtful contributions. We also want to congratulate Kallus (2020) for his inspiring work in improving the efficiency of policy learning by retargeting. Motivated from the discussion in Dukes and Vansteelandt (2020), we first point out interesting connections and distinctions between our work and Kallus (2020) in Section 1. In particular, the assumptions and sources of variation for consideration in these two papers lead to different research problems with different scopes and focuses. In Section 2, following the discussions in Li et al. (2020); Liang and Zhao (2020), we also consider the efficient policy evaluation problem when we have some data from the testing distribution available at the training stage. We show that under the assumption that the sample sizes from training and testing are growing in the same order, efficient value function estimates can deliver competitive performance. We further show some connections of these estimates with existing literature. However, when the growth of testing sample size available for training is in a slower order, efficient value function estimates may not perform well anymore. In contrast, the requirement of the testing sample size for DRITR is not as strong as that of efficient policy evaluation using the combined data. Finally, we highlight the general applicability and usefulness of DRITR in Section 3.
Affiliation(s)
- Weibin Mo
  - Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Zhengling Qi
  - Department of Decision Sciences, George Washington University, Washington, D.C. 20052, USA
- Yufeng Liu
  - Department of Statistics and Operations Research, Department of Genetics, Department of Biostatistics, Carolina Center for Genome Science, Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, NC 27599, USA
8
Yang S, Pieper K, Cools F. Semiparametric estimation of structural failure time models in continuous-time processes. Biometrika 2020; 107:123-136. [PMID: 33162561 DOI: 10.1093/biomet/asz057] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/26/2022]
Abstract
Structural failure time models are causal models for estimating the effect of time-varying treatments on a survival outcome. G-estimation and artificial censoring have been proposed for estimating the model parameters in the presence of time-dependent confounding and administrative censoring. However, most existing methods require manually pre-processing data into regularly spaced data, which may invalidate the subsequent causal analysis. Moreover, the computation and inference are challenging due to the nonsmoothness of artificial censoring. We propose a class of continuous-time structural failure time models that respects the continuous-time nature of the underlying data processes. Under a martingale condition of no unmeasured confounding, we show that the model parameters are identifiable from a potentially infinite number of estimating equations. Using the semiparametric efficiency theory, we derive the first semiparametric doubly robust estimators, which are consistent if the model for the treatment process or the failure time model, but not necessarily both, is correctly specified. Moreover, we propose using inverse probability of censoring weighting to deal with dependent censoring. In contrast to artificial censoring, our weighting strategy does not introduce nonsmoothness in estimation and ensures that resampling methods can be used for inference.
Affiliation(s)
- S Yang
  - Department of Statistics, North Carolina State University, 2311 Stinson Drive, Raleigh, North Carolina 27695, U.S.A
- K Pieper
  - Duke Clinical Research Institute, Duke University, 300 W. Morgan Street, Durham, North Carolina 27705, U.S.A
- F Cools
  - Department of Cardiology, AZ Klina, Augustijnslei 100, 2930 Brasschaat, Belgium
9
Rotnitzky A, Smucler E, Robins JM. Characterization of parameters with a mixed bias property. Biometrika 2020. [DOI: 10.1093/biomet/asaa054] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/14/2022]
Abstract
We study a class of parameters with the so-called mixed bias property. For parameters with this property, the bias of the semiparametric efficient one-step estimator is equal to the mean of the product of the estimation errors of two nuisance functions. In nonparametric models, parameters with the mixed bias property admit so-called rate doubly robust estimators, i.e., estimators that are consistent and asymptotically normal when one succeeds in estimating both nuisance functions at sufficiently fast rates, with the possibility of trading off slower rates of convergence for the estimator of one of the nuisance functions against faster rates for the estimator of the other nuisance function. We show that the class of parameters with the mixed bias property strictly includes two recently studied classes of parameters which, in turn, include many parameters of interest in causal inference. We characterize the form of parameters with the mixed bias property and of their influence functions. Furthermore, we derive two functional loss functions, each being minimized at one of the two nuisance functions. These loss functions can be used to derive loss-based penalized estimators of the nuisance functions.
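The property described in this abstract can be written out as a display. The notation below is an assumption chosen for illustration and is not taken from the listing: \eta is the true law, \eta' a candidate law, \chi the parameter, \chi^{1}_{\eta'} the influence function evaluated at \eta', a and b the two nuisance functions, and S_{ab} a known statistic. The second display is a worked instance for the mean of an outcome missing at random, where the two nuisance functions are the inverse response probability and the outcome regression.

```latex
% Mixed bias property (hypothetical notation, see the lead-in):
\chi(\eta') - \chi(\eta) + E_{\eta}\{\chi^{1}_{\eta'}(O)\}
  = E_{\eta}\left[ S_{ab}\,\{a'(Z) - a(Z)\}\,\{b'(Z) - b(Z)\} \right].

% Worked instance: mean of Y missing at random given X, response indicator R,
% a(X) = 1/\pi(X) with \pi(X) = P(R = 1 \mid X), and b(X) = m(X) = E(Y \mid R = 1, X).
E_{\eta}\left[ m'(X) + R\,a'(X)\{Y - m'(X)\} \right] - E_{\eta}\{m(X)\}
  = -E_{\eta}\left[ R\,\{a'(X) - a(X)\}\,\{m'(X) - m(X)\} \right].
```

In the worked instance S_{ab} = -R, so the bias vanishes whenever either nuisance function is estimated without error, and it is second order when both are estimated with small error.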
Affiliation(s)
- A Rotnitzky
  - Department of Economics, Universidad Torcuato Di Tella, Av. Figueroa Alcorta 7350, Buenos Aires 1428, Argentina
- E Smucler
  - Department of Mathematics & Statistics, Universidad Torcuato Di Tella, Av. Figueroa Alcorta 7350, Buenos Aires 1428, Argentina
- J M Robins
  - Department of Epidemiology, Harvard T. H. Chan School of Public Health, 655 Huntington Avenue, Boston, Massachusetts 02115, U.S.A
10
Li W, Yang S, Han P. Robust estimation for moment condition models with data missing not at random. J Stat Plan Inference 2020. [DOI: 10.1016/j.jspi.2020.01.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022]
11
Sun Y, Wang L, Han P. Multiply robust estimation in nonparametric regression with missing data. J Nonparametr Stat 2019. [DOI: 10.1080/10485252.2019.1700254] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022]
Affiliation(s)
- Yilun Sun
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA
- Lu Wang
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA
- Peisong Han
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA
12
Babino L, Rotnitzky A, Robins J. Multiple robust estimation of marginal structural mean models for unconstrained outcomes. Biometrics 2018; 75:90-99. [PMID: 30004573 DOI: 10.1111/biom.12924] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 03/01/2017] [Revised: 02/01/2018] [Accepted: 02/01/2018] [Indexed: 12/01/2022]
Abstract
We consider estimation, from longitudinal observational data, of the parameters of marginal structural mean models for unconstrained outcomes. Current proposals include inverse probability of treatment weighted and double robust (DR) estimators. A difficulty with DR estimation is that it requires postulating a sequence of models, one for each mean of the counterfactual outcome given covariate and treatment history up to each exposure time point. Most natural models for such means are often incompatible. Robins et al. (2000b) proposed a parameterization of the likelihood which implies compatible parametric models for such means. Their parameterization has not been exploited to construct DR estimators and one goal of this article is to fill this gap. More importantly, exploiting this parameterization we propose a multiple robust (MR) estimator that confers even more protection against model misspecification than DR estimators. Our methods are easy to implement as they are based on the iterative fit of a sequence of weighted regressions.
Affiliation(s)
- Lucia Babino
  - Instituto de Calculo, FCEN, Universidad de Buenos Aires, Buenos Aires 1428, Argentina
- Andrea Rotnitzky
  - Departamento de Economia, Universidad Torcuato Di Tella, Buenos Aires 1428, Argentina
- James Robins
  - Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, Massachusetts 02115, U.S.A
13
Wang L, Tchetgen Tchetgen E. Bounded, efficient and multiply robust estimation of average treatment effects using instrumental variables. J R Stat Soc Series B Stat Methodol 2018; 80:531-550. [PMID: 30034269 PMCID: PMC6051728 DOI: 10.1111/rssb.12262] [Citation(s) in RCA: 56] [Impact Index Per Article: 9.3] [Indexed: 11/29/2022]
Abstract
Instrumental variables (IVs) are widely used for estimating causal effects in the presence of unmeasured confounding. Under the standard IV model, however, the average treatment effect (ATE) is only partially identifiable. To address this, we propose novel assumptions that allow for identification of the ATE. Our identification assumptions are clearly separated from model assumptions needed for estimation, so that researchers are not required to commit to a specific observed data model in establishing identification. We then construct multiple estimators that are consistent under three different observed data models, and multiply robust estimators that are consistent in the union of these observed data models. We pay special attention to the case of binary outcomes, for which we obtain bounded estimators of the ATE that are guaranteed to lie between -1 and 1. Our approaches are illustrated with simulations and a data analysis evaluating the causal effect of education on earnings.
Affiliation(s)
- Linbo Wang
  - Harvard T.H. Chan School of Public Health, Boston, Massachusetts, U.S.A