1. Ahmad S, Shabbir J, Zahid E, Aamir M, Alqawba M. New generalized class of estimators for estimation of finite population mean based on probability proportional to size sampling using two auxiliary variables: A simulation study. Sci Prog 2023; 106:368504231208537. [PMID: 37885238 PMCID: PMC10612467 DOI: 10.1177/00368504231208537] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/28/2023]
Abstract
This article proposes a new generalized class of estimators of the finite population mean under probability proportional to size sampling using two auxiliary variables. Expressions for the bias and mean squared error (MSE) are derived up to the first order of approximation. Four real data sets are used to examine the performance of the proposed class of estimators. On these data sets, the suggested estimator attains the minimum MSE and a higher percentage relative efficiency than all existing estimators considered, which shows the value of the new generalized class. To check the robustness and generalizability of the proposed class of estimators, a simulation study is also conducted, and its results support the same conclusion. Overall, the proposed estimator outperforms all other estimators considered in this study.
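For orientation, the sampling setting in this abstract can be illustrated with the classical Hansen-Hurwitz estimator of the mean under probability proportional to size (PPS) sampling with replacement. This is a minimal sketch of that standard baseline, not the generalized class proposed in the article; the function name and the toy population below are hypothetical.

```python
import random


def hansen_hurwitz_mean(y_sample, p_sample, population_size):
    """Hansen-Hurwitz estimate of the population mean.

    y_sample: study-variable values of the units drawn (with replacement)
    p_sample: per-draw selection probability of each drawn unit
              (size measure divided by the total size measure)
    population_size: number of units N in the finite population
    """
    n = len(y_sample)
    # Each y/p term is an unbiased estimate of the population total.
    total_estimate = sum(y / p for y, p in zip(y_sample, p_sample)) / n
    return total_estimate / population_size


# Hypothetical population: size measures and study-variable values.
sizes = [10, 20, 30, 40]
values = [12.0, 19.0, 33.0, 41.0]
probs = [s / sum(sizes) for s in sizes]

# Draw three units with replacement, with probability proportional to size.
rng = random.Random(0)
drawn = rng.choices(range(len(values)), weights=probs, k=3)
estimate = hansen_hurwitz_mean(
    [values[i] for i in drawn], [probs[i] for i in drawn], len(values)
)
```

When the study variable is exactly proportional to the size measure, every draw gives the same estimate of the total, so the estimator has zero variance. This is why a well-correlated size measure makes PPS sampling efficient.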
Affiliation(s)
- Sohaib Ahmad
  - Department of Statistics, Abdul Wali Khan University, Mardan, Pakistan
- Javid Shabbir
  - Department of Statistics, Quaid-i-Azam University, Islamabad, Pakistan
  - Department of Statistics, University of Wah, Wah Cantt, Pakistan
- Erum Zahid
  - Department of Applied Mathematics and Statistics, Institute of Space Technology, Islamabad, Pakistan
- Muhammad Aamir
  - Department of Statistics, Abdul Wali Khan University, Mardan, Pakistan
- Mohammed Alqawba
  - Department of Mathematics, College of Science and Arts, Qassim University, Ar Rass, Saudi Arabia
2. Hussain S, Akhtar S, El-Morshedy M. Modified estimators of finite population distribution function based on dual use of auxiliary information under stratified random sampling. Sci Prog 2022; 105:368504221128486. [PMID: 36168269 PMCID: PMC10450482 DOI: 10.1177/00368504221128486] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
In survey sampling, information on auxiliary variables related to the main variable is available in many practical problems. Since the mid-twentieth century, researchers have taken a keen interest in the use of auxiliary information because of its usefulness in estimation methods. The current study presents two new estimators for the distribution function of a finite population based on dual auxiliary variables. The new estimators can be used in situations where researchers face complex data sets. Expressions for the bias and mean square error are obtained for each proposed estimator. In addition, an empirical study and a simulation study have been conducted to analyse the performance of the estimators. The new estimators of the distribution function of a finite population are found to be more accurate than some of the existing estimators.
Affiliation(s)
- Sardar Hussain
  - Department of Statistics, Quaid-i-Azam University, Islamabad, Pakistan
- Sohail Akhtar
  - Department of Mathematics and Statistics, The University of Haripur, Haripur, Pakistan
- Mahmoud El-Morshedy
  - Department of Mathematics, College of Science and Humanities in Al-Kharj, Prince Sattam Bin Abdulaziz University, Al-Kharj, Saudi Arabia
  - Department of Mathematics, Faculty of Science, Mansoura University, Mansoura, Egypt
3. Flandre P. Weighted log-rank test to compare two survival functions in the presence of dependent censoring. Pharm Stat 2022; 21:1281-1293. [PMID: 35708191 DOI: 10.1002/pst.2245] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2021] [Revised: 04/11/2022] [Accepted: 05/12/2022] [Indexed: 11/12/2022]
Abstract
Comparing survival functions with the log-rank test in the presence of dependent censoring can produce an invalid test result. We extend our previous work on the estimation of the survival function using prognostic variables to adjust for dependent censoring to the comparison of two survival distributions. In these estimators, the weight of a censored individual is redistributed among either a subset of patients in the risk set or all patients in the risk set, with more weight given to patients at the smallest distances from the censored subject. The distance is based on two risk scores obtained from two working models, one for the failure time and one for the censoring time. Based on the estimators, we suggest a weighted log-rank test to compare two survival distributions. A simulation study compared the performance of our method with the analysis of the observed data without using auxiliary variables and with a recent method based on multiple imputation (KMIB method). With appropriate parameters, the weighted log-rank approach provides test sizes comparable to the nominal value but higher power than the two other methods. The method is illustrated with data from a breast cancer study.
Affiliation(s)
- Philippe Flandre
  - INSERM Institut Pierre Louis d'Epidémiologie et de Santé Publique (IPLESP), Sorbonne Université, Paris, France
4. Nelson TD, Brock RL, Yokum S, Tomaso CC, Savage CR, Stice E. Much Ado About Missingness: A Demonstration of Full Information Maximum Likelihood Estimation to Address Missingness in Functional Magnetic Resonance Imaging Data. Front Neurosci 2021; 15:746424. [PMID: 34658780 PMCID: PMC8514662 DOI: 10.3389/fnins.2021.746424] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 07/23/2021] [Accepted: 08/31/2021] [Indexed: 11/23/2022]
Abstract
The current paper leveraged a large multi-study functional magnetic resonance imaging (fMRI) dataset (N = 363) and a generated missingness paradigm to demonstrate different approaches for handling missing fMRI data under a variety of conditions. The performance of full information maximum likelihood (FIML) estimation, both with and without auxiliary variables, and listwise deletion were compared under different conditions of generated missing data volumes (i.e., 20, 35, and 50%). FIML generally performed better than listwise deletion in replicating results from the full dataset, but differences were small in the absence of auxiliary variables that correlated strongly with fMRI task data. However, when an auxiliary variable created to correlate r = 0.5 with fMRI task data was included, the performance of the FIML model improved, suggesting the potential value of FIML-based approaches for missing fMRI data when a strong auxiliary variable is available. In addition to primary methodological insights, the current study also makes an important contribution to the literature on neural vulnerability factors for obesity. Specifically, results from the full data model show that greater activation in regions implicated in reward processing (caudate and putamen) in response to tastes of milkshake significantly predicted weight gain over the following year. Implications of both methodological and substantive findings are discussed.
Affiliation(s)
- Timothy D Nelson
  - Department of Psychology, University of Nebraska-Lincoln, Lincoln, NE, United States
- Rebecca L Brock
  - Department of Psychology, University of Nebraska-Lincoln, Lincoln, NE, United States
- Sonja Yokum
  - Oregon Research Institute, Eugene, OR, United States
- Cara C Tomaso
  - Department of Psychology, University of Nebraska-Lincoln, Lincoln, NE, United States
- Cary R Savage
  - Department of Psychology, University of Nebraska-Lincoln, Lincoln, NE, United States
- Eric Stice
  - Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA, United States
5. Tsaousis I, Sideridis GD, AlGhamdi HM. Measurement Invariance and Differential Item Functioning Across Gender Within a Latent Class Analysis Framework: Evidence From a High-Stakes Test for University Admission in Saudi Arabia. Front Psychol 2020; 11:622. [PMID: 32318006 PMCID: PMC7147614 DOI: 10.3389/fpsyg.2020.00622] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 09/28/2019] [Accepted: 03/16/2020] [Indexed: 11/13/2022]
Abstract
The main aim of the present study was to investigate the presence of Differential Item Functioning (DIF) using a latent class (LC) analysis approach. In particular, we examined potential sources of DIF in relation to gender. Data came from 6,265 Saudi Arabian students who completed a high-stakes standardized admission test for university entrance. The results from a Latent Class Analysis (LCA) revealed a three-class solution (i.e., high, average, and low scorers). Then, to better understand the nature of the emerging classes and the characteristics of the people who comprise them, we applied a new stepwise approach, using the Multiple Indicator Multiple Causes (MIMIC) model. The model identified both uniform and non-uniform DIF effects for several items across all scales of the test, although for the majority of them the DIF effect sizes were negligible. Findings from this study have important implications for both measurement quality and interpretation of the results. In particular, results showed that gender is a potential source of DIF for latent class indicators; thus, it is important to include those direct effects in the latent class regression model to obtain unbiased estimates not only of the measurement parameters but also of the structural parameters. Ignoring these effects might lead to misspecification of the latent classes in terms of both the size and the characteristics of each class, which in turn could lead to misinterpretation of the obtained latent class results. Implications of the results for practice are discussed.
Affiliation(s)
- Georgios D. Sideridis
  - Boston Children’s Hospital, Harvard Medical School, Boston, MA, United States
  - National and Kapodistrian University of Athens, Athens, Greece
- Hanan M. AlGhamdi
  - National Center for Assessment in Higher Education, Riyadh, Saudi Arabia
6. Imori S, Shimodaira H. An Information Criterion for Auxiliary Variable Selection in Incomplete Data Analysis. Entropy (Basel) 2019; 21:E281. [PMID: 33266996 DOI: 10.3390/e21030281] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2019] [Revised: 03/09/2019] [Accepted: 03/12/2019] [Indexed: 11/16/2022]
Abstract
Statistical inference is considered for variables of interest, called primary variables, when auxiliary variables are observed along with the primary variables. We consider the setting of incomplete data analysis, where some primary variables are not observed. By utilizing a parametric model of the joint distribution of primary and auxiliary variables, it is possible to improve the estimation of the parametric model for the primary variables when the auxiliary variables are closely related to the primary variables. However, the estimation accuracy decreases when the auxiliary variables are irrelevant to the primary variables. To select useful auxiliary variables, we formulate the problem as model selection and propose an information criterion for predicting primary variables by leveraging auxiliary variables. The proposed information criterion is an asymptotically unbiased estimator of the Kullback–Leibler divergence for complete data of primary variables under some reasonable conditions. We also clarify an asymptotic equivalence between the proposed information criterion and a variant of leave-one-out cross validation. Performance of our method is demonstrated via a simulation study and a real data example.
7. Wang T, Wang X, Zhou H, Cai J, George SL. Auxiliary variable-enriched biomarker-stratified design. Stat Med 2018; 37:4610-4635. [PMID: 30221368 DOI: 10.1002/sim.7938] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 02/08/2018] [Revised: 06/04/2018] [Accepted: 07/15/2018] [Indexed: 12/18/2022]
Abstract
Clinical trials in the era of precision medicine require assessment of biomarkers to identify appropriate subgroups of patients for targeted therapy. In a biomarker-stratified design (BSD), biomarkers are measured on all patients and used as stratification variables. However, such a trial can be both inefficient and costly, especially when the prevalence of the subgroup of primary interest is low and the cost of assessing the biomarkers is high. Efficiency can be improved and costs reduced by using enriched biomarker-stratified designs, in which patients of primary interest, typically the biomarker-positive patients, are oversampled. We consider a special type of enrichment design, an auxiliary variable-enriched design (AEBSD), in which enrichment is based on some inexpensive auxiliary variable that is positively correlated with the true biomarker. The proposed AEBSD reduces the total cost of the trial compared with a standard BSD when the prevalence rate of true biomarker positivity is small and the positive predictive value (PPV) of the auxiliary biomarker is larger than the prevalence rate. In addition, for an AEBSD, we can immediately randomize the patients selected in the screening process without waiting for the result of the true biomarker test, reducing the treatment waiting time. We propose an adaptive Bayesian method to adjust the assumed PPV while the trial is ongoing. Numerical studies and an example illustrate the approach. An R package is available.
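The condition that the PPV must exceed the prevalence can be illustrated with a back-of-the-envelope calculation. The sketch below is a deliberate simplification and not the cost model or design of the article: it only compares the expected number of true-biomarker assays needed to find a target number of biomarker-positive patients when assaying all comers versus assaying only patients who are positive on the auxiliary variable. The function name and the numbers are hypothetical.

```python
def expected_assays(target_positives, positive_rate):
    """Expected number of patients to assay until `target_positives`
    true biomarker-positive patients are found, when each assayed patient
    is independently positive with probability `positive_rate`."""
    if not 0 < positive_rate <= 1:
        raise ValueError("positive_rate must be in (0, 1]")
    return target_positives / positive_rate


# Hypothetical inputs: 10% prevalence, auxiliary variable with PPV of 50%.
prevalence = 0.10
ppv = 0.50
target = 100

assays_all_comers = expected_assays(target, prevalence)  # assay everyone
assays_enriched = expected_assays(target, ppv)  # assay auxiliary-positives only
```

Under these assumed inputs the enriched strategy needs about one fifth as many true-biomarker assays. The saving vanishes as the PPV approaches the prevalence, which matches the condition stated in the abstract.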
Affiliation(s)
- Ting Wang
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
- Xiaofei Wang
  - Department of Biostatistics and Bioinformatics, Duke University, Durham, North Carolina
- Haibo Zhou
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
- Jianwen Cai
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
- Stephen L George
  - Department of Biostatistics and Bioinformatics, Duke University, Durham, North Carolina
8. Sun Y, Qi L, Yang G, Gilbert PB. Hypothesis tests for stratified mark-specific proportional hazards models with missing covariates, with application to HIV vaccine efficacy trials. Biom J 2018; 60:516-536. [PMID: 29488249 DOI: 10.1002/bimj.201700002] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 01/12/2017] [Revised: 08/13/2017] [Accepted: 11/09/2017] [Indexed: 11/06/2022]
Abstract
This article develops hypothesis testing procedures for the stratified mark-specific proportional hazards model with missing covariates, where the baseline functions may vary with strata. The mark-specific proportional hazards model has been studied to evaluate mark-specific relative risks, where the mark is the genetic distance of an infecting HIV sequence to an HIV sequence represented inside the vaccine. This research is motivated by the analysis of the RV144 phase 3 HIV vaccine efficacy trial, which aims to understand associations of immune response biomarkers with the mark-specific hazard of HIV infection, where the biomarkers are sampled via a two-phase sampling nested case-control design. We test whether the mark-specific relative risks are unity and how they change with the mark. The developed procedures enable assessment of whether the risk of HIV infection with HIV variants close to or far from the vaccine sequence is modified by immune responses induced by the HIV vaccine; this question is of interest because vaccine protection occurs through immune responses directed at specific HIV sequences. The test statistics are constructed based on augmented inverse probability weighted complete-case estimators. The asymptotic properties and finite-sample performance of the testing procedures are investigated, demonstrating double robustness and the effectiveness of the predictive auxiliaries in recovering efficiency. The finite-sample performance of the proposed tests is examined through a comprehensive simulation study. The methods are applied to the RV144 trial.
Affiliation(s)
- Yanqing Sun
  - Department of Mathematics and Statistics, University of North Carolina at Charlotte, Charlotte, NC, 28223, USA
- Li Qi
  - Biostatistics and Programming, Sanofi, Bridgewater, NJ, 08807, USA
- Guangren Yang
  - Department of Statistics, School of Economics, Jinan University, Guangzhou, China
- Peter B Gilbert
  - Department of Biostatistics, University of Washington, and Fred Hutchinson Cancer Research Center, Seattle, WA, 98109, USA
9. Marnissi Y, Chouzenoux E, Benazza-Benyahia A, Pesquet JC. An Auxiliary Variable Method for Markov Chain Monte Carlo Algorithms in High Dimension. Entropy (Basel) 2018; 20:E110. [PMID: 33265201 DOI: 10.3390/e20020110] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 12/04/2017] [Revised: 01/16/2018] [Accepted: 01/30/2018] [Indexed: 11/16/2022]
Abstract
In this paper, we are interested in Bayesian inverse problems where either the data fidelity term or the prior distribution is Gaussian or derived from a hierarchical Gaussian model. Generally, Markov chain Monte Carlo (MCMC) algorithms allow us to generate sets of samples that are employed to infer some relevant parameters of the underlying distributions. However, when the parameter space is high-dimensional, the performance of stochastic sampling algorithms is very sensitive to existing dependencies between parameters. In particular, this problem arises when one aims to sample from a high-dimensional Gaussian distribution whose covariance matrix does not present a simple structure. Another challenge is the design of Metropolis–Hastings proposals that make use of information about the local geometry of the target density in order to speed up the convergence and improve mixing properties in the parameter space, while not being too computationally expensive. These two contexts are mainly related to the presence of two heterogeneous sources of dependencies stemming either from the prior or the likelihood, in the sense that the related covariance matrices cannot be diagonalized in the same basis. In this work, we address these two issues. Our contribution consists of adding auxiliary variables to the model in order to dissociate the two sources of dependencies. In the new augmented space, only one source of correlation remains directly related to the target parameters, the other sources of correlation being captured by the auxiliary variables. Experiments are conducted on two practical image restoration problems, namely the recovery of multichannel blurred images embedded in Gaussian noise and the recovery of a signal corrupted by mixed Gaussian noise. Experimental results indicate that adding the proposed auxiliary variables makes the sampling problem simpler, since the new conditional distribution no longer contains highly heterogeneous correlations. Thus, the computational cost of each iteration of the Gibbs sampler is significantly reduced while ensuring good mixing properties.
10.
Abstract
In the era of precision medicine, drugs are increasingly developed to target subgroups of patients with certain biomarkers. In large all-comer trials using a biomarker stratified design, the cost of treating and following patients for clinical outcomes may be prohibitive. With a fixed number of randomized patients, the efficiency of testing certain treatment parameters, including the treatment effect among biomarker-positive patients and the interaction between treatment and biomarker, can be improved by increasing the proportion of biomarker positives on study, especially when the prevalence rate of biomarker positives is low in the underlying patient population. When the cost of assessing the true biomarker is prohibitive, one can further improve the study efficiency by oversampling biomarker positives with a cheaper auxiliary variable or a surrogate biomarker that correlates with the true biomarker. To improve efficiency and reduce cost, we can adopt an enrichment strategy for both scenarios by concentrating on testing and treating patient subgroups that contain more information about specific treatment parameters of primary interest to the investigators. In the first scenario, an enriched biomarker stratified design enriches the cohort of randomized patients by directly oversampling the relevant patients with the true biomarker, while in the second scenario, an auxiliary-variable-enriched biomarker stratified design enriches the randomized cohort based on an inexpensive auxiliary variable, thereby avoiding testing the true biomarker on all screened patients and reducing treatment waiting time. For both designs, we discuss how to choose the optimal enrichment proportion when testing a single hypothesis or two hypotheses simultaneously. At a requisite power, we compare the two new designs with the standard biomarker stratified design (BSD) in terms of the number of randomized patients and the cost of the trial under scenarios mimicking real biomarker stratified trials. The new designs are illustrated with hypothetical examples for designing biomarker-driven cancer trials.
Affiliation(s)
- Xiaofei Wang
  - Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, U.S.A.
- Jingzhu Zhou
  - Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, U.S.A.
- Ting Wang
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, U.S.A.
- Stephen L George
  - Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, U.S.A.
11. Yue YR, Wang XF. Bayesian inference for generalized linear mixed models with predictors subject to detection limits: an approach that leverages information from auxiliary variables. Stat Med 2015; 35:1689-705. [PMID: 26643287 DOI: 10.1002/sim.6830] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 11/24/2014] [Accepted: 11/08/2015] [Indexed: 11/05/2022]
Abstract
This paper is motivated by a retrospective study of the impact of vitamin D deficiency on the clinical outcomes of critically ill patients in multi-center critical care units. The primary predictors of interest, vitamin D2 and D3 levels, are censored at a known detection limit. Within the context of generalized linear mixed models, we investigate statistical methods to handle multiple censored predictors in the presence of auxiliary variables. A Bayesian joint modeling approach is proposed to fit the complex heterogeneous multi-center data, in which the information in the data is fully used to estimate the parameters of interest. Efficient Markov chain Monte Carlo algorithms are developed specifically for each type of response. Simulation studies demonstrate that the proposed Bayesian approach outperforms existing methods. An application to the data set from the vitamin D deficiency study is presented. Possible extensions of the method to the absence of auxiliary variables, semiparametric models, and other types of censoring are also discussed.
Affiliation(s)
- Yu Ryan Yue
  - Department of Statistics and CIS, Zicklin School of Business, Baruch College, The City University of New York, New York, NY, U.S.A.
- Xiao-Feng Wang
  - Department of Quantitative Health Sciences / Biostatistics Section, Cleveland Clinic Lerner Research Institute, Cleveland, OH, U.S.A.
12. Sullivan TR, Salter AB, Ryan P, Lee KJ. Bias and Precision of the "Multiple Imputation, Then Deletion" Method for Dealing With Missing Outcome Data. Am J Epidemiol 2015; 182:528-34. [PMID: 26337075 DOI: 10.1093/aje/kwv100] [Citation(s) in RCA: 48] [Impact Index Per Article: 5.3] [Received: 11/24/2014] [Accepted: 04/08/2015] [Indexed: 01/18/2023]
Abstract
Multiple imputation (MI) is increasingly being used to handle missing data in epidemiologic research. When data on both the exposure and the outcome are missing, an alternative to standard MI is the "multiple imputation, then deletion" (MID) method, which involves deleting imputed outcomes prior to analysis. While MID has been shown to provide efficiency gains over standard MI when analysis and imputation models are the same, the performance of MID in the presence of auxiliary variables for the incomplete outcome is not well understood. Using simulated data, we evaluated the performance of standard MI and MID in regression settings where data were missing on both the outcome and the exposure and where an auxiliary variable associated with the incomplete outcome was included in the imputation model. When the auxiliary variable was unrelated to missingness in the outcome, both standard MI and MID produced negligible bias when estimating regression parameters, with standard MI being more efficient in most settings. However, when the auxiliary variable was also associated with missingness in the outcome, MID, alarmingly, produced markedly biased parameter estimates. On the basis of these results, we recommend that researchers use standard MI rather than MID in the presence of auxiliary variables associated with an incomplete outcome.
13. Hsu CH, Taylor JMG, Hu C. Analysis of accelerated failure time data with dependent censoring using auxiliary variables via nonparametric multiple imputation. Stat Med 2015; 34:2768-80. [PMID: 25999295 PMCID: PMC5863093 DOI: 10.1002/sim.6534] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 08/19/2014] [Revised: 04/01/2015] [Accepted: 04/29/2015] [Indexed: 11/07/2022]
Abstract
We consider the situation of estimating the marginal survival distribution from censored data subject to dependent censoring using auxiliary variables. We had previously developed a nonparametric multiple imputation approach. The method used two working proportional hazards (PH) models, one for the event times and the other for the censoring times, to define a nearest neighbor imputing risk set. This risk set was then used to impute failure times for censored observations. Here, we adapt the method to the situation where the event and censoring times follow accelerated failure time models and propose to use the Buckley-James estimator as the two working models. Besides studying the performances of the proposed method, we also compare the proposed method with two popular methods for handling dependent censoring through the use of auxiliary variables, inverse probability of censoring weighted and parametric multiple imputation methods, to shed light on the use of them. In a simulation study with time-independent auxiliary variables, we show that all approaches can reduce bias due to dependent censoring. The proposed method is robust to misspecification of either one of the two working models and their link function. This indicates that a working proportional hazards model is preferred because it is more cumbersome to fit an accelerated failure time model. In contrast, the inverse probability of censoring weighted method is not robust to misspecification of the link function of the censoring time model. The parametric imputation methods rely on the specification of the event time model. The approaches are applied to a prostate cancer dataset.
Affiliation(s)
- Chiu-Hsieh Hsu
  - Division of Epidemiology and Biostatistics, Mel and Enid Zuckerman College of Public Health, Tucson, AZ 85724, U.S.A.
  - Arizona Cancer Center, University of Arizona, Tucson, AZ 85724, U.S.A.
- Jeremy M G Taylor
  - Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI 48109, U.S.A.
- Chengcheng Hu
  - Division of Epidemiology and Biostatistics, Mel and Enid Zuckerman College of Public Health, Tucson, AZ 85724, U.S.A.
  - Arizona Cancer Center, University of Arizona, Tucson, AZ 85724, U.S.A.
14. Mazza GL, Enders CK, Ruehlman LS. Addressing Item-Level Missing Data: A Comparison of Proration and Full Information Maximum Likelihood Estimation. Multivariate Behav Res 2015; 50:504-519. [PMID: 26610249 PMCID: PMC4701045 DOI: 10.1080/00273171.2015.1068157] [Citation(s) in RCA: 82] [Impact Index Per Article: 9.1] [Indexed: 05/31/2023]
Abstract
Often when participants have missing scores on one or more of the items comprising a scale, researchers compute prorated scale scores by averaging the available items. Methodologists have cautioned that proration may make strict assumptions about the mean and covariance structures of the items comprising the scale (Schafer & Graham, 2002; Graham, 2009; Enders, 2010). We investigated proration empirically and found that it resulted in bias even under a missing completely at random (MCAR) mechanism. To encourage researchers to forgo proration, we describe a full information maximum likelihood (FIML) approach to item-level missing data handling that mitigates the loss in power due to missing scale scores and utilizes the available item-level data without altering the substantive analysis. Specifically, we propose treating the scale score as missing whenever one or more of the items are missing and incorporating items as auxiliary variables. Our simulations suggest that item-level missing data handling drastically increases power relative to scale-level missing data handling. These results have important practical implications, especially when recruiting more participants is prohibitively difficult or expensive. Finally, we illustrate the proposed method with data from an online chronic pain management program.
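The two scoring rules contrasted in this abstract can be sketched in a few lines. This is only an illustration of the scoring step, with hypothetical function names and data: proration averages whatever items are available, whereas the alternative treats the scale score as missing as soon as any item is missing. The FIML estimation that then uses the items as auxiliary variables is not shown and would normally be done in structural equation modeling software.

```python
import math


def prorated_score(items):
    """Proration: average the available items, ignoring missing ones (None)."""
    observed = [x for x in items if x is not None]
    if not observed:
        return math.nan
    return sum(observed) / len(observed)


def complete_item_score(items):
    """Scale score that is set to missing (NaN) if any item is missing."""
    if any(x is None for x in items):
        return math.nan
    return sum(items) / len(items)


# Hypothetical respondent who skipped the second of three items.
responses = [4, None, 2]
prorated = prorated_score(responses)  # mean of the two observed items
strict = complete_item_score(responses)  # scale score treated as missing
```

Proration silently substitutes the respondent's mean of the observed items for the skipped item. That substitution is the source of the strict assumptions about item means and covariances mentioned in the abstract.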
Affiliation(s)
- Gina L Mazza
  - Department of Psychology, Arizona State University
15.
Abstract
In randomized clinical trials, the use of potentially subjective endpoints has led to frequent use of blinded independent central review (BICR) and event adjudication committees to reduce possible bias in treatment effect estimators based on local evaluations (LE). In oncology trials, progression-free survival (PFS) is one such endpoint. PFS requires image interpretation to determine whether a patient's cancer has progressed, and BICR has been advocated to reduce the potential for endpoints to be biased by knowledge of treatment assignment. There is current debate, however, about the value of such reviews with time-to-event outcomes such as PFS. We propose a BICR audit strategy as an alternative to a complete-case BICR to provide assurance of the presence of a treatment effect. We develop an auxiliary-variable estimator of the log-hazard ratio that is more efficient than simply using the audited (i.e., sampled) BICR data for estimation. Our estimator incorporates information from the LE on all the cases and the audited BICR cases, and is an asymptotically unbiased estimator of the log-hazard ratio from BICR. The estimator offers considerable efficiency gains that improve as the correlation between LE and BICR increases. A two-stage auditing strategy is also proposed and evaluated through simulation studies. The method is applied retrospectively to a large oncology trial that had a complete-case BICR, showing the potential for efficiency improvements.
Affiliation(s)
- Lori E Dodd
  - Biostatistics Research Branch, National Institute of Allergy and Infectious Diseases, NIH, Bethesda, Maryland 20892, USA.