1. Yin W, Zhao SD, Liang F. Bayesian penalized Buckley-James method for high dimensional bivariate censored regression models. LIFETIME DATA ANALYSIS 2022; 28:282-318. [PMID: 35239126 DOI: 10.1007/s10985-022-09549-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2020] [Accepted: 01/22/2022] [Indexed: 06/14/2023]
Abstract
For high dimensional gene expression data, one important goal is to identify a small number of genes that are associated with progression of the disease or survival of the patients. In this paper, we consider the problem of variable selection for multivariate survival data. We propose an estimation procedure for high dimensional accelerated failure time (AFT) models with bivariate censored data. The method extends the Buckley-James method by minimizing a penalized [Formula: see text] loss function with a penalty function induced from a bivariate spike-and-slab prior specification. In the proposed algorithm, censored observations are imputed using the Kaplan-Meier estimator, which avoids a parametric assumption on the error terms. Our empirical studies demonstrate that the proposed method provides better performance compared to the alternative procedures designed for univariate survival data regardless of whether the true events are correlated or not, and conceptualizes a formal way of handling bivariate survival data for AFT models. Findings from the analysis of a myeloma clinical trial using the proposed method are also presented.
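As a rough illustration of the imputation step described in the abstract above, the following Python sketch replaces each censored response with its conditional expectation under the Kaplan-Meier estimate of the residual distribution. It covers only the classical univariate Buckley-James step, not the authors' bivariate penalized procedure. The function names `km_jumps` and `bj_impute`, the no-ties assumption, and the convention of treating the largest residual as uncensored are assumptions made here for illustration.

```python
import numpy as np

def km_jumps(resid, delta):
    """Kaplan-Meier probability mass at each residual, in sorted order.

    Assumes no tied residuals. The largest residual is treated as
    uncensored (an assumed convention) so the masses sum to one.
    """
    order = np.argsort(resid, kind="stable")
    e = resid[order]
    d = delta[order].astype(float)
    d[-1] = 1.0
    n = len(e)
    surv = 1.0
    mass = np.zeros(n)
    for k in range(n):
        at_risk = n - k
        mass[k] = surv * d[k] / at_risk   # jump of the KM estimator at e[k]
        surv *= 1.0 - d[k] / at_risk      # update the survival estimate
    return e, mass

def bj_impute(y, x, delta, beta):
    """Impute censored responses by x'beta + E[residual | residual > e_i]."""
    resid = y - x @ beta
    e, mass = km_jumps(resid, delta)
    y_star = y.astype(float).copy()
    for i in np.where(delta == 0)[0]:
        later = e > resid[i]
        tail = mass[later].sum()
        if tail > 0:  # otherwise leave the observed (censored) value unchanged
            y_star[i] = x[i] @ beta + (mass[later] * e[later]).sum() / tail
    return y_star
```

In a full Buckley-James iteration, the imputed responses would be fed to a least-squares (or, as in the paper, penalized) fit, and the imputation and fit would alternate until the coefficients stabilize.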
Affiliation(s)
- Wenjing Yin: Department of Statistics, University of Illinois, Urbana-Champaign, Champaign, IL, USA
- Sihai Dave Zhao: Department of Statistics, University of Illinois, Urbana-Champaign, Champaign, IL, USA
- Feng Liang: Department of Statistics, University of Illinois, Urbana-Champaign, Champaign, IL, USA
2. Liu P, Song S, Zhou Y. Semiparametric additive frailty hazard model for clustered failure time data. CAN J STAT 2021. [DOI: 10.1002/cjs.11647] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]
Affiliation(s)
- Peng Liu: School of Mathematics, Statistics and Actuarial Science, University of Kent, Canterbury, UK
- Shanshan Song: School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai, China
- Yong Zhou: Academy of Statistics and Interdisciplinary Sciences, East China Normal University, Shanghai, China
3. He J, Duan X, Zhang S, Li H. Estimation of marginal generalized linear model with subgroup auxiliary information. COMMUN STAT-THEOR M 2021. [DOI: 10.1080/03610926.2019.1642490] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/26/2022]
Affiliation(s)
- Jie He: School of Mathematics, Beijing Normal University, Beijing, P. R. China
- Xiaogang Duan: School of Statistics, Beijing Normal University, Beijing, P. R. China
- Shumei Zhang: School of Statistics, Beijing Normal University, Beijing, P. R. China
- Hui Li: School of Statistics, Beijing Normal University, Beijing, P. R. China
4. Huang CY, Qin J. A unified approach for synthesizing population-level covariate effect information in semiparametric estimation with survival data. Stat Med 2020; 39:1573-1590. [PMID: 32073677 DOI: 10.1002/sim.8499] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 12/12/2018] [Revised: 08/22/2019] [Accepted: 10/28/2019] [Indexed: 01/12/2023]
Abstract
There has been a growing interest in developing methodologies to combine information from public domains to improve efficiency in the analysis of relatively small-scale studies that collect more detailed patient-level information. The auxiliary information is usually given in the form of summary statistics or regression coefficients. Thus, the question arises as to how to incorporate the summary information in the model estimation procedure. In this article, we consider statistical analysis of right-censored survival data when additional information about the covariate effects evaluated in a reduced Cox model is available. Recognizing that such external information can be summarized using population moments, we present a unified framework by employing the generalized method of moments to combine information from different sources for the analysis of survival data. The proposed estimator can be shown to be consistent and asymptotically normal; moreover, it is more efficient than the maximum partial likelihood estimator. We also consider incorporating uncertainty of the external information in the inference procedure. Simulation studies show that, by incorporating the additional summary information, the proposed estimators enjoy a substantial gain in efficiency over the conventional approach. A data analysis of a pancreatic cancer cohort study is presented to illustrate the methods and theory.
Affiliation(s)
- Chiung-Yu Huang: Department of Epidemiology and Biostatistics, University of California at San Francisco, San Francisco, California
- Jing Qin: Biostatistics Research Branch, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland
5. Lyu T, Luo X, Xu G, Huang CY. Induced smoothing for rank-based regression with recurrent gap time data. Stat Med 2018; 37:1086-1100. [PMID: 29205446 DOI: 10.1002/sim.7564] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2016] [Revised: 09/26/2017] [Accepted: 10/30/2017] [Indexed: 11/08/2022]
Abstract
Various semiparametric regression models have recently been proposed for the analysis of gap times between consecutive recurrent events. Among them, the semiparametric accelerated failure time (AFT) model is especially appealing owing to its direct interpretation of covariate effects on the gap times. In general, estimation of the semiparametric AFT model is challenging because the rank-based estimating function is a nonsmooth step function. As a result, solutions to the estimating equations do not necessarily exist. Moreover, the popular resampling-based variance estimation for the AFT model requires solving rank-based estimating equations repeatedly and hence can be computationally cumbersome and unstable. In this paper, we extend the induced smoothing approach to the AFT model for recurrent gap time data. Our proposed smooth estimating function permits the application of standard numerical methods for both the regression coefficients estimation and the standard error estimation. Large-sample properties and an asymptotic variance estimator are provided for the proposed method. Simulation studies show that the proposed method outperforms the existing nonsmooth rank-based estimating function methods in both point estimation and variance estimation. The proposed method is applied to the data analysis of repeated hospitalizations for patients in the Danish Psychiatric Center Register.
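To make the idea of induced smoothing mentioned in the abstract above concrete, the sketch below contrasts a Gehan-weighted rank estimating function with a smoothed counterpart for ordinary univariate right-censored data. It is not the authors' gap-time estimator. The function names, the 1/n scaling, and the default identity working covariance `cov` are assumptions chosen for illustration; the smoothed version simply replaces the indicator with a normal distribution function whose scale shrinks as the sample size grows.

```python
import math
import numpy as np

def gehan_score(beta, logt, x, delta):
    """Nonsmooth Gehan-weighted rank estimating function (a step function in beta)."""
    e = logt - x @ beta            # residuals on the log-time scale
    n, p = x.shape
    u = np.zeros(p)
    for i in range(n):
        if not delta[i]:
            continue               # only observed events contribute
        for j in range(n):
            if e[j] >= e[i]:
                u += x[i] - x[j]
    return u / n

def smoothed_gehan_score(beta, logt, x, delta, cov=None):
    """Smoothed version: the indicator I(e_j >= e_i) becomes Phi((e_j - e_i) / r_ij)."""
    e = logt - x @ beta
    n, p = x.shape
    cov = np.eye(p) if cov is None else cov   # assumed working covariance
    u = np.zeros(p)
    for i in range(n):
        if not delta[i]:
            continue
        for j in range(n):
            d = x[i] - x[j]
            r = math.sqrt(d @ cov @ d / n)    # pair-specific smoothing scale
            if r == 0.0:
                continue                      # identical covariates contribute nothing
            z = (e[j] - e[i]) / r
            u += d * 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return u / n
```

Because the smoothed function is differentiable in `beta`, it can be handed to a standard root-finder or Newton-type solver, which is the practical advantage the abstract refers to.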
Affiliation(s)
- Tianmeng Lyu: Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN, USA
- Xianghua Luo: Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN, USA; Biostatistics Core, Masonic Cancer Center, University of Minnesota, Minneapolis, MN, USA
- Gongjun Xu: Department of Statistics, University of Michigan, Ann Arbor, MI, USA
- Chiung-Yu Huang: Department of Epidemiology and Biostatistics, University of California, San Francisco, CA, USA
6. Sun Y, Chan KCG, Qin J. Simple and fast overidentified rank estimation for right-censored length-biased data and backward recurrence time. Biometrics 2018; 74:77-85. [PMID: 28504836 PMCID: PMC5976459 DOI: 10.1111/biom.12727] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 06/01/2016] [Revised: 04/01/2017] [Accepted: 04/01/2017] [Indexed: 11/25/2022]
Abstract
Length-biased survival data subject to right-censoring are often collected from a prevalent cohort. However, informative right censoring induced by the sampling design creates challenges in methodological development. While certain conditioning arguments could circumvent the problem of informative censoring, related rank estimation methods are typically inefficient because the marginal likelihood of the backward recurrence time is not ancillary. Under a semiparametric accelerated failure time model, an overidentified set of log-rank estimating equations is constructed based on the left-truncated right-censored data and backward recurrence time. Efficient combination of the estimating equations is simplified by exploiting an asymptotic independence property between two sets of estimating equations. A fast algorithm is studied for solving non-smooth, non-monotone estimating equations. Simulation studies confirm that the overidentified rank estimator can have a substantially improved estimation efficiency compared to just-identified rank estimators. The proposed method is applied to a dementia study for illustration.
Affiliation(s)
- Yifei Sun: Department of Biostatistics, Johns Hopkins University, Baltimore, Maryland 21205, U.S.A
- Kwun Chuen Gary Chan: Department of Biostatistics, University of Washington, Seattle, Washington 98195, U.S.A
- Jing Qin: Biostatistics Research Branch, National Institute of Allergy and Infectious Diseases, Bethesda, Maryland 20892, U.S.A
7. The generalized moment estimation of the additive–multiplicative hazard model with auxiliary survival information. Comput Stat Data Anal 2017. [DOI: 10.1016/j.csda.2017.03.013] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 11/15/2022]
8.
Affiliation(s)
- You‐Gan Wang: School of Mathematical Sciences, Queensland University of Technology, 2 George Street, Brisbane, QLD 4000, Australia
- Yudong Zhao: Danone Nutricia Research, 30 Biopolis Street, Matrix Building #05/01B, Singapore 138671, Singapore
- Liya Fu: School of Mathematics and Statistics, Xi'an Jiaotong University, No.28, Xianning West Road, Xi'an, Shaanxi 710049, China
9. Li H, Duan X, Yin G. Generalized Method of Moments for Additive Hazards Model with Clustered Dental Survival Data. Scand Stat Theory Appl 2016. [DOI: 10.1111/sjos.12232] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/27/2022]
Affiliation(s)
- Hui Li: Department of Statistics, Beijing Normal University
- Guosheng Yin: Department of Statistics and Actuarial Science, The University of Hong Kong
10. Chiou SH, Kang S, Kim J, Yan J. Marginal semiparametric multivariate accelerated failure time model with generalized estimating equations. LIFETIME DATA ANALYSIS 2014; 20:599-618. [PMID: 24549607 DOI: 10.1007/s10985-014-9292-x] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 03/20/2013] [Accepted: 01/28/2014] [Indexed: 06/03/2023]
Abstract
The semiparametric accelerated failure time (AFT) model is not as widely used as the Cox relative risk model due to computational difficulties. Recent developments in least squares estimation and induced smoothing estimating equations for censored data provide promising tools to make the AFT models more attractive in practice. For multivariate AFT models, we propose a generalized estimating equations (GEE) approach, extending the GEE to censored data. The consistency of the regression coefficient estimator is robust to misspecification of working covariance, and the efficiency is higher when the working covariance structure is closer to the truth. The marginal error distributions and regression coefficients are allowed to be unique for each margin or partially shared across margins as needed. The initial estimator is a rank-based estimator with Gehan's weight, but obtained from an induced smoothing approach with computational ease. The resulting estimator is consistent and asymptotically normal, with variance estimated through a multiplier resampling method. In a large scale simulation study, our estimator was up to three times as efficient as the estimator that ignores the within-cluster dependence, especially when the within-cluster dependence was strong. The methods were applied to the bivariate failure time data from a diabetic retinopathy study.
Affiliation(s)
- Sy Han Chiou: Department of Mathematics and Statistics, University of Minnesota, Duluth, Duluth, MN, USA
11. Westgate PM. A Comparison of Utilized and Theoretical Covariance Weighting Matrices on the Estimation Performance of Quadratic Inference Functions. COMMUN STAT-SIMUL C 2014. [DOI: 10.1080/03610918.2012.752839] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 10/26/2022]
12. Liu B, Lu W, Zhang J. Kernel Smoothed Profile Likelihood Estimation in the Accelerated Failure Time Frailty Model for Clustered Survival Data. Biometrika 2014; 100:741-755. [PMID: 24443587 DOI: 10.1093/biomet/ast012] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/14/2022]
Abstract
Clustered survival data frequently arise in biomedical applications, where event times of interest are clustered into groups such as families. In this article we consider an accelerated failure time frailty model for clustered survival data and develop nonparametric maximum likelihood estimation for it via a kernel smoother aided EM algorithm. We show that the proposed estimator for the regression coefficients is consistent, asymptotically normal and semiparametric efficient when the kernel bandwidth is properly chosen. An EM-aided numerical differentiation method is derived for estimating its variance. Simulation studies evaluate the finite sample performance of the estimator, and it is applied to the Diabetic Retinopathy data set.
Affiliation(s)
- Bo Liu: Department of Statistics, North Carolina State University, 2311 Stinson Drive, Raleigh, North Carolina 27695, U.S.A
- Wenbin Lu: Department of Statistics, North Carolina State University, 2311 Stinson Drive, Raleigh, North Carolina 27695, U.S.A
- Jiajia Zhang: Department of Epidemiology and Biostatistics, University of South Carolina, 800 Sumter Street, Columbia, South Carolina 29208, U.S.A