1
Reeder HT, Ha Lee K, Haneuse S. Characterizing quantile-varying covariate effects under the accelerated failure time model. Biostatistics 2024; 25:449-467. [PMID: 36610077 DOI: 10.1093/biostatistics/kxac052] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2022] [Revised: 12/17/2022] [Accepted: 12/20/2022] [Indexed: 01/09/2023] Open
Abstract
An important task in survival analysis is choosing a structure for the relationship between covariates of interest and the time-to-event outcome. For example, the accelerated failure time (AFT) model structures each covariate effect as a constant multiplicative shift in the outcome distribution across all survival quantiles. Though parsimonious, this structure cannot detect or capture effects that differ across quantiles of the distribution, a limitation that is analogous to only permitting proportional hazards in the Cox model. To address this, we propose a general framework for quantile-varying multiplicative effects under the AFT model. Specifically, we embed flexible regression structures within the AFT model and derive a novel formula for interpretable effects on the quantile scale. A regression standardization scheme based on the g-formula is proposed to enable the estimation of both covariate-conditional and marginal effects for an exposure of interest. We implement a user-friendly Bayesian approach for the estimation and quantification of uncertainty while accounting for left truncation and complex censoring. We emphasize the intuitive interpretation of this model through numerical and graphical tools and illustrate its performance through simulation and application to a study of Alzheimer's disease and dementia.
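The "constant multiplicative shift" described in this abstract can be illustrated with a small sketch. It assumes a Weibull AFT model of the form log T = mu + beta*x + sigma*eps; the function name and all parameter values are hypothetical and are not taken from the article. Under that assumption, the ratio of exposed to unexposed survival quantiles equals exp(beta) at every quantile level, which is the restriction the quantile-varying framework is meant to relax.

```python
import math

def aft_quantile(p, x, mu=1.0, beta=0.5, sigma=0.8):
    """p-th quantile of T under a Weibull AFT model: log T = mu + beta*x + sigma*eps."""
    # eps follows a standard minimum extreme-value distribution,
    # whose p-th quantile is log(-log(1 - p)).
    eps_p = math.log(-math.log(1.0 - p))
    return math.exp(mu + beta * x + sigma * eps_p)

# Compare an exposed (x = 1) and an unexposed (x = 0) individual
# at several quantile levels (illustrative values only).
quantile_levels = (0.1, 0.5, 0.9)
ratios = [aft_quantile(p, x=1) / aft_quantile(p, x=0) for p in quantile_levels]

for p, r in zip(quantile_levels, ratios):
    print(f"p = {p}: exposed/unexposed quantile ratio = {r:.4f}")
```

Every printed ratio is the same number, exp(beta), regardless of the quantile level; a quantile-varying effect would instead let this ratio change with p.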
Affiliation(s)
- Harrison T Reeder
- Biostatistics, Massachusetts General Hospital, 50 Staniford Street, Suite 560, Boston, MA 02114, USA and Department of Medicine, Harvard Medical School, 25 Shattuck Street, Boston, MA 02115, USA
- Kyu Ha Lee
- Departments of Nutrition, Epidemiology, and Biostatistics, Harvard T.H. Chan School of Public Health, 677 Huntington Ave, Boston, MA 02115, USA
- Sebastien Haneuse
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, 677 Huntington Ave, Boston, MA 02115, USA
2
Jiang Q, Basu S. Cure models with adaptive activation for modeling cancer survival. Stat Methods Med Res 2024; 33:227-242. [PMID: 38298015 DOI: 10.1177/09622802231224647] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/02/2024]
Abstract
We propose a class of cure rate models motivated by analysis of colon cancer and triple-negative breast cancer survival data. This class is indexed by an adaptive activation parameter and a function. We establish that the class is stochastically ordered in the activation parameter and also establish two identifiability results for this class. The first- and last-activation models are members of this class, as are many cure rate models proposed in the literature. We illustrate that while first- and last-activation models may perform poorly under model misspecification, the proposed model with adaptive activation provides appropriate inference in these cases. We apply the proposed approach to assess treatment-sex interaction on cure rate in a colon cancer study and to assess the role of tumor heterogeneity and ethnic disparity in breast cancer.
Affiliation(s)
- Qi Jiang
- AbbVie, Inc., North Chicago, IL, USA
- Sanjib Basu
- Division of Epidemiology and Biostatistics, University of Illinois Chicago, IL, USA
3
Sadeqi MB, Ballvora A, Léon J. Local and Bayesian Survival FDR Estimations to Identify Reliable Associations in Whole Genome of Bread Wheat. Int J Mol Sci 2023; 24:14011. [PMID: 37762314 PMCID: PMC10531084 DOI: 10.3390/ijms241814011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2023] [Revised: 09/02/2023] [Accepted: 09/07/2023] [Indexed: 09/29/2023] Open
Abstract
Estimating the FDR significance threshold in genome-wide association studies remains a major challenge in distinguishing true positive hypotheses from false positive and negative errors. Several comparative methods for multiple testing comparison have been developed to determine the significance threshold; however, these methods may be overly conservative and lead to an increase in false negative results. The local FDR approach is suitable for testing many associations simultaneously based on the empirical Bayes perspective. In the local FDR, the maximum likelihood estimator is sensitive to bias when the GWAS model contains two or more explanatory variables as genetic parameters simultaneously. The main criticism of local FDR is that it focuses only locally on the effects of single nucleotide polymorphism (SNP) in tails of distribution, whereas the signal associations are distributed across the whole genome. The advantage of the Bayesian perspective is that knowledge of prior distribution comes from other genetic parameters included in the GWAS model, such as linkage disequilibrium (LD) analysis, minor allele frequency (MAF) and call rate of significant associations. We also proposed Bayesian survival FDR to solve the multi-collinearity and large-scale problems, respectively, in grain yield (GY) vector in bread wheat with large-scale SNP information. The objective of this study was to obtain a short list of SNPs that are reliably associated with GY under low and high levels of nitrogen (N) in the population. The five top significant SNPs were compared with different Bayesian models. Based on the time to events in the Bayesian survival analysis, the differentiation between minor and major alleles within the association panel can be identified.
Affiliation(s)
- Agim Ballvora
- INRES-Plant Breeding, Rheinische Friedrich-Wilhelms-Universität Bonn, 53113 Bonn, Germany; (M.B.S.); (J.L.)
4
Sanz-Puig M, Lázaro E, Armero C, Alvares D, Martínez A, Rodrigo D. S. Typhimurium virulence changes caused by exposure to different non-thermal preservation treatments using C. elegans. Int J Food Microbiol 2017; 262:49-54. [PMID: 28963905 DOI: 10.1016/j.ijfoodmicro.2017.09.006] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 11/24/2016] [Revised: 06/19/2017] [Accepted: 09/10/2017] [Indexed: 11/25/2022]
Abstract
The aims of this research study were: (i) to postulate Caenorhabditis elegans (C. elegans) as a useful organism to describe infection by Salmonella enterica serovar Typhimurium (S. Typhimurium), and (ii) to evaluate changes in virulence of S. Typhimurium when subjected repetitively to different antimicrobial treatments, specifically cauliflower by-product infusion, High Hydrostatic Pressure (HHP), and Pulsed Electric Fields (PEF). This study was carried out by feeding C. elegans with different microbial populations: E. coli OP50 (optimal conditions), untreated S. Typhimurium, S. Typhimurium treated once and three times with cauliflower by-product infusion, S. Typhimurium treated once and four times with HHP, and S. Typhimurium treated once and four times with PEF. Bayesian survival analysis was applied to estimate C. elegans lifespan when fed with the different microbial populations considered. Results showed that C. elegans is a useful organism to describe infection by S. Typhimurium because its lifespan was reduced when it was infected. In addition, the repeated application of antimicrobial treatments generated different responses: when cauliflower by-product infusion and PEF treatment were applied repetitively, the virulence of S. Typhimurium was lower than when the treatment was applied once. In contrast, when HHP treatment was applied repetitively, the virulence of S. Typhimurium was higher than when it was applied once. Nevertheless, in all the populations analyzed, treated S. Typhimurium had lower virulence than untreated S. Typhimurium.
Affiliation(s)
- María Sanz-Puig
- Instituto de Agroquímica y Tecnología de Alimentos (IATA-CSIC), Carrer del Catedràtic Agustín Escardino Benlloch, 7, 46980 Paterna, València, Spain
- Elena Lázaro
- Department of Statistics and Operations Research, Universitat de València, Carrer Doctor Moliner, 50, 46100 Burjassot, Spain
- Carmen Armero
- Department of Statistics and Operations Research, Universitat de València, Carrer Doctor Moliner, 50, 46100 Burjassot, Spain
- Danilo Alvares
- Department of Statistics and Operations Research, Universitat de València, Carrer Doctor Moliner, 50, 46100 Burjassot, Spain
- Antonio Martínez
- Instituto de Agroquímica y Tecnología de Alimentos (IATA-CSIC), Carrer del Catedràtic Agustín Escardino Benlloch, 7, 46980 Paterna, València, Spain
- Dolores Rodrigo
- Instituto de Agroquímica y Tecnología de Alimentos (IATA-CSIC), Carrer del Catedràtic Agustín Escardino Benlloch, 7, 46980 Paterna, València, Spain
5
Abstract
This study aimed to investigate the contributing factors to serious casualty crashes in China. Crashes with more than 10 deaths are defined as serious casualty crashes in China. The serious casualty crash data were collected from 2009 to 2014. A random forest analysis was first conducted to select the candidate variables that affect the risks of serious casualty crashes. A Bayesian random parameters accelerated failure time (AFT) model was then developed to link the probability of a serious casualty crash with road geometric conditions, pavement conditions, environmental characteristics, collision characteristics, vehicle conditions, and driver characteristics. The AFT model estimation results indicate that overload driving, country road, Northwest China region, turnover crash, private car, snowy or icy road surface, and sight distance conditions have significant fixed effects on the likelihood of serious casualty crashes. In addition to these fixed-parameter variables, freeway, clear weather conditions, coach drivers, and upgrade horizontal curve affect the likelihood of serious casualty crashes with varying magnitude across observations. One of the important findings is that the serious casualty crash likelihood does not always decrease with an increase in driving experience (number of years driven). Before the inflection point of 7 years, the serious casualty crash likelihood increases as driving experience grows. The results of this study can help to develop effective countermeasures and policy initiatives for the prevention of serious casualty crashes.
Affiliation(s)
- Chengcheng Xu
- Jiangsu Key Laboratory of Urban ITS, Southeast University, Nanjing, China; Jiangsu Province Collaborative Innovation Center of Modern Urban Traffic Technologies, Southeast University, Nanjing, China
- Jie Bao
- Jiangsu Key Laboratory of Urban ITS, Southeast University, Nanjing, China; Jiangsu Province Collaborative Innovation Center of Modern Urban Traffic Technologies, Southeast University, Nanjing, China
- Pan Liu
- Jiangsu Key Laboratory of Urban ITS, Southeast University, Nanjing, China; Jiangsu Province Collaborative Innovation Center of Modern Urban Traffic Technologies, Southeast University, Nanjing, China
- Wei Wang
- Jiangsu Key Laboratory of Urban ITS, Southeast University, Nanjing, China; Jiangsu Province Collaborative Innovation Center of Modern Urban Traffic Technologies, Southeast University, Nanjing, China
6
Lee KH, Rondeau V, Haneuse S. Accelerated failure time models for semi-competing risks data in the presence of complex censoring. Biometrics 2017; 73:1401-1412. [PMID: 28395116 DOI: 10.1111/biom.12696] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 10/01/2016] [Revised: 03/01/2017] [Accepted: 03/01/2017] [Indexed: 10/19/2022]
Abstract
Statistical analyses that investigate risk factors for Alzheimer's disease (AD) are often subject to a number of challenges. Some of these challenges arise due to practical considerations regarding data collection such that the observation of AD events is subject to complex censoring including left-truncation and either interval or right-censoring. Additional challenges arise due to the fact that study participants under investigation are often subject to competing forces, most notably death, that may not be independent of AD. Towards resolving the latter, researchers may choose to embed the study of AD within the "semi-competing risks" framework for which the recent statistical literature has seen a number of advances including for the so-called illness-death model. To the best of our knowledge, however, the semi-competing risks literature has not fully considered analyses in contexts with complex censoring, as in studies of AD. This is particularly the case when interest lies with the accelerated failure time (AFT) model, an alternative to the traditional multiplicative Cox model that places emphasis away from the hazard function. In this article, we outline a new Bayesian framework for estimation/inference of an AFT illness-death model for semi-competing risks data subject to complex censoring. An efficient computational algorithm that gives researchers the flexibility to adopt either a fully parametric or a semi-parametric model specification is developed and implemented. The proposed methods are motivated by and illustrated with an analysis of data from the Adult Changes in Thought study, an on-going community-based prospective study of incident AD in western Washington State.
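As a rough illustration of what "complex censoring" means for the likelihood, the sketch below computes the log-likelihood contribution of a single observation under left truncation combined with exact, right-censored, or interval-censored event times. It is a simplified univariate example assuming a log-normal AFT model, not the illness-death model or the algorithm of the article; the function names and parameter values are hypothetical.

```python
import math

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def surv(t, mu, sigma):
    """S(t) for a log-normal AFT model: log T = mu + sigma * eps, eps ~ N(0, 1)."""
    if t <= 0:
        return 1.0
    return 1.0 - norm_cdf((math.log(t) - mu) / sigma)

def dens(t, mu, sigma):
    """Density of T at t > 0 under the same model."""
    return norm_pdf((math.log(t) - mu) / sigma) / (sigma * t)

def log_lik(entry, left, right, mu, sigma):
    """Log-likelihood contribution of one subject.

    entry: left-truncation (study entry) time, 0 if none.
    left, right: the event is known to lie in (left, right];
                 left == right means an exactly observed event,
                 right == inf means right-censored at `left`.
    """
    if left == right:
        numerator = dens(left, mu, sigma)
    elif math.isinf(right):
        numerator = surv(left, mu, sigma)
    else:
        numerator = surv(left, mu, sigma) - surv(right, mu, sigma)
    # Conditioning on survival up to study entry handles left truncation.
    return math.log(numerator) - math.log(surv(entry, mu, sigma))

# Illustrative values only.
print(log_lik(entry=2.0, left=5.0, right=math.inf, mu=2.0, sigma=0.5))  # right-censored
print(log_lik(entry=2.0, left=3.0, right=8.0, mu=2.0, sigma=0.5))       # interval-censored
print(log_lik(entry=2.0, left=6.0, right=6.0, mu=2.0, sigma=0.5))       # exact event
```

Dividing by the survival probability at entry is what distinguishes a left-truncated analysis from one that ignores delayed entry; omitting it treats subjects as if they had been followed from time zero.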
Affiliation(s)
- Kyu Ha Lee
- Epidemiology and Biostatistics Core, The Forsyth Institute, Cambridge, Massachusetts, U.S.A.; Department of Oral Health Policy and Epidemiology, Harvard School of Dental Medicine, Boston, Massachusetts, U.S.A.
- Virginie Rondeau
- Centre INSERM U-897-Epidemiologie-Biostatistique, INSERM, Bordeaux, France
- Sebastien Haneuse
- Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, Massachusetts, U.S.A.
7
Lee KH, Dominici F, Schrag D, Haneuse S. Hierarchical models for semi-competing risks data with application to quality of end-of-life care for pancreatic cancer. J Am Stat Assoc 2016; 111:1075-1095. [PMID: 28303074 DOI: 10.1080/01621459.2016.1164052] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Indexed: 10/22/2022]
Abstract
Readmission following discharge from an initial hospitalization is a key marker of quality of health care in the United States. For the most part, readmission has been studied among patients with 'acute' health conditions, such as pneumonia and heart failure, with analyses based on a logistic-Normal generalized linear mixed model (Normand et al., 1997). Naïve application of this model to the study of readmission among patients with 'advanced' health conditions such as pancreatic cancer, however, is problematic because it ignores death as a competing risk. A more appropriate analysis is to embed such a study within the semi-competing risks framework. To our knowledge, however, no comprehensive statistical methods have been developed for cluster-correlated semi-competing risks data. To resolve this gap in the literature, we propose a novel hierarchical modeling framework for the analysis of cluster-correlated semi-competing risks data that permits parametric or non-parametric specifications for a range of components, giving analysts substantial flexibility as they consider their own analyses. Estimation and inference are performed within the Bayesian paradigm since it facilitates the straightforward characterization of (posterior) uncertainty for all model parameters, including hospital-specific random effects. Model comparison and choice are performed via the deviance information criterion and the log-pseudo marginal likelihood statistic, both of which are based on a partially marginalized likelihood. An efficient computational scheme, based on the Metropolis-Hastings-Green algorithm, has been developed and implemented in the SemiCompRisks R package. A comprehensive simulation study shows that the proposed framework performs very well in a range of data scenarios, and outperforms competitor analysis strategies.
The proposed framework is motivated by and illustrated with an on-going study of the risk of readmission among Medicare beneficiaries diagnosed with pancreatic cancer. Using data on n=5,298 patients at J=112 hospitals in the six New England states between 2000-2009, key scientific questions we consider include the role of patient-level risk factors on the risk of readmission and the extent of variation in risk across hospitals not explained by differences in patient case-mix.
Affiliation(s)
- Kyu Ha Lee
- Epidemiology and Biostatistics Core, The Forsyth Institute, Department of Oral Health Policy and Epidemiology, Harvard School of Dental Medicine
- Deborah Schrag
- Department of Medical Oncology, Dana Farber Cancer Institute
- Sebastien Haneuse
- Department of Biostatistics, Harvard T.H. Chan School of Public Health
8
Abstract
Recent studies of (cost-) effectiveness in cardiothoracic transplantation have required estimation of mean survival over the lifetime of the recipients. In order to calculate mean survival, the complete survivor curve is required but is often not fully observed, so that survival extrapolation is necessary. After transplantation, the hazard function is bathtub-shaped, reflecting latent competing risks which operate additively in overlapping time periods. The poly-Weibull distribution is a flexible parametric model that may be used to extrapolate survival and has a natural competing risks interpretation. In addition, treatment effects and subgroups can be modelled separately for each component of risk. We describe the model and develop inference procedures using freely available software. The methods are applied to two problems from cardiothoracic transplantation.
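The additive competing-risks structure described in this abstract can be sketched as follows: the poly-Weibull hazard is the sum of Weibull component hazards, so the survivor function is the product of the component survivor functions, and mean survival is the area under the extrapolated curve. The component parameters, function names, and integration settings below are hypothetical choices for illustration, not estimates from the study.

```python
import math

# Hypothetical components as (shape, scale): an early risk with a decreasing
# hazard (shape < 1) and a late risk with an increasing hazard (shape > 1).
COMPONENTS = [(0.5, 5.0), (3.0, 10.0)]

def hazard(t, components=COMPONENTS):
    """Poly-Weibull hazard at t > 0: the sum of the Weibull component hazards."""
    return sum((k / s) * (t / s) ** (k - 1) for k, s in components)

def survival(t, components=COMPONENTS):
    """Poly-Weibull survivor function: exp of minus the summed cumulative hazards."""
    return math.exp(-sum((t / s) ** k for k, s in components))

def mean_survival(upper=60.0, steps=6000, components=COMPONENTS):
    """Trapezoidal approximation to the integral of S(t) over [0, upper]."""
    h = upper / steps
    total = 0.5 * (survival(0.0, components) + survival(upper, components))
    total += sum(survival(i * h, components) for i in range(1, steps))
    return total * h

# The hazard is high early, dips, then rises again (bathtub shape).
for t in (0.1, 3.0, 15.0):
    print(f"t = {t}: hazard = {hazard(t):.4f}, survival = {survival(t):.4f}")
print(f"approximate mean survival: {mean_survival():.3f}")
```

Because each component has its own shape and scale, covariates could in principle be attached to one component without affecting the other, which is the sense in which treatment effects can be modelled separately for each component of risk.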
Affiliation(s)
- David Lunn
- Medical Research Council Biostatistics Unit, Cambridge, UK
9
Lee KH, Haneuse S, Schrag D, Dominici F. Bayesian Semi-parametric Analysis of Semi-competing Risks Data: Investigating Hospital Readmission after a Pancreatic Cancer Diagnosis. J R Stat Soc Ser C Appl Stat 2014; 64:253-273. [PMID: 25977592 DOI: 10.1111/rssc.12078] [Citation(s) in RCA: 43] [Impact Index Per Article: 4.3] [Indexed: 12/31/2022]
Abstract
In the U.S., the Centers for Medicare and Medicaid Services uses 30-day readmission, following hospitalization, as a proxy outcome to monitor quality of care. These efforts generally focus on treatable health conditions, such as pneumonia and heart failure. Expanding quality of care systems to monitor conditions for which treatment options are limited or non-existent, such as pancreatic cancer, is challenging because of the non-trivial force of mortality; 30-day mortality for pancreatic cancer is approximately 30%. In the statistical literature, data that arise when the observation of the time to some non-terminal event is subject to some terminal event are referred to as 'semi-competing risks data'. Given such data, scientific interest may lie in at least one of three areas: (i) estimation/inference for regression parameters, (ii) characterization of dependence between the two events, and (iii) prediction given a covariate profile. Existing statistical methods focus almost exclusively on the first of these; methods are sparse or non-existent, however, when interest lies with understanding dependence and performing prediction. In this paper we propose a Bayesian semi-parametric regression framework for analyzing semi-competing risks data that permits the simultaneous investigation of all three of the aforementioned scientific goals. Characterization of the induced posterior and posterior predictive distributions is achieved via an efficient Metropolis-Hastings-Green algorithm, which has been implemented in an R package. The proposed framework is applied to data on 16,051 individuals diagnosed with pancreatic cancer between 2005-2008, obtained from Medicare Part A. We found that increased risk for readmission is associated with a high comorbidity index, a long hospital stay at initial hospitalization, non-white race, male, and discharge to home care.
Affiliation(s)
- Kyu Ha Lee
- Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts, USA
- Sebastien Haneuse
- Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts, USA
- Deborah Schrag
- Dana-Farber Cancer Institute, Boston, Massachusetts, USA
- Francesca Dominici
- Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts, USA
10
Bouman P, Meng XL, Dignam J, Dukić V. A Multiresolution Hazard Model for Multicenter Survival Studies: Application to Tamoxifen Treatment in Early Stage Breast Cancer. J Am Stat Assoc 2007; 102:1145-1157. [PMID: 25620824 DOI: 10.1198/016214506000000951] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/21/2022]
Abstract
In multicenter studies, one often needs to make inference about a population survival curve based on multiple, possibly heterogeneous survival data from individual centers. We investigate a flexible Bayesian method for estimating a population survival curve based on a semiparametric multiresolution hazard model that can incorporate covariates and account for center heterogeneity. The method yields a smooth estimate of the survival curve for "multiple resolutions" or time scales of interest. The Bayesian model used has the capability to accommodate general forms of censoring and a priori smoothness assumptions. We develop a model checking and diagnostic technique based on the posterior predictive distribution and use it to identify departures from the model assumptions. The hazard estimator is used to analyze data from 110 centers that participated in a multicenter randomized clinical trial to evaluate tamoxifen in the treatment of early stage breast cancer. Of particular interest are the estimates of center heterogeneity in the baseline hazard curves and in the treatment effects, after adjustment for a few key clinical covariates. Our analysis suggests that the treatment effect estimates are rather robust, even for a collection of small trial centers, despite variations in center characteristics.
Affiliation(s)
- Peter Bouman
- Marketing Department, Kellogg School of Management, Northwestern University, Evanston, IL 60208
- Xiao-Li Meng
- Department of Statistics, Harvard University, Cambridge, MA 02138
- James Dignam
- Department of Health Studies, University of Chicago, Chicago, IL 60637
- Vanja Dukić
- Department of Health Studies, University of Chicago, Chicago, IL 60637