1
Cho S, Psioda MA, Ibrahim JG. Bayesian joint modeling of multivariate longitudinal and survival outcomes using Gaussian copulas. Biostatistics 2024:kxae009. [PMID: 38669589 DOI: 10.1093/biostatistics/kxae009] [Received: 02/01/2023] [Revised: 03/06/2024] [Accepted: 03/11/2024] [Indexed: 04/28/2024]
Abstract
There is an increasing interest in the use of joint models for the analysis of longitudinal and survival data. While random effects models have been extensively studied, these models can be hard to implement and the fixed effect regression parameters must be interpreted conditional on the random effects. Copulas provide a useful alternative framework for joint modeling. One advantage of using copulas is that practitioners can directly specify marginal models for the outcomes of interest. We develop a joint model using a Gaussian copula to characterize the association between multivariate longitudinal and survival outcomes. Rather than using an unstructured correlation matrix in the copula model to characterize dependence structure as is common, we propose a novel decomposition that allows practitioners to impose structure (e.g., auto-regressive) which provides efficiency gains in small to moderate sample sizes and reduces computational complexity. We develop a Markov chain Monte Carlo model fitting procedure for estimation. We illustrate the method's value using a simulation study and present a real data analysis of longitudinal quality of life and disease-free survival data from an International Breast Cancer Study Group trial.
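The copula idea in this abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' model or fitting procedure: it assumes an AR(1) correlation structure on the latent Gaussian scale, and the marginals (normal quality-of-life scores, an exponential event time) and all function and parameter names are hypothetical choices made only for illustration.

```python
import math
import random
from statistics import NormalDist


def ar1_gaussian_copula_sample(n_dims, rho, rng):
    """Draw uniforms whose dependence is a Gaussian copula with
    AR(1) correlation, i.e. corr(z_s, z_t) = rho ** |s - t|."""
    std_normal = NormalDist()
    z = [rng.gauss(0.0, 1.0)]
    for _ in range(1, n_dims):
        # Stationary AR(1) recursion: every z_t stays marginally N(0, 1).
        z.append(rho * z[-1] + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0))
    # Probability integral transform, clamped away from 0 and 1 so the
    # inverse CDFs below are always defined.
    eps = 1e-12
    return [min(max(std_normal.cdf(v), eps), 1.0 - eps) for v in z]


def draw_subject(rng, rho=0.6, n_visits=3, event_rate=0.1):
    """Simulate one subject: n_visits longitudinal scores plus one event time.

    The marginals are specified directly (the advantage the abstract
    highlights); the copula only supplies the dependence.
    """
    u = ar1_gaussian_copula_sample(n_visits + 1, rho, rng)
    score_marginal = NormalDist(50.0, 10.0)  # hypothetical longitudinal marginal
    scores = [score_marginal.inv_cdf(p) for p in u[:n_visits]]
    # Hypothetical exponential survival marginal via its inverse CDF.
    event_time = -math.log(1.0 - u[-1]) / event_rate
    return scores, event_time


if __name__ == "__main__":
    rng = random.Random(2024)
    scores, event_time = draw_subject(rng)
    print(scores, event_time)
```

A structured correlation such as AR(1) needs a single dependence parameter here, whereas an unstructured matrix over four outcomes would need six, which is the kind of saving the abstract attributes to imposing structure.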
Affiliation(s)
- Seoyoon Cho
  - Department of Biostatistics, University of North Carolina, McGavran-Greenberg Hall, CB#7420, Chapel Hill, NC 27599, United States
- Matthew A Psioda
  - Statistics and Data Science Innovation Hub, GlaxoSmithKline, Philadelphia, PA 19426, United States
- Joseph G Ibrahim
  - Department of Biostatistics, University of North Carolina, McGavran-Greenberg Hall, CB#7420, Chapel Hill, NC 27599, United States
2
Chen X, Nifong B, Alt EM, Psioda MA, Ibrahim JG. Bayesian design of clinical trials using the scale transformed power prior. J Biopharm Stat 2024:1-20. [PMID: 38639571 DOI: 10.1080/10543406.2024.2330205] [Received: 03/01/2024] [Accepted: 03/01/2024] [Indexed: 04/20/2024]
Abstract
There are many Bayesian design methods allowing for the incorporation of historical data for sample size determination (SSD) in situations where the outcome in the historical data is the same as the outcome of a new study. However, there is a dearth of methods supporting the incorporation of data from a previously completed clinical trial that investigated the same or similar treatment as the new trial but had a primary outcome that is different. We propose a simulation-based Bayesian SSD framework using the partial-borrowing scale transformed power prior (straPP). The partial-borrowing straPP is developed by applying a novel scale transformation to a traditional power prior on the parameters from the historical data model to make the information better align with the new data model. The scale transformation is based on the assumption that the standardized parameters (i.e., parameters multiplied by the square roots of their respective Fisher information matrices) are equal. To illustrate the method, we present results from simulation studies that use real data from a previously completed clinical trial to design a new clinical trial with a primary time-to-event endpoint.
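For readers unfamiliar with the traditional power prior that the straPP builds on, the following minimal sketch shows how a discounting parameter a0 controls borrowing from historical data. It assumes a normal mean with known standard deviation and a flat initial prior, so the power prior is proportional to the historical likelihood raised to the power a0. It does not implement the scale transformation or the partial-borrowing straPP, and the function name and arguments are hypothetical.

```python
import math


def power_prior_posterior(hist_mean, hist_n, new_mean, new_n, sigma, a0):
    """Posterior for a normal mean under a traditional power prior.

    Assumes known sigma and a flat initial prior, so the historical
    likelihood raised to a0 acts like a0 * hist_n extra observations.
    Returns (posterior mean, posterior standard deviation).
    """
    if not 0.0 <= a0 <= 1.0:
        raise ValueError("a0 must lie in [0, 1]")
    effective_n = a0 * hist_n + new_n
    if effective_n <= 0:
        raise ValueError("need at least some data after discounting")
    post_mean = (a0 * hist_n * hist_mean + new_n * new_mean) / effective_n
    post_sd = sigma / math.sqrt(effective_n)
    return post_mean, post_sd


if __name__ == "__main__":
    # a0 = 0 ignores the historical study; a0 = 1 pools it fully.
    for a0 in (0.0, 0.5, 1.0):
        print(a0, power_prior_posterior(1.0, 100, 2.0, 100, 1.0, a0))
```

In this conjugate special case the posterior is available in closed form; a simulation-based sample size search such as the one described in the abstract would typically repeat a calculation of this kind over many simulated new trials.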
Affiliation(s)
- Xinxin Chen
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Brady Nifong
  - Non-Clinical and Translational Statistics, GSK, Collegeville, PA, USA
- Ethan M Alt
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Matthew A Psioda
  - Statistics and Data Science Innovation Hub, GSK, Collegeville, PA, USA
- Joseph G Ibrahim
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
3
Tan X, Wang W, Zeng D, Liu GF, Diao G, Jafari N, Alt EM, Ibrahim JG. Safety signal detection with control of latent factors. Stat Med 2024; 43:1397-1418. [PMID: 38297431 DOI: 10.1002/sim.10015] [Received: 01/23/2023] [Revised: 10/26/2023] [Accepted: 12/27/2023] [Indexed: 02/02/2024]
Abstract
Postmarket drug safety databases such as the Vaccine Adverse Event Reporting System (VAERS) collect thousands of spontaneous reports annually, with each report recording occurrences of any adverse events (AEs) and use of vaccines. Using such data, we hope to identify signal vaccine-AE pairs, for which certain vaccines are statistically associated with certain AEs. Thus, the outcomes of interest are multiple AEs, which are binary outcomes and could be correlated because they might share certain latent factors, and the primary covariates are vaccines. Appropriately accounting for the complex correlation among AEs could improve the sensitivity and specificity of identifying signal vaccine-AE pairs. We propose a two-step approach in which we first estimate the shared latent factors among AEs using a working multivariate logistic regression model, and then use univariate logistic regression models to examine the vaccine-AE associations after controlling for the latent factors. Our simulation studies show that this approach outperforms current approaches in terms of sensitivity and specificity. We apply our approach to VAERS data and report our findings.
Affiliation(s)
- Xianming Tan
  - Department of Biostatistics at Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- William Wang
  - Merck and Co., Inc., North Wales, Pennsylvania, USA
- Donglin Zeng
  - Department of Biostatistics at Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Guoqing Diao
  - Department of Biostatistics and Bioinformatics, Milken Institute School of Public Health, George Washington University, Washington, DC, USA
- Ethan M Alt
  - Department of Biostatistics at Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Joseph G Ibrahim
  - Department of Biostatistics at Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
4
Weideman AMK, Wang R, Ibrahim JG, Jiang Y. Canopy2: tumor phylogeny inference by bulk DNA and single-cell RNA sequencing. bioRxiv 2024:2024.03.18.585595. [PMID: 38562795 PMCID: PMC10983938 DOI: 10.1101/2024.03.18.585595] [Indexed: 04/04/2024]
Abstract
Tumors are composed of a mixture of distinct cell populations that differ in terms of genetic makeup and function. Such heterogeneity plays a role in the development of drug resistance and the ineffectiveness of targeted cancer therapies. Insight into this complexity can be obtained through the construction of a phylogenetic tree, which illustrates the evolutionary lineage of tumor cells as they acquire mutations over time. We propose Canopy2, a Bayesian framework that uses single nucleotide variants derived from bulk DNA and single-cell RNA sequencing to infer tumor phylogeny and conduct mutational profiling of tumor subpopulations. Canopy2 uses Markov chain Monte Carlo methods to sample from a joint probability distribution involving a mixture of binomial and beta-binomial distributions, specifically chosen to account for the sparsity and stochasticity of the single-cell data. Canopy2 demystifies the sources of zeros in the single-cell data and separates zeros categorized as non-cancerous (cells without mutations), stochastic (mutations not expressed due to bursting), and technical (expressed mutations not picked up by sequencing). Simulations demonstrate that Canopy2 consistently outperforms competing methods and reconstructs the clonal tree with high fidelity, even in situations involving low sequencing depth, poor single-cell yield, and highly advanced and polyclonal tumors. We further assess the performance of Canopy2 through application to breast cancer and glioblastoma data, benchmarking against existing methods. Canopy2 is an open-source R package available at https://github.com/annweideman/canopy2.
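To see why a beta-binomial component helps with sparse read counts, the sketch below compares the probability of observing zero variant reads under a binomial and under a beta-binomial with the same mean. This is only an illustration of the two distributions named in the abstract, not the Canopy2 likelihood; the read depth and shape parameters are hypothetical.

```python
import math


def binomial_pmf(k, n, p):
    """P(K = k) for K ~ Binomial(n, p)."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)


def log_beta(a, b):
    """Natural log of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_binomial_pmf(k, n, a, b):
    """P(K = k) when K ~ Binomial(n, p) and p ~ Beta(a, b)."""
    return math.comb(n, k) * math.exp(log_beta(k + a, n - k + b) - log_beta(a, b))


if __name__ == "__main__":
    depth = 10          # hypothetical total reads at a site in one cell
    a, b = 0.5, 4.5     # hypothetical shape parameters, mean a / (a + b) = 0.1
    mean_p = a / (a + b)
    print("binomial P(0 variant reads):     ", binomial_pmf(0, depth, mean_p))
    print("beta-binomial P(0 variant reads):", beta_binomial_pmf(0, depth, a, b))
```

Because the beta-binomial lets the success probability vary from cell to cell, it places more mass on zero counts than a binomial with the same mean, which is one way to accommodate zeros arising from bursty expression.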
Affiliation(s)
- Ann Marie K. Weideman
  - Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Rujin Wang
  - Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Joseph G. Ibrahim
  - Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Yuchao Jiang
  - Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
  - Department of Genetics, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
5
Heiling HM, Rashid NU, Li Q, Peng XL, Yeh JJ, Ibrahim JG. Efficient computation of high-dimensional penalized generalized linear mixed models by latent factor modeling of the random effects. Biometrics 2024; 80:ujae016. [PMID: 38497825 PMCID: PMC10946237 DOI: 10.1093/biomtc/ujae016] [Received: 04/30/2023] [Revised: 11/22/2023] [Accepted: 02/16/2024] [Indexed: 03/19/2024]
Abstract
Modern biomedical datasets are increasingly high-dimensional and exhibit complex correlation structures. Generalized linear mixed models (GLMMs) have long been employed to account for such dependencies. However, proper specification of the fixed and random effects in GLMMs is increasingly difficult in high dimensions, and computational complexity grows with increasing dimension of the random effects. We present a novel reformulation of the GLMM using a factor model decomposition of the random effects, enabling scalable computation of GLMMs in high dimensions by reducing the latent space from a large number of random effects to a smaller set of latent factors. We also extend our prior work to estimate model parameters using a modified Monte Carlo Expectation Conditional Minimization algorithm, allowing us to perform variable selection on both the fixed and random effects simultaneously. We show through simulation that, with this factor model decomposition, our method can fit high-dimensional penalized GLMMs faster than comparable methods and scales more easily to dimensions larger than those handled by existing approaches.
Affiliation(s)
- Hillary M Heiling
  - Department of Biostatistics, University of North Carolina Chapel Hill, Chapel Hill, NC 27599, United States
- Naim U Rashid
  - Department of Biostatistics, University of North Carolina Chapel Hill, Chapel Hill, NC 27599, United States
- Quefeng Li
  - Department of Biostatistics, University of North Carolina Chapel Hill, Chapel Hill, NC 27599, United States
- Xianlu L Peng
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, United States
- Jen Jen Yeh
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, United States
  - Department of Surgery, University of North Carolina Chapel Hill, Chapel Hill, NC 27599, United States
  - Department of Pharmacology, University of North Carolina Chapel Hill, Chapel Hill, NC 27599, United States
- Joseph G Ibrahim
  - Department of Biostatistics, University of North Carolina Chapel Hill, Chapel Hill, NC 27599, United States
6
LaVange LM, Alt EM, Ibrahim JG. Discussion of "Optimal test procedures for multiple hypotheses controlling the familywise expected loss" by Willi Maurer, Frank Bretz, and Xiaolei Xun. Biometrics 2023; 79:2802-2805. [PMID: 37488695 DOI: 10.1111/biom.13910] [Received: 11/15/2022] [Accepted: 01/30/2023] [Indexed: 07/26/2023]
Abstract
We provide commentary on the paper by Willi Maurer, Frank Bretz, and Xiaolei Xun entitled "Optimal test procedures for multiple hypotheses controlling the familywise expected loss." The authors provide an excellent discussion of the multiplicity problem in clinical trials and propose a novel approach based on a decision-theoretic framework that incorporates loss functions that can vary across multiple hypotheses in a family. We provide some considerations for the practical use of the authors' proposed methods as well as some alternative methods that may also be of interest in this setting.
Affiliation(s)
- L M LaVange
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- E M Alt
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- J G Ibrahim
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
7
Xu J, Psioda MA, Ibrahim JG. Bayesian design of clinical trials using joint models for recurrent and terminating events. Biostatistics 2023; 24:866-884. [PMID: 35851911 DOI: 10.1093/biostatistics/kxac025] [Received: 06/10/2021] [Revised: 06/19/2022] [Accepted: 06/22/2022] [Indexed: 10/19/2023]
Abstract
Joint models for recurrent event and terminating event data are increasingly used for the analysis of clinical trials. However, few methods have been proposed for designing clinical trials using these models. In this article, we develop a Bayesian clinical trial design methodology focused on evaluating the effect of an investigational product (IP) on both recurrent event and terminating event processes considered as multiple primary endpoints, using a multifrailty joint model. Dependence between the recurrent and terminating event processes is accounted for using a shared frailty. Inferences for the multiple primary outcomes are based on posterior model probabilities corresponding to mutually exclusive hypotheses regarding the benefit of IP with respect to the recurrent and terminating event processes. We propose an approach for sample size determination to ensure the trial design has a high power and a well-controlled type I error rate, with both operating characteristics defined from a Bayesian perspective. We also consider a generalization of the proposed parametric model that uses a nonparametric mixture of Dirichlet processes to model the frailty distributions and compare its performance to the proposed approach. We demonstrate the methodology by designing a colorectal cancer clinical trial with a goal of demonstrating that the IP causes a favorable effect on at least one of the two outcomes but no harm on either.
Affiliation(s)
- Matthew A Psioda
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Joseph G Ibrahim
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
8
Bean NW, Ibrahim JG, Psioda MA. Bayesian joint models for multi-regional clinical trials. Biostatistics 2023:kxad023. [PMID: 37669215 DOI: 10.1093/biostatistics/kxad023] [Received: 11/30/2022] [Revised: 08/09/2023] [Accepted: 08/10/2023] [Indexed: 09/07/2023]
Abstract
In recent years, multi-regional clinical trials (MRCTs) have increased in popularity in the pharmaceutical industry due to their ability to accelerate the global drug development process. To address potential challenges with MRCTs, the International Council for Harmonisation released the E17 guidance document which suggests the use of statistical methods that utilize information borrowing across regions if regional sample sizes are small. We develop an approach that allows for information borrowing via Bayesian model averaging in the context of a joint analysis of survival and longitudinal data from MRCTs. In this novel application of joint models to MRCTs, we use Laplace's method to integrate over subject-specific random effects and to approximate posterior distributions for region-specific treatment effects on the time-to-event outcome. Through simulation studies, we demonstrate that the joint modeling approach can result in an increased rejection rate when testing the global treatment effect compared with methods that analyze survival data alone. We then apply the proposed approach to data from a cardiovascular outcomes MRCT.
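Laplace's method, which the abstract uses to integrate over subject-specific random effects, replaces an integral of exp(h(x)) with a Gaussian integral centered at the mode of h. The sketch below applies the one-dimensional version to a textbook integral (the factorial), where the exact answer is known; it is not the joint-model computation from the paper, and the function names are hypothetical.

```python
import math


def laplace_log_integral(h_at_mode, neg_second_derivative_at_mode):
    """Log of the 1-D Laplace approximation to the integral of exp(h(x)).

    The approximation is exp(h(m)) * sqrt(2 * pi / -h''(m)), where m is
    the mode of h; the caller supplies h(m) and -h''(m) > 0.
    """
    if neg_second_derivative_at_mode <= 0:
        raise ValueError("h must be strictly concave at its mode")
    return h_at_mode + 0.5 * math.log(2.0 * math.pi / neg_second_derivative_at_mode)


def laplace_log_factorial(n):
    """Approximate log(n!) from n! = integral over x > 0 of x**n * exp(-x).

    Here h(x) = n * log(x) - x, with mode at x = n and -h''(n) = 1 / n.
    """
    mode = float(n)
    h_at_mode = n * math.log(mode) - mode
    return laplace_log_integral(h_at_mode, 1.0 / n)


if __name__ == "__main__":
    for n in (5, 20, 100):
        exact = math.lgamma(n + 1)
        print(n, exact, laplace_log_factorial(n))
```

The approximation improves as the integrand becomes more sharply peaked, which in random-effects settings generally corresponds to having more observations per subject.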
Affiliation(s)
- Nathan W Bean
  - Department of Biostatistics, University of North Carolina, Chapel Hill, NC, USA
- Joseph G Ibrahim
  - Department of Biostatistics, University of North Carolina, Chapel Hill, NC, USA
- Matthew A Psioda
  - Department of Biostatistics, University of North Carolina, Chapel Hill, NC, USA
9
Lee E, Ibrahim JG, Zhu H. Bayesian bi-level variable selection for genome-wide survival study. Genomics Inform 2023; 21:e28. [PMID: 37813624 PMCID: PMC10584651 DOI: 10.5808/gi.23047] [Received: 06/13/2023] [Revised: 06/26/2023] [Accepted: 06/27/2023] [Indexed: 10/11/2023]
Abstract
Mild cognitive impairment (MCI) is a clinical syndrome characterized by the onset and evolution of cognitive impairments, often considered a transitional stage to Alzheimer's disease (AD). The genetic traits of MCI patients who experience a rapid progression to AD can enhance early diagnosis capabilities and facilitate drug discovery for AD. While a genome-wide association study (GWAS) is a standard tool for identifying single nucleotide polymorphisms (SNPs) related to a disease, it fails to detect SNPs with small effect sizes due to stringent control for multiple testing. Additionally, the method does not consider the group structures of SNPs, such as genes or linkage disequilibrium blocks, which can provide valuable insights into the genetic architecture. To address the limitations, we propose a Bayesian bi-level variable selection method that detects SNPs associated with time of conversion from MCI to AD. Our approach integrates group inclusion indicators into an accelerated failure time model to identify important SNP groups. Additionally, we employ data augmentation techniques to impute censored time values using a predictive posterior. We adapt Dirichlet-Laplace shrinkage priors to incorporate the group structure for SNP-level variable selection. In the simulation study, our method outperformed other competing methods regarding variable selection. The analysis of Alzheimer's Disease Neuroimaging Initiative (ADNI) data revealed several genes directly or indirectly related to AD, whereas a classical GWAS did not identify any significant SNPs.
Affiliation(s)
- Eunjee Lee
  - Department of Information and Statistics, Chungnam National University, Daejeon 34134, Korea
- Joseph G. Ibrahim
  - Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599, USA
- Hongtu Zhu
  - Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599, USA
10
Vincent BG, File DM, McKinnon KP, Moore DT, Frelinger JA, Collins EJ, Ibrahim JG, Bixby L, Reisdorf S, Laurie SJ, Park YA, Anders CK, Collichio FA, Muss HB, Carey LA, van Deventer HW, Dees EC, Serody JS. Efficacy of a Dual-Epitope Dendritic Cell Vaccine as Part of Combined Immunotherapy for HER2-Expressing Breast Tumors. J Immunol 2023:263816. [PMID: 37204246 DOI: 10.4049/jimmunol.2300077] [Received: 01/30/2023] [Accepted: 05/02/2023] [Indexed: 05/20/2023]
Abstract
Previous work from our group and others has shown that patients with breast cancer can generate a T cell response against specific human epidermal growth factor receptor 2 (HER2) epitopes. In addition, preclinical work has shown that this T cell response can be augmented by Ag-directed mAb therapy. This study evaluated the activity and safety of a combination of dendritic cell (DC) vaccination given with mAb and cytotoxic therapy. We performed a phase I/II study using autologous DCs pulsed with two different HER2 peptides given with trastuzumab and vinorelbine to one study cohort of patients with HER2-overexpressing metastatic breast cancer and a second cohort with HER2-nonoverexpressing metastatic breast cancer. Seventeen patients with HER2-overexpressing and seven with nonoverexpressing disease were treated. Treatment was well tolerated, with one patient removed from therapy because of toxicity and no deaths. Forty-six percent of patients had stable disease after therapy, with 4% achieving a partial response and no complete responses. Immune responses were generated in the majority of patients but did not correlate with clinical response. However, in one patient, who has survived >14 y since treatment in the trial, a robust immune response was demonstrated, with 25% of her T cells specific to one of the peptides in the vaccine at the peak of her response. These data suggest that autologous DC vaccination when given with anti-HER2-directed mAb therapy and vinorelbine is safe and can induce immune responses, including significant T cell clonal expansion, in a subset of patients.
Affiliation(s)
- Benjamin G Vincent
  - Division of Hematology, Department of Medicine, University of North Carolina, Chapel Hill, NC
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC
  - Department of Microbiology and Immunology, UNC School of Medicine, Marsico Hall, Chapel Hill, NC
  - Program in Computational Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC
- Danielle M File
  - Division of Oncology, Department of Medicine, University of North Carolina, Chapel Hill, NC
- Karen P McKinnon
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC
- Dominic T Moore
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC
- Jeffrey A Frelinger
  - Department of Microbiology and Immunology, UNC School of Medicine, Marsico Hall, Chapel Hill, NC
- Edward J Collins
  - Department of Microbiology and Immunology, UNC School of Medicine, Marsico Hall, Chapel Hill, NC
- Joseph G Ibrahim
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC
- Lisa Bixby
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC
- Shannon Reisdorf
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC
- Sonia J Laurie
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC
- Yara A Park
  - Department of Pathology and Laboratory Medicine, University of North Carolina, Chapel Hill, NC
- Carey K Anders
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC
  - Division of Oncology, Department of Medicine, University of North Carolina, Chapel Hill, NC
- Frances A Collichio
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC
  - Division of Oncology, Department of Medicine, University of North Carolina, Chapel Hill, NC
- Hyman B Muss
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC
  - Division of Oncology, Department of Medicine, University of North Carolina, Chapel Hill, NC
- Lisa A Carey
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC
  - Division of Oncology, Department of Medicine, University of North Carolina, Chapel Hill, NC
- Hendrik W van Deventer
  - Division of Hematology, Department of Medicine, University of North Carolina, Chapel Hill, NC
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC
- E Claire Dees
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC
  - Division of Oncology, Department of Medicine, University of North Carolina, Chapel Hill, NC
- Jonathan S Serody
  - Division of Hematology, Department of Medicine, University of North Carolina, Chapel Hill, NC
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC
  - Department of Microbiology and Immunology, UNC School of Medicine, Marsico Hall, Chapel Hill, NC
  - Program in Computational Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC
11
Hauser P, Tan X, Chen F, Ibrahim JG. Bayesian generalized linear low rank regression models for the detection of vaccine-adverse event associations. Stat Med 2023; 42:2009-2026. [PMID: 36974659 DOI: 10.1002/sim.9711] [Received: 07/26/2022] [Revised: 02/27/2023] [Accepted: 03/07/2023] [Indexed: 03/29/2023]
Abstract
We propose a generalized linear low-rank mixed model (GLLRM) for the analysis of both high-dimensional and sparse responses and covariates where the responses may be binary, counts, or continuous. This development is motivated by the problem of identifying vaccine-adverse event associations in post-market drug safety databases, where an adverse event is any untoward medical occurrence or health problem that occurs during or following vaccination. The GLLRM is a generalization of a generalized linear mixed model in that it integrates a factor analysis model to describe the dependence among responses and a low-rank matrix to approximate the high-dimensional regression coefficient matrix. A sampling procedure combining the Gibbs sampler and Metropolis and Gamerman algorithms is employed to obtain posterior estimates of the regression coefficients and other model parameters. Testing of response-covariate pair associations is based on the posterior distribution of the corresponding regression coefficients. Monte Carlo simulation studies are conducted to examine the finite-sample performance of the proposed procedures on binary and count outcomes. We further illustrate the GLLRM via a real data example based on the Vaccine Adverse Event Reporting System.
Affiliation(s)
- Paloma Hauser
  - Department of Biostatistics, University of North Carolina, Chapel Hill, 27599, North Carolina, USA
- Xianming Tan
  - Department of Biostatistics, University of North Carolina, Chapel Hill, 27599, North Carolina, USA
- Fang Chen
  - SAS Institute Inc., Cary, 27513, North Carolina, USA
- Joseph G Ibrahim
  - Department of Biostatistics, University of North Carolina, Chapel Hill, 27599, North Carolina, USA
12
Alt EM, Psioda MA, Ibrahim JG. A Bayesian approach to study design and analysis with type I error rate control for response variables of mixed types. Stat Med 2023; 42:1722-1740. [PMID: 36929939 DOI: 10.1002/sim.9696] [Received: 04/30/2022] [Revised: 11/29/2022] [Accepted: 02/20/2023] [Indexed: 03/18/2023]
Abstract
There has been increased interest in the design and analysis of studies consisting of multiple response variables of mixed types. For example, in clinical trials, it is desirable to establish efficacy for a treatment effect in primary and secondary outcomes. In this article, we develop Bayesian approaches for hypothesis testing and study planning for data consisting of multiple response variables of mixed types with covariates. We assume that the responses are correlated via a Gaussian copula, and that the model for each response is, marginally, a generalized linear model (GLM). Taking a fully Bayesian approach, the proposed method enables inference based on the joint posterior distribution of the parameters. Under some mild conditions, we show that the joint distribution of the posterior probabilities under any Bayesian analysis converges to a Gaussian copula distribution as the sample size tends to infinity. Using this result, we develop an approach to control the type I error rate under multiple testing. Simulation results indicate that the method is more powerful than fitting marginal regression models and correcting for multiplicity using the Bonferroni-Holm method. We also develop a Bayesian approach to sample size determination in the presence of response variables of mixed types, extending the concept of probability of success (POS) to multiple response variables of mixed types.
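The Bonferroni-Holm correction mentioned as the comparator can be sketched in a few lines. This shows only the standard step-down procedure applied to marginal p-values, not the copula-based method proposed in the abstract; the function name and example p-values are hypothetical.

```python
def holm_reject(p_values, alpha=0.05):
    """Bonferroni-Holm step-down procedure.

    Sort the p-values in increasing order and compare the one at
    0-based rank r with alpha / (m - r); stop at the first failure.
    Returns a list of booleans aligned with the input order.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # every larger p-value is also retained
    return reject


if __name__ == "__main__":
    # Hypothetical p-values for three endpoints.
    print(holm_reject([0.01, 0.02, 0.03]))
    print(holm_reject([0.01, 0.04, 0.03]))
```

Holm controls the familywise error rate without using any information about the correlation among the test statistics, which is why a method that models that correlation, as the abstract describes, can be more powerful.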
Affiliation(s)
- Ethan M Alt
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Matthew A Psioda
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Joseph G Ibrahim
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
13
|
Shen Y, Psioda MA, Ibrahim JG. BayesPPD: An R Package for Bayesian Sample Size Determination Using the Power and Normalized Power Prior for Generalized Linear Models. The R Journal 2023. [DOI: 10.32614/rj-2023-016] [Indexed: 03/04/2023]
Affiliation(s)
- Yueqi Shen
  - University of North Carolina at Chapel Hill
|
14
|
Sheikh MT, Chen MH, Gelfond JA, Sun W, Ibrahim JG. New C-indices for assessing importance of longitudinal biomarkers in fitting competing risks survival data in the presence of partially masked causes. Stat Med 2023; 42:1308-1322. [PMID: 36696954 DOI: 10.1002/sim.9671] [Received: 08/25/2021] [Revised: 12/20/2022] [Accepted: 01/13/2023] [Indexed: 01/27/2023]
Abstract
Competing risks survival data in the presence of partially masked causes are frequently encountered in medical research or clinical trials. When longitudinal biomarkers are also available, it is of great clinical importance to examine associations between the longitudinal biomarkers and the cause-specific survival outcomes. In this article, we propose a cause-specific C-index for joint models of longitudinal and competing risks survival data accounting for masked causes. We also develop a posterior predictive algorithm for computing the out-of-sample cause-specific C-index using Markov chain Monte Carlo samples from the joint posterior of the in-sample longitudinal and competing risks survival data. We further construct the ΔC-index to quantify the strength of association between the longitudinal and cause-specific survival data, or between the out-of-sample longitudinal and survival data. Empirical performance of the proposed assessment criteria is examined through an extensive simulation study. An in-depth analysis of the real data from large cancer prevention trials is carried out to demonstrate the usefulness of the proposed methodology.
Affiliation(s)
- Md Tuhin Sheikh
  - Department of Statistics, University of Connecticut, Storrs, Connecticut, USA
- Ming-Hui Chen
  - Department of Statistics, University of Connecticut, Storrs, Connecticut, USA
- Jonathan A Gelfond
  - Department of Epidemiology and Biostatistics, University of Texas Health, Houston, Texas, USA
- Wei Sun
  - Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, Washington, USA
- Joseph G Ibrahim
  - Department of Biostatistics, University of North Carolina Chapel Hill, Chapel Hill, North Carolina, USA
|
15
|
Alt EM, Nifong B, Chen X, Psioda MA, Ibrahim JG. The scale transformed power prior for use with historical data from a different outcome model. Stat Med 2023; 42:1-14. [PMID: 36318875 PMCID: PMC9789178 DOI: 10.1002/sim.9598] [Received: 11/09/2021] [Revised: 08/26/2022] [Accepted: 10/06/2022] [Indexed: 11/05/2022]
Abstract
We develop the scale transformed power prior for settings where historical and current data involve different data types, such as binary and continuous data. This situation arises often in clinical trials, for example, when historical data involve binary responses and the current data involve some other type of continuous or discrete outcome. The power prior, proposed by Ibrahim and Chen, does not address the issue of different data types. Herein, we develop a new type of power prior, which we call the scale transformed power prior (straPP). The straPP is constructed by transforming the power prior for the historical data by rescaling the parameter using a function of the Fisher information matrices for the historical and current data models, thereby shifting the scale of the parameter vector from that of the historical to that of the current data. Examples are presented to motivate the need for such a transformation, and simulation studies are presented to illustrate the performance advantages of the straPP over the power prior and other informative and noninformative priors. A real dataset from a clinical trial undertaken to study a novel transitional care model for stroke survivors is used to illustrate the methodology.
Affiliation(s)
- Ethan M Alt
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Brady Nifong
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Xinxin Chen
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Matthew A Psioda
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Joseph G Ibrahim
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
|
16
|
Zhang Z, Wu Y, Xiong D, Ibrahim JG, Srivastava A, Zhu H. Rejoinder: LESA: Longitudinal Elastic Shape Analysis of Brain Subcortical Structures. J Am Stat Assoc 2023. [DOI: 10.1080/01621459.2022.2139264] [Indexed: 04/08/2023]
Affiliation(s)
- Zhengwu Zhang
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC
- Yuexuan Wu
  - Department of Statistics, Florida State University, Tallahassee, FL
- Di Xiong
  - Departments of Biostatistics University of North Carolina at Chapel Hill, Chapel Hill, NC
- Joseph G. Ibrahim
  - Departments of Biostatistics University of North Carolina at Chapel Hill, Chapel Hill, NC
- Anuj Srivastava
  - Department of Statistics, Florida State University, Tallahassee, FL
- Hongtu Zhu
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC
  - Departments of Biostatistics University of North Carolina at Chapel Hill, Chapel Hill, NC
  - Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC
  - Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, NC
  - Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC
|
17
|
Xu J, Psioda MA, Ibrahim JG. Bayesian Design of Clinical Trials Using Joint Cure Rate Models for Longitudinal and Time-to-Event Data. Lifetime Data Anal 2023; 29:213-233. [PMID: 36357647 DOI: 10.1007/s10985-022-09581-5] [Received: 11/04/2021] [Accepted: 10/24/2022] [Indexed: 06/16/2023]
Abstract
For clinical trial design and analysis, there has been extensive work related to using joint models for longitudinal and time-to-event data without a cure fraction (i.e., when all patients are at risk for the event of interest), but comparatively little treatment has been given to design and analysis of clinical trials using joint models that incorporate a cure fraction. In this paper, we develop a Bayesian clinical trial design methodology focused on evaluating the treatment's effect on a time-to-event endpoint using a promotion time cure rate model, where the longitudinal process is incorporated into the hazard model for the promotion times. A piecewise linear hazard model for the period after assessment of the longitudinal measure ends is proposed as an alternative to extrapolating the longitudinal trajectory. This may be advantageous in scenarios where the period of time from the end of longitudinal measurements until the end of observation is substantial. Inference for the time-to-event endpoint is based on a novel estimand which combines the treatment's effect on the probability of cure and its effect on the promotion time distribution, mediated by the longitudinal outcome. We propose an approach for sample size determination such that the design has a high power and a well-controlled type I error rate with both operating characteristics defined from a Bayesian perspective. We demonstrate the methodology by designing a breast cancer clinical trial with a primary time-to-event endpoint where longitudinal outcomes are measured periodically during follow up.
Affiliation(s)
- Jiawei Xu
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Matthew A Psioda
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Joseph G Ibrahim
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
|
18
|
Lim D, Chen MH, Ibrahim JG, Kim S, Shah AK, Lin J. metapack: An R Package for Bayesian Meta-Analysis and Network Meta-Analysis with a Unified Formula Interface. The R Journal 2022; 14:142-161. [PMID: 37168034 PMCID: PMC10168678 DOI: 10.32614/rj-2022-047] [Indexed: 01/18/2023]
Abstract
Meta-analysis, a statistical procedure that compares, combines, and synthesizes research findings from multiple studies in a principled manner, has become popular in a variety of fields. Meta-analyses using study-level (or equivalently aggregate) data are of particular interest due to data availability and modeling flexibility. In this paper, we describe an R package metapack that introduces a unified formula interface for both meta-analysis and network meta-analysis. The user interface, and therefore the package, allows flexible variance-covariance modeling for multivariate meta-analysis models and univariate network meta-analysis models. Complicated computing for these models has prevented their widespread adoption. The package also provides functions to generate relevant plots and perform statistical inferences like model assessments. Use cases are demonstrated using two real data sets contained in metapack.
|
19
|
Zhang Z, Wu Y, Xiong D, Ibrahim JG, Srivastava A, Zhu H. LESA: Longitudinal Elastic Shape Analysis of Brain Subcortical Structures. J Am Stat Assoc 2022; 118:3-17. [PMID: 37153845 PMCID: PMC10162479 DOI: 10.1080/01621459.2022.2102984] [Received: 10/04/2021] [Revised: 07/01/2022] [Accepted: 07/09/2022] [Indexed: 10/17/2022]
Abstract
Over the past 30 years, magnetic resonance imaging has become a ubiquitous tool for accurately visualizing the change and development of the brain's subcortical structures (e.g., hippocampus). Although subcortical structures act as information hubs of the nervous system, their quantification is still in its infancy due to many challenges in shape extraction, representation, and modeling. Here, we develop a simple and efficient framework of longitudinal elastic shape analysis (LESA) for subcortical structures. Integrating ideas from elastic shape analysis of static surfaces and statistical modeling of sparse longitudinal data, LESA provides a set of tools for systematically quantifying changes of longitudinal subcortical surface shapes from raw structure MRI data. The key novelties of LESA include: (i) it can efficiently represent complex subcortical structures using a small number of basis functions and (ii) it can accurately delineate the spatiotemporal shape changes of the human subcortical structures. We applied LESA to analyze three longitudinal neuroimaging data sets and showcase its wide applications in estimating continuous shape trajectories, building life-span growth patterns, and comparing shape differences among different groups. In particular, with the Alzheimer's Disease Neuroimaging Initiative (ADNI) data, we found that the Alzheimer's Disease (AD) can significantly speed the shape change of ventricle and hippocampus from 60 to 75 years old compared with normal aging.
Affiliation(s)
- Zhengwu Zhang
  - Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
- Yuexuan Wu
  - Department of Statistics, Florida State University, Tallahassee, Florida
- Di Xiong
  - Departments of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
- Joseph G. Ibrahim
  - Departments of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
- Anuj Srivastava
  - Department of Statistics, Florida State University, Tallahassee, Florida
- Hongtu Zhu
  - Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
  - Departments of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
  - Departments of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
  - Departments of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
  - Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
|
20
|
Diao G, Ma H, Zeng D, Ke C, Ibrahim JG. Synthesizing studies for comparing different treatment sequences in clinical trials. Stat Med 2022; 41:5134-5149. [PMID: 36005293 DOI: 10.1002/sim.9559] [Received: 07/17/2021] [Revised: 07/29/2022] [Accepted: 08/02/2022] [Indexed: 11/09/2022]
Abstract
With advances in cancer treatments and improved patient survival, more patients may go through multiple lines of treatment. It is of clinical importance to choose a sequence of effective treatments (eg, lines of treatment) for individual patients with the goal of optimizing their long-term clinical outcome (eg, survival). Several important issues arise in cancer studies. First, cancer clinical trials are usually conducted by each line of treatment. For a treatment sequence, we may have first line and second line treatment data from two different studies. Second, there is typically a treatment initiation period varying from patient to patient between progression of disease and the start of the second line treatment due to administrative reasons. Additionally, the choice of the second line treatment for patients with progression of disease may depend on their characteristics. We address all these issues and develop semiparametric methods under the potential outcome framework for the estimation of the overall survival probability for a treatment sequence and for comparing different treatment sequences. We establish the large sample properties of the proposed inferential procedures. Simulation studies and an application to a colorectal clinical trial are provided.
Affiliation(s)
- Guoqing Diao
  - Department of Biostatistics and Bioinformatics, Milken Institute School of Public Health, George Washington University, Washington, District of Columbia, USA
- Haijun Ma
  - Exelixis, Inc., Alameda, California, USA
- Donglin Zeng
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Chunlei Ke
  - Apellis Pharmaceuticals, Waltham, Massachusetts, USA
- Joseph G Ibrahim
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
|
21
|
Jia B, Zeng D, Liao JJZ, Liu GF, Tan X, Diao G, Ibrahim JG. Mixture survival trees for cancer risk classification. Lifetime Data Anal 2022; 28:356-379. [PMID: 35486260 PMCID: PMC10402927 DOI: 10.1007/s10985-022-09552-w] [Received: 03/22/2021] [Accepted: 03/04/2022] [Indexed: 06/14/2023]
Abstract
In oncology studies, it is important to understand and characterize disease heterogeneity among patients so that patients can be classified into different risk groups and one can identify high-risk patients at the right time. This information can then be used to identify a more homogeneous patient population for developing precision medicine. In this paper, we propose a mixture survival tree approach for direct risk classification. We assume that the patients can be classified into a pre-specified number of risk groups, where each group has a distinct survival profile. Our proposed tree-based methods are devised to estimate latent group membership using an EM algorithm. The observed data log-likelihood function is used as the splitting criterion in recursive partitioning. The finite sample performance is evaluated by extensive simulation studies and the proposed method is illustrated by a case study in breast cancer.
Affiliation(s)
- Beilin Jia
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Donglin Zeng
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Guanghan F Liu
  - Biostatistics and Research Decision Sciences, Merck & Co., Inc, North Wales, PA, USA
- Xianming Tan
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Guoqing Diao
  - Department of Biostatistics and Bioinformatics, The George Washington University, Washington, DC, USA
- Joseph G Ibrahim
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
|
22
|
Alt EM, Psioda MA, Ibrahim JG. A hierarchical prior for generalized linear models based on predictions for the mean response. Biostatistics 2022; 23:1165-1181. [PMID: 35770800 DOI: 10.1093/biostatistics/kxac022] [Received: 10/13/2021] [Revised: 05/03/2022] [Accepted: 06/09/2022] [Indexed: 11/14/2022]
Abstract
There has been increased interest in using prior information in statistical analyses. For example, in rare diseases, it can be difficult to establish treatment efficacy based solely on data from a prospective study due to low sample sizes. To overcome this issue, an informative prior for the treatment effect may be elicited. We develop a novel extension of the conjugate prior of Chen and Ibrahim (2003) that enables practitioners to elicit a prior prediction for the mean response for generalized linear models, treating the prediction as random. We refer to the hierarchical prior as the hierarchical prediction prior (HPP). For independent and identically distributed settings and the normal linear model, we derive cases for which the hyperprior is a conjugate prior. We also develop an extension of the HPP in situations where summary statistics from a previous study are available. The HPP allows for discounting based on the quality of individual level predictions, and simulation results suggest that, compared to the conjugate prior and the power prior, the HPP yields efficiency gains (e.g., lower mean squared error) where predictions are incompatible with the data. An efficient Markov chain Monte Carlo algorithm is developed. Applications illustrate that inferences under the HPP are more robust to prior-data conflict compared to selected nonhierarchical priors.
Affiliation(s)
- Ethan M Alt
  - Division of Pharmacoepidemiology and Pharmacoeconomics, Brigham and Women's Hospital and Harvard Medical School, 1620 Tremont St., Suite 3030, Boston, MA 02120, USA
- Matthew A Psioda
  - Department of Biostatistics, University of North Carolina, 135 Dauer Drive, Chapel Hill, NC 27599, USA
- Joseph G Ibrahim
  - Department of Biostatistics, University of North Carolina, 135 Dauer Drive, Chapel Hill, NC 27599, USA
|
23
|
Gelfond JA, Hernandez B, Goros M, Ibrahim JG, Chen MH, Sun W, Leach RJ, Kattan MW, Thompson IM, Ankerst DP, Liss M. Prediction of future risk of any and higher-grade prostate cancer based on the PLCO and SELECT trials. BMC Urol 2022; 22:45. [PMID: 35351104 PMCID: PMC8966358 DOI: 10.1186/s12894-022-00986-w] [Received: 10/07/2021] [Accepted: 03/01/2022] [Indexed: 12/12/2022]
Abstract
BACKGROUND: A model was built that characterized effects of individual factors on five-year prostate cancer (PCa) risk in the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial (PLCO) and the Selenium and Vitamin E Cancer Prevention Trial (SELECT). This model was validated in a third cohort, the San Antonio Biomarkers of Risk (SABOR) screening cohort. METHODS: A prediction model for 1- to 5-year risk of developing PCa and Gleason > 7 PCa (HGPCa) was built on PLCO and SELECT using the Cox proportional hazards model adjusting for patient baseline characteristics. Random forests and neural networks were compared to Cox proportional hazards survival models, using the trial datasets for model building and the SABOR cohort for model evaluation. The most accurate prediction model is included in an online calculator. RESULTS: The respective rates of PCa were 8.9%, 7.2%, and 11.1% in PLCO (n = 31,495), SELECT (n = 35,507), and SABOR (n = 1790) over median follow-up of 11.7, 8.1 and 9.0 years. The Cox model showed higher prostate-specific antigen (PSA), BMI and age, and African American race to be associated with PCa and HGPCa. Five-year risk predictions from the combined SELECT and PLCO model effectively discriminated risk in the SABOR cohort with C-index 0.76 (95% CI [0.72, 0.79]) for PCa, and 0.74 (95% CI [0.65, 0.83]) for HGPCa. CONCLUSIONS: A 1- to 5-year PCa risk prediction model developed from PLCO and SELECT was validated with SABOR and implemented online. This model can individualize and inform shared screening decisions.
Affiliation(s)
- Jonathan A. Gelfond
  - Department of Population Health Sciences, Mail Code 7933, 7703 Floyd Curl Drive, San Antonio, TX 78229-3900 USA
- Brian Hernandez
  - Department of Population Health Sciences, Mail Code 7933, 7703 Floyd Curl Drive, San Antonio, TX 78229-3900 USA
- Martin Goros
  - Department of Population Health Sciences, Mail Code 7933, 7703 Floyd Curl Drive, San Antonio, TX 78229-3900 USA
- Joseph G. Ibrahim
  - Department of Biostatistics, University of North Carolina, Chapel Hill, NC USA
- Ming-Hui Chen
  - Department of Statistics, University of Connecticut, New Haven, NC USA
- Wei Sun
  - Biostatistics Program, The Fred Hutchinson Cancer Research Center, Seattle, WA USA
- Robin J. Leach
  - Department of Urology and Mays Cancer Center, University of Texas Health at San Antonio, San Antonio, TX USA
- Michael W. Kattan
  - Department of Quantitative Health Sciences, Cleveland Clinic, Cleveland, OH USA
- Ian M. Thompson
  - Department of Urology and Mays Cancer Center, University of Texas Health at San Antonio, San Antonio, TX USA
  - CHRISTUS Santa Rosa Hospital – Medical Center, San Antonio, TX USA
- Donna Pauler Ankerst
  - Department of Urology and Mays Cancer Center, University of Texas Health at San Antonio, San Antonio, TX USA
  - Departments of Mathematics, Life Sciences, Technical University of Munich, Munich, Germany
- Michael Liss
  - Department of Urology and Mays Cancer Center, University of Texas Health at San Antonio, San Antonio, TX USA
|
24
|
Alt EM, Psioda MA, Ibrahim JG. Bayesian multivariate probability of success using historical data with type I error rate control. Biostatistics 2022; 24:17-31. [PMID: 34981114 PMCID: PMC9748585 DOI: 10.1093/biostatistics/kxab050] [Received: 12/18/2020] [Revised: 12/09/2021] [Accepted: 12/14/2021] [Indexed: 01/05/2023]
Abstract
In clinical trials, it is common to have multiple clinical outcomes (e.g., coprimary endpoints or a primary and multiple secondary endpoints). It is often desirable to establish efficacy in at least one of multiple clinical outcomes, which leads to a multiplicity problem. In the frequentist paradigm, the most popular methods to correct for multiplicity are typically conservative. Moreover, despite guidance from regulators, it is difficult to determine the sample size of a future study with multiple clinical outcomes. In this article, we introduce a Bayesian methodology for multiple testing that asymptotically guarantees type I error control. Using a seemingly unrelated regression model, correlations between outcomes are specifically modeled, which enables inference on the joint posterior distribution of the treatment effects. Simulation results suggest that the proposed Bayesian approach is more powerful than the method of Holm (1979), which is commonly utilized in practice as a more powerful alternative to the ubiquitous Bonferroni correction. We further develop multivariate probability of success, a Bayesian method to robustly determine sample size in the presence of multiple outcomes.
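For readers unfamiliar with the frequentist comparator named in this abstract, the following is a minimal sketch of the Holm (1979) step-down procedure next to the plain Bonferroni correction. It illustrates only the comparator, not the proposed Bayesian method; the function names and the example p-values are hypothetical.

```python
def holm_reject(p_values, alpha=0.05):
    """Holm step-down: compare the k-th smallest p-value to alpha / (m - k)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one hypothesis is retained, all larger p-values are too
    return reject


def bonferroni_reject(p_values, alpha=0.05):
    """Bonferroni: compare every p-value to the same threshold alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


example = [0.01, 0.015, 0.02, 0.04]
print(holm_reject(example))        # [True, True, True, True]
print(bonferroni_reject(example))  # [True, False, False, False]
```

Holm never rejects fewer hypotheses than Bonferroni at the same level, which is why the abstract describes it as the more powerful of the two.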
Affiliation(s)
- Ethan M Alt
  - Division of Pharmacoepidemiology and Pharmacoeconomics, Brigham and Women’s Hospital and Harvard Medical School, 75 Francis Street, Boston, MA, 02115, USA
- Matthew A Psioda
  - Department of Biostatistics, University of North Carolina at Chapel Hill, 135 Dauer Drive, Chapel Hill, NC, 27599, USA
- Joseph G Ibrahim
  - Department of Biostatistics, University of North Carolina at Chapel Hill, 135 Dauer Drive, Chapel Hill, NC, 27599, USA
|
25
|
Diao G, Liu GF, Zeng D, Zhang Y, Golm G, Heyse JF, Ibrahim JG. Efficient Multiple Imputation for Sensitivity Analysis of Recurrent Events Data with Informative Censoring. Stat Biopharm Res 2022; 14:153-161. [PMID: 35601027 PMCID: PMC9119645 DOI: 10.1080/19466315.2020.1819403] [Indexed: 01/03/2023]
Abstract
Missing data are commonly encountered in clinical trials due to dropout or nonadherence to study procedures. In trials in which recurrent events are of interest, the observed count can be an undercount of the events if a patient drops out before the end of the study. In many applications, the data are not necessarily missing at random and it is often not possible to test the missing at random assumption. Consequently, it is critical to conduct sensitivity analysis. We develop a control-based multiple imputation method for recurrent events data, where patients who drop out of the study are assumed to have a similar response profile to those in the control group after dropping out. Specifically, we consider the copy reference approach and the jump to reference approach. We model the recurrent event data using a semiparametric proportional intensity frailty model with the baseline hazard function completely unspecified. We develop nonparametric maximum likelihood estimation and inference procedures. We then impute the missing data based on the large sample distribution of the resulting estimators. The variance estimation is corrected by a bootstrap procedure. Simulation studies demonstrate the proposed method performs well in practical settings. We provide applications to two clinical trials.
Affiliation(s)
- Guoqing Diao
  - Department of Biostatistics and Bioinformatics, The George Washington University, Washington, District of Columbia, U.S.A.
- Donglin Zeng
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, U.S.A
- Yilong Zhang
  - Merck & Co., Inc., North Wales, Pennsylvania, U.S.A
- Gregory Golm
  - Merck & Co., Inc., North Wales, Pennsylvania, U.S.A
- Joseph G. Ibrahim
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, U.S.A
|
26
|
Heiling HM, Wilson DR, Rashid NU, Sun W, Ibrahim JG. Estimating cell type composition using isoform expression one gene at a time. Biometrics 2021. [PMID: 34921386 DOI: 10.1111/biom.13614] [Received: 10/12/2020] [Accepted: 12/08/2021] [Indexed: 11/29/2022]
Abstract
Human tissue samples are often mixtures of heterogeneous cell types, which can confound the analyses of gene expression data derived from such tissues. The cell type composition of a tissue sample may itself be of interest and is needed for proper analysis of differential gene expression. A variety of computational methods have been developed to estimate cell type proportions using gene-level expression data. However, RNA isoforms can also be differentially expressed across cell types, and isoform-level expression could be equally or more informative for determining cell type origin than gene-level expression. We propose a new computational method, IsoDeconvMM, which estimates cell type fractions using isoform-level gene expression data. A novel and useful feature of IsoDeconvMM is that it can estimate cell type proportions using only a single gene, though in practice we recommend aggregating estimates of a few dozen genes to obtain more accurate results. We demonstrate the performance of IsoDeconvMM using a unique data set with cell type-specific RNA-seq data across more than 135 individuals. This data set allows us to evaluate different methods given the biological variation of cell type-specific gene expression data across individuals. We further complement this analysis with additional simulations.
Affiliation(s)
- Hillary M Heiling
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
| | - Douglas R Wilson
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
| | - Naim U Rashid
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina.,Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
| | - Wei Sun
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington
| | - Joseph G Ibrahim
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina.,Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
|
27
|
Eggleston BS, Ibrahim JG, McNeil B, Catellier D. BayesCTDesign: An R Package for Bayesian Trial Design Using Historical Control Data. J Stat Softw 2021; 100:10.18637/jss.v100.i21. [PMID: 34975350 PMCID: PMC8715862 DOI: 10.18637/jss.v100.i21] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022] Open
Abstract
This article introduces the R (R Core Team 2019) package BayesCTDesign for two-arm randomized Bayesian trial design using historical control data when available, and simple two-arm randomized Bayesian trial design when historical control data is not available. The package BayesCTDesign, which is available on CRAN, has two simulation functions, historic_sim() and simple_sim() for studying trial characteristics under user defined scenarios, and two methods print() and plot() for displaying summaries of the simulated trial characteristics. The package BayesCTDesign works with two-arm trials with equal sample sizes per arm. The package BayesCTDesign allows a user to study Gaussian, Poisson, Bernoulli, Weibull, Lognormal, and Piecewise Exponential (pwe) outcomes. Power for two-sided hypothesis tests at a user defined alpha is estimated via simulation using a test within each simulation replication that involves comparing a 95% credible interval for the outcome specific treatment effect measure to the null case value. If the 95% credible interval excludes the null case value, then the null hypothesis is rejected, else the null hypothesis is accepted. In the article, the idea of including historical control data in a Bayesian analysis is reviewed, the estimation process of BayesCTDesign is explained, and the user interface is described. Finally, the BayesCTDesign is illustrated via several examples.
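The simulation-based power estimate described in the abstract can be illustrated with a minimal sketch. This is not the BayesCTDesign interface: the function name `simulate_power`, the Gaussian outcome with known standard deviation, and the flat priors on the arm means are all assumptions made for illustration. Under those assumptions the posterior of the mean difference is normal, so the 95% credible interval has a closed form.

```python
import math
import random
from statistics import NormalDist, fmean


def simulate_power(effect, n_per_arm, sd=1.0, n_rep=2000, cred_level=0.95, seed=1):
    """Estimate power for a two-arm trial with equal arm sizes (hypothetical helper).

    Assumes a Gaussian outcome with known sd and flat priors on both arm means,
    so the posterior of the mean difference is Normal(diff, sd^2 * 2 / n_per_arm).
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + cred_level / 2)  # e.g. about 1.96 for 95%
    se = sd * math.sqrt(2.0 / n_per_arm)            # posterior sd of the difference
    rejections = 0
    for _ in range(n_rep):
        control = [rng.gauss(0.0, sd) for _ in range(n_per_arm)]
        treated = [rng.gauss(effect, sd) for _ in range(n_per_arm)]
        diff = fmean(treated) - fmean(control)
        lower, upper = diff - z * se, diff + z * se
        # Reject the null when the credible interval excludes the null value 0.
        if lower > 0.0 or upper < 0.0:
            rejections += 1
    return rejections / n_rep


if __name__ == "__main__":
    print("type I error estimate:", simulate_power(0.0, 50))
    print("power estimate:", simulate_power(1.0, 50))
```

Calling the function with `effect=0.0` gives an estimate of the type I error rate, while a nonzero `effect` gives an estimate of power for that scenario.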
|
28
|
Sheikh MT, Chen MH, Gelfond JA, Ibrahim JG. A Power Prior Approach for Leveraging External Longitudinal and Competing Risks Survival Data Within the Joint Modeling Framework. Stat Biosci 2021. [DOI: 10.1007/s12561-021-09330-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
|
29
|
Abstract
The power prior is a popular tool for constructing informative prior distributions based on historical data. The method consists of raising the likelihood to a discounting factor in order to control the amount of information borrowed from the historical data. However, one often wishes to assign this discounting factor a prior distribution and estimate it jointly with the parameters, which in turn necessitates the computation of a normalizing constant. In this article, we are concerned with how to approximately sample from the joint posterior of the parameters and the discounting factor. We first show a few important properties of the normalizing constant and then use these results to motivate a bisection-type algorithm for computing it on a fixed budget of evaluations. We give a large array of illustrations and discuss cases where the normalizing constant is known in closed form and where it is not. We show that the proposed method produces approximate posteriors that are very close to the exact distributions and also produces posteriors that cover the data-generating parameters with higher probability in the intractable case. Our results suggest that the proposed method is an accurate and easy-to-implement technique to include this normalization, being applicable to a large class of models. They also reinforce the notion that proper inclusion of the normalizing constant is crucial to the drawing of correct inferences and appropriate quantification of uncertainty.
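For orientation, the objects named in this abstract can be written out in the usual power prior notation. The symbols are assumptions introduced here rather than taken from the article: $\theta$ for the parameters, $D_0$ for the historical data, $a_0$ for the discounting factor, $\pi_0$ for the initial prior, and $\pi(a_0)$ for the prior on the discounting factor.

```latex
\begin{align}
  \pi(\theta \mid D_0, a_0) &\propto L(\theta \mid D_0)^{a_0}\, \pi_0(\theta),
      \qquad 0 \le a_0 \le 1, \\
  c(a_0) &= \int L(\theta \mid D_0)^{a_0}\, \pi_0(\theta)\, d\theta, \\
  \pi(\theta, a_0 \mid D_0) &= \frac{L(\theta \mid D_0)^{a_0}\, \pi_0(\theta)}{c(a_0)}\, \pi(a_0).
\end{align}
```

The first line is the power prior for a fixed discounting factor. When $a_0$ is given its own prior, the normalizing constant $c(a_0)$ in the second line depends on $a_0$ and must be retained, which yields the joint prior in the third line.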
Affiliation(s)
- Luiz Max Carvalho
- School of Applied Mathematics, Getúlio Vargas Foundation (FGV), Rio de Janeiro, Brazil
| | - Joseph G Ibrahim
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
|
30
|
Bean NW, Ibrahim JG, Psioda MA. Bayesian multiregional clinical trials using model averaging. Biostatistics 2021; 24:262-276. [PMID: 34296263 PMCID: PMC10102881 DOI: 10.1093/biostatistics/kxab027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2020] [Revised: 05/26/2021] [Accepted: 06/21/2021] [Indexed: 11/14/2022] Open
Abstract
Multiregional clinical trials (MRCTs) provide the benefit of more rapidly introducing drugs to the global market; however, small regional sample sizes can lead to poor estimation quality of region-specific effects when using current statistical methods. With the publication of the International Conference for Harmonisation E17 guideline in 2017, the MRCT design is recognized as a viable strategy that can be accepted by regional regulatory authorities, necessitating new statistical methods that improve the quality of region-specific inference. In this article, we develop a novel methodology for estimating region-specific and global treatment effects for MRCTs using Bayesian model averaging. This approach can be used for trials that compare two treatment groups with respect to a continuous outcome, and it allows for the incorporation of patient characteristics through the inclusion of covariates. We propose an approach that uses posterior model probabilities to quantify evidence in favor of consistency of treatment effects across all regions, and this metric can be used by regulatory authorities for drug approval. We show through simulations that the proposed modeling approach results in lower MSE than a fixed-effects linear regression model and better control of type I error rates than a Bayesian hierarchical model.
Affiliation(s)
| | - Joseph G Ibrahim
- Department of Biostatistics, University of North Carolina, McGavran-Greenberg Hall, CB #7420, Chapel Hill, NC 27599, USA
| | - Matthew A Psioda
- Department of Biostatistics, University of North Carolina, McGavran-Greenberg Hall, CB #7420, Chapel Hill, NC 27599, USA
|
31
|
Zhao B, Ibrahim JG, Li Y, Li T, Wang Y, Shan Y, Zhu Z, Zhou F, Zhang J, Huang C, Liao H, Yang L, Thompson PM, Zhu H. Corrigendum to: Heritability of regional brain volumes in large-scale neuroimaging and genetic studies. Cereb Cortex 2021; 31:4865. [PMID: 34296751 DOI: 10.1093/cercor/bhab270] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2021] [Revised: 04/21/2021] [Accepted: 05/25/2021] [Indexed: 11/14/2022] Open
Affiliation(s)
- Bingxin Zhao
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Joseph G Ibrahim
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA.,Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Yun Li
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA.,Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA.,Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Tengfei Li
- Department of Biostatistics, University of Texas MD Anderson Cancer Center, Houston, TX 77230, USA
| | - Yue Wang
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Yue Shan
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Ziliang Zhu
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Fan Zhou
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Jingwen Zhang
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Chao Huang
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Huiling Liao
- Department of Statistics, Texas A&M University, College Station, TX 77843, USA
| | - Liuqing Yang
- Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Paul M Thompson
- Imaging Genetics Center, Mark and Mary Stevens Institute for Neuroimaging & Informatics, University of Southern California, Los Angeles, CA 90033, USA
| | - Hongtu Zhu
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA.,Department of Biostatistics, University of Texas MD Anderson Cancer Center, Houston, TX 77230, USA
|
32
|
Abstract
The aim of this paper is to develop a weighted functional linear Cox regression model that accounts for the association between a failure time and a set of functional and scalar covariates. We formulate the weighted functional linear Cox regression by incorporating a comprehensive three-stage estimation procedure as a unified methodology. Specifically, the weighted functional linear Cox regression uses a functional principal component analysis to represent the functional covariates and a high-dimensional Cox regression model to capture the joint effects of both scalar and functional covariates on the failure time data. Then, we consider an uncensored probability for each subject by estimating the important parameter of a censoring distribution. Finally, we use such a weight to construct the pseudo-likelihood function and maximize it to acquire an estimator. We also show our estimation and testing procedures through simulations and an analysis of real data from the Alzheimer's Disease Neuroimaging Initiative.
Affiliation(s)
- Hojin Yang
- Department of Statistics, Pusan National University, Busan, South Korea
| | - Hongtu Zhu
- Department of Biostatistics, University of North Carolina at Chapel Hill, USA
| | - Mihye Ahn
- Department of Mathematics and Statistics, University of Nevada, Reno, USA
| | - Joseph G Ibrahim
- Department of Biostatistics, University of North Carolina at Chapel Hill, USA
|
33
|
Baldoni PL, Rashid NU, Ibrahim JG. Efficient detection and classification of epigenomic changes under multiple conditions. Biometrics 2021; 78:1141-1154. [PMID: 33860525 DOI: 10.1111/biom.13477] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2020] [Revised: 04/02/2021] [Accepted: 04/08/2021] [Indexed: 11/28/2022]
Abstract
Epigenomics, the study of the human genome and its interactions with proteins and other cellular elements, has become of significant interest in recent years. Such interactions have been shown to regulate essential cellular functions and are associated with multiple complex diseases. Therefore, understanding how these interactions may change across conditions is central in biomedical research. Chromatin immunoprecipitation followed by massively parallel sequencing (ChIP-seq) is one of several techniques to detect local changes in epigenomic activity (peaks). However, existing methods for differential peak calling are not optimized for the diversity in ChIP-seq signal profiles, are limited to the analysis of two conditions, or cannot classify specific patterns of differential change when multiple patterns exist. To address these limitations, we present a flexible and efficient method for the detection of differential epigenomic activity across multiple conditions. We utilize data from the ENCODE Consortium and show that the presented method, epigraHMM, exhibits superior performance to current tools and it is among the fastest algorithms available, while allowing the classification of combinatorial patterns of differential epigenomic activity and the characterization of chromatin regulatory states.
Affiliation(s)
- Pedro L Baldoni
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| | - Naim U Rashid
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| | - Joseph G Ibrahim
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
|
34
|
Li H, Lim D, Chen MH, Ibrahim JG, Kim S, Shah AK, Lin J. Bayesian network meta-regression hierarchical models using heavy-tailed multivariate random effects with covariate-dependent variances. Stat Med 2021; 40:3582-3603. [PMID: 33846992 DOI: 10.1002/sim.8983] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2019] [Revised: 03/10/2021] [Accepted: 03/12/2021] [Indexed: 11/12/2022]
Abstract
Network meta-analysis (NMA) is gaining popularity in evidence synthesis and network meta-regression allows us to incorporate potentially important covariates into network meta-analysis. In this article, we propose a Bayesian network meta-regression hierarchical model and assume a general multivariate t distribution for the random treatment effects. The multivariate t distribution is desired for heavy-tailed random effects and converges to the multivariate normal distribution when the degrees of freedom go to infinity. Moreover, in NMA, some treatments are compared only in a single study. To overcome such sparsity, we propose a log-linear regression model for the variances of the random effects and incorporate aggregate covariates into modeling the variance components. We develop a Markov chain Monte Carlo sampling algorithm to sample from the posterior distribution via the collapsed Gibbs technique. We further use the deviance information criterion and the logarithm of the pseudo-marginal likelihood for model comparison. A simulation study is conducted and a detailed analysis from our motivating case study is carried out to further demonstrate the proposed methodology.
Affiliation(s)
- Hao Li
- Department of Statistics, University of Connecticut, Storrs, Connecticut
| | - Daeyoung Lim
- Department of Statistics, University of Connecticut, Storrs, Connecticut
| | - Ming-Hui Chen
- Department of Statistics, University of Connecticut, Storrs, Connecticut
| | - Joseph G Ibrahim
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
| | - Sungduk Kim
- Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, Maryland
|
35
|
Jia B, Zeng D, Liao JJZ, Liu GF, Tan X, Diao G, Ibrahim JG. Inferring latent heterogeneity using many feature variables supervised by survival outcome. Stat Med 2021; 40:3181-3195. [PMID: 33819928 PMCID: PMC8237103 DOI: 10.1002/sim.8972] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2020] [Revised: 03/18/2021] [Accepted: 03/18/2021] [Indexed: 11/06/2022]
Abstract
In cancer studies, it is important to understand disease heterogeneity among patients so that precision medicine can particularly target high-risk patients at the right time. Many feature variables such as demographic variables and biomarkers, combined with a patient's survival outcome, can be used to infer such latent heterogeneity. In this work, we propose a mixture model to model each patient's latent survival pattern, where the mixing probabilities for latent groups are modeled through a multinomial distribution. The Bayesian information criterion is used for selecting the number of latent groups. Furthermore, we incorporate variable selection with the adaptive lasso into inference so that only a few feature variables will be selected to characterize the latent heterogeneity. We show that our adaptive lasso estimator has oracle properties when the number of parameters diverges with the sample size. The finite sample performance is evaluated by the simulation study, and the proposed method is illustrated by two datasets.
Affiliation(s)
- Beilin Jia
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| | - Donglin Zeng
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| | | | - Guanghan F Liu
- Biostatistics and Research Decision Sciences, Merck & Co., Inc, Kenilworth, Pennsylvania, USA
| | - Xianming Tan
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| | - Guoqing Diao
- Department of Biostatistics and Bioinformatics, The George Washington University, Washington, District of Columbia, USA
| | - Joseph G Ibrahim
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
|
36
|
Zhu A, Matoba N, Wilson EP, Tapia AL, Li Y, Ibrahim JG, Stein JL, Love MI. MRLocus: Identifying causal genes mediating a trait through Bayesian estimation of allelic heterogeneity. PLoS Genet 2021; 17:e1009455. [PMID: 33872308 PMCID: PMC8084342 DOI: 10.1371/journal.pgen.1009455] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 08/19/2020] [Revised: 04/29/2021] [Accepted: 02/26/2021] [Indexed: 11/18/2022] Open
Abstract
Expression quantitative trait loci (eQTL) studies are used to understand the regulatory function of non-coding genome-wide association study (GWAS) risk loci, but colocalization alone does not demonstrate a causal relationship of gene expression affecting a trait. Evidence for mediation, that perturbation of gene expression in a given tissue or developmental context will induce a change in the downstream GWAS trait, can be provided by two-sample Mendelian Randomization (MR). Here, we introduce a new statistical method, MRLocus, for Bayesian estimation of the gene-to-trait effect from eQTL and GWAS summary data for loci with evidence of allelic heterogeneity, that is, containing multiple causal variants. MRLocus makes use of a colocalization step applied to each nearly-LD-independent eQTL, followed by an MR analysis step across eQTLs. Additionally, our method involves estimation of the extent of allelic heterogeneity through a dispersion parameter, indicating variable mediation effects from each individual eQTL on the downstream trait. Our method is evaluated against other state-of-the-art methods for estimation of the gene-to-trait mediation effect, using an existing simulation framework. In simulation, MRLocus often has the highest accuracy among competing methods, and in each case provides more accurate estimation of uncertainty as assessed through interval coverage. MRLocus is then applied to five candidate causal genes for mediation of particular GWAS traits, where gene-to-trait effects are concordant with those previously reported. We find that MRLocus's estimation of the causal effect across eQTLs within a locus provides useful information for determining how perturbation of gene expression or individual regulatory elements will affect downstream traits. The MRLocus method is implemented as an R package available at https://mikelove.github.io/mrlocus.
Affiliation(s)
- Anqi Zhu
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
| | - Nana Matoba
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
| | - Emma P. Wilson
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
| | - Amanda L. Tapia
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
| | - Yun Li
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
| | - Joseph G. Ibrahim
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
| | - Jason L. Stein
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
| | - Michael I. Love
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
|
37
|
Abstract
Clustering is a form of unsupervised learning that aims to uncover latent groups within data based on similarity across a set of features. A common application of this in biomedical research is in delineating novel cancer subtypes from patient gene expression data, given a set of informative genes. However, it is typically unknown a priori which genes may be informative in discriminating between clusters, and what the optimal number of clusters is. Few methods exist for performing unsupervised clustering of RNA-seq samples, and none currently adjust for between-sample global normalization factors, select cluster-discriminatory genes, or account for potential confounding variables during clustering. To address these issues, we propose the Feature Selection and Clustering of RNA-seq (FSCseq): a model-based clustering algorithm that utilizes a finite mixture of regression (FMR) model and the quadratic penalty method with a Smoothly-Clipped Absolute Deviation (SCAD) penalty. The maximization is done by a penalized Classification EM algorithm, allowing us to include normalization factors and confounders in our modeling framework. Given the fitted model, our framework allows for subtype prediction in new patients via posterior probabilities of cluster membership, even in the presence of batch effects. Based on simulations and real data analysis, we show the advantages of our method relative to competing approaches.
Affiliation(s)
- David K Lim
- University of North Carolina at Chapel Hill, NC, USA
| | - Naim U Rashid
- University of North Carolina at Chapel Hill, NC, USA
|
38
|
van Oudenhoven FM, Swinkels SHN, Ibrahim JG, Rizopoulos D. A marginal estimate for the overall treatment effect on a survival outcome within the joint modeling framework. Stat Med 2020; 39:4120-4132. [PMID: 32838484 PMCID: PMC7674249 DOI: 10.1002/sim.8713] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 12/12/2019] [Revised: 05/11/2020] [Accepted: 06/05/2020] [Indexed: 02/04/2023]
Abstract
Joint models for longitudinal and survival data are increasingly used and enjoy a wide range of application areas. In this article, we focus on the application of joint models on clinical trial data with special interest in the treatment effect on the survival outcome. Within a joint model, the estimated treatment effect on the survival outcome is an aggregate comprising the indirect treatment effect through the longitudinal outcome and the direct treatment effect on the survival outcome. This overall treatment effect is, however, conditional on random effects, and therefore has a subject‐specific interpretation. The conditional interpretation arises from the shared random effects between the longitudinal and survival process in combination with the nonlinear link function of the survival model. The overall treatment effect is, therefore, not valid for population‐based inference, which is the goal for most clinical trials. We propose a method to obtain a marginal estimate of the overall treatment effect on the survival outcome in a joint model. Additionally, we extend our proposal to allow for different parameterizations for the association between the longitudinal and survival outcome. The proposed method is demonstrated on data of a clinical study on the effect of synbiotic on the gut microbiota of cesarean delivered infants, where we estimate the marginal overall treatment effect on the risk of eczema or atopic dermatitis using longitudinal information on fecal bifidobacteria.
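The gap between a conditional (subject-specific) and a marginal (population-level) treatment effect can be seen in a toy example. The sketch below is not the authors' estimator: it assumes a single standard normal random effect `b`, a hazard of the form h0(t) * exp(log_hr * treatment + alpha * b), and a hypothetical helper `marginal_cumhaz_ratio` that averages survival over draws of `b` by Monte Carlo.

```python
import math
import random


def marginal_cumhaz_ratio(log_hr_conditional, alpha, base_cumhaz, n_draws=20000, seed=7):
    """Ratio of marginal cumulative hazards (treated vs control) at one time point.

    base_cumhaz is the baseline cumulative hazard H0(t) at the time of interest.
    The same random-effect draws are reused for both arms.
    """
    rng = random.Random(seed)
    draws = [rng.gauss(0.0, 1.0) for _ in range(n_draws)]

    def marginal_survival(treated):
        lin = log_hr_conditional * treated
        # Average the conditional survival exp(-H0 * exp(lin + alpha * b)) over b.
        total = sum(math.exp(-base_cumhaz * math.exp(lin + alpha * b)) for b in draws)
        return total / n_draws

    s_control = marginal_survival(0)
    s_treated = marginal_survival(1)
    # Marginal cumulative hazard is -log of marginal survival; the signs cancel.
    return math.log(s_treated) / math.log(s_control)


if __name__ == "__main__":
    conditional_hr = 0.5
    ratio = marginal_cumhaz_ratio(math.log(conditional_hr), alpha=1.0, base_cumhaz=1.0)
    print("conditional hazard ratio:", conditional_hr)
    print("marginal cumulative hazard ratio:", ratio)
```

With `alpha=0` there is no random effect in the hazard and the marginal ratio equals the conditional hazard ratio. With a nonzero `alpha` the marginal ratio moves toward 1, which is why a conditional estimate cannot be read directly as a population-level effect.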
Affiliation(s)
- Floor M van Oudenhoven
- Department of Biostatistics, Erasmus MC, Rotterdam, The Netherlands.,Danone Nutricia Research, Utrecht, The Netherlands
| | | | - Joseph G Ibrahim
- Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina, USA
|
39
|
Xu J, Psioda MA, Ibrahim JG. Bayesian design of clinical trials using joint models for longitudinal and time-to-event data. Biostatistics 2020; 23:591-608. [PMID: 33155038 DOI: 10.1093/biostatistics/kxaa044] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/30/2019] [Revised: 09/10/2020] [Accepted: 09/12/2020] [Indexed: 11/14/2022] Open
Abstract
Joint models for longitudinal and time-to-event data are increasingly used for the analysis of clinical trial data. However, few methods have been proposed for designing clinical trials using these models. In this article, we develop a Bayesian clinical trial design methodology focused on evaluating the treatment's effect on the time-to-event endpoint using a flexible trajectory joint model. By incorporating the longitudinal outcome trajectory into the hazard model for the time-to-event endpoint, the joint modeling framework allows for non-proportional hazards (e.g., an increasing hazard ratio over time). Inference for the time-to-event endpoint is based on an average of a time-varying hazard ratio which can be decomposed according to the treatment's direct effect on the time-to-event endpoint and its indirect effect, mediated through the longitudinal outcome. We propose an approach for sample size determination for a trial such that the design has high power and a well-controlled type I error rate with both operating characteristics defined from a Bayesian perspective. We demonstrate the methodology by designing a breast cancer clinical trial with a primary time-to-event endpoint and where predictive longitudinal outcome measures are also collected periodically during follow-up.
Affiliation(s)
- Jiawei Xu
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, USA
| | - Matthew A Psioda
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, USA
| | - Joseph G Ibrahim
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, USA
|
40
|
Zhao B, Ibrahim JG, Li Y, Li T, Wang Y, Shan Y, Zhu Z, Zhou F, Zhang J, Huang C, Liao H, Yang L, Thompson PM, Zhu H. Heritability of Regional Brain Volumes in Large-Scale Neuroimaging and Genetic Studies. Cereb Cortex 2020; 29:2904-2914. [PMID: 30010813 DOI: 10.1093/cercor/bhy157] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 02/03/2018] [Revised: 06/11/2018] [Indexed: 12/20/2022] Open
Abstract
Brain genetics is an active research area. The degree to which genetic variants impact variations in brain structure and function remains largely unknown. We examined the heritability of regional brain volumes (P ~ 100) captured by single-nucleotide polymorphisms (SNPs) in UK Biobank (n ~ 9000). We found that regional brain volumes are highly heritable in this study population and common genetic variants can explain up to 80% of their variabilities (median heritability 34.8%). We observed omnigenic impact across the genome and examined the enrichment of SNPs in active chromatin regions. Principal components derived from regional volume data are also highly heritable, but the amount of variance in brain volume explained by the component did not seem to be related to its heritability. Heritability estimates vary substantially across large-scale functional networks, exhibit a symmetric pattern across left and right hemispheres, and are consistent in females and males (correlation = 0.638). We repeated the main analysis in Alzheimer's Disease Neuroimaging Initiative (n ~ 1100), Philadelphia Neurodevelopmental Cohort (n ~ 600), and Pediatric Imaging, Neurocognition, and Genetics (n ~ 500) datasets, which demonstrated that more stable estimates can be obtained from the UK Biobank.
Affiliation(s)
- Bingxin Zhao
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Joseph G Ibrahim
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.,Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Yun Li
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.,Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.,Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Tengfei Li
- Department of Biostatistics, University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - Yue Wang
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Yue Shan
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Ziliang Zhu
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Fan Zhou
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Jingwen Zhang
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Chao Huang
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Huiling Liao
- Department of Statistics, Texas A&M University, College Station, TX, USA
| | - Liuqing Yang
- Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Paul M Thompson
- Imaging Genetics Center, Mark and Mary Stevens Institute for Neuroimaging & Informatics, University of Southern California, Los Angeles, CA, USA
| | - Hongtu Zhu
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.,Department of Biostatistics, University of Texas MD Anderson Cancer Center, Houston, TX, USA
|
41
|
Sheikh MT, Ibrahim JG, Gelfond JA, Sun W, Chen MH. Joint modelling of longitudinal and survival data in the presence of competing risks with applications to prostate cancer data. STAT MODEL 2020; 21:72-94. [PMID: 34177376 DOI: 10.1177/1471082x20944620] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 11/15/2022]
Abstract
This research is motivated by data from the large Selenium and Vitamin E Cancer Prevention Trial (SELECT). Prostate-specific antigen (PSA) levels were collected longitudinally, and the survival endpoint was the time to low-grade cancer or the time to high-grade cancer (competing risks). In this article, the goal is to model the longitudinal PSA data and the time to prostate cancer (PC) of low or high grade. We treat low-grade and high-grade disease as two competing causes of developing PC. A joint model for simultaneously analysing longitudinal and time-to-event data in the presence of multiple causes of failure (or competing risks) is proposed within the Bayesian framework. The proposed model handles the missing causes of failure in the SELECT data and admits an efficient Markov chain Monte Carlo algorithm that samples from the posterior distribution via a novel reparameterization technique. Bayesian criteria, ΔDICSurv and ΔWAICSurv, are introduced to quantify the gain in fit in the survival sub-model due to the inclusion of longitudinal data. A simulation study is conducted to examine the empirical performance of the posterior estimates as well as of ΔDICSurv and ΔWAICSurv, and a detailed analysis of the SELECT data is carried out to further demonstrate the proposed methodology.
Affiliation(s)
- Md Tuhin Sheikh
- Department of Statistics, University of Connecticut, Storrs, CT, USA
| | - Joseph G Ibrahim
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Jonathan A Gelfond
- Department of Epidemiology and Biostatistics, University of Texas Health San Antonio, San Antonio, TX, USA
| | - Wei Sun
- Biostatistics Program, Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Ming-Hui Chen
- Department of Statistics, University of Connecticut, Storrs, CT, USA
|
42
|
Wilson DR, Ibrahim JG, Sun W. Mapping Tumor-Specific Expression QTLs in Impure Tumor Samples. J Am Stat Assoc 2020; 115:79-89. [PMID: 32773912 DOI: 10.1080/01621459.2019.1609968] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2022]
Abstract
The study of gene expression quantitative trait loci (eQTL) is an effective approach to illuminate the functional roles of genetic variants. Computational methods have been developed for eQTL mapping using gene expression data from microarray or RNA-seq technology. Application of these methods for eQTL mapping in tumor tissues is problematic because tumor tissues are composed of both tumor and infiltrating normal cells (e.g. immune cells) and eQTL effects may vary between tumor and infiltrating normal cells. To address this challenge, we have developed a new method for eQTL mapping using RNA-seq data from tumor samples. Our method separately estimates the eQTL effects in tumor and infiltrating normal cells using both total expression and allele-specific expression (ASE). We demonstrate that our method controls type I error rate and has higher power than some alternative approaches. We applied our method to study RNA-seq data from The Cancer Genome Atlas and illustrated the similarities and differences of eQTL effects in tumor and normal cells.
Affiliation(s)
- Douglas R Wilson
- Doug R. Wilson is a graduate student, Department of Biostatistics, UNC Chapel Hill, NC 27599
| | - Joseph G Ibrahim
- Joseph G. Ibrahim is Alumni Distinguished Professor of Biostatistics, Department of Biostatistics, UNC Chapel Hill, NC 27599
| | - Wei Sun
- Wei Sun is an Associate Member in Biostatistics Program at Fred Hutchinson Cancer Research Center
|
43
|
Zhu A, Ibrahim JG, Love MI. Heavy-tailed prior distributions for sequence count data: removing the noise and preserving large differences. Bioinformatics 2020; 35:2084-2092. [PMID: 30395178 PMCID: PMC6581436 DOI: 10.1093/bioinformatics/bty895] [Citation(s) in RCA: 792] [Impact Index Per Article: 198.0] [Received: 05/04/2018] [Revised: 09/25/2018] [Accepted: 10/23/2018] [Indexed: 01/08/2023] Open
Abstract
MOTIVATION: In RNA-seq differential expression analysis, investigators aim to detect those genes with changes in expression level across conditions, despite technical and biological variability in the observations. A common task is to accurately estimate the effect size, often in terms of a logarithmic fold change (LFC). RESULTS: When the read counts are low or highly variable, the maximum likelihood estimates for the LFCs have high variance, leading to large estimates that are not representative of true differences and to poor ranking of genes by effect size. One approach is to introduce filtering thresholds and pseudocounts to exclude or moderate estimated LFCs. Filtering may remove genes with true differences in expression from the analysis, while pseudocounts provide a limited solution that must be adapted per dataset. Here, we propose the use of a heavy-tailed Cauchy prior distribution for effect sizes, which avoids the use of filter thresholds or pseudocounts. The proposed method, Approximate Posterior Estimation for generalized linear model (apeglm), has lower bias than previously proposed shrinkage estimators, while still reducing variance for those genes with little information for statistical inference. AVAILABILITY AND IMPLEMENTATION: The apeglm package is available as an R/Bioconductor package at https://bioconductor.org/packages/apeglm, and the methods can be called from within the DESeq2 software. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
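The shrinkage behaviour described in this abstract can be illustrated with a small sketch. It is not the apeglm implementation: it assumes the likelihood for one gene is summarised by a normal approximation around a maximum likelihood estimate `beta_hat` with standard error `se`, places a Cauchy(0, `scale`) prior on the LFC, and finds the posterior mode by a grid search. The function names and the default `scale` are hypothetical.

```python
import math


def log_posterior(beta, beta_hat, se, scale):
    """Unnormalised log posterior: normal likelihood approximation plus Cauchy(0, scale) prior."""
    log_lik = -0.5 * ((beta_hat - beta) / se) ** 2
    log_prior = -math.log(1.0 + (beta / scale) ** 2)
    return log_lik + log_prior


def map_lfc(beta_hat, se, scale=1.0, grid_points=20001):
    """Posterior mode of the LFC, found by grid search between 0 and beta_hat.

    The mode must lie in that interval because the likelihood term peaks at
    beta_hat and the prior term peaks at 0.
    """
    lo, hi = sorted((0.0, beta_hat))
    best_beta = 0.0
    best_lp = log_posterior(0.0, beta_hat, se, scale)
    for i in range(grid_points):
        beta = lo + (hi - lo) * i / (grid_points - 1)
        lp = log_posterior(beta, beta_hat, se, scale)
        if lp > best_lp:
            best_beta, best_lp = beta, lp
    return best_beta


# Same raw estimate, different precision (illustrative numbers only).
noisy = map_lfc(beta_hat=3.0, se=2.0)    # little information: pulled toward 0
precise = map_lfc(beta_hat=3.0, se=0.1)  # well-estimated: left almost unchanged
```

In this toy setting the heavy tails of the Cauchy prior leave a large, precisely estimated LFC nearly untouched while a noisy estimate of the same size is shrunk strongly, which is the qualitative property the abstract attributes to the method.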
Affiliation(s)
- Anqi Zhu
- Department of Biostatistics, University of North Carolina-Chapel Hill, NC, USA
| | - Joseph G Ibrahim
- Department of Biostatistics, University of North Carolina-Chapel Hill, NC, USA
| | - Michael I Love
- Department of Biostatistics, University of North Carolina-Chapel Hill, NC, USA.,Department of Genetics, University of North Carolina-Chapel Hill, NC, USA
|
44
|
Tan X, Chen BE, Sun J, Patel T, Ibrahim JG. A hierarchical testing approach for detecting safety signals in clinical trials. Stat Med 2020; 39:1541-1557. [PMID: 32050050 PMCID: PMC8258607 DOI: 10.1002/sim.8495] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 06/19/2018] [Revised: 05/01/2019] [Accepted: 08/16/2019] [Indexed: 11/10/2022]
Abstract
Detecting safety signals in clinical trial safety data is known to be challenging due to high dimensionality, rare occurrence, weak signal, and complex dependence. We propose a new hierarchical testing approach for analyzing safety data from a typical randomized clinical trial. This approach accounts for the hierarchical structure of adverse events (AEs), that is, AEs are categorized by system organ class (SOC). Our approach contains two steps: the first step tests, for each SOC, whether any AEs within this SOC are differently distributed between treatment arms; and the second step identifies signal AEs from SOCs passing the first step tests. We show through simulation studies that the new approach is superior to currently available approaches in terms of power to detect safety signals at a controlled false discovery rate. We also demonstrate this approach with two real data examples.
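The two-step structure described above can be sketched as follows. This is a generic illustration, not the authors' procedure: it assumes one p-value per AE is already available, summarises each SOC with a Bonferroni-adjusted minimum p-value, screens SOCs with the Benjamini-Hochberg step-up rule, and then applies Benjamini-Hochberg within each passing SOC at a level scaled by the fraction of SOCs that passed. All function names, the choice of tests, and the example p-values are assumptions.

```python
def benjamini_hochberg(pvalues, q):
    """Return the indices rejected by the Benjamini-Hochberg step-up rule at level q."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= q * rank / m:
            cutoff = rank  # largest rank whose p-value is under its threshold
    return sorted(order[:cutoff])


def two_step_screen(ae_pvalues_by_soc, q=0.05):
    """Step 1 screens SOCs; step 2 flags AEs inside the SOCs that passed."""
    socs = list(ae_pvalues_by_soc)
    if not socs:
        return {}
    # Step 1: one p-value per SOC (Bonferroni-adjusted minimum over its AEs).
    soc_p = [
        min(1.0, len(ae_pvalues_by_soc[s]) * min(ae_pvalues_by_soc[s].values()))
        for s in socs
    ]
    passed = [socs[i] for i in benjamini_hochberg(soc_p, q)]
    # Step 2: test AEs only within passing SOCs, at a level reduced by the
    # proportion of SOCs selected in step 1.
    level = q * len(passed) / len(socs)
    signals = {}
    for s in passed:
        names = list(ae_pvalues_by_soc[s])
        pvals = [ae_pvalues_by_soc[s][name] for name in names]
        signals[s] = [names[i] for i in benjamini_hochberg(pvals, level)]
    return signals


# Made-up p-values for illustration only.
example = {
    "cardiac": {"palpitations": 0.001, "tachycardia": 0.20, "angina": 0.60},
    "gastrointestinal": {"nausea": 0.30, "diarrhea": 0.45},
    "nervous": {"headache": 0.004, "dizziness": 0.008},
}
signals = two_step_screen(example, q=0.05)
```

Screening at the SOC level first means that AEs in SOCs with no evidence of a difference are never tested individually, which reduces the multiplicity burden in the second step.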
Affiliation(s)
- Xianming Tan
- Department of Biostatistics, UNC at Chapel Hill, Chapel Hill, North Carolina
| | - Bingshu E. Chen
- Canadian Cancer Trials Group and Department of Public Health Sciences, Queen’s University, Kingston, Ontario, Canada
| | - Jianping Sun
- Department of Mathematics and Statistics, UNC at Greensboro, Greensboro, North Carolina
| | - Tejendra Patel
- Division of Pharmacotherapy and Experimental Therapeutics, Eshelman School of Pharmacy, UNC at Chapel Hill, Chapel Hill, North Carolina
| | - Joseph G. Ibrahim
- Department of Biostatistics, UNC at Chapel Hill, Chapel Hill, North Carolina
|
45
|
Psioda MA, Xia HA, Jiang X, Xu J, Ibrahim JG. Bayesian adaptive design for concurrent trials involving biologically related diseases. Biostatistics 2020; 23:kxab008. [PMID: 33982753 DOI: 10.1093/biostatistics/kxab008] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/29/2020] [Revised: 11/30/2020] [Accepted: 02/25/2021] [Indexed: 11/13/2022] Open
Abstract
We develop a Bayesian design method for a clinical program where an investigational product is to be studied concurrently in a set of clinical trials involving related diseases with the goal of demonstrating superiority to a control in each. The approach borrows information on treatment effectiveness through correlated mixture priors, using an analysis procedure that is closely related to Bayesian model averaging. Mixture priors are constructed by eliciting conjugate priors based on pessimistic and enthusiastic predictions for the data to be observed for each disease and then by eliciting mixture weights for all possible configurations of the pessimistic and enthusiastic priors across the diseases to be studied. The proposed approach provides a robust framework for information borrowing in settings where the diseases may have endpoints based on different data types. We show via simulation that operating characteristics based on the proposed design framework are favorable compared to those based on information borrowing designs using the Bayesian hierarchical model, which is poorly suited for information borrowing when there are different data types underpinning the endpoints across which information is to be borrowed.
Affiliation(s)
- Matthew A Psioda
- Department of Biostatistics, University of North Carolina, McGavran-Greenberg Hall, CB#7420, Chapel Hill, NC 27599, USA
| | | | - Xun Jiang
- Amgen Inc., One Amgen Center Drive, Thousand Oaks, CA 91320, USA
| | - Jiawei Xu
- Department of Biostatistics, University of North Carolina, McGavran-Greenberg Hall, CB#7420, Chapel Hill, NC 27599, USA
| | - Joseph G Ibrahim
- Department of Biostatistics, University of North Carolina, McGavran-Greenberg Hall, CB#7420, Chapel Hill, NC 27599, USA
|
46
|
Gwon Y, Mo M, Chen MH, Chi Z, Li J, Xia AH, Ibrahim JG. Network meta-regression for ordinal outcomes: Applications in comparing Crohn's disease treatments. Stat Med 2020; 39:10.1002/sim.8518. [PMID: 32166784 PMCID: PMC7727029 DOI: 10.1002/sim.8518] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 09/01/2017] [Revised: 02/08/2020] [Accepted: 02/14/2020] [Indexed: 12/22/2022]
Abstract
Crohn's disease (CD) is a life-long condition associated with recurrent relapses characterized by abdominal pain, weight loss, anemia, and persistent diarrhea. In the US, there are approximately 780 000 CD patients and 33 000 new cases added each year. In this article, we propose a new network meta-regression approach for modeling ordinal outcomes in order to assess the efficacy of treatments for CD. Specifically, we develop regression models based on aggregate covariates for the underlying cut points of the ordinal outcomes as well as for the variances of the random effects to capture heterogeneity across trials. Our proposed models are particularly useful for indirect comparisons of multiple treatments that have not been compared head-to-head within the network meta-analysis framework. Moreover, we introduce Pearson residuals and construct an invariant test statistic to evaluate goodness-of-fit in the setting of ordinal outcome data. A detailed case study demonstrating the usefulness of the proposed methodology is carried out using aggregate ordinal outcome data from 16 clinical trials for treating CD.
Affiliation(s)
- Yeongjin Gwon
- Department of Biostatistics, University of Nebraska Medical Center, Omaha, Nebraska, USA
| | - May Mo
- Amgen Inc., Thousand Oaks, California, USA
| | - Ming-Hui Chen
- Department of Statistics, University of Connecticut, Storrs, Connecticut, USA
| | - Zhiyi Chi
- Department of Statistics, University of Connecticut, Storrs, Connecticut, USA
| | - Juan Li
- Lilly Biotechnology Center, Eli Lilly and Company, San Diego, California, USA
| | - Amy H. Xia
- Amgen Inc., Thousand Oaks, California, USA
| | - Joseph G. Ibrahim
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
|
47
|
Wang Y, Ibrahim JG, Zhu H. Partial least squares for functional joint models with applications to the Alzheimer's disease neuroimaging initiative study. Biometrics 2020; 76:1109-1119. [PMID: 32010968 DOI: 10.1111/biom.13219] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/14/2018] [Revised: 11/15/2019] [Accepted: 01/13/2020] [Indexed: 11/30/2022]
Abstract
Many biomedical studies have identified important imaging biomarkers that are associated with both repeated clinical measures and a survival outcome. The functional joint model (FJM) framework, proposed by Li and Luo in 2017, investigates the association between repeated clinical measures and survival data, while adjusting for both high-dimensional images and low-dimensional covariates based on the functional principal component analysis (FPCA). In this paper, we propose a novel algorithm for the estimation of FJM based on the functional partial least squares (FPLS). Our numerical studies demonstrate that, compared to FPCA, the proposed FPLS algorithm can yield more accurate and robust estimation and prediction performance in many important scenarios. We apply the proposed FPLS algorithm to a neuroimaging study. Data used in preparation of this article were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database.
Affiliation(s)
- Yue Wang
- Department of Biostatistics, University of Washington, Seattle, Washington
| | - Joseph G Ibrahim
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
| | - Hongtu Zhu
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
|
48
|
Psioda MA, Ibrahim JG. Bayesian clinical trial design using historical data that inform the treatment effect. Biostatistics 2020; 20:400-415. [PMID: 29547966 DOI: 10.1093/biostatistics/kxy009] [Citation(s) in RCA: 47] [Impact Index Per Article: 11.8] [Received: 02/23/2017] [Accepted: 01/13/2018] [Indexed: 11/12/2022] Open
Abstract
We consider the problem of Bayesian sample size determination for a clinical trial in the presence of historical data that inform the treatment effect. Our broadly applicable, simulation-based methodology provides a framework for calibrating the informativeness of a prior while simultaneously identifying the minimum sample size required for a new trial such that the overall design has appropriate power to detect a non-null treatment effect and reasonable type I error control. We develop a comprehensive strategy for eliciting null and alternative sampling prior distributions which are used to define Bayesian generalizations of the traditional notions of type I error control and power. Bayesian type I error control requires that a weighted-average type I error rate not exceed a prespecified threshold. We develop a procedure for generating an appropriately sized Bayesian hypothesis test using a simple partial-borrowing power prior which summarizes the fraction of information borrowed from the historical trial. We present results from simulation studies that demonstrate that a hypothesis test procedure based on this simple power prior is as efficient as those based on more complicated meta-analytic priors, such as normalized power priors or robust mixture priors, when all are held to precise type I error control requirements. We demonstrate our methodology using a real data set to design a follow-up clinical trial with time-to-event endpoint for an investigational treatment in high-risk melanoma.
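The idea of a partial-borrowing power prior with a fixed borrowing fraction can be sketched in the simplest conjugate case. This is an illustration under stated assumptions, not the paper's design procedure: it uses a binomial endpoint rather than the time-to-event endpoint of the application, a Beta(`a_init`, `b_init`) initial prior, and a fixed discounting parameter `a0` between 0 and 1 that raises the historical likelihood to the power `a0`. All names and numbers are hypothetical.

```python
def power_prior_posterior(y, n, y0, n0, a0, a_init=1.0, b_init=1.0):
    """Beta posterior parameters for a response rate under a fixed-a0 power prior.

    y, n   : responders and sample size in the new trial
    y0, n0 : responders and sample size in the historical trial
    a0     : fraction of historical information borrowed (0 = none, 1 = full pooling)
    """
    if not 0.0 <= a0 <= 1.0:
        raise ValueError("a0 must lie in [0, 1]")
    # The historical likelihood raised to the power a0 contributes a0 * y0
    # pseudo-successes and a0 * (n0 - y0) pseudo-failures.
    a_post = a_init + a0 * y0 + y
    b_post = b_init + a0 * (n0 - y0) + (n - y)
    return a_post, b_post


def posterior_mean(a_post, b_post):
    """Mean of a Beta(a_post, b_post) distribution."""
    return a_post / (a_post + b_post)


# Illustrative data: historical trial 30/50 responders, new trial 12/40.
no_borrowing = power_prior_posterior(12, 40, 30, 50, a0=0.0)
half_borrowing = power_prior_posterior(12, 40, 30, 50, a0=0.5)
full_borrowing = power_prior_posterior(12, 40, 30, 50, a0=1.0)
```

Here `a0 * n0` acts as the effective number of historical subjects added to the analysis, so calibrating `a0` is one way to trade additional power against type I error inflation when the historical and new data disagree.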
Affiliation(s)
- Matthew A Psioda
- Department of Biostatistics, University of North Carolina, McGavran-Greenberg Hall, CB#7420, Chapel Hill, NC, USA
| | - Joseph G Ibrahim
- Department of Biostatistics, University of North Carolina, McGavran-Greenberg Hall, CB#7420, Chapel Hill, NC, USA
|
49
|
Sun W, Jin C, Gelfond JA, Chen MH, Ibrahim JG. Joint analysis of single-cell and bulk tissue sequencing data to infer intratumor heterogeneity. Biometrics 2019; 76:983-994. [PMID: 31813161 DOI: 10.1111/biom.13198] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 02/09/2018] [Revised: 10/23/2019] [Accepted: 11/25/2019] [Indexed: 11/28/2022]
Abstract
Many computational methods have been developed to discern intratumor heterogeneity (ITH) using DNA sequence data from bulk tumor samples. These methods share an assumption that two mutations arise from the same subclone if they have similar mutant allele-frequencies (MAFs), and thus it is difficult or impossible to distinguish two subclones with similar MAFs. Single-cell DNA sequencing (scDNA-seq) data can be very informative for ITH inference. However, due to the difficulty of DNA amplification, scDNA-seq data are often very noisy. A promising new study design is to collect both bulk and single-cell DNA-seq data and jointly analyze them to mitigate the limitations of each data type. To address the analytic challenges of this new study design, we propose a computational method named BaSiC (Bulk tumor and Single Cell), to discern ITH by jointly analyzing DNA-seq data from bulk tumor and single cells. We demonstrate that BaSiC has comparable or better performance than the methods using either data type. We further evaluate BaSiC using bulk tumor and single-cell DNA-seq data from a breast cancer patient and several leukemia patients.
Affiliation(s)
- Wei Sun
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington
| | - Chong Jin
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
| | - Jonathan A Gelfond
- Department of Epidemiology and Biostatistics, UT Health Science Center, San Antonio, Texas
| | - Ming-Hui Chen
- Department of Statistics, University of Connecticut, Storrs, Connecticut
| | - Joseph G Ibrahim
- Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina
|
50
|
Baldoni PL, Rashid NU, Ibrahim JG. Improved detection of epigenomic marks with mixed-effects hidden Markov models. Biometrics 2019; 75:1401-1413. [PMID: 31081192 PMCID: PMC6851437 DOI: 10.1111/biom.13083] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2018] [Accepted: 05/03/2019] [Indexed: 11/30/2022]
Abstract
Chromatin immunoprecipitation followed by next-generation sequencing (ChIP-seq) is a technique to detect genomic regions containing protein-DNA interaction, such as transcription factor binding sites or regions containing histone modifications. One goal of the analysis of ChIP-seq experiments is to identify genomic loci enriched for sequencing reads pertaining to DNA bound to the factor of interest. The accurate identification of such regions aids in the understanding of epigenomic marks and gene regulatory mechanisms. Given the reduction of massively parallel sequencing costs, methods to detect consensus regions of enrichment across multiple samples are of interest. Here, we present a statistical model to detect broad consensus regions of enrichment from ChIP-seq technical or biological replicates through a class of zero-inflated mixed-effects hidden Markov models. We show that the proposed model outperforms existing methods for consensus peak calling in common epigenomic marks by accounting for the excess zeros and sample-specific biases. We apply our method to data from the Encyclopedia of DNA Elements and Roadmap Epigenomics projects and also from an extensive simulation study.
Affiliation(s)
- Pedro L. Baldoni
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, U.S.A
| | - Naim U. Rashid
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, U.S.A
| | - Joseph G. Ibrahim
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, U.S.A
|