1. Fu J, Koslovsky MD, Neophytou AM, Vannucci M. A Bayesian joint model for compositional mediation effect selection in microbiome data. Stat Med 2023. [PMID: 37173609 DOI: 10.1002/sim.9764] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2022] [Revised: 04/17/2023] [Accepted: 04/26/2023] [Indexed: 05/15/2023]
Abstract
Analyzing multivariate count data generated by high-throughput sequencing technology in microbiome research studies is challenging due to the high-dimensional and compositional structure of the data and overdispersion. In practice, researchers are often interested in investigating how the microbiome may mediate the relation between an assigned treatment and an observed phenotypic response. Existing approaches designed for compositional mediation analysis are unable to simultaneously determine the presence of direct effects, relative indirect effects, and overall indirect effects, while quantifying their uncertainty. We propose a formulation of a Bayesian joint model for compositional data that allows for the identification, estimation, and uncertainty quantification of various causal estimands in high-dimensional mediation analysis. We conduct simulation studies and compare our method's mediation effects selection performance with existing methods. Finally, we apply our method to a benchmark data set investigating the sub-therapeutic antibiotic treatment effect on body weight in early-life mice.
Affiliation(s)
- Jingyan Fu
  - Department of Statistics, Rice University, Houston, Texas, USA
- Matthew D Koslovsky
  - Department of Statistics, Colorado State University, Fort Collins, Colorado, USA
- Andreas M Neophytou
  - Department of Environmental & Radiological Health Sciences, Colorado State University, Fort Collins, Colorado, USA
- Marina Vannucci
  - Department of Statistics, Rice University, Houston, Texas, USA
2. Koslovsky MD, Hoffman KL, Daniel CR, Vannucci M. A Bayesian model of microbiome data for simultaneous identification of covariate associations and prediction of phenotypic outcomes. Ann Appl Stat 2020. [DOI: 10.1214/20-aoas1354] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 11/19/2022]
3. Romeo G, Thoresen M. Model selection in high-dimensional noisy data: a simulation study. J Stat Comput Sim 2019. [DOI: 10.1080/00949655.2019.1607345] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2022]
Affiliation(s)
- Giovanni Romeo
  - Department of Biostatistics, Oslo Centre for Biostatistics and Epidemiology, University of Oslo, Oslo, Norway
- Magne Thoresen
  - Department of Biostatistics, Oslo Centre for Biostatistics and Epidemiology, University of Oslo, Oslo, Norway
4. Sørensen Ø, Hellton KH, Frigessi A, Thoresen M. Covariate Selection in High-Dimensional Generalized Linear Models With Measurement Error. J Comput Graph Stat 2018. [DOI: 10.1080/10618600.2018.1425626] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 02/01/2023]
Affiliation(s)
- Øystein Sørensen
  - Oslo Centre for Biostatistics and Epidemiology, Department of Biostatistics, University of Oslo, Oslo, Norway
- Arnoldo Frigessi
  - Oslo Centre for Biostatistics and Epidemiology, Department of Biostatistics, University of Oslo, Oslo, Norway
  - Oslo Centre for Biostatistics and Epidemiology, Research Support Services, Oslo University Hospital, Oslo, Norway
- Magne Thoresen
  - Oslo Centre for Biostatistics and Epidemiology, Department of Biostatistics, University of Oslo, Oslo, Norway
5. Muff S, Ott M, Braun J, Held L. Bayesian two-component measurement error modelling for survival analysis using INLA—A case study on cardiovascular disease mortality in Switzerland. Comput Stat Data Anal 2017. [DOI: 10.1016/j.csda.2017.03.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
6. Guerau-de-Arellano M, Smith KM, Godlewski J, Liu Y, Winger R, Lawler SE, Whitacre CC, Racke MK, Lovett-Racke AE. Micro-RNA dysregulation in multiple sclerosis favours pro-inflammatory T-cell-mediated autoimmunity. Brain 2011; 134:3578-89. [PMID: 22088562 DOI: 10.1093/brain/awr262] [Citation(s) in RCA: 155] [Impact Index Per Article: 11.9] [Indexed: 12/17/2022]
Abstract
Pro-inflammatory T cells mediate autoimmune demyelination in multiple sclerosis. However, the factors driving their development and multiple sclerosis susceptibility are incompletely understood. We investigated how micro-RNAs, newly described as post-transcriptional regulators of gene expression, contribute to pathogenic T-cell differentiation in multiple sclerosis. miR-128 and miR-27b were increased in naïve and miR-340 in memory CD4(+) T cells from patients with multiple sclerosis, inhibiting Th2 cell development and favouring pro-inflammatory Th1 responses. These effects were mediated by direct suppression of B lymphoma Mo-MLV insertion region 1 homolog (BMI1) and interleukin-4 (IL4) expression, resulting in decreased GATA3 levels, and a Th2 to Th1 cytokine shift. Gain-of-function experiments with these micro-RNAs enhanced the encephalitogenic potential of myelin-specific T cells in experimental autoimmune encephalomyelitis. In addition, treatment of multiple sclerosis patient T cells with oligonucleotide micro-RNA inhibitors led to the restoration of Th2 responses. These data illustrate the biological significance and therapeutic potential of these micro-RNAs in regulating T-cell phenotypes in multiple sclerosis.
7. Cheng YJ, Crainiceanu CM. Cox Models With Smooth Functional Effect of Covariates Measured With Error. J Am Stat Assoc 2009; 104:1144-1154. [PMID: 21818167 PMCID: PMC3148771 DOI: 10.1198/jasa.2009.tm08160] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 11/21/2022]
Abstract
We propose, develop, and implement a fully Bayesian inferential approach for the Cox model when the log hazard function contains unknown smooth functions of the variables measured with error. Our approach is to model nonparametrically both the log-baseline hazard and the smooth components of the log-hazard functions using low-rank penalized splines. Careful implementation of the Bayesian inferential machinery is shown to produce remarkably better results than the naive approach. Our methodology was motivated by and applied to the study of progression time to chronic kidney disease as a function of baseline kidney function and applied to the Atherosclerosis Risk in Communities study, a large epidemiological cohort study. This article has supplementary material online.
Affiliation(s)
- Yu-Jen Cheng
  - Department of Biostatistics, Johns Hopkins University, Baltimore, MD 21205
8. van Wieringen WN, Kun D, Hampel R, Boulesteix AL. Survival prediction using gene expression data: A review and comparison. Comput Stat Data Anal 2009. [DOI: 10.1016/j.csda.2008.05.021] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.0] [Indexed: 10/22/2022]
9. Shen R, Ghosh D, Taylor JMG. Modeling intra-tumor protein expression heterogeneity in tissue microarray experiments. Stat Med 2008; 27:1944-59. [PMID: 18300332 DOI: 10.1002/sim.3217] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/06/2022]
Abstract
Tissue microarrays (TMAs) measure tumor-specific protein expression via high-density immunohistochemical staining assays. They provide a proteomic platform for validating cancer biomarkers emerging from large-scale DNA microarray studies. Repeated observations within each tumor result in substantial biological and experimental variability. This variability is usually ignored when associating the TMA expression data with patient survival outcome. It generates biased estimates of hazard ratio in proportional hazards models. We propose a Latent Expression Index (LEI) as a surrogate protein expression estimate in a two-stage analysis. Several estimators of LEI are compared: an empirical Bayes, a full Bayes, and a varying replicate number estimator. In addition, we jointly model survival and TMA expression data via a shared random effects model. Bayesian estimation is carried out using a Markov chain Monte Carlo method. Simulation studies were conducted to compare the two-stage methods and the joint analysis in estimating the Cox regression coefficient. We show that the two-stage methods reduce bias relative to the naive approach, but still lead to under-estimated hazard ratios. The joint model consistently outperforms the two-stage methods in terms of both bias and coverage property in various simulation scenarios. In case studies using prostate cancer TMA data sets, the two-stage methods yield a good approximation in one data set whereas an insufficient one in the other. A general advice is to use the joint model inference whenever results differ between the two-stage methods and the joint analysis.
Affiliation(s)
- Ronglai Shen
  - Department of Epidemiology and Biostatistics, Memorial Sloan-Kettering Cancer Center, 307 East 63rd Street, New York, NY 10065, U.S.A.
10. Boulesteix AL, Strobl C, Augustin T, Daumer M. Evaluating microarray-based classifiers: an overview. Cancer Inform 2008; 6:77-97. [PMID: 19259405 PMCID: PMC2623308 DOI: 10.4137/cin.s408] [Citation(s) in RCA: 58] [Impact Index Per Article: 3.6] [Indexed: 01/23/2023]
Abstract
For the last eight years, microarray-based class prediction has been the subject of numerous publications in medicine, bioinformatics and statistics journals. However, in many articles, the assessment of classification accuracy is carried out using suboptimal procedures and is not paid much attention. In this paper, we carefully review various statistical aspects of classifier evaluation and validation from a practical point of view. The main topics addressed are accuracy measures, error rate estimation procedures, variable selection, choice of classifiers and validation strategy.
Affiliation(s)
- A-L Boulesteix
  - Sylvia Lawry Centre for MS Research (SLC), Hohenlindenerstr. 1, Munich, Germany
11. Zhang D, Zhang M. Bayesian profiling of molecular signatures to predict event times. Theor Biol Med Model 2007; 4:3. [PMID: 17239251 PMCID: PMC1796541 DOI: 10.1186/1742-4682-4-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Received: 09/24/2006] [Accepted: 01/19/2007] [Indexed: 01/28/2023]
Abstract
Background: It is of particular interest to identify cancer-specific molecular signatures for early diagnosis, monitoring effects of treatment and predicting patient survival time. Molecular information about patients is usually generated from high throughput technologies such as microarray and mass spectrometry. Statistically, we are challenged by the large number of candidates but only a small number of patients in the study, and the right-censored clinical data further complicate the analysis. Results: We present a two-stage procedure to profile molecular signatures for survival outcomes. Firstly, we group closely-related molecular features into linkage clusters, each portraying either similar or opposite functions and playing similar roles in prognosis; secondly, a Bayesian approach is developed to rank the centroids of these linkage clusters and provide a list of the main molecular features closely related to the outcome of interest. A simulation study showed the superior performance of our approach. When it was applied to data on diffuse large B-cell lymphoma (DLBCL), we were able to identify some new candidate signatures for disease prognosis. Conclusion: This multivariate approach provides researchers with a more reliable list of molecular features profiled in terms of their prognostic relationship to the event times, and generates dependable information for subsequent identification of prognostic molecular signatures through either biological procedures or further data analysis.
Affiliation(s)
- Dabao Zhang
  - Department of Statistics, Purdue University, 150 N. University Street, West Lafayette, Indiana 47907-2067, USA
- Min Zhang
  - Department of Statistics, Purdue University, 150 N. University Street, West Lafayette, Indiana 47907-2067, USA