1
Chi S, Flowers CR, Li Z, Huang X, Wei P. MASH: Mediation Analysis of Survival Outcome and High-Dimensional Omics Mediators with Application to Complex Diseases. bioRxiv 2024:2023.08.22.554286. [PMID: 37662296 PMCID: PMC10473652 DOI: 10.1101/2023.08.22.554286] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/05/2023]
Abstract
Environmental exposures such as cigarette smoking influence health outcomes through intermediate molecular phenotypes, such as the methylome, transcriptome, and metabolome. Mediation analysis is a useful tool for investigating the role of potentially high-dimensional intermediate phenotypes in the relationship between environmental exposures and health outcomes. However, little work has been done on mediation analysis when the mediators are high-dimensional and the outcome is a survival endpoint, and none of it has provided a robust measure of total mediation effect. To this end, we propose an estimation procedure for Mediation Analysis of Survival outcome and High-dimensional omics mediators (MASH) based on sure independence screening for putative mediator variable selection and a second-moment-based measure of total mediation effect for survival data analogous to the R² measure in a linear model. Extensive simulations showed good performance of MASH in estimating the total mediation effect and identifying true mediators. By applying MASH to the metabolomics data of 1919 subjects in the Framingham Heart Study, we identified five metabolites as mediators of the effect of cigarette smoking on coronary heart disease risk (total mediation effect, 51.1%) and two metabolites as mediators between smoking and risk of cancer (total mediation effect, 50.7%). Application of MASH to a diffuse large B-cell lymphoma genomics data set identified copy-number variations for eight genes as mediators between the baseline International Prognostic Index score and overall survival.
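The screening step named in the abstract can be illustrated with a minimal sketch. The abstract does not say which marginal statistic MASH ranks by, so this example assumes absolute Pearson correlation with the exposure and the commonly used default cutoff of n / log(n) retained variables; the function names `pearson` and `sis_screen` and the toy data are hypothetical, not taken from the paper.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def sis_screen(exposure, mediators, keep=None):
    """Rank candidate mediators by |correlation with exposure| and keep the top ones.

    mediators: dict mapping a mediator name to its list of values.
    keep: number retained; defaults to floor(n / log(n)), which needs n >= 2.
    The ranking statistic is an assumption of this sketch, not MASH's definition.
    """
    n = len(exposure)
    if keep is None:
        keep = max(1, int(n / math.log(n)))
    ranked = sorted(
        mediators,
        key=lambda name: abs(pearson(exposure, mediators[name])),
        reverse=True,
    )
    return ranked[:keep]


# Toy data (hypothetical): m1 tracks the exposure closely, m3 barely at all.
exposure = [0, 1, 2, 3, 4, 5]
mediators = {
    "m3": [1, -1, 1, -1, 1, -1],
    "m1": [0.1, 1.2, 1.9, 3.1, 4.0, 5.2],
    "m2": [3, 1, 4, 1, 5, 9],
}
top_two = sis_screen(exposure, mediators, keep=2)
```

Screening of this kind only reduces the candidate set to a manageable size; the retained variables would still need a joint model before any mediation effect is estimated.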
Affiliation(s)
- Sunyi Chi
  - Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Christopher R Flowers
  - Department of Lymphoma, The University of Texas MD Anderson Cancer Center, Houston, USA
- Ziyi Li
  - Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Xuelin Huang
  - Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Peng Wei
  - Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
2
Wang S, Huang Y. DP2LM: leveraging deep learning approach for estimation and hypothesis testing on mediation effects with high-dimensional mediators and complex confounders. Biostatistics 2024:kxad037. [PMID: 38330064 DOI: 10.1093/biostatistics/kxad037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2023] [Revised: 12/18/2023] [Accepted: 12/21/2023] [Indexed: 02/10/2024]
Abstract
Traditional linear mediation analysis has inherent limitations when it comes to handling high-dimensional mediators. Particularly, accurately estimating and rigorously inferring mediation effects is challenging, primarily due to the intertwined nature of the mediator selection issue. Despite recent developments, the existing methods are inadequate for addressing the complex relationships introduced by confounders. To tackle these challenges, we propose a novel approach called DP2LM (Deep neural network-based Penalized Partially Linear Mediation). This approach incorporates deep neural network techniques to account for nonlinear effects in confounders and utilizes the penalized partially linear model to accommodate high dimensionality. Unlike most existing works that concentrate on mediator selection, our method prioritizes estimation and inference on mediation effects. Specifically, we develop test procedures for testing the direct and indirect mediation effects. Theoretical analysis shows that the tests maintain the Type-I error rate. In simulation studies, DP2LM demonstrates its superior performance as a modeling tool for complex data, outperforming existing approaches in a wide range of settings and providing reliable estimation and inference in scenarios involving a considerable number of mediators. Further, we apply DP2LM to investigate the mediation effect of DNA methylation on cortisol stress reactivity in individuals who experienced childhood trauma, uncovering new insights through a comprehensive analysis.
Affiliation(s)
- Shuoyang Wang
  - Department of Biostatistics, Yale University, New Haven, CT 06520, USA
- Yuan Huang
  - Department of Biostatistics, Yale University, New Haven, CT 06520, USA
3
Cui Y, Luo C, Luo L, Yu Z. High-Dimensional Mediation Analysis Based on Additive Hazards Model for Survival Data. Front Genet 2021; 12:771932. [PMID: 35003213 PMCID: PMC8734376 DOI: 10.3389/fgene.2021.771932] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/07/2021] [Accepted: 10/19/2021] [Indexed: 11/13/2022]
Abstract
Mediation analysis has been extensively used to identify potential pathways between exposure and outcome. However, methods for high-dimensional mediation analysis of survival data remain underdeveloped, especially for non-Cox model approaches. We propose a procedure including "two-step" variable selection and indirect effect estimation for the additive hazards model with high-dimensional mediators. We first apply sure independence screening and smoothly clipped absolute deviation regularization to select mediators. Then we use the Sobel test and the Benjamini-Hochberg (BH) method for indirect effect hypothesis testing. Simulation results demonstrate its good performance with a higher true-positive rate and accuracy, as well as a lower false-positive rate. We apply the proposed procedure to analyze DNA methylation markers mediating the effect of smoking on survival time of lung cancer patients in a TCGA (The Cancer Genome Atlas) cohort study. The real data application identifies four mediating CpGs, three of which are newly found.
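The testing step described in this abstract, a Sobel test per mediator followed by BH adjustment, can be sketched as below. This is an illustration under stated assumptions rather than the authors' code: it assumes each mediator already has a fitted exposure-to-mediator coefficient (`alpha`) and mediator-to-outcome coefficient (`beta`) with standard errors, uses the first-order Sobel standard error, and all function names and numeric inputs are hypothetical.

```python
import math


def sobel_p_value(alpha, se_alpha, beta, se_beta):
    """Two-sided Sobel p-value for the indirect effect alpha * beta.

    Uses the first-order standard error sqrt(alpha^2 se_beta^2 + beta^2 se_alpha^2),
    which must be nonzero.
    """
    se_indirect = math.sqrt(alpha ** 2 * se_beta ** 2 + beta ** 2 * se_alpha ** 2)
    z = (alpha * beta) / se_indirect
    # Two-sided standard normal tail: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical (alpha, se_alpha, beta, se_beta) estimates for three mediators.
estimates = [(0.5, 0.1, 0.4, 0.1), (0.05, 0.1, 0.3, 0.1), (0.6, 0.1, 0.02, 0.1)]
raw_p = [sobel_p_value(*e) for e in estimates]
adj_p = benjamini_hochberg(raw_p)
selected = [i for i, p in enumerate(adj_p) if p < 0.05]
```

The Sobel test is known to be conservative when either coefficient is near zero, which is one reason it is usually applied only after a screening step has reduced the number of candidate mediators.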
Affiliation(s)
- Yidan Cui
  - Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
  - SJTU-Yale Joint Center for Biostatistics, Shanghai Jiao Tong University, Shanghai, China
- Chengwen Luo
  - Public Laboratory, Taizhou Hospital of Zhejiang Province, Wenzhou Medical University, Linhai, Zhejiang, China
- Linghao Luo
  - Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
  - SJTU-Yale Joint Center for Biostatistics, Shanghai Jiao Tong University, Shanghai, China
- Zhangsheng Yu
  - Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
  - SJTU-Yale Joint Center for Biostatistics, Shanghai Jiao Tong University, Shanghai, China
  - Clinical Research Institute, Shanghai Jiao Tong University School of Medicine, Shanghai, China
4
Abstract
Mediation analysis is a common statistical method for investigating the mechanisms by which environmental exposures affect health outcomes. Previous studies have extended mediation models with a single mediator to the selection of high-dimensional mediators. It is often assumed that there are no confounders that influence the relations among the exposure, mediator, and outcome. This is unrealistic for observational studies. To accommodate potential confounders, we propose a concise and efficient high-dimensional mediation analysis procedure using the propensity score for adjustment. Results from simulation studies demonstrate that the proposed procedure performs well in mediator selection and effect estimation compared with methods that ignore all confounders. Of note, as the sample size increases, its performance in variable selection and mediation effect estimation approaches that of the method that includes all confounders as covariates in the mediation model. By applying this procedure to a TCGA lung cancer data set, we find that lung cancer patients with a serious smoking history have an increased risk of death via the methylation markers cg21926276 and cg20707991, with significant hazard ratios of 1.2093 (95% CI: 1.2019-1.2167) and 1.1388 (95% CI: 1.1339-1.1438), respectively.
Affiliation(s)
- Zhangsheng Yu
  - Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
  - SJTU-Yale Joint Center for Biostatistics, Shanghai Jiao Tong University, Shanghai, China
  - Clinical Research Institute, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Yidan Cui
  - Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
  - SJTU-Yale Joint Center for Biostatistics, Shanghai Jiao Tong University, Shanghai, China
- Ting Wei
  - Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
  - SJTU-Yale Joint Center for Biostatistics, Shanghai Jiao Tong University, Shanghai, China
- Yanran Ma
  - Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
  - SJTU-Yale Joint Center for Biostatistics, Shanghai Jiao Tong University, Shanghai, China
- Chengwen Luo
  - Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
  - SJTU-Yale Joint Center for Biostatistics, Shanghai Jiao Tong University, Shanghai, China
5
Song Y, Zhou X, Zhang M, Zhao W, Liu Y, Kardia SLR, Diez Roux AV, Needham BL, Smith JA, Mukherjee B. Bayesian shrinkage estimation of high dimensional causal mediation effects in omics studies. Biometrics 2020; 76:700-710. [PMID: 31733066 PMCID: PMC7228845 DOI: 10.1111/biom.13189] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 10/29/2018] [Revised: 10/30/2019] [Accepted: 11/04/2019] [Indexed: 11/29/2022]
Abstract
Causal mediation analysis aims to examine the role of a mediator or a group of mediators that lie in the pathway between an exposure and an outcome. Recent biomedical studies often involve a large number of potential mediators based on high-throughput technologies. Most of the current analytic methods focus on settings with one or a moderate number of potential mediators. With the expanding growth of -omics data, joint analysis of molecular-level genomics data with epidemiological data through mediation analysis is becoming more common. However, such joint analysis requires methods that can simultaneously accommodate high-dimensional mediators, and such methods are currently lacking. To address this problem, we develop a Bayesian inference method using continuous shrinkage priors to extend previous causal mediation analysis techniques to a high-dimensional setting. Simulations demonstrate that our method improves the power of global mediation analysis compared to simpler alternatives and has decent performance in identifying true nonnull contributions to the mediation effects of the pathway. The Bayesian method also helps us to understand the structure of the composite null cases for inactive mediators in the pathway. We applied our method to the Multi-Ethnic Study of Atherosclerosis and identified DNA methylation regions that may actively mediate the effect of socioeconomic status on cardiometabolic outcomes.
Affiliation(s)
- Yanyi Song
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, U.S.A
- Xiang Zhou
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, U.S.A
- Min Zhang
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, U.S.A
- Wei Zhao
  - Department of Epidemiology, University of Michigan, Ann Arbor, MI, U.S.A
- Yongmei Liu
  - Department of Epidemiology and Prevention, Wake Forest School of Medicine, Winston-Salem, NC, U.S.A
- Ana V. Diez Roux
  - Department of Epidemiology and Biostatistics, Drexel University, Philadelphia, PA, U.S.A
- Belinda L. Needham
  - Department of Epidemiology, University of Michigan, Ann Arbor, MI, U.S.A
- Jennifer A. Smith
  - Department of Epidemiology, University of Michigan, Ann Arbor, MI, U.S.A
- Bhramar Mukherjee
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, U.S.A