1
Liu Z, Liu ZA, Hosni A, Kim J, Jiang B, Saarela O. A Bayesian joint model for mediation analysis with matrix-valued mediators. Biometrics 2024; 80:ujae143. [PMID: 39671276 DOI: 10.1093/biomtc/ujae143] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2023] [Revised: 09/19/2024] [Accepted: 12/11/2024] [Indexed: 12/15/2024]
Abstract
Unscheduled treatment interruptions may lead to reduced quality of care in radiation therapy (RT). Identifying the RT prescription dose effects on the outcome of treatment interruptions, mediated through doses distributed into different organs at risk (OARs), can inform future treatment planning. The radiation exposure to OARs can be summarized by a matrix of dose-volume histograms (DVH) for each patient. Although various methods for high-dimensional mediation analysis have been proposed recently, few studies investigated how matrix-valued data can be treated as mediators. In this paper, we propose a novel Bayesian joint mediation model for high-dimensional matrix-valued mediators. In this joint model, latent features are extracted from the matrix-valued data through an adaptation of probabilistic multilinear principal components analysis (MPCA), retaining the inherent matrix structure. We derive and implement a Gibbs sampling algorithm to jointly estimate all model parameters, and introduce a Varimax rotation method to identify active indicators of mediation among the matrix-valued data. Our simulation study finds that the proposed joint model has higher efficiency in estimating causal decomposition effects compared to an alternative two-step method, and demonstrates that the mediation effects can be identified and visualized in the matrix form. We apply the method to study the effect of prescription dose on treatment interruptions in anal canal cancer patients.
Affiliation(s)
- Zijin Liu
  - Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario M5T 3M7, Canada
- Zhihui Amy Liu
  - Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario M5T 3M7, Canada
  - Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario M5G 2M9, Canada
- Ali Hosni
  - Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario M5G 2M9, Canada
- John Kim
  - Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario M5G 2M9, Canada
- Bei Jiang
  - Department of Mathematical and Statistical Sciences, University of Alberta, Edmonton, Alberta T6G 2G1, Canada
- Olli Saarela
  - Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario M5T 3M7, Canada
2
Dai W, Zhang H. An integrative network-based mediation model (NMM) to estimate multiple genetic effects on outcomes mediated by functional connectivity. Ann Appl Stat 2024; 18:2277-2294. [PMID: 39640845 PMCID: PMC11616023 DOI: 10.1214/24-aoas1880] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/07/2024]
Abstract
Functional connectivity of the brain, characterized by interconnected neural circuits across functional networks, is a cutting-edge feature in neuroimaging. It has the potential to mediate the effect of genetic variants on behavioral outcomes or diseases. Existing mediation analysis methods can evaluate the impact of genetics and brain structure/function on cognitive behavior or disorders, but they tend to be limited to single genetic variants or univariate mediators, without considering cumulative genetic effects or the complex matrix, group, and network structures of functional connectivity. To address this gap, the paper presents an integrative network-based mediation model (NMM) that estimates the effect of multiple genetic variants on behavioral outcomes or diseases mediated by functional connectivity. The model incorporates group information of inter-regions at the broad network level and imposes low-rank and sparse assumptions to reflect the complex structures of functional connectivity and select network mediators simultaneously. We adopt a block coordinate descent algorithm to implement a fast and efficient solution to our model. Simulation results indicate the efficacy of the model in selecting active mediators and reducing bias in effect estimation. With application to the Human Connectome Project Young Adult (HCP-YA) study of 493 young adults, two genetic variants (rs769448 and rs769449) on the APOE4 gene are identified that lead to deficits in functional connectivity within visual networks and fluid intelligence.
Affiliation(s)
- Wei Dai
  - Department of Biostatistics, Yale University School of Public Health
- Heping Zhang
  - Department of Biostatistics, Yale University School of Public Health
3
Zhang Q, Yang Z, Yang J. Dissecting the colocalized GWAS and eQTLs with mediation analysis for high-dimensional exposures and confounders. Biometrics 2024; 80:ujae050. [PMID: 38801257 DOI: 10.1093/biomtc/ujae050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2023] [Revised: 03/14/2024] [Accepted: 05/14/2024] [Indexed: 05/29/2024]
Abstract
To leverage the advancements in genome-wide association studies (GWAS) and quantitative trait loci (QTL) mapping for traits and molecular phenotypes to gain mechanistic understanding of the genetic regulation, biological researchers often investigate the expression QTLs (eQTLs) that colocalize with QTL or GWAS peaks. Our research is inspired by 2 such studies. One aims to identify the causal single nucleotide polymorphisms that are responsible for the phenotypic variation and whose effects can be explained by their impacts at the transcriptomic level in maize. The other study, in mouse, focuses on uncovering the cis-driver genes that induce phenotypic changes by regulating trans-regulated genes. Both studies can be formulated as mediation problems with potentially high-dimensional exposures, confounders, and mediators that seek to estimate the overall indirect effect (IE) for each exposure. In this paper, we propose MedDiC, a novel procedure to estimate the overall IE based on the difference-in-coefficients approach. Our simulation studies find that MedDiC offers valid inference for the IE with higher power, shorter confidence intervals, and faster computing time than competing methods. We apply MedDiC to the 2 aforementioned motivating datasets and find that MedDiC yields reproducible outputs across the analysis of closely related traits, with results supported by external biological evidence. The code and additional information are available on our GitHub page (https://github.com/QiZhangStat/MedDiC).
Affiliation(s)
- Qi Zhang
  - Department of Mathematics and Statistics, University of New Hampshire, Durham, NH 03824, United States
- Zhikai Yang
  - Complex Biosystems Program and Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE 68583, United States
- Jinliang Yang
  - Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE 68583, United States
4
Yang Z, Zhao T, Cheng H, Yang J. Microbiome-enabled genomic selection improves prediction accuracy for nitrogen-related traits in maize. G3 (Bethesda, Md.) 2024; 14:jkad286. [PMID: 38113533 PMCID: PMC11090461 DOI: 10.1093/g3journal/jkad286] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2023] [Revised: 05/19/2023] [Accepted: 12/05/2023] [Indexed: 12/21/2023]
Abstract
Root-associated microbiomes in the rhizosphere (rhizobiomes) are increasingly known to play an important role in nutrient acquisition, stress tolerance, and disease resistance of plants. However, it remains largely unclear to what extent these rhizobiomes contribute to trait variation for different genotypes and if their inclusion in the genomic selection protocol can enhance prediction accuracy. To address these questions, we developed a microbiome-enabled genomic selection method that incorporated host SNPs and amplicon sequence variants from plant rhizobiomes in a maize diversity panel under high and low nitrogen (N) field conditions. Our cross-validation results showed that the microbiome-enabled genomic selection model significantly outperformed the conventional genomic selection model for nearly all time-series traits related to plant growth and N responses, with an average relative improvement of 3.7%. The improvement was more pronounced under low N conditions (8.4-40.2% of relative improvement), consistent with the view that some beneficial microbes can enhance N nutrient uptake, particularly in low N fields. However, our study could not definitively rule out the possibility that the observed improvement is partially due to the amplicon sequence variants being influenced by microenvironments. Using a high-dimensional mediation analysis method, our study has also identified microbial mediators that establish a link between plant genotype and phenotype. Some of the detected mediator microbes were previously reported to promote plant growth. The enhanced prediction accuracy of the microbiome-enabled genomic selection models, demonstrated in a single environment, serves as a proof-of-concept for the potential application of microbiome-enabled plant breeding for sustainable agriculture.
Affiliation(s)
- Zhikai Yang
  - Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE 68588, USA
  - Center for Plant Science Innovation, University of Nebraska-Lincoln, Lincoln, NE 68583, USA
- Tianjing Zhao
  - Department of Animal Science, University of Nebraska-Lincoln, Lincoln, NE 68583, USA
  - Department of Animal Science, University of California Davis, Davis, CA 95616, USA
- Hao Cheng
  - Department of Animal Science, University of California Davis, Davis, CA 95616, USA
- Jinliang Yang
  - Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE 68588, USA
  - Center for Plant Science Innovation, University of Nebraska-Lincoln, Lincoln, NE 68583, USA
5
Clark-Boucher D, Zhou X, Du J, Liu Y, Needham BL, Smith JA, Mukherjee B. Methods for mediation analysis with high-dimensional DNA methylation data: Possible choices and comparisons. PLoS Genet 2023; 19:e1011022. [PMID: 37934796 PMCID: PMC10655967 DOI: 10.1371/journal.pgen.1011022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2023] [Revised: 11/17/2023] [Accepted: 10/18/2023] [Indexed: 11/09/2023] Open
Abstract
Epigenetic researchers often evaluate DNA methylation as a potential mediator of the effect of social/environmental exposures on a health outcome. Modern statistical methods for jointly evaluating many mediators have not been widely adopted. We compare seven methods for high-dimensional mediation analysis with continuous outcomes through both diverse simulations and analysis of DNAm data from a large multi-ethnic cohort in the United States, while providing an R package for their seamless implementation and adoption. Among the considered choices, the best-performing methods for detecting active mediators in simulations are the Bayesian sparse linear mixed model (BSLMM) and high-dimensional mediation analysis (HDMA); while the preferred methods for estimating the global mediation effect are high-dimensional linear mediation analysis (HILMA) and principal component mediation analysis (PCMA). We provide guidelines for epigenetic researchers on choosing the best method in practice and offer suggestions for future methodological development.
Affiliation(s)
- Dylan Clark-Boucher
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health; Boston, Massachusetts, United States of America
- Xiang Zhou
  - Department of Biostatistics, University of Michigan; Ann Arbor, Michigan, United States of America
- Jiacong Du
  - Department of Biostatistics, University of Michigan; Ann Arbor, Michigan, United States of America
- Yongmei Liu
  - Department of Medicine, Divisions of Cardiology and Neurology, Duke University Medical Center; Durham, North Carolina, United States of America
- Belinda L. Needham
  - Department of Epidemiology, University of Michigan; Ann Arbor, Michigan, United States of America
- Jennifer A. Smith
  - Department of Epidemiology, University of Michigan; Ann Arbor, Michigan, United States of America
  - Survey Research Center, Institute for Social Research, University of Michigan; Ann Arbor, Michigan, United States of America
- Bhramar Mukherjee
  - Department of Biostatistics, University of Michigan; Ann Arbor, Michigan, United States of America
  - Department of Epidemiology, University of Michigan; Ann Arbor, Michigan, United States of America
6
Clark-Boucher D, Zhou X, Du J, Liu Y, Needham BL, Smith JA, Mukherjee B. Methods for Mediation Analysis with High-Dimensional DNA Methylation Data: Possible Choices and Comparison. medRxiv: the preprint server for health sciences 2023:2023.02.10.23285764. [PMID: 36824903 PMCID: PMC9949196 DOI: 10.1101/2023.02.10.23285764] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 02/16/2023]
Abstract
Epigenetic researchers often evaluate DNA methylation as a mediator between social/environmental exposures and disease, but modern statistical methods for jointly evaluating many mediators have not been widely adopted. We compare seven methods for high-dimensional mediation analysis with continuous outcomes through both diverse simulations and analysis of DNAm data from a large national cohort in the United States, while providing an R package for their implementation. Among the considered choices, the best-performing methods for detecting active mediators in simulations are the Bayesian sparse linear mixed model by Song et al. (2020) and high-dimensional mediation analysis by Gao et al. (2019); while the superior methods for estimating the global mediation effect are high-dimensional linear mediation analysis by Zhou et al. (2021) and principal component mediation analysis by Huang and Pan (2016). We provide guidelines for epigenetic researchers on choosing the best method in practice and offer suggestions for future methodological development.
Affiliation(s)
- Dylan Clark-Boucher
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA
- Xiang Zhou
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI
- Jiacong Du
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI
- Yongmei Liu
  - Department of Medicine, Divisions of Cardiology and Neurology, Duke University Medical Center, Durham, NC
- Jennifer A Smith
  - Department of Epidemiology, University of Michigan, Ann Arbor, MI
  - Survey Research Center, Institute for Social Research, University of Michigan, Ann Arbor, MI
- Bhramar Mukherjee
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI
  - Department of Epidemiology, University of Michigan, Ann Arbor, MI
7
A cognitive neurogenetic approach to uncovering the structure of executive functions. Nat Commun 2022; 13:4588. [PMID: 35933428 PMCID: PMC9357028 DOI: 10.1038/s41467-022-32383-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 01/18/2022] [Accepted: 07/27/2022] [Indexed: 11/08/2022] Open
Abstract
One central mission of cognitive neuroscience is to understand the ontology of complex cognitive functions. We addressed this question with a cognitive neurogenetic approach using a large-scale dataset of executive functions (EFs), whole-brain resting-state functional connectivity, and genetic polymorphisms. We found that the bifactor model with common and shifting-specific components not only was parsimonious but also showed maximal dissociations among the EF components at behavioral, neural, and genetic levels. In particular, the genes with enhanced expression in the middle frontal gyrus (MFG) and the subcallosal cingulate gyrus (SCG) showed enrichment for the common and shifting-specific component, respectively. Finally, high-dimensional mediation models further revealed that the functional connectivity patterns significantly mediated the genetic effect on the common EF component. Our study not only reveals insights into the ontology of EFs and their neurogenetic basis, but also provides useful tools to uncover the structure of complex constructs of human cognition.
8
Yang Z, Xu G, Zhang Q, Obata T, Yang J. Genome-wide mediation analysis: an empirical study to connect phenotype with genotype via intermediate transcriptomic data in maize. Genetics 2022; 221:6572813. [PMID: 35460234 PMCID: PMC9157066 DOI: 10.1093/genetics/iyac057] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 02/07/2022] [Accepted: 04/04/2022] [Indexed: 11/13/2022] Open
Abstract
Mapping genotype to phenotype is an essential topic in genetics and genomics research. As Omics data become increasingly available, 2-variable methods have been widely applied to associate genotype with the phenotype (genome-wide association study), gene expression with the phenotype (transcriptome-wide association study), and genotype with gene expression. However, signals detected by these 2-variable association methods suffer from low mapping resolution or inexplicit causality between genotype and phenotype, making it challenging to interpret and validate the molecular mechanisms of the underlying genomic variations and the candidate genes. In the context of genetics research, we hypothesized a causal chain from genotype to phenotype partially mediated by intermediate molecular processes, i.e. gene expression. To test this hypothesis, we applied high-dimensional mediation analysis, a class of causal inference methods with an assumed causal chain from the exposure to the mediator to the outcome, and implemented it with a maize association panel (N = 280 lines). Using 40 publicly available agronomy traits, 66 newly generated metabolite traits, and published RNA-seq data from 7 different tissues, our empirical study detected 736 unique mediating genes. Notably, 83/736 (11%) genes were identified in mediating more than 1 trait, suggesting the prevalence of pleiotropic mediating effects. We demonstrated that several identified mediating genes are consistent with their known functions. In addition, our results provided explicit hypotheses for functional validation and suggested that mediation analysis is a powerful tool to integrate Omics data to connect genotype to phenotype.
Affiliation(s)
- Zhikai Yang
  - Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE 68588, USA
  - Center for Plant Science Innovation, University of Nebraska-Lincoln, Lincoln, NE 68583, USA
- Gen Xu
  - Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE 68588, USA
  - Center for Plant Science Innovation, University of Nebraska-Lincoln, Lincoln, NE 68583, USA
- Qi Zhang
  - Department of Mathematics and Statistics, University of New Hampshire, Durham, NH 03824, USA
- Toshihiro Obata
  - Center for Plant Science Innovation, University of Nebraska-Lincoln, Lincoln, NE 68583, USA
  - Department of Biochemistry, University of Nebraska-Lincoln, Lincoln, NE 68583, USA
- Jinliang Yang
  - Corresponding author: Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE 68588, USA.