1
Min J, Vishnyakova O, Brooks-Wilson A, Elliott LT. A Joint Bayesian Model for Change-Points and Heteroskedasticity Applied to the Canadian Longitudinal Study on Aging. J Comput Biol 2025; 32:374-393. [PMID: 39829350 DOI: 10.1089/cmb.2024.0563] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/22/2025]
Abstract
Maintaining homeostasis, the regulation of internal physiological parameters, is essential for health and well-being. Deviations from optimal levels, or 'sweet spots,' can lead to health deterioration and disease. Identifying biomarkers with sweet spots requires both change-point detection and variance effect analysis. Traditional approaches involve separate tests for change-points and heteroskedasticity, which can yield inaccurate results if model assumptions are violated. To address these challenges, we propose a unified approach: Bayesian Testing for Heteroskedasticity and Sweet Spots (BTHS). This framework integrates sampling-based parameter estimation and Bayes factor computation to enhance change-point detection, heteroskedasticity quantification, and testing in change-point regression settings, and extends previous Bayesian approaches. BTHS eliminates the need for separate analyses and provides detailed insights into both the magnitude and shape of heteroskedasticity, enabling robust identification of sweet spots without strong assumptions. We applied BTHS to blood elements from the Canadian Longitudinal Study on Aging, identifying nine blood elements with significant sweet spot variance effects.
Affiliation(s)
- Joosung Min
  - Department of Statistics and Actuarial Science, Simon Fraser University, Burnaby, BC, Canada
- Olga Vishnyakova
  - Department of Statistics and Actuarial Science, Simon Fraser University, Burnaby, BC, Canada
  - Department of Biomedical Physiology and Kinesiology, Simon Fraser University, Burnaby, BC, Canada
  - Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Angela Brooks-Wilson
  - Department of Biomedical Physiology and Kinesiology, Simon Fraser University, Burnaby, BC, Canada
  - Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Lloyd T Elliott
  - Department of Statistics and Actuarial Science, Simon Fraser University, Burnaby, BC, Canada
2
Wang C, Wang T, Kiryluk K, Wei Y, Aschard H, Ionita-Laza I. Genome-wide discovery for biomarkers using quantile regression at biobank scale. Nat Commun 2024; 15:6460. [PMID: 39085219 PMCID: PMC11291931 DOI: 10.1038/s41467-024-50726-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2023] [Accepted: 07/18/2024] [Indexed: 08/02/2024]
Abstract
Genome-wide association studies (GWAS) for biomarkers important for clinical phenotypes can lead to clinically relevant discoveries. Conventional GWAS for quantitative traits are based on simplified regression models modeling the conditional mean of a phenotype as a linear function of genotype. We draw attention here to an alternative, lesser known approach, namely quantile regression that naturally extends linear regression to the analysis of the entire conditional distribution of a phenotype of interest. Quantile regression can be applied efficiently at biobank scale, while having some unique advantages such as (1) identifying variants with heterogeneous effects across quantiles of the phenotype distribution; (2) accommodating a wide range of phenotype distributions including non-normal distributions, with invariance of results to trait transformations; and (3) providing more detailed information about genotype-phenotype associations even for those associations identified by conventional GWAS. We show in simulations that quantile regression is powerful across both homogeneous and various heterogeneous models. Applications to 39 quantitative traits in the UK Biobank demonstrate that quantile regression can be a helpful complement to linear regression in GWAS and can identify variants with larger effects on high-risk subgroups of individuals but with lower or no contribution overall.
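As a rough illustration of the idea behind quantile regression, the sketch below uses the standard check (pinball) loss and shows that minimizing it over a constant recovers a sample quantile. This is not the authors' implementation: the function names are hypothetical, and a real analysis would replace the constant with a linear function of genotype and covariates fitted by a dedicated solver.

```python
def check_loss(residual, tau):
    """Check (pinball) loss for a single residual at quantile level tau."""
    indicator = 1.0 if residual < 0 else 0.0
    return residual * (tau - indicator)


def total_loss(values, q, tau):
    """Sum of check losses when every value is predicted by the constant q."""
    return sum(check_loss(v - q, tau) for v in values)


def quantile_by_loss(values, tau):
    """Return an observed value that minimizes the total check loss.

    A minimizer of the check loss over constants is always attained at one
    of the observed values, so searching the data points is sufficient.
    """
    return min(values, key=lambda q: total_loss(values, q, tau))


if __name__ == "__main__":
    trait = [1, 2, 3, 4, 100]  # toy, skewed phenotype values
    print(quantile_by_loss(trait, 0.5))
    print(quantile_by_loss(trait, 0.9))
```

Because the loss weights positive and negative residuals asymmetrically, changing `tau` moves the fitted value along the distribution, which is what lets a quantile-based model pick up effects that differ between the middle and the tails of a trait.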
Affiliation(s)
- Chen Wang
  - Department of Biostatistics, Columbia University, New York, NY, USA
  - Division of Nephrology, Department of Medicine, Vagelos College of Physicians and Surgeons, Columbia University, New York, NY, USA
- Krzysztof Kiryluk
  - Division of Nephrology, Department of Medicine, Vagelos College of Physicians and Surgeons, Columbia University, New York, NY, USA
- Ying Wei
  - Department of Biostatistics, Columbia University, New York, NY, USA
- Hugues Aschard
  - Department of Computational Biology, Institut Pasteur, Université Paris Cité, Paris, France
- Iuliana Ionita-Laza
  - Department of Biostatistics, Columbia University, New York, NY, USA
  - Department of Statistics, Lund University, Lund, Sweden
3
Breeyear JH, Mautz BS, Keaton JM, Hellwege JN, Torstenson ES, Liang J, Bray MJ, Giri A, Warren HR, Munroe PB, Velez Edwards DR, Zhu X, Li C, Edwards TL. A new test for trait mean and variance detects unreported loci for blood-pressure variation. Am J Hum Genet 2024; 111:954-965. [PMID: 38614075 PMCID: PMC11080606 DOI: 10.1016/j.ajhg.2024.03.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2023] [Revised: 03/20/2024] [Accepted: 03/22/2024] [Indexed: 04/15/2024]
Abstract
Variability in quantitative traits has clinical, ecological, and evolutionary significance. Most genetic variants identified for complex quantitative traits have only a detectable effect on the mean of the trait. We have developed the mean-variance test (MVtest) to simultaneously model the mean and log-variance of a quantitative trait as functions of genotypes and covariates by using estimating equations. The advantages of MVtest include the facts that it can detect effect modification, that multiple testing can follow conventional thresholds, that it is robust to non-normal outcomes, and that association statistics can be meta-analyzed. In simulations, we show control of type I error of MVtest over several alternatives. We identified 51 and 37 previously unreported associations for effects on blood-pressure variance and mean, respectively, in the UK Biobank. Transcriptome-wide association studies revealed 633 significant unique gene associations with blood-pressure mean variance. MVtest is broadly applicable to studies of complex quantitative traits and provides an important opportunity to detect novel loci.
Affiliation(s)
- Joseph H Breeyear
  - Division of Epidemiology, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
  - Biostatistics and Computational Biology Branch, Division of Intramural Research, National Institute of Environmental Health Sciences, Durham, NC, USA
- Brian S Mautz
  - Population Analytics and Insights, Data Sciences, Janssen Research and Development, Spring House, PA, USA
- Jacob M Keaton
  - Medical Genomics and Metabolic Genetics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Jacklyn N Hellwege
  - Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN, USA
  - Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Eric S Torstenson
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
- Jingjing Liang
  - Department of Pharmacy Practice and Science, University of Arizona, Tucson, AZ, USA
- Michael J Bray
  - Department of Maternal and Fetal Medicine, Orlando Health, Orlando, FL, USA
  - Genetic Counseling Program, Bay Path University, Longmeadow, MA, USA
- Ayush Giri
  - Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN, USA
  - Division of Quantitative Sciences, Department of Obstetrics and Gynecology, Vanderbilt University Medical Center, Nashville, TN, USA
- Helen R Warren
  - Center of Clinical Pharmacology and Precision Medicine, Queen Mary University, London, England
- Patricia B Munroe
  - Center of Clinical Pharmacology and Precision Medicine, Queen Mary University, London, England
- Digna R Velez Edwards
  - Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN, USA
  - Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
  - Division of Quantitative Sciences, Department of Obstetrics and Gynecology, Vanderbilt University Medical Center, Nashville, TN, USA
- Xiaofeng Zhu
  - Department of Epidemiology and Biostatistics, Institute for Computational Biology, Case Western Reserve University, Cleveland, OH, USA
- Chun Li
  - Department of Population and Public Health Sciences, University of Southern California, Los Angeles, CA, USA
- Todd L Edwards
  - Division of Epidemiology, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
4
Zhang X, Bell JT. Detecting genetic effects on phenotype variability to capture gene-by-environment interactions: a systematic method comparison. G3 (Bethesda) 2024; 14:jkae022. [PMID: 38289865 PMCID: PMC10989912 DOI: 10.1093/g3journal/jkae022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2023] [Revised: 01/16/2024] [Accepted: 01/19/2024] [Indexed: 02/01/2024]
Abstract
Genetically associated phenotypic variability has been widely observed across organisms and traits, including in humans. Both gene-gene and gene-environment interactions can lead to an increase in genetically associated phenotypic variability. Therefore, detecting the underlying genetic variants, or variance Quantitative Trait Loci (vQTLs), can provide novel insights into complex traits. Established approaches to detect vQTLs apply different methodologies from variance-only approaches to mean-variance joint tests, but a comprehensive comparison of these methods is lacking. Here, we review available methods to detect vQTLs in humans, carry out a simulation study to assess their performance under different biological scenarios of gene-environment interactions, and apply the optimal approaches for vQTL identification to gene expression data. Overall, with a minor allele frequency (MAF) of less than 0.2, the squared residual value linear model (SVLM) and the deviation regression model (DRM) are optimal when the data follow normal and non-normal distributions, respectively. In addition, the Brown-Forsythe (BF) test is one of the optimal methods when the MAF is 0.2 or larger, irrespective of phenotype distribution. Additionally, a larger sample size and more balanced sample distribution in different exposure categories increase the power of BF, SVLM, and DRM. Our results highlight vQTL detection methods that perform optimally under realistic simulation settings and show that their relative performance depends on the phenotype distribution, allele frequency, sample size, and the type of exposure in the interaction model underlying the vQTL.
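For readers unfamiliar with the Brown-Forsythe test named above, the following is a minimal sketch of its test statistic, assuming the trait values have already been split into genotype groups. It is not the code used in the study; the function name and the toy data are hypothetical. The statistic is a one-way ANOVA F statistic computed on absolute deviations from each group's median.

```python
from statistics import mean, median


def brown_forsythe_statistic(groups):
    """Brown-Forsythe F statistic for equality of variances across groups.

    groups: list of lists of trait values, one inner list per genotype group.
    """
    # Step 1: replace each value by its absolute deviation from the group median.
    deviations = []
    for g in groups:
        centre = median(g)
        deviations.append([abs(y - centre) for y in g])

    k = len(deviations)                    # number of groups
    n = sum(len(z) for z in deviations)    # total sample size
    grand_mean = sum(sum(z) for z in deviations) / n

    # Step 2: one-way ANOVA F statistic on the deviations.
    between = sum(len(z) * (mean(z) - grand_mean) ** 2 for z in deviations) / (k - 1)
    within = sum((v - mean(z)) ** 2 for z in deviations for v in z) / (n - k)
    return between / within


if __name__ == "__main__":
    # Toy example: two genotype groups with the same centre but different spread.
    print(brown_forsythe_statistic([[1, 2, 3], [0, 2, 4]]))
```

Under the null hypothesis of equal variances, the statistic is referred to an F distribution with (k - 1, n - k) degrees of freedom to obtain a p-value; that step is omitted here to keep the sketch dependency-free.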
Affiliation(s)
- Xiaopu Zhang
  - Department of Twin Research and Genetic Epidemiology, King's College London, St Thomas’ Hospital, Westminster Bridge Road, London SE1 7EH, UK
- Jordana T Bell
  - Department of Twin Research and Genetic Epidemiology, King's College London, St Thomas’ Hospital, Westminster Bridge Road, London SE1 7EH, UK
5
Han L, Shen B, Wu X, Zhang J, Wen YJ. Compressed variance component mixed model reveals epistasis associated with flowering in Arabidopsis. Front Plant Sci 2024; 14:1283642. [PMID: 38259933 PMCID: PMC10800901 DOI: 10.3389/fpls.2023.1283642] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2023] [Accepted: 12/15/2023] [Indexed: 01/24/2024]
Abstract
Introduction: Epistasis is currently a topic of great interest in molecular and quantitative genetics. Arabidopsis thaliana, as a model organism, plays a crucial role in studying the fundamental biology of diverse plant species. However, there have been limited reports about identification of epistasis related to flowering in genome-wide association studies (GWAS). Therefore, it is of utmost importance to conduct epistasis analysis in Arabidopsis.
Method: In this study, we employed Levene's test and a compressed variance component mixed model in GWAS to detect quantitative trait nucleotides (QTNs) and QTN-by-QTN interactions (QQIs) for 11 flowering-related traits of 199 Arabidopsis accessions with 216,130 markers.
Results: Our analysis detected 89 QTNs and 130 pairs of QQIs. Around these loci, 34 known genes previously reported in Arabidopsis were confirmed to be associated with flowering-related traits, such as SPA4, which is involved in regulating photoperiodic flowering and interacts with PAP1 and PAP2, affecting growth of Arabidopsis under light conditions. Then, among the previously unreported genes, we observed significant and differential expression of 35 genes in response to variations in temperature, photoperiod, and vernalization treatments. Functional enrichment analysis revealed that 26 of these genes were associated with various biological processes. Finally, the haplotype and phenotypic difference analysis revealed 20 candidate genes exhibiting significant phenotypic variations across gene haplotypes, of which the candidate genes AT1G12990 and AT1G09950 around QQIs might have an interaction effect on flowering time regulation in Arabidopsis.
Discussion: These findings may offer valuable insights for the identification and exploration of genes and gene-by-gene interactions associated with flowering-related traits in Arabidopsis, and may even provide a valuable reference and guidance for the research of epistasis in other species.
Affiliation(s)
- Le Han
  - College of Science, Nanjing Agricultural University, Nanjing, China
- Bolin Shen
  - College of Science, Nanjing Agricultural University, Nanjing, China
- Xinyi Wu
  - College of Science, Nanjing Agricultural University, Nanjing, China
- Jin Zhang
  - College of Science, Nanjing Agricultural University, Nanjing, China
  - State Key Laboratory of Crop Genetics and Germplasm Enhancement and Utilization, Nanjing Agricultural University, Nanjing, China
- Yang-Jun Wen
  - College of Science, Nanjing Agricultural University, Nanjing, China
  - State Key Laboratory of Crop Genetics and Germplasm Enhancement and Utilization, Nanjing Agricultural University, Nanjing, China
6
Garrido-Martín D, Calvo M, Reverter F, Guigó R. A fast non-parametric test of association for multiple traits. Genome Biol 2023; 24:230. [PMID: 37828616 PMCID: PMC10571397 DOI: 10.1186/s13059-023-03076-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2022] [Accepted: 09/27/2023] [Indexed: 10/14/2023]
Abstract
The increasing availability of multidimensional phenotypic data in large cohorts of genotyped individuals requires efficient methods to identify genetic effects on multiple traits. Permutational multivariate analysis of variance (PERMANOVA) offers a powerful non-parametric approach. However, it relies on permutations to assess significance, which hinders the analysis of large datasets. Here, we derive the limiting null distribution of the PERMANOVA test statistic, providing a framework for the fast computation of asymptotic p values. Our asymptotic test presents controlled type I error and high power, often outperforming parametric approaches. We illustrate its applicability in the context of QTL mapping and GWAS.
Affiliation(s)
- Diego Garrido-Martín
  - Department of Genetics, Microbiology and Statistics, Universitat de Barcelona (UB), Av. Diagonal 643, Barcelona, 08028, Spain
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona, 08003, Catalonia, Spain
- Miquel Calvo
  - Department of Genetics, Microbiology and Statistics, Universitat de Barcelona (UB), Av. Diagonal 643, Barcelona, 08028, Spain
- Ferran Reverter
  - Department of Genetics, Microbiology and Statistics, Universitat de Barcelona (UB), Av. Diagonal 643, Barcelona, 08028, Spain
- Roderic Guigó
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona, 08003, Catalonia, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Catalonia, Spain
7
Wang C, Wang T, Wei Y, Aschard H, Ionita-Laza I. Quantile Regression for biomarkers in the UK Biobank. bioRxiv 2023:2023.06.05.543699. [PMID: 37333162 PMCID: PMC10274625 DOI: 10.1101/2023.06.05.543699] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/20/2023]
Abstract
Genome-wide association studies (GWAS) for biomarkers important for clinical phenotypes can lead to clinically relevant discoveries. GWAS for quantitative traits are based on simplified regression models modeling the conditional mean of a phenotype as a linear function of genotype. An alternative and easy to apply approach is quantile regression that naturally extends linear regression to the analysis of the entire conditional distribution of a phenotype of interest by modeling conditional quantiles within a regression framework. Quantile regression can be applied efficiently at biobank scale using standard statistical packages in much the same way as linear regression, while having some unique advantages such as identifying variants with heterogeneous effects across different quantiles, including non-additive effects and variants involved in gene-environment interactions; accommodating a wide range of phenotype distributions with invariance to trait transformation; and overall providing more detailed information about the underlying genotype-phenotype associations. Here, we demonstrate the value of quantile regression in the context of GWAS by applying it to 39 quantitative traits in the UK Biobank (n > 300,000 individuals). Across these 39 traits we identify 7,297 significant loci, including 259 loci only detected by quantile regression. We show that quantile regression can help uncover replicable but unmodelled gene-environment interactions, and can provide additional key insights into poorly understood genotype-phenotype correlations for clinically relevant biomarkers at minimal additional cost.
Affiliation(s)
- Chen Wang
  - Department of Biostatistics, Columbia University, New York, USA
- Tianying Wang
  - Center for Statistical Science & Department of Industrial Engineering, Tsinghua University, Beijing, China
- Ying Wei
  - Department of Biostatistics, Columbia University, New York, USA
- Hugues Aschard
  - Institut Pasteur, Université Paris Cité, Department of Computational Biology, Paris, France
8
Shi G. Genome-wide variance quantitative trait locus analysis suggests small interaction effects in blood pressure traits. Sci Rep 2022; 12:12649. [PMID: 35879408 PMCID: PMC9314370 DOI: 10.1038/s41598-022-16908-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2022] [Accepted: 07/18/2022] [Indexed: 11/09/2022]
Abstract
Genome-wide variance quantitative trait loci (vQTL) analysis complements genome-wide association study (GWAS) and has the potential to identify novel variants associated with the trait, explain additional trait variance and lead to the identification of factors that modulate the genetic effects. I conducted genome-wide analysis of the UK Biobank data and identified 27 vQTLs associated with systolic blood pressure (SBP), diastolic blood pressure (DBP) and pulse pressure (PP). The top single-nucleotide polymorphisms (SNPs) are enriched for expression QTLs (eQTLs) or splicing QTLs (sQTLs) annotated by GTEx, suggesting their regulatory roles in mediating the associations with blood pressure (BP). Of the 27 vQTLs, 14 are known BP-associated QTLs discovered by GWASs. The heteroscedasticity effects of the 13 novel vQTLs are larger than their genetic main effects, which were not detected by existing GWASs. The total R-squared of the 27 top SNPs due to variance heteroscedasticity is 0.28%, compared with 0.50% owing to their main effects. The overall effect size of the variance heteroscedasticity is small in GWAS SNPs compared with their main effects. For the 411, 384 and 285 GWAS SNPs associated with SBP, DBP and PP, respectively, their heteroscedasticity effects were 0.52%, 0.43%, and 0.16%, and their main effects were 5.13%, 5.61%, and 3.75%, respectively. The number and effects of the vQTLs are small, which suggests that the effects of gene-environment and gene-gene interactions are small. The main effects of the SNPs remain the major source of genetic variance for BP, which would probably be true for other complex traits as well.
Affiliation(s)
- Gang Shi
  - School of Telecommunications Engineering, Xidian University, 2 South Taibai Road, Xi'an, 710071, Shaanxi, China
9
Murphy MD, Fernandes SB, Morota G, Lipka AE. Assessment of two statistical approaches for variance genome-wide association studies in plants. Heredity (Edinb) 2022; 129:93-102. [PMID: 35538221 PMCID: PMC9338250 DOI: 10.1038/s41437-022-00541-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/25/2021] [Revised: 04/28/2022] [Accepted: 04/29/2022] [Indexed: 11/09/2022]
Abstract
Genomic loci that control the variance of agronomically important traits are increasingly important due to the profusion of unpredictable environments arising from climate change. The ability to identify such variance-controlling loci in association studies will be critical for future breeding efforts. Two statistical approaches that have already been used in the variance genome-wide association study (vGWAS) paradigm are the Brown-Forsythe test (BFT) and the double generalized linear model (DGLM). To ensure that these approaches are deployed as effectively as possible, it is critical to study the factors that influence their ability to identify variance-controlling loci. We used genome-wide marker data in maize (Zea mays L.) and Arabidopsis thaliana to simulate traits controlled by epistasis, genotype by environment (GxE) interactions, and variance quantitative trait nucleotides (vQTNs). We then quantified true and false positive detection rates of the BFT and DGLM across all simulated traits. We also conducted a vGWAS using both the BFT and DGLM on plant height in a maize diversity panel. The observed true positive detection rates at the maximum sample size considered (N = 2815) suggest that both of these vGWAS approaches are capable of identifying epistasis and GxE for sufficiently large sample sizes. We also noted that the DGLM decisively outperformed the BFT for simulated traits controlled by vQTNs at sample sizes of N = 500. Although we conclude that there are still certain aspects of vGWAS approaches that need further refinement, this study suggests that the BFT and DGLM are capable of identifying variance-controlling loci in current state-of-the-art plant or agronomic data sets.
Affiliation(s)
- Matthew D Murphy
  - Department of Crop Sciences, University of Illinois Urbana-Champaign, 1102 S Goodwin Ave, Urbana, IL, 61801, USA
- Samuel B Fernandes
  - Department of Crop Sciences, University of Illinois Urbana-Champaign, 1102 S Goodwin Ave, Urbana, IL, 61801, USA
- Gota Morota
  - Department of Animal and Poultry Sciences, Virginia Polytechnic Institute and State University, 175 West Campus Drive, Blacksburg, VA, 24061, USA
- Alexander E Lipka
  - Department of Crop Sciences, University of Illinois Urbana-Champaign, 1102 S Goodwin Ave, Urbana, IL, 61801, USA
10
Hallou A, Yevick HG, Dumitrascu B, Uhlmann V. Deep learning for bioimage analysis in developmental biology. Development 2021; 148:dev199616. [PMID: 34490888 PMCID: PMC8451066 DOI: 10.1242/dev.199616] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Indexed: 12/16/2022]
Abstract
Deep learning has transformed the way large and complex image datasets can be processed, reshaping what is possible in bioimage analysis. As the complexity and size of bioimage data continues to grow, this new analysis paradigm is becoming increasingly ubiquitous. In this Review, we begin by introducing the concepts needed for beginners to understand deep learning. We then review how deep learning has impacted bioimage analysis and explore the open-source resources available to integrate it into a research project. Finally, we discuss the future of deep learning applied to cell and developmental biology. We analyze how state-of-the-art methodologies have the potential to transform our understanding of biological systems through new image-based analysis and modelling that integrate multimodal inputs in space and time.
Affiliation(s)
- Adrien Hallou
  - Cavendish Laboratory, Department of Physics, University of Cambridge, Cambridge, CB3 0HE, UK
  - Wellcome Trust/Cancer Research UK Gurdon Institute, University of Cambridge, Cambridge, CB2 1QN, UK
  - Wellcome Trust/Medical Research Council Stem Cell Institute, University of Cambridge, Cambridge, CB2 1QR, UK
- Hannah G. Yevick
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, 02142, USA
- Bianca Dumitrascu
  - Computer Laboratory, Cambridge, University of Cambridge, Cambridge, CB3 0FD, UK
- Virginie Uhlmann
  - European Bioinformatics Institute, European Molecular Biology Laboratory, Cambridge, CB10 1SD, UK
11
Braz CU, Rowan TN, Schnabel RD, Decker JE. Genome-wide association analyses identify genotype-by-environment interactions of growth traits in Simmental cattle. Sci Rep 2021; 11:13335. [PMID: 34172761 PMCID: PMC8233360 DOI: 10.1038/s41598-021-92455-x] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 09/14/2020] [Accepted: 06/07/2021] [Indexed: 02/06/2023]
Abstract
Understanding genotype-by-environment interactions (G × E) is crucial to understand environmental adaptation in mammals and improve the sustainability of agricultural production. Here, we present an extensive study investigating the interaction of genome-wide SNP markers with a vast assortment of environmental variables and searching for SNPs controlling phenotypic variance (vQTL) using a large beef cattle dataset. We showed that G × E contribute 10.1%, 3.8%, and 2.8% of the phenotypic variance of birth weight, weaning weight, and yearling weight, respectively. G × E genome-wide association analysis (GWAA) detected a large number of G × E loci affecting growth traits, which the traditional GWAA did not detect, showing that functional loci may have non-additive genetic effects regardless of differences in genotypic means. Further, variance-heterogeneity GWAA detected loci enriched with G × E effects without requiring prior knowledge of the interacting environmental factors. Functional annotation and pathway analysis of G × E genes revealed biological mechanisms by which cattle respond to changes in their environment, such as neurotransmitter activity, hypoxia-induced processes, keratinization, hormone, thermogenic and immune pathways. We unraveled the relevance and complexity of the genetic basis of G × E underlying growth traits, providing new insights into how different environmental conditions interact with specific genes influencing adaptation and productivity in beef cattle and potentially across mammals.
Affiliation(s)
- Camila U Braz
  - Division of Animal Sciences, University of Missouri, Columbia, MO, 65211, USA
- Troy N Rowan
  - Division of Animal Sciences, University of Missouri, Columbia, MO, 65211, USA
  - Genetics Area Program, University of Missouri, Columbia, MO, 65211, USA
- Robert D Schnabel
  - Division of Animal Sciences, University of Missouri, Columbia, MO, 65211, USA
  - Genetics Area Program, University of Missouri, Columbia, MO, 65211, USA
  - Informatics Institute, University of Missouri, Columbia, MO, 65211, USA
- Jared E Decker
  - Division of Animal Sciences, University of Missouri, Columbia, MO, 65211, USA
  - Genetics Area Program, University of Missouri, Columbia, MO, 65211, USA
  - Informatics Institute, University of Missouri, Columbia, MO, 65211, USA
12
Villicaña S, Bell JT. Genetic impacts on DNA methylation: research findings and future perspectives. Genome Biol 2021; 22:127. [PMID: 33931130 PMCID: PMC8086086 DOI: 10.1186/s13059-021-02347-6] [Citation(s) in RCA: 124] [Impact Index Per Article: 31.0] [Received: 10/12/2020] [Accepted: 04/09/2021] [Indexed: 12/17/2022]
Abstract
Multiple recent studies highlight that genetic variants can have strong impacts on a significant proportion of the human DNA methylome. Methylation quantitative trait loci, or meQTLs, allow for the exploration of biological mechanisms that underlie complex human phenotypes, with potential insights for human disease onset and progression. In this review, we summarize recent milestones in characterizing the human genetic basis of DNA methylation variation over the last decade, including heritability findings and genome-wide identification of meQTLs. We also discuss challenges in this field and future areas of research geared to generate insights into molecular processes underlying human complex traits.
Affiliation(s)
- Sergio Villicaña
  - Department of Twin Research and Genetic Epidemiology, St. Thomas’ Hospital, King’s College London, 3rd Floor, South Wing, Block D, London, SE1 7EH UK
- Jordana T. Bell
  - Department of Twin Research and Genetic Epidemiology, St. Thomas’ Hospital, King’s College London, 3rd Floor, South Wing, Block D, London, SE1 7EH UK
13
Soave D, Lawless JF, Awadalla P. Score tests for scale effects, with application to genomic analysis. Stat Med 2021; 40:3808-3822. [PMID: 33908071 DOI: 10.1002/sim.9000] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2020] [Revised: 04/01/2021] [Accepted: 04/07/2021] [Indexed: 11/07/2022]
Abstract
Tests for variance or scale effects due to covariates are used in many areas and recently, in genomic and genetic association studies. We study score tests based on location-scale models with arbitrary error distributions that allow incorporation of additional adjustment covariates. Tests based on Gaussian and Laplacian double generalized linear models are examined in some detail. Numerical properties of the tests under Gaussian and other error distributions are examined. Our results show that the use of model-based asymptotic distributions with score tests for scale effects does not control type 1 error well in many settings of practical relevance. We consider simple statistics based on permutation distribution approximations, which correspond to well-known statistics derived by another approach. They are shown to give good type 1 error control under different error distributions and under covariate distribution imbalance. The methods are illustrated through a differential gene expression analysis involving breast cancer tumor samples.
Affiliation(s)
- David Soave
  - Department of Mathematics, Wilfrid Laurier University, Waterloo, Ontario, Canada
  - Computational Biology Program, Ontario Institute for Cancer Research, Toronto, Ontario, Canada
- Jerald F Lawless
  - Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Ontario, Canada
- Philip Awadalla
  - Computational Biology Program, Ontario Institute for Cancer Research, Toronto, Ontario, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada

14
Marderstein AR, Davenport ER, Kulm S, Van Hout CV, Elemento O, Clark AG. Leveraging phenotypic variability to identify genetic interactions in human phenotypes. Am J Hum Genet 2021; 108:49-67. [PMID: 33326753 PMCID: PMC7820920 DOI: 10.1016/j.ajhg.2020.11.016] [Citation(s) in RCA: 42] [Impact Index Per Article: 10.5] [Received: 08/07/2020] [Accepted: 11/23/2020] [Indexed: 12/13/2022]
Abstract
Although thousands of loci have been associated with human phenotypes, the role of gene-environment (GxE) interactions in determining individual risk of human diseases remains unclear. This is partly because of the severe erosion of statistical power resulting from the massive number of statistical tests required to detect such interactions. Here, we focus on improving the power of GxE tests by developing a statistical framework for assessing quantitative trait loci (QTLs) associated with the trait means and/or trait variances. When applying this framework to body mass index (BMI), we find that GxE discovery and replication rates are significantly higher when prioritizing genetic variants associated with the variance of the phenotype (vQTLs) compared to when assessing all genetic variants. Moreover, we find that vQTLs are enriched for associations with other non-BMI phenotypes having strong environmental influences, such as diabetes or ulcerative colitis. We show that GxE effects first identified in quantitative traits such as BMI can be used for GxE discovery in disease phenotypes such as diabetes. A clear conclusion is that strong GxE interactions mediate the genetic contribution to body weight and diabetes risk.
Affiliation(s)
- Andrew R Marderstein
  - Tri-Institutional Program in Computational Biology & Medicine, Weill Cornell Medicine, New York, NY 10021, USA
  - Institute of Computational Biomedicine, Weill Cornell Medicine, New York, NY 10021, USA
  - Caryl and Israel Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY 10021, USA
  - Department of Computational Biology, Cornell University, Ithaca, NY 14850, USA
- Emily R Davenport
  - Department of Biology, Huck Institutes of the Life Sciences, Institute for Computational and Data Sciences, Pennsylvania State University, University Park, PA 16802, USA
- Scott Kulm
  - Institute of Computational Biomedicine, Weill Cornell Medicine, New York, NY 10021, USA
  - Caryl and Israel Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY 10021, USA
- Olivier Elemento
  - Institute of Computational Biomedicine, Weill Cornell Medicine, New York, NY 10021, USA
  - Caryl and Israel Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY 10021, USA
- Andrew G Clark
  - Department of Computational Biology, Cornell University, Ithaca, NY 14850, USA

15
Szoke A, Pignon B, Boster S, Jamain S, Schürhoff F. Schizophrenia: Developmental Variability Interacts with Risk Factors to Cause the Disorder: Nonspecific Variability-Enhancing Factors Combine with Specific Risk Factors to Cause Schizophrenia. Bioessays 2020; 42:e2000038. [PMID: 32864753 DOI: 10.1002/bies.202000038] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 02/27/2020] [Revised: 08/10/2020] [Indexed: 12/31/2022]
Abstract
A new etiological model is proposed for schizophrenia that combines variability-enhancing nonspecific factors acting during development with more specific risk factors. This model is better suited than the current etiological models of schizophrenia, based on the risk factors paradigm, for predicting and/or explaining several important findings about schizophrenia: high co-morbidity rates, low specificity of many risk factors, and persistence in the population of the associated genetic polymorphisms. Compared with similar models, e.g., de-canalization, common psychopathology factor, sexual-selection, or differential sensitivity to the environment, this proposal is more general and integrative. Recently developed research methods have proven the existence of genetic and environmental factors that enhance developmental variability. Applying such methods to newly collected or already available data can allow for testing the hypotheses upon which this model is built. If validated, this model may change the understanding of the etiology of schizophrenia, the research models, and prevention paradigms.
Affiliation(s)
- Andrei Szoke
  - INSERM, U955, Translational NeuroPsychiatry Lab, Créteil, 94000, France
  - AP-HP, DHU IMPACT, Pôle de Psychiatrie, Hôpitaux Universitaires Henri-Mondor, Créteil, 94000, France
  - Fondation FondaMental, Créteil, 94000, France
  - UPEC, Faculté de Médecine, Université Paris-Est Créteil, Créteil, 94000, France
- Baptiste Pignon
  - INSERM, U955, Translational NeuroPsychiatry Lab, Créteil, 94000, France
  - AP-HP, DHU IMPACT, Pôle de Psychiatrie, Hôpitaux Universitaires Henri-Mondor, Créteil, 94000, France
  - Fondation FondaMental, Créteil, 94000, France
  - UPEC, Faculté de Médecine, Université Paris-Est Créteil, Créteil, 94000, France
- Stéphane Jamain
  - INSERM, U955, Translational NeuroPsychiatry Lab, Créteil, 94000, France
  - UPEC, Faculté de Médecine, Université Paris-Est Créteil, Créteil, 94000, France
- Franck Schürhoff
  - INSERM, U955, Translational NeuroPsychiatry Lab, Créteil, 94000, France
  - AP-HP, DHU IMPACT, Pôle de Psychiatrie, Hôpitaux Universitaires Henri-Mondor, Créteil, 94000, France
  - Fondation FondaMental, Créteil, 94000, France
  - UPEC, Faculté de Médecine, Université Paris-Est Créteil, Créteil, 94000, France

16
Hussain W, Campbell MT, Jarquin D, Walia H, Morota G. Variance heterogeneity genome-wide mapping for cadmium in bread wheat reveals novel genomic loci and epistatic interactions. Plant Genome 2020; 13:e20011. [PMID: 33016629 DOI: 10.1002/tpg2.20011] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 07/15/2019] [Accepted: 01/22/2020] [Indexed: 06/11/2023]
Abstract
Genome-wide association mapping identifies quantitative trait loci (QTL) that influence the mean differences between the marker genotypes for a given trait. While most loci influence the mean value of a trait (mQTL), certain loci, known as variance heterogeneity QTL (vQTL), determine the variability of the trait instead of its mean. In the present study, we performed a variance heterogeneity genome-wide association study (vGWAS) for grain cadmium (Cd) concentration in bread wheat. We used a double generalized linear model and a hierarchical generalized linear model to identify vQTL associated with grain Cd. We identified novel vQTL regions on chromosomes 2A and 2B that contribute to the Cd variation and loci that affect both mean and variance heterogeneity (mvQTL) on chromosome 5A. In addition, our results demonstrated the presence of epistatic interactions between vQTL and mvQTL, which could explain variance heterogeneity. Overall, we provide novel insights into the genetic architecture of grain Cd concentration and report the first application of vGWAS in wheat. Moreover, our findings indicated that epistasis is an important mechanism underlying natural variation for grain Cd concentration.
Affiliation(s)
- Waseem Hussain
  - Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE, 68583, USA
- Malachy T Campbell
  - Department of Animal and Poultry Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA, 24061, USA
- Diego Jarquin
  - Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE, 68583, USA
- Harkamal Walia
  - Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE, 68583, USA
- Gota Morota
  - Department of Animal and Poultry Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA, 24061, USA

17
Corty RW, Valdar W. vqtl: An R Package for Mean-Variance QTL Mapping. G3 (Bethesda) 2018; 8:3757-3766. [PMID: 30389795 PMCID: PMC6288833 DOI: 10.1534/g3.118.200642] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 02/14/2018] [Accepted: 10/23/2018] [Indexed: 12/26/2022]
Abstract
We present vqtl, an R package for mean-variance QTL mapping. This QTL mapping approach tests for genetic loci that influence the mean of the phenotype, termed mean QTL, the variance of the phenotype, termed variance QTL, or some combination of the two, termed mean-variance QTL. It is unique in its ability to correct for variance heterogeneity arising not only from the QTL itself but also from nuisance factors, such as sex, batch, or housing. This package provides functions to conduct genome scans, run permutations to assess the statistical significance, and make informative plots to communicate results. Because it is inter-operable with the popular qtl package and uses many of the same data structures and input patterns, it will be straightforward for geneticists to analyze future experiments with vqtl as well as re-analyze past experiments, possibly discovering new QTL.
Affiliation(s)
- Robert W Corty
  - Department of Genetics
  - Bioinformatics and Computational Biology Curriculum
- William Valdar
  - Department of Genetics
  - Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC

18
Corty RW, Valdar W. QTL Mapping on a Background of Variance Heterogeneity. G3 (Bethesda) 2018; 8:3767-3782. [PMID: 30389794 DOI: 10.1101/276980] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 05/28/2023]
Abstract
Standard QTL mapping procedures seek to identify genetic loci affecting the phenotypic mean while assuming that all individuals have the same residual variance. But when the residual variance differs systematically between groups, perhaps due to a genetic or environmental factor, such standard procedures can falter: in testing for QTL associations, they attribute too much weight to observations that are noisy and too little to those that are precise, resulting in reduced power and increased susceptibility to false positives. The negative effects of such "background variance heterogeneity" (BVH) on standard QTL mapping have received little attention until now, although the subject is closely related to work on the detection of variance-controlling genes. Here we use simulation to examine how BVH affects power and false positive rate for detecting QTL affecting the mean (mQTL), the variance (vQTL), or both (mvQTL). We compare linear regression for mQTL and Levene's test for vQTL, with tests more recently developed, including tests based on the double generalized linear model (DGLM), which can model BVH explicitly. We show that, when used in conjunction with a suitable permutation procedure, the DGLM-based tests accurately control false positive rate and are more powerful than the other tests. We also find that some adverse effects of BVH can be mitigated by applying a rank inverse normal transform. We apply our novel approach, which we term "mean-variance QTL mapping", to publicly available data on a mouse backcross and, after accommodating BVH driven by sire, detect a new mQTL for bodyweight.
Affiliation(s)
- Robert W Corty
  - Department of Genetics
  - Bioinformatics and Computational Biology Curriculum
- William Valdar
  - Department of Genetics
  - Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC

19
Corty RW, Kumar V, Tarantino LM, Takahashi JS, Valdar W. Mean-Variance QTL Mapping Identifies Novel QTL for Circadian Activity and Exploratory Behavior in Mice. G3 (Bethesda) 2018; 8:3783-3790. [PMID: 30389793 PMCID: PMC6288835 DOI: 10.1534/g3.118.200194] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 02/27/2018] [Accepted: 10/11/2018] [Indexed: 12/11/2022]
Abstract
We illustrate, through two case studies, that "mean-variance QTL mapping" (QTL mapping that models effects on the mean and the variance simultaneously) can discover QTL that traditional interval mapping cannot. Mean-variance QTL mapping is based on the double generalized linear model, which extends the standard linear model used in interval mapping by incorporating not only a set of genetic and covariate effects for the mean but also a set of such effects for the residual variance. Its potential for use in QTL mapping has been described previously, but it remains underutilized, with certain key advantages undemonstrated until now. In the first case study, a reduced complexity intercross of C57BL/6J and C57BL/6N mice examining circadian behavior, our reanalysis detected a mean-controlling QTL for circadian wheel running activity that interval mapping did not; mean-variance QTL mapping was more powerful than interval mapping at the QTL because it accounted for the fact that mice homozygous for the C57BL/6N allele had less residual variance than other mice. In the second case study, an intercross between C57BL/6J and C58/J mice examining anxiety-like behaviors, our reanalysis detected a variance-controlling QTL for rearing behavior; interval mapping did not identify this QTL because it does not target variance QTL. We believe that the results of these reanalyses, which in other respects largely replicated the original findings, support the use of mean-variance QTL mapping as standard practice.
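For orientation, the double generalized linear model described in this abstract is commonly written, for Gaussian errors, roughly as follows. The symbols used here (y_i for the phenotype, x_i and z_i for the mean and variance covariate vectors, beta and gamma for their effects) and the log link on the variance are notation assumed for this sketch, not taken from the abstract.

```latex
\begin{aligned}
y_i &\sim \mathcal{N}\!\left(\mu_i,\ \sigma_i^2\right), \\
\mu_i &= \mathbf{x}_i^{\top}\boldsymbol{\beta}, \\
\log \sigma_i^2 &= \mathbf{z}_i^{\top}\boldsymbol{\gamma}.
\end{aligned}
```

Under this notation, a mean QTL corresponds to a nonzero genetic entry in beta and a variance QTL to a nonzero genetic entry in gamma; restricting gamma to an intercept recovers the standard linear model with a single residual variance.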
Affiliation(s)
- Robert W Corty
  - Department of Genetics
  - Bioinformatics and Computational Biology Curriculum
- Joseph S Takahashi
  - Howard Hughes Medical Institute, Department of Neuroscience, University of Texas Southwestern Medical Center, Dallas, TX 75390
- William Valdar
  - Department of Genetics
  - Bioinformatics and Computational Biology Curriculum
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599

20
Corty RW, Valdar W. QTL Mapping on a Background of Variance Heterogeneity. G3 (Bethesda) 2018; 8:3767-3782. [PMID: 30389794 PMCID: PMC6288843 DOI: 10.1534/g3.118.200790] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 02/27/2018] [Accepted: 10/28/2018] [Indexed: 12/21/2022]
Abstract
Standard QTL mapping procedures seek to identify genetic loci affecting the phenotypic mean while assuming that all individuals have the same residual variance. But when the residual variance differs systematically between groups, perhaps due to a genetic or environmental factor, such standard procedures can falter: in testing for QTL associations, they attribute too much weight to observations that are noisy and too little to those that are precise, resulting in reduced power and increased susceptibility to false positives. The negative effects of such "background variance heterogeneity" (BVH) on standard QTL mapping have received little attention until now, although the subject is closely related to work on the detection of variance-controlling genes. Here we use simulation to examine how BVH affects power and false positive rate for detecting QTL affecting the mean (mQTL), the variance (vQTL), or both (mvQTL). We compare linear regression for mQTL and Levene's test for vQTL, with tests more recently developed, including tests based on the double generalized linear model (DGLM), which can model BVH explicitly. We show that, when used in conjunction with a suitable permutation procedure, the DGLM-based tests accurately control false positive rate and are more powerful than the other tests. We also find that some adverse effects of BVH can be mitigated by applying a rank inverse normal transform. We apply our novel approach, which we term "mean-variance QTL mapping", to publicly available data on a mouse backcross and, after accommodating BVH driven by sire, detect a new mQTL for bodyweight.
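The baseline tools named in this abstract (linear regression for a mean effect, Levene's test for a variance effect, and a rank inverse normal transform) can be sketched on simulated data as below. This is only an illustration, not the authors' DGLM-based procedure or their simulation design: the two-genotype setup, the sample size, the standard deviations, the Blom offset of 3/8, and the helper names `rank_inverse_normal` and `simulate` are all assumptions made for the example.

```python
import numpy as np
from scipy import stats


def rank_inverse_normal(y, c=3.0 / 8.0):
    """Rank-based inverse normal transform.

    The offset c = 3/8 (Blom) is an assumption; other offsets are in use.
    """
    y = np.asarray(y, dtype=float)
    ranks = stats.rankdata(y)  # ties receive their average rank
    return stats.norm.ppf((ranks - c) / (len(y) - 2.0 * c + 1.0))


def simulate(n_per_group=200, seed=0):
    """Hypothetical data: two genotype groups, equal means, unequal variances."""
    rng = np.random.default_rng(seed)
    genotype = np.repeat([0, 1], n_per_group)
    sd = np.where(genotype == 1, 2.0, 1.0)  # genotype 1 is twice as noisy
    phenotype = rng.normal(loc=0.0, scale=sd)
    return genotype, phenotype


genotype, phenotype = simulate()
groups = [phenotype[genotype == g] for g in (0, 1)]

# Mean effect: ordinary linear regression of phenotype on genotype.
mean_test = stats.linregress(genotype, phenotype)

# Variance effect: Levene's test centred on the median (Brown-Forsythe variant).
var_test = stats.levene(*groups, center="median")

# Rank inverse normal transform of the phenotype.
transformed = rank_inverse_normal(phenotype)

print(f"mean-effect p-value:     {mean_test.pvalue:.3g}")
print(f"variance-effect p-value: {var_test.pvalue:.3g}")
```

Because the simulated groups differ only in spread, the variance test is the one expected to respond; the transform changes the marginal distribution of the phenotype while preserving its rank order.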
Affiliation(s)
- Robert W Corty
  - Department of Genetics
  - Bioinformatics and Computational Biology Curriculum
- William Valdar
  - Department of Genetics
  - Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC