1
|
Detecting genetic effects on phenotype variability to capture gene-by-environment interactions: a systematic method comparison. G3 (BETHESDA, MD.) 2024; 14:jkae022. [PMID: 38289865 PMCID: PMC10989912 DOI: 10.1093/g3journal/jkae022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2023] [Revised: 01/16/2024] [Accepted: 01/19/2024] [Indexed: 02/01/2024]
Abstract
Genetically associated phenotypic variability has been widely observed across organisms and traits, including in humans. Both gene-gene and gene-environment interactions can lead to an increase in genetically associated phenotypic variability. Therefore, detecting the underlying genetic variants, or variance Quantitative Trait Loci (vQTLs), can provide novel insights into complex traits. Established approaches to detect vQTLs apply different methodologies from variance-only approaches to mean-variance joint tests, but a comprehensive comparison of these methods is lacking. Here, we review available methods to detect vQTLs in humans, carry out a simulation study to assess their performance under different biological scenarios of gene-environment interactions, and apply the optimal approaches for vQTL identification to gene expression data. Overall, with a minor allele frequency (MAF) of less than 0.2, the squared residual value linear model (SVLM) and the deviation regression model (DRM) are optimal when the data follow normal and non-normal distributions, respectively. In addition, the Brown-Forsythe (BF) test is one of the optimal methods when the MAF is 0.2 or larger, irrespective of phenotype distribution. Additionally, a larger sample size and more balanced sample distribution in different exposure categories increase the power of BF, SVLM, and DRM. Our results highlight vQTL detection methods that perform optimally under realistic simulation settings and show that their relative performance depends on the phenotype distribution, allele frequency, sample size, and the type of exposure in the interaction model underlying the vQTL.
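As a rough illustration of the Brown-Forsythe (BF) test named in the abstract above, the sketch below computes the BF F statistic for phenotype values grouped by genotype. It is a minimal standard-library version, not the implementation used in the paper; the function name `brown_forsythe_f` and the toy data are assumptions made for illustration. A p-value would additionally require the F distribution with (k - 1, n - k) degrees of freedom, which is omitted here.

```python
from statistics import mean, median


def brown_forsythe_f(groups):
    """Brown-Forsythe F statistic for a list of groups (each a list of floats).

    Hypothetical helper for illustration: each group holds the phenotype
    values of individuals sharing one genotype (e.g. 0, 1 or 2 minor alleles).
    """
    # Step 1: absolute deviation of every value from its own group median.
    z = [[abs(y - median(g)) for y in g] for g in groups]
    k = len(z)                      # number of genotype groups
    n = sum(len(g) for g in z)      # total sample size
    grand_mean = sum(sum(g) for g in z) / n
    group_means = [mean(g) for g in z]
    # Step 2: one-way ANOVA on the deviations.
    between = sum(len(g) * (m - grand_mean) ** 2
                  for g, m in zip(z, group_means)) / (k - 1)
    within = sum((v - m) ** 2
                 for g, m in zip(z, group_means) for v in g) / (n - k)
    return between / within


# Toy data (assumed): same median, different spread between two genotype groups.
narrow = [9.0, 10.0, 11.0]
wide = [5.0, 10.0, 15.0]
f_stat = brown_forsythe_f([narrow, wide])
```

A larger F statistic indicates that the spread around the group medians differs more between genotype groups, which is the signal a variance-only vQTL test looks for.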
|
2
|
A longitudinal genome-wide association study of bone mineral density mean and variability in the UK Biobank. Osteoporos Int 2023; 34:1907-1916. [PMID: 37500982 DOI: 10.1007/s00198-023-06852-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2023] [Accepted: 07/06/2023] [Indexed: 07/29/2023]
Abstract
Bone mineral density (BMD) is an essential predictor of osteoporosis and fracture. We conducted a genome-wide trajectory analysis of BMD and analyzed the BMD change. PURPOSE This study aimed to identify the genetic architecture and potential biomarkers of BMD. METHODS Our analysis included 141,261 white participants from the UK Biobank with heel BMD phenotype data. We used a genome-wide trajectory analysis tool, TrajGWAS, to conduct a genome-wide association study (GWAS) of BMD. Then, we validated our findings in previously reported BMD genetic associations and performed replication analysis in the Asian participants. Finally, gene-set enrichment analysis (GSEA) of the identified candidate genes was conducted using the FUMA platform. RESULTS A total of 52 genes associated with BMD trajectory mean were identified, of which the top three significant genes were WNT16 (P = 1.31 × 10⁻¹²⁶), FAM3C (P = 4.18 × 10⁻¹⁰⁸), and CPED1 (P = 8.48 × 10⁻¹⁰⁶). In addition, 114 genes associated with BMD within-subject variability were also identified, such as AC092079.1 (P = 2.72 × 10⁻¹³) and RGS7 (P = 4.72 × 10⁻¹⁰). The associations for these candidate genes were confirmed in the previous GWASs and replicated successfully in the Asian participants. GSEA results of BMD change identified multiple GO terms related to skeletal development, such as SKELETAL SYSTEM DEVELOPMENT (adjusted P = 2.45 × 10⁻³) and REGULATION OF OSSIFICATION (adjusted P = 2.45 × 10⁻³). KEGG enrichment analysis showed that these genes were mainly enriched in WNT SIGNALING PATHWAY. CONCLUSIONS Our findings indicated that the CPED1-WNT16-FAM3C locus plays a significant role in BMD mean trajectories and identified several novel candidate genes contributing to BMD within-subject variability, facilitating the understanding of the genetic architecture of BMD.
|
3
|
Complex effects of sequence variants on lipid levels and coronary artery disease. Cell 2023; 186:4085-4099.e15. [PMID: 37714134 DOI: 10.1016/j.cell.2023.08.012] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/20/2023] [Revised: 05/06/2023] [Accepted: 08/10/2023] [Indexed: 09/17/2023]
Abstract
Many sequence variants have additive effects on blood lipid levels and, through that, on the risk of coronary artery disease (CAD). We show that variants also have non-additive effects and interact to affect lipid levels as well as affecting variance and correlations. Variance and correlation effects are often signatures of epistasis or gene-environmental interactions. These complex effects can translate into CAD risk. For example, Trp154Ter in FUT2 protects against CAD among subjects with the A1 blood group, whereas it associates with greater risk of CAD in others. His48Arg in ADH1B interacts with alcohol consumption to affect lipid levels and CAD. The effect of variants in TM6SF2 on blood lipids is greatest among those who never eat oily fish but absent from those who often do. This work demonstrates that variants that affect variance of quantitative traits can allow for the discovery of epistasis and interactions of variants with the environment.
|
4
|
Risk of type 2 diabetes and KCNJ11 gene polymorphisms: a nested case-control study and meta-analysis. Sci Rep 2022; 12:20709. [PMID: 36456687 PMCID: PMC9715540 DOI: 10.1038/s41598-022-24931-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2022] [Accepted: 11/22/2022] [Indexed: 12/05/2022] Open
Abstract
Due to the central role in insulin secretion, the potassium inwardly-rectifying channel subfamily J member 11 (KCNJ11) gene is one of the essential genes for type 2 diabetes (T2D) predisposition. However, the relevance of this gene to T2D development is not consistent among diverse populations. In the current study, we aim to capture the possible association of common KCNJ11 variants across Iranian adults, followed by a meta-analysis. We found that the tested variants of KCNJ11 have not contributed to T2D incidence in Iranian adults, consistent with similar insulin secretion levels among individuals with different genotypes. The integration of our results with 72 eligible published case-control studies (41,372 cases and 47,570 controls) as a meta-analysis demonstrated rs5219 and rs5215 are significantly associated with the increased T2D susceptibility under different genetic models. Nevertheless, the stratified analysis according to ethnicity showed rs5219 is involved in the T2D risk among disparate populations, including American, East Asian, European, and Greater Middle Eastern, but not South Asian. Additionally, the meta-regression analysis demonstrated that the sample size of both case and control groups was significantly associated with the magnitude of pooled genetic effect size. The present study can expand our knowledge about the KCNJ11 common variant's contributions to T2D incidence, which is valuable for designing SNP-based panels for potential clinical applications in precision medicine. It also highlights the importance of similar sample sizes for avoiding high heterogeneity and conducting a more precise meta-analysis.
|
5
|
WiSER: Robust and scalable estimation and inference of within-subject variances from intensive longitudinal data. Biometrics 2022; 78:1313-1327. [PMID: 34142722 PMCID: PMC8683571 DOI: 10.1111/biom.13506] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/14/2020] [Revised: 04/12/2021] [Accepted: 05/19/2021] [Indexed: 12/30/2022]
Abstract
The availability of vast amounts of longitudinal data from electronic health records (EHRs) and personal wearable devices opens the door to numerous new research questions. In many studies, individual variability of a longitudinal outcome is as important as the mean. Blood pressure fluctuations, glycemic variations, and mood swings are prime examples where it is critical to identify factors that affect the within-individual variability. We propose a scalable method, within-subject variance estimator by robust regression (WiSER), for the estimation and inference of the effects of both time-varying and time-invariant predictors on within-subject variance. It is robust against the misspecification of the conditional distribution of responses or the distribution of random effects. It shows performance similar to that of the correctly specified likelihood methods but is 10³ to 10⁵ times faster. The estimation algorithm scales linearly in the total number of observations, making it applicable to massive longitudinal data sets. The effectiveness of WiSER is evaluated in extensive simulation studies. Its broad applicability is illustrated using the accelerometry data from the Women's Health Study and a clinical trial for longitudinal diabetes care.
|
6
|
A quantile integral linear model to quantify genetic effects on phenotypic variability. Proc Natl Acad Sci U S A 2022; 119:e2212959119. [PMID: 36122202 PMCID: PMC9522331 DOI: 10.1073/pnas.2212959119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022] Open
Abstract
Detecting genetic variants associated with the variance of complex traits, that is, variance quantitative trait loci (vQTLs), can provide crucial insights into the interplay between genes and environments and how they jointly shape human phenotypes in the population. We propose a quantile integral linear model (QUAIL) to estimate genetic effects on trait variability. Through extensive simulations and analyses of real data, we demonstrate that QUAIL provides computationally efficient and statistically powerful vQTL mapping that is robust to non-Gaussian phenotypes and confounding effects on phenotypic variability. Applied to UK Biobank (n = 375,791), QUAIL identified 11 vQTLs for body mass index (BMI) that have not been previously reported. Top vQTL findings showed substantial enrichment for interactions with physical activities and sedentary behavior. Furthermore, variance polygenic scores (vPGSs) based on QUAIL effect estimates showed superior predictive performance on both population-level and within-individual BMI variability compared to existing approaches. Overall, QUAIL is a unified framework to quantify genetic effects on the phenotypic variability at both single-variant and vPGS levels. It addresses critical limitations in existing approaches and may have broad applications in future gene-environment interaction studies.
|
7
|
GWAS of longitudinal trajectories at biobank scale. Am J Hum Genet 2022; 109:433-445. [PMID: 35196515 PMCID: PMC8948167 DOI: 10.1016/j.ajhg.2022.01.018] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 08/10/2021] [Accepted: 01/25/2022] [Indexed: 12/12/2022] Open
Abstract
Biobanks linked to massive, longitudinal electronic health record (EHR) data make numerous new genetic research questions feasible. One among these is the study of biomarker trajectories. For example, high blood pressure measurements over visits strongly predict stroke onset, and consistently high fasting glucose and HbA1c levels define diabetes. Recent research reveals that not only the mean level of biomarker trajectories but also their fluctuations, or within-subject (WS) variability, are risk factors for many diseases. Glycemic variation, for instance, is now considered an important clinical metric in diabetes management. It is crucial to identify the genetic factors that shift the mean or alter the WS variability of a biomarker trajectory. Compared to traditional cross-sectional studies, trajectory analysis utilizes more data points and captures a complete picture of the impact of time-varying factors, including medication history and lifestyle. Currently, there are no efficient tools for genome-wide association studies (GWASs) of biomarker trajectories at the biobank scale, even for just mean effects. We propose TrajGWAS, a linear mixed effect model-based method for testing genetic effects that shift the mean or alter the WS variability of a biomarker trajectory. It is scalable to biobank data with 100,000 to 1,000,000 individuals and many longitudinal measurements and robust to distributional assumptions. Simulation studies corroborate that TrajGWAS controls the type I error rate and is powerful. Analysis of eleven biomarkers measured longitudinally and extracted from UK Biobank primary care data for more than 150,000 participants with 1,800,000 observations reveals loci that significantly alter the mean or WS variability.
|
8
|
Role of monogenic diabetes genes on beta cell function in Italian patients with newly diagnosed type 2 diabetes. The Verona Newly Diagnosed Type 2 Diabetes Study (VNDS) 13. DIABETES & METABOLISM 2022; 48:101323. [DOI: 10.1016/j.diabet.2022.101323] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2021] [Revised: 10/27/2021] [Accepted: 11/25/2021] [Indexed: 10/19/2022]
|
9
|
Investigation of the Causal Association between Long-Chain n-6 Polyunsaturated Fatty Acid Synthesis and the Risk of Type 2 Diabetes: A Mendelian Randomization Analysis. Lifestyle Genom 2020; 13:146-153. [DOI: 10.1159/000509663] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 02/06/2020] [Accepted: 06/18/2020] [Indexed: 11/19/2022] Open
|
10
|
Heritability and genome-wide association analyses of fasting plasma glucose in Chinese adult twins. BMC Genomics 2020; 21:491. [PMID: 32682390 PMCID: PMC7368793 DOI: 10.1186/s12864-020-06898-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 10/15/2019] [Accepted: 07/09/2020] [Indexed: 02/06/2023] Open
Abstract
Background Currently, diabetes has become one of the leading causes of death worldwide. Fasting plasma glucose (FPG) levels that are higher than optimal, even if below the diagnostic threshold of diabetes, can also lead to increased morbidity and mortality. Here we intend to study the magnitude of the genetic influence on FPG variation by conducting structural equation modelling analysis and to further identify specific genetic variants potentially related to FPG levels by performing a genome-wide association study (GWAS) in Chinese twins. Results The final sample included 382 twin pairs: 139 dizygotic (DZ) pairs and 243 monozygotic (MZ) pairs. The DZ twin correlation for the FPG level (rDZ = 0.20, 95% CI: 0.04–0.36) was much lower than half that of the MZ twin correlation (rMZ = 0.68, 95% CI: 0.62–0.74). For the variation in FPG level, the AE model was the better fitting model, with additive genetic parameters (A) accounting for 67.66% (95% CI: 60.50–73.62%) and unique environmental or residual parameters (E) accounting for 32.34% (95% CI: 26.38–39.55%), respectively. In the GWAS, although no genetic variants reached the genome-wide significance level (P < 5 × 10⁻⁸), 28 SNPs exceeded the level of a suggestive association (P < 1 × 10⁻⁵). One promising genetic region (2q33.1) around rs10931893 (P = 1.53 × 10⁻⁷) was found. After imputing untyped SNPs, we found that rs60106404 (P = 2.38 × 10⁻⁸) located at SPATS2L reached the genome-wide significance level, and 216 SNPs exceeded the level of a suggestive association. We found 1007 genes nominally associated with the FPG level (P < 0.05), including SPATS2L, KCNK5, ADCY5, PCSK1, PTPRA, and SLC26A11. Moreover, C1orf74 (P = 0.014) and SLC26A11 (P = 0.021) were differentially expressed between patients with impaired fasting glucose and healthy controls. Some important enriched biological pathways, such as β-alanine metabolism, regulation of insulin secretion, glucagon signaling in metabolic regulation, IL-1 receptor pathway, signaling by platelet derived growth factor, cysteine and methionine metabolism pathway, were identified. Conclusions The FPG level is highly heritable in the Chinese population, and genetic variants are significantly involved in regulatory domains, functional genes and biological pathways that mediate FPG levels. This study provides important clues for further elucidating the molecular mechanism of glucose homeostasis and discovering new diagnostic biomarkers and therapeutic targets for diabetes.
|
11
|
Abstract
PURPOSE OF REVIEW We review recent evidence of the relationship between dietary fat intake and risk of type 2 diabetes (T2D), the role of epigenetic alterations as a mediator of this relationship, and the impact of gene-dietary fat interactions in the development of the disease. Based on the observations made, we will discuss whether there is evidence to support genetic personalization of fat intake recommendations in T2D prevention. RECENT FINDINGS Strong evidence suggests that polyunsaturated fatty acids (PUFA) have a protective effect on T2D risk, whereas the roles of saturated and monounsaturated fatty acids (SFA and MUFA) remain unclear. Diets enriched with PUFA vs SFA lead to distinct epigenetic alterations that may mediate their effects on T2D risk by changing gene function. However, it is not currently known which of the epigenetic alterations, if any, are causal for T2D. The current literature shows no replicated evidence of genetic variants modifying the effect of dietary fat intake on T2D risk. There is consistent evidence of a protective role of PUFA in T2D prevention. No evidence supports genetic personalization of dietary recommendations in T2D prevention.
|
12
|
Genome-wide Association Study of Change in Fasting Glucose over time in 13,807 non-diabetic European Ancestry Individuals. Sci Rep 2019; 9:9439. [PMID: 31263163 PMCID: PMC6602949 DOI: 10.1038/s41598-019-45823-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 10/10/2018] [Accepted: 05/29/2019] [Indexed: 01/13/2023] Open
Abstract
Type 2 diabetes (T2D) affects the health of millions of people worldwide. The identification of genetic determinants associated with changes in glycemia over time might illuminate biological features that precede the development of T2D. Here we conducted a genome-wide association study of longitudinal fasting glucose changes in up to 13,807 non-diabetic individuals of European descent from nine cohorts. Fasting glucose change over time was defined as the slope of the line defined by multiple fasting glucose measurements obtained over up to 14 years of observation. We tested for associations of genetic variants with inverse-normal transformed fasting glucose change over time adjusting for age at baseline, sex, and principal components of genetic variation. We found no genome-wide significant association (P < 5 × 10⁻⁸) with fasting glucose change over time. Seven loci previously associated with T2D, fasting glucose or HbA1c were nominally (P < 0.05) associated with fasting glucose change over time. Limited power influences unambiguous interpretation, but these data suggest that genetic effects on fasting glucose change over time are likely to be small. A public version of the data provides a genomic resource to combine with future studies to evaluate shared genetic links with T2D and other metabolic risk traits.
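The phenotype in the abstract above is a per-individual slope through repeated fasting glucose measurements. The sketch below shows one way such a slope could be computed, assuming an ordinary least-squares fit; the abstract does not state the fitting method, and the function name `ols_slope` and the example measurements are hypothetical.

```python
def ols_slope(times, values):
    """Least-squares slope of values regressed on times.

    Assumption for illustration: one call per individual, with `times` in
    years since baseline and `values` the fasting glucose readings.
    """
    n = len(times)
    t_bar = sum(times) / n
    v_bar = sum(values) / n
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (v - v_bar) for t, v in zip(times, values))
    return sxy / sxx


# Hypothetical individual: three visits over ten years.
visit_years = [0.0, 5.0, 10.0]
glucose = [5.0, 5.5, 6.0]
slope_per_year = ols_slope(visit_years, glucose)
```

The resulting slope is the glucose change per year for that individual; in a study of this kind, the slopes would then be transformed and tested against genotype.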
|
13
|
Abstract
More than a decade ago, the term "next-generation" sequencing was coined to describe what was, at the time, revolutionary new methods to sequence RNA and DNA at a faster pace and cheaper cost than could be performed by standard bench-top protocols. Since then, the field of DNA sequencing has evolved at a rapid pace, with new breakthroughs allowing capacity to exponentially increase and cost to dramatically decrease. As genome-scale sequencing has become routine, a paradigm shift is occurring in genomics, which uses the power of high-throughput, rapid sequencing power with large-scale studies. These new approaches to genetic discovery will provide direct impact to fields such as personalized medicine, evolution, and biodiversity. This work reviews recent technology advances and methods in next-generation sequencing and highlights current large-scale sequencing efforts driving the evolution of the genomics space.
|
15
|
QTL Mapping on a Background of Variance Heterogeneity. G3 (BETHESDA, MD.) 2018; 8:3767-3782. [PMID: 30389794 PMCID: PMC6288843 DOI: 10.1534/g3.118.200790] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 02/27/2018] [Accepted: 10/28/2018] [Indexed: 12/21/2022]
Abstract
Standard QTL mapping procedures seek to identify genetic loci affecting the phenotypic mean while assuming that all individuals have the same residual variance. But when the residual variance differs systematically between groups, perhaps due to a genetic or environmental factor, such standard procedures can falter: in testing for QTL associations, they attribute too much weight to observations that are noisy and too little to those that are precise, resulting in reduced power and increased susceptibility to false positives. The negative effects of such "background variance heterogeneity" (BVH) on standard QTL mapping have received little attention until now, although the subject is closely related to work on the detection of variance-controlling genes. Here we use simulation to examine how BVH affects power and false positive rate for detecting QTL affecting the mean (mQTL), the variance (vQTL), or both (mvQTL). We compare linear regression for mQTL and Levene's test for vQTL, with tests more recently developed, including tests based on the double generalized linear model (DGLM), which can model BVH explicitly. We show that, when used in conjunction with a suitable permutation procedure, the DGLM-based tests accurately control false positive rate and are more powerful than the other tests. We also find that some adverse effects of BVH can be mitigated by applying a rank inverse normal transform. We apply our novel approach, which we term "mean-variance QTL mapping", to publicly available data on a mouse backcross and, after accommodating BVH driven by sire, detect a new mQTL for bodyweight.
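The rank inverse normal transform mentioned in the abstract above replaces each phenotype value with the standard normal quantile of its rank. The sketch below is one common variant, assuming the Blom offset (c = 3/8) and average ranks for ties; the abstract does not say which offset or tie rule the authors used, and the function name `rank_inverse_normal` is hypothetical.

```python
from statistics import NormalDist


def rank_inverse_normal(values, c=3 / 8):
    """Map values to normal quantiles of their ranks (Blom offset by default).

    Assumption for illustration: ties receive the average of their ranks.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based average rank of the run
        for pos in range(i, j + 1):
            ranks[order[pos]] = average_rank
        i = j + 1
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf((r - c) / (n - 2 * c + 1)) for r in ranks]


# Hypothetical skewed phenotype values.
transformed = rank_inverse_normal([3.0, 1.0, 2.0])
```

The transform keeps the ordering of individuals but forces the marginal distribution toward normality, which is why it can dampen some effects of heavy tails or outliers on downstream tests.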
|
16
|
Identifying loci affecting trait variability and detecting interactions in genome-wide association studies. Nat Genet 2018; 50:1608-1614. [PMID: 30323177 DOI: 10.1038/s41588-018-0225-6] [Citation(s) in RCA: 48] [Impact Index Per Article: 8.0] [Received: 04/15/2017] [Accepted: 08/03/2018] [Indexed: 11/09/2022]
Abstract
Identification of genetic variants with effects on trait variability can provide insights into the biological mechanisms that control variation and can identify potential interactions. We propose a two-degree-of-freedom test for jointly testing mean and variance effects to identify such variants. We implement the test in a linear mixed model, for which we provide an efficient algorithm and software. To focus on biologically interesting settings, we develop a test for dispersion effects, that is, variance effects not driven solely by mean effects when the trait distribution is non-normal. We apply our approach to body mass index in the subsample of the UK Biobank population with British ancestry (n ~408,000) and show that our approach can increase the power to detect associated loci. We identify and replicate novel associations with significant variance effects that cannot be explained by the non-normality of body mass index, and we provide suggestive evidence for a connection between leptin levels and body mass index variability.
|
17
|
Genotypic variability-based genome-wide association study identifies non-additive loci HLA-C and IL12B for psoriasis. J Hum Genet 2017; 63:289-296. [DOI: 10.1038/s10038-017-0350-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 07/28/2017] [Revised: 09/04/2017] [Accepted: 09/05/2017] [Indexed: 12/19/2022]
|