1. Tyler AL, Mahoney JM, Keller MP, Baker CN, Gaca M, Srivastava A, Gerdes Gyuricza I, Braun MJ, Rosenthal NA, Attie AD, Churchill GA, Carter GW. Transcripts with high distal heritability mediate genetic effects on complex metabolic traits. bioRxiv 2024:2024.09.26.613931. [PMID: 39386475 PMCID: PMC11463413 DOI: 10.1101/2024.09.26.613931] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/12/2024]
Abstract
Although many genes are subject to local regulation, recent evidence suggests that complex distal regulation may be more important in mediating phenotypic variability. To assess the role of distal gene regulation in complex traits, we combined multi-tissue transcriptomes with physiological outcomes to model diet-induced obesity and metabolic disease in a population of Diversity Outbred mice. Using a novel high-dimensional mediation analysis, we identified a composite transcriptome signature that summarized genetic effects on gene expression and explained 30% of the variation across all metabolic traits. The signature was heritable, interpretable in biological terms, and predicted obesity status from gene expression in an independently derived mouse cohort and multiple human studies. Transcripts contributing most strongly to this composite mediator frequently had complex, distal regulation distributed throughout the genome. These results suggest that trait-relevant variation in transcription is largely distally regulated, but is nonetheless identifiable, interpretable, and translatable across species.
2. Dagostino R, Gottlieb A. Tissue-specific atlas of trans-models for gene regulation elucidates complex regulation patterns. BMC Genomics 2024; 25:377. [PMID: 38632500 PMCID: PMC11022497 DOI: 10.1186/s12864-024-10317-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2023] [Accepted: 04/16/2024] [Indexed: 04/19/2024]
Abstract
BACKGROUND: Deciphering gene regulation is essential for understanding the underlying mechanisms of healthy and disease states. While the regulatory networks formed by transcription factors (TFs) and their target genes have mostly been studied in relation to cis effects, such as those at TF binding sites, we focused on trans effects of TFs on the expression of their transcribed genes and their potential mechanisms. RESULTS: We provide a comprehensive tissue-specific atlas, spanning 49 tissues, of TF variations affecting gene expression through computational models considering two potential mechanisms: combinatorial regulation by the expression of the TFs, and regulation by genetic variants within the TF. We demonstrate that similarity between tissues based on our discovered genes corresponds to other types of tissue similarity. The genes affected by complex TF regulation, and their modelled TFs, were highly enriched for pharmacogenomic functions, while the TFs themselves were also enriched in several cancer and metabolic pathways. Additionally, genes that appear in multiple clusters are enriched for regulation of the immune system, while tissue clusters include cluster-specific genes that are enriched for biological functions and diseases previously associated with the tissues forming the cluster. Finally, our atlas exposes multilevel regulation across multiple tissues, where TFs regulate other TFs through the two tested mechanisms. CONCLUSIONS: Our tissue-specific atlas provides hierarchical tissue-specific trans genetic regulations that can be further studied for association with human phenotypes.
Affiliation(s)
- Robert Dagostino
  - McWilliams School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA
- Assaf Gottlieb
  - McWilliams School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA
3. Wittich H, Ardlie K, Taylor KD, Durda P, Liu Y, Mikhaylova A, Gignoux CR, Cho MH, Rich SS, Rotter JI, Manichaikul A, Im HK, Wheeler HE. Transcriptome-wide association study of the plasma proteome reveals cis and trans regulatory mechanisms underlying complex traits. Am J Hum Genet 2024; 111:445-455. [PMID: 38320554 PMCID: PMC10940016 DOI: 10.1016/j.ajhg.2024.01.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/25/2023] [Revised: 01/12/2024] [Accepted: 01/12/2024] [Indexed: 02/08/2024]
Abstract
Regulation of transcription and translation are mechanisms through which genetic variants affect complex traits. Expression quantitative trait locus (eQTL) studies have been more successful at identifying cis-eQTL (within 1 Mb of the transcription start site) than trans-eQTL. Here, we tested the cis component of gene expression for association with observed plasma protein levels to identify cis- and trans-acting genes that regulate protein levels. We used transcriptome prediction models from 49 Genotype-Tissue Expression (GTEx) Project tissues to predict the cis component of gene expression and tested the predicted expression of every gene in every tissue for association with the observed abundance of 3,622 plasma proteins measured in 3,301 individuals from the INTERVAL study. We tested significant results for replication in 971 individuals from the Trans-omics for Precision Medicine (TOPMed) Multi-Ethnic Study of Atherosclerosis (MESA). We found 1,168 and 1,210 cis- and trans-acting associations that replicated in TOPMed (FDR < 0.05) with a median expected true positive rate (π1) across tissues of 0.806 and 0.390, respectively. The target proteins of trans-acting genes were enriched for transcription factor binding sites and autoimmune diseases in the GWAS catalog. Furthermore, we found a higher correlation between predicted expression and protein levels of the same underlying gene (R = 0.17) than observed expression (R = 0.10, p = 7.50 × 10⁻¹¹). This indicates the cis-acting genetically regulated (heritable) component of gene expression is more consistent across tissues than total observed expression (genetics + environment) and is useful in uncovering the function of SNPs associated with complex traits.
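As a rough illustration of the π1 statistic reported in this abstract, the sketch below uses a fixed-λ form of Storey's estimator: the share of null tests π0 is approximated by the fraction of p-values above λ, rescaled by 1 − λ, and π1 = 1 − π0. The function name `pi1_estimate`, the choice λ = 0.5, and the toy p-values are assumptions made for illustration; the abstract does not state which estimator settings the authors used.

```python
import numpy as np

def pi1_estimate(pvalues, lam=0.5):
    """Fixed-lambda Storey-style estimate of the true positive rate pi1."""
    p = np.asarray(pvalues, dtype=float)
    # Under the null, p-values are uniform, so about (1 - lam) of them exceed lam.
    pi0 = np.mean(p > lam) / (1.0 - lam)
    return 1.0 - min(pi0, 1.0)  # clip pi0 at 1 so pi1 never goes negative

# Hypothetical replication p-values: 800 strong signals plus 200 evenly spread nulls.
signals = np.full(800, 1e-4)
nulls = (np.arange(200) + 0.5) / 200
pi1 = pi1_estimate(np.concatenate([signals, nulls]))
```

On this toy input the estimate is about 0.8, matching the simulated share of true signals. Production analyses usually smooth the estimate over a grid of λ values rather than fixing a single one.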
Affiliation(s)
- Henry Wittich
  - Program in Bioinformatics, Loyola University Chicago, Chicago, IL 60660, USA
- Kristin Ardlie
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Kent D Taylor
  - The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA 90502, USA
- Peter Durda
  - Laboratory for Clinical Biochemistry Research, University of Vermont, Colchester, VT 05446, USA
- Yongmei Liu
  - Department of Medicine, Duke University School of Medicine, Durham, NC 27710, USA
- Anna Mikhaylova
  - Department of Biostatistics, University of Washington, Seattle, WA 98195, USA
- Chris R Gignoux
  - Division of Biomedical Informatics and Personalized Medicine, Department of Medicine, University of Colorado Denver Anschutz Medical Campus, Aurora, CO 80045, USA
- Michael H Cho
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Boston, MA 02115, USA
- Stephen S Rich
  - Center for Public Health Genomics, Department of Public Health Sciences, University of Virginia, Charlottesville, VA 22908, USA
- Jerome I Rotter
  - The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA 90502, USA
- Ani Manichaikul
  - Center for Public Health Genomics, Department of Public Health Sciences, University of Virginia, Charlottesville, VA 22908, USA
- Hae Kyung Im
  - Section of Genetic Medicine, The University of Chicago, Chicago, IL 60637, USA
- Heather E Wheeler
  - Program in Bioinformatics, Loyola University Chicago, Chicago, IL 60660, USA; Department of Biology, Loyola University Chicago, Chicago, IL 60660, USA
4. Tissue-Specific Variations in Transcription Factors Elucidate Complex Immune System Regulation. Genes (Basel) 2022; 13:genes13050929. [PMID: 35627314 PMCID: PMC9140347 DOI: 10.3390/genes13050929] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2022] [Revised: 05/16/2022] [Accepted: 05/18/2022] [Indexed: 11/17/2022]
Abstract
Gene expression plays a key role in health and disease. Estimating the genetic components underlying gene expression can thus help understand disease etiology. Polygenic models termed “transcriptome imputation” are used to estimate the genetic component of gene expression, but these models typically consider only the cis regions of the gene. However, these cis-based models miss large variability in expression for multiple genes. Transcription factors (TFs) that regulate gene expression are natural candidates when looking for additional sources of the missing variability. We developed a hypothesis-driven approach to identify second-tier regulation by variability in TFs. Our approach tested two models representing possible mechanisms by which variations in TFs can affect gene expression: variability in the expression of the TF, and genetic variants within the TF that may affect the binding affinity of the TF to the TF-binding site. We tested our TF models in whole blood and skeletal muscle tissues and identified TF variability that can partially explain missing gene expression for 1035 genes; for 76% of these genes, it explains more than the cis-based models. While the discovered regulation patterns were tissue-specific, they were both enriched for immune system functionality, elucidating complex regulation patterns. Our hypothesis-driven approach is useful for identifying tissue-specific genetic regulation patterns involving variations in TF expression or binding.
5. Banerjee S, Simonetti FL, Detrois KE, Kaphle A, Mitra R, Nagial R, Söding J. Tejaas: reverse regression increases power for detecting trans-eQTLs. Genome Biol 2021; 22:142. [PMID: 33957961 PMCID: PMC8101255 DOI: 10.1186/s13059-021-02361-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 09/09/2020] [Accepted: 04/22/2021] [Indexed: 12/18/2022]
Abstract
Trans-acting expression quantitative trait loci (trans-eQTLs) account for ≥70% of expression heritability and could therefore facilitate uncovering mechanisms underlying the origination of complex diseases. Identifying trans-eQTLs is challenging because of small effect sizes, tissue specificity, and a severe multiple-testing burden. Tejaas predicts trans-eQTLs by performing L2-regularized “reverse” multiple regression of each SNP on all genes, aggregating evidence from many small trans-effects while being unaffected by the strong expression correlations. Combined with a novel unsupervised k-nearest neighbor method to remove confounders, Tejaas predicts 18851 unique trans-eQTLs across 49 tissues from GTEx. They are enriched in open chromatin, enhancers, and other regulatory regions. Many overlap with disease-associated SNPs, pointing to tissue-specific transcriptional regulation mechanisms.
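The ridge step behind this kind of reverse regression can be sketched as follows. This is only a minimal illustration of L2-regularized regression of one SNP on all genes, not the Tejaas test statistic or its k-nearest-neighbor confounder correction; the function name `reverse_ridge`, the penalty values, and the simulated data are assumptions made for illustration.

```python
import numpy as np

def reverse_ridge(snp, expr, lam=1.0):
    """Regress one SNP genotype vector on all gene expression columns with an L2 penalty."""
    x = (expr - expr.mean(axis=0)) / expr.std(axis=0)  # standardize each gene
    y = snp - snp.mean()                               # center the genotype
    n_genes = x.shape[1]
    # Closed-form ridge solution: (X^T X + lam * I)^-1 X^T y
    return np.linalg.solve(x.T @ x + lam * np.eye(n_genes), x.T @ y)

# Simulated data (hypothetical): 200 samples, 50 genes, small trans effects on 10 genes.
rng = np.random.default_rng(0)
snp = rng.binomial(2, 0.3, size=200).astype(float)
expr = rng.normal(size=(200, 50))
expr[:, :10] += 0.2 * snp[:, None]

beta_weak = reverse_ridge(snp, expr, lam=1.0)
beta_strong = reverse_ridge(snp, expr, lam=1000.0)
```

A larger penalty shrinks the coefficient vector toward zero, which is what keeps the fit stable when the number of genes approaches or exceeds the number of samples.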
Affiliation(s)
- Saikat Banerjee
  - Quantitative and Computational Biology, Max-Planck Institute for Biophysical Chemistry, Göttingen, 37077, Germany
- Franco L Simonetti
  - Quantitative and Computational Biology, Max-Planck Institute for Biophysical Chemistry, Göttingen, 37077, Germany
- Kira E Detrois
  - Quantitative and Computational Biology, Max-Planck Institute for Biophysical Chemistry, Göttingen, 37077, Germany; Georg-August University, Göttingen, 37075, Germany
- Anubhav Kaphle
  - Quantitative and Computational Biology, Max-Planck Institute for Biophysical Chemistry, Göttingen, 37077, Germany; Georg-August University, Göttingen, 37075, Germany
- Johannes Söding
  - Quantitative and Computational Biology, Max-Planck Institute for Biophysical Chemistry, Göttingen, 37077, Germany; Campus-Institut Data Science (CIDAS), University of Göttingen, Göttingen, 37073, Germany; Cluster of Excellence "Multiscale Bioimaging" (MBExC), University of Göttingen, Göttingen, 37075, Germany
6. Bhattacharya A, Li Y, Love MI. MOSTWAS: Multi-Omic Strategies for Transcriptome-Wide Association Studies. PLoS Genet 2021; 17:e1009398. [PMID: 33684137 PMCID: PMC7971899 DOI: 10.1371/journal.pgen.1009398] [Citation(s) in RCA: 37] [Impact Index Per Article: 9.3] [Received: 06/27/2020] [Revised: 03/18/2021] [Accepted: 02/04/2021] [Indexed: 02/06/2023]
Abstract
Traditional predictive models for transcriptome-wide association studies (TWAS) consider only single nucleotide polymorphisms (SNPs) local to genes of interest and perform parameter shrinkage with a regularization process. These approaches ignore the effect of distal-SNPs or other molecular effects underlying the SNP-gene association. Here, we outline multi-omics strategies for transcriptome imputation from germline genetics to allow more powerful testing of gene-trait associations by prioritizing distal-SNPs to the gene of interest. In one extension, we identify mediating biomarkers (CpG sites, microRNAs, and transcription factors) highly associated with gene expression and train predictive models for these mediators using their local SNPs. Imputed values for mediators are then incorporated into the final predictive model of gene expression, along with local SNPs. In the second extension, we assess distal-eQTLs (SNPs associated with genes not in a local window around them) for their mediation effect through mediating biomarkers local to these distal-eSNPs. Distal-eSNPs with large indirect mediation effects are then included in the transcriptomic prediction model with the local SNPs around the gene of interest. Using simulations and real data from ROS/MAP brain tissue and TCGA breast tumors, we show considerable gains in percent variance explained (1-2% additive increase) of gene expression and in TWAS power to detect gene-trait associations. This integrative approach to transcriptome-wide imputation and association studies aids in identifying the complex interactions underlying genetic regulation within a tissue and important risk genes for various traits and disorders.
Affiliation(s)
- Arjun Bhattacharya
  - Department of Pathology and Laboratory Medicine, University of California-Los Angeles, Los Angeles, California, United States of America
- Yun Li
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
  - Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Michael I. Love
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
7. Alcohol use disorder causes global changes in splicing in the human brain. Transl Psychiatry 2021; 11:2. [PMID: 33414398 PMCID: PMC7790816 DOI: 10.1038/s41398-020-01163-z] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 06/15/2020] [Revised: 12/09/2020] [Accepted: 12/10/2020] [Indexed: 01/11/2023]
Abstract
Alcohol use disorder (AUD) is a widespread disease leading to the deterioration of cognitive and other functions. Mechanisms by which alcohol affects the brain are not fully elucidated. Splicing is a nuclear process of RNA maturation that results in the formation of the transcriptome. We tested the hypothesis that AUD impairs splicing in the superior frontal cortex (SFC), nucleus accumbens (NA), basolateral amygdala (BLA), and central nucleus of the amygdala (CNA). To evaluate splicing, BAM files from STAR alignments were indexed with samtools for use by rMATS software. Computational analysis of affected pathways was performed using Gene Ontology Consortium, Gene Set Enrichment Analysis, and LncRNA Ontology databases. Surprisingly, AUD was associated with limited changes in the transcriptome: expression of 23 genes was altered in SFC, 14 in NA, 102 in BLA, and 57 in CNA. Strikingly, however, mis-splicing in AUD was profound: 1421 mis-splicing events were detected in SFC, 394 in NA, 1317 in BLA, and 469 in CNA. To determine the mechanism of mis-splicing, we analyzed the elements of the spliceosome: small nuclear RNAs (snRNAs) and splicing factors. While snRNAs were not affected by alcohol, expression of splicing factor heat shock protein family A (Hsp70) member 6 (HSPA6) was drastically increased in SFC, BLA, and CNA. Also, AUD was accompanied by aberrant expression of long noncoding RNAs (lncRNAs) related to splicing. In summary, alcohol is associated with genome-wide changes in splicing in multiple human brain regions, likely due to dysregulation of splicing factor(s) and/or altered expression of splicing-related lncRNAs.
8. Xie Y, Shan N, Zhao H, Hou L. Transcriptome wide association studies: general framework and methods. Quantitative Biology 2021. [DOI: 10.15302/j-qb-020-0228] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/16/2022]
9. Deutelmoser H, Lorenzo Bermejo J, Benner A, Weigl K, Park HA, Haffa M, Herpel E, Schneider M, Ulrich CM, Hoffmeister M, Chang-Claude J, Brenner H, Scherer D. Genotype-Based Gene Expression in Colon Tissue-Prediction Accuracy and Relationship with the Prognosis of Colorectal Cancer Patients. Int J Mol Sci 2020; 21:E8150. [PMID: 33142733 PMCID: PMC7662650 DOI: 10.3390/ijms21218150] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 09/30/2020] [Revised: 10/26/2020] [Accepted: 10/27/2020] [Indexed: 12/24/2022]
Abstract
Colorectal cancer (CRC) survival has environmental and inherited components. The expression of specific genes can be inferred from individual genotypes (so-called expression quantitative trait loci). In this study, we used the PrediXcan method to predict gene expression in normal colon tissue using individual genotype data from 91 CRC patients and examined the correlation ρ between predicted and measured gene expression levels. Out of 5434 predicted genes, 58% showed a negative ρ value and only 16% presented a ρ higher than 0.10. We subsequently investigated the association between genotype-based gene expression in colon tissue for genes with ρ > 0.10 and survival of 4436 CRC patients. We identified an inverse association between the predicted expression of ARID3B and CRC-specific survival for patients with a body mass index greater than or equal to 30 kg/m² (HR (hazard ratio) = 0.66 for an expression higher vs. lower than the median, p = 0.005). This association was validated using genotype and clinical data from the UK Biobank (HR = 0.74, p = 0.04). In addition to the identification of ARID3B expression in normal colon tissue as a candidate prognostic biomarker for obese CRC patients, our study illustrates the challenges of genotype-based prediction of gene expression, and the advantage of reassessing the prediction accuracy in a subset of the study population using measured gene expression data.
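The accuracy check described in this abstract, correlating predicted with measured expression gene by gene and keeping genes above a threshold, can be sketched as below. The function name `per_gene_correlation` and the simulated matrices are assumptions made for illustration; only the sample size of 91 and the ρ > 0.10 cutoff come from the abstract, and the study's own pipeline is not reproduced here.

```python
import numpy as np

def per_gene_correlation(predicted, measured):
    """Pearson correlation between predicted and measured expression, one value per gene (column)."""
    p = predicted - predicted.mean(axis=0)
    m = measured - measured.mean(axis=0)
    return (p * m).sum(axis=0) / np.sqrt((p ** 2).sum(axis=0) * (m ** 2).sum(axis=0))

# Simulated data (hypothetical): 91 samples and 6 toy genes.
rng = np.random.default_rng(1)
measured = rng.normal(size=(91, 6))
predicted = measured.copy()
predicted[:, 3:] = rng.normal(size=(91, 3))  # last three predictions carry no real signal

rho = per_gene_correlation(predicted, measured)
usable = rho > 0.10  # genes that would be carried forward to the survival analysis
```

Filtering on measured accuracy in a subset of the cohort, rather than trusting the prediction model's reported performance, is the design choice the abstract argues for.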
Affiliation(s)
- Heike Deutelmoser
  - Division of Preventive Oncology, German Cancer Research Center (DKFZ) and National Center for Tumor Diseases (NCT), Im Neuenheimer Feld 460, 69120 Heidelberg, Germany
  - Institute of Medical Biometry and Informatics, Medical Faculty, Heidelberg University, Im Neuenheimer Feld 130.3, 69120 Heidelberg, Germany
- Justo Lorenzo Bermejo
  - Institute of Medical Biometry and Informatics, Medical Faculty, Heidelberg University, Im Neuenheimer Feld 130.3, 69120 Heidelberg, Germany
- Axel Benner
  - Division of Biostatistics, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 581, 69121 Heidelberg, Germany
- Korbinian Weigl
  - Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 581, 69121 Heidelberg, Germany
- Hanla A. Park
  - Division of Cancer Epidemiology, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 581, 69121 Heidelberg, Germany
- Mariam Haffa
  - Division of Preventive Oncology, German Cancer Research Center (DKFZ) and National Center for Tumor Diseases (NCT), Im Neuenheimer Feld 460, 69120 Heidelberg, Germany
  - Division of Translational Functional Cancer Genomics, National Center for Tumor Diseases (NCT) and German Cancer Research Center (DKFZ), Im Neuenheimer Feld 460, 69120 Heidelberg, Germany
- Esther Herpel
  - NCT Tissue Bank, National Center for Tumor Diseases (NCT) and University Hospital Heidelberg, Im Neuenheimer Feld 460, 69120 Heidelberg, Germany
  - Institute of Pathology, University Hospital Heidelberg, Im Neuenheimer Feld 224, 69120 Heidelberg, Germany
- Martin Schneider
  - Department of General, Visceral, and Transplantation Surgery, University Hospital Heidelberg, Im Neuenheimer Feld 420, 69120 Heidelberg, Germany
- Cornelia M. Ulrich
  - Division of Preventive Oncology, German Cancer Research Center (DKFZ) and National Center for Tumor Diseases (NCT), Im Neuenheimer Feld 460, 69120 Heidelberg, Germany
  - Huntsman Cancer Institute, 2000 Cir of Hope Dr 1950, Salt Lake City, UT 84112, USA
  - Department of Population Health Sciences, School of Medicine, University of Utah, Salt Lake City, UT 84112, USA
- Michael Hoffmeister
  - Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 581, 69121 Heidelberg, Germany
- Jenny Chang-Claude
  - Division of Cancer Epidemiology, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 581, 69121 Heidelberg, Germany
  - Cancer Epidemiology Group, University Cancer Center Hamburg (UCCH), University Medical Center Hamburg-Eppendorf (UKE), Martinstraße 52, 20246 Hamburg, Germany
- Hermann Brenner
  - Division of Preventive Oncology, German Cancer Research Center (DKFZ) and National Center for Tumor Diseases (NCT), Im Neuenheimer Feld 460, 69120 Heidelberg, Germany
  - Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 581, 69121 Heidelberg, Germany
  - German Cancer Consortium (DKTK), German Cancer Research Center (DKFZ), Im Neuenheimer Feld 280, 69120 Heidelberg, Germany
- Dominique Scherer
  - Division of Preventive Oncology, German Cancer Research Center (DKFZ) and National Center for Tumor Diseases (NCT), Im Neuenheimer Feld 460, 69120 Heidelberg, Germany
  - Institute of Medical Biometry and Informatics, Medical Faculty, Heidelberg University, Im Neuenheimer Feld 130.3, 69120 Heidelberg, Germany
10. Kolberg L, Kerimov N, Peterson H, Alasoo K. Co-expression analysis reveals interpretable gene modules controlled by trans-acting genetic variants. eLife 2020; 9:e58705. [PMID: 32880574 PMCID: PMC7470823 DOI: 10.7554/elife.58705] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Received: 05/08/2020] [Accepted: 08/20/2020] [Indexed: 12/16/2022]
Abstract
Understanding the causal processes that contribute to disease onset and progression is essential for developing novel therapies. Although trans-acting expression quantitative trait loci (trans-eQTLs) can directly reveal cellular processes modulated by disease variants, detecting trans-eQTLs remains challenging due to their small effect sizes. Here, we analysed gene expression and genotype data from six blood cell types from 226 to 710 individuals. We used co-expression modules inferred from gene expression data with five methods as traits in trans-eQTL analysis to limit multiple testing and improve interpretability. In addition to replicating three established associations, we discovered a novel trans-eQTL near SLC39A8 regulating a module of metallothionein genes in LPS-stimulated monocytes. Interestingly, this effect was mediated by a transient cis-eQTL present only in early LPS response and lost before the trans effect appeared. Our analyses highlight how co-expression combined with functional enrichment analysis improves the identification and prioritisation of trans-eQTLs when applied to emerging cell-type-specific datasets.
Affiliation(s)
- Liis Kolberg
  - Institute of Computer Science, University of Tartu, Tartu, Estonia
- Nurlan Kerimov
  - Institute of Computer Science, University of Tartu, Tartu, Estonia
- Hedi Peterson
  - Institute of Computer Science, University of Tartu, Tartu, Estonia
- Kaur Alasoo
  - Institute of Computer Science, University of Tartu, Tartu, Estonia
11. Liu X, Mefford JA, Dahl A, He Y, Subramaniam M, Battle A, Price AL, Zaitlen N. GBAT: a gene-based association test for robust detection of trans-gene regulation. Genome Biol 2020; 21:211. [PMID: 32831138 PMCID: PMC7444084 DOI: 10.1186/s13059-020-02120-1] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 03/21/2019] [Accepted: 07/27/2020] [Indexed: 02/07/2023]
Abstract
The observation that disease-associated genetic variants typically reside outside of exons has inspired widespread investigation into the genetic basis of transcriptional regulation. While associations between the mRNA abundance of a gene and its proximal SNPs (cis-eQTLs) are now readily identified, identification of high-quality distal associations (trans-eQTLs) has been limited by a heavy multiple testing burden and the proneness to false-positive signals. To address these issues, we develop GBAT, a powerful gene-based pipeline that allows robust detection of high-quality trans-gene regulation signal.
Affiliation(s)
- Xuanyao Liu
  - Department of Epidemiology, Harvard TH Chan School of Public Health, Boston, MA, USA
  - Department of Human Genetics, The University of Chicago, Chicago, IL, USA
- Joel A. Mefford
  - Departments of Neurology and Computational Medicine, University of California Los Angeles, Los Angeles, CA, USA
- Andrew Dahl
  - Departments of Neurology and Computational Medicine, University of California Los Angeles, Los Angeles, CA, USA
- Yuan He
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Meena Subramaniam
  - Departments of Neurology and Computational Medicine, University of California Los Angeles, Los Angeles, CA, USA
- Alexis Battle
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Alkes L. Price
  - Department of Epidemiology, Harvard TH Chan School of Public Health, Boston, MA, USA
- Noah Zaitlen
  - Departments of Neurology and Computational Medicine, University of California Los Angeles, Los Angeles, CA, USA
12. Yang T, Wu C, Wei P, Pan W. Integrating DNA sequencing and transcriptomic data for association analyses of low-frequency variants and lipid traits. Hum Mol Genet 2020; 29:515-526. [PMID: 31919517 PMCID: PMC7015848 DOI: 10.1093/hmg/ddz314] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 10/24/2019] [Revised: 12/11/2019] [Accepted: 12/16/2019] [Indexed: 12/13/2022]
Abstract
Transcriptome-wide association studies (TWAS) integrate genome-wide association studies (GWAS) and transcriptomic data to showcase their improved statistical power of identifying gene-trait associations while, importantly, offering further biological insights. TWAS have thus far focused on common variants as available from GWAS. Compared with common variants, the findings for or even applications to low-frequency variants are limited and their underlying role in regulating gene expression is less clear. To fill this gap, we extend TWAS to integrating whole genome sequencing data with transcriptomic data for low-frequency variants. Using the data from the Framingham Heart Study, we demonstrate that low-frequency variants play an important and universal role in predicting gene expression, which is not completely due to linkage disequilibrium with the nearby common variants. By including low-frequency variants, in addition to common variants, we increase the predictivity of gene expression for 79% of the examined genes. Incorporating this piece of functional genomic information, we perform association testing for five lipid traits in two UK10K whole genome sequencing cohorts, hypothesizing that cis-expression quantitative trait loci, including low-frequency variants, are more likely to be trait-associated. We discover that two genes, LDLR and TTC22, are genome-wide significantly associated with low-density lipoprotein cholesterol based on 3203 subjects and that the association signals are largely independent of common variants. We further demonstrate that a joint analysis of both common and low-frequency variants identifies association signals that would be missed by testing on either common variants or low-frequency variants alone.
Affiliation(s)
- Tianzhong Yang
  - Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN, USA
- Chong Wu
  - Department of Statistics, Florida State University, Tallahassee, FL, USA
- Peng Wei
  - Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Wei Pan
  - Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN, USA
13. Wheeler HE, Ploch S, Barbeira AN, Bonazzola R, Andaleon A, Fotuhi Siahpirani A, Saha A, Battle A, Roy S, Im HK. Imputed gene associations identify replicable trans-acting genes enriched in transcription pathways and complex traits. Genet Epidemiol 2019; 43:596-608. [PMID: 30950127 PMCID: PMC6687523 DOI: 10.1002/gepi.22205] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 11/13/2018] [Revised: 02/15/2019] [Accepted: 03/18/2019] [Indexed: 11/17/2022]
Abstract
Regulation of gene expression is an important mechanism through which genetic variation can affect complex traits. A substantial portion of gene expression variation can be explained by both local (cis) and distal (trans) genetic variation. Much progress has been made in uncovering cis-acting expression quantitative trait loci (cis-eQTL), but trans-eQTL have been more difficult to identify and replicate. Here we take advantage of our ability to predict the cis component of gene expression coupled with gene mapping methods such as PrediXcan to identify high confidence candidate trans-acting genes and their targets. That is, we correlate the cis component of gene expression with observed expression of genes in different chromosomes. Leveraging the shared cis-acting regulation across tissues, we combine the evidence of association across all available Genotype-Tissue Expression Project tissues and find 2,356 trans-acting/target gene pairs with high mappability scores. Reassuringly, trans-acting genes are enriched in transcription and nucleic acid binding pathways and target genes are enriched in known transcription factor binding sites. Interestingly, trans-acting genes are more significantly associated with selected complex traits and diseases than target or background genes, consistent with percolating trans effects. Our scripts and summary statistics are publicly available for future studies of trans-acting gene regulation.
Affiliation(s)
- Heather E. Wheeler
  - Department of Biology, Loyola University Chicago, Chicago, Illinois
  - Department of Computer Science, Loyola University Chicago, Chicago, Illinois
  - Department of Public Health Sciences, Stritch School of Medicine, Loyola University Chicago, Maywood, Illinois
- Sally Ploch
  - Department of Biology, Loyola University Chicago, Chicago, Illinois
- Alvaro N. Barbeira
  - Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, Illinois
- Rodrigo Bonazzola
  - Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, Illinois
- Angela Andaleon
  - Department of Biology, Loyola University Chicago, Chicago, Illinois
- Ashis Saha
  - Department of Computer Science, Johns Hopkins University, Baltimore, Maryland
- Alexis Battle
  - Department of Computer Science, Johns Hopkins University, Baltimore, Maryland
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland
- Sushmita Roy
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, Wisconsin
- Hae Kyung Im
  - Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, Illinois
|
14
|
Opportunities and challenges for transcriptome-wide association studies. Nat Genet 2019; 51:592-599. [PMID: 30926968 DOI: 10.1038/s41588-019-0385-z] [Citation(s) in RCA: 533] [Impact Index Per Article: 88.8] [Received: 05/25/2018] [Accepted: 02/13/2019] [Indexed: 11/08/2022]
Abstract
Transcriptome-wide association studies (TWAS) integrate genome-wide association studies (GWAS) and gene expression datasets to identify gene-trait associations. In this Perspective, we explore properties of TWAS as a potential approach to prioritize causal genes at GWAS loci, by using simulations and case studies of literature-curated candidate causal genes for schizophrenia, low-density-lipoprotein cholesterol and Crohn's disease. We explore risk loci where TWAS accurately prioritizes the likely causal gene as well as loci where TWAS prioritizes multiple genes, some likely to be non-causal, owing to sharing of expression quantitative trait loci (eQTL). TWAS is especially prone to spurious prioritization with expression data from non-trait-related tissues or cell types, owing to substantial cross-cell-type variation in expression levels and eQTL strengths. Nonetheless, TWAS prioritizes candidate causal genes more accurately than simple baselines. We suggest best practices for causal-gene prioritization with TWAS and discuss future opportunities for improvement. Our results showcase the strengths and limitations of using eQTL datasets to determine causal genes at GWAS loci.
|