51
Abstract
Expression quantitative trait locus (eQTL) analysis is a powerful method for understanding the association between genetic variants and gene expression; it also has potential impact on the study of transcription medicine for complex human diseases. Over the past two decades, researchers have focused on eQTLs, while mounting evidence shows that regulatory genetic variants located in noncoding regions have strong effects on gene expression. A growing number of researchers working on eQTL analysis recognize the importance of other types of QTLs beyond eQTLs. In this chapter, we explore some QTLs beyond eQTLs that show regulatory associations with eQTLs and explain the underlying links among these types of QTLs.
Affiliation(s)
- Jia Wen
- Department of Bioinformatics and Genomics, College of Computing and Informatics, University of North Carolina at Charlotte, Charlotte, NC, USA.
- Conor Nodzak
- Department of Bioinformatics and Genomics, College of Computing and Informatics, University of North Carolina at Charlotte, Charlotte, NC, USA
- Xinghua Shi
- Department of Bioinformatics and Genomics, College of Computing and Informatics, University of North Carolina at Charlotte, Charlotte, NC, USA
52
Rodríguez-López ML, Martínez-Magaña JJ, Cabrera-Mendoza B, Genis-Mendoza AD, García-Dolores F, López-Armenta M, Flores G, Vázquez-Roque RA, Nicolini H. Exploratory analysis of genetic variants influencing molecular traits in cerebral cortex of suicide completers. Am J Med Genet B Neuropsychiatr Genet 2020; 183:26-37. [PMID: 31418530 DOI: 10.1002/ajmg.b.32752] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 01/24/2019] [Revised: 05/13/2019] [Accepted: 07/09/2019] [Indexed: 12/28/2022]
Abstract
Genetic factors have been implicated in suicidal behavior. It has been suggested that one of the roles of genetic factors in suicide could be represented by the effect of genetic variants on gene expression regulation. Alteration in the expression of genes participating in multiple biological systems in the suicidal brain has been demonstrated, so it is imperative to identify genetic variants that could influence gene expression or its regulatory mechanisms. In this study, we integrated DNA methylation, gene expression, and genotype data from the prefrontal cortex of suicides to identify genetic variants that could be factors in the regulation of gene expression, generally called quantitative trait loci (xQTLs). We identified 6,224 methylation quantitative trait loci and 2,239 expression quantitative trait loci (eQTLs) in the prefrontal cortex of suicide completers. The xQTLs identified influence the expression of genes involved in neurodevelopment and cell organization. Two of the eQTLs identified (rs8065311 and rs1019238) were previously associated with cannabis dependence, highlighting a candidate genetic variant for the increased suicide risk in subjects with substance use disorders. Our findings suggest that genetic variants may regulate gene expression in the prefrontal cortex of suicides through the modulation of promoter and enhancer activity and, to a lesser extent, transcription factor binding.
Affiliation(s)
- Mariana L Rodríguez-López
- Genomics of Psychiatric and Neurodegenerative Diseases Laboratory, National Institute of Genomic Medicine (INMEGEN), Mexico City, Mexico
- José J Martínez-Magaña
- Genomics of Psychiatric and Neurodegenerative Diseases Laboratory, National Institute of Genomic Medicine (INMEGEN), Mexico City, Mexico
- Brenda Cabrera-Mendoza
- Genomics of Psychiatric and Neurodegenerative Diseases Laboratory, National Institute of Genomic Medicine (INMEGEN), Mexico City, Mexico
- Alma D Genis-Mendoza
- Genomics of Psychiatric and Neurodegenerative Diseases Laboratory, National Institute of Genomic Medicine (INMEGEN), Mexico City, Mexico.,Psychiatric Care Services, Child Psychiatric Hospital Dr. Juan N Navarro, CDMX, Mexico
- Gonzalo Flores
- Neuropsychiatry Laboratory, Institute of Physiology, Meritorious Autonomous University of Puebla, Puebla, Mexico
- Rubén A Vázquez-Roque
- Neuropsychiatry Laboratory, Institute of Physiology, Meritorious Autonomous University of Puebla, Puebla, Mexico
- Humberto Nicolini
- Genomics of Psychiatric and Neurodegenerative Diseases Laboratory, National Institute of Genomic Medicine (INMEGEN), Mexico City, Mexico.,Carracci Medical Group, CDMX, Mexico
53
Keele GR, Quach BC, Israel JW, Chappell GA, Lewis L, Safi A, Simon JM, Cotney P, Crawford GE, Valdar W, Rusyn I, Furey TS. Integrative QTL analysis of gene expression and chromatin accessibility identifies multi-tissue patterns of genetic regulation. PLoS Genet 2020; 16:e1008537. [PMID: 31961859 PMCID: PMC7010298 DOI: 10.1371/journal.pgen.1008537] [Citation(s) in RCA: 30] [Impact Index Per Article: 6.0] [Received: 05/28/2019] [Revised: 02/10/2020] [Accepted: 11/23/2019] [Indexed: 01/08/2023] Open
Abstract
Gene transcription profiles across tissues are largely defined by the activity of regulatory elements, most of which correspond to regions of accessible chromatin. Regulatory element activity is in turn modulated by genetic variation, resulting in variable transcription rates across individuals. The interplay of these factors, however, is poorly understood. Here we characterize expression and chromatin state dynamics across three tissues (liver, lung, and kidney) in 47 strains of the Collaborative Cross (CC) mouse population, examining the regulation of these dynamics by expression quantitative trait loci (eQTL) and chromatin QTL (cQTL). QTL whose allelic effects were consistent across tissues were detected for 1,101 genes and 133 chromatin regions. Also detected were eQTL and cQTL whose allelic effects differed across tissues, including local-eQTL for Pik3c2g detected in all three tissues but with distinct allelic effects. Leveraging overlapping measurements of gene expression and chromatin accessibility on the same mice from multiple tissues, we used mediation analysis to identify chromatin and gene expression intermediates of eQTL effects. Based on QTL and mediation analyses over multiple tissues, we propose a causal model for the distal genetic regulation of Akr1e1, a gene involved in glycogen metabolism, through the zinc finger transcription factor Zfp985 and chromatin intermediates. This analysis demonstrates the complexity of transcriptional and chromatin dynamics and their regulation over multiple tissues, as well as the value of the CC and related genetic resource populations for identifying specific regulatory mechanisms within cells and tissues.
Affiliation(s)
- Gregory R. Keele
- Curriculum in Bioinformatics and Computational Biology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- The Jackson Laboratory, Bar Harbor, Maine, United States of America
- Bryan C. Quach
- Curriculum in Bioinformatics and Computational Biology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Center for Omics Discovery and Epidemiology, Research Triangle Institute (RTI) International, Research Triangle Park, North Carolina, United States of America
- Jennifer W. Israel
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Grace A. Chappell
- Department of Veterinary Integrative Biosciences, Texas A&M University, College Station, Texas, United States of America
- Lauren Lewis
- Department of Veterinary Integrative Biosciences, Texas A&M University, College Station, Texas, United States of America
- Alexias Safi
- Department of Pediatrics, Duke University, Durham, North Carolina, United States of America
- Center for Genomic and Computational Biology, Duke University, Durham, North Carolina, United States of America
- Jeremy M. Simon
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Paul Cotney
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Gregory E. Crawford
- Department of Pediatrics, Duke University, Durham, North Carolina, United States of America
- Center for Genomic and Computational Biology, Duke University, Durham, North Carolina, United States of America
- William Valdar
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Ivan Rusyn
- Department of Veterinary Integrative Biosciences, Texas A&M University, College Station, Texas, United States of America
- Terrence S. Furey
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
54
Church BV, Williams HT, Mar JC. Investigating skewness to understand gene expression heterogeneity in large patient cohorts. BMC Bioinformatics 2019; 20:668. [PMID: 31861976 PMCID: PMC6923883 DOI: 10.1186/s12859-019-3252-0] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 12/19/2022] Open
Abstract
BACKGROUND: Skewness is an under-utilized statistical measure that captures the degree of asymmetry in the distribution of any dataset. This study applied a new metric based on skewness to identify regulators or genes that have outlier expression in large patient cohorts.
RESULTS: We investigated whether specific patterns of skewed expression were related to the enrichment of biological pathways or genomic properties like DNA methylation status. Our study used publicly available datasets that were generated using both RNA-sequencing and microarray technology platforms. For comparison, the datasets selected for this study also included different samples derived from control donors and cancer patients. When comparing the shift in expression skewness between cancer and control datasets, we observed an enrichment of pathways related to immune function that reflects an increase towards positive skewness in the cancer relative to control datasets. A significant correlation was also detected between expression skewness and the top 500 genes corresponding to the most significant differential DNA methylation occurring in the promoter regions for four Cancer Genome Atlas cancer cohorts.
CONCLUSIONS: Our results indicate that expression skewness can reveal new insights into transcription based on outlier and asymmetrical behaviour in large patient cohorts.
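To make the statistic named in this abstract concrete, the following is a minimal sketch of the standard moment-based sample skewness, computed per gene across a cohort. It is not the authors' new metric, which the abstract does not specify; the function name `sample_skewness` and the expression values are hypothetical and chosen only for illustration.

```python
def sample_skewness(values):
    """Moment-based (Fisher-Pearson) skewness: m3 / m2**1.5.

    Positive values indicate a longer right tail, negative values a
    longer left tail, and zero a symmetric sample.
    """
    n = len(values)
    if n < 3:
        raise ValueError("need at least three values")
    mean = sum(values) / n
    # Second and third central moments (population form, divide by n).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0:
        raise ValueError("skewness is undefined for constant data")
    return m3 / m2 ** 1.5


# Hypothetical expression values for one gene across five patients.
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
right_tailed = [1.0, 1.0, 1.0, 1.0, 10.0]  # one high-expression outlier

print(sample_skewness(symmetric))
print(sample_skewness(right_tailed))
```

A single high-expression outlier pushes the skewness positive, which is the kind of asymmetry the abstract describes as informative in large cohorts.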
Affiliation(s)
- Benjamin V. Church
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, 10461 NY USA
- Department of Mathematics, Columbia University, 2990 Broadway, New York, 10027 NY USA
- Henry T. Williams
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, 10461 NY USA
- Department of Mathematics, Columbia University, 2990 Broadway, New York, 10027 NY USA
- Jessica C. Mar
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, 10461 NY USA
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, 10461 NY USA
- Australian Institute for Bioengineering and Nanotechnology, The University of Queensland, Brisbane, 4072 QLD Australia
55
Gauthier L, Stynen B, Serohijos AWR, Michnick SW. Genetics' Piece of the PI: Inferring the Origin of Complex Traits and Diseases from Proteome-Wide Protein-Protein Interaction Dynamics. Bioessays 2019; 42:e1900169. [PMID: 31854021 DOI: 10.1002/bies.201900169] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2019] [Revised: 11/15/2019] [Indexed: 11/07/2022]
Abstract
How do common and rare genetic polymorphisms contribute to quantitative traits or disease risk and progression? Multiple human traits have been extensively characterized at the genomic level, revealing their complex genetic architecture. However, it is difficult to resolve the mechanisms by which specific variants contribute to a phenotype. Recently, analyses of variant effects on molecular traits have uncovered intermediate mechanisms that link sequence variation to phenotypic changes. Yet, these methods only capture a fraction of genetic contributions to phenotype. Here, in reviewing the field, it is proposed that complex traits can be understood by characterizing the dynamics of biochemical networks within living cells, and that the effects of genetic variation can be captured on these networks by using protein-protein interaction (PPI) methodologies. This synergy between PPI methodologies and the genetics of complex traits opens new avenues to investigate the molecular etiology of human diseases and to facilitate their prevention or treatment.
Affiliation(s)
- Louis Gauthier
- Departement de Biochimie, Université de Montréal, 2900 Édouard-Montpetit, Montréal, Quebec, H3T 1J4, Canada.,Centre Robert-Cedergren en Bioinformatique et Génomique, Université de Montréal, 2900 Édouard-Montpetit, Montréal, Quebec, H3T 1J4, Canada
- Bram Stynen
- Departement de Biochimie, Université de Montréal, 2900 Édouard-Montpetit, Montréal, Quebec, H3T 1J4, Canada.,Centre Robert-Cedergren en Bioinformatique et Génomique, Université de Montréal, 2900 Édouard-Montpetit, Montréal, Quebec, H3T 1J4, Canada
- Adrian W R Serohijos
- Departement de Biochimie, Université de Montréal, 2900 Édouard-Montpetit, Montréal, Quebec, H3T 1J4, Canada.,Centre Robert-Cedergren en Bioinformatique et Génomique, Université de Montréal, 2900 Édouard-Montpetit, Montréal, Quebec, H3T 1J4, Canada
- Stephen W Michnick
- Departement de Biochimie, Université de Montréal, 2900 Édouard-Montpetit, Montréal, Quebec, H3T 1J4, Canada.,Centre Robert-Cedergren en Bioinformatique et Génomique, Université de Montréal, 2900 Édouard-Montpetit, Montréal, Quebec, H3T 1J4, Canada
56
Xu Y, Wu T, Li F, Dong Q, Wang J, Shang D, Xu Y, Zhang C, Dou Y, Hu C, Yang H, Zheng X, Zhang Y, Wang L, Li X. Identification and comprehensive characterization of lncRNAs with copy number variations and their driving transcriptional perturbed subpathways reveal functional significance for cancer. Brief Bioinform 2019; 21:2153-2166. [PMID: 31792500 DOI: 10.1093/bib/bbz113] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 07/11/2019] [Revised: 08/05/2019] [Accepted: 08/07/2019] [Indexed: 12/17/2022] Open
Abstract
Numerous studies have shown that copy number variation (CNV) in lncRNA regions plays critical roles in the initiation and progression of cancer. However, our knowledge about their functionalities is still limited. Here, we first provide a computational method to identify lncRNAs with copy number variation (lncRNAs-CNV) and their driving transcriptional perturbed subpathways by integrating multidimensional omics data of cancer. The high reliability and accuracy of our method have been demonstrated. Then, the method was applied to 14 cancer types, and a comprehensive characterization and analysis was performed. LncRNAs-CNV had high specificity in cancers, and those with high CNV level may perturb broad biological functions. Some core subpathways and cancer hallmarks widely perturbed by lncRNAs-CNV were revealed. Moreover, subpathways highlighted the functional diversity of lncRNAs-CNV in various cancers. Survival analysis indicated that functional lncRNAs-CNV, such as ST7-AS1, CDKN2B-AS1 and EGFR-AS1, could be candidate prognostic biomarkers for clinical applications. In addition, cascade responses and a functional crosstalk model among lncRNAs-CNV, impacted genes, driving subpathways and cancer hallmarks were proposed for understanding the driving mechanism of lncRNAs-CNV. Finally, we developed a user-friendly web interface, LncCASE (http://bio-bigdata.hrbmu.edu.cn/LncCASE/), for exploring lncRNAs-CNV and their driving subpathways in various cancer types. Our study identified and systematically characterized lncRNAs-CNV and their driving subpathways and presented valuable resources for investigating the functionalities of non-coding variations and the mechanisms of tumorigenesis.
Affiliation(s)
- Yanjun Xu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Tan Wu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Feng Li
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Qun Dong
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Jingwen Wang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Desi Shang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Yingqi Xu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Chunlong Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Yiying Dou
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Congxue Hu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Haixiu Yang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Xuan Zheng
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Yunpeng Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Lihua Wang
- Department of Neurology, The Second Affiliated Hospital, Harbin Medical University, Harbin 150081, China
- Xia Li
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
57
Springer N, de León N, Grotewold E. Challenges of Translating Gene Regulatory Information into Agronomic Improvements. Trends in Plant Science 2019; 24:1075-1082. [PMID: 31377174 DOI: 10.1016/j.tplants.2019.07.004] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Received: 03/27/2019] [Revised: 06/26/2019] [Accepted: 07/05/2019] [Indexed: 06/10/2023]
Abstract
Improvement of agricultural species has exploited the genetic variation responsible for complex quantitative traits. Much of the functional variation is regulatory, in cis-regulatory elements and trans-acting factors that ultimately contribute to gene expression differences. However, the identification of gene regulatory network components that, when modulated, will increase plant productivity or resilience, is challenging, yet essential to provide increased predictive power for genome engineering approaches that are likely to benefit useful traits. Here, we discuss the opportunities and limitations of using data obtained from gene coexpression, transcription factor binding, and genome-wide association mapping analyses to predict regulatory interactions that impact crop improvement. It is apparent that a combination of information from these data types is necessary for the reliable identification and utilization of important regulatory interactions that underlie complex agronomic traits.
Affiliation(s)
- Nathan Springer
- Department of Plant and Microbial Biology, University of Minnesota, St Paul, MN 55108, USA.
- Natalia de León
- Department of Agronomy, University of Wisconsin, Madison, WI 56706, USA
- Erich Grotewold
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA.
58
Turchin MC, Stephens M. Bayesian multivariate reanalysis of large genetic studies identifies many new associations. PLoS Genet 2019; 15:e1008431. [PMID: 31596850 PMCID: PMC6802844 DOI: 10.1371/journal.pgen.1008431] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 05/24/2019] [Revised: 10/21/2019] [Accepted: 09/17/2019] [Indexed: 01/08/2023] Open
Abstract
Genome-wide association studies (GWAS) have now been conducted for hundreds of phenotypes of relevance to human health. Many such GWAS involve multiple closely-related phenotypes collected on the same samples. However, the vast majority of these GWAS have been analyzed using simple univariate analyses, which consider one phenotype at a time. This is despite the fact that, at least in simulation experiments, multivariate analyses have been shown to be more powerful at detecting associations. Here, we conduct multivariate association analyses on 13 different publicly-available GWAS datasets that involve multiple closely-related phenotypes. These data include large studies of anthropometric traits (GIANT), plasma lipid traits (GlobalLipids), and red blood cell traits (HaemgenRBC). Our analyses identify many new associations (433 in total across the 13 studies), many of which replicate when follow-up samples are available. Overall, our results demonstrate that multivariate analyses can help make more effective use of data from both existing and future GWAS.
Genome-wide association studies (GWAS) have become a common and powerful tool for identifying significant correlations between markers of genetic variation and physical traits of interest. Often these studies are conducted by comparing genetic variation against single traits one at a time (‘univariate’); however, it has previously been shown that it is possible to increase the power to detect significant associations by comparing genetic variation against multiple traits simultaneously (‘multivariate’). Despite this apparent increase in power, researchers still rarely conduct multivariate GWAS, even when studies have multiple traits readily available. Here, we reanalyze 13 previously published GWAS using a multivariate method and find >400 additional associations. Our method makes use of univariate GWAS summary statistics and is available as a software package, thus making it accessible to other researchers interested in conducting the same analyses. We also show, using studies that have multiple releases, that our new associations have high rates of replication. Overall, we argue that multivariate approaches in GWAS should no longer be overlooked and that, often, there is low-hanging fruit in the form of new associations to be found by running these methods on data already collected.
Affiliation(s)
- Michael C. Turchin
- Department of Human Genetics, The University of Chicago, Chicago, Illinois, United States of America
- Matthew Stephens
- Department of Human Genetics, The University of Chicago, Chicago, Illinois, United States of America
- Department of Statistics, The University of Chicago, Chicago, Illinois, United States of America
59
Raulerson CK, Ko A, Kidd JC, Currin KW, Brotman SM, Cannon ME, Wu Y, Spracklen CN, Jackson AU, Stringham HM, Welch RP, Fuchsberger C, Locke AE, Narisu N, Lusis AJ, Civelek M, Furey TS, Kuusisto J, Collins FS, Boehnke M, Scott LJ, Lin DY, Love MI, Laakso M, Pajukanta P, Mohlke KL. Adipose Tissue Gene Expression Associations Reveal Hundreds of Candidate Genes for Cardiometabolic Traits. Am J Hum Genet 2019; 105:773-787. [PMID: 31564431 PMCID: PMC6817527 DOI: 10.1016/j.ajhg.2019.09.001] [Citation(s) in RCA: 38] [Impact Index Per Article: 6.3] [Received: 05/29/2019] [Accepted: 09/03/2019] [Indexed: 12/15/2022] Open
Abstract
Genome-wide association studies (GWASs) have identified thousands of genetic loci associated with cardiometabolic traits including type 2 diabetes (T2D), lipid levels, body fat distribution, and adiposity, although most causal genes remain unknown. We used subcutaneous adipose tissue RNA-seq data from 434 Finnish men from the METSIM study to identify 9,687 primary and 2,785 secondary cis-expression quantitative trait loci (eQTL; <1 Mb from TSS, FDR < 1%). Compared to primary eQTL signals, secondary eQTL signals were located further from transcription start sites, had smaller effect sizes, and were less enriched in adipose tissue regulatory elements. Among 2,843 cardiometabolic GWAS signals, 262 colocalized by LD and conditional analysis with 318 transcripts as primary and conditionally distinct secondary cis-eQTLs, including some across ancestries. Of cardiometabolic traits examined for adipose tissue eQTL colocalizations, waist-hip ratio (WHR) and circulating lipid traits had the highest percentage of colocalized eQTLs (15% and 14%, respectively). Among alleles associated with increased cardiometabolic GWAS risk, approximately half (53%) were associated with decreased gene expression level. Mediation analyses of colocalized genes and cardiometabolic traits within the 434 individuals provided further evidence that gene expression influences variant-trait associations. These results identify hundreds of candidate genes that may act in adipose tissue to influence cardiometabolic traits.
Affiliation(s)
- Chelsea K Raulerson
- Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA
- Arthur Ko
- Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA 90095, USA; Department of Medicine, David Geffen School of Medicine at UCLA, Los Angeles, CA 90095, USA
- John C Kidd
- Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599, USA
- Kevin W Currin
- Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA
- Sarah M Brotman
- Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA
- Maren E Cannon
- Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA
- Ying Wu
- Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA
- Anne U Jackson
- Department of Biostatistics and Center for Statistical Genetics, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA
- Heather M Stringham
- Department of Biostatistics and Center for Statistical Genetics, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA
- Ryan P Welch
- Department of Biostatistics and Center for Statistical Genetics, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA
- Christian Fuchsberger
- Center for Biomedicine, European Academy of Bolzano/Bozen, University of Lübeck, Bolzano/Bozen 39100, Italy
- Adam E Locke
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO 63110, USA
- Narisu Narisu
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
- Aldons J Lusis
- Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA 90095, USA
- Mete Civelek
- Department of Biomedical Engineering, University of Virginia, Charlottesville, VA 22908, USA
- Terrence S Furey
- Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA; Department of Biology, University of North Carolina, Chapel Hill, NC 27599, USA
- Johanna Kuusisto
- Institute of Clinical Medicine, Kuopio University Hospital, University of Eastern Finland, Kuopio 70210, Finland
- Francis S Collins
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
- Michael Boehnke
- Department of Biostatistics and Center for Statistical Genetics, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA
- Laura J Scott
- Department of Biostatistics and Center for Statistical Genetics, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA
- Dan-Yu Lin
- Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599, USA
- Michael I Love
- Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA; Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599, USA
- Markku Laakso
- Institute of Clinical Medicine, Kuopio University Hospital, University of Eastern Finland, Kuopio 70210, Finland
- Päivi Pajukanta
- Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA 90095, USA; Institute for Precision Health, David Geffen School of Medicine at UCLA, Los Angeles, CA 90095, USA
- Karen L Mohlke
- Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA.
60
Ramos PS. Epigenetics of scleroderma: Integrating genetic, ethnic, age, and environmental effects. Journal of Scleroderma and Related Disorders 2019; 4:238-250. [PMID: 35382507 PMCID: PMC8922566 DOI: 10.1177/2397198319855872] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 02/08/2019] [Accepted: 05/15/2019] [Indexed: 08/02/2023]
Abstract
Scleroderma or systemic sclerosis is thought to result from the interplay between environmental or non-genetic factors in a genetically susceptible individual. Epigenetic modifications are influenced by genetic variation and environmental exposures, and change with chronological age and between populations. Despite progress in identifying genetic, epigenetic, and environmental risk factors, the underlying mechanism of systemic sclerosis remains unclear. Since epigenetics provides the regulatory mechanism linking genetic and non-genetic factors to gene expression, understanding the role of epigenetic regulation in systemic sclerosis will elucidate how these factors interact to cause systemic sclerosis. Among the cell types under tight epigenetic control and susceptible to epigenetic dysregulation, immune cells are critically involved in early pathogenic events in the progression of fibrosis and systemic sclerosis. This review starts by summarizing the changes in DNA methylation, histone modification, and non-coding RNAs associated with systemic sclerosis. It then discusses the role of genetic, ethnic, age, and environmental effects on epigenetic regulation, with a focus on immune system dysregulation. Given the potential of epigenome editing technologies for cell reprogramming and as a therapeutic approach for durable gene regulation, this review concludes with a prospect on epigenetic editing. Although epigenomics in systemic sclerosis is in its infancy, future studies will help elucidate the regulatory mechanisms underpinning systemic sclerosis and inform the design of targeted epigenetic therapies to control its dysregulation.
Affiliation(s)
- Paula S Ramos
- Paula S. Ramos, Division of Rheumatology and Immunology, Department of Medicine and Department of Public Health Sciences, Medical University of South Carolina, 96 Jonathan Lucas Street, Suite 816, MSC 637, Charleston, SC 29425, USA.
|
61
|
Benaglio P, D'Antonio-Chronowska A, Ma W, Yang F, Young Greenwald WW, Donovan MKR, DeBoever C, Li H, Drees F, Singhal S, Matsui H, van Setten J, Sotoodehnia N, Gaulton KJ, Smith EN, D'Antonio M, Rosenfeld MG, Frazer KA. Allele-specific NKX2-5 binding underlies multiple genetic associations with human electrocardiographic traits. Nat Genet 2019; 51:1506-1517. [PMID: 31570892 PMCID: PMC6858543 DOI: 10.1038/s41588-019-0499-3] [Citation(s) in RCA: 38] [Impact Index Per Article: 6.3] [Received: 02/21/2018] [Accepted: 08/15/2019] [Indexed: 12/15/2022]
Abstract
The cardiac transcription factor (TF) gene NKX2-5 has been associated with electrocardiographic (EKG) traits through genome-wide association studies (GWASs), but the extent to which differential binding of NKX2-5 at common regulatory variants contributes to these traits has not yet been studied. We analyzed transcriptomic and epigenomic data from induced pluripotent stem cell-derived cardiomyocytes from seven related individuals, and identified ~2,000 single-nucleotide variants associated with allele-specific effects (ASE-SNVs) on NKX2-5 binding. NKX2-5 ASE-SNVs were enriched for altered TF motifs, for heart-specific expression quantitative trait loci and for EKG GWAS signals. Using fine-mapping combined with epigenomic data from induced pluripotent stem cell-derived cardiomyocytes, we prioritized candidate causal variants for EKG traits, many of which were NKX2-5 ASE-SNVs. Experimentally characterizing two NKX2-5 ASE-SNVs (rs3807989 and rs590041) showed that they modulate the expression of target genes via differential protein binding in cardiac cells, indicating that they are functional variants underlying EKG GWAS signals. Our results show that differential NKX2-5 binding at numerous regulatory variants across the genome contributes to EKG phenotypes.
Affiliation(s)
- Paola Benaglio
- Department of Pediatrics, Rady Children's Hospital, Division of Genome Information Sciences, University of California, San Diego, La Jolla, CA, USA
- Wubin Ma
- Howard Hughes Medical Institute, Department of Medicine, University of California, San Diego, La Jolla, CA, USA
- Feng Yang
- Howard Hughes Medical Institute, Department of Medicine, University of California, San Diego, La Jolla, CA, USA
- Margaret K R Donovan
- Bioinformatics and Systems Biology, University of California, San Diego, La Jolla, CA, USA; Department of Biomedical Informatics, University of California, San Diego, La Jolla, CA, USA
- Christopher DeBoever
- Bioinformatics and Systems Biology, University of California, San Diego, La Jolla, CA, USA
- He Li
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA, USA
- Frauke Drees
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA, USA
- Sanghamitra Singhal
- Department of Pediatrics, Rady Children's Hospital, Division of Genome Information Sciences, University of California, San Diego, La Jolla, CA, USA
- Hiroko Matsui
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA, USA
- Jessica van Setten
- Department of Cardiology, University Medical Center Utrecht, University of Utrecht, Utrecht, the Netherlands
- Nona Sotoodehnia
- Department of Medicine, Cardiovascular Health Research Unit, Division of Cardiology, University of Washington, Seattle, WA, USA; Department of Epidemiology, Cardiovascular Health Research Unit, Division of Cardiology, University of Washington, Seattle, WA, USA
- Kyle J Gaulton
- Department of Pediatrics, Rady Children's Hospital, Division of Genome Information Sciences, University of California, San Diego, La Jolla, CA, USA
- Erin N Smith
- Department of Pediatrics, Rady Children's Hospital, Division of Genome Information Sciences, University of California, San Diego, La Jolla, CA, USA
- Matteo D'Antonio
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA, USA
- Michael G Rosenfeld
- Howard Hughes Medical Institute, Department of Medicine, University of California, San Diego, La Jolla, CA, USA
- Kelly A Frazer
- Department of Pediatrics, Rady Children's Hospital, Division of Genome Information Sciences, University of California, San Diego, La Jolla, CA, USA; Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA, USA
|
62
|
Metzger BPH, Wittkopp PJ. Compensatory trans-regulatory alleles minimizing variation in TDH3 expression are common within Saccharomyces cerevisiae. Evol Lett 2019; 3:448-461. [PMID: 31636938 PMCID: PMC6791293 DOI: 10.1002/evl3.137] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 06/06/2019] [Revised: 08/07/2019] [Accepted: 08/09/2019] [Indexed: 11/06/2022]
Abstract
Heritable variation in gene expression is common within species. Much of this variation is due to genetic differences outside of the gene with altered expression and is trans-acting. This trans-regulatory variation is often polygenic, with individual variants typically having small effects, making the genetic architecture and evolution of trans-regulatory variation challenging to study. Consequently, key questions about trans-regulatory variation remain, including the variability of trans-regulatory variation within a species, how selection affects trans-regulatory variation, and how trans-regulatory variants are distributed throughout the genome and within a species. To address these questions, we isolated and measured trans-regulatory differences affecting TDH3 promoter activity among 56 strains of Saccharomyces cerevisiae, finding that trans-regulatory backgrounds varied approximately twofold in their effects on TDH3 promoter activity. Comparing this variation to neutral models of trans-regulatory evolution based on empirical measures of mutational effects revealed that despite this variability in the effects of trans-regulatory backgrounds, stabilizing selection has constrained trans-regulatory differences within this species. Using a powerful quantitative trait locus mapping method, we identified ∼100 trans-acting expression quantitative trait loci in each of three crosses to a common reference strain, indicating that regulatory variation is more polygenic than previous studies have suggested. Loci altering expression were located throughout the genome, and many loci were strain specific. This distribution and prevalence of alleles is consistent with recent theories about the genetic architecture of complex traits. In all mapping experiments, the nonreference strain alleles increased and decreased TDH3 promoter activity with similar frequencies, suggesting that stabilizing selection maintained many trans-acting variants with opposing effects. This variation may provide the raw material for compensatory evolution and larger scale regulatory rewiring observed in developmental systems drift among species.
Affiliation(s)
- Brian P H Metzger
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, Michigan 48109; Department of Ecology and Evolution, University of Chicago, Chicago, Illinois 60637
- Patricia J Wittkopp
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, Michigan 48109; Department of Molecular, Cellular, and Developmental Biology, University of Michigan, Ann Arbor, Michigan 48109
|
63
|
Shi W, Fornes O, Wasserman WW. Gene expression models based on transcription factor binding events confer insight into functional cis-regulatory variants. Bioinformatics 2019; 35:2610-2617. [PMID: 30541050 PMCID: PMC6662294 DOI: 10.1093/bioinformatics/bty992] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 12/14/2017] [Revised: 10/17/2018] [Accepted: 12/10/2018] [Indexed: 01/03/2023]
Abstract
MOTIVATION: Deciphering the functional roles of cis-regulatory variants is a critical challenge in genome analysis and interpretation. It has been hypothesized that altered transcription factor (TF) binding events are a central mechanism by which cis-regulatory variants impact gene expression levels. However, we lack a computational framework to understand and quantify such mechanistic contributions. RESULTS: We present TF2Exp, a gene-based framework to predict the impact of altered TF-binding events on gene expression levels. Using data from lymphoblastoid cell lines, TF2Exp models were applied successfully to predict the expression levels of 3196 genes. Alterations within DNase I hypersensitive, CTCF-bound and tissue-specific TF-bound regions were the greatest contributing features to the models. TF2Exp models performed as well as models based on common variants, both in cross-validation and external validation. Combining TF alteration and common variant features can further improve model performance. Unlike variant-based models, TF2Exp models have the unique advantage of evaluating the functional impact of variants in linkage disequilibrium and uncommon variants. We find that adding TF-binding events altered only by uncommon variants could increase the number of predictable genes (R² > 0.05). Taken together, TF2Exp represents a key step towards interpreting the functional roles of cis-regulatory variants in the human genome. AVAILABILITY AND IMPLEMENTATION: The code and model training results are publicly available at https://github.com/wqshi/TF2Exp. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Wenqiang Shi
- Department of Medical Genetics, Centre for Molecular Medicine and Therapeutics, BC Children’s Hospital Research Institute, University of British Columbia, Vancouver, BC, Canada
- Bioinformatics Graduate Program, University of British Columbia, Vancouver, BC, Canada
- Beijing Institute of Microbiology and Epidemiology, Beijing, China
- Oriol Fornes
- Department of Medical Genetics, Centre for Molecular Medicine and Therapeutics, BC Children’s Hospital Research Institute, University of British Columbia, Vancouver, BC, Canada
- Wyeth W Wasserman
- Department of Medical Genetics, Centre for Molecular Medicine and Therapeutics, BC Children’s Hospital Research Institute, University of British Columbia, Vancouver, BC, Canada
|
64
|
Ramos PS, Zimmerman KD, Haddad S, Langefeld CD, Medsger TA, Feghali-Bostwick CA. Integrative analysis of DNA methylation in discordant twins unveils distinct architectures of systemic sclerosis subsets. Clin Epigenetics 2019; 11:58. [PMID: 30947741 PMCID: PMC6449959 DOI: 10.1186/s13148-019-0652-y] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Received: 11/27/2018] [Accepted: 03/11/2019] [Indexed: 02/08/2023]
Abstract
Background: Systemic sclerosis (SSc) is a rare autoimmune fibrosing disease with an incompletely understood genetic and non-genetic etiology. Defining its etiology is important to allow the development of effective predictive, preventative, and therapeutic strategies. We conducted this epigenomic study to investigate the contributions of DNA methylation to the etiology of SSc while minimizing confounding due to genetic heterogeneity. Methods: Genomic methylation in whole blood from 27 twin pairs discordant for SSc was assayed over 450 K CpG sites. In silico integration with reported differentially methylated cytosines, differentially expressed genes, and regulatory annotation was conducted to validate and interpret the results. Results: A total of 153 unique cytosines in limited cutaneous SSc (lcSSc) and 266 distinct sites in diffuse cutaneous SSc (dcSSc) showed suggestive differential methylation levels in affected twins. Integration with available data revealed 76 CpGs that were also differentially methylated in blood cells from lupus patients, suggesting their role as potential epigenetic blood biomarkers of autoimmunity. It also revealed 27 genes with concomitant differential expression in blood from SSc patients, including IFI44L and RSAD2. Regulatory annotation revealed that dcSSc-associated CpGs (but not lcSSc) are enriched at Encyclopedia of DNA Elements-, Roadmap-, and BLUEPRINT-derived regulatory regions, supporting their potential role in disease presentation. Notably, the predominant enrichment of regulatory regions in monocytes and macrophages is consistent with the role of these cells in fibrosis, suggesting that the observed cellular dysregulation might be, at least partly, due to altered epigenetic mechanisms of these cells in dcSSc. Conclusions: These data implicate epigenetic changes in the pathogenesis of SSc and suggest functional mechanisms in SSc etiology.
Electronic supplementary material: The online version of this article (10.1186/s13148-019-0652-y) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Paula S Ramos
- Division of Rheumatology and Immunology, Department of Medicine, Medical University of South Carolina, Charleston, SC, USA; Department of Public Health Sciences, Medical University of South Carolina, Charleston, SC, USA
- Kip D Zimmerman
- Department of Biostatistics and Data Science, Wake Forest School of Medicine, Winston-Salem, NC, USA; Center for Public Health Genomics, Wake Forest School of Medicine, Winston-Salem, NC, USA
- Carl D Langefeld
- Department of Biostatistics and Data Science, Wake Forest School of Medicine, Winston-Salem, NC, USA; Center for Public Health Genomics, Wake Forest School of Medicine, Winston-Salem, NC, USA
- Thomas A Medsger
- Division of Rheumatology and Clinical Immunology, Department of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
- Carol A Feghali-Bostwick
- Division of Rheumatology and Immunology, Department of Medicine, Medical University of South Carolina, Charleston, SC, USA
|
65
|
Shan N, Wang Z, Hou L. Identification of trans-eQTLs using mediation analysis with multiple mediators. BMC Bioinformatics 2019; 20:126. [PMID: 30925861 PMCID: PMC6440281 DOI: 10.1186/s12859-019-2651-6] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Indexed: 12/27/2022]
Abstract
Background: Mapping expression quantitative trait loci (eQTLs) has provided insight into gene regulation. Compared to cis-eQTLs, the regulatory mechanisms of trans-eQTLs are less well understood. Previous studies suggest that trans-eQTLs may regulate expression of remote genes by altering the expression of nearby genes. Trans-association has been studied in mediation analysis with a single mediator. However, prior applications with one mediator are prone to model misspecification due to correlations between genes. Motivated by the observation that trans-eQTLs are more likely to associate with more than one cis-gene than randomly selected SNPs in the GTEx dataset, we developed a computational method to identify trans-eQTLs that are mediated by multiple mediators. Results: We proposed two hypothesis tests for testing the total mediation effect (TME) and the component-wise mediation effects (CME), respectively. We demonstrated in simulation studies that the type I error rates were controlled in both tests despite model misspecification. The TME test was more powerful than the CME test when the two mediation effects are in the same direction, while the CME test was more powerful than the TME test when the two mediation effects are in opposite directions. Multiple mediator analysis had increased power to detect mediated trans-eQTLs, especially in large samples. In the HapMap3 data, we identified 11 mediated trans-eQTLs that were not detected by the single mediator analysis in the combined samples of African populations. Moreover, the mediated trans-eQTLs in the HapMap3 samples are more likely to be trait-associated SNPs. In terms of computation, although there is no limit on the number of mediators in our model, analysis takes more time as mediators are added. In the analysis of the HapMap3 samples, we included at most 5 cis-gene mediators. The majority of the trios we considered have one or two mediators. Conclusions: Trans-eQTLs are more likely to associate with multiple cis-genes than randomly selected SNPs. Mediation analysis with multiple mediators improves the power to identify mediated trans-eQTLs, especially in large samples.
Electronic supplementary material: The online version of this article (10.1186/s12859-019-2651-6) contains supplementary material, which is available to authorized users.
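As an illustration of the quantities this abstract refers to, the following is a minimal sketch of a product-of-coefficients estimate with two mediators on simulated data. It is not the authors' TME or CME test: the function name `total_mediation_effect`, the effect sizes, and the simulated genotypes are all assumptions made for the example, and no hypothesis test or standard error is computed. The two simulated mediation paths have opposite signs, the situation in which the abstract reports that a component-wise view is more informative than the total.

```python
import numpy as np

def total_mediation_effect(snp, mediators, outcome):
    """Hypothetical helper: product-of-coefficients mediation estimates.

    snp: (n,) genotype dosages; mediators: (n, k) cis-gene expression;
    outcome: (n,) trans-gene expression.
    Returns (per-mediator effects a_k * b_k, their sum).
    """
    ones = np.ones(snp.shape[0])
    # Path a_k: regress every mediator on the SNP (with an intercept).
    X_a = np.column_stack([ones, snp])
    coef_a, *_ = np.linalg.lstsq(X_a, mediators, rcond=None)
    a = coef_a[1]                      # SNP slope for each mediator, shape (k,)
    # Path b_k: regress the outcome on the SNP and all mediators jointly.
    X_b = np.column_stack([ones, snp, mediators])
    coef_b, *_ = np.linalg.lstsq(X_b, outcome, rcond=None)
    b = coef_b[2:]                     # mediator slopes, shape (k,)
    components = a * b
    return components, components.sum()

# Simulated data (assumed effect sizes, chosen only for illustration).
rng = np.random.default_rng(0)
n = 5000
snp = rng.binomial(2, 0.3, size=n).astype(float)
m1 = 0.5 * snp + rng.normal(size=n)           # first cis-gene mediator
m2 = -0.4 * snp + rng.normal(size=n)          # second cis-gene mediator
y = 0.6 * m1 + 0.5 * m2 + rng.normal(size=n)  # trans gene, no direct SNP effect

components, total = total_mediation_effect(snp, np.column_stack([m1, m2]), y)
print("component-wise effects:", components)
print("total mediation effect:", total)
```

In this simulation the true component effects are 0.5 × 0.6 = 0.3 and −0.4 × 0.5 = −0.2, so they partly cancel in the total (0.1) while remaining clearly non-zero individually.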
Affiliation(s)
- Nayang Shan
- Center for Statistical Science, Tsinghua University, Beijing, 100084, China; Department of Industrial Engineering, Tsinghua University, Beijing, 100084, China
- Zuoheng Wang
- Department of Biostatistics, Yale School of Public Health, New Haven, CT, 06510, USA
- Lin Hou
- Center for Statistical Science, Tsinghua University, Beijing, 100084, China; Department of Industrial Engineering, Tsinghua University, Beijing, 100084, China; MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing, 100084, China
|
66
|
Baker CL, Walker M, Arat S, Ananda G, Petkova P, Powers NR, Tian H, Spruce C, Ji B, Rausch D, Choi K, Petkov PM, Carter GW, Paigen K. Tissue-Specific Trans Regulation of the Mouse Epigenome. Genetics 2019; 211:831-845. [PMID: 30593494 PMCID: PMC6404261 DOI: 10.1534/genetics.118.301697] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 10/15/2018] [Accepted: 12/15/2018] [Indexed: 11/18/2022]
Abstract
The epigenetic landscape varies greatly among cell types. Although a variety of writers, readers, and erasers of epigenetic features are known, we have little information about the underlying regulatory systems controlling the establishment and maintenance of these features. Here, we have explored how natural genetic variation affects the epigenome in mice. Studying levels of H3K4me3, a histone modification at sites such as promoters, enhancers, and recombination hotspots, we found tissue-specific trans-regulation of H3K4me3 levels in four highly diverse cell types: male germ cells, embryonic stem cells, hepatocytes, and cardiomyocytes. To identify the genetic loci involved, we measured H3K4me3 levels in male germ cells in a mapping population of 59 BXD recombinant inbred lines. We found extensive trans-regulation of H3K4me3 peaks, including six major histone quantitative trait loci (QTL). These chromatin regulatory loci act dominantly to suppress H3K4me3, which at hotspots reduces the likelihood of subsequent DNA double-strand breaks. QTL locations do not correspond with genes encoding enzymes known to metabolize chromatin features. Instead, their locations match clusters of zinc finger genes, making these possible candidates that explain the dominant suppression of H3K4me3. Collectively, these data describe an extensive set of chromatin regulatory loci that control the epigenetic landscape.
Affiliation(s)
- Seda Arat
- The Jackson Laboratory, Bar Harbor, Maine 04609
- Hui Tian
- The Jackson Laboratory, Bar Harbor, Maine 04609
- Bo Ji
- The Jackson Laboratory, Bar Harbor, Maine 04609
|
67
|
Ravindran SP, Herrmann M, Cordellier M. Contrasting patterns of divergence at the regulatory and sequence level in European Daphnia galeata natural populations. Ecol Evol 2019; 9:2487-2504. [PMID: 30891195 PMCID: PMC6405927 DOI: 10.1002/ece3.4894] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 09/29/2018] [Revised: 12/05/2018] [Accepted: 12/13/2018] [Indexed: 12/30/2022]
Abstract
Understanding the genetic basis of local adaptation has long been a focus of evolutionary biology. Recently, there has been increased interest in deciphering the evolutionary role of Daphnia's plasticity and the molecular mechanisms of local adaptation. Using transcriptome data, we assessed the differences in gene expression profiles and sequences in four European Daphnia galeata populations. In total, ~33% of 32,903 transcripts were differentially expressed between populations. Among 10,280 differentially expressed transcripts, 5,209 transcripts deviated from neutral expectations and their population-specific expression pattern is likely the result of local adaptation processes. Furthermore, a SNP analysis allowed inferring population structure and distribution of genetic variation. Population divergence at the sequence level was higher than at the gene expression level by several orders of magnitude, consistent with strong founder effects and a lack of gene flow between populations. Using sequence homology, the candidate transcripts were annotated using a comparative genomics approach. Additionally, we performed a weighted gene co-expression analysis to identify population-specific regulatory patterns of transcripts in D. galeata. Thus, we identified candidate transcriptomic regions for local adaptation in this key species of aquatic ecosystems in the absence of any laboratory-induced stressor.
Affiliation(s)
- Maike Herrmann
- Department of Veterinary Medicine, Paul‐Ehrlich‐Institut, Langen, Germany
|
68
|
Hatcher C, Relton CL, Gaunt TR, Richardson TG. Leveraging brain cortex-derived molecular data to elucidate epigenetic and transcriptomic drivers of complex traits and disease. Transl Psychiatry 2019; 9:105. [PMID: 30820025 PMCID: PMC6395652 DOI: 10.1038/s41398-019-0437-2] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 11/19/2018] [Revised: 01/21/2019] [Accepted: 01/24/2019] [Indexed: 12/23/2022]
Abstract
Integrative approaches that harness large-scale molecular datasets can help develop mechanistic insight into findings from genome-wide association studies (GWAS). We have performed extensive analyses to uncover transcriptional and epigenetic processes which may play a role in complex trait variation. This was undertaken by applying Bayesian multiple-trait colocalization systematically across the genome to identify genetic variants responsible for influencing intermediate molecular phenotypes as well as complex traits. In this analysis, we leveraged high-dimensional quantitative trait loci data derived from the prefrontal cortex tissue (concerning gene expression, DNA methylation and histone acetylation) and GWAS findings for five complex traits (Neuroticism, Schizophrenia, Educational Attainment, Insomnia and Alzheimer's disease). There was evidence of colocalization for 118 associations, suggesting that the same underlying genetic variant influenced both nearby gene expression as well as complex trait variation. Of these, 73 associations provided evidence that the genetic variant also influenced proximal DNA methylation and/or histone acetylation. These findings support previous evidence at loci where epigenetic mechanisms may putatively mediate effects of genetic variants on traits, such as KLC1 and schizophrenia. We also uncovered evidence implicating novel loci in disease susceptibility, including genes expressed predominantly in the brain tissue, such as MDGA1, KIRREL3 and SLC12A5. An inverse relationship between DNA methylation and gene expression was observed more than can be accounted for by chance, supporting previous findings implicating DNA methylation as a transcriptional repressor. Our study should prove valuable in helping future studies prioritize candidate genes and epigenetic mechanisms for in-depth functional follow-up analyses.
Affiliation(s)
- Charlie Hatcher
- MRC Integrative Epidemiology Unit, Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
- Caroline L Relton
- MRC Integrative Epidemiology Unit, Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
- Tom R Gaunt
- MRC Integrative Epidemiology Unit, Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
- Tom G Richardson
- MRC Integrative Epidemiology Unit, Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
|
69
|
Ali S, Signor SA, Kozlov K, Nuzhdin SV. Novel approach to quantitative spatial gene expression uncovers genetic stochasticity in the developing Drosophila eye. Evol Dev 2019; 21:157-171. [PMID: 30756455 DOI: 10.1111/ede.12283] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 12/30/2022]
Abstract
Robustness in development allows for the accumulation of genetically based variation in expression. However, this variation is usually examined in response to large perturbations, and examination of this variation has been limited to being spatial, or quantitative, but because of technical restrictions not both. Here we bridge these gaps by investigating replicated quantitative spatial gene expression using rigorous statistical models, in different genotypes, sexes, and species (Drosophila melanogaster and D. simulans). Using this type of quantitative approach with molecular developmental data allows for comparison among conditions, such as different genetic backgrounds. We apply this approach to the morphogenetic furrow, a wave of differentiation that patterns the developing eye disc. Within the morphogenetic furrow, we focus on four genes, hairy, atonal, hedgehog, and Delta. Hybridization chain reaction quantitatively measures spatial gene expression, co-staining for all four genes simultaneously. We find considerable variation in the spatial expression pattern of these genes in the eye between species, genotypes, and sexes. We also find that there has been evolution of the regulatory relationship between these genes, and that their spatial interrelationships have evolved between species. This variation has no phenotypic effect, and could be buffered by network thresholds or compensation from other genes. Both of these mechanisms could potentially be contributing to long term developmental systems drift.
Affiliation(s)
- Sammi Ali
- Molecular and Computational Biology, University of Southern California, Los Angeles, California
- Sarah A Signor
- Molecular and Computational Biology, University of Southern California, Los Angeles, California
- Konstantin Kozlov
- Department of Applied Mathematics, St. Petersburg State Polytechnic University, St. Petersburg, Russia
- Sergey V Nuzhdin
- Molecular and Computational Biology, University of Southern California, Los Angeles, California; Department of Applied Mathematics, St. Petersburg State Polytechnic University, St. Petersburg, Russia
|
70
|
Husquin LT, Rotival M, Fagny M, Quach H, Zidane N, McEwen LM, MacIsaac JL, Kobor MS, Aschard H, Patin E, Quintana-Murci L. Exploring the genetic basis of human population differences in DNA methylation and their causal impact on immune gene regulation. Genome Biol 2018; 19:222. [PMID: 30563547 PMCID: PMC6299574 DOI: 10.1186/s13059-018-1601-3] [Citation(s) in RCA: 84] [Impact Index Per Article: 12.0] [Received: 06/14/2018] [Accepted: 12/04/2018] [Indexed: 12/12/2022]
Abstract
BACKGROUND: DNA methylation is influenced by both environmental and genetic factors and is increasingly thought to affect variation in complex traits and diseases. Yet, the extent of ancestry-related differences in DNA methylation, their genetic determinants, and their respective causal impact on immune gene regulation remain elusive. RESULTS: We report extensive population differences in DNA methylation between 156 individuals of African and European descent, detected in primary monocytes that are used as a model of a major innate immunity cell type. Most of these differences (~70%) are driven by DNA sequence variants near CpG sites, which account for ~60% of the variance in DNA methylation. We also identify several master regulators of DNA methylation variation in trans, including a regulatory hub near the transcription factor-encoding CTCF gene, which contributes markedly to ancestry-related differences in DNA methylation. Furthermore, we establish that variation in DNA methylation is associated with varying gene expression levels following mostly, but not exclusively, a canonical model of negative associations, particularly in enhancer regions. Specifically, we find that DNA methylation highly correlates with transcriptional activity of 811 and 230 genes, at the basal state and upon immune stimulation, respectively. Finally, using a Bayesian approach, we estimate causal mediation effects of DNA methylation on gene expression in ~20% of the studied cases, indicating that DNA methylation can play an active role in immune gene regulation. CONCLUSION: Using a system-level approach, our study reveals substantial ancestry-related differences in DNA methylation and provides evidence for their causal impact on immune gene regulation.
Affiliation(s)
- Lucas T. Husquin
- Unit of Human Evolutionary Genetics, Institut Pasteur, 75015 Paris, France
- Centre National de la Recherche Scientifique (CNRS) UMR2000, 75015 Paris, France
- Center of Bioinformatics, Biostatistics and Integrative Biology, Institut Pasteur, 75015 Paris, France
| | - Maxime Rotival
- Unit of Human Evolutionary Genetics, Institut Pasteur, 75015 Paris, France
- Centre National de la Recherche Scientifique (CNRS) UMR2000, 75015 Paris, France
- Center of Bioinformatics, Biostatistics and Integrative Biology, Institut Pasteur, 75015 Paris, France
| | - Maud Fagny
- Laboratory for Epigenetics & Environment, Centre National de Recherche en Génomique Humaine (CNRGH), CEA-Institut de Biologie François Jacob, 91000 Evry, France
| | - Hélène Quach
- Unit of Human Evolutionary Genetics, Institut Pasteur, 75015 Paris, France
- Centre National de la Recherche Scientifique (CNRS) UMR2000, 75015 Paris, France
- Center of Bioinformatics, Biostatistics and Integrative Biology, Institut Pasteur, 75015 Paris, France
| | - Nora Zidane
- Unit of Human Evolutionary Genetics, Institut Pasteur, 75015 Paris, France
- Centre National de la Recherche Scientifique (CNRS) UMR2000, 75015 Paris, France
- Center of Bioinformatics, Biostatistics and Integrative Biology, Institut Pasteur, 75015 Paris, France
| | - Lisa M. McEwen
- Department of Medical Genetics, University of British Columbia, Centre for Molecular Medicine and Therapeutics, BC Children’s Hospital Research Institute, Vancouver, BC Canada
| | - Julia L. MacIsaac
- Department of Medical Genetics, University of British Columbia, Centre for Molecular Medicine and Therapeutics, BC Children’s Hospital Research Institute, Vancouver, BC Canada
| | - Michael S. Kobor
- Department of Medical Genetics, University of British Columbia, Centre for Molecular Medicine and Therapeutics, BC Children’s Hospital Research Institute, Vancouver, BC Canada
| | - Hugues Aschard
- Center of Bioinformatics, Biostatistics and Integrative Biology, Institut Pasteur, 75015 Paris, France
| | - Etienne Patin
- Unit of Human Evolutionary Genetics, Institut Pasteur, 75015 Paris, France
- Centre National de la Recherche Scientifique (CNRS) UMR2000, 75015 Paris, France
- Center of Bioinformatics, Biostatistics and Integrative Biology, Institut Pasteur, 75015 Paris, France
| | - Lluis Quintana-Murci
- Unit of Human Evolutionary Genetics, Institut Pasteur, 75015 Paris, France
- Centre National de la Recherche Scientifique (CNRS) UMR2000, 75015 Paris, France
- Center of Bioinformatics, Biostatistics and Integrative Biology, Institut Pasteur, 75015 Paris, France
|
71
|
Li Q, Cassese A, Guindani M, Vannucci M. Bayesian negative binomial mixture regression models for the analysis of sequence count and methylation data. Biometrics 2018; 75:183-192. [DOI: 10.1111/biom.12962] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 09/01/2017] [Revised: 05/01/2018] [Accepted: 07/01/2018] [Indexed: 02/01/2023]
Affiliation(s)
- Qiwei Li
- Department of Clinical Sciences, University of Texas Southwestern Medical Center, Dallas, Texas, U.S.A
| | - Alberto Cassese
- Department of Methodology and Statistics, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, The Netherlands
| | - Michele Guindani
- Department of Statistics, University of California, Irvine, California, U.S.A
|
72
|
Zadel M, Maver A, Kovanda A, Peterlin B. DNA Methylation Profiles in Whole Blood of Huntington's Disease Patients. Front Neurol 2018; 9:655. [PMID: 30158895 PMCID: PMC6104454 DOI: 10.3389/fneur.2018.00655] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 04/26/2018] [Accepted: 07/23/2018] [Indexed: 12/14/2022]
Abstract
Epigenetic mechanisms, especially DNA methylation, are suggested to play a role in the age of onset of Huntington's disease (HD), based on studies of patient brains and of cellular and animal models. Methylation is tissue-specific, and it is not clear how HD-specific methylation in the brain correlates with the blood compartment, which represents a much more clinically accessible sample. Therefore, we explored the presence of HD-specific DNA methylation patterns in whole blood in a cohort of HD mutation carriers (HDM) and healthy controls from Slovenia. We compared CpG site-specific DNA methylation in whole blood of 11 symptomatic and 9 pre-symptomatic HDM and 15 healthy controls, using bisulfite-converted DNA on the Infinium® Human Methylation27 BeadChip microarray (Illumina) covering 27,578 CpG sites and 14,495 genes. Of the examined 14,495 genes, 437 were differentially methylated (p < 0.01) in pre-symptomatic HDM compared to controls, with three genes (CLDN16, DDC, NXT2) retaining statistical significance after the correction for multiple testing (false discovery rate, FDR < 0.05). Comparisons between symptomatic HDM and controls, and between symptomatic and pre-symptomatic HDM, further identified 260 and 198 differentially methylated genes (p < 0.01), respectively, whereas the comparison of all HDM (symptomatic and pre-symptomatic) with healthy controls identified 326 differentially methylated genes (p < 0.01). However, none of these changes retained significance (FDR < 0.05) after the correction for multiple testing. The results of our study suggest that methylation signatures in the blood compartment are not robust enough to serve as valuable biomarkers for predicting HD progression, but recognizable changes in methylation deserve further research.
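The abstract above reports results before and after a false discovery rate correction but does not name the procedure. As a minimal sketch, assuming the widely used Benjamini-Hochberg step-up method (an assumption, not something the abstract confirms), adjusted p-values can be computed as follows. The function name `benjamini_hochberg` is illustrative only.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Assumption: the FDR correction is Benjamini-Hochberg; the source
    abstract only says "false discovery rate".
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.005]
    print(benjamini_hochberg(raw))
```

A gene would then be called significant at FDR < 0.05 when its adjusted value falls below 0.05, which illustrates how hundreds of nominal hits at p < 0.01 can shrink to a handful after correction.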
Affiliation(s)
- Maja Zadel
- Clinical Institute of Medical Genetics, University Medical Centre Ljubljana, Ljubljana, Slovenia.,Community Health Centre Ljubljana, Ljubljana, Slovenia
| | - Aleš Maver
- Clinical Institute of Medical Genetics, University Medical Centre Ljubljana, Ljubljana, Slovenia
| | - Anja Kovanda
- Clinical Institute of Medical Genetics, University Medical Centre Ljubljana, Ljubljana, Slovenia
| | - Borut Peterlin
- Clinical Institute of Medical Genetics, University Medical Centre Ljubljana, Ljubljana, Slovenia
|
73
|
Realizing the significance of noncoding functionality in clinical genomics. Exp Mol Med 2018; 50:1-8. [PMID: 30089779 PMCID: PMC6082831 DOI: 10.1038/s12276-018-0087-0] [Citation(s) in RCA: 66] [Impact Index Per Article: 9.4] [Received: 01/21/2018] [Revised: 03/04/2018] [Accepted: 03/09/2018] [Indexed: 12/14/2022]
Abstract
Clinical genomics promises unprecedented precision in understanding the genetic basis of disease. Understanding the impact of variation across the genome is required to realize this potential. Currently, clinical genomics analyses focus on protein-coding genes. However, the noncoding genome is substantially larger than the protein-coding counterpart, and contains structural, regulatory, and transcribed information that needs to be incorporated into genome annotations if the full extent of the opportunity to use genomic information in healthcare is to be realized. This article reviews the challenges and opportunities in unlocking the clinical significance of coding and noncoding genomic information and translating its utility in practice. Most of the DNA in the genome does not consist of genes that code for proteins, and understanding the function of these less examined parts of our genetic material is essential to fully understand human development and disease. Brian Gloss and Marcel Dinger at the Garvan Institute of Medical Research in Sydney, Australia, review the challenges and opportunities in unraveling the clinical significance of all parts of our DNA. Many regions of DNA that do not encode protein molecules perform crucial functions in regulating the activity and interactions of the protein-coding genes. Variations in these regions may significantly influence the risks and causes of disease. Studying all parts of the genome will be critical for ensuring that the powerful modern techniques of genetic analysis have maximal impact on healthcare.
|
74
|
Ben Hamda C, Sangeda R, Mwita L, Meintjes A, Nkya S, Panji S, Mulder N, Guizani-Tabbane L, Benkahla A, Makani J, Ghedira K, H3ABioNet Consortium. A common molecular signature of patients with sickle cell disease revealed by microarray meta-analysis and a genome-wide association study. PLoS One 2018; 13:e0199461. [PMID: 29979707 PMCID: PMC6034806 DOI: 10.1371/journal.pone.0199461] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 01/03/2018] [Accepted: 06/07/2018] [Indexed: 12/16/2022]
Abstract
A chronic inflammatory state to a large extent explains sickle cell disease (SCD) pathophysiology. Nonetheless, the principal dysregulated factors affecting this major pathway and their mechanisms of action still have to be fully identified and elucidated. Integrating gene expression and genome-wide association study (GWAS) data analysis represents a novel approach to refining the identification of key mediators and functions in complex diseases. Here, we performed gene expression meta-analysis of five independent publicly available microarray datasets related to homozygous SS patients with SCD to identify a consensus SCD transcriptomic profile. The meta-analysis conducted using the MetaDE R package based on combining p values (maxP approach) identified 335 differentially expressed genes (DEGs; 224 upregulated and 111 downregulated). Functional gene set enrichment revealed the importance of several metabolic pathways, of innate immune responses, erythrocyte development, and hemostasis pathways. Advanced analyses of GWAS data generated within the framework of this study by means of the atSNP R package and SIFT tool identified 60 regulatory single-nucleotide polymorphisms (rSNPs) occurring in the promoter of 20 DEGs and a deleterious SNP, affecting CAMKK2 protein function. This novel database of candidate genes, transcription factors, and rSNPs associated with SCD provides new markers that may help to identify new therapeutic targets.
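The maxP approach mentioned above combines per-study p-values by taking their maximum, so a gene is flagged only when it is differentially expressed in every study. A minimal sketch of the underlying arithmetic follows, assuming K independent studies whose p-values are uniform under the null, in which case the maximum has null distribution P(max ≤ x) = x^K. The function name `max_p_combine` is hypothetical, and this does not reproduce the MetaDE package's interface or its permutation-based options.

```python
def max_p_combine(p_values):
    """Combine independent per-study p-values with the maxP statistic.

    Assumes the p-values are independent and uniform on [0, 1] under the
    null, so the maximum of K of them has CDF x**K.
    """
    k = len(p_values)
    if k == 0:
        raise ValueError("at least one p-value is required")
    if any(p < 0.0 or p > 1.0 for p in p_values):
        raise ValueError("p-values must lie in [0, 1]")
    return max(p_values) ** k


if __name__ == "__main__":
    # Hypothetical p-values for one gene across five datasets.
    print(max_p_combine([0.01, 0.02, 0.03, 0.04, 0.05]))
```

Because the statistic is driven by the weakest study, a single non-significant dataset is enough to keep a gene off the consensus list, which matches the goal of a consensus transcriptomic profile.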
Affiliation(s)
- Cherif Ben Hamda
- Laboratory of Bioinformatics, Biomathematics and Biostatistics, Institute Pasteur of Tunis, Tunis, Tunisia
- University of Tunis El Manar, Tunis, Tunisia
- Faculty of Science of Bizerte, Jarzouna, University of Carthage, Tunisia
| | - Raphael Sangeda
- Muhimbili University of Health and Allied Sciences, Dar es Salaam, Tanzania
| | - Liberata Mwita
- Muhimbili University of Health and Allied Sciences, Dar es Salaam, Tanzania
| | | | - Siana Nkya
- Muhimbili University of Health and Allied Sciences, Dar es Salaam, Tanzania
| | - Sumir Panji
- University of Cape Town, Cape Town, South Africa
| | | | - Lamia Guizani-Tabbane
- University of Tunis El Manar, Tunis, Tunisia
- Laboratory of Medical Parasitology, Biotechnology and Biomolecules, Institute Pasteur of Tunis, Tunis, Tunisia
| | - Alia Benkahla
- Laboratory of Bioinformatics, Biomathematics and Biostatistics, Institute Pasteur of Tunis, Tunis, Tunisia
- University of Tunis El Manar, Tunis, Tunisia
| | - Julie Makani
- Faculty of Science of Bizerte, Jarzouna, University of Carthage, Tunisia
| | - Kais Ghedira
- Laboratory of Bioinformatics, Biomathematics and Biostatistics, Institute Pasteur of Tunis, Tunis, Tunisia
- University of Tunis El Manar, Tunis, Tunisia
|
75
|
Gu D, Zheng R, Xin J, Li S, Chu H, Gong W, Qiang F, Zhang Z, Wang M, Du M, Chen J. Evaluation of GWAS-Identified Genetic Variants for Gastric Cancer Survival. EBioMedicine 2018; 33:82-87. [PMID: 29983348 PMCID: PMC6085567 DOI: 10.1016/j.ebiom.2018.06.028] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 04/18/2018] [Revised: 06/03/2018] [Accepted: 06/22/2018] [Indexed: 12/31/2022]
Abstract
BACKGROUND Genome-wide association studies (GWASs) have identified several gastric cancer (GC) susceptibility loci in Asians, but their effects on disease outcome are still unknown. This study aimed to investigate whether these GWAS-identified genetic variants could serve as robust prognostic biomarkers for GC. METHODS A multistage clinical cohort, including a total of 2432 GC patients in the Chinese population, was used to identify the association between GWAS-identified risk variants and overall survival of GC. Hazard ratios (HRs) and 95% confidence intervals (CIs) were computed by Cox regression analysis, and the log-rank P was calculated by the log-rank test with the Kaplan-Meier method. RESULTS We found that rs2274223 A>G in PLCE1 was associated with increased GC survival in the training set (P = .011); this association was independently replicated in validation set 1 (P = .045), but not in validation set 2. The area under the curve (AUC) from the receiver operating characteristic (ROC) curve showed that this clinical relevance depended on age of onset and was strongest in the subgroup of early-onset cases. Moreover, a significant improvement in overall survival prediction was identified when the rs2274223 genetic effect was included in the estimation; this result was also supported by the prognostic nomogram. In addition, patients with lower expression of PLCE1 showed longer survival, potentially due to the functional effect of rs2274223. INTERPRETATION This preliminary study suggests that a GWAS-identified genetic variant in PLCE1 may serve as a potential biomarker for GC survival. Additional replication with larger sample sizes is warranted for further investigation.
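The survival comparison above rests on the Kaplan-Meier estimator, which multiplies, at each observed death time, the fraction of at-risk patients who survive it. The following is a minimal sketch of that product-limit calculation on made-up data; the function name `kaplan_meier` and the toy inputs are illustrative assumptions and are not taken from the study, which would also stratify by genotype and apply the log-rank test.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with at least one death.

    times  : follow-up time for each patient
    events : 1 if the patient died at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set.
        n_at_risk -= leaving
    return curve


if __name__ == "__main__":
    # Hypothetical follow-up in months; the second patient is censored.
    print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
```

Running the estimator separately for carriers and non-carriers of a variant gives the two curves that a log-rank test would then compare.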
Affiliation(s)
- Dongying Gu
- Department of Oncology, Nanjing First Hospital, Nanjing Medical University, Nanjing, China
| | - Rui Zheng
- Department of Environmental Genomics, Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing, China; Department of Genetic Toxicology, The Key Laboratory of Modern Toxicology of Ministry of Education, School of Public Health, Nanjing Medical University, Nanjing, China
| | - Junyi Xin
- Department of Environmental Genomics, Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing, China; Department of Genetic Toxicology, The Key Laboratory of Modern Toxicology of Ministry of Education, School of Public Health, Nanjing Medical University, Nanjing, China
| | - Shuwei Li
- Department of Environmental Genomics, Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing, China; Department of Genetic Toxicology, The Key Laboratory of Modern Toxicology of Ministry of Education, School of Public Health, Nanjing Medical University, Nanjing, China
| | - Haiyan Chu
- Department of Environmental Genomics, Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing, China; Department of Genetic Toxicology, The Key Laboratory of Modern Toxicology of Ministry of Education, School of Public Health, Nanjing Medical University, Nanjing, China
| | - Weida Gong
- Department of Surgery, Yixing Cancer Hospital, Yixing, China
| | - Fulin Qiang
- Core Laboratory, Nantong Tumor Hospital, Nantong, China
| | - Zhengdong Zhang
- Department of Environmental Genomics, Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing, China; Department of Genetic Toxicology, The Key Laboratory of Modern Toxicology of Ministry of Education, School of Public Health, Nanjing Medical University, Nanjing, China
| | - Meilin Wang
- Department of Environmental Genomics, Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing, China; Department of Genetic Toxicology, The Key Laboratory of Modern Toxicology of Ministry of Education, School of Public Health, Nanjing Medical University, Nanjing, China.
| | - Mulong Du
- Department of Environmental Genomics, Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing, China; Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, China.
| | - Jinfei Chen
- Department of Oncology, Nanjing First Hospital, Nanjing Medical University, Nanjing, China.
|
76
|
Su YR, Di C, Bien S, Huang L, Dong X, Abecasis G, Berndt S, Bezieau S, Brenner H, Caan B, Casey G, Chang-Claude J, Chanock S, Chen S, Connolly C, Curtis K, Figueiredo J, Gala M, Gallinger S, Harrison T, Hoffmeister M, Hopper J, Huyghe JR, Jenkins M, Joshi A, Le Marchand L, Newcomb P, Nickerson D, Potter J, Schoen R, Slattery M, White E, Zanke B, Peters U, Hsu L. A Mixed-Effects Model for Powerful Association Tests in Integrative Functional Genomics. Am J Hum Genet 2018; 102:904-919. [PMID: 29727690 PMCID: PMC5986723 DOI: 10.1016/j.ajhg.2018.03.019] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Received: 08/23/2017] [Accepted: 03/15/2018] [Indexed: 01/05/2023]
Abstract
Genome-wide association studies (GWASs) have successfully identified thousands of genetic variants for many complex diseases; however, these variants explain only a small fraction of the heritability. Recently, genetic association studies that leverage external transcriptome data have received much attention and shown promise for discovering novel variants. One such approach, PrediXcan, is to use predicted gene expression through genetic regulation. However, there are limitations in this approach. The predicted gene expression may be biased, resulting from regularized regression applied to moderately sample-sized reference studies. Further, some variants can individually influence disease risk through alternative functional mechanisms besides expression. Thus, testing only the association of predicted gene expression as proposed in PrediXcan will potentially lose power. To tackle these challenges, we consider a unified mixed effects model that formulates the association of intermediate phenotypes such as imputed gene expression through fixed effects, while allowing residual effects of individual variants to be random. We consider a set-based score testing framework, MiST (mixed effects score test), and propose two data-driven combination approaches to jointly test for the fixed and random effects. We establish the asymptotic distributions, which enable rapid calculation of p values for genome-wide analyses, and provide p values for fixed and random effects separately to enhance interpretability over GWASs. Extensive simulations demonstrate that our approaches are more powerful than existing ones. We apply our approach to a large-scale GWAS of colorectal cancer and identify two genes, POU5F1B and ATF1, which would have otherwise been missed by PrediXcan, after adjusting for all known loci.
Affiliation(s)
- Yu-Ru Su
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA.
| | - Chongzhi Di
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
| | - Stephanie Bien
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
| | - Licai Huang
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
| | - Xinyuan Dong
- Department of Biostatistics, University of Washington, Seattle, WA 98195, USA
| | - Goncalo Abecasis
- Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA
| | - Sonja Berndt
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Rockville, MD 20850, USA
| | - Stephane Bezieau
- Service de Génétique Médicale Centre Hospitalier Universitaire (CHU) Nantes, Nantes 44093, France
| | - Hermann Brenner
- Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Heidelberg 69120, Germany
| | - Bette Caan
- Division of Research, Kaiser Permanente Medical Care Program of Northern California, Oakland, CA 94612, USA
| | - Graham Casey
- Public Health Sciences Division, University of Virginia, Charlottesville, VA 22908, USA
| | - Jenny Chang-Claude
- Division of Cancer Epidemiology, German Cancer Research Center, Heidelberg 69009, Germany
| | - Stephen Chanock
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Rockville, MD 20850, USA
| | - Sai Chen
- Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Charles Connolly
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
| | - Keith Curtis
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
| | - Jane Figueiredo
- Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA
| | - Manish Gala
- Division of Gastroenterology, Massachusetts General Hospital, Boston, MA 02114, USA
| | - Steven Gallinger
- Department of Surgery, Mount Sinai Hospital, Toronto, ON M5G 1X5, Canada
| | - Tabitha Harrison
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
| | - Michael Hoffmeister
- Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Heidelberg 69120, Germany
| | - John Hopper
- Melbourne School of Population Health, The University of Melbourne, Carlton, VIC 3010, Australia
| | - Jeroen R Huyghe
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
| | - Mark Jenkins
- Melbourne School of Population Health, The University of Melbourne, Carlton, VIC 3010, Australia
| | - Amit Joshi
- Department of Epidemiology, Harvard School of Public Health, Boston, MA 02115, USA
| | - Loic Le Marchand
- Epidemiology Program, University of Hawaii Cancer Center, Honolulu, HI 96813, USA
| | - Polly Newcomb
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA; Department of Epidemiology, University of Washington School of Public Health, Seattle, WA 98109, USA
| | | | - John Potter
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA; Department of Epidemiology, University of Washington School of Public Health, Seattle, WA 98109, USA
| | - Robert Schoen
- Department of Medicine and Epidemiology, University of Pittsburgh Medical Center, Pittsburgh, PA 15213, USA
| | - Martha Slattery
- Department of Internal Medicine, University of Utah Health Sciences Center, Salt Lake City, UT 84132, USA
| | - Emily White
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA; Department of Epidemiology, University of Washington School of Public Health, Seattle, WA 98109, USA
| | - Brent Zanke
- Division of Hematology, Faculty of Medicine, The University of Ottawa, Ottawa, ON K1Y 4E9, USA
| | - Ulrike Peters
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA; Department of Epidemiology, University of Washington School of Public Health, Seattle, WA 98109, USA
| | - Li Hsu
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA; Department of Biostatistics, University of Washington, Seattle, WA 98195, USA.
|
77
|
Gallagher MD, Chen-Plotkin AS. The Post-GWAS Era: From Association to Function. Am J Hum Genet 2018; 102:717-730. [PMID: 29727686 DOI: 10.1016/j.ajhg.2018.04.002] [Citation(s) in RCA: 526] [Impact Index Per Article: 75.1] [Received: 12/06/2017] [Accepted: 04/04/2018] [Indexed: 12/13/2022]
Abstract
During the past 12 years, genome-wide association studies (GWASs) have uncovered thousands of genetic variants that influence risk for complex human traits and diseases. Yet functional studies aimed at delineating the causal genetic variants and biological mechanisms underlying the observed statistical associations with disease risk have lagged. In this review, we highlight key advances in the field of functional genomics that may facilitate the derivation of biological meaning post-GWAS. We highlight the evidence suggesting that causal variants underlying disease risk often function through regulatory effects on the expression of target genes and that these expression effects might be modest and cell-type specific. We moreover discuss specific studies as proof-of-principle examples for current statistical, bioinformatic, and empirical bench-based approaches to downstream elucidation of GWAS-identified disease risk loci.
|
78
|
Pierce BL, Tong L, Argos M, Demanelis K, Jasmine F, Rakibuz-Zaman M, Sarwar G, Islam MT, Shahriar H, Islam T, Rahman M, Yunus M, Kibriya MG, Chen LS, Ahsan H. Co-occurring expression and methylation QTLs allow detection of common causal variants and shared biological mechanisms. Nat Commun 2018; 9:804. [PMID: 29476079 PMCID: PMC5824840 DOI: 10.1038/s41467-018-03209-9] [Citation(s) in RCA: 60] [Impact Index Per Article: 8.6] [Received: 11/29/2016] [Accepted: 01/26/2018] [Indexed: 12/21/2022]
Abstract
Inherited genetic variation affects local gene expression and DNA methylation in humans. Most expression quantitative trait loci (cis-eQTLs) occur at the same genomic location as a methylation QTL (cis-meQTL), suggesting a common causal variant and shared mechanism. Using DNA and RNA from peripheral blood of Bangladeshi individuals, here we use co-localization methods to identify eQTL-meQTL pairs likely to share a causal variant. We use partial correlation and mediation analyses to identify >400 of these pairs showing evidence of a causal relationship between expression and methylation (i.e., shared mechanism) with many additional pairs we are underpowered to detect. These co-localized pairs are enriched for SNPs showing opposite associations with expression and methylation, although many SNPs affect multiple CpGs in opposite directions. This work demonstrates the pervasiveness of co-regulated expression and methylation in the human genome. Applying this approach to other types of molecular QTLs can enhance our understanding of regulatory mechanisms.
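The partial correlation analysis mentioned above asks whether expression and methylation remain correlated once the shared SNP has been accounted for. A minimal sketch of that idea, assuming a single SNP coded as allele counts and simple linear adjustment: regress each trait on genotype, then correlate the residuals. All function names and the toy data are hypothetical, and the study's actual pipeline (covariates, mediation models) is not reproduced here.

```python
from statistics import mean


def residuals(y, x):
    """Residuals of a simple least-squares regression of y on x.

    Assumes x is not constant (otherwise the slope is undefined).
    """
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return [yi - my - slope * (xi - mx) for xi, yi in zip(x, y)]


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    var_a = sum((ai - ma) ** 2 for ai in a)
    var_b = sum((bi - mb) ** 2 for bi in b)
    return cov / (var_a * var_b) ** 0.5


def partial_correlation(expression, methylation, genotype):
    """Correlation of expression and methylation after removing the SNP effect."""
    return pearson(residuals(expression, genotype),
                   residuals(methylation, genotype))


if __name__ == "__main__":
    # Toy data: the SNP raises expression and lowers methylation, while a
    # shared non-genetic component pushes both traits in the same direction.
    genotype = [0, 0, 1, 1, 2, 2]
    shared = [0.1, -0.1, 0.2, -0.2, 0.05, -0.05]
    expression = [g + s for g, s in zip(genotype, shared)]
    methylation = [-g + s for g, s in zip(genotype, shared)]
    print(pearson(expression, methylation))
    print(partial_correlation(expression, methylation, genotype))
```

In this toy example the raw correlation is negative because the SNP moves the two traits in opposite directions, yet the genotype-adjusted correlation is positive, which shows why the adjusted quantity is the one that speaks to a relationship beyond the shared variant.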
Affiliation(s)
- Brandon L Pierce
- Department of Public Health Sciences, The University of Chicago, Chicago, IL, 60637, USA.
- Department of Human Genetics, The University of Chicago, Chicago, IL, 60637, USA.
- Comprehensive Cancer Center, The University of Chicago, Chicago, IL, 60637, USA.
| | - Lin Tong
- Department of Public Health Sciences, The University of Chicago, Chicago, IL, 60637, USA
| | - Maria Argos
- Division of Epidemiology and Biostatistics, University of Illinois at Chicago, Chicago, IL, 60612, USA
| | - Kathryn Demanelis
- Department of Public Health Sciences, The University of Chicago, Chicago, IL, 60637, USA
| | - Farzana Jasmine
- Department of Public Health Sciences, The University of Chicago, Chicago, IL, 60637, USA
| | | | - Golam Sarwar
- UChicago Research Bangladesh, Mohakhali, Dhaka, 1230, Bangladesh
| | - Md Tariqul Islam
- UChicago Research Bangladesh, Mohakhali, Dhaka, 1230, Bangladesh
| | - Hasan Shahriar
- UChicago Research Bangladesh, Mohakhali, Dhaka, 1230, Bangladesh
| | - Tariqul Islam
- UChicago Research Bangladesh, Mohakhali, Dhaka, 1230, Bangladesh
| | - Mahfuzar Rahman
- UChicago Research Bangladesh, Mohakhali, Dhaka, 1230, Bangladesh
- Research and Evaluation Division, BRAC, Dhaka, 1212, Bangladesh
| | - Md Yunus
- International Centre for Diarrhoeal Disease Research Bangladesh, Dhaka, 1000, Bangladesh
| | - Muhammad G Kibriya
- Department of Public Health Sciences, The University of Chicago, Chicago, IL, 60637, USA
| | - Lin S Chen
- Department of Public Health Sciences, The University of Chicago, Chicago, IL, 60637, USA
| | - Habibul Ahsan
- Department of Public Health Sciences, The University of Chicago, Chicago, IL, 60637, USA.
- Department of Human Genetics, The University of Chicago, Chicago, IL, 60637, USA.
- Comprehensive Cancer Center, The University of Chicago, Chicago, IL, 60637, USA.
- Department of Medicine, The University of Chicago, Chicago, IL, 60637, USA.
|
79
|
Herrmann M, Ravindran SP, Schwenk K, Cordellier M. Population transcriptomics in Daphnia: The role of thermal selection. Mol Ecol 2017; 27:387-402. [DOI: 10.1111/mec.14450] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 04/18/2017] [Revised: 10/22/2017] [Accepted: 11/02/2017] [Indexed: 12/30/2022]
Affiliation(s)
- Maike Herrmann
- Institute for Environmental Sciences; University Koblenz-Landau; Landau in der Pfalz Germany
| | | | - Klaus Schwenk
- Institute for Environmental Sciences; University Koblenz-Landau; Landau in der Pfalz Germany
|
80
|
Bartonicek N, Clark MB, Quek XC, Torpy JR, Pritchard AL, Maag JLV, Gloss BS, Crawford J, Taft RJ, Hayward NK, Montgomery GW, Mattick JS, Mercer TR, Dinger ME. Intergenic disease-associated regions are abundant in novel transcripts. Genome Biol 2017; 18:241. [PMID: 29284497 PMCID: PMC5747244 DOI: 10.1186/s13059-017-1363-3] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Received: 05/31/2017] [Accepted: 11/21/2017] [Indexed: 12/21/2022]
Abstract
BACKGROUND Genotyping of large populations through genome-wide association studies (GWAS) has successfully identified many genomic variants associated with traits or disease risk. Unexpectedly, a large proportion of GWAS single nucleotide polymorphisms (SNPs) and associated haplotype blocks are in intronic and intergenic regions, hindering their functional evaluation. While some of these risk-susceptibility regions encompass cis-regulatory sites, their transcriptional potential has never been systematically explored. RESULTS To detect rare tissue-specific expression, we employed the transcript-enrichment method CaptureSeq on 21 human tissues to identify 1775 multi-exonic transcripts from 561 intronic and intergenic haploblocks associated with 392 traits and diseases, covering 73.9 Mb (2.2%) of the human genome. We show that a large proportion (85%) of disease-associated haploblocks express novel multi-exonic non-coding transcripts that are tissue-specific and enriched for GWAS SNPs as well as epigenetic markers of active transcription and enhancer activity. Similarly, we captured transcriptomes from 13 melanomas, targeting nine melanoma-associated haploblocks, and characterized 31 novel melanoma-specific transcripts that include fusion proteins, novel exons and non-coding RNAs, one-third of which showed allelically imbalanced expression. CONCLUSIONS This resource of previously unreported transcripts in disease-associated regions ( http://gwas-captureseq.dingerlab.org ) should provide an important starting point for the translational community in search of novel biomarkers, disease mechanisms, and drug targets.
Affiliation(s)
- N Bartonicek
- Garvan Institute of Medical Research, Sydney, NSW, Australia
- Faculty of Medicine, St Vincent's Clinical School, University of New South Wales, Sydney, NSW, Australia
| | - M B Clark
- Garvan Institute of Medical Research, Sydney, NSW, Australia
- Department of Psychiatry, University of Oxford, Warneford Hospital, Oxford, UK
| | - X C Quek
- Garvan Institute of Medical Research, Sydney, NSW, Australia
- Faculty of Medicine, St Vincent's Clinical School, University of New South Wales, Sydney, NSW, Australia
| | - J R Torpy
- Garvan Institute of Medical Research, Sydney, NSW, Australia
- Faculty of Medicine, St Vincent's Clinical School, University of New South Wales, Sydney, NSW, Australia
| | - A L Pritchard
- QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia
| | - J L V Maag
- Garvan Institute of Medical Research, Sydney, NSW, Australia
- Faculty of Medicine, St Vincent's Clinical School, University of New South Wales, Sydney, NSW, Australia
| | - B S Gloss
- Garvan Institute of Medical Research, Sydney, NSW, Australia
- Faculty of Medicine, St Vincent's Clinical School, University of New South Wales, Sydney, NSW, Australia
| | - J Crawford
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD, Australia
| | - R J Taft
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD, Australia
- Illumina, Inc., San Diego, CA, USA
| | - N K Hayward
- QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia
| | - G W Montgomery
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD, Australia
| | - J S Mattick
- Garvan Institute of Medical Research, Sydney, NSW, Australia
- Faculty of Medicine, St Vincent's Clinical School, University of New South Wales, Sydney, NSW, Australia
| | - T R Mercer
- Garvan Institute of Medical Research, Sydney, NSW, Australia
- Faculty of Medicine, St Vincent's Clinical School, University of New South Wales, Sydney, NSW, Australia
- Altius Institute for Biomedical Sciences, Seattle, USA
| | - M E Dinger
- Garvan Institute of Medical Research, Sydney, NSW, Australia.
- Faculty of Medicine, St Vincent's Clinical School, University of New South Wales, Sydney, NSW, Australia.
| |
|
81
|
Goddard ME, Kemper KE, MacLeod IM, Chamberlain AJ, Hayes BJ. Genetics of complex traits: prediction of phenotype, identification of causal polymorphisms and genetic architecture. Proc Biol Sci 2017; 283:rspb.2016.0569. [PMID: 27440663 DOI: 10.1098/rspb.2016.0569] [Citation(s) in RCA: 83] [Impact Index Per Article: 10.4] [Received: 03/10/2016] [Accepted: 06/23/2016] [Indexed: 01/01/2023]
Abstract
Complex or quantitative traits are important in medicine, agriculture and evolution, yet, until recently, few of the polymorphisms that cause variation in these traits were known. Genome-wide association studies (GWAS), based on the ability to assay thousands of single nucleotide polymorphisms (SNPs), have revolutionized our understanding of the genetics of complex traits. We advocate the analysis of GWAS data by a statistical method that fits all SNP effects simultaneously, assuming that these effects are drawn from a prior distribution. We illustrate how this method can be used to predict future phenotypes, to map and identify the causal mutations, and to study the genetic architecture of complex traits. The genetic architecture of complex traits is even more complex than previously thought: in almost every trait studied there are thousands of polymorphisms that explain genetic variation. Methods of predicting future phenotypes, collectively known as genomic selection or genomic prediction, have been widely adopted in livestock and crop breeding, leading to increased rates of genetic improvement.
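As a rough illustration of what "fitting all SNP effects simultaneously" under a prior can look like, the sketch below uses the simplest possible choice, a single Gaussian prior on every effect, which reduces to ridge regression (often called SNP-BLUP). This is an assumption made for illustration only; the abstract does not state which prior the authors use, and the function names (`solve`, `snp_blup`), the toy genotypes, and the shrinkage values are all hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def snp_blup(genotypes, phenotypes, lam):
    """Posterior-mean SNP effects under a Gaussian prior: (X'X + lam*I)^-1 X'y.

    lam is the ratio of residual variance to per-SNP effect variance
    (an assumed, user-supplied value in this sketch).
    """
    p = len(genotypes[0])
    xtx = [
        [sum(row[i] * row[j] for row in genotypes) + (lam if i == j else 0.0)
         for j in range(p)]
        for i in range(p)
    ]
    xty = [sum(row[i] * y for row, y in zip(genotypes, phenotypes)) for i in range(p)]
    return solve(xtx, xty)


# Toy data (hypothetical): 5 individuals, 3 SNPs coded as 0/1/2 allele counts.
genotypes = [[0, 1, 2], [1, 0, 1], [2, 1, 0], [1, 2, 1], [0, 0, 2]]
true_effects = [0.5, -0.2, 0.0]
phenotypes = [sum(g * b for g, b in zip(row, true_effects)) for row in genotypes]

weak_prior = snp_blup(genotypes, phenotypes, lam=1e-9)     # almost no shrinkage
strong_prior = snp_blup(genotypes, phenotypes, lam=100.0)  # heavy shrinkage toward zero

# Predicting a new individual's phenotype is a dot product with the fitted effects.
new_individual = [1, 1, 1]
prediction = sum(g * b for g, b in zip(new_individual, weak_prior))
```

All SNPs enter one joint model, so correlated markers share the signal instead of each being tested alone; a stronger prior (larger `lam`) pulls every estimate toward zero. Mixture priors that allow many zero effects and a few large ones follow the same pattern but need iterative sampling rather than a single linear solve.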
Affiliation(s)
- M E Goddard
- Faculty of Veterinary and Agricultural Sciences, University of Melbourne, Melbourne, Victoria 3010, Australia; Department of Economic Development, Jobs, Transport and Resources, AgriBio, La Trobe University, Bundoora, Victoria 3083, Australia
| | - K E Kemper
- Faculty of Veterinary and Agricultural Sciences, University of Melbourne, Melbourne, Victoria 3010, Australia
| | - I M MacLeod
- Faculty of Veterinary and Agricultural Sciences, University of Melbourne, Melbourne, Victoria 3010, Australia; Department of Economic Development, Jobs, Transport and Resources, AgriBio, La Trobe University, Bundoora, Victoria 3083, Australia; Dairy Futures Cooperative Research Centre, AgriBio, La Trobe University, Bundoora, Victoria 3083, Australia
| | - A J Chamberlain
- Department of Economic Development, Jobs, Transport and Resources, AgriBio, La Trobe University, Bundoora, Victoria 3083, Australia
| | - B J Hayes
- Department of Economic Development, Jobs, Transport and Resources, AgriBio, La Trobe University, Bundoora, Victoria 3083, Australia; School of Applied System Biology, La Trobe University, Agribiosciences Building, Bundoora, Australia
| |
|
82
|
Zeng P, Wang T, Huang S. Cis-SNPs Set Testing and PrediXcan Analysis for Gene Expression Data using Linear Mixed Models. Sci Rep 2017; 7:15237. [PMID: 29127305 PMCID: PMC5681585 DOI: 10.1038/s41598-017-15055-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 06/12/2017] [Accepted: 10/19/2017] [Indexed: 12/21/2022]
Abstract
Understanding the functional mechanism of SNPs identified in GWAS on complex diseases is currently a challenging task. The studies of expression quantitative trait loci (eQTL) have shown that regulatory variants play a crucial role in the function of associated SNPs. Detecting significant genes (called eGenes) in eQTL studies and analyzing the effect sizes of cis-SNPs can offer important implications on the genetic architecture of associated SNPs and interpretations of the molecular basis of diseases. We applied linear mixed models (LMM) to the gene expression level and constructed likelihood ratio tests (LRT) to test for eGenes in the Geuvadis data. We identified about 11% of genes as eGenes in the Geuvadis data and found some eGenes were enriched in approximately independent linkage disequilibrium (LD) blocks (e.g. MHC). We further performed PrediXcan analysis for seven diseases in the WTCCC data with weights estimated using LMM and identified 64, 5, 21 and 1 significant genes (p < 0.05 after Bonferroni correction) associated with T1D, CD, RA and T2D, respectively. We found most of the significant genes of T1D and RA were also located within the MHC region. Our results provide strong evidence that gene expression plays an intermediate role for the associated variants in GWAS.
Affiliation(s)
- Ping Zeng
- Xuzhou Medical University, Department of Epidemiology and Biostatistics, Xuzhou, 221004, China.
- University of Michigan, Department of Biostatistics, Ann Arbor, MI, 48104, USA.
| | - Ting Wang
- Xuzhou Medical University, Department of Epidemiology and Biostatistics, Xuzhou, 221004, China
| | - Shuiping Huang
- Xuzhou Medical University, Department of Epidemiology and Biostatistics, Xuzhou, 221004, China.
| |
|
83
|
Shen JJ, Wang TY, Yang W. Regulatory and evolutionary signatures of sex-biased genes on both the X chromosome and the autosomes. Biol Sex Differ 2017; 8:35. [PMID: 29096703 PMCID: PMC5668987 DOI: 10.1186/s13293-017-0156-4] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 02/05/2017] [Accepted: 10/16/2017] [Indexed: 12/15/2022]
Abstract
Background: Sex is an important but understudied factor in the genetics of human diseases. Analyses using a combination of gene expression data, ENCODE data, and evolutionary data of sex-biased gene expression in human tissues can give insight into the regulatory and evolutionary forces acting on sex-biased genes. Methods: In this study, we analyzed the differentially expressed genes between males and females. On the X chromosome, we used a novel method and investigated the status of genes that escape X-chromosome inactivation (escape genes), taking into account the clonality of lymphoblastoid cell lines (LCLs). To investigate the regulation of sex-biased differentially expressed genes (sDEGs), we conducted pathway and transcription factor enrichment analyses on the sDEGs, as well as analyses on the genomic distribution of sDEGs. Evolutionary analyses were also conducted on both sDEGs and escape genes. Results: Genome-wide, we characterized differential gene expression between sexes in 462 RNA-seq samples and identified 587 sex-biased genes, or 3.2% of the genes surveyed. On the X chromosome, sDEGs were distributed in evolutionary strata in a similar pattern as escape genes. We found a trend of negative correlation between the gene expression breadth and nonsynonymous over synonymous mutation (dN/dS) ratios, showing a possible pleiotropic constraint on evolution of genes. Genome-wide, nine transcription factors were found enriched in binding to the regions surrounding the transcription start sites of female-biased genes. Many pathways and protein domains were enriched in sex-biased genes, some of which hint at sex-biased physiological processes. Conclusions: These findings lend insight into the regulatory and evolutionary forces shaping sex-biased gene expression and their involvement in the physiological and pathological processes in human health and diseases.
Affiliation(s)
- Jiangshan J Shen
- Department of Paediatrics and Adolescent Medicine, LKS Faculty of Medicine, The University of Hong Kong, 21 Sassoon Road, Pokfulam, Hong Kong
| | - Ting-You Wang
- Department of Paediatrics and Adolescent Medicine, LKS Faculty of Medicine, The University of Hong Kong, 21 Sassoon Road, Pokfulam, Hong Kong
| | - Wanling Yang
- Department of Paediatrics and Adolescent Medicine, LKS Faculty of Medicine, The University of Hong Kong, 21 Sassoon Road, Pokfulam, Hong Kong.
| |
|
84
|
Nourmohammad A, Rambeau J, Held T, Kovacova V, Berg J, Lässig M. Adaptive Evolution of Gene Expression in Drosophila. Cell Rep 2017; 20:1385-1395. [DOI: 10.1016/j.celrep.2017.07.033] [Citation(s) in RCA: 42] [Impact Index Per Article: 5.3] [Received: 09/06/2016] [Revised: 04/15/2017] [Accepted: 07/13/2017] [Indexed: 01/17/2023]
|
85
|
Yamasaki AE, Panopoulos AD, Belmonte JCI. Understanding the genetics behind complex human disease with large-scale iPSC collections. Genome Biol 2017; 18:135. [PMID: 28728561 PMCID: PMC5520285 DOI: 10.1186/s13059-017-1276-1] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Indexed: 12/21/2022]
Abstract
Three recent studies analyzing large-scale collections of human induced pluripotent stem cell lines provide valuable insight into how genetic regulatory variation affects cellular and molecular traits.
Affiliation(s)
- Amanda E Yamasaki
- Department of Biological Sciences, University of Notre Dame, Notre Dame, IN, 46556, USA
| | - Athanasia D Panopoulos
- Department of Biological Sciences, University of Notre Dame, Notre Dame, IN, 46556, USA.
| |
|
86
|
Do C, Shearer A, Suzuki M, Terry MB, Gelernter J, Greally JM, Tycko B. Genetic-epigenetic interactions in cis: a major focus in the post-GWAS era. Genome Biol 2017. [PMID: 28629478 PMCID: PMC5477265 DOI: 10.1186/s13059-017-1250-y] [Citation(s) in RCA: 95] [Impact Index Per Article: 11.9] [Indexed: 12/16/2022]
Abstract
Studies on genetic-epigenetic interactions, including the mapping of methylation quantitative trait loci (mQTLs) and haplotype-dependent allele-specific DNA methylation (hap-ASM), have become a major focus in the post-genome-wide-association-study (GWAS) era. Such maps can nominate regulatory sequence variants that underlie GWAS signals for common diseases, ranging from neuropsychiatric disorders to cancers. Conversely, mQTLs need to be filtered out when searching for non-genetic effects in epigenome-wide association studies (EWAS). Sequence variants in CCCTC-binding factor (CTCF) and transcription factor binding sites have been mechanistically linked to mQTLs and hap-ASM. Identifying these sites can point to disease-associated transcriptional pathways, with implications for targeted treatment and prevention.
Affiliation(s)
- Catherine Do
- Institute for Cancer Genetics and Herbert Irving Comprehensive Cancer Center, Columbia University, New York, NY, 10032, USA
| | - Alyssa Shearer
- Institute for Cancer Genetics and Herbert Irving Comprehensive Cancer Center, Columbia University, New York, NY, 10032, USA
| | - Masako Suzuki
- Center for Epigenomics, Department of Genetics, Albert Einstein College of Medicine, Bronx, NY, 10461, USA
| | - Mary Beth Terry
- Department of Epidemiology, Columbia University Mailman School of Public Health, and Herbert Irving Comprehensive Cancer Center, Columbia University, New York, NY, 10032, USA
| | - Joel Gelernter
- Departments of Psychiatry, Genetics, and Neurobiology, Yale University School of Medicine, New Haven, CT, 06520, USA
| | - John M Greally
- Center for Epigenomics, Department of Genetics, Albert Einstein College of Medicine, Bronx, NY, 10461, USA
| | - Benjamin Tycko
- Institute for Cancer Genetics, Herbert Irving Comprehensive Cancer Center, Taub Institute for Research on Alzheimer's disease and the Aging Brain, New York, NY, 10032, USA; Department of Pathology and Cell Biology, Columbia University, New York, NY, 10032, USA.
| |
|
87
|
Zhu Y, Tazearslan C, Suh Y. Challenges and progress in interpretation of non-coding genetic variants associated with human disease. Exp Biol Med (Maywood) 2017; 242:1325-1334. [PMID: 28581336 DOI: 10.1177/1535370217713750] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.9] [Indexed: 01/15/2023]
Abstract
Genome-wide association studies have shown that the vast majority of disease-associated variants reside in the non-coding regions of the genome, suggesting that gene regulatory changes contribute to disease risk. Identifying truly causal non-coding variants and their affected target genes remains challenging but is a critical step in translating genetic associations into molecular mechanisms and ultimately clinical applications. Here we review genomic/epigenomic resources and in silico tools that can be used to identify causal non-coding variants and experimental strategies to validate their functionalities. Impact statement: Most signals from genome-wide association studies (GWASs) map to the non-coding genome, and functional interpretation of these associations remained challenging. We reviewed recent progress in methodologies of studying the non-coding genome and argued that no single approach allows one to effectively identify the causal regulatory variants from GWAS results. By illustrating the advantages and limitations of each method, our review potentially provided a guideline for taking a combinatorial approach to accurately predict, prioritize, and eventually experimentally validate the causal variants.
Affiliation(s)
- Yizhou Zhu
- Department of Genetics, Albert Einstein College of Medicine, Bronx, NY 10461, USA
| | - Cagdas Tazearslan
- Department of Genetics, Albert Einstein College of Medicine, Bronx, NY 10461, USA
| | - Yousin Suh
- Department of Genetics, Albert Einstein College of Medicine, Bronx, NY 10461, USA; Department of Ophthalmology & Visual Sciences, Albert Einstein College of Medicine, Bronx, NY 10461, USA; Department of Medicine, Albert Einstein College of Medicine, Bronx, NY 10461, USA
| |
|
88
|
Shin S, Keleş S. Annotation Regression for Genome-Wide Association Studies with an Application to Psychiatric Genomic Consortium Data. Statistics in Biosciences 2017; 9:50-72. [PMID: 28781711 PMCID: PMC5542423 DOI: 10.1007/s12561-016-9154-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 12/21/2015] [Revised: 05/09/2016] [Accepted: 06/20/2016] [Indexed: 10/21/2022]
Abstract
Although genome-wide association studies (GWAS) have been successful at finding thousands of disease-associated genetic variants (GVs), identifying causal variants and elucidating the mechanisms by which genotypes influence phenotypes are critical open questions. A key challenge is that a large percentage of disease-associated GVs are potential regulatory variants located in noncoding regions, making them difficult to interpret. Recent research efforts focus on going beyond annotating GVs by integrating functional annotation data with GWAS to prioritize GVs. However, applicability of these approaches is challenged by high dimensionality and heterogeneity of functional annotation data. Furthermore, existing methods often assume global associations of GVs with annotation data. This strong assumption is susceptible to violations for GVs involved in many complex diseases. To address these issues, we develop a general regression framework, named Annotation Regression for GWAS (ARoG). ARoG is based on a finite mixture of linear regressions model where GWAS association measures are viewed as responses and functional annotations as predictors. This mixture framework addresses heterogeneity of effects of GVs by grouping them into clusters and high dimensionality of the functional annotations by enabling annotation selection within each cluster. ARoG further employs permutation testing to evaluate the significance of selected annotations. Computational experiments indicate that ARoG can discover distinct associations between disease risk and functional annotations. Application of ARoG to autism and schizophrenia data from Psychiatric Genomics Consortium led to identification of GVs that significantly affect interactions of several transcription factors with DNA as potential mechanisms contributing to these disorders.
Affiliation(s)
- Sunyoung Shin
- Department of Statistics, Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, USA
| | - Sündüz Keleş
- Department of Statistics, Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, USA
| |
|
89
|
|
90
|
Ju JH, Shenoy SA, Crystal RG, Mezey JG. An independent component analysis confounding factor correction framework for identifying broad impact expression quantitative trait loci. PLoS Comput Biol 2017; 13:e1005537. [PMID: 28505156 PMCID: PMC5448815 DOI: 10.1371/journal.pcbi.1005537] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 11/22/2016] [Revised: 05/30/2017] [Accepted: 04/28/2017] [Indexed: 11/19/2022]
Abstract
Genome-wide expression Quantitative Trait Loci (eQTL) studies in humans have provided numerous insights into the genetics of both gene expression and complex diseases. While the majority of eQTL identified in genome-wide analyses impact a single gene, eQTL that impact many genes are particularly valuable for network modeling and disease analysis. To enable the identification of such broad impact eQTL, we introduce CONFETI: Confounding Factor Estimation Through Independent component analysis. CONFETI is designed to address two conflicting issues when searching for broad impact eQTL: the need to account for non-genetic confounding factors that can lower the power of the analysis or produce broad impact eQTL false positives, and the tendency of methods that account for confounding factors to model broad impact eQTL as non-genetic variation. The key advance of the CONFETI framework is the use of Independent Component Analysis (ICA) to identify variation likely caused by broad impact eQTL when constructing the sample covariance matrix used for the random effect in a mixed model. We show that CONFETI has better performance than other mixed model confounding factor methods when considering broad impact eQTL recovery from synthetic data. We also used the CONFETI framework and these same confounding factor methods to identify eQTL that replicate between matched twin pair datasets in the Multiple Tissue Human Expression Resource (MuTHER), the Depression Genes Networks study (DGN), the Netherlands Study of Depression and Anxiety (NESDA), and multiple tissue types in the Genotype-Tissue Expression (GTEx) consortium. These analyses identified both cis-eQTL and trans-eQTL impacting individual genes, and CONFETI had better or comparable performance to other mixed model confounding factor analysis methods when identifying such eQTL. In these analyses, we were able to identify and replicate a few broad impact eQTL although the overall number was small even when applying CONFETI. 
In light of these results, we discuss the broad impact eQTL that have been previously reported from the analysis of human data and suggest that considerable caution should be exercised when making biological inferences based on these reported eQTL.
Affiliation(s)
- Jin Hyun Ju
- Department of Genetic Medicine, Weill Cornell Medical College, New York, NY, United States of America
- Institute for Computational Biomedicine, Weill Cornell Medical College, New York, NY, United States of America
| | - Sushila A. Shenoy
- Department of Genetic Medicine, Weill Cornell Medical College, New York, NY, United States of America
| | - Ronald G. Crystal
- Department of Genetic Medicine, Weill Cornell Medical College, New York, NY, United States of America
| | - Jason G. Mezey
- Department of Genetic Medicine, Weill Cornell Medical College, New York, NY, United States of America
- Institute for Computational Biomedicine, Weill Cornell Medical College, New York, NY, United States of America
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY, United States of America
| |
|
91
|
DeBoever C, Li H, Jakubosky D, Benaglio P, Reyna J, Olson KM, Huang H, Biggs W, Sandoval E, D'Antonio M, Jepsen K, Matsui H, Arias A, Ren B, Nariai N, Smith EN, D'Antonio-Chronowska A, Farley EK, Frazer KA. Large-Scale Profiling Reveals the Influence of Genetic Variation on Gene Expression in Human Induced Pluripotent Stem Cells. Cell Stem Cell 2017; 20:533-546.e7. [PMID: 28388430 PMCID: PMC5444918 DOI: 10.1016/j.stem.2017.03.009] [Citation(s) in RCA: 122] [Impact Index Per Article: 15.3] [Received: 05/07/2016] [Revised: 12/27/2016] [Accepted: 03/15/2017] [Indexed: 12/18/2022]
Abstract
In this study, we used whole-genome sequencing and gene expression profiling of 215 human induced pluripotent stem cell (iPSC) lines from different donors to identify genetic variants associated with RNA expression for 5,746 genes. We were able to predict causal variants for these expression quantitative trait loci (eQTLs) that disrupt transcription factor binding and validated a subset of them experimentally. We also identified copy-number variant (CNV) eQTLs, including some that appear to affect gene expression by altering the copy number of intergenic regulatory regions. In addition, we were able to identify effects on gene expression of rare genic CNVs and regulatory single-nucleotide variants and found that reactivation of gene expression on the X chromosome depends on gene chromosomal position. Our work highlights the value of iPSCs for genetic association analyses and provides a unique resource for investigating the genetic regulation of gene expression in pluripotent cells.
Affiliation(s)
- Christopher DeBoever
- Bioinformatics and Systems Biology Graduate Program, University of California, San Diego, La Jolla, CA 92093-0419, USA
| | - He Li
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA 92093-0419, USA
| | - David Jakubosky
- Biomedical Sciences Graduate Program, University of California, San Diego, La Jolla, CA 92093-0419, USA; Department of Biomedical Informatics, University of California, San Diego, La Jolla, CA 92093-0419, USA
| | - Paola Benaglio
- Department of Pediatrics and Rady Children's Hospital, University of California, San Diego, La Jolla, CA 92093-0419, USA
| | - Joaquin Reyna
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA 92093-0419, USA
| | - Katrina M Olson
- Division of Cardiology, Department of Medicine, University of California, San Diego, La Jolla, CA 92093-0419, USA; Division of Biological Sciences, Section of Molecular Biology, University of California, San Diego, La Jolla, CA 92093-0419, USA
| | - Hui Huang
- Biomedical Sciences Graduate Program, University of California, San Diego, La Jolla, CA 92093-0419, USA; Ludwig Institute for Cancer Research, La Jolla, CA 92093, USA
| | - Matteo D'Antonio
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA 92093-0419, USA
| | - Kristen Jepsen
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA 92093-0419, USA
| | - Hiroko Matsui
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA 92093-0419, USA
| | - Angelo Arias
- Department of Pediatrics and Rady Children's Hospital, University of California, San Diego, La Jolla, CA 92093-0419, USA
| | - Bing Ren
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA 92093-0419, USA; Ludwig Institute for Cancer Research, La Jolla, CA 92093, USA; Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA 92093-0419, USA
| | - Naoki Nariai
- Department of Pediatrics and Rady Children's Hospital, University of California, San Diego, La Jolla, CA 92093-0419, USA
| | - Erin N Smith
- Department of Pediatrics and Rady Children's Hospital, University of California, San Diego, La Jolla, CA 92093-0419, USA
| | - Emma K Farley
- Division of Cardiology, Department of Medicine, University of California, San Diego, La Jolla, CA 92093-0419, USA; Division of Biological Sciences, Section of Molecular Biology, University of California, San Diego, La Jolla, CA 92093-0419, USA.
| | - Kelly A Frazer
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA 92093-0419, USA; Department of Pediatrics and Rady Children's Hospital, University of California, San Diego, La Jolla, CA 92093-0419, USA.
| |
|
92
|
Ardlie KG, Guigó R. Data Resources for Human Functional Genomics. Current Opinion in Systems Biology 2017; 1:75-79. [PMID: 28989986 PMCID: PMC5625631 DOI: 10.1016/j.coisb.2016.12.019] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 11/30/2022]
Affiliation(s)
| | - Roderic Guigó
- Center for Genomic Regulation (CRG), Barcelona, Catalonia, Spain
| |
|
93
|
Deplancke B, Alpern D, Gardeux V. The Genetics of Transcription Factor DNA Binding Variation. Cell 2016; 166:538-554. [PMID: 27471964 DOI: 10.1016/j.cell.2016.07.012] [Citation(s) in RCA: 273] [Impact Index Per Article: 30.3] [Received: 02/05/2016] [Indexed: 12/23/2022]
Abstract
Most complex trait-associated variants are located in non-coding regulatory regions of the genome, where they have been shown to disrupt transcription factor (TF)-DNA binding motifs. Variable TF-DNA interactions are therefore increasingly considered as key drivers of phenotypic variation. However, recent genome-wide studies revealed that the majority of variable TF-DNA binding events are not driven by sequence alterations in the motif of the studied TF. This observation implies that the molecular mechanisms underlying TF-DNA binding variation and, by extrapolation, inter-individual phenotypic variation are more complex than originally anticipated. Here, we summarize the findings that led to this important paradigm shift and review proposed mechanisms for local, proximal, or distal genetic variation-driven variable TF-DNA binding. In addition, we discuss the biomedical implications of these findings for our ability to dissect the molecular role(s) of non-coding genetic variants in complex traits, including disease susceptibility.
Affiliation(s)
- Bart Deplancke
- Laboratory of Systems Biology and Genetics, Institute of Bioengineering, Ecole Polytechnique Fédérale de Lausanne and Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland.
| | - Daniel Alpern
- Laboratory of Systems Biology and Genetics, Institute of Bioengineering, Ecole Polytechnique Fédérale de Lausanne and Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Vincent Gardeux
- Laboratory of Systems Biology and Genetics, Institute of Bioengineering, Ecole Polytechnique Fédérale de Lausanne and Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| |
|
94
|
Transcriptomic Analysis Implicates the p53 Signaling Pathway in the Establishment of HIV-1 Latency in Central Memory CD4 T Cells in an In Vitro Model. PLoS Pathog 2016; 12:e1006026. [PMID: 27898737 PMCID: PMC5127598 DOI: 10.1371/journal.ppat.1006026] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Received: 06/13/2016] [Accepted: 10/26/2016] [Indexed: 12/19/2022]
Abstract
The search for an HIV-1 cure has been greatly hindered by the presence of a viral reservoir that persists despite antiretroviral therapy (ART). Studies of HIV-1 latency in vivo are also complicated by the low proportion of latently infected cells in HIV-1 infected individuals. A number of models of HIV-1 latency have been developed to examine the signaling pathways and viral determinants of latency and reactivation. A primary cell model of HIV-1 latency, which incorporates the generation of primary central memory CD4 T cells (TCM), full-length virus infection (HIV NL4-3) and ART to suppress virus replication, was used to investigate the establishment of HIV latency using RNA-Seq. Initially, an investigation of host and viral gene expression in the resting and activated states of this model indicated that the resting condition was reflective of a latent state. Then, a comparison of the host transcriptome between the uninfected and latently infected conditions of this model identified 826 differentially expressed genes, many of which were related to p53 signaling. Inhibition of the transcriptional activity of p53 by pifithrin-α during HIV-1 infection reduced the ability of HIV-1 to be reactivated from its latent state by an unknown mechanism. In conclusion, this model may be used to screen latency reversing agents utilized in shock and kill approaches to cure HIV, to search for cellular markers of latency, and to understand the mechanisms by which HIV-1 establishes latency.
|
95
|
Kumar S, Ambrosini G, Bucher P. SNP2TFBS - a database of regulatory SNPs affecting predicted transcription factor binding site affinity. Nucleic Acids Res 2016; 45:D139-D144. [PMID: 27899579 PMCID: PMC5210548 DOI: 10.1093/nar/gkw1064] [Citation(s) in RCA: 144] [Impact Index Per Article: 16.0] [Received: 08/15/2016] [Revised: 10/05/2016] [Accepted: 10/24/2016] [Indexed: 01/21/2023]
Abstract
SNP2TFBS is a computational resource intended to support researchers investigating the molecular mechanisms underlying regulatory variation in the human genome. The database essentially consists of a collection of text files providing specific annotations for human single nucleotide polymorphisms (SNPs), namely whether they are predicted to abolish, create or change the affinity of one or several transcription factor (TF) binding sites. A SNP's effect on TF binding is estimated based on a position weight matrix (PWM) model for the binding specificity of the corresponding factor. These data files are regenerated at regular intervals by an automatic procedure that takes as input a reference genome, a comprehensive SNP catalogue and a collection of PWMs. SNP2TFBS is also accessible over a web interface, enabling users to view the information provided for an individual SNP, to extract SNPs based on various search criteria, to annotate uploaded sets of SNPs or to display statistics about the frequencies of binding sites affected by selected SNPs. Homepage: http://ccg.vital-it.ch/snp2tfbs/.
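The PWM-based estimate described in this abstract can be illustrated with a minimal sketch. It is not the SNP2TFBS procedure itself: the 4-position matrix `PWM`, its log-odds values, and the helper names `pwm_score`, `best_window_score`, and `snp_effect` are all hypothetical, chosen only to show how a reference and an alternate allele might be scored against the same matrix.

```python
# Hypothetical 4-position PWM of log-odds scores (made-up values, not from SNP2TFBS).
PWM = [
    {"A": 1.2, "C": -0.8, "G": -1.0, "T": -0.5},
    {"A": -1.5, "C": 1.4, "G": -0.9, "T": -0.7},
    {"A": -0.6, "C": -1.1, "G": 1.3, "T": -0.9},
    {"A": -1.0, "C": -0.4, "G": -1.2, "T": 1.1},
]


def pwm_score(site):
    """Sum the per-position scores of a site as long as the PWM."""
    return sum(column[base] for column, base in zip(PWM, site))


def best_window_score(seq):
    """Return the best PWM score over all windows of PWM width in seq."""
    width = len(PWM)
    return max(pwm_score(seq[i:i + width]) for i in range(len(seq) - width + 1))


def snp_effect(flank_left, ref, alt, flank_right):
    """Score both alleles in the same sequence context.

    Each flank is assumed to hold (PWM width - 1) bases, so every
    scored window overlaps the SNP position.
    """
    ref_score = best_window_score(flank_left + ref + flank_right)
    alt_score = best_window_score(flank_left + alt + flank_right)
    return ref_score, alt_score, alt_score - ref_score


ref_score, alt_score, delta = snp_effect("TTA", "C", "T", "GTA")
print(f"ref={ref_score:.2f} alt={alt_score:.2f} delta={delta:.2f}")
```

A negative `delta` would suggest that the alternate allele weakens the predicted site and a positive one that it strengthens it; deciding whether a site is abolished or created would additionally need a score threshold, which this sketch leaves out.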
Affiliation(s)
- Sunil Kumar
- Swiss Institute for Experimental Cancer Research (ISREC), School of Life Sciences, Swiss Federal Institute of Technology (EPFL), CH-1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics (SIB), CH-1015 Lausanne, Switzerland
| | - Giovanna Ambrosini
- Swiss Institute for Experimental Cancer Research (ISREC), School of Life Sciences, Swiss Federal Institute of Technology (EPFL), CH-1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics (SIB), CH-1015 Lausanne, Switzerland
| | - Philipp Bucher
- Swiss Institute for Experimental Cancer Research (ISREC), School of Life Sciences, Swiss Federal Institute of Technology (EPFL), CH-1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics (SIB), CH-1015 Lausanne, Switzerland
| |
|
96
|
Abstract
The hematopoietic system plays a major role in human health. Two studies by Astle et al. and Chen et al. published in this issue of Cell use genome-wide association and functional genomics approaches to provide deep insights into the role of genetic variants in hematological traits. We discuss these discoveries and future strategies toward completing our understanding of the genetic basis for variation in human traits.
Affiliation(s)
- Sarah Kim-Hellmuth
- New York Genome Center, New York, NY 10013, USA; Department of Systems Biology, Columbia University, New York, NY 10027, USA
| | - Tuuli Lappalainen
- New York Genome Center, New York, NY 10013, USA; Department of Systems Biology, Columbia University, New York, NY 10027, USA.
| |
|
97
|
Jeidane S, Scott-Boyer MP, Tremblay N, Cardin S, Picard S, Baril M, Lamarre D, Deschepper CF. Association of a Network of Interferon-Stimulated Genes with a Locus Encoding a Negative Regulator of Non-conventional IKK Kinases and IFNB1. Cell Rep 2016; 17:425-435. [PMID: 27705791 DOI: 10.1016/j.celrep.2016.09.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2015] [Revised: 08/11/2016] [Accepted: 09/02/2016] [Indexed: 11/25/2022]
Abstract
Functional genomic analysis of gene expression in mice allowed us to identify a quantitative trait locus (QTL) linked in trans to the expression of 190 gene transcripts and in cis to the expression of only two genes, one of which was Ypel5. Most of the trans-expression QTL genes were interferon-stimulated genes (ISGs), and their expression in mouse macrophage cell lines was stimulated in an IFNB1-dependent manner by Ypel5 silencing. In human HEK293T cells, YPEL5 silencing enhanced the induction of IFNB1 by pattern recognition receptors and phosphorylation of TBK1/IKBKE kinases, whereas co-immunoprecipitation experiments revealed that YPEL5 interacted physically with IKBKE. We thus found that the Ypel5 gene (contained in a locus linked to a network of ISGs in mice) is a negative regulator of IFNB1 production and innate immune responses that interacts functionally and physically with TBK1/IKBKE kinases.
Affiliation(s)
- Saloua Jeidane
- Cardiovascular Biology Research Unit, Institut de Recherches Cliniques de Montréal (IRCM), Montréal, QC H2W 1R7, Canada; Département de Médecine, Faculté de Médecine, Université de Montréal, Montréal, QC J2S 2M2, Canada
| | - Marie-Pier Scott-Boyer
- Cardiovascular Biology Research Unit, Institut de Recherches Cliniques de Montréal (IRCM), Montréal, QC H2W 1R7, Canada; Département de Médecine, Faculté de Médecine, Université de Montréal, Montréal, QC J2S 2M2, Canada
| | - Nicolas Tremblay
- Département de Médecine, Faculté de Médecine, Université de Montréal, Montréal, QC J2S 2M2, Canada; Centre de Recherche du Centre Hospitalier de l'Université de Montréal (CR-CHUM), Montréal, QC H2X 3J4, Canada
| | - Sophie Cardin
- Cardiovascular Biology Research Unit, Institut de Recherches Cliniques de Montréal (IRCM), Montréal, QC H2W 1R7, Canada; Département de Médecine, Faculté de Médecine, Université de Montréal, Montréal, QC J2S 2M2, Canada
| | - Sylvie Picard
- Cardiovascular Biology Research Unit, Institut de Recherches Cliniques de Montréal (IRCM), Montréal, QC H2W 1R7, Canada
| | - Martin Baril
- Département de Médecine, Faculté de Médecine, Université de Montréal, Montréal, QC J2S 2M2, Canada; Centre de Recherche du Centre Hospitalier de l'Université de Montréal (CR-CHUM), Montréal, QC H2X 3J4, Canada
| | - Daniel Lamarre
- Département de Médecine, Faculté de Médecine, Université de Montréal, Montréal, QC J2S 2M2, Canada; Centre de Recherche du Centre Hospitalier de l'Université de Montréal (CR-CHUM), Montréal, QC H2X 3J4, Canada
| | - Christian F Deschepper
- Cardiovascular Biology Research Unit, Institut de Recherches Cliniques de Montréal (IRCM), Montréal, QC H2W 1R7, Canada; Département de Médecine, Faculté de Médecine, Université de Montréal, Montréal, QC J2S 2M2, Canada.
| |
|
98
|
The 'heritability' of domestication and its functional partitioning in the pig. Heredity (Edinb) 2016; 118:160-168. [PMID: 27649617 DOI: 10.1038/hdy.2016.78] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 11/02/2015] [Revised: 07/04/2016] [Accepted: 07/04/2016] [Indexed: 11/08/2022]
Abstract
We propose to estimate the proportion of variance explained by regression on genome-wide markers (or genomic heritability) when wild/domestic status is considered the phenotype of interest. This approach differs from the standard Fst in that it can accommodate genetic similarity between individuals in a general form. We apply this strategy to complete genome data from 47 wild and domestic pigs from Asia and Europe. When we partitioned the total genomic variance into components associated to subsets of single nucleotide polymorphisms (SNPs) defined in terms of their annotation, we found that potentially deleterious non-synonymous mutations (9566 SNPs) explained as much genetic variance as the whole set of 25 million SNPs. This suggests that domestication may have affected protein sequence to a larger extent than regulatory or other kinds of mutations. A pathway-guided analysis revealed ovarian steroidogenesis and leptin signaling as highly relevant in domestication. The genomic regression approach proposed in this study revealed molecular processes not apparent through typical differentiation statistics. We propose that at least some of these processes are likely new discoveries because domestication is a dynamic process of genetic selection, which may not be completely characterized by a static metric like Fst. Nevertheless, and despite some particularly influential mutation types or pathways, our analyses tend to rule out a simplistic genetic basis for the domestication process: neither a single pathway nor a unique set of SNPs can explain the process as a whole.
|
99
|
Hore V, Viñuela A, Buil A, Knight J, McCarthy MI, Small K, Marchini J. Tensor decomposition for multiple-tissue gene expression experiments. Nat Genet 2016; 48:1094-100. [PMID: 27479908 PMCID: PMC5010142 DOI: 10.1038/ng.3624] [Citation(s) in RCA: 91] [Impact Index Per Article: 10.1] [Received: 09/11/2015] [Accepted: 06/22/2016] [Indexed: 12/20/2022]
Abstract
Genome-wide association studies of gene expression traits and other cellular phenotypes have successfully identified links between genetic variation and biological processes. The majority of discoveries have uncovered cis-expression quantitative trait locus (eQTL) effects via mass univariate testing of SNPs against gene expression in single tissues. Here we present a Bayesian method for multiple-tissue experiments focusing on uncovering gene networks linked to genetic variation. Our method decomposes the 3D array (or tensor) of gene expression measurements into a set of latent components. We identify sparse gene networks that can then be tested for association against genetic variation across the genome. We apply our method to a data set of 845 individuals from the TwinsUK cohort with gene expression measured via RNA-seq analysis in adipose, lymphoblastoid cell lines (LCLs) and skin. We uncover several gene networks with a genetic basis and clear biological and statistical significance. Extensions of this approach will allow integration of different omics, environmental and phenotypic data sets.
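The idea of decomposing an individuals x genes x tissues array into latent components can be sketched in a much simpler form than the Bayesian sparse method the abstract describes. The example below swaps in a plain rank-1 CP fit by alternating power iterations; the function name `rank1_cp`, the toy dimensions, and the simulated data are assumptions made purely for illustration.

```python
import numpy as np


def rank1_cp(tensor, n_iter=50, seed=0):
    """Fit one component: tensor ~ weight * a (x) b (x) c.

    a: scores over individuals, b: loadings over genes,
    c: loadings over tissues (all scaled to unit length).
    """
    rng = np.random.default_rng(seed)
    _, n_genes, n_tissues = tensor.shape
    b = rng.standard_normal(n_genes)
    c = rng.standard_normal(n_tissues)
    b /= np.linalg.norm(b)
    c /= np.linalg.norm(c)
    for _ in range(n_iter):
        # Update each factor in turn while holding the other two fixed.
        a = np.einsum("igt,g,t->i", tensor, b, c)
        a /= np.linalg.norm(a)
        b = np.einsum("igt,i,t->g", tensor, a, c)
        b /= np.linalg.norm(b)
        c = np.einsum("igt,i,g->t", tensor, a, b)
        c /= np.linalg.norm(c)
    weight = np.einsum("igt,i,g,t->", tensor, a, b, c)
    return weight, a, b, c


# Toy data: 6 individuals, 5 genes, 3 tissues, one planted component.
rng = np.random.default_rng(1)
a0, b0, c0 = rng.standard_normal(6), rng.standard_normal(5), rng.standard_normal(3)
toy = np.einsum("i,g,t->igt", a0, b0, c0)

weight, a, b, c = rank1_cp(toy)
approx = weight * np.einsum("i,g,t->igt", a, b, c)
print("max reconstruction error:", np.abs(approx - toy).max())
```

In a setting like the one the abstract describes, the individual-level scores `a` would be the quantity tested for association with genotypes, while `b` and `c` indicate which genes and tissues take part in the component. Sparsity and uncertainty estimates, which are central to the published method, are not modeled here.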
Affiliation(s)
- Victoria Hore
- Department of Statistics, University of Oxford, 24-29 St Giles, Oxford OX1 3LB, UK
| | - Ana Viñuela
- Department of Twin Research and Genetic Epidemiology, King’s College London, SE1 7EH, UK
| | - Alfonso Buil
- Department of Genetic Medicine and Development, University of Geneva, Geneva, Switzerland
| | - Julian Knight
- The Wellcome Trust Centre for Human Genetics, Roosevelt Drive, Oxford OX3 7BN, UK
| | - Mark I McCarthy
- The Wellcome Trust Centre for Human Genetics, Roosevelt Drive, Oxford OX3 7BN, UK
- Oxford Centre for Diabetes, Endocrinology and Metabolism, University of Oxford, Churchill Hospital, Old Road, Oxford OX3 7LJ
| | - Kerrin Small
- Department of Twin Research and Genetic Epidemiology, King’s College London, SE1 7EH, UK
| | - Jonathan Marchini
- Department of Statistics, University of Oxford, 24-29 St Giles, Oxford OX1 3LB, UK
- The Wellcome Trust Centre for Human Genetics, Roosevelt Drive, Oxford OX3 7BN, UK
| |
|
100
|
Lappalainen T. Functional genomics bridges the gap between quantitative genetics and molecular biology. Genome Res 2016; 25:1427-31. [PMID: 26430152 PMCID: PMC4579327 DOI: 10.1101/gr.190983.115] [Citation(s) in RCA: 53] [Impact Index Per Article: 5.9] [Indexed: 12/21/2022]
Abstract
Deep characterization of molecular function of genetic variants in the human genome is becoming increasingly important for understanding genetic associations to disease and for learning to read the regulatory code of the genome. In this paper, I discuss how recent advances in both quantitative genetics and molecular biology have contributed to understanding functional effects of genetic variants, lessons learned from eQTL studies, and future challenges in this field.
Affiliation(s)
- Tuuli Lappalainen
- New York Genome Center, New York, New York 10013, USA; Department of Systems Biology, Columbia University, New York, New York 10032, USA
| |
|