1
Varshney A, Manickam N, Orchard P, Tovar A, Ventresca C, Zhang Z, Feng F, Mears J, Erdos MR, Narisu N, Nishino K, Rai V, Stringham HM, Jackson AU, Tamsen T, Gao C, Yang M, Koues OI, Welch JD, Burant CF, Williams LK, Jenkinson C, DeFronzo RA, Norton L, Saramies J, Lakka TA, Laakso M, Tuomilehto J, Mohlke KL, Kitzman JO, Koistinen HA, Liu J, Boehnke M, Collins FS, Scott LJ, Parker SCJ. Population-scale skeletal muscle single-nucleus multi-omic profiling reveals extensive context specific genetic regulation. bioRxiv 2024:2023.12.15.571696. [PMID: 38168419 PMCID: PMC10760134 DOI: 10.1101/2023.12.15.571696] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/05/2024]
Abstract
Skeletal muscle, the largest human organ by weight, is relevant in several polygenic metabolic traits and diseases including type 2 diabetes (T2D). Identifying genetic mechanisms underlying these traits requires pinpointing cell types, regulatory elements, target genes, and causal variants. Here, we use genetic multiplexing to generate population-scale single nucleus (sn) chromatin accessibility (snATAC-seq) and transcriptome (snRNA-seq) maps across 287 frozen human skeletal muscle biopsies representing nearly half a million nuclei. We identify 13 cell types and integrate genetic variation to discover >7,000 expression quantitative trait loci (eQTL) and >100,000 chromatin accessibility QTLs (caQTL) across cell types. Learning patterns of e/caQTL sharing across cell types increased precision of effect estimates. We identify high-resolution cell-states and context-specific e/caQTL with significant genotype by context interaction. We identify nearly 2,000 eGenes colocalized with caQTL and construct causal directional maps for chromatin accessibility and gene expression. Almost 3,500 genome-wide association study (GWAS) signals across 38 relevant traits colocalize with sn-e/caQTL, most in a cell-specific manner. These signals typically colocalize with caQTL and not eQTL, highlighting the importance of population-scale chromatin profiling for GWAS functional studies. Finally, our GWAS-caQTL colocalization data reveal distinct cell-specific regulatory paradigms. Our results illuminate the genetic regulatory architecture of human skeletal muscle at high resolution epigenomic, transcriptomic, and cell-state scales and serve as a template for population-scale multi-omic mapping in complex tissues and traits.
Affiliation(s)
- Arushi Varshney
  - Dept. of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Nandini Manickam
  - Dept. of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Peter Orchard
  - Dept. of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Adelaide Tovar
  - Dept. of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Christa Ventresca
  - Dept. of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
  - Dept. of Human Genetics, University of Michigan, Ann Arbor, MI, USA
- Zhenhao Zhang
  - Dept. of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Fan Feng
  - Dept. of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Joseph Mears
  - Dept. of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Michael R Erdos
  - Center for Precision Health Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Narisu Narisu
  - Center for Precision Health Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Kirsten Nishino
  - Dept. of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Vivek Rai
  - Dept. of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Heather M Stringham
  - Department of Biostatistics, Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, USA
- Anne U Jackson
  - Department of Biostatistics, Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, USA
- Tricia Tamsen
  - Biomedical Research Core Facilities Advanced Genomics Core, University of Michigan, Ann Arbor, MI, USA
- Chao Gao
  - Dept. of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Mao Yang
  - Department of Internal Medicine, Center for Individualized and Genomic Medicine Research, Henry Ford Hospital, Detroit, MI, USA
- Olivia I Koues
  - Biomedical Research Core Facilities Advanced Genomics Core, University of Michigan, Ann Arbor, MI, USA
- Joshua D Welch
  - Dept. of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Charles F Burant
  - Department of Internal Medicine, University of Michigan, Ann Arbor, Michigan, USA
- L Keoki Williams
  - Department of Internal Medicine, Center for Individualized and Genomic Medicine Research, Henry Ford Hospital, Detroit, MI, USA
- Chris Jenkinson
  - South Texas Diabetes and Obesity Research Institute, School of Medicine, University of Texas, Rio Grande Valley, TX, USA
- Ralph A DeFronzo
  - Department of Medicine/Diabetes Division, University of Texas Health, San Antonio, TX, USA
- Luke Norton
  - Department of Medicine/Diabetes Division, University of Texas Health, San Antonio, TX, USA
- Jouko Saramies
  - Savitaipale Health Center, South Karelia Central Hospital, Lappeenranta, Finland
- Timo A Lakka
  - Institute of Biomedicine, University of Eastern Finland, Kuopio, Finland
- Markku Laakso
  - Institute of Clinical Medicine, University of Eastern Finland, Kuopio, Finland
- Jaakko Tuomilehto
  - Dept. of Public Health and Welfare, Finnish Institute for Health and Welfare, Helsinki, Finland
  - Dept. of Public Health, University of Helsinki, Helsinki, Finland
  - Diabetes Research Group, King Abdulaziz University, Jeddah, Saudi Arabia
- Karen L Mohlke
  - Dept. of Genetics, University of North Carolina, Chapel Hill, NC, USA
- Jacob O Kitzman
  - Dept. of Human Genetics, University of Michigan, Ann Arbor, MI, USA
  - Dept. of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Heikki A Koistinen
  - Dept. of Public Health and Welfare, Finnish Institute for Health and Welfare, Helsinki, Finland
  - Department of Medicine, University of Helsinki and Helsinki University Hospital, Helsinki, Finland
  - Minerva Foundation Institute for Medical Research, Helsinki, Finland
- Jie Liu
  - Dept. of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Michael Boehnke
  - Department of Biostatistics, Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, USA
- Francis S Collins
  - Center for Precision Health Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Laura J Scott
  - Department of Biostatistics, Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, USA
- Stephen C J Parker
  - Dept. of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
  - Dept. of Human Genetics, University of Michigan, Ann Arbor, MI, USA
  - Department of Biostatistics, Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, USA
2
Ivancevic A, Simpson DM, Joyner OM, Bagby SM, Nguyen LL, Bitler BG, Pitts TM, Chuong EB. Endogenous retroviruses mediate transcriptional rewiring in response to oncogenic signaling in colorectal cancer. Sci Adv 2024; 10:eado1218. [PMID: 39018396 PMCID: PMC466953 DOI: 10.1126/sciadv.ado1218] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2024] [Accepted: 06/13/2024] [Indexed: 07/19/2024]
Abstract
Cancer cells exhibit rewired transcriptional regulatory networks that promote tumor growth and survival. However, the mechanisms underlying the formation of these pathological networks remain poorly understood. Through a pan-cancer epigenomic analysis, we found that primate-specific endogenous retroviruses (ERVs) are a rich source of enhancers displaying cancer-specific activity. In colorectal cancer and other epithelial tumors, oncogenic MAPK/AP1 signaling drives the activation of enhancers derived from the primate-specific ERV family LTR10. Functional studies in colorectal cancer cells revealed that LTR10 elements regulate tumor-specific expression of multiple genes associated with tumorigenesis, such as ATG12 and XRCC4. Within the human population, individual LTR10 elements exhibit germline and somatic structural variation resulting from a highly mutable internal tandem repeat region, which affects AP1 binding activity. Our findings reveal that ERV-derived enhancers contribute to transcriptional dysregulation in response to oncogenic signaling and shape the evolution of cancer-specific regulatory networks.
Affiliation(s)
- Atma Ivancevic
  - BioFrontiers Institute and Department of Molecular, Cellular and Developmental Biology, University of Colorado Boulder, Boulder, CO, USA
- David M. Simpson
  - BioFrontiers Institute and Department of Molecular, Cellular and Developmental Biology, University of Colorado Boulder, Boulder, CO, USA
- Olivia M. Joyner
  - BioFrontiers Institute and Department of Molecular, Cellular and Developmental Biology, University of Colorado Boulder, Boulder, CO, USA
- Stacey M. Bagby
  - Division of Medical Oncology, Department of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Lily L. Nguyen
  - BioFrontiers Institute and Department of Molecular, Cellular and Developmental Biology, University of Colorado Boulder, Boulder, CO, USA
  - Division of Reproductive Sciences, Department of Obstetrics and Gynecology, University of Colorado School of Medicine, Aurora, CO, USA
- Ben G. Bitler
  - Division of Reproductive Sciences, Department of Obstetrics and Gynecology, University of Colorado School of Medicine, Aurora, CO, USA
- Todd M. Pitts
  - Division of Medical Oncology, Department of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Edward B. Chuong
  - BioFrontiers Institute and Department of Molecular, Cellular and Developmental Biology, University of Colorado Boulder, Boulder, CO, USA
3
Tovar A, Kyono Y, Nishino K, Bose M, Varshney A, Parker SCJ, Kitzman JO. Using a modular massively parallel reporter assay to discover context-specific regulatory grammars in type 2 diabetes. bioRxiv 2023:2023.10.08.561391. [PMID: 37873175 PMCID: PMC10592691 DOI: 10.1101/2023.10.08.561391] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2023]
Abstract
Recent genome-wide association studies have established that most complex disease-associated loci are found in noncoding regions where defining their function is nontrivial. In this study, we leverage a modular massively parallel reporter assay (MPRA) to uncover sequence features linked to context-specific regulatory activity. We screened enhancer activity across a panel of 198-bp fragments spanning over 10k type 2 diabetes- and metabolic trait-associated variants in the 832/13 rat insulinoma cell line, a relevant model of pancreatic beta cells. We explored these fragments' context sensitivity by comparing their activities when placed up- or downstream of a reporter gene, and in combination with either a synthetic housekeeping promoter (SCP1) or a more biologically relevant promoter corresponding to the human insulin gene (INS). We identified clear effects of MPRA construct design on measured fragment enhancer activity. Specifically, a subset of fragments (n = 702/11,656) displayed positional bias, evenly distributed across up- and downstream preference. A separate set of fragments exhibited promoter bias (n = 698/11,656), mostly towards the cell-specific INS promoter (73.4%). To identify sequence features associated with promoter preference, we used Lasso regression with 562 genomic annotations and discovered that fragments with INS promoter-biased activity are enriched for HNF1 motifs. HNF1 family transcription factors are key regulators of glucose metabolism disrupted in maturity onset diabetes of the young (MODY), suggesting genetic convergence between rare coding variants that cause MODY and common T2D-associated regulatory variants. We designed a follow-up MPRA containing HNF1 motif-enriched fragments and observed several instances where deletion or mutation of HNF1 motifs disrupted the INS promoter-biased enhancer activity, specifically in the beta cell model but not in a skeletal muscle cell line, another diabetes-relevant cell type.
Together, our study suggests that cell-specific regulatory activity is partially influenced by enhancer-promoter compatibility and indicates that careful attention should be paid when designing MPRA libraries to capture context-specific regulatory processes at disease-associated genetic signals.
4
Kravchuk EV, Ashniev GA, Gladkova MG, Orlov AV, Vasileva AV, Boldyreva AV, Burenin AG, Skirda AM, Nikitin PI, Orlova NN. Experimental Validation and Prediction of Super-Enhancers: Advances and Challenges. Cells 2023; 12:cells12081191. [PMID: 37190100 DOI: 10.3390/cells12081191] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/07/2023] [Revised: 04/07/2023] [Accepted: 04/14/2023] [Indexed: 05/17/2023]
Abstract
Super-enhancers (SEs) are cis-regulatory elements of the human genome that have been widely discussed since the term was introduced. Super-enhancers have been shown to be strongly associated with the expression of genes crucial for cell differentiation, cell stability maintenance, and tumorigenesis. Our goal was to systematize research studies dedicated to the investigation of the structure and functions of super-enhancers, as well as to define further perspectives of the field in various applications, such as drug development and clinical use. We reviewed the fundamental studies that provided experimental data on various pathologies and their associations with particular super-enhancers. The analysis of mainstream approaches for SE search and prediction allowed us to accumulate existing data and propose directions for further algorithmic improvements of SEs' reliability levels and efficiency. Thus, here we describe the most robust algorithms, such as ROSE, imPROSE, and DEEPSEN, and suggest their further use for various research and development tasks. The most promising research directions, judged by topic and number of published studies, are cancer-associated super-enhancers and prospective SE-targeted therapy strategies, most of which are discussed in this review.
Affiliation(s)
- Ekaterina V Kravchuk
  - Prokhorov General Physics Institute of the Russian Academy of Sciences, 38 Vavilov St., 119991 Moscow, Russia
  - Faculty of Biology, Lomonosov Moscow State University, Leninskiye Gory, MSU, 1-12, 119991 Moscow, Russia
- German A Ashniev
  - Prokhorov General Physics Institute of the Russian Academy of Sciences, 38 Vavilov St., 119991 Moscow, Russia
  - Faculty of Biology, Lomonosov Moscow State University, Leninskiye Gory, MSU, 1-12, 119991 Moscow, Russia
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, GSP-1, Leninskiye Gory, MSU, 1-73, 119234 Moscow, Russia
- Marina G Gladkova
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, GSP-1, Leninskiye Gory, MSU, 1-73, 119234 Moscow, Russia
- Alexey V Orlov
  - Prokhorov General Physics Institute of the Russian Academy of Sciences, 38 Vavilov St., 119991 Moscow, Russia
- Anastasiia V Vasileva
  - Prokhorov General Physics Institute of the Russian Academy of Sciences, 38 Vavilov St., 119991 Moscow, Russia
- Anna V Boldyreva
  - Prokhorov General Physics Institute of the Russian Academy of Sciences, 38 Vavilov St., 119991 Moscow, Russia
- Alexandr G Burenin
  - Prokhorov General Physics Institute of the Russian Academy of Sciences, 38 Vavilov St., 119991 Moscow, Russia
- Artemiy M Skirda
  - Prokhorov General Physics Institute of the Russian Academy of Sciences, 38 Vavilov St., 119991 Moscow, Russia
- Petr I Nikitin
  - Prokhorov General Physics Institute of the Russian Academy of Sciences, 38 Vavilov St., 119991 Moscow, Russia
- Natalia N Orlova
  - Prokhorov General Physics Institute of the Russian Academy of Sciences, 38 Vavilov St., 119991 Moscow, Russia
5
Orchard P, Manickam N, Ventresca C, Vadlamudi S, Varshney A, Rai V, Kaplan J, Lalancette C, Mohlke KL, Gallagher K, Burant CF, Parker SCJ. Human and rat skeletal muscle single-nuclei multi-omic integrative analyses nominate causal cell types, regulatory elements, and SNPs for complex traits. Genome Res 2021; 31:2258-2275. [PMID: 34815310 PMCID: PMC8647829 DOI: 10.1101/gr.268482.120] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Received: 07/08/2020] [Accepted: 09/16/2021] [Indexed: 12/12/2022]
Abstract
Skeletal muscle accounts for the largest proportion of human body mass, on average, and is a key tissue in complex diseases and mobility. It is composed of several different cell and muscle fiber types. Here, we optimize single-nucleus ATAC-seq (snATAC-seq) to map skeletal muscle cell-specific chromatin accessibility landscapes in frozen human and rat samples, and single-nucleus RNA-seq (snRNA-seq) to map cell-specific transcriptomes in human. We additionally perform multi-omics profiling (gene expression and chromatin accessibility) on human and rat muscle samples. We capture type I and type II muscle fiber signatures, which are generally missed by existing single-cell RNA-seq methods. We perform cross-modality and cross-species integrative analyses on 33,862 nuclei and identify seven cell types ranging in abundance from 59.6% to 1.0% of all nuclei. We introduce a regression-based approach to infer cell types by comparing transcription start site-distal ATAC-seq peaks to reference enhancer maps and show consistency with RNA-based marker gene cell type assignments. We find heterogeneity in enrichment of genetic variants linked to complex phenotypes from the UK Biobank and diabetes genome-wide association studies in cell-specific ATAC-seq peaks, with the most striking enrichment patterns in muscle mesenchymal stem cells (∼3.5% of nuclei). Finally, we overlay these chromatin accessibility maps on GWAS data to nominate causal cell types, SNPs, transcription factor motifs, and target genes for type 2 diabetes signals. These chromatin accessibility profiles for human and rat skeletal muscle cell types are a useful resource for nominating causal GWAS SNPs and cell types.
Affiliation(s)
- Peter Orchard
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, USA
- Nandini Manickam
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, USA
- Christa Ventresca
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, USA
  - Department of Human Genetics, University of Michigan, Ann Arbor, Michigan 48109, USA
- Swarooparani Vadlamudi
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, USA
- Arushi Varshney
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, USA
- Vivek Rai
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, USA
- Jeremy Kaplan
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, USA
- Claudia Lalancette
  - Epigenomics Core, University of Michigan, Ann Arbor, Michigan 48109, USA
- Karen L Mohlke
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, USA
- Katherine Gallagher
  - Department of Surgery, University of Michigan, Ann Arbor, Michigan 48109, USA
  - Department of Microbiology and Immunology, University of Michigan, Ann Arbor, Michigan 48109, USA
- Charles F Burant
  - Department of Internal Medicine, University of Michigan, Ann Arbor, Michigan 48109, USA
- Stephen C J Parker
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, USA
  - Department of Human Genetics, University of Michigan, Ann Arbor, Michigan 48109, USA
  - Department of Biostatistics, University of Michigan, Ann Arbor, Michigan 48109, USA
6
Varshney A, Kyono Y, Elangovan VR, Wang C, Erdos MR, Narisu N, Albanus RD, Orchard P, Stitzel ML, Collins FS, Kitzman JO, Parker SCJ. A Transcription Start Site Map in Human Pancreatic Islets Reveals Functional Regulatory Signatures. Diabetes 2021; 70:1581-1591. [PMID: 33849996 PMCID: PMC8336006 DOI: 10.2337/db20-1087] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 10/26/2020] [Accepted: 04/09/2021] [Indexed: 12/21/2022]
Abstract
Identifying the tissue-specific molecular signatures of active regulatory elements is critical to understand gene regulatory mechanisms. Here, we identify transcription start sites (TSS) using cap analysis of gene expression (CAGE) across 57 human pancreatic islet samples. We identify 9,954 reproducible CAGE tag clusters (TCs), ∼20% of which are islet specific and occur mostly distal to known gene TSS. We integrated islet CAGE data with histone modification and chromatin accessibility profiles to identify epigenomic signatures of transcription initiation. Using a massively parallel reporter assay, we validated the transcriptional enhancer activity for 2,279 of 3,378 (∼68%) tested islet CAGE elements (5% false discovery rate). TCs within accessible enhancers show higher enrichment to overlap type 2 diabetes genome-wide association study (GWAS) signals than existing islet annotations, which emphasizes the utility of mapping CAGE profiles in disease-relevant tissue. This work provides a high-resolution map of transcriptional initiation in human pancreatic islets with utility for dissecting active enhancers at GWAS loci.
Affiliation(s)
- Arushi Varshney
  - Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI
  - Department of Human Genetics, University of Michigan, Ann Arbor, MI
- Yasuhiro Kyono
  - Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI
- Collin Wang
  - Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI
- Michael R Erdos
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD
- Narisu Narisu
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD
- Peter Orchard
  - Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI
- Francis S Collins
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD
- Jacob O Kitzman
  - Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI
  - Department of Human Genetics, University of Michigan, Ann Arbor, MI
- Stephen C J Parker
  - Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI
  - Department of Human Genetics, University of Michigan, Ann Arbor, MI
7
Ohnmacht J, May P, Sinkkonen L, Krüger R. Missing heritability in Parkinson's disease: the emerging role of non-coding genetic variation. J Neural Transm (Vienna) 2020; 127:729-748. [PMID: 32248367 PMCID: PMC7242266 DOI: 10.1007/s00702-020-02184-0] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 02/27/2020] [Accepted: 03/24/2020] [Indexed: 02/01/2023]
Abstract
Parkinson's disease (PD) is a neurodegenerative disorder caused by a complex interplay of genetic and environmental factors. For the stratification of PD patients and the development of advanced clinical trials, including causative treatments, a better understanding of the underlying genetic architecture of PD is required. Despite substantial efforts, genome-wide association studies have not been able to explain most of the observed heritability. The majority of PD-associated genetic variants are located in non-coding regions of the genome. A systematic assessment of their functional role is hampered by our incomplete understanding of genotype-phenotype correlations, for example through differential regulation of gene expression. Here, the recent progress and remaining challenges for the elucidation of the role of non-coding genetic variants is reviewed with a focus on PD as a complex disease with multifactorial origins. The function of gene regulatory elements and the impact of non-coding variants on them, and the means to map these elements on a genome-wide level, will be delineated. Moreover, examples of how the integration of functional genomic annotations can serve to identify disease-associated pathways and to prioritize disease- and cell type-specific regulatory variants will be given. Finally, strategies for functional validation and considerations for suitable model systems are outlined. Together this emphasizes the contribution of rare and common genetic variants to the complex pathogenesis of PD and points to remaining challenges for the dissection of genetic complexity that may allow for better stratification, improved diagnostics and more targeted treatments for PD in the future.
Affiliation(s)
- Jochen Ohnmacht
  - LCSB, University of Luxembourg, Belvaux, Luxembourg
  - Department of Life Sciences and Medicine (DLSM), University of Luxembourg, Belvaux, Luxembourg
- Patrick May
  - LCSB, University of Luxembourg, Belvaux, Luxembourg
- Lasse Sinkkonen
  - Department of Life Sciences and Medicine (DLSM), University of Luxembourg, Belvaux, Luxembourg
- Rejko Krüger
  - LCSB, University of Luxembourg, Belvaux, Luxembourg
  - Luxembourg Institute of Health (LIH), Transversal Translational Medicine, Strassen, Luxembourg
  - Parkinson Research Clinic, Centre Hospitalier de Luxembourg (CHL), Luxembourg, Luxembourg
8
Quang D, Xie X. FactorNet: A deep learning framework for predicting cell type specific transcription factor binding from nucleotide-resolution sequential data. Methods 2019; 166:40-47. [PMID: 30922998 PMCID: PMC6708499 DOI: 10.1016/j.ymeth.2019.03.020] [Citation(s) in RCA: 98] [Impact Index Per Article: 16.3] [Received: 11/01/2018] [Revised: 03/05/2019] [Accepted: 03/20/2019] [Indexed: 01/08/2023]
Abstract
Due to the large numbers of transcription factors (TFs) and cell types, querying binding profiles of all valid TF/cell type pairs is not experimentally feasible. To address this issue, we developed a convolutional-recurrent neural network model, called FactorNet, to computationally impute the missing binding data. FactorNet trains on binding data from reference cell types to make predictions on testing cell types by leveraging a variety of features, including genomic sequences, genome annotations, gene expression, and signal data, such as DNase I cleavage. FactorNet implements several convenient strategies to reduce runtime and memory consumption. By visualizing the neural network models, we can interpret how the model predicts binding. We also investigate the variables that affect cross-cell type accuracy, and offer suggestions to improve upon this field. Our method ranked among the top teams in the ENCODE-DREAM in vivo Transcription Factor Binding Site Prediction Challenge, achieving first place on six of the 13 final round evaluation TF/cell type pairs, the most of any competing team. The FactorNet source code is publicly available, allowing users to reproduce our methodology from the ENCODE-DREAM Challenge.
Affiliation(s)
- Daniel Quang
  - University of California, Department of Computer Science, Irvine, CA 92697, United States
- Xiaohui Xie
  - University of California, Department of Computer Science, Irvine, CA 92697, United States
9
Lee HK, Willi M, Shin HY, Liu C, Hennighausen L. Progressing super-enhancer landscape during mammary differentiation controls tissue-specific gene regulation. Nucleic Acids Res 2018; 46:10796-10809. [PMID: 30285185 PMCID: PMC6237736 DOI: 10.1093/nar/gky891] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 07/17/2018] [Accepted: 09/20/2018] [Indexed: 12/15/2022]
Abstract
The mammary luminal lineage relies on the common cytokine-sensing transcription factor STAT5 to establish super-enhancers during pregnancy and initiate a genetic program that activates milk production. As pups grow, the greatly increasing demand for milk requires progressive differentiation of mammary cells with advancing lactation. Here we investigate how persistent hormonal exposure during lactation shapes an evolving enhancer landscape and impacts the biology of mammary cells. Employing ChIP-seq, we uncover a changing transcription factor occupancy at mammary enhancers, suggesting that their activities evolve with advancing differentiation. Using mouse genetics, we demonstrate that the functions of individual enhancers within the Wap super-enhancer evolve as lactation progresses. Most profoundly, a seed enhancer, which is mandatory for the activation of the Wap super-enhancer during pregnancy, is not required during lactation, suggesting compensatory flexibility. Combinatorial deletions of structurally equivalent constituent enhancers demonstrated differentiation-specific compensatory activities during lactation. We also demonstrate that the Wap super-enhancer, which is built on STAT5 and other common transcription factors, retains its exquisite mammary specificity when placed into globally permissive chromatin, suggesting a limited role of chromatin in controlling cell specificity. Our studies unveil a previously unrecognized progressive enhancer landscape where structurally equivalent components serve unique and differentiation-specific functions.
Affiliation(s)
- Hye Kyung Lee
  - Laboratory of Genetics and Physiology, National Institute of Diabetes and Digestive and Kidney Diseases, US National Institutes of Health, Bethesda, MD 20892, USA
- Michaela Willi
  - Laboratory of Genetics and Physiology, National Institute of Diabetes and Digestive and Kidney Diseases, US National Institutes of Health, Bethesda, MD 20892, USA
- Ha Youn Shin
  - Department of Biomedical Science and Engineering, Konkuk University, Seoul 05029, Republic of Korea
- Chengyu Liu
  - Transgenic Core, National Heart, Lung, and Blood Institute, US National Institutes of Health, Bethesda, MD 20892, USA
- Lothar Hennighausen
  - Laboratory of Genetics and Physiology, National Institute of Diabetes and Digestive and Kidney Diseases, US National Institutes of Health, Bethesda, MD 20892, USA
| |
|
10
|
Jia Y, Chng WJ, Zhou J. Super-enhancers: critical roles and therapeutic targets in hematologic malignancies. J Hematol Oncol 2019; 12:77. [PMID: 31311566 PMCID: PMC6636097 DOI: 10.1186/s13045-019-0757-y] [Citation(s) in RCA: 67] [Impact Index Per Article: 11.2] [Received: 01/24/2019] [Accepted: 06/14/2019] [Indexed: 12/11/2022]
Abstract
Super-enhancers (SEs) in a broad range of human cell types are large clusters of enhancers with aberrantly high levels of transcription factor binding, which are central to driving expression of genes that control cell identity and stimulate oncogenic transcription. Cancer cells acquire super-enhancers at oncogenes, and the cancerous phenotype relies on the abnormal transcription propelled by these SEs. Furthermore, specific inhibitors targeting SE assembly and activation have offered potential targets for treating various tumors, including hematological malignancies. Here, we first review the identification and functional significance of SEs. Next, we summarize recent findings of SEs and SE-driven gene regulation in normal hematopoiesis and hematologic malignancies. The importance and various modes of SE-mediated MYC oncogene amplification are illustrated. Finally, we highlight the progress of SEs as selective therapeutic targets in basic research and clinical trials. Some open questions regarding functional significance and future directions of targeting SEs in the clinic are also discussed.
Affiliation(s)
- Yunlu Jia
- Cancer Science Institute of Singapore, National University of Singapore, 14 Medical Drive, Centre for Translational Medicine, Singapore, 117599 Republic of Singapore
- Department of Surgical Oncology, Sir Run Run Shaw Hospital, Zhejiang University, Hangzhou, 310016 Zhejiang China
| | - Wee-Joo Chng
- Cancer Science Institute of Singapore, National University of Singapore, 14 Medical Drive, Centre for Translational Medicine, Singapore, 117599 Republic of Singapore
- Department of Medicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, 117597 Republic of Singapore
- Department of Hematology-Oncology, National University Cancer Institute of Singapore (NCIS), The National University Health System (NUHS), 1E, Kent Ridge Road, Singapore, 119228 Republic of Singapore
| | - Jianbiao Zhou
- Cancer Science Institute of Singapore, National University of Singapore, 14 Medical Drive, Centre for Translational Medicine, Singapore, 117599 Republic of Singapore
- Department of Medicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, 117597 Republic of Singapore
| |
|
11
|
Hamdan FH, Johnsen SA. Perturbing Enhancer Activity in Cancer Therapy. Cancers (Basel) 2019; 11:cancers11050634. [PMID: 31067678 PMCID: PMC6563029 DOI: 10.3390/cancers11050634] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 03/28/2019] [Revised: 04/26/2019] [Accepted: 05/05/2019] [Indexed: 02/07/2023]
Abstract
Tight regulation of gene transcription is essential for normal development, tissue homeostasis, and disease-free survival. Enhancers are distal regulatory elements in the genome that provide specificity to gene expression programs and are frequently misregulated in cancer. Recent studies examined various enhancer-driven malignant dependencies and identified different approaches to specifically target these programs. In this review, we describe numerous features that make enhancers good transcriptional targets in cancer therapy and discuss different approaches to overcome enhancer perturbation. Interestingly, a number of approved therapeutic agents, such as cyclosporine, steroid hormones, and thiazolidinediones, actually function by affecting enhancer landscapes by directly targeting very specific transcription factor programs. More recently, a broader approach to targeting deregulated enhancer programs has been achieved via Bromodomain and Extraterminal (BET) inhibition or perturbation of transcription-related cyclin-dependent kinases (CDK). One challenge to enhancer-targeted therapy is proper patient stratification. We suggest that monitoring of enhancer RNA (eRNA) expression may serve as a unique biomarker of enhancer activity that can help to predict and monitor responsiveness to enhancer-targeted therapies. A more thorough investigation of cancer-specific enhancers and the underlying mechanisms of deregulation will pave the road for an effective utilization of enhancer modulators in a precision oncology approach to cancer treatment.
Affiliation(s)
- Feda H Hamdan
- Gene Regulatory Mechanisms and Molecular Epigenetics Lab, Division of Gastroenterology and Hepatology, Mayo Clinic, Rochester, MN 55905, USA.
| | - Steven A Johnsen
- Gene Regulatory Mechanisms and Molecular Epigenetics Lab, Division of Gastroenterology and Hepatology, Mayo Clinic, Rochester, MN 55905, USA.
| |
|
12
|
Varshney A, VanRenterghem H, Orchard P, Boyle AP, Stitzel ML, Ucar D, Parker SCJ. Cell Specificity of Human Regulatory Annotations and Their Genetic Effects on Gene Expression. Genetics 2019; 211:549-562. [PMID: 30593493 PMCID: PMC6366912 DOI: 10.1534/genetics.118.301525] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 10/17/2018] [Accepted: 12/09/2018] [Indexed: 12/19/2022]
Abstract
Epigenomic signatures from histone marks and transcription factor (TF)-binding sites have been used to annotate putative gene regulatory regions. However, a direct comparison of these diverse annotations is missing, and it is unclear how genetic variation within these annotations affects gene expression. Here, we compare five widely used annotations of active regulatory elements that represent high densities of one or more relevant epigenomic marks, namely "super" and "typical" (nonsuper) enhancers, stretch enhancers, high-occupancy target (HOT) regions, and broad domains, across the four matched human cell types for which they are available. We observe that stretch and super enhancers cover cell type-specific enhancer "chromatin states," whereas HOT regions and broad domains comprise more ubiquitous promoter states. Expression quantitative trait loci (eQTL) in stretch enhancers have significantly smaller effect sizes compared to those in HOT regions. Strikingly, chromatin accessibility QTL in stretch enhancers have significantly larger effect sizes compared to those in HOT regions. These observations suggest that stretch enhancers could harbor genetically primed chromatin to enable changes in TF binding, possibly to drive cell type-specific responses to environmental stimuli. Our results suggest that current eQTL studies are relatively underpowered or could lack the appropriate environmental context to detect genetic effects in the most cell type-specific "regulatory annotations," which likely contributes to infrequent colocalization of eQTL with genome-wide association study signals.
Affiliation(s)
- Arushi Varshney
- Department of Human Genetics, University of Michigan, Ann Arbor, Michigan 48109
| | - Hadley VanRenterghem
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109
| | - Peter Orchard
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109
| | - Alan P Boyle
- Department of Human Genetics, University of Michigan, Ann Arbor, Michigan 48109
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109
| | - Michael L Stitzel
- The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut 06032
| | - Duygu Ucar
- The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut 06032
| | - Stephen C J Parker
- Department of Human Genetics, University of Michigan, Ann Arbor, Michigan 48109
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109
| |
|
13
|
Khan A, Mathelier A, Zhang X. Super-enhancers are transcriptionally more active and cell type-specific than stretch enhancers. Epigenetics 2018; 13:910-922. [PMID: 30169995 DOI: 10.1080/15592294.2018.1514231] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.0] [Indexed: 01/22/2023]
Abstract
Super-enhancers and stretch enhancers represent classes of transcriptional enhancers that have been shown to control the expression of cell identity genes and carry disease- and trait-associated variants. Specifically, super-enhancers are clusters of enhancers defined based on the binding occupancy of master transcription factors, chromatin regulators, or chromatin marks, while stretch enhancers are large chromatin-defined regulatory regions of at least 3,000 base pairs. Several studies have characterized these regulatory regions in numerous cell types and tissues to decipher their functional importance. However, the differences and similarities between these regulatory regions have not been fully assessed. We integrated genomic, epigenomic, and transcriptomic data from ten human cell types to perform a comparative analysis of super and stretch enhancers with respect to their chromatin profiles, cell type-specificity, and ability to control gene expression. We found that stretch enhancers are more abundant, more distal to transcription start sites, cover twice as much of the genome, and are significantly less conserved than super-enhancers. In contrast, super-enhancers are significantly more enriched for active chromatin marks and cohesin complex, and more transcriptionally active than stretch enhancers. Importantly, a vast majority of super-enhancers (85%) overlap with only a small subset of stretch enhancers (13%), which are enriched for cell type-specific biological functions and control cell identity genes. These results suggest that super-enhancers are transcriptionally more active and cell type-specific than stretch enhancers, and importantly, most of the stretch enhancers that are distinct from super-enhancers do not show an association with cell identity genes, are less active, and are more likely to be poised enhancers.
Affiliation(s)
- Aziz Khan
- Centre for Molecular Medicine Norway (NCMM), Nordic EMBL Partnership, University of Oslo, Oslo, Norway
- Key Lab of Bioinformatics/Bioinformatics Division, BNRIST (Beijing National Research Center for Information Science and Technology), Department of Automation, Tsinghua University, Beijing, China
| | - Anthony Mathelier
- Centre for Molecular Medicine Norway (NCMM), Nordic EMBL Partnership, University of Oslo, Oslo, Norway
- Department of Cancer Genetics, Institute for Cancer Research, Oslo University Hospital Radiumhospitalet, Oslo, Norway
| | - Xuegong Zhang
- Key Lab of Bioinformatics/Bioinformatics Division, BNRIST (Beijing National Research Center for Information Science and Technology), Department of Automation, Tsinghua University, Beijing, China
- School of Life Sciences, Tsinghua University, Beijing, China
| |
|
14
|
Developmental Control of NRAMP1 (SLC11A1) Expression in Professional Phagocytes. Biology 2017; 6:biology6020028. [PMID: 28467369 PMCID: PMC5485475 DOI: 10.3390/biology6020028] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 01/20/2017] [Revised: 04/25/2017] [Accepted: 04/25/2017] [Indexed: 12/11/2022]
Abstract
NRAMP1 (SLC11A1) is a professional phagocyte membrane importer of divalent metals that contributes to iron recycling at homeostasis and to nutritional immunity against infection. Analyses of data generated by several consortia and additional studies were integrated to hypothesize mechanisms restricting NRAMP1 expression to mature phagocytes. Results from various epigenetic and transcriptomic approaches were collected for mesodermal and hematopoietic cell types and compiled for combined analysis with results of genetic studies associating single nucleotide polymorphisms (SNPs) with variations in NRAMP1 expression (eQTLs). Analyses establish that NRAMP1 is part of an autonomous topologically associated domain delimited by ubiquitous CCCTC-binding factor (CTCF) sites. NRAMP1 locus contains five regulatory regions: a predicted super-enhancer (S-E) key to phagocyte-specific expression; the proximal promoter; two intronic areas, including 3' inhibitory elements that restrict expression during development; and a block of upstream sites possibly extending the S-E domain. Also the downstream region adjacent to the 3' CTCF locus boundary may regulate expression during hematopoiesis. Mobilization of the locus 14 predicted transcriptional regulatory elements occurs in three steps, beginning with hematopoiesis; at the onset of myelopoiesis and through myelo-monocytic differentiation. Basal expression level in mature phagocytes is further influenced by genetic variation, tissue environment, and in response to infections that induce various epigenetic memories depending on microorganism nature. Constitutively associated transcription factors (TFs) include CCAAT enhancer binding protein beta (C/EBPb), purine rich DNA binding protein (PU.1), early growth response 2 (EGR2) and signal transducer and activator of transcription 1 (STAT1) while hypoxia-inducible factors (HIFs) and interferon regulatory factor 1 (IRF1) may stimulate iron acquisition in pro-inflammatory conditions. 
Mouse orthologous locus is generally conserved; chromatin patterns typify a de novo myelo-monocytic gene whose expression is tightly controlled by TFs Pu.1, C/ebps and Irf8; Irf3 and nuclear factor NF-kappa-B p 65 subunit (RelA) regulate expression in inflammatory conditions. Functional differences in the determinants identified at these orthologous loci imply that species-specific mechanisms control gene expression.
|
15
|
Keating ST, Plutzky J, El-Osta A. Epigenetic Changes in Diabetes and Cardiovascular Risk. Circ Res 2017; 118:1706-22. [PMID: 27230637 DOI: 10.1161/circresaha.116.306819] [Citation(s) in RCA: 108] [Impact Index Per Article: 13.5] [Received: 03/08/2016] [Accepted: 04/30/2016] [Indexed: 01/03/2023]
Abstract
Cardiovascular complications remain the leading causes of morbidity and premature mortality in patients with diabetes mellitus. Studies in humans and preclinical models demonstrate lasting gene expression changes in the vasculopathies initiated by previous exposure to high glucose concentrations and the associated overproduction of reactive oxygen species. The molecular signatures of chromatin architectures that sensitize the genome to these and other cardiometabolic risk factors of the diabetic milieu are increasingly implicated in the biological memory underlying cardiovascular complications and now widely considered as promising therapeutic targets. Atherosclerosis is a complex heterocellular disease where the contributing cell types possess distinct epigenomes shaping diverse gene expression. Although the extent that pathological chromatin changes can be manipulated in human cardiovascular disease remains to be established, the clinical applicability of epigenetic interventions will be greatly advanced by a deeper understanding of the cell type-specific roles played by writers, erasers, and readers of chromatin modifications in the diabetic vasculature. This review details a current perspective of epigenetic mechanisms of macrovascular disease in diabetes mellitus and highlights recent key descriptions of chromatinized changes associated with persistent gene expression in endothelial, smooth muscle, and circulating immune cells relevant to atherosclerosis. Furthermore, we discuss the challenges associated with pharmacological targeting of epigenetic networks to correct abnormal or deregulated gene expression as a strategy to alleviate the clinical burden of diabetic cardiovascular disease.
Affiliation(s)
- Samuel T Keating
- From the Epigenetics in Human Health and Disease Laboratory (S.T.K., A.E.-O.) and Epigenomics Profiling Facility (A.E.-O.), Baker IDI Heart and Diabetes Institute, The Alfred Medical Research and Education Precinct, Melbourne, Victoria, Australia; Division of Cardiovascular Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA (J.P.); Department of Pathology, The University of Melbourne, Victoria, Australia (A.E.-O.); and Central Clinical School, Department of Medicine, Monash University, Victoria, Australia (A.E.-O.)
| | - Jorge Plutzky
- From the Epigenetics in Human Health and Disease Laboratory (S.T.K., A.E.-O.) and Epigenomics Profiling Facility (A.E.-O.), Baker IDI Heart and Diabetes Institute, The Alfred Medical Research and Education Precinct, Melbourne, Victoria, Australia; Division of Cardiovascular Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA (J.P.); Department of Pathology, The University of Melbourne, Victoria, Australia (A.E.-O.); and Central Clinical School, Department of Medicine, Monash University, Victoria, Australia (A.E.-O.)
| | - Assam El-Osta
- From the Epigenetics in Human Health and Disease Laboratory (S.T.K., A.E.-O.) and Epigenomics Profiling Facility (A.E.-O.), Baker IDI Heart and Diabetes Institute, The Alfred Medical Research and Education Precinct, Melbourne, Victoria, Australia; Division of Cardiovascular Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA (J.P.); Department of Pathology, The University of Melbourne, Victoria, Australia (A.E.-O.); and Central Clinical School, Department of Medicine, Monash University, Victoria, Australia (A.E.-O.).
| |
|
16
|
Expression of long non-coding RNAs in autoimmunity and linkage to enhancer function and autoimmune disease risk genetic variants. J Autoimmun 2017; 81:99-109. [PMID: 28420548 DOI: 10.1016/j.jaut.2017.03.014] [Citation(s) in RCA: 52] [Impact Index Per Article: 6.5] [Received: 01/31/2017] [Revised: 03/29/2017] [Accepted: 03/31/2017] [Indexed: 01/19/2023]
Abstract
Genome-wide association studies have identified numerous genetic variants conferring autoimmune disease risk. Most of these genetic variants lie outside protein-coding genes hampering mechanistic explorations. Numerous mRNAs are also differentially expressed in autoimmune disease but their regulation is also unclear. The majority of the human genome is transcribed yet its biologic significance is incompletely understood. We performed whole genome RNA-sequencing [RNA-seq] to categorize expression of mRNAs, known and novel long non-coding RNAs [lncRNAs] in leukocytes from subjects with autoimmune disease and identified annotated and novel lncRNAs differentially expressed across multiple disorders. We found that loci transcribing novel lncRNAs were not randomly distributed across the genome but co-localized with leukocyte transcriptional enhancers, especially super-enhancers, and near genetic variants associated with autoimmune disease risk. We propose that alterations in enhancer function, including lncRNA expression, produced by genetics and environment, change cellular phenotypes contributing to disease risk and pathogenesis and represent attractive therapeutic targets.
|
17
|
Pierce S, Coetzee GA. Parkinson's disease-associated genetic variation is linked to quantitative expression of inflammatory genes. PLoS One 2017; 12:e0175882. [PMID: 28407015 PMCID: PMC5391096 DOI: 10.1371/journal.pone.0175882] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.0] [Received: 12/30/2016] [Accepted: 03/31/2017] [Indexed: 12/11/2022]
Abstract
Genome-wide association studies (GWAS) have linked dozens of single nucleotide polymorphisms (SNPs) with Parkinson’s disease (PD) risk. Ascertaining the functional and eventual causal mechanisms underlying these relationships has proven difficult. The majority of risk SNPs, and nearby SNPs in linkage disequilibrium (LD), are found in intergenic or intronic regions and confer risk through allele-dependent expression of multiple unknown target genes. Combining GWAS results with publicly available GTEx data, generated through eQTL (expression quantitative trait loci) identification studies, enables a direct association of SNPs to gene expression levels and aids in narrowing the large population of potential genetic targets for hypothesis-driven experimental cell biology. Separately, overlapping of SNPs with putative enhancer segmentations can strengthen target filtering. We report here the results of analyzing 7,607 PD risk SNPs along with an additional 23,759 high linkage disequilibrium-associated variants paired with eQTL gene expression. We found that enrichment analysis on the set of genes following target filtering pointed to a single large LD block at 6p21 that contained multiple HLA-MHC-II genes. These MHC-II genes remain associated with PD when the genes were filtered for correlation between GWAS significance and eQTL levels, strongly indicating a direct effect on PD etiology.
Affiliation(s)
- Steven Pierce
- Center for Neurodegenerative Science, Van Andel Research Institute, Grand Rapids, MI, United States
| | - Gerhard A. Coetzee
- Center for Neurodegenerative Science, Van Andel Research Institute, Grand Rapids, MI, United States
| |
|
18
|
Lawlor N, Khetan S, Ucar D, Stitzel ML. Genomics of Islet (Dys)function and Type 2 Diabetes. Trends Genet 2017; 33:244-255. [PMID: 28245910 DOI: 10.1016/j.tig.2017.01.010] [Citation(s) in RCA: 49] [Impact Index Per Article: 6.1] [Received: 12/16/2016] [Accepted: 01/30/2017] [Indexed: 12/28/2022]
Abstract
Pancreatic islet dysfunction and beta cell failure are hallmarks of type 2 diabetes mellitus (T2DM) pathogenesis. In this review, we discuss how genome-wide association studies (GWASs) and recent developments in islet (epi)genome and transcriptome profiling (particularly single cell analyses) are providing novel insights into the genetic, environmental, and cellular contributions to islet (dys)function and T2DM pathogenesis. Moving forward, study designs that interrogate and model genetic variation [e.g., allelic profiling and (epi)genome editing] will be critical to dissect the molecular genetics of T2DM pathogenesis, to build next-generation cellular and animal models, and to develop precision medicine approaches to detect, treat, and prevent islet (dys)function and T2DM.
Affiliation(s)
- Nathan Lawlor
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
| | - Shubham Khetan
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA; Department of Genetics & Genome Sciences, University of Connecticut, Farmington, CT 06032, USA
| | - Duygu Ucar
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA; Institute for Systems Genomics, University of Connecticut, Farmington, CT 06032, USA
| | - Michael L Stitzel
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA; Department of Genetics & Genome Sciences, University of Connecticut, Farmington, CT 06032, USA; Institute for Systems Genomics, University of Connecticut, Farmington, CT 06032, USA.
| |
|
19
|
Genetic regulatory signatures underlying islet gene expression and type 2 diabetes. Proc Natl Acad Sci U S A 2017; 114:2301-2306. [PMID: 28193859 DOI: 10.1073/pnas.1621192114] [Citation(s) in RCA: 126] [Impact Index Per Article: 15.8] [Indexed: 12/12/2022]
Abstract
Genome-wide association studies (GWAS) have identified >100 independent SNPs that modulate the risk of type 2 diabetes (T2D) and related traits. However, the pathogenic mechanisms of most of these SNPs remain elusive. Here, we examined genomic, epigenomic, and transcriptomic profiles in human pancreatic islets to understand the links between genetic variation, chromatin landscape, and gene expression in the context of T2D. We first integrated genome and transcriptome variation across 112 islet samples to produce dense cis-expression quantitative trait loci (cis-eQTL) maps. Additional integration with chromatin-state maps for islets and other diverse tissue types revealed that cis-eQTLs for islet-specific genes are specifically and significantly enriched in islet stretch enhancers. High-resolution chromatin accessibility profiling using assay for transposase-accessible chromatin sequencing (ATAC-seq) in two islet samples enabled us to identify specific transcription factor (TF) footprints embedded in active regulatory elements, which are highly enriched for islet cis-eQTL. Aggregate allelic bias signatures in TF footprints enabled us de novo to reconstruct TF binding affinities genetically, which support the high-quality nature of the TF footprint predictions. Interestingly, we found that T2D GWAS loci were strikingly and specifically enriched in islet Regulatory Factor X (RFX) footprints. Remarkably, within and across independent loci, T2D risk alleles that overlap with RFX footprints uniformly disrupt the RFX motifs at high-information content positions. Together, these results suggest that common regulatory variations have shaped islet TF footprints and the transcriptome and that a confluent RFX regulatory grammar plays a significant role in the genetic component of T2D predisposition.
|
20
|
Ehrlich KC, Paterson HL, Lacey M, Ehrlich M. DNA Hypomethylation in Intragenic and Intergenic Enhancer Chromatin of Muscle-Specific Genes Usually Correlates with their Expression. Yale J Biol Med 2016; 89:441-455. [PMID: 28018137 PMCID: PMC5168824] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022]
Abstract
Tissue-specific enhancers are critical for gene regulation. In this study, we help elucidate the contribution of muscle-associated differential DNA methylation to the enhancer activity of highly muscle-specific genes. By bioinformatic analysis of 44 muscle-associated genes, we show that preferential gene expression in skeletal muscle (SkM) correlates with SkM-specific intragenic and intergenic enhancer chromatin and overlapping foci of DNA hypomethylation. Some genes, e.g., CASQ1 and FBXO32, displayed broad regions of both SkM- and heart-specific enhancer chromatin but exhibited focal SkM-specific DNA hypomethylation. Half of the genes had SkM-specific super-enhancers. In contrast to simple enhancer/gene-expression correlations, a super-enhancer was associated with the myogenic MYOD1 gene in both SkM and myoblasts even though SkM has < 1 percent as much MYOD1 expression. Local chromatin differences in this super-enhancer probably contribute to the SkM/myoblast differential expression. Transfection assays confirmed the tissue-specificity of the 0.3-kb core enhancer within MYOD1's super-enhancer and demonstrated its repression by methylation of its three CG dinucleotides. Our study suggests that DNA hypomethylation increases enhancer tissue-specificity and that SkM super-enhancers sometimes are poised for physiologically important, rapid up-regulation.
Affiliation(s)
- Kenneth C. Ehrlich
- Program in Bioinformatics and Genomics, Tulane University Health Sciences Center, New Orleans, LA
| | | | - Michelle Lacey
- Tulane Cancer Center, Tulane University Health Sciences Center, New Orleans, LA
- Mathematics Department, Tulane University, New Orleans, LA
| | - Melanie Ehrlich
- Program in Bioinformatics and Genomics, Tulane University Health Sciences Center, New Orleans, LA
- Tulane Cancer Center, Tulane University Health Sciences Center, New Orleans, LA
- Hayward Genetics Center, Tulane University Health Sciences Center, New Orleans, LA
- To whom all correspondence should be addressed: Melanie Ehrlich, PhD, Hayward Genetics Center, Tulane University Health Sciences Center, 1430 Tulane Ave., New Orleans, LA 70112; Tel: 504-988-2449; Fax: 504-988-1763
| |
|
21
|
Scott LJ, Erdos MR, Huyghe JR, Welch RP, Beck AT, Wolford BN, Chines PS, Didion JP, Narisu N, Stringham HM, Taylor DL, Jackson AU, Vadlamudi S, Bonnycastle LL, Kinnunen L, Saramies J, Sundvall J, Albanus RD, Kiseleva A, Hensley J, Crawford GE, Jiang H, Wen X, Watanabe RM, Lakka TA, Mohlke KL, Laakso M, Tuomilehto J, Koistinen HA, Boehnke M, Collins FS, Parker SCJ. The genetic regulatory signature of type 2 diabetes in human skeletal muscle. Nat Commun 2016; 7:11764. [PMID: 27353450 PMCID: PMC4931250 DOI: 10.1038/ncomms11764] [Citation(s) in RCA: 96] [Impact Index Per Article: 10.7] [Received: 08/17/2015] [Accepted: 04/27/2016] [Indexed: 12/11/2022]
Abstract
Type 2 diabetes (T2D) results from the combined effects of genetic and environmental factors on multiple tissues over time. Of the >100 variants associated with T2D and related traits in genome-wide association studies (GWAS), >90% occur in non-coding regions, suggesting a strong regulatory component to T2D risk. Here to understand how T2D status, metabolic traits and genetic variation influence gene expression, we analyse skeletal muscle biopsies from 271 well-phenotyped Finnish participants with glucose tolerance ranging from normal to newly diagnosed T2D. We perform high-depth strand-specific mRNA-sequencing and dense genotyping. Computational integration of these data with epigenome data, including ATAC-seq on skeletal muscle, and transcriptome data across diverse tissues reveals that the tissue-specific genetic regulatory architecture of skeletal muscle is highly enriched in muscle stretch/super enhancers, including some that overlap T2D GWAS variants. In one such example, T2D risk alleles residing in a muscle stretch/super enhancer are linked to increased expression and alternative splicing of muscle-specific isoforms of ANK1.
Affiliation(s)
- Laura J. Scott
  - Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, Michigan 48109, USA
- Michael R. Erdos
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
- Jeroen R. Huyghe
  - Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, Michigan 48109, USA
- Ryan P. Welch
  - Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, Michigan 48109, USA
- Andrew T. Beck
  - Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, Michigan 48109, USA
- Brooke N. Wolford
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
- Peter S. Chines
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
- John P. Didion
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
- Narisu Narisu
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
- Heather M. Stringham
  - Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, Michigan 48109, USA
- D. Leland Taylor
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
  - European Bioinformatics Institute, European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Anne U. Jackson
  - Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, Michigan 48109, USA
- Swarooparani Vadlamudi
  - Department of Genetics, University of North Carolina, Chapel Hill, North Carolina 27599, USA
- Lori L. Bonnycastle
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
- Leena Kinnunen
  - Department of Health, National Institute for Health and Welfare, P.O. Box 30, Helsinki FI-00271, Finland
- Jouko Saramies
  - South Karelia Central Hospital, Lappeenranta 53130, Finland
- Jouko Sundvall
  - Department of Health, National Institute for Health and Welfare, P.O. Box 30, Helsinki FI-00271, Finland
- Ricardo D'Oliveira Albanus
  - Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, USA
- Anna Kiseleva
  - Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, USA
- John Hensley
  - Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, USA
- Gregory E. Crawford
  - Center for Genomic & Computational Biology, Duke University, Durham, North Carolina 27708, USA
  - Department of Pediatrics, Division of Medical Genetics, Duke University Medical Center, Durham, North Carolina 27708, USA
- Hui Jiang
  - Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, Michigan 48109, USA
- Xiaoquan Wen
  - Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, Michigan 48109, USA
- Richard M. Watanabe
  - Department of Preventive Medicine, Keck School of Medicine of USC, Los Angeles, California 90089, USA
  - Department of Physiology and Biophysics, Keck School of Medicine of USC, Los Angeles, California 90089, USA
- Timo A. Lakka
  - Institute of Biomedicine/Physiology, University of Eastern Finland, Kuopio FI-00100, Finland
  - Kuopio Research Institute of Exercise Medicine, Kuopio FI-00100, Finland
  - Department of Clinical Physiology and Nuclear Medicine, Kuopio University Hospital, University of Eastern Finland, Kuopio FI-00100, Finland
- Karen L. Mohlke
  - Department of Genetics, University of North Carolina, Chapel Hill, North Carolina 27599, USA
- Markku Laakso
  - Department of Medicine, University of Eastern Finland, Kuopio FI-00100, Finland
  - Kuopio University Hospital, Kuopio FI-00100, Finland
- Jaakko Tuomilehto
  - Chronic Disease Prevention Unit, National Institute for Health and Welfare, P.O. Box 30, Helsinki FI-00271, Finland
  - Center for Vascular Prevention, Danube University Krems, Krems 3500, Austria
  - Diabetes Research Group, King Abdulaziz University, Jeddah 21589, Saudi Arabia
  - Dasman Diabetes Institute, Dasman 15461, Kuwait
- Heikki A. Koistinen
  - Department of Health, National Institute for Health and Welfare, P.O. Box 30, Helsinki FI-00271, Finland
  - Department of Medicine and Abdominal Center: Endocrinology, University of Helsinki and Helsinki University Central Hospital, P.O. Box 340, Haartmaninkatu 4, Helsinki FI-00029, Finland
  - Minerva Foundation Institute for Medical Research, Biomedicum 2U, Tukholmankatu 8, Helsinki FI-00290, Finland
- Michael Boehnke
  - Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, Michigan 48109, USA
- Francis S. Collins
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
- Stephen C. J. Parker
  - Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, USA
  - Department of Human Genetics, University of Michigan, Ann Arbor, Michigan 48109, USA
22
Quang D, Xie X. DanQ: a hybrid convolutional and recurrent deep neural network for quantifying the function of DNA sequences. Nucleic Acids Res 2016; 44:e107. [PMID: 27084946 PMCID: PMC4914104 DOI: 10.1093/nar/gkw226] [Citation(s) in RCA: 454] [Impact Index Per Article: 50.4] [Received: 12/19/2015] [Revised: 02/27/2016] [Accepted: 03/22/2016] [Indexed: 01/19/2023]
Abstract
Modeling the properties and functions of DNA sequences is an important, but challenging task in the broad field of genomics. This task is particularly difficult for non-coding DNA, the vast majority of which is still poorly understood in terms of function. A powerful predictive model for the function of non-coding DNA can have enormous benefit for both basic science and translational research because over 98% of the human genome is non-coding and 93% of disease-associated variants lie in these regions. To address this need, we propose DanQ, a novel hybrid convolutional and bi-directional long short-term memory recurrent neural network framework for predicting non-coding function de novo from sequence. In the DanQ model, the convolution layer captures regulatory motifs, while the recurrent layer captures long-term dependencies between the motifs in order to learn a regulatory 'grammar' to improve predictions. DanQ improves considerably upon other models across several metrics. For some regulatory markers, DanQ can achieve over a 50% relative improvement in the area under the precision-recall curve metric compared to related models. We have made the source code available at the github repository http://github.com/uci-cbcl/DanQ.
Affiliation(s)
- Daniel Quang
  - Department of Computer Science, University of California, Irvine, CA 92697, USA
  - Center for Complex Biological Systems, University of California, Irvine, CA 92697, USA
- Xiaohui Xie
  - Department of Computer Science, University of California, Irvine, CA 92697, USA
  - Center for Complex Biological Systems, University of California, Irvine, CA 92697, USA
23
Super Enhancers in Cancers, Complex Disease, and Developmental Disorders. Genes (Basel) 2015; 6:1183-200. [PMID: 26569311 PMCID: PMC4690034 DOI: 10.3390/genes6041183] [Citation(s) in RCA: 53] [Impact Index Per Article: 5.3] [Key Words] [Received: 07/09/2015] [Revised: 10/24/2015] [Accepted: 10/26/2015] [Indexed: 11/17/2022]
Abstract
Recently, unique areas of transcriptional regulation termed super-enhancers have been identified and implicated in human disease. Defined by their magnitude of size, transcription factor density, and binding of transcriptional machinery, super-enhancers have been associated with genes driving cell differentiation. While their functions are not completely understood, it is clear that these regions driving high-level transcription are susceptible to perturbation, and trait-associated single nucleotide polymorphisms (SNPs) occur within super-enhancers of disease-relevant cell types. Here we review evidence for super-enhancer involvement in cancers, complex diseases, and developmental disorders and discuss interactions between super-enhancers and cofactors/chromatin regulators.