1
Perez MF. CelEsT: a unified gene regulatory network for estimating transcription factor activities in C. elegans. Genetics 2025; 229:iyae189. [PMID: 39705007 PMCID: PMC11912867 DOI: 10.1093/genetics/iyae189] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2024] [Accepted: 11/02/2024] [Indexed: 12/21/2024] [Open Access]
Abstract
Transcription factors (TFs) play a pivotal role in orchestrating critical intricate patterns of gene regulation. Although gene expression is complex, differential expression of hundreds of genes is often due to regulation by just a handful of TFs. Despite extensive efforts to elucidate TF-target regulatory relationships in Caenorhabditis elegans, existing experimental datasets cover distinct subsets of TFs and leave data integration challenging. Here, I introduce CelEsT, a unified gene regulatory network designed to estimate the activity of 487 distinct C. elegans TFs (∼58% of the total) from gene expression data. To integrate data from ChIP-seq, DNA-binding motifs, and eY1H screens, optimal processing of each data type was benchmarked against a set of TF perturbation RNA-seq experiments. Moreover, I showcase how leveraging TF motif conservation in target promoters across genomes of related species can distinguish highly informative interactions, a strategy which can be applied to many model organisms. Integrated analyses of data from commonly studied conditions including heat shock, bacterial infection, and sex differences validate CelEsT's performance and highlight overlooked TFs that likely play major roles in coordinating the transcriptional response to these conditions. CelEsT can infer TF activity on a standard laptop computer within minutes. Furthermore, an R Shiny app with a step-by-step guide is provided for the community to perform rapid analysis with minimal coding required. I anticipate that widespread adoption of CelEsT will significantly enhance the interpretive power of transcriptomic experiments, both present and retrospective, thereby advancing our understanding of gene regulation in C. elegans and beyond.
Affiliation(s)
- Marcos Francisco Perez
  - Instituto de Biología Molecular de Barcelona (IBMB), CSIC, Parc Científic de Barcelona, C. Baldiri Reixac, 4-8, 08028 Barcelona, Spain
2
Hudaiberdiev S, Ovcharenko I. Functional characteristics and computational model of abundant hyperactive loci in the human genome. eLife 2024; 13:RP95170. [PMID: 39535534 PMCID: PMC11560132 DOI: 10.7554/elife.95170] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2024] [Open Access]
Abstract
Enhancers and promoters are classically considered to be bound by a small set of transcription factors (TFs) in a sequence-specific manner. This assumption has come under increasing skepticism as the datasets of ChIP-seq assays of TFs have expanded. In particular, high-occupancy target (HOT) loci attract hundreds of TFs with often no detectable correlation between ChIP-seq peaks and DNA-binding motif presence. Here, we used a set of 1003 TF ChIP-seq datasets (HepG2, K562, H1) to analyze the patterns of ChIP-seq peak co-occurrence in combination with functional genomics datasets. We identified 43,891 HOT loci forming at the promoter (53%) and enhancer (47%) regions. HOT promoters regulate housekeeping genes, whereas HOT enhancers are involved in tissue-specific process regulation. HOT loci form the foundation of human super-enhancers and evolve under strong negative selection, with some of these loci being located in ultraconserved regions. Sequence-based classification analysis of HOT loci suggested that their formation is driven by the sequence features, and the density of mapped ChIP-seq peaks across TF-bound loci correlates with sequence features and the expression level of flanking genes. Based on the affinities to bind to promoters and enhancers, we detected five distinct clusters of TFs that form the core of the HOT loci. We report an abundance of HOT loci in the human genome and a commitment of 51% of all TF ChIP-seq binding events to HOT locus formation, thus challenging the classical model of enhancer activity, and propose a model of HOT locus formation based on the existence of large transcriptional condensates.
Affiliation(s)
- Sanjarbek Hudaiberdiev
  - National Institute for Biotechnology and Information, National Library of Medicine, National Institutes of Health, Bethesda, United States
- Ivan Ovcharenko
  - National Institute for Biotechnology and Information, National Library of Medicine, National Institutes of Health, Bethesda, United States
3
Hudaiberdiev S, Ovcharenko I. Functional characteristics and computational model of abundant hyperactive loci in the human genome. bioRxiv 2024:2023.02.05.527203. [PMID: 36945558 PMCID: PMC10028745 DOI: 10.1101/2023.02.05.527203] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 02/07/2023]
Abstract
Enhancers and promoters are classically considered to be bound by a small set of TFs in a sequence-specific manner. This assumption has come under increasing skepticism as the datasets of ChIP-seq assays of TFs have expanded. In particular, high-occupancy target (HOT) loci attract hundreds of TFs with often no detectable correlation between ChIP-seq peaks and DNA-binding motif presence. Here, we used a set of 1,003 TF ChIP-seq datasets (HepG2, K562, H1) to analyze the patterns of ChIP-seq peak co-occurrence in combination with functional genomics datasets. We identified 43,891 HOT loci forming at the promoter (53%) and enhancer (47%) regions. HOT promoters regulate housekeeping genes, whereas HOT enhancers are involved in tissue-specific process regulation. HOT loci form the foundation of human super-enhancers and evolve under strong negative selection, with some of these loci being located in ultraconserved regions. Sequence-based classification analysis of HOT loci suggested that their formation is driven by the sequence features, and the density of mapped ChIP-seq peaks across TF-bound loci correlates with sequence features and the expression level of flanking genes. Based on the affinities to bind to promoters and enhancers, we detected 5 distinct clusters of TFs that form the core of the HOT loci. We report an abundance of HOT loci in the human genome and a commitment of 51% of all TF ChIP-seq binding events to HOT locus formation, thus challenging the classical model of enhancer activity, and propose a model of HOT locus formation based on the existence of large transcriptional condensates.
Affiliation(s)
- Sanjarbek Hudaiberdiev
  - National Institute for Biotechnology and Information, National Library of Medicine, National Institutes of Health, Bethesda, MD
- Ivan Ovcharenko
  - National Institute for Biotechnology and Information, National Library of Medicine, National Institutes of Health, Bethesda, MD
4
Loupe JM, Anderson AG, Rizzardi LF, Rodriguez-Nunez I, Moyers B, Trausch-Lowther K, Jain R, Bunney WE, Bunney BG, Cartagena P, Sequeira A, Watson SJ, Akil H, Cooper GM, Myers RM. Multiomic profiling of transcription factor binding and function in human brain. Nat Neurosci 2024; 27:1387-1399. [PMID: 38831039 DOI: 10.1038/s41593-024-01658-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Accepted: 04/19/2024] [Indexed: 06/05/2024]
Abstract
Transcription factors (TFs) orchestrate gene expression programs crucial for brain function, but we lack detailed information about TF binding in human brain tissue. We generated a multiomic resource (ChIP-seq, ATAC-seq, RNA-seq, DNA methylation) on bulk tissues and sorted nuclei from several postmortem brain regions, including binding maps for more than 100 TFs. We demonstrate improved measurements of TF activity, including motif recognition and gene expression modeling, upon identification and removal of high TF occupancy regions. Further, predictive TF binding models demonstrate a bias for these high-occupancy sites. Neuronal TFs SATB2 and TBR1 bind unique regions depleted for such sites and promote neuronal gene expression. Binding sites for TFs, including TBR1 and PKNOX1, are enriched for risk variants associated with neuropsychiatric disorders, predominantly in neurons. This work, titled BrainTF, is a powerful resource for future studies seeking to understand the roles of specific TFs in regulating gene expression in the human brain.
Affiliation(s)
- Jacob M Loupe
  - HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
- Lindsay F Rizzardi
  - HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
  - Department of Biochemistry and Molecular Biology, The University of Alabama in Birmingham, Birmingham, AL, USA
- Belle Moyers
  - HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
- Rashmi Jain
  - HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
- William E Bunney
  - Department of Psychiatry and Human Behavior, University of California, Irvine, CA, USA
- Blynn G Bunney
  - Department of Psychiatry and Human Behavior, University of California, Irvine, CA, USA
- Preston Cartagena
  - Department of Psychiatry and Human Behavior, University of California, Irvine, CA, USA
- Adolfo Sequeira
  - Department of Psychiatry and Human Behavior, University of California, Irvine, CA, USA
- Stanley J Watson
  - The Michigan Neuroscience Institute, University of Michigan, Ann Arbor, MI, USA
- Huda Akil
  - The Michigan Neuroscience Institute, University of Michigan, Ann Arbor, MI, USA
- Richard M Myers
  - HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
5
Cascianelli S, Ceddia G, Marchesi A, Masseroli M. Identification of transcription factor high accumulation DNA zones. BMC Bioinformatics 2023; 24:395. [PMID: 37864168 PMCID: PMC10590011 DOI: 10.1186/s12859-023-05528-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2023] [Accepted: 10/10/2023] [Indexed: 10/22/2023] [Open Access]
Abstract
BACKGROUND: Transcription factors (TFs) play a crucial role in the regulation of gene transcription; alterations of their activity and binding to DNA areas are strongly involved in cancer and other disease onset and development. For proper biomedical investigation, it is hence essential to correctly trace TF dense DNA areas, having multiple bindings of distinct factors, and select DNA high occupancy target (HOT) zones, showing the highest accumulation of such bindings. Indeed, systematic and replicable analysis of HOT zones in a large variety of cells and tissues would allow further understanding of their characteristics and could clarify their functional role. RESULTS: Here, we propose, thoroughly explain and discuss a full computational procedure to study in-depth DNA dense areas of transcription factor accumulation and identify HOT zones. This methodology, developed as a computationally efficient parametric algorithm implemented in an R/Bioconductor package, uses a systematic approach with two alternative methods to examine transcription factor bindings and provide comparative and fully-reproducible assessments. It offers different resolutions by introducing three distinct types of accumulation, which can analyze DNA from single-base to region-oriented levels, and a moving window, which can estimate the influence of the neighborhood for each DNA base under examination. CONCLUSIONS: We quantitatively assessed the full procedure by using our implemented software package, named TFHAZ, in two example applications of biological interest, proving its full reliability and relevance.
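The idea of single-base accumulation with a moving window can be illustrated with a minimal Python sketch. This is not the TFHAZ implementation (which is an R/Bioconductor package); the function names, the half-open interval convention, and the simple count threshold for "hot" positions are all assumptions made for illustration.

```python
def base_accumulation(regions, length):
    """Count, for each base 0..length-1, how many binding regions cover it.

    regions: iterable of (tf_name, start, end) with half-open [start, end)
    coordinates (an assumed convention for this sketch).
    """
    diff = [0] * (length + 1)          # difference array: +1 at start, -1 at end
    for _tf, start, end in regions:
        start, end = max(start, 0), min(end, length)
        if start < end:
            diff[start] += 1
            diff[end] -= 1
    acc, running = [], 0
    for i in range(length):
        running += diff[i]
        acc.append(running)
    return acc


def windowed_accumulation(acc, half_width):
    """Sum base-level counts over the window [i - half_width, i + half_width],
    clipped at the sequence ends, using prefix sums."""
    prefix = [0]
    for value in acc:
        prefix.append(prefix[-1] + value)
    n = len(acc)
    return [
        prefix[min(n, i + half_width + 1)] - prefix[max(0, i - half_width)]
        for i in range(n)
    ]


def hot_positions(acc, threshold):
    """Return positions whose accumulation reaches the (hypothetical) threshold."""
    return [i for i, value in enumerate(acc) if value >= threshold]


# Toy example with three hypothetical TF binding regions on a 10-base sequence.
regions = [("TF_A", 2, 6), ("TF_B", 4, 8), ("TF_C", 5, 7)]
acc = base_accumulation(regions, 10)
print(acc)                                # [0, 0, 1, 1, 2, 3, 2, 1, 0, 0]
print(windowed_accumulation(acc, 1))      # [0, 1, 2, 4, 6, 7, 6, 3, 1, 0]
print(hot_positions(acc, 2))              # [4, 5, 6]
```

The window sum lets a base with modest direct coverage still score highly when its neighborhood is densely bound, which is the intuition behind estimating "the influence of the neighborhood" for each base.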
Affiliation(s)
- Silvia Cascianelli
  - Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Via Ponzio 34/5, 20133 Milan, Italy
- Gaia Ceddia
  - Barcelona Supercomputing Center (BSC), 08034 Barcelona, Spain
- Alberto Marchesi
  - Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Via Ponzio 34/5, 20133 Milan, Italy
- Marco Masseroli
  - Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Via Ponzio 34/5, 20133 Milan, Italy
6
Kent D, Marchetti L, Mikulasova A, Russell LJ, Rico D. Broad H3K4me3 domains: Maintaining cellular identity and their implication in super-enhancer hijacking. Bioessays 2023; 45:e2200239. [PMID: 37350339 DOI: 10.1002/bies.202200239] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 12/09/2022] [Revised: 05/25/2023] [Accepted: 05/30/2023] [Indexed: 06/24/2023]
Abstract
The human and mouse genomes are complex from a genomic standpoint. Each cell has the same genomic sequence, yet a wide array of cell types exists due to the presence of a plethora of regulatory elements in the non-coding genome. Recent advances in epigenomic profiling have uncovered non-coding gene proximal promoters and distal enhancers of transcription genome-wide. Extension of promoter-associated H3K4me3 histone mark across the gene body, known as a broad H3K4me3 domain (H3K4me3-BD), is a signature of constitutive expression of cell-type-specific regulation and of tumour suppressor genes in healthy cells. Recently, it has been discovered that the presence of H3K4me3-BDs over oncogenes is a cancer-specific feature associated with their dysregulated gene expression and tumourigenesis. Moreover, it has been shown that the hijacking of clusters of enhancers, known as super-enhancers (SEs), by proto-oncogenes results in the presence of H3K4me3-BDs over the gene body. Therefore, crosstalk between H3K4me3-BDs and SEs in healthy and cancer cells represents an important mechanism to identify future treatments for patients with SE-driven cancers.
Affiliation(s)
- Daniel Kent
  - Biosciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, UK
- Letizia Marchetti
  - Biosciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, UK
- Aneta Mikulasova
  - Biosciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, UK
- Lisa J Russell
  - Biosciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, UK
- Daniel Rico
  - Biosciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, UK
7
Erokhin M, Mogila V, Lomaev D, Chetverina D. Polycomb Recruiters Inside and Outside of the Repressed Domains. Int J Mol Sci 2023; 24:11394. [PMID: 37511153 PMCID: PMC10379775 DOI: 10.3390/ijms241411394] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2023] [Revised: 06/24/2023] [Accepted: 07/11/2023] [Indexed: 07/30/2023] [Open Access]
Abstract
The establishment and stable inheritance of individual patterns of gene expression in different cell types are required for the development of multicellular organisms. The important epigenetic regulators are the Polycomb group (PcG) and Trithorax group (TrxG) proteins, which control the silenced and active states of genes, respectively. In Drosophila, the PcG/TrxG group proteins are recruited to the DNA regulatory sequences termed the Polycomb response elements (PREs). The PREs are composed of the binding sites for different DNA-binding proteins, the so-called PcG recruiters. Currently, the role of the PcG recruiters in the targeting of the PcG proteins to PREs is well documented. However, there are examples where the PcG recruiters are also implicated in the active transcription and in the TrxG function. In addition, there is increasing evidence that the genome-wide PcG recruiters interact with the chromatin outside of the PREs and overlap with the proteins of differing regulatory classes. Recent studies of the interactomes of the PcG recruiters significantly expanded our understanding that they have numerous interactors besides the PcG proteins and that their functions extend beyond the regulation of the PRE repressive activity. Here, we summarize current data about the functions of the PcG recruiters.
Affiliation(s)
- Maksim Erokhin
  - Institute of Gene Biology, Russian Academy of Sciences, 34/5 Vavilov Street, Moscow 119334, Russia
- Vladic Mogila
  - Institute of Gene Biology, Russian Academy of Sciences, 34/5 Vavilov Street, Moscow 119334, Russia
- Dmitry Lomaev
  - Institute of Gene Biology, Russian Academy of Sciences, 34/5 Vavilov Street, Moscow 119334, Russia
- Darya Chetverina
  - Institute of Gene Biology, Russian Academy of Sciences, 34/5 Vavilov Street, Moscow 119334, Russia
8
Eve A. Transitions in development - an interview with Evgeny Kvon. Development 2023; 150:dev202032. [PMID: 37366161 DOI: 10.1242/dev.202032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/28/2023]
Abstract
Evgeny Kvon is an Assistant Professor at the University of California, Irvine (UCI) in the Department of Developmental and Cell Biology, USA. His lab studies non-coding regulatory DNA and its mechanistic role in the control of gene expression to understand more about development, disease and evolution. Last year, Evgeny received the National Institutes of Health Director's New Innovator Award. We spoke to Evgeny over Zoom to learn more about his career and the silver lining to starting a lab during the COVID-19 lockdowns.
9
Kaur A, Chauhan APS, Aggarwal AK. Prediction of Enhancers in DNA Sequence Data using a Hybrid CNN-DLSTM Model. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:1327-1336. [PMID: 35417351 DOI: 10.1109/tcbb.2022.3167090] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Indexed: 05/04/2023]
Abstract
An enhancer, a distal cis-regulatory element, controls gene expression. Experimental prediction of enhancer elements is time-consuming and expensive. Consequently, various inexpensive and fast deep learning-based methods have been developed for predicting enhancers and determining their strength. In this paper, we propose a two-stage deep learning-based framework leveraging DNA structural features, natural language processing, a convolutional neural network, and long short-term memory to predict enhancer elements accurately in genomics data. In the first stage, we extracted features from DNA sequence data using three feature representation techniques, viz., k-mer based feature extraction along with word2vector based interpretation of underlying patterns, one-hot encoding, and the DNAshape technique. In the second stage, the strength of enhancers is predicted from the extracted features using a hybrid deep learning model. The method is capable of adapting itself to varying sizes of datasets. Also, as the proposed model can capture long-range sequencing patterns, the robustness of the method remains unaffected by minor variations in the genomics sequence. The method outperforms other state-of-the-art methods at both stages in terms of prediction accuracy, specificity, Matthews correlation coefficient, and area under the ROC curve. In summary, the proposed method is a reliable method for enhancer prediction.
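Two of the feature representations named in this abstract, k-mer extraction and one-hot encoding, are simple enough to sketch in a few lines of Python. The sketch below is illustrative only and is not the authors' pipeline; the function names, the A/C/G/T column order, and the all-zero encoding for ambiguous bases such as N are assumptions.

```python
def kmers(seq, k):
    """Return all overlapping k-mers of a DNA sequence (stride 1)."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


# Assumed column order A, C, G, T; any other symbol maps to all zeros.
ONE_HOT = {
    "A": [1, 0, 0, 0],
    "C": [0, 1, 0, 0],
    "G": [0, 0, 1, 0],
    "T": [0, 0, 0, 1],
}


def one_hot(seq):
    """Encode a DNA sequence as a list of 4-element indicator vectors."""
    return [ONE_HOT.get(base, [0, 0, 0, 0]) for base in seq.upper()]


print(kmers("ACGTAC", 3))   # ['ACG', 'CGT', 'GTA', 'TAC']
print(one_hot("ACN"))       # [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0]]
```

In a setup like the one described, the k-mer list would typically be treated as a "sentence" of tokens for a word-embedding model, while the one-hot matrix would feed a convolutional layer directly.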
10
Staller MV. Transcription factors perform a 2-step search of the nucleus. Genetics 2022; 222:iyac111. [PMID: 35939561 PMCID: PMC9526044 DOI: 10.1093/genetics/iyac111] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 05/10/2022] [Accepted: 07/14/2022] [Indexed: 01/02/2023] [Open Access]
Abstract
Transcription factors regulate gene expression by binding to regulatory DNA and recruiting regulatory protein complexes. The DNA-binding and protein-binding functions of transcription factors are traditionally described as independent functions performed by modular protein domains. Here, I argue that genome binding can be a 2-part process with both DNA-binding and protein-binding steps, enabling transcription factors to perform a 2-step search of the nucleus to find their appropriate binding sites in a eukaryotic genome. I support this hypothesis with new and old results in the literature, discuss how this hypothesis parsimoniously resolves outstanding problems, and present testable predictions.
Affiliation(s)
- Max Valentín Staller
  - Center for Computational Biology, University of California, Berkeley, Berkeley, CA 94720, USA
11
Jansen C, Paraiso KD, Zhou JJ, Blitz IL, Fish MB, Charney RM, Cho JS, Yasuoka Y, Sudou N, Bright AR, Wlizla M, Veenstra GJC, Taira M, Zorn AM, Mortazavi A, Cho KWY. Uncovering the mesendoderm gene regulatory network through multi-omic data integration. Cell Rep 2022; 38:110364. [PMID: 35172134 PMCID: PMC8917868 DOI: 10.1016/j.celrep.2022.110364] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 12/18/2020] [Revised: 10/30/2021] [Accepted: 01/19/2022] [Indexed: 01/01/2023] [Open Access]
Abstract
Mesendodermal specification is one of the earliest events in embryogenesis, where cells first acquire distinct identities. Cell differentiation is a highly regulated process that involves the function of numerous transcription factors (TFs) and signaling molecules, which can be described with gene regulatory networks (GRNs). Cell differentiation GRNs are difficult to build because existing mechanistic methods are low throughput, and high-throughput methods tend to be non-mechanistic. Additionally, integrating highly dimensional data composed of more than two data types is challenging. Here, we use linked self-organizing maps to combine chromatin immunoprecipitation sequencing (ChIP-seq)/ATAC-seq with temporal, spatial, and perturbation RNA sequencing (RNA-seq) data from Xenopus tropicalis mesendoderm development to build a high-resolution genome scale mechanistic GRN. We recover both known and previously unsuspected TF-DNA/TF-TF interactions validated through reporter assays. Our analysis provides insights into transcriptional regulation of early cell fate decisions and provides a general approach to building GRNs using highly dimensional multi-omic datasets.
Affiliation(s)
- Camden Jansen
  - Department of Developmental and Cell Biology, University of California, Irvine, CA, USA; Center for Complex Biological Systems, University of California, Irvine, CA, USA
- Kitt D Paraiso
  - Department of Developmental and Cell Biology, University of California, Irvine, CA, USA; Center for Complex Biological Systems, University of California, Irvine, CA, USA
- Jeff J Zhou
  - Department of Developmental and Cell Biology, University of California, Irvine, CA, USA
- Ira L Blitz
  - Department of Developmental and Cell Biology, University of California, Irvine, CA, USA
- Margaret B Fish
  - Department of Developmental and Cell Biology, University of California, Irvine, CA, USA
- Rebekah M Charney
  - Department of Developmental and Cell Biology, University of California, Irvine, CA, USA
- Jin Sun Cho
  - Department of Developmental and Cell Biology, University of California, Irvine, CA, USA
- Yuuri Yasuoka
  - Laboratory for Comprehensive Genomic Analysis, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Norihiro Sudou
  - Department of Anatomy, School of Medicine, Toho University, Tokyo, Japan
- Ann Rose Bright
  - Department of Molecular Developmental Biology, Radboud University, Nijmegen, the Netherlands
- Marcin Wlizla
  - Division of Developmental Biology, Department of Pediatrics, Cincinnati Children's Hospital Medical Center, University of Cincinnati College of Medicine, Cincinnati, OH, USA
- Gert Jan C Veenstra
  - Department of Molecular Developmental Biology, Radboud University, Nijmegen, the Netherlands
- Masanori Taira
  - Department of Biological Sciences, Chuo University, Tokyo, Japan
- Aaron M Zorn
  - Division of Developmental Biology, Department of Pediatrics, Cincinnati Children's Hospital Medical Center, University of Cincinnati College of Medicine, Cincinnati, OH, USA
- Ali Mortazavi
  - Department of Developmental and Cell Biology, University of California, Irvine, CA, USA; Center for Complex Biological Systems, University of California, Irvine, CA, USA
- Ken W Y Cho
  - Department of Developmental and Cell Biology, University of California, Irvine, CA, USA; Center for Complex Biological Systems, University of California, Irvine, CA, USA
12
Romanov SE, Kalashnikova DA, Laktionov PP. Methods of massive parallel reporter assays for investigation of enhancers. Vavilovskii Zhurnal Genet Selektsii 2021; 25:344-355. [PMID: 34901731 PMCID: PMC8627875 DOI: 10.18699/vj21.038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2020] [Revised: 03/28/2021] [Accepted: 03/28/2021] [Indexed: 11/19/2022] [Open Access]
Abstract
The correct deployment of genetic programs for development and differentiation relies on finely coordinated regulation of specific gene sets. Genomic regulatory elements play an exceptional role in this process. There are several types of gene regulatory elements, including promoters, enhancers, insulators and silencers. Alterations of gene regulatory elements may cause various pathologies, including cancer, congenital disorders and autoimmune diseases. The development of high-throughput genomic assays has made it possible to significantly accelerate the accumulation of information about the characteristic epigenetic properties of regulatory elements. In combination with high-throughput studies focused on the genome-wide distribution of epigenetic marks, regulatory proteins and the spatial structure of chromatin, this significantly expands the understanding of the principles of epigenetic regulation of genes and allows potential regulatory elements to be searched for in silico. However, common experimental approaches used to study the local characteristics of chromatin have a number of technical limitations that may reduce the reliability of computational identification of genomic regulatory sequences. Taking into account the variability of the functions of epigenetic determinants and the complex multicomponent regulation of genomic element activity, their functional verification is often required. A plethora of methods have been developed to study the functional role of regulatory elements on the genome scale. Common experimental approaches for in silico identification of regulatory elements and their inherent technical limitations will be described. The present review is focused on original high-throughput methods of enhancer activity reporter analysis that are currently used to validate predicted regulatory elements and to perform de novo searches.
The methods described make it possible to assess the functional role of the nucleotide sequence of a regulatory element, to determine its exact boundaries and to assess the influence of the local state of chromatin on the activity of enhancers and gene expression. These approaches have contributed substantially to the understanding of the fundamental principles of gene regulation.
Affiliation(s)
- S E Romanov
  - Novosibirsk State University, Epigenetics Laboratory, Department of Natural Sciences, Novosibirsk, Russia
  - Institute of Molecular and Cellular Biology of the Siberian Branch of the Russian Academy of Sciences, Genomics Laboratory, Novosibirsk, Russia
- D A Kalashnikova
  - Novosibirsk State University, Epigenetics Laboratory, Department of Natural Sciences, Novosibirsk, Russia
  - Institute of Molecular and Cellular Biology of the Siberian Branch of the Russian Academy of Sciences, Genomics Laboratory, Novosibirsk, Russia
- P P Laktionov
  - Novosibirsk State University, Epigenetics Laboratory, Department of Natural Sciences, Novosibirsk, Russia
  - Institute of Molecular and Cellular Biology of the Siberian Branch of the Russian Academy of Sciences, Genomics Laboratory, Novosibirsk, Russia
13
Ohba S. Genome-scale actions of master regulators directing skeletal development. Japanese Dental Science Review 2021; 57:217-223. [PMID: 34745394 PMCID: PMC8556520 DOI: 10.1016/j.jdsr.2021.10.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2021] [Revised: 09/14/2021] [Accepted: 10/10/2021] [Indexed: 11/03/2022] [Open Access]
Abstract
The mammalian skeleton develops through two distinct modes of ossification: intramembranous ossification and endochondral ossification. During the process of skeletal development, SRY-box containing gene 9 (Sox9), runt-related transcription factor 2 (Runx2), and Sp7 work as master transcription factors (TFs) or transcriptional regulators, underlying cell fate specification of the two distinct populations: bone-forming osteoblasts and cartilage-forming chondrocytes. In the past two decades, core transcriptional circuits underlying skeletal development have been identified mainly through mouse genetics and biochemical approaches. Recently emerging next-generation sequencer (NGS)-based studies have provided genome-scale views on the gene regulatory landscape programmed by the master TFs/transcriptional regulators. With particular focus on Sox9, Runx2, and Sp7, this review aims to discuss the gene regulatory landscape in skeletal development, which has been identified by genome-scale data, and provide future perspectives in this field.
Affiliation(s)
- Shinsuke Ohba
  - Department of Cell Biology, Institute of Biomedical Sciences, Nagasaki University, 1-7-1 Sakamoto, Nagasaki 852-8588, Japan
14
Grosveld F, van Staalduinen J, Stadhouders R. Transcriptional Regulation by (Super)Enhancers: From Discovery to Mechanisms. Annu Rev Genomics Hum Genet 2021; 22:127-146. [PMID: 33951408 DOI: 10.1146/annurev-genom-122220-093818] [Citation(s) in RCA: 65] [Impact Index Per Article: 16.3] [Indexed: 11/09/2022]
Abstract
Accurate control of gene expression in the right cell at the right moment is of fundamental importance to animal development and homeostasis. At the heart of gene regulation lie the enhancers, a class of gene regulatory elements that ensures precise spatiotemporal activation of gene transcription. Mammalian genomes are littered with enhancers, which are frequently organized in cooperative clusters such as locus control regions and superenhancers. Here, we discuss our current knowledge of enhancer biology, including an overview of the discovery of the various enhancer subsets and the mechanistic models used to explain their gene regulatory function.
Affiliation(s)
- Frank Grosveld
- Department of Cell Biology, Erasmus MC, 3000 CA Rotterdam, The Netherlands
- Ralph Stadhouders
- Department of Cell Biology, Erasmus MC, 3000 CA Rotterdam, The Netherlands; Department of Pulmonary Medicine, Erasmus MC, 3000 CA Rotterdam, The Netherlands
|
15
|
Chetverina D, Erokhin M, Schedl P. GAGA factor: a multifunctional pioneering chromatin protein. Cell Mol Life Sci 2021; 78:4125-4141. [PMID: 33528710 DOI: 10.1007/s00018-021-03776-z] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 10/04/2020] [Revised: 12/08/2020] [Accepted: 01/19/2021] [Indexed: 12/27/2022]
Abstract
The Drosophila GAGA factor (GAF) is a multifunctional protein implicated in nucleosome organization and remodeling, activation and repression of gene expression, long distance enhancer-promoter communication, higher order chromosome structure, and mitosis. This broad range of activities poses questions about how a single protein can perform so many seemingly different and unrelated functions. Current studies argue that GAF acts as a "pioneer" factor, generating nucleosome-free regions of chromatin for different classes of regulatory elements. The removal of nucleosomes from regulatory elements in turn enables other factors to bind to these elements and carry out their specialized functions. Consistent with this view, GAF associates with a collection of chromatin remodelers and also interacts with proteins implicated in different regulatory functions. In this review, we summarize the known activities of GAF and the functions of its protein partners.
Affiliation(s)
- Darya Chetverina
- Group of Epigenetics, Institute of Gene Biology, Russian Academy of Sciences, 34/5 Vavilov St., Moscow, 119334, Russia.
- Maksim Erokhin
- Group of Chromatin Biology, Institute of Gene Biology, Russian Academy of Sciences, 34/5 Vavilov St., Moscow, 119334, Russia
- Paul Schedl
- Department of Molecular Biology, Princeton University, Princeton, NJ, 08544, USA.
|
16
|
Serebreni L, Stark A. Insights into gene regulation: From regulatory genomic elements to DNA-protein and protein-protein interactions. Curr Opin Cell Biol 2020; 70:58-66. [PMID: 33385708 DOI: 10.1016/j.ceb.2020.11.009] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 10/01/2020] [Revised: 11/19/2020] [Accepted: 11/29/2020] [Indexed: 01/19/2023]
Abstract
Transcription is orchestrated by non-coding regulatory elements embedded in chromatin, which exist within the larger context of chromosome topology. Here, we review recent insights into the functions of non-coding regulatory elements and their protein interactors during transcription control. A picture emerges in which the topological environment constrains enhancer-promoter interactions, and specific enhancer-bound proteins with distinct promoter compatibilities refine target promoter choice. Such compatibilities are encoded within the sequences of enhancers and promoters and realized by diverse transcription factors and cofactors with distinct biochemical activities. An emerging property of transcription factors and cofactors is the formation of nuclear microenvironments or membraneless compartments that can have properties of phase-separated liquids. These environments are able to selectively enrich certain proteins and small molecules over others. Further investigation into the interaction of transcriptional regulators with themselves and regulatory DNA elements will help reveal the complexities of gene regulation within the context of the nucleus.
Affiliation(s)
- Leonid Serebreni
- Research Institute of Molecular Pathology (IMP), Vienna BioCenter (VBC), Vienna, Austria
- Alexander Stark
- Research Institute of Molecular Pathology (IMP), Vienna BioCenter (VBC), Vienna, Austria; Medical University of Vienna, Vienna BioCenter (VBC), Vienna, Austria.
|
17
|
Abstract
Determining whether and how a gene is transcribed are two of the central processes of life. The conceptual basis for understanding such gene regulation arose from pioneering biophysical studies in eubacteria. However, eukaryotic genomes exhibit vastly greater complexity, which raises questions not addressed by this bacterial paradigm. First, how is information integrated from many widely separated binding sites to determine how a gene is transcribed? Second, does the presence of multiple energy-expending mechanisms, which are absent from eubacterial genomes, indicate that eukaryotes are capable of improved forms of genetic information processing? An updated biophysical foundation is needed to answer such questions. We describe the linear framework, a graph-based approach to Markov processes, and show that it can accommodate many previous studies in the field. Under the assumption of thermodynamic equilibrium, we introduce a language of higher-order cooperativities and show how it can rigorously quantify gene regulatory properties suggested by experiment. We point out that fundamental limits to information processing arise at thermodynamic equilibrium and can only be bypassed through energy expenditure. Finally, we outline some of the mathematical challenges that must be overcome to construct an improved biophysical understanding of gene regulation.
Affiliation(s)
- Felix Wong
- Institute for Medical Engineering & Science, Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA; Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
- Jeremy Gunawardena
- Department of Systems Biology, Harvard Medical School, Boston, Massachusetts 02115, USA
|
18
|
Klein JC, Agarwal V, Inoue F, Keith A, Martin B, Kircher M, Ahituv N, Shendure J. A systematic evaluation of the design and context dependencies of massively parallel reporter assays. Nat Methods 2020; 17:1083-1091. [PMID: 33046894 PMCID: PMC7727316 DOI: 10.1038/s41592-020-0965-y] [Citation(s) in RCA: 116] [Impact Index Per Article: 23.2] [Received: 07/08/2020] [Accepted: 08/27/2020] [Indexed: 01/02/2023]
Abstract
Massively parallel reporter assays (MPRAs) functionally screen thousands of sequences for regulatory activity in parallel. To date, few studies have systematically compared differences in MPRA design. Here, we screen a library of 2,440 candidate liver enhancers and controls for regulatory activity in HepG2 cells using nine different MPRA designs. We identify subtle but significant differences that correlate with epigenetic and sequence-level features, as well as differences in dynamic range and reproducibility. We also validate that enhancer activity is largely independent of orientation, at least for our library and designs. Finally, we assemble and test the same enhancers as 192-mers, 354-mers and 678-mers and observe sizable differences. This work provides a framework for the experimental design of high-throughput reporter assays, suggesting that the extended sequence context of tested elements, and to a lesser degree the precise assay, influence MPRA results.
Affiliation(s)
- Jason C Klein
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Vikram Agarwal
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Calico Life Sciences LLC, South San Francisco, CA, USA
- Fumitaka Inoue
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
- Institute for the Advanced Study of Human Biology (WPI-ASHBi), Kyoto University, Kyoto, Japan
- Aidan Keith
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Beth Martin
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Martin Kircher
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Berlin Institute of Health (BIH), Berlin, Germany
- Charité - Universitätsmedizin Berlin, Berlin, Germany
- Nadav Ahituv
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA.
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA.
- Jay Shendure
- Department of Genome Sciences, University of Washington, Seattle, WA, USA.
- Howard Hughes Medical Institute, Seattle, WA, USA.
- Brotman Baty Institute for Precision Medicine, University of Washington, Seattle, WA, USA.
|
19
|
Overton IM, Sims AH, Owen JA, Heale BSE, Ford MJ, Lubbock ALR, Pairo-Castineira E, Essafi A. Functional Transcription Factor Target Networks Illuminate Control of Epithelial Remodelling. Cancers (Basel) 2020; 12:cancers12102823. [PMID: 33007944 PMCID: PMC7652213 DOI: 10.3390/cancers12102823] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 08/09/2020] [Revised: 09/16/2020] [Accepted: 09/24/2020] [Indexed: 12/15/2022] Open
Abstract
Cell identity is governed by gene expression, regulated by transcription factor (TF) binding at cis-regulatory modules. Decoding the relationship between TF binding patterns and gene regulation is nontrivial, remaining a fundamental limitation in understanding cell decision-making. We developed the NetNC software to predict functionally active regulation of TF targets, and demonstrated it on nine datasets for the TFs Snail, Twist, and modENCODE Highly Occupied Target (HOT) regions. Snail and Twist are canonical drivers of epithelial to mesenchymal transition (EMT), a cell programme important in development, tumour progression and fibrosis. Predicted "neutral" (non-functional) TF binding always accounted for the majority (50% to 95%) of candidate target genes from statistically significant peaks, and HOT regions had higher functional binding than most of the Snail and Twist datasets examined. Our results illuminated conserved gene networks that control epithelial plasticity in development and disease. We identified new gene functions and network modules including crosstalk with notch signalling and regulation of chromatin organisation, evidencing networks that reshape Waddington's epigenetic landscape during epithelial remodelling. Expression of orthologous functional TF targets discriminated breast cancer molecular subtypes and predicted novel tumour biology, with implications for precision medicine. Predicted invasion roles were validated using a tractable cell model, supporting our approach.
Affiliation(s)
- Ian M. Overton
- MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh EH4 2XU, UK
- Department of Systems Biology, Harvard University, Boston, MA 02115, USA
- Centre for Synthetic and Systems Biology (SynthSys), University of Edinburgh, Edinburgh EH9 3BF, UK
- Patrick G Johnston Centre for Cancer Research, Queen’s University Belfast, Belfast BT9 7AE, UK
- Andrew H. Sims
- MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh EH4 2XU, UK
- Jeremy A. Owen
- Department of Systems Biology, Harvard University, Boston, MA 02115, USA
- Department of Physics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Bret S. E. Heale
- MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh EH4 2XU, UK
- Matthew J. Ford
- MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh EH4 2XU, UK
- Alexander L. R. Lubbock
- MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh EH4 2XU, UK
- Erola Pairo-Castineira
- MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh EH4 2XU, UK
- Abdelkader Essafi
- MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh EH4 2XU, UK
|
20
|
Hojo H, Ohba S. Gene regulatory landscape in osteoblast differentiation. Bone 2020; 137:115458. [PMID: 32474244 DOI: 10.1016/j.bone.2020.115458] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 04/30/2020] [Revised: 05/25/2020] [Accepted: 05/25/2020] [Indexed: 12/29/2022]
Abstract
The development of osteoblasts, a bone-forming cell population, occurs in conjunction with development of the skeleton, which creates our physical framework and shapes the body. In the past two decades, genetic studies have uncovered the molecular framework of this process: transcriptional regulators and signaling pathways coordinate the cell fate determination and differentiation of osteoblasts in a spatial and temporal manner. Recently emerging genome-wide studies provide additional layers of understanding of the gene regulatory landscape during osteoblast differentiation, allowing us to gain novel insight into the modes of action of the key regulators, functional interaction among the regulator-bound enhancers, epigenetic regulation, and the complex nature of regulatory inputs. In this review, we summarize current understanding of transcriptional regulation in osteoblasts in terms of the gene regulatory landscape.
Affiliation(s)
- Hironori Hojo
- Department of Clinical Biotechnology, Center for Disease Biology and Integrative Medicine, The University of Tokyo Graduate School of Medicine, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8655, Japan
- Shinsuke Ohba
- Department of Cell Biology, Institute of Biomedical Sciences, Nagasaki University, 1-7-1 Sakamoto, Nagasaki 852-8588, Japan.
|
21
|
Chen HM, Marques JG, Sugino K, Wei D, Miyares RL, Lee T. CAMIO: a transgenic CRISPR pipeline to create diverse targeted genome deletions in Drosophila. Nucleic Acids Res 2020; 48:4344-4356. [PMID: 32187363 PMCID: PMC7192631 DOI: 10.1093/nar/gkaa177] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 07/08/2019] [Revised: 02/06/2020] [Accepted: 03/10/2020] [Indexed: 02/07/2023] Open
Abstract
The genome is the blueprint for an organism. Interrogating the genome, especially locating critical cis-regulatory elements, requires deletion analysis. This is conventionally performed using synthetic constructs, making it cumbersome and non-physiological. Thus, we created Cas9-mediated Arrayed Mutagenesis of Individual Offspring (CAMIO) to achieve comprehensive analysis of a targeted region of native DNA. CAMIO utilizes CRISPR that is spatially restricted to generate independent deletions in the intact Drosophila genome. Controlled by recombination, a single guide RNA is stochastically chosen from a set targeting a specific DNA region. Combining two sets increases variability, leading to either indels at 1–2 target sites or inter-target deletions. Cas9 restriction to male germ cells elicits autonomous double-strand-break repair, consequently creating offspring with diverse mutations. Thus, from a single population cross, we can obtain a deletion matrix covering a large expanse of DNA at both coarse and fine resolution. We demonstrate the ease and power of CAMIO by mapping 5′UTR sequences crucial for chinmo's post-transcriptional regulation.
Affiliation(s)
- Hui-Min Chen
- Howard Hughes Medical Institute, Janelia Research Campus, 19700 Helix Drive, Ashburn, VA 20147, USA
- Jorge Garcia Marques
- Howard Hughes Medical Institute, Janelia Research Campus, 19700 Helix Drive, Ashburn, VA 20147, USA
- Ken Sugino
- Howard Hughes Medical Institute, Janelia Research Campus, 19700 Helix Drive, Ashburn, VA 20147, USA
- Dingjun Wei
- Howard Hughes Medical Institute, Janelia Research Campus, 19700 Helix Drive, Ashburn, VA 20147, USA
- Rosa Linda Miyares
- Howard Hughes Medical Institute, Janelia Research Campus, 19700 Helix Drive, Ashburn, VA 20147, USA
- Tzumin Lee
- Howard Hughes Medical Institute, Janelia Research Campus, 19700 Helix Drive, Ashburn, VA 20147, USA
|
22
|
Ryan GE, Farley EK. Functional genomic approaches to elucidate the role of enhancers during development. Wiley Interdiscip Rev Syst Biol Med 2020; 12:e1467. [PMID: 31808313 PMCID: PMC7027484 DOI: 10.1002/wsbm.1467] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 06/07/2019] [Revised: 10/02/2019] [Accepted: 10/11/2019] [Indexed: 12/22/2022]
Abstract
Successful development depends on the precise tissue-specific regulation of genes by enhancers, genetic elements that act as switches to control when and where genes are expressed. Because enhancers are critical for development, and the majority of disease-associated mutations reside within enhancers, it is essential to understand which sequences within enhancers are important for function. Advances in sequencing technology have enabled the rapid generation of genomic data that predict putative active enhancers, but functionally validating these sequences at scale remains a fundamental challenge. Herein, we discuss the power of genome-wide strategies used to identify candidate enhancers, and also highlight limitations and misconceptions that have arisen from these data. We discuss the use of massively parallel reporter assays to test enhancers for function at scale. We also review recent advances in our ability to study gene regulation during development, including CRISPR-based tools to manipulate genomes and single-cell transcriptomics to finely map gene expression. Finally, we look ahead to a synthesis of complementary genomic approaches that will advance our understanding of enhancer function during development. This article is categorized under: Physiology > Mammalian Physiology in Health and Disease; Developmental Biology > Developmental Processes in Health and Disease; Laboratory Methods and Technologies > Genetic/Genomic Methods.
Affiliation(s)
- Genevieve E. Ryan
- Department of Medicine, University of California, San Diego, California
- Division of Biological Sciences, Department of Medicine, University of California, San Diego, California
- Emma K. Farley
- Department of Medicine, University of California, San Diego, California
- Division of Biological Sciences, Department of Medicine, University of California, San Diego, California
|
23
|
Zhu H, Uusküla-Reimand L, Isaev K, Wadi L, Alizada A, Shuai S, Huang V, Aduluso-Nwaobasi D, Paczkowska M, Abd-Rabbo D, Ocsenas O, Liang M, Thompson JD, Li Y, Ruan L, Krassowski M, Dzneladze I, Simpson JT, Lupien M, Stein LD, Boutros PC, Wilson MD, Reimand J. Candidate Cancer Driver Mutations in Distal Regulatory Elements and Long-Range Chromatin Interaction Networks. Mol Cell 2020; 77:1307-1321.e10. [PMID: 31954095 DOI: 10.1016/j.molcel.2019.12.027] [Citation(s) in RCA: 48] [Impact Index Per Article: 9.6] [Received: 01/02/2019] [Revised: 06/04/2019] [Accepted: 12/24/2019] [Indexed: 12/17/2022]
Abstract
A comprehensive catalog of cancer driver mutations is essential for understanding tumorigenesis and developing therapies. Exome-sequencing studies have mapped many protein-coding drivers, yet few non-coding drivers are known because genome-wide discovery is challenging. We developed a driver discovery method, ActiveDriverWGS, and analyzed 120,788 cis-regulatory modules (CRMs) across 1,844 whole tumor genomes from the ICGC-TCGA PCAWG project. We found 30 CRMs with enriched SNVs and indels (FDR < 0.05). These frequently mutated regulatory elements (FMREs) were ubiquitously active in human tissues, showed long-range chromatin interactions and mRNA abundance associations with target genes, and were enriched in motif-rewiring mutations and structural variants. Genomic deletion of one FMRE in human cells caused proliferative deficiencies and transcriptional deregulation of cancer genes CCNB1IP1, CDH1, and CDKN2B, validating observations in FMRE-mutated tumors. Pathway analysis revealed further sub-significant FMREs at cancer genes and processes, indicating an unexplored landscape of infrequent driver mutations in the non-coding genome.
Affiliation(s)
- Helen Zhu
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada; Department of Medical Biophysics, University of Toronto, 101 College Street Suite 15-701, Toronto, ON M5G 1L7, Canada
- Liis Uusküla-Reimand
- Program in Genetics and Genome Biology, SickKids Research Institute, Peter Gilgan Centre for Research and Learning (PGCRL), 686 Bay Street, Toronto, ON M5G 0A4, Canada; Division of Gene Technology, Department of Chemistry and Biotechnology, Tallinn University of Technology, Akadeemia tee 15, Tallinn 12618, Estonia
- Keren Isaev
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada; Department of Medical Biophysics, University of Toronto, 101 College Street Suite 15-701, Toronto, ON M5G 1L7, Canada
- Lina Wadi
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada
- Azad Alizada
- Program in Genetics and Genome Biology, SickKids Research Institute, Peter Gilgan Centre for Research and Learning (PGCRL), 686 Bay Street, Toronto, ON M5G 0A4, Canada
- Shimin Shuai
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada; Department of Molecular Genetics, University of Toronto, 1 King's College Circle, Toronto, ON M5S 1A8, Canada
- Vincent Huang
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada
- Dike Aduluso-Nwaobasi
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada
- Marta Paczkowska
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada
- Diala Abd-Rabbo
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada
- Oliver Ocsenas
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada; Department of Medical Biophysics, University of Toronto, 101 College Street Suite 15-701, Toronto, ON M5G 1L7, Canada
- Minggao Liang
- Program in Genetics and Genome Biology, SickKids Research Institute, Peter Gilgan Centre for Research and Learning (PGCRL), 686 Bay Street, Toronto, ON M5G 0A4, Canada; Department of Molecular Genetics, University of Toronto, 1 King's College Circle, Toronto, ON M5S 1A8, Canada
- J Drew Thompson
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada
- Yao Li
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada
- Luyao Ruan
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada
- Michal Krassowski
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada
- Irakli Dzneladze
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada
- Jared T Simpson
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada; Department of Computer Science, University of Toronto, 214 College Street, Toronto, ON M5T 3A1, Canada
- Mathieu Lupien
- Department of Medical Biophysics, University of Toronto, 101 College Street Suite 15-701, Toronto, ON M5G 1L7, Canada; Princess Margaret Cancer Centre, 101 College Street, Toronto, ON M5G 0A3, Canada
- Lincoln D Stein
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada; Department of Molecular Genetics, University of Toronto, 1 King's College Circle, Toronto, ON M5S 1A8, Canada
- Paul C Boutros
- Department of Medical Biophysics, University of Toronto, 101 College Street Suite 15-701, Toronto, ON M5G 1L7, Canada; Department of Human Genetics, University of California Los Angeles, 10833 Le Conte Avenue, Los Angeles, CA 90095, USA; Department of Urology, University of California Los Angeles, 200 Medical Plaza Driveway #140, Los Angeles, CA 90024, USA; Institute of Precision Health, University of California Los Angeles, 10833 Le Conte Avenue, Los Angeles, CA 90024, USA; Jonsson Comprehensive Cancer Centre, University of California Los Angeles, 10833 Le Conte Avenue, Los Angeles, CA 90024, USA
- Michael D Wilson
- Program in Genetics and Genome Biology, SickKids Research Institute, Peter Gilgan Centre for Research and Learning (PGCRL), 686 Bay Street, Toronto, ON M5G 0A4, Canada; Department of Molecular Genetics, University of Toronto, 1 King's College Circle, Toronto, ON M5S 1A8, Canada
- Jüri Reimand
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada; Department of Medical Biophysics, University of Toronto, 101 College Street Suite 15-701, Toronto, ON M5G 1L7, Canada.
|
24
|
Niu X, Yang K, Zhang G, Yang Z, Hu X. A Pretraining-Retraining Strategy of Deep Learning Improves Cell-Specific Enhancer Predictions. Front Genet 2020; 10:1305. [PMID: 31969903 PMCID: PMC6960260 DOI: 10.3389/fgene.2019.01305] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 06/06/2019] [Accepted: 11/26/2019] [Indexed: 01/22/2023] Open
Abstract
Deciphering the code of cis-regulatory elements (CREs) is one of the core issues of today’s biology. Enhancers are distal CREs and play significant roles in gene transcriptional regulation. Although identification of enhancer locations across the whole genome [discriminative enhancer predictions (DEP)] is necessary, it is more important to predict in which specific cell or tissue types they will be activated and functional [tissue-specific enhancer predictions (TSEP)]. Although existing deep learning models have achieved great success in DEP, they cannot be directly employed in TSEP because a specific cell or tissue type has only a limited number of available enhancer samples for training. Here, we first adopted a reported deep learning architecture and then developed a novel training strategy named “pretraining-retraining strategy” (PRS) for TSEP by decomposing the whole training process into two successive stages: a pretraining stage designed to train with the whole enhancer data for performing DEP, and a retraining stage designed to train with tissue-specific enhancer samples, starting from the pretrained model, for making TSEP. As a result, PRS is found to be valid for DEP with an AUC of 0.922 and a GM (geometric mean) of 0.696 when tested on a larger-scale FANTOM5 enhancer dataset via five-fold cross-validation. Interestingly, based on the pretrained model, only twenty additional epochs are needed to complete the retraining process when testing on 23 specific tissues or cell lines. For TSEP tasks, PRS achieved a mean GM of 0.806, which is significantly higher than the 0.528 of gkm-SVM, an existing mainstream method for CRE predictions. Notably, PRS is further shown to be superior to two other state-of-the-art methods: DEEP and BiRen. In summary, PRS employs useful ideas from the domain of transfer learning and is a reliable method for TSEP.
Affiliation(s)
- Xiaohui Niu
- College of Informatics, Hubei Key Laboratory of Agricultural Bioinformatics, Huazhong Agricultural University, Wuhan, China
- Kun Yang
- College of Informatics, Hubei Key Laboratory of Agricultural Bioinformatics, Huazhong Agricultural University, Wuhan, China
- Ge Zhang
- College of Informatics, Hubei Key Laboratory of Agricultural Bioinformatics, Huazhong Agricultural University, Wuhan, China
- Zhiquan Yang
- College of Informatics, Hubei Key Laboratory of Agricultural Bioinformatics, Huazhong Agricultural University, Wuhan, China
- Xuehai Hu
- College of Informatics, Hubei Key Laboratory of Agricultural Bioinformatics, Huazhong Agricultural University, Wuhan, China
|
25
|
Bell K, Skier K, Chen KH, Gergen JP. Two pair-rule responsive enhancers regulate wingless transcription in the Drosophila blastoderm embryo. Dev Dyn 2019; 249:556-572. [PMID: 31837063 DOI: 10.1002/dvdy.142] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 07/04/2019] [Revised: 11/25/2019] [Accepted: 11/26/2019] [Indexed: 11/07/2022] Open
Abstract
BACKGROUND: While many developmentally relevant enhancers act in a modular fashion, there is growing evidence for nonadditive interactions between distinct cis-regulatory enhancers. We investigated whether nonautonomous enhancer interactions underlie transcription regulation of the Drosophila segment polarity gene, wingless. RESULTS: We identified two wg enhancers active at the blastoderm stage: wg 3613u, located from -3.6 to -1.3 kb upstream of the wg transcription start site (TSS) and 3046d, located in intron two of the wg gene, from 3.0 to 4.6 kb downstream of the TSS. Genetic experiments confirm that Even Skipped (Eve), Fushi-tarazu (Ftz), Runt, Odd-paired (Opa), Odd-skipped (Odd), and Paired (Prd) contribute to spatially regulated wg expression. Interestingly, there are enhancer-specific differences in response to the gain or loss of function of pair-rule gene activity. Although each element recapitulates aspects of wg expression, a composite reporter containing both enhancers more faithfully recapitulates wg regulation than would be predicted from the sum of their individual responses. CONCLUSION: These results suggest that the regulation of wg by pair-rule genes involves nonadditive interactions between distinct cis-regulatory enhancers.
Affiliation(s)
- Kimberly Bell
- Department of Biochemistry and Cell Biology and the Center for Developmental Genetics, Stony Brook University, Stony Brook, New York
- Center for Excellence in Learning & Teaching, Stony Brook University, Stony Brook, New York
| | - Kevin Skier
- Department of Biochemistry and Cell Biology and the Center for Developmental Genetics, Stony Brook University, Stony Brook, New York
- University of Massachusetts Medical School, Worcester, Massachusetts
| | - Kevin H Chen
- Department of Biochemistry and Cell Biology and the Center for Developmental Genetics, Stony Brook University, Stony Brook, New York
- Boston University School of Medicine, Boston, Massachusetts
| | - John Peter Gergen
- Department of Biochemistry and Cell Biology and the Center for Developmental Genetics, Stony Brook University, Stony Brook, New York
| |
|
26
|
Mahmud AKMF, Yang D, Stenberg P, Ioshikhes I, Nandi S. Exploring a Drosophila Transcription Factor Interaction Network to Identify Cis-Regulatory Modules. J Comput Biol 2019; 27:1313-1328. [PMID: 31855461 DOI: 10.1089/cmb.2018.0160] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022] Open
Abstract
Multiple transcription factors (TFs) bind to specific sites in the genome and interact among themselves to form the cis-regulatory modules (CRMs). They are essential in modulating the expression of genes, and it is important to study this interplay to understand gene regulation. In the present study, we integrated experimentally identified TF binding sites collected from published studies with computationally predicted TF binding sites to identify Drosophila CRMs. Along with the detection of the previously known CRMs, this approach identified novel protein combinations. We determined high-occupancy target sites, where a large number of TFs bind. Investigating these sites revealed that Giant, Dichaete, and Knirp are highly enriched in these locations. A common TAG team motif was observed at these sites, which might play a role in recruiting other TFs. While comparing the binding sites at distal and proximal promoters, we found that certain regulatory TFs, such as Zelda, were highly enriched in enhancers. Our study has shown that, from the information available concerning the TF binding sites, the real CRMs could be predicted accurately and efficiently. Although we only may claim co-occurrence of these proteins in this study, it may actually point to their interaction (as known interaction proteins typically co-occur together). Such an integrative approach can, therefore, help us to provide a better understanding of the interplay among the factors, even though further experimental verification is required.
Affiliation(s)
| | - Doo Yang
- Ottawa Institute of Computational Biology and Bioinformatics (OICBB) and Ottawa Institute of Systems Biology (OISB) and Department of Biochemistry, Microbiology and Immunology (BMI), Faculty of Medicine, University of Ottawa, Ottawa, Canada
| | - Per Stenberg
- Department of Molecular Biology, Umeå University, Umeå, Sweden
| | - Ilya Ioshikhes
- Ottawa Institute of Computational Biology and Bioinformatics (OICBB) and Ottawa Institute of Systems Biology (OISB) and Department of Biochemistry, Microbiology and Immunology (BMI), Faculty of Medicine, University of Ottawa, Ottawa, Canada
| | - Soumyadeep Nandi
- Life Sciences Division, Institute of Advanced Study in Science and Technology, Vigyan Path, Paschim Boragaon, Guwahati, India; Amity University Haryana, Gurugram, India
| |
|
27
|
Vijayabaskar MS, Goode DK, Obier N, Lichtinger M, Emmett AML, Abidin FNZ, Shar N, Hannah R, Assi SA, Lie-A-Ling M, Gottgens B, Lacaud G, Kouskoff V, Bonifer C, Westhead DR. Identification of gene specific cis-regulatory elements during differentiation of mouse embryonic stem cells: An integrative approach using high-throughput datasets. PLoS Comput Biol 2019; 15:e1007337. [PMID: 31682597 PMCID: PMC6855567 DOI: 10.1371/journal.pcbi.1007337] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 12/11/2017] [Revised: 11/14/2019] [Accepted: 08/15/2019] [Indexed: 01/22/2023] Open
Abstract
Gene expression governs cell fate, and is regulated via a complex interplay of transcription factors and molecules that change chromatin structure. Advances in sequencing-based assays have enabled investigation of these processes genome-wide, leading to large datasets that combine information on the dynamics of gene expression, transcription factor binding and chromatin structure as cells differentiate. While numerous studies focus on the effects of these features on broader gene regulation, less work has been done on the mechanisms of gene-specific transcriptional control. In this study, we have focussed on the latter by integrating gene expression data for the in vitro differentiation of murine ES cells to macrophages and cardiomyocytes, with dynamic data on chromatin structure, epigenetics and transcription factor binding. Combining a novel strategy to identify communities of related control elements with a penalized regression approach, we developed individual models to identify the potential control elements predictive of the expression of each gene. Our models were compared to an existing method and evaluated using the existing literature and new experimental data from embryonic stem cell differentiation reporter assays. Our method is able to identify transcriptional control elements in a gene specific manner that reflect known regulatory relationships and to generate useful hypotheses for further testing.
Affiliation(s)
- M. S. Vijayabaskar
- School of Molecular and Cellular Biology, Faculty of Biological Sciences, University of Leeds, Leeds, United Kingdom
| | - Debbie K. Goode
- Wellcome Trust & MRC Cambridge Stem Cell Institute and Cambridge Institute for Medical Research, University of Cambridge, Cambridge, United Kingdom
| | - Nadine Obier
- Institute for Cancer and Genomic Sciences, College of Medical and Dental Sciences, University of Birmingham, Birmingham, United Kingdom
| | - Monika Lichtinger
- Institute for Cancer and Genomic Sciences, College of Medical and Dental Sciences, University of Birmingham, Birmingham, United Kingdom
| | - Amber M. L. Emmett
- School of Molecular and Cellular Biology, Faculty of Biological Sciences, University of Leeds, Leeds, United Kingdom
| | - Fatin N. Zainul Abidin
- School of Molecular and Cellular Biology, Faculty of Biological Sciences, University of Leeds, Leeds, United Kingdom
| | - Nisar Shar
- School of Molecular and Cellular Biology, Faculty of Biological Sciences, University of Leeds, Leeds, United Kingdom
| | - Rebecca Hannah
- Wellcome Trust & MRC Cambridge Stem Cell Institute and Cambridge Institute for Medical Research, University of Cambridge, Cambridge, United Kingdom
| | - Salam A. Assi
- Institute for Cancer and Genomic Sciences, College of Medical and Dental Sciences, University of Birmingham, Birmingham, United Kingdom
| | - Michael Lie-A-Ling
- CRUK Manchester Institute, University of Manchester, Manchester, United Kingdom
| | - Berthold Gottgens
- Wellcome Trust & MRC Cambridge Stem Cell Institute and Cambridge Institute for Medical Research, University of Cambridge, Cambridge, United Kingdom
| | - Georges Lacaud
- CRUK Manchester Institute, University of Manchester, Manchester, United Kingdom
| | - Valerie Kouskoff
- Division of Developmental Biology and Medicine, The University of Manchester, Manchester, United Kingdom
| | - Constanze Bonifer
- Institute for Cancer and Genomic Sciences, College of Medical and Dental Sciences, University of Birmingham, Birmingham, United Kingdom
| | - David R. Westhead
- School of Molecular and Cellular Biology, Faculty of Biological Sciences, University of Leeds, Leeds, United Kingdom
| |
|
28
|
Scott JG, Buchon N. Drosophila melanogaster as a powerful tool for studying insect toxicology. Pesticide Biochemistry and Physiology 2019; 161:95-103. [PMID: 31685202 DOI: 10.1016/j.pestbp.2019.09.006] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Received: 08/20/2019] [Revised: 09/17/2019] [Accepted: 09/18/2019] [Indexed: 06/10/2023]
Abstract
Insecticides are valuable and widely used tools for the control of pest insects. Despite the use of synthetic insecticides for >50 years, we continue to have a limited understanding of the genes that influence the key steps of the poisoning process. Major barriers for improving our understanding of insecticide toxicity have included a narrow range of tools and/or a large number of candidate genes that could be involved in the poisoning process. Herein, we discuss the numerous tools and resources available in Drosophila melanogaster that could be brought to bear to improve our understanding of the processes determining insecticide toxicity. These include unbiased approaches such as forward genetic screens, population genetic methods and candidate gene approaches. Examples are provided to showcase how D. melanogaster has been successfully used for insecticide toxicology studies in the past, and ideas for future studies using this valuable insect are discussed.
Affiliation(s)
- Jeffrey G Scott
- Department of Entomology, Comstock Hall, Cornell University, Ithaca, NY, USA.
| | - Nicolas Buchon
- Department of Entomology, Comstock Hall, Cornell University, Ithaca, NY, USA
| |
|
29
|
Wu C, Chen J, Liu Y, Hu X. Improved Prediction of Regulatory Element Using Hybrid Abelian Complexity Features with DNA Sequences. Int J Mol Sci 2019; 20:ijms20071704. [PMID: 30959806 PMCID: PMC6480087 DOI: 10.3390/ijms20071704] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 02/26/2019] [Revised: 04/01/2019] [Accepted: 04/02/2019] [Indexed: 12/14/2022] Open
Abstract
Deciphering the code of cis-regulatory element (CRE) is one of the core issues of current biology. As an important category of CRE, enhancers play crucial roles in gene transcriptional regulations in a distant manner. Further, the disruption of an enhancer can cause abnormal transcription and, thus, trigger human diseases, which means that its accurate identification is currently of broad interest. Here, we introduce an innovative concept, i.e., abelian complexity function (ACF), which is a more complex extension of the classic subword complexity function, for a new coding of DNA sequences. After feature selection by an upper bound estimation and integration with DNA composition features, we developed an enhancer prediction model with hybrid abelian complexity features (HACF). Compared with existing methods, HACF shows consistently superior performance on three sources of enhancer datasets. We tested the generalization ability of HACF by scanning human chromosome 22 to validate previously reported super-enhancers. Meanwhile, we identified novel candidate enhancers which have supports from enhancer-related ENCODE ChIP-seq signals. In summary, HACF improves current enhancer prediction and may be beneficial for further prioritization of functional noncoding variants.
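As a rough illustration of the quantity this abstract names, the sketch below counts, for a finite DNA string, the distinct composition (Parikh) vectors among its length-n windows, and contrasts that with the classic subword count. It assumes the standard combinatorics-on-words definition of abelian complexity adapted to a finite sequence; the function names are hypothetical, and the paper's actual feature construction, upper bound estimation, and feature selection are not reproduced here.

```python
def abelian_complexity(seq: str, n: int) -> int:
    """Count distinct letter-composition vectors over all length-n windows.

    Two windows are abelian-equivalent when they contain the same number of
    each letter, regardless of order (e.g. "AC" and "CA").
    Assumption: finite-sequence adaptation of the textbook definition.
    """
    if n <= 0 or n > len(seq):
        return 0
    alphabet = sorted(set(seq))
    index = {ch: i for i, ch in enumerate(alphabet)}
    counts = [0] * len(alphabet)
    # Composition of the first window.
    for ch in seq[:n]:
        counts[index[ch]] += 1
    seen = {tuple(counts)}
    # Slide the window: add the incoming letter, drop the outgoing one.
    for i in range(n, len(seq)):
        counts[index[seq[i]]] += 1
        counts[index[seq[i - n]]] -= 1
        seen.add(tuple(counts))
    return len(seen)


def subword_complexity(seq: str, n: int) -> int:
    """Count distinct length-n substrings (the classic subword complexity)."""
    if n <= 0 or n > len(seq):
        return 0
    return len({seq[i:i + n] for i in range(len(seq) - n + 1)})


if __name__ == "__main__":
    dna = "ACCA"
    # "AC" and "CA" share one composition, so the abelian count is smaller.
    print(abelian_complexity(dna, 2), subword_complexity(dna, 2))
```

Evaluating such counts over a range of window lengths would give one numeric profile per sequence; how the authors turn the function into model features is described in their paper, not here.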
Affiliation(s)
- Chengchao Wu
- College of Informatics, Agricultural Bioinformatics Key Laboratory of Hubei Province, Huazhong Agricultural University, Wuhan 430070, China.
| | - Jin Chen
- College of Science, Huazhong Agricultural University, Wuhan 430070, China.
| | - Yunxia Liu
- College of Informatics, Agricultural Bioinformatics Key Laboratory of Hubei Province, Huazhong Agricultural University, Wuhan 430070, China.
| | - Xuehai Hu
- College of Informatics, Agricultural Bioinformatics Key Laboratory of Hubei Province, Huazhong Agricultural University, Wuhan 430070, China.
| |
|
30
|
Varshney A, VanRenterghem H, Orchard P, Boyle AP, Stitzel ML, Ucar D, Parker SCJ. Cell Specificity of Human Regulatory Annotations and Their Genetic Effects on Gene Expression. Genetics 2019; 211:549-562. [PMID: 30593493 PMCID: PMC6366912 DOI: 10.1534/genetics.118.301525] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 10/17/2018] [Accepted: 12/09/2018] [Indexed: 12/19/2022] Open
Abstract
Epigenomic signatures from histone marks and transcription factor (TF)-binding sites have been used to annotate putative gene regulatory regions. However, a direct comparison of these diverse annotations is missing, and it is unclear how genetic variation within these annotations affects gene expression. Here, we compare five widely used annotations of active regulatory elements that represent high densities of one or more relevant epigenomic marks-"super" and "typical" (nonsuper) enhancers, stretch enhancers, high-occupancy target (HOT) regions, and broad domains-across the four matched human cell types for which they are available. We observe that stretch and super enhancers cover cell type-specific enhancer "chromatin states," whereas HOT regions and broad domains comprise more ubiquitous promoter states. Expression quantitative trait loci (eQTL) in stretch enhancers have significantly smaller effect sizes compared to those in HOT regions. Strikingly, chromatin accessibility QTL in stretch enhancers have significantly larger effect sizes compared to those in HOT regions. These observations suggest that stretch enhancers could harbor genetically primed chromatin to enable changes in TF binding, possibly to drive cell type-specific responses to environmental stimuli. Our results suggest that current eQTL studies are relatively underpowered or could lack the appropriate environmental context to detect genetic effects in the most cell type-specific "regulatory annotations," which likely contributes to infrequent colocalization of eQTL with genome-wide association study signals.
Affiliation(s)
- Arushi Varshney
- Department of Human Genetics, University of Michigan, Ann Arbor, Michigan 48109
| | - Hadley VanRenterghem
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109
| | - Peter Orchard
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109
| | - Alan P Boyle
- Department of Human Genetics, University of Michigan, Ann Arbor, Michigan 48109
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109
| | - Michael L Stitzel
- The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut 06032
| | - Duygu Ucar
- The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut 06032
| | - Stephen C J Parker
- Department of Human Genetics, University of Michigan, Ann Arbor, Michigan 48109
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109
| |
|
31
|
Lin X, Zhang X. Prediction of Hot Regions in PPIs Based on Improved Local Community Structure Detecting. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2018; 15:1470-1479. [PMID: 29994749 DOI: 10.1109/tcbb.2018.2793858] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 06/08/2023]
Abstract
Hot regions in protein-protein interactions (PPIs) are assembly regions composed of tightly packed hot spots. The discovery of hot regions helps to understand life activities and has important value for biological applications. The identification of hot regions is the basis for protein design and cancer prevention. Existing algorithms for predicting hot regions often have defects such as low accuracy and instability. This paper proposes a novel hot region prediction method based on diverse biological characteristics. First, features are evaluated using an improved mRMR method. Then, an SVM is used to build a classification model from the selected features. In addition, a new clustering algorithm, LCSD (local community structure detecting), is developed to detect and analyze the conformation of hot regions. In the clustering process, the link similarity of protein residues is introduced to handle the boundary nodes. This algorithm can effectively deal with missing residue nodes and control the local community boundaries. The results indicate that the spatial structure of hot regions can be obtained more effectively, and that our method is more effective than previous methods for precise identification of hot regions.
|
32
|
Cortini R, Filion GJ. Theoretical principles of transcription factor traffic on folded chromatin. Nat Commun 2018; 9:1740. [PMID: 29712907 PMCID: PMC5928121 DOI: 10.1038/s41467-018-04130-x] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 09/12/2017] [Accepted: 04/05/2018] [Indexed: 01/02/2023] Open
Abstract
All organisms regulate transcription of their genes. To understand this process, a complete understanding of how transcription factors find their targets in cellular nuclei is essential. The DNA sequence and other variables are known to influence this binding, but the distribution of transcription factor binding patterns remains mostly unexplained in metazoan genomes. Here, we investigate the role of chromosome conformation in the trajectories of transcription factors. Using molecular dynamics simulations, we uncover the principles of their diffusion on chromatin. Chromosome contacts play a conflicting role: at low density they enhance transcription factor traffic, but at high density they lower it by volume exclusion. Consistently, we observe that in human cells, highly occupied targets, where protein binding is promiscuous, are found at sites engaged in chromosome loops within uncompacted chromatin. In summary, we provide a framework for understanding the search trajectories of transcription factors, highlighting the key contribution of genome conformation. How transcription factors find their targets in vivo is still poorly understood. Here the authors use molecular dynamics simulations to investigate how transcription factors diffuse on chromatin, providing a theoretical framework for understanding the key role of genome conformation in this process.
Affiliation(s)
- Ruggero Cortini
- Genome Architecture, Gene Regulation, Stem Cells and Cancer Programme, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, 08003, Barcelona, Spain; Universidad Pompeu Fabra (UPF), 08003, Barcelona, Spain.
| | - Guillaume J Filion
- Genome Architecture, Gene Regulation, Stem Cells and Cancer Programme, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, 08003, Barcelona, Spain; Universidad Pompeu Fabra (UPF), 08003, Barcelona, Spain.
| |
|
33
|
Diehl AG, Boyle AP. Conserved and species-specific transcription factor co-binding patterns drive divergent gene regulation in human and mouse. Nucleic Acids Res 2018; 46:1878-1894. [PMID: 29361190 PMCID: PMC5829737 DOI: 10.1093/nar/gky018] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 09/26/2017] [Revised: 12/15/2017] [Accepted: 01/08/2018] [Indexed: 12/24/2022] Open
Abstract
The mouse is widely used as a system to study human genetic mechanisms. However, extensive rewiring of transcriptional regulatory networks often confounds translation of findings between human and mouse. Site-specific gain and loss of individual transcription factor binding sites (TFBS) has caused functional divergence of orthologous regulatory loci, and so we must look beyond this positional conservation to understand common themes of regulatory control. Fortunately, transcription factor co-binding patterns shared across species often perform conserved regulatory functions. These can be compared to 'regulatory sentences' that retain the same meanings regardless of sequence and species context. By analyzing TFBS co-occupancy patterns observed in four human and mouse cell types, we learned a regulatory grammar: the rules by which TFBS are combined into meaningful regulatory sentences. Different parts of this grammar associate with specific sets of functional annotations regardless of sequence conservation and predict functional signatures more accurately than positional conservation. We further show that both species-specific and conserved portions of this grammar are involved in gene expression divergence and human disease risk. These findings expand our understanding of transcriptional regulatory mechanisms, suggesting that phenotypic divergence and disease risk are driven by a complex interplay between deeply conserved and species-specific transcriptional regulatory pathways.
Affiliation(s)
- Adam G Diehl
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Alan P Boyle
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Department of Human Genetics, University of Michigan, Ann Arbor, MI 48109, USA
| |
|
34
|
Albritton SE, Ercan S. Caenorhabditis elegans Dosage Compensation: Insights into Condensin-Mediated Gene Regulation. Trends Genet 2017; 34:41-53. [PMID: 29037439 DOI: 10.1016/j.tig.2017.09.010] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 07/24/2017] [Revised: 09/19/2017] [Accepted: 09/25/2017] [Indexed: 01/05/2023]
Abstract
Recent work demonstrating the role of chromosome organization in transcriptional regulation has sparked substantial interest in the molecular mechanisms that control chromosome structure. Condensin, an evolutionarily conserved multisubunit protein complex, is essential for chromosome condensation during cell division and functions in regulating gene expression during interphase. In Caenorhabditis elegans, a specialized condensin forms the core of the dosage compensation complex (DCC), which specifically binds to and represses transcription from the hermaphrodite X chromosomes. DCC serves as a clear paradigm for addressing how condensins target large chromosomal domains and how they function to regulate chromosome structure and transcription. Here, we discuss recent research on C. elegans DCC in the context of canonical condensin mechanisms as have been studied in various organisms.
Affiliation(s)
- Sarah Elizabeth Albritton
- Department of Biology, Center for Genomics and Systems Biology, New York University, New York, NY 10003, USA
| | - Sevinç Ercan
- Department of Biology, Center for Genomics and Systems Biology, New York University, New York, NY 10003, USA.
| |
|
35
|
Uyehara CM, Nystrom SL, Niederhuber MJ, Leatham-Jensen M, Ma Y, Buttitta LA, McKay DJ. Hormone-dependent control of developmental timing through regulation of chromatin accessibility. Genes Dev 2017; 31:862-875. [PMID: 28536147 PMCID: PMC5458754 DOI: 10.1101/gad.298182.117] [Citation(s) in RCA: 64] [Impact Index Per Article: 8.0] [Received: 02/28/2017] [Accepted: 04/26/2017] [Indexed: 11/24/2022]
Abstract
Uyehara et al. show that hormone-induced transcription factors control temporal gene expression by regulating accessibility of DNA regulatory elements. Using the Drosophila wing, they demonstrate that temporal changes in gene expression are accompanied by genome-wide changes in chromatin accessibility at temporal-specific enhancers. Specification of tissue identity during development requires precise coordination of gene expression in both space and time. Spatially, master regulatory transcription factors are required to control tissue-specific gene expression programs. However, the mechanisms controlling how tissue-specific gene expression changes over time are less well understood. Here, we show that hormone-induced transcription factors control temporal gene expression by regulating the accessibility of DNA regulatory elements. Using the Drosophila wing, we demonstrate that temporal changes in gene expression are accompanied by genome-wide changes in chromatin accessibility at temporal-specific enhancers. We also uncover a temporal cascade of transcription factors following a pulse of the steroid hormone ecdysone such that different times in wing development can be defined by distinct combinations of hormone-induced transcription factors. Finally, we show that the ecdysone-induced transcription factor E93 controls temporal identity by directly regulating chromatin accessibility across the genome. Notably, we found that E93 controls enhancer activity through three different modalities, including promoting accessibility of late-acting enhancers and decreasing accessibility of early-acting enhancers. Together, this work supports a model in which an extrinsic signal triggers an intrinsic transcription factor cascade that drives development forward in time through regulation of chromatin accessibility.
Affiliation(s)
- Christopher M Uyehara
- Department of Biology, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599, USA; Department of Genetics, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599, USA; Curriculum in Genetics and Molecular Biology, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599, USA; Integrative Program for Biological and Genome Sciences, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599, USA
| | - Spencer L Nystrom
- Department of Biology, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599, USA; Department of Genetics, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599, USA; Curriculum in Genetics and Molecular Biology, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599, USA; Integrative Program for Biological and Genome Sciences, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599, USA
| | - Matthew J Niederhuber
- Department of Biology, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599, USA; Department of Genetics, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599, USA; Curriculum in Genetics and Molecular Biology, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599, USA; Integrative Program for Biological and Genome Sciences, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599, USA
| | - Mary Leatham-Jensen
- Department of Biology, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599, USA; Department of Genetics, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599, USA; Integrative Program for Biological and Genome Sciences, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599, USA
| | - Yiqin Ma
- Department of Molecular, Cellular, and Developmental Biology, University of Michigan, Ann Arbor, Michigan 48109, USA
| | - Laura A Buttitta
- Department of Molecular, Cellular, and Developmental Biology, University of Michigan, Ann Arbor, Michigan 48109, USA
| | - Daniel J McKay
- Department of Biology, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599, USA; Department of Genetics, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599, USA; Integrative Program for Biological and Genome Sciences, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599, USA
| |
|
36
|
Suske G. NF-Y and SP transcription factors - New insights in a long-standing liaison. Biochimica et Biophysica Acta - Gene Regulatory Mechanisms 2017; 1860:590-597. [DOI: 10.1016/j.bbagrm.2016.08.011] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 06/21/2016] [Revised: 08/18/2016] [Accepted: 08/24/2016] [Indexed: 12/31/2022]
|
37
|
Lomaev D, Mikhailova A, Erokhin M, Shaposhnikov AV, Moresco JJ, Blokhina T, Wolle D, Aoki T, Ryabykh V, Yates JR, Shidlovskii YV, Georgiev P, Schedl P, Chetverina D. The GAGA factor regulatory network: Identification of GAGA factor associated proteins. PLoS One 2017; 12:e0173602. [PMID: 28296955 PMCID: PMC5351981 DOI: 10.1371/journal.pone.0173602] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Received: 07/25/2016] [Accepted: 02/23/2017] [Indexed: 11/24/2022] Open
Abstract
The Drosophila GAGA factor (GAF) has an extraordinarily diverse set of functions that include the activation and silencing of gene expression, nucleosome organization and remodeling, higher order chromosome architecture and mitosis. One hypothesis that could account for these diverse activities is that GAF is able to interact with partners that have specific and dedicated functions. To test this possibility we used affinity purification coupled with high throughput mass spectrometry to identify GAF associated partners. Consistent with this hypothesis the GAF interacting network includes a large collection of factors and complexes that have been implicated in many different aspects of gene activity, chromosome structure and function. Moreover, we show that GAF interactions with a small subset of partners is direct; however for many others the interactions could be indirect, and depend upon intermediates that serve to diversify the functional capabilities of the GAF protein.
Affiliation(s)
- Dmitry Lomaev
- Institute of Gene Biology, Russian Academy of Sciences, Moscow, Russia
| | - Anna Mikhailova
- Institute of Gene Biology, Russian Academy of Sciences, Moscow, Russia
| | - Maksim Erokhin
- Institute of Gene Biology, Russian Academy of Sciences, Moscow, Russia
| | | | - James J. Moresco
- Department of Chemical Physiology, SR302B, The Scripps Research Institute, La Jolla, California, United States of America
| | - Tatiana Blokhina
- Institute of Gene Biology, Russian Academy of Sciences, Moscow, Russia
| | - Daniel Wolle
- Department of Molecular Biology, Princeton University, Princeton, NJ, United States of America
| | - Tsutomu Aoki
- Department of Molecular Biology, Princeton University, Princeton, NJ, United States of America
| | - Vladimir Ryabykh
- Institute of Animal Physiology, Biochemistry and Nutrition, Borovsk, Russia
| | - John R. Yates
- Department of Chemical Physiology, SR302B, The Scripps Research Institute, La Jolla, California, United States of America
| | | | - Pavel Georgiev
- Institute of Gene Biology, Russian Academy of Sciences, Moscow, Russia
- * E-mail: (DC); (PS); (PG)
| | - Paul Schedl
- Institute of Gene Biology, Russian Academy of Sciences, Moscow, Russia
- Department of Molecular Biology Princeton University, Princeton, NJ, United States of America
- * E-mail: (DC); (PS); (PG)
| | - Darya Chetverina
- Institute of Gene Biology, Russian Academy of Sciences, Moscow, Russia
- * E-mail: (DC); (PS); (PG)
| |
|
38
|
Shin DH, Hong JW. Transcriptional activity of the short gastrulation primary enhancer in the ventral midline requires its early activity in the presumptive neurogenic ectoderm. BMB Rep 2017; 49:572-577. [PMID: 27616358 PMCID: PMC5227300 DOI: 10.5483/bmbrep.2016.49.10.119] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 07/21/2016] [Indexed: 11/22/2022] Open
Abstract
The short gastrulation (sog) shadow enhancer directs early and late sog expression in the neurogenic ectoderm and the ventral midline of the developing Drosophila embryo, respectively. Here, evidence is presented that the sog primary enhancer also has both activities, with the late enhancer activity dependent on the early activity. Computational analyses showed that the sog primary enhancer contains five Dorsal (Dl)-, four Zelda (Zld)-, three Bicoid (Bcd)-, and no Single-minded (Sim)-binding sites. In contrast to many ventral midline enhancers, the primary enhancer can direct lacZ expression in the ventral midline as well as in the neurogenic ectoderm without a canonical Sim-binding site. Intriguingly, the impaired transcriptional synergy between Dl and either Zld or Bcd led to aberrant and abolished lacZ expression in the neurogenic ectoderm and in the ventral midline, respectively. These findings suggest that the two enhancer activities of the sog primary enhancer are functionally consolidated and geographically inseparable. [BMB Reports 2016; 49(10): 572-577]
Affiliation(s)
- Dong-Hyeon Shin
- Graduate School of East-West Medical Science, Kyung Hee University, Yongin 17104, Korea
| | - Joung-Woo Hong
- Graduate School of East-West Medical Science, Kyung Hee University, Yongin 17104, Korea
| |
|
39
|
Zacher B, Michel M, Schwalb B, Cramer P, Tresch A, Gagneur J. Accurate Promoter and Enhancer Identification in 127 ENCODE and Roadmap Epigenomics Cell Types and Tissues by GenoSTAN. PLoS One 2017; 12:e0169249. [PMID: 28056037 PMCID: PMC5215863 DOI: 10.1371/journal.pone.0169249] [Citation(s) in RCA: 55] [Impact Index Per Article: 6.9] [Received: 11/04/2016] [Accepted: 12/14/2016] [Indexed: 12/22/2022] Open
Abstract
Accurate maps of promoters and enhancers are required for understanding transcriptional regulation. Promoters and enhancers are usually mapped by integration of chromatin assays charting histone modifications, DNA accessibility, and transcription factor binding. However, current algorithms are limited by unrealistic data distribution assumptions. Here we propose GenoSTAN (Genomic STate ANnotation), a hidden Markov model overcoming these limitations. We map promoters and enhancers for 127 cell types and tissues from the ENCODE and Roadmap Epigenomics projects, today’s largest compendium of chromatin assays. Extensive benchmarks demonstrate that GenoSTAN generally identifies promoters and enhancers with significantly higher accuracy than previous methods. Moreover, GenoSTAN-derived promoters and enhancers showed significantly higher enrichment of complex trait-associated genetic variants than current annotations. Altogether, GenoSTAN provides an easy-to-use tool to define promoters and enhancers in any system, and our annotation of human transcriptional cis-regulatory elements constitutes a rich resource for future research in biology and medicine.
Affiliation(s)
- Benedikt Zacher
- Gene Center and Department of Biochemistry, Center for Integrated Protein Science CIPSM, Ludwig-Maximilians-Universität Munich, Germany
- * E-mail: (BZ); (AT); (JG)
| | - Margaux Michel
- Department of Molecular Biology, Max Planck Institute for Biophysical Chemistry, Göttingen, Germany
| | - Björn Schwalb
- Department of Molecular Biology, Max Planck Institute for Biophysical Chemistry, Göttingen, Germany
| | - Patrick Cramer
- Department of Molecular Biology, Max Planck Institute for Biophysical Chemistry, Göttingen, Germany
| | - Achim Tresch
- Department of Biology, University of Cologne, Cologne, Germany
- Max Planck Institute for Plant Breeding Research, Cologne, Germany
- * E-mail: (BZ); (AT); (JG)
| | - Julien Gagneur
- Gene Center and Department of Biochemistry, Center for Integrated Protein Science CIPSM, Ludwig-Maximilians-Universität Munich, Germany
- * E-mail: (BZ); (AT); (JG)
| |
|
40
|
Blythe SA, Wieschaus EF. Establishment and maintenance of heritable chromatin structure during early Drosophila embryogenesis. eLife 2016; 5:20148. [PMID: 27879204 PMCID: PMC5156528 DOI: 10.7554/elife.20148] [Citation(s) in RCA: 115] [Impact Index Per Article: 12.8] [Received: 07/28/2016] [Accepted: 11/21/2016] [Indexed: 12/18/2022] Open
Abstract
During embryogenesis, the initial chromatin state is established during a period of rapid proliferative activity. We have measured with 3-min time resolution how heritable patterns of chromatin structure are initially established and maintained during the midblastula transition (MBT). We find that regions of accessibility are established sequentially, where enhancers are opened in advance of promoters and insulators. These open states are stably maintained in highly condensed mitotic chromatin to ensure faithful inheritance of prior accessibility status across cell divisions. The temporal progression of establishment is controlled by the biological timers that control the onset of the MBT. In general, acquisition of promoter accessibility is controlled by the biological timer that measures the nucleo-cytoplasmic (N:C) ratio, whereas timing of enhancer accessibility is regulated independently of the N:C ratio. These different timing classes each associate with binding sites for two transcription factors, GAGA-factor and Zelda, previously implicated in controlling chromatin accessibility at ZGA.
Affiliation(s)
- Shelby A Blythe
- Howard Hughes Medical Institute, Princeton University, Princeton, United States
| | - Eric F Wieschaus
- Howard Hughes Medical Institute, Princeton University, Princeton, United States
| |
|
41
|
Koenecke N, Johnston J, He Q, Meier S, Zeitlinger J. Drosophila poised enhancers are generated during tissue patterning with the help of repression. Genome Res 2016; 27:64-74. [PMID: 27979994 PMCID: PMC5204345 DOI: 10.1101/gr.209486.116] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.3] [Received: 05/07/2016] [Accepted: 11/08/2016] [Indexed: 12/18/2022]
Abstract
Histone modifications are frequently used as markers for enhancer states, but how to interpret enhancer states in the context of embryonic development is not clear. The poised enhancer signature, involving H3K4me1 and low levels of H3K27ac, has been reported to mark inactive enhancers that are poised for future activation. However, future activation is not always observed, and alternative reasons for the widespread occurrence of this enhancer signature have not been investigated. By analyzing enhancers during dorsal-ventral (DV) axis formation in the Drosophila embryo, we find that the poised enhancer signature is specifically generated during patterning in the tissue where the enhancers are not induced, including at enhancers that are known to be repressed by a transcriptional repressor. These results suggest that, rather than serving exclusively as an intermediate step before future activation, the poised enhancer state may be a mark for spatial regulation during tissue patterning. We discuss the possibility that the poised enhancer state is more generally the result of repression by transcriptional repressors.
Affiliation(s)
- Nina Koenecke
- Stowers Institute for Medical Research, Kansas City, Missouri 64110, USA
| | - Jeff Johnston
- Stowers Institute for Medical Research, Kansas City, Missouri 64110, USA
| | - Qiye He
- Stowers Institute for Medical Research, Kansas City, Missouri 64110, USA
| | - Samuel Meier
- Stowers Institute for Medical Research, Kansas City, Missouri 64110, USA
| | - Julia Zeitlinger
- Stowers Institute for Medical Research, Kansas City, Missouri 64110, USA; University of Kansas Medical Center, Department of Pathology, Kansas City, Kansas 66160, USA
| |
|
42
|
Onichtchouk DV, Voronina AS. Regulation of Zygotic Genome and Cellular Pluripotency. BIOCHEMISTRY (MOSCOW) 2016; 80:1723-33. [PMID: 26878577 DOI: 10.1134/s0006297915130088] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 12/18/2022]
Abstract
Events marking the transition from the maternal to the zygotic period of development have been studied for more than 100 years, but the underlying mechanisms are not yet clear. We provide a brief historical overview of the development of concepts and explain the specific terminology used in the field. We further discuss differences and similarities between zygotic genome activation and the in vitro reprogramming process. Finally, we envision future research directions within the field, where biochemical methods will play an increasingly important role.
Affiliation(s)
- D V Onichtchouk
- University of Freiburg, Developmental Biology Unit, Biologie 1, Freiburg, 79194, Germany.
| | | |
|
43
|
Li H, Liu F, Ren C, Bo X, Shu W. Genome-wide identification and characterisation of HOT regions in the human genome. BMC Genomics 2016; 17:733. [PMID: 27633377 PMCID: PMC5025555 DOI: 10.1186/s12864-016-3077-4] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 10/29/2015] [Accepted: 09/08/2016] [Indexed: 01/10/2023] Open
Abstract
Background: HOT (high-occupancy target) regions, which are bound by a surprisingly large number of transcription factors, are considered to be among the most intriguing findings of recent years. An improved understanding of the roles that HOT regions play in biology would be afforded by knowing the constellation of factors that constitute these domains and by identifying HOT regions across the spectrum of human cell types. Results: We characterised and validated HOT regions in embryonic stem cells (ESCs) and produced a catalogue of HOT regions in a broad range of human cell types. We found that HOT regions are associated with genes that control and define the developmental processes of the respective cell and tissue types. We also showed evidence of the developmental persistence of HOT regions at primitive enhancers and demonstrated unique signatures of HOT regions that distinguish them from typical enhancers and super-enhancers. Finally, we performed a dynamic analysis to reveal the dynamical regulation of HOT regions upon H1 differentiation. Conclusions: Taken together, our results provide a resource for the functional exploration of HOT regions and extend our understanding of the key roles of HOT regions in development and differentiation.
Affiliation(s)
- Hao Li
- Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing, 100850, China
| | - Feng Liu
- Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing, 100850, China
| | - Chao Ren
- Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing, 100850, China
| | - Xiaochen Bo
- Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing, 100850, China.
| | - Wenjie Shu
- Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing, 100850, China.
| |
|
44
|
Shlyueva D, Meireles-Filho ACA, Pagani M, Stark A. Genome-Wide Ultrabithorax Binding Analysis Reveals Highly Targeted Genomic Loci at Developmental Regulators and a Potential Connection to Polycomb-Mediated Regulation. PLoS One 2016; 11:e0161997. [PMID: 27575958 PMCID: PMC5004984 DOI: 10.1371/journal.pone.0161997] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 03/25/2016] [Accepted: 08/16/2016] [Indexed: 12/22/2022] Open
Abstract
Hox homeodomain transcription factors are key regulators of animal development. They specify the identity of segments along the anterior-posterior body axis in metazoans by controlling the expression of diverse downstream targets, including transcription factors and signaling pathway components. The Drosophila melanogaster Hox factor Ultrabithorax (Ubx) directs the development of thoracic and abdominal segments and appendages, and loss of Ubx function can lead, for example, to the transformation of third thoracic segment appendages (e.g. halteres) into second thoracic segment appendages (e.g. wings), resulting in a characteristic four-wing phenotype. Here we present a Drosophila melanogaster strain with a V5-epitope tagged Ubx allele, which we employed to obtain a high quality genome-wide map of Ubx binding sites using ChIP-seq. We confirm the sensitivity of the V5 ChIP-seq by recovering 7/8 of well-studied Ubx-dependent cis-regulatory regions. Moreover, we show that Ubx binding is predictive of enhancer activity, as suggested by comparison with a genome-scale resource of in vivo tested enhancer candidates. We observed densely clustered Ubx binding sites at 12 extended genomic loci that included ANTP-C, BX-C, Polycomb complex genes, and other regulators, and the clustered binding sites were frequently active enhancers. Furthermore, Ubx binding was detected at known Polycomb response elements (PREs) and was associated with significant enrichments of Pc and Pho ChIP signals, in contrast to binding sites of other developmental TFs. Together, our results show that Ubx targets developmental regulators via strongly clustered binding sites and allow us to hypothesize that regulation by Ubx might involve Polycomb group proteins to maintain specific regulatory states in cooperative or mutually exclusive fashion, an attractive model that combines two groups of proteins with prominent gene regulatory roles during animal development.
Affiliation(s)
- Daria Shlyueva
- Research Institute of Molecular Pathology (IMP), Vienna Biocenter (VBC), Vienna, Austria
| | | | - Michaela Pagani
- Research Institute of Molecular Pathology (IMP), Vienna Biocenter (VBC), Vienna, Austria
| | - Alexander Stark
- Research Institute of Molecular Pathology (IMP), Vienna Biocenter (VBC), Vienna, Austria
| |
|
45
|
Beagrie RA, Pombo A. Gene activation by metazoan enhancers: Diverse mechanisms stimulate distinct steps of transcription. Bioessays 2016; 38:881-93. [PMID: 27452946 DOI: 10.1002/bies.201600032] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.4] [Indexed: 12/30/2022]
Abstract
Enhancers can stimulate transcription by a number of different mechanisms which control different stages of the transcription cycle of their target genes, from recruitment of the transcription machinery to elongation by RNA polymerase. These mechanisms may not be mutually exclusive, as a single enhancer may act through different pathways by binding multiple transcription factors. Multiple enhancers may also work together to regulate transcription of a shared target gene. Most of the evidence supporting different enhancer mechanisms comes from the study of single genes, but new high-throughput experimental frameworks offer the opportunity to integrate and generalize disparate mechanisms identified at single genes. This effort is especially important if we are to fully understand how sequence variation within enhancers contributes to human disease.
Affiliation(s)
- Robert A Beagrie
- Epigenetic Regulation and Chromatin Architecture Group, Berlin Institute for Medical Systems Biology, Max-Delbrück Centre for Molecular Medicine, Berlin-Buch, Germany
| | - Ana Pombo
- Epigenetic Regulation and Chromatin Architecture Group, Berlin Institute for Medical Systems Biology, Max-Delbrück Centre for Molecular Medicine, Berlin-Buch, Germany
| |
|
46
|
Multiplex enhancer-reporter assays uncover unsophisticated TP53 enhancer logic. Genome Res 2016; 26:882-95. [PMID: 27197205 PMCID: PMC4937571 DOI: 10.1101/gr.204149.116] [Citation(s) in RCA: 64] [Impact Index Per Article: 7.1] [Received: 01/06/2016] [Accepted: 05/17/2016] [Indexed: 12/11/2022]
Abstract
Transcription factors regulate their target genes by binding to regulatory regions in the genome. Although the binding preferences of TP53 are known, it remains unclear what distinguishes functional enhancers from nonfunctional binding. In addition, the genome is scattered with recognition sequences that remain unoccupied. Using two complementary techniques of multiplex enhancer-reporter assays, we discovered that functional enhancers could be discriminated from nonfunctional binding events by the occurrence of a single TP53 canonical motif. By combining machine learning with a meta-analysis of TP53 ChIP-seq data sets, we identified a core set of more than 1000 responsive enhancers in the human genome. This TP53 cistrome is invariably used between cell types and experimental conditions, whereas differences among experiments can be attributed to indirect nonfunctional binding events. Our data suggest that TP53 enhancers represent a class of unsophisticated cell-autonomous enhancers containing a single TP53 binding site, distinct from complex developmental enhancers that integrate signals from multiple transcription factors.
|
47
|
An autonomous CEBPA enhancer specific for myeloid-lineage priming and neutrophilic differentiation. Blood 2016; 127:2991-3003. [PMID: 26966090 DOI: 10.1182/blood-2016-01-695759] [Citation(s) in RCA: 59] [Impact Index Per Article: 6.6] [Received: 01/28/2016] [Accepted: 03/02/2016] [Indexed: 12/24/2022] Open
Abstract
Neutrophilic differentiation is dependent on CCAAT enhancer-binding protein α (C/EBPα), a transcription factor expressed in multiple organs including the bone marrow. Using functional genomic technologies in combination with clustered regularly-interspaced short palindromic repeat (CRISPR)/CRISPR-associated protein 9 genome editing and in vivo mouse modeling, we show that CEBPA is located in a 170-kb topological-associated domain that contains 14 potential enhancers. Of these, 1 enhancer located +42 kb from CEBPA is active and engages with the CEBPA promoter in myeloid cells only. Germ line deletion of the homologous enhancer in mice in vivo reduces Cebpa levels exclusively in hematopoietic stem cells (HSCs) and myeloid-primed progenitor cells leading to severe defects in the granulocytic lineage, without affecting any other Cebpa-expressing organ studied. The enhancer-deleted progenitor cells lose their myeloid transcription program and are blocked in differentiation. Deletion of the enhancer also causes loss of HSC maintenance. We conclude that a single +42-kb enhancer is essential for CEBPA expression in myeloid cells only.
|
48
|
Onichtchouk D, Driever W. Zygotic Genome Activators, Developmental Timing, and Pluripotency. Curr Top Dev Biol 2016; 116:273-97. [PMID: 26970624 DOI: 10.1016/bs.ctdb.2015.12.004] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Indexed: 12/31/2022]
Abstract
The transcription factors Pou5f1, Sox2, and Nanog are central regulators of pluripotency in mammalian ES and iPS cells. In vertebrate embryos, Pou5f1/3, SoxB1, and Nanog control zygotic genome activation and participate in lineage decisions. We review the current knowledge of the roles of these genes in developing vertebrate embryos from fish to mammals and suggest a model for pluripotency gene regulatory network functions in early development.
Affiliation(s)
- Daria Onichtchouk
- Developmental Biology Unit, Institute Biology I, Faculty of Biology, and Center for Biological Signaling Studies (BIOSS), Albert-Ludwigs-University, Freiburg, Germany.
| | - Wolfgang Driever
- Developmental Biology Unit, Institute Biology I, Faculty of Biology, and Center for Biological Signaling Studies (BIOSS), Albert-Ludwigs-University, Freiburg, Germany.
| |
|
49
|
Rezsohazy R, Saurin AJ, Maurel-Zaffran C, Graba Y. Cellular and molecular insights into Hox protein action. Development 2016; 142:1212-27. [PMID: 25804734 DOI: 10.1242/dev.109785] [Citation(s) in RCA: 100] [Impact Index Per Article: 11.1] [Indexed: 12/19/2022]
Abstract
Hox genes encode homeodomain transcription factors that control morphogenesis and have established functions in development and evolution. Hox proteins have remained enigmatic with regard to the molecular mechanisms that endow them with specific and diverse functions, and to the cellular functions that they control. Here, we review recent examples of Hox-controlled cellular functions that highlight their versatile and highly context-dependent activity. This provides the setting to discuss how Hox proteins control morphogenesis and organogenesis. We then summarise the molecular modalities underlying Hox protein function, in particular in light of current models of transcription factor function. Finally, we discuss how functional divergence between Hox proteins might be achieved to give rise to the many facets of their action.
Affiliation(s)
- René Rezsohazy
- Institut des Sciences de la Vie, Université Catholique de Louvain, Louvain-la-Neuve B-1348, Belgium
| | - Andrew J Saurin
- Aix Marseille Université, CNRS, IBDM, UMR 7288, Marseille 13288, Cedex 09, France
| | | | - Yacine Graba
- Aix Marseille Université, CNRS, IBDM, UMR 7288, Marseille 13288, Cedex 09, France
| |
|
50
|
Peng PC, Hassan Samee MA, Sinha S. Incorporating chromatin accessibility data into sequence-to-expression modeling. Biophys J 2016; 108:1257-67. [PMID: 25762337 DOI: 10.1016/j.bpj.2014.12.037] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 08/18/2014] [Revised: 12/01/2014] [Accepted: 12/11/2014] [Indexed: 01/30/2023] Open
Abstract
Prediction of gene expression levels from regulatory sequences is one of the major challenges of genomic biology today. A particularly promising approach to this problem is that taken by thermodynamics-based models that interpret an enhancer sequence in a given cellular context specified by transcription factor concentration levels and predict precise expression levels driven by that enhancer. Such models have so far not accounted for the effect of chromatin accessibility on interactions between transcription factor and DNA and consequently on gene-expression levels. Here, we extend a thermodynamics-based model of gene expression, called GEMSTAT (Gene Expression Modeling Based on Statistical Thermodynamics), to incorporate chromatin accessibility data and quantify its effect on accuracy of expression prediction. In the new model, called GEMSTAT-A, accessibility at a binding site is assumed to affect the transcription factor's binding strength at the site, whereas all other aspects are identical to the GEMSTAT model. We show that this modification results in significantly better fits in a data set of over 30 enhancers regulating spatial expression patterns in the blastoderm-stage Drosophila embryo. It is important to note that the improved fits result not from an overall elevated accessibility in active enhancers but from the variation of accessibility levels within an enhancer. With whole-genome DNA accessibility measurements becoming increasingly popular, our work demonstrates how such data may be useful for sequence-to-expression models. It also calls for future advances in modeling accessibility levels from sequence and the transregulatory context, so as to predict accurately the effect of cis and trans perturbations on gene expression.
Affiliation(s)
- Pei-Chen Peng
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois
| | - Md Abul Hassan Samee
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois
| | - Saurabh Sinha
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois; Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois.
| |
|