1
Friedman RZ, Ramu A, Lichtarge S, Wu Y, Tripp L, Lyon D, Myers CA, Granas DM, Gause M, Corbo JC, Cohen BA, White MA. Active learning of enhancers and silencers in the developing neural retina. Cell Syst 2025; 16:101163. [PMID: 39778579 PMCID: PMC11827711 DOI: 10.1016/j.cels.2024.12.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2024] [Revised: 10/17/2024] [Accepted: 12/06/2024] [Indexed: 01/11/2025]
Abstract
Deep learning is a promising strategy for modeling cis-regulatory elements. However, models trained on genomic sequences often fail to explain why the same transcription factor can activate or repress transcription in different contexts. To address this limitation, we developed an active learning approach to train models that distinguish between enhancers and silencers composed of binding sites for the photoreceptor transcription factor cone-rod homeobox (CRX). After training the model on nearly all bound CRX sites from the genome, we coupled synthetic biology with uncertainty sampling to generate additional rounds of informative training data. This allowed us to iteratively train models on data from multiple rounds of massively parallel reporter assays. The ability of the resulting models to discriminate between CRX sites with identical sequence but opposite functions establishes active learning as an effective strategy to train models of regulatory DNA. A record of this paper's transparent peer review process is included in the supplemental information.
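The abstract names uncertainty sampling but does not say which acquisition function was used. As a rough illustration only, the sketch below assumes an entropy criterion on a binary enhancer-versus-silencer probability; `binary_entropy`, `select_most_uncertain`, and the toy `toy_predict_proba` scorer are hypothetical names, not the authors' code.

```python
import math


def binary_entropy(p):
    """Shannon entropy (in bits) of a two-class prediction with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a fully confident prediction carries no uncertainty
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def select_most_uncertain(candidates, predict_proba, batch_size):
    """Pick the batch_size candidates whose predictions have the highest entropy."""
    scored = [(binary_entropy(predict_proba(seq)), seq) for seq in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [seq for _, seq in scored[:batch_size]]


def toy_predict_proba(seq):
    """Hypothetical stand-in for a trained model: the fraction of G bases."""
    return seq.count("G") / len(seq)


pool = ["GGGG", "GGAA", "AAAA", "GAAA"]
next_round = select_most_uncertain(pool, toy_predict_proba, batch_size=2)
```

In an active learning loop, the selected sequences would be assayed, added to the training set, and the model retrained before the next round of selection.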
Affiliation(s)
- Ryan Z Friedman
- The Edison Family Center for Genome Sciences & Systems Biology, Saint Louis, MO 63110, USA; Department of Genetics, Saint Louis, MO 63110, USA
- Avinash Ramu
- The Edison Family Center for Genome Sciences & Systems Biology, Saint Louis, MO 63110, USA; Department of Genetics, Saint Louis, MO 63110, USA
- Sara Lichtarge
- The Edison Family Center for Genome Sciences & Systems Biology, Saint Louis, MO 63110, USA; Department of Genetics, Saint Louis, MO 63110, USA
- Yawei Wu
- The Edison Family Center for Genome Sciences & Systems Biology, Saint Louis, MO 63110, USA; Department of Genetics, Saint Louis, MO 63110, USA
- Lloyd Tripp
- The Edison Family Center for Genome Sciences & Systems Biology, Saint Louis, MO 63110, USA; Department of Genetics, Saint Louis, MO 63110, USA
- Daniel Lyon
- The Edison Family Center for Genome Sciences & Systems Biology, Saint Louis, MO 63110, USA; Department of Genetics, Saint Louis, MO 63110, USA
- Connie A Myers
- Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, MO 63110, USA
- David M Granas
- The Edison Family Center for Genome Sciences & Systems Biology, Saint Louis, MO 63110, USA; Department of Genetics, Saint Louis, MO 63110, USA
- Maria Gause
- Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, MO 63110, USA
- Joseph C Corbo
- Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, MO 63110, USA
- Barak A Cohen
- The Edison Family Center for Genome Sciences & Systems Biology, Saint Louis, MO 63110, USA; Department of Genetics, Saint Louis, MO 63110, USA
- Michael A White
- The Edison Family Center for Genome Sciences & Systems Biology, Saint Louis, MO 63110, USA; Department of Genetics, Saint Louis, MO 63110, USA.
2
Romero R, Menichelli C, Vroland C, Marin JM, Lèbre S, Lecellier CH, Bréhélin L. TFscope: systematic analysis of the sequence features involved in the binding preferences of transcription factors. Genome Biol 2024; 25:187. [PMID: 38987807 PMCID: PMC11514967 DOI: 10.1186/s13059-024-03321-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2022] [Accepted: 06/24/2024] [Indexed: 07/12/2024]
Abstract
Characterizing the binding preferences of transcription factors (TFs) in different cell types and conditions is key to understanding how they orchestrate gene expression. Here, we develop TFscope, a machine learning approach that identifies sequence features explaining the binding differences observed between two ChIP-seq experiments targeting either the same TF in two conditions or two TFs with similar motifs (paralogous TFs). TFscope systematically investigates differences in the core motif, nucleotide environment and co-factor motifs, and provides the contribution of each key feature in the two experiments. TFscope was applied to >305 ChIP-seq pairs, and several examples are discussed.
Affiliation(s)
- Raphaël Romero
- LIRMM, Univ Montpellier, CNRS, Montpellier, France
- IMAG, Univ Montpellier, CNRS, Montpellier, France
- Christophe Vroland
- LIRMM, Univ Montpellier, CNRS, Montpellier, France
- Institut de Génétique Moléculaire de Montpellier, University of Montpellier, CNRS, Montpellier, France
- Sophie Lèbre
- IMAG, Univ Montpellier, CNRS, Montpellier, France.
- AMIS, Université Paul-Valéry-Montpellier 3, Montpellier, France.
- Charles-Henri Lecellier
- LIRMM, Univ Montpellier, CNRS, Montpellier, France.
- Institut de Génétique Moléculaire de Montpellier, University of Montpellier, CNRS, Montpellier, France.
3
Gupta S, Kesarwani V, Bhati U, Jyoti, Shankar R. PTFSpot: deep co-learning on transcription factors and their binding regions attains impeccable universality in plants. Brief Bioinform 2024; 25:bbae324. [PMID: 39013383 PMCID: PMC11250369 DOI: 10.1093/bib/bbae324] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2024] [Revised: 06/07/2024] [Accepted: 06/19/2024] [Indexed: 07/18/2024]
Abstract
Unlike in animals, variability in transcription factors (TFs) and their binding regions (TFBRs) across plant species is a major problem that most existing TFBR-finding software fails to tackle, rendering it of hardly any use. This limitation has resulted in the underdevelopment of plant regulatory research and the rampant use of Arabidopsis-like model species, generating misleading results. Here, we report a revolutionary transformer-based deep-learning approach, PTFSpot, which learns from the co-variability of TF structures and their binding regions to build a universal TF-DNA interaction model that detects TFBRs free from the limitations of TF- and species-specific models. In a series of extensive benchmarking studies on multiple experimentally validated datasets, it not only outperformed existing software by a margin of >30% but also consistently delivered >90% accuracy, even for species and TF families that were never encountered during model building. PTFSpot now makes it possible to accurately annotate TFBRs across any plant genome, even in the total absence of TF information, completely free from the bottlenecks of species- and TF-specific models.
Affiliation(s)
- Sagar Gupta
- Studio of Computational Biology & Bioinformatics, The Himalayan Centre for High-throughput Computational Biology, (HiCHiCoB, A BIC supported by DBT, India), Biotechnology Division, CSIR-Institute of Himalayan Bioresource Technology (CSIR-IHBT), Palampur, Himachal Pradesh 176061, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
- Veerbhan Kesarwani
- Studio of Computational Biology & Bioinformatics, The Himalayan Centre for High-throughput Computational Biology, (HiCHiCoB, A BIC supported by DBT, India), Biotechnology Division, CSIR-Institute of Himalayan Bioresource Technology (CSIR-IHBT), Palampur, Himachal Pradesh 176061, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
- Umesh Bhati
- Studio of Computational Biology & Bioinformatics, The Himalayan Centre for High-throughput Computational Biology, (HiCHiCoB, A BIC supported by DBT, India), Biotechnology Division, CSIR-Institute of Himalayan Bioresource Technology (CSIR-IHBT), Palampur, Himachal Pradesh 176061, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
- Jyoti
- Studio of Computational Biology & Bioinformatics, The Himalayan Centre for High-throughput Computational Biology, (HiCHiCoB, A BIC supported by DBT, India), Biotechnology Division, CSIR-Institute of Himalayan Bioresource Technology (CSIR-IHBT), Palampur, Himachal Pradesh 176061, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
- Ravi Shankar
- Studio of Computational Biology & Bioinformatics, The Himalayan Centre for High-throughput Computational Biology, (HiCHiCoB, A BIC supported by DBT, India), Biotechnology Division, CSIR-Institute of Himalayan Bioresource Technology (CSIR-IHBT), Palampur, Himachal Pradesh 176061, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
4
Khetan S, Bulyk ML. Overlapping binding sites underlie TF genomic occupancy. bioRxiv 2024:2024.03.05.583629. [PMID: 38496549 PMCID: PMC10942454 DOI: 10.1101/2024.03.05.583629] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/19/2024]
Abstract
Sequence-specific DNA binding by transcription factors (TFs) is a crucial step in gene regulation. However, current high-throughput in vitro approaches cannot reliably detect lower affinity TF-DNA interactions, which play key roles in gene regulation. Here, we developed PADIT-seq (protein affinity to DNA by in vitro transcription and RNA sequencing) to assay TF binding preferences to all 10-bp DNA sequences at far greater sensitivity than prior approaches. The expanded catalogs of low affinity DNA binding sites for the human TFs HOXD13 and EGR1 revealed that nucleotides flanking high affinity DNA binding sites create overlapping lower affinity sites that together modulate TF genomic occupancy in vivo. Formation of such extended recognition sequences stems from an inherent property of TF binding sites to interweave each other and expands the genomic sequence space for identifying noncoding variants that directly alter TF binding.
One-Sentence Summary: Overlapping DNA binding sites underlie TF genomic occupancy through their inherent propensity to interweave each other.
5
Kang CK, Kim AR. Deep molecular learning of transcriptional control of a synthetic CRE enhancer and its variants. iScience 2024; 27:108747. [PMID: 38222110 PMCID: PMC10784702 DOI: 10.1016/j.isci.2023.108747] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2023] [Revised: 08/29/2023] [Accepted: 12/12/2023] [Indexed: 01/16/2024]
Abstract
Massively parallel reporter assays measure the transcriptional activities of many cis-regulatory modules (CRMs) in a single experiment. We developed a thermodynamic computational modeling framework that calculates quantitative levels of gene expression directly from regulatory DNA sequences. Using the framework, we investigated the molecular mechanisms by which cis-regulatory mutations of a synthetic enhancer cause abnormal gene expression. We found that, in a human cell line, competitive binding between transcription factors (TFs) of the same family with slightly different binding preferences significantly increases the accuracy of recapitulating the transcriptional effects of thousands of single or multiple mutations. We also discovered that even if various harmful mutations occur in an activator binding site, a CRM can stably maintain or even increase gene expression through a certain form of competitive binding between TFs of the same family. These findings enhance understanding of the effects of SNPs and indels on CRMs and would help in building robust custom-designed CRMs for biologics production and gene therapy.
Affiliation(s)
- Chan-Koo Kang
- School of Life Science, Handong Global University, Pohang, Gyeong-Buk 37554, South Korea
- Department of Advanced Convergence, Handong Global University, Pohang, Gyeong-Buk 37554, South Korea
- Ah-Ram Kim
- School of Life Science, Handong Global University, Pohang, Gyeong-Buk 37554, South Korea
- Department of Advanced Convergence, Handong Global University, Pohang, Gyeong-Buk 37554, South Korea
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- School of Applied Artificial Intelligence, Handong Global University, Pohang, Gyeong-Buk 37554, South Korea
6
Wen X, Xiao Y, Xiao H, Tan X, Wu B, Li Z, Wang R, Xu X, Li T. Bisphenol S induces brown adipose tissue whitening and aggravates diet-induced obesity in an estrogen-dependent manner. Cell Rep 2023; 42:113504. [PMID: 38041811 DOI: 10.1016/j.celrep.2023.113504] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 07/05/2023] [Revised: 10/06/2023] [Accepted: 11/10/2023] [Indexed: 12/04/2023]
Abstract
Bisphenol S (BPS) exposure has been implicated epidemiologically in increased obesity risk, but the underlying mechanism is unclear. Here, we propose that BPS exposure at an environmentally relevant dose aggravates diet-induced obesity in female mice by inducing brown adipose tissue (BAT) whitening. We explored the underlying mechanism by which KDM5A-associated demethylation of the trimethylation of lysine 4 on histone H3 (H3K4me3) in thermogenic genes is overactivated in BAT upon BPS exposure, leading to the reduced expression of thermogenic genes. Further studies have suggested that BPS activates KDM5A transcription in BAT by binding to glucocorticoid receptor (GR) in an estrogen-dependent manner. Estrogen-estrogen receptors facilitate the accessibility of the KDM5A gene promoter to BPS-activated GR by recruiting the activator protein 1 (AP-1) complex. These results indicate that BAT is another important target of BPS and that targeting KDM5A-related signals may serve as an approach to counteract the BPS-induced susceptibility to obesity.
Affiliation(s)
- Xue Wen
- Department of Plastic and Burn Surgery, National Clinical Research Center for Geriatrics, West China Hospital of Sichuan University, Chengdu 610041, China; Department of Anesthesiology, Laboratory of Mitochondria and Metabolism, West China Hospital of Sichuan University, Chengdu 610041, China
- Yang Xiao
- Department of Plastic and Burn Surgery, National Clinical Research Center for Geriatrics, West China Hospital of Sichuan University, Chengdu 610041, China
- Haitao Xiao
- Department of Plastic and Burn Surgery, National Clinical Research Center for Geriatrics, West China Hospital of Sichuan University, Chengdu 610041, China
- Xueqin Tan
- Department of Plastic and Burn Surgery, National Clinical Research Center for Geriatrics, West China Hospital of Sichuan University, Chengdu 610041, China; Department of Anesthesiology, Laboratory of Mitochondria and Metabolism, West China Hospital of Sichuan University, Chengdu 610041, China
- Beiyi Wu
- Department of Plastic and Burn Surgery, National Clinical Research Center for Geriatrics, West China Hospital of Sichuan University, Chengdu 610041, China; Department of Anesthesiology, Laboratory of Mitochondria and Metabolism, West China Hospital of Sichuan University, Chengdu 610041, China
- Zehua Li
- Department of Plastic and Burn Surgery, National Clinical Research Center for Geriatrics, West China Hospital of Sichuan University, Chengdu 610041, China; Department of Anesthesiology, Laboratory of Mitochondria and Metabolism, West China Hospital of Sichuan University, Chengdu 610041, China
- Ru Wang
- Department of Plastic and Burn Surgery, National Clinical Research Center for Geriatrics, West China Hospital of Sichuan University, Chengdu 610041, China
- Xuewen Xu
- Department of Plastic and Burn Surgery, National Clinical Research Center for Geriatrics, West China Hospital of Sichuan University, Chengdu 610041, China.
- Tao Li
- Department of Anesthesiology, Laboratory of Mitochondria and Metabolism, West China Hospital of Sichuan University, Chengdu 610041, China.
7
Peng Y, Song W, Teif VB, Ovcharenko I, Landsman D, Panchenko AR. Detection of new pioneer transcription factors as cell-type specific nucleosome binders. bioRxiv 2023:2023.05.10.540098. [PMID: 37425841 PMCID: PMC10327179 DOI: 10.1101/2023.05.10.540098] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/11/2023]
Abstract
Wrapping of DNA into nucleosomes restricts accessibility to the DNA and may affect the recognition of binding motifs by transcription factors. A certain class of transcription factors, the pioneer transcription factors, can specifically recognize their DNA binding sites on nucleosomes, may initiate local chromatin opening and facilitate the binding of co-factors in a cell-type-specific manner. For the majority of human pioneer transcription factors, the locations of their binding sites, mechanisms of binding and regulation remain unknown. We have developed a computational method to predict the cell-type-specific ability of transcription factors to bind nucleosomes by integrating ChIP-seq, MNase-seq and DNase-seq data with details of nucleosome structure. We have demonstrated the ability of our approach to discriminate pioneer from canonical transcription factors and predicted new potential pioneer transcription factors in H1, K562, HepG2 and HeLa cell lines. Lastly, we systematically analyzed the interaction modes between various pioneer transcription factors and detected several clusters of distinctive binding sites on nucleosomal DNA.
Affiliation(s)
- Yunhui Peng
- current address: Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan 430079, China
- National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Wei Song
- National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Vladimir B. Teif
- School of Life Sciences, University of Essex, Wivenhoe Park, Colchester, CO4 3SQ, UK
- Ivan Ovcharenko
- National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- David Landsman
- National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Anna R. Panchenko
- Department of Pathology and Molecular Medicine, Queen’s University, ON, Canada
- Department of Biology and Molecular Sciences, Queen’s University, ON, Canada
- School of Computing, Queen’s University, ON, Canada
- Ontario Institute of Cancer Research, Toronto, ON, Canada
8
Kleinschmidt H, Xu C, Bai L. Using Synthetic DNA Libraries to Investigate Chromatin and Gene Regulation. Chromosoma 2023; 132:167-189. [PMID: 37184694 PMCID: PMC10542970 DOI: 10.1007/s00412-023-00796-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2023] [Revised: 04/25/2023] [Accepted: 04/26/2023] [Indexed: 05/16/2023]
Abstract
Despite the recent explosion in genome-wide studies in chromatin and gene regulation, we are still far from extracting a set of genetic rules that can predict the function of the regulatory genome. One major reason for this deficiency is that gene regulation is a multi-layered process that involves an enormous variable space, which cannot be fully explored using native genomes. This problem can be partially solved by introducing synthetic DNA libraries into cells, a method that can test the regulatory roles of thousands to millions of sequences with limited variables. Here, we review recent applications of this method to study transcription factor (TF) binding, nucleosome positioning, and transcriptional activity. We discuss the design principles, experimental procedures, and major findings from these studies and compare the pros and cons of different approaches.
Affiliation(s)
- Holly Kleinschmidt
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, 16802, USA
- Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA, 16802, USA
- Cheng Xu
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, 16802, USA
- Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA, 16802, USA
- Lu Bai
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, 16802, USA.
- Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA, 16802, USA.
- Department of Physics, The Pennsylvania State University, University Park, PA, 16802, USA.
9
Friedman RZ, Ramu A, Lichtarge S, Myers CA, Granas DM, Gause M, Corbo JC, Cohen BA, White MA. Active learning of enhancer and silencer regulatory grammar in photoreceptors. bioRxiv 2023:2023.08.21.554146. [PMID: 37662358 PMCID: PMC10473580 DOI: 10.1101/2023.08.21.554146] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 09/05/2023]
Abstract
Cis-regulatory elements (CREs) direct gene expression in health and disease, and models that can accurately predict their activities from DNA sequences are crucial for biomedicine. Deep learning represents one emerging strategy to model the regulatory grammar that relates CRE sequence to function. However, these models require training data on a scale that exceeds the number of CREs in the genome. We address this problem using active machine learning to iteratively train models on multiple rounds of synthetic DNA sequences assayed in live mammalian retinas. During each round of training the model actively selects sequence perturbations to assay, thereby efficiently generating informative training data. We iteratively trained a model that predicts the activities of sequences containing binding motifs for the photoreceptor transcription factor Cone-rod homeobox (CRX) using an order of magnitude less training data than current approaches. The model's internal confidence estimates of its predictions are reliable guides for designing sequences with high activity. The model correctly identified critical sequence differences between active and inactive sequences with nearly identical transcription factor binding sites, and revealed order and spacing preferences for combinations of motifs. Our results establish active learning as an effective method to train accurate deep learning models of cis-regulatory function after exhausting naturally occurring training examples in the genome.
Affiliation(s)
- Ryan Z. Friedman
- The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine, Saint Louis, MO, 63110
- Department of Genetics, Washington University School of Medicine, Saint Louis, MO, 63110
- Avinash Ramu
- The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine, Saint Louis, MO, 63110
- Department of Genetics, Washington University School of Medicine, Saint Louis, MO, 63110
- Sara Lichtarge
- The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine, Saint Louis, MO, 63110
- Department of Genetics, Washington University School of Medicine, Saint Louis, MO, 63110
- Connie A. Myers
- Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, MO, 63110
- David M. Granas
- The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine, Saint Louis, MO, 63110
- Department of Genetics, Washington University School of Medicine, Saint Louis, MO, 63110
- Maria Gause
- Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, MO, 63110
- Joseph C. Corbo
- Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, MO, 63110
- Barak A. Cohen
- The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine, Saint Louis, MO, 63110
- Department of Genetics, Washington University School of Medicine, Saint Louis, MO, 63110
- Michael A. White
- The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine, Saint Louis, MO, 63110
- Department of Genetics, Washington University School of Medicine, Saint Louis, MO, 63110
10
Baniulyte G, Durham SA, Merchant LE, Sammons MA. Shared Gene Targets of the ATF4 and p53 Transcriptional Networks. Mol Cell Biol 2023; 43:426-449. [PMID: 37533313 PMCID: PMC10448979 DOI: 10.1080/10985549.2023.2229225] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 03/28/2023] [Revised: 06/12/2023] [Accepted: 06/20/2023] [Indexed: 08/04/2023]
Abstract
The master tumor suppressor p53 regulates multiple cell fate decisions, such as cell cycle arrest and apoptosis, via transcriptional control of a broad gene network. Dysfunction in the p53 network is common in cancer, often through mutations that inactivate p53 or other members of the pathway. Induction of tumor-specific cell death by restoration of p53 activity without off-target effects has gained significant interest in the field. In this study, we explore the gene regulatory mechanisms underlying a putative anticancer strategy involving stimulation of the p53-independent integrated stress response (ISR). Our data demonstrate the p53 and ISR pathways converge to independently regulate common metabolic and proapoptotic genes. We investigated the architecture of multiple gene regulatory elements bound by p53 and the ISR effector ATF4 controlling this shared regulation. We identified additional key transcription factors that control basal and stress-induced regulation of these shared p53 and ATF4 target genes. Thus, our results provide significant new molecular and genetic insight into gene regulatory networks and transcription factors that are the target of numerous antitumor therapies.
Affiliation(s)
- Gabriele Baniulyte
- Department of Biological Sciences, The RNA Institute, University at Albany, State University of New York, Albany, New York, USA
- Serene A. Durham
- Department of Biological Sciences, The RNA Institute, University at Albany, State University of New York, Albany, New York, USA
- Lauren E. Merchant
- Department of Biological Sciences, The RNA Institute, University at Albany, State University of New York, Albany, New York, USA
- Morgan A. Sammons
- Department of Biological Sciences, The RNA Institute, University at Albany, State University of New York, Albany, New York, USA
11
Baniulyte G, Durham SA, Merchant LE, Sammons MA. Shared gene targets of the ATF4 and p53 transcriptional networks. bioRxiv 2023:2023.03.15.532778. [PMID: 36993734 PMCID: PMC10055071 DOI: 10.1101/2023.03.15.532778] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
The master tumor suppressor p53 regulates multiple cell fate decisions, like cell cycle arrest and apoptosis, via transcriptional control of a broad gene network. Dysfunction in the p53 network is common in cancer, often through mutations that inactivate p53 or other members of the pathway. Induction of tumor-specific cell death by restoration of p53 activity without off-target effects has gained significant interest in the field. In this study, we explore the gene regulatory mechanisms underlying a putative anti-cancer strategy involving stimulation of the p53-independent Integrated Stress Response (ISR). Our data demonstrate the p53 and ISR pathways converge to independently regulate common metabolic and pro-apoptotic genes. We investigated the architecture of multiple gene regulatory elements bound by p53 and the ISR effector ATF4 controlling this shared regulation. We identified additional key transcription factors that control basal and stress-induced regulation of these shared p53 and ATF4 target genes. Thus, our results provide significant new molecular and genetic insight into gene regulatory networks and transcription factors that are the target of numerous antitumor therapies.
Affiliation(s)
- Gabriele Baniulyte
- Department of Biological Sciences and The RNA Institute, University at Albany, State University of New York, Albany, NY, USA
- Serene A. Durham
- Department of Biological Sciences and The RNA Institute, University at Albany, State University of New York, Albany, NY, USA
- Lauren E. Merchant
- Department of Biological Sciences and The RNA Institute, University at Albany, State University of New York, Albany, NY, USA
- Morgan A. Sammons
- Department of Biological Sciences and The RNA Institute, University at Albany, State University of New York, Albany, NY, USA
12
Takemon Y, LeBlanc VG, Song J, Chan SY, Lee SD, Trinh DL, Ahmad ST, Brothers WR, Corbett RD, Gagliardi A, Moradian A, Cairncross JG, Yip S, Aparicio SAJR, Chan JA, Hughes CS, Morin GB, Gorski SM, Chittaranjan S, Marra MA. Multi-Omic Analysis of CIC's Functional Networks Reveals Novel Interaction Partners and a Potential Role in Mitotic Fidelity. Cancers (Basel) 2023; 15:2805. [PMID: 37345142 PMCID: PMC10216487 DOI: 10.3390/cancers15102805] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2023] [Revised: 05/11/2023] [Accepted: 05/15/2023] [Indexed: 06/23/2023]
Abstract
CIC encodes a transcriptional repressor and MAPK signalling effector that is inactivated by loss-of-function mutations in several cancer types, consistent with a role as a tumour suppressor. Here, we used bioinformatic, genomic, and proteomic approaches to investigate CIC's interaction networks. We observed both previously identified and novel candidate interactions between CIC and SWI/SNF complex members, as well as novel interactions between CIC and cell cycle regulators and RNA processing factors. We found that CIC loss is associated with an increased frequency of mitotic defects in human cell lines and an in vivo mouse model and with dysregulated expression of mitotic regulators. We also observed aberrant splicing in CIC-deficient cell lines, predominantly at 3' and 5' untranslated regions of genes, including genes involved in MAPK signalling, DNA repair, and cell cycle regulation. Our study thus characterises the complexity of CIC's functional network and describes the effect of its loss on cell cycle regulation, mitotic integrity, and transcriptional splicing, thereby expanding our understanding of CIC's potential roles in cancer. In addition, our work exemplifies how multi-omic, network-based analyses can be used to uncover novel insights into the interconnected functions of pleiotropic genes/proteins across cellular contexts.
Affiliation(s)
- Yuka Takemon
- Genome Science and Technology Graduate Program, University of British Columbia, Vancouver, BC V5Z 4S6, Canada
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer Research Institute, Vancouver, BC V5Z 1L3, Canada
| | - Véronique G. LeBlanc
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer Research Institute, Vancouver, BC V5Z 1L3, Canada
| | - Jungeun Song
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer Research Institute, Vancouver, BC V5Z 1L3, Canada
| | - Susanna Y. Chan
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer Research Institute, Vancouver, BC V5Z 1L3, Canada
| | - Stephen Dongsoo Lee
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer Research Institute, Vancouver, BC V5Z 1L3, Canada
| | - Diane L. Trinh
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer Research Institute, Vancouver, BC V5Z 1L3, Canada
| | - Shiekh Tanveer Ahmad
- Department of Pathology & Laboratory Medicine, University of Calgary, Calgary, AB T2N 1N4, Canada
- Arnie Charbonneau Cancer Institute, University of Calgary, Calgary, AB T2N 4Z6, Canada
- Alberta Children’s Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada
| | - William R. Brothers
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer Research Institute, Vancouver, BC V5Z 1L3, Canada
| | - Richard D. Corbett
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer Research Institute, Vancouver, BC V5Z 1L3, Canada
| | - Alessia Gagliardi
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer Research Institute, Vancouver, BC V5Z 1L3, Canada
| | - Annie Moradian
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer Research Institute, Vancouver, BC V5Z 1L3, Canada
| | - J. Gregory Cairncross
- Arnie Charbonneau Cancer Institute, University of Calgary, Calgary, AB T2N 4Z6, Canada
- Department of Clinical Neurosciences, University of Calgary, Calgary, AB T2N 1N4, Canada
| | - Stephen Yip
- Department of Molecular Oncology, BC Cancer Research Institute, Vancouver, BC V5Z 1L3, Canada
- Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, BC V6T 1Z7, Canada
| | - Samuel A. J. R. Aparicio
- Department of Molecular Oncology, BC Cancer Research Institute, Vancouver, BC V5Z 1L3, Canada
- Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, BC V6T 1Z7, Canada
| | - Jennifer A. Chan
- Department of Pathology & Laboratory Medicine, University of Calgary, Calgary, AB T2N 1N4, Canada
- Arnie Charbonneau Cancer Institute, University of Calgary, Calgary, AB T2N 4Z6, Canada
- Alberta Children’s Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada
| | - Christopher S. Hughes
- Department of Molecular Oncology, BC Cancer Research Institute, Vancouver, BC V5Z 1L3, Canada
| | - Gregg B. Morin
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer Research Institute, Vancouver, BC V5Z 1L3, Canada
- Department of Medical Genetics, University of British Columbia, Vancouver, BC V6H 3N1, Canada
| | - Sharon M. Gorski
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer Research Institute, Vancouver, BC V5Z 1L3, Canada
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC V5A 1S6, Canada
| | - Suganthi Chittaranjan
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer Research Institute, Vancouver, BC V5Z 1L3, Canada
| | - Marco A. Marra
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer Research Institute, Vancouver, BC V5Z 1L3, Canada
- Department of Medical Genetics, University of British Columbia, Vancouver, BC V6H 3N1, Canada
| |
|
13
|
Nguyen HT, Martin LJ. Regulation of Cdh2 by the AP-1 family transcription factor Junb in TM4 Sertoli cells. Biochem Biophys Res Commun 2023; 663:32-40. [PMID: 37119763 DOI: 10.1016/j.bbrc.2023.04.078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2023] [Revised: 04/15/2023] [Accepted: 04/23/2023] [Indexed: 05/01/2023]
Abstract
Cadherins are transmembrane proteins that mediate cell-to-cell adhesion and various cellular processes. In Sertoli cells of the testis, Cdh2 contributes to the development of the testis and to the formation of the blood-testis barrier, which is essential for the protection of germ cells. Analyses of chromatin accessibility and epigenetic marks in adult mouse testis have shown that the region from -800 to +900 bp relative to the Cdh2 transcription start site (TSS) is likely the active regulatory region of this gene. In addition, the JASPAR 2022 matrix has predicted an AP-1 binding element at about -600 bp. Transcription factors of the activator protein 1 (AP-1) family have been implicated in the regulation of the expression of genes encoding cell-to-cell interaction proteins such as Gja1, Nectin2 and Cdh3. To test the potential regulation of Cdh2 by members of the AP-1 family, siRNAs were transfected into TM4 Sertoli cells. The knockdown of Junb led to a decrease in Cdh2 expression. ChIP-qPCR and luciferase reporter assays with site-directed mutagenesis confirmed the recruitment of Junb to several AP-1 regulatory elements in the proximal region of the Cdh2 promoter in TM4 cells. Further investigation with luciferase reporter assays showed that other AP-1 members can also activate the Cdh2 promoter, albeit to a lesser extent than Junb. Taken together, these data suggest that in TM4 Sertoli cells, Junb is responsible for the regulation of Cdh2 expression, which requires its recruitment to the proximal region of the Cdh2 promoter.
Affiliation(s)
- Ha Tuyen Nguyen
- Biology Department, Université de Moncton, Moncton, New Brunswick, E1A 3E9, Canada
| | - Luc J Martin
- Biology Department, Université de Moncton, Moncton, New Brunswick, E1A 3E9, Canada
| |
|
14
|
Zhao S, Hong CKY, Myers CA, Granas DM, White MA, Corbo JC, Cohen BA. A single-cell massively parallel reporter assay detects cell-type-specific gene regulation. Nat Genet 2023; 55:346-354. [PMID: 36635387 PMCID: PMC9931678 DOI: 10.1038/s41588-022-01278-7] [Citation(s) in RCA: 42] [Impact Index Per Article: 21.0] [Received: 12/07/2021] [Accepted: 12/05/2022] [Indexed: 01/14/2023]
Abstract
Massively parallel reporter gene assays are key tools in regulatory genomics but cannot be used to identify cell-type-specific regulatory elements without performing assays serially across different cell types. To address this problem, we developed a single-cell massively parallel reporter assay (scMPRA) to measure the activity of libraries of cis-regulatory sequences (CRSs) across multiple cell types simultaneously. We assayed a library of core promoters in a mixture of HEK293 and K562 cells and showed that scMPRA is a reproducible, highly parallel, single-cell reporter gene assay that detects cell-type-specific cis-regulatory activity. We then measured a library of promoter variants across multiple cell types in live mouse retinas and showed that subtle genetic variants can produce cell-type-specific effects on cis-regulatory activity. We anticipate that scMPRA will be widely applicable for studying the role of CRSs across diverse cell types.
Affiliation(s)
- Siqi Zhao
- Edison Family Center for Systems Biology and Genome Sciences, Washington University School of Medicine, St. Louis, MO, USA
- Department of Genetics, Washington University School of Medicine, St. Louis, MO, USA
- Ginkgo Bioworks, Boston, MA, USA
| | - Clarice K Y Hong
- Edison Family Center for Systems Biology and Genome Sciences, Washington University School of Medicine, St. Louis, MO, USA
- Department of Genetics, Washington University School of Medicine, St. Louis, MO, USA
| | - Connie A Myers
- Department of Pathology and Immunology, Washington University School of Medicine, St. Louis, MO, USA
| | - David M Granas
- Edison Family Center for Systems Biology and Genome Sciences, Washington University School of Medicine, St. Louis, MO, USA
- Department of Genetics, Washington University School of Medicine, St. Louis, MO, USA
| | - Michael A White
- Edison Family Center for Systems Biology and Genome Sciences, Washington University School of Medicine, St. Louis, MO, USA
- Department of Genetics, Washington University School of Medicine, St. Louis, MO, USA
| | - Joseph C Corbo
- Department of Pathology and Immunology, Washington University School of Medicine, St. Louis, MO, USA
| | - Barak A Cohen
- Edison Family Center for Systems Biology and Genome Sciences, Washington University School of Medicine, St. Louis, MO, USA
- Department of Genetics, Washington University School of Medicine, St. Louis, MO, USA
| |
|
15
|
Nguyen HT, Martin LJ. The transcription factors Junb and Fosl2 cooperate to regulate Cdh3 expression in 15P-1 Sertoli cells. Mol Reprod Dev 2023; 90:27-41. [PMID: 36468795 DOI: 10.1002/mrd.23656] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/08/2022] [Revised: 10/31/2022] [Accepted: 11/18/2022] [Indexed: 12/12/2022]
Abstract
In Sertoli cells of the testis, cadherins (Cdh) are important cell-to-cell interaction proteins and contribute to the formation of the blood-testis barrier, which is essential for the protection of germ cells. P-cadherin or Cdh3 is only expressed in Sertoli cells from embryonic to prepubertal development. Interestingly, the expression profile of Cdh3 correlates with that of activating protein-1 (AP-1) transcription factors during Sertoli cell development. To assess their potential implications in the regulation of Cdh3, different AP-1 transcription factors were overexpressed in 15P-1 Sertoli cells. We found that overexpression of Junb and Fosl2 activated the Cdh3 promoter. ChIP-qPCR assay and luciferase reporter assay with 5' promoter deletions and site-directed mutagenesis confirmed the recruitment of Junb and Fosl2 to an AP-1 regulatory element at -47 bp in the proximal region of the Cdh3 promoter in 15P-1 cells. These findings were further supported by histone modification markers and chromatin accessibility surrounding the Cdh3 promoter in mouse testis. Moreover, the knockdown of Junb and/or Fosl2 by siRNA decreased Cdh3 protein levels. Taken together, these data suggest that in 15P-1 Sertoli cells, the AP-1 family members Junb and Fosl2 are responsible for the regulation of Cdh3 expression, which requires the recruitment of both factors to the proximal region of the Cdh3 promoter.
Affiliation(s)
- Ha T Nguyen
- Department of Biology, Université de Moncton, Moncton, New Brunswick, Canada
| | - Luc J Martin
- Department of Biology, Université de Moncton, Moncton, New Brunswick, Canada
| |
|
16
|
Du AY, Zhuo X, Sundaram V, Jensen NO, Chaudhari HG, Saccone NL, Cohen BA, Wang T. Functional characterization of enhancer activity during a long terminal repeat's evolution. Genome Res 2022; 32:1840-1851. [PMID: 36192170 PMCID: PMC9712623 DOI: 10.1101/gr.276863.122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2022] [Accepted: 08/23/2022] [Indexed: 11/24/2022]
Abstract
Many transposable elements (TEs) contain transcription factor binding sites and are implicated as potential regulatory elements. However, TEs are rarely functionally tested for regulatory activity, which in turn limits our understanding of how TE regulatory activity has evolved. We systematically tested the human LTR18A subfamily for regulatory activity using a massively parallel reporter assay (MPRA) and found AP-1- and CEBP-related binding motifs as drivers of enhancer activity. Functional analysis of evolutionarily reconstructed ancestral sequences revealed that LTR18A elements have generally lost regulatory activity over time through sequence changes, with the largest effects occurring owing to mutations in the AP-1 and CEBP motifs. We observed that the two motifs are conserved at higher rates than expected based on neutral evolution. Finally, we identified LTR18A elements as potential enhancers in the human genome, primarily in epithelial cells. Together, our results provide a model for the origin, evolution, and co-option of TE-derived regulatory elements.
Affiliation(s)
- Alan Y Du
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
| | - Xiaoyu Zhuo
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
| | - Vasavi Sundaram
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
| | - Nicholas O Jensen
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- Division of Biostatistics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- Department of Developmental Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
| | - Hemangi G Chaudhari
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
| | - Nancy L Saccone
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- Division of Biostatistics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
| | - Barak A Cohen
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
| | - Ting Wang
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
| |
|
17
|
Yang MG, Ling E, Cowley CJ, Greenberg ME, Vierbuchen T. Characterization of sequence determinants of enhancer function using natural genetic variation. eLife 2022; 11:76500. [PMID: 36043696 PMCID: PMC9662815 DOI: 10.7554/elife.76500] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 12/18/2021] [Accepted: 08/30/2022] [Indexed: 02/04/2023] Open
Abstract
Sequence variation in enhancers that control cell-type-specific gene transcription contributes significantly to phenotypic variation within human populations. However, it remains difficult to predict precisely the effect of any given sequence variant on enhancer function due to the complexity of DNA sequence motifs that determine transcription factor (TF) binding to enhancers in their native genomic context. Using F1-hybrid cells derived from crosses between distantly related inbred strains of mice, we identified thousands of enhancers with allele-specific TF binding and/or activity. We find that genetic variants located within the central region of enhancers are most likely to alter TF binding and enhancer activity. We observe that the AP-1 family of TFs (Fos/Jun) are frequently required for binding of TEAD TFs and for enhancer function. However, many sequence variants outside of core motifs for AP-1 and TEAD also impact enhancer function, including sequences flanking core TF motifs and AP-1 half sites. Taken together, these data represent one of the most comprehensive assessments of allele-specific TF binding and enhancer function to date and reveal how sequence changes at enhancers alter their function across evolutionary timescales.
Affiliation(s)
- Marty G Yang
- Department of Neurobiology, Harvard Medical School, Boston, United States; Program in Neuroscience, Harvard Medical School, Boston, United States
| | - Emi Ling
- Department of Neurobiology, Harvard Medical School, Boston, United States
| | | | | | - Thomas Vierbuchen
- Developmental Biology Program, Sloan Kettering Institute for Cancer Research, New York, United States; Center for Stem Cell Biology, Sloan Kettering Institute for Cancer Research, New York, United States
| |
|
18
|
Steinauer N, Zhang K, Guo C, Zhang J. Computational Modeling of Gene-Specific Transcriptional Repression, Activation and Chromatin Interactions in Leukemogenesis by LASSO-Regularized Logistic Regression. IEEE/ACM Trans Comput Biol Bioinform 2021; 18:2109-2122. [PMID: 33961561 PMCID: PMC8572318 DOI: 10.1109/tcbb.2021.3078128] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 05/14/2023]
Abstract
Many physiological and pathological pathways are dependent on gene-specific on/off regulation of transcription. Some genes are repressed, while others are activated. Although many previous studies have analyzed the mechanisms of gene-specific repression and activation, these studies are mainly based on the use of candidate genes, which are either repressed or activated, without simultaneously comparing and contrasting both groups of genes. There is also insufficient consideration of gene locations. Here we describe an integrated machine learning approach, using LASSO-regularized logistic regression, to model gene-specific repression and activation and the underlying contribution of chromatin interactions. LASSO-regularized logistic regression accurately predicted gene-specific transcriptional events and robustly detected the rate-limiting factors that underlie the differences of gene activation and repression. An example was provided by the leukemogenic transcription factor AML1-ETO, which is responsible for 10-15 percent of all acute myeloid leukemia cases. The analysis of AML1-ETO has also revealed novel networks of chromatin interactions and uncovered an unexpected role for E-proteins in AML1-ETO-p300 interactions and a role for the pre-existing gene state in governing the transcriptional response. Our results show that logistic regression-based probabilistic modeling is a promising tool to decipher mechanisms that integrate gene regulation and chromatin interactions in regulated transcription.
|
19
|
Zebrafish (Danio rerio) ecotoxicological ABCB4, ABCC1 and ABCG2a gene promoters depict spatiotemporal xenobiotic multidrug resistance properties against environmental pollutants. Gene Reports 2021. [DOI: 10.1016/j.genrep.2021.101110] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 01/20/2023]
|
20
|
Niu X, Deng K, Liu L, Yang K, Hu X. A statistical framework for predicting critical regions of p53-dependent enhancers. Brief Bioinform 2021; 22:bbaa053. [PMID: 32392580 PMCID: PMC8138796 DOI: 10.1093/bib/bbaa053] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 01/07/2020] [Revised: 02/26/2020] [Indexed: 12/13/2022] Open
Abstract
P53 is the 'guardian of the genome' and is responsible for regulating cell cycle and apoptosis. The genomic p53 binding regions, where activating transcription factors and cofactors like p300 simultaneously bind, are called 'p53-dependent enhancers', which play an important role in tumorigenesis. Current experimental assays generally provide a broad peak for each enhancer element, leaving our knowledge about critical enhancer regions (CERs) limited. Inspired by enhancer dissection with a CRISPR-Cas9 screen library on genome-wide p53 binding sites, here we introduce a statistical framework called 'Computational CRISPR Strategy' (CCS) to predict whether a given DNA fragment will be a p53-dependent CER, using 7-mers as features and a random forest as the regressor. When trained on a p53 CRISPR enhancer dataset, CCS not only accurately fitted the top-ranked enriched single guide RNAs (sgRNAs) but also successfully reproduced two known CERs that were validated by experiments. When applied to an independent testing dataset, a tiling of a 2-kb genomic region of CRISPR-deCDKN1A-Lib, the trained model shows great generalizability by identifying a CER containing five top-ranked sgRNAs. A feature importance analysis further indicates that top-ranked 7-mers map onto informative TF motifs including POU5F1 and SOX5, which are differentially enriched in p53-dependent CERs and are potential factors that turn a general p53 binding site into a p53-dependent CER, providing interpretability for the trained model. Our results demonstrate that CCS is an alternative to CRISPR experiments for screening the genome to map p53-dependent CERs.
Affiliation(s)
| | | | | | | | - Xuehai Hu
- Corresponding author: Xuehai Hu, College of Informatics, Hubei Key Laboratory of Agricultural Bioinformatics, Huazhong Agricultural University, Wuhan, Hubei, 430070, P.R. China. Tel.: +86-18171282783; Fax: +86-27-87288509
| |
|
21
|
Seo J, Koçak DD, Bartelt LC, Williams CA, Barrera A, Gersbach CA, Reddy TE. AP-1 subunits converge promiscuously at enhancers to potentiate transcription. Genome Res 2021; 31:538-550. [PMID: 33674350 PMCID: PMC8015846 DOI: 10.1101/gr.267898.120] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 06/25/2020] [Accepted: 02/17/2021] [Indexed: 12/12/2022]
Abstract
The AP-1 transcription factor (TF) dimer contributes to many biological processes and environmental responses. AP-1 can be composed of many interchangeable subunits. Unambiguously determining the binding locations of these subunits in the human genome is challenging because of variable antibody specificity and affinity. Here, we definitively establish the genome-wide binding patterns of five AP-1 subunits by using CRISPR to introduce a common antibody tag on each subunit. We find limited evidence for strong dimerization preferences between subunits at steady state and find that, under a stimulus, dimerization patterns reflect changes in the transcriptome. Further, our analysis suggests that canonical AP-1 motifs indiscriminately recruit all AP-1 subunits to genomic sites, which we term AP-1 hotspots. We find that AP-1 hotspots are predictive of cell type–specific gene expression and of genomic responses to glucocorticoid signaling (more so than super-enhancers) and are significantly enriched in disease-associated genetic variants. Together, these results support a model where promiscuous binding of many AP-1 subunits to the same genomic location plays a key role in regulating cell type–specific gene expression and environmental responses.
Affiliation(s)
- Jungkyun Seo
- Department of Biostatistics and Bioinformatics, Division of Integrative Genomics, Duke University Medical Center, Durham, North Carolina 27708, USA; Computational Biology and Bioinformatics Graduate Program, Duke University, Durham, North Carolina 27708, USA; Center for Genomic and Computational Biology, Duke University, Durham, North Carolina 27708, USA; Center for Advanced Genomic Technologies, Duke University, Durham, North Carolina 27708, USA
| | - D Dewran Koçak
- Center for Genomic and Computational Biology, Duke University, Durham, North Carolina 27708, USA; Center for Advanced Genomic Technologies, Duke University, Durham, North Carolina 27708, USA; Department of Biomedical Engineering, Duke University, Durham, North Carolina 27708, USA
| | - Luke C Bartelt
- Center for Genomic and Computational Biology, Duke University, Durham, North Carolina 27708, USA; University Program in Genetics and Genomics, Duke University, Durham, North Carolina 27708, USA
| | - Courtney A Williams
- Center for Genomic and Computational Biology, Duke University, Durham, North Carolina 27708, USA; Center for Advanced Genomic Technologies, Duke University, Durham, North Carolina 27708, USA
| | - Alejandro Barrera
- Department of Biostatistics and Bioinformatics, Division of Integrative Genomics, Duke University Medical Center, Durham, North Carolina 27708, USA; Center for Genomic and Computational Biology, Duke University, Durham, North Carolina 27708, USA; Center for Advanced Genomic Technologies, Duke University, Durham, North Carolina 27708, USA
| | - Charles A Gersbach
- Computational Biology and Bioinformatics Graduate Program, Duke University, Durham, North Carolina 27708, USA; Center for Genomic and Computational Biology, Duke University, Durham, North Carolina 27708, USA; Center for Advanced Genomic Technologies, Duke University, Durham, North Carolina 27708, USA; Department of Biomedical Engineering, Duke University, Durham, North Carolina 27708, USA; University Program in Genetics and Genomics, Duke University, Durham, North Carolina 27708, USA; Department of Surgery, Duke University Medical Center, Durham, North Carolina 27708, USA
| | - Timothy E Reddy
- Department of Biostatistics and Bioinformatics, Division of Integrative Genomics, Duke University Medical Center, Durham, North Carolina 27708, USA; Computational Biology and Bioinformatics Graduate Program, Duke University, Durham, North Carolina 27708, USA; Center for Genomic and Computational Biology, Duke University, Durham, North Carolina 27708, USA; Center for Advanced Genomic Technologies, Duke University, Durham, North Carolina 27708, USA; Department of Biomedical Engineering, Duke University, Durham, North Carolina 27708, USA; University Program in Genetics and Genomics, Duke University, Durham, North Carolina 27708, USA; Department of Molecular Genetics and Microbiology, Duke University, Durham, North Carolina 27708, USA
| |
|
22
|
Senitzki A, Safieh J, Sharma V, Golovenko D, Danin-Poleg Y, Inga A, Haran TE. The complex architecture of p53 binding sites. Nucleic Acids Res 2021; 49:1364-1382. [PMID: 33444431 PMCID: PMC7897521 DOI: 10.1093/nar/gkaa1283] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 08/05/2020] [Revised: 12/22/2020] [Accepted: 12/24/2020] [Indexed: 12/13/2022] Open
Abstract
Sequence-specific protein-DNA interactions are at the heart of the response of the tumor-suppressor p53 to numerous physiological and stress-related signals. Large variability has been previously reported in p53 binding to and transactivating from p53 response elements (REs) due, at least in part, to changes in direct (base) and indirect (shape) readouts of p53 REs. Here, we dissect p53 REs to decipher the mechanism by which p53 optimizes this highly regulated variable level of interaction with its DNA binding sites. We show that hemi-specific binding is more prevalent in p53 REs than previously envisioned. We reveal that sequences flanking the REs modulate p53 binding and activity and show that these effects extend to 4–5 bp from the REs. Moreover, we show here that the arrangement of p53 half-sites within its REs, relative to transcription direction, has been fine-tuned by selection pressure to optimize and regulate the response levels from p53 REs. This directionality in the REs arrangement is at least partly encoded in the structural properties of the REs. Furthermore, we show here that in the p21-5′ RE the orientation of the half-sites is such that the effect of the flanking sequences is minimized and we discuss its advantages.
Affiliation(s)
- Alon Senitzki
- Department of Biology, Technion - Israel Institute of Technology, Technion City, Haifa 3200003, Israel
| | - Jessy Safieh
- Department of Biology, Technion - Israel Institute of Technology, Technion City, Haifa 3200003, Israel
| | - Vasundhara Sharma
- Department of Cellular, Computational and Integrative Biology (CIBIO), University of Trento, via Sommarive 9, 38123 Trento, TN, Italy
| | - Dmitrij Golovenko
- Department of Biology, Technion - Israel Institute of Technology, Technion City, Haifa 3200003, Israel
| | - Yael Danin-Poleg
- Department of Biology, Technion - Israel Institute of Technology, Technion City, Haifa 3200003, Israel
| | - Alberto Inga
- Department of Cellular, Computational and Integrative Biology (CIBIO), University of Trento, via Sommarive 9, 38123 Trento, TN, Italy
| | - Tali E Haran
- Department of Biology, Technion - Israel Institute of Technology, Technion City, Haifa 3200003, Israel
| |
|
23
|
Liu L, Zhang G, He S, Hu X. TSPTFBS: a docker image for Trans-Species Prediction of Transcription Factor Binding Sites in Plants. Bioinformatics 2021; 37:260-262. [PMID: 33416862 DOI: 10.1093/bioinformatics/btaa1100] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/07/2020] [Revised: 12/18/2020] [Accepted: 12/29/2020] [Indexed: 01/19/2023] Open
Abstract
MOTIVATION: Because experimental data on transcription factor binding sites (TFBSs) in plants are lacking or limited, and because plant TFs have evolved independently, computational approaches for identifying plant TFBSs lag behind the corresponding research in humans. Observing that TFs are highly conserved among plant species, here we first employ a deep convolutional neural network (DeepCNN) to build 265 Arabidopsis TFBS prediction models based on available DAP-seq (DNA affinity purification sequencing) datasets, and then transfer them to homologous TFs in other plants. RESULTS: DeepCNN not only achieves better performance on Arabidopsis TFBS prediction than gkm-SVM and MEME, but also learns the known motif for most Arabidopsis TFs, as well as cooperative TF motifs supported by PPI (protein-protein interaction) evidence, which provides biological interpretability. Under the idea of transfer learning, trans-species prediction performance on ten TFs from three other plants (Oryza sativa, Zea mays and Glycine max) demonstrates the feasibility of the current strategy. AVAILABILITY AND IMPLEMENTATION: The 265 trained Arabidopsis TFBS prediction models were packaged in a Docker image named TSPTFBS, which is freely available on DockerHub at https://hub.docker.com/r/vanadiummm/tsptfbs. Source code and documentation are available on GitHub at: https://github.com/liulifenyf/TSPTFBS.
Affiliation(s)
- Lifen Liu
- College of Informatics, Hubei Key Laboratory of Agricultural Bioinformatics, Huazhong Agricultural University, Wuhan, Hubei, P.R. of China
| | - Ge Zhang
- College of Informatics, Hubei Key Laboratory of Agricultural Bioinformatics, Huazhong Agricultural University, Wuhan, Hubei, P.R. of China
| | - Shoupeng He
- College of Informatics, Hubei Key Laboratory of Agricultural Bioinformatics, Huazhong Agricultural University, Wuhan, Hubei, P.R. of China
| | - Xuehai Hu
- College of Informatics, Hubei Key Laboratory of Agricultural Bioinformatics, Huazhong Agricultural University, Wuhan, Hubei, P.R. of China
| |
|
24
|
King DM, Hong CKY, Shepherdson JL, Granas DM, Maricque BB, Cohen BA. Synthetic and genomic regulatory elements reveal aspects of cis-regulatory grammar in mouse embryonic stem cells. eLife 2020; 9:41279. [PMID: 32043966 PMCID: PMC7077988 DOI: 10.7554/elife.41279] [Citation(s) in RCA: 43] [Impact Index Per Article: 8.6] [Received: 08/20/2018] [Accepted: 02/07/2020] [Indexed: 01/08/2023]
Abstract
In embryonic stem cells (ESCs), a core transcription factor (TF) network establishes the gene expression program necessary for pluripotency. To address how interactions between four key TFs contribute to cis-regulation in mouse ESCs, we assayed two massively parallel reporter assay (MPRA) libraries composed of binding sites for SOX2, POU5F1 (OCT4), KLF4, and ESRRB. Comparisons between synthetic cis-regulatory elements and genomic sequences with comparable binding site configurations revealed some aspects of a regulatory grammar. The expression of synthetic elements is influenced by both the number and arrangement of binding sites. This grammar plays only a small role for genomic sequences, as the relative activities of genomic sequences are best explained by the predicted occupancy of binding sites, regardless of binding site identity and positioning. Our results suggest that the effects of transcription factor binding sites (TFBS) are influenced by the order and orientation of sites, but that in the genome the overall occupancy of TFs is the primary determinant of activity.
Transcription factors are proteins that flip genetic switches; their role is to control when and where genes are active. They do this by binding to short stretches of DNA called cis-regulatory sequences. Each sequence can have several binding sites for different transcription factors, but it is largely unclear whether the transcription factors binding to the same regulatory sequence actually work together. It is possible that each transcription factor may work independently and there only needs to be a critical mass of transcription factors bound to throw the genetic switch. If this is the case, the most important features of a cis-regulatory sequence should be the number of binding sites it contains, and how tightly the transcription factors bind to those sites. The more transcription factors and the more strongly they bind, the more active the gene should be.
An alternative option is that certain transcription factors may work better together, enhancing each other's effects such that the total effect is more than the sum of its parts. If this is true, the order, orientation and spacing of the binding sites within a sequence should matter more than the number. One way to distinguish between these possibilities is to study mouse embryonic stem cells, which have a core set of four transcription factors. Looking directly at a real genome, however, can be confusing and it is difficult to measure the effects of different cis-regulatory sequences because genes differ in so many other ways. To tackle this problem, King et al. created a synthetic set of cis-regulatory sequences based on the four core transcription factors found in mouse stem cells. The synthetic set had every combination of two, three or four of the binding sites, with each site either facing forwards or backwards along the DNA strand. King et al. attached each of the synthetic cis-regulatory sequences to a reporter gene to find out how well each sequence performed. This revealed that the cis-regulatory sequences with the most binding sites and the tightest binding affinities work best, suggesting that transcription factors mainly work independently. There was evidence of some interaction between some transcription factors, because, of the synthetic sequences with four binding sites, some worked better than others, and there were patterns in the most effective binding site combinations. However, these effects were small and when King et al. went on to test sequences from the real mouse genome, the most important factor by far was the number of binding sites. Synthetic libraries of DNA sequences allow researchers to examine gene regulation more clearly than is possible in real genomes. Yet this approach does have its limitations and it is impossible to capture every type of cis-regulatory sequence in one library.
The next step to extend this work is to combine the two approaches, taking sequences from the real genome and manipulating them one by one. This could help to unravel the rules that govern how cis-regulatory sequences work in real cells.
Affiliation(s)
- Dana M King
- Edison Center for Genome Sciences and Systems Biology, Washington University in St. Louis, St. Louis, United States; Department of Genetics, Washington University in St. Louis, St. Louis, United States
| | - Clarice Kit Yee Hong
- Edison Center for Genome Sciences and Systems Biology, Washington University in St. Louis, St. Louis, United States; Department of Genetics, Washington University in St. Louis, St. Louis, United States
| | - James L Shepherdson
- Edison Center for Genome Sciences and Systems Biology, Washington University in St. Louis, St. Louis, United States; Department of Genetics, Washington University in St. Louis, St. Louis, United States
| | - David M Granas
- Edison Center for Genome Sciences and Systems Biology, Washington University in St. Louis, St. Louis, United States; Department of Genetics, Washington University in St. Louis, St. Louis, United States
| | - Brett B Maricque
- Edison Center for Genome Sciences and Systems Biology, Washington University in St. Louis, St. Louis, United States; Department of Genetics, Washington University in St. Louis, St. Louis, United States
| | - Barak A Cohen
- Edison Center for Genome Sciences and Systems Biology, Washington University in St. Louis, St. Louis, United States; Department of Genetics, Washington University in St. Louis, St. Louis, United States
| |
|
25
|
Determinants of enhancer and promoter activities of regulatory elements. Nat Rev Genet 2019; 21:71-87. [DOI: 10.1038/s41576-019-0173-8] [Citation(s) in RCA: 405] [Impact Index Per Article: 67.5] [Accepted: 09/04/2019] [Indexed: 12/13/2022]
|
26
|
Bejjani F, Evanno E, Zibara K, Piechaczyk M, Jariel-Encontre I. The AP-1 transcriptional complex: Local switch or remote command? Biochim Biophys Acta Rev Cancer 2019; 1872:11-23. [PMID: 31034924 DOI: 10.1016/j.bbcan.2019.04.003] [Citation(s) in RCA: 196] [Impact Index Per Article: 32.7] [Received: 03/18/2019] [Revised: 04/19/2019] [Accepted: 04/22/2019] [Indexed: 12/19/2022]
Abstract
The ubiquitous family of AP-1 dimeric transcription complexes is involved in virtually all cellular and physiological functions. It is paramount for cells to reprogram gene expression in response to cues of many sorts and is involved in many tumorigenic processes. How AP-1 controls gene transcription has largely remained elusive till recently. The advent of the "omics" technologies permitting genome-wide studies of transcription factors has however changed and improved our view of AP-1 mechanistical actions. If these studies confirm that AP-1 can sometimes act as a local transcriptional switch operating in the vicinity of transcription start sites (TSS), they strikingly indicate that AP-1 principally operates as a remote command binding to distal enhancers, placing chromatin architecture dynamics at the heart of its transcriptional actions. They also unveil novel constraints operating on AP-1, as well as novel mechanisms used to regulate gene expression via transcription-pioneering-, chromatin-remodeling- and chromatin accessibility maintenance effects.
Affiliation(s)
- Fabienne Bejjani
- Equipe Labellisée Ligue Nationale contre le Cancer, Institut de Génétique Moléculaire de Montpellier, University of Montpellier, CNRS, Montpellier, France; PRASE and Biology Department, Faculty of Sciences - I, Lebanese University, Beirut, Lebanon
| | - Emilie Evanno
- Equipe Labellisée Ligue Nationale contre le Cancer, Institut de Génétique Moléculaire de Montpellier, University of Montpellier, CNRS, Montpellier, France
| | - Kazem Zibara
- PRASE and Biology Department, Faculty of Sciences - I, Lebanese University, Beirut, Lebanon
| | - Marc Piechaczyk
- Equipe Labellisée Ligue Nationale contre le Cancer, Institut de Génétique Moléculaire de Montpellier, University of Montpellier, CNRS, Montpellier, France.
| | - Isabelle Jariel-Encontre
- Equipe Labellisée Ligue Nationale contre le Cancer, Institut de Génétique Moléculaire de Montpellier, University of Montpellier, CNRS, Montpellier, France.
| |
|
27
|
Vandel J, Cassan O, Lèbre S, Lecellier CH, Bréhélin L. Probing transcription factor combinatorics in different promoter classes and in enhancers. BMC Genomics 2019; 20:103. [PMID: 30709337 PMCID: PMC6359851 DOI: 10.1186/s12864-018-5408-0] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Received: 03/21/2018] [Accepted: 12/26/2018] [Indexed: 12/31/2022]
Abstract
BACKGROUND In eukaryotic cells, transcription factors (TFs) are thought to act in a combinatorial way, by competing and collaborating to regulate common target genes. However, several questions remain regarding the conservation of these combinations among different gene classes, regulatory regions and cell types. RESULTS We propose a new approach named TFcoop to infer the TF combinations involved in the binding of a target TF in a particular cell type. TFcoop aims to predict the binding sites of the target TF from the nucleotide content of the sequences and the binding affinities of all identified cooperating TFs. The set of cooperating TFs and the model parameters are learned from ChIP-seq data of the target TF. We used TFcoop to investigate the TF combinations involved in the binding of 106 TFs in 41 cell types and in four regulatory regions: promoters of mRNAs, lncRNAs and pri-miRNAs, and enhancers. We first show that TFcoop is accurate and outperforms simple PWM methods for predicting TF binding sites. Next, analysis of the learned models sheds light on important properties of TF combinations in different promoter classes and in enhancers. First, we show that combinations governing TF binding in enhancers are more cell-type specific than those governing binding in promoters. Second, for a given TF and cell type, we observe that TF combinations are different between promoters and enhancers, but similar for promoters of mRNAs, lncRNAs and pri-miRNAs. Analysis of the TFs cooperating with the different targets shows over-representation of pioneer TFs and a clear preference for TFs with binding motif composition similar to that of the target. Lastly, our models accurately distinguish promoters associated with specific biological processes. CONCLUSIONS TFcoop appears to be an accurate approach for studying TF combinations. Its use on ENCODE and FANTOM data allowed us to discover important properties of human TF combinations in different promoter classes and in enhancers.
The R code for learning a TFcoop model and for reproducing the main experiments described in the paper is available in an R Markdown file at https://gite.lirmm.fr/brehelin/TFcoop.
Affiliation(s)
- Jimmy Vandel
- LIRMM, Univ. Montpellier, CNRS, Montpellier, France
- IBC, CNRS, Univ. Montpellier, Montpellier, France
| | - Océane Cassan
- LIRMM, Univ. Montpellier, CNRS, Montpellier, France
- IBC, CNRS, Univ. Montpellier, Montpellier, France
| | - Sophie Lèbre
- IBC, CNRS, Univ. Montpellier, Montpellier, France
- IMAG, Univ. Montpellier, CNRS, Montpellier, France
- Univ. Paul Valery Montpellier, Montpellier, France
| | - Charles-Henri Lecellier
- IBC, CNRS, Univ. Montpellier, Montpellier, France.
- Institut de Génétique Moléculaire de Montpellier, University of Montpellier, CNRS, Montpellier, France.
| | - Laurent Bréhélin
- LIRMM, Univ. Montpellier, CNRS, Montpellier, France.
- IBC, CNRS, Univ. Montpellier, Montpellier, France.
| |
|
28
|
Catarino RR, Stark A. Assessing sufficiency and necessity of enhancer activities for gene expression and the mechanisms of transcription activation. Genes Dev 2018; 32:202-223. [PMID: 29491135 PMCID: PMC5859963 DOI: 10.1101/gad.310367.117] [Citation(s) in RCA: 124] [Impact Index Per Article: 17.7] [Indexed: 12/22/2022]
Abstract
Enhancers are important genomic regulatory elements directing cell type-specific transcription. They assume a key role during development and disease, and their identification and functional characterization have long been the focus of scientific interest. The advent of next-generation sequencing and clustered regularly interspaced short palindromic repeat (CRISPR)/Cas9-based genome editing has revolutionized the means by which we study enhancer biology. In this review, we cover recent developments in the prediction of enhancers based on chromatin characteristics and their identification by functional reporter assays and endogenous DNA perturbations. We discuss that the two latter approaches provide different and complementary insights, especially in assessing enhancer sufficiency and necessity for transcription activation. Furthermore, we discuss recent insights into mechanistic aspects of enhancer function, including findings about cofactor requirements and the role of post-translational histone modifications such as monomethylation of histone H3 Lys4 (H3K4me1). Finally, we survey how these approaches advance our understanding of transcription regulation with respect to promoter specificity and transcriptional bursting and provide an outlook covering open questions and promising developments.
Affiliation(s)
- Rui R Catarino
- Research Institute of Molecular Pathology (IMP), Vienna Biocenter (VBC), 1030 Vienna, Austria
| | - Alexander Stark
- Research Institute of Molecular Pathology (IMP), Vienna Biocenter (VBC), 1030 Vienna, Austria
| |
|