51
|
Srivastava D, Aydin B, Mazzoni EO, Mahony S. An interpretable bimodal neural network characterizes the sequence and preexisting chromatin predictors of induced transcription factor binding. Genome Biol 2021; 22:20. [PMID: 33413545 PMCID: PMC7788824 DOI: 10.1186/s13059-020-02218-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 06/18/2019] [Accepted: 12/03/2020] [Indexed: 02/06/2023] Open
Abstract
BACKGROUND Transcription factor (TF) binding specificity is determined via a complex interplay between the transcription factor's DNA binding preference and cell type-specific chromatin environments. The chromatin features that correlate with transcription factor binding in a given cell type have been well characterized. For instance, the binding sites for a majority of transcription factors display concurrent chromatin accessibility. However, concurrent chromatin features reflect the binding activities of the transcription factor itself and thus provide limited insight into how genome-wide TF-DNA binding patterns became established in the first place. To understand the determinants of transcription factor binding specificity, we therefore need to examine how newly activated transcription factors interact with sequence and preexisting chromatin landscapes. RESULTS Here, we investigate the sequence and preexisting chromatin predictors of TF-DNA binding by examining the genome-wide occupancy of transcription factors that have been induced in well-characterized chromatin environments. We develop Bichrom, a bimodal neural network that jointly models sequence and preexisting chromatin data to interpret the genome-wide binding patterns of induced transcription factors. We find that the preexisting chromatin landscape is a differential global predictor of TF-DNA binding; incorporating preexisting chromatin features improves our ability to explain the binding specificity of some transcription factors substantially, but not others. Furthermore, by analyzing site-level predictors, we show that transcription factor binding in previously inaccessible chromatin tends to correspond to the presence of more favorable cognate DNA sequences. CONCLUSIONS Bichrom thus provides a framework for modeling, interpreting, and visualizing the joint sequence and chromatin landscapes that determine TF-DNA binding dynamics.
Affiliation(s)
- Divyanshi Srivastava
- Center for Eukaryotic Gene Regulation, Department of Biochemistry & Molecular Biology, Pennsylvania State University, University Park, PA, USA
- Begüm Aydin
- Department of Biology, New York University, New York, NY, USA
- Shaun Mahony
- Center for Eukaryotic Gene Regulation, Department of Biochemistry & Molecular Biology, Pennsylvania State University, University Park, PA, USA.
|
52
|
Endometriosis Is Associated with a Significant Increase in hTERC and Altered Telomere/Telomerase Associated Genes in the Eutopic Endometrium, an Ex-Vivo and In Silico Study. Biomedicines 2020; 8:588. [PMID: 33317189 PMCID: PMC7764055 DOI: 10.3390/biomedicines8120588] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 09/30/2020] [Revised: 12/02/2020] [Accepted: 12/03/2020] [Indexed: 12/13/2022] Open
Abstract
Telomeres protect chromosomal ends and they are maintained by the specialised enzyme, telomerase. Endometriosis is a common gynaecological disease and high telomerase activity and higher hTERT levels associated with longer endometrial telomere lengths are characteristics of eutopic secretory endometrial aberrations of women with endometriosis. Our ex-vivo study examined the levels of hTERC and DKC1 RNA and dyskerin protein levels in the endometrium from healthy women and those with endometriosis (n = 117). The in silico study examined endometriosis-specific telomere- and telomerase-associated gene (TTAG) transcriptional aberrations of secretory phase eutopic endometrium utilising publicly available microarray datasets. Eutopic secretory endometrial hTERC levels were significantly increased in women with endometriosis compared to healthy endometrium, yet dyskerin mRNA and protein levels were unperturbed. Our in silico study identified 10 TTAGs (CDKN2A, PML, ZNHIT2, UBE3A, MCCC2, HSPC159, FGFR2, PIK3C2A, RALGAPA1, and HNRNPA2B1) to be altered in mid-secretory endometrium of women with endometriosis. High levels of hTERC and the identified other TTAGs might be part of the established alteration in the eutopic endometrial telomerase biology in women with endometriosis in the secretory phase of the endometrium and our data informs future research to unravel the fundamental involvement of telomerase in the pathogenesis of endometriosis.
|
53
|
López-Rivera F, Foster Rhoades OK, Vincent BJ, Pym ECG, Bragdon MDJ, Estrada J, DePace AH, Wunderlich Z. A Mutation in the Drosophila melanogaster eve Stripe 2 Minimal Enhancer Is Buffered by Flanking Sequences. G3 (Bethesda, Md.) 2020; 10:4473-4482. [PMID: 33037064 PMCID: PMC7718739 DOI: 10.1534/g3.120.401777] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 08/01/2020] [Accepted: 10/01/2020] [Indexed: 01/18/2023]
Abstract
Enhancers are DNA sequences composed of transcription factor binding sites that drive complex patterns of gene expression in space and time. Until recently, studying enhancers in their genomic context was technically challenging. Therefore, minimal enhancers, the shortest pieces of DNA that can drive an expression pattern that resembles a gene's endogenous pattern, are often used to study features of enhancer function. However, evidence suggests that some enhancers require sequences outside the minimal enhancer to maintain function under environmental perturbations. We hypothesized that these additional sequences also prevent misexpression caused by a transcription factor binding site mutation within a minimal enhancer. Using the Drosophila melanogaster even-skipped stripe 2 enhancer as a case study, we tested the effect of a Giant binding site mutation (gt-2) on the expression patterns driven by minimal and extended enhancer reporter constructs. We found that, in contrast to the misexpression caused by the gt-2 binding site deletion in the minimal enhancer, the same gt-2 binding site deletion in the extended enhancer did not have an effect on expression. The buffering of expression levels, but not expression pattern, is partially explained by an additional Giant binding site outside the minimal enhancer. Deleting the gt-2 binding site in the endogenous locus had no significant effect on stripe 2 expression. Our results indicate that rules derived from mutating enhancer reporter constructs may not represent what occurs in the endogenous context.
Affiliation(s)
- Francheska López-Rivera
- Department of Systems Biology, Harvard Medical School, Boston, MA 02115
- GSAS Research Scholar Initiative, Harvard University, Cambridge, MA 02138
- Ben J Vincent
- Department of Systems Biology, Harvard Medical School, Boston, MA 02115
- Edward C G Pym
- Department of Systems Biology, Harvard Medical School, Boston, MA 02115
- Javier Estrada
- Department of Systems Biology, Harvard Medical School, Boston, MA 02115
- Angela H DePace
- Department of Systems Biology, Harvard Medical School, Boston, MA 02115
- Zeba Wunderlich
- Department of Systems Biology, Harvard Medical School, Boston, MA 02115
|
54
|
Martin PC, Zabet NR. Dissecting the binding mechanisms of transcription factors to DNA using a statistical thermodynamics framework. Comput Struct Biotechnol J 2020; 18:3590-3605. [PMID: 33304457 PMCID: PMC7708957 DOI: 10.1016/j.csbj.2020.11.006] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 08/18/2020] [Revised: 11/02/2020] [Accepted: 11/04/2020] [Indexed: 01/22/2023] Open
Abstract
Transcription factors (TFs) bind to DNA and control the activity of target genes. Here, we present ChIPanalyser, a user-friendly, versatile and powerful R/Bioconductor package for predicting and modelling the binding of TFs to DNA. ChIPanalyser performs similarly to state-of-the-art tools, but is an explainable model and provides biological insights into binding mechanisms of TFs. We focused on investigating the binding mechanisms of three TFs that are known architectural proteins (CTCF, BEAF-32 and su(Hw)) in three Drosophila cell lines (BG3, Kc167 and S2). While CTCF preferentially binds only to a subset of high affinity sites located mainly in open chromatin, BEAF-32 binds to most of its high affinity binding sites available in open chromatin. In contrast, su(Hw) binds to both open and partially closed chromatin. Most importantly, differences in TF binding profiles between cell lines for these TFs are mainly driven by differences in DNA accessibility and not by differences in TF concentrations between cell lines. Finally, we investigated binding of Hox TFs in Drosophila and found that Ubx binds only in open chromatin, while Abd-B and Dfd are capable of binding in both open and partially closed chromatin. Overall, our results show that TFs display different binding mechanisms and that our model is able to recapitulate their specific binding behaviour.
Affiliation(s)
- Patrick C.N. Martin
- School of Life Sciences, University of Essex, Colchester CO4 3SQ, UK
- Biotech Research and Innovation Centre (BRIC), University of Copenhagen, DK-2200 Copenhagen, Denmark
- Nicolae Radu Zabet
- School of Life Sciences, University of Essex, Colchester CO4 3SQ, UK
- Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London E1 2AT, UK
|
55
|
Nameki R, Chang H, Reddy J, Corona RI, Lawrenson K. Transcription factors in epithelial ovarian cancer: histotype-specific drivers and novel therapeutic targets. Pharmacol Ther 2020; 220:107722. [PMID: 33137377 DOI: 10.1016/j.pharmthera.2020.107722] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 08/24/2020] [Accepted: 10/26/2020] [Indexed: 02/06/2023]
Abstract
Transcription factors (TFs) are major contributors to cancer risk and somatic development. In preclinical and clinical studies, direct or indirect inhibition of TF-mediated oncogenic gene expression profiles has proven to be effective in many tumor types, highlighting this group of proteins as valuable therapeutic targets. In spite of this, our understanding of TFs in epithelial ovarian cancer (EOC) is relatively limited. EOC is a heterogeneous disease composed of five major histologic subtypes: high-grade serous, low-grade serous, endometrioid, clear cell and mucinous. Each histology is associated with unique clinical etiologies, sensitivity to therapies, and molecular signatures, including diverse transcriptional regulatory programs. While some TFs are shared across EOC subtypes, a set of TFs are expressed in a histotype-specific manner and likely explain part of the histologic diversity of EOC subtypes. Targeting TFs presents unique opportunities for the development of novel precision medicine strategies for ovarian cancer. This article reviews the critical TFs in EOC subtypes and highlights the potential of exploiting TFs as biomarkers and therapeutic targets.
Affiliation(s)
- Robbin Nameki
- Women's Cancer Research Program at the Samuel Oschin Comprehensive Cancer Center, Cedars-Sinai Medical Center, Los Angeles, CA, USA; Division of Gynecologic Oncology, Department of Obstetrics and Gynecology, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Heidi Chang
- Women's Cancer Research Program at the Samuel Oschin Comprehensive Cancer Center, Cedars-Sinai Medical Center, Los Angeles, CA, USA; Division of Gynecologic Oncology, Department of Obstetrics and Gynecology, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Jessica Reddy
- Women's Cancer Research Program at the Samuel Oschin Comprehensive Cancer Center, Cedars-Sinai Medical Center, Los Angeles, CA, USA; Division of Gynecologic Oncology, Department of Obstetrics and Gynecology, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Rosario I Corona
- Women's Cancer Research Program at the Samuel Oschin Comprehensive Cancer Center, Cedars-Sinai Medical Center, Los Angeles, CA, USA; Division of Gynecologic Oncology, Department of Obstetrics and Gynecology, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Kate Lawrenson
- Women's Cancer Research Program at the Samuel Oschin Comprehensive Cancer Center, Cedars-Sinai Medical Center, Los Angeles, CA, USA; Division of Gynecologic Oncology, Department of Obstetrics and Gynecology, Cedars-Sinai Medical Center, Los Angeles, CA, USA; Center for Bioinformatics and Functional Genomics, Cedars-Sinai Medical Center, Los Angeles, CA, USA.
|
56
|
Prediction of genome-wide effects of single nucleotide variants on transcription factor binding. Sci Rep 2020; 10:17632. [PMID: 33077858 PMCID: PMC7572467 DOI: 10.1038/s41598-020-74793-4] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 05/21/2020] [Accepted: 10/07/2020] [Indexed: 11/26/2022] Open
Abstract
Single nucleotide variants (SNVs) located in transcriptional regulatory regions can result in gene expression changes that lead to adaptive or detrimental phenotypic outcomes. Here, we predict gain or loss of binding sites for 741 transcription factors (TFs) across the human genome. We calculated ‘gainability’ and ‘disruptability’ scores for each TF that represent the likelihood of binding sites being created or disrupted, respectively. We found that functional cis-eQTL SNVs are more likely to alter TF binding sites than rare SNVs in the human population. In addition, we show that cancer somatic mutations have different effects on TF binding sites from different TF families on a cancer-type basis. Finally, we discuss the relationship between these results and cancer mutational signatures. Altogether, we provide a blueprint to study the impact of SNVs derived from genetic variation or disease association on TF binding to gene regulatory regions.
|
57
|
Guan X, Deng H, Choi UL, Li Z, Yang Y, Zeng J, Liu Y, Zhang X, Li G. EZH2 overexpression dampens tumor-suppressive signals via an EGR1 silencer to drive breast tumorigenesis. Oncogene 2020; 39:7127-7141. [PMID: 33009487 DOI: 10.1038/s41388-020-01484-9] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 09/24/2019] [Revised: 08/27/2020] [Accepted: 09/21/2020] [Indexed: 02/08/2023]
Abstract
The mechanism underlying EZH2 overexpression in breast cancer and its involvement in tumorigenesis remain poorly understood. In this study, we developed an approach to systematically identify the trans-acting factors regulating EZH2 expression, and identified more than 20 such factors. We revealed reciprocal regulation of early growth response 1 (EGR1) and EZH2: EGR1 activates the expression of EZH2, and EZH2 represses EGR1 expression. Using CRISPR-mediated genome/epigenome editing, we demonstrated that EZH2 represses EGR1 expression through a silencer downstream of the EGR1 gene. Deletion of the EGR1 silencer resulted in reduced cell growth, invasion, and tumorigenicity of breast cancer cells, and extensive changes in gene expression, such as upregulation of GADD45, DDIT3, and RND1, and downregulation of genes encoding cholesterol biosynthesis pathway enzymes. We hypothesize that EZH2/PRC2 acts as a "brake" for EGR1 expression by targeting the EGR1 silencer, and EZH2 overexpression dampens tumor-suppressive signals mediated by EGR1 to drive breast tumorigenesis.
Affiliation(s)
- Xiaowen Guan
- Faculty of Health Sciences, University of Macau, Macau, China; Cancer Centre, Faculty of Health Sciences, University of Macau, Macau, China; Centre of Reproduction, Development and Aging, Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macau, China
- Houliang Deng
- Faculty of Health Sciences, University of Macau, Macau, China; Cancer Centre, Faculty of Health Sciences, University of Macau, Macau, China; Centre of Reproduction, Development and Aging, Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macau, China
- Un Lam Choi
- Faculty of Health Sciences, University of Macau, Macau, China; Cancer Centre, Faculty of Health Sciences, University of Macau, Macau, China; Centre of Reproduction, Development and Aging, Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macau, China
- Zhengfeng Li
- Faculty of Health Sciences, University of Macau, Macau, China; Cancer Centre, Faculty of Health Sciences, University of Macau, Macau, China; Centre of Reproduction, Development and Aging, Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macau, China
- Yiqi Yang
- Faculty of Health Sciences, University of Macau, Macau, China; Cancer Centre, Faculty of Health Sciences, University of Macau, Macau, China; Centre of Reproduction, Development and Aging, Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macau, China
- Jianming Zeng
- Faculty of Health Sciences, University of Macau, Macau, China; Cancer Centre, Faculty of Health Sciences, University of Macau, Macau, China; Centre of Reproduction, Development and Aging, Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macau, China
- Yunze Liu
- Faculty of Health Sciences, University of Macau, Macau, China; Cancer Centre, Faculty of Health Sciences, University of Macau, Macau, China; Centre of Reproduction, Development and Aging, Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macau, China
- Xuanjun Zhang
- Faculty of Health Sciences, University of Macau, Macau, China; Cancer Centre, Faculty of Health Sciences, University of Macau, Macau, China; Centre of Reproduction, Development and Aging, Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macau, China
- Gang Li
- Faculty of Health Sciences, University of Macau, Macau, China; Cancer Centre, Faculty of Health Sciences, University of Macau, Macau, China; Centre of Reproduction, Development and Aging, Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macau, China
|
58
|
Molecular mechanism linking a novel PCSK9 copy number variant to severe hypercholesterolemia. Atherosclerosis 2020; 304:39-43. [DOI: 10.1016/j.atherosclerosis.2020.05.013] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 03/15/2020] [Revised: 05/14/2020] [Accepted: 05/20/2020] [Indexed: 11/16/2022]
|
59
|
Snyder MP, Gingeras TR, Moore JE, Weng Z, Gerstein MB, Ren B, Hardison RC, Stamatoyannopoulos JA, Graveley BR, Feingold EA, Pazin MJ, Pagan M, Gilchrist DA, Hitz BC, Cherry JM, Bernstein BE, Mendenhall EM, Zerbino DR, Frankish A, Flicek P, Myers RM. Perspectives on ENCODE. Nature 2020; 583:693-698. [PMID: 32728248 PMCID: PMC7410827 DOI: 10.1038/s41586-020-2449-8] [Citation(s) in RCA: 112] [Impact Index Per Article: 22.4] [Received: 12/07/2019] [Accepted: 05/05/2020] [Indexed: 12/25/2022]
Abstract
The Encyclopedia of DNA Elements (ENCODE) Project launched in 2003 with the long-term goal of developing a comprehensive map of functional elements in the human genome. These included genes, biochemical regions associated with gene regulation (for example, transcription factor binding sites, open chromatin, and histone marks) and transcript isoforms. The marks serve as sites for candidate cis-regulatory elements (cCREs) that may serve functional roles in regulating gene expression. The project has been extended to model organisms, particularly the mouse. In the third phase of ENCODE, nearly a million and more than 300,000 cCRE annotations have been generated for human and mouse, respectively, and these have provided a valuable resource for the scientific community.
Affiliation(s)
- Michael P Snyder
- Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA.
- Cardiovascular Institute, Stanford School of Medicine, Stanford, CA, USA.
- Thomas R Gingeras
- Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Jill E Moore
- University of Massachusetts Medical School, Program in Bioinformatics and Integrative Biology, Worcester, MA, USA
- Zhiping Weng
- University of Massachusetts Medical School, Program in Bioinformatics and Integrative Biology, Worcester, MA, USA
- Department of Thoracic Surgery, Clinical Translational Research Center, Shanghai Pulmonary Hospital, The School of Life Sciences and Technology, Tongji University, Shanghai, China
- Bioinformatics Program, Boston University, Boston, MA, USA
- Bing Ren
- Ludwig Institute for Cancer Research, University of California, San Diego, La Jolla, CA, USA
- Center for Epigenomics, University of California, San Diego, La Jolla, CA, USA
- Ross C Hardison
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, USA
- John A Stamatoyannopoulos
- Altius Institute for Biomedical Sciences, Seattle, WA, USA
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Department of Medicine, University of Washington, Seattle, WA, USA
- Brenton R Graveley
- Department of Genetics and Genome Sciences, Institute for Systems Genomics, UConn Health, Farmington, CT, USA
- Elise A Feingold
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Michael J Pazin
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Michael Pagan
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Daniel A Gilchrist
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Benjamin C Hitz
- Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
- J Michael Cherry
- Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
- Bradley E Bernstein
- Broad Institute and Department of Pathology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Eric M Mendenhall
- Biological Sciences, University of Alabama in Huntsville, Huntsville, AL, USA
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
- Daniel R Zerbino
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, UK
- Adam Frankish
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, UK
- Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, UK
- Richard M Myers
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
|
60
|
Srivastava D, Mahony S. Sequence and chromatin determinants of transcription factor binding and the establishment of cell type-specific binding patterns. Biochimica et Biophysica Acta. Gene Regulatory Mechanisms 2020; 1863:194443. [PMID: 31639474 PMCID: PMC7166147 DOI: 10.1016/j.bbagrm.2019.194443] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 06/30/2019] [Revised: 09/21/2019] [Accepted: 10/06/2019] [Indexed: 12/14/2022]
Abstract
Transcription factors (TFs) selectively bind distinct sets of sites in different cell types. Such cell type-specific binding specificity is expected to result from interplay between the TF's intrinsic sequence preferences, cooperative interactions with other regulatory proteins, and cell type-specific chromatin landscapes. Cell type-specific TF binding events are highly correlated with patterns of chromatin accessibility and active histone modifications in the same cell type. However, since concurrent chromatin may itself be a consequence of TF binding, chromatin landscapes measured prior to TF activation provide more useful insights into how cell type-specific TF binding events became established in the first place. Here, we review the various sequence and chromatin determinants of cell type-specific TF binding specificity. We identify the current challenges and opportunities associated with computational approaches to characterizing, imputing, and predicting cell type-specific TF binding patterns. We further focus on studies that characterize TF binding in dynamic regulatory settings, and we discuss how these studies are leading to a more complex and nuanced understanding of dynamic protein-DNA binding activities. We propose that TF binding activities at individual sites can be viewed along a two-dimensional continuum of local sequence and chromatin context. Under this view, cell type-specific TF binding activities may result from either strongly favorable sequence features or strongly favorable chromatin context.
Affiliation(s)
- Divyanshi Srivastava
- Center for Eukaryotic Gene Regulation, Department of Biochemistry & Molecular Biology, The Pennsylvania State University, University Park, PA, United States of America
- Shaun Mahony
- Center for Eukaryotic Gene Regulation, Department of Biochemistry & Molecular Biology, The Pennsylvania State University, University Park, PA, United States of America.
|
61
|
Moradifard S, Saghiri R, Ehsani P, Mirkhani F, Ebrahimi‐Rad M. A preliminary computational outputs versus experimental results: Application of sTRAP, a biophysical tool for the analysis of SNPs of transcription factor-binding sites. Mol Genet Genomic Med 2020; 8:e1219. [PMID: 32155318 PMCID: PMC7216802 DOI: 10.1002/mgg3.1219] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 12/10/2019] [Accepted: 02/25/2020] [Indexed: 11/12/2022] Open
Abstract
BACKGROUND In the human genome, the network of transcription factors (TFs) and transcription factor-binding sites (TFBSs) has an important regulatory function in biological pathways. Such crosstalk might be affected by single-nucleotide polymorphisms (SNPs), which could create or disrupt a TFBS, leading to either a disease or a phenotypic defect. Many computational resources have been introduced to predict variations in TF binding due to SNPs inside TFBSs, sTRAP being one of them. METHODS A literature review was performed and the experimental data for 18 TFBSs located in 12 genes were compiled. The sequences of TFBS motifs were extracted using two different strategies: at a size similar to the synthetic target sites used in the experimental techniques, and with 60 bp upstream and downstream of the SNPs. sTRAP (http://trap.molgen.mpg.de/cgi-bin/trap_two_seq_form.cgi) was applied to compute the binding affinity scores of their cognate TFs in the context of reference and mutant sequences of TFBSs. The alternative bioinformatics model used in this study was regulatory analysis of variation in enhancers (RAVEN; http://www.cisreg.ca/cgi-bin/RAVEN/a). The bioinformatics outputs of our study were compared with experimental data from electrophoretic mobility shift assays (EMSA). RESULTS In 6 out of 18 TFBSs, in the genes COL1A1, Hb ḉᴪ, TF, FIX, MBL2, and NOS2A, the outputs of sTRAP were inconsistent with the results of EMSA. Furthermore, no p value was presented for the difference between the two binding affinity scores under the wild-type and mutant conditions of the TFBSs, nor were any criteria given for preferring or selecting among the measurements from the different matrices used for the same analysis. CONCLUSION Our preliminary study indicated some paradoxical results between sTRAP and experimental data. However, to link the data of sTRAP to the biological functions, its optimization via experimental procedures with the integration of expanded data and applying several other bioinformatics tools might be required.
Affiliation(s)
- Reza Saghiri
- Biochemistry Department, Pasteur Institute of Iran, Tehran, Iran
- Parastoo Ehsani
- Molecular Biology Department, Pasteur Institute of Iran, Tehran, Iran
|
62
|
Schreiber J, Bilmes J, Noble WS. Completing the ENCODE3 compendium yields accurate imputations across a variety of assays and human biosamples. Genome Biol 2020; 21:82. [PMID: 32228713 PMCID: PMC7104481 DOI: 10.1186/s13059-020-01978-5] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 05/28/2019] [Accepted: 02/26/2020] [Indexed: 12/16/2022] Open
Abstract
Recent efforts to describe the human epigenome have yielded thousands of epigenomic and transcriptomic datasets. However, due primarily to cost, the total number of such assays that can be performed is limited. Accordingly, we applied an imputation approach, Avocado, to a dataset of 3814 tracks of data derived from the ENCODE compendium, including measurements of chromatin accessibility, histone modification, transcription, and protein binding. Avocado shows significant improvements in imputing protein binding compared to the top models in the ENCODE-DREAM challenge. Additionally, we show that the Avocado model allows for efficient addition of new assays and biosamples to a pre-trained model.
Affiliation(s)
- Jacob Schreiber
- Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, USA.
- Jeffrey Bilmes
- Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, USA
- Department of Electrical Engineering, University of Washington, Seattle, USA
- William Stafford Noble
- Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, USA.
- Department of Genome Sciences, University of Washington, Seattle, USA.
|
63
|
Qin Q, Fan J, Zheng R, Wan C, Mei S, Wu Q, Sun H, Brown M, Zhang J, Meyer CA, Liu XS. Lisa: inferring transcriptional regulators through integrative modeling of public chromatin accessibility and ChIP-seq data. Genome Biol 2020; 21:32. [PMID: 32033573 PMCID: PMC7007693 DOI: 10.1186/s13059-020-1934-6] [Citation(s) in RCA: 186] [Impact Index Per Article: 37.2] [Received: 06/05/2019] [Accepted: 01/13/2020] [Indexed: 12/21/2022] Open
Abstract
We developed Lisa (http://lisa.cistrome.org/) to predict the transcriptional regulators (TRs) of differentially expressed or co-expressed gene sets. Based on the input gene sets, Lisa first uses histone mark ChIP-seq and chromatin accessibility profiles to construct a chromatin model related to the regulation of these genes. Using TR ChIP-seq peaks or imputed TR binding sites, Lisa probes the chromatin models using in silico deletion to find the most relevant TRs. Applied to gene sets derived from targeted TF perturbation experiments, Lisa boosted the performance of imputed TR cistromes and outperformed alternative methods in identifying the perturbed TRs.
Affiliation(s)
- Qian Qin
  - Clinical Translational Research Center, Shanghai Pulmonary Hospital, School of Life Science and Technology, Tongji University, Shanghai, 200433, China
  - Center of Molecular Medicine, Children's Hospital of Fudan University, Shanghai, 201102, China
- Jingyu Fan
  - Clinical Translational Research Center, Shanghai Pulmonary Hospital, School of Life Science and Technology, Tongji University, Shanghai, 200433, China
- Rongbin Zheng
  - Clinical Translational Research Center, Shanghai Pulmonary Hospital, School of Life Science and Technology, Tongji University, Shanghai, 200433, China
- Changxin Wan
  - Clinical Translational Research Center, Shanghai Pulmonary Hospital, School of Life Science and Technology, Tongji University, Shanghai, 200433, China
- Shenglin Mei
  - Clinical Translational Research Center, Shanghai Pulmonary Hospital, School of Life Science and Technology, Tongji University, Shanghai, 200433, China
- Qiu Wu
  - Clinical Translational Research Center, Shanghai Pulmonary Hospital, School of Life Science and Technology, Tongji University, Shanghai, 200433, China
- Hanfei Sun
  - Clinical Translational Research Center, Shanghai Pulmonary Hospital, School of Life Science and Technology, Tongji University, Shanghai, 200433, China
- Myles Brown
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, 02215, USA
  - Department of Data Sciences, Dana-Farber Cancer Institute and Harvard T.H. Chan School of Public Health, Boston, MA, 02215, USA
- Jing Zhang
  - Stem Cell Translational Research Center, Tongji Hospital, School of Life Science and Technology, Tongji University, Shanghai, 200065, China
- Clifford A Meyer
  - Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
  - Department of Data Sciences, Dana-Farber Cancer Institute and Harvard T.H. Chan School of Public Health, Boston, MA, 02215, USA
- X Shirley Liu
  - Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
  - Department of Data Sciences, Dana-Farber Cancer Institute and Harvard T.H. Chan School of Public Health, Boston, MA, 02215, USA
64
Koo PK, Ploenzke M. Deep learning for inferring transcription factor binding sites. Curr Opin Syst Biol 2020; 19:16-23. [PMID: 32905524 PMCID: PMC7469942 DOI: 10.1016/j.coisb.2020.04.001] [Citation(s) in RCA: 42] [Impact Index Per Article: 8.4] [Indexed: 01/28/2023]
Abstract
Deep learning is a powerful tool for predicting transcription factor binding sites from DNA sequence. Despite their high predictive accuracy, there are no guarantees that a high-performing deep learning model will learn causal sequence-function relationships. Thus a move beyond performance comparisons on benchmark datasets is needed. Interpreting model predictions is a powerful approach to identify which features drive performance gains and ideally provide insight into the underlying biological mechanisms. Here we highlight timely advances in deep learning for genomics, with a focus on inferring transcription factor binding sites. We describe recent applications, model architectures, and advances in local and global model interpretability methods, then conclude with a discussion on future research directions.
Affiliation(s)
- Peter K Koo
  - Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Matt Ploenzke
  - Department of Biostatistics, Harvard University, Cambridge, MA, USA
65
Lin QXX, Thieffry D, Jha S, Benoukraf T. TFregulomeR reveals transcription factors' context-specific features and functions. Nucleic Acids Res 2020; 48:e10. [PMID: 31754708 PMCID: PMC6954419 DOI: 10.1093/nar/gkz1088] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 08/01/2019] [Revised: 10/25/2019] [Accepted: 11/01/2019] [Indexed: 12/25/2022] Open
Abstract
Transcription factors (TFs) are sequence-specific DNA binding proteins, fine-tuning spatiotemporal gene expression. Since genomic occupancy of a TF is highly dynamic, it is crucial to study TF binding sites (TFBSs) in a cell-specific context. To date, thousands of ChIP-seq datasets have portrayed the genomic binding landscapes of numerous TFs in different cell types. Although these datasets can be browsed via several platforms, tools that can operate on that data flow are still lacking. Here, we introduce TFregulomeR (https://github.com/benoukraflab/TFregulomeR), an R-library linked to an up-to-date compendium of cistrome and methylome datasets, implemented with functionalities that facilitate integrative analyses. In particular, TFregulomeR enables the characterization of TF binding partners and cell-specific TFBSs, along with the study of TF’s functions in the context of different partnerships and DNA methylation levels. We demonstrated that TFs’ target gene ontologies can differ notably depending on their partners and, by re-analyzing well characterized TFs, we brought to light that numerous leucine zipper TFBSs derived from ChIP-seq experiments documented in current databases were inadequately characterized, due to the fact that their position weight matrices were assembled using a mixture of homodimer and heterodimer binding sites. Altogether, analyses of context-specific transcription regulation with TFregulomeR foster our understanding of regulatory network-dependent TF functions.
Affiliation(s)
- Quy Xiao Xuan Lin
  - Cancer Science Institute of Singapore, National University of Singapore, Singapore 117599, Singapore
- Denis Thieffry
  - Computational Systems Biology Team, Institut de Biologie de l'École Normale Supérieure (IBENS), CNRS, INSERM, École Normale Supérieure, PSL Research University, Paris 75005, France
- Sudhakar Jha
  - Cancer Science Institute of Singapore, National University of Singapore, Singapore 117599, Singapore
  - Department of Biochemistry, National University of Singapore, Singapore 117596, Singapore
- Touati Benoukraf
  - Cancer Science Institute of Singapore, National University of Singapore, Singapore 117599, Singapore
  - Discipline of Genetics, Faculty of Medicine, Memorial University of Newfoundland, St. John's, NL A1B 3V6, Canada