1
|
Kamat A, Tran NT, Sharda M, Sontakke N, Le TBK, Badrinarayanan A. Widespread prevalence of a methylation-dependent switch to activate an essential DNA damage response in bacteria. PLoS Biol 2024; 22:e3002540. [PMID: 38466718 PMCID: PMC10957082 DOI: 10.1371/journal.pbio.3002540] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2023] [Revised: 03/21/2024] [Accepted: 02/06/2024] [Indexed: 03/13/2024]
Abstract
DNA methylation plays central roles in diverse cellular processes, ranging from error-correction during replication to regulation of bacterial defense mechanisms. Nevertheless, certain aberrant methylation modifications can have lethal consequences. The mechanisms by which bacteria detect and respond to such damage remain incompletely understood. Here, we discover a highly conserved but previously uncharacterized transcription factor (Cada2), which orchestrates a methylation-dependent adaptive response in Caulobacter. This response operates independently of the SOS response, governs the expression of genes crucial for direct repair, and is essential for surviving methylation-induced damage. Our molecular investigation of Cada2 reveals a cysteine methylation-dependent posttranslational modification (PTM) and mode of action distinct from its Escherichia coli counterpart, a trait conserved across all bacteria harboring a Cada2-like homolog instead. Extending across the bacterial kingdom, our findings support the notion of divergence and coevolution of adaptive response transcription factors and their corresponding sequence-specific DNA motifs. Despite this diversity, the ubiquitous prevalence of adaptive response regulators underscores the significance of a transcriptional switch, mediated by methylation PTM, in driving a specific and essential bacterial DNA damage response.
Affiliation(s)
- Aditya Kamat
- National Centre for Biological Sciences (TIFR), Bengaluru, India
| | - Ngat T. Tran
- John Innes Centre, Department of Molecular Microbiology, Colney Lane, Norwich, United Kingdom
| | - Mohak Sharda
- National Centre for Biological Sciences (TIFR), Bengaluru, India
| | - Neha Sontakke
- National Centre for Biological Sciences (TIFR), Bengaluru, India
| | - Tung B. K. Le
- John Innes Centre, Department of Molecular Microbiology, Colney Lane, Norwich, United Kingdom
| | | |
|
2
|
Zhu X, Huang Q, Huang L, Luo J, Li Q, Kong D, Deng B, Gu Y, Wang X, Li C, Kong S, Zhang Y. MAE-seq refines regulatory elements across the genome. Nucleic Acids Res 2024; 52:e9. [PMID: 38038259 PMCID: PMC10810209 DOI: 10.1093/nar/gkad1129] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2022] [Revised: 10/23/2023] [Accepted: 11/10/2023] [Indexed: 12/02/2023]
Abstract
Proper cell fate determination relies on precise spatial and temporal genome-wide cooperation between regulatory elements (REs) and their targeted genes. However, the lengths of REs defined using different methods vary, which indicates that there is sequence redundancy and that the context of the genome may be unintelligible. We developed a method called MAE-seq (Massive Active Enhancers by Sequencing) to experimentally identify functional REs at a 25-bp scale. In this study, MAE-seq was used to identify 626879, 541617 and 554826 25-bp enhancers in mouse embryonic stem cells (mESCs), C2C12 and HEK 293T, respectively. Using ∼1.6 trillion 25-bp DNA fragments and screening 12 billion cells, we identified 626879 as active enhancers in mESCs as an example. Comparative analysis revealed that most of the histone modification datasets were annotated by MAE-seq loci. Furthermore, 33.85% (212195) of the identified enhancers were de novo ones with no epigenetic modification. Intriguingly, distinct chromatin states dictate the requirement for dissimilar cofactors in governing novel and known enhancers. Validation results show that these 25-bp sequences could act as a functional unit, which shows expression patterns identical or similar to those of the previously defined larger elements. Enhanced resolution facilitated the identification of numerous cell-specific enhancers and their accurate annotation as super enhancers. Moreover, we characterized novel elements capable of augmenting gene activity. By integrating with high-resolution Hi-C data, over 55.64% of novel elements may have a distal association with different targeted genes. For example, we found that the Cdh1 gene interacts with one novel and two known REs in mESCs. The biological effects of these interactions were investigated using CRISPR-Cas9, revealing their role in coordinating Cdh1 gene expression and mESC proliferation.
Our study presents an experimental approach to refine the REs at 25-bp resolution, advancing the precision of genome annotation and unveiling the underlying genome context. This novel approach not only advances our understanding of gene regulation but also opens avenues for comprehensive exploration of the genomic landscape.
Affiliation(s)
- Xiusheng Zhu
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China
| | - Qitong Huang
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China
- Department of animal sciences, Wageningen University & Research, Wageningen, 6708PB, Netherlands
| | - Lei Huang
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China
| | - Jing Luo
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China
| | - Qing Li
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China
| | - Dashuai Kong
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China
| | - Biao Deng
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China
| | - Yi Gu
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China
| | - Xueyan Wang
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China
| | - Chenying Li
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China
| | - Siyuan Kong
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China
| | - Yubo Zhang
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China
- Kunpeng Institute of Modern Agriculture at Foshan, Foshan, 528225, China
| |
|
3
|
Ribeiro ML, Sánchez Vinces S, Mondragon L, Roué G. Epigenetic targets in B- and T-cell lymphomas: latest developments. Ther Adv Hematol 2023; 14:20406207231173485. [PMID: 37273421 PMCID: PMC10236259 DOI: 10.1177/20406207231173485] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2022] [Accepted: 04/17/2023] [Indexed: 06/06/2023]
Abstract
Non-Hodgkin's lymphomas (NHLs) comprise a diverse group of diseases, either of mature B-cell or of T-cell derivation, characterized by heterogeneous molecular features and clinical manifestations. While most patients respond to standard chemotherapy, immunotherapy, radiation and/or stem cell transplantation, relapsed and/or refractory cases still have a dismal outcome. Deep sequencing analyses have shown that epigenetic dysregulations, including mutations in epigenetic enzymes such as chromatin modifiers and DNA methyltransferases (DNMTs), are prevalent in both B-cell and T-cell lymphomas. Accordingly, over the past decade, a large number of epigenetic-modifying agents have been developed and introduced into the clinical management of these entities, and a few specific inhibitors have already been approved for clinical use. Here we summarize the main epigenetic alterations described in B- and T-NHL, which further supported the clinical development of a selected set of epidrugs in specific diseases, including inhibitors of DNMTs, histone deacetylases (HDACs), and extra-terminal domain proteins (bromodomain and extra-terminal motif; BETs). Finally, we highlight the most promising future directions of research in this area, explaining how bioinformatics approaches can help to identify new epigenetic targets in B- and T-cell lymphoid neoplasms.
Affiliation(s)
- Marcelo Lima Ribeiro
- Lymphoma Translational Group, Josep Carreras Leukaemia Research Institute, Badalona, Spain
- Laboratory of Immunopharmacology and Molecular Biology, Sao Francisco University Medical School, Braganca Paulista, Brazil
| | - Salvador Sánchez Vinces
- Laboratory of Immunopharmacology and Molecular Biology, Sao Francisco University Medical School, Braganca Paulista, Brazil
| | - Laura Mondragon
- T Cell Lymphoma Group, Josep Carreras Leukaemia Research Institute, IJC. Ctra de Can Ruti, Camí de les Escoles s/n, 08916 Badalona, Barcelona, Spain
| | - Gael Roué
- Lymphoma Translational Group, Josep Carreras Leukaemia Research Institute, IJC. Ctra de Can Ruti, Camí de les Escoles s/n, 08916 Badalona, Barcelona, Spain
| |
|
4
|
Schwaiger M, Andrikou C, Dnyansagar R, Murguia PF, Paganos P, Voronov D, Zimmermann B, Lebedeva T, Schmidt HA, Genikhovich G, Benvenuto G, Arnone MI, Technau U. An ancestral Wnt-Brachyury feedback loop in axial patterning and recruitment of mesoderm-determining target genes. Nat Ecol Evol 2022; 6:1921-1939. [PMID: 36396969 DOI: 10.1038/s41559-022-01905-w] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 07/26/2021] [Accepted: 09/12/2022] [Indexed: 11/18/2022]
Abstract
Transcription factors are crucial drivers of cellular differentiation during animal development and often share ancient evolutionary origins. The T-box transcription factor Brachyury plays a pivotal role as an early mesoderm determinant and neural repressor in vertebrates; yet, the ancestral function and key evolutionary transitions of the role of this transcription factor remain obscure. Here, we present a genome-wide target-gene screen using chromatin immunoprecipitation sequencing in the sea anemone Nematostella vectensis, an early branching non-bilaterian, and the sea urchin Strongylocentrotus purpuratus, a representative of the sister lineage of chordates. Our analysis reveals an ancestral gene regulatory feedback loop connecting Brachyury, FoxA and canonical Wnt signalling involved in axial patterning that predates the cnidarian-bilaterian split about 700 million years ago. Surprisingly, we also found that part of the gene regulatory network controlling the fate of neuromesodermal progenitors in vertebrates was already present in the common ancestor of cnidarians and bilaterians. However, while several endodermal and neuronal Brachyury target genes are ancestrally shared, hardly any of the key mesodermal downstream targets in vertebrates are found in the sea anemone or the sea urchin. Our study suggests that a limited number of target genes involved in mesoderm formation were newly acquired in the vertebrate lineage, leading to a dramatic shift in the function of this ancestral developmental regulator.
Affiliation(s)
- Michaela Schwaiger
- Department of Neurosciences and Developmental Biology, Faculty of Life Sciences, University of Vienna, Vienna, Austria
- Friedrich Miescher Institute for Biomedical Research, Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Carmen Andrikou
- Stazione Zoologica Anton Dohrn, Villa Comunale, Naples, Italy
- Department of Biological Sciences, University of Bergen, Bergen, Norway
| | - Rohit Dnyansagar
- Department of Neurosciences and Developmental Biology, Faculty of Life Sciences, University of Vienna, Vienna, Austria
| | - Patricio Ferrer Murguia
- Department of Neurosciences and Developmental Biology, Faculty of Life Sciences, University of Vienna, Vienna, Austria
| | | | - Danila Voronov
- Stazione Zoologica Anton Dohrn, Villa Comunale, Naples, Italy
| | - Bob Zimmermann
- Department of Neurosciences and Developmental Biology, Faculty of Life Sciences, University of Vienna, Vienna, Austria
| | - Tatiana Lebedeva
- Department of Neurosciences and Developmental Biology, Faculty of Life Sciences, University of Vienna, Vienna, Austria
| | - Heiko A Schmidt
- Center for Integrative Bioinformatics Vienna, Max Perutz Labs, University of Vienna, Vienna, Austria
| | - Grigory Genikhovich
- Department of Neurosciences and Developmental Biology, Faculty of Life Sciences, University of Vienna, Vienna, Austria
| | | | | | - Ulrich Technau
- Department of Neurosciences and Developmental Biology, Faculty of Life Sciences, University of Vienna, Vienna, Austria.
- Max Perutz Labs, University of Vienna, Vienna, Austria.
- Research Platform 'Single Cell Regulation of Stem Cells', University of Vienna, Vienna, Austria.
| |
|
5
|
Detilleux D, Spill YG, Balaramane D, Weber M, Bardet AF. Pan-cancer predictions of transcription factors mediating aberrant DNA methylation. Epigenetics Chromatin 2022; 15:10. [PMID: 35331302 PMCID: PMC8944071 DOI: 10.1186/s13072-022-00443-w] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 11/16/2021] [Accepted: 03/04/2022] [Indexed: 12/31/2022]
Abstract
BACKGROUND Aberrant DNA methylation is a hallmark of cancer cells. However, the mechanisms underlying changes in DNA methylation remain elusive. Transcription factors, initially thought to be prevented from binding by DNA methylation, have recently emerged as being able to shape DNA methylation patterns. RESULTS Here, we integrated the massive amount of data available from The Cancer Genome Atlas to predict transcription factors driving aberrant DNA methylation in 13 cancer types. We identified differentially methylated regions between cancer and matching healthy samples, searched for transcription factor motifs enriched in those regions and selected transcription factors with corresponding changes in gene expression. We predict transcription factors known to be involved in cancer as well as novel candidates to drive hypo-methylated regions such as FOXA1 and GATA3 in breast cancer, FOXA1 and TWIST1 in prostate cancer and NFE2L2 in lung cancer. We also predict transcription factors that lead to hyper-methylated regions upon transcription factor loss such as EGR1 in several cancer types. Finally, we validate that FOXA1 and GATA3 mediate hypo-methylated regions in breast cancer cells. CONCLUSION Our work highlights the importance of some transcription factors as upstream regulators shaping DNA methylation patterns in cancer.
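The selection step described in the abstract (motif enriched in differentially methylated regions, plus a matching change in the factor's own expression) can be sketched as a simple filter. This is only an illustration under stated assumptions: the record fields, the thresholds `min_enrichment` and `min_abs_log2fc`, and the placeholder names `TF_A` to `TF_C` are hypothetical and are not taken from the paper.

```python
def select_candidate_tfs(records, min_enrichment=1.5, min_abs_log2fc=1.0):
    """Keep TFs whose motif is enriched in DMRs and whose expression changes
    in the matching direction (hypothetical thresholds, for illustration).

    Each record is a dict with keys:
      tf                 - factor name
      region_type        - "hypo" or "hyper" (methylation change of the DMRs)
      motif_enrichment   - motif enrichment in those DMRs versus controls
      expression_log2fc  - log2 fold change of the TF gene, cancer vs healthy
    """
    candidates = []
    for rec in records:
        if rec["motif_enrichment"] < min_enrichment:
            continue  # motif not enriched enough in the DMRs
        log2fc = rec["expression_log2fc"]
        if rec["region_type"] == "hypo" and log2fc >= min_abs_log2fc:
            # Gain of the factor paired with loss of methylation.
            candidates.append((rec["tf"], "hypo", "gain"))
        elif rec["region_type"] == "hyper" and log2fc <= -min_abs_log2fc:
            # Loss of the factor paired with gain of methylation.
            candidates.append((rec["tf"], "hyper", "loss"))
    return candidates


# Toy records with made-up values, only to show the call shape.
toy_records = [
    {"tf": "TF_A", "region_type": "hypo", "motif_enrichment": 2.4, "expression_log2fc": 1.8},
    {"tf": "TF_B", "region_type": "hyper", "motif_enrichment": 1.9, "expression_log2fc": -2.1},
    {"tf": "TF_C", "region_type": "hypo", "motif_enrichment": 1.1, "expression_log2fc": 3.0},
]
print(select_candidate_tfs(toy_records))
```

In this toy run `TF_C` is dropped because its motif enrichment is below the assumed cut-off, even though its expression changes strongly.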
Affiliation(s)
- Dylane Detilleux
- UMR7242 Biotechnology and Cell Signaling, CNRS, University of Strasbourg, 67412, Illkirch, France
| | - Yannick G Spill
- UMR7242 Biotechnology and Cell Signaling, CNRS, University of Strasbourg, 67412, Illkirch, France
| | - Delphine Balaramane
- UMR7242 Biotechnology and Cell Signaling, CNRS, University of Strasbourg, 67412, Illkirch, France
| | - Michaël Weber
- UMR7242 Biotechnology and Cell Signaling, CNRS, University of Strasbourg, 67412, Illkirch, France.
| | - Anaïs Flore Bardet
- UMR7242 Biotechnology and Cell Signaling, CNRS, University of Strasbourg, 67412, Illkirch, France.
| |
|
6
|
Hou X, Zhu C, Xu M, Chen X, Sun C, Nashan B, Guang S, Feng X. The SNAPc complex mediates starvation-induced trans-splicing in Caenorhabditis elegans. J Genet Genomics 2022; 49:952-964. [DOI: 10.1016/j.jgg.2022.02.024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2021] [Revised: 02/20/2022] [Accepted: 02/22/2022] [Indexed: 11/16/2022]
|
7
|
Jansen C, Paraiso KD, Zhou JJ, Blitz IL, Fish MB, Charney RM, Cho JS, Yasuoka Y, Sudou N, Bright AR, Wlizla M, Veenstra GJC, Taira M, Zorn AM, Mortazavi A, Cho KWY. Uncovering the mesendoderm gene regulatory network through multi-omic data integration. Cell Rep 2022; 38:110364. [PMID: 35172134 PMCID: PMC8917868 DOI: 10.1016/j.celrep.2022.110364] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 12/18/2020] [Revised: 10/30/2021] [Accepted: 01/19/2022] [Indexed: 01/01/2023]
Abstract
Mesendodermal specification is one of the earliest events in embryogenesis, where cells first acquire distinct identities. Cell differentiation is a highly regulated process that involves the function of numerous transcription factors (TFs) and signaling molecules, which can be described with gene regulatory networks (GRNs). Cell differentiation GRNs are difficult to build because existing mechanistic methods are low throughput, and high-throughput methods tend to be non-mechanistic. Additionally, integrating highly dimensional data composed of more than two data types is challenging. Here, we use linked self-organizing maps to combine chromatin immunoprecipitation sequencing (ChIP-seq)/ATAC-seq with temporal, spatial, and perturbation RNA sequencing (RNA-seq) data from Xenopus tropicalis mesendoderm development to build a high-resolution genome scale mechanistic GRN. We recover both known and previously unsuspected TF-DNA/TF-TF interactions validated through reporter assays. Our analysis provides insights into transcriptional regulation of early cell fate decisions and provides a general approach to building GRNs using highly dimensional multi-omic datasets.
Affiliation(s)
- Camden Jansen
- Department of Developmental and Cell Biology, University of California, Irvine, CA, USA; Center for Complex Biological Systems, University of California, Irvine, CA, USA
| | - Kitt D Paraiso
- Department of Developmental and Cell Biology, University of California, Irvine, CA, USA; Center for Complex Biological Systems, University of California, Irvine, CA, USA
| | - Jeff J Zhou
- Department of Developmental and Cell Biology, University of California, Irvine, CA, USA
| | - Ira L Blitz
- Department of Developmental and Cell Biology, University of California, Irvine, CA, USA
| | - Margaret B Fish
- Department of Developmental and Cell Biology, University of California, Irvine, CA, USA
| | - Rebekah M Charney
- Department of Developmental and Cell Biology, University of California, Irvine, CA, USA
| | - Jin Sun Cho
- Department of Developmental and Cell Biology, University of California, Irvine, CA, USA
| | - Yuuri Yasuoka
- Laboratory for Comprehensive Genomic Analysis, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
| | - Norihiro Sudou
- Department of Anatomy, School of Medicine, Toho University, Tokyo, Japan
| | - Ann Rose Bright
- Department of Molecular Developmental Biology, Radboud University, Nijmegen, the Netherlands
| | - Marcin Wlizla
- Division of Developmental Biology, Department of Pediatrics, Cincinnati Children's Hospital Medical Center, University of Cincinnati College of Medicine, Cincinnati, OH, USA
| | - Gert Jan C Veenstra
- Department of Molecular Developmental Biology, Radboud University, Nijmegen, the Netherlands
| | - Masanori Taira
- Department of Biological Sciences, Chuo University, Tokyo, Japan
| | - Aaron M Zorn
- Division of Developmental Biology, Department of Pediatrics, Cincinnati Children's Hospital Medical Center, University of Cincinnati College of Medicine, Cincinnati, OH, USA
| | - Ali Mortazavi
- Department of Developmental and Cell Biology, University of California, Irvine, CA, USA; Center for Complex Biological Systems, University of California, Irvine, CA, USA.
| | - Ken W Y Cho
- Department of Developmental and Cell Biology, University of California, Irvine, CA, USA; Center for Complex Biological Systems, University of California, Irvine, CA, USA.
| |
|
8
|
Guest T, Haycocks JRJ, Warren GZL, Grainger DC. Genome-wide mapping of Vibrio cholerae VpsT binding identifies a mechanism for c-di-GMP homeostasis. Nucleic Acids Res 2021; 50:149-159. [PMID: 34908143 PMCID: PMC8754643 DOI: 10.1093/nar/gkab1194] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 10/04/2021] [Revised: 11/16/2021] [Accepted: 11/18/2021] [Indexed: 11/13/2022]
Abstract
Many bacteria use cyclic dimeric guanosine monophosphate (c-di-GMP) to control changes in lifestyle. The molecule, synthesized by proteins having diguanylate cyclase activity, is often a signal to transition from motile to sedentary behaviour. In Vibrio cholerae, c-di-GMP can exert its effects via the transcription factors VpsT and VpsR. Together, these proteins activate genes needed for V. cholerae to form biofilms. In this work, we have mapped the genome-wide distribution of VpsT in a search for further regulatory roles. We show that VpsT binds 23 loci and recognises a degenerate DNA palindrome having the consensus 5'-W$_{-5}$R$_{-4}$[CG]$_{-3}$Y$_{-2}$W$_{-1}$W$_{+1}$R$_{+2}$[GC]$_{+3}$Y$_{+4}$W$_{+5}$-3'. Most genes targeted by VpsT encode functions related to motility, biofilm formation, or c-di-GMP metabolism. Most notably, VpsT activates expression of the vpvABC operon that encodes a diguanylate cyclase. This creates a positive feedback loop needed to maintain intracellular levels of c-di-GMP. Mutation of the key VpsT binding site, upstream of vpvABC, severs the loop and c-di-GMP levels fall accordingly. Hence, as well as relaying the c-di-GMP signal, VpsT impacts c-di-GMP homeostasis.
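The degenerate consensus quoted in the abstract can be read as a ten-position IUPAC string (W = A/T, R = A/G, Y = C/T, S = C/G, so positions -5 to +5 become `WRSYWWRSYW`). The sketch below, which is only an illustration and not the authors' analysis, checks that this string equals its own reverse complement and scans a sequence for matches. The function names and the test sequence are hypothetical.

```python
import re

# IUPAC codes needed for the consensus, expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "[AT]", "S": "[CG]", "R": "[AG]", "Y": "[CT]", "N": "[ACGT]",
}
# Complement of each code (W and S are self-complementary, R and Y swap).
COMPLEMENT = {
    "A": "T", "T": "A", "C": "G", "G": "C",
    "W": "W", "S": "S", "R": "Y", "Y": "R", "N": "N",
}

# Positions -5..-1 and +1..+5 of the consensus; [CG] and [GC] are both S.
VPST_CONSENSUS = "WRSYWWRSYW"


def reverse_complement(motif):
    """Reverse complement of an IUPAC motif string."""
    return "".join(COMPLEMENT[base] for base in reversed(motif))


def motif_to_regex(motif):
    """Compile an IUPAC motif; the lookahead reports overlapping matches."""
    body = "".join(IUPAC[base] for base in motif)
    return re.compile("(?=(" + body + "))")


def find_sites(sequence, motif=VPST_CONSENSUS):
    """Return (0-based start, matched sequence) for every motif occurrence."""
    pattern = motif_to_regex(motif)
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


# The consensus is palindromic in the DNA sense.
print(reverse_complement(VPST_CONSENSUS) == VPST_CONSENSUS)
# Hypothetical test sequence containing one matching 10-mer.
print(find_sites("ggTACTAAGCTAcc"))
```

Because the motif is its own reverse complement, scanning one strand is sufficient; a non-palindromic motif would need a second scan with `reverse_complement(motif)`.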
Affiliation(s)
- Thomas Guest
- School of Biosciences, University of Birmingham, Edgbaston, Birmingham B15 2TT, UK
| | - James R J Haycocks
- School of Biosciences, University of Birmingham, Edgbaston, Birmingham B15 2TT, UK
| | - Gemma Z L Warren
- School of Biosciences, University of Birmingham, Edgbaston, Birmingham B15 2TT, UK
| | - David C Grainger
- School of Biosciences, University of Birmingham, Edgbaston, Birmingham B15 2TT, UK
| |
|
9
|
Lu Y, Wu Y, Liu Y, Li Y, Jing R, Li M. Prediction of disease-associated functional variants in noncoding regions through a comprehensive analysis by integrating datasets and features. Hum Mutat 2021; 42:667-684. [PMID: 33822436 DOI: 10.1002/humu.24203] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2020] [Revised: 02/01/2021] [Accepted: 03/31/2021] [Indexed: 02/01/2023]
Abstract
One of the greatest challenges in human genetics is deciphering the link between functional variants in noncoding sequences and the pathophysiology of complex diseases. To address this issue, many methods have been developed to distinguish functional single-nucleotide variants (SNVs) from neutral SNVs in noncoding regions. In this study, we integrated well-established features and commonly used datasets and merged them into large-scale datasets based on a random forest model, which yielded promising performance and outperformed some cutting-edge approaches. Our analyses of feature importance and data coverage also provide certain clues for future research in enhancing the prediction of functional noncoding SNVs.
Affiliation(s)
- Yu Lu
- College of Chemistry, Sichuan University, Chengdu, Sichuan, China
| | - Yiming Wu
- Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | - Yuan Liu
- College of Chemistry, Sichuan University, Chengdu, Sichuan, China
| | - Yizhou Li
- College of Chemistry, Sichuan University, Chengdu, Sichuan, China
| | - Runyu Jing
- College of Cybersecurity, Sichuan University, Chengdu, Sichuan, China
| | - Menglong Li
- College of Chemistry, Sichuan University, Chengdu, Sichuan, China
| |
|
10
|
Nakato R, Sakata T. Methods for ChIP-seq analysis: A practical workflow and advanced applications. Methods 2021; 187:44-53. [PMID: 32240773 DOI: 10.1016/j.ymeth.2020.03.005] [Citation(s) in RCA: 120] [Impact Index Per Article: 30.0] [Received: 02/18/2020] [Revised: 03/17/2020] [Accepted: 03/18/2020] [Indexed: 12/13/2022]
Abstract
Chromatin immunoprecipitation followed by sequencing (ChIP-seq) is a central method in epigenomic research. Genome-wide analysis of histone modifications, such as enhancer analysis and genome-wide chromatin state annotation, enables systematic analysis of how the epigenomic landscape contributes to cell identity, development, lineage specification, and disease. In this review, we first present a typical ChIP-seq analysis workflow, from quality assessment to chromatin-state annotation. We focus on practical, rather than theoretical, approaches for biological studies. Next, we outline various advanced ChIP-seq applications and introduce several state-of-the-art methods, including prediction of gene expression level and chromatin loops from epigenome data and data imputation. Finally, we discuss recently developed single-cell ChIP-seq analysis methodologies that elucidate the cellular diversity within complex tissues and cancers.
Affiliation(s)
- Ryuichiro Nakato
- Laboratory of Computational Genomics, Institute for Quantitative Biosciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo-ku, Tokyo 113-0032, Japan.
| | - Toyonori Sakata
- Laboratory of Genome Structure and Function, Institute for Quantitative Biosciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo-ku, Tokyo 113-0032, Japan.
| |
|
11
|
Leporcq C, Spill Y, Balaramane D, Toussaint C, Weber M, Bardet AF. TFmotifView: a webserver for the visualization of transcription factor motifs in genomic regions. Nucleic Acids Res 2020; 48:W208-W217. [PMID: 32324215 PMCID: PMC7319436 DOI: 10.1093/nar/gkaa252] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 02/20/2020] [Revised: 03/24/2020] [Accepted: 04/08/2020] [Indexed: 12/31/2022]
Abstract
Transcription factors (TFs) regulate gene expression. The binding specificities of many TFs have been deciphered and summarized as position-weight matrices, also called TF motifs. Despite the availability of hundreds of known TF motifs in databases, it remains non-trivial to quickly query and visualize the enrichment of known TF motifs in genomic regions of interest. Towards this goal, we developed TFmotifView, a web server that allows users to study the distribution of known TF motifs in genomic regions. Based on input genomic regions and selected TF motifs, TFmotifView performs an overlap of the genomic regions with TF motif occurrences identified using a dynamic P-value threshold. TFmotifView generates three different outputs: (i) an enrichment table and scatterplot calculating the significance of TF motif occurrences in genomic regions compared to control regions, (ii) a genomic view of the organisation of TF motifs in each genomic region and (iii) a metaplot summarizing the position of TF motifs relative to the center of the regions. TFmotifView will contribute to the integration of TF motif information with a wide range of genomic datasets, with the goal of better understanding the regulation of gene expression by transcription factors. TFmotifView is freely available at http://bardet.u-strasbg.fr/tfmotifview/.
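To make output (i) concrete, the sketch below compares how many input regions contain a motif with how many control regions do. It assumes a one-sided hypergeometric test, which is a common choice for this kind of comparison; the abstract does not state which statistic TFmotifView uses, so the test, the function names and the counts are all assumptions for illustration.

```python
from math import comb


def hypergeom_enrichment_p(hits_target, n_target, hits_control, n_control):
    """One-sided P(X >= hits_target) when the motif-containing regions are
    distributed at random between target and control sets (assumed test)."""
    total = n_target + n_control
    total_hits = hits_target + hits_control
    upper = min(n_target, total_hits)
    # math.comb returns 0 when k > n, so impossible terms vanish on their own.
    numerator = sum(
        comb(total_hits, k) * comb(total - total_hits, n_target - k)
        for k in range(hits_target, upper + 1)
    )
    return numerator / comb(total, n_target)


def enrichment_ratio(hits_target, n_target, hits_control, n_control):
    """Fraction of target regions with the motif divided by the control fraction."""
    control_fraction = hits_control / n_control
    if control_fraction == 0:
        return float("inf")
    return (hits_target / n_target) / control_fraction


# Made-up counts: 40 of 100 input regions versus 10 of 100 control regions.
print(enrichment_ratio(40, 100, 10, 100))
print(hypergeom_enrichment_p(40, 100, 10, 100))
```

Exact integer arithmetic via `math.comb` keeps the tail sum stable for the region counts typical of a single query; for very large sets a library routine working in log space would be the safer choice.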
Affiliation(s)
- Clémentine Leporcq
- CNRS, University of Strasbourg, UMR7242 Biotechnology and Cell Signaling, Illkirch 67412, France
| | - Yannick Spill
- CNRS, University of Strasbourg, UMR7242 Biotechnology and Cell Signaling, Illkirch 67412, France
| | - Delphine Balaramane
- CNRS, University of Strasbourg, UMR7242 Biotechnology and Cell Signaling, Illkirch 67412, France
| | - Christophe Toussaint
- CNRS, University of Strasbourg, UMR7242 Biotechnology and Cell Signaling, Illkirch 67412, France
| | - Michaël Weber
- CNRS, University of Strasbourg, UMR7242 Biotechnology and Cell Signaling, Illkirch 67412, France
| | - Anaïs Flore Bardet
- CNRS, University of Strasbourg, UMR7242 Biotechnology and Cell Signaling, Illkirch 67412, France
| |
|
12
|
Anzawa H, Yamagata H, Kinoshita K. Theoretical characterisation of strand cross-correlation in ChIP-seq. BMC Bioinformatics 2020; 21:417. [PMID: 32962634 PMCID: PMC7510163 DOI: 10.1186/s12859-020-03729-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 10/16/2019] [Accepted: 08/31/2020] [Indexed: 01/03/2023]
Abstract
BACKGROUND Strand cross-correlation profiles are used for both peak calling pre-analysis and quality control (QC) in chromatin immunoprecipitation followed by sequencing (ChIP-seq) analysis. Despite their potential for robust and accurate assessments of signal-to-noise ratio (S/N) because of their peak calling independence, it remains unclear what aspects of quality such strand cross-correlation profiles actually measure. RESULTS We introduced a simple model to simulate the mapped read-density of ChIP-seq and then derived the theoretical maximum and minimum of cross-correlation coefficients between strands. The results suggest that the maximum coefficient of typical ChIP-seq samples is directly proportional to the number of total mapped reads and the square of the ratio of signal reads, and inversely proportional to the number of peaks and the length of read-enriched regions. Simulation analysis supported our results, and evaluation using 790 ChIP-seq datasets obtained from the public database demonstrated high consistency between calculated cross-correlation coefficients and estimated coefficients based on the theoretical relations and peak calling results. In addition, we found that the mappability-bias-correction improved sensitivity, enabling differentiation of maximum coefficients from the noise level. Based on these insights, we proposed virtual S/N (VSN), a novel peak call-free metric for S/N assessment. We also developed PyMaSC, a tool to calculate strand cross-correlation and VSN efficiently. VSN achieved the most consistent S/N estimation for various ChIP targets and sequencing read depths. Furthermore, we demonstrated that a combination of VSN and pre-existing peak calling results enables the estimation of the numbers of detectable peaks for posterior experiments and the assessment of peak calling results.
CONCLUSIONS We present the first theoretical insights into the strand cross-correlation, and the results reveal the potential and the limitations of strand cross-correlation analysis. Our quality assessment framework using VSN provides peak call-independent QC and will help in the evaluation of peak call analysis in ChIP-seq experiments.
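To make the quantity discussed in this abstract concrete, the following is a minimal sketch of a strand cross-correlation profile: the Pearson correlation between forward-strand and reverse-strand read-start counts as the reverse strand is shifted by d bases. It is not PyMaSC and does not compute VSN; the function names and the toy data are assumptions made for illustration only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length count vectors (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def strand_cross_correlation(forward, reverse, max_shift):
    """Return {shift: correlation} comparing forward[i] with reverse[i + shift]."""
    n = len(forward)
    profile = {}
    for shift in range(max_shift + 1):
        profile[shift] = pearson(forward[: n - shift], reverse[shift:])
    return profile


# Toy data (hypothetical): two binding sites whose reverse-strand reads
# sit 5 bases downstream of the forward-strand reads.
forward = [0] * 60
reverse = [0] * 60
forward[10], forward[40] = 5, 3
reverse[15], reverse[45] = 5, 3

profile = strand_cross_correlation(forward, reverse, max_shift=20)
best_shift = max(profile, key=profile.get)
```

In this toy case the profile peaks at the shift equal to the offset between strands, which is the property that makes the maximum coefficient usable as a peak-call-free signal indicator.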
Affiliation(s)
- Hayato Anzawa
  - Graduate School of Information Sciences, Tohoku University, Sendai, Miyagi, Japan
- Hitoshi Yamagata
  - Advanced Research Laboratory, Canon Medical Systems Corporation, Otawara, Tochigi, Japan
- Kengo Kinoshita
  - Graduate School of Information Sciences, Tohoku University, Sendai, Miyagi, Japan
  - Advanced Research Laboratory, Canon Medical Systems Corporation, Otawara, Tochigi, Japan
  - Tohoku Medical Megabank Organization, Sendai, Miyagi, Japan
  - Institute of Development, Aging and Cancer, Tohoku University, Sendai, Miyagi, Japan
|
13
|
Chitpin JG, Awdeh A, Perkins TJ. RECAP reveals the true statistical significance of ChIP-seq peak calls. Bioinformatics 2020; 35:3592-3598. [PMID: 30824903 PMCID: PMC6761936 DOI: 10.1093/bioinformatics/btz150] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 09/03/2018] [Revised: 01/18/2019] [Accepted: 02/27/2019] [Indexed: 12/29/2022] Open
Abstract
Motivation Chromatin Immunoprecipitation (ChIP)-seq is used extensively to identify sites of transcription factor binding or regions of epigenetic modifications to the genome. A key step in ChIP-seq analysis is peak calling, where genomic regions enriched for ChIP versus control reads are identified. Many programs have been designed to solve this task, but nearly all fall into the statistical trap of using the data twice: once to determine candidate enriched regions, and again to assess enrichment by classical statistical hypothesis testing. This double use of the data invalidates the statistical significance assigned to enriched regions, thus the true significance or reliability of peak calls remains unknown. Results Using simulated and real ChIP-seq data, we show that three well-known peak callers, MACS, SICER and diffReps, output biased P-values and false discovery rate estimates that can be many orders of magnitude too optimistic. We propose a wrapper algorithm, RECAP, that uses resampling of ChIP-seq and control data to estimate a monotone transform correcting for biases built into peak calling algorithms. When applied to null hypothesis data, where there is no enrichment between ChIP-seq and control, P-values recalibrated by RECAP are approximately uniformly distributed. On data where there is genuine enrichment, RECAP P-values give a better estimate of the true statistical significance of candidate peaks and better false discovery rate estimates, which correlate better with empirical reproducibility. RECAP is a powerful new tool for assessing the true statistical significance of ChIP-seq peak calls. Availability and implementation The RECAP software is available through www.perkinslab.ca or on github at https://github.com/theodorejperkins/RECAP. Supplementary information Supplementary data are available at Bioinformatics online.
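The recalibration idea in this abstract can be illustrated with a small sketch. It is not the RECAP software: it assumes that P-values from a peak caller run on resampled (null) data are already available, and maps each observed P-value to the fraction of null P-values that are at least as small, which is one simple monotone transform. The function name and the numbers are hypothetical.

```python
from bisect import bisect_right


def recalibrate(observed_pvalues, null_pvalues):
    """Map each observed P-value to the empirical CDF of the null P-values.

    null_pvalues: P-values the peak caller reported on resampled data in which
    no true enrichment exists (assumed to be supplied by the caller).
    """
    null_sorted = sorted(null_pvalues)
    total = len(null_sorted)
    # bisect_right counts how many null P-values are <= the observed one.
    return [bisect_right(null_sorted, p) / total for p in observed_pvalues]


# Hypothetical example: the caller reports very small P-values even on null data,
# so an observed 1e-5 is far less significant than it looks.
null = [1e-6, 1e-4, 1e-3, 0.01, 0.1]
recalibrated = recalibrate([1e-5, 0.05], null)
```

Because the mapping is non-decreasing, the ranking of peaks is preserved while the reported significance is rescaled against what the caller produces under the null.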
Affiliation(s)
- Justin G Chitpin
  - Translational and Molecular Medicine Program, University of Ottawa, Ottawa, ON K1H8M5, Canada
  - Regenerative Medicine Program, Ottawa Hospital Research Institute, Ottawa, ON K1H8L6, Canada
- Aseel Awdeh
  - Regenerative Medicine Program, Ottawa Hospital Research Institute, Ottawa, ON K1H8L6, Canada
  - School of Electrical Engineering and Computer Science, University of Ottawa, Ottawa, ON K1N6N5, Canada
- Theodore J Perkins
  - Regenerative Medicine Program, Ottawa Hospital Research Institute, Ottawa, ON K1H8L6, Canada
  - School of Electrical Engineering and Computer Science, University of Ottawa, Ottawa, ON K1N6N5, Canada
  - Department of Biochemistry, Microbiology and Immunology, University of Ottawa, Ottawa, ON K1H8M5, Canada
|
14
|
Sharma V, Majumdar S. Comparative analysis of ChIP-exo peak-callers: impact of data quality, read duplication and binding subtypes. BMC Bioinformatics 2020; 21:65. [PMID: 32085702 PMCID: PMC7035708 DOI: 10.1186/s12859-020-3403-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 07/30/2019] [Accepted: 02/10/2020] [Indexed: 01/26/2023] Open
Abstract
Background ChIP (Chromatin immunoprecipitation)-exo has emerged as an important and versatile improvement over conventional ChIP-seq as it reduces the level of noise, maps the transcription factor (TF) binding location in a very precise manner, up to single base-pair resolution, and enables binding mode prediction. Availability of numerous peak-callers for analyzing ChIP-exo reads has motivated the need to assess their performance and report which tool executes reasonably well for the task. Results This study has focussed on comparing peak-callers that report direct binding events with those that report indirect binding events. The effect of strandedness of reads and duplication of data on the performance of peak-callers has been investigated. The number of peaks reported by each peak-caller is compared, followed by a comparison of the annotated motifs present in the reported peaks. The significance of peaks is assessed based on the presence of a motif in top peaks. Indirect binding tools have been compared on the basis of their ability to identify annotated motifs and predict mode of protein-DNA interaction. Conclusion By studying the output of the peak-callers investigated in this study, it is concluded that the tools that use self-learning algorithms, i.e. the tools that estimate all the essential parameters from the aligned reads, perform better than the algorithms which require formation of peak-pairs. The latest tools that account for indirect binding of TFs appear to be an upgrade over the available tools, as they are able to reveal valuable information about the mode of binding in addition to direct binding. Furthermore, the quality of ChIP-exo reads has important consequences on the output of data analysis.
Affiliation(s)
- Vasudha Sharma
  - Discipline of Biological Engineering, Indian Institute of Technology Gandhinagar, Palaj, Gujarat, 382355, India
- Sharmistha Majumdar
  - Discipline of Biological Engineering, Indian Institute of Technology Gandhinagar, Palaj, Gujarat, 382355, India
|
15
|
Yamada N, Lai WKM, Farrell N, Pugh BF, Mahony S. Characterizing protein-DNA binding event subtypes in ChIP-exo data. Bioinformatics 2019; 35:903-913. [PMID: 30165373 DOI: 10.1093/bioinformatics/bty703] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 02/28/2018] [Revised: 07/14/2018] [Accepted: 08/23/2018] [Indexed: 01/21/2023] Open
Abstract
MOTIVATION Regulatory proteins associate with the genome either by directly binding cognate DNA motifs or via protein-protein interactions with other regulators. Each recruitment mechanism may be associated with distinct motifs and may also result in distinct characteristic patterns in high-resolution protein-DNA binding assays. For example, the ChIP-exo protocol precisely characterizes protein-DNA crosslinking patterns by combining chromatin immunoprecipitation (ChIP) with 5' → 3' exonuclease digestion. Since different regulatory complexes will result in different protein-DNA crosslinking signatures, analysis of ChIP-exo tag enrichment patterns should enable detection of multiple protein-DNA binding modes for a given regulatory protein. However, current ChIP-exo analysis methods either treat all binding events as being of a uniform type or rely on motifs to cluster binding events into subtypes. RESULTS To systematically detect multiple protein-DNA interaction modes in a single ChIP-exo experiment, we introduce the ChIP-exo mixture model (ChExMix). ChExMix probabilistically models the genomic locations and subtype memberships of binding events using both ChIP-exo tag distribution patterns and DNA motifs. We demonstrate that ChExMix achieves accurate detection and classification of binding event subtypes using in silico mixed ChIP-exo data. We further demonstrate the unique analysis abilities of ChExMix using a collection of ChIP-exo experiments that profile the binding of key transcription factors in MCF-7 cells. In these data, ChExMix identifies possible recruitment mechanisms of FoxA1 and ERα, thus demonstrating that ChExMix can effectively stratify ChIP-exo binding events into biologically meaningful subtypes. AVAILABILITY AND IMPLEMENTATION ChExMix is available from https://github.com/seqcode/chexmix. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Naomi Yamada
  - Department of Biochemistry & Molecular Biology and Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA, USA
- William K M Lai
  - Department of Biochemistry & Molecular Biology and Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA, USA
- Nina Farrell
  - Department of Biochemistry & Molecular Biology and Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA, USA
- B Franklin Pugh
  - Department of Biochemistry & Molecular Biology and Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA, USA
- Shaun Mahony
  - Department of Biochemistry & Molecular Biology and Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA, USA
|
16
|
Toivonen J, Kivioja T, Jolma A, Yin Y, Taipale J, Ukkonen E. Modular discovery of monomeric and dimeric transcription factor binding motifs for large data sets. Nucleic Acids Res 2019; 46:e44. [PMID: 29385521 PMCID: PMC5934673 DOI: 10.1093/nar/gky027] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 12/01/2017] [Accepted: 01/12/2018] [Indexed: 01/06/2023] Open
Abstract
In some dimeric cases of transcription factor (TF) binding, the specificity of dimeric motifs has been observed to differ notably from what would be expected were the two factors to bind to DNA independently of each other. Current motif discovery methods are unable to learn monomeric and dimeric motifs in modular fashion such that deviations from the expected motif would become explicit and the noise from dimeric occurrences would not corrupt monomeric models. We propose a novel modeling technique and an expectation maximization algorithm, implemented as software tool MODER, for discovering monomeric TF binding motifs and their dimeric combinations. Given training data and seeds for monomeric motifs, the algorithm learns in the same probabilistic framework a mixture model which represents monomeric motifs as standard position-specific probability matrices (PPMs), and dimeric motifs as pairs of monomeric PPMs, with associated orientation and spacing preferences. For dimers the model represents deviations from pure modular model of two independent monomers, thus making co-operative binding effects explicit. MODER can analyze in reasonable time tens of Mbps of training data. We validated the tool on HT-SELEX and ChIP-seq data. Our findings include some TFs whose expected model has palindromic symmetry but the observed model is directional.
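As an illustration of the "pure modular" expectation that this abstract contrasts with observed dimeric motifs, the sketch below scores a site under two independent monomeric position-specific probability matrices (PPMs) separated by a fixed gap. It is not MODER: the function names, the uniform background for gap positions, and the toy matrices are assumptions, and the learned deviations, orientations and spacing preferences described in the abstract are not modelled.

```python
def site_probability(ppm, site):
    """Probability of a site under a PPM given as a list of {base: probability} columns."""
    if len(ppm) != len(site):
        raise ValueError("site length must match the number of PPM columns")
    probability = 1.0
    for column, base in zip(ppm, site):
        probability *= column[base]
    return probability


def modular_dimer_probability(left, right, gap, site, background=0.25):
    """Expected dimer model: two independent monomers with `gap` background positions between them."""
    if len(site) != len(left) + gap + len(right):
        raise ValueError("site length must equal left + gap + right")
    left_part = site[: len(left)]
    right_part = site[len(left) + gap:]
    return (
        site_probability(left, left_part)
        * background ** gap
        * site_probability(right, right_part)
    )


# Toy two-column monomer (hypothetical values) that prefers "AC".
monomer = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
]
expected = modular_dimer_probability(monomer, monomer, gap=2, site="ACGTAC")
```

Comparing such an expected probability with the frequencies actually observed for dimeric sites is what would make cooperative binding effects visible as deviations.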
Affiliation(s)
- Jarkko Toivonen
  - Department of Computer Science, P.O. Box 68, FI-00014 University of Helsinki, Helsinki, Finland
- Teemu Kivioja
  - Genome-Scale Biology Program, P.O. Box 63, FI-00014 University of Helsinki, Helsinki, Finland
- Arttu Jolma
  - Division of Functional Genomics and Systems Biology, Department of Medical Biochemistry and Biophysics, and Department of Biosciences and Nutrition, Karolinska Institutet, SE 141 83 Stockholm, Sweden
- Yimeng Yin
  - Division of Functional Genomics and Systems Biology, Department of Medical Biochemistry and Biophysics, and Department of Biosciences and Nutrition, Karolinska Institutet, SE 141 83 Stockholm, Sweden
- Jussi Taipale
  - Genome-Scale Biology Program, P.O. Box 63, FI-00014 University of Helsinki, Helsinki, Finland
  - Division of Functional Genomics and Systems Biology, Department of Medical Biochemistry and Biophysics, and Department of Biosciences and Nutrition, Karolinska Institutet, SE 141 83 Stockholm, Sweden
  - Department of Biochemistry, University of Cambridge, CB2 1GA Cambridge, UK
- Esko Ukkonen
  - Department of Computer Science, P.O. Box 68, FI-00014 University of Helsinki, Helsinki, Finland
  - Helsinki Institute for Information Technology HIIT, University of Helsinki & Aalto University, Helsinki, Finland
|
17
|
Eggeling R. Disentangling transcription factor binding site complexity. Nucleic Acids Res 2019; 46:e121. [PMID: 30085218 PMCID: PMC6237759 DOI: 10.1093/nar/gky683] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 05/04/2018] [Accepted: 07/17/2018] [Indexed: 12/15/2022] Open
Abstract
The binding motifs of many transcription factors (TFs) comprise a higher degree of complexity than a single position weight matrix model permits. Additional complexity is typically taken into account either as intra-motif dependencies via more sophisticated probabilistic models or as heterogeneities via multiple weight matrices. However, both orthogonal approaches have limitations when learning from in vivo data where binding sites of other factors in close proximity can interfere with motif discovery for the protein of interest. In this work, we demonstrate how intra-motif complexity can, purely by analyzing the statistical properties of a given set of TF-binding sites, be distinguished from complexity arising from an intermix with motifs of co-binding TFs or other artifacts. In addition, we study the related question whether intra-motif complexity is represented more effectively by dependencies, heterogeneities or variants in between. Benchmarks demonstrate the effectiveness of both methods for their respective tasks and applications on motif discovery output from recent tools detect and correct many undesirable artifacts. These results further suggest that the prevalence of intra-motif dependencies may have been overestimated in previous studies on in vivo data and should thus be reassessed.
Affiliation(s)
- Ralf Eggeling
  - Department of Computer Science, University of Helsinki, Gustaf-Hällströmin katu 2b, FIN-00140 Helsinki, Finland
|
18
|
Pinter N, Hach CA, Hampel M, Rekhter D, Zienkiewicz K, Feussner I, Poehlein A, Daniel R, Finkernagel F, Heimel K. Signal peptide peptidase activity connects the unfolded protein response to plant defense suppression by Ustilago maydis. PLoS Pathog 2019; 15:e1007734. [PMID: 30998787 PMCID: PMC6490947 DOI: 10.1371/journal.ppat.1007734] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 12/03/2018] [Revised: 04/30/2019] [Accepted: 03/27/2019] [Indexed: 11/18/2022] Open
Abstract
The corn smut fungus Ustilago maydis requires the unfolded protein response (UPR) to maintain homeostasis of the endoplasmic reticulum (ER) during the biotrophic interaction with its host plant Zea mays (maize). Crosstalk between the UPR and pathways controlling pathogenic development is mediated by protein-protein interactions between the UPR regulator Cib1 and the developmental regulator Clp1. Cib1/Clp1 complex formation results in mutual modification of the connected regulatory networks thereby aligning fungal proliferation in planta, efficient effector secretion with increased ER stress tolerance and long-term UPR activation in planta. Here we address UPR-dependent gene expression and its modulation by Clp1 using combinatorial RNAseq/ChIPseq analyses. We show that increased ER stress resistance is connected to Clp1-dependent alterations of Cib1 phosphorylation, protein stability and UPR gene expression. Importantly, we identify by deletion screening of UPR core genes the signal peptide peptidase Spp1 as a novel key factor that is required for establishing a compatible biotrophic interaction between U. maydis and its host plant maize. Spp1 is dispensable for ER stress resistance and vegetative growth but requires catalytic activity to interfere with the plant defense, revealing a novel virulence specific function for signal peptide peptidases in a biotrophic fungal/plant interaction. Biotrophic pathogens establish compatible interactions with their host to cause disease. A critical step in this process is the suppression of plant defense responses by secreted effector proteins. In the maize infecting fungus Ustilago maydis expression of effector encoding genes is coordinately upregulated at defined stages of pathogenic development in so-called effector waves. Efficient secretion of the multitude of effectors relies on the unfolded protein response (UPR) to maintain homeostasis of the endoplasmic reticulum. 
Activation of the UPR is connected to the control of fungal proliferation through direct protein-protein interactions between the UPR regulator Cib1 and the developmental regulator Clp1. Here, we show that this interaction leads to functional modification of Cib1 and modulation of UPR gene expression to adapt the UPR for long-term activity in the plant. Within a core set of UPR regulated genes we identify the signal peptide peptidase Spp1 as a key factor for fungal virulence. We show that Spp1 requires its conserved catalytic activity to suppress the plant defense and cause disease. The virulence specific function of Spp1 does not involve pathways previously known to be associated with Spp1-like proteins or plant defense suppression, suggesting a novel role for Spp1 substrates in biotrophic interactions.
Affiliation(s)
- Niko Pinter
  - Department of Molecular Microbiology and Genetics, Institute of Microbiology and Genetics, Göttingen Center for Molecular Biosciences (GZMB), University of Göttingen, Göttingen, Germany
- Christina Andrea Hach
  - Department of Molecular Microbiology and Genetics, Institute of Microbiology and Genetics, Göttingen Center for Molecular Biosciences (GZMB), University of Göttingen, Göttingen, Germany
- Martin Hampel
  - Department of Molecular Microbiology and Genetics, Institute of Microbiology and Genetics, Göttingen Center for Molecular Biosciences (GZMB), University of Göttingen, Göttingen, Germany
- Dmitrij Rekhter
  - Department of Plant Biochemistry, Albrecht-von-Haller-Institute for Plant Sciences, Göttingen Center for Molecular Biosciences (GZMB), University of Göttingen, Göttingen, Germany
- Krzysztof Zienkiewicz
  - Department of Plant Biochemistry, Albrecht-von-Haller-Institute for Plant Sciences, Göttingen Center for Molecular Biosciences (GZMB), University of Göttingen, Göttingen, Germany
  - Service Unit for Metabolomics and Lipidomics, Göttingen Center for Molecular Biosciences (GZMB), University of Göttingen, Göttingen, Germany
- Ivo Feussner
  - Department of Plant Biochemistry, Albrecht-von-Haller-Institute for Plant Sciences, Göttingen Center for Molecular Biosciences (GZMB), University of Göttingen, Göttingen, Germany
  - Service Unit for Metabolomics and Lipidomics, Göttingen Center for Molecular Biosciences (GZMB), University of Göttingen, Göttingen, Germany
- Anja Poehlein
  - Department of Genomic and Applied Microbiology & Göttingen Genomics Laboratory, Institute of Microbiology and Genetics, University of Göttingen, Göttingen, Germany
- Rolf Daniel
  - Department of Genomic and Applied Microbiology & Göttingen Genomics Laboratory, Institute of Microbiology and Genetics, University of Göttingen, Göttingen, Germany
- Florian Finkernagel
  - Center for Tumor Biology and Immunology (ZTI), Institute of Molecular Biology and Tumor Research (IMT), Marburg, Germany
- Kai Heimel
  - Department of Molecular Microbiology and Genetics, Institute of Microbiology and Genetics, Göttingen Center for Molecular Biosciences (GZMB), University of Göttingen, Göttingen, Germany
|
19
|
Venters BJ. Insights from resolving protein-DNA interactions at near base-pair resolution. Brief Funct Genomics 2019; 17:80-88. [PMID: 29211822 DOI: 10.1093/bfgp/elx043] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 01/09/2023] Open
Abstract
One of the central goals in molecular biology is to understand how cell-type-specific expression patterns arise through selective recruitment of RNA polymerase II (Pol II) to a subset of gene promoters. Pol II needs to be recruited to a precise genomic position at the proper time to produce messenger RNA from a DNA template. Ostensibly, transcription is a relatively simple cellular process; yet, experimentally measuring and then understanding the combinatorial possibilities of transcriptional regulators remain a daunting task. Since its introduction in 1985, chromatin immunoprecipitation (ChIP) has remained a key tool for investigating protein-DNA contacts in vivo. Over 30 years of intensive research using ChIP have provided numerous insights into mechanisms of gene regulation. As functional genomic technologies improve, they present new opportunities to address key biological questions. ChIP-exo is a refined version of ChIP-seq that significantly reduces background signal, while providing near base-pair mapping resolution for protein-DNA interactions. This review discusses the evolution of the ChIP assay over the years and the methodological differences between ChIP-seq, ChIP-exo and ChIP-nexus, and highlights new insights into epigenetic and transcriptional mechanisms that were uniquely enabled with the near base-pair resolution of ChIP-exo.
|
20
|
Hartl D, Krebs AR, Grand RS, Baubec T, Isbel L, Wirbelauer C, Burger L, Schübeler D. CG dinucleotides enhance promoter activity independent of DNA methylation. Genome Res 2019; 29:554-563. [PMID: 30709850 PMCID: PMC6442381 DOI: 10.1101/gr.241653.118] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.8] [Received: 07/16/2018] [Accepted: 01/24/2019] [Indexed: 11/24/2022]
Abstract
Most mammalian RNA polymerase II initiation events occur at CpG islands, which are rich in CpGs and devoid of DNA methylation. Despite their relevance for gene regulation, it is unknown to what extent the CpG dinucleotide itself actually contributes to promoter activity. To address this question, we determined the transcriptional activity of a large number of chromosomally integrated promoter constructs and monitored binding of transcription factors assumed to play a role in CpG island activity. This revealed that CpG density significantly improves motif-based prediction of transcription factor binding. Our experiments also show that high CpG density alone is insufficient for transcriptional activity, yet results in increased transcriptional output when combined with particular transcription factor motifs. However, this CpG contribution to promoter activity is independent of DNA methyltransferase activity. Together, this refines our understanding of mammalian promoter regulation as it shows that high CpG density within CpG islands directly contributes to an environment permissive for full transcriptional activity.
Affiliation(s)
- Dominik Hartl
  - Friedrich Miescher Institute for Biomedical Research, CH 4058 Basel, Switzerland
  - Faculty of Sciences, University of Basel, CH 4003 Basel, Switzerland
- Arnaud R Krebs
  - Friedrich Miescher Institute for Biomedical Research, CH 4058 Basel, Switzerland
- Ralph S Grand
  - Friedrich Miescher Institute for Biomedical Research, CH 4058 Basel, Switzerland
- Tuncay Baubec
  - Friedrich Miescher Institute for Biomedical Research, CH 4058 Basel, Switzerland
- Luke Isbel
  - Friedrich Miescher Institute for Biomedical Research, CH 4058 Basel, Switzerland
- Lukas Burger
  - Friedrich Miescher Institute for Biomedical Research, CH 4058 Basel, Switzerland
  - Swiss Institute of Bioinformatics, CH 4058 Basel, Switzerland
- Dirk Schübeler
  - Friedrich Miescher Institute for Biomedical Research, CH 4058 Basel, Switzerland
  - Faculty of Sciences, University of Basel, CH 4003 Basel, Switzerland
|
21
|
Fu S, Wang Q, Moore JE, Purcaro MJ, Pratt HE, Fan K, Gu C, Jiang C, Zhu R, Kundaje A, Lu A, Weng Z. Differential analysis of chromatin accessibility and histone modifications for predicting mouse developmental enhancers. Nucleic Acids Res 2018; 46:11184-11201. [PMID: 30137428 PMCID: PMC6265487 DOI: 10.1093/nar/gky753] [Citation(s) in RCA: 31] [Impact Index Per Article: 4.4] [Received: 02/02/2018] [Revised: 07/15/2018] [Accepted: 08/08/2018] [Indexed: 12/11/2022] Open
Abstract
Enhancers are distal cis-regulatory elements that modulate gene expression. They are depleted of nucleosomes and enriched in specific histone modifications; thus, calling DNase-seq and histone mark ChIP-seq peaks can predict enhancers. We evaluated nine peak-calling algorithms for predicting enhancers validated by transgenic mouse assays. DNase and H3K27ac peaks were consistently more predictive than H3K4me1/2/3 and H3K9ac peaks. DFilter and Hotspot2 were the best DNase peak callers, while HOMER, MUSIC, MACS2, DFilter and F-seq were the best H3K27ac peak callers. We observed that the differential DNase or H3K27ac signals between two distant tissues increased the area under the precision-recall curve (PR-AUC) of DNase peaks by 17.5-166.7% and that of H3K27ac peaks by 7.1-22.2%. We further improved this differential signal method using multiple contrast tissues. Evaluated using a blind test, the differential H3K27ac signal method substantially improved PR-AUC from 0.48 to 0.75 for predicting heart enhancers. We further validated our approach using postnatal retina and cerebral cortex enhancers identified by massively parallel reporter assays, and observed improvements for both tissues. In summary, we compared nine peak callers and devised a superior method for predicting tissue-specific mouse developmental enhancers by reranking the called peaks.
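The PR-AUC figures quoted in this abstract summarise how well a ranked list of peaks recovers validated enhancers. The sketch below computes average precision, one common estimator of the area under the precision-recall curve; whether the authors used this exact estimator is not stated in the abstract, and the function name and toy scores are hypothetical.

```python
def average_precision(scores, labels):
    """Average precision of a ranking: mean of precision values at each true positive.

    scores: higher means the peak is ranked earlier.
    labels: 1 if the peak overlaps a validated enhancer, 0 otherwise.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank  # precision at this recall step
    return precision_sum / hits if hits else 0.0


# Toy ranking (hypothetical): positives at ranks 1 and 3 out of 4 peaks.
ap = average_precision([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
```

Reranking the same set of called peaks, as the differential-signal method in the abstract does, changes only the order of the labels and therefore this score, not the set of peaks itself.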
Affiliation(s)
- Shaliu Fu
  - Clinical Translational Research Center, Shanghai Pulmonary Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
- Qin Wang
  - Clinical Translational Research Center, Shanghai Pulmonary Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
- Jill E Moore
  - Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA
- Michael J Purcaro
  - Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA
- Henry E Pratt
  - Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA
- Kaili Fan
  - Clinical Translational Research Center, Shanghai Pulmonary Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
- Cuihua Gu
  - Clinical Translational Research Center, Shanghai Pulmonary Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
- Cizhong Jiang
  - Clinical Translational Research Center, Shanghai Pulmonary Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
- Ruixin Zhu
  - Clinical Translational Research Center, Shanghai Pulmonary Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
- Anshul Kundaje
  - Department of Genetics, School of Medicine, Department of Computer Science, Stanford University, Stanford, CA 94305, USA
- Aiping Lu
  - Clinical Translational Research Center, Shanghai Pulmonary Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
- Zhiping Weng
  - Clinical Translational Research Center, Shanghai Pulmonary Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
  - Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA
|
22
|
Salekin S, Zhang JM, Huang Y. Base-pair resolution detection of transcription factor binding site by deep deconvolutional network. Bioinformatics 2018; 34:3446-3453. [PMID: 29757349 PMCID: PMC6184544 DOI: 10.1093/bioinformatics/bty383] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 08/31/2017] [Revised: 03/05/2018] [Accepted: 05/05/2018] [Indexed: 02/01/2023] Open
Abstract
Motivation Transcription factors (TFs) bind to the promoter region of a gene to control gene expression. Identifying precise TF binding sites (TFBSs) is essential for understanding the detailed mechanisms of TF-mediated gene regulation. However, there is a shortage of computational approaches that can deliver single base pair resolution prediction of TFBSs. Results In this paper, we propose DeepSNR, a Deep Learning algorithm for predicting TF binding location at Single Nucleotide Resolution de novo from DNA sequence. DeepSNR adopts a novel deconvolutional network (deconvNet) model and is inspired by the similarity to image segmentation by deconvNet. The proposed deconvNet architecture is constructed on top of 'DeepBind' and we trained the entire model using TF-specific data from ChIP-exonuclease (ChIP-exo) experiments. DeepSNR has been shown to outperform motif search-based methods for several evaluation metrics. We have also demonstrated the usefulness of DeepSNR in the regulatory analysis of TFBSs as well as in improving the TFBS prediction specificity using ChIP-seq data. Availability and implementation DeepSNR is available open source in the GitHub repository (https://github.com/sirajulsalekin/DeepSNR). Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Sirajul Salekin
  - Electrical and Computer Engineering Department, University of Texas at San Antonio, San Antonio, TX, USA
- Jianqiu Michelle Zhang
  - Electrical and Computer Engineering Department, University of Texas at San Antonio, San Antonio, TX, USA
- Yufei Huang
  - Electrical and Computer Engineering Department, University of Texas at San Antonio, San Antonio, TX, USA
  - Department of Epidemiology and Biostatistics, University of Texas Health Science Center, San Antonio, TX, USA
|
23
|
Martin-Herranz DE, Ribeiro AJM, Krueger F, Thornton JM, Reik W, Stubbs TM. cuRRBS: simple and robust evaluation of enzyme combinations for reduced representation approaches. Nucleic Acids Res 2017; 45:11559-11569. [PMID: 29036576 PMCID: PMC5714207 DOI: 10.1093/nar/gkx814] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 07/11/2017] [Accepted: 09/04/2017] [Indexed: 11/14/2022] Open
Abstract
DNA methylation is an important epigenetic modification in many species that is critical for development, and implicated in ageing and many complex diseases, such as cancer. Many cost-effective genome-wide analyses of DNA modifications rely on restriction enzymes capable of digesting genomic DNA at defined sequence motifs. There are hundreds of restriction enzyme families but few are used to date, because no tool is available for the systematic evaluation of restriction enzyme combinations that can enrich for certain sites of interest in a genome. Herein, we present customised Reduced Representation Bisulfite Sequencing (cuRRBS), a novel and easy-to-use computational method that solves this problem. By computing the optimal enzymatic digestions and size selection steps required, cuRRBS generalises the traditional MspI-based Reduced Representation Bisulfite Sequencing (RRBS) protocol to all restriction enzyme combinations. In addition, cuRRBS estimates the fold-reduction in sequencing costs and provides a robustness value for the personalised RRBS protocol, allowing users to tailor the protocol to their experimental needs. Moreover, we show in silico that cuRRBS-defined restriction enzymes consistently out-perform MspI digestion in many biological systems, considering both CpG and CHG contexts. Finally, we have validated the accuracy of cuRRBS predictions for single and double enzyme digestions using two independent experimental datasets.
Affiliation(s)
- Daniel E Martin-Herranz
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
- António J M Ribeiro
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Felix Krueger
  - Bioinformatics Group, The Babraham Institute, Cambridge CB22 3AT, UK
- Janet M Thornton
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Wolf Reik
  - Epigenetics Programme, The Babraham Institute, Cambridge CB22 3AT, UK
  - Centre for Trophoblast Research, University of Cambridge, Cambridge CB2 3EG, UK
  - Wellcome Trust Sanger Institute, Hinxton CB10 1SA, UK
- Thomas M Stubbs
  - Epigenetics Programme, The Babraham Institute, Cambridge CB22 3AT, UK
|
24
|
Stanton KP, Jin J, Lederman RR, Weissman SM, Kluger Y. Ritornello: high fidelity control-free chromatin immunoprecipitation peak calling. Nucleic Acids Res 2017; 45:e173. [PMID: 28981893 PMCID: PMC5716106 DOI: 10.1093/nar/gkx799] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 12/23/2015] [Accepted: 08/30/2017] [Indexed: 02/03/2023] Open
Abstract
With the advent of next-generation high-throughput DNA sequencing technologies, omics experiments have become the mainstay for studying diverse biological effects on a genome-wide scale. Chromatin immunoprecipitation followed by sequencing (ChIP-seq) is the omics technique that enables genome-wide localization of transcription factor (TF) binding or epigenetic modification events. Since the inception of ChIP-seq in 2007, many methods have been developed to infer ChIP-target binding loci from the resultant reads after mapping them to a reference genome. However, interpreting these data has proven challenging, and these algorithms have several shortcomings, including susceptibility to false positives due to artifactual peaks, poor localization of binding sites, and the requirement for a total DNA input control, which increases the cost of performing these experiments. We present Ritornello, a new approach for finding TF-binding sites in ChIP-seq, with roots in digital signal processing, that addresses all of these problems. We show that Ritornello generally performs as well as or better than the peak callers tested and recommended by the ENCODE consortium; in contrast to them, however, Ritornello does not require a matched total DNA input control to avoid false positives, effectively decreasing the sequencing cost of performing ChIP-seq. Ritornello is freely available at https://github.com/KlugerLab/Ritornello.
Affiliation(s)
- Kelly P Stanton
  - Department of Pathology, Yale University School of Medicine, 333 Cedar Street, New Haven, CT 06520, USA
  - Interdepartmental Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06511, USA
- Jiaqi Jin
  - Department of Genetics, Yale University School of Medicine, 333 Cedar Street, New Haven, CT 06520, USA
- Roy R Lederman
  - Program of Applied Mathematics, Yale University, 51 Prospect Street, New Haven, CT 06511, USA
  - Department of Mathematics and PACM, Princeton University, Fine Hall, Washington Road, Princeton, NJ 08544-1000, USA
- Sherman M Weissman
  - Department of Genetics, Yale University School of Medicine, 333 Cedar Street, New Haven, CT 06520, USA
- Yuval Kluger
  - Department of Pathology, Yale University School of Medicine, 333 Cedar Street, New Haven, CT 06520, USA
  - Interdepartmental Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06511, USA
  - Program of Applied Mathematics, Yale University, 51 Prospect Street, New Haven, CT 06511, USA
|
25
|
Nakato R, Shirahige K. Recent advances in ChIP-seq analysis: from quality management to whole-genome annotation. Brief Bioinform 2017; 18:279-290. [PMID: 26979602 PMCID: PMC5444249 DOI: 10.1093/bib/bbw023] [Citation(s) in RCA: 78] [Impact Index Per Article: 9.8] [Received: 11/24/2015] [Indexed: 02/06/2023] Open
Abstract
Chromatin immunoprecipitation followed by sequencing (ChIP-seq) analysis can detect protein/DNA-binding and histone-modification sites across an entire genome. Recent advances in sequencing technologies and analyses enable us to compare hundreds of samples simultaneously; such large-scale analysis has the potential to reveal high-dimensional interrelationships among regulatory elements and to annotate novel functional genomic regions de novo. Because many experimental considerations are relevant to the choice of a method in a ChIP-seq analysis, the overall design and quality management of the experiment are of critical importance. This review offers guiding principles of computation and sample preparation for ChIP-seq analyses, highlighting the validity and limitations of the state-of-the-art procedures at each step. We also discuss the latest challenges of single-cell analysis that will encourage a new era in this field.
Affiliation(s)
- Ryuichiro Nakato
  - Research Center for Epigenetic Disease, Institute of Molecular and Cellular Biosciences, The University of Tokyo, Tokyo, Japan
- Katsuhiko Shirahige
  - Research Center for Epigenetic Disease, Institute of Molecular and Cellular Biosciences, The University of Tokyo, Tokyo, Japan
  - Core Research for Evolutional Science and Technology (CREST), Japan Science and Technology Agency, Kawaguchi, Japan
|
26
|
Welch R, Chung D, Grass J, Landick R, Keles S. Data exploration, quality control and statistical analysis of ChIP-exo/nexus experiments. Nucleic Acids Res 2017; 45:e145. [PMID: 28911122 PMCID: PMC5587812 DOI: 10.1093/nar/gkx594] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 04/13/2017] [Accepted: 07/12/2017] [Indexed: 01/03/2023] Open
Abstract
ChIP-exo/nexus experiments rely on innovative modifications of the commonly used ChIP-seq protocol for high-resolution mapping of transcription factor binding sites. Although many aspects of ChIP-exo data analysis are similar to those of ChIP-seq, these high-throughput experiments pose a number of unique quality control and analysis challenges. We develop a novel statistical quality control pipeline and accompanying R/Bioconductor package, ChIPexoQual, to enable exploration and analysis of ChIP-exo and related experiments. ChIPexoQual evaluates a number of key issues including strand imbalance, library complexity, and signal enrichment of data. Assessment of these features is facilitated through diagnostic plots and summary statistics computed over regions of the genome with varying levels of coverage. We evaluated our QC pipeline with both large collections of public ChIP-exo/nexus data and multiple new ChIP-exo datasets from Escherichia coli. ChIPexoQual analysis of these datasets resulted in guidelines for using these QC metrics across a wide range of sequencing depths and provided further insights for modelling ChIP-exo data.
Affiliation(s)
- Rene Welch
  - Department of Statistics, University of Wisconsin-Madison, Madison, WI 53706, USA
- Dongjun Chung
  - Department of Public Health Sciences, Medical University of South Carolina, SC 29425, USA
- Jeffrey Grass
  - Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53726, USA
  - Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706, USA
- Robert Landick
  - Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53726, USA
  - Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706, USA
  - Department of Bacteriology, University of Wisconsin-Madison, Madison, WI 53706, USA
- Sündüz Keles
  - Department of Statistics, University of Wisconsin-Madison, Madison, WI 53706, USA
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53792, USA
|
27
|
Angarica VE, Del Sol A. Bioinformatics Tools for Genome-Wide Epigenetic Research. Advances in Experimental Medicine and Biology 2017; 978:489-512. [PMID: 28523562 DOI: 10.1007/978-3-319-53889-1_25] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Indexed: 02/07/2023]
Abstract
Epigenetics plays a central role in the regulation of many important cellular processes, and dysregulation at the epigenetic level could be the source of serious pathologies, such as neurological disorders affecting brain development, neurodegeneration, and intellectual disability. Despite significant technological advances in epigenetic profiling, there is still a need for a systematic understanding of how epigenetics shapes cellular circuitry and disease pathogenesis. The development of accurate computational approaches for analyzing complex epigenetic profiles is essential for disentangling the mechanisms underlying cellular development, and the intricate interaction networks determining and sensing chromatin modifications and DNA methylation to control gene expression. In this chapter, we review the recent advances in the field of "computational epigenetics," including computational methods for processing different types of epigenetic data, prediction of chromatin states, and study of protein dynamics. We also discuss how "computational epigenetics" has complemented the fast growth in the generation of epigenetic data for uncovering the main differences and similarities at the epigenetic level between individuals and the mechanisms underlying disease onset and progression.
Affiliation(s)
- Vladimir Espinosa Angarica
  - Computational Biology Group, Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 6 Avenue du Swing, 4366 Belvaux, Luxembourg
- Antonio Del Sol
  - Computational Biology Group, Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 6 Avenue du Swing, 4366 Belvaux, Luxembourg
|
28
|
Hartonen T, Sahu B, Dave K, Kivioja T, Taipale J. PeakXus: comprehensive transcription factor binding site discovery from ChIP-Nexus and ChIP-Exo experiments. Bioinformatics 2017; 32:i629-i638. [PMID: 27587683 DOI: 10.1093/bioinformatics/btw448] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION: Transcription factor (TF) binding can be studied accurately in vivo with ChIP-exo and ChIP-Nexus experiments. Only a fraction of TF binding mechanisms is yet fully understood, and accurate knowledge of binding locations and patterns of TFs is key to understanding binding that is not explained by simple positional weight matrix models. ChIP-exo/Nexus experiments can also offer insight into the effect of single nucleotide polymorphisms (SNPs) at TF binding sites on expression of the target genes. This is an important mechanism of action for disease-causing SNPs in non-coding genomic regions. RESULTS: We describe a peak caller, PeakXus, that is specifically designed to leverage the increased resolution of ChIP-exo/Nexus and was developed with the aim of making as few assumptions about the data as possible, to allow discovery of novel binding patterns. We apply PeakXus to ChIP-Nexus and ChIP-exo experiments performed both in Homo sapiens and in Drosophila melanogaster cell lines. We show that PeakXus consistently finds more peaks overlapping with a TF-specific recognition sequence than published methods. As an application example, we demonstrate how PeakXus can be coupled with unique molecular identifiers (UMIs) to measure the effect of a SNP overlapping with a TF binding site on the in vivo binding of the TF. AVAILABILITY AND IMPLEMENTATION: Source code of PeakXus is available at https://github.com/hartonen/PeakXus. CONTACT: tuomo.hartonen@helsinki.fi or jussi.taipale@ki.se.
Affiliation(s)
- Tuomo Hartonen
  - Genome-Scale Biology Research Program, Research Programs Unit, University of Helsinki, Helsinki, Finland
- Biswajyoti Sahu
  - Genome-Scale Biology Research Program, Research Programs Unit, University of Helsinki, Helsinki, Finland
- Kashyap Dave
  - Department of Biosciences and Nutrition, Karolinska Institutet, Stockholm, Sweden
- Teemu Kivioja
  - Genome-Scale Biology Research Program, Research Programs Unit, University of Helsinki, Helsinki, Finland
- Jussi Taipale
  - Genome-Scale Biology Research Program, Research Programs Unit, University of Helsinki, Helsinki, Finland
  - Department of Biosciences and Nutrition, Karolinska Institutet, Stockholm, Sweden
|
29
|
Gaiti F, Jindrich K, Fernandez-Valverde SL, Roper KE, Degnan BM, Tanurdžić M. Landscape of histone modifications in a sponge reveals the origin of animal cis-regulatory complexity. eLife 2017; 6:22194. [PMID: 28395144 PMCID: PMC5429095 DOI: 10.7554/elife.22194] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Received: 10/08/2016] [Accepted: 03/27/2017] [Indexed: 01/24/2023] Open
Abstract
Combinatorial patterns of histone modifications regulate developmental and cell type-specific gene expression and underpin animal complexity, but it is unclear when this regulatory system evolved. By analysing histone modifications in a morphologically simple, early-branching animal, the sponge Amphimedon queenslandica, we show that the regulatory landscape used by complex bilaterians was already in place at the dawn of animal multicellularity. This includes distal enhancers, repressive chromatin and transcriptional units marked by H3K4me3 that vary with levels of developmental regulation. Strikingly, Amphimedon enhancers are enriched in metazoan-specific microsyntenic units, suggesting that their genomic location is extremely ancient and likely to place constraints on the evolution of surrounding genes. These results suggest that the regulatory foundation for spatiotemporal gene expression evolved prior to the divergence of sponges and eumetazoans, and was necessary for the evolution of animal multicellularity.
Affiliation(s)
- Federico Gaiti
  - School of Biological Sciences, University of Queensland, Brisbane, Australia
- Katia Jindrich
  - School of Biological Sciences, University of Queensland, Brisbane, Australia
- Kathrein E Roper
  - School of Biological Sciences, University of Queensland, Brisbane, Australia
- Bernard M Degnan
  - School of Biological Sciences, University of Queensland, Brisbane, Australia
- Miloš Tanurdžić
  - School of Biological Sciences, University of Queensland, Brisbane, Australia
|
30
|
Perreault AA, Venters BJ. The ChIP-exo Method: Identifying Protein-DNA Interactions with Near Base Pair Precision. J Vis Exp 2016. [PMID: 28060339 DOI: 10.3791/55016] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Indexed: 10/31/2022] Open
Abstract
Chromatin immunoprecipitation (ChIP) is an indispensable tool in the fields of epigenetics and gene regulation that isolates specific protein-DNA interactions. ChIP coupled to high-throughput sequencing (ChIP-seq) is commonly used to determine the genomic location of proteins that interact with chromatin. However, ChIP-seq is hampered by a relatively low mapping resolution of several hundred base pairs and by high background signal. The ChIP-exo method is a refined version of ChIP-seq that substantially improves upon both resolution and noise. The key distinction of the ChIP-exo methodology is the incorporation of lambda exonuclease digestion in the library preparation workflow to effectively footprint the left and right 5' DNA borders of the protein-DNA crosslink site. The ChIP-exo libraries are then subjected to high-throughput sequencing. The resulting data can be leveraged to provide unique and ultra-high-resolution insights into the functional organization of the genome. Here, we describe the ChIP-exo method that we have optimized and streamlined for mammalian systems and a next-generation sequencing-by-synthesis platform.
Affiliation(s)
- Andrea A Perreault
  - Department of Molecular Physiology and Biophysics, Vanderbilt University
- Bryan J Venters
  - Department of Molecular Physiology and Biophysics, Vanderbilt University
|
31
|
Koenecke N, Johnston J, Gaertner B, Natarajan M, Zeitlinger J. Genome-wide identification of Drosophila dorso-ventral enhancers by differential histone acetylation analysis. Genome Biol 2016; 17:196. [PMID: 27678375 PMCID: PMC5037609 DOI: 10.1186/s13059-016-1057-2] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.4] [Received: 07/09/2016] [Accepted: 09/05/2016] [Indexed: 02/06/2023] Open
Abstract
BACKGROUND: Drosophila dorso-ventral (DV) patterning is one of the best-understood regulatory networks to date, and illustrates the fundamental role of enhancers in controlling patterning, cell fate specification, and morphogenesis during development. Histone acetylation such as H3K27ac is an excellent marker for active enhancers, but it is challenging to obtain precise locations for enhancers as the highest levels of this modification flank the enhancer regions. How to best identify tissue-specific enhancers in a developmental system de novo with a minimal set of data is still unclear. RESULTS: Using DV patterning as a test system, we develop a simple and effective method to identify tissue-specific enhancers de novo. We sample a broad set of candidate enhancer regions using data on CREB-binding protein co-factor binding or ATAC-seq chromatin accessibility, and then identify those regions with significant differences in histone acetylation between tissues. This method identifies hundreds of novel DV enhancers and outperforms ChIP-seq data of relevant transcription factors when benchmarked with mRNA expression data and transgenic reporter assays. These DV enhancers allow the de novo discovery of the relevant transcription factor motifs involved in DV patterning and contain additional motifs that are evolutionarily conserved and for which the corresponding transcription factors are expressed in a DV-biased fashion. Finally, we identify novel target genes of the regulatory network, implicating morphogenesis genes as early targets of DV patterning. CONCLUSIONS: Taken together, our approach has expanded our knowledge of the DV patterning network even further and is a general method to identify enhancers in any developmental system, including mammalian development.
Affiliation(s)
- Nina Koenecke
  - Stowers Institute for Medical Research, Kansas City, MO, 64110, USA
- Jeff Johnston
  - Stowers Institute for Medical Research, Kansas City, MO, 64110, USA
- Bjoern Gaertner
  - Stowers Institute for Medical Research, Kansas City, MO, 64110, USA
  - Present address: Department of Pediatrics, University of California, San Diego, La Jolla, CA, 92093, USA
- Malini Natarajan
  - Stowers Institute for Medical Research, Kansas City, MO, 64110, USA
- Julia Zeitlinger
  - Stowers Institute for Medical Research, Kansas City, MO, 64110, USA
  - Department of Pathology and Laboratory Medicine, University of Kansas Medical Center, Kansas City, KS, 66160, USA
|
32
|
Shlyueva D, Meireles-Filho ACA, Pagani M, Stark A. Genome-Wide Ultrabithorax Binding Analysis Reveals Highly Targeted Genomic Loci at Developmental Regulators and a Potential Connection to Polycomb-Mediated Regulation. PLoS One 2016; 11:e0161997. [PMID: 27575958 PMCID: PMC5004984 DOI: 10.1371/journal.pone.0161997] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 03/25/2016] [Accepted: 08/16/2016] [Indexed: 12/22/2022] Open
Abstract
Hox homeodomain transcription factors are key regulators of animal development. They specify the identity of segments along the anterior-posterior body axis in metazoans by controlling the expression of diverse downstream targets, including transcription factors and signaling pathway components. The Drosophila melanogaster Hox factor Ultrabithorax (Ubx) directs the development of thoracic and abdominal segments and appendages, and loss of Ubx function can lead, for example, to the transformation of third thoracic segment appendages (e.g. halteres) into second thoracic segment appendages (e.g. wings), resulting in a characteristic four-wing phenotype. Here we present a Drosophila melanogaster strain with a V5-epitope tagged Ubx allele, which we employed to obtain a high-quality genome-wide map of Ubx binding sites using ChIP-seq. We confirm the sensitivity of the V5 ChIP-seq by recovering 7 of 8 well-studied Ubx-dependent cis-regulatory regions. Moreover, we show that Ubx binding is predictive of enhancer activity, as suggested by comparison with a genome-scale resource of in vivo tested enhancer candidates. We observed densely clustered Ubx binding sites at 12 extended genomic loci that included ANTP-C, BX-C, Polycomb complex genes, and other regulators, and the clustered binding sites were frequently active enhancers. Furthermore, Ubx binding was detected at known Polycomb response elements (PREs) and was associated with significant enrichments of Pc and Pho ChIP signals, in contrast to binding sites of other developmental TFs. Together, our results show that Ubx targets developmental regulators via strongly clustered binding sites and allow us to hypothesize that regulation by Ubx might involve Polycomb group proteins to maintain specific regulatory states in a cooperative or mutually exclusive fashion, an attractive model that combines two groups of proteins with prominent gene regulatory roles during animal development.
Affiliation(s)
- Daria Shlyueva
  - Research Institute of Molecular Pathology (IMP), Vienna Biocenter (VBC), Vienna, Austria
- Michaela Pagani
  - Research Institute of Molecular Pathology (IMP), Vienna Biocenter (VBC), Vienna, Austria
- Alexander Stark
  - Research Institute of Molecular Pathology (IMP), Vienna Biocenter (VBC), Vienna, Austria
|
33
|
Domcke S, Bardet AF, Adrian Ginno P, Hartl D, Burger L, Schübeler D. Competition between DNA methylation and transcription factors determines binding of NRF1. Nature 2015; 528:575-9. [PMID: 26675734 DOI: 10.1038/nature16462] [Citation(s) in RCA: 357] [Impact Index Per Article: 35.7] [Received: 04/21/2015] [Accepted: 11/16/2015] [Indexed: 12/17/2022]
Abstract
Eukaryotic transcription factors (TFs) are key determinants of gene activity, yet they bind only a fraction of their corresponding DNA sequence motifs in any given cell type. Chromatin has the potential to restrict accessibility of binding sites; however, in which context chromatin states are instructive for TF binding remains mainly unknown. To explore the contribution of DNA methylation to constrained TF binding, we mapped DNase-I-hypersensitive sites in murine stem cells in the presence and absence of DNA methylation. Methylation-restricted sites are enriched for TF motifs containing CpGs, especially for those of NRF1. In fact, the TF NRF1 occupies several thousand additional sites in the unmethylated genome, resulting in increased transcription. Restoring de novo methyltransferase activity initiates remethylation at these sites and outcompetes NRF1 binding. This suggests that binding of DNA-methylation-sensitive TFs relies on additional determinants to induce local hypomethylation. In support of this model, removal of neighbouring motifs in cis or of a TF in trans causes local hypermethylation and subsequent loss of NRF1 binding. This competition between DNA methylation and TFs in vivo reveals a case of cooperativity between TFs that acts indirectly via DNA methylation. Methylation removal by methylation-insensitive factors enables occupancy of methylation-sensitive factors, a principle that rationalizes hypomethylation of regulatory regions.
Affiliation(s)
- Silvia Domcke
  - Friedrich Miescher Institute for Biomedical Research, Maulbeerstrasse 66, CH 4058 Basel, Switzerland
  - University of Basel, Faculty of Sciences, Petersplatz 1, CH 4003 Basel, Switzerland
- Anaïs Flore Bardet
  - Friedrich Miescher Institute for Biomedical Research, Maulbeerstrasse 66, CH 4058 Basel, Switzerland
- Paul Adrian Ginno
  - Friedrich Miescher Institute for Biomedical Research, Maulbeerstrasse 66, CH 4058 Basel, Switzerland
- Dominik Hartl
  - Friedrich Miescher Institute for Biomedical Research, Maulbeerstrasse 66, CH 4058 Basel, Switzerland
  - University of Basel, Faculty of Sciences, Petersplatz 1, CH 4003 Basel, Switzerland
- Lukas Burger
  - Friedrich Miescher Institute for Biomedical Research, Maulbeerstrasse 66, CH 4058 Basel, Switzerland
  - Swiss Institute of Bioinformatics, Maulbeerstrasse 66, CH 4058 Basel, Switzerland
- Dirk Schübeler
  - Friedrich Miescher Institute for Biomedical Research, Maulbeerstrasse 66, CH 4058 Basel, Switzerland
  - University of Basel, Faculty of Sciences, Petersplatz 1, CH 4003 Basel, Switzerland
|
34
|
Douglas AT, Hill RD. Variation in vertebrate cis-regulatory elements in evolution and disease. Transcription 2015; 5:e28848. [PMID: 25764334 DOI: 10.4161/trns.28848] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Indexed: 01/28/2023] Open
Abstract
Much of the genetic information that drives animal diversity lies within the vast non-coding regions of the genome. Multi-species sequence conservation in non-coding regions of the genome flags important regulatory elements and, more recently, techniques that look for functional signatures predicted for regulatory sequences have added to the identification of thousands more. For some time, biologists have argued that changes in cis-regulatory sequences create the basic genetic framework for evolutionary change. Recent advances support this notion and show that there is extensive genomic variability in non-coding regulatory elements associated with trait variation, speciation and disease.
Affiliation(s)
- Adam Thomas Douglas
  - MRC Human Genetics Unit; MRC Institute of Genetics and Molecular Medicine; University of Edinburgh; Edinburgh, UK
|
35
|
Leoncini M, Montangero M, Pellegrini M, Tillan KP. CMStalker: A Combinatorial Tool for Composite Motif Discovery. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2015; 12:1123-1136. [PMID: 26451824 DOI: 10.1109/tcbb.2014.2359444] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/05/2023]
Abstract
Controlling the differential expression of many thousands of different genes at any given time is a fundamental task of metazoan organisms, and this complex orchestration is controlled by the so-called regulatory genome, which encodes complex regulatory networks: several transcription factors bind to precise DNA regions so as to perform, in a cooperative manner, a specific regulation task for nearby genes. The in silico prediction of these binding sites is still an open problem, notwithstanding continuous progress and activity in the last two decades. In this paper, we describe a new, efficient combinatorial approach to the problem of detecting sets of cooperating binding sites in promoter sequences, given as input a database of Transcription Factor Binding Sites encoded as Position Weight Matrices. We present CMStalker, a software tool for composite motif discovery that embodies a new approach combining a constraint satisfaction formulation with a parameter relaxation technique to explore the space of possible solutions efficiently. Extensive experiments with 12 data sets and 11 state-of-the-art tools are reported, showing an average value of the correlation coefficient of 0.54 (against a value of 0.41 for the closest competitor). This improvement in output quality due to CMStalker is statistically significant.
|
36
|
Stein C, Bardet AF, Roma G, Bergling S, Clay I, Ruchti A, Agarinis C, Schmelzle T, Bouwmeester T, Schübeler D, Bauer A. YAP1 Exerts Its Transcriptional Control via TEAD-Mediated Activation of Enhancers. PLoS Genet 2015; 11:e1005465. [PMID: 26295846 PMCID: PMC4546604 DOI: 10.1371/journal.pgen.1005465] [Citation(s) in RCA: 302] [Impact Index Per Article: 30.2] [Received: 04/10/2015] [Accepted: 07/23/2015] [Indexed: 12/30/2022] Open
Abstract
YAP1 is a major effector of the Hippo pathway and a well-established oncogene. Elevated YAP1 activity due to mutations in Hippo pathway components or YAP1 amplification is observed in several types of human cancers. Here we investigated its genomic binding landscape in YAP1-activated cancer cells, as well as in non-transformed cells. We demonstrate that TEAD transcription factors mediate YAP1 chromatin-binding genome-wide, further explaining their dominant role as primary mediators of YAP1-transcriptional activity. Moreover, we show that YAP1 largely exerts its transcriptional control via distal enhancers that are marked by H3K27 acetylation and that YAP1 is necessary for this chromatin mark at bound enhancers and the activity of the associated genes. This work establishes YAP1-mediated transcriptional regulation at distal enhancers and provides an expanded set of target genes resulting in a fundamental source to study YAP1 function in a normal and cancer setting.
The YAP1/Hippo signaling pathway is a key regulator of organ size and tissue homeostasis, and its dysregulation is linked to cancer development. Elevated activity of YAP1, a transcriptional coactivator and well-established oncogene, has been reported to occur in human cancers. Comprehensive identification of YAP1 regulated genes and its mode of action will be of high importance to uncover YAP1 biology that could be exploited for a therapeutic intervention. To this end, we performed genome-wide analyses to identify YAP1 occupied sites in cancer cell lines representing different YAP1/Hippo pathway tumor etiologies and in non-transformed fibroblasts. Our data demonstrate that YAP1 activity is mediated predominantly via TEAD transcription factors, supporting the importance of TEADs as main mediators of YAP1-coactivator activity. We further show that YAP1 and TEAD1 exert their transcriptional control via binding to enhancers, leading to characteristic chromatin changes and distal activation of genes. By linking enhancers to genes, we provide a list of novel YAP1 target genes in an oncogenic setting that we show can readily be exploited in tumor classification and provides a foundation for further investigations.
Affiliation(s)
- Claudia Stein
- Developmental and Molecular Pathways, Novartis Institutes for Biomedical Research, Novartis Pharma AG, Basel, Switzerland
- Anaïs Flore Bardet
- Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland
- Guglielmo Roma
- Developmental and Molecular Pathways, Novartis Institutes for Biomedical Research, Novartis Pharma AG, Basel, Switzerland
- Sebastian Bergling
- Developmental and Molecular Pathways, Novartis Institutes for Biomedical Research, Novartis Pharma AG, Basel, Switzerland
- Ieuan Clay
- Developmental and Molecular Pathways, Novartis Institutes for Biomedical Research, Novartis Pharma AG, Basel, Switzerland
- Alexandra Ruchti
- Developmental and Molecular Pathways, Novartis Institutes for Biomedical Research, Novartis Pharma AG, Basel, Switzerland
- Claudia Agarinis
- Developmental and Molecular Pathways, Novartis Institutes for Biomedical Research, Novartis Pharma AG, Basel, Switzerland
- Tobias Schmelzle
- Oncology, Novartis Institutes for Biomedical Research, Novartis Pharma AG, Basel, Switzerland
- Tewis Bouwmeester
- Developmental and Molecular Pathways, Novartis Institutes for Biomedical Research, Novartis Pharma AG, Basel, Switzerland
- Dirk Schübeler
- Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland
- University of Basel, Faculty of Sciences, Basel, Switzerland
- * E-mail: (DS); (AB)
- Andreas Bauer
- Developmental and Molecular Pathways, Novartis Institutes for Biomedical Research, Novartis Pharma AG, Basel, Switzerland
- * E-mail: (DS); (AB)
|
37
|
Hansen P, Hecht J, Ibrahim DM, Krannich A, Truss M, Robinson PN. Saturation analysis of ChIP-seq data for reproducible identification of binding peaks. Genome Res 2015; 25:1391-400. [PMID: 26163319 PMCID: PMC4561497 DOI: 10.1101/gr.189894.115] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 01/21/2015] [Accepted: 07/06/2015] [Indexed: 11/24/2022]
Abstract
Chromatin immunoprecipitation coupled with next-generation sequencing (ChIP-seq) is a powerful technology to identify the genome-wide locations of transcription factors and other DNA binding proteins. Computational ChIP-seq peak calling infers the location of protein–DNA interactions based on various measures of enrichment of sequence reads. In this work, we introduce an algorithm, Q, that uses an assessment of the quadratic enrichment of reads to center candidate peaks followed by statistical analysis of saturation of candidate peaks by 5′ ends of reads. We show that our method not only is substantially faster than several competing methods but also demonstrates statistically significant advantages with respect to reproducibility of results and in its ability to identify peaks with reproducible binding site motifs. We show that Q has superior performance in the delineation of double RNAPII and H3K4me3 peaks surrounding transcription start sites related to a better ability to resolve individual peaks. The method is implemented in C++ and is freely available under an open source license.
Affiliation(s)
- Peter Hansen
- Institute for Medical and Human Genetics, Charité-Universitätsmedizin Berlin, 13353 Berlin, Germany; Berlin Brandenburg Center for Regenerative Therapies (BCRT), Charité-Universitätsmedizin Berlin, 13353 Berlin, Germany
- Jochen Hecht
- Institute for Medical and Human Genetics, Charité-Universitätsmedizin Berlin, 13353 Berlin, Germany; Berlin Brandenburg Center for Regenerative Therapies (BCRT), Charité-Universitätsmedizin Berlin, 13353 Berlin, Germany; Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany
- Daniel M Ibrahim
- Institute for Medical and Human Genetics, Charité-Universitätsmedizin Berlin, 13353 Berlin, Germany; Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany
- Alexander Krannich
- Department of Biostatistics, Clinical Research Unit, Berlin Institute of Health, Charité-Universitätsmedizin Berlin, 13353 Berlin, Germany
- Matthias Truss
- Labor für Pädiatrische Molekularbiologie, Charité-Universitätsmedizin Berlin, 10117, Berlin, Germany
- Peter N Robinson
- Institute for Medical and Human Genetics, Charité-Universitätsmedizin Berlin, 13353 Berlin, Germany; Berlin Brandenburg Center for Regenerative Therapies (BCRT), Charité-Universitätsmedizin Berlin, 13353 Berlin, Germany; Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany; Institute for Bioinformatics, Department of Mathematics and Computer Science, Freie Universität Berlin, 14195 Berlin, Germany
|
38
|
Abstract
Recent advances in experimental and computational methodologies are enabling ultra-high resolution genome-wide profiles of protein-DNA binding events. For example, the ChIP-exo protocol precisely characterizes protein-DNA cross-linking patterns by combining chromatin immunoprecipitation (ChIP) with 5' → 3' exonuclease digestion. Similarly, deeply sequenced chromatin accessibility assays (e.g. DNase-seq and ATAC-seq) enable the detection of protected footprints at protein-DNA binding sites. With these techniques and others, we have the potential to characterize the individual nucleotides that interact with transcription factors, nucleosomes, RNA polymerases and other regulatory proteins in a particular cellular context. In this review, we explain the experimental assays and computational analysis methods that enable high-resolution profiling of protein-DNA binding events. We discuss the challenges and opportunities associated with such approaches.
Affiliation(s)
- Shaun Mahony
- Department of Biochemistry & Molecular Biology, Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA, USA
- B Franklin Pugh
- Department of Biochemistry & Molecular Biology, Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA, USA
|
39
|
He Q, Johnston J, Zeitlinger J. ChIP-nexus enables improved detection of in vivo transcription factor binding footprints. Nat Biotechnol 2015; 33:395-401. [PMID: 25751057 PMCID: PMC4390430 DOI: 10.1038/nbt.3121] [Citation(s) in RCA: 173] [Impact Index Per Article: 17.3] [Received: 03/11/2014] [Accepted: 12/08/2014] [Indexed: 11/13/2022]
Abstract
Understanding how eukaryotic enhancers are bound and regulated by specific combinations of transcription factors is still a major challenge. To better map transcription factor binding genome-wide at nucleotide resolution in vivo, we have developed a robust ChIP-exo protocol called ChIP experiments with nucleotide resolution through exonuclease, unique barcode and single ligation (ChIP-nexus), which utilizes an efficient DNA self-circularization step during library preparation. Application of ChIP-nexus to four proteins—human TBP and Drosophila NFkB, Twist and Max— demonstrates that it outperforms existing ChIP protocols in resolution and specificity, pinpoints relevant binding sites within enhancers containing multiple binding motifs and allows the analysis of in vivo binding specificities. Notably, we show that Max frequently interacts with DNA sequences next to its motif, and that this binding pattern correlates with local DNA sequence features such as DNA shape. ChIP-nexus will be broadly applicable to studying in vivo transcription factor binding specificity and its relationship to cis-regulatory changes in humans and model organisms.
Affiliation(s)
- Qiye He
- Stowers Institute for Medical Research, Kansas City, Missouri, USA
- Jeff Johnston
- Stowers Institute for Medical Research, Kansas City, Missouri, USA
- Julia Zeitlinger
- [1] Stowers Institute for Medical Research, Kansas City, Missouri, USA. [2] Department of Pathology, Kansas University Medical Center, Kansas City, Kansas, USA
|
40
|
Ibrahim MM, Lacadie SA, Ohler U. JAMM: a peak finder for joint analysis of NGS replicates. Bioinformatics 2014; 31:48-55. [PMID: 25223640 DOI: 10.1093/bioinformatics/btu568] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.3] [Indexed: 01/09/2023]
Abstract
MOTIVATION Although peak finding in next-generation sequencing (NGS) datasets has been addressed extensively, there is no consensus on how to analyze and process biological replicates. Furthermore, most peak finders do not focus on accurate determination of enrichment site widths and are not widely applicable to different types of datasets. RESULTS We developed JAMM (Joint Analysis of NGS replicates via Mixture Model clustering): a peak finder that can integrate information from biological replicates, determine enrichment site widths accurately and resolve neighboring narrow peaks. JAMM is a universal peak finder that is applicable to different types of datasets. We show that JAMM is among the best performing peak finders in terms of site detection accuracy and in terms of accurate determination of enrichment site widths. In addition, JAMM's replicate integration improves peak spatial resolution, sorting and peak finding accuracy. AVAILABILITY AND IMPLEMENTATION JAMM is available for free and can run on Linux machines through the command line: http://code.google.com/p/jamm-peak-finder.
Affiliation(s)
- Mahmoud M Ibrahim
- Department of Biology, Humboldt University, Invalidenstrasse 43, D-10115 Berlin, Germany and The Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine Berlin-Buch, Robert Rössle Str. 10, Berlin 13125, Germany
- Scott A Lacadie
- Department of Biology, Humboldt University, Invalidenstrasse 43, D-10115 Berlin, Germany and The Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine Berlin-Buch, Robert Rössle Str. 10, Berlin 13125, Germany
- Uwe Ohler
- Department of Biology, Humboldt University, Invalidenstrasse 43, D-10115 Berlin, Germany and The Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine Berlin-Buch, Robert Rössle Str. 10, Berlin 13125, Germany
|
41
|
Lin Z, Guo Z, Xu Y, Zhao X. Identification of a secondary promoter of CASP8 and its related transcription factor PURα. Int J Oncol 2014; 45:57-66. [PMID: 24819879 PMCID: PMC4079158 DOI: 10.3892/ijo.2014.2436] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 02/12/2014] [Accepted: 04/11/2014] [Indexed: 01/18/2023] Open
Abstract
Caspase-8 (CASP8) is an essential initiator of apoptosis and is associated with many diseases in humans including esophageal squamous cell carcinoma. CASP8 produces a variety of transcripts, which might perform distinct functions. However, the cis and trans transcriptional determinants that control CASP8 expression remain poorly defined. Using a series of luciferase reporter assays, we identified a novel secondary promoter of CASP8 within chr2: 202,122,236 to 202,123,227 and 25 kb downstream of the previously described CASP8 promoter. ENCODE ChIP-seq data for this novel promoter region revealed several epigenetic features, including high levels of histone H3 lysine 27 acetylation and lysine 4 methylation, as well as low levels of CpG island methylation. We developed a mass spectrometry based strategy to identify transcription factors that contribute to the function of the secondary promoter. We found that the transcription activator protein PURα is specifically involved in the transcriptional activation of the secondary promoter and may exert its function by forming a complex with E2F-1 and RNA polymerase II. PURα can bind to both DNA and RNA, and functions in the initiation of DNA replication and the regulation of transcription. We observed that knockdown of PURα expression decreased the transcriptional activity of the secondary promoter and mRNA expression of CASP8 isoform G. Although the physiologic roles of this secondary promoter remain unclear, our data may help explain the complexity of CASP8 transcription and suggest that the various caspase 8 isoforms may have distinct regulations and functions.
Affiliation(s)
- Zhengwei Lin
- State Key Laboratory of Molecular Oncology, Cancer Institute and Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100021, P.R. China
- Zhimin Guo
- State Key Laboratory of Molecular Oncology, Cancer Institute and Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100021, P.R. China
- Yang Xu
- State Key Laboratory of Molecular Oncology, Cancer Institute and Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100021, P.R. China
- Xiaohang Zhao
- State Key Laboratory of Molecular Oncology, Cancer Institute and Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100021, P.R. China
|
42
|
Schwaiger M, Schönauer A, Rendeiro AF, Pribitzer C, Schauer A, Gilles AF, Schinko JB, Renfer E, Fredman D, Technau U. Evolutionary conservation of the eumetazoan gene regulatory landscape. Genome Res 2014; 24:639-50. [PMID: 24642862 PMCID: PMC3975063 DOI: 10.1101/gr.162529.113] [Citation(s) in RCA: 107] [Impact Index Per Article: 9.7] [Indexed: 02/01/2023]
Abstract
Despite considerable differences in morphology and complexity of body plans among animals, a great part of the gene set is shared among Bilateria and their basally branching sister group, the Cnidaria. This suggests that the common ancestor of eumetazoans already had a highly complex gene repertoire. At present it is therefore unclear how morphological diversification is encoded in the genome. Here we address the possibility that differences in gene regulation could contribute to the large morphological divergence between cnidarians and bilaterians. To this end, we generated the first genome-wide map of gene regulatory elements in a nonbilaterian animal, the sea anemone Nematostella vectensis. Using chromatin immunoprecipitation followed by deep sequencing of five chromatin modifications and a transcriptional cofactor, we identified over 5000 enhancers in the Nematostella genome and could validate 75% of the tested enhancers in vivo. We found that in Nematostella, but not in yeast, enhancers are characterized by the same combination of histone modifications as in bilaterians, and these enhancers preferentially target developmental regulatory genes. Surprisingly, the distribution and abundance of gene regulatory elements relative to these genes are shared between Nematostella and bilaterian model organisms. Our results suggest that complex gene regulation originated at least 600 million yr ago, predating the common ancestor of eumetazoans.
Affiliation(s)
- Michaela Schwaiger
- Department of Molecular Evolution and Development, Center for Organismal Systems Biology, Faculty of Life Sciences, University of Vienna, 1090 Vienna, Austria
|
43
|
Meireles-Filho ACA, Bardet AF, Yáñez-Cuna JO, Stampfel G, Stark A. cis-regulatory requirements for tissue-specific programs of the circadian clock. Curr Biol 2013; 24:1-10. [PMID: 24332542 DOI: 10.1016/j.cub.2013.11.017] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.2] [Received: 07/19/2013] [Revised: 09/24/2013] [Accepted: 11/06/2013] [Indexed: 01/04/2023]
Abstract
BACKGROUND Broadly expressed transcription factors (TFs) control tissue-specific programs of gene expression through interactions with local TF networks. A prime example is the circadian clock: although the conserved TFs CLOCK (CLK) and CYCLE (CYC) control a transcriptional circuit throughout animal bodies, rhythms in behavior and physiology are generated tissue specifically. Yet, how CLK and CYC determine tissue-specific clock programs has remained unclear. RESULTS Here, we use a functional genomics approach to determine the cis-regulatory requirements for clock specificity. We first determine CLK and CYC genome-wide binding targets in heads and bodies by ChIP-seq and show that they have distinct DNA targets in the two tissue contexts. Computational dissection of CLK/CYC context-specific binding sites reveals sequence motifs for putative partner factors, which are predictive for individual binding sites. Among them, we show that the opa and GATA motifs, differentially enriched in head and body binding sites respectively, can be bound by OPA and SERPENT (SRP). They act synergistically with CLK/CYC in the Drosophila feedback loop, suggesting that they help to determine their direct targets and therefore orchestrate tissue-specific clock outputs. In addition, using in vivo transgenic assays, we validate that GATA motifs are required for proper tissue-specific gene expression in the adult fat body, midgut, and Malpighian tubules, revealing a cis-regulatory signature for enhancers of the peripheral circadian clock. CONCLUSIONS Our results reveal how universal clock circuits can regulate tissue-specific rhythms and, more generally, provide insights into the mechanism by which universal TFs can be modulated to drive tissue-specific programs of gene expression.
Affiliation(s)
- Anaïs F Bardet
- Research Institute of Molecular Pathology (IMP), 1030 Vienna, Austria
- J Omar Yáñez-Cuna
- Research Institute of Molecular Pathology (IMP), 1030 Vienna, Austria
- Gerald Stampfel
- Research Institute of Molecular Pathology (IMP), 1030 Vienna, Austria
- Alexander Stark
- Research Institute of Molecular Pathology (IMP), 1030 Vienna, Austria
|