1
Cheng S, Miao B, Li T, Zhao G, Zhang B. Review and Evaluate the Bioinformatics Analysis Strategies of ATAC-seq and CUT&Tag Data. GENOMICS, PROTEOMICS & BIOINFORMATICS 2024; 22:qzae054. [PMID: 39255248 PMCID: PMC11464419 DOI: 10.1093/gpbjnl/qzae054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2023] [Revised: 05/28/2024] [Accepted: 07/18/2024] [Indexed: 09/12/2024]
Abstract
Efficient and reliable profiling methods are essential to study epigenetics. Tn5, one of the first identified prokaryotic transposases with high DNA-binding and tagmentation efficiency, is widely adopted in different genomic and epigenomic protocols for high-throughput exploration of the genome and epigenome. Based on Tn5, the Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq) and the Cleavage Under Targets and Tagmentation (CUT&Tag) were developed to measure chromatin accessibility and detect DNA-protein interactions. These methodologies can be applied to large numbers of biological samples with low-input levels, such as rare tissues, embryos, and sorted single cells. However, fast and proper processing of these epigenomic data has become a bottleneck because massive data production continues to increase quickly. Furthermore, inappropriate data analysis can generate biased or misleading conclusions. Therefore, it is essential to evaluate the performance of Tn5-based ATAC-seq and CUT&Tag data processing bioinformatics tools, many of which were developed mostly for analyzing chromatin immunoprecipitation followed by sequencing (ChIP-seq) data. Here, we conducted a comprehensive benchmarking analysis to evaluate the performance of eight popular software tools for processing ATAC-seq and CUT&Tag data. We compared the sensitivity, specificity, and peak width distribution for both narrow-type and broad-type peak calling. We also tested the influence of the availability of control IgG input in CUT&Tag data analysis. Finally, we evaluated the differential analysis strategies commonly used for analyzing the CUT&Tag data. Our study provided comprehensive guidance for selecting bioinformatics tools and recommended analysis strategies, which were implemented into Docker/Singularity images for streamlined data analysis.
Affiliation(s)
- Siyuan Cheng
  - Department of Developmental Biology, Center of Regenerative Medicine, Washington University School of Medicine, St. Louis, MO 63108, USA
- Benpeng Miao
  - Department of Developmental Biology, Center of Regenerative Medicine, Washington University School of Medicine, St. Louis, MO 63108, USA
  - Department of Genetics, Washington University School of Medicine, St. Louis, MO 63108, USA
- Tiandao Li
  - Department of Developmental Biology, Center of Regenerative Medicine, Washington University School of Medicine, St. Louis, MO 63108, USA
- Guoyan Zhao
  - Department of Genetics, Washington University School of Medicine, St. Louis, MO 63108, USA
  - Department of Neurology, Washington University School of Medicine, St. Louis, MO 63108, USA
  - Department of Pathology and Immunology, Washington University School of Medicine, St. Louis, MO 63108, USA
- Bo Zhang
  - Department of Developmental Biology, Center of Regenerative Medicine, Washington University School of Medicine, St. Louis, MO 63108, USA
2
Monsen RC. Higher-order G-quadruplexes in promoters are untapped drug targets. Front Chem 2023; 11:1211512. [PMID: 37351517 PMCID: PMC10282141 DOI: 10.3389/fchem.2023.1211512] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/24/2023] [Accepted: 05/30/2023] [Indexed: 06/24/2023] Open
Abstract
G-quadruplexes (G4s) are four-stranded nucleic acid secondary structures that form within guanine-rich regions of chromatin. G4 motifs are abundant in the genome, with a sizable proportion (∼40%) existing within gene promoter regions. G4s are proven epigenetic features that decorate the promoter landscape as binding centers for transcription factors. Stabilizing or disrupting promoter G4s can directly influence adjacent gene transcription, making G4s attractive as indirect drug targets for hard-to-target proteins, particularly in cancer. However, no G4 ligands have progressed through clinical trials, mostly owing to off-target effects. A major hurdle in G4 drug discovery is the lack of distinctiveness of the small monomeric G4 structures currently used as receptors. This mini review describes and contrasts monomeric and higher-order G-quadruplex structure and function and provides a rationale for switching focus to the higher-order forms as selective molecular targets. The human telomerase reverse transcriptase (hTERT) core promoter G-quadruplex is then used as a case study that highlights the potential for higher-order G4s as selective indirect inhibitors of hard-to-target proteins in cancer.
3
Bhattacharyya S, Kollipara RK, Orquera-Tornakian G, Goetsch S, Zhang M, Perry C, Li B, Shelton JM, Bhakta M, Duan J, Xie Y, Xiao G, Evers BM, Hon GC, Kittler R, Munshi NV. Global chromatin landscapes identify candidate noncoding modifiers of cardiac rhythm. J Clin Invest 2023; 133:e153635. [PMID: 36454649 PMCID: PMC9888383 DOI: 10.1172/jci153635] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/27/2021] [Accepted: 11/30/2022] [Indexed: 12/03/2022] Open
Abstract
Comprehensive cis-regulatory landscapes are essential for accurate enhancer prediction and disease variant mapping. Although cis-regulatory element (CRE) resources exist for most tissues and organs, many rare - yet functionally important - cell types remain overlooked. Despite representing only a small fraction of the heart's cellular biomass, the cardiac conduction system (CCS) unfailingly coordinates every life-sustaining heartbeat. To globally profile the mouse CCS cis-regulatory landscape, we genetically tagged CCS component-specific nuclei for comprehensive assay for transposase-accessible chromatin-sequencing (ATAC-Seq) analysis. Thus, we established a global CCS-enriched CRE database, referred to as CCS-ATAC, as a key resource for studying CCS-wide and component-specific regulatory functions. Using transcription factor (TF) motifs to construct CCS component-specific gene regulatory networks (GRNs), we identified and independently confirmed several specific TF sub-networks. Highlighting the functional importance of CCS-ATAC, we also validated numerous CCS-enriched enhancer elements and suggested gene targets based on CCS single-cell RNA-Seq data. Furthermore, we leveraged CCS-ATAC to improve annotation of existing human variants related to cardiac rhythm and nominated a potential enhancer-target pair that was dysregulated by a specific SNP. Collectively, our results established a CCS-regulatory compendium, identified novel CCS enhancer elements, and illuminated potential functional associations between human genomic variants and CCS component-specific CREs.
Affiliation(s)
- Sean Goetsch
  - Department of Internal Medicine, Division of Cardiology
- Minzhe Zhang
  - Quantitative Biomedical Research Center, Department of Population and Data Sciences
- Cameron Perry
  - Department of Internal Medicine, Division of Cardiology
- Boxun Li
  - Laboratory of Regulatory Genomics, Cecil H. and Ida Green Center for Reproductive Biology Sciences, Division of Basic Reproductive Biology Research, Department of Obstetrics and Gynecology
- Minoti Bhakta
  - Department of Internal Medicine, Division of Cardiology
- Jialei Duan
  - Laboratory of Regulatory Genomics, Cecil H. and Ida Green Center for Reproductive Biology Sciences, Division of Basic Reproductive Biology Research, Department of Obstetrics and Gynecology
- Yang Xie
  - Quantitative Biomedical Research Center, Department of Population and Data Sciences
  - Department of Bioinformatics
- Guanghua Xiao
  - Quantitative Biomedical Research Center, Department of Population and Data Sciences
  - Department of Bioinformatics
- Bret M. Evers
  - Department of Internal Medicine, Division of Cardiology
- Gary C. Hon
  - Laboratory of Regulatory Genomics, Cecil H. and Ida Green Center for Reproductive Biology Sciences, Division of Basic Reproductive Biology Research, Department of Obstetrics and Gynecology
  - Department of Bioinformatics
  - Hamon Center for Regenerative Science and Medicine
- Ralf Kittler
  - McDermott Center for Human Growth and Development
- Nikhil V. Munshi
  - Department of Internal Medicine, Division of Cardiology
  - McDermott Center for Human Growth and Development
  - Hamon Center for Regenerative Science and Medicine
  - Department of Molecular Biology, UT Southwestern Medical Center, Dallas, Texas, USA
4
Barissi S, Sala A, Wieczór M, Battistini F, Orozco M. DNAffinity: a machine-learning approach to predict DNA binding affinities of transcription factors. Nucleic Acids Res 2022; 50:9105-9114. [PMID: 36018808 PMCID: PMC9458447 DOI: 10.1093/nar/gkac708] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 01/20/2022] [Revised: 07/21/2022] [Accepted: 08/08/2022] [Indexed: 12/24/2022] Open
Abstract
We present a physics-based machine learning approach to predict in vitro transcription factor binding affinities from structural and mechanical DNA properties directly derived from atomistic molecular dynamics simulations. The method is able to predict affinities obtained with techniques as different as uPBM, gcPBM and HT-SELEX with excellent performance, much better than existing algorithms. Due to its nature, the method can be extended to epigenetic variants, mismatches, mutations, or any non-coding nucleobases. When complemented with chromatin structure information, our in vitro trained method also provides good estimates of in vivo binding sites in yeast.
Affiliation(s)
- Miłosz Wieczór
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac 10–12, 08028 Barcelona, Spain; Department of Physical Chemistry, Gdansk University of Technology, 80-233 Gdańsk, Poland
- Modesto Orozco
  - Correspondence may also be addressed to Modesto Orozco. Tel: +34 934 037 156;
5
Luo L, Gribskov M, Wang S. Bibliometric review of ATAC-Seq and its application in gene expression. Brief Bioinform 2022; 23:6543486. [PMID: 35255493 PMCID: PMC9116206 DOI: 10.1093/bib/bbac061] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 11/17/2021] [Revised: 02/06/2022] [Accepted: 02/09/2022] [Indexed: 11/30/2022] Open
Abstract
With recent advances in high-throughput next-generation sequencing, it is possible to describe the regulation and expression of genes at multiple levels. An assay for transposase-accessible chromatin using sequencing (ATAC-seq), which uses Tn5 transposase to sequence protein-free binding regions of the genome, can be combined with chromatin immunoprecipitation coupled with deep sequencing (ChIP-seq) and ribonucleic acid sequencing (RNA-seq) to provide a detailed description of gene expression. Here, we reviewed the literature on ATAC-seq and described the characteristics of ATAC-seq publications. We then briefly introduced the principles of RNA-seq, ChIP-seq and ATAC-seq, focusing on the main features of the techniques. We built a phylogenetic tree from species that had been previously studied by using ATAC-seq. Studies of Mus musculus and Homo sapiens account for approximately 90% of the total ATAC-seq data, while other species are still in the process of accumulating data. We summarized the findings from human diseases and other species, illustrating the cutting-edge discoveries and the role of multi-omics data analysis in current research. Moreover, we collected and compared ATAC-seq analysis pipelines, which allowed biological researchers who lack programming skills to better analyze and explore ATAC-seq data. Through this review, it is clear that multi-omics analysis and single-cell sequencing technology will become the mainstream approach in future research.
Affiliation(s)
- Liheng Luo
  - School of Life Sciences, Northwestern Polytechnical University, Xi'an, Shaanxi, China, 710072
- Michael Gribskov
  - Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA
- Sufang Wang
  - School of Life Sciences, Northwestern Polytechnical University, Xi'an, Shaanxi, China, 710072
6
Kumar S. SWI/SNF (BAF) complexes: From framework to a functional role in endothelial mechanotransduction. CURRENT TOPICS IN MEMBRANES 2021; 87:171-198. [PMID: 34696885 DOI: 10.1016/bs.ctm.2021.09.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/14/2023]
Abstract
Endothelial cells (ECs) are constantly subjected to an array of mechanical cues, especially shear stress, due to their luminal placement in the blood vessels. Blood flow can regulate various aspects of endothelial biology and pathophysiology by regulating the endothelial processes at the transcriptomic, proteomic, miRNomic, metabolomic, and epigenomic levels. ECs sense, respond, and adapt to altered blood flow patterns and shear profiles by specialized mechanisms of mechanosensing and mechanotransduction, resulting in qualitative and quantitative differences in their gene expression. Chromatin-regulatory proteins can regulate transcriptional activation by modifying the organization of nucleosomes at promoters, enhancers, silencers, insulators, and locus control regions. Recent research efforts have illustrated that the SWI/SNF (SWItch/Sucrose Non-Fermentable) or BRG1/BRM-associated factor (BAF) complex regulates DNA accessibility and chromatin structure. Since its discovery, the gene-regulatory mechanisms of the BAF complex associated with chromatin remodeling have been intensively studied to investigate its role in diverse disease phenotypes. Thus far, it is evident that (1) the SWI/SNF complex broadly regulates the activity of transcriptional enhancers to control lineage-specific differentiation and (2) mutations in the BAF complex proteins lead to developmental disorders and cancers. It is unclear if blood flow can modulate the activity of the SWI/SNF complex to regulate EC differentiation and reprogramming. This review emphasizes the integrative role of the SWI/SNF complex from a structural and functional standpoint with a special reference to cardiovascular diseases (CVDs). The review also highlights how regulation of this complex by blood flow can lead to the discovery of new therapeutic interventions for the treatment of endothelial dysfunction in vascular diseases.
Affiliation(s)
- Sandeep Kumar
  - Wallace H. Coulter Department of Biomedical Engineering at Emory University and Georgia Institute of Technology, Atlanta, GA, United States
7
McCormack LS, Efremov AK, Yan J. Effects of size, cooperativity, and competitive binding on protein positioning on DNA. Biophys J 2021; 120:2040-2053. [PMID: 33771470 DOI: 10.1016/j.bpj.2021.03.016] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/24/2020] [Revised: 02/26/2021] [Accepted: 03/18/2021] [Indexed: 11/24/2022] Open
Abstract
Accurate positioning of proteins on chromosomal DNA is crucial for its proper organization as well as gene transcription regulation. Recent experiments revealed the existence of periodic patterns of nucleoprotein complexes on DNA, which frequently cannot be explained by sequence-dependent binding of proteins. Previous theoretical studies suggest that such patterns typically emerge as a result of the proteins' volume-exclusion effect. However, the role of other physical factors in pattern formation, such as the length of DNA, its sequence heterogeneity, and protein binding cooperativity/binding competition to DNA, remains unclear. To address these less understood yet important aspects, we investigated potential effects of these factors on protein positioning on finite-size DNA by using transfer-matrix calculations. It has been found that upon binding to DNA, proteins form oscillatory patterns that span a length of up to ∼10 times the size of the protein binding site, with the shape of the patterns being strongly dependent on the length of DNA and the proteins' binding cooperativity to DNA. Furthermore, calculations showed that small variations in the proteins' affinity to DNA due to its sequence heterogeneity do not much change the main geometric characteristics of the observed protein patterns. Finally, competition between two different types of proteins for binding to DNA has been found to lead to formation of highly diverse and complex alternating positioning of the two proteins. Altogether, these results provide new insights into the roles of physicochemical properties of proteins, the DNA length, and DNA-binding competition between proteins in formation of protein positioning patterns on DNA.
Affiliation(s)
- Leo S McCormack
  - Department of Physics, Imperial College London, London, United Kingdom; Mechanobiology Institute, National University of Singapore, Singapore, Singapore
- Artem K Efremov
  - Mechanobiology Institute, National University of Singapore, Singapore, Singapore
- Jie Yan
  - Mechanobiology Institute, National University of Singapore, Singapore, Singapore; Department of Physics, National University of Singapore, Singapore, Singapore
8
Levitsky V, Zemlyanskaya E, Oshchepkov D, Podkolodnaya O, Ignatieva E, Grosse I, Mironova V, Merkulova T. A single ChIP-seq dataset is sufficient for comprehensive analysis of motifs co-occurrence with MCOT package. Nucleic Acids Res 2019; 47:e139. [PMID: 31750523 PMCID: PMC6868382 DOI: 10.1093/nar/gkz800] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 03/30/2019] [Revised: 08/12/2019] [Accepted: 09/09/2019] [Indexed: 01/20/2023] Open
Abstract
Recognition of composite elements consisting of two transcription factor binding sites underlies studies of tissue-, stage- and condition-specific transcription. Genome-wide data on transcription factor binding generated with the ChIP-seq method facilitate the identification of composite elements, but the existing bioinformatics tools either require ChIP-seq datasets for both partner transcription factors or omit composite elements with overlapping motifs. Here we present a universal Motifs Co-Occurrence Tool (MCOT) that retrieves maximum information about overrepresented composite elements from a single ChIP-seq dataset. This includes homo- and heterotypic composite elements of four mutual orientations of motifs, separated by a spacer or overlapping, even if recognition of motifs within a composite element requires various stringencies. Analysis of 52 ChIP-seq datasets for 18 human transcription factors confirmed that, for over 60% of analyzed datasets and transcription factors, the predicted co-occurrence of motifs implied an experimentally proven protein-protein interaction of the respective transcription factors. Analysis of 164 ChIP-seq datasets for 57 mammalian transcription factors showed that predicted composite elements with overlapping motifs were more than twice as abundant as those with a spacer, and that they had a 1.5-fold increase in asymmetrical pairs of motifs, with one more conserved 'leading' motif and another 'guided' one.
Affiliation(s)
- Victor Levitsky
  - Department of Systems Biology, Institute of Cytology and Genetics, Novosibirsk 630090, Russia; Department of Natural Science, Novosibirsk State University, Novosibirsk 630090, Russia
- Elena Zemlyanskaya
  - Department of Systems Biology, Institute of Cytology and Genetics, Novosibirsk 630090, Russia; Department of Natural Science, Novosibirsk State University, Novosibirsk 630090, Russia
- Dmitry Oshchepkov
  - Department of Systems Biology, Institute of Cytology and Genetics, Novosibirsk 630090, Russia
- Olga Podkolodnaya
  - Department of Systems Biology, Institute of Cytology and Genetics, Novosibirsk 630090, Russia
- Elena Ignatieva
  - Department of Systems Biology, Institute of Cytology and Genetics, Novosibirsk 630090, Russia; Department of Natural Science, Novosibirsk State University, Novosibirsk 630090, Russia
- Ivo Grosse
  - Department of Natural Science, Novosibirsk State University, Novosibirsk 630090, Russia; Institute of Computer Science, Martin Luther University Halle-Wittenberg, Halle (Saale), Germany; German Centre for Integrative Biodiversity Research (iDiv), Halle-Jena-Leipzig, Leipzig, Germany
- Victoria Mironova
  - Department of Systems Biology, Institute of Cytology and Genetics, Novosibirsk 630090, Russia; Department of Natural Science, Novosibirsk State University, Novosibirsk 630090, Russia
- Tatyana Merkulova
  - Department of Natural Science, Novosibirsk State University, Novosibirsk 630090, Russia; Department of Molecular Genetics, Institute of Cytology and Genetics, Novosibirsk 630090, Russia
9
Ryan GE, Farley EK. Functional genomic approaches to elucidate the role of enhancers during development. WILEY INTERDISCIPLINARY REVIEWS. SYSTEMS BIOLOGY AND MEDICINE 2020; 12:e1467. [PMID: 31808313 PMCID: PMC7027484 DOI: 10.1002/wsbm.1467] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 06/07/2019] [Revised: 10/02/2019] [Accepted: 10/11/2019] [Indexed: 12/22/2022]
Abstract
Successful development depends on the precise tissue-specific regulation of genes by enhancers, genetic elements that act as switches to control when and where genes are expressed. Because enhancers are critical for development, and the majority of disease-associated mutations reside within enhancers, it is essential to understand which sequences within enhancers are important for function. Advances in sequencing technology have enabled the rapid generation of genomic data that predict putative active enhancers, but functionally validating these sequences at scale remains a fundamental challenge. Herein, we discuss the power of genome-wide strategies used to identify candidate enhancers, and also highlight limitations and misconceptions that have arisen from these data. We discuss the use of massively parallel reporter assays to test enhancers for function at scale. We also review recent advances in our ability to study gene regulation during development, including CRISPR-based tools to manipulate genomes and single-cell transcriptomics to finely map gene expression. Finally, we look ahead to a synthesis of complementary genomic approaches that will advance our understanding of enhancer function during development. This article is categorized under: Physiology > Mammalian Physiology in Health and Disease; Developmental Biology > Developmental Processes in Health and Disease; Laboratory Methods and Technologies > Genetic/Genomic Methods.
Affiliation(s)
- Genevieve E. Ryan
  - Department of Medicine, University of California, San Diego, California
  - Division of Biological Sciences, Department of Medicine, University of California, San Diego, California
- Emma K. Farley
  - Department of Medicine, University of California, San Diego, California
  - Division of Biological Sciences, Department of Medicine, University of California, San Diego, California
10
Sharma V, Majumdar S. Comparative analysis of ChIP-exo peak-callers: impact of data quality, read duplication and binding subtypes. BMC Bioinformatics 2020; 21:65. [PMID: 32085702 PMCID: PMC7035708 DOI: 10.1186/s12859-020-3403-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 07/30/2019] [Accepted: 02/10/2020] [Indexed: 01/26/2023] Open
Abstract
Background: ChIP (Chromatin immunoprecipitation)-exo has emerged as an important and versatile improvement over conventional ChIP-seq as it reduces the level of noise, maps the transcription factor (TF) binding location very precisely, up to single base-pair resolution, and enables binding mode prediction. The availability of numerous peak-callers for analyzing ChIP-exo reads has motivated the need to assess their performance and report which tool executes reasonably well for the task. Results: This study has focussed on comparing peak-callers that report direct binding events with those that report indirect binding events. The effect of strandedness of reads and duplication of data on the performance of peak-callers has been investigated. The number of peaks reported by each peak-caller is compared, followed by a comparison of the annotated motifs present in the reported peaks. The significance of peaks is assessed based on the presence of a motif in top peaks. Indirect binding tools have been compared on the basis of their ability to identify annotated motifs and predict the mode of protein-DNA interaction. Conclusion: By studying the output of the peak-callers investigated in this study, it is concluded that the tools that use self-learning algorithms, i.e. the tools that estimate all the essential parameters from the aligned reads, perform better than the algorithms which require formation of peak-pairs. The latest tools that account for indirect binding of TFs appear to be an upgrade over the available tools, as they are able to reveal valuable information about the mode of binding in addition to direct binding. Furthermore, the quality of ChIP-exo reads has important consequences for the output of data analysis.
Affiliation(s)
- Vasudha Sharma
  - Discipline of Biological Engineering, Indian Institute of Technology Gandhinagar, Palaj, Gujarat, 382355, India
- Sharmistha Majumdar
  - Discipline of Biological Engineering, Indian Institute of Technology Gandhinagar, Palaj, Gujarat, 382355, India
11
Zeiske T, Baburajendran N, Kaczynska A, Brasch J, Palmer AG, Shapiro L, Honig B, Mann RS. Intrinsic DNA Shape Accounts for Affinity Differences between Hox-Cofactor Binding Sites. Cell Rep 2018; 24:2221-2230. [PMID: 30157419 DOI: 10.1016/j.celrep.2018.07.100] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 03/28/2018] [Revised: 06/14/2018] [Accepted: 07/28/2018] [Indexed: 11/26/2022] Open
Abstract
Transcription factors bind to their binding sites over a wide range of affinities, yet how differences in affinity are encoded in DNA sequences is not well understood. Here, we report X-ray crystal structures of four heterodimers of the Hox protein AbdominalB bound with its cofactor Extradenticle to four target DNA molecules that differ in affinity by up to ∼20-fold. Remarkably, despite large differences in affinity, the overall structures are very similar in all four complexes. In contrast, the predicted shapes of the DNA binding sites (i.e., the intrinsic DNA shape) in the absence of bound protein are strikingly different from each other and correlate with affinity: binding sites that must change conformations upon protein binding have lower affinities than binding sites that have more optimal conformations prior to binding. Together, these observations suggest that intrinsic differences in DNA shape provide a robust mechanism for modulating affinity without affecting other protein-DNA interactions.
Affiliation(s)
- Tim Zeiske
  - Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY, USA
- Nithya Baburajendran
  - Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY, USA
- Anna Kaczynska
  - Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY, USA
- Julia Brasch
  - Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY, USA
- Arthur G Palmer
  - Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY, USA
- Lawrence Shapiro
  - Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY, USA; Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA
- Barry Honig
  - Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY, USA; Department of Systems Biology, Columbia University, New York, NY, USA; Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA
- Richard S Mann
  - Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY, USA; Department of Systems Biology, Columbia University, New York, NY, USA; Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA
12
How B-DNA Dynamics Decipher Sequence-Selective Protein Recognition. J Mol Biol 2019; 431:3845-3859. [DOI: 10.1016/j.jmb.2019.07.021] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Received: 05/13/2019] [Revised: 07/09/2019] [Accepted: 07/10/2019] [Indexed: 11/23/2022]
13
Shen Z, Lin Y, Zou Q. Transcription factors–DNA interactions in rice: identification and verification. Brief Bioinform 2019; 21:946-956. [DOI: 10.1093/bib/bbz045] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 01/30/2019] [Revised: 03/25/2019] [Accepted: 03/25/2019] [Indexed: 01/08/2023] Open
Abstract
The completion of the rice genome sequence paved the way for rice functional genomics research. Additionally, the functional characterization of transcription factors is currently a popular and crucial objective among researchers. Transcription factors are one of the groups of proteins that bind to either enhancer or promoter regions of genes to regulate expression. On the basis of several typical examples of transcription factor analyses, we herein summarize selected research strategies and methods and introduce their advantages and disadvantages. This review may provide some theoretical and technical guidelines for future investigations of transcription factors, which may be helpful to develop new rice varieties with ideal traits.
Affiliation(s)
- Zijie Shen
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Yuan Lin
  - Department of System Integration, Sparebanken Vest, Bergen, Norway
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
  - Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
14
|
Abstract
The IBDs, Crohn's disease and ulcerative colitis, are chronic inflammatory conditions of the gastrointestinal tract resulting from an aberrant immune response to enteric microbiota in genetically susceptible individuals. Disease presentation and progression within and across IBDs, especially Crohn's disease, are highly heterogeneous in location, severity of inflammation and other phenotypes. Current clinical classifications fail to accurately predict disease course and response to therapies. Genome-wide association studies have identified >240 loci that confer risk of IBD, but the clinical utility of these findings remains unclear, and mechanisms by which the genetic variants contribute to disease are largely unknown. In the past 5 years, the profiling of genome-wide gene expression, epigenomic features and gut microbiota composition in intestinal tissue and faecal samples has uncovered distinct molecular signatures that define IBD subtypes, including within Crohn's disease and ulcerative colitis. In this Review, we summarize studies in both adult and paediatric patients that have identified different IBD subtypes, which in some cases have been associated with distinct clinical phenotypes. We posit that genome-scale molecular phenotyping in large cohorts holds great promise not only to further our understanding of the diverse molecular causes of IBD but also for improving clinical trial design to develop more personalized disease management and treatment.
|
15
|
Venters BJ. Insights from resolving protein-DNA interactions at near base-pair resolution. Brief Funct Genomics 2019; 17:80-88. [PMID: 29211822 DOI: 10.1093/bfgp/elx043] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 01/09/2023]
Abstract
One of the central goals in molecular biology is to understand how cell-type-specific expression patterns arise through selective recruitment of RNA polymerase II (Pol II) to a subset of gene promoters. Pol II needs to be recruited to a precise genomic position at the proper time to produce messenger RNA from a DNA template. Ostensibly, transcription is a relatively simple cellular process; yet, experimentally measuring and then understanding the combinatorial possibilities of transcriptional regulators remain a daunting task. Since its introduction in 1985, chromatin immunoprecipitation (ChIP) has remained a key tool for investigating protein-DNA contacts in vivo. Over 30 years of intensive research using ChIP have provided numerous insights into mechanisms of gene regulation. As functional genomic technologies improve, they present new opportunities to address key biological questions. ChIP-exo is a refined version of ChIP-seq that significantly reduces background signal, while providing near base-pair mapping resolution for protein-DNA interactions. This review discusses the evolution of the ChIP assay over the years and the methodological differences between ChIP-seq, ChIP-exo and ChIP-nexus, and highlights new insights into epigenetic and transcriptional mechanisms that were uniquely enabled by the near base-pair resolution of ChIP-exo.
|
16
|
Lee NK, Li X, Wang D. A comprehensive survey on genetic algorithms for DNA motif prediction. Inf Sci (N Y) 2018. [DOI: 10.1016/j.ins.2018.07.004] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Indexed: 12/25/2022]
|
17
|
Welch R, Chung D, Grass J, Landick R, Keles S. Data exploration, quality control and statistical analysis of ChIP-exo/nexus experiments. Nucleic Acids Res 2017; 45:e145. [PMID: 28911122 PMCID: PMC5587812 DOI: 10.1093/nar/gkx594] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 04/13/2017] [Accepted: 07/12/2017] [Indexed: 01/03/2023]
Abstract
ChIP-exo/nexus experiments rely on innovative modifications of the commonly used ChIP-seq protocol for high resolution mapping of transcription factor binding sites. Although many aspects of the ChIP-exo data analysis are similar to those of ChIP-seq, these high throughput experiments pose a number of unique quality control and analysis challenges. We develop a novel statistical quality control pipeline and accompanying R/Bioconductor package, ChIPexoQual, to enable exploration and analysis of ChIP-exo and related experiments. ChIPexoQual evaluates a number of key issues including strand imbalance, library complexity, and signal enrichment of data. Assessment of these features is facilitated through diagnostic plots and summary statistics computed over regions of the genome with varying levels of coverage. We evaluated our QC pipeline with both large collections of public ChIP-exo/nexus data and multiple, new ChIP-exo datasets from Escherichia coli. ChIPexoQual analysis of these datasets resulted in guidelines for using these QC metrics across a wide range of sequencing depths and provided further insights for modelling ChIP-exo data.
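The strand-imbalance check mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the ChIPexoQual implementation (which is an R/Bioconductor package); the function names, the region counts, and the tolerance of 0.25 are all assumptions made for illustration. The idea is simply that a region with balanced exonuclease-trimmed reads should have about half of its reads on each strand.

```python
def forward_strand_ratio(forward_reads, reverse_reads):
    """Fraction of reads in a region that map to the forward strand.

    Returns None for regions with no reads, where the ratio is undefined.
    """
    total = forward_reads + reverse_reads
    if total == 0:
        return None
    return forward_reads / total


def is_strand_imbalanced(ratio, tolerance=0.25):
    """Flag a region whose ratio is further than `tolerance` from 0.5.

    The tolerance is an arbitrary illustrative cutoff, not a published threshold.
    """
    return ratio is not None and abs(ratio - 0.5) > tolerance


# Hypothetical (forward, reverse) read counts per region.
regions = {"region_a": (48, 52), "region_b": (95, 5), "region_c": (0, 0)}

for name, (fwd, rev) in regions.items():
    ratio = forward_strand_ratio(fwd, rev)
    print(name, ratio, is_strand_imbalanced(ratio))
```

Returning `None` for empty regions keeps low-coverage windows from being reported as either balanced or imbalanced.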
Affiliation(s)
- Rene Welch
  - Department of Statistics, University of Wisconsin-Madison, Madison, WI 53706, USA
- Dongjun Chung
  - Department of Public Health Sciences, Medical University of South Carolina, SC 29425, USA
- Jeffrey Grass
  - Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53726, USA
  - Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706, USA
- Robert Landick
  - Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53726, USA
  - Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706, USA
  - Department of Bacteriology, University of Wisconsin-Madison, Madison, WI 53706, USA
- Sündüz Keles
  - Department of Statistics, University of Wisconsin-Madison, Madison, WI 53706, USA
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53792, USA
|
18
|
Lewis L, Crawford GE, Furey TS, Rusyn I. Genetic and epigenetic determinants of inter-individual variability in responses to toxicants. CURRENT OPINION IN TOXICOLOGY 2017; 6:50-59. [PMID: 29276797 PMCID: PMC5739339 DOI: 10.1016/j.cotox.2017.08.006] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Indexed: 12/12/2022]
Abstract
It is well established that genetic variability has a major impact on susceptibility to common diseases and on responses to drugs and toxicants, and that it influences disease-related outcomes. The appreciation that epigenetic marks also vary across the population is growing with more data becoming available from studies in humans and model organisms. In addition, the links between genetic variability, toxicity outcomes and epigenetics are being actively explored. Recent studies demonstrate that gene-by-environment interactions involve both chromatin states and transcriptional regulation, and that epigenetics provides important mechanistic clues to connect expression-related quantitative trait loci (QTL) and disease outcomes. However, studies of Gene×Environment×Epigenetics further extend the complexity of the experimental designs and create a challenge for selecting the most informative epigenetic readouts that can be feasibly performed to interrogate multiple individuals, exposures, tissue types and toxicity phenotypes. We propose that among the many possible epigenetic experimental methodologies, assessment of chromatin accessibility coupled with total RNA levels provides a cost-effective and comprehensive option to sufficiently characterize the complexity of epigenetic and regulatory activity in the context of understanding the inter-individual variability in responses to toxicants.
Affiliation(s)
- Lauren Lewis
  - Department of Veterinary Integrative Biosciences, Texas A&M University, College Station, Texas
- Gregory E. Crawford
  - Center for Genomic and Computational Biology and Department of Pediatrics, Division of Medical Genetics, Duke University, Durham, NC, USA
- Terrence S. Furey
  - Department of Genetics, Department of Biology, Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Ivan Rusyn
  - Department of Veterinary Integrative Biosciences, Texas A&M University, College Station, Texas
|
19
|
WGADseq: Whole Genome Affinity Determination of Protein-DNA Binding Sites. Methods Mol Biol 2017. [PMID: 28842875 DOI: 10.1007/978-1-4939-7098-8_5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
Abstract
We present a method through which one may monitor the relative binding affinity of a given protein to DNA motifs on the scale of a whole genome. Briefly, the protein of interest is incubated with fragmented genomic DNA and then affixed to a column. Washes with buffers containing low salt concentrations will remove nonbound DNA fragments, while stepwise washes with increasing salt concentrations will elute more specifically bound fragments. Massive sequencing is used to identify eluted DNA fragments and map them on the genome, which permits us to classify the different binding sites according to their affinity and determine corresponding consensus motifs (if any).
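The affinity classification described in this abstract could be sketched roughly as follows in Python. This is only an illustrative assumption about how such a classification might be coded, not the published WGADseq procedure: the salt concentrations (in mM), the read counts, the `min_reads` cutoff, and the function name are all hypothetical. Each site is assigned the highest salt step at which it is still recovered, on the reasoning that more tightly bound fragments elute at higher salt.

```python
def affinity_class(counts_by_salt, min_reads=10):
    """Return the highest salt concentration at which a site still elutes.

    `counts_by_salt` maps a salt concentration to the number of reads mapped
    to the site in that elution fraction. A fraction counts as containing the
    site when it has at least `min_reads` reads (an arbitrary cutoff).
    Returns None when the site is not recovered in any fraction.
    """
    eluting = [salt for salt, reads in counts_by_salt.items() if reads >= min_reads]
    return max(eluting) if eluting else None


# Hypothetical read counts per site and salt step (mM), for illustration only.
site_counts = {
    "site_1": {100: 40, 300: 12, 500: 0},
    "site_2": {100: 5, 300: 30, 500: 25},
    "site_3": {100: 3, 300: 2, 500: 1},
}

classes = {site: affinity_class(counts) for site, counts in site_counts.items()}
print(classes)
```

A real analysis would also need to normalize for sequencing depth per fraction before comparing counts; that step is omitted here.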
|
20
|
Abstract
Chromosomes present one of the most challenging of all substrates for biochemical study. This is because genomic DNA is physically associated with an astonishing collection of nuclear factors, which serve to not only store the nucleic acid in a stable form, but also grant access to the information it encodes when needed. Understanding this complex molecular choreography is central to the field of epigenetics. One of the great challenges in this area is to move beyond correlative type information, which is now in abundant supply, to the point where we can truly connect the dots at the molecular level. Establishing such causal relationships requires precise manipulation of the covalent structure of chromatin. Tools for this purpose are currently in short supply, creating an opportunity that, as we will argue in this Perspective, is well suited to the sensibilities of the chemist.
Affiliation(s)
- Yael David
  - Chemical Biology Program, Memorial Sloan Kettering Cancer Center, New York, New York 10065, United States
- Tom W Muir
  - Department of Chemistry, Princeton University, Princeton, New Jersey 08544, United States
|
21
|
Chen X, Yu B, Carriero N, Silva C, Bonneau R. Mocap: large-scale inference of transcription factor binding sites from chromatin accessibility. Nucleic Acids Res 2017; 45:4315-4329. [PMID: 28334916 PMCID: PMC5416775 DOI: 10.1093/nar/gkx174] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Received: 06/20/2016] [Revised: 02/28/2017] [Accepted: 03/06/2017] [Indexed: 12/21/2022]
Abstract
Differential binding of transcription factors (TFs) at cis-regulatory loci drives the differentiation and function of diverse cellular lineages. Understanding the regulatory interactions that underlie cell fate decisions requires characterizing TF binding sites (TFBS) across multiple cell types and conditions. Techniques such as ChIP-seq can reveal genome-wide patterns of TF binding, but typically require laborious and costly experiments for each TF-cell-type (TFCT) condition of interest. Chromosomal accessibility assays can connect accessible chromatin in one cell type to many TFs through sequence motif mapping. Such methods, however, rarely take into account that the genomic context preferred by each factor differs from TF to TF, and from cell type to cell type. To address the differences in TF behaviors, we developed Mocap, a method that integrates chromatin accessibility, motif scores, TF footprints, CpG/GC content, evolutionary conservation and other factors in an ensemble of TFCT-specific classifiers. We show that integration of genomic features, such as CpG islands, improves TFBS prediction in some TFCT. Further, we describe a method for mapping new TFCT, for which no ChIP-seq data exist, onto our ensemble of classifiers and show that our cross-sample TFBS prediction method outperforms several previously described methods.
Affiliation(s)
- Xi Chen
  - Department of Biology, New York University, New York, NY 10003, USA
- Bowen Yu
  - Department of Computer Science, New York University, New York, NY 10003, USA
- Nicholas Carriero
  - Center for Computational Biology, Flatiron Foundation, Simons Foundation, New York, NY 10010, USA
- Claudio Silva
  - Department of Computer Science, New York University, New York, NY 10003, USA
- Richard Bonneau
  - Department of Biology, New York University, New York, NY 10003, USA
  - Department of Computer Science, New York University, New York, NY 10003, USA
  - Center for Computational Biology, Flatiron Foundation, Simons Foundation, New York, NY 10010, USA
|
22
|
Wedel C, Siegel TN. Genome-wide analysis of chromatin structures in Trypanosoma brucei using high-resolution MNase-ChIP-seq. Exp Parasitol 2017; 180:2-12. [PMID: 28286326 DOI: 10.1016/j.exppara.2017.03.003] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 10/01/2016] [Revised: 02/24/2017] [Accepted: 03/07/2017] [Indexed: 11/16/2022]
Abstract
Specific DNA-protein interactions are the basis for many important cellular mechanisms like the regulation of gene expression or replication. Knowledge about the precise genomic locations of DNA-protein interactions is important because it provides insight into the regulation of these processes. Recently, we have adapted an approach that combines micrococcal nuclease (MNase) digestion of chromatin with chromatin immunoprecipitation in Trypanosoma brucei. Here, we describe in detail how this method can be used to map the genome-wide distribution of nucleosomes or other DNA-binding proteins at high resolution in T. brucei.
Affiliation(s)
- Carolin Wedel
  - Research Center for Infectious Diseases (ZINF), University of Würzburg, Josef-Schneider-Straße 2 / Bau D15, 97080 Würzburg, Germany
- T Nicolai Siegel
  - Research Center for Infectious Diseases (ZINF), University of Würzburg, Josef-Schneider-Straße 2 / Bau D15, 97080 Würzburg, Germany
|
23
|
Fischl H, Howe FS, Furger A, Mellor J. Paf1 Has Distinct Roles in Transcription Elongation and Differential Transcript Fate. Mol Cell 2017; 65:685-698.e8. [PMID: 28190769 PMCID: PMC5316414 DOI: 10.1016/j.molcel.2017.01.006] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.1] [Received: 03/14/2016] [Revised: 09/22/2016] [Accepted: 01/05/2017] [Indexed: 12/12/2022]
Abstract
RNA polymerase II (Pol2) movement through chromatin and the co-transcriptional processing and fate of nascent transcripts are coordinated by transcription elongation factors (TEFs) such as polymerase-associated factor 1 (Paf1), but it is not known whether TEFs have gene-specific functions. Using strand-specific nucleotide resolution techniques, we show that levels of Paf1 on Pol2 vary between genes, are controlled dynamically by environmental factors via promoters, and reflect levels of processing and export factors on the encoded transcript. High levels of Paf1 on Pol2 promote transcript nuclear export, whereas low levels reflect nuclear retention. Strains lacking Paf1 show marked elongation defects, although low levels of Paf1 on Pol2 are sufficient for transcription elongation. Our findings support distinct Paf1 functions: a core general function in transcription elongation, satisfied by the lowest Paf1 levels, and a regulatory function in determining differential transcript fate by varying the level of Paf1 on Pol2.
Affiliation(s)
- Harry Fischl
  - Department of Biochemistry, University of Oxford, South Parks Road, Oxford OX1 3QU, UK
- Françoise S Howe
  - Department of Biochemistry, University of Oxford, South Parks Road, Oxford OX1 3QU, UK
- Andre Furger
  - Department of Biochemistry, University of Oxford, South Parks Road, Oxford OX1 3QU, UK
- Jane Mellor
  - Department of Biochemistry, University of Oxford, South Parks Road, Oxford OX1 3QU, UK
|
24
|
Zhou X, Yan Q, Wang N. Deciphering the regulon of a GntR family regulator via transcriptome and ChIP-exo analyses and its contribution to virulence in Xanthomonas citri. MOLECULAR PLANT PATHOLOGY 2017; 18:249-262. [PMID: 26972728 PMCID: PMC6638223 DOI: 10.1111/mpp.12397] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 12/11/2015] [Revised: 02/08/2016] [Accepted: 03/07/2016] [Indexed: 05/14/2023]
Abstract
Xanthomonas contains a large group of plant-associated species, many of which cause severe diseases on important crops worldwide. Six gluconate-operon repressor (GntR) family transcriptional regulators are predicted in Xanthomonas, one of which, belonging to the YtrA subfamily, plays a prominent role in bacterial virulence. However, the direct targets and comprehensive regulatory profile of YtrA remain unknown. Here, we performed microarray and high-resolution chromatin immunoprecipitation-exonuclease (ChIP-exo) experiments to identify YtrA direct targets and its DNA binding motif in X. citri ssp. citri (Xac), the causal agent of citrus canker. Integrative microarray and ChIP-exo data analysis revealed that YtrA directly regulates three operons by binding to a palindromic motif GGTG-N16-CACC at the promoter region. A similar palindromic motif and YtrA homologues were also identified in many other bacteria, including Stenotrophomonas, Pseudoxanthomonas and Frateuria, indicating a widespread phenomenon. Deletion of ytrA in Xac abolishes bacterial virulence and induction of the hypersensitive response (HR). We found that YtrA regulates the expression of hrp/hrc genes encoding the bacterial type III secretion system (T3SS) and controls multiple biological processes, including motility and adhesion, oxidative stress, extracellular enzyme production and iron uptake. YtrA represses the expression of its direct targets in artificial medium or in planta. Importantly, over-expression of yro3, one of the YtrA directly regulated operons which contains trmL and XAC0231, induced weaker canker symptoms and down-regulation of hrp/hrc gene expression, suggesting a negative regulation in Xac virulence and T3SS. Our study has significantly advanced the mechanistic understanding of YtrA regulation and its contribution to bacterial virulence.
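As a small illustration of what the palindromic motif in this abstract means in practice, the Python sketch below scans a sequence for GGTG, any 16 bases, then CACC. It assumes that N16 stands for an unconstrained 16-base spacer; the function names and the example promoter fragment are made up for illustration and are not taken from the study.

```python
import re

# Assumption: N16 is any 16 bases between the two half-sites.
# The lookahead lets overlapping occurrences be reported as well.
YTRA_MOTIF = re.compile(r"(?=(GGTG[ACGT]{16}CACC))")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_motif_sites(sequence):
    """Return (start, matched_sequence) for every motif occurrence."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in YTRA_MOTIF.finditer(seq)]


# Hypothetical promoter fragment, invented for this example.
promoter = "TTA" + "GGTG" + "AT" * 8 + "CACC" + "GA"

print(find_motif_sites(promoter))
print(reverse_complement("GGTG"))
```

Because CACC is the reverse complement of GGTG, the two half-sites read the same on either strand, so scanning one strand is enough to find every occurrence.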
Affiliation(s)
- Xiaofeng Zhou
  - Citrus Research and Education Center, Department of Microbiology and Cell Science, IFAS, University of Florida, 700 Experiment Station Road, Lake Alfred, FL 33850, USA
- Qing Yan
  - Citrus Research and Education Center, Department of Microbiology and Cell Science, IFAS, University of Florida, 700 Experiment Station Road, Lake Alfred, FL 33850, USA
- Nian Wang
  - Citrus Research and Education Center, Department of Microbiology and Cell Science, IFAS, University of Florida, 700 Experiment Station Road, Lake Alfred, FL 33850, USA
|
25
|
Abstract
Surveys of public sequence resources show that experimentally supported functional information is still completely missing for a considerable fraction of known proteins and is clearly incomplete for an even larger portion. Bioinformatics methods have long made use of very diverse data sources alone or in combination to predict protein function, with the understanding that different data types help elucidate complementary biological roles. This chapter focuses on methods accepting amino acid sequences as input and producing GO term assignments directly as outputs; the relevant biological and computational concepts are presented along with the advantages and limitations of individual approaches.
Affiliation(s)
- Domenico Cozzetto
  - Bioinformatics Group, Department of Computer Science, University College London, Gower Street, London, WC1E 6BT, UK
- David T Jones
  - Bioinformatics Group, Department of Computer Science, University College London, Gower Street, London, WC1E 6BT, UK
|
26
|
Cell Cycle Constraints and Environmental Control of Local DNA Hypomethylation in α-Proteobacteria. PLoS Genet 2016; 12:e1006499. [PMID: 27997543 PMCID: PMC5172544 DOI: 10.1371/journal.pgen.1006499] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 07/12/2016] [Accepted: 11/22/2016] [Indexed: 11/19/2022]
Abstract
Heritable DNA methylation imprints are ubiquitous and underlie genetic variability from bacteria to humans. In microbial genomes, DNA methylation has been implicated in gene transcription, DNA replication and repair, nucleoid segregation, transposition and virulence of pathogenic strains. Despite the importance of local (hypo)methylation at specific loci, how and when these patterns are established during the cell cycle remains poorly characterized. Taking advantage of the small genomes and the synchronizability of α-proteobacteria, we discovered that conserved determinants of the cell cycle transcriptional circuitry establish specific hypomethylation patterns in the cell cycle model system Caulobacter crescentus. We used genome-wide methyl-N6-adenine (m6A-) analyses by restriction-enzyme-cleavage sequencing (REC-Seq) and single-molecule real-time (SMRT) sequencing to show that MucR, a transcriptional regulator that represses virulence and cell cycle genes in S-phase but no longer in G1-phase, occludes 5'-GANTC-3' sequence motifs that are methylated by the DNA adenine methyltransferase CcrM. Constitutive expression of CcrM or heterologous methylases in at least two different α-proteobacteria homogenizes m6A patterns even when MucR is present and affects promoter activity. Environmental stress (phosphate limitation) can override and reconfigure local hypomethylation patterns imposed by the cell cycle circuitry that dictate when and where local hypomethylation is instated.
|
27
|
Lai WKM, Pugh BF. Genome-wide uniformity of human 'open' pre-initiation complexes. Genome Res 2016; 27:15-26. [PMID: 27927716 PMCID: PMC5204339 DOI: 10.1101/gr.210955.116] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Received: 06/06/2016] [Accepted: 11/03/2016] [Indexed: 01/05/2023]
Abstract
Transcription of protein-coding and noncoding DNA occurs pervasively throughout the mammalian genome. Their sites of initiation are generally inferred from transcript 5' ends and are thought to be either locally dispersed or focused. How these two modes of initiation relate is unclear. Here, we apply permanganate treatment and chromatin immunoprecipitation (PIP-seq) of initiation factors to identify the precise location of melted DNA separately associated with the preinitiation complex (PIC) and the adjacent paused complex (PC). This approach revealed the two known modes of transcription initiation. However, in contrast to prevailing views, they co-occurred within the same promoter region: initiation originating from a focused PIC, and broad nucleosome-linked initiation. PIP-seq allowed transcriptional orientation of Pol II to be determined, which may be useful near promoters where sufficient sense/anti-sense transcript mapping information is lacking. PIP-seq detected divergently oriented Pol II at both coding and noncoding promoters, as well as at enhancers. Their occupancy levels were not necessarily coupled in the two orientations. DNA sequence and shape analysis of initiation complex sites suggest that both sequence and shape contribute to specificity, but in a context-restricted manner. That is, initiation sites have the locally "best" initiator (INR) sequence and/or shape. These findings reveal a common core to pervasive Pol II initiation throughout the human genome.
Affiliation(s)
- William K M Lai
  - Center for Eukaryotic Gene Regulation, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- B Franklin Pugh
  - Center for Eukaryotic Gene Regulation, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
|
28
|
Fullard JF, Halene TB, Giambartolomei C, Haroutunian V, Akbarian S, Roussos P. Understanding the genetic liability to schizophrenia through the neuroepigenome. Schizophr Res 2016; 177:115-124. [PMID: 26827128 PMCID: PMC4963306 DOI: 10.1016/j.schres.2016.01.039] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 11/24/2015] [Revised: 01/14/2016] [Accepted: 01/18/2016] [Indexed: 12/17/2022]
Abstract
The Psychiatric Genomics Consortium-Schizophrenia Workgroup (PGC-SCZ) recently identified 108 loci associated with increased risk for schizophrenia (SCZ). The vast majority of these variants reside within non-coding sequences of the genome and are predicted to exert their effects by affecting the mechanism of action of cis regulatory elements (CREs), such as promoters and enhancers. Although a number of large-scale collaborative efforts (e.g. ENCODE) have achieved a comprehensive mapping of CREs in human cell lines or tissue homogenates, it is becoming increasingly evident that many risk-associated variants are enriched for expression Quantitative Trait Loci (eQTLs) and CREs in specific tissues or cells. As such, data derived from previous research endeavors may not capture fully cell-type and/or region specific changes associated with brain diseases. Coupling recent technological advances in genomics with cell-type specific methodologies, we are presented with an unprecedented opportunity to better understand the genetics of normal brain development and function and, in turn, the molecular basis of neuropsychiatric disorders. In this review, we will outline ongoing efforts towards this goal and will discuss approaches with the potential to shed light on the mechanism(s) of action of cell-type specific cis regulatory elements and their putative roles in disease, with particular emphasis on understanding the manner in which the epigenome and CREs influence the etiology of SCZ.
Affiliation(s)
- John F. Fullard
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Tobias B. Halene
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Mental Illness Research, Education, and Clinical Center (VISN 3), James J. Peters VA Medical Center, Bronx, NY, USA
- Vahram Haroutunian
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Mental Illness Research, Education, and Clinical Center (VISN 3), James J. Peters VA Medical Center, Bronx, NY, USA
- Schahram Akbarian
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Panos Roussos
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Department of Genetics and Genomic Science and Institute for Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Mental Illness Research, Education, and Clinical Center (VISN 3), James J. Peters VA Medical Center, Bronx, NY, USA
|
29
|
Sharma V, Monti P, Fronza G, Inga A. Human transcription factors in yeast: the fruitful examples of P53 and NF-κB. FEMS Yeast Res 2016; 16:fow083. [PMID: 27683095 DOI: 10.1093/femsyr/fow083] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Accepted: 09/24/2016] [Indexed: 12/31/2022]
Abstract
The observation that human transcription factors (TFs) can function when expressed in yeast cells has stimulated the development of various functional assays to investigate (i) the role of binding site sequences (herein referred to as response elements, REs) in transactivation specificity, (ii) the impact of polymorphic nucleotide variants on transactivation potential, (iii) the functional consequences of mutations in TFs and (iv) the impact of cofactors or small molecules. These approaches have found applications in basic as well as applied research, including the identification and the characterisation of mutant TF alleles from clinical samples. The ease of genome editing of yeast cells and the availability of regulated systems for ectopic protein expression enabled the development of quantitative reporter systems, integrated at a chosen chromosomal locus in isogenic yeast strains that differ only at the level of a specific RE targeted by a TF or for the expression of distinct TF alleles. In many cases, these assays were proven predictive of results in higher eukaryotes. The potential to work in small volume formats and the availability of yeast strains with modified chemical uptake have enhanced the scalability of these approaches. Next to well-established one-, two-, three-hybrid assays, the functional assays with non-chimeric human TFs enrich the palette of opportunities for functional characterisation. We review ∼25 years of research on human sequence-specific TFs expressed in yeast, with an emphasis on the P53 and NF-κB family of proteins, highlighting outcomes, advantages, challenges and limitations of these heterologous assays.
Affiliation(s)
- Vasundhara Sharma
  - Centre for Integrative Biology, CIBIO, University of Trento, via Sommarive 9, 38123, Trento, Italy
- Paola Monti
  - U.O.C. Mutagenesi, IRCCS AOU San Martino-IST, Largo R. Benzi, 10, 16132, Genova, Italy
- Gilberto Fronza
  - U.O.C. Mutagenesi, IRCCS AOU San Martino-IST, Largo R. Benzi, 10, 16132, Genova, Italy
- Alberto Inga
  - Centre for Integrative Biology, CIBIO, University of Trento, via Sommarive 9, 38123, Trento, Italy
|
30
|
Terooatea TW, Pozner A, Buck-Koehntop BA. PAtCh-Cap: input strategy for improving analysis of ChIP-exo data sets and beyond. Nucleic Acids Res 2016; 44:e159. [PMID: 27550178 PMCID: PMC5137431 DOI: 10.1093/nar/gkw741] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 04/27/2016] [Revised: 08/08/2016] [Accepted: 08/12/2016] [Indexed: 11/24/2022]
Abstract
Recently, a number of advances have been implemented into the core ChIP-seq (chromatin immunoprecipitation coupled with next-generation sequencing) methodology to streamline the process, reduce costs or improve data resolution. Several of these emerging ChIP-based methods perform additional chemical steps on bead-bound immunoprecipitated chromatin, posing a challenge for generating similarly treated input controls required for artifact removal during bioinformatics analyses. Here we present a versatile method for producing technique-specific input controls for ChIP-based methods that utilize additional bead-bound processing steps. This reported method, termed protein attached chromatin capture (PAtCh-Cap), relies on the non-specific capture of chromatin-bound proteins via their carboxylate groups, leaving the DNA accessible for subsequent chemical treatments in parallel with chromatin separately immunoprecipitated for the target protein. Application of this input strategy not only significantly enhanced artifact removal from ChIP-exo data, increasing confidence in peak identification and allowing for de novo motif searching, but also afforded discovery of a novel CTCF binding motif.
Affiliation(s)
- Tommy W Terooatea
- Department of Chemistry, University of Utah, Salt Lake City, UT 84112, USA
- Amir Pozner
- Department of Chemistry, University of Utah, Salt Lake City, UT 84112, USA
|
31
|
Genome-wide profiling of RNA polymerase transcription at nucleotide resolution in human cells with native elongating transcript sequencing. Nat Protoc 2016; 11:813-33. [PMID: 27010758 DOI: 10.1038/nprot.2016.047] [Citation(s) in RCA: 53] [Impact Index Per Article: 5.9] [Indexed: 01/04/2023]
Abstract
Many features of how gene transcription occurs in human cells remain unclear, mainly because of a lack of quantitative approaches to follow genome transcription with nucleotide precision in vivo. Here we present a robust genome-wide approach for studying RNA polymerase II (Pol II)-mediated transcription in human cells at single-nucleotide resolution by native elongating transcript sequencing (NET-seq). Elongating RNA polymerase and the associated nascent RNA are prepared by cell fractionation, avoiding immunoprecipitation or RNA labeling. The 3' ends of nascent RNAs are captured through barcode linker ligation and converted into a DNA sequencing library. The identity and abundance of the 3' ends are determined by high-throughput sequencing, which reveals the exact genomic locations of Pol II. Human NET-seq can be applied to the study of the full spectrum of Pol II transcriptional activities, including the production of unstable RNAs and transcriptional pausing. By using the protocol described here, a NET-seq library can be obtained from human cells in 5 d.
|
32
|
Abstract
Nucleosome positioning is an important process required for proper genome packing and its accessibility to execute the genetic program in a cell-specific, timely manner. In recent years, hundreds of papers have been devoted to the bioinformatics, physics and biology of nucleosome positioning. The purpose of this review is to cover a practical aspect of this field, namely, to provide a guide to the multitude of nucleosome positioning resources available online. These include almost 300 experimental datasets of genome-wide nucleosome occupancy profiles determined in different cell types and more than 40 computational tools for the analysis of experimental nucleosome positioning data and prediction of intrinsic nucleosome formation probabilities from the DNA sequence. A manually curated, up-to-date list of these resources will be maintained at http://generegulation.info.
|
33
|
Madrigal P. On Accounting for Sequence-Specific Bias in Genome-Wide Chromatin Accessibility Experiments: Recent Advances and Contradictions. Front Bioeng Biotechnol 2015; 3:144. [PMID: 26442258 PMCID: PMC4585268 DOI: 10.3389/fbioe.2015.00144] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 06/14/2015] [Accepted: 09/07/2015] [Indexed: 11/13/2022]
Affiliation(s)
- Pedro Madrigal
- Wellcome Trust Sanger Institute, Cambridge, UK; Department of Surgery, University of Cambridge, Cambridge, UK
|