1
Liu S, Li X, Gao H, Chen J, Jiang H. Progress in Aptamer Research and Future Applications. ChemistryOpen 2025:e202400463. [PMID: 39901496 DOI: 10.1002/open.202400463] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2024] [Revised: 01/08/2025] [Indexed: 02/05/2025] Open
Abstract
Aptamers are short, single-stranded DNA, RNA or synthetic XNA molecules that bind to target molecules with high specificity and affinity. These intrinsically structured RNA or DNA oligonucleotides are not only substitutes for antibodies, but also show great potential for applications in diagnostics, specific drug delivery, and treatment of certain diseases. The core process of aptamer identification is known as systematic evolution of ligands by exponential enrichment (SELEX); it involves a number of individual steps, each contributing to the success or failure of aptamer generation. Today, aptamers are widely used to facilitate basic research discoveries and clinical diagnostics. In addition, aptamers play a promising role as clinical diagnostic and therapeutic agents. This review summarizes recent advances in this rapidly growing field of research, with special emphasis on aptamer generation and screening, small molecule aptamers, the development of aptamer applications, and applications in clinical medicine. It also discusses the problems that still exist today with aptamers.
Affiliation(s)
- Song Liu
  - Beijing Anzhen Hospital, Capital Medical University, Experimental Research Center, Beijing Institute of Heart Lung and Blood Vessel Disease, Beijing, China
- Xiaolu Li
  - Beijing Anzhen Hospital, Capital Medical University, Experimental Research Center, Beijing Institute of Heart Lung and Blood Vessel Disease, Beijing, China
- Huyang Gao
  - Guangxi Medical University, Life Sciences Institute, Nanning, China
- Jing Chen
  - Beijing Anzhen Hospital, Capital Medical University, Experimental Research Center, Beijing Institute of Heart Lung and Blood Vessel Disease, Beijing, China
- Hongfeng Jiang
  - Beijing Anzhen Hospital, Capital Medical University, Experimental Research Center, Beijing Institute of Heart Lung and Blood Vessel Disease, Beijing, China
2
Meger AT, Spence MA, Sandhu M, Matthews D, Chen J, Jackson CJ, Raman S. Rugged fitness landscapes minimize promiscuity in the evolution of transcriptional repressors. Cell Syst 2024; 15:374-387.e6. [PMID: 38537640 PMCID: PMC11299162 DOI: 10.1016/j.cels.2024.03.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2023] [Revised: 09/08/2023] [Accepted: 03/05/2024] [Indexed: 04/20/2024]
Abstract
How a protein's function influences the shape of its fitness landscape, smooth or rugged, is a fundamental question in evolutionary biochemistry. Smooth landscapes arise when incremental mutational steps lead to a progressive change in function, as commonly seen in enzymes and binding proteins. On the other hand, rugged landscapes are poorly understood because of the inherent unpredictability of how sequence changes affect function. Here, we experimentally characterize the entire sequence phylogeny, comprising 1,158 extant and ancestral sequences, of the DNA-binding domain (DBD) of the LacI/GalR transcriptional repressor family. Our analysis revealed an extremely rugged landscape with rapid switching of specificity, even between adjacent nodes. Further, the ruggedness arises from the need for the repressor to simultaneously evolve specificity for asymmetric operators and disfavor potentially adverse regulatory crosstalk. Our study provides fundamental insight into evolutionary, molecular, and biophysical rules of genetic regulation through the lens of fitness landscapes.
Affiliation(s)
- Anthony T Meger
  - Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706, USA
- Matthew A Spence
  - Research School of Chemistry, Australian National University, Canberra, ACT 2601, Australia
- Mahakaran Sandhu
  - Research School of Chemistry, Australian National University, Canberra, ACT 2601, Australia
- Dana Matthews
  - Research School of Biology, Australian National University, Canberra, ACT 2601, Australia
- Jackie Chen
  - Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706, USA
- Colin J Jackson
  - Research School of Chemistry, Australian National University, Canberra, ACT 2601, Australia; ARC Centre of Excellence for Innovations in Peptide & Protein Science, Research School of Chemistry, Australian National University, Canberra, ACT 2601, Australia; ARC Centre of Excellence for Innovations in Synthetic Biology, Research School of Chemistry, Australian National University, Canberra, ACT 2601, Australia.
- Srivatsan Raman
  - Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706, USA; Department of Bacteriology, University of Wisconsin-Madison, Madison, WI 53706, USA; Department of Chemical and Biological Engineering, University of Wisconsin-Madison, Madison, WI 53706, USA.
3
Zheng Y, Sun C, Zhang X, Ruzycki PA, Chen S. Missense mutations in CRX homeodomain cause dominant retinopathies through two distinct mechanisms. eLife 2023; 12:RP87147. [PMID: 37963072 PMCID: PMC10645426 DOI: 10.7554/elife.87147] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Indexed: 11/16/2023] Open
Abstract
Homeodomain transcription factors (HD TFs) are instrumental to vertebrate development. Mutations in HD TFs have been linked to human diseases, but their pathogenic mechanisms remain elusive. Here, we use Cone-Rod Homeobox (CRX) as a model to decipher the disease-causing mechanisms of two HD mutations, p.E80A and p.K88N, that produce severe dominant retinopathies. Through integrated analysis of molecular and functional evidence in vitro and in knock-in mouse models, we uncover two novel gain-of-function mechanisms: p.E80A increases CRX-mediated transactivation of canonical CRX target genes in developing photoreceptors; p.K88N alters CRX DNA-binding specificity resulting in binding at ectopic sites and severe perturbation of CRX target gene expression. Both mechanisms produce novel retinal morphological defects and hinder photoreceptor maturation distinct from loss-of-function models. This study reveals the distinct roles of E80 and K88 residues in CRX HD regulatory functions and emphasizes the importance of transcriptional precision in normal development.
Affiliation(s)
- Yiqiao Zheng
  - Molecular Genetic and Genomics Graduate Program, Division of Biological and Biomedical Sciences, Washington University in St Louis, Saint Louis, United States
  - Department of Ophthalmology and Visual Sciences, Washington University in St Louis, Saint Louis, United States
- Chi Sun
  - Molecular Genetic and Genomics Graduate Program, Division of Biological and Biomedical Sciences, Washington University in St Louis, Saint Louis, United States
  - Department of Ophthalmology and Visual Sciences, Washington University in St Louis, Saint Louis, United States
- Xiaodong Zhang
  - Department of Ophthalmology and Visual Sciences, Washington University in St Louis, Saint Louis, United States
- Philip A Ruzycki
  - Department of Ophthalmology and Visual Sciences, Washington University in St Louis, Saint Louis, United States
  - Department of Genetics, Washington University in St Louis, Saint Louis, United States
- Shiming Chen
  - Molecular Genetic and Genomics Graduate Program, Division of Biological and Biomedical Sciences, Washington University in St Louis, Saint Louis, United States
  - Department of Ophthalmology and Visual Sciences, Washington University in St Louis, Saint Louis, United States
  - Department of Developmental Biology, Washington University in St Louis, Saint Louis, United States
4
Chen H, Xu Y, Jin J, Su XD. KaScape: a sequencing-based method for global characterization of protein‒DNA binding affinity. Sci Rep 2023; 13:16595. [PMID: 37789131 PMCID: PMC10547764 DOI: 10.1038/s41598-023-43426-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Accepted: 09/23/2023] [Indexed: 10/05/2023] Open
Abstract
It is difficult to exhaustively screen all possible DNA binding sequences for a given transcription factor (TF). Here, we developed the KaScape method, in which TFs bind to all possible DNA sequences in the same DNA pool where DNA sequences are prepared by randomized oligo synthesis and the random length can be adjusted to a length such as 4, 5, 6, or 7. After separating bound from unbound double-stranded DNAs (dsDNAs), their sequences are determined by next-generation sequencing. To demonstrate the relative binding affinities of all possible DNA sequences determined by KaScape, we developed three-dimensional KaScape viewing software based on a K-mer graph. We applied KaScape to 12 plant TF family AtWRKY proteins and found that all AtWRKY proteins bound to the core sequence GAC with similar profiles. KaScape can detect not only binding sequences consistent with the consensus W-box "TTGAC(C/T)" but also other sequences with weak affinity. KaScape provides a high-throughput, easy-to-operate, sensitive, and exhaustive method for quantitatively characterizing the relative binding strength of a TF with all possible binding sequences, allowing us to comprehensively characterize the specificity and affinity landscape of transcription factors, particularly for moderate- and low-affinity binding sites.
Affiliation(s)
- Hong Chen
  - State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, and Biomedical Pioneering Innovation Center (BIOPIC), Peking University, Beijing, 100871, China
- Yongping Xu
  - State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, and Biomedical Pioneering Innovation Center (BIOPIC), Peking University, Beijing, 100871, China
- Jianshi Jin
  - State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, 1 Beichen West Road, Chaoyang District, Beijing, 100101, People's Republic of China
- Xiao-Dong Su
  - State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, and Biomedical Pioneering Innovation Center (BIOPIC), Peking University, Beijing, 100871, China.
5
Zuo Z. Quantifying the arms race between LINE-1 and KRAB-zinc finger genes through TECookbook. NAR Genom Bioinform 2023; 5:lqad078. [PMID: 37680368 PMCID: PMC10480687 DOI: 10.1093/nargab/lqad078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2023] [Revised: 07/13/2023] [Accepted: 08/22/2023] [Indexed: 09/09/2023] Open
Abstract
To defend against the invasion of transposons, hundreds of KRAB-zinc finger genes (ZNFs) evolved to recognize and silence various repeat families specifically. However, most repeat elements reside in the human genome with high copy numbers, making the ChIP-seq reads of ZNFs targeting these repeats predominantly multi-mapping reads. This complicates downstream data analysis and signal quantification. To better visualize and quantify the arms race between transposons and ZNFs, the R package TECookbook has been developed to lift ChIP-seq data into reference repeat coordinates with proper normalization and extract all putative ZNF binding sites from defined loci of reference repeats for downstream analysis. In conjunction with specificity profiles derived from in vitro Spec-seq data, human ZNF10 has been found to bind to a conserved ORF2 locus of selected LINE-1 subfamilies. This provides insight into how LINE-1 evaded capture at least twice and was subsequently recaptured by ZNF10 during evolutionary history. Through similar analyses, ZNF382 and ZNF248 were shown to be broad-spectrum LINE-1 binders. Overall, this work establishes a general analysis workflow to decipher the arms race between ZNFs and transposons through nucleotide substitutions rather than structural variations, particularly in the protein-coding region of transposons.
Affiliation(s)
- Zheng Zuo
  - Shenzhen University, Shenzhen, China
6
Zuo Z, Billings T, Walker M, Petkov PM, Fordyce P, Stormo GD. On the dependent recognition of some long zinc finger proteins. Nucleic Acids Res 2023; 51:5364-5376. [PMID: 36951113 PMCID: PMC10287918 DOI: 10.1093/nar/gkad207] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/26/2023] [Revised: 02/28/2023] [Accepted: 03/13/2023] [Indexed: 03/24/2023] Open
Abstract
The human genome contains about 800 C2H2 zinc finger proteins (ZFPs), and most of them are composed of long arrays of zinc fingers. The standard ZFP recognition model asserts that longer finger arrays should recognize longer DNA-binding sites. However, recent experimental efforts to identify in vivo ZFP binding sites contradict this assumption, with many exhibiting short motifs. Here we use ZFY, CTCF, ZIM3, and ZNF343 as examples to address three closely related questions: What are the reasons that impede current motif discovery methods? What are the functions of those seemingly unused fingers? And how can we improve the motif discovery algorithms based on long ZFPs' biophysical properties? Using ZFY, we employed a variety of methods and found evidence for 'dependent recognition' where downstream fingers can recognize some previously undiscovered motifs only in the presence of an intact core site. For CTCF, high-throughput measurements revealed that its upstream specificity profile depends on the strength of its core. Moreover, the binding strength of the upstream site modulates CTCF's sensitivity to different epigenetic modifications within the core, providing new insight into how the previously identified intellectual disability-causing and cancer-related mutant R567W disrupts upstream recognition and deregulates the epigenetic control by CTCF. Our results establish that, because of irregular motif structures, variable spacing and dependent recognition between sub-motifs, the specificities of long ZFPs are significantly underestimated, so we developed an algorithm, ModeMap, to infer the motifs and recognition models of ZIM3 and ZNF343, which facilitates high-confidence identification of specific binding sites, including repeats-derived elements. With a revised concept, technique, and algorithm, we can discover the overlooked specificities and functions of those 'extra' fingers, and therefore decipher their broader roles in human biology and diseases.
Affiliation(s)
- Zheng Zuo
  - Department of Genetics, Stanford University, CA, USA
  - Department of Genetics, Washington University in St. Louis, MO, USA
- Polly M Fordyce
  - Department of Genetics, Stanford University, CA, USA
  - Chan Zuckerberg Biohub, San Francisco, CA, USA
  - Department of Bioengineering, Stanford University, CA, USA
  - Stanford ChEM-H Institute, Stanford University, CA, USA
- Gary D Stormo
  - Department of Genetics, Washington University in St. Louis, MO, USA
7
Zheng Y, Sun C, Zhang X, Ruzycki PA, Chen S. Missense mutations in CRX homeodomain cause dominant retinopathies through two distinct mechanisms. bioRxiv 2023:2023.02.01.526652. [PMID: 36778408 PMCID: PMC9915647 DOI: 10.1101/2023.02.01.526652] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/07/2023]
Abstract
Homeodomain transcription factors (HD TFs) are instrumental to vertebrate development. Mutations in HD TFs have been linked to human diseases, but their pathogenic mechanisms remain elusive. Here we use Cone-Rod Homeobox (CRX) as a model to decipher the disease-causing mechanisms of two HD mutations, p.E80A and p.K88N, that produce severe dominant retinopathies. Through integrated analysis of molecular and functional evidence in vitro and in knock-in mouse models, we uncover two novel gain-of-function mechanisms: p.E80A increases CRX-mediated transactivation of canonical CRX target genes in developing photoreceptors; p.K88N alters CRX DNA-binding specificity resulting in binding at ectopic sites and severe perturbation of CRX target gene expression. Both mechanisms produce novel retinal morphological defects and hinder photoreceptor maturation distinct from loss-of-function models. This study reveals the distinct roles of E80 and K88 residues in CRX HD regulatory functions and emphasizes the importance of transcriptional precision in normal development.
Affiliation(s)
- Yiqiao Zheng
  - Molecular Genetic and Genomics Graduate Program, Division of Biological and Biomedical Sciences, Washington University in St Louis, Saint Louis, Missouri, USA
  - Department of Ophthalmology and Visual Sciences, Washington University in St Louis, Saint Louis, Missouri, USA
- Chi Sun
  - Department of Ophthalmology and Visual Sciences, Washington University in St Louis, Saint Louis, Missouri, USA
- Xiaodong Zhang
  - Department of Ophthalmology and Visual Sciences, Washington University in St Louis, Saint Louis, Missouri, USA
- Philip A. Ruzycki
  - Department of Ophthalmology and Visual Sciences, Washington University in St Louis, Saint Louis, Missouri, USA
  - Department of Genetics, Washington University in St Louis, Saint Louis, Missouri, USA
- Shiming Chen
  - Molecular Genetic and Genomics Graduate Program, Division of Biological and Biomedical Sciences, Washington University in St Louis, Saint Louis, Missouri, USA
  - Department of Ophthalmology and Visual Sciences, Washington University in St Louis, Saint Louis, Missouri, USA
  - Department of Developmental Biology, Washington University in St Louis, Saint Louis, Missouri, USA
8
Rube HT, Rastogi C, Feng S, Kribelbauer JF, Li A, Becerra B, Melo LAN, Do BV, Li X, Adam HH, Shah NH, Mann RS, Bussemaker HJ. Prediction of protein-ligand binding affinity from sequencing data with interpretable machine learning. Nat Biotechnol 2022; 40:1520-1527. [PMID: 35606422 PMCID: PMC9546773 DOI: 10.1038/s41587-022-01307-0] [Citation(s) in RCA: 55] [Impact Index Per Article: 18.3] [Received: 06/06/2021] [Accepted: 04/04/2022] [Indexed: 01/02/2023]
Abstract
Protein-ligand interactions are increasingly profiled at high throughput using affinity selection and massively parallel sequencing. However, these assays do not provide the biophysical parameters that most rigorously quantify molecular interactions. Here we describe a flexible machine learning method, called ProBound, that accurately defines sequence recognition in terms of equilibrium binding constants or kinetic rates. This is achieved using a multi-layered maximum-likelihood framework that models both the molecular interactions and the data generation process. We show that ProBound quantifies transcription factor (TF) behavior with models that predict binding affinity over a range exceeding that of previous resources; captures the impact of DNA modifications and conformational flexibility of multi-TF complexes; and infers specificity directly from in vivo data such as ChIP-seq without peak calling. When coupled with an assay called KD-seq, it determines the absolute affinity of protein-ligand interactions. We also apply ProBound to profile the kinetics of kinase-substrate interactions. ProBound opens new avenues for decoding biological networks and rationally engineering protein-ligand interactions.
Affiliation(s)
- H Tomas Rube
  - Department of Bioengineering, University of California, Merced, Merced, CA, USA
  - Department of Biological Sciences, Columbia University, New York, NY, USA
- Chaitanya Rastogi
  - Department of Biological Sciences, Columbia University, New York, NY, USA
- Siqian Feng
  - Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY, USA
- Allyson Li
  - Department of Chemistry, Columbia University, New York, NY, USA
- Basheer Becerra
  - Department of Biological Sciences, Columbia University, New York, NY, USA
- Lucas A N Melo
  - Department of Biological Sciences, Columbia University, New York, NY, USA
- Bach Viet Do
  - Department of Biological Sciences, Columbia University, New York, NY, USA
- Xiaoting Li
  - Department of Biological Sciences, Columbia University, New York, NY, USA
- Hammaad H Adam
  - Department of Biological Sciences, Columbia University, New York, NY, USA
- Neel H Shah
  - Department of Chemistry, Columbia University, New York, NY, USA
- Richard S Mann
  - Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY, USA
  - Department of Systems Biology, Columbia University, New York, NY, USA
- Harmen J Bussemaker
  - Department of Biological Sciences, Columbia University, New York, NY, USA.
  - Department of Systems Biology, Columbia University, New York, NY, USA.
9
Poon GMK. The Non-continuum Nature of Eukaryotic Transcriptional Regulation. Advances in Experimental Medicine and Biology 2022; 1371:11-32. [PMID: 33616894 PMCID: PMC8380751 DOI: 10.1007/5584_2021_618] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/25/2024]
Abstract
Eukaryotic transcription factors are versatile mediators of specificity in gene regulation. This versatility is achieved through mutual specification by context-specific DNA binding on the one hand, and identity-specific protein-protein partnerships on the other. This interactivity, known as combinatorial control, enables a repertoire of complex transcriptional outputs that are qualitatively disjoint, or non-continuum, with respect to binding affinity. This feature contrasts starkly with prokaryotic gene regulators, whose activities in general vary quantitatively in step with binding affinity. Biophysical studies on prokaryotic model systems and more recent investigations on transcription factors highlight an important role for folded state dynamics and molecular hydration in protein/DNA recognition. Analysis of molecular models of combinatorial control and recent literature in low-affinity gene regulation suggest that transcription factors harbor unique conformational dynamics that are inaccessible or unused by prokaryotic DNA-binding proteins. Thus, understanding the intrinsic dynamics involved in DNA binding and co-regulator recruitment appears to be a key to understanding how transcription factors mediate non-continuum outcomes in eukaryotic gene expression, and how such capability might have evolved from ancient, structurally conserved counterparts.
Affiliation(s)
- Gregory M K Poon
  - Department of Chemistry, Georgia State University, Atlanta, GA, USA.
  - Center for Diagnostics and Therapeutics, Georgia State University, Atlanta, GA, USA.
10
Ge W, Meier M, Roth C, Söding J. Bayesian Markov models improve the prediction of binding motifs beyond first order. NAR Genom Bioinform 2021; 3:lqab026. [PMID: 33928244 PMCID: PMC8057495 DOI: 10.1093/nargab/lqab026] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/02/2020] [Revised: 03/11/2021] [Accepted: 03/30/2021] [Indexed: 12/13/2022] Open
Abstract
Transcription factors (TFs) regulate gene expression by binding to specific DNA motifs. Accurate models for predicting binding affinities are crucial for a quantitative understanding of transcriptional regulation. Motifs are commonly described by position weight matrices, which assume that each position contributes independently to the binding energy. Models that can learn dependencies between positions, for instance, induced by DNA structure preferences, have yielded markedly improved predictions for most TFs on in vivo data. However, they are more prone to overfit the data and to learn patterns merely correlated with rather than directly involved in TF binding. We present an improved, faster version of our Bayesian Markov model software, BaMMmotif2. We tested it with state-of-the-art motif discovery tools on a large collection of ChIP-seq and HT-SELEX datasets. BaMMmotif2 models of fifth-order achieved a median false-discovery-rate-averaged recall 13.6% and 12.2% higher than the next best tool on 427 ChIP-seq datasets and 164 HT-SELEX datasets, respectively, while being 8 to 1000 times faster. BaMMmotif2 models showed no signs of overtraining in cross-cell line and cross-platform tests, with similar improvements on the next-best tool. These results demonstrate that dependencies beyond first order clearly improve binding models for most TFs.
Affiliation(s)
- Wanwan Ge
  - Quantitative and Computational Biology, Max Planck Institute for Biophysical Chemistry, Am Fassberg 11, 37077 Göttingen, Germany
- Markus Meier
  - Quantitative and Computational Biology, Max Planck Institute for Biophysical Chemistry, Am Fassberg 11, 37077 Göttingen, Germany
- Christian Roth
  - Quantitative and Computational Biology, Max Planck Institute for Biophysical Chemistry, Am Fassberg 11, 37077 Göttingen, Germany
- Johannes Söding
  - Quantitative and Computational Biology, Max Planck Institute for Biophysical Chemistry, Am Fassberg 11, 37077 Göttingen, Germany
11
Long P, Zhang L, Huang B, Chen Q, Liu H. Integrating genome sequence and structural data for statistical learning to predict transcription factor binding sites. Nucleic Acids Res 2020; 48:12604-12617. [PMID: 33264415 PMCID: PMC7736823 DOI: 10.1093/nar/gkaa1134] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 06/22/2020] [Revised: 09/18/2020] [Accepted: 11/10/2020] [Indexed: 01/11/2023] Open
Abstract
We report an approach to predict DNA specificity of the tetracycline repressor (TetR) family transcription regulators (TFRs). First, a genome sequence-based method was streamlined with quantitative P-values defined to filter for reliable predictions. Then, a framework was introduced to incorporate structural data and to train a statistical energy function to score the pairing between TFR and TFR binding site (TFBS) based on sequences. When the predictions were benchmarked against experiments, TFBSs for 29 out of 30 TFRs were correctly predicted by either the genome sequence-based or the statistical energy-based method. Using P-values or Z-scores as indicators, we estimate that 59.6% of TFRs are covered with relatively reliable predictions by at least one of the two methods, while only 28.7% are covered by the genome sequence-based method alone. Our approach predicts a large number of new TFBSs which cannot be correctly retrieved from public databases such as FootprintDB. High-throughput experimental assays suggest that the statistical energy can model the TFBSs of a significant number of TFRs reliably. Thus the energy function may be applied to search for new TFBSs in respective genomes. It is possible to extend our approach to other transcription factor families with sufficient structural information.
Affiliation(s)
- Pengpeng Long
  - School of Life Sciences, University of Science and Technology of China, Hefei, Anhui 230026, China
- Lu Zhang
  - School of Life Sciences, University of Science and Technology of China, Hefei, Anhui 230026, China
- Bin Huang
  - School of Life Sciences, University of Science and Technology of China, Hefei, Anhui 230026, China
- Quan Chen
  - School of Life Sciences, University of Science and Technology of China, Hefei, Anhui 230026, China
  - Hefei National Laboratory for Physical Sciences at the Microscale, Hefei, Anhui 230026, China
- Haiyan Liu
  - School of Life Sciences, University of Science and Technology of China, Hefei, Anhui 230026, China
  - Hefei National Laboratory for Physical Sciences at the Microscale, Hefei, Anhui 230026, China
  - School of Data Science, University of Science and Technology of China, Hefei, Anhui 230026, China
12
Zhang L, Rube HT, Vakulskas CA, Behlke MA, Bussemaker HJ, Pufall MA. Systematic in vitro profiling of off-target affinity, cleavage and efficiency for CRISPR enzymes. Nucleic Acids Res 2020; 48:5037-5053. [PMID: 32315032 PMCID: PMC7229833 DOI: 10.1093/nar/gkaa231] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 11/27/2019] [Revised: 03/06/2020] [Accepted: 03/27/2020] [Indexed: 12/14/2022] Open
Abstract
CRISPR RNA-guided endonucleases (RGEs) cut or direct activities to specific genomic loci, yet each has off-target activities that are often unpredictable. We developed a pair of simple in vitro assays to systematically measure the DNA-binding specificity (Spec-seq), catalytic activity specificity (SEAM-seq) and cleavage efficiency of RGEs. By separately quantifying binding and cleavage specificity, Spec/SEAM-seq provides detailed mechanistic insight into off-target activity. Feature-based models generated from Spec/SEAM-seq data for SpCas9 were consistent with previous reports of its in vitro and in vivo specificity, validating the approach. Spec/SEAM-seq is also useful for profiling less-well characterized RGEs. Application to an engineered SpCas9, HiFi-SpCas9, indicated that its enhanced target discrimination can be attributed to cleavage rather than binding specificity. The ortholog ScCas9, on the other hand, derives specificity from binding to an extended PAM. The decreased off-target activity of AsCas12a (Cpf1) appears to be primarily driven by DNA-binding specificity. Finally, we performed the first characterization of CasX specificity, revealing an all-or-nothing mechanism where mismatches can be bound, but not cleaved. Together, these applications establish Spec/SEAM-seq as an accessible method to rapidly and reliably evaluate the specificity of RGEs and Cas::gRNA pairs, and to gain insight into the mechanism and thermodynamics of target discrimination.
Collapse
Affiliation(s)
- Liyang Zhang
- Department of Biochemistry, Carver College of Medicine, University of Iowa, Coralville, IA 52241, USA.,Integrated DNA Technologies, Inc., 1710 Commercial Park, Coralville, IA 52241, USA
| | - H Tomas Rube
- Department of Bioengineering, University of California, Merced, New York, NY 10027, USA.,Department of Biological Sciences, Columbia University, New York, NY 10027, USA.,Department of Systems Biology, Columbia University Irving Medical Center, New York, NY 10032, USA
| | | | - Mark A Behlke
- Integrated DNA Technologies, Inc., 1710 Commercial Park, Coralville, IA 52241, USA
| | - Harmen J Bussemaker
- Department of Biological Sciences, Columbia University, New York, NY 10027, USA.,Department of Systems Biology, Columbia University Irving Medical Center, New York, NY 10032, USA
| | - Miles A Pufall
- Department of Biochemistry, Carver College of Medicine, University of Iowa, Coralville, IA 52241, USA
| |
|
13
|
Lancaster BR, McGhee JD. How affinity of the ELT-2 GATA factor binding to cis-acting regulatory sites controls Caenorhabditis elegans intestinal gene transcription. Development 2020; 147:dev190330. [PMID: 32586978 PMCID: PMC7390640 DOI: 10.1242/dev.190330] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/07/2020] [Accepted: 06/06/2020] [Indexed: 12/13/2022]
Abstract
We define a quantitative relationship between the affinity with which the intestine-specific GATA factor ELT-2 binds to cis-acting regulatory motifs and the resulting transcription of asp-1, a target gene representative of genes involved in Caenorhabditis elegans intestine differentiation. By establishing an experimental system that allows unknown parameters (e.g. the influence of chromatin) to effectively cancel out, we show that levels of asp-1 transcripts increase monotonically with increasing binding affinity of ELT-2 to variant promoter TGATAA sites. The shape of the response curve reveals that the product of the unbound ELT-2 concentration in vivo [i.e. (ELT-2free) or ELT-2 'activity'] and the largest ELT-XXTGATAAXX association constant (Kmax) lies between five and ten. We suggest that this (unitless) product [Kmax×(ELT-2free) or the equivalent product for any other transcription factor] provides an important quantitative descriptor of transcription-factor/regulatory-motif interaction in development, evolution and genetic disease. A more complicated model than simple binding affinity is necessary to explain the fact that ELT-2 appears to discriminate in vivo against equal-affinity binding sites that contain AGATAA instead of TGATAA.
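One way to read the unitless product Kmax×(ELT-2free) is through a single-site binding isotherm. The sketch below assumes that fractional occupancy follows K·[TF]/(1 + K·[TF]) for non-cooperative binding; that assumption and the function names are illustrative and are not taken from the paper, whose abstract does not state how transcript levels map to occupancy.

```python
def occupancy(k_times_free: float) -> float:
    """Fractional occupancy of one site under simple equilibrium binding.

    k_times_free is the unitless product K x [TF_free].
    Assumption: non-cooperative, single-site binding.
    """
    if k_times_free < 0:
        raise ValueError("K x [TF_free] must be non-negative")
    return k_times_free / (1.0 + k_times_free)


def variant_occupancy(kmax_times_free: float, relative_affinity: float) -> float:
    """Occupancy of a variant site whose affinity is a fraction of Kmax."""
    return occupancy(kmax_times_free * relative_affinity)


if __name__ == "__main__":
    # Products of five and ten bracket the range quoted in the abstract.
    for product in (5.0, 10.0):
        print(product, round(occupancy(product), 3))
```

Under this assumed isotherm, a product between five and ten would correspond to the strongest site being roughly 83% to 91% occupied, and a tenfold weaker variant site would fall to between about 33% and 50%.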
Affiliation(s)
- Brett R Lancaster
- Department of Biochemistry and Molecular Biology, University of Calgary, Cumming School of Medicine, Alberta Children's Hospital Research Institute, Calgary, Alberta T2N 4N1, Canada
| | - James D McGhee
- Department of Biochemistry and Molecular Biology, University of Calgary, Cumming School of Medicine, Alberta Children's Hospital Research Institute, Calgary, Alberta T2N 4N1, Canada
| |
|
14
|
Toivonen J, Das PK, Taipale J, Ukkonen E. MODER2: first-order Markov modeling and discovery of monomeric and dimeric binding motifs. Bioinformatics 2020; 36:2690-2696. [PMID: 31999322 PMCID: PMC7203737 DOI: 10.1093/bioinformatics/btaa045] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/21/2019] [Revised: 12/23/2019] [Accepted: 01/23/2020] [Indexed: 12/21/2022] Open
Abstract
MOTIVATION Position-specific probability matrices (PPMs, also called position-specific weight matrices) have been the dominating model for transcription factor (TF)-binding motifs in DNA. There is, however, increasing recent evidence of better performance of higher order models such as Markov models of order one, also called adjacent dinucleotide matrices (ADMs). ADMs can model dependencies between adjacent nucleotides, unlike PPMs. A modeling technique and software tool that would estimate such models simultaneously both for monomers and their dimers have been missing. RESULTS We present an ADM-based mixture model for monomeric and dimeric TF-binding motifs and an expectation maximization algorithm MODER2 for learning such models from training data and seeds. The model is a mixture that includes monomers and dimers, built from the monomers, with a description of the dimeric structure (spacing, orientation). The technique is modular, meaning that the co-operative effect of dimerization is made explicit by evaluating the difference between expected and observed models. The model is validated using HT-SELEX and generated datasets, and by comparing to some earlier PPM and ADM techniques. The ADM models explain data slightly better than PPM models for 314 tested TFs (or their DNA-binding domains) from four families (bHLH, bZIP, ETS and Homeodomain), the ADM mixture models by MODER2 being the best on average. AVAILABILITY AND IMPLEMENTATION Software implementation is available from https://github.com/jttoivon/moder2. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
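The difference between a PPM and an ADM can be illustrated with a toy scoring routine. This is a minimal sketch and not the MODER2 implementation: the function names, the two-position motif and all probabilities are hypothetical, chosen so that the second base depends strongly on the first.

```python
import math

ALPHABET = "ACGT"


def _log(p: float) -> float:
    """Natural log that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")


def ppm_log_prob(seq, ppm):
    """Log-probability under a PPM: every position is independent."""
    return sum(_log(ppm[i][base]) for i, base in enumerate(seq))


def adm_log_prob(seq, first, transitions):
    """Log-probability under a first-order Markov model (ADM).

    first[b] is P(base b at position 0); transitions[i][a][b] is
    P(base b at position i+1 | base a at position i).
    """
    logp = _log(first[seq[0]])
    for i in range(1, len(seq)):
        logp += _log(transitions[i - 1][seq[i - 1]][seq[i]])
    return logp


def adm_to_ppm(first, transitions):
    """Marginalize an ADM into the PPM with the same per-position frequencies."""
    ppm = [dict(first)]
    for step in transitions:
        prev = ppm[-1]
        ppm.append({b: sum(prev[a] * step[a][b] for a in ALPHABET) for b in ALPHABET})
    return ppm


# Hypothetical two-position motif: A is usually followed by T, and T by A.
uniform = {b: 0.25 for b in ALPHABET}
first = {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4}
transitions = [{
    "A": {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.85},
    "C": uniform,
    "G": uniform,
    "T": {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
}]
ppm = adm_to_ppm(first, transitions)
```

In this toy case the PPM sees the same per-position base frequencies as the ADM, yet it assigns "AT" and "AA" the same probability, whereas the ADM separates them because it conditions each base on its neighbor.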
Affiliation(s)
- Jarkko Toivonen
- Department of Computer Science, University of Helsinki, Helsinki FI-00014, Finland
| | - Pratyush K Das
- Applied Tumor Genomics, Research Programs Unit, University of Helsinki, Helsinki FI-00014, Finland
| | - Jussi Taipale
- Department of Biochemistry, University of Cambridge, CB2 1GA Cambridge, UK
- Division of Functional Genomics and Systems Biology, Department of Medical Biochemistry and Biophysics, SE 141 83 Stockholm, Sweden
- Department of Biosciences and Nutrition, Karolinska Institutet, SE 141 83 Stockholm, Sweden
- Genome-Scale Biology Program, University of Helsinki, Helsinki FI-00014, Finland
| | - Esko Ukkonen
- Department of Computer Science, University of Helsinki, Helsinki FI-00014, Finland
| |
|
15
|
Transcription factor YcjW controls the emergency H2S production in E. coli. Nat Commun 2019; 10:2868. [PMID: 31253770 PMCID: PMC6599011 DOI: 10.1038/s41467-019-10785-x] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/07/2018] [Accepted: 06/03/2019] [Indexed: 12/18/2022] Open
Abstract
Prokaryotes and eukaryotes alike endogenously generate the gaseous molecule hydrogen sulfide (H2S). Bacterial H2S acts as a cytoprotectant against antibiotics-induced stress and promotes redox homeostasis. In E. coli, endogenous H2S production is primarily dependent on 3-mercaptopyruvate sulfurtransferase (3MST), encoded by mstA. Here, we show that cells lacking 3MST acquire a phenotypic suppressor mutation resulting in compensatory H2S production and tolerance to antibiotics and oxidative stress. Using whole genome sequencing, we identified a non-synonymous mutation within an uncharacterized LacI-type transcription factor, ycjW. We then mapped regulatory targets of YcjW and discovered it controls the expression of carbohydrate metabolic genes and thiosulfate sulfurtransferase PspE. Induction of pspE expression in the suppressor strain provides an alternative mechanism for H2S biosynthesis. Our results reveal a complex interaction between carbohydrate metabolism and H2S production in bacteria and the role that a hitherto uncharacterized transcription factor, YcjW, plays in linking the two. Hydrogen sulfide (H2S) production in Escherichia coli is controlled by the sulfurtransferase 3MST. Here, the authors describe an alternative mechanism for H2S biosynthesis via activation of the thiosulfate sulfurtransferase PspE, a process mediated by the transcription factor YcjW.
|
16
|
Establishment and application of CRISPR interference to affect sporulation, hydrogen peroxide detoxification, and mannitol catabolism in the methylotrophic thermophile Bacillus methanolicus. Appl Microbiol Biotechnol 2019; 103:5879-5889. [DOI: 10.1007/s00253-019-09907-8] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/20/2019] [Revised: 05/07/2019] [Accepted: 05/08/2019] [Indexed: 11/30/2022]
|
17
|
Kinney JB, McCandlish DM. Massively Parallel Assays and Quantitative Sequence-Function Relationships. Annu Rev Genomics Hum Genet 2019; 20:99-127. [PMID: 31091417 DOI: 10.1146/annurev-genom-083118-014845] [Citation(s) in RCA: 88] [Impact Index Per Article: 14.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
Over the last decade, a rich variety of massively parallel assays have revolutionized our understanding of how biological sequences encode quantitative molecular phenotypes. These assays include deep mutational scanning, high-throughput SELEX, and massively parallel reporter assays. Here, we review these experimental methods and how the data they produce can be used to quantitatively model sequence-function relationships. In doing so, we touch on a diverse range of topics, including the identification of clinically relevant genomic variants, the modeling of transcription factor binding to DNA, the functional and evolutionary landscapes of proteins, and cis-regulatory mechanisms in both transcription and mRNA splicing. We further describe a unified conceptual framework and a core set of mathematical modeling strategies that studies in these diverse areas can make use of. Finally, we highlight key aspects of experimental design and mathematical modeling that are important for the results of such studies to be interpretable and reproducible.
Affiliation(s)
- Justin B Kinney
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
| | - David M McCandlish
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
| |
|
18
|
Abstract
For nearly a century adaptive landscapes have provided overviews of the evolutionary process and yet they remain metaphors. We redefine adaptive landscapes in terms of biological processes rather than descriptive phenomenology. We focus on the underlying mechanisms that generate emergent properties such as epistasis, dominance, trade-offs and adaptive peaks. We illustrate the utility of landscapes in predicting the course of adaptation and the distribution of fitness effects. We abandon aged arguments concerning landscape ruggedness in favor of empirically determining landscape architecture. In so doing, we transform the landscape metaphor into a scientific framework within which causal hypotheses can be tested.
Affiliation(s)
- Xiao Yi
- BioTechnology Institute, University of Minnesota, St. Paul, MN
| | - Antony M Dean
- BioTechnology Institute, University of Minnesota, St. Paul, MN
- Department of Ecology, Evolution, and Behavior, University of Minnesota, St. Paul, MN
| |
|
19
|
Barnes SL, Belliveau NM, Ireland WT, Kinney JB, Phillips R. Mapping DNA sequence to transcription factor binding energy in vivo. PLoS Comput Biol 2019; 15:e1006226. [PMID: 30716072 PMCID: PMC6375646 DOI: 10.1371/journal.pcbi.1006226] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/22/2018] [Revised: 02/14/2019] [Accepted: 11/06/2018] [Indexed: 11/18/2022] Open
Abstract
Despite the central importance of transcriptional regulation in biology, it has proven difficult to determine the regulatory mechanisms of individual genes, let alone entire gene networks. It is particularly difficult to decipher the biophysical mechanisms of transcriptional regulation in living cells and determine the energetic properties of binding sites for transcription factors and RNA polymerase. In this work, we present a strategy for dissecting transcriptional regulatory sequences using in vivo methods (massively parallel reporter assays) to formulate quantitative models that map a transcription factor binding site’s DNA sequence to transcription factor-DNA binding energy. We use these models to predict the binding energies of transcription factor binding sites to within 1 k_B T of their measured values. We further explore how such a sequence-energy mapping relates to the mechanisms of transcriptional regulation in various promoter contexts. Specifically, we show that our models can be used to design specific induction responses, analyze the effects of amino acid mutations on DNA sequence preference, and determine how regulatory context affects a transcription factor’s sequence specificity.
It has been said that we live in the “genomic era,” a time when we can readily sequence full genomes at will. However, it remains difficult to interpret much of the information within a genome. This is especially true of non-coding sequences such as promoters, which contain a number of features such as transcription factor binding sites that determine how genes are regulated. There is no straightforward regulatory “code” that tells us how transcription factor binding sites are organized within a promoter. In this work we examine how DNA sequence determines one of the most important features of a promoter, the strength with which a transcription factor binds to its DNA binding site. We discuss an approach to modeling DNA sequence-specific transcription factor binding energies in vivo using a massively parallel reporter assay. We develop models that allow us to predict the binding energy between a transcription factor and a mutated version of its binding site. We then show that this modeling technique can be used to address a number of scientific and design questions, such as engineering the behavior of genetic circuit elements or examining how transcription factors and their binding sites co-evolve.
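A sequence-to-energy mapping of this kind is often written as an additive energy matrix, in which each base at each position contributes independently to the total. The sketch below assumes that additive form; the function name, the 3-bp matrix and its values (in units of k_BT) are hypothetical and are not taken from the paper.

```python
def binding_energy(site, energy_matrix, reference_energy=0.0):
    """Additive model: total energy = reference + sum of per-position terms.

    energy_matrix[i][b] is the energy contribution (in k_BT) of base b at
    position i, relative to the reference sequence.
    """
    if len(site) != len(energy_matrix):
        raise ValueError("site length must match the energy matrix")
    return reference_energy + sum(energy_matrix[i][b] for i, b in enumerate(site))


# Hypothetical 3-bp matrix; the reference site "TGA" contributes zero everywhere.
energy_matrix = [
    {"A": 1.5, "C": 2.0, "G": 1.0, "T": 0.0},
    {"A": 2.5, "C": 1.0, "G": 0.0, "T": 3.0},
    {"A": 0.0, "C": 0.5, "G": 2.0, "T": 1.0},
]

if __name__ == "__main__":
    # Predicted energies for the reference site and two mutated versions.
    for site in ("TGA", "TCA", "AGC"):
        print(site, binding_energy(site, energy_matrix, reference_energy=-15.0))
```

With a matrix like this, predicting the energy of a mutated site reduces to a table lookup per position, which is what makes such models convenient for design questions; it does not capture any dependence between positions.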
Affiliation(s)
- Stephanie L. Barnes
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California, United States of America
| | - Nathan M. Belliveau
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California, United States of America
| | - William T. Ireland
- Department of Physics, California Institute of Technology, Pasadena, California, United States of America
| | - Justin B. Kinney
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
| | - Rob Phillips
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California, United States of America
- Department of Physics, California Institute of Technology, Pasadena, California, United States of America
| |
|
20
|
Seckfort D, Montgomery Pettitt B. Price of disorder in the lac repressor hinge helix. Biopolymers 2019; 110:e23239. [PMID: 30485404 PMCID: PMC6335174 DOI: 10.1002/bip.23239] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/11/2018] [Revised: 09/12/2018] [Accepted: 10/04/2018] [Indexed: 12/26/2022]
Abstract
The Lac system of genes has been pivotal in understanding gene regulation. When the lac repressor protein binds to the correct DNA sequence, the hinge region of the protein goes through a disorder to order transition. The structure of this region of the protein is well understood when it is in this bound conformation, but less so when it is not. Structural studies show that this region is flexible. Our simulations show this region is extremely flexible in solution; however, a high concentration of salt can help kinetically trap the hinge helix. Thermodynamically, disorder is more favorable without the DNA present.
Affiliation(s)
- Danielle Seckfort
- Structural and Computational Biology and Molecular Biophysics, Baylor College of Medicine, Houston, Texas
| | - B Montgomery Pettitt
- Structural and Computational Biology and Molecular Biophysics, Baylor College of Medicine, Houston, Texas
- Department of Biochemistry and Molecular Biology, Sealy Center for Structural Biology, University of Texas Medical Branch, Galveston, Texas
| |
|
21
|
Comprehensive, high-resolution binding energy landscapes reveal context dependencies of transcription factor binding. Proc Natl Acad Sci U S A 2018; 115:E3702-E3711. [PMID: 29588420 PMCID: PMC5910820 DOI: 10.1073/pnas.1715888115] [Citation(s) in RCA: 51] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open
Abstract
Transcription factors (TFs) are primary regulators of gene expression in cells, where they bind specific genomic target sites to control transcription. Quantitative measurements of TF-DNA binding energies can improve the accuracy of predictions of TF occupancy and downstream gene expression in vivo and shed light on how transcriptional networks are rewired throughout evolution. Here, we present a sequencing-based TF binding assay and analysis pipeline (BET-seq, for Binding Energy Topography by sequencing) capable of providing quantitative estimates of binding energies for more than one million DNA sequences in parallel at high energetic resolution. Using this platform, we measured the binding energies associated with all possible combinations of 10 nucleotides flanking the known consensus DNA target interacting with two model yeast TFs, Pho4 and Cbf1. A large fraction of these flanking mutations change overall binding energies by an amount equal to or greater than consensus site mutations, suggesting that current definitions of TF binding sites may be too restrictive. By systematically comparing estimates of binding energies output by deep neural networks (NNs) and biophysical models trained on these data, we establish that dinucleotide (DN) specificities are sufficient to explain essentially all variance in observed binding behavior, with Cbf1 binding exhibiting significantly more nonadditivity than Pho4. NN-derived binding energies agree with orthogonal biochemical measurements and reveal that dynamically occupied sites in vivo are both energetically and mutationally distant from the highest affinity sites.
|
22
|
Chang YK, Zuo Z, Stormo GD. Quantitative profiling of BATF family proteins/JUNB/IRF hetero-trimers using Spec-seq. BMC Mol Biol 2018; 19:5. [PMID: 29587652 PMCID: PMC5869772 DOI: 10.1186/s12867-018-0106-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/29/2017] [Accepted: 03/19/2018] [Indexed: 01/13/2023] Open
Abstract
Background BATF family transcription factors (BATF, BATF2 and BATF3) form hetero-trimers with JUNB and either IRF4 or IRF8 to regulate cell fate in T cells and dendritic cells in vivo. While each combination of the hetero-trimer has a distinct role, some degree of cross-compensation was observed. The basis for the differential actions of IRF4 and IRF8 with BATF factors and JUNB is still unknown. We propose that the differences in function between these hetero-trimers may be caused by differences in their DNA binding preferences. While all three BATF family transcription factors have similar binding preferences when binding as a hetero-dimer with JUNB, the cooperative binding of IRF4 or IRF8 to the hetero-dimer/DNA complex could change the preferences. We used Spec-seq, which allows for the efficient and accurate determination of relative affinity to a large collection of sequences in parallel, to find differences between cooperative DNA binding of IRF4, IRF8 and BATF family members. Results We found that without IRF binding, all three hetero-dimer pairs exhibit nearly the same binding preferences to both expected wildtype binding sites TRE (TGA(C/G)TCA) and CRE (TGACGTCA). IRF4 and IRF8 show very similar DNA binding preferences when binding with any of the three hetero-dimers. No major change of binding preferences was found in the half-sites between different hetero-trimers. IRF proteins bind with substantially lower affinity with either a single nucleotide spacer between IRF and BATF binding site or with an alternative mode of binding in the opposite orientation. In addition, the preference for the CRE binding site was reduced when either IRF bound, in all BATF–JUNB combinations. Conclusions The specificities of BATF, BATF2 and BATF3 are all very similar, as are their interactions with IRF4 and IRF8. IRF proteins binding adjacent to BATF sites increases affinity substantially compared to sequences with spacings between the sites, indicating cooperative binding through protein–protein interactions. The preference for the type of BATF binding site, TRE or CRE, is also altered when IRF proteins bind. These in vitro preferences aid in the understanding of in vivo binding activities.
Affiliation(s)
- Yiming K Chang
- Department of Genetics and Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO, USA
| | - Zheng Zuo
- Department of Genetics and Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO, USA
| | - Gary D Stormo
- Department of Genetics and Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO, USA.
| |
|
23
|
Joyce AP, Havranek JJ. Deciphering the protein-DNA code of bacterial winged helix-turn-helix transcription factors. QUANTITATIVE BIOLOGY 2018; 6:68-84. [PMID: 37990674 PMCID: PMC10662834 DOI: 10.1007/s40484-018-0130-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/12/2017] [Revised: 07/14/2017] [Accepted: 07/24/2017] [Indexed: 10/18/2022]
Abstract
Background Sequence-specific binding by transcription factors (TFs) plays a significant role in the selection and regulation of target genes. At the protein:DNA interface, amino acid side-chains construct a diverse physicochemical network of specific and non-specific interactions, and seemingly subtle changes in amino acid identity at certain positions may dramatically impact TF:DNA binding. Variation of these specificity-determining residues (SDRs) is a major mechanism of functional divergence between TFs with strong structural or sequence homology. Methods In this study, we employed a combination of high-throughput specificity profiling by SELEX and Spec-seq, structural modeling, and evolutionary analysis to probe the binding preferences of winged helix-turn-helix TFs belonging to the OmpR sub-family in Escherichia coli. Results We found that E. coli OmpR paralogs recognize tandem, variably spaced repeats composed of "GT-A" or "GCT"-containing half-sites. Some divergent sequence preferences observed within the "GT-A" mode correlate with amino acid similarity; conversely, "GCT"-based motifs were observed for a subset of paralogs with low sequence homology. Direct specificity profiling of a subset of OmpR homologues (CpxR, RstA, and OmpR) as well as predicted "SDR-swap" variants revealed that individual SDRs may impact sequence preferences locally through direct contact with DNA bases or distally via the DNA backbone. Conclusions Overall, our work provides evidence for a common structural code for sequence-specific wHTH:DNA interactions, and demonstrates that surprisingly modest residue changes can enable recognition of highly divergent sequence motifs. Further examination of SDR predictions will likely reveal additional mechanisms controlling the evolutionary divergence of this important class of transcriptional regulators.
Affiliation(s)
- Adam P. Joyce
- Program in Developmental, Regenerative, and Stem Cell Biology, Washington University in St. Louis, St. Louis, MO 63110, USA
| | - James J. Havranek
- Department of Biochemistry and Molecular Biophysics, Washington University in St. Louis, St. Louis, MO 63110, USA
| |
|
24
|
Xu JS, Hewitt MN, Gulati JS, Cruz MA, Zhan H, Liu S, Matthews KS. Lactose repressor hinge domain independently binds DNA. Protein Sci 2018; 27:839-847. [PMID: 29318690 PMCID: PMC5866929 DOI: 10.1002/pro.3372] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/19/2017] [Revised: 01/02/2018] [Accepted: 01/02/2018] [Indexed: 12/29/2022]
Abstract
The short 8-10 amino acid "hinge" sequence in lactose repressor (LacI), present in other LacI/GalR family members, links DNA and inducer-binding domains. Structural studies of full-length or truncated LacI-operator DNA complexes demonstrate insertion of the dimeric helical "hinge" structure at the center of the operator sequence. This association bends the DNA ∼40° and aligns flanking semi-symmetric DNA sites for optimal contact by the N-terminal helix-turn-helix (HtH) sequences within each dimer. In contrast, the hinge region remains unfolded when bound to nonspecific DNA sequences. To determine the ability of the hinge helix alone to mediate DNA binding, we examined (i) binding of LacI variants with deletion of residues 1-50 to remove the HtH DNA binding domain or residues 1-58 to remove both HtH and hinge domains and (ii) binding of a synthetic peptide corresponding to the hinge sequence with a Val52Cys substitution that allows reversible dimer formation via a disulfide linkage. Binding affinity for DNA is orders of magnitude lower in the absence of the helix-turn-helix domain with its highly positive charge. LacI missing residues 1-50 binds to DNA with ∼4-fold greater affinity for operator than for nonspecific sequences with minimal impact of inducer presence; in contrast, LacI missing residues 1-58 exhibits no detectable affinity for DNA. In oxidized form, the dimeric hinge peptide alone binds to O1 and nonspecific DNA with a similarly small difference in affinity; reduction to monomer diminished binding to both O1 and nonspecific targets. These results comport with recent reports regarding LacI hinge interaction with DNA sequences.
Affiliation(s)
- Joseph S Xu
- Department of BioSciences, MS-140, Rice University, Houston, Texas, 77251
| | - Madeleine N Hewitt
- Department of BioSciences, MS-140, Rice University, Houston, Texas, 77251
| | - Jaskeerat S Gulati
- Department of BioSciences, MS-140, Rice University, Houston, Texas, 77251
| | - Matthew A Cruz
- Department of BioSciences, MS-140, Rice University, Houston, Texas, 77251
| | - Hongli Zhan
- Department of BioSciences, MS-140, Rice University, Houston, Texas, 77251
| | - Shirley Liu
- Department of BioSciences, MS-140, Rice University, Houston, Texas, 77251
| | | |
|
25
|
Aditham AK, Shimko TC, Fordyce PM. BET-seq: Binding energy topographies revealed by microfluidics and high-throughput sequencing. Methods Cell Biol 2018; 148:229-250. [PMID: 30473071 PMCID: PMC7531582 DOI: 10.1016/bs.mcb.2018.09.011] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/25/2023]
Abstract
Biophysical models of transcriptional regulation rely on energetic measurements of the binding affinities between transcription factors (TFs) and target DNA binding sites. Historically, assays capable of measuring TF-DNA binding affinities have been relatively low-throughput (measuring ~10³ sequences in parallel) and have required significant specialized equipment, limiting their use to a handful of laboratories. Recently, we developed an experimental assay and analysis pipeline that allows measurement of binding energies between a single TF and up to 10⁶ DNA species in a single experiment (Binding Energy Topography by sequencing, or BET-seq) (Le et al., 2018). BET-seq employs the Mechanically Induced Trapping of Molecular Interactions (MITOMI) platform to purify DNA bound to a TF at equilibrium followed by high coverage sequencing to reveal relative differences in binding energy for each sequence. While we have previously used BET-seq to refine the binding affinity landscapes surrounding high-affinity DNA consensus target sites, we anticipate this technique will be applied in future work toward measuring a wide variety of TF-DNA landscapes. Here, we provide detailed instructions and general considerations for DNA library design, performing BET-seq assays, and analyzing the resulting data.
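Relative binding energies are commonly derived from sequencing counts by comparing how strongly each sequence is enriched in the bound fraction. The sketch below is not the BET-seq analysis pipeline; it assumes that enrichment (bound reads divided by input reads) is proportional to the association constant, and the function and parameter names are hypothetical.

```python
import math


def relative_binding_energy(bound, input_, bound_ref, input_ref):
    """ddG (in k_BT) of a sequence relative to a reference sequence.

    Assumption: enrichment (bound reads / input reads) is proportional to
    the association constant, so ddG = -ln(enrichment / enrichment_ref).
    Positive values mean weaker binding than the reference.
    """
    if min(bound, input_, bound_ref, input_ref) <= 0:
        raise ValueError("all read counts must be positive")
    enrichment = bound / input_
    enrichment_ref = bound_ref / input_ref
    return -math.log(enrichment / enrichment_ref)


if __name__ == "__main__":
    # A sequence enriched tenfold less than the reference.
    print(relative_binding_energy(10, 1000, 100, 1000))
```

Under the stated assumption, a sequence that is enriched tenfold less than the reference is about 2.3 k_BT weaker; real data would additionally need handling of low counts and sequencing noise, which this sketch omits.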
Affiliation(s)
- Arjun K. Aditham
- Department of Bioengineering, Stanford University, Stanford, CA, United States; Stanford ChEM-H, Stanford University, Stanford, CA, United States
| | - Tyler C. Shimko
- Department of Genetics, Stanford University, Stanford, CA, United States
| | - Polly M. Fordyce
- Department of Bioengineering, Stanford University, Stanford, CA, United States; Stanford ChEM-H, Stanford University, Stanford, CA, United States; Department of Genetics, Stanford University, Stanford, CA, United States; Chan Zuckerberg Biohub, San Francisco, CA, United States
| |
|
26
|
Crocker J, Ilsley GR. Using synthetic biology to study gene regulatory evolution. Curr Opin Genet Dev 2017; 47:91-101. [DOI: 10.1016/j.gde.2017.09.001] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/05/2017] [Revised: 09/06/2017] [Accepted: 09/11/2017] [Indexed: 12/21/2022]
|
27
|
Zuo Z, Roy B, Chang YK, Granas D, Stormo GD. Measuring quantitative effects of methylation on transcription factor-DNA binding affinity. SCIENCE ADVANCES 2017; 3:eaao1799. [PMID: 29159284 PMCID: PMC5694663 DOI: 10.1126/sciadv.aao1799] [Citation(s) in RCA: 49] [Impact Index Per Article: 6.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 06/22/2017] [Accepted: 10/20/2017] [Indexed: 06/07/2023]
Abstract
Methylation of CpG (cytosine-phosphate-guanine) dinucleotides is a common epigenetic mark that influences gene expression. The effects of methylation on transcription factor (TF) binding are unknown for most TFs and, even when known, such knowledge is often only qualitative. In reality, methylation sensitivity is a quantitative effect, just as changes to the DNA sequence have quantitative effects on TF binding affinity. We describe Methyl-Spec-seq, an easy-to-use method that measures the effects of CpG methylation (mCPG) on binding affinity for hundreds to thousands of variants in parallel, allowing one to quantitatively assess the effects at every position in a binding site. We demonstrate its use on several important DNA binding proteins. We calibrate the accuracy of Methyl-Spec-seq using a novel two-color competitive fluorescence anisotropy method that can accurately determine the relative affinities of two sequences in solution. We also present software that extends standard methods for representing, visualizing, and searching for matches to binding site motifs to include the effects of methylation. These tools facilitate the study of the consequences for gene regulation of epigenetic marks on DNA.
Affiliation(s)
- Zheng Zuo
- Corresponding author. (G.D.S.); (Z.Z.)
| | | | | | | | | |
|
28
|
Roy B, Zuo Z, Stormo GD. Quantitative specificity of STAT1 and several variants. Nucleic Acids Res 2017; 45:8199-8207. [PMID: 28510715 PMCID: PMC5737217 DOI: 10.1093/nar/gkx393] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/14/2017] [Accepted: 05/12/2017] [Indexed: 01/09/2023] Open
Abstract
The quantitative specificity of the STAT1 transcription factor was determined by measuring the relative affinity to hundreds of variants of the consensus binding site including variations in the length of the site. The known consensus sequence is observed to have the highest affinity, with all variants decreasing binding affinity considerably. There is very little loss of binding affinity when the CpG within the consensus binding site is methylated. Additionally, the specificity of mutant proteins, with variants of amino acids that interact with the DNA, was determined and nearly all of them are observed to lose specificity across the entire binding site. The change of Asn at position 460 to His, which corresponds to the natural amino acid at the homologous position in STAT6, does not change the specificity nor does it change the length preference to match that of STAT6. These results provide the first quantitative analysis of changes in binding affinity for the STAT1 protein, and several variants of it, to hundreds of different binding sites including different spacer lengths, and the effect of CpG methylation.
Affiliation(s)
- Basab Roy
- Department of Genetics and Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO 63108-8510, USA
| | - Zheng Zuo
- Department of Genetics and Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO 63108-8510, USA
| | - Gary D Stormo
- Department of Genetics and Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO 63108-8510, USA
|
29
|
Inherent limitations of probabilistic models for protein-DNA binding specificity. PLoS Comput Biol 2017; 13:e1005638. [PMID: 28686588 PMCID: PMC5521849 DOI: 10.1371/journal.pcbi.1005638] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 04/12/2017] [Revised: 07/21/2017] [Accepted: 06/21/2017] [Indexed: 01/10/2023] Open
Abstract
The specificities of transcription factors are most commonly represented with probabilistic models. These models provide a probability for each base occurring at each position within the binding site and the positions are assumed to contribute independently. The model is simple and intuitive and is the basis for many motif discovery algorithms. However, the model also has inherent limitations that prevent it from accurately representing true binding probabilities, especially for the highest affinity sites under conditions of high protein concentration. The limitations are not due to the assumption of independence between positions but rather are caused by the non-linear relationship between binding affinity and binding probability and the fact that independent normalization at each position skews the site probabilities. Generally probabilistic models are reasonably good approximations, but new high-throughput methods allow for biophysical models with increased accuracy that should be used whenever possible.

Transcription factors (TFs), a class of DNA-binding proteins, play a central role in the regulation of gene expression. TFs control the rate of transcription by binding to the genome in a sequence-specific manner. Thus, one important aspect in the study of gene regulation mechanism is to model the binding specificities of TFs, namely the features of the DNA sequences that a TF prefers to bind. Multiple models have been proposed to characterize the binding specificities of TFs, among which the class of probabilistic models is the most popular. In this study, we point out several major limitations of the well-established probabilistic model by comparing it with the biophysical model. Through simulations we demonstrate that the probabilistic model is only an approximation of the biophysical model. The latter has most of the advantages of the former, and is a more accurate representation of binding specificities. We propose a shift from the probabilistic model to the biophysical model in future studies of protein-DNA interactions.
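The non-linear relationship between binding affinity and binding probability mentioned in this abstract can be illustrated with a short Python sketch. It assumes a standard two-state occupancy function, P = 1 / (1 + exp(E - mu)), with energies in kT units. The function names, the two example energies, and the values of mu are illustrative assumptions, not values taken from the paper.

```python
import math

def occupancy(energy, mu):
    """Probability that a site is bound, given its binding energy and a
    chemical potential mu (both in kT units; higher mu means more protein)."""
    return 1.0 / (1.0 + math.exp(energy - mu))

def occupancy_ratio(mu, strong_energy=0.0, weak_energy=2.0):
    """Ratio of the occupancies of a strong and a weak site at a given mu."""
    return occupancy(strong_energy, mu) / occupancy(weak_energy, mu)

# Hypothetical energies: the weak site binds exp(2) times less tightly.
affinity_ratio = math.exp(2.0)

for mu in (-5.0, 3.0):
    print(f"mu={mu:+.1f}  affinity ratio={affinity_ratio:.2f}  "
          f"occupancy ratio={occupancy_ratio(mu):.2f}")
```

Under these assumptions the occupancy ratio stays close to the affinity ratio at low protein concentration, while at high concentration both sites approach saturation and the ratio shrinks toward 1. This is the regime in which the abstract says probabilistic models misrepresent the highest affinity sites.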
|
30
|
Inukai S, Kock KH, Bulyk ML. Transcription factor-DNA binding: beyond binding site motifs. Curr Opin Genet Dev 2017; 43:110-119. [PMID: 28359978 PMCID: PMC5447501 DOI: 10.1016/j.gde.2017.02.007] [Citation(s) in RCA: 213] [Impact Index Per Article: 26.6] [Received: 09/28/2016] [Revised: 02/02/2017] [Accepted: 02/07/2017] [Indexed: 12/12/2022]
Abstract
Sequence-specific transcription factors (TFs) regulate gene expression by binding to cis-regulatory elements in promoter and enhancer DNA. While studies of TF-DNA binding have focused on TFs' intrinsic preferences for primary nucleotide sequence motifs, recent studies have elucidated additional layers of complexity that modulate TF-DNA binding. In this review, we discuss technological developments for identifying TF binding preferences and highlight recent discoveries that elaborate how TF interactions, local DNA structure, and genomic features influence TF-DNA binding. We highlight novel approaches for characterizing functional binding site motifs that promise to inform our understanding of how TF binding controls gene expression and ultimately contributes to phenotype.
Affiliation(s)
- Sachi Inukai
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA
| | - Kian Hong Kock
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA; Program in Biological and Biomedical Sciences, Harvard University, Cambridge, MA 02138, USA
| | - Martha L Bulyk
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA; Program in Biological and Biomedical Sciences, Harvard University, Cambridge, MA 02138, USA; Department of Pathology, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA.
|
31
|
Chang YK, Srivastava Y, Hu C, Joyce A, Yang X, Zuo Z, Havranek JJ, Stormo GD, Jauch R. Quantitative profiling of selective Sox/POU pairing on hundreds of sequences in parallel by Coop-seq. Nucleic Acids Res 2016; 45:832-845. [PMID: 27915232 PMCID: PMC5314778 DOI: 10.1093/nar/gkw1198] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.4] [Received: 10/17/2016] [Revised: 11/14/2016] [Accepted: 11/17/2016] [Indexed: 12/30/2022] Open
Abstract
Cooperative binding of transcription factors is known to be important in the regulation of gene expression programs conferring cellular identities. However, current methods to measure cooperativity parameters have been laborious and therefore limited to studying only a few sequence variants at a time. We developed Coop-seq (cooperativity by sequencing) that is capable of efficiently and accurately determining the cooperativity parameters for hundreds of different DNA sequences in a single experiment. We apply Coop-seq to 12 dimer pairs from the Sox and POU families of transcription factors using 324 unique sequences with changed half-site orientation, altered spacing and discrete randomization within the binding elements. The study reveals specific dimerization profiles of different Sox factors with Oct4. By contrast, Oct4 and the three neural class III POU factors Brn2, Brn4 and Oct6 assemble with Sox2 in a surprisingly indistinguishable manner. Two novel half-site configurations can support functional Sox/Oct dimerization in addition to known composite motifs. Moreover, Coop-seq uncovers a nucleotide switch within the POU half-site when spacing is altered, which is mirrored in genomic loci bound by Sox2/Oct4 complexes.
Affiliation(s)
- Yiming K Chang
- Department of Genetics and Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO, USA
| | - Yogesh Srivastava
- Genome Regulation Laboratory, Drug Discovery Pipeline, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou 510530, China; Key Laboratory of Regenerative Biology, South China Institute for Stem Cell Biology and Regenerative Medicine, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou 510530, China; Guangdong Provincial Key Laboratory of Stem Cell and Regenerative Medicine, South China Institute for Stem Cell Biology and Regenerative Medicine, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou 510530, China
| | - Caizhen Hu
- Genome Regulation Laboratory, Drug Discovery Pipeline, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou 510530, China; Key Laboratory of Regenerative Biology, South China Institute for Stem Cell Biology and Regenerative Medicine, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou 510530, China; Guangdong Provincial Key Laboratory of Stem Cell and Regenerative Medicine, South China Institute for Stem Cell Biology and Regenerative Medicine, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou 510530, China
| | - Adam Joyce
- Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, MO, USA
| | - Xiaoxiao Yang
- Genome Regulation Laboratory, Drug Discovery Pipeline, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou 510530, China; Key Laboratory of Regenerative Biology, South China Institute for Stem Cell Biology and Regenerative Medicine, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou 510530, China; Guangdong Provincial Key Laboratory of Stem Cell and Regenerative Medicine, South China Institute for Stem Cell Biology and Regenerative Medicine, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou 510530, China
| | - Zheng Zuo
- Department of Genetics and Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO, USA
| | - James J Havranek
- Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, MO, USA
| | - Gary D Stormo
- Department of Genetics and Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO, USA
| | - Ralf Jauch
- Genome Regulation Laboratory, Drug Discovery Pipeline, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou 510530, China; Key Laboratory of Regenerative Biology, South China Institute for Stem Cell Biology and Regenerative Medicine, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou 510530, China; Guangdong Provincial Key Laboratory of Stem Cell and Regenerative Medicine, South China Institute for Stem Cell Biology and Regenerative Medicine, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou 510530, China
|
32
|
Siebert M, Söding J. Bayesian Markov models consistently outperform PWMs at predicting motifs in nucleotide sequences. Nucleic Acids Res 2016; 44:6055-69. [PMID: 27288444 PMCID: PMC5291271 DOI: 10.1093/nar/gkw521] [Citation(s) in RCA: 61] [Impact Index Per Article: 6.8] [Received: 04/11/2016] [Accepted: 05/29/2016] [Indexed: 01/01/2023] Open
Abstract
Position weight matrices (PWMs) are the standard model for DNA and RNA regulatory motifs. In PWMs nucleotide probabilities are independent of nucleotides at other positions. Models that account for dependencies need many parameters and are prone to overfitting. We have developed a Bayesian approach for motif discovery using Markov models in which conditional probabilities of order k - 1 act as priors for those of order k. This Bayesian Markov model (BaMM) training automatically adapts model complexity to the amount of available data. We also derive an EM algorithm for de-novo discovery of enriched motifs. For transcription factor binding, BaMMs achieve significantly (P = 1/16) higher cross-validated partial AUC than PWMs in 97% of 446 ChIP-seq ENCODE datasets and improve performance by 36% on average. BaMMs also learn complex multipartite motifs, improving predictions of transcription start sites, polyadenylation sites, bacterial pause sites, and RNA binding sites by 26-101%. BaMMs never performed worse than PWMs. These robust improvements argue in favour of generally replacing PWMs by BaMMs.
Affiliation(s)
- Matthias Siebert
- Quantitative and Computational Biology, Max Planck Institute for Biophysical Chemistry, Am Fassberg 11, 37077 Göttingen, Germany; Gene Center, Ludwig-Maximilians-Universität München, Feodor-Lynen-Strasse 25, 81377 Munich, Germany
| | - Johannes Söding
- Quantitative and Computational Biology, Max Planck Institute for Biophysical Chemistry, Am Fassberg 11, 37077 Göttingen, Germany
|
33
|
Quantitatively predictable control of Drosophila transcriptional enhancers in vivo with engineered transcription factors. Nat Genet 2016; 48:292-8. [DOI: 10.1038/ng.3509] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Received: 08/30/2015] [Accepted: 01/15/2016] [Indexed: 12/13/2022]
|
34
|
Sasse SK, Zuo Z, Kadiyala V, Zhang L, Pufall MA, Jain MK, Phang TL, Stormo GD, Gerber AN. Response Element Composition Governs Correlations between Binding Site Affinity and Transcription in Glucocorticoid Receptor Feed-forward Loops. J Biol Chem 2015; 290:19756-69. [PMID: 26088140 DOI: 10.1074/jbc.m115.668558] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.4] [Received: 05/29/2015] [Indexed: 01/02/2023] Open
Abstract
Combinatorial gene regulation through feed-forward loops (FFLs) can bestow specificity and temporal control to client gene expression; however, characteristics of binding sites that mediate these effects are not established. We previously showed that the glucocorticoid receptor (GR) and KLF15 form coherent FFLs that cooperatively induce targets such as the amino acid-metabolizing enzymes AASS and PRODH and incoherent FFLs exemplified by repression of MT2A by KLF15. Here, we demonstrate that GR and KLF15 physically interact and identify low affinity GR binding sites within glucocorticoid response elements (GREs) for PRODH and AASS that contribute to combinatorial regulation with KLF15. We used deep sequencing and electrophoretic mobility shift assays to derive in vitro GR binding affinities across sequence space. We applied these data to show that AASS GRE activity correlated (r² = 0.73) with predicted GR binding affinities across a 50-fold affinity range in transfection assays; however, the slope of the linear relationship more than doubled when KLF15 was expressed. Whereas activity of the MT2A GRE was even more strongly (r² = 0.89) correlated with GR binding site affinity, the slope of the linear relationship was sharply reduced by KLF15, consistent with incoherent FFL logic. Thus, GRE architecture and co-regulator expression together determine the functional parameters that relate GR binding site affinity to hormone-induced transcriptional responses. Utilization of specific affinity response functions and GR binding sites by FFLs may contribute to the diversity of gene expression patterns within GR-regulated transcriptomes.
Affiliation(s)
- Sarah K Sasse
- Department of Medicine, National Jewish Health, Denver, Colorado 80206
| | - Zheng Zuo
- Department of Genetics and Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63108-8510
| | - Vineela Kadiyala
- Department of Medicine, National Jewish Health, Denver, Colorado 80206
| | - Liyang Zhang
- Department of Biochemistry, University of Iowa, Iowa City, Iowa 52242
| | - Miles A Pufall
- Department of Biochemistry, University of Iowa, Iowa City, Iowa 52242
| | - Mukesh K Jain
- Case Cardiovascular Research Institute and Harrington Heart and Vascular Institute, Department of Medicine, Case Western Reserve University School of Medicine, Cleveland, Ohio 44106-7290
| | - Tzu L Phang
- Department of Medicine, University of Colorado, Denver, Colorado 80045
| | - Gary D Stormo
- Department of Genetics and Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63108-8510
| | - Anthony N Gerber
- Department of Medicine, National Jewish Health, Denver, Colorado 80206; Department of Medicine, University of Colorado, Denver, Colorado 80045
|
35
|
Zuo Z, Chang Y, Stormo GD. A quantitative understanding of lac repressor's binding specificity and flexibility. Quantitative Biology 2015; 3:69-80. [PMID: 26752632 DOI: 10.1007/s40484-015-0044-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Indexed: 10/23/2022]
Abstract
Lac repressor, the first discovered transcriptional regulator, has been shown to confer multiple modes of binding to its operator sites depending on the central spacer length. Other homologous members of the LacI/GalR family (PurR and YcjW) cannot bind their operator sites with similar structural flexibility. To decipher the underlying mechanism for this unique property, we used the Spec-seq approach combined with site-directed mutagenesis to quantify the DNA binding specificity of multiple hybrids of lacI and PurR. We find that lac repressor's recognition di-residues YQ and its hinge helix loop regions are both critical for its structural flexibility. Also, specificity profiling of the whole lac operator suggests that a simple additive model built from single variants suffices to predict the energies of multivariant sites reasonably well, and the genome occupancy model based on these specificity data correlates well with the in vivo lac repressor binding profile.
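The additive model mentioned at the end of this abstract can be sketched in Python as follows. This is a minimal illustration that assumes the energy of a multivariant site is the sum of the energy changes measured for each single-base variant relative to a reference site. The function name, the reference sequence, and all energy values are hypothetical and are not data from the paper.

```python
def additive_energy(site, reference, single_variant_energy):
    """Predict the binding energy of `site` relative to `reference` by
    summing the measured energy changes of its single-base differences.

    single_variant_energy maps (position, base) to the energy change
    (in kT units) of that single substitution in the reference site.
    """
    if len(site) != len(reference):
        raise ValueError("site and reference must have the same length")
    total = 0.0
    for position, (base, ref_base) in enumerate(zip(site, reference)):
        if base != ref_base:
            total += single_variant_energy[(position, base)]
    return total

# Hypothetical reference site and single-variant measurements.
reference = "AATTGT"
single_variant_energy = {(1, "C"): 0.5, (4, "A"): 1.25}

# A double variant differing from the reference at positions 1 and 4.
predicted = additive_energy("ACTTAT", reference, single_variant_energy)
print(predicted)
```

Comparing such predictions with the energies measured directly for multivariant sites is one way to judge how well additivity holds for a given protein.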
Affiliation(s)
- Zheng Zuo
- Department of Genetics and Center for Genomic Sciences and Systems Biology, School of Medicine, Washington University, St. Louis, MO 63108, USA
| | - Yiming Chang
- Department of Genetics and Center for Genomic Sciences and Systems Biology, School of Medicine, Washington University, St. Louis, MO 63108, USA
| | - Gary D Stormo
- Department of Genetics and Center for Genomic Sciences and Systems Biology, School of Medicine, Washington University, St. Louis, MO 63108, USA
|
36
|
Quantitative biology: from genes, cells to networks. Quantitative Biology 2014. [DOI: 10.1007/s40484-014-0038-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
|
37
|
Stormo GD, Zuo Z, Chang YK. Spec-seq: determining protein-DNA-binding specificity by sequencing. Brief Funct Genomics 2014; 14:30-8. [PMID: 25362070 DOI: 10.1093/bfgp/elu043] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.8] [Indexed: 01/23/2023] Open
Abstract
The specificity of protein-DNA interactions can be determined directly by sequencing the bound and unbound fractions in a standard binding reaction. The procedure is easy and inexpensive, and the accuracy can be high for thousands of sequences assayed in parallel. From the measurements, simple models of specificity, such as position weight matrices, can be assessed for their accuracy and more complex models developed if useful. Those may provide more accurate predictions of in vivo binding sites and can help us to understand the details of recognition. As an example, we demonstrate new information gained about the binding of lac repressor. One can apply the same method to combinations of factors that bind simultaneously to a single DNA and determine both the specificity of the individual factors and the cooperativity between them.
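The idea of sequencing bound and unbound fractions can be made concrete with a small Python sketch. It assumes that, at equilibrium, the ratio of bound to unbound read counts for each sequence is proportional to its association constant, so that dividing by the same ratio for a reference sequence gives a relative affinity. The function name, the sequences, and the read counts are hypothetical.

```python
import math

def relative_binding_energies(bound_counts, unbound_counts, reference):
    """Return binding energies relative to `reference`, in kT units.

    For each sequence the bound/unbound count ratio is taken as
    proportional to its affinity; the energy is minus the log of the
    affinity relative to the reference sequence.
    """
    ref_ratio = bound_counts[reference] / unbound_counts[reference]
    energies = {}
    for seq, bound in bound_counts.items():
        ratio = bound / unbound_counts[seq]
        energies[seq] = -math.log(ratio / ref_ratio)
    return energies

# Hypothetical read counts from the bound and unbound fractions.
bound = {"AATTGT": 800, "AATTCT": 200}
unbound = {"AATTGT": 200, "AATTCT": 800}

energies = relative_binding_energies(bound, unbound, reference="AATTGT")
for seq, energy in energies.items():
    print(f"{seq}: {energy:.2f} kT relative to the reference")
```

Because only ratios of counts enter the calculation, the free protein concentration cancels out, which is why many sequences can be compared in a single binding reaction under this assumption.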
|