1
Peng H, Kotelnikov S, Egbert ME, Ofaim S, Stevens GC, Phanse S, Saccon T, Ignatov M, Dutta S, Istace Z, Moutaoufik MT, Aoki H, Kewalramani N, Sun J, Gong Y, Padhorny D, Poda G, Alekseenko A, Porter KA, Jones G, Rodionova I, Guo H, Pogoutse O, Datta S, Saier M, Crovella M, Vajda S, Moreno-Hagelsieb G, Parkinson J, Segre D, Babu M, Kozakov D, Emili A. Ligand interaction landscape of transcription factors and essential enzymes in E. coli. Cell 2025; 188:1441-1455.e15. [PMID: 39862855 DOI: 10.1016/j.cell.2025.01.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2022] [Revised: 09/23/2024] [Accepted: 01/02/2025] [Indexed: 01/27/2025]
Abstract
Knowledge of protein-metabolite interactions can enhance mechanistic understanding and chemical probing of biochemical processes, but the discovery of endogenous ligands remains challenging. Here, we combined rapid affinity purification with precision mass spectrometry and high-resolution molecular docking to precisely map the physical associations of 296 chemically diverse small-molecule metabolite ligands with 69 distinct essential enzymes and 45 transcription factors in the gram-negative bacterium Escherichia coli. We then conducted systematic metabolic pathway integration, pan-microbial evolutionary projections, and independent in-depth biophysical characterization experiments to define the functional significance of ligand interfaces. This effort revealed principles governing functional crosstalk on a network level, divergent patterns of binding pocket conservation, and scaffolds for designing selective chemical probes. This structurally resolved ligand interactome mapping pipeline can be scaled to illuminate the native small-molecule networks of complete cells and potentially entire multi-cellular communities.
Affiliation(s)
- Hui Peng
  - Department of Chemistry, University of Toronto, Toronto, ON M5S 3H6, Canada
- Sergei Kotelnikov
  - Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY 11794, USA
- Megan E Egbert
  - Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA
- Shany Ofaim
  - Program in Bioinformatics, Boston University, Boston, MA 02215, USA
- Grant C Stevens
  - Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada
- Sadhna Phanse
  - Center for Network Systems Biology, Boston University, Boston, MA 02218, USA; Department of Biochemistry, University of Regina, Regina, SK S4S 0A2, Canada
- Tatiana Saccon
  - Department of Biochemistry, University of Regina, Regina, SK S4S 0A2, Canada
- Mikhail Ignatov
  - Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY 11794, USA
- Shubham Dutta
  - Department of Biochemistry, University of Regina, Regina, SK S4S 0A2, Canada
- Zoe Istace
  - Department of Biochemistry, University of Regina, Regina, SK S4S 0A2, Canada
- Mohamed Taha Moutaoufik
  - Department of Biochemistry, University of Regina, Regina, SK S4S 0A2, Canada; Faculty of Medical Sciences, University Mohammed VI Polytechnic, Benguerir, Morocco
- Hiroyuki Aoki
  - Department of Biochemistry, University of Regina, Regina, SK S4S 0A2, Canada
- Neal Kewalramani
  - Program in Bioinformatics, Boston University, Boston, MA 02215, USA; Center for Network Systems Biology, Boston University, Boston, MA 02218, USA
- Jianxian Sun
  - Department of Chemistry, University of Toronto, Toronto, ON M5S 3H6, Canada
- Yufeng Gong
  - Department of Chemistry, University of Toronto, Toronto, ON M5S 3H6, Canada
- Dzmitry Padhorny
  - Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY 11794, USA
- Gennady Poda
  - Drug Discovery Program, Ontario Institute for Cancer Research, Toronto, ON M5G 0A3, Canada; Leslie Dan Faculty of Pharmacy, University of Toronto, Toronto, ON M5S 3M2, Canada
- Andrey Alekseenko
  - Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY 11794, USA
- Kathryn A Porter
  - Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA
- George Jones
  - Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY 11794, USA
- Irina Rodionova
  - Department of Molecular Biology, University of California, San Diego, La Jolla, San Diego, CA 920930, USA
- Hongbo Guo
  - Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada
- Oxana Pogoutse
  - Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada
- Suprama Datta
  - Center for Network Systems Biology, Boston University, Boston, MA 02218, USA
- Milton Saier
  - Department of Molecular Biology, University of California, San Diego, La Jolla, San Diego, CA 920930, USA
- Mark Crovella
  - Program in Bioinformatics, Boston University, Boston, MA 02215, USA; Department of Computer Science, Boston University, Boston, MA 02215, USA
- Sandor Vajda
  - Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA; Department of Chemistry, Boston University, Boston, MA 02215, USA
- John Parkinson
  - Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada
- Daniel Segre
  - Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA; Program in Bioinformatics, Boston University, Boston, MA 02215, USA; Department of Biology, Boston University, Boston, MA 02215, USA
- Mohan Babu
  - Department of Biochemistry, University of Regina, Regina, SK S4S 0A2, Canada
- Dima Kozakov
  - Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY 11794, USA
- Andrew Emili
  - Program in Bioinformatics, Boston University, Boston, MA 02215, USA; Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada; Center for Network Systems Biology, Boston University, Boston, MA 02218, USA; Department of Chemistry, Boston University, Boston, MA 02215, USA; Department of Chemical Physiology and Biochemistry, Division of Oncological Sciences, Knight Cancer Institute, Oregon Health and Science University, Portland, OR, USA
2
Hasenauer F, Barreto H, Lotton C, Matic I. Genome-wide mapping of spontaneous DNA replication error-hotspots using mismatch repair proteins in rapidly proliferating Escherichia coli. Nucleic Acids Res 2025; 53:gkae1196. [PMID: 39660654 PMCID: PMC11754648 DOI: 10.1093/nar/gkae1196] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2024] [Revised: 11/12/2024] [Accepted: 11/19/2024] [Indexed: 12/12/2024] Open
Abstract
Fidelity of DNA replication is crucial for the accurate transmission of genetic information across generations, yet errors still occur despite multiple control mechanisms. This study investigated the factors influencing spontaneous replication errors across the Escherichia coli genome. We detected errors using the MutS and MutL mismatch repair proteins in rapidly proliferating mutH-deficient cells, where errors can be detected but not corrected. Our findings reveal that replication error hotspots are non-randomly distributed along the chromosome and are enriched in sequences with distinct features: lower thermal stability facilitating DNA strand separation, mononucleotide repeats prone to DNA polymerase slippage and sequences prone to forming secondary structures like cruciforms and G4 structures, which increase likelihood of DNA polymerase stalling. These hotspots showed enrichment for binding sites of nucleoid-associated proteins, RpoB and GyrA, as well as highly expressed genes, and depletion of GATC sequence. Finally, the enrichment of single-stranded DNA stretches in the hotspot regions establishes a nexus between the formation of secondary structures, transcriptional activity and replication stress. In conclusion, this study provides a comprehensive genome-wide map of replication error hotspots, offering a holistic perspective on the intricate interplay between various mechanisms that can compromise the faithful transmission of genetic information.
Affiliation(s)
- Flavia C Hasenauer
  - Université Paris Cité, CNRS, Inserm, Institut Cochin, F-75014 Paris, France
- Hugo C Barreto
  - Université Paris Cité, CNRS, Inserm, Institut Cochin, F-75014 Paris, France
- Chantal Lotton
  - Université Paris Cité, CNRS, Inserm, Institut Cochin, F-75014 Paris, France
- Ivan Matic
  - Université Paris Cité, CNRS, Inserm, Institut Cochin, F-75014 Paris, France
3
Li Y, Wang Y, Wang C, Ma A, Ma Q, Liu B. A weighted two-stage sequence alignment framework to identify motifs from ChIP-exo data. Patterns (New York, N.Y.) 2024; 5:100927. [PMID: 38487805 PMCID: PMC10935504 DOI: 10.1016/j.patter.2024.100927] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2023] [Revised: 08/18/2023] [Accepted: 01/10/2024] [Indexed: 03/17/2024]
Abstract
In this study, we introduce TESA (weighted two-stage alignment), an innovative motif prediction tool that refines the identification of DNA-binding protein motifs, essential for deciphering transcriptional regulatory mechanisms. Unlike traditional algorithms that rely solely on sequence data, TESA integrates the high-resolution chromatin immunoprecipitation (ChIP) signal, specifically from ChIP-exonuclease (ChIP-exo), by assigning weights to sequence positions, thereby enhancing motif discovery. TESA employs a nuanced approach combining a binomial distribution model with a graph model, further supported by a "bookend" model, to improve the accuracy of predicting motifs of varying lengths. Our evaluation, utilizing an extensive compilation of 90 prokaryotic ChIP-exo datasets from proChIPdb and 167 H. sapiens datasets, compared TESA's performance against seven established tools. The results indicate TESA's improved precision in motif identification, suggesting its valuable contribution to the field of genomic research.
Affiliation(s)
- Yang Li
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH 43210, USA
- Yizhong Wang
  - School of Mathematics, Shandong University, Jinan, Shandong 250100, China
- Cankun Wang
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH 43210, USA
- Anjun Ma
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH 43210, USA
- Qin Ma
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH 43210, USA
  - Pelotonia Institute for Immuno-Oncology, The James Comprehensive Cancer Center, The Ohio State University, Columbus, OH 43210, USA
- Bingqiang Liu
  - School of Mathematics, Shandong University, Jinan, Shandong 250100, China
4
Salgado H, Gama-Castro S, Lara P, Mejia-Almonte C, Alarcón-Carranza G, López-Almazo AG, Betancourt-Figueroa F, Peña-Loredo P, Alquicira-Hernández S, Ledezma-Tejeida D, Arizmendi-Zagal L, Mendez-Hernandez F, Diaz-Gomez AK, Ochoa-Praxedis E, Muñiz-Rascado LJ, García-Sotelo JS, Flores-Gallegos FA, Gómez L, Bonavides-Martínez C, del Moral-Chávez VM, Hernández-Alvarez AJ, Santos-Zavaleta A, Capella-Gutierrez S, Gelpi JL, Collado-Vides J. RegulonDB v12.0: a comprehensive resource of transcriptional regulation in E. coli K-12. Nucleic Acids Res 2024; 52:D255-D264. [PMID: 37971353 PMCID: PMC10767902 DOI: 10.1093/nar/gkad1072] [Citation(s) in RCA: 20] [Impact Index Per Article: 20.0] [Received: 09/15/2023] [Revised: 10/25/2023] [Accepted: 11/02/2023] [Indexed: 11/19/2023] Open
Abstract
RegulonDB is a database that contains the most comprehensive corpus of knowledge of the regulation of transcription initiation of Escherichia coli K-12, including data from both classical molecular biology and high-throughput methodologies. Here, we describe biological advances since our last NAR paper of 2019. We explain the changes to satisfy FAIR requirements. We also present a full reconstruction of the RegulonDB computational infrastructure, which has significantly improved data storage, retrieval and accessibility and thus supports a more intuitive and user-friendly experience. The integration of graphical tools provides clear visual representations of genetic regulation data, facilitating data interpretation and knowledge integration. RegulonDB version 12.0 can be accessed at https://regulondb.ccg.unam.mx.
Affiliation(s)
- Heladia Salgado
  - Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Morelos 62210, Mexico
- Socorro Gama-Castro
  - Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Morelos 62210, Mexico
- Paloma Lara
  - Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Morelos 62210, Mexico
- Citlalli Mejia-Almonte
  - Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Morelos 62210, Mexico
- Gabriel Alarcón-Carranza
  - Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Morelos 62210, Mexico
- Andrés G López-Almazo
  - Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Morelos 62210, Mexico
- Felipe Betancourt-Figueroa
  - Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Morelos 62210, Mexico
- Pablo Peña-Loredo
  - Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Morelos 62210, Mexico
- Daniela Ledezma-Tejeida
  - Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Morelos 62210, Mexico
- Lizeth Arizmendi-Zagal
  - Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Morelos 62210, Mexico
- Francisco Mendez-Hernandez
  - Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Morelos 62210, Mexico
- Ana K Diaz-Gomez
  - Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Morelos 62210, Mexico
- Elizabeth Ochoa-Praxedis
  - Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Morelos 62210, Mexico
- Luis J Muñiz-Rascado
  - Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Morelos 62210, Mexico
- Jair S García-Sotelo
  - Laboratorio Internacional de Investigación sobre el Genoma Humano, Universidad Nacional Autónoma de México, Querétaro 76230, Querétaro, Mexico
- Fanny A Flores-Gallegos
  - Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Morelos 62210, Mexico
- Laura Gómez
  - Instituto Nacional de Medicina Genómica, Periférico Sur 4809, Arenal Tepepan, Tlalpan, 14610 Ciudad de México, Mexico
  - Escuela de Medicina, Tecnológico de Monterrey, Campus Ciudad de México, CDMX 14380, México
- César Bonavides-Martínez
  - Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Morelos 62210, Mexico
- Víctor M del Moral-Chávez
  - Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Morelos 62210, Mexico
- Alberto Santos-Zavaleta
  - Instituto de Energías Renovables, Universidad Nacional Autónoma de México, Temixco, Morelos 62580, México
- Josep Lluis Gelpi
  - Department of Biochemistry and Molecular Biomedicine, University of Barcelona, Av. Diagonal 643, 08028, Barcelona, Spain
  - Centre for Genomic Regulation (CRG), Universitat Pompeu Fabra (UPF), Dr. Aiguader 88, Barcelona, 08003, Barcelona, Spain
- Julio Collado-Vides
  - Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Morelos 62210, Mexico
  - Centre for Genomic Regulation (CRG), Universitat Pompeu Fabra (UPF), Dr. Aiguader 88, Barcelona, 08003, Barcelona, Spain
  - Department of Biomedical Engineering, Boston University, 44 Cummington Mall, Boston, MA 02215, USA
5
Han Y, Li W, Filko A, Li J, Zhang F. Genome-wide promoter responses to CRISPR perturbations of regulators reveal regulatory networks in Escherichia coli. Nat Commun 2023; 14:5757. [PMID: 37717013 PMCID: PMC10505187 DOI: 10.1038/s41467-023-41572-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/07/2022] [Accepted: 09/08/2023] [Indexed: 09/18/2023] Open
Abstract
Elucidating genome-scale regulatory networks requires a comprehensive collection of gene expression profiles, yet measuring gene expression responses for every transcription factor (TF)-gene pair in living prokaryotic cells remains challenging. Here, we develop pooled promoter responses to TF perturbation sequencing (PPTP-seq) via CRISPR interference to address this challenge. Using PPTP-seq, we systematically measure the activity of 1372 Escherichia coli promoters under single knockdown of 183 TF genes, illustrating more than 200,000 possible TF-gene responses in one experiment. We perform PPTP-seq for E. coli growing in three different media. The PPTP-seq data reveal robust steady-state promoter activities under most single TF knockdown conditions. PPTP-seq also enables identifications of, to the best of our knowledge, previously unknown TF autoregulatory responses and complex transcriptional control on one-carbon metabolism. We further find context-dependent promoter regulation by multiple TFs whose relative binding strengths determined promoter activities. Additionally, PPTP-seq reveals different promoter responses in different growth media, suggesting condition-specific gene regulation. Overall, PPTP-seq provides a powerful method to examine genome-wide transcriptional regulatory networks and can be potentially expanded to reveal gene expression responses to other genetic elements.
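The "more than 200,000 possible TF-gene responses" figure follows from the two counts quoted in the abstract. A minimal sketch of that arithmetic, assuming every one of the 1372 promoters is measured under every one of the 183 single-TF knockdowns (the function and constant names are illustrative and not taken from the paper):

```python
# Counts quoted in the abstract above.
N_PROMOTERS = 1372      # E. coli promoters assayed
N_TF_KNOCKDOWNS = 183   # TF genes knocked down one at a time


def count_tf_promoter_pairs(n_promoters: int, n_tfs: int) -> int:
    """Number of TF-promoter pairs if each promoter is read under each knockdown."""
    return n_promoters * n_tfs


pairs = count_tf_promoter_pairs(N_PROMOTERS, N_TF_KNOCKDOWNS)
print(pairs)  # 251076
```

The product, 251,076, is consistent with the abstract's "more than 200,000". It counts pairs within a single experiment and does not multiply by the three growth media.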
Affiliation(s)
- Yichao Han
  - Department of Energy, Environmental and Chemical Engineering, Washington University in St. Louis, Saint Louis, Missouri, USA
- Wanji Li
  - Department of Energy, Environmental and Chemical Engineering, Washington University in St. Louis, Saint Louis, Missouri, USA
- Alden Filko
  - Department of Energy, Environmental and Chemical Engineering, Washington University in St. Louis, Saint Louis, Missouri, USA
- Jingyao Li
  - Department of Energy, Environmental and Chemical Engineering, Washington University in St. Louis, Saint Louis, Missouri, USA
- Fuzhong Zhang
  - Department of Energy, Environmental and Chemical Engineering, Washington University in St. Louis, Saint Louis, Missouri, USA
  - Division of Biological and Biomedical Sciences, Washington University in St. Louis, Saint Louis, Missouri, USA
  - Institute of Materials Science and Engineering, Washington University in St. Louis, Saint Louis, Missouri, USA
6
Bang I, Lee SM, Park S, Park JY, Nong LK, Gao Y, Palsson BO, Kim D. Deep-learning optimized DEOCSU suite provides an iterable pipeline for accurate ChIP-exo peak calling. Brief Bioinform 2023; 24:7005164. [PMID: 36702751 DOI: 10.1093/bib/bbad024] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/23/2022] [Revised: 01/02/2023] [Accepted: 01/08/2023] [Indexed: 01/28/2023] Open
Abstract
Recognizing binding sites of DNA-binding proteins is a key factor for elucidating transcriptional regulation in organisms. ChIP-exo enables researchers to delineate genome-wide binding landscapes of DNA-binding proteins with near single base-pair resolution. However, the peak calling step hinders ChIP-exo application since the published algorithms tend to generate false-positive and false-negative predictions. Here, we report the development of DEOCSU (DEep-learning Optimized ChIP-exo peak calling SUite), a novel machine learning-based ChIP-exo peak calling suite. DEOCSU entails the deep convolutional neural network model which was trained with curated ChIP-exo peak data to distinguish the visualized data of bona fide peaks from false ones. Performance validation of the trained deep-learning model indicated its high accuracy, high precision and high recall of over 95%. Applying the new suite to both in-house and publicly available ChIP-exo datasets obtained from bacteria, eukaryotes and archaea revealed an accurate prediction of peaks containing canonical motifs, highlighting the versatility and efficiency of DEOCSU. Furthermore, DEOCSU can be executed on a cloud computing platform or the local environment. With visualization software included in the suite, adjustable options such as the threshold of peak probability, and iterable updating of the pre-trained model, DEOCSU can be optimized for users' specific needs.
Affiliation(s)
- Ina Bang
  - School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology, Ulsan 44919, Republic of Korea
- Sang-Mok Lee
  - School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology, Ulsan 44919, Republic of Korea
- Seojoung Park
  - School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology, Ulsan 44919, Republic of Korea
- Joon Young Park
  - School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology, Ulsan 44919, Republic of Korea
- Linh Khanh Nong
  - School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology, Ulsan 44919, Republic of Korea
- Ye Gao
  - Department of Bioengineering, University of California San Diego, La Jolla CA 92093, USA
- Bernhard O Palsson
  - Department of Bioengineering, University of California San Diego, La Jolla CA 92093, USA
  - Department of Pediatrics, University of California San Diego, La Jolla CA 92093, USA
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Building 220, Kemitorvet, 2800 Kgs. Lyngby, Denmark
- Donghyuk Kim
  - School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology, Ulsan 44919, Republic of Korea
7
RfaH Counter-Silences Inhibition of Transcript Elongation by H-NS-StpA Nucleoprotein Filaments in Pathogenic Escherichia coli. mBio 2022; 13:e0266222. [PMID: 36264101 PMCID: PMC9765446 DOI: 10.1128/mbio.02662-22] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/20/2022] Open
Abstract
Expression of virulence genes in pathogenic Escherichia coli is controlled in part by the transcription silencer H-NS and its paralogs (e.g., StpA), which sequester DNA in multi-kb nucleoprotein filaments to inhibit transcription initiation, elongation, or both. Some activators counter-silence initiation by displacing H-NS from promoters, but how H-NS inhibition of elongation is overcome is not understood. In uropathogenic E. coli (UPEC), elongation regulator RfaH aids expression of some H-NS-silenced pathogenicity operons (e.g., hlyCABD encoding hemolysin). RfaH associates with elongation complexes (ECs) via direct contacts to a transiently exposed, nontemplate DNA strand sequence called operon polarity suppressor (ops). RfaH-ops interactions establish long-lived RfaH-EC contacts that allow RfaH to recruit ribosomes to the nascent mRNA and to suppress transcriptional pausing and termination. Using ChIP-seq, we mapped the genome-scale distributions of RfaH, H-NS, StpA, RNA polymerase (RNAP), and σ70 in the UPEC strain CFT073. We identify eight RfaH-activated operons, all of which were bound by H-NS and StpA. Four are new additions to the RfaH regulon. Deletion of RfaH caused premature termination, whereas deletion of H-NS and StpA allowed elongation without RfaH. Thus, RfaH is an elongation counter-silencer of H-NS. Consistent with elongation counter-silencing, deletion of StpA alone decreased the effect of RfaH. StpA increases DNA bridging, which inhibits transcript elongation via topological constraints on RNAP. Residual RfaH effect when both H-NS and StpA were deleted was attributable to targeting of RfaH-regulated operons by a minor H-NS paralog, Hfp. These operons have evolved higher levels of H-NS-binding features, explaining minor-paralog targeting.
IMPORTANCE: Bacterial pathogens adapt to hosts and host defenses by reprogramming gene expression, including by H-NS counter-silencing. Counter-silencing turns on transcription initiation when regulators bind to promoters and rearrange repressive H-NS nucleoprotein filaments that ordinarily block transcription. The specialized NusG paralog RfaH also reprograms virulence genes but regulates transcription elongation. To understand how elongation regulators might affect genes silenced by H-NS, we mapped H-NS, StpA (an H-NS paralog), RfaH, σ70, and RNA polymerase (RNAP) locations on DNA in the uropathogenic E. coli strain CFT073. Although H-NS-StpA filaments bind only 18% of the CFT073 genome, all loci at which RfaH binds RNAP are also bound by H-NS-StpA and are silenced when RfaH is absent. Thus, RfaH represents a distinct class of counter-silencer that acts on elongating RNAP to enable transcription through repressive nucleoprotein filaments. Our findings define a new mechanism of elongation counter-silencing and explain how RfaH functions as a virulence regulator.
8
Gao R, Brokaw SE, Li Z, Helfant LJ, Wu T, Malik M, Stock AM. Exploring the mono-/bistability range of positively autoregulated signaling systems in the presence of competing transcription factor binding sites. PLoS Comput Biol 2022; 18:e1010738. [PMID: 36413575 PMCID: PMC9725139 DOI: 10.1371/journal.pcbi.1010738] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2022] [Revised: 12/06/2022] [Accepted: 11/14/2022] [Indexed: 11/23/2022] Open
Abstract
Binding of transcription factor (TF) proteins to regulatory DNA sites is key to accurate control of gene expression in response to environmental stimuli. Theoretical modeling of transcription regulation is often focused on a limited set of genes of interest, while binding of the TF to other genomic sites is seldom considered. The total number of TF binding sites (TFBSs) affects the availability of TF protein molecules and sequestration of a TF by TFBSs can promote bistability. For many signaling systems where a graded response is desirable for continuous control over the input range, biochemical parameters of the regulatory proteins need be tuned to avoid bistability. Here we analyze the mono-/bistable parameter range for positively autoregulated two-component systems (TCSs) in the presence of different numbers of competing TFBSs. TCS signaling, one of the major bacterial signaling strategies, couples signal perception with output responses via protein phosphorylation. For bistability, competition for TF proteins by TFBSs lowers the requirement for high fold change of the autoregulated transcription but demands high phosphorylation activities of TCS proteins. We show that bistability can be avoided with a low phosphorylation capacity of TCSs, a high TF affinity for the autoregulated promoter or a low fold change in signaling protein levels upon induction. These may represent general design rules for TCSs to ensure uniform graded responses. Examining the mono-/bistability parameter range allows qualitative prediction of steady-state responses, which are experimentally validated in the E. coli CusRS system.
Affiliation(s)
- Rong Gao
  - Center for Advanced Biotechnology and Medicine, Department of Biochemistry and Molecular Biology, Rutgers University - Robert Wood Johnson Medical School, Piscataway, New Jersey, United States of America
- Samantha E. Brokaw
  - Center for Advanced Biotechnology and Medicine, Department of Biochemistry and Molecular Biology, Rutgers University - Robert Wood Johnson Medical School, Piscataway, New Jersey, United States of America
- Zeyue Li
  - Center for Advanced Biotechnology and Medicine, Department of Biochemistry and Molecular Biology, Rutgers University - Robert Wood Johnson Medical School, Piscataway, New Jersey, United States of America
- Libby J. Helfant
  - Center for Advanced Biotechnology and Medicine, Department of Biochemistry and Molecular Biology, Rutgers University - Robert Wood Johnson Medical School, Piscataway, New Jersey, United States of America
- Ti Wu
  - Center for Advanced Biotechnology and Medicine, Department of Biochemistry and Molecular Biology, Rutgers University - Robert Wood Johnson Medical School, Piscataway, New Jersey, United States of America
- Muhammad Malik
  - Center for Advanced Biotechnology and Medicine, Department of Biochemistry and Molecular Biology, Rutgers University - Robert Wood Johnson Medical School, Piscataway, New Jersey, United States of America
- Ann M. Stock
  - Center for Advanced Biotechnology and Medicine, Department of Biochemistry and Molecular Biology, Rutgers University - Robert Wood Johnson Medical School, Piscataway, New Jersey, United States of America
9
Tierrafría VH, Rioualen C, Salgado H, Lara P, Gama-Castro S, Lally P, Gómez-Romero L, Peña-Loredo P, López-Almazo AG, Alarcón-Carranza G, Betancourt-Figueroa F, Alquicira-Hernández S, Polanco-Morelos JE, García-Sotelo J, Gaytan-Nuñez E, Méndez-Cruz CF, Muñiz LJ, Bonavides-Martínez C, Moreno-Hagelsieb G, Galagan JE, Wade JT, Collado-Vides J. RegulonDB 11.0: Comprehensive high-throughput datasets on transcriptional regulation in Escherichia coli K-12. Microb Genom 2022; 8:mgen000833. [PMID: 35584008 PMCID: PMC9465075 DOI: 10.1099/mgen.0.000833] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Received: 12/18/2021] [Accepted: 04/24/2022] [Indexed: 01/23/2023] Open
Abstract
Genomics has set the basis for a variety of methodologies that produce high-throughput datasets identifying the different players that define gene regulation, particularly regulation of transcription initiation and operon organization. These datasets are available in public repositories, such as the Gene Expression Omnibus, or ArrayExpress. However, accessing and navigating such a wealth of data is not straightforward. No resource currently exists that offers all available high- and low-throughput data on transcriptional regulation in Escherichia coli K-12 to easily use both as whole datasets, or as individual interactions and regulatory elements. RegulonDB (https://regulondb.ccg.unam.mx) began gathering high-throughput dataset collections in 2009, starting with transcription start sites, then adding ChIP-seq and gSELEX in 2012, with up to 99 different experimental high-throughput datasets available in 2019. In this paper we present a radical upgrade to more than 2000 high-throughput datasets, processed to facilitate their comparison, introducing up-to-date collections of transcription termination sites, transcription units, as well as transcription factor binding interactions derived from ChIP-seq, ChIP-exo, gSELEX and DAP-seq experiments, besides expression profiles derived from RNA-seq experiments. For ChIP-seq experiments we offer both the data as presented by the authors, as well as data uniformly processed in-house, enhancing their comparability, as well as the traceability of the methods and reproducibility of the results. Furthermore, we have expanded the tools available for browsing and visualization across and within datasets. We include comparisons against previously existing knowledge in RegulonDB from classic experiments, a nucleotide-resolution genome viewer, and an interface that enables users to browse datasets by querying their metadata. A particular effort was made to automatically extract detailed experimental growth conditions by implementing an assisted curation strategy applying natural language processing and machine learning. We provide summaries with the total number of interactions found in each experiment, as well as tools to identify common results among different experiments. This is a long-awaited resource to make use of such a wealth of knowledge and advance our understanding of the biology of the model bacterium E. coli K-12.
Affiliation(s)
- Víctor H. Tierrafría
- Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Avenida Universidad s/n, Cuernavaca 62210, Morelos, Mexico
- Department of Biomedical Engineering, Boston University, 44 Cummington Mall, Boston, MA 02215, USA
- Claire Rioualen
- Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Avenida Universidad s/n, Cuernavaca 62210, Morelos, Mexico
- Heladia Salgado
- Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Avenida Universidad s/n, Cuernavaca 62210, Morelos, Mexico
- Paloma Lara
- Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Avenida Universidad s/n, Cuernavaca 62210, Morelos, Mexico
- Socorro Gama-Castro
- Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Avenida Universidad s/n, Cuernavaca 62210, Morelos, Mexico
- Patrick Lally
- Department of Biomedical Engineering, Boston University, 44 Cummington Mall, Boston, MA 02215, USA
- Laura Gómez-Romero
- Instituto Nacional de Medicina Genómica, INMEGEN, Periférico Sur 4809, Arenal Tepepan, Tlalpan 14610, CDMX, Mexico
- Pablo Peña-Loredo
- Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Avenida Universidad s/n, Cuernavaca 62210, Morelos, Mexico
- Andrés G. López-Almazo
- Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Avenida Universidad s/n, Cuernavaca 62210, Morelos, Mexico
- Gabriel Alarcón-Carranza
- Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Avenida Universidad s/n, Cuernavaca 62210, Morelos, Mexico
- Felipe Betancourt-Figueroa
- Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Avenida Universidad s/n, Cuernavaca 62210, Morelos, Mexico
- Shirley Alquicira-Hernández
- Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Avenida Universidad s/n, Cuernavaca 62210, Morelos, Mexico
- J. Enrique Polanco-Morelos
- Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Avenida Universidad s/n, Cuernavaca 62210, Morelos, Mexico
- Jair García-Sotelo
- Laboratorio Internacional de Investigación sobre el Genoma Humano, Universidad Nacional Autónoma de México, Querétaro 76230, Querétaro, Mexico
- Estefani Gaytan-Nuñez
- Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Avenida Universidad s/n, Cuernavaca 62210, Morelos, Mexico
- Carlos-Francisco Méndez-Cruz
- Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Avenida Universidad s/n, Cuernavaca 62210, Morelos, Mexico
- Luis J. Muñiz
- Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Avenida Universidad s/n, Cuernavaca 62210, Morelos, Mexico
- César Bonavides-Martínez
- Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Avenida Universidad s/n, Cuernavaca 62210, Morelos, Mexico
- Gabriel Moreno-Hagelsieb
- Department of Biology, Wilfrid Laurier University, 75 University Ave W, Waterloo, ON N2L 3C5, Canada
- James E. Galagan
- Department of Biomedical Engineering, Boston University, 44 Cummington Mall, Boston, MA 02215, USA
- Joseph T. Wade
- Wadsworth Center, New York State Department of Health, Albany, NY, USA
- Department of Biomedical Sciences, University at Albany, SUNY, Albany, NY, USA
- Julio Collado-Vides
- Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Avenida Universidad s/n, Cuernavaca 62210, Morelos, Mexico
- Department of Biomedical Engineering, Boston University, 44 Cummington Mall, Boston, MA 02215, USA
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Universitat Pompeu Fabra (UPF), Barcelona, Spain