51
Mori M, Zhang Z, Banaei‐Esfahani A, Lalanne J, Okano H, Collins BC, Schmidt A, Schubert OT, Lee D, Li G, Aebersold R, Hwa T, Ludwig C. From coarse to fine: the absolute Escherichia coli proteome under diverse growth conditions. Mol Syst Biol 2021; 17:e9536. [PMID: 34032011 PMCID: PMC8144880 DOI: 10.15252/msb.20209536] [Citation(s) in RCA: 73] [Impact Index Per Article: 18.3] [Received: 02/21/2020] [Revised: 04/07/2021] [Accepted: 04/09/2021] [Indexed: 12/17/2022] Open
Abstract
Accurate measurements of cellular protein concentrations are invaluable to quantitative studies of gene expression and physiology in living cells. Here, we developed a versatile mass spectrometric workflow based on data-independent acquisition proteomics (DIA/SWATH) together with a novel protein inference algorithm (xTop). We used this workflow to accurately quantify absolute protein abundances in Escherichia coli for > 2,000 proteins over > 60 growth conditions, including nutrient limitations, non-metabolic stresses, and non-planktonic states. The resulting high-quality dataset of protein mass fractions allowed us to characterize proteome responses from a coarse (groups of related proteins) to a fine (individual) protein level. Hereby, a plethora of novel biological findings could be elucidated, including the generic upregulation of low-abundant proteins under various metabolic limitations, the non-specificity of catabolic enzymes upregulated under carbon limitation, the lack of large-scale proteome reallocation under stress compared to nutrient limitations, as well as surprising strain-dependent effects important for biofilm formation. These results present valuable resources for the systems biology community and can be used for future multi-omics studies of gene regulation and metabolic control in E. coli.
Affiliation(s)
- Matteo Mori
- Department of Physics, University of California at San Diego, La Jolla, CA, USA
- Zhongge Zhang
- Section of Molecular Biology, Division of Biological Sciences, University of California at San Diego, La Jolla, CA, USA
- Amir Banaei‐Esfahani
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- Jean‐Benoît Lalanne
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
- Department of Physics, Massachusetts Institute of Technology, Cambridge, MA, USA
- Hiroyuki Okano
- Department of Physics, University of California at San Diego, La Jolla, CA, USA
- Ben C Collins
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- School of Biological Sciences, Queen's University of Belfast, Belfast, UK
- Olga T Schubert
- Department of Human Genetics, University of California, Los Angeles, Los Angeles, CA, USA
- Deok‐Sun Lee
- School of Computational Sciences, Korea Institute for Advanced Study, Seoul, Korea
- Gene‐Wei Li
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
- Ruedi Aebersold
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- Faculty of Science, University of Zurich, Zurich, Switzerland
- Terence Hwa
- Department of Physics, University of California at San Diego, La Jolla, CA, USA
- Section of Molecular Biology, Division of Biological Sciences, University of California at San Diego, La Jolla, CA, USA
- Christina Ludwig
- Bavarian Center for Biomolecular Mass Spectrometry (BayBioMS), Technical University of Munich (TUM), Freising, Germany
52
Santibáñez R, Garrido D, Martin AJM. Atlas: automatic modeling of regulation of bacterial gene expression and metabolism using rule-based languages. Bioinformatics 2021; 36:5473-5480. [PMID: 33367504 PMCID: PMC8016457 DOI: 10.1093/bioinformatics/btaa1040] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 05/09/2020] [Revised: 11/19/2020] [Accepted: 12/12/2020] [Indexed: 12/31/2022] Open
Abstract
MOTIVATION Cells are complex systems composed of hundreds of genes whose products interact to produce elaborated behaviors. To control such behaviors, cells rely on transcription factors to regulate gene expression, and gene regulatory networks (GRNs) are employed to describe and understand such behavior. However, GRNs are static models, and dynamic models are difficult to obtain due to their size, complexity, stochastic dynamics and interactions with other cell processes. RESULTS We developed Atlas, a Python software that converts genome graphs and gene regulatory, interaction and metabolic networks into dynamic models. The software employs these biological networks to write rule-based models for the PySB framework. The underlying method is a divide-and-conquer strategy to obtain sub-models and combine them later into an ensemble model. To exemplify the utility of Atlas, we used networks of varying size and complexity of Escherichia coli and evaluated in silico modifications, such as gene knockouts and the insertion of promoters and terminators. Moreover, the methodology could be applied to the dynamic modeling of natural and synthetic networks of any bacteria. AVAILABILITY AND IMPLEMENTATION Code, models and tutorials are available online (https://github.com/networkbiolab/atlas). SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Rodrigo Santibáñez
- Laboratorio de Biología de Redes, Centro de Genómica y Bioinformática, Universidad Mayor, Santiago 8580745, Chile
- Department of Chemical and Bioprocess Engineering, School of Engineering, Pontificia Universidad Católica de Chile, Santiago 7820436, Chile
- Daniel Garrido
- Department of Chemical and Bioprocess Engineering, School of Engineering, Pontificia Universidad Católica de Chile, Santiago 7820436, Chile
- Alberto J M Martin
- Laboratorio de Biología de Redes, Centro de Genómica y Bioinformática, Universidad Mayor, Santiago 8580745, Chile
53
Bennett RK, Gregory GJ, Gonzalez JE, Har JRG, Antoniewicz MR, Papoutsakis ET. Improving the Methanol Tolerance of an Escherichia coli Methylotroph via Adaptive Laboratory Evolution Enhances Synthetic Methanol Utilization. Front Microbiol 2021; 12:638426. [PMID: 33643274 PMCID: PMC7904680 DOI: 10.3389/fmicb.2021.638426] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 12/06/2020] [Accepted: 01/21/2021] [Indexed: 02/05/2023] Open
Abstract
There is great interest in developing synthetic methylotrophs that harbor methane and methanol utilization pathways in heterologous hosts such as Escherichia coli for industrial bioconversion of one-carbon compounds. While there are recent reports that describe the successful engineering of synthetic methylotrophs, additional efforts are required to achieve the robust methylotrophic phenotypes required for industrial realization. Here, we address an important issue of synthetic methylotrophy in E. coli: methanol toxicity. Both methanol and its oxidation product, formaldehyde, are cytotoxic to cells. Methanol alters the fluidity and biological properties of cellular membranes while formaldehyde reacts readily with proteins and nucleic acids. Thus, efforts to enhance the methanol tolerance of synthetic methylotrophs are important. Here, adaptive laboratory evolution was performed to improve the methanol tolerance of several E. coli strains, both methylotrophic and non-methylotrophic. Serial batch passaging in rich medium containing toxic methanol concentrations yielded clones exhibiting improved methanol tolerance. In several cases, these evolved clones exhibited a > 50% improvement in growth rate and biomass yield in the presence of high methanol concentrations compared to the respective parental strains. Importantly, one evolved clone exhibited a two to threefold improvement in the methanol utilization phenotype, as determined via 13C-labeling, at non-toxic, industrially relevant methanol concentrations compared to the respective parental strain. Whole genome sequencing was performed to identify causative mutations contributing to methanol tolerance. Common mutations were identified in 30S ribosomal subunit proteins, which increased translational accuracy and provided insight into a novel methanol tolerance mechanism. This study addresses an important issue of synthetic methylotrophy in E. coli and provides insight as to how methanol toxicity can be alleviated via enhancing methanol tolerance. Coupled improvement of methanol tolerance and synthetic methanol utilization is an important advancement for the field of synthetic methylotrophy.
Affiliation(s)
- R Kyle Bennett
- Department of Chemical and Biomolecular Engineering, University of Delaware, Newark, DE, United States; Molecular Biotechnology Laboratory, The Delaware Biotechnology Institute, University of Delaware, Newark, DE, United States
- Gwendolyn J Gregory
- Department of Chemical and Biomolecular Engineering, University of Delaware, Newark, DE, United States; Molecular Biotechnology Laboratory, The Delaware Biotechnology Institute, University of Delaware, Newark, DE, United States
- Jacqueline E Gonzalez
- Department of Chemical and Biomolecular Engineering, University of Delaware, Newark, DE, United States
- Jie Ren Gerald Har
- Department of Chemical and Biomolecular Engineering, University of Delaware, Newark, DE, United States
- Maciek R Antoniewicz
- Department of Chemical and Biomolecular Engineering, University of Delaware, Newark, DE, United States
- Eleftherios T Papoutsakis
- Department of Chemical and Biomolecular Engineering, University of Delaware, Newark, DE, United States; Molecular Biotechnology Laboratory, The Delaware Biotechnology Institute, University of Delaware, Newark, DE, United States
54
Wu PIF, Ross C, Siegele DA, Hu JC. Insights from the reanalysis of high-throughput chemical genomics data for Escherichia coli K-12. G3-GENES GENOMES GENETICS 2021; 11:6044125. [PMID: 33561236 PMCID: PMC8022724 DOI: 10.1093/g3journal/jkaa035] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/14/2020] [Accepted: 11/11/2020] [Indexed: 11/14/2022]
Abstract
Despite the demonstrated success of genome-wide genetic screens and chemical genomics studies at predicting functions for genes of unknown function or predicting new functions for well-characterized genes, their potential to provide insights into gene function has not been fully explored. We systematically reanalyzed a published high-throughput phenotypic dataset for the model Gram-negative bacterium Escherichia coli K-12. The availability of high-quality annotation sets allowed us to compare the power of different metrics for measuring phenotypic profile similarity to correctly infer gene function. We conclude that there is no single best method; the three metrics tested gave comparable results for most gene pairs. We also assessed how converting quantitative phenotypes to discrete, qualitative phenotypes affected the association between phenotype and function. Our results indicate that this approach may allow phenotypic data from different studies to be combined to produce a larger dataset that may reveal functional connections between genes not detected in individual studies.
Affiliation(s)
- Peter I-Fan Wu
- Department of Biochemistry and Biophysics, Texas A&M University and Texas Agrilife Research, College Station, TX 77843-2128, USA
- Curtis Ross
- Department of Biochemistry and Biophysics, Texas A&M University and Texas Agrilife Research, College Station, TX 77843-2128, USA
- Deborah A Siegele
- Department of Biology, Texas A&M University, College Station, TX 77843-3258, USA
- James C Hu
- Department of Biochemistry and Biophysics, Texas A&M University and Texas Agrilife Research, College Station, TX 77843-2128, USA
55
Li W, O’Neill KR, Haft DH, DiCuccio M, Chetvernin V, Badretdin A, Coulouris G, Chitsaz F, Derbyshire M, Durkin AS, Gonzales NR, Gwadz M, Lanczycki C, Song JS, Thanki N, Wang J, Yamashita R, Yang M, Zheng C, Marchler-Bauer A, Thibaud-Nissen F. RefSeq: expanding the Prokaryotic Genome Annotation Pipeline reach with protein family model curation. Nucleic Acids Res 2021; 49:D1020-D1028. [PMID: 33270901 PMCID: PMC7779008 DOI: 10.1093/nar/gkaa1105] [Citation(s) in RCA: 622] [Impact Index Per Article: 155.5] [Received: 09/15/2020] [Revised: 10/19/2020] [Accepted: 11/02/2020] [Indexed: 11/14/2022] Open
Abstract
The Reference Sequence (RefSeq) project at the National Center for Biotechnology Information (NCBI) contains nearly 200 000 bacterial and archaeal genomes and 150 million proteins with up-to-date annotation. Changes in the Prokaryotic Genome Annotation Pipeline (PGAP) since 2018 have resulted in a substantial reduction in spurious annotation. The hierarchical collection of protein family models (PFMs) used by PGAP as evidence for structural and functional annotation was expanded to over 35 000 protein profile hidden Markov models (HMMs), 12 300 BlastRules and 36 000 curated CDD architectures. As a result, >122 million or 79% of RefSeq proteins are now named based on a match to a curated PFM. Gene symbols, Enzyme Commission numbers or supporting publication attributes are available on over 40% of the PFMs and are inherited by the proteins and features they name, facilitating multi-genome analyses and connections to the literature. In adherence with the principles of FAIR (findable, accessible, interoperable, reusable), the PFMs are available in the Protein Family Models Entrez database to any user. Finally, the reference and representative genome set, a taxonomically diverse subset of RefSeq prokaryotic genomes, is now recalculated regularly and available for download and homology searches with BLAST. RefSeq is found at https://www.ncbi.nlm.nih.gov/refseq/.
Affiliation(s)
- Wenjun Li
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892-6511, USA
- Kathleen R O’Neill
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892-6511, USA
- Daniel H Haft
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892-6511, USA
- Michael DiCuccio
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892-6511, USA
- Vyacheslav Chetvernin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892-6511, USA
- Azat Badretdin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892-6511, USA
- George Coulouris
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892-6511, USA
- Farideh Chitsaz
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892-6511, USA
- Myra K Derbyshire
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892-6511, USA
- A Scott Durkin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892-6511, USA
- Noreen R Gonzales
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892-6511, USA
- Marc Gwadz
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892-6511, USA
- Christopher J Lanczycki
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892-6511, USA
- James S Song
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892-6511, USA
- Narmada Thanki
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892-6511, USA
- Jiyao Wang
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892-6511, USA
- Roxanne A Yamashita
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892-6511, USA
- Mingzhang Yang
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892-6511, USA
- Chanjuan Zheng
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892-6511, USA
- Aron Marchler-Bauer
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892-6511, USA
- Françoise Thibaud-Nissen
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892-6511, USA
56
Mejía-Almonte C, Busby SJW, Wade JT, van Helden J, Arkin AP, Stormo GD, Eilbeck K, Palsson BO, Galagan JE, Collado-Vides J. Redefining fundamental concepts of transcription initiation in bacteria. Nat Rev Genet 2020; 21:699-714. [PMID: 32665585 PMCID: PMC7990032 DOI: 10.1038/s41576-020-0254-8] [Citation(s) in RCA: 93] [Impact Index Per Article: 18.6] [Accepted: 05/29/2020] [Indexed: 12/15/2022]
Abstract
Despite enormous progress in understanding the fundamentals of bacterial gene regulation, our knowledge remains limited when compared with the number of bacterial genomes and regulatory systems to be discovered. Derived from a small number of initial studies, classic definitions for concepts of gene regulation have evolved as the number of characterized promoters has increased. Together with discoveries made using new technologies, this knowledge has led to revised generalizations and principles. In this Expert Recommendation, we suggest precise, updated definitions that support a logical, consistent conceptual framework of bacterial gene regulation, focusing on transcription initiation. The resulting concepts can be formalized by ontologies for computational modelling, laying the foundation for improved bioinformatics tools, knowledge-based resources and scientific communication. Thus, this work will help researchers construct better predictive models, with different formalisms, that will be useful in engineering, synthetic biology, microbiology and genetics.
Affiliation(s)
- Citlalli Mejía-Almonte
- Programa de Genómica Computacional, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Morelos, Cuernavaca, México
- Joseph T Wade
- Division of Genetics, Wadsworth Center, New York State Department of Health, Albany, NY, USA
- Jacques van Helden
- Aix-Marseille University, INSERM UMR S 1090, Theory and Approaches of Genome Complexity (TAGC), Marseille, France
- CNRS, Institut Français de Bioinformatique, IFB-core, UMS 3601, Evry, France
- Adam P Arkin
- Department of Bioengineering, University of California, Berkeley, Berkeley, CA, USA
- Gary D Stormo
- Department of Genetics, Washington University School of Medicine, St Louis, MO, USA
- Karen Eilbeck
- Department of Biomedical Informatics, University of Utah School of Medicine, Salt Lake City, UT, USA
- Bernhard O Palsson
- Department of Bioengineering, University of California, San Diego, La Jolla, CA, USA
- James E Galagan
- Department of Biomedical Engineering, Boston University, Boston, MA, USA
- Julio Collado-Vides
- Programa de Genómica Computacional, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Morelos, Cuernavaca, México
- Department of Biomedical Engineering, Boston University, Boston, MA, USA
57
M A Basher AR, McLaughlin RJ, Hallam SJ. Metabolic pathway inference using multi-label classification with rich pathway features. PLoS Comput Biol 2020; 16:e1008174. [PMID: 33001968 PMCID: PMC7529316 DOI: 10.1371/journal.pcbi.1008174] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 02/02/2020] [Accepted: 07/21/2020] [Indexed: 12/15/2022] Open
Abstract
Metabolic inference from genomic sequence information is a necessary step in determining the capacity of cells to make a living in the world at different levels of biological organization. A common method for determining the metabolic potential encoded in genomes is to map conceptually translated open reading frames onto a database containing known product descriptions. Such gene-centric methods are limited in their capacity to predict pathway presence or absence and do not support standardized rule sets for automated and reproducible research. Pathway-centric methods based on defined rule sets or machine learning algorithms provide an adjunct or alternative inference method that supports hypothesis generation and testing of metabolic relationships within and between cells. Here, we present mlLGPR, multi-label based on logistic regression for pathway prediction, a software package that uses supervised multi-label classification and rich pathway features to infer metabolic networks in organismal and multi-organismal datasets. We evaluated mlLGPR performance using a corpus of 12 experimental datasets manifesting diverse multi-label properties, including manually curated organismal genomes, synthetic microbial communities and low complexity microbial communities. Resulting performance metrics equaled or exceeded previous reports for organismal genomes and identify specific challenges associated with feature engineering and training data for community-level metabolic inference.
Affiliation(s)
- Abdur Rahman M A Basher
- Graduate Program in Bioinformatics, University of British Columbia, Genome Sciences Centre, 100-570 West 7th Avenue, Vancouver, British Columbia, Canada
- Ryan J McLaughlin
- Graduate Program in Bioinformatics, University of British Columbia, Genome Sciences Centre, 100-570 West 7th Avenue, Vancouver, British Columbia, Canada
- Steven J Hallam
- Graduate Program in Bioinformatics, University of British Columbia, Genome Sciences Centre, 100-570 West 7th Avenue, Vancouver, British Columbia, Canada
- Department of Microbiology & Immunology, University of British Columbia, 2552-2350 Health Sciences Mall, Vancouver, British Columbia, Canada
- Genome Science and Technology Program, University of British Columbia, 2329 West Mall, Vancouver, BC, Canada
- Life Sciences Institute, University of British Columbia, Vancouver, British Columbia, Canada
- ECOSCOPE Training Program, University of British Columbia, Vancouver, British Columbia, Canada
58
Abstract
While there has been much study of bacterial gene dispensability, there is a lack of comprehensive genome-scale examinations of the impact of gene deletion on growth in different carbon sources. In this context, a lot can be learned from such experiments in the model microbe Escherichia coli where much is already understood and there are existing tools for the investigation of carbon metabolism and physiology (1). Gene deletion studies have practical potential in the field of antibiotic drug discovery where there is emerging interest in bacterial central metabolism as a target for new antibiotics (2). Furthermore, some carbon utilization pathways have been shown to be critical for initiating and maintaining infection for certain pathogens and sites of infection (3–5). Here, with the use of high-throughput solid medium phenotyping methods, we have generated kinetic growth measurements for 3,796 genes under 30 different carbon source conditions. This data set provides a foundation for research that will improve our understanding of genes with unknown function, aid in predicting potential antibiotic targets, validate and advance metabolic models, and help to develop our understanding of E. coli metabolism. Central metabolism is a topic that has been studied for decades, and yet, this process is still not fully understood in Escherichia coli, perhaps the most amenable and well-studied model organism in biology. To further our understanding, we used a high-throughput method to measure the growth kinetics of each of 3,796 E. coli single-gene deletion mutants in 30 different carbon sources. In total, there were 342 genes (9.01%) encompassing a breadth of biological functions that showed a growth phenotype on at least 1 carbon source, demonstrating that carbon metabolism is closely linked to a large number of processes in the cell. 
We identified 74 genes that showed low growth in 90% of conditions, defining a set of genes which are essential in nutrient-limited media, regardless of the carbon source. The data are compiled into a Web application, Carbon Phenotype Explorer (CarPE), to facilitate easy visualization of growth curves for each mutant strain in each carbon source. Our experimental data matched closely with the predictions from the EcoCyc metabolic model which uses flux balance analysis to predict growth phenotypes. From our comparisons to the model, we found that, unexpectedly, phosphoenolpyruvate carboxylase (ppc) was required for robust growth in most carbon sources other than most tricarboxylic acid (TCA) cycle intermediates. We also identified 51 poorly annotated genes that showed a low growth phenotype in at least 1 carbon source, which allowed us to form hypotheses about the functions of these genes. From this list, we further characterized the ydhC gene and demonstrated its role in adenosine efflux.
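The two gene sets described in this abstract (genes with a phenotype on at least 1 carbon source, and genes with low growth in 90% of conditions) can be illustrated with a minimal sketch. The gene names, the boolean "low growth" calls, and the `summarize` helper are hypothetical and are not taken from the study; only the counts 342 and 3,796 come from the abstract.

```python
def summarize(low_growth, frac_threshold=0.9):
    """Split genes by how many carbon sources show a low-growth call.

    low_growth maps a gene name to a list of booleans, one per carbon
    source (True = low growth). Both the data layout and the threshold
    handling are assumptions made for illustration.
    """
    any_phenotype = sorted(g for g, flags in low_growth.items() if any(flags))
    broadly_required = sorted(
        g for g, flags in low_growth.items()
        if flags and sum(flags) / len(flags) >= frac_threshold
    )
    return any_phenotype, broadly_required


# Toy data for three hypothetical genes across 30 carbon sources.
toy = {
    "geneA": [True] * 28 + [False] * 2,   # low growth in 28 of 30 conditions
    "geneB": [True] + [False] * 29,       # low growth in a single condition
    "geneC": [False] * 30,                # no phenotype
}
any_pheno, broad = summarize(toy)

# Arithmetic check of the fraction reported in the abstract.
percent_with_phenotype = 100 * 342 / 3796
```

Here `geneA` falls in both sets, `geneB` only in the first, and `percent_with_phenotype` rounds to the 9.01% quoted above.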
59
Abstract
Feedback mechanisms are fundamental to the control of physiological responses. One important example in gene regulation, termed negative autoregulation (NAR), occurs when a transcription factor (TF) inhibits its own production through transcriptional repression. This enables more-rapid homeostatic control of gene expression. NAR circuits presumably evolve to limit the fitness costs of gratuitous gene expression. The key biochemical reactions of NAR can be parameterized using a mathematical model of promoter activity; however, this model of NAR has been studied mostly in the context of synthetic NAR circuits that are disconnected from the target genes of the TFs. Thus, it remains unclear how constrained NAR parameters are in a native circuit context, where the TF target genes can have fitness effects on the cell. To quantify these constraints, we created a panel of Escherichia coli strains with different lexA-NAR circuit parameters and analyzed the effect on SOS response function and bacterial fitness. 
Using a mathematical model for NAR, these experimental data were used to calculate NAR parameter values and derive a parameter-fitness landscape. Without feedback, survival of DNA damage was decreased due to high LexA concentrations and slower SOS “turn-on” kinetics. However, we show that, even in the absence of DNA damage, the lexA promoter is strong enough that, without feedback, high levels of lexA expression result in a fitness cost to the cell. Conversely, hyperfeedback can mimic lexA deletion, which is also costly. This work elucidates the lexA-NAR parameter values capable of balancing the cell’s requirement for rapid SOS response activation with limiting its toxicity. IMPORTANCE Feedback mechanisms are critical to control physiological responses. In gene regulation, one important example, termed negative autoregulation (NAR), occurs when a transcription factor (TF) inhibits its own production. NAR is common across the tree of life, enabling rapid homeostatic control of gene expression. NAR behavior can be described in accordance with its core biochemical parameters, but how constrained these parameters are by evolution is unclear. Here, we describe a model genetic network controlled by an NAR circuit within the bacterium Escherichia coli and elucidate these constraints by experimentally changing a key parameter and measuring its effect on circuit response and fitness. This analysis yielded a parameter-fitness landscape representing the genetic network, providing a window into what gene-environment conditions favor evolution of this regulatory strategy.
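The claim that NAR enables more rapid control can be illustrated with a generic textbook-style model. This is a minimal sketch and not the model or the parameter values used in the study: it assumes production follows a Hill-type repression term, `BETA / (1 + (x / K) ** N)`, with first-order loss at rate `ALPHA`, and compares it with an unregulated circuit tuned to reach the same steady state. All constants and function names are illustrative assumptions.

```python
import math


def steady_state(production, alpha, hi=1e6, tol=1e-9):
    """Find x where production(x) == alpha * x by bisection.

    Assumes production(x) - alpha * x is strictly decreasing in x,
    which holds for both models below.
    """
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if production(mid) - alpha * mid > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def half_rise_time(production, alpha, dt=1e-3, t_max=50.0):
    """Forward-Euler time for x to climb from 0 to half its steady state."""
    target = 0.5 * steady_state(production, alpha)
    x, t = 0.0, 0.0
    while x < target and t < t_max:
        x += dt * (production(x) - alpha * x)
        t += dt
    return t


# Illustrative (assumed) parameters, in arbitrary units.
ALPHA = 1.0                 # first-order loss rate
BETA, K, N = 100.0, 1.0, 2  # promoter strength, repression constant, Hill coefficient


def nar(x):
    """Production rate repressed by the protein's own concentration."""
    return BETA / (1.0 + (x / K) ** N)


x_ss = steady_state(nar, ALPHA)


def open_loop(x):
    """Constant production chosen to give the same steady state as NAR."""
    return ALPHA * x_ss


t_nar = half_rise_time(nar, ALPHA)
t_open = half_rise_time(open_loop, ALPHA)
```

Under these assumptions the unregulated circuit reaches half of its steady state at about ln(2)/ALPHA, while the NAR circuit gets there sooner because its strong promoter is only throttled once the protein accumulates.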
60
Ong WK, Midford PE, Karp PD. Taxonomic weighting improves the accuracy of a gap-filling algorithm for metabolic models. Bioinformatics 2020; 36:1823-1830. [PMID: 31688932 PMCID: PMC7523652 DOI: 10.1093/bioinformatics/btz813] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 05/29/2019] [Revised: 08/29/2019] [Accepted: 10/31/2019] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION The increasing availability of annotated genome sequences enables construction of genome-scale metabolic networks, which are useful tools for studying organisms of interest. However, due to incomplete genome annotations, draft metabolic models contain gaps that must be filled in a time-consuming process before they are usable. Optimization-based algorithms that fill these gaps have been developed; however, gap-filling algorithms show significant error rates and often introduce incorrect reactions. RESULTS Here, we present a new gap-filling method that computes the costs of candidate gap-filling reactions from a universal reaction database (MetaCyc) based on taxonomic information. When gap-filling a metabolic model for an organism M (such as Escherichia coli), the cost for reaction R is based on the frequency with which R occurs in other organisms within the phylum of M (in this case, Proteobacteria). The assumption behind this method is that different taxonomic groups are biased toward using different metabolic reactions. Evaluation of the new gap-filler on randomly degraded variants of the EcoCyc metabolic model for E. coli showed an increase in the average F1-score to 99.0 (when using the variable weights by frequency method at the phylum level), compared to 91.0 using the previous MetaFlux gap-filler and 80.3 using a basic gap-filler. Evaluation on two other microbial metabolic models showed similar improvements. AVAILABILITY AND IMPLEMENTATION The Pathway Tools software (including MetaFlux) is free for academic use and is available at http://pathwaytools.com. Additional code for reproducing the results presented here is available at www.ai.sri.com/pkarp/pubs/taxgap/supplementary.zip. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
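The idea of weighting candidate reactions by their frequency within a phylum can be sketched as follows. The abstract does not give the actual cost function, so the linear mapping from frequency to cost, the `base_cost` and `min_cost` parameters, and the reaction identifiers are all assumptions for illustration; this is not the MetaFlux implementation.

```python
from collections import Counter


def reaction_costs(candidates, phylum_models, base_cost=1.0, min_cost=0.1):
    """Assign each candidate reaction a cost that falls as it becomes
    more common among the metabolic models of organisms in the same phylum.

    phylum_models is a list of reaction-ID collections, one per organism.
    The linear frequency-to-cost mapping is an assumed, illustrative choice.
    """
    n = len(phylum_models)
    # Count in how many organisms each reaction occurs (once per organism).
    counts = Counter(r for model in phylum_models for r in set(model))
    costs = {}
    for r in candidates:
        freq = counts[r] / n if n else 0.0
        costs[r] = min_cost + (base_cost - min_cost) * (1.0 - freq)
    return costs


# Hypothetical reaction sets for four organisms in one phylum.
phylum_models = [{"R1", "R2"}, {"R1"}, {"R1", "R3"}, {"R2"}]
costs = reaction_costs(["R1", "R2", "R4"], phylum_models)
```

In this toy example `R1` (present in 3 of 4 organisms) is the cheapest candidate and `R4` (absent from the phylum) keeps the full base cost, so an optimizer minimizing total cost would prefer reactions that are typical for the taxonomic group.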
Affiliation(s)
- Wai Kit Ong
  - Bioinformatics Research Group, Artificial Intelligence Center, SRI International, Menlo Park, CA 94025, USA
- Peter E Midford
  - Bioinformatics Research Group, Artificial Intelligence Center, SRI International, Menlo Park, CA 94025, USA
- Peter D Karp
  - Bioinformatics Research Group, Artificial Intelligence Center, SRI International, Menlo Park, CA 94025, USA
|
61
|
Hör J, Matera G, Vogel J, Gottesman S, Storz G. Trans-Acting Small RNAs and Their Effects on Gene Expression in Escherichia coli and Salmonella enterica. EcoSal Plus 2020; 9:10.1128/ecosalplus.ESP-0030-2019. [PMID: 32213244 PMCID: PMC7112153 DOI: 10.1128/ecosalplus.esp-0030-2019] [Citation(s) in RCA: 117] [Impact Index Per Article: 23.4] [Received: 11/23/2019] [Indexed: 12/20/2022]
Abstract
The last few decades have led to an explosion in our understanding of the major roles that small regulatory RNAs (sRNAs) play in regulatory circuits and the responses to stress in many bacterial species. Much of the foundational work was carried out with Escherichia coli and Salmonella enterica serovar Typhimurium. The studies of these organisms provided an overview of how the sRNAs function and their impact on bacterial physiology, serving as a blueprint for sRNA biology in many other prokaryotes. They also led to the development of new technologies. In this chapter, we first summarize how these sRNAs were identified, defining them in the process. We discuss how they are regulated and how they act and provide selected examples of their roles in regulatory circuits and the consequences of this regulation. Throughout, we summarize the methodologies that were developed to identify and study the regulatory RNAs, most of which are applicable to other bacteria. Newly updated databases of the known sRNAs in E. coli K-12 and S. enterica Typhimurium SL1344 serve as a reference point for much of the discussion and, hopefully, as a resource for readers and for future experiments to address open questions raised in this review.
Affiliation(s)
- Jens Hör
  - Institute of Molecular Infection Biology, University of Würzburg, 97080 Würzburg, Germany
- Gianluca Matera
  - Institute of Molecular Infection Biology, University of Würzburg, 97080 Würzburg, Germany
- Jörg Vogel
  - Helmholtz Institute for RNA-based Infection Research (HIRI), 97080 Würzburg, Germany
  - Institute of Molecular Infection Biology, University of Würzburg, 97080 Würzburg, Germany
- Susan Gottesman
  - Laboratory of Molecular Biology, National Cancer Institute, Bethesda, MD 20892
- Gisela Storz
  - Division of Molecular and Cellular Biology, Eunice Kennedy Shriver National Institute of Child Health and Human Development, Bethesda, MD 20892
|
62
|
Caspi R, Billington R, Keseler IM, Kothari A, Krummenacker M, Midford PE, Ong WK, Paley S, Subhraveti P, Karp PD. The MetaCyc database of metabolic pathways and enzymes - a 2019 update. Nucleic Acids Res 2020; 48:D445-D453. [PMID: 31586394 PMCID: PMC6943030 DOI: 10.1093/nar/gkz862] [Citation(s) in RCA: 676] [Impact Index Per Article: 135.2] [Received: 09/10/2019] [Revised: 09/19/2019] [Accepted: 10/01/2019] [Indexed: 11/18/2022] Open
Abstract
MetaCyc (MetaCyc.org) is a comprehensive reference database of metabolic pathways and enzymes from all domains of life. It contains 2749 pathways derived from more than 60 000 publications, making it the largest curated collection of metabolic pathways. The data in MetaCyc are evidence-based and richly curated, resulting in an encyclopedic reference tool for metabolism. MetaCyc is also used as a knowledge base for generating thousands of organism-specific Pathway/Genome Databases (PGDBs), which are available in BioCyc.org and other genomic portals. This article provides an update on the developments in MetaCyc during September 2017 to August 2019, up to version 23.1. Some of the topics that received intensive curation during this period include cobamides biosynthesis, sterol metabolism, fatty acid biosynthesis, lipid metabolism, carotenoid metabolism, protein glycosylation, antibiotics and cytotoxins biosynthesis, siderophore biosynthesis, bioluminescence, vitamin K metabolism, brominated compound metabolism, plant secondary metabolism and human metabolism. Other additions include modifications to the GlycanBuilder software that enable displaying glycans using symbolic representation, improved graphics and fonts for web displays, improvements in the PathoLogic component of Pathway Tools, and the optional addition of regulatory information to pathway diagrams.
Affiliation(s)
- Ron Caspi
  - SRI International, 333 Ravenswood Ave, Menlo Park, CA 94025, USA
- Ingrid M Keseler
  - SRI International, 333 Ravenswood Ave, Menlo Park, CA 94025, USA
- Anamika Kothari
  - SRI International, 333 Ravenswood Ave, Menlo Park, CA 94025, USA
- Peter E Midford
  - SRI International, 333 Ravenswood Ave, Menlo Park, CA 94025, USA
- Wai Kit Ong
  - SRI International, 333 Ravenswood Ave, Menlo Park, CA 94025, USA
- Suzanne Paley
  - SRI International, 333 Ravenswood Ave, Menlo Park, CA 94025, USA
- Peter D Karp
  - SRI International, 333 Ravenswood Ave, Menlo Park, CA 94025, USA
|
63
|
Pinu FR, Beale DJ, Paten AM, Kouremenos K, Swarup S, Schirra HJ, Wishart D. Systems Biology and Multi-Omics Integration: Viewpoints from the Metabolomics Research Community. Metabolites 2019; 9:E76. [PMID: 31003499 PMCID: PMC6523452 DOI: 10.3390/metabo9040076] [Citation(s) in RCA: 342] [Impact Index Per Article: 57.0] [Received: 03/25/2019] [Revised: 04/15/2019] [Accepted: 04/16/2019] [Indexed: 02/07/2023] Open
Abstract
The use of multiple omics techniques (i.e., genomics, transcriptomics, proteomics, and metabolomics) is becoming increasingly popular in all facets of life science. Omics techniques provide a more holistic molecular perspective of studied biological systems compared to traditional approaches. However, due to their inherent data differences, integrating multiple omics platforms remains an ongoing challenge for many researchers. As metabolites represent the downstream products of multiple interactions between genes, transcripts, and proteins, metabolomics and the tools and approaches routinely used in this field could assist with the integration of these complex multi-omics data sets. The question is, how? Here we provide some answers (in terms of methods, software tools and databases) along with a variety of recommendations and a list of continuing challenges as identified during a peer session on multi-omics integration that was held at the recent 'Australian and New Zealand Metabolomics Conference' (ANZMET 2018) in Auckland, New Zealand (Sept. 2018). We envisage that this document will serve as a guide to metabolomics researchers and other members of the community wishing to perform multi-omics studies. We also believe that these ideas may allow the full promise of integrated multi-omics research and, ultimately, of systems biology to be realized.
Affiliation(s)
- Farhana R Pinu
  - The New Zealand Institute for Plant and Food Research Limited, Private Bag 92169, Auckland 1142, New Zealand.
- David J Beale
  - Land and Water, Commonwealth Scientific and Industrial Research Organization (CSIRO), Ecosciences Precinct, Dutton Park, Dutton Park, QLD 4102, Australia.
- Amy M Paten
  - Land and Water, Commonwealth Scientific and Industrial Research Organization (CSIRO), Research and Innovation Park, Acton, ACT 2601, Australia.
- Konstantinos Kouremenos
  - Trajan Scientific and Medical, Ringwood, VIC 3134, Australia.
  - Bio21 Institute, The University of Melbourne, Parkville, VIC 3010, Australia.
- Sanjay Swarup
  - Department of Biological Sciences, National University of Singapore, Singapore 117411, Singapore.
- Horst J Schirra
  - Centre for Advanced Imaging, The University of Queensland, St Lucia, QLD 4072, Australia.
- David Wishart
  - Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E8, Canada.
  - Department of Computing Science, University of Alberta, Edmonton, AB T6G 2E8, Canada.
|