1. Karp PD, Paley S, Caspi R, Kothari A, Krummenacker M, Midford PE, Moore LR, Subhraveti P, Gama-Castro S, Tierrafria VH, Lara P, Muñiz-Rascado L, Bonavides-Martinez C, Santos-Zavaleta A, Mackie A, Sun G, Ahn-Horst TA, Choi H, Juenemann R, Knudsen CNM, Covert MW, Collado-Vides J, Paulsen I. The EcoCyc database (2025). EcoSal Plus 2025:eesp00192024. [PMID: 40304522 DOI: 10.1128/ecosalplus.esp-0019-2024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2024] [Accepted: 03/18/2025] [Indexed: 05/02/2025]
Abstract
EcoCyc is a bioinformatics database (DB) available at EcoCyc.org that describes the genome and the biochemical machinery of Escherichia coli K-12 MG1655. The long-term goal of the project is to describe the complete molecular catalog of the E. coli cell, as well as the functions of each of its molecular parts, to facilitate a system-level understanding of E. coli. EcoCyc is an electronic reference source for E. coli biologists and for biologists who work with related microorganisms. The database includes information pages on each E. coli gene product, metabolite, reaction, operon, and metabolic pathway. The database also includes information on the regulation of gene expression, E. coli gene essentiality, and nutrient conditions that do or do not support the growth of E. coli. The website and downloadable software contain tools for the analysis of high-throughput data sets. In addition, a steady-state metabolic flux model is generated from each new version of EcoCyc and can be executed via EcoCyc.org. The model can predict metabolic flux rates, nutrient uptake rates, and growth rates for different gene knockouts and nutrient conditions. Data generated from a whole-cell model that is parameterized from the latest data on EcoCyc are also available. This review outlines the data content of EcoCyc and the procedures by which this content is generated.
Affiliation(s)
- Peter D Karp: Bioinformatics Research Group, SRI International, Menlo Park, California, USA
- Suzanne Paley: Bioinformatics Research Group, SRI International, Menlo Park, California, USA
- Ron Caspi: Bioinformatics Research Group, SRI International, Menlo Park, California, USA
- Anamika Kothari: Bioinformatics Research Group, SRI International, Menlo Park, California, USA
- Markus Krummenacker: Bioinformatics Research Group, SRI International, Menlo Park, California, USA
- Peter E Midford: Bioinformatics Research Group, SRI International, Menlo Park, California, USA
- Lisa R Moore: Bioinformatics Research Group, SRI International, Menlo Park, California, USA
- Pallavi Subhraveti: Bioinformatics Research Group, SRI International, Menlo Park, California, USA
- Socorro Gama-Castro: Programa de Genómica Computacional, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Morelos, Mexico
- Víctor H Tierrafria: Programa de Genómica Computacional, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Morelos, Mexico
- Paloma Lara: Programa de Genómica Computacional, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Morelos, Mexico
- Luis Muñiz-Rascado: Programa de Genómica Computacional, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Morelos, Mexico
- César Bonavides-Martinez: Programa de Genómica Computacional, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Morelos, Mexico
- Alberto Santos-Zavaleta: Programa de Genómica Computacional, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Morelos, Mexico
- Amanda Mackie: School of Natural Sciences, Macquarie University, Sydney, New South Wales, Australia
- Gwanggyu Sun: Department of Bioengineering, Stanford University, Stanford, California, USA
- Travis A Ahn-Horst: Department of Bioengineering, Stanford University, Stanford, California, USA
- Heejo Choi: Department of Bioengineering, Stanford University, Stanford, California, USA
- Riley Juenemann: Department of Bioengineering, Stanford University, Stanford, California, USA
- Cyrus N M Knudsen: Department of Bioengineering, Stanford University, Stanford, California, USA
- Markus W Covert: Department of Bioengineering, Stanford University, Stanford, California, USA
- Julio Collado-Vides: Programa de Genómica Computacional, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Morelos, Mexico
- Ian Paulsen: School of Natural Sciences, Macquarie University, Sydney, New South Wales, Australia
2. Karp PD, Paley S, Caspi R, Kothari A, Krummenacker M, Midford PE, Moore LR, Subhraveti P, Gama-Castro S, Tierrafria VH, Lara P, Muñiz-Rascado L, Bonavides-Martinez C, Santos-Zavaleta A, Mackie A, Sun G, Ahn-Horst TA, Choi H, Covert MW, Collado-Vides J, Paulsen I. The EcoCyc Database (2023). EcoSal Plus 2023; 11:eesp00022023. [PMID: 37220074 PMCID: PMC10729931 DOI: 10.1128/ecosalplus.esp-0002-2023] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 03/13/2023] [Accepted: 04/04/2023] [Indexed: 01/28/2024]
Abstract
EcoCyc is a bioinformatics database available online at EcoCyc.org that describes the genome and the biochemical machinery of Escherichia coli K-12 MG1655. The long-term goal of the project is to describe the complete molecular catalog of the E. coli cell, as well as the functions of each of its molecular parts, to facilitate a system-level understanding of E. coli. EcoCyc is an electronic reference source for E. coli biologists and for biologists who work with related microorganisms. The database includes information pages on each E. coli gene product, metabolite, reaction, operon, and metabolic pathway. The database also includes information on the regulation of gene expression, E. coli gene essentiality, and nutrient conditions that do or do not support the growth of E. coli. The website and downloadable software contain tools for the analysis of high-throughput data sets. In addition, a steady-state metabolic flux model is generated from each new version of EcoCyc and can be executed online. The model can predict metabolic flux rates, nutrient uptake rates, and growth rates for different gene knockouts and nutrient conditions. Data generated from a whole-cell model that is parameterized from the latest data on EcoCyc are also available. This review outlines the data content of EcoCyc and the procedures by which this content is generated.
Affiliation(s)
- Peter D. Karp: Bioinformatics Research Group, SRI International, Menlo Park, California, USA
- Suzanne Paley: Bioinformatics Research Group, SRI International, Menlo Park, California, USA
- Ron Caspi: Bioinformatics Research Group, SRI International, Menlo Park, California, USA
- Anamika Kothari: Bioinformatics Research Group, SRI International, Menlo Park, California, USA
- Markus Krummenacker: Bioinformatics Research Group, SRI International, Menlo Park, California, USA
- Peter E. Midford: Bioinformatics Research Group, SRI International, Menlo Park, California, USA
- Lisa R. Moore: Bioinformatics Research Group, SRI International, Menlo Park, California, USA
- Pallavi Subhraveti: Bioinformatics Research Group, SRI International, Menlo Park, California, USA
- Socorro Gama-Castro: Programa de Genómica Computacional, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Morelos, México
- Victor H. Tierrafria: Programa de Genómica Computacional, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Morelos, México
- Paloma Lara: Programa de Genómica Computacional, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Morelos, México
- Luis Muñiz-Rascado: Programa de Genómica Computacional, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Morelos, México
- César Bonavides-Martinez: Programa de Genómica Computacional, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Morelos, México
- Alberto Santos-Zavaleta: Programa de Genómica Computacional, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Morelos, México
- Amanda Mackie: Department of Chemistry and Biomolecular Sciences, Macquarie University, Sydney, New South Wales, Australia
- Gwanggyu Sun: Department of Bioengineering, Stanford University, Stanford, California, USA
- Travis A. Ahn-Horst: Department of Bioengineering, Stanford University, Stanford, California, USA
- Heejo Choi: Department of Bioengineering, Stanford University, Stanford, California, USA
- Markus W. Covert: Department of Bioengineering, Stanford University, Stanford, California, USA
- Julio Collado-Vides: Programa de Genómica Computacional, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Morelos, México
- Ian Paulsen: School of Natural Sciences, Macquarie University, Sydney, New South Wales, Australia
3. Espinosa-Cantú A, Cruz-Bonilla E, Noda-Garcia L, DeLuna A. Multiple Forms of Multifunctional Proteins in Health and Disease. Front Cell Dev Biol 2020; 8:451. [PMID: 32587857 PMCID: PMC7297953 DOI: 10.3389/fcell.2020.00451] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 12/03/2019] [Accepted: 05/14/2020] [Indexed: 12/23/2022] [Open access]
Abstract
Protein science has moved from a focus on individual molecules to an integrated perspective in which proteins emerge as dynamic players with multiple functions, rather than monofunctional specialists. Annotation of the full functional repertoire of proteins has impacted the fields of biochemistry and genetics, and will continue to influence basic and applied science questions - from the genotype-to-phenotype problem, to our understanding of human pathologies and drug design. In this review, we address the phenomena of pleiotropy, multidomain proteins, promiscuity, and protein moonlighting, providing examples of multitasking biomolecules that underlie specific mechanisms of human disease. In doing so, we place in context different types of multifunctional proteins, highlighting useful attributes for their systematic definition and classification in future research directions.
Affiliation(s)
- Adriana Espinosa-Cantú: Unidad de Genómica Avanzada (Langebio), Centro de Investigación y de Estudios Avanzados, Guanajuato, Mexico
- Erika Cruz-Bonilla: Unidad de Genómica Avanzada (Langebio), Centro de Investigación y de Estudios Avanzados, Guanajuato, Mexico
- Lianet Noda-Garcia: Department of Plant Pathology and Microbiology, Robert H. Smith Faculty of Agriculture, Food, and Environment, The Hebrew University of Jerusalem, Rehovot, Israel
- Alexander DeLuna: Unidad de Genómica Avanzada (Langebio), Centro de Investigación y de Estudios Avanzados, Guanajuato, Mexico
4. Gonçalves MCP, Kieckbusch TG, Perna RF, Fujimoto JT, Morales SAV, Romanelli JP. Trends on enzyme immobilization researches based on bibliometric analysis. Process Biochem 2019. [DOI: 10.1016/j.procbio.2018.09.016] [Citation(s) in RCA: 81] [Impact Index Per Article: 13.5] [Indexed: 12/19/2022]
5.
Abstract
Many proteins assemble into homomultimeric structures, with a number of subunits that can vary substantially among phylogenetic lineages. As protein-protein interactions require productive encounters among subunits, such variation might partially be explained by variation in cellular protein abundance. Protein abundance in turn depends on the intrinsic rates of production and decay of mRNA and protein molecules, as well as rates of cell growth and division. Using a stochastic framework for prediction of the multimeric state of a protein as a function of these processes and the free energy associated with interface-interface binding, we demonstrate agreement with a wide class of proteins using E. coli proteome data. As such, this platform, which links protein quaternary structure with biochemical rates governing gene expression, protein association and dissociation, and cell growth and division, can be extended to evolutionary models for the emergence and diversification of multimers. While it is tempting to think of multimerization as adaptive, the diversity of multimeric states raises the question of its functional role and impact on fitness. As a force driving selection, we consider the possible increase in enzymatic activity of proteins arising strictly as a consequence of interface-interface binding, namely enhanced stability to degradation, substrate binding affinity, or catalytic rate of multimers with respect to monomers, without invoking further conformational changes, as in allostery. For fixed cost of protein production, we find a benefit conferred by multimers that is dependent on context and can therefore become different in diverging lineages.
Affiliation(s)
- Kyle Hagner: Department of Physics, Indiana University, Bloomington, Indiana 47405, USA
- Sima Setayeshgar: Department of Physics, Indiana University, Bloomington, Indiana 47405, USA
- Michael Lynch: Biodesign Center for Mechanisms of Evolution, Arizona State University, Tempe, Arizona 85287, USA
6. Karp PD, Ong WK, Paley S, Billington R, Caspi R, Fulcher C, Kothari A, Krummenacker M, Latendresse M, Midford PE, Subhraveti P, Gama-Castro S, Muñiz-Rascado L, Bonavides-Martinez C, Santos-Zavaleta A, Mackie A, Collado-Vides J, Keseler IM, Paulsen I. The EcoCyc Database. EcoSal Plus 2018; 8:10.1128/ecosalplus.ESP-0006-2018. [PMID: 30406744 PMCID: PMC6504970 DOI: 10.1128/ecosalplus.esp-0006-2018] [Citation(s) in RCA: 63] [Impact Index Per Article: 9.0] [Received: 06/05/2018] [Indexed: 01/28/2023]
Abstract
EcoCyc is a bioinformatics database available at EcoCyc.org that describes the genome and the biochemical machinery of Escherichia coli K-12 MG1655. The long-term goal of the project is to describe the complete molecular catalog of the E. coli cell, as well as the functions of each of its molecular parts, to facilitate a system-level understanding of E. coli. EcoCyc is an electronic reference source for E. coli biologists and for biologists who work with related microorganisms. The database includes information pages on each E. coli gene product, metabolite, reaction, operon, and metabolic pathway. The database also includes information on E. coli gene essentiality and on nutrient conditions that do or do not support the growth of E. coli. The website and downloadable software contain tools for analysis of high-throughput data sets. In addition, a steady-state metabolic flux model is generated from each new version of EcoCyc and can be executed via EcoCyc.org. The model can predict metabolic flux rates, nutrient uptake rates, and growth rates for different gene knockouts and nutrient conditions. This review outlines the data content of EcoCyc and the procedures by which this content is generated.
Affiliation(s)
- Peter D Karp: Bioinformatics Research Group, SRI International, Menlo Park, CA 94025
- Wai Kit Ong: Bioinformatics Research Group, SRI International, Menlo Park, CA 94025
- Suzanne Paley: Bioinformatics Research Group, SRI International, Menlo Park, CA 94025
- Ron Caspi: Bioinformatics Research Group, SRI International, Menlo Park, CA 94025
- Carol Fulcher: Bioinformatics Research Group, SRI International, Menlo Park, CA 94025
- Anamika Kothari: Bioinformatics Research Group, SRI International, Menlo Park, CA 94025
- Mario Latendresse: Bioinformatics Research Group, SRI International, Menlo Park, CA 94025
- Peter E Midford: Bioinformatics Research Group, SRI International, Menlo Park, CA 94025
- Socorro Gama-Castro: Programa de Genómica Computacional, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, A.P. 565-A, Cuernavaca, Morelos 62100, México
- Luis Muñiz-Rascado: Programa de Genómica Computacional, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, A.P. 565-A, Cuernavaca, Morelos 62100, México
- César Bonavides-Martinez: Programa de Genómica Computacional, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, A.P. 565-A, Cuernavaca, Morelos 62100, México
- Alberto Santos-Zavaleta: Programa de Genómica Computacional, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, A.P. 565-A, Cuernavaca, Morelos 62100, México
- Amanda Mackie: Department of Chemistry and Biomolecular Sciences, Macquarie University, Sydney, NSW 2109, Australia
- Julio Collado-Vides: Programa de Genómica Computacional, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, A.P. 565-A, Cuernavaca, Morelos 62100, México
- Ingrid M Keseler: Bioinformatics Research Group, SRI International, Menlo Park, CA 94025
- Ian Paulsen: Department of Chemistry and Biomolecular Sciences, Macquarie University, Sydney, NSW 2109, Australia
7. Garcia-Ruiz E, HamediRad M, Zhao H. Pathway Design, Engineering, and Optimization. Adv Biochem Eng Biotechnol 2018; 162:77-116. [PMID: 27629378 DOI: 10.1007/10_2016_12] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 02/16/2023]
Abstract
The microbial metabolic versatility found in nature has inspired scientists to create microorganisms capable of producing value-added compounds. Many endeavors have been made to transfer and/or combine pathways, existing or even engineered enzymes with new function to tractable microorganisms to generate new metabolic routes for drug, biofuel, and specialty chemical production. However, the success of these pathways can be impeded by different complications from an inherent failure of the pathway to cell perturbations. Pursuing ways to overcome these shortcomings, a wide variety of strategies have been developed. This chapter will review the computational algorithms and experimental tools used to design efficient metabolic routes, and construct and optimize biochemical pathways to produce chemicals of high interest.
Affiliation(s)
- Eva Garcia-Ruiz: Department of Chemical and Biomolecular Engineering, Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
- Mohammad HamediRad: Department of Chemical and Biomolecular Engineering, Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
- Huimin Zhao: Department of Chemical and Biomolecular Engineering, Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA; Departments of Chemistry, Biochemistry, and Bioengineering, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
8. Kachroo AH, Laurent JM, Akhmetov A, Szilagyi-Jones M, McWhite CD, Zhao A, Marcotte EM. Systematic bacterialization of yeast genes identifies a near-universally swappable pathway. eLife 2017; 6:e25093. [PMID: 28661399 PMCID: PMC5536947 DOI: 10.7554/elife.25093] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 01/12/2017] [Accepted: 06/26/2017] [Indexed: 11/13/2022] [Open access]
Abstract
Eukaryotes and prokaryotes last shared a common ancestor ~2 billion years ago, and while many present-day genes in these lineages predate this divergence, the extent to which these genes still perform their ancestral functions is largely unknown. To test principles governing retention of ancient function, we asked if prokaryotic genes could replace their essential eukaryotic orthologs. We systematically replaced essential genes in yeast by their 1:1 orthologs from Escherichia coli. After accounting for mitochondrial localization and alternative start codons, 31 out of 51 bacterial genes tested (61%) could complement a lethal growth defect and replace their yeast orthologs with minimal effects on growth rate. Replaceability was determined on a pathway-by-pathway basis; codon usage, abundance, and sequence similarity contributed predictive power. The heme biosynthesis pathway was particularly amenable to inter-kingdom exchange, with each yeast enzyme replaceable by its bacterial, human, or plant ortholog, suggesting it as a near-universally swappable pathway.
Affiliation(s)
- Aashiq H Kachroo: Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, United States
- Jon M Laurent: Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, United States
- Azat Akhmetov: Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, United States
- Madelyn Szilagyi-Jones: Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, United States
- Claire D McWhite: Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, United States
- Alice Zhao: Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, United States
- Edward M Marcotte: Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, United States; Department of Molecular Biosciences, University of Texas at Austin, Austin, United States
9. Karp PD, Weaver D, Paley S, Fulcher C, Kubo A, Kothari A, Krummenacker M, Subhraveti P, Weerasinghe D, Gama-Castro S, Huerta AM, Muñiz-Rascado L, Bonavides-Martinez C, Weiss V, Peralta-Gil M, Santos-Zavaleta A, Schröder I, Mackie A, Gunsalus R, Collado-Vides J, Keseler IM, Paulsen I. The EcoCyc Database. EcoSal Plus 2014; 6:10.1128/ecosalplus.ESP-0009-2013. [PMID: 26442933 PMCID: PMC4243172 DOI: 10.1128/ecosalplus.esp-0009-2013] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.1] [Received: 01/16/2014] [Indexed: 11/20/2022]
Abstract
EcoCyc is a bioinformatics database available at EcoCyc.org that describes the genome and the biochemical machinery of Escherichia coli K-12 MG1655. The long-term goal of the project is to describe the complete molecular catalog of the E. coli cell, as well as the functions of each of its molecular parts, to facilitate a system-level understanding of E. coli. EcoCyc is an electronic reference source for E. coli biologists and for biologists who work with related microorganisms. The database includes information pages on each E. coli gene, metabolite, reaction, operon, and metabolic pathway. The database also includes information on E. coli gene essentiality and on nutrient conditions that do or do not support the growth of E. coli. The website and downloadable software contain tools for analysis of high-throughput data sets. In addition, a steady-state metabolic flux model is generated from each new version of EcoCyc. The model can predict metabolic flux rates, nutrient uptake rates, and growth rates for different gene knockouts and nutrient conditions. This review provides a detailed description of the data content of EcoCyc and of the procedures by which this content is generated.
Affiliation(s)
- Peter D Karp: Bioinformatics Research Group, SRI International, Menlo Park, CA 94025
- Daniel Weaver: Bioinformatics Research Group, SRI International, Menlo Park, CA 94025
- Suzanne Paley: Bioinformatics Research Group, SRI International, Menlo Park, CA 94025
- Carol Fulcher: Bioinformatics Research Group, SRI International, Menlo Park, CA 94025
- Aya Kubo: Bioinformatics Research Group, SRI International, Menlo Park, CA 94025
- Anamika Kothari: Bioinformatics Research Group, SRI International, Menlo Park, CA 94025
- Socorro Gama-Castro: Programa de Genómica Computacional, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, A.P. 565-A, Cuernavaca, Morelos 62100, México
- Araceli M Huerta: Programa de Genómica Computacional, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, A.P. 565-A, Cuernavaca, Morelos 62100, México
- Luis Muñiz-Rascado: Programa de Genómica Computacional, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, A.P. 565-A, Cuernavaca, Morelos 62100, México
- César Bonavides-Martinez: Programa de Genómica Computacional, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, A.P. 565-A, Cuernavaca, Morelos 62100, México
- Verena Weiss: Programa de Genómica Computacional, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, A.P. 565-A, Cuernavaca, Morelos 62100, México
- Martin Peralta-Gil: Programa de Genómica Computacional, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, A.P. 565-A, Cuernavaca, Morelos 62100, México
- Alberto Santos-Zavaleta: Programa de Genómica Computacional, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, A.P. 565-A, Cuernavaca, Morelos 62100, México
- Imke Schröder: Department of Microbiology, Immunology, and Molecular Genetics, University of California, Los Angeles, CA 90095; UCLA Institute of Genomics and Proteomics, University of California, Los Angeles, CA 90095
- Amanda Mackie: Department of Chemistry and Biomolecular Sciences, Macquarie University, Sydney, NSW 2109, Australia
- Robert Gunsalus: Department of Microbiology, Immunology, and Molecular Genetics, University of California, Los Angeles, CA 90095
- Julio Collado-Vides: Programa de Genómica Computacional, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, A.P. 565-A, Cuernavaca, Morelos 62100, México
- Ingrid M Keseler: Bioinformatics Research Group, SRI International, Menlo Park, CA 94025
- Ian Paulsen: Department of Chemistry and Biomolecular Sciences, Macquarie University, Sydney, NSW 2109, Australia
10. Current and emerging options for taxol production. Adv Biochem Eng Biotechnol 2014; 148:405-25. [PMID: 25528175 DOI: 10.1007/10_2014_292] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Indexed: 12/02/2022]
Abstract
Paclitaxel (trademark "Taxol") is a plant-derived isoprenoid natural product that exhibits potent anticancer activity. Taxol was originally isolated from the Pacific yew tree in 1967 and triggered an intense scientific and engineering venture to provide the compound reliably to cancer patients. The choices available for production include synthetic and biosynthetic routes (and combinations thereof). This chapter focuses on the currently utilized and emerging biosynthetic options for Taxol production. A particular emphasis is placed on the biosynthetic production hosts including macroscopic and unicellular plant species and more recent attempts to elucidate, transfer, and reconstitute the Taxol pathway within technically advanced microbial hosts. In so doing, we provide the reader with relevant background related to Taxol and more general information related to producing valuable, but structurally complex, natural products through biosynthetic strategies.
11. Evolution of tryptophan biosynthetic pathway in microbial genomes: a comparative genetic study. Syst Synth Biol 2013; 8:59-72. [PMID: 24592292 DOI: 10.1007/s11693-013-9127-1] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 05/22/2013] [Revised: 10/05/2013] [Accepted: 10/08/2013] [Indexed: 10/26/2022]
Abstract
The study of biosynthetic pathway evolution needs to consider the evolution of the group of genes that code for the enzymes catalysing the multiple chemical reaction steps leading to the final end product. The tryptophan biosynthetic pathway has five chemical reaction steps that are highly conserved in diverse microbial genomes, though the genes of the pathway enzymes show considerable variation in arrangement, operon structure (gene fusion and splitting) and regulation. We use a combined bioinformatic and statistical approach to ask whether the pathway genes from different microbial genomes, belonging to a wide range of groups, show similar evolutionary relationships within and between them. Our analyses involved detailed study of gene organization (fusion/splitting events), base composition, relative synonymous codon usage pattern of the genes, gene expressivity, amino acid usage, etc. to assess inter- and intra-genic variations, between and within the pathway genes, in a diverse group of microorganisms. We describe these genetic and genomic variations in the tryptophan pathway genes in different microorganisms to show the similarities across organisms, and compare the same genes across different organisms to find the variability possibly arising from horizontal gene transfers. Such studies form the basis for moving from single gene evolution to pathway evolutionary studies that are important steps towards understanding the systems biology of intracellular pathways.
12. Karp PD, Keseler IM, Shearer A, Latendresse M, Krummenacker M, Paley SM, Paulsen I, Collado-Vides J, Gama-Castro S, Peralta-Gil M, Santos-Zavaleta A, Peñaloza-Spínola MI, Bonavides-Martinez C, Ingraham J. Multidimensional annotation of the Escherichia coli K-12 genome. Nucleic Acids Res 2007; 35:7577-90. [PMID: 17940092 PMCID: PMC2190727 DOI: 10.1093/nar/gkm740] [Citation(s) in RCA: 121] [Impact Index Per Article: 6.7] [Received: 06/08/2007] [Revised: 08/09/2007] [Accepted: 09/06/2007] [Indexed: 11/22/2022] [Open access]
Abstract
The annotation of the Escherichia coli K-12 genome in the EcoCyc database is one of the most accurate, complete and multidimensional genome annotations. Of the 4460 E. coli genes, EcoCyc assigns biochemical functions to 76%, and 66% of all genes had their functions determined experimentally. EcoCyc assigns E. coli genes to Gene Ontology and to MultiFun. Seventy-five percent of gene products contain reviews authored by the EcoCyc project that summarize the experimental literature about the gene product. EcoCyc information was derived from 15 000 publications. The database contains extensive descriptions of E. coli cellular networks, describing its metabolic, transport and transcriptional regulatory processes. A comparison to genome annotations for other model organisms shows that the E. coli genome contains the most experimentally determined gene functions in both relative and absolute terms: 2941 (66%) for E. coli, 2319 (37%) for Saccharomyces cerevisiae, 1816 (5%) for Arabidopsis thaliana, 1456 (4%) for Mus musculus and 614 (4%) for Drosophila melanogaster. Database queries to EcoCyc survey the global properties of E. coli cellular networks and illuminate the extent of information gaps for E. coli, such as dead-end metabolites. EcoCyc provides a genome browser with novel properties, and a novel interactive display of transcriptional regulatory networks.
Affiliation(s)
- Peter D Karp
- SRI International, 333 Ravenswood Ave EK207, Menlo Park, CA 94025, USA.
13
Tehlivets O, Scheuringer K, Kohlwein SD. Fatty acid synthesis and elongation in yeast. Biochim Biophys Acta Mol Cell Biol Lipids 2007; 1771:255-70. [PMID: 16950653 DOI: 10.1016/j.bbalip.2006.07.004] [Citation(s) in RCA: 309] [Impact Index Per Article: 17.2] [Received: 05/19/2006] [Revised: 07/14/2006] [Accepted: 07/17/2006] [Indexed: 12/30/2022]
Abstract
Fatty acids are essential compounds in the cell. Since the yeast Saccharomyces cerevisiae does not typically feed on fatty acids, cellular function and growth rely on endogenous synthesis. Since all cellular organelles are involved in, or dependent on, fatty acid synthesis, multiple levels of control may exist to ensure proper fatty acid composition and homeostasis. In this review, we summarize what is currently known about enzymes involved in cellular fatty acid synthesis and elongation, and discuss potential links between fatty acid metabolism, physiology and cellular regulation.
Affiliation(s)
- Oksana Tehlivets
- Institute of Molecular Biosciences, University of Graz, A8010 Graz, Austria
14
Apic G, Huber W, Teichmann SA. Multi-domain protein families and domain pairs: comparison with known structures and a random model of domain recombination. J Struct Funct Genomics 2004; 4:67-78. [PMID: 14649290 DOI: 10.1023/a:1026113408773] [Citation(s) in RCA: 76] [Impact Index Per Article: 3.6] [Indexed: 11/12/2022]
Abstract
There is a limited repertoire of domain families in nature that are duplicated and combined in different ways to form the set of proteins in a genome. Most proteins in both prokaryote and eukaryote genomes consist of two or more domains, and we show that the family size distribution of multi-domain protein families follows a power law like that of individual families. Most domain pairs occur in four to six different domain architectures: in isolation and in combinations with different partners. We showed previously that within the set of all pairwise domain combinations, most small and medium-sized families are observed in combination with one or two other families, while a few large families are very versatile and combine with many different partners. Though this may appear to be a stochastic pattern, in which large families have more combination partners by virtue of their size, we establish here that all the domain families with more than three members in genomes are duplicated more frequently than would be expected by chance considering their number of neighbouring domains. This duplication of domain pairs is statistically significant for between one and three quarters of all families with seven or more members. For the majority of pairwise domain combinations, there is no known three-dimensional structure of the two domains together, and we term these novel combinations. Novel domain combinations are interesting and important targets for structural elucidation, as the geometry and interaction between the domains will help understand the function and evolution of multi-domain proteins. Of particular interest are those combinations that occur in the largest number of multi-domain proteins, and several of these frequent novel combinations contain DNA-binding domains.
Affiliation(s)
- Gordana Apic
- MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 2QH, UK
15
Wang LH, He Y, Gao Y, Wu JE, Dong YH, He C, Wang SX, Weng LX, Xu JL, Tay L, Fang RX, Zhang LH. A bacterial cell-cell communication signal with cross-kingdom structural analogues. Mol Microbiol 2004; 51:903-12. [PMID: 14731288 DOI: 10.1046/j.1365-2958.2003.03883.x] [Citation(s) in RCA: 312] [Impact Index Per Article: 14.9] [Indexed: 11/20/2022]
Abstract
Extracellular signals are the key components of microbial cell-cell communication systems. This report identified a diffusible signal factor (DSF), which regulates virulence in Xanthomonas campestris pv. campestris, as cis-11-methyl-2-dodecenoic acid, an alpha,beta unsaturated fatty acid. Analysis of DSF derivatives established the double bond at the alpha,beta positions as the most important structural feature for DSF biological activity. A range of bacterial pathogens, including several Mycobacterium species, also displayed DSF-like activity. Furthermore, DSF is structurally and functionally related to farnesoic acid (FA), which regulates morphological transition and virulence by Candida albicans, a fungal pathogen. Similar to FA, which is also an alpha,beta unsaturated fatty acid, DSF inhibits the dimorphic transition of C. albicans at a physiologically relevant concentration. We conclude that alpha,beta unsaturated fatty acids represent a new class of extracellular signals for bacterial and fungal cell-cell communications. As prokaryote-eukaryote interactions are ubiquitous, such cross-kingdom conservation in cell-cell communication systems might have significant ecological and economic importance.
Affiliation(s)
- Lian-Hui Wang
- Institute of Molecular and Cell Biology, 30 Medical Drive, Singapore 117609
16
Güldener U, Koehler GJ, Haussmann C, Bacher A, Kricke J, Becher D, Hegemann JH. Characterization of the Saccharomyces cerevisiae Fol1 protein: starvation for C1 carrier induces pseudohyphal growth. Mol Biol Cell 2004; 15:3811-28. [PMID: 15169867 PMCID: PMC491839 DOI: 10.1091/mbc.e03-09-0680] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.6] [Indexed: 11/11/2022]
Abstract
Tetrahydrofolate (vitamin B9) and its folate derivatives are essential cofactors in one-carbon (C1) transfer reactions and absolutely required for the synthesis of a variety of different compounds including methionine and purines. Most plants, microbial eukaryotes, and prokaryotes synthesize folate de novo. We have characterized an important enzyme in this pathway, the Saccharomyces cerevisiae FOL1 gene. Expression of the budding yeast gene FOL1 in Escherichia coli identified the folate biosynthetic enzyme activities dihydroneopterin aldolase (DHNA), 7,8-dihydro-6-hydroxymethylpterin-pyrophosphokinase (HPPK), and dihydropteroate synthase (DHPS). All three enzyme activities were also detected in wild-type yeast strains, whereas fol1Δ deletion strains only showed background activities, thus demonstrating that Fol1p catalyzes three sequential steps of the tetrahydrofolate biosynthetic pathway and thus is the central enzyme of this pathway, which starting from GTP consists of seven enzymatic reactions in total. Fol1p is exclusively localized to mitochondria as shown by fluorescence microscopy and immunoelectron microscopy. FOL1 is an essential gene and the nongrowth phenotype of the fol1 deletion leads to a recessive auxotrophy for folinic acid (5'-formyltetrahydrofolate). Growth of the fol1Δ deletion strain on folinic acid-supplemented rich media induced a dimorphic switch with haploid invasive and filamentous pseudohyphal growth in the presence of glucose and ammonium, which are known suppressors of filamentous and invasive growth. The invasive growth phenotype induced by the depletion of C1 carrier is dependent on the transcription factor Ste12p and the flocculin/adhesin Flo11p, whereas the filamentation phenotype is independent of Ste12p, Tec1p, Phd1p, and Flo11p, suggesting other signaling pathways as well as other adhesion proteins.
Affiliation(s)
- Ulrich Güldener
- Heinrich-Heine-Universität, Funktionelle Genomforschung der Mikroorganismen, 40225 Düsseldorf, Germany
17
Nobeli I, Ponstingl H, Krissinel EB, Thornton JM. A Structure-based Anatomy of the E. coli Metabolome. J Mol Biol 2003; 334:697-719. [PMID: 14636597 DOI: 10.1016/j.jmb.2003.10.008] [Citation(s) in RCA: 75] [Impact Index Per Article: 3.4] [Indexed: 10/26/2022]
Abstract
The Escherichia coli metabolome has been characterised using the two-dimensional structures of 745 metabolites, obtained from the EcoCyc and KEGG databases. Physicochemical properties of the metabolome have been calculated to provide an overview of this set of cognate ligands. A library of fragments commonly found among these molecules has been employed to reveal the main constituents of metabolites, and to assist a broad classification of the metabolome into biochemically relevant classes. Fragment-based fingerprints reveal the metabolome as a continuum in the two-dimensional structural space, where clusters of molecules sharing similar scaffolds can be identified, but are generally overlapping. Nucleotide, carbohydrate and amino acid-like molecules are the most prominent, but at high levels of similarity, a more detailed classification is possible. Classification schemes for the metabolome are a promising tool for understanding the chemical diversity of the metabolome. When used in conjunction with existing classifications of the proteome, they can help to elucidate the binding preferences and promiscuity of proteins and their cognate substrates.
Affiliation(s)
- Irene Nobeli
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
18
Abstract
Most proteins have been formed by gene duplication, recombination, and divergence. Proteins of known structure can be matched to about 50% of genome sequences, and these data provide a quantitative description and can suggest hypotheses about the origins of these processes.
Affiliation(s)
- Cyrus Chothia
- Structural Studies Division, MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 2QH, UK
19
Abstract
Protein translations of over 100 complete genomes are now available. About half of these sequences can be provided with structural annotation, thereby enabling some profound insights into protein and pathway evolution. Whereas the major domain structure families are common to all kingdoms of life, these are combined in different ways in multidomain proteins to give various domain architectures that are specific to kingdoms or individual genomes, and contribute to the diverse phenotypes observed. These data argue for more targets in structural genomics initiatives and particularly for the selection of different domain architectures to gain better insights into protein functions.
Affiliation(s)
- David Lee
- Department of Biochemistry and Molecular Biology, University College, Gower Street, WC1E 6BT, London, UK.
20
Current awareness on yeast. Yeast 2002; 19:1277-84. [PMID: 12400546 DOI: 10.1002/yea.829] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022]
21
Abstract
Small-molecule metabolism forms the core of the metabolic processes of all living organisms. As early as 1945, possible mechanisms for the evolution of such a complex metabolic system were considered. The problem is to explain the appearance and development of a highly regulated complex network of interacting proteins and substrates from a limited structural and functional repertoire. By permitting the co-analysis of phylogeny and metabolism, the combined exploitation of pathway and structural databases, as well as the use of multiple-sequence alignment search algorithms, sheds light on this problem. Much of the current research suggests a chemistry-driven 'patchwork' model of pathway evolution, but other mechanisms may play a role. In the future, as metabolic structure and sequence space are further explored, it should become easier to trace the finer details of pathway development and understand how complexity has evolved.
Affiliation(s)
- Stuart C G Rison
- Department of Biochemistry and Molecular Biology, University College London, Darwin Building, Gower Street, London WC1E 6BT, UK