1
Widney KA, Phillips LC, Rusch LM, Copley SD. A cheater founds the winning lineages during evolution of a novel metabolic pathway. bioRxiv 2025:2025.01.26.634942. [PMID: 39990456 PMCID: PMC11844401 DOI: 10.1101/2025.01.26.634942] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/25/2025]
Abstract
Underground metabolic pathways (leaks in the metabolic network caused by promiscuous enzyme activities and non-enzymatic transformations) can provide the starting point for emergence of novel protopathways if a mutation or environmental change increases flux to a physiologically significant level. This early stage in the evolution of metabolic pathways is typically hidden from our view. We have evolved a novel protopathway in ΔpdxB E. coli, which lacks an enzyme required for synthesis of the essential cofactor pyridoxal 5'-phosphate (PLP). This protopathway comprises four steps catalyzed by promiscuous enzymes that are still serving their native functions. Complex population dynamics occurred during the evolution experiment. The dominant strain after 150 population doublings, JK1, had acquired four mutations. We constructed every intermediate between the ΔpdxB strain and JK1 and identified the order in which mutations arose in JK1 and the physiological effect of each. Three of the mutations together increased the PLP accumulation rate by 32-fold. The second mutation created a cheater that was less fit on its own but thrived in the population by scavenging nutrients released from the fragile parental cells. Notably, the dominant lineages at the end of the experiment all derived from this cheater strain.
Affiliation(s)
- Karl A Widney
- Department of Biochemistry, University of Colorado Boulder, Boulder, CO, 80309, USA
- Cooperative Institute for Research in Environmental Sciences, University of Colorado, Boulder, CO, 80205, USA
- Lauren C Phillips
- Department of Molecular, Cellular and Developmental Biology, University of Colorado Boulder, Boulder, CO, 80309, USA
- Cooperative Institute for Research in Environmental Sciences, University of Colorado, Boulder, CO, 80205, USA
- Leo M Rusch
- Department of Molecular, Cellular and Developmental Biology, University of Colorado Boulder, Boulder, CO, 80309, USA
- Cooperative Institute for Research in Environmental Sciences, University of Colorado, Boulder, CO, 80205, USA
- Shelley D Copley
- Department of Molecular, Cellular and Developmental Biology, University of Colorado Boulder, Boulder, CO, 80309, USA
- Cooperative Institute for Research in Environmental Sciences, University of Colorado, Boulder, CO, 80205, USA
2
Goldford JE, Smith HB, Longo LM, Wing BA, McGlynn SE. Primitive purine biosynthesis connects ancient geochemistry to modern metabolism. Nat Ecol Evol 2024; 8:999-1009. [PMID: 38519634 DOI: 10.1038/s41559-024-02361-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2022] [Accepted: 02/06/2024] [Indexed: 03/25/2024]
Abstract
An unresolved question in the origin and evolution of life is whether a continuous path from geochemical precursors to the majority of molecules in the biosphere can be reconstructed from modern-day biochemistry. Here we identified a feasible path by simulating the evolution of biosphere-scale metabolism, using only known biochemical reactions and models of primitive coenzymes. We find that purine synthesis constitutes a bottleneck for metabolic expansion, which can be alleviated by non-autocatalytic phosphoryl coupling agents. Early phases of the expansion are enriched with enzymes that are metal dependent and structurally symmetric, supporting models of early biochemical evolution. This expansion trajectory suggests distinct hypotheses regarding the tempo, mode and timing of metabolic pathway evolution, including a late appearance of methane metabolisms and oxygenic photosynthesis consistent with the geochemical record. The concordance between biological and geological analyses suggests that this trajectory provides a plausible evolutionary history for the vast majority of core biochemistry.
Affiliation(s)
- Joshua E Goldford
- Division of Geological and Planetary Sciences, California Institute of Technology, Pasadena, CA, USA.
- Physics of Living Systems, Massachusetts Institute of Technology, Cambridge, MA, USA.
- Blue Marble Space Institute of Science, Seattle, WA, USA.
- Harrison B Smith
- Blue Marble Space Institute of Science, Seattle, WA, USA
- Earth-Life Science Institute, Tokyo Institute of Technology, Tokyo, Japan
- Liam M Longo
- Blue Marble Space Institute of Science, Seattle, WA, USA
- Earth-Life Science Institute, Tokyo Institute of Technology, Tokyo, Japan
- Boswell A Wing
- Department of Geological Sciences, University of Colorado, Boulder, CO, USA
- Shawn Erin McGlynn
- Blue Marble Space Institute of Science, Seattle, WA, USA.
- Earth-Life Science Institute, Tokyo Institute of Technology, Tokyo, Japan.
- Biofunctional Catalyst Research Team, RIKEN Center for Sustainable Resource Science, Wako, Japan.
3
Hidden resources in the Escherichia coli genome restore PLP synthesis and robust growth after deletion of the essential gene pdxB. Proc Natl Acad Sci U S A 2019; 116:24164-24173. [PMID: 31712440 PMCID: PMC6883840 DOI: 10.1073/pnas.1915569116] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Indexed: 11/28/2022]
Abstract
The evolution of new metabolic pathways has been a driver of diversification from the last universal common ancestor 3.8 billion y ago to the present. Bioinformatic evidence suggests that many pathways were assembled by recruiting promiscuous enzymes to serve new functions. However, the processes by which new pathways have emerged are lost in time. We have little information about the environmental conditions that fostered emergence of new pathways, the genome context in which new pathways emerged, and the types of mutations that elevated flux through inefficient new pathways. Experimental laboratory evolution has allowed us to evolve a new pathway and identify mechanisms by which mutations increase fitness when an inefficient new pathway becomes important for survival. PdxB (erythronate 4-phosphate dehydrogenase) is expected to be required for synthesis of the essential cofactor pyridoxal 5′-phosphate (PLP) in Escherichia coli. Surprisingly, incubation of the ∆pdxB strain in medium containing glucose as a sole carbon source for 10 d resulted in visible turbidity, suggesting that PLP is being produced by some alternative pathway. Continued evolution of parallel lineages for 110 to 150 generations produced several strains that grow robustly in glucose. We identified a 4-step bypass pathway patched together from promiscuous enzymes that restores PLP synthesis in strain JK1. None of the mutations in JK1 occurs in a gene encoding an enzyme in the new pathway. Two mutations indirectly enhance the ability of SerA (3-phosphoglycerate dehydrogenase) to perform a new function in the bypass pathway. Another disrupts a gene encoding a PLP phosphatase, thus preserving PLP levels. These results demonstrate that a functional pathway can be patched together from promiscuous enzymes in the proteome, even without mutations in the genes encoding those enzymes.
4
Chaliotis A, Vlastaridis P, Mossialos D, Ibba M, Becker HD, Stathopoulos C, Amoutzias GD. The complex evolutionary history of aminoacyl-tRNA synthetases. Nucleic Acids Res 2017; 45:1059-1068. [PMID: 28180287 PMCID: PMC5388404 DOI: 10.1093/nar/gkw1182] [Citation(s) in RCA: 52] [Impact Index Per Article: 6.5] [Received: 08/25/2016] [Revised: 10/20/2016] [Accepted: 11/16/2016] [Indexed: 12/15/2022]
Abstract
Aminoacyl-tRNA synthetases (AARSs) are a superfamily of enzymes responsible for the faithful translation of the genetic code and have lately become a prominent target for synthetic biologists. Our large-scale analysis of >2500 prokaryotic genomes reveals the complex evolutionary history of these enzymes and their paralogs, in which horizontal gene transfer played an important role. These results show that a widespread belief in the evolutionary stability of this superfamily is misconceived. Although AlaRS, GlyRS, LeuRS, IleRS, ValRS are the most stable members of the family, GluRS, LysRS and CysRS often have paralogs, whereas AsnRS, GlnRS, PylRS and SepRS are often absent from many genomes. In the course of this analysis, highly conserved protein motifs and domains within each of the AARS loci were identified and used to build a web-based computational tool for the genome-wide detection of AARS coding sequences. This is based on hidden Markov models (HMMs) and is available together with a cognate database that may be used for specific analyses. The bioinformatics tools that we have developed may also help to identify new antibiotic agents and targets using these essential enzymes. These tools also may help to identify organisms with alternative pathways that are involved in maintaining the fidelity of the genetic code.
Affiliation(s)
- Anargyros Chaliotis
- Bioinformatics Laboratory, Department of Biochemistry and Biotechnology, University of Thessaly, Larissa, Greece
- Panayotis Vlastaridis
- Bioinformatics Laboratory, Department of Biochemistry and Biotechnology, University of Thessaly, Larissa, Greece
- Dimitris Mossialos
- Molecular Microbiology Laboratory, Department of Biochemistry and Biotechnology, University of Thessaly, Larissa, Greece
- Michael Ibba
- Department of Microbiology, The Ohio State University, Columbus, OH, USA
- Hubert D Becker
- Génétique Moléculaire, Génomique, Microbiologie, UMR 7156, CNRS, Université de Strasbourg, 4 allée Konrad Röntgen, Strasbourg Cedex, France
- Grigorios D Amoutzias
- Bioinformatics Laboratory, Department of Biochemistry and Biotechnology, University of Thessaly, Larissa, Greece
5
Martínez-Núñez MA, Rodríguez-Escamilla Z, Rodríguez-Vázquez K, Pérez-Rueda E. Tracing the Repertoire of Promiscuous Enzymes along the Metabolic Pathways in Archaeal Organisms. Life (Basel) 2017; 7:life7030030. [PMID: 28703743 PMCID: PMC5617955 DOI: 10.3390/life7030030] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 04/12/2017] [Revised: 07/09/2017] [Accepted: 07/10/2017] [Indexed: 01/10/2023]
Abstract
The metabolic pathways that carry out the biochemical transformations sustaining life depend on the efficiency of their associated enzymes. In recent years, it has become clear that promiscuous enzymes have played an important role in the function and evolution of metabolism. In this work we analyze the repertoire of promiscuous enzymes in 89 non-redundant genomes of the Archaea cellular domain. Promiscuous enzymes are defined as those proteins with two or more different Enzyme Commission (E.C.) numbers, according to the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. From this analysis, it was found that the fraction of promiscuous enzymes is lower in Archaea than in Bacteria. A greater diversity of superfamily domains is associated with promiscuous enzymes compared to specialized enzymes, both in Archaea and Bacteria, and there is an enrichment of substrate promiscuity rather than catalytic promiscuity in the archaeal enzymes. Finally, the presence of promiscuous enzymes in the metabolic pathways was found to be heterogeneously distributed at the domain level and in the phyla that make up the Archaea. These analyses increase our understanding of promiscuous enzymes and provide additional clues to the evolution of metabolism in Archaea.
Affiliation(s)
- Mario Alberto Martínez-Núñez
- Laboratorio de Estudios Ecogenómicos, Facultad de Ciencias, Unidad Académica de Ciencias y Tecnología de la UNAM en Yucatán, Universidad Nacional Autónoma de México, Carretera Sierra Papacal-Chuburna Km. 5, C.P. 97302, Mérida, Yucatán, Mexico.
- Zuemy Rodríguez-Escamilla
- Departamento de Microbiología, Instituto de Biotecnología, Universidad Nacional Autónoma de México, C.P. 62210, Cuernavaca, Morelos, Mexico.
- Katya Rodríguez-Vázquez
- Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas, Universidad Nacional Autónoma de México, Ciudad Universitaria, C.P. 04510, Ciudad de México, Mexico.
- Ernesto Pérez-Rueda
- Departamento de Ingeniería Celular y Biocatálisis, Instituto de Biotecnología, Universidad Nacional Autónoma de México, C.P. 62210, Cuernavaca, Morelos, Mexico.
- Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas, Universidad Nacional Autónoma de México, Unidad Académica Yucatán, Carretera Sierra Papacal-Chuburna Km. 5, C.P. 97302, Mérida, Yucatán, Mexico.
6
Reconstruction and Application of Protein-Protein Interaction Network. Int J Mol Sci 2016; 17:ijms17060907. [PMID: 27338356 PMCID: PMC4926441 DOI: 10.3390/ijms17060907] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.6] [Received: 04/02/2016] [Revised: 05/31/2016] [Accepted: 06/03/2016] [Indexed: 11/17/2022]
Abstract
The protein-protein interaction network (PIN) is a useful tool for systematic investigation of the complex biological activities in the cell. With increasing interest in proteome-wide interaction networks, PINs have been reconstructed for many species, including viruses, bacteria, plants, animals, and humans. As biological techniques have developed, the methods for reconstructing PINs have improved further. PINs have gradually penetrated many fields of biological research. In this work we systematically review the development of PINs over the past fifteen years, with respect to their reconstruction and their application to function annotation, subsystem investigation, evolution analysis, hub protein analysis, and regulation mechanism analysis. Because of the significant role of PINs in the in-depth exploration of biological process mechanisms, PINs are likely to be adopted by more and more researchers for the systematic study of protein systems in various kinds of organisms.
7
Robinson JL, Bertolo RF. The Pediatric Methionine Requirement Should Incorporate Remethylation Potential and Transmethylation Demands. Adv Nutr 2016; 7:523-34. [PMID: 27184279 PMCID: PMC4863267 DOI: 10.3945/an.115.010843] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Indexed: 12/24/2022]
Abstract
The metabolic demand for methionine is great in neonates. Indeed, methionine is the only indispensable sulfur amino acid and is required not only for protein synthesis and growth but is also partitioned to a greater extent to transsulfuration for cysteine and taurine synthesis and to >50 transmethylation reactions that serve to methylate DNA and synthesize metabolites, including creatine and phosphatidylcholine. Therefore, the pediatric methionine requirement must accommodate the demands of rapid protein turnover as well as vast nonprotein demands. Because cysteine spares the methionine requirement, it is likely that the dietary provision of transmethylation products can also feasibly spare methionine. However, understanding the requirement of methionine is further complicated because demethylated methionine can be remethylated by the dietary methyl donors folate and betaine (derived from choline). Intakes of dietary methyl donors are highly variable, which is of particular concern for newborns. It has been demonstrated that many populations have enhanced requirements for these nutrients, and nutrient fortification may exacerbate this phenomenon by selecting phenotypes that increase methyl requirements. Moreover, higher transmethylation rates can limit methyl supply and affect other transmethylation reactions as well as protein synthesis. Therefore, careful investigations are needed to determine how remethylation and transmethylation contribute to the methionine requirement. The purpose of this review is to support our hypothesis that dietary methyl donors and consumers can drive methionine availability for protein synthesis and transmethylation reactions. We argue that nutritional strategies in neonates need to ensure that methionine is available to meet requirements for growth as well as for transmethylation products.
Affiliation(s)
- Robert F Bertolo
- Department of Biochemistry, Memorial University of Newfoundland, St. John's, Newfoundland and Labrador, Canada
8
Large-Scale Analysis Exploring Evolution of Catalytic Machineries and Mechanisms in Enzyme Superfamilies. J Mol Biol 2015; 428:253-267. [PMID: 26585402 PMCID: PMC4751976 DOI: 10.1016/j.jmb.2015.11.010] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.7] [Received: 06/30/2015] [Revised: 10/05/2015] [Accepted: 11/10/2015] [Indexed: 01/28/2023]
Abstract
Enzymes, as biological catalysts, form the basis of all forms of life. How these proteins have evolved their functions remains a fundamental question in biology. Over 100 years of detailed biochemistry studies, combined with the large volumes of sequence and protein structural data now available, mean that we are able to perform large-scale analyses to address this question. Using a range of computational tools and resources, we have compiled information on all experimentally annotated changes in enzyme function within 379 structurally defined protein domain superfamilies, linking the changes observed in functions during evolution to changes in reaction chemistry. Many superfamilies show changes in function at some level, although one function often dominates one superfamily. We use quantitative measures of changes in reaction chemistry to reveal the various types of chemical changes occurring during evolution and to exemplify these by detailed examples. Additionally, we use structural information on the enzymes' active sites to examine how different superfamilies have changed their catalytic machinery during evolution. Some superfamilies have changed the reactions they perform without changing catalytic machinery. In others, large changes of enzyme function, in terms of both overall chemistry and substrate specificity, have been brought about by significant changes in catalytic machinery. Interestingly, in some superfamilies, relatives perform similar functions but with different catalytic machineries. This analysis highlights characteristics of functional evolution across a wide range of superfamilies, providing insights that will be useful in predicting the function of uncharacterised sequences and the design of new synthetic enzymes.
Highlights
- Examining how enzyme function evolves using sequence, structure, and reaction mechanism data.
- Quantifying changes in reaction mechanisms reveals how function has diverged in many superfamilies.
- Homologous domains frequently use different catalytic residues, which sometimes perform the same enzyme chemistry.
- This large-scale analysis has significance in protein function prediction and enzyme design.
9
Das S, Dawson NL, Orengo CA. Diversity in protein domain superfamilies. Curr Opin Genet Dev 2015; 35:40-9. [PMID: 26451979 PMCID: PMC4686048 DOI: 10.1016/j.gde.2015.09.005] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Received: 07/09/2015] [Revised: 09/07/2015] [Accepted: 09/08/2015] [Indexed: 01/25/2023]
Abstract
Whilst ∼93% of domain superfamilies appear to be relatively structurally and functionally conserved based on the available data from the CATH-Gene3D domain classification resource, the remainder are much more diverse. In this review, we consider how domains in some of the most ubiquitous and promiscuous superfamilies have evolved, in particular the plasticity in their functional sites and surfaces, which expands the repertoire of molecules they interact with and the actions performed on them. To what extent can we identify a core function for these superfamilies which would allow us to develop a ‘domain grammar of function’ whereby a protein's biological role can be proposed from its constituent domains? Clearly the first step is to understand the extent to which these components vary and how changes in their molecular make-up modify function.
Affiliation(s)
- Sayoni Das
- Institute of Structural and Molecular Biology, UCL, 627 Darwin Building, Gower Street, WC1E 6BT, UK
- Natalie L Dawson
- Institute of Structural and Molecular Biology, UCL, 627 Darwin Building, Gower Street, WC1E 6BT, UK
- Christine A Orengo
- Institute of Structural and Molecular Biology, UCL, 627 Darwin Building, Gower Street, WC1E 6BT, UK.
10
Brown S, Babbitt P. Using the structure-function linkage database to characterize functional domains in enzymes. Curr Protoc Bioinformatics 2014; 48:2.10.1-2.10.16. [PMID: 25501940 DOI: 10.1002/0471250953.bi0210s48] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 11/07/2022]
Abstract
The Structure-Function Linkage Database (SFLD; http://sfld.rbvi.ucsf.edu/) is a Web-accessible database designed to link enzyme sequence, structure, and functional information. This unit describes the protocols by which a user may query the database to predict the function of uncharacterized enzymes and to correct misannotated functional assignments. The information in this unit is especially useful in helping a user discriminate functional capabilities of a sequence that is only distantly related to characterized sequences in publicly available databases.
Affiliation(s)
- Shoshana Brown
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, California
11
Murakami Y, Kinoshita K, Kinjo AR, Nakamura H. Exhaustive comparison and classification of ligand-binding surfaces in proteins. Protein Sci 2013; 22:1379-91. [PMID: 23934772 PMCID: PMC3795496 DOI: 10.1002/pro.2329] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 04/15/2013] [Revised: 07/29/2013] [Accepted: 08/05/2013] [Indexed: 12/03/2022]
Abstract
Many proteins function by interacting with other small molecules (ligands). Identification of ligand-binding sites (LBSs) in proteins can therefore help to infer their molecular functions. A comprehensive comparison among local structures of LBSs was previously performed, in order to understand their relationships and to classify their structural motifs. However, a similar exhaustive comparison among local surfaces of LBSs (patches) has never been performed, due to computational complexity. To enhance our understanding of LBSs, it is worth performing such comparisons among patches and classifying them based on similarities of their surface configurations and electrostatic potentials. In this study, we first developed a rapid method to compare two patches. We then clustered patches corresponding to the same PDB chemical component identifier for a ligand, and selected a representative patch from each cluster. We subsequently compared the representative patches exhaustively and clustered them using a similarity score, PatSim. Finally, the resultant PatSim scores were compared with similarities of atomic structures of the LBSs and those of the ligand-binding protein sequences and functions. Consequently, we classified the patches into ∼2000 well-characterized clusters. We found that about 63% of these clusters are used in identical protein folds, although about 25% of the clusters are conserved in distantly related proteins and even in proteins with cross-fold similarity. Furthermore, we showed that patches with higher PatSim scores have the potential to be involved in similar biological processes.
Affiliation(s)
- Yoichi Murakami
- Graduate School of Information Sciences, Tohoku University, 6-3-09 Aramaki-aza-aoba, Aoba-ku, Sendai, Miyagi, 982-0036, Japan
12
Caetano-Anollés K, Caetano-Anollés G. Structural phylogenomics reveals gradual evolutionary replacement of abiotic chemistries by protein enzymes in purine metabolism. PLoS One 2013; 8:e59300. [PMID: 23516625 PMCID: PMC3596326 DOI: 10.1371/journal.pone.0059300] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Received: 07/11/2012] [Accepted: 02/13/2013] [Indexed: 11/30/2022]
Abstract
The origin of metabolism has been linked to abiotic chemistries that existed on our planet at the beginning of life. While plausible chemical pathways have been proposed, including the synthesis of nucleobases, ribose and ribonucleotides, the cooption of these reactions by modern enzymes remains shrouded in mystery. Here we study the emergence of purine metabolism. The ages of protein domains derived from a census of fold family structure in hundreds of genomes were mapped onto enzymes in metabolic diagrams. We find that the origin of the nucleotide interconversion pathway benefited most parsimoniously from the prebiotic formation of adenine nucleosides. In turn, pathways of nucleotide biosynthesis, catabolism and salvage originated ∼300 million years later by concerted enzymatic recruitments and gradual replacement of abiotic chemistries. Remarkably, this process led to the emergence of the fully enzymatic biosynthetic pathway ∼3 billion years ago, concurrently with the appearance of a functional ribosome. The simultaneous appearance of purine biosynthesis and the ribosome probably fulfilled the expanding matter-energy and processing needs of genomic information.
Affiliation(s)
- Kelsey Caetano-Anollés
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America
- Chicago School of Professional Psychology, Chicago, Illinois, United States of America
- Gustavo Caetano-Anollés
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America
13
Franceschini A, Szklarczyk D, Frankild S, Kuhn M, Simonovic M, Roth A, Lin J, Minguez P, Bork P, von Mering C, Jensen LJ. STRING v9.1: protein-protein interaction networks, with increased coverage and integration. Nucleic Acids Res 2013; 41:D808-15. [PMID: 23203871 PMCID: PMC3531103 DOI: 10.1093/nar/gks1094] [Citation(s) in RCA: 3337] [Impact Index Per Article: 278.1] [Received: 09/15/2012] [Revised: 10/15/2012] [Accepted: 10/18/2012] [Indexed: 12/12/2022]
Abstract
Complete knowledge of all direct and indirect interactions between proteins in a given cell would represent an important milestone towards a comprehensive description of cellular mechanisms and functions. Although this goal is still elusive, considerable progress has been made, particularly for certain model organisms and functional systems. Currently, protein interactions and associations are annotated at various levels of detail in online resources, ranging from raw data repositories to highly formalized pathway databases. For many applications, a global view of all the available interaction data is desirable, including lower-quality data and/or computational predictions. The STRING database (http://string-db.org/) aims to provide such a global perspective for as many organisms as feasible. Known and predicted associations are scored and integrated, resulting in comprehensive protein networks covering >1100 organisms. Here, we describe the update to version 9.1 of STRING, introducing several improvements: (i) we extend the automated mining of scientific texts for interaction information, to now also include full-text articles; (ii) we entirely re-designed the algorithm for transferring interactions from one model organism to the other; and (iii) we provide users with statistical information on any functional enrichment observed in their networks.
Affiliation(s)
- Andrea Franceschini
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, Switzerland, Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Denmark, Biotechnology Center, Technical University Dresden, Germany, Department of Computer Science, University of Milan, Italy, European Molecular Biology Laboratory, Heidelberg and Max-Delbrück-Centre for Molecular Medicine, Berlin, Germany
- Damian Szklarczyk
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, Switzerland, Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Denmark, Biotechnology Center, Technical University Dresden, Germany, Department of Computer Science, University of Milan, Italy, European Molecular Biology Laboratory, Heidelberg and Max-Delbrück-Centre for Molecular Medicine, Berlin, Germany
- Sune Frankild
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, Switzerland, Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Denmark, Biotechnology Center, Technical University Dresden, Germany, Department of Computer Science, University of Milan, Italy, European Molecular Biology Laboratory, Heidelberg and Max-Delbrück-Centre for Molecular Medicine, Berlin, Germany
- Michael Kuhn
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, Switzerland, Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Denmark, Biotechnology Center, Technical University Dresden, Germany, Department of Computer Science, University of Milan, Italy, European Molecular Biology Laboratory, Heidelberg and Max-Delbrück-Centre for Molecular Medicine, Berlin, Germany
- Milan Simonovic
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, Switzerland, Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Denmark, Biotechnology Center, Technical University Dresden, Germany, Department of Computer Science, University of Milan, Italy, European Molecular Biology Laboratory, Heidelberg and Max-Delbrück-Centre for Molecular Medicine, Berlin, Germany
- Alexander Roth
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, Switzerland, Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Denmark, Biotechnology Center, Technical University Dresden, Germany, Department of Computer Science, University of Milan, Italy, European Molecular Biology Laboratory, Heidelberg and Max-Delbrück-Centre for Molecular Medicine, Berlin, Germany
- Jianyi Lin
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, Switzerland, Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Denmark, Biotechnology Center, Technical University Dresden, Germany, Department of Computer Science, University of Milan, Italy, European Molecular Biology Laboratory, Heidelberg and Max-Delbrück-Centre for Molecular Medicine, Berlin, Germany
- Pablo Minguez
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, Switzerland, Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Denmark, Biotechnology Center, Technical University Dresden, Germany, Department of Computer Science, University of Milan, Italy, European Molecular Biology Laboratory, Heidelberg and Max-Delbrück-Centre for Molecular Medicine, Berlin, Germany
- Peer Bork
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, Switzerland, Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Denmark, Biotechnology Center, Technical University Dresden, Germany, Department of Computer Science, University of Milan, Italy, European Molecular Biology Laboratory, Heidelberg and Max-Delbrück-Centre for Molecular Medicine, Berlin, Germany
- Christian von Mering
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, Switzerland, Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Denmark, Biotechnology Center, Technical University Dresden, Germany, Department of Computer Science, University of Milan, Italy, European Molecular Biology Laboratory, Heidelberg and Max-Delbrück-Centre for Molecular Medicine, Berlin, Germany
- Lars J. Jensen
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, Switzerland, Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Denmark, Biotechnology Center, Technical University Dresden, Germany, Department of Computer Science, University of Milan, Italy, European Molecular Biology Laboratory, Heidelberg and Max-Delbrück-Centre for Molecular Medicine, Berlin, Germany
|
14
|
Suen S, Lu HHS, Yeang CH. Evolution of domain architectures and catalytic functions of enzymes in metabolic systems. Genome Biol Evol 2012; 4:976-93. [PMID: 22936075 PMCID: PMC3468959 DOI: 10.1093/gbe/evs072] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 12/11/2022] Open
Abstract
Domain architectures and catalytic functions of enzymes constitute the centerpieces of a metabolic network. These types of information are formulated as a two-layered network consisting of domains, proteins, and reactions: a domain-protein-reaction (DPR) network. We propose an algorithm to reconstruct the evolutionary history of DPR networks across multiple species and categorize the mechanisms of metabolic systems evolution in terms of network changes. The reconstructed history reveals distinct patterns of evolutionary mechanisms between prokaryotic and eukaryotic networks. Although the evolutionary mechanisms in early ancestors of prokaryotes and eukaryotes are quite similar, more novel and duplicated domain compositions with identical catalytic functions arise along the eukaryotic lineage. In contrast, prokaryotic enzymes become more versatile by catalyzing multiple reactions with similar chemical operations. Moreover, different metabolic pathways are enriched with distinct network evolution mechanisms. For instance, although the pathways of steroid biosynthesis, protein kinases, and glycosaminoglycan biosynthesis all constitute prominent features of animal-specific physiology, their evolution of domain architectures and catalytic functions follows distinct patterns. Steroid biosynthesis is enriched with reaction creations but retains a relatively conserved repertoire of domain compositions and proteins. Protein kinases retain conserved reactions but possess many novel domains and proteins. In contrast, glycosaminoglycan biosynthesis has high rates of reaction/protein creations and domain recruitments. Finally, we elicit and validate two general principles underlying the evolution of DPR networks: 1) duplicated enzyme proteins possess similar catalytic functions and 2) the majority of novel domains arise to catalyze novel reactions. These results shed new light on the evolution of metabolic systems.
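As a rough illustration of the two-layered domain-protein-reaction idea described in this abstract, the sketch below stores one mapping from proteins to domain architectures and one from proteins to reactions. All identifiers (`P1`, `D_a`, `R1`, and both helper functions) are hypothetical and invented for this example; this is not the authors' reconstruction algorithm, only a minimal way to hold and query such a network.

```python
from collections import defaultdict

# Hypothetical toy data (not from the paper).
# Layer 1: protein -> ordered domain architecture.
protein_domains = {
    "P1": ("D_a", "D_b"),
    "P2": ("D_a", "D_b"),
    "P3": ("D_c",),
}
# Layer 2: protein -> set of reactions it catalyzes.
protein_reactions = {
    "P1": {"R1", "R2"},
    "P2": {"R1"},
    "P3": {"R3"},
}


def proteins_by_architecture(protein_domains):
    """Group proteins that share an identical domain architecture."""
    groups = defaultdict(list)
    for protein, architecture in protein_domains.items():
        groups[architecture].append(protein)
    return dict(groups)


def reactions_of_domain(domain, protein_domains, protein_reactions):
    """Collect every reaction catalyzed by a protein containing `domain`."""
    reactions = set()
    for protein, architecture in protein_domains.items():
        if domain in architecture:
            reactions |= protein_reactions.get(protein, set())
    return reactions


groups = proteins_by_architecture(protein_domains)
da_reactions = reactions_of_domain("D_a", protein_domains, protein_reactions)
```

In this toy network, `P1` and `P2` share an architecture and overlap in the reaction `R1`, which is the kind of pattern the abstract's first principle (duplicated enzymes possess similar catalytic functions) refers to.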
Affiliation(s)
- Summit Suen
- Institute of Statistical Science, Academia Sinica, Taipei, Taiwan
|
15
|
Grassi L, Tramontano A. Horizontal and vertical growth of S. cerevisiae metabolic network. BMC Evol Biol 2011; 11:301. [PMID: 21999464 PMCID: PMC3216907 DOI: 10.1186/1471-2148-11-301] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 05/12/2011] [Accepted: 10/14/2011] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: The growth and development of a biological organism is reflected by its metabolic network, the evolution of which relies on the essential gene duplication mechanism. There are two current views about the evolution of metabolic networks. The retrograde model hypothesizes that a pathway evolves by recruiting novel enzymes in a direction opposite to the metabolic flow. The patchwork model is instead based on the assumption that the evolution is based on the exploitation of broad-specificity enzymes capable of catalysing a variety of metabolic reactions. RESULTS: We analysed a well-studied unicellular eukaryotic organism, S. cerevisiae, and studied the effect of the removal of paralogous gene products on its metabolic network. Our results, obtained using different paralog and network definitions, show that, after an initial period when gene duplication was indeed instrumental in expanding the metabolic space, the latter reached an equilibrium and subsequent gene duplications were used as a source of more specialized enzymes rather than as a source of novel reactions. We also show that the switch between the two evolutionary strategies in S. cerevisiae can be dated to about 350 million years ago. CONCLUSIONS: Our data, obtained through a novel analysis methodology, strongly support the hypothesis that the patchwork model better explains the more recent evolution of the S. cerevisiae metabolic network. Interestingly, the effects of a patchwork strategy acting before the Euascomycete-Hemiascomycete divergence are still detectable today.
Affiliation(s)
- Luigi Grassi
- Physics Department, Sapienza University of Rome, Roma, Italy
|
16
|
Almonacid DE, Babbitt PC. Toward mechanistic classification of enzyme functions. Curr Opin Chem Biol 2011; 15:435-42. [PMID: 21489855 PMCID: PMC3551611 DOI: 10.1016/j.cbpa.2011.03.008] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.4] [Received: 03/17/2011] [Accepted: 03/17/2011] [Indexed: 11/15/2022]
Abstract
Classification of enzyme function should be quantitative, computationally accessible, and informed by sequences and structures to enable use of genomic information for functional inference and other applications. Large-scale studies have established that divergently evolved enzymes share conserved elements of structure and common mechanistic steps and that convergently evolved enzymes often converge to similar mechanisms too, suggesting that reaction mechanisms could be used to develop finer-grained functional descriptions than provided by the Enzyme Commission (EC) system currently in use. Here we describe how evolution informs these structure-function mappings and review the databases that store mechanisms of enzyme reactions along with recent developments to measure ligand and mechanistic similarities. Together, these provide a foundation for new classifications of enzyme function.
Affiliation(s)
- Daniel E. Almonacid
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, 1700 4th Street, MC 2550, San Francisco, CA 94158, USA
- Department of Pharmaceutical Chemistry, University of California San Francisco, 600 16 Street, MC 2240, San Francisco, CA 94158, USA; Telephone: +1 (415) 476-3784; Fax: +1 (415) 514-9656
- California Institute for Quantitative Biosciences, University of California San Francisco
- Patricia C. Babbitt
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, 1700 4th Street, MC 2550, San Francisco, CA 94158, USA
- Department of Pharmaceutical Chemistry, University of California San Francisco, 600 16 Street, MC 2240, San Francisco, CA 94158, USA; Telephone: +1 (415) 476-3784; Fax: +1 (415) 514-9656
- California Institute for Quantitative Biosciences, University of California San Francisco
|
17
|
Lovelace LL, Cooper CL, Sodetz JM, Lebioda L. Structure of human C8 protein provides mechanistic insight into membrane pore formation by complement. J Biol Chem 2011; 286:17585-92. [PMID: 21454577 PMCID: PMC3093833 DOI: 10.1074/jbc.m111.219766] [Citation(s) in RCA: 90] [Impact Index Per Article: 6.4] [Received: 01/07/2011] [Revised: 03/23/2011] [Indexed: 11/06/2022] Open
Abstract
C8 is one of five complement proteins that assemble on bacterial membranes to form the lethal pore-like "membrane attack complex" (MAC) of complement. The MAC consists of one C5b, C6, C7, and C8 and 12-18 molecules of C9. C8 is composed of three genetically distinct subunits, C8α, C8β, and C8γ. The C6, C7, C8α, C8β, and C9 proteins are homologous and together comprise the MAC family of proteins. All contain N- and C-terminal modules and a central 40-kDa membrane attack complex perforin (MACPF) domain that has a key role in forming the MAC pore. Here, we report the 2.5 Å resolution crystal structure of human C8 purified from blood. This is the first structure of a MAC family member and of a human MACPF-containing protein. The structure shows the modules in C8α and C8β are located on the periphery of C8 and not likely to interact with the target membrane. The C8γ subunit, a member of the lipocalin family of proteins that bind and transport small lipophilic molecules, shows no occupancy of its putative ligand-binding site. C8α and C8β are related by a rotation of ∼22° with only a small translational component along the rotation axis. Evolutionary arguments suggest the geometry of binding between these two subunits is similar to the arrangement of C9 molecules within the MAC pore. This leads to a model of the MAC that explains how C8-C9 and C9-C9 interactions could facilitate refolding and insertion of putative MACPF transmembrane β-hairpins to form a circular pore.
Affiliation(s)
- Leslie L. Lovelace
- Department of Chemistry and Biochemistry, University of South Carolina, Columbia, South Carolina 29208
- Christopher L. Cooper
- Department of Chemistry and Biochemistry, University of South Carolina, Columbia, South Carolina 29208
- James M. Sodetz
- Department of Chemistry and Biochemistry, University of South Carolina, Columbia, South Carolina 29208
- Lukasz Lebioda
- Department of Chemistry and Biochemistry, University of South Carolina, Columbia, South Carolina 29208
|
18
|
Khersonsky O, Malitsky S, Rogachev I, Tawfik DS. Role of Chemistry versus Substrate Binding in Recruiting Promiscuous Enzyme Functions. Biochemistry 2011; 50:2683-90. [DOI: 10.1021/bi101763c] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.0] [Indexed: 11/28/2022]
Affiliation(s)
- Olga Khersonsky
- Department of Biological Chemistry and Department of Plant Sciences, Weizmann Institute of Science, Rehovot 76100, Israel
- Sergey Malitsky
- Department of Biological Chemistry and Department of Plant Sciences, Weizmann Institute of Science, Rehovot 76100, Israel
- Ilana Rogachev
- Department of Biological Chemistry and Department of Plant Sciences, Weizmann Institute of Science, Rehovot 76100, Israel
- Dan S. Tawfik
- Department of Biological Chemistry and Department of Plant Sciences, Weizmann Institute of Science, Rehovot 76100, Israel
|
19
|
Glasner ME, Gerlt JA, Babbitt PC. Mechanisms of protein evolution and their application to protein engineering. Adv Enzymol Relat Areas Mol Biol 2010; 75:193-239, xii-xiii. [PMID: 17124868 DOI: 10.1002/9780471224464.ch3] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Indexed: 12/14/2022]
Abstract
Protein engineering holds great promise for the development of new biosensors, diagnostics, therapeutics, and agents for bioremediation. Despite some remarkable successes in experimental and computational protein design, engineered proteins rarely achieve the efficiency or specificity of natural enzymes. Current protein design methods utilize evolutionary concepts, including mutation, recombination, and selection, but the inability to fully recapitulate the success of natural evolution suggests that some evolutionary principles have not been fully exploited. One aspect of protein engineering that has received little attention is how to select the most promising proteins to serve as templates, or scaffolds, for engineering. Two evolutionary concepts that could provide a rational basis for template selection are the conservation of catalytic mechanisms and functional promiscuity. Knowledge of the catalytic motifs responsible for conserved aspects of catalysis in mechanistically diverse superfamilies could be used to identify promising templates for protein engineering. Second, protein evolution often proceeds through promiscuous intermediates, suggesting that templates which are naturally promiscuous for a target reaction could enhance protein engineering strategies. This review explores these ideas and alternative hypotheses concerning protein evolution and engineering. Future research will determine if application of these principles will lead to a protein engineering methodology governed by predictable rules for designing efficient, novel catalysts.
Affiliation(s)
- Margaret E Glasner
- Department of Biopharmaceutical Sciences, University of California-San Francisco, San Francisco, CA 94143, USA
|
20
|
Almonacid DE, Yera ER, Mitchell JBO, Babbitt PC. Quantitative comparison of catalytic mechanisms and overall reactions in convergently evolved enzymes: implications for classification of enzyme function. PLoS Comput Biol 2010; 6:e1000700. [PMID: 20300652 PMCID: PMC2837397 DOI: 10.1371/journal.pcbi.1000700] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.9] [Received: 08/20/2009] [Accepted: 02/02/2010] [Indexed: 11/19/2022] Open
Abstract
Functionally analogous enzymes are those that catalyze similar reactions on similar substrates but do not share common ancestry, providing a window on the different structural strategies nature has used to evolve required catalysts. Identification and use of this information to improve reaction classification and computational annotation of enzymes newly discovered in the genome projects would benefit from systematic determination of reaction similarities. Here, we quantified similarity in bond changes for overall reactions and catalytic mechanisms for 95 pairs of functionally analogous enzymes (non-homologous enzymes with identical first three numbers of their EC codes) from the MACiE database. Similarity of overall reactions was computed by comparing the sets of bond changes in the transformations from substrates to products. For similarity of mechanisms, sets of bond changes occurring in each mechanistic step were compared; these similarities were then used to guide global and local alignments of mechanistic steps. Using this metric, only 44% of pairs of functionally analogous enzymes in the dataset had significantly similar overall reactions. For these enzymes, convergence to the same mechanism occurred in 33% of cases, with most pairs having at least one identical mechanistic step. Using our metric, overall reaction similarity serves as an upper bound for mechanistic similarity in functional analogs. For example, the four carbon-oxygen lyases acting on phosphates (EC 4.2.3) show neither significant overall reaction similarity nor significant mechanistic similarity. By contrast, the three carboxylic-ester hydrolases (EC 3.1.1) catalyze overall reactions with identical bond changes and have converged to almost identical mechanisms. 
The large proportion of enzyme pairs that do not show significant overall reaction similarity (56%) suggests that at least for the functionally analogous enzymes studied here, more stringent criteria could be used to refine definitions of EC sub-subclasses for improved discrimination in their classification of enzyme reactions. The results also indicate that mechanistic convergence of reaction steps is widespread, suggesting that quantitative measurement of mechanistic similarity can inform approaches for functional annotation.
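The abstract above describes comparing sets of bond changes between reactions but does not name the similarity coefficient. The sketch below assumes a Jaccard (Tanimoto) coefficient over sets, which is an assumption made for illustration and not necessarily the paper's metric. The bond-change labels and enzyme names are hypothetical.

```python
def bond_change_similarity(changes_a, changes_b):
    """Jaccard (Tanimoto) similarity between two collections of bond changes.

    Returns 1.0 for identical sets and 0.0 for disjoint sets.
    Two empty collections are treated as identical by convention.
    """
    set_a, set_b = set(changes_a), set(changes_b)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


# Hypothetical bond-change labels, invented for illustration only.
ester_hydrolase_1 = {"C-O cleaved", "O-H formed", "C-O formed", "O-H cleaved"}
ester_hydrolase_2 = {"C-O cleaved", "O-H formed", "C-O formed", "O-H cleaved"}
phosphate_lyase = {"C-O cleaved", "C-H cleaved", "C=C formed"}

identical = bond_change_similarity(ester_hydrolase_1, ester_hydrolase_2)
partial = bond_change_similarity(ester_hydrolase_1, phosphate_lyase)
```

Under this assumed metric, two reactions with identical bond changes score 1.0, while reactions sharing only one of six distinct bond changes score 1/6.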
Affiliation(s)
- Daniel E. Almonacid
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, California, United States of America
- Department of Pharmaceutical Chemistry, University of California San Francisco, San Francisco, California, United States of America
- California Institute for Quantitative Biosciences, University of California San Francisco, San Francisco, California, United States of America
- Emmanuel R. Yera
- Biological and Medical Informatics Graduate Program, University of California San Francisco, San Francisco, California, United States of America
- John B. O. Mitchell
- Centre for Biomolecular Sciences, University of St Andrews, St Andrews, United Kingdom
- Patricia C. Babbitt
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, California, United States of America
- Department of Pharmaceutical Chemistry, University of California San Francisco, San Francisco, California, United States of America
- California Institute for Quantitative Biosciences, University of California San Francisco, San Francisco, California, United States of America
|
21
|
Reid AJ, Ranea JA, Orengo CA. Comparative evolutionary analysis of protein complexes in E. coli and yeast. BMC Genomics 2010; 11:79. [PMID: 20122144 PMCID: PMC2837643 DOI: 10.1186/1471-2164-11-79] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Received: 04/16/2009] [Accepted: 02/01/2010] [Indexed: 11/17/2022] Open
Abstract
Background: Proteins do not act in isolation; they frequently act together in protein complexes to carry out concerted cellular functions. The evolution of complexes is poorly understood, especially in organisms other than yeast, where little experimental data has been available. Results: We generated accurate, high coverage datasets of protein complexes for E. coli and yeast in order to study differences in the evolution of complexes between these two species. We show that substantial differences exist in how complexes have evolved between these organisms. A previously proposed model of complex evolution identified complexes with cores of interacting homologues. We support findings of the relative importance of this mode of evolution in yeast, but find that it is much less common in E. coli. Additionally it is shown that those homologues which do cluster in complexes are involved in eukaryote-specific functions. Furthermore we identify correlated pairs of non-homologous domains which occur in multiple protein complexes. These were identified in both yeast and E. coli and we present evidence that these too may represent complex cores in yeast but not those of E. coli. Conclusions: Our results suggest that there are differences in the way protein complexes have evolved in E. coli and yeast. Whereas some yeast complexes have evolved by recruiting paralogues, this is not apparent in E. coli. Furthermore, such complexes are involved in eukaryotic-specific functions. This implies that the increase in gene family sizes seen in eukaryotes in part reflects multiple family members being used within complexes. However, in general, in both E. coli and yeast, homologous domains are used in different complexes.
Affiliation(s)
- Adam J Reid
- Research Department of Structural & Molecular Biology, University College London, London, WC1E 6BT, UK.
|
22
|
Evolution of biomolecular networks: lessons from metabolic and protein interactions. Nat Rev Mol Cell Biol 2009; 10:791-803. [PMID: 19851337 DOI: 10.1038/nrm2787] [Citation(s) in RCA: 148] [Impact Index Per Article: 9.3] [Indexed: 12/14/2022]
Abstract
Despite only becoming popular at the beginning of this decade, biomolecular networks are now frameworks that facilitate many discoveries in molecular biology. The nodes of these networks are usually proteins (specifically enzymes in metabolic networks), whereas the links (or edges) are their interactions with other molecules. These networks are made up of protein-protein interactions or enzyme-enzyme interactions through shared metabolites in the case of metabolic networks. Evolutionary analysis has revealed that changes in the nodes and links in protein-protein interaction and metabolic networks are subject to different selection pressures owing to distinct topological features. However, many evolutionary constraints can be uncovered only if temporal and spatial aspects are included in the network analysis.
|
23
|
Abstract
It has been known for more than 35 years that, during evolution, new proteins are formed by gene duplications, sequence and structural divergence and, in many cases, gene combinations. The genome projects have produced complete, or almost complete, descriptions of the protein repertoires of over 600 distinct organisms. Analyses of these data have dramatically increased our understanding of the formation of new proteins. At the present time, we can accurately trace the evolutionary relationships of about half the proteins found in most genomes, and it is these proteins that we discuss in the present review. Usually, the units of evolution are protein domains that are duplicated, diverge and form combinations. Small proteins contain one domain, and large proteins contain combinations of two or more domains. Domains descended from a common ancestor are clustered into superfamilies. In most genomes, the net growth of superfamily members means that more than 90% of domains are duplicates. In a section on domain duplications, we discuss the number of currently known superfamilies, their size and distribution, and superfamily expansions related to biological complexity and to specific lineages. In a section on divergence, we describe how sequences and structures diverge, the changes in stability produced by acceptable mutations, and the nature of functional divergence and selection. In a section on domain combinations, we discuss their general nature, the sequential order of domains, how combinations modify function, and the extraordinary variety of the domain combinations found in different genomes. We conclude with a brief note on other forms of protein evolution and speculations of the origins of the duplication, divergence and combination processes.
Affiliation(s)
- Cyrus Chothia
- MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 0QH, UK.
|
24
|
Iwasaki W, Takagi T. Rapid pathway evolution facilitated by horizontal gene transfers across prokaryotic lineages. PLoS Genet 2009; 5:e1000402. [PMID: 19266023 PMCID: PMC2644373 DOI: 10.1371/journal.pgen.1000402] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.4] [Received: 10/06/2008] [Accepted: 02/03/2009] [Indexed: 11/25/2022] Open
Abstract
The evolutionary history of biological pathways is of general interest, especially in this post-genomic era, because it may provide clues for understanding how complex systems encoded on genomes have been organized. To explain how pathways can evolve de novo, some noteworthy models have been proposed. However, direct reconstruction of pathway evolutionary history both on a genomic scale and at the depth of the tree of life has suffered from artificial effects in estimating the gene content of ancestral species. Recently, we developed an algorithm that effectively reconstructs gene-content evolution without these artificial effects, and we applied it to this problem. The carefully reconstructed history, which was based on the metabolic pathways of 160 prokaryotic species, confirmed that pathways have grown beyond the random acquisition of individual genes. Pathway acquisition took place quickly, probably eliminating the difficulty in holding genes during the course of the pathway evolution. This rapid evolution was due to massive horizontal gene transfers as gene groups, some of which were possibly operon transfers, which would convey existing pathways but not be able to generate novel pathways. To this end, we analyzed how these pathways originally appeared and found that the original acquisition of pathways occurred more contemporaneously than expected across different phylogenetic clades. As a possible model to explain this observation, we propose that novel pathway evolution may be facilitated by bidirectional horizontal gene transfers in prokaryotic communities. Such a model would complement existing pathway evolution models.
Many biological functions, from energy metabolism to antibiotic resistance, are carried out by biological pathways that require a number of cooperatively functioning genes. Hence, underlying mechanisms in the evolution of biological pathways are of particular interest. However, compared to the evolution of individual genes, which has been well studied, the evolution of biological pathways is far less understood. In this study, we used the abundant genome sequences available today and a novel algorithm we recently developed to trace the evolutionary history of prokaryotic metabolic pathways and to analyze how these pathways emerged. We found that the pathways have experienced significantly rapid acquisition, which would play a key role in eliminating the difficulty in holding genes during the course of pathway evolution. In addition, the emergence of novel pathways was suggested to have occurred more contemporaneously than expected across different phylogenetic clades. Based on these observations, we propose that novel pathway evolution can be facilitated by bidirectional horizontal gene transfers in prokaryotic communities. This simple model may approach the question of how biological pathways requiring a number of cooperatively functioning genes can be obtained and are the core event within the evolution of biological pathways in prokaryotes.
Affiliation(s)
- Wataru Iwasaki
- Department of Computational Biology, University of Tokyo, Kashiwa, Chiba, Japan.
|
25
|
Abstract
Contemporary protein architectures can be regarded as molecular fossils, historical imprints that mark important milestones in the history of life. Whereas sequences change at a considerable pace, higher-order structures are constrained by the energetic landscape of protein folding, the exploration of sequence and structure space, and complex interactions mediated by the proteostasis and proteolytic machineries of the cell. The survey of architectures in the living world that was fuelled by recent structural genomic initiatives has been summarized in protein classification schemes, and the overall structure of fold space explored with novel bioinformatic approaches. However, metrics of general structural comparison have not yet unified architectural complexity using the 'shared and derived' tenet of evolutionary analysis. In contrast, a shift of focus from molecules to proteomes and a census of protein structure in fully sequenced genomes were able to uncover global evolutionary patterns in the structure of proteins. Timelines of discovery of architectures and functions unfolded episodes of specialization, reductive evolutionary tendencies of architectural repertoires in proteomes and the rise of modularity in the protein world. They revealed a biologically complex ancestral proteome and the early origin of the archaeal lineage. Studies also identified an origin of the protein world in enzymes of nucleotide metabolism harbouring the P-loop-containing triphosphate hydrolase fold and the explosive discovery of metabolic functions that recapitulated well-defined prebiotic shells and involved the recruitment of structures and functions. These observations have important implications for origins of modern biochemistry and diversification of life.
|
26
|
|
27
|
Lacroix V, Cottret L, Thébault P, Sagot MF. An introduction to metabolic networks and their structural analysis. IEEE/ACM Trans Comput Biol Bioinform 2008; 5:594-617. [PMID: 18989046 DOI: 10.1109/tcbb.2008.79] [Citation(s) in RCA: 64] [Impact Index Per Article: 3.8] [Indexed: 05/27/2023]
Abstract
There has been a renewed interest for metabolism in the computational biology community, leading to an avalanche of papers coming from methodological network analysis as well as experimental and theoretical biology. This paper is meant to serve as an initial guide for both the biologists interested in formal approaches and the mathematicians or computer scientists wishing to inject more realism into their models. The paper is focused on the structural aspects of metabolism only. The literature is vast enough already, and the thread through it difficult to follow even for the more experienced worker in the field. We explain methods for acquiring data and reconstructing metabolic networks, and review the various models that have been used for their structural analysis. Several concepts such as modularity are introduced, as are the controversies that have beset the field these past few years, for instance, on whether metabolic networks are small-world or scale-free, and on which model better explains the evolution of metabolism. Clarifying the work that has been done also helps in identifying open questions and in proposing relevant future directions in the field, which we do along the paper and in the conclusion.
Affiliation(s)
- Vincent Lacroix
- Genome Bioinformatics Research Group, Centre de Regulacio Genomica (CRG), PRBB, Aiguader 88, 08003 Barcelona, Spain.
|
28
|
Caetano-Anollés G, Yafremava LS, Gee H, Caetano-Anollés D, Kim HS, Mittenthal JE. The origin and evolution of modern metabolism. Int J Biochem Cell Biol 2008; 41:285-97. [PMID: 18790074 DOI: 10.1016/j.biocel.2008.08.022] [Citation(s) in RCA: 69] [Impact Index Per Article: 4.1] [Received: 06/02/2008] [Revised: 08/09/2008] [Accepted: 08/11/2008] [Indexed: 10/21/2022]
Abstract
One fundamental goal of current research is to understand how complex biomolecular networks took the form that we observe today. Cellular metabolism is probably one of the most ancient biological networks and constitutes a good model system for the study of network evolution. While many evolutionary models have been proposed, a substantial body of work suggests metabolic pathways evolve fundamentally by recruitment, in which enzymes are drawn from close or distant regions of the network to perform novel chemistries or use different substrates. Here we review how structural and functional genomics has impacted our knowledge of evolution of modern metabolism and describe some approaches that merge evolutionary and structural genomics with advances in bioinformatics. These include mining the data on structure and function of enzymes for salient patterns of enzyme recruitment. Initial studies suggest modern metabolism originated in enzymes of nucleotide metabolism harboring the P-loop hydrolase fold, probably in pathways linked to the purine metabolic subnetwork. This gateway of recruitment gave rise to pathways related to the synthesis of nucleotides and cofactors for an ancient RNA world. Once the TIM beta/alpha-barrel fold architecture was discovered, it appears metabolic activities were recruited explosively giving rise to subnetworks related to carbohydrate and then amino acid metabolism. Remarkably, recruitment occurred in a layered system reminiscent of Morowitz's prebiotic shells, supporting the notion that modern metabolism represents a palimpsest of ancient metabolic chemistries.
|
29
|
Hernández-Montes G, Díaz-Mejía JJ, Pérez-Rueda E, Segovia L. The hidden universal distribution of amino acid biosynthetic networks: a genomic perspective on their origins and evolution. Genome Biol 2008; 9:R95. [PMID: 18541022 PMCID: PMC2481427 DOI: 10.1186/gb-2008-9-6-r95] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.3] [Received: 12/04/2007] [Revised: 05/06/2008] [Accepted: 06/09/2008] [Indexed: 12/13/2022] Open
Abstract
A core of widely distributed network branches biosynthesizing at least 16 out of the 20 standard amino acids is predicted using comparative genomics. Background: Twenty amino acids comprise the universal building blocks of proteins. However, their biosynthetic routes do not appear to be universal from an Escherichia coli-centric perspective. Nevertheless, it is necessary to understand their origin and evolution in a global context, that is, to include more 'model' species and alternative routes in order to do so. We use a comparative genomics approach to assess the origins and evolution of alternative amino acid biosynthetic network branches. Results: By tracking the taxonomic distribution of amino acid biosynthetic enzymes, we predicted a core of widely distributed network branches biosynthesizing at least 16 out of the 20 standard amino acids, suggesting that this core occurred in ancient cells, before the separation of the three cellular domains of life. Additionally, we detail the distribution of two types of alternative branches to this core: analogs, enzymes that catalyze the same reaction (using the same metabolites) and belong to different superfamilies; and 'alternologs', herein defined as branches that, proceeding via different metabolites, converge to the same end product. We suggest that the origin of alternative branches is closely related to different environmental metabolite sources and life-styles among species. Conclusion: The multi-organismal seed strategy employed in this work improves the precision of dating and determining evolutionary relationships among amino acid biosynthetic branches. This strategy could be extended to diverse metabolic routes and even other biological processes.
Additionally, we introduce the concept of 'alternolog', which not only plays an important role in the relationships between structure and function in biological networks, but also, as shown here, has strong implications for their evolution, almost equal to paralogy and analogy.
Affiliation(s)
- Georgina Hernández-Montes
- Departamento de Ingeniería Celular y Biocatálisis, Instituto de Biotecnología, Universidad Nacional Autónoma de México, Av. Universidad, Col. Chamilpa, Cuernavaca, Morelos, México
|
30
|
Cygler M, Hung MN, Wagner J, Matte A. Bacterial structural genomics initiative: overview of methods and technologies applied to the process of structure determination. Methods Mol Biol 2008; 426:537-559. [PMID: 18542889 DOI: 10.1007/978-1-60327-058-8_36] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/26/2023]
Abstract
The focus over the last several years on increasing the number of three-dimensional structures of macromolecules by implementation of high throughput methodology has led to the establishment of dedicated structural genomics programs around the world. These worldwide efforts have in turn led to development of novel, parallelized approaches to cloning, expression, purification, and crystallization of proteins. This chapter describes in some detail the approaches and protocols that have been implemented in the Bacterial Structural Genomics Initiative.
Affiliation(s)
- Miroslaw Cygler
- Biotechnology Research Institute, National Research Council Canada, Montreal, Canada
|
31
|
Díaz-Mejía JJ, Pérez-Rueda E, Segovia L. A network perspective on the evolution of metabolism by gene duplication. Genome Biol 2007; 8:R26. [PMID: 17326820 PMCID: PMC1852415 DOI: 10.1186/gb-2007-8-2-r26] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.3] [Received: 07/19/2006] [Revised: 10/23/2006] [Accepted: 02/27/2007] [Indexed: 01/16/2023] Open
Abstract
BACKGROUND: Gene duplication followed by divergence is one of the main sources of metabolic versatility. The patchwork and stepwise models of metabolic evolution help us to understand these processes, but their assumptions are relatively simplistic. We used a network-based approach to determine the influence of metabolic constraints on the retention of duplicated genes. RESULTS: We detected duplicated genes by looking for enzymes sharing homologous domains and uncovered an increased retention of duplicates for enzymes catalyzing consecutive reactions, as illustrated by the ligases acting in the biosynthesis of peptidoglycan. As a consequence, metabolic networks show a high retention of duplicates within functional modules, and we found a preferential biochemical coupling of reactions that partially explains this bias. A similar situation was found in enzyme-enzyme interaction networks, but not in interaction networks of non-enzymatic proteins or gene transcriptional regulatory networks, suggesting that the retention of duplicates results from the biochemical rules governing substrate-enzyme-product relationships. We confirmed a high retention of duplicates between chemically similar reactions, as illustrated by fatty-acid metabolism. The retention of duplicates between chemically dissimilar reactions is, however, also greater than expected by chance. Finally, we detected a significant retention of duplicates as groups, instead of single pairs. CONCLUSION: Our results indicate that in silico modeling of the origin and evolution of metabolism is improved by the inclusion of specific functional constraints, such as the preferential biochemical coupling of reactions. We suggest that the stepwise and patchwork models are not independent of each other: in fact, the network perspective enables us to reconcile and combine these models.
Affiliation(s)
- Juan Javier Díaz-Mejía
- Departamento de Ingeniería Celular y Biocatálisis, Instituto de Biotecnología, Universidad Nacional Autónoma de México. Av. Universidad 2001, Col. Chamilpa, Cuernavaca, Morelos, CP 62210 México
- Ernesto Pérez-Rueda
- Departamento de Ingeniería Celular y Biocatálisis, Instituto de Biotecnología, Universidad Nacional Autónoma de México. Av. Universidad 2001, Col. Chamilpa, Cuernavaca, Morelos, CP 62210 México
- Lorenzo Segovia
- Departamento de Ingeniería Celular y Biocatálisis, Instituto de Biotecnología, Universidad Nacional Autónoma de México. Av. Universidad 2001, Col. Chamilpa, Cuernavaca, Morelos, CP 62210 México
|
32
|
Caetano-Anollés G, Kim HS, Mittenthal JE. The origin of modern metabolic networks inferred from phylogenomic analysis of protein architecture. Proc Natl Acad Sci U S A 2007; 104:9358-63. [PMID: 17517598 PMCID: PMC1890499 DOI: 10.1073/pnas.0701214104] [Citation(s) in RCA: 118] [Impact Index Per Article: 6.6] [Indexed: 11/18/2022] Open
Abstract
Metabolism represents a complex collection of enzymatic reactions and transport processes that convert metabolites into molecules capable of supporting cellular life. Here we explore the origins and evolution of modern metabolism. Using phylogenomic information linked to the structure of metabolic enzymes, we sort out recruitment processes and discover that most enzymatic activities were associated with the nine most ancient and widely distributed protein fold architectures. An analysis of newly discovered functions showed enzymatic diversification occurred early, during the onset of the modern protein world. Most importantly, phylogenetic reconstruction exercises and other evidence suggest strongly that metabolism originated in enzymes with the P-loop hydrolase fold in nucleotide metabolism, probably in pathways linked to the purine metabolic subnetwork. Consequently, the first enzymatic takeover of an ancient biochemistry or prebiotic chemistry was related to the synthesis of nucleotides for the RNA world.
Affiliation(s)
- Gustavo Caetano-Anollés
- Departments of Crop Sciences and Cell and Developmental Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA.
|
33
|
Kim HS, Mittenthal JE, Caetano-Anollés G. MANET: tracing evolution of protein architecture in metabolic networks. BMC Bioinformatics 2006; 7:351. [PMID: 16854231 PMCID: PMC1559654 DOI: 10.1186/1471-2105-7-351] [Citation(s) in RCA: 37] [Impact Index Per Article: 1.9] [Received: 04/13/2006] [Accepted: 07/19/2006] [Indexed: 11/13/2022] Open
Abstract
Background: Cellular metabolism can be characterized by networks of enzymatic reactions and transport processes capable of supporting cellular life. Our aim is to find evolutionary patterns and processes embedded in the architecture and function of modern metabolism, using information derived from structural genomics. Description: The Molecular Ancestry Network (MANET) project traces evolution of protein architecture in biomolecular networks. We describe metabolic MANET, a database that links information in the Structural Classification of Proteins (SCOP), the Kyoto Encyclopedia of Genes and Genomes (KEGG), and phylogenetic reconstructions depicting the evolution of protein fold architecture. Metabolic MANET literally 'paints' the ancestries of enzymes derived from rooted phylogenomic trees directly onto over one hundred metabolic subnetworks, enabling the study of evolutionary patterns at global and local levels. An initial analysis of painted subnetworks reveals widespread enzymatic recruitment and an early origin of amino acid metabolism. Conclusion: MANET maps evolutionary relationships directly and globally onto biological networks, and can generate and test hypotheses related to evolution of metabolism. We anticipate its use in the study of other networks, such as signaling and other protein-protein interaction networks.
Affiliation(s)
- Hee Shin Kim
- Department of Crop Sciences, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Jay E Mittenthal
- Department of Cell and Developmental Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Gustavo Caetano-Anollés
- Department of Crop Sciences, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
|
34
|
Glasner ME, Fayazmanesh N, Chiang RA, Sakai A, Jacobson MP, Gerlt JA, Babbitt PC. Evolution of structure and function in the o-succinylbenzoate synthase/N-acylamino acid racemase family of the enolase superfamily. J Mol Biol 2006; 360:228-50. [PMID: 16740275 DOI: 10.1016/j.jmb.2006.04.055] [Citation(s) in RCA: 60] [Impact Index Per Article: 3.2] [Received: 02/28/2006] [Revised: 04/22/2006] [Accepted: 04/25/2006] [Indexed: 11/30/2022]
Abstract
Understanding how proteins evolve to provide both exquisite specificity and proficient activity is a fundamental problem in biology that has implications for protein function prediction and protein engineering. To study this problem, we analyzed the evolution of structure and function in the o-succinylbenzoate synthase/N-acylamino acid racemase (OSBS/NAAAR) family, part of the mechanistically diverse enolase superfamily. Although all characterized members of the family catalyze the OSBS reaction, this family is extraordinarily divergent, with some members sharing <15% identity. In addition, a member of this family, Amycolatopsis OSBS/NAAAR, is promiscuous, catalyzing both dehydration and racemization. Although the OSBS/NAAAR family appears to have a single evolutionary origin, no sequence or structural motifs unique to this family could be identified; all residues conserved in the family are also found in enolase superfamily members that have different functions. Based on their species distribution, several uncharacterized proteins similar to Amycolatopsis OSBS/NAAAR appear to have been transmitted by lateral gene transfer. Like Amycolatopsis OSBS/NAAAR, these might have additional or alternative functions to OSBS because many are from organisms lacking the pathway in which OSBS is an intermediate. In addition to functional differences, the OSBS/NAAAR family exhibits surprising structural variations, including large differences in orientation between the two domains. These results offer several insights into protein evolution. First, orthologous proteins can exhibit significant structural variation, and specificity can be maintained with little conservation of ligand-contacting residues. Second, the discovery of a set of proteins similar to Amycolatopsis OSBS/NAAAR supports the hypothesis that new protein functions evolve through promiscuous intermediates. 
Finally, a combination of evolutionary, structural, and sequence analyses identified characteristics that might prime proteins, such as Amycolatopsis OSBS/NAAAR, for the evolution of new activities.
Affiliation(s)
- Margaret E Glasner
- Department of Biopharmaceutical Sciences, University of California, San Francisco, CA 94143, USA
|
35
|
Pegg SCH, Brown SD, Ojha S, Seffernick J, Meng EC, Morris JH, Chang PJ, Huang CC, Ferrin TE, Babbitt PC. Leveraging enzyme structure-function relationships for functional inference and experimental design: the structure-function linkage database. Biochemistry 2006; 45:2545-55. [PMID: 16489747 DOI: 10.1021/bi052101l] [Citation(s) in RCA: 122] [Impact Index Per Article: 6.4] [Indexed: 11/29/2022]
Abstract
The study of mechanistically diverse enzyme superfamilies-collections of enzymes that perform different overall reactions but share both a common fold and a distinct mechanistic step performed by key conserved residues-helps elucidate the structure-function relationships of enzymes. We have developed a resource, the structure-function linkage database (SFLD), to analyze these structure-function relationships. Unique to the SFLD is its hierarchical classification scheme based on linking the specific partial reactions (or other chemical capabilities) that are conserved at the superfamily, subgroup, and family levels with the conserved structural elements that mediate them. We present the results of analyses using the SFLD in correcting misannotations, guiding protein engineering experiments, and elucidating the function of recently solved enzyme structures from the structural genomics initiative. The SFLD is freely accessible at http://sfld.rbvi.ucsf.edu.
Affiliation(s)
- Scott C-H Pegg
- Department of Biopharmaceutical Sciences, University of California, San Francisco, 1700 Fourth Street, San Francisco, California 94143-2250, USA
|
36
|
Pereira-Leal JB, Levy ED, Teichmann SA. The origins and evolution of functional modules: lessons from protein complexes. Philos Trans R Soc Lond B Biol Sci 2006; 361:507-17. [PMID: 16524839 PMCID: PMC1609335 DOI: 10.1098/rstb.2005.1807] [Citation(s) in RCA: 100] [Impact Index Per Article: 5.3] [Indexed: 11/12/2022] Open
Abstract
Modularity is an attribute of a system that can be decomposed into a set of cohesive entities that are loosely coupled. Many cellular networks can be decomposed into functional modules-each functionally separable from the other modules. The protein complexes in physical protein interaction networks are a good example of this, and here we focus on their origins and evolution. We investigate the emergence of protein complexes and physical interactions between proteins by duplication, and review other mechanisms. We dissect the dataset of protein complexes of known three-dimensional structure, and show that roughly 90% of these complexes contain contacts between identical proteins within the same complex. Proteins that are shared across different complexes occur frequently, and they tend to be essential genes more often than members of a single protein complex. We also provide a perspective on the evolutionary mechanisms driving the growth of other modular cellular networks such as transcriptional regulatory and metabolic networks.
|
37
|
Brown S, Babbitt P. Using the Structure-function Linkage Database to characterize functional domains in enzymes. Current Protocols in Bioinformatics 2006; Chapter 2:Unit 2.10. [PMID: 18428763 DOI: 10.1002/0471250953.bi0210s13] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/07/2022]
Abstract
The Structure-Function Linkage Database (SFLD; http://sfld.rbvi.ucsf.edu/) is a Web-accessible database designed to link enzyme sequence, structure, and functional information. This unit describes the protocols by which a user may query the database to predict the function of newly sequenced enzymes and to correct misannotated functional assignments for enzymes currently in public databases. It is especially useful in helping a user discriminate functional capabilities of a sequence that is only distantly related to characterized sequences in publicly available databases.
Affiliation(s)
- Shoshana Brown
- University of California, San Francisco, San Francisco, California, USA
|
38
|
Pál C, Papp B, Lercher MJ. Adaptive evolution of bacterial metabolic networks by horizontal gene transfer. Nat Genet 2005; 37:1372-5. [PMID: 16311593 DOI: 10.1038/ng1686] [Citation(s) in RCA: 347] [Impact Index Per Article: 17.4] [Received: 05/06/2005] [Accepted: 09/08/2005] [Indexed: 11/09/2022]
Abstract
Numerous studies have considered the emergence of metabolic pathways, but the modes of recent evolution of metabolic networks are poorly understood. Here, we integrate comparative genomics with flux balance analysis to examine (i) the contribution of different genetic mechanisms to network growth in bacteria, (ii) the selective forces driving network evolution and (iii) the integration of new nodes into the network. Most changes to the metabolic network of Escherichia coli in the past 100 million years are due to horizontal gene transfer, with little contribution from gene duplicates. Networks grow by acquiring genes involved in the transport and catalysis of external nutrients, driven by adaptations to changing environments. Accordingly, horizontally transferred genes are integrated at the periphery of the network, whereas central parts remain evolutionarily stable. Genes encoding physiologically coupled reactions are often transferred together, frequently in operons. Thus, bacterial metabolic networks evolve by direct uptake of peripheral reactions in response to changed environments.
Affiliation(s)
- Csaba Pál
- European Molecular Biology Laboratory, Meyerhofstrasse 1, D-69012 Heidelberg, Germany
|
39
|
Song J, Bonner CA, Wolinsky M, Jensen RA. The TyrA family of aromatic-pathway dehydrogenases in phylogenetic context. BMC Biol 2005; 3:13. [PMID: 15888209 PMCID: PMC1173090 DOI: 10.1186/1741-7007-3-13] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.4] [Received: 02/19/2005] [Accepted: 05/12/2005] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: The TyrA protein family includes members that catalyze two dehydrogenase reactions in distinct pathways leading to L-tyrosine and a third reaction that is not part of tyrosine biosynthesis. Family members share a catalytic core region of about 30 kDa, where inhibitors operate competitively by acting as substrate mimics. This protein family typifies many that are challenging for bioinformatic analysis because of relatively modest sequence conservation and small size. RESULTS: Phylogenetic relationships of TyrA domains were evaluated in the context of combinatorial patterns of specificity for the two substrates, as well as the presence or absence of a variety of fusions. An interactive tool is provided for prediction of substrate specificity. Interactive alignments for a suite of catalytic-core TyrA domains of differing specificity are also provided to facilitate phylogenetic analysis. tyrA membership in apparent operons (or supraoperons) was examined, and patterns of conserved synteny in relationship to organismal positions on the 16S rRNA tree were ascertained for members of the domain Bacteria. A number of aromatic-pathway genes (hisHb, aroF, aroQ) have fused with tyrA, and it must be more than coincidental that the free-standing counterparts of all of the latter fused genes exhibit a distinct trace of syntenic association. CONCLUSION: We propose that the ancestral TyrA dehydrogenase had broad specificity for both the cyclohexadienyl and pyridine nucleotide substrates. Indeed, TyrA proteins of this type persist today, but it is also common to find instances of narrowed substrate specificities, as well as of acquisition via gene fusion of additional catalytic domains or regulatory domains. In some clades a qualitative change associated with either narrowed substrate specificity or gene fusion has produced an evolutionary "jump" in the vertical genealogy of TyrA homologs.
The evolutionary history of gene organizations that include tyrA can be deduced in genome assemblages of sufficiently close relatives, the most fruitful opportunities currently being in the Proteobacteria. The evolution of TyrA proteins within the broader context of how their regulation evolved and to what extent TyrA co-evolved with other genes as common members of aromatic-pathway regulons is now feasible as an emerging topic of ongoing inquiry.
Affiliation(s)
- Jian Song
- Los Alamos National Laboratory, Los Alamos, New Mexico, 87545, USA
- Carol A Bonner
- Emerson Hall, University of Florida, P.O. Box 14425, Gainesville, Florida, 32604-2425, USA
- Murray Wolinsky
- Los Alamos National Laboratory, Los Alamos, New Mexico, 87545, USA
- Roy A Jensen
- Los Alamos National Laboratory, Los Alamos, New Mexico, 87545, USA
- Emerson Hall, University of Florida, P.O. Box 14425, Gainesville, Florida, 32604-2425, USA
|
40
|
Pereira-Leal JB, Teichmann SA. Novel specificities emerge by stepwise duplication of functional modules. Genome Res 2005; 15:552-9. [PMID: 15805495 PMCID: PMC1074369 DOI: 10.1101/gr.3102105] [Citation(s) in RCA: 69] [Impact Index Per Article: 3.5] [Received: 08/04/2004] [Accepted: 01/26/2005] [Indexed: 11/24/2022]
Abstract
A functional module can be defined as a spatially or chemically isolated set of functionally associated components that accomplishes a discrete biological process. Modularity is a key attribute of cellular systems, but the mechanisms that underlie the evolution of functional modules are largely unknown. Duplication of modules has been shown to be an efficient mechanism for the generation of functional innovation in the field of artificial intelligence, but has not been studied in biological networks. Therefore, we ask whether module duplication occurs in cellular networks. We developed a generic framework for the analysis of module duplication, and use it in a large-scale analysis of Saccharomyces cerevisiae protein complexes. Protein complexes are well defined, experimentally derived, functional modules. We observe that at least 6%-20% of the protein complexes have strong similarity to other complexes; thus a considerable fraction has evolved by duplication. Our results indicate that many complexes evolved by step-wise partial duplications. We show that duplicated complexes retain the same overall function, but have different binding specificities and regulation, revealing that duplication of these modules is associated with functional specialization.
Affiliation(s)
- José B Pereira-Leal
- MRC-Laboratory of Molecular Biology, Structural Studies Division, Cambridge CB2 2QH, United Kingdom.
|
41
|
Van Lanen SG, Reader JS, Swairjo MA, de Crécy-Lagard V, Lee B, Iwata-Reuyl D. From cyclohydrolase to oxidoreductase: discovery of nitrile reductase activity in a common fold. Proc Natl Acad Sci U S A 2005; 102:4264-9. [PMID: 15767583 PMCID: PMC555470 DOI: 10.1073/pnas.0408056102] [Citation(s) in RCA: 92] [Impact Index Per Article: 4.6] [Received: 10/29/2004] [Indexed: 11/18/2022] Open
Abstract
The enzyme YkvM from Bacillus subtilis was identified previously along with three other enzymes (YkvJKL) in a bioinformatics search for enzymes involved in the biosynthesis of queuosine, a 7-deazaguanine modified nucleoside found in tRNA(GUN) of Bacteria and Eukarya. Genetic analysis of ykvJKLM mutants in Acinetobacter confirmed that each was essential for queuosine biosynthesis, and the genes were renamed queCDEF. QueF exhibits significant homology to the type I GTP cyclohydrolases characterized by FolE. Given that GTP is the precursor to queuosine and that a cyclohydrolase-like reaction was postulated as the initial step in queuosine biosynthesis, QueF was proposed to be the putative cyclohydrolase-like enzyme responsible for this reaction. We have cloned the queF genes from B. subtilis and Escherichia coli and characterized the recombinant enzymes. Contrary to the predictions based on sequence analysis, we discovered that the enzymes, in fact, catalyze a mechanistically unrelated reaction, the NADPH-dependent reduction of 7-cyano-7-deazaguanine to 7-aminomethyl-7-deazaguanine, a late step in the biosynthesis of queuosine. We report here in vitro and in vivo studies that demonstrate this catalytic activity, as well as preliminary biochemical and bioinformatics analysis that provide insight into the structure of this family of enzymes.
Affiliation(s)
- Steven G Van Lanen
- Department of Chemistry, Portland State University, P.O. Box 751, Portland, OR 97207, USA
|
42
|
Bonner C, Jensen R, Gander J, Keyhani N. A core catalytic domain of the TyrA protein family: arogenate dehydrogenase from Synechocystis. Biochem J 2005; 382:279-91. [PMID: 15171683 PMCID: PMC1133941 DOI: 10.1042/bj20031809] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.1] [Received: 11/25/2003] [Revised: 05/11/2004] [Accepted: 06/01/2004] [Indexed: 11/17/2022]
Abstract
The TyrA protein family includes prephenate dehydrogenases, cyclohexadienyl dehydrogenases and TyrA(a)s (arogenate dehydrogenases). tyrA(a) from Synechocystis sp. PCC 6803, encoding a 30 kDa TyrA(a) protein, was cloned into an overexpression vector in Escherichia coli. TyrA(a) was then purified to apparent homogeneity and characterized. This protein is a model structure for a catalytic core domain in the TyrA superfamily, uncomplicated by allosteric or fused domains. Competitive inhibitors acting at the catalytic core of TyrA proteins are analogues of any accepted cyclohexadienyl substrate. The homodimeric enzyme was specific for L-arogenate (K(m)=331 microM) and NADP+ (K(m)=38 microM), being unable to substitute prephenate or NAD+ respectively. L-Tyrosine was a potent inhibitor of the enzyme (K(i)=70 microM). NADPH had no detectable ability to inhibit the reaction. Although the mechanism is probably steady-state random order, properties of 2',5'-ADP as an inhibitor suggest a high preference for L-arogenate binding first. Comparative enzymology established that both of the arogenate-pathway enzymes, prephenate aminotransferase and TyrA(a), were present in many diverse cyanobacteria and in a variety of eukaryotic red and green algae.
Affiliation(s)
- Carol A. Bonner
- *Department of Microbiology and Cell Science, Bldg 981, PO Box 110700, University of Florida, Gainesville, FL 32611, U.S.A
- Roy A. Jensen
- *Department of Microbiology and Cell Science, Bldg 981, PO Box 110700, University of Florida, Gainesville, FL 32611, U.S.A
- †Biosciences Division, Los Alamos National Laboratory, Los Alamos, NM 87544, U.S.A
- ‡Department of Chemistry, City College of New York, New York, NY 10031, U.S.A
- John E. Gander
- *Department of Microbiology and Cell Science, Bldg 981, PO Box 110700, University of Florida, Gainesville, FL 32611, U.S.A
- Nemat O. Keyhani
- *Department of Microbiology and Cell Science, Bldg 981, PO Box 110700, University of Florida, Gainesville, FL 32611, U.S.A
- To whom correspondence should be addressed (email )
| |
Collapse
|
43
|
Huynen MA, Gabaldón T, Snel B. Variation and evolution of biomolecular systems: Searching for functional relevance. FEBS Lett 2005; 579:1839-45. [PMID: 15763561 DOI: 10.1016/j.febslet.2005.02.004] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Received: 01/18/2005] [Revised: 01/18/2005] [Accepted: 02/01/2005] [Indexed: 11/29/2022]
Abstract
The availability of genome sequences and functional genomics data from multiple species enables us to compare the composition of biomolecular systems like biochemical pathways and protein complexes between species. Here, we review small- and large-scale, "genomics-based" approaches to biomolecular systems variation. In general, caution is required when comparing the results of bioinformatics analyses of genomes or of functional genomics data between species. Limitations to the sensitivity of sequence analysis tools and the noisy nature of genomics data tend to lead to systematic overestimates of the amount of variation. Nevertheless, the results from detailed manual analyses, and of large-scale analyses that filter out systematic biases, point to a large amount of variation in the composition of biomolecular systems. Such observations challenge our understanding of the function of the systems and their individual components and can potentially facilitate the identification and functional characterization of sub-systems within a system. Mapping the inter-species variation of complex biomolecular systems on a phylogenetic species tree allows one to reconstruct their evolution.
Affiliation(s)
- Martijn A Huynen
- Center for Molecular and Biomolecular Informatics, Nijmegen Center for Molecular Life Sciences, Radboud University Nijmegen Medical Center, P.O. Box 9010, 6500 GL Nijmegen, The Netherlands.
|
44
|
Liu J, Hegyi H, Acton TB, Montelione GT, Rost B. Automatic target selection for structural genomics on eukaryotes. Proteins 2004; 56:188-200. [PMID: 15211504 DOI: 10.1002/prot.20012] [Citation(s) in RCA: 56] [Impact Index Per Article: 2.7] [Indexed: 11/06/2022]
Abstract
A central goal of structural genomics is to experimentally determine representative structures for all protein families. At least 14 structural genomics pilot projects are currently investigating the feasibility of high-throughput structure determination; the National Institutes of Health funded nine of these in the United States. Initiatives differ in the particular subset of "all families" on which they focus. At the NorthEast Structural Genomics consortium (NESG), we target eukaryotic protein domain families. The automatic target selection procedure has three aims: 1) identify all protein domain families from currently five entirely sequenced eukaryotic target organisms based on their sequence homology, 2) discard those families that can be modeled on the basis of structural information already present in the PDB, and 3) target representatives of the remaining families for structure determination. To guarantee that all members of one family share a common foldlike region, we had to begin by dissecting proteins into structural domain-like regions before clustering. Our hierarchical approach, CHOP, utilizing homology to PrISM, Pfam-A, and SWISS-PROT chopped the 103,796 eukaryotic proteins/ORFs into 247,222 fragments. Of these fragments, 122,999 appeared suitable targets that were grouped into >27,000 singletons and >18,000 multifragment clusters. Thus, our results suggested that it might be necessary to determine >40,000 structures to minimally cover the subset of five eukaryotic proteomes.
Affiliation(s)
- Jinfeng Liu
- CUBIC, Department of Biochemistry and Molecular Biophysics, Columbia University, New York, New York 10032, USA
|
45
|
Tsoka S, Ouzounis CA. Metabolic database systems for the analysis of genome-wide function. Biotechnol Bioeng 2004; 84:750-5. [PMID: 14708115 DOI: 10.1002/bit.10881] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.2] [Indexed: 11/09/2022]
Abstract
Genome sequencing projects provide an inventory of molecular components for a wide variety of organisms. Metabolic databases integrate these functional descriptions of individual modules into a higher-level characterization of cellular metabolism. This article reviews efforts related to the development of metabolic databases and discusses how such systems have aided the delineation of genome properties. We illustrate the design features of metabolic databases and discuss the challenges facing metabolic databases as well as databases of other functional types.
Affiliation(s)
- Sophia Tsoka
- Computational Genomics Group, The European Bioinformatics Institute, EMBL Cambridge Outstation, Cambridge CB10 1SD, UK.
|
46
|
Apic G, Huber W, Teichmann SA. Multi-domain protein families and domain pairs: comparison with known structures and a random model of domain recombination. J Struct Funct Genomics 2004; 4:67-78. [PMID: 14649290 DOI: 10.1023/a:1026113408773] [Citation(s) in RCA: 76] [Impact Index Per Article: 3.6] [Indexed: 11/12/2022]
Abstract
There is a limited repertoire of domain families in nature that are duplicated and combined in different ways to form the set of proteins in a genome. Most proteins in both prokaryote and eukaryote genomes consist of two or more domains, and we show that the family size distribution of multi-domain protein families follows a power law like that of individual families. Most domain pairs occur in four to six different domain architectures: in isolation and in combinations with different partners. We showed previously that within the set of all pairwise domain combinations, most small and medium-sized families are observed in combination with one or two other families, while a few large families are very versatile and combine with many different partners. Though this may appear to be a stochastic pattern, in which large families have more combination partners by virtue of their size, we establish here that all the domain families with more than three members in genomes are duplicated more frequently than would be expected by chance considering their number of neighbouring domains. This duplication of domain pairs is statistically significant for between one and three quarters of all families with seven or more members. For the majority of pairwise domain combinations, there is no known three-dimensional structure of the two domains together, and we term these novel combinations. Novel domain combinations are interesting and important targets for structural elucidation, as the geometry and interaction between the domains will help understand the function and evolution of multi-domain proteins. Of particular interest are those combinations that occur in the largest number of multi-domain proteins, and several of these frequent novel combinations contain DNA-binding domains.
Affiliation(s)
- Gordana Apic
- MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 2QH, UK
|
47
|
Abstract
We show that three-dimensional signatures consisting of only a few functionally important residues can be diagnostic of membership in superfamilies of enzymes. Using the enolase superfamily as a model system, we demonstrate that such a signature, or template, can identify superfamily members in structural databases with high sensitivity and specificity. This is remarkable because superfamilies can be highly diverse, with members catalyzing many different overall reactions; the unifying principle can be a conserved partial reaction or chemical capability. Our definition of a superfamily thus hinges on the disposition of residues involved in a conserved function, rather than on fold similarity alone. A clear advantage of basing structure searches on such active site templates rather than on fold similarity is the specificity with which superfamilies with distinct functional characteristics can be identified within a large set of proteins with the same fold, such as the (beta/alpha)8 barrels. Preliminary results are presented for an additional group of enzymes with a different fold, the haloacid dehalogenase superfamily, suggesting that this approach may be generally useful for assigning reading frames of unknown function to specific superfamilies and thereby allowing inference of some of their functional properties.
Affiliation(s)
- Elaine C Meng
- Department of Pharmaceutical Chemistry, University of California, Genentech Hall, 600 Sixteenth Street, San Francisco, CA 94143-2240, USA
|
48
|
Abstract
We developed a method, CHOP, that dissects proteins into domain-like fragments. The basic idea was to cut proteins beginning from very reliable experimental information (PDB), proceeding to expert annotations of domain-like regions (Pfam-A), and completing through cuts based on termini of known proteins. In this way, CHOP dissected more than two thirds of all proteins from 62 proteomes. Analysis of our structural domain-like fragments revealed four surprising results. First, >70% of all dissected proteins contained more than one fragment. Second, most domains spanned on average approximately 100 residues. This average was similar for eukaryotic and prokaryotic proteins, and it is also valid (although previously not described) for all proteins in the PDB. Third, single-domain proteins were significantly longer than most domains in multidomain proteins. Fourth, three fourths of all domains appeared shorter than 210 residues. We believe that our CHOP fragments constituted an important resource for functional and structural genomics. Nevertheless, our main motivation to develop CHOP was that the single-linkage clustering method failed to adequately group full-length proteins. In contrast, CLUP, the simple clustering scheme introduced here, largely succeeded in grouping the CHOP fragments from 62 proteomes such that all members of one cluster shared a basic structural core. CLUP found >63,000 multi- and >118,000 single-member clusters. Although most fragments were restricted to a particular cluster, approximately 24% of the fragments were duplicated in at least two clusters. Our thresholds for grouping two fragments into the same cluster were rather conservative. Nevertheless, our results suggested that structural genomics initiatives have to target >30,000 fragments to at least cover the multimember clusters in 62 proteomes.
Affiliation(s)
- Jinfeng Liu
- CUBIC, Department of Biochemistry and Molecular Biophysics, Columbia University, New York, New York, USA
|
49
|
Light S, Kraulis P. Network analysis of metabolic enzyme evolution in Escherichia coli. BMC Bioinformatics 2004; 5:15. [PMID: 15113413 PMCID: PMC394313 DOI: 10.1186/1471-2105-5-15] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.7] [Received: 07/18/2003] [Accepted: 02/18/2004] [Indexed: 11/10/2022]
Abstract
BACKGROUND: The two most common models for the evolution of metabolism are the patchwork evolution model, where enzymes are thought to diverge from broad to narrow substrate specificity, and the retrograde evolution model, according to which enzymes evolve in response to substrate depletion. Analysis of the distribution of homologous enzyme pairs in the metabolic network can shed light on the respective importance of the two models. We here investigate the evolution of the metabolism in E. coli viewed as a single network using EcoCyc.
RESULTS: Sequence comparison between all enzyme pairs was performed and the minimal path length (MPL) between all enzyme pairs was determined. We find a strong over-representation of homologous enzymes at MPL 1. We show that the functionally similar and functionally undetermined enzyme pairs are responsible for most of the over-representation of homologous enzyme pairs at MPL 1.
CONCLUSIONS: The retrograde evolution model predicts that homologous enzyme pairs are at short metabolic distances from each other. In general agreement with previous studies, we find that homologous enzymes occur close to each other in the network more often than expected by chance, which lends some support to the retrograde evolution model. However, we show that the homologous enzyme pairs which may have evolved through retrograde evolution, namely the pairs that are functionally dissimilar, show a weaker over-representation at MPL 1 than the functionally similar enzyme pairs. Our study indicates that, while the retrograde evolution model may have played a small part, the patchwork evolution model is the predominant process of metabolic enzyme evolution.
Affiliation(s)
- Sara Light
- Stockholm Bioinformatics Center, Department of Biochemistry and Biophysics, Stockholm Center for Physics, Astronomy and Biotechnology, Stockholm University, Stockholm SE-10691, Sweden
- Per Kraulis
- Stockholm Bioinformatics Center, Department of Biochemistry and Biophysics, Stockholm Center for Physics, Astronomy and Biotechnology, Stockholm University, Stockholm SE-10691, Sweden
|
50
|
Mizuta S, Munakata H, Aimaiti A, Oosawa K, Shimizu T. Evaluation of the color-coding method for searching tandem repeats in prokaryotic genomes. Chem-Bio Informatics Journal 2004. [DOI: 10.1273/cbij.4.133] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.0] [Indexed: 12/27/2022]
Affiliation(s)
- Satoshi Mizuta
- Department of Electronic and Information System Engineering, Faculty of Science and Technology, Hirosaki University
- Hikaru Munakata
- Department of Electronic and Information System Engineering, Faculty of Science and Technology, Hirosaki University
- Abulimiti Aimaiti
- Department of Electronic and Information System Engineering, Faculty of Science and Technology, Hirosaki University
- Kenji Oosawa
- Department of Nano-Material Systems, Graduate School of Engineering, Gunma University
- Toshio Shimizu
- Department of Electronic and Information System Engineering, Faculty of Science and Technology, Hirosaki University
|