51
Universal distribution of component frequencies in biological and technological systems. Proc Natl Acad Sci U S A 2013; 110:6235-9. [PMID: 23530195 DOI: 10.1073/pnas.1217795110] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.3] [Indexed: 11/18/2022] Open
Abstract
Bacterial genomes and large-scale computer software projects both consist of a large number of components (genes or software packages) connected via a network of mutual dependencies. Components can be easily added or removed from individual systems, and their use frequencies vary over many orders of magnitude. We study this frequency distribution in genomes of ∼500 bacterial species and in over 2 million Linux computers and find that in both cases it is described by the same scale-free power-law distribution with an additional peak near the tail of the distribution corresponding to nearly universal components. We argue that the existence of a power law distribution of frequencies of components is a general property of any modular system with a multilayered dependency network. We demonstrate that the frequency of a component is positively correlated with its dependency degree given by the total number of upstream components whose operation directly or indirectly depends on the selected component. The observed frequency/dependency degree distributions are reproduced in a simple mathematically tractable model introduced and analyzed in this study.
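The dependency degree described in this abstract (the number of components that directly or indirectly depend on a given component) can be illustrated with a minimal sketch. The function name `dependency_degree`, the input layout, and the toy package graph are assumptions made for illustration only; they are not taken from the paper.

```python
from collections import deque

def dependency_degree(depends_on):
    """Count, for each component, how many other components depend on it
    directly or indirectly. Hypothetical helper, not the paper's code.

    depends_on maps a component to the set of components it directly requires.
    """
    # Reverse the edges: dependents[x] = components that directly require x.
    dependents = {comp: set() for comp in depends_on}
    for comp, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(comp)

    degree = {}
    for start in dependents:
        seen = set()
        queue = deque([start])
        # Breadth-first walk along reversed edges collects all dependents.
        while queue:
            node = queue.popleft()
            for parent in dependents.get(node, ()):
                if parent != start and parent not in seen:
                    seen.add(parent)
                    queue.append(parent)
        degree[start] = len(seen)
    return degree

# Toy dependency graph (illustrative data only).
toy = {
    "libc": set(),
    "openssl": {"libc"},
    "curl": {"openssl", "libc"},
    "git": {"curl"},
}
print(dependency_degree(toy))
```

In this toy graph the component at the bottom of the dependency stack has the largest dependency degree, which is the kind of component the abstract reports as being installed most frequently.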
52
Beber ME, Hütt MT. How do production systems in biological cells maintain their function in changing environments? LOGISTICS RESEARCH 2012. [DOI: 10.1007/s12159-012-0090-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/29/2022]
53
A topological characterization of medium-dependent essential metabolic reactions. Metabolites 2012; 2:632-47. [PMID: 24957651 PMCID: PMC3901215 DOI: 10.3390/metabo2030632] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 08/01/2012] [Revised: 08/28/2012] [Accepted: 09/12/2012] [Indexed: 12/04/2022] Open
Abstract
Metabolism has frequently been analyzed from a network perspective. A major question is how network properties correlate with biological features like growth rates, flux patterns and enzyme essentiality. Using methods from graph theory as well as established topological categories of metabolic systems, we analyze the essentiality of metabolic reactions depending on the growth medium and identify the topological footprint of these reactions. We find that the typical topological context of a medium-dependent essential reaction is systematically different from that of a globally essential reaction. In particular, we observe systematic differences in the distribution of medium-dependent essential reactions across three-node subgraphs (the network motif signature of medium-dependent essential reactions) compared to globally essential or globally redundant reactions. In this way, we provide evidence that the analysis of metabolic systems on the few-node subgraph scale is meaningful for explaining dynamic patterns. This topological characterization of medium-dependent essentiality provides a better understanding of the interplay between reaction deletions and environmental conditions.
54
Takemoto K. Metabolic network modularity arising from simple growth processes. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2012; 86:036107. [PMID: 23030980 DOI: 10.1103/physreve.86.036107] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 05/24/2012] [Indexed: 06/01/2023]
Abstract
Metabolic networks consist of linked functional components, or modules. The mechanism underlying metabolic network modularity is of great interest not only to researchers of basic science but also to those in fields of engineering. Previous studies have suggested a theoretical model, which proposes that a change in the evolutionary goal (system-specific purpose) increases network modularity, and this hypothesis was supported by statistical data analysis. Nevertheless, further investigation has uncovered additional possibilities that might explain the origin of network modularity. In this work we propose an evolving network model without tuning parameters to describe metabolic networks. We demonstrate, quantitatively, that metabolic network modularity can arise from simple growth processes, independent of the change in the evolutionary goal. Our model is applicable to a wide range of organisms and appears to suggest that metabolic network modularity can be more simply determined than previously thought. Nonetheless, our proposition does not serve to contradict the previous model; it strives to provide an insight from a different angle in the ongoing efforts to understand metabolic evolution, with the hope of eventually achieving the synthetic engineering of metabolic networks.
Affiliation(s)
- Kazuhiro Takemoto
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, Iizuka Fukuoka 820-8502, Japan.
55
Current understanding of the formation and adaptation of metabolic systems based on network theory. Metabolites 2012; 2:429-57. [PMID: 24957641 PMCID: PMC3901219 DOI: 10.3390/metabo2030429] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 05/24/2012] [Revised: 06/26/2012] [Accepted: 07/09/2012] [Indexed: 11/17/2022] Open
Abstract
Formation and adaptation of metabolic networks has been a long-standing question in biology. With recent developments in biotechnology and bioinformatics, the understanding of metabolism is progressively becoming clearer from a network perspective. This review introduces the comprehensive metabolic world that has been revealed by a wide range of data analyses and theoretical studies; in particular, it illustrates the role of evolutionary events, such as gene duplication and horizontal gene transfer, and environmental factors, such as nutrient availability and growth conditions, in evolution of the metabolic network. Furthermore, the mathematical models for the formation and adaptation of metabolic networks have also been described, according to the current understanding from a perspective of metabolic networks. These recent findings are helpful in not only understanding the formation of metabolic networks and their adaptation, but also metabolic engineering.
56
Cellular Automata on Graphs: Topological Properties of ER Graphs Evolved towards Low-Entropy Dynamics. ENTROPY 2012. [DOI: 10.3390/e14060993] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 01/28/2023]
57
Haegeman B, Weitz JS. A neutral theory of genome evolution and the frequency distribution of genes. BMC Genomics 2012; 13:196. [PMID: 22613814 PMCID: PMC3386021 DOI: 10.1186/1471-2164-13-196] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.4] [Received: 01/03/2012] [Accepted: 05/21/2012] [Indexed: 12/31/2022] Open
Abstract
Background The gene composition of bacteria of the same species can differ significantly between isolates. Variability in gene composition can be summarized in terms of gene frequency distributions, in which individual genes are ranked according to the frequency of genomes in which they appear. Empirical gene frequency distributions possess a U-shape, such that there are many rare genes, some genes of intermediate occurrence, and many common genes. It would seem that U-shaped gene frequency distributions can be used to infer the essentiality and/or importance of a gene to a species. Here, we ask: can U-shaped gene frequency distributions, instead, arise generically via neutral processes of genome evolution? Results We introduce a neutral model of genome evolution which combines birth-death processes at the organismal level with gene uptake and loss at the genomic level. This model predicts that gene frequency distributions possess a characteristic U-shape even in the absence of selective forces driving genome and population structure. We compare the model predictions to empirical gene frequency distributions from 6 multiply sequenced species of bacterial pathogens. We fit the model with constant population size to data, matching U-shape distributions albeit without matching all quantitative features of the distribution. We find stronger model fits in the case where we consider exponentially growing populations. We also show that two alternative models which contain a "rigid" and "flexible" core component of genomes provide strong fits to gene frequency distributions. Conclusions The analysis of neutral models of genome evolution suggests that U-shaped gene frequency distributions provide less information than previously suggested regarding gene essentiality. We discuss the need for additional theory and genomic level information to disentangle the roles of evolutionary mechanisms operating within and amongst individuals in driving the dynamics of gene distributions.
Affiliation(s)
- Bart Haegeman
- INRIA Research Team MODEMIC, UMR MISTEA, 34060 Montpellier, France.
58
Sonnenschein N, Golib Dzib JF, Lesne A, Eilebrecht S, Boulkroun S, Zennaro MC, Benecke A, Hütt MT. A network perspective on metabolic inconsistency. BMC SYSTEMS BIOLOGY 2012; 6:41. [PMID: 22583819 PMCID: PMC3579709 DOI: 10.1186/1752-0509-6-41] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.9] [Received: 11/16/2011] [Accepted: 04/14/2012] [Indexed: 11/10/2022]
Abstract
BACKGROUND Integrating gene expression profiles and metabolic pathways under different experimental conditions is essential for understanding the coherence of these two layers of cellular organization. The network character of metabolic systems can be instrumental in developing concepts of agreement between expression data and pathways. A network-driven interpretation of gene expression data has the potential of suggesting novel classifiers for pathological cellular states and of contributing to a general theoretical understanding of gene regulation. RESULTS Here, we analyze the coherence of gene expression patterns and a reconstruction of human metabolism, using consistency scores obtained from network and constraint-based analysis methods. We find a surprisingly strong correlation between the two measures, demonstrating that a substantial part of inconsistencies between metabolic processes and gene expression can be understood from a network perspective alone. Prompted by this finding, we investigate the topological context of the individual biochemical reactions responsible for the observed inconsistencies. On this basis, we are able to separate the differential contributions that bear physiological information about the system, from the unspecific contributions that unravel gaps in the metabolic reconstruction. We demonstrate the biological potential of our network-driven approach by analyzing transcriptome profiles of aldosterone producing adenomas that have been obtained from a cohort of Primary Aldosteronism patients. We unravel systematics in the data that could not have been resolved by conventional microarray data analysis. In particular, we discover two distinct metabolic states in the adenoma expression patterns. CONCLUSIONS The methodology presented here can help understand metabolic inconsistencies from a network perspective. It thus serves as a mediator between the topology of metabolic systems and their dynamical function. Finally, we demonstrate how physiologically relevant insights into the structure and dynamics of metabolic networks can be obtained using this novel approach.
Affiliation(s)
- Nikolaus Sonnenschein
- School of Engineering and Science, Jacobs University Bremen, Campus Ring 1, 28759 Bremen, Germany.
59
Stein RR, Isambert H. Logistic map analysis of biomolecular network evolution. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2011; 84:051904. [PMID: 22181441 DOI: 10.1103/physreve.84.051904] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 02/03/2011] [Revised: 08/22/2011] [Indexed: 05/31/2023]
Abstract
We study the expansion of biomolecular networks from the view point of first evolutionary principles based on the duplication and divergence of ancestral genes. The expansion of gene families and subnetworks is analyzed in terms of logistic map compositions, which capture the varying functional constraints of individual genes in the course of evolution. Using a mean-field approach, we then demonstrate the existence of spontaneous growth-rate variations between gene families and discuss the relevance of such heterogeneous expansions for the emergent properties of actual biomolecular networks.
Affiliation(s)
- R R Stein
- Institut Curie, CNRS-UMR168, UPMC, Paris, France
60
Grilli J, Bassetti B, Maslov S, Cosentino Lagomarsino M. Joint scaling laws in functional and evolutionary categories in prokaryotic genomes. Nucleic Acids Res 2011; 40:530-40. [PMID: 21937509 PMCID: PMC3258127 DOI: 10.1093/nar/gkr711] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Indexed: 11/13/2022] Open
Abstract
We propose and study a class-expansion/innovation/loss model of genome evolution taking into account biological roles of genes and their constituent domains. In our model, numbers of genes in different functional categories are coupled to each other. For example, an increase in the number of metabolic enzymes in a genome is usually accompanied by addition of new transcription factors regulating these enzymes. Such coupling can be thought of as a proportional ‘recipe’ for genome composition of the type ‘a spoonful of sugar for each egg yolk’. The model jointly reproduces two known empirical laws: the distribution of family sizes and the non-linear scaling of the number of genes in certain functional categories (e.g. transcription factors) with genome size. In addition, it allows us to derive a novel relation between the exponents characterizing these two scaling laws, establishing a direct quantitative connection between evolutionary and functional categories. It predicts that functional categories that grow faster-than-linearly with genome size to be characterized by flatter-than-average family size distributions. This relation is confirmed by our bioinformatics analysis of prokaryotic genomes. This proves that the joint quantitative trends of functional and evolutionary classes can be understood in terms of evolutionary growth with proportional recipes.
Affiliation(s)
- J Grilli
- Dipartimento di Fisica, Università degli Studi di Milano, Milano, Italy
61
Nam H, Conrad T, Lewis NE. The role of cellular objectives and selective pressures in metabolic pathway evolution. Curr Opin Biotechnol 2011; 22:595-600. [PMID: 21481583 PMCID: PMC3173765 DOI: 10.1016/j.copbio.2011.03.006] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.5] [Received: 02/16/2011] [Accepted: 03/21/2011] [Indexed: 01/28/2023]
Abstract
Evolution results from molecular-level changes in an organism, thereby producing novel phenotypes and, eventually novel species. However, changes in a single gene can lead to significant changes in biomolecular networks through the gain and loss of many molecular interactions. Thus, significant insights into microbial evolution have been gained through the analysis and comparison of reconstructed metabolic networks. However, challenges remain from reconstruction incompleteness and the inability to experiment with evolution on the timescale necessary for new species to arise. Despite these challenges, experimental laboratory evolution of microbes has provided some insights into the cellular objectives underlying evolution, under the constraints of nutrient availability and the use of mechanisms that protect cells from extreme conditions.
Affiliation(s)
- Hojung Nam
- Department of Bioengineering, University of California, San Diego, La Jolla, California, USA
- Tom Conrad
- Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, California, USA
- Nathan E. Lewis
- Department of Bioengineering, University of California, San Diego, La Jolla, California, USA
62
Pang TY, Maslov S. A toolbox model of evolution of metabolic pathways on networks of arbitrary topology. PLoS Comput Biol 2011; 7:e1001137. [PMID: 21625566 PMCID: PMC3098196 DOI: 10.1371/journal.pcbi.1001137] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.3] [Received: 09/20/2010] [Accepted: 04/14/2011] [Indexed: 11/19/2022] Open
Abstract
In prokaryotic genomes the number of transcriptional regulators is known to be proportional to the square of the total number of protein-coding genes. A toolbox model of evolution was recently proposed to explain this empirical scaling for metabolic enzymes and their regulators. According to its rules, the metabolic network of an organism evolves by horizontal transfer of pathways from other species. These pathways are part of a larger “universal” network formed by the union of all species-specific networks. It remained to be understood, however, how the topological properties of this universal network influence the scaling law of functional content of genomes in the toolbox model. Here we answer this question by first analyzing the scaling properties of the toolbox model on arbitrary tree-like universal networks. We prove that critical branching topology, in which the average number of upstream neighbors of a node is equal to one, is both necessary and sufficient for quadratic scaling. We further generalize the rules of the model to incorporate reactions with multiple substrates/products as well as branched and cyclic metabolic pathways. To achieve its metabolic tasks, the new model employs evolutionary optimized pathways with minimal number of reactions. Numerical simulations of this realistic model on the universal network of all reactions in the KEGG database produced approximately quadratic scaling between the number of regulated pathways and the size of the metabolic network. To quantify the geometrical structure of individual pathways, we investigated the relationship between their number of reactions, byproducts, intermediate, and feedback metabolites. Our results validate and explain the ubiquitous appearance of the quadratic scaling for a broad spectrum of topologies of underlying universal metabolic networks. They also demonstrate why, in spite of “small-world” topology, real-life metabolic networks are characterized by a broad distribution of pathway lengths and sizes of metabolic regulons in regulatory networks.
It has been previously reported that in prokaryotic genomes the number of transcriptional regulators is proportional to the square of the total number of genes. We recently offered a general explanation of this empirical powerlaw scaling in terms of the “toolbox” model in which metabolic and regulatory networks co-evolve together. This evolution is driven by horizontal gene transfer of co-regulated metabolic pathways from other species. These pathways are part of a larger “universal” network formed by the union of all species-specific networks. In the present work we address the question of how topological properties of this universal network influence the powerlaw scaling of regulators in the toolbox model. We also generalize its rules to include reactions with multiple substrates and products, branched and cyclic metabolic pathways, and to account for optimality of metabolic pathways. The main conclusion of our analytical and numerical modeling efforts is that the quadratic scaling is the robust feature of the toolbox model in a broad range of universal network topologies. They also demonstrate why, in spite of “small-world” topology, real-life metabolic networks are characterized by a broad distribution of pathway lengths and sizes of regulons in regulatory networks.
Affiliation(s)
- Tin Yau Pang
- Department of Condensed Matter Physics and Materials Science, Brookhaven National Laboratory, Upton, New York, United States of America
- Department of Physics and Astronomy, Stony Brook University, Stony Brook, New York, United States of America
- Sergei Maslov
- Department of Condensed Matter Physics and Materials Science, Brookhaven National Laboratory, Upton, New York, United States of America
63
Kumar M, Balaji PV. Comparative genomics analysis of completely sequenced microbial genomes reveals the ubiquity of N-linked glycosylation in prokaryotes. MOLECULAR BIOSYSTEMS 2011; 7:1629-45. [PMID: 21387023 DOI: 10.1039/c0mb00259c] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 11/21/2022]
Abstract
Glycosylation of proteins in prokaryotes has been known for the last few decades. Glycan structures and/or the glycosylation pathways have been experimentally characterized in only a small number of prokaryotes. Even this has become possible only during the last decade or so, primarily due to technological and methodological developments. Glycosylated proteins are diverse in their function and localization. Glycosylation has been shown to be associated with a wide range of biological phenomena. Characterization of the various types of glycans and the glycosylation machinery is critical to understand such processes. Such studies can help in the identification of novel targets for designing drugs, diagnostics, and engineering of therapeutic proteins. In view of this, the experimentally characterized pgl system of Campylobacter jejuni, responsible for N-linked glycosylation, has been used in this study to identify glycosylation loci in 865 prokaryotes whose genomes have been completely sequenced. Results from the present study show that only a small number of organisms have homologs for all the pgl enzymes and a few others have homologs for none of the pgl enzymes. Most of the organisms have homologs for only a subset of the pgl enzymes. There is no specific pattern for the presence or absence of pgl homologs vis-à-vis the 16S rRNA sequence-based phylogenetic tree. This may be due to differences in the glycan structures, high sequence divergence, horizontal gene transfer or non-orthologous gene displacement. Overall, the presence of homologs for pgl enzymes in a large number of organisms irrespective of their habitat, pathogenicity, energy generation mechanism, etc., hints towards the ubiquity of N-linked glycosylation in prokaryotes.
Affiliation(s)
- Manjeet Kumar
- Department of Biosciences and Bioengineering, Indian Institute of Technology Bombay, Powai, Mumbai 400 076, India
64
Bernhardsson S, Gerlee P, Lizana L. Structural correlations in bacterial metabolic networks. BMC Evol Biol 2011; 11:20. [PMID: 21251250 PMCID: PMC3033826 DOI: 10.1186/1471-2148-11-20] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Received: 10/28/2010] [Accepted: 01/20/2011] [Indexed: 11/22/2022] Open
Abstract
Background Evolution of metabolism occurs through the acquisition and loss of genes whose products act as enzymes in metabolic reactions, and from a presumably simple primordial metabolism the organisms living today have evolved complex and highly variable metabolisms. We have studied this phenomenon by comparing the metabolic networks of 134 bacterial species with known phylogenetic relationships, and by studying a neutral model of metabolic network evolution. Results We consider the 'union-network' of 134 bacterial metabolisms, and also the union of two smaller subsets of closely related species. Each reaction-node is tagged with the number of organisms it belongs to, which we denote organism degree (OD), a key concept in our study. Network analysis shows that common reactions are found at the centre of the network and that the average OD decreases as we move to the periphery. Nodes of the same OD are also more likely to be connected to each other compared to a random OD relabelling based on their occurrence in the real data. This trend persists up to a distance of around five reactions. A simple growth model of metabolic networks is used to investigate the biochemical constraints put on metabolic-network evolution. Despite this seemingly drastic simplification, a 'union-network' of a collection of unrelated model networks, free of any selective pressure, still exhibits structural features similar to those of its bacterial counterpart. Conclusions The OD distribution quantifies topological properties of the evolutionary history of bacterial metabolic networks, and lends additional support to the importance of horizontal gene transfer during bacterial metabolic evolution where new reactions are attached at the periphery of the network. The neutral model of metabolic network growth can reproduce the main features of real networks, but we observe that the real networks contain a smaller common core, while they are more similar at the periphery of the network. This suggests that natural selection and biochemical correlations can act both to diversify and to narrow down metabolic evolution.
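The organism degree (OD) defined in this abstract, the number of organisms whose network contains a given reaction, can be sketched as a simple count over a union network. The function name `organism_degree`, the input layout, and the toy organisms and reaction labels are illustrative assumptions, not data or code from the study.

```python
from collections import Counter

def organism_degree(networks):
    """Tag each reaction with the number of organisms that contain it.

    networks maps an organism name to an iterable of its reaction identifiers.
    Hypothetical helper written for illustration only.
    """
    counts = Counter()
    for reactions in networks.values():
        # set() ensures a reaction is counted once per organism.
        counts.update(set(reactions))
    return dict(counts)

# Toy union network of three made-up organisms (illustrative data only).
toy_networks = {
    "org_a": ["r1", "r2", "r3"],
    "org_b": ["r1", "r2"],
    "org_c": ["r1", "r4"],
}
od = organism_degree(toy_networks)
print(od)
```

Here `r1` would sit in the common core (OD equal to the number of organisms), while `r3` and `r4` have OD 1, the kind of reaction the abstract places at the periphery of the network.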
Affiliation(s)
- Sebastian Bernhardsson
- Center for Models of Life, Niels Bohr Institute, Blegdamsvej 17 DK-2100 Copenhagen Ø, Denmark.
65
Abstract
Escherichia coli exhibits a wide range of lifestyles encompassing commensalism and various pathogenic behaviors which its highly dynamic genome contributes to develop. How environmental and host factors shape the genetic structure of E. coli strains remains, however, largely unknown. Following a previous study of E. coli genomic diversity, we investigated its diversity at the metabolic level by building and analyzing the genome-scale metabolic networks of 29 E. coli strains (8 commensal and 21 pathogenic strains, including 6 Shigella strains). Using a tailor-made reconstruction strategy, we significantly improved the completeness and accuracy of the metabolic networks over default automatic reconstruction processes. Among the 1,545 reactions forming E. coli panmetabolism, 885 reactions were common to all strains. This high proportion of core reactions (57%) was found to be in sharp contrast to the low proportion (13%) of core genes in the E. coli pangenome, suggesting less diversity of metabolic functions compared to that of all gene functions. Core reactions were significantly overrepresented among biosynthetic reactions compared to the more variable degradation processes. Differences between metabolic networks were found to follow E. coli phylogeny rather than pathogenic phenotypes, except for Shigella networks, which were significantly more distant from the others. This suggests that most metabolic changes in non-Shigella strains were not driven by their pathogenic phenotypes. Using a supervised method, we were yet able to identify small sets of reactions related to pathogenicity or commensalism. The quality of our reconstructed networks also makes them reliable bases for building metabolic models.
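The core proportion quoted in this abstract follows from the two counts it gives (885 core reactions out of 1,545 in the panmetabolism). The sketch below shows one way such a proportion could be computed from per-strain reaction sets; the function name `core_fraction` and the toy strain data are assumptions for illustration and do not reproduce the study's pipeline.

```python
def core_fraction(strain_reactions):
    """Return (core, pan, fraction) for a mapping of strain -> reaction set.

    core: reactions present in every strain; pan: reactions present in any.
    Hypothetical helper written for illustration only.
    """
    sets = [set(reactions) for reactions in strain_reactions.values()]
    core = set.intersection(*sets)
    pan = set.union(*sets)
    return core, pan, len(core) / len(pan)

# Toy data for three made-up strains (illustrative only).
toy_strains = {
    "strain_a": {"r1", "r2", "r3"},
    "strain_b": {"r1", "r2", "r4"},
    "strain_c": {"r1", "r2"},
}
core, pan, frac = core_fraction(toy_strains)
print(sorted(core), sorted(pan), frac)

# Check of the abstract's own figures: 885 of 1,545 reactions are core.
reported_percent = round(100 * 885 / 1545)
print(reported_percent)
```

The final lines only redo the arithmetic from the counts stated in the abstract, which rounds to the reported 57%.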
66
Schütte M, Skupin A, Segrè D, Ebenhöh O. Modeling the complex dynamics of enzyme-pathway coevolution. CHAOS (WOODBURY, N.Y.) 2010; 20:045115. [PMID: 21198127 DOI: 10.1063/1.3530440] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 05/30/2023]
Abstract
Metabolic pathways must have coevolved with the corresponding enzyme gene sequences. However, the evolutionary dynamics ensuing from the interplay between metabolic networks and genomes is still poorly understood. Here, we present a computational model that generates putative evolutionary walks on the metabolic network using a parallel evolution of metabolic reactions and their catalyzing enzymes. Starting from an initial set of compounds and enzymes, we expand the metabolic network iteratively by adding new enzymes with a probability that depends on their sequence-based similarity to already present enzymes. Thus, we obtain simulated time courses of chemical evolution in which we can monitor the appearance of new metabolites, enzyme sequences, or even entire organisms. We observe that new enzymes do not appear gradually but rather in clusters which correspond to enzyme classes. A comparison with Brownian motion dynamics indicates that our system displays biased random walks similar to diffusion on the metabolic network with long-range correlations. This suggests that a quantitative molecular principle may underlie the appearance of punctuated equilibrium dynamics, whereby enzymes occur in bursts rather than by phyletic gradualism. Moreover, the simulated time courses lead to a putative time-order of enzyme and organism appearance. Among the patterns we detect in these evolutionary trends is a significant correlation between the time of appearance and their enzyme repertoire size. Hence, our approach to metabolic evolution may help understand the rise in complexity at the biochemical and genomic levels.
Affiliation(s)
- Moritz Schütte
- Max Planck Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476 Potsdam-Golm, Germany
67
Charoensawan V, Wilson D, Teichmann SA. Genomic repertoires of DNA-binding transcription factors across the tree of life. Nucleic Acids Res 2010; 38:7364-77. [PMID: 20675356 PMCID: PMC2995046 DOI: 10.1093/nar/gkq617] [Citation(s) in RCA: 116] [Impact Index Per Article: 7.7] [Received: 03/09/2010] [Revised: 06/22/2010] [Accepted: 06/25/2010] [Indexed: 11/14/2022] Open
Abstract
Sequence-specific transcription factors (TFs) are important to genetic regulation in all organisms because they recognize and directly bind to regulatory regions on DNA. Here, we survey and summarize the TF resources available. We outline the organisms for which TF annotation is provided, and discuss the criteria and methods used to annotate TFs by different databases. By using genomic TF repertoires from ∼700 genomes across the tree of life, covering Bacteria, Archaea and Eukaryota, we review TF abundance with respect to the number of genes, as well as their structural complexity in diverse lineages. While typical eukaryotic TFs are longer than the average eukaryotic proteins, the inverse is true for prokaryotes. Only in eukaryotes does the same family of DNA-binding domain (DBD) occur multiple times within one polypeptide chain. This potentially increases the length and diversity of DNA-recognition sequence by reusing DBDs from the same family. We examined the increase in TF abundance with the number of genes in genomes, using the largest set of prokaryotic and eukaryotic genomes to date. As pointed out before, prokaryotic TFs increase faster than linearly. We further observe a similar relationship in eukaryotic genomes with a slower increase in TFs.
|
68
|
Beslon G, Parsons D, Sanchez-Dehesa Y, Peña JM, Knibbe C. Scaling laws in bacterial genomes: A side-effect of selection of mutational robustness? Biosystems 2010; 102:32-40. [DOI: 10.1016/j.biosystems.2010.07.009] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 05/04/2010] [Accepted: 07/15/2010] [Indexed: 11/25/2022]
|
69
|
Evolution of gene regulatory networks by fluctuating selection and intrinsic constraints. PLoS Comput Biol 2010; 6. [PMID: 20700492 PMCID: PMC2916849 DOI: 10.1371/journal.pcbi.1000873] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Received: 02/26/2009] [Accepted: 06/30/2010] [Indexed: 11/23/2022] Open
Abstract
Various characteristics of complex gene regulatory networks (GRNs) have been discovered during the last decade, e.g., redundancy, exponential indegree distributions, scale-free outdegree distributions, mutational robustness, and evolvability. Although progress has been made in this field, it is not well understood whether these characteristics are the direct products of selection or those of other evolutionary forces such as mutational biases and biophysical constraints. To elucidate the causal factors that promoted the evolution of complex GRNs, we examined the effect of fluctuating environmental selection and some intrinsic constraining factors on GRN evolution by using an individual-based model. We found that the evolution of complex GRNs is remarkably promoted by fixation of beneficial gene duplications under unpredictably fluctuating environmental conditions and that some internal factors inherent in organisms, such as mutational bias, gene expression costs, and constraints on expression dynamics, are also important for the evolution of GRNs. The results indicate that various biological properties observed in GRNs could evolve as a result of not only adaptation to unpredictable environmental changes but also non-adaptive processes owing to the properties of the organisms themselves. Our study emphasizes that evolutionary models considering such intrinsic constraining factors should be used as null models to analyze the effect of selection on GRN evolution.
Author summary
Various organismal traits, including the morphology of multicellular species and metabolism in unicellular species, are determined by the amount and combinations of proteins in the cell. The complex regulatory network plays an important role in controlling the protein profiles in a cell. Recent studies have revealed that gene regulatory networks have many interesting structural and mutational features such as their scale-free structure, mutational robustness, and evolvability. However, why and how these features have emerged from evolution is unknown. In this paper, we constructed an evolutionary model of gene regulatory networks and simulated its evolution under various environmental conditions. The results show that most features of known gene regulatory networks evolve as a result of adaptation to unpredictable environmental fluctuations. In addition, some internal organismal factors, such as mutational bias, gene expression costs, and constraints on expression dynamics, are also important for GRN evolution observed in real organisms. Thus, these GRN features appear to evolve as a result of not only adaptation to unpredictable environmental changes but also non-adaptive processes owing to the properties of the organisms themselves.
|
70
|
Abstract
Multiple constraints variously affect different parts of the genomes of diverse life forms. The selective pressures that shape the evolution of viral, archaeal, bacterial and eukaryotic genomes differ markedly, even among relatively closely related animal and bacterial lineages; by contrast, constraints affecting protein evolution seem to be more universal. The constraints that shape the evolution of genomes and phenomes are complemented by the plasticity and robustness of genome architecture, expression and regulation. Taken together, these findings are starting to reveal complex networks of evolutionary processes that must be integrated to attain a new synthesis of evolutionary biology.
Affiliation(s)
- Eugene V Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA.
|
71
|
Ordered structure of the transcription network inherited from the yeast whole-genome duplication. BMC Syst Biol 2010; 4:77. [PMID: 20525287 PMCID: PMC2900227 DOI: 10.1186/1752-0509-4-77] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Received: 01/12/2010] [Accepted: 06/03/2010] [Indexed: 01/07/2023]
Abstract
Background: Gene duplication, a major evolutionary path to genomic innovation, can occur at the scale of an entire genome. One such "whole-genome duplication" (WGD) event among the Ascomycota fungi gave rise to genes with distinct biological properties compared to small-scale duplications.
Results: We studied the evolution of transcriptional interactions of whole-genome duplicates, to understand how they are wired into the yeast regulatory system. Our work combines network analysis and modeling of the large-scale structure of the interactions stemming from the WGD.
Conclusions: The results uncover the WGD as a major source for the evolution of a complex interconnected block of transcriptional pathways. The inheritance of interactions among WGD duplicates follows elementary "duplication subgraphs", relating ancestral interactions with newly formed ones. Duplication subgraphs are correlated with their neighbours and give rise to higher order circuits with two elementary properties: newly formed transcriptional pathways remain connected (paths are not broken), and are preferentially cross-connected with ancestral ones. The result is a coherent and connected "WGD-network", where duplication subgraphs are arranged in an astonishingly ordered configuration.
|
72
|
Evolution of the RpoS regulon: origin of RpoS and the conservation of RpoS-dependent regulation in bacteria. J Mol Evol 2010; 70:557-71. [PMID: 20506020 DOI: 10.1007/s00239-010-9352-0] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.5] [Received: 03/05/2010] [Accepted: 05/03/2010] [Indexed: 10/19/2022]
Abstract
The RpoS sigma factor in proteobacteria regulates genes in stationary phase and in response to stress. Although of conserved function, the RpoS regulon may have different gene composition across species due to high genomic diversity and to known environmental conditions that select for RpoS mutants. In this study, the distribution of RpoS homologs in prokaryotes and the differential dependence of regulon members on RpoS for expression in two gamma-proteobacteria (Escherichia coli and Pseudomonas aeruginosa) were examined. Using a maximum-likelihood phylogeny and reciprocal best hits analysis, we show that the RpoS sigma factor is conserved within gamma-, beta-, and delta-proteobacteria. Annotated RpoS of Borrelia and the enteric RpoS are postulated to have separate evolutionary origins. To determine the conservation of RpoS-dependent gene expression across species, reciprocal best hits analysis was used to identify orthologs of the E. coli RpoS regulon in the RpoS regulon of P. aeruginosa. Of the 186 RpoS-dependent genes of E. coli, 50 proteins have an ortholog within the P. aeruginosa genome. Twelve genes of the 50 orthologs are RpoS-dependent in both species, and at least four genes are regulated by RpoS in other gamma-proteobacteria. Despite RpoS conservation in gamma-, beta-, and delta-proteobacteria, RpoS regulon composition is subject to modification between species. Environmental selection for RpoS mutants likely contributes to the evolutionary divergence and specialization of the RpoS regulon within different bacterial genomes.
|
73
|
Comparing genomes to computer operating systems in terms of the topology and evolution of their regulatory control networks. Proc Natl Acad Sci U S A 2010; 107:9186-91. [PMID: 20439753 DOI: 10.1073/pnas.0914771107] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.7] [Indexed: 11/18/2022] Open
Abstract
The genome has often been called the operating system (OS) for a living organism. A computer OS is described by a regulatory control network termed the call graph, which is analogous to the transcriptional regulatory network in a cell. To apply our firsthand knowledge of the architecture of software systems to understand cellular design principles, we present a comparison between the transcriptional regulatory network of a well-studied bacterium (Escherichia coli) and the call graph of a canonical OS (Linux) in terms of topology and evolution. We show that both networks have a fundamentally hierarchical layout, but there is a key difference: The transcriptional regulatory network possesses a few global regulators at the top and many targets at the bottom; conversely, the call graph has many regulators controlling a small set of generic functions. This top-heavy organization leads to highly overlapping functional modules in the call graph, in contrast to the relatively independent modules in the regulatory network. We further develop a way to measure evolutionary rates comparably between the two networks and explain this difference in terms of network evolution. The process of biological evolution via random mutation and subsequent selection tightly constrains the evolution of regulatory network hubs. The call graph, however, exhibits rapid evolution of its highly connected generic components, made possible by designers' continual fine-tuning. These findings stem from the design principles of the two systems: robustness for biological systems and cost effectiveness (reuse) for software systems.
|
74
|
Riehl WJ, Krapivsky PL, Redner S, Segrè D. Signatures of arithmetic simplicity in metabolic network architecture. PLoS Comput Biol 2010; 6:e1000725. [PMID: 20369010 PMCID: PMC2848538 DOI: 10.1371/journal.pcbi.1000725] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Received: 10/20/2009] [Accepted: 02/26/2010] [Indexed: 11/19/2022] Open
Abstract
Metabolic networks perform some of the most fundamental functions in living cells, including energy transduction and building block biosynthesis. While these are the best characterized networks in living systems, understanding their evolutionary history and complex wiring constitutes one of the most fascinating open questions in biology, intimately related to the enigma of life's origin itself. Is the evolution of metabolism subject to general principles, beyond the unpredictable accumulation of multiple historical accidents? Here we search for such principles by applying to an artificial chemical universe some of the methodologies developed for the study of genome scale models of cellular metabolism. In particular, we use metabolic flux constraint-based models to exhaustively search for artificial chemistry pathways that can optimally perform an array of elementary metabolic functions. Despite the simplicity of the model employed, we find that the ensuing pathways display a surprisingly rich set of properties, including the existence of autocatalytic cycles and hierarchical modules, the appearance of universally preferable metabolites and reactions, and a logarithmic trend of pathway length as a function of input/output molecule size. Some of these properties can be derived analytically, borrowing methods previously used in cryptography. In addition, by mapping biochemical networks onto a simplified carbon atom reaction backbone, we find that properties similar to those predicted for the artificial chemistry hold also for real metabolic networks. These findings suggest that optimality principles and arithmetic simplicity might lie beneath some aspects of biochemical complexity.
Author summary
Metabolism is the network of biochemical reactions that transforms available resources (“inputs”) into energy currency and building blocks (“outputs”). Different organisms have different assortments of metabolic pathways and input/output requirements, reflecting their adaptation to specific environments, and to specific strategies for reproduction and survival. Here we ask whether, beneath the intricate wiring of these networks, it is possible to discern signatures of optimal (i.e., shortest and maximally efficient) pathway architectures. A systematic search for such optimal pathways between all possible pairs of input and output molecules in real organic chemistry is computationally intractable. However, we can implement such a search in a simple artificial chemistry, which roughly resembles a single atom (e.g., carbon) version of real biochemistry. We find that optimal pathways in our idealized chemistry display a logarithmic dependence of pathway length on input/output molecule size. They also display recurring topologies, including autocatalytic cycles reminiscent of ancient and highly conserved cores of real biochemistry. Finally, across all optimal pathways, we identify universally important metabolites and reactions, as well as a characteristic distribution of reaction utilization. Similar features can be observed in real metabolic networks, suggesting that arithmetic simplicity may lie beneath some aspects of biochemical complexity.
Affiliation(s)
- William J. Riehl
- Program in Bioinformatics and Systems Biology, Boston University, Boston, Massachusetts, United States of America
- Paul L. Krapivsky
- Department of Physics, Boston University, Boston, Massachusetts, United States of America
- Sidney Redner
- Department of Physics, Boston University, Boston, Massachusetts, United States of America
- Daniel Segrè
- Program in Bioinformatics and Systems Biology, Boston University, Boston, Massachusetts, United States of America
- Department of Biology, Boston University, Boston, Massachusetts, United States of America
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, United States of America
|
75
|
Angelini A, Amato A, Bianconi G, Bassetti B, Cosentino Lagomarsino M. Mean-field methods in evolutionary duplication-innovation-loss models for the genome-level repertoire of protein domains. Phys Rev E Stat Nonlin Soft Matter Phys 2010; 81:021919. [PMID: 20365607 DOI: 10.1103/physreve.81.021919] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 10/22/2009] [Revised: 01/22/2010] [Indexed: 05/29/2023]
Abstract
We present a combined mean-field and simulation approach to different models describing the dynamics of classes formed by elements that can appear, disappear, or copy themselves. These models, related to a paradigm duplication-innovation model known as Chinese restaurant process, are devised to reproduce the scaling behavior observed in the genome-wide repertoire of protein domains of all known species. In view of these data, we discuss the qualitative and quantitative differences of the alternative model formulations, focusing in particular on the roles of element loss and of the specificity of empirical domain classes.
Affiliation(s)
- A Angelini
- Dipartimento di Fisica, Università degli Studi di Milano, Via Celoria 16, 20133 Milano, Italy
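The abstract above refers to the Chinese restaurant process as a paradigm duplication-innovation model. As a point of reference, here is a minimal sketch of the textbook one-parameter process: each new element either founds a new class (innovation) with probability θ/(n + θ) or joins an existing class with probability proportional to its size (duplication). The function name and the parameter value are assumptions for illustration; the variants analyzed in the paper, in particular those with element loss, are not reproduced here.

```python
import random

def chinese_restaurant_process(n_elements, theta, seed=0):
    """Simulate the standard one-parameter CRP; returns the list of class sizes."""
    rng = random.Random(seed)
    class_sizes = []
    for n in range(n_elements):  # n = number of elements already placed
        if rng.random() < theta / (n + theta):
            # Innovation: the new element starts its own class.
            class_sizes.append(1)
        else:
            # Duplication: pick an existing element uniformly at random,
            # which selects its class with probability size / n.
            r = rng.randrange(n)
            cumulative = 0
            for i, size in enumerate(class_sizes):
                cumulative += size
                if r < cumulative:
                    class_sizes[i] += 1
                    break
    return class_sizes

sizes = chinese_restaurant_process(1000, theta=2.0)
print(len(sizes), "classes; largest has", max(sizes), "elements")
```

Because innovation becomes rarer as the total count grows, the number of classes increases only slowly (roughly logarithmically in this one-parameter form) while a few classes become very large.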
|
76
|
Koonin EV, Wolf YI. Is evolution Darwinian or/and Lamarckian? Biol Direct 2009; 4:42. [PMID: 19906303 PMCID: PMC2781790 DOI: 10.1186/1745-6150-4-42] [Citation(s) in RCA: 159] [Impact Index Per Article: 9.9] [Received: 11/02/2009] [Accepted: 11/11/2009] [Indexed: 12/15/2022] Open
Abstract
Background: The year 2009 is the 200th anniversary of the publication of Jean-Baptiste Lamarck's Philosophie Zoologique and the 150th anniversary of Charles Darwin's On the Origin of Species. Lamarck believed that evolution is driven primarily by non-randomly acquired, beneficial phenotypic changes, in particular, those directly affected by the use of organs, which Lamarck believed to be inheritable. In contrast, Darwin assigned a greater importance to random, undirected change that provided material for natural selection.
The concept: The classic Lamarckian scheme appears untenable owing to the non-existence of mechanisms for direct reverse engineering of adaptive phenotypic characters acquired by an individual during its life span into the genome. However, various evolutionary phenomena that came to the fore in the last few years seem to fit a more broadly interpreted (quasi)Lamarckian paradigm. The prokaryotic CRISPR-Cas system of defense against mobile elements seems to function via a bona fide Lamarckian mechanism, namely, by integrating small segments of viral or plasmid DNA into specific loci in the host prokaryote genome and then utilizing the respective transcripts to destroy the cognate mobile element DNA (or RNA). A similar principle seems to be employed in the piRNA branch of RNA interference which is involved in defense against transposable elements in the animal germ line. Horizontal gene transfer (HGT), a dominant evolutionary process, at least in prokaryotes, appears to be a form of (quasi)Lamarckian inheritance. The rate of HGT and the nature of acquired genes depend on the environment of the recipient organism and, in some cases, the transferred genes confer a selective advantage for growth in that environment, meeting the Lamarckian criteria. Various forms of stress-induced mutagenesis are tightly regulated and comprise a universal adaptive response to environmental stress in cellular life forms. Stress-induced mutagenesis can be construed as a quasi-Lamarckian phenomenon because the induced genomic changes, although random, are triggered by environmental factors and are beneficial to the organism.
Conclusion: Both Darwinian and Lamarckian modalities of evolution appear to be important, and reflect different aspects of the interaction between populations and the environment.
Reviewers: This article was reviewed by Juergen Brosius, Valerian Dolja, and Martijn Huynen. For complete reports, see the Reviewers' reports section.
Affiliation(s)
- Eugene V Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.
|
77
|
Evolution of biomolecular networks: lessons from metabolic and protein interactions. Nat Rev Mol Cell Biol 2009; 10:791-803. [PMID: 19851337 DOI: 10.1038/nrm2787] [Citation(s) in RCA: 148] [Impact Index Per Article: 9.3] [Indexed: 12/14/2022]
Abstract
Despite only becoming popular at the beginning of this decade, biomolecular networks are now frameworks that facilitate many discoveries in molecular biology. The nodes of these networks are usually proteins (specifically enzymes in metabolic networks), whereas the links (or edges) are their interactions with other molecules. These networks are made up of protein-protein interactions or enzyme-enzyme interactions through shared metabolites in the case of metabolic networks. Evolutionary analysis has revealed that changes in the nodes and links in protein-protein interaction and metabolic networks are subject to different selection pressures owing to distinct topological features. However, many evolutionary constraints can be uncovered only if temporal and spatial aspects are included in the network analysis.
|
78
|
Isambert H, Stein RR. On the need for widespread horizontal gene transfers under genome size constraint. Biol Direct 2009; 4:28. [PMID: 19703318 PMCID: PMC2740843 DOI: 10.1186/1745-6150-4-28] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Received: 08/19/2009] [Accepted: 08/25/2009] [Indexed: 11/20/2022] Open
Abstract
Background: While eukaryotes primarily evolve by duplication-divergence expansion (and reduction) of their own gene repertoire with only rare horizontal gene transfers, prokaryotes appear to evolve under both gene duplications and widespread horizontal gene transfers over long evolutionary time scales. But the evolutionary origin of this striking difference in the importance of horizontal gene transfers remains by and large a mystery.
Hypothesis: We propose that the abundance of horizontal gene transfers in free-living prokaryotes is a simple but necessary consequence of two opposite effects: i) their apparent genome size constraint compared to typical eukaryote genomes and ii) their underlying genome expansion dynamics through gene duplication-divergence evolution, as demonstrated by the presence of many tandem and block repeated genes. In principle, this combination of genome size constraint and underlying duplication expansion should lead to a coalescent-like process with extensive turnover of functional genes. This would, however, imply the unlikely, systematic reinvention of functions from discarded genes within independent phylogenetic lineages. Instead, we propose that the long-term evolutionary adaptation of free-living prokaryotes must have resulted in the emergence of efficient non-phylogenetic pathways to circumvent gene loss.
Implications: This need for widespread horizontal gene transfers due to genome size constraint implies, in particular, that prokaryotes must remain under strong selection pressure in order to maintain the long-term evolutionary adaptation of their "mutualized" gene pool, beyond the inevitable turnover of individual prokaryote species. By contrast, the absence of genome size constraint for typical eukaryotes has presumably relaxed their need for widespread horizontal gene transfers and strong selection pressure. Yet, the resulting loss of genetic functions, due to weak selection pressure and inefficient gene recovery mechanisms, must have ultimately favored the emergence of more complex life styles and ecological integration of many eukaryotes.
Reviewers: This article was reviewed by Pierre Pontarotti, Eugene V Koonin and Sergei Maslov.
Affiliation(s)
- Hervé Isambert
- Institut Curie, CNRS UMR168, 11 rue P. & M. Curie, 75005 Paris, France.
|