101
|
Pathways of Genetic Code Evolution in Ancient and Modern Organisms. J Mol Evol 2015; 80:229-43. [DOI: 10.1007/s00239-015-9686-8] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.6] [Received: 04/02/2015] [Accepted: 06/03/2015] [Indexed: 10/23/2022]
|
102
|
Carels N, Ponce de Leon M. An Interpretation of the Ancestral Codon from Miller's Amino Acids and Nucleotide Correlations in Modern Coding Sequences. Bioinform Biol Insights 2015; 9:37-47. [PMID: 25922573 PMCID: PMC4401237 DOI: 10.4137/bbi.s24021] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 01/20/2015] [Revised: 03/08/2015] [Accepted: 03/13/2015] [Indexed: 12/31/2022] [Open access]
Abstract
Purine bias, which is usually referred to as an “ancestral codon”, is known to result in short-range correlations between nucleotides in coding sequences, and it is common in all species. We demonstrate that RWY is a more appropriate pattern than the classical RNY, and purine bias (Rrr) is the product of a network of nucleotide compensations induced by functional constraints on the physicochemical properties of proteins. Through deductions from universal correlation properties, we also demonstrate that amino acids from Miller’s spark discharge experiment are compatible with functional primeval proteins at the dawn of living cell radiation on earth. These amino acids match the hydropathy and secondary structures of modern proteins.
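The two codon patterns compared in this abstract can be illustrated with a minimal sketch, assuming the standard IUPAC ambiguity codes (R = A/G, W = A/T, Y = C/T, N = any base). The function names and the toy sequence are hypothetical and are not taken from the paper.

```python
# IUPAC ambiguity codes used by the RNY and RWY patterns (standard definitions).
IUPAC = {"R": "AG", "Y": "CT", "W": "AT", "N": "ACGT"}


def codons(seq):
    """Split a coding sequence into in-frame codons, dropping any trailing partial codon."""
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def matches(codon, pattern):
    """True if every base of the codon is allowed by the corresponding pattern letter."""
    return all(base in IUPAC[p] for base, p in zip(codon, pattern))


def pattern_fraction(seq, pattern):
    """Fraction of in-frame codons that fit a three-letter IUPAC pattern."""
    cs = codons(seq)
    if not cs:
        return 0.0
    return sum(matches(c, pattern) for c in cs) / len(cs)


# Hypothetical toy sequence, for illustration only.
toy = "ATGGCTGACGTCAACGGT"
rny = pattern_fraction(toy, "RNY")
rwy = pattern_fraction(toy, "RWY")
```

Because W is a subset of N, every RWY codon is also an RNY codon, so the RWY fraction can never exceed the RNY fraction for the same sequence.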
Affiliation(s)
- Nicolas Carels
- Laboratório de Modelagem de Sistemas Biológicos, National Institute for Science and Technology on Innovation in Neglected Diseases (INCT/IDN), Centro de Desenvolvimento Tecnológico em Saúde (CDTS), Fundação Oswaldo Cruz (FIOCRUZ), Rio de Janeiro, Brazil
- Miguel Ponce de Leon
- Departamento de Bioquímica y Biología Molecular I, Facultad de Ciencias Químicas, Universidad Complutense de Madrid, Ciudad Universitaria, Madrid, Spain
|
103
|
Extraordinarily adaptive properties of the genetically encoded amino acids. Sci Rep 2015; 5:9414. [PMID: 25802223 PMCID: PMC4371090 DOI: 10.1038/srep09414] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.8] [Received: 10/29/2014] [Accepted: 02/12/2015] [Indexed: 02/02/2023] [Open access]
Abstract
Using novel advances in computational chemistry, we demonstrate that the set of 20 genetically encoded amino acids, used nearly universally to construct all coded terrestrial proteins, has been highly influenced by natural selection. We defined an adaptive set of amino acids as one whose members thoroughly cover relevant physico-chemical properties, or “chemistry space.” Using this metric, we compared the encoded amino acid alphabet to random sets of amino acids. These random sets were drawn from a computationally generated compound library containing 1913 alternative amino acids that lie within the molecular weight range of the encoded amino acids. Sets that cover chemistry space better than the genetically encoded alphabet are extremely rare and energetically costly. Further analysis of more adaptive sets reveals common features and anomalies, and we explore their implications for synthetic biology. We present these computations as evidence that the set of 20 amino acids found within the standard genetic code is the result of considerable natural selection. The amino acids used for constructing coded proteins may represent a largely global optimum, such that any aqueous biochemistry would use a very similar set.
|
104
|
Strazewski P. Omne Vivum Ex Vivo … Omne? How to Feed an Inanimate Evolvable Chemical System so as to Let it Self-evolve into Increased Complexity and Life-like Behaviour. Isr J Chem 2015. [DOI: 10.1002/ijch.201400175] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Indexed: 12/25/2022]
|
105
|
Ancestral Reconstruction of a Pre-LUCA Aminoacyl-tRNA Synthetase Ancestor Supports the Late Addition of Trp to the Genetic Code. J Mol Evol 2015; 80:171-85. [PMID: 25791872 DOI: 10.1007/s00239-015-9672-1] [Citation(s) in RCA: 43] [Impact Index Per Article: 4.3] [Received: 01/17/2015] [Accepted: 03/09/2015] [Indexed: 01/14/2023]
Abstract
The genetic code was likely complete in its current form by the time of the last universal common ancestor (LUCA). Several scenarios have been proposed for explaining the code's pre-LUCA emergence and expansion, and the relative order of the appearance of amino acids used in translation. One co-evolutionary model of genetic code expansion proposes that at least some amino acids were added to the code by the ancient divergence of aminoacyl-tRNA synthetase (aaRS) families. Of all the amino acids used within the genetic code, Trp is most frequently claimed as a relatively recent addition. We observe that, since TrpRS and TyrRS are paralogous protein families retaining significant sequence similarity, the inferred sequence composition of their ancestor can be used to evaluate this co-evolutionary model of genetic code expansion. We show that ancestral sequence reconstructions of the pre-LUCA paralog ancestor of TyrRS and TrpRS have several sites containing Tyr, yet a complete absence of sites containing Trp. This is consistent with the paralog ancestor being specific for the utilization of Tyr, with Trp being a subsequent addition to the genetic code facilitated by a process of aaRS divergence and neofunctionalization. Only after this divergence could Trp be specifically encoded and incorporated into proteins, including the TyrRS and TrpRS descendant lineages themselves. This early absence of Trp is observed under both homogeneous and non-homogeneous models of ancestral sequence reconstruction. Simulations support that this observed absence of Trp is unlikely to be due to chance or model bias. These results support that the final stages of genetic code evolution occurred well within the "protein world," and that the presence-absence of Trp within conserved sites of ancient protein domains is a likely measure of their relative antiquity, permitting the relative timing of extremely early events within protein evolution before LUCA.
|
106
|
RNA editing and modifications of RNAs might have favoured the evolution of the triplet genetic code from an ennuplet code. J Theor Biol 2014; 359:1-5. [DOI: 10.1016/j.jtbi.2014.05.037] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 01/22/2014] [Revised: 05/21/2014] [Accepted: 05/27/2014] [Indexed: 11/24/2022]
|
107
|
Davila AF, McKay CP. Chance and necessity in biochemistry: implications for the search for extraterrestrial biomarkers in Earth-like environments. Astrobiology 2014; 14:534-40. [PMID: 24867145 PMCID: PMC4060776 DOI: 10.1089/ast.2014.1150] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 05/04/2023]
Abstract
In this paper, we examine a restricted subset of the question of possible alien biochemistries. That is, we look into how different life might be if it emerged in environments similar to that required for life on Earth. We advocate a principle of chance and necessity in biochemistry. According to this principle, biochemistry is in some fundamental way the sum of two processes: there is an aspect of biochemistry that is an endowment from prebiotic processes, which represents the necessity, plus an aspect that is invented by the process of evolution, which represents the chance. As a result, we predict that life originating in extraterrestrial Earth-like environments will share biochemical motifs that can be traced back to the prebiotic world but will also have intrinsic biochemical traits that are unlikely to be duplicated elsewhere as they are combinatorially path-dependent. Effective and objective strategies to search for biomarkers, and evidence for a second genesis, on planets with Earth-like environments can be built based on this principle.
Affiliation(s)
- Alfonso F. Davila
- Carl Sagan Center at the SETI Institute, Mountain View, California
- Space Science and Astrobiology Division, NASA Ames Research Center, Moffett Field, California
- Christopher P. McKay
- Space Science and Astrobiology Division, NASA Ames Research Center, Moffett Field, California
|
108
|
Ilardo MA, Freeland SJ. Testing for adaptive signatures of amino acid alphabet evolution using chemistry space. J Syst Chem 2014; 5:1. [DOI: 10.1186/1759-2208-5-1] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 11/10/2022]
|
109
|
Bandhu AV, Aggarwal N, Sengupta S. Revisiting the physico-chemical hypothesis of code origin: an analysis based on code-sequence coevolution in a finite population. Orig Life Evol Biosph 2013; 43:465-89. [PMID: 24500541 DOI: 10.1007/s11084-014-9353-x] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Received: 11/23/2013] [Accepted: 01/13/2014] [Indexed: 01/23/2023]
Abstract
The origin of the genetic code marked a major transition from a plausible RNA world to the world of DNA and proteins and is an important milestone in our understanding of the origin of life. We examine the efficacy of the physico-chemical hypothesis of code origin by carrying out simulations of code-sequence coevolution in finite populations in stages, leading first to the emergence of ten amino acid code(s) and subsequently to 14 amino acid code(s). We explore two different scenarios of primordial code evolution. In one scenario, competition occurs between populations of equilibrated code-sequence sets, while in the other, new codes compete with existing codes as they are gradually introduced into the population with a finite probability. In either case, we find that natural selection between competing codes distinguished by differences in the degree of physico-chemical optimization is unable to explain the structure of the standard genetic code. The code whose structure is most consistent with the standard genetic code is often not among the codes that have a high fixation probability. However, we find that the composition of the code population affects the code fixation probability. A physico-chemically optimized code gets fixed with a significantly higher probability if it competes against a set of randomly generated codes. Our results suggest that physico-chemical optimization may not be the sole driving force in ensuring the emergence of the standard genetic code.
Affiliation(s)
- Ashutosh Vishwa Bandhu
- School of Computational & Integrative Sciences, Jawaharlal Nehru University, New Delhi, 110067, India
|
110
|
Meringer M, Cleaves HJ, Freeland SJ. Beyond terrestrial biology: charting the chemical universe of α-amino acid structures. J Chem Inf Model 2013; 53:2851-62. [PMID: 24152173 DOI: 10.1021/ci400209n] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.5] [Indexed: 01/23/2023]
Abstract
α-Amino acids are fundamental to biochemistry as the monomeric building blocks with which cells construct proteins according to genetic instructions. However, the 20 amino acids of the standard genetic code represent a tiny fraction of the number of α-amino acid chemical structures that could plausibly play such a role, both from the perspective of natural processes by which life emerged and evolved, and from the perspective of human-engineered genetically coded proteins. Until now, efforts to describe the structures comprising this broader set, or even estimate their number, have been hampered by the complex combinatorial properties of organic molecules. Here, we use computer software based on graph theory and constructive combinatorics in order to conduct an efficient and exhaustive search of the chemical structures implied by two careful and precise definitions of the α-amino acids relevant to coded biological proteins. Our results include two virtual libraries of α-amino acid structures corresponding to these different approaches, comprising 121 044 and 3 846 structures, respectively, and suggest a simple approach to exploring much larger, as yet uncomputed, libraries of interest.
Affiliation(s)
- Markus Meringer
- German Aerospace Center (DLR), Earth Observation Center (EOC) , Münchner Straße 20, D-82234 Oberpfaffenhofen-Wessling, Germany
|
111
|
Di Giulio M. The Origin of the Genetic Code: Matter of Metabolism or Physicochemical Determinism? J Mol Evol 2013; 77:131-3. [DOI: 10.1007/s00239-013-9593-9] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Received: 10/14/2013] [Accepted: 10/18/2013] [Indexed: 12/27/2022]
|
112
|
Toxvaerd S. The role of carbohydrates at the origin of homochirality in biosystems. Orig Life Evol Biosph 2013; 43:391-409. [PMID: 23996458 DOI: 10.1007/s11084-013-9342-5] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Received: 01/10/2012] [Accepted: 07/12/2013] [Indexed: 01/18/2023]
Abstract
Pasteur demonstrated that the chiral components in a racemic mixture can separate into homochiral crystals. But with a strong chiral discrimination, the chiral components in a concentrated mixture can also phase separate into homochiral fluid domains, and the isomerization kinetics can then perform a symmetry breaking into one thermodynamically stable homochiral system. Glyceraldehyde has a sufficient chiral discrimination to perform such a symmetry breaking. Three observations (the requirement of a high concentration of the chiral reactant(s) in an aqueous solution in order to establish and maintain homochirality; the phosphorylation of almost all carbohydrates in the central machinery of life; and the basic idea that biochemistry, glycolysis, and gluconeogenesis contain the trace of biochemical evolution) all point toward homochirality having been obtained just after, or at, the phosphorylation of the very first products of the formose reaction, at high concentrations of the reactants in phosphate-rich compartments in submarine hydrothermal vents. A racemic solution of D,L-glyceraldehyde-3-phosphate could be the template for obtaining homochiral D-glyceraldehyde-3-phosphate(aq) as well as L-amino acids.
Affiliation(s)
- Søren Toxvaerd
- DNRF centre "Glass and Time", IMFUFA, Department of Sciences, Roskilde University, Postbox 260, 4000, Roskilde, Denmark
|
113
|
A realistic model under which the genetic code is optimal. J Mol Evol 2013; 77:170-84. [PMID: 23877342 DOI: 10.1007/s00239-013-9571-2] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 12/20/2012] [Accepted: 06/27/2013] [Indexed: 01/23/2023]
Abstract
The genetic code has a high level of error robustness. Using values of hydrophobicity scales as a proxy for amino acid character, and the mean square measure as a function quantifying error robustness, a value can be obtained for a genetic code which reflects the error robustness of that code. By comparing this value with a distribution of values belonging to codes generated by random permutations of amino acid assignments, the level of error robustness of a genetic code can be quantified. We present a calculation in which the standard genetic code is shown to be optimal. We obtain this result by (1) using recently updated values of polar requirement as input; (2) fixing seven assignments (Ile, Trp, His, Phe, Tyr, Arg, and Leu) based on aptamer considerations; and (3) using known biosynthetic relations of the 20 amino acids. This last point is reflected in an approach of subdivision (restricting the random reallocation of assignments to amino acid subgroups, the set of 20 being divided into four such subgroups). The three approaches to explain robustness of the code (specific selection for robustness, amino acid-RNA interactions leading to assignments, or a slow growth process of assignment patterns) are reexamined in light of our findings. We offer a comprehensive hypothesis, stressing the importance of biosynthetic relations, with the code evolving from an early stage with just glycine and alanine, via intermediate stages, towards 64 codons carrying today's meaning.
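The general procedure described in this abstract (score a code by a mean square measure, then compare it with randomly permuted codes) can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's calculation: it uses the Kyte-Doolittle hydropathy scale as a stand-in for the updated polar requirement values, permutes all 20 amino acids freely among the synonymous codon blocks, and applies none of the fixed assignments or biosynthetic subgroups. All function names are hypothetical.

```python
import random
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), codons ordered TTT, TTC, TTA, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Assumption: Kyte-Doolittle hydropathy stands in for the paper's polar requirement input.
KD = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
      "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
      "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2}


def ms_error(code, scale=KD):
    """Mean squared property change over all single-base substitutions between sense codons."""
    total, count = 0.0, 0
    for codon, aa in code.items():
        if aa == "*":
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                other = code[codon[:pos] + base + codon[pos + 1:]]
                if other == "*":
                    continue  # substitutions to stop codons are ignored in this sketch
                total += (scale[aa] - scale[other]) ** 2
                count += 1
    return total / count


def shuffled_code(code, rng):
    """Permute the 20 amino acids among the synonymous codon blocks; stop codons stay fixed."""
    aas = sorted(set(code.values()) - {"*"})
    perm = aas[:]
    rng.shuffle(perm)
    mapping = dict(zip(aas, perm))
    mapping["*"] = "*"
    return {codon: mapping[aa] for codon, aa in code.items()}


def fraction_better(code, trials=1000, seed=0):
    """Fraction of randomly permuted codes with a lower (more robust) score than `code`."""
    rng = random.Random(seed)
    reference = ms_error(code)
    better = sum(ms_error(shuffled_code(code, rng)) < reference for _ in range(trials))
    return better / trials
```

Calling `fraction_better(STANDARD)` gives an estimate of how rarely a random permutation beats the standard code under these assumptions; the value depends on the scale, the permutation scheme, and the number of trials.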
|
114
|
Pollack JD, Gerard D, Pearl DK. Uniquely localized intra-molecular amino acid concentrations at the glycolytic enzyme catalytic/active centers of Archaea, Bacteria and Eukaryota are associated with their proposed temporal appearances on earth. Orig Life Evol Biosph 2013; 43:161-87. [PMID: 23715690 DOI: 10.1007/s11084-013-9331-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 10/24/2012] [Accepted: 04/04/2013] [Indexed: 11/27/2022]
Abstract
The distributions of amino acids at most-conserved sites nearest catalytic/active centers (C/AC) in 4,645 sequences of ten enzymes of the glycolytic Embden-Meyerhof-Parnas pathway in Archaea, Bacteria and Eukaryota are similar to the proposed temporal order of their appearance on Earth. Glycine, isoleucine, leucine, valine, glutamic acid and possibly lysine, often described as prebiotic (i.e., existing or occurring before the emergence of life), were localized in positionally and conservationally defined aggregations in all enzymes of all Domains. The distributions of all 20 biologic amino acids in most-conserved sites nearest their C/ACs were quite different either from distributions in sites less-conserved and further from their C/ACs or from all amino acids regardless of their position or conservation. The major concentration of glycine, perhaps the earliest prebiotic amino acid, occupies ≈ 16 % of all the most-conserved sites within a volume of ≈ 7-8 Å radius from their C/ACs and decreases linearly towards the molecule's peripheries. Spatially localized major concentrations of isoleucine, leucine and valine are in the mid-conserved and mid-distant sites from their C/ACs in protein interiors. Lysine and glutamic acid comprise ≈ 25-30 % of all amino acids within an irregular volume bounded by ≈ 24-28 Å radii from their C/ACs at the most-distant least-conserved sites. The previously unreported characteristics of these amino acids, namely their spatially and conservationally identified concentrations in Archaea, Bacteria and Eukaryota, suggest some common structural organization of glycolytic enzymes that may be relevant to their evolution and that of other proteins. We discuss our data in relation to enzyme evolution, their reported prebiotic putative temporal appearances on Earth, abundances, biological "cost", neighbor-sequence preferences or "ordering" and some thermodynamic parameters.
Affiliation(s)
- J Dennis Pollack
- Department of Molecular Virology, Immunology and Medical Genetics, The College of Medicine, The Ohio State University, Columbus, OH 43210, USA.
|
115
|
Morgens DW, Cavalcanti ARO. An alternative look at code evolution: using non-canonical codes to evaluate adaptive and historic models for the origin of the genetic code. J Mol Evol 2013; 76:71-80. [PMID: 23344715 DOI: 10.1007/s00239-013-9542-7] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 05/21/2012] [Accepted: 01/15/2013] [Indexed: 10/27/2022]
Abstract
The canonical code has been shown many times to be highly robust against point mutations; that is, mutations that change a single nucleotide tend to result in similar amino acids more often than expected by chance. There are two major types of models for the origin of the code, which explain how this sophisticated structure evolved. Adaptive models state that the primitive code was specifically selected for error minimization, while historic models hypothesize that the robustness of the code is an artifact or by-product of the mechanism of code evolution. In this paper, we evaluated the levels of robustness in existing non-canonical codes as well as codes that differ in only one codon assignment from the standard code. We found that the level of robustness of many of these codes is comparable to or better than that of the standard code. Although these results do not preclude an adaptive origin of the genetic code, they suggest that the code was not selected for minimizing the effects of point mutations.
Affiliation(s)
- David W Morgens
- Department of Biology, Pomona College, 175 W 6th Street, Claremont, CA, USA
|
116
|
Almeida L, Demongeot J. Predictive power of "a minima" models in biology. Acta Biotheor 2012; 60:3-19. [PMID: 22318429 DOI: 10.1007/s10441-012-9146-4] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 11/14/2011] [Accepted: 01/11/2012] [Indexed: 12/19/2022]
Abstract
Many apparently complex mechanisms in biology, especially in embryology and molecular biology, can be explained easily by reasoning at the level of the "efficient cause" of the observed phenomenology: the mechanism can then be explained by a simple geometrical argument or a variational principle, leading to the solution of an optimization problem, for example, via the co-existence of a minimization and a maximization problem (a min-max principle). Passing from a microscopic (or cellular) level (optimal min-max solution of the simple mechanistic system) to the macroscopic level often involves an averaging effect (linked to the repetition of a large number of such microscopic systems with possible random choice of the parameters of each of them) that gives birth to a global functional feature (e.g. at the tissue level). We will illustrate these general principles by building in four different domains of application "a minima" models and showing the main properties of their solutions: (1) extraction of a minimal RNA structure functioning as the first "peptidic machine," a kind of ancestral ribosome; (2) study of a genetic regulatory network of Drosophila centred on Engrailed gene and expressing successively two genes inside a limit cycle; (3) study of a genetic network regulating neural activity and proliferation in mammals; and (4) study of a simple geometric model of epiboly in zebrafish.
|
117
|
Greenwald J, Riek R. On the possible amyloid origin of protein folds. J Mol Biol 2012; 421:417-26. [PMID: 22542525 DOI: 10.1016/j.jmb.2012.04.015] [Citation(s) in RCA: 111] [Impact Index Per Article: 8.5] [Received: 11/16/2011] [Revised: 04/17/2012] [Accepted: 04/17/2012] [Indexed: 11/26/2022]
Abstract
The diversity of protein folds is derived from the diversity of the underlying proteome. Such diversity must have originated from a so-called common ancestor: a hypothetical fold whose identity will, in all likelihood, never be known. Nonetheless, hypotheses exist to explain the evolution of protein folds. When formulating such hypotheses as done here, the entire repertoire of polypeptide structure, from well-defined tertiary structures and molten globule states to intrinsically disordered proteins and oligomeric aggregates, is worth considering. It is the aim of this short essay to discuss the hypothesis that one type of protein aggregate, the cross-β-sheet motif, was the first functional protein fold, that is, the common ancestor fold. Support for this hypothesis comes from the observations that (i) short peptides with simple amino acid sequences are able to form the cross-β-sheet structure, (ii) amyloids can be very stable under harsh conditions, (iii) amyloids can self-assemble in complex mixtures, (iv) amyloids have many potent activities that are attributable to the inherent repetitiveness of the structure, and (v) the proteomes of modern organisms appear to have evolved away from the more amyloidogenic sequences of older organisms, suggesting that amyloids were more ubiquitous earlier in the evolution of modern protein folds.
Affiliation(s)
- Jason Greenwald
- ETH Zurich, Physical Chemistry, ETH Honggerberg, 8093 Zurich, Switzerland
|
118
|
Mutuality in Discrete and Compositional Information: Perspectives for Synthetic Genetic Codes. Cognit Comput 2011. [DOI: 10.1007/s12559-011-9116-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/15/2022]
|
119
|
Philip GK, Freeland SJ. Did evolution select a nonrandom "alphabet" of amino acids? Astrobiology 2011; 11:235-240. [PMID: 21434765 DOI: 10.1089/ast.2010.0567] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.9] [Indexed: 05/30/2023]
Abstract
The last universal common ancestor of contemporary biology (LUCA) used a precise set of 20 amino acids as a standard alphabet with which to build genetically encoded protein polymers. Considerable evidence indicates that some of these amino acids were present through nonbiological syntheses prior to the origin of life, while the rest evolved as inventions of early metabolism. However, the same evidence indicates that many alternatives were also available, which highlights the question: what factors led biological evolution on our planet to define its standard alphabet? One possibility is that natural selection favored a set of amino acids that exhibits clear, nonrandom properties: a set of especially useful building blocks. However, previous analysis that tested whether the standard alphabet comprises amino acids with unusually high variance in size, charge, and hydrophobicity (properties that govern what protein structures and functions can be constructed) failed to clearly distinguish evolution's choice from a sample of randomly chosen alternatives. Here, we demonstrate unambiguous support for a refined hypothesis: that an optimal set of amino acids would spread evenly across a broad range of values for each fundamental property. Specifically, we show that the standard set of 20 amino acids represents the possible spectra of size, charge, and hydrophobicity more broadly and more evenly than can be explained by chance alone.
Affiliation(s)
- Gayle K Philip
- NASA Astrobiology Institute, University of Hawaii, Honolulu, 96822, USA
|
120
|
Szori M, Jójárt B, Izsák R, Szori K, Csizmadia IG, Viskolcz B. Chemical evolution of biomolecule building blocks. Can thermodynamics explain the accumulation of glycine in the prebiotic ocean? Phys Chem Chem Phys 2011; 13:7449-58. [PMID: 21431107 DOI: 10.1039/c0cp02687e] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 11/21/2022]
Abstract
It has always been a question of considerable scientific interest why amino acids (and other biomolecule building blocks) formed and accumulated in the prebiotic ocean. In this study, we suggest an answer to this question for the simplest amino acid, glycine. We have shown for the first time that classical equilibrium thermodynamics can explain the most likely selection of glycine (and the derivative of its dipeptide) in aqueous media, although glycine is not the lowest free energy structure among all (404) possible constitutional isomers. Species preceding glycine in the free energy order are either supramolecular complexes of small molecules or molecules likely to dissociate and thus return to the gas phase. Then, 2-hydroxyacetamide condenses, yielding a thermodynamically favored derivative of glycine dipeptide and providing an alternative way for peptide formation. It is remarkable that a simple equilibrium thermodynamic model can explain the accumulation of glycine and provide a reason for the importance of water in the formation process.
Affiliation(s)
- Milán Szori
- Department of Chemical Informatics, Faculty of Education, University of Szeged, Boldogasszony sgt. 6, Szeged 6725, Hungary.
|
121
|
McDonald GD, Storrie-Lombardi MC. Biochemical constraints in a protobiotic earth devoid of basic amino acids: the "BAA(-) world". Astrobiology 2010; 10:989-1000. [PMID: 21162678 DOI: 10.1089/ast.2010.0484] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.6] [Indexed: 05/30/2023]
Abstract
It has been hypothesized in this journal and elsewhere, based on surveys of published data from prebiotic synthesis experiments and carbonaceous meteorite analyses, that basic amino acids such as lysine and arginine were not abundant on prebiotic Earth. If the basic amino acids were incorporated only rarely into the first peptides formed in that environment, it is important to understand what protobiotic chemistry is possible in their absence. As an initial test of the hypothesis that basic amino acid negative [BAA(-)] proteins could have performed at least a subset of protobiotic chemistry, the current work reports on a survey of 13 archaeal and 13 bacterial genomes that has identified 61 modern gene sequences coding for known or putative proteins not containing arginine or lysine. Eleven of the sequences found code for proteins whose functions are well known and important in the biochemistry of modern microbial life: lysine biosynthesis protein LysW, arginine cluster proteins, copper ion binding proteins, bacterial flagellar proteins, and PE or PPE family proteins. These data indicate that the lack of basic amino acids does not prevent peptides or proteins from serving useful structural and biochemical functions. However, as would be predicted from fundamental physicochemical principles, we see no fossil evidence of prebiotic BAA(-) peptide sequences capable of interacting directly with nucleic acids.
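The kind of screen this abstract describes (finding protein sequences that contain neither arginine nor lysine) can be sketched in a few lines. This is a minimal illustration, assuming one-letter amino acid codes; the function names and toy sequences are hypothetical and do not come from the surveyed genomes.

```python
def lacks_basic_residues(protein, basic="KR"):
    """True if the sequence contains none of the given residues (default: Lys, Arg)."""
    seq = protein.upper()
    return not any(residue in basic for residue in seq)


def baa_negative(proteins):
    """Keep only the non-empty sequences that contain no lysine or arginine."""
    return {name: seq for name, seq in proteins.items()
            if seq and lacks_basic_residues(seq)}


# Hypothetical toy sequences, for illustration only.
toy = {
    "p1": "MSDLEVAGTC",   # no K or R
    "p2": "MKTAYIAKQR",   # contains K and R
    "p3": "MAEEGDPLVS",   # no K or R
}
hits = baa_negative(toy)
```

Histidine is left out of the default `basic` set because the abstract defines BAA(-) proteins by the absence of arginine and lysine; passing `basic="KRH"` would apply a stricter filter.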
Affiliation(s)
- Gene D McDonald
- Department of Chemistry and Biochemistry, University of Texas at Austin, Austin, Texas 78712, USA.
|
122
|
Stability of the genetic code and optimal parameters of amino acids. J Theor Biol 2010; 269:57-63. [PMID: 20955716 DOI: 10.1016/j.jtbi.2010.10.015] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 07/08/2010] [Revised: 09/20/2010] [Accepted: 10/12/2010] [Indexed: 11/24/2022]
Abstract
The standard genetic code is known to be much more efficient in minimizing adverse effects of misreading errors and one-point mutations in comparison with a random code having the same structure, i.e. the same number of codons coding for each particular amino acid. We study the inverse problem, how the code structure affects the optimal physico-chemical parameters of amino acids ensuring the highest stability of the genetic code. It is shown that the choice of two or more amino acids with given properties determines unambiguously all the others. In this sense the code structure determines strictly the optimal parameters of amino acids or the corresponding scales may be derived directly from the genetic code. In the code with the structure of the standard genetic code the resulting values for hydrophobicity obtained in the scheme "leave one out" and in the scheme with fixed maximum and minimum parameters correlate significantly with the natural scale. The comparison of the optimal and natural parameters allows assessing relative impact of physico-chemical and error-minimization factors during evolution of the genetic code. As the resulting optimal scale depends on the choice of amino acids with given parameters, the technique can also be applied to testing various scenarios of the code evolution with increasing number of codified amino acids. Our results indicate the co-evolution of the genetic code and physico-chemical properties of recruited amino acids.
|
123
|
Smith DR, Chapman MR. Economical evolution: microbes reduce the synthetic cost of extracellular proteins. mBio 2010; 1:e00131-10. [PMID: 20824102 PMCID: PMC2932507 DOI: 10.1128/mbio.00131-10] [Citation(s) in RCA: 78] [Impact Index Per Article: 5.2] [Received: 04/26/2010] [Accepted: 07/29/2010] [Indexed: 11/20/2022] Open
Abstract
Protein evolution is not simply a race toward improved function. Because organisms compete for limited resources, fitness is also affected by the relative economy of an organism's proteome. Indeed, many abundant proteins contain relatively high percentages of amino acids that are metabolically less taxing for the cell to make, thus reducing cellular cost. However, not all abundant proteins are economical, and many economical proteins are not particularly abundant. Here we examined protein composition and found that the relative synthetic cost of amino acids constrains the composition of microbial extracellular proteins. In Escherichia coli, extracellular proteins contain, on average, fewer energetically expensive amino acids independent of their abundance, length, function, or structure. Economic pressures have strategically shaped the amino acid composition of multicomponent surface appendages, such as flagella, curli, and type I pili, and extracellular enzymes, including type III effector proteins and secreted serine proteases. Furthermore, in silico analysis of Pseudomonas syringae, Mycobacterium tuberculosis, Saccharomyces cerevisiae, and over 25 other microbes spanning a wide range of GC content revealed a broad bias toward more economical amino acids in extracellular proteins. The synthesis of any protein, especially those rich in expensive aromatic amino acids, represents a significant investment. Because extracellular proteins are lost to the environment and not recycled like other cellular proteins, they present a greater burden on the cell, as their amino acids cannot be reutilized during translation. We hypothesize that evolution has optimized extracellular proteins to reduce their synthetic burden on the cell.
Affiliation(s)
- Daniel R Smith
- Department of Molecular, Cellular and Developmental Biology, University of Michigan, Ann Arbor, Michigan, USA
|
124
|
Goldman AD, Samudrala R, Baross JA. The evolution and functional repertoire of translation proteins following the origin of life. Biol Direct 2010; 5:15. [PMID: 20377891 PMCID: PMC2873265 DOI: 10.1186/1745-6150-5-15] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.4] [Received: 03/24/2010] [Accepted: 04/08/2010] [Indexed: 11/11/2022] Open
Abstract
Background: The RNA world hypothesis posits that the earliest genetic system consisted of informational RNA molecules that directed the synthesis of modestly functional RNA molecules. Further evidence suggests that it was within this RNA-based genetic system that life developed the ability to synthesize proteins by translating genetic code. Here we investigate the early development of the translation system through an evolutionary survey of protein architectures associated with modern translation. Results: Our analysis reveals a structural expansion of translation proteins immediately following the RNA world and well before the establishment of the DNA genome. Subsequent functional annotation shows that representatives of the ten most ancestral protein architectures are responsible for all of the core protein functions found in modern translation. Conclusions: We propose that this early robust translation system evolved by virtue of a positive feedback cycle in which the system was able to create increasingly complex proteins to further enhance its own function. Reviewers: This article was reviewed by Janet Siefert, George Fox, and Antonio Lazcano (nominated by Laura Landweber).
Affiliation(s)
- Aaron D Goldman
- Department of Microbiology, University of Washington, Box 357242, Seattle, WA 98195, USA.
|
125
|
Cleaves HJ. The origin of the biologically coded amino acids. J Theor Biol 2010; 263:490-8. [PMID: 20034500 DOI: 10.1016/j.jtbi.2009.12.014] [Citation(s) in RCA: 63] [Impact Index Per Article: 4.2] [Received: 06/12/2009] [Revised: 11/19/2009] [Accepted: 12/14/2009] [Indexed: 11/29/2022]
Affiliation(s)
- H James Cleaves
- Geophysical Laboratory, The Carnegie Institution for Science, 5251 Broad Branch Road NW, Washington, DC 20015, USA.
|
126
|
Grosjean H, de Crécy-Lagard V, Marck C. Deciphering synonymous codons in the three domains of life: co-evolution with specific tRNA modification enzymes. FEBS Lett 2010; 584:252-64. [PMID: 19931533 DOI: 10.1016/j.febslet.2009.11.052] [Citation(s) in RCA: 215] [Impact Index Per Article: 14.3] [Received: 09/22/2009] [Revised: 11/11/2009] [Accepted: 11/16/2009] [Indexed: 10/20/2022]
Abstract
The strategies organisms use to decode synonymous codons in cytosolic protein synthesis are not uniform. The complete isoacceptor tRNA repertoire and the type of modified nucleoside found at the wobble position 34 of their anticodons were analyzed in all kingdoms of life. This led to the identification of four main decoding strategies that are diversely used in Bacteria, Archaea and Eukarya. Many of the modern tRNA modification enzymes acting at position 34 of tRNAs are present only in specific domains and obviously have arisen late during evolution. In an evolutionary fine-tuning process, these enzymes must have played an essential role in the progressive introduction of new amino acids, and in the refinement and standardization of the canonical nuclear genetic code observed in all extant organisms (functional convergent evolutionary hypothesis).
Affiliation(s)
- Henri Grosjean
- Université Paris-Sud, CNRS, UMR8621, Institut de Génétique et de Microbiologie, Orsay F-91405, France.
|
127
|
Novozhilov AS, Koonin EV. Exceptional error minimization in putative primordial genetic codes. Biol Direct 2009; 4:44. [PMID: 19925661 PMCID: PMC2785773 DOI: 10.1186/1745-6150-4-44] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.7] [Received: 11/17/2009] [Accepted: 11/19/2009] [Indexed: 11/11/2022] Open
Abstract
Background: The standard genetic code is redundant and has a highly non-random structure. Codons for the same amino acids typically differ only by the nucleotide in the third position, whereas similar amino acids are encoded, mostly, by codon series that differ by a single base substitution in the third or the first position. As a result, the code is highly albeit not optimally robust to errors of translation, a property that has been interpreted either as a product of selection directed at the minimization of errors or as a non-adaptive by-product of evolution of the code driven by other forces. Results: We investigated the error-minimization properties of putative primordial codes that consisted of 16 supercodons, with the third base being completely redundant, using a previously derived cost function and the error minimization percentage as the measure of a code's robustness to mistranslation. It is shown that, when the 16-supercodon table is populated with 10 putative primordial amino acids, inferred from the results of abiotic synthesis experiments and other evidence independent of the code's evolution, and with minimal assumptions used to assign the remaining supercodons, the resulting 2-letter codes are nearly optimal in terms of the error minimization level. Conclusion: The results of the computational experiments with putative primordial genetic codes that contained only two meaningful letters in all codons and encoded 10 to 16 amino acids indicate that such codes are likely to have been nearly optimal with respect to the minimization of translation errors. This near-optimality could be the outcome of extensive early selection during the co-evolution of the code with the primordial, error-prone translation system, or a result of a unique, accidental event. Under this hypothesis, the subsequent expansion of the code resulted in a decrease of the error minimization level that became sustainable owing to the evolution of a high-fidelity translation system.
Reviewers: This article was reviewed by Paul Higgs (nominated by Arcady Mushegian), Rob Knight, and Sandor Pongor. For the complete reports, go to the Reviewers' Reports section.
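The idea of a 16-supercodon table with a redundant third base can be illustrated with a minimal Python sketch. It collapses the standard genetic code onto the first two bases of each codon and lists the supercodons whose third base is already fully redundant in the modern code. The names `STANDARD_CODE`, `supercodon_table`, and `fully_redundant` are hypothetical, chosen for illustration; the sketch does not reproduce the authors' cost function or their assignment of primordial amino acids.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in the conventional UCAG ordering ('*' marks a stop codon).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def supercodon_table(code):
    """Group codons by their first two bases (the illustrative 'supercodon')."""
    table = {}
    for codon, aa in code.items():
        table.setdefault(codon[:2], set()).add(aa)
    return table


def fully_redundant(table):
    """Return supercodons whose four codons all encode the same amino acid."""
    return sorted(prefix for prefix, aas in table.items() if len(aas) == 1)


table = supercodon_table(STANDARD_CODE)
print(len(table))              # 16
print(fully_redundant(table))  # ['AC', 'CC', 'CG', 'CU', 'GC', 'GG', 'GU', 'UC']
```

In the modern code only eight of the sixteen two-letter prefixes are fully redundant; a putative 2-letter primordial code of the kind discussed above would treat all sixteen that way.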
Affiliation(s)
- Artem S Novozhilov
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.
|
128
|
Higgs PG. A four-column theory for the origin of the genetic code: tracing the evolutionary pathways that gave rise to an optimized code. Biol Direct 2009; 4:16. [PMID: 19393096 PMCID: PMC2689856 DOI: 10.1186/1745-6150-4-16] [Citation(s) in RCA: 104] [Impact Index Per Article: 6.5] [Received: 03/31/2009] [Accepted: 04/24/2009] [Indexed: 11/18/2022] Open
Abstract
Background: The arrangement of the amino acids in the genetic code is such that neighbouring codons are assigned to amino acids with similar physical properties. Hence, the effects of translational error are minimized with respect to randomly reshuffled codes. Further inspection reveals that it is amino acids in the same column of the code (i.e. same second base) that are similar, whereas those in the same row show no particular similarity. We propose a 'four-column' theory for the origin of the code that explains how the action of selection during the build-up of the code leads to a final code that has the observed properties. Results: The theory makes the following propositions. (i) The earliest amino acids in the code were those that are easiest to synthesize non-biologically, namely Gly, Ala, Asp, Glu and Val. (ii) These amino acids are assigned to codons with G at first position. Therefore the first code may have used only these codons. (iii) The code rapidly developed into a four-column code where all codons in the same column coded for the same amino acid: NUN = Val, NCN = Ala, NAN = Asp and/or Glu, and NGN = Gly. (iv) Later amino acids were added sequentially to the code by a process of subdivision of codon blocks in which a subset of the codons assigned to an early amino acid were reassigned to a later amino acid. (v) Later amino acids were added into positions formerly occupied by amino acids with similar properties because this can occur with minimal disruption to the proteins already encoded by the earlier code. As a result, the properties of the amino acids in the final code retain a four-column pattern that is a relic of the earliest stages of code evolution. Conclusion: The driving force during this process is not the minimization of translational error, but positive selection for the increased diversity and functionality of the proteins that can be made with a larger amino acid alphabet. Nevertheless, the code that results is one in which translational error is minimized. We define a cost function with which we can compare the fitness of codes with varying numbers of amino acids, and a barrier function, which measures the change in cost immediately after addition of a new amino acid. We show that the barrier is positive if an amino acid is added into a column with dissimilar properties, but negative if an amino acid is added into a column with similar physical properties. Thus, natural selection favours the assignment of amino acids to the positions that they occupy in the final code. Reviewers: This article was reviewed by David Ardell, Eugene Koonin and Stephen Freeland (nominated by Laurence Hurst).
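Proposition (iii) of this abstract can be written out as a small Python sketch. It builds the four-column code from the second base alone and checks that the modern assignments of the codons with G at first position still match it, as proposition (ii) implies. The function names are hypothetical, and because the abstract leaves NAN as "Asp and/or Glu", the sketch keeps both as a set (an assumption made here for illustration). The cost and barrier functions mentioned in the abstract are not defined there and are not reproduced.

```python
from itertools import product

BASES = "UCAG"
# Proposition (iii): the second base alone determines the amino acid.
# NAN is given as "Asp and/or Glu", so both are kept (illustrative assumption).
COLUMN = {"U": {"Val"}, "C": {"Ala"}, "A": {"Asp", "Glu"}, "G": {"Gly"}}


def four_column_code():
    """Map each of the 64 codons to the amino acid set of its column."""
    return {"".join(c): COLUMN[c[1]] for c in product(BASES, repeat=3)}


def standard_g_first(codon):
    """Standard-code amino acid for a codon whose first base is G."""
    second, third = codon[1], codon[2]
    if second == "U":
        return "Val"
    if second == "C":
        return "Ala"
    if second == "G":
        return "Gly"
    return "Asp" if third in "UC" else "Glu"  # GAU/GAC = Asp, GAA/GAG = Glu


def g_row_is_relic():
    """Check that every G-first codon keeps its four-column assignment today."""
    code = four_column_code()
    return all(standard_g_first(c) in code[c] for c in code if c[0] == "G")


print(len(four_column_code()))  # 64
print(g_row_is_relic())         # True
```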
Affiliation(s)
- Paul G Higgs
- Department of Physics and Astronomy, McMaster University, Hamilton, Ontario L8S 4M1, Canada.
|