1
Fontecilla-Camps JC. Reflections on the Origin and Early Evolution of the Genetic Code. Chembiochem 2023; 24:e202300048. [PMID: 37052530 DOI: 10.1002/cbic.202300048] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2023] [Revised: 03/01/2023] [Indexed: 04/14/2023]
Abstract
Examination of the genetic code (GeCo) reveals that amino acids coded by (A/U) codons display a large functional spectrum and bind RNA whereas, except for Arg, those coded by (G/C) codons do not. From a stereochemical viewpoint, the clear preference for (A/U)-rich codons to be located at the GeCo half blocks suggests they were specifically determined. Conversely, the overall lower affinity of cognate amino acids for their (G/C)-rich anticodons points to their late arrival to the GeCo. It is proposed that i) initially the code was composed of the eight (A/U) codons; ii) these codons were duplicated when G/C nucleotides were added to their wobble positions, and three new codons with G/C in their first position were incorporated; and iii) a combination of A/U and G/C nucleotides progressively generated the remaining codons.
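The first step of this proposal, a starting code built from the eight (A/U) codons, can be made concrete with a short enumeration. The following is a minimal sketch, assuming that "(A/U) codons" means the triplets containing only A and U and that the modern standard genetic code supplies the assignments; the names `CODON_TABLE` and `codons_over` are illustrative and do not come from the paper.

```python
from itertools import product

# Standard genetic code in the usual U, C, A, G ordering (third base varies fastest).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons_over(alphabet):
    """Return the codons (with amino acids) whose bases all come from `alphabet`."""
    allowed = set(alphabet)
    return {c: aa for c, aa in CODON_TABLE.items() if set(c) <= allowed}


# Assumption: "(A/U) codons" = triplets made only of A and U.
au_only = codons_over("AU")
for codon, aa in sorted(au_only.items()):
    print(codon, aa)
```

Under these assumptions the eight triplets cover Phe, Leu, Ile, Tyr, Asn and Lys in the modern code, plus one stop codon (UAA).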
2
Abstract
The code is meaningless unless translated. (Monod 1971, 143)
We address issues of a description of the origin and evolution of the genetic code from the semiotics standpoint. Developing the concept of codepoiesis introduced by M. Barbieri, a new idea of semio-poiesis is proposed. Semio-poiesis, a recursive auto-referential processing of a semiotic system, becomes a form of organization of the bio-world when and while notions of meaning and aiming are introduced into it. The description of the genetic code as a semiotic system (grammar and vocabulary) allows us to apply the method of internal reconstruction to it: on the basis of heterogeneity and irregularity of the current state, to explicate possible previous states and various ways of forming coding and textualization mechanisms. The revealed patterns and irregularities are consistent with hypotheses about the origin and evolution of the genetic code.
3
Demongeot J, Seligmann H. Theoretical minimal RNA rings recapitulate the order of the genetic code's codon-amino acid assignments. J Theor Biol 2019; 471:108-116. [DOI: 10.1016/j.jtbi.2019.03.024] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 06/04/2018] [Revised: 09/19/2018] [Accepted: 03/28/2019] [Indexed: 12/21/2022]
4
Frank A, Froese T. The Standard Genetic Code can Evolve from a Two-Letter GC Code Without Information Loss or Costly Reassignments. ORIGINS LIFE EVOL B 2018; 48:259-272. [PMID: 29959584 DOI: 10.1007/s11084-018-9559-4] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 04/10/2018] [Accepted: 06/21/2018] [Indexed: 11/27/2022]
Abstract
It is widely agreed that the standard genetic code must have been preceded by a simpler code that encoded fewer amino acids. How this simpler code could have expanded into the standard genetic code is not well understood because most changes to the code are costly. Taking inspiration from the recently synthesized six-letter code, we propose a novel hypothesis: the initial genetic code consisted of only two letters, G and C, and then expanded the number of available codons via the introduction of an additional pair of letters, A and U. Various lines of evidence, including the relative prebiotic abundance of the earliest assigned amino acids, the balance of their hydrophobicity, and the higher GC content in genome coding regions, indicate that the original two nucleotides were indeed G and C. This process of code expansion probably started with the third base, continued with the second base, and ended up as the standard genetic code when the second pair of letters was introduced into the first base. The proposed process is consistent with the available empirical evidence, and it uniquely avoids the problem of costly code changes by positing instead that the code expanded its capacity via the creation of new codons with extra letters.
Affiliation(s)
- Alejandro Frank
- Institute for Nuclear Sciences (ICN), National Autonomous University of Mexico (UNAM), Mexico City, Mexico
- Center for the Sciences of Complexity (C3), National Autonomous University of Mexico (UNAM), Mexico City, Mexico
- El Colegio Nacional, Mexico City, Mexico
- Tom Froese
- Center for the Sciences of Complexity (C3), National Autonomous University of Mexico (UNAM), Mexico City, Mexico.
- Institute for Applied Mathematics and Systems Research (IIMAS), National Autonomous University of Mexico (UNAM), Mexico City, Mexico.
5
Quaternionic representation of the genetic code. Biosystems 2016; 141:10-9. [PMID: 26751396 DOI: 10.1016/j.biosystems.2015.12.009] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 09/11/2015] [Revised: 12/04/2015] [Accepted: 12/28/2015] [Indexed: 11/23/2022]
Abstract
A heuristic diagram of the evolution of the standard genetic code is presented. It incorporates, in a way that resembles the energy levels of an atom, the physical notion of broken symmetry, and it is consistent with Crick's original ideas on the origin and evolution of the code as well as with the chronological order of appearance of the amino acids during evolution, as inferred from work that combines known experimental results with theoretical speculation. Prompted by the diagram, we propose a mathematical representation of the present-day code based on Hamilton quaternions. The central object in the description is a codon function that assigns to each amino acid an integer quaternion in such a way that the observed degeneracy of the code is preserved. We emphasize the advantages of a quaternionic representation of amino acids, taking protein folding as an example. To this end, we propose an algorithm that goes from the quaternion sequence to the three-dimensional protein structure, which can be compared with the corresponding experimental structure stored in the Protein Data Bank. In our view, the quaternionic representation of the genetic code deserves consideration because it not only describes most of the known properties of the genetic code but also opens new perspectives, derived mainly from the close relationship between quaternions and rotations.
6
Francis BR. Evolution of the genetic code by incorporation of amino acids that improved or changed protein function. J Mol Evol 2013; 77:134-58. [PMID: 23743924 DOI: 10.1007/s00239-013-9567-y] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Received: 03/11/2013] [Accepted: 05/25/2013] [Indexed: 12/31/2022]
Abstract
Fifty years have passed since the genetic code was deciphered, but how the genetic code came into being has not been satisfactorily addressed. It is now widely accepted that the earliest genetic code did not encode all 20 amino acids found in the universal genetic code as some amino acids have complex biosynthetic pathways and likely were not available from the environment. Therefore, the genetic code evolved as pathways for synthesis of new amino acids became available. One hypothesis proposes that early in the evolution of the genetic code four amino acids (valine, alanine, aspartic acid, and glycine) were coded by GNC codons (N = any base) with the remaining codons being nonsense codons. The other sixteen amino acids were subsequently added to the genetic code by changing nonsense codons into sense codons for these amino acids. Improvement in protein function is presumed to be the driving force behind the evolution of the code, but how improved function was achieved by adding amino acids has not been examined. Based on an analysis of amino acid function in proteins, an evolutionary mechanism for expansion of the genetic code is described in which individual coded amino acids were replaced by new amino acids that used nonsense codons differing by one base change from the sense codons previously used. The improved or altered protein function afforded by the changes in amino acid function provided the selective advantage underlying the expansion of the genetic code. Analysis of amino acid properties and functions explains why amino acids are found in their respective positions in the genetic code.
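The GNC starting point and the "one base change" expansion step described in this abstract can be illustrated with a small enumeration. This is only a sketch under stated assumptions: it uses the modern standard genetic code to name the GNC assignments, and it treats every non-GNC codon as a candidate "nonsense" codon, as the hypothesis does. The helper names `is_gnc` and `one_base_neighbours` are hypothetical and are not taken from the paper.

```python
from itertools import product

# Standard genetic code in the usual U, C, A, G ordering (third base varies fastest).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def is_gnc(codon):
    """True for codons of the form G-N-C (N = any base)."""
    return codon[0] == "G" and codon[2] == "C"


def one_base_neighbours(codon):
    """All codons that differ from `codon` at exactly one position."""
    return {
        codon[:i] + b + codon[i + 1:]
        for i in range(3)
        for b in BASES
        if b != codon[i]
    }


# The four GNC codons and their modern assignments.
gnc = {c: aa for c, aa in CODON_TABLE.items() if is_gnc(c)}

# Codons outside the GNC set that lie one base change away from a GNC codon,
# i.e. the candidates for the first round of expansion in this sketch.
reachable = {n for c in gnc for n in one_base_neighbours(c) if not is_gnc(n)}

print(gnc)
print(len(reachable))
```

In the modern table the four GNC codons are GUC (Val), GCC (Ala), GAC (Asp) and GGC (Gly), matching the four amino acids named in the abstract. Which of the neighbouring codons were actually captured, and in what order, is the subject of the paper and is not modelled here.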
Affiliation(s)
- Brian R Francis
- Department of Molecular Biology, University of Wyoming, Laramie, WY, 82071-3944, USA
7
8
Abstract
The aminoacyl-tRNA synthetases (aaRSs) are essential components of the protein synthesis machinery responsible for defining the genetic code by pairing the correct amino acids to their cognate tRNAs. The aaRSs are an ancient enzyme family believed to have origins that may predate the last common ancestor and as such they provide insights into the evolution and development of the extant genetic code. Although the aaRSs have long been viewed as a highly conserved group of enzymes, findings within the last couple of decades have started to demonstrate how diverse and versatile these enzymes really are. Beyond their central role in translation, aaRSs and their numerous homologs have evolved a wide array of alternative functions both inside and outside translation. Current understanding of the emergence of the aaRSs, and their subsequent evolution into a functionally diverse enzyme family, is discussed in this chapter.
9
Guo M, Schimmel P. Structural analyses clarify the complex control of mistranslation by tRNA synthetases. Curr Opin Struct Biol 2011; 22:119-26. [PMID: 22155179 DOI: 10.1016/j.sbi.2011.11.008] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.7] [Received: 09/07/2011] [Revised: 11/13/2011] [Accepted: 11/15/2011] [Indexed: 12/24/2022]
Abstract
Proteins are precisely assembled with amino acids by matching the anticodons of charged transfer RNAs to nucleotide triplets in mRNA sequences. Accurate translation depends on the specific coupling of cognate amino acids and tRNAs - a step carried out by aminoacyl-tRNA synthetases (aaRSs) and that generates the genetic code. Owing to their intrinsic similarity, aaRSs developed highly differentiated structures to discriminate between amino acids at the active site for aminoacylation. Because this discrimination is not sufficient to prevent toxic mistranslation, aaRSs developed separate structures to further refine recognition by proofreading. From comprehensive structural studies on aaRSs, many of the molecular details have been elucidated for the recognition of cognate amino acids and for the misactivation and editing of noncognate amino acids. Here we review recent advances in the structural description of the binding, activation and editing of amino acids, which collectively reveal many aspects of the fine-tuned systems that resulted in a robust and universal genetic code.
Affiliation(s)
- Min Guo
- Department of Cancer Biology, The Scripps Research Institute, Scripps Florida, Jupiter, FL 33458, United States
10
Zhang Z, Yu J. On the organizational dynamics of the genetic code. GENOMICS PROTEOMICS & BIOINFORMATICS 2011; 9:21-9. [PMID: 21641559 PMCID: PMC5054158 DOI: 10.1016/s1672-0229(11)60004-1] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Received: 08/30/2010] [Accepted: 10/26/2010] [Indexed: 11/23/2022]
Abstract
The organization of the canonical genetic code needs to be thoroughly illuminated. Here we reorder the four nucleotides—adenine, thymine, guanine and cytosine—according to their emergence in evolution, and apply the organizational rules to devising an algebraic representation for the canonical genetic code. Under a framework of the devised code, we quantify codon and amino acid usages from a large collection of 917 prokaryotic genome sequences, and associate the usages with its intrinsic structure and classification schemes as well as amino acid physicochemical properties. Our results show that the algebraic representation of the code is structurally equivalent to a content-centric organization of the code and that codon and amino acid usages under different classification schemes were correlated closely with GC content, implying a set of rules governing composition dynamics across a wide variety of prokaryotic genome sequences. These results also indicate that codons and amino acids are not randomly allocated in the code, where the six-fold degenerate codons and their amino acids have important balancing roles for error minimization. Therefore, the content-centric code is of great usefulness in deciphering its hitherto unknown regularities as well as the dynamics of nucleotide, codon, and amino acid compositions.
Affiliation(s)
- Zhang Zhang
- Plant Stress Genomics Research Center, Division of Chemical and Life Sciences and Engineering, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
11
Görnerup O, Jacobi MN. A model-independent approach to infer hierarchical codon substitution dynamics. BMC Bioinformatics 2010; 11:201. [PMID: 20412602 PMCID: PMC2868013 DOI: 10.1186/1471-2105-11-201] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 10/20/2009] [Accepted: 04/23/2010] [Indexed: 12/03/2022] Open
Abstract
Background: Codon substitution constitutes a fundamental process in molecular biology that has been studied extensively. However, prior studies rely on various assumptions, e.g. regarding the relevance of specific biochemical properties, or on conservation criteria for defining substitution groups. Ideally, one would instead like to analyze the substitution process in terms of raw dynamics, independently of underlying system specifics. In this paper we propose a method for doing this by identifying groups of codons and amino acids such that these groups imply closed dynamics. The approach relies on recently developed spectral and agglomerative techniques for identifying hierarchical organization in dynamical systems. Results: We have applied the techniques on an empirically derived Markov model of the codon substitution process that is provided in the literature. Without system specific knowledge of the substitution process, the techniques manage to "blindly" identify multiple levels of dynamics; from amino acid substitutions (via the standard genetic code) to higher order dynamics on the level of amino acid groups. We hypothesize that the acquired groups reflect earlier versions of the genetic code. Conclusions: The results demonstrate the applicability of the techniques. Due to their generality, we believe that they can be used to coarse grain and identify hierarchical organization in a broad range of other biological systems and processes, such as protein interaction networks, genetic regulatory networks and food webs.
Affiliation(s)
- Olof Görnerup
- Complex Systems Group, Department of Energy and Environment, Chalmers University of Technology, 412 96 Göteborg, Sweden.
12
Liu X, Zhang J, Ni F, Dong X, Han B, Han D, Ji Z, Zhao Y. Genome wide exploration of the origin and evolution of amino acids. BMC Evol Biol 2010; 10:77. [PMID: 20230639 PMCID: PMC2853539 DOI: 10.1186/1471-2148-10-77] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 02/15/2009] [Accepted: 03/15/2010] [Indexed: 11/10/2022] Open
Abstract
Background: Even after years of exploration, the terrestrial origin of bio-molecules remains unsolved and controversial. Today, observation of amino acid composition in proteins has become an alternative way for a global understanding of the mystery encoded in whole genomes and seeking clues for the origin of amino acids. Results: In this study, we statistically monitored the frequencies of 20 alpha-amino acids in 549 taxa from three kingdoms of life: archaebacteria, eubacteria, and eukaryotes. We found that the amino acids evolved independently in these three kingdoms; but, conserved linkages were observed in two groups of amino acids, (A, G, H, L, P, Q, R, and W) and (F, I, K, N, S, and Y). Moreover, the amino acids encoded by GC-poor codons (F, Y, N, K, I, and M) were found to "lose" their usage in the development from single cell eukaryotic organisms like S. cerevisiae to H. sapiens, while the amino acids encoded by GC-rich codons (P, A, G, and W) were found to gain usage. These findings further support the co-evolution hypothesis of amino acids and genetic codes. Conclusion: We proposed a new chronological order of the appearance of amino acids (L, A, V/E/G, S, I, K, T, R/D, P, N, F, Q, Y, M, H, W, C). Two conserved evolutionary paths of amino acids were also suggested: A→G→R→P and K→Y.
Affiliation(s)
- Xiaoxia Liu
- The Key Laboratory for Chemical Biology of Fujian Province, Department of Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, Fujian, PR China
13
Sánchez R, Grau R. An algebraic hypothesis about the primeval genetic code architecture. Math Biosci 2009; 221:60-76. [PMID: 19607845 DOI: 10.1016/j.mbs.2009.07.001] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 06/23/2008] [Revised: 06/23/2009] [Accepted: 07/09/2009] [Indexed: 11/26/2022]
Abstract
A plausible architecture of an ancient genetic code is derived from an extended base triplet vector space over the Galois field of the extended base alphabet {D,A,C,G,U}, where symbol D represents one or more hypothetical bases with unspecific pairings. We hypothesized that the high degeneration of a primeval genetic code with five bases and the gradual origin and improvement of a primeval DNA repair system could make possible the transition from ancient to modern genetic codes. Our results suggest that the Watson-Crick base pairings G≡C and A=U and the non-specific base pairing of the hypothetical ancestral base D used to define the sum and product operations are enough features to determine the coding constraints of the primeval and the modern genetic code, as well as the transition from the former to the latter. Geometrical and algebraic properties of this vector space reveal that the present codon assignment of the standard genetic code could be induced from a primeval codon assignment. Besides, the Fourier spectrum of the extended DNA genome sequences derived from the multiple sequence alignment suggests that the so-called period-3 property of the present coding DNA sequences could also exist in the ancient coding DNA sequences. The phylogenetic analyses achieved with metrics defined in the N-dimensional vector space (B^3)^N of DNA sequences and with the new evolutionary model presented here also suggest that an ancient DNA coding sequence with five or more bases does not contradict the expected evolutionary history.
Affiliation(s)
- Robersy Sánchez
- Research Institute of Tropical Roots, Tuber Crops and Plantains (INIVIT), Biotechnology Group, Villa Clara, Cuba
14
Ignatova Z, Zimmermann KH, Martínez-Pérez I. Molecular Biology. DNA COMPUTING MODELS 2008. [PMCID: PMC7122864 DOI: 10.1007/978-0-387-73637-2_3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
Genetic information is passed with high accuracy from the parental organism to the offspring and its expression governs the biochemical and physiological tasks of the cell. Although different types of cells exist and are shaped by development to fill different physiological niches, all cells have fundamental similarities and share common principles of organization and biochemical activities. This chapter gives an overview of general principles of the storage and flow of genetic information. It aims to summarize and describe in a broadly approachable way, from the point of view of molecular biology, some general terms, mechanisms and processes used as a base for the molecular computing in the subsequent chapters.
Affiliation(s)
- Zoya Ignatova
- Cellular Biochemistry, Max Planck Institute of Biochemistry, Munich, 82152 Martinsried by Munich Germany
- Karl-Heinz Zimmermann
- Institute of Computer Technology, Hamburg University of Technology, 21071 Hamburg Germany
- Israel Martínez-Pérez
- Institute of Computer Technology, Hamburg University of Technology, 21071 Hamburg Germany
15
Abstract
Since the early days of the discovery of the genetic code nonrandom patterns have been searched for in the code in the hope of providing information about its origin and early evolution. Here we present a new classification scheme of the genetic code that is based on a binary representation of the purines and pyrimidines. This scheme reveals known patterns more clearly than the common one, for instance, the classification of strong, mixed, and weak codons as well as the ordering of codon families. Furthermore, new patterns have been found that have not been described before: Nearly all quantitative amino acid properties, such as Woese's polarity and the specific volume, show a perfect correlation to Lagerkvist's codon-anticodon binding strength. Our new scheme leads to new ideas about the evolution of the genetic code. It is hypothesized that it started with a binary doublet code and developed via a quaternary doublet code into the contemporary triplet code. Furthermore, arguments are presented against suggestions that a "simpler" code, where only the midbase was informational, was at the origin of the genetic code.
Affiliation(s)
- Thomas Wilhelm
- Institute of Molecular Biotechnology, Beutenbergstr. 11, 07745 Jena, Germany.
16
Copley SD, Smith E, Morowitz HJ. A mechanism for the association of amino acids with their codons and the origin of the genetic code. Proc Natl Acad Sci U S A 2005; 102:4442-7. [PMID: 15764708 PMCID: PMC555468 DOI: 10.1073/pnas.0501049102] [Citation(s) in RCA: 93] [Impact Index Per Article: 4.9] [Received: 12/13/2004] [Indexed: 11/18/2022] Open
Abstract
The genetic code has certain regularities that have resisted mechanistic interpretation. These include strong correlations between the first base of codons and the precursor from which the encoded amino acid is synthesized and between the second base of codons and the hydrophobicity of the encoded amino acid. These regularities are even more striking in a projection of the modern code onto a simpler code consisting of doublet codons encoding a set of simple amino acids. These regularities can be explained if, before the emergence of macromolecules, simple amino acids were synthesized in covalent complexes of dinucleotides with alpha-keto acids originating from the reductive tricarboxylic acid cycle or reductive acetate pathway. The bases and phosphates of the dinucleotide are proposed to have enhanced the rates of synthetic reactions leading to amino acids in a small-molecule reaction network that preceded the RNA translation apparatus but created an association between amino acids and the first two bases of their codons that was retained when translation emerged later in evolution.
Affiliation(s)
- Shelley D Copley
- Cooperative Institute for Research in Environmental Sciences, Department of Molecular, Cellular, and Developmental Biology, University of Colorado, Boulder, CO 80309, USA.
17
Bacher JM, de Crécy-Lagard V, Schimmel PR. Inhibited cell growth and protein functional changes from an editing-defective tRNA synthetase. Proc Natl Acad Sci U S A 2005; 102:1697-701. [PMID: 15647356 PMCID: PMC547871 DOI: 10.1073/pnas.0409064102] [Citation(s) in RCA: 67] [Impact Index Per Article: 3.5] [Indexed: 11/18/2022] Open
Abstract
The genetic code is established in aminoacylation reactions catalyzed by aminoacyl-tRNA synthetases. Many aminoacyl-tRNA synthetases require an additional domain for editing, to correct errors made by the catalytic domain. A nonfunctional editing domain results in an ambiguous genetic code, where a single codon is not translated as a specific amino acid but rather as a statistical distribution of amino acids. Here, wide-ranging consequences of genetic code ambiguity in Escherichia coli were investigated with an editing-defective isoleucyl-tRNA synthetase. Ambiguity retarded cell growth at most temperatures in rich and minimal media. These growth rate differences were seen regardless of the carbon source. Inclusion of an amino acid analogue that is misactivated (and not cleared) diminished growth rate by up to 100-fold relative to an isogenic strain with normal editing function. Experiments with target-specific antibiotics for ribosomes, DNA replication, and cell wall biosynthesis, in conjunction with measurements of mutation frequencies, were consistent with global changes in protein function caused by errors of translation and not editing-induced mutational errors. Thus, a single defective editing domain caused translationally generated global effects on protein functions that, in turn, provide powerful selective pressures for maintenance of editing by aminoacyl-tRNA synthetases.
Affiliation(s)
- Jamie M Bacher
- The Scripps Research Institute, BCC-379, 10550 North Torrey Pines Road, La Jolla, CA 92037, USA
18
Abstract
The temporal order ("chronology") of appearance of amino acids and their respective codons on the evolutionary scene is reconstructed. A consensus chronology of amino acids is built on the basis of 60 different criteria, each offering a certain temporal order. After several steps of filtering, the chronology vectors are averaged, resulting in the consensus order: G, A, D, V, P, S, E, (L, T), R, (I, Q, N), H, K, C, F, Y, M, W. It reveals two important features: the amino acids synthesized in the imitation experiments of S. Miller appeared first, while the amino acids associated with codon capture events came last. The reconstruction of codon chronology is based on the above consensus temporal order of amino acids, supplemented by the stability and complementarity rules first suggested by M. Eigen and P. Schuster, and on the earlier established processivity rule. At no point in the reconstruction was the consensus amino-acid chronology in conflict with these three rules. The derived genealogy of all 64 codons suggested several important predictions that are confirmed. The reconstruction of the origin and evolutionary history of the triplet code thus becomes a powerful research tool for molecular evolution studies, especially in its early stages.
Affiliation(s)
- E N Trifonov
- Genome Diversity Center, Institute of Evolution, University of Haifa, Haifa 31905, Israel.
19
Pezo V, Metzgar D, Hendrickson TL, Waas WF, Hazebrouck S, Döring V, Marlière P, Schimmel P, De Crécy-Lagard V. Artificially ambiguous genetic code confers growth yield advantage. Proc Natl Acad Sci U S A 2004; 101:8593-7. [PMID: 15163798 PMCID: PMC423239 DOI: 10.1073/pnas.0402893101] [Citation(s) in RCA: 61] [Impact Index Per Article: 3.1] [Indexed: 11/18/2022] Open
Abstract
A primitive genetic code is thought to have encoded statistical, ambiguous proteins in which more than one amino acid was inserted at a given codon. The relative vitality of organisms bearing ambiguous proteins and the kinds of pressures that forced development of the highly specific modern genetic code are unknown. Previous work demonstrated that, in the absence of selective pressure, enforced ambiguity in cells leads to death or to sequence reversion to eliminate the ambiguous phenotype. Here, we report the creation of a nonreverting strain of bacteria that produced statistical proteins. Ablating the editing activity of isoleucyl-tRNA synthetase resulted in an ambiguous code in which, through supplementation of a limited supply of isoleucine with an alternative amino acid that was noncoding, the mutant generating statistical proteins was favored over the wild-type isogenic strain. Such organisms harboring statistical proteins could have had an enhanced adaptive capacity and could have played an important role in the early development of living systems.
Affiliation(s)
- V Pezo
- Evologic SA, 93 Rue Henri Rochefort, 91000 Evry, France
20
Abstract
Since discovering the pattern by which amino acids are assigned to codons within the standard genetic code, investigators have explored the idea that natural selection placed biochemically similar amino acids near to one another in coding space so as to minimize the impact of mutations and/or mistranslations. The analytical evidence to support this theory has grown in sophistication and strength over the years, and counterclaims questioning its plausibility and quantitative support have yet to transcend some significant weaknesses in their approach. These weaknesses are illustrated here by means of a simple simulation model for adaptive genetic code evolution. There remain ill explored facets of the 'error minimizing' code hypothesis, however, including the mechanism and pathway by which an adaptive pattern of codon assignments emerged, the extent to which natural selection created synonym redundancy, its role in shaping the amino acid and nucleotide languages, and even the correct interpretation of the adaptive codon assignment pattern: these represent fertile areas for future research.
Affiliation(s)
- Stephen J Freeland
- Department of Biology, University of Maryland, Baltimore County, Catonsville, MD, USA.
21
Ikehara K. Origins of gene, genetic code, protein and life: comprehensive view of life systems from a GNC-SNS primitive genetic code hypothesis. J Biosci 2002; 27:165-86. [PMID: 11937687 DOI: 10.1007/bf02703773] [Citation(s) in RCA: 51] [Impact Index Per Article: 2.3] [Indexed: 11/30/2022]
Abstract
We have investigated the origin of genes, the genetic code, proteins and life using six indices (hydropathy, alpha-helix, beta-sheet and beta-turn formabilities, acidic amino acid content and basic amino acid content) necessary for appropriate three-dimensional structure formation of globular proteins. From the analysis of microbial genes, we have concluded that newly-born genes are products of nonstop frames (NSF) on antisense strands of microbial GC-rich genes [GC-NSF(a)] and from SNS repeating sequences [(SNS)n] similar to the GC-NSF(a) (S and N denote G or C and any of the four bases, respectively). We have also proposed that the universal genetic code used by most organisms on the earth presently could be derived from a GNC-SNS primitive genetic code. We have further presented the [GADV]-protein world hypothesis of the origin of life as well as a hypothesis of protein production, suggesting that proteins were originally produced by random peptide formation of amino acids restricted in specific amino acid compositions termed as GNC-, SNS- and GC-NSF(a)-0th order structures of proteins. The [GADV]-protein world hypothesis is primarily derived from the GNC-primitive genetic code hypothesis. It is also expected that basic properties of extant genes and proteins could be revealed by considerations based on the scenario with four stages.
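The SNS pattern defined in this abstract (S = G or C, N = any base) and its GNC subset can be enumerated directly. The following is a minimal sketch, assuming the modern standard genetic code for the assignments; the variable names are illustrative only and the code makes no claim about how the author carried out the analysis.

```python
from itertools import product

# Standard genetic code in the usual U, C, A, G ordering (third base varies fastest).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

STRONG = "GC"  # the "S" bases

# SNS codons: S at positions 1 and 3, any base at position 2.
sns = {c: aa for c, aa in CODON_TABLE.items() if c[0] in STRONG and c[2] in STRONG}

# GNC codons are the subset of SNS with G first and C last.
gnc = {c: aa for c, aa in sns.items() if c[0] == "G" and c[2] == "C"}

print(len(sns), "SNS codons encode", "".join(sorted(set(sns.values()))))
print(len(gnc), "GNC codons encode", "".join(sorted(gnc.values())))
```

With the modern table, the 16 SNS codons cover ten amino acids and no stop codon, and the four GNC codons cover Gly, Ala, Asp and Val, which is the [GADV] set named in the abstract.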
Affiliation(s)
- K Ikehara
- Department of Chemistry, Faculty of Science, Nara Women's University, Kita-uoya-nishi-machi, Nara, Nara 630-8506, Japan.
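The codon patterns named in the abstract above can be made concrete with a short sketch. It expands the GNC and SNS patterns (S = G or C, N = any of the four bases, as defined in the abstract) and looks up the amino acids they encode in the present-day standard genetic code. Reading the patterns against the standard code, and the helper names `expand` and `amino_acids`, are assumptions made for illustration. The sketch is not taken from the paper.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... ('*' = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Degenerate symbols used in the abstract: S = G or C, N = any base.
SYMBOLS = {"G": "G", "C": "C", "S": "GC", "N": "UCAG"}


def expand(pattern):
    """List every concrete codon matched by a three-letter pattern such as 'GNC'."""
    return ["".join(c) for c in product(*(SYMBOLS[s] for s in pattern))]


def amino_acids(pattern):
    """Sorted one-letter amino acids encoded by the codons matching the pattern."""
    return sorted({CODE[codon] for codon in expand(pattern)})


if __name__ == "__main__":
    for pattern in ("GNC", "SNS"):
        codons = expand(pattern)
        print(pattern, len(codons), "codons ->", "".join(amino_acids(pattern)))
```

Under the standard code, the four GNC codons encode Gly, Ala, Asp and Val, which matches the [GADV] label used in the abstract, and the sixteen SNS codons encode ten amino acids.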
|
22
|
Abstract
Forty different single-factor criteria and multi-factor hypotheses about the chronological order of appearance of amino acids in early evolution are summarized in a consensus ranking. All available knowledge and thoughts about the origin and evolution of the genetic code are thus combined in a single list in which the amino acids are ranked chronologically. Owing to the consensus nature of the chronology, it has several important properties not visible in individual rankings by any of the initial criteria. The nine amino acids of Miller's imitation of the primordial environment are all ranked topmost (G, A, V, D, E, P, S, L, T). This result does not change even after several criteria related to Miller's data are excluded from the calculations. The consensus order of appearance of the 20 amino acids on the evolutionary scene also reveals a unique and strikingly simple chronological organization of the 64 codons that could not be deduced from individual criteria: new codons appear in descending order of their thermostability, as complementary pairs, with the complements recruited sequentially from the codon repertoires of the earlier or simultaneously appearing amino acids. These three rules (Thermostability, Complementarity and Processivity) hold strictly, as does the leading position of the earliest amino acids according to Miller. The consensus chronology of amino acids, G/A, V/D, P, S, E/L, T, R, N, K, Q, I, C, H, F, M, Y, W, and the derived temporal order for codons may thus serve as a justified working model of choice for further studies on the origin and evolution of the genetic code.
Affiliation(s)
- E N Trifonov
- Department of Structural Biology, The Weizmann Institute of Science, Rehovot 76100, Israel.
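The Complementarity rule in the abstract above can be illustrated with a short sketch that pairs a codon with its reverse complement and reports the amino acid each encodes in the standard genetic code. The two codons used, GGC and GAC, are chosen here for illustration because their pairs encode the first amino acids of the consensus chronology (G/A and V/D). The abstract itself does not list specific codons. Counting G and C bases is only a crude, assumed proxy for thermostability, and the helper names are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... ('*' = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(codon):
    """Codon read 5'->3' on the complementary RNA strand."""
    return "".join(COMPLEMENT[base] for base in reversed(codon))


def gc_count(codon):
    """Number of G or C bases (crude, assumed proxy for thermostability)."""
    return sum(base in "GC" for base in codon)


def describe_pair(codon):
    """Return (codon, amino acid, complementary codon, its amino acid, GC count)."""
    partner = reverse_complement(codon)
    return (codon, CODE[codon], partner, CODE[partner], gc_count(codon))


if __name__ == "__main__":
    for codon in ("GGC", "GAC"):
        print(describe_pair(codon))
```

With these inputs, GGC (Gly) pairs with GCC (Ala) and GAC (Asp) pairs with GUC (Val), and the first pair has the higher GC count.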
|
23
|
Abstract
A ubiquitous class of RNA-binding proteins is distinguished by an arginine-rich motif. Such proteins function in transcription, translation, RNA trafficking, and packaging. Peptide models are derived from viral regulatory proteins, including the virulence factors Tat and Rev of mammalian immunodeficiency viruses. Structures of model peptide-RNA complexes exhibit diverse strategies of recognition based in each case on structural transitions. Induced RNA structures contain noncanonical elements such as purine-purine mismatches, base triples, and flipped bases. Such elements enlarge and extend the RNA major groove to create specific peptide-binding pockets and surfaces. The repertoire of bound peptide structures (beta-hairpin, alpha-helix, and helix-bend-helix) reflects the diversity of induced RNA architectures. This repertoire, reminiscent of primordial exon-encoded peptides, may recapitulate early events in the transition between RNA and protein worlds. Peptide-directed changes in modern RNA structures can provide a mechanism of signaling in higher-order RNA-protein assemblies.
Affiliation(s)
- M A Weiss
- Department of Biochemistry, University of Chicago, IL 60637-5419, USA
|
24
|
|