1
|
Fontecilla-Camps JC. Reflections on the Origin and Early Evolution of the Genetic Code. Chembiochem 2023; 24:e202300048. [PMID: 37052530 DOI: 10.1002/cbic.202300048] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2023] [Revised: 03/01/2023] [Indexed: 04/14/2023]
Abstract
Examination of the genetic code (GeCo) reveals that amino acids coded by (A/U) codons display a large functional spectrum and bind RNA whereas, except for Arg, those coded by (G/C) codons do not. From a stereochemical viewpoint, the clear preference for (A/U)-rich codons to be located at the GeCo half blocks suggests they were specifically determined. Conversely, the overall lower affinity of cognate amino acids for their (G/C)-rich anticodons points to their late arrival to the GeCo. It is proposed that i) initially the code was composed of the eight (A/U) codons; ii) these codons were duplicated when G/C nucleotides were added to their wobble positions, and three new codons with G/C in their first position were incorporated; and iii) a combination of A/U and G/C nucleotides progressively generated the remaining codons.
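As a rough illustration of step i), the eight codons built only from A and U can be enumerated and translated with the standard genetic code. This is a minimal sketch, not the author's analysis: the compact table string assumes the usual UCAG ordering of the standard code, and the names `STANDARD_CODE` and `au_only_codons` are hypothetical helpers introduced here.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, listed in UCAG order for the first, second and
# third base; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each of the 64 codons to its one-letter amino acid (or '*').
STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def au_only_codons(code=STANDARD_CODE):
    """Return the codons composed exclusively of A and U (hypothetical helper)."""
    return {codon: aa for codon, aa in code.items() if set(codon) <= {"A", "U"}}


for codon, aa in sorted(au_only_codons().items()):
    print(codon, aa)
```

Filtering on `{"G", "C"}` instead gives the corresponding all-G/C set, which may help when comparing the two groups discussed in the abstract.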
|
2
|
Zhao F, Akanuma S. Ancestral Sequence Reconstruction of the Ribosomal Protein uS8 and Reduction of Amino Acid Usage to a Smaller Alphabet. J Mol Evol 2023; 91:10-23. [PMID: 36396786 DOI: 10.1007/s00239-022-10078-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/14/2022] [Accepted: 11/08/2022] [Indexed: 11/19/2022]
Abstract
Understanding the origin and early evolution of proteins is important for unveiling how the RNA world developed into an RNA-protein world. Because the composition of organic molecules in the Earth's primitive environment was plausibly not as diverse as today, the number of different amino acids used in early protein synthesis is likely to be substantially less than the current 20 proteinogenic residues. In this study, we have explored the thermal stability and RNA binding of ancestral variants of the ribosomal protein uS8 constructed from a reduced-alphabet of amino acids. First, we built a phylogenetic tree based on the amino acid sequences of uS8 from multiple extant organisms and used the tree to infer two plausible amino acid sequences corresponding to the last bacterial common ancestor of uS8. Both ancestral proteins were thermally stable and bound to an RNA fragment. By eliminating individual amino acid letters and monitoring thermal stability and RNA binding in the resulting proteins, we reduced the size of the amino acid set constituting one of the ancestral proteins, eventually finding that convergent sequences consisting of 15- or 14-amino acid alphabets still folded into stable structures that bound to the RNA fragment. Furthermore, a simplified variant reconstructed from a 13-amino-acid alphabet retained affinity for the RNA fragment, although it lost conformational stability. Collectively, RNA-binding activity may be achieved with a subset of the current 20 amino acids, raising the possibility of a simpler composition of RNA-binding proteins in the earliest stage of protein evolution.
Affiliation(s)
- Fangzheng Zhao
  - Faculty of Human Sciences, Waseda University, 2-579-15, Mikajima, Tokorozawa, Saitama, 359-1192, Japan
- Satoshi Akanuma
  - Faculty of Human Sciences, Waseda University, 2-579-15, Mikajima, Tokorozawa, Saitama, 359-1192, Japan
|
3
|
Harrison SA, Palmeira RN, Halpern A, Lane N. A biophysical basis for the emergence of the genetic code in protocells. Biochim Biophys Acta Bioenerg 2022; 1863:148597. [PMID: 35868450 DOI: 10.1016/j.bbabio.2022.148597] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 04/12/2022] [Revised: 06/27/2022] [Accepted: 07/13/2022] [Indexed: 11/17/2022]
Abstract
The origin of the genetic code is an abiding mystery in biology. Hints of a 'code within the codons' suggest biophysical interactions, but these patterns have resisted interpretation. Here, we present a new framework, grounded in the autotrophic growth of protocells from CO2 and H2. Recent work suggests that the universal core of metabolism recapitulates a thermodynamically favoured protometabolism right up to nucleotide synthesis. Considering the genetic code in relation to an extended protometabolism allows us to predict most codon assignments. We show that the first letter of the codon corresponds to the distance from CO2 fixation, with amino acids encoded by the purines (G followed by A) being closest to CO2 fixation. These associations suggest a purine-rich early metabolism with a restricted pool of amino acids. The second position of the anticodon corresponds to the hydrophobicity of the amino acid encoded. We combine multiple measures of hydrophobicity to show that this correlation holds strongly for early amino acids but is weaker for later species. Finally, we demonstrate that redundancy at the third position is not randomly distributed around the code: non-redundant amino acids can be assigned based on size, specifically length. We attribute this to additional stereochemical interactions at the anticodon. These rules imply an iterative expansion of the genetic code over time with codon assignments depending on both distance from CO2 and biophysical interactions between nucleotide sequences and amino acids. In this way the earliest RNA polymers could produce non-random peptide sequences with selectable functions in autotrophic protocells.
Affiliation(s)
- Stuart A Harrison
  - Department of Genetics, Evolution and Environment, University College London, Darwin Building, Gower Street, London WC1E 6BT, United Kingdom of Great Britain and Northern Ireland
- Raquel Nunes Palmeira
  - Department of Genetics, Evolution and Environment, University College London, Darwin Building, Gower Street, London WC1E 6BT, United Kingdom of Great Britain and Northern Ireland
- Aaron Halpern
  - Department of Genetics, Evolution and Environment, University College London, Darwin Building, Gower Street, London WC1E 6BT, United Kingdom of Great Britain and Northern Ireland
- Nick Lane
  - Department of Genetics, Evolution and Environment, University College London, Darwin Building, Gower Street, London WC1E 6BT, United Kingdom of Great Britain and Northern Ireland
|
4
|
Li DJ. Distributional features of triplet codons in genomes underlie the diversification of life. Biosystems 2022; 217:104681. [DOI: 10.1016/j.biosystems.2022.104681] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/21/2021] [Revised: 04/04/2022] [Accepted: 04/07/2022] [Indexed: 11/02/2022]
|
5
|
Vallée Y, Youssef-Saliba S. Sulfur Amino Acids: From Prebiotic Chemistry to Biology and Vice Versa. Synthesis (Stuttgart) 2021. [DOI: 10.1055/a-1472-7914] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
Abstract
Two sulfur-containing amino acids are included in the list of the 20 classical protein amino acids. A methionine residue is introduced at the start of the synthesis of all current proteins. Cysteine, thanks to its thiol function, plays an essential role in a very large number of catalytic sites. Here we present what is known about the prebiotic synthesis of these two amino acids and homocysteine, and we discuss their introduction into primitive peptides and more elaborate proteins.
Contents: 1 Introduction; 2 Sulfur Sources; 3 Prebiotic Synthesis of Cysteine; 4 Prebiotic Synthesis of Methionine; 5 Homocysteine and Its Thiolactone; 6 Methionine and Cystine in Proteins; 7 Prebiotic Scenarios Using Sulfur Amino Acids; 8 Introduction of Cys and Met in the Genetic Code; 9 Conclusion
|
6
|
Kimura M, Akanuma S. Reconstruction and Characterization of Thermally Stable and Catalytically Active Proteins Comprising an Alphabet of ~ 13 Amino Acids. J Mol Evol 2020; 88:372-381. [PMID: 32201904 DOI: 10.1007/s00239-020-09938-0] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 12/08/2019] [Accepted: 03/11/2020] [Indexed: 10/24/2022]
Abstract
While extant organisms synthesize proteins using approximately 20 kinds of genetically coded amino acids, the earliest protein synthesis system is likely to have been much simpler, utilizing a reduced set of amino acids. However, which types of building blocks were involved in primordial protein synthesis remains unclear. Herein, we reconstructed three convergent sequences of an ancestral nucleoside diphosphate kinase, each comprising a 10 amino acid "alphabet," and found that two of these variants folded into soluble and stable tertiary structures. Therefore, an alphabet consisting of 10 amino acids contains sufficient information for creating stable proteins. Furthermore, re-incorporation of a few more amino acid types into the active site of the 10 amino acid variants improved the catalytic activity, although the specific activity was not as high as that of extant proteins. Collectively, our results provide experimental support for the idea that robust protein scaffolds can be built with a subset of the current 20 amino acids that might have existed abundantly in the prebiotic environment, while the other amino acids, especially those with functional sidechains, evolved to contribute to efficient enzyme catalysis.
Affiliation(s)
- Madoka Kimura
  - Faculty of Human Sciences, Waseda University, 2-579-15 Mikajima, Tokorozawa, Saitama, 359-1192, Japan
- Satoshi Akanuma
  - Faculty of Human Sciences, Waseda University, 2-579-15 Mikajima, Tokorozawa, Saitama, 359-1192, Japan
|
7
|
Demongeot J, Seligmann H. Theoretical minimal RNA rings recapitulate the order of the genetic code's codon-amino acid assignments. J Theor Biol 2019; 471:108-116. [DOI: 10.1016/j.jtbi.2019.03.024] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 06/04/2018] [Revised: 09/19/2018] [Accepted: 03/28/2019] [Indexed: 12/21/2022]
|
8
|
Comprehensive reduction of amino acid set in a protein suggests the importance of prebiotic amino acids for stable proteins. Sci Rep 2018; 8:1227. [PMID: 29352156 PMCID: PMC5775292 DOI: 10.1038/s41598-018-19561-1] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Received: 11/08/2017] [Accepted: 01/03/2018] [Indexed: 11/19/2022]
Abstract
Modern organisms commonly use the same set of 20 genetically coded amino acids for protein synthesis, with very few exceptions. However, earlier protein synthesis was plausibly much simpler than the modern one and utilized only a limited set of amino acids. Nevertheless, few experimental tests of this issue with arbitrarily chosen amino acid sets had been reported prior to this report. Herein we comprehensively and systematically reduced the size of the amino acid set constituting an ancestral nucleoside kinase that was reconstructed in our previous study. We eventually found that two convergent sequences, each composed of a 13-amino acid alphabet, folded into soluble, stable and catalytically active structures, even though their stabilities and activities were not as high as those of the parent protein. Notably, many but not all of the reduced-set amino acids coincide with those plausibly abundant on primitive Earth. The inconsistent amino acids appeared to be important for catalytic activity but not for stability. Therefore, our findings suggest that the prebiotically abundant amino acids were used for creating stable protein structures and other amino acids with functional side chains were recruited to achieve efficient catalysis.
|
9
|
Akanuma S. Characterization of Reconstructed Ancestral Proteins Suggests a Change in Temperature of the Ancient Biosphere. Life (Basel) 2017; 7:life7030033. [PMID: 28783077 PMCID: PMC5617958 DOI: 10.3390/life7030033] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 07/19/2017] [Revised: 08/02/2017] [Accepted: 08/03/2017] [Indexed: 01/02/2023]
Abstract
Understanding the evolution of ancestral life, and especially the ability of some organisms to flourish in the variable environments experienced in Earth’s early biosphere, requires knowledge of the characteristics and the environment of these ancestral organisms. Information about early life and environmental conditions has been obtained from fossil records and geological surveys. Recent advances in phylogenetic analysis, and an increasing number of protein sequences available in public databases, have made it possible to infer ancestral protein sequences possessed by ancient organisms. However, the in silico studies that assess the ancestral base content of ribosomal RNAs, the frequency of each amino acid in ancestral proteins, and estimate the environmental temperatures of ancient organisms, show conflicting results. The characterization of ancestral proteins reconstructed in vitro suggests that ancient organisms had very thermally stable proteins, and therefore were thermophilic or hyperthermophilic. Experimental data supports the idea that only thermophilic ancestors survived the catastrophic increase in temperature of the biosphere that was likely associated with meteorite impacts during the early history of Earth. In addition, by expanding the timescale and including more ancestral proteins for reconstruction, it appears as though the Earth’s surface temperature gradually decreased over time, from Archean to present.
Affiliation(s)
- Satoshi Akanuma
- Faculty of Human Sciences, Waseda University, 2-579-15 Mikajima, Tokorozawa, Saitama 359-1192, Japan.
|
10
|
Mukai T, Yamaguchi A, Ohtake K, Takahashi M, Hayashi A, Iraha F, Kira S, Yanagisawa T, Yokoyama S, Hoshi H, Kobayashi T, Sakamoto K. Reassignment of a rare sense codon to a non-canonical amino acid in Escherichia coli. Nucleic Acids Res 2015; 43:8111-22. [PMID: 26240376 PMCID: PMC4652775 DOI: 10.1093/nar/gkv787] [Citation(s) in RCA: 65] [Impact Index Per Article: 6.5] [Received: 03/30/2015] [Accepted: 07/22/2015] [Indexed: 11/13/2022]
Abstract
The immutability of the genetic code has been challenged with the successful reassignment of the UAG stop codon to non-natural amino acids in Escherichia coli. In the present study, we demonstrated the in vivo reassignment of the AGG sense codon from arginine to L-homoarginine. As the first step, we engineered a novel variant of the archaeal pyrrolysyl-tRNA synthetase (PylRS) able to recognize L-homoarginine and L-N(6)-(1-iminoethyl)lysine (L-NIL). When this PylRS variant or HarRS was expressed in E. coli, together with the AGG-reading tRNA(Pyl) CCU molecule, these arginine analogs were efficiently incorporated into proteins in response to AGG. Next, some or all of the AGG codons in the essential genes were eliminated by their synonymous replacements with other arginine codons, whereas the majority of the AGG codons remained in the genome. The bacterial host's ability to translate AGG into arginine was then restricted in a temperature-dependent manner. The temperature sensitivity caused by this restriction was rescued by the translation of AGG to L-homoarginine or L-NIL. The assignment of AGG to L-homoarginine in the cells was confirmed by mass spectrometric analyses. The results showed the feasibility of breaking the degeneracy of sense codons to enhance the amino-acid diversity in the genetic code.
Affiliation(s)
- Takahito Mukai
  - Division of Structural and Synthetic Biology, RIKEN Center for Life Science Technologies, 1-7-22 Suehiro-cho, Tsurumi, Yokohama 230-0045, Japan; RIKEN Systems and Structural Biology Center, 1-7-22 Suehiro-cho, Tsurumi, Yokohama 230-0045, Japan
- Atsushi Yamaguchi
  - Division of Structural and Synthetic Biology, RIKEN Center for Life Science Technologies, 1-7-22 Suehiro-cho, Tsurumi, Yokohama 230-0045, Japan
- Kazumasa Ohtake
  - Division of Structural and Synthetic Biology, RIKEN Center for Life Science Technologies, 1-7-22 Suehiro-cho, Tsurumi, Yokohama 230-0045, Japan
- Mihoko Takahashi
  - Division of Structural and Synthetic Biology, RIKEN Center for Life Science Technologies, 1-7-22 Suehiro-cho, Tsurumi, Yokohama 230-0045, Japan
- Akiko Hayashi
  - Division of Structural and Synthetic Biology, RIKEN Center for Life Science Technologies, 1-7-22 Suehiro-cho, Tsurumi, Yokohama 230-0045, Japan; RIKEN Systems and Structural Biology Center, 1-7-22 Suehiro-cho, Tsurumi, Yokohama 230-0045, Japan
- Fumie Iraha
  - Division of Structural and Synthetic Biology, RIKEN Center for Life Science Technologies, 1-7-22 Suehiro-cho, Tsurumi, Yokohama 230-0045, Japan
- Satoshi Kira
  - RIKEN Systems and Structural Biology Center, 1-7-22 Suehiro-cho, Tsurumi, Yokohama 230-0045, Japan
- Tatsuo Yanagisawa
  - RIKEN Systems and Structural Biology Center, 1-7-22 Suehiro-cho, Tsurumi, Yokohama 230-0045, Japan; RIKEN Structural Biology Laboratory, 1-7-22 Suehiro-cho, Tsurumi, Yokohama 230-0045, Japan
- Shigeyuki Yokoyama
  - RIKEN Systems and Structural Biology Center, 1-7-22 Suehiro-cho, Tsurumi, Yokohama 230-0045, Japan; RIKEN Structural Biology Laboratory, 1-7-22 Suehiro-cho, Tsurumi, Yokohama 230-0045, Japan
- Hiroko Hoshi
  - RIKEN Systems and Structural Biology Center, 1-7-22 Suehiro-cho, Tsurumi, Yokohama 230-0045, Japan
- Takatsugu Kobayashi
  - RIKEN Systems and Structural Biology Center, 1-7-22 Suehiro-cho, Tsurumi, Yokohama 230-0045, Japan
- Kensaku Sakamoto
  - Division of Structural and Synthetic Biology, RIKEN Center for Life Science Technologies, 1-7-22 Suehiro-cho, Tsurumi, Yokohama 230-0045, Japan; RIKEN Systems and Structural Biology Center, 1-7-22 Suehiro-cho, Tsurumi, Yokohama 230-0045, Japan
|
11
|
Kawahara-Kobayashi A, Hitotsuyanagi M, Amikura K, Kiga D. Experimental evolution of a green fluorescent protein composed of 19 unique amino acids without tryptophan. Orig Life Evol Biosph 2014; 44:75-86. [PMID: 25399308 DOI: 10.1007/s11084-014-9371-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 10/10/2013] [Accepted: 09/25/2014] [Indexed: 10/24/2022]
Abstract
At some stage of evolution, genes of organisms may have encoded proteins that were synthesized using fewer than 20 unique amino acids. Similar to evolution of the natural 19-amino-acid proteins GroEL/ES, proteins composed of 19 unique amino acids would have been able to evolve by accumulating beneficial mutations within the 19-amino-acid repertoire encoded in an ancestral genetic code. Because Trp is thought to be the last amino acid included in the canonical 20-amino-acid repertoire, this late stage of protein evolution could be mimicked by experimental evolution of 19-amino-acid proteins without tryptophan (Trp). To further understand the evolution of proteins, we tried to mimic the evolution of a 19-amino-acid protein involving the accumulation of beneficial mutations using directed evolution by random mutagenesis on the whole targeted gene sequence. We created active 19-amino-acid green fluorescent proteins (GFPs) without Trp from a poorly fluorescent 19-amino-acid mutant, S1-W57F, by using directed evolution with two rounds of mutagenesis and selection. The N105I and S205T mutations showed beneficial effects on the S1-W57F mutant. When these two mutations were combined on S1-W57F, we observed an additive effect on the fluorescence intensity. In contrast, these mutations showed no clear improvement individually or in combination on GFPS1, which is the parental GFP mutant composed of 20 amino acids. Our results provide an additional example for the experimental evolution of 19-amino-acid proteins without Trp, and would help understand the mechanisms underlying the evolution of 19-amino-acid proteins.
Affiliation(s)
- Akio Kawahara-Kobayashi
- Department of Computational Intelligence and Systems Science, Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology, Yokohama, Kanagawa, 226-8503, Japan
|
12
|
Pollack JD, Gerard D, Pearl DK. Uniquely localized intra-molecular amino acid concentrations at the glycolytic enzyme catalytic/active centers of Archaea, Bacteria and Eukaryota are associated with their proposed temporal appearances on earth. Orig Life Evol Biosph 2013; 43:161-87. [PMID: 23715690 DOI: 10.1007/s11084-013-9331-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 10/24/2012] [Accepted: 04/04/2013] [Indexed: 11/27/2022]
Abstract
The distributions of amino acids at most-conserved sites nearest catalytic/active centers (C/AC) in 4,645 sequences of ten enzymes of the glycolytic Embden-Meyerhof-Parnas pathway in Archaea, Bacteria and Eukaryota are similar to the proposed temporal order of their appearance on Earth. Glycine, isoleucine, leucine, valine, glutamic acid and possibly lysine, often described as prebiotic, i.e., existing or occurring before the emergence of life, were localized in positionally and conservationally defined aggregations in all enzymes of all Domains. The distributions of all 20 biologic amino acids in most-conserved sites nearest their C/ACs were quite different either from distributions in sites less-conserved and further from their C/ACs or from all amino acids regardless of their position or conservation. The major concentration of glycine, perhaps the earliest prebiotic amino acid, occupies ≈ 16 % of all the most-conserved sites within a volume of ≈ 7-8 Å radius from their C/ACs and decreases linearly towards the molecule's peripheries. Spatially localized major concentrations of isoleucine, leucine and valine are in the mid-conserved and mid-distant sites from their C/ACs in protein interiors. Lysine and glutamic acid comprise ≈ 25-30 % of all amino acids within an irregular volume bounded by ≈ 24-28 Å radii from their C/ACs at the most-distant least-conserved sites. The unreported characteristics of these amino acids, namely their spatially and conservationally identified concentrations in Archaea, Bacteria and Eukaryota, suggest some common structural organization of glycolytic enzymes that may be relevant to their evolution and that of other proteins. We discuss our data in relation to enzyme evolution, their reported prebiotic putative temporal appearances on Earth, abundances, biological "cost", neighbor-sequence preferences or "ordering" and some thermodynamic parameters.
Affiliation(s)
- J Dennis Pollack
- Department of Molecular Virology, Immunology and Medical Genetics, The College of Medicine, The Ohio State University, Columbus, OH 43210, USA.
|
13
|
Abstract
Commonly calculated near-zero probabilities for the synthesis of a given protein sequence by chance are that small because the proteins taken for the calculations are too large (over 100 residues). The same estimate for chains of 20-30 residues makes the chance close to 1.
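The length dependence behind this argument can be sketched numerically. The abstract does not state how the estimate is made, so the following is only an assumed model: each trial draws a chain uniformly from 20 amino acids, and the chance of at least one match in N independent trials is 1 - (1 - p)^N. The value of `HYPOTHETICAL_TRIALS` is an arbitrary placeholder, not a figure from the source, and the result depends strongly on it.

```python
import math


def prob_specific_sequence(length, alphabet_size=20):
    """Probability that one uniformly random chain equals a given sequence."""
    return float(alphabet_size) ** -length


def prob_at_least_one_hit(length, trials, alphabet_size=20):
    """Chance of at least one exact match in `trials` independent random chains.

    Uses log1p/expm1 so that very small per-trial probabilities are not
    lost to floating-point rounding.
    """
    p = prob_specific_sequence(length, alphabet_size)
    return -math.expm1(trials * math.log1p(-p))


# Arbitrary placeholder for the number of random chains sampled (assumption).
HYPOTHETICAL_TRIALS = 1e42

for n in (20, 30, 100):
    print(n, prob_specific_sequence(n), prob_at_least_one_hit(n, HYPOTHETICAL_TRIALS))
```

Under these assumptions the per-trial probability falls by a factor of 20 with every added residue, which is why the outcome for short chains differs so sharply from that for chains of 100 residues or more.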
Affiliation(s)
- Edward N Trifonov
- Genome Diversity Center, Institute of Evolution, University of Haifa, Haifa 31905, Israel.
|
14
|
Di Giulio M. The origin of the genetic code: theories and their relationships, a review. Biosystems 2004; 80:175-84. [PMID: 15823416 DOI: 10.1016/j.biosystems.2004.11.005] [Citation(s) in RCA: 97] [Impact Index Per Article: 4.6] [Received: 08/03/2004] [Revised: 11/12/2004] [Accepted: 11/18/2004] [Indexed: 10/26/2022]
Abstract
A review of the main theories proposed to explain the origin of the genetic code is presented. I analyze arguments and data in favour of different theories proposed to explain the origin of the organization of the genetic code. It is possible to suggest a mechanism that makes compatible the different theories of the origin of the code, even if these are based on a historical or physicochemical determinism and thus appear incompatible by definition. Finally, I discuss the question of why a given number of synonymous codons was attributed to the amino acids in the genetic code.
Affiliation(s)
- Massimo Di Giulio
- Institute of Genetics and Biophysics Adriano Buzzati-Traverso, CNR, Naples, Italy
|
15
|
Abstract
The temporal order ("chronology") of appearance of amino acids and their respective codons on the evolutionary scene is reconstructed. A consensus chronology of amino acids is built on the basis of 60 different criteria, each offering a certain temporal order. After several steps of filtering, the chronology vectors are averaged, resulting in the consensus order: G, A, D, V, P, S, E, (L, T), R, (I, Q, N), H, K, C, F, Y, M, W. It reveals two important features: the amino acids synthesized in the imitation experiments of S. Miller appeared first, while the amino acids associated with codon capture events came last. The reconstruction of codon chronology is based on the above consensus temporal order of amino acids, supplemented by the stability and complementarity rules first suggested by M. Eigen and P. Schuster, and on the earlier established processivity rule. At no point in the reconstruction was the consensus amino-acid chronology in conflict with these three rules. The derived genealogy of all 64 codons suggested several important predictions that are confirmed. The reconstruction of the origin and evolutionary history of the triplet code thus becomes a powerful research tool for molecular evolution studies, especially in its early stages.
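The averaging step can be illustrated with a small sketch. It assumes that each criterion supplies a complete ranking and that the consensus is obtained by sorting on the mean rank; the filtering steps mentioned in the abstract are omitted, and the three toy rankings below are invented for illustration only, not taken from the 60 criteria.

```python
from statistics import mean


def consensus_order(rankings):
    """Sort items by their mean position across several complete rankings."""
    items = rankings[0]
    mean_rank = {
        item: mean(ranking.index(item) for ranking in rankings)
        for item in items
    }
    # sorted() is stable, so ties keep the order of the first ranking.
    return sorted(items, key=lambda item: mean_rank[item])


# Toy data (hypothetical): three criteria ranking four amino acids.
toy_rankings = [
    ["G", "A", "D", "V"],
    ["A", "G", "V", "D"],
    ["G", "A", "V", "D"],
]

print(consensus_order(toy_rankings))
```

Items with equal mean rank would correspond to the parenthesized groups in the consensus order, such as (L, T); this sketch does not mark such ties explicitly.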
Affiliation(s)
- E N Trifonov
- Genome Diversity Center, Institute of Evolution, University of Haifa, Haifa 31905, Israel.
|
16
|
|
17
|
Abstract
Since discovering the pattern by which amino acids are assigned to codons within the standard genetic code, investigators have explored the idea that natural selection placed biochemically similar amino acids near to one another in coding space so as to minimize the impact of mutations and/or mistranslations. The analytical evidence to support this theory has grown in sophistication and strength over the years, and counterclaims questioning its plausibility and quantitative support have yet to transcend some significant weaknesses in their approach. These weaknesses are illustrated here by means of a simple simulation model for adaptive genetic code evolution. There remain ill explored facets of the 'error minimizing' code hypothesis, however, including the mechanism and pathway by which an adaptive pattern of codon assignments emerged, the extent to which natural selection created synonym redundancy, its role in shaping the amino acid and nucleotide languages, and even the correct interpretation of the adaptive codon assignment pattern: these represent fertile areas for future research.
Affiliation(s)
- Stephen J Freeland
- Department of Biology, University of Maryland, Baltimore County, Catonsville, MD, USA.
|
18
|
Abstract
Experimental studies have shown that the full sequence complexity of naturally occurring proteins is not required to generate rapidly folding and functional proteins, i.e. proteins can be designed with fewer than 20 letters. This raises the question: what is the minimum number of amino acid types required to encode complex protein folds? Here, we investigate this issue from three aspects. First, we study the minimum sequence complexity that can reserve the necessary structural information for detection of distantly related homologues. Second, we compare the ability to design foldable model sequences over a wide range of reduced amino acid alphabets, to find the minimum number of letters that has a design ability similar to that of 20. Finally, we survey the lower bound of alphabet size of globular proteins in a non-redundant protein database. These different approaches give a remarkably consistent view: the minimum number of letters required to fold a protein is around ten.
Affiliation(s)
- Ke Fan
- National Laboratory of Solid State Microstructure and Department of Physics, Nanjing University, People's Republic of China
|
19
|
Akanuma S, Kigawa T, Yokoyama S. Combinatorial mutagenesis to restrict amino acid usage in an enzyme to a reduced set. Proc Natl Acad Sci U S A 2002; 99:13549-53. [PMID: 12361984 PMCID: PMC129711 DOI: 10.1073/pnas.222243999] [Citation(s) in RCA: 64] [Impact Index Per Article: 2.8] [Indexed: 11/18/2022]
Abstract
We developed an effective strategy to restrict the amino acid usage in a relatively large protein to a reduced set with conservation of its in vivo function. The 213-residue Escherichia coli orotate phosphoribosyltransferase was subjected to 22 cycles of segment-wise combinatorial mutagenesis followed by 6 cycles of site-directed random mutagenesis, both coupled with a growth-related phenotype selection. The enzyme eventually tolerated 73 amino acid substitutions: in the final variant, 9 amino acid types (A, D, G, L, P, R, T, V, and Y) occupied 188 positions (88%), and none of 7 amino acid types (C, H, I, M, N, Q, and W) appeared. Therefore, the catalytic function associated with a relatively large protein may be achieved with a subset of the 20 amino acids. The converged sequence also implies simpler constituents for proteins in the early stage of evolution.
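The composition figures quoted here are the kind of summary that can be computed directly from a sequence. The sketch below is illustrative only: `alphabet_usage` is a hypothetical helper, and `toy_sequence` is an invented string, not the enzyme's sequence. The last line simply checks the arithmetic of the reported 188 out of 213 positions.

```python
from collections import Counter

ALL_AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
REDUCED_SET = "ADGLPRTVY"  # the nine types named in the abstract


def alphabet_usage(sequence, subset=REDUCED_SET):
    """Summarize which amino acid types a sequence uses (hypothetical helper)."""
    counts = Counter(sequence)
    covered = sum(counts[aa] for aa in subset)
    return {
        "types_used": len(counts),
        "absent_types": sorted(set(ALL_AMINO_ACIDS) - set(counts)),
        "fraction_in_subset": covered / len(sequence),
    }


# Invented toy sequence, for illustration only.
toy_sequence = "ADGLPRTVYADGKE"
print(alphabet_usage(toy_sequence))

# Arithmetic check of the reported coverage: 188 of 213 positions.
print(round(100 * 188 / 213))
```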
Affiliation(s)
- Satoshi Akanuma
- RIKEN Genomic Sciences Center, Tsurumi, Yokohama 230-0045, Japan
|
20
|
Miseta A, Csutora P. Relationship between the occurrence of cysteine in proteins and the complexity of organisms. Mol Biol Evol 2000; 17:1232-9. [PMID: 10908643 DOI: 10.1093/oxfordjournals.molbev.a026406] [Citation(s) in RCA: 303] [Impact Index Per Article: 12.1] [Indexed: 11/13/2022]
Abstract
The occurrence and relative positions of cysteine residues were investigated in proteins of various species. Considering random mathematical occurrence for an amino acid coded by two codons (3.28%), cysteine is underrepresented in all organisms investigated. Representation of cysteine appears to correlate positively with the complexity of the organism, ranging between 2.26% in mammals and 0.5% in some members of the Archaebacteria order. This observation, together with the results obtained from comparison of cysteine content of various ribosomal proteins, indicates that evolution takes advantage of increased use of cysteine residues. In all organisms studied except plants, two cysteines are frequently found two amino acid residues apart (C-(X)(2)-C motif). Such a motif is known to be present in a variety of metal-binding proteins and oxidoreductases. Remarkably, more than 21% of all cysteines were found within the C-(X)(2)-C motifs in Archaea. This observation may indicate that cysteine appeared in ancient metal-binding proteins first and was introduced into other proteins later.
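The 3.28% baseline is consistent with a simple count, assuming that "random occurrence" means equal use of all 61 sense codons of the standard genetic code, so that an amino acid with two codons is expected at 2/61. The sketch below reproduces that arithmetic; the table string assumes the usual UCAG ordering, and `expected_frequency` is a hypothetical helper name.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def expected_frequency(amino_acid, table=CODON_TABLE):
    """Expected share of an amino acid if all sense codons were used equally."""
    sense = [aa for aa in table.values() if aa != "*"]
    return sense.count(amino_acid) / len(sense)


# Cysteine has two codons (UGU, UGC) among 61 sense codons.
print(round(100 * expected_frequency("C"), 2))
```

The same helper gives the corresponding baseline for any other amino acid, for example one with six codons such as leucine.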
Affiliation(s)
- A Miseta
- Department of Clinical Chemistry, Faculty of Medicine, Pécs University, Pécs, Hungary.
|
21
|
Abstract
According to the molecular recognition theory, the complementarity of the sense and nonsense DNA strands is reflected in a complementarity of polypeptides and the corresponding nonsense polypeptides. A comparison of the sense and nonsense code matrices, and of the antisense and antinonsense code matrices, either by visual inspection or by comparing the corresponding hydrophobicity matrices (e.g., by simply adding them together), revealed no complementarity of these pairs of matrices in terms of possible attractive physical forces. Instead, it was evident that the codes divide the amino acids into two major groups: hydrophilic and hydrophobic, a division which is directly correlated with the folding property of proteins. A simple primordial genetic code distinguishing between these two types of amino acids would have been capable of generating three-dimensionally folded peptides, which could stabilize coding RNAs by forming ribonucleoprotein complexes. This evolutionary scheme is reflected in the present organisation of information processing and storage in essentially all organisms. RNAs are processed and translated into proteins by ribonucleoproteins, while other steps in information retrieval and processing, such as DNA replication, transcription, protein folding and posttranslational processing, are catalyzed by proteins. This shows that the evolution of DNA as an information storage medium was a secondary event, unrelated to the evolution of the genetic code. From the primordial hydrophilic/hydrophobic (e.g., Leu/Arg) code, evolution proceeded by introduction of a catalytic amino acid (Ser). The further evolution of the code has mainly served to increase the number of functional hydrophilic amino acids, since there has not been a great advantage in increasing the number of structural, hydrophobic amino acids. At some stage during the evolution of the genetic code, double-stranded DNA was introduced as a maximally safe genetic copy of RNA. This required the action of highly specific enzymes, and was therefore preceded by the refinement of the genetic code. As a conclusion of this evolutionary scheme, it can be inferred that, in general, only the sense strand encodes proteins.
Affiliation(s)
- G Houen
- Department of Protein Chemistry, Statens Serum Institut, Copenhagen S, Denmark.
|
22
|
Di Giulio M. The beta-sheets of proteins, the biosynthetic relationships between amino acids, and the origin of the genetic code. ORIGINS LIFE EVOL B 1996; 26:589-609. [PMID: 9008882 DOI: 10.1007/bf01808222] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.1] [Indexed: 02/03/2023]
Abstract
Two forces are generally hypothesised as being responsible for conditioning the origin of the organization of the genetic code: the physicochemical properties of amino acids and their biosynthetic relationships (relationships between precursor and product amino acids). If we assume that the biosynthetic relationships between amino acids were fundamental in defining the genetic code, then it is reasonable to expect that the distribution of physicochemical properties among the amino acids in precursor-product relationships cannot be random but must, rather, be affected by some selective constraints imposed by the structure of primitive proteins. Analysis shows that measurements representing the 'size' of amino acids, e.g. bulkiness, are specifically associated with the pairs of amino acids in precursor-product relationships. However, the size of amino acids cannot have been selected per se but, rather, because it reflects the beta-sheets of proteins which are, therefore, identified as the main adaptive theme promoting the origin of genetic code organization. By contrast, there are no traces of the alpha-helix in the genetic code table. The above considerations make it necessary to re-examine the relationship linking the hydrophilicity of the dinucleoside monophosphates of anticodons and the polarity and bulkiness of amino acids. It can be concluded that this relationship seems to be meaningful only between the hydrophilicity of anticodons and the polarity of amino acids. The latter relationship is supposed to have been operative on hairpin structures, ancestors of the tRNA molecule. Moreover, it is on these very structures that the biosynthetic links between precursor and product amino acids might have been achieved, and the interaction between the hydrophilicity of anticodons and the polarity of amino acids might have had a role in the concession of codons (anticodons) from precursors to products.
Affiliation(s)
- M Di Giulio
- International Institute of Genetics and Biophysics, CNR, Napoli, Italy
|
23
|
Abstract
Two ideas have essentially been used to explain the origin of the genetic code: Crick's frozen accident and Woese's amino acid-codon specific chemical interaction. Whatever the origin and codon-amino acid correlation, it is difficult to imagine the sudden appearance of the genetic code in its present form of 64 codons coding for 20 amino acids without appealing to some evolutionary process. On the contrary, it is more reasonable to assume that it evolved from a much simpler initial state in which a few triplets were coding for each of a small number of amino acids. Analysis of the genetic code through information theory and the metabolism of pyrimidine biosynthesis provides evidence suggesting that the genetic code could have begun in an RNA world with the two letters A and U grouped in eight triplets coding for seven amino acids and one stop signal. This code could have progressively evolved by making gradual use of the letters G and C to end with 64 triplets coding for 20 amino acids and three stop signals. According to the proposed evidence, DNA could have appeared after the four-letter structure was already achieved. In the newborn DNA world, T replaced U to achieve higher physicochemical and genetic stability.
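The triplet counts in this abstract (eight with two letters, 64 with four) can be reproduced by direct enumeration. The sketch below is illustrative only: the assignments in `STANDARD_CODE` are those of the modern standard genetic code, not the primordial assignments the abstract proposes, and the names used are assumptions made for this example.

```python
from itertools import product

# Modern standard-code assignments for the eight A/U-only triplets.
# These are NOT the primordial assignments proposed in the abstract.
STANDARD_CODE = {
    "AAA": "Lys", "AAU": "Asn", "AUA": "Ile", "AUU": "Ile",
    "UAA": "Stop", "UAU": "Tyr", "UUA": "Leu", "UUU": "Phe",
}

# All triplets that can be written with a two-letter and a four-letter alphabet.
au_triplets = ["".join(t) for t in product("AU", repeat=3)]
all_triplets = ["".join(t) for t in product("ACGU", repeat=3)]

print(len(au_triplets), len(all_triplets))  # 2**3 and 4**3

# Distinct meanings of the A/U triplets in the modern code.
modern_amino_acids = {STANDARD_CODE[t] for t in au_triplets} - {"Stop"}
print(sorted(modern_amino_acids))
```

In the modern code these eight triplets specify six distinct amino acids plus one stop signal, because AUA and AUU both encode Ile. The abstract's proposal of seven amino acids and one stop therefore implies that at least one of these triplets had a different assignment in the primitive code.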
Affiliation(s)
- A Jiménez-Sánchez
- Departamento de Bioquímica, Biología Molecular y Genética, Universidad de Extremadura, Badajoz, Spain
|
24
|
Abstract
A series of stages in the evolution of the genetic code is postulated, representing a chain of logical steps that leads to the present-day code. The stages described are based on translation machinery between the RNA world and that of amino acids, a model that consists of an RNA assembler strand along which RNA hairpin molecules are lined up, forming a picket-fence-like aggregate. Each hairpin carries an amino acid at the bottom of one of its legs, and the mutual proximity of amino acids achieved in this way facilitates their linkage into oligopeptides, in a sequence governed by the nucleotide sequence along the assembler strand, the code. The order in which amino acids are introduced into the code is in the approximate order of their availability, tempered by polarity and structural considerations.
Affiliation(s)
- H Kuhn
- Max-Planck-Institut für biophysikalische Chemie, Göttingen, Germany
|