1
Di Giulio M. Theories of the origin of the genetic code: Strong corroboration for the coevolution theory. Biosystems 2024; 239:105217. [PMID: 38663520 DOI: 10.1016/j.biosystems.2024.105217] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2024] [Revised: 04/16/2024] [Accepted: 04/18/2024] [Indexed: 04/29/2024]
Abstract
I analyzed all the theories and models of the origin of the genetic code, and over the years, I have considered the main suggestions that could explain this origin. The conclusion of this analysis is that the coevolution theory of the origin of the genetic code is the theory that best captures the majority of observations concerning the organization of the genetic code. In other words, the biosynthetic relationships between amino acids would have heavily influenced the origin of the organization of the genetic code, as supported by the coevolution theory. Instead, the presence in the genetic code of physicochemical properties of amino acids, which have also been linked to the physicochemical properties of anticodons or codons or bases by stereochemical and physicochemical theories, would simply be the result of natural selection. More explicitly, I maintain that these correlations between codons, anticodons or bases and amino acids are in fact the result not of a real correlation between amino acids and codons, for example, but are only the effect of the intervention of natural selection. Specifically, in the genetic code table we expect, for example, that the most similar codons - that is, those that differ by only one base - will have more similar physicochemical properties. Therefore, the 64 codons of the genetic code table ordered in a certain way would also represent an ordering of some of their physicochemical properties. Now, a study aimed at clarifying which physicochemical property of amino acids has influenced the allocation of amino acids in the genetic code has established that the partition energy of amino acids played a decisive role in this. Indeed, under some conditions, the genetic code was found to be approximately 98% optimized on its columns. In this same work, it was shown that this was most likely the result of the action of natural selection.
If natural selection had truly allocated the amino acids in the genetic code in such a way that similar amino acids also have similar codons - this, not through a mechanism of physicochemical interaction between, for example, codons and amino acids - then it might turn out that even different physicochemical properties of codons (or anticodons or bases) show some correlation with the physicochemical properties of amino acids, simply because the partition energy of amino acids is correlated with other physicochemical properties of amino acids. It is very likely that this would inevitably lead to a correlation between codons (or anticodons or bases) and amino acids. In other words, since the codons (anticodons or bases) are ordered in the genetic code, that is to say, some of their physicochemical properties should also be ordered by a similar order, and given that the amino acids would also appear to have been ordered in the genetic code by natural selection, then it should inevitably turn out that there is a correlation between, for example, the hydrophobicity of anticodons and that of amino acids. Instead, the intervention of natural selection in organizing the genetic code would appear to be highly compatible with the main mechanism of structuring the genetic code as supported by the coevolution theory. This would make the coevolution theory the only plausible explanation for the origin of the genetic code.
Affiliation(s)
- Massimo Di Giulio
- The Ionian School, Early Evolution of Life Department, Genetic Code and tRNA Origin Laboratory, Via Roma 19, 67030, Alfedena, L'Aquila, Italy.
2
Di Giulio M. The time of appearance of the genetic code. Biosystems 2024; 237:105159. [PMID: 38373543 DOI: 10.1016/j.biosystems.2024.105159] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Revised: 02/13/2024] [Accepted: 02/16/2024] [Indexed: 02/21/2024]
Abstract
I support the hypothesis that the origin of the genetic code occurred simultaneously with the evolution of cellularity. That is to say, I favour the hypothesis that the origin of the genetic code is a very, very late event in the history of life on Earth. I corroborate this hypothesis with observations favouring the progenote's stage for the Last Universal Common Ancestor (LUCA), for the ancestor of bacteria and that of archaea. Indeed, these progenotic stages would imply that - at that time - the origin of the genetic code was still ongoing simply because this origin would fall within the very definition of progenote. Therefore, if the evolution of cellularity had truly been coeval with the origin of the genetic code - at least in its terminal part - then this would favour theories such as the coevolution theory of the origin of the genetic code because this theory would postulate that this origin must have occurred in extremely complex protocellular conditions and not concerning stereochemical or physicochemical interactions having to do with other stages of the origin of life. In this sense, the coevolution theory would be corroborated while the stereochemical and physicochemical theories would be damaged. Therefore, the origin of the genetic code would be linked to the origin of the cell and not to the origin of life as sometimes asserted. Therefore, I will discuss the late hypothesis of the origin of the genetic code in the context of the theories proposed to explain this origin and more generally of its implications for the early evolution of life.
Affiliation(s)
- Massimo Di Giulio
- The Ionian School, Early Evolution of Life Department, Genetic Code and tRNA Origin Laboratory, Via Roma 19, 67030, Alfedena, L'Aquila, Italy.
3
Marshall LK, Fahrenbach AC, Thordarson P. RNA-Binding Peptides Inspired by the RNA Recognition Motif. ACS Chem Biol 2024; 19:243-248. [PMID: 38314708 DOI: 10.1021/acschembio.3c00694] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/07/2024]
Abstract
β-Hairpin peptides with RNA-binding sequences mimicking the central two β-strands of the RNA recognition motif (RRM) protein domain have been observed to bind in a 2:1 fashion to a series of RNA homooligonucleotides in aqueous solution (PBS buffer, pH 7.40) with binding energies (-27 to -35 kJ mol⁻¹) similar to those of full-size protein RRMs. The peptides display mild selectivities with respect to the binding of the different homooligomers. Binding studies in 500 mM magnesium chloride suggest that the complex formation is not predominantly driven by Coulombic attraction. These peptides represent a starting point for further studies of non-Coulombic binding of RNA by peptides and proteins, which is important in the context of contemporary biology, potential therapeutic applications, and prebiotic peptide-RNA interactions.
4
Kak S. Self-similarity and the maximum entropy principle in the genetic code. Theory Biosci 2023; 142:205-210. [PMID: 37402087 DOI: 10.1007/s12064-023-00396-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/05/2022] [Accepted: 06/16/2023] [Indexed: 07/05/2023]
Abstract
This paper addresses the relationship between information and structure of the genetic code. The code has two puzzling anomalies: First, when viewed as 64 sub-cubes of a [Formula: see text] cube, the codons for serine (S) are not contiguous, and there are amino acid codons with zero redundancy, which goes counter to the objective of error correction. To make sense of this, the paper shows that the genetic code must be viewed not only on stereochemical, co-evolution, and error-correction considerations, but also on two additional factors of significance to natural systems, that of an information-theoretic dimensionality of the code data, and the principle of maximum entropy. One implication of non-integer dimensionality associated with data dimensions is self-similarity to different scales, and it is shown that the genetic code does satisfy this property, and it is further shown that the maximum entropy principle operates through the scrambling of the elements in the sense of maximum algorithmic information complexity, generated by an appropriate exponentiation mapping. It is shown that the new considerations and the use of maximum entropy transformation create new constraints that are likely the reasons for the non-uniform codon groups and codons with no redundancy.
Affiliation(s)
- Subhash Kak
- Chapman University, Orange, CA, 92866, USA.
- Oklahoma State University, Stillwater, OK, 74078, USA.
5
Abstract
The mechanism and the evolution of DNA replication and transcription, the key elements of the central dogma of biology, are fundamentally well explained by the physicochemical complementarity between strands of nucleic acids. However, the determinants that have shaped the third part of the dogma, namely the process of biological translation and the universal genetic code, remain unclear. We review and seek parallels between different proposals that view the evolution of translation through the prism of weak, noncovalent interactions between biological macromolecules. In particular, we focus on a recent proposal that there exists a hitherto unrecognized complementarity at the heart of biology, that between messenger RNA coding regions and the proteins that they encode, especially if the two are unstructured. Reflecting the idea that the genetic code evolved from intrinsic binding propensities between nucleotides and amino acids, this proposal promises to forge a link between the distant past and the present of biological systems.
Affiliation(s)
- Bojan Zagrovic
- Department of Structural and Computational Biology, Max Perutz Labs & University of Vienna, Vienna, Austria;
- Marlene Adlhart
- Department of Structural and Computational Biology, Max Perutz Labs & University of Vienna, Vienna, Austria;
- Thomas H Kapral
- Department of Structural and Computational Biology, Max Perutz Labs & University of Vienna, Vienna, Austria;
- Vienna BioCenter PhD Program, Doctoral School of the University of Vienna and Medical University of Vienna, Vienna, Austria
6
Jenne F, Berezkin I, Tempel F, Schmidt D, Popov R, Nesterov-Mueller A. Screening for Primordial RNA–Peptide Interactions Using High-Density Peptide Arrays. Life (Basel) 2023; 13:life13030796. [PMID: 36983951 PMCID: PMC10053474 DOI: 10.3390/life13030796] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2023] [Revised: 03/08/2023] [Accepted: 03/13/2023] [Indexed: 03/17/2023]
Abstract
RNA–peptide interactions are an important factor in the origin of the modern mechanism of translation and the genetic code. Despite great progress in the bioinformatics of RNA–peptide interactions due to the rapid growth in the number of known RNA–protein complexes, there is no comprehensive experimental method to take into account the influence of individual amino acids on non-covalent RNA–peptide bonds. First, we designed the combinatorial libraries of primordial peptides according to the combinatorial fusion rules based on Watson–Crick mutations. Next, we used high-density peptide arrays to investigate the interaction of primordial peptides with their cognate homo-oligonucleotides. We calculated the interaction scores of individual peptide fragments and evaluated the influence of the peptide length and its composition on the strength of RNA binding. The analysis shows that the amino acids phenylalanine, tyrosine, and proline contribute significantly to the strong binding between peptides and homo-oligonucleotides, while the sum charge of the peptide does not have a significant effect. We discuss the physicochemical implications of the combinatorial fusion cascade, a hypothesis that follows from the amino acid partition used in the work.
Affiliation(s)
- Felix Jenne
- Institute of Microstructure Technology, Karlsruhe Institute of Technology, DE-76344 Eggenstein-Leopoldshafen, Germany
- Ivan Berezkin
- Institute of Microstructure Technology, Karlsruhe Institute of Technology, DE-76344 Eggenstein-Leopoldshafen, Germany
- Frank Tempel
- Institute of Microstructure Technology, Karlsruhe Institute of Technology, DE-76344 Eggenstein-Leopoldshafen, Germany
- Dimitry Schmidt
- Institute of Microstructure Technology, Karlsruhe Institute of Technology, DE-76344 Eggenstein-Leopoldshafen, Germany
- Alexander Nesterov-Mueller
- Institute of Microstructure Technology, Karlsruhe Institute of Technology, DE-76344 Eggenstein-Leopoldshafen, Germany
- Correspondence: Tel.: +49-721-608-29253
7
Borah C, Ali T. Genetic code noise immunity features: Degeneracy and frameshift correction. GENE REPORTS 2022. [DOI: 10.1016/j.genrep.2022.101707] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
8
Arguments against the stereochemical theory of the origin of the genetic code. Biosystems 2022; 221:104750. [PMID: 35970477 DOI: 10.1016/j.biosystems.2022.104750] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2022] [Revised: 07/26/2022] [Accepted: 07/26/2022] [Indexed: 11/23/2022]
Abstract
I support the hypothesis that stereochemical theory is unnatural because it is based on artificial and not simple mechanisms as required for a good theory. Indeed, for stereochemical theory the origin of the genetic code requires, in the first place, a primary interaction, for example, between a codon and an amino acid on a proto-tRNA. But this interaction is a necessary but not sufficient condition, because the evolution of the mRNA molecule, which would really define the genetic code, is still necessary for the complete origin of the genetic code. In other words, the need for two molecules, tRNA and mRNA, to define the genetic code, with their at least partial independence would testify to an artificial mechanism typical of stereochemical theory because it would not guarantee that amino acid-codon (or -anticodon) assignments realized in the first phase of the origin of the genetic code, would necessarily be maintained also in the second phase of its completion. Furthermore, the genetic code encodes for amino acids but amino acids are not the truly functional aspect, they are only intermediaries, of their final products, proteins, which are the only true entities actually coded by genes. Therefore, it would not be immediately clear from the point of view of stereochemical theory, to say why it is the amino acids and not the proteins that are involved in the primary stereochemical interactions that would have led to the origin of the genetic code. Hence, at least some of the stereochemical theory models would be not very credible, not being able to say much about the coding of proteins by genes. Finally, I inspected the genetic code table following the logic that more closely similar amino acids should - according to stereochemical theory - be coded by highly similar codons, finding that only a few pairs of amino acids actually satisfy this logic, further discretizing the stereochemical theory.
9
Model of Genetic Code Structure Evolution under Various Types of Codon Reading. Int J Mol Sci 2022; 23:ijms23031690. [PMID: 35163612 PMCID: PMC8835785 DOI: 10.3390/ijms23031690] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/31/2021] [Revised: 01/23/2022] [Accepted: 01/25/2022] [Indexed: 11/28/2022]
Abstract
The standard genetic code (SGC) is a set of rules according to which 64 codons are assigned to 20 canonical amino acids and a stop coding signal. As a consequence, the SGC is redundant because there is a greater number of codons than the number of encoded labels. This redundancy implies the existence of codons that encode the same genetic information. The size and organization of such synonymous codon blocks are important characteristics of the SGC structure whose evolution is still unclear. Therefore, we studied possible evolutionary mechanisms of the codon block structure. We conducted computer simulations assuming that coding systems at early stages of the SGC evolution were sets of ambiguous codon assignments with high entropy. We included three types of reading systems characterized by different inaccuracy and pattern of codon recognition. In contrast to the previous study, we allowed for evolution of the reading systems and their competition. The simulations performed under minimization of translational errors and reduction of coding ambiguity produced a coding system resistant to these errors. The reading system similar to that present in the SGC dominated the others very quickly. The surviving system was also characterized by low entropy and possessed properties similar to those of the SGC. Our simulations show that the unambiguous SGC could have emerged from a code with a lower level of ambiguity and that the number of tRNAs increased during the evolution.
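The redundancy described in this abstract can be made concrete with a short sketch. It assumes the standard code as given in NCBI translation table 1, with bases ordered T, C, A, G; the variable names are illustrative and are not taken from the paper. It counts the encoded labels and how many labels own a given number of codons (codons per label, not contiguous blocks in the table).

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), codons in TCAG order; '*' marks stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each of the 64 codons (TTT, TTC, TTA, ...) to its label.
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Number of codons assigned to each label (20 amino acids plus stop).
codons_per_label = Counter(CODE.values())
# How many labels have 6, 4, 3, 2 or 1 codons.
size_histogram = Counter(codons_per_label.values())

print(len(CODE), "codons,", len(codons_per_label), "labels")
for size in sorted(size_histogram, reverse=True):
    print(size, "codons:", size_histogram[size], "labels")
```

The histogram makes the non-uniform degeneracy visible: three labels with six codons, five with four, two with three, nine with two, and two with a single codon.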
10
Caldararo F, Di Giulio M. The genetic code is very close to a global optimum in a model of its origin taking into account both the partition energy of amino acids and their biosynthetic relationships. Biosystems 2022; 214:104613. [DOI: 10.1016/j.biosystems.2022.104613] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/19/2021] [Revised: 01/16/2022] [Accepted: 01/17/2022] [Indexed: 01/23/2023]
11
Use of the Codon Table to Quantify the Evolutionary Role of Random Mutations. ALGORITHMS 2021. [DOI: 10.3390/a14090270] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/06/2023]
Abstract
The various biases affecting RNA mutations during evolution are the subject of intense research, leaving the extent of the role of random mutations undefined. To remedy this lacuna, using the codon table, the number of codons representing each amino acid was correlated with the amino acid frequencies in different branches of the evolutionary tree. The correlations were seen to increase as evolution progressed. Furthermore, the number of RNA mutations that resulted in a given amino acid mutation was found to be correlated with several widely used amino acid similarity tables (used in sequence alignments). These correlations were seen to increase when the observed codon usage was factored in.
12
The Combinatorial Fusion Cascade to Generate the Standard Genetic Code. Life (Basel) 2021; 11:life11090975. [PMID: 34575125 PMCID: PMC8467831 DOI: 10.3390/life11090975] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 09/02/2021] [Revised: 09/14/2021] [Accepted: 09/14/2021] [Indexed: 11/17/2022]
Abstract
The combinatorial fusion cascade was proposed as a transition stage between prebiotic chemistry and early forms of life. The combinatorial fusion cascade consists of three stages: eight initial complementary pairs of amino acids, four protocodes, and the standard genetic code. The initial complementary pairs and the protocodes are divided into dominant and recessive entities. The transitions between these stages obey the same combinatorial fusion rules for all amino acids. The combinatorial fusion cascade mathematically describes the codon assignments in the standard genetic code. It explains the availability of amino acids with even and odd numbers of codons, the appearance of stop codons, the inclusion of novel canonical amino acids, the exceptionally high numbers of codons for the amino acids arginine, leucine, and serine, and the temporal order of amino acid inclusion into the genetic code. The temporal order of amino acids within the cascade is congruent with the consensus temporal order previously derived from the similarities between the available hypotheses. Control over combinatorial fusion cascades would open the road to a novel technology for developing artificial microorganisms.
13
Pawlak K, Wnetrzak M, Mackiewicz D, Mackiewicz P, Błażej P. Models of genetic code structure evolution with variable number of coded labels. Biosystems 2021; 210:104528. [PMID: 34492316 DOI: 10.1016/j.biosystems.2021.104528] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/17/2021] [Revised: 08/26/2021] [Accepted: 08/27/2021] [Indexed: 10/20/2022]
Abstract
It is assumed that at the early stage of cell evolution its translation machinery was characterized by high noise, i.e. ambiguous assignment of codons to amino acids in the genetic code, which initially encoded only a few amino acids. Next, during its evolution new amino acids were added to this code. Taking these facts into account, we investigated theoretical models of the genetic code's structure, which evolved from a set of ambiguous codon assignments into a coding system with a low level of uncertainty. We considered three types of translational inaccuracies assuming a different number of fixed codon positions. We applied a modified version of an evolutionary algorithm for finding the genetic codes that most effectively reduced the initial uncertainty in the assignment of codons to encoded labels, i.e. amino acids and a stop translation signal. We examined codes with the number of labels from four to 22. Our results indicated that the quality of genetic code structure is strongly dependent on the number of encoded labels as well as the type of translational mechanism. Stricter assignments of codons to labels were preferred by the codes encoding a larger number of labels. The results showed that a smaller degeneracy of codes evolved from a more tolerant coding with the stepwise addition of coded amino acids to the genetic code. The distribution of codon groups in the standard genetic code corresponds well to the translation model assuming two fixed codon positions, whereas the six-codon groups can be relics from previous stages of evolution when the code was characterized by a greater uncertainty.
Affiliation(s)
- Konrad Pawlak
- Department of Bioinformatics and Genomics, Faculty of Biotechnology, University of Wrocław, ul. Joliot-Curie 14a, Wrocław, Poland
- Małgorzata Wnetrzak
- Department of Bioinformatics and Genomics, Faculty of Biotechnology, University of Wrocław, ul. Joliot-Curie 14a, Wrocław, Poland
- Dorota Mackiewicz
- Department of Bioinformatics and Genomics, Faculty of Biotechnology, University of Wrocław, ul. Joliot-Curie 14a, Wrocław, Poland
- Paweł Mackiewicz
- Department of Bioinformatics and Genomics, Faculty of Biotechnology, University of Wrocław, ul. Joliot-Curie 14a, Wrocław, Poland
- Paweł Błażej
- Department of Bioinformatics and Genomics, Faculty of Biotechnology, University of Wrocław, ul. Joliot-Curie 14a, Wrocław, Poland.
14
Ying J, Ding R, Liu Y, Zhao Y. Prebiotic Chemistry in Aqueous Environment: A Review of Peptide Synthesis and Its Relationship with Genetic Code. CHINESE J CHEM 2021. [DOI: 10.1002/cjoc.202100120] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/08/2022]
Affiliation(s)
- Jianxi Ying
- Institute of Drug Discovery Technology Ningbo University, No.818 Fenghua Road, Ningbo Zhejiang 315211 China
- Qian Xuesen Collaborative Research Center of Astrochemistry and Space Life Sciences Ningbo University No.818 Fenghua Road, Ningbo Zhejiang 315211 China
- Ruiwen Ding
- Institute of Drug Discovery Technology Ningbo University, No.818 Fenghua Road, Ningbo Zhejiang 315211 China
- Qian Xuesen Collaborative Research Center of Astrochemistry and Space Life Sciences Ningbo University No.818 Fenghua Road, Ningbo Zhejiang 315211 China
- Yan Liu
- College of Chemistry and Chemical Engineering Xiamen University, No. 422, Siming South Road Xiamen Fujian 361005 China
- Yufen Zhao
- Institute of Drug Discovery Technology Ningbo University, No.818 Fenghua Road, Ningbo Zhejiang 315211 China
- Qian Xuesen Collaborative Research Center of Astrochemistry and Space Life Sciences Ningbo University No.818 Fenghua Road, Ningbo Zhejiang 315211 China
- College of Chemistry and Chemical Engineering Xiamen University, No. 422, Siming South Road Xiamen Fujian 361005 China
15
Fimmel E, Gumbel M, Starman M, Strüngmann L. Robustness against point mutations of genetic code extensions under consideration of wobble-like effects. Biosystems 2021; 208:104485. [PMID: 34280517 DOI: 10.1016/j.biosystems.2021.104485] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 06/16/2021] [Revised: 07/07/2021] [Accepted: 07/09/2021] [Indexed: 11/25/2022]
Abstract
Many theories of the evolution of the genetic code assume that the genetic code has always evolved in the direction of increasing the supply of amino acids to be encoded (Barbieri, 2019; Di Giulio, 2005; Wong, 1975). In order to reduce the risk of the formation of a non-functional protein due to point mutations, nature is said to have built in control mechanisms. Using graph theory the authors have investigated in Blazej et al. (2019) if this robustness is optimal in the sense that a different codon-amino acid assignment would not generate a code that is even more robust. At present, efforts to expand the genetic code are very relevant in biotechnological applications, for example, for the synthesis of new drugs (Anderson et al., 2004; Chin, 2017; Dien et al., 2018; Kimoto et al., 2009; Neumann et al., 2010). In this paper we generalize the approach proposed in Blazej et al. (2019) and will explore hypothetical extensions of the standard genetic code with respect to their optimal robustness in two ways: (1) We keep the usual genetic alphabet but move from codons to longer words, such as tetranucleotides. This increases the supply of coding words and thus makes it possible to encode non-canonical amino acids. (2) We expand the genetic alphabet by introducing non-canonical base pairs. In addition, the approach from Blazej et al. (2019) and Blazej et al. (2018) is extended by incorporating the weights of single point-mutations into the model. The weights can be interpreted as probabilities (appropriately normalized) or degrees of severity of a single point mutation. In particular, this new approach allows us to take a closer look at the wobble effects in the translation of codons into amino acids. According to the results from Blazej et al. (2019) and Blazej et al. (2018), the standard genetic code is not optimal in terms of its robustness to point mutations if the weights of single point mutations are not taken into account. 
After incorporating into the model weights that mimic the wobble effect, the results of the present work show that the standard genetic code is much more robust, almost optimal in that respect. We hope that this theoretical analysis might help to assess extended genetic codes and their abilities to encode new amino acids.
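As a rough illustration of why position-dependent weights change the picture, the sketch below counts, for each codon position, how many single-base substitutions in the standard code leave the encoded label unchanged, and then forms a weighted share of such silent substitutions. This is a deliberately simple stand-in and not the graph-theoretic measure the abstract refers to; the function names and both weight vectors are hypothetical, chosen only to mimic a wobble-like emphasis on the third position.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), codons in TCAG order; '*' marks stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def synonymous_by_position(code):
    """Return (synonymous, total) single-base substitution counts per codon position."""
    syn, tot = [0, 0, 0], [0, 0, 0]
    for codon, label in code.items():
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue  # not a substitution
                mutant = codon[:pos] + base + codon[pos + 1:]
                tot[pos] += 1
                syn[pos] += int(code[mutant] == label)
    return syn, tot

def weighted_silent_share(code, weights):
    """Weighted fraction of substitutions that keep the label (weights are per position)."""
    syn, tot = synonymous_by_position(code)
    kept = sum(w * s for w, s in zip(weights, syn))
    return kept / sum(w * t for w, t in zip(weights, tot))

syn, tot = synonymous_by_position(CODE)
uniform = weighted_silent_share(CODE, (1.0, 1.0, 1.0))
# Hypothetical weights: third-position changes assumed three times as likely.
wobble_like = weighted_silent_share(CODE, (0.2, 0.2, 0.6))
print(syn, tot, round(uniform, 3), round(wobble_like, 3))
```

Because almost all silent substitutions sit at the third position, any weighting that favours that position raises the score, which is the qualitative effect the abstract attributes to wobble-like weights.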
Affiliation(s)
- E Fimmel
- Competence Center in Medicine, Biology, and Biotechnology, Mannheim University of Applied Sciences, 68163 Mannheim, Germany.
- M Gumbel
- Competence Center in Medicine, Biology, and Biotechnology, Mannheim University of Applied Sciences, 68163 Mannheim, Germany.
- M Starman
- Competence Center in Medicine, Biology, and Biotechnology, Mannheim University of Applied Sciences, 68163 Mannheim, Germany.
- L Strüngmann
- Competence Center in Medicine, Biology, and Biotechnology, Mannheim University of Applied Sciences, 68163 Mannheim, Germany.
16
Ehrlich R, Davyt M, López I, Chalar C, Marín M. On the Track of the Missing tRNA Genes: A Source of Non-Canonical Functions? Front Mol Biosci 2021; 8:643701. [PMID: 33796548 PMCID: PMC8007984 DOI: 10.3389/fmolb.2021.643701] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 12/18/2020] [Accepted: 02/02/2021] [Indexed: 01/31/2023]
Abstract
Cellular tRNAs appear today as a diverse population of informative macromolecules with conserved general elements ensuring essential common functions and different and distinctive features securing specific interactions and activities. Their differential expression and the variety of post-transcriptional modifications they are subject to lead to the existence of complex repertoires of tRNA populations adjusted to defined cellular states. Despite the redundancy of tRNA-coding genes in prokaryote and eukaryote genomes, it is surprising to note the absence of genes coding for specific translationally active isoacceptors throughout the phylogeny. Through the analysis of different releases of tRNA databases, this review aims to provide a general summary of those “missing tRNA genes.” This absence refers both to tRNAs that are not encoded in the genome and to others that show critical sequence variations that would prevent their activity as canonical translation adaptor molecules. Notably, while a group of genes are universally missing, others are absent in particular kingdoms. The functional information available allows us to hypothesize that the exclusion of isodecoding molecules would be linked to: 1) reducing ambiguities of signals that define the specificity of the interactions in which the tRNAs are involved; 2) ensuring the adaptation of the translational apparatus to the cellular state; 3) diverting particular tRNA variants from ribosomal protein synthesis to other cellular functions. This leads us to consider the “missing tRNA genes” as a source of putative non-canonical tRNA functions and to broaden the concept of adapter molecules in ribosomal-dependent protein synthesis.
Affiliation(s)
- Ricardo Ehrlich
- Biochemistry-Molecular Biology, Faculty of Science, Universidad de la República, Montevideo, Uruguay
- Institut Pasteur de Montevideo, Montevideo, Uruguay
- Marcos Davyt
- Biochemistry-Molecular Biology, Faculty of Science, Universidad de la República, Montevideo, Uruguay
- Ignacio López
- Biochemistry-Molecular Biology, Faculty of Science, Universidad de la República, Montevideo, Uruguay
- Cora Chalar
- Biochemistry-Molecular Biology, Faculty of Science, Universidad de la República, Montevideo, Uruguay
| | - Mónica Marín
- Biochemistry-Molecular Biology, Faculty of Science, Universidad de la República, Montevideo, Uruguay
| |
Collapse
|
17
|
Abstract
We find that the degeneracies and many peculiarities of the DNA genetic code may be described thanks to two closely related (fivefold symmetric) finite groups. The first group has signature G=Z5⋊H where H=Z2.S4≅2O is isomorphic to the binary octahedral group 2O and S4 is the symmetric group on four letters/bases. The second group has signature G=Z5⋊GL(2,3) and points out a threefold symmetry of base pairings. For those groups, the representations for the 22 conjugacy classes of G are in one-to-one correspondence with the multiplets encoding the proteinogenic amino acids. Additionally, most of the 22 characters of G attached to those representations are informationally complete. The biological meaning of these coincidences is discussed.
|
18
|
Kaiser F, Krautwurst S, Salentin S, Haupt VJ, Leberecht C, Bittrich S, Labudde D, Schroeder M. The structural basis of the genetic code: amino acid recognition by aminoacyl-tRNA synthetases. Sci Rep 2020; 10:12647. [PMID: 32724042 PMCID: PMC7387524 DOI: 10.1038/s41598-020-69100-0] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 04/17/2020] [Accepted: 07/06/2020] [Indexed: 12/29/2022]
Abstract
Storage and directed transfer of information is the key requirement for the development of life. Yet any information stored on our genes is useless without its correct interpretation. The genetic code defines the rule set to decode this information. Aminoacyl-tRNA synthetases are at the heart of this process. We extensively characterize how these enzymes distinguish all natural amino acids based on the computational analysis of crystallographic structure data. The results of this meta-analysis show that the correct read-out of genetic information is a delicate interplay between the composition of the binding site, non-covalent interactions, error correction mechanisms, and steric effects.
Affiliation(s)
- Florian Kaiser
- Biotechnology Center (BIOTEC), TU Dresden, 01307, Dresden, Germany; PharmAI GmbH, Tatzberg 47, 01307, Dresden, Germany
- Sarah Krautwurst
- University of Applied Sciences Mittweida, 09648, Mittweida, Germany
- V Joachim Haupt
- Biotechnology Center (BIOTEC), TU Dresden, 01307, Dresden, Germany; PharmAI GmbH, Tatzberg 47, 01307, Dresden, Germany
- Dirk Labudde
- University of Applied Sciences Mittweida, 09648, Mittweida, Germany
|
19
|
A search for the physical basis of the genetic code. Biosystems 2020; 195:104148. [DOI: 10.1016/j.biosystems.2020.104148] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 02/28/2020] [Revised: 04/09/2020] [Accepted: 04/09/2020] [Indexed: 01/01/2023]
|
20
|
Grabow WW, Andrews GE. On the nature and origin of biological information: The curious case of RNA. Biosystems 2019; 185:104031. [PMID: 31525398 DOI: 10.1016/j.biosystems.2019.104031] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 08/10/2019] [Revised: 09/11/2019] [Accepted: 09/12/2019] [Indexed: 11/18/2022]
Abstract
Biological information is most commonly thought of in terms of biology's Central Dogma where DNA is viewed as a linearized code used to synthesize proteins. Using DNA's chemical cousin, RNA, as a case study we consider how biological information operates outside the linear arrangement of its polymeric subunits. Much like individual pieces of a jigsaw puzzle, particular structures enable biomolecules to undergo precise molecular interactions with one another based on their respective shapes. By exploring the relationship between sequence and structure in RNA we argue that biological information finds its ultimate functional fulfillment in the three-dimensional structural arrangement of its atoms. We show how recurrent structural RNA motifs, operating at the tertiary level of a molecule, provide robust building blocks for the formation of new structural configurations and thereby convey the information required for emergent biological functions. We posit that these same RNA structures, guided by their respective thermodynamic stabilities, experience selective pressure to maintain particular three-dimensional architectures over and above pressures to maintain a particular sequence of nucleotides. Ultimately, this framework for understanding the nature of biological information provides a useful paradigm for understanding its origins and how biological information can result from chaotic prebiotic conditions.
Affiliation(s)
- Wade W Grabow
- Department of Chemistry and Biochemistry, Seattle Pacific University, Seattle, WA, 98119-1997, USA
- Grace E Andrews
- Department of Chemistry and Biochemistry, Seattle Pacific University, Seattle, WA, 98119-1997, USA
|
21
|
Barbhuiya RI, Uddin A, Chakraborty S. Compositional properties and codon usage pattern of mitochondrial ATP gene in different classes of Arthropoda. Genetica 2019; 147:231-248. [PMID: 31152294 DOI: 10.1007/s10709-019-00067-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 11/17/2018] [Accepted: 05/22/2019] [Indexed: 12/17/2022]
Abstract
Codon usage bias (CUB) is defined as the unequal usage of synonymous codons for an amino acid in a gene transcript. It is influenced by both mutation pressure and natural selection and is a species-specific property. In our current study, we used bioinformatic methods to investigate the coding sequences of the mitochondrial adenosine triphosphate gene (MT-ATP) in different classes of Arthropoda to characterize the codon usage pattern of the gene, as no such work had been described earlier. The analysis of compositional properties suggested that the gene is AT rich. The effective number of codons revealed that the CUB of both the ATP6 and ATP8 genes was moderate. A heat map showed that the codons ending in A or T were negatively associated with GC3, while the codons ending in G or C were positively associated with GC3, in all the classes of Arthropoda. A correspondence study revealed that the pattern of codon usage of the ATP6 and ATP8 genes differed across classes. A neutrality plot suggested that the codon usage bias of these two genes in the phylum Arthropoda was influenced by both mutation pressure and natural selection.
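As a rough illustration of two compositional measures named in this abstract, the sketch below computes GC3 (the fraction of G or C at third codon positions) and overall AT content for a coding sequence. The function names and the toy sequence are hypothetical and are not taken from the paper; the authors' actual pipeline is not described here.

```python
def gc3_content(cds: str) -> float:
    """Fraction of G or C among third codon positions of a coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3      # ignore an incomplete trailing codon
    third_bases = cds[2:usable:3]         # every third base, starting at index 2
    if not third_bases:
        raise ValueError("sequence is shorter than one codon")
    return sum(base in "GC" for base in third_bases) / len(third_bases)


def at_content(cds: str) -> float:
    """Fraction of A or T over the whole sequence."""
    cds = cds.upper()
    if not cds:
        raise ValueError("empty sequence")
    return sum(base in "AT" for base in cds) / len(cds)


# Toy sequence (hypothetical): codons ATG, GCC, TTA
toy = "ATGGCCTTA"
print(gc3_content(toy), at_content(toy))
```

A codon ending in G or C raises GC3 by construction, which is why such codons are expected to correlate positively with GC3 and A- or T-ending codons negatively.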
Affiliation(s)
- Arif Uddin
- Department of Zoology, Moinul Hoque Choudhury Science College, Algapur, Hailakandi, Assam, 788150, India
- Supriyo Chakraborty
- Department of Biotechnology, Assam University, Silchar, Assam, 788011, India
|
22
|
The Quality of Genetic Code Models in Terms of Their Robustness Against Point Mutations. Bull Math Biol 2019; 81:2239-2257. [DOI: 10.1007/s11538-019-00603-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 11/16/2018] [Accepted: 03/25/2019] [Indexed: 11/29/2022]
|
23
|
Błażej P, Wnętrzak M, Mackiewicz D, Mackiewicz P. The influence of different types of translational inaccuracies on the genetic code structure. BMC Bioinformatics 2019; 20:114. [PMID: 30841864 PMCID: PMC6404327 DOI: 10.1186/s12859-019-2661-4] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 10/02/2018] [Accepted: 01/29/2019] [Indexed: 12/23/2022]
Abstract
BACKGROUND The standard genetic code is a recipe for assigning unambiguously 21 labels, i.e. amino acids and stop translation signal, to 64 codons. However, at early stages of the translational machinery development, the codons did not have to be read unambiguously and the early genetic codes could have contained some ambiguous assignments of codons to amino acids. Therefore, the goal of this work was to obtain the genetic code structures which could have evolved assuming different types of inaccuracy of the translational machinery starting from unambiguous assignments of codons to amino acids. RESULTS We developed a theoretical model assuming that the level of uncertainty of codon assignments can gradually decrease during the simulations. Since it is postulated that the standard code has evolved to be robust against point mutations and mistranslations, we developed three simulation scenarios assuming that such errors can influence one, two or three codon positions. The simulated codes were selected using the evolutionary algorithm methodology to decrease coding ambiguity and increase their robustness against mistranslation. CONCLUSIONS The results indicate that the typical codon block structure of the genetic code could have evolved to decrease the ambiguity of amino acid to codon assignments and to increase the fidelity of reading the genetic information. However, the robustness to errors was not the decisive factor that influenced the genetic code evolution because it is possible to find theoretical codes that minimize the reading errors better than the standard genetic code.
Affiliation(s)
- Paweł Błażej
- Department of Genomics, University of Wrocław, ul. Joliot-Curie 14a, Wrocław, 50-383 Poland
- Małgorzata Wnętrzak
- Department of Genomics, University of Wrocław, ul. Joliot-Curie 14a, Wrocław, 50-383 Poland
- Dorota Mackiewicz
- Department of Genomics, University of Wrocław, ul. Joliot-Curie 14a, Wrocław, 50-383 Poland
- Paweł Mackiewicz
- Department of Genomics, University of Wrocław, ul. Joliot-Curie 14a, Wrocław, 50-383 Poland
|
24
|
Wnętrzak M, Błażej P, Mackiewicz D, Mackiewicz P. The optimality of the standard genetic code assessed by an eight-objective evolutionary algorithm. BMC Evol Biol 2018; 18:192. [PMID: 30545289 PMCID: PMC6293558 DOI: 10.1186/s12862-018-1304-0] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Received: 12/15/2017] [Accepted: 11/22/2018] [Indexed: 02/07/2023]
Abstract
BACKGROUND The standard genetic code (SGC) is a unique set of rules which assign amino acids to codons. Similar amino acids tend to have similar codons indicating that the code evolved to minimize the costs of amino acid replacements in proteins, caused by mutations or translational errors. However, if such optimization in fact occurred, many different properties of amino acids must have been taken into account during the code evolution. Therefore, this problem can be reformulated as a multi-objective optimization task, in which the selection constraints are represented by measures based on various amino acid properties. RESULTS To study the optimality of the SGC we applied a multi-objective evolutionary algorithm and we used the representatives of eight clusters, which grouped over 500 indices describing various physicochemical properties of amino acids. Thanks to that we avoided an arbitrary choice of amino acid features as optimization criteria. As a consequence, we were able to conduct a more general study on the properties of the SGC than the ones presented so far in other papers on this topic. We considered two models of the genetic code, one preserving the characteristic codon blocks structure of the SGC and the other without this restriction. The results revealed that the SGC could be significantly improved in terms of error minimization, hereby it is not fully optimized. Its structure differs significantly from the structure of the codes optimized to minimize the costs of amino acid replacements. On the other hand, using newly defined quality measures that placed the SGC in the global space of theoretical genetic codes, we showed that the SGC is definitely closer to the codes that minimize the costs of amino acids replacements than those maximizing them. CONCLUSIONS The standard genetic code represents most likely only partially optimized systems, which emerged under the influence of many different factors. 
Our findings can be useful to researchers involved in modifying the genetic code of the living organisms and designing artificial ones.
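The multi-objective framing in this abstract rests on Pareto dominance: one candidate code is preferred over another only if it is no worse on every objective and strictly better on at least one. The following is a minimal sketch of that comparison, assuming each code has already been scored as a tuple of costs to be minimized. The cost vectors are made-up numbers, and this is not the authors' eight-objective evolutionary algorithm.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if cost vector a Pareto-dominates b (all objectives minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical two-objective costs for four candidate codes
candidates = [(1.0, 5.0), (2.0, 2.0), (3.0, 3.0), (5.0, 1.0)]
print(pareto_front(candidates))  # (3.0, 3.0) is dominated by (2.0, 2.0)
```

Under this view a code can sit off the front (improvable on some objective without loss on the others) while still being far closer to the cost-minimizing codes than to the cost-maximizing ones, which is the kind of placement the abstract reports for the SGC.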
Affiliation(s)
- Małgorzata Wnętrzak
- Department of Genomics, Faculty of Biotechnology, University of Wrocław, ul. Joliot-Curie 14a, 50-383, Wrocław, Poland
- Paweł Błażej
- Department of Genomics, Faculty of Biotechnology, University of Wrocław, ul. Joliot-Curie 14a, 50-383, Wrocław, Poland
- Dorota Mackiewicz
- Department of Genomics, Faculty of Biotechnology, University of Wrocław, ul. Joliot-Curie 14a, 50-383, Wrocław, Poland
- Paweł Mackiewicz
- Department of Genomics, Faculty of Biotechnology, University of Wrocław, ul. Joliot-Curie 14a, 50-383, Wrocław, Poland
|
25
|
Khan MF, Patra S. Deciphering the rationale behind specific codon usage pattern in extremophiles. Sci Rep 2018; 8:15548. [PMID: 30341344 PMCID: PMC6195531 DOI: 10.1038/s41598-018-33476-x] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 04/09/2018] [Accepted: 09/21/2018] [Indexed: 12/03/2022]
Abstract
Protein stability is affected at different hierarchies – gene, RNA, amino acid sequence and structure. Gene is the first level which contributes via varying codon compositions. Codon selectivity of an organism differs with normal and extremophilic milieu. The present work attempts at detailing the codon usage pattern of six extremophilic classes and their harmony. Homologous gene datasets of thermophile-mesophile, psychrophile-mesophile, thermophile-psychrophile, acidophile-alkaliphile, halophile-nonhalophile and barophile-nonbarophile were analysed for filtering statistically significant attributes. Relative abundance analysis, 1–9 scale ranking, nucleotide compositions, attribute weighting and machine learning algorithms were employed to arrive at findings. AGG in thermophiles and barophiles, CAA in mesophiles and psychrophiles, TGG in acidophiles, GAG in alkaliphiles and GAC in halophiles had highest preference. Preference of GC-rich and G/C-ending codons were observed in halophiles and barophiles whereas, a decreasing trend was reflected in psychrophiles and alkaliphiles. GC-rich codons were found to decrease and G/C-ending codons increased in thermophiles whereas, acidophiles showed equal contents of GC-rich and G/C-ending codons. Codon usage patterns exhibited harmony among different extremophiles and has been detailed. However, the codon attribute preferences and their selectivity of extremophiles varied in comparison to non-extremophiles. The finding can be instrumental in codon optimization application for heterologous expression of extremophilic proteins.
Affiliation(s)
- Mohd Faheem Khan
- Department of Biosciences and Bioengineering, Indian Institute of Technology Guwahati, Guwahati, 781039, Assam, India
- Sanjukta Patra
- Department of Biosciences and Bioengineering, Indian Institute of Technology Guwahati, Guwahati, 781039, Assam, India
|
26
|
Zagrovic B, Bartonek L, Polyansky AA. RNA-protein interactions in an unstructured context. FEBS Lett 2018; 592:2901-2916. [PMID: 29851074 PMCID: PMC6175095 DOI: 10.1002/1873-3468.13116] [Citation(s) in RCA: 51] [Impact Index Per Article: 7.3] [Received: 04/23/2018] [Revised: 05/12/2018] [Accepted: 05/13/2018] [Indexed: 02/02/2023]
Abstract
Despite their importance, our understanding of noncovalent RNA-protein interactions is incomplete. This especially concerns the binding between RNA and unstructured protein regions, a widespread class of such interactions. Here, we review the recent experimental and computational work on RNA-protein interactions in an unstructured context with a particular focus on how such interactions may be shaped by the intrinsic interaction affinities between individual nucleobases and protein side chains. Specifically, we articulate the claim that the universal genetic code reflects the binding specificity between nucleobases and protein side chains and that, in turn, the code may be seen as the Rosetta stone for understanding RNA-protein interactions in general.
Affiliation(s)
- Bojan Zagrovic
- Department of Structural and Computational Biology, Max F. Perutz Laboratories, University of Vienna, Austria
- Lukas Bartonek
- Department of Structural and Computational Biology, Max F. Perutz Laboratories, University of Vienna, Austria
- Anton A. Polyansky
- Department of Structural and Computational Biology, Max F. Perutz Laboratories, University of Vienna, Austria; MM Shemyakin and Yu A Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, Moscow, Russia
|
27
|
Kaiser F, Bittrich S, Salentin S, Leberecht C, Haupt VJ, Krautwurst S, Schroeder M, Labudde D. Backbone Brackets and Arginine Tweezers delineate Class I and Class II aminoacyl tRNA synthetases. PLoS Comput Biol 2018; 14:e1006101. [PMID: 29659563 PMCID: PMC5919687 DOI: 10.1371/journal.pcbi.1006101] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 10/16/2017] [Revised: 04/26/2018] [Accepted: 03/20/2018] [Indexed: 12/22/2022]
Abstract
The origin of the machinery that realizes protein biosynthesis in all organisms is still unclear. One key component of this machinery are aminoacyl tRNA synthetases (aaRS), which ligate tRNAs to amino acids while consuming ATP. Sequence analyses revealed that these enzymes can be divided into two complementary classes. Both classes differ significantly on a sequence and structural level, feature different reaction mechanisms, and occur in diverse oligomerization states. The one unifying aspect of both classes is their function of binding ATP. We identified Backbone Brackets and Arginine Tweezers as most compact ATP binding motifs characteristic for each Class. Geometric analysis shows a structural rearrangement of the Backbone Brackets upon ATP binding, indicating a general mechanism of all Class I structures. Regarding the origin of aaRS, the Rodin-Ohno hypothesis states that the peculiar nature of the two aaRS classes is the result of their primordial forms, called Protozymes, being encoded on opposite strands of the same gene. Backbone Brackets and Arginine Tweezers were traced back to the proposed Protozymes and their more efficient successors, the Urzymes. Both structural motifs can be observed as pairs of residues in contemporary structures and it seems that the time of their addition, indicated by their placement in the ancient aaRS, coincides with the evolutionary trace of Proto- and Urzymes. Aminoacyl tRNA synthetases (aaRS) are primordial enzymes essential for interpretation and transfer of genetic information. Understanding the origin of the peculiarities observed with aaRS can explain what constituted the earliest life forms and how the genetic code was established. The increasing amount of experimentally determined three-dimensional structures of aaRS opens up new avenues for high-throughput analyses of molecular mechanisms. In this study, we present an exhaustive structural analysis of ATP binding motifs. 
We unveil an oppositional implementation of enzyme substrate binding in each aaRS Class. While Class I binds via interactions mediated by backbone hydrogen bonds, Class II uses a pair of arginine residues to establish salt bridges to its ATP ligand. We show how nature realized the binding of the same ligand species with completely different mechanisms. In addition, we demonstrate that sequence or even structure analysis for conserved residues may miss important functional aspects which can only be revealed by ligand interaction studies. Additionally, the placement of those key residues in the structure supports a popular hypothesis, which states that prototypic aaRS were once coded on complementary strands of the same gene.
Affiliation(s)
- Florian Kaiser
- University of Applied Sciences Mittweida, Mittweida, Germany
- Biotechnology Center (BIOTEC), TU Dresden, Dresden, Germany
- Sebastian Bittrich
- University of Applied Sciences Mittweida, Mittweida, Germany
- Biotechnology Center (BIOTEC), TU Dresden, Dresden, Germany
- Christoph Leberecht
- University of Applied Sciences Mittweida, Mittweida, Germany
- Biotechnology Center (BIOTEC), TU Dresden, Dresden, Germany
- Dirk Labudde
- University of Applied Sciences Mittweida, Mittweida, Germany
|
28
|
Di Giulio M. On Earth, there would be a number of fundamental kinds of primary cells – cellular domains – greater than or equal to four. J Theor Biol 2018; 443:10-17. [DOI: 10.1016/j.jtbi.2018.01.025] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 11/09/2017] [Revised: 01/10/2018] [Accepted: 01/19/2018] [Indexed: 11/15/2022]
|
29
|
de Oliveira LL, Freitas AA, Tinós R. Multi-objective genetic algorithms in the study of the genetic code’s adaptability. Inf Sci (N Y) 2018. [DOI: 10.1016/j.ins.2017.10.022] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Indexed: 02/04/2023]
|
30
|
Affiliation(s)
- Eugene V. Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA
- Artem S. Novozhilov
- Department of Mathematics, North Dakota State University, Fargo, North Dakota 58108, USA
|
31
|
Frozen Accident Pushing 50: Stereochemistry, Expansion, and Chance in the Evolution of the Genetic Code. Life (Basel) 2017; 7:life7020022. [PMID: 28545255 PMCID: PMC5492144 DOI: 10.3390/life7020022] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.1] [Received: 03/10/2017] [Revised: 05/19/2017] [Accepted: 05/20/2017] [Indexed: 12/31/2022]
Abstract
Nearly 50 years ago, Francis Crick propounded the frozen accident scenario for the evolution of the genetic code along with the hypothesis that the early translation system consisted primarily of RNA. Under the frozen accident perspective, the code is universal among modern life forms because any change in codon assignment would be highly deleterious. The frozen accident can be considered the default theory of code evolution because it does not imply any specific interactions between amino acids and the cognate codons or anticodons, or any particular properties of the code. The subsequent 49 years of code studies have elucidated notable features of the standard code, such as high robustness to errors, but failed to develop a compelling explanation for codon assignments. In particular, stereochemical affinity between amino acids and the cognate codons or anticodons does not seem to account for the origin and evolution of the code. Here, I expand Crick’s hypothesis on RNA-only translation system by presenting evidence that this early translation already attained high fidelity that allowed protein evolution. I outline an experimentally testable scenario for the evolution of the code that combines a distinct version of the stereochemical hypothesis, in which amino acids are recognized via unique sites in the tertiary structure of proto-tRNAs, rather than by anticodons, expansion of the code via proto-tRNA duplication, and the frozen accident.
|
32
|
Santos J, Monteagudo Á. Inclusion of the fitness sharing technique in an evolutionary algorithm to analyze the fitness landscape of the genetic code adaptability. BMC Bioinformatics 2017; 18:195. [PMID: 28347270 PMCID: PMC5369190 DOI: 10.1186/s12859-017-1608-x] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 11/26/2016] [Accepted: 03/16/2017] [Indexed: 11/26/2022]
Abstract
Background The canonical code, although prevailing in complex genomes, is not universal. The canonical genetic code has been shown to be more robust than random codes, but it is not clearly determined how it evolved towards its current form. The error minimization theory considers the minimization of the adverse effects of point mutations as the main selection factor in the evolution of the code. We have used simulated evolution in a computer to search for optimized codes, which helps to obtain information about the optimization level of the canonical code in its evolution. A genetic algorithm searches for efficient codes in a fitness landscape that corresponds with the adaptability of possible hypothetical genetic codes. The lower the effects of errors or mutations in the codon bases of a hypothetical code, the more efficient or optimal is that code. The inclusion of the fitness sharing technique in the evolutionary algorithm allows the extent to which the canonical genetic code is in an area corresponding to a deep local minimum to be easily determined, even in the high dimensional spaces considered. Results The analyses show that the canonical code is not in a deep local minimum and that the fitness landscape is not a multimodal fitness landscape with deep and separated peaks. Moreover, the canonical code is clearly far away from the areas of higher fitness in the landscape. Conclusions Given the absence of deep local minima in the landscape, although the code could evolve and different forces could shape its structure, the nature of the fitness landscape considered in the error minimization theory does not explain why the canonical code ended its evolution in a location which is not an area of a localized deep minimum of the huge fitness landscape.
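The notion that a code is "more efficient" when single-base errors change amino acid properties less can be made concrete with a simple cost function. The sketch below is an assumption-laden illustration, not the authors' genetic algorithm or their fitness sharing scheme: it uses the Kyte-Doolittle hydropathy index as the amino acid property (the paper's own property and error weights are not given in this abstract), an unweighted mean squared difference over all single-base substitutions, and random codes obtained by permuting amino acids among the synonymous blocks of the standard table. All names are hypothetical.

```python
import random

BASES = "TCAG"
# Standard genetic code in TCAG order (NCBI translation table 1); '*' marks stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Kyte-Doolittle hydropathy index (property chosen here for illustration only).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def error_cost(code, prop):
    """Mean squared property change over all single-base substitutions
    between sense codons (substitutions to or from stop are skipped)."""
    total, count = 0.0, 0
    for codon, aa in code.items():
        if aa == "*":
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                other = code[codon[:pos] + base + codon[pos + 1:]]
                if other == "*":
                    continue
                total += (prop[aa] - prop[other]) ** 2
                count += 1
    return total / count


def shuffled_code(code, rng):
    """Random code that keeps the block structure: the 20 amino acids are
    permuted among the synonymous blocks; stop codons stay fixed."""
    aas = sorted(set(code.values()) - {"*"})
    permuted = aas[:]
    rng.shuffle(permuted)
    mapping = dict(zip(aas, permuted))
    mapping["*"] = "*"
    return {codon: mapping[aa] for codon, aa in code.items()}


if __name__ == "__main__":
    rng = random.Random(0)
    standard = error_cost(STANDARD_CODE, HYDROPATHY)
    better = sum(
        error_cost(shuffled_code(STANDARD_CODE, rng), HYDROPATHY) < standard
        for _ in range(1000)
    )
    print(f"standard code cost: {standard:.3f}")
    print(f"random codes with lower cost: {better} of 1000")
```

Each code is one point in the fitness landscape the abstract describes; an evolutionary search would move through that space by mutating assignments and keeping lower-cost codes, whereas this sketch only samples it at random.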
Affiliation(s)
- José Santos
- Department of Computer Science, University of A Coruña, Campus de Elviña s/n, A Coruña, 15071, Spain
- Ángel Monteagudo
- Department of Computer Science, University of A Coruña, Campus de Elviña s/n, A Coruña, 15071, Spain
|
33
|
The role of crossover operator in evolutionary-based approach to the problem of genetic code optimization. Biosystems 2016; 150:61-72. [DOI: 10.1016/j.biosystems.2016.08.008] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.7] [Received: 03/16/2016] [Revised: 05/20/2016] [Accepted: 08/11/2016] [Indexed: 11/17/2022]
|
34
|
Aggarwal N, Bandhu AV, Sengupta S. Finite population analysis of the effect of horizontal gene transfer on the origin of an universal and optimal genetic code. Phys Biol 2016; 13:036007. [DOI: 10.1088/1478-3975/13/3/036007] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Indexed: 10/21/2022]
|
35
|
Simões J, Bezerra AR, Moura GR, Araújo H, Gut I, Bayes M, Santos MAS. The Fungus Candida albicans Tolerates Ambiguity at Multiple Codons. Front Microbiol 2016; 7:401. [PMID: 27065968 PMCID: PMC4814463 DOI: 10.3389/fmicb.2016.00401] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 11/30/2015] [Accepted: 03/14/2016] [Indexed: 12/31/2022]
Abstract
The ascomycete Candida albicans is a normal resident of the gastrointestinal tract of humans and other warm-blooded animals. It occurs in a broad range of body sites and has high capacity to survive and proliferate in adverse environments with drastic changes in oxygen, carbon dioxide, pH, osmolarity, nutrients, and temperature. Its biology is unique due to flexible reassignment of the leucine CUG codon to serine and synthesis of statistical proteins. Under standard growth conditions, CUG sites incorporate leucine (3% of the time) and serine (97% of the time) on a proteome-wide scale, but leucine incorporation fluctuates in response to environmental stressors and can be artificially increased up to 98%. In order to determine whether such flexibility also exists at other codons, we have constructed several serine tRNAs that decode various non-cognate codons. Expression of these tRNAs had minor effects on fitness, but growth of the mistranslating strains at different temperatures, and in media with different pH and nutrient composition, was often enhanced relative to the wild type (WT) strain, supporting our previous data on adaptive roles of CUG ambiguity in variable growth conditions. Parallel evolution of the recombinant strains (100 generations) followed by full genome resequencing identified various strain-specific single nucleotide polymorphisms (SNP) and one SNP in the deneddylase (JAB1) gene in all strains. Since JAB1 is a subunit of the COP9 signalosome complex, which interacts with cullin (Cdc53p) to mediate degradation of a variety of cellular proteins, our data suggest that neddylation plays a key role in tolerance and adaptation to codon ambiguity in C. albicans.
Affiliation(s)
- João Simões
- Health Sciences Program, Department of Medical Sciences, Institute of Biomedicine - iBiMED, University of Aveiro, Aveiro, Portugal
- Ana R Bezerra
- Health Sciences Program, Department of Medical Sciences, Institute of Biomedicine - iBiMED, University of Aveiro, Aveiro, Portugal
- Gabriela R Moura
- Health Sciences Program, Department of Medical Sciences, Institute of Biomedicine - iBiMED, University of Aveiro, Aveiro, Portugal
- Hugo Araújo
- Health Sciences Program, Department of Medical Sciences, Institute of Biomedicine - iBiMED, University of Aveiro, Aveiro, Portugal
- Ivo Gut
- Centro Nacional de Análises Genómico, Parc Científic Barcelona, Spain
- Mónica Bayes
- Centro Nacional de Análises Genómico, Parc Científic Barcelona, Spain
- Manuel A S Santos
- Health Sciences Program, Department of Medical Sciences, Institute of Biomedicine - iBiMED, University of Aveiro, Aveiro, Portugal
|
36
|
Gardini S, Cheli S, Baroni S, Di Lascio G, Mangiavacchi G, Micheletti N, Monaco CL, Savini L, Alocci D, Mangani S, Niccolai N. On Nature's Strategy for Assigning Genetic Code Multiplicity. PLoS One 2016; 11:e0148174. [PMID: 26849571 PMCID: PMC4746209 DOI: 10.1371/journal.pone.0148174] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 10/07/2015] [Accepted: 01/13/2016] [Indexed: 11/26/2022]
Abstract
Genetic code redundancy would yield, on the average, the assignment of three codons for each of the natural amino acids. The fact that this number is observed only for incorporating Ile and to stop RNA translation still waits for an overall explanation. Through a Structural Bioinformatics approach, the wealth of information stored in the Protein Data Bank has been used here to look for unambiguous clues to decipher the rationale of standard genetic code (SGC) in assigning from one to six different codons for amino acid translation. Leu and Arg, both protected from translational errors by six codons, offer the clearest clue by appearing as the most abundant amino acids in protein-protein and protein-nucleic acid interfaces. Other SGC hidden messages have been sought by analyzing, in a protein structure framework, the roles of over- and under-protected amino acids.
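The multiplicities mentioned in this abstract can be checked directly from the standard code table. The sketch below counts how many codons carry each of the 21 labels (20 amino acids plus stop), using the standard table written in TCAG order; the variable and function names are hypothetical. In the standard table Ser also has six codons, alongside the Leu and Arg highlighted by the authors.

```python
from collections import Counter

# Standard genetic code in TCAG order (NCBI translation table 1); '*' marks stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Number of codons assigned to each label (one-letter amino acid code or '*').
codons_per_label = Counter(AMINO_ACIDS)


def labels_with_multiplicity(n: int) -> list:
    """Labels that are encoded by exactly n codons, in sorted order."""
    return sorted(label for label, k in codons_per_label.items() if k == n)


# 64 codons spread over 21 labels gives an average of about three per label.
average_multiplicity = len(AMINO_ACIDS) / len(codons_per_label)

print(f"average codons per label: {average_multiplicity:.2f}")
for n in sorted(set(codons_per_label.values())):
    print(n, labels_with_multiplicity(n))
```

Only Ile and the stop signal fall on the near-average value of three; every other label has one, two, four, or six codons, which is the uneven distribution the abstract sets out to explain.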
Affiliation(s)
- Simone Gardini
- Department of Biotechnology, Chemistry and Pharmacy, University of Siena, Siena, Italy
- Sara Cheli
- Department of Biotechnology, Chemistry and Pharmacy, University of Siena, Siena, Italy
- Silvia Baroni
- Department of Biotechnology, Chemistry and Pharmacy, University of Siena, Siena, Italy
- Gabriele Di Lascio
- Department of Biotechnology, Chemistry and Pharmacy, University of Siena, Siena, Italy
- Guido Mangiavacchi
- Department of Biotechnology, Chemistry and Pharmacy, University of Siena, Siena, Italy
- Nicholas Micheletti
- Department of Biotechnology, Chemistry and Pharmacy, University of Siena, Siena, Italy
- Carmen Luigia Monaco
- Department of Biotechnology, Chemistry and Pharmacy, University of Siena, Siena, Italy
- Lorenzo Savini
- Department of Biotechnology, Chemistry and Pharmacy, University of Siena, Siena, Italy
- Davide Alocci
- Department of Biotechnology, Chemistry and Pharmacy, University of Siena, Siena, Italy
- Stefano Mangani
- Department of Biotechnology, Chemistry and Pharmacy, University of Siena, Siena, Italy
- Neri Niccolai
- Department of Biotechnology, Chemistry and Pharmacy, University of Siena, Siena, Italy
|
37
|
|
38
|
de Oliveira LL, de Oliveira PSL, Tinós R. A multiobjective approach to the genetic code adaptability problem. BMC Bioinformatics 2015; 16:52. [PMID: 25879480 PMCID: PMC4341243 DOI: 10.1186/s12859-015-0480-9] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 03/27/2014] [Accepted: 01/27/2015] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The organization of the canonical code has intrigued researchers since it was first described. If we consider all codes mapping the 64 codons into 20 amino acids and one stop codon, there are more than 1.51×10^84 possible genetic codes. The main question related to the organization of the genetic code is why exactly the canonical code was selected among this huge number of possible genetic codes. Many researchers argue that the organization of the canonical code is a product of natural selection and that the code's robustness against mutations would support this hypothesis. In order to investigate the natural selection hypothesis, some researchers employ optimization algorithms to identify regions of the genetic code space where the best codes, according to a given evaluation function, can be found (engineering approach). The optimization process uses only one objective to evaluate the codes, generally based on the robustness for an amino acid property. Only one objective is also employed in the statistical approach for the comparison of the canonical code with random codes. We propose a multiobjective approach where two or more objectives are considered simultaneously to evaluate the genetic codes.
RESULTS In order to test our hypothesis that the multiobjective approach is useful for the analysis of the genetic code adaptability, we implemented a multiobjective optimization algorithm where two objectives are simultaneously optimized. Using as objectives the robustness against mutation with the amino acid property polar requirement (objective 1) and robustness with respect to hydropathy index or molecular volume (objective 2), we found solutions closer to the canonical genetic code in terms of robustness, when compared with the results using only one objective reported by other authors.
CONCLUSIONS Using more objectives, more optimal solutions are obtained and, as a consequence, more information can be used to investigate the adaptability of the genetic code. The multiobjective approach is also more natural, because more than one objective was adapted during the evolutionary process of the canonical genetic code. Our results suggest that the evaluation function employed to compare genetic codes should consider simultaneously more than one objective, in contrast to what has been done in the literature.
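The size of the code space quoted in the BACKGROUND can be reproduced with a short calculation. This is a sketch under one assumption that the abstract does not spell out: that a "possible genetic code" is a mapping of the 64 codons onto 21 meanings (20 amino acids plus stop) in which every meaning receives at least one codon. The function name `count_surjections` is illustrative.

```python
from math import comb


def count_surjections(n_items, n_labels):
    """Count maps from n_items onto n_labels that use every label at least once.

    Inclusion-exclusion: start from all maps, subtract those missing one label,
    add back those missing two, and so on.
    """
    return sum(
        (-1) ** k * comb(n_labels, k) * (n_labels - k) ** n_items
        for k in range(n_labels + 1)
    )


# 64 codons mapped onto 20 amino acids plus one stop signal.
n_codes = count_surjections(64, 21)
print(f"{n_codes:.3e}")
```

Under that assumption the count lands on the order of 10^84, in line with the figure of more than 1.51×10^84 given in the abstract, and it stays below the 21^64 unrestricted mappings because codes that leave a meaning unassigned are excluded.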
Affiliation(s)
- Renato Tinós
- Department of Computing and Mathematics, University of São Paulo, Ribeirão Preto, Brazil.
|
39
|
de Ruiter A, Zagrovic B. Absolute binding-free energies between standard RNA/DNA nucleobases and amino-acid sidechain analogs in different environments. Nucleic Acids Res 2014; 43:708-18. [PMID: 25550435 PMCID: PMC4333394 DOI: 10.1093/nar/gku1344] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.4] [Indexed: 01/09/2023] Open
Abstract
Despite the great importance of nucleic acid-protein interactions in the cell, our understanding of their physico-chemical basis remains incomplete. In order to address this challenge, we have for the first time determined potentials of mean force and the associated absolute binding free energies between all standard RNA/DNA nucleobases and amino-acid sidechain analogs in high- and low-dielectric environments using molecular dynamics simulations and umbrella sampling. A comparison against a limited set of available experimental values for analogous systems attests to the quality of the computational approach and the force field used. Overall, our analysis provides a microscopic picture behind nucleobase/sidechain interaction preferences and creates a unified framework for understanding and sculpting nucleic acid-protein interactions in different contexts. Here, we use this framework to demonstrate a strong relationship between nucleobase density profiles of mRNAs and nucleobase affinity profiles of their cognate proteins and critically analyze a recent hypothesis that the two may be capable of direct, complementary interactions.
Affiliation(s)
- Anita de Ruiter
- Department of Structural and Computational Biology, Max F. Perutz Laboratories, University of Vienna, Vienna 1030, Austria
- Bojan Zagrovic
- Department of Structural and Computational Biology, Max F. Perutz Laboratories, University of Vienna, Vienna 1030, Austria
|
40
|
Hajnic M, Osorio JI, Zagrovic B. Computational analysis of amino acids and their sidechain analogs in crowded solutions of RNA nucleobases with implications for the mRNA-protein complementarity hypothesis. Nucleic Acids Res 2014; 42:12984-94. [PMID: 25361976 PMCID: PMC4245939 DOI: 10.1093/nar/gku1035] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Received: 07/17/2014] [Revised: 09/29/2014] [Accepted: 10/11/2014] [Indexed: 12/31/2022] Open
Abstract
Many critical processes in the cell involve direct binding between RNAs and proteins, making it imperative to fully understand the physicochemical principles behind such interactions at the atomistic level. Here, we use molecular dynamics simulations and 15 μs of sampling to study the behavior of amino acids and amino acid sidechain analogs in high-concentration aqueous solutions of standard RNA nucleobases. Structural and energetic analysis of simulated systems allows us to derive interaction propensity scales for different amino acid/nucleobase combinations. The derived scales closely match and greatly extend the available experimental data, providing a comprehensive foundation for studying RNA-protein interactions in different contexts. By using these scales, we demonstrate a statistically significant connection between nucleobase composition of human mRNA coding sequences and nucleobase interaction propensities of their cognate protein sequences. For example, pyrimidine density profiles of mRNAs match uracil-propensity profiles of their cognate proteins with a median Pearson correlation coefficient of R = -0.70. Our results provide support for the recently proposed hypotheses that mRNAs and their cognate proteins may be physicochemically complementary to each other and bind, especially if unstructured, with the complementarity level being negatively influenced by mRNA adenine content. Finally, we utilize the derived scales to refine the complementarity hypothesis and closely examine its physicochemical underpinnings.
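The profile comparison mentioned in this abstract (an mRNA pyrimidine density profile correlated against a protein propensity profile) can be sketched as follows. This is only an illustration of the general idea, not the authors' pipeline: the window size, the per-codon averaging, and the toy propensity values are all assumptions made for the example, and the derived interaction scales from the paper are not reproduced here.

```python
from math import sqrt


def moving_average(values, window):
    """Centered moving average; windows are truncated at the sequence ends."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def pyrimidine_density(mrna, window=3):
    """Per-codon fraction of C/U bases, smoothed over `window` codons."""
    usable = len(mrna) - len(mrna) % 3  # ignore an incomplete trailing codon
    codons = [mrna[i:i + 3] for i in range(0, usable, 3)]
    raw = [sum(base in "CU" for base in codon) / 3 for codon in codons]
    return moving_average(raw, window)


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Toy input: four codons and a made-up per-residue propensity profile.
mrna_profile = pyrimidine_density("UUUCCCAAAGGG", window=1)
toy_propensity = [0.1, 0.2, 0.9, 0.8]  # hypothetical values, for illustration only
r = pearson(mrna_profile, toy_propensity)
print(mrna_profile, round(r, 2))
```

With real data, one correlation would be computed per mRNA and protein pair, and the median over all pairs is the kind of summary statistic the abstract reports.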
Affiliation(s)
- Matea Hajnic
- Department of Structural and Computational Biology, Max F. Perutz Laboratories, University of Vienna, Vienna 1030, Austria
- Juan Iregui Osorio
- Department of Structural and Computational Biology, Max F. Perutz Laboratories, University of Vienna, Vienna 1030, Austria
- Bojan Zagrovic
- Department of Structural and Computational Biology, Max F. Perutz Laboratories, University of Vienna, Vienna 1030, Austria
|
41
|
Penny D, Zhong B. Two fundamental questions about protein evolution. Biochimie 2014; 119:278-83. [PMID: 25447137 DOI: 10.1016/j.biochi.2014.10.020] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 07/23/2014] [Accepted: 10/15/2014] [Indexed: 01/16/2023]
Abstract
Two basic questions are considered that approach protein evolution from different directions; the problems arising from using Markov models for the deeper divergences, and then the origin of proteins themselves. The real problem for the first question (going backwards in time) is that at deeper phylogenies the Markov models of sequence evolution must lose information exponentially at deeper divergences, and several testable methods are suggested that should help resolve these deeper divergences. For the second question (coming forwards in time) a problem is that most models for the origin of protein synthesis do not give a role for the very earliest stages of the process. From our knowledge of the importance of replication accuracy in limiting the length of a coding molecule, a testable hypothesis is proposed. The length of the code, the code itself, and tRNAs would all have prior roles in increasing the accuracy of RNA replication; thus proteins would have been formed only after the tRNAs and the length of the triplet code are already formed. Both questions lead to testable predictions.
Affiliation(s)
- David Penny
- Institute of Fundamental Sciences, Massey University, Palmerston North, New Zealand.
- Bojian Zhong
- Institute of Fundamental Sciences, Massey University, Palmerston North, New Zealand
|
42
|
Babbitt GA, Alawad MA, Schulze KV, Hudson AO. Synonymous codon bias and functional constraint on GC3-related DNA backbone dynamics in the prokaryotic nucleoid. Nucleic Acids Res 2014; 42:10915-26. [PMID: 25200075 PMCID: PMC4176184 DOI: 10.1093/nar/gku811] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Indexed: 01/01/2023] Open
Abstract
While mRNA stability has been demonstrated to control rates of translation, generating both global and local synonymous codon biases in many unicellular organisms, this explanation cannot adequately explain why codon bias strongly tracks neighboring intergene GC content; suggesting that structural dynamics of DNA might also influence codon choice. Because minor groove width is highly governed by 3-base periodicity in GC, the existence of triplet-based codons might imply a functional role for the optimization of local DNA molecular dynamics via GC content at synonymous sites (≈GC3). We confirm a strong association between GC3-related intrinsic DNA flexibility and codon bias across 24 different prokaryotic multiple whole-genome alignments. We develop a novel test of natural selection targeting synonymous sites and demonstrate that GC3-related DNA backbone dynamics have been subject to moderate selective pressure, perhaps contributing to our observation that many genes possess extreme DNA backbone dynamics for their given protein space. This dual function of codons may impose universal functional constraints affecting the evolution of synonymous and non-synonymous sites. We propose that synonymous sites may have evolved as an 'accessory' during an early expansion of a primordial genetic code, allowing for multiplexed protein coding and structural dynamic information within the same molecular context.
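GC3, the GC content at third codon positions that this abstract uses as a proxy for synonymous sites, is straightforward to compute from a coding sequence. The sketch below assumes an in-frame sequence starting at the first codon position; the function name `gc3` is illustrative and the example sequences are made up.

```python
def gc3(cds):
    """Fraction of codons whose third base is G or C.

    Assumes `cds` is in frame; an incomplete trailing codon is ignored.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3
    third_bases = seq[2:usable:3]  # every third base, starting at index 2
    if not third_bases:
        raise ValueError("sequence is shorter than one codon")
    return sum(base in "GC" for base in third_bases) / len(third_bases)


# Codons ATG, GCC, AAA have third bases G, C, A.
print(gc3("ATGGCCAAA"))
```

Because most third-position changes are synonymous, shifts in this value can alter local GC content without changing the encoded protein, which is the dual role of codons the abstract argues for.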
Affiliation(s)
- Gregory A Babbitt
- Thomas H. Gosnell School of Life Sciences, Rochester Institute of Technology, Rochester NY, USA 14623
- Mohammed A Alawad
- B. Thomas Golisano College of Computing and Information Sciences, Rochester Institute of Technology, Rochester NY, USA 14623
- Katharina V Schulze
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston TX, USA 77030
- André O Hudson
- Thomas H. Gosnell School of Life Sciences, Rochester Institute of Technology, Rochester NY, USA 14623
|
43
|
Rosandić M, Paar V. Codon sextets with leading role of serine create "ideal" symmetry classification scheme of the genetic code. Gene 2014; 543:45-52. [PMID: 24709107 DOI: 10.1016/j.gene.2014.04.009] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 03/06/2014] [Accepted: 04/03/2014] [Indexed: 11/17/2022]
Abstract
The standard classification scheme of the genetic code is organized for alphabetic ordering of nucleotides. Here we introduce the new, "ideal" classification scheme in compact form, for the first time generated by codon sextets encoding Ser, Arg and Leu amino acids. The new scheme creates the known purine/pyrimidine, codon-anticodon, and amino/keto type symmetries and a novel A+U rich/C+G rich symmetry. This scheme is built from "leading" and "nonleading" groups of 32 codons each. In the ensuing 4 × 16 scheme, based on trinucleotide quadruplets, Ser has a central role as initial generator. Six codons encoding Ser and six encoding Arg extend continuously along a linear array in the "leading" group, and together with four of six Leu codons uniquely define construction of the "leading" group. The remaining two Leu codons enable construction of the "nonleading" group. The "ideal" genetic code suggests the evolution of genetic code with serine as an initiator.
Affiliation(s)
- Marija Rosandić
- Croatian Academy of Sciences and Arts, Zrinski trg 11, 10000 Zagreb, Croatia
- Vladimir Paar
- Croatian Academy of Sciences and Arts, Zrinski trg 11, 10000 Zagreb, Croatia; Faculty of Science, University of Zagreb, Bijenička 32, 10000 Zagreb, Croatia.
|
44
|
Lenstra R. Evolution of the genetic code through progressive symmetry breaking. J Theor Biol 2014; 347:95-108. [DOI: 10.1016/j.jtbi.2014.01.002] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 12/12/2012] [Revised: 12/18/2013] [Accepted: 01/01/2014] [Indexed: 01/18/2023]
|
45
|
Harada K, Aoyama S, Matsugami A, Kumar PKR, Katahira M, Kato N, Ohkanda J. RNA-directed amino acid coupling as a model reaction for primitive coded translation. Chembiochem 2014; 15:794-8. [PMID: 24591237 DOI: 10.1002/cbic.201400029] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 01/13/2014] [Indexed: 11/09/2022]
Abstract
The stereochemical theory claims that primitive coded translation initially occurred in the RNA world by RNA-directed amino acid coupling. In this study, we show that the HIV Tat aptamer RNA is capable of recognizing two consecutive arginine residues within the Tat peptide, thus demonstrating how RNA might be able to position two amino acids for sequence-specific coupling. We also show that this RNA can act as a template to accelerate the coupling of a single arginine residue to the N-terminal arginine residue of a peptide primer. The results might have implications for our understanding of the origin of translation.
Affiliation(s)
- Kazuo Harada
- Department of Life Sciences, Tokyo Gakugei University, 4-1-1 Nukuikita-machi, Koganei, Tokyo 184-8501 (Japan).
|
46
|
Wang B, Kennedy MA. Principal components analysis of protein sequence clusters. J Struct Funct Genomics 2014; 15:1-11. [PMID: 24496727 DOI: 10.1007/s10969-014-9173-2] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Received: 07/30/2013] [Accepted: 01/24/2014] [Indexed: 12/21/2022]
Abstract
Sequence analysis of large protein families can produce sub-clusters even within the same family. In some cases, it is of interest to know precisely which amino acid position variations are most responsible for driving separation into sub-clusters. In large protein families composed of large proteins, it can be quite challenging to assign the relative importance to specific amino acid positions. Principal components analysis (PCA) is ideal for such a task, since the problem is posed in a large variable space, i.e. the number of amino acids that make up the protein sequence, and PCA is powerful at reducing the dimensionality of complex problems by projecting the data into an eigenspace that represents the directions of greatest variation. However, PCA of aligned protein sequence families is complicated by the fact that protein sequences are traditionally represented by single letter alphabetic codes, whereas PCA of protein sequence families requires conversion of sequence information into a numerical representation. Here, we introduce a new amino acid sequence conversion algorithm optimized for PCA data input. The method is demonstrated using a small artificial dataset to illustrate the characteristics and performance of the algorithm, as well as a small protein sequence family consisting of nine members, COG2263, and finally with a large protein sequence family, Pfam04237, which contains more than 1,800 sequences that group into two sub-clusters.
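The need to convert letters into numbers before running PCA can be made concrete with a small example. The sketch below uses plain one-hot encoding as a generic stand-in; it is not the conversion algorithm the paper introduces, and the toy alignment, the function names and the power-iteration shortcut for the first principal component are all assumptions made for illustration.

```python
AMINO = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(seq):
    """Flatten an aligned sequence into 20 indicator values per position."""
    return [1.0 if aa == ref else 0.0 for aa in seq for ref in AMINO]


def first_pc_scores(rows, iterations=200):
    """Project rows onto the first principal component (power iteration)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    v = [1.0 / (j + 1) for j in range(d)]  # arbitrary deterministic start vector
    for _ in range(iterations):
        # Apply X^T X to v without building the d-by-d covariance matrix.
        xv = [sum(x[j] * v[j] for j in range(d)) for x in centered]
        w = [sum(centered[i][j] * xv[i] for i in range(n)) for j in range(d)]
        norm = sum(c * c for c in w) ** 0.5
        if norm == 0.0:
            break
        v = [c / norm for c in w]
    return [sum(x[j] * v[j] for j in range(d)) for x in centered]


# Toy alignment with two obvious sub-clusters (made-up sequences).
alignment = ["ACDA", "ACDA", "ACDG", "WYFA", "WYFA", "WYFG"]
scores = first_pc_scores([one_hot(s) for s in alignment])
for seq, score in zip(alignment, scores):
    print(seq, round(score, 3))
```

In this toy case the first component separates the two groups by sign, and the positions with the largest loadings would be the ones driving the split. The overall sign of a principal component is arbitrary, so only the relative grouping is meaningful.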
Affiliation(s)
- Bo Wang
- Department of Chemistry and Biochemistry, Miami University, Oxford, OH, 45056, USA
|
47
|
Bandhu AV, Aggarwal N, Sengupta S. Revisiting the physico-chemical hypothesis of code origin: an analysis based on code-sequence coevolution in a finite population. Orig Life Evol Biosph 2013; 43:465-89. [PMID: 24500541 DOI: 10.1007/s11084-014-9353-x] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Received: 11/23/2013] [Accepted: 01/13/2014] [Indexed: 01/23/2023]
Abstract
The origin of the genetic code marked a major transition from a plausible RNA world to the world of DNA and proteins and is an important milestone in our understanding of the origin of life. We examine the efficacy of the physico-chemical hypothesis of code origin by carrying out simulations of code-sequence coevolution in finite populations in stages, leading first to the emergence of ten amino acid code(s) and subsequently to 14 amino acid code(s). We explore two different scenarios of primordial code evolution. In one scenario, competition occurs between populations of equilibrated code-sequence sets, while in the other, new codes compete with existing codes as they are gradually introduced into the population with a finite probability. In either case, we find that natural selection between competing codes distinguished by differences in the degree of physico-chemical optimization is unable to explain the structure of the standard genetic code. The code whose structure is most consistent with the standard genetic code is often not among the codes that have a high fixation probability. However, we find that the composition of the code population affects the code fixation probability. A physico-chemically optimized code gets fixed with a significantly higher probability if it competes against a set of randomly generated codes. Our results suggest that physico-chemical optimization may not be the sole driving force in ensuring the emergence of the standard genetic code.
Affiliation(s)
- Ashutosh Vishwa Bandhu
- School of Computational & Integrative Sciences, Jawaharlal Nehru University, New Delhi, 110067, India
|
48
|
Rosandić M, Paar V, Glunčić M. Fundamental role of start/stop regulators in whole DNA and new trinucleotide classification. Gene 2013; 531:184-90. [PMID: 24042127 DOI: 10.1016/j.gene.2013.09.021] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 04/28/2013] [Revised: 08/31/2013] [Accepted: 09/05/2013] [Indexed: 10/26/2022]
Abstract
The origin and logic of the genetic code are two of the greatest mysteries of the life sciences. Analyzing DNA sequences, we showed that the start/stop trinucleotides have broader importance than just marking the start and stop of exons in coding DNA. On this basis, we introduce here a new classification of trinucleotides and show that all A+T rich trinucleotides consisting of three different nucleotides arise from start-ATG, stop-TGA and stop-TAG using their complement, reverse complement and reverse transformations. Due to the same transformations during generations of crossing-over, they can switch from one form to the other. By a direct process, the start-ATG and stop-TAG can irreversibly transform into stop-TAA. By transformation into A+T rich trinucleotides and 16/32 C+G rich trinucleotides, they can lose the start/stop function and take the role of a sense codon in a reversible way. The remaining 16 C+G rich trinucleotides cannot directly transform into start/stop trinucleotides and thus remain a firm skeleton for structuring the C+G rich DNA. We showed that start/stops strongly enrich the A+T rich noncoding DNA through frequently extended forms. From the evolutionary viewpoint, the start/stops are the chief creators of the prevailing A+T rich noncoding DNA, and of the more stable coding DNA. We propose that start/stops have a basic role as "seeds" in the trinucleotide evolution of noncoding and coding sequences and lead to asymmetry between A+T rich and C+G rich DNA. Through dynamical transformations during evolution, they enabled pronounced phylogenetic broadness while keeping the regulator function.
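The claim about A+T rich trinucleotides can be checked by enumeration. The following sketch applies the three transformations named in the abstract (complement, reverse, reverse complement) once to each of ATG, TGA and TAG and compares the result with every trinucleotide made of A, T and one of C or G in any order. The function names are illustrative, and reading "A+T rich with three different nucleotides" as "contains A, T and exactly one C or G" is an assumption about the authors' definition.

```python
from itertools import permutations

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(tri):
    """Base-wise complement, keeping the reading direction."""
    return tri.translate(COMPLEMENT)


def reverse(tri):
    """Same bases read in the opposite direction."""
    return tri[::-1]


def reverse_complement(tri):
    """Complement read in the opposite direction (the other strand)."""
    return complement(tri)[::-1]


def generated_from(seeds):
    """Seeds plus their complement, reverse and reverse complement."""
    found = set()
    for seed in seeds:
        found.update(
            {seed, complement(seed), reverse(seed), reverse_complement(seed)}
        )
    return found


generated = generated_from(["ATG", "TGA", "TAG"])
# Every ordering of A, T and one strong base (C or G): 2 x 3! = 12 trinucleotides.
expected = {"".join(p) for third in "CG" for p in permutations("AT" + third)}
print(generated == expected, len(generated))
```

The two sets coincide, so under this reading the three start/stop seeds do reach all twelve such trinucleotides in a single transformation step.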
Affiliation(s)
- Marija Rosandić
- Faculty of Science, University of Zagreb, Bijenička 32, 10000 Zagreb, Croatia
|
49
|
Fimmel E, Danielli A, Strüngmann L. On dichotomic classes and bijections of the genetic code. J Theor Biol 2013; 336:221-30. [PMID: 23988795 DOI: 10.1016/j.jtbi.2013.07.027] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Received: 01/20/2013] [Revised: 06/05/2013] [Accepted: 07/25/2013] [Indexed: 11/17/2022]
Abstract
Dichotomic classes arising from a recent mathematical model of the genetic code allow many symmetry properties of the code to be uncovered, and although theoretically derived, they have permitted the construction of statistical classifiers able to retrieve the correct translational frame of coding sequences. Herein we formalize the mathematical properties of these classes, first focusing on all the possible decompositions of the 64 codons of the genetic code into two equally sized dichotomic subsets. Then the global framework of bijective transformations of the nucleotide bases is discussed and we clarify when dichotomic partitions can be generated. In addition, we show that the parity dichotomic classes of the mathematical model and the complementarity dichotomic classes obtained in the present article can be formalized in the same algorithmic way as Rumer's dichotomic degeneracy classes. Interestingly, we find that the algorithm underlying dichotomic class definition mirrors biochemical features occurring at discrete base positions in the decoding center of the ribosome.
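Rumer's degeneracy classes, mentioned at the end of this abstract, give a concrete instance of a dichotomic partition of the 64 codons into two sets of 32. The sketch below builds it from the standard codon table (NCBI translation table 1 layout, assumed here) and applies the base bijection T↔G, A↔C, commonly called the Rumer transformation. That transformation is background knowledge about the code rather than something the abstract states, and the names used are illustrative.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in NCBI table 1 order; '*' marks the stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def is_fourfold(codon):
    """True when the first two bases alone determine the encoded meaning."""
    return len({CODE[codon[:2] + third] for third in BASES}) == 1


# Bijection on the bases: T <-> G and A <-> C.
RUMER = str.maketrans("TGAC", "GTCA")

class_fourfold = {codon for codon in CODE if is_fourfold(codon)}
class_other = set(CODE) - class_fourfold
swapped = {codon.translate(RUMER) for codon in class_fourfold}

print(len(class_fourfold), len(class_other), swapped == class_other)
```

The two classes have 32 codons each, and the base bijection maps one class exactly onto the other, which illustrates the kind of link between dichotomic partitions and bijective base transformations that the paper formalizes.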
Affiliation(s)
- Elena Fimmel
- Institute of Applied Mathematics, Faculty of Computer Sciences, Mannheim University of Applied Sciences, 68163 Mannheim, Germany.
|
50
|
Guilloux A, Caudron B, Jestin JL. A method to predict edge strands in beta-sheets from protein sequences. Comput Struct Biotechnol J 2013; 7:e201305001. [PMID: 24688737 PMCID: PMC3962219 DOI: 10.5936/csbj.201305001] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 02/22/2013] [Revised: 05/27/2013] [Accepted: 05/30/2013] [Indexed: 12/15/2022] Open
Abstract
There is a need for rules allowing three-dimensional structure information to be derived from protein sequences. In this work, consideration of an elementary protein folding step allows protein sub-sequences which optimize folding to be derived for any given protein sequence. Classical mechanics applied to this system and the energy conservation law during the elementary folding step yields an equation whose solutions are taken over the field of rational numbers. This formalism is applied to beta-sheets containing two edge strands and at least two central strands. The number of protein sub-sequences optimized for folding per amino acid in beta-strands is shown in particular to predict edge strands from protein sequences. Topological information on beta-strands and loops connecting them is derived for protein sequences with a prediction accuracy of 75%. The statistical significance of the finding is given. Applications in protein structure prediction are envisioned such as for the quality assessment of protein structure models.
Affiliation(s)
- Antonin Guilloux
- Analyse algébrique, Institut de Mathématiques de Jussieu, Université Pierre et Marie Curie, Paris VI, France
- Bernard Caudron
- Centre d'Informatique pour la Biologie, Institut Pasteur, Paris, France
|