1
|
Rozhoňová H, Martí-Gómez C, McCandlish DM, Payne JL. Robust genetic codes enhance protein evolvability. PLoS Biol 2024; 22:e3002594. [PMID: 38754362 PMCID: PMC11098591 DOI: 10.1371/journal.pbio.3002594] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2023] [Accepted: 03/19/2024] [Indexed: 05/18/2024] Open
Abstract
The standard genetic code defines the rules of translation for nearly every life form on Earth. It also determines the amino acid changes accessible via single-nucleotide mutations, thus influencing protein evolvability (the ability of mutation to bring forth adaptive variation in protein function). One of the most striking features of the standard genetic code is its robustness to mutation, yet it remains an open question whether such robustness facilitates or frustrates protein evolvability. To answer this question, we use data from massively parallel sequence-to-function assays to construct and analyze 6 empirical adaptive landscapes under hundreds of thousands of rewired genetic codes, including those of codon compression schemes relevant to protein engineering and synthetic biology. We find that robust genetic codes tend to enhance protein evolvability by rendering smooth adaptive landscapes with few peaks, which are readily accessible from throughout sequence space. However, the standard genetic code is rarely exceptional in this regard, because many alternative codes render smoother landscapes than the standard code. By constructing low-dimensional visualizations of these landscapes, which each comprise more than 16 million mRNA sequences, we show that such alternative codes radically alter the topological features of the network of high-fitness genotypes. Whereas the genetic codes that optimize evolvability depend to some extent on the detailed relationship between amino acid sequence and protein function, we also uncover general design principles for engineering nonstandard genetic codes for enhanced and diminished evolvability, which may facilitate directed protein evolution experiments and the bio-containment of synthetic organisms, respectively.
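To make concrete what it means for a code to determine the amino acid changes accessible via single-nucleotide mutations, here is a minimal Python sketch. It is not the authors' analysis pipeline: it only builds the standard code table (NCBI translation table 1, written in the conventional TCAG order) and classifies the point mutations of each codon. The names `STANDARD_CODE`, `neighbors`, `accessible_amino_acids`, and `mutation_counts` are illustrative assumptions.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), codons in TCAG order; '*' marks a stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def neighbors(codon):
    """Return the 9 codons that differ from `codon` at exactly one position."""
    return [codon[:i] + b + codon[i + 1:]
            for i in range(3) for b in BASES if b != codon[i]]


def accessible_amino_acids(codon, code=STANDARD_CODE):
    """Amino acids reachable by one point mutation, excluding stops and the original."""
    return {code[n] for n in neighbors(codon)} - {"*", code[codon]}


def mutation_counts(code=STANDARD_CODE):
    """Classify every point mutation from a sense codon as synonymous, missense, or nonsense."""
    counts = {"synonymous": 0, "missense": 0, "nonsense": 0}
    for codon, aa in code.items():
        if aa == "*":
            continue  # only mutations starting from sense codons are counted
        for n in neighbors(codon):
            if code[n] == aa:
                counts["synonymous"] += 1
            elif code[n] == "*":
                counts["nonsense"] += 1
            else:
                counts["missense"] += 1
    return counts


if __name__ == "__main__":
    print(sorted(accessible_amino_acids("TGG")))
    print(mutation_counts())
```

Passing a different 64-entry mapping as `code` gives the same quantities for a rewired code, which is the kind of comparison the abstract describes at much larger scale.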
Affiliation(s)
- Hana Rozhoňová
  - Institute of Integrative Biology, ETH Zürich, Zürich, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Carlos Martí-Gómez
  - Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- David M. McCandlish
  - Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- Joshua L. Payne
  - Institute of Integrative Biology, ETH Zürich, Zürich, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
|
2
|
Romero Romero ML, Landerer C, Poehls J, Toth‐Petroczy A. Phenotypic mutations contribute to protein diversity and shape protein evolution. Protein Sci 2022; 31:e4397. [PMID: 36040266 PMCID: PMC9375231 DOI: 10.1002/pro.4397] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/14/2022] [Revised: 06/14/2022] [Accepted: 07/04/2022] [Indexed: 11/16/2022]
Abstract
Errors in DNA replication generate genetic mutations, while errors in transcription and translation lead to phenotypic mutations. Phenotypic mutations are orders of magnitude more frequent than genetic ones, yet they are less understood. Here, we review the types of phenotypic mutations, their quantifications, and their role in protein evolution and disease. The diversity generated by phenotypic mutation can facilitate adaptive evolution. Indeed, phenotypic mutations, such as ribosomal frameshift and stop codon readthrough, sometimes serve to regulate protein expression and function. Phenotypic mutations have often been linked to fitness decrease and diseases. Thus, understanding the protein heterogeneity and phenotypic diversity caused by phenotypic mutations will advance our understanding of protein evolution and have implications on human health and diseases.
Affiliation(s)
- Maria Luisa Romero Romero
  - Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
  - Center for Systems Biology Dresden, Dresden, Germany
- Cedric Landerer
  - Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
  - Center for Systems Biology Dresden, Dresden, Germany
- Jonas Poehls
  - Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
  - Center for Systems Biology Dresden, Dresden, Germany
- Agnes Toth‐Petroczy
  - Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
  - Center for Systems Biology Dresden, Dresden, Germany
  - Cluster of Excellence Physics of Life, TU Dresden, Dresden, Germany
|
3
|
Mattenberger F, Vila-Nistal M, Geller R. Increased RNA virus population diversity improves adaptability. Sci Rep 2021; 11:6824. [PMID: 33767337 PMCID: PMC7994910 DOI: 10.1038/s41598-021-86375-z] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 11/30/2020] [Accepted: 03/15/2021] [Indexed: 11/20/2022] Open
Abstract
The replication machinery of most RNA viruses lacks proofreading mechanisms. As a result, RNA virus populations harbor a large amount of genetic diversity that confers them the ability to rapidly adapt to changes in their environment. In this work, we investigate whether further increasing the initial population diversity of a model RNA virus can improve adaptation to a single selection pressure, thermal inactivation. For this, we experimentally increased the diversity of coxsackievirus B3 (CVB3) populations across the capsid region. We then compared the ability of these high diversity CVB3 populations to achieve resistance to thermal inactivation relative to standard CVB3 populations in an experimental evolution setting. We find that viral populations with high diversity are better able to achieve resistance to thermal inactivation at both the temperature employed during experimental evolution as well as at a more extreme temperature. Moreover, we identify mutations in the CVB3 capsid that confer resistance to thermal inactivation, finding significant mutational epistasis. Our results indicate that even naturally diverse RNA virus populations can benefit from experimental augmentation of population diversity for optimal adaptation and support the use of such viral populations in directed evolution efforts that aim to select viruses with desired characteristics.
Affiliation(s)
- Florian Mattenberger
  - Institute for Integrative Systems Biology, I2SysBio (Universitat de València-CSIC), C. Catedràtic José Beltrán 2, 46980, Paterna, Spain
- Marina Vila-Nistal
  - Department of Physiology, Genetics and Microbiology, Universidad de Alicante, C. San Vicente del Raspeig s/n, 03690, Alicante, Spain
- Ron Geller
  - Institute for Integrative Systems Biology, I2SysBio (Universitat de València-CSIC), C. Catedràtic José Beltrán 2, 46980, Paterna, Spain
|
4
|
Tripathi S, Deem MW. The Standard Genetic Code Facilitates Exploration of the Space of Functional Nucleotide Sequences. J Mol Evol 2018; 86:325-339. [PMID: 29959476 DOI: 10.1007/s00239-018-9852-x] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 12/06/2017] [Accepted: 06/21/2018] [Indexed: 01/07/2023]
Abstract
The standard genetic code is well known to be optimized for minimizing the phenotypic effects of single-nucleotide substitutions, a property that was likely selected for during the emergence of a universal code. Given the fitness advantage afforded by high standing genetic diversity in a population in a dynamic environment, it is possible that selection to explore a large fraction of the space of functional proteins also occurred. To determine whether selection for such a property played a role during the emergence of the nearly universal standard genetic code, we investigated the number of functional variants of the Escherichia coli PhoQ protein explored at different time scales under translation using different genetic codes. We found that the standard genetic code is highly optimal for exploring a large fraction of the space of functional PhoQ variants at intermediate time scales as compared to random codes. Environmental changes, in response to which genetic diversity in a population provides a fitness advantage, are likely to have occurred at these intermediate time scales. Our results indicate that the ability of the standard code to explore a large fraction of the space of functional sequence variants arises from a balance between robustness and flexibility and is largely independent of the property of the standard code to minimize the phenotypic effects of mutations. We propose that selection to explore a large fraction of the functional sequence space while minimizing the phenotypic effects of mutations contributed toward the emergence of the standard code as the universal genetic code.
Affiliation(s)
- Shubham Tripathi
  - PhD Program in Systems, Synthetic, and Physical Biology, Rice University, Houston, TX, 77005, USA
  - Center for Theoretical Biological Physics, Rice University, Houston, TX, 77005, USA
- Michael W Deem
  - PhD Program in Systems, Synthetic, and Physical Biology, Rice University, Houston, TX, 77005, USA
  - Center for Theoretical Biological Physics, Rice University, Houston, TX, 77005, USA
  - Department of Bioengineering, Rice University, Houston, TX, 77005, USA
  - Department of Physics and Astronomy, Rice University, Houston, TX, 77005, USA
|
5
|
Salinas DG, Gallardo MO, Osorio MI. Local conditions for global stability in the space of codons of the genetic code. Biosystems 2016; 150:73-77. [PMID: 27531459 DOI: 10.1016/j.biosystems.2016.08.007] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 04/16/2015] [Revised: 06/01/2016] [Accepted: 08/11/2016] [Indexed: 11/26/2022]
Abstract
The polar requirement is an attribute of amino acids that is a major determinant of the structure and function of the proteins, and it plays a role in the flexibility and robustness of the genetic code. The viability of an organism depends on flexibility, which allows the exploration of new functions. However, robustness is necessary to protect the organism from deleterious changes derived from misreading errors and single-point mutations. Compared with random codes, the standard genetic code is one of the most robust against such errors. Here, using analytical and numerical calculations and the set of amino acid-encoding codons, we have proposed some local conditions that are necessary for the optimal robustness of the genetic code, and we explored the association between the local conditions and the robustness. The localness of the proposed conditions and the underlying evolutionary mechanism, which begins with a random code and progresses toward more efficient codes (e.g., the standard code), might be biologically plausible.
Affiliation(s)
- Dino G Salinas
  - Centro de Investigación Biomédica, Facultad de Medicina, Universidad Diego Portales, Avda. Ejército 141, Santiago, Chile
- Mauricio O Gallardo
  - Centro de Investigación Biomédica, Facultad de Medicina, Universidad Diego Portales, Avda. Ejército 141, Santiago, Chile
- Manuel I Osorio
  - Centro de Investigación Biomédica, Facultad de Medicina, Universidad Diego Portales, Avda. Ejército 141, Santiago, Chile
|
6
|
Londe S, Monnin T, Cornette R, Debat V, Fisher BL, Molet M. Phenotypic plasticity and modularity allow for the production of novel mosaic phenotypes in ants. EvoDevo 2015; 6:36. [PMID: 26629324 PMCID: PMC4666092 DOI: 10.1186/s13227-015-0031-5] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Received: 08/23/2015] [Accepted: 11/12/2015] [Indexed: 12/29/2022] Open
Abstract
BACKGROUND: The origin of discrete novelties remains unclear. Some authors suggest that qualitative phenotypic changes may result from the reorganization of preexisting phenotypic traits during development (i.e., developmental recombination) following genetic or environmental changes. Because ants combine high modularity with extreme phenotypic plasticity (queen and worker castes), their diversified castes could have evolved by developmental recombination. We performed a quantitative morphometric study to investigate the developmental origins of novel phenotypes in the ant Mystrium rogeri, which occasionally produces anomalous 'intercastes.' Our analysis compared the variation of six morphological modules with body size using a large sample of intercastes. RESULTS: We confirmed that intercastes are conspicuous mosaics that recombine queen and worker modules. In addition, we found that many other individuals traditionally classified as workers or queens also exhibit some level of mosaicism. The six modules had distinct profiles of variation suggesting that each module responds differentially to factors that control body size and polyphenism. Mosaicism appears to result from each module responding differently yet in an ordered and predictable manner to intermediate levels of inducing factors that control polyphenism. The order of module response determines which mosaic combinations are produced. CONCLUSIONS: Because the frequency of mosaics and their canalization around a particular phenotype may evolve by selection on standing genetic variation that affects the plastic response (i.e., genetic accommodation), developmental recombination is likely to play an important role in the evolution of novel castes in ants. Indeed, we found that most mosaics have queen-like head and gaster but a worker-like thorax congruent with the morphology of ergatoid queens and soldiers, respectively. Ergatoid queens of M. oberthueri, a sister species of M. rogeri, could have evolved from intercastes produced ancestrally through such a process.
Affiliation(s)
- Sylvain Londe
  - UMR 7618 Institute of Ecology and Environmental Sciences of Paris, Sorbonne Universités, UPMC Univ Paris 06, 7 quai St Bernard, 75 252 Paris, France
- Thibaud Monnin
  - UMR 7618 Institute of Ecology and Environmental Sciences of Paris, Sorbonne Universités, UPMC Univ Paris 06, 7 quai St Bernard, 75 252 Paris, France
- Raphaël Cornette
  - Département Systématique et Évolution, Muséum National d’Histoire Naturelle; CNRS UMR 7205, Institut de Systématique, Evolution, Biodiversité, Paris, France
- Vincent Debat
  - Département Systématique et Évolution, Muséum National d’Histoire Naturelle; CNRS UMR 7205, Institut de Systématique, Evolution, Biodiversité, Paris, France
- Brian L. Fisher
  - Department of Entomology, California Academy of Sciences, Golden Gate Park, 55 Music Concourse Drive, San Francisco, CA 94118 USA
- Mathieu Molet
  - UMR 7618 Institute of Ecology and Environmental Sciences of Paris, Sorbonne Universités, UPMC Univ Paris 06, 7 quai St Bernard, 75 252 Paris, France
|
7
|
Neutral and weakly nonneutral sequence variants may define individuality. Proc Natl Acad Sci U S A 2013; 110:14255-60. [PMID: 23940345 DOI: 10.1073/pnas.1216613110] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.5] [Indexed: 12/24/2022] Open
Abstract
Large-scale computational analyses of the growing wealth of genome-variation data consistently tell two distinct stories. The first is expected: coding variants reported in disease-related databases significantly alter the function of affected proteins. The second is surprising: the genomes of healthy individuals appear to carry many variants that are predicted to have some effect on function. As long as the complete experimental analysis of all human genome variants remains impossible, computational methods, such as PolyPhen, SNAP, and SIFT, might provide important insights. These methods capture the effects of particular variants very well and can highlight trends in populations of variants. Diseases are, arguably, extreme phenotypic variations and are often attributable to one or a few severely functionally disruptive variants. Our findings suggest a genomic basis of the different nondisease phenotypes. Prediction methods indicate that variants in seemingly healthy individuals tend to be neutral or weakly disruptive for protein molecular function. These variant effects are predicted to be largely either experimentally undetectable or are not deemed significant enough to be published. This may suggest that nondisease phenotypes arise through combinations of many variants whose effects are weakly nonneutral (damaging or enhancing) to the molecular protein function but fall within the wild-type range of overall physiological function.
|
8
|
José MV, Morgado ER, Govezensky T. Genetic hotels for the standard genetic code: evolutionary analysis based upon novel three-dimensional algebraic models. Bull Math Biol 2010; 73:1443-76. [PMID: 20725796 DOI: 10.1007/s11538-010-9571-y] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Received: 11/05/2009] [Accepted: 07/02/2010] [Indexed: 11/30/2022]
Abstract
Herein, we rigorously develop novel 3-dimensional algebraic models called Genetic Hotels of the Standard Genetic Code (SGC). We start by considering the primeval RNA genetic code which consists of the 16 codons of type RNY (purine-any base-pyrimidine). Using simple algebraic operations, we show how the RNA code could have evolved toward the current SGC via two different intermediate evolutionary stages called Extended RNA code type I and II. By rotations or translations of the subset RNY, we arrive at the SGC via the former (type I) or via the latter (type II), respectively. Biologically, the Extended RNA code type I, consists of all codons of the type RNY plus codons obtained by considering the RNA code but in the second (NYR type) and third (YRN type) reading frames. The Extended RNA code type II, comprises all codons of the type RNY plus codons that arise from transversions of the RNA code in the first (YNY type) and third (RNR) nucleotide bases. Since the dimensions of remarkable subsets of the Genetic Hotels are not necessarily integer numbers, we also introduce the concept of algebraic fractal dimension. A general decoding function which maps each codon to its corresponding amino acid or the stop signals is also derived. The Phenotypic Hotel of amino acids is also illustrated. The proposed evolutionary paths are discussed in terms of the existing theories of the evolution of the SGC. The adoption of 3-dimensional models of the Genetic and Phenotypic Hotels will facilitate the understanding of the biological properties of the SGC.
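The codon classes named in this abstract can be enumerated directly. The sketch below is only a Python illustration of the set definitions (R = purine, Y = pyrimidine, N = any base), not the algebraic construction developed in the paper; the helper `expand` and the variable names are assumptions made for the example.

```python
from itertools import product

# IUPAC nucleotide classes: R = purine, Y = pyrimidine, N = any base.
CLASSES = {"R": "AG", "Y": "CU", "N": "ACGU"}


def expand(pattern):
    """Return the set of codons matching a three-letter pattern such as 'RNY'."""
    return {"".join(c) for c in product(*(CLASSES[p] for p in pattern))}


rny = expand("RNY")
# Extended RNA code type I: RNY plus the NYR and YRN patterns named in the abstract.
type_1 = rny | expand("NYR") | expand("YRN")
# Extended RNA code type II: RNY plus the YNY and RNR patterns named in the abstract.
type_2 = rny | expand("YNY") | expand("RNR")

if __name__ == "__main__":
    print(len(rny), len(type_1), len(type_2))
```

Because the three patterns in each extended code are mutually exclusive at one position or another, the unions contain no duplicates, so each set size is simply the sum of its parts.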
Affiliation(s)
- Marco V José
  - Theoretical Biology Group, Instituto de Investigaciones Biomédicas, Universidad Nacional Autónoma de México, Mexico
|
9
|
Castro-Chavez F. The rules of variation: amino acid exchange according to the rotating circular genetic code. J Theor Biol 2010; 264:711-21. [PMID: 20371250 PMCID: PMC3130497 DOI: 10.1016/j.jtbi.2010.03.046] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Received: 12/01/2009] [Revised: 03/06/2010] [Accepted: 03/30/2010] [Indexed: 12/11/2022]
Abstract
General guidelines for the molecular basis of functional variation are presented while focused on the rotating circular genetic code and allowable exchanges that make it resistant to genetic diseases under normal conditions. The rules of variation, bioinformatics aids for preventative medicine, are: (1) same position in the four quadrants for hydrophobic codons, (2) same or contiguous position in two quadrants for synonymous or related codons, and (3) same quadrant for equivalent codons. To preserve protein function, amino acid exchange according to the first rule takes into account the positional homology of essential hydrophobic amino acids with every codon with a central uracil in the four quadrants, the second rule includes codons for identical, acidic, or their amidic amino acids present in two quadrants, and the third rule, the smaller, aromatic, stop codons, and basic amino acids, each in proximity within a 90 degree angle. I also define codifying genes and palindromati, CTCGTGCCGAATTCGGCACGAG.
|
10
|
Certain non-standard coding tables appear to be more robust to error than the standard genetic code. J Mol Evol 2009; 70:13-28. [PMID: 20012032 DOI: 10.1007/s00239-009-9303-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 06/17/2009] [Accepted: 11/10/2009] [Indexed: 10/20/2022]
Abstract
Since the identification of the Standard Coding Table as a "universal" method to translate genetic information into amino acids, exceptions to this rule have been reported, and to date there are nearly 20 alternative genetic coding tables deployed by either nuclear genomes or organelles of organisms. Why are these codes still in use and why are new codon reassignments occurring? This present study aims to provide a new method to address these questions and to analyze whether these alternative codes present any advantages or disadvantages to the organisms or organelles in terms of robustness to error. We show that two of the alternative coding tables, The Ciliate, Dasycladacean and Hexamita Nuclear Code (CDH) and The Flatworm Mitochondrial Code (FMC), exhibit an advantage, while others such as The Yeast Mitochondrial Code (YMC) are at a significant disadvantage. We propose that the Standard Code is likely to have emerged as a "local minimum" and that the "coding landscape" is still being searched for a "global" minimum.
|
11
|
Baranov PV, Venin M, Provan G. Codon size reduction as the origin of the triplet genetic code. PLoS One 2009; 4:e5708. [PMID: 19479032 PMCID: PMC2682656 DOI: 10.1371/journal.pone.0005708] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.6] [Received: 02/10/2009] [Accepted: 04/22/2009] [Indexed: 11/26/2022] Open
Abstract
The genetic code appears to be optimized in its robustness to missense errors and frameshift errors. In addition, the genetic code is near-optimal in terms of its ability to carry information in addition to the sequences of encoded proteins. As evolution has no foresight, optimality of the modern genetic code suggests that it evolved from less optimal code variants. The length of codons in the genetic code is also optimal, as three is the minimal nucleotide combination that can encode the twenty standard amino acids. The apparent impossibility of transitions between codon sizes in a discontinuous manner during evolution has resulted in an unbending view that the genetic code was always triplet. Yet, recent experimental evidence on quadruplet decoding, as well as the discovery of organisms with ambiguous and dual decoding, suggest that the possibility of the evolution of triplet decoding from living systems with non-triplet decoding merits reconsideration and further exploration. To explore this possibility we designed a mathematical model of the evolution of primitive digital coding systems which can decode nucleotide sequences into protein sequences. These coding systems can evolve their nucleotide sequences via genetic events of Darwinian evolution, such as point-mutations. The replication rates of such coding systems depend on the accuracy of the generated protein sequences. Computer simulations based on our model show that decoding systems with codons of length greater than three spontaneously evolve into predominantly triplet decoding systems. Our findings suggest a plausible scenario for the evolution of the triplet genetic code in a continuous manner. This scenario suggests an explanation of how protein synthesis could be accomplished by means of long RNA-RNA interactions prior to the emergence of the complex decoding machinery, such as the ribosome, that is required for stabilization and discrimination of otherwise weak triplet codon-anticodon interactions.
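The statement that three is the minimal codon length is a counting argument: with four nucleotides, codons of length n give 4^n combinations, and 4^2 = 16 falls short of 20 while 4^3 = 64 suffices. A small Python check of that arithmetic follows; the helper name `min_codon_length` and its parameters are hypothetical and not taken from the paper's model.

```python
def min_codon_length(n_meanings=20, alphabet_size=4):
    """Smallest n such that alphabet_size ** n >= n_meanings."""
    n = 1
    while alphabet_size ** n < n_meanings:
        n += 1
    return n


if __name__ == "__main__":
    # Twenty amino acids, with and without one extra meaning for a stop signal.
    print(min_codon_length(20), min_codon_length(21))
```

This covers only the combinatorial lower bound; the simulations described in the abstract concern how longer codons could evolve toward triplets, which this check does not model.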
Affiliation(s)
- Pavel V Baranov
  - Biochemistry Department, University College Cork, Cork, Ireland
|
12
|
Abstract
The genetic code is nearly universal, and the arrangement of the codons in the standard codon table is highly nonrandom. The three main concepts on the origin and evolution of the code are the stereochemical theory, according to which codon assignments are dictated by physicochemical affinity between amino acids and the cognate codons (anticodons); the coevolution theory, which posits that the code structure coevolved with amino acid biosynthesis pathways; and the error minimization theory under which selection to minimize the adverse effect of point mutations and translation errors was the principal factor of the code's evolution. These theories are not mutually exclusive and are also compatible with the frozen accident hypothesis, that is, the notion that the standard code might have no special properties but was fixed simply because all extant life forms share a common ancestor, with subsequent changes to the code, mostly, precluded by the deleterious effect of codon reassignment. Mathematical analysis of the structure and possible evolutionary trajectories of the code shows that it is highly robust to translational misreading but there are numerous more robust codes, so the standard code potentially could evolve from a random code via a short sequence of codon series reassignments. Thus, much of the evolution that led to the standard code could be a combination of frozen accident with selection for error minimization although contributions from coevolution of the code with metabolic pathways and weak affinities between amino acids and nucleotide triplets cannot be ruled out. However, such scenarios for the code evolution are based on formal schemes whose relevance to the actual primordial evolution is uncertain. A real understanding of the code origin and evolution is likely to be attainable only in conjunction with a credible scenario for the evolution of the coding principle itself and the translation system.
Affiliation(s)
- Eugene V Koonin
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
|
13
|
A statistical analysis of the robustness of alternate genetic coding tables. Int J Mol Sci 2008; 9:679-697. [PMID: 19325778 PMCID: PMC2635705 DOI: 10.3390/ijms9050679] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 12/18/2007] [Revised: 02/25/2008] [Accepted: 04/11/2008] [Indexed: 11/24/2022] Open
Abstract
The rules that specify how the information contained in DNA is translated into amino acid “language” during protein synthesis are called “the genetic code”, commonly called the “Standard” or “Universal” Genetic Code Table. As a matter of fact, this coding table is not at all “universal”: in addition to different genetic code tables used by different organisms, even within the same organism the nuclear and mitochondrial genes may be subject to two different coding tables. RESULTS: In an attempt to understand the advantages and disadvantages these coding tables may bring to an organism, we have decided to analyze various coding tables on genes subject to mutations, and have estimated how these genes “survive” over generations. We have used this as indicative of the “evolutionary” success of that particular coding table. We find that the “standard” genetic code is not actually the most robust of all coding tables, and interestingly, Flatworm Mitochondrial Code (FMC) appears to be the highest ranking coding table given our assumptions. CONCLUSIONS: It is commonly hypothesized that the more robust a genetic code, the better suited it is for maintenance of the genome. Our study shows that, given the assumptions in our model, Standard Genetic Code is quite poor when compared to other alternate code tables in terms of robustness. This brings about the question of why Standard Code has been so widely accepted by a wider variety of organisms instead of FMC, which needs to be addressed for a thorough understanding of genetic code evolution.
|
14
|
Wolf YI, Koonin EV. On the origin of the translation system and the genetic code in the RNA world by means of natural selection, exaptation, and subfunctionalization. Biol Direct 2007; 2:14. [PMID: 17540026 PMCID: PMC1894784 DOI: 10.1186/1745-6150-2-14] [Citation(s) in RCA: 130] [Impact Index Per Article: 7.2] [Received: 05/10/2007] [Accepted: 05/31/2007] [Indexed: 11/29/2022] Open
Abstract
BACKGROUND The origin of the translation system is, arguably, the central and the hardest problem in the study of the origin of life, and one of the hardest in all evolutionary biology. The problem has a clear catch-22 aspect: high translation fidelity can hardly be achieved without a complex, highly evolved set of RNAs and proteins, but an elaborate protein machinery could not evolve without an accurate translation system. The origin of the genetic code and whether it evolved on the basis of a stereochemical correspondence between amino acids and their cognate codons (or anticodons), through selectional optimization of the code vocabulary, as a "frozen accident" or via a combination of all these routes is another wide open problem despite extensive theoretical and experimental studies. Here we combine the results of comparative genomics of translation system components, data on interaction of amino acids with their cognate codons and anticodons, and data on catalytic activities of ribozymes to develop conceptual models for the origins of the translation system and the genetic code.
RESULTS Our main guide in constructing the models is the Darwinian Continuity Principle whereby a scenario for the evolution of a complex system must consist of plausible elementary steps, each conferring a distinct advantage on the evolving ensemble of genetic elements. Evolution of the translation system is envisaged to occur in a compartmentalized ensemble of replicating, co-selected RNA segments, i.e., in an RNA World containing ribozymes with versatile activities. Since evolution has no foresight, the translation system could not evolve in the RNA World as the result of selection for protein synthesis and must have been a by-product of evolution driven by selection for another function, i.e., the translation system evolved via the exaptation route. It is proposed that the evolutionary process that eventually led to the emergence of translation started with the selection for ribozymes binding abiogenic amino acids that stimulated ribozyme-catalyzed reactions. The proposed scenario for the evolution of translation consists of the following steps: (1) binding of amino acids to a ribozyme resulting in an enhancement of its catalytic activity; (2) evolution of the amino-acid-stimulated ribozyme into a peptide ligase (predecessor of the large ribosomal subunit) yielding, initially, a unique peptide activating the original ribozyme and, possibly, other ribozymes in the ensemble; (3) evolution of self-charging proto-tRNAs that were selected, initially, for accumulation of amino acids, and subsequently, for delivery of amino acids to the peptide ligase; (4) joining of the peptide ligase with a distinct RNA molecule (predecessor of the small ribosomal subunit) carrying a built-in template for more efficient, complementary binding of charged proto-tRNAs; (5) evolution of the ability of the peptide ligase to assemble peptides using exogenous RNAs as templates for complementary binding of charged proto-tRNAs, yielding peptides with the potential to activate different ribozymes; (6) evolution of the translocation function of the protoribosome leading to the production of increasingly longer peptides (the first proteins), i.e., the origin of translation. The specifics of the recognition of amino acids by proto-tRNAs and the origin of the genetic code depend on whether or not there is a physical affinity between amino acids and their cognate codons or anticodons, a problem that remains unresolved.
CONCLUSION We describe a stepwise model for the origin of the translation system in the ancient RNA world such that each step confers a distinct advantage onto an ensemble of co-evolving genetic elements. Under this scenario, the primary cause for the emergence of translation was the ability of amino acids and peptides to stimulate reactions catalyzed by ribozymes. Thus, the translation system might have evolved as the result of selection for ribozymes capable of, initially, efficient amino acid binding, and subsequently, synthesis of increasingly versatile peptides. Several aspects of this scenario are amenable to experimental testing.
Affiliation(s)
- Yuri I Wolf
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Eugene V Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
|
15
|
Duax WL, Huether R, Pletnev VZ, Langs D, Addlagatta A, Connare S, Habegger L, Gill J. Rational genomics I: antisense open reading frames and codon bias in short-chain oxido reductase enzymes and the evolution of the genetic code. Proteins 2006; 61:900-6. [PMID: 16245321 PMCID: PMC1476703 DOI: 10.1002/prot.20687] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Indexed: 11/11/2022]
Abstract
The short-chain oxidoreductase (SCOR) family of enzymes includes over 6000 members, extending from bacteria and archaea to humans. Nucleic acid sequence analysis reveals that significant numbers of these genes are remarkably free of stop codons in reading frames other than the coding frame, including those on the antisense strand. The genes from this subset also use almost entirely the GC-rich half of the 64 codons. Analysis of a million hypothetical genes having random nucleotide composition shows that the percentage of SCOR genes having multiple open reading frames exceeds random by a factor of as much as 1 × 10^6. Nevertheless, screening the content of the SWISS-PROT TrEMBL database reveals that 15% of all genes contain multiple open reading frames. The SCOR genes having multiple open reading frames and a GC-rich coding bias exhibit a similar GC bias in the nucleotide triple composition of their DNA. This bias is not correlated with the GC content of the species in which the SCOR genes are found. One possible explanation for the conservation of multiple open reading frames and extreme bias in nucleic acid composition in the family of Rossmann folds is that the primordial member of this family was encoded early using only very stable GC-rich DNA and that evolution proceeded with extremely limited introduction of any codons having two or more adenine or thymine nucleotides. These and other data suggest that the SCOR family of enzymes may even have diverged from a common ancestor before most of the AT-rich half of the genetic code was fully defined.
Affiliation(s)
- William L Duax
- Hauptman-Woodward Medical Research Institute, Buffalo, New York 14203, USA.
|
16
|
Abel DL, Trevors JT. Three subsets of sequence complexity and their relevance to biopolymeric information. Theor Biol Med Model 2005; 2:29. [PMID: 16095527 PMCID: PMC1208958 DOI: 10.1186/1742-4682-2-29] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.0] [Received: 05/23/2005] [Accepted: 08/11/2005] [Indexed: 11/24/2022] Open
Abstract
Genetic algorithms instruct sophisticated biological organization. Three qualitative kinds of sequence complexity exist: random (RSC), ordered (OSC), and functional (FSC). FSC alone provides algorithmic instruction. Random and Ordered Sequence Complexities lie at opposite ends of the same bi-directional sequence complexity vector. Randomness in sequence space is defined by a lack of Kolmogorov algorithmic compressibility. A sequence is compressible because it contains redundant order and patterns. Law-like cause-and-effect determinism produces highly compressible order. Such forced ordering precludes both information retention and freedom of selection so critical to algorithmic programming and control. Functional Sequence Complexity requires this added programming dimension of uncoerced selection at successive decision nodes in the string. Shannon information theory measures the relative degrees of RSC and OSC. Shannon information theory cannot measure FSC. FSC is invariably associated with all forms of complex biofunction, including biochemical pathways, cycles, positive and negative feedback regulation, and homeostatic metabolism. The algorithmic programming of FSC, not merely its aperiodicity, accounts for biological organization. No empirical evidence exists of either RSC or OSC ever having produced a single instance of sophisticated biological organization. Organization invariably manifests FSC rather than successive random events (RSC) or low-informational self-ordering phenomena (OSC).
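The contrast this abstract draws between random and ordered sequences can be illustrated with a small sketch. It is not taken from the paper: it assumes that `zlib` compression is an acceptable practical stand-in for Kolmogorov compressibility (it only gives an upper bound), and the function names and the two example sequences are hypothetical. Neither measure says anything about function, which is the abstract's point about FSC.

```python
import math
import random
import zlib
from collections import Counter


def shannon_entropy(seq):
    """Per-symbol Shannon entropy (bits) from single-symbol frequencies."""
    counts = Counter(seq)
    n = len(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def compression_ratio(seq):
    """Compressed size / raw size; a rough proxy for algorithmic compressibility."""
    raw = seq.encode("ascii")
    return len(zlib.compress(raw, 9)) / len(raw)


rng = random.Random(0)  # fixed seed so the example is reproducible
ordered = "AC" * 2000   # highly patterned, OSC-like sequence
random_seq = "".join(rng.choice("ACGT") for _ in range(4000))  # RSC-like sequence

for name, seq in [("ordered", ordered), ("random", random_seq)]:
    print(name, round(shannon_entropy(seq), 3), round(compression_ratio(seq), 3))
```

The ordered repeat compresses far more than the random sequence, while single-symbol entropy alone separates them less sharply, since it ignores the order of symbols.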
Affiliation(s)
- David L Abel
- Director, The Gene Emergence Project, The Origin-of-Life Foundation, Inc., 113 Hedgewood Dr., Greenbelt, MD 20770-1610 USA
- Jack T Trevors
- Professor, Department of Environmental Biology, University of Guelph, Rm 3220 Bovey Building, Guelph, Ontario, N1G 2W1, Canada
|
17
|
Swire J, Judson OP, Burt A. Mitochondrial Genetic Codes Evolve to Match Amino Acid Requirements of Proteins. J Mol Evol 2005; 60:128-39. [PMID: 15696375 DOI: 10.1007/s00239-004-0077-9] [Citation(s) in RCA: 38] [Impact Index Per Article: 1.9] [Received: 03/01/2004] [Accepted: 08/31/2004] [Indexed: 10/25/2022]
Abstract
Mitochondria often use genetic codes different from the standard genetic code. Now that many mitochondrial genomes have been sequenced, these variant codes provide the first opportunity to examine empirically the processes that produce new genetic codes. The key question is: Are codon reassignments the sole result of mutation and genetic drift? Or are they the result of natural selection? Here we present an analysis of 24 phylogenetically independent codon reassignments in mitochondria. Although the mutation-drift hypothesis can explain reassignments from stop to an amino acid, we found that it cannot explain reassignments from one amino acid to another. In particular--and contrary to the predictions of the mutation-drift hypothesis--the codon involved in such a reassignment was not rare in the ancestral genome. Instead, such reassignments appear to take place while the codon is in use at an appreciable frequency. Moreover, the comparison of inferred amino acid usage in the ancestral genome with the neutral expectation shows that the amino acid gaining the codon was selectively favored over the amino acid losing the codon. These results are consistent with a simple model of weak selection on the amino acid composition of proteins in which codon reassignments are selected because they compensate for multiple slightly deleterious mutations throughout the mitochondrial genome. We propose that the selection pressure is for reduced protein synthesis cost: most reassignments give amino acids that are less expensive to synthesize. Taken together, our results strongly suggest that mitochondrial genetic codes evolve to match the amino acid requirements of proteins.
Affiliation(s)
- Jonathan Swire
- Centre for Bioinformatics, Biochemistry Building, Department of Biological Sciences, Imperial College, London, SW7 2AY, UK.
|
18
|
Abstract
Since discovering the pattern by which amino acids are assigned to codons within the standard genetic code, investigators have explored the idea that natural selection placed biochemically similar amino acids near to one another in coding space so as to minimize the impact of mutations and/or mistranslations. The analytical evidence to support this theory has grown in sophistication and strength over the years, and counterclaims questioning its plausibility and quantitative support have yet to transcend some significant weaknesses in their approach. These weaknesses are illustrated here by means of a simple simulation model for adaptive genetic code evolution. There remain ill-explored facets of the 'error minimizing' code hypothesis, however, including the mechanism and pathway by which an adaptive pattern of codon assignments emerged, the extent to which natural selection created synonym redundancy, its role in shaping the amino acid and nucleotide languages, and even the correct interpretation of the adaptive codon assignment pattern: these represent fertile areas for future research.
Affiliation(s)
- Stephen J Freeland
- Department of Biology, University of Maryland, Baltimore County, Catonsville, MD, USA.
|
19
|
Abstract
The primordial genetic code was probably a drastically simplified ancestor of the canonical code that is used by contemporary cells. In order to understand how the present-day code came about we first need to explain how the language of the building plan can change without destroying the encoded information. In this work we introduce a minimal organism model that is based on biophysically reasonable descriptions of RNA and protein, namely secondary structure folding and knowledge-based potentials. The evolution of a population of such organisms under competition for a common resource is simulated explicitly at the level of individual replication events. Starting with very simple codes, and hence greatly reduced amino acid alphabets, we observe a diversification of the codes in most simulation runs. The driving force behind this effect is the possibility to produce fitter proteins when the repertoire of amino acids is enlarged.
Affiliation(s)
- Günter Weberndorfer
- Institut für Theoretische Chemie und Molekulare Strukturbiologie, Universität Wien, Wien, Austria
|
20
|
Kunichika K, Hashimoto Y, Imoto T. Robustness of hen lysozyme monitored by random mutations. Protein Eng Des Sel 2002; 15:805-9. [PMID: 12468714 DOI: 10.1093/protein/15.10.805] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.3] [Indexed: 11/14/2022] Open
Abstract
We investigated the robustness of hen lysozyme by using random mutant libraries. Six random mutant libraries containing 1, 1.5, 2, 3, 5 and 14 amino acid mutations per hen lysozyme were systematically constructed by varying the concentrations of Mg²⁺ and Mn²⁺ in the polymerase chain reaction. The mutated genes from the six libraries were cloned into a yeast expression vector and a total of 4000 clones were screened on the basis of lysis activity and ELISA employing a monoclonal antibody that recognized only lysozyme with native conformation. About 80% of the clones with an average of two amino acid mutations retained active structure. Almost all clones with an average of five mutations lost active structure. On the other hand, 80% of the clones with an average of two amino acid mutations retained both gross conformation and active structure and 24% of the clones with an average of 14 amino acid mutations retained gross conformation. These results show that gross conformation is robust against mutations and so is active structure to a lesser extent.
Affiliation(s)
- Kaori Kunichika
- Graduate School of Pharmaceutical Science, Kyushu University, Fukuoka 812-8582, Japan
|
21
|
Abstract
The construction of the genetic code is investigated based on a stability principle. The concept and formulation of mutational deterioration (MD) of the genetic code is proposed. It is proved that the degeneracies of codon multiplets obey the rule to best resist MD. The MD for each ideal multiplet of codons is expressed by four parameters and it takes on a minimum value for real distributions of codons in the multiplet. Then the global mutational deterioration (GMD) of the code table is calculated and the minimal code is deduced. The domain-like distribution of hydrophobic and hydrophilic amino acids in the genetic code is explained from the minimization of GMD. It is demonstrated that the standard code is approximately GMD-minimal. By introducing some constraints that are related to the initial condition of the system, we have deduced the standard genetic code from the minimization of GMD. The minimization shows the general trend of the evolutionary process toward some stable state while the constraints reflect a 'frozen accident.' Many deviant codon assignments are also explained through MD minimization assuming changeable degrees of degeneracy for some multiplets. So, a possible answer to the question of "Why are synonymous codons and amino acids distributed in the code table just as they are?" is given.
Affiliation(s)
- Liaofu Luo
- Department of Physics, Inner Mongolia University, Hohhot 010021, PR China.
|
22
|
Luo L, Li X. Coding rules for amino acids in the genetic code: the genetic code is a minimal code of mutational deterioration. Orig Life Evol Biosph 2002; 32:23-33. [PMID: 11889915 DOI: 10.1023/a:1013963505140] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.4] [Indexed: 11/12/2022]
Abstract
Coding rules for amino acids in the genetic code are discussed from the point that the genetic code is a minimal code of mutational deterioration. The global mutational deterioration (GMD) function is defined through several parameters describing single base mutations and amino acid distances. The problem of searching for the global minimum of the GMD function is discussed in some detail. From GMD minimization under initial constraints we have succeeded in deducing the standard genetic code.
Affiliation(s)
- Liaofu Luo
- Department of Physics, Inner Mongolia University, Hohhot, China.
|
23
|
Gilis D, Massar S, Cerf NJ, Rooman M. Optimality of the genetic code with respect to protein stability and amino-acid frequencies. Genome Biol 2001; 2:RESEARCH0049. [PMID: 11737948 PMCID: PMC60310 DOI: 10.1186/gb-2001-2-11-research0049] [Citation(s) in RCA: 142] [Impact Index Per Article: 5.9] [Received: 06/19/2001] [Revised: 07/06/2001] [Accepted: 09/28/2001] [Indexed: 11/26/2022] Open
Abstract
BACKGROUND The genetic code is known to be efficient in limiting the effect of mistranslation errors. A misread codon often codes for the same amino acid or one with similar biochemical properties, so the structure and function of the coded protein remain relatively unaltered. Previous studies have attempted to address this question quantitatively, by estimating the fraction of randomly generated codes that do better than the genetic code in respect of overall robustness. We extended these results by investigating the role of amino-acid frequencies in the optimality of the genetic code. RESULTS We found that taking the amino-acid frequency into account decreases the fraction of random codes that beat the natural code. This effect is particularly pronounced when more refined measures of the amino-acid substitution cost are used than hydrophobicity. To show this, we devised a new cost function by evaluating in silico the change in folding free energy caused by all possible point mutations in a set of protein structures. With this function, which measures protein stability while being unrelated to the code's structure, we estimated that around two random codes in a billion (10^9) are fitter than the natural code. When alternative codes are restricted to those that interchange biosynthetically related amino acids, the genetic code appears even more optimal. CONCLUSIONS These results lead us to discuss the role of amino-acid frequencies and other parameters in the genetic code's evolution, in an attempt to propose a tentative picture of primitive life.
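The kind of comparison this abstract describes, estimating the fraction of random codes that beat the natural code, can be sketched as follows. This is an illustration under stated assumptions, not the authors' method: it uses squared differences in Kyte-Doolittle hydropathy as the substitution cost (the paper's folding free energy cost function is not reproduced here), it generates random codes by permuting the 20 amino acids among the standard code's synonymous codon blocks while keeping stop codons fixed, and it treats all single-nucleotide substitutions as equally likely. The names `code_cost`, `random_code`, `fraction_better`, and the `aa_freq` parameter are hypothetical.

```python
import random
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; '*' marks stop codons.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]
STANDARD = dict(zip(CODONS, AMINO))

# Kyte-Doolittle hydropathy index (assumed cost basis for this sketch).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def code_cost(code, aa_freq=None):
    """Weighted mean squared hydropathy change over single-nucleotide substitutions.

    If aa_freq is given, each codon is weighted by its amino acid's frequency
    divided by the number of codons assigned to that amino acid.
    Substitutions to or from stop codons are ignored.
    """
    n_codons = Counter(code.values())
    total = norm = 0.0
    for codon, aa in code.items():
        if aa == "*":
            continue
        w = 1.0 if aa_freq is None else aa_freq.get(aa, 0.0) / n_codons[aa]
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant_aa = code[codon[:pos] + base + codon[pos + 1:]]
                if mutant_aa == "*":
                    continue
                total += w * (HYDROPATHY[aa] - HYDROPATHY[mutant_aa]) ** 2
                norm += w
    return total / norm


def random_code(rng):
    """Permute amino acids among the standard code's synonymous blocks."""
    aas = sorted(set(AMINO) - {"*"})
    shuffled = aas[:]
    rng.shuffle(shuffled)
    mapping = dict(zip(aas, shuffled))
    mapping["*"] = "*"
    return {codon: mapping[aa] for codon, aa in STANDARD.items()}


def fraction_better(n_codes, seed=0, aa_freq=None):
    """Fraction of sampled random codes with lower cost than the standard code."""
    rng = random.Random(seed)
    reference = code_cost(STANDARD, aa_freq)
    better = sum(code_cost(random_code(rng), aa_freq) < reference
                 for _ in range(n_codes))
    return better / n_codes


print(fraction_better(1000))
```

A sample of a thousand codes cannot resolve fractions near two in a billion; resolving values that small would need a far larger sample or an analytical estimate, and the result depends strongly on the chosen cost function.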
Affiliation(s)
- D Gilis
- Biomolecular Engineering, Université Libre de Bruxelles, ave F D Roosevelt, 1050 Bruxelles, Belgium.
|
24
|
|
25
|
Abstract
The nature of the role played by mobile elements in host genome evolution is reassessed considering numerous recent developments in many areas of biology. It is argued that easy popular appellations such as "selfish DNA" and "junk DNA" may be either inaccurate or misleading and that a more enlightened view of the transposable element-host relationship encompasses a continuum from extreme parasitism to mutualism. Transposable elements are potent, broad spectrum, endogenous mutators that are subject to the influence of chance as well as selection at several levels of biological organization. Of particular interest are transposable element traits that early evolve neutrally at the host level but at a later stage of evolution are co-opted for new host functions.
Affiliation(s)
- M G Kidwell
- Department of Ecology and Evolutionary Biology, The University of Arizona, Tucson 85721, USA.
|
26
|
|
27
|
Abstract
Biological diversity has evolved despite the essentially infinite complexity of protein sequence space. We present a hierarchical approach to the efficient searching of this space and quantify the evolutionary potential of our approach with Monte Carlo simulations. These simulations demonstrate that nonhomologous juxtaposition of encoded structure is the rate-limiting step in the production of new tertiary protein folds. Nonhomologous "swapping" of low-energy secondary structures increased the binding constant of a simulated protein by approximately 10^7 relative to base substitution alone. Applications of our approach include the generation of new protein folds and modeling the molecular evolution of disease.
Affiliation(s)
- L D Bogarad
- Division of Biology, California Institute of Technology, Pasadena, CA 91125, USA
|