1
Rozhoňová H, Martí-Gómez C, McCandlish DM, Payne JL. Robust genetic codes enhance protein evolvability. PLoS Biol 2024; 22:e3002594. [PMID: 38754362 PMCID: PMC11098591 DOI: 10.1371/journal.pbio.3002594] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2023] [Accepted: 03/19/2024] [Indexed: 05/18/2024]
Abstract
The standard genetic code defines the rules of translation for nearly every life form on Earth. It also determines the amino acid changes accessible via single-nucleotide mutations, thus influencing protein evolvability, the ability of mutation to bring forth adaptive variation in protein function. One of the most striking features of the standard genetic code is its robustness to mutation, yet it remains an open question whether such robustness facilitates or frustrates protein evolvability. To answer this question, we use data from massively parallel sequence-to-function assays to construct and analyze 6 empirical adaptive landscapes under hundreds of thousands of rewired genetic codes, including those of codon compression schemes relevant to protein engineering and synthetic biology. We find that robust genetic codes tend to enhance protein evolvability by rendering smooth adaptive landscapes with few peaks, which are readily accessible from throughout sequence space. However, the standard genetic code is rarely exceptional in this regard, because many alternative codes render smoother landscapes than the standard code. By constructing low-dimensional visualizations of these landscapes, which each comprise more than 16 million mRNA sequences, we show that such alternative codes radically alter the topological features of the network of high-fitness genotypes. Whereas the genetic codes that optimize evolvability depend to some extent on the detailed relationship between amino acid sequence and protein function, we also uncover general design principles for engineering nonstandard genetic codes for enhanced and diminished evolvability, which may facilitate directed protein evolution experiments and the bio-containment of synthetic organisms, respectively.
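The abstract's point that a genetic code determines which amino acid changes are reachable by single-nucleotide mutations can be illustrated with a minimal sketch. The function names below are hypothetical, and `synonymous_fraction` is only a crude proxy for robustness chosen for illustration; it is not the robustness measure or the landscape analysis used in the paper.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), with codons ordered
# TTT, TTC, TTA, TTG, TCT, ... (third position varies fastest).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def single_nucleotide_neighbors(codon):
    """Yield the 9 codons that differ from `codon` at exactly one position."""
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                yield codon[:pos] + base + codon[pos + 1:]


def accessible_amino_acids(codon, table=CODON_TABLE):
    """Amino acids reachable from `codon` by one substitution.

    Stop codons ("*") and the amino acid already encoded are excluded.
    """
    original = table[codon]
    return {
        table[mutant]
        for mutant in single_nucleotide_neighbors(codon)
        if table[mutant] not in ("*", original)
    }


def synonymous_fraction(table=CODON_TABLE):
    """Fraction of single-nucleotide substitutions from sense codons that
    leave the encoded amino acid unchanged (illustrative proxy only)."""
    same = total = 0
    for codon, aa in table.items():
        if aa == "*":
            continue
        for mutant in single_nucleotide_neighbors(codon):
            total += 1
            same += table[mutant] == aa
    return same / total


print(sorted(accessible_amino_acids("ATG")))
print(round(synonymous_fraction(), 3))
```

Passing a different `table` dictionary (a rewired code) to either function would show how the set of accessible amino acids changes under that code.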
Affiliation(s)
- Hana Rozhoňová
  - Institute of Integrative Biology, ETH Zürich, Zürich, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Carlos Martí-Gómez
  - Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- David M. McCandlish
  - Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- Joshua L. Payne
  - Institute of Integrative Biology, ETH Zürich, Zürich, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
2
Gabzi T, Pilpel Y, Friedlander T. Fitness landscape analysis of a tRNA gene reveals that the wild type allele is sub-optimal, yet mutationally robust. Mol Biol Evol 2022; 39:6670756. [PMID: 35976926 PMCID: PMC9447856 DOI: 10.1093/molbev/msac178] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/30/2022]
Abstract
Fitness landscape mapping and the prediction of evolutionary trajectories on these landscapes are major tasks in evolutionary biology research. Evolutionary dynamics is tightly linked to the landscape topography, but this relation is not straightforward. Here, we analyze a fitness landscape of a yeast tRNA gene, previously measured under four different conditions. We find that the wild type allele is sub-optimal, and 8–10% of its variants are fitter. We rule out the possibilities that the wild type is fittest on average across these four conditions or is located at a local fitness maximum. Notwithstanding, we cannot exclude the possibility that the wild type might be fittest in some of the many conditions in the complex ecology in which yeast lives. Instead, we find that the wild type is mutationally robust (“flat”), while more fit variants are typically mutationally fragile. Similar observations of mutational robustness or flatness have so far been made in very few cases, predominantly in viral genomes.
Affiliation(s)
- Tzahi Gabzi
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 7610001, Israel
- Yitzhak Pilpel
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 7610001, Israel
- Tamar Friedlander
  - The Robert H. Smith Institute of Plant Sciences and Genetics in Agriculture, Faculty of Agriculture, Hebrew University of Jerusalem, 229 Herzl St., Rehovot 7610001, Israel
3
Schweizer G, Wagner A. Both Binding Strength and Evolutionary Accessibility Affect the Population Frequency of Transcription Factor Binding Sequences in Arabidopsis thaliana. Genome Biol Evol 2021; 13:6459646. [PMID: 34894231 PMCID: PMC8712246 DOI: 10.1093/gbe/evab273] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/06/2021] [Indexed: 11/22/2022]
Abstract
Mutations in DNA sequences that bind transcription factors and thus modulate gene expression are a source of adaptive variation in gene expression. To understand how transcription factor binding sequences evolve in natural populations of the thale cress Arabidopsis thaliana, we integrated genomic polymorphism data for loci bound by transcription factors with in vitro data on binding affinity for these transcription factors. Specifically, we studied 19 different transcription factors, and the allele frequencies of 8,333 genomic loci bound in vivo by these transcription factors in 1,135 A. thaliana accessions. We find that transcription factor binding sequences show very low genetic diversity, suggesting that they are subject to purifying selection. High frequency alleles of such binding sequences tend to bind transcription factors strongly. Conversely, alleles that are absent from the population tend to bind them weakly. In addition, alleles with high frequencies also tend to be the endpoints of many accessible evolutionary paths leading to these alleles. We show that both high affinity and high evolutionary accessibility contribute to high allele frequency for at least some transcription factors. Although binding sequences with stronger affinity are more frequent, we did not find them to be associated with higher gene expression levels. Epistatic interactions among individual mutations that alter binding affinity are pervasive and can help explain variation in accessibility among binding sequences. In summary, combining in vitro binding affinity data with in vivo binding sequence data can help understand the forces that affect the evolution of transcription factor binding sequences in natural populations.
Affiliation(s)
- Gabriel Schweizer
  - Department of Evolutionary Biology and Environmental Studies, University of Zürich, Switzerland
  - Swiss Institute of Bioinformatics, Quartier Sorge-Batiment Genopode, Lausanne, Switzerland
- Andreas Wagner
  - Department of Evolutionary Biology and Environmental Studies, University of Zürich, Switzerland
  - Swiss Institute of Bioinformatics, Quartier Sorge-Batiment Genopode, Lausanne, Switzerland
  - Santa Fe Institute, Santa Fe, New Mexico, USA
  - Stellenbosch Institute for Advanced Study (STIAS), Wallenberg Research Centre at Stellenbosch University, South Africa
4
McCandlish DM. Long-term evolution on complex fitness landscapes when mutation is weak. Heredity (Edinb) 2018; 121:449-465. [PMID: 30232363 PMCID: PMC6180110 DOI: 10.1038/s41437-018-0142-6] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 12/31/2017] [Revised: 08/04/2018] [Accepted: 08/06/2018] [Indexed: 12/25/2022]
Abstract
Understanding evolution on complex fitness landscapes is difficult both because of the large dimensionality of sequence space and the stochasticity inherent to population-genetic processes. Here, I present an integrated suite of mathematical tools for understanding evolution on time-invariant fitness landscapes when mutations occur sufficiently rarely that the population is typically monomorphic and evolution can be modeled as a sequence of well-separated fixation events. The basic intuition behind this suite of tools is that surrounding any particular genotype lies a region of the fitness landscape that is easy to evolve to, while other pieces of the fitness landscape are difficult to evolve to (due to distance, being across a fitness valley, etc.). I propose a rigorous definition for this "dynamical neighborhood" of a genotype which captures several aspects of the distribution of waiting times to evolve from one genotype to another. The neighborhood structure of the landscape as a whole can be summarized as a matrix, and I show how this matrix can be used to approximate the expected waiting time for certain evolutionary events to occur and to provide an intuitive interpretation to existing formal results on the index of dispersion of the molecular clock.
Affiliation(s)
- David M McCandlish
  - Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
5
Abstract
The sequence space of five protein superfamilies was investigated by constructing sequence networks. The nodes represent individual sequences, and two nodes are connected by an edge if the global sequence identity of two sequences exceeds a threshold. The networks were characterized by their degree distribution (number of nodes with a given number of neighbors) and by their fractal network dimension. Although the five protein families differed in sequence length, fold, and domain arrangement, their network properties were similar. The fractal network dimension Df was distance-dependent: a high dimension for single and double mutants (Df = 4.0), which dropped to Df = 0.7-1.0 at 90% sequence identity, and increased to Df = 3.5-4.5 below 70% sequence identity. The distance dependency of the network dimension is consistent with evolutionary constraints for functional proteins. While random single and double mutations often result in a functional protein, the accumulation of more than ten mutations is dominated by epistasis. The networks of the five protein families were highly inhomogeneous with few highly connected communities ("hub sequences") and a large number of smaller and less connected communities. The degree distributions followed a power-law distribution with similar scaling exponents close to 1. Because the hub sequences have a large number of functional neighbors, they are expected to be robust toward possible deleterious effects of mutations. Because of their robustness, hub sequences have the potential of high innovability, with additional mutations readily inducing new functions. Therefore, they form hotspots of evolution and are promising candidates as starting points for directed evolution experiments in biotechnology.
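The network construction described in this abstract (one node per sequence, an edge wherever pairwise identity exceeds a threshold, then a degree distribution) can be sketched as follows. This is a minimal illustration under a simplifying assumption: the sequences are taken to be pre-aligned and of equal length, so identity is a per-position match fraction rather than a true global alignment score. The function names and toy sequences are hypothetical.

```python
from collections import Counter
from itertools import combinations


def identity(a, b):
    """Fraction of matching positions between two equal-length sequences.

    Assumption: the sequences are already aligned; a real analysis would
    compute identity from a global pairwise alignment.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have equal length (pre-aligned)")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def build_network(sequences, threshold):
    """Adjacency sets: nodes i and j are linked if identity > threshold."""
    adjacency = {i: set() for i in range(len(sequences))}
    for i, j in combinations(range(len(sequences)), 2):
        if identity(sequences[i], sequences[j]) > threshold:
            adjacency[i].add(j)
            adjacency[j].add(i)
    return adjacency


def degree_distribution(adjacency):
    """Map each degree k to the number of nodes that have k neighbors."""
    return Counter(len(neighbors) for neighbors in adjacency.values())


toy = ["AAAA", "AAAT", "AATT", "TTTT"]  # toy data, not from the study
net = build_network(toy, threshold=0.7)
print(dict(degree_distribution(net)))
```

The all-pairs loop is quadratic in the number of sequences, which is fine for a toy example but would need indexing or clustering heuristics for superfamily-scale data.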
6
Aguilar‐Rodríguez J, Peel L, Stella M, Wagner A, Payne JL. The architecture of an empirical genotype-phenotype map. Evolution 2018; 72:1242-1260. [PMID: 29676774 PMCID: PMC6055911 DOI: 10.1111/evo.13487] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 05/16/2017] [Accepted: 04/03/2018] [Indexed: 12/15/2022]
Abstract
Recent advances in high-throughput technologies are bringing the study of empirical genotype-phenotype (GP) maps to the fore. Here, we use data from protein-binding microarrays to study an empirical GP map of transcription factor (TF) -binding preferences. In this map, each genotype is a DNA sequence. The phenotype of this DNA sequence is its ability to bind one or more TFs. We study this GP map using genotype networks, in which nodes represent genotypes with the same phenotype, and edges connect nodes if their genotypes differ by a single small mutation. We describe the structure and arrangement of genotype networks within the space of all possible binding sites for 525 TFs from three eukaryotic species encompassing three kingdoms of life (animal, plant, and fungi). We thus provide a high-resolution depiction of the architecture of an empirical GP map. Among a number of findings, we show that these genotype networks are "small-world" and assortative, and that they ubiquitously overlap and interface with one another. We also use polymorphism data from Arabidopsis thaliana to show how genotype network structure influences the evolution of TF-binding sites in vivo. We discuss our findings in the context of regulatory evolution.
Affiliation(s)
- José Aguilar‐Rodríguez
  - Department of Evolutionary Biology and Environmental Studies, University of Zurich, Zurich, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - Current Address: Department of Biology, Stanford University, Stanford, CA, USA; Department of Chemical and Systems Biology, Stanford University, Stanford, CA, USA
- Leto Peel
  - Institute of Information and Communication Technologies, Electronics and Applied Mathematics, Université Catholique de Louvain, Louvain‐la‐Neuve, Belgium
  - Namur Center for Complex Systems, University of Namur, Namur, Belgium
- Massimo Stella
  - Institute for Complex Systems Simulation, Department of Electronics and Computer Science, University of Southampton, Southampton, United Kingdom
- Andreas Wagner
  - Department of Evolutionary Biology and Environmental Studies, University of Zurich, Zurich, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - The Santa Fe Institute, Santa Fe, New Mexico, USA
- Joshua L. Payne
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - Institute for Integrative Biology, ETH, Zurich, Switzerland
7
Epistasis and the Dynamics of Reversion in Molecular Evolution. Genetics 2016; 203:1335-51. [PMID: 27194749 DOI: 10.1534/genetics.116.188961] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.7] [Received: 03/08/2016] [Accepted: 04/27/2016] [Indexed: 12/27/2022]
Abstract
Recent studies of protein evolution contend that the longer an amino acid substitution is present at a site, the less likely it is to revert to the amino acid previously occupying that site. Here we study this phenomenon of decreasing reversion rates rigorously and in a much more general context. We show that, under weak mutation and for arbitrary fitness landscapes, reversion rates decrease with time for any site that is involved in at least one epistatic interaction. Specifically, we prove that, at stationarity, the hazard function of the distribution of waiting times until reversion is strictly decreasing for any such site. Thus, in the presence of epistasis, the longer a particular character has been absent from a site, the less likely the site will revert to its prior state. We also explore several examples of this general result, which share a common pattern whereby the probability of having reverted increases rapidly at short times to some substantial value before becoming almost flat after a few substitutions at other sites. This pattern indicates a characteristic tendency for reversion to occur either almost immediately after the initial substitution or only after a very long time.
8
Abstract
Epistatic interactions can frustrate and shape evolutionary change. Indeed, phenotypes may fail to evolve when essential mutations are only accessible through positive selection if they are fixed simultaneously. How environmental variability affects such constraints is poorly understood. Here, we studied genetic constraints in fixed and fluctuating environments using the Escherichia coli lac operon as a model system for genotype-environment interactions. We found that, in different fixed environments, all trajectories that were reconstructed by applying point mutations within the transcription factor-operator interface became trapped at suboptima, where no additional improvements were possible. Paradoxically, repeated switching between these same environments allows unconstrained adaptation by continuous improvements. This evolutionary mode is explained by pervasive cross-environmental tradeoffs that reposition the peaks in such a way that trapped genotypes can repeatedly climb ascending slopes and hence escape adaptive stasis. Using a Markov approach, we developed a mathematical framework to quantify the landscape-crossing rates and show that this ratchet-like adaptive mechanism is robust in a wide spectrum of fluctuating environments. Overall, this study shows that genetic constraints can be overcome by environmental change and that cross-environmental tradeoffs do not necessarily impede but can also facilitate adaptive evolution. Because tradeoffs and environmental variability are ubiquitous in nature, we speculate that this evolutionary mode is of general relevance.
9
Survival probability of beneficial mutations in bacterial batch culture. Genetics 2015; 200:309-20. [PMID: 25758382 DOI: 10.1534/genetics.114.172890] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 11/19/2014] [Accepted: 03/07/2015] [Indexed: 01/17/2023]
Abstract
The survival of rare beneficial mutations can be extremely sensitive to the organism's life history and the trait affected by the mutation. Given the tremendous impact of bacteria in batch culture as a model system for the study of adaptation, it is important to understand the survival probability of beneficial mutations in these populations. Here we develop a life-history model for bacterial populations in batch culture and predict the survival of mutations that increase fitness through their effects on specific traits: lag time, fission time, viability, and the timing of stationary phase. We find that if beneficial mutations are present in the founding population at the beginning of culture growth, mutations that reduce the mortality of daughter cells are the most likely to survive drift. In contrast, of mutations that occur de novo during growth, those that delay the onset of stationary phase are the most likely to survive. Our model predicts that approximately fivefold population growth between bottlenecks will optimize the occurrence and survival of beneficial mutations of all four types. This prediction is relatively insensitive to other model parameters, such as the lag time, fission time, or mortality rate of the population. We further estimate that bottlenecks that are more severe than this optimal prediction substantially reduce the occurrence and survival of adaptive mutations.
10
Topological features of rugged fitness landscapes in sequence space. Trends Genet 2015; 31:24-33. [DOI: 10.1016/j.tig.2014.09.009] [Citation(s) in RCA: 64] [Impact Index Per Article: 6.4] [Received: 05/18/2014] [Revised: 09/17/2014] [Accepted: 09/18/2014] [Indexed: 12/22/2022]
11
McCandlish DM, Stoltzfus A. Modeling evolution using the probability of fixation: history and implications. Q Rev Biol 2014; 89:225-52. [PMID: 25195318 DOI: 10.1086/677571] [Citation(s) in RCA: 102] [Impact Index Per Article: 9.3] [Indexed: 11/03/2022]
Abstract
Many models of evolution calculate the rate of evolution by multiplying the rate at which new mutations originate within a population by a probability of fixation. Here we review the historical origins, contemporary applications, and evolutionary implications of these "origin-fixation" models, which are widely used in evolutionary genetics, molecular evolution, and phylogenetics. Origin-fixation models were first introduced in 1969, in association with an emerging view of "molecular" evolution. Early origin-fixation models were used to calculate an instantaneous rate of evolution across a large number of independently evolving loci; in the 1980s and 1990s, a second wave of origin-fixation models emerged to address a sequence of fixation events at a single locus. Although origin-fixation models have been applied to a broad array of problems in contemporary evolutionary research, their rise in popularity has not been accompanied by an increased appreciation of their restrictive assumptions or their distinctive implications. We argue that origin-fixation models constitute a coherent theory of mutation-limited evolution that contrasts sharply with theories of evolution that rely on the presence of standing genetic variation. A major unsolved question in evolutionary biology is the degree to which these models provide an accurate approximation of evolution in natural populations.
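The origin-fixation calculation summarized in this abstract (rate of evolution = rate at which new mutations originate × probability of fixation) can be written out as a small worked example. The sketch below assumes one particular, commonly used choice for the fixation probability, Kimura's diffusion approximation for a new semi-dominant mutation in a diploid population of size N with effective size equal to N; the abstract itself does not commit to this formula, and the function names and parameter values are hypothetical.

```python
import math


def fixation_probability(s, N):
    """Kimura's diffusion approximation for a single new mutant.

    Assumptions (not from the abstract): diploid population of size N,
    effective size equal to N, selection coefficient s for the
    heterozygote, initial frequency 1 / (2N).
    """
    if abs(s) < 1e-12:
        return 1.0 / (2 * N)  # neutral limit
    try:
        # (1 - exp(-2s)) / (1 - exp(-4Ns)), written with expm1 for accuracy
        return math.expm1(-2 * s) / math.expm1(-4 * N * s)
    except OverflowError:
        return 0.0  # strongly deleterious: fixation is effectively impossible


def substitution_rate(mu, s, N):
    """Origin-fixation rate: 2N*mu new mutations per generation,
    each fixing with probability fixation_probability(s, N)."""
    return 2 * N * mu * fixation_probability(s, N)


mu, N = 1e-8, 1000  # illustrative values only
for s in (-0.01, 0.0, 0.01):
    print(s, substitution_rate(mu, s, N))
```

In the neutral case the population size cancels and the rate reduces to the mutation rate `mu`, which is a quick sanity check on the two functions.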
12
Sammons RD, Gaines TA. Glyphosate resistance: state of knowledge. Pest Manag Sci 2014; 70:1367-77. [PMID: 25180399 PMCID: PMC4260172 DOI: 10.1002/ps.3743] [Citation(s) in RCA: 247] [Impact Index Per Article: 22.5] [Received: 11/05/2013] [Revised: 01/17/2014] [Accepted: 01/25/2014] [Indexed: 05/18/2023]
Abstract
Studies of mechanisms of resistance to glyphosate have increased current understanding of herbicide resistance mechanisms. Thus far, single-codon non-synonymous mutations of EPSPS (5-enolpyruvylshikimate-3-phosphate synthase) have been rare and, relative to other herbicide mode of action target-site mutations, unconventionally weak in magnitude for resistance to glyphosate. However, it is possible that weeds will emerge with non-synonymous mutations of two codons of EPSPS to produce an enzyme endowing greater resistance to glyphosate. Today, target-gene duplication is a common glyphosate resistance mechanism and could become a fundamental process for developing any resistance trait. Based on competition and substrate selectivity studies in several species, rapid vacuole sequestration of glyphosate occurs via a transporter mechanism. Conversely, as the chloroplast requires transporters for uptake of important metabolites, transporters associated with the two plastid membranes may separately, or together, successfully block glyphosate delivery. A model based on finite glyphosate dose and limiting time required for chloroplast loading sets the stage for understanding how uniquely different mechanisms can contribute to overall glyphosate resistance.
Affiliation(s)
- Todd A Gaines
  - Department of Bioagricultural Sciences and Pest Management, Colorado State University, Fort Collins, CO, USA
13
Martin CH, Wainwright PC. On the measurement of ecological novelty: scale-eating pupfish are separated by 168 my from other scale-eating fishes. PLoS One 2013; 8:e71164. [PMID: 23976994 PMCID: PMC3747246 DOI: 10.1371/journal.pone.0071164] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.0] [Received: 02/12/2013] [Accepted: 07/03/2013] [Indexed: 11/28/2022]
Abstract
The colonization of new adaptive zones is widely recognized as one of the hallmarks of adaptive radiation. However, the adoption of novel resources during this process is rarely distinguished from phenotypic change because morphology is a common proxy for ecology. How can we quantify ecological novelty independent of phenotype? Our study is split into two parts: we first document a remarkable example of ecological novelty, scale-eating (lepidophagy), within a rapidly-evolving adaptive radiation of Cyprinodon pupfishes on San Salvador Island, Bahamas. This specialized predatory niche is known in several other fish groups, but is not found elsewhere among the 1,500 species of atherinomorphs. Second, we quantify this ecological novelty by measuring the time-calibrated phylogenetic distance in years to the most closely-related species with convergent ecology. We find that scale-eating pupfish are separated by 168 million years of evolution from the nearest scale-eating fish. We apply this approach to a variety of examples and highlight the frequent decoupling of ecological novelty from phenotypic divergence. We observe that novel ecology is not always tightly correlated with rates of phenotypic or species diversification, particularly within recent adaptive radiations, necessitating the use of additional measures of ecological novelty independent of phenotype.
Affiliation(s)
- Christopher H. Martin
  - Department of Evolution and Ecology and Center for Population Biology, University of California Davis, Davis, California, United States of America
- Peter C. Wainwright
  - Department of Evolution and Ecology and Center for Population Biology, University of California Davis, Davis, California, United States of America
14
Experiments on the role of deleterious mutations as stepping stones in adaptive evolution. Proc Natl Acad Sci U S A 2013; 110:E3171-8. [PMID: 23918358 DOI: 10.1073/pnas.1313424110] [Citation(s) in RCA: 67] [Impact Index Per Article: 5.6] [Indexed: 11/18/2022]
Abstract
Many evolutionary studies assume that deleterious mutations necessarily impede adaptive evolution. However, a later mutation that is conditionally beneficial may interact with a deleterious predecessor before it is eliminated, thereby providing access to adaptations that might otherwise be inaccessible. It is unknown whether such sign-epistatic recoveries are inconsequential events or an important factor in evolution, owing to the difficulty of monitoring the effects and fates of all mutations during experiments with biological organisms. Here, we used digital organisms to compare the extent of adaptive evolution in populations when deleterious mutations were disallowed with control populations in which such mutations were allowed. Significantly higher fitness levels were achieved over the long term in the control populations because some of the deleterious mutations served as stepping stones across otherwise impassable fitness valleys. As a consequence, initially deleterious mutations facilitated the evolution of complex, beneficial functions. We also examined the effects of disallowing neutral mutations, of varying the mutation rate, and of sexual recombination. Populations evolving without neutral mutations were able to leverage deleterious and compensatory mutation pairs to overcome, at least partially, the absence of neutral mutations. Substantially raising or lowering the mutation rate reduced or eliminated the long-term benefit of deleterious mutations, but introducing recombination did not. Our work demonstrates that deleterious mutations can play an important role in adaptive evolution under at least some conditions.