1
Zurita AMI, Kyriazis CC, Lohmueller KE. The impact of non-neutral synonymous mutations when inferring selection on non-synonymous mutations. bioRxiv : the preprint server for biology 2024:2024.02.07.579314. [PMID: 38370782 PMCID: PMC10871344 DOI: 10.1101/2024.02.07.579314] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/20/2024]
Abstract
The distribution of fitness effects (DFE) describes the proportions of new mutations that have different effects on reproductive fitness. Accurate measurements of the DFE are important because the DFE is a fundamental parameter in evolutionary genetics and has implications for our understanding of other phenomena like complex disease or inbreeding depression. Current computational methods to infer the DFE for nonsynonymous mutations from natural variation first estimate demographic parameters from synonymous variants to control for the effects of demography and background selection. Then, conditional on these parameters, the DFE is inferred for nonsynonymous mutations. This approach relies on the assumption that synonymous variants are neutrally evolving. However, some evidence points toward synonymous mutations having measurable effects on fitness. To test whether selection on synonymous mutations affects inference of the DFE of nonsynonymous mutations, we simulated several possible models of selection on synonymous mutations using SLiM and attempted to recover the DFE of nonsynonymous mutations using Fit∂a∂i, a common method for DFE inference. Our results show that the presence of selection on synonymous variants leads to incorrect inferences of recent population growth. Furthermore, under certain parameter combinations, inferences of the DFE can have an inflated proportion of highly deleterious nonsynonymous mutations. However, this bias can be eliminated if the correct demographic parameters are used for DFE inference instead of the biased ones inferred from synonymous variants. Our work demonstrates how unmodeled selection on synonymous mutations may affect downstream inferences of the DFE.
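To make "an inflated proportion of highly deleterious nonsynonymous mutations" more concrete: DFE results are often summarized as the share of new mutations falling into bins of the selection coefficient |s|. The sketch below assumes a gamma-distributed DFE and estimates those shares by simple Monte Carlo sampling. The shape and scale values, the bin edges, and the function name `dfe_bin_proportions` are hypothetical choices for illustration only. They are not taken from the study, and the code does not use SLiM or Fit∂a∂i.

```python
import random

def dfe_bin_proportions(shape, scale, edges, n_draws=100_000, seed=1):
    """Monte Carlo estimate of the share of new mutations in each |s| bin.

    Assumes |s| follows a gamma distribution (an illustrative assumption).
    `edges` are ascending bin boundaries; the last bin is open-ended.
    """
    rng = random.Random(seed)
    counts = [0] * (len(edges) + 1)
    for _ in range(n_draws):
        s = rng.gammavariate(shape, scale)  # |s| of one new mutation
        k = sum(1 for e in edges if s >= e)  # number of edges at or below s
        counts[k] += 1
    return [c / n_draws for c in counts]

# Hypothetical bin edges and gamma parameters, not values from the paper.
EDGES = [1e-5, 1e-4, 1e-3, 1e-2]
baseline = dfe_bin_proportions(shape=0.2, scale=0.05, edges=EDGES)
shifted = dfe_bin_proportions(shape=0.2, scale=0.5, edges=EDGES)

labels = ["<1e-5", "1e-5 to 1e-4", "1e-4 to 1e-3", "1e-3 to 1e-2", ">=1e-2"]
for label, b, s in zip(labels, baseline, shifted):
    print(f"|s| {label:>13}: baseline {b:.3f}  shifted {s:.3f}")
```

Comparing the last bin of the two runs shows how a change in the inferred gamma scale moves probability mass toward strongly deleterious mutations, which is the kind of shift the abstract describes as a bias.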
Affiliation(s)
- Aina Martinez I Zurita
  - Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, USA
- Christopher C Kyriazis
  - Department of Ecology and Evolutionary Biology, University of California, Los Angeles, USA
- Kirk E Lohmueller
  - Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, USA
  - Interdepartmental Program in Bioinformatics, University of California, Los Angeles, USA
  - Department of Ecology and Evolutionary Biology, University of California, Los Angeles, USA
2
James J, Kastally C, Budde KB, González-Martínez SC, Milesi P, Pyhäjärvi T, Lascoux M. Between but Not Within-Species Variation in the Distribution of Fitness Effects. Mol Biol Evol 2023; 40:msad228. [PMID: 37832225 PMCID: PMC10630145 DOI: 10.1093/molbev/msad228] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Revised: 09/04/2023] [Accepted: 09/25/2023] [Indexed: 10/15/2023] Open
Abstract
New mutations provide the raw material for evolution and adaptation. The distribution of fitness effects (DFE) describes the spectrum of effects of new mutations that can occur along a genome, and is, therefore, of vital interest in evolutionary biology. Recent work has uncovered striking similarities in the DFE between closely related species, prompting us to ask whether there is variation in the DFE among populations of the same species, or among species with different degrees of divergence, that is whether there is variation in the DFE at different levels of evolution. Using exome capture data from six tree species sampled across Europe we characterized the DFE for multiple species, and for each species, multiple populations, and investigated the factors potentially influencing the DFE, such as demography, population divergence, and genetic background. We find statistical support for the presence of variation in the DFE at the species level, even among relatively closely related species. However, we find very little difference at the population level, suggesting that differences in the DFE are primarily driven by deep features of species biology, and those evolutionarily recent events, such as demographic changes and local adaptation, have little impact.
Affiliation(s)
- Jennifer James
  - Department of Ecology and Genetics, Uppsala University, Uppsala, Sweden
  - Swedish Collegium of Advanced Study, Uppsala University, Uppsala, Sweden
- Chedly Kastally
  - Department of Forest Sciences, University of Helsinki, Helsinki, Finland
  - Viikki Plant Science Centre, University of Helsinki, Helsinki, Finland
- Katharina B Budde
  - Department of Forest Genetics and Forest Tree Breeding, Georg-August-University Goettingen, Goettingen, Germany
  - Center of Biodiversity and Sustainable Land Use (CBL), University of Goettingen, Goettingen, Germany
- Santiago C González-Martínez
  - National Research Institute for Agriculture, Food and the Environment (INRAE), University of Bordeaux, BIOGECO, Cestas, France
- Pascal Milesi
  - Department of Ecology and Genetics, Uppsala University, Uppsala, Sweden
  - Science for Life Laboratory (SciLifeLab), Uppsala University, Uppsala, Sweden
- Tanja Pyhäjärvi
  - Department of Forest Sciences, University of Helsinki, Helsinki, Finland
  - Viikki Plant Science Centre, University of Helsinki, Helsinki, Finland
- Martin Lascoux
  - Department of Ecology and Genetics, Uppsala University, Uppsala, Sweden
3
Savageau MA. Phenotype Design Space Provides a Mechanistic Framework Relating Molecular Parameters to Phenotype Diversity Available for Selection. J Mol Evol 2023; 91:687-710. [PMID: 37620617 PMCID: PMC10598110 DOI: 10.1007/s00239-023-10127-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/13/2023] [Accepted: 07/27/2023] [Indexed: 08/26/2023]
Abstract
Two long-standing challenges in theoretical population genetics and evolution are predicting the distribution of phenotype diversity generated by mutation and available for selection, and determining the interaction of mutation, selection and drift to characterize evolutionary equilibria and dynamics. More fundamental for enabling such predictions is the current inability to causally link genotype to phenotype. There are three major mechanistic mappings required for such a linking - genetic sequence to kinetic parameters of the molecular processes, kinetic parameters to biochemical system phenotypes, and biochemical phenotypes to organismal phenotypes. This article introduces a theoretical framework, the Phenotype Design Space (PDS) framework, for addressing these challenges by focusing on the mapping of kinetic parameters to biochemical system phenotypes. It provides a quantitative theory whose key features include (1) a mathematically rigorous definition of phenotype based on biochemical kinetics, (2) enumeration of the full phenotypic repertoire, and (3) functional characterization of each phenotype independent of its context-dependent selection or fitness contributions. This framework is built on Design Space methods that relate system phenotypes to genetically determined parameters and environmentally determined variables. It also has the potential to automate prediction of phenotype-specific mutation rate constants and equilibrium distributions of phenotype diversity in microbial populations undergoing steady-state exponential growth, which provides an ideal reference to which more realistic cases can be compared. Although the framework is quite general and flexible, the details will undoubtedly differ for different functions, organisms and contexts. Here a hypothetical case study involving a small molecular system, a primordial circadian clock, is used to introduce this framework and to illustrate its use in a particular case. 
The framework is built on fundamental biochemical kinetics. Thus, the foundation is based on linear algebra and reasonable physical assumptions, which provide numerous opportunities for experimental testing and further elaboration to deal with complex multicellular organisms that are currently beyond its scope. The discussion provides a comparison of results from the PDS framework with those from other approaches in theoretical population genetics.
Affiliation(s)
- Michael A Savageau
  - Department of Microbiology & Molecular Genetics, University of California, 228 Briggs, Davis, CA, 95616, USA.
  - Department of Biomedical Engineering, University of California, One Shields Avenue, Davis, CA, 95616, USA.
4
Andersson BA, Zhao W, Haller BC, Brännström Å, Wang XR. Inference of the distribution of fitness effects of mutations is affected by single nucleotide polymorphism filtering methods, sample size and population structure. Mol Ecol Resour 2023; 23:1589-1603. [PMID: 37340611 DOI: 10.1111/1755-0998.13825] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2022] [Revised: 06/02/2023] [Accepted: 06/08/2023] [Indexed: 06/22/2023]
Abstract
The distribution of fitness effects (DFE) of new mutations has been of interest to evolutionary biologists since the concept of mutations arose. Modern population genomic data enable us to quantify the DFE empirically, but few studies have examined how data processing, sample size and cryptic population structure might affect the accuracy of DFE inference. We used simulated and empirical data (from Arabidopsis lyrata) to show the effects of missing data filtering, sample size, number of single nucleotide polymorphisms (SNPs) and population structure on the accuracy and variance of DFE estimates. Our analyses focus on three filtering methods-downsampling, imputation and subsampling-with sample sizes of 4-100 individuals. We show that (1) the choice of missing-data treatment directly affects the estimated DFE, with downsampling performing better than imputation and subsampling; (2) the estimated DFE is less reliable in small samples (<8 individuals), and becomes unpredictable with too few SNPs (<5000, the sum of 0- and 4-fold SNPs); and (3) population structure may skew the inferred DFE towards more strongly deleterious mutations. We suggest that future studies should consider downsampling for small data sets, and use samples larger than 4 (ideally larger than 8) individuals, with more than 5000 SNPs in order to improve the robustness of DFE inference and enable comparative analyses.
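The abstract does not spell out how downsampling is carried out. One widely used way to bring sites with missing data to a common sample size is hypergeometric projection of the site frequency spectrum (SFS). The sketch below assumes that interpretation, and the function name `project_sfs` is hypothetical. An SFS observed in n sampled chromosomes is projected down to m < n by averaging over all ways of drawing m chromosomes without replacement.

```python
from math import comb

def project_sfs(sfs, m):
    """Project an SFS from n = len(sfs) - 1 chromosomes down to m chromosomes.

    sfs[i] is the number of sites where the derived allele is seen i times
    among n chromosomes. A site with i derived copies contributes to entry j
    of the smaller spectrum with the hypergeometric probability of drawing
    j derived alleles in m draws without replacement.
    """
    n = len(sfs) - 1
    if not 0 < m <= n:
        raise ValueError("target size m must satisfy 0 < m <= n")
    projected = [0.0] * (m + 1)
    for i, count in enumerate(sfs):
        if count == 0:
            continue
        lo = max(0, m - (n - i))  # fewest derived alleles that can be drawn
        hi = min(i, m)            # most derived alleles that can be drawn
        for j in range(lo, hi + 1):
            projected[j] += count * comb(i, j) * comb(n - i, m - j) / comb(n, m)
    return projected

# Ten singleton sites observed in 4 chromosomes, projected to 2 chromosomes.
example = project_sfs([0, 10, 0, 0, 0], 2)
print(example)
```

Projection keeps the total number of sites unchanged but moves some polymorphic sites into the monomorphic classes, which is one reason small target sizes leave fewer usable SNPs.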
Affiliation(s)
- Wei Zhao
  - Department of Ecology and Environmental Sciences, Umeå University, Umeå, Sweden
- Benjamin C Haller
  - Department of Computational Biology, Cornell University, Ithaca, New York, USA
- Åke Brännström
  - Department of Mathematics and Mathematical Statistics, Umeå University, Umeå, Sweden
  - Advancing Systems Analysis Program, International Institute for Applied Systems Analysis, Laxenburg, Austria
  - Complexity Science and Evolution Unit, Okinawa Institute of Science and Technology Graduate University, Kunigami, Japan
- Xiao-Ru Wang
  - Department of Ecology and Environmental Sciences, Umeå University, Umeå, Sweden
5
Wientjes YCJ, Bijma P, van den Heuvel J, Zwaan BJ, Vitezica ZG, Calus MPL. The long-term effects of genomic selection: 2. Changes in allele frequencies of causal loci and new mutations. Genetics 2023; 225:iyad141. [PMID: 37506255 PMCID: PMC10471209 DOI: 10.1093/genetics/iyad141] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Revised: 05/17/2023] [Accepted: 07/18/2023] [Indexed: 07/30/2023] Open
Abstract
Genetic selection has been applied for many generations in animal, plant, and experimental populations. Selection changes the allelic architecture of traits to create genetic gain. It remains unknown whether the changes in allelic architecture are different for the recently introduced technique of genomic selection compared to traditional selection methods and whether they depend on the genetic architectures of traits. Here, we investigate the allele frequency changes of old and new causal loci under 50 generations of phenotypic, pedigree, and genomic selection, for a trait controlled by either additive, additive and dominance, or additive, dominance, and epistatic effects. Genomic selection resulted in slightly larger and faster changes in allele frequencies of causal loci than pedigree selection. For each locus, allele frequency change per generation was not only influenced by its statistical additive effect but also to a large extent by the linkage phase with other loci and its allele frequency. Selection fixed a large number of loci, and 5 times more unfavorable alleles became fixed with genomic and pedigree selection than with phenotypic selection. For pedigree selection, this was mainly a result of increased genetic drift, while genetic hitchhiking had a larger effect on genomic selection. When epistasis was present, the average allele frequency change was smaller (∼15% lower), and a lower number of loci became fixed for all selection methods. We conclude that for long-term genetic improvement using genomic selection, it is important to consider hitchhiking and to limit the loss of favorable alleles.
Affiliation(s)
- Yvonne C J Wientjes
  - Animal Breeding and Genomics, Wageningen University & Research, 6700 AH Wageningen, The Netherlands
- Piter Bijma
  - Animal Breeding and Genomics, Wageningen University & Research, 6700 AH Wageningen, The Netherlands
- Joost van den Heuvel
  - Laboratory of Genetics, Wageningen University & Research, 6700 AH Wageningen, The Netherlands
- Bas J Zwaan
  - Laboratory of Genetics, Wageningen University & Research, 6700 AH Wageningen, The Netherlands
- Mario P L Calus
  - Animal Breeding and Genomics, Wageningen University & Research, 6700 AH Wageningen, The Netherlands
6
Fernández-Calvet A, Toribio-Celestino L, Alonso-del Valle A, Sastre-Dominguez J, Valdes-Chiara P, San Millan A, DelaFuente J. The distribution of fitness effects of plasmid pOXA-48 in clinical enterobacteria. Microbiology (Reading, England) 2023; 169:001369. [PMID: 37505800 PMCID: PMC10433420 DOI: 10.1099/mic.0.001369] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/05/2023] [Accepted: 07/12/2023] [Indexed: 07/29/2023]
Abstract
Antimicrobial resistance (AMR) in bacteria is a major public health problem. The main route for AMR acquisition in clinically important bacteria is the horizontal transfer of plasmids carrying resistance genes. AMR plasmids allow bacteria to survive antibiotics, but they also entail physiological alterations in the host cell. Multiple studies over the last few years have indicated that these alterations can translate into a fitness cost when antibiotics are absent. However, due to technical limitations, most of these studies are based on analysing new associations between plasmids and bacteria generated in vitro, and we know very little about the effects of plasmids in their native bacterial hosts. In this study, we used a CRISPR-Cas9-tool to selectively cure plasmids from clinical enterobacteria to overcome this limitation. Using this approach, we were able to study the fitness effects of the carbapenem resistance plasmid pOXA-48 in 35 pOXA-48-carrying isolates recovered from hospitalized patients. Our results revealed that pOXA-48 produces variable effects across the collection of wild-type enterobacterial strains naturally carrying the plasmid, ranging from fitness costs to fitness benefits. Importantly, the plasmid was only associated with a significant fitness reduction in four out of 35 clones, and produced no significant changes in fitness in the great majority of isolates. Our results suggest that plasmids produce neutral fitness effects in most native bacterial hosts, helping to explain the great prevalence of plasmids in natural microbial communities.
Affiliation(s)
- Alvaro San Millan
  - Centro Nacional de Biotecnología (CNB-CSIC), Madrid, Spain
  - Centro de Investigación Biológica en Red de Epidemiología y Salud Pública (CIBERESP), Instituto de Salud Carlos III, Madrid, Spain
7
Cotto O, Day T. A null model for the distribution of fitness effects of mutations. Proc Natl Acad Sci U S A 2023; 120:e2218200120. [PMID: 37252948 PMCID: PMC10266029 DOI: 10.1073/pnas.2218200120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2022] [Accepted: 04/28/2023] [Indexed: 06/01/2023] Open
Abstract
The distribution of fitness effects (DFE) of new mutations is key to our understanding of many evolutionary processes. Theoreticians have developed several models to help understand the patterns seen in empirical DFEs. Many such models reproduce the broad patterns seen in empirical DFEs but these models often rely on structural assumptions that cannot be tested empirically. Here, we investigate how much of the underlying "microscopic" biological processes involved in the mapping of new mutations to fitness can be inferred from "macroscopic" observations of the DFE. We develop a null model by generating random genotype-to-fitness maps and show that the null DFE is that with the largest possible information entropy. We further show that, subject to one simple constraint, this null DFE is a Gompertz distribution. Finally, we illustrate how the predictions of this null DFE match empirically measured DFEs from several datasets, as well as DFEs simulated from Fisher's geometric model. This suggests that a match between models and empirical data is often not a very strong indication of the mechanisms underlying the mapping of mutation to fitness.
Affiliation(s)
- Olivier Cotto
  - Department of Mathematics and Statistics, Queens University, Kingston, ON, K7L 3N6, Canada
  - Department of Biology, Queens University, Kingston, ON, K7L 3N6, Canada
  - Plant Health Institute Montpellier, Université Montpellier, Institut National de Recherche pour l’Agriculture, l’alimentation et l’Environnement, Centre de coopération Internationale en Recherche Agronomique pour le Développement, Institut de Recherche pour le Développement, Institut Agro, Montpellier, F-34398, France
- Troy Day
  - Department of Mathematics and Statistics, Queens University, Kingston, ON, K7L 3N6, Canada
  - Department of Biology, Queens University, Kingston, ON, K7L 3N6, Canada
8
Robinson J, Kyriazis CC, Yuan SC, Lohmueller KE. Deleterious Variation in Natural Populations and Implications for Conservation Genetics. Annu Rev Anim Biosci 2023; 11:93-114. [PMID: 36332644 PMCID: PMC9933137 DOI: 10.1146/annurev-animal-080522-093311] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Indexed: 11/06/2022]
Abstract
Deleterious mutations decrease reproductive fitness and are ubiquitous in genomes. Given that many organisms face ongoing threats of extinction, there is interest in elucidating the impact of deleterious variation on extinction risk and optimizing management strategies accounting for such mutations. Quantifying deleterious variation and understanding the effects of population history on deleterious variation are complex endeavors because we do not know the strength of selection acting on each mutation. Further, the effect of demographic history on deleterious mutations depends on the strength of selection against the mutation and the degree of dominance. Here we clarify how deleterious variation can be quantified and studied in natural populations. We then discuss how different demographic factors, such as small population size, nonequilibrium population size changes, inbreeding, and gene flow, affect deleterious variation. Lastly, we provide guidance on studying deleterious variation in nonmodel populations of conservation concern.
Affiliation(s)
- Jacqueline Robinson
  - Institute for Human Genetics, University of California, San Francisco, California, USA
- Christopher C Kyriazis
  - Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California, USA
- Stella C Yuan
  - Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California, USA
- Kirk E Lohmueller
  - Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California, USA
  - Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, California, USA
9
Evolutionary scaling of maximum growth rate with organism size. Sci Rep 2022; 12:22586. [PMID: 36585440 PMCID: PMC9803686 DOI: 10.1038/s41598-022-23626-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2022] [Accepted: 11/02/2022] [Indexed: 12/31/2022] Open
Abstract
Data from nearly 1000 species reveal the upper bound to rates of biomass production achievable by natural selection across the Tree of Life. For heterotrophs, maximum growth rates scale positively with organism size in bacteria but negatively in eukaryotes, whereas for phototrophs, the scaling is negligible for cyanobacteria and weakly negative for eukaryotes. These results have significant implications for understanding the bioenergetic consequences of the transition from prokaryotes to eukaryotes, and of the expansion of some groups of the latter into multicellularity. The magnitudes of the scaling coefficients for eukaryotes are significantly lower than expected under any proposed physical-constraint model. Supported by genomic, bioenergetic, and population-genetic data and theory, an alternative hypothesis for the observed negative scaling in eukaryotes postulates that growth-diminishing mutations with small effects passively accumulate with increasing organism size as a consequence of associated increases in the power of random genetic drift. In contrast, conditional on the structural and functional features of ribosomes, natural selection has been able to promote bacteria with the fastest possible growth rates, implying minimal conflicts with both bioenergetic constraints and random genetic drift. If this extension of the drift-barrier hypothesis is correct, the interpretations of comparative studies of biological traits that have traditionally ignored differences in population-genetic environments will require revisiting.
10
Abstract
Viruses are the most abundant biological entities on Earth, and yet, they have not received enough consideration in astrobiology. Viruses are also extraordinarily diverse, which is evident in the types of relationships they establish with their host, their strategies to store and replicate their genetic information and the enormous diversity of genes they contain. A viral population, especially if it corresponds to a virus with an RNA genome, can contain an array of sequence variants that greatly exceeds what is present in most cell populations. The fact that viruses always need cellular resources to multiply means that they establish very close interactions with cells. Although in the short term these relationships may appear to be negative for life, it is evident that they can be beneficial in the long term. Viruses are one of the most powerful selective pressures that exist, accelerating the evolution of defense mechanisms in the cellular world. They can also exchange genetic material with the host during the infection process, providing organisms with capacities that favor the colonization of new ecological niches or confer an advantage over competitors, just to cite a few examples. In addition, viruses have a relevant participation in the biogeochemical cycles of our planet, contributing to the recycling of the matter necessary for the maintenance of life. Therefore, although viruses have traditionally been excluded from the tree of life, the structure of this tree is largely the result of the interactions that have been established throughout the intertwined history of the cellular and the viral worlds. We do not know how other possible biospheres outside our planet could be, but it is clear that viruses play an essential role in the terrestrial one. Therefore, they must be taken into account both to improve our understanding of life that we know, and to understand other possible lives that might exist in the cosmos.
Affiliation(s)
- Ignacio de la Higuera
  - Department of Biology, Center for Life in Extreme Environments, Portland State University, Portland, OR, United States
- Ester Lázaro
  - Centro de Astrobiología (CAB), CSIC-INTA, Torrejón de Ardoz, Spain
11
Unpredictable repeatability in molecular evolution. Proc Natl Acad Sci U S A 2022; 119:e2209373119. [PMID: 36122210 PMCID: PMC9522380 DOI: 10.1073/pnas.2209373119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022] Open
Abstract
The extent of parallel evolution at the genotypic level is quantitatively linked to the distribution of beneficial fitness effects (DBFE) of mutations. The standard view, based on light-tailed distributions (i.e., distributions with finite moments), is that the probability of parallel evolution in duplicate populations is inversely proportional to the number of available mutations and, moreover, that the DBFE is sufficient to determine the probability when the number of available mutations is large. Here, we show that when the DBFE is heavy-tailed, as found in several recent experiments, these expectations are defied. The probability of parallel evolution decays anomalously slowly in the number of mutations or even becomes independent of it, implying higher repeatability of evolution. At the same time, the probability of parallel evolution is non-self-averaging—that is, it does not converge to its mean value, even when a large number of mutations are involved. This behavior arises because the evolutionary process is dominated by only a few mutations of high weight. Consequently, the probability varies widely across systems with the same DBFE. Contrary to the standard view, the DBFE is no longer sufficient to determine the extent of parallel evolution, making it much less predictable. We illustrate these ideas theoretically and through analysis of empirical data on antibiotic-resistance evolution.
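The contrast between light-tailed and heavy-tailed DBFEs can be explored numerically. The abstract gives no formula, so the sketch below adopts a common modelling assumption: each of n available beneficial mutations fixes with probability proportional to its selection coefficient, so the chance that two replicate populations fix the same mutation is the sum of squared normalized effects. The exponential and Pareto distributions, the tail index 0.5, and all function names are illustrative choices, not taken from the paper.

```python
import random

def parallel_prob(effects):
    """Probability that two replicate populations fix the same mutation,
    assuming fixation probability is proportional to the selection coefficient."""
    total = sum(effects)
    return sum((s / total) ** 2 for s in effects)

def mean_parallel_prob(draw, n, reps=500, seed=0):
    """Average parallel_prob over `reps` independent sets of n mutations."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(reps):
        acc += parallel_prob([draw(rng) for _ in range(n)])
    return acc / reps

def light_tailed(rng):
    return rng.expovariate(1.0)   # exponential: all moments finite

def heavy_tailed(rng):
    return rng.paretovariate(0.5)  # Pareto with tail index 0.5: infinite mean

for n in (10, 100):
    light = mean_parallel_prob(light_tailed, n)
    heavy = mean_parallel_prob(heavy_tailed, n)
    print(f"n={n:4d}  light-tailed {light:.4f}  heavy-tailed {heavy:.4f}")
```

Under these assumptions the exponential case follows the known 2/(n+1) decay, while the heavy-tailed case stays well above it as n grows, because a few very large effects dominate the sum. Inspecting individual replicates rather than the mean would also show the wide system-to-system variation the abstract describes.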
12
Extreme purifying selection against point mutations in the human genome. Nat Commun 2022; 13:4312. [PMID: 35879308 PMCID: PMC9314448 DOI: 10.1038/s41467-022-31872-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/10/2021] [Accepted: 07/07/2022] [Indexed: 12/13/2022] Open
Abstract
Large-scale genome sequencing has enabled the measurement of strong purifying selection in protein-coding genes. Here we describe a new method, called ExtRaINSIGHT, for measuring such selection in noncoding as well as coding regions of the human genome. ExtRaINSIGHT estimates the prevalence of “ultraselection” by the fractional depletion of rare single-nucleotide variants, after controlling for variation in mutation rates. Applying ExtRaINSIGHT to 71,702 whole genome sequences from gnomAD v3, we find abundant ultraselection in evolutionarily ancient miRNAs and neuronal protein-coding genes, as well as at splice sites. By contrast, we find much less ultraselection in other noncoding RNAs and transcription factor binding sites, and only modest levels in ultraconserved elements. We estimate that ~0.4–0.7% of the human genome is ultraselected, implying ~0.26–0.51 strongly deleterious mutations per generation. Overall, our study sheds new light on the genome-wide distribution of fitness effects by combining deep sequencing data and classical theory from population genetics.
Previous work has investigated selection in the coding genome, but it is not as well characterized in the non-coding genome. By analyzing rare variants in 70k genome sequences from gnomAD, the authors detect very strong purifying selection (“ultraselection”) across the human genome, finding it in some microRNAs and coding sequences but generally rare in regulatory sequences.
13
Bao K, Melde RH, Sharp NP. Are mutations usually deleterious? A perspective on the fitness effects of mutation accumulation. Evol Ecol 2022; 36:753-766. [DOI: 10.1007/s10682-022-10187-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/28/2022]
14
Wendlandt CE, Roberts M, Nguyen KT, Graham ML, Lopez Z, Helliwell EE, Friesen ML, Griffitts JS, Price P, Porter SS. Negotiating mutualism: A locus for exploitation by rhizobia has a broad effect size distribution and context-dependent effects on legume hosts. J Evol Biol 2022; 35:844-854. [PMID: 35506571 PMCID: PMC9325427 DOI: 10.1111/jeb.14011] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/07/2021] [Revised: 03/07/2022] [Accepted: 04/02/2022] [Indexed: 01/02/2023]
Abstract
In mutualisms, variation at genes determining partner fitness provides the raw material upon which coevolutionary selection acts, setting the dynamics and pace of coevolution. However, we know little about variation in the effects of genes that underlie symbiotic fitness in natural mutualist populations. In some species of legumes that form root nodule symbioses with nitrogen‐fixing rhizobial bacteria, hosts secrete nodule‐specific cysteine‐rich (NCR) peptides that cause rhizobia to differentiate in the nodule environment. However, rhizobia can cleave NCR peptides through the expression of genes like the plasmid‐borne Host range restriction peptidase (hrrP), whose product degrades specific NCR peptides. Although hrrP activity can confer host exploitation by depressing host fitness and enhancing symbiont fitness, the effects of hrrP on symbiosis phenotypes depend strongly on the genotypes of the interacting partners. However, the effects of hrrP have yet to be characterised in a natural population context, so its contribution to variation in wild mutualist populations is unknown. To understand the distribution of effects of hrrP in wild rhizobia, we measured mutualism phenotypes conferred by hrrP in 12 wild Ensifer medicae strains. To evaluate context dependency of hrrP effects, we compared hrrP effects across two Medicago polymorpha host genotypes and across two experimental years for five E. medicae strains. We show for the first time in a natural population context that hrrP has a wide distribution of effect sizes for many mutualism traits, ranging from strongly positive to strongly negative. Furthermore, we show that hrrP effect size varies across host genotypes and experiment years, suggesting that researchers should be cautious about extrapolating the role of genes in natural populations from controlled laboratory studies of single genetic variants.
Affiliation(s)
- Camille E Wendlandt
  - School of Biological Sciences, Washington State University, Vancouver, Washington, USA
- Miles Roberts
  - School of Biological Sciences, Washington State University, Vancouver, Washington, USA
- Kyle T Nguyen
  - School of Biological Sciences, Washington State University, Vancouver, Washington, USA
- Marion L Graham
  - Biology Department, Eastern Michigan University, Ypsilanti, Michigan, USA
- Zoie Lopez
  - School of Biological Sciences, Washington State University, Vancouver, Washington, USA
- Emily E Helliwell
  - School of Biological Sciences, Washington State University, Vancouver, Washington, USA
- Maren L Friesen
  - Department of Plant Pathology, Washington State University, Pullman, Washington, USA
  - Department of Crop & Soil Sciences, Washington State University, Pullman, Washington, USA
- Joel S Griffitts
  - Department of Microbiology and Molecular Biology, Brigham Young University, Provo, Utah, USA
- Paul Price
  - Biology Department, Eastern Michigan University, Ypsilanti, Michigan, USA
- Stephanie S Porter
  - School of Biological Sciences, Washington State University, Vancouver, Washington, USA
|
15
|
Density fluctuations, homeostasis, and reproduction effects in bacteria. Commun Biol 2022; 5:397. [PMID: 35484403 PMCID: PMC9050864 DOI: 10.1038/s42003-022-03348-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2021] [Accepted: 04/10/2022] [Indexed: 12/02/2022] Open
Abstract
Single-cells grow by increasing their biomass and size. Here, we report that while mass and size accumulation rates of single Escherichia coli cells are exponential, their density and, thus, the levels of macromolecular crowding fluctuate during growth. As such, the average rates of mass and size accumulation of a single cell are generally not the same, but rather cells differentiate into increasing one rate with respect to the other. This differentiation yields a density homeostasis mechanism that we support mathematically. Further, we observe that density fluctuations can affect the reproduction rates of single cells, suggesting a link between the levels of macromolecular crowding with metabolism and overall population fitness. We detail our experimental approach and the “invisible” microfluidic arrays that enabled increased precision and throughput. Infections and natural communities start from a few cells, thus, emphasizing the significance of density-fluctuations when taking non-genetic variability into consideration. Quantitative imaging, invisible microfluidics, and mathematical models demonstrate how the density of single E. coli cells fluctuates during the cell cycle, unmasking key homeostasis and population fitness effects.
|
16
|
Pontz M, Bürger R. The effects of epistasis and linkage on the invasion of locally beneficial mutations and the evolution of genomic islands. Theor Popul Biol 2022; 144:49-69. [DOI: 10.1016/j.tpb.2022.01.004] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/07/2021] [Revised: 01/04/2022] [Accepted: 01/07/2022] [Indexed: 11/26/2022]
|
17
|
Chen J, Bataillon T, Glémin S, Lascoux M. What does the distribution of fitness effects of new mutations reflect? Insights from plants. New Phytol 2022; 233:1613-1619. [PMID: 34704271 DOI: 10.1111/nph.17826] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/01/2021] [Accepted: 09/28/2021] [Indexed: 06/13/2023]
Abstract
The distribution of fitness effects (DFE) of new mutations plays a central role in molecular evolution. It is therefore crucial to be able to estimate it accurately from genomic data and to understand the factors that shape it. After a rapid overview of available methods to characterize the fitness effects of mutations, we review what is known on the factors affecting them in plants. Available data indicate that life history traits (e.g. mating system and longevity) have a major effect on the DFE. By contrast, the impact of demography within species appears to be more limited. These results remain to be confirmed, and methods to estimate the joint evolution of demography, life history traits, and the DFE need to be developed.
Affiliation(s)
- Jun Chen
  - College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, 310058, China
- Thomas Bataillon
  - Bioinformatics Research Centre, Aarhus University, C.F. Møllers Allé 8, Aarhus C, DK-8000, Denmark
- Sylvain Glémin
  - Centre National de la Recherche Scientifique (CNRS), ECOBIO (Ecosystèmes, Biodiversité, Evolution) - Unité Mixte de Recherche (UMR) 6553, Université de Rennes, Rennes, F-35000, France
  - Program in Plant Ecology and Evolution, Department of Ecology and Genetics, Evolutionary Biology Centre, Uppsala University, Uppsala, 75236, Sweden
- Martin Lascoux
  - Program in Plant Ecology and Evolution, Department of Ecology and Genetics, Evolutionary Biology Centre, Uppsala University, Uppsala, 75236, Sweden
|
18
|
Ortega-Del Vecchyo D, Lohmueller KE, Novembre J. Haplotype-based inference of the distribution of fitness effects. Genetics 2022; 220:6501446. [PMID: 35100400 PMCID: PMC8982047 DOI: 10.1093/genetics/iyac002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2021] [Accepted: 12/18/2021] [Indexed: 11/13/2022] Open
Abstract
Recent genome sequencing studies with large sample sizes in humans have discovered a vast quantity of low-frequency variants, providing an important source of information to analyze how selection is acting on human genetic variation. In order to estimate the strength of natural selection acting on low-frequency variants, we have developed a likelihood-based method that uses the lengths of pairwise identity-by-state between haplotypes carrying low-frequency variants. We show that in some non-equilibrium populations (such as those that have had recent population expansions) it is possible to distinguish between positive or negative selection acting on a set of variants. With our new framework, one can infer a fixed selection intensity acting on a set of variants at a particular frequency, or a distribution of selection coefficients for standing variants and new mutations. We show an application of our method to the UK10K phased haplotype dataset of individuals.
Affiliation(s)
- Diego Ortega-Del Vecchyo
  - Laboratorio Internacional de Investigación sobre el Genoma Humano, Universidad Nacional Autónoma de México, Juriquilla, Querétaro, 76230, México
  - Interdepartmental Program in Bioinformatics, University of California, Los Angeles, Los Angeles, California, 90095, United States of America
- Kirk E Lohmueller
  - Interdepartmental Program in Bioinformatics, University of California, Los Angeles, Los Angeles, California, 90095, United States of America
  - Department of Ecology and Evolutionary Biology, University of California, Los Angeles, Los Angeles, California, 90095, United States of America
  - Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, California, 90095, United States of America
- John Novembre
  - Department of Human Genetics, University of Chicago, Chicago, Illinois, 60637, United States of America
  - Department of Ecology and Evolution, University of Chicago, Chicago, Illinois, 60637, United States of America
|
19
|
Wahl LM, Agashe D. Selection bias in mutation accumulation. Evolution 2022; 76:528-540. [PMID: 34989408 DOI: 10.1111/evo.14430] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/23/2021] [Accepted: 11/26/2021] [Indexed: 12/01/2022]
Abstract
Mutation accumulation (MA) experiments, in which de novo mutations are sampled and subsequently characterized, are an essential tool in understanding the processes underlying evolution. In microbial populations, MA protocols typically involve a period of population growth between severe bottlenecks, such that a single individual can form a visible colony. While it has long been appreciated that the action of positive selection during this growth phase cannot be eliminated, it is typically assumed to be negligible. Here, we quantify the effect of both positive and negative selection in MA studies, demonstrating that selective effects can substantially bias the distribution of fitness effects (DFE) and mutation rates estimated from typical MA protocols in microbes. We then present a simple correction for this bias which applies to both beneficial and deleterious mutations, and can be used to correct the observed DFE in multiple environments. We use simulated MA experiments to illustrate the extent to which the MA-inferred DFE differs from the underlying true DFE, and demonstrate that the proposed correction accurately reconstructs the true DFE over a wide range of scenarios; we also provide an example of these corrections applied to experimental data. These results highlight that positive selection during microbial MA experiments is in fact not negligible, but can be corrected to gain a more accurate understanding of fundamental evolutionary parameters.
Affiliation(s)
- Deepa Agashe
  - National Centre for Biological Sciences, GKVK Campus, Bellary Road, Bengaluru, India
|
20
|
Rana A, Patton D, Turner NT, Dillon MM, Cooper VS, Sung W. Precise measurement of the fitness effects of spontaneous mutations by droplet digital PCR in Burkholderia cenocepacia. Genetics 2021; 219:6325026. [PMID: 34849876 DOI: 10.1093/genetics/iyab117] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2021] [Revised: 07/07/2021] [Accepted: 07/09/2021] [Indexed: 11/12/2022] Open
Abstract
Understanding how mutations affect survivability is a key component to knowing how organisms and complex traits evolve. However, most mutations have a minor effect on fitness and these effects are difficult to resolve using traditional molecular techniques. Therefore, there is a dire need for more accurate and precise fitness measurement methods. Here, we measured the fitness effects in Burkholderia cenocepacia HI2424 mutation accumulation (MA) lines using droplet-digital polymerase chain reaction (ddPCR). Overall, the fitness measurements from ddPCR-MA are correlated positively with fitness measurements derived from traditional phenotypic marker assays (r = 0.297, P = 0.05), but showed some differences. First, ddPCR had significantly lower measurement variance in fitness (F = 3.78, P < 2.6 × 10⁻¹³) in control experiments. Second, the mean fitness from ddPCR-MA measurements was significantly lower than that from phenotypic marker assays (-0.0041 vs -0.0071, P = 0.006). Consistent with phenotypic marker assays, ddPCR-MA measurements observed multiple (27/43) lineages that significantly deviated from mean fitness, suggesting that a majority of the mutations are neutral or slightly deleterious and intermixed with a few mutations that have extremely large effects. Of these mutations, we found a significant excess of mutations within DNA excinuclease and LysR transcriptional regulators that have extreme deleterious and beneficial effects, indicating that modifications to transcription and replication may have a strong effect on organismal fitness. This study demonstrates the power of ddPCR as a ubiquitous method for high-throughput fitness measurements in both DNA- and RNA-based organisms regardless of cell type or physiology.
Affiliation(s)
- Anita Rana
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
- David Patton
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
- Nathan T Turner
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
- Marcus M Dillon
  - Department of Cell and Systems Biology, University of Toronto, Toronto, Ontario M5S3B2, Canada
- Vaughn S Cooper
  - Department of Microbiology and Molecular Genetics, University of Pittsburgh School of Medicine, Pittsburgh, PA 15219, USA
- Way Sung
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
|
21
|
Chen J, Bataillon T, Glémin S, Lascoux M. Hunting for beneficial mutations: conditioning on SIFT scores when estimating the distribution of fitness effect of new mutations. Genome Biol Evol 2021; 14:6310736. [PMID: 34180988 PMCID: PMC8743036 DOI: 10.1093/gbe/evab151] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Accepted: 06/21/2021] [Indexed: 11/13/2022] Open
Abstract
The Distribution of Fitness Effects (DFE) of new mutations is a key parameter of molecular evolution. The DFE can in principle be estimated by comparing the Site Frequency Spectra (SFS) of putatively neutral and functional polymorphisms. Unfortunately the DFE is intrinsically hard to estimate, especially for beneficial mutations since these tend to be exceedingly rare. There is therefore a strong incentive to find out whether conditioning on properties of mutations that are independent of the SFS could provide additional information. In the present study, we developed a new measure based on SIFT scores. SIFT scores are assigned to nucleotide sites based on their level of conservation across a multi species alignment: the more conserved a site, the more likely mutations occurring at this site are deleterious and the lower the SIFT score. If one knows the ancestral state at a given site, one can assign a value to new mutations occurring at the site based on the change of SIFT score associated with the mutation. We called this new measure δ. We show that properties of the DFE as well as the flux of beneficial mutations across classes covary with δ and, hence, that SIFT scores are informative when estimating the fitness effect of new mutations. In particular, conditioning on SIFT scores can help to characterize beneficial mutations.
Affiliation(s)
- J Chen
  - College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, 310058, China
- T Bataillon
  - Bioinformatics Research Centre, Aarhus University, C.F. Møllers Allé 8, Aarhus C, DK-8000, Denmark
- S Glémin
  - Université de Rennes, Centre National de la Recherche Scientifique (CNRS), ECOBIO (Ecosystèmes, Biodiversité, Evolution) - Unité Mixte de Recherche (UMR) 6553, Rennes, F-35000, France
  - Program in Plant Ecology and Evolution, Department of Ecology and Genetics, Evolutionary Biology Centre, Uppsala University, Uppsala, 75236, Sweden
- M Lascoux
  - Program in Plant Ecology and Evolution, Department of Ecology and Genetics, Evolutionary Biology Centre, Uppsala University, Uppsala, 75236, Sweden
|
22
|
Cote-Hammarlof PA, Fragata I, Flynn J, Mavor D, Zeldovich KB, Bank C, Bolon DNA. The Adaptive Potential of the Middle Domain of Yeast Hsp90. Mol Biol Evol 2021; 38:368-379. [PMID: 32871012 PMCID: PMC7826181 DOI: 10.1093/molbev/msaa211] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 01/02/2023] Open
Abstract
The distribution of fitness effects (DFEs) of new mutations across different environments quantifies the potential for adaptation in a given environment and its cost in others. So far, results regarding the cost of adaptation across environments have been mixed, and most studies have sampled random mutations across different genes. Here, we quantify systematically how costs of adaptation vary along a large stretch of protein sequence by studying the distribution of fitness effects of the same ≈2,300 amino-acid changing mutations obtained from deep mutational scanning of 119 amino acids in the middle domain of the heat shock protein Hsp90 in five environments. This region is known to be important for client binding, stabilization of the Hsp90 dimer, stabilization of the N-terminal-Middle and Middle-C-terminal interdomains, and regulation of ATPase–chaperone activity. Interestingly, we find that fitness correlates well across diverse stressful environments, with the exception of one environment, diamide. Consistent with this result, we find little cost of adaptation; on average only one in seven beneficial mutations is deleterious in another environment. We identify a hotspot of beneficial mutations in a region of the protein that is located within an allosteric center. The identified protein regions that are enriched in beneficial, deleterious, and costly mutations coincide with residues that are involved in the stabilization of Hsp90 interdomains and stabilization of client-binding interfaces, or residues that are involved in ATPase–chaperone activity of Hsp90. Thus, our study yields information regarding the role and adaptive potential of a protein sequence that complements and extends known structural information.
Affiliation(s)
- Inês Fragata
  - Instituto Gulbenkian de Ciência, Oeiras, Portugal
- Julia Flynn
  - University of Massachusetts Medical School, Worcester, MA
- David Mavor
  - University of Massachusetts Medical School, Worcester, MA
- Claudia Bank
  - Instituto Gulbenkian de Ciência, Oeiras, Portugal
  - Institute of Ecology and Evolution, University of Bern, Switzerland
|
23
|
Berdan EL, Blanckaert A, Slotte T, Suh A, Westram AM, Fragata I. Unboxing mutations: Connecting mutation types with evolutionary consequences. Mol Ecol 2021; 30:2710-2723. [PMID: 33955064 DOI: 10.1111/mec.15936] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 01/14/2021] [Revised: 03/30/2021] [Accepted: 04/20/2021] [Indexed: 01/09/2023]
Abstract
A key step in understanding the genetic basis of different evolutionary outcomes (e.g., adaptation) is to determine the roles played by different mutation types (e.g., SNPs, translocations and inversions). To do this we must simultaneously consider different mutation types in an evolutionary framework. Here, we propose a research framework that directly utilizes the most important characteristics of mutations, their population genetic effects, to determine their relative evolutionary significance in a given scenario. We review known population genetic effects of different mutation types and show how these may be connected to different evolutionary outcomes. We provide examples of how to implement this framework and pinpoint areas where more data, theory and synthesis are needed. Linking experimental and theoretical approaches to examine different mutation types simultaneously is a critical step towards understanding their evolutionary significance.
Affiliation(s)
- Emma L Berdan
  - Department of Ecology, Environment and Plant Sciences, Science for Life Laboratory, Stockholm University, Stockholm, Sweden
- Tanja Slotte
  - Department of Ecology, Environment and Plant Sciences, Science for Life Laboratory, Stockholm University, Stockholm, Sweden
- Alexander Suh
  - School of Biological Sciences - Organisms and the Environment, University of East Anglia, Norwich, UK
  - Department of Organismal Biology - Systematic Biology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Anja M Westram
  - IST Austria, Klosterneuburg, Austria
  - Faculty of Biosciences and Aquaculture, Nord University, Bodø, Norway
- Inês Fragata
  - cE3c - Centre for Ecology, Evolution and Environmental Changes, Faculdade de Ciências, Universidade de Lisboa, Lisboa, Portugal
|
24
|
Abstract
RNA viruses, such as hepatitis C virus (HCV), influenza virus, and SARS-CoV-2, are notorious for their ability to evolve rapidly under selection in novel environments. It is known that the high mutation rate of RNA viruses can generate huge genetic diversity to facilitate viral adaptation. However, less attention has been paid to the underlying fitness landscape that represents the selection forces on viral genomes, especially under different selection conditions. Here, we systematically quantified the distribution of fitness effects of about 1,600 single amino acid substitutions in the drug-targeted region of NS5A protein of HCV. We found that the majority of nonsynonymous substitutions incur large fitness costs, suggesting that NS5A protein is highly optimized. The replication fitness of viruses is correlated with the pattern of sequence conservation in nature, and viral evolution is constrained by the need to maintain protein stability. We characterized the adaptive potential of HCV by subjecting the mutant viruses to selection by the antiviral drug daclatasvir at multiple concentrations. Both the relative fitness values and the number of beneficial mutations were found to increase with the increasing concentrations of daclatasvir. The changes in the spectrum of beneficial mutations in NS5A protein can be explained by a pharmacodynamics model describing viral fitness as a function of drug concentration. Overall, our results show that the distribution of fitness effects of mutations is modulated by both the constraints on the biophysical properties of proteins (i.e., selection pressure for protein stability) and the level of environmental stress (i.e., selection pressure for drug resistance).
IMPORTANCE Many viruses adapt rapidly to novel selection pressures, such as antiviral drugs. Understanding how pathogens evolve under drug selection is critical for the success of antiviral therapy against human pathogens. By combining deep sequencing with selection experiments in cell culture, we have quantified the distribution of fitness effects of mutations in hepatitis C virus (HCV) NS5A protein. Our results indicate that the majority of single amino acid substitutions in NS5A protein incur large fitness costs. Simulation of protein stability suggests viral evolution is constrained by the need to maintain protein stability. By subjecting the mutant viruses to selection under an antiviral drug, we find that the adaptive potential of viral proteins in a novel environment is modulated by the level of environmental stress, which can be explained by a pharmacodynamics model. Our comprehensive characterization of the fitness landscapes of NS5A can potentially guide the design of effective strategies to limit viral evolution.
|
25
|
Gualtieri CT. Genomic Variation, Evolvability, and the Paradox of Mental Illness. Front Psychiatry 2021; 11:593233. [PMID: 33551865 PMCID: PMC7859268 DOI: 10.3389/fpsyt.2020.593233] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/10/2020] [Accepted: 11/27/2020] [Indexed: 12/30/2022] Open
Abstract
Twentieth-century genetics was hard put to explain the irregular behavior of neuropsychiatric disorders. Autism and schizophrenia defy a principle of natural selection; they are highly heritable but associated with low reproductive success. Nevertheless, they persist. The genetic origins of such conditions are confounded by the problem of variable expression, that is, when a given genetic aberration can lead to any one of several distinct disorders. Also, autism and schizophrenia occur on a spectrum of severity, from mild and subclinical cases to the overt and disabling. Such irregularities reflect the problem of missing heritability; although hundreds of genes may be associated with autism or schizophrenia, together they account for only a small proportion of cases. Techniques for higher resolution, genomewide analysis have begun to illuminate the irregular and unpredictable behavior of the human genome. Thus, the origins of neuropsychiatric disorders in particular and complex disease in general have been illuminated. The human genome is characterized by a high degree of structural and behavioral variability: DNA content variation, epistasis, stochasticity in gene expression, and epigenetic changes. These elements have grown more complex as evolution scaled the phylogenetic tree. They are especially pertinent to brain development and function. Genomic variability is a window on the origins of complex disease, neuropsychiatric disorders, and neurodevelopmental disorders in particular. Genomic variability, as it happens, is also the fuel of evolvability. The genomic events that presided over the evolution of the primate and hominid lineages are over-represented in patients with autism and schizophrenia, as well as intellectual disability and epilepsy. That the special qualities of the human genome that drove evolution might, in some way, contribute to neuropsychiatric disorders is a matter of no little interest.
|
26
|
The evolutionary scaling of cellular traits imposed by the drift barrier. Proc Natl Acad Sci U S A 2020; 117:10435-10444. [PMID: 32345718 DOI: 10.1073/pnas.2000446117] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Indexed: 11/18/2022] Open
Abstract
Owing to internal homeostatic mechanisms, cellular traits may experience long periods of stable selective pressures, during which the stochastic forces of drift and mutation conspire to generate variation. However, even in the face of invariant selection, the drift barrier defined by the genetic effective population size, which is negatively associated with organism size, can have a substantial influence on the location and dispersion of the long-term steady-state distribution of mean phenotypes. In addition, for multilocus traits, the multiplicity of alternative, functionally equivalent states can draw mean phenotypes away from selective optima, even in the absence of mutation bias. Using a framework for traits with an additive genetic basis, it is shown that 1) optimal phenotypic states may be only rarely achieved; 2) gradients of mean phenotypes with respect to organism size (i.e., allometric relationships) are likely to be molded by differences in the power of random genetic drift across the tree of life; and 3) for any particular set of population-genetic conditions, significant variation in mean phenotypes may exist among lineages exposed to identical selection pressures. These results provide a potentially useful framework for understanding numerous aspects of cellular diversification and illustrate the risks of interpreting such variation in a purely adaptive framework.
|
27
|
A Theoretical Framework for Evolutionary Cell Biology. J Mol Biol 2020; 432:1861-1879. [PMID: 32087200 DOI: 10.1016/j.jmb.2020.02.006] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 10/29/2019] [Revised: 01/20/2020] [Accepted: 02/04/2020] [Indexed: 11/24/2022]
Abstract
One of the last uncharted territories in evolutionary biology concerns the link with cell biology. Because all phenotypes ultimately derive from events at the cellular level, this connection is essential to building a mechanism-based theory of evolution. Given the impressive developments in cell biological methodologies at the structural and functional levels, the potential for rapid progress is great. The primary challenge for theory development is the establishment of a quantitative framework that transcends species boundaries. Two approaches to the problem are presented here: establishing the long-term steady-state distribution of mean phenotypes under specific regimes of mutation, selection, and drift and evaluating the energetic costs of cellular structures and functions. Although not meant to be the final word, these theoretical platforms harbor potential for generating insight into a diversity of unsolved problems, ranging from genome structure to cellular architecture to aspects of motility in organisms across the Tree of Life.
|
28
|
Osmond MM, Otto SP, Martin G. Genetic Paths to Evolutionary Rescue and the Distribution of Fitness Effects Along Them. Genetics 2020; 214:493-510. [PMID: 31822480 PMCID: PMC7017017 DOI: 10.1534/genetics.119.302890] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 11/09/2019] [Accepted: 12/06/2019] [Indexed: 02/07/2023] Open
Abstract
The past century has seen substantial theoretical and empirical progress on the genetic basis of adaptation. Over this same period, a pressing need to prevent the evolution of drug resistance has uncovered much about the potential genetic basis of persistence in declining populations. However, we have little theory to predict and generalize how persistence (by sufficiently rapid adaptation) might be realized in this explicitly demographic scenario. Here, we use Fisher's geometric model with absolute fitness to begin a line of theoretical inquiry into the genetic basis of evolutionary rescue, focusing here on asexual populations that adapt through de novo mutations. We show how the dominant genetic path to rescue switches from a single mutation to multiple as mutation rates and the severity of the environmental change increase. In multi-step rescue, intermediate genotypes that themselves go extinct provide a "springboard" to rescue genotypes. Comparing to a scenario where persistence is assured, our approach allows us to quantify how a race between evolution and extinction leads to a genetic basis of adaptation that is composed of fewer loci of larger effect. We hope this work brings awareness to the impact of demography on the genetic basis of adaptation.
Affiliation(s)
- Matthew M Osmond
  - Biodiversity Centre and Department of Zoology, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
- Sarah P Otto
  - Biodiversity Centre and Department of Zoology, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
- Guillaume Martin
  - Institut des Sciences de l'Evolution de Montpellier UMR5554, Universite de Montpellier, CNRS-IRD-EPHE-UM, France
|
29
|
Tataru P, Bataillon T. polyDFE: Inferring the Distribution of Fitness Effects and Properties of Beneficial Mutations from Polymorphism Data. Methods Mol Biol 2020; 2090:125-146. [PMID: 31975166 DOI: 10.1007/978-1-0716-0199-0_6] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 02/02/2023]
Abstract
The possible evolutionary trajectories a population can follow are determined by the fitness effects of new mutations. Their relative frequencies are best specified through a distribution of fitness effects (DFE) that spans deleterious, neutral, and beneficial mutations. As such, the DFE is key to several aspects of the evolution of a population, and particularly the rate of adaptive molecular evolution (α). Inference of DFE from patterns of polymorphism and divergence has been a longstanding goal of evolutionary genetics. polyDFE provides a flexible statistical framework to estimate the DFE and α from site frequency spectrum (SFS) data. Several probability distributions can be fitted to the data to model the DFE. The method also jointly estimates a series of nuisance parameters that model the effect of unknown demography as well as data imperfections, in particular possible errors in polarizing SNPs. This chapter is organized as a tutorial for polyDFE. We start by briefly reviewing the concept of DFE, α, and the principles underlying the method, and then provide an example using central chimpanzee data (Tataru et al., Genetics 207(3):1103-1119, 2017; Bataillon et al., Genome Biol Evol 7(4):1122-1132, 2015) to guide the user through the different steps of an analysis: formatting the data as input to polyDFE, fitting different models, obtaining estimates of parameter uncertainty and performing statistical tests, as well as model averaging procedures to obtain robust estimates of model parameters.
Collapse
Affiliation(s)
- Paula Tataru
- Bioinformatics Research Center, Aarhus University, Aarhus, Denmark
- Thomas Bataillon
- Bioinformatics Research Center, Aarhus University, Aarhus, Denmark
|
30
|
Moutinho AF, Bataillon T, Dutheil JY. Variation of the adaptive substitution rate between species and within genomes. Evol Ecol 2019. [DOI: 10.1007/s10682-019-10026-z] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Indexed: 02/06/2023]
Abstract
The importance of adaptive mutations in molecular evolution is extensively debated. Recent developments in population genomics allow inferring rates of adaptive mutations by fitting a distribution of fitness effects to the observed patterns of polymorphism and divergence at sites under selection and sites assumed to evolve neutrally. Here, we summarize the current state of the art of these methods and review the factors that affect the molecular rate of adaptation. Several studies have reported extensive cross-species variation in the proportion of adaptive amino-acid substitutions (α) and predicted that species with larger effective population sizes undergo less genetic drift and higher rates of adaptation. Disentangling the rates of positive and negative selection, however, revealed that mutations with deleterious effects are the main driver of this population size effect and that adaptive substitution rates vary comparatively little across species. Conversely, rates of adaptive substitution have been documented to vary substantially within genomes. On a genome-wide scale, gene density, recombination and mutation rate were observed to play a role in shaping molecular rates of adaptation, as predicted under models of linked selection. At the gene level, it has been reported that the gene functional category and the macromolecular structure substantially impact the rate of adaptive mutations. Here, we deliver a comprehensive review of methods used to infer the molecular adaptive rate, the potential drivers of adaptive evolution and how positive selection shapes molecular evolution within genes, across genes within species and between species.
|
31
|
The Role of Mutation Bias in Adaptive Evolution. Trends Ecol Evol 2019; 34:422-434. [PMID: 31003616 DOI: 10.1016/j.tree.2019.01.015] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Received: 10/08/2018] [Revised: 01/27/2019] [Accepted: 01/30/2019] [Indexed: 11/24/2022]
Abstract
Mutational input is the ultimate source of genetic variation, but mutations are not thought to affect the direction of adaptive evolution. Recently, critics of standard evolutionary theory have questioned the random and non-directional nature of mutations, claiming that the mutational process can be adaptive in its own right. We discuss here mutation bias in adaptive evolution. We find little support for mutation bias as an independent force in adaptive evolution, although it can interact with selection under conditions of small population size and when standing genetic variation is limited, entirely consistent with standard evolutionary theory. We further emphasize that natural selection can shape the phenotypic effects of mutations, giving the false impression that directed mutations are driving adaptive evolution.
|
32
|
Kemble H, Nghe P, Tenaillon O. Recent insights into the genotype-phenotype relationship from massively parallel genetic assays. Evol Appl 2019; 12:1721-1742. [PMID: 31548853 PMCID: PMC6752143 DOI: 10.1111/eva.12846] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Received: 03/27/2019] [Revised: 06/21/2019] [Accepted: 07/02/2019] [Indexed: 12/20/2022]
Abstract
With the molecular revolution in Biology, a mechanistic understanding of the genotype-phenotype relationship became possible. Recently, advances in DNA synthesis and sequencing have enabled the development of deep mutational scanning assays, capable of scoring comprehensive libraries of genotypes for fitness and a variety of phenotypes in massively parallel fashion. The resulting empirical genotype-fitness maps pave the way to predictive models, potentially accelerating our ability to anticipate the behaviour of pathogen and cancerous cell populations from sequencing data. Besides cellular fitness, phenotypes of direct application in industry (e.g. enzyme activity) and medicine (e.g. antibody binding) can be quantified and even selected directly by these assays. This review discusses the technological basis of and recent developments in massively parallel genetics, along with the trends it is uncovering in the genotype-phenotype relationship (distribution of mutation effects, epistasis), their possible mechanistic bases and future directions for advancing towards the goal of predictive genetics.
Affiliation(s)
- Harry Kemble
- Infection, Antimicrobials, Modelling, Evolution, INSERM, Unité Mixte de Recherche 1137, Université Paris Diderot, Université Paris Nord, Paris, France
- École Supérieure de Physique et de Chimie Industrielles de la Ville de Paris (ESPCI Paris), UMR CNRS‐ESPCI CBI 8231, PSL Research University, Paris Cedex 05, France
- Philippe Nghe
- École Supérieure de Physique et de Chimie Industrielles de la Ville de Paris (ESPCI Paris), UMR CNRS‐ESPCI CBI 8231, PSL Research University, Paris Cedex 05, France
- Olivier Tenaillon
- Infection, Antimicrobials, Modelling, Evolution, INSERM, Unité Mixte de Recherche 1137, Université Paris Diderot, Université Paris Nord, Paris, France
|
33
|
Durand É, Gagnon-Arsenault I, Hallin J, Hatin I, Dubé AK, Nielly-Thibault L, Namy O, Landry CR. Turnover of ribosome-associated transcripts from de novo ORFs produces gene-like characteristics available for de novo gene emergence in wild yeast populations. Genome Res 2019; 29:932-943. [PMID: 31152050 PMCID: PMC6581059 DOI: 10.1101/gr.239822.118] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 05/25/2018] [Accepted: 05/13/2019] [Indexed: 12/17/2022]
Abstract
Little is known about the rate of emergence of de novo genes, what their initial properties are, and how they spread in populations. We examined wild yeast populations (Saccharomyces paradoxus) to characterize the diversity and turnover of intergenic ORFs over short evolutionary timescales. We find that hundreds of intergenic ORFs show translation signatures similar to canonical genes, and we experimentally confirmed the translation of many of these ORFs in laboratory conditions using a reporter assay. Compared with canonical genes, intergenic ORFs have lower translation efficiency, which could imply a lack of optimization for translation or a mechanism to reduce their production cost. Translated intergenic ORFs also tend to have sequence properties that are generally close to those of random intergenic sequences. However, some of the very recent translated intergenic ORFs, which appeared <110 kya, already show gene-like characteristics, suggesting that the raw material for functional innovations could appear over short evolutionary timescales.
Affiliation(s)
- Éléonore Durand
- Institut de Biologie Intégrative et des Systèmes, Département de Biologie, PROTEO, Centre de Recherche en Données Massives de l'Université Laval, Pavillon Charles-Eugène-Marchand, Université Laval, G1V 0A6 Québec, Québec, Canada
- Isabelle Gagnon-Arsenault
- Institut de Biologie Intégrative et des Systèmes, Département de Biologie, PROTEO, Centre de Recherche en Données Massives de l'Université Laval, Pavillon Charles-Eugène-Marchand, Université Laval, G1V 0A6 Québec, Québec, Canada; Département de Biochimie, Microbiologie et Bio-informatique, Université Laval, G1V 0A6 Québec, Québec, Canada
- Johan Hallin
- Institut de Biologie Intégrative et des Systèmes, Département de Biologie, PROTEO, Centre de Recherche en Données Massives de l'Université Laval, Pavillon Charles-Eugène-Marchand, Université Laval, G1V 0A6 Québec, Québec, Canada; Département de Biochimie, Microbiologie et Bio-informatique, Université Laval, G1V 0A6 Québec, Québec, Canada
- Isabelle Hatin
- Institut de Biologie Intégrative de la Cellule (I2BC), CEA, CNRS, Université Paris-Sud, Université Paris-Saclay, 91190 Gif sur Yvette, France
- Alexandre K Dubé
- Institut de Biologie Intégrative et des Systèmes, Département de Biologie, PROTEO, Centre de Recherche en Données Massives de l'Université Laval, Pavillon Charles-Eugène-Marchand, Université Laval, G1V 0A6 Québec, Québec, Canada; Département de Biochimie, Microbiologie et Bio-informatique, Université Laval, G1V 0A6 Québec, Québec, Canada
- Lou Nielly-Thibault
- Institut de Biologie Intégrative et des Systèmes, Département de Biologie, PROTEO, Centre de Recherche en Données Massives de l'Université Laval, Pavillon Charles-Eugène-Marchand, Université Laval, G1V 0A6 Québec, Québec, Canada
- Olivier Namy
- Institut de Biologie Intégrative de la Cellule (I2BC), CEA, CNRS, Université Paris-Sud, Université Paris-Saclay, 91190 Gif sur Yvette, France
- Christian R Landry
- Institut de Biologie Intégrative et des Systèmes, Département de Biologie, PROTEO, Centre de Recherche en Données Massives de l'Université Laval, Pavillon Charles-Eugène-Marchand, Université Laval, G1V 0A6 Québec, Québec, Canada; Département de Biochimie, Microbiologie et Bio-informatique, Université Laval, G1V 0A6 Québec, Québec, Canada
|
34
|
Mutational and non mutational adaptation of Salmonella enterica to the gall bladder. Sci Rep 2019; 9:5203. [PMID: 30914708 PMCID: PMC6435676 DOI: 10.1038/s41598-019-41600-8] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 12/19/2018] [Accepted: 03/12/2019] [Indexed: 02/06/2023]
Abstract
During systemic infection of susceptible hosts, Salmonella enterica colonizes the gall bladder, which contains lethal concentrations of bile salts. Recovery of Salmonella cells from the gall bladder of infected mice yields two types of isolates: (i) bile-resistant mutants; (ii) isolates that survive lethal selection without mutation. Bile-resistant mutants are recovered at frequencies high enough to suggest that increased mutation rates may occur in the gall bladder, thus providing a tentative example of stress-induced mutation in a natural environment. However, most bile-resistant mutants characterized in this study show defects in traits that are relevant for Salmonella colonization of the animal host. Mutation may thus permit short-term adaptation to the gall bladder at the expense of losing fitness for transmission to new hosts. In contrast, non mutational adaptation may have evolved as a fitness-preserving strategy. Failure of RpoS− mutants to colonize the gall bladder supports the involvement of the general stress response in non mutational adaptation.
|
35
|
Sane M, Miranda JJ, Agashe D. Antagonistic pleiotropy for carbon use is rare in new mutations. Evolution 2018; 72:2202-2213. [PMID: 30095155 PMCID: PMC6203952 DOI: 10.1111/evo.13569] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 05/11/2018] [Revised: 07/20/2018] [Accepted: 07/25/2018] [Indexed: 12/21/2022]
Abstract
Pleiotropic effects of mutations underlie diverse biological phenomena such as ageing and specialization. In particular, antagonistic pleiotropy ("AP": when a mutation has opposite fitness effects in different environments) generates tradeoffs, which may constrain adaptation. Models of adaptation typically assume that AP is common - especially among large-effect mutations - and that pleiotropic effect sizes are positively correlated. Empirical tests of these assumptions have focused on de novo beneficial mutations arising under strong selection. However, most mutations are actually deleterious or neutral, and may contribute to standing genetic variation that can subsequently drive adaptation. We quantified the incidence, nature, and effect size of pleiotropy for carbon utilization across 80 single mutations in Escherichia coli that arose under mutation accumulation (i.e., weak selection). Although ∼46% of the mutations were pleiotropic, only 11% showed AP; among beneficial mutations, only ∼4% showed AP. In some environments, AP was more common in large-effect mutations; and AP effect sizes across environments were often negatively correlated. Thus, AP for carbon use is generally rare (especially among beneficial mutations); is not consistently enriched in large-effect mutations; and often involves weakly deleterious antagonistic effects. Our unbiased quantification of mutational effects therefore suggests that antagonistic pleiotropy may be unlikely to cause maladaptive tradeoffs.
Affiliation(s)
- Mrudula Sane
- National Centre for Biological Sciences, Tata Institute of Fundamental Research, Bangalore, India
- Joshua John Miranda
- National Centre for Biological Sciences, Tata Institute of Fundamental Research, Bangalore, India
- Deepa Agashe
- National Centre for Biological Sciences, Tata Institute of Fundamental Research, Bangalore, India
|
36
|
Deatherage DE, Leon D, Rodriguez ÁE, Omar SK, Barrick JE. Directed evolution of Escherichia coli with lower-than-natural plasmid mutation rates. Nucleic Acids Res 2018; 46:9236-9250. [PMID: 30137492 PMCID: PMC6158703 DOI: 10.1093/nar/gky751] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 05/01/2018] [Revised: 08/03/2018] [Accepted: 08/08/2018] [Indexed: 12/24/2022]
Abstract
Unwanted evolution of designed DNA sequences limits metabolic and genome engineering efforts. Engineered functions that are burdensome to host cells and slow their replication are rapidly inactivated by mutations, and unplanned mutations with unpredictable effects often accumulate alongside designed changes in large-scale genome editing projects. We developed a directed evolution strategy, Periodic Reselection for Evolutionarily Reliable Variants (PResERV), to discover mutations that prolong the function of a burdensome DNA sequence in an engineered organism. Here, we used PResERV to isolate Escherichia coli cells that replicate ColE1-type plasmids with higher fidelity. We found mutations in DNA polymerase I and in RNase E that reduce plasmid mutation rates by 6- to 30-fold. The PResERV method implicitly selects to maintain the growth rate of host cells, and high plasmid copy numbers and gene expression levels are maintained in some of the evolved E. coli strains, indicating that it is possible to improve the genetic stability of cellular chassis without encountering trade-offs in other desirable performance characteristics. Utilizing these new antimutator E. coli and applying PResERV to other organisms in the future promises to prevent evolutionary failures and unpredictability to provide a more stable genetic foundation for synthetic biology.
Affiliation(s)
- Daniel E Deatherage
- Department of Molecular Biosciences, Center for Systems and Synthetic Biology, The University of Texas at Austin, Austin, TX 78712, USA
- Dacia Leon
- Department of Molecular Biosciences, Center for Systems and Synthetic Biology, The University of Texas at Austin, Austin, TX 78712, USA
- Álvaro E Rodriguez
- Department of Molecular Biosciences, Center for Systems and Synthetic Biology, The University of Texas at Austin, Austin, TX 78712, USA
- Salma K Omar
- Department of Molecular Biosciences, Center for Systems and Synthetic Biology, The University of Texas at Austin, Austin, TX 78712, USA
- Jeffrey E Barrick
- Department of Molecular Biosciences, Center for Systems and Synthetic Biology, The University of Texas at Austin, Austin, TX 78712, USA
|
37
|
Alachiotis N, Pavlidis P. RAiSD detects positive selection based on multiple signatures of a selective sweep and SNP vectors. Commun Biol 2018; 1:79. [PMID: 30271960 PMCID: PMC6123745 DOI: 10.1038/s42003-018-0085-8] [Citation(s) in RCA: 53] [Impact Index Per Article: 8.8] [Received: 09/29/2017] [Accepted: 06/05/2018] [Indexed: 12/16/2022]
Abstract
Selective sweeps leave distinct signatures locally in genomes, enabling the detection of loci that have undergone recent positive selection. Multiple signatures of a selective sweep are known, yet each neutrality test only identifies a single signature. We present RAiSD (Raised Accuracy in Sweep Detection), an open-source software that implements a novel, to our knowledge, and parameter-free detection mechanism that relies on multiple signatures of a selective sweep via the enumeration of SNP vectors. RAiSD achieves higher sensitivity and accuracy than the current state of the art, while the computational complexity is greatly reduced, allowing up to 1000 times faster processing than widely used tools, and negligible memory requirements. Nikolaos Alachiotis and Pavlos Pavlidis present RAiSD, a computational method for identifying multiple signatures of selective sweeps using single nucleotide polymorphism vectors. They show that RAiSD has higher sensitivity and accuracy with reduced computational complexity than current methods.
Affiliation(s)
- Nikolaos Alachiotis
- Institute of Computer Science, Foundation for Research and Technology-Hellas, Nikolaou Plastira 100, 70013, Heraklion, Crete, Greece
- Pavlos Pavlidis
- Institute of Computer Science, Foundation for Research and Technology-Hellas, Nikolaou Plastira 100, 70013, Heraklion, Crete, Greece
|
38
|
Berger D, Stångberg J, Grieshop K, Martinossi-Allibert I, Arnqvist G. Temperature effects on life-history trade-offs, germline maintenance and mutation rate under simulated climate warming. Proc Biol Sci 2018; 284:rspb.2017.1721. [PMID: 29118134 DOI: 10.1098/rspb.2017.1721] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Received: 07/31/2017] [Accepted: 10/06/2017] [Indexed: 01/28/2023]
Abstract
Mutation has a fundamental influence over evolutionary processes, but how evolutionary processes shape mutation rate remains less clear. In asexual unicellular organisms, increased mutation rates have been observed in stressful environments and the reigning paradigm ascribes this increase to selection for evolvability. However, this explanation does not apply in sexually reproducing species, where little is known about how the environment affects mutation rate. Here we challenged experimental lines of seed beetle, evolved at ancestral temperature or under simulated climate warming, to repair induced mutations at ancestral and stressful temperature. Results show that temperature stress causes individuals to pass on a greater mutation load to their grand-offspring. This suggests that stress-induced mutation rates, in unicellular and multicellular organisms alike, can result from compromised germline DNA repair in low condition individuals. Moreover, lines adapted to simulated climate warming had evolved increased longevity at the cost of reproduction, and this allocation decision improved germline repair. These results suggest that mutation rates can be modulated by resource allocation trade-offs encompassing life-history traits and the germline and have important implications for rates of adaptation and extinction as well as our understanding of genetic diversity in multicellular organisms.
Affiliation(s)
- David Berger
- Department of Ecology and Genetics, Animal Ecology, Uppsala University, Uppsala, Sweden
- Josefine Stångberg
- Department of Ecology and Genetics, Animal Ecology, Uppsala University, Uppsala, Sweden
- Karl Grieshop
- Department of Ecology and Genetics, Animal Ecology, Uppsala University, Uppsala, Sweden
- Göran Arnqvist
- Department of Ecology and Genetics, Animal Ecology, Uppsala University, Uppsala, Sweden
|
39
|
Robert L, Ollion J, Robert J, Song X, Matic I, Elez M. Mutation dynamics and fitness effects followed in single cells. Science 2018; 359:1283-1286. [DOI: 10.1126/science.aan0797] [Citation(s) in RCA: 87] [Impact Index Per Article: 14.5] [Received: 03/01/2017] [Revised: 10/19/2017] [Accepted: 01/30/2018] [Indexed: 12/12/2022]
|
40
|
Lundin E, Tang PC, Guy L, Näsvall J, Andersson DI. Experimental Determination and Prediction of the Fitness Effects of Random Point Mutations in the Biosynthetic Enzyme HisA. Mol Biol Evol 2018; 35:704-718. [PMID: 29294020 PMCID: PMC5850734 DOI: 10.1093/molbev/msx325] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Indexed: 01/15/2023]
Abstract
The distribution of fitness effects of mutations is a factor of fundamental importance in evolutionary biology. We determined the distribution of fitness effects of 510 mutants that each carried between 1 and 10 mutations (synonymous and nonsynonymous) in the hisA gene, encoding an essential enzyme in the l-histidine biosynthesis pathway of Salmonella enterica. For the full set of mutants, the distribution was bimodal with many apparently neutral mutations and many lethal mutations. For a subset of 81 single, nonsynonymous mutants most mutations appeared neutral at high expression levels, whereas at low expression levels only a few mutations were neutral. Furthermore, we examined how the magnitude of the observed fitness effects was correlated to several measures of biophysical properties and phylogenetic conservation. We conclude that for HisA: (i) The effect of mutations can be masked by high expression levels, such that mutations that are deleterious to the function of the protein can still be neutral with regard to organism fitness if the protein is expressed at a sufficiently high level; (ii) the shape of the fitness distribution is dependent on the extent to which the protein is rate-limiting for growth; (iii) negative epistatic interactions, on average, amplified the combined effect of nonsynonymous mutations; and (iv) no single sequence-based predictor could confidently predict the fitness effects of mutations in HisA, but a combination of multiple predictors could predict the effect with a SD of 0.04 resulting in 80% of the mutations predicted within 12% of their observed selection coefficients.
Affiliation(s)
- Erik Lundin
- Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
- Po-Cheng Tang
- Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
- Lionel Guy
- Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
- Joakim Näsvall
- Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
- Dan I Andersson
- Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
|
41
|
Sánchez-Gracia A, Guirao-Rico S, Hinojosa-Alvarez S, Rozas J. Computational prediction of the phenotypic effects of genetic variants: basic concepts and some application examples in Drosophila nervous system genes. J Neurogenet 2017; 31:307-319. [DOI: 10.1080/01677063.2017.1398241] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 10/25/2022]
Affiliation(s)
- Alejandro Sánchez-Gracia
- Departament de Genètica, Microbiologia i Estadística and Institut de Recerca de la Biodiversitat (IRBio), Facultat de Biologia, Universitat de Barcelona, Barcelona, Spain
- Sara Guirao-Rico
- Center for Research in Agricultural Genomics (CRAG) CSIC-IRTA-UAB-UB, Bellaterra, Spain
- Silvia Hinojosa-Alvarez
- Departament de Genètica, Microbiologia i Estadística and Institut de Recerca de la Biodiversitat (IRBio), Facultat de Biologia, Universitat de Barcelona, Barcelona, Spain
- Julio Rozas
- Departament de Genètica, Microbiologia i Estadística and Institut de Recerca de la Biodiversitat (IRBio), Facultat de Biologia, Universitat de Barcelona, Barcelona, Spain
|
42
|
Lagator M, Sarikas S, Acar H, Bollback JP, Guet CC. Regulatory network structure determines patterns of intermolecular epistasis. eLife 2017; 6:28921. [PMID: 29130883 PMCID: PMC5699867 DOI: 10.7554/elife.28921] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 05/23/2017] [Accepted: 11/10/2017] [Indexed: 12/29/2022]
Abstract
Most phenotypes are determined by molecular systems composed of specifically interacting molecules. However, unlike for individual components, little is known about the distributions of mutational effects of molecular systems as a whole. We ask how the distribution of mutational effects of a transcriptional regulatory system differs from the distributions of its components, by first independently, and then simultaneously, mutating a transcription factor and the associated promoter it represses. We find that the system distribution exhibits increased phenotypic variation compared to individual component distributions - an effect arising from intermolecular epistasis between the transcription factor and its DNA-binding site. In large part, this epistasis can be qualitatively attributed to the structure of the transcriptional regulatory system and could therefore be a common feature in prokaryotes. Counter-intuitively, intermolecular epistasis can alleviate the constraints of individual components, thereby increasing phenotypic variation that selection could act on and facilitating adaptive evolution.
Affiliation(s)
- Mato Lagator
- Institute of Science and Technology Austria, Klosterneuburg, Austria
- Srdjan Sarikas
- Institute of Science and Technology Austria, Klosterneuburg, Austria
- Hande Acar
- Institute of Science and Technology Austria, Klosterneuburg, Austria
- Jonathan P Bollback
- Institute of Science and Technology Austria, Klosterneuburg, Austria; Institute of Integrative Biology, University of Liverpool, Merseyside, United Kingdom
- Călin C Guet
- Institute of Science and Technology Austria, Klosterneuburg, Austria
|
43
|
Pressman A, Moretti JE, Campbell GW, Müller UF, Chen IA. Analysis of in vitro evolution reveals the underlying distribution of catalytic activity among random sequences. Nucleic Acids Res 2017. [PMID: 28645146 PMCID: PMC5737207 DOI: 10.1093/nar/gkx540] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Indexed: 01/24/2023]
Abstract
The emergence of catalytic RNA is believed to have been a key event during the origin of life. Understanding how catalytic activity is distributed across random sequences is fundamental to estimating the probability that catalytic sequences would emerge. Here, we analyze the in vitro evolution of triphosphorylating ribozymes and translate their fitnesses into absolute estimates of catalytic activity for hundreds of ribozyme families. The analysis efficiently identified highly active ribozymes and estimated catalytic activity with good accuracy. The evolutionary dynamics follow Fisher's Fundamental Theorem of Natural Selection and a corollary, permitting retrospective inference of the distribution of fitness and activity in the random sequence pool for the first time. The frequency distribution of rate constants appears to be log-normal, with a surprisingly steep dropoff at higher activity, consistent with a mechanism for the emergence of activity as the product of many independent contributions.
Affiliation(s)
- Abe Pressman
- Department of Chemistry and Biochemistry 9510, University of California, Santa Barbara, CA 93106, USA; Program in Chemical Engineering, University of California, Santa Barbara, CA 93106, USA
- Janina E Moretti
- Department of Chemistry and Biochemistry, University of California, San Diego, CA 92093, USA
- Gregory W Campbell
- Department of Chemistry and Biochemistry 9510, University of California, Santa Barbara, CA 93106, USA; Program in Biomolecular Sciences and Engineering, University of California, Santa Barbara, CA 93106, USA
- Ulrich F Müller
- Department of Chemistry and Biochemistry, University of California, San Diego, CA 92093, USA
- Irene A Chen
- Department of Chemistry and Biochemistry 9510, University of California, Santa Barbara, CA 93106, USA; Program in Biomolecular Sciences and Engineering, University of California, Santa Barbara, CA 93106, USA
|
44
|
Inference of Distribution of Fitness Effects and Proportion of Adaptive Substitutions from Polymorphism Data. Genetics 2017; 207:1103-1119. [PMID: 28951530 PMCID: PMC5676230 DOI: 10.1534/genetics.117.300323] [Citation(s) in RCA: 76] [Impact Index Per Article: 10.9] [Received: 07/05/2017] [Accepted: 09/13/2017] [Indexed: 11/18/2022]
Abstract
The distribution of fitness effects (DFE) encompasses the fraction of deleterious, neutral, and beneficial mutations. It conditions the evolutionary trajectory of populations, as well as the rate of adaptive molecular evolution (α). Inferring DFE and α from patterns of polymorphism, as given through the site frequency spectrum (SFS) and divergence data, has been a longstanding goal of evolutionary genetics. A widespread assumption shared by previous inference methods is that beneficial mutations only contribute negligibly to the polymorphism data. Hence, a DFE comprising only deleterious mutations tends to be estimated from SFS data, and α is then predicted by contrasting the SFS with divergence data from an outgroup. We develop a hierarchical probabilistic framework that extends previous methods to infer DFE and α from polymorphism data alone. We use extensive simulations to examine the performance of our method. While an outgroup is still needed to obtain an unfolded SFS, we show that both a DFE, comprising both deleterious and beneficial mutations, and α can be inferred without using divergence data. We also show that not accounting for the contribution of beneficial mutations to polymorphism data leads to substantially biased estimates of the DFE and α. We compare our framework with one of the most widely used inference methods available and apply it on a recently published chimpanzee exome data set.
|
45
|
Abstract
Organisms often encounter stressful conditions, some of which damage their DNA. In response, some organisms show a high expression of error-prone DNA repair machinery, causing a temporary increase in the genome-wide mutation rate. Although we now have a detailed map of the molecular mechanisms underlying such stress-induced mutagenesis (SIM), it has been hotly debated whether SIM alters evolutionary dynamics. Key to this controversy is our poor understanding about which stresses increase mutagenesis and their long-term consequences for adaptation. In a new study with Escherichia coli, Maharjan and Ferenci show that while only some nutritional stresses (phosphorous and carbon limitation) increase total mutation rates, each stress generates a unique spectrum of mutations. Their results suggest the potential for specific stresses to shape evolutionary dynamics and highlight the necessity for explicit tests of the long-term evolutionary impacts of SIM.
46
Lind PA, Arvidsson L, Berg OG, Andersson DI. Variation in Mutational Robustness between Different Proteins and the Predictability of Fitness Effects. Mol Biol Evol 2017; 34:408-418. [PMID: 28025272 DOI: 10.1093/molbev/msw239] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Indexed: 01/08/2023]
Abstract
Random mutations in genes from disparate protein classes may have different distributions of fitness effects (DFEs) depending on different structural, functional, and evolutionary constraints. We measured the fitness effects of 156 single mutations in the genes encoding AraC (transcription factor), AraD (enzyme), and AraE (transporter) used for bacterial growth on L-arabinose. Despite their different molecular functions, these genes all had bimodal DFEs with most mutations either being neutral or strongly deleterious, providing a general expectation for the DFE. This contrasts with the unimodal DFEs previously obtained for ribosomal protein genes where most mutations were slightly deleterious. Based on theoretical considerations, we suggest that the 33-fold higher average mutational robustness of ribosomal proteins is due to stronger selection for reduced costs of translational and transcriptional errors. Whereas the large majority of synonymous mutations were deleterious for ribosomal protein genes, no fitness effects could be detected for the AraCDE genes. Four mutations in AraC and AraE increased fitness, suggesting that slightly advantageous mutations make up a significant fraction of the DFE, but that they often escape detection due to the limited sensitivity of commonly used fitness assays. We show that the fitness effects of amino acid substitutions can be predicted based on evolutionary conservation, but that weakly deleterious mutations are less reliably detected. This suggests that large-effect mutations and the fraction of highly deleterious mutations can be computationally predicted, but that experiments are required to characterize the DFE close to neutrality, where many mutations ultimately fixed in a population will occur.
Affiliation(s)
- Peter A Lind
  - Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden; Department of Molecular Biology, Umeå University, Umeå, Sweden
- Lars Arvidsson
  - Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
- Otto G Berg
  - Department of Cell and Molecular Biology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Dan I Andersson
  - Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
47
Inference of the Distribution of Selection Coefficients for New Nonsynonymous Mutations Using Large Samples. Genetics 2017; 206:345-361. [PMID: 28249985 PMCID: PMC5419480 DOI: 10.1534/genetics.116.197145] [Citation(s) in RCA: 107] [Impact Index Per Article: 15.3] [Received: 10/23/2016] [Accepted: 02/14/2017] [Indexed: 12/23/2022]
Abstract
The distribution of fitness effects (DFE) has considerable importance in population genetics. To date, estimates of the DFE come from studies using a small number of individuals. Thus, estimates of the proportion of moderately to strongly deleterious new mutations may be unreliable because such variants are unlikely to be segregating in the data. Additionally, the true functional form of the DFE is unknown, and estimates of the DFE differ significantly between studies. Here we present a flexible and computationally tractable method, called Fit∂a∂i, to estimate the DFE of new mutations using the site frequency spectrum from a large number of individuals. We apply our approach to the frequency spectrum of 1300 Europeans from the Exome Sequencing Project ESP6400 data set, 1298 Danes from the LuCamp data set, and 432 Europeans from the 1000 Genomes Project to estimate the DFE of deleterious nonsynonymous mutations. We infer significantly fewer (0.38-0.84 fold) strongly deleterious mutations with selection coefficient |s| > 0.01 and more (1.24-1.43 fold) weakly deleterious mutations with selection coefficient |s| < 0.001 compared to previous estimates. Furthermore, a DFE that is a mixture distribution of a point mass at neutrality plus a gamma distribution fits better than a gamma distribution in two of the three data sets. Our results suggest that nearly neutral forces play a larger role in human evolution than previously thought.
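As an illustration of how a fitted DFE is summarized into the |s| bins quoted above, the sketch below computes the proportion of new mutations falling in |s| < 0.001, 0.001 to 0.01, and > 0.01 under a gamma distribution, with an optional point mass at neutrality. It does not reproduce the Fit∂a∂i inference itself; the shape, scale, and neutral-proportion values are hypothetical placeholders rather than estimates from the paper, and the gamma CDF is computed with a standard power series so that only the Python standard library is needed.

```python
import math


def gamma_cdf(x, shape, scale):
    """CDF of a gamma distribution: the regularized lower incomplete gamma
    function P(shape, x / scale), evaluated with its power series."""
    if x <= 0:
        return 0.0
    z = x / scale
    term = 1.0 / shape          # n = 0 term of sum z^n / (a (a+1) ... (a+n))
    total = term
    n = 1
    while abs(term) > 1e-15 * abs(total):
        term *= z / (shape + n)
        total += term
        n += 1
    return total * math.exp(-z + shape * math.log(z) - math.lgamma(shape))


def dfe_bin_proportions(shape, scale, p_neutral=0.0, edges=(1e-3, 1e-2)):
    """Proportion of new mutations in each |s| bin for a gamma DFE,
    optionally mixed with a point mass p_neutral at |s| = 0."""
    cdf = [0.0] + [gamma_cdf(e, shape, scale) for e in edges] + [1.0]
    props = [(1.0 - p_neutral) * (hi - lo) for lo, hi in zip(cdf, cdf[1:])]
    props[0] += p_neutral       # neutral mutations fall in the lowest bin
    return props


# Hypothetical parameters (assumptions for illustration only).
props_gamma = dfe_bin_proportions(shape=0.2, scale=0.1)
props_mix = dfe_bin_proportions(shape=0.2, scale=0.1, p_neutral=0.1)
print(props_gamma)
print(props_mix)
```

Comparing the two outputs shows how adding a neutral point mass shifts weight toward the weakly deleterious bin while leaving the gamma component's shape unchanged.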
48
A Statistical Guide to the Design of Deep Mutational Scanning Experiments. Genetics 2016; 204:77-87. [PMID: 27412710 DOI: 10.1534/genetics.116.190462] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Received: 04/14/2016] [Accepted: 06/29/2016] [Indexed: 12/21/2022]
Abstract
The characterization of the distribution of mutational effects is a key goal in evolutionary biology. Recently developed deep-sequencing approaches allow for accurate and simultaneous estimation of the fitness effects of hundreds of engineered mutations by monitoring their relative abundance across time points in a single bulk competition. Naturally, the achievable resolution of the estimated fitness effects depends on the specific experimental setup, the organism and type of mutations studied, and the sequencing technology utilized, among other factors. By means of analytical approximations and simulations, we provide guidelines for optimizing time-sampled deep-sequencing bulk competition experiments, focusing on the number of mutants, the sequencing depth, and the number of sampled time points. Our analytical results show that sampling more time points together with extending the duration of the experiment improves the achievable precision disproportionately compared with increasing the sequencing depth or reducing the number of competing mutants. Even if the duration of the experiment is fixed, sampling more time points and clustering these at the beginning and the end of the experiment increase experimental power and allow for efficient and precise assessment of the entire range of selection coefficients. Finally, we provide a formula for calculating the 95%-confidence interval for the measurement error estimate, which we implement as an interactive web tool. This allows for quantification of the maximum expected a priori precision of the experimental setup, as well as for a statistical threshold for determining deviations from neutrality for specific selection coefficient estimates.
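A common way to turn time-sampled abundances from a bulk competition into a selection coefficient is to regress the log ratio of mutant to reference counts on time. The sketch below shows that generic log-linear estimator under the assumption of exponential growth; it is not the paper's estimator or its confidence-interval formula, and the function name and synthetic counts are hypothetical.

```python
import math


def estimate_selection_coefficient(times, mutant_counts, reference_counts):
    """Least-squares slope of log(mutant / reference) against time.

    Under exponential growth the slope approximates the selection
    coefficient per unit of time (for example, per generation).
    """
    ys = [math.log(m / r) for m, r in zip(mutant_counts, reference_counts)]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(ys) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, ys))
    return sxy / sxx


# Noise-free synthetic data: a mutant with s = -0.05 per generation
# competing against a reference whose abundance stays constant.
true_s = -0.05
times = [0, 2, 4, 6, 8]
reference = [10000.0] * len(times)
mutant = [1000.0 * math.exp(true_s * t) for t in times]

s_hat = estimate_selection_coefficient(times, mutant, reference)
print(round(s_hat, 6))  # -0.05
```

With real sequencing counts the log ratios are noisy, which is where the number and placement of sampled time points discussed in the abstract start to matter.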
49
Vale PF, Lafforgue G, Gatchitch F, Gardan R, Moineau S, Gandon S. Costs of CRISPR-Cas-mediated resistance in Streptococcus thermophilus. Proc Biol Sci 2016. [PMID: 26224708 DOI: 10.1098/rspb.2015.1270] [Citation(s) in RCA: 80] [Impact Index Per Article: 10.0] [Indexed: 12/26/2022]
Abstract
CRISPR-Cas is a form of adaptive sequence-specific immunity in microbes. This system offers unique opportunities for the study of coevolution between bacteria and their viral pathogens, bacteriophages. A full understanding of the coevolutionary dynamics of CRISPR-Cas requires knowing the magnitude of the cost of resisting infection. Here, using the gram-positive bacterium Streptococcus thermophilus and its associated virulent phage 2972, a well-established model system harbouring at least two type II functional CRISPR-Cas systems, we obtained different fitness measures based on growth assays in isolation or in pairwise competition. We measured the fitness cost associated with different components of this adaptive immune system: the cost of Cas protein expression, the constitutive cost of increasing immune memory through additional spacers, and the conditional costs of immunity during phage exposure. We found that Cas protein expression is particularly costly, as Cas-deficient mutants achieved higher competitive abilities than the wild-type strain with functional Cas proteins. Increasing immune memory by acquiring up to four phage-derived spacers was not associated with fitness costs. In addition, the activation of the CRISPR-Cas system during phage exposure induces significant but small fitness costs. Together these results suggest that the costs of the CRISPR-Cas system arise mainly due to the maintenance of the defence system. We discuss the implications of these results for the evolution of CRISPR-Cas-mediated immunity.
Affiliation(s)
- Pedro F Vale
  - Centre for Immunity, Infection, and Evolution, Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Ashworth Laboratories, West Mains Road, Edinburgh EH9 3JT, UK
- Guillaume Lafforgue
  - CEFE UMR 5175, CNRS-Université de Montpellier, Université Paul-Valéry Montpellier, EPHE, 1919, route de Mende 34293 Montpellier Cedex 5, France
- Francois Gatchitch
  - CEFE UMR 5175, CNRS-Université de Montpellier, Université Paul-Valéry Montpellier, EPHE, 1919, route de Mende 34293 Montpellier Cedex 5, France
- Sylvain Moineau
  - GREB and Félix d'Hérelle Reference Center for Bacterial Viruses, Faculté de médecine dentaire, Québec, Canada G1V 0A6; Département de biochimie, de microbiologie et de bio-informatique and PROTEO, Faculté des sciences et de génie, Université Laval, Québec, Canada G1V 0A6
- Sylvain Gandon
  - CEFE UMR 5175, CNRS-Université de Montpellier, Université Paul-Valéry Montpellier, EPHE, 1919, route de Mende 34293 Montpellier Cedex 5, France
50
Boyer S, Biswas D, Kumar Soshee A, Scaramozzino N, Nizak C, Rivoire O. Hierarchy and extremes in selections from pools of randomized proteins. Proc Natl Acad Sci U S A 2016; 113:3482-7. [PMID: 26969726 PMCID: PMC4822605 DOI: 10.1073/pnas.1517813113] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Indexed: 11/18/2022]
Abstract
Variation and selection are the core principles of Darwinian evolution, but quantitatively relating the diversity of a population to its capacity to respond to selection is challenging. Here, we examine this problem at a molecular level in the context of populations of partially randomized proteins selected for binding to well-defined targets. We built several minimal protein libraries, screened them in vitro by phage display, and analyzed their response to selection by high-throughput sequencing. A statistical analysis of the results reveals two main findings. First, libraries with the same sequence diversity but built around different "frameworks" typically have vastly different responses; second, the distribution of responses of the best binders in a library follows a simple scaling law. We show how an elementary probabilistic model based on extreme value theory rationalizes the latter finding. Our results have implications for designing synthetic protein libraries, estimating the density of functional biomolecules in sequence space, characterizing diversity in natural populations, and experimentally investigating evolvability (i.e., the potential for future evolution).
Affiliation(s)
- Sébastien Boyer
  - Laboratoire Interdisciplinaire de Physique, CNRS and Université Grenoble Alpes, 38000 Grenoble, France
- Dipanwita Biswas
  - Laboratoire Interdisciplinaire de Physique, CNRS and Université Grenoble Alpes, 38000 Grenoble, France
- Ananda Kumar Soshee
  - Laboratoire Interdisciplinaire de Physique, CNRS and Université Grenoble Alpes, 38000 Grenoble, France
- Natale Scaramozzino
  - Laboratoire Interdisciplinaire de Physique, CNRS and Université Grenoble Alpes, 38000 Grenoble, France
- Clément Nizak
  - Laboratoire de Biochimie, Chimie-Biologie-Innovation UMR8231, CNRS and Ecole Supérieure de Physique et Chimie Industrielles ParisTech, Paris Sciences & Lettres Research University, 75005 Paris, France
- Olivier Rivoire
  - Laboratoire Interdisciplinaire de Physique, CNRS and Université Grenoble Alpes, 38000 Grenoble, France