1
Latrille T, Joseph J, Hartasánchez DA, Salamin N. Estimating the proportion of beneficial mutations that are not adaptive in mammals. PLoS Genet 2024; 20:e1011536. [PMID: 39724093 PMCID: PMC11709321 DOI: 10.1371/journal.pgen.1011536] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2024] [Revised: 01/08/2025] [Accepted: 12/10/2024] [Indexed: 12/28/2024]
Abstract
Mutations can be beneficial by bringing innovation to their bearer, allowing them to adapt to environmental change. These mutations are typically unpredictable since they respond to an unforeseen change in the environment. However, mutations can also be beneficial because they are simply restoring a state of higher fitness that was lost due to genetic drift in a stable environment. In contrast to adaptive mutations, these beneficial non-adaptive mutations can be predicted if the underlying fitness landscape is stable and known. The contribution of such non-adaptive mutations to molecular evolution has been widely neglected mainly because their detection is very challenging. We have here reconstructed protein-coding gene fitness landscapes shared between mammals, using mutation-selection models and multi-species alignments across 87 mammals. These fitness landscapes have allowed us to predict the fitness effect of polymorphisms found in 28 mammalian populations. Using methods that quantify selection at the population level, we have confirmed that beneficial non-adaptive mutations are indeed positively selected in extant populations. Our work confirms that deleterious substitutions are accumulating in mammals and are being reverted, generating a balance in which genomes are damaged and restored simultaneously at different loci. We observe that beneficial non-adaptive mutations represent between 15% and 45% of all beneficial mutations in 24 of 28 populations analyzed, suggesting that a substantial part of ongoing positive selection is not driven solely by adaptation to environmental change in mammals.
Affiliation(s)
- Thibault Latrille
- Department of Computational Biology, Université de Lausanne, Lausanne, Switzerland
- Julien Joseph
- Laboratoire de Biométrie et Biologie Evolutive, UMR5558, Université Lyon 1, Villeurbanne, France
- Nicolas Salamin
- Department of Computational Biology, Université de Lausanne, Lausanne, Switzerland
2
Huang S, Girdner J, Nguyen LP, Sandoval C, Fregoso OI, Enard D, Li MMH. Positive selection analyses identify a single WWE domain residue that shapes ZAP into a more potent restriction factor against alphaviruses. PLoS Pathog 2024; 20:e1011836. [PMID: 39207950 PMCID: PMC11361444 DOI: 10.1371/journal.ppat.1011836] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2023] [Accepted: 07/24/2024] [Indexed: 09/04/2024]
Abstract
The host interferon pathway upregulates intrinsic restriction factors in response to viral infection. Many of them block a diverse range of viruses, suggesting that their antiviral functions might have been shaped by multiple viral families during evolution. Host-virus conflicts have led to the rapid adaptation of host and viral proteins at their interaction hotspots. Hence, we can use evolutionary genetic analyses to elucidate antiviral mechanisms and domain functions of restriction factors. Zinc finger antiviral protein (ZAP) is a restriction factor against RNA viruses such as alphaviruses, in addition to other RNA, retro-, and DNA viruses, yet its precise antiviral mechanism is not fully characterized. Previously, an analysis of 13 primate ZAP orthologs identified three positively selected residues in the poly(ADP-ribose) polymerase-like domain. However, selective pressure from ancient alphaviruses and others likely drove ZAP adaptation in a wider representation of mammals. We performed positive selection analyses in 261 mammalian ZAP using more robust methods with complementary strengths and identified seven positively selected sites in all domains of the protein. We generated ZAP inducible cell lines in which the positively selected residues of ZAP are mutated and tested their effects on alphavirus replication and known ZAP activities. Interestingly, the mutant in the second WWE domain of ZAP (N658A) is dramatically better than wild-type ZAP at blocking replication of Sindbis virus and other ZAP-sensitive alphaviruses due to enhanced viral translation inhibition. The N658A mutant is adjacent to the previously reported poly(ADP-ribose) (PAR) binding pocket, but surprisingly has reduced binding to PAR. In summary, the second WWE domain is critical for engineering a more potent ZAP and fluctuations in PAR binding modulate ZAP antiviral activity. 
Our study has the potential to unravel the role of ADP-ribosylation in the host innate immune defense and viral evolutionary strategies that antagonize this post-translational modification.
Affiliation(s)
- Serina Huang
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, California, United States of America
- Juliana Girdner
- Department of Chemistry and Biochemistry, University of California, Los Angeles, California, United States of America
- Department of Microbiology, Immunology and Molecular Genetics, University of California, Los Angeles, California, United States of America
- LeAnn P. Nguyen
- Department of Microbiology, Immunology and Molecular Genetics, University of California, Los Angeles, California, United States of America
- Molecular Biology Institute, University of California, Los Angeles, California, United States of America
- Carina Sandoval
- Department of Microbiology, Immunology and Molecular Genetics, University of California, Los Angeles, California, United States of America
- Oliver I. Fregoso
- Department of Microbiology, Immunology and Molecular Genetics, University of California, Los Angeles, California, United States of America
- Molecular Biology Institute, University of California, Los Angeles, California, United States of America
- David Enard
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, Arizona, United States of America
- Melody M. H. Li
- Department of Microbiology, Immunology and Molecular Genetics, University of California, Los Angeles, California, United States of America
- Molecular Biology Institute, University of California, Los Angeles, California, United States of America
- AIDS Institute, David Geffen School of Medicine, University of California, Los Angeles, California, United States of America
3
Huang S, Girdner J, Nguyen LP, Enard D, Li MM. Positive selection analyses identify a single WWE domain residue that shapes ZAP into a super restriction factor. bioRxiv 2023:2023.11.20.567784. [PMID: 38045310 PMCID: PMC10690157 DOI: 10.1101/2023.11.20.567784] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/05/2023]
Abstract
The host interferon pathway upregulates intrinsic restriction factors in response to viral infection. Many of them block a diverse range of viruses, suggesting that their antiviral functions might have been shaped by multiple viral families during evolution. Virus-host conflicts have led to the rapid adaptation of viral and host proteins at their interaction hotspots. Hence, we can use evolutionary genetic analyses to elucidate antiviral mechanisms and domain functions of restriction factors. Zinc finger antiviral protein (ZAP) is a restriction factor against RNA viruses such as alphaviruses, in addition to other RNA, retro-, and DNA viruses, yet its precise antiviral mechanism is not fully characterized. Previously, an analysis of 13 primate ZAP identified 3 positively selected residues in the poly(ADP-ribose) polymerase-like domain. However, selective pressure from ancient alphaviruses and others likely drove ZAP adaptation in a wider representation of mammals. We performed positive selection analyses in 261 mammalian ZAP using more robust methods with complementary strengths and identified 7 positively selected sites in all domains of the protein. We generated ZAP inducible cell lines in which the positively selected residues of ZAP are mutated and tested their effects on alphavirus replication and known ZAP activities. Interestingly, the mutant in the second WWE domain of ZAP (N658A) is dramatically better than wild-type ZAP at blocking replication of Sindbis virus and other ZAP-sensitive alphaviruses due to enhanced viral translation inhibition. The N658A mutant inhabits the space surrounding the previously reported poly(ADP-ribose) (PAR) binding pocket, but surprisingly has reduced binding to PAR. In summary, the second WWE domain is critical for engineering a super restrictor ZAP and fluctuations in PAR binding modulate ZAP antiviral activity. 
Our study has the potential to unravel the role of ADP-ribosylation in the host innate immune defense and viral evolutionary strategies that antagonize this post-translational modification.
Affiliation(s)
- Serina Huang
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, CA, USA
- Juliana Girdner
- Department of Chemistry and Biochemistry, University of California, Los Angeles, CA, USA
- Department of Microbiology, Immunology and Molecular Genetics, University of California, Los Angeles, CA, USA
- LeAnn P Nguyen
- Department of Microbiology, Immunology and Molecular Genetics, University of California, Los Angeles, CA, USA
- Molecular Biology Institute, University of California, Los Angeles, Los Angeles, CA, USA
- David Enard
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ, USA
- Melody MH Li
- Department of Microbiology, Immunology and Molecular Genetics, University of California, Los Angeles, CA, USA
- Molecular Biology Institute, University of California, Los Angeles, Los Angeles, CA, USA
- AIDS Institute, David Geffen School of Medicine, University of California, Los Angeles, CA, USA
4
Lucaci AG, Zehr JD, Enard D, Thornton JW, Kosakovsky Pond SL. Evolutionary Shortcuts via Multinucleotide Substitutions and Their Impact on Natural Selection Analyses. Mol Biol Evol 2023; 40:msad150. [PMID: 37395787 PMCID: PMC10336034 DOI: 10.1093/molbev/msad150] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 02/27/2023] [Revised: 06/15/2023] [Accepted: 06/26/2023] [Indexed: 07/04/2023]
Abstract
Inference and interpretation of evolutionary processes, in particular of the types and targets of natural selection affecting coding sequences, are critically influenced by the assumptions built into statistical models and tests. If certain aspects of the substitution process (even when they are not of direct interest) are presumed absent or are modeled with too crude of a simplification, estimates of key model parameters can become biased, often systematically, and lead to poor statistical performance. Previous work established that failing to accommodate multinucleotide (or multihit, MH) substitutions strongly biases dN/dS-based inference towards false-positive inferences of diversifying episodic selection, as does failing to model variation in the rate of synonymous substitution (SRV) among sites. Here, we develop an integrated analytical framework and software tools to simultaneously incorporate these sources of evolutionary complexity into selection analyses. We found that both MH and SRV are ubiquitous in empirical alignments, and incorporating them has a strong effect on whether or not positive selection is detected (1.4-fold reduction) and on the distributions of inferred evolutionary rates. With simulation studies, we show that this effect is not attributable to reduced statistical power caused by using a more complex model. After a detailed examination of 21 benchmark alignments and a new high-resolution analysis showing which parts of the alignment provide support for positive selection, we show that MH substitutions occurring along shorter branches in the tree explain a significant fraction of discrepant results in selection detection. Our results add to the growing body of literature which examines decades-old modeling assumptions (including MH) and finds them to be problematic for comparative genomic data analysis. 
Because multinucleotide substitutions have a significant impact on natural selection detection even at the level of an entire gene, we recommend that selection analyses of this type consider their inclusion as a matter of routine. To facilitate this procedure, we developed, implemented, and benchmarked a simple and well-performing model testing selection detection framework able to screen an alignment for positive selection with two biologically important confounding processes: site-to-site synonymous rate variation, and multinucleotide instantaneous substitutions.
Affiliation(s)
- Alexander G Lucaci
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA, USA
- Jordan D Zehr
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA, USA
- David Enard
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, Arizona
- Joseph W Thornton
- Department of Human Genetics, University of Chicago, Chicago, Illinois
- Department of Ecology & Evolution, University of Chicago, Chicago, Illinois
5
Latrille T, Rodrigue N, Lartillot N. Genes and sites under adaptation at the phylogenetic scale also exhibit adaptation at the population-genetic scale. Proc Natl Acad Sci U S A 2023; 120:e2214977120. [PMID: 36897968 PMCID: PMC10089192 DOI: 10.1073/pnas.2214977120] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 09/26/2022] [Accepted: 02/11/2023] [Indexed: 03/12/2023]
Abstract
Adaptation in protein-coding sequences can be detected from multiple sequence alignments across species or alternatively by leveraging polymorphism data within a population. Across species, quantification of the adaptive rate relies on phylogenetic codon models, classically formulated in terms of the ratio of nonsynonymous over synonymous substitution rates. Evidence of an accelerated nonsynonymous substitution rate is considered a signature of pervasive adaptation. However, because of the background of purifying selection, these models are potentially limited in their sensitivity. Recent developments have led to more sophisticated mutation-selection codon models aimed at making a more detailed quantitative assessment of the interplay between mutation, purifying, and positive selection. In this study, we conducted a large-scale exome-wide analysis of placental mammals with mutation-selection models, assessing their performance at detecting proteins and sites under adaptation. Importantly, mutation-selection codon models are based on a population-genetic formalism and thus are directly comparable to the McDonald and Kreitman test at the population level to quantify adaptation. Taking advantage of this relationship between phylogenetic and population genetics analyses, we integrated divergence and polymorphism data across the entire exome for 29 populations across 7 genera and showed that proteins and sites detected to be under adaptation at the phylogenetic scale are also under adaptation at the population-genetic scale. Altogether, our exome-wide analysis shows that phylogenetic mutation-selection codon models and the population-genetic test of adaptation can be reconciled and are congruent, paving the way for integrative models and analyses across individuals and populations.
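As a rough illustration of the population-level quantity mentioned in this abstract, the sketch below computes the classical McDonald and Kreitman estimate of the proportion of adaptive nonsynonymous substitutions, alpha = 1 - (Ds * Pn) / (Dn * Ps). This is the textbook form of the estimator and is an assumption here; it is not necessarily the exact variant used in the paper. The function name `mk_alpha` and the counts are hypothetical.

```python
def mk_alpha(dn: int, ds: int, pn: int, ps: int) -> float:
    """Classical McDonald-Kreitman estimate of the proportion of
    adaptive nonsynonymous substitutions.

    dn, ds: nonsynonymous / synonymous divergence counts (between species)
    pn, ps: nonsynonymous / synonymous polymorphism counts (within a population)
    """
    if dn == 0 or ps == 0:
        raise ValueError("alpha is undefined when Dn or Ps is zero")
    return 1.0 - (ds * pn) / (dn * ps)


# Hypothetical counts for one gene set (not taken from the paper).
alpha = mk_alpha(dn=60, ds=100, pn=30, ps=100)
print(f"alpha = {alpha:.2f}")
```

An alpha near zero is what neutral and purifying processes alone would predict, while a positive alpha indicates an excess of nonsynonymous divergence relative to polymorphism.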
Affiliation(s)
- Thibault Latrille
- Université de Lyon, Université Lyon 1, CNRS, VetAgro Sup, Laboratoire de Biométrie et Biologie Evolutive, UMR5558, 69100 Villeurbanne, France
- École Normale Supérieure de Lyon, Université de Lyon, 69342 Lyon, France
- Department of Computational Biology, Université de Lausanne, 1015 Lausanne, Switzerland
- Nicolas Rodrigue
- Department of Biology, Institute of Biochemistry, and School of Mathematics and Statistics, Carleton University, K1S 5B6 Ottawa, Canada
- Nicolas Lartillot
- Université de Lyon, Université Lyon 1, CNRS, VetAgro Sup, Laboratoire de Biométrie et Biologie Evolutive, UMR5558, 69100 Villeurbanne, France
6
Bricout R, Weil D, Stroebel D, Genovesio A, Roest Crollius H. Evolution is not Uniform Along Coding Sequences. Mol Biol Evol 2023; 40:7060063. [PMID: 36857092 PMCID: PMC10025431 DOI: 10.1093/molbev/msad042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2022] [Revised: 02/15/2023] [Accepted: 02/16/2023] [Indexed: 03/02/2023]
Abstract
Amino acids evolve at different speeds within protein sequences, because their functional and structural roles are different. Notably, amino acids located at the surface of proteins are known to evolve more rapidly than those in the core. In particular, amino acids at the N- and C-termini of protein sequences are likely to be more exposed than those at the core of the folded protein due to their location in the peptidic chain, and they are known to be less structured. For these reasons, we would expect that amino acids located at protein termini would evolve faster than residues located inside the chain. Here we test this hypothesis and find that amino acids evolve almost twice as fast at protein termini compared with those in the center, hinting at a strong topological bias along the sequence length. We further show that the distribution of solvent-accessible residues and functional domains in proteins readily explains how structural and functional constraints are weaker at their termini, leading to the observed excess of amino acid substitutions. Finally, we show that the specific evolutionary rates at protein termini may have direct consequences, notably misleading in silico methods used to infer sites under positive selection within genes. These results suggest that accounting for positional information should improve evolutionary models.
Affiliation(s)
- Raphaël Bricout
- Département de biologie, École normale supérieure, Institut de Biologie de l'ENS (IBENS), CNRS, INSERM, Paris, France
- Dominique Weil
- Laboratoire de Biologie du Développement, Sorbonne Université, CNRS, Institut de Biologie Paris-Seine (IBPS), Paris, France
- David Stroebel
- Département de biologie, École normale supérieure, Institut de Biologie de l'ENS (IBENS), CNRS, INSERM, Paris, France
- Auguste Genovesio
- Département de biologie, École normale supérieure, Institut de Biologie de l'ENS (IBENS), CNRS, INSERM, Paris, France
- Hugues Roest Crollius
- Département de biologie, École normale supérieure, Institut de Biologie de l'ENS (IBENS), CNRS, INSERM, Paris, France
7
Duchemin L, Lanore V, Veber P, Boussau B. Evaluation of Methods to Detect Shifts in Directional Selection at the Genome Scale. Mol Biol Evol 2022; 40:6889995. [PMID: 36510704 PMCID: PMC9940701 DOI: 10.1093/molbev/msac247] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/16/2022] [Revised: 10/24/2022] [Accepted: 10/26/2022] [Indexed: 12/15/2022]
Abstract
Identifying the footprints of selection in coding sequences can inform about the importance and function of individual sites. Analyses of the ratio of nonsynonymous to synonymous substitutions (dN/dS) have been widely used to pinpoint changes in the intensity of selection, but cannot distinguish them from changes in the direction of selection, that is, changes in the fitness of specific amino acids at a given position. A few methods that rely on amino-acid profiles to detect changes in directional selection have been designed, but their performances have not been well characterized. In this paper, we investigate the performance of six of these methods. We evaluate them on simulations along empirical phylogenies in which transition events have been annotated and compare their ability to detect sites that have undergone changes in the direction or intensity of selection to that of a widely used dN/dS approach, codeml's branch-site model A. We show that all methods have reduced performance in the presence of biased gene conversion but not CpG hypermutability. The best profile method, Pelican, a new implementation of Tamuri AU, Hay AJ, Goldstein RA. (2009. Identifying changes in selective constraints: host shifts in influenza. PLoS Comput Biol. 5(11):e1000564), performs as well as codeml in a range of conditions except for detecting relaxations of selection, and performs better when tree length increases, or in the presence of persistent positive selection. It is fast, enabling genome-scale searches for site-wise changes in the direction of selection associated with phenotypic changes.
Affiliation(s)
- Vincent Lanore
- Laboratoire de Biométrie et Biologie Evolutive, Univ Lyon, Univ Lyon 1, CNRS, VetAgro Sup, UMR5558, Villeurbanne, France
- Philippe Veber
- Laboratoire de Biométrie et Biologie Evolutive, Univ Lyon, Univ Lyon 1, CNRS, VetAgro Sup, UMR5558, Villeurbanne, France
8
Latrille T, Lartillot N. An Improved Codon Modeling Approach for Accurate Estimation of the Mutation Bias. Mol Biol Evol 2022; 39:6503505. [PMID: 35021218 PMCID: PMC8831783 DOI: 10.1093/molbev/msac005] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/14/2022]
Abstract
Phylogenetic codon models are routinely used to characterize selective regimes in coding sequences. Their parametric design, however, is still a matter of debate, in particular concerning the question of how to account for differing nucleotide frequencies and substitution rates. This problem relates to the fact that nucleotide composition in protein-coding sequences is the result of the interactions between mutation and selection. In particular, because of the structure of the genetic code, the nucleotide composition differs between the three coding positions, with the third position showing a more extreme composition. Yet, phylogenetic codon models do not correctly capture this phenomenon and instead predict that the nucleotide composition should be the same for all three positions. Alternatively, some models allow for different nucleotide rates at the three positions, an approach conflating the effects of mutation and selection on nucleotide composition. In practice, it results in inaccurate estimation of the strength of selection. Conceptually, the problem comes from the fact that phylogenetic codon models do not correctly capture the fixation bias acting against the mutational pressure at the mutation–selection equilibrium. To address this problem and to more accurately identify mutation rates and selection strength, we present an improved codon modeling approach where the fixation rate is not seen as a scalar, but as a tensor. This approach gives an accurate representation of how mutation and selection oppose each other at equilibrium and yields a reliable estimate of the mutational process, while disentangling the mean fixation probabilities prevailing in different mutational directions.
Affiliation(s)
- T Latrille
- CNRS, Laboratoire de Biométrie et Biologie Évolutive UMR 5558, Université de Lyon, Université Lyon 1, Villeurbanne, F-69622, France
- École Normale Supérieure de Lyon, Université de Lyon, Université Lyon 1, Lyon, France
- N Lartillot
- CNRS, Laboratoire de Biométrie et Biologie Évolutive UMR 5558, Université de Lyon, Université Lyon 1, Villeurbanne, F-69622, France
9
Tamuri AU, Dos Reis M. A mutation-selection model of protein evolution under persistent positive selection. Mol Biol Evol 2021; 39:6409866. [PMID: 34694387 PMCID: PMC8760937 DOI: 10.1093/molbev/msab309] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/24/2022]
Abstract
We use first principles of population genetics to model the evolution of proteins under persistent positive selection (PPS). PPS may occur when organisms are subjected to persistent environmental change, during adaptive radiations, or in host–pathogen interactions. Our mutation–selection model indicates protein evolution under PPS is an irreversible Markov process, and thus proteins under PPS show a strongly asymmetrical distribution of selection coefficients among amino acid substitutions. Our model shows the criterion ω>1 (where ω is the ratio of nonsynonymous over synonymous codon substitution rates) to detect positive selection is conservative and indeed arbitrary, because in real proteins many mutations are highly deleterious and are removed by selection even at positively selected sites. We use a penalized-likelihood implementation of the PPS model to successfully detect PPS in plant RuBisCO and influenza HA proteins. By directly estimating selection coefficients at protein sites, our inference procedure bypasses the need for using ω as a surrogate measure of selection and improves our ability to detect molecular adaptation in proteins.
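The reasoning about ω>1 being conservative can be sketched numerically. The example below assumes the standard mutation-selection relation in which a mutation with scaled selection coefficient S fixes at a rate S / (1 - exp(-S)) relative to a neutral mutation, and it assumes equal mutation rates towards every amino acid. It is not the penalized-likelihood PPS model of the paper; the function names and the selection coefficients are hypothetical.

```python
import math


def relative_fixation_rate(S: float) -> float:
    """Fixation rate of a mutation with scaled selection coefficient S,
    relative to a neutral mutation: S / (1 - exp(-S))."""
    if abs(S) < 1e-8:
        return 1.0  # neutral limit
    # Same quantity written with expm1 for numerical accuracy near zero.
    return -S / math.expm1(-S)


def predicted_omega(scaled_coeffs):
    """Site-level omega under equal mutation rates (an assumption):
    the mean relative fixation rate over nonsynonymous mutations."""
    return sum(relative_fixation_rate(S) for S in scaled_coeffs) / len(scaled_coeffs)


# Hypothetical site: 17 strongly deleterious and 2 beneficial amino acid changes.
site = [-10.0] * 17 + [2.0] * 2
omega = predicted_omega(site)
print(f"omega = {omega:.3f}")
```

Even though two of the possible changes are positively selected, the average is dominated by the deleterious ones, so the site-level ω stays below 1 in this toy setting.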
Affiliation(s)
- Asif U Tamuri
- Centre for Advanced Research Computing, University College London, Gower St, London, WC1E 6BT, UK
- EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK
- Mario Dos Reis
- School of Biological and Behavioural Sciences, Queen Mary University of London, Mile End Road, London, E1 4NS, UK