1
Duan B, Qiu C, Sze SH, Kaplan C. Widespread epistasis shapes RNA Polymerase II active site function and evolution. bioRxiv 2025:2023.02.27.530048. [PMID: 36909581 PMCID: PMC10002619 DOI: 10.1101/2023.02.27.530048] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 03/04/2023]
Abstract
Multi-subunit RNA Polymerases (msRNAPs) are responsible for transcription in all kingdoms of life. These enzymes rely on dynamic, highly conserved active site domains such as the so-called "trigger loop" (TL) to accomplish steps in the transcription cycle. Mutations in the RNA polymerase II (Pol II) TL confer a spectrum of biochemical and genetic phenotypes that suggest two main classes, which decrease or increase catalysis or other nucleotide addition cycle (NAC) events. The Pol II active site relies on networks of residue interactions to function and mutations likely perturb these networks in ways that may alter mechanisms. We have undertaken a structural genetics approach to reveal residue interactions within and surrounding the Pol II TL - determining its "interaction landscape" - by deep mutational scanning in Saccharomyces cerevisiae Pol II. This analysis reveals connections between TL residues and surrounding domains, demonstrating that TL function is tightly coupled to its specific enzyme context.
Affiliation(s)
- Bingbing Duan
  - Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260, USA
- Chenxi Qiu
  - Department of Genetics, Harvard Medical School, Boston, MA 02215, USA
- Sing-Hoi Sze
  - Department of Computer Science and Engineering, Texas A&M University, College Station, TX 77843, USA
  - Department of Biochemistry & Biophysics, Texas A&M University, College Station, TX 77843, USA
- Craig Kaplan
  - Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260, USA
2
Latrille T, Joseph J, Hartasánchez DA, Salamin N. Estimating the proportion of beneficial mutations that are not adaptive in mammals. PLoS Genet 2024; 20:e1011536. [PMID: 39724093 PMCID: PMC11709321 DOI: 10.1371/journal.pgen.1011536] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2024] [Revised: 01/08/2025] [Accepted: 12/10/2024] [Indexed: 12/28/2024]
Abstract
Mutations can be beneficial by bringing innovation to their bearer, allowing them to adapt to environmental change. These mutations are typically unpredictable since they respond to an unforeseen change in the environment. However, mutations can also be beneficial because they are simply restoring a state of higher fitness that was lost due to genetic drift in a stable environment. In contrast to adaptive mutations, these beneficial non-adaptive mutations can be predicted if the underlying fitness landscape is stable and known. The contribution of such non-adaptive mutations to molecular evolution has been widely neglected, mainly because their detection is very challenging. We have here reconstructed protein-coding gene fitness landscapes shared between mammals, using mutation-selection models and multi-species alignments across 87 mammals. These fitness landscapes have allowed us to predict the fitness effect of polymorphisms found in 28 mammalian populations. Using methods that quantify selection at the population level, we have confirmed that beneficial non-adaptive mutations are indeed positively selected in extant populations. Our work confirms that deleterious substitutions are accumulating in mammals and are being reverted, generating a balance in which genomes are damaged and restored simultaneously at different loci. We observe that beneficial non-adaptive mutations represent between 15% and 45% of all beneficial mutations in 24 of 28 populations analyzed, suggesting that a substantial part of ongoing positive selection is not driven solely by adaptation to environmental change in mammals.
Affiliation(s)
- Thibault Latrille
  - Department of Computational Biology, Université de Lausanne, Lausanne, Switzerland
- Julien Joseph
  - Laboratoire de Biométrie et Biologie Evolutive, UMR5558, Université Lyon 1, Villeurbanne, France
- Nicolas Salamin
  - Department of Computational Biology, Université de Lausanne, Lausanne, Switzerland
3
Duan B, Qiu C, Lockless SW, Sze SH, Kaplan CD. Higher-order epistasis within Pol II trigger loop haplotypes. Genetics 2024; 228:iyae172. [PMID: 39446980 PMCID: PMC11631520 DOI: 10.1093/genetics/iyae172] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2024] [Accepted: 10/22/2024] [Indexed: 10/26/2024]
Abstract
RNA polymerase II (Pol II) has a highly conserved domain, the trigger loop (TL), that controls transcription fidelity and speed. We previously probed pairwise genetic interactions between residues within and surrounding the TL to understand functional interactions between residues and how individual mutants might alter TL function. We identified widespread incompatibility between TLs of different species when placed in the Saccharomyces cerevisiae Pol II context, indicating species-specific interactions between otherwise highly conserved TLs and their surroundings. These interactions represent epistasis between TL residues and the rest of Pol II. We sought to understand why certain TL sequences are incompatible with S. cerevisiae Pol II and to dissect the nature of genetic interactions within multiply substituted TLs as a window on higher-order epistasis in this system. We identified both positive and negative higher-order residue interactions within example TL haplotypes. Intricate higher-order epistasis formed by TL residues was sometimes only apparent from analysis of intermediate genotypes, emphasizing the complexity of epistatic interactions. Furthermore, we distinguished TL substitutions with distinct classes of epistatic patterns, suggesting specific TL residues that potentially influence TL evolution. Our examples of complex residue interactions suggest possible pathways for epistasis to facilitate Pol II evolution.
Affiliation(s)
- Bingbing Duan
  - Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260, USA
- Chenxi Qiu
  - Department of Genetics, Harvard Medical School, Boston, MA 02215, USA
- Steve W Lockless
  - Department of Biology, Texas A&M University, College Station, TX 77843, USA
- Sing-Hoi Sze
  - Department of Computer Science & Engineering, Texas A&M University, College Station, TX 77843, USA
  - Department of Biochemistry & Biophysics, Texas A&M University, College Station, TX 77843, USA
- Craig D Kaplan
  - Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260, USA
4
Rodríguez-Horta E, Strahan J, Dinner AR, Barton JP. Chronic infections can generate SARS-CoV-2-like bursts of viral evolution without epistasis. bioRxiv 2024:2024.10.06.616878. [PMID: 39416020 PMCID: PMC11482859 DOI: 10.1101/2024.10.06.616878] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2024]
Abstract
Multiple SARS-CoV-2 variants have arisen during the first years of the pandemic, often bearing many new mutations. Several explanations have been offered for the surprisingly sudden emergence of multiple mutations that enhance viral fitness, including cryptic transmission, spillover from animal reservoirs, epistasis between mutations, and chronic infections. Here, we simulated pathogen evolution combining within-host replication and between-host transmission. We found that, under certain conditions, chronic infections can lead to SARS-CoV-2-like bursts of mutations even without epistasis. Chronic infections can also increase the global evolutionary rate of a pathogen even in the absence of clear mutational bursts. Overall, our study supports chronic infections as a plausible origin for highly mutated SARS-CoV-2 variants. More generally, we also describe how chronic infections can influence pathogen evolution under different scenarios.
Affiliation(s)
- Edwin Rodríguez-Horta
  - Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, USA
  - Group of Complex Systems and Statistical Physics, Department of Theoretical Physics, Physics Faculty, University of Havana, Cuba
- John Strahan
  - Department of Chemistry and James Franck Institute, University of Chicago, Chicago, Illinois 60637, USA
- Aaron R. Dinner
  - Department of Chemistry and James Franck Institute, University of Chicago, Chicago, Illinois 60637, USA
- John P. Barton
  - Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, USA
5
Chen JZ, Bisardi M, Lee D, Cotogno S, Zamponi F, Weigt M, Tokuriki N. Understanding epistatic networks in the B1 β-lactamases through coevolutionary statistical modeling and deep mutational scanning. Nat Commun 2024; 15:8441. [PMID: 39349467 PMCID: PMC11442494 DOI: 10.1038/s41467-024-52614-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2024] [Accepted: 09/16/2024] [Indexed: 10/02/2024]
Abstract
Throughout evolution, protein families undergo substantial sequence divergence while preserving structure and function. Although most mutations are deleterious, evolution can explore sequence space via epistatic networks of intramolecular interactions that alleviate the harmful mutations. However, comprehensive analysis of such epistatic networks across protein families remains limited. Thus, we conduct a family-wide analysis of the B1 metallo-β-lactamases, combining experiments (deep mutational scanning, DMS) on two distant homologs (NDM-1 and VIM-2) and computational analyses (in silico DMS based on Direct Coupling Analysis, DCA) of 100 homologs. The methods jointly reveal and quantify prevalent epistasis, as ~1/3 of equivalent mutations are epistatic in DMS. From DCA, half of the positions have a >6.5-fold difference in effective number of tolerated mutations across the entire family. Notably, both methods locate residues with the strongest epistasis in regions of intermediate residue burial, suggesting a balance of residue packing and mutational freedom in forming epistatic networks. We identify entrenched WT residues between NDM-1 and VIM-2 in DMS, which display statistically distinct behaviors in DCA from non-entrenched residues. Entrenched residues are not easily compensated by changes in single nearby interactions, reinforcing existing findings where a complex epistatic network compounds smaller effects from many interacting residues.
Affiliation(s)
- J Z Chen
  - Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada
- M Bisardi
  - Laboratoire de Physique de l'Ecole Normale Supérieure, ENS, Université PSL, CNRS, Sorbonne Université, Université de Paris, F-75005, Paris, France
  - Sorbonne Université, CNRS, Institut de Biologie Paris Seine, Biologie Computationnelle et Quantitative LCQB, F-75005, Paris, France
- D Lee
  - Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada
- S Cotogno
  - Laboratoire de Physique de l'Ecole Normale Supérieure, ENS, Université PSL, CNRS, Sorbonne Université, Université de Paris, F-75005, Paris, France
  - Sorbonne Université, CNRS, Institut de Biologie Paris Seine, Biologie Computationnelle et Quantitative LCQB, F-75005, Paris, France
- F Zamponi
  - Laboratoire de Physique de l'Ecole Normale Supérieure, ENS, Université PSL, CNRS, Sorbonne Université, Université de Paris, F-75005, Paris, France
  - Dipartimento di Fisica, Sapienza Università di Roma, I-00185, Rome, Italy
- M Weigt
  - Sorbonne Université, CNRS, Institut de Biologie Paris Seine, Biologie Computationnelle et Quantitative LCQB, F-75005, Paris, France
- N Tokuriki
  - Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada
6
Duan B, Qiu C, Lockless SW, Sze SH, Kaplan CD. Higher-order epistasis within Pol II trigger loop haplotypes. bioRxiv 2024:2024.01.20.576280. [PMID: 38293233 PMCID: PMC10827151 DOI: 10.1101/2024.01.20.576280] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/01/2024]
Abstract
RNA polymerase II (Pol II) has a highly conserved domain, the trigger loop (TL), that controls transcription fidelity and speed. We previously probed pairwise genetic interactions between residues within and surrounding the TL to understand functional interactions between residues and how individual mutants might alter TL function. We identified widespread incompatibility between TLs of different species when placed in the Saccharomyces cerevisiae Pol II context, indicating species-specific interactions between otherwise highly conserved TLs and their surroundings. These interactions represent epistasis between TL residues and the rest of Pol II. We sought to understand why certain TL sequences are incompatible with S. cerevisiae Pol II and to dissect the nature of genetic interactions within multiply substituted TLs as a window on higher-order epistasis in this system. We identified both positive and negative higher-order residue interactions within example TL haplotypes. Intricate higher-order epistasis formed by TL residues was sometimes only apparent from analysis of intermediate genotypes, emphasizing the complexity of epistatic interactions. Furthermore, we distinguished TL substitutions with distinct classes of epistatic patterns, suggesting specific TL residues that potentially influence TL evolution. Our examples of complex residue interactions suggest possible pathways for epistasis to facilitate Pol II evolution.
Affiliation(s)
- Bingbing Duan
  - Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260
- Chenxi Qiu
  - Department of Genetics, Harvard Medical School, Boston, MA 02215
- Steve W Lockless
  - Department of Biology, Texas A&M University, College Station, TX 77843
- Sing-Hoi Sze
  - Department of Computer Science & Engineering, Texas A&M University, College Station, TX 77843
  - Department of Biochemistry & Biophysics, Texas A&M University, College Station, TX 77843
- Craig D Kaplan
  - Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260
7
Hong Z, Shimagaki KS, Barton JP. popDMS infers mutation effects from deep mutational scanning data. Bioinformatics 2024; 40:btae499. [PMID: 39115383 PMCID: PMC11335369 DOI: 10.1093/bioinformatics/btae499] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2024] [Revised: 07/10/2024] [Accepted: 08/06/2024] [Indexed: 08/22/2024]
Abstract
Summary: Deep mutational scanning (DMS) experiments provide a powerful method to measure the functional effects of genetic mutations at massive scales. However, the data generated from these experiments can be difficult to analyze, with significant variation between experimental replicates. To overcome this challenge, we developed popDMS, a computational method based on population genetics theory, to infer the functional effects of mutations from DMS data. Through extensive tests, we found that the functional effects of single mutations and epistasis inferred by popDMS are highly consistent across replicates, comparing favorably with existing methods. Our approach is flexible and can be widely applied to DMS data that includes multiple time points, multiple replicates, and different experimental conditions. Availability and implementation: popDMS is implemented in Python and Julia, and is freely available on GitHub at https://github.com/bartonlab/popDMS.
Affiliation(s)
- Zhenchen Hong
  - Department of Physics and Astronomy, University of California, Riverside, CA 92521, United States
- Kai S Shimagaki
  - Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, PA 15260, United States
- John P Barton
  - Department of Physics and Astronomy, University of California, Riverside, CA 92521, United States
  - Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, PA 15260, United States
  - Department of Physics and Astronomy, University of Pittsburgh, PA 15260, United States
8
Joseph J. Increased Positive Selection in Highly Recombining Genes Does not Necessarily Reflect an Evolutionary Advantage of Recombination. Mol Biol Evol 2024; 41:msae107. [PMID: 38829800 PMCID: PMC11173204 DOI: 10.1093/molbev/msae107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2024] [Revised: 04/08/2024] [Accepted: 05/28/2024] [Indexed: 06/05/2024]
Abstract
It is commonly thought that the long-term advantage of meiotic recombination is to dissipate genetic linkage, allowing natural selection to act independently on different loci. It is thus theoretically expected that genes with higher recombination rates evolve under more effective selection. On the other hand, recombination is often associated with GC-biased gene conversion (gBGC), which theoretically interferes with selection by promoting the fixation of deleterious GC alleles. To test these predictions, several studies assessed whether selection was more effective in highly recombining genes (due to dissipation of genetic linkage) or less effective (due to gBGC), assuming a fixed distribution of fitness effects (DFE) for all genes. In this study, I directly derive the DFE from a gene's evolutionary history (shaped by mutation, selection, drift, and gBGC) under empirical fitness landscapes. I show that genes that have experienced high levels of gBGC are less fit and thus have more opportunities for beneficial mutations. Only a small decrease in the genome-wide intensity of gBGC leads to the fixation of these beneficial mutations, particularly in highly recombining genes. This results in increased positive selection in highly recombining genes that is not caused by more effective selection. Additionally, I show that the death of a recombination hotspot can lead to a higher dN/dS than its birth, but with substitution patterns biased towards AT, and only at selected positions. This shows that controlling for a substitution bias towards GC is therefore not sufficient to rule out the contribution of gBGC to signatures of accelerated evolution. Finally, although gBGC does not affect the fixation probability of GC-conservative mutations, I show that by altering the DFE, gBGC can also significantly affect nonsynonymous GC-conservative substitution patterns.
Affiliation(s)
- Julien Joseph
  - Laboratoire de Biométrie et Biologie Evolutive, Université Lyon 1, CNRS, UMR 5558, Villeurbanne, France
9
Radojković M, Ubbink M. Positive epistasis drives clavulanic acid resistance in double mutant libraries of BlaC β-lactamase. Commun Biol 2024; 7:197. [PMID: 38368480 PMCID: PMC10874438 DOI: 10.1038/s42003-024-05868-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2023] [Accepted: 01/26/2024] [Indexed: 02/19/2024]
Abstract
Phenotypic effects of mutations are highly dependent on the genetic backgrounds in which they occur, due to epistatic effects. To test how easily the loss of enzyme activity can be compensated for, we screen mutant libraries of BlaC, a β-lactamase from Mycobacterium tuberculosis, for fitness in the presence of carbenicillin and the inhibitor clavulanic acid. Using a semi-rational approach and deep sequencing, we prepare four double-site saturation libraries and determine the relative fitness effect for 1534/1540 (99.6%) of the unique library members at two temperatures. Each library comprises variants of a residue known to be relevant for clavulanic acid resistance as well as residue 105, which regulates access to the active site. Variants with greatly improved fitness were identified within each library, demonstrating that compensatory mutations for loss of activity can be readily found. In most cases, the fittest variants are a result of positive epistasis, indicating strong synergistic effects between the chosen residue pairs. Our study sheds light on a role of epistasis in the evolution of functional residues and underlines the highly adaptive potential of BlaC.
Affiliation(s)
- Marko Radojković
  - Leiden Institute of Chemistry, Leiden University, Einsteinweg 55, 2333 CC, Leiden, The Netherlands
- Marcellus Ubbink
  - Leiden Institute of Chemistry, Leiden University, Einsteinweg 55, 2333 CC, Leiden, The Netherlands
10
Hong Z, Barton JP. popDMS infers mutation effects from deep mutational scanning data. bioRxiv 2024:2024.01.29.577759. [PMID: 38352383 PMCID: PMC10862717 DOI: 10.1101/2024.01.29.577759] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/21/2024]
Abstract
Deep mutational scanning (DMS) experiments provide a powerful method to measure the functional effects of genetic mutations at massive scales. However, the data generated from these experiments can be difficult to analyze, with significant variation between experimental replicates. To overcome this challenge, we developed popDMS, a computational method based on population genetics theory, to infer the functional effects of mutations from DMS data. Through extensive tests, we found that the functional effects of single mutations and epistasis inferred by popDMS are highly consistent across replicates, comparing favorably with existing methods. Our approach is flexible and can be widely applied to DMS data that includes multiple time points, multiple replicates, and different experimental conditions.
Affiliation(s)
- Zhenchen Hong
  - Department of Physics and Astronomy, University of California, Riverside, USA
- John P. Barton
  - Department of Physics and Astronomy, University of California, Riverside, USA
  - Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, USA
  - Department of Physics and Astronomy, University of Pittsburgh, USA
11
Notin P, Kollasch AW, Ritter D, van Niekerk L, Paul S, Spinner H, Rollins N, Shaw A, Weitzman R, Frazer J, Dias M, Franceschi D, Orenbuch R, Gal Y, Marks DS. ProteinGym: Large-Scale Benchmarks for Protein Design and Fitness Prediction. bioRxiv 2023:2023.12.07.570727. [PMID: 38106144 PMCID: PMC10723403 DOI: 10.1101/2023.12.07.570727] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Indexed: 12/19/2023]
Abstract
Predicting the effects of mutations in proteins is critical to many applications, from understanding genetic disease to designing novel proteins that can address our most pressing challenges in climate, agriculture and healthcare. Despite a surge in machine learning-based protein models to tackle these questions, an assessment of their respective benefits is challenging due to the use of distinct, often contrived, experimental datasets, and the variable performance of models across different protein families. Addressing these challenges requires scale. To that end we introduce ProteinGym, a large-scale and holistic set of benchmarks specifically designed for protein fitness prediction and design. It encompasses both a broad collection of over 250 standardized deep mutational scanning assays, spanning millions of mutated sequences, and curated clinical datasets providing high-quality expert annotations about mutation effects. We devise a robust evaluation framework that combines metrics for both fitness prediction and design, factors in known limitations of the underlying experimental methods, and covers both zero-shot and supervised settings. We report the performance of a diverse set of over 70 high-performing models from various subfields (e.g., alignment-based, inverse folding) in a unified benchmark suite. We open source the corresponding codebase, datasets, MSAs, structures, and model predictions, and develop a user-friendly website that facilitates data access and analysis.
Affiliation(s)
- Ada Shaw
  - Applied Mathematics, Harvard University
- Mafalda Dias
  - Centre for Genomic Regulation, Universitat Pompeu Fabra
- Yarin Gal
  - Computer Science, University of Oxford
12
Haddox HK, Galloway JG, Dadonaite B, Bloom JD, Matsen IV FA, DeWitt WS. Jointly modeling deep mutational scans identifies shifted mutational effects among SARS-CoV-2 spike homologs. bioRxiv 2023:2023.07.31.551037. [PMID: 37577604 PMCID: PMC10418112 DOI: 10.1101/2023.07.31.551037] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 08/15/2023]
Abstract
Deep mutational scanning (DMS) is a high-throughput experimental technique that measures the effects of thousands of mutations to a protein. These experiments can be performed on multiple homologs of a protein or on the same protein selected under multiple conditions. It is often of biological interest to identify mutations with shifted effects across homologs or conditions. However, it is challenging to determine if observed shifts arise from biological signal or experimental noise. Here, we describe a method for jointly inferring mutational effects across multiple DMS experiments while also identifying mutations that have shifted in their effects among experiments. A key aspect of our method is to regularize the inferred shifts, so that they are nonzero only when strongly supported by the data. We apply this method to DMS experiments that measure how mutations to spike proteins from SARS-CoV-2 variants (Delta, Omicron BA.1, and Omicron BA.2) affect cell entry. Most mutational effects are conserved between these spike homologs, but a fraction have markedly shifted. We experimentally validate a subset of the mutations inferred to have shifted effects, and confirm differences of > 1,000-fold in the impact of the same mutation on spike-mediated viral infection across spikes from different SARS-CoV-2 variants. Overall, our work establishes a general approach for comparing sets of DMS experiments to identify biologically important shifts in mutational effects.
Affiliation(s)
- Hugh K. Haddox
  - Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, WA 98102, USA
- Jared G. Galloway
  - Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, WA 98102, USA
- Bernadeta Dadonaite
  - Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
- Jesse D. Bloom
  - Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, WA 98102, USA
  - Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
  - Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
  - Howard Hughes Medical Institute, Seattle, WA 98109, USA
- Frederick A. Matsen IV
  - Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, WA 98102, USA
  - Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
  - Howard Hughes Medical Institute, Seattle, WA 98109, USA
  - Department of Statistics, University of Washington, Seattle, WA 98195, USA
- William S. DeWitt
  - Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, CA 94720, USA
13
Langenmayer MC, Luelf-Averhoff AT, Marr L, Jany S, Freudenstein A, Adam-Neumair S, Tscherne A, Fux R, Rojas JJ, Blutke A, Sutter G, Volz A. Newly Designed Poxviral Promoters to Improve Immunogenicity and Efficacy of MVA-NP Candidate Vaccines against Lethal Influenza Virus Infection in Mice. Pathogens 2023; 12:867. [PMID: 37513714 PMCID: PMC10383309 DOI: 10.3390/pathogens12070867] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/07/2023] [Revised: 06/20/2023] [Accepted: 06/21/2023] [Indexed: 07/30/2023]
Abstract
Influenza, a respiratory disease mainly caused by influenza A and B, viruses of the Orthomyxoviridae, is still a burden on our society's health and economic system. Influenza A viruses (IAV) circulate in mammalian and avian populations, causing seasonal outbreaks with high numbers of cases. Due to the high variability in seasonal IAV triggered by antigenic drift, annual vaccination is necessary, highlighting the need for a more broadly protective vaccine against IAV. The safety-tested Modified Vaccinia virus Ankara (MVA) is licensed as a third-generation vaccine against smallpox and serves as a potent vector system for the development of new candidate vaccines against different pathogens. Here, we generated and characterized recombinant MVA candidate vaccines that deliver the highly conserved internal nucleoprotein (NP) of IAV under the transcriptional control of five newly designed chimeric poxviral promoters to further increase the immunogenic properties of the recombinant viruses (MVA-NP). Infections of avian cell cultures with the recombinant MVA-NPs demonstrated efficient synthesis of the IAV-NP, which was expressed under the control of the five new promoters. Prime-boost or single-shot immunizations in C57BL/6 mice readily induced circulating serum antibodies binding to recombinant IAV-NP and the robust activation of IAV-NP-specific CD8+ T cell responses. Moreover, the MVA-NP candidate vaccines protected C57BL/6 mice against lethal respiratory infection with mouse-adapted IAV (A/Puerto Rico/8/1934/H1N1). Thus, further studies are warranted to evaluate the immunogenicity and efficacy of these recombinant MVA-NP vaccines in other IAV challenge models in more detail.
Collapse
Affiliation(s)
- Martin C Langenmayer
- Institute for Infectious Diseases and Zoonoses, LMU Munich, 80539 Munich, Germany
- German Center for Infection Research (DZIF), Partner Site Munich, 80539 Munich, Germany
| | | | - Lisa Marr
- Institute for Infectious Diseases and Zoonoses, LMU Munich, 80539 Munich, Germany
- Institute of Clinical Hygiene, Medical Microbiology and Infectiology, Paracelsus Medical University, Klinikum Nürnberg, 90419 Nuremberg, Germany
| | - Sylvia Jany
- Institute for Infectious Diseases and Zoonoses, LMU Munich, 80539 Munich, Germany
| | - Astrid Freudenstein
- Institute for Infectious Diseases and Zoonoses, LMU Munich, 80539 Munich, Germany
| | - Silvia Adam-Neumair
- Institute for Infectious Diseases and Zoonoses, LMU Munich, 80539 Munich, Germany
| | - Alina Tscherne
- Institute for Infectious Diseases and Zoonoses, LMU Munich, 80539 Munich, Germany
- German Center for Infection Research (DZIF), Partner Site Munich, 80539 Munich, Germany
| | - Robert Fux
- Institute for Infectious Diseases and Zoonoses, LMU Munich, 80539 Munich, Germany
| | - Juan J Rojas
- Institute for Infectious Diseases and Zoonoses, LMU Munich, 80539 Munich, Germany
- Immunology Unit, Department of Pathology and Experimental Therapies, Faculty of Medicine and Health Sciences, University of Barcelona-Bellvitge Biomedical Research Institute (IDIBELL), 08908 Barcelona, Spain
| | - Andreas Blutke
- Research Unit Analytical Pathology, Helmholtz Zentrum Munich, 85764 Neuherberg, Germany
- Institute for Veterinary Pathology, LMU Munich, 80539 Munich, Germany
| | - Gerd Sutter
- Institute for Infectious Diseases and Zoonoses, LMU Munich, 80539 Munich, Germany
- German Center for Infection Research (DZIF), Partner Site Munich, 80539 Munich, Germany
| | - Asisa Volz
- Institute for Infectious Diseases and Zoonoses, LMU Munich, 80539 Munich, Germany
- Institute of Virology, University of Veterinary Medicine Hannover, 30559 Hannover, Germany
- German Center for Infection Research (DZIF), Partner Site Hannover-Braunschweig, 30559 Hannover, Germany
|
14
|
Fiteha YG, Rashed MA, Ali RA, Abd El-Moneim D, Alshanbari FA, Magdy M. Mitogenomic Features and Evolution of the Nile River Dominant Tilapiine Species (Perciformes: Cichlidae). BIOLOGY 2022; 12:biology12010040. [PMID: 36671733 PMCID: PMC9855864 DOI: 10.3390/biology12010040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2022] [Revised: 12/19/2022] [Accepted: 12/21/2022] [Indexed: 12/28/2022]
Abstract
To better understand the diversity and evolution of cichlids, we sequenced, assembled, and annotated the complete mitochondrial genomes of three Nile tilapiine species (Coptodon zillii, Oreochromis niloticus, and Sarotherodon galilaeus) dominating the Nile River waters. Our results showed that the general mitogenomic features were conserved among the Nile tilapiine species. The genome length ranged from 16,436 to 16,631 bp and a total of 37 genes were identified (two ribosomal RNA genes (rRNAs), 22 transfer RNA genes (tRNAs), 13 protein-coding genes (PCGs), and 1 control region). The ND6 was the only CDS that presented a negative AT skew and a positive GC skew. The most extended repeat sequences were in the D-loop followed by the pseudogenes (trnSGCU). The ND5 showed relatively high substitution rates whereas ATP8 had the lowest substitution rate. The codon usage bias displayed a greater quantity of NNA and NNC at the third position and anti-bias against NNG. The phylogenetic relationship based on the complete mitogenomes and CDS was able to differentiate the three species as previously reported. This study provides new insight into the evolutionary connections between various subfamilies within cichlids while providing new molecular data that can be applied to discriminate between Nile tilapiine species and their populations.
Affiliation(s)
- Yosur G. Fiteha
- Genetics Department, Faculty of Agriculture, Ain Shams University, Cairo 11241, Egypt
- Department of Zoology, Faculty of Women for Art, Science and Education, Ain Shams University, Cairo 11566, Egypt
| | - Mohamed A. Rashed
- Genetics Department, Faculty of Agriculture, Ain Shams University, Cairo 11241, Egypt
| | - Ramadan A. Ali
- Department of Zoology, Faculty of Women for Art, Science and Education, Ain Shams University, Cairo 11566, Egypt
| | - Diaa Abd El-Moneim
- Department of Plant Production (Genetic Branch), Faculty of Environmental Agricultural Sciences, Arish University, El-Arish 45511, Egypt
| | - Fahad A. Alshanbari
- Department of Veterinary Medicine, College of Agriculture and Veterinary Medicine, Qassim University, Buraydah 52266, Saudi Arabia
| | - Mahmoud Magdy
- Genetics Department, Faculty of Agriculture, Ain Shams University, Cairo 11241, Egypt
- Correspondence:
|
15
|
Druelle V, Neher RA. Reversions to consensus are positively selected in HIV-1 and bias substitution rate estimates. Virus Evol 2022; 9:veac118. [PMID: 36632482 PMCID: PMC9829961 DOI: 10.1093/ve/veac118] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2022] [Revised: 11/12/2022] [Accepted: 12/13/2022] [Indexed: 12/23/2022] Open
Abstract
Human immunodeficiency virus 1 (HIV-1) is a rapidly evolving virus able to evade host immunity through rapid adaptation during chronic infection. The HIV-1 group M has diversified since its zoonosis into several subtypes at a rate of the order of 10⁻³ changes per site per year. This rate varies between different parts of the genome, and its inference is sensitive to the timescale and diversity spanned by the sequence data used. Higher rates are estimated on short timescales and particularly for within-host evolution, while rate estimates spanning decades or the entire HIV-1 pandemic tend to be lower. The underlying causes of this difference are not well understood. We investigate here the role of rapid reversions toward a preferred evolutionary sequence state on multiple timescales. We show that within-host reversion mutations are under positive selection and contribute substantially to sequence turnover, especially at conserved sites. We then use the rates of reversions and non-reversions estimated from longitudinal within-host data to parameterize a phylogenetic sequence evolution model. Sequence simulation of this model on HIV-1 phylogenies reproduces diversity and apparent evolutionary rates of HIV-1 in gag and pol, suggesting that a tendency to rapidly revert to a consensus-like state can explain much of the time dependence of evolutionary rate estimates in HIV-1.
Affiliation(s)
- Valentin Druelle
- Biozentrum University of Basel, Spitalstrasse 41, Basel 4056, Switzerland
- Swiss Institute of Bioinformatics, Spitalstrasse 41, Basel 4056, Switzerland
| | - Richard A Neher
- Biozentrum University of Basel, Spitalstrasse 41, Basel 4056, Switzerland
- Swiss Institute of Bioinformatics, Spitalstrasse 41, Basel 4056, Switzerland
|
16
|
Abstract
One core goal of genetics is to systematically understand the mapping between the DNA sequence of an organism (genotype) and its measurable characteristics (phenotype). Understanding this mapping is often challenging because of interactions between mutations, where the result of combining several different mutations can be very different than the sum of their individual effects. Here we provide a statistical framework for modeling complex genetic interactions of this type. The key idea is to ask how fast the effects of mutations change when introducing the same mutation in increasingly distant genetic backgrounds. We then propose a model for phenotypic prediction that takes into account this tendency for the effects of mutations to be more similar in nearby genetic backgrounds. Contemporary high-throughput mutagenesis experiments are providing an increasingly detailed view of the complex patterns of genetic interaction that occur between multiple mutations within a single protein or regulatory element. By simultaneously measuring the effects of thousands of combinations of mutations, these experiments have revealed that the genotype–phenotype relationship typically reflects not only genetic interactions between pairs of sites but also higher-order interactions among larger numbers of sites. However, modeling and understanding these higher-order interactions remains challenging. Here we present a method for reconstructing sequence-to-function mappings from partially observed data that can accommodate all orders of genetic interaction. The main idea is to make predictions for unobserved genotypes that match the type and extent of epistasis found in the observed data. 
This information on the type and extent of epistasis can be extracted by considering how phenotypic correlations change as a function of mutational distance, which is equivalent to estimating the fraction of phenotypic variance due to each order of genetic interaction (additive, pairwise, three-way, etc.). Using these estimated variance components, we then define an empirical Bayes prior that in expectation matches the observed pattern of epistasis and reconstruct the genotype–phenotype mapping by conducting Gaussian process regression under this prior. To demonstrate the power of this approach, we present an application to the antibody-binding domain GB1 and also provide a detailed exploration of a dataset consisting of high-throughput measurements for the splicing efficiency of human pre-mRNA 5′ splice sites, for which we also validate our model predictions via additional low-throughput experiments.
|
17
|
Starr TN, Greaney AJ, Hannon WW, Loes AN, Hauser K, Dillen JR, Ferri E, Farrell AG, Dadonaite B, McCallum M, Matreyek KA, Corti D, Veesler D, Snell G, Bloom JD. Shifting mutational constraints in the SARS-CoV-2 receptor-binding domain during viral evolution. Science 2022; 377:420-424. [PMID: 35762884 PMCID: PMC9273037 DOI: 10.1126/science.abo7896] [Citation(s) in RCA: 169] [Impact Index Per Article: 56.3] [Received: 02/25/2022] [Accepted: 06/23/2022] [Indexed: 12/30/2022]
Abstract
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has evolved variants with substitutions in the spike receptor-binding domain (RBD) that affect its affinity for angiotensin-converting enzyme 2 (ACE2) receptor and recognition by antibodies. These substitutions could also shape future evolution by modulating the effects of mutations at other sites, a phenomenon called epistasis. To investigate this possibility, we performed deep mutational scans to measure the effects on ACE2 binding of all single-amino acid mutations in the Wuhan-Hu-1, Alpha, Beta, Delta, and Eta variant RBDs. Some substitutions, most prominently Asn501→Tyr (N501Y), cause epistatic shifts in the effects of mutations at other sites. These epistatic shifts shape subsequent evolutionary change, for example by enabling many of the antibody-escape substitutions in the Omicron RBD. These epistatic shifts occur despite high conservation of the overall RBD structure. Our data shed light on RBD sequence-function relationships and facilitate interpretation of ongoing SARS-CoV-2 evolution.
Affiliation(s)
- Tyler N. Starr
- Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
| | - Allison J. Greaney
- Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
- Department of Genome Sciences, University of Washington, Seattle, WA 98109, USA
- Medical Scientist Training Program, University of Washington, Seattle, WA 98109, USA
| | - William W. Hannon
- Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
- Molecular and Cellular Biology Graduate Program, University of Washington, Seattle, WA 98109, USA
| | - Andrea N. Loes
- Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
- Howard Hughes Medical Institute, Seattle, WA 98109, USA
| | | | | | - Elena Ferri
- Vir Biotechnology, San Francisco, CA 94158, USA
| | - Ariana Ghez Farrell
- Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
| | - Bernadeta Dadonaite
- Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
| | - Matthew McCallum
- Department of Biochemistry, University of Washington, Seattle, WA 98195, USA
| | - Kenneth A. Matreyek
- Department of Pathology, Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA
| | - Davide Corti
- Humabs BioMed SA, a subsidiary of Vir Biotechnology, 6500 Bellinzona, Switzerland
| | - David Veesler
- Howard Hughes Medical Institute, Seattle, WA 98109, USA
- Department of Biochemistry, University of Washington, Seattle, WA 98195, USA
| | | | - Jesse D. Bloom
- Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
- Department of Genome Sciences, University of Washington, Seattle, WA 98109, USA
- Howard Hughes Medical Institute, Seattle, WA 98109, USA
|
18
|
Park Y, Metzger BPH, Thornton JW. Epistatic drift causes gradual decay of predictability in protein evolution. Science 2022; 376:823-830. [PMID: 35587978 DOI: 10.1126/science.abn6895] [Citation(s) in RCA: 55] [Impact Index Per Article: 18.3] [Indexed: 12/11/2022]
Abstract
Epistatic interactions can make the outcomes of evolution unpredictable, but no comprehensive data are available on the extent and temporal dynamics of changes in the effects of mutations as protein sequences evolve. Here, we use phylogenetic deep mutational scanning to measure the functional effect of every possible amino acid mutation in a series of ancestral and extant steroid receptor DNA binding domains. Across 700 million years of evolution, epistatic interactions caused the effects of most mutations to become decorrelated from their initial effects and their windows of evolutionary accessibility to open and close transiently. Most effects changed gradually and without bias at rates that were largely constant across time, indicating a neutral process caused by many weak epistatic interactions. Our findings show that protein sequences drift inexorably into contingency and unpredictability, but that the process is statistically predictable, given sufficient phylogenetic and experimental data.
Affiliation(s)
- Yeonwoo Park
- Committee on Genetics, Genomics, and Systems Biology, University of Chicago, Chicago, IL, USA
| | - Brian P H Metzger
- Department of Ecology and Evolution, University of Chicago, Chicago, IL, USA
| | - Joseph W Thornton
- Committee on Genetics, Genomics, and Systems Biology, University of Chicago, Chicago, IL, USA
- Department of Ecology and Evolution, University of Chicago, Chicago, IL, USA
- Department of Human Genetics, University of Chicago, Chicago, IL, USA
|
19
|
Patel R, Carnevale V, Kumar S. Epistasis Creates Invariant Sites and Modulates the Rate of Molecular Evolution. Mol Biol Evol 2022; 39:msac106. [PMID: 35575390 PMCID: PMC9156017 DOI: 10.1093/molbev/msac106] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/14/2022] Open
Abstract
Invariant sites are a common feature of amino acid sequence evolution. The presence of invariant sites is frequently attributed to the need to preserve function through site-specific conservation of amino acid residues. Amino acid substitution models without a provision for invariant sites often fit the data significantly worse than those that allow for an excess of invariant sites beyond those predicted by models that only incorporate rate variation among sites (e.g., a Gamma distribution). An alternative is epistasis between sites to preserve residue interactions that can create invariant sites. Through computer-simulated sequence evolution, we evaluated the relative effects of site-specific preferences and site-site couplings in the generation of invariant sites and the modulation of the rate of molecular evolution. In an analysis of ten major families of protein domains with diverse sequence and functional properties, we find that the negative selection imposed by epistasis creates many more invariant sites than site-specific residue preferences alone. Further, epistasis plays an increasingly larger role in creating invariant sites over longer evolutionary periods. Epistasis also dictates rates of domain evolution over time by exerting significant additional purifying selection to preserve site couplings. These patterns illuminate the mechanistic role of epistasis in the processes underlying observed site invariance and evolutionary rates.
Affiliation(s)
- Ravi Patel
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA 19122, USA
- Department of Biology, Temple University, Philadelphia, PA 19122, USA
| | - Vincenzo Carnevale
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA 19122, USA
- Department of Biology, Temple University, Philadelphia, PA 19122, USA
| | - Sudhir Kumar
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA 19122, USA
- Department of Biology, Temple University, Philadelphia, PA 19122, USA
- Center for Excellence in Genome Medicine and Research, King Abdulaziz University, Jeddah, Saudi Arabia
|
20
|
Abstract
Vertebrate immune systems suppress viral infection using both innate restriction factors and adaptive immunity. Viruses mutate to escape these defenses, driving hosts to counterevolve to regain fitness. This cycle recurs repeatedly, resulting in an evolutionary arms race whose outcome depends on the pace and likelihood of adaptation by host and viral genes. Although viruses evolve faster than their vertebrate hosts, their proteins are subject to numerous functional constraints that impact the probability of adaptation. These constraints are globally defined by evolutionary landscapes, which describe the fitness and adaptive potential of all possible mutations. We review deep mutational scanning experiments mapping the evolutionary landscapes of both host and viral proteins engaged in arms races. For restriction factors and some broadly neutralizing antibodies, landscapes favor the host, which may help to level the evolutionary playing field against rapidly evolving viruses. We discuss the biophysical underpinnings of these landscapes and their therapeutic implications.
Affiliation(s)
- Jeannette L Tenthorey
- Division of Basic Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
| | - Michael Emerman
- Division of Basic Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
- Division of Human Biology, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
| | - Harmit S Malik
- Division of Basic Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
- Howard Hughes Medical Institute, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
|
21
|
Youssef N, Susko E, Roger AJ, Bielawski JP. Evolution of amino acid propensities under stability-mediated epistasis. Mol Biol Evol 2022; 39:6522130. [PMID: 35134997 PMCID: PMC8896634 DOI: 10.1093/molbev/msac030] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/12/2022] Open
Abstract
Site-specific amino acid preferences are influenced by the genetic background of the protein. The preferences for resident amino acids are expected to, on average, increase over time because of replacements at other sites, a nonadaptive phenomenon referred to as the 'evolutionary Stokes shift'. Alternatively, decreases in resident amino acid propensity have recently been viewed as evidence of adaptations to external environmental changes. Using population genetics theory and thermodynamic stability constraints, we show that nonadaptive evolution can lead to both positive and negative shifts in propensities following the fixation of an amino acid, emphasizing that the detection of negative shifts is not conclusive evidence of adaptation. Considering shifts in propensities over windows between substitutions at a focal site, we find that following ≈ 50% of substitutions the propensity for the new resident amino acid decreases over time, and both positive and negative shifts were comparable in magnitude. Preferences were often conserved via a significant negative autocorrelation in propensity changes: increases in propensities were often followed by decreases, and vice versa. Lastly, we explore the underlying mechanisms that lead propensities to fluctuate. We observe that stabilizing replacements increase the mutational tolerance at a site and in doing so decrease the propensity for the resident amino acid. In contrast, destabilizing substitutions result in more rugged fitness landscapes that tend to favor the resident amino acid. In summary, our results characterize propensity trajectories under nonadaptive stability-constrained evolution against which evidence of adaptations should be calibrated.
Affiliation(s)
- Noor Youssef
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
| | - Edward Susko
- Department of Mathematics and Statistics, Dalhousie University, Halifax, NS, Canada
| | - Andrew J Roger
- Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, NS, Canada
| | - Joseph P Bielawski
- Department of Biology, Dalhousie University, Halifax, Nova Scotia, Canada
- Department of Mathematics and Statistics, Dalhousie University, Halifax, Nova Scotia, Canada
|
22
|
Raman P, Rominger MC, Young JM, Molaro A, Tsukiyama T, Malik HS. Novel classes and evolutionary turnover of histone H2B variants in the mammalian germline. Mol Biol Evol 2022; 39:6517784. [PMID: 35099534 PMCID: PMC8857922 DOI: 10.1093/molbev/msac019] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Indexed: 11/14/2022] Open
Abstract
Histones and their posttranslational modifications facilitate diverse chromatin functions in eukaryotes. Core histones (H2A, H2B, H3, and H4) package genomes after DNA replication. In contrast, variant histones promote specialized chromatin functions, including DNA repair, genome stability, and epigenetic inheritance. Previous studies have identified only a few H2B variants in animals; their roles and evolutionary origins remain largely unknown. Here, using phylogenomic analyses, we reveal the presence of five H2B variants broadly present in mammalian genomes. Three of these variants have been previously described: H2B.1, H2B.L (also called subH2B), and H2B.W. In addition, we identify and describe two new variants: H2B.K and H2B.N. Four of these variants originated in mammals, whereas H2B.K arose prior to the last common ancestor of bony vertebrates. We find that though H2B variants are subject to high gene turnover, most are broadly retained in mammals, including humans. Despite an overall signature of purifying selection, H2B variants evolve more rapidly than core H2B with considerable divergence in sequence and length. All five H2B variants are expressed in the germline. H2B.K and H2B.N are predominantly expressed in oocytes, an atypical expression site for mammalian histone variants. Our findings suggest that H2B variants likely encode potentially redundant but vital functions via unusual chromatin packaging or nonchromatin functions in mammalian germline cells. Our discovery of novel histone variants highlights the advantages of comprehensive phylogenomic analyses and provides unique opportunities to study how innovations in chromatin function evolve.
Affiliation(s)
- Pravrutha Raman
- Division of Basic Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington, 98109, USA
| | - Mary C Rominger
- Division of Basic Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington, 98109, USA
- Whitman College, Walla Walla, Washington, 99362, USA
| | - Janet M Young
- Division of Basic Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington, 98109, USA
| | - Antoine Molaro
- Genetics, Reproduction and Development (GReD) Institute, CNRS UMR 6293, INSERM U1103, Université Clermont Auvergne, Clermont-Ferrand, France
| | - Toshio Tsukiyama
- Division of Basic Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington, 98109, USA
| | - Harmit S Malik
- Division of Basic Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington, 98109, USA
- Howard Hughes Medical Institute, Fred Hutchinson Cancer Research Center, Seattle, Washington, 98109, USA
|
23
|
Wang Y, Lei R, Nourmohammad A, Wu NC. Antigenic evolution of human influenza H3N2 neuraminidase is constrained by charge balancing. eLife 2021; 10:e72516. [PMID: 34878407 PMCID: PMC8683081 DOI: 10.7554/elife.72516] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 07/27/2021] [Accepted: 12/07/2021] [Indexed: 11/13/2022] Open
Abstract
As one of the main influenza antigens, neuraminidase (NA) in H3N2 virus has evolved extensively for more than 50 years due to continuous immune pressure. While NA has recently emerged as an effective vaccine target, biophysical constraints on the antigenic evolution of NA remain largely elusive. Here, we apply combinatorial mutagenesis and next-generation sequencing to characterize the local fitness landscape in an antigenic region of NA in six different human H3N2 strains that were isolated around 10 years apart. The local fitness landscape correlates well among strains and the pairwise epistasis is highly conserved. Our analysis further demonstrates that local net charge governs the pairwise epistasis in this antigenic region. In addition, we show that residue coevolution in this antigenic region is correlated with the pairwise epistasis between charge states. Overall, this study demonstrates the importance of quantifying epistasis and the underlying biophysical constraint for building a model of influenza evolution.
Affiliation(s)
- Yiquan Wang
- Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, United States
| | - Ruipeng Lei
- Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, United States
| | - Armita Nourmohammad
- Department of Physics, University of Washington, Seattle, United States
- Max Planck Institute for Dynamics and Self-Organization, Göttingen, Germany
- Fred Hutchinson Cancer Research Center, Seattle, United States
| | - Nicholas C Wu
- Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, United States
- Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, United States
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, United States
- Carle Illinois College of Medicine, University of Illinois at Urbana-Champaign, Urbana, United States
|
24
|
Youssef N, Susko E, Roger AJ, Bielawski JP. Shifts in amino acid preferences as proteins evolve: A synthesis of experimental and theoretical work. Protein Sci 2021; 30:2009-2028. [PMID: 34322924 PMCID: PMC8442975 DOI: 10.1002/pro.4161] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/13/2021] [Revised: 07/19/2021] [Accepted: 07/26/2021] [Indexed: 11/08/2022]
Abstract
Amino acid preferences vary across sites and time. While variation across sites is widely accepted, the extent and frequency of temporal shifts are contentious. Our understanding of the drivers of amino acid preference change is incomplete: To what extent are temporal shifts driven by adaptive versus nonadaptive evolutionary processes? We review phenomena that cause preferences to vary (e.g., evolutionary Stokes shift, contingency, and entrenchment) and clarify how they differ. To determine the extent and prevalence of shifted preferences, we review experimental and theoretical studies. Analyses of natural sequence alignments often detect decreases in homoplasy (convergence and reversions) rates, and variation in replacement rates with time, signals that are consistent with temporally changing preferences. While approaches inferring shifts in preferences from patterns in natural alignments are valuable, they are indirect since multiple mechanisms (both adaptive and nonadaptive) could lead to the observed signal. Alternatively, site-directed mutagenesis experiments allow for a more direct assessment of shifted preferences. They corroborate evidence from multiple sequence alignments, revealing that the preference for an amino acid at a site varies depending on the background sequence. However, shifts in preferences are usually minor in magnitude and sites with significantly shifted preferences are low in frequency. The small yet consistent perturbations in preferences could, nevertheless, jeopardize the accuracy of inference procedures, which assume constant preferences. We conclude by discussing if and how such shifts in preferences might influence widely used time-homogeneous inference procedures and potential ways to mitigate such effects.
Affiliation(s)
- Noor Youssef
- Department of Biology, Dalhousie University, Halifax, Nova Scotia, Canada
| | - Edward Susko
- Department of Mathematics and Statistics, Dalhousie University, Halifax, Nova Scotia, Canada
| | - Andrew J. Roger
- Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, Canada
| | - Joseph P. Bielawski
- Department of Biology, Dalhousie University, Halifax, Nova Scotia, Canada
- Department of Mathematics and Statistics, Dalhousie University, Halifax, Nova Scotia, Canada
|
25
|
Spielman SJ. Relative Model Fit Does Not Predict Topological Accuracy in Single-Gene Protein Phylogenetics. Mol Biol Evol 2021; 37:2110-2123. [PMID: 32191313 PMCID: PMC7306691 DOI: 10.1093/molbev/msaa075] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Indexed: 12/12/2022] Open
Abstract
It is regarded as best practice in phylogenetic reconstruction to perform relative model selection to determine an appropriate evolutionary model for the data. This procedure ranks a set of candidate models according to their goodness of fit to the data, commonly using an information theoretic criterion. Users then specify the best-ranking model for inference. Although it is often assumed that better-fitting models translate to increased accuracy, recent studies have shown that the specific model employed may not substantially affect inferences. We examine whether there is a systematic relationship between relative model fit and topological inference accuracy in protein phylogenetics, using simulations and real sequences. Simulations employed site-heterogeneous mechanistic codon models that are distinct from protein-level phylogenetic inference models, allowing us to investigate how protein models perform when they are misspecified to the data, as will be the case for any real sequence analysis. We broadly find that phylogenies inferred across models with vastly different fits to the data produce highly consistent topologies. We additionally find that all models infer similar proportions of false-positive splits, raising the possibility that all available models of protein evolution are similarly misspecified. Moreover, we find that the parameter-rich GTR (general time reversible) model, whose amino acid exchangeabilities are free parameters, performs similarly to models with fixed exchangeabilities, although the inference precision associated with GTR models was not examined. We conclude that, although relative model selection may not hinder phylogenetic analysis on protein data, it may not offer specific predictable improvements and is not a reliable proxy for accuracy.
|
26
|
Pinney MM, Mokhtari DA, Akiva E, Yabukarski F, Sanchez DM, Liang R, Doukov T, Martinez TJ, Babbitt PC, Herschlag D. Parallel molecular mechanisms for enzyme temperature adaptation. Science 2021; 371:eaay2784. [PMID: 33674467 DOI: 10.1126/science.aay2784] [Citation(s) in RCA: 57] [Impact Index Per Article: 14.3] [Received: 06/05/2019] [Revised: 08/23/2020] [Accepted: 01/04/2021] [Indexed: 12/13/2022]
Abstract
The mechanisms that underlie the adaptation of enzyme activities and stabilities to temperature are fundamental to our understanding of molecular evolution and how enzymes work. Here, we investigate the molecular and evolutionary mechanisms of enzyme temperature adaptation, combining deep mechanistic studies with comprehensive sequence analyses of thousands of enzymes. We show that temperature adaptation in ketosteroid isomerase (KSI) arises primarily from one residue change with limited, local epistasis, and we establish the underlying physical mechanisms. This residue change occurs in diverse KSI backgrounds, suggesting parallel adaptation to temperature. We identify residues associated with organismal growth temperature across 1005 diverse bacterial enzyme families, suggesting widespread parallel adaptation to temperature. We assess the residue properties, molecular interactions, and interaction networks that appear to underlie temperature adaptation.
Affiliation(s)
- Margaux M Pinney
- Department of Biochemistry, Stanford University, Stanford, CA 94305, USA.
| | - Daniel A Mokhtari
- Department of Biochemistry, Stanford University, Stanford, CA 94305, USA
| | - Eyal Akiva
- Department of Bioengineering and Therapeutic Sciences and Quantitative Biosciences Institute, University of California, San Francisco, CA 94158, USA
| | - Filip Yabukarski
- Department of Biochemistry, Stanford University, Stanford, CA 94305, USA.,Chan Zuckerberg Biohub, San Francisco, CA 94110, USA
| | - David M Sanchez
- Department of Chemistry, Stanford University, Stanford, CA 94305, USA.,Department of Photon Sciences, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, USA
| | - Ruibin Liang
- Department of Chemistry, Stanford University, Stanford, CA 94305, USA.,Department of Photon Sciences, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, USA
| | - Tzanko Doukov
- Stanford Synchrotron Radiation Lightsource, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, USA
| | - Todd J Martinez
- Department of Chemistry, Stanford University, Stanford, CA 94305, USA.,Department of Photon Sciences, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, USA
| | - Patricia C Babbitt
- Department of Bioengineering and Therapeutic Sciences and Quantitative Biosciences Institute, University of California, San Francisco, CA 94158, USA
| | - Daniel Herschlag
- Department of Biochemistry, Stanford University, Stanford, CA 94305, USA.,Department of Chemical Engineering, Stanford University, Stanford, CA 94305, USA.,Stanford ChEM-H, Stanford University, Stanford, CA 94305, USA
|
27
|
Puller V, Sagulenko P, Neher RA. Efficient inference, potential, and limitations of site-specific substitution models. Virus Evol 2020; 6:veaa066. [PMID: 33343922 PMCID: PMC7733610 DOI: 10.1093/ve/veaa066] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/21/2022]
Abstract
Natural selection imposes a complex filter on which variants persist in a population, resulting in evolutionary patterns that vary greatly along the genome. Some sites evolve close to neutrally, while others are highly conserved, allow only specific states, or only change in concert with other sites. On one hand, such constraints on sequence evolution can be used to infer biological function; on the other hand, they need to be accounted for in phylogenetic reconstruction. Phylogenetic models often account for this complexity by partitioning sites into a small number of discrete classes with different rates and/or state preferences. Appropriate model complexity is typically determined by model selection procedures. Here, we present an efficient algorithm to estimate more complex models that allow for different preferences at every site and explore the accuracy at which such models can be estimated from simulated data. Our iterative approximate maximum likelihood scheme uses information in the data efficiently and accurately estimates site-specific preferences from large data sets with moderately diverged sequences and known topology. However, the joint estimation of site-specific rates, site-specific preferences, and phylogenetic branch lengths can suffer from identifiability problems, while ignoring variation in preferences across sites results in branch length underestimates. Site-specific preferences estimated from large HIV pol alignments show qualitative concordance with intra-host estimates of fitness costs. Analysis of these substitution models suggests near saturation of divergence after a few hundred years. Such saturation can explain the inability to infer deep divergence times of HIV and SIVs using molecular clock approaches and time-dependent rate estimates.
Affiliation(s)
- Vadim Puller
- Biozentrum, University of Basel, Klingelbergstrasse 50/70, 4056 Basel, Switzerland.,SIB Swiss Institute of Bioinformatics, Klingelbergstrasse 61, Basel, Switzerland
| | - Pavel Sagulenko
- Max Planck Institute for Developmental Biology, Max-Planck-Ring 5, 72076 Tübingen, Germany
| | - Richard A Neher
- Biozentrum, University of Basel, Klingelbergstrasse 50/70, 4056 Basel, Switzerland.,SIB Swiss Institute of Bioinformatics, Klingelbergstrasse 61, Basel, Switzerland
|
28
|
Zhou J, McCandlish DM. Minimum epistasis interpolation for sequence-function relationships. Nat Commun 2020; 11:1782. [PMID: 32286265 PMCID: PMC7156698 DOI: 10.1038/s41467-020-15512-5] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 08/02/2019] [Accepted: 03/12/2020] [Indexed: 12/17/2022]
Abstract
Massively parallel phenotyping assays have provided unprecedented insight into how multiple mutations combine to determine biological function. While such assays can measure phenotypes for thousands to millions of genotypes in a single experiment, in practice these measurements are not exhaustive, so that there is a need for techniques to impute values for genotypes whose phenotypes have not been directly assayed. Here, we present an imputation method based on inferring the least epistatic possible sequence-function relationship compatible with the data. In particular, we infer the reconstruction where mutational effects change as little as possible across adjacent genetic backgrounds. The resulting models can capture complex higher-order genetic interactions near the data, but approach additivity where data is sparse or absent. We apply the method to high-throughput transcription factor binding assays and use it to explore a fitness landscape for protein G.
Affiliation(s)
- Juannan Zhou
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
| | - David M McCandlish
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA.
|
29
|
Liberles DA, Chang B, Geiler-Samerotte K, Goldman A, Hey J, Kaçar B, Meyer M, Murphy W, Posada D, Storfer A. Emerging Frontiers in the Study of Molecular Evolution. J Mol Evol 2020; 88:211-226. [PMID: 32060574 PMCID: PMC7386396 DOI: 10.1007/s00239-020-09932-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 02/07/2023]
Abstract
A collection of the editors of Journal of Molecular Evolution have gotten together to pose a set of key challenges and future directions for the field of molecular evolution. Topics include challenges and new directions in prebiotic chemistry and the RNA world, reconstruction of early cellular genomes and proteins, macromolecular and functional evolution, evolutionary cell biology, genome evolution, molecular evolutionary ecology, viral phylodynamics, theoretical population genomics, somatic cell molecular evolution, and directed evolution. While our effort is not meant to be exhaustive, it reflects research questions and problems in the field of molecular evolution that are exciting to our editors.
Affiliation(s)
- David A Liberles
- Department of Biology and Center for Computational Genetics and Genomics, Temple University, Philadelphia, PA, 19122, USA.
| | - Belinda Chang
- Department of Ecology and Evolutionary Biology and Department of Cell and Systems Biology, University of Toronto, 25 Harbord Street, Toronto, ON, M5S 3G5, Canada
| | - Kerry Geiler-Samerotte
- Center for Mechanisms of Evolution, School of Life Sciences, Arizona State University, Tempe, AZ, 85287, USA
| | - Aaron Goldman
- Department of Biology, Oberlin College and Conservatory, K123 Science Center, 119 Woodland Street, Oberlin, OH, 44074, USA
| | - Jody Hey
- Department of Biology and Center for Computational Genetics and Genomics, Temple University, Philadelphia, PA, 19122, USA
| | - Betül Kaçar
- Department of Molecular and Cell Biology, University of Arizona, Tucson, AZ, 85721, USA
| | - Michelle Meyer
- Department of Biology, Boston College, Chestnut Hill, MA, 02467, USA
| | - William Murphy
- Department of Veterinary Integrative Biosciences, Texas A&M University, College Station, TX, 77843, USA
| | - David Posada
- Biomedical Research Center (CINBIO), University of Vigo, Vigo, Spain
| | - Andrew Storfer
- School of Biological Sciences, Washington State University, Pullman, WA, 99164, USA
|
30
|
Deep Mutational Scanning Comprehensively Maps How Zika Envelope Protein Mutations Affect Viral Growth and Antibody Escape. J Virol 2019; 93:JVI.01291-19. [PMID: 31511387 DOI: 10.1128/jvi.01291-19] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 08/04/2019] [Accepted: 09/06/2019] [Indexed: 12/11/2022]
Abstract
Functional constraints on viral proteins are often assessed by examining sequence conservation among natural strains, but this approach is relatively ineffective for Zika virus because all known sequences are highly similar. Here, we take an alternative approach to map functional constraints on Zika virus's envelope (E) protein by using deep mutational scanning to measure how all amino acid mutations to the E protein affect viral growth in cell culture. The resulting sequence-function map is consistent with existing knowledge about E protein structure and function but also provides insight into mutation-level constraints in many regions of the protein that have not been well characterized in prior functional work. In addition, we extend our approach to completely map how mutations affect viral neutralization by two monoclonal antibodies, thereby precisely defining their functional epitopes. Overall, our study provides a valuable resource for understanding the effects of mutations to this important viral protein and also offers a roadmap for future work to map functional and antigenic selection to Zika virus at high resolution. IMPORTANCE: Zika virus has recently been shown to be associated with severe birth defects. The virus's E protein mediates its ability to infect cells and is also the primary target of the antibodies that are elicited by natural infection and vaccines that are being developed against the virus. Therefore, determining the effects of mutations to this protein is important for understanding its function, its susceptibility to vaccine-mediated immunity, and its potential for future evolution. We completely mapped how amino acid mutations to the E protein affected the virus's ability to grow in cells in the laboratory and escape from several antibodies. The resulting maps relate changes in the E protein's sequence to changes in viral function and therefore provide a valuable complement to existing maps of the physical structure of the protein.
|
31
|
Esposito D, Weile J, Shendure J, Starita LM, Papenfuss AT, Roth FP, Fowler DM, Rubin AF. MaveDB: an open-source platform to distribute and interpret data from multiplexed assays of variant effect. Genome Biol 2019; 20:223. [PMID: 31679514 PMCID: PMC6827219 DOI: 10.1186/s13059-019-1845-6] [Citation(s) in RCA: 155] [Impact Index Per Article: 25.8] [Received: 03/31/2019] [Accepted: 10/01/2019] [Indexed: 11/10/2022]
Abstract
Multiplex assays of variant effect (MAVEs), such as deep mutational scans and massively parallel reporter assays, test thousands of sequence variants in a single experiment. Despite the importance of MAVE data for basic and clinical research, there is no standard resource for their discovery and distribution. Here, we present MaveDB (https://www.mavedb.org), a public repository for large-scale measurements of sequence variant impact, designed for interoperability with applications to interpret these datasets. We also describe the first such application, MaveVis, which retrieves, visualizes, and contextualizes variant effect maps. Together, the database and applications will empower the community to mine these powerful datasets.
Affiliation(s)
- Daniel Esposito
- Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
| | - Jochen Weile
- The Donnelly Centre, University of Toronto, Toronto, ON, Canada
- Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, ON, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Department of Computer Science, University of Toronto, Toronto, ON, Canada
| | - Jay Shendure
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
| | - Lea M Starita
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
| | - Anthony T Papenfuss
- Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, University of Melbourne, Melbourne, VIC, Australia
- Bioinformatics and Cancer Genomics Laboratory, Peter MacCallum Cancer Centre, Melbourne, VIC, Australia
- Sir Peter MacCallum Department of Oncology, University of Melbourne, Melbourne, VIC, Australia
- Department of Mathematics and Statistics, University of Melbourne, Melbourne, VIC, Australia
| | - Frederick P Roth
- The Donnelly Centre, University of Toronto, Toronto, ON, Canada.
- Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, ON, Canada.
- Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada.
- Department of Computer Science, University of Toronto, Toronto, ON, Canada.
- Canadian Institute for Advanced Research, Toronto, ON, Canada.
| | - Douglas M Fowler
- Department of Genome Sciences, University of Washington, Seattle, WA, USA.
- Canadian Institute for Advanced Research, Toronto, ON, Canada.
- Department of Bioengineering, University of Washington, Seattle, WA, USA.
| | - Alan F Rubin
- Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia.
- Department of Medical Biology, University of Melbourne, Melbourne, VIC, Australia.
- Bioinformatics and Cancer Genomics Laboratory, Peter MacCallum Cancer Centre, Melbourne, VIC, Australia.
|
32
|
Tomala K, Zrebiec P, Hartl DL. Limits to Compensatory Mutations: Insights from Temperature-Sensitive Alleles. Mol Biol Evol 2019; 36:1874-1883. [PMID: 31058959 PMCID: PMC6735812 DOI: 10.1093/molbev/msz110] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/13/2022]
Abstract
Previous experiments with temperature-sensitive mutants of the yeast enzyme orotidine 5'-phosphate decarboxylase (encoded in gene URA3) yielded the unexpected result that reversion occurs only through exact reversal of the original mutation (Jakubowska A, Korona R. 2009. Lack of evolutionary conservation at positions important for thermal stability in the yeast ODCase protein. Mol Biol Evol. 26(7):1431-1434.). We recreated a set of these mutations in which the codon had two nucleotide substitutions, making exact reversion much less likely. We screened these double mutants for reversion and obtained a number of compensatory mutations occurring at alternative sites in the molecule. None of these compensatory mutations fully restored protein performance. The mechanism of partial compensation is consistent with a model in which protein stabilization is additive, as the same secondary mutations can compensate different primary alterations. The distance between primary and compensatory residues precludes direct interaction between the sites. Instead, most of the compensatory mutants were clustered in proximity to the catalytic center. All of the second-site compensatory substitutions occurred at relatively conserved sites, and the amino acid replacements were to residues found at these sites in a multispecies alignment of the protein. Based on the estimated distribution of changes in Gibbs free energy among a large number of amino acid replacements, we estimate that, for most proteins, the probability that a second-site mutation would have a sufficiently large stabilizing effect to offset a temperature-sensitive mutation is on the order of 10⁻⁴ or less. Hence compensation is likely to take place only for slightly destabilizing mutations because highly stabilizing mutations are exceedingly rare.
Affiliation(s)
- Katarzyna Tomala
- Institute of Environmental Sciences, Faculty of Biology, Jagiellonian University, Krakow, Poland
| | - Piotr Zrebiec
- Institute of Environmental Sciences, Faculty of Biology, Jagiellonian University, Krakow, Poland
| | - Daniel L Hartl
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA
|
33
|
Ferrada E. Gene Families, Epistasis and the Amino Acid Preferences of Protein Homologs. Evol Bioinform Online 2019; 15:1176934319870485. [PMID: 31452598 PMCID: PMC6698995 DOI: 10.1177/1176934319870485] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2019] [Accepted: 07/27/2019] [Indexed: 11/16/2022]
Abstract
In order to preserve structure and function, proteins tend to preferentially conserve amino acids at particular sites along the sequence. Because mutations can affect structure and function, the question arises whether the preference of a protein site for a particular amino acid varies between protein homologs, and to what extent that variation depends on sequence divergence. Answering these questions can help in the development of models of sequence evolution, as well as provide insights on the dependence of the fitness effects of mutations on the genetic background of sequences, a phenomenon known as epistasis. Here, I comment on recent computational work providing a systematic analysis of the extent to which the amino acid preferences of proteins depend on the background mutations of protein homologs.
Affiliation(s)
- Evandro Ferrada
- Center for Genomics and Bioinformatics, Faculty of Science, Universidad Mayor, Santiago, Chile
|
34
|
Hom N, Gentles L, Bloom JD, Lee KK. Deep Mutational Scan of the Highly Conserved Influenza A Virus M1 Matrix Protein Reveals Substantial Intrinsic Mutational Tolerance. J Virol 2019; 93:e00161-19. [PMID: 31019050 PMCID: PMC6580950 DOI: 10.1128/jvi.00161-19] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 01/31/2019] [Accepted: 04/09/2019] [Indexed: 12/30/2022]
Abstract
Influenza A virus matrix protein M1 is involved in multiple stages of the viral infectious cycle. Despite its functional importance, our present understanding of this essential viral protein is limited. The roles of a small subset of specific amino acids have been reported, but a more comprehensive understanding of the relationship between M1 sequence, structure, and virus fitness remains elusive. In this study, we used deep mutational scanning to measure the effect of every amino acid substitution in M1 on viral replication in cell culture. The map of amino acid mutational tolerance we have generated allows us to identify sites that are functionally constrained in cell culture as well as sites that are less constrained. Several sites that exhibit low tolerance to mutation have been found to be critical for M1 function and production of viable virions. Surprisingly, significant portions of the M1 sequence, especially in the C-terminal domain, whose structure is undetermined, were found to be highly tolerant of amino acid variation, despite having extremely low levels of sequence diversity among natural influenza virus strains. This unexpected discrepancy indicates that not all sites in M1 that exhibit high sequence conservation in nature are under strong constraint during selection for viral replication in cell culture. IMPORTANCE: The M1 matrix protein is critical for many stages of the influenza virus infection cycle. Currently, we have an incomplete understanding of this highly conserved protein's function and structure. Key regions of M1, particularly in the C terminus of the protein, remain poorly characterized. In this study, we used deep mutational scanning to determine the extent of M1's tolerance to mutation. Surprisingly, nearly two-thirds of the M1 sequence exhibits a high tolerance for substitutions, contrary to the extremely low sequence diversity observed across naturally occurring M1 isolates. Sites with low mutational tolerance were also identified, suggesting that they likely play critical functional roles and are under selective pressure. These results reveal the intrinsic mutational tolerance throughout M1 and shape future inquiries probing the functions of this essential influenza A virus protein.
Affiliation(s)
- Nancy Hom
- Department of Medicinal Chemistry, University of Washington, Seattle, Washington, USA
| | - Lauren Gentles
- Department of Microbiology, University of Washington, Seattle, Washington, USA
- Division of Basic Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
| | - Jesse D Bloom
- Department of Microbiology, University of Washington, Seattle, Washington, USA
- Division of Basic Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
- Howard Hughes Medical Institute, Chevy Chase, Maryland, USA
| | - Kelly K Lee
- Department of Medicinal Chemistry, University of Washington, Seattle, Washington, USA
- Department of Microbiology, University of Washington, Seattle, Washington, USA
|
35
|
Soh YS, Moncla LH, Eguia R, Bedford T, Bloom JD. Comprehensive mapping of adaptation of the avian influenza polymerase protein PB2 to humans. eLife 2019; 8:45079. [PMID: 31038123 PMCID: PMC6491042 DOI: 10.7554/elife.45079] [Citation(s) in RCA: 37] [Impact Index Per Article: 6.2] [Received: 01/11/2019] [Accepted: 03/31/2019] [Indexed: 12/11/2022]
Abstract
Viruses like influenza are infamous for their ability to adapt to new hosts. Retrospective studies of natural zoonoses and passaging in the lab have identified a modest number of host-adaptive mutations. However, it is unclear if these mutations represent all ways that influenza can adapt to a new host. Here we take a prospective approach to this question by completely mapping amino-acid mutations to the avian influenza virus polymerase protein PB2 that enhance growth in human cells. We identify numerous previously uncharacterized human-adaptive mutations. These mutations cluster on PB2’s surface, highlighting potential interfaces with host factors. Some previously uncharacterized adaptive mutations occur in avian-to-human transmission of H7N9 influenza, showing their importance for natural virus evolution. But other adaptive mutations do not occur in nature because they are inaccessible via single-nucleotide mutations. Overall, our work shows how selection at key molecular surfaces combines with evolutionary accessibility to shape viral host adaptation.
Viruses copy themselves by hijacking the cells of an infected host, but this comes with some limitations. Cells from different species have different molecular machinery and so viruses often have to specialize to a narrow group of species. This specialization consists largely of fine-tuning the way that viral proteins interact with host proteins. For instance, in bird flu viruses, a protein known as PB2 does not interact well with the machinery in human cells. Because PB2 proteins form part of the viral polymerase (the structure that copies the viral genome), this prevents bird flu viruses from replicating efficiently in humans. Sometimes however, changes in the PB2 protein allow bird flu viruses to better replicate in humans, potentially leading to deadly flu pandemics. To understand exactly how this happens, researchers have previously used two approaches: examining the changes that have happened in past flu viruses, and monitoring the evolution of bird flu viruses grown in human cells in the lab. However, these approaches can only look at a small number of the many possible genetic changes to the virus. This makes it hard to anticipate the new ways that flu might adapt to human cells in the future. To overcome this problem, Soh et al. systematically created all of the single changes to the bird flu PB2, altering every element of the protein sequence one-by-one. They then tested which of the changes to PB2 helped the virus grow better in human cells. The modifications that made the viruses thrive were on the surface of the protein, suggesting that they might improve interaction with the cell machinery of the host. Some changes have been found in bird flu viruses that have recently jumped into humans in nature, although fortunately none of these viruses have yet spread widely to cause a pandemic. Many factors affect the evolution of viruses, and their ability to infect new species. Understanding which changes in proteins help these microbes adapt to new hosts is an important element that scientists could consider to assess future risks of pandemics.
Affiliation(s)
- Yq Shirleen Soh
- Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, United States.,Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, United States
| | - Louise H Moncla
- Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, United States.,Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, United States
| | - Rachel Eguia
- Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, United States
| | - Trevor Bedford
- Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, United States.,Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, United States
| | - Jesse D Bloom
- Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, United States.,Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, United States.,Howard Hughes Medical Institute, Seattle, United States
|
36
|
Kazmi SO, Rodrigue N. Detecting amino acid preference shifts with codon-level mutation-selection mixture models. BMC Evol Biol 2019; 19:62. [PMID: 30808289 PMCID: PMC6390532 DOI: 10.1186/s12862-019-1358-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 08/15/2018] [Accepted: 01/11/2019] [Indexed: 01/19/2023]
Abstract
BACKGROUND: In recent years, increasing attention has been placed on the development of phylogeny-based statistical methodologies for uncovering site-specific changes in amino acid fitness profiles over time. The few available random-effects approaches, modelling across-site variation in amino acid profiles as random variables drawn from a statistical law, either lack a mechanistic codon-level formulation, or pose significant computational challenges. RESULTS: Here, we bring together a few existing ideas to explore a simple and fast method based on a predefined finite mixture of amino acid profiles within a codon-level substitution model following the mutation-selection formulation. Our study is focused on the detection of site-specific shifts in amino acid profiles over a known sub-clade of a tree, using simulations with and without shifts over the sub-clade to study the properties of the method. Through modifications of the values of the amino acid profiles, our simulations show different levels of reliability under different forms of finite mixture models. Sites identified by our method in a real data set show obvious overlap with those identified using previous methods, with some notable differences. CONCLUSION: Overall, our results show that when a site-specific shift in amino acid profile is strongly pronounced, involving two clearly different sets of profiles, the method performs very well; but shifts between profiles that share many features are difficult to correctly identify, highlighting the challenging nature of the problem.
Affiliation(s)
- S Omar Kazmi
- Department of Biology, Carleton University, 1125 Colonel By Drive, Ottawa, K1S 5B6, Canada
| | - Nicolas Rodrigue
- Department of Biology, Carleton University, 1125 Colonel By Drive, Ottawa, K1S 5B6, Canada.,Institute of Biochemistry and School of Mathematics and Statistics, Carleton University, 1125 Colonel By Drive, Ottawa, K1S 5B6, Canada
|
37
|
Ferrada E. The Site-Specific Amino Acid Preferences of Homologous Proteins Depend on Sequence Divergence. Genome Biol Evol 2019; 11:121-135. [PMID: 30496400 PMCID: PMC6326188 DOI: 10.1093/gbe/evy261] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Accepted: 11/26/2018] [Indexed: 12/20/2022]
Abstract
The propensity of protein sites to be occupied by any of the 20 amino acids is known as site-specific amino acid preferences (SSAP). Under the assumption that SSAP are conserved among homologs, they can be used to parameterize evolutionary models for the reconstruction of accurate phylogenetic trees. However, simulations and experimental studies have not been able to fully assess the relative conservation of SSAP as a function of sequence divergence between protein homologs. Here, we implement a computational procedure to predict the SSAP of proteins based on the effect of changes in thermodynamic stability upon mutation. An advantage of this computational approach is that it allows us to interrogate a large and unbiased sample of homologous proteins, over the entire spectrum of sequence divergence, and under selection for the same molecular trait. We show that computational predictions have reproducibilities that resemble those obtained in experimental replicates, and can largely recapitulate the SSAP observed in a large-scale mutagenesis experiment. Our results support recent experimental reports on the conservation of SSAP of related homologs, with a slowly increasing fraction of up to 15% of different sites at sequence distances lower than 40%. However, even under the sole contribution of thermodynamic stability, our conservative approach identifies up to 30% of significantly different sites between divergent homologs. We show that this relation holds for homologs of diverse sizes and structural classes. Analyses of residue contact networks suggest that an important determinant of these differences is the increasing accumulation of structural deviations that results from sequence divergence.
Affiliation(s)
- Evandro Ferrada
- Center for Genomics and Bioinformatics, Faculty of Science, Universidad Mayor, Camino La Pirámide 5750, Huechuraba, 8580745, Santiago, Chile
|
38
|
Hilton SK, Bloom JD. Modeling site-specific amino-acid preferences deepens phylogenetic estimates of viral sequence divergence. Virus Evol 2018; 4:vey033. [PMID: 30425841 PMCID: PMC6220371 DOI: 10.1093/ve/vey033] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 02/06/2023] Open
Abstract
Molecular phylogenetics is often used to estimate the time since the divergence of modern gene sequences. For highly diverged sequences, such phylogenetic techniques sometimes estimate surprisingly recent divergence times. In the case of viruses, independent evidence indicates that the estimates of deep divergence times from molecular phylogenetics are sometimes too recent. This discrepancy is caused in part by inadequate models of purifying selection leading to branch-length underestimation. Here we examine the effect on branch-length estimation of using models that incorporate experimental measurements of purifying selection. We find that models informed by experimentally measured site-specific amino-acid preferences estimate longer deep branches on phylogenies of influenza virus hemagglutinin. This lengthening of branches is due to more realistic stationary states of the models, and is mostly independent of the branch-length extension from modeling site-to-site variation in amino-acid substitution rate. The branch-length extension from experimentally informed site-specific models is similar to that achieved by other approaches that allow the stationary state to vary across sites. However, the improvements from all of these site-specific but time homogeneous and site independent models are limited by the fact that a protein’s amino-acid preferences gradually shift as it evolves. Overall, our work underscores the importance of modeling site-specific amino-acid preferences when estimating deep divergence times—but also shows the inherent limitations of approaches that fail to account for how these preferences shift over time.
Affiliation(s)
- Sarah K Hilton
- Basic Sciences and Computational Biology Program, Fred Hutchinson Cancer Research Center
- Department of Genome Sciences, University of Washington, USA
| | - Jesse D Bloom
- Basic Sciences and Computational Biology Program, Fred Hutchinson Cancer Research Center
- Department of Genome Sciences, University of Washington, USA
- Howard Hughes Medical Institute, Seattle, WA, USA
|
39
|
Phillips AM, Ponomarenko AI, Chen K, Ashenberg O, Miao J, McHugh SM, Butty VL, Whittaker CA, Moore CL, Bloom JD, Lin YS, Shoulders MD. Destabilized adaptive influenza variants critical for innate immune system escape are potentiated by host chaperones. PLoS Biol 2018; 16:e3000008. [PMID: 30222731 PMCID: PMC6160216 DOI: 10.1371/journal.pbio.3000008] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Received: 06/30/2018] [Revised: 09/27/2018] [Accepted: 08/30/2018] [Indexed: 11/24/2022] Open
Abstract
The threat of viral pandemics demands a comprehensive understanding of evolution at the host-pathogen interface. Here, we show that the accessibility of adaptive mutations in influenza nucleoprotein at fever-like temperatures is mediated by host chaperones. Particularly noteworthy, we observe that the Pro283 nucleoprotein variant, which (1) is conserved across human influenza strains, (2) confers resistance to the Myxovirus resistance protein A (MxA) restriction factor, and (3) critically contributed to adaptation to humans in the 1918 pandemic influenza strain, is rendered unfit by heat shock factor 1 inhibition-mediated host chaperone depletion at febrile temperatures. This fitness loss is due to biophysical defects that chaperones are unavailable to address when heat shock factor 1 is inhibited. Thus, influenza subverts host chaperones to uncouple the biophysically deleterious consequences of viral protein variants from the benefits of immune escape. In summary, host proteostasis plays a central role in shaping influenza adaptation, with implications for the evolution of other viruses, for viral host switching, and for antiviral drug development.
Affiliation(s)
- Angela M. Phillips
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
| | - Anna I. Ponomarenko
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
| | - Kenny Chen
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
| | - Orr Ashenberg
- Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
| | - Jiayuan Miao
- Department of Chemistry, Tufts University, Medford, Massachusetts, United States of America
| | - Sean M. McHugh
- Department of Chemistry, Tufts University, Medford, Massachusetts, United States of America
| | - Vincent L. Butty
- BioMicro Center, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
| | - Charles A. Whittaker
- BioMicro Center, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
| | - Christopher L. Moore
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
| | - Jesse D. Bloom
- Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
| | - Yu-Shan Lin
- Department of Chemistry, Tufts University, Medford, Massachusetts, United States of America
| | - Matthew D. Shoulders
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
|
40
|
Deep mutational scanning of hemagglutinin helps predict evolutionary fates of human H3N2 influenza variants. Proc Natl Acad Sci U S A 2018; 115:E8276-E8285. [PMID: 30104379 PMCID: PMC6126756 DOI: 10.1073/pnas.1806133115] [Citation(s) in RCA: 135] [Impact Index Per Article: 19.3] [Indexed: 01/31/2023] Open
Abstract
A key goal in the study of influenza virus evolution is to forecast which viral strains will persist and which ones will die out. Here we experimentally measure the effects of all amino acid mutations to the hemagglutinin protein from a human H3N2 influenza strain on viral growth in cell culture. We show that these measurements have utility for distinguishing among viral strains that do and do not succeed in nature. Overall, our work suggests that new high-throughput experimental approaches may be useful for understanding virus evolution in nature.
Human influenza virus rapidly accumulates mutations in its major surface protein hemagglutinin (HA). The evolutionary success of influenza virus lineages depends on how these mutations affect HA’s functionality and antigenicity. Here we experimentally measure the effects on viral growth in cell culture of all single amino acid mutations to the HA from a recent human H3N2 influenza virus strain. We show that mutations that are measured to be more favorable for viral growth are enriched in evolutionarily successful H3N2 viral lineages relative to mutations that are measured to be less favorable for viral growth. Therefore, despite the well-known caveats about cell-culture measurements of viral fitness, such measurements can still be informative for understanding evolution in nature. We also compare our measurements for H3 HA to similar data previously generated for a distantly related H1 HA and find substantial differences in which amino acids are preferred at many sites. For instance, the H3 HA has less disparity in mutational tolerance between the head and stalk domains than the H1 HA. Overall, our work suggests that experimental measurements of mutational effects can be leveraged to help understand the evolutionary fates of viral lineages in nature, but only when the measurements are made on a viral strain similar to the ones being studied in nature.
|
41
|
Lyons DM, Lauring AS. Mutation and Epistasis in Influenza Virus Evolution. Viruses 2018; 10:E407. [PMID: 30081492 PMCID: PMC6115771 DOI: 10.3390/v10080407] [Citation(s) in RCA: 64] [Impact Index Per Article: 9.1] [Received: 07/17/2018] [Revised: 07/30/2018] [Accepted: 07/30/2018] [Indexed: 12/25/2022] Open
Abstract
Influenza remains a persistent public health challenge, because the rapid evolution of influenza viruses has led to marginal vaccine efficacy, antiviral resistance, and the annual emergence of novel strains. This evolvability is driven, in part, by the virus's capacity to generate diversity through mutation and reassortment. Because many new traits require multiple mutations and mutations are frequently combined by reassortment, epistatic interactions between mutations play an important role in influenza virus evolution. While mutation and epistasis are fundamental to the adaptability of influenza viruses, they also constrain the evolutionary process in important ways. Here, we review recent work on mutational effects and epistasis in influenza viruses.
Affiliation(s)
- Daniel M Lyons
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109, USA.
| | - Adam S Lauring
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109, USA.
- Division of Infectious Diseases, Department of Internal Medicine, University of Michigan, Ann Arbor, MI 48109, USA.
- Department of Microbiology and Immunology, University of Michigan, Ann Arbor, MI 48109, USA.
|
42
|
Multiplexed assays of variant effects contribute to a growing genotype-phenotype atlas. Hum Genet 2018; 137:665-678. [PMID: 30073413 PMCID: PMC6153521 DOI: 10.1007/s00439-018-1916-x] [Citation(s) in RCA: 89] [Impact Index Per Article: 12.7] [Received: 06/21/2018] [Accepted: 07/21/2018] [Indexed: 12/12/2022]
Abstract
Given the constantly improving cost and speed of genome sequencing, it is reasonable to expect that personal genomes will soon be known for many millions of humans. This stands in stark contrast with our limited ability to interpret the sequence variants that we find. Although it is, perhaps, easiest to interpret variants in coding regions, the functional impact of the vast majority of missense variants is unknown. While many computational approaches can predict the impact of coding variants, they are given little weight in the current guidelines for interpreting clinical variants. Laboratory assays produce comparatively more trustworthy results, but until recently did not scale to the space of all possible mutations. The development of deep mutational scanning and other multiplexed assays of variant effect has now brought the feasibility of this endeavour within view. Here, we review progress in this field over the last decade, break down the different approaches into their components, and compare methodological differences.
|
43
|
Lyons DM, Lauring AS. Evidence for the Selective Basis of Transition-to-Transversion Substitution Bias in Two RNA Viruses. Mol Biol Evol 2018; 34:3205-3215. [PMID: 29029187 PMCID: PMC5850290 DOI: 10.1093/molbev/msx251] [Citation(s) in RCA: 76] [Impact Index Per Article: 10.9] [Indexed: 12/21/2022] Open
Abstract
The substitution rates of transitions are higher than expected by chance relative to those of transversions. Many have argued that selection disfavors transversions, as nonsynonymous transversions are less likely to conserve biochemical properties of the original amino acid. Only recently has it become feasible to directly test this selective hypothesis by comparing the fitness effects of a large number of transition and transversion mutations. For example, a recent study of six viruses and one beta-lactamase gene did not find evidence supporting the selective hypothesis. Here, we analyze the relative fitness effects of transition and transversion mutations from our recently published genome-wide study of mutational fitness effects in influenza virus. In contrast to prior work, we find that transversions are significantly more detrimental than transitions. Using what we believe to be an improved statistical framework, we also identify a similar trend in two HIV data sets. We further demonstrate a fitness difference in transition and transversion mutations using four deep mutational scanning data sets of influenza virus and HIV, which provided adequate statistical power. We find that three of the most commonly cited radical/conservative amino acid categories are predictive of fitness, supporting their utility in studies of positive selection and codon usage bias. We conclude that selection is a major contributor to the transition:transversion substitution bias in viruses and that this effect is only partially explained by the greater likelihood of transversion mutations to cause radical as opposed to conservative amino acid changes.
Affiliation(s)
- Daniel M Lyons
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI
| | - Adam S Lauring
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI
- Department of Microbiology and Immunology, University of Michigan, Ann Arbor, MI
- Division of Infectious Diseases, Department of Internal Medicine, University of Michigan, Ann Arbor, MI
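The transition/transversion distinction that the Lyons and Lauring abstract above relies on can be made concrete with a short sketch. This is only an illustration of the standard definition (a transition swaps a purine for a purine or a pyrimidine for a pyrimidine; a transversion swaps one class for the other), not the statistical framework used in the paper. The function names `classify_substitution` and `ts_tv_ratio` are hypothetical.

```python
# Illustrative sketch only: standard purine/pyrimidine classification,
# not the analysis pipeline of the cited study.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Classify a single-nucleotide change as a transition or transversion."""
    # Treat RNA uracil as thymine so RNA-virus sequences are handled too.
    ref = ref.upper().replace("U", "T")
    alt = alt.upper().replace("U", "T")
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError(f"not a nucleotide substitution: {ref!r} -> {alt!r}")
    if ref == alt:
        return "identical"
    # Both bases in the same chemical class means a transition.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def ts_tv_ratio(substitutions) -> float:
    """Return the transition:transversion ratio for (ref, alt) pairs."""
    counts = {"transition": 0, "transversion": 0}
    for ref, alt in substitutions:
        kind = classify_substitution(ref, alt)
        if kind in counts:
            counts[kind] += 1
    if counts["transversion"] == 0:
        raise ValueError("ratio undefined: no transversions observed")
    return counts["transition"] / counts["transversion"]
```

For a given base there are two possible transversions but only one possible transition, so a ratio above 0.5 already indicates an excess of transitions relative to uniform mutation among the three alternatives.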
|
44
|
Storz JF. Compensatory mutations and epistasis for protein function. Curr Opin Struct Biol 2018; 50:18-25. [PMID: 29100081 PMCID: PMC5936477 DOI: 10.1016/j.sbi.2017.10.009] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Received: 09/06/2017] [Revised: 10/05/2017] [Accepted: 10/12/2017] [Indexed: 01/09/2023]
Abstract
Adaptive protein evolution may be facilitated by neutral amino acid mutations that confer no benefit when they first arise but which potentiate subsequent function-altering mutations via direct or indirect structural mechanisms. Theoretical and empirical results indicate that such compensatory interactions (intramolecular epistasis) can exert a strong influence on trajectories of protein evolution. For this reason, assessing the form and prevalence of intramolecular epistasis and characterizing biophysical mechanisms of compensatory interaction are important research goals at the nexus of structural biology and molecular evolution. Here I review recent insights derived from protein-engineering studies, and I describe an approach for identifying and characterizing mechanisms of epistasis that integrates experimental data on structure-function relationships with analyses of comparative sequence data.
Affiliation(s)
- Jay F Storz
- University of Nebraska, School of Biological Sciences, Lincoln, NE 68588-0114, United States.
|
45
|
Risso VA, Sanchez-Ruiz JM, Ozkan SB. Biotechnological and protein-engineering implications of ancestral protein resurrection. Curr Opin Struct Biol 2018; 51:106-115. [PMID: 29660672 DOI: 10.1016/j.sbi.2018.02.007] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.9] [Received: 01/21/2018] [Revised: 02/18/2018] [Accepted: 02/20/2018] [Indexed: 10/17/2022]
Abstract
Approximations to the sequences of ancestral proteins can be derived from the sequences of their modern descendants. Proteins encoded by such reconstructed sequences can be prepared in the laboratory and subjected to experimental scrutiny. These 'resurrected' ancestral proteins often display remarkable properties, reflecting ancestral adaptations to intra-cellular and extra-cellular environments that differed from the environments hosting modern/extant proteins. Recent experimental and computational work has specifically discussed high stability, substrate and catalytic promiscuity, conformational flexibility/diversity and altered patterns of interaction with other sub-cellular components. In this review, we discuss these remarkable properties as well as recent attempts to explore their biotechnological and protein-engineering potential.
Affiliation(s)
- Valeria A Risso
- Departamento de Quimica Fisica, Facultad de Ciencias, University of Granada, 18071 Granada, Spain
| | - Jose M Sanchez-Ruiz
- Departamento de Quimica Fisica, Facultad de Ciencias, University of Granada, 18071 Granada, Spain.
| | - S Banu Ozkan
- Department of Physics and Center for Biological Physics, Arizona State University, Tempe, AZ 85281, United States.
|
46
|
Pervasive contingency and entrenchment in a billion years of Hsp90 evolution. Proc Natl Acad Sci U S A 2018; 115:4453-4458. [PMID: 29626131 DOI: 10.1073/pnas.1718133115] [Citation(s) in RCA: 66] [Impact Index Per Article: 9.4] [Indexed: 01/06/2023] Open
Abstract
Interactions among mutations within a protein have the potential to make molecular evolution contingent and irreversible, but the extent to which epistasis actually shaped historical evolutionary trajectories is unclear. To address this question, we experimentally measured how the fitness effects of historical sequence substitutions changed during the billion-year evolutionary history of the heat shock protein 90 (Hsp90) ATPase domain beginning from a deep eukaryotic ancestor to modern Saccharomyces cerevisiae. We found a pervasive influence of epistasis. Of 98 derived amino acid states that evolved along this lineage, about half compromise fitness when introduced into the reconstructed ancestral Hsp90. And the vast majority of ancestral states reduce fitness when introduced into the extant S. cerevisiae Hsp90. Overall, more than 75% of historical substitutions were contingent on permissive substitutions that rendered the derived state nondeleterious, became entrenched by subsequent restrictive substitutions that made the ancestral state deleterious, or both. This epistasis was primarily caused by specific interactions among sites rather than a general effect on the protein's tolerance to mutation. Our results show that epistasis continually opened and closed windows of mutational opportunity over evolutionary timescales, producing histories and biological states that reflect the transient internal constraints imposed by the protein's fleeting sequence states.
|
47
|
Haddox HK, Dingens AS, Hilton SK, Overbaugh J, Bloom JD. Mapping mutational effects along the evolutionary landscape of HIV envelope. eLife 2018; 7:34420. [PMID: 29590010 PMCID: PMC5910023 DOI: 10.7554/elife.34420] [Citation(s) in RCA: 80] [Impact Index Per Article: 11.4] [Received: 12/17/2017] [Accepted: 03/15/2018] [Indexed: 01/04/2023] Open
Abstract
The immediate evolutionary space accessible to HIV is largely determined by how single amino acid mutations affect fitness. These mutational effects can shift as the virus evolves. However, the prevalence of such shifts in mutational effects remains unclear. Here, we quantify the effects on viral growth of all amino acid mutations to two HIV envelope (Env) proteins that differ at >100 residues. Most mutations similarly affect both Envs, but the amino acid preferences of a minority of sites have clearly shifted. These shifted sites usually prefer a specific amino acid in one Env, but tolerate many amino acids in the other. Surprisingly, shifts are only slightly enriched at sites that have substituted between the Envs, and many occur at residues that do not even contact substitutions. Therefore, long-range epistasis can unpredictably shift Env’s mutational tolerance during HIV evolution, although the amino acid preferences of most sites are conserved between moderately diverged viral strains.
The virus that causes AIDS, or HIV, has a protein called Env on its surface, which is essential for the virus to infect cells. Env can also be recognized by the immune system, which then targets the virus for destruction or blocks it from infecting cells. Unfortunately, Env evolves very quickly, which means that HIV can evade our defenses. However, there are limits to how much this protein can change, since it still needs to perform its essential role in helping viruses enter cells. In the century since HIV first appeared in human populations, the virus has evolved considerably. There are now many HIV strains that infect people, and they bear Env proteins with substantially different sequences. However, it is not clear if these changes in sequence have resulted in Envs from distinct strains being able to tolerate different mutations. To examine this question, Haddox et al. compared how the Envs from two strains of HIV react to modifications in their sequences. They created all possible individual mutations in the proteins, and the resulting collections of mutated viruses were then tested for their ability to infect cells in the laboratory. Most mutations had similar effects in both Env proteins. This allowed Haddox et al. to identify portions of the protein that easily accommodate changes, and portions that must remain unchanged for viruses to remain infectious, at least in the laboratory. Some of these mutations are under different types of pressures when the virus faces the immune system, and those were identified using computational approaches. However, some mutations were tolerated differently by the two Env proteins. Therefore, viral strains differ in how their Env proteins can evolve. The parts of Env that showed differences in mutational tolerance between the strains were not necessarily the parts that differ in sequence. This shows that changes in sequence in one part of the protein can modify how other portions evolve. It remains to be determined whether changes in tolerance to mutations translate into differences in how the virus can escape immunity. This is an important question given that the rapid evolution of Env is a major obstacle to creating a vaccine for HIV.
Affiliation(s)
- Hugh K Haddox
- Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, United States
- Molecular and Cellular Biology PhD program, University of Washington, Seattle, United States
| | - Adam S Dingens
- Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, United States
- Molecular and Cellular Biology PhD program, University of Washington, Seattle, United States
| | - Sarah K Hilton
- Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, United States
- Department of Genome Sciences, University of Washington, Seattle, United States
| | - Julie Overbaugh
- Human Biology Division, Fred Hutchinson Cancer Research Center, Seattle, United States
- Epidemiology Program, Fred Hutchinson Cancer Research Center, Seattle, United States
| | - Jesse D Bloom
- Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, United States
- Department of Genome Sciences, University of Washington, Seattle, United States
|
48
|
Molaro A, Young JM, Malik HS. Evolutionary origins and diversification of testis-specific short histone H2A variants in mammals. Genome Res 2018; 28:460-473. [PMID: 29549088 PMCID: PMC5880237 DOI: 10.1101/gr.229799.117] [Citation(s) in RCA: 45] [Impact Index Per Article: 6.4] [Received: 09/01/2017] [Accepted: 02/13/2018] [Indexed: 12/11/2022]
Abstract
Eukaryotic genomes must accomplish both compact packaging for genome stability and inheritance, as well as accessibility for gene expression. They do so using post-translational modifications of four ancient canonical histone proteins (H2A, H2B, H3, and H4) and by deploying histone variants with specialized chromatin functions. Some histone variants are conserved across all eukaryotes, whereas others are lineage-specific. Here, we performed detailed phylogenomic analyses of “short H2A histone” variants found in mammalian genomes. We discovered a previously undescribed typically-sized H2A variant in monotremes and marsupials, H2A.R, which may represent the common ancestor of the short H2As. We also discovered a novel class of short H2A histone variants in eutherian mammals, H2A.Q. We show that short H2A variants arose on the X Chromosome in the common ancestor of all eutherian mammals and diverged into four evolutionarily distinct clades: H2A.B, H2A.L, H2A.P, and H2A.Q. However, the repertoires of short histone H2A variants vary extensively among eutherian mammals due to lineage-specific gains and losses. Finally, we show that all four short H2As are subject to accelerated rates of protein evolution relative to both canonical and other variant H2A proteins including H2A.R. Our analyses reveal that short H2As are a unique class of testis-restricted histone variants displaying an unprecedented evolutionary dynamism. Based on their X-Chromosomal localization, genetic turnover, and testis-specific expression, we hypothesize that short H2A variants may participate in genetic conflicts involving sex chromosomes during reproduction.
Affiliation(s)
- Antoine Molaro
- Division of Basic Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109, USA
| | - Janet M Young
- Division of Basic Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109, USA
| | - Harmit S Malik
- Division of Basic Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109, USA
- Howard Hughes Medical Institute, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109, USA
|
49
|
Hilton SK, Doud MB, Bloom JD. phydms: software for phylogenetic analyses informed by deep mutational scanning. PeerJ 2017; 5:e3657. [PMID: 28785526 PMCID: PMC5541924 DOI: 10.7717/peerj.3657] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 06/20/2017] [Accepted: 07/15/2017] [Indexed: 11/30/2022] Open
Abstract
It has recently become possible to experimentally measure the effects of all amino-acid point mutations to proteins using deep mutational scanning. These experimental measurements can inform site-specific phylogenetic substitution models of gene evolution in nature. Here we describe software that efficiently performs analyses with such substitution models. This software, phydms, can be used to compare the results of deep mutational scanning experiments to the selection on genes in nature. Given a phylogenetic tree topology inferred with another program, phydms enables rigorous comparison of how well different experiments on the same gene capture actual natural selection. It also enables re-scaling of deep mutational scanning data to account for differences in the stringency of selection in the lab and nature. Finally, phydms can identify sites that are evolving differently in nature than expected from experiments in the lab. As data from deep mutational scanning experiments become increasingly widespread, phydms will facilitate quantitative comparison of the experimental results to the actual selection pressures shaping evolution in nature.
Affiliation(s)
- Sarah K Hilton
- Division of Basic Sciences and Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Department of Genome Sciences, University of Washington, Seattle, WA, United States of America
| | - Michael B Doud
- Division of Basic Sciences and Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Department of Genome Sciences, University of Washington, Seattle, WA, United States of America
- Medical Scientist Training Program, University of Washington, Seattle, WA, United States of America
| | - Jesse D Bloom
- Division of Basic Sciences and Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Department of Genome Sciences, University of Washington, Seattle, WA, United States of America
|
50
|
Ashenberg O, Padmakumar J, Doud MB, Bloom JD. Deep mutational scanning identifies sites in influenza nucleoprotein that affect viral inhibition by MxA. PLoS Pathog 2017; 13:e1006288. [PMID: 28346537 PMCID: PMC5383324 DOI: 10.1371/journal.ppat.1006288] [Citation(s) in RCA: 56] [Impact Index Per Article: 7.0] [Received: 08/28/2016] [Revised: 04/06/2017] [Accepted: 03/10/2017] [Indexed: 01/24/2023] Open
Abstract
The innate-immune restriction factor MxA inhibits influenza replication by targeting the viral nucleoprotein (NP). Human influenza virus is more resistant than avian influenza virus to inhibition by human MxA, and prior work has compared human and avian viral strains to identify amino-acid differences in NP that affect sensitivity to MxA. However, this strategy is limited to identifying sites in NP where mutations that affect MxA sensitivity have fixed during the small number of documented zoonotic transmissions of influenza to humans. Here we use an unbiased deep mutational scanning approach to quantify how all single amino-acid mutations to NP affect MxA sensitivity in the context of replication-competent virus. We both identify new sites in NP where mutations affect MxA resistance and re-identify mutations known to have increased MxA resistance during historical adaptations of influenza to humans. Most of the sites where mutations have the greatest effect are almost completely conserved across all influenza A viruses, and the amino acids at these sites confer relatively high resistance to MxA. These sites cluster in regions of NP that appear to be important for its recognition by MxA. Overall, our work systematically identifies the sites in influenza nucleoprotein where mutations affect sensitivity to MxA. We also demonstrate a powerful new strategy for identifying regions of viral proteins that affect inhibition by host factors.
During viral infection, human cells express proteins that can restrict virus replication. However, in many cases it remains unclear what determines the sensitivity of a given viral strain to a particular restriction factor. Here we use a high-throughput approach to measure how all amino-acid mutations to the nucleoprotein of influenza virus affect restriction by the human protein MxA. We find several dozen sites where mutations substantially affect the sensitivity of influenza virus to MxA. While a few of these sites are known to have fixed mutations during past adaptations of influenza virus to humans, most of the sites are broadly conserved across all influenza strains and have never previously been described as affecting MxA resistance. Our results therefore show that the known historical evolution of influenza has only involved substitutions at a small fraction of the sites where mutations can in principle affect MxA resistance. We suggest that this is because many sites are already broadly fixed at amino acids that confer high resistance.
Affiliation(s)
- Orr Ashenberg
- Division of Basic Sciences and Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Jai Padmakumar
- Division of Basic Sciences and Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Michael B. Doud
- Division of Basic Sciences and Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Medical Scientist Training Program, University of Washington School of Medicine, Seattle, WA, USA
| | - Jesse D. Bloom
- Division of Basic Sciences and Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
|