1
Forrest B, Derbel H, Zhao Z, Liu Q. MMRT: MultiMut Recursive Tree for predicting functional effects of high-order protein variants from low-order variants. Comput Struct Biotechnol J 2025; 27:672-681. [PMID: 40070521 PMCID: PMC11894328 DOI: 10.1016/j.csbj.2025.02.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/25/2024] [Revised: 02/10/2025] [Accepted: 02/17/2025] [Indexed: 03/14/2025] Open
Abstract
Protein sequences primarily determine their stability and functions. Mutations may occur at one, two, or three positions at the same time (low-order variants) or at multiple positions simultaneously (high-order variants), which affect protein functions. So far, low-order variants, such as single variants, double variants, and triple variants, have been well-studied through high-throughput experimental scanning techniques and computational prediction methods. However, research on high-order variants remains limited because of the difficulty of scanning an exponentially large number of potential variant combinations. Nonetheless, studying higher-order variants is crucial for understanding the pathogenesis of complex diseases, advancing protein engineering, and driving precision medicine. In this work, we introduce a novel deep learning model, namely MultiMut Recursive Tree (MMRT), to address this challenge of predicting the functional effects of high-order variants. MMRT integrates deep learning with a recursive tree framework to leverage the information from low-order variants to predict functional effects of high-order variants. We evaluated MMRT on datasets comprising 685,593 high-order variants. Our results (mean Spearman's correlation coefficient 0.55) demonstrated that MMRT outperformed three existing state-of-the-art methods: ESM (evolutionary scale modeling), DeepSequence, and ECNet (evolutionary context-integrated neural network). MMRT thus provides more accurate prediction of the functional effects of high-order protein variants, offering great potential for aiding the interpretation of variants in human disease studies.
Affiliation(s)
- Bryce Forrest
  - Nevada Institute of Personalized Medicine, University of Nevada, Las Vegas, 4505 S Maryland Pkwy, Las Vegas, NV 89154, USA
- Houssemeddine Derbel
  - Nevada Institute of Personalized Medicine, University of Nevada, Las Vegas, 4505 S Maryland Pkwy, Las Vegas, NV 89154, USA
- Zhongming Zhao
  - Center for Precision Health, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Qian Liu
  - Nevada Institute of Personalized Medicine, University of Nevada, Las Vegas, 4505 S Maryland Pkwy, Las Vegas, NV 89154, USA
  - School of Life Sciences, College of Sciences, University of Nevada, Las Vegas, 4505 S Maryland Pkwy, Las Vegas, NV 89154, USA
2
Zheng N, Cai Y, Zhang Z, Zhou H, Deng Y, Du S, Tu M, Fang W, Xia X. Tailoring industrial enzymes for thermostability and activity evolution by the machine learning-based iCASE strategy. Nat Commun 2025; 16:604. [PMID: 39799136 PMCID: PMC11724889 DOI: 10.1038/s41467-025-55944-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2024] [Accepted: 01/03/2025] [Indexed: 01/15/2025] Open
Abstract
The pursuit of obtaining enzymes with high activity and stability remains a grail in enzyme evolution due to the stability-activity trade-off. Here, we develop an isothermal compressibility-assisted dynamic squeezing index perturbation engineering (iCASE) strategy to construct hierarchical modular networks for enzymes of varying complexity. Molecular mechanism analysis elucidates that the peak of adaptive evolution is reached through a structural response mechanism among variants. Furthermore, this dynamic response predictive model using structure-based supervised machine learning is established to predict enzyme function and fitness, demonstrating robust performance across different datasets and reliable prediction for epistasis. The universality of the iCASE strategy is validated by four sorts of enzymes with different structures and catalytic types. This machine learning-based iCASE strategy provides guidance for future research on the fitness evolution of enzymes.
Affiliation(s)
- Nan Zheng
  - Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, PR China
- Yongchao Cai
  - Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, PR China
- Zehua Zhang
  - Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, PR China
- Huimin Zhou
  - Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, PR China
- Yu Deng
  - Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, PR China
- Shuang Du
  - Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, PR China
- Mai Tu
  - School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, PR China
- Wei Fang
  - School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, PR China
- Xiaole Xia
  - Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, PR China
  - College of Food Science and Engineering, Tianjin University of Science and Technology, Tianjin, PR China
3
Chen D, Su W, Choy KT, Chu YS, Lin CH, Yen HL. High throughput profiling identified PA-L106R amino acid substitution in A(H1N1)pdm09 influenza virus that confers reduced susceptibility to baloxavir in vitro. Antiviral Res 2024; 229:105961. [PMID: 39002800 DOI: 10.1016/j.antiviral.2024.105961] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2024] [Revised: 07/09/2024] [Accepted: 07/09/2024] [Indexed: 07/15/2024]
Abstract
Baloxavir acid (BXA) is a pan-influenza antiviral that targets the cap-dependent endonuclease of the polymerase acidic (PA) protein required for viral mRNA synthesis. To gain a comprehensive understanding of the molecular changes associated with reduced susceptibility to BXA and their fitness profile, we performed deep mutational scanning at the PA endonuclease domain of an A(H1N1)pdm09 virus. The recombinant virus libraries were serially passaged in vitro under increasing concentrations of BXA followed by next-generation sequencing to monitor PA amino acid substitutions with increased detection frequencies. Enriched PA amino acid changes were each introduced into a recombinant A(H1N1)pdm09 virus to validate their effect on BXA susceptibility and viral replication fitness in vitro. The I38T/M substitutions known to confer reduced susceptibility to BXA were invariably detected from recombinant virus libraries within 5 serial passages. In addition, we identified a novel L106R substitution that emerged in the third passage and conferred greater than 10-fold reduced susceptibility to BXA. PA-L106 is highly conserved among seasonal influenza A and B viruses. Compared to the wild-type virus, the L106R substitution resulted in reduced polymerase activity and a minor reduction of the peak viral load, suggesting the amino acid change may result in moderate fitness loss. Our results support the use of deep mutational scanning as a practical tool to elucidate genotype-phenotype relationships, including mapping amino acid substitutions with reduced susceptibility to antivirals.
Affiliation(s)
- Dongdong Chen
  - School of Public Health, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Wen Su
  - School of Public Health, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Ka-Tim Choy
  - School of Public Health, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Yan Sing Chu
  - Centre for PanorOmic Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Chi Ho Lin
  - Centre for PanorOmic Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Hui-Ling Yen
  - School of Public Health, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
4
Illig AM, Siedhoff NE, Davari MD, Schwaneberg U. Evolutionary Probability and Stacked Regressions Enable Data-Driven Protein Engineering with Minimized Experimental Effort. J Chem Inf Model 2024; 64:6350-6360. [PMID: 39088689 DOI: 10.1021/acs.jcim.4c00704] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/03/2024]
Abstract
Protein engineering through directed evolution and (semi)rational approaches is routinely applied to optimize protein properties for a broad range of applications in industry and academia. The multitude of possible variants, combined with limited screening throughput, hampers efficient protein engineering. Data-driven strategies have emerged as a powerful tool to model the protein fitness landscape that can be explored in silico, significantly accelerating protein engineering campaigns. However, such methods require a certain amount of data, which often cannot be provided, to generate a reliable model of the fitness landscape. Here, we introduce MERGE, a method that combines direct coupling analysis (DCA) and machine learning (ML). MERGE enables data-driven protein engineering when only limited data are available for training, typically ranging from 50 to 500 labeled sequences. Our method demonstrates remarkable performance in predicting a protein's fitness value and rank based on its sequence across diverse proteins and properties. Notably, MERGE outperforms state-of-the-art methods when only small data sets are available for modeling, requiring fewer computational resources, and proving particularly promising for protein engineers who have access to limited amounts of data.
Affiliation(s)
- Niklas E Siedhoff
  - Institute of Biotechnology, RWTH Aachen University, Worringerweg 3, 52074 Aachen, Germany
- Mehdi D Davari
  - Department of Bioorganic Chemistry, Leibniz Institute of Plant Biochemistry, Weinberg 3, 06120 Halle, Germany
- Ulrich Schwaneberg
  - Institute of Biotechnology, RWTH Aachen University, Worringerweg 3, 52074 Aachen, Germany
5
Lei R, Qing E, Odle A, Yuan M, Gunawardene CD, Tan TJC, So N, Ouyang WO, Wilson IA, Gallagher T, Perlman S, Wu NC, Wong LYR. Functional and antigenic characterization of SARS-CoV-2 spike fusion peptide by deep mutational scanning. Nat Commun 2024; 15:4056. [PMID: 38744813 PMCID: PMC11094058 DOI: 10.1038/s41467-024-48104-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2023] [Accepted: 04/16/2024] [Indexed: 05/16/2024] Open
Abstract
The fusion peptide of SARS-CoV-2 spike protein is functionally important for membrane fusion during virus entry and is part of a broadly neutralizing epitope. However, sequence determinants at the fusion peptide and its adjacent regions for pathogenicity and antigenicity remain elusive. In this study, we perform a series of deep mutational scanning (DMS) experiments on an S2 region spanning the fusion peptide of authentic SARS-CoV-2 in different cell lines and in the presence of broadly neutralizing antibodies. We identify mutations at residue 813 of the spike protein that reduced TMPRSS2-mediated entry with decreased virulence. In addition, we show that an F823Y mutation, present in bat betacoronavirus HKU9 spike protein, confers resistance to broadly neutralizing antibodies. Our findings provide mechanistic insights into SARS-CoV-2 pathogenicity and also highlight a potential challenge in developing broadly protective S2-based coronavirus vaccines.
Affiliation(s)
- Ruipeng Lei
  - Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
- Enya Qing
  - Department of Microbiology and Immunology, Loyola University Chicago, Maywood, IL, 60153, USA
- Abby Odle
  - Department of Microbiology and Immunology, University of Iowa, Iowa City, IA, 52242, USA
- Meng Yuan
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, 92037, USA
- Chaminda D Gunawardene
  - Center for Virus-Host Innate Immunity, Rutgers New Jersey Medical School, Newark, NJ, 07103, USA
  - Department of Microbiology, Biochemistry and Molecular Genetics, Rutgers New Jersey Medical School, Newark, NJ, 07103, USA
- Timothy J C Tan
  - Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
- Natalie So
  - Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
  - Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
- Wenhao O Ouyang
  - Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
- Ian A Wilson
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, 92037, USA
  - The Skaggs Institute for Chemical Biology, The Scripps Research Institute, La Jolla, CA, 92037, USA
- Tom Gallagher
  - Department of Microbiology and Immunology, Loyola University Chicago, Maywood, IL, 60153, USA
- Stanley Perlman
  - Department of Microbiology and Immunology, University of Iowa, Iowa City, IA, 52242, USA
  - Department of Pediatrics, University of Iowa, Iowa City, IA, 52242, USA
- Nicholas C Wu
  - Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
  - Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
  - Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
  - Carle Illinois College of Medicine, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
- Lok-Yin Roy Wong
  - Department of Microbiology and Immunology, University of Iowa, Iowa City, IA, 52242, USA
  - Center for Virus-Host Innate Immunity, Rutgers New Jersey Medical School, Newark, NJ, 07103, USA
  - Department of Microbiology, Biochemistry and Molecular Genetics, Rutgers New Jersey Medical School, Newark, NJ, 07103, USA
6
Notin P, Kollasch AW, Ritter D, van Niekerk L, Paul S, Spinner H, Rollins N, Shaw A, Weitzman R, Frazer J, Dias M, Franceschi D, Orenbuch R, Gal Y, Marks DS. ProteinGym: Large-Scale Benchmarks for Protein Design and Fitness Prediction. bioRxiv 2023:2023.12.07.570727. [PMID: 38106144 PMCID: PMC10723403 DOI: 10.1101/2023.12.07.570727] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Indexed: 12/19/2023]
Abstract
Predicting the effects of mutations in proteins is critical to many applications, from understanding genetic disease to designing novel proteins that can address our most pressing challenges in climate, agriculture and healthcare. Despite a surge in machine learning-based protein models to tackle these questions, an assessment of their respective benefits is challenging due to the use of distinct, often contrived, experimental datasets, and the variable performance of models across different protein families. Addressing these challenges requires scale. To that end we introduce ProteinGym, a large-scale and holistic set of benchmarks specifically designed for protein fitness prediction and design. It encompasses both a broad collection of over 250 standardized deep mutational scanning assays, spanning millions of mutated sequences, as well as curated clinical datasets providing high-quality expert annotations about mutation effects. We devise a robust evaluation framework that combines metrics for both fitness prediction and design, factors in known limitations of the underlying experimental methods, and covers both zero-shot and supervised settings. We report the performance of a diverse set of over 70 high-performing models from various subfields (e.g., alignment-based, inverse folding) in a unified benchmark suite. We open source the corresponding codebase, datasets, MSAs, structures, and model predictions, and develop a user-friendly website that facilitates data access and analysis.
Affiliation(s)
- Ada Shaw
  - Applied Mathematics, Harvard University
- Mafalda Dias
  - Centre for Genomic Regulation, Universitat Pompeu Fabra
- Yarin Gal
  - Computer Science, University of Oxford
7
Lei R, Qing E, Odle A, Yuan M, Tan TJ, So N, Ouyang WO, Wilson IA, Gallagher T, Perlman S, Wu NC, Wong LYR. Functional and antigenic characterization of SARS-CoV-2 spike fusion peptide by deep mutational scanning. bioRxiv 2023:2023.11.28.569051. [PMID: 38076875 PMCID: PMC10705381 DOI: 10.1101/2023.11.28.569051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/19/2023]
Abstract
The fusion peptide of SARS-CoV-2 spike protein is functionally important for membrane fusion during virus entry and is part of a broadly neutralizing epitope. However, sequence determinants at the fusion peptide and its adjacent regions for pathogenicity and antigenicity remain elusive. In this study, we performed a series of deep mutational scanning (DMS) experiments on an S2 region spanning the fusion peptide of authentic SARS-CoV-2 in different cell lines and in the presence of broadly neutralizing antibodies. We identified mutations at residue 813 of the spike protein that reduced TMPRSS2-mediated entry with decreased virulence. In addition, we showed that an F823Y mutation, present in bat betacoronavirus HKU9 spike protein, confers resistance to broadly neutralizing antibodies. Our findings provide mechanistic insights into SARS-CoV-2 pathogenicity and also highlight a potential challenge in developing broadly protective S2-based coronavirus vaccines.
Affiliation(s)
- Ruipeng Lei
  - Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Enya Qing
  - Department of Microbiology and Immunology, Loyola University Chicago, Maywood, IL 60153, USA
- Abby Odle
  - Department of Microbiology and Immunology, University of Iowa, Iowa City, IA 52242, USA
- Meng Yuan
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Timothy J.C. Tan
  - Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Natalie So
  - Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
  - Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Wenhao O. Ouyang
  - Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Ian A. Wilson
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
  - The Skaggs Institute for Chemical Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Tom Gallagher
  - Department of Microbiology and Immunology, Loyola University Chicago, Maywood, IL 60153, USA
- Stanley Perlman
  - Department of Microbiology and Immunology, University of Iowa, Iowa City, IA 52242, USA
  - Department of Pediatrics, University of Iowa, Iowa City, IA 52242, USA
- Nicholas C. Wu
  - Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
  - Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
  - Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
  - Carle Illinois College of Medicine, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Lok-Yin Roy Wong
  - Department of Microbiology and Immunology, University of Iowa, Iowa City, IA 52242, USA
  - Center for Virus-Host-Innate Immunity, Rutgers New Jersey Medical School, Newark, NJ 07103, USA
  - Department of Microbiology, Biochemistry and Molecular Genetics, Rutgers New Jersey Medical School, Newark, NJ 07103, USA
8
Derbel H, Zhao Z, Liu Q. Accurate prediction of functional effect of single amino acid variants with deep learning. Comput Struct Biotechnol J 2023; 21:5776-5784. [PMID: 38074467 PMCID: PMC10709104 DOI: 10.1016/j.csbj.2023.11.017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2023] [Revised: 11/08/2023] [Accepted: 11/09/2023] [Indexed: 02/12/2024] Open
Abstract
The assessment of functional effect of amino acid variants is a critical biological problem in proteomics for clinical medicine and protein engineering. Although natively occurring variants offer insights into deleterious variants, high-throughput deep mutational experiments enable comprehensive investigation of amino acid variants for a given protein. However, these mutational experiments are too expensive to dissect millions of variants on thousands of proteins. Thus, computational approaches have been proposed, but they heavily rely on hand-crafted evolutionary conservation, limiting their accuracy. Recent advancement in transformers provides a promising solution to precisely estimate the functional effects of protein variants on high-throughput experimental data. Here, we introduce a novel deep learning model, namely Rep2Mut-V2, which leverages learned representation from transformer models. Rep2Mut-V2 significantly enhances the prediction accuracy for 27 types of measurements of functional effects of protein variants. In the evaluation of 38 protein datasets with 118,933 single amino acid variants, Rep2Mut-V2 achieved an average Spearman's correlation coefficient of 0.7. This surpasses the performance of six state-of-the-art methods, including the recently released methods ESM, DeepSequence and EVE. Even with limited training data, Rep2Mut-V2 outperforms ESM and DeepSequence, showing its potential to extend high-throughput experimental analysis for more protein variants to reduce experimental cost. In conclusion, Rep2Mut-V2 provides accurate predictions of the functional effects of single amino acid variants of protein coding sequences. This tool can significantly aid in the interpretation of variants in human disease studies.
Affiliation(s)
- Houssemeddine Derbel
  - Nevada Institute of Personalized Medicine, University of Nevada, Las Vegas, Las Vegas, NV 89154, USA
- Zhongming Zhao
  - Center for Precision Health, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Qian Liu
  - Nevada Institute of Personalized Medicine, University of Nevada, Las Vegas, Las Vegas, NV 89154, USA
  - School of Life Sciences, College of Sciences, University of Nevada, Las Vegas, Las Vegas, NV 89154, USA
9
Hauser BM, Luo Y, Nathan A, Gaiha GD, Vavvas D, Comander J, Pierce EA, Place EM, Bujakowska KM, Rossin EJ. Structure-based network analysis predicts mutations associated with inherited retinal disease. medRxiv 2023:2023.07.05.23292247. [PMID: 37461650 PMCID: PMC10350150 DOI: 10.1101/2023.07.05.23292247] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/27/2023]
Abstract
With continued advances in gene sequencing technologies comes the need to develop better tools to understand which mutations cause disease. Here we validate structure-based network analysis (SBNA) [1,2] in well-studied human proteins and report results of using SBNA to identify critical amino acids that may cause retinal disease if subject to missense mutation. We computed SBNA scores for genes with high-quality structural data, starting with validating the method using 4 well-studied human disease-associated proteins. We then analyzed 47 inherited retinal disease (IRD) genes. We compared SBNA scores to phenotype data from the ClinVar database and found a significant difference between benign and pathogenic mutations with respect to network score. Finally, we applied this approach to 65 patients at Massachusetts Eye and Ear (MEE) who were diagnosed with IRD but for whom no genetic cause was found. Multivariable logistic regression models built using SBNA scores for IRD-associated genes successfully predicted pathogenicity of novel mutations, allowing us to identify likely causative disease variants in 37 patients with IRD from our clinic. In conclusion, SBNA can be meaningfully applied to human proteins and may help predict mutations causative of IRD.
Affiliation(s)
- Yuyang Luo
  - Department of Ophthalmology, Massachusetts Eye and Ear, Harvard Medical School, Boston, MA
- Anusha Nathan
  - Ragon Institute of Mass General, MIT, and Harvard, Cambridge, MA
- Gaurav D. Gaiha
  - Ragon Institute of Mass General, MIT, and Harvard, Cambridge, MA
- Demetrios Vavvas
  - Department of Ophthalmology, Massachusetts Eye and Ear, Harvard Medical School, Boston, MA
- Jason Comander
  - Department of Ophthalmology, Massachusetts Eye and Ear, Harvard Medical School, Boston, MA
- Eric A. Pierce
  - Department of Ophthalmology, Massachusetts Eye and Ear, Harvard Medical School, Boston, MA
- Emily M. Place
  - Department of Ophthalmology, Massachusetts Eye and Ear, Harvard Medical School, Boston, MA
- Kinga M. Bujakowska
  - Department of Ophthalmology, Massachusetts Eye and Ear, Harvard Medical School, Boston, MA
- Elizabeth J. Rossin
  - Department of Ophthalmology, Massachusetts Eye and Ear, Harvard Medical School, Boston, MA
10
Moulana A, Dupic T, Phillips AM, Desai MM. Genotype-phenotype landscapes for immune-pathogen coevolution. Trends Immunol 2023; 44:384-396. [PMID: 37024340 PMCID: PMC10147585 DOI: 10.1016/j.it.2023.03.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2023] [Revised: 03/08/2023] [Accepted: 03/09/2023] [Indexed: 04/07/2023]
Abstract
Our immune systems constantly coevolve with the pathogens that challenge them, as pathogens adapt to evade our defense responses, with our immune repertoires shifting in turn. These coevolutionary dynamics take place across a vast and high-dimensional landscape of potential pathogen and immune receptor sequence variants. Mapping the relationship between these genotypes and the phenotypes that determine immune-pathogen interactions is crucial for understanding, predicting, and controlling disease. Here, we review recent developments applying high-throughput methods to create large libraries of immune receptor and pathogen protein sequence variants and measure relevant phenotypes. We describe several approaches that probe different regions of the high-dimensional sequence space and comment on how combinations of these methods may offer novel insight into immune-pathogen coevolution.
Affiliation(s)
- Alief Moulana
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Thomas Dupic
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Angela M Phillips
  - Department of Microbiology and Immunology, University of California at San Francisco, San Francisco, CA 94143, USA
- Michael M Desai
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
  - Department of Physics, Harvard University, Cambridge, MA 02138, USA
  - NSF-Simons Center for Mathematical and Statistical Analysis of Biology, Harvard University, Cambridge, MA 02138, USA
  - Quantitative Biology Initiative, Harvard University, Cambridge, MA 02138, USA
11
Flynn J, Samant N, Schneider-Nachum G, Tenzin T, Bolon DNA. Mutational fitness landscape and drug resistance. Curr Opin Struct Biol 2023; 78:102525. [PMID: 36621152 PMCID: PMC10243218 DOI: 10.1016/j.sbi.2022.102525] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2022] [Revised: 11/29/2022] [Accepted: 12/06/2022] [Indexed: 01/08/2023]
Abstract
Robust technology has been developed to systematically quantify fitness landscapes that provide valuable opportunities to improve our understanding of drug resistance and define new avenues to develop drugs with reduced resistance susceptibility. We outline the critical importance of drug resistance studies and the potential for fitness landscape approaches to contribute to this effort. We describe the major technical advancements in mutational scanning, which is the primary approach used to quantify protein fitness landscapes. There are many complex steps to consider in planning and executing mutational scanning projects including developing a selection scheme, generating mutant libraries, tracking the frequency of variants using next-generation sequencing, and processing and interpreting the data. Key experimental parameters impacting each of these steps are discussed to aid in planning fitness landscape studies. There is a strong need for improved understanding of drug resistance, and fitness landscapes provide a promising new approach.
Affiliation(s)
- Julia Flynn
  - Department of Biochemistry and Molecular Biotechnology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA
- Neha Samant
  - Department of Biochemistry and Molecular Biotechnology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA
- Gily Schneider-Nachum
  - Department of Biochemistry and Molecular Biotechnology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA
- Tsepal Tenzin
  - Department of Biochemistry and Molecular Biotechnology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA
- Daniel N A Bolon
  - Department of Biochemistry and Molecular Biotechnology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA
12
Domingo E, García-Crespo C, Soria ME, Perales C. Viral Fitness, Population Complexity, Host Interactions, and Resistance to Antiviral Agents. Curr Top Microbiol Immunol 2023; 439:197-235. [PMID: 36592247 DOI: 10.1007/978-3-031-15640-3_6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/03/2023]
Abstract
Fitness of viruses has become a standard parameter to quantify their adaptation to a biological environment. Fitness determinations for RNA viruses (and some highly variable DNA viruses) meet with several uncertainties. Of particular interest are those that arise from mutant spectrum complexity, absence of population equilibrium, and internal interactions among components of a mutant spectrum. Here, concepts, fitness measurements, limitations, and current views on experimental viral fitness landscapes are discussed. The effect of viral fitness on resistance to antiviral agents is covered in some detail since it constitutes a widespread problem in antiviral pharmacology, and a challenge for the design of effective antiviral treatments. Recent evidence with hepatitis C virus suggests the operation of mechanisms of antiviral resistance additional to the standard selection of drug-escape mutants. The possibility that high replicative fitness may be the driver of such alternative mechanisms is considered. New broad-spectrum antiviral designs that target viral fitness may curtail the impact of drug-escape mutants in treatment failures. We consider to what extent fitness-related concepts apply to coronaviruses and how they may affect strategies for COVID-19 prevention and treatment.
Affiliation(s)
- Esteban Domingo
  - Centro de Biología Molecular "Severo Ochoa" (CSIC-UAM), Consejo Superior de Investigaciones Científicas (CSIC), Campus de Cantoblanco, 28049, Madrid, Spain
  - Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBERehd), Instituto de Salud Carlos III, 28029, Madrid, Spain
- Carlos García-Crespo
  - Centro de Biología Molecular "Severo Ochoa" (CSIC-UAM), Consejo Superior de Investigaciones Científicas (CSIC), Campus de Cantoblanco, 28049, Madrid, Spain
  - Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBERehd), Instituto de Salud Carlos III, 28029, Madrid, Spain
- María Eugenia Soria
  - Centro de Biología Molecular "Severo Ochoa" (CSIC-UAM), Consejo Superior de Investigaciones Científicas (CSIC), Campus de Cantoblanco, 28049, Madrid, Spain
  - Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBERehd), Instituto de Salud Carlos III, 28029, Madrid, Spain
  - Department of Clinical Microbiology, Instituto de Investigación Sanitaria-Fundación Jiménez Díaz University Hospital, Universidad Autónoma de Madrid (IIS-FJD, UAM), Av. Reyes Católicos 2, 28040, Madrid, Spain
- Celia Perales
  - Centro de Biología Molecular "Severo Ochoa" (CSIC-UAM), Consejo Superior de Investigaciones Científicas (CSIC), Campus de Cantoblanco, 28049, Madrid, Spain
  - Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBERehd), Instituto de Salud Carlos III, 28029, Madrid, Spain
  - Department of Clinical Microbiology, Instituto de Investigación Sanitaria-Fundación Jiménez Díaz University Hospital, Universidad Autónoma de Madrid (IIS-FJD, UAM), Av. Reyes Católicos 2, 28040, Madrid, Spain
  - Department of Molecular and Cell Biology, Centro Nacional de Biotecnología (CNB-CSIC), Consejo Superior de Investigaciones Científicas (CSIC), Campus de Cantoblanco, 28049, Madrid, Spain
13
Delgado S, Perales C, García-Crespo C, Soria ME, Gallego I, de Ávila AI, Martínez-González B, Vázquez-Sirvent L, López-Galíndez C, Morán F, Domingo E. A Two-Level, Intramutant Spectrum Haplotype Profile of Hepatitis C Virus Revealed by Self-Organized Maps. Microbiol Spectr 2021; 9:e0145921. [PMID: 34756074 PMCID: PMC8579923 DOI: 10.1128/spectrum.01459-21] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 09/03/2021] [Accepted: 10/12/2021] [Indexed: 12/17/2022] Open
Abstract
RNA viruses replicate as complex mutant spectra termed viral quasispecies. The frequency of each individual genome in a mutant spectrum depends on its rate of generation and its relative fitness in the replicating population ensemble. The advent of deep sequencing methodologies allows, for the first time, the quantification of haplotype abundances within mutant spectra. There is no information on the haplotype profile of the resident genomes and how the landscape evolves when a virus replicates in a controlled cell culture environment. Here, we report the construction of intramutant spectrum haplotype landscapes of three amplicons of the NS5A-NS5B coding region of hepatitis C virus (HCV). Two-dimensional (2D) neural networks were constructed for 44 related HCV populations derived from a common clonal ancestor that was passaged up to 210 times in human hepatoma Huh-7.5 cells in the absence of external selective pressures. The haplotype profiles consisted of an extended dense basal platform, from which a lower number of protruding higher peaks emerged. As HCV increased its adaptation to the cells, the number of haplotype peaks within each mutant spectrum expanded, and their distribution shifted in the 2D network. The results show that extensive HCV replication in a monotonous cell culture environment does not limit HCV exploration of sequence space through haplotype peak movements. The landscapes reflect dynamic variation in the intramutant spectrum haplotype profile and may serve as a reference to interpret the modifications produced by external selective pressures or to compare with the landscapes of mutant spectra in complex in vivo environments.
IMPORTANCE: The study provides for the first time the haplotype profile and its variation in the course of virus adaptation to a cell culture environment in the absence of external selective constraints. The deep sequencing-based self-organized maps document a two-layer haplotype distribution with an ample basal platform and a lower number of protruding peaks. The results suggest an inferred intramutant spectrum fitness landscape structure that offers potential benefits for virus resilience to mutational inputs.
Affiliation(s)
- Soledad Delgado
  - Departamento de Sistemas Informáticos, Escuela Técnica Superior de Ingeniería de Sistemas Informáticos (ETSISI), Universidad Politécnica de Madrid, Madrid, Spain
- Celia Perales
  - Department of Clinical Microbiology, Instituto de Investigación Sanitaria-Fundación Jiménez Díaz University Hospital, Universidad Autónoma de Madrid (IIS-FJD), Madrid, Spain
  - Centro de Biología Molecular “Severo Ochoa” (CSIC-UAM), Consejo Superior de Investigaciones Científicas (CSIC), Madrid, Spain
  - Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBERehd), Instituto de Salud Carlos III, Madrid, Spain
- Carlos García-Crespo
  - Centro de Biología Molecular “Severo Ochoa” (CSIC-UAM), Consejo Superior de Investigaciones Científicas (CSIC), Madrid, Spain
  - Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBERehd), Instituto de Salud Carlos III, Madrid, Spain
- María Eugenia Soria
  - Department of Clinical Microbiology, Instituto de Investigación Sanitaria-Fundación Jiménez Díaz University Hospital, Universidad Autónoma de Madrid (IIS-FJD), Madrid, Spain
  - Centro de Biología Molecular “Severo Ochoa” (CSIC-UAM), Consejo Superior de Investigaciones Científicas (CSIC), Madrid, Spain
  - Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBERehd), Instituto de Salud Carlos III, Madrid, Spain
- Isabel Gallego
  - Centro de Biología Molecular “Severo Ochoa” (CSIC-UAM), Consejo Superior de Investigaciones Científicas (CSIC), Madrid, Spain
  - Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBERehd), Instituto de Salud Carlos III, Madrid, Spain
- Ana Isabel de Ávila
  - Centro de Biología Molecular “Severo Ochoa” (CSIC-UAM), Consejo Superior de Investigaciones Científicas (CSIC), Madrid, Spain
  - Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBERehd), Instituto de Salud Carlos III, Madrid, Spain
- Brenda Martínez-González
  - Department of Clinical Microbiology, Instituto de Investigación Sanitaria-Fundación Jiménez Díaz University Hospital, Universidad Autónoma de Madrid (IIS-FJD), Madrid, Spain
  - Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBERehd), Instituto de Salud Carlos III, Madrid, Spain
- Lucía Vázquez-Sirvent
  - Department of Clinical Microbiology, Instituto de Investigación Sanitaria-Fundación Jiménez Díaz University Hospital, Universidad Autónoma de Madrid (IIS-FJD), Madrid, Spain
  - Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBERehd), Instituto de Salud Carlos III, Madrid, Spain
- Cecilio López-Galíndez
  - Unidad de Virología Molecular, Laboratorio de Referencia e Investigación en Retrovirus, Centro Nacional de Microbiología, Instituto de Salud Carlos III, Majadahonda, Madrid, Spain
- Federico Morán
  - Departamento de Bioquímica y Biología Molecular, Universidad Complutense de Madrid, Madrid, Spain
- Esteban Domingo
  - Centro de Biología Molecular “Severo Ochoa” (CSIC-UAM), Consejo Superior de Investigaciones Científicas (CSIC), Madrid, Spain
  - Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBERehd), Instituto de Salud Carlos III, Madrid, Spain
14
Burton TD, Eyre NS. Applications of Deep Mutational Scanning in Virology. Viruses 2021; 13:1020. [PMID: 34071591 PMCID: PMC8227372 DOI: 10.3390/v13061020] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/29/2021] [Revised: 05/26/2021] [Accepted: 05/26/2021] [Indexed: 12/20/2022] Open
Abstract
Several recently developed high-throughput techniques have changed the field of molecular virology. For example, proteomics studies reveal complete interactomes of a viral protein, genome-wide CRISPR knockout and activation screens probe the importance of every single human gene in aiding or fighting a virus, and ChIP-seq experiments reveal genome-wide epigenetic changes in response to infection. Deep mutational scanning is a relatively novel form of protein science which allows the in-depth functional analysis of every nucleotide within a viral gene or genome, revealing regions of importance, flexibility, and mutational potential. In this review, we discuss the application of this technique to RNA viruses including members of the Flaviviridae family, Influenza A Virus and Severe Acute Respiratory Syndrome Coronavirus 2. We also briefly discuss the reverse genetics systems which allow for analysis of viral replication cycles, next-generation sequencing technologies and the bioinformatics tools that facilitate this research.
Affiliation(s)
- Nicholas S. Eyre
  - College of Medicine and Public Health, Flinders University, Bedford Park, SA 5042, Australia
15
Abstract
RNA viruses, such as hepatitis C virus (HCV), influenza virus, and SARS-CoV-2, are notorious for their ability to evolve rapidly under selection in novel environments. It is known that the high mutation rate of RNA viruses can generate huge genetic diversity to facilitate viral adaptation. However, less attention has been paid to the underlying fitness landscape that represents the selection forces on viral genomes, especially under different selection conditions. Here, we systematically quantified the distribution of fitness effects of about 1,600 single amino acid substitutions in the drug-targeted region of NS5A protein of HCV. We found that the majority of nonsynonymous substitutions incur large fitness costs, suggesting that NS5A protein is highly optimized. The replication fitness of viruses is correlated with the pattern of sequence conservation in nature, and viral evolution is constrained by the need to maintain protein stability. We characterized the adaptive potential of HCV by subjecting the mutant viruses to selection by the antiviral drug daclatasvir at multiple concentrations. Both the relative fitness values and the number of beneficial mutations were found to increase with the increasing concentrations of daclatasvir. The changes in the spectrum of beneficial mutations in NS5A protein can be explained by a pharmacodynamics model describing viral fitness as a function of drug concentration. Overall, our results show that the distribution of fitness effects of mutations is modulated by both the constraints on the biophysical properties of proteins (i.e., selection pressure for protein stability) and the level of environmental stress (i.e., selection pressure for drug resistance).
IMPORTANCE: Many viruses adapt rapidly to novel selection pressures, such as antiviral drugs. Understanding how pathogens evolve under drug selection is critical for the success of antiviral therapy against human pathogens. By combining deep sequencing with selection experiments in cell culture, we have quantified the distribution of fitness effects of mutations in hepatitis C virus (HCV) NS5A protein. Our results indicate that the majority of single amino acid substitutions in NS5A protein incur large fitness costs. Simulation of protein stability suggests viral evolution is constrained by the need to maintain protein stability. By subjecting the mutant viruses to selection under an antiviral drug, we find that the adaptive potential of viral proteins in a novel environment is modulated by the level of environmental stress, which can be explained by a pharmacodynamics model. Our comprehensive characterization of the fitness landscapes of NS5A can potentially guide the design of effective strategies to limit viral evolution.
16
Ferguson AL, Ranganathan R. 100th Anniversary of Macromolecular Science Viewpoint: Data-Driven Protein Design. ACS Macro Lett 2021; 10:327-340. [PMID: 35549066 DOI: 10.1021/acsmacrolett.0c00885] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Indexed: 12/13/2022]
Abstract
The design of synthetic proteins with the desired function is a long-standing goal in biomolecular science, with broad applications in biochemical engineering, agriculture, medicine, and public health. Rational de novo design and experimental directed evolution have achieved remarkable successes but are challenged by the requirement to find functional "needles" in the vast "haystack" of protein sequence space. Data-driven models for fitness landscapes provide a predictive map between protein sequence and function and can prospectively identify functional candidates for experimental testing to greatly improve the efficiency of this search. This Viewpoint reviews the applications of machine learning and, in particular, deep learning as part of data-driven protein engineering platforms. We highlight recent successes, review promising computational methodologies, and provide an outlook on future challenges and opportunities. The article is written for a broad audience comprising both polymer and protein scientists and computer and data scientists interested in an up-to-date review of recent innovations and opportunities in this rapidly evolving field.
Affiliation(s)
- Andrew L. Ferguson
  - Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, United States
- Rama Ranganathan
  - Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, United States
  - Center for Physics of Evolving Systems, University of Chicago, Chicago, Illinois 60637, United States
  - Biochemistry and Molecular Biology, University of Chicago, Chicago, Illinois 60637, United States
17
Munro D, Singh M. DeMaSk: a deep mutational scanning substitution matrix and its use for variant impact prediction. Bioinformatics 2020; 36:5322-5329. [PMID: 33325500 PMCID: PMC8016454 DOI: 10.1093/bioinformatics/btaa1030] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 07/21/2020] [Revised: 10/16/2020] [Accepted: 11/30/2020] [Indexed: 01/27/2023] Open
Abstract
Motivation: Accurately predicting the quantitative impact of a substitution on a protein’s molecular function would be a great aid in understanding the effects of observed genetic variants across populations. While this remains a challenging task, new approaches can leverage data from the increasing numbers of comprehensive deep mutational scanning (DMS) studies that systematically mutate proteins and measure fitness.
Results: We introduce DeMaSk, an intuitive and interpretable method based only upon DMS datasets and sequence homologs that predicts the impact of missense mutations within any protein. DeMaSk first infers a directional amino acid substitution matrix from DMS datasets and then fits a linear model that combines these substitution scores with measures of per-position evolutionary conservation and variant frequency across homologs. Despite its simplicity, DeMaSk has state-of-the-art performance in predicting the impact of amino acid substitutions, and can easily and rapidly be applied to any protein sequence.
Availability and implementation: https://demask.princeton.edu generates fitness impact predictions and visualizations for any user-submitted protein sequence.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Daniel Munro
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, 08544, USA
- Mona Singh
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, 08544, USA
  - Department of Computer Science, Princeton University, Princeton, 08544, USA
18
Zhang TH, Dai L, Barton JP, Du Y, Tan Y, Pang W, Chakraborty AK, Lloyd-Smith JO, Sun R. Predominance of positive epistasis among drug resistance-associated mutations in HIV-1 protease. PLoS Genet 2020; 16:e1009009. [PMID: 33085662 PMCID: PMC7605711 DOI: 10.1371/journal.pgen.1009009] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 11/13/2019] [Revised: 11/02/2020] [Accepted: 07/24/2020] [Indexed: 12/12/2022] Open
Abstract
Drug-resistant mutations often have deleterious impacts on replication fitness, posing a fitness cost that can only be overcome by compensatory mutations. However, the role of fitness cost in the evolution of drug resistance has often been overlooked in clinical studies or in vitro selection experiments, as these observations only capture the outcome of drug selection. In this study, we systematically profile the fitness landscape of resistance-associated sites in HIV-1 protease using deep mutational scanning. We construct a mutant library covering combinations of mutations at 11 sites in HIV-1 protease, all of which are associated with resistance to protease inhibitors in clinic. Using deep sequencing, we quantify the fitness of thousands of HIV-1 protease mutants after multiple cycles of replication in human T cells. Although the majority of resistance-associated mutations have deleterious effects on viral replication, we find that epistasis among resistance-associated mutations is predominantly positive. Furthermore, our fitness data are consistent with genetic interactions inferred directly from HIV sequence data of patients. Fitness valleys formed by strong positive epistasis reduce the likelihood of reversal of drug resistance mutations. Overall, our results support the view that strong compensatory effects are involved in the emergence of clinically observed resistance mutations and provide insights to understanding fitness barriers in the evolution and reversion of drug resistance.
Affiliation(s)
- Tian-hao Zhang
  - Molecular Biology Institute, University of California, Los Angeles, CA 90095, USA
- Lei Dai
  - CAS Key Laboratory of Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- John P. Barton
  - Department of Physics and Astronomy, University of California, Riverside, CA 92521, USA
- Yushen Du
  - School of Medicine, ZheJiang University, Hangzhou, 210000, China
  - Molecular and Medical Pharmacology, University of California, Los Angeles, CA 90095, USA
- Yuxiang Tan
  - CAS Key Laboratory of Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Wenwen Pang
  - Department of Public Health Laboratory Science, West China School of Public Health, Sichuan University, Chengdu 610041, China
- Arup K. Chakraborty
  - Institute for Medical Engineering and Science, Departments of Chemical Engineering, Physics, & Chemistry, Massachusetts Institute of Technology, MA 21309, USA
  - Ragon Institute of MGH, MIT, & Harvard, Cambridge, MA 21309, USA
- James O. Lloyd-Smith
  - Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA 90095, USA
- Ren Sun
  - Molecular and Medical Pharmacology, University of California, Los Angeles, CA 90095, USA
19
Esposito D, Weile J, Shendure J, Starita LM, Papenfuss AT, Roth FP, Fowler DM, Rubin AF. MaveDB: an open-source platform to distribute and interpret data from multiplexed assays of variant effect. Genome Biol 2019; 20:223. [PMID: 31679514 PMCID: PMC6827219 DOI: 10.1186/s13059-019-1845-6] [Citation(s) in RCA: 154] [Impact Index Per Article: 25.7] [Received: 03/31/2019] [Accepted: 10/01/2019] [Indexed: 11/10/2022] Open
Abstract
Multiplex assays of variant effect (MAVEs), such as deep mutational scans and massively parallel reporter assays, test thousands of sequence variants in a single experiment. Despite the importance of MAVE data for basic and clinical research, there is no standard resource for their discovery and distribution. Here, we present MaveDB (https://www.mavedb.org), a public repository for large-scale measurements of sequence variant impact, designed for interoperability with applications to interpret these datasets. We also describe the first such application, MaveVis, which retrieves, visualizes, and contextualizes variant effect maps. Together, the database and applications will empower the community to mine these powerful datasets.
Affiliation(s)
- Daniel Esposito
  - Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Jochen Weile
  - The Donnelly Centre, University of Toronto, Toronto, ON, Canada
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, ON, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
- Jay Shendure
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
  - Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
- Lea M Starita
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
  - Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
- Anthony T Papenfuss
  - Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
  - Department of Medical Biology, University of Melbourne, Melbourne, VIC, Australia
  - Bioinformatics and Cancer Genomics Laboratory, Peter MacCallum Cancer Centre, Melbourne, VIC, Australia
  - Sir Peter MacCallum Department of Oncology, University of Melbourne, Melbourne, VIC, Australia
  - Department of Mathematics and Statistics, University of Melbourne, Melbourne, VIC, Australia
- Frederick P Roth
  - The Donnelly Centre, University of Toronto, Toronto, ON, Canada
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, ON, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
  - Canadian Institute for Advanced Research, Toronto, ON, Canada
- Douglas M Fowler
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
  - Canadian Institute for Advanced Research, Toronto, ON, Canada
  - Department of Bioengineering, University of Washington, Seattle, WA, USA
- Alan F Rubin
  - Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
  - Department of Medical Biology, University of Melbourne, Melbourne, VIC, Australia
  - Bioinformatics and Cancer Genomics Laboratory, Peter MacCallum Cancer Centre, Melbourne, VIC, Australia
20
Laine E, Karami Y, Carbone A. GEMME: a simple and fast global epistatic model predicting mutational effects. Mol Biol Evol 2019; 36:2604-2619. [PMID: 31406981 PMCID: PMC6805226 DOI: 10.1093/molbev/msz179] [Citation(s) in RCA: 76] [Impact Index Per Article: 12.7] [Received: 02/14/2019] [Revised: 06/03/2019] [Accepted: 08/02/2019] [Indexed: 12/15/2022] Open
Abstract
The systematic and accurate description of protein mutational landscapes is a question of utmost importance in biology, bioengineering, and medicine. Recent progress has been achieved by leveraging the increasing wealth of genomic data and by modeling intersite dependencies within biological sequences. However, state-of-the-art methods remain time consuming. Here, we present Global Epistatic Model for predicting Mutational Effects (GEMME) (www.lcqb.upmc.fr/GEMME), an original and fast method that predicts mutational outcomes by explicitly modeling the evolutionary history of natural sequences. This allows accounting for all positions in a sequence when estimating the effect of a given mutation. GEMME uses only a few biologically meaningful and interpretable parameters. Assessed against 50 high- and low-throughput mutational experiments, it performs similarly to or better than existing methods overall. It accurately predicts the mutational landscapes of a wide range of protein families, including viral ones and, more generally, highly conserved families. Given an input alignment, it generates the full mutational landscape of a protein in a matter of minutes. It is freely available as a package and a webserver at www.lcqb.upmc.fr/GEMME/.
Affiliation(s)
- Elodie Laine
  - Sorbonne Université, UPMC University Paris 06, CNRS, IBPS, UMR 7238, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), 75005 Paris, France
- Yasaman Karami
  - Sorbonne Université, UPMC University Paris 06, CNRS, IBPS, UMR 7238, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), 75005 Paris, France
  - Sorbonne Université, UPMC-Univ P6, Institut du Calcul et de la Simulation
- Alessandra Carbone
  - Sorbonne Université, UPMC University Paris 06, CNRS, IBPS, UMR 7238, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), 75005 Paris, France
  - Institut Universitaire de France
21
Rollins NJ, Brock KP, Poelwijk FJ, Stiffler MA, Gauthier NP, Sander C, Marks DS. Inferring protein 3D structure from deep mutation scans. Nat Genet 2019; 51:1170-1176. [PMID: 31209393 PMCID: PMC7295002 DOI: 10.1038/s41588-019-0432-9] [Citation(s) in RCA: 96] [Impact Index Per Article: 16.0] [Received: 10/09/2018] [Accepted: 04/29/2019] [Indexed: 11/09/2022]
Abstract
We describe an experimental method of three-dimensional (3D) structure determination that exploits the increasing ease of high-throughput mutational scans. Inspired by the success of using natural, evolutionary sequence covariation to compute protein and RNA folds, we explored whether 'laboratory', synthetic sequence variation might also yield 3D structures. We analyzed five large-scale mutational scans and discovered that the pairs of residues with the largest positive epistasis in the experiments are sufficient to determine the 3D fold. We show that the strongest epistatic pairings from genetic screens of three proteins, a ribozyme and a protein interaction reveal 3D contacts within and between macromolecules. Using these experimental epistatic pairs, we compute ab initio folds for a GB1 domain (within 1.8 Å of the crystal structure) and a WW domain (2.1 Å). We propose strategies that reduce the number of mutants needed for contact prediction, suggesting that genomics-based techniques can efficiently predict 3D structure.
Affiliation(s)
- Nathan J Rollins
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Kelly P Brock
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
  - Department of Cell Biology, Harvard Medical School, Boston, MA, USA
- Frank J Poelwijk
  - cBio Center, Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
- Michael A Stiffler
  - cBio Center, Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
- Nicholas P Gauthier
  - Department of Cell Biology, Harvard Medical School, Boston, MA, USA
  - cBio Center, Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
- Chris Sander
  - Department of Cell Biology, Harvard Medical School, Boston, MA, USA
  - cBio Center, Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
  - Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Debora S Marks
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
  - Broad Institute of Harvard and MIT, Cambridge, MA, USA
22
Raja R, Baral S, Dixit NM. Interferon at the cellular, individual, and population level in hepatitis C virus infection: Its role in the interferon-free treatment era. Immunol Rev 2019; 285:55-71. [PMID: 30129199 DOI: 10.1111/imr.12689] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 12/15/2022]
Abstract
The advent of powerful direct-acting antiviral agents (DAAs) has revolutionized the treatment of hepatitis C. DAAs cure nearly all patients with short duration, oral treatments. Significant efforts are now underway to optimize DAA-based treatments. We discuss the potential role of interferon in this optimization. Clinical studies present compelling evidence that DAAs perform better in treatment-naive individuals than in individuals who previously failed treatment with interferon, a surprising correlation because interferon and DAAs are thought to act independently. Recent mathematical models explore a mechanistic hypothesis underlying this correlation. The hypothesis invokes the action of interferon at the cellular, individual, and population levels. Strong interferon responses prevent the productive infection of cells, reduce viral replication, and impede the development of resistance to DAAs in infected individuals and improve cure rates elicited by DAAs in treated populations. The models develop descriptions of these processes, integrate them into a comprehensive framework, and capture clinical data quantitatively, providing a successful test of the hypothesis. Individuals with strong endogenous interferon responses thus present a promising subpopulation for reducing DAA treatment durations. This review discusses the conceptual advances made by the models, highlights the new insights they unravel, and examines their applicability to optimize DAA-based treatments.
Affiliation(s)
- Rubesh Raja
  - Department of Chemical Engineering, Indian Institute of Science, Bangalore, India
- Subhasish Baral
  - Department of Chemical Engineering, Indian Institute of Science, Bangalore, India
- Narendra M Dixit
  - Department of Chemical Engineering, Indian Institute of Science, Bangalore, India
  - Centre for Biosystems Science and Engineering, Indian Institute of Science, Bangalore, India
|
23
|
Gaiha GD, Rossin EJ, Urbach J, Landeros C, Collins DR, Nwonu C, Muzhingi I, Anahtar MN, Waring OM, Piechocka-Trocha A, Waring M, Worrall DP, Ghebremichael MS, Newman RM, Power KA, Allen TM, Chodosh J, Walker BD. Structural topology defines protective CD8+ T cell epitopes in the HIV proteome. Science 2019; 364:480-484. [PMID: 31048489 PMCID: PMC6855781 DOI: 10.1126/science.aav5095] [Citation(s) in RCA: 104] [Impact Index Per Article: 17.3] [Received: 09/23/2018] [Accepted: 03/25/2019] [Indexed: 12/26/2022]
Abstract
Mutationally constrained epitopes of variable pathogens represent promising targets for vaccine design but are not reliably identified by sequence conservation. In this study, we employed structure-based network analysis, which applies network theory to HIV protein structure data to quantitate the topological importance of individual amino acid residues. Mutation of residues at important network positions disproportionately impaired viral replication and occurred with high frequency in epitopes presented by protective human leukocyte antigen (HLA) class I alleles. Moreover, CD8+ T cell targeting of highly networked epitopes distinguished individuals who naturally control HIV, even in the absence of protective HLA alleles. This approach thereby provides a mechanistic basis for immune control and a means to identify CD8+ T cell epitopes of topological importance for rational immunogen design, including a T cell-based HIV vaccine.
Affiliation(s)
- Gaurav D Gaiha
  - Ragon Institute of MGH, MIT and Harvard, Cambridge, MA 02139, USA
  - Gastrointestinal Unit, Massachusetts General Hospital, Boston, MA 02114, USA
- Elizabeth J Rossin
  - The Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
  - Department of Ophthalmology, Massachusetts Eye and Ear, Boston, MA 02114, USA
- Jonathan Urbach
  - Ragon Institute of MGH, MIT and Harvard, Cambridge, MA 02139, USA
- David R Collins
  - Ragon Institute of MGH, MIT and Harvard, Cambridge, MA 02139, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD 20815, USA
- Chioma Nwonu
  - Ragon Institute of MGH, MIT and Harvard, Cambridge, MA 02139, USA
- Itai Muzhingi
  - Ragon Institute of MGH, MIT and Harvard, Cambridge, MA 02139, USA
- Melis N Anahtar
  - Ragon Institute of MGH, MIT and Harvard, Cambridge, MA 02139, USA
  - Department of Pathology, Massachusetts General Hospital, Boston, MA 02114, USA
- Olivia M Waring
  - Ragon Institute of MGH, MIT and Harvard, Cambridge, MA 02139, USA
  - Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Alicja Piechocka-Trocha
  - Ragon Institute of MGH, MIT and Harvard, Cambridge, MA 02139, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD 20815, USA
- Michael Waring
  - Ragon Institute of MGH, MIT and Harvard, Cambridge, MA 02139, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD 20815, USA
- Daniel P Worrall
  - Ragon Institute of MGH, MIT and Harvard, Cambridge, MA 02139, USA
- Ruchi M Newman
  - Ragon Institute of MGH, MIT and Harvard, Cambridge, MA 02139, USA
- Karen A Power
  - Ragon Institute of MGH, MIT and Harvard, Cambridge, MA 02139, USA
- Todd M Allen
  - Ragon Institute of MGH, MIT and Harvard, Cambridge, MA 02139, USA
- James Chodosh
  - Department of Ophthalmology, Massachusetts Eye and Ear, Boston, MA 02114, USA
- Bruce D Walker
  - Ragon Institute of MGH, MIT and Harvard, Cambridge, MA 02139, USA
  - The Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD 20815, USA
  - Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
|
24
|
Raja R, Pareek A, Newar K, Dixit NM. Mutational pathway maps and founder effects define the within-host spectrum of hepatitis C virus mutants resistant to drugs. PLoS Pathog 2019; 15:e1007701. [PMID: 30934020 PMCID: PMC6459561 DOI: 10.1371/journal.ppat.1007701] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 11/03/2018] [Revised: 04/11/2019] [Accepted: 03/13/2019] [Indexed: 12/11/2022] Open
Abstract
Knowledge of the within-host frequencies of resistance-associated amino acid variants (RAVs) is important to the identification of optimal drug combinations for the treatment of hepatitis C virus (HCV) infection. Multiple RAVs may exist in infected individuals, often below detection limits, at any resistance locus, defining the diversity of accessible resistance pathways. We developed a multiscale mathematical model to estimate the pre-treatment frequencies of the entire spectrum of mutants at chosen loci. Using a codon-level description of amino acids, we performed stochastic simulations of intracellular dynamics with every possible nucleotide variant as the infecting strain and estimated the relative infectivity of each variant and the resulting distribution of variants produced. We employed these quantities in a deterministic multi-strain model of extracellular dynamics and estimated mutant frequencies. Our predictions captured database frequencies of the RAV R155K, resistant to NS3/4A protease inhibitors, presenting a successful test of our formalism. We found that mutational pathway maps, interconnecting all viable mutants, and strong founder effects determined the mutant spectrum. The spectra were vastly different for HCV genotypes 1a and 1b, underlying their differential responses to drugs. Using a recently determined fitness landscape, we estimated that 13 amino acid variants, encoded by 44 codons, exist at residue 93 of the NS5A protein, illustrating the massive diversity of accessible resistance pathways at specific loci. Accounting for this diversity, which our model enables, would help optimize drug combinations. Our model may be applied to describe the within-host evolution of other flaviviruses and inform vaccine design strategies.
Affiliation(s)
- Rubesh Raja
  - Department of Chemical Engineering, Indian Institute of Science, Bangalore, India
- Aditya Pareek
  - Department of Chemical Engineering, Indian Institute of Science, Bangalore, India
- Kapil Newar
  - Department of Chemical Engineering, Indian Institute of Science, Bangalore, India
- Narendra M. Dixit
  - Department of Chemical Engineering, Indian Institute of Science, Bangalore, India
  - Centre for Biosystems Science and Engineering, Indian Institute of Science, Bangalore, India
|
25
|
Determinants of Zika virus host tropism uncovered by deep mutational scanning. Nat Microbiol 2019; 4:876-887. [PMID: 30886357 DOI: 10.1038/s41564-019-0399-4] [Citation(s) in RCA: 43] [Impact Index Per Article: 7.2] [Received: 05/17/2018] [Accepted: 02/01/2019] [Indexed: 01/01/2023]
Abstract
Arboviruses cycle between, and replicate in, both invertebrate and vertebrate hosts, which for Zika virus (ZIKV) involves Aedes mosquitoes and primates [1]. The viral determinants required for replication in such obligate hosts are under strong purifying selection during natural virus evolution, making it challenging to resolve which determinants are optimal for viral fitness in each host. Herein we describe a deep mutational scanning (DMS) strategy [2-5] whereby a viral cDNA library was constructed containing all codon substitutions in the C-terminal 204 amino acids of ZIKV envelope protein (E). The cDNA library was transfected into C6/36 (Aedes) and Vero (primate) cells, with subsequent deep sequencing and computational analyses of recovered viruses showing that substitutions K316Q and S461G, or Q350L and T397S, conferred substantial replicative advantages in mosquito and primate cells, respectively. A 316Q/461G virus was constructed and shown to be replication-defective in mammalian cells due to severely compromised virus particle formation and secretion. The 316Q/461G virus was also highly attenuated in human brain organoids, and illustrated utility as a vaccine in mice. This approach can thus imitate evolutionary selection in a matter of days and identify amino acids key to the regulation of virus replication in specific host environments.
|
26
|
Abstract
Mutagenesis is one of the key techniques in virus research. The recent development of deep mutational scanning allows the assessment of replication fitness effects of a large number of viral mutants in a high-throughput manner. Here, we describe a protocol for studying hepatitis C virus (HCV) using deep mutational scanning, which includes the methodologies for mutant library construction, passaging, sequencing, and data analysis.
Affiliation(s)
- Nicholas C Wu
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
- Hangfei Qi
  - Department of Molecular and Medical Pharmacology, University of California, Los Angeles, CA, USA
|
27
|
Murayama A, Fujiwara K, Yamada N, Shiina M, Aly HH, Masaki T, Muramatsu M, Wakita T, Kato T. Evaluation of antiviral effects of novel NS5A inhibitors in hepatitis C virus cell culture system with full-genome infectious clones. Antiviral Res 2018; 158:161-170. [PMID: 30118732 DOI: 10.1016/j.antiviral.2018.08.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2018] [Revised: 07/05/2018] [Accepted: 08/13/2018] [Indexed: 02/07/2023]
Abstract
Nonstructural protein 5A (NS5A) inhibitors of hepatitis C virus (HCV) are known to have potent anti-viral effects; however, these inhibitors have limited activity against strains with resistance-associated substitutions or non-genotype 1 strains. To overcome these shortcomings, novel NS5A inhibitors have been developed and approved for clinical application. The aim of this study was to evaluate the anti-viral effect of novel NS5A inhibitors (derivatives of odalasvir) on HCV genotype 2 strains in a cell culture system. Chimeric JFH-1 viruses replaced with NS5A of genotypes 1 and 2 were utilized to assess the genotype-specific potencies of NS5A inhibitors. We also examined full-genome infectious clones of JFH-1, J6cc, and J8cc to confirm the effects of NS5A inhibitors on genotype 2 strains. All chimeric viruses were capable of replication at similar levels in cell culture. We examined the anti-viral effects of derivatives of the novel NS5A inhibitor and compared them with those of the first-generation NS5A inhibitor, daclatasvir (DCV). These compounds inhibited replication of chimeric JFH-1 viruses with NS5A of genotypes 1 and 2 at low concentrations in comparison with DCV. The EC50 values of these compounds against J6cc and J8cc were more than 100-fold lower than those of DCV. By long-term culture in the presence of these compounds, we obtained highly resistant variants and identified the responsible substitutions. In conclusion, novel NS5A inhibitors displayed improved potency against HCV genotype 2 strains compared with DCV. However, the activity of these compounds was impaired by emerging resistance-associated substitutions.
Affiliation(s)
- Asako Murayama
  - Department of Virology II, National Institute of Infectious Diseases, Tokyo, Japan
- Kei Fujiwara
  - Department of Gastroenterology and Metabolism, Nagoya City University Graduate School of Medical Sciences, Nagoya, Japan
- Norie Yamada
  - Department of Virology II, National Institute of Infectious Diseases, Tokyo, Japan
- Masaaki Shiina
  - Department of Virology II, National Institute of Infectious Diseases, Tokyo, Japan
  - Department of Gastroenterology and Hepatology, Shin-Yurigaoka General Hospital, Kawasaki, Japan
- Hussein Hassan Aly
  - Department of Virology II, National Institute of Infectious Diseases, Tokyo, Japan
- Takahiro Masaki
  - Department of Virology II, National Institute of Infectious Diseases, Tokyo, Japan
- Masamichi Muramatsu
  - Department of Virology II, National Institute of Infectious Diseases, Tokyo, Japan
- Takaji Wakita
  - Department of Virology II, National Institute of Infectious Diseases, Tokyo, Japan
- Takanobu Kato
  - Department of Virology II, National Institute of Infectious Diseases, Tokyo, Japan
|
28
|
Riesselman AJ, Ingraham JB, Marks DS. Deep generative models of genetic variation capture the effects of mutations. Nat Methods 2018; 15:816-822. [PMID: 30250057 DOI: 10.1038/s41592-018-0138-4] [Citation(s) in RCA: 320] [Impact Index Per Article: 45.7] [Received: 05/01/2018] [Accepted: 07/29/2018] [Indexed: 01/05/2023]
Abstract
The functions of proteins and RNAs are defined by the collective interactions of many residues, and yet most statistical models of biological sequences consider sites nearly independently. Recent approaches have demonstrated benefits of including interactions to capture pairwise covariation, but leave higher-order dependencies out of reach. Here we show how it is possible to capture higher-order, context-dependent constraints in biological sequences via latent variable models with nonlinear dependencies. We found that DeepSequence (https://github.com/debbiemarkslab/DeepSequence), a probabilistic model for sequence families, predicted the effects of mutations across a variety of deep mutational scanning experiments substantially better than existing methods based on the same evolutionary data. The model, learned in an unsupervised manner solely on the basis of sequence information, is grounded with biologically motivated priors, reveals the latent organization of sequence families, and can be used to explore new parts of sequence space.
Affiliation(s)
- Adam J Riesselman
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
  - Program in Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- John B Ingraham
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
  - Program in Systems Biology, Harvard University, Cambridge, MA, USA
- Debora S Marks
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
|
29
|
Multiplexed assays of variant effects contribute to a growing genotype-phenotype atlas. Hum Genet 2018; 137:665-678. [PMID: 30073413 PMCID: PMC6153521 DOI: 10.1007/s00439-018-1916-x] [Citation(s) in RCA: 89] [Impact Index Per Article: 12.7] [Received: 06/21/2018] [Accepted: 07/21/2018] [Indexed: 12/12/2022]
Abstract
Given the constantly improving cost and speed of genome sequencing, it is reasonable to expect that personal genomes will soon be known for many millions of humans. This stands in stark contrast with our limited ability to interpret the sequence variants that we find. Although variants in coding regions are perhaps the easiest to interpret, the functional impact is unknown for the vast majority of missense variants. While many computational approaches can predict the impact of coding variants, they are given little weight in the current guidelines for interpreting clinical variants. Laboratory assays produce comparatively more trustworthy results, but until recently did not scale to the space of all possible mutations. The development of deep mutational scanning and other multiplexed assays of variant effect has now brought the feasibility of this endeavour within view. Here, we review progress in this field over the last decade, break down the different approaches into their components, and compare methodological differences.
|
30
|
Venugopal V, Padmanabhan P, Raja R, Dixit NM. Modelling how responsiveness to interferon improves interferon-free treatment of hepatitis C virus infection. PLoS Comput Biol 2018; 14:e1006335. [PMID: 30001324 PMCID: PMC6057683 DOI: 10.1371/journal.pcbi.1006335] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 10/29/2017] [Revised: 07/24/2018] [Accepted: 06/28/2018] [Indexed: 12/14/2022] Open
Abstract
Direct-acting antiviral agents (DAAs) for hepatitis C treatment tend to fare better in individuals who are also likely to respond well to interferon-alpha (IFN), a surprising correlation given that DAAs target specific viral proteins whereas IFN triggers a generic antiviral immune response. Here, we posit a causal relationship between IFN-responsiveness and DAA treatment outcome. IFN-responsiveness restricts viral replication, which would prevent the growth of viral variants resistant to DAAs and improve treatment outcome. To test this hypothesis, we developed a multiscale mathematical model integrating IFN-responsiveness at the cellular level, viral kinetics and evolution leading to drug resistance at the individual level, and treatment outcome at the population level. Model predictions quantitatively captured data from over 50 clinical trials demonstrating poorer response to DAAs in previous non-responders to IFN than treatment-naïve individuals, presenting strong evidence supporting the hypothesis. Model predictions additionally described several unexplained clinical observations, viz., the percentages of infected individuals who 1) spontaneously clear HCV, 2) get chronically infected but respond to IFN-based therapy, and 3) fail IFN-based therapy but respond to DAA-based therapy, resulting in a comprehensive understanding of HCV infection and treatment. An implication of the causal relationship is that failure of DAA-based treatments may be averted by adding IFN, a strategy of potential use in settings with limited access to DAAs. A second, wider implication is that individuals with greater IFN-responsiveness would require shorter DAA-based treatment durations, presenting a basis and a promising population for response-guided therapy.
Affiliation(s)
- Vishnu Venugopal
  - Department of Chemical Engineering, Indian Institute of Science, Bangalore, India
- Pranesh Padmanabhan
  - Department of Chemical Engineering, Indian Institute of Science, Bangalore, India
- Rubesh Raja
  - Department of Chemical Engineering, Indian Institute of Science, Bangalore, India
- Narendra M. Dixit
  - Department of Chemical Engineering, Indian Institute of Science, Bangalore, India
  - Centre for Biosystems Science and Engineering, Indian Institute of Science, Bangalore, India
|
31
|
Du Y, Xin L, Shi Y, Zhang TH, Wu NC, Dai L, Gong D, Brar G, Shu S, Luo J, Reiley W, Tseng YW, Bai H, Wu TT, Wang J, Shu Y, Sun R. Genome-wide identification of interferon-sensitive mutations enables influenza vaccine design. Science 2018; 359:290-296. [PMID: 29348231 DOI: 10.1126/science.aan8806] [Citation(s) in RCA: 58] [Impact Index Per Article: 8.3] [Received: 06/01/2017] [Accepted: 11/15/2017] [Indexed: 12/11/2022]
Abstract
In conventional attenuated viral vaccines, immunogenicity is often suboptimal. Here we present a systematic approach for vaccine development that eliminates interferon (IFN)-modulating functions genome-wide while maintaining virus replication fitness. We applied a quantitative high-throughput genomics system to influenza A virus that simultaneously measured the replication fitness and IFN sensitivity of mutations across the entire genome. By incorporating eight IFN-sensitive mutations, we generated a hyper-interferon-sensitive (HIS) virus as a vaccine candidate. HIS virus is highly attenuated in IFN-competent hosts but able to induce transient IFN responses, elicits robust humoral and cellular immune responses, and provides protection against homologous and heterologous viral challenges. Our approach, which attenuates the virus and promotes immune responses concurrently, is broadly applicable for vaccine development against other pathogens.
Affiliation(s)
- Yushen Du
  - Department of Molecular and Medical Pharmacology, University of California, Los Angeles, CA 90095, USA
  - Cancer Institute, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, School of Medicine, Zhejiang University, Hangzhou 310058, China
- Li Xin
  - National Institute for Viral Disease Control and Prevention, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, Chinese Center for Disease Control and Prevention, Key Laboratory for Medical Virology and Viral Diseases, Ministry of Health of the People's Republic of China, Beijing 102206, China
- Yuan Shi
  - Department of Molecular and Medical Pharmacology, University of California, Los Angeles, CA 90095, USA
- Tian-Hao Zhang
  - Department of Molecular and Medical Pharmacology, University of California, Los Angeles, CA 90095, USA
  - Molecular Biology Institute, University of California, Los Angeles, CA 90095, USA
- Nicholas C Wu
  - Molecular Biology Institute, University of California, Los Angeles, CA 90095, USA
- Lei Dai
  - Department of Molecular and Medical Pharmacology, University of California, Los Angeles, CA 90095, USA
- Danyang Gong
  - Department of Molecular and Medical Pharmacology, University of California, Los Angeles, CA 90095, USA
- Gurpreet Brar
  - Department of Molecular and Medical Pharmacology, University of California, Los Angeles, CA 90095, USA
- Sara Shu
  - Department of Molecular and Medical Pharmacology, University of California, Los Angeles, CA 90095, USA
- Jiadi Luo
  - Department of Molecular and Medical Pharmacology, University of California, Los Angeles, CA 90095, USA
  - Department of Pediatrics, University of Pittsburgh School of Medicine, Pittsburgh, PA 15224, USA
  - Department of Pathology, The Second Xiangya Hospital of Central South University, Changsha, Hunan 410005, China
- Yen-Wen Tseng
  - Department of Molecular and Medical Pharmacology, University of California, Los Angeles, CA 90095, USA
- Hongyan Bai
  - National Institute for Viral Disease Control and Prevention, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, Chinese Center for Disease Control and Prevention, Key Laboratory for Medical Virology and Viral Diseases, Ministry of Health of the People's Republic of China, Beijing 102206, China
- Ting-Ting Wu
  - Department of Molecular and Medical Pharmacology, University of California, Los Angeles, CA 90095, USA
- Jieru Wang
  - Department of Molecular and Medical Pharmacology, University of California, Los Angeles, CA 90095, USA
  - Department of Pediatrics, University of Pittsburgh School of Medicine, Pittsburgh, PA 15224, USA
- Yuelong Shu
  - National Institute for Viral Disease Control and Prevention, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, Chinese Center for Disease Control and Prevention, Key Laboratory for Medical Virology and Viral Diseases, Ministry of Health of the People's Republic of China, Beijing 102206, China
  - School of Public Health (Shenzhen), Sun Yat-sen University, Guangdong 510275, China
- Ren Sun
  - Department of Molecular and Medical Pharmacology, University of California, Los Angeles, CA 90095, USA
  - Cancer Institute, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, School of Medicine, Zhejiang University, Hangzhou 310058, China
  - Molecular Biology Institute, University of California, Los Angeles, CA 90095, USA
|
32
|
Gong D, Zhang TH, Zhao D, Du Y, Chapa TJ, Shi Y, Wang L, Contreras D, Zeng G, Shi PY, Wu TT, Arumugaswami V, Sun R. High-Throughput Fitness Profiling of Zika Virus E Protein Reveals Different Roles for Glycosylation during Infection of Mammalian and Mosquito Cells. iScience 2018; 1:97-111. [PMID: 30227960 PMCID: PMC6135943 DOI: 10.1016/j.isci.2018.02.005] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 05/03/2017] [Revised: 01/26/2018] [Accepted: 02/12/2018] [Indexed: 12/29/2022] Open
Abstract
Zika virus (ZIKV) infection causes Guillain-Barré syndrome and severe birth defects. ZIKV envelope (E) protein is the major viral protein involved in cell receptor binding and entry and is therefore considered one of the major determinants in ZIKV pathogenesis. Here we report a gene-wide mapping of functional residues of ZIKV E protein using a mutant library, with changes covering every nucleotide position. By comparing the replication fitness of every viral mutant between mosquito and human cells, we found that mutations affecting glycosylation display the most divergence. By characterizing individual mutants, we show that ablation of glycosylation selectively benefits ZIKV infection of mosquito cells by enhancing cell entry, whereas it either has little impact on ZIKV infection of certain human cells or leads to decreased infection through the entry factor DC-SIGN. In conclusion, we define the roles of individual residues of ZIKV envelope protein, which contribute to ZIKV replication fitness in human and mosquito cells.
Affiliation(s)
- Danyang Gong
  - Department of Molecular and Medical Pharmacology, University of California, Los Angeles, CA 90095, USA
- Tian-Hao Zhang
  - Department of Molecular and Medical Pharmacology, University of California, Los Angeles, CA 90095, USA
- Dawei Zhao
  - Department of Molecular and Medical Pharmacology, University of California, Los Angeles, CA 90095, USA
- Yushen Du
  - Department of Molecular and Medical Pharmacology, University of California, Los Angeles, CA 90095, USA
- Travis J Chapa
  - Department of Molecular and Medical Pharmacology, University of California, Los Angeles, CA 90095, USA
- Yuan Shi
  - Department of Molecular and Medical Pharmacology, University of California, Los Angeles, CA 90095, USA
- Laurie Wang
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
- Deisy Contreras
  - Board of Governors Regenerative Medicine Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA
- Gang Zeng
  - Department of Urology, University of California, Los Angeles, CA 90095, USA
- Pei-Yong Shi
  - Department of Biochemistry & Molecular Biology, University of Texas Medical Branch, Galveston, TX 77555, USA
- Ting-Ting Wu
  - Department of Molecular and Medical Pharmacology, University of California, Los Angeles, CA 90095, USA
- Ren Sun
  - Department of Molecular and Medical Pharmacology, University of California, Los Angeles, CA 90095, USA
|
33
|
Evolutionary mechanisms studied through protein fitness landscapes. Curr Opin Struct Biol 2018; 48:141-148. [DOI: 10.1016/j.sbi.2018.01.001] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Received: 11/16/2017] [Revised: 12/26/2017] [Accepted: 01/01/2018] [Indexed: 12/15/2022]
|
34
|
Effects of Mutations on Replicative Fitness and Major Histocompatibility Complex Class I Binding Affinity Are Among the Determinants Underlying Cytotoxic-T-Lymphocyte Escape of HIV-1 Gag Epitopes. mBio 2017; 8:mBio.01050-17. [PMID: 29184023 PMCID: PMC5705913 DOI: 10.1128/mbio.01050-17] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Indexed: 11/20/2022] Open
Abstract
Certain “protective” major histocompatibility complex class I (MHC-I) alleles, such as B*57 and B*27, are associated with long-term control of HIV-1 in vivo mediated by the CD8+ cytotoxic-T-lymphocyte (CTL) response. However, the mechanism of such superior protection is not fully understood. Here we combined high-throughput fitness profiling of mutations in HIV-1 Gag, in silico prediction of MHC-peptide binding affinity, and analysis of intraperson virus evolution to systematically compare differences with respect to CTL escape mutations between epitopes targeted by protective MHC-I alleles and those targeted by nonprotective MHC-I alleles. We observed that the effects of mutations on both viral replication and MHC-I binding affinity are among the determinants of CTL escape. Mutations in Gag epitopes presented by protective MHC-I alleles are associated with significantly higher fitness cost and lower reductions in binding affinity with respect to MHC-I. A linear regression model accounting for the effect of mutations on both viral replicative capacity and MHC-I binding can explain the protective efficacy of MHC-I alleles. Finally, we found a consistent pattern in the evolution of Gag epitopes in long-term nonprogressors versus progressors. Overall, our results suggest that certain protective MHC-I alleles allow superior control of HIV-1 by targeting epitopes where mutations typically incur high fitness costs and small reductions in MHC-I binding affinity.
Understanding the mechanism of viral control achieved in long-term nonprogressors with protective HLA alleles provides insights for developing a functional cure of HIV infection. Through the characterization of CTL escape mutations in infected persons, previous researchers hypothesized that protective alleles target epitopes where escape mutations significantly reduce viral replicative capacity. However, these studies were usually limited to a few mutations observed in vivo. Here we utilized our recently developed high-throughput fitness profiling method to quantitatively measure the fitness of mutations across the entirety of HIV-1 Gag. The data enabled us to integrate the results with in silico prediction of MHC-peptide binding affinity and analysis of intraperson virus evolution to systematically determine the differences in CTL escape mutations between epitopes targeted by protective HLA alleles and those targeted by nonprotective HLA alleles. We observed that the effects of Gag epitope mutations on HIV replicative fitness and MHC-I binding affinity are among the major determinants of CTL escape.
|
35
|
Genome-Wide Mutagenesis of Dengue Virus Reveals Plasticity of the NS1 Protein and Enables Generation of Infectious Tagged Reporter Viruses. J Virol 2017; 91:JVI.01455-17. [PMID: 28956770 DOI: 10.1128/jvi.01455-17] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Received: 08/25/2017] [Accepted: 09/21/2017] [Indexed: 12/21/2022] Open
Abstract
Dengue virus (DENV) is a major global pathogen that causes significant morbidity and mortality in tropical and subtropical areas worldwide. An improved understanding of the regions within the DENV genome and its encoded proteins that are required for the virus replication cycle will expedite the development of urgently required therapeutics and vaccines. We subjected an infectious DENV genome to unbiased insertional mutagenesis and used next-generation sequencing to identify sites that tolerate 15-nucleotide insertions during the virus replication cycle in hepatic cell culture. This revealed that the regions within capsid, NS1, and the 3' untranslated region were the most tolerant of insertions. In contrast, prM- and NS2A-encoding regions were largely intolerant of insertions. Notably, the multifunctional NS1 protein readily tolerated insertions in regions within the Wing, connector, and β-ladder domains with minimal effects on viral RNA replication and infectious virus production. Using this information, we generated infectious reporter viruses, including a variant encoding the APEX2 electron microscopy tag in NS1 that uniquely enabled high-resolution imaging of its localization to the surface and interior of viral replication vesicles. In addition, we generated a tagged virus bearing an mScarlet fluorescent protein insertion in NS1 that, despite an impact on fitness, enabled live cell imaging of NS1 localization and traffic in infected cells. Overall, this genome-wide profile of DENV genome flexibility may be further dissected and exploited in reporter virus generation and antiviral strategies.
IMPORTANCE Regions of genetic flexibility in viral genomes can be exploited in the generation of reporter virus tools and should arguably be avoided in antiviral drug and vaccine design. Here, we subjected the DENV genome to high-throughput insertional mutagenesis to identify regions of genetic flexibility and enable tagged reporter virus generation. In particular, the viral NS1 protein displayed remarkable tolerance of small insertions. This genetic flexibility enabled generation of several novel NS1-tagged reporter viruses, including an APEX2-tagged virus that we used in high-resolution imaging of NS1 localization in infected cells by electron microscopy. For the first time, this analysis revealed the localization of NS1 within viral replication factories known as "vesicle packets" (VPs), in addition to its acknowledged localization to the luminal surface of these VPs. Together, this genetic profile of DENV may be further refined and exploited in the identification of antiviral targets and the generation of reporter virus tools.
|
36
|
Systematic identification of anti-interferon function on hepatitis C virus genome reveals p7 as an immune evasion protein. Proc Natl Acad Sci U S A 2017; 114:2018-2023. [PMID: 28159892 DOI: 10.1073/pnas.1614623114] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Indexed: 01/07/2023]
Abstract
Hepatitis C virus (HCV) encodes mechanisms to evade the multilayered antiviral actions of the host immune system. Great progress has been made in elucidating the strategies HCV employs to down-regulate interferon (IFN) production, impede IFN signaling transduction, and impair IFN-stimulated gene (ISG) expression. However, there is a limited understanding of the mechanisms governing how viral proteins counteract the antiviral functions of downstream IFN effectors due to the lack of an efficient approach to identify such interactions systematically. To study the mechanisms by which HCV antagonizes the IFN responses, we have developed a high-throughput profiling platform that enables mapping of HCV sequences critical for anti-IFN function at high resolution. Genome-wide profiling performed with a 15-nt insertion mutant library of HCV showed that mutations in the p7 region conferred high levels of IFN sensitivity, which could be alleviated by the expression of WT p7 protein. This finding suggests that p7 protein of HCV has an immune evasion function. By screening a liver-specific ISG library, we identified that IFI6-16 significantly inhibits the replication of p7 mutant viruses without affecting WT virus replication. In contrast, knockout of IFI6-16 reversed the IFN hypersensitivity of p7 mutant virus. In addition, p7 was found to be coimmunoprecipitated with IFI6-16 and to counteract the function of IFI6-16 by depolarizing the mitochondria potential. Our data suggest that p7 is a critical immune evasion protein that suppresses the antiviral IFN function by counteracting the function of IFI6-16.
|
37
|
Hopf TA, Ingraham JB, Poelwijk FJ, Schärfe CP, Springer M, Sander C, Marks DS. Mutation effects predicted from sequence co-variation. Nat Biotechnol 2017; 35:128-135. [PMID: 28092658 PMCID: PMC5383098 DOI: 10.1038/nbt.3769] [Citation(s) in RCA: 436] [Impact Index Per Article: 54.5] [Received: 06/10/2016] [Accepted: 12/09/2016] [Indexed: 01/09/2023]
Abstract
Many high-throughput experimental technologies have been developed to assess the effects of large numbers of mutations (variation) on phenotypes. However, designing functional assays for these methods is challenging, and systematic testing of all combinations is impossible, so robust methods to predict the effects of genetic variation are needed. Most prediction methods exploit evolutionary sequence conservation but do not consider the interdependencies of residues or bases. We present EVmutation, an unsupervised statistical method for predicting the effects of mutations that explicitly captures residue dependencies between positions. We validate EVmutation by comparing its predictions with outcomes of high-throughput mutagenesis experiments and measurements of human disease mutations and show that it outperforms methods that do not account for epistasis. EVmutation can be used to assess the quantitative effects of mutations in genes of any organism. We provide pre-computed predictions for ∼7,000 human proteins at http://evmutation.org/.
Affiliation(s)
- Thomas A. Hopf
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
  - Department of Cell Biology, Harvard Medical School, Boston, MA, USA
  - Department of Informatics, Technische Universität München, Garching, Germany
- John B. Ingraham
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Charlotta P.I. Schärfe
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
  - Applied Bioinformatics, Department of Computer Science, University of Tübingen, Tübingen, Germany
- Michael Springer
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Chris Sander
  - Department of Cell Biology, Harvard Medical School, Boston, MA, USA
  - cBio Center, Dana-Farber Cancer Institute, Boston, MA, USA
- Debora S. Marks
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
|
38
|
Haddox HK, Dingens AS, Bloom JD. Experimental Estimation of the Effects of All Amino-Acid Mutations to HIV's Envelope Protein on Viral Replication in Cell Culture. PLoS Pathog 2016; 12:e1006114. [PMID: 27959955 PMCID: PMC5189966 DOI: 10.1371/journal.ppat.1006114] [Citation(s) in RCA: 70] [Impact Index Per Article: 7.8] [Received: 10/06/2016] [Revised: 12/27/2016] [Accepted: 12/07/2016] [Indexed: 11/18/2022]
Abstract
HIV is notorious for its capacity to evade immunity and anti-viral drugs through rapid sequence evolution. Knowledge of the functional effects of mutations to HIV is critical for understanding this evolution. HIV's most rapidly evolving protein is its envelope (Env). Here we use deep mutational scanning to experimentally estimate the effects of all amino-acid mutations to Env on viral replication in cell culture. Most mutations are under purifying selection in our experiments, although a few sites experience strong selection for mutations that enhance HIV's replication in cell culture. We compare our experimental measurements of each site's preference for each amino acid to the actual frequencies of these amino acids in naturally occurring HIV sequences. Our measured amino-acid preferences correlate with amino-acid frequencies in natural sequences for most sites. However, our measured preferences are less concordant with natural amino-acid frequencies at surface-exposed sites that are subject to pressures absent from our experiments such as antibody selection. Our data enable us to quantify the inherent mutational tolerance of each site in Env. We show that the epitopes of broadly neutralizing antibodies have a significantly reduced inherent capacity to tolerate mutations, rigorously validating a pervasive idea in the field. Overall, our results help disentangle the role of inherent functional constraints and external selection pressures in shaping Env's evolution.
Affiliation(s)
- Hugh K. Haddox
  - Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
  - Molecular and Cellular Biology PhD Program, University of Washington, Seattle, Washington, United States of America
- Adam S. Dingens
  - Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
  - Molecular and Cellular Biology PhD Program, University of Washington, Seattle, Washington, United States of America
- Jesse D. Bloom
  - Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
|
39
|
Du Y, Wu NC, Jiang L, Zhang T, Gong D, Shu S, Wu TT, Sun R. Annotating Protein Functional Residues by Coupling High-Throughput Fitness Profile and Homologous-Structure Analysis. mBio 2016; 7:e01801-16. [PMID: 27803181 PMCID: PMC5090041 DOI: 10.1128/mbio.01801-16] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 09/27/2016] [Accepted: 10/07/2016] [Indexed: 11/28/2022]
Abstract
Identification and annotation of functional residues are fundamental questions in protein sequence analysis. Sequence and structure conservation provides valuable information to tackle these questions. It is, however, limited by the incomplete sampling of sequence space in natural evolution. Moreover, proteins often have multiple functions, with overlapping sequences that present challenges to accurate annotation of the exact functions of individual residues by conservation-based methods. Using the influenza A virus PB1 protein as an example, we developed a method to systematically identify and annotate functional residues. We used saturation mutagenesis and high-throughput sequencing to measure the replication capacity of single nucleotide mutations across the entire PB1 protein. After predicting protein stability upon mutations, we identified functional PB1 residues that are essential for viral replication. To further annotate the functional residues important to the canonical or noncanonical functions of viral RNA-dependent RNA polymerase (vRdRp), we performed a homologous-structure analysis with 16 different vRdRp structures. We achieved high sensitivity in annotating the known canonical polymerase functional residues. Moreover, we identified a cluster of noncanonical functional residues located in the loop region of the PB1 β-ribbon. We further demonstrated that these residues were important for PB1 protein nuclear import through the interaction with Ran-binding protein 5. In summary, we developed a systematic and sensitive method to identify and annotate functional residues that are not restrained by sequence conservation. Importantly, this method is generally applicable to other proteins about which homologous-structure information is available.
IMPORTANCE To fully comprehend the diverse functions of a protein, it is essential to understand the functionality of individual residues. Current methods are highly dependent on evolutionary sequence conservation, which is usually limited by sampling size. Sequence conservation-based methods are further confounded by structural constraints and multifunctionality of proteins. Here we present a method that can systematically identify and annotate functional residues of a given protein. We used a high-throughput functional profiling platform to identify essential residues. Coupling it with homologous-structure comparison, we were able to annotate multiple functions of proteins. We demonstrated the method with the PB1 protein of influenza A virus and identified novel functional residues in addition to its canonical function as an RNA-dependent RNA polymerase. Not limited to virology, this method is generally applicable to other proteins that can be functionally selected and about which homologous-structure information is available.
Affiliation(s)
- Yushen Du
  - Department of Molecular and Medical Pharmacology, University of California Los Angeles, Los Angeles, California, USA
  - Cancer Institute, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, ZJU-UCLA Joint Center for Medical Education and Research, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, Zhejiang, China
- Nicholas C Wu
  - Department of Molecular and Medical Pharmacology, University of California Los Angeles, Los Angeles, California, USA
  - Molecular Biology Institute, University of California Los Angeles, Los Angeles, California, USA
- Lin Jiang
  - Department of Neurology, University of California Los Angeles, Los Angeles, California, USA
- Tianhao Zhang
  - Department of Molecular and Medical Pharmacology, University of California Los Angeles, Los Angeles, California, USA
  - Molecular Biology Institute, University of California Los Angeles, Los Angeles, California, USA
- Danyang Gong
  - Department of Molecular and Medical Pharmacology, University of California Los Angeles, Los Angeles, California, USA
- Sara Shu
  - Department of Molecular and Medical Pharmacology, University of California Los Angeles, Los Angeles, California, USA
- Ting-Ting Wu
  - Department of Molecular and Medical Pharmacology, University of California Los Angeles, Los Angeles, California, USA
- Ren Sun
  - Department of Molecular and Medical Pharmacology, University of California Los Angeles, Los Angeles, California, USA
  - Cancer Institute, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, ZJU-UCLA Joint Center for Medical Education and Research, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, Zhejiang, China
  - Molecular Biology Institute, University of California Los Angeles, Los Angeles, California, USA
|
40
|
Abstract
A virus’ mutational robustness is described in terms of the strength and distribution of the mutational fitness effects, or MFE. The distribution of MFE is central to many questions in evolutionary theory and is a key parameter in models of molecular evolution. Here we define the mutational fitness effects in influenza A virus by generating 128 viruses, each with a single nucleotide mutation. In contrast to mutational scanning approaches, this strategy allowed us to unambiguously assign fitness values to individual mutations. The presence of each desired mutation and the absence of additional mutations were verified by next generation sequencing of each stock. A mutation was considered lethal only after we failed to rescue virus in three independent transfections. We measured the fitness of each viable mutant relative to the wild type by quantitative RT-PCR following direct competition on A549 cells. We found that 31.6% of the mutations in the genome-wide dataset were lethal and that the lethal fraction did not differ appreciably between the HA- and NA-encoding segments and the rest of the genome. Of the viable mutants, the fitness mean and standard deviation were 0.80 and 0.22 in the genome-wide dataset and best modeled as a beta distribution. The fitness impact of mutation was marginally lower in the segments coding for HA and NA (0.88 ± 0.16) than in the other 6 segments (0.78 ± 0.24), and their respective beta distributions had slightly different shape parameters. The results for influenza A virus are remarkably similar to our own analysis of CirSeq-derived fitness values from poliovirus and previously published data from other small, single stranded DNA and RNA viruses. These data suggest that genome size, and not nucleic acid type or mode of replication, is the main determinant of viral mutational fitness effects.
Like other RNA viruses, influenza virus has a very high mutation rate. While high mutation rates may increase the rate at which influenza virus will adapt to a new host, acquire a new route of transmission, or escape from host immune surveillance, data from model systems suggest that most new viral mutations are either lethal or highly detrimental. Mutational robustness refers to the ability of a virus to tolerate, or buffer, these mutations. The mutational robustness of a virus will determine which mutations are maintained in a population and may have a greater impact on viral evolution than mutation rate. We defined the mutational robustness of influenza A virus by measuring the fitness of a large number of viruses, each with a single point mutation. We found that the overall robustness of influenza was similar to that of poliovirus and other viruses of similar size. Interestingly, mutations appeared to be more easily accommodated in hemagglutinin and neuraminidase than elsewhere in the genome. This work will inform models of influenza evolution at the global and molecular scale.
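To give a feel for what a beta distribution with the reported genome-wide moments looks like, the sketch below converts a mean and standard deviation into beta shape parameters by the method of moments. This is an illustration only: the abstract does not say how the distribution was fitted or what shape parameters were obtained, the helper name `beta_from_moments` is hypothetical, and the calculation assumes fitness values lie in the open interval (0, 1).

```python
def beta_from_moments(mean, sd):
    """Method-of-moments estimates of the Beta(a, b) shape parameters.

    Assumption for this sketch: values lie in (0, 1). This is not
    necessarily the fitting procedure used in the study.
    """
    var = sd ** 2
    # A beta distribution requires var < mean * (1 - mean).
    if not 0 < mean < 1 or var >= mean * (1 - mean):
        raise ValueError("moments are not attainable by a beta distribution")
    common = mean * (1 - mean) / var - 1  # equals a + b
    return mean * common, (1 - mean) * common


# Genome-wide moments quoted in the abstract: mean 0.80, SD 0.22.
a, b = beta_from_moments(0.80, 0.22)
print(round(a, 2), round(b, 2))  # 1.84 0.46
```

With these inputs the second shape parameter falls below 1, which corresponds to a density that rises toward a fitness of 1, consistent with most viable mutants being close to wild type.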
|
41
|
Wu NC, Dai L, Olson CA, Lloyd-Smith JO, Sun R. Adaptation in protein fitness landscapes is facilitated by indirect paths. eLife 2016; 5. [PMID: 27391790 PMCID: PMC4985287 DOI: 10.7554/elife.16965] [Citation(s) in RCA: 150] [Impact Index Per Article: 16.7] [Received: 04/18/2016] [Accepted: 07/07/2016] [Indexed: 12/11/2022]
Abstract
The structure of fitness landscapes is critical for understanding adaptive protein evolution. Previous empirical studies on fitness landscapes were confined to either the neighborhood around the wild type sequence, involving mostly single and double mutants, or a combinatorially complete subgraph involving only two amino acids at each site. In reality, the dimensionality of protein sequence space is higher (20^L) and there may be higher-order interactions among more than two sites. Here we experimentally characterized the fitness landscape of four sites in protein GB1, containing 20^4 = 160,000 variants. We found that while reciprocal sign epistasis blocked many direct paths of adaptation, such evolutionary traps could be circumvented by indirect paths through genotype space involving gain and subsequent loss of mutations. These indirect paths alleviate the constraint on adaptive protein evolution, suggesting that the heretofore neglected dimensions of sequence space may change our views on how proteins evolve. DOI:http://dx.doi.org/10.7554/eLife.16965.001
Proteins can evolve over time by changing their component parts, which are called amino acids. These changes usually happen one at a time and natural selection tends to preserve those changes that make the protein more efficient at its specific tasks, while discarding those that impair the protein’s activity. However the effect of each change depends on the protein as a whole, and so two changes that separately make the protein worse can make it much better if they occur together. This phenomenon is called epistasis and in some cases it can trap proteins in a sub-optimal form and prevent them from improving further. Proteins are made from twenty different kinds of amino acid, and there are millions of different combinations of amino acids that could, in theory, make a protein of a given length. Studying protein evolution involves making variants of the same protein, each with just a few changes, and comparing how efficient, or “fit”, they are. Previous studies only measured the fitness of a few variants and showed that epistasis could block protein evolution by requiring the protein to lose some fitness before it could improve further. However, new techniques have now made it easier to study protein evolution by testing many more protein variants. Wu, Dai et al. focused on four amino acids in part of a protein called GB1 and tested the efficiency of every possible combination of these four amino acids, a total of 160,000 (20^4) variants. Contrary to expectations, the results suggested that the protein could evolve quickly to maximise fitness despite there being epistasis between the four amino acids. Overcoming epistasis typically involved making a change to one amino acid that paved the way for further changes while avoiding the need to lose fitness. The original change could then be reversed once the epistasis was overcome. The complexity of this solution means it can only be seen by studying a large number of protein variants that represent many alternative sequences of protein changes. Wu, Dai et al. conclude that proteins are able to achieve a higher level of fitness through evolution by exploring a large number of changes. There are many possible changes for each protein and it is this variety that, despite epistasis, allows proteins to become naturally optimised for the tasks that they perform. While the full complexity of protein evolution cannot be explored at the moment, as technology advances it will become possible to study more protein variants. Such advances would therefore hopefully allow researchers to discover even more about the natural mechanisms of protein evolution. DOI:http://dx.doi.org/10.7554/eLife.16965.002
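The sequence-space arithmetic in this abstract can be checked by brute-force enumeration. The sketch below is illustrative only and is not the authors' code; the variable names are assumptions made for the example. It counts every combination of the 20 amino acids at four sites, and the number of direct paths between two genotypes that differ at all four sites, where a direct path changes one differing site per step and so corresponds to an ordering of the sites.

```python
from itertools import permutations, product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
NUM_SITES = 4  # four varied sites, as in the GB1 landscape described above

# Every combination of 20 amino acids at 4 sites: 20^4 variants.
num_variants = sum(1 for _ in product(AMINO_ACIDS, repeat=NUM_SITES))

# A direct path fixes one differing site per step, so each path is an
# ordering of the four sites: 4! orderings.
num_direct_paths = sum(1 for _ in permutations(range(NUM_SITES)))

print(num_variants)      # 160000
print(num_direct_paths)  # 24
```

Indirect paths, which gain and later lose a mutation, are not bounded by this count, which is why they enlarge the set of accessible routes.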
Affiliation(s)
- Nicholas C Wu
  - Department of Molecular and Medical Pharmacology, University of California, Los Angeles, Los Angeles, United States
  - Molecular Biology Institute, University of California, Los Angeles, Los Angeles, United States
- Lei Dai
  - Department of Molecular and Medical Pharmacology, University of California, Los Angeles, Los Angeles, United States
  - Department of Ecology and Evolutionary Biology, University of California, Los Angeles, Los Angeles, United States
- C Anders Olson
  - Department of Molecular and Medical Pharmacology, University of California, Los Angeles, Los Angeles, United States
- James O Lloyd-Smith
  - Department of Ecology and Evolutionary Biology, University of California, Los Angeles, Los Angeles, United States
- Ren Sun
  - Department of Molecular and Medical Pharmacology, University of California, Los Angeles, Los Angeles, United States
  - Molecular Biology Institute, University of California, Los Angeles, Los Angeles, United States
|
42
|
Abriata LA, Bovigny C, Dal Peraro M. Detection and sequence/structure mapping of biophysical constraints to protein variation in saturated mutational libraries and protein sequence alignments with a dedicated server. BMC Bioinformatics 2016; 17:242. [PMID: 27315797 PMCID: PMC4912743 DOI: 10.1186/s12859-016-1124-4] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 03/01/2016] [Accepted: 06/07/2016] [Indexed: 11/21/2022]
Abstract
Background Protein variability can now be studied by measuring high-resolution tolerance-to-substitution maps and fitness landscapes in saturated mutational libraries. But these rich and expensive datasets are typically interpreted coarsely, restricting detailed analyses to positions of extremely high or low variability or dubbed important beforehand based on existing knowledge about active sites, interaction surfaces, (de)stabilizing mutations, etc. Results Our new webserver PsychoProt (freely available without registration at http://psychoprot.epfl.ch or at http://lucianoabriata.altervista.org/psychoprot/index.html) helps to detect, quantify, and sequence/structure map the biophysical and biochemical traits that shape amino acid preferences throughout a protein as determined by deep-sequencing of saturated mutational libraries or from large alignments of naturally occurring variants. Discussion We exemplify how PsychoProt helps to (i) unveil protein structure-function relationships from experiments and from alignments that are consistent with structures according to coevolution analysis, (ii) recall global information about structural and functional features and identify hitherto unknown constraints to variation in alignments, and (iii) point at different sources of variation among related experimental datasets or between experimental and alignment-based data. Remarkably, metabolic costs of the amino acids pose strong constraints to variability at protein surfaces in nature but not in the laboratory. This and other differences call for caution when extrapolating results from in vitro experiments to natural scenarios in, for example, studies of protein evolution. Conclusion We show through examples how PsychoProt can be a useful tool for the broad communities of structural biology and molecular evolution, particularly for studies about protein modeling, evolution and design. 
Electronic supplementary material The online version of this article (doi:10.1186/s12859-016-1124-4) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Luciano A Abriata
  - Laboratory for Biomolecular Modeling, Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne, and Swiss Institute of Bioinformatics, AAB014 Station 19, Lausanne, 1015, Switzerland
- Christophe Bovigny
  - Laboratory for Biomolecular Modeling, Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne, and Swiss Institute of Bioinformatics, AAB014 Station 19, Lausanne, 1015, Switzerland
  - Present address: Molecular Modeling Group, Swiss Institute of Bioinformatics, UNIL, Bâtiment Génopode, Lausanne, 1015, Switzerland
- Matteo Dal Peraro
  - Laboratory for Biomolecular Modeling, Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne, and Swiss Institute of Bioinformatics, AAB014 Station 19, Lausanne, 1015, Switzerland
|
43
|
A benchmark study on error-correction by read-pairing and tag-clustering in amplicon-based deep sequencing. BMC Genomics 2016; 17:108. [PMID: 26868371 PMCID: PMC4751728 DOI: 10.1186/s12864-016-2388-9] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Received: 09/17/2015] [Accepted: 01/08/2016] [Indexed: 11/10/2022]
Abstract
Background The high error rate of next generation sequencing (NGS) restricts some of its applications, such as monitoring virus mutations and detecting rare mutations in tumors. There are two commonly employed sequencing library preparation strategies to improve sequencing accuracy by correcting sequencing errors: read-pairing method and tag-clustering method (i.e. primer ID or UID). Here, we constructed a homogeneous library from a single clone, and compared the variant calling accuracy of these error-correction methods. Result We comprehensively described the strengths and pitfalls of these methods. We found that both read-pairing and tag-clustering methods significantly decreased sequencing error rate. While the read-pairing method was more effective than the tag-clustering method at correcting insertion and deletion errors, it was not as effective as the tag-clustering method at correcting substitution errors. In addition, we observed that when the read quality was poor, the tag-clustering method led to huge coverage loss. We also tested the effect of applying quality score filtering to the error-correction methods and demonstrated that quality score filtering was able to impose a minor, yet statistically significant improvement to the error-correction methods tested in this study. Conclusion Our study provides a benchmark for researchers to select suitable error-correction methods based on the goal of the experiment by balancing the trade-off between sequencing cost (i.e. sequencing coverage requirement) and detection sensitivity. Electronic supplementary material The online version of this article (doi:10.1186/s12864-016-2388-9) contains supplementary material, which is available to authorized users.
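The tag-clustering idea described in this abstract can be sketched in a few lines: reads that carry the same molecular tag are assumed to derive from one template, so a per-position majority vote within each tag cluster removes most substitution errors. The code below is a generic illustration under stated assumptions, not the pipeline benchmarked in the study; the function name `consensus_by_tag`, the `min_cluster_size` threshold, and the toy reads are all hypothetical, and reads are assumed to be equal-length with no insertions or deletions.

```python
from collections import Counter, defaultdict


def consensus_by_tag(tagged_reads, min_cluster_size=3):
    """Collapse reads sharing a tag into one majority-vote consensus.

    tagged_reads: iterable of (tag, sequence) pairs.
    Assumption for this sketch: reads in a cluster have equal length.
    """
    clusters = defaultdict(list)
    for tag, seq in tagged_reads:
        clusters[tag].append(seq)

    consensus = {}
    for tag, seqs in clusters.items():
        # Small clusters cannot outvote an error, so they are dropped;
        # this is one way tag clustering loses coverage.
        if len(seqs) < min_cluster_size:
            continue
        # Majority base at each aligned position.
        consensus[tag] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


reads = [
    ("AAGT", "ACGTAC"),
    ("AAGT", "ACGTAC"),
    ("AAGT", "ACCTAC"),  # substitution error at the third base
    ("CCTA", "ACGTAC"),  # singleton cluster, discarded
]
print(consensus_by_tag(reads))  # {'AAGT': 'ACGTAC'}
```

Discarding clusters below the size threshold is the design choice that trades sequencing coverage for accuracy, the same trade-off the abstract highlights.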
|
44
|
Wu NC, Du Y, Le S, Young AP, Zhang TH, Wang Y, Zhou J, Yoshizawa JM, Dong L, Li X, Wu TT, Sun R. Coupling high-throughput genetics with phylogenetic information reveals an epistatic interaction on the influenza A virus M segment. BMC Genomics 2016; 17:46. [PMID: 26754751 PMCID: PMC4710013 DOI: 10.1186/s12864-015-2358-7] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Received: 07/17/2015] [Accepted: 12/28/2015] [Indexed: 12/15/2022]
Abstract
Background Epistasis is one of the central themes in viral evolution due to its importance in drug resistance, immune escape, and interspecies transmission. However, there is a lack of experimental approach to systematically probe for epistatic residues. Results By utilizing the information from natural occurring sequences and high-throughput genetics, this study established a novel strategy to identify epistatic residues. The rationale is that a substitution that is deleterious in one strain may be prevalent in nature due to the presence of a naturally occurring compensatory substitution. Here, high-throughput genetics was applied to influenza A virus M segment to systematically identify deleterious substitutions. Comparison with natural sequence variation showed that a deleterious substitution M1 Q214H was prevalent in circulating strains. A coevolution analysis was then performed and indicated that M1 residues 121, 207, 209, and 214 naturally coevolved as a group. Subsequently, we experimentally validated that M1 A209T was a compensatory substitution for M1 Q214H. Conclusions This work provided a proof-of-concept to identify epistatic residues by coupling high-throughput genetics with phylogenetic information. In particular, we were able to identify an epistatic interaction between M1 substitutions A209T and Q214H. This analytic strategy can potentially be adapted to study any protein of interest, provided that the information on natural sequence variants is available. Electronic supplementary material The online version of this article (doi:10.1186/s12864-015-2358-7) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Nicholas C Wu
  - Department of Molecular and Medical Pharmacology, David Geffen School of Medicine, University of California, Los Angeles, 90095, CA, USA.
  - Molecular Biology Institute, University of California, Los Angeles, 90095, CA, USA.
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, 92037, CA, USA.
- Yushen Du
  - Department of Molecular and Medical Pharmacology, David Geffen School of Medicine, University of California, Los Angeles, 90095, CA, USA.
- Shuai Le
  - Department of Molecular and Medical Pharmacology, David Geffen School of Medicine, University of California, Los Angeles, 90095, CA, USA.
  - Department of Microbiology, Third Military Medical University, Chongqing, 400038, China.
- Arthur P Young
  - Department of Molecular and Medical Pharmacology, David Geffen School of Medicine, University of California, Los Angeles, 90095, CA, USA.
- Tian-Hao Zhang
  - Department of Molecular and Medical Pharmacology, David Geffen School of Medicine, University of California, Los Angeles, 90095, CA, USA.
- Yuanyuan Wang
  - Department of Molecular and Medical Pharmacology, David Geffen School of Medicine, University of California, Los Angeles, 90095, CA, USA.
- Jian Zhou
  - Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, 90095, CA, USA.
- Janice M Yoshizawa
  - Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, 90095, CA, USA.
- Ling Dong
  - Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, 90095, CA, USA.
- Xinmin Li
  - Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, 90095, CA, USA.
- Ting-Ting Wu
  - Department of Molecular and Medical Pharmacology, David Geffen School of Medicine, University of California, Los Angeles, 90095, CA, USA.
- Ren Sun
  - Department of Molecular and Medical Pharmacology, David Geffen School of Medicine, University of California, Los Angeles, 90095, CA, USA.
|
45
|
Synergistic Activity of Combined NS5A Inhibitors. Antimicrob Agents Chemother 2015; 60:1573-83. [PMID: 26711745 DOI: 10.1128/aac.02639-15] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 11/04/2015] [Accepted: 12/13/2015] [Indexed: 12/29/2022]
Abstract
Daclatasvir (DCV) is a first-in-class hepatitis C virus (HCV) nonstructural 5A replication complex inhibitor (NS5A RCI) that is clinically effective in interferon-free combinations with direct-acting antivirals (DAAs) targeting alternate HCV proteins. Recently, we reported NS5A RCI combinations that enhance HCV inhibitory potential in vitro, defining a new class of HCV inhibitors termed NS5A synergists (J. Sun, D. R. O'Boyle II, R. A. Fridell, D. R. Langley, C. Wang, S. Roberts, P. Nower, B. M. Johnson, F. Moulin, M. J. Nophsker, Y. Wang, M. Liu, K. Rigat, Y. Tu, P. Hewawasam, J. Kadow, N. A. Meanwell, M. Cockett, J. A. Lemm, M. Kramer, M. Belema, and M. Gao, Nature 527:245-248, 2015, doi:10.1038/nature15711). To extend the characterization of NS5A synergists, we tested new combinations of DCV and NS5A synergists against genotype (gt) 1 to 6 replicons and gt 1a, 2a, and 3a viruses. The kinetics of inhibition in HCV-infected cells treated with DCV, an NS5A synergist (NS5A-Syn), or a combination of DCV and NS5A-Syn were distinctive. Similar to activity observed clinically, DCV caused a multilog drop in HCV, followed by rebound due to the emergence of resistance. DCV-NS5A-Syn combinations were highly efficient at clearing cells of viruses, in line with the trend seen in replicon studies. The retreatment of resistant viruses that emerged using DCV monotherapy with DCV-NS5A-Syn resulted in a multilog drop and rebound in HCV similar to the initial decline and rebound observed with DCV alone on wild-type (WT) virus. A triple combination of DCV, NS5A-Syn, and a DAA targeting the NS3 or NS5B protein cleared the cells of viruses that are highly resistant to DCV. Our data support the observation that the cooperative interaction of DCV and NS5A-Syn potentiates both the genotype coverage and resistance barrier of DCV, offering an additional DAA option for combination therapy and tools for explorations of NS5A function.
|
46
|
A Balance between Inhibitor Binding and Substrate Processing Confers Influenza Drug Resistance. J Mol Biol 2015; 428:538-553. [PMID: 26656922 DOI: 10.1016/j.jmb.2015.11.027] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Received: 09/28/2015] [Revised: 11/23/2015] [Accepted: 11/24/2015] [Indexed: 11/22/2022]
Abstract
The therapeutic benefits of the neuraminidase (NA) inhibitor oseltamivir are dampened by the emergence of drug resistance mutations in influenza A virus (IAV). To investigate the mechanistic features that underlie resistance, we developed an approach to quantify the effects of all possible single-nucleotide substitutions introduced into important regions of NA. We determined the experimental fitness effects of 450 nucleotide mutations encoding positions both surrounding the active site and at more distant sites in an N1 strain of IAV in the presence and absence of oseltamivir. NA mutations previously known to confer oseltamivir resistance in N1 strains, including H275Y and N295S, were adaptive in the presence of drug, indicating that our experimental system captured salient features of real-world selection pressures acting on NA. We identified mutations, including several at position 223, that reduce the apparent affinity for oseltamivir in vitro. Position 223 of NA is located adjacent to a hydrophobic portion of oseltamivir that is chemically distinct from the substrate, making it a hotspot for substitutions that preferentially impact drug binding relative to substrate processing. Furthermore, two NA mutations, K221N and Y276F, each reduce susceptibility to oseltamivir by increasing NA activity without altering drug binding. These results indicate that competitive expansion of IAV in the face of drug pressure is mediated by a balance between inhibitor binding and substrate processing.
|
47
|
High-resolution genetic profile of viral genomes: why it matters. Curr Opin Virol 2015; 14:62-70. [DOI: 10.1016/j.coviro.2015.08.005] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 05/19/2015] [Revised: 08/07/2015] [Accepted: 08/07/2015] [Indexed: 12/12/2022]
|
48
|
Rational Protein Engineering Guided by Deep Mutational Scanning. Int J Mol Sci 2015; 16:23094-110. [PMID: 26404267 PMCID: PMC4613353 DOI: 10.3390/ijms160923094] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Received: 08/14/2015] [Revised: 09/04/2015] [Accepted: 09/13/2015] [Indexed: 11/16/2022] Open
Abstract
Sequence-function relationship in a protein is commonly determined by the three-dimensional protein structure followed by various biochemical experiments. However, with the explosive increase in the number of genome sequences, facilitated by recent advances in sequencing technology, the gap between protein sequences available and three-dimensional structures is rapidly widening. A recently developed method termed deep mutational scanning explores the functional phenotype of thousands of mutants via massive sequencing. Coupled with a highly efficient screening system, this approach assesses the phenotypic changes made by the substitution of each amino acid sequence that constitutes a protein. Such an informational resource provides the functional role of each amino acid sequence, thereby providing sufficient rationale for selecting target residues for protein engineering. Here, we discuss the current applications of deep mutational scanning and consider experimental design.
|
49
|
Hughes D, Andersson DI. Evolutionary consequences of drug resistance: shared principles across diverse targets and organisms. Nat Rev Genet 2015; 16:459-71. [DOI: 10.1038/nrg3922] [Citation(s) in RCA: 165] [Impact Index Per Article: 16.5] [Indexed: 12/14/2022]
|
50
|
Wu NC, Olson CA, Du Y, Le S, Tran K, Remenyi R, Gong D, Al-Mawsawi LQ, Qi H, Wu TT, Sun R. Functional Constraint Profiling of a Viral Protein Reveals Discordance of Evolutionary Conservation and Functionality. PLoS Genet 2015; 11:e1005310. [PMID: 26132554 PMCID: PMC4489113 DOI: 10.1371/journal.pgen.1005310] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Received: 02/09/2015] [Accepted: 05/28/2015] [Indexed: 12/31/2022] Open
Abstract
Viruses often encode proteins with multiple functions due to their compact genomes. Existing approaches to identify functional residues largely rely on sequence conservation analysis. Inferring functional residues from sequence conservation can produce false positives, in which the conserved residues are functionally silent, or false negatives, where functional residues are not identified since they are species-specific and therefore non-conserved. Furthermore, the tedious process of constructing and analyzing individual mutations limits the number of residues that can be examined in a single study. Here, we developed a systematic approach to identify the functional residues of a viral protein by coupling experimental fitness profiling with protein stability prediction using the influenza virus polymerase PA subunit as the target protein. We identified a significant number of functional residues that were influenza type-specific and were evolutionarily non-conserved among different influenza types. Our results indicate that type-specific functional residues are prevalent and may not otherwise be identified by sequence conservation analysis alone. More importantly, this technique can be adapted to any viral (and potentially non-viral) protein where structural information is available.
The analysis of sequence conservation is a common approach to identify functional residues within a protein. However, not all functional residues are conserved as natural evolution and species diversification permit continuous innovation of protein functionality through the retention of advantageous mutations. Non-conserved functional residues, which are often species-specific, may not be identified by conventional analysis of sequence conservation despite being biologically important. Here we described a novel approach to identify functional residues within a protein by coupling a high-throughput experimental fitness profiling approach with computational protein modeling. Our methodology is independent of sequence conservation and is applicable to any protein where structural information is available. In this study, we systematically mapped the functional residues on the influenza A PA protein and revealed that non-conserved functional residues are prevalent. Our results not only have significant implication on how functionality evolves during natural evolution, but also highlight the caveats when applying conservation-based approaches to identify functional residues within a protein.
Affiliation(s)
- Nicholas C. Wu
- Department of Molecular and Medical Pharmacology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, California, United States of America,
- Molecular Biology Institute, University of California, Los Angeles, Los Angeles, California, United States of America,
| | - C. Anders Olson
- Department of Molecular and Medical Pharmacology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, California, United States of America,
| | - Yushen Du
- Department of Molecular and Medical Pharmacology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, California, United States of America,
| | - Shuai Le
- Department of Microbiology, Third Military Medical University, Chongqing, 400038, China
| | - Kevin Tran
- Department of Molecular and Medical Pharmacology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, California, United States of America,
| | - Roland Remenyi
- Department of Molecular and Medical Pharmacology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, California, United States of America,
| | - Danyang Gong
- Department of Molecular and Medical Pharmacology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, California, United States of America,
| | - Laith Q. Al-Mawsawi
- Department of Molecular and Medical Pharmacology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, California, United States of America,
| | - Hangfei Qi
- Department of Molecular and Medical Pharmacology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, California, United States of America,
| | - Ting-Ting Wu
- Department of Molecular and Medical Pharmacology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, California, United States of America,
| | - Ren Sun
- Department of Molecular and Medical Pharmacology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, California, United States of America,
- Molecular Biology Institute, University of California, Los Angeles, Los Angeles, California, United States of America,
- AIDS Institute, University of California, Los Angeles, Los Angeles, California, United States of America
| |
|