1
Park Y, Metzger BPH, Thornton JW. The simplicity of protein sequence-function relationships. Nat Commun 2024; 15:7953. [PMID: 39261454 PMCID: PMC11390738 DOI: 10.1038/s41467-024-51895-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2024] [Accepted: 08/20/2024] [Indexed: 09/13/2024] Open
Abstract
How complex are the rules by which a protein's sequence determines its function? High-order epistatic interactions among residues are thought to be pervasive, suggesting an idiosyncratic and unpredictable sequence-function relationship. But many prior studies may have overestimated epistasis, because they analyzed sequence-function relationships relative to a single reference sequence (which causes measurement noise and local idiosyncrasies to snowball into high-order epistasis) or they did not fully account for global nonlinearities. Here we present a reference-free method that jointly infers specific epistatic interactions and global nonlinearity using a bird's-eye view of sequence space. This technique yields the simplest explanation of sequence-function relationships and is more robust than existing methods to measurement noise, missing data, and model misspecification. We reanalyze 20 experimental datasets and find that context-independent amino acid effects and pairwise interactions, along with a simple nonlinearity to account for limited dynamic range, explain a median of 96% of phenotypic variance and over 92% in every case. Only a tiny fraction of genotypes are strongly affected by higher-order epistasis. Sequence-function relationships are also sparse: a minuscule fraction of amino acids and interactions account for 90% of phenotypic variance. Sequence-function causality across these datasets is therefore simple, opening the way for tractable approaches to characterize proteins' genetic architecture.
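As a toy illustration of the contrast drawn in this abstract, the following Python sketch compares a reference-based and a reference-free decomposition on a two-site, two-state landscape. The definitions used here (effects as differences from one reference genotype versus differences from averages over all genotypes), the function names, and the phenotype values are assumptions made for illustration; they are not taken from this listing and are not the authors' implementation.

```python
def avg(values):
    """Arithmetic mean of a non-empty list of floats."""
    return sum(values) / len(values)


def reference_based(y):
    """Effects defined relative to the reference genotype (0, 0)."""
    wt = y[(0, 0)]
    site1 = y[(1, 0)] - wt
    site2 = y[(0, 1)] - wt
    pair = y[(1, 1)] - wt - site1 - site2
    return {"intercept": wt, "site1": site1, "site2": site2, "pair": pair}


def reference_free(y):
    """Effects defined relative to averages over all genotypes."""
    mu = avg(list(y.values()))
    # Effect of state 1 at each site: mean over genotypes carrying it, minus mu.
    site1 = avg([v for g, v in y.items() if g[0] == 1]) - mu
    site2 = avg([v for g, v in y.items() if g[1] == 1]) - mu
    pair = y[(1, 1)] - mu - site1 - site2
    return {"intercept": mu, "site1": site1, "site2": site2, "pair": pair}


# Hypothetical, purely additive landscape: no true pairwise interaction.
clean = {(0, 0): 0.0, (1, 0): 1.0, (0, 1): 2.0, (1, 1): 3.0}

# Same landscape with a measurement error of 0.4 on the reference genotype only.
noisy = dict(clean)
noisy[(0, 0)] = 0.4

for name, landscape in (("clean", clean), ("noisy", noisy)):
    rb = reference_based(landscape)["pair"]
    rf = reference_free(landscape)["pair"]
    print(f"{name}: reference-based pair = {rb:.2f}, reference-free pair = {rf:.2f}")
```

In this toy case both decompositions report no pairwise term on the clean landscape. With the 0.4 error on the reference genotype, the reference-based pairwise term shifts by about 0.4, while the reference-free pairwise term shifts by about 0.1, because the error is averaged over all four genotypes.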
Affiliation(s)
- Yeonwoo Park
- Committee on Genetics, Genomics, and Systems Biology, University of Chicago, Chicago, IL, USA
- Center for RNA Research, Institute for Basic Science, Seoul, Republic of Korea
- Brian P H Metzger
- Department of Ecology and Evolution, University of Chicago, Chicago, IL, USA
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
- Joseph W Thornton
- Department of Ecology and Evolution, University of Chicago, Chicago, IL, USA.
- Department of Human Genetics, University of Chicago, Chicago, IL, USA.

2
Abdolmaleki S, Ganjalikhani hakemi M, Ganjalikhany MR. An in silico investigation on the binding site preference of PD-1 and PD-L1 for designing antibodies for targeted cancer therapy. PLoS One 2024; 19:e0304270. [PMID: 39052609 PMCID: PMC11271968 DOI: 10.1371/journal.pone.0304270] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2024] [Accepted: 05/08/2024] [Indexed: 07/27/2024] Open
Abstract
Cancer control and treatment remain a significant challenge, and immune checkpoints have recently been considered as a novel treatment strategy for developing anti-cancer drugs. Many cancer types use the immune checkpoint PD-1 and its ligand PD-L1 (the PD-1/PD-L1 pathway) to evade detection and destruction by the immune system, which is associated with altered effector function of PD-1 and PD-L1 overexpression on cancer cells to deactivate T cells. In recent years, mAbs have been employed to block immune checkpoints, and the resulting normalization of the anti-tumor response has enabled scientists to develop novel biopharmaceuticals. In vivo affinity maturation of antibodies in targeted therapy has sometimes failed, and current experimental methods cannot capture the accurate structural details of protein-protein interactions. Therefore, determining favorable binding sites on the protein surface for designing modulators of these interactions is a major challenge. In this study, we used in silico methods to identify favorable binding sites on PD-1 and PD-L1 and to optimize mAb variants on a large scale. First, all the binding areas on PD-1 and PD-L1 were identified. Then, using the RosettaDesign protocol, thousands of antibodies were generated for 11 different regions on PD-1 and PD-L1, and the designs with higher stability, affinity, and shape complementarity were selected. Next, molecular dynamics simulations and MM-PBSA analysis were employed to understand the dynamic and structural features of the complexes and to measure the binding affinity of the final designs. Our results suggest that binding sites 1, 3 and 6 on PD-1 and binding sites 9 and 11 on PD-L1 can be regarded as the most appropriate sites for the inhibition of the PD-1-PD-L1 interaction by the designed antibodies. This study provides comprehensive information regarding the potential binding epitopes on PD-1 which could be considered as hotspots for designing potential biopharmaceuticals. We also showed that mutations in the CDR regions rearrange the interaction pattern between the designed antibodies and targets (PD-1 and PD-L1) with improved affinity to effectively inhibit protein-protein interaction and block the immune checkpoint.
Affiliation(s)
- Sarah Abdolmaleki
- Department of Cell and Molecular Biology & Microbiology, University of Isfahan, Isfahan, Iran
- Mazdak Ganjalikhani hakemi
- Regenerative and Restorative Medicine Research Center (REMER), Research Institute for Health Sciences and Technologies (SABITA), Istanbul Medipol University, Istanbul, Turkey
- Department of Immunology, Faculty of Medicine, Isfahan University of Medical Sciences, Isfahan, Iran

3
Vu MH, Robert PA, Akbar R, Swiatczak B, Sandve GK, Haug DTT, Greiff V. Linguistics-based formalization of the antibody language as a basis for antibody language models. Nat Comput Sci 2024; 4:412-422. [PMID: 38877120 DOI: 10.1038/s43588-024-00642-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2022] [Accepted: 05/13/2024] [Indexed: 06/16/2024]
Abstract
Apparent parallels between natural language and antibody sequences have led to a surge in deep language models applied to antibody sequences for predicting cognate antigen recognition. However, a linguistic formal definition of antibody language does not exist, and insight into how antibody language models capture antibody-specific binding features remains largely uninterpretable. Here we describe how a linguistic formalization of the antibody language, by characterizing its tokens and grammar, could address current challenges in antibody language model rule mining.
Affiliation(s)
- Mai Ha Vu
- Department of Linguistics and Scandinavian Studies, University of Oslo, Oslo, Norway.
- Philippe A Robert
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
- Rahmad Akbar
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
- Bartlomiej Swiatczak
- Department of History of Science and Scientific Archeology, University of Science and Technology of China, Hefei, China
- Victor Greiff
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway.

4
Ferretti F, Kardar M. Universal characterization of epitope immunodominance from a multiscale model of clonal competition in germinal centers. Phys Rev E 2024; 109:064409. [PMID: 39020898 DOI: 10.1103/physreve.109.064409] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2023] [Accepted: 04/02/2024] [Indexed: 07/20/2024]
Abstract
We introduce a multiscale model for affinity maturation, which aims to capture the intraclonal, interclonal, and epitope-specific organization of the B-cell population in a germinal center. We describe the evolution of the B-cell population via a quasispecies dynamics, with species corresponding to unique B-cell receptors (BCRs), where the desired multiscale structure is reflected on the mutational connectivity of the accessible BCR space, and on the statistical properties of its fitness landscape. Within this mathematical framework, we study the competition among classes of BCRs targeting different antigen epitopes, and we construct an effective immunogenic space where epitope immunodominance relations can be universally characterized. We finally study how varying the relative composition of a mixture of antigens with variable and conserved domains allows for a parametric exploration of this space, and we identify general principles for the rational design of two-antigen cocktails.
Affiliation(s)
- Federica Ferretti
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Mehran Kardar
- Department of Physics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA

5
Metzger BPH, Park Y, Starr TN, Thornton JW. Epistasis facilitates functional evolution in an ancient transcription factor. eLife 2024; 12:RP88737. [PMID: 38767330 PMCID: PMC11105156 DOI: 10.7554/elife.88737] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/22/2024] Open
Abstract
A protein's genetic architecture - the set of causal rules by which its sequence produces its functions - also determines its possible evolutionary trajectories. Prior research has proposed that the genetic architecture of proteins is very complex, with pervasive epistatic interactions that constrain evolution and make function difficult to predict from sequence. Most of this work has analyzed only the direct paths between two proteins of interest - excluding the vast majority of possible genotypes and evolutionary trajectories - and has considered only a single protein function, leaving unaddressed the genetic architecture of functional specificity and its impact on the evolution of new functions. Here, we develop a new method based on ordinal logistic regression to directly characterize the global genetic determinants of multiple protein functions from 20-state combinatorial deep mutational scanning (DMS) experiments. We use it to dissect the genetic architecture and evolution of a transcription factor's specificity for DNA, using data from a combinatorial DMS of an ancient steroid hormone receptor's capacity to activate transcription from two biologically relevant DNA elements. We show that the genetic architecture of DNA recognition consists of a dense set of main and pairwise effects that involve virtually every possible amino acid state in the protein-DNA interface, but higher-order epistasis plays only a tiny role. Pairwise interactions enlarge the set of functional sequences and are the primary determinants of specificity for different DNA elements. They also massively expand the number of opportunities for single-residue mutations to switch specificity from one DNA target to another. By bringing variants with different functions close together in sequence space, pairwise epistasis therefore facilitates rather than constrains the evolution of new functions.
Affiliation(s)
- Brian PH Metzger
- Department of Ecology and Evolution, University of Chicago, Chicago, United States
- Yeonwoo Park
- Program in Genetics, Genomics, and Systems Biology, University of Chicago, Chicago, United States
- Tyler N Starr
- Department of Biochemistry and Molecular Biophysics, University of Chicago, Chicago, United States
- Joseph W Thornton
- Department of Ecology and Evolution, University of Chicago, Chicago, United States
- Department of Human Genetics, University of Chicago, Chicago, United States

6
Park Y, Metzger BP, Thornton JW. The simplicity of protein sequence-function relationships. bioRxiv 2024:2023.09.02.556057. [PMID: 37732229 PMCID: PMC10508729 DOI: 10.1101/2023.09.02.556057] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Indexed: 09/22/2023]
Abstract
How complicated is the genetic architecture of proteins - the set of causal effects by which sequence determines function? High-order epistatic interactions among residues are thought to be pervasive, making a protein's function difficult to predict or understand from its sequence. Most studies, however, used methods that overestimate epistasis, because they analyze genetic architecture relative to a designated reference sequence - causing measurement noise and small local idiosyncrasies to propagate into pervasive high-order interactions - or have not effectively accounted for global nonlinearity in the sequence-function relationship. Here we present a new reference-free method that jointly estimates global nonlinearity and specific epistatic interactions across a protein's entire genotype-phenotype map. This method yields a maximally efficient explanation of a protein's genetic architecture and is more robust than existing methods to measurement noise, partial sampling, and model misspecification. We reanalyze 20 combinatorial mutagenesis experiments from a diverse set of proteins and find that additive and pairwise effects, along with a simple nonlinearity to account for limited dynamic range, explain a median of 96% of total variance in measured phenotypes (and >92% in every case). Only a tiny fraction of genotypes are strongly affected by third- or higher-order epistasis. Genetic architecture is also sparse: the number of terms required to explain the vast majority of variance is smaller than the number of genotypes by many orders of magnitude. The sequence-function relationship in most proteins is therefore far simpler than previously thought, opening the way for new and tractable approaches to characterize it.
Affiliation(s)
- Yeonwoo Park
- Committee on Genetics, Genomics, and Systems Biology, University of Chicago, Chicago, IL 60637
- Current affiliation: Center for RNA Research, Institute for Basic Science, Seoul, Republic of Korea 08826
- Brian P.H. Metzger
- Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637
- Current affiliation: Department of Biological Sciences, Purdue University, West Lafayette, IN 47907
- Joseph W. Thornton
- Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637
- Department of Human Genetics, University of Chicago, Chicago, IL 60637

7
Dupic T, Phillips AM, Desai MM. Protein sequence landscapes are not so simple: on reference-free versus reference-based inference. bioRxiv 2024:2024.01.29.577800. [PMID: 38352387 PMCID: PMC10862727 DOI: 10.1101/2024.01.29.577800] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/23/2024]
Abstract
In a recent preprint, Park, Metzger, and Thornton reanalyze 20 empirical protein sequence-function landscapes using a "reference-free analysis" (RFA) method they recently developed. They argue that these empirical landscapes are simpler and less epistatic than earlier work suggested, and attribute the difference to limitations of the methods used in the original analyses of these landscapes, which they claim are more sensitive to measurement noise, missing data, and other artifacts. Here, we show that these claims are incorrect. Instead, we find that the RFA method introduced by Park et al. is exactly equivalent to the reference-based least-squares methods used in the original analysis of many of these empirical landscapes (and also equivalent to a Hadamard-based approach they implement). Because the reanalyzed and original landscapes are in fact identical, the different conclusions drawn by Park et al. instead reflect different interpretations of the parameters describing the inferred landscapes; we argue that these do not support the conclusion that epistasis plays only a small role in protein sequence-function landscapes.
Affiliation(s)
- Thomas Dupic
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA
- Angela M Phillips
- Department of Microbiology and Immunology, University of California San Francisco, San Francisco CA
- Michael M Desai
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA

8
Eccleston RC, Manko E, Campino S, Clark TG, Furnham N. A computational method for predicting the most likely evolutionary trajectories in the stepwise accumulation of resistance mutations. eLife 2023; 12:e84756. [PMID: 38132182 PMCID: PMC10807863 DOI: 10.7554/elife.84756] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2022] [Accepted: 12/21/2023] [Indexed: 12/23/2023] Open
Abstract
Pathogen evolution of drug resistance often occurs in a stepwise manner via the accumulation of multiple mutations that in combination have a non-additive impact on fitness, a phenomenon known as epistasis. The evolution of resistance via the accumulation of point mutations in the DHFR genes of Plasmodium falciparum (Pf) and Plasmodium vivax (Pv) has been studied extensively and multiple studies have shown epistatic interactions between these mutations determine the accessible evolutionary trajectories to highly resistant multiple mutations. Here, we simulated these evolutionary trajectories using a model of molecular evolution, parameterised using Rosetta Flex ddG predictions, where selection acts to reduce the target-drug binding affinity. We observe strong agreement with pathways determined using experimentally measured IC50 values of pyrimethamine binding, which suggests binding affinity is strongly predictive of resistance and epistasis in binding affinity strongly influences the order of fixation of resistance mutations. We also infer pathways directly from the frequency of mutations found in isolate data, and observe remarkable agreement with the most likely pathways predicted by our mechanistic model, as well as those determined experimentally. This suggests mutation frequency data can be used to intuitively infer evolutionary pathways, provided sufficient sampling of the population.
Affiliation(s)
- Ruth Charlotte Eccleston
- Department of Infection Biology, London School of Hygiene and Tropical Medicine, London, United Kingdom
- Emilia Manko
- Department of Infection Biology, London School of Hygiene and Tropical Medicine, London, United Kingdom
- Susana Campino
- Department of Infection Biology, London School of Hygiene and Tropical Medicine, London, United Kingdom
- Taane G Clark
- Department of Infection Biology, London School of Hygiene and Tropical Medicine, London, United Kingdom
- Department of Infectious Disease Epidemiology, London School of Hygiene and Tropical Medicine, London, United Kingdom
- Nicholas Furnham
- Department of Infection Biology, London School of Hygiene and Tropical Medicine, London, United Kingdom

9
Phillips AM, Maurer DP, Brooks C, Dupic T, Schmidt AG, Desai MM. Hierarchical sequence-affinity landscapes shape the evolution of breadth in an anti-influenza receptor binding site antibody. eLife 2023; 12:83628. [PMID: 36625542 PMCID: PMC9995116 DOI: 10.7554/elife.83628] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 09/21/2022] [Accepted: 01/09/2023] [Indexed: 01/11/2023] Open
Abstract
Broadly neutralizing antibodies (bnAbs) that neutralize diverse variants of a particular virus are of considerable therapeutic interest. Recent advances have enabled us to isolate and engineer these antibodies as therapeutics, but eliciting them through vaccination remains challenging, in part due to our limited understanding of how antibodies evolve breadth. Here, we analyze the landscape by which an anti-influenza receptor binding site (RBS) bnAb, CH65, evolved broad affinity to diverse H1 influenza strains. We do this by generating an antibody library of all possible evolutionary intermediates between the unmutated common ancestor (UCA) and the affinity-matured CH65 antibody and measure the affinity of each intermediate to three distinct H1 antigens. We find that affinity to each antigen requires a specific set of mutations - distributed across the variable light and heavy chains - that interact non-additively (i.e., epistatically). These sets of mutations form a hierarchical pattern across the antigens, with increasingly divergent antigens requiring additional epistatic mutations beyond those required to bind less divergent antigens. We investigate the underlying biochemical and structural basis for these hierarchical sets of epistatic mutations and find that epistasis between heavy chain mutations and a mutation in the light chain at the VH-VL interface is essential for binding a divergent H1. Collectively, this is the first work to comprehensively characterize epistasis between heavy and light chain mutations and shows that such interactions are both strong and widespread. Together with our previous study analyzing a different class of anti-influenza antibodies, our results implicate epistasis as a general feature of antibody sequence-affinity landscapes that can potentiate and constrain the evolution of breadth.
Affiliation(s)
- Angela M Phillips
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, United States
- Department of Microbiology and Immunology, University of California, San Francisco, San Francisco, United States
- Daniel P Maurer
- Ragon Institute of MGH, MIT, and Harvard, Cambridge, United States
- Department of Microbiology, Harvard Medical School, Boston, United States
- Caelan Brooks
- Department of Physics, Harvard University, Cambridge, United States
- Thomas Dupic
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, United States
- Aaron G Schmidt
- Ragon Institute of MGH, MIT, and Harvard, Cambridge, United States
- Department of Microbiology, Harvard Medical School, Boston, United States
- Michael M Desai
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, United States
- Department of Physics, Harvard University, Cambridge, United States
- NSF-Simons Center for Mathematical and Statistical Analysis of Biology, Harvard University, Cambridge, United States
- Quantitative Biology Initiative, Harvard University, Cambridge, United States

10
Pennell M, Rodriguez OL, Watson CT, Greiff V. The evolutionary and functional significance of germline immunoglobulin gene variation. Trends Immunol 2023; 44:7-21. [PMID: 36470826 DOI: 10.1016/j.it.2022.11.001] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 10/25/2022] [Accepted: 11/07/2022] [Indexed: 12/04/2022]
Abstract
The recombination between immunoglobulin (IG) gene segments determines an individual's naïve antibody repertoire and, consequently, (auto)antigen recognition. Emerging evidence suggests that mammalian IG germline variation impacts humoral immune responses associated with vaccination, infection, and autoimmunity - from the molecular level of epitope specificity, up to profound changes in the architecture of antibody repertoires. These links between IG germline variants and immunophenotype raise the question of the evolutionary causes and consequences of diversity within IG loci. We discuss why the extreme diversity in IG loci remains a mystery, why resolving this is important for the design of more effective vaccines and therapeutics, and how recent evidence from multiple lines of inquiry may help us do so.
Affiliation(s)
- Matt Pennell
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA; Department of Biological Sciences, University of Southern California, Los Angeles, CA, USA.
- Oscar L Rodriguez
- Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, USA
- Corey T Watson
- Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, USA
- Victor Greiff
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway.

11
Robert PA, Akbar R, Frank R, Pavlović M, Widrich M, Snapkov I, Slabodkin A, Chernigovskaya M, Scheffer L, Smorodina E, Rawat P, Mehta BB, Vu MH, Mathisen IF, Prósz A, Abram K, Olar A, Miho E, Haug DTT, Lund-Johansen F, Hochreiter S, Haff IH, Klambauer G, Sandve GK, Greiff V. Unconstrained generation of synthetic antibody-antigen structures to guide machine learning methodology for antibody specificity prediction. Nat Comput Sci 2022; 2:845-865. [PMID: 38177393 DOI: 10.1038/s43588-022-00372-4] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 07/16/2021] [Accepted: 11/09/2022] [Indexed: 01/06/2024]
Abstract
Machine learning (ML) is a key technology for accurate prediction of antibody-antigen binding. Two orthogonal problems hinder the application of ML to antibody-specificity prediction and the benchmarking thereof: the lack of a unified ML formalization of immunological antibody-specificity prediction problems and the unavailability of large-scale synthetic datasets to benchmark real-world relevant ML methods and dataset design. Here we developed the Absolut! software suite that enables parameter-based unconstrained generation of synthetic lattice-based three-dimensional antibody-antigen-binding structures with ground-truth access to conformational paratope, epitope and affinity. We formalized common immunological antibody-specificity prediction problems as ML tasks and confirmed that for both sequence- and structure-based tasks, accuracy-based rankings of ML methods trained on experimental data hold for ML methods trained on Absolut!-generated data. The Absolut! framework has the potential to enable real-world relevant development and benchmarking of ML strategies for biotherapeutics design.
Affiliation(s)
- Philippe A Robert
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway.
- Rahmad Akbar
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
- Robert Frank
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
- Michael Widrich
- ELLIS Unit Linz and LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Linz, Austria
- Igor Snapkov
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
- Andrei Slabodkin
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
- Maria Chernigovskaya
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
- Eva Smorodina
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
- Puneet Rawat
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
- Brij Bhushan Mehta
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
- Mai Ha Vu
- Department of Linguistics and Scandinavian Studies, University of Oslo, Oslo, Norway
- Aurél Prósz
- Danish Cancer Society Research Center, Translational Cancer Genomics, Copenhagen, Denmark
- Krzysztof Abram
- The Novo Nordisk Foundation Center for Biosustainability, Autoflow, DTU Biosustain and IT University of Copenhagen, Copenhagen, Denmark
- Alex Olar
- Department of Complex Systems in Physics, Eötvös Loránd University, Budapest, Hungary
- Enkelejda Miho
- Institute of Medical Engineering and Medical Informatics, School of Life Sciences, FHNW University of Applied Sciences and Arts Northwestern Switzerland, Muttenz, Switzerland
- aiNET GmbH, Basel, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Sepp Hochreiter
- ELLIS Unit Linz and LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Linz, Austria
- Institute of Advanced Research in Artificial Intelligence (IARAI), Vienna, Austria
- Günter Klambauer
- ELLIS Unit Linz and LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Linz, Austria
- Victor Greiff
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway.

12
Moulana A, Dupic T, Phillips AM, Chang J, Nieves S, Roffler AA, Greaney AJ, Starr TN, Bloom JD, Desai MM. Compensatory epistasis maintains ACE2 affinity in SARS-CoV-2 Omicron BA.1. Nat Commun 2022; 13:7011. [PMID: 36384919 PMCID: PMC9668218 DOI: 10.1038/s41467-022-34506-z] [Citation(s) in RCA: 61] [Impact Index Per Article: 30.5] [Received: 09/08/2022] [Accepted: 10/26/2022] [Indexed: 11/17/2022] Open
Abstract
The Omicron BA.1 variant emerged in late 2021 and quickly spread across the world. Compared to the earlier SARS-CoV-2 variants, BA.1 has many mutations, some of which are known to enable antibody escape. Many of these antibody-escape mutations individually decrease the spike receptor-binding domain (RBD) affinity for ACE2, but BA.1 still binds ACE2 with high affinity. The fitness and evolution of the BA.1 lineage is therefore driven by the combined effects of numerous mutations. Here, we systematically map the epistatic interactions between the 15 mutations in the RBD of BA.1 relative to the Wuhan Hu-1 strain. Specifically, we measure the ACE2 affinity of all possible combinations of these 15 mutations (2^15 = 32,768 genotypes), spanning all possible evolutionary intermediates from the ancestral Wuhan Hu-1 strain to BA.1. We find that immune escape mutations in BA.1 individually reduce ACE2 affinity but are compensated by epistatic interactions with other affinity-enhancing mutations, including Q498R and N501Y. Thus, the ability of BA.1 to evade immunity while maintaining ACE2 affinity is contingent on acquiring multiple interacting mutations. Our results implicate compensatory epistasis as a key factor driving substantial evolutionary change for SARS-CoV-2 and are consistent with Omicron BA.1 arising from a chronic infection.
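The size of the combinatorial library described in this abstract can be checked with a short Python sketch. It enumerates every presence/absence combination of 15 mutations as a tuple of 0s and 1s. The variable and function names are hypothetical, and the encoding (0 = ancestral state, 1 = BA.1 state) is an assumption for illustration, not the authors' pipeline.

```python
from itertools import product

N_MUTATIONS = 15  # number of RBD mutations stated in the abstract


def all_genotypes(n_sites):
    """Return every binary genotype of length n_sites as a tuple of 0s and 1s."""
    return list(product((0, 1), repeat=n_sites))


genotypes = all_genotypes(N_MUTATIONS)

# Assumed encoding: the all-zeros tuple stands for the ancestral background,
# the all-ones tuple for the genotype carrying all 15 mutations.
ancestral = genotypes[0]
fully_mutated = genotypes[-1]

print(len(genotypes))  # 32768
```

Each tuple corresponds to one possible evolutionary intermediate, so the count matches the 2^15 = 32,768 genotypes reported above.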
Affiliation(s)
- Alief Moulana
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, 02138, USA
- Thomas Dupic
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, 02138, USA
- Angela M Phillips
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, 02138, USA.
- Jeffrey Chang
- Department of Physics, Harvard University, Cambridge, MA, 02138, USA
- Serafina Nieves
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, 02138, USA
- Anne A Roffler
- Biological and Biomedical Sciences, Harvard Medical School, Boston, MA, 02115, USA
- Allison J Greaney
- Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, WA, 98109, USA
- Department of Genome Sciences, University of Washington, Seattle, WA, 98195, USA
- Medical Scientist Training Program, University of Washington, Seattle, WA, 98195, USA
- Tyler N Starr
- Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, WA, 98109, USA
- Jesse D Bloom
- Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, WA, 98109, USA
- Department of Genome Sciences, University of Washington, Seattle, WA, 98195, USA
- Howard Hughes Medical Institute, Seattle, WA, 98109, USA
- Michael M Desai
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, 02138, USA.
- Department of Physics, Harvard University, Cambridge, MA, 02138, USA.
- NSF-Simons Center for Mathematical and Statistical Analysis of Biology, Harvard University, Cambridge, MA, 02138, USA.
- Quantitative Biology Initiative, Harvard University, Cambridge, MA, 02138, USA.

13
|
McBride JM, Eckmann JP, Tlusty T. General Theory of Specific Binding: Insights from a Genetic-Mechano-Chemical Protein Model. Mol Biol Evol 2022; 39:msac217. [PMID: 36208205 PMCID: PMC9641994 DOI: 10.1093/molbev/msac217] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/22/2022] Open
Abstract
Proteins need to selectively interact with specific targets among a multitude of similar molecules in the cell. However, despite a firm physical understanding of binding interactions, we lack a general theory of how proteins evolve high specificity. Here, we present such a model that combines chemistry, mechanics, and genetics and explains how their interplay governs the evolution of specific protein-ligand interactions. The model shows that there are many routes to achieving molecular discrimination (by varying degrees of flexibility and shape/chemistry complementarity), but the key ingredient is precision. Harder discrimination tasks require more collective and precise coaction of structure, forces, and movements. Proteins can achieve this through correlated mutations extending far from a binding site, which fine-tune the localized interaction with the ligand. Thus, the solution of more complicated tasks is enabled by increasing the protein size, and proteins become more evolvable and robust when they are larger than the bare minimum required for discrimination. The model makes testable, specific predictions about the role of flexibility and shape mismatch in discrimination, and how evolution can independently tune affinity and specificity. Thus, the proposed theory of specific binding addresses the natural question of "why are proteins so big?". A possible answer is that molecular discrimination is often a hard task best performed by adding more layers to the protein.
Affiliation(s)
- John M McBride: Center for Soft and Living Matter, Institute for Basic Science, Ulsan 44919, South Korea
- Jean-Pierre Eckmann: Département de Physique Théorique and Section de Mathématiques, University of Geneva, Geneva, Switzerland
- Tsvi Tlusty: Center for Soft and Living Matter, Institute for Basic Science, Ulsan 44919, South Korea; Departments of Physics and Chemistry, Ulsan National Institute of Science and Technology, Ulsan 44919, South Korea
14
Designing antibodies as therapeutics. Cell 2022; 185:2789-2805. [PMID: 35868279 DOI: 10.1016/j.cell.2022.05.029] [Citation(s) in RCA: 67] [Impact Index Per Article: 33.5] [Received: 03/03/2022] [Revised: 05/18/2022] [Accepted: 05/31/2022] [Indexed: 12/25/2022]
Abstract
Antibody therapeutics are a large and rapidly expanding drug class providing major health benefits. We provide a snapshot of current antibody therapeutics including their formats, common targets, therapeutic areas, and routes of administration. Our focus is on selected emerging directions in antibody design where progress may provide a broad benefit. These topics include enhancing antibodies for cancer, antibody delivery to organs such as the brain, gastrointestinal tract, and lungs, plus antibody developability challenges including immunogenicity risk assessment and mitigation and subcutaneous delivery. Machine learning has the potential, albeit as yet largely unrealized, for a transformative future impact on antibody discovery and engineering.
15
Chu HY, Wong ASL. Facilitating Machine Learning-Guided Protein Engineering with Smart Library Design and Massively Parallel Assays. Adv Genet (Hoboken) 2021; 2:2100038. [PMID: 36619853 PMCID: PMC9744531 DOI: 10.1002/ggn2.202100038] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/07/2021] [Revised: 11/08/2021] [Indexed: 01/11/2023]
Abstract
Protein design plays an important role in recent medical advances from antibody therapy to vaccine design. Typically, exhaustive mutational screens or directed evolution experiments are used for the identification of the best design or for improvements to the wild-type variant. Even with high-throughput screening on pooled libraries and Next-Generation Sequencing to boost the scale of read-outs, surveying all the variants with combinatorial mutations for their empirical fitness scores is still orders of magnitude beyond the capacity of existing experimental settings. To tackle this challenge, in-silico approaches using machine learning to predict the fitness of novel variants based on a subset of empirical measurements are now employed. These machine learning models turn out to be useful in many cases, with the premise that the experimentally determined fitness scores and the amino-acid descriptors of the models are informative. The machine learning models can guide the search for the highest fitness variants, resolve complex epistatic relationships, and highlight bio-physical rules for protein folding. Using machine learning-guided approaches, researchers can build more focused libraries, thus relieving themselves from labor-intensive screens and fast-tracking the optimization process. Here, we describe the current advances in massive-scale variant screens, and how machine learning and mutagenesis strategies can be integrated to accelerate protein engineering. More specifically, we examine strategies to make screens more economical, informative, and effective in discovery of useful variants.
Affiliation(s)
- Hoi Yee Chu: Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Hong Kong, China
- Alan S. L. Wong: Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Hong Kong, China; Electrical and Electronic Engineering, The University of Hong Kong, Pokfulam, Hong Kong, China
16
Intelligent host engineering for metabolic flux optimisation in biotechnology. Biochem J 2021; 478:3685-3721. [PMID: 34673920 PMCID: PMC8589332 DOI: 10.1042/bcj20210535] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 07/25/2021] [Revised: 09/22/2021] [Accepted: 09/24/2021] [Indexed: 12/13/2022]
Abstract
Optimising the function of a protein of length N amino acids by directed evolution involves navigating a 'search space' of possible sequences of some 20^N. Optimising the expression levels of P proteins that materially affect host performance, each of which might also take 20 (logarithmically spaced) values, implies a similar search space of 20^P. In this combinatorial sense, then, the problems of directed protein evolution and of host engineering are broadly equivalent. In practice, however, they have different means for avoiding the inevitable difficulties of implementation. The spare capacity exhibited in metabolic networks implies that host engineering may admit substantial increases in flux to targets of interest. Thus, we rehearse the relevant issues for those wishing to understand and exploit those modern genome-wide host engineering tools and thinking that have been designed and developed to optimise fluxes towards desirable products in biotechnological processes, with a focus on microbial systems. The aim throughout is 'making such biology predictable'. Strategies have been aimed at both transcription and translation, especially for regulatory processes that can affect multiple targets. However, because there is a limit on how much protein a cell can produce, increasing kcat in selected targets may be a better strategy than increasing protein expression levels for optimal host engineering.
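To give a sense of scale for the 20^N and 20^P search spaces mentioned in this abstract, the following sketch computes the exact count and its base-10 logarithm. The function names and the example lengths are illustrative assumptions, not values taken from the article.

```python
import math

N_STATES = 20  # amino acids per position, or expression levels per protein


def search_space_size(length, n_states=N_STATES):
    """Exact number of combinations, n_states ** length (Python ints do not overflow)."""
    return n_states ** length


def log10_search_space_size(length, n_states=N_STATES):
    """Order of magnitude of the search space, computed without building the huge integer."""
    return length * math.log10(n_states)


small = search_space_size(3)             # three positions with 20 states each
exponent = log10_search_space_size(100)  # a hypothetical 100-residue protein
```

Because the logarithm grows linearly in the length, each added position (or each added protein whose level is tuned) multiplies the space by 20, which is the combinatorial equivalence the abstract draws between directed evolution and host engineering.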
17
Phillips AM, Lawrence KR, Moulana A, Dupic T, Chang J, Johnson MS, Cvijovic I, Mora T, Walczak AM, Desai MM. Binding affinity landscapes constrain the evolution of broadly neutralizing anti-influenza antibodies. eLife 2021; 10:71393. [PMID: 34491198 PMCID: PMC8476123 DOI: 10.7554/elife.71393] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 06/17/2021] [Accepted: 09/05/2021] [Indexed: 12/12/2022] Open
Abstract
Over the past two decades, several broadly neutralizing antibodies (bnAbs) that confer protection against diverse influenza strains have been isolated. Structural and biochemical characterization of these bnAbs has provided molecular insight into how they bind distinct antigens. However, our understanding of the evolutionary pathways leading to bnAbs, and thus how best to elicit them, remains limited. Here, we measure equilibrium dissociation constants of combinatorially complete mutational libraries for two naturally isolated influenza bnAbs (CR9114, 16 heavy-chain mutations; CR6261, 11 heavy-chain mutations), reconstructing all possible evolutionary intermediates back to the unmutated germline sequences. We find that these two libraries exhibit strikingly different patterns of breadth: while many variants of CR6261 display moderate affinity to diverse antigens, those of CR9114 display appreciable affinity only in specific, nested combinations. By examining the extensive pairwise and higher order epistasis between mutations, we find key sites with strong synergistic interactions that are highly similar across antigens for CR6261 and different for CR9114. Together, these features of the binding affinity landscapes strongly favor sequential acquisition of affinity to diverse antigens for CR9114, while the acquisition of breadth to more similar antigens for CR6261 is less constrained. These results, if generalizable to other bnAbs, may explain the molecular basis for the widespread observation that sequential exposure favors greater breadth, and such mechanistic insight will be essential for predicting and eliciting broadly protective immune responses.
Affiliation(s)
- Angela M Phillips: Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, United States
- Katherine R Lawrence: Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, United States; NSF-Simons Center for Mathematical and Statistical Analysis of Biology, Harvard University, Cambridge, United States; Quantitative Biology Initiative, Harvard University, Cambridge, United States; Department of Physics, Massachusetts Institute of Technology, Cambridge, United States
- Alief Moulana: Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, United States
- Thomas Dupic: Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, United States
- Jeffrey Chang: Department of Physics, Harvard University, Cambridge, United States
- Milo S Johnson: Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, United States
- Ivana Cvijovic: Department of Applied Physics, Stanford University, Stanford, United States
- Thierry Mora: Laboratoire de physique de l'École Normale Supérieure, CNRS, PSL University, Sorbonne Université, and Université de Paris, Paris, France
- Aleksandra M Walczak: Laboratoire de physique de l'École Normale Supérieure, CNRS, PSL University, Sorbonne Université, and Université de Paris, Paris, France
- Michael M Desai: Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, United States; NSF-Simons Center for Mathematical and Statistical Analysis of Biology, Harvard University, Cambridge, United States; Quantitative Biology Initiative, Harvard University, Cambridge, United States; Department of Physics, Harvard University, Cambridge, United States
18
Pedruzzi G, Rouzine IM. An evolution-based high-fidelity method of epistasis measurement: Theory and application to influenza. PLoS Pathog 2021; 17:e1009669. [PMID: 34153082 PMCID: PMC8248644 DOI: 10.1371/journal.ppat.1009669] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/15/2020] [Revised: 07/01/2021] [Accepted: 05/25/2021] [Indexed: 12/18/2022] Open
Abstract
Linkage effects in a multi-locus population strongly influence its evolution. The models based on the traveling wave approach enable us to predict the average speed of evolution and the statistics of phylogeny. However, predicting statistically the evolution of specific sites and pairs of sites in the multi-locus context remains a mathematical challenge. In particular, the effects of epistasis, the interaction of gene regions contributing to phenotype, are difficult to predict theoretically and detect experimentally in sequence data. A large number of false-positive interactions arises from stochastic linkage effects and indirect interactions, which mask true epistatic interactions. Here we develop a proof-of-principle method to filter out false-positive interactions. We start by demonstrating that the averaging of haplotype frequencies over multiple independent populations is necessary but not sufficient for epistatic detection, because it still leaves high numbers of false-positive interactions. To compensate for the residual stochastic noise, we develop a three-way haplotype method isolating true interactions. The fidelity of the method is confirmed analytically and on simulated genetic sequences evolved with a known epistatic network. The method is then applied to a large sequence database of neuraminidase protein of influenza A H1N1 obtained from various geographic locations to infer the epistatic network responsible for the difference between the pre-pandemic virus and the pandemic strain of 2009. These results present a simple and reliable technique to measure epistatic interactions of any sign from sequence data.
Interactions between genomic sites create a fitness landscape. The knowledge of topology and strength of interactions is vital for predicting the escape of viruses from drugs and immune response and their passing through fitness valleys. Many efforts have been invested into measuring these interactions from DNA sequence sets. Unfortunately, reproducibility of the results remains low due partly to a very small fraction of interaction pairs and partly to stochastic linkage noise masking true interactions. Here we propose a method to separate stochastic linkage and indirect interactions from epistatic interactions and apply it to influenza virus sequence data.
Affiliation(s)
- Gabriele Pedruzzi: Sorbonne Université, Institute de Biologie Paris-Seine, Laboratoire de Biologie Computationelle et Quantitative LCQB, Paris, France
- Igor M. Rouzine: Sorbonne Université, Institute de Biologie Paris-Seine, Laboratoire de Biologie Computationelle et Quantitative LCQB, Paris, France
19
Miton CM, Buda K, Tokuriki N. Epistasis and intramolecular networks in protein evolution. Curr Opin Struct Biol 2021; 69:160-168. [PMID: 34077895 DOI: 10.1016/j.sbi.2021.04.007] [Citation(s) in RCA: 39] [Impact Index Per Article: 13.0] [Received: 01/18/2021] [Revised: 04/01/2021] [Accepted: 04/21/2021] [Indexed: 12/01/2022]
Abstract
Proteins are molecular machines composed of complex, highly connected amino acid networks. Their functional optimization requires the reorganization of these intramolecular networks by evolution. In this review, we discuss the mechanisms by which epistasis, that is, the dependence of the effect of a mutation on the genetic background, rewires intramolecular interactions to alter protein function. Deciphering the biophysical basis of epistasis is crucial to our understanding of evolutionary dynamics and the elucidation of sequence-structure-function relationships. We featured recent studies that provide insights into the molecular mechanisms giving rise to epistasis, particularly at the structural level. These studies illustrate the convoluted and fascinating nature of the intramolecular networks co-opted by epistasis during the evolution of protein function.
Affiliation(s)
- Charlotte M Miton: Michael Smith Laboratories, University of British Columbia, Vancouver, V6T 1Z4, BC, Canada
- Karol Buda: Michael Smith Laboratories, University of British Columbia, Vancouver, V6T 1Z4, BC, Canada
- Nobuhiko Tokuriki: Michael Smith Laboratories, University of British Columbia, Vancouver, V6T 1Z4, BC, Canada
20
Layman NC, Tuschhoff BM, Nuismer SL. Designing transmissible viral vaccines for evolutionary robustness and maximum efficiency. Virus Evol 2021; 7:veab002. [PMID: 33680502 PMCID: PMC7920745 DOI: 10.1093/ve/veab002] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/25/2022] Open
Abstract
The danger posed by emerging infectious diseases necessitates the development of new tools that can mitigate the risk of animal pathogens spilling over into the human population. One promising approach is the development of recombinant viral vaccines that are transmissible, and thus capable of self-dissemination through hard to reach populations of wild animals. Indeed, mathematical models demonstrate that transmissible vaccines can greatly reduce the effort required to control the spread of zoonotic pathogens in their animal reservoirs, thereby limiting the chances of human infection. A key challenge facing these new vaccines, however, is the inevitability of evolutionary change resulting from their ability to self-replicate and generate extended chains of transmission. Further, carrying immunogenic transgenes is often costly, in terms of metabolic burden, increased competition with the pathogen, or due to unintended interactions with the viral host regulatory network. As a result, natural selection is expected to favor vaccine strains that down-regulate or delete these transgenes resulting in increased rates of transmission and reduced efficacy against the target pathogen. In addition, efficacy and evolutionary stability will often be at odds; as when longer, more efficacious antigens experience faster rates of evolutionary decay. Here, we ask how such trade-offs influence the overall performance of transmissible vaccines. We find that evolutionary instability can substantially reduce performance, even for vaccine candidates with the ideal combination of efficacy and transmission. However, we find that, at least in some cases, vaccine stability and overall performance can be improved by the inclusion of a second, redundant antigen. Overall, our results suggest that the successful application of recombinant transmissible vaccines will require consideration of evolutionary dynamics and epistatic effects, as well as basic measurements of epidemiological features.
Affiliation(s)
- Beth M Tuschhoff: Department of Mathematics, University of Idaho, 875 Perimeter Drive, Moscow, ID 83844, USA
21
Spisak N, Walczak AM, Mora T. Learning the heterogeneous hypermutation landscape of immunoglobulins from high-throughput repertoire data. Nucleic Acids Res 2020; 48:10702-10712. [PMID: 33035336 PMCID: PMC7641750 DOI: 10.1093/nar/gkaa825] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 07/21/2020] [Revised: 09/07/2020] [Accepted: 09/18/2020] [Indexed: 01/23/2023] Open
Abstract
Somatic hypermutations of immunoglobulin (Ig) genes occurring during affinity maturation drive B-cell receptors’ ability to evolve strong binding to their antigenic targets. The landscape of these mutations is highly heterogeneous, with certain regions of the Ig gene being preferentially targeted. However, a rigorous quantification of this bias has been difficult because of phylogenetic correlations between sequences and the interference of selective forces. Here, we present an approach that corrects for these issues, and use it to learn a model of hypermutation preferences from a recently published large IgH repertoire dataset. The obtained model predicts mutation profiles accurately and in a reproducible way, including in the previously uncharacterized Complementarity Determining Region 3, revealing that both the sequence context of the mutation and its absolute position along the gene are important. In addition, we show that hypermutations occurring concomitantly along B-cell lineages tend to co-localize, suggesting a possible mechanism for accelerating affinity maturation.
Affiliation(s)
- Natanael Spisak: Laboratoire de physique de l’École normale supérieure, CNRS, PSL University, Sorbonne Université, and Université de Paris, 24 rue Lhomond, 75005 Paris, France
22
In vitro evolution of antibody affinity via insertional scanning mutagenesis of an entire antibody variable region. Proc Natl Acad Sci U S A 2020; 117:27307-27318. [PMID: 33067389 DOI: 10.1073/pnas.2002954117] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 12/23/2022] Open
Abstract
We report a systematic combinatorial exploration of affinity enhancement of antibodies by insertions and deletions (InDels). Transposon-based introduction of InDels via the method TRIAD (transposition-based random insertion and deletion mutagenesis) was used to generate large libraries with random in-frame InDels across the entire single-chain variable fragment gene that were further recombined and screened by ribosome display. Knowledge of potential insertion points from TRIAD libraries formed the basis of exploration of length and sequence diversity of novel insertions by insertional-scanning mutagenesis (InScaM). An overall 256-fold affinity improvement of an anti-IL-13 antibody BAK1 as a result of InDel mutagenesis and combination with known point mutations validates this approach, and suggests that the results of this InDel mutagenesis and conventional exploration of point mutations can synergize to generate antibodies with higher affinity.
23
Abstract
Living systems evolve one mutation at a time, but a single mutation can alter the effect of subsequent mutations. The underlying mechanistic determinants of such epistasis are unclear. Here, we demonstrate that the physical dynamics of a biological system can generically constrain epistasis. We analyze models and experimental data on proteins and regulatory networks. In each, we find that if the long-time physical dynamics is dominated by a slow, collective mode, then the dimensionality of mutational effects is reduced. Consequently, epistatic coefficients for different combinations of mutations are no longer independent, even if individually strong. Such epistasis can be summarized as resulting from a global nonlinearity applied to an underlying linear trait, that is, as global epistasis. This constraint, in turn, reduces the ruggedness of the sequence-to-function map. By providing a generic mechanistic origin for experimentally observed global epistasis, our work suggests that slow collective physical modes can make biological systems evolvable.
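The idea of global epistasis summarized in this abstract, a nonlinearity applied to an underlying linear trait, can be illustrated with a toy two-site calculation. This is a minimal sketch under stated assumptions: the logistic function and the effect sizes are arbitrary choices for illustration, not the model or data used in the study, and all names are hypothetical.

```python
import math


def latent_trait(genotype, effects):
    """Additive latent trait: sum of the effects of the mutations that are present."""
    return sum(effect for present, effect in zip(genotype, effects) if present)


def phenotype(genotype, effects):
    """Observed phenotype: a logistic nonlinearity (an assumed form) of the latent trait."""
    return 1.0 / (1.0 + math.exp(-latent_trait(genotype, effects)))


def pairwise_epistasis(trait_fn, effects):
    """Deviation of the double mutant from the sum of the single-mutant effects."""
    return (trait_fn((1, 1), effects) - trait_fn((1, 0), effects)
            - trait_fn((0, 1), effects) + trait_fn((0, 0), effects))


effects = [2.0, 2.0]  # illustrative effect sizes on the latent scale
eps_latent = pairwise_epistasis(latent_trait, effects)   # additive scale
eps_observed = pairwise_epistasis(phenotype, effects)    # after the nonlinearity
```

On the latent scale the two mutations combine additively, while on the observed scale the saturating nonlinearity alone produces an apparent negative interaction, even though no specific pairwise term was included.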
Affiliation(s)
- Kabir Husain: Department of Physics, University of Chicago, Chicago, IL
- Arvind Murugan: Department of Physics, University of Chicago, Chicago, IL
24
Ballal A, Laurendon C, Salmon M, Vardakou M, Cheema J, Defernez M, O'Maille PE, Morozov AV. Sparse Epistatic Patterns in the Evolution of Terpene Synthases. Mol Biol Evol 2020; 37:1907-1924. [PMID: 32119077 DOI: 10.1093/molbev/msaa052] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 12/22/2022] Open
Abstract
We explore sequence determinants of enzyme activity and specificity in a major enzyme family of terpene synthases. Most enzymes in this family catalyze reactions that produce cyclic terpenes, complex hydrocarbons widely used by plants and insects in diverse biological processes such as defense, communication, and symbiosis. To analyze the molecular mechanisms of emergence of terpene cyclization, we have carried out in-depth examination of mutational space around (E)-β-farnesene synthase, an Artemisia annua enzyme which catalyzes production of a linear hydrocarbon chain. Each mutant enzyme in our synthetic libraries was characterized biochemically, and the resulting reaction rate data were used as input to the Michaelis-Menten model of enzyme kinetics, in which free energies were represented as sums of one-amino-acid contributions and two-amino-acid couplings. Our model predicts measured reaction rates with high accuracy and yields free energy landscapes characterized by relatively few coupling terms. As a result, the Michaelis-Menten free energy landscapes have simple, interpretable structure and exhibit little epistasis. We have also developed biophysical fitness models based on the assumption that highly fit enzymes have evolved to maximize the output of correct products, such as cyclic products or a specific product of interest, while minimizing the output of byproducts. This approach results in nonlinear fitness landscapes that are considerably more epistatic. Overall, our experimental and computational framework provides focused characterization of evolutionary emergence of novel enzymatic functions in the context of microevolutionary exploration of sequence space around naturally occurring enzymes.
Affiliation(s)
- Aditya Ballal: Department of Physics & Astronomy and Center for Quantitative Biology, Rutgers University, Piscataway, NJ
- Caroline Laurendon: John Innes Centre, Department of Metabolic Biology, Norwich Research Park, Norwich, United Kingdom; Food & Health Programme, Institute of Food Research, Norwich Research Park, Norwich, United Kingdom
- Melissa Salmon: John Innes Centre, Department of Metabolic Biology, Norwich Research Park, Norwich, United Kingdom; Food & Health Programme, Institute of Food Research, Norwich Research Park, Norwich, United Kingdom; Earlham Institute, Norwich Research Park, Norwich, United Kingdom
- Maria Vardakou: John Innes Centre, Department of Metabolic Biology, Norwich Research Park, Norwich, United Kingdom; Food & Health Programme, Institute of Food Research, Norwich Research Park, Norwich, United Kingdom; School of Biological Sciences, University of East Anglia, Norwich Research Park, Norwich, United Kingdom
- Jitender Cheema: John Innes Centre, Department of Computational and Systems Biology, Norwich Research Park, Norwich, United Kingdom
- Marianne Defernez: Core Science Resources, Quadram Institute, Norwich Research Park, Norwich, United Kingdom
- Paul E O'Maille: John Innes Centre, Department of Metabolic Biology, Norwich Research Park, Norwich, United Kingdom; Food & Health Programme, Institute of Food Research, Norwich Research Park, Norwich, United Kingdom; SRI International, Menlo Park, CA
- Alexandre V Morozov: Department of Physics & Astronomy and Center for Quantitative Biology, Rutgers University, Piscataway, NJ
25
Sachdeva V, Husain K, Sheng J, Wang S, Murugan A. Tuning environmental timescales to evolve and maintain generalists. Proc Natl Acad Sci U S A 2020; 117:12693-12699. [PMID: 32457160 PMCID: PMC7293598 DOI: 10.1073/pnas.1914586117] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Indexed: 12/21/2022] Open
Abstract
Natural environments can present diverse challenges, but some genotypes remain fit across many environments. Such "generalists" can be hard to evolve, outcompeted by specialists fitter in any particular environment. Here, inspired by the search for broadly neutralizing antibodies during B cell affinity maturation, we demonstrate that environmental changes on an intermediate timescale can reliably evolve generalists, even when faster or slower environmental changes are unable to do so. We find that changing environments on timescales comparable with evolutionary transients in a population enhance the rate of evolving generalists from specialists, without enhancing the reverse process. The yield of generalists is further increased in more complex dynamic environments, such as a "chirp" of increasing frequency. Our work offers design principles for how nonequilibrium fitness "seascapes" can dynamically funnel populations to genotypes unobtainable in static environments.
Affiliation(s)
- Vedant Sachdeva: Graduate Program in Biophysical Sciences, The University of Chicago, Chicago, IL 60627
- Kabir Husain: Department of Physics, The University of Chicago, Chicago, IL 60627
- Jiming Sheng: Department of Physics and Astronomy, The University of California, Los Angeles, CA 90095
- Shenshen Wang: Department of Physics and Astronomy, The University of California, Los Angeles, CA 90095
- Arvind Murugan: Department of Physics, The University of Chicago, Chicago, IL 60627
26
Abstract
Understanding the individual and joint contribution of multiple protein levels toward a phenotype requires precise and tunable multigene expression control. Here we introduce a pair of mammalian synthetic gene circuits that linearly and orthogonally control the expression of two reporter genes in mammalian cells with low variability in response to chemical inducers introduced into the growth medium. These gene expression systems can be used to simultaneously probe the individual and joint effects of two gene product concentrations on a cellular phenotype in basic research or biomedical applications.
Affiliation(s)
- Mariola Szenk
  - The Louis and Beatrice Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York 11794, United States
  - Department of Biomedical Engineering, Stony Brook University, Stony Brook, New York 11794, United States
- Terrence Yim
  - Department of Biomedical Engineering, Stony Brook University, Stony Brook, New York 11794, United States
- Gábor Balázsi
  - The Louis and Beatrice Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York 11794, United States
  - Department of Biomedical Engineering, Stony Brook University, Stony Brook, New York 11794, United States
|
27
|
Srikant S, Gaudet R, Murray AW. Selecting for Altered Substrate Specificity Reveals the Evolutionary Flexibility of ATP-Binding Cassette Transporters. Curr Biol 2020; 30:1689-1702.e6. [PMID: 32220325 PMCID: PMC7243462 DOI: 10.1016/j.cub.2020.02.077] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 10/10/2019] [Revised: 01/20/2020] [Accepted: 02/24/2020] [Indexed: 12/12/2022]
Abstract
ATP-binding cassette (ABC) transporters are the largest family of ATP-hydrolyzing transporters, which import or export substrates across membranes, and have members in every sequenced genome. Structural studies and biochemistry highlight the contrast between the global structural similarity of homologous transporters and the enormous diversity of their substrates. How do ABC transporters evolve to carry such diverse molecules and what variations in their amino acid sequence alter their substrate selectivity? We mutagenized the transmembrane domains of a conserved fungal ABC transporter that exports a mating pheromone and selected for mutants that export a non-cognate pheromone. Mutations that alter export selectivity cover a region that is larger than expected for a localized substrate-binding site. Individual selected clones have multiple mutations, which have broadly additive contributions to specific transport activity. Our results suggest that multiple positions influence substrate selectivity, leading to alternative evolutionary paths toward selectivity for particular substrates and explaining the number and diversity of ABC transporters. Srikant et al. find that mutations at many different positions in an ABC transporter of fungal mating pheromone have roughly additive effects on substrate recognition. This helps explain the evolvability of ABC transporters to transport a remarkable variety of substrates and their presence as the largest protein family across all domains of life.
Affiliation(s)
- Sriram Srikant
  - Department of Molecular and Cellular Biology, Harvard University, 52 Oxford Street, Cambridge, MA 02138, USA
- Rachelle Gaudet
  - Department of Molecular and Cellular Biology, Harvard University, 52 Oxford Street, Cambridge, MA 02138, USA
- Andrew W Murray
  - Department of Molecular and Cellular Biology, Harvard University, 52 Oxford Street, Cambridge, MA 02138, USA
|