1
Xu M, Dantu SC, Garnett JA, Bonomo RA, Pandini A, Haider S. Functionally important residues from graph analysis of coevolved dynamic couplings. eLife 2025; 14:RP105005. [PMID: 40153310 PMCID: PMC11952748 DOI: 10.7554/elife.105005] [Indexed: 03/30/2025]
Abstract
The relationship between protein dynamics and function is essential for understanding biological processes and developing effective therapeutics. Functional sites within proteins are critical for activities such as substrate binding, catalysis, and structural changes. Existing computational methods for predicting functional residues are trained on sequence, structural, and experimental data, but they do not explicitly model the influence of evolution on protein dynamics. This overlooked contribution is essential, as it is known that evolution can fine-tune protein dynamics through compensatory mutations, either to improve a protein's performance or to diversify its function while maintaining the same structural scaffold. To model this critical contribution, we introduce DyNoPy, a computational method that combines residue coevolution analysis with molecular dynamics simulations, revealing hidden correlations between functional sites. DyNoPy constructs a graph model of residue-residue interactions, identifies communities of key residue groups, and annotates critical sites based on their roles. By leveraging the concept of coevolved dynamical couplings (residue pairs with critical dynamical interactions that have been preserved during evolution), DyNoPy offers a powerful method for predicting and analysing protein evolution and dynamics. We demonstrate the effectiveness of DyNoPy on SHV-1 and PDC-3, chromosomally encoded β-lactamases linked to antibiotic resistance, highlighting its potential to inform drug design and address pressing healthcare challenges.
Affiliation(s)
- Manming Xu
  - UCL School of Pharmacy, London, United Kingdom
- James A Garnett
  - Centre for Host-Microbiome Interactions, Faculty of Dentistry, Oral & Craniofacial Sciences, King’s College London, London, United Kingdom
- Robert A Bonomo
  - Research Service, Louis Stokes Cleveland Department of Veterans Affairs Medical Center, Cleveland, United States
  - Department of Molecular Biology and Microbiology, Case Western Reserve University School of Medicine, Cleveland, United States
  - Department of Medicine, Case Western Reserve University School of Medicine, Cleveland, United States
  - Departments of Pharmacology, Biochemistry, and Proteomics and Bioinformatics, Case Western Reserve University School of Medicine, Cleveland, United States
  - CWRU-Cleveland VAMC Center for Antimicrobial Resistance and Epidemiology (Case VA CARES), Cleveland, United States
- Alessandro Pandini
  - Department of Computer Science, Brunel University London, Uxbridge, United Kingdom
- Shozeb Haider
  - UCL School of Pharmacy, London, United Kingdom
  - University of Tabuk (PFSCBR), Tabuk, Saudi Arabia
  - UCL Center for Advanced Research Computing, University College London, London, United Kingdom
2
Prywes N, Phillips NR, Oltrogge LM, Lindner S, Taylor-Kearney LJ, Tsai YCC, de Pins B, Cowan AE, Chang HA, Wang RZ, Hall LN, Bellieny-Rabelo D, Nisonoff HM, Weissman RF, Flamholz AI, Ding D, Bhatt AY, Mueller-Cajar O, Shih PM, Milo R, Savage DF. A map of the rubisco biochemical landscape. Nature 2025; 638:823-828. [PMID: 39843747 PMCID: PMC11839469 DOI: 10.1038/s41586-024-08455-0] [Received: 04/22/2024] [Accepted: 11/26/2024] [Indexed: 01/24/2025]
Abstract
Rubisco is the primary CO2-fixing enzyme of the biosphere, yet it has slow kinetics. The roles of evolution and chemical mechanism in constraining its biochemical function remain debated. Engineering efforts aimed at adjusting the biochemical parameters of rubisco have largely failed, although recent results indicate that the functional potential of rubisco has a wider scope than previously known. Here we developed a massively parallel assay, using an engineered Escherichia coli in which enzyme activity is coupled to growth, to systematically map the sequence-function landscape of rubisco. Composite assay of more than 99% of single-amino acid mutants versus CO2 concentration enabled inference of enzyme velocity and apparent CO2 affinity parameters for thousands of substitutions. This approach identified many highly conserved positions that tolerate mutation and rare mutations that improve CO2 affinity. These data indicate that non-trivial biochemical changes are readily accessible and that the functional distance between rubiscos from diverse organisms can be traversed, laying the groundwork for further enzyme engineering efforts.
Affiliation(s)
- Noam Prywes
  - Innovative Genomics Institute, University of California Berkeley, Berkeley, CA, USA
  - Howard Hughes Medical Institute, University of California Berkeley, Berkeley, CA, USA
- Naiya R Phillips
  - Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, CA, USA
- Luke M Oltrogge
  - Howard Hughes Medical Institute, University of California Berkeley, Berkeley, CA, USA
  - Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, CA, USA
- Leah J Taylor-Kearney
  - Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, CA, USA
- Yi-Chin Candace Tsai
  - School of Biological Sciences, Nanyang Technological University, Singapore, Singapore
- Benoit de Pins
  - Department of Biology, University of Naples Federico II, Naples, Italy
- Aidan E Cowan
  - Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, CA, USA
  - Joint BioEnergy Institute, Lawrence Berkeley National Laboratory, Emeryville, CA, USA
- Hana A Chang
  - Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, CA, USA
- Renée Z Wang
  - Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, CA, USA
- Laina N Hall
  - Biophysics, University of California Berkeley, Berkeley, CA, USA
- Daniel Bellieny-Rabelo
  - Innovative Genomics Institute, University of California Berkeley, Berkeley, CA, USA
  - California Institute for Quantitative Biosciences (QB3), University of California Berkeley, Berkeley, CA, USA
- Hunter M Nisonoff
  - Center for Computational Biology, University of California Berkeley, Berkeley, CA, USA
- Rachel F Weissman
  - Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, CA, USA
- Avi I Flamholz
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
- David Ding
  - Innovative Genomics Institute, University of California Berkeley, Berkeley, CA, USA
  - Howard Hughes Medical Institute, University of California Berkeley, Berkeley, CA, USA
- Abhishek Y Bhatt
  - Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, CA, USA
  - School of Medicine, University of California San Diego, La Jolla, CA, USA
- Oliver Mueller-Cajar
  - School of Biological Sciences, Nanyang Technological University, Singapore, Singapore
- Patrick M Shih
  - Innovative Genomics Institute, University of California Berkeley, Berkeley, CA, USA
  - Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, CA, USA
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
  - Feedstocks Division, Joint BioEnergy Institute, Emeryville, CA, USA
- Ron Milo
  - Department of Plant and Environmental Sciences, Weizmann Institute of Science, Rehovot, Israel
- David F Savage
  - Innovative Genomics Institute, University of California Berkeley, Berkeley, CA, USA
  - Howard Hughes Medical Institute, University of California Berkeley, Berkeley, CA, USA
  - Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, CA, USA
3
Dieckhaus H, Kuhlman B. Protein stability models fail to capture epistatic interactions of double point mutations. Protein Sci 2025; 34:e70003. [PMID: 39704075 DOI: 10.1002/pro.70003] [Received: 08/26/2024] [Revised: 11/06/2024] [Accepted: 12/05/2024] [Indexed: 12/21/2024]
Abstract
There is strong interest in accurate methods for predicting changes in protein stability resulting from amino acid mutations to the protein sequence. Recombinant proteins must often be stabilized to be used as therapeutics or reagents, and destabilizing mutations are implicated in a variety of diseases. Due to increased data availability and improved modeling techniques, recent studies have shown advancements in predicting changes in protein stability when a single-point mutation is made. Less focus has been directed toward predicting changes in protein stability when there are two or more mutations. Here, we analyze the largest available dataset of double point mutation stability and benchmark several widely used protein stability models on this and other datasets. We find that additive models of protein stability perform surprisingly well on this task, achieving similar performance to comparable non-additive predictors according to most metrics. Accordingly, we find that neither artificial intelligence-based nor physics-based protein stability models consistently capture epistatic interactions between single mutations. We observe one notable deviation from this trend, which is that epistasis-aware models provide marginally better predictions than additive models on stabilizing double point mutations. We develop an extension of the ThermoMPNN framework for double mutant modeling, as well as a novel data augmentation scheme, which mitigates some of the limitations in currently available datasets. Collectively, our findings indicate that current protein stability models fail to capture the nuanced epistatic interactions between concurrent mutations due to several factors, including training dataset limitations and insufficient model sensitivity.
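The additive baseline described in this abstract can be illustrated with a minimal sketch. It assumes stability changes are expressed as ΔΔG values: an additive model predicts the double-mutant ΔΔG as the sum of the two single-mutant ΔΔGs, and epistasis is the deviation of the measured double-mutant value from that sum. The function names and numbers below are hypothetical and are not taken from the paper or from ThermoMPNN.

```python
def additive_ddg(ddg_a: float, ddg_b: float) -> float:
    """Additive prediction: the double-mutant effect is the sum of the single-mutant effects."""
    return ddg_a + ddg_b


def epistasis(ddg_ab: float, ddg_a: float, ddg_b: float) -> float:
    """Deviation of the measured double-mutant effect from the additive prediction."""
    return ddg_ab - additive_ddg(ddg_a, ddg_b)


# Hypothetical illustrative values (arbitrary energy units), not experimental data.
single_a = 1.0    # effect of mutation A alone
single_b = 0.5    # effect of mutation B alone
double_ab = 2.5   # measured effect of A and B together

print("additive prediction:", additive_ddg(single_a, single_b))
print("epistasis:", epistasis(double_ab, single_a, single_b))
```

A model that captures epistasis would need to predict a non-zero deviation where one is measured; a purely additive model always predicts zero.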
Affiliation(s)
- Henry Dieckhaus
  - Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
  - Division of Chemical Biology and Medicinal Chemistry, University of North Carolina Eshelman School of Pharmacy, Chapel Hill, North Carolina, USA
- Brian Kuhlman
  - Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
  - Department of Bioinformatics and Computational Biology, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
  - Lineberger Comprehensive Cancer Center, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
4
Faure AJ, Martí-Aranda A, Hidalgo-Carcedo C, Beltran A, Schmiedel JM, Lehner B. The genetic architecture of protein stability. Nature 2024; 634:995-1003. [PMID: 39322666 PMCID: PMC11499273 DOI: 10.1038/s41586-024-07966-0] [Received: 10/27/2023] [Accepted: 08/20/2024] [Indexed: 09/27/2024]
Abstract
There are more ways to synthesize a 100-amino acid (aa) protein (20^100) than there are atoms in the universe. Only a very small fraction of such a vast sequence space can ever be experimentally or computationally surveyed. Deep neural networks are increasingly being used to navigate high-dimensional sequence spaces. However, these models are extremely complicated. Here, by experimentally sampling from sequence spaces larger than 10^10, we show that the genetic architecture of at least some proteins is remarkably simple, allowing accurate genetic prediction in high-dimensional sequence spaces with fully interpretable energy models. These models capture the nonlinear relationships between free energies and phenotypes but otherwise consist of additive free energy changes with a small contribution from pairwise energetic couplings. These energetic couplings are sparse and associated with structural contacts and backbone proximity. Our results indicate that protein genetics is actually both rather simple and intelligible.
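The sequence-space count in this abstract can be checked with a short calculation: 20 possible residues at each of 100 positions gives 20 to the power 100 sequences. The sketch below compares that number with a commonly quoted order-of-magnitude estimate of about 10 to the power 80 atoms in the observable universe; that estimate is an assumption introduced here for illustration and does not come from the abstract.

```python
import math

RESIDUE_TYPES = 20   # standard amino acids
LENGTH = 100         # protein length used in the abstract
ATOMS_LOG10 = 80     # assumption: roughly 10^80 atoms in the observable universe

n_sequences = RESIDUE_TYPES ** LENGTH        # exact integer, Python ints are unbounded
log10_sequences = math.log10(n_sequences)    # about 130.1

print(f"20^100 is about 10^{log10_sequences:.1f}")
print(f"that exceeds the assumed atom count by a factor of about 10^{log10_sequences - ATOMS_LOG10:.1f}")
```

Even the 10^10 variants sampled experimentally cover a vanishingly small fraction of this space, which is why the abstract stresses simple, interpretable models for prediction.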
Affiliation(s)
- Andre J Faure
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
  - ALLOX, Barcelona, Spain
- Aina Martí-Aranda
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
- Cristina Hidalgo-Carcedo
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Antoni Beltran
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Jörn M Schmiedel
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
  - factorize.bio, Berlin, Germany
- Ben Lehner
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
5
Park Y, Metzger BPH, Thornton JW. The simplicity of protein sequence-function relationships. Nat Commun 2024; 15:7953. [PMID: 39261454 PMCID: PMC11390738 DOI: 10.1038/s41467-024-51895-5] [Received: 02/07/2024] [Accepted: 08/20/2024] [Indexed: 09/13/2024]
Abstract
How complex are the rules by which a protein's sequence determines its function? High-order epistatic interactions among residues are thought to be pervasive, suggesting an idiosyncratic and unpredictable sequence-function relationship. But many prior studies may have overestimated epistasis, because they analyzed sequence-function relationships relative to a single reference sequence (which causes measurement noise and local idiosyncrasies to snowball into high-order epistasis) or they did not fully account for global nonlinearities. Here we present a reference-free method that jointly infers specific epistatic interactions and global nonlinearity using a bird's-eye view of sequence space. This technique yields the simplest explanation of sequence-function relationships and is more robust than existing methods to measurement noise, missing data, and model misspecification. We reanalyze 20 experimental datasets and find that context-independent amino acid effects and pairwise interactions, along with a simple nonlinearity to account for limited dynamic range, explain a median of 96% of phenotypic variance and over 92% in every case. Only a tiny fraction of genotypes are strongly affected by higher-order epistasis. Sequence-function relationships are also sparse: a minuscule fraction of amino acids and interactions account for 90% of phenotypic variance. Sequence-function causality across these datasets is therefore simple, opening the way for tractable approaches to characterize proteins' genetic architecture.
Affiliation(s)
- Yeonwoo Park
  - Committee on Genetics, Genomics, and Systems Biology, University of Chicago, Chicago, IL, USA
  - Center for RNA Research, Institute for Basic Science, Seoul, Republic of Korea
- Brian P H Metzger
  - Department of Ecology and Evolution, University of Chicago, Chicago, IL, USA
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
- Joseph W Thornton
  - Department of Ecology and Evolution, University of Chicago, Chicago, IL, USA
  - Department of Human Genetics, University of Chicago, Chicago, IL, USA
6
Huang ZZ, Tan J, Huang P, Li BS, Guo Q, Liang LJ. The evolutionary features and roles of single nucleotide variants and charged amino acid mutations in influenza outbreaks during NPI period. Sci Rep 2024; 14:20418. [PMID: 39223292 PMCID: PMC11369173 DOI: 10.1038/s41598-024-71349-8] [Received: 12/19/2023] [Accepted: 08/27/2024] [Indexed: 09/04/2024]
Abstract
The epidemic and outbreaks of influenza B Victoria lineage (Bv) during 2019-2022 prompted an analysis of the genetics, epitopes, charged amino acids and outbreaks of Bv. Based on the National Influenza Surveillance Network (NISN), 72 Bv strains isolated during 2019-2022 were selected by spatio-temporal sampling and then sequenced. Using the Compare Means, Correlate and Cluster procedures, the outbreak data were analyzed, including the single nucleotide variant (SNV), amino acid (AA), epitope, evolutionary rate (ER), Shannon entropy value (SV), charged amino acid and outbreak. With the emergence of COVID-19, the non-pharmaceutical interventions (NPIs) resulted in less distant transmission and only Bv outbreaks. The 2021-2022 strains in the HA genes were located in the same subset, but were distinct from the 2019-2020 strains (P < 0.001). The G → A transition was the most frequent nucleotide change, but the C → A and T → A transversions made the most significant contribution to the outbreaks, while the increase in amino acid mutations characterized by polar, acidic and basic signatures played a key role in the Bv epidemic in 2021-2022. ER and SV were positively correlated in HA genes (R = 0.690) and NA genes (R = 0.711), respectively; however, the number of mutations in the HA genes was 1.59 times higher than that of the NA gene (2.15/1.36) from the beginning of 2020 to 2022. The positively selected sites 174, 199, 214 and 563 in HA genes and the sites 73 and 384 in NA genes were evolutionarily selected in the 2021-2022 influenza outbreaks. Overall, the prevalent factors related to 2021-2022 influenza outbreaks included epidemic timing, Tv, Ts, Tv/Ts, P137 (B → P), P148 (B → P), P199 (P → A), P212 (P → A), P214 (H → P) and P563 (B → P). The preference of amino acid mutations for charge/pH could influence the epidemic/outbreak trends of infectious diseases. This provided a good model of the evolution of infectious disease pathogens. This study, through further exploration of virology, genetics, bioinformatics and outbreak information, might facilitate further understanding of their deep interaction mechanisms in the spread of infectious diseases.
Affiliation(s)
- Zhong-Zhou Huang
  - Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, 510120, China
  - School of Public Health, Sun Yat-Sen University, Guangzhou, 510080, China
  - Workstation for Emerging Infectious Disease Control and Prevention, Guangdong Center for Disease Control and Prevention, Guangzhou, 511430, China
- Jing Tan
  - Workstation for Emerging Infectious Disease Control and Prevention, Guangdong Center for Disease Control and Prevention, Guangzhou, 511430, China
  - School of Public Health, Southern Medical University, Guangzhou, 510515, China
  - School of Public Health, Southwest Medical University, Luzhou, 646000, China
- Ping Huang
  - School of Public Health, Sun Yat-Sen University, Guangzhou, 510080, China
  - Workstation for Emerging Infectious Disease Control and Prevention, Guangdong Center for Disease Control and Prevention, Guangzhou, 511430, China
  - Guangdong Key Laboratory of Pathogen Detection for Emerging Infectious Disease Response, Guangdong Center for Disease Control and Prevention, Guangzhou, 511430, China
  - School of Public Health, Southern Medical University, Guangzhou, 510515, China
- Bai-Sheng Li
  - Workstation for Emerging Infectious Disease Control and Prevention, Guangdong Center for Disease Control and Prevention, Guangzhou, 511430, China
  - Guangdong Key Laboratory of Pathogen Detection for Emerging Infectious Disease Response, Guangdong Center for Disease Control and Prevention, Guangzhou, 511430, China
  - School of Public Health, Southern Medical University, Guangzhou, 510515, China
- Qing Guo
  - Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, 510120, China
- Li-Jun Liang
  - Workstation for Emerging Infectious Disease Control and Prevention, Guangdong Center for Disease Control and Prevention, Guangzhou, 511430, China
  - Guangdong Key Laboratory of Pathogen Detection for Emerging Infectious Disease Response, Guangdong Center for Disease Control and Prevention, Guangzhou, 511430, China
7
Dieckhaus H, Kuhlman B. Protein stability models fail to capture epistatic interactions of double point mutations. bioRxiv 2024:2024.08.20.608844. [PMID: 39229177 PMCID: PMC11370451 DOI: 10.1101/2024.08.20.608844] [Indexed: 09/05/2024]
Abstract
There is strong interest in accurate methods for predicting changes in protein stability resulting from amino acid mutations to the protein sequence. Recombinant proteins must often be stabilized to be used as therapeutics or reagents, and destabilizing mutations are implicated in a variety of diseases. Due to increased data availability and improved modeling techniques, recent studies have shown advancements in predicting changes in protein stability when a single point mutation is made. Less focus has been directed toward predicting changes in protein stability when there are two or more mutations, despite the significance of mutation clusters for disease pathways and protein design studies. Here, we analyze the largest available dataset of double point mutation stability and benchmark several widely used protein stability models on this and other datasets. We identify a blind spot in how predictors are typically evaluated on multiple mutations, finding that, contrary to assumptions in the field, current stability models are unable to consistently capture epistatic interactions between double mutations. We observe one notable deviation from this trend, which is that epistasis-aware models provide marginally better predictions on stabilizing double point mutations. We develop an extension of the ThermoMPNN framework for double mutant modeling as well as a novel data augmentation scheme which mitigates some of the limitations in available datasets. Collectively, our findings indicate that current protein stability models fail to capture the nuanced epistatic interactions between concurrent mutations due to several factors, including training dataset limitations and insufficient model sensitivity.
Affiliation(s)
- Henry Dieckhaus
  - Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
  - Division of Chemical Biology and Medicinal Chemistry, University of North Carolina Eshelman School of Pharmacy, Chapel Hill, North Carolina, USA
- Brian Kuhlman
  - Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
  - Department of Bioinformatics and Computational Biology, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
  - Lineberger Comprehensive Cancer Center, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
8
Lipsh-Sokolik R, Fleishman SJ. Addressing epistasis in the design of protein function. Proc Natl Acad Sci U S A 2024; 121:e2314999121. [PMID: 39133844 PMCID: PMC11348311 DOI: 10.1073/pnas.2314999121] [Indexed: 08/29/2024]
Abstract
Mutations in protein active sites can dramatically improve function. The active site, however, is densely packed and extremely sensitive to mutations. Therefore, some mutations may only be tolerated in combination with others in a phenomenon known as epistasis. Epistasis reduces the likelihood of obtaining improved functional variants and dramatically slows natural and lab evolutionary processes. Research has shed light on the molecular origins of epistasis and its role in shaping evolutionary trajectories and outcomes. In addition, sequence- and AI-based strategies that infer epistatic relationships from mutational patterns in natural or experimental evolution data have been used to design functional protein variants. In recent years, combinations of such approaches and atomistic design calculations have successfully predicted highly functional combinatorial mutations in active sites. These were used to design thousands of functional active-site variants, demonstrating that, while our understanding of epistasis remains incomplete, some of the determinants that are critical for accurate design are now sufficiently understood. We conclude that the space of active-site variants that has been explored by evolution may be expanded dramatically to enhance natural activities or discover new ones. Furthermore, design opens the way to systematically exploring sequence and structure space and mutational impacts on function, deepening our understanding and control over protein activity.
Affiliation(s)
- Rosalie Lipsh-Sokolik
  - Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot 7610001, Israel
- Sarel J Fleishman
  - Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot 7610001, Israel
9
Petersen BM, Kirby MB, Chrispens KM, Irvin OM, Strawn IK, Haas CM, Walker AM, Baumer ZT, Ulmer SA, Ayala E, Rhodes ER, Guthmiller JJ, Steiner PJ, Whitehead TA. An integrated technology for quantitative wide mutational scanning of human antibody Fab libraries. Nat Commun 2024; 15:3974. [PMID: 38730230 PMCID: PMC11087541 DOI: 10.1038/s41467-024-48072-z] [Received: 09/29/2023] [Accepted: 04/19/2024] [Indexed: 05/12/2024]
Abstract
Antibodies are engineerable quantities in medicine. Learning antibody molecular recognition would enable the in silico design of high affinity binders against nearly any proteinaceous surface. Yet, publicly available experimental antibody sequence-binding datasets may not contain the mutagenic, antigenic, or antibody sequence diversity necessary for deep learning approaches to capture molecular recognition. In part, this is because limited experimental platforms exist for assessing quantitative and simultaneous sequence-function relationships for multiple antibodies. Here we present MAGMA-seq, an integrated technology that combines multiple antigens and multiple antibodies and determines quantitative biophysical parameters using deep sequencing. We demonstrate MAGMA-seq on two pooled libraries comprising mutants of nine different human antibodies spanning light chain gene usage, CDR H3 length, and antigenic targets. We demonstrate the comprehensive mapping of potential antibody development pathways, sequence-binding relationships for multiple antibodies simultaneously, and identification of paratope sequence determinants for binding recognition for broadly neutralizing antibodies (bnAbs). MAGMA-seq enables rapid and scalable antibody engineering of multiple lead candidates because it can measure binding for mutants of many given parental antibodies in a single experiment.
Affiliation(s)
- Brian M Petersen
  - Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO, USA
- Monica B Kirby
  - Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO, USA
- Karson M Chrispens
  - Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO, USA
- Olivia M Irvin
  - Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO, USA
- Isabell K Strawn
  - Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO, USA
- Cyrus M Haas
  - Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO, USA
- Alexis M Walker
  - Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO, USA
- Zachary T Baumer
  - Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO, USA
- Sophia A Ulmer
  - Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO, USA
- Edgardo Ayala
  - Department of Immunology and Microbiology, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Emily R Rhodes
  - Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO, USA
- Jenna J Guthmiller
  - Department of Immunology and Microbiology, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Paul J Steiner
  - Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO, USA
- Timothy A Whitehead
  - Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO, USA
10
Prywes N, Philips NR, Oltrogge LM, Lindner S, Candace Tsai YC, de Pins B, Cowan AE, Taylor-Kearney LJ, Chang HA, Hall LN, Bellieny-Rabelo D, Nisonoff HM, Weissman RF, Flamholz AI, Ding D, Bhatt AY, Shih PM, Mueller-Cajar O, Milo R, Savage DF. A map of the rubisco biochemical landscape. bioRxiv 2024:2023.09.27.559826. [PMID: 38645011 PMCID: PMC11030240 DOI: 10.1101/2023.09.27.559826] [Indexed: 04/23/2024]
Abstract
Rubisco is the primary CO2 fixing enzyme of the biosphere yet has slow kinetics. The roles of evolution and chemical mechanism in constraining the sequence landscape of rubisco remain debated. In order to map sequence to function, we developed a massively parallel assay for rubisco using an engineered E. coli where enzyme function is coupled to growth. By assaying >99% of single amino acid mutants across CO2 concentrations, we inferred enzyme velocity and CO2 affinity for thousands of substitutions. We identified many highly conserved positions that tolerate mutation and rare mutations that improve CO2 affinity. These data suggest that non-trivial kinetic improvements are readily accessible and provide a comprehensive sequence-to-function mapping for enzyme engineering efforts.
Affiliation(s)
- Noam Prywes
  - Innovative Genomics Institute, University of California; Berkeley, California 94720, USA
  - Howard Hughes Medical Institute, University of California; Berkeley, California 94720, USA
- Naiya R Philips
  - Department of Molecular and Cell Biology, University of California; Berkeley, California 94720, USA
- Luke M Oltrogge
  - Howard Hughes Medical Institute, University of California; Berkeley, California 94720, USA
  - Department of Molecular and Cell Biology, University of California; Berkeley, California 94720, USA
- Yi-Chin Candace Tsai
  - School of Biological Sciences, Nanyang Technological University; Singapore 637551, Singapore
- Benoit de Pins
  - Department of Plant and Environmental Sciences, Weizmann Institute of Science; Rehovot 76100, Israel
- Aidan E Cowan
  - Department of Molecular and Cell Biology, University of California; Berkeley, California 94720, USA
  - Joint BioEnergy Institute, Lawrence Berkeley National Laboratory; Emeryville, CA 94608, USA
- Leah J Taylor-Kearney
  - Department of Plant and Microbial Biology, University of California, Berkeley; Berkeley, CA 94720, USA
- Hana A Chang
  - Department of Plant and Microbial Biology, University of California, Berkeley; Berkeley, CA 94720, USA
- Laina N Hall
  - Biophysics, University of California, Berkeley; Berkeley, CA 94720, USA
- Daniel Bellieny-Rabelo
  - Innovative Genomics Institute, University of California; Berkeley, California 94720, USA
  - California Institute for Quantitative Biosciences (QB3), University of California; Berkeley, CA 94720, USA
- Hunter M Nisonoff
  - Center for Computational Biology, University of California, Berkeley; Berkeley, CA, USA
- Rachel F Weissman
  - Department of Molecular and Cell Biology, University of California; Berkeley, California 94720, USA
- Avi I Flamholz
  - Division of Biology and Biological Engineering, California Institute of Technology; Pasadena, CA 91125
- David Ding
  - Innovative Genomics Institute, University of California; Berkeley, California 94720, USA
  - Howard Hughes Medical Institute, University of California; Berkeley, California 94720, USA
- Abhishek Y Bhatt
  - Department of Molecular and Cell Biology, University of California; Berkeley, California 94720, USA
  - School of Medicine, University of California, San Diego; La Jolla, CA 92092, USA
- Patrick M Shih
  - Innovative Genomics Institute, University of California; Berkeley, California 94720, USA
  - Department of Plant and Microbial Biology, University of California, Berkeley; Berkeley, CA 94720, USA
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory; Berkeley, CA 94720, USA
  - Feedstocks Division, Joint BioEnergy Institute; Emeryville, CA 94608, USA
- Oliver Mueller-Cajar
  - School of Biological Sciences, Nanyang Technological University; Singapore 637551, Singapore
- Ron Milo
  - Department of Plant and Environmental Sciences, Weizmann Institute of Science; Rehovot 76100, Israel
- David F Savage
  - Innovative Genomics Institute, University of California; Berkeley, California 94720, USA
  - Howard Hughes Medical Institute, University of California; Berkeley, California 94720, USA
  - Department of Molecular and Cell Biology, University of California; Berkeley, California 94720, USA