1
Teoh YC, Noor MS, Aghakhani S, Girton J, Hu G, Chowdhury R. Viral escape-inspired framework for structure-guided dual bait protein biosensor design. PLoS Comput Biol 2025; 21:e1012964. [PMID: 40233103 PMCID: PMC12021294 DOI: 10.1371/journal.pcbi.1012964] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/03/2024] [Revised: 04/24/2025] [Accepted: 03/14/2025] [Indexed: 04/17/2025] Open
Abstract
A generalizable computational platform, CTRL-V (Computational TRacking of Likely Variants), is introduced to design selective binding (dual bait) biosensor proteins. The iteratively evolving receptor binding domain (RBD) of the SARS-CoV-2 spike protein has been construed as a model dual bait biosensor which has iteratively evolved to distinguish and selectively bind to human entry receptors and avoid binding neutralizing antibodies. Spike RBD prioritizes mutations that reduce antibody binding while enhancing/retaining binding with the ACE2 receptor. CTRL-V, through iterative design cycles, was shown to pinpoint 20% (of the 39) reported SARS-CoV-2 point mutations across 30 circulating, infective strains as responsible for immune escape from commercial antibody LY-CoV1404. CTRL-V successfully identifies ~70% (five out of seven) single point mutations (371F, 373P, 440K, 445H, 456L) in the latest circulating KP.2 variant and offers detailed structural insights into the escape mechanism. While other data-driven viral escape variant predictor tools have shown promise in predicting potential future viral variants, they require massive amounts of data to bypass the need for physics of explicit biochemical interactions. Consequently, they cannot be generalized for other protein design applications. The publicly available viral escape data was leveraged as in vivo anchors to streamline a computational workflow that can be generalized for dual bait biosensor design tasks, as exemplified by identifying key mutational loci in Raf kinase that enable it to selectively bind Ras and Rap1a GTP. We demonstrate three versions of CTRL-V which use a combination of integer optimization, stochastic sampling by PyRosetta, and deep learning-based ProteinMPNN for structure-guided biosensor design.
Affiliation(s)
- Yee Chuen Teoh
  - Department of Computer Science, Iowa State University, Ames, Iowa, United States of America
- Mohammed Sakib Noor
  - Department of Chemical and Biological Engineering, Iowa State University, Ames, Iowa, United States of America
- Sina Aghakhani
  - School of Industrial Engineering and Management, Oklahoma State University, Stillwater, Oklahoma, United States of America
- Jack Girton
  - Department of Chemical and Biological Engineering, Iowa State University, Ames, Iowa, United States of America
- Guiping Hu
  - School of Industrial Engineering and Management, Oklahoma State University, Stillwater, Oklahoma, United States of America
- Ratul Chowdhury
  - Department of Chemical and Biological Engineering, Iowa State University, Ames, Iowa, United States of America
  - Nanovaccine Institute, Iowa State University, Ames, Iowa, United States of America
2
Chowdhury R, Frazier AN, Koziel JA, Thompson L, Beck MR. Computational approaches for enteric methane mitigation research: from fermi calculations to artificial intelligence paradigms. Anim Front 2024; 14:33-41. [PMID: 39764521 PMCID: PMC11700611 DOI: 10.1093/af/vfae025] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 05/06/2025] Open
Affiliation(s)
- Ratul Chowdhury
  - Department of Chemical and Biological Engineering, Iowa State University, Ames, IA, USA
- Jacek A Koziel
  - Livestock Nutrient Management Research Unit, USDA-ARS, Bushland, TX 79012, USA
- Logan Thompson
  - Department of Animal Sciences and Industry, Kansas State University, Manhattan, KS 66506, USA
- Matthew R Beck
  - Livestock Nutrient Management Research Unit, USDA-ARS, Bushland, TX 79012, USA
3
van der Flier F, Estell D, Pricelius S, Dankmeyer L, van Stigt Thans S, Mulder H, Otsuka R, Goedegebuur F, Lammerts L, Staphorst D, van Dijk AD, de Ridder D, Redestig H. Enzyme structure correlates with variant effect predictability. Comput Struct Biotechnol J 2024; 23:3489-3497. [PMID: 39435338 PMCID: PMC11491678 DOI: 10.1016/j.csbj.2024.09.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2024] [Revised: 09/03/2024] [Accepted: 09/12/2024] [Indexed: 10/23/2024] Open
Abstract
Protein engineering increasingly relies on machine learning models to computationally pre-screen promising novel candidates. Although machine learning approaches have proven effective, their performance on prospective screening data leaves room for improvement; prediction accuracy can vary greatly from one protein variant to the next. So far, it is unclear what characterizes variants that are associated with large prediction error. In order to establish whether structural characteristics influence predictability, we created a novel high-order combinatorial dataset for an enzyme spanning 3,706 variants that can be partitioned into subsets of variants with mutations at positions exclusively belonging to a particular structural class. By training four different supervised variant effect prediction (VEP) models on structurally partitioned subsets of our data, we found that predictability strongly depended on all four structural characteristics we tested: buriedness, number of contact residues, proximity to the active site, and presence of secondary structure elements. These dependencies were also found in several single mutation enzyme variant datasets, albeit with dataset-specific directions. Most importantly, we found that these dependencies were similar for all four models we tested, indicating that there are specific structure and function determinants that are insufficiently accounted for by current machine learning algorithms. Overall, our findings suggest that improvements can be made to VEP models by exploring new inductive biases and by leveraging different data modalities of protein variants, and that stratified dataset design can highlight areas of improvement for machine learning-guided protein engineering.
Affiliation(s)
- Floris van der Flier
  - Department of Plant Sciences, Wageningen University & Research, Wageningen, 6708 PB, the Netherlands
- Dave Estell
  - Health & Biosciences, International Flavors and Fragrances, Palo Alto, 94304 CA, USA
- Sina Pricelius
  - Health & Biosciences, International Flavors and Fragrances, Oegstgeest, 2342 BG, the Netherlands
- Lydia Dankmeyer
  - Health & Biosciences, International Flavors and Fragrances, Oegstgeest, 2342 BG, the Netherlands
- Sander van Stigt Thans
  - Health & Biosciences, International Flavors and Fragrances, Oegstgeest, 2342 BG, the Netherlands
- Harm Mulder
  - Health & Biosciences, International Flavors and Fragrances, Oegstgeest, 2342 BG, the Netherlands
- Rei Otsuka
  - Health & Biosciences, International Flavors and Fragrances, Oegstgeest, 2342 BG, the Netherlands
- Frits Goedegebuur
  - Health & Biosciences, International Flavors and Fragrances, Oegstgeest, 2342 BG, the Netherlands
- Laurens Lammerts
  - Health & Biosciences, International Flavors and Fragrances, Oegstgeest, 2342 BG, the Netherlands
- Diego Staphorst
  - Health & Biosciences, International Flavors and Fragrances, Oegstgeest, 2342 BG, the Netherlands
- Aalt D.J. van Dijk
  - Department of Plant Sciences, Wageningen University & Research, Wageningen, 6708 PB, the Netherlands
- Dick de Ridder
  - Department of Plant Sciences, Wageningen University & Research, Wageningen, 6708 PB, the Netherlands
- Henning Redestig
  - Health & Biosciences, International Flavors and Fragrances, Oegstgeest, 2342 BG, the Netherlands
4
Recent progress in the synthesis of advanced biofuel and bioproducts. Curr Opin Biotechnol 2023; 80:102913. [PMID: 36854202 DOI: 10.1016/j.copbio.2023.102913] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 11/17/2022] [Revised: 01/20/2023] [Accepted: 01/30/2023] [Indexed: 02/27/2023]
Abstract
Energy is one of the most complex fields of study and an issue that influences nearly every aspect of modern life. Over the past century, combustion of fossil fuels, particularly in the transportation sector, has been the dominant form of energy release. Refining of petroleum and natural gas into liquid transportation fuels is also the centerpiece of the modern chemical industry used to produce materials, solvents, and other consumer goods. In the face of global climate change, the world is searching for alternative, sustainable means of producing energy carriers and chemical building blocks. The use of biofuels in engines predates modern refinery optimization and today represents a small but significant fraction of liquid transportation fuels burnt each year. Similarly, white biotechnology has been used to produce many natural products through fermentation. The evolution of recombinant DNA technology into modern synthetic biology has expanded the scope of biofuels and bioproducts that can be made by biocatalysts. This opinion examines the current trends in this research space, highlighting the substantial growth in computational tools and the growing influence of renewable electricity in the design of metabolic engineering strategies. In short, advanced biofuel and bioproduct synthesis remains a vibrant and critically important field of study whose focus is shifting away from the conversion of lignocellulosic biomass toward a broader consideration of how to reduce carbon dioxide to fuels and chemical products.
5
Miton CM, Tokuriki N. Insertions and Deletions (Indels): A Missing Piece of the Protein Engineering Jigsaw. Biochemistry 2023; 62:148-157. [PMID: 35830609 DOI: 10.1021/acs.biochem.2c00188] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Indexed: 02/08/2023]
Abstract
Over the years, protein engineers have studied nature and borrowed its tricks to accelerate protein evolution in the test tube. While there have been considerable advances, our ability to generate new proteins in the laboratory is seemingly limited. One explanation for these shortcomings may be that insertions and deletions (indels), which frequently arise in nature, are largely overlooked during protein engineering campaigns. The profound effect of indels on protein structures, by way of drastic backbone alterations, could be perceived as "saltation" events that bring about significant phenotypic changes in a single mutational step. Should we leverage these effects to accelerate protein engineering and gain access to unexplored regions of adaptive landscapes? In this Perspective, we describe the role played by indels in the functional diversification of proteins in nature and discuss their untapped potential for protein engineering, despite their often-destabilizing nature. We hope to spark a renewed interest in indels, emphasizing that their wider study and use may prove insightful and shape the future of protein engineering by unlocking unique functional changes that substitutions alone could never achieve.
Affiliation(s)
- Charlotte M Miton
  - Michael Smith Laboratories, University of British Columbia, Vancouver, V6T 1Z4 BC, Canada
- Nobuhiko Tokuriki
  - Michael Smith Laboratories, University of British Columbia, Vancouver, V6T 1Z4 BC, Canada
6
Neamtu A, Mocci F, Laaksonen A, Barroso da Silva FL. Towards an optimal monoclonal antibody with higher binding affinity to the receptor-binding domain of SARS-CoV-2 spike proteins from different variants. Colloids Surf B Biointerfaces 2023; 221:112986. [PMID: 36375294 PMCID: PMC9617679 DOI: 10.1016/j.colsurfb.2022.112986] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 07/20/2022] [Revised: 09/13/2022] [Accepted: 10/27/2022] [Indexed: 11/13/2022]
Abstract
A highly efficient and robust multiscale in silico protocol, consisting of atomistic Molecular Dynamics (MD), coarse-grain (CG) MD, and constant-pH CG Monte Carlo (MC), has been developed and used to study the binding affinities of selected antigen-binding fragments of the monoclonal antibody (mAb) CR3022 and several of its versions optimized here against 11 SARS-CoV-2 variants including the wild type. In total, 235,000 mAb structures were initially generated using the RosettaAntibodyDesign software, resulting in the top 10 scored CR3022-like-RBD complexes with critical mutations, which were compared to the native one, all having the potential to block virus-host cell interaction. Of these 10 finalists, two candidates were further identified in the CG simulations to be the best against all SARS-CoV-2 variants. Surprisingly, all 10 candidates and the native CR3022 exhibited a higher affinity for the Omicron variant despite its highest number of mutations. The multiscale protocol gives us a powerful rational tool to design efficient mAbs. The electrostatic interactions play a crucial role and appear to be controlling the affinity and complex building. Studied mAbs carrying a more negative total net charge show a higher affinity. Structural determinants could be identified in atomistic simulations and their roles are discussed in detail to further hint at a strategy for designing the best RBD binder. Although SARS-CoV-2 was specifically targeted in this work, our approach is generally suitable for many diseases and viral and bacterial pathogens, leukemia, cancer, multiple sclerosis, rheumatoid arthritis, lupus, and more.
Affiliation(s)
- Andrei Neamtu
  - Department of Physiology, "Grigore T. Popa" University of Medicine and Pharmacy of Iasi, Str. Universitatii nr. 16, 700051 Iasi, România; TRANSCEND Centre - Regional Institute of Oncology (IRO) Iasi, Str. General Henri Mathias Berthelot, Nr. 2-4 Iași, România
- Francesca Mocci
  - University of Cagliari, Department of Chemical and Geological Sciences, Campus Monserrato, SS 554 bivio per Sestu, 09042 Monserrato, Italy
- Aatto Laaksonen
  - Centre of Advanced Research in Bionanoconjugates and Biopolymers, Petru Poni Institute of Macromolecular Chemistry Aleea Grigore Ghica-Voda, 41 A, 700487 Iasi, Romania; University of Cagliari, Department of Chemical and Geological Sciences, Campus Monserrato, SS 554 bivio per Sestu, 09042 Monserrato, Italy; Department of Materials and Environmental Chemistry, Arrhenius Laboratory, Stockholm University, SE-106 91 Stockholm, Sweden; State Key Laboratory of Materials-Oriented and Chemical Engineering, Nanjing Tech University, Nanjing 210009, PR China; Department of Engineering Sciences and Mathematics, Division of Energy Science, Luleå University of Technology, SE-97187 Luleå, Sweden
- Fernando L Barroso da Silva
  - Universidade de São Paulo, Departamento de Ciências Biomoleculares, Faculdade de Ciências Farmacêuticas de Ribeirão Preto, Av. café, s/no - campus da USP, BR-14040-903 Ribeirão Preto, SP, Brazil; Department of Chemical and Biomolecular Engineering, North Carolina State University, Raleigh, NC 27695, USA.
7
Wittmund M, Cadet F, Davari MD. Learning Epistasis and Residue Coevolution Patterns: Current Trends and Future Perspectives for Advancing Enzyme Engineering. ACS Catal 2022. [DOI: 10.1021/acscatal.2c01426] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Affiliation(s)
- Marcel Wittmund
  - Department of Bioorganic Chemistry, Leibniz Institute of Plant Biochemistry, Weinberg 3, 06120 Halle, Germany
- Frederic Cadet
  - Laboratory of Excellence LABEX GR, DSIMB, Inserm UMR S1134, University of Paris city & University of Reunion, Paris 75014, France
- Mehdi D. Davari
  - Department of Bioorganic Chemistry, Leibniz Institute of Plant Biochemistry, Weinberg 3, 06120 Halle, Germany
8
Arslan A. Systematic Inspection of Genomic Tandem Repeats and Rearrangements in Autism Model. BRAIN DISORDERS 2022. [DOI: 10.1016/j.dscb.2022.100059] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022] Open
9
Arslan A. Compendious survey of protein tandem repeats in inbred mouse strains. BMC Genom Data 2022; 23:62. [PMID: 35931961 PMCID: PMC9354378 DOI: 10.1186/s12863-022-01079-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2022] [Accepted: 07/28/2022] [Indexed: 11/10/2022] Open
Abstract
Short tandem repeats (STRs) play a crucial role in genetic diseases. However, classic disease models such as inbred mice lack such genome-wide data in the public domain. The examination of STR alleles present in the protein coding regions (known as protein tandem repeats, or PTRs) can provide an additional functional layer of phenotype regulars. Motivated by this, we analysed the whole genome sequencing data from 71 different mouse strains and identified STR alleles present within the coding regions of 562 genes. Taking advantage of recently formulated protein models, we also showed that the presence of these alleles within protein three-dimensional space could impact protein folding. Overall, we identified novel alleles from a large number of mouse strains and demonstrated that these alleles are of interest considering protein structure integrity and functionality within the mouse genomes. We conclude that PTR alleles have the potential to influence protein functions through impacting protein structural folding and integrity.
10
Chowdhury R, Boorla VS, Maranas CD. Computational biophysical characterization of the SARS-CoV-2 spike protein binding with the ACE2 receptor and implications for infectivity. Comput Struct Biotechnol J 2020; 18:2573-2582. [PMID: 32983400 PMCID: PMC7500280 DOI: 10.1016/j.csbj.2020.09.019] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 07/14/2020] [Revised: 09/11/2020] [Accepted: 09/12/2020] [Indexed: 02/06/2023] Open
Abstract
SARS-CoV-2 is a novel, highly virulent pathogen which gains entry to human cells by binding with the cell surface receptor angiotensin-converting enzyme 2 (ACE2). We computationally contrasted the binding interactions between human ACE2 and coronavirus spike protein receptor binding domain (RBD) of the 2002 epidemic-causing SARS-CoV-1, SARS-CoV-2, and bat coronavirus RaTG13 using the Rosetta energy function. We find that the RBD of the spike protein of SARS-CoV-2 is highly optimized to achieve very strong binding with human ACE2 (hACE2), which is consistent with its enhanced infectivity. SARS-CoV-2 forms the most stable complex with hACE2 compared to SARS-CoV-1 (23% less stable) or RaTG13 (11% less stable). Notably, we calculate that the SARS-CoV-2 RBD lowers the binding strength of angiotensin 2 receptor type I (ATR1), which is the native binding partner of ACE2, by 44.2%. Strong binding is mediated through strong electrostatic attachments with every fourth residue on the N-terminus alpha-helix (starting from Ser19 to Asn53) as the turn of the helix makes these residues solvent accessible. By contrasting the spike protein SARS-CoV-2 Rosetta binding energy with ACE2 of different livestock and pet species, we find the strongest binding with bat ACE2 followed by human, feline, equine, canine and finally chicken. This is consistent with the hypothesis that bats are the viral origin and reservoir species. These results offer a computational explanation for the increased infection susceptibility by SARS-CoV-2 and allude to therapeutic modalities by identifying and rank-ordering the ACE2 residues involved in binding with the virus.
Affiliation(s)
- Ratul Chowdhury
  - Department of Chemical Engineering, The Pennsylvania State University, University Park, PA 16802, USA
- Veda Sheersh Boorla
  - Department of Chemical Engineering, The Pennsylvania State University, University Park, PA 16802, USA
- Costas D. Maranas
  - Department of Chemical Engineering, The Pennsylvania State University, University Park, PA 16802, USA