1
Nguyen PT, Harris BJ, Mateos DL, González AH, Murray AM, Yarov-Yarovoy V. Structural modeling of ion channels using AlphaFold2, RoseTTAFold2, and ESMFold. Channels (Austin) 2024; 18:2325032. [PMID: 38445990 PMCID: PMC10936637 DOI: 10.1080/19336950.2024.2325032] [Received: 09/05/2023] [Accepted: 01/14/2024] [Indexed: 03/07/2024]
Abstract
Ion channels play key roles in human physiology and are important targets in drug discovery. The atomic-scale structures of ion channels provide invaluable insights into a fundamental understanding of the molecular mechanisms of channel gating and modulation. Recent breakthroughs in deep learning-based computational methods, such as AlphaFold, RoseTTAFold, and ESMFold, have transformed research in protein structure prediction and design. We review the application of AlphaFold, RoseTTAFold, and ESMFold to structural modeling of ion channels using representative voltage-gated ion channels, including human voltage-gated sodium (NaV) channel - NaV1.8, human voltage-gated calcium (CaV) channel - CaV1.1, and human voltage-gated potassium (KV) channel - KV1.3. We compared AlphaFold, RoseTTAFold, and ESMFold structural models of NaV1.8, CaV1.1, and KV1.3 with corresponding cryo-EM structures to assess details of their similarities and differences. Our findings shed light on the strengths and limitations of the current state-of-the-art deep learning-based computational methods for modeling ion channel structures, offering valuable insights to guide their future applications for ion channel research.
Affiliation(s)
- Phuong Tran Nguyen
  - Department of Physiology and Membrane Biology, University of California School of Medicine, Davis, CA, USA
- Brandon John Harris
  - Department of Physiology and Membrane Biology, University of California School of Medicine, Davis, CA, USA
  - Biophysics Graduate Group, University of California School of Medicine, Davis, CA, USA
- Diego Lopez Mateos
  - Department of Physiology and Membrane Biology, University of California School of Medicine, Davis, CA, USA
  - Biophysics Graduate Group, University of California School of Medicine, Davis, CA, USA
- Adriana Hernández González
  - Department of Physiology and Membrane Biology, University of California School of Medicine, Davis, CA, USA
  - Biophysics Graduate Group, University of California School of Medicine, Davis, CA, USA
- Vladimir Yarov-Yarovoy
  - Department of Physiology and Membrane Biology, University of California School of Medicine, Davis, CA, USA
  - Department of Anesthesiology and Pain Medicine, University of California School of Medicine, Davis, CA, USA
2
Beck J, Shanmugaratnam S, Höcker B. Diversifying de novo TIM barrels by hallucination. Protein Sci 2024; 33:e5001. [PMID: 38723111 PMCID: PMC11081422 DOI: 10.1002/pro.5001] [Received: 12/22/2023] [Revised: 03/26/2024] [Accepted: 04/10/2024] [Indexed: 05/13/2024]
Abstract
De novo protein design expands the protein universe by creating new sequences to accomplish tailor-made enzymes in the future. A promising topology to implement diverse enzyme functions is the ubiquitous TIM-barrel fold. Since the initial de novo design of an idealized four-fold symmetric TIM barrel, the family of de novo TIM barrels is expanding rapidly. Despite this and in contrast to natural TIM barrels, these novel proteins lack cavities and structural elements essential for the incorporation of binding sites or enzymatic functions. In this work, we diversified a de novo TIM barrel by extending multiple βα-loops using constrained hallucination. Experimentally tested designs were found to be soluble upon expression in Escherichia coli and well-behaved. Biochemical characterization and crystal structures revealed successful extensions with defined α-helical structures. These diversified de novo TIM barrels provide a framework to explore a broad spectrum of functions based on the potential of natural TIM barrels.
Affiliation(s)
- Julian Beck
  - Department of Biochemistry, University of Bayreuth, Bayreuth, Germany
- Birte Höcker
  - Department of Biochemistry, University of Bayreuth, Bayreuth, Germany
3
Winnifrith A, Outeiral C, Hie BL. Generative artificial intelligence for de novo protein design. Curr Opin Struct Biol 2024; 86:102794. [PMID: 38663170 DOI: 10.1016/j.sbi.2024.102794] [Received: 12/21/2023] [Revised: 01/31/2024] [Accepted: 02/19/2024] [Indexed: 05/19/2024]
Abstract
Engineering new molecules with desirable functions and properties has the potential to extend our ability to engineer proteins beyond what nature has so far evolved. Advances in the so-called 'de novo' design problem have recently been brought forward by developments in artificial intelligence. Generative architectures, such as language models and diffusion processes, seem adept at generating novel, yet realistic proteins that display desirable properties and perform specified functions. State-of-the-art design protocols now achieve experimental success rates nearing 20%, thus widening the access to de novo designed proteins. Despite extensive progress, there are clear field-wide challenges, for example, in determining the best in silico metrics to prioritise designs for experimental testing, and in designing proteins that can undergo large conformational changes or be regulated by post-translational modifications. With an increase in the number of models being developed, this review provides a framework to understand how these tools fit into the overall process of de novo protein design. Throughout, we highlight the power of incorporating biochemical knowledge to improve performance and interpretability.
Affiliation(s)
- Adam Winnifrith
  - Department of Biochemistry, University of Oxford, South Parks Rd, Oxford, OX1 3QU, United Kingdom; Evolvere Biosciences, Innovation Building, Old Road Campus, Oxford, OX3 7FZ, United Kingdom.
- Carlos Outeiral
  - Department of Statistics, University of Oxford, 24-29 St Giles', Oxford OX1 3LB, United Kingdom.
- Brian L Hie
  - Department of Chemical Engineering, Stanford University, 443 Via Ortega, Stanford, CA 94305, USA; Stanford Data Science, 475 Via Ortega, Stanford CA 94305, USA; Arc Institute, 3181 Porter Dr, Palo Alto, CA, USA.
4
Frasnetti E, Magni A, Castelli M, Serapian SA, Moroni E, Colombo G. Structures, dynamics, complexes, and functions: From classic computation to artificial intelligence. Curr Opin Struct Biol 2024; 87:102835. [PMID: 38744148 DOI: 10.1016/j.sbi.2024.102835] [Received: 01/23/2024] [Revised: 04/14/2024] [Accepted: 04/22/2024] [Indexed: 05/16/2024]
Abstract
Computational approaches can provide highly detailed insight into the molecular recognition processes that underlie drug binding, the assembly of protein complexes, and the regulation of biological functional processes. Classical simulation methods can bridge a wide range of length- and time-scales typically involved in such processes. Lately, automated learning and artificial intelligence methods have shown the potential to expand the reach of physics-based approaches, ushering in the possibility to model and even design complex protein architectures. The synergy between atomistic simulations and AI methods is an emerging frontier with a huge potential for advances in structural biology. Herein, we explore various examples and frameworks for these approaches, providing select instances and applications that illustrate their impact on fundamental biomolecular problems.
Affiliation(s)
- Elena Frasnetti
  - Department of Chemistry, University of Pavia, via Taramelli 12, 27100 Pavia, Italy
- Andrea Magni
  - Department of Chemistry, University of Pavia, via Taramelli 12, 27100 Pavia, Italy
- Matteo Castelli
  - Department of Chemistry, University of Pavia, via Taramelli 12, 27100 Pavia, Italy
- Stefano A Serapian
  - Department of Chemistry, University of Pavia, via Taramelli 12, 27100 Pavia, Italy
- Giorgio Colombo
  - Department of Chemistry, University of Pavia, via Taramelli 12, 27100 Pavia, Italy.
5
Zhou L, Tao C, Shen X, Sun X, Wang J, Yuan Q. Unlocking the potential of enzyme engineering via rational computational design strategies. Biotechnol Adv 2024; 73:108376. [PMID: 38740355 DOI: 10.1016/j.biotechadv.2024.108376] [Received: 12/27/2023] [Revised: 04/27/2024] [Accepted: 05/08/2024] [Indexed: 05/16/2024]
Abstract
Enzymes play a pivotal role in various industries by enabling efficient, eco-friendly, and sustainable chemical processes. However, the low turnover rates and poor substrate selectivity of enzymes limit their large-scale applications. Rational computational enzyme design, facilitated by computational algorithms, offers a more targeted and less labor-intensive approach. Although there has been notable advancement in employing rational computational protein engineering strategies to overcome these issues, this progress has not been comprehensively reviewed so far. This article reviews recent developments in rational computational enzyme design, categorizing them into three types: structure-based, sequence-based, and data-driven machine learning computational design. Case studies are presented to demonstrate successful enhancements in catalytic activity, stability, and substrate selectivity. Lastly, the article provides a thorough analysis of these approaches, highlights existing challenges and potential solutions, and offers insights into future development directions.
Affiliation(s)
- Lei Zhou
  - State Key Laboratory of Chemical Resource Engineering, Beijing University of Chemical Technology, Beijing 100029, China
- Chunmeng Tao
  - State Key Laboratory of Chemical Resource Engineering, Beijing University of Chemical Technology, Beijing 100029, China
- Xiaolin Shen
  - State Key Laboratory of Chemical Resource Engineering, Beijing University of Chemical Technology, Beijing 100029, China
- Xinxiao Sun
  - State Key Laboratory of Chemical Resource Engineering, Beijing University of Chemical Technology, Beijing 100029, China
- Jia Wang
  - State Key Laboratory of Chemical Resource Engineering, Beijing University of Chemical Technology, Beijing 100029, China.
- Qipeng Yuan
  - State Key Laboratory of Chemical Resource Engineering, Beijing University of Chemical Technology, Beijing 100029, China.
6
Chen Z, Wang R, Guo J, Wang X. The role and future prospects of artificial intelligence algorithms in peptide drug development. Biomed Pharmacother 2024; 175:116709. [PMID: 38713945 DOI: 10.1016/j.biopha.2024.116709] [Received: 03/10/2024] [Revised: 05/01/2024] [Accepted: 05/02/2024] [Indexed: 05/09/2024]
Abstract
Peptide medications have become better known in recent years because of their many benefits, including few side effects, high biological activity, specificity, and effectiveness. Over 100 peptide medications have been introduced to the market to treat a variety of illnesses. Most of these peptide medications are developed on the basis of endogenous or natural peptides, which frequently require expensive, time-consuming, and extensive tests to confirm. As artificial intelligence advances quickly, it is now possible to build machine learning or deep learning models that screen a large number of candidate sequences for therapeutic peptides. Therapeutic peptides, such as those with antibacterial or anticancer properties, have been developed by the application of artificial intelligence algorithms. The process of finding and developing peptide drugs is outlined in this review, along with a few related cases that were helped by AI and conventional methods. These resources will open up new avenues for peptide drug development and discovery, helping to meet the pressing needs of clinical patients for disease treatment. Although peptide drugs are a new class of biopharmaceuticals distinct from chemical and small-molecule drugs, their clinical purpose and value cannot be ignored. However, traditional peptide drug research and development has a long development cycle and requires high investment; the AI-assisted (AI+) mode will substantially hasten the creation of peptide medications, offering a new boost for combating diseases.
Affiliation(s)
- Zhiheng Chen
  - School of Biological Science and Medical Engineering, Beihang University, Beijing 100083, China.
- Ruoxi Wang
  - School of Biological Science and Medical Engineering, Beihang University, Beijing 100083, China.
- Junqi Guo
  - School of Biological Science and Medical Engineering, Beihang University, Beijing 100083, China.
- Xiaogang Wang
  - Guangdong Provincial Key Laboratory of Bone and Joint Degenerative Diseases, The Third Affiliated Hospital of Southern Medical University, Guangzhou, Guangdong 510630, China.
7
Wang H, Chen M, Wei X, Xia R, Pei D, Huang X, Han B. Computational tools for plant genomics and breeding. SCIENCE CHINA. LIFE SCIENCES 2024:10.1007/s11427-024-2578-6. [PMID: 38676814 DOI: 10.1007/s11427-024-2578-6] [Received: 02/05/2024] [Accepted: 03/25/2024] [Indexed: 04/29/2024]
Abstract
Plant genomics and crop breeding are at the intersection of biotechnology and information technology. Driven by a combination of high-throughput sequencing, molecular biology and data science, great advances have been made in omics technologies at every step along the central dogma, especially in genome assembling, genome annotation, epigenomic profiling, and transcriptome profiling. These advances further revolutionized three directions of development. One is genetic dissection of complex traits in crops, along with genomic prediction and selection. The second is comparative genomics and evolution, which open up new opportunities to depict the evolutionary constraints of biological sequences for deleterious variant discovery. The third direction is the development of deep learning approaches for the rational design of biological sequences, especially proteins, for synthetic biology. All three directions of development serve as the foundation for a new era of crop breeding where agronomic traits are enhanced by genome design.
Affiliation(s)
- Hai Wang
  - State Key Laboratory of Maize Bio-breeding, Frontiers Science Center for Molecular Design Breeding, Joint International Research Laboratory of Crop Molecular Breeding, National Maize Improvement Center, College of Agronomy and Biotechnology, China Agricultural University, Beijing, 100193, China.
  - Sanya Institute of China Agricultural University, Sanya, 572025, China.
  - Hainan Yazhou Bay Seed Laboratory, Sanya, 572025, China.
- Mengjiao Chen
  - State Key Laboratory of Tree Genetics and Breeding, Key Laboratory of Tree Breeding and Cultivation of the State Forestry and Grassland Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing, 100091, China
- Xin Wei
  - Shanghai Key Laboratory of Plant Molecular Sciences, College of Life Sciences, Shanghai Normal University, Shanghai, 200234, China
- Rui Xia
  - College of Horticulture, South China Agricultural University, Guangzhou, 510640, China
- Dong Pei
  - State Key Laboratory of Tree Genetics and Breeding, Key Laboratory of Tree Breeding and Cultivation of the State Forestry and Grassland Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing, 100091, China
- Xuehui Huang
  - Shanghai Key Laboratory of Plant Molecular Sciences, College of Life Sciences, Shanghai Normal University, Shanghai, 200234, China
- Bin Han
  - National Center for Gene Research, CAS Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences, Shanghai, 200233, China
8
Krishna R, Wang J, Ahern W, Sturmfels P, Venkatesh P, Kalvet I, Lee GR, Morey-Burrows FS, Anishchenko I, Humphreys IR, McHugh R, Vafeados D, Li X, Sutherland GA, Hitchcock A, Hunter CN, Kang A, Brackenbrough E, Bera AK, Baek M, DiMaio F, Baker D. Generalized biomolecular modeling and design with RoseTTAFold All-Atom. Science 2024; 384:eadl2528. [PMID: 38452047 DOI: 10.1126/science.adl2528] [Received: 10/09/2023] [Accepted: 02/27/2024] [Indexed: 03/09/2024]
Abstract
Deep-learning methods have revolutionized protein structure prediction and design but are presently limited to protein-only systems. We describe RoseTTAFold All-Atom (RFAA), which combines a residue-based representation of amino acids and DNA bases with an atomic representation of all other groups to model assemblies that contain proteins, nucleic acids, small molecules, metals, and covalent modifications, given their sequences and chemical structures. By fine-tuning on denoising tasks, we developed RFdiffusion All-Atom (RFdiffusionAA), which builds protein structures around small molecules. Starting from random distributions of amino acid residues surrounding target small molecules, we designed and experimentally validated, through crystallography and binding measurements, proteins that bind the cardiac disease therapeutic digoxigenin, the enzymatic cofactor heme, and the light-harvesting molecule bilin.
Affiliation(s)
- Rohith Krishna
  - Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
  - Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
- Jue Wang
  - Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
  - Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
- Woody Ahern
  - Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
  - Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
  - Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA 98105, USA
- Pascal Sturmfels
  - Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
  - Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
  - Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA 98105, USA
- Preetham Venkatesh
  - Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
  - Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
  - Graduate Program in Biological Physics, Structure and Design, University of Washington, Seattle, WA 98105, USA
- Indrek Kalvet
  - Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
  - Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA 98105, USA
- Gyu Rie Lee
  - Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
  - Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA 98105, USA
- Ivan Anishchenko
  - Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
  - Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
- Ian R Humphreys
  - Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
  - Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
- Ryan McHugh
  - Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
  - Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
  - Graduate Program in Biological Physics, Structure and Design, University of Washington, Seattle, WA 98105, USA
- Dionne Vafeados
  - Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
  - Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
- Xinting Li
  - Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
  - Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
- Andrew Hitchcock
  - School of Biosciences, University of Sheffield, Sheffield S10 2TN, UK
- C Neil Hunter
  - School of Biosciences, University of Sheffield, Sheffield S10 2TN, UK
- Alex Kang
  - Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
- Evans Brackenbrough
  - Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
- Asim K Bera
  - Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
- Minkyung Baek
  - School of Biological Sciences, Seoul National University, Seoul 08826, Republic of Korea
- Frank DiMaio
  - Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
  - Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
- David Baker
  - Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
  - Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA 98105, USA
9
Sgueglia G, Vrettas MD, Chino M, De Simone A, Lombardi A. MetalHawk: Enhanced Classification of Metal Coordination Geometries by Artificial Neural Networks. J Chem Inf Model 2024; 64:2356-2367. [PMID: 37956388 PMCID: PMC11005052 DOI: 10.1021/acs.jcim.3c00873] [Received: 06/08/2023] [Revised: 09/29/2023] [Accepted: 10/26/2023] [Indexed: 11/15/2023]
Abstract
The chemical properties of metal complexes are strongly dependent on the number and geometrical arrangement of ligands coordinated to the metal center. Existing methods for determining either coordination number or geometry rely on a trade-off between accuracy and computational costs, which hinders their application to the study of large structure data sets. Here, we propose MetalHawk (https://github.com/vrettasm/MetalHawk), a machine learning-based approach to perform simultaneous classification of metal site coordination number and geometry through artificial neural networks (ANNs), which were trained using the Cambridge Structural Database (CSD) and Metal Protein Data Bank (MetalPDB). We demonstrate that the CSD-trained model can be used to classify sites belonging to the most common coordination numbers and geometry classes with balanced accuracy equal to 96.51% for CSD-deposited metal sites. The CSD-trained model was also found to be capable of classifying bioinorganic metal sites from the MetalPDB database, with balanced accuracy equal to 84.29% on the whole PDB data set and to 91.66% on manually reviewed sites in the PDB validation set. Moreover, we report evidence that the output vectors of the CSD-trained model can be considered as a proxy indicator of metal-site distortions, showing that these can be interpreted as a low-dimensional representation of subtle geometrical features present in metal site structures.
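The balanced accuracy figures quoted in this abstract can be illustrated with a minimal sketch. It assumes the common definition of balanced accuracy as the mean of per-class recall; the abstract does not spell out the definition used. The function name `balanced_accuracy` and the toy geometry labels are hypothetical and are not taken from MetalHawk.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Return the mean per-class recall over the classes present in y_true."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    total = defaultdict(int)    # number of true sites per class
    correct = defaultdict(int)  # correctly classified sites per class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Hypothetical, imbalanced toy labels (not real CSD or MetalPDB data).
y_true = ["octahedral"] * 4 + ["tetrahedral"] * 2
y_pred = ["octahedral", "octahedral", "octahedral", "tetrahedral",
          "tetrahedral", "octahedral"]
print(balanced_accuracy(y_true, y_pred))
```

Under this definition each class contributes equally regardless of how many sites it contains, which is why the metric is often preferred over plain accuracy for imbalanced class distributions.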
Affiliation(s)
- Gianmattia Sgueglia
  - Department of Chemical Sciences, University of Naples Federico II, Via Cintia 21, 80126 Napoli, Italy
- Michail D. Vrettas
  - Department of Pharmacy, University of Naples Federico II, Via Domenico Montesano 49, 80131 Napoli, Italy
- Marco Chino
  - Department of Chemical Sciences, University of Naples Federico II, Via Cintia 21, 80126 Napoli, Italy
- Alfonso De Simone
  - Department of Pharmacy, University of Naples Federico II, Via Domenico Montesano 49, 80131 Napoli, Italy
- Angela Lombardi
  - Department of Chemical Sciences, University of Naples Federico II, Via Cintia 21, 80126 Napoli, Italy
10
Ding X, Chen X, Sullivan EE, Shay TF, Gradinaru V. Fast, accurate ranking of engineered proteins by target-binding propensity using structure modeling. Mol Ther 2024:S1525-0016(24)00219-3. [PMID: 38582966 DOI: 10.1016/j.ymthe.2024.04.003] [Received: 10/17/2023] [Revised: 02/08/2024] [Accepted: 04/03/2024] [Indexed: 04/08/2024]
Abstract
Deep-learning-based methods for protein structure prediction have achieved unprecedented accuracy, yet their utility in the engineering of protein-based binders remains constrained due to a gap between the ability to predict the structures of candidate proteins and the ability to prioritize proteins by their potential to bind to a target. To bridge this gap, we introduce Automated Pairwise Peptide-Receptor Analysis for Screening Engineered proteins (APPRAISE), a method for predicting the target-binding propensity of engineered proteins. After generating structural models of engineered proteins competing for binding to a target using an established structure prediction tool such as AlphaFold-Multimer or ESMFold, APPRAISE performs a rapid (under 1 CPU second per model) scoring analysis that takes into account biophysical and geometrical constraints. As proof-of-concept cases, we demonstrate that APPRAISE can accurately classify receptor-dependent vs. receptor-independent adeno-associated viral vectors and diverse classes of engineered proteins such as miniproteins targeting the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) spike, nanobodies targeting a G-protein-coupled receptor, and peptides that specifically bind to transferrin receptor or programmed death-ligand 1 (PD-L1). APPRAISE is accessible through a web-based notebook interface using Google Colaboratory (https://tiny.cc/APPRAISE). With its accuracy, interpretability, and generalizability, APPRAISE promises to expand the utility of protein structure prediction and accelerate protein engineering for biomedical applications.
Affiliation(s)
- Xiaozhe Ding
  - Division of Biology and Biological Engineering, California Institute of Technology, 1200 E California Boulevard, Pasadena, CA 91125, USA.
- Xinhong Chen
  - Division of Biology and Biological Engineering, California Institute of Technology, 1200 E California Boulevard, Pasadena, CA 91125, USA
- Erin E Sullivan
  - Division of Biology and Biological Engineering, California Institute of Technology, 1200 E California Boulevard, Pasadena, CA 91125, USA
- Timothy F Shay
  - Division of Biology and Biological Engineering, California Institute of Technology, 1200 E California Boulevard, Pasadena, CA 91125, USA
- Viviana Gradinaru
  - Division of Biology and Biological Engineering, California Institute of Technology, 1200 E California Boulevard, Pasadena, CA 91125, USA.
11
Xu Y, Hu X, Wang C, Liu Y, Chen Q, Liu H. De novo design of cavity-containing proteins with a backbone-centered neural network energy function. Structure 2024; 32:424-432.e4. [PMID: 38325370 DOI: 10.1016/j.str.2024.01.006] [Received: 04/16/2023] [Revised: 10/04/2023] [Accepted: 01/11/2024] [Indexed: 02/09/2024]
Abstract
The design of small-molecule-binding proteins requires protein backbones that contain cavities. Previous design efforts were based on naturally occurring cavity-containing backbone architectures. Here, we designed diverse cavity-containing backbones without predefined architectures by introducing tailored restraints into the backbone sampling driven by SCUBA (Side Chain-Unknown Backbone Arrangement), a neural network statistical energy function. For 521 out of 5816 designs, the root-mean-square deviations (RMSDs) of the Cα atoms for the AlphaFold2-predicted structures and our designed structures are within 2.0 Å. We experimentally tested 10 designed proteins and determined the crystal structures of two of them. One closely agrees with the designed model, while the other forms a domain-swapped dimer, where the partial structures are in agreement with the designed structures. Our results indicate that data-driven methods such as SCUBA hold great potential for designing de novo proteins with tailored small-molecule-binding function.
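To make the 2.0 Å criterion in this abstract concrete, here is a minimal sketch of a Cα RMSD between two equally sized coordinate sets. It assumes the two structures are already superimposed; the superposition step, and whichever tool the authors used for it, is not shown. The function name `ca_rmsd` and the toy coordinates are hypothetical.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD in Å between paired C-alpha coordinates that are already superimposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


# Hypothetical toy coordinates: the second set is shifted by 1 Å along z.
designed = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
predicted = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]
print(ca_rmsd(designed, predicted) <= 2.0)  # applies the 2.0 Å cutoff

# Fraction of designs within the cutoff, using the counts in the abstract.
print(f"{521 / 5816:.1%}")  # about 9.0%
```

The counts reported in the abstract (521 of 5816) correspond to roughly 9% of designs meeting the cutoff.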
Collapse
Affiliation(s)
- Yang Xu
- Department of Rheumatology and Immunology, The First Affiliated Hospital of USTC, Centre for Advanced Interdisciplinary Science and Biomedicine of IHM, Hefei National Center for Interdisciplinary Sciences at the Microscale, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230001, China; MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, Hefei National Laboratory for Physical Sciences at the Microscale, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
| | - Xiuhong Hu
- Department of Rheumatology and Immunology, The First Affiliated Hospital of USTC, Centre for Advanced Interdisciplinary Science and Biomedicine of IHM, Hefei National Center for Interdisciplinary Sciences at the Microscale, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230001, China; MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, Hefei National Laboratory for Physical Sciences at the Microscale, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
| | - Chenchen Wang
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, Hefei National Laboratory for Physical Sciences at the Microscale, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
| | - Yongrui Liu
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, Hefei National Laboratory for Physical Sciences at the Microscale, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
| | - Quan Chen
- Department of Rheumatology and Immunology, The First Affiliated Hospital of USTC, Centre for Advanced Interdisciplinary Science and Biomedicine of IHM, Hefei National Center for Interdisciplinary Sciences at the Microscale, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230001, China; MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, Hefei National Laboratory for Physical Sciences at the Microscale, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China; Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, Anhui 230027, China.
| | - Haiyan Liu
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, Hefei National Laboratory for Physical Sciences at the Microscale, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China; Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, Anhui 230027, China; School of Data Science, University of Science and Technology of China, Hefei, Anhui 230027, China.
| |
|
12
|
Capponi S, Wang S. AI in cellular engineering and reprogramming. Biophys J 2024:S0006-3495(24)00245-5. [PMID: 38576162 DOI: 10.1016/j.bpj.2024.04.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Revised: 03/19/2024] [Accepted: 04/01/2024] [Indexed: 04/06/2024]
Abstract
During the last decade, artificial intelligence (AI) has increasingly been applied in biophysics and related fields, including cellular engineering and reprogramming, offering novel approaches to understand, manipulate, and control cellular function. The potential of AI lies in its ability to analyze complex datasets and generate predictive models. AI algorithms can process large amounts of data from single-cell genomics and multiomic technologies, allowing researchers to gain mechanistic insights into the control of cell identity and function. By integrating and interpreting these complex datasets, AI can help identify key molecular events and regulatory pathways involved in cellular reprogramming. This knowledge can inform the design of precision engineering strategies, such as the development of new transcription factor and signaling molecule cocktails, to manipulate cell identity and drive authentic cell fate across lineage boundaries. Furthermore, when used in combination with computational methods, AI can accelerate and improve the analysis and understanding of the intricate relationships between genes, proteins, and cellular processes. In this review article, we explore the current state of AI applications in biophysics with a specific focus on cellular engineering and reprogramming. Then, we showcase a couple of recent applications where we combined machine learning with experimental and computational techniques. Finally, we briefly discuss the challenges and prospects of AI in cellular engineering and reprogramming, emphasizing the potential of these technologies to revolutionize our ability to engineer cells for a variety of applications, from disease modeling and drug discovery to regenerative medicine and biomanufacturing.
Affiliation(s)
- Sara Capponi
- IBM Almaden Research Center, San Jose, California; Center for Cellular Construction, San Francisco, California.
| | - Shangying Wang
- Bay Area Institute of Science, Altos Labs, Redwood City, California.
| |
|
13
|
Listov D, Goverde CA, Correia BE, Fleishman SJ. Opportunities and challenges in design and optimization of protein function. Nat Rev Mol Cell Biol 2024:10.1038/s41580-024-00718-y. [PMID: 38565617 DOI: 10.1038/s41580-024-00718-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/27/2024] [Indexed: 04/04/2024]
Abstract
The field of protein design has made remarkable progress over the past decade. Historically, the low reliability of purely structure-based design methods limited their application, but recent strategies that combine structure-based and sequence-based calculations, as well as machine learning tools, have dramatically improved protein engineering and design. In this Review, we discuss how these methods have enabled the design of increasingly complex structures and therapeutically relevant activities. Additionally, protein optimization methods have improved the stability and activity of complex eukaryotic proteins. Thanks to their increased reliability, computational design methods have been applied to improve therapeutics and enzymes for green chemistry and have generated vaccine antigens, antivirals and drug-delivery nano-vehicles. Moreover, the high success of design methods reflects an increased understanding of basic rules that govern the relationships among protein sequence, structure and function. However, de novo design is still limited mostly to α-helix bundles, restricting its potential to generate sophisticated enzymes and diverse protein and small-molecule binders. Designing complex protein structures is a challenging but necessary next step if we are to realize our objective of generating new-to-nature activities.
Affiliation(s)
- Dina Listov
- Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot, Israel
| | - Casper A Goverde
- Institute of Bioengineering, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Bruno E Correia
- Institute of Bioengineering, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland.
| | - Sarel Jacob Fleishman
- Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot, Israel.
| |
|
14
|
Roel‐Touris J, Carcelén L, Marcos E. The structural landscape of the immunoglobulin fold by large-scale de novo design. Protein Sci 2024; 33:e4936. [PMID: 38501461 PMCID: PMC10949314 DOI: 10.1002/pro.4936] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2023] [Revised: 02/02/2024] [Accepted: 02/06/2024] [Indexed: 03/20/2024]
Abstract
De novo designing immunoglobulin-like frameworks that allow for functional loop diversification shows great potential for crafting antibody-like scaffolds with fully customizable structures and functions. In this work, we combined de novo parametric design with deep-learning methods for protein structure prediction and design to explore the structural landscape of 7-stranded immunoglobulin domains. After screening folding of nearly 4 million designs, we have assembled a structurally diverse library of ~50,000 immunoglobulin domains with high-confidence AlphaFold2 predictions and structures diverging from naturally occurring ones. The designed dataset enabled us to identify structural requirements for the correct folding of immunoglobulin domains, shed light on β-sheet-β-sheet rotational preferences and how these are linked to functional properties. Our approach eliminates the need for preset loop conformations and opens the route to large-scale de novo design of immunoglobulin-like frameworks.
Affiliation(s)
- Jorge Roel‐Touris
- Protein Design and Modeling Lab, Department of Structural and Molecular Biology, Molecular Biology Institute of Barcelona (IBMB), CSIC, Barcelona, Spain
| | - Lourdes Carcelén
- Protein Design and Modeling Lab, Department of Structural and Molecular Biology, Molecular Biology Institute of Barcelona (IBMB), CSIC, Barcelona, Spain
| | - Enrique Marcos
- Protein Design and Modeling Lab, Department of Structural and Molecular Biology, Molecular Biology Institute of Barcelona (IBMB), CSIC, Barcelona, Spain
| |
|
15
|
Mu J, Li Z, Zhang B, Zhang Q, Iqbal J, Wadood A, Wei T, Feng Y, Chen HF. Graphormer supervised de novo protein design method and function validation. Brief Bioinform 2024; 25:bbae135. [PMID: 38557677 PMCID: PMC10982952 DOI: 10.1093/bib/bbae135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2023] [Revised: 01/31/2024] [Accepted: 03/12/2024] [Indexed: 04/04/2024]
Abstract
Protein design is central to nearly all protein engineering problems, as it can enable the creation of proteins with new biological functions, such as improving the catalytic efficiency of enzymes. One key facet of protein design, fixed-backbone protein sequence design, seeks to design new sequences that will conform to a prescribed protein backbone structure. Nonetheless, existing sequence design methods present limitations, such as low sequence diversity and shortcomings in experimental validation of the designed functional proteins. These inadequacies obstruct the goal of functional protein design. To address these limitations, we initially developed the Graphormer-based Protein Design (GPD) model. This model utilizes the Transformer on a graph-based representation of three-dimensional protein structures and adds Gaussian noise and random sequence masks to node features, thereby enhancing sequence recovery and diversity. The performance of the GPD model was significantly better than that of the state-of-the-art ProteinMPNN model on multiple independent tests, especially for sequence diversity. We employed GPD to design CalB hydrolase and generated nine artificially designed CalB proteins. The results show a 1.7-fold increase in catalytic activity compared to that of the wild-type CalB and strong substrate selectivity on p-nitrophenyl acetate with different carbon chain lengths (C2-C16). Thus, the GPD method could be used for the de novo design of industrial enzymes and protein drugs. The code was released at https://github.com/decodermu/GPD.
Affiliation(s)
- Junxi Mu
- State Key Laboratory of Microbial metabolism, Joint International Research Laboratory of Metabolic Developmental Sciences, Department of Bioinformatics and Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
- Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, No.5 Yiheyuan Road, Beijing, 100871, China
| | - Zhengxin Li
- State Key Laboratory of Microbial metabolism, Joint International Research Laboratory of Metabolic Developmental Sciences, Department of Bioinformatics and Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
| | - Bo Zhang
- State Key Laboratory of Microbial metabolism, Joint International Research Laboratory of Metabolic Developmental Sciences, Department of Bioinformatics and Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
| | - Qi Zhang
- State Key Laboratory of Microbial metabolism, Joint International Research Laboratory of Metabolic Developmental Sciences, Department of Bioinformatics and Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
| | - Jamshed Iqbal
- Centre for Advanced Drug Research, COMSATS University Islamabad, Abbottabad Campus, Abbottabad, 22060, Pakistan
| | - Abdul Wadood
- Department of Biochemistry, Abdul Wali Khan University Mardan, Mardan, 23200, Pakistan
| | - Ting Wei
- State Key Laboratory of Microbial metabolism, Joint International Research Laboratory of Metabolic Developmental Sciences, Department of Bioinformatics and Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
| | - Yan Feng
- State Key Laboratory of Microbial metabolism, Joint International Research Laboratory of Metabolic Developmental Sciences, Department of Bioinformatics and Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
| | - Hai-Feng Chen
- State Key Laboratory of Microbial metabolism, Joint International Research Laboratory of Metabolic Developmental Sciences, Department of Bioinformatics and Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
| |
|
16
|
de Haas RJ, Brunette N, Goodson A, Dauparas J, Yi SY, Yang EC, Dowling Q, Nguyen H, Kang A, Bera AK, Sankaran B, de Vries R, Baker D, King NP. Rapid and automated design of two-component protein nanomaterials using ProteinMPNN. Proc Natl Acad Sci U S A 2024; 121:e2314646121. [PMID: 38502697 PMCID: PMC10990136 DOI: 10.1073/pnas.2314646121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2023] [Accepted: 02/20/2024] [Indexed: 03/21/2024]
Abstract
The design of protein-protein interfaces using physics-based design methods such as Rosetta requires substantial computational resources and manual refinement by expert structural biologists. Deep learning methods promise to simplify protein-protein interface design and enable its application to a wide variety of problems by researchers from various scientific disciplines. Here, we test the ability of a deep learning method for protein sequence design, ProteinMPNN, to design two-component tetrahedral protein nanomaterials and benchmark its performance against Rosetta. ProteinMPNN had a similar success rate to Rosetta, yielding 13 new experimentally confirmed assemblies, but required orders of magnitude less computation and no manual refinement. The interfaces designed by ProteinMPNN were substantially more polar than those designed by Rosetta, which facilitated in vitro assembly of the designed nanomaterials from independently purified components. Crystal structures of several of the assemblies confirmed the accuracy of the design method at high resolution. Our results showcase the potential of deep learning-based methods to unlock the widespread application of designed protein-protein interfaces and self-assembling protein nanomaterials in biotechnology.
Affiliation(s)
- Robbert J. de Haas
- Department of Physical Chemistry and Soft Matter, Wageningen University and Research, Wageningen 6078 WE, The Netherlands
| | - Natalie Brunette
- Department of Biochemistry, University of Washington, Seattle, WA 98195
- Institute for Protein Design, University of Washington, Seattle, WA 98195
| | - Alex Goodson
- Department of Biochemistry, University of Washington, Seattle, WA 98195
- Institute for Protein Design, University of Washington, Seattle, WA 98195
| | - Justas Dauparas
- Department of Biochemistry, University of Washington, Seattle, WA 98195
- Institute for Protein Design, University of Washington, Seattle, WA 98195
| | - Sue Y. Yi
- Department of Biochemistry, University of Washington, Seattle, WA 98195
- Institute for Protein Design, University of Washington, Seattle, WA 98195
| | - Erin C. Yang
- Department of Biochemistry, University of Washington, Seattle, WA 98195
- Institute for Protein Design, University of Washington, Seattle, WA 98195
| | - Quinton Dowling
- Department of Biochemistry, University of Washington, Seattle, WA 98195
- Institute for Protein Design, University of Washington, Seattle, WA 98195
| | - Hannah Nguyen
- Department of Biochemistry, University of Washington, Seattle, WA 98195
- Institute for Protein Design, University of Washington, Seattle, WA 98195
| | - Alex Kang
- Department of Biochemistry, University of Washington, Seattle, WA 98195
- Institute for Protein Design, University of Washington, Seattle, WA 98195
| | - Asim K. Bera
- Department of Biochemistry, University of Washington, Seattle, WA 98195
- Institute for Protein Design, University of Washington, Seattle, WA 98195
| | - Banumathi Sankaran
- Molecular Biophysics and Integrated Bioimaging, Lawrence Berkeley National Laboratory, Berkeley, CA 94720
| | - Renko de Vries
- Department of Physical Chemistry and Soft Matter, Wageningen University and Research, Wageningen 6078 WE, The Netherlands
| | - David Baker
- Department of Biochemistry, University of Washington, Seattle, WA 98195
- Institute for Protein Design, University of Washington, Seattle, WA 98195
- HHMI, Seattle, WA 98195
| | - Neil P. King
- Department of Biochemistry, University of Washington, Seattle, WA 98195
- Institute for Protein Design, University of Washington, Seattle, WA 98195
| |
|
17
|
Bennett NR, Watson JL, Ragotte RJ, Borst AJ, See DL, Weidle C, Biswas R, Shrock EL, Leung PJY, Huang B, Goreshnik I, Ault R, Carr KD, Singer B, Criswell C, Vafeados D, Sanchez MG, Kim HM, Torres SV, Chan S, Baker D. Atomically accurate de novo design of single-domain antibodies. bioRxiv 2024:2024.03.14.585103. [PMID: 38562682 PMCID: PMC10983868 DOI: 10.1101/2024.03.14.585103] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 04/04/2024]
Abstract
Despite the central role that antibodies play in modern medicine, there is currently no way to rationally design novel antibodies to bind a specific epitope on a target. Instead, antibody discovery currently involves time-consuming immunization of an animal or library screening approaches. Here we demonstrate that a fine-tuned RFdiffusion network is capable of designing de novo antibody variable heavy chains (VHH's) that bind user-specified epitopes. We experimentally confirm binders to four disease-relevant epitopes, and the cryo-EM structure of a designed VHH bound to influenza hemagglutinin is nearly identical to the design model both in the configuration of the CDR loops and the overall binding pose.
Affiliation(s)
- Nathaniel R. Bennett
- Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
- Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
- Graduate Program in Molecular Engineering, University of Washington, Seattle, WA 98105, USA
| | - Joseph L. Watson
- Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
- Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
| | - Robert J. Ragotte
- Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
- Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
| | - Andrew J. Borst
- Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
- Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
| | - Déjenaé L. See
- Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
- Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
- Department of Bioengineering, University of Washington, Seattle, WA, USA
| | - Connor Weidle
- Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
- Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
| | - Riti Biswas
- Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
- Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
- Graduate Program in Molecular Engineering, University of Washington, Seattle, WA 98105, USA
| | - Ellen L. Shrock
- Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
- Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
| | - Philip J. Y. Leung
- Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
- Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
- Graduate Program in Molecular Engineering, University of Washington, Seattle, WA 98105, USA
| | - Buwei Huang
- Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
- Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
- Department of Bioengineering, University of Washington, Seattle, WA, USA
| | - Inna Goreshnik
- Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
- Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
| | - Russell Ault
- Department of Pediatrics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Kenneth D. Carr
- Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
| | - Benedikt Singer
- Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
- Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
| | - Cameron Criswell
- Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
- Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
| | - Dionne Vafeados
- Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
| | | | - Ho Min Kim
- Center for Biomolecular and Cellular Structure, Institute for Basic Science (IBS), Daejeon, 34126, Republic of Korea
- Department of Biological Sciences, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, 34141, Republic of Korea
| | - Susana Vázquez Torres
- Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
- Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
- Graduate Program in Biological Physics, Structure and Design, University of Washington, Seattle, WA, USA
| | - Sidney Chan
- Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
| | - David Baker
- Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
- Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
| |
|
18
|
Hansen AL, Theisen FF, Crehuet R, Marcos E, Aghajari N, Willemoës M. Carving out a Glycoside Hydrolase Active Site for Incorporation into a New Protein Scaffold Using Deep Network Hallucination. ACS Synth Biol 2024; 13:862-875. [PMID: 38357862 PMCID: PMC10949244 DOI: 10.1021/acssynbio.3c00674] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2023] [Revised: 01/16/2024] [Accepted: 01/23/2024] [Indexed: 02/16/2024]
Abstract
Enzymes are indispensable biocatalysts for numerous industrial applications, yet stability, selectivity, and restricted substrate recognition present limitations for their use. Despite the importance of enzyme engineering in overcoming these limitations, success is often challenged by the intricate architecture of enzymes derived from natural sources. Recent advances in computational methods have enabled the de novo design of simplified scaffolds with specific functional sites. Such scaffolds may be advantageous as platforms for enzyme engineering. Here, we present a strategy for the de novo design of a simplified scaffold of an endo-α-N-acetylgalactosaminidase active site, a glycoside hydrolase from the GH101 enzyme family. Using a combination of trRosetta hallucination, iterative cycles of deep-learning-based structure prediction, and ProteinMPNN sequence design, we designed proteins with 290 amino acids incorporating the active site while reducing the molecular weight by over 100 kDa compared to the initial endo-α-N-acetylgalactosaminidase. Of 11 tested designs, six were expressed as soluble monomers, displaying similar or increased thermostabilities compared to the natural enzyme. Despite lacking detectable enzymatic activity, the experimentally determined crystal structures of a representative design closely matched the design with a root-mean-square deviation of 1.0 Å, with most catalytically important side chains within 2.0 Å. The results highlight the potential of scaffold hallucination in designing proteins that may serve as a foundation for subsequent enzyme engineering.
Affiliation(s)
- Anders Lønstrup Hansen
- The Linderstrøm-Lang Centre for Protein Science, Section for Biomolecular Sciences, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, 2200 Copenhagen, Denmark
| | - Frederik Friis Theisen
- The Linderstrøm-Lang Centre for Protein Science, Section for Biomolecular Sciences, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, 2200 Copenhagen, Denmark
| | - Ramon Crehuet
- Institute for Advanced Chemistry of Catalonia (IQAC), CSIC, Carrer Jordi Girona 18-26, 08034 Barcelona, Spain
| | - Enrique Marcos
- Protein Design and Modeling Lab, Department of Structural and Molecular Biology, Molecular Biology Institute of Barcelona (IBMB), CSIC, Baldiri Reixac 10, 08028 Barcelona, Spain
| | - Nushin Aghajari
- Molecular Microbiology and Structural Biochemistry, CNRS, University of Lyon1, UMR5086, 7 Passage du Vercors, F-69367 Lyon CEDEX 07, France
| | - Martin Willemoës
- The Linderstrøm-Lang Centre for Protein Science, Section for Biomolecular Sciences, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, 2200 Copenhagen, Denmark
| |
|
19
|
Wu X, Lin H, Bai R, Duan H. Deep learning for advancing peptide drug development: Tools and methods in structure prediction and design. Eur J Med Chem 2024; 268:116262. [PMID: 38387334 DOI: 10.1016/j.ejmech.2024.116262] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2024] [Revised: 02/06/2024] [Accepted: 02/17/2024] [Indexed: 02/24/2024]
Abstract
Peptides can bind challenging disease targets with high affinity and specificity, offering enormous opportunities for addressing unmet medical needs. However, peptides' unique features, including smaller size, increased structural flexibility, and limited data availability, pose additional challenges to the design process compared to proteins. This review explores the dynamic field of peptide therapeutics, leveraging deep learning to enhance structure prediction and design. Our exploration encompasses various facets of peptide research, ranging from dataset curation handling to model development. As deep learning technologies become more refined, we channel our efforts into peptide structure prediction and design, aligning with the fundamental principles of structure-activity relationships in drug development. To guide researchers in harnessing the potential of deep learning to advance peptide drug development, our insights comprehensively explore current challenges and future directions of peptide therapeutics.
Affiliation(s)
- Xinyi Wu
- College of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou, 310014, PR China
| | - Huitian Lin
- College of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou, 310014, PR China
| | - Renren Bai
- School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, PR China.
| | - Hongliang Duan
- Faculty of Applied Sciences, Macao Polytechnic University, Macao, 999078, PR China.
| |
|
20
|
Goverde CA, Pacesa M, Goldbach N, Dornfeld LJ, Balbi PEM, Georgeon S, Rosset S, Kapoor S, Choudhury J, Dauparas J, Schellhaas C, Kozlov S, Baker D, Ovchinnikov S, Vecchio AJ, Correia BE. Computational design of soluble functional analogues of integral membrane proteins. bioRxiv 2024:2023.05.09.540044. [PMID: 38496615 PMCID: PMC10942269 DOI: 10.1101/2023.05.09.540044] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 03/19/2024]
Abstract
De novo design of complex protein folds using solely computational means remains a significant challenge. Here, we use a robust deep learning pipeline to design complex folds and soluble analogues of integral membrane proteins. Unique membrane topologies, such as those from GPCRs, are not found in the soluble proteome and we demonstrate that their structural features can be recapitulated in solution. Biophysical analyses reveal high thermal stability of the designs and experimental structures show remarkable design accuracy. The soluble analogues were functionalized with native structural motifs, standing as a proof-of-concept for bringing membrane protein functions to the soluble proteome, potentially enabling new approaches in drug discovery. In summary, we designed complex protein topologies and enriched them with functionalities from membrane proteins, with high experimental success rates, leading to a de facto expansion of the functional soluble fold space.
|
21
|
Kohyama S, Frohn BP, Babl L, Schwille P. Machine learning-aided design and screening of an emergent protein function in synthetic cells. Nat Commun 2024; 15:2010. [PMID: 38443351 PMCID: PMC10914801 DOI: 10.1038/s41467-024-46203-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2023] [Accepted: 02/16/2024] [Indexed: 03/07/2024]
Abstract
Recently, utilization of Machine Learning (ML) has led to astonishing progress in computational protein design, bringing into reach the targeted engineering of proteins for industrial and biomedical applications. However, the design of proteins for emergent functions of core relevance to cells, such as the ability to spatiotemporally self-organize and thereby structure the cellular space, is still extremely challenging. While on the generative side conditional generative models and multi-state design are on the rise, for emergent functions there is a lack of tailored screening methods as typically needed in a protein design project, both computational and experimental. Here we describe a proof-of-principle of how such screening, in silico and in vitro, can be achieved for ML-generated variants of a protein that forms intracellular spatiotemporal patterns. For computational screening we use a structure-based divide-and-conquer approach to find the most promising candidates, while for the subsequent in vitro screening we use synthetic cell-mimics as established by Bottom-Up Synthetic Biology. We then show that the best screened candidate can indeed completely substitute the wildtype gene in Escherichia coli. These results raise great hopes for the next level of synthetic biology, where ML-designed synthetic proteins will be used to engineer cellular functions.
Affiliation(s)
- Shunshi Kohyama
- Dept. Cellular and Molecular Biophysics, Max Planck Institute of Biochemistry, Martinsried, D-82152, Germany
| | - Béla P Frohn
- Dept. Cellular and Molecular Biophysics, Max Planck Institute of Biochemistry, Martinsried, D-82152, Germany
| | - Leon Babl
- Dept. Cellular and Molecular Biophysics, Max Planck Institute of Biochemistry, Martinsried, D-82152, Germany
| | - Petra Schwille
- Dept. Cellular and Molecular Biophysics, Max Planck Institute of Biochemistry, Martinsried, D-82152, Germany.
| |
|
22
|
Hong L, Kortemme T. An integrative approach to protein sequence design through multiobjective optimization. bioRxiv 2024:2024.03.01.582670. [PMID: 38496480 PMCID: PMC10942313 DOI: 10.1101/2024.03.01.582670] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/19/2024]
Abstract
With recent methodological advances in the field of computational protein design, in particular those based on deep learning, there is an increasing need for frameworks that allow for coherent, direct integration of different models and objective functions into the generative design process. Here we demonstrate how evolutionary multiobjective optimization techniques can be adapted to provide such an approach. With the established Non-dominated Sorting Genetic Algorithm II (NSGA-II) as the optimization framework, we use AlphaFold2 and ProteinMPNN confidence metrics to define the objective space, and a mutation operator composed of ESM-1v and ProteinMPNN to rank and then redesign the least favorable positions. Using the multistate design problem of the fold-switching protein RfaH as an in-depth case study, we show that the evolutionary multiobjective optimization approach leads to a significant reduction in the bias and variance in RfaH native sequence recovery, compared to a direct application of ProteinMPNN. We suggest that this improvement is due to three factors: (i) the use of an informative mutation operator that accelerates the sequence space exploration, (ii) the parallel, iterative design process inherent to the genetic algorithm that improves upon the ProteinMPNN autoregressive sequence decoding scheme, and (iii) the explicit approximation of the Pareto front that leads to optimal design candidates representing diverse tradeoff conditions. We anticipate this approach to be readily adaptable to different models and broadly relevant for protein design tasks with complex specifications.
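The Pareto-front idea mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation and it does not reproduce NSGA-II; it only shows the non-dominated filtering step, assuming each candidate is scored on two hypothetical objectives where higher is better. The function names and the example scores are illustrative assumptions.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` is at least as good as `b` on every objective
    and strictly better on at least one (higher is better)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: List[Sequence[float]]) -> List[int]:
    """Return the indices of candidates that no other candidate dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical (objective_1, objective_2) scores for five design candidates.
candidates = [(0.90, 0.40), (0.70, 0.80), (0.60, 0.60), (0.95, 0.30), (0.50, 0.90)]
front = pareto_front(candidates)
print(front)
```

Candidates on the front represent different tradeoffs between the two objectives; a candidate is dropped only when some other candidate is at least as good on both.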
Affiliation(s)
- Lu Hong
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Tanja Kortemme
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA
- Quantitative Biosciences Institute, University of California, San Francisco, San Francisco, CA 94158, USA
- Chan Zuckerberg Biohub, San Francisco, CA 94158, USA
23
Du J, Kong Y, Wen Y, Shen E, Xing H. HUH Endonuclease: A Sequence-specific Fusion Protein Tag for Precise DNA-Protein Conjugation. Bioorg Chem 2024; 144:107118. [PMID: 38330720 DOI: 10.1016/j.bioorg.2024.107118] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2023] [Revised: 01/01/2024] [Accepted: 01/09/2024] [Indexed: 02/10/2024]
Abstract
Synthetic DNA-protein conjugates have found widespread applications in diagnostics and therapeutics, prompting a growing interest in developing chemical biology methodologies for the precise and site-specific preparation of covalent DNA-protein conjugates. In this review article, we concentrate on techniques to achieve precise control over the structural and site-specific aspects of DNA-protein conjugates. We summarize conventional methods involving unnatural amino acids and self-labeling proteins, accompanied by a discussion of their potential limitations. Our primary focus is on introducing HUH endonucleases as a new generation of fusion protein tags for DNA-protein conjugate preparation. The detailed conjugation mechanisms and structures of representative endonucleases are surveyed, showcasing their advantages as fusion protein tags: sequence selectivity, biological orthogonality, and no requirement for DNA modification. Additionally, we present the burgeoning applications of HUH-tag-based DNA-protein conjugates in protein assembly, biosensing, and gene editing. Furthermore, we delve into the future research directions of the HUH-tag, highlighting its significant potential for applications in the biomedical and DNA nanotechnology fields.
Affiliation(s)
- Jiajun Du
- Institute of Chemical Biology and Nanomedicine, State Key Laboratory of Chemo/Biosensing and Chemometrics, Hunan Provincial Key Laboratory of Biomacromolecular Chemical Biology, School of Chemistry and Chemical Engineering Hunan University Changsha, Hunan 410082, PR China
| | - Yuhan Kong
- Institute of Chemical Biology and Nanomedicine, State Key Laboratory of Chemo/Biosensing and Chemometrics, Hunan Provincial Key Laboratory of Biomacromolecular Chemical Biology, School of Chemistry and Chemical Engineering Hunan University Changsha, Hunan 410082, PR China
| | - Yujian Wen
- Institute of Chemical Biology and Nanomedicine, State Key Laboratory of Chemo/Biosensing and Chemometrics, Hunan Provincial Key Laboratory of Biomacromolecular Chemical Biology, School of Chemistry and Chemical Engineering Hunan University Changsha, Hunan 410082, PR China
| | - Enxi Shen
- Institute of Chemical Biology and Nanomedicine, State Key Laboratory of Chemo/Biosensing and Chemometrics, Hunan Provincial Key Laboratory of Biomacromolecular Chemical Biology, School of Chemistry and Chemical Engineering Hunan University Changsha, Hunan 410082, PR China
| | - Hang Xing
- Institute of Chemical Biology and Nanomedicine, State Key Laboratory of Chemo/Biosensing and Chemometrics, Hunan Provincial Key Laboratory of Biomacromolecular Chemical Biology, School of Chemistry and Chemical Engineering Hunan University Changsha, Hunan 410082, PR China.
24
Chao Y, Han Y, Chen Z, Chu D, Xu Q, Wallace G, Wang C. Multiscale Structural Design of 2D Nanomaterials-based Flexible Electrodes for Wearable Energy Storage Applications. Adv Sci (Weinh) 2024; 11:e2305558. [PMID: 38115755 PMCID: PMC10916616 DOI: 10.1002/advs.202305558] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2023] [Revised: 11/22/2023] [Indexed: 12/21/2023]
Abstract
2D nanomaterials play a critical role in realizing high-performance flexible electrodes for wearable energy storage devices, owing to their large surface area, high conductivity, and high strength. The electrode is a complex system whose performance is determined by multiple interrelated factors, including the intrinsic properties of the materials and the structures at different scales, from macroscale to atomic scale. Multiscale design strategies have been developed to engineer these structures so as to exploit the full potential and mitigate the drawbacks of 2D materials. Analyzing the design strategies and understanding the working mechanisms are essential to facilitate their integration and harvest the synergistic effects. This review summarizes the multiscale design strategies, from macroscale down to micro/nano-scale and atomic-scale structures, for developing 2D nanomaterials-based flexible electrodes. It starts with a brief introduction to 2D nanomaterials, followed by an analysis of structural design strategies at different scales that focuses on elucidating the structure-property relationship, and ends with a presentation of challenges and future prospects. This review highlights the importance of integrating multiscale design strategies. Findings from this review may deepen the understanding of electrode performance and provide valuable guidelines for designing 2D nanomaterials-based flexible electrodes.
Affiliation(s)
- Yunfeng Chao
- Henan Institute of Advanced Technology, Zhengzhou University, Zhengzhou 450052, China
- Intelligent Polymer Research Institute, ARC Centre of Excellence for Electromaterials Science, AIIM Facility, Innovation Campus, University of Wollongong, Wollongong, NSW 2522, Australia
| | - Yan Han
- Energy & Materials Engineering Centre, College of Physics and Materials Science, Tianjin Normal University, Tianjin 300387, China
| | - Zhiqi Chen
- Intelligent Polymer Research Institute, ARC Centre of Excellence for Electromaterials Science, AIIM Facility, Innovation Campus, University of Wollongong, Wollongong, NSW 2522, Australia
| | - Dewei Chu
- School of Materials Science and Engineering, The University of New South Wales, Sydney, NSW 2052, Australia
| | - Qun Xu
- Henan Institute of Advanced Technology, Zhengzhou University, Zhengzhou 450052, China
| | - Gordon Wallace
- Intelligent Polymer Research Institute, ARC Centre of Excellence for Electromaterials Science, AIIM Facility, Innovation Campus, University of Wollongong, Wollongong, NSW 2522, Australia
| | - Caiyun Wang
- Intelligent Polymer Research Institute, ARC Centre of Excellence for Electromaterials Science, AIIM Facility, Innovation Campus, University of Wollongong, Wollongong, NSW 2522, Australia
25
Li M, Li J, Liu K, Zhang H. Artificial structural proteins: Synthesis, assembly and material applications. Bioorg Chem 2024; 144:107162. [PMID: 38308999 DOI: 10.1016/j.bioorg.2024.107162] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Revised: 01/14/2024] [Accepted: 01/27/2024] [Indexed: 02/05/2024]
Abstract
Structural proteins have evolved over billions of years and offer outstanding mechanical properties, such as resilience, toughness and stiffness. Advances in modular protein engineering, polypeptide modification, and synthetic biology have led to the development of novel biomimetic structural proteins for use in biomedical and military fields. However, the development of customized structural proteins and assemblies with superior performance remains a major challenge, owing to the inherent limitations of biosynthesis, the difficulty of mimicking complex macroscale assembly, and other factors. This review summarizes the approaches for the design and production of biomimetic structural proteins, and their chemical modifications for multiscale assembly. Furthermore, we discuss the function tailoring and current applications of biomimetic structural protein assemblies. A perspective for future research is to reveal how the mechanical properties are encoded in the sequences and conformations. This review, therefore, provides an important reference for the development of structural protein mimetics, from replicating nature to even outperforming it.
Affiliation(s)
- Ming Li
- State Key Laboratory of Rare Earth Resource Utilization, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun 130022, China; School of Applied Chemistry and Engineering, University of Science and Technology of China, Hefei 230026, China
| | - Jingjing Li
- State Key Laboratory of Rare Earth Resource Utilization, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun 130022, China.
| | - Kai Liu
- State Key Laboratory of Rare Earth Resource Utilization, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun 130022, China; School of Applied Chemistry and Engineering, University of Science and Technology of China, Hefei 230026, China; Engineering Research Center of Advanced Rare Earth Materials, Ministry of Education, Department of Chemistry, Tsinghua University, Beijing 100084, China
| | - Hongjie Zhang
- State Key Laboratory of Rare Earth Resource Utilization, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun 130022, China; School of Applied Chemistry and Engineering, University of Science and Technology of China, Hefei 230026, China; Engineering Research Center of Advanced Rare Earth Materials, Ministry of Education, Department of Chemistry, Tsinghua University, Beijing 100084, China
26
Yang J, Li FZ, Arnold FH. Opportunities and Challenges for Machine Learning-Assisted Enzyme Engineering. ACS Cent Sci 2024; 10:226-241. [PMID: 38435522 PMCID: PMC10906252 DOI: 10.1021/acscentsci.3c01275] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Revised: 12/26/2023] [Accepted: 01/16/2024] [Indexed: 03/05/2024]
Abstract
Enzymes can be engineered at the level of their amino acid sequences to optimize key properties such as expression, stability, substrate range, and catalytic efficiency, or even to unlock new catalytic activities not found in nature. Because the search space of possible proteins is vast, enzyme engineering usually involves discovering an enzyme starting point that has some level of the desired activity, followed by directed evolution to improve its "fitness" for a desired application. Recently, machine learning (ML) has emerged as a powerful tool to complement this empirical process. ML models can contribute to (1) starting point discovery by functional annotation of known protein sequences or generating novel protein sequences with desired functions and (2) navigating protein fitness landscapes for fitness optimization by learning mappings between protein sequences and their associated fitness values. In this Outlook, we explain how ML complements enzyme engineering and discuss its future potential to unlock improved engineering outcomes.
Affiliation(s)
- Jason Yang
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States
| | - Francesca-Zhoufan Li
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California 91125, United States
| | - Frances H. Arnold
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California 91125, United States
27
Nam K, Shao Y, Major DT, Wolf-Watz M. Perspectives on Computational Enzyme Modeling: From Mechanisms to Design and Drug Development. ACS Omega 2024; 9:7393-7412. [PMID: 38405524 PMCID: PMC10883025 DOI: 10.1021/acsomega.3c09084] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2023] [Revised: 01/15/2024] [Accepted: 01/19/2024] [Indexed: 02/27/2024]
Abstract
Understanding enzyme mechanisms is essential for unraveling the complex molecular machinery of life. In this review, we survey the field of computational enzymology, highlighting key principles governing enzyme mechanisms and discussing ongoing challenges and promising advances. Over the years, computer simulations have become indispensable in the study of enzyme mechanisms, with the integration of experimental and computational exploration now established as a holistic approach to gain deep insights into enzymatic catalysis. Numerous studies have demonstrated the power of computer simulations in characterizing reaction pathways, transition states, substrate selectivity, product distribution, and dynamic conformational changes for various enzymes. Nevertheless, significant challenges remain in investigating the mechanisms of complex multistep reactions, large-scale conformational changes, and allosteric regulation. Beyond mechanistic studies, computational enzyme modeling has emerged as an essential tool for computer-aided enzyme design and the rational discovery of covalent drugs for targeted therapies. Overall, enzyme design/engineering and covalent drug development can greatly benefit from our understanding of the detailed mechanisms of enzymes, such as protein dynamics, entropy contributions, and allostery, as revealed by computational studies. Such a convergence of different research approaches is expected to continue, creating synergies in enzyme research. This review, by outlining the ever-expanding field of enzyme research, aims to provide guidance for future research directions and facilitate new developments in this important and evolving field.
Affiliation(s)
- Kwangho Nam
- Department of Chemistry and Biochemistry, University of Texas at Arlington, Arlington, Texas 76019, United States
| | - Yihan Shao
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, Oklahoma 73019-5251, United States
| | - Dan T. Major
- Department of Chemistry and Institute for Nanotechnology & Advanced Materials, Bar-Ilan University, Ramat-Gan 52900, Israel
28
Zheng W, Xu YF, Hu ZM, Li K, Xu ZQ, Sun JL, Wei JF. Artificial intelligence-driven design of the assembled major cat allergen Fel d 1 to improve its spatial folding and IgE-reactivity. Int Immunopharmacol 2024; 128:111488. [PMID: 38185034 DOI: 10.1016/j.intimp.2024.111488] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2023] [Revised: 12/31/2023] [Accepted: 01/02/2024] [Indexed: 01/09/2024]
Abstract
BACKGROUND: Cat-derived allergens are considered one of the most common causes of allergic diseases worldwide. Fel d 1 is a major cat allergen and plays an important role in immunoglobulin E (IgE)-reaction diagnosis. However, the two separate chains of Fel d 1 exhibit lower IgE-reactivity than the complete, assembled molecule, which makes Fel d 1 difficult to prepare efficiently and limits its application in molecular diagnosis of cat allergy.
METHODS: We first applied the artificial intelligence (AI)-based tool AlphaFold2 to build the 3-dimensional structures of Fel d 1 with different connection modes between the two chains; the structures were evaluated with the ERRAT program, and the constructs were expressed in Escherichia coli. We then calculated the ratio of soluble to inclusion-body expression for the optimized Fel d 1. Circular dichroism (CD), high-performance liquid chromatography-size exclusion chromatography (HPLC-SEC), and reducing/non-reducing SDS-PAGE were performed to characterize the folding status and dimerization of the optimized fusion Fel d 1. The improvement in specific IgE reactivity to the optimized fusion Fel d 1 was investigated by enzyme-linked immunosorbent assay (ELISA).
RESULTS: Among several linkers, 2 × GGGGS received the highest scores, with an overall quality factor of 100. The error value of the residues around the 2 × GGGGS junction was lower than for the other linkers. This construct also yielded a higher proportion of soluble protein than the other Fel d 1 constructs evaluated with ERRAT (GGGGS, KK, and direct fusion Fel d 1). CD and HPLC-SEC showed consistent folding and dimerization of the two fused subunits between the optimized fusion Fel d 1 and the previously well-defined direct fusion Fel d 1. The overall IgE-binding absorbance of the optimized fusion Fel d 1, measured by ELISA, was improved compared with that of the direct fusion Fel d 1.
CONCLUSION: We provide, for the first time, an AI-based design strategy to optimize Fel d 1, which could spontaneously fold into its native-like structure without an additional refolding process or eukaryotic folding factors. The improved IgE-binding activity and simplified preparation method could greatly facilitate its use as a robust allergen material for molecular diagnosis of cat allergy.
Affiliation(s)
- Wei Zheng
- Department of Pharmacy, Jiangsu Cancer Hospital & Jiangsu Institute of Cancer Research & The Affiliated Cancer Hospital of Nanjing Medical University, Nanjing, China
| | - Yi-Fei Xu
- Department of Pharmacy, Jiangsu Cancer Hospital & Jiangsu Institute of Cancer Research & The Affiliated Cancer Hospital of Nanjing Medical University, Nanjing, China
| | - Zhi-Ming Hu
- Department of Pharmacy, Jiangsu Cancer Hospital & Jiangsu Institute of Cancer Research & The Affiliated Cancer Hospital of Nanjing Medical University, Nanjing, China
| | - Ke Li
- Department of Pharmacy, Jiangsu Cancer Hospital & Jiangsu Institute of Cancer Research & The Affiliated Cancer Hospital of Nanjing Medical University, Nanjing, China
| | - Zhi-Qiang Xu
- Research Division of Clinical Pharmacology, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China; National Vaccine Innovation Platform, Nanjing Medical University, Nanjing 211166, China.
| | - Jin-Lyu Sun
- Department of Allergy, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China.
| | - Ji-Fu Wei
- Department of Pharmacy, Jiangsu Cancer Hospital & Jiangsu Institute of Cancer Research & The Affiliated Cancer Hospital of Nanjing Medical University, Nanjing, China; Research Division of Clinical Pharmacology, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China; National Vaccine Innovation Platform, Nanjing Medical University, Nanjing 211166, China.
29
Pan X, Li Y, Huang P, Staecker H, He M. Extracellular vesicles for developing targeted hearing loss therapy. J Control Release 2024; 366:460-478. [PMID: 38182057 DOI: 10.1016/j.jconrel.2023.12.050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2023] [Revised: 12/19/2023] [Accepted: 12/28/2023] [Indexed: 01/07/2024]
Abstract
Substantial efforts have been made toward local administration of small molecules or biologics for treating hearing loss caused by trauma, genetic mutations, or drug ototoxicity. Recently, extracellular vesicles (EVs) naturally secreted from cells have drawn increasing attention for attenuating hearing impairment in both preclinical and clinical studies. The rapidly emerging field that uses diverse bioengineering technologies to develop EVs as bioderived therapeutic materials, along with artificial intelligence (AI)-based targeting toolkits, sheds light on the unique properties of EVs specific to inner ear delivery. This review covers this research field from the fundamentals of the hearing-protective functions of EVs to biotechnology advancements and the potential clinical translation of functionalized EVs. Specifically, the advancements in assessing targeting ligands using AI algorithms are systematically discussed. The overall translational potential of EVs is reviewed in the context of the auditory sensing system for developing next-generation gene therapy.
Affiliation(s)
- Xiaoshu Pan
- Department of Pharmaceutics, College of Pharmacy, University of Florida, Gainesville, Florida 32610, United States
| | - Yanjun Li
- Department of Medicinal Chemistry, Center for Natural Products, Drug Discovery and Development, University of Florida, Gainesville, Florida 32610, United States
| | - Peixin Huang
- Department of Otolaryngology, Head and Neck Surgery, University of Kansas School of Medicine, Kansas City, Kansas 66160, United States
| | - Hinrich Staecker
- Department of Otolaryngology, Head and Neck Surgery, University of Kansas School of Medicine, Kansas City, Kansas 66160, United States.
| | - Mei He
- Department of Pharmaceutics, College of Pharmacy, University of Florida, Gainesville, Florida 32610, United States.
30
Affiliation(s)
- Chloe Hsu
- University of California, Berkeley, Berkeley, CA, USA.
31
Notin P, Rollins N, Gal Y, Sander C, Marks D. Machine learning for functional protein design. Nat Biotechnol 2024; 42:216-228. [PMID: 38361074 DOI: 10.1038/s41587-024-02127-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/01/2023] [Accepted: 01/05/2024] [Indexed: 02/17/2024]
Abstract
Recent breakthroughs in AI coupled with the rapid accumulation of protein sequence and structure data have radically transformed computational protein design. New methods promise to escape the constraints of natural and laboratory evolution, accelerating the generation of proteins for applications in biotechnology and medicine. To make sense of the exploding diversity of machine learning approaches, we introduce a unifying framework that classifies models on the basis of their use of three core data modalities: sequences, structures and functional labels. We discuss the new capabilities and outstanding challenges for the practical design of enzymes, antibodies, vaccines, nanomachines and more. We then highlight trends shaping the future of this field, from large-scale assays to more robust benchmarks, multimodal foundation models, enhanced sampling strategies and laboratory automation.
Affiliation(s)
- Pascal Notin
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA.
- Department of Computer Science, University of Oxford, Oxford, UK.
| | | | - Yarin Gal
- Department of Computer Science, University of Oxford, Oxford, UK
| | - Chris Sander
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Broad Institute of Harvard and MIT, Cambridge, MA, USA
| | - Debora Marks
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA.
- Broad Institute of Harvard and MIT, Cambridge, MA, USA.
32
Chu AE, Lu T, Huang PS. Sparks of function by de novo protein design. Nat Biotechnol 2024; 42:203-215. [PMID: 38361073 DOI: 10.1038/s41587-024-02133-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/05/2023] [Accepted: 01/09/2024] [Indexed: 02/17/2024]
Abstract
Information in proteins flows from sequence to structure to function, with each step causally driven by the preceding one. Protein design is founded on inverting this process: specify a desired function, design a structure executing this function, and find a sequence that folds into this structure. This 'central dogma' underlies nearly all de novo protein-design efforts. Our ability to accomplish these tasks depends on our understanding of protein folding and function and our ability to capture this understanding in computational methods. In recent years, deep learning-derived approaches for efficient and accurate structure modeling and enrichment of successful designs have enabled progression beyond the design of protein structures and towards the design of functional proteins. We examine these advances in the broader context of classical de novo protein design and consider implications for future challenges to come, including fundamental capabilities such as sequence and structure co-design and conformational control considering flexibility, and functional objectives such as antibody and enzyme design.
Affiliation(s)
- Alexander E Chu
- Biophysics Program, Stanford University, Palo Alto, CA, USA
- Department of Bioengineering, Stanford University, Palo Alto, CA, USA
- Google DeepMind, London, UK
| | - Tianyu Lu
- Department of Bioengineering, Stanford University, Palo Alto, CA, USA
| | - Po-Ssu Huang
- Biophysics Program, Stanford University, Palo Alto, CA, USA.
- Department of Bioengineering, Stanford University, Palo Alto, CA, USA.
33
Roberts JB, Nava AA, Pearson AN, Incha MR, Valencia LE, Ma M, Rao A, Keasling JD. Foldy: An open-source web application for interactive protein structure analysis. PLoS Comput Biol 2024; 20:e1011171. [PMID: 38306398 PMCID: PMC10866462 DOI: 10.1371/journal.pcbi.1011171] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2023] [Revised: 02/14/2024] [Accepted: 01/05/2024] [Indexed: 02/04/2024] Open
Abstract
Foldy is a cloud-based application that allows non-computational biologists to easily utilize advanced AI-based structural biology tools, including AlphaFold and DiffDock. With many deployment options, it can be employed by individuals, labs, universities, and companies in the cloud without requiring hardware resources, but it can also be configured to utilize locally available computers. Foldy enables scientists to predict the structure of proteins and complexes up to 6000 amino acids with AlphaFold, visualize Pfam annotations, and dock ligands with AutoDock Vina and DiffDock. In our manuscript, we detail Foldy's interface design, deployment strategies, and optimization for various user scenarios. We demonstrate its application through case studies including rational enzyme design and analyzing proteins with domains of unknown function. Furthermore, we compare Foldy's interface and management capabilities with other open and closed source tools in the field, illustrating its practicality in managing complex data and computation tasks. Our manuscript underlines the benefits of Foldy as a day-to-day tool for life science researchers, and shows how Foldy can make modern tools more accessible and efficient.
Affiliation(s)
- Jacob B. Roberts
- Joint BioEnergy Institute, Lawrence Berkeley National Laboratory, Emeryville, California, United States of America
- Biological Systems and Engineering, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
- Department of Bioengineering, University of California, Berkeley, Berkeley, California, United States of America
| | - Alberto A. Nava
- Joint BioEnergy Institute, Lawrence Berkeley National Laboratory, Emeryville, California, United States of America
- Biological Systems and Engineering, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
- Department of Chemical and Biomolecular Engineering, University of California, Berkeley, Berkeley, California, United States of America
| | - Allison N. Pearson
- Joint BioEnergy Institute, Lawrence Berkeley National Laboratory, Emeryville, California, United States of America
- Biological Systems and Engineering, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
- Department of Plant and Microbial Biology, University of California, Berkeley, Berkeley, California, United States of America
| | - Matthew R. Incha
- Joint BioEnergy Institute, Lawrence Berkeley National Laboratory, Emeryville, California, United States of America
- Biological Systems and Engineering, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
- Department of Plant and Microbial Biology, University of California, Berkeley, Berkeley, California, United States of America
| | - Luis E. Valencia
- Joint BioEnergy Institute, Lawrence Berkeley National Laboratory, Emeryville, California, United States of America
- Biological Systems and Engineering, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
- Department of Chemical and Biomolecular Engineering, University of California, Berkeley, Berkeley, California, United States of America
| | - Melody Ma
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, California, United States of America
| | - Abhay Rao
- Department of Bioengineering, University of California, Berkeley, Berkeley, California, United States of America
| | - Jay D. Keasling
- Joint BioEnergy Institute, Lawrence Berkeley National Laboratory, Emeryville, California, United States of America
- Biological Systems and Engineering, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
- Department of Bioengineering, University of California, Berkeley, Berkeley, California, United States of America
- Department of Chemical and Biomolecular Engineering, University of California, Berkeley, Berkeley, California, United States of America
- Center for Synthetic Biochemistry, Shenzhen Institutes for Advanced Technologies, Shenzhen, People’s Republic of China
- The Novo Nordisk Foundation Center for Biosustainability, Technical University Denmark, Kemitorvet, Denmark
|
34
|
Fannjiang C, Listgarten J. Is Novelty Predictable? Cold Spring Harb Perspect Biol 2024; 16:a041469. [PMID: 38052497 PMCID: PMC10835614 DOI: 10.1101/cshperspect.a041469] [Indexed: 12/07/2023]
Abstract
Machine learning-based design has gained traction in the sciences, most notably in the design of small molecules, materials, and proteins, with societal applications ranging from drug development and plastic degradation to carbon sequestration. When designing objects to achieve novel property values with machine learning, one faces a fundamental challenge: how to push past the frontier of current knowledge, distilled from the training data into the model, in a manner that rationally controls the risk of failure. If one trusts learned models too much in extrapolation, one is likely to design rubbish. In contrast, if one does not extrapolate, one cannot find novelty. Herein, we ponder how one might strike a useful balance between these two extremes. We focus in particular on designing proteins with novel property values, although much of our discussion is relevant to machine learning-based design more broadly.
Affiliation(s)
- Clara Fannjiang
- Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, California 94720, USA
| | - Jennifer Listgarten
- Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, California 94720, USA
|
35
|
Kortemme T. De novo protein design-From new structures to programmable functions. Cell 2024; 187:526-544. [PMID: 38306980 PMCID: PMC10990048 DOI: 10.1016/j.cell.2023.12.028] [Received: 10/27/2023] [Revised: 12/03/2023] [Accepted: 12/19/2023] [Indexed: 02/04/2024]
Abstract
Methods from artificial intelligence (AI) trained on large datasets of sequences and structures can now "write" proteins with new shapes and molecular functions de novo, without starting from proteins found in nature. In this Perspective, I will discuss the state of the field of de novo protein design at the juncture of physics-based modeling approaches and AI. New protein folds and higher-order assemblies can be designed with considerable experimental success rates, and difficult problems requiring tunable control over protein conformations and precise shape complementarity for molecular recognition are coming into reach. Emerging approaches incorporate engineering principles-tunability, controllability, and modularity-into the design process from the beginning. Exciting frontiers lie in deconstructing cellular functions with de novo proteins and, conversely, constructing synthetic cellular signaling from the ground up. As methods improve, many more challenges are unsolved.
Affiliation(s)
- Tanja Kortemme
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA; Quantitative Biosciences Institute, University of California, San Francisco, San Francisco, CA 94158, USA; Chan Zuckerberg Biohub, San Francisco, CA 94158, USA.
|
36
|
Yu J, Mu J, Wei T, Chen HF. Multi-indicator comparative evaluation for deep learning-based protein sequence design methods. Bioinformatics 2024; 40:btae037. [PMID: 38261649 PMCID: PMC10868333 DOI: 10.1093/bioinformatics/btae037] [Received: 10/24/2023] [Revised: 12/20/2023] [Accepted: 01/18/2024] [Indexed: 01/25/2024]
Abstract
MOTIVATION: Proteins found in nature represent only a fraction of the vast space of possible proteins. Protein design presents an opportunity to explore and expand this protein landscape. Within protein design, protein sequence design plays a crucial role, and numerous successful methods have been developed. Notably, deep learning-based protein sequence design methods have experienced significant advancements in recent years. However, a comprehensive and systematic comparison and evaluation of these methods has been lacking, with indicators provided by different methods often inconsistent or lacking effectiveness. RESULTS: To address this gap, we have designed a diverse set of indicators that cover several important aspects, including sequence recovery, diversity, root-mean-square deviation of protein structure, secondary structure, and the distribution of polar and nonpolar amino acids. In our evaluation, we have employed an improved weighted inferiority-superiority distance method to comprehensively assess the performance of eight widely used deep learning-based protein sequence design methods. Our evaluation not only provides rankings of these methods but also offers optimization suggestions by analyzing the strengths and weaknesses of each method. Furthermore, we have developed a method to select the best temperature parameter and proposed solutions for the common issue of designing sequences with consecutive repetitive amino acids, which is often encountered in protein design methods. These findings can greatly assist users in selecting suitable protein sequence design methods. Overall, our work contributes to the field of protein sequence design by providing a comprehensive evaluation system and optimization suggestions for different methods.
Affiliation(s)
- Jinyu Yu
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, Department of Bioinformatics and Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Center for Systems Biomedicine, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Junxi Mu
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, Department of Bioinformatics and Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Center for Systems Biomedicine, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Ting Wei
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, Department of Bioinformatics and Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Center for Systems Biomedicine, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Hai-Feng Chen
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, Department of Bioinformatics and Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Center for Systems Biomedicine, Shanghai Jiao Tong University, Shanghai 200240, China
|
37
|
Vázquez Torres S, Leung PJY, Venkatesh P, Lutz ID, Hink F, Huynh HH, Becker J, Yeh AHW, Juergens D, Bennett NR, Hoofnagle AN, Huang E, MacCoss MJ, Expòsit M, Lee GR, Bera AK, Kang A, De La Cruz J, Levine PM, Li X, Lamb M, Gerben SR, Murray A, Heine P, Korkmaz EN, Nivala J, Stewart L, Watson JL, Rogers JM, Baker D. De novo design of high-affinity binders of bioactive helical peptides. Nature 2024; 626:435-442. [PMID: 38109936 PMCID: PMC10849960 DOI: 10.1038/s41586-023-06953-1] [Received: 12/14/2022] [Accepted: 12/07/2023] [Indexed: 12/20/2023]
Abstract
Many peptide hormones form an α-helix on binding their receptors [1-4], and sensitive methods for their detection could contribute to better clinical management of disease [5]. De novo protein design can now generate binders with high affinity and specificity to structured proteins [6,7]. However, the design of interactions between proteins and short peptides with helical propensity is an unmet challenge. Here we describe parametric generation and deep learning-based methods for designing proteins to address this challenge. We show that by extending RFdiffusion [8] to enable binder design to flexible targets, and to refining input structure models by successive noising and denoising (partial diffusion), picomolar-affinity binders can be generated to helical peptide targets by either refining designs generated with other methods, or completely de novo starting from random noise distributions without any subsequent experimental optimization. The RFdiffusion designs enable the enrichment and subsequent detection of parathyroid hormone and glucagon by mass spectrometry, and the construction of bioluminescence-based protein biosensors. The ability to design binders to conformationally variable targets, and to optimize by partial diffusion both natural and designed proteins, should be broadly useful.
Affiliation(s)
- Susana Vázquez Torres
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Graduate Program in Biological Physics, Structure and Design, University of Washington, Seattle, WA, USA
| | - Philip J Y Leung
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Graduate Program in Molecular Engineering, University of Washington, Seattle, WA, USA
| | - Preetham Venkatesh
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Graduate Program in Biological Physics, Structure and Design, University of Washington, Seattle, WA, USA
| | - Isaac D Lutz
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Department of Bioengineering, University of Washington, Seattle, WA, USA
| | - Fabian Hink
- Department of Drug Design and Pharmacology, University of Copenhagen, Copenhagen, Denmark
| | - Huu-Hien Huynh
- Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA, USA
| | - Jessica Becker
- Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA, USA
| | - Andy Hsien-Wei Yeh
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - David Juergens
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Graduate Program in Molecular Engineering, University of Washington, Seattle, WA, USA
| | - Nathaniel R Bennett
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Graduate Program in Molecular Engineering, University of Washington, Seattle, WA, USA
| | - Andrew N Hoofnagle
- Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA, USA
| | - Eric Huang
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Michael J MacCoss
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Marc Expòsit
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Graduate Program in Molecular Engineering, University of Washington, Seattle, WA, USA
| | - Gyu Rie Lee
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Asim K Bera
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Alex Kang
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Joshmyn De La Cruz
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Paul M Levine
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Xinting Li
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Mila Lamb
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Stacey R Gerben
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Analisa Murray
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Piper Heine
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Elif Nihal Korkmaz
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Jeff Nivala
- School of Computer Science and Engineering, University of Washington, Seattle, WA, USA
- Molecular Engineering and Sciences Institute, University of Washington, Seattle, WA, USA
| | - Lance Stewart
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Joseph L Watson
- Department of Biochemistry, University of Washington, Seattle, WA, USA.
- Institute for Protein Design, University of Washington, Seattle, WA, USA.
| | - Joseph M Rogers
- Department of Drug Design and Pharmacology, University of Copenhagen, Copenhagen, Denmark.
| | - David Baker
- Department of Biochemistry, University of Washington, Seattle, WA, USA.
- Institute for Protein Design, University of Washington, Seattle, WA, USA.
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA.
|
38
|
Sumida K, Núñez-Franco R, Kalvet I, Pellock SJ, Wicky BIM, Milles LF, Dauparas J, Wang J, Kipnis Y, Jameson N, Kang A, De La Cruz J, Sankaran B, Bera AK, Jiménez-Osés G, Baker D. Improving Protein Expression, Stability, and Function with ProteinMPNN. J Am Chem Soc 2024; 146:2054-2061. [PMID: 38194293 PMCID: PMC10811672 DOI: 10.1021/jacs.3c10941] [Received: 10/04/2023] [Revised: 12/03/2023] [Accepted: 12/05/2023] [Indexed: 01/10/2024]
Abstract
Natural proteins are highly optimized for function but are often difficult to produce at a scale suitable for biotechnological applications due to poor expression in heterologous systems, limited solubility, and sensitivity to temperature. Thus, a general method that improves the physical properties of native proteins while maintaining function could have wide utility for protein-based technologies. Here, we show that the deep neural network ProteinMPNN, together with evolutionary and structural information, provides a route to increasing protein expression, stability, and function. For both myoglobin and tobacco etch virus (TEV) protease, we generated designs with improved expression, elevated melting temperatures, and improved function. For TEV protease, we identified multiple designs with improved catalytic activity as compared to the parent sequence and previously reported TEV variants. Our approach should be broadly useful for improving the expression, stability, and function of biotechnologically important proteins.
Affiliation(s)
- Kiera H. Sumida
- Department of Chemistry, University of Washington, Seattle, Washington 98195, United States
- Institute for Protein Design, University of Washington, Seattle, Washington 98195, United States
| | - Reyes Núñez-Franco
- Center for Cooperative Research in Biosciences, Basque Research and Technology Alliance, Derio 48160, Spain
| | - Indrek Kalvet
- Institute for Protein Design, University of Washington, Seattle, Washington 98195, United States
- Department of Biochemistry, University of Washington, Seattle, Washington 98195, United States
- Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, United States
| | - Samuel J. Pellock
- Institute for Protein Design, University of Washington, Seattle, Washington 98195, United States
- Department of Biochemistry, University of Washington, Seattle, Washington 98195, United States
| | - Basile I. M. Wicky
- Institute for Protein Design, University of Washington, Seattle, Washington 98195, United States
- Department of Biochemistry, University of Washington, Seattle, Washington 98195, United States
| | - Lukas F. Milles
- Institute for Protein Design, University of Washington, Seattle, Washington 98195, United States
- Department of Biochemistry, University of Washington, Seattle, Washington 98195, United States
| | - Justas Dauparas
- Institute for Protein Design, University of Washington, Seattle, Washington 98195, United States
- Department of Biochemistry, University of Washington, Seattle, Washington 98195, United States
| | - Jue Wang
- Institute for Protein Design, University of Washington, Seattle, Washington 98195, United States
- Department of Biochemistry, University of Washington, Seattle, Washington 98195, United States
| | - Yakov Kipnis
- Institute for Protein Design, University of Washington, Seattle, Washington 98195, United States
- Department of Biochemistry, University of Washington, Seattle, Washington 98195, United States
- Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, United States
| | - Noel Jameson
- Department of Chemistry, University of Washington, Seattle, Washington 98195, United States
| | - Alex Kang
- Institute for Protein Design, University of Washington, Seattle, Washington 98195, United States
| | - Joshmyn De La Cruz
- Institute for Protein Design, University of Washington, Seattle, Washington 98195, United States
| | - Banumathi Sankaran
- Berkeley Center for Structural Biology, Molecular Biophysics, and Integrated Bioimaging, Lawrence Berkeley Laboratory, Berkeley, California 94720, United States
| | - Asim K. Bera
- Institute for Protein Design, University of Washington, Seattle, Washington 98195, United States
- Department of Biochemistry, University of Washington, Seattle, Washington 98195, United States
| | - Gonzalo Jiménez-Osés
- Center for Cooperative Research in Biosciences, Basque Research and Technology Alliance, Derio 48160, Spain
- Ikerbasque, Basque Foundation for Science, Bilbao 48013, Spain
| | - David Baker
- Institute for Protein Design, University of Washington, Seattle, Washington 98195, United States
- Department of Biochemistry, University of Washington, Seattle, Washington 98195, United States
- Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, United States
|
39
|
Min J, Rong X, Zhang J, Su R, Wang Y, Qi W. Computational Design of Peptide Assemblies. J Chem Theory Comput 2024; 20:532-550. [PMID: 38206800 DOI: 10.1021/acs.jctc.3c01054] [Indexed: 01/13/2024]
Abstract
With the ongoing development of peptide self-assembling materials, there is growing interest in exploring novel functional peptide sequences. From short peptides to long polypeptides, as the functionality increases, the sequence space is also expanding exponentially. Consequently, attempting to explore all functional sequences comprehensively through experience and experiments alone has become impractical. By utilizing computational methods, especially artificial intelligence enhanced molecular dynamics (MD) simulation and de novo peptide design, there has been a significant expansion in the exploration of sequence space. Through these methods, a variety of supramolecular functional materials, including fibers, two-dimensional arrays, nanocages, etc., have been designed by meticulously controlling the inter- and intramolecular interactions. In this review, we first provide a brief overview of the current main computational methods and then focus on the computational design methods for various self-assembled peptide materials. Additionally, we introduce some representative protein self-assemblies to offer guidance for the design of self-assembling peptides.
Affiliation(s)
- Jiwei Min
- State Key Laboratory of Chemical Engineering, School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, P. R. China
| | - Xi Rong
- State Key Laboratory of Chemical Engineering, School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, P. R. China
| | - Jiaxing Zhang
- State Key Laboratory of Chemical Engineering, School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, P. R. China
| | - Rongxin Su
- State Key Laboratory of Chemical Engineering, School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, P. R. China
- Collaborative Innovation Center of Chemical Science and Engineering (Tianjin), Tianjin 300072, P. R. China
- Tianjin Key Laboratory of Membrane Science and Desalination Technology, Tianjin 300072, P. R. China
| | - Yuefei Wang
- State Key Laboratory of Chemical Engineering, School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, P. R. China
- Tianjin Key Laboratory of Membrane Science and Desalination Technology, Tianjin 300072, P. R. China
| | - Wei Qi
- State Key Laboratory of Chemical Engineering, School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, P. R. China
- Collaborative Innovation Center of Chemical Science and Engineering (Tianjin), Tianjin 300072, P. R. China
- Tianjin Key Laboratory of Membrane Science and Desalination Technology, Tianjin 300072, P. R. China
|
40
|
Hayes RL, Nixon CF, Marqusee S, Brooks CL. Selection pressures on evolution of ribonuclease H explored with rigorous free-energy-based design. Proc Natl Acad Sci U S A 2024; 121:e2312029121. [PMID: 38194446 PMCID: PMC10801872 DOI: 10.1073/pnas.2312029121] [Received: 07/14/2023] [Accepted: 11/22/2023] [Indexed: 01/11/2024]
Abstract
Understanding natural protein evolution and designing novel proteins are motivating interest in development of high-throughput methods to explore large sequence spaces. In this work, we demonstrate the application of multisite λ dynamics (MSλD), a rigorous free energy simulation method, and chemical denaturation experiments to quantify evolutionary selection pressure from sequence-stability relationships and to address questions of design. This study examines a mesophilic phylogenetic clade of ribonuclease H (RNase H), furthering its extensive characterization in earlier studies, focusing on E. coli RNase H (ecRNH) and a more stable consensus sequence (AncCcons) differing at 15 positions. The stabilities of 32,768 chimeras between these two sequences were computed using the MSλD framework. The most stable and least stable chimeras were predicted and tested along with several other sequences, revealing a designed chimera with approximately the same stability increase as AncCcons, but requiring only half the mutations. Comparing the computed stabilities with experiment for 12 sequences reveals a Pearson correlation of 0.86 and root mean squared error of 1.18 kcal/mol, an unprecedented level of accuracy well beyond less rigorous computational design methods. We then quantified selection pressure using a simple evolutionary model in which sequences are selected according to the Boltzmann factor of their stability. Selection temperatures from 110 to 168 K are estimated in three ways by comparing experimental and computational results to evolutionary models. These estimates indicate selection pressure is high, which has implications for evolutionary dynamics and for the accuracy required for design, and suggests accurate high-throughput computational methods like MSλD may enable more effective protein design.
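Two quantitative ideas in this abstract can be illustrated with a minimal sketch. It is not the authors' MSλD workflow: the function names, the example stability values, and the sign convention (a larger value means a more stable sequence) are assumptions made purely for illustration. The sketch shows why two sequences differing at 15 positions yield 2^15 = 32,768 chimeras, and how a Boltzmann factor at a given selection temperature weights sequences by stability.

```python
import math

# Gas constant in kcal/(mol*K); the per-mole equivalent of k_B.
R_KCAL = 0.0019872


def count_chimeras(n_differing_positions: int) -> int:
    """Each differing position takes its residue from either parent."""
    return 2 ** n_differing_positions


def boltzmann_weights(stabilities_kcal, temperature_k):
    """Normalized weights proportional to exp(stability / (R * T)).

    Assumed convention (hypothetical): larger value = more stable.
    """
    beta = 1.0 / (R_KCAL * temperature_k)
    top = max(stabilities_kcal)  # shift by the maximum for numerical safety
    unnormalized = [math.exp(beta * (s - top)) for s in stabilities_kcal]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


if __name__ == "__main__":
    print(count_chimeras(15))  # 32768
    # Hypothetical stabilities (kcal/mol) for three sequences.
    example = [9.0, 10.0, 11.0]
    for t in (110.0, 168.0):
        print(t, [round(w, 3) for w in boltzmann_weights(example, t)])
```

In this toy model, a lower selection temperature concentrates the weight on the most stable sequences, which is the sense in which a low estimated temperature corresponds to strong selection pressure.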
Affiliation(s)
- Ryan L. Hayes
- Department of Chemical and Biomolecular Engineering, University of California, Irvine, CA 92697
- Department of Chemistry, University of Michigan, Ann Arbor, MI 48109
| | - Charlotte F. Nixon
- Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720
| | - Susan Marqusee
- Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720
- California Institute for Quantitative Biosciences, University of California, Berkeley, CA 94720
- Department of Chemistry, University of California, Berkeley, CA 94720
| | - Charles L. Brooks
- Department of Chemistry, University of Michigan, Ann Arbor, MI 48109
- Biophysics Program, University of Michigan, Ann Arbor, MI 48109
|
41
|
Khalaf MNA, Soliman THA, Mohamed SS. PLM-GAN: A Large-Scale Protein Loop Modeling Using pix2pix GAN. ACS Omega 2024; 9:437-446. [PMID: 38222545 PMCID: PMC10785670 DOI: 10.1021/acsomega.3c05863] [Received: 08/09/2023] [Revised: 11/01/2023] [Accepted: 11/22/2023] [Indexed: 01/16/2024]
Abstract
Revealing the tertiary structure of proteins holds huge significance as it unveils their vital properties and functions. These intricate three-dimensional configurations comprise diverse interactions including ionic, hydrophobic, and disulfide forces. In certain instances, these structures exhibit missing regions, necessitating the reconstruction of specific segments, thereby resulting in challenges in protein design, which encompasses loop modeling, circular permutation, and interface prediction. To address this problem, we present two pioneering models: pix2pix generative adversarial network (GAN) and PLM-GAN. The pix2pix GAN model is adept at generating and inpainting distance matrices of protein structures, whereas the PLM-GAN model incorporates residual blocks into the U-Net network of the GAN, building upon the foundation of the pix2pix GAN model. To bolster the models' performance, we introduce a novel loss function named the "missing to real regions loss" (LMTR) within the GAN framework. Additionally, we introduce a distinctive approach of pairing two different distance matrices: one representing the native protein structure and the other representing the same structure with a missing region that undergoes changes in each successive epoch. Moreover, we extend the reconstruction of missing regions, encompassing up to 30 amino acids and increase the protein length by 128 amino acids. The evaluation of our pix2pix GAN and PLM-GAN models on a random selection of natural proteins (4ZCB, 3FJB, and 2REZ) demonstrated promising experimental results. Our models constitute significant contributions to addressing intricate challenges in protein structure design. These contributions hold immense potential to propel advancements in protein-protein interactions, drug design, and further innovations in protein engineering. 
Data, code, trained models, examples, and measurements are available on https://github.com/mena01/PLM-GAN-A-Large-Scale-Protein-Loop-Modeling-Using-pix2pix-GAN_.
Affiliation(s)
- Mena Nagy A Khalaf
- Information System Department, Faculty of Computer and Information, Assiut University, Assiut 71515, Egypt
| | - Taysir Hassan A Soliman
- Information System Department, Faculty of Computer and Information, Assiut University, Assiut 71515, Egypt
| | - Sara Salah Mohamed
- Information System Department, Faculty of Computer and Information, Assiut University, Assiut 71515, Egypt
- Mathematics and Computer Science Department, Faculty of Science, New Valley University, New Valley 71511, Egypt
|
42
|
Teng F, Cui T, Zhou L, Gao Q, Zhou Q, Li W. Programmable synthetic receptors: the next-generation of cell and gene therapies. Signal Transduct Target Ther 2024; 9:7. [PMID: 38167329 PMCID: PMC10761793 DOI: 10.1038/s41392-023-01680-5] [Received: 05/30/2023] [Revised: 09/22/2023] [Accepted: 10/11/2023] [Indexed: 01/05/2024]
Abstract
Cell and gene therapies hold tremendous promise for treating a range of difficult-to-treat diseases. However, concerns over their safety and efficacy need to be further addressed in order to realize their full potential. Synthetic receptors, a synthetic biology tool that can precisely control the function of therapeutic cells and genetic modules, have been rapidly developed and applied as a powerful solution. Delicately designed and engineered, they can be applied to fine-tune the therapeutic activities, i.e., to regulate production of dosed, bioactive payloads by sensing and processing user-defined signals or biomarkers. This review provides an overview of diverse synthetic receptor systems being used to reprogram therapeutic cells and their wide applications in biomedical research. With a special focus on four synthetic receptor systems at the forefront, including chimeric antigen receptors (CARs) and synthetic Notch (synNotch) receptors, we address the generalized strategies to design, construct and improve synthetic receptors. We also highlight the expanding landscape of therapeutic applications of the synthetic receptor systems as well as current challenges in their clinical translation.
Affiliation(s)
- Fei Teng
- University of Chinese Academy of Sciences, Beijing, 101408, China.
| | - Tongtong Cui
- State Key Laboratory of Stem Cell and Regenerative Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China
- Institute for Stem Cell and Regeneration, Chinese Academy of Sciences, Beijing, 100101, China
| | - Li Zhou
- University of Chinese Academy of Sciences, Beijing, 101408, China
- State Key Laboratory of Stem Cell and Regenerative Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China
- Institute for Stem Cell and Regeneration, Chinese Academy of Sciences, Beijing, 100101, China
| | - Qingqin Gao
- University of Chinese Academy of Sciences, Beijing, 101408, China
- State Key Laboratory of Stem Cell and Regenerative Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China
- Institute for Stem Cell and Regeneration, Chinese Academy of Sciences, Beijing, 100101, China
| | - Qi Zhou
- University of Chinese Academy of Sciences, Beijing, 101408, China.
- State Key Laboratory of Stem Cell and Regenerative Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China.
- Institute for Stem Cell and Regeneration, Chinese Academy of Sciences, Beijing, 100101, China.
- Beijing Institute for Stem Cell and Regenerative Medicine, Beijing, 100101, China.
| | - Wei Li
- University of Chinese Academy of Sciences, Beijing, 101408, China.
- State Key Laboratory of Stem Cell and Regenerative Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China.
- Institute for Stem Cell and Regeneration, Chinese Academy of Sciences, Beijing, 100101, China.
- Beijing Institute for Stem Cell and Regenerative Medicine, Beijing, 100101, China.
| |
|
43
|
Lee GY, Song J. Single missense mutations in Vi capsule synthesis genes confer hypervirulence to Salmonella Typhi. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2023.12.28.573590. [PMID: 38260632 PMCID: PMC10802248 DOI: 10.1101/2023.12.28.573590] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2024]
Abstract
Many bacterial pathogens, including the human-exclusive pathogen Salmonella Typhi, express capsular polysaccharides as a crucial virulence factor. Here, through S. Typhi whole genome sequence analyses and functional studies, we found a list of single point mutations that make S. Typhi hypervirulent. We discovered that a single point mutation in the Vi biosynthesis enzymes that control the length or acetylation of Vi is enough to create different capsule variants of S. Typhi. All variant strains are pathogenic, but the hyper-capsule variants are particularly hypervirulent, as demonstrated by the high morbidity and mortality rates observed in infected mice. The hypo-capsule variants have primarily been identified in Africa, whereas the hyper-capsule variants are distributed worldwide. Collectively, these studies increase awareness about the existence of different capsule variants of S. Typhi, establish a solid foundation for numerous future studies on S. Typhi capsule variants, and offer valuable insights into strategies to combat capsulated bacteria.
|
44
|
Chang L, Mondal A, Singh B, Martínez-Noa Y, Perez A. Revolutionizing Peptide-Based Drug Discovery: Advances in the Post-AlphaFold Era. WILEY INTERDISCIPLINARY REVIEWS. COMPUTATIONAL MOLECULAR SCIENCE 2024; 14:e1693. [PMID: 38680429 PMCID: PMC11052547 DOI: 10.1002/wcms.1693] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2023] [Accepted: 09/18/2023] [Indexed: 05/01/2024]
Abstract
Peptide-based drugs offer high specificity, potency, and selectivity. However, their inherent flexibility and differences in conformational preferences between their free and bound states create unique challenges that have hindered progress in effective drug discovery pipelines. The emergence of AlphaFold (AF) and Artificial Intelligence (AI) presents new opportunities for enhancing peptide-based drug discovery. We explore recent advancements that facilitate a successful peptide drug discovery pipeline, considering peptides' attractive therapeutic properties and strategies to enhance their stability and bioavailability. AF enables efficient and accurate prediction of peptide-protein structures, addressing a critical requirement in computational drug discovery pipelines. In the post-AF era, we are witnessing rapid progress with the potential to revolutionize peptide-based drug discovery, such as the ability to rank peptide binders or classify them as binders/non-binders and the ability to design novel peptide sequences. However, AI-based methods are struggling due to the lack of well-curated datasets, for example, to accommodate modified amino acids or unconventional cyclization. Thus, physics-based methods, such as docking or molecular dynamics (MD) simulations, continue to hold a complementary role in peptide drug discovery pipelines. Moreover, MD-based tools offer valuable insights into binding mechanisms, as well as the thermodynamic and kinetic properties of complexes. As we navigate this evolving landscape, a synergistic integration of AI and physics-based methods holds the promise of reshaping peptide-based drug discovery.
Affiliation(s)
- Liwei Chang
- Department of Chemistry, University of Florida, Gainesville, FL 32611
| | - Arup Mondal
- Department of Chemistry, University of Florida, Gainesville, FL 32611
| | - Bhumika Singh
- Department of Chemistry, University of Florida, Gainesville, FL 32611
| | | | - Alberto Perez
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL 32611
| |
|
45
|
Chen Z, Wu T, Yu S, Li M, Fan X, Huo YX. Self-assembly systems to troubleshoot metabolic engineering challenges. Trends Biotechnol 2024; 42:43-60. [PMID: 37451946 DOI: 10.1016/j.tibtech.2023.06.009] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/19/2023] [Revised: 06/18/2023] [Accepted: 06/23/2023] [Indexed: 07/18/2023]
Abstract
Enzyme self-assembly is a technology in which enzyme units can aggregate into ordered macromolecules, assisted by scaffolds. In metabolic engineering, self-assembly strategies have been explored for aggregating multiple enzymes in the same pathway to improve sequential catalytic efficiency, which in turn enables high-level production. The performance of the scaffolds is critical to the formation of an efficient and stable assembly system. This review comprehensively analyzes these scaffolds by exploring how they assemble, and it illustrates how to apply self-assembly strategies for different modules in metabolic engineering. Functional modifications to scaffolds will further promote efficient strategies for production.
Affiliation(s)
- Zhenya Chen
- Key Laboratory of Molecular Medicine and Biotherapy, School of Life Science, Beijing Institute of Technology, No. 5 South Zhongguancun Street, 100081, Beijing, China
| | - Tong Wu
- Key Laboratory of Molecular Medicine and Biotherapy, School of Life Science, Beijing Institute of Technology, No. 5 South Zhongguancun Street, 100081, Beijing, China
| | - Shengzhu Yu
- Key Laboratory of Molecular Medicine and Biotherapy, School of Life Science, Beijing Institute of Technology, No. 5 South Zhongguancun Street, 100081, Beijing, China
| | - Min Li
- Key Laboratory of Molecular Medicine and Biotherapy, School of Life Science, Beijing Institute of Technology, No. 5 South Zhongguancun Street, 100081, Beijing, China
| | - Xuanhe Fan
- Key Laboratory of Molecular Medicine and Biotherapy, School of Life Science, Beijing Institute of Technology, No. 5 South Zhongguancun Street, 100081, Beijing, China
| | - Yi-Xin Huo
- Key Laboratory of Molecular Medicine and Biotherapy, School of Life Science, Beijing Institute of Technology, No. 5 South Zhongguancun Street, 100081, Beijing, China.
| |
|
46
|
Zhong Y, Li Y, Chen Q, Ji S, Xu M, Liu Y, Wu X, Li S, Li K, Lu B. Catalytic efficiency and thermal stability promotion of the cassava linamarase with multiple mutations for better cyanogenic glycoside degradation. Int J Biol Macromol 2023; 253:126677. [PMID: 37717874 DOI: 10.1016/j.ijbiomac.2023.126677] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2023] [Revised: 08/22/2023] [Accepted: 09/01/2023] [Indexed: 09/19/2023]
Abstract
In our previous study, we found that cassava cyanogenic glycosides pose an acute health risk. Improving the specific degradation of cyanogenic glycosides by cassava linamarase during processing is therefore key to solving this problem. In this study, the catalytic activity and thermal stability of enzymes were screened before investigating the degradation efficiency of cyanogenic glycosides with a technique controlled by the cassava linamarase mutant K263P-T53F-S366R-V335C-F339C (CASmut). CASmut had an optimum temperature of 45 °C, an improvement of 10 °C. The specific activity of CASmut was 85.1 ± 4.6 U/mg, which was 2.02 times higher than that of the wild type. Molecular dynamics simulation analysis and flexible docking showed that there were more hydrogen bonding interactions at the pocket and that the aliphatic glycoside of linamarin was partially surrounded by hydrophobic residues. The optimum conditions for the degradation reaction were screened as a CASmut addition of 47 mg/L at 45 °C and pH 6.0. CASmut combined with ultrasonication improved the degradation from 478.2 ± 10.4 mg/kg to 86.7 ± 7.4 mg/kg. These results indicate the great potential of CASmut for application in cassava foods or other cyanogenic foods. However, challenges in researching the catalytic mechanism deserve attention in further studies.
Affiliation(s)
- Yongheng Zhong
- College of Biosystems Engineering and Food Science, Key Laboratory for Quality Evaluation and Health Benefit of Agro-Products of Ministry of Agriculture and Rural Affairs, Key Laboratory for Quality and Safety Risk Assessment of Agro-Products Storage and Preservation of Ministry of Agriculture and Rural Affairs, Zhejiang University, Hangzhou 310058, China
| | - Ye Li
- College of Biosystems Engineering and Food Science, Key Laboratory for Quality Evaluation and Health Benefit of Agro-Products of Ministry of Agriculture and Rural Affairs, Key Laboratory for Quality and Safety Risk Assessment of Agro-Products Storage and Preservation of Ministry of Agriculture and Rural Affairs, Zhejiang University, Hangzhou 310058, China
| | - Qi Chen
- College of Biosystems Engineering and Food Science, Key Laboratory for Quality Evaluation and Health Benefit of Agro-Products of Ministry of Agriculture and Rural Affairs, Key Laboratory for Quality and Safety Risk Assessment of Agro-Products Storage and Preservation of Ministry of Agriculture and Rural Affairs, Zhejiang University, Hangzhou 310058, China
| | - Shengyang Ji
- College of Biosystems Engineering and Food Science, Key Laboratory for Quality Evaluation and Health Benefit of Agro-Products of Ministry of Agriculture and Rural Affairs, Key Laboratory for Quality and Safety Risk Assessment of Agro-Products Storage and Preservation of Ministry of Agriculture and Rural Affairs, Zhejiang University, Hangzhou 310058, China
| | - Minhao Xu
- College of Biosystems Engineering and Food Science, Key Laboratory for Quality Evaluation and Health Benefit of Agro-Products of Ministry of Agriculture and Rural Affairs, Key Laboratory for Quality and Safety Risk Assessment of Agro-Products Storage and Preservation of Ministry of Agriculture and Rural Affairs, Zhejiang University, Hangzhou 310058, China
| | - Yuqi Liu
- College of Biosystems Engineering and Food Science, Key Laboratory for Quality Evaluation and Health Benefit of Agro-Products of Ministry of Agriculture and Rural Affairs, Key Laboratory for Quality and Safety Risk Assessment of Agro-Products Storage and Preservation of Ministry of Agriculture and Rural Affairs, Zhejiang University, Hangzhou 310058, China
| | - Xiaodan Wu
- Analysis Center of Agrobiology and Environmental Sciences, Zhejiang University, Hangzhou 310058, China
| | - Shimin Li
- Analysis Center of Agrobiology and Environmental Sciences, Zhejiang University, Hangzhou 310058, China
| | - Kaimian Li
- Tropical Crop Germplasm Research Institute, Chinese Academy of Tropical Agricultural Sciences, Danzhou 571737, China
| | - Baiyi Lu
- College of Biosystems Engineering and Food Science, Key Laboratory for Quality Evaluation and Health Benefit of Agro-Products of Ministry of Agriculture and Rural Affairs, Key Laboratory for Quality and Safety Risk Assessment of Agro-Products Storage and Preservation of Ministry of Agriculture and Rural Affairs, Zhejiang University, Hangzhou 310058, China.
| |
|
47
|
An L, Said M, Tran L, Majumder S, Goreshnik I, Lee GR, Juergens D, Dauparas J, Anishchenko I, Coventry B, Bera AK, Kang A, Levine PM, Alvarez V, Pillai A, Norn C, Feldman D, Zorine D, Hicks DR, Li X, Sanchez MG, Vafeados DK, Salveson PJ, Vorobieva AA, Baker D. De novo design of diverse small molecule binders and sensors using Shape Complementary Pseudocycles. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.12.20.572602. [PMID: 38187589 PMCID: PMC10769206 DOI: 10.1101/2023.12.20.572602] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2024]
Abstract
A general method for designing proteins to bind and sense any small molecule of interest would be widely useful. Due to the small number of atoms to interact with, binding to small molecules with high affinity requires highly shape-complementary pockets, and transducing binding events into signals is challenging. Here we describe an integrated deep learning and energy-based approach for designing high shape complementarity binders to small molecules that are poised for downstream sensing applications. We employ deep learning-generated pseudocycles with repeating structural units surrounding central pockets; depending on the geometry of the structural unit and repeat number, these pockets span wide ranges of sizes and shapes. For a small molecule target of interest, we extensively sample high shape complementarity pseudocycles to generate large numbers of customized potential binding pockets; the ligand binding poses and the interacting interfaces are then optimized for high affinity binding. We computationally design binders to four diverse molecules, including for the first time polar flexible molecules such as methotrexate and thyroxine, which are expressed at high levels and have nanomolar affinities straight out of the computer. Co-crystal structures are nearly identical to the design models. Taking advantage of the modular repeating structure of pseudocycles and central location of the binding pockets, we constructed low-noise nanopore sensors and chemically induced dimerization systems by splitting the binders into domains which assemble into the original pseudocycle pocket upon target molecule addition.
Affiliation(s)
- Linna An
- Department of Biochemistry, The University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Meerit Said
- Department of Biochemistry, The University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Long Tran
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Department of Chemical Engineering, University of Washington, Seattle, WA, USA
| | - Sagardip Majumder
- Department of Biochemistry, The University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Inna Goreshnik
- Department of Biochemistry, The University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Gyu Rie Lee
- Department of Biochemistry, The University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - David Juergens
- Department of Biochemistry, The University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Justas Dauparas
- Department of Biochemistry, The University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Ivan Anishchenko
- Department of Biochemistry, The University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Brian Coventry
- Department of Biochemistry, The University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Asim K. Bera
- Department of Biochemistry, The University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Alex Kang
- Department of Biochemistry, The University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Paul M. Levine
- Department of Biochemistry, The University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Valentina Alvarez
- Department of Biochemistry, The University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Arvind Pillai
- Department of Biochemistry, The University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | | | - David Feldman
- BioInnovation Institute, DK2200 Copenhagen N, Denmark
| | - Dmitri Zorine
- Department of Biochemistry, The University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Derrick R. Hicks
- Department of Biochemistry, The University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Xinting Li
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | | | - Dionne K. Vafeados
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Patrick J. Salveson
- Department of Biochemistry, The University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | | | - David Baker
- Department of Biochemistry, The University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
| |
|
48
|
Tian J, Zhang J, Francis F. The role and pathway of VQ family in plant growth, immunity, and stress response. PLANTA 2023; 259:16. [PMID: 38078967 DOI: 10.1007/s00425-023-04292-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2023] [Accepted: 11/14/2023] [Indexed: 12/18/2023]
Abstract
Main conclusion: This review provides a detailed description of the function and mechanism of VQ family genes, which is helpful for further research and application of VQ gene resources to improve crops. Valine-glutamine (VQ) motif-containing proteins are a large class of transcriptional regulatory cofactors. VQ proteins have their own unique molecular characteristics. Amino acids are highly conserved only in the VQ domain, while other positions vary greatly. Most VQ genes do not contain introns, and their proteins are shorter than 300 amino acids. A majority of VQ proteins are predicted to be localized in the nucleus. The promoters of many VQ genes contain stress- or growth-related elements. Segmental duplication and tandem duplication are the main amplification mechanisms of the VQ gene family in angiosperms and gymnosperms, respectively. Purifying selection plays a crucial role in the evolution of many VQ genes. By interacting with WRKY, MAPK, and other proteins, VQ proteins participate in multiple signaling pathways to regulate plant growth and development, as well as defense responses to biotic and abiotic stresses. Although there have been some reports on the VQ gene family in plants, most of them only identify family members, with little functional verification, and a complete, detailed, and up-to-date review of research progress is also lacking. Here, we comprehensively summarize the research progress on VQ genes published so far, mainly including their molecular characteristics, biological functions, the importance of the VQ motif, and working mechanisms. Finally, the regulatory network and model of VQ genes are drawn, a precise molecular breeding strategy based on VQ genes is proposed, and the current problems and future prospects are pointed out, providing a powerful reference for further research and utilization of VQ genes in plant improvement.
Affiliation(s)
- Jinfu Tian
- Functional and Evolutionary Entomology, Gembloux Agro-Bio Tech, University of Liège, 5030, Gembloux, Belgium.
- Institute of Crop Sciences, Chinese Academy of Agricultural Sciences (CAAS), Beijing, 100081, China.
| | - Jiahui Zhang
- Functional and Evolutionary Entomology, Gembloux Agro-Bio Tech, University of Liège, 5030, Gembloux, Belgium
- Institute of Crop Sciences, Chinese Academy of Agricultural Sciences (CAAS), Beijing, 100081, China
| | - Frédéric Francis
- Functional and Evolutionary Entomology, Gembloux Agro-Bio Tech, University of Liège, 5030, Gembloux, Belgium
| |
|
49
|
Zheng C, Ji Z, Mathews II, Boxer SG. Enhanced active-site electric field accelerates enzyme catalysis. Nat Chem 2023; 15:1715-1721. [PMID: 37563323 PMCID: PMC10906027 DOI: 10.1038/s41557-023-01287-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/13/2023] [Accepted: 06/29/2023] [Indexed: 08/12/2023]
Abstract
The design and improvement of enzymes based on physical principles remain challenging. Here we demonstrate that the principle of electrostatic catalysis can be leveraged to substantially improve a natural enzyme's activity. We enhanced the active-site electric field in horse liver alcohol dehydrogenase by replacing the serine hydrogen-bond donor with threonine and replacing the catalytic Zn²⁺ with Co²⁺. Based on the electric field enhancement, we make a quantitative prediction of rate acceleration (50-fold faster than the wild-type enzyme), which was in close agreement with experimental measurements. The effects of the hydrogen bonding and metal coordination, two distinct chemical forces, are described by a unified physical quantity, the electric field, which is quantitative, and shown here to be additive and predictive. These results suggest a new design paradigm for both biological and non-biological catalysts.
Affiliation(s)
- Chu Zheng
- Department of Chemistry, Stanford University, Stanford, CA, USA
| | - Zhe Ji
- Department of Chemistry, Stanford University, Stanford, CA, USA
| | | | - Steven G Boxer
- Department of Chemistry, Stanford University, Stanford, CA, USA.
| |
|
50
|
Zhang X, Yin H, Ling F, Zhan J, Zhou Y. SPIN-CGNN: Improved fixed backbone protein design with contact map-based graph construction and contact graph neural network. PLoS Comput Biol 2023; 19:e1011330. [PMID: 38060617 PMCID: PMC10729952 DOI: 10.1371/journal.pcbi.1011330] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2023] [Revised: 12/19/2023] [Accepted: 11/27/2023] [Indexed: 12/20/2023] Open
Abstract
Recent advances in deep learning have significantly improved the ability to infer protein sequences directly from protein structures for fixed-backbone design. The methods have evolved from the early use of multi-layer perceptrons to convolutional neural networks, transformers, and graph neural networks (GNNs). However, the conventional approach of constructing a K-nearest-neighbors (KNN) graph for a GNN has limited the utilization of edge information, which plays a critical role in network performance. Here we introduce SPIN-CGNN, which is based on protein contact maps for nearest neighbors. Together with auxiliary edge updates and selective kernels, we found that SPIN-CGNN provided performance comparable to the current state-of-the-art techniques in refolding ability by AlphaFold2, but a significant improvement over them in terms of sequence recovery, perplexity, deviation from amino-acid compositions of native sequences, conservation of hydrophobic positions, and low complexity regions, according to tests on unseen structures, "hallucinated" structures, and structures from diffusion models. The results suggest that low complexity regions in the sequences designed by deep learning, for generated structures in particular, remain to be improved when compared to the native sequences.
Affiliation(s)
- Xing Zhang
- School of Biology and Biological Engineering, South China University of Technology, Guangzhou, People’s Republic of China
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, People’s Republic of China
| | - Hongmei Yin
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, People’s Republic of China
| | - Fei Ling
- School of Biology and Biological Engineering, South China University of Technology, Guangzhou, People’s Republic of China
| | - Jian Zhan
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, People’s Republic of China
| | - Yaoqi Zhou
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, People’s Republic of China
| |
|