51. de Haas RJ, Brunette N, Goodson A, Dauparas J, Yi SY, Yang EC, Dowling Q, Nguyen H, Kang A, Bera AK, Sankaran B, de Vries R, Baker D, King NP. Rapid and automated design of two-component protein nanomaterials using ProteinMPNN. Proc Natl Acad Sci U S A 2024; 121:e2314646121. [PMID: 38502697 PMCID: PMC10990136 DOI: 10.1073/pnas.2314646121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2023] [Accepted: 02/20/2024] [Indexed: 03/21/2024] Open
Abstract
The design of protein-protein interfaces using physics-based design methods such as Rosetta requires substantial computational resources and manual refinement by expert structural biologists. Deep learning methods promise to simplify protein-protein interface design and enable its application to a wide variety of problems by researchers from various scientific disciplines. Here, we test the ability of a deep learning method for protein sequence design, ProteinMPNN, to design two-component tetrahedral protein nanomaterials and benchmark its performance against Rosetta. ProteinMPNN had a similar success rate to Rosetta, yielding 13 new experimentally confirmed assemblies, but required orders of magnitude less computation and no manual refinement. The interfaces designed by ProteinMPNN were substantially more polar than those designed by Rosetta, which facilitated in vitro assembly of the designed nanomaterials from independently purified components. Crystal structures of several of the assemblies confirmed the accuracy of the design method at high resolution. Our results showcase the potential of deep learning-based methods to unlock the widespread application of designed protein-protein interfaces and self-assembling protein nanomaterials in biotechnology.
Affiliation(s)
- Robbert J. de Haas
  - Department of Physical Chemistry and Soft Matter, Wageningen University and Research, Wageningen 6078 WE, The Netherlands
- Natalie Brunette
  - Department of Biochemistry, University of Washington, Seattle, WA 98195
  - Institute for Protein Design, University of Washington, Seattle, WA 98195
- Alex Goodson
  - Department of Biochemistry, University of Washington, Seattle, WA 98195
  - Institute for Protein Design, University of Washington, Seattle, WA 98195
- Justas Dauparas
  - Department of Biochemistry, University of Washington, Seattle, WA 98195
  - Institute for Protein Design, University of Washington, Seattle, WA 98195
- Sue Y. Yi
  - Department of Biochemistry, University of Washington, Seattle, WA 98195
  - Institute for Protein Design, University of Washington, Seattle, WA 98195
- Erin C. Yang
  - Department of Biochemistry, University of Washington, Seattle, WA 98195
  - Institute for Protein Design, University of Washington, Seattle, WA 98195
- Quinton Dowling
  - Department of Biochemistry, University of Washington, Seattle, WA 98195
  - Institute for Protein Design, University of Washington, Seattle, WA 98195
- Hannah Nguyen
  - Department of Biochemistry, University of Washington, Seattle, WA 98195
  - Institute for Protein Design, University of Washington, Seattle, WA 98195
- Alex Kang
  - Department of Biochemistry, University of Washington, Seattle, WA 98195
  - Institute for Protein Design, University of Washington, Seattle, WA 98195
- Asim K. Bera
  - Department of Biochemistry, University of Washington, Seattle, WA 98195
  - Institute for Protein Design, University of Washington, Seattle, WA 98195
- Banumathi Sankaran
  - Molecular Biophysics and Integrated Bioimaging, Lawrence Berkeley National Laboratory, Berkeley, CA 94720
- Renko de Vries
  - Department of Physical Chemistry and Soft Matter, Wageningen University and Research, Wageningen 6078 WE, The Netherlands
- David Baker
  - Department of Biochemistry, University of Washington, Seattle, WA 98195
  - Institute for Protein Design, University of Washington, Seattle, WA 98195
  - HHMI, Seattle, WA 98195
- Neil P. King
  - Department of Biochemistry, University of Washington, Seattle, WA 98195
  - Institute for Protein Design, University of Washington, Seattle, WA 98195
52. Hong L, Kortemme T. An integrative approach to protein sequence design through multiobjective optimization. bioRxiv 2024:2024.03.01.582670. [PMID: 38496480 PMCID: PMC10942313 DOI: 10.1101/2024.03.01.582670] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/19/2024]
Abstract
With recent methodological advances in the field of computational protein design, in particular those based on deep learning, there is an increasing need for frameworks that allow for coherent, direct integration of different models and objective functions into the generative design process. Here we demonstrate how evolutionary multiobjective optimization techniques can be adapted to provide such an approach. With the established Non-dominated Sorting Genetic Algorithm II (NSGA-II) as the optimization framework, we use AlphaFold2 and ProteinMPNN confidence metrics to define the objective space, and a mutation operator composed of ESM-1v and ProteinMPNN to rank and then redesign the least favorable positions. Using the multistate design problem of the foldswitching protein RfaH as an in-depth case study, we show that the evolutionary multiobjective optimization approach leads to significant reduction in the bias and variance in RfaH native sequence recovery, compared to a direct application of ProteinMPNN. We suggest that this improvement is due to three factors: (i) the use of an informative mutation operator that accelerates the sequence space exploration, (ii) the parallel, iterative design process inherent to the genetic algorithm that improves upon the ProteinMPNN autoregressive sequence decoding scheme, and (iii) the explicit approximation of the Pareto front that leads to optimal design candidates representing diverse tradeoff conditions. We anticipate this approach to be readily adaptable to different models and broadly relevant for protein design tasks with complex specifications.
Affiliation(s)
- Lu Hong
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA
- Tanja Kortemme
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA
  - Quantitative Biosciences Institute, University of California, San Francisco, San Francisco, CA 94158, USA
  - Chan Zuckerberg Biohub, San Francisco, CA 94158, USA
53. Huddy TF, Hsia Y, Kibler RD, Xu J, Bethel N, Nagarajan D, Redler R, Leung PJY, Weidle C, Courbet A, Yang EC, Bera AK, Coudray N, Calise SJ, Davila-Hernandez FA, Han HL, Carr KD, Li Z, McHugh R, Reggiano G, Kang A, Sankaran B, Dickinson MS, Coventry B, Brunette TJ, Liu Y, Dauparas J, Borst AJ, Ekiert D, Kollman JM, Bhabha G, Baker D. Blueprinting extendable nanomaterials with standardized protein blocks. Nature 2024; 627:898-904. [PMID: 38480887 PMCID: PMC10972742 DOI: 10.1038/s41586-024-07188-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2023] [Accepted: 02/09/2024] [Indexed: 03/26/2024]
Abstract
A wooden house frame consists of many different lumber pieces, but because of the regularity of these building blocks, the structure can be designed using straightforward geometrical principles. The design of multicomponent protein assemblies, in comparison, has been much more complex, largely owing to the irregular shapes of protein structures [1]. Here we describe extendable linear, curved and angled protein building blocks, as well as inter-block interactions, that conform to specified geometric standards; assemblies designed using these blocks inherit their extendability and regular interaction surfaces, enabling them to be expanded or contracted by varying the number of modules, and reinforced with secondary struts. Using X-ray crystallography and electron microscopy, we validate nanomaterial designs ranging from simple polygonal and circular oligomers that can be concentrically nested, up to large polyhedral nanocages and unbounded straight 'train track' assemblies with reconfigurable sizes and geometries that can be readily blueprinted. Because of the complexity of protein structures and sequence-structure relationships, it has not previously been possible to build up large protein assemblies by deliberate placement of protein backbones onto a blank three-dimensional canvas; the simplicity and geometric regularity of our design platform now enables construction of protein nanomaterials according to 'back of an envelope' architectural blueprints.
Affiliation(s)
- Timothy F Huddy
  - Department of Biochemistry, University of Washington, Seattle, WA, USA
  - Institute for Protein Design, University of Washington, Seattle, WA, USA
- Yang Hsia
  - Department of Biochemistry, University of Washington, Seattle, WA, USA
  - Institute for Protein Design, University of Washington, Seattle, WA, USA
- Ryan D Kibler
  - Department of Biochemistry, University of Washington, Seattle, WA, USA
  - Institute for Protein Design, University of Washington, Seattle, WA, USA
- Jinwei Xu
  - Department of Biochemistry, University of Washington, Seattle, WA, USA
  - Institute for Protein Design, University of Washington, Seattle, WA, USA
- Neville Bethel
  - Department of Biochemistry, University of Washington, Seattle, WA, USA
  - Institute for Protein Design, University of Washington, Seattle, WA, USA
- Rachel Redler
  - Department of Cell Biology, NYU School of Medicine, New York, NY, USA
- Philip J Y Leung
  - Department of Biochemistry, University of Washington, Seattle, WA, USA
  - Institute for Protein Design, University of Washington, Seattle, WA, USA
  - Molecular Engineering and Sciences Institute, University of Washington, Seattle, WA, USA
- Connor Weidle
  - Department of Biochemistry, University of Washington, Seattle, WA, USA
  - Institute for Protein Design, University of Washington, Seattle, WA, USA
- Alexis Courbet
  - Department of Biochemistry, University of Washington, Seattle, WA, USA
  - Institute for Protein Design, University of Washington, Seattle, WA, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
- Erin C Yang
  - Department of Biochemistry, University of Washington, Seattle, WA, USA
  - Institute for Protein Design, University of Washington, Seattle, WA, USA
  - Biological Physics, Structure and Design, University of Washington, Seattle, WA, USA
- Asim K Bera
  - Department of Biochemistry, University of Washington, Seattle, WA, USA
  - Institute for Protein Design, University of Washington, Seattle, WA, USA
- Nicolas Coudray
  - Department of Cell Biology, NYU School of Medicine, New York, NY, USA
  - Applied Bioinformatics Laboratories, NYU School of Medicine, New York, NY, USA
  - Division of Precision Medicine, Department of Medicine, NYU Grossman School of Medicine, New York, NY, USA
- S John Calise
  - Department of Biochemistry, University of Washington, Seattle, WA, USA
- Fatima A Davila-Hernandez
  - Department of Biochemistry, University of Washington, Seattle, WA, USA
  - Institute for Protein Design, University of Washington, Seattle, WA, USA
- Hannah L Han
  - Department of Biochemistry, University of Washington, Seattle, WA, USA
  - Institute for Protein Design, University of Washington, Seattle, WA, USA
- Kenneth D Carr
  - Department of Biochemistry, University of Washington, Seattle, WA, USA
  - Institute for Protein Design, University of Washington, Seattle, WA, USA
- Zhe Li
  - Department of Biochemistry, University of Washington, Seattle, WA, USA
  - Institute for Protein Design, University of Washington, Seattle, WA, USA
- Ryan McHugh
  - Department of Biochemistry, University of Washington, Seattle, WA, USA
  - Institute for Protein Design, University of Washington, Seattle, WA, USA
- Gabriella Reggiano
  - Department of Biochemistry, University of Washington, Seattle, WA, USA
  - Institute for Protein Design, University of Washington, Seattle, WA, USA
- Alex Kang
  - Department of Biochemistry, University of Washington, Seattle, WA, USA
  - Institute for Protein Design, University of Washington, Seattle, WA, USA
- Banumathi Sankaran
  - Molecular Biophysics and Integrated Bioimaging, Berkeley Center for Structural Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Miles S Dickinson
  - Department of Biochemistry, University of Washington, Seattle, WA, USA
- Brian Coventry
  - Department of Biochemistry, University of Washington, Seattle, WA, USA
  - Institute for Protein Design, University of Washington, Seattle, WA, USA
- T J Brunette
  - Department of Biochemistry, University of Washington, Seattle, WA, USA
  - Institute for Protein Design, University of Washington, Seattle, WA, USA
- Yulai Liu
  - Department of Biochemistry, University of Washington, Seattle, WA, USA
  - Institute for Protein Design, University of Washington, Seattle, WA, USA
- Justas Dauparas
  - Department of Biochemistry, University of Washington, Seattle, WA, USA
  - Institute for Protein Design, University of Washington, Seattle, WA, USA
- Andrew J Borst
  - Department of Biochemistry, University of Washington, Seattle, WA, USA
  - Institute for Protein Design, University of Washington, Seattle, WA, USA
- Damian Ekiert
  - Department of Cell Biology, NYU School of Medicine, New York, NY, USA
  - Applied Bioinformatics Laboratories, NYU School of Medicine, New York, NY, USA
- Justin M Kollman
  - Department of Biochemistry, University of Washington, Seattle, WA, USA
- Gira Bhabha
  - Applied Bioinformatics Laboratories, NYU School of Medicine, New York, NY, USA
- David Baker
  - Department of Biochemistry, University of Washington, Seattle, WA, USA
  - Institute for Protein Design, University of Washington, Seattle, WA, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
54. Yang J, Li FZ, Arnold FH. Opportunities and Challenges for Machine Learning-Assisted Enzyme Engineering. ACS Cent Sci 2024; 10:226-241. [PMID: 38435522 PMCID: PMC10906252 DOI: 10.1021/acscentsci.3c01275] [Citation(s) in RCA: 25] [Impact Index Per Article: 25.0] [Received: 10/17/2023] [Revised: 12/26/2023] [Accepted: 01/16/2024] [Indexed: 03/05/2024]
Abstract
Enzymes can be engineered at the level of their amino acid sequences to optimize key properties such as expression, stability, substrate range, and catalytic efficiency, or even to unlock new catalytic activities not found in nature. Because the search space of possible proteins is vast, enzyme engineering usually involves discovering an enzyme starting point that has some level of the desired activity followed by directed evolution to improve its "fitness" for a desired application. Recently, machine learning (ML) has emerged as a powerful tool to complement this empirical process. ML models can contribute to (1) starting point discovery by functional annotation of known protein sequences or generating novel protein sequences with desired functions and (2) navigating protein fitness landscapes for fitness optimization by learning mappings between protein sequences and their associated fitness values. In this Outlook, we explain how ML complements enzyme engineering and discuss its future potential to unlock improved engineering outcomes.
Affiliation(s)
- Jason Yang
  - Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States
- Francesca-Zhoufan Li
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California 91125, United States
- Frances H. Arnold
  - Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California 91125, United States
55. Lei ZC, Wang X, Yang L, Qu H, Sun Y, Yang Y, Li W, Zhang WB, Cao XY, Fan C, Li G, Wu J, Tian ZQ. What can molecular assembly learn from catalysed assembly in living organisms? Chem Soc Rev 2024; 53:1892-1914. [PMID: 38230701 DOI: 10.1039/d3cs00634d] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/18/2024]
Abstract
Molecular assembly is the process of organizing individual molecules into larger structures and complex systems. The self-assembly approach is predominantly utilized in creating artificial molecular assemblies, and was believed to be the primary mode of molecular assembly in living organisms as well. However, it has been shown that the assembly of many biological complexes is "catalysed" by other molecules, rather than relying solely on self-assembly. In this review, we summarize these catalysed-assembly (catassembly) phenomena in living organisms and systematically analyse their mechanisms. We then expand on these phenomena and discuss related concepts, including catalysed-disassembly and catalysed-reassembly. Catassembly proves to be an efficient and highly selective strategy for synergistically controlling and manipulating various noncovalent interactions, especially in hierarchical molecular assemblies. Overreliance on self-assembly may, to some extent, hinder the advancement of artificial molecular assembly with powerful features. Furthermore, inspired by the biological catassembly phenomena, we propose guidelines for designing artificial catassembly systems and developing characterization and theoretical methods, and review pioneering works along this new direction. Overall, this approach may broaden and deepen our understanding of molecular assembly, enabling the construction and control of intelligent assembly systems with advanced functionality.
Affiliation(s)
- Zhi-Chao Lei
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
  - National Laboratory of Biomacromolecules, CAS Center for Excellence in Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, P. R. China
  - University of Chinese Academy of Sciences, Beijing 100049, P. R. China
- Xinchang Wang
  - School of Electronic Science and Engineering, State Key Laboratory of Physical Chemistry of Solid Surfaces, Xiamen University, Xiamen 361005, P. R. China
- Liulin Yang
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
- Hang Qu
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
- Yibin Sun
  - Beijing National Laboratory for Molecular Sciences, Key Laboratory of Polymer Chemistry & Physics of Ministry of Education, Center for Soft Matter Science and Engineering, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, P. R. China
- Yang Yang
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
- Wei Li
  - National Laboratory of Biomacromolecules, CAS Center for Excellence in Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, P. R. China
  - University of Chinese Academy of Sciences, Beijing 100049, P. R. China
- Wen-Bin Zhang
  - Beijing National Laboratory for Molecular Sciences, Key Laboratory of Polymer Chemistry & Physics of Ministry of Education, Center for Soft Matter Science and Engineering, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, P. R. China
- Xiao-Yu Cao
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
- Chunhai Fan
  - School of Chemistry and Chemical Engineering, Frontiers Science, Center for Transformative Molecules and National Center for Translational Medicine, Shanghai Jiao Tong University, Shanghai 200240, P. R. China
- Guohong Li
  - National Laboratory of Biomacromolecules, CAS Center for Excellence in Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, P. R. China
  - University of Chinese Academy of Sciences, Beijing 100049, P. R. China
- Jiarui Wu
  - Key Laboratory of Systems Biology, Center for Excellence in Molecular Cell Science, Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, Shanghai, 200031, P. R. China
  - School of Life Science and Technology, ShanghaiTech University, Shanghai, 201210, P. R. China
  - Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Hangzhou, 310024, P. R. China
- Zhong-Qun Tian
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
56. Guo XY, Yi L, Yang J, An HW, Yang ZX, Wang H. Self-assembly of peptide nanomaterials at biointerfaces: molecular design and biomedical applications. Chem Commun (Camb) 2024; 60:2009-2021. [PMID: 38275083 DOI: 10.1039/d3cc05811e] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/27/2024]
Abstract
Self-assembly is an important strategy for constructing ordered structures and complex functions in nature. Based on this, people can imitate nature and artificially construct functional materials with novel structures through the supermolecular self-assembly pathway of biological interfaces. Among the many assembly units, peptide molecular self-assembly has received widespread attention in recent years. In this review, we introduce the interactions (hydrophobic interaction, hydrogen bond, and electrostatic interaction) between peptide nanomaterials and biological interfaces, summarizing the latest advancements in multifunctional self-assembling peptide materials. We systematically demonstrate the assembly mechanisms of peptides at biological interfaces, such as proteins and cell membranes, while highlighting their application potential and challenges in fields like drug delivery, antibacterial strategies, and cancer therapy.
Affiliation(s)
- Xin-Yuan Guo
  - College of Chemistry, Huazhong Agricultural University, Shizishan 1, Hongshan District, Wuhan, 430070, China
  - CAS Key Laboratory for Biomedical Effects of Nanomaterials and Nanosafety, CAS Center for Excellence in Nanoscience, National Center for Nanoscience and Technology (NCNST), Beijing, 100190, China
- Li Yi
  - CAS Key Laboratory for Biomedical Effects of Nanomaterials and Nanosafety, CAS Center for Excellence in Nanoscience, National Center for Nanoscience and Technology (NCNST), Beijing, 100190, China
- Jia Yang
  - CAS Key Laboratory for Biomedical Effects of Nanomaterials and Nanosafety, CAS Center for Excellence in Nanoscience, National Center for Nanoscience and Technology (NCNST), Beijing, 100190, China
- Hong-Wei An
  - CAS Key Laboratory for Biomedical Effects of Nanomaterials and Nanosafety, CAS Center for Excellence in Nanoscience, National Center for Nanoscience and Technology (NCNST), Beijing, 100190, China
- Zi-Xin Yang
  - College of Chemistry, Huazhong Agricultural University, Shizishan 1, Hongshan District, Wuhan, 430070, China
- Hao Wang
  - CAS Key Laboratory for Biomedical Effects of Nanomaterials and Nanosafety, CAS Center for Excellence in Nanoscience, National Center for Nanoscience and Technology (NCNST), Beijing, 100190, China
57. Alvarez S, Nartey CM, Mercado N, de la Paz JA, Huseinbegovic T, Morcos F. In vivo functional phenotypes from a computational epistatic model of evolution. Proc Natl Acad Sci U S A 2024; 121:e2308895121. [PMID: 38285950 PMCID: PMC10861889 DOI: 10.1073/pnas.2308895121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2023] [Accepted: 12/19/2023] [Indexed: 01/31/2024] Open
Abstract
Computational models of evolution are valuable for understanding the dynamics of sequence variation, to infer phylogenetic relationships or potential evolutionary pathways and for biomedical and industrial applications. Despite these benefits, few have validated their propensities to generate outputs with in vivo functionality, which would enhance their value as accurate and interpretable evolutionary algorithms. We demonstrate the power of epistasis inferred from natural protein families to evolve sequence variants in an algorithm we developed called sequence evolution with epistatic contributions (SEEC). Utilizing the Hamiltonian of the joint probability of sequences in the family as fitness metric, we sampled and experimentally tested for in vivo β-lactamase activity in Escherichia coli TEM-1 variants. These evolved proteins can have dozens of mutations dispersed across the structure while preserving sites essential for both catalysis and interactions. Remarkably, these variants retain family-like functionality while being more active than their wild-type predecessor. We found that depending on the inference method used to generate the epistatic constraints, different parameters simulate diverse selection strengths. Under weaker selection, local Hamiltonian fluctuations reliably predict relative changes to variant fitness, recapitulating neutral evolution. SEEC has the potential to explore the dynamics of neofunctionalization, characterize viral fitness landscapes, and facilitate vaccine development.
Affiliation(s)
- Sophia Alvarez
  - Department of Biological Sciences, University of Texas at Dallas, Richardson, TX 75080
- Charisse M. Nartey
  - Department of Biological Sciences, University of Texas at Dallas, Richardson, TX 75080
- Nicholas Mercado
  - Department of Biological Sciences, University of Texas at Dallas, Richardson, TX 75080
- Tea Huseinbegovic
  - Department of Biological Sciences, University of Texas at Dallas, Richardson, TX 75080
- Faruck Morcos
  - Department of Biological Sciences, University of Texas at Dallas, Richardson, TX 75080
  - Department of Bioengineering, University of Texas at Dallas, Richardson, TX 75080
  - Center for Systems Biology, University of Texas at Dallas, Richardson, TX 75080
58. Notin P, Rollins N, Gal Y, Sander C, Marks D. Machine learning for functional protein design. Nat Biotechnol 2024; 42:216-228. [PMID: 38361074 DOI: 10.1038/s41587-024-02127-0] [Citation(s) in RCA: 50] [Impact Index Per Article: 50.0] [Received: 07/01/2023] [Accepted: 01/05/2024] [Indexed: 02/17/2024]
Abstract
Recent breakthroughs in AI coupled with the rapid accumulation of protein sequence and structure data have radically transformed computational protein design. New methods promise to escape the constraints of natural and laboratory evolution, accelerating the generation of proteins for applications in biotechnology and medicine. To make sense of the exploding diversity of machine learning approaches, we introduce a unifying framework that classifies models on the basis of their use of three core data modalities: sequences, structures and functional labels. We discuss the new capabilities and outstanding challenges for the practical design of enzymes, antibodies, vaccines, nanomachines and more. We then highlight trends shaping the future of this field, from large-scale assays to more robust benchmarks, multimodal foundation models, enhanced sampling strategies and laboratory automation.
Affiliation(s)
- Pascal Notin
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
  - Department of Computer Science, University of Oxford, Oxford, UK
- Yarin Gal
  - Department of Computer Science, University of Oxford, Oxford, UK
- Chris Sander
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
  - Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Debora Marks
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
  - Broad Institute of Harvard and MIT, Cambridge, MA, USA
59. Chu AE, Lu T, Huang PS. Sparks of function by de novo protein design. Nat Biotechnol 2024; 42:203-215. [PMID: 38361073 PMCID: PMC11366440 DOI: 10.1038/s41587-024-02133-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/05/2023] [Accepted: 01/09/2024] [Indexed: 02/17/2024]
Abstract
Information in proteins flows from sequence to structure to function, with each step causally driven by the preceding one. Protein design is founded on inverting this process: specify a desired function, design a structure executing this function, and find a sequence that folds into this structure. This 'central dogma' underlies nearly all de novo protein-design efforts. Our ability to accomplish these tasks depends on our understanding of protein folding and function and our ability to capture this understanding in computational methods. In recent years, deep learning-derived approaches for efficient and accurate structure modeling and enrichment of successful designs have enabled progression beyond the design of protein structures and towards the design of functional proteins. We examine these advances in the broader context of classical de novo protein design and consider implications for future challenges to come, including fundamental capabilities such as sequence and structure co-design and conformational control considering flexibility, and functional objectives such as antibody and enzyme design.
Affiliation(s)
- Alexander E Chu
- Biophysics Program, Stanford University, Palo Alto, CA, USA
- Department of Bioengineering, Stanford University, Palo Alto, CA, USA
- Google DeepMind, London, UK
| | - Tianyu Lu
- Department of Bioengineering, Stanford University, Palo Alto, CA, USA
| | - Po-Ssu Huang
- Biophysics Program, Stanford University, Palo Alto, CA, USA.
- Department of Bioengineering, Stanford University, Palo Alto, CA, USA.
| |
|
60
|
Ming K, Xing B, Hu Y, Mei M, Huang W, Hu X, Wei Z. De novo design of a protein binder against Staphylococcus enterotoxin B. Int J Biol Macromol 2024; 257:128666. [PMID: 38070805 DOI: 10.1016/j.ijbiomac.2023.128666] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2023] [Revised: 12/03/2023] [Accepted: 12/06/2023] [Indexed: 01/26/2024]
Abstract
Staphylococcus enterotoxin B (SEB) interacts with MHC-II molecules to overactivate immune cells and thereby to produce excessive pro-inflammatory cytokines. Disrupting the interactions between SEB and MHC-II helps eliminate the lethal threat posed by SEB. In this study, a de novo computational approach was used to design protein binders targeting SEB. The MHC-II binding domain of SEB was selected as the target, and the possible promising binding mode was broadly explored. The obtained original binder was folded into triple-helix bundles and contained 56 amino acids with molecular weight 5.9 kDa. The interface of SEB and the binder was highly hydrophobic. ProteinMPNN optimization further enlarged the hydrophobic region of the binder and improved the stability of the binder-SEB complex. In vitro study demonstrated that the optimized binder significantly inhibited the inflammatory response induced by SEB. Overall, our research demonstrated the applicability of this approach in de novo designing protein binders against SEB, and thereby providing potential therapeutics for SEB induced diseases.
Affiliation(s)
- Ke Ming
- School of life sciences, Hubei University, Wuhan, Hubei, PR China; State Key Laboratory of Biocatalysis and Enzyme Engineering, School of Life Sciences, Hubei University, Wuhan, Hubei, PR China; Hubei Jiangxia Laboratory, Wuhan, Hubei, PR China
| | - Banbin Xing
- School of life sciences, Hubei University, Wuhan, Hubei, PR China; State Key Laboratory of Biocatalysis and Enzyme Engineering, School of Life Sciences, Hubei University, Wuhan, Hubei, PR China
| | - Yang Hu
- School of life sciences, Hubei University, Wuhan, Hubei, PR China; State Key Laboratory of Biocatalysis and Enzyme Engineering, School of Life Sciences, Hubei University, Wuhan, Hubei, PR China
| | - Meng Mei
- School of life sciences, Hubei University, Wuhan, Hubei, PR China; State Key Laboratory of Biocatalysis and Enzyme Engineering, School of Life Sciences, Hubei University, Wuhan, Hubei, PR China
| | - Wenli Huang
- School of life sciences, Hubei University, Wuhan, Hubei, PR China; State Key Laboratory of Biocatalysis and Enzyme Engineering, School of Life Sciences, Hubei University, Wuhan, Hubei, PR China
| | - Xiaoyu Hu
- School of life sciences, Hubei University, Wuhan, Hubei, PR China; State Key Laboratory of Biocatalysis and Enzyme Engineering, School of Life Sciences, Hubei University, Wuhan, Hubei, PR China
| | - Zigong Wei
- School of life sciences, Hubei University, Wuhan, Hubei, PR China; State Key Laboratory of Biocatalysis and Enzyme Engineering, School of Life Sciences, Hubei University, Wuhan, Hubei, PR China; Hubei Jiangxia Laboratory, Wuhan, Hubei, PR China; Hubei Province Key Laboratory of Biotechnology of Chinese Traditional Medicine, National & Local Joint Engineering Research Center of High-throughput Drug Screening Technology, School of life sciences, Hubei University, Wuhan, Hubei, PR China.
| |
|
61
|
Kortemme T. De novo protein design-From new structures to programmable functions. Cell 2024; 187:526-544. [PMID: 38306980 PMCID: PMC10990048 DOI: 10.1016/j.cell.2023.12.028] [Citation(s) in RCA: 46] [Impact Index Per Article: 46.0] [Received: 10/27/2023] [Revised: 12/03/2023] [Accepted: 12/19/2023] [Indexed: 02/04/2024]
Abstract
Methods from artificial intelligence (AI) trained on large datasets of sequences and structures can now "write" proteins with new shapes and molecular functions de novo, without starting from proteins found in nature. In this Perspective, I will discuss the state of the field of de novo protein design at the juncture of physics-based modeling approaches and AI. New protein folds and higher-order assemblies can be designed with considerable experimental success rates, and difficult problems requiring tunable control over protein conformations and precise shape complementarity for molecular recognition are coming into reach. Emerging approaches incorporate engineering principles (tunability, controllability, and modularity) into the design process from the beginning. Exciting frontiers lie in deconstructing cellular functions with de novo proteins and, conversely, constructing synthetic cellular signaling from the ground up. As methods improve, many more challenges are unsolved.
Affiliation(s)
- Tanja Kortemme
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA; Quantitative Biosciences Institute, University of California, San Francisco, San Francisco, CA 94158, USA; Chan Zuckerberg Biohub, San Francisco, CA 94158, USA.
| |
|
62
|
Vázquez Torres S, Leung PJY, Venkatesh P, Lutz ID, Hink F, Huynh HH, Becker J, Yeh AHW, Juergens D, Bennett NR, Hoofnagle AN, Huang E, MacCoss MJ, Expòsit M, Lee GR, Bera AK, Kang A, De La Cruz J, Levine PM, Li X, Lamb M, Gerben SR, Murray A, Heine P, Korkmaz EN, Nivala J, Stewart L, Watson JL, Rogers JM, Baker D. De novo design of high-affinity binders of bioactive helical peptides. Nature 2024; 626:435-442. [PMID: 38109936 PMCID: PMC10849960 DOI: 10.1038/s41586-023-06953-1] [Citation(s) in RCA: 53] [Impact Index Per Article: 53.0] [Received: 12/14/2022] [Accepted: 12/07/2023] [Indexed: 12/20/2023]
Abstract
Many peptide hormones form an α-helix on binding their receptors1-4, and sensitive methods for their detection could contribute to better clinical management of disease5. De novo protein design can now generate binders with high affinity and specificity to structured proteins6,7. However, the design of interactions between proteins and short peptides with helical propensity is an unmet challenge. Here we describe parametric generation and deep learning-based methods for designing proteins to address this challenge. We show that by extending RFdiffusion8 to enable binder design to flexible targets, and to refining input structure models by successive noising and denoising (partial diffusion), picomolar-affinity binders can be generated to helical peptide targets by either refining designs generated with other methods, or completely de novo starting from random noise distributions without any subsequent experimental optimization. The RFdiffusion designs enable the enrichment and subsequent detection of parathyroid hormone and glucagon by mass spectrometry, and the construction of bioluminescence-based protein biosensors. The ability to design binders to conformationally variable targets, and to optimize by partial diffusion both natural and designed proteins, should be broadly useful.
Affiliation(s)
- Susana Vázquez Torres
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Graduate Program in Biological Physics, Structure and Design, University of Washington, Seattle, WA, USA
| | - Philip J Y Leung
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Graduate Program in Molecular Engineering, University of Washington, Seattle, WA, USA
| | - Preetham Venkatesh
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Graduate Program in Biological Physics, Structure and Design, University of Washington, Seattle, WA, USA
| | - Isaac D Lutz
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Department of Bioengineering, University of Washington, Seattle, WA, USA
| | - Fabian Hink
- Department of Drug Design and Pharmacology, University of Copenhagen, Copenhagen, Denmark
| | - Huu-Hien Huynh
- Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA, USA
| | - Jessica Becker
- Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA, USA
| | - Andy Hsien-Wei Yeh
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - David Juergens
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Graduate Program in Molecular Engineering, University of Washington, Seattle, WA, USA
| | - Nathaniel R Bennett
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Graduate Program in Molecular Engineering, University of Washington, Seattle, WA, USA
| | - Andrew N Hoofnagle
- Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA, USA
| | - Eric Huang
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Michael J MacCoss
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Marc Expòsit
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Graduate Program in Molecular Engineering, University of Washington, Seattle, WA, USA
| | - Gyu Rie Lee
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Asim K Bera
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Alex Kang
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Joshmyn De La Cruz
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Paul M Levine
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Xinting Li
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Mila Lamb
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Stacey R Gerben
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Analisa Murray
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Piper Heine
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Elif Nihal Korkmaz
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Jeff Nivala
- School of Computer Science and Engineering, University of Washington, Seattle, WA, USA
- Molecular Engineering and Sciences Institute, University of Washington, Seattle, WA, USA
| | - Lance Stewart
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Joseph L Watson
- Department of Biochemistry, University of Washington, Seattle, WA, USA.
- Institute for Protein Design, University of Washington, Seattle, WA, USA.
| | - Joseph M Rogers
- Department of Drug Design and Pharmacology, University of Copenhagen, Copenhagen, Denmark.
| | - David Baker
- Department of Biochemistry, University of Washington, Seattle, WA, USA.
- Institute for Protein Design, University of Washington, Seattle, WA, USA.
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA.
| |
|
63
|
Sumida K, Núñez-Franco R, Kalvet I, Pellock SJ, Wicky BIM, Milles LF, Dauparas J, Wang J, Kipnis Y, Jameson N, Kang A, De La Cruz J, Sankaran B, Bera AK, Jiménez-Osés G, Baker D. Improving Protein Expression, Stability, and Function with ProteinMPNN. J Am Chem Soc 2024; 146:2054-2061. [PMID: 38194293 PMCID: PMC10811672 DOI: 10.1021/jacs.3c10941] [Citation(s) in RCA: 41] [Impact Index Per Article: 41.0] [Received: 10/04/2023] [Revised: 12/03/2023] [Accepted: 12/05/2023] [Indexed: 01/10/2024]
Abstract
Natural proteins are highly optimized for function but are often difficult to produce at a scale suitable for biotechnological applications due to poor expression in heterologous systems, limited solubility, and sensitivity to temperature. Thus, a general method that improves the physical properties of native proteins while maintaining function could have wide utility for protein-based technologies. Here, we show that the deep neural network ProteinMPNN, together with evolutionary and structural information, provides a route to increasing protein expression, stability, and function. For both myoglobin and tobacco etch virus (TEV) protease, we generated designs with improved expression, elevated melting temperatures, and improved function. For TEV protease, we identified multiple designs with improved catalytic activity as compared to the parent sequence and previously reported TEV variants. Our approach should be broadly useful for improving the expression, stability, and function of biotechnologically important proteins.
Affiliation(s)
- Kiera H. Sumida
- Department of Chemistry, University of Washington, Seattle, Washington 98195, United States
- Institute for Protein Design, University of Washington, Seattle, Washington 98195, United States
| | - Reyes Núñez-Franco
- Center for Cooperative Research in Biosciences, Basque Research and Technology Alliance, Derio 48160, Spain
| | - Indrek Kalvet
- Institute for Protein Design, University of Washington, Seattle, Washington 98195, United States
- Department of Biochemistry, University of Washington, Seattle, Washington 98195, United States
- Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, United States
| | - Samuel J. Pellock
- Institute for Protein Design, University of Washington, Seattle, Washington 98195, United States
- Department of Biochemistry, University of Washington, Seattle, Washington 98195, United States
| | - Basile I. M. Wicky
- Institute for Protein Design, University of Washington, Seattle, Washington 98195, United States
- Department of Biochemistry, University of Washington, Seattle, Washington 98195, United States
| | - Lukas F. Milles
- Institute for Protein Design, University of Washington, Seattle, Washington 98195, United States
- Department of Biochemistry, University of Washington, Seattle, Washington 98195, United States
| | - Justas Dauparas
- Institute for Protein Design, University of Washington, Seattle, Washington 98195, United States
- Department of Biochemistry, University of Washington, Seattle, Washington 98195, United States
| | - Jue Wang
- Institute for Protein Design, University of Washington, Seattle, Washington 98195, United States
- Department of Biochemistry, University of Washington, Seattle, Washington 98195, United States
| | - Yakov Kipnis
- Institute for Protein Design, University of Washington, Seattle, Washington 98195, United States
- Department of Biochemistry, University of Washington, Seattle, Washington 98195, United States
- Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, United States
| | - Noel Jameson
- Department of Chemistry, University of Washington, Seattle, Washington 98195, United States
| | - Alex Kang
- Institute for Protein Design, University of Washington, Seattle, Washington 98195, United States
| | - Joshmyn De La Cruz
- Institute for Protein Design, University of Washington, Seattle, Washington 98195, United States
| | - Banumathi Sankaran
- Berkeley Center for Structural Biology, Molecular Biophysics, and Integrated Bioimaging, Lawrence Berkeley Laboratory, Berkeley, California 94720, United States
| | - Asim K. Bera
- Institute for Protein Design, University of Washington, Seattle, Washington 98195, United States
- Department of Biochemistry, University of Washington, Seattle, Washington 98195, United States
| | - Gonzalo Jiménez-Osés
- Center for Cooperative Research in Biosciences, Basque Research and Technology Alliance, Derio 48160, Spain
- Ikerbasque, Basque Foundation for Science, Bilbao 48013, Spain
| | - David Baker
- Institute for Protein Design, University of Washington, Seattle, Washington 98195, United States
- Department of Biochemistry, University of Washington, Seattle, Washington 98195, United States
- Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, United States
| |
|
64
|
Minot M, Reddy ST. Meta learning addresses noisy and under-labeled data in machine learning-guided antibody engineering. Cell Syst 2024; 15:4-18.e4. [PMID: 38194961 DOI: 10.1016/j.cels.2023.12.003] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 01/30/2023] [Revised: 07/21/2023] [Accepted: 12/07/2023] [Indexed: 01/11/2024]
Abstract
Machine learning-guided protein engineering is rapidly progressing; however, collecting high-quality, large datasets remains a bottleneck. Directed evolution and protein engineering studies often require extensive experimental processes to eliminate noise and label protein sequence-function data. Meta learning has proven effective in other fields in learning from noisy data via bi-level optimization given the availability of a small dataset with trusted labels. Here, we leverage meta learning approaches to overcome noisy and under-labeled data and expedite workflows in antibody engineering. We generate yeast display antibody mutagenesis libraries and screen them for target antigen binding followed by deep sequencing. We then create representative learning tasks, including learning from noisy training data, positive and unlabeled learning, and learning out of distribution properties. We demonstrate that meta learning has the potential to reduce experimental screening time and improve the robustness of machine learning models by training with noisy and under-labeled training data.
Affiliation(s)
- Mason Minot
- ETH Zurich, Department of Biosystems Science and Engineering, Basel 4056, Switzerland
| | - Sai T Reddy
- ETH Zurich, Department of Biosystems Science and Engineering, Basel 4056, Switzerland.
| |
|
65
|
Chen Z, Wu T, Yu S, Li M, Fan X, Huo YX. Self-assembly systems to troubleshoot metabolic engineering challenges. Trends Biotechnol 2024; 42:43-60. [PMID: 37451946 DOI: 10.1016/j.tibtech.2023.06.009] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 04/19/2023] [Revised: 06/18/2023] [Accepted: 06/23/2023] [Indexed: 07/18/2023]
Abstract
Enzyme self-assembly is a technology in which enzyme units can aggregate into ordered macromolecules, assisted by scaffolds. In metabolic engineering, self-assembly strategies have been explored for aggregating multiple enzymes in the same pathway to improve sequential catalytic efficiency, which in turn enables high-level production. The performance of the scaffolds is critical to the formation of an efficient and stable assembly system. This review comprehensively analyzes these scaffolds by exploring how they assemble, and it illustrates how to apply self-assembly strategies for different modules in metabolic engineering. Functional modifications to scaffolds will further promote efficient strategies for production.
Affiliation(s)
- Zhenya Chen
- Key Laboratory of Molecular Medicine and Biotherapy, School of Life Science, Beijing Institute of Technology, No. 5 South Zhongguancun Street, 100081, Beijing, China
| | - Tong Wu
- Key Laboratory of Molecular Medicine and Biotherapy, School of Life Science, Beijing Institute of Technology, No. 5 South Zhongguancun Street, 100081, Beijing, China
| | - Shengzhu Yu
- Key Laboratory of Molecular Medicine and Biotherapy, School of Life Science, Beijing Institute of Technology, No. 5 South Zhongguancun Street, 100081, Beijing, China
| | - Min Li
- Key Laboratory of Molecular Medicine and Biotherapy, School of Life Science, Beijing Institute of Technology, No. 5 South Zhongguancun Street, 100081, Beijing, China
| | - Xuanhe Fan
- Key Laboratory of Molecular Medicine and Biotherapy, School of Life Science, Beijing Institute of Technology, No. 5 South Zhongguancun Street, 100081, Beijing, China
| | - Yi-Xin Huo
- Key Laboratory of Molecular Medicine and Biotherapy, School of Life Science, Beijing Institute of Technology, No. 5 South Zhongguancun Street, 100081, Beijing, China.
| |
|
66
|
Goudy OJ, Nallathambi A, Kinjo T, Randolph NZ, Kuhlman B. In silico evolution of autoinhibitory domains for a PD-L1 antagonist using deep learning models. Proc Natl Acad Sci U S A 2023; 120:e2307371120. [PMID: 38032933 PMCID: PMC10710080 DOI: 10.1073/pnas.2307371120] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 05/04/2023] [Accepted: 09/24/2023] [Indexed: 12/02/2023]
Abstract
There has been considerable progress in the development of computational methods for designing protein-protein interactions, but engineering high-affinity binders without extensive screening and maturation remains challenging. Here, we test a protein design pipeline that uses iterative rounds of deep learning (DL)-based structure prediction (AlphaFold2) and sequence optimization (ProteinMPNN) to design autoinhibitory domains (AiDs) for a PD-L1 antagonist. With the goal of creating an anticancer agent that is inactive until reaching the tumor environment, we sought to create autoinhibited (or masked) forms of the PD-L1 antagonist that can be unmasked by tumor-enriched proteases. Twenty-three de novo designed AiDs, varying in length and topology, were fused to the antagonist with a protease-sensitive linker, and binding to PD-L1 was measured with and without protease treatment. Nine of the fusion proteins demonstrated conditional binding to PD-L1, and the top-performing AiDs were selected for further characterization as single-domain proteins. Without any experimental affinity maturation, four of the AiDs bind to the PD-L1 antagonist with equilibrium dissociation constants (KDs) below 150 nM, with the lowest KD equal to 0.9 nM. Our study demonstrates that DL-based protein modeling can be used to rapidly generate high-affinity protein binders.
Affiliation(s)
- Odessa J. Goudy
- Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, NC 27599
| | - Amrita Nallathambi
- Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, NC 27599
| | - Tomoaki Kinjo
- Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, NC 27599
| | - Nicholas Z. Randolph
- Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, NC 27599
- Department of Bioinformatics and Computational Biology, University of North Carolina School of Medicine, Chapel Hill, NC 27599
| | - Brian Kuhlman
- Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, NC 27599
- Department of Bioinformatics and Computational Biology, University of North Carolina School of Medicine, Chapel Hill, NC 27599
- Lineberger Comprehensive Cancer Center, University of North Carolina School of Medicine, Chapel Hill, NC 27599
| |
|
67
|
Buller R, Lutz S, Kazlauskas RJ, Snajdrova R, Moore JC, Bornscheuer UT. From nature to industry: Harnessing enzymes for biocatalysis. Science 2023; 382:eadh8615. [PMID: 37995253 DOI: 10.1126/science.adh8615] [Citation(s) in RCA: 134] [Impact Index Per Article: 67.0] [Received: 07/14/2023] [Accepted: 10/17/2023] [Indexed: 11/25/2023]
Abstract
Biocatalysis harnesses enzymes to make valuable products. This green technology is used in countless applications from bench scale to industrial production and allows practitioners to access complex organic molecules, often with fewer synthetic steps and reduced waste. The last decade has seen an explosion in the development of experimental and computational tools to tailor enzymatic properties, equipping enzyme engineers with the ability to create biocatalysts that perform reactions not present in nature. By using (chemo)-enzymatic synthesis routes or orchestrating intricate enzyme cascades, scientists can synthesize elaborate targets ranging from DNA and complex pharmaceuticals to starch made in vitro from CO2-derived methanol. In addition, new chemistries have emerged through the combination of biocatalysis with transition metal catalysis, photocatalysis, and electrocatalysis. This review highlights recent key developments, identifies current limitations, and provides a future prospect for this rapidly developing technology.
Affiliation(s)
- R Buller
- Competence Center for Biocatalysis, Institute of Chemistry and Biotechnology, Zurich University of Applied Sciences, 8820 Wädenswil, Switzerland
| | - S Lutz
- Codexis Incorporated, Redwood City, CA 94063, USA
| | - R J Kazlauskas
- Department of Biochemistry, Molecular Biology and Biophysics, Biotechnology Institute, University of Minnesota, Saint Paul, MN 55108, USA
| | - R Snajdrova
- Novartis Institutes for BioMedical Research, Global Discovery Chemistry, 4056 Basel, Switzerland
| | - J C Moore
- MRL, Merck & Co., Rahway, NJ 07065, USA
| | - U T Bornscheuer
- Institute of Biochemistry, Dept. of Biotechnology and Enzyme Catalysis, Greifswald University, Greifswald, Germany
| |
|
68
|
Khakzad H, Igashov I, Schneuing A, Goverde C, Bronstein M, Correia B. A new age in protein design empowered by deep learning. Cell Syst 2023; 14:925-939. [PMID: 37972559 DOI: 10.1016/j.cels.2023.10.006] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 03/02/2023] [Revised: 06/22/2023] [Accepted: 10/11/2023] [Indexed: 11/19/2023]
Abstract
The rapid progress in the field of deep learning has had a significant impact on protein design. Deep learning methods have recently produced a breakthrough in protein structure prediction, leading to the availability of high-quality models for millions of proteins. Along with novel architectures for generative modeling and sequence analysis, they have revolutionized the protein design field in the past few years remarkably by improving the accuracy and ability to identify novel protein sequences and structures. Deep neural networks can now learn and extract the fundamental features of protein structures, predict how they interact with other biomolecules, and have the potential to create new effective drugs for treating disease. As their applicability in protein design is rapidly growing, we review the recent developments and technology in deep learning methods and provide examples of their performance to generate novel functional proteins.
Affiliation(s)
- Hamed Khakzad
- Université de Lorraine, CNRS, Inria, LORIA, 54000 Nancy, France; École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
| | - Ilia Igashov
- École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
| | - Arne Schneuing
- École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
| | - Casper Goverde
- École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
| | | | - Bruno Correia
- École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland.
| |
|
69
|
Kouba P, Kohout P, Haddadi F, Bushuiev A, Samusevich R, Sedlar J, Damborsky J, Pluskal T, Sivic J, Mazurenko S. Machine Learning-Guided Protein Engineering. ACS Catal 2023; 13:13863-13895. [PMID: 37942269 PMCID: PMC10629210 DOI: 10.1021/acscatal.3c02743] [Citation(s) in RCA: 41] [Impact Index Per Article: 20.5] [Received: 06/15/2023] [Revised: 09/20/2023] [Indexed: 11/10/2023]
Abstract
Recent progress in engineering highly promising biocatalysts has increasingly involved machine learning methods. These methods leverage existing experimental and simulation data to aid in the discovery and annotation of promising enzymes, as well as in suggesting beneficial mutations for improving known targets. The field of machine learning for protein engineering is gathering steam, driven by recent success stories and notable progress in other areas. It already encompasses ambitious tasks such as understanding and predicting protein structure and function, catalytic efficiency, enantioselectivity, protein dynamics, stability, solubility, aggregation, and more. Nonetheless, the field is still evolving, with many challenges to overcome and questions to address. In this Perspective, we provide an overview of ongoing trends in this domain, highlight recent case studies, and examine the current limitations of machine learning-based methods. We emphasize the crucial importance of thorough experimental validation of emerging models before their use for rational protein design. We present our opinions on the fundamental problems and outline the potential directions for future research.
Affiliation(s)
- Petr Kouba
- Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic
- Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague, Jugoslavskych partyzanu 1580/3, 160 00 Prague 6, Czech Republic
- Faculty of Electrical Engineering, Czech Technical University in Prague, Technicka 2, 166 27 Prague 6, Czech Republic
| | - Pavel Kohout
- Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic
- International Clinical Research Center, St. Anne’s University Hospital Brno, Pekarska 53, 656 91 Brno, Czech Republic
| | - Faraneh Haddadi
- Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic
- International Clinical Research Center, St. Anne’s University Hospital Brno, Pekarska 53, 656 91 Brno, Czech Republic
| | - Anton Bushuiev
- Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague, Jugoslavskych partyzanu 1580/3, 160 00 Prague 6, Czech Republic
| | - Raman Samusevich
- Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague, Jugoslavskych partyzanu 1580/3, 160 00 Prague 6, Czech Republic
- Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo nám. 2, 160 00 Prague 6, Czech Republic
| | - Jiri Sedlar
- Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague, Jugoslavskych partyzanu 1580/3, 160 00 Prague 6, Czech Republic
| | - Jiri Damborsky
- Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic
- International Clinical Research Center, St. Anne’s University Hospital Brno, Pekarska 53, 656 91 Brno, Czech Republic
| | - Tomas Pluskal
- Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo nám. 2, 160 00 Prague 6, Czech Republic
| | - Josef Sivic
- Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague, Jugoslavskych partyzanu 1580/3, 160 00 Prague 6, Czech Republic
| | - Stanislav Mazurenko
- Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic
- International Clinical Research Center, St. Anne’s University Hospital Brno, Pekarska 53, 656 91 Brno, Czech Republic
| |
|
70
|
Lee GR, Pellock SJ, Norn C, Tischer D, Dauparas J, Anischenko I, Mercer JAM, Kang A, Bera A, Nguyen H, Goreshnik I, Vafeados D, Roullier N, Han HL, Coventry B, Haddox HK, Liu DR, Yeh AHW, Baker D. Small-molecule binding and sensing with a designed protein family. bioRxiv 2023:2023.11.01.565201. [PMID: 37961294 PMCID: PMC10635051 DOI: 10.1101/2023.11.01.565201] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Indexed: 11/15/2023]
Abstract
Despite transformative advances in protein design with deep learning, the design of small-molecule-binding proteins and sensors for arbitrary ligands remains a grand challenge. Here we combine deep learning and physics-based methods to generate a family of proteins with diverse and designable pocket geometries, which we employ to computationally design binders for six chemically and structurally distinct small-molecule targets. Biophysical characterization of the designed binders revealed nanomolar to low micromolar binding affinities and atomic-level design accuracy. The bound ligands are exposed at one edge of the binding pocket, enabling the de novo design of chemically induced dimerization (CID) systems; we take advantage of this to create a biosensor with nanomolar sensitivity for cortisol. Our approach provides a general method to design proteins that bind and sense small molecules for a wide range of analytical, environmental, and biomedical applications.
|
71
|
An L, Hicks DR, Zorine D, Dauparas J, Wicky BIM, Milles LF, Courbet A, Bera AK, Nguyen H, Kang A, Carter L, Baker D. Hallucination of closed repeat proteins containing central pockets. Nat Struct Mol Biol 2023; 30:1755-1760. [PMID: 37770718 PMCID: PMC10643118 DOI: 10.1038/s41594-023-01112-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/16/2022] [Accepted: 08/28/2023] [Indexed: 09/30/2023]
Abstract
In pseudocyclic proteins, such as TIM barrels, β barrels, and some helical transmembrane channels, a single subunit is repeated in a cyclic pattern, giving rise to a central cavity that can serve as a pocket for ligand binding or enzymatic activity. Inspired by these proteins, we devised a deep-learning-based approach to broadly exploring the space of closed repeat proteins starting from only a specification of the repeat number and length. Biophysical data for 38 structurally diverse pseudocyclic designs produced in Escherichia coli are consistent with the design models, and the three crystal structures we were able to obtain are very close to the designed structures. Docking studies suggest the diversity of folds and central pockets provide effective starting points for designing small-molecule binders and enzymes.
Affiliation(s)
- Linna An
- Department of Biochemistry, University of Washington, Seattle, WA, USA.
- Institute for Protein Design, University of Washington, Seattle, WA, USA.
| | - Derrick R Hicks
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Dmitri Zorine
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Justas Dauparas
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Basile I M Wicky
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Lukas F Milles
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Alexis Courbet
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
| | - Asim K Bera
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Hannah Nguyen
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Alex Kang
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Lauren Carter
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - David Baker
- Department of Biochemistry, University of Washington, Seattle, WA, USA.
- Institute for Protein Design, University of Washington, Seattle, WA, USA.
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA.
| |
|
72
|
Ingraham JB, Baranov M, Costello Z, Barber KW, Wang W, Ismail A, Frappier V, Lord DM, Ng-Thow-Hing C, Van Vlack ER, Tie S, Xue V, Cowles SC, Leung A, Rodrigues JV, Morales-Perez CL, Ayoub AM, Green R, Puentes K, Oplinger F, Panwar NV, Obermeyer F, Root AR, Beam AL, Poelwijk FJ, Grigoryan G. Illuminating protein space with a programmable generative model. Nature 2023; 623:1070-1078. [PMID: 37968394 PMCID: PMC10686827 DOI: 10.1038/s41586-023-06728-8] [Citation(s) in RCA: 125] [Impact Index Per Article: 62.5] [Received: 12/20/2022] [Accepted: 10/06/2023] [Indexed: 11/17/2023]
Abstract
Three billion years of evolution has produced a tremendous diversity of protein molecules, but the full potential of proteins is likely to be much greater. Accessing this potential has been challenging for both computation and experiments because the space of possible protein molecules is much larger than the space of those likely to have functions. Here we introduce Chroma, a generative model for proteins and protein complexes that can directly sample novel protein structures and sequences, and that can be conditioned to steer the generative process towards desired properties and functions. To enable this, we introduce a diffusion process that respects the conformational statistics of polymer ensembles, an efficient neural architecture for molecular systems that enables long-range reasoning with sub-quadratic scaling, layers for efficiently synthesizing three-dimensional structures of proteins from predicted inter-residue geometries and a general low-temperature sampling algorithm for diffusion models. Chroma achieves protein design as Bayesian inference under external constraints, which can involve symmetries, substructure, shape, semantics and even natural-language prompts. The experimental characterization of 310 proteins shows that sampling from Chroma results in proteins that are highly expressed, fold and have favourable biophysical properties. The crystal structures of two designed proteins exhibit atomistic agreement with Chroma samples (a backbone root-mean-square deviation of around 1.0 Å). With this unified approach to protein design, we hope to accelerate the programming of protein matter to benefit human health, materials science and synthetic biology.
Affiliation(s)
- Wujie Wang
- Generate Biomedicines, Somerville, MA, USA
| | - Shan Tie
- Generate Biomedicines, Somerville, MA, USA
| | - Alan Leung
- Generate Biomedicines, Somerville, MA, USA
| |
|
73
|
Capponi S, Daniels KG. Harnessing the power of artificial intelligence to advance cell therapy. Immunol Rev 2023; 320:147-165. [PMID: 37415280 DOI: 10.1111/imr.13236] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 05/02/2023] [Accepted: 06/17/2023] [Indexed: 07/08/2023]
Abstract
Cell therapies are powerful technologies in which human cells are reprogrammed for therapeutic applications such as killing cancer cells or replacing defective cells. The technologies underlying cell therapies are increasing in effectiveness and complexity, making rational engineering of cell therapies more difficult. Creating the next generation of cell therapies will require improved experimental approaches and predictive models. Artificial intelligence (AI) and machine learning (ML) methods have revolutionized several fields in biology including genome annotation, protein structure prediction, and enzyme design. In this review, we discuss the potential of combining experimental library screens and AI to build predictive models for the development of modular cell therapy technologies. Advances in DNA synthesis and high-throughput screening techniques enable the construction and screening of libraries of modular cell therapy constructs. AI and ML models trained on this screening data can accelerate the development of cell therapies by generating predictive models, design rules, and improved designs.
Affiliation(s)
- Sara Capponi
- Department of Functional Genomics and Cellular Engineering, IBM Almaden Research Center, San Jose, California, USA
- Center for Cellular Construction, San Francisco, California, USA
| | - Kyle G Daniels
- Department of Cellular and Molecular Pharmacology, University of California, San Francisco, California, USA
- Department of Genetics, Stanford University School of Medicine, Stanford, California, USA
| |
|
74
|
Meador K, Castells-Graells R, Aguirre R, Sawaya MR, Arbing MA, Sherman T, Senarathne C, Yeates TO. A Suite of Designed Protein Cages Using Machine Learning Algorithms and Protein Fragment-Based Protocols. bioRxiv 2023:2023.10.09.561468. [PMID: 37873110 PMCID: PMC10592684 DOI: 10.1101/2023.10.09.561468] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 10/25/2023]
Abstract
Designed protein cages and related materials provide unique opportunities for applications in biotechnology and medicine, while methods for their creation remain challenging and unpredictable. In the present study, we apply new computational approaches to design a suite of new tetrahedrally symmetric, self-assembling protein cages. For the generation of docked poses, we emphasize a protein fragment-based approach, while for de novo interface design, a comparison of computational protocols highlights the power and increased experimental success achieved using the machine learning program ProteinMPNN. In relating information from docking and design, we observe that agreement between fragment-based sequence preferences and ProteinMPNN sequence inference correlates with experimental success. Additional insights for designing polar interactions are highlighted by experimentally testing larger and more polar interfaces. In all, using X-ray crystallography and cryo-EM, we report five structures for seven protein cages, with atomic resolution in the best case reaching 2.0 Å. We also report structures of two incompletely assembled protein cages, providing unique insights into one type of assembly failure. The new set of designed cages and their structures add substantially to the body of available protein nanoparticles, and to methodologies for their creation.
Affiliation(s)
- Kyle Meador
- Department of Chemistry and Biochemistry, University of California, Los Angeles, CA, USA 90095
| | - Roman Aguirre
- Department of Chemistry and Biochemistry, University of California, Los Angeles, CA, USA 90095
| | - Michael R. Sawaya
- UCLA-DOE Institute for Genomics and Proteomics, Los Angeles, CA, USA 90095
| | - Mark A. Arbing
- UCLA-DOE Institute for Genomics and Proteomics, Los Angeles, CA, USA 90095
| | - Trent Sherman
- Department of Chemistry and Biochemistry, University of California, Los Angeles, CA, USA 90095
| | - Chethaka Senarathne
- Department of Chemistry and Biochemistry, University of California, Los Angeles, CA, USA 90095
| | - Todd O. Yeates
- Department of Chemistry and Biochemistry, University of California, Los Angeles, CA, USA 90095
- UCLA-DOE Institute for Genomics and Proteomics, Los Angeles, CA, USA 90095
| |
|
75
|
Roel-Touris J, Nadal M, Marcos E. Single-chain dimers from de novo immunoglobulins as robust scaffolds for multiple binding loops. Nat Commun 2023; 14:5939. [PMID: 37741853 PMCID: PMC10517939 DOI: 10.1038/s41467-023-41717-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/03/2023] [Accepted: 09/15/2023] [Indexed: 09/25/2023] Open
Abstract
Antibody derivatives have sought to recapitulate the antigen binding properties of antibodies, but with improved biophysical attributes convenient for therapeutic, diagnostic and research applications. However, their success has been limited by the naturally occurring structure of the immunoglobulin dimer displaying hypervariable binding loops, which is hard to modify by traditional engineering approaches. Here, we devise geometrical principles for de novo designing single-chain immunoglobulin dimers, as a tunable two-domain architecture that optimizes biophysical properties through more favorable dimer interfaces. Guided by these principles, we computationally designed protein scaffolds that were hyperstable, structurally accurate and robust for accommodating multiple functional loops, both individually and in combination, as confirmed through biochemical assays and X-ray crystallography. We showcase the modularity of this architecture by deep-learning-based diversification, opening up the possibility for tailoring the number, positioning, and relative orientation of ligand-binding loops targeting one or two distal epitopes. Our results provide a route to custom-design robust protein scaffolds for harboring multiple functional loops.
Affiliation(s)
- Jorge Roel-Touris
- Protein Design and Modeling Lab, Department of Structural and Molecular Biology, Molecular Biology Institute of Barcelona (IBMB), CSIC, Baldiri Reixac 10, 08028, Barcelona, Spain
| | - Marta Nadal
- Protein Design and Modeling Lab, Department of Structural and Molecular Biology, Molecular Biology Institute of Barcelona (IBMB), CSIC, Baldiri Reixac 10, 08028, Barcelona, Spain
| | - Enrique Marcos
- Protein Design and Modeling Lab, Department of Structural and Molecular Biology, Molecular Biology Institute of Barcelona (IBMB), CSIC, Baldiri Reixac 10, 08028, Barcelona, Spain.
| |
|
76
|
Han K, Zhang Z, Tezcan FA. Spatially Patterned, Porous Protein Crystals as Multifunctional Materials. J Am Chem Soc 2023; 145:19932-19944. [PMID: 37642457 DOI: 10.1021/jacs.3c06348] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 08/31/2023]
Abstract
While the primary use of protein crystals has historically been in crystallographic structure determination, they have recently emerged as promising materials with many advantageous properties such as high porosity, biocompatibility, stability, structural and functional versatility, and genetic/chemical tailorability. Here, we report that the utility of protein crystals as functional materials can be further augmented through their spatial patterning and control of their morphologies. To this end, we took advantage of the chemically and kinetically controllable nature of ferritin self-assembly and constructed core-shell crystals with chemically distinct domains, tunable structural patterns, and morphologies. The spatial organization within ferritin crystals enabled the generation of patterned, multi-enzyme frameworks with cooperative catalytic behavior. We further exploited the differential growth kinetics of ferritin crystal facets to assemble Janus-type architectures with an anisotropic arrangement of chemically distinct domains. These examples represent a step toward using protein crystals as reaction vessels for complex multi-step reactions and broadening their utility as functional, solid-state materials. Our results demonstrate that morphology control and spatial patterning, which are key concepts in materials science and nanotechnology, can also be applied for engineering protein crystals.
Affiliation(s)
- Kenneth Han
- Department of Chemistry and Biochemistry, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093, United States
| | - Zhiyin Zhang
- Department of Chemistry and Biochemistry, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093, United States
| | - F Akif Tezcan
- Department of Chemistry and Biochemistry, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093, United States
- Materials Science and Engineering, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093, United States
| |
|
77
|
Praetorius F, Leung PJY, Tessmer MH, Broerman A, Demakis C, Dishman AF, Pillai A, Idris A, Juergens D, Dauparas J, Li X, Levine PM, Lamb M, Ballard RK, Gerben SR, Nguyen H, Kang A, Sankaran B, Bera AK, Volkman BF, Nivala J, Stoll S, Baker D. Design of stimulus-responsive two-state hinge proteins. Science 2023; 381:754-760. [PMID: 37590357 PMCID: PMC10697137 DOI: 10.1126/science.adg7731] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 01/24/2023] [Accepted: 07/11/2023] [Indexed: 08/19/2023]
Abstract
In nature, proteins that switch between two conformations in response to environmental stimuli structurally transduce biochemical information in a manner analogous to how transistors control information flow in computing devices. Designing proteins with two distinct but fully structured conformations is a challenge for protein design as it requires sculpting an energy landscape with two distinct minima. Here we describe the design of "hinge" proteins that populate one designed state in the absence of ligand and a second designed state in the presence of ligand. X-ray crystallography, electron microscopy, double electron-electron resonance spectroscopy, and binding measurements demonstrate that despite the significant structural differences the two states are designed with atomic level accuracy and that the conformational and binding equilibria are closely coupled.
Affiliation(s)
- Florian Praetorius
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Philip J. Y. Leung
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Graduate Program in Molecular Engineering, University of Washington, Seattle, WA, USA
| | - Maxx H. Tessmer
- Department of Chemistry, University of Washington, Seattle, WA, USA
| | - Adam Broerman
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Department of Chemical Engineering, University of Washington, Seattle, WA, USA
| | - Cullen Demakis
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Graduate Program in Biological Physics, Structure, and Design, University of Washington, Seattle, Washington, USA
| | - Acacia F. Dishman
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Department of Biochemistry, Medical College of Wisconsin, Milwaukee, WI, USA
- Medical Scientist Training Program, Medical College of Wisconsin, Milwaukee, WI, USA
| | - Arvind Pillai
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Abbas Idris
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Department of Bioengineering, University of Washington, Seattle, WA, USA
| | - David Juergens
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Graduate Program in Molecular Engineering, University of Washington, Seattle, WA, USA
| | - Justas Dauparas
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Xinting Li
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Paul M. Levine
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Mila Lamb
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Ryanne K. Ballard
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Stacey R. Gerben
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Hannah Nguyen
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Alex Kang
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Banumathi Sankaran
- Molecular Biophysics and Integrated Bioimaging, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Asim K. Bera
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Brian F. Volkman
- Department of Biochemistry, Medical College of Wisconsin, Milwaukee, WI, USA
| | - Jeff Nivala
- Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA, USA
- Molecular Engineering and Sciences Institute, University of Washington, Seattle, WA, USA
| | - Stefan Stoll
- Department of Chemistry, University of Washington, Seattle, WA, USA
| | - David Baker
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
| |
|
78
|
Kao HW, Lu WL, Ho MR, Lin YF, Hsieh YJ, Ko TP, Danny Hsu ST, Wu KP. Robust Design of Effective Allosteric Activators for Rsp5 E3 Ligase Using the Machine Learning Tool ProteinMPNN. ACS Synth Biol 2023; 12:2310-2319. [PMID: 37556858 DOI: 10.1021/acssynbio.3c00042] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 08/11/2023]
Abstract
We used the deep learning tool ProteinMPNN to redesign ubiquitin (Ub) as a specific and functionally stimulating/enhancing binder of the Rsp5 E3 ligase. We generated 20 extensively mutated (up to 37 of 76 residues) recombinant Ub variants (UbVs), named R1 to R20, displaying well-folded structures and high thermal stabilities. These UbVs can also form stable complexes with Rsp5, as predicted using AlphaFold2. Three of the UbVs bound to Rsp5 with low micromolar affinity, with R4 and R12 effectively enhancing the Rsp5 activity sixfold. AlphaFold2 predicts that R4 and R12 bind to Rsp5's exosite in an identical manner to the Rsp5-Ub template, thereby allosterically activating Rsp5-Ub thioester formation. Thus, we present a virtual solution for rapidly and cost-effectively designing UbVs as functional modulators of Ub-related enzymes.
Affiliation(s)
- Hsi-Wen Kao
- Institute of Biological Chemistry, Academia Sinica, Taipei 115, Taiwan
| | - Wei-Lin Lu
- Institute of Biological Chemistry, Academia Sinica, Taipei 115, Taiwan
| | - Meng-Ru Ho
- Institute of Biological Chemistry, Academia Sinica, Taipei 115, Taiwan
| | - Yu-Fong Lin
- Institute of Biological Chemistry, Academia Sinica, Taipei 115, Taiwan
- Institute of Biochemical Science, National Taiwan University, Taipei 106, Taiwan
| | - Yun-Jung Hsieh
- Institute of Biological Chemistry, Academia Sinica, Taipei 115, Taiwan
- Institute of Biochemical Science, National Taiwan University, Taipei 106, Taiwan
| | - Tzu-Ping Ko
- Institute of Biological Chemistry, Academia Sinica, Taipei 115, Taiwan
| | - Shang-Te Danny Hsu
- Institute of Biological Chemistry, Academia Sinica, Taipei 115, Taiwan
- Institute of Biochemical Science, National Taiwan University, Taipei 106, Taiwan
- International Institute for Sustainability with Knotted Chiral Meta Matter, Hiroshima University, Higashihiroshima 739-8527, Japan
| | - Kuen-Phon Wu
- Institute of Biological Chemistry, Academia Sinica, Taipei 115, Taiwan
- Institute of Biochemical Science, National Taiwan University, Taipei 106, Taiwan
| |
|
79
|
de Haas RJ, Brunette N, Goodson A, Dauparas J, Yi SY, Yang EC, Dowling Q, Nguyen H, Kang A, Bera AK, Sankaran B, de Vries R, Baker D, King NP. Rapid and automated design of two-component protein nanomaterials using ProteinMPNN. bioRxiv 2023:2023.08.04.551935. [PMID: 37577478 PMCID: PMC10418170 DOI: 10.1101/2023.08.04.551935] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 08/15/2023]
Abstract
The design of novel protein-protein interfaces using physics-based design methods such as Rosetta requires substantial computational resources and manual refinement by expert structural biologists. A new generation of deep learning methods promises to simplify protein-protein interface design and enable its application to a wide variety of problems by researchers from various scientific disciplines. Here we test the ability of a deep learning method for protein sequence design, ProteinMPNN, to design two-component tetrahedral protein nanomaterials and benchmark its performance against Rosetta. ProteinMPNN had a similar success rate to Rosetta, yielding 13 new experimentally confirmed assemblies, but required orders of magnitude less computation and no manual refinement. The interfaces designed by ProteinMPNN were substantially more polar than those designed by Rosetta, which facilitated in vitro assembly of the designed nanomaterials from independently purified components. Crystal structures of several of the assemblies confirmed the accuracy of the design method at high resolution. Our results showcase the potential of deep learning-based methods to unlock the widespread application of designed protein-protein interfaces and self-assembling protein nanomaterials in biotechnology.
|
80
|
Ekins S, Brackmann M, Invernizzi C, Lentzos F. Generative Artificial Intelligence-Assisted Protein Design Must Consider Repurposing Potential. GEN Biotechnology 2023; 2:296-300. [PMID: 37928405 PMCID: PMC10623615 DOI: 10.1089/genbio.2023.0025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2023]
Abstract
Generative artificial intelligence software used for chemical and protein design has repurposing potential. We propose careful discussion in the biotech community on security considerations of such technologies and serious consideration of restrictions to control who can access the software and what applications it is used for.
Affiliation(s)
- Sean Ekins
- Collaborations Pharmaceuticals, Inc., Raleigh, North Carolina, USA
| | - Maximilian Brackmann
- Spiez Laboratory, Federal Department of Defence, Civil Protection and Sports, Spiez, Switzerland
| | - Cédric Invernizzi
- Spiez Laboratory, Federal Department of Defence, Civil Protection and Sports, Spiez, Switzerland
| | - Filippa Lentzos
- Department of War Studies and King's College London, London, United Kingdom
- Department of Global Health and Social Medicine, King's College London, London, United Kingdom
| |
|
81
|
Wang H, Fu T, Du Y, Gao W, Huang K, Liu Z, Chandak P, Liu S, Van Katwyk P, Deac A, Anandkumar A, Bergen K, Gomes CP, Ho S, Kohli P, Lasenby J, Leskovec J, Liu TY, Manrai A, Marks D, Ramsundar B, Song L, Sun J, Tang J, Veličković P, Welling M, Zhang L, Coley CW, Bengio Y, Zitnik M. Scientific discovery in the age of artificial intelligence. Nature 2023; 620:47-60. [PMID: 37532811 DOI: 10.1038/s41586-023-06221-2] [Citation(s) in RCA: 270] [Impact Index Per Article: 135.0] [Received: 03/30/2022] [Accepted: 05/16/2023] [Indexed: 08/04/2023]
Abstract
Artificial intelligence (AI) is being increasingly integrated into scientific discovery to augment and accelerate research, helping scientists to generate hypotheses, design experiments, collect and interpret large datasets, and gain insights that might not have been possible using traditional scientific methods alone. Here we examine breakthroughs over the past decade that include self-supervised learning, which allows models to be trained on vast amounts of unlabelled data, and geometric deep learning, which leverages knowledge about the structure of scientific data to enhance model accuracy and efficiency. Generative AI methods can create designs, such as small-molecule drugs and proteins, by analysing diverse data modalities, including images and sequences. We discuss how these methods can help scientists throughout the scientific process and the central issues that remain despite such advances. Both developers and users of AI tools need a better understanding of when such approaches need improvement, and challenges posed by poor data quality and stewardship remain. These issues cut across scientific disciplines and require developing foundational algorithmic approaches that can contribute to scientific understanding or acquire it autonomously, making them critical areas of focus for AI innovation.
Affiliation(s)
- Hanchen Wang
- Department of Engineering, University of Cambridge, Cambridge, UK
- Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA, USA
- Department of Research and Early Development, Genentech Inc, South San Francisco, CA, USA
- Department of Computer Science, Stanford University, Stanford, CA, USA
| | - Tianfan Fu
- Department of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA, USA
| | - Yuanqi Du
- Department of Computer Science, Cornell University, Ithaca, NY, USA
| | - Wenhao Gao
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Kexin Huang
- Department of Computer Science, Stanford University, Stanford, CA, USA
| | - Ziming Liu
- Department of Physics, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Payal Chandak
- Harvard-MIT Program in Health Sciences and Technology, Cambridge, MA, USA
| | - Shengchao Liu
- Mila - Quebec AI Institute, Montreal, Quebec, Canada
- Université de Montréal, Montreal, Quebec, Canada
| | - Peter Van Katwyk
- Department of Earth, Environmental and Planetary Sciences, Brown University, Providence, RI, USA
- Data Science Institute, Brown University, Providence, RI, USA
| | - Andreea Deac
- Mila - Quebec AI Institute, Montreal, Quebec, Canada
- Université de Montréal, Montreal, Quebec, Canada
| | - Anima Anandkumar
- Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA, USA
- NVIDIA, Santa Clara, CA, USA
| | - Karianne Bergen
- Department of Earth, Environmental and Planetary Sciences, Brown University, Providence, RI, USA
- Data Science Institute, Brown University, Providence, RI, USA
| | - Carla P Gomes
- Department of Computer Science, Cornell University, Ithaca, NY, USA
| | - Shirley Ho
- Center for Computational Astrophysics, Flatiron Institute, New York, NY, USA
- Department of Astrophysical Sciences, Princeton University, Princeton, NJ, USA
- Department of Physics, Carnegie Mellon University, Pittsburgh, PA, USA
- Department of Physics and Center for Data Science, New York University, New York, NY, USA
| | | | - Joan Lasenby
- Department of Engineering, University of Cambridge, Cambridge, UK
| | - Jure Leskovec
- Department of Computer Science, Stanford University, Stanford, CA, USA
| | | | - Arjun Manrai
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Debora Marks
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | | | - Le Song
- BioMap, Beijing, China
- Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, United Arab Emirates
| | - Jimeng Sun
- University of Illinois at Urbana-Champaign, Champaign, IL, USA
| | - Jian Tang
- Mila - Quebec AI Institute, Montreal, Quebec, Canada
- HEC Montréal, Montreal, Quebec, Canada
- CIFAR AI Chair, Toronto, Ontario, Canada
| | - Petar Veličković
- Google DeepMind, London, UK
- Department of Computer Science and Technology, University of Cambridge, Cambridge, UK
| | - Max Welling
- University of Amsterdam, Amsterdam, Netherlands
- Microsoft Research Amsterdam, Amsterdam, Netherlands
| | - Linfeng Zhang
- DP Technology, Beijing, China
- AI for Science Institute, Beijing, China
| | - Connor W Coley
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Yoshua Bengio
- Mila - Quebec AI Institute, Montreal, Quebec, Canada
- Université de Montréal, Montreal, Quebec, Canada
| | - Marinka Zitnik
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Harvard Data Science Initiative, Cambridge, MA, USA
- Kempner Institute for the Study of Natural and Artificial Intelligence, Harvard University, Cambridge, MA, USA
| |
|
82
|
Watson JL, Juergens D, Bennett NR, Trippe BL, Yim J, Eisenach HE, Ahern W, Borst AJ, Ragotte RJ, Milles LF, Wicky BIM, Hanikel N, Pellock SJ, Courbet A, Sheffler W, Wang J, Venkatesh P, Sappington I, Torres SV, Lauko A, De Bortoli V, Mathieu E, Ovchinnikov S, Barzilay R, Jaakkola TS, DiMaio F, Baek M, Baker D. De novo design of protein structure and function with RFdiffusion. Nature 2023; 620:1089-1100. [PMID: 37433327 PMCID: PMC10468394 DOI: 10.1038/s41586-023-06415-8] [Citation(s) in RCA: 513] [Impact Index Per Article: 256.5] [Received: 12/14/2022] [Accepted: 07/07/2023] [Indexed: 07/13/2023]
Abstract
There has been considerable recent progress in designing new proteins using deep-learning methods (refs. 1-9). Despite this progress, a general deep-learning framework for protein design that enables solution of a wide range of design challenges, including de novo binder design and design of higher-order symmetric architectures, has yet to be described. Diffusion models (refs. 10, 11) have had considerable success in image and language generative modelling but limited success when applied to protein modelling, probably due to the complexity of protein backbone geometry and sequence-structure relationships. Here we show that by fine-tuning the RoseTTAFold structure prediction network on protein structure denoising tasks, we obtain a generative model of protein backbones that achieves outstanding performance on unconditional and topology-constrained protein monomer design, protein binder design, symmetric oligomer design, enzyme active site scaffolding and symmetric motif scaffolding for therapeutic and metal-binding protein design. We demonstrate the power and generality of the method, called RoseTTAFold diffusion (RFdiffusion), by experimentally characterizing the structures and functions of hundreds of designed symmetric assemblies, metal-binding proteins and protein binders. The accuracy of RFdiffusion is confirmed by the cryogenic electron microscopy structure of a designed binder in complex with influenza haemagglutinin that is nearly identical to the design model. In a manner analogous to networks that produce images from user-specified inputs, RFdiffusion enables the design of diverse functional proteins from simple molecular specifications.
Affiliation(s)
- Joseph L Watson
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - David Juergens
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Graduate Program in Molecular Engineering, University of Washington, Seattle, WA, USA
| | - Nathaniel R Bennett
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Graduate Program in Molecular Engineering, University of Washington, Seattle, WA, USA
| | - Brian L Trippe
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Columbia University, Department of Statistics, New York, NY, USA
- Irving Institute for Cancer Dynamics, Columbia University, New York, NY, USA
| | - Jason Yim
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Helen E Eisenach
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Woody Ahern
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA, USA
| | - Andrew J Borst
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Robert J Ragotte
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Lukas F Milles
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Basile I M Wicky
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Nikita Hanikel
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Samuel J Pellock
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Alexis Courbet
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- National Centre for Scientific Research, École Normale Supérieure rue d'Ulm, Paris, France
| | - William Sheffler
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Jue Wang
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Preetham Venkatesh
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Graduate Program in Biological Physics, Structure and Design, University of Washington, Seattle, WA, USA
| | - Isaac Sappington
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Graduate Program in Biological Physics, Structure and Design, University of Washington, Seattle, WA, USA
| | - Susana Vázquez Torres
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Graduate Program in Biological Physics, Structure and Design, University of Washington, Seattle, WA, USA
| | - Anna Lauko
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Graduate Program in Biological Physics, Structure and Design, University of Washington, Seattle, WA, USA
| | - Valentin De Bortoli
- National Centre for Scientific Research, École Normale Supérieure rue d'Ulm, Paris, France
| | - Emile Mathieu
- Department of Engineering, University of Cambridge, Cambridge, UK
| | - Sergey Ovchinnikov
- Faculty of Applied Sciences, Harvard University, Cambridge, MA, USA
- John Harvard Distinguished Science Fellowship, Harvard University, Cambridge, MA, USA
| | | | | | - Frank DiMaio
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Minkyung Baek
- School of Biological Sciences, Seoul National University, Seoul, Republic of Korea
| | - David Baker
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
| |
|
83
|
Mallik BB, Stanislaw J, Alawathurage TM, Khmelinskaia A. De Novo Design of Polyhedral Protein Assemblies: Before and After the AI Revolution. Chembiochem 2023; 24:e202300117. [PMID: 37014094 DOI: 10.1002/cbic.202300117] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/15/2023] [Revised: 04/03/2023] [Accepted: 04/03/2023] [Indexed: 04/05/2023]
Abstract
Self-assembling polyhedral protein biomaterials have gained attention as engineering targets owing to their naturally evolved sophisticated functions, ranging from protecting macromolecules from the environment to spatially controlling biochemical reactions. Precise computational design of de novo protein polyhedra is possible through two main types of approaches: methods from first principles, using physical and geometrical rules, and more recent data-driven methods based on artificial intelligence (AI), including deep learning (DL). Here, we review first-principles and AI-based approaches for designing finite polyhedral protein assemblies, as well as advances in the structure prediction of such assemblies. We further highlight the possible applications of these materials and explore how the presented approaches can be combined to overcome current challenges and to advance the design of functional protein-based biomaterials.
Affiliation(s)
- Bhoomika Basu Mallik
- Transdisciplinary Research Area, "Building Blocks of Matter and Fundamental Interactions (TRA Matter)", University of Bonn, 53121, Bonn, Germany
- Life and Medical Sciences Institute, University of Bonn, 53115, Bonn, Germany
| | - Jenna Stanislaw
- Transdisciplinary Research Area, "Building Blocks of Matter and Fundamental Interactions (TRA Matter)", University of Bonn, 53121, Bonn, Germany
- Life and Medical Sciences Institute, University of Bonn, 53115, Bonn, Germany
| | - Tharindu Madhusankha Alawathurage
- Transdisciplinary Research Area, "Building Blocks of Matter and Fundamental Interactions (TRA Matter)", University of Bonn, 53121, Bonn, Germany
- Life and Medical Sciences Institute, University of Bonn, 53115, Bonn, Germany
| | - Alena Khmelinskaia
- Transdisciplinary Research Area, "Building Blocks of Matter and Fundamental Interactions (TRA Matter)", University of Bonn, 53121, Bonn, Germany
- Life and Medical Sciences Institute, University of Bonn, 53115, Bonn, Germany
- Current address: Department of Chemistry, Ludwig Maximilian University, 80539, Munich, Germany
| |
|
84
|
|
85
|
Casadevall G, Duran C, Osuna S. AlphaFold2 and Deep Learning for Elucidating Enzyme Conformational Flexibility and Its Application for Design. JACS AU 2023; 3:1554-1562. [PMID: 37388680 PMCID: PMC10302747 DOI: 10.1021/jacsau.3c00188] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 04/14/2023] [Revised: 05/22/2023] [Accepted: 05/22/2023] [Indexed: 07/01/2023]
Abstract
The recent success of AlphaFold2 (AF2) and other deep learning (DL) tools in accurately predicting the folded three-dimensional (3D) structure of proteins and enzymes has revolutionized the structural biology and protein design fields. The 3D structure indeed reveals key information on the arrangement of the catalytic machinery of enzymes and which structural elements gate the active site pocket. However, comprehending enzymatic activity requires a detailed knowledge of the chemical steps involved along the catalytic cycle and the exploration of the multiple thermally accessible conformations that enzymes adopt when in solution. In this Perspective, some of the recent studies showing the potential of AF2 in elucidating the conformational landscape of enzymes are provided. Selected examples of the key developments of AF2-based and DL methods for protein design are discussed, as well as a few enzyme design cases. These studies show the potential of AF2 and DL for allowing the routine computational design of efficient enzymes.
Affiliation(s)
- Guillem Casadevall
- Institut de Química Computacional i Catàlisi (IQCC) and Departament de Química, Universitat de Girona, Maria Aurèlia Capmany 69, 17003 Girona, Spain
| | - Cristina Duran
- Institut de Química Computacional i Catàlisi (IQCC) and Departament de Química, Universitat de Girona, Maria Aurèlia Capmany 69, 17003 Girona, Spain
| | - Sílvia Osuna
- Institut de Química Computacional i Catàlisi (IQCC) and Departament de Química, Universitat de Girona, Maria Aurèlia Capmany 69, 17003 Girona, Spain
- ICREA, Passeig Lluís Companys 23, 08010 Barcelona, Spain
| |
|
86
|
Huddy TF, Hsia Y, Kibler RD, Xu J, Bethel N, Nagarajan D, Redler R, Leung PJY, Courbet A, Yang EC, Bera AK, Coudray N, Calise SJ, Davila-Hernandez FA, Weidle C, Han HL, Li Z, McHugh R, Reggiano G, Kang A, Sankaran B, Dickinson MS, Coventry B, Brunette TJ, Liu Y, Dauparas J, Borst AJ, Ekiert D, Kollman JM, Bhabha G, Baker D. Blueprinting expandable nanomaterials with standardized protein building blocks. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.06.09.544258. [PMID: 37333359 PMCID: PMC10274926 DOI: 10.1101/2023.06.09.544258] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/20/2023]
Abstract
A wooden house frame consists of many different lumber pieces, but because of the regularity of these building blocks, the structure can be designed using straightforward geometrical principles. The design of multicomponent protein assemblies in comparison has been much more complex, largely due to the irregular shapes of protein structures (ref. 1). Here we describe extendable linear, curved, and angled protein building blocks, as well as inter-block interactions that conform to specified geometric standards; assemblies designed using these blocks inherit their extendability and regular interaction surfaces, enabling them to be expanded or contracted by varying the number of modules, and reinforced with secondary struts. Using X-ray crystallography and electron microscopy, we validate nanomaterial designs ranging from simple polygonal and circular oligomers that can be concentrically nested, up to large polyhedral nanocages and unbounded straight "train track" assemblies with reconfigurable sizes and geometries that can be readily blueprinted. Because of the complexity of protein structures and sequence-structure relationships, it has not been previously possible to build up large protein assemblies by deliberate placement of protein backbones onto a blank 3D canvas; the simplicity and geometric regularity of our design platform now enables construction of protein nanomaterials according to "back of an envelope" architectural blueprints.
|
87
|
Wang G, Feng Y, Gao C, Zhang X, Wang Q, Zhang J, Zhang H, Wu Y, Li X, Wang L, Fu Y, Yu X, Zhang D, Liu J, Ding J. Biaxial stretching of polytetrafluoroethylene in industrial scale to fabricate medical ePTFE membrane with node-fibril microstructure. Regen Biomater 2023; 10:rbad056. [PMID: 37397871 PMCID: PMC10310521 DOI: 10.1093/rb/rbad056] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 04/12/2023] [Revised: 05/19/2023] [Accepted: 05/25/2023] [Indexed: 07/04/2023]
Abstract
Expanded polytetrafluoroethylene (ePTFE) is promising in biomedical fields such as covered stents and plastic surgery owing to its excellent biocompatibility and mechanical properties. However, ePTFE material prepared by the traditional biaxial stretching process is thicker in the middle and thinner at the sides due to the bowing effect, which poses a major problem in industrial-scale fabrication. To solve this problem, we design an olive-shaped winding roller to provide the middle part of the ePTFE tape with a greater longitudinal stretching amplitude than the two sides, so as to make up for the excessive longitudinal retraction tendency of the middle part when it is transversely stretched. The as-fabricated ePTFE membrane has, as designed, uniform thickness and node-fibril microstructure. In addition, we examine the effects of the mass ratio of lubricant to PTFE powder, the biaxial stretching ratio and the sintering temperature on the performance of the resultant ePTFE membranes. In particular, the relation between the internal microstructure of the ePTFE membrane and its mechanical properties is revealed. Besides stable mechanical properties, the sintered ePTFE membrane exhibits satisfactory biological properties. We make a series of biological assessments including in vitro hemolysis, coagulation, bacterial reverse mutation and in vivo thrombosis, intracutaneous reactivity test, pyrogen test and subchronic systemic toxicity test; all of the results meet the relevant international standards. The muscle implantation of the sintered ePTFE membrane into rabbits indicates acceptable inflammatory reactions of our sintered ePTFE membrane fabricated on an industrial scale. Such a medical-grade raw material with the unique physical form and condensed-state microstructure is expected to afford an inert biomaterial potentially suitable for stent-graft membranes.
Affiliation(s)
- Gang Wang
- State Key Laboratory of Molecular Engineering of Polymers, Department of Macromolecular Science, Fudan University, Shanghai 200438, China
- R&D Center, Lifetech Scientific (Shenzhen) Co., Ltd., Shenzhen 518057, China
- R&D Center, Lifevalve Medical Scientific Co., Ltd., Shenzhen 518057, China
| | - Yusheng Feng
- R&D Center, Lifetech Scientific (Shenzhen) Co., Ltd., Shenzhen 518057, China
| | - Caiyun Gao
- State Key Laboratory of Molecular Engineering of Polymers, Department of Macromolecular Science, Fudan University, Shanghai 200438, China
| | - Xu Zhang
- R&D Center, Lifetech Scientific (Shenzhen) Co., Ltd., Shenzhen 518057, China
- R&D Center, Lifevalve Medical Scientific Co., Ltd., Shenzhen 518057, China
| | - Qunsong Wang
- State Key Laboratory of Molecular Engineering of Polymers, Department of Macromolecular Science, Fudan University, Shanghai 200438, China
| | - Jie Zhang
- R&D Center, Lifetech Scientific (Shenzhen) Co., Ltd., Shenzhen 518057, China
- R&D Center, Lifevalve Medical Scientific Co., Ltd., Shenzhen 518057, China
| | - Hongjie Zhang
- State Key Laboratory of Molecular Engineering of Polymers, Department of Macromolecular Science, Fudan University, Shanghai 200438, China
| | - Yongqiang Wu
- R&D Center, Lifetech Scientific (Shenzhen) Co., Ltd., Shenzhen 518057, China
- R&D Center, Lifevalve Medical Scientific Co., Ltd., Shenzhen 518057, China
| | - Xin Li
- State Key Laboratory of Molecular Engineering of Polymers, Department of Macromolecular Science, Fudan University, Shanghai 200438, China
| | - Lin Wang
- R&D Center, Lifetech Scientific (Shenzhen) Co., Ltd., Shenzhen 518057, China
- R&D Center, Lifevalve Medical Scientific Co., Ltd., Shenzhen 518057, China
| | - Ye Fu
- State Key Laboratory of Molecular Engineering of Polymers, Department of Macromolecular Science, Fudan University, Shanghai 200438, China
| | - Xiaoye Yu
- State Key Laboratory of Molecular Engineering of Polymers, Department of Macromolecular Science, Fudan University, Shanghai 200438, China
| | - Deyuan Zhang
- R&D Center, Lifetech Scientific (Shenzhen) Co., Ltd., Shenzhen 518057, China
| | - Jianxiong Liu
- R&D Center, Lifetech Scientific (Shenzhen) Co., Ltd., Shenzhen 518057, China
| | - Jiandong Ding
- State Key Laboratory of Molecular Engineering of Polymers, Department of Macromolecular Science, Fudan University, Shanghai 200438, China
| |
|
88
|
Goudy OJ, Nallathambi A, Kinjo T, Randolph N, Kuhlman B. In silico evolution of protein binders with deep learning models for structure prediction and sequence design. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.05.03.539278. [PMID: 37205527 PMCID: PMC10187191 DOI: 10.1101/2023.05.03.539278] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/21/2023]
Abstract
There has been considerable progress in the development of computational methods for designing protein-protein interactions, but engineering high-affinity binders without extensive screening and maturation remains challenging. Here, we test a protein design pipeline that uses iterative rounds of deep learning (DL)-based structure prediction (AlphaFold2) and sequence optimization (ProteinMPNN) to design autoinhibitory domains (AiDs) for a PD-L1 antagonist. Inspired by recent advances in therapeutic design, we sought to create autoinhibited (or masked) forms of the antagonist that can be conditionally activated by proteases. Twenty-three de novo designed AiDs, varying in length and topology, were fused to the antagonist with a protease sensitive linker, and binding to PD-L1 was tested with and without protease treatment. Nine of the fusion proteins demonstrated conditional binding to PD-L1 and the top performing AiDs were selected for further characterization as single domain proteins. Without any experimental affinity maturation, four of the AiDs bind to the PD-L1 antagonist with equilibrium dissociation constants (KDs) below 150 nM, with the lowest KD equal to 0.9 nM. Our study demonstrates that DL-based protein modeling can be used to rapidly generate high affinity protein binders.
Affiliation(s)
- Odessa J Goudy
- Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
| | - Amrita Nallathambi
- Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
| | - Tomoaki Kinjo
- Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
| | - Nicholas Randolph
- Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
- Department of Bioinformatics and Computational Biology, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
| | - Brian Kuhlman
- Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
- Department of Bioinformatics and Computational Biology, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
- Lineberger Comprehensive Cancer Center, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
| |
|
89
|
Lutz ID, Wang S, Norn C, Courbet A, Borst AJ, Zhao YT, Dosey A, Cao L, Xu J, Leaf EM, Treichel C, Litvicov P, Li Z, Goodson AD, Rivera-Sánchez P, Bratovianu AM, Baek M, King NP, Ruohola-Baker H, Baker D. Top-down design of protein architectures with reinforcement learning. Science 2023; 380:266-273. [PMID: 37079676 DOI: 10.1126/science.adf6591] [Citation(s) in RCA: 34] [Impact Index Per Article: 17.0] [Received: 11/05/2022] [Accepted: 03/21/2023] [Indexed: 04/22/2023]
Abstract
As a result of evolutionary selection, the subunits of naturally occurring protein assemblies often fit together with substantial shape complementarity to generate architectures optimal for function in a manner not achievable by current design approaches. We describe a "top-down" reinforcement learning-based design approach that solves this problem using Monte Carlo tree search to sample protein conformers in the context of an overall architecture and specified functional constraints. Cryo-electron microscopy structures of the designed disk-shaped nanopores and ultracompact icosahedra are very close to the computational models. The icosahedra enable very-high-density display of immunogens and signaling molecules, which potentiates vaccine response and angiogenesis induction. Our approach enables the top-down design of complex protein nanomaterials with desired system properties and demonstrates the power of reinforcement learning in protein design.
Affiliation(s)
- Isaac D Lutz
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Department of Bioengineering, University of Washington, Seattle, WA, USA
| | - Shunzhi Wang
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Christoffer Norn
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- BioInnovation Institute, DK2200 Copenhagen N, Denmark
- Alexis Courbet
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
- Andrew J Borst
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Yan Ting Zhao
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Stem Cell and Regenerative Medicine, University of Washington, Seattle, WA, USA
- Oral Health Sciences, University of Washington, Seattle, WA, USA
- Annie Dosey
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Longxing Cao
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang, China
- Jinwei Xu
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Elizabeth M Leaf
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Catherine Treichel
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Patrisia Litvicov
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Stem Cell and Regenerative Medicine, University of Washington, Seattle, WA, USA
- Zhe Li
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Alexander D Goodson
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Minkyung Baek
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- School of Biological Sciences, Seoul National University, Seoul, Republic of Korea
- Neil P King
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Hannele Ruohola-Baker
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Department of Bioengineering, University of Washington, Seattle, WA, USA
- Institute for Stem Cell and Regenerative Medicine, University of Washington, Seattle, WA, USA
- Oral Health Sciences, University of Washington, Seattle, WA, USA
- David Baker
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Department of Bioengineering, University of Washington, Seattle, WA, USA
|
90
|
Kibler RD, Lee S, Kennedy MA, Wicky BIM, Lai SM, Kostelic MM, Li X, Chow CM, Carter L, Wysocki VH, Stoddard BL, Baker D. Stepwise design of pseudosymmetric protein hetero-oligomers. bioRxiv 2023:2023.04.07.535760. [PMID: 37066191 PMCID: PMC10104133 DOI: 10.1101/2023.04.07.535760] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 04/18/2023]
Abstract
Pseudosymmetric hetero-oligomers with three or more unique subunits with overall structural (but not sequence) symmetry play key roles in biology, and systematic approaches for generating such proteins de novo would provide new routes to controlling cell signaling and designing complex protein materials. However, the de novo design of protein hetero-oligomers with three or more distinct chains with nearly identical structures is a challenging problem because it requires the accurate design of multiple protein-protein interfaces simultaneously. Here, we describe a divide-and-conquer approach that breaks the multiple-interface design challenge into a set of more tractable symmetric single-interface redesign problems, followed by structural recombination of the validated homo-oligomers into pseudosymmetric hetero-oligomers. Starting from de novo designed circular homo-oligomers composed of 9 or 24 tandemly repeated units, we redesigned the inter-subunit interfaces to generate 15 new homo-oligomers and recombined them to make 17 new hetero-oligomers, including ABC heterotrimers, A2B2 heterotetramers, and A3B3 and A2B2C2 heterohexamers which assemble with high structural specificity. The symmetric homo-oligomers and pseudosymmetric hetero-oligomers generated for each system share a common backbone, and hence are ideal building blocks for generating and functionalizing larger symmetric assemblies.
Affiliation(s)
- Ryan D. Kibler
- Department of Biochemistry, University of Washington, Seattle, WA 98195, USA
- Institute for Protein Design, University of Washington, Seattle, WA 98195, USA
- Sangmin Lee
- Department of Biochemistry, University of Washington, Seattle, WA 98195, USA
- Institute for Protein Design, University of Washington, Seattle, WA 98195, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
- Madison A. Kennedy
- Department of Biochemistry, University of Washington, Seattle, WA 98195, USA
- Division of Basic Sciences, Fred Hutchinson Cancer Center, Seattle, WA 98006, USA
- Basile I. M. Wicky
- Department of Biochemistry, University of Washington, Seattle, WA 98195, USA
- Institute for Protein Design, University of Washington, Seattle, WA 98195, USA
- Stella M. Lai
- Department of Chemistry and Biochemistry, The Ohio State University, Columbus, OH 43210, USA
- Resource for Native Mass Spectrometry Guided Structural Biology, The Ohio State University, Columbus, OH 43210, USA
- Marius M. Kostelic
- Department of Chemistry and Biochemistry, The Ohio State University, Columbus, OH 43210, USA
- Resource for Native Mass Spectrometry Guided Structural Biology, The Ohio State University, Columbus, OH 43210, USA
- Xinting Li
- Department of Biochemistry, University of Washington, Seattle, WA 98195, USA
- Institute for Protein Design, University of Washington, Seattle, WA 98195, USA
- Cameron M. Chow
- Department of Biochemistry, University of Washington, Seattle, WA 98195, USA
- Institute for Protein Design, University of Washington, Seattle, WA 98195, USA
- Lauren Carter
- Department of Biochemistry, University of Washington, Seattle, WA 98195, USA
- Institute for Protein Design, University of Washington, Seattle, WA 98195, USA
- Vicki H. Wysocki
- Department of Chemistry and Biochemistry, The Ohio State University, Columbus, OH 43210, USA
- Resource for Native Mass Spectrometry Guided Structural Biology, The Ohio State University, Columbus, OH 43210, USA
- Barry L. Stoddard
- Division of Basic Sciences, Fred Hutchinson Cancer Center, Seattle, WA 98006, USA
- David Baker
- Department of Biochemistry, University of Washington, Seattle, WA 98195, USA
- Institute for Protein Design, University of Washington, Seattle, WA 98195, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
|
91
|
Hutskalov I, Linden A, Čorić I. Directional Ionic Bonds. J Am Chem Soc 2023; 145:8291-8298. [PMID: 37027000 PMCID: PMC10119990 DOI: 10.1021/jacs.3c01030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2023] [Indexed: 04/08/2023]
Abstract
Covalent and ionic bonds represent two fundamental forms of bonding between atoms. In contrast to bonds with significant covalent character, ionic bonds are of limited use for the spatial structuring of matter because of the lack of directionality of the electric field around simple ions. We describe a predictable directional orientation of ionic bonds that contain concave nonpolar shields around the charged sites. Such directional ionic bonds offer an alternative to hydrogen bonds and other directional noncovalent interactions for the structuring of organic molecules and materials.
Affiliation(s)
- Illia Hutskalov
- Department of Chemistry, University of Zurich, Winterthurerstrasse 190, CH-8057 Zurich, Switzerland
- Anthony Linden
- Department of Chemistry, University of Zurich, Winterthurerstrasse 190, CH-8057 Zurich, Switzerland
- Ilija Čorić
- Department of Chemistry, University of Zurich, Winterthurerstrasse 190, CH-8057 Zurich, Switzerland
|
92
|
Oohora K. Supramolecular assembling systems of hemoproteins using chemical modifications. J Incl Phenom Macrocycl Chem 2023. [DOI: 10.1007/s10847-023-01181-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/20/2023]
|
93
|
Lin Z, Akin H, Rao R, Hie B, Zhu Z, Lu W, Smetanin N, Verkuil R, Kabeli O, Shmueli Y, Dos Santos Costa A, Fazel-Zarandi M, Sercu T, Candido S, Rives A. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science 2023; 379:1123-1130. [PMID: 36927031 DOI: 10.1126/science.ade2574] [Citation(s) in RCA: 1304] [Impact Index Per Article: 652.0] [Indexed: 03/18/2023]
Abstract
Recent advances in machine learning have leveraged evolutionary information in multiple sequence alignments to predict protein structure. We demonstrate direct inference of full atomic-level protein structure from primary sequence using a large language model. As language models of protein sequences are scaled up to 15 billion parameters, an atomic-resolution picture of protein structure emerges in the learned representations. This results in an order-of-magnitude acceleration of high-resolution structure prediction, which enables large-scale structural characterization of metagenomic proteins. We apply this capability to construct the ESM Metagenomic Atlas by predicting structures for >617 million metagenomic protein sequences, including >225 million that are predicted with high confidence, which gives a view into the vast breadth and diversity of natural proteins.
Affiliation(s)
- Zeming Lin
- FAIR, Meta AI, New York, NY, USA
- New York University, New York, NY, USA
- Brian Hie
- FAIR, Meta AI, New York, NY, USA
- Stanford University, Palo Alto, CA, USA
- Alexander Rives
- FAIR, Meta AI, New York, NY, USA
- New York University, New York, NY, USA
|
94
|
Determinants for an Efficient Enzymatic Catalysis in Poly(Ethylene Terephthalate) Degradation. Catalysts 2023. [DOI: 10.3390/catal13030591] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/18/2023] Open
Abstract
The enzymatic degradation of the recalcitrant poly(ethylene terephthalate) (PET) has been an important biotechnological goal. The present review focuses on the state of the art in enzymatic degradation of PET, and the challenges ahead. This review covers (i) enzymes acting on PET, (ii) protein improvements through selection or engineering, and (iii) strategies to improve biocatalyst–polymer interaction and monomer yields. Finally, this review discusses critical points on PET degradation, and their related experimental aspects, including the control of physicochemical parameters. The search for, and engineering of, PET hydrolases has been widely pursued to achieve this goal, and several examples are discussed here. Many enzymes from various microbial sources have been studied and engineered, but true PET hydrolases (PETases), active at moderate temperatures, were reported only recently. For a circular economy process, terephthalic acid (TPA) production is critical. Some thermophilic cutinases and engineered PETases have been reported to release terephthalic acid in significant amounts. Some bottlenecks in enzyme performance are discussed, including enzyme activity, thermal stability, substrate accessibility, PET microstructures, high crystallinity, molecular mass, mass transfer, and efficient conversion into reusable fragments.
|
95
|
Rettie SA, Campbell KV, Bera AK, Kang A, Kozlov S, De La Cruz J, Adebomi V, Zhou G, DiMaio F, Ovchinnikov S, Bhardwaj G. Cyclic peptide structure prediction and design using AlphaFold. bioRxiv 2023:2023.02.25.529956. [PMID: 36865323 PMCID: PMC9980166 DOI: 10.1101/2023.02.25.529956] [Citation(s) in RCA: 29] [Impact Index Per Article: 14.5] [Indexed: 02/28/2023]
Abstract
Deep learning networks offer considerable opportunities for accurate structure prediction and design of biomolecules. While cyclic peptides have gained significant traction as a therapeutic modality, developing deep learning methods for designing such peptides has been slow, mostly due to the small number of available structures for molecules in this size range. Here, we report approaches to modify the AlphaFold network for accurate structure prediction and design of cyclic peptides. Our results show this approach can accurately predict the structures of native cyclic peptides from a single sequence, with 36 out of 49 cases predicted with high confidence (pLDDT > 0.85) matching the native structure with root mean squared deviation (RMSD) less than 1.5 Å. Further extending our approach, we describe computational methods for designing sequences of peptide backbones generated by other backbone sampling methods and for de novo design of new macrocyclic peptides. We extensively sampled the structural diversity of cyclic peptides of 7 to 13 amino acids, and identified around 10,000 unique design candidates predicted to fold into the designed structures with high confidence. X-ray crystal structures for seven sequences with diverse sizes and structures designed by our approach match very closely with the design models (RMSD < 1.0 Å), highlighting the atomic-level accuracy of our approach. The computational methods and scaffolds developed here provide the basis for custom-designing peptides for targeted therapeutic applications.
Affiliation(s)
- Stephen A. Rettie
- Molecular and Cell Biology program, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Katelyn V. Campbell
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Asim K. Bera
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Alex Kang
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Simon Kozlov
- FAS Division of Science, Harvard University, Cambridge, MA, USA
- Joshmyn De La Cruz
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Victor Adebomi
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Department of Medicinal Chemistry, University of Washington, Seattle, WA, USA
- Guangfeng Zhou
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Frank DiMaio
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Sergey Ovchinnikov
- John Harvard Distinguished Science Fellowship, Harvard University, Cambridge, MA, USA
- FAS Division of Science, Harvard University, Cambridge, MA, USA
- Gaurav Bhardwaj
- Molecular and Cell Biology program, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Department of Medicinal Chemistry, University of Washington, Seattle, WA, USA
|
96
|
Yeh AHW, Norn C, Kipnis Y, Tischer D, Pellock SJ, Evans D, Ma P, Lee GR, Zhang JZ, Anishchenko I, Coventry B, Cao L, Dauparas J, Halabiya S, DeWitt M, Carter L, Houk KN, Baker D. De novo design of luciferases using deep learning. Nature 2023; 614:774-780. [PMID: 36813896 PMCID: PMC9946828 DOI: 10.1038/s41586-023-05696-3] [Citation(s) in RCA: 158] [Impact Index Per Article: 79.0] [Received: 01/19/2022] [Accepted: 01/03/2023] [Indexed: 02/24/2023]
Abstract
De novo enzyme design has sought to introduce active sites and substrate-binding pockets that are predicted to catalyse a reaction of interest into geometrically compatible native scaffolds, but has been limited by a lack of suitable protein structures and the complexity of native protein sequence-structure relationships. Here we describe a deep-learning-based 'family-wide hallucination' approach that generates large numbers of idealized protein structures containing diverse pocket shapes and designed sequences that encode them. We use these scaffolds to design artificial luciferases that selectively catalyse the oxidative chemiluminescence of the synthetic luciferin substrates diphenylterazine and 2-deoxycoelenterazine. The designed active sites position an arginine guanidinium group adjacent to an anion that develops during the reaction in a binding pocket with high shape complementarity. For both luciferin substrates, we obtain designed luciferases with high selectivity; the most active of these is a small (13.9 kDa) and thermostable (with a melting temperature higher than 95 °C) enzyme that has a catalytic efficiency on diphenylterazine (kcat/Km = 10⁶ M⁻¹ s⁻¹) comparable to that of native luciferases, but a much higher substrate specificity. The creation of highly active and specific biocatalysts from scratch with broad applications in biomedicine is a key milestone for computational enzyme design, and our approach should enable generation of a wide range of luciferases and other enzymes.
Affiliation(s)
- Andy Hsien-Wei Yeh
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA, USA
- Christoffer Norn
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Yakov Kipnis
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
- Doug Tischer
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Samuel J Pellock
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Declan Evans
- Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA, USA
- Pengchen Ma
- Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA, USA
- School of Chemistry, Xi'an Key Laboratory of Sustainable Energy Materials Chemistry, MOE Key Laboratory for Nonequilibrium Synthesis and Modulation of Condensed Matter, Xi'an Jiaotong University, Xi'an, China
- Gyu Rie Lee
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Jason Z Zhang
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Ivan Anishchenko
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Brian Coventry
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
- Longxing Cao
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Justas Dauparas
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Samer Halabiya
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Michelle DeWitt
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Lauren Carter
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- K N Houk
- Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA, USA
- David Baker
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
|
97
|
Rogers JR, Nikolényi G, AlQuraishi M. Growing ecosystem of deep learning methods for modeling protein-protein interactions. Protein Eng Des Sel 2023; 36:gzad023. [PMID: 38102755 DOI: 10.1093/protein/gzad023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Revised: 12/06/2023] [Accepted: 12/07/2023] [Indexed: 12/17/2023] Open
Abstract
Numerous cellular functions rely on protein-protein interactions. Efforts to comprehensively characterize them remain challenged, however, by the diversity of molecular recognition mechanisms employed within the proteome. Deep learning has emerged as a promising approach for tackling this problem by exploiting both experimental data and basic biophysical knowledge about protein interactions. Here, we review the growing ecosystem of deep learning methods for modeling protein interactions, highlighting the diversity of these biophysically informed models and their respective trade-offs. We discuss recent successes in using representation learning to capture complex features pertinent to predicting protein interactions and interaction sites, geometric deep learning to reason over protein structures and predict complex structures, and generative modeling to design de novo protein assemblies. We also outline some of the outstanding challenges and promising new directions. Opportunities abound to discover novel interactions, elucidate their physical mechanisms, and engineer binders to modulate their functions using deep learning and, ultimately, unravel how protein interactions orchestrate complex cellular behaviors.
Affiliation(s)
- Julia R Rogers
- Department of Systems Biology, Columbia University, New York, NY 10032, USA
- Gergő Nikolényi
- Department of Systems Biology, Columbia University, New York, NY 10032, USA
|
98
|
Minireview: Engineering evolution to reconfigure phenotypic traits in microbes for biotechnological applications. Comput Struct Biotechnol J 2022; 21:563-573. [PMID: 36659921 PMCID: PMC9816911 DOI: 10.1016/j.csbj.2022.12.042] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 10/22/2022] [Revised: 12/23/2022] [Accepted: 12/23/2022] [Indexed: 12/25/2022] Open
Abstract
Adaptive laboratory evolution (ALE) has long been used as the tool of choice for microbial engineering applications, ranging from the production of commodity chemicals to the innovation of complex phenotypes. With the advent of systems and synthetic biology, ALE experimental design has become increasingly sophisticated. For instance, the implementation of in silico metabolic model reconstruction and advanced synthetic biology tools has facilitated the effective coupling of desired traits to adaptive phenotypes. Furthermore, various multi-omic tools now enable in-depth analysis of cellular states, providing a comprehensive understanding of the biology of even the most genomically perturbed systems. Emerging machine learning approaches can help streamline the interpretation of massive and multiplexed datasets and advance our understanding of complexity in biology. This review covers some of the representative case studies among the 700 independent ALE studies reported to date, outlining key ideas, principles, and important mechanisms underlying ALE designs in bioproduction and synthetic cell engineering, with evidence from the literature to aid comprehension.
|
99
|
Callaway E. Scientists are using AI to dream up revolutionary new proteins. Nature 2022; 609:661-662. [DOI: 10.1038/d41586-022-02947-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
|