1
Meador K, Castells-Graells R, Aguirre R, Sawaya MR, Arbing MA, Sherman T, Senarathne C, Yeates TO. A suite of designed protein cages using machine learning and protein fragment-based protocols. Structure 2024; 32:751-765.e11. [PMID: 38513658 PMCID: PMC11162342 DOI: 10.1016/j.str.2024.02.017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2023] [Revised: 01/22/2024] [Accepted: 02/23/2024] [Indexed: 03/23/2024]
Abstract
Designed protein cages and related materials provide unique opportunities for applications in biotechnology and medicine, but their creation remains challenging. Here, we apply computational approaches to design a suite of tetrahedrally symmetric, self-assembling protein cages. For the generation of docked conformations, we emphasize a protein fragment-based approach, while for sequence design of the de novo interface, a comparison of knowledge-based and machine learning protocols highlights the power and increased experimental success achieved using ProteinMPNN. An analysis of design outcomes provides insights for improving interface design protocols, including prioritizing fragment-based motifs, balancing interface hydrophobicity and polarity, and identifying preferred polar contact patterns. In all, we report five structures for seven protein cages, along with two structures of intermediate assemblies, with the highest resolution reaching 2.0 Å using cryo-EM. This set of designed cages adds substantially to the body of available protein nanoparticles, and to methodologies for their creation.
Affiliation(s)
- Kyle Meador
  - Department of Chemistry and Biochemistry, University of California, Los Angeles, CA 90095, USA
- Roman Aguirre
  - Department of Chemistry and Biochemistry, University of California, Los Angeles, CA 90095, USA
- Michael R Sawaya
  - UCLA-DOE Institute for Genomics and Proteomics, Los Angeles, CA 90095, USA
- Mark A Arbing
  - UCLA-DOE Institute for Genomics and Proteomics, Los Angeles, CA 90095, USA
- Trent Sherman
  - Department of Chemistry and Biochemistry, University of California, Los Angeles, CA 90095, USA
- Chethaka Senarathne
  - Department of Chemistry and Biochemistry, University of California, Los Angeles, CA 90095, USA
- Todd O Yeates
  - Department of Chemistry and Biochemistry, University of California, Los Angeles, CA 90095, USA
  - UCLA-DOE Institute for Genomics and Proteomics, Los Angeles, CA 90095, USA
2
Su Z, Griffin B, Emmons S, Wu Y. Prediction of interactions between cell surface proteins by machine learning. Proteins 2024; 92:567-580. [PMID: 38050713 DOI: 10.1002/prot.26648] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2023] [Revised: 11/15/2023] [Accepted: 11/20/2023] [Indexed: 12/06/2023]
Abstract
Cells detect changes in their external environments or communicate with each other through proteins on their surfaces. These cell surface proteins form a complicated network of interactions in order to fulfill their functions. The interactions between cell surface proteins are highly dynamic and, thus, challenging to detect using traditional experimental techniques. Here, we tackle this challenge using a computational framework. The primary focus of the framework is to develop new tools to identify interactions between domains in the immunoglobulin (Ig) fold, which is the most abundant domain family in cell surface proteins. These interactions could be formed between ligands and receptors from different cells or between proteins on the same cell surface. In practice, we collected all structural data on Ig domain interactions and transformed them into an interface fragment pair library. A high-dimensional profile can then be constructed from the library for a given pair of query protein sequences. Multiple machine learning models were used to read this profile so that the probability of interaction between the query proteins could be predicted. We tested our models on an experimentally derived dataset that contains 564 cell surface proteins in humans. The cross-validation results show that we can achieve higher than 70% accuracy in identifying the PPIs within this dataset. We then applied this method to a group of 46 cell surface proteins in Caenorhabditis elegans. We screened every possible interaction between these proteins. Many interactions recognized by our machine learning classifiers have been experimentally confirmed in the literature. In conclusion, our computational platform serves as a useful tool to help identify potential new interactions between cell surface proteins in addition to current state-of-the-art experimental techniques. The tool is freely accessible for use by the scientific community. 
Moreover, the general framework of the machine learning classification can also be extended to study the interactions of proteins in other domain superfamilies.
Affiliation(s)
- Zhaoqian Su
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, New York, USA
- Brian Griffin
  - Department of Genetics, Albert Einstein College of Medicine, Bronx, New York, USA
- Scott Emmons
  - Department of Genetics, Albert Einstein College of Medicine, Bronx, New York, USA
- Yinghao Wu
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, New York, USA
3
Meador K, Castells-Graells R, Aguirre R, Sawaya MR, Arbing MA, Sherman T, Senarathne C, Yeates TO. A Suite of Designed Protein Cages Using Machine Learning Algorithms and Protein Fragment-Based Protocols. bioRxiv 2023:2023.10.09.561468. [PMID: 37873110 PMCID: PMC10592684 DOI: 10.1101/2023.10.09.561468] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2023]
Abstract
Designed protein cages and related materials provide unique opportunities for applications in biotechnology and medicine, while methods for their creation remain challenging and unpredictable. In the present study, we apply new computational approaches to design a suite of new tetrahedrally symmetric, self-assembling protein cages. For the generation of docked poses, we emphasize a protein fragment-based approach, while for de novo interface design, a comparison of computational protocols highlights the power and increased experimental success achieved using the machine learning program ProteinMPNN. In relating information from docking and design, we observe that agreement between fragment-based sequence preferences and ProteinMPNN sequence inference correlates with experimental success. Additional insights for designing polar interactions are highlighted by experimentally testing larger and more polar interfaces. In all, using X-ray crystallography and cryo-EM, we report five structures for seven protein cages, with atomic resolution in the best case reaching 2.0 Å. We also report structures of two incompletely assembled protein cages, providing unique insights into one type of assembly failure. The new set of designed cages and their structures add substantially to the body of available protein nanoparticles, and to methodologies for their creation.
Affiliation(s)
- Kyle Meador
  - Department of Chemistry and Biochemistry, University of California, Los Angeles, CA, USA 90095
- Roman Aguirre
  - Department of Chemistry and Biochemistry, University of California, Los Angeles, CA, USA 90095
- Michael R. Sawaya
  - UCLA-DOE Institute for Genomics and Proteomics, Los Angeles, CA, USA 90095
- Mark A. Arbing
  - UCLA-DOE Institute for Genomics and Proteomics, Los Angeles, CA, USA 90095
- Trent Sherman
  - Department of Chemistry and Biochemistry, University of California, Los Angeles, CA, USA 90095
- Chethaka Senarathne
  - Department of Chemistry and Biochemistry, University of California, Los Angeles, CA, USA 90095
- Todd O. Yeates
  - Department of Chemistry and Biochemistry, University of California, Los Angeles, CA, USA 90095
  - UCLA-DOE Institute for Genomics and Proteomics, Los Angeles, CA, USA 90095
4
Su Z, Griffin B, Emmons S, Wu Y. Prediction of Interactions between Cell Surface Proteins by Machine Learning. bioRxiv 2023:2023.09.12.557337. [PMID: 37745607 PMCID: PMC10515853 DOI: 10.1101/2023.09.12.557337] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/26/2023]
Abstract
Cells detect changes in their external environments or communicate with each other through proteins on their surfaces. These cell surface proteins form a complicated network of interactions in order to fulfill their functions. The interactions between cell surface proteins are highly dynamic and thus challenging to detect using traditional experimental techniques. Here we tackle this challenge using a computational framework. The primary focus of the framework is to develop new tools to identify interactions between domains in the immunoglobulin (Ig) fold, which is the most abundant domain family in cell surface proteins. These interactions could be formed between ligands and receptors from different cells, or between proteins on the same cell surface. In practice, we collected all structural data on Ig domain interactions and transformed them into an interface fragment pair library. A high-dimensional profile can then be constructed from the library for a given pair of query protein sequences. Multiple machine learning models were used to read this profile, so that the probability of interaction between the query proteins can be predicted. We tested our models on an experimentally derived dataset that contains 564 cell surface proteins in humans. The cross-validation results show that we can achieve higher than 70% accuracy in identifying the PPIs within this dataset. We then applied this method to a group of 46 cell surface proteins in C. elegans. We screened every possible interaction between these proteins. Many interactions recognized by our machine learning classifiers have been experimentally confirmed in the literature. In conclusion, our computational platform serves as a useful tool to help identify potential new interactions between cell surface proteins in addition to current state-of-the-art experimental techniques. The tool is freely accessible for use by the scientific community.
Moreover, the general framework of the machine learning classification can also be extended to study interactions of proteins in other domain superfamilies.
Affiliation(s)
- Zhaoqian Su
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY, 10461
- Brian Griffin
  - Department of Genetics, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY, 10461
- Scott Emmons
  - Department of Genetics, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY, 10461
- Yinghao Wu
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY, 10461
5
Sheffler W, Yang EC, Dowling Q, Hsia Y, Fries CN, Stanislaw J, Langowski MD, Brandys M, Li Z, Skotheim R, Borst AJ, Khmelinskaia A, King NP, Baker D. Fast and versatile sequence-independent protein docking for nanomaterials design using RPXDock. PLoS Comput Biol 2023; 19:e1010680. [PMID: 37216343 PMCID: PMC10237659 DOI: 10.1371/journal.pcbi.1010680] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2022] [Revised: 06/02/2023] [Accepted: 04/09/2023] [Indexed: 05/24/2023]
Abstract
Computationally designed multi-subunit assemblies have shown considerable promise for a variety of applications, including a new generation of potent vaccines. One of the major routes to such materials is rigid body sequence-independent docking of cyclic oligomers into architectures with point group or lattice symmetries. Current methods for docking and designing such assemblies are tailored to specific classes of symmetry and are difficult to modify for novel applications. Here we describe RPXDock, a fast, flexible, and modular software package for sequence-independent rigid-body protein docking across a wide range of symmetric architectures that is easily customizable for further development. RPXDock uses an efficient hierarchical search and a residue-pair transform (RPX) scoring method to rapidly search through multidimensional docking space. We describe the structure of the software, provide practical guidelines for its use, and describe the available functionalities including a variety of score functions and filtering tools that can be used to guide and refine docking results towards desired configurations.
Affiliation(s)
- William Sheffler
  - Institute for Protein Design, University of Washington, Seattle, Washington, United States of America
- Erin C. Yang
  - Institute for Protein Design, University of Washington, Seattle, Washington, United States of America
  - Department of Biochemistry, University of Washington, Seattle, Washington, United States of America
  - Graduate Program in Biological Physics, Structure & Design, University of Washington, Seattle, Washington, United States of America
- Quinton Dowling
  - Institute for Protein Design, University of Washington, Seattle, Washington, United States of America
  - Department of Bioengineering, University of Washington, Seattle, Washington, United States of America
- Yang Hsia
  - Institute for Protein Design, University of Washington, Seattle, Washington, United States of America
- Chelsea N. Fries
  - Institute for Protein Design, University of Washington, Seattle, Washington, United States of America
- Jenna Stanislaw
  - Institute for Protein Design, University of Washington, Seattle, Washington, United States of America
  - Department of Biochemistry, University of Washington, Seattle, Washington, United States of America
  - Transdisciplinary Research Area “Building Blocks of Matter and Fundamental Interactions (TRA Matter)”, University of Bonn, Bonn, Germany
  - Life and Medical Sciences Institute, University of Bonn, Bonn, Germany
- Mark D. Langowski
  - Institute for Protein Design, University of Washington, Seattle, Washington, United States of America
  - Graduate Program in Molecular and Cellular Biology, University of Washington, Seattle, Washington, United States of America
- Marisa Brandys
  - Institute for Protein Design, University of Washington, Seattle, Washington, United States of America
  - Department of Biochemistry, University of Washington, Seattle, Washington, United States of America
- Zhe Li
  - Institute for Protein Design, University of Washington, Seattle, Washington, United States of America
- Rebecca Skotheim
  - Institute for Protein Design, University of Washington, Seattle, Washington, United States of America
- Andrew J. Borst
  - Institute for Protein Design, University of Washington, Seattle, Washington, United States of America
  - Department of Biochemistry, University of Washington, Seattle, Washington, United States of America
- Alena Khmelinskaia
  - Institute for Protein Design, University of Washington, Seattle, Washington, United States of America
  - Department of Biochemistry, University of Washington, Seattle, Washington, United States of America
  - Transdisciplinary Research Area “Building Blocks of Matter and Fundamental Interactions (TRA Matter)”, University of Bonn, Bonn, Germany
  - Life and Medical Sciences Institute, University of Bonn, Bonn, Germany
- Neil P. King
  - Institute for Protein Design, University of Washington, Seattle, Washington, United States of America
  - Department of Biochemistry, University of Washington, Seattle, Washington, United States of America
- David Baker
  - Institute for Protein Design, University of Washington, Seattle, Washington, United States of America
  - Department of Biochemistry, University of Washington, Seattle, Washington, United States of America
  - Howard Hughes Medical Institute, University of Washington, Seattle, Washington, United States of America
6
Kikuchi K, Date K, Ueno T. Design of a Hierarchical Assembly at a Solid-Liquid Interface Using an Asymmetric Protein Needle. Langmuir 2023; 39:2389-2397. [PMID: 36734675 DOI: 10.1021/acs.langmuir.2c03146] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/18/2023]
Abstract
Design and control of processes for a hierarchical assembly of proteins remain challenging because they require consideration of design principles with atomic-level accuracy. Previous studies have adopted symmetry-based strategies to minimize the complexity of protein-protein interactions and this has placed constraints on the structures of the resulting protein assemblies. In the present work, we used an anisotropic-shaped protein needle, gene product 5 (gp5) from bacteriophage T4 with a C-terminal hexahistidine-tag (His-tag) (gp5_CHis), to construct a hierarchical assembly with two distinct protein-protein interaction sites. High-speed atomic force microscopy (HS-AFM) measurements reveal that it forms unique tetrameric clusters through its N-terminal head on a mica surface. The clusters further self-assemble into a monolayer through the C-terminal His-tag. The HS-AFM images and displacement analyses show that the monolayer is a network-like structure rather than a crystalline lattice. Our results expand the toolbox for constructing hierarchical protein assemblies based on structural anisotropy.
Affiliation(s)
- Kosuke Kikuchi
  - School of Life Science and Technology, Tokyo Institute of Technology, 4259-B-55, Nagatsuta-cho, Midori-ku, Yokohama 226-8501, Japan
- Koki Date
  - School of Life Science and Technology, Tokyo Institute of Technology, 4259-B-55, Nagatsuta-cho, Midori-ku, Yokohama 226-8501, Japan
- Takafumi Ueno
  - School of Life Science and Technology, Tokyo Institute of Technology, 4259-B-55, Nagatsuta-cho, Midori-ku, Yokohama 226-8501, Japan
  - Living Systems Materialogy (LiSM) Research Group, International Research Frontiers Initiative (IRFI), Tokyo Institute of Technology, Nagatsuta-cho 4259, Midori-ku, Yokohama 226-8501, Japan
7
Olshefsky A, Richardson C, Pun SH, King NP. Engineering Self-Assembling Protein Nanoparticles for Therapeutic Delivery. Bioconjug Chem 2022; 33:2018-2034. [PMID: 35487503 PMCID: PMC9673152 DOI: 10.1021/acs.bioconjchem.2c00030] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Indexed: 01/11/2023]
Abstract
Despite remarkable advances over the past several decades, many therapeutic nanomaterials fail to overcome major in vivo delivery barriers. Controlling immunogenicity, optimizing biodistribution, and engineering environmental responsiveness are key outstanding delivery problems for most nanotherapeutics. However, notable exceptions exist including some lipid and polymeric nanoparticles, some virus-based nanoparticles, and nanoparticle vaccines where immunogenicity is desired. Self-assembling protein nanoparticles offer a powerful blend of modularity and precise designability to the field, and have the potential to solve many of the major barriers to delivery. In this review, we provide a brief overview of key designable features of protein nanoparticles and their implications for therapeutic delivery applications. We anticipate that protein nanoparticles will rapidly grow in their prevalence and impact as clinically relevant delivery platforms.
Affiliation(s)
- Audrey Olshefsky
  - Department of Bioengineering, University of Washington, Seattle, Washington 98195, United States
  - Institute for Protein Design, University of Washington, Seattle, Washington 98195, United States
- Christian Richardson
  - Department of Bioengineering, University of Washington, Seattle, Washington 98195, United States
  - Institute for Protein Design, University of Washington, Seattle, Washington 98195, United States
- Suzie H. Pun
  - Department of Bioengineering, University of Washington, Seattle, Washington 98195, United States
  - Molecular Engineering and Sciences Institute, University of Washington, Seattle, Washington 98195, United States
- Neil P. King
  - Institute for Protein Design, University of Washington, Seattle, Washington 98195, United States
  - Department of Biochemistry, University of Washington, Seattle, Washington 98195, United States
8
Sen N, Madhusudhan MS. A structural database of chain–chain and domain–domain interfaces of proteins. Protein Sci 2022. [DOI: 10.1002/pro.4406] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]
Affiliation(s)
- Neeladri Sen
  - Indian Institute of Science Education and Research, Pune, India
  - Institute of Structural and Molecular Biology, University College London, London, UK
9
Precision materials: Computational design methods of accurate protein materials. Curr Opin Struct Biol 2022; 74:102367. [DOI: 10.1016/j.sbi.2022.102367] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2021] [Revised: 02/22/2022] [Accepted: 02/28/2022] [Indexed: 11/23/2022]