1
He J, Wu W, Wang X. DIProT: A deep learning based interactive toolkit for efficient and effective Protein design. Synth Syst Biotechnol 2024; 9:217-222. [PMID: 38385151 PMCID: PMC10876589 DOI: 10.1016/j.synbio.2024.01.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 01/02/2024] [Accepted: 01/30/2024] [Indexed: 02/23/2024] Open
Abstract
The protein inverse folding problem, designing amino acid sequences that fold into desired protein structures, is a critical challenge in biological sciences. Despite numerous data-driven and knowledge-driven methods, there remains a need for a user-friendly toolkit that effectively integrates these approaches for in-silico protein design. In this paper, we present DIProT, an interactive protein design toolkit. DIProT leverages a non-autoregressive deep generative model to solve the inverse folding problem, combined with a protein structure prediction model. This integration allows users to incorporate prior knowledge into the design process, evaluate designs in silico, and form a virtual design loop with human feedback. Our inverse folding model demonstrates competitive performance in terms of effectiveness and efficiency on TS50 and CATH4.2 datasets, with promising sequence recovery and inference time. Case studies further illustrate how DIProT can facilitate user-guided protein design.
Affiliation(s)
- Xiaowo Wang
- Ministry of Education Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, Bioinformatics Division, Beijing National Research Center for Information Science and Technology, Department of Automation, Tsinghua University, Beijing, China
2
Chu AE, Lu T, Huang PS. Sparks of function by de novo protein design. Nat Biotechnol 2024; 42:203-215. [PMID: 38361073 DOI: 10.1038/s41587-024-02133-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/05/2023] [Accepted: 01/09/2024] [Indexed: 02/17/2024]
Abstract
Information in proteins flows from sequence to structure to function, with each step causally driven by the preceding one. Protein design is founded on inverting this process: specify a desired function, design a structure executing this function, and find a sequence that folds into this structure. This 'central dogma' underlies nearly all de novo protein-design efforts. Our ability to accomplish these tasks depends on our understanding of protein folding and function and our ability to capture this understanding in computational methods. In recent years, deep learning-derived approaches for efficient and accurate structure modeling and enrichment of successful designs have enabled progression beyond the design of protein structures and towards the design of functional proteins. We examine these advances in the broader context of classical de novo protein design and consider implications for future challenges to come, including fundamental capabilities such as sequence and structure co-design and conformational control considering flexibility, and functional objectives such as antibody and enzyme design.
Affiliation(s)
- Alexander E Chu
- Biophysics Program, Stanford University, Palo Alto, CA, USA
- Department of Bioengineering, Stanford University, Palo Alto, CA, USA
- Google DeepMind, London, UK
- Tianyu Lu
- Department of Bioengineering, Stanford University, Palo Alto, CA, USA
- Po-Ssu Huang
- Biophysics Program, Stanford University, Palo Alto, CA, USA.
- Department of Bioengineering, Stanford University, Palo Alto, CA, USA.
3
Nguyen H, Nguyen HL, Lan PD, Thai NQ, Sikora M, Li MS. Interaction of SARS-CoV-2 with host cells and antibodies: experiment and simulation. Chem Soc Rev 2023; 52:6497-6553. [PMID: 37650302 DOI: 10.1039/d1cs01170g] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/01/2023]
Abstract
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the causative agent of the devastating global COVID-19 pandemic announced by WHO in March 2020. Through unprecedented scientific effort, several vaccines, drugs and antibodies have been developed, saving millions of lives, but the fight against COVID-19 continues as immune escape variants of concern such as Delta and Omicron emerge. To develop more effective treatments and to elucidate the side effects caused by vaccines and therapeutic agents, a deeper understanding of the molecular interactions of SARS-CoV-2 with them and human cells is required. With special interest in computational approaches, we will focus on the structure of SARS-CoV-2 and the interaction of its spike protein with human angiotensin-converting enzyme-2 (ACE2) as a prime entry point of the virus into host cells. In addition, other possible viral receptors will be considered. The fusion of viral and human membranes and the interaction of the spike protein with antibodies and nanobodies will be discussed, as well as the effect of SARS-CoV-2 on protein synthesis in host cells.
Affiliation(s)
- Hung Nguyen
- Institute of Physics, Polish Academy of Sciences, al. Lotnikow 32/46, 02-668 Warsaw, Poland.
- Hoang Linh Nguyen
- Institute of Fundamental and Applied Sciences, Duy Tan University, Ho Chi Minh City 700000, Vietnam
- Faculty of Environmental and Natural Sciences, Duy Tan University, Da Nang 550000, Vietnam
- Pham Dang Lan
- Life Science Lab, Institute for Computational Science and Technology, Quang Trung Software City, Tan Chanh Hiep Ward, District 12, 729110 Ho Chi Minh City, Vietnam
- Faculty of Physics and Engineering Physics, VNUHCM-University of Science, 227, Nguyen Van Cu Street, District 5, 749000 Ho Chi Minh City, Vietnam
- Nguyen Quoc Thai
- Dong Thap University, 783 Pham Huu Lau Street, Ward 6, Cao Lanh City, Dong Thap, Vietnam
- Mateusz Sikora
- Malopolska Centre of Biotechnology, Jagiellonian University, Kraków, Poland
- Department of Theoretical Biophysics, Max Planck Institute of Biophysics, Frankfurt am Main, Germany
- Mai Suan Li
- Institute of Physics, Polish Academy of Sciences, al. Lotnikow 32/46, 02-668 Warsaw, Poland.
4
Li L, Li J, Ou Y, Wu J, Li H, Wang X, Tang L, Dai X, Yang C, Wei Z, Yin Z, Shu Y. Ccdc57 is required for straightening the body axis by regulating ciliary motility in the brain ventricle of zebrafish. J Genet Genomics 2023; 50:253-263. [PMID: 36669737 DOI: 10.1016/j.jgg.2022.12.007] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/19/2022] [Revised: 12/22/2022] [Accepted: 12/31/2022] [Indexed: 01/19/2023]
Abstract
Recently, cilia defects have been proposed to contribute to scoliosis. Here, we demonstrate that coiled-coil domain-containing 57 (Ccdc57) plays an essential role in straightening the body axis of zebrafish by regulating ciliary beating in the brain ventricle (BV). Zygotic ccdc57 (Zccdc57) mutant zebrafish develop scoliosis without significant changes in their bone density and calcification, and the maternal-zygotic ccdc57 (MZccdc57) mutant embryos display curved bodies since the long-pec stage. The expression of ccdc57 is enriched in ciliated tissues and immunofluorescence analysis reveals colocalization of Ccdc57-HA with acetylated α-tubulin, implicating it in having a role in ciliary function. Further examination reveals that it is the coordinated cilia beating of multiple cilia bundles (MCB) in the MZccdc57 mutant embryos that is affected at 48 hours post fertilization, when the compromised cerebrospinal fluid flow and curved body axis have already occurred. Either ccdc57 mRNA injection or epinephrine treatment reverses the spinal curvature in MZccdc57 mutant larvae from ventrally curly to straight or even dorsally curly and significantly upregulates urotensin signaling. This study reveals the role of ccdc57 in maintaining coordinated cilia beating of MCB in the BV.
Affiliation(s)
- Lu Li
- State Key Laboratory of Developmental Biology of Freshwater Fish, College of Life Sciences, Hunan Normal University, Changsha, Hunan 410081, China; College of Life Sciences, Hunan Normal University, Changsha, Hunan 410081, China
- Juan Li
- State Key Laboratory of Developmental Biology of Freshwater Fish, College of Life Sciences, Hunan Normal University, Changsha, Hunan 410081, China; College of Life Sciences, Hunan Normal University, Changsha, Hunan 410081, China
- Yuan Ou
- State Key Laboratory of Developmental Biology of Freshwater Fish, College of Life Sciences, Hunan Normal University, Changsha, Hunan 410081, China; College of Life Sciences, Hunan Normal University, Changsha, Hunan 410081, China
- Jiaxin Wu
- School of Life Sciences, East China Normal University, Shanghai 200241, China
- Huilin Li
- State Key Laboratory of Developmental Biology of Freshwater Fish, College of Life Sciences, Hunan Normal University, Changsha, Hunan 410081, China; College of Life Sciences, Hunan Normal University, Changsha, Hunan 410081, China
- Xin Wang
- State Key Laboratory of Developmental Biology of Freshwater Fish, College of Life Sciences, Hunan Normal University, Changsha, Hunan 410081, China; College of Life Sciences, Hunan Normal University, Changsha, Hunan 410081, China
- Liying Tang
- College of Life Sciences, Hunan Normal University, Changsha, Hunan 410081, China
- Xiangyan Dai
- Key Laboratory of Freshwater Fish Reproduction and Development, Ministry of Education, Southwest University, Chongqing 400715, China
- Conghui Yang
- State Key Laboratory of Developmental Biology of Freshwater Fish, College of Life Sciences, Hunan Normal University, Changsha, Hunan 410081, China; College of Life Sciences, Hunan Normal University, Changsha, Hunan 410081, China
- Zehong Wei
- State Key Laboratory of Developmental Biology of Freshwater Fish, College of Life Sciences, Hunan Normal University, Changsha, Hunan 410081, China; College of Life Sciences, Hunan Normal University, Changsha, Hunan 410081, China
- Zhan Yin
- State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, Hubei 430072, China
- Yuqin Shu
- State Key Laboratory of Developmental Biology of Freshwater Fish, College of Life Sciences, Hunan Normal University, Changsha, Hunan 410081, China; College of Life Sciences, Hunan Normal University, Changsha, Hunan 410081, China.
5
Woolfson DN. Understanding a protein fold: the physics, chemistry, and biology of α-helical coiled coils. J Biol Chem 2023; 299:104579. [PMID: 36871758 PMCID: PMC10124910 DOI: 10.1016/j.jbc.2023.104579] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 01/03/2023] [Revised: 02/25/2023] [Accepted: 02/27/2023] [Indexed: 03/07/2023] Open
Abstract
Protein science is being transformed by powerful computational methods for structure prediction and design: AlphaFold2 can predict many natural protein structures from sequence, and other AI methods are enabling the de novo design of new structures. This raises a question: how much do we understand the underlying sequence-to-structure/function relationships being captured by these methods? This perspective presents our current understanding of one class of protein assembly, the α-helical coiled coils. At first sight, these are straightforward: sequence repeats of hydrophobic (h) and polar (p) residues, (hpphppp)n, direct the folding and assembly of amphipathic α helices into bundles. However, many different bundles are possible: they can have two or more helices (different oligomers); the helices can have parallel, antiparallel or mixed arrangements (different topologies); and the helical sequences can be the same (homomers) or different (heteromers). Thus, sequence-to-structure relationships must be present within the hpphppp repeats to distinguish these states. I discuss the current understanding of this problem at three levels: First, physics gives a parametric framework to generate the many possible coiled-coil backbone structures. Second, chemistry provides a means to explore and deliver sequence-to-structure relationships. Third, biology shows how coiled coils are adapted and functionalized in nature, inspiring applications of coiled coils in synthetic biology. I argue that the chemistry is largely understood; the physics is partly solved, though the considerable challenge of predicting even relative stabilities of different coiled-coil states remains; but there is much more to explore in the biology and synthetic biology of coiled coils.
Affiliation(s)
- Derek N Woolfson
- School of Chemistry, University of Bristol, Bristol, United Kingdom; School of Biochemistry, University of Bristol, Medical Sciences Building, University Walk, Bristol, United Kingdom; BrisEngBio, School of Chemistry, University of Bristol, Bristol, United Kingdom; Max Planck-Bristol Centre for Minimal Biology, University of Bristol, Bristol, United Kingdom.
6
Yang KK, Zanichelli N, Yeh H. Masked inverse folding with sequence transfer for protein representation learning. Protein Eng Des Sel 2023; 36:gzad015. [PMID: 37883472 DOI: 10.1093/protein/gzad015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Revised: 10/10/2023] [Accepted: 10/11/2023] [Indexed: 10/28/2023] Open
Abstract
Self-supervised pretraining on protein sequences has led to state-of-the-art performance on protein function and fitness prediction. However, sequence-only methods ignore the rich information contained in experimental and predicted protein structures. Meanwhile, inverse folding methods reconstruct a protein's amino-acid sequence given its structure, but do not take advantage of sequences that do not have known structures. In this study, we train a masked inverse folding protein masked language model parameterized as a structured graph neural network. During pretraining, this model learns to reconstruct corrupted sequences conditioned on the backbone structure. We then show that using the outputs from a pretrained sequence-only protein masked language model as input to the inverse folding model further improves pretraining perplexity. We evaluate both of these models on downstream protein engineering tasks and analyze the effect of using information from experimental or predicted structures on performance.
Affiliation(s)
- Kevin K Yang
- Microsoft Research, 1 Memorial Drive, Cambridge, MA, USA
- Hugh Yeh
- Pritzker School of Medicine, University of Chicago, 924 E 57th Street, Chicago, IL, USA
7
Lu H, Cheng Z, Hu Y, Tang LV. What Can De Novo Protein Design Bring to the Treatment of Hematological Disorders? BIOLOGY 2023; 12:biology12020166. [PMID: 36829445 PMCID: PMC9952452 DOI: 10.3390/biology12020166] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2022] [Revised: 01/17/2023] [Accepted: 01/18/2023] [Indexed: 01/22/2023]
Abstract
Protein therapeutics have been widely used to treat hematological disorders. With the advent of de novo protein design, protein therapeutics are not limited to ameliorating natural proteins but also produce novel protein sequences, folds, and functions with shapes and functions customized to bind to the therapeutic targets. De novo protein techniques have been widely used biomedically to design novel diagnostic and therapeutic drugs, novel vaccines, and novel biological materials. In addition, de novo protein design has provided new options for treating hematological disorders. Scientists have designed protein switches called Colocalization-dependent Latching Orthogonal Cage-Key pRoteins (Co-LOCKR) that perform computations on the surface of cells. De novo designed molecules exhibit a better capacity than the currently available tyrosine kinase inhibitors in chronic myeloid leukemia therapy. De novo designed protein neoleukin-2/15 enhances chimeric antigen receptor T-cell activity. This new technique has great biomedical potential, especially in exploring new treatment methods for hematological disorders. This review discusses the development of de novo protein design and its biological applications, with emphasis on the treatment of hematological disorders.
8
Mizutani Y, Mizuno M. Time-resolved spectroscopic mapping of vibrational energy flow in proteins: Understanding thermal diffusion at the nanoscale. J Chem Phys 2022; 157:240901. [PMID: 36586981 DOI: 10.1063/5.0116734] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/04/2023] Open
Abstract
Vibrational energy exchange between various degrees of freedom is critical to barrier-crossing processes in proteins. Hemeproteins are well suited for studying vibrational energy exchange in proteins because the heme group is an efficient photothermal converter. The energy released by heme following photoexcitation shows migration in a protein moiety on a picosecond timescale, which is observed using time-resolved ultraviolet resonance Raman spectroscopy. The anti-Stokes ultraviolet resonance Raman intensity of a tryptophan residue is an excellent probe for the vibrational energy in proteins, allowing the mapping of energy flow with the spatial resolution of a single amino acid residue. This Perspective provides an overview of studies on vibrational energy flow in proteins, including future perspectives for both methodologies and applications.
Affiliation(s)
- Yasuhisa Mizutani
- Department of Chemistry, Graduate School of Science, Osaka University, 1-1 Machikaneyama, Toyonaka, Osaka 560-0043, Japan
- Misao Mizuno
- Department of Chemistry, Graduate School of Science, Osaka University, 1-1 Machikaneyama, Toyonaka, Osaka 560-0043, Japan
9
Wicky BIM, Milles LF, Courbet A, Ragotte RJ, Dauparas J, Kinfu E, Tipps S, Kibler RD, Baek M, DiMaio F, Li X, Carter L, Kang A, Nguyen H, Bera AK, Baker D. Hallucinating symmetric protein assemblies. Science 2022; 378:56-61. [PMID: 36108048 PMCID: PMC9724707 DOI: 10.1126/science.add1964] [Citation(s) in RCA: 60] [Impact Index Per Article: 30.0] [Indexed: 01/23/2023]
Abstract
Deep learning generative approaches provide an opportunity to broadly explore protein structure space beyond the sequences and structures of natural proteins. Here, we use deep network hallucination to generate a wide range of symmetric protein homo-oligomers given only a specification of the number of protomers and the protomer length. Crystal structures of seven designs are very similar to the computational models (median root mean square deviation: 0.6 angstroms), as are three cryo-electron microscopy structures of giant 10-nanometer rings with up to 1550 residues and C33 symmetry; all differ considerably from previously solved structures. Our results highlight the rich diversity of new protein structures that can be generated using deep learning and pave the way for the design of increasingly complex components for nanomachines and biomaterials.
Affiliation(s)
- B. I. M. Wicky
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- L. F. Milles
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- A. Courbet
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
- R. J. Ragotte
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- J. Dauparas
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- E. Kinfu
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- S. Tipps
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- R. D. Kibler
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- M. Baek
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- F. DiMaio
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- X. Li
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- L. Carter
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- A. Kang
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- H. Nguyen
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- A. K. Bera
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- D. Baker
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
10
Zhou C, Lu P. De novo design of membrane transport proteins. Proteins 2022; 90:1800-1806. [DOI: 10.1002/prot.26336] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/29/2021] [Revised: 03/07/2022] [Accepted: 03/12/2022] [Indexed: 12/22/2022]
Affiliation(s)
- Chen Zhou
- Westlake Laboratory of Life Sciences and Biomedicine Hangzhou Zhejiang China
- Key Laboratory of Structural Biology of Zhejiang Province School of Life Sciences, Westlake University Hangzhou Zhejiang China
- Institute of Biology Westlake Institute for Advanced Study Hangzhou Zhejiang China
- Peilong Lu
- Westlake Laboratory of Life Sciences and Biomedicine Hangzhou Zhejiang China
- Key Laboratory of Structural Biology of Zhejiang Province School of Life Sciences, Westlake University Hangzhou Zhejiang China
- Institute of Biology Westlake Institute for Advanced Study Hangzhou Zhejiang China
11
Qing R, Hao S, Smorodina E, Jin D, Zalevsky A, Zhang S. Protein Design: From the Aspect of Water Solubility and Stability. Chem Rev 2022; 122:14085-14179. [PMID: 35921495 PMCID: PMC9523718 DOI: 10.1021/acs.chemrev.1c00757] [Citation(s) in RCA: 28] [Impact Index Per Article: 14.0] [Received: 08/31/2021] [Indexed: 12/13/2022]
Abstract
Water solubility and structural stability are key merits for proteins defined by the primary sequence and 3D-conformation. Their manipulation represents important aspects of the protein design field that relies on the accurate placement of amino acids and molecular interactions, guided by underlying physiochemical principles. Emulated designer proteins with well-defined properties both fuel the knowledge-base for more precise computational design models and are used in various biomedical and nanotechnological applications. The continuous developments in protein science, increasing computing power, new algorithms, and characterization techniques provide sophisticated toolkits for solubility design beyond guess work. In this review, we summarize recent advances in the protein design field with respect to water solubility and structural stability. After introducing fundamental design rules, we discuss the transmembrane protein solubilization and de novo transmembrane protein design. Traditional strategies to enhance protein solubility and structural stability are introduced. The designs of stable protein complexes and high-order assemblies are covered. Computational methodologies behind these endeavors, including structure prediction programs, machine learning algorithms, and specialty software dedicated to the evaluation of protein solubility and aggregation, are discussed. The findings and opportunities for Cryo-EM are presented. This review provides an overview of significant progress and prospects in accurate protein design for solubility and stability.
Affiliation(s)
- Rui Qing
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Media Lab, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
- The David H. Koch Institute for Integrative Cancer Research, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
- Shilei Hao
- Media Lab, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
- Key Laboratory of Biorheological Science and Technology, Ministry of Education, College of Bioengineering, Chongqing University, Chongqing 400030, China
- Eva Smorodina
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo 0424, Norway
- David Jin
- Avalon GloboCare Corp., Freehold, New Jersey 07728, United States
- Arthur Zalevsky
- Laboratory of Bioinformatics Approaches in Combinatorial Chemistry and Biology, Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry RAS, Moscow 117997, Russia
- Shuguang Zhang
- Media Lab, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
12
Biswas G, Ghosh S, Basu S, Bhattacharyya D, Datta AK, Banerjee R. Can the jigsaw puzzle model of protein folding re-assemble a hydrophobic core? Proteins 2022; 90:1390-1412. [DOI: 10.1002/prot.26321] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/30/2021] [Revised: 01/11/2022] [Accepted: 01/28/2022] [Indexed: 12/30/2022]
Affiliation(s)
- Gargi Biswas
- Saha Institute of Nuclear Physics Kolkata India
- Homi Bhabha National Institute Mumbai India
- Sankar Basu
- Saha Institute of Nuclear Physics Kolkata India
- Rahul Banerjee
- Saha Institute of Nuclear Physics Kolkata India
- Homi Bhabha National Institute Mumbai India
13
Lam NT, McCluskey JB, Glover DJ. Harnessing the Structural and Functional Diversity of Protein Filaments as Biomaterial Scaffolds. ACS APPLIED BIO MATERIALS 2022; 5:4668-4686. [PMID: 35766918 DOI: 10.1021/acsabm.2c00275] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
The natural ability of many proteins to polymerize into highly structured filaments has been harnessed as scaffolds to align functional molecules in a diverse range of biomaterials. Protein-engineering methodologies also enable the structural and physical properties of filaments to be tailored for specific biomaterial applications through genetic engineering or filaments built from the ground up using advances in the computational prediction of protein folding and assembly. Using these approaches, protein filament-based biomaterials have been engineered to accelerate enzymatic catalysis, provide routes for the biomineralization of inorganic materials, facilitate energy production and transfer, and provide support for mammalian cells for tissue engineering. In this review, we describe how the unique structural and functional diversity in natural and computationally designed protein filaments can be harnessed in biomaterials. In addition, we detail applications of these protein assemblies as material scaffolds with a particular emphasis on applications that exploit unique properties of specific filaments. Through the diversity of protein filaments, the biomaterial engineer's toolbox contains many modular protein filaments that will likely be incorporated as the main structural component of future biomaterials.
Affiliation(s)
- Nga T Lam
- School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, New South Wales 2052, Australia
- Joshua B McCluskey
- School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, New South Wales 2052, Australia
- Dominic J Glover
- School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, New South Wales 2052, Australia
14
Srinivasan S, Vanni S. Computational Approaches to Investigate and Design Lipid-binding Domains for Membrane Biosensing. Chimia (Aarau) 2021; 75:1031-1036. [PMID: 34920773 DOI: 10.2533/chimia.2021.1031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022] Open
Abstract
Association of proteins with cellular membranes is critical for signaling and membrane trafficking processes. Many peripheral lipid-binding domains have been identified in the last few decades and have been investigated for their specific lipid-sensing properties using traditional in vivo and in vitro studies. However, several knowledge gaps remain owing to intrinsic limitations of these methodologies. Thus, novel approaches are necessary to further our understanding in lipid-protein biology. This review briefly discusses lipid-binding domains that act as specific lipid biosensors and provides a broad perspective on the computational approaches such as molecular dynamics (MD) simulations and machine learning (ML)-based techniques that can be used to study protein-membrane interactions. We also highlight the need for de novo design of proteins that elicit specific lipid-binding properties.
Affiliation(s)
- Stefano Vanni
- Department of Biology, University of Fribourg, Switzerland
15
Liang S, Li Z, Zhan J, Zhou Y. De novo protein design by an energy function based on series expansion in distance and orientation dependence. Bioinformatics 2021; 38:86-93. [PMID: 34406339 DOI: 10.1093/bioinformatics/btab598] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 06/05/2021] [Revised: 08/11/2021] [Accepted: 08/16/2021] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION: Despite many successes, de novo protein design is not yet a solved problem as its success rate remains low. The low success rate is largely because we do not yet have an accurate energy function for describing the solvent-mediated interaction between amino acid residues in a protein chain. Previous studies showed that an energy function based on series expansions with its parameters optimized for side-chain and loop conformations can lead to one of the most accurate methods for side chain (OSCAR) and loop prediction (LEAP). Following the same strategy, we developed an energy function based on series expansions with the parameters optimized in four separate stages (recovering single-residue types without and with orientation dependence, selecting loop decoys and maintaining the composition of amino acids). We tested the energy function for de novo design by using Monte Carlo simulated annealing. RESULTS: The method for protein design (OSCAR-Design) is found to be as accurate as OSCAR and LEAP for side-chain and loop prediction, respectively. In de novo design, it can recover native residue types ranging from 38% to 43% depending on test sets, conserve hydrophobic/hydrophilic residues at ∼75%, and yield the overall similarity in amino acid compositions at more than 90%. These performance measures are all statistically significantly better than several protein design programs compared. Moreover, the largest hydrophobic patch areas in designed proteins are near or smaller than those in native proteins. Thus, an energy function based on series expansion can be made useful for protein design. AVAILABILITY AND IMPLEMENTATION: The Linux executable version is freely available for academic users at http://zhouyq-lab.szbl.ac.cn/resources/.
Affiliation(s)
- Shide Liang
- Department of R & D, Bio-Thera Solutions, Guangzhou 510530, China
- Zhixiu Li
- Institute of Health and Biomedical Innovation, Queensland University of Technology at Translational Research Institute, Woolloongabba, QLD 3001, Australia
- Jian Zhan
- Institute for Glycomics and School of Information and Communication Technology, Griffith University, Gold Coast Campus, Southport, QLD 4222, Australia; Institute for Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518055, China
- Yaoqi Zhou
- Institute for Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518055, China; Peking University Shenzhen Graduate School, Shenzhen 518055, China
|
16
|
Zhu J, Avakyan N, Kakkis AA, Hoffnagle AM, Han K, Li Y, Zhang Z, Choi TS, Na Y, Yu CJ, Tezcan FA. Protein Assembly by Design. Chem Rev 2021; 121:13701-13796. [PMID: 34405992 PMCID: PMC9148388 DOI: 10.1021/acs.chemrev.1c00308] [Citation(s) in RCA: 89] [Impact Index Per Article: 29.7] [Indexed: 12/11/2022]
Abstract
Proteins are nature's primary building blocks for the construction of sophisticated molecular machines and dynamic materials, ranging from protein complexes such as photosystem II and nitrogenase that drive biogeochemical cycles to cytoskeletal assemblies and muscle fibers for motion. Such natural systems have inspired extensive efforts in the rational design of artificial protein assemblies in the last two decades. As molecular building blocks, proteins are highly complex, in terms of both their three-dimensional structures and chemical compositions. To enable control over the self-assembly of such complex molecules, scientists have devised many creative strategies by combining tools and principles of experimental and computational biophysics, supramolecular chemistry, inorganic chemistry, materials science, and polymer chemistry, among others. Owing to these innovative strategies, what started as a purely structure-building exercise two decades ago has, in short order, led to artificial protein assemblies with unprecedented structures and functions and protein-based materials with unusual properties. Our goal in this review is to give an overview of this exciting and highly interdisciplinary area of research, first outlining the design strategies and tools that have been devised for controlling protein self-assembly, then describing the diverse structures of artificial protein assemblies, and finally highlighting the emergent properties and functions of these assemblies.
Affiliation(s)
- Albert A. Kakkis
- Department of Chemistry and Biochemistry, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093-0340, United States
- Alexander M. Hoffnagle
- Department of Chemistry and Biochemistry, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093-0340, United States
- Kenneth Han
- Department of Chemistry and Biochemistry, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093-0340, United States
- Yiying Li
- Department of Chemistry and Biochemistry, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093-0340, United States
- Zhiyin Zhang
- Department of Chemistry and Biochemistry, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093-0340, United States
- Tae Su Choi
- Department of Chemistry and Biochemistry, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093-0340, United States
- Youjeong Na
- Department of Chemistry and Biochemistry, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093-0340, United States
- Chung-Jui Yu
- Department of Chemistry and Biochemistry, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093-0340, United States
- F. Akif Tezcan
- Department of Chemistry and Biochemistry, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093-0340, United States
|
17
|
Parihar PS, Singh A, Karade SS, Sahasrabuddhe AA, Pratap JV. Structural insights into kinetoplastid coronin oligomerization domain and F-actin interaction. Curr Res Struct Biol 2021; 3:268-276. [PMID: 34746809 PMCID: PMC8554105 DOI: 10.1016/j.crstbi.2021.10.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/19/2021] [Revised: 09/18/2021] [Accepted: 10/11/2021] [Indexed: 12/25/2022] Open
Abstract
The two-domain actin-associated protein coronin interacts with filamentous (F-) actin, facilitating diverse biological processes including cell proliferation, motility, phagocytosis, host-parasite interaction and cargo binding. The conserved N-terminal β-propeller domain is involved in protein-protein interactions, while the C-terminal coiled-coil domain mediates oligomerization, transducing conformational changes. The L. donovani coronin coiled-coil (LdCoroCC) domain exhibited a novel topology and oligomer association with an inherent asymmetry, caused primarily by three 'a' residues of successive heptads. In the T. brucei homolog (TbrCoro), two of these 'a' residues are different (Val 493 and Val 507 replacing LdCoroCC Ile 486 and Met 500, respectively). The elucidated structure possesses a similar topology and assembly, while comparative structural analysis shows that the T. brucei coronin coiled-coil domain (TbrCoroCC) also possesses the asymmetry, though its magnitude is smaller. Analysis identifies that the asymmetric state is stabilized via cyclic salt bridges formed by Arg 497 and Glu 504. Co-localization studies (LdCoro, TbrCoro and corresponding mutant coiled-coil constructs) with actin show that there are subtle differences in their binding patterns, with the double mutant V493I-V507M showing maximal effect. None of the constructs have an effect on F-actin length. Taken together with LdCoroCC, we therefore conclude that the inherent asymmetric structures are essential for kinetoplastids, and are of interest in understanding and exploiting actin dynamics.
Affiliation(s)
- Pankaj Singh Parihar
- Division of Biochemistry and Structural Biology, CSIR - Central Drug Research Institute, Sector 10, Jankipuram Extension, Sitapur Road, Lucknow, 226031, Uttar Pradesh, India
- Aastha Singh
- Division of Biochemistry and Structural Biology, CSIR - Central Drug Research Institute, Sector 10, Jankipuram Extension, Sitapur Road, Lucknow, 226031, Uttar Pradesh, India
- Sharanbasappa Shrimant Karade
- Division of Biochemistry and Structural Biology, CSIR - Central Drug Research Institute, Sector 10, Jankipuram Extension, Sitapur Road, Lucknow, 226031, Uttar Pradesh, India
- Amogh Anant Sahasrabuddhe
- Division of Biochemistry and Structural Biology, CSIR - Central Drug Research Institute, Sector 10, Jankipuram Extension, Sitapur Road, Lucknow, 226031, Uttar Pradesh, India
- J Venkatesh Pratap
- Division of Biochemistry and Structural Biology, CSIR - Central Drug Research Institute, Sector 10, Jankipuram Extension, Sitapur Road, Lucknow, 226031, Uttar Pradesh, India
|
18
|
Nazet J, Lang E, Merkl R. Rosetta:MSF:NN: Boosting performance of multi-state computational protein design with a neural network. PLoS One 2021; 16:e0256691. [PMID: 34437621 PMCID: PMC8389498 DOI: 10.1371/journal.pone.0256691] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/23/2020] [Accepted: 08/12/2021] [Indexed: 12/05/2022] Open
Abstract
Rational protein design aims at the targeted modification of existing proteins. To reach this goal, software suites like Rosetta propose sequences to introduce the desired properties. Challenging design problems necessitate the representation of a protein by means of a structural ensemble. Thus, Rosetta multi-state design (MSD) protocols have been developed wherein each state represents one protein conformation. Computational demands of MSD protocols are high, because for each of the candidate sequences a costly three-dimensional (3D) model has to be created and assessed for all states. Each of these scores contributes one data point to a complex, design-specific energy landscape. As neural networks (NN) proved well-suited to learn such solution spaces, we integrated one into the framework Rosetta:MSF in place of the genetic algorithm used so far, with the aim of reducing computational costs. Like its predecessor, Rosetta:MSF:NN administers a set of candidate sequences and their scores and scans sequence space iteratively. During each iteration, the union of all candidate sequences and their Rosetta scores is used to re-train NNs that possess a design-specific architecture. The enormous speed of the NNs allows an extensive assessment of alternative sequences, which are ranked by the scores predicted by the NN. Costly 3D models are computed only for a small fraction of best-scoring sequences; these and the corresponding 3D-based scores replace half of the candidate sequences during each iteration. The analysis of two sets of candidate sequences generated for a specific design problem by means of a genetic algorithm confirmed that the NN predicted 3D-based scores quite well; the Pearson correlation coefficient was at least 0.95. Applying Rosetta:MSF:NN:enzdes to a benchmark consisting of 16 ligand-binding problems showed that this protocol converges ten times faster than the genetic algorithm and finds sequences with comparable scores.
Affiliation(s)
- Julian Nazet
- Institute of Biophysics and Physical Biochemistry, University of Regensburg, Regensburg, Germany
- Elmar Lang
- Institute of Biophysics and Physical Biochemistry, University of Regensburg, Regensburg, Germany
- Rainer Merkl
- Institute of Biophysics and Physical Biochemistry, University of Regensburg, Regensburg, Germany
|
19
|
Heide F, McDougall M, Harder-Viddal C, Roshko R, Davidson D, Wu J, Aprosoff C, Moya-Torres A, Lin F, Stetefeld J. Boron rich nanotube drug carrier system is suited for boron neutron capture therapy. Sci Rep 2021; 11:15520. [PMID: 34330984 PMCID: PMC8324832 DOI: 10.1038/s41598-021-95044-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/24/2021] [Accepted: 07/08/2021] [Indexed: 02/07/2023] Open
Abstract
Boron neutron capture therapy (BNCT) is a two-step therapeutic process that utilizes Boron-10 in combination with low energy neutrons to effectively eliminate targeted cells. This therapy is primarily used for difficult-to-treat head and neck carcinomas; recent advances have expanded this method to cover a broader range of carcinomas. However, it remains an unconventional therapy where one of the barriers for widespread adoption is the adequate delivery of Boron-10 to target cells. In an effort to address this issue, we examined a unique nanoparticle drug delivery system based on a highly stable and modular proteinaceous nanotube. Initially, we confirmed and structurally analyzed ortho-carborane binding into the cavities of the nanotube. The high ratio of Boron to proteinaceous mass and excellent thermal stability suggest the nanotube system as a suitable candidate for drug delivery into cancer cells. The full physicochemical characterization of the nanotube then allowed for further mechanistic molecular dynamics studies of the ortho-carborane uptake and calculations of corresponding energy profiles. Visualization of the binding event highlighted the protein dynamics and the importance of the interhelical channel formation to allow movement of the boron cluster into the nanotube. Additionally, cell assays showed that the nanotube can penetrate outer membranes of cancer cells followed by localization around the cells' nuclei. This work uses an integrative approach combining experimental data from structural studies, molecular dynamics simulations and biological experiments to thoroughly present an alternative drug delivery device for BNCT which offers additional benefits over current delivery methods.
Affiliation(s)
- Fabian Heide
- Department of Chemistry, University of Manitoba, Winnipeg, MB, R3T 2N2, Canada.
- Matthew McDougall
- Department of Chemistry, University of Manitoba, Winnipeg, MB, R3T 2N2, Canada
- Candice Harder-Viddal
- Department of Chemistry and Physics, Canadian Mennonite University, Winnipeg, MB, R3P 2N2, Canada
- Roy Roshko
- Department of Physics and Astronomy, University of Manitoba, Winnipeg, MB, R3T 2N2, Canada
- David Davidson
- Department of Chemistry, University of Manitoba, Winnipeg, MB, R3T 2N2, Canada
- Jiandong Wu
- Institute of Biomedical and Health Engineering, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
- Camila Aprosoff
- Department of Chemistry, University of Manitoba, Winnipeg, MB, R3T 2N2, Canada
- Aniel Moya-Torres
- Department of Chemistry, University of Manitoba, Winnipeg, MB, R3T 2N2, Canada
- Francis Lin
- Department of Physics and Astronomy, University of Manitoba, Winnipeg, MB, R3T 2N2, Canada
- Jörg Stetefeld
- Department of Chemistry, University of Manitoba, Winnipeg, MB, R3T 2N2, Canada.
|
20
|
Woolfson DN. A Brief History of De Novo Protein Design: Minimal, Rational, and Computational. J Mol Biol 2021; 433:167160. [PMID: 34298061 DOI: 10.1016/j.jmb.2021.167160] [Citation(s) in RCA: 55] [Impact Index Per Article: 18.3] [Received: 06/06/2021] [Revised: 07/07/2021] [Accepted: 07/12/2021] [Indexed: 12/26/2022]
Abstract
Protein design has come of age, but how will it mature? In the 1980s and the 1990s, the primary motivation for de novo protein design was to test our understanding of the informational aspect of the protein-folding problem; i.e., how does protein sequence determine protein structure and function? This necessitated minimal and rational design approaches whereby the placement of each residue in a design was reasoned using chemical principles and/or biochemical knowledge. At that time, though with some notable exceptions, the use of computers to aid design was not widespread. Over the past two decades, the tables have turned and computational protein design is firmly established. Here, I illustrate this progress through a timeline of de novo protein structures that have been solved to atomic resolution and deposited in the Protein Data Bank. From this, it is clear that the impact of rational and computational design has been considerable: More-complex and more-sophisticated designs are being targeted with many being resolved to atomic resolution. Furthermore, our ability to generate and manipulate synthetic proteins has advanced to a point where they are providing realistic alternatives to natural protein functions for applications both in vitro and in cells. Also, and increasingly, computational protein design is becoming accessible to non-specialists. This all begs the questions: Is there still a place for minimal and rational design approaches? And, what challenges lie ahead for the burgeoning field of de novo protein design as a whole?
Affiliation(s)
- Derek N Woolfson
- School of Chemistry, University of Bristol, Cantock's Close, Bristol BS8 1TS, UK; School of Biochemistry, University of Bristol, Biomedical Sciences Building, University Walk, Bristol BS8 1TD, UK; Bristol BioDesign Institute, University of Bristol, Life Sciences Building, Tyndall Avenue, Bristol BS8 1TQ, UK.
|
21
|
Koga N, Koga R, Liu G, Castellanos J, Montelione GT, Baker D. Role of backbone strain in de novo design of complex α/β protein structures. Nat Commun 2021; 12:3921. [PMID: 34168113 PMCID: PMC8225619 DOI: 10.1038/s41467-021-24050-7] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 11/03/2020] [Accepted: 05/28/2021] [Indexed: 12/24/2022] Open
Abstract
We previously elucidated principles for designing ideal proteins with completely consistent local and non-local interactions which have enabled the design of a wide range of new αβ-proteins with four or fewer β-strands. The principles relate local backbone structures to supersecondary-structure packing arrangements of α-helices and β-strands. Here, we test the generality of the principles by employing them to design larger proteins with five- and six-stranded β-sheets flanked by α-helices. The initial designs were monomeric in solution with high thermal stability, and the nuclear magnetic resonance (NMR) structure of one was close to the design model, but for two others the order of strands in the β-sheet was swapped. Investigation into the origins of this strand swapping suggested that the global structures of the design models were more strained than the NMR structures. We incorporated explicit consideration of global backbone strain into the design methodology, and succeeded in designing proteins with the intended unswapped strand arrangements. These results illustrate the value of experimental structure determination in guiding improvement of de novo design, and the importance of consistency between local, supersecondary, and global tertiary interactions in determining protein topology. The augmented set of principles should inform the design of larger functional proteins.
Affiliation(s)
- Nobuyasu Koga
- University of Washington, Department of Biochemistry and Howard Hughes Medical Institute, Seattle, Washington, WA, USA; Research Center of Integrative Molecular Systems, Institute for Molecular Science, National Institutes of Natural Sciences, Okazaki, Aichi, Japan; Protein Design Group, Exploratory Research Center on Life and Living Systems (ExCELLS), National Institutes of Natural Sciences, Okazaki, Aichi, Japan; SOKENDAI, The Graduate University for Advanced Studies, Hayama, Kanagawa, Japan
- Rie Koga
- University of Washington, Department of Biochemistry and Howard Hughes Medical Institute, Seattle, Washington, WA, USA; Protein Design Group, Exploratory Research Center on Life and Living Systems (ExCELLS), National Institutes of Natural Sciences, Okazaki, Aichi, Japan
- Gaohua Liu
- Nexomics Biosciences, Rocky Hill, NJ, USA
- Javier Castellanos
- University of Washington, Department of Biochemistry and Howard Hughes Medical Institute, Seattle, Washington, WA, USA
- Gaetano T Montelione
- Department of Chemistry and Chemical Biology, and Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, New York, NY, USA.
- David Baker
- University of Washington, Department of Biochemistry and Howard Hughes Medical Institute, Seattle, Washington, WA, USA.
|
22
|
Pereira JM, Vieira M, Santos SM. Step-by-step design of proteins for small molecule interaction: A review on recent milestones. Protein Sci 2021; 30:1502-1520. [PMID: 33934427 DOI: 10.1002/pro.4098] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/25/2021] [Revised: 04/21/2021] [Accepted: 04/23/2021] [Indexed: 01/01/2023]
Abstract
Protein design is the field of synthetic biology that aims at developing de novo custom-made proteins and peptides for specific applications. Despite exploring an ambitious goal, recent computational advances in both hardware and software technologies have paved the way to high-throughput screening and detailed design of novel folds and improved functionalities. Modern advances in the field of protein design for small molecule targeting are described in this review, organized in a step-by-step fashion: from the conception of a new or upgraded active binding site, to scaffold design, sequence optimization, and experimental expression of the custom protein. In each step, contemporary examples are described, and state-of-the-art software is briefly explored.
Affiliation(s)
- José M Pereira
- CICECO & Departamento de Química, Universidade de Aveiro, Aveiro, Portugal
- Maria Vieira
- CICECO & Departamento de Química, Universidade de Aveiro, Aveiro, Portugal
- Sérgio M Santos
- CICECO & Departamento de Química, Universidade de Aveiro, Aveiro, Portugal
|
23
|
Rhys GG, Dawson WM, Beesley JL, Martin FJO, Brady RL, Thomson AR, Woolfson DN. How Coiled-Coil Assemblies Accommodate Multiple Aromatic Residues. Biomacromolecules 2021; 22:2010-2019. [PMID: 33881308 DOI: 10.1021/acs.biomac.1c00131] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/30/2022]
Abstract
Rational protein design requires understanding the contribution of each amino acid to a targeted protein fold. For a subset of protein structures, namely, α-helical coiled coils (CCs), knowledge is sufficiently advanced to allow the rational de novo design of many structures, including entirely new protein folds. Current CC design rules center on using aliphatic hydrophobic residues predominantly to drive the folding and assembly of amphipathic α helices. The consequences of using aromatic residues (which would be useful for introducing structural probes, and binding and catalytic functionalities) in these interfaces are not understood. There are specific examples of designed CCs containing such aromatic residues, e.g., phenylalanine-rich sequences, and the use of polar aromatic residues to make buried hydrogen-bond networks. However, it is not known generally if sequences rich in tyrosine can form CCs, or what CC assemblies these would lead to. Here, we explore tyrosine-rich sequences in a general CC-forming background and resolve new CC structures. In one of these, an antiparallel tetramer, the tyrosine residues are solvent accessible and pack at the interface between the core and the surface. In another more complex structure, the residues are buried and form an extended hydrogen-bond network.
Affiliation(s)
- Guto G Rhys
- School of Chemistry, University of Bristol, Cantock's Close, Bristol BS8 1TS, United Kingdom; Department of Biochemistry, University of Bayreuth, Universitätsstraße 30, 95447 Bayreuth, Germany
- William M Dawson
- School of Chemistry, University of Bristol, Cantock's Close, Bristol BS8 1TS, United Kingdom
- Joseph L Beesley
- School of Chemistry, University of Bristol, Cantock's Close, Bristol BS8 1TS, United Kingdom
- Freddie J O Martin
- School of Chemistry, University of Bristol, Cantock's Close, Bristol BS8 1TS, United Kingdom
- R Leo Brady
- School of Biochemistry, University of Bristol, Medical Sciences Building, University Walk, Bristol BS8 1TD, United Kingdom
- Andrew R Thomson
- School of Chemistry, University of Bristol, Cantock's Close, Bristol BS8 1TS, United Kingdom; School of Chemistry, University of Glasgow, Glasgow G12 8QQ, United Kingdom
- Derek N Woolfson
- School of Chemistry, University of Bristol, Cantock's Close, Bristol BS8 1TS, United Kingdom; School of Biochemistry, University of Bristol, Medical Sciences Building, University Walk, Bristol BS8 1TD, United Kingdom; Bristol BioDesign Institute, University of Bristol, Life Sciences Building, Tyndall Avenue, Bristol BS8 1TQ, United Kingdom
|
24
|
Orientational Ambiguity in Septin Coiled Coils and its Structural Basis. J Mol Biol 2021; 433:166889. [PMID: 33639214 DOI: 10.1016/j.jmb.2021.166889] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 11/21/2020] [Revised: 01/25/2021] [Accepted: 02/17/2021] [Indexed: 12/21/2022]
Abstract
Septins are an example of subtle molecular recognition whereby different paralogues must correctly assemble into functional filaments important for essential cellular events such as cytokinesis. Most possess C-terminal domains capable of forming coiled coils which are believed to be involved in filament formation and bundling. Here, we report an integrated structural approach which aims to unravel their architectural diversity and in so doing provide direct structural information for the coiled-coil regions of five human septins. Unexpectedly, we encounter dimeric structures presenting both parallel and antiparallel arrangements which are in consonance with molecular modelling suggesting that both are energetically accessible. These sequences therefore code for two metastable states of different orientations which employ different but overlapping interfaces. The antiparallel structures present a mixed coiled-coil interface, one side of which is dominated by a continuous chain of core hydrophilic residues. This unusual type of coiled coil could be used to expand the toolkit currently available to the protein engineer for the design of previously unforeseen coiled-coil based assemblies. Within a physiological context, our data provide the first atomic details related to the assumption that the parallel orientation is likely formed between septin monomers from the same filament whilst antiparallelism may participate in the widely described interfilament cross bridges necessary for higher order structures and thereby septin function.
|
25
|
Hawkins-Hooker A, Depardieu F, Baur S, Couairon G, Chen A, Bikard D. Generating functional protein variants with variational autoencoders. PLoS Comput Biol 2021; 17:e1008736. [PMID: 33635868 PMCID: PMC7946179 DOI: 10.1371/journal.pcbi.1008736] [Citation(s) in RCA: 69] [Impact Index Per Article: 23.0] [Received: 08/19/2020] [Revised: 03/10/2021] [Accepted: 01/25/2021] [Indexed: 11/20/2022] Open
Abstract
The vast expansion of protein sequence databases provides an opportunity for new protein design approaches which seek to learn the sequence-function relationship directly from natural sequence variation. Deep generative models trained on protein sequence data have been shown to learn biologically meaningful representations helpful for a variety of downstream tasks, but their potential for direct use in the design of novel proteins remains largely unexplored. Here we show that variational autoencoders trained on a dataset of almost 70000 luciferase-like oxidoreductases can be used to generate novel, functional variants of the luxA bacterial luciferase. We propose separate VAE models to work with aligned sequence input (MSA VAE) and raw sequence input (AR-VAE), and offer evidence that while both are able to reproduce patterns of amino acid usage characteristic of the family, the MSA VAE is better able to capture long-distance dependencies reflecting the influence of 3D structure. To confirm the practical utility of the models, we used them to generate variants of luxA whose luminescence activity was validated experimentally. We further showed that conditional variants of both models could be used to increase the solubility of luxA without disrupting function. Altogether 6/12 of the variants generated using the unconditional AR-VAE and 9/11 generated using the unconditional MSA VAE retained measurable luminescence, together with all 23 of the less distant variants generated by conditional versions of the models; the most distant functional variant contained 35 differences relative to the nearest training set sequence. These results demonstrate the feasibility of using deep generative models to explore the space of possible protein sequences and generate useful variants, providing a method complementary to rational design and directed evolution approaches.
Affiliation(s)
- Alex Hawkins-Hooker
- Synthetic Biology Group, Microbiology Department, Institut Pasteur, Paris, France
- Florence Depardieu
- Synthetic Biology Group, Microbiology Department, Institut Pasteur, Paris, France
- Sebastien Baur
- Synthetic Biology Group, Microbiology Department, Institut Pasteur, Paris, France
- Guillaume Couairon
- Synthetic Biology Group, Microbiology Department, Institut Pasteur, Paris, France
- Arthur Chen
- Synthetic Biology Group, Microbiology Department, Institut Pasteur, Paris, France
- David Bikard
- Synthetic Biology Group, Microbiology Department, Institut Pasteur, Paris, France
|
26
|
Pan X, Kortemme T. Recent advances in de novo protein design: Principles, methods, and applications. J Biol Chem 2021; 296:100558. [PMID: 33744284 PMCID: PMC8065224 DOI: 10.1016/j.jbc.2021.100558] [Citation(s) in RCA: 82] [Impact Index Per Article: 27.3] [Received: 01/13/2021] [Revised: 03/12/2021] [Accepted: 03/16/2021] [Indexed: 02/06/2023] Open
Abstract
Computational de novo protein design is increasingly applied to address a number of key challenges in biomedicine and biological engineering. Successes in expanding applications are driven by advances in design principles and methods over several decades. Here, we review recent innovations in major aspects of de novo protein design and include how these advances were informed by principles of protein architecture and interactions derived from the wealth of structures in the Protein Data Bank. We describe developments in de novo generation of designable backbone structures, optimization of sequences, design scoring functions, and the design of function. The advances not only highlight design goals reachable now but also point to the challenges and opportunities for the future of the field.
Affiliation(s)
- Xingjie Pan
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, California, USA; UC Berkeley - UCSF Graduate Program in Bioengineering, University of California San Francisco, San Francisco, California, USA.
| | - Tanja Kortemme
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, California, USA; UC Berkeley - UCSF Graduate Program in Bioengineering, University of California San Francisco, San Francisco, California, USA; Quantitative Biosciences Institute (QBI), University of California San Francisco, San Francisco, California, USA.
| |
|
27
|
Robust folding of a de novo designed ideal protein even with most of the core mutated to valine. Proc Natl Acad Sci U S A 2020; 117:31149-31156. [PMID: 33229587 PMCID: PMC7739874 DOI: 10.1073/pnas.2002120117] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 01/26/2023] Open
Abstract
De novo designed proteins exhibit a remarkable property of extremely high thermal stability compared with naturally occurring proteins. The designed proteins are completely optimized for folding; the backbone structures are created by using a set of rules that relate local backbone structures to preferred tertiary motifs and the side chains are designed to favor both the local backbone structures and the entire tertiary structures. Here, we found that one of the de novo designed proteins, which was mutated to fill the core with mostly valine residues, still has the folding ability and shows high stability (Tm = 106 °C) even with its reduced and loosened core packing. This result supports the importance of local backbone structures to protein folding. Protein design provides a stringent test for our understanding of protein folding. We previously described principles for designing ideal protein structures stabilized by consistent local and nonlocal interactions, based on a set of rules relating local backbone structures to tertiary packing motifs. The principles have made possible the design of protein structures having various topologies with high thermal stability. Whereas nonlocal interactions such as tight hydrophobic core packing have traditionally been considered to be crucial for protein folding and stability, the rules proposed by our previous studies suggest the importance of local backbone structures to protein folding. In this study, we investigated the robustness of folding of de novo designed proteins to the reduction of the hydrophobic core, by extensive mutation of large hydrophobic residues (Leu, Ile) to smaller ones (Val) for one of the designs. Surprisingly, even after 10 Leu and Ile residues were mutated to Val, this mutant with the core mostly filled with Val was found not to be in a molten globule state and to fold into the same backbone structure as the original design, with high stability. These results indicate the importance of local backbone structures to the folding ability and high thermal stability of designed proteins and suggest a method for engineering thermally stabilized natural proteins.
|
28
|
Mignon D, Druart K, Michael E, Opuu V, Polydorides S, Villa F, Gaillard T, Panel N, Archontis G, Simonson T. Physics-Based Computational Protein Design: An Update. J Phys Chem A 2020; 124:10637-10648. [DOI: 10.1021/acs.jpca.0c07605] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 02/06/2023]
Affiliation(s)
- David Mignon
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, 91128 Palaiseau, France
| | - Karen Druart
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, 91128 Palaiseau, France
| | - Eleni Michael
- Department of Physics, University of Cyprus, PO20537, CY1678 Nicosia, Cyprus
| | - Vaitea Opuu
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, 91128 Palaiseau, France
| | - Savvas Polydorides
- Department of Physics, University of Cyprus, PO20537, CY1678 Nicosia, Cyprus
| | - Francesco Villa
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, 91128 Palaiseau, France
| | - Thomas Gaillard
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, 91128 Palaiseau, France
| | - Nicolas Panel
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, 91128 Palaiseau, France
| | - Georgios Archontis
- Department of Physics, University of Cyprus, PO20537, CY1678 Nicosia, Cyprus
| | - Thomas Simonson
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, 91128 Palaiseau, France
| |
|
29
|
Sabban S, Markovsky M. RamaNet: Computational de novo helical protein backbone design using a long short-term memory generative neural network. F1000Res 2020. [DOI: 10.12688/f1000research.22907.3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/20/2022] Open
Abstract
The ability to perform de novo protein design will allow researchers to expand the variety of available proteins. By designing synthetic structures computationally, they can utilise more structures than those available in the Protein Data Bank, design structures that are not found in nature, or direct the design of proteins to acquire a specific desired structure. While some researchers attempt to design proteins from first physical and thermodynamic principles, we tested whether de novo helical protein design of just the backbone can be performed statistically using machine learning, by building a model that uses a long short-term memory (LSTM) architecture. The LSTM model used only the φ and ψ angles of each residue from an augmented dataset of only helical protein structures. Though the network’s generated backbone structures were not perfect, they were idealised and evaluated after generation: the non-ideal structures were filtered out and the adequate structures kept. The results were successful in developing a logical, rigid, compact, helical protein backbone topology. This paper is a proof of concept that shows it is possible to generate a novel helical backbone topology with an LSTM neural network architecture using only the φ and ψ angles as features. The next step is to take these backbone topologies and sequence design them to form complete protein structures.
|
30
|
Turoňová B, Sikora M, Schürmann C, Hagen WJH, Welsch S, Blanc FEC, von Bülow S, Gecht M, Bagola K, Hörner C, van Zandbergen G, Landry J, de Azevedo NTD, Mosalaganti S, Schwarz A, Covino R, Mühlebach MD, Hummer G, Krijnse Locker J, Beck M. In situ structural analysis of SARS-CoV-2 spike reveals flexibility mediated by three hinges. Science 2020; 370:203-208. [PMID: 32817270 DOI: 10.1101/2020.06.26.173476] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 06/26/2020] [Accepted: 08/13/2020] [Indexed: 05/24/2023]
Abstract
The spike protein (S) of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is required for cell entry and is the primary focus for vaccine development. In this study, we combined cryo-electron tomography, subtomogram averaging, and molecular dynamics simulations to structurally analyze S in situ. Compared with the recombinant S, the viral S was more heavily glycosylated and occurred mostly in the closed prefusion conformation. We show that the stalk domain of S contains three hinges, giving the head unexpected orientational freedom. We propose that the hinges allow S to scan the host cell surface, shielded from antibodies by an extensive glycan coat. The structure of native S contributes to our understanding of SARS-CoV-2 infection and potentially to the development of safe vaccines.
Affiliation(s)
- Beata Turoňová
- Structural and Computational Biology Unit, European Molecular Biology Laboratory (EMBL), Meyerhofstr. 1, 69117 Heidelberg, Germany
- Department of Molecular Sociology, Max Planck Institute of Biophysics, Max-von-Laue Str. 3, 60438 Frankfurt am Main, Germany
| | - Mateusz Sikora
- Department of Theoretical Biophysics, Max Planck Institute of Biophysics, Max-von-Laue Str. 3, 60438 Frankfurt am Main, Germany
| | - Christoph Schürmann
- Division of Veterinary Medicine, Paul Ehrlich Institute, Paul Ehrlich Strasse 51-59, 63225 Langen, Germany
| | - Wim J H Hagen
- Structural and Computational Biology Unit, European Molecular Biology Laboratory (EMBL), Meyerhofstr. 1, 69117 Heidelberg, Germany
| | - Sonja Welsch
- Central Electron Microscopy Facility, Max Planck Institute of Biophysics, Max-von-Laue Str. 3, 60438 Frankfurt am Main, Germany
| | - Florian E C Blanc
- Department of Theoretical Biophysics, Max Planck Institute of Biophysics, Max-von-Laue Str. 3, 60438 Frankfurt am Main, Germany
| | - Sören von Bülow
- Department of Theoretical Biophysics, Max Planck Institute of Biophysics, Max-von-Laue Str. 3, 60438 Frankfurt am Main, Germany
| | - Michael Gecht
- Department of Theoretical Biophysics, Max Planck Institute of Biophysics, Max-von-Laue Str. 3, 60438 Frankfurt am Main, Germany
| | - Katrin Bagola
- Division of Immunology, Paul Ehrlich Institute, Paul Ehrlich Strasse 51-59, 63225 Langen, Germany
| | - Cindy Hörner
- Division of Veterinary Medicine, Paul Ehrlich Institute, Paul Ehrlich Strasse 51-59, 63225 Langen, Germany
- German Center for Infection Research, Gießen-Marburg-Langen, Germany
| | - Ger van Zandbergen
- Division of Immunology, Paul Ehrlich Institute, Paul Ehrlich Strasse 51-59, 63225 Langen, Germany
- Institute for Immunology, University Medical Center, Johannes Gutenberg University Mainz, Mainz, Germany
- Research Center for Immunotherapy (FZI), University Medical Center, Johannes Gutenberg-University Mainz, Mainz, Germany
| | - Jonathan Landry
- Genomics Core Facility, EMBL, Meyerhofstr. 1, 69117 Heidelberg, Germany
| | | | - Shyamal Mosalaganti
- Structural and Computational Biology Unit, European Molecular Biology Laboratory (EMBL), Meyerhofstr. 1, 69117 Heidelberg, Germany
- Department of Molecular Sociology, Max Planck Institute of Biophysics, Max-von-Laue Str. 3, 60438 Frankfurt am Main, Germany
| | - Andre Schwarz
- Structural and Computational Biology Unit, European Molecular Biology Laboratory (EMBL), Meyerhofstr. 1, 69117 Heidelberg, Germany
| | - Roberto Covino
- Department of Theoretical Biophysics, Max Planck Institute of Biophysics, Max-von-Laue Str. 3, 60438 Frankfurt am Main, Germany
- Frankfurt Institute for Advanced Studies, Ruth-Moufang-Str. 1, 60438 Frankfurt am Main, Germany
| | - Michael D Mühlebach
- Division of Veterinary Medicine, Paul Ehrlich Institute, Paul Ehrlich Strasse 51-59, 63225 Langen, Germany
- German Center for Infection Research, Gießen-Marburg-Langen, Germany
| | - Gerhard Hummer
- Department of Theoretical Biophysics, Max Planck Institute of Biophysics, Max-von-Laue Str. 3, 60438 Frankfurt am Main, Germany.
- Institute of Biophysics, Goethe University Frankfurt, 60438 Frankfurt am Main, Germany
| | - Jacomine Krijnse Locker
- Electron Microscopy of Pathogens Unit, Paul Ehrlich Institute, Paul Ehrlich Strasse 51-59, 63225 Langen, Germany.
| | - Martin Beck
- Structural and Computational Biology Unit, European Molecular Biology Laboratory (EMBL), Meyerhofstr. 1, 69117 Heidelberg, Germany.
- Department of Molecular Sociology, Max Planck Institute of Biophysics, Max-von-Laue Str. 3, 60438 Frankfurt am Main, Germany
| |
|
31
|
Pan X, Thompson MC, Zhang Y, Liu L, Fraser JS, Kelly MJS, Kortemme T. Expanding the space of protein geometries by computational design of de novo fold families. Science 2020; 369:1132-1136. [PMID: 32855341 DOI: 10.1126/science.abc0881] [Citation(s) in RCA: 43] [Impact Index Per Article: 10.8] [Received: 04/06/2020] [Accepted: 07/14/2020] [Indexed: 01/03/2023]
Abstract
Naturally occurring proteins vary the precise geometries of structural elements to create distinct shapes optimal for function. We present a computational design method, loop-helix-loop unit combinatorial sampling (LUCS), that mimics nature's ability to create families of proteins with the same overall fold but precisely tunable geometries. Through near-exhaustive sampling of loop-helix-loop elements, LUCS generates highly diverse geometries encompassing those found in nature but also surpassing known structure space. Biophysical characterization showed that 17 (38%) of 45 tested LUCS designs encompassing two different structural topologies were well folded, including 16 with designed non-native geometries. Four experimentally solved structures closely matched the designs. LUCS greatly expands the designable structure space and offers a new paradigm for designing proteins with tunable geometries that may be customizable for novel functions.
Affiliation(s)
- Xingjie Pan
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA, USA; UC Berkeley-UCSF Graduate Program in Bioengineering, University of California, San Francisco, CA, USA
| | - Michael C Thompson
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA, USA
| | - Yang Zhang
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA, USA
| | - Lin Liu
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA, USA
| | - James S Fraser
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA, USA; Quantitative Biosciences Institute, University of California, San Francisco, CA, USA
| | - Mark J S Kelly
- Department of Pharmaceutical Chemistry, University of California, San Francisco, CA, USA
| | - Tanja Kortemme
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA, USA; UC Berkeley-UCSF Graduate Program in Bioengineering, University of California, San Francisco, CA, USA; Quantitative Biosciences Institute, University of California, San Francisco, CA, USA; Chan Zuckerberg Biohub, San Francisco, CA, USA
| |
|
32
|
Turoňová B, Sikora M, Schürmann C, Hagen WJH, Welsch S, Blanc FEC, von Bülow S, Gecht M, Bagola K, Hörner C, van Zandbergen G, Landry J, de Azevedo NTD, Mosalaganti S, Schwarz A, Covino R, Mühlebach MD, Hummer G, Krijnse Locker J, Beck M. In situ structural analysis of SARS-CoV-2 spike reveals flexibility mediated by three hinges. Science 2020; 370:203-208. [PMID: 32817270 PMCID: PMC7665311 DOI: 10.1126/science.abd5223] [Citation(s) in RCA: 429] [Impact Index Per Article: 107.3] [Received: 06/26/2020] [Accepted: 08/13/2020] [Indexed: 12/12/2022]
Abstract
The spike protein (S) of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is required for cell entry and is the primary focus for vaccine development. In this study, we combined cryo-electron tomography, subtomogram averaging, and molecular dynamics simulations to structurally analyze S in situ. Compared with the recombinant S, the viral S was more heavily glycosylated and occurred mostly in the closed prefusion conformation. We show that the stalk domain of S contains three hinges, giving the head unexpected orientational freedom. We propose that the hinges allow S to scan the host cell surface, shielded from antibodies by an extensive glycan coat. The structure of native S contributes to our understanding of SARS-CoV-2 infection and potentially to the development of safe vaccines.
Affiliation(s)
- Beata Turoňová
- Structural and Computational Biology Unit, European Molecular Biology Laboratory (EMBL), Meyerhofstr. 1, 69117 Heidelberg, Germany; Department of Molecular Sociology, Max Planck Institute of Biophysics, Max-von-Laue Str. 3, 60438 Frankfurt am Main, Germany
| | - Mateusz Sikora
- Department of Theoretical Biophysics, Max Planck Institute of Biophysics, Max-von-Laue Str. 3, 60438 Frankfurt am Main, Germany
| | - Christoph Schürmann
- Division of Veterinary Medicine, Paul Ehrlich Institute, Paul Ehrlich Strasse 51-59, 63225 Langen, Germany
| | - Wim J H Hagen
- Structural and Computational Biology Unit, European Molecular Biology Laboratory (EMBL), Meyerhofstr. 1, 69117 Heidelberg, Germany
| | - Sonja Welsch
- Central Electron Microscopy Facility, Max Planck Institute of Biophysics, Max-von-Laue Str. 3, 60438 Frankfurt am Main, Germany
| | - Florian E C Blanc
- Department of Theoretical Biophysics, Max Planck Institute of Biophysics, Max-von-Laue Str. 3, 60438 Frankfurt am Main, Germany
| | - Sören von Bülow
- Department of Theoretical Biophysics, Max Planck Institute of Biophysics, Max-von-Laue Str. 3, 60438 Frankfurt am Main, Germany
| | - Michael Gecht
- Department of Theoretical Biophysics, Max Planck Institute of Biophysics, Max-von-Laue Str. 3, 60438 Frankfurt am Main, Germany
| | - Katrin Bagola
- Division of Immunology, Paul Ehrlich Institute, Paul Ehrlich Strasse 51-59, 63225 Langen, Germany
| | - Cindy Hörner
- Division of Veterinary Medicine, Paul Ehrlich Institute, Paul Ehrlich Strasse 51-59, 63225 Langen, Germany; German Center for Infection Research, Gießen-Marburg-Langen, Germany
| | - Ger van Zandbergen
- Division of Immunology, Paul Ehrlich Institute, Paul Ehrlich Strasse 51-59, 63225 Langen, Germany; Institute for Immunology, University Medical Center, Johannes Gutenberg University Mainz, Mainz, Germany; Research Center for Immunotherapy (FZI), University Medical Center, Johannes Gutenberg-University Mainz, Mainz, Germany
| | - Jonathan Landry
- Genomics Core Facility, EMBL, Meyerhofstr. 1, 69117 Heidelberg, Germany
| | | | - Shyamal Mosalaganti
- Structural and Computational Biology Unit, European Molecular Biology Laboratory (EMBL), Meyerhofstr. 1, 69117 Heidelberg, Germany; Department of Molecular Sociology, Max Planck Institute of Biophysics, Max-von-Laue Str. 3, 60438 Frankfurt am Main, Germany
| | - Andre Schwarz
- Structural and Computational Biology Unit, European Molecular Biology Laboratory (EMBL), Meyerhofstr. 1, 69117 Heidelberg, Germany
| | - Roberto Covino
- Department of Theoretical Biophysics, Max Planck Institute of Biophysics, Max-von-Laue Str. 3, 60438 Frankfurt am Main, Germany; Frankfurt Institute for Advanced Studies, Ruth-Moufang-Str. 1, 60438 Frankfurt am Main, Germany
| | - Michael D Mühlebach
- Division of Veterinary Medicine, Paul Ehrlich Institute, Paul Ehrlich Strasse 51-59, 63225 Langen, Germany; German Center for Infection Research, Gießen-Marburg-Langen, Germany
| | - Gerhard Hummer
- Department of Theoretical Biophysics, Max Planck Institute of Biophysics, Max-von-Laue Str. 3, 60438 Frankfurt am Main, Germany; Institute of Biophysics, Goethe University Frankfurt, 60438 Frankfurt am Main, Germany
| | - Jacomine Krijnse Locker
- Electron Microscopy of Pathogens Unit, Paul Ehrlich Institute, Paul Ehrlich Strasse 51-59, 63225 Langen, Germany.
| | - Martin Beck
- Structural and Computational Biology Unit, European Molecular Biology Laboratory (EMBL), Meyerhofstr. 1, 69117 Heidelberg, Germany; Department of Molecular Sociology, Max Planck Institute of Biophysics, Max-von-Laue Str. 3, 60438 Frankfurt am Main, Germany
| |
|
33
|
Abstract
Atom pairwise potential functions make up an essential part of many scoring functions for protein decoy detection. With the development of machine learning (ML) tools, there are multiple ways to combine potential functions to create novel ML models and methods. Potential function parameters can be easily extracted; however, it is usually hard to directly obtain the calculated atom pairwise energies from scoring functions. Amber, as one of the most popular suites of modeling programs, has an extensive history and library of force field potential functions. In this work, we directly used the force field parameters in ff94 and ff14SB from Amber and encoded them to calculate atom pairwise energies for different interactions. Two sets of structures (a single amino acid set and a dipeptide set) were used to evaluate the performance of our encoded Amber potentials. Comparing the energy terms obtained from our encoding with those from Amber, we find energy differences within ±0.06 kcal/mol for all tested structures. Previously we have shown that the Random Forest (RF) model can help to emphasize more important atom pairwise interactions and ignore insignificant ones [Pei, J.; Zheng, Z.; Merz, K. M. J. Chem. Inf. Model. 2019, 59, 1919-1929]. Here, as an example of combining ML methods with traditional potential functions, we followed the same workflow to combine the RF models with force field potential functions from Amber. To determine the performance of our RF models with force field potential functions, 224 different protein native-decoy systems were used as our training and testing sets. We find that the RF models with ff94 and ff14SB force field parameters outperformed all other scoring functions (RF models with KECSA2, RWplus, DFIRE, dDFIRE, and GOAP) considered in this work for native structure detection, and they performed similarly in detecting the best decoy. Through inclusion of best decoy to decoy comparisons in building our RF models, we were able to generate models that outperformed the score functions tested herein both on accuracy and best decoy detection, again showing the performance and flexibility of our RF models to tackle this problem. Finally, the importance of the RF algorithm and of the force field parameters was also tested, and the comparison results suggest that both are important, with the ML scoring function achieving its best performance only when they are combined. All code and data used in this work are available at https://github.com/JunPei000/FFENCODER_for_Protein_Folding_Pose_Selection.
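To make the notion of an "atom pairwise energy" concrete, the following minimal sketch evaluates the non-bonded energy of a single atom pair using the 12-6 Lennard-Jones plus Coulomb functional form used by Amber-family force fields. It is an illustration only, not the authors' encoding: the function names are hypothetical, the numeric parameters in the usage note are invented rather than taken from ff94 or ff14SB, and bonded terms, 1-4 scaling, and cutoffs are ignored.

```python
import math

# Coulomb constant in kcal*Angstrom/(mol*e^2), the value conventionally
# used with Amber-style units (charges in e, distances in Angstrom).
COULOMB_KCAL = 332.0522


def lj_energy(r, rmin_half_i, eps_i, rmin_half_j, eps_j):
    """12-6 Lennard-Jones energy (kcal/mol) for one atom pair at distance r.

    Uses Lorentz-Berthelot combining rules: the pair minimum distance is the
    sum of the two per-atom half-distances, and the pair well depth is the
    geometric mean of the per-atom well depths.
    """
    rmin = rmin_half_i + rmin_half_j
    eps = math.sqrt(eps_i * eps_j)
    ratio6 = (rmin / r) ** 6
    return eps * (ratio6 * ratio6 - 2.0 * ratio6)


def coulomb_energy(r, q_i, q_j):
    """Coulomb energy (kcal/mol) for two point charges at distance r."""
    return COULOMB_KCAL * q_i * q_j / r


def pair_energy(r, atom_i, atom_j):
    """Total non-bonded energy for one pair.

    Each atom is a dict with hypothetical keys 'rmin_half', 'eps', 'q'.
    """
    return (
        lj_energy(r, atom_i["rmin_half"], atom_i["eps"],
                  atom_j["rmin_half"], atom_j["eps"])
        + coulomb_energy(r, atom_i["q"], atom_j["q"])
    )
```

For example, `pair_energy(3.5, {"rmin_half": 1.9, "eps": 0.1, "q": -0.4}, {"rmin_half": 1.7, "eps": 0.2, "q": 0.3})` returns the energy for a made-up pair. Summing such terms over selected atom-type pairs is one plausible way to build per-pair features for a downstream model, though the abstract does not specify the feature layout.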
Affiliation(s)
- Jun Pei
- Department of Chemistry and the Department of Biochemistry and Molecular Biology, Michigan State University, 578 South Shaw Lane, East Lansing, Michigan 48824, United States
| | - Lin Frank Song
- Department of Chemistry and the Department of Biochemistry and Molecular Biology, Michigan State University, 578 South Shaw Lane, East Lansing, Michigan 48824, United States
| | - Kenneth M Merz
- Department of Chemistry and the Department of Biochemistry and Molecular Biology, Michigan State University, 578 South Shaw Lane, East Lansing, Michigan 48824, United States
| |
|
34
|
Opuu V, Sun YJ, Hou T, Panel N, Fuentes EJ, Simonson T. A physics-based energy function allows the computational redesign of a PDZ domain. Sci Rep 2020; 10:11150. [PMID: 32636412 PMCID: PMC7341745 DOI: 10.1038/s41598-020-67972-w] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 03/18/2020] [Accepted: 06/08/2020] [Indexed: 11/30/2022] Open
Abstract
Computational protein design (CPD) can address the inverse folding problem, exploring a large space of sequences and selecting ones predicted to fold. CPD was used previously to redesign several proteins, employing a knowledge-based energy function for both the folded and unfolded states. We show that a PDZ domain can be entirely redesigned using a "physics-based" energy for the folded state and a knowledge-based energy for the unfolded state. Thousands of sequences were generated by Monte Carlo simulation. Three were chosen for experimental testing, based on their low energies and several empirical criteria. All three could be overexpressed and had native-like circular dichroism spectra and 1D-NMR spectra typical of folded structures. Two had upshifted thermal denaturation curves when a peptide ligand was present, indicating binding and suggesting folding to a correct PDZ structure. Evidently, the physical principles that govern folded proteins, with a dash of empirical post-filtering, can allow successful whole-protein redesign.
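The idea of generating sequences by Monte Carlo simulation can be sketched with a Metropolis sampler over amino acid sequences. The code below is a toy illustration under stated assumptions, not the method of this paper: `toy_energy` is a hypothetical stand-in that merely penalizes polar residues at buried positions and hydrophobic residues at exposed ones, whereas the study used a physics-based folded-state energy and a knowledge-based unfolded-state energy. All names and parameter values are invented.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
HYDROPHOBIC = set("AVILMFWC")  # crude, illustrative classification


def toy_energy(seq, buried):
    """Hypothetical energy: +1 for each position whose residue type
    (hydrophobic or not) disagrees with its burial flag."""
    return float(sum(1 for aa, b in zip(seq, buried)
                     if b != (aa in HYDROPHOBIC)))


def metropolis_design(buried, n_steps=5000, kT=0.5, seed=0):
    """Sample sequences with single-site mutations and Metropolis acceptance.

    Returns the lowest-energy sequence seen and its energy.
    """
    rng = random.Random(seed)
    seq = [rng.choice(AMINO_ACIDS) for _ in buried]
    energy = toy_energy(seq, buried)
    best_seq, best_energy = list(seq), energy

    for _ in range(n_steps):
        pos = rng.randrange(len(seq))
        old = seq[pos]
        seq[pos] = rng.choice(AMINO_ACIDS)   # propose a point mutation
        new_energy = toy_energy(seq, buried)
        delta = new_energy - energy
        # Accept downhill moves always, uphill moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / kT):
            energy = new_energy
            if energy < best_energy:
                best_seq, best_energy = list(seq), energy
        else:
            seq[pos] = old                   # reject: restore previous residue

    return "".join(best_seq), best_energy
```

Calling `metropolis_design([True, False, True, True, False, False])` returns one low-energy sequence for a six-residue burial pattern. In a realistic pipeline, many sampled sequences would be kept and then filtered, as the abstract describes, by energy and additional empirical criteria.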
Affiliation(s)
- Vaitea Opuu
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, Institut Polytechnique de Paris, Palaiseau, France
| | - Young Joo Sun
- Department of Biochemistry, Carver College of Medicine, University of Iowa, Iowa City, USA
| | - Titus Hou
- Department of Biochemistry, Carver College of Medicine, University of Iowa, Iowa City, USA
| | - Nicolas Panel
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, Institut Polytechnique de Paris, Palaiseau, France
| | - Ernesto J Fuentes
- Department of Biochemistry, Carver College of Medicine, University of Iowa, Iowa City, USA.
| | - Thomas Simonson
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, Institut Polytechnique de Paris, Palaiseau, France.
| |
|
35
|
Glasgow AA, Huang YM, Mandell DJ, Thompson M, Ritterson R, Loshbaugh AL, Pellegrino J, Krivacic C, Pache RA, Barlow KA, Ollikainen N, Jeon D, Kelly MJS, Fraser JS, Kortemme T. Computational design of a modular protein sense-response system. Science 2020; 366:1024-1028. [PMID: 31754004 DOI: 10.1126/science.aax8780] [Citation(s) in RCA: 65] [Impact Index Per Article: 16.3] [Received: 05/01/2019] [Accepted: 10/07/2019] [Indexed: 12/28/2022]
Abstract
Sensing and responding to signals is a fundamental ability of living systems, but despite substantial progress in the computational design of new protein structures, there is no general approach for engineering arbitrary new protein sensors. Here, we describe a generalizable computational strategy for designing sensor-actuator proteins by building binding sites de novo into heterodimeric protein-protein interfaces and coupling ligand sensing to modular actuation through split reporters. Using this approach, we designed protein sensors that respond to farnesyl pyrophosphate, a metabolic intermediate in the production of valuable compounds. The sensors are functional in vitro and in cells, and the crystal structure of the engineered binding site closely matches the design model. Our computational design strategy opens broad avenues to link biological outputs to new signals.
Affiliation(s)
- Anum A Glasgow
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
| | - Yao-Ming Huang
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
| | - Daniel J Mandell
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA; Bioinformatics Graduate Program, University of California San Francisco, San Francisco, CA, USA
| | - Michael Thompson
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
| | - Ryan Ritterson
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
| | - Amanda L Loshbaugh
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA; Biophysics Graduate Program, University of California San Francisco, San Francisco, CA, USA
| | - Jenna Pellegrino
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA; Biophysics Graduate Program, University of California San Francisco, San Francisco, CA, USA
| | - Cody Krivacic
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA; UC Berkeley-UCSF Graduate Program in Bioengineering, University of California San Francisco, San Francisco, CA, USA
| | - Roland A Pache
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
| | - Kyle A Barlow
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA; Bioinformatics Graduate Program, University of California San Francisco, San Francisco, CA, USA
| | - Noah Ollikainen
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA; Bioinformatics Graduate Program, University of California San Francisco, San Francisco, CA, USA
| | - Deborah Jeon
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
| | - Mark J S Kelly
- Department of Pharmaceutical Chemistry, University of California San Francisco, San Francisco, CA, USA
| | - James S Fraser
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA; Biophysics Graduate Program, University of California San Francisco, San Francisco, CA, USA; Quantitative Biosciences Institute, University of California San Francisco, San Francisco, CA, USA
| | - Tanja Kortemme
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA; Bioinformatics Graduate Program, University of California San Francisco, San Francisco, CA, USA; Biophysics Graduate Program, University of California San Francisco, San Francisco, CA, USA; UC Berkeley-UCSF Graduate Program in Bioengineering, University of California San Francisco, San Francisco, CA, USA; Quantitative Biosciences Institute, University of California San Francisco, San Francisco, CA, USA; Chan Zuckerberg Biohub, San Francisco, CA, USA
| |
|
36
|
Abstract
Proteins are molecular machines whose function depends on their ability to achieve complex folds with precisely defined structural and dynamic properties. The rational design of proteins from first principles, or de novo, was once considered to be impossible, but today proteins with a variety of folds and functions have been realized. We review the evolution of the field from its earliest days, placing particular emphasis on how this endeavor has illuminated our understanding of the principles underlying the folding and function of natural proteins, and is informing the design of macromolecules with unprecedented structures and properties. An initial set of milestones in de novo protein design focused on the construction of sequences that folded in water and membranes to adopt folded conformations. The first proteins were designed from first principles using very simple physical models. As computers became more powerful, the use of the rotamer approximation allowed one to discover amino acid sequences that stabilize the desired fold. As the crystallographic database of protein structures expanded in subsequent years, it became possible to construct proteins by assembling short backbone fragments that frequently recur in Nature. The second set of milestones in de novo design involves the discovery of complex functions. Proteins have been designed to bind a variety of metals, porphyrins, and other cofactors. The design of proteins that catalyze hydrolysis and oxygen-dependent reactions has progressed significantly. However, de novo design of catalysts for energetically demanding reactions, or even of proteins that bind with high affinity and specificity to highly functionalized complex polar molecules, remains an important challenge that is now being met. Finally, protein design has contributed significantly to our understanding of membrane protein folding and transport of ions across membranes. The design of membrane proteins, or more generally of biomimetic polymers that function in mixed or non-aqueous environments, is now becoming increasingly possible.
|
37
|
Reese HR, Shanahan CC, Proulx C, Menegatti S. Peptide science: A "rule model" for new generations of peptidomimetics. Acta Biomater 2020; 102:35-74. [PMID: 31698048 DOI: 10.1016/j.actbio.2019.10.045] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 08/17/2019] [Revised: 10/17/2019] [Accepted: 10/30/2019] [Indexed: 02/07/2023]
Abstract
Peptides have been heavily investigated for their biocompatible and bioactive properties. Though a wide array of functionalities can be introduced by varying the amino acid sequence or by structural constraints, properties such as proteolytic stability, catalytic activity, and phase behavior in solution are difficult or impossible to impart upon naturally occurring α-L-peptides. To this end, sequence-controlled peptidomimetics exhibit new folds, morphologies, and chemical modifications that create new structures and functions. The study of these new classes of polymers, especially α-peptoids, has been highly influenced by the analytical, computational, and design techniques developed for peptides. This review examines techniques to determine primary, secondary, and tertiary structure of peptides, and how they have been adapted to investigate peptoid structure. Computational models developed for peptides have been modified to predict the morphologies of peptoids and have increased in accuracy in recent years. The combination of in vitro and in silico techniques has led to secondary and tertiary structure design principles that mirror those for peptides. We then examine several important developments in peptoid applications inspired by peptides such as pharmaceuticals, catalysis, and protein-binding. A brief survey of alternative backbone structures and research investigating these peptidomimetics shows how the advancement of peptide and peptoid science has influenced the growth of numerous fields of study. As peptide, peptoid, and other peptidomimetic studies continue to advance, we expect to see higher throughput structural analyses, greater computational accuracy and functionality, and wider application space that can improve human health, solve environmental challenges, and meet industrial needs.
STATEMENT OF SIGNIFICANCE: Many historical, chemical, and functional relations draw a thread connecting peptides to their recent cognates, the "peptidomimetics". This review presents a comprehensive survey of this field by highlighting the breadth and relevance of these familial connections. In the first section, we examine the experimental and computational techniques originally developed for peptides and their morphing into a broader analytical and predictive toolbox. The second section presents an excursus of the structures and properties of prominent peptidomimetics, and how the expansion of the chemical and structural diversity has returned new exciting properties. The third section presents an overview of technological applications and new families of peptidomimetics. As the field grows, new compounds emerge with clear potential in medicine and advanced manufacturing.
|
38
|
Merritt HI, Sawyer N, Arora PS. Bent Into Shape: Folded Peptides to Mimic Protein Structure and Modulate Protein Function. Pept Sci (Hoboken) 2020; 112:e24145. [PMID: 33575525 PMCID: PMC7875438 DOI: 10.1002/pep2.24145] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 10/27/2019] [Accepted: 12/11/2019] [Indexed: 12/16/2022]
Abstract
Protein secondary and tertiary structure mimics have served as model systems to probe biophysical parameters that guide protein folding and as attractive reagents to modulate protein interactions. Here we review contemporary methods to reproduce loop, helix, sheet and coiled-coil conformations in short peptides.
Affiliation(s)
| | | | - Paramjit S. Arora
- Department of Chemistry, New York University, New York, New York 10003, United States
| |
|
39
|
Kuhlman B, Bradley P. Advances in protein structure prediction and design. Nat Rev Mol Cell Biol 2019; 20:681-697. [PMID: 31417196 PMCID: PMC7032036 DOI: 10.1038/s41580-019-0163-x] [Citation(s) in RCA: 364] [Impact Index Per Article: 72.8] [Accepted: 07/19/2019] [Indexed: 12/18/2022]
Abstract
The prediction of protein three-dimensional structure from amino acid sequence has been a grand challenge problem in computational biophysics for decades, owing to its intrinsic scientific interest and also to the many potential applications for robust protein structure prediction algorithms, from genome interpretation to protein function prediction. More recently, the inverse problem - designing an amino acid sequence that will fold into a specified three-dimensional structure - has attracted growing attention as a potential route to the rational engineering of proteins with functions useful in biotechnology and medicine. Methods for the prediction and design of protein structures have advanced dramatically in the past decade. Increases in computing power and the rapid growth in protein sequence and structure databases have fuelled the development of new data-intensive and computationally demanding approaches for structure prediction. New algorithms for designing protein folds and protein-protein interfaces have been used to engineer novel high-order assemblies and to design from scratch fluorescent proteins with novel or enhanced properties, as well as signalling proteins with therapeutic potential. In this Review, we describe current approaches for protein structure prediction and design and highlight a selection of the successful applications they have enabled.
Affiliation(s)
- Brian Kuhlman
- Department of Biochemistry and Biophysics, University of North Carolina, Chapel Hill, NC, USA.
- Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC, USA.
| | - Philip Bradley
- Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, WA, USA.
- Institute for Protein Design, University of Washington, Seattle, WA, USA.
| |
|
40
|
Towards functional de novo designed proteins. Curr Opin Chem Biol 2019; 52:102-111. [DOI: 10.1016/j.cbpa.2019.06.011] [Citation(s) in RCA: 45] [Impact Index Per Article: 9.0] [Received: 03/08/2019] [Revised: 04/25/2019] [Accepted: 06/06/2019] [Indexed: 12/31/2022]
|
41
|
Abstract
Online citizen science projects such as GalaxyZoo, Eyewire and Phylo have been very successful for data collection, annotation, and processing, but for the most part have harnessed human pattern recognition skills rather than human creativity. An exception is the game EteRNA, in which game players learn to build new RNA structures by exploring the discrete two-dimensional space of Watson-Crick base pairing possibilities. Building new proteins, however, is a more challenging task to present in a game, as both the representation and evaluation of a protein structure are intrinsically three-dimensional. We posed the challenge of de novo protein design in the online protein folding game Foldit. Players were presented with a fully extended peptide chain and challenged to craft a folded protein structure with an amino acid sequence encoding that structure. After many iterations of player design, analysis of the top scoring solutions, and subsequent game improvement, Foldit players can now, starting from an extended polypeptide chain, generate a diversity of protein structures and sequences which encode them in silico. 146 Foldit player designs with sequences unrelated to naturally occurring proteins were encoded in synthetic genes; 56 were found to be expressed in E. coli with good solubility and to adopt stable monomeric folded structures in solution. The diversity of these structures is unprecedented in de novo protein design, representing 20 different folds, including a new fold not observed in natural proteins. High resolution structures were determined for four of the designs, and are nearly identical to the player models. This work makes explicit the considerable implicit knowledge contributing to success in de novo protein design, and shows that citizen scientists can discover creative new solutions to outstanding scientific challenges, such as the protein design problem.
|
42
|
Lombardi A, Pirro F, Maglio O, Chino M, DeGrado WF. De Novo Design of Four-Helix Bundle Metalloproteins: One Scaffold, Diverse Reactivities. Acc Chem Res 2019; 52:1148-1159. [PMID: 30973707 DOI: 10.1021/acs.accounts.8b00674] [Citation(s) in RCA: 77] [Impact Index Per Article: 15.4] [Indexed: 12/24/2022]
Abstract
De novo protein design represents an attractive approach for testing and extending our understanding of metalloprotein structure and function. Here, we describe our work on the design of DF (Due Ferri or two-iron in Italian), a minimalist model for the active sites of much larger and more complex natural diiron and dimanganese proteins. In nature, diiron and dimanganese proteins prototypically bind their ions in 4-Glu, 2-His environments, and they catalyze diverse reactions, ranging from hydrolysis, to O2-dependent chemistry, to decarbonylation of aldehydes. In the design of DF, the position of each atom (including the backbone, the first-shell ligands, the second-shell hydrogen-bonded groups, and the well-packed hydrophobic core) was specified using precise mathematical equations and chemical principles. The first member of the DF family was designed to be of minimal size and complexity and yet to display the quintessential elements required for binding the dimetal cofactor. After thoroughly characterizing its structural, dynamic, spectroscopic, and functional properties, we added complexity in a rational stepwise manner to achieve increasingly sophisticated catalytic functions, ultimately demonstrating substrate-gated four-electron reduction of O2 to water. We also briefly describe the extension of these studies to the design of proteins that bind nonbiological metal cofactors (a synthetic porphyrin and a tetranuclear cluster), and a Zn2+/proton antiporting membrane protein. Together these studies demonstrate a successful and generally applicable strategy for de novo metalloprotein design, which might indeed mimic the process by which primordial metalloproteins evolved. We began the design process with a highly symmetrical backbone and binding site, by using point-group symmetry to assemble the secondary structures that position the amino acid side chains required for binding. The resulting models provided a rough starting point and initial parameters for the subsequent precise design of the final protein using modern methods of computational protein design. Unless the desired site is itself symmetrical, this process requires reduction of the symmetry or lifting it altogether. Nevertheless, the initial symmetrical structure can be helpful to restrain the search space during assembly of the backbone. Finally, the methods described here should be generally applicable to the design of highly stable and robust catalysts and sensors. There is considerable potential in combining the efficiency and knowledge base associated with homogeneous metal catalysis with the programmability, biocompatibility, and versatility of proteins. While the work reported here focuses on testing and learning the principles of natural metalloproteins by designing and studying proteins one at a time, there is also considerable potential for using designed proteins that incorporate both biological and nonbiological metal ion cofactors for the evolution of novel catalysts.
Affiliation(s)
- Angela Lombardi
- Department of Chemical Sciences, University of Napoli Federico II, Via Cintia, 26, 80126 Napoli, Italy
| | - Fabio Pirro
- Department of Chemical Sciences, University of Napoli Federico II, Via Cintia, 26, 80126 Napoli, Italy
- Department of Pharmaceutical Chemistry and the Cardiovascular Research Institute, University of California at San Francisco, San Francisco, California 94158-9001, United States
| | - Ornella Maglio
- Department of Chemical Sciences, University of Napoli Federico II, Via Cintia, 26, 80126 Napoli, Italy
- IBB, National Research Council, Via Mezzocannone 16, 80134 Napoli, Italy
| | - Marco Chino
- Department of Chemical Sciences, University of Napoli Federico II, Via Cintia, 26, 80126 Napoli, Italy
| | - William F. DeGrado
- Department of Pharmaceutical Chemistry and the Cardiovascular Research Institute, University of California at San Francisco, San Francisco, California 94158-9001, United States
| |
|
43
|
Networks of electrostatic and hydrophobic interactions modulate the complex folding free energy surface of a designed βα protein. Proc Natl Acad Sci U S A 2019; 116:6806-6811. [PMID: 30877249 DOI: 10.1073/pnas.1818744116] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Indexed: 12/24/2022] Open
Abstract
The successful de novo design of proteins can provide insights into the physical chemical basis of stability, the role of evolution in constraining amino acid sequences, and the production of customizable platforms for engineering applications. Previous guanidine hydrochloride (GdnHCl; an ionic denaturant) experiments on a designed, naturally occurring βα fold, Di-III_14, revealed a cooperative, two-state unfolding transition and a modest stability. Continuous-flow mixing experiments in our laboratory revealed a simple two-state reaction in the microsecond to millisecond time range, consistent with the thermodynamic results. In striking contrast, the protein remains folded up to 9.25 M in urea, a neutral denaturant, and hydrogen exchange (HDX) NMR analysis in water revealed the presence of numerous high-energy states that interconvert on a time scale greater than seconds. The complex protection pattern for HDX corresponds closely with a pair of electrostatic networks on the surface and an extensive network of hydrophobic side chains in the interior of the protein. Mutational analysis showed that electrostatic and hydrophobic networks contribute to the resistance to urea denaturation for the WT protein; remarkably, single charge reversals on the protein surface restore the expected urea sensitivity. The roughness of the energy surface reflects the densely packed hydrophobic core; the removal of only two methyl groups eliminates the high-energy states and creates a smooth surface. The design of a very stable βα fold containing electrostatic and hydrophobic networks has created a complex energy surface rarely observed in natural proteins.
|
44
|
Pei J, Zheng Z, Merz KM. Random Forest Refinement of the KECSA2 Knowledge-Based Scoring Function for Protein Decoy Detection. J Chem Inf Model 2019; 59:1919-1929. [DOI: 10.1021/acs.jcim.8b00734] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 11/29/2022]
Affiliation(s)
- Jun Pei
- Department of Chemistry, Michigan State University, 578 S. Shaw Lane, East Lansing, Michigan 48824, United States
| | - Zheng Zheng
- Department of Chemistry, Michigan State University, 578 S. Shaw Lane, East Lansing, Michigan 48824, United States
| | - Kenneth M. Merz
- Department of Chemistry, Michigan State University, 578 S. Shaw Lane, East Lansing, Michigan 48824, United States
- Institute for Cyber Enabled Research, Michigan State University, 567 Wilson Road, East Lansing, Michigan 48824, United States
| |
|
45
|
Kreitler DF, Yao Z, Steinkruger JD, Mortenson DE, Huang L, Mittal R, Travis BR, Forest KT, Gellman SH. A Hendecad Motif Is Preferred for Heterochiral Coiled-Coil Formation. J Am Chem Soc 2019; 141:1583-1592. [PMID: 30645104 DOI: 10.1021/jacs.8b11246] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Indexed: 11/29/2022]
Abstract
The structural principles that govern interactions between L- and D-peptides are not well understood. Among natural proteins, coiled-coil assemblies formed between or among α-helices are the most regular feature of tertiary and quaternary structures. We recently reported the first high-resolution structures for heterochiral coiled-coil dimers, which represent a starting point for understanding associations of L- and D-polypeptides. These structures were an unexpected outcome from crystallization of a racemic peptide corresponding to the transmembrane domain of the influenza A M2 protein (M2-TM). The reported structures raised the possibility that heterochiral coiled-coil dimers prefer an 11-residue (hendecad) sequence repeat, in contrast to the 7-residue (heptad) sequence repeat that is dominant among natural coiled coils. To gain insight into sequence repeat preferences of heterochiral coiled coils, we have examined three M2-TM variants containing substitutions intended to minimize steric clashes between side chains at the coiled-coil interface. In each of the three new crystal structures, we observed heterochiral coiled-coil associations that closely match a hendecad sequence motif, which strengthens the conclusion that this motif is intrinsic to the pairing of α-helices with opposite handedness. In each case, the presence of a hendecad motif was established by comparing the observed helical frequency to that of an ideal hendecad. This comparison revealed that decreasing the size of the amino acid side chain at positions that project toward the superhelical axis produces tighter packing, as determined by the size of the coiled-coil radius. These results provide a basis for future design of heterochiral coiled-coil pairings.
Affiliation(s)
- Dale F Kreitler
- Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
| | - Zhihui Yao
- Graduate Program in Biophysics, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
| | - Jay D Steinkruger
- School of Natural Sciences, University of Central Missouri, Warrensburg, Missouri 64093, United States
| | - David E Mortenson
- Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
| | - Lijun Huang
- Anatrace, Maumee, Ohio 43537, United States
| | | | | | - Katrina T Forest
- Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Graduate Program in Biophysics, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Department of Bacteriology, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
| | - Samuel H Gellman
- Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Graduate Program in Biophysics, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
| |
|
46
|
Simoncini D, Zhang KYJ, Schiex T, Barbe S. A structural homology approach for computational protein design with flexible backbone. Bioinformatics 2018; 35:2418-2426. [DOI: 10.1093/bioinformatics/bty975] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 04/24/2018] [Revised: 11/01/2018] [Accepted: 11/28/2018] [Indexed: 01/09/2023] Open
Abstract
Motivation
Structure-based computational protein design (CPD) plays a critical role in advancing the field of protein engineering. Using an all-atom energy function, CPD tries to identify amino acid sequences that fold into a target structure and ultimately perform a desired function. Energy functions, however, remain imperfect, and injecting relevant information from known structures into the design process should lead to improved designs.
Results
We introduce Shades, a data-driven CPD method that exploits local structural environments in known protein structures together with energy to guide sequence design, while sampling side-chain and backbone conformations to accommodate mutations. Shades (Structural Homology Algorithm for protein DESign) is based on customized libraries of non-contiguous in-contact amino acid residue motifs. We have tested Shades on a public benchmark of 40 proteins selected from different protein families. When excluding homologous proteins, Shades achieved a protein sequence recovery of 30% and a protein sequence similarity of 46% on average, compared with the PFAM protein family of the target protein. When homologous structures were added, the wild-type sequence recovery rate reached 93%.
Availability and implementation
Shades source code is available at https://bitbucket.org/satsumaimo/shades as a patch for Rosetta 3.8 with a curated protein structure database and ITEM library creation software.
Supplementary information
Supplementary data are available at Bioinformatics online.
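As an illustration of the sequence recovery figures quoted in the Results above, the following Python sketch computes the fraction of positions at which a designed sequence matches a reference sequence of equal length. The function name and the example sequences are hypothetical, and the abstract does not state exactly how Shades compares designs against the PFAM family of the target, so this shows only the common identity-based reading of the term.

```python
def sequence_recovery(designed: str, reference: str) -> float:
    """Fraction of aligned positions where the designed residue equals the reference.

    Assumes both sequences are already aligned and of equal, non-zero length
    (an assumption of this sketch, not something stated in the abstract).
    """
    if len(designed) != len(reference):
        raise ValueError("sequences must have the same length")
    if not reference:
        raise ValueError("sequences must not be empty")
    matches = sum(d == r for d, r in zip(designed, reference))
    return matches / len(reference)


# Hypothetical four-residue example: three of four positions agree.
print(sequence_recovery("ACDE", "ACDF"))  # 0.75
```

A similarity score, as opposed to identity, would additionally need a substitution matrix or residue grouping, which the abstract does not specify.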
Affiliation(s)
- David Simoncini
- Laboratoire d'Ingénierie des Systèmes Biologiques et des Procédés, LISBP, Université de Toulouse, CNRS, INRA, INSA, F Toulouse cedex 04, France
- Institut de recherche en informatique de Toulouse, IRIT, UMR 5505-CNRS, Université de Toulouse, Cedex 9, France
| | - Kam Y J Zhang
- Laboratory for Structural Bioinformatics, Center for Biosystems Dynamics Research, RIKEN, Yokohama, Kanagawa, Japan
| | - Thomas Schiex
- Institut de recherche en informatique de Toulouse, UMR 5505-CNRS, Université de Toulouse, Cedex 9, France
| | - Sophie Barbe
- Laboratoire d'Ingénierie des Systèmes Biologiques et des Procédés, LISBP, Université de Toulouse, CNRS, INRA, INSA, F Toulouse cedex 04, France
| |
|
47
|
Abstract
Motivation
Multistate protein design addresses real-world challenges, such as multi-specificity design and backbone flexibility, by considering both positive and negative protein states with an ensemble of substates for each. It also presents an enormous challenge to exact algorithms that guarantee the optimal solutions and enable a direct test of mechanistic hypotheses behind models. However, efficient exact algorithms are lacking for multistate protein design.
Results
We have developed an efficient exact algorithm called interconnected cost function networks (iCFN) for multistate protein design. Its generic formulation allows for a wide array of applications such as stability, affinity and specificity designs while addressing concerns such as global flexibility of protein backbones. iCFN treats each substate design as a weighted constraint satisfaction problem (WCSP) modeled through a CFN; and it solves the coupled WCSPs using novel bounds and a depth-first branch-and-bound search over a tree structure of sequences, substates, and conformations. When iCFN is applied to specificity design of a T-cell receptor, a problem of unprecedented size to exact methods, it drastically reduces search space and running time to make the problem tractable. Moreover, iCFN generates experimentally-agreeing receptor designs with improved accuracy compared with state-of-the-art methods, highlights the importance of modeling backbone flexibility in protein design, and reveals molecular mechanisms underlying binding specificity.
Availability and implementation
https://shen-lab.github.io/software/iCFN.
Supplementary information
Supplementary data are available at Bioinformatics online.
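To make the terms in this abstract concrete, here is a minimal Python sketch of a generic depth-first branch-and-bound search over a single, tiny cost function network (one WCSP with unary and pairwise costs). It is not the iCFN algorithm: the coupling of substates and the "novel bounds" mentioned above are not described in the abstract and are not reproduced here. The function name, the data layout, and the simple lower bound (sum of the cheapest unary costs of unassigned variables, valid only when all costs are non-negative) are assumptions made for illustration.

```python
import math
from typing import Dict, List, Tuple

Unary = List[List[float]]  # unary[i][a]: cost of giving variable i the value a
Binary = Dict[Tuple[int, int], List[List[float]]]  # binary[(i, j)][a][b], with i < j


def branch_and_bound(unary: Unary, binary: Binary) -> Tuple[float, List[int]]:
    """Return (minimum total cost, optimal assignment) for non-negative costs."""
    n = len(unary)
    # Cheapest unary cost per variable, used as an admissible lower bound.
    min_unary = [min(costs) for costs in unary]
    best_cost = math.inf
    best_assignment: List[int] = []

    def extend(i: int, assignment: List[int], cost: float) -> None:
        nonlocal best_cost, best_assignment
        if i == n:  # every variable assigned: record an improved solution
            if cost < best_cost:
                best_cost, best_assignment = cost, assignment.copy()
            return
        remaining = sum(min_unary[i + 1:])  # optimistic cost of the rest
        for a, u in enumerate(unary[i]):
            new_cost = cost + u
            # Add pairwise costs against every earlier variable j < i.
            for j, b in enumerate(assignment):
                table = binary.get((j, i))
                if table is not None:
                    new_cost += table[b][a]
            if new_cost + remaining >= best_cost:
                continue  # prune: this branch cannot beat the incumbent
            assignment.append(a)
            extend(i + 1, assignment, new_cost)
            assignment.pop()

    extend(0, [], 0.0)
    return best_cost, best_assignment


# Hypothetical two-variable network with two values each.
unary_costs = [[1.0, 0.0], [0.0, 2.0]]
pair_costs = {(0, 1): [[0.0, 3.0], [4.0, 0.0]]}
print(branch_and_bound(unary_costs, pair_costs))  # (1.0, [0, 0])
```

In a protein design setting the variables would correspond to residue positions and the values to amino acid or rotamer choices; tighter bounds than the one used here prune far more of the tree.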
Affiliation(s)
- Mostafa Karimi
- Department of Electrical and Computer Engineering and TEES-AgriLife Center for Bioinformatics and Genomic Systems Engineering, Texas A&M University, College Station, USA
| | - Yang Shen
- Department of Electrical and Computer Engineering and TEES-AgriLife Center for Bioinformatics and Genomic Systems Engineering, Texas A&M University, College Station, USA
| |
|
48
|
Thomas F, Dawson WM, Lang EJM, Burton AJ, Bartlett GJ, Rhys GG, Mulholland AJ, Woolfson DN. De Novo-Designed α-Helical Barrels as Receptors for Small Molecules. ACS Synth Biol 2018; 7:1808-1816. [PMID: 29944338 DOI: 10.1021/acssynbio.8b00225] [Citation(s) in RCA: 45] [Impact Index Per Article: 7.5] [Indexed: 12/14/2022]
Abstract
We describe de novo-designed α-helical barrels (αHBs) that bind and discriminate between lipophilic biologically active molecules. αHBs have five or more α-helices arranged around central hydrophobic channels the diameters of which scale with oligomer state. We show that pentameric, hexameric, and heptameric αHBs bind the environmentally sensitive dye 1,6-diphenylhexatriene (DPH) in the micromolar range and fluoresce. Displacement of the dye is used to report the binding of nonfluorescent molecules: palmitic acid and retinol bind to all three αHBs with submicromolar inhibitor constants; farnesol binds the hexamer and heptamer; but β-carotene binds only the heptamer. A co-crystal structure of the hexamer with farnesol reveals oriented binding in the center of the hydrophobic channel. Charged side chains engineered into the lumen of the heptamer facilitate binding of polar ligands: a glutamate variant binds a cationic variant of DPH, and introducing lysine allows binding of the biosynthetically important farnesol diphosphate.
Affiliation(s)
- Franziska Thomas
- School of Chemistry, University of Bristol, Cantock’s Close, Bristol BS8 1TS, U.K
- Institute of Organic and Biomolecular Chemistry, Georg-August-Universität Göttingen, Tammannstrasse 2, 37077 Göttingen, Germany
| | - William M. Dawson
- School of Chemistry, University of Bristol, Cantock’s Close, Bristol BS8 1TS, U.K
| | - Eric J. M. Lang
- School of Chemistry, University of Bristol, Cantock’s Close, Bristol BS8 1TS, U.K
- BrisSynBio, University of Bristol, Life Sciences Building, Tyndall Avenue, Bristol BS8 1TQ, U.K
| | - Antony J. Burton
- School of Chemistry, University of Bristol, Cantock’s Close, Bristol BS8 1TS, U.K
- Frick Chemistry Laboratory, Princeton, New Jersey 084544, United States
| | - Gail J. Bartlett
- School of Chemistry, University of Bristol, Cantock’s Close, Bristol BS8 1TS, U.K
| | - Guto G. Rhys
- School of Chemistry, University of Bristol, Cantock’s Close, Bristol BS8 1TS, U.K
| | - Adrian J. Mulholland
- School of Chemistry, University of Bristol, Cantock’s Close, Bristol BS8 1TS, U.K
- BrisSynBio, University of Bristol, Life Sciences Building, Tyndall Avenue, Bristol BS8 1TQ, U.K
- Centre for Computational Chemistry, School of Chemistry, University of Bristol, Cantock’s Close, Bristol BS8 1TS, U.K
| | - Derek N. Woolfson
- School of Chemistry, University of Bristol, Cantock’s Close, Bristol BS8 1TS, U.K
- BrisSynBio, University of Bristol, Life Sciences Building, Tyndall Avenue, Bristol BS8 1TQ, U.K
- School of Biochemistry, University of Bristol, Biomedical Sciences Building, University Walk, Bristol BS8 1TD, U.K
| |
|
49
|
Marcos E, Silva D. Essentials of de novo protein design: Methods and applications. WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL MOLECULAR SCIENCE 2018. [DOI: 10.1002/wcms.1374] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Indexed: 12/28/2022]
Affiliation(s)
- Enrique Marcos
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain
| | - Daniel‐Adriano Silva
- Department of Biochemistry, University of Washington, Seattle, Washington
- Institute for Protein Design, University of Washington, Seattle, Washington
| |
|
50
|
Yamagami M, Sawada T, Fujita M. Synthetic β-Barrel by Metal-Induced Folding and Assembly. J Am Chem Soc 2018; 140:8644-8647. [DOI: 10.1021/jacs.8b04284] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Indexed: 12/20/2022]
Affiliation(s)
- Motoya Yamagami
- Department of Applied Chemistry, School of Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
| | - Tomohisa Sawada
- Department of Applied Chemistry, School of Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
| | - Makoto Fujita
- Department of Applied Chemistry, School of Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
| |
|