1
|
McPartlon M, Xu J. An end-to-end deep learning method for protein side-chain packing and inverse folding. Proc Natl Acad Sci U S A 2023; 120:e2216438120. [PMID: 37253017 PMCID: PMC10266014 DOI: 10.1073/pnas.2216438120] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/28/2022] [Accepted: 04/24/2023] [Indexed: 06/01/2023] Open
Abstract
Protein side-chain packing (PSCP), the task of determining amino acid side-chain conformations given only backbone atom positions, has important applications to protein structure prediction, refinement, and design. Many methods have been proposed to tackle this problem, but their speed or accuracy is still unsatisfactory. To address this, we present AttnPacker, a deep learning (DL) method for directly predicting protein side-chain coordinates. Unlike existing methods, AttnPacker directly incorporates backbone 3D geometry to simultaneously compute all side-chain coordinates without delegating to a discrete rotamer library or performing expensive conformational search and sampling steps. This enables a significant increase in computational efficiency, decreasing inference time by over 100× compared to the DL-based method DLPacker and physics-based RosettaPacker. Tested on the CASP13 and CASP14 native and nonnative protein backbones, AttnPacker computes physically realistic side-chain conformations, reducing steric clashes and improving both rmsd and dihedral accuracy compared to state-of-the-art methods SCWRL4, FASPR, RosettaPacker, and DLPacker. Different from traditional PSCP approaches, AttnPacker can also codesign sequences and side chains, producing designs with subnative Rosetta energy and high in silico consistency.
Collapse
Affiliation(s)
- Matthew McPartlon
- Department of Computer Science, Physical Sciences, The University of Chicago, Chicago, IL60637
| | - Jinbo Xu
- Toyota Technical Institute of Chicago, Chicago, IL60637
- MoleculeMind Inc., Beijing100086, China
| |
Collapse
|
2
|
Beuvin F, de Givry S, Schiex T, Verel S, Simoncini D. Iterated local search with partition crossover for computational protein design. Proteins 2021; 89:1522-1529. [PMID: 34228826 DOI: 10.1002/prot.26174] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/29/2020] [Accepted: 05/25/2021] [Indexed: 11/06/2022]
Abstract
Structure-based computational protein design (CPD) refers to the problem of finding a sequence of amino acids which folds into a specific desired protein structure, and possibly fulfills some targeted biochemical properties. Recent studies point out the particularly rugged CPD energy landscape, suggesting that local search optimization methods should be designed and tuned to easily escape local minima attraction basins. In this article, we analyze the performance and search dynamics of an iterated local search (ILS) algorithm enhanced with partition crossover. Our algorithm, PILS, quickly finds local minima and escapes their basins of attraction by solution perturbation. Additionally, the partition crossover operator exploits the structure of the residue interaction graph in order to efficiently mix solutions and find new unexplored basins. Our results on a benchmark of 30 proteins of various topology and size show that PILS consistently finds lower energy solutions compared to Rosetta fixbb and a classic ILS, and that the corresponding sequences are mostly closer to the native.
Collapse
Affiliation(s)
- François Beuvin
- IRIT UMR 5505-CNRS, Université de Toulouse I Capitole, Toulouse, France.,Artificial and Natural Intelligence Toulouse Institute, ANITI, Toulouse, France
| | - Simon de Givry
- Artificial and Natural Intelligence Toulouse Institute, ANITI, Toulouse, France.,MIAT, Université de Toulouse, INRAE, UR 875, Toulouse, France
| | - Thomas Schiex
- Artificial and Natural Intelligence Toulouse Institute, ANITI, Toulouse, France.,MIAT, Université de Toulouse, INRAE, UR 875, Toulouse, France
| | | | - David Simoncini
- IRIT UMR 5505-CNRS, Université de Toulouse I Capitole, Toulouse, France.,Artificial and Natural Intelligence Toulouse Institute, ANITI, Toulouse, France
| |
Collapse
|
3
|
Bouchiba Y, Cortés J, Schiex T, Barbe S. Molecular flexibility in computational protein design: an algorithmic perspective. Protein Eng Des Sel 2021; 34:6271252. [PMID: 33959778 DOI: 10.1093/protein/gzab011] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/21/2020] [Revised: 03/12/2021] [Accepted: 03/29/2021] [Indexed: 12/19/2022] Open
Abstract
Computational protein design (CPD) is a powerful technique for engineering new proteins, with both great fundamental implications and diverse practical interests. However, the approximations usually made for computational efficiency, using a single fixed backbone and a discrete set of side chain rotamers, tend to produce rigid and hyper-stable folds that may lack functionality. These approximations contrast with the demonstrated importance of molecular flexibility and motions in a wide range of protein functions. The integration of backbone flexibility and multiple conformational states in CPD, in order to relieve the inaccuracies resulting from these simplifications and to improve design reliability, are attracting increased attention. However, the greatly increased search space that needs to be explored in these extensions defines extremely challenging computational problems. In this review, we outline the principles of CPD and discuss recent effort in algorithmic developments for incorporating molecular flexibility in the design process.
Collapse
Affiliation(s)
- Younes Bouchiba
- Toulouse Biotechnology Institute, TBI, CNRS, INRAE, INSA, ANITI, Toulouse 31400, France.,Laboratoire d'Analyse et d'Architecture des Systèmes, LAAS CNRS, Université de Toulouse, CNRS, Toulouse 31400, France
| | - Juan Cortés
- Laboratoire d'Analyse et d'Architecture des Systèmes, LAAS CNRS, Université de Toulouse, CNRS, Toulouse 31400, France
| | - Thomas Schiex
- Université de Toulouse, ANITI, INRAE, UR MIAT, F-31320, Castanet-Tolosan, France
| | - Sophie Barbe
- Toulouse Biotechnology Institute, TBI, CNRS, INRAE, INSA, ANITI, Toulouse 31400, France
| |
Collapse
|
4
|
Mignon D, Druart K, Michael E, Opuu V, Polydorides S, Villa F, Gaillard T, Panel N, Archontis G, Simonson T. Physics-Based Computational Protein Design: An Update. J Phys Chem A 2020; 124:10637-10648. [DOI: 10.1021/acs.jpca.0c07605] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/06/2023]
Affiliation(s)
- David Mignon
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, 91128 Palaiseau, France
| | - Karen Druart
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, 91128 Palaiseau, France
| | - Eleni Michael
- Department of Physics, University of Cyprus, PO20537, CY1678 Nicosia, Cyprus
| | - Vaitea Opuu
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, 91128 Palaiseau, France
| | - Savvas Polydorides
- Department of Physics, University of Cyprus, PO20537, CY1678 Nicosia, Cyprus
| | - Francesco Villa
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, 91128 Palaiseau, France
| | - Thomas Gaillard
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, 91128 Palaiseau, France
| | - Nicolas Panel
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, 91128 Palaiseau, France
| | - Georgios Archontis
- Department of Physics, University of Cyprus, PO20537, CY1678 Nicosia, Cyprus
| | - Thomas Simonson
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, 91128 Palaiseau, France
| |
Collapse
|
5
|
Surpeta B, Sequeiros-Borja CE, Brezovsky J. Dynamics, a Powerful Component of Current and Future in Silico Approaches for Protein Design and Engineering. Int J Mol Sci 2020; 21:E2713. [PMID: 32295283 PMCID: PMC7215530 DOI: 10.3390/ijms21082713] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/18/2020] [Revised: 04/10/2020] [Accepted: 04/12/2020] [Indexed: 12/13/2022] Open
Abstract
Computational prediction has become an indispensable aid in the processes of engineering and designing proteins for various biotechnological applications. With the tremendous progress in more powerful computer hardware and more efficient algorithms, some of in silico tools and methods have started to apply the more realistic description of proteins as their conformational ensembles, making protein dynamics an integral part of their prediction workflows. To help protein engineers to harness benefits of considering dynamics in their designs, we surveyed new tools developed for analyses of conformational ensembles in order to select engineering hotspots and design mutations. Next, we discussed the collective evolution towards more flexible protein design methods, including ensemble-based approaches, knowledge-assisted methods, and provable algorithms. Finally, we highlighted apparent challenges that current approaches are facing and provided our perspectives on their further development.
Collapse
Affiliation(s)
- Bartłomiej Surpeta
- Laboratory of Biomolecular Interactions and Transport, Department of Gene Expression, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University, Uniwersytetu Poznanskiego 6, 61-614 Poznan, Poland; (B.S.); (C.E.S.-B.)
- International Institute of Molecular and Cell Biology in Warsaw, Ks Trojdena 4, 02-109 Warsaw, Poland
| | - Carlos Eduardo Sequeiros-Borja
- Laboratory of Biomolecular Interactions and Transport, Department of Gene Expression, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University, Uniwersytetu Poznanskiego 6, 61-614 Poznan, Poland; (B.S.); (C.E.S.-B.)
- International Institute of Molecular and Cell Biology in Warsaw, Ks Trojdena 4, 02-109 Warsaw, Poland
| | - Jan Brezovsky
- Laboratory of Biomolecular Interactions and Transport, Department of Gene Expression, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University, Uniwersytetu Poznanskiego 6, 61-614 Poznan, Poland; (B.S.); (C.E.S.-B.)
- International Institute of Molecular and Cell Biology in Warsaw, Ks Trojdena 4, 02-109 Warsaw, Poland
| |
Collapse
|