1
Liang S, Zhang C, Zhu M. Ab Initio Prediction of 3-D Conformations for Protein Long Loops with High Accuracy and Applications to Antibody CDRH3 Modeling. J Chem Inf Model 2023; 63:7568-7577. [PMID: 38018130 DOI: 10.1021/acs.jcim.3c01051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2023]
Abstract
Residue-level potentials of mean force have been widely used for protein backbone refinement to avoid simultaneous sampling of side-chain conformations. The interaction energy between the reduced side chains and backbone atoms was not considered explicitly. In this study, we developed novel methods to calculate the residue-atom interaction energy in combination with atomic and residue-level terms. The parameters were optimized step by step to remove overcounting or overlap between different energy terms. The mixed energy functions were then used to evaluate the generated backbone conformations at the initial sampling stage of protein loop modeling (OSCAR-loop), including the interaction energy between the reduced loop residues and full atoms of the protein framework. The accuracies of top-ranked decoys were 1.18 and 2.81 Å for 8-residue and 12-residue loops, respectively. We then selected diverse decoys for side-chain modeling, backbone refinement, and energy minimization. The procedure was repeated multiple times to select the one prediction with the lowest energy. Consequently, we obtained an accuracy of 0.74 Å for a widely used test set of 12-residue loops, compared with >1.4 Å reported by other researchers. OSCAR-loop was also effective for modeling the H3 loops of antibody complementarity-determining regions (CDRs) in the crystal environment. The prediction accuracy of OSCAR-loop (1.74 Å) was better than that of the Rosetta NGK method (3.11 Å) or those achieved by deep learning methods (>2.2 Å) for the CDRH3 loops of 49 targets in the Rosetta antibody benchmark. The performance of OSCAR-loop in a model environment is also discussed.
Affiliation(s)
- Shide Liang
  - Department of Computational Biology, 20n Bio Limited, Hangzhou 310018, P. R. China
  - Department of Research and Development, Bio-Thera Solutions, Guangzhou 510530, P. R. China
- Chi Zhang
  - School of Biological Sciences, University of Nebraska, Lincoln, Nebraska 68588, United States
- Mingfu Zhu
  - Department of Computational Biology, 20n Bio Limited, Hangzhou 310018, P. R. China

2
Maksymenko K, Maurer A, Aghaallaei N, Barry C, Borbarán-Bravo N, Ullrich T, Dijkstra TM, Hernandez Alvarez B, Müller P, Lupas AN, Skokowa J, ElGamacy M. The design of functional proteins using tensorized energy calculations. Cell Rep Methods 2023; 3:100560. [PMID: 37671023 PMCID: PMC10475850 DOI: 10.1016/j.crmeth.2023.100560] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2023] [Revised: 05/25/2023] [Accepted: 07/21/2023] [Indexed: 09/07/2023]
Abstract
In protein design, the energy associated with a huge number of sequence-conformer perturbations has to be routinely estimated. Hence, enhancing the throughput and accuracy of these energy calculations can profoundly improve design success rates and enable tackling more complex design problems. In this work, we explore the possibility of tensorizing the energy calculations and apply them in a protein design framework. We use this framework to design enhanced proteins with anti-cancer and radio-tracing functions. Particularly, we designed multispecific binders against ligands of the epidermal growth factor receptor (EGFR), where the tested design could inhibit EGFR activity in vitro and in vivo. We also used this method to design high-affinity Cu2+ binders that were stable in serum and could be readily loaded with copper-64 radionuclide. The resulting molecules show superior functional properties for their respective applications and demonstrate the generalizable potential of the described protein design approach.
Affiliation(s)
- Kateryna Maksymenko
  - Department of Protein Evolution, Max Planck Institute for Biology, 72076 Tübingen, Germany
  - Friedrich Miescher Laboratory of the Max Planck Society, 72076 Tübingen, Germany
- Andreas Maurer
  - Werner Siemens Imaging Center, Department of Preclinical Imaging and Radiopharmacy, Eberhard Karls University, 72076 Tübingen, Germany
  - Cluster of Excellence iFIT (EXC 2180) “Image Guided and Functionally Instructed Tumor Therapies,” Eberhard Karls University, 72076 Tübingen, Germany
- Narges Aghaallaei
  - Division of Translational Oncology, University Hospital Tübingen, 72076 Tübingen, Germany
- Caroline Barry
  - Department of Protein Evolution, Max Planck Institute for Biology, 72076 Tübingen, Germany
  - Krieger School of Arts and Sciences, Johns Hopkins University, Washington, DC 20036, USA
- Natalia Borbarán-Bravo
  - Division of Translational Oncology, University Hospital Tübingen, 72076 Tübingen, Germany
- Timo Ullrich
  - Department of Protein Evolution, Max Planck Institute for Biology, 72076 Tübingen, Germany
  - Friedrich Miescher Laboratory of the Max Planck Society, 72076 Tübingen, Germany
- Tjeerd M.H. Dijkstra
  - Department of Protein Evolution, Max Planck Institute for Biology, 72076 Tübingen, Germany
  - Department for Women’s Health, University Hospital Tübingen, 72076 Tübingen, Germany
  - Translational Bioinformatics, University Hospital Tübingen, 72072 Tübingen, Germany
- Patrick Müller
  - Friedrich Miescher Laboratory of the Max Planck Society, 72076 Tübingen, Germany
- Andrei N. Lupas
  - Department of Protein Evolution, Max Planck Institute for Biology, 72076 Tübingen, Germany
- Julia Skokowa
  - Division of Translational Oncology, University Hospital Tübingen, 72076 Tübingen, Germany
- Mohammad ElGamacy
  - Department of Protein Evolution, Max Planck Institute for Biology, 72076 Tübingen, Germany
  - Friedrich Miescher Laboratory of the Max Planck Society, 72076 Tübingen, Germany
  - Division of Translational Oncology, University Hospital Tübingen, 72076 Tübingen, Germany

3
Anderson DM, Jayanthi LP, Gosavi S, Meiering EM. Engineering the kinetic stability of a β-trefoil protein by tuning its topological complexity. Front Mol Biosci 2023; 10:1021733. [PMID: 36845544 PMCID: PMC9945329 DOI: 10.3389/fmolb.2023.1021733] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2022] [Accepted: 01/02/2023] [Indexed: 02/11/2023] Open
Abstract
Kinetic stability, defined as the rate of protein unfolding, is central to determining the functional lifetime of proteins, both in nature and in wide-ranging medical and biotechnological applications. Further, high kinetic stability is generally correlated with high resistance against chemical and thermal denaturation, as well as proteolytic degradation. Despite its significance, specific mechanisms governing kinetic stability remain largely unknown, and few studies address the rational design of kinetic stability. Here, we describe a method for designing protein kinetic stability that uses protein long-range order, absolute contact order, and simulated free energy barriers of unfolding to quantitatively analyze and predict unfolding kinetics. We analyze two β-trefoil proteins: hisactophilin, a quasi-three-fold symmetric natural protein with moderate stability, and ThreeFoil, a designed three-fold symmetric protein with extremely high kinetic stability. The quantitative analysis identifies marked differences in long-range interactions across the protein hydrophobic cores that partially account for the differences in kinetic stability. Swapping the core interactions of ThreeFoil into hisactophilin increases kinetic stability with close agreement between predicted and experimentally measured unfolding rates. These results demonstrate the predictive power of readily applied measures of protein topology for altering kinetic stability and recommend core engineering as a tractable target for rationally designing kinetic stability that may be widely applicable.
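The topology measures named in this abstract can be illustrated with a minimal sketch. It assumes the commonly used definitions: absolute contact order as the mean sequence separation over all contacts, and long-range order as the number of contacts separated by more than a cutoff (12 residues here) divided by chain length. The contact list, function names, and cutoff are illustrative assumptions and are not taken from the paper.

```python
from typing import List, Tuple

Contact = Tuple[int, int]  # residue indices (i, j) of a contacting pair


def absolute_contact_order(contacts: List[Contact]) -> float:
    """Mean sequence separation |i - j| over all contacts."""
    seps = [abs(i - j) for i, j in contacts]
    if not seps:
        raise ValueError("no contacts given")
    return sum(seps) / len(seps)


def long_range_order(contacts: List[Contact], n_residues: int, min_sep: int = 12) -> float:
    """Number of contacts with |i - j| > min_sep, normalised by chain length."""
    n_long = sum(1 for i, j in contacts if abs(i - j) > min_sep)
    return n_long / n_residues


# Toy 40-residue chain with four contacts (hypothetical data).
toy_contacts = [(1, 5), (2, 30), (10, 25), (12, 14)]
aco = absolute_contact_order(toy_contacts)        # (4 + 28 + 15 + 2) / 4
lro = long_range_order(toy_contacts, n_residues=40)  # 2 long-range contacts / 40
```

Both quantities depend only on a contact map, which is why they are cheap to evaluate for candidate core swaps before any simulation is run.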
Affiliation(s)
- Lakshmi P. Jayanthi
  - Simons Centre for the Study of Living Machines, National Centre for Biological Sciences, Tata Institute of Fundamental Research, Bangalore, India
- Shachi Gosavi
  - Simons Centre for the Study of Living Machines, National Centre for Biological Sciences, Tata Institute of Fundamental Research, Bangalore, India
- Elizabeth M. Meiering
  - Department of Chemistry, University of Waterloo, Waterloo, ON, Canada (*Correspondence: Elizabeth M. Meiering)

4
Ochoa R, Lunardelli VAS, Rosa DS, Laio A, Cossio P. Multiple-Allele MHC Class II Epitope Engineering by a Molecular Dynamics-Based Evolution Protocol. Front Immunol 2022; 13:862851. [PMID: 35572587 PMCID: PMC9094701 DOI: 10.3389/fimmu.2022.862851] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 01/26/2022] [Accepted: 03/28/2022] [Indexed: 11/13/2022] Open
Abstract
Epitopes that bind simultaneously to all human alleles of Major Histocompatibility Complex class II (MHC II) are considered one of the key factors for the development of improved vaccines and cancer immunotherapies. To engineer MHC II multiple-allele binders, we developed a protocol called PanMHC-PARCE, based on the unsupervised optimization of the epitope sequence by single-point mutations, parallel explicit-solvent molecular dynamics simulations, and scoring of the MHC II-epitope complexes. The key idea is to accept mutations that not only improve the affinity but also reduce the affinity gap between the alleles. We applied this methodology to enhance a Plasmodium vivax epitope for multiple-allele binding. In vitro rate-binding assays showed that four engineered peptides were able to bind with improved affinity toward multiple human MHC II alleles. Moreover, we demonstrated that mice immunized with the peptides exhibited an interferon-gamma cellular immune response. Overall, the method enables the engineering of peptides with improved binding properties that can be used for the generation of new immunotherapies.
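The acceptance idea described in this abstract can be sketched as a simple rule. The version below is a deterministic simplification under stated assumptions: each allele has one score, lower scores mean stronger binding, "affinity" is summarised by the mean score, and the "gap" by the spread between the best and worst allele. The function name and these conventions are hypothetical; the published protocol may combine the criteria differently or accept mutations stochastically.

```python
from statistics import mean
from typing import Sequence


def accept_mutation(old_scores: Sequence[float], new_scores: Sequence[float]) -> bool:
    """Accept only if the mean score improves and the allele gap shrinks.

    Each sequence holds one binding score per allele; lower is assumed
    to mean stronger binding.
    """
    if len(old_scores) != len(new_scores) or not old_scores:
        raise ValueError("need one score per allele, before and after")
    better_affinity = mean(new_scores) < mean(old_scores)
    old_gap = max(old_scores) - min(old_scores)
    new_gap = max(new_scores) - min(new_scores)
    return better_affinity and new_gap < old_gap


# Hypothetical scores for three alleles.
current = [-8.0, -5.0, -6.0]
balanced = [-8.2, -7.0, -7.1]   # better on average and more uniform
lopsided = [-10.0, -5.0, -6.0]  # better on average but a wider gap

accepted_balanced = accept_mutation(current, balanced)
accepted_lopsided = accept_mutation(current, lopsided)
```

The second candidate improves only one allele, so a rule of this kind rejects it even though its average score is better.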
Affiliation(s)
- Rodrigo Ochoa
  - Biophysics of Tropical Diseases, Max Planck Tandem Group, University of Antioquia UdeA, Medellin, Colombia
- Daniela Santoro Rosa
  - Department of Microbiology, Immunology and Parasitology, Federal University of Sao Paulo, Sao Paulo, Brazil
  - Institute for Investigation in Immunology (iii), Instituto Nacional de Ciência e Tecnologia (INCT), Sao Paulo, Brazil
- Alessandro Laio
  - Physics Area, International School for Advanced Studies (SISSA), Trieste, Italy
  - Condensed Matter and Statistical Physics Section, International Centre for Theoretical Physics (ICTP), Trieste, Italy
- Pilar Cossio
  - Biophysics of Tropical Diseases, Max Planck Tandem Group, University of Antioquia UdeA, Medellin, Colombia
  - Department of Theoretical Biophysics, Max Planck Institute of Biophysics, Frankfurt am Main, Germany
  - Center for Computational Mathematics, Flatiron Institute, New York, NY, United States
  - Center for Computational Biology, Flatiron Institute, New York, NY, United States

5
Pedraza-González L, Barneschi L, Padula D, De Vico L, Olivucci M. Evolution of the Automatic Rhodopsin Modeling (ARM) Protocol. Top Curr Chem (Cham) 2022; 380:21. [PMID: 35291019 PMCID: PMC8924150 DOI: 10.1007/s41061-022-00374-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/03/2021] [Accepted: 01/29/2022] [Indexed: 10/27/2022]
Abstract
In recent years, photoactive proteins such as rhodopsins have become a common target for cutting-edge research in the field of optogenetics. Alongside wet-lab research, computational methods are also developing rapidly to provide the tools needed to analyze and rationalize experimental results and, most of all, to drive the design of novel systems. The Automatic Rhodopsin Modeling (ARM) protocol is focused on providing exactly these computational tools for studying rhodopsins, whether natural or resulting from mutations. The code has evolved over the years to provide results that are reproducible by any user and that are accurate and reliable enough to replicate experimental trends. Furthermore, the code is efficient in terms of computing resources and time, and scalable in terms of both the number of concurrent calculations and features. In this review, we show how the code underlying ARM achieved each of these properties.
Affiliation(s)
- Laura Pedraza-González
  - Dipartimento di Biotecnologie, Chimica e Farmacia, Università degli Studi di Siena, Via Aldo Moro 2, 53100, Siena, Italy
  - Department of Chemistry and Industrial Chemistry, University of Pisa, Via Moruzzi 13, 56124, Pisa, Italy
- Leonardo Barneschi
  - Dipartimento di Biotecnologie, Chimica e Farmacia, Università degli Studi di Siena, Via Aldo Moro 2, 53100, Siena, Italy
- Daniele Padula
  - Dipartimento di Biotecnologie, Chimica e Farmacia, Università degli Studi di Siena, Via Aldo Moro 2, 53100, Siena, Italy
- Luca De Vico
  - Dipartimento di Biotecnologie, Chimica e Farmacia, Università degli Studi di Siena, Via Aldo Moro 2, 53100, Siena, Italy
- Massimo Olivucci
  - Dipartimento di Biotecnologie, Chimica e Farmacia, Università degli Studi di Siena, Via Aldo Moro 2, 53100, Siena, Italy
  - Department of Chemistry, Bowling Green State University, Bowling Green, OH, 43403, USA

6
Ochoa R, Soler MA, Gladich I, Battisti A, Minovski N, Rodriguez A, Fortuna S, Cossio P, Laio A. Computational Evolution Protocol for Peptide Design. Methods Mol Biol 2022; 2405:335-359. [PMID: 35298821 DOI: 10.1007/978-1-0716-1855-4_16] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/14/2023]
Abstract
Computational peptide design is useful for therapeutics, diagnostics, and vaccine development. The key to selecting the most promising peptide candidates is an accurate description of the peptide-target interactions at the molecular level. Here we review a computational peptide design protocol whose key feature is the use of all-atom explicit-solvent molecular dynamics to describe the different peptide-target complexes explored during the optimization. We describe the milestones behind the development of this protocol, which is now implemented in an open-source code called PARCE. We provide a basic tutorial for running the code on an antibody fragment design example. Finally, we describe three additional applications of the method to design peptides for different targets, illustrating the broad scope of the proposed approach.
Affiliation(s)
- Rodrigo Ochoa
  - Biophysics of Tropical Diseases, Max Planck Tandem Group, University of Antioquia, Medellin, Colombia
- Ivan Gladich
  - Qatar Environment and Energy Research Institute, Hamad Bin Khalifa University, Doha, Qatar
  - SISSA, Trieste, Italy
- Nikola Minovski
  - Department of Chemical and Pharmaceutical Sciences, University of Trieste, Trieste, Italy
  - Theory Department, Laboratory for Cheminformatics, National Institute of Chemistry, Ljubljana, Slovenia
- Alex Rodriguez
  - The Abdus Salam International Centre for Theoretical Physics, Trieste, Italy
- Sara Fortuna
  - Italian Institute of Technology (IIT), Genova, Italy
  - Department of Chemical and Pharmaceutical Sciences, University of Trieste, Trieste, Italy
- Pilar Cossio
  - Biophysics of Tropical Diseases, Max Planck Tandem Group, University of Antioquia, Medellin, Colombia
  - Department of Theoretical Biophysics, Max Planck Institute of Biophysics, Frankfurt am Main, Germany
- Alessandro Laio
  - The Abdus Salam International Centre for Theoretical Physics, Trieste, Italy
  - SISSA, Trieste, Italy

7
Liang S, Li Z, Zhan J, Zhou Y. De novo protein design by an energy function based on series expansion in distance and orientation dependence. Bioinformatics 2021; 38:86-93. [PMID: 34406339 DOI: 10.1093/bioinformatics/btab598] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 06/05/2021] [Revised: 08/11/2021] [Accepted: 08/16/2021] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION: Despite many successes, de novo protein design is not yet a solved problem as its success rate remains low. The low success rate is largely because we do not yet have an accurate energy function for describing the solvent-mediated interaction between amino acid residues in a protein chain. Previous studies showed that an energy function based on series expansions with its parameters optimized for side-chain and loop conformations can lead to one of the most accurate methods for side chain (OSCAR) and loop prediction (LEAP). Following the same strategy, we developed an energy function based on series expansions with the parameters optimized in four separate stages (recovering single-residue types without and with orientation dependence, selecting loop decoys and maintaining the composition of amino acids). We tested the energy function for de novo design by using Monte Carlo simulated annealing. RESULTS: The method for protein design (OSCAR-Design) is found to be as accurate as OSCAR and LEAP for side-chain and loop prediction, respectively. In de novo design, it can recover native residue types ranging from 38% to 43% depending on test sets, conserve hydrophobic/hydrophilic residues at ∼75%, and yield the overall similarity in amino acid compositions at more than 90%. These performance measures are all statistically significantly better than several protein design programs compared. Moreover, the largest hydrophobic patch areas in designed proteins are near or smaller than those in native proteins. Thus, an energy function based on series expansion can be made useful for protein design. AVAILABILITY AND IMPLEMENTATION: The Linux executable version is freely available for academic users at http://zhouyq-lab.szbl.ac.cn/resources/.
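Monte Carlo simulated annealing, the search method mentioned in this abstract, can be sketched generically as follows. This is a minimal illustration with a toy energy function, not the OSCAR-Design energy function or its move set; the function names, cooling schedule, temperatures, and target sequence are all assumptions made for the example.

```python
import math
import random
from typing import Callable, List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def anneal_sequence(
    energy: Callable[[List[str]], float],
    length: int,
    steps: int = 2000,
    t_start: float = 2.0,
    t_end: float = 0.05,
    seed: int = 0,
) -> Tuple[List[str], float]:
    """Simulated annealing over sequences with Metropolis acceptance."""
    rng = random.Random(seed)
    seq = [rng.choice(AMINO_ACIDS) for _ in range(length)]
    e = energy(seq)
    best_seq, best_e = list(seq), e
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        pos = rng.randrange(length)
        old = seq[pos]
        seq[pos] = rng.choice(AMINO_ACIDS)  # single-residue mutation
        e_new = energy(seq)
        # Metropolis criterion: always accept downhill, sometimes uphill.
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / t):
            e = e_new
            if e < best_e:
                best_seq, best_e = list(seq), e
        else:
            seq[pos] = old  # reject: restore the previous residue
    return best_seq, best_e


# Toy energy (hypothetical): number of positions differing from a target.
TARGET = list("MKVLAAGIVG")


def toy_energy(seq: List[str]) -> float:
    return float(sum(a != b for a, b in zip(seq, TARGET)))


designed, final_energy = anneal_sequence(toy_energy, length=len(TARGET))
```

In a real design run the toy energy would be replaced by the scoring function under test, and the quality of the result then depends far more on that function than on the search loop.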
Affiliation(s)
- Shide Liang
  - Department of R & D, Bio-Thera Solutions, Guangzhou 510530, China
- Zhixiu Li
  - Institute of Health and Biomedical Innovation, Queensland University of Technology at Translational Research Institute, Woolloongabba, QLD 3001, Australia
- Jian Zhan
  - Institute for Glycomics and School of Information and Communication Technology, Griffith University, Gold Coast Campus, Southport, QLD 4222, Australia
  - Institute for Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518055, China
- Yaoqi Zhou
  - Institute for Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518055, China
  - Peking University Shenzhen Graduate School, Shenzhen 518055, China

8
Kurcinski M, Kmiecik S, Zalewski M, Kolinski A. Protein-Protein Docking with Large-Scale Backbone Flexibility Using Coarse-Grained Monte-Carlo Simulations. Int J Mol Sci 2021; 22:7341. [PMID: 34298961 PMCID: PMC8306105 DOI: 10.3390/ijms22147341] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/19/2021] [Revised: 07/03/2021] [Accepted: 07/04/2021] [Indexed: 12/21/2022] Open
Abstract
Most protein–protein docking methods treat proteins as almost rigid objects; usually only side-chain flexibility is taken into account. The few approaches that enable docking with a flexible backbone typically work in two steps, in which the search for protein–protein orientations and the structural flexibility are simulated separately. In this work, we propose a new, straightforward approach to docking sampling. It consists of a single simulation step during which one protein undergoes large-scale backbone rearrangements, rotations, and translations while the other protein exhibits small backbone fluctuations. Such extensive sampling was possible at a reasonable computational cost using the CABS coarse-grained protein model and Replica Exchange Monte Carlo dynamics. In our proof-of-concept simulations of 62 protein–protein complexes, we obtained acceptable-quality models for a significant number of cases.
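Replica Exchange Monte Carlo, named in this abstract, periodically attempts to swap configurations between replicas running at different temperatures. The sketch below shows only the standard Metropolis swap criterion, min(1, exp[(β_i − β_j)(E_i − E_j)]); it does not represent the CABS model or the paper's temperature ladder, and the function names are illustrative.

```python
import math
import random


def swap_probability(beta_i: float, beta_j: float, e_i: float, e_j: float) -> float:
    """Probability of exchanging configurations between replicas i and j.

    beta_* are inverse temperatures 1/(kT); e_* are the current energies.
    """
    delta = (beta_i - beta_j) * (e_i - e_j)
    return 1.0 if delta >= 0 else math.exp(delta)


def attempt_swap(beta_i: float, beta_j: float, e_i: float, e_j: float,
                 rng: random.Random) -> bool:
    """Draw a random number and decide whether the swap is accepted."""
    return rng.random() < swap_probability(beta_i, beta_j, e_i, e_j)


# The cold replica (beta = 1.0) holds the higher energy: always swap.
p_favourable = swap_probability(1.0, 0.5, -10.0, -12.0)
# The cold replica already holds the lower energy: swap with probability e^-1.
p_unfavourable = swap_probability(1.0, 0.5, -12.0, -10.0)
```

Swaps of this kind let low-temperature replicas escape local minima by inheriting configurations found at high temperature, which is what makes large-scale backbone rearrangements affordable to sample.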

9
Alapati R, Shuvo MH, Bhattacharya D. SPECS: Integration of side-chain orientation and global distance-based measures for improved evaluation of protein structural models. PLoS One 2020; 15:e0228245. [PMID: 32053611 PMCID: PMC7018003 DOI: 10.1371/journal.pone.0228245] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 07/22/2019] [Accepted: 01/11/2020] [Indexed: 12/23/2022] Open
Abstract
Significant advances in protein structure prediction have created a need for objective and robust evaluation of protein structural models, in which predicted models are compared against experimentally determined native structures to quantify their structural similarity. Existing model-versus-native similarity metrics consider the distances between either alpha carbon (Cα) or side-chain atoms when computing the similarity. However, the side-chain orientation of a protein plays a critical role in defining its conformation at the atomic level. Despite its importance, side-chain orientation has not yet been included in structural similarity evaluation. Here, we present SPECS, a side-chain-orientation-included protein model-native similarity metric for improved evaluation of protein structural models. SPECS combines side-chain orientation and global distance-based measures in an integrated framework using the united-residue model of polypeptide conformation for computing model-native similarity. Experimental results demonstrate that SPECS is a reliable measure for evaluating structural similarity at the global level including and beyond the accuracy of Cα positioning. Moreover, SPECS delivers superior performance in capturing local quality aspects compared to popular global Cα positioning-based metrics, for models ranging from near-experimental accuracy to correct overall folds, making it a robust measure suitable for both high- and moderate-resolution models. Finally, SPECS is sensitive to minute variations in side-chain χ angles even for models with a perfect Cα trace, revealing the power of including side-chain orientation. Collectively, SPECS is a versatile evaluation metric that covers a wide spectrum of protein modeling scenarios and simultaneously captures complementary aspects of structural similarity at multiple levels of granularity. SPECS is freely available at http://watson.cse.eng.auburn.edu/SPECS/.
Affiliation(s)
- Rahul Alapati
  - Department of Computer Science and Software Engineering, Auburn University, Auburn, Alabama, United States of America
- Md. Hossain Shuvo
  - Department of Computer Science and Software Engineering, Auburn University, Auburn, Alabama, United States of America
- Debswapna Bhattacharya
  - Department of Computer Science and Software Engineering, Auburn University, Auburn, Alabama, United States of America
  - Department of Biological Sciences, Auburn University, Auburn, Alabama, United States of America

10
Badaczewska-Dawid AE, Kolinski A, Kmiecik S. Computational reconstruction of atomistic protein structures from coarse-grained models. Comput Struct Biotechnol J 2019; 18:162-176. [PMID: 31969975 PMCID: PMC6961067 DOI: 10.1016/j.csbj.2019.12.007] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Received: 12/10/2019] [Accepted: 12/10/2019] [Indexed: 01/02/2023] Open
Abstract
Three-dimensional protein structures, whether determined experimentally or theoretically, are often of too low a resolution. In this mini-review, we outline the computational methods for reconstructing protein structures from incomplete coarse-grained models to all-atom models. Typical reconstruction schemes can be divided into four major steps. Usually, the first step is reconstruction of the protein backbone chain starting from the C-alpha trace. This is followed by side-chain rebuilding based on the protein backbone geometry. Subsequently, hydrogen atoms can be reconstructed. Finally, the resulting all-atom models may require structure optimization. Many methods are available to perform each of these tasks. We discuss the available tools and their potential applications in integrative modeling pipelines that can transfer coarse-grained information from computational predictions, or experiments, to all-atom structures.
Affiliation(s)
- Sebastian Kmiecik
  - Faculty of Chemistry, Biological and Chemical Research Center, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland

11
Christoffer C, Terashi G, Shin WH, Aderinwale T, Maddhuri Venkata Subramaniya SR, Peterson L, Verburgt J, Kihara D. Performance and enhancement of the LZerD protein assembly pipeline in CAPRI 38-46. Proteins 2019; 88:948-961. [PMID: 31697428 DOI: 10.1002/prot.25850] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 08/29/2019] [Revised: 10/07/2019] [Accepted: 11/03/2019] [Indexed: 01/17/2023]
Abstract
We report the performance of the protein docking prediction pipeline of our group and the results for Critical Assessment of Prediction of Interactions (CAPRI) rounds 38-46. The pipeline integrates programs developed in our group as well as other existing scoring functions. The core of the pipeline is the LZerD protein-protein docking algorithm. If templates of the target complex are not found in PDB, the first step of our docking prediction pipeline is to run LZerD for a query protein pair. Meanwhile, in the case of human group prediction, we survey the literature to find information that can guide the modeling, such as protein-protein interface information. In addition to any literature information and binding residue prediction, generated docking decoys were selected by a rank aggregation of statistical scoring functions. The top 10 decoys were relaxed by a short molecular dynamics simulation before submission to remove atom clashes and improve side-chain conformations. In these CAPRI rounds, our group, particularly the LZerD server, showed robust performance. On the other hand, there are failed cases where some other groups were successful. To understand weaknesses of our pipeline, we analyzed sources of errors for failed targets. Since we noted that structure refinement is a step that needs improvement, we newly performed a comparative study of several refinement approaches. Finally, we show several examples that illustrate successful and unsuccessful cases by our group.
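The rank aggregation step mentioned in this abstract can be illustrated with a rank-sum sketch. This is one simple aggregation scheme chosen for illustration and is an assumption: the abstract does not state which scheme the pipeline uses. The scoring-function names, decoy identifiers, and scores are hypothetical, and lower scores are assumed to be better for every function.

```python
from typing import Dict, List


def rank_sum(scores: Dict[str, Dict[str, float]]) -> List[str]:
    """Order decoys by the sum of their ranks across scoring functions.

    scores maps scoring-function name -> {decoy id: score}; a lower score
    is assumed to be better for every function.
    """
    totals: Dict[str, int] = {}
    for per_decoy in scores.values():
        ordered = sorted(per_decoy, key=lambda d: per_decoy[d])
        for rank, decoy in enumerate(ordered, start=1):
            totals[decoy] = totals.get(decoy, 0) + rank
    # Ties in the rank sum are broken alphabetically by decoy id.
    return sorted(totals, key=lambda d: (totals[d], d))


# Three hypothetical scoring functions evaluated on three decoys.
toy_scores = {
    "score_a": {"d1": -3.0, "d2": -1.0, "d3": -2.0},
    "score_b": {"d1": -0.5, "d2": -0.9, "d3": -0.7},
    "score_c": {"d1": -10.0, "d2": -4.0, "d3": -9.0},
}
ranking = rank_sum(toy_scores)  # rank sums: d1 = 5, d3 = 6, d2 = 7
```

Working with ranks rather than raw scores avoids having to put scoring functions with different units and ranges on a common scale.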
Affiliation(s)
- Genki Terashi
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana
- Woong-Hee Shin
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana
  - Department of Chemistry Education, Sunchon National University, Suncheon, Jeollanam-do, Republic of Korea
- Tunde Aderinwale
  - Department of Computer Science, Purdue University, West Lafayette, Indiana
- Lenna Peterson
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana
- Jacob Verburgt
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana
- Daisuke Kihara
  - Department of Computer Science, Purdue University, West Lafayette, Indiana
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana
  - Purdue University Center for Cancer Research, Purdue University, West Lafayette, Indiana
  - Department of Pediatrics, University of Cincinnati, Cincinnati, Ohio

12
Jumper JM, Faruk NF, Freed KF, Sosnick TR. Accurate calculation of side chain packing and free energy with applications to protein molecular dynamics. PLoS Comput Biol 2018; 14:e1006342. [PMID: 30589846 PMCID: PMC6307715 DOI: 10.1371/journal.pcbi.1006342] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 09/01/2017] [Accepted: 06/21/2018] [Indexed: 12/02/2022] Open
Abstract
To address the large gap between time scales that can be easily reached by molecular simulations and those required to understand protein dynamics, we present a rapid self-consistent approximation of the side chain free energy at every integration step. In analogy with the adiabatic Born-Oppenheimer approximation for electronic structure, the protein backbone dynamics are simulated as proceeding according to the dictates of the free energy of an instantaneously equilibrated side chain potential. The side chain free energy is computed on the fly, allowing the protein backbone dynamics to traverse a greatly smoothed energetic landscape. This computation results in extremely rapid equilibration and sampling of the Boltzmann distribution. Our method, termed Upside, employs a reduced model involving the three backbone atoms, along with the carbonyl oxygen and amide proton, and a single (oriented) side chain bead having multiple locations reflecting the conformational diversity of the side chain's rotameric states. We also introduce a novel maximum-likelihood method to parameterize the side chain interactions using protein structures. We demonstrate state-of-the-art accuracy for predicting χ1 rotamer states while consuming only milliseconds of CPU time. Our method enables rapidly equilibrating coarse-grained simulations that can nonetheless contain significant molecular detail. We also show that the resulting free energies of the side chains are sufficiently accurate for de novo folding of some proteins.
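The notion of a side chain free energy over several bead locations can be illustrated for the simplest possible case: a single, non-interacting side chain with a handful of discrete states. The sketch below evaluates F = −kT ln Σ exp(−E_i/kT) with a numerically stable log-sum-exp. It is not the self-consistent approximation over interacting side chains described in the abstract; the function name, energy units, and temperature are assumptions made for the example.

```python
import math
from typing import Sequence

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def side_chain_free_energy(state_energies: Sequence[float],
                           temperature: float = 300.0) -> float:
    """F = -kT * ln(sum_i exp(-E_i / kT)) for one isolated side chain."""
    if not state_energies:
        raise ValueError("need at least one state")
    kt = K_B * temperature
    e_min = min(state_energies)
    # Shift by the minimum energy so the largest exponent is exactly zero.
    z_shifted = sum(math.exp(-(e - e_min) / kt) for e in state_energies)
    return e_min - kt * math.log(z_shifted)


# Three degenerate states lower the free energy by kT * ln(3).
f_degenerate = side_chain_free_energy([1.0, 1.0, 1.0])
# A single state has no entropy, so F equals its energy.
f_single = side_chain_free_energy([2.5])
```

Because the free energy already averages over side chain states, a backbone moving on this surface sees a smoother landscape than one that has to wait for each side chain to rearrange explicitly.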
Affiliation(s)
- John M. Jumper
  - Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, Illinois, United States of America
  - Department of Chemistry, and The James Franck Institute, University of Chicago, Chicago, Illinois, United States of America
- Nabil F. Faruk
  - Graduate Program in Biophysical Sciences, University of Chicago, Chicago, Illinois, United States of America
- Karl F. Freed
  - Department of Chemistry, and The James Franck Institute, University of Chicago, Chicago, Illinois, United States of America
- Tobin R. Sosnick
  - Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, Illinois, United States of America
  - Institute for Biophysical Dynamics, University of Chicago, Chicago, Illinois, United States of America

13
Ochoa R, Soler MA, Laio A, Cossio P. Assessing the capability of in silico mutation protocols for predicting the finite temperature conformation of amino acids. Phys Chem Chem Phys 2018; 20:25901-25909. [PMID: 30289133 DOI: 10.1039/c8cp03826k] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Indexed: 12/31/2022]
Abstract
Mutation protocols are a key tool in computational biophysics for modelling unknown side chain conformations. In particular, these protocols are used to generate the starting structures for molecular dynamics simulations. The accuracy of the initial side chain and backbone placement is crucial to obtain a stable and quickly converging simulation. In this work, we assessed the performance of several mutation protocols in predicting the most probable conformer observed in finite temperature molecular dynamics simulations for a set of protein-peptide crystals differing only by single-point mutations in the peptide sequence. Our results show that several programs that predict the crystal conformations well fail to predict the most probable finite temperature configuration. Methods relying on backbone-dependent rotamer libraries generally perform better, but even the best protocol fails to predict approximately 30% of the mutations.
Affiliation(s)
- Rodrigo Ochoa
  - Biophysics of Tropical Diseases, Max Planck Tandem Group, University of Antioquia, Medellin, Colombia
|
14
|
Colbes J, Corona RI, Lezcano C, Rodríguez D, Brizuela CA. Protein side-chain packing problem: is there still room for improvement? Brief Bioinform 2018; 18:1033-1043. [PMID: 27567382 DOI: 10.1093/bib/bbw079] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 05/23/2016] [Indexed: 11/12/2022] Open
Abstract
The protein side-chain packing problem (PSCPP) is an important subproblem of both protein structure prediction and protein design. During the past two decades, a large number of methods have been proposed to tackle this problem. These methods consist of three main components: a rotamer library, a scoring function and a search strategy. The average overall accuracy level obtained by these methods is approximately 87%. Whether a better accuracy level could be achieved remains to be answered. To address this question, we calculated the maximum accuracy level attainable using a simple rotamer library, independently of the energy function or the search method. Using 2883 different structures from the Protein Data Bank, we compared this accuracy level with the accuracy level of five state-of-the-art methods. These comparisons indicated that, for buried residues in the protein, we are already close to the best possible accuracy results. In addition, for exposed residues, we found that a significant gap exists between the possible improvement and the maximum accuracy level achievable with current methods. After determining that an improvement is possible, the next step is to understand what limitations are preventing us from obtaining such an improvement. Previous works on protein structure prediction and protein design have shown that scoring function inaccuracies may represent the main obstacle to achieving better results for these problems. To show that the same is true for the PSCPP, we evaluated the quality of two scoring functions used by some state-of-the-art algorithms. Our results indicate that neither of these scoring functions can guide the search method correctly, thereby reinforcing the idea that efforts to solve the PSCPP must also focus on developing better scoring functions.
|
15
|
Gaines JC, Acebes S, Virrueta A, Butler M, Regan L, O'Hern CS. Comparing side chain packing in soluble proteins, protein-protein interfaces, and transmembrane proteins. Proteins 2018; 86:581-591. [PMID: 29427530 PMCID: PMC5912992 DOI: 10.1002/prot.25479] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 12/13/2017] [Revised: 01/23/2018] [Accepted: 02/06/2018] [Indexed: 12/26/2022]
Abstract
We compare side chain prediction and packing of core and non-core regions of soluble proteins, protein-protein interfaces, and transmembrane proteins. We first identified or created comparable databases of high-resolution crystal structures of these 3 protein classes. We show that the solvent-inaccessible cores of the 3 classes of proteins are equally densely packed. As a result, the side chains of core residues at protein-protein interfaces and in the membrane-exposed regions of transmembrane proteins can be predicted by the hard-sphere plus stereochemical constraint model with the same high prediction accuracies (>90%) as core residues in soluble proteins. We also find that for all 3 classes of proteins, as one moves away from the solvent-inaccessible core, the packing fraction decreases as the solvent accessibility increases. However, the side chain predictability remains high (80% within 30°) up to a relative solvent accessibility, rSASA≲0.3, for all 3 protein classes. Our results show that ≈40% of the interface regions in protein complexes are "core", that is, densely packed with side chain conformations that can be accurately predicted using the hard-sphere model. We propose packing fraction as a metric that can be used to distinguish real protein-protein interactions from designed, non-binding, decoys. Our results also show that cores of membrane proteins are the same as cores of soluble proteins. Thus, the computational methods we are developing for the analysis of the effect of hydrophobic core mutations in soluble proteins will be equally applicable to analyses of mutations in membrane proteins.
Affiliation(s)
- J C Gaines
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, 06520
  - Integrated Graduate Program in Physical and Engineering Biology (IGPPEB), Yale University, New Haven, Connecticut, 06520
- S Acebes
  - Department of Mechanical Engineering and Materials Science, Yale University, New Haven, Connecticut, 06520
- A Virrueta
  - Integrated Graduate Program in Physical and Engineering Biology (IGPPEB), Yale University, New Haven, Connecticut, 06520
  - Department of Mechanical Engineering and Materials Science, Yale University, New Haven, Connecticut, 06520
- M Butler
  - Department of Physics and Astronomy, University of Southern California, Los Angeles, California, 90007
- L Regan
  - Integrated Graduate Program in Physical and Engineering Biology (IGPPEB), Yale University, New Haven, Connecticut, 06520
  - Department of Molecular Biophysics & Biochemistry, Yale University, New Haven, Connecticut, 06520
  - Department of Chemistry, Yale University, New Haven, Connecticut, 06520
- C S O'Hern
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, 06520
  - Integrated Graduate Program in Physical and Engineering Biology (IGPPEB), Yale University, New Haven, Connecticut, 06520
  - Department of Mechanical Engineering and Materials Science, Yale University, New Haven, Connecticut, 06520
  - Department of Physics, Yale University, New Haven, Connecticut, 06520
  - Department of Applied Physics, Yale University, New Haven, Connecticut, 06520
|
16
|
Colbes J, Aguila SA, Brizuela CA. Scoring of Side-Chain Packings: An Analysis of Weight Factors and Molecular Dynamics Structures. J Chem Inf Model 2018; 58:443-452. [PMID: 29368924 DOI: 10.1021/acs.jcim.7b00679] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
The protein side-chain packing problem (PSCPP) is a central task in computational protein design. The problem is usually modeled as a combinatorial optimization problem, which consists of searching for a set of rotamers, from a given rotamer library, that minimizes a scoring function (SF). The SF is a weighted sum of terms that can be decomposed into physics-based and knowledge-based terms. Although there are many methods to obtain approximate solutions for this problem, all of them perform similarly and there has not been a significant improvement in recent years. Studies on protein structure prediction and protein design revealed that current SFs limit further improvements for these two problems. Along the same lines, a recent work reported a similar result for the PSCPP. In this work, we ask whether this negative result regarding further improvements in performance is due to (i) an incorrect weighting of the SF terms or (ii) the constrained conformation resulting from the protein crystallization process. To analyze these questions, we (i) modeled the PSCPP as a bi-objective combinatorial optimization problem, optimizing, at the same time, the two most important terms of two SFs of state-of-the-art algorithms and (ii) performed a preprocessing relaxation of the crystal structure through molecular dynamics to simulate the protein in the solvent and evaluated the performance of these two state-of-the-art SFs under these conditions. Our results indicate that (i) no matter what combination of weight factors we use, the current SFs will not lead to better performance and (ii) the evaluated SFs will not be able to improve performance on relaxed structures. Furthermore, the experiments revealed that the SFs and the methods are biased toward crystallized structures.
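The formulation in this abstract (choose one rotamer per residue so that a weighted sum of score terms is minimal) can be sketched on a toy instance. The example below is only a minimal illustration under stated assumptions: the two terms (a per-rotamer term and a pairwise term), the energy tables, the weights and the function names are hypothetical, and the exhaustive search stands in for the approximate search strategies that real packing programs use.

```python
from itertools import product


def score(assignment, self_terms, pair_terms, weights):
    """Weighted sum of a per-rotamer term and a pairwise term for one assignment."""
    w_self, w_pair = weights
    total = 0.0
    for position, rotamer in enumerate(assignment):
        total += w_self * self_terms[position][rotamer]
    for (i, j), table in pair_terms.items():
        total += w_pair * table[assignment[i]][assignment[j]]
    return total


def best_packing(self_terms, pair_terms, weights):
    """Try every rotamer combination; feasible only for tiny toy inputs."""
    choices = [range(len(rotamers)) for rotamers in self_terms]
    return min(
        product(*choices),
        key=lambda a: score(a, self_terms, pair_terms, weights),
    )


# Hypothetical toy instance: three residues with two rotamers each.
self_terms = [[0.0, 1.0], [0.0, 0.5], [0.2, 0.3]]
pair_terms = {
    (0, 1): [[2.0, 0.0], [0.0, 0.0]],  # rotamers 0 and 0 clash
    (1, 2): [[0.0, 0.0], [0.0, 3.0]],  # rotamers 1 and 1 clash
}

print(best_packing(self_terms, pair_terms, weights=(1.0, 1.0)))
print(best_packing(self_terms, pair_terms, weights=(1.0, 0.0)))
```

In this toy case, switching the pairwise weight off changes which assignment is selected, which is the kind of weight dependence the abstract sets out to examine.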
Affiliation(s)
- Jose Colbes
  - Computer Science Department, CICESE Research Center, 22860 Ensenada, Mexico
- Sergio A Aguila
  - Centro de Nanociencias y Nanotecnologia, Universidad Nacional Autonoma de Mexico, Km. 107 Carretera Tijuana-Ensenada, Ensenada, Baja California, Mexico, C.P. 22860
- Carlos A Brizuela
  - Computer Science Department, CICESE Research Center, 22860 Ensenada, Mexico
|
17
|
Leem J, Georges G, Shi J, Deane CM. Antibody side chain conformations are position-dependent. Proteins 2018; 86:383-392. [PMID: 29318667 DOI: 10.1002/prot.25453] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 09/26/2017] [Revised: 12/15/2017] [Accepted: 01/05/2018] [Indexed: 11/11/2022]
Abstract
Side chain prediction is an integral component of computational antibody design and structure prediction. Current antibody modelling tools use backbone-dependent rotamer libraries with conformations taken from general proteins. Here we present our antibody-specific rotamer library, where rotamers are binned according to their immunogenetics (IMGT) position, rather than their local backbone geometry. We find that for some amino acid types at certain positions, only a restricted number of side chain conformations are ever observed. Using this information, we are able to reduce the breadth of the rotamer sampling space. Based on our rotamer library, we built a side chain predictor, position-dependent antibody rotamer swapper (PEARS). On a blind test set of 95 antibody model structures, PEARS had the highest average χ1 and χ1+2 accuracy (78.7% and 64.8%) compared to three leading backbone-dependent side chain predictors. Our use of IMGT position, rather than backbone ϕ/ψ, meant that PEARS was more robust to errors in the backbone of the model structure. PEARS also achieved the lowest number of side chain-side chain clashes. PEARS is freely available as a web application at http://opig.stats.ox.ac.uk/webapps/pears.
Affiliation(s)
- Jinwoo Leem
  - Department of Statistics, University of Oxford, 24-29 St Giles, Oxford, OX1 3LB, United Kingdom
- Guy Georges
  - Pharma Research and Early Development, Large Molecule Research, Roche Innovation Center Munich, Nonnenwald 2, Penzberg, 82377, Germany
- Jiye Shi
  - Chemistry Department, UCB, 208 Bath Road, Slough, SL1 3WE, United Kingdom
- Charlotte M Deane
  - Department of Statistics, University of Oxford, 24-29 St Giles, Oxford, OX1 3LB, United Kingdom
|
18
|
Zhao K, Zhou X, Ding M. Molecular insight into mutation-induced conformational change in metastasic bowel cancer BRAF kinase domain and its implications for selective inhibitor design. J Mol Graph Model 2017; 79:59-64. [PMID: 29145034 DOI: 10.1016/j.jmgm.2017.11.005] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/10/2017] [Revised: 10/31/2017] [Accepted: 11/06/2017] [Indexed: 11/30/2022]
Abstract
The oncogenic BRAF V600E mutation confers constitutive activation on the kinase and is closely related to the pathogenesis of metastatic bowel cancer (MBC). Here, the V600E-induced conformational change in the MBC BRAF kinase domain is characterized systematically at structural, energetic and dynamic levels. The mutation is observed to cause a conformational conversion of the kinase's activation loop from DFG-out to DFG-in, thus activating the kinase. Electrostatic force is primarily responsible for the conformational conversion and stabilization of DFG-in associated with the mutation. Molecular docking calculations are employed to analyze the binding mode difference of mutant-selective inhibitors between the DFG-out and DFG-in conformations of BRAF kinase. It is revealed that the mutation can reshape the inhibitor selectivity profile by altering the kinase loop conformation. Several compounds are determined to have a high or moderate selectivity for mutant over wild-type kinase. The selectivity originates primarily from hydrogen bond interactions of inhibitor ligands with the mutant rather than the wild type, owing to the conformational difference in the kinase domain.
Affiliation(s)
- Kai Zhao
  - Department of Gastroenterology, People's Hospital of Jintan, Changzhou 213200, China
- Xin Zhou
  - Department of Gastroenterology, People's Hospital of Jintan, Changzhou 213200, China
- Ming Ding
  - Department of Respiration, The Affiliated Hospital of Jiangsu University, Zhenjiang 212001, China
|
19
|
Gaines JC, Virrueta A, Buch DA, Fleishman SJ, O'Hern CS, Regan L. Collective repacking reveals that the structures of protein cores are uniquely specified by steric repulsive interactions. Protein Eng Des Sel 2017; 30:387-394. [PMID: 28201818 PMCID: PMC7263838 DOI: 10.1093/protein/gzx011] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 01/10/2017] [Accepted: 01/26/2017] [Indexed: 11/12/2022] Open
Abstract
Protein core repacking is a standard test of protein modeling software. A recent study of six different modeling software packages showed that they are more successful at predicting side chain conformations of core compared to surface residues. All the modeling software tested have multicomponent energy functions, typically including contributions from solvation, electrostatics, hydrogen bonding and Lennard–Jones interactions in addition to statistical terms based on observed protein structures. We investigated to what extent a simplified energy function that includes only stereochemical constraints and repulsive hard-sphere interactions can correctly repack protein cores. For single residue and collective repacking, the hard-sphere model accurately recapitulates the observed side chain conformations for Ile, Leu, Phe, Thr, Trp, Tyr and Val. This result shows that there are no alternative, sterically allowed side chain conformations of core residues. Analysis of the same set of protein cores using the Rosetta software suite revealed that the hard-sphere model and Rosetta perform equally well on Ile, Leu, Phe, Thr and Val; the hard-sphere model performs better on Trp and Tyr and Rosetta performs better on Ser. We conclude that the high prediction accuracy in protein cores obtained by protein modeling software and our simplified hard-sphere approach reflects the high density of protein cores and dominance of steric repulsion.
Affiliation(s)
- J C Gaines
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
  - Integrated Graduate Program in Physical and Engineering Biology (IGPPEB), Yale University, New Haven, CT 06520, USA
- A Virrueta
  - Integrated Graduate Program in Physical and Engineering Biology (IGPPEB), Yale University, New Haven, CT 06520, USA
  - Department of Mechanical Engineering and Materials Science, Yale University, New Haven, CT 06520, USA
- D A Buch
  - C. Eugene Bennett Department of Chemistry, 217 Clark Hall, West Virginia University, Morgantown, WV 26506, USA
- S J Fleishman
  - Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot 76100, Israel
- C S O'Hern
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
  - Integrated Graduate Program in Physical and Engineering Biology (IGPPEB), Yale University, New Haven, CT 06520, USA
  - Department of Mechanical Engineering and Materials Science, Yale University, New Haven, CT 06520, USA
  - Department of Physics, Yale University, New Haven, CT 06520, USA
  - Department of Applied Physics, Yale University, New Haven, CT 06520, USA
- L Regan
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
  - Integrated Graduate Program in Physical and Engineering Biology (IGPPEB), Yale University, New Haven, CT 06520, USA
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
  - Department of Chemistry, Yale University, New Haven, CT 06520, USA
|
20
|
Modeling disordered protein interactions from biophysical principles. PLoS Comput Biol 2017; 13:e1005485. [PMID: 28394890 PMCID: PMC5402988 DOI: 10.1371/journal.pcbi.1005485] [Citation(s) in RCA: 39] [Impact Index Per Article: 5.6] [Received: 11/07/2016] [Revised: 04/24/2017] [Accepted: 03/29/2017] [Indexed: 12/12/2022] Open
Abstract
Disordered protein-protein interactions (PPIs), those involving a folded protein and an intrinsically disordered protein (IDP), are prevalent in the cell, including important signaling and regulatory pathways. IDPs do not adopt a single dominant structure in isolation but often become ordered upon binding. To aid understanding of the molecular mechanisms of disordered PPIs, it is crucial to obtain the tertiary structure of the PPIs. However, experimental methods have difficulty in solving disordered PPIs and existing protein-protein and protein-peptide docking methods are not able to model them. Here we present a novel computational method, IDP-LZerD, which models the conformation of a disordered PPI by considering the biophysical binding mechanism of an IDP to a structured protein, whereby a local segment of the IDP initiates the interaction and subsequently the remaining IDP regions explore and coalesce around the initial binding site. On a dataset of 22 disordered PPIs with IDPs up to 69 amino acids, successful predictions were made for 21 bound and 18 unbound receptors. The successful modeling provides additional support for biophysical principles. Moreover, the new technique significantly expands the capability of protein structure modeling and provides crucial insights into the molecular mechanisms of disordered PPIs.
A substantial fraction of the proteins encoded in genomes are intrinsically disordered proteins (IDPs), which lack a single stable structure in the native state. IDPs serve many functions including mediating protein-protein interactions (PPIs). Such disordered PPIs are prevalent in important regulatory pathways, including many interactions of the tumor suppressor protein p53. To elucidate the molecular mechanisms of disordered PPIs, obtaining tertiary structure information is essential; however, they are difficult to study with experimental techniques and existing computational protein-protein and protein-peptide modeling methods are unable to model disordered PPIs. Here we present a novel computational method for modeling the structure of disordered PPIs, the first of its kind. The method, IDP-LZerD, is designed to follow a known biophysical picture of the mechanism of how IDPs interact with structured proteins. IDP-LZerD successfully modeled the majority of disordered PPIs tested. This technique opens up new possibilities for structural studies of IDPs and their interactions.
|
21
|
Miao Z, Cao Y. Quantifying side-chain conformational variations in protein structure. Sci Rep 2016; 6:37024. [PMID: 27845406 PMCID: PMC5109468 DOI: 10.1038/srep37024] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 03/29/2016] [Accepted: 10/24/2016] [Indexed: 12/15/2022] Open
Abstract
Protein side-chain conformations are closely related to biological function. Side-chain prediction is a key step in protein design, protein docking and structure optimization. However, side-chain polymorphism is widespread in proteins, takes various forms, and has long been overlooked by side-chain prediction. Such conformational variations have not been quantitatively studied, and the correlations between these variations and residue features are vague. Here, we performed statistical analyses on large-scale data sets and found that side-chain conformational flexibility is closely related to solvent exposure, degrees of freedom and hydrophilicity. These analyses allowed us to quantify different types of side-chain variability in the PDB. The results underscore that protein side-chain conformation prediction is not a single-answer problem, leading us to reconsider the assessment approaches of side-chain prediction programs.
Affiliation(s)
- Zhichao Miao
  - Architecture et Réactivité de l'ARN, Université de Strasbourg, Institut de biologie moléculaire et cellulaire du CNRS, 67000 Strasbourg, France
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK
- Yang Cao
  - Center of Growth, Metabolism and Aging, Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences and State Key Laboratory of Biotherapy, Sichuan University, Chengdu, 610014, China
|
22
|
Caballero D, Virrueta A, O'Hern CS, Regan L. Steric interactions determine side-chain conformations in protein cores. Protein Eng Des Sel 2016; 29:367-376. [PMID: 27416747 DOI: 10.1093/protein/gzw027] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 04/13/2016] [Accepted: 06/12/2016] [Indexed: 11/12/2022] Open
Abstract
We investigate the role of steric interactions in defining side-chain conformations in protein cores. Previously, we explored the strengths and limitations of hard-sphere dipeptide models in defining sterically allowed side-chain conformations and recapitulating key features of the side-chain dihedral angle distributions observed in high-resolution protein structures. Here, we show that modeling residues in the context of a particular protein environment, with both intra- and inter-residue steric interactions, is sufficient to specify which of the allowed side-chain conformations is adopted. This model predicts 97% of the side-chain conformations of Leu, Ile, Val, Phe, Tyr, Trp and Thr core residues to within 20°. Although the hard-sphere dipeptide model predicts the observed side-chain dihedral angle distributions for both Thr and Ser, the model including the protein environment predicts side-chain conformations to within 20° for only 60% of core Ser residues. Thus, this approach can identify the amino acids for which hard-sphere interactions alone are sufficient and those for which additional interactions are necessary to accurately predict side-chain conformations in protein cores. We also show that our approach can predict alternate side-chain conformations of core residues, which are supported by the observed electron density.
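Accuracy figures of the form "97% within 20°" rest on comparing predicted and observed dihedral angles on a circle, where 178° and -175° are only 7° apart. The sketch below shows one common way to compute such a fraction for a single dihedral per residue; it is an illustration rather than the authors' evaluation code, the function names and χ1 values are hypothetical, and the paper may apply the criterion to all side-chain dihedrals of a residue.

```python
def angular_difference(a_deg, b_deg):
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def fraction_within(predicted, observed, tolerance_deg=20.0):
    """Fraction of residues whose predicted angle lies within the tolerance."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not predicted:
        raise ValueError("need at least one angle")
    hits = sum(
        1
        for p, o in zip(predicted, observed)
        if angular_difference(p, o) <= tolerance_deg
    )
    return hits / len(predicted)


# Hypothetical chi1 values in degrees, for illustration only.
predicted_chi1 = [-65.0, 178.0, 60.0, -170.0]
observed_chi1 = [-60.0, -175.0, 95.0, 175.0]
print(fraction_within(predicted_chi1, observed_chi1))
```

Handling the wrap-around at ±180° matters here, since a naive subtraction would count the second and fourth residues as gross errors.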
Affiliation(s)
- D Caballero
  - Department of Physics, Yale University, New Haven, CT 06520, USA
  - Integrated Graduate Program in Physical and Engineering Biology, Yale University, New Haven, CT 06520, USA
- A Virrueta
  - Integrated Graduate Program in Physical and Engineering Biology, Yale University, New Haven, CT 06520, USA
  - Department of Mechanical Engineering and Materials Science, Yale University, New Haven, CT 06520, USA
- C S O'Hern
  - Department of Physics, Yale University, New Haven, CT 06520, USA
  - Integrated Graduate Program in Physical and Engineering Biology, Yale University, New Haven, CT 06520, USA
  - Department of Mechanical Engineering and Materials Science, Yale University, New Haven, CT 06520, USA
  - Graduate Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
- L Regan
  - Integrated Graduate Program in Physical and Engineering Biology, Yale University, New Haven, CT 06520, USA
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
  - Department of Chemistry, Yale University, New Haven, CT 06520, USA
  - Raymond and Beverly Sackler Institute for Biological, Physical, and Engineering Sciences, Yale University, New Haven, CT 06520, USA
|
23
|
Gaillard T, Panel N, Simonson T. Protein side chain conformation predictions with an MMGBSA energy function. Proteins 2016; 84:803-19. [PMID: 26948696 DOI: 10.1002/prot.25030] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 10/02/2015] [Revised: 02/22/2016] [Accepted: 02/27/2016] [Indexed: 12/17/2022]
Abstract
The prediction of protein side chain conformations from backbone coordinates is an important task in structural biology, with applications in structure prediction and protein design. It is a difficult problem due to its combinatorial nature. We study the performance of an "MMGBSA" energy function, implemented in our protein design program Proteus, which combines molecular mechanics terms, a Generalized Born and Surface Area (GBSA) solvent model, with approximations that make the model pairwise additive. Proteus is not a competitor to specialized side chain prediction programs due to its cost, but it allows protein design applications, where side chain prediction is an important step and MMGBSA an effective energy model. We predict the side chain conformations for 18 proteins. The side chains are first predicted individually, with the rest of the protein in its crystallographic conformation. Next, all side chains are predicted together. The contributions of individual energy terms are evaluated and various parameterizations are compared. We find that the GB and SA terms, with an appropriate choice of the dielectric constant and surface energy coefficients, are beneficial for single side chain predictions. For the prediction of all side chains, however, errors due to the pairwise additive approximation overcome the improvement brought by these terms. We also show the crucial contribution of side chain minimization to alleviate the rigid rotamer approximation. Even without GB and SA terms, we obtain accuracies comparable to SCWRL4, a specialized side chain prediction program. In particular, we obtain a better RMSD than SCWRL4 for core residues (at a higher cost), despite our simpler rotamer library. Proteins 2016; 84:803-819. © 2016 Wiley Periodicals, Inc.
Affiliation(s)
- Thomas Gaillard
  - Department of Biology, Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, 91128, France
- Nicolas Panel
  - Department of Biology, Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, 91128, France
- Thomas Simonson
  - Department of Biology, Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, 91128, France
|
24
|
Kim H, Kihara D. Protein structure prediction using residue- and fragment-environment potentials in CASP11. Proteins 2015; 84 Suppl 1:105-17. [PMID: 26344195 DOI: 10.1002/prot.24920] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 05/27/2015] [Revised: 08/03/2015] [Accepted: 08/31/2015] [Indexed: 11/08/2022]
Abstract
An accurate scoring function that can select near-native structure models from a pool of alternative models is key for successful protein structure prediction. For the critical assessment of techniques for protein structure prediction (CASP) 11, we have built a protocol of protein structure prediction that has novel coarse-grained scoring functions for selecting decoys as the heart of its pipeline. The score named PRESCO (Protein Residue Environment SCOre) developed recently by our group evaluates the native-likeness of local structural environment of residues in a structure decoy considering positions and the depth of side-chains of spatially neighboring residues. We also introduced a helix interaction potential as an additional scoring function for selecting decoys. The best models selected by PRESCO and the helix interaction potential underwent structure refinement, which includes side-chain modeling and relaxation with a short molecular dynamics simulation. Our protocol was successful, achieving the top rank in the free modeling category with a significant margin of the accumulated Z-score to the subsequent groups when the top 1 models were considered. Proteins 2016; 84(Suppl 1):105-117. © 2015 Wiley Periodicals, Inc.
Affiliation(s)
- Hyungrae Kim
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47906
- Daisuke Kihara
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47906
  - Department of Computer Science, Purdue University, West Lafayette, Indiana, 47907
|
25
|
Peng X, Chenani A, Hu S, Zhou Y, Niemi AJ. A three dimensional visualisation approach to protein heavy-atom structure reconstruction. BMC Struct Biol 2014; 14:27. [PMID: 25551190 PMCID: PMC4302604 DOI: 10.1186/s12900-014-0027-8] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 08/01/2014] [Accepted: 12/16/2014] [Indexed: 11/10/2022]
Abstract
Background: A commonly recurring problem in structural protein studies is the determination of all heavy atom positions from knowledge of the central α-carbon coordinates. Results: We employ advances in virtual reality to address the problem. The outcome is a 3D visualisation-based technique in which all the heavy backbone and side chain atoms are treated on an equal footing, in terms of the Cα coordinates. Each heavy atom is visualised on the surface of a different two-sphere that is centered at another heavy backbone or side chain atom. In particular, the rotamers are visible as clusters that display a clear and strong dependence on the underlying backbone secondary structure. Conclusions: We demonstrate that there is a clear interdependence between rotameric states and secondary structure. Our method easily detects those atoms in a crystallographic protein structure which are either outliers or have likely been misplaced, possibly due to radiation damage. Our approach forms a basis for the development of a new generation of visualization-based side chain construction, validation and refinement tools. The heavy atom positions are identified in a manner which accounts for the secondary structure environment, leading to improved accuracy. Electronic supplementary material: The online version of this article (doi:10.1186/s12900-014-0027-8) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Xubiao Peng
  - Department of Physics and Astronomy, Uppsala University, Uppsala, Sweden
- Alireza Chenani
  - Department of Physics and Astronomy, Uppsala University, Uppsala, Sweden
- Shuangwei Hu
  - Department of Physics and Astronomy, Uppsala University, Uppsala, Sweden
- Yifan Zhou
  - Department of Biomedicine, Faculty of Medicine and Dentistry, University of Bergen, Jonas Lies Vei 91, NO-5009, Bergen, Norway
- Antti J Niemi
  - Department of Physics and Astronomy, Uppsala University, Uppsala, Sweden
  - Laboratoire de Mathematiques et Physique Theorique CNRS UMR 6083, Fédération Denis Poisson, Université de Tours, Parc de Grandmont, F37200, Tours, France
|