1
Heo L, Arbour CF, Janson G, Feig M. Improved Sampling Strategies for Protein Model Refinement Based on Molecular Dynamics Simulation. J Chem Theory Comput 2021; 17:1931-1943. [PMID: 33562962 DOI: 10.1021/acs.jctc.0c01238] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 11/30/2022]
Abstract
Protein structures provide valuable information for understanding biological processes. Protein structures can be determined by experimental methods such as X-ray crystallography, nuclear magnetic resonance spectroscopy, or cryogenic electron microscopy. As an alternative, in silico methods can be used to predict protein structures. These methods utilize protein structure databases for structure prediction via template-based modeling or for training machine-learning models to generate predictions. Structure prediction for proteins distant from proteins with known structures often results in lower accuracy with respect to the true physiological structures. Physics-based protein model refinement methods can be applied to improve the accuracy of the predicted models. Refinement methods rely on conformational sampling around the predicted structures, and if structures closer to the native states are sampled, improvements in the model quality become possible. Molecular dynamics simulations have been especially successful at improving model quality, but although consistent refinement can be achieved, the improvements are still moderate. To extend the refinement performance of a simulation-based protocol, we explored new schemes that focus on optimized use of biasing functions and the application of increased simulation temperatures. In addition, we tested the use of alternative initial models so that the simulations can explore the conformational space more broadly. Based on the insights of this analysis, we are proposing a new refinement protocol that significantly outperformed previous state-of-the-art molecular dynamics simulation-based protocols in the benchmark tests described here.
Affiliation(s)
- Lim Heo
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States
- Collin F Arbour
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States
- Giacomo Janson
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States
- Michael Feig
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States

2
Fine J, Konc J, Samudrala R, Chopra G. CANDOCK: Chemical Atomic Network-Based Hierarchical Flexible Docking Algorithm Using Generalized Statistical Potentials. J Chem Inf Model 2020; 60:1509-1527. [PMID: 32069042 DOI: 10.1021/acs.jcim.9b00686] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Indexed: 02/07/2023]
Abstract
Small-molecule docking has proven to be invaluable for drug design and discovery. However, existing docking methods have several limitations such as improper treatment of the interactions of essential components in the chemical environment of the binding pocket (e.g., cofactors, metal ions, etc.), incomplete sampling of chemically relevant ligand conformational space, and the inability to consistently correlate docking scores of the best binding pose with experimental binding affinities. We present CANDOCK, a novel docking algorithm, that utilizes a hierarchical approach to reconstruct ligands from an atomic grid using graph theory and generalized statistical potential functions to sample biologically relevant ligand conformations. Our algorithm accounts for protein flexibility, solvent, metal ions, and cofactor interactions in the binding pocket that are traditionally ignored by current methods. We evaluate the algorithm on the PDBbind, Astex, and PINC proteins to show its ability to reproduce the binding mode of the ligands that is independent of the initial ligand conformation in these benchmarks. Finally, we identify the best selector and ranker potential functions such that the statistical score of the best selected docked pose correlates with the experimental binding affinities of the ligands for any given protein target. Our results indicate that CANDOCK is a generalized flexible docking method that addresses several limitations of current docking methods by considering all interactions in the chemical environment of a binding pocket for correlating the best-docked pose with biological activity. CANDOCK along with all structures and scripts used for benchmarking is available at https://github.com/chopralab/candock_benchmark.
Affiliation(s)
- Jonathan Fine
  - Department of Chemistry, Purdue University, 720 Clinic Drive, West Lafayette, Indiana 47906, United States
- Janez Konc
  - National Institute of Chemistry, Hajdrihova 19, SI-1000, Ljubljana, Slovenia
- Ram Samudrala
  - Department of Biomedical Informatics, SUNY, Buffalo, New York 14260, United States
- Gaurav Chopra
  - Department of Chemistry, Purdue University, 720 Clinic Drive, West Lafayette, Indiana 47906, United States
  - Purdue Institute for Drug Discovery, West Lafayette, Indiana 47907, United States
  - Purdue Center for Cancer Research, West Lafayette, Indiana 47907, United States
  - Purdue Institute for Inflammation, Immunology and Infectious Disease, West Lafayette, Indiana 47907, United States
  - Purdue Institute for Integrative Neuroscience, West Lafayette, Indiana 47907, United States
  - Integrative Data Science Initiative, West Lafayette, Indiana 47907, United States

3
Badaczewska-Dawid AE, Kolinski A, Kmiecik S. Computational reconstruction of atomistic protein structures from coarse-grained models. Comput Struct Biotechnol J 2019; 18:162-176. [PMID: 31969975 PMCID: PMC6961067 DOI: 10.1016/j.csbj.2019.12.007] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Received: 12/10/2019] [Accepted: 12/10/2019] [Indexed: 01/02/2023] Open
Abstract
Three-dimensional protein structures, whether determined experimentally or theoretically, are often too low resolution. In this mini-review, we outline the computational methods for protein structure reconstruction from incomplete coarse-grained to all atomistic models. Typical reconstruction schemes can be divided into four major steps. Usually, the first step is reconstruction of the protein backbone chain starting from the C-alpha trace. This is followed by side-chains rebuilding based on protein backbone geometry. Subsequently, hydrogen atoms can be reconstructed. Finally, the resulting all-atom models may require structure optimization. Many methods are available to perform each of these tasks. We discuss the available tools and their potential applications in integrative modeling pipelines that can transfer coarse-grained information from computational predictions, or experiment, to all atomistic structures.
Affiliation(s)
- Sebastian Kmiecik
  - Faculty of Chemistry, Biological and Chemical Research Center, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland

4
Revisiting the "satisfaction of spatial restraints" approach of MODELLER for protein homology modeling. PLoS Comput Biol 2019; 15:e1007219. [PMID: 31846452 PMCID: PMC6938380 DOI: 10.1371/journal.pcbi.1007219] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 06/25/2019] [Revised: 12/31/2019] [Accepted: 11/13/2019] [Indexed: 01/02/2023] Open
Abstract
The most frequently used approach for protein structure prediction is currently homology modeling. The 3D model building phase of this methodology is critical for obtaining an accurate and biologically useful prediction. The most widely employed tool to perform this task is MODELLER. This program implements the “modeling by satisfaction of spatial restraints” strategy and its core algorithm has not been altered significantly since the early 1990s. In this work, we have explored the idea of modifying MODELLER with two effective, yet computationally light strategies to improve its 3D modeling performance. Firstly, we have investigated how the level of accuracy in the estimation of structural variability between a target protein and its templates, in the form of σ values, profoundly influences 3D modeling. We show that the σ values produced by MODELLER are on average weakly correlated to the true level of structural divergence between target-template pairs and that increasing this correlation greatly improves the program’s predictions, especially in multiple-template modeling. Secondly, we have inquired into how the incorporation of statistical potential terms (such as the DOPE potential) in MODELLER’s objective function positively impacts 3D modeling quality by providing a small but consistent improvement in metrics such as GDT-HA and lDDT and a large increase in stereochemical quality. Python modules to harness this second strategy are freely available at https://github.com/pymodproject/altmod. In summary, we show that there is considerable room for improving MODELLER in terms of 3D modeling quality and we propose strategies that could be pursued in order to further increase its performance.
Proteins are fundamental biological molecules that carry out countless activities in living beings. Since the function of proteins is dictated by their three-dimensional atomic structures, acquiring structural details of proteins provides deep insights into their function.
Currently, the most frequently used computational approach for protein structure prediction is template-based modeling. In this approach, a target protein is modeled using the experimentally-derived structural information of a template protein assumed to have a similar structure to the target. MODELLER is the most frequently used program for template-based 3D model building. Despite its success, its predictions are not always accurate enough to be useful in Biomedical Research. Here, we show that it is possible to greatly increase the performance of MODELLER by modifying two aspects of its algorithm. First, we demonstrate that providing the program with accurate estimations of local target-template structural divergence greatly increases the quality of its predictions. Additionally, we show that modifying MODELLER’s scoring function with statistical potential energetic terms also helps to improve modeling quality. This work will be useful in future research, since it reports practical strategies to improve the performance of this core tool in Structural Bioinformatics.

5
Methods for the Refinement of Protein Structure 3D Models. Int J Mol Sci 2019; 20:ijms20092301. [PMID: 31075942 PMCID: PMC6539982 DOI: 10.3390/ijms20092301] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Received: 04/02/2019] [Revised: 04/24/2019] [Accepted: 05/07/2019] [Indexed: 12/25/2022] Open
Abstract
The refinement of predicted 3D protein models is crucial in bringing them closer towards experimental accuracy for further computational studies. Refinement approaches can be divided into two main stages: the sampling and scoring stages. Sampling strategies, such as the popular Molecular Dynamics (MD)-based protocols, aim to generate improved 3D models. However, generating 3D models that are closer to the native structure than the initial model remains challenging, as structural deviations from the native basin can be encountered due to force-field inaccuracies. Therefore, different restraint strategies have been applied in order to avoid deviations away from the native structure. For example, the accurate prediction of local errors and/or contacts in the initial models can be used to guide restraints. MD-based protocols, using physics-based force fields and smart restraints, have made significant progress towards a more consistent refinement of 3D models. In the scoring stage, energy functions and Model Quality Assessment Programs (MQAPs) are used to discriminate near-native conformations from non-native conformations. Nevertheless, there are often very small differences among generated 3D models in refinement pipelines, which makes model discrimination and selection problematic. For this reason, the identification of the most native-like conformations remains a major challenge.

6
Keasar C, McGuffin LJ, Wallner B, Chopra G, Adhikari B, Bhattacharya D, Blake L, Bortot LO, Cao R, Dhanasekaran BK, Dimas I, Faccioli RA, Faraggi E, Ganzynkowicz R, Ghosh S, Ghosh S, Giełdoń A, Golon L, He Y, Heo L, Hou J, Khan M, Khatib F, Khoury GA, Kieslich C, Kim DE, Krupa P, Lee GR, Li H, Li J, Lipska A, Liwo A, Maghrabi AHA, Mirdita M, Mirzaei S, Mozolewska MA, Onel M, Ovchinnikov S, Shah A, Shah U, Sidi T, Sieradzan AK, Ślusarz M, Ślusarz R, Smadbeck J, Tamamis P, Trieber N, Wirecki T, Yin Y, Zhang Y, Bacardit J, Baranowski M, Chapman N, Cooper S, Defelicibus A, Flatten J, Koepnick B, Popović Z, Zaborowski B, Baker D, Cheng J, Czaplewski C, Delbem ACB, Floudas C, Kloczkowski A, Ołdziej S, Levitt M, Scheraga H, Seok C, Söding J, Vishveshwara S, Xu D, Crivelli SN. An analysis and evaluation of the WeFold collaborative for protein structure prediction and its pipelines in CASP11 and CASP12. Sci Rep 2018; 8:9939. [PMID: 29967418 PMCID: PMC6028396 DOI: 10.1038/s41598-018-26812-8] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 11/21/2017] [Accepted: 05/17/2018] [Indexed: 01/14/2023] Open
Abstract
Every two years, groups worldwide participate in the Critical Assessment of Protein Structure Prediction (CASP) experiment to blindly test the strengths and weaknesses of their computational methods. CASP has significantly advanced the field, but many hurdles still remain, which may require new ideas and collaborations. In 2012, a web-based effort called WeFold was initiated to promote collaboration within the CASP community and attract researchers from other fields to contribute new ideas to CASP. Members of the WeFold coopetition (cooperation and competition) participated in CASP as individual teams, but also shared components of their methods to create hybrid pipelines and actively contributed to this effort. We assert that the scale and diversity of integrative prediction pipelines could not have been achieved by any individual lab or even by any collaboration among a few partners. The models contributed by the participating groups and generated by the pipelines are publicly available at the WeFold website, providing a wealth of data that remains to be tapped. Here, we analyze the results of the 2014 and 2016 pipelines, showing improvements according to the CASP assessment as well as areas that require further adjustments and research.
Affiliation(s)
- Chen Keasar
  - Department of Computer Science, Ben Gurion University of the Negev, Be'er sheva, Israel
- Liam J McGuffin
  - Biomedical Sciences Division, School of Biological Sciences, University of Reading, Reading, RG6 6AS, UK
- Björn Wallner
  - Division of Bioinformatics, Department of Physics, Chemistry, and Biology, Linköping University, Linköping, Sweden
- Gaurav Chopra
  - Department of Chemistry, College of Science, Purdue University, West Lafayette, IN, USA
  - Purdue Institute for Drug Discovery, Purdue University, West Lafayette, IN, USA
  - Purdue Center for Cancer Research, Purdue University, West Lafayette, IN, USA
  - Purdue Institute for Inflammation, Immunology and Infectious Disease, Purdue University, West Lafayette, IN, USA
  - Purdue Institute for Integrative Neuroscience, Purdue University, West Lafayette, IN, USA
- Badri Adhikari
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA
- Debswapna Bhattacharya
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA
  - Department of Computer Science and Software Engineering, Auburn University, Auburn, AL, USA
- Lauren Blake
  - Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Leandro Oliveira Bortot
  - Laboratory of Biological Physics, Faculty of Pharmaceutical Sciences at Ribeirão Preto, University of São Paulo, São Paulo, Brazil
- Renzhi Cao
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA
- B K Dhanasekaran
  - Molecular Biophysics Unit and IISC Mathematics Initiative, Indian Institute of Science, Bangalore, India
- Itzhel Dimas
  - Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Eshel Faraggi
  - Research and Information Systems, LLC, Carmel, IN, USA
  - Department of Biochemistry and Molecular Biology, IU School of Medicine, Indianapolis, IN, USA
  - Batelle Center for Mathematical Medicine, The Research Institute at Nationwide Children's Hospital, Columbus, OH, USA
- Sambit Ghosh
  - Molecular Biophysics Unit and IISC Mathematics Initiative, Indian Institute of Science, Bangalore, India
- Soma Ghosh
  - Molecular Biophysics Unit and IISC Mathematics Initiative, Indian Institute of Science, Bangalore, India
- Artur Giełdoń
  - Faculty of Chemistry, University of Gdansk, Gdańsk, Poland
- Lukasz Golon
  - Faculty of Chemistry, University of Gdansk, Gdańsk, Poland
- Yi He
  - School of Engineering, University of California, Merced, CA, USA
- Lim Heo
  - Department of Chemistry, Seoul National University, Seoul, Republic of Korea
- Jie Hou
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA
- Main Khan
  - Department of Computer and Information Science, University of Massachusetts Dartmouth, MA, USA
- Firas Khatib
  - Department of Computer and Information Science, University of Massachusetts Dartmouth, MA, USA
- George A Khoury
  - Department of Chemical and Biological Engineering, Princeton University, Princeton, NJ, USA
- Chris Kieslich
  - Texas A&M Energy Institute, Texas A&M University, College Station, TX, USA
- David E Kim
  - Department of Biochemistry, University of Washington, Seattle, WA, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
- Pawel Krupa
  - Faculty of Chemistry, University of Gdansk, Gdańsk, Poland
- Gyu Rie Lee
  - Department of Chemistry, Seoul National University, Seoul, Republic of Korea
- Hongbo Li
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA
  - School of Computer Science and Information Technology, NorthEast Normal University, Changchun, China
  - Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO, USA
- Jilong Li
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA
- Adam Liwo
  - Faculty of Chemistry, University of Gdansk, Gdańsk, Poland
- Ali Hassan A Maghrabi
  - Biomedical Sciences Division, School of Biological Sciences, University of Reading, Reading, RG6 6AS, UK
- Milot Mirdita
  - Max Planck Institute for Biophysical Chemistry, Göttingen, Germany
- Shokoufeh Mirzaei
  - Lawrence Berkeley National Laboratory, Berkeley, CA, USA
  - California State Polytechnic University, Pomona, CA, USA
- Melis Onel
  - Artie McFerrin Department of Chemical Engineering, Texas A&M University, College Station, TX, USA
- Sergey Ovchinnikov
  - Department of Biochemistry, University of Washington, Seattle, WA, USA
  - Institute for Protein Design, University of Washington, Seattle, WA, USA
- Anand Shah
  - Department of Computer and Information Science, University of Massachusetts Dartmouth, MA, USA
- Utkarsh Shah
  - Artie McFerrin Department of Chemical Engineering, Texas A&M University, College Station, TX, USA
- Tomer Sidi
  - Department of Computer Science, Ben Gurion University of the Negev, Be'er sheva, Israel
- Rafal Ślusarz
  - Faculty of Chemistry, University of Gdansk, Gdańsk, Poland
- James Smadbeck
  - Department of Chemical and Biological Engineering, Princeton University, Princeton, NJ, USA
- Phanourios Tamamis
  - Texas A&M Energy Institute, Texas A&M University, College Station, TX, USA
  - Artie McFerrin Department of Chemical Engineering, Texas A&M University, College Station, TX, USA
- Nicholas Trieber
  - Department of Computer and Information Science, University of Massachusetts Dartmouth, MA, USA
- Tomasz Wirecki
  - Faculty of Chemistry, University of Gdansk, Gdańsk, Poland
- Yanping Yin
  - Baker Laboratory of Chemistry and Chemical Biology, Cornell University, Ithaca, NY, USA
- Yang Zhang
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Jaume Bacardit
  - Interdisciplinary Computing and Complex BioSystems (ICOS) research group, School of Computing, Newcastle University, Newcastle-upon-Tyne, UK
- Maciej Baranowski
  - Intercollegiate Faculty of Biotechnology, University of Gdańsk and Medical University of Gdańsk, Gdańsk, Poland
- Nicholas Chapman
  - Center for Game Science, Department of Computer Science & Engineering, University of Washington, Seattle, WA, USA
- Seth Cooper
  - College of Computer and Information Science, Northeastern University, Boston, MA, USA
- Alexandre Defelicibus
  - Institute of Mathematical and Computer Sciences, University of São Paulo, São Paulo, Brazil
- Jeff Flatten
  - Center for Game Science, Department of Computer Science & Engineering, University of Washington, Seattle, WA, USA
- Brian Koepnick
  - Department of Biochemistry, University of Washington, Seattle, WA, USA
- Zoran Popović
  - Center for Game Science, Department of Computer Science & Engineering, University of Washington, Seattle, WA, USA
- David Baker
  - Department of Biochemistry, University of Washington, Seattle, WA, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
  - Center for Game Science, Department of Computer Science & Engineering, University of Washington, Seattle, WA, USA
- Jianlin Cheng
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA
- Stanislaw Ołdziej
  - Intercollegiate Faculty of Biotechnology, University of Gdańsk and Medical University of Gdańsk, Gdańsk, Poland
- Michael Levitt
  - Department of Structural Biology, School of Medicine, Stanford University, Stanford, CA, USA
- Harold Scheraga
  - Baker Laboratory of Chemistry and Chemical Biology, Cornell University, Ithaca, NY, USA
- Chaok Seok
  - Department of Chemistry, Seoul National University, Seoul, Republic of Korea
- Johannes Söding
  - Max Planck Institute for Biophysical Chemistry, Göttingen, Germany
- Saraswathi Vishveshwara
  - Molecular Biophysics Unit and IISC Mathematics Initiative, Indian Institute of Science, Bangalore, India
- Dong Xu
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA
  - Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO, USA
- Silvia N Crivelli
  - Lawrence Berkeley National Laboratory, Berkeley, CA, USA
  - Department of Computer Science, University of California, Davis, CA, USA

7
Chopra G, Samudrala R. Exploring Polypharmacology in Drug Discovery and Repurposing Using the CANDO Platform. Curr Pharm Des 2017; 22:3109-23. [PMID: 27013226 DOI: 10.2174/1381612822666160325121943] [Citation(s) in RCA: 42] [Impact Index Per Article: 6.0] [Received: 01/06/2016] [Accepted: 03/01/2015] [Indexed: 01/05/2023]
Abstract
BACKGROUND: Traditional drug discovery approaches focus on a limited set of target molecules for treatment against specific indications/diseases. However, drug absorption, distribution, metabolism, and excretion (ADME) involve interactions with multiple protein systems. Drugs approved for particular indication(s) may be repurposed as novel therapeutics for others. The severely declining rate of discovery and increasing costs of new drugs illustrate the limitations of the traditional reductionist paradigm in drug discovery.
METHODS: We developed the Computational Analysis of Novel Drug Opportunities (CANDO) platform based on a hypothesis that drugs function by interacting with multiple protein targets to create a molecular interaction signature that can be exploited for therapeutic repurposing and discovery. We compiled a library of compounds that are human ingestible with minimal side effects, followed by an 'all-compounds' vs 'all-proteins' fragment-based multitarget docking with dynamics screen to construct compound-proteome interaction matrices that were then analyzed to determine similarity of drug behavior. The proteomic signature similarity of drugs is then ranked to make putative drug predictions for all indications in a shotgun manner.
RESULTS: We have previously applied this platform with success in both retrospective benchmarking and prospective validation, and to understand the effect of druggable protein classes on repurposing accuracy. Here we use the CANDO platform to analyze and determine the contribution of multitargeting (polypharmacology) to drug repurposing benchmarking accuracy. Taken together with the previous work, our results indicate that a large number of protein structures with diverse fold space and a specific polypharmacological interactome are necessary for accurate drug predictions using our proteomic and evolutionary drug discovery and repurposing platform.
CONCLUSION: These results have implications for future drug development and repurposing in the context of polypharmacology.
Affiliation(s)
- Gaurav Chopra
  - Department of Chemistry, Purdue University, West Lafayette, IN, USA
- Ram Samudrala
  - Department of Biomedical Informatics, SUNY, Buffalo, NY, USA

8
GalaxyDock BP2 score: a hybrid scoring function for accurate protein–ligand docking. J Comput Aided Mol Des 2017. [DOI: 10.1007/s10822-017-0030-9] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Indexed: 12/26/2022]

9
Combating Ebola with Repurposed Therapeutics Using the CANDO Platform. Molecules 2016; 21:molecules21121537. [PMID: 27898018 PMCID: PMC5958544 DOI: 10.3390/molecules21121537] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.8] [Received: 07/29/2016] [Revised: 10/23/2016] [Accepted: 10/28/2016] [Indexed: 12/20/2022] Open
Abstract
Ebola virus disease (EVD) is extremely virulent with an estimated mortality rate of up to 90%. However, the state-of-the-art treatment for EVD is limited to quarantine and supportive care. The 2014 Ebola epidemic in West Africa, the largest in history, is believed to have caused more than 11,000 fatalities. The countries worst affected are also among the poorest in the world. Given the complexities, time, and resources required for a novel drug development, finding efficient drug discovery pathways is going to be crucial in the fight against future outbreaks. We have developed a Computational Analysis of Novel Drug Opportunities (CANDO) platform based on the hypothesis that drugs function by interacting with multiple protein targets to create a molecular interaction signature that can be exploited for rapid therapeutic repurposing and discovery. We used the CANDO platform to identify and rank FDA-approved drug candidates that bind and inhibit all proteins encoded by the genomes of five different Ebola virus strains. Top ranking drug candidates for EVD treatment generated by CANDO were compared to in vitro screening studies against Ebola virus-like particles (VLPs) by Kouznetsova et al. and genetically engineered Ebola virus and cell viability studies by Johansen et al. to identify drug overlaps between the in virtuale and in vitro studies as putative treatments for future EVD outbreaks. Our results indicate that integrating computational docking predictions on a proteomic scale with results from in vitro screening studies may be used to select and prioritize compounds for further in vivo and clinical testing. This approach will significantly reduce the lead time, risk, cost, and resources required to determine efficacious therapies against future EVD outbreaks.

10
Computational Refinement and Validation Protocol for Proteins with Large Variable Regions Applied to Model HIV Env Spike in CD4 and 17b Bound State. Structure 2016; 23:1138-49. [PMID: 26039348 DOI: 10.1016/j.str.2015.03.026] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 08/25/2014] [Revised: 03/11/2015] [Accepted: 03/13/2015] [Indexed: 12/28/2022]
Abstract
Envelope glycoprotein gp120 of HIV-1 possesses several variable regions; their precise structure has been difficult to establish. We report a new model of gp120, in complex with antibodies CD4 and 17b, complete with its variable regions. The model was produced by a computational protocol that uses cryo-electron microscopy (EM) maps, atomic-resolution structures of the core, and information on binding interactions. Our model has excellent fit with EMD: 5020, is stereochemically and energetically favorable, and has the expected binding interfaces. Comparison of the ternary arrangement of the loops in this model with those bound to PGT122 and PGV04 suggested a possible motion of the V1V2 away from the CCR5 binding site and toward CD4. Our study also revealed that the CD4-bound state of the V1V2 loop is not optimal for gp120 bound with several neutralizing antibodies.

11
Sethi G, Chopra G, Samudrala R. Multiscale modelling of relationships between protein classes and drug behavior across all diseases using the CANDO platform. Mini Rev Med Chem 2016; 15:705-17. [PMID: 25694071 DOI: 10.2174/1389557515666150219145148] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Received: 09/29/2014] [Revised: 10/30/2014] [Accepted: 11/25/2014] [Indexed: 01/27/2023]
Abstract
We have examined the effect of eight different protein classes (channels, GPCRs, kinases, ligases, nuclear receptors, proteases, phosphatases, transporters) on the benchmarking performance of the CANDO drug discovery and repurposing platform (http://protinfo.org/cando). The first version of the CANDO platform utilizes a matrix of predicted interactions between 48278 proteins and 3733 human ingestible compounds (including FDA approved drugs and supplements) that map to 2030 indications/diseases using a hierarchical chem and bio-informatic fragment based docking with dynamics protocol (> one billion predicted interactions considered). The platform uses similarity of compound-proteome interaction signatures as indicative of similar functional behavior, and benchmarking accuracy is calculated across 1439 indications/diseases with more than one approved drug. The CANDO platform yields a significant correlation (0.99, p-value < 0.0001) between the number of proteins considered and benchmarking accuracy obtained, indicating the importance of multitargeting for drug discovery. Average benchmarking accuracies range from 6.2% to 7.6% for the eight classes when the top 10 ranked compounds are considered, in contrast to a range of 5.5% to 11.7% obtained for the comparison/control sets consisting of 10, 100, 1000, and 10000 single best performing proteins. These results are generally two orders of magnitude better than the average accuracy of 0.2% obtained when randomly generated (fully scrambled) matrices are used. Different indications perform well when different classes are used, but the best accuracies (up to 11.7% for the top 10 ranked compounds) are achieved when a combination of classes containing the broadest distribution of protein folds is used. Our results illustrate the utility of the CANDO approach and the consideration of different protein classes for devising indication specific protocols for drug repurposing as well as drug discovery.
Affiliation(s)
- Ram Samudrala
  - Department of Biomedical Informatics, School of Medicine and Biomedical Sciences, State University of New York (SUNY), 923 Main Street, Buffalo, NY 14203, USA.
|
12
|
Carlsen M, Koehl P, Røgen P. On the importance of the distance measures used to train and test knowledge-based potentials for proteins. PLoS One 2014; 9:e109335. [PMID: 25411785 PMCID: PMC4239004 DOI: 10.1371/journal.pone.0109335] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 05/18/2014] [Accepted: 08/31/2014] [Indexed: 12/15/2022] Open
Abstract
Knowledge-based potentials are energy functions derived from the analysis of databases of protein structures and sequences. They can be divided into two classes. Potentials from the first class are based on a direct conversion of the distributions of some geometric properties observed in native protein structures into energy values, while potentials from the second class are trained to mimic quantitatively the geometric differences between incorrectly folded models and native structures. In this paper, we focus on the relationship between energy and geometry when training the second class of knowledge-based potentials. We assume that the difference in energy between a decoy structure and the corresponding native structure is linearly related to the distance between the two structures. We trained two distance-based knowledge-based potentials accordingly, one based on all inter-residue distances (PPD), while the other had the set of all distances filtered to reflect consistency in an ensemble of decoys (PPE). We tested four types of metric to characterize the distance between the decoy and the native structure, two based on extrinsic geometry (RMSD and GDT-TS*), and two based on intrinsic geometry (Q* and MT). The corresponding eight potentials were tested on a large collection of decoy sets. We found that it is usually better to train a potential using an intrinsic distance measure. We also found that PPE outperforms PPD, emphasizing the benefits of capturing consistent information in an ensemble. The relevance of these results for the design of knowledge-based potentials is discussed.
Affiliation(s)
- Martin Carlsen
  - Department of Applied Mathematics and Computer Science, Technical University of Denmark, Kongens Lyngby, Denmark
- Patrice Koehl
  - Department of Computer Science and Genome Center, University of California Davis, Davis, CA, United States of America
- Peter Røgen
  - Department of Applied Mathematics and Computer Science, Technical University of Denmark, Kongens Lyngby, Denmark
|
13
|
Ryu H, Kim TR, Ahn S, Ji S, Lee J. Protein NMR structures refined without NOE data. PLoS One 2014; 9:e108888. [PMID: 25279564 PMCID: PMC4184813 DOI: 10.1371/journal.pone.0108888] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 07/24/2014] [Accepted: 09/04/2014] [Indexed: 12/31/2022] Open
Abstract
The refinement of low-quality structures is an important challenge in protein structure prediction. Many studies have been conducted on protein structure refinement; the refinement of structures derived from NMR spectroscopy has been studied especially intensively. In this study, we generated a flat-bottom distance potential to use instead of NOE data, because NOE data are ambiguous and uncertain. The potential was derived from distance information in the given structures and prevented structural dislocation during the refinement process. A simulated annealing protocol was used to minimize the potential energy of the structure. The protocol was tested on 134 NMR structures in the Protein Data Bank (PDB) that also have X-ray structures. Among them, 50 structures were used as a training set to find the optimal "width" parameter in the flat-bottom distance potential functions. In the validation set (the other 84 structures), most of the 12 quality assessment scores of the refined structures were significantly improved (total score increased from 1.215 to 2.044). Moreover, the secondary structure similarity of the refined structure was improved over that of the original structure. Finally, we demonstrate that the combination of two energy potentials, the statistical torsion angle potential (STAP) and the flat-bottom distance potential, can drive the refinement of NMR structures.
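A flat-bottom distance potential of the kind mentioned in this abstract can be sketched as follows. The abstract gives only the "width" parameter, so the harmonic walls outside the flat region, the force constant `k`, the all-pairs restraint list, and the function names are assumptions made for illustration, not the authors' functional form.

```python
import math

def flat_bottom_energy(d, d_ref, width, k=1.0):
    """Zero penalty while |d - d_ref| <= width; harmonic penalty beyond it
    (the harmonic wall is an assumed form)."""
    excess = abs(d - d_ref) - width
    return k * excess ** 2 if excess > 0.0 else 0.0

def restraint_energy(coords, ref_coords, width, k=1.0):
    """Sum the flat-bottom penalty over all atom pairs, taking the reference
    distances from the given (input) structure."""
    total = 0.0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = math.dist(coords[i], coords[j])
            d_ref = math.dist(ref_coords[i], ref_coords[j])
            total += flat_bottom_energy(d, d_ref, width, k)
    return total

# Toy example: two atoms whose separation drifts from 3 to 5 distance units.
reference = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
moved = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
penalty = restraint_energy(moved, reference, width=1.0)
```

A wider flat region lets the structure move further from the input before it is penalised, which is why the width is the natural parameter to tune on a training set.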
Affiliation(s)
- Hyojung Ryu
  - Korean Bioinformation Center (KOBIC), Korea Research Institute of Bioscience and Biotechnology, Daejeon, The Republic of Korea
  - Department of Bioinformatics, University of Science and Technology, Daejeon, The Republic of Korea
- Tae-Rae Kim
  - Department of Chemistry, Seoul National University, Seoul, The Republic of Korea
- SeonJoo Ahn
  - Korean Bioinformation Center (KOBIC), Korea Research Institute of Bioscience and Biotechnology, Daejeon, The Republic of Korea
- Sunyoung Ji
  - Korean Bioinformation Center (KOBIC), Korea Research Institute of Bioscience and Biotechnology, Daejeon, The Republic of Korea
  - Department of Bioinformatics, University of Science and Technology, Daejeon, The Republic of Korea
- Jinhyuk Lee
  - Korean Bioinformation Center (KOBIC), Korea Research Institute of Bioscience and Biotechnology, Daejeon, The Republic of Korea
  - Department of Bioinformatics, University of Science and Technology, Daejeon, The Republic of Korea
|
14
|
Khoury GA, Liwo A, Khatib F, Zhou H, Chopra G, Bacardit J, Bortot LO, Faccioli RA, Deng X, He Y, Krupa P, Li J, Mozolewska MA, Sieradzan AK, Smadbeck J, Wirecki T, Cooper S, Flatten J, Xu K, Baker D, Cheng J, Delbem ACB, Floudas CA, Keasar C, Levitt M, Popović Z, Scheraga HA, Skolnick J, Crivelli SN, Players F. WeFold: a coopetition for protein structure prediction. Proteins 2014; 82:1850-68. [PMID: 24677212 PMCID: PMC4249725 DOI: 10.1002/prot.24538] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.2] [Received: 08/23/2013] [Revised: 01/25/2014] [Accepted: 02/08/2014] [Indexed: 12/19/2022]
Abstract
The protein structure prediction problem continues to elude scientists. Despite the introduction of many methods, only modest gains were made over the last decade for certain classes of prediction targets. To address this challenge, a social-media-based worldwide collaborative effort, named WeFold, was undertaken by 13 labs. During the collaboration, the laboratories were simultaneously competing with each other. Here, we present the first attempt at "coopetition" in scientific research applied to the protein structure prediction and refinement problems. The coopetition was made possible by allowing the participating labs to contribute different components of their protein structure prediction pipelines and create new hybrid pipelines that they tested during CASP10. This manuscript describes both successes and areas needing improvement as identified throughout the first WeFold experiment and discusses the efforts that are underway to advance this initiative. A footprint of all contributions and structures is publicly accessible at http://www.wefold.org.
Affiliation(s)
- George A. Khoury
  - Department of Chemical and Biological Engineering, Princeton University, USA
- Adam Liwo
  - Faculty of Chemistry, University of Gdansk, Poland
- Firas Khatib
  - Department of Biochemistry, University of Washington, USA
- Hongyi Zhou
  - Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, USA
- Gaurav Chopra
  - Department of Structural Biology, School of Medicine, Stanford University, USA
  - Diabetes Center, School of Medicine, University of California San Francisco (UCSF), USA
- Jaume Bacardit
  - School of Computing Science, Newcastle University, United Kingdom
- Leandro O. Bortot
  - Laboratory of Biological Physics, Faculty of Pharmaceutical Sciences at Ribeirão Preto, University of São Paulo, Brazil
- Rodrigo A. Faccioli
  - Institute of Mathematical and Computer Sciences, University of São Paulo, Brazil
- Xin Deng
  - Department of Computer Science, University of Missouri, USA
- Yi He
  - Baker Laboratory of Chemistry and Chemical Biology, Cornell University, Ithaca, NY 14853-1301, USA
- Pawel Krupa
  - Faculty of Chemistry, University of Gdansk, Poland
  - Baker Laboratory of Chemistry and Chemical Biology, Cornell University, Ithaca, NY 14853-1301, USA
- Jilong Li
  - Department of Computer Science, University of Missouri, USA
- Magdalena A. Mozolewska
  - Faculty of Chemistry, University of Gdansk, Poland
  - Baker Laboratory of Chemistry and Chemical Biology, Cornell University, Ithaca, NY 14853-1301, USA
- James Smadbeck
  - Department of Chemical and Biological Engineering, Princeton University, USA
- Tomasz Wirecki
  - Faculty of Chemistry, University of Gdansk, Poland
  - Baker Laboratory of Chemistry and Chemical Biology, Cornell University, Ithaca, NY 14853-1301, USA
- Seth Cooper
  - Center for Game Science, Department of Computer Science & Engineering, University of Washington, USA
- Jeff Flatten
  - Center for Game Science, Department of Computer Science & Engineering, University of Washington, USA
- Kefan Xu
  - Center for Game Science, Department of Computer Science & Engineering, University of Washington, USA
- David Baker
  - Department of Biochemistry, University of Washington, USA
- Jianlin Cheng
  - Department of Computer Science, University of Missouri, USA
- Chen Keasar
  - Departments of Computer Science and Life Sciences, Ben Gurion University of the Negev, Israel
- Michael Levitt
  - Department of Structural Biology, School of Medicine, Stanford University, USA
- Zoran Popović
  - Center for Game Science, Department of Computer Science & Engineering, University of Washington, USA
- Harold A. Scheraga
  - Baker Laboratory of Chemistry and Chemical Biology, Cornell University, Ithaca, NY 14853-1301, USA
- Jeffrey Skolnick
  - Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, USA
|
15
|
Dirks-Hofmeister ME, Singh R, Leufken CM, Inlow JK, Moerschbacher BM. Structural diversity in the dandelion (Taraxacum officinale) polyphenol oxidase family results in different responses to model substrates. PLoS One 2014; 9:e99759. [PMID: 24918587 PMCID: PMC4053514 DOI: 10.1371/journal.pone.0099759] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Received: 02/03/2014] [Accepted: 05/19/2014] [Indexed: 11/19/2022] Open
Abstract
Polyphenol oxidases (PPOs) are ubiquitous type-3 copper enzymes that catalyze the oxygen-dependent conversion of o-diphenols to the corresponding quinones. In most plants, PPOs are present as multiple isoenzymes that probably serve distinct functions, although the precise relationship between sequence, structure and function has not been addressed in detail. We therefore compared the characteristics and activities of recombinant dandelion PPOs to gain insight into the structure-function relationships within the plant PPO family. Phylogenetic analysis resolved the 11 isoenzymes of dandelion into two evolutionary groups. More detailed in silico and in vitro analyses of four representative PPOs covering both phylogenetic groups were performed. Molecular modeling and docking predicted differences in enzyme-substrate interactions, providing a structure-based explanation for the grouping. One amino acid side chain positioned at the entrance to the active site (position HB2+1) potentially acts as a "selector" for substrate binding. In vitro activity measurements with the recombinant, purified enzymes also revealed group-specific differences in kinetic parameters when the selected PPOs were presented with five model substrates. The combination of our enzyme kinetic measurements and the in silico docking studies therefore indicates that the physiological functions of individual PPOs might be defined by their specific interactions with different natural substrates.
Affiliation(s)
- Ratna Singh
  - Department of Plant Biology and Biotechnology, Westphalian Wilhelms-University of Münster, Münster, Germany
- Christine M. Leufken
  - Department of Plant Biology and Biotechnology, Westphalian Wilhelms-University of Münster, Münster, Germany
- Jennifer K. Inlow
  - Department of Chemistry and Physics, Indiana State University, Terre Haute, Indiana, United States of America
- Bruno M. Moerschbacher
  - Department of Plant Biology and Biotechnology, Westphalian Wilhelms-University of Münster, Münster, Germany
|
16
|
Padilla-Sanchez V, Gao S, Kim HR, Kihara D, Sun L, Rossmann MG, Rao VB. Structure-function analysis of the DNA translocating portal of the bacteriophage T4 packaging machine. J Mol Biol 2013; 426:1019-38. [PMID: 24126213 DOI: 10.1016/j.jmb.2013.10.011] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Received: 07/24/2013] [Revised: 09/17/2013] [Accepted: 10/08/2013] [Indexed: 12/20/2022]
Abstract
Tailed bacteriophages and herpesviruses consist of a structurally well conserved dodecameric portal at a special 5-fold vertex of the capsid. The portal plays critical roles in head assembly, genome packaging, neck/tail attachment, and genome ejection. Although the structures of portals from phages φ29, SPP1, and P22 have been determined, their mechanistic roles have not been well understood. Structural analysis of phage T4 portal (gp20) has been hampered because of its unusual interaction with the Escherichia coli inner membrane. Here, we predict atomic models for the T4 portal monomer and dodecamer, and we fit the dodecamer into the cryo-electron microscopy density of the phage portal vertex. The core structure, like that from other phages, is cone shaped with the wider end containing the "wing" and "crown" domains inside the phage head. A long "stem" encloses a central channel, and a narrow "stalk" protrudes outside the capsid. A biochemical approach was developed to analyze portal function by incorporating plasmid-expressed portal protein into phage heads and determining the effect of mutations on head assembly, DNA translocation, and virion production. We found that the protruding loops of the stalk domain are involved in assembling the DNA packaging motor. A loop that connects the stalk to the channel might be required for communication between the motor and the portal. The "tunnel" loops that project into the channel are essential for sealing the packaged head. These studies established that the portal is required throughout the DNA packaging process, with different domains participating at different stages of genome packaging.
Affiliation(s)
- Victor Padilla-Sanchez
  - Department of Biology, The Catholic University of America, 620 Michigan Avenue Northeast, Washington, DC 20064, USA
- Song Gao
  - Department of Biology, The Catholic University of America, 620 Michigan Avenue Northeast, Washington, DC 20064, USA; Marine Drug Research Institute, Huaihai Institute of Technology, Lianyungang, Jiangsu 222001, China
- Hyung Rae Kim
  - Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA
- Daisuke Kihara
  - Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA; Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA
- Lei Sun
  - Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA
- Michael G Rossmann
  - Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA
- Venigalla B Rao
  - Department of Biology, The Catholic University of America, 620 Michigan Avenue Northeast, Washington, DC 20064, USA.
|
17
|
Mirjalili V, Feig M. Protein Structure Refinement through Structure Selection and Averaging from Molecular Dynamics Ensembles. J Chem Theory Comput 2013; 9:1294-1303. [PMID: 23526422 PMCID: PMC3603382 DOI: 10.1021/ct300962x] [Citation(s) in RCA: 68] [Impact Index Per Article: 6.2] [Indexed: 01/06/2023]
Abstract
A molecular dynamics (MD) simulation based protocol for structure refinement of template-based model predictions is described. The protocol involves the application of restraints, ensemble averaging of selected subsets, interpolation between initial and refined structures, and assessment of refinement success. It is found that sub-microsecond MD-based sampling when combined with ensemble averaging can produce moderate but consistent refinement for most systems in the CASP targets considered here.
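Two of the steps named in this abstract, ensemble averaging of a selected subset and interpolation between the initial and refined structures, can be sketched in a few lines. This is only an illustration under stated assumptions: the snapshots are assumed to be already superimposed on a common frame, the selection criterion (lowest value of some externally supplied score) and the function names are hypothetical, and no restraints or quality assessment are modelled.

```python
def select_subset(frames, scores, n_keep):
    """Keep the n_keep frames with the lowest score (assumed selection rule)."""
    order = sorted(range(len(frames)), key=lambda i: scores[i])
    return [frames[i] for i in order[:n_keep]]

def average_structure(frames):
    """Average Cartesian coordinates atom by atom over pre-aligned frames."""
    n_frames = len(frames)
    n_atoms = len(frames[0])
    return [tuple(sum(frame[a][k] for frame in frames) / n_frames for k in range(3))
            for a in range(n_atoms)]

def interpolate(initial, refined, alpha):
    """Linear blend: alpha = 0 returns the initial model, alpha = 1 the refined one."""
    return [tuple((1.0 - alpha) * p + alpha * q for p, q in zip(atom_i, atom_r))
            for atom_i, atom_r in zip(initial, refined)]

# Toy trajectory: three two-atom snapshots with made-up scores.
frames = [
    [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
    [(2.0, 2.0, 2.0), (3.0, 3.0, 3.0)],
    [(9.0, 9.0, 9.0), (9.0, 9.0, 9.0)],
]
scores = [0.1, 0.2, 5.0]
averaged = average_structure(select_subset(frames, scores, n_keep=2))
blended = interpolate(frames[0], averaged, alpha=0.5)
```

Averaging Cartesian coordinates can distort local geometry, so a structure produced this way would normally need a short energy minimisation before use; that step is outside this sketch.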
Affiliation(s)
- Vahid Mirjalili
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
  - Department of Mechanical Engineering, Michigan State University, East Lansing, MI 48824, USA
- Michael Feig
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
  - Department of Chemistry, Michigan State University, East Lansing, MI 48824, USA
|
18
|
Bhattacharya D, Cheng J. 3Drefine: consistent protein structure refinement by optimizing hydrogen bonding network and atomic-level energy minimization. Proteins 2013; 81:119-31. [PMID: 22927229 PMCID: PMC3634918 DOI: 10.1002/prot.24167] [Citation(s) in RCA: 122] [Impact Index Per Article: 11.1] [Received: 05/02/2012] [Revised: 07/26/2012] [Accepted: 08/17/2012] [Indexed: 12/27/2022]
Abstract
One of the major limitations of computational protein structure prediction is the deviation of predicted models from their experimentally derived true, native structures. This limitation often hinders the application of computational protein structure prediction methods to biochemical assignment and drug design, which are very sensitive to structural details. Refinement of these low-resolution predicted models to high-resolution structures close to the native state, however, has proven to be extremely challenging. Thus, protein structure refinement remains a largely unsolved problem. The Critical Assessment of Techniques for Protein Structure Prediction (CASP) specifically indicated that most predictors participating in the refinement category still did not consistently improve model quality. Here, we propose a two-step refinement protocol, called 3Drefine, to consistently bring the initial model closer to the native structure. The first step is based on optimization of the hydrogen bonding (HB) network, and the second step applies atomic-level energy minimization to the optimized model using a composite of physics- and knowledge-based force fields. The approach has been evaluated on the CASP benchmark data, and it exhibits consistent improvement over the initial structure in both global and local structural quality measures. The 3Drefine method is also computationally inexpensive, consuming only a few minutes of CPU time to refine a protein of typical length (300 residues). The 3Drefine web server is freely available at http://sysbio.rnet.missouri.edu/3Drefine/.
Affiliation(s)
- Jianlin Cheng
  - Department of Computer Science, University of Missouri, Columbia, MO 65211, USA
  - Informatics Institute, University of Missouri, Columbia, MO 65211, USA
  - Bond Life Science Center, University of Missouri, Columbia, MO 65211, USA
|
19
|
Rodrigues JPGLM, Levitt M, Chopra G. KoBaMIN: a knowledge-based minimization web server for protein structure refinement. Nucleic Acids Res 2012; 40:W323-8. [PMID: 22564897 PMCID: PMC3394243 DOI: 10.1093/nar/gks376] [Citation(s) in RCA: 109] [Impact Index Per Article: 9.1] [Indexed: 11/14/2022] Open
Abstract
The KoBaMIN web server provides an online interface to a simple, consistent and computationally efficient protein structure refinement protocol based on minimization of a knowledge-based potential of mean force. The server can be used to refine either a single protein structure or an ensemble of proteins starting from their unrefined coordinates in PDB format. The refinement method is particularly fast and accurate due to the underlying knowledge-based potential derived from structures deposited in the PDB; as such, the energy function implicitly includes the effects of solvent and the crystal environment. Our server allows for an optional but recommended step that optimizes stereochemistry using the MESHI software. The KoBaMIN server also allows comparison of the refined structures with a provided reference structure to assess the changes brought about by the refinement protocol. The performance of KoBaMIN has been benchmarked widely on a large set of decoys, all models generated at the seventh worldwide experiments on critical assessment of techniques for protein structure prediction (CASP7) and it was also shown to produce top-ranking predictions in the refinement category at both CASP8 and CASP9, yielding consistently good results across a broad range of model quality values. The web server is fully functional and freely available at http://csb.stanford.edu/kobamin.
Affiliation(s)
- João P G L M Rodrigues
  - Department of Structural Biology, 299 Campus Dr W, Fairchild Bldg, Room D100, Stanford University, Stanford, CA 94305, USA
|
20
|
Gront D, Kmiecik S, Blaszczyk M, Ekonomiuk D, Koliński A. Optimization of protein models. Wiley Interdiscip Rev Comput Mol Sci 2012. [DOI: 10.1002/wcms.1090] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Indexed: 12/25/2022]
Affiliation(s)
- Dominik Gront
  - Laboratory of Theory of Biopolymers, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
- Sebastian Kmiecik
  - Laboratory of Theory of Biopolymers, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
- Maciej Blaszczyk
  - Laboratory of Theory of Biopolymers, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
- Dariusz Ekonomiuk
  - Laboratory of Theory of Biopolymers, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
- Andrzej Koliński
  - Laboratory of Theory of Biopolymers, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
|
21
|
Abstract
Accurate all-atom energy functions are crucial for successful high-resolution protein structure prediction. In this chapter, we review both physics-based force fields and knowledge-based potentials used in protein modeling. Because it is important to calculate the energy as accurately as possible given the limitations imposed by sampling convergence, different components of the energy, and force fields representing them to varying degrees of detail and complexity are discussed. Force fields using Cartesian as well as torsion angle representations of protein geometry are covered. Since solvent is important for protein energetics, different aqueous and membrane solvation models for protein simulations are also described. Finally, we summarize recent progress in protein structure refinement using new force fields.
|
22
|
Bellay J, Michaut M, Kim T, Han S, Colak R, Myers CL, Kim PM. An omics perspective of protein disorder. Mol Biosyst 2012; 8:185-93. [DOI: 10.1039/c1mb05235g] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Indexed: 11/21/2022]
|
23
|
Haddadian EJ, Gong H, Jha AK, Yang X, Debartolo J, Hinshaw JR, Rice PA, Sosnick TR, Freed KF. Automated real-space refinement of protein structures using a realistic backbone move set. Biophys J 2011; 101:899-909. [PMID: 21843481 DOI: 10.1016/j.bpj.2011.06.063] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.9] [Received: 05/26/2011] [Revised: 06/23/2011] [Accepted: 06/28/2011] [Indexed: 11/26/2022] Open
Abstract
Crystals of many important biological macromolecules diffract to limited resolution, rendering accurate model building and refinement difficult and time-consuming. We present a torsional optimization protocol that is applicable to many such situations and combines Protein Data Bank-based torsional optimization with real-space refinement against the electron density derived from crystallography or cryo-electron microscopy. Our method converts moderate- to low-resolution structures at initial (e.g., backbone trace only) or late stages of refinement to structures with increased numbers of hydrogen bonds, improved crystallographic R-factors, and superior backbone geometry. This automated method is applicable to DNA-binding and membrane proteins of any size and will aid studies of structural biology by improving model quality and saving considerable effort. The method can be extended to improve NMR and other structures. Our backbone score and its sequence profile provide an additional standard tool for evaluating structural quality.
Affiliation(s)
- Esmael J Haddadian
  - Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, Illinois, USA
|
24
|
Bernauer J, Huang X, Sim AYL, Levitt M. Fully differentiable coarse-grained and all-atom knowledge-based potentials for RNA structure evaluation. RNA 2011; 17:1066-1075. [PMID: 21521828 PMCID: PMC3096039 DOI: 10.1261/rna.2543711] [Citation(s) in RCA: 57] [Impact Index Per Article: 4.4] [Received: 11/15/2010] [Accepted: 03/01/2011] [Indexed: 05/27/2023]
Abstract
RNA molecules play integral roles in gene regulation, and understanding their structures gives us important insights into their biological functions. Despite recent developments in template-based and parameterized energy functions, the structure of RNA, in particular the nonhelical regions, is still difficult to predict. Knowledge-based potentials have proven efficient in protein structure prediction. In this work, we describe two differentiable knowledge-based potentials derived from a curated data set of RNA structures, with all-atom or coarse-grained representation, respectively. We focus on one aspect of the prediction problem: the identification of native-like RNA conformations from a set of near-native models. Using a variety of near-native RNA models generated from three independent methods, we show that our potential is able to distinguish the native structure and identify native-like conformations, even at the coarse-grained level. The all-atom version of our knowledge-based potential performs better and appears to be more effective at discriminating near-native RNA conformations than one of the most highly regarded parameterized potentials. The fully differentiable form of our potentials will additionally likely be useful for structure refinement and/or molecular dynamics simulations.
Affiliation(s)
- Julie Bernauer
  - INRIA AMIB Bioinformatique, Laboratoire d'Informatique (LIX), Ecole Polytechnique, 91128 Palaiseau, France.
|