601
Olechnovič K, Venclovas Č. VoroMQA: Assessment of protein structure quality using interatomic contact areas. Proteins 2017; 85:1131-1145. [DOI: 10.1002/prot.25278] [Citation(s) in RCA: 121] [Impact Index Per Article: 15.1] [Received: 11/18/2016] [Revised: 01/13/2017] [Accepted: 02/21/2017] [Indexed: 12/14/2022]
Affiliation(s)
- Kliment Olechnovič
  - Institute of Biotechnology, Vilnius University; Saulėtekio 7 LT-10257 Vilnius Lithuania
  - Faculty of Mathematics and Informatics; Vilnius University; Naugarduko 24 LT-03225 Vilnius Lithuania
- Česlovas Venclovas
  - Institute of Biotechnology, Vilnius University; Saulėtekio 7 LT-10257 Vilnius Lithuania
602
Abstract
The development of automated servers to predict the three-dimensional structure of proteins has seen much progress over the years. These servers make calculations simpler, but largely exclude users from the process. In this study, we present the PRotein Interactive MOdeling (PRIMO) pipeline for homology modeling of protein monomers. The pipeline eases the multi-step modeling process, and reduces the workload required by the user, while still allowing engagement from the user during every step. Default parameters are given for each step, which can either be modified or supplemented with additional external input. PRIMO has been designed for users of varying levels of experience with homology modeling. The pipeline incorporates a user-friendly interface that makes it easy to alter parameters used during modeling. During each stage of the modeling process, the site provides suggestions for novice users to improve the quality of their models. PRIMO provides functionality that allows users to also model ligands and ions in complex with their protein targets. Herein, we assess the accuracy of the fully automated capabilities of the server, including a comparative analysis of the available alignment programs, as well as of the refinement levels used during modeling. The tests presented here demonstrate the reliability of the PRIMO server when producing a large number of protein models. While PRIMO does focus on user involvement in the homology modeling process, the results indicate that in the presence of suitable templates, good quality models can be produced even without user intervention. This gives an idea of the base level accuracy of PRIMO, which users can improve upon by adjusting parameters in their modeling runs. The accuracy of PRIMO’s automated scripts is being continuously evaluated by the CAMEO (Continuous Automated Model EvaluatiOn) project. The PRIMO site is free for non-commercial use and can be accessed at https://primo.rubi.ru.ac.za/.
603
Pang YP. FF12MC: A revised AMBER forcefield and new protein simulation protocol. Proteins 2016; 84:1490-516. [PMID: 27348292 PMCID: PMC5129589 DOI: 10.1002/prot.25094] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Received: 04/11/2016] [Revised: 06/16/2016] [Accepted: 06/18/2016] [Indexed: 12/25/2022]
Abstract
Specialized to simulate proteins in molecular dynamics (MD) simulations with explicit solvation, FF12MC is a combination of a new protein simulation protocol employing atomic masses uniformly reduced tenfold and a revised AMBER forcefield FF99 with (i) shortened C-H bonds, (ii) removal of torsions involving a nonperipheral sp³ atom, and (iii) reduced 1-4 interaction scaling factors of torsions ϕ and ψ. This article reports that in multiple, distinct, independent, unrestricted, unbiased, isobaric-isothermal, and classical MD simulations FF12MC can (i) simulate the experimentally observed flipping between left- and right-handed configurations for C14-C38 of BPTI in solution, (ii) autonomously fold chignolin, CLN025, and Trp-cage with folding times that agree with the experimental values, (iii) simulate subsequent unfolding and refolding of these miniproteins, and (iv) achieve a robust Z score of 1.33 for refining protein models TMR01, TMR04, and TMR07. By comparison, the latest general-purpose AMBER forcefield FF14SB locks the C14-C38 bond to the right-handed configuration in solution under the same protein simulation conditions. Statistical survival analysis shows that FF12MC folds chignolin and CLN025 in isobaric-isothermal MD simulations 2-4 times faster than FF14SB under the same protein simulation conditions. These results suggest that FF12MC may be used for protein simulations to study kinetics and thermodynamics of miniprotein folding as well as protein structure and dynamics. Proteins 2016; 84:1490-1516. © 2016 The Authors Proteins: Structure, Function, and Bioinformatics Published by Wiley Periodicals, Inc.
Affiliation(s)
- Yuan-Ping Pang
  - Computer-Aided Molecular Design Laboratory, Mayo Clinic, Rochester, MN, 55905, USA
604
Kryshtafovych A, Monastyrskyy B, Fidelis K. CASP11 statistics and the prediction center evaluation system. Proteins 2016; 84 Suppl 1:15-9. [PMID: 26857434 PMCID: PMC5479680 DOI: 10.1002/prot.25005] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.7] [Received: 11/19/2015] [Revised: 01/18/2016] [Accepted: 02/04/2016] [Indexed: 01/10/2023]
Abstract
We outline the role of the Protein Structure Prediction Center (predictioncenter.org) in conducting the CASP11 and CASP ROLL experiments, discuss the experiment statistics, and provide an overview of the present CASP infrastructure. The biggest changes compared to the previous CASPs are the implementation of the evaluation system incorporating practically all evaluation measures, statistical tests, and visualization tools historically used by the CASP assessors, the expansion of the infrastructure to incorporate new categories of contact-assisted and multimeric predictions, and the redesign of the assessors' web-workspace enabling assessments based on multiple measures for different group categories and target sets. Proteins 2016; 84(Suppl 1):15-19. © 2016 Wiley Periodicals, Inc.
Affiliation(s)
- Andriy Kryshtafovych
  - Protein Structure Prediction Center, Genome and Biomedical Sciences Facilities, University of California, Davis, California, 95616
- Bohdan Monastyrskyy
  - Protein Structure Prediction Center, Genome and Biomedical Sciences Facilities, University of California, Davis, California, 95616
- Krzysztof Fidelis
  - Protein Structure Prediction Center, Genome and Biomedical Sciences Facilities, University of California, Davis, California, 95616
605
Modi V, Dunbrack RL. Assessment of refinement of template-based models in CASP11. Proteins 2016; 84 Suppl 1:260-81. [PMID: 27081793 DOI: 10.1002/prot.25048] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.1] [Received: 12/24/2015] [Revised: 03/13/2016] [Accepted: 04/11/2016] [Indexed: 12/26/2022]
Abstract
CASP11 (the 11th Meeting on the Critical Assessment of Protein Structure Prediction) ran a blind experiment in the refinement of protein structure predictions, the fourth such experiment since CASP8. As with the previous experiments, the predictors were provided with one starting structure from the server models of each of a selected set of template-based modeling targets and asked to refine the coordinates of the starting structure toward native. We assessed the refined structures with the Z-scores of the standard CASP measures, which compare the model-target similarities of the models from all the predictors. Furthermore, we assessed the refined structures with "relative measures," which compare the improvement in accuracy of each model with respect to the starting structure. The latter provides an assessment of the extent to which each predictor group is able to improve the starting structures toward native. We utilized heat maps to display improvements in the Cα-Cα distance matrix for each model. The heat maps labeled with each element of secondary structure helped us to identify regions of refinement toward native in each model. Most positively scoring models show modest improvements in multiple regions of the structure, while in some models we were able to identify significant repositioning of N/C-terminal segments and internal elements of secondary structure. The best groups were able to improve more than 70% of the targets from the starting models, and by an average of 3-5% in the standard CASP measures. Proteins 2016; 84(Suppl 1):260-281. © 2016 Wiley Periodicals, Inc.
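As a rough illustration of the two kinds of measure described in this abstract, the sketch below computes Z-scores across all models submitted for one target and a simple "relative" improvement over the starting structure. The function names, the example scores, and the use of the population standard deviation are assumptions made for illustration; the abstract does not give the assessors' exact formulas.

```python
from statistics import mean, pstdev


def z_scores(scores):
    """Z-score of each model's score relative to all models for one target.

    Assumes the population standard deviation; returns zeros when all
    scores are identical, since the Z-score is undefined in that case.
    """
    mu = mean(scores)
    sd = pstdev(scores)
    if sd == 0:
        return [0.0 for _ in scores]
    return [(s - mu) / sd for s in scores]


def relative_improvement(refined_score, starting_score):
    """Change in a similarity score (higher is better) versus the starting model."""
    return refined_score - starting_score


# Hypothetical similarity scores for three refined models of one target.
example_scores = [40.0, 50.0, 60.0]
example_z = z_scores(example_scores)

# Hypothetical starting-model score of 48.0 for the same target.
example_deltas = [relative_improvement(s, 48.0) for s in example_scores]
```

A Z-score ranks a model against the other submissions, whereas the relative measure asks only whether the model moved toward or away from native compared with where it started.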
Affiliation(s)
- Vivek Modi
  - Fox Chase Cancer Center, Philadelphia, Pennsylvania, 19111
606
Huwe PJ, Xu Q, Shapovalov MV, Modi V, Andrake MD, Dunbrack RL. Biological function derived from predicted structures in CASP11. Proteins 2016; 84 Suppl 1:370-91. [PMID: 27181425 DOI: 10.1002/prot.24997] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 07/11/2015] [Revised: 01/10/2016] [Accepted: 01/18/2016] [Indexed: 12/26/2022]
Abstract
In CASP11, the organizers sought to bring the biological inferences from predicted structures to the fore. To accomplish this, we assessed the models for their ability to perform quantifiable tasks related to biological function. First, for 10 targets that were probable homodimers, we measured the accuracy of docking the models into homodimers as a function of GDT-TS of the monomers, which produced characteristic L-shaped plots. At low GDT-TS, none of the models could be docked correctly as homodimers. Above GDT-TS of ∼60%, some models formed correct homodimers in one of the largest docked clusters, while many other models at the same values of GDT-TS did not. Docking was more successful when many of the templates shared the same homodimer. Second, we docked a ligand from an experimental structure into each of the models of one of the targets. Docking to the models with two different programs produced poor ligand RMSDs with the experimental structure. Measures that evaluated similarity of contacts were reasonable for some of the models, although there was not a significant correlation with model accuracy. Finally, we assessed whether models would be useful in predicting the phenotypes of missense mutations in three human targets by comparing features calculated from the models with those calculated from the experimental structures. The models were successful in reproducing accessible surface areas but there was little correlation of model accuracy with calculation of FoldX evaluation of the change in free energy between the wild-type and the mutant. Proteins 2016; 84(Suppl 1):370-391. © 2016 Wiley Periodicals, Inc.
Affiliation(s)
- Peter J Huwe
  - Fox Chase Cancer Center, Philadelphia, Pennsylvania, 19111
- Qifang Xu
  - Fox Chase Cancer Center, Philadelphia, Pennsylvania, 19111
- Vivek Modi
  - Fox Chase Cancer Center, Philadelphia, Pennsylvania, 19111
- Mark D Andrake
  - Fox Chase Cancer Center, Philadelphia, Pennsylvania, 19111
607
Modi V, Xu Q, Adhikari S, Dunbrack RL. Assessment of template-based modeling of protein structure in CASP11. Proteins 2016; 84 Suppl 1:200-20. [PMID: 27081927 DOI: 10.1002/prot.25049] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Received: 01/24/2016] [Revised: 04/04/2016] [Accepted: 04/11/2016] [Indexed: 12/27/2022]
Abstract
We present the assessment of predictions submitted in the template-based modeling (TBM) category of CASP11 (Critical Assessment of Protein Structure Prediction). Model quality was judged on the basis of global and local measures of accuracy on all atoms including side chains. The top groups on 39 human-server targets based on model 1 predictions were LEER, Zhang, LEE, MULTICOM, and Zhang-Server. The top groups on 81 targets by server groups based on model 1 predictions were Zhang-Server, nns, BAKER-ROSETTASERVER, QUARK, and myprotein-me. In CASP11, the best models for most targets were equal to or better than the best template available in the Protein Data Bank, even for targets with poor templates. The overall performance in CASP11 is similar to the performance of predictors in CASP10 with slightly better performance on the hardest targets. For most targets, assessment measures exhibited bimodal probability density distributions. Multi-dimensional scaling of an RMSD matrix for each target typically revealed a single cluster with models similar to the target structure, with a mode in the GDT-TS density between 40 and 90, and a wide distribution of models highly divergent from each other and from the experimental structure, with density mode at a GDT-TS value of ∼20. The models in this peak in the density were either compact models with entirely the wrong fold, or highly non-compact models. The results argue for a density-driven approach in future CASP TBM assessments that accounts for the bimodal nature of these distributions instead of Z scores, which assume a unimodal, Gaussian distribution. Proteins 2016; 84(Suppl 1):200-220. © 2016 Wiley Periodicals, Inc.
Affiliation(s)
- Vivek Modi
  - Fox Chase Cancer Center, Institute for Cancer Research, Philadelphia, Pennsylvania, 19111
- Qifang Xu
  - Fox Chase Cancer Center, Institute for Cancer Research, Philadelphia, Pennsylvania, 19111
- Sam Adhikari
  - Fox Chase Cancer Center, Institute for Cancer Research, Philadelphia, Pennsylvania, 19111
- Roland L Dunbrack
  - Fox Chase Cancer Center, Institute for Cancer Research, Philadelphia, Pennsylvania, 19111
608
Li W, Schaeffer RD, Otwinowski Z, Grishin NV. Estimation of Uncertainties in the Global Distance Test (GDT_TS) for CASP Models. PLoS One 2016; 11:e0154786. [PMID: 27149620 PMCID: PMC4858170 DOI: 10.1371/journal.pone.0154786] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 01/04/2016] [Accepted: 04/19/2016] [Indexed: 11/19/2022] [Open access]
Abstract
The Critical Assessment of techniques for protein Structure Prediction (or CASP) is a community-wide blind test experiment to reveal the best accomplishments of structure modeling. Assessors have been using the Global Distance Test (GDT_TS) measure to quantify prediction performance since CASP3 in 1998. However, identifying significant score differences between close models is difficult because of the lack of uncertainty estimations for this measure. Here, we utilized the atomic fluctuations caused by structure flexibility to estimate the uncertainty of GDT_TS scores. Structures determined by nuclear magnetic resonance are deposited as ensembles of alternative conformers that reflect the structural flexibility, whereas standard X-ray refinement produces the static structure averaged over time and space for the dynamic ensembles. To recapitulate the structurally heterogeneous ensemble in the crystal lattice, we performed time-averaged refinement for X-ray datasets to generate structural ensembles for our GDT_TS uncertainty analysis. Using those generated ensembles, our study demonstrates that the time-averaged refinements produced structure ensembles with better agreement with the experimental datasets than the averaged X-ray structures with B-factors. The uncertainty of the GDT_TS scores, quantified by their standard deviations (SDs), increases for scores lower than 50 and 70, with maximum SDs of 0.3 and 1.23 for X-ray and NMR structures, respectively. We also applied our procedure to the high-accuracy version of the GDT-based score and produced similar results with slightly higher SDs. To facilitate score comparisons by the community, we developed a user-friendly web server that produces structure ensembles for NMR and X-ray structures and is accessible at http://prodata.swmed.edu/SEnCS. Our work helps to identify the significance of GDT_TS score differences, as well as to provide structure ensembles for estimating SDs of any scores.
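For orientation, GDT_TS is commonly defined as the average, over distance cutoffs of 1, 2, 4, and 8 Å, of the percentage of Cα atoms in the model that lie within that cutoff of the corresponding target atom. The sketch below is a simplified version that assumes the two structures are already superposed and residue-aligned. The full measure searches over superpositions to maximize each percentage, so this version can only underestimate it. The function name, inputs, and example coordinates are hypothetical.

```python
import math


def gdt_ts_fixed_superposition(model_ca, target_ca, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS (0-100) for pre-superposed, residue-aligned CA coordinates.

    model_ca and target_ca are equal-length sequences of (x, y, z) tuples in
    angstroms. No superposition search is performed (an assumption of this sketch).
    """
    if len(model_ca) != len(target_ca) or len(target_ca) == 0:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    # Per-residue CA-CA distances under the given superposition.
    distances = [math.dist(m, t) for m, t in zip(model_ca, target_ca)]
    # Fraction of residues within each cutoff, then the average over cutoffs.
    fractions = [sum(d <= c for d in distances) / len(distances) for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)


# Hypothetical four-residue example with deviations of 0.5, 1.5, 3.0 and 10.0 angstroms.
target = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
model = [(0.0, 0.5, 0.0), (1.0, 1.5, 0.0), (2.0, 3.0, 0.0), (3.0, 10.0, 0.0)]
score = gdt_ts_fixed_superposition(model, target)
```

Because each residue contributes through a small number of thresholds, small coordinate fluctuations near a cutoff can shift the score, which is the kind of uncertainty the abstract sets out to quantify.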
Affiliation(s)
- Wenlin Li
  - Department of Biochemistry and Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, 75390–9050, United States of America
- R. Dustin Schaeffer
  - Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, Texas, 75390–9050, United States of America
- Zbyszek Otwinowski
  - Department of Biochemistry and Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, 75390–9050, United States of America
- Nick V. Grishin
  - Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, Texas, 75390–9050, United States of America
  - Department of Biochemistry and Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, 75390–9050, United States of America
609
Kinch LN, Li W, Monastyrskyy B, Kryshtafovych A, Grishin NV. Evaluation of free modeling targets in CASP11 and ROLL. Proteins 2016; 84 Suppl 1:51-66. [PMID: 26677002 DOI: 10.1002/prot.24973] [Citation(s) in RCA: 65] [Impact Index Per Article: 7.2] [Received: 09/28/2015] [Accepted: 12/12/2015] [Indexed: 12/25/2022]
Abstract
We present an assessment of 'template-free modeling' (FM) in CASP11 and ROLL. Community-wide server performance suggested the use of automated scores similar to previous CASPs would provide a good system of evaluating performance, even in the absence of comprehensive manual assessment. The CASP11 FM category included several outstanding examples, including successful prediction by the Baker group of a 256-residue target (T0806-D1) that lacked sequence similarity to any existing template. The top server model prediction by Zhang's Quark, which was apparently selected and refined by several manual groups, encompassed the entire fold of target T0837-D1. Methods from the same two groups tended to dominate overall CASP11 FM and ROLL rankings. Comparison of top FM predictions with those from the previous CASP experiment revealed progress in the category, particularly reflected in high prediction accuracy for larger protein domains. FM prediction models for two cases were sufficient to provide functional insights that were otherwise not obtainable by traditional sequence analysis methods. Importantly, CASP11 abstracts revealed that alignment-based contact prediction methods brought about much of the CASP11 progress, producing both of the functionally relevant models as well as several of the other outstanding structure predictions. These methodological advances enabled de novo modeling of much larger domain structures than was previously possible and allowed prediction of functional sites. Proteins 2016; 84(Suppl 1):51-66. © 2015 Wiley Periodicals, Inc.
Affiliation(s)
- Lisa N Kinch
  - Howard Hughes Medical Institute, University of Texas Southwestern Medical Center at Dallas, 6001 Forest Park Road, Dallas, Texas 75390-9050
- Wenlin Li
  - Department of Biophysics and Department of Biochemistry, University of Texas Southwestern Medical Center at Dallas, 6001 Forest Park Road, Dallas, Texas 75390-9050
- Bohdan Monastyrskyy
  - Genome Center, University of California, 451 Health Sciences Drive, Davis, California 95616
- Andriy Kryshtafovych
  - Genome Center, University of California, 451 Health Sciences Drive, Davis, California 95616
- Nick V Grishin
  - Howard Hughes Medical Institute, University of Texas Southwestern Medical Center at Dallas, 6001 Forest Park Road, Dallas, Texas 75390-9050
  - Department of Biophysics and Department of Biochemistry, University of Texas Southwestern Medical Center at Dallas, 6001 Forest Park Road, Dallas, Texas 75390-9050
610
Joo K, Joung I, Cheng Q, Lee SJ, Lee J. Contact-assisted protein structure modeling by global optimization in CASP11. Proteins 2016; 84 Suppl 1:189-99. [DOI: 10.1002/prot.24975] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 05/15/2015] [Revised: 11/24/2015] [Accepted: 12/12/2015] [Indexed: 11/09/2022]
Affiliation(s)
- Keehyoung Joo
  - Center for in Silico Protein Science, Korea Institute for Advanced Study; Seoul 130-722 Korea
  - Center for Advanced Computation, Korea Institute for Advanced Study; Seoul 130-722 Korea
- InSuk Joung
  - Center for in Silico Protein Science, Korea Institute for Advanced Study; Seoul 130-722 Korea
  - School of Computational Sciences, Korea Institute for Advanced Study; Seoul 130-722 Korea
- Qianyi Cheng
  - Center for in Silico Protein Science, Korea Institute for Advanced Study; Seoul 130-722 Korea
  - School of Computational Sciences, Korea Institute for Advanced Study; Seoul 130-722 Korea
- Sung Jong Lee
  - Center for in Silico Protein Science, Korea Institute for Advanced Study; Seoul 130-722 Korea
  - Department of Physics; University of Suwon; Hwaseong-Si Gyeonggi-do 445-743 Korea
- Jooyoung Lee
  - Center for in Silico Protein Science, Korea Institute for Advanced Study; Seoul 130-722 Korea
  - Center for Advanced Computation, Korea Institute for Advanced Study; Seoul 130-722 Korea
  - School of Computational Sciences, Korea Institute for Advanced Study; Seoul 130-722 Korea
611
Cao R, Bhattacharya D, Adhikari B, Li J, Cheng J. Massive integration of diverse protein quality assessment methods to improve template based modeling in CASP11. Proteins 2015; 84 Suppl 1:247-59. [PMID: 26369671 DOI: 10.1002/prot.24924] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Received: 05/14/2015] [Revised: 08/21/2015] [Accepted: 09/10/2015] [Indexed: 12/28/2022]
Abstract
Model evaluation and selection is an important step and a big challenge in template-based protein structure prediction. Individual model quality assessment methods designed for recognizing some specific properties of protein structures often fail to consistently select good models from a model pool because of their limitations. Therefore, combining multiple complementary quality assessment methods is useful for improving model ranking and consequently tertiary structure prediction. Here, we report the performance and analysis of our human tertiary structure predictor (MULTICOM) based on the massive integration of 14 diverse complementary quality assessment methods that was successfully benchmarked in the 11th Critical Assessment of Techniques of Protein Structure prediction (CASP11). The predictions of MULTICOM for 39 template-based domains were rigorously assessed by six scoring metrics covering global topology of Cα trace, local all-atom fitness, side chain quality, and physical reasonableness of the model. The results show that the massive integration of complementary, diverse single-model and multi-model quality assessment methods can effectively leverage the strength of single-model methods in distinguishing quality variation among similar good models and the advantage of multi-model quality assessment methods of identifying reasonable average-quality models. The overall excellent performance of the MULTICOM predictor demonstrates that integrating a large number of model quality assessment methods in conjunction with model clustering is a useful approach to improve the accuracy, diversity, and consequently robustness of template-based protein structure prediction. Proteins 2016; 84(Suppl 1):247-259. © 2015 Wiley Periodicals, Inc.
Affiliation(s)
- Renzhi Cao
  - Department of Computer Science, University of Missouri, Columbia, Missouri, 65211
- Badri Adhikari
  - Department of Computer Science, University of Missouri, Columbia, Missouri, 65211
- Jilong Li
  - Department of Computer Science, University of Missouri, Columbia, Missouri, 65211
- Jianlin Cheng
  - Department of Computer Science, University of Missouri, Columbia, Missouri, 65211
  - Informatics Institute, University of Missouri, Columbia, Missouri, 65211
612
Kryshtafovych A, Barbato A, Monastyrskyy B, Fidelis K, Schwede T, Tramontano A. Methods of model accuracy estimation can help selecting the best models from decoy sets: Assessment of model accuracy estimations in CASP11. Proteins 2015; 84 Suppl 1:349-69. [PMID: 26344049 DOI: 10.1002/prot.24919] [Citation(s) in RCA: 54] [Impact Index Per Article: 5.4] [Received: 05/07/2015] [Revised: 07/30/2015] [Accepted: 08/28/2015] [Indexed: 12/27/2022]
Abstract
The article presents an assessment of the model accuracy estimation methods participating in CASP11. The results of the assessment are expected to be useful both to developers of the methods and to users, who too often are presented with structural models without annotations of accuracy. The main emphasis is placed on the ability of techniques to identify the best models from among several available. Bivariate descriptive statistics and ROC analysis are used to additionally assess the overall correctness of the predicted model accuracy scores, the correlation between the predicted and observed accuracy of models, the effectiveness in distinguishing between good and bad models, the ability to discriminate between reliable and unreliable regions in models, and the accuracy of the coordinate error self-estimates. A rigid-body measure (GDT_TS) and three local-structure-based scores (LDDT, CADaa, and SphereGrinder) are used as reference measures for evaluating methods' performance. Consensus methods, taking advantage of the availability of several models for the same target protein, perform well on the majority of tasks. Methods that predict accuracy on the basis of a single model perform comparably to consensus methods in picking the best models and in estimating how accurate the local structure is. More groups than in previous experiments submitted reasonable error estimates of their own models, most likely in response to a recommendation from CASP and the increasing demand from users. Proteins 2016; 84(Suppl 1):349-369. © 2015 Wiley Periodicals, Inc.
Affiliation(s)
- Alessandro Barbato
  - Biozentrum, University of Basel, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Torsten Schwede
  - Biozentrum, University of Basel, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Anna Tramontano
  - Department of Physics, Sapienza University of Rome, Rome, Italy
613
Kim H, Kihara D. Protein structure prediction using residue- and fragment-environment potentials in CASP11. Proteins 2015; 84 Suppl 1:105-17. [PMID: 26344195 DOI: 10.1002/prot.24920] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Received: 05/27/2015] [Revised: 08/03/2015] [Accepted: 08/31/2015] [Indexed: 11/08/2022]
Abstract
An accurate scoring function that can select near-native structure models from a pool of alternative models is key for successful protein structure prediction. For the critical assessment of techniques for protein structure prediction (CASP) 11, we have built a protocol of protein structure prediction that has novel coarse-grained scoring functions for selecting decoys as the heart of its pipeline. The score named PRESCO (Protein Residue Environment SCOre) developed recently by our group evaluates the native-likeness of local structural environment of residues in a structure decoy considering positions and the depth of side-chains of spatially neighboring residues. We also introduced a helix interaction potential as an additional scoring function for selecting decoys. The best models selected by PRESCO and the helix interaction potential underwent structure refinement, which includes side-chain modeling and relaxation with a short molecular dynamics simulation. Our protocol was successful, achieving the top rank in the free modeling category with a significant margin of the accumulated Z-score to the subsequent groups when the top 1 models were considered. Proteins 2016; 84(Suppl 1):105-117. © 2015 Wiley Periodicals, Inc.
Affiliation(s)
- Hyungrae Kim
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47906
- Daisuke Kihara
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47906
  - Department of Computer Science, Purdue University, West Lafayette, Indiana, 47907
614
Joo K, Joung I, Lee SY, Kim JY, Cheng Q, Manavalan B, Joung JY, Heo S, Lee J, Nam M, Lee IH, Lee SJ, Lee J. Template based protein structure modeling by global optimization in CASP11. Proteins 2015; 84 Suppl 1:221-32. [PMID: 26329522 DOI: 10.1002/prot.24917] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Received: 05/15/2015] [Revised: 08/04/2015] [Accepted: 08/21/2015] [Indexed: 11/11/2022]
Abstract
For the template-based modeling (TBM) of CASP11 targets, we have developed three new protein modeling protocols (nns for server prediction and LEE and LEER for human prediction) by improving upon our previous CASP protocols (CASP7 through CASP10). We applied the powerful global optimization method of conformational space annealing to three stages of optimization, including multiple sequence-structure alignment, three-dimensional (3D) chain building, and side-chain remodeling. For more successful fold recognition, a new alignment method called CRFalign was developed. It can incorporate sensitive positional and environmental dependence in alignment scores as well as strong nonlinear correlations among various features. Modifications and adjustments were made to the form of the energy function and weight parameters pertaining to the chain building procedure. For the side-chain remodeling step, residue-type dependence was introduced to the cutoff value that determines the entry of a rotamer to the side-chain modeling library. The improved performance of the nns server method is attributed to successful fold recognition achieved by combining several methods including CRFalign and to the current modeling formulation that can incorporate native-like structural aspects present in multiple templates. The LEE protocol is identical to the nns one except that CASP11-released server models are used as templates. The success of LEE in utilizing CASP11 server models indicates that proper template screening and template clustering assisted by appropriate cluster ranking promises a new direction to enhance protein 3D modeling. Proteins 2016; 84(Suppl 1):221-232. © 2015 Wiley Periodicals, Inc.
Affiliation(s)
- Keehyoung Joo
- Center for in Silico Protein Science, Korea Institute for Advanced Study, Seoul, 130-722, Korea; Center for Advanced Computation, Korea Institute for Advanced Study, Seoul, 130-722, Korea
| | - InSuk Joung
- Center for in Silico Protein Science, Korea Institute for Advanced Study, Seoul, 130-722, Korea; School of Computational Sciences, Korea Institute for Advanced Study, Seoul, 130-722, Korea
| | - Sun Young Lee
- Center for in Silico Protein Science, Korea Institute for Advanced Study, Seoul, 130-722, Korea
| | - Jong Yun Kim
- Center for in Silico Protein Science, Korea Institute for Advanced Study, Seoul, 130-722, Korea
| | - Qianyi Cheng
- Center for in Silico Protein Science, Korea Institute for Advanced Study, Seoul, 130-722, Korea; School of Computational Sciences, Korea Institute for Advanced Study, Seoul, 130-722, Korea
| | - Balachandran Manavalan
- Center for in Silico Protein Science, Korea Institute for Advanced Study, Seoul, 130-722, Korea; School of Computational Sciences, Korea Institute for Advanced Study, Seoul, 130-722, Korea
| | - Jong Young Joung
- School of Computational Sciences, Korea Institute for Advanced Study, Seoul, 130-722, Korea
| | - Seungryong Heo
- Center for in Silico Protein Science, Korea Institute for Advanced Study, Seoul, 130-722, Korea
| | - Juyong Lee
- Laboratory of Computational Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland, 20852
| | - Mikyung Nam
- Center for in Silico Protein Science, Korea Institute for Advanced Study, Seoul, 130-722, Korea
| | - In-Ho Lee
- Center for in Silico Protein Science, Korea Institute for Advanced Study, Seoul, 130-722, Korea; Korea Research Institute of Standards and Science (KRISS), Seoul, 305-600, Korea
| | - Sung Jong Lee
- Center for in Silico Protein Science, Korea Institute for Advanced Study, Seoul, 130-722, Korea; Department of Physics, University of Suwon, Hwaseong-Si, Gyeonggi-Do, 445-743, Korea
| | - Jooyoung Lee
- Center for in Silico Protein Science, Korea Institute for Advanced Study, Seoul, 130-722, Korea; Center for Advanced Computation, Korea Institute for Advanced Study, Seoul, 130-722, Korea; School of Computational Sciences, Korea Institute for Advanced Study, Seoul, 130-722, Korea
|
615
|
Park H, DiMaio F, Baker D. The origin of consistent protein structure refinement from structural averaging. Structure 2015; 23:1123-8. [PMID: 25960407 PMCID: PMC4456269 DOI: 10.1016/j.str.2015.03.022] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Received: 12/19/2014] [Revised: 03/03/2015] [Accepted: 03/26/2015] [Indexed: 11/27/2022]
Abstract
Recent studies have shown that explicit solvent molecular dynamics (MD) simulation followed by structural averaging can consistently improve protein structure models. We find that improvement upon averaging is not limited to explicit water MD simulation, as consistent improvements are also observed for more efficient implicit solvent MD or Monte Carlo minimization simulations. To determine the origin of these improvements, we examine the changes in model accuracy brought about by averaging at the individual residue level. We find that the improvement in model quality from averaging results from the superposition of two effects: a dampening of deviations from the correct structure in the least well modeled regions, and a reinforcement of consistent movements towards the correct structure in better modeled regions. These observations are consistent with an energy landscape model in which the magnitude of the energy gradient toward the native structure decreases with increasing distance from the native state.
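The averaging effect described in this abstract can be illustrated with a toy calculation. The sketch below is not the authors' protocol: it assumes the models are already superimposed on a common frame, uses made-up coordinates, and the helper names `average_structure` and `rmsd` are hypothetical. It only shows that deviations pointing in different directions partly cancel when coordinates are averaged.

```python
import math

def average_structure(models):
    """Average per-atom coordinates over models that are already superimposed."""
    n_models = len(models)
    n_atoms = len(models[0])
    return [
        tuple(sum(m[i][k] for m in models) / n_models for k in range(3))
        for i in range(n_atoms)
    ]

def rmsd(a, b):
    """Root-mean-square deviation between two equally ordered coordinate lists."""
    sq = sum((p[k] - q[k]) ** 2 for p, q in zip(a, b) for k in range(3))
    return math.sqrt(sq / len(a))

# Hypothetical reference and three perturbed models (three atoms each, in angstroms).
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
models = [
    [(0.5, 0.0, 0.0), (3.8, 0.6, 0.0), (7.6, 0.0, -0.4)],
    [(-0.5, 0.0, 0.0), (3.8, -0.2, 0.0), (7.6, 0.0, 0.4)],
    [(0.0, 0.3, 0.0), (3.8, -0.1, 0.0), (7.6, 0.2, 0.0)],
]

averaged = average_structure(models)
for index, model in enumerate(models):
    print(f"model {index}: RMSD to reference = {rmsd(model, reference):.3f}")
print(f"averaged: RMSD to reference = {rmsd(averaged, reference):.3f}")
```

In this constructed example the averaged coordinates lie closer to the reference than any single model. Real trajectories need not behave this way, which is the question the paper examines at the residue level.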
Affiliation(s)
- Hahnbeom Park
- Department of Biochemistry, University of Washington, Seattle, WA 98195, USA; Institute for Protein Design, University of Washington, Seattle, WA 98195, USA
| | - Frank DiMaio
- Department of Biochemistry, University of Washington, Seattle, WA 98195, USA; Institute for Protein Design, University of Washington, Seattle, WA 98195, USA
| | - David Baker
- Department of Biochemistry, University of Washington, Seattle, WA 98195, USA; Institute for Protein Design, University of Washington, Seattle, WA 98195, USA; Howard Hughes Medical Institute, University of Washington, Box 357370, Seattle, WA 98195, USA.
|
616
|
Lee J, Lee K, Joung I, Joo K, Brooks BR, Lee J. Sigma-RF: prediction of the variability of spatial restraints in template-based modeling by random forest. BMC Bioinformatics 2015; 16:94. [PMID: 25886990 PMCID: PMC4374281 DOI: 10.1186/s12859-015-0526-z] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 09/17/2014] [Accepted: 03/04/2015] [Indexed: 01/22/2023] Open
Abstract
BACKGROUND: In template-based modeling when using a single template, inter-atomic distances of an unknown protein structure are assumed to be distributed by Gaussian probability density functions, whose center peaks are located at the distances between corresponding atoms in the template structure. The width of the Gaussian distribution, the variability of a spatial restraint, is closely related to the reliability of the restraint information extracted from a template, and it should be accurately estimated for successful template-based protein structure modeling. RESULTS: To predict the variability of the spatial restraints in template-based modeling, we have devised a prediction model, Sigma-RF, by using the random forest (RF) algorithm. The benchmark results on 22 CASP9 targets show that the variability values from Sigma-RF correlate more strongly with the true distance deviation than those from Modeller. We assessed the effect of new sigma values by performing the single-domain homology modeling of 22 CASP9 targets and 24 CASP10 targets. For most of the targets tested, we could obtain more accurate 3D models from the identical alignments by using the Sigma-RF results than by using Modeller ones. CONCLUSIONS: By investigating the importance of input features used in the RF machine learning, we find that the average alignment quality of residues located between and at two aligned residues, quasi-local information, is the most important factor. This average alignment quality is shown to be more important than the previously identified local quantity: the product of alignment qualities at two aligned residues.
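As a rough illustration of why the restraint width matters, the following sketch scores one inter-atomic distance with the negative log of a Gaussian density centered on the template distance. The function name, the distances, and the two sigma values are illustrative assumptions; this is neither Sigma-RF nor Modeller code.

```python
import math

def gaussian_restraint_penalty(distance, template_distance, sigma):
    """Negative log of a Gaussian density centered on the template distance."""
    z = (distance - template_distance) / sigma
    return 0.5 * z * z + math.log(sigma * math.sqrt(2.0 * math.pi))

template_distance = 6.0  # distance between corresponding template atoms (made up)
model_distance = 8.0     # same atom pair in a candidate model (made up)

# The same 2 angstrom deviation scored under a narrow and a wide restraint.
tight = gaussian_restraint_penalty(model_distance, template_distance, sigma=0.5)
loose = gaussian_restraint_penalty(model_distance, template_distance, sigma=2.0)
print(f"tight sigma: {tight:.3f}, loose sigma: {loose:.3f}")
```

A small sigma makes the deviation costly while a large sigma tolerates it, so a poorly estimated width either over-constrains unreliable template regions or under-uses reliable ones.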
Affiliation(s)
- Juyong Lee
- Laboratory of Computational Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, 5635 Fishers Ln, Bethesda, 20852, USA.
- Center for In Silico Protein Science, Korea Institute for Advanced Study, Seoul, Korea.
| | - Kiho Lee
- Center for In Silico Protein Science, Korea Institute for Advanced Study, Seoul, Korea.
| | - InSuk Joung
- Center for In Silico Protein Science, Korea Institute for Advanced Study, Seoul, Korea.
- School of Computational Sciences, Korea Institute for Advanced Study, Seoul, Korea.
| | - Keehyoung Joo
- Center for In Silico Protein Science, Korea Institute for Advanced Study, Seoul, Korea.
- Center for Advanced Computation, Korea Institute for Advanced Study, Seoul, Korea.
| | - Bernard R Brooks
- Laboratory of Computational Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, 5635 Fishers Ln, Bethesda, 20852, USA.
| | - Jooyoung Lee
- Center for In Silico Protein Science, Korea Institute for Advanced Study, Seoul, Korea.
- School of Computational Sciences, Korea Institute for Advanced Study, Seoul, Korea.
|
617
|
Studer G, Biasini M, Schwede T. Assessing the local structural quality of transmembrane protein models using statistical potentials (QMEANBrane). Bioinformatics 2015; 30:i505-11. [PMID: 25161240 PMCID: PMC4147910 DOI: 10.1093/bioinformatics/btu457] [Citation(s) in RCA: 113] [Impact Index Per Article: 11.3] [Indexed: 11/30/2022]
Abstract
Motivation: Membrane proteins are an important class of biological macromolecules involved in many key cellular processes including signalling and transport. They account for one third of genes in the human genome and >50% of current drug targets. Despite their importance, experimental structural data are sparse, resulting in high expectations for computational modelling tools to help fill this gap. However, as many empirical methods have been trained on experimental structural data, which is biased towards soluble globular proteins, their accuracy for transmembrane proteins is often limited. Results: We developed a local model quality estimation method for membrane proteins (‘QMEANBrane’) by combining statistical potentials trained on membrane protein structures with a per-residue weighting scheme. The increasing number of available experimental membrane protein structures allowed us to train membrane-specific statistical potentials that approach statistical saturation. We show that reliable local quality estimation of membrane protein models is possible, thereby extending local quality estimation to these biologically relevant molecules. Availability and implementation: Source code and datasets are available on request. Contact: torsten.schwede@unibas.ch Supplementary Information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Gabriel Studer
- Biozentrum, University of Basel, Basel, 4056, Switzerland and SIB Swiss Institute of Bioinformatics, Basel, 4056, Switzerland
| | - Marco Biasini
- Biozentrum, University of Basel, Basel, 4056, Switzerland and SIB Swiss Institute of Bioinformatics, Basel, 4056, Switzerland
| | - Torsten Schwede
- Biozentrum, University of Basel, Basel, 4056, Switzerland and SIB Swiss Institute of Bioinformatics, Basel, 4056, Switzerland
|
618
|
Berman HM, Gabanyi MJ, Groom CR, Johnson JE, Murshudov GN, Nicholls RA, Reddy V, Schwede T, Zimmerman MD, Westbrook J, Minor W. Data to knowledge: how to get meaning from your result. IUCRJ 2015; 2:45-58. [PMID: 25610627 PMCID: PMC4285880 DOI: 10.1107/s2052252514023306] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 09/16/2014] [Accepted: 10/22/2014] [Indexed: 05/19/2023]
Abstract
Structural and functional studies require the development of sophisticated 'Big Data' technologies and software to increase the knowledge derived and ensure reproducibility of the data. This paper presents summaries of the Structural Biology Knowledge Base, the VIPERdb Virus Structure Database, evaluation of homology modeling by the Protein Model Portal, the ProSMART tool for conformation-independent structure comparison, the LabDB 'super' laboratory information management system and the Cambridge Structural Database. These techniques and technologies represent important tools for the transformation of crystallographic data into knowledge and information, in an effort to address the problem of non-reproducibility of experimental results.
Affiliation(s)
- Helen M. Berman
- Center for Integrative Proteomics Research, Department of Chemistry and Chemical Biology, Rutgers, State University of New Jersey, Piscataway, NJ 08854, USA
| | - Margaret J. Gabanyi
- Center for Integrative Proteomics Research, Department of Chemistry and Chemical Biology, Rutgers, State University of New Jersey, Piscataway, NJ 08854, USA
| | - Colin R. Groom
- Cambridge Crystallographic Data Centre, 12 Union Road, Cambridge CB2 1EZ, England
| | - John E. Johnson
- Department of Integrative Structural and Computational Biology, Scripps Research Institute, La Jolla, CA 92037, USA
| | - Garib N. Murshudov
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge CB2 0QH, England
| | - Robert A. Nicholls
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge CB2 0QH, England
| | - Vijay Reddy
- Department of Integrative Structural and Computational Biology, Scripps Research Institute, La Jolla, CA 92037, USA
| | - Torsten Schwede
- Biozentrum, University of Basel, Klingelbergstrasse 50-70, 4056 Basel, Switzerland
- SIB-Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Matthew D. Zimmerman
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA 22908, USA
| | - John Westbrook
- Center for Integrative Proteomics Research, Department of Chemistry and Chemical Biology, Rutgers, State University of New Jersey, Piscataway, NJ 08854, USA
| | - Wladek Minor
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA 22908, USA
|
619
|
Huang YJ, Mao B, Aramini JM, Montelione GT. Assessment of template-based protein structure predictions in CASP10. Proteins 2014; 82 Suppl 2:43-56. [PMID: 24323734 DOI: 10.1002/prot.24488] [Citation(s) in RCA: 82] [Impact Index Per Article: 7.5] [Received: 07/30/2013] [Revised: 11/10/2013] [Accepted: 11/19/2013] [Indexed: 12/27/2022]
Abstract
Template-based modeling (TBM) is a major component of the critical assessment of protein structure prediction (CASP). In CASP10, some 41,740 predicted models submitted by 150 predictor groups were assessed as TBM predictions. The accuracy of protein structure prediction was assessed by geometric comparison with experimental X-ray crystal and NMR structures using a composite score that included both global alignment metrics and distance-matrix-based metrics. These included GDT-HA and GDC-all global alignment scores, and the superimposition-independent LDDT distance-matrix-based score. In addition, a superimposition-independent RPF metric, similar to that described previously for comparing protein models against experimental NMR data, was used for comparing predicted protein structure models against experimental protein structures. To score well on all four of these metrics, models must feature accurate predictions of both backbone and side-chain conformations. Performance rankings were determined independently for server and the combined server plus human-curated predictor groups. Final rankings were made using paired head-to-head Student's t-test analysis of raw metric scores among the top 25 performing groups in each category.
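The paired head-to-head comparison mentioned at the end of this abstract can be sketched as follows. This is a minimal illustration of the paired Student's t statistic only: the per-target scores are invented, the function name is hypothetical, and the assessors' actual pipeline (metric combination, group selection, significance thresholds) is not reproduced.

```python
import math
import statistics

def paired_t_statistic(scores_a, scores_b):
    """t statistic of a paired Student's t-test on per-target score differences."""
    if len(scores_a) != len(scores_b):
        raise ValueError("both groups must be scored on the same targets")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    mean_diff = statistics.mean(diffs)
    # Standard error uses the sample standard deviation (n - 1 in the denominator).
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff / std_err

# Hypothetical raw metric scores of two groups on the same four targets.
group_a = [60.0, 70.0, 80.0, 90.0]
group_b = [58.0, 67.0, 79.0, 86.0]

t_value = paired_t_statistic(group_a, group_b)
print(f"t = {t_value:.3f} with {len(group_a) - 1} degrees of freedom")
```

Pairing by target removes target difficulty from the comparison, since only the per-target differences between the two groups enter the statistic.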
Affiliation(s)
- Yuanpeng J Huang
- Center for Advanced Biotechnology and Medicine and Department of Molecular Biology and Biochemistry, Rutgers, The State University of New Jersey, Piscataway, New Jersey, 08854; Department of Biochemistry and Molecular Biology, Robert Wood Johnson Medical School, Rutgers, The State University of New Jersey, Piscataway, New Jersey, 08854; Northeast Structural Genomics Consortium, Rutgers, The State University of New Jersey, Piscataway, New Jersey, 08854
|
620
|
Chen Y, Shang Y, Xu D. Multi-Dimensional Scaling and MODELLER-Based Evolutionary Algorithms for Protein Model Refinement. PROCEEDINGS OF THE ... CONGRESS ON EVOLUTIONARY COMPUTATION. CONGRESS ON EVOLUTIONARY COMPUTATION 2014; 2014:1038-1045. [PMID: 25844403 PMCID: PMC4380876 DOI: 10.1109/cec.2014.6900443] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 01/21/2023]
Abstract
Protein structure prediction, i.e., computationally predicting the three-dimensional structure of a protein from its primary sequence, is one of the most important and challenging problems in bioinformatics. Model refinement is a key step in the prediction process, where improved structures are constructed based on a pool of initially generated models. Since the refinement category was added to the biennial Critical Assessment of Structure Prediction (CASP) in 2008, CASP results show that it is a challenge for existing model refinement methods to improve model quality consistently. This paper presents three evolutionary algorithms for protein model refinement, in which multidimensional scaling (MDS), the MODELLER software, and a hybrid of both are used as crossover operators, respectively. The MDS-based method takes a purely geometrical approach and generates a child model by combining the contact maps of multiple parents. The MODELLER-based method takes a statistical and energy minimization approach, and uses the remodeling module in the MODELLER program to generate new models from multiple parents. The hybrid method first generates models using the MDS-based method and then runs them through the MODELLER-based method, aiming at combining the strengths of both. Promising results have been obtained in experiments using CASP datasets. The MDS-based method improved the best of a pool of predicted models in terms of the global distance test score (GDT-TS) in 9 out of 16 test targets.
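For readers unfamiliar with the GDT-TS measure used to report these results, here is a simplified sketch. GDT-TS averages the percentage of residues within 1, 2, 4 and 8 Å of the reference; the full measure searches over many superpositions for each cutoff, whereas this sketch assumes one fixed superposition has already been applied, so it can only underestimate the true value. The function name and the distances are hypothetical.

```python
def gdt_ts_fixed_superposition(distances, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Mean percentage of residues within each cutoff, for one fixed superposition."""
    n_residues = len(distances)
    fractions = [
        sum(d <= cutoff for d in distances) / n_residues
        for cutoff in cutoffs
    ]
    return 100.0 * sum(fractions) / len(cutoffs)

# Hypothetical per-residue C-alpha distances (angstroms) between model and reference.
ca_distances = [0.5, 1.5, 3.0, 9.0]

score = gdt_ts_fixed_superposition(ca_distances)
print(f"GDT-TS (single superposition) = {score:.2f}")
```

Because the score averages over four cutoffs, a refinement step can raise it either by tightening already good regions or by bringing badly placed residues within the looser cutoffs.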
Affiliation(s)
- Yan Chen
- Yan Chen, Yi Shang, and Dong Xu are with the Department of Computer Science, University of Missouri, Columbia, MO 65211 USA. Dong Xu is also with the Christopher S. Bond Life Science Center, University of Missouri.
| | - Yi Shang
- Yan Chen, Yi Shang, and Dong Xu are with the Department of Computer Science, University of Missouri, Columbia, MO 65211 USA. Dong Xu is also with the Christopher S. Bond Life Science Center, University of Missouri.
| | - Dong Xu
- Yan Chen, Yi Shang, and Dong Xu are with the Department of Computer Science, University of Missouri, Columbia, MO 65211 USA. Dong Xu is also with the Christopher S. Bond Life Science Center, University of Missouri.
|
621
|
Olechnovič K, Venclovas C. The CAD-score web server: contact area-based comparison of structures and interfaces of proteins, nucleic acids and their complexes. Nucleic Acids Res 2014; 42:W259-63. [PMID: 24838571 PMCID: PMC4086110 DOI: 10.1093/nar/gku294] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Indexed: 01/31/2023] Open
Abstract
The Contact Area Difference score (CAD-score) web server provides a universal framework to compute and analyze discrepancies between different 3D structures of the same biological macromolecule or complex. The server accepts both single-subunit and multi-subunit structures and can handle all the major types of macromolecules (proteins, RNA, DNA and their complexes). It can perform numerical comparison of both structures and interfaces. In addition to entire structures and interfaces, the server can assess user-defined subsets. The CAD-score server performs both global and local numerical evaluations of structural differences between structures or interfaces. The results can be explored interactively using sortable tables of global scores, profiles of local errors, superimposed contact maps and 3D structure visualization. The web server could be used for tasks such as comparison of models with the native (reference) structure, comparison of X-ray structures of the same macromolecule obtained in different states (e.g. with and without a bound ligand), analysis of nuclear magnetic resonance (NMR) structural ensemble or structures obtained in the course of molecular dynamics simulation. The web server is freely accessible at: http://www.ibt.lt/bioinformatics/cad-score.
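The abstract does not state the scoring formula. The sketch below assumes the commonly cited form of the global CAD-score, in which each reference contact contributes the absolute difference between reference and model contact areas, capped at the reference area, and the sum is normalized by the total reference contact area. Treat this form, the function name, and the contact areas as assumptions; computing real contact areas requires a tessellation step that is not shown.

```python
def cad_score(reference_areas, model_areas):
    """Global contact-area-difference score from per-contact areas.

    Both arguments map a residue pair to a contact area. Contacts missing
    from the model count as zero area; contacts present only in the model
    are ignored, because the sum runs over reference contacts.
    """
    total_reference_area = sum(reference_areas.values())
    bounded_difference = sum(
        min(abs(area - model_areas.get(pair, 0.0)), area)
        for pair, area in reference_areas.items()
    )
    return 1.0 - bounded_difference / total_reference_area

# Hypothetical contact areas (square angstroms) keyed by residue index pairs.
reference = {(1, 2): 10.0, (1, 3): 5.0}
model = {(1, 2): 8.0, (1, 3): 12.0}

score = cad_score(reference, model)
print(f"CAD-score = {score:.3f}")
```

Capping each difference at the reference area keeps the score between 0 and 1, so a single grossly overestimated contact cannot dominate the result.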
Affiliation(s)
- Kliment Olechnovič
- Institute of Biotechnology, Vilnius University, Graičiūno 8, Vilnius LT-02241, Lithuania; Faculty of Mathematics and Informatics, Vilnius University, Naugarduko 24, Vilnius LT-03225, Lithuania
| | - Ceslovas Venclovas
- Institute of Biotechnology, Vilnius University, Graičiūno 8, Vilnius LT-02241, Lithuania
|
624
|
Biasini M, Bienert S, Waterhouse A, Arnold K, Studer G, Schmidt T, Kiefer F, Gallo Cassarino T, Bertoni M, Bordoli L, Schwede T. SWISS-MODEL: modelling protein tertiary and quaternary structure using evolutionary information. Nucleic Acids Res 2014; 42:W252-8. [PMID: 24782522 PMCID: PMC4086089 DOI: 10.1093/nar/gku340] [Citation(s) in RCA: 3631] [Impact Index Per Article: 330.1] [Indexed: 02/06/2023] Open
Abstract
Protein structure homology modelling has become a routine technique to generate 3D models for proteins when experimental structures are not available. Fully automated servers such as SWISS-MODEL with user-friendly web interfaces generate reliable models without the need for complex software packages or downloading large databases. Here, we describe the latest version of the SWISS-MODEL expert system for protein structure modelling. The SWISS-MODEL template library provides annotation of quaternary structure and essential ligands and co-factors to allow for building of complete structural models, including their oligomeric structure. The improved SWISS-MODEL pipeline makes extensive use of model quality estimation for selection of the most suitable templates and provides estimates of the expected accuracy of the resulting models. The accuracy of the models generated by SWISS-MODEL is continuously evaluated by the CAMEO system. The new web site allows users to interactively search for templates, cluster them by sequence similarity, structurally compare alternative templates and select the ones to be used for model building. In cases where multiple alternative template structures are available for a protein of interest, a user-guided template selection step allows building models in different functional states. SWISS-MODEL is available at http://swissmodel.expasy.org/.
Affiliation(s)
- Marco Biasini
- Biozentrum, University of Basel, Basel 4056, Switzerland; SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
| | - Stefan Bienert
- Biozentrum, University of Basel, Basel 4056, Switzerland; SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
| | - Andrew Waterhouse
- Biozentrum, University of Basel, Basel 4056, Switzerland; SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
| | - Konstantin Arnold
- Biozentrum, University of Basel, Basel 4056, Switzerland; SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
| | - Gabriel Studer
- Biozentrum, University of Basel, Basel 4056, Switzerland; SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
| | - Tobias Schmidt
- Biozentrum, University of Basel, Basel 4056, Switzerland; SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
| | - Florian Kiefer
- Biozentrum, University of Basel, Basel 4056, Switzerland; SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
| | - Tiziano Gallo Cassarino
- Biozentrum, University of Basel, Basel 4056, Switzerland; SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
| | - Martino Bertoni
- Biozentrum, University of Basel, Basel 4056, Switzerland; SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
| | - Lorenza Bordoli
- Biozentrum, University of Basel, Basel 4056, Switzerland; SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
| | - Torsten Schwede
- Biozentrum, University of Basel, Basel 4056, Switzerland; SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
|
625
|
Kryshtafovych A, Moult J, Bales P, Bazan JF, Biasini M, Burgin A, Chen C, Cochran FV, Craig TK, Das R, Fass D, Garcia-Doval C, Herzberg O, Lorimer D, Luecke H, Ma X, Nelson DC, van Raaij MJ, Rohwer F, Segall A, Seguritan V, Zeth K, Schwede T. Challenging the state of the art in protein structure prediction: Highlights of experimental target structures for the 10th Critical Assessment of Techniques for Protein Structure Prediction Experiment CASP10. Proteins 2014; 82 Suppl 2:26-42. [PMID: 24318984 PMCID: PMC4072496 DOI: 10.1002/prot.24489] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.3] [Received: 06/08/2013] [Revised: 11/01/2013] [Accepted: 11/09/2013] [Indexed: 11/12/2022]
Abstract
For the last two decades, CASP has assessed the state of the art in techniques for protein structure prediction and identified areas which required further development. CASP would not have been possible without the prediction targets provided by the experimental structural biology community. In the latest experiment, CASP10, more than 100 structures were suggested as prediction targets, some of which appeared to be extraordinarily difficult for modeling. In this article, authors of some of the most challenging targets discuss which specific scientific question motivated the experimental structure determination of the target protein, which structural features were especially interesting from a structural or functional perspective, and to what extent these features were correctly reproduced in the predictions submitted to CASP10. Specifically, the following targets will be presented: the acid-gated urea channel, a difficult to predict transmembrane protein from the important human pathogen Helicobacter pylori; the structure of human interleukin (IL)-34, a recently discovered helical cytokine; the structure of a functionally uncharacterized enzyme OrfY from Thermoproteus tenax formed by a gene duplication and a novel fold; an ORFan domain of mimivirus sulfhydryl oxidase R596; the fiber protein gene product 17 from bacteriophage T7; the bacteriophage CBA-120 tailspike protein; a virus coat protein from metagenomic samples of the marine environment; and finally, an unprecedented class of structure prediction targets based on engineered disulfide-rich small proteins.
Affiliation(s)
- Andriy Kryshtafovych
- Genome Center, University of California, Davis, 451 Health Sciences Drive, Davis, California 95616
| | - John Moult
- Institute for Bioscience and Biotechnology Research, Department of Cell Biology and Molecular Genetics, University of Maryland, 9600 Gudelsky Drive, Rockville, MD 20850, USA
| | - Patrick Bales
- Institute for Bioscience and Biotechnology Research, University of Maryland, 9600 Gudelsky Drive, Rockville, MD 20850, USA
| | - J. Fernando Bazan
- (1) Departments of Protein Engineering and (2) Structural Biology, Genentech, 1 DNA Way, South San Francisco, CA 94080; (3) Present address: 44th & Aspen Life Sciences, 924 4th St. N., Stillwater, MN 55082
| | - Marco Biasini
- (1) Biozentrum, University of Basel, Klingelbergstrasse 50, 4056 Basel, Switzerland; (2) SIB Swiss Institute of Bioinformatics, Klingelbergstrasse 50, 4056 Basel, Switzerland
| | - Alex Burgin
- Broad Institute, 5 Cambridge Center, Cambridge, MA 02142, USA
| | - Chen Chen
- Institute for Bioscience and Biotechnology Research, University of Maryland, 9600 Gudelsky Drive, Rockville, MD 20850, USA
| | - Frank V. Cochran
- Department of Biochemistry, Stanford University, Stanford, California, 94305, USA
| | - Rhiju Das
- (1) Department of Biochemistry, Stanford University, Stanford, California, 94305, USA; (2) Department of Physics, Stanford University, Stanford, California, 94305, USA
| | - Deborah Fass
- Department of Structural Biology, Weizmann Institute of Science, Rehovot 76100 Israel, Tel: +972-8-934-3214; Fax: +972-8-934-4136
| | - Carmela Garcia-Doval
- Centro Nacional de Biotecnologia (CNB-CSIC), calle Darwin 3, E-28049 Madrid, Spain
| | - Osnat Herzberg
- (1) Institute for Bioscience and Biotechnology Research, University of Maryland, 9600 Gudelsky Drive, Rockville, MD 20850, USA; (2) Department of Chemistry and Biochemistry, University of Maryland, College Park
| | - Donald Lorimer
- Emerald Bio, 7869 NE Day Rd W, Bainbridge Isle, WA 98110, USA
| | - Hartmut Luecke
- Center for Biomembrane Systems and Depts. of Biochemistry, Biophysics & Computer Science, 3205 McGaugh Hall, University of California, Irvine, CA 92697-3900, USA
| | - Xiaolei Ma
- (1) Departments of Protein Engineering and (2) Structural Biology, Genentech, 1 DNA Way, South San Francisco, CA 94080; (3) Present address: Novartis Institutes for Biomedical Research, 4560 Horton St., Emeryville, CA 94608, USA
| | - Daniel C. Nelson
- (1) Institute for Bioscience and Biotechnology Research, University of Maryland, 9600 Gudelsky Drive, Rockville, MD 20850, USA; (2) Department of Veterinary Medicine, University of Maryland, College Park
| | - Mark J. van Raaij
- Centro Nacional de Biotecnologia (CNB-CSIC), calle Darwin 3, E-28049 Madrid, Spain
| | - Forest Rohwer
- Department of Biology, San Diego State University, San Diego, CA 92182, USA
| | - Anca Segall
- Department of Biology, San Diego State University, San Diego, CA 92182, USA
| | - Victor Seguritan
- Department of Biology, San Diego State University, San Diego, CA 9218
| | - Kornelius Zeth
- Unidad de Biofisica (CSIC-UPV/EHU), Barrio Sarriena s/n 48940, Leioa, Vizcaya, Spain, and IKERBASQUE, Basque Foundation for Science, Bilbao, Spain
| | - Torsten Schwede
- (1) Biozentrum, University of Basel, Klingelbergstrasse 50, 4056 Basel, Switzerland; (2) SIB Swiss Institute of Bioinformatics, Klingelbergstrasse 50, 4056 Basel, Switzerland
|
626
|
Moult J, Fidelis K, Kryshtafovych A, Schwede T, Tramontano A. Critical assessment of methods of protein structure prediction (CASP)--round x. Proteins 2014. [PMID: 24344053 DOI: 10.1002/prot.24452.critical] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/16/2023]
Abstract
This article is an introduction to the special issue of the journal PROTEINS, dedicated to the tenth Critical Assessment of Structure Prediction (CASP) experiment to assess the state of the art in protein structure modeling. The article describes the conduct of the experiment, the categories of prediction included, and outlines the evaluation and assessment procedures. The 10 CASP experiments span almost 20 years of progress in the field of protein structure modeling, and there have been enormous advances in methods and model accuracy in that period. Notable in this round is the first sustained improvement of models with refinement methods, using molecular dynamics. For the first time, we tested the ability of modeling methods to make use of sparse experimental three-dimensional contact information, such as may be obtained from new experimental techniques, with encouraging results. On the other hand, new contact prediction methods, though holding considerable promise, have yet to make an impact in CASP testing. The nature of CASP targets has been changing in recent CASPs, reflecting shifts in experimental structural biology, with more irregular structures, more multi-domain and multi-subunit structures, and less standard versions of known folds. When allowance is made for these factors, we continue to see steady progress in the overall accuracy of models, particularly resulting from improvement of non-template regions.
Affiliation(s)
- John Moult
- Institute for Bioscience and Biotechnology Research and Department of Cell Biology and Molecular Genetics, University of Maryland, Rockville, Maryland, 20850
|
627
|
Moult J, Fidelis K, Kryshtafovych A, Schwede T, Tramontano A. Critical assessment of methods of protein structure prediction (CASP)--round x. Proteins 2014; 82 Suppl 2:1-6. [PMID: 24344053 PMCID: PMC4394854 DOI: 10.1002/prot.24452] [Citation(s) in RCA: 282] [Impact Index Per Article: 25.6] [Received: 10/16/2013] [Accepted: 10/21/2013] [Indexed: 12/28/2022]
Abstract
This article is an introduction to the special issue of the journal PROTEINS, dedicated to the tenth Critical Assessment of Structure Prediction (CASP) experiment to assess the state of the art in protein structure modeling. The article describes the conduct of the experiment, the categories of prediction included, and outlines the evaluation and assessment procedures. The 10 CASP experiments span almost 20 years of progress in the field of protein structure modeling, and there have been enormous advances in methods and model accuracy in that period. Notable in this round is the first sustained improvement of models with refinement methods, using molecular dynamics. For the first time, we tested the ability of modeling methods to make use of sparse experimental three-dimensional contact information, such as may be obtained from new experimental techniques, with encouraging results. On the other hand, new contact prediction methods, though holding considerable promise, have yet to make an impact in CASP testing. The nature of CASP targets has been changing in recent CASPs, reflecting shifts in experimental structural biology, with more irregular structures, more multi-domain and multi-subunit structures, and less standard versions of known folds. When allowance is made for these factors, we continue to see steady progress in the overall accuracy of models, particularly resulting from improvement of non-template regions.
Affiliation(s)
- John Moult
- Institute for Bioscience and Biotechnology Research, and Department of Cell Biology and Molecular Genetics, University of Maryland, Rockville, Maryland 20850
- Torsten Schwede
- University of Basel, Biozentrum & SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Anna Tramontano
- Department of Physics and Istituto Pasteur-Fondazione Cenci Bolognetti, Sapienza University of Rome, 00185 Rome, Italy
|
629
|
Kryshtafovych A, Monastyrskyy B, Fidelis K. CASP prediction center infrastructure and evaluation measures in CASP10 and CASP ROLL. Proteins 2013; 82 Suppl 2:7-13. [PMID: 24038551 DOI: 10.1002/prot.24399] [Citation(s) in RCA: 74] [Impact Index Per Article: 6.2] [Received: 06/15/2013] [Revised: 08/08/2013] [Accepted: 08/14/2013] [Indexed: 12/27/2022]
Abstract
The Protein Structure Prediction Center at the University of California, Davis, supports the CASP experiments by identifying prediction targets, accepting predictions, performing standard evaluations, assisting independent CASP assessors, presenting and archiving results, and facilitating information exchange relating to CASP and structure prediction in general. We provide an overview of the CASP infrastructure implemented at the Center, and summarize standard measures used for evaluating predictions in the latest round of CASP. Several components were introduced or significantly redesigned for CASP10, in particular an improved assessors' common web-workspace; a Sphere Grinder visualization tool for analyzing local accuracy of predictions; brand new blocks for evaluation contact prediction and contact-assisted structure prediction; expanded evaluation and visualization tools for tertiary structure, refinement and quality assessment. Technical aspects of conducting the CASP10 and CASP ROLL experiments and relevant statistics are also provided.
|