1
|
Wang X, Huang SY. Integrating Bonded and Nonbonded Potentials in the Knowledge-Based Scoring Function for Protein Structure Prediction. J Chem Inf Model 2019; 59:3080-3090. [DOI: 10.1021/acs.jcim.9b00057] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 01/06/2023]
Affiliation(s)
- Xinxiang Wang
  - Institute of Biophysics, School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
- Sheng-You Huang
  - Institute of Biophysics, School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
|
2
|
Colbes J, Corona RI, Lezcano C, Rodríguez D, Brizuela CA. Protein side-chain packing problem: is there still room for improvement? Brief Bioinform 2018; 18:1033-1043. [PMID: 27567382 DOI: 10.1093/bib/bbw079] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 05/23/2016] [Indexed: 11/12/2022]
Abstract
The protein side-chain packing problem (PSCPP) is an important subproblem of both protein structure prediction and protein design. During the past two decades, a large number of methods have been proposed to tackle this problem. These methods consist of three main components: a rotamer library, a scoring function and a search strategy. The average overall accuracy level obtained by these methods is approximately 87%. Whether a better accuracy level could be achieved remains to be answered. To address this question, we calculated the maximum accuracy level attainable using a simple rotamer library, independently of the energy function or the search method. Using 2883 different structures from the Protein Data Bank, we compared this accuracy level with the accuracy level of five state-of-the-art methods. These comparisons indicated that, for buried residues in the protein, we are already close to the best possible accuracy results. In addition, for exposed residues, we found that a significant gap exists between the possible improvement and the maximum accuracy level achievable with current methods. After determining that an improvement is possible, the next step is to understand what limitations are preventing us from obtaining such an improvement. Previous works on protein structure prediction and protein design have shown that scoring function inaccuracies may represent the main obstacle to achieving better results for these problems. To show that the same is true for the PSCPP, we evaluated the quality of two scoring functions used by some state-of-the-art algorithms. Our results indicate that neither of these scoring functions can guide the search method correctly, thereby reinforcing the idea that efforts to solve the PSCPP must also focus on developing better scoring functions.
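The upper bound described in the abstract above (the best accuracy a rotamer library allows, regardless of scoring function or search) can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function names, the data layout, and the 40° tolerance are assumptions made here for illustration. A residue counts as attainable when at least one library rotamer has every chi angle within the tolerance of the native value.

```python
def angle_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def attainable_accuracy(native_residues, rotamer_library, tolerance=40.0):
    """Fraction of residues for which some library rotamer matches the native.

    native_residues: list of (residue_type, chi_angles) pairs (hypothetical layout).
    rotamer_library: dict mapping residue_type to a list of chi-angle tuples.
    tolerance: assumed per-angle tolerance in degrees (not taken from the paper).
    """
    if not native_residues:
        raise ValueError("no residues given")
    hits = 0
    for res_type, native_chis in native_residues:
        rotamers = rotamer_library.get(res_type, [])
        # A rotamer matches only if every chi angle is within the tolerance.
        if any(
            all(angle_diff(n, r) <= tolerance for n, r in zip(native_chis, rot))
            for rot in rotamers
        ):
            hits += 1
    return hits / len(native_residues)


if __name__ == "__main__":
    library = {"SER": [(60.0,), (180.0,), (-60.0,)]}
    print(attainable_accuracy([("SER", (65.0,)), ("SER", (120.0,))], library))
```

Because no energy function or search strategy appears anywhere in this calculation, whatever gap remains between this bound and a method's measured accuracy is attributable to scoring or search, which is the comparison the abstract draws.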
|
3
|
Anishchenko I, Kundrotas PJ, Vakser IA. Contact Potential for Structure Prediction of Proteins and Protein Complexes from Potts Model. Biophys J 2018; 115:809-821. [PMID: 30122295 DOI: 10.1016/j.bpj.2018.07.035] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 03/07/2018] [Revised: 07/16/2018] [Accepted: 07/31/2018] [Indexed: 12/18/2022]
Abstract
The energy function is the key component of protein modeling methodology. This work presents a semianalytical approach to the development of contact potentials for protein structure modeling. Residue-residue and atom-atom contact energies were derived by maximizing the probability of observing native sequences in a nonredundant set of protein structures. The optimization task was formulated as an inverse statistical mechanics problem applied to the Potts model. Its solution by pseudolikelihood maximization provides consistent estimates of coupling constants at atomic and residue levels. The best performance was achieved when interacting atoms were grouped according to their physicochemical properties. For individual protein structures, the performance of the contact potentials in distinguishing near-native structures from the decoys is similar to the top-performing scoring functions. The potentials also yielded significant improvement in the protein docking success rates. The potentials recapitulated experimentally determined protein stability changes upon point mutations and protein-protein binding affinities. The approach offers a different perspective on knowledge-based potentials and may serve as the basis for their further development.
Affiliation(s)
- Ivan Anishchenko
  - Computational Biology Program and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas
- Petras J Kundrotas
  - Computational Biology Program and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas
- Ilya A Vakser
  - Computational Biology Program and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas
|
4
|
Deng H, Jia Y, Zhang Y. 3DRobot: automated generation of diverse and well-packed protein structure decoys. Bioinformatics 2015; 32:378-87. [PMID: 26471454 DOI: 10.1093/bioinformatics/btv601] [Citation(s) in RCA: 56] [Impact Index Per Article: 5.6] [Received: 07/28/2015] [Accepted: 10/10/2015] [Indexed: 11/12/2022]
Abstract
MOTIVATION: Computationally generated non-native protein structure conformations (or decoys) are often used for designing protein folding simulation methods and force fields. However, almost all the decoy sets currently used in the literature suffer from uneven root mean square deviation (RMSD) distribution with bias to non-protein-like hydrogen-bonding and compactness patterns. Meanwhile, most protein decoy sets are pre-calculated and there is a lack of methods for automated generation of high-quality decoys for any target protein. RESULTS: We developed a new algorithm, 3DRobot, to create protein structure decoys by free fragment assembly with enhanced hydrogen-bonding and compactness interactions. The method was benchmarked with three widely used decoy sets from ab initio folding and comparative modeling simulations. The decoys generated by 3DRobot are shown to have significantly enhanced diversity and evenness with a continuous distribution in the RMSD space. The new energy terms introduced in 3DRobot improve the hydrogen-bonding network and compactness of decoys, which eliminates the possibility of native structure recognition by trivial potentials. Algorithms that can automatically create such diverse and well-packed non-native conformations from any protein structure should have a broad impact on the development of advanced protein force field and folding simulation methods. AVAILABILITY AND IMPLEMENTATION: http://zhanglab.ccmb.med.umich.edu/3DRobot/ CONTACT: jiay@phy.ccnu.edu.cn; zhng@umich.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Haiyou Deng
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 45108, USA; Department of Physics and Institute of Biophysics, Central China Normal University, Wuhan 430079, China
- Ya Jia
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 45108, USA
- Yang Zhang
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 45108, USA; Department of Biological Chemistry, University of Michigan, Ann Arbor, MI 45108, USA
|
5
|
Yeh HYC, Lindsey A, Wu CP, Thomas S, Amato NM. Decoy Database Improvement for Protein Folding. J Comput Biol 2015; 22:823-36. [PMID: 26258648 DOI: 10.1089/cmb.2015.0116] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 12/13/2022]
Abstract
Predicting protein structures and simulating protein folding are two of the most important problems in computational biology today. Simulation methods rely on a scoring function to distinguish the native structure (the most energetically stable) from non-native structures. Decoy databases are collections of non-native structures used to test and verify these functions. We present a method to evaluate and improve the quality of decoy databases by adding novel structures and removing redundant structures. We test our approach on 20 different decoy databases of varying size and type and show significant improvement across a variety of metrics. We also test our improved databases on two popular modern scoring functions and show that for most cases they contain a greater or equal number of native-like structures than the original databases, thereby producing a more rigorous database for testing scoring functions.
Affiliation(s)
- Hsin-Yi Cindy Yeh
  - Parasol Lab, Department of Computer Science & Engineering, Texas A&M University, College Station, Texas
- Aaron Lindsey
  - Parasol Lab, Department of Computer Science & Engineering, Texas A&M University, College Station, Texas
- Chih-Peng Wu
  - Parasol Lab, Department of Computer Science & Engineering, Texas A&M University, College Station, Texas
- Shawna Thomas
  - Parasol Lab, Department of Computer Science & Engineering, Texas A&M University, College Station, Texas
- Nancy M Amato
  - Parasol Lab, Department of Computer Science & Engineering, Texas A&M University, College Station, Texas
|
6
|
Wang J, Zhao Y, Zhu C, Xiao Y. 3dRNAscore: a distance and torsion angle dependent evaluation function of 3D RNA structures. Nucleic Acids Res 2015; 43:e63. [PMID: 25712091 PMCID: PMC4446410 DOI: 10.1093/nar/gkv141] [Citation(s) in RCA: 65] [Impact Index Per Article: 6.5] [Received: 03/06/2014] [Accepted: 02/06/2015] [Indexed: 01/02/2023]
Abstract
Model evaluation is a necessary step for better prediction and design of 3D RNA structures. For proteins, this has been widely studied and the knowledge-based statistical potential has proved to be one of the effective ways to solve this problem. Currently, a few knowledge-based statistical potentials have also been proposed to evaluate predicted models of RNA tertiary structures. The benchmark tests showed that they can identify the native structures effectively but further improvements are needed to identify near-native structures and those with non-canonical base pairs. Here, we present a novel knowledge-based potential, 3dRNAscore, which combines distance-dependent and dihedral-dependent energies. The benchmarks on different testing datasets all show that 3dRNAscore is more efficient than existing evaluation methods in recognizing the native state from a pool of near-native states of RNAs as well as in ranking near-native states of RNA models.
Affiliation(s)
- Jian Wang
  - Biomolecular Physics and Modeling Group, Department of Physics and Key Laboratory of Molecular Biophysics of the Ministry of Education, Huazhong University of Science and Technology, Wuhan 430074, Hubei, China
- Yunjie Zhao
  - Biomolecular Physics and Modeling Group, Department of Physics and Key Laboratory of Molecular Biophysics of the Ministry of Education, Huazhong University of Science and Technology, Wuhan 430074, Hubei, China
- Chunyan Zhu
  - Biomolecular Physics and Modeling Group, Department of Physics and Key Laboratory of Molecular Biophysics of the Ministry of Education, Huazhong University of Science and Technology, Wuhan 430074, Hubei, China
- Yi Xiao
  - Biomolecular Physics and Modeling Group, Department of Physics and Key Laboratory of Molecular Biophysics of the Ministry of Education, Huazhong University of Science and Technology, Wuhan 430074, Hubei, China
|
7
|
Carlsen M, Koehl P, Røgen P. On the importance of the distance measures used to train and test knowledge-based potentials for proteins. PLoS One 2014; 9:e109335. [PMID: 25411785 PMCID: PMC4239004 DOI: 10.1371/journal.pone.0109335] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 05/18/2014] [Accepted: 08/31/2014] [Indexed: 12/15/2022]
Abstract
Knowledge-based potentials are energy functions derived from the analysis of databases of protein structures and sequences. They can be divided into two classes. Potentials from the first class are based on a direct conversion of the distributions of some geometric properties observed in native protein structures into energy values, while potentials from the second class are trained to mimic quantitatively the geometric differences between incorrectly folded models and native structures. In this paper, we focus on the relationship between energy and geometry when training the second class of knowledge-based potentials. We assume that the difference in energy between a decoy structure and the corresponding native structure is linearly related to the distance between the two structures. We trained two distance-based knowledge-based potentials accordingly, one based on all inter-residue distances (PPD), while the other had the set of all distances filtered to reflect consistency in an ensemble of decoys (PPE). We tested four types of metrics to characterize the distance between the decoy and the native structure, two based on extrinsic geometry (RMSD and GDT-TS*), and two based on intrinsic geometry (Q* and MT). The corresponding eight potentials were tested on a large collection of decoy sets. We found that it is usually better to train a potential using an intrinsic distance measure. We also found that PPE outperforms PPD, emphasizing the benefits of capturing consistent information in an ensemble. The relevance of these results for the design of knowledge-based potentials is discussed.
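The "first class" of potentials mentioned in the abstract above, a direct conversion of observed distributions into energies, is commonly written as an inverse Boltzmann relation, E_i = -kT ln(P_obs,i / P_ref,i). The abstract gives no formula, so the following Python sketch is only an illustration of that standard form. The function name, the binning, the pseudocount, and the choice of reference counts are all assumptions made here.

```python
import math


def inverse_boltzmann(observed_counts, reference_counts, kT=1.0, pseudocount=1.0):
    """Per-bin energies E_i = -kT * ln(P_obs_i / P_ref_i).

    observed_counts: counts of a geometric feature (e.g. distance bins) in native structures.
    reference_counts: counts for the same bins in a reference state (assumed to be given).
    pseudocount: added to every bin to avoid log(0); an assumption of this sketch.
    """
    if len(observed_counts) != len(reference_counts):
        raise ValueError("observed and reference histograms must have the same bins")
    obs = [c + pseudocount for c in observed_counts]
    ref = [c + pseudocount for c in reference_counts]
    obs_total, ref_total = sum(obs), sum(ref)
    return [
        -kT * math.log((o / obs_total) / (r / ref_total))
        for o, r in zip(obs, ref)
    ]


if __name__ == "__main__":
    # Bins seen more often than the reference get negative (favorable) energies.
    print(inverse_boltzmann([30, 10], [20, 20], pseudocount=0.0))
```

The second class described in the abstract differs in that its parameters are fitted so that energy differences track a structural distance to the native state, rather than being read off a histogram as here.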
Affiliation(s)
- Martin Carlsen
  - Department of Applied Mathematics and Computer Science, Technical University of Denmark, Kongens Lyngby, Denmark
- Patrice Koehl
  - Department of Computer Science and Genome Center, University of California Davis, Davis, CA, United States of America
- Peter Røgen
  - Department of Applied Mathematics and Computer Science, Technical University of Denmark, Kongens Lyngby, Denmark
|
8
|
Ruiz-Blanco YB, Marrero-Ponce Y, García Y, Puris A, Bello R, Green J, Sotomayor-Torres CM. A physics-based scoring function for protein structural decoys: Dynamic testing on targets of CASP-ROLL. Chem Phys Lett 2014. [DOI: 10.1016/j.cplett.2014.07.014] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 10/25/2022]
|
9
|
Chakraborty S, Venkatramani R, Rao BJ, Asgeirsson B, Dandekar AM. The electrostatic profile of consecutive Cβ atoms applied to protein structure quality assessment. F1000Res 2013; 2:243. [PMID: 25506420 PMCID: PMC4257144 DOI: 10.12688/f1000research.2-243.v1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Accepted: 09/16/2014] [Indexed: 02/10/2024]
Abstract
The structure of a protein provides insight into its physiological interactions with other components of the cellular soup. Methods that predict putative structures from sequences typically yield multiple, closely-ranked possibilities. A critical component in the process is the model quality assessing program (MQAP), which selects the best candidate from this pool of structures. Here, we present a novel MQAP based on the physical properties of sidechain atoms. We propose a method for assessing the quality of protein structures based on the electrostatic potential difference (EPD) of Cβ atoms in consecutive residues. We demonstrate that the EPDs of Cβ atoms on consecutive residues provide unique signatures of the amino acid types. The EPDs of Cβ atoms are learnt from a set of 1000 non-homologous protein structures with a resolution cutoff of 1.6 Å obtained from the PISCES database. Based on the Boltzmann hypothesis that lower energy conformations are proportionately sampled more, and on Anfinsen's thermodynamic hypothesis that the native structure of a protein is the minimum free energy state, we hypothesize that the deviation of observed EPD values from the mean values obtained in the learning phase is minimized in the native structure. We achieved an average specificity of 0.91, 0.94 and 0.93 on hg_structal, 4state_reduced and ig_structal decoy sets, respectively, taken from the Decoys 'R' Us database. The source code and manual are made available at https://github.com/sanchak/mqap and permanently available at 10.5281/zenodo.7134.
Affiliation(s)
- Sandeep Chakraborty
  - Department of Biological Sciences, Tata Institute of Fundamental Research, Mumbai, 400 005, India
- Ravindra Venkatramani
  - Department of Chemical Sciences, Tata Institute of Fundamental Research, Mumbai, 400 005, India
- Basuthkar J. Rao
  - Department of Biological Sciences, Tata Institute of Fundamental Research, Mumbai, 400 005, India
- Bjarni Asgeirsson
  - Science Institute, Department of Biochemistry, University of Iceland, IS-107 Reykjavik, Iceland
- Abhaya M. Dandekar
  - Plant Sciences Department, University of California, Davis, CA 95616, USA
|
10
|
Chakraborty S, Venkatramani R, Rao BJ, Asgeirsson B, Dandekar AM. Protein structure quality assessment based on the distance profiles of consecutive backbone Cα atoms. F1000Res 2013; 2:211. [PMID: 24555103 PMCID: PMC3892923 DOI: 10.12688/f1000research.2-211.v1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Accepted: 12/16/2013] [Indexed: 06/29/2024]
Abstract
Predicting the three-dimensional native state structure of a protein from its primary sequence is an unsolved grand challenge in molecular biology. Two main computational approaches have evolved to obtain the structure from the protein sequence - ab initio/de novo methods and template-based modeling - both of which typically generate multiple possible native state structures. Model quality assessment programs (MQAP) validate these predicted structures in order to identify the correct native state structure. Here, we propose an MQAP for assessing the quality of protein structures based on the distances of consecutive Cα atoms. We hypothesize that the root-mean-square deviation of the distance of consecutive Cα (RDCC) atoms from the ideal value of 3.8 Å, derived from a statistical analysis of high quality protein structures (top100H database), is minimized in native structures. Based on tests with the top100H set, we propose an RDCC cutoff value of 0.012 Å, above which a structure can be filtered out as a non-native structure. We applied the RDCC discriminator on decoy sets from the Decoys 'R' Us database to show that the native structures in all decoy sets tested have RDCC below the 0.012 Å cutoff. While most decoy sets were either indistinguishable using this discriminator or had very few violations, all the decoy structures in the fisa decoy set were discriminated by applying the RDCC criterion. This highlights the physical non-viability of the fisa decoy set, and possible issues in benchmarking other methods using this set. The source code and manual are made available at https://github.com/sanchak/mqap and permanently available at 10.5281/zenodo.7134.
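The RDCC criterion in the abstract above can be sketched in a few lines of Python. This is one literal reading of the abstract, not the authors' code (which is at the GitHub address they give). The function names are hypothetical, the input is assumed to be the Cα coordinates of a single unbroken chain in Å, and the exact normalization used in the paper may differ.

```python
import math

IDEAL_CA_DISTANCE = 3.8  # Å, ideal consecutive Cα distance quoted in the abstract
RDCC_CUTOFF = 0.012      # Å, cutoff proposed in the abstract


def rdcc(ca_coords):
    """RMS deviation of consecutive Cα-Cα distances from the ideal value.

    ca_coords: sequence of (x, y, z) tuples for one continuous chain (assumption).
    """
    if len(ca_coords) < 2:
        raise ValueError("need at least two Cα atoms")
    squared_devs = [
        (math.dist(a, b) - IDEAL_CA_DISTANCE) ** 2
        for a, b in zip(ca_coords, ca_coords[1:])
    ]
    return math.sqrt(sum(squared_devs) / len(squared_devs))


def exceeds_cutoff(ca_coords, cutoff=RDCC_CUTOFF):
    """True if the structure would be filtered out as non-native under this reading."""
    return rdcc(ca_coords) > cutoff


if __name__ == "__main__":
    stretched = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (8.0, 0.0, 0.0)]
    print(rdcc(stretched), exceeds_cutoff(stretched))
```

The sketch ignores chain breaks and cis-peptide bonds, both of which produce consecutive Cα distances far from 3.8 Å and would need separate handling in any real use.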
Affiliation(s)
- Sandeep Chakraborty
  - Department of Biological Sciences, Tata Institute of Fundamental Research, Mumbai, 400 005, India
- Ravindra Venkatramani
  - Department of Chemical Sciences, Tata Institute of Fundamental Research, Mumbai, 400 005, India
- Basuthkar J. Rao
  - Department of Biological Sciences, Tata Institute of Fundamental Research, Mumbai, 400 005, India
- Bjarni Asgeirsson
  - Science Institute, Department of Biochemistry, University of Iceland, Reykjavik, IS-107, Iceland
- Abhaya M. Dandekar
  - Plant Sciences Department, University of California, Davis, CA 95616, USA
|
11
|
Chakraborty S, Venkatramani R, Rao BJ, Asgeirsson B, Dandekar AM. Protein structure quality assessment based on the distance profiles of consecutive backbone Cα atoms. F1000Res 2013; 2:211. [PMID: 24555103 DOI: 10.12688/f1000research.2-211.v1] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Accepted: 10/10/2013] [Indexed: 01/22/2023]
Abstract
Predicting the three-dimensional native state structure of a protein from its primary sequence is an unsolved grand challenge in molecular biology. Two main computational approaches have evolved to obtain the structure from the protein sequence - ab initio/de novo methods and template-based modeling - both of which typically generate multiple possible native state structures. Model quality assessment programs (MQAP) validate these predicted structures in order to identify the correct native state structure. Here, we propose an MQAP for assessing the quality of protein structures based on the distances of consecutive Cα atoms. We hypothesize that the root-mean-square deviation of the distance of consecutive Cα (RDCC) atoms from the ideal value of 3.8 Å, derived from a statistical analysis of high quality protein structures (top100H database), is minimized in native structures. Based on tests with the top100H set, we propose an RDCC cutoff value of 0.012 Å, above which a structure can be filtered out as a non-native structure. We applied the RDCC discriminator on decoy sets from the Decoys 'R' Us database to show that the native structures in all decoy sets tested have RDCC below the 0.012 Å cutoff. While most decoy sets were either indistinguishable using this discriminator or had very few violations, all the decoy structures in the fisa decoy set were discriminated by applying the RDCC criterion. This highlights the physical non-viability of the fisa decoy set, and possible issues in benchmarking other methods using this set. The source code and manual are made available at https://github.com/sanchak/mqap and permanently available at 10.5281/zenodo.7134.
Affiliation(s)
- Sandeep Chakraborty
  - Department of Biological Sciences, Tata Institute of Fundamental Research, Mumbai, 400 005, India
- Ravindra Venkatramani
  - Department of Chemical Sciences, Tata Institute of Fundamental Research, Mumbai, 400 005, India
- Basuthkar J Rao
  - Department of Biological Sciences, Tata Institute of Fundamental Research, Mumbai, 400 005, India
- Bjarni Asgeirsson
  - Science Institute, Department of Biochemistry, University of Iceland, Reykjavik, IS-107, Iceland
- Abhaya M Dandekar
  - Plant Sciences Department, University of California, Davis, CA 95616, USA
|
12
|
Kauffman C, Karypis G. Coarse- and fine-grained models for proteins: Evaluation by decoy discrimination. Proteins 2013. [DOI: 10.1002/prot.24222] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/03/2023]
Affiliation(s)
- Chris Kauffman
  - Department of Computer Science, George Mason University, Fairfax, Virginia 22030, USA
|
13
|
Chakraborty S, Venkatramani R, Rao BJ, Asgeirsson B, Dandekar AM. The electrostatic profile of consecutive Cβ atoms applied to protein structure quality assessment. F1000Res 2013; 2:243. [PMID: 25506420 PMCID: PMC4257144 DOI: 10.12688/f1000research.2-243.v3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Accepted: 09/16/2014] [Indexed: 12/23/2022]
Abstract
The structure of a protein provides insight into its physiological interactions with other components of the cellular soup. Methods that predict putative structures from sequences typically yield multiple, closely-ranked possibilities. A critical component in the process is the model quality assessing program (MQAP), which selects the best candidate from this pool of structures. Here, we present a novel MQAP based on the physical properties of sidechain atoms. We propose a method for assessing the quality of protein structures based on the electrostatic potential difference (EPD) of Cβ atoms in consecutive residues. We demonstrate that the EPDs of Cβ atoms on consecutive residues provide unique signatures of the amino acid types. The EPDs of Cβ atoms are learnt from a set of 1000 non-homologous protein structures with a resolution cutoff of 1.6 Å obtained from the PISCES database. Based on the Boltzmann hypothesis that lower energy conformations are proportionately sampled more, and on Anfinsen's thermodynamic hypothesis that the native structure of a protein is the minimum free energy state, we hypothesize that the deviation of observed EPD values from the mean values obtained in the learning phase is minimized in the native structure. We achieved an average specificity of 0.91, 0.94 and 0.93 on hg_structal, 4state_reduced and ig_structal decoy sets, respectively, taken from the Decoys 'R' Us database. The source code and manual are made available at https://github.com/sanchak/mqap and permanently available at 10.5281/zenodo.7134.
Affiliation(s)
- Sandeep Chakraborty
  - Department of Biological Sciences, Tata Institute of Fundamental Research, Mumbai, 400 005, India
- Ravindra Venkatramani
  - Department of Chemical Sciences, Tata Institute of Fundamental Research, Mumbai, 400 005, India
- Basuthkar J Rao
  - Department of Biological Sciences, Tata Institute of Fundamental Research, Mumbai, 400 005, India
- Bjarni Asgeirsson
  - Science Institute, Department of Biochemistry, University of Iceland, IS-107 Reykjavik, Iceland
- Abhaya M Dandekar
  - Plant Sciences Department, University of California, Davis, CA 95616, USA
|
14
|
Cossio P, Granata D, Laio A, Seno F, Trovato A. A simple and efficient statistical potential for scoring ensembles of protein structures. Sci Rep 2012. [DOI: 10.1038/srep00351] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.4] [Indexed: 11/09/2022]
|
15
|
Developing a high-quality scoring function for membrane protein structures based on specific inter-residue interactions. J Comput Aided Mol Des 2012; 26:301-9. [PMID: 22395902 DOI: 10.1007/s10822-012-9556-z] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 09/06/2011] [Accepted: 02/19/2012] [Indexed: 10/28/2022]
Abstract
Membrane proteins are of particular biological and pharmaceutical importance, and computational modeling and structure prediction approaches play an important role in studies of membrane proteins. Developing an accurate model quality assessment program is of significance to the structure prediction of membrane proteins. Few such programs have been proposed that can be applied to a broad range of membrane protein classes and perform with high accuracy. We developed a new model scoring function, Interaction-based Quality assessment (IQ), based on the analysis of four types of inter-residue interactions within the transmembrane domains of helical membrane proteins. This function was tested using three high-quality model sets: all 206 models of GPCR Dock 2008, all 284 models of GPCR Dock 2010, and all 92 helical membrane protein models of the HOMEP set. For all three sets, the scoring function can select the native structures among all of the models with success rates of 93, 85, and 100%, respectively. For comparison, these three model sets were also evaluated with a recently published model assessment program for membrane protein structures, ProQM, which gave success rates of 85, 79, and 92%, respectively. These results suggest that IQ outperforms ProQM when only the transmembrane regions of the models are considered. This scoring function should be useful for the computational modeling of membrane proteins.
|
16
|
Huang SY, Zou X. Statistical mechanics-based method to extract atomic distance-dependent potentials from protein structures. Proteins 2011; 79:2648-61. [PMID: 21732421 PMCID: PMC11108592 DOI: 10.1002/prot.23086] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.4] [Received: 01/08/2011] [Revised: 04/21/2011] [Accepted: 05/09/2011] [Indexed: 12/25/2022]
Abstract
In this study, we have developed a statistical mechanics-based iterative method to extract statistical atomic interaction potentials from known, nonredundant protein structures. Our method circumvents the long-standing reference state problem in deriving traditional knowledge-based scoring functions, by using rapid iterations through a physical, global convergence function. The rapid convergence of this physics-based method, unlike other parameter optimization methods, warrants the feasibility of deriving distance-dependent, all-atom statistical potentials to keep the scoring accuracy. The derived potentials, referred to as ITScore/Pro, have been validated using three diverse benchmarks: the high-resolution decoy set, the AMBER benchmark decoy set, and the CASP8 decoy set. Significant improvement in performance has been achieved. Finally, comparisons between the potentials of our model and potentials of a knowledge-based scoring function with a randomized reference state have revealed the reason for the better performance of our scoring function, which could provide useful insight into the development of other physical scoring functions. The potentials developed in this study are generally applicable for structural selection in protein structure prediction.
Affiliation(s)
- Sheng-You Huang
  - Department of Physics and Astronomy, Department of Biochemistry, Dalton Cardiovascular Research Center, and Informatics Institute, University of Missouri, Columbia, MO 65211
- Xiaoqin Zou
  - Department of Physics and Astronomy, Department of Biochemistry, Dalton Cardiovascular Research Center, and Informatics Institute, University of Missouri, Columbia, MO 65211
|
17
|
Bernauer J, Huang X, Sim AYL, Levitt M. Fully differentiable coarse-grained and all-atom knowledge-based potentials for RNA structure evaluation. RNA 2011; 17:1066-1075. [PMID: 21521828 PMCID: PMC3096039 DOI: 10.1261/rna.2543711] [Citation(s) in RCA: 66] [Impact Index Per Article: 4.7] [Received: 11/15/2010] [Accepted: 03/01/2011] [Indexed: 05/27/2023]
Abstract
RNA molecules play integral roles in gene regulation, and understanding their structures gives us important insights into their biological functions. Despite recent developments in template-based and parameterized energy functions, the structure of RNA, in particular the nonhelical regions, is still difficult to predict. Knowledge-based potentials have proven efficient in protein structure prediction. In this work, we describe two differentiable knowledge-based potentials derived from a curated data set of RNA structures, with all-atom or coarse-grained representation, respectively. We focus on one aspect of the prediction problem: the identification of native-like RNA conformations from a set of near-native models. Using a variety of near-native RNA models generated from three independent methods, we show that our potential is able to distinguish the native structure and identify native-like conformations, even at the coarse-grained level. The all-atom version of our knowledge-based potential performs better and appears to be more effective at discriminating near-native RNA conformations than one of the most highly regarded parameterized potentials. The fully differentiable form of our potentials will additionally likely be useful for structure refinement and/or molecular dynamics simulations.
Collapse
Affiliation(s)
- Julie Bernauer
- INRIA AMIB Bioinformatique, Laboratoire d'Informatique (LIX), Ecole Polytechnique, 91128 Palaiseau, France.
|
18
|
Kalman M, Ben-Tal N. Quality assessment of protein model-structures using evolutionary conservation. Bioinformatics 2010; 26:1299-307. [PMID: 20385730 PMCID: PMC2865859 DOI: 10.1093/bioinformatics/btq114] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.8] [Indexed: 11/13/2022]
Abstract
Motivation: Programs that evaluate the quality of a protein structural model are important both for validating the structure determination procedure and for guiding the model-building process. Such programs are based on properties of native structures that are generally not expected for faulty models. One such property, which is rarely used for automatic structure quality assessment, is the tendency for conserved residues to be located at the structural core and for variable residues to be located at the surface. Results: We present ConQuass, a novel quality assessment program based on the consistency between the model structure and the protein's conservation pattern. We show that it can identify problematic structural models, and that the scores it assigns to the server models in CASP8 correlate with the similarity of the models to the native structure. We also show that when the conservation information is reliable, the method's performance is comparable and complementary to that of the other single-structure quality assessment methods that participated in CASP8 and that do not use additional structural information from homologs. Availability: A Perl implementation of the method, as well as the various Perl and R scripts used for the analysis, are available at http://bental.tau.ac.il/ConQuass/. Contact: nirb@tauex.tau.ac.il Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Matan Kalman
- Department of Biochemistry, George S. Wise Faculty of Life Sciences, Tel Aviv University, Ramat Aviv 69978, Israel
|
19
|
Rykunov D, Fiser A. New statistical potential for quality assessment of protein models and a survey of energy functions. BMC Bioinformatics 2010; 11:128. [PMID: 20226048 PMCID: PMC2853469 DOI: 10.1186/1471-2105-11-128] [Citation(s) in RCA: 72] [Impact Index Per Article: 4.8] [Received: 10/06/2009] [Accepted: 03/12/2010] [Indexed: 11/30/2022] Open
Abstract
Background: Scoring functions, such as molecular mechanics force fields and statistical potentials, are fundamentally important tools in protein structure modeling and quality assessment. Results: The performances of a number of publicly available scoring functions are compared with statistical rigor, with an emphasis on knowledge-based potentials. We explored the effect on accuracy of alternative choices for representing interaction center types and of other features of scoring functions, such as using information on solvent accessibility and torsion angles, and accounting for secondary structure preferences and side chain orientation. Partially based on the observations made, we present a novel residue-based statistical potential, which employs a shuffled reference state definition and takes into account the mutual orientation of residue side chains. The atom- and residue-level statistical potentials proposed in this work, and Linux executables to calculate the energy of a given protein, can be downloaded from http://www.fiserlab.org/potentials. Conclusions: Among the most influential terms, we observed a critical role of a proper reference state definition and the benefits of including information about the microenvironment of interaction centers. Molecular mechanics potentials were also tested and found to be over-sensitive to small local imperfections in a structure, requiring unfeasibly long energy relaxation before energy scores started to correlate with model quality.
Affiliation(s)
- Dmitry Rykunov
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Ave, Bronx, NY 10461, USA
|