1
Hashemi AS, Vaisman II. Topology-based protein classification: A deep learning approach. Biochem Biophys Res Commun 2025; 746:151240. [PMID: 39742787 DOI: 10.1016/j.bbrc.2024.151240] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2024] [Revised: 11/29/2024] [Accepted: 12/23/2024] [Indexed: 01/04/2025]
Abstract
Utilizing Artificial Intelligence (AI) in computational biology techniques could offer significant advantages in alleviating the growing workloads faced by structural biologists, especially with the emergence of big data. In this study, we employed Delaunay tessellation as a promising method to obtain the overall structural topology of proteins. Subsequently, we developed multi-class deep neural network models to classify protein superfamilies based on their local topology. Our models achieved a test accuracy of approximately 0.92 in classifying proteins into 18 well-populated superfamilies. We believe that the results of this study hold substantial value since, to the best of our knowledge, no previous studies have reported the utilization of protein topological data for protein classification through deep learning and Delaunay tessellation.
Affiliation(s)
- Aliye Sadat Hashemi
- School of Systems Biology, George Mason University, Manassas, VA, 20110, USA.
- Iosif I Vaisman
- School of Systems Biology, George Mason University, Manassas, VA, 20110, USA.
2
Joo H, Chavan AG, Fraga KJ, Tsai J. An amino acid code for irregular and mixed protein packing. Proteins 2015; 83:2147-61. [PMID: 26370334 DOI: 10.1002/prot.24929] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 05/04/2015] [Revised: 09/01/2015] [Accepted: 09/02/2015] [Indexed: 11/10/2022]
Abstract
To advance our understanding of protein tertiary structure, the development of the knob-socket model is completed in an analysis of the packing in irregular coil and turn secondary structure as well as between mixed secondary structure. The knob-socket model simplifies packing based on repeated patterns of two motifs: a three-residue socket for packing within secondary (2°) structure and a four-residue knob-socket for tertiary (3°) packing. For coil and turn secondary structure, knob-sockets allow identification of a correlation between amino acid composition and tertiary arrangements in space. Coil contributes almost as much as α-helices to tertiary packing. In irregular sockets, Gly, Pro, Asp, and Ser are favored, while in irregular knobs, the preference order is Arg, Asp, Pro, Asn, Thr, Leu, and Gly. Cys, His, Met, and Trp are not favored in either. In mixed packing, the knob amino acid preferences are a function of the socket that they are packing into, whereas the amino acid composition of the sockets does not depend on the secondary structure of the knob. A unique motif of a coil knob with an XYZ β-sheet socket may potentially function to inhibit β-sheet extension. In addition, analysis of the preferred crossing angles for strands within a β-sheet and mixed α-helix/β-sheet identifies canonical packing patterns useful in protein design. Lastly, the knob-socket model abstracts the complexity of protein tertiary structure into an intuitive packing surface topology map.
Affiliation(s)
- Hyun Joo
- Department of Chemistry, University of the Pacific, Stockton, California, 95211
- Archana G Chavan
- Department of Chemistry, University of the Pacific, Stockton, California, 95211
- Keith J Fraga
- Department of Chemistry, University of the Pacific, Stockton, California, 95211
- Jerry Tsai
- Department of Chemistry, University of the Pacific, Stockton, California, 95211
3
Taylor TJ, Bai H, Tai CH, Lee B. Assessment of CASP10 contact-assisted predictions. Proteins 2013; 82 Suppl 2:84-97. [PMID: 23873510 DOI: 10.1002/prot.24367] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Received: 05/29/2013] [Accepted: 07/09/2013] [Indexed: 11/08/2022]
Abstract
In CASP10, for the first time, contact-assisted structure predictions have been assessed. Sets of pairs of contacting residues from target structures were provided to predictors for a second round of prediction after the initial round in which they were given only sequences. The objective of the experiment was to measure model quality improvement resulting from the added contact information and thereby assess and help develop so-called hybrid prediction methods--methods where some experimentally determined distance constraints are used to augment de novo computational prediction methods. The results of the experiment were, overall, quite promising.
Affiliation(s)
- Todd J Taylor
- Laboratory of Molecular Biology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland
4
Tian Y, Deutsch C, Krishnamoorthy B. Scoring function to predict solubility mutagenesis. Algorithms Mol Biol 2010; 5:33. [PMID: 20929563 PMCID: PMC2958853 DOI: 10.1186/1748-7188-5-33] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.2] [Received: 06/01/2010] [Accepted: 10/07/2010] [Indexed: 11/16/2022] Open
Abstract
BACKGROUND Mutagenesis is commonly used to engineer proteins with desirable properties not present in the wild type (WT) protein, such as increased or decreased stability, reactivity, or solubility. Experimentalists often have to choose a small subset of mutations from a large number of candidates to obtain the desired change, and computational techniques are invaluable to make the choices. While several such methods have been proposed to predict stability and reactivity mutagenesis, solubility has not received much attention. RESULTS We use concepts from computational geometry to define a three body scoring function that predicts the change in protein solubility due to mutations. The scoring function captures both sequence and structure information. By exploring the literature, we have assembled a substantial database of 137 single- and multiple-point solubility mutations. Our database is the largest such collection with structural information known so far. We optimize the scoring function using linear programming (LP) methods to derive its weights based on training. Starting with default values of 1, we find weights in the range [0,2] so that predictions of increase or decrease in solubility are optimized. We compare the LP method to the standard machine learning techniques of support vector machines (SVM) and the Lasso. Using statistics for leave-one-out (LOO), 10-fold, and 3-fold cross validations (CV) for training and prediction, we demonstrate that the LP method performs the best overall. For the LOOCV, the LP method has an overall accuracy of 81%. AVAILABILITY Executables of programs, tables of weights, and datasets of mutants are available from the following web page: http://www.wsu.edu/~kbala/OptSolMut.html.
Affiliation(s)
- Ye Tian
- Department of Mathematics, Washington State University, Pullman, WA 99164, USA
- Bala Krishnamoorthy
- Department of Mathematics, Washington State University, Pullman, WA 99164, USA
5
Abstract
Background There is a considerable literature on the source of the thermostability of proteins from thermophilic organisms. Understanding the mechanisms for this thermostability would provide insights into proteins generally and permit the design of synthetic hyperstable biocatalysts. Results We have systematically tested a large number of sequence and structure derived quantities for their ability to discriminate thermostable proteins from their non-thermostable orthologs using sets of mesophile-thermophile ortholog pairs. Most of the quantities tested correspond to properties previously reported to be associated with thermostability. Many of the structure related properties were derived from the Delaunay tessellation of protein structures. Conclusions Carefully selected sequence based indices discriminate better than purely structure based indices. Combined sequence and structure based indices improve performance somewhat further. Based on our analysis, the strongest contributors to thermostability are an increase in ion pairs on the protein surface and a more strongly hydrophobic interior.
6
Sadowski MI, Taylor WR. Protein structures, folds and fold spaces. J Phys Condens Matter 2010; 22:033103. [PMID: 21386276 DOI: 10.1088/0953-8984/22/3/033103] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 05/30/2023]
Abstract
There has been considerable progress towards the goal of understanding the space of possible tertiary structures adopted by proteins. Despite a greatly increased rate of structure determination and a deliberate strategy of sequencing proteins expected to be very different from those already known, it is now rare to see a genuinely new fold, leading to the conclusion that we have seen the majority of natural structural types. The increase in knowledge has also led to a critical examination of traditional fold-based classifications and their meaning for evolution and protein structures. We review these issues and discuss possible solutions.
Affiliation(s)
- Michael I Sadowski
- Division of Mathematical Biology, MRC National Institute for Medical Research, The Ridgeway, Mill Hill, London NW7 1AA, UK
7
Müller CL, Sbalzarini IF, van Gunsteren WF, Zagrović B, Hünenberger PH. In the eye of the beholder: Inhomogeneous distribution of high-resolution shapes within the random-walk ensemble. J Chem Phys 2009; 130:214904. [PMID: 19508095 DOI: 10.1063/1.3140090] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/14/2022] Open
Abstract
The concept of high-resolution shapes (also referred to as folds or states, depending on the context) of a polymer chain plays a central role in polymer science, structural biology, bioinformatics, and biopolymer dynamics. However, although the idea of shape is intuitively very useful, there is no unambiguous mathematical definition for this concept. In the present work, the distributions of high-resolution shapes within the ideal random-walk ensembles with N=3,...,6 beads (or up to N=10 for some properties) are investigated using a systematic (grid-based) approach based on a simple working definition of shapes relying on the root-mean-square atomic positional deviation as a metric (i.e., to define the distance between pairs of structures) and a single cutoff criterion for the shape assignment. Although the random-walk ensemble appears to represent the paramount of homogeneity and randomness, this analysis reveals that the distribution of shapes within this ensemble, i.e., in the total absence of interatomic interactions characteristic of a specific polymer (beyond the generic connectivity constraint), is significantly inhomogeneous. In particular, a specific (densest) shape occurs with a local probability that is 1.28, 1.79, 2.94, and 10.05 times (N=3,...,6) higher than the corresponding average over all possible shapes (these results can tentatively be extrapolated to a factor as large as about 10^28 for N=100). The qualitative results of this analysis lead to a few rather counterintuitive suggestions, namely, that, e.g., (i) a fold classification analysis applied to the random-walk ensemble would lead to the identification of random-walk "folds;" (ii) a clustering analysis applied to the random-walk ensemble would also lead to the identification of random-walk "states" and associated relative free energies; and (iii) a random-walk ensemble of polymer chains could lead to well-defined diffraction patterns in hypothetical fiber or crystal diffraction experiments.
The inhomogeneous nature of the shape probability distribution identified here for random walks may represent a significant underlying baseline effect in the analysis of real polymer chain ensembles (i.e., in the presence of specific interatomic interactions). As a consequence, a part of what is called a polymer shape may actually reside just "in the eye of the beholder" rather than in the nature of the interactions between the constituting atoms, and the corresponding observation-related bias should be taken into account when drawing conclusions from shape analyses as applied to real structural ensembles.
Affiliation(s)
- Christian L Müller
- Institute of Computational Science and Swiss Institute of Bioinformatics, ETH Zürich, Switzerland
8
Maeda MH, Kinoshita K. Development of new indices to evaluate protein–protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor. J Mol Graph Model 2009; 27:706-11. [DOI: 10.1016/j.jmgm.2008.11.002] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 07/14/2008] [Revised: 10/30/2008] [Accepted: 11/03/2008] [Indexed: 10/21/2022]
9
Kirillova S, Carugo O. Progress in the PRIDE technique for rapidly comparing protein three-dimensional structures. BMC Res Notes 2008; 1:44. [PMID: 18710497 PMCID: PMC2535597 DOI: 10.1186/1756-0500-1-44] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 02/25/2008] [Accepted: 07/11/2008] [Indexed: 12/02/2022] Open
Abstract
Background Accurate and fast tools for comparing protein three-dimensional structures are necessary to scan and analyze large data sets. Findings The method described here is not only very fast but also reasonably precise, as shown by using the CATH database as a test set. Its rapidity depends on the fact that the protein structure is represented by vectors that monitor the distribution of the inter-residue distances within the protein core and the structure of which is optimized with the Freedman-Diaconis rule. Conclusion The similarity score is based on a χ² test, the probability density function of which can be accurately estimated.
Affiliation(s)
- Svetlana Kirillova
- Department of Biomolecular Structural Chemistry, Programme of Structural and Computational Biology, Max F. Perutz Laboratories, Vienna University, Campus Vienna Biocenter 5, A-1030 Vienna, Austria.
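The abstract above names two standard ingredients: histogram bins sized with the Freedman-Diaconis rule and a χ² comparison of distance distributions. The following is a minimal sketch of those two pieces only, not the PRIDE implementation. The function names, the equal-width binning helper, and the particular two-sample χ² form (valid for histograms with equal totals) are assumptions made for illustration.

```python
import statistics


def freedman_diaconis_width(values):
    """Bin width h = 2 * IQR * n^(-1/3) (Freedman-Diaconis rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return 2.0 * (q3 - q1) * len(values) ** (-1.0 / 3.0)


def histogram(values, width, lo, n_bins):
    """Count values (assumed >= lo) into n_bins equal-width bins from lo."""
    counts = [0] * n_bins
    for v in values:
        idx = min(int((v - lo) / width), n_bins - 1)  # clamp the top edge
        counts[idx] += 1
    return counts


def chi_square_statistic(a, b):
    """Two-sample chi-square statistic for histograms with equal totals.

    Assumed form: sum over bins of (a_i - b_i)^2 / (a_i + b_i),
    skipping bins that are empty in both histograms.
    """
    return sum((x - y) ** 2 / (x + y) for x, y in zip(a, b) if x + y > 0)
```

In use, the inter-residue distances of two structures would be binned with the same width and range, and a smaller statistic would indicate more similar distance distributions.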
10
Stout M, Bacardit J, Hirst JD, Smith RE, Krasnogor N. Prediction of topological contacts in proteins using learning classifier systems. Soft Comput 2008. [DOI: 10.1007/s00500-008-0318-8] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 10/21/2022]
11
Abstract
As protein databases continue to grow in size, exhaustive search methods that compare a query structure against every database structure can no longer provide satisfactory performance. Instead, the filter-and-refine paradigm offers an efficient alternative to database search without compromising the accuracy of the answers. In this paradigm, protein structures are represented in an abstract form. During querying, based on the abstract representations, the filtering phase prunes away dissimilar structures quickly so that only a small collection of promising structures are examined using a detailed structure alignment technique in the refinement phase. This article reviews mainly techniques developed for the filtering phase.
Affiliation(s)
- Zeyar Aung
- Institute for Infocomm Research, 21 Heng Mui Keng Terrace, Singapore 119613, Singapore.
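The filter-and-refine paradigm described in this abstract can be sketched generically as below. This is an illustrative outline under stated assumptions, not a method from the reviewed article: `abstract`, `cheap_distance`, and `detailed_score` are hypothetical callables supplied by the caller, and `keep` is an assumed cutoff for how many candidates survive filtering.

```python
def filter_and_refine(query, database, abstract, cheap_distance,
                      detailed_score, keep=10):
    """Return the `keep` best candidates, ordered by the detailed score.

    abstract(s)          -> compact representation of structure s (hypothetical)
    cheap_distance(a, b) -> fast distance between two representations
    detailed_score(q, s) -> expensive similarity, higher means more similar
    """
    q_abs = abstract(query)
    # Filtering phase: rank every entry by the cheap distance on abstract
    # representations (a real system would precompute abstract(s) offline).
    ranked = sorted(database, key=lambda s: cheap_distance(q_abs, abstract(s)))
    candidates = ranked[:keep]
    # Refinement phase: run the expensive comparison on the survivors only.
    return sorted(candidates, key=lambda s: detailed_score(query, s),
                  reverse=True)
```

The design point is that the expensive comparison runs at most `keep` times per query, regardless of database size.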
12
Zotenko E, Islamaj Dogan R, Wilbur WJ, O'Leary DP, Przytycka TM. Structural footprinting in protein structure comparison: the impact of structural fragments. BMC Struct Biol 2007; 7:53. [PMID: 17688700 PMCID: PMC2082327 DOI: 10.1186/1472-6807-7-53] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Received: 01/26/2007] [Accepted: 08/09/2007] [Indexed: 11/23/2022]
Abstract
Background One approach for speeding-up protein structure comparison is the projection approach, where a protein structure is mapped to a high-dimensional vector and structural similarity is approximated by distance between the corresponding vectors. Structural footprinting methods are projection methods that employ the same general technique to produce the mapping: first select a representative set of structural fragments as models and then map a protein structure to a vector in which each dimension corresponds to a particular model and "counts" the number of times the model appears in the structure. The main difference between any two structural footprinting methods is in the set of models they use; in fact a large number of methods can be generated by varying the type of structural fragments used and the amount of detail in their representation. How do these choices affect the ability of the method to detect various types of structural similarity? Results To answer this question we benchmarked three structural footprinting methods that vary significantly in their selection of models against the CATH database. In the first set of experiments we compared the methods' ability to detect structural similarity characteristic of evolutionarily related structures, i.e., structures within the same CATH superfamily. In the second set of experiments we tested the methods' agreement with the boundaries imposed by classification groups at the Class, Architecture, and Fold levels of the CATH hierarchy. Conclusion In both experiments we found that the method which uses secondary structure information has the best performance on average, but no one method performs consistently the best across all groups at a given classification level. We also found that combining the methods' outputs significantly improves the performance. 
Moreover, our new techniques to measure and visualize the methods' agreement with the CATH hierarchy, including the thresholded affinity graph, are useful beyond this work. In particular, they can be used to expose a similar composition of different classification groups in terms of structural fragments used by the method and thus provide an alternative demonstration of the continuous nature of the protein structure universe.
Affiliation(s)
- Elena Zotenko
- Department of Computer Science, University of Maryland, College Park, MD 20742, USA
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Rezarta Islamaj Dogan
- Department of Computer Science, University of Maryland, College Park, MD 20742, USA
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- W John Wilbur
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Dianne P O'Leary
- Department of Computer Science, University of Maryland, College Park, MD 20742, USA
- Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742, USA
- Teresa M Przytycka
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
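The projection idea in this abstract (map each structure to a vector of fragment-model counts, then approximate structural similarity by vector distance) can be illustrated with a small sketch. It assumes each structural fragment has already been assigned the label of its closest model; the label names and the choice of Euclidean distance are hypothetical and are not taken from the benchmarked methods.

```python
import math
from collections import Counter


def footprint(fragment_labels, models):
    """Count how often each model occurs among a structure's fragments.

    fragment_labels: one model label per fragment (assignment assumed done)
    models:          ordered list of model labels defining the dimensions
    """
    counts = Counter(fragment_labels)
    return [counts[m] for m in models]


def euclidean(u, v):
    """Euclidean distance between two footprint vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
```

Changing the list of `models` changes the footprinting method, which is the design axis the abstract says distinguishes one method from another.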
13
Abstract
A novel protein structure alignment technique has been developed reducing much of the secondary and tertiary structure to a sequential representation greatly accelerating many structural computations, including alignment. Constructed from incidence relations in the Delaunay tetrahedralization, alignments of the sequential representation describe structural similarities that cannot be expressed with rigid-body superposition and complement existing techniques minimizing root-mean-squared distance through superposition. Restricting to the largest substructure superimposable by a single rigid-body transformation determines an alignment suitable for root-mean-squared distance comparisons and visualization. Restricted alignments of a test set of histones and histone-like proteins determined superpositions nearly identical to those produced by the established structure alignment routines of DaliLite and ProSup. Alignment of three, increasingly complex proteins: ferredoxin, cytidine deaminase, and carbamoyl phosphate synthetase, to themselves, demonstrated previously identified regions of self-similarity. All-against-all similarity index comparisons performed on a test set of 45 class I and class II aminoacyl-tRNA synthetases closely reproduced the results of established distance matrix methods while requiring 1/16 the time. Principal component analysis of pairwise tetrahedral decomposition similarity of 2300 molecular dynamics snapshots of tryptophanyl-tRNA synthetase revealed discrete microstates within the trajectory consistent with experimental results. The method produces results with sufficient efficiency for large-scale multiple structure alignment and is well suited to genomic and evolutionary investigations where no geometric model of similarity is known a priori.
Affiliation(s)
- Jeffrey Roach
- Department of Biochemistry and Biophysics, University of North Carolina, Chapel Hill, North Carolina 27599, USA.
14
Marsh L. Evolution of Structural Shape in Bacterial Globin-Related Proteins. J Mol Evol 2006; 62:575-87. [PMID: 16612536 DOI: 10.1007/s00239-005-0025-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 01/29/2005] [Accepted: 12/31/2005] [Indexed: 10/24/2022]
Abstract
The globin family of proteins has a characteristic structural pattern of helix interactions that nonetheless exhibits some variation. A simplified model for globin structural evolution was developed in which protein shape evolved by random change of contacts between helices. A conserved globin domain of 15 bacterial proteins representing four structural families was studied. Using a parsimony approach ancestral structural states could be reconstructed. The distribution of number of contact changes per site for a fixed topology tree fit a gamma distribution. Homoplasy was high, with multiple changes per site and no support for an invariant class of residue-residue contacts. Contacts changed more slowly than sequence. A phylogenetic reconstruction using a distance measure based on the proportion of shared contacts was generally consistent with a sequence-based phylogeny but not highly resolved. Contact pattern convergence between members of different globin family proteins could not be detected. Simulation studies indicated the convergence test was sensitive enough to have detected convergence involving only 10% of the contacts, suggesting a limit on the extent of selection for a specific contact pattern. Contact site methods may provide additional approaches to study the relationship between protein structure and sequence evolution.
Affiliation(s)
- Lorraine Marsh
- Department of Biology, Long Island University, 1 University Plaza, Brooklyn, NY 11201, USA.
15
Taylor TJ, Vaisman II. Graph theoretic properties of networks formed by the Delaunay tessellation of protein structures. Phys Rev E Stat Nonlin Soft Matter Phys 2006; 73:041925. [PMID: 16711854 DOI: 10.1103/physreve.73.041925] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.0] [Received: 10/31/2005] [Indexed: 05/09/2023]
Abstract
The Delaunay tessellation of several sets of real and simplified model protein structures has been used to explore graph theoretic properties of residue contact networks. The system of contacts defined by residues joined by edges in the Delaunay simplices can be thought of as a graph or network and analyzed using techniques from elementary graph theory and the theory of complex networks. Such analysis indicates that protein contact networks have small world character, but technically are not small world networks. This approach also indicates that networks formed by native structures and by most misfolded decoys can be differentiated by their respective graph properties. The characteristic features of residue contact networks can be used for the detection of structural elements in proteins, such as the ubiquitous closed loops consisting of 22-32 consecutive residues, where terminal residues are Delaunay neighbors.
Affiliation(s)
- Todd J Taylor
- Laboratory for Structural Bioinformatics, School of Computational Sciences, George Mason University, 10900 University Boulevard MSN5B3, Manassas, VA 20110, USA
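As a rough illustration of the approach in this abstract, the sketch below turns Delaunay simplices into a residue contact graph and computes two quantities commonly used when discussing small-world character: the mean clustering coefficient and the characteristic path length. It assumes the tessellation has already been computed elsewhere and is supplied as 4-tuples of residue indices. The function names are hypothetical, and the choice of these two measures is an assumption, not a statement of what the authors computed.

```python
from collections import deque
from itertools import combinations


def contact_graph(simplices):
    """Adjacency sets from tetrahedra given as 4-tuples of residue indices."""
    adj = {}
    for simplex in simplices:
        # Every pair of vertices in a simplex shares a Delaunay edge.
        for u, v in combinations(simplex, 2):
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
    return adj


def clustering_coefficient(adj):
    """Mean local clustering coefficient over nodes with degree >= 2."""
    values = []
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        values.append(2.0 * links / (k * (k - 1)))
    return sum(values) / len(values) if values else 0.0


def characteristic_path_length(adj):
    """Mean shortest-path length over connected ordered node pairs (BFS)."""
    total = pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0
```

Comparing these two values against those of a random graph with the same number of nodes and edges is the usual way to judge small-world character.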
16
Zhou X, Chou J, Wong STC. Protein structure similarity from Principle Component Correlation analysis. BMC Bioinformatics 2006; 7:40. [PMID: 16436213 PMCID: PMC1386710 DOI: 10.1186/1471-2105-7-40] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Received: 04/22/2005] [Accepted: 01/25/2006] [Indexed: 11/28/2022] Open
Abstract
Background Owing to rapid expansion of protein structure databases in recent years, methods of structure comparison are becoming increasingly effective and important in revealing novel information on functional properties of proteins and their roles in the grand scheme of evolutionary biology. Currently, the structural similarity between two proteins is measured by the root-mean-square-deviation (RMSD) in their best-superimposed atomic coordinates. RMSD is the golden rule of measuring structural similarity when the structures are nearly identical; it, however, fails to detect the higher order topological similarities in proteins evolved into different shapes. We propose new algorithms for extracting geometrical invariants of proteins that can be effectively used to identify homologous protein structures or topologies in order to quantify both close and remote structural similarities. Results We measure structural similarity between proteins by correlating the principle components of their secondary structure interaction matrix. In our approach, the Principle Component Correlation (PCC) analysis, a symmetric interaction matrix for a protein structure is constructed with relationship parameters between secondary elements that can take the form of distance, orientation, or other relevant structural invariants. When using a distance-based construction in the presence or absence of encoded N to C terminal sense, there are strong correlations between the principle components of interaction matrices of structurally or topologically similar proteins. Conclusion The PCC method is extensively tested for protein structures that belong to the same topological class but are significantly different by RMSD measure. The PCC analysis can also differentiate proteins having similar shapes but different topological arrangements. 
Additionally, we demonstrate that when using two independently defined interaction matrices, comparison of their maximum eigenvalues can be highly effective in clustering structurally or topologically similar proteins. We believe that the PCC analysis of interaction matrix is highly flexible in adopting various structural parameters for protein structure comparison.
Affiliation(s)
- Xiaobo Zhou
- Harvard Center for Neurodegeneration and Repair – Center for Bioinformatics, Harvard Medical School, 1249 Boylston Street, Boston, MA 02215, USA
- Functional and Molecular Imaging Center, Radiology Department, Brigham and Women's Hospital, One Brigham Circle, 1620 Tremont Street, Boston, MA 02121, USA
- James Chou
- Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, 240 Longwood Avenue, Boston, MA 02115, USA
- Stephen TC Wong
- Harvard Center for Neurodegeneration and Repair – Center for Bioinformatics, Harvard Medical School, 1249 Boylston Street, Boston, MA 02215, USA
- Functional and Molecular Imaging Center, Radiology Department, Brigham and Women's Hospital, One Brigham Circle, 1620 Tremont Street, Boston, MA 02121, USA
17
Lee MC, Yang R, Duan Y. Comparison between Generalized-Born and Poisson-Boltzmann methods in physics-based scoring functions for protein structure prediction. J Mol Model 2005; 12:101-10. [PMID: 16096807 DOI: 10.1007/s00894-005-0013-y] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.4] [Received: 03/17/2005] [Accepted: 06/23/2005] [Indexed: 11/28/2022]
Abstract
Continuum solvent models such as Generalized-Born and Poisson-Boltzmann methods hold the promise to treat solvation effects efficiently and to enable rapid scoring of protein structures when they are combined with physics-based energy functions. Yet, direct comparison of these two approaches on large protein data sets is lacking. Building on our previous work with a scoring function based on a Generalized-Born (GB) solvation model, and short molecular-dynamics simulations, we further extended the scoring function to compare with the MM-PBSA method to treat the solvent effect. We benchmarked this scoring function against seven publicly available decoy sets. We found that, somewhat surprisingly, the results of the MM-PBSA approach are comparable to the previous GB-based scoring function. We also discussed the effect on scoring function accuracy of the presence of large ligands and ions in some native structures of the decoy sets.
Affiliation(s)
- Matthew C Lee
- Department of Chemistry and Biochemistry, University of Delaware, Newark, DE 19716, USA