1
Dawid AE, Gront D, Kolinski A. SURPASS Low-Resolution Coarse-Grained Protein Modeling. J Chem Theory Comput 2017; 13:5766-5779. [PMID: 28992694 DOI: 10.1021/acs.jctc.7b00642] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Indexed: 11/28/2022]
Abstract
Coarse-grained modeling of biomolecules plays a very important role in molecular biology. In this work we present a novel SURPASS (Single United Residue per Pre-Averaged Secondary Structure fragment) model of proteins that can be an interesting alternative to existing coarse-grained models. The design of the model is unique and strongly supported by the statistical analysis of structural regularities characteristic of protein systems. Coarse-graining of protein chain structures assumes a single center of interactions per residue and accounts for preaveraged effects of four adjacent residue fragments. Knowledge-based statistical potentials encode complex interaction patterns of these fragments. Using the Replica Exchange Monte Carlo sampling scheme and a generic version of the SURPASS force field, we performed test simulations of a representative set of single-domain globular proteins. The method samples a significant part of conformational space and reproduces protein structures, including native-like ones, with surprisingly good accuracy. Future extension of the SURPASS model to large biomacromolecular systems is briefly discussed.
Affiliation(s)
- Aleksandra E Dawid
- Faculty of Chemistry, Biological and Chemical Research Center, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
- Dominik Gront
- Faculty of Chemistry, Biological and Chemical Research Center, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
- Andrzej Kolinski
- Faculty of Chemistry, Biological and Chemical Research Center, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
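The abstract of this entry mentions Replica Exchange Monte Carlo sampling. As a rough illustration of that general technique only (not the SURPASS implementation, whose details are not given here), the sketch below shows the standard Metropolis criterion for exchanging configurations between neighbouring temperatures. The function names `swap_probability` and `attempt_swaps` and the toy energies are hypothetical.

```python
import math
import random


def swap_probability(beta_i, beta_j, energy_i, energy_j):
    """Standard replica-exchange acceptance probability.

    beta_* are inverse temperatures; energy_* are the current energies of
    the configurations held at those temperatures.
    """
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_swaps(betas, energies, configs, rng):
    """Sweep over neighbouring temperature pairs and swap where accepted.

    `configs` stands in for whatever object represents a conformation;
    here it is only a list of labels (an assumption for illustration).
    """
    for k in range(len(betas) - 1):
        p = swap_probability(betas[k], betas[k + 1], energies[k], energies[k + 1])
        if rng.random() < p:
            energies[k], energies[k + 1] = energies[k + 1], energies[k]
            configs[k], configs[k + 1] = configs[k + 1], configs[k]
    return energies, configs


if __name__ == "__main__":
    rng = random.Random(0)
    betas = [1.0, 0.5, 0.25]          # coldest replica first
    energies = [-5.0, -10.0, -2.0]    # toy energies, arbitrary units
    configs = ["A", "B", "C"]
    print(attempt_swaps(betas, energies, configs, rng))
```

A swap that moves the lower-energy configuration to the colder temperature is always accepted; the opposite move is accepted with a Boltzmann-weighted probability, which is what lets trapped replicas escape local minima.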
2
Rajgaria R, Wei Y, Floudas CA. Contact prediction for beta and alpha-beta proteins using integer linear optimization and its impact on the first principles 3D structure prediction method ASTRO-FOLD. Proteins 2010; 78:1825-46. [PMID: 20225257 PMCID: PMC2858251 DOI: 10.1002/prot.22696] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.0] [Indexed: 12/24/2022]
Abstract
An integer linear optimization model is presented to predict residue contacts in beta, alpha + beta, and alpha/beta proteins. The total energy of a protein is expressed as the sum of a Cα-Cα distance-dependent contact energy contribution and a hydrophobic contribution. The model selects contacts that assign the lowest energy to the protein structure while satisfying a set of constraints that are included to enforce certain physically observed topological information. A new method based on hydrophobicity is proposed to find the beta-sheet alignments. These beta-sheet alignments are used as constraints for contacts between residues of beta-sheets. This model was tested on three independent protein test sets and CASP8 test proteins consisting of beta, alpha + beta, and alpha/beta proteins, and it was found to perform very well. The average accuracy of the predictions (separated by at least six residues) was approximately 61%. The average true positive and false positive distances were also calculated for each of the test sets and they are 7.58 Å and 15.88 Å, respectively. Residue contact prediction can be directly used to facilitate protein tertiary structure prediction. The proposed residue contact prediction model is incorporated into the first principles protein tertiary structure prediction approach, ASTRO-FOLD. The effectiveness of the contact prediction model was further demonstrated by the improvement in the quality of the protein structure ensemble generated using the predicted residue contacts for a test set of 10 proteins.
Affiliation(s)
- R. Rajgaria
- Department of Chemical Engineering, Princeton University, Princeton, NJ 08544-5263, U.S.A
- Y. Wei
- Department of Chemical Engineering, Princeton University, Princeton, NJ 08544-5263, U.S.A
- C. A. Floudas
- Department of Chemical Engineering, Princeton University, Princeton, NJ 08544-5263, U.S.A
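The abstract of this entry reports an accuracy for predicted contacts separated by at least six residues, together with average true positive and false positive distances. A minimal sketch of how such figures could be computed is given below. The 8 Å Cα-Cα cutoff used to decide whether a predicted pair is a true contact is an assumption made for illustration (the abstract does not state the threshold), and `evaluate_contacts` is a hypothetical helper name.

```python
import math


def evaluate_contacts(ca_coords, predicted, cutoff=8.0, min_separation=6):
    """Score predicted residue pairs against a known structure.

    ca_coords      : list of (x, y, z) C-alpha coordinates, one per residue
    predicted      : iterable of (i, j) residue index pairs (0-based)
    cutoff         : distance in angstroms below which a pair counts as a
                     true contact (assumed value, not from the source)
    min_separation : pairs closer than this in sequence are ignored
    Returns (accuracy, mean true-positive distance, mean false-positive distance).
    """
    true_pos, false_pos = [], []
    for i, j in predicted:
        if abs(i - j) < min_separation:
            continue  # short-range pairs are excluded from the statistics
        distance = math.dist(ca_coords[i], ca_coords[j])
        (true_pos if distance <= cutoff else false_pos).append(distance)

    def mean(values):
        return sum(values) / len(values) if values else float("nan")

    total = len(true_pos) + len(false_pos)
    accuracy = len(true_pos) / total if total else 0.0
    return accuracy, mean(true_pos), mean(false_pos)


if __name__ == "__main__":
    # Toy "structure": 20 residues on a straight line, 1 angstrom apart.
    coords = [(float(k), 0.0, 0.0) for k in range(20)]
    print(evaluate_contacts(coords, [(0, 6), (0, 15), (0, 3)]))
```

In the toy example the pair (0, 3) is skipped for being too close in sequence, (0, 6) counts as a true positive, and (0, 15) as a false positive.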
3
Abia D, Bastolla U, Chacón P, Fábrega C, Gago F, Morreale A, Tramontano A. In memoriam. Proteins 2010; 78:iii-viii. [DOI: 10.1002/prot.22660] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
4
Rajgaria R, McAllister SR, Floudas CA. Towards accurate residue-residue hydrophobic contact prediction for alpha helical proteins via integer linear optimization. Proteins 2009; 74:929-47. [PMID: 18767158 DOI: 10.1002/prot.22202] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Indexed: 12/22/2022]
Abstract
A new optimization-based method is presented to predict the hydrophobic residue contacts in alpha-helical proteins. The proposed approach uses a high-resolution, distance-dependent force field to calculate the interaction energy between different residues of a protein. The formulation predicts the hydrophobic contacts by minimizing the sum of these contact energies. These residue contacts are highly useful in narrowing down the conformational space searched by protein structure prediction algorithms. The proposed algorithm also offers the algorithmic advantage of producing a rank-ordered list of the best contact sets. This model was tested on four independent alpha-helical protein test sets and was found to perform very well. The average accuracy of the predictions (separated by at least six residues) obtained using the presented method was approximately 66% for single domain proteins. The average true positive and false positive distances were also calculated for each protein test set and they are 8.87 Å and 14.67 Å, respectively.
Affiliation(s)
- R Rajgaria
- Department of Chemical Engineering, Princeton University, Princeton, New Jersey 08544-5263, USA
5
Vicatos S, Kaznessis YN. Separating true positive predicted residue contacts from false positive ones in mainly alpha proteins, using constrained Metropolis MC simulations. Proteins 2008; 70:539-52. [PMID: 17879348 DOI: 10.1002/prot.21553] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/10/2022]
Abstract
We present a method that significantly improves the accuracy of predicted proximal residue pairs in protein molecules. Computational methods for predicting pairs of amino acids that are distant in the protein sequence but close in the protein 3D structure can benefit attempts to recognize the fold of a protein molecule in silico. Unfortunately, currently available methods suffer from low predictive accuracy. In this work, we use Monte Carlo simulations to fold protein molecules with proximal pair predictions used as additional energy constraints. To test our methods, we study molecules with known tertiary structures. With Monte Carlo, we generate ensembles of structures for each set of residue constraints. The distribution of the root mean square deviation of the folded structures from the known native structure reveals clear information about the accuracy of the constraint sets used. With recursive substitutions of constraints, false positive predictions are identified and filtered out, and significant improvements in accuracy are observed.
Affiliation(s)
- Spyridon Vicatos
- Department of Chemical Engineering and Materials Science, University of Minnesota, Minneapolis, Minnesota 55455, USA
6
Vicatos S, Reddy BVB, Kaznessis Y. Prediction of distant residue contacts with the use of evolutionary information. Proteins 2005; 58:935-49. [PMID: 15645442 DOI: 10.1002/prot.20370] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.1] [Indexed: 11/09/2022]
Abstract
In this work we present a novel correlated mutations analysis (CMA) method that is significantly more accurate than previously reported CMA methods. Calculation of correlation coefficients is based on physicochemical properties of residues (predictors) and not on substitution matrices. This results in reliable prediction of pairs of residues that are distant in protein sequence but proximal in its three-dimensional tertiary structure. Multiple sequence alignments (MSA) containing a sequence of known structure for 127 families from the PFAM database have been selected so that all major protein architectures described in the CATH classification database are represented. Protein sequences in the selected families were filtered so that only those evolutionarily close to the target protein remain in the MSA. The average accuracy obtained for the alpha beta class of proteins was 26.8% of predicted proximal pairs, with an average improvement over random accuracy (IOR) of 6.41. Average accuracy is 20.6% for the mainly beta class and 14.4% for the mainly alpha class. The optimum correlation coefficient cutoff (cc cutoff) was found to be around 0.65. The first predictor, which correlates to hydrophobicity, provides the most reliable results. The other two predictors give good predictions which can be used in conjunction with those of the first one. When a stricter cc cutoff is chosen, the average accuracy increases significantly (38.76% for the alpha beta class), but the trade-off is a smaller number of predictions. The use of solvent accessible area estimations for filtering false positives out of the predictions is promising.
Affiliation(s)
- Spyridon Vicatos
- Department of Chemical Engineering and Materials Science, University of Minnesota, Minneapolis, Minnesota 55455, USA
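The abstract of this entry describes correlating physicochemical properties of residues across alignment columns and keeping pairs above a correlation coefficient cutoff of about 0.65. The sketch below is a much simplified illustration of that idea, not the authors' method: it uses the Kyte-Doolittle hydropathy scale as a stand-in for the hydrophobicity-related predictor, a plain Pearson correlation between columns, and an assumed minimum sequence separation of six residues. It also assumes a gap-free alignment of the 20 standard amino acids. The names `pearson` and `correlated_pairs` are hypothetical.

```python
import math

# Kyte-Doolittle hydropathy values (stand-in predictor; an assumption).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input has no variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def correlated_pairs(msa, cc_cutoff=0.65, min_separation=6):
    """Return (i, j, cc) for column pairs whose property values covary.

    msa : list of equal-length, gap-free aligned sequences (strings)
    """
    length = len(msa[0])
    columns = [[KYTE_DOOLITTLE[seq[k]] for seq in msa] for k in range(length)]
    pairs = []
    for i in range(length):
        for j in range(i + min_separation, length):
            cc = pearson(columns[i], columns[j])
            if cc >= cc_cutoff:
                pairs.append((i, j, cc))
    return pairs


if __name__ == "__main__":
    toy_msa = ["IAAAAAAI", "LAAAAAAL", "KAAAAAAK", "DAAAAAAD"]
    print(correlated_pairs(toy_msa))
```

In the toy alignment only the first and last columns vary, and they vary together, so they form the single reported pair. Raising `cc_cutoff` trades fewer predictions for higher expected accuracy, mirroring the trade-off noted in the abstract.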
7
8
Betancourt MR, Skolnick J. Finding the needle in a haystack: educing native folds from ambiguous ab initio protein structure predictions. J Comput Chem 2001. [DOI: 10.1002/1096-987x(200102)22:3<339::aid-jcc1006>3.0.co;2-r] [Citation(s) in RCA: 60] [Impact Index Per Article: 2.6] [Indexed: 01/16/2023]
9
Simmerling C, Lee MR, Ortiz AR, Kolinski A, Skolnick J, Kollman PA. Combining MONSSTER and LES/PME to Predict Protein Structure from Amino Acid Sequence: Application to the Small Protein CMTI-1. J Am Chem Soc 2000. [DOI: 10.1021/ja993119k] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.3] [Indexed: 11/30/2022]
Affiliation(s)
- Carlos Simmerling
- Contribution from the Department of Pharmaceutical Chemistry, University of California, 513 Parnassus, San Francisco, California 94143-0446, Department of Molecular Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037
- Matthew R. Lee
- Contribution from the Department of Pharmaceutical Chemistry, University of California, 513 Parnassus, San Francisco, California 94143-0446, Department of Molecular Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037
- Angel R. Ortiz
- Contribution from the Department of Pharmaceutical Chemistry, University of California, 513 Parnassus, San Francisco, California 94143-0446, Department of Molecular Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037
- Andrzej Kolinski
- Contribution from the Department of Pharmaceutical Chemistry, University of California, 513 Parnassus, San Francisco, California 94143-0446, Department of Molecular Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037
- Jeffrey Skolnick
- Contribution from the Department of Pharmaceutical Chemistry, University of California, 513 Parnassus, San Francisco, California 94143-0446, Department of Molecular Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037
- Peter A. Kollman
- Contribution from the Department of Pharmaceutical Chemistry, University of California, 513 Parnassus, San Francisco, California 94143-0446, Department of Molecular Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037
10
Application of Reduced Models to Protein Structure Prediction. ACTA ACUST UNITED AC 1999. [DOI: 10.1016/s1380-7323(99)80086-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
11
Ortiz AR, Kolinski A, Rotkiewicz P, Ilkowski B, Skolnick J. Ab initio folding of proteins using restraints derived from evolutionary information. Proteins 1999. [DOI: 10.1002/(sici)1097-0134(1999)37:3+<177::aid-prot22>3.0.co;2-e] [Citation(s) in RCA: 87] [Impact Index Per Article: 3.5] [Indexed: 11/11/2022]
12
Skolnick J, Kolinski A, Ortiz AR. Reduced protein models and their application to the protein folding problem. J Biomol Struct Dyn 1998; 16:381-96. [PMID: 9833676 DOI: 10.1080/07391102.1998.10508255] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.7] [Indexed: 10/28/2022]
Abstract
One of the most important unsolved problems of computational biology is prediction of the three-dimensional structure of a protein from its amino acid sequence. In practice, the solution to the protein folding problem demands that two interrelated problems be simultaneously addressed. Potentials that recognize the native state from the myriad of misfolded conformations are required, and the multiple minima conformational search problem must be solved. A means of partly surmounting both problems is to use reduced protein models and knowledge-based potentials. Such models have been employed to elucidate a number of general features of protein folding, including the nature of the energy landscape, the factors responsible for the uniqueness of the native state and the origin of the two-state thermodynamic behavior of globular proteins. Reduced models have also been used to predict protein tertiary and quaternary structure. When combined with a limited amount of experimental information about secondary and tertiary structure, molecules of substantial complexity can be assembled. If predicted secondary structure and tertiary restraints are employed, low resolution models of single domain proteins can be successfully predicted. Thus, simplified protein models have played an important role in furthering the understanding of the physical properties of proteins.
Affiliation(s)
- J Skolnick
- Department of Molecular Biology, The Scripps Research Institute, La Jolla, CA 92037, USA.