1
|
Abstract
The potential energy landscape of pentapeptides was mapped in a collective-coordinate principal conformational subspace derived from principal component analysis of a nonredundant representative set of protein structures from the PDB. Three pentapeptide sequences that are known to be distinct in terms of their secondary structure characteristics, (Ala)5, (Gly)5, and Val.Asn.Thr.Phe.Val, were considered. Partitioning the landscapes into different energy valleys allowed the relative propensities of the peptide secondary structures to be calculated in a statistical mechanical framework. The distribution of observed pentapeptide conformations corresponded well to the topology of the energy landscape of the (Ala)5 sequence, where, in accord with reported trends, the α-helix showed a predominant propensity at 298 K. The topography of the landscapes indicates that the stabilization of the α-helix in the (Ala)5 sequence is enthalpic in nature, while entropic factors are important for stabilization of the β-sheet in the Val.Asn.Thr.Phe.Val sequence. The results indicate that local interactions within small pentapeptide segments can lead to a conformational preference for one secondary structure over the other, and that conformational entropy must be taken into account to reveal such a preference. The method can therefore provide critical structural information for ab initio protein folding methods.
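The abstract does not give the formulas behind the "statistical mechanical framework". A minimal sketch of one common way to turn a partitioned landscape into relative propensities is to sum Boltzmann weights over the grid points assigned to each valley. The function name, the basin labels, and the kcal/mol units are assumptions made for illustration and are not taken from the paper.

```python
import math

K_B = 0.0019872  # Boltzmann constant in kcal/(mol*K)

def basin_propensities(energies, labels, temperature=298.0):
    """Return {basin label: Boltzmann population} for sampled landscape points.

    energies: potential energies (kcal/mol) at points of the landscape
    labels:   basin assignment of each point (hypothetical partitioning)
    """
    beta = 1.0 / (K_B * temperature)
    e_min = min(energies)  # shift energies so the exponentials stay bounded
    weights = {}
    for energy, label in zip(energies, labels):
        weights[label] = weights.get(label, 0.0) + math.exp(-beta * (energy - e_min))
    z = sum(weights.values())  # partition function over all sampled points
    return {label: w / z for label, w in weights.items()}
```

Because every grid point in a valley contributes, a broad shallow valley can outweigh a narrow deep one, which is how an entropic preference would show up in such a calculation.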
|
2
|
Abstract
There is growing interest in identifying proteins on a proteome-wide scale. Among the different kinds of protein structure identification methods, graph-theoretic methods are particularly effective. Owing to their lower cost, higher effectiveness, and other advantages, they have attracted increasing attention from researchers. Specifically, graph-theoretic methods have been widely used in homology identification, side-chain cluster identification, peptide sequencing, and related tasks. This paper reviews several graph-theoretic methods for solving protein structure identification problems. We mainly introduce classical methods and mathematical models, including homology modeling based on clique finding, identification of side-chain clusters in protein structures based on the graph spectrum, and de novo peptide sequencing via tandem mass spectrometry using the spectrum graph model. In addition, concluding remarks and future priorities for each method are given.
Affiliation(s)
- Yan Yan
- Department of Applied Mathematics, Northwestern Polytechnical University, Xi’an, Shaanxi 710072, P.R. China
- Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK S7N 5A9, Canada
- Shenggui Zhang
- Department of Applied Mathematics, Northwestern Polytechnical University, Xi’an, Shaanxi 710072, P.R. China
- Fang-Xiang Wu
- Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK S7N 5A9, Canada
|
3
|
|
4
|
Dulin F, Callebaut I, Colloc'h N, Mornon JP. Sequence-based modeling of Aβ42 soluble oligomers. Biopolymers 2007; 85:422-37. [PMID: 17211889 DOI: 10.1002/bip.20675]
Abstract
Abeta fibrils, which are central to the pathology of Alzheimer's disease, form a cross-beta structure that likely contains parallel beta-sheets with a salt bridge between residues Asp23 and Lys28. Recent studies suggest that soluble oligomers of amyloid peptides have neurotoxic effects in cell cultures, raising interest in studying the structures of these intermediate forms. Here, we present three models of possible soluble Abeta forms based on sequence similarities, assumed to reflect local structural similarities, between the Abeta peptide and fragments of three proteins (adhesin, Semliki Forest virus capsid protein, and transthyretin). These three models share a similar structure in the C-terminal region composed of two beta-strands connected by a loop, which contains the Asp23-Lys28 salt bridge. This segment is also structurally well conserved in Abeta fibril forms. Differences between the three monomeric models occur in the N-terminal region and in the C-terminal tail. These three models might sample some of the most stable conformers of the soluble Abeta peptide within oligomeric assemblies, which were modeled here in the form of dimers, trimers, tetramers, and hexamers. The consistency of these models is discussed with respect to available experimental and theoretical data.
Affiliation(s)
- Fabienne Dulin
- Département de Biologie Structurale, IMPMC, CNRS UMR7590, Universités Pierre et Marie Curie-Paris 6 et Denis Diderot-Paris 7, F-75005 France
|
5
|
Ho BK, Dill KA. Folding very short peptides using molecular dynamics. PLoS Comput Biol 2006; 2:e27. [PMID: 16617376 PMCID: PMC1435986 DOI: 10.1371/journal.pcbi.0020027] [Received: 12/22/2005] [Accepted: 02/20/2005]
Abstract
Peptides often have conformational preferences. We simulated 133 peptide 8-mer fragments from six different proteins, sampled by replica-exchange molecular dynamics using Amber7 with a GB/SA (generalized-Born/solvent-accessible electrostatic approximation to water) implicit solvent. We found that 85 of the peptides have no preferred structure, while 48 of them converge to a preferred structure. In 85% of the converged cases (41 peptides), the structures found by the simulations bear some resemblance to their native structures, based on a coarse-grained backbone description. In particular, all seven of the β hairpins in the native structures contain a fragment in the turn that is highly structured. In the eight cases where the bioinformatics-based I-sites library picks out native-like structures, the present simulations are largely in agreement. Such physics-based modeling may be useful for identifying early nuclei in folding kinetics and for assisting in protein-structure prediction methods that utilize the assembly of peptide fragments.
To carry out specific biochemical reactions, proteins must adopt precise three-dimensional conformations. During the folding of a protein, the protein picks out the right conformation out of billions of other conformations. It is not yet possible to do this computationally. Picking out the native conformation using physics-based atomically detailed models, sampled by molecular dynamics, is presently beyond the reach of computer methods. How can we speed up computational protein-structure prediction? One idea is that proteins start folding at specific parts of a chain that kink up early in the folding process. If we can identify these kinks, we should be able to speed up protein-structure prediction. Previous studies have identified likely kinks through bioinformatic analysis of existing protein structures. The goal of the authors here is to identify these putative folding initiation sites with a physical model instead. In this study, Ho and Dill show that, by chopping a protein chain into peptide pieces, then simulating the pieces in molecular dynamics, they can identify those peptide fragments that have conformational biases. These peptides identify the kinks in the protein chain.
Affiliation(s)
- Bosco K Ho
- Department of Pharmaceutical Chemistry, University of California San Francisco, San Francisco, California, USA.
|
6
|
Herges T, Wenzel W. An all-atom force field for tertiary structure prediction of helical proteins. Biophys J 2004; 87:3100-9. [PMID: 15507688 PMCID: PMC1304781 DOI: 10.1529/biophysj.104.040071] [Received: 01/15/2004] [Accepted: 06/28/2004]
Abstract
We have developed an all-atom free-energy force field (PFF01) for protein tertiary structure prediction. PFF01 is based on physical interactions and was parameterized using experimental structures of a family of proteins believed to span a wide variety of possible folds. It contains empirical, although sequence-independent, terms for hydrogen bonding. Its solvent-accessible surface area solvent model was first fit to transfer energies of small peptides. The parameters of the solvent model were then further optimized to stabilize the native structure of a single protein, the autonomously folding villin headpiece, against competing low-energy decoys. Here we validate the force field for five nonhomologous helical proteins with 20-60 amino acids. For each protein, decoys with 2-3 Å backbone root mean-square deviation and correct experimental Cbeta-Cbeta distance constraints emerge as those with the lowest energy.
Affiliation(s)
- T Herges
- Forschungszentrum Karlsruhe, Institut für Nanotechnologie, Karlsruhe, Germany
|
7
|
Herges T, Schug A, Wenzel W. Protein Structure Prediction with Stochastic Optimization Methods: Folding and Misfolding the Villin Headpiece. COMPUTATIONAL SCIENCE AND ITS APPLICATIONS – ICCSA 2004 2004. [DOI: 10.1007/978-3-540-24767-8_47]
|
8
|
Avbelj F, Luo P, Baldwin RL. Energetics of the interaction between water and the helical peptide group and its role in determining helix propensities. Proc Natl Acad Sci U S A 2000; 97:10786-91. [PMID: 10984522 PMCID: PMC27101 DOI: 10.1073/pnas.200343197]
Abstract
The alanine helix provides a model system for studying the energetics of interaction between water and the helical peptide group, a possible major factor in the energetics of protein folding. Helix formation is enthalpy-driven (-1.0 kcal/mol per residue). Experimental transfer data (vapor phase to aqueous) for amides give the enthalpy of interaction with water of the amide group as approximately -11.5 kcal/mol. The enthalpy of the helical peptide hydrogen bond, computed for the gas phase by quantum mechanics, is -4.9 kcal/mol. These numbers give an enthalpy deficit for helix formation of -7.6 kcal/mol. To study this problem, we calculate the electrostatic solvation free energy (ESF) of the peptide groups in the helical and beta-strand conformations, by using the DelPhi program and the PARSE parameter set. Experimental data show that the ESF values of amides are almost entirely enthalpic. Two key results are: in the beta-strand conformation, the ESF value of an interior alanine peptide group is -7.9 kcal/mol, substantially less than that of N-methylacetamide (-12.2 kcal/mol), and the helical peptide group is solvated with an ESF of -2.5 kcal/mol. These results reduce the enthalpy deficit to -1.5 kcal/mol, and desolvation of peptide groups through partial burial in the random coil may account for the remainder. Mutant peptides in the helical conformation show ESF differences among nonpolar amino acids that are comparable to observed helix propensity differences, but the ESF differences in the random coil conformation still must be subtracted.
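The abstract quotes the two enthalpy deficits without spelling out the bookkeeping. One way to reproduce both numbers, offered here as an assumed reconstruction rather than the authors' stated formula, is to take the deficit as the observed helix enthalpy minus a predicted value (helix hydrogen bond plus helix solvation, minus coil solvation). The symbol names below are introduced only for this illustration.

```latex
\begin{align*}
\Delta H_{\text{deficit}}
  &= \Delta H_{\text{obs}}
   - \bigl[(\Delta H_{\text{HB}} + \Delta H_{\text{solv}}^{\text{helix}})
   - \Delta H_{\text{solv}}^{\text{coil}}\bigr] \\[4pt]
\text{initial estimate:}\quad
  &-1.0 - \bigl[(-4.9 + 0) - (-11.5)\bigr] = -1.0 - 6.6 = -7.6\ \text{kcal/mol} \\[4pt]
\text{with ESF values:}\quad
  &-1.0 - \bigl[(-4.9 - 2.5) - (-7.9)\bigr] = -1.0 - 0.5 = -1.5\ \text{kcal/mol}
\end{align*}
```

In the initial estimate the helical peptide group is treated as unsolvated and the coil is assigned the full amide value of -11.5 kcal/mol; the second line substitutes the computed ESF values of -2.5 kcal/mol (helix) and -7.9 kcal/mol (beta-strand).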
Affiliation(s)
- F Avbelj
- Department of Biochemistry, Beckman Center, Stanford University Medical Center, Stanford, CA 94305-5307, USA
|
9
|
Avbelj F. Amino acid conformational preferences and solvation of polar backbone atoms in peptides and proteins. J Mol Biol 2000; 300:1335-59. [PMID: 10903873 DOI: 10.1006/jmbi.2000.3901]
Abstract
Amino acids in peptides and proteins display distinct preferences for alpha-helical, beta-strand, and other conformational states. Various physicochemical reasons for these preferences have been suggested: conformational entropy, steric factors, hydrophobic effect, and backbone electrostatics; however, the issue remains controversial. It has been proposed recently that the side-chain-dependent solvent screening of the local and non-local backbone electrostatic interactions primarily determines the preferences not only for the alpha-helical but also for all other main-chain conformational states. Side-chains modulate the electrostatic screening of backbone interactions by excluding the solvent from the vicinity of main-chain polar atoms. The deficiency of this electrostatic screening model of amino acid preferences is that the relationships between the main-chain electrostatics and the amino acid preferences have been demonstrated for a limited set of six non-polar amino acid types in proteins only. Here, these relationships are determined for all amino acid types in tripeptides, decapeptides, and proteins. The solvation free energies of polar backbone atoms are approximated by the electrostatic contributions calculated by the finite difference Poisson-Boltzmann and the Langevin dipoles methods. The results show that the average solvation free energy of main-chain polar atoms depends strongly on backbone conformation, shape of side-chains, and exposure to solvent. The equilibrium between the low-energy beta-strand conformation of an amino acid (anti-parallel alignment of backbone dipole moments) and the high-energy alpha conformation (parallel alignment of backbone dipole moments) is strongly influenced by the solvation of backbone polar atoms. The free energy cost of reaching the alpha conformation is approximately 1.5 kcal/mol smaller for residues with short side-chains than it is for the large beta-branched amino acid residues. This free energy difference is comparable to those obtained experimentally by mutation studies and is thus large enough to account for the distinct preferences of amino acid residues. The screening coefficients gamma(local)(r) and gamma(non-local)(r) correlate with the solvation effects for 19 amino acid types, with correlation coefficients between 0.698 and 0.851, depending on the type of calculation and on the set of point atomic charges used. The screening coefficients gamma(local)(r) increase with the level of burial of amino acids in proteins, converging to 1.0 for the completely buried amino acid residues. The backbone solvation free energies of amino acid residues involved in strong hydrogen bonding (for example, in the middle of an alpha-helix) are small. The hydrogen-bonded backbone is thus more hydrophobic than the peptide groups in random coil. The alpha-helix forming preference of alanine is attributed to the relatively small free energy cost of reaching the high-energy alpha-helix conformation. These results confirm that the side-chain-dependent solvent screening of the backbone electrostatic interactions is the dominant factor in determining amino acid conformational preferences.
Affiliation(s)
- F Avbelj
- National Institute of Chemistry, Hajdrihova 19, Ljubljana, Slovenia.
|
10
|
|
11
|
Yi Q, Bystroff C, Rajagopal P, Klevit RE, Baker D. Prediction and structural characterization of an independently folding substructure in the src SH3 domain. J Mol Biol 1998; 283:293-300. [PMID: 9761691 DOI: 10.1006/jmbi.1998.2072]
Abstract
Previous studies of the conformations of peptides spanning the length of the alpha-spectrin SH3 domain suggested that SH3 domains lack independently folding substructures. Using a local structure prediction method based on the I-sites library of sequence-structure motifs, we identified a seven residue peptide in the src SH3 domain predicted to adopt a native-like structure, a type II beta-turn bridging unpaired beta-strands, that was not contained intact in any of the SH3 domain peptides studied earlier. NMR characterization confirmed that the isolated peptide, FKKGERL, adopts a structure similar to that adopted in the native protein: the NOE and 3JNHalpha coupling constant patterns were indicative of a type II beta-turn, and NOEs between the Phe and the Leu side-chains suggest that they are juxtaposed as in the prediction and the native structure. These results support the idea that high-confidence I-sites predictions identify protein segments that are likely to form native-like structures early in folding.
Affiliation(s)
- Q Yi
- Department of Biochemistry, University of Washington, Seattle, WA, 98195, USA
|
12
|
Vorobjev YN, Almagro JC, Hermans J. Discrimination between native and intentionally misfolded conformations of proteins: ES/IS, a new method for calculating conformational free energy that uses both dynamics simulations with an explicit solvent and an implicit solvent continuum model. Proteins 1998. [DOI: 10.1002/(sici)1097-0134(19980901)32:4<399::aid-prot1>3.0.co;2-c]
|
13
|
Compiani M, Fariselli P, Martelli PL, Casadio R. An entropy criterion to detect minimally frustrated intermediates in native proteins. Proc Natl Acad Sci U S A 1998; 95:9290-4. [PMID: 9689073 PMCID: PMC21331 DOI: 10.1073/pnas.95.16.9290]
Abstract
The analysis of the information flow in a feed-forward neural network suggests that the output of the network can be used to compute a structural entropy for the sequence-to-secondary structure mapping. On this basis, we formulate a minimum entropy criterion for the identification of minimally frustrated traits with helical conformation that correspond to initiation sites of protein folding. The entropy of protein segments can be viewed as a nucleation propensity that is useful to characterize putative regions where folding is likely to be initiated with the formation of stretches of alpha-helices under the predominant influence of local interactions. Our procedure is successfully tested in the search for initiation sites of protein folding for which independent experimental and computational evidence exists. Our results lend support to the view that folding is a hierarchical event in which, in harmony with the minimal frustration principle, the final conformation preserves structural modules formed in the early stages of the process.
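The abstract does not state how the structural entropy is computed from the network output. As a minimal sketch, assuming the per-residue outputs can be normalized into class probabilities and that the entropy is the Shannon entropy of those probabilities, a minimum-entropy segment could be located with a sliding window. The function names and the window parameter are hypothetical.

```python
import math

def residue_entropy(probs):
    """Shannon entropy (nats) of one residue's class outputs, after normalization."""
    total = sum(probs)
    return -sum((p / total) * math.log(p / total) for p in probs if p > 0)

def lowest_entropy_segment(outputs, window):
    """Return (start index, mean entropy) of the window with the lowest entropy.

    outputs: one list of class scores per residue (e.g. helix, strand, coil)
    window:  segment length in residues (an assumed parameter)
    """
    ent = [residue_entropy(p) for p in outputs]
    best = min(range(len(ent) - window + 1),
               key=lambda i: sum(ent[i:i + window]))
    return best, sum(ent[best:best + window]) / window
```

A confidently predicted stretch has outputs close to one-hot, so its entropy is near zero, whereas an ambiguous stretch approaches the maximum of ln 3 for three classes.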
Affiliation(s)
- M Compiani
- Dipartimento di Scienze Chimiche, Università di Camerino, Via S. Agostino 1, 62032 Camerino MC, Italy.
|
14
|
Avbelj F, Fele L. Role of main-chain electrostatics, hydrophobic effect and side-chain conformational entropy in determining the secondary structure of proteins. J Mol Biol 1998; 279:665-84. [PMID: 9641985 DOI: 10.1006/jmbi.1998.1792]
Abstract
The physicochemical basis of amino acid preferences for alpha-helical, beta-strand, and other main-chain conformational states in proteins is controversial. The hydrophobic effect, side-chain conformational entropy, steric factors, and main-chain electrostatic interactions have all been advanced as the dominant physical factors that determine these preferences. Many attempts to resolve the controversy have focused on small model systems. The disadvantage of such systems is that the amino acids in small molecules are largely exposed to the solvent. In proteins, however, the amino acids are in contact with the solvent to different degrees, causing a large variability in the strengths of all interactions. Estimates of the mean strengths of interactions in the actual protein environment are therefore essential to resolve the controversy. In this work, experimental protein structures are used to estimate the mean strengths of various interactions in proteins. The free energy contributions of the interactions are incorporated into the Lifson-Roig theory to calculate the helix and strand free energy profiles. From the profiles, the secondary structures of proteins and peptides are predicted using simple rules. The role of the hydrophobic effect, side-chain conformational entropy, and main-chain electrostatic interactions in determining the secondary structure of proteins is assessed from the abilities of different models, describing the stability of secondary structures, to correctly predict alpha-helices, beta-strands, and coil in 130 proteins. The three-state accuracy of the model that contains only the free energy terms due to the main-chain electrostatics, with 40 coefficients, is 68.7%. This accuracy approaches that of the best current secondary structure prediction algorithm based on neural networks (72%); however, many thousands of parameters have to be optimized during the training of the neural networks to reach this level of accuracy. The correlation coefficient between the calculated and the experimental helix contents of 37 alanine-based peptides is 0.91. If the hydrophobic and the side-chain conformational entropy terms are included in the helix-coil transition parameters, the accuracy of the algorithm does not improve significantly. However, if the main-chain electrostatic interactions are excluded from the helix-coil and strand-coil transition parameters, the accuracy of the algorithm reaches only 59.5%. These results support the dominant role of the short-range main-chain electrostatics in determining the secondary structure of proteins and peptides. The role of the hydrophobic effect and the side-chain conformational entropy is small.
Affiliation(s)
- F Avbelj
- National Institute of Chemistry, Ljubljana, Slovenia
|
15
|
Abstract
The interconnected nature of interactions in protein structures appears to be the major hurdle in preventing the construction of accurate comparative models. We present an algorithm that uses graph theory to handle this problem. Each possible conformation of a residue in an amino acid sequence is represented using the notion of a node in a graph. Each node is given a weight based on the degree of the interaction between its side-chain atoms and the local main-chain atoms. Edges are then drawn between pairs of residue conformations/nodes that are consistent with each other (i.e. clash-free and satisfying geometrical constraints). The edges are weighted based on the interactions between the atoms of the two nodes. Once the entire graph is constructed, all the maximal sets of completely connected nodes (cliques) are found using a clique-finding algorithm. The cliques with the best weights represent the optimal combinations of the various main-chain and side-chain possibilities, taking the respective environments into account. The algorithm is used in a comparative modeling scenario to build side-chains, regions of main chain, and mix and match between different homologs in a context-sensitive manner. The predictive power of this method is assessed by applying it to cases where the experimental structure is not known in advance.
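The graph construction described in the abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: it enumerates maximal cliques with a plain Bron-Kerbosch recursion (the abstract does not name the clique-finding algorithm used) and scores each clique by summing node and edge weights. The weights and the convention that a lower total is better are assumptions.

```python
from itertools import combinations

def maximal_cliques(adj):
    """Yield all maximal cliques of an undirected graph (plain Bron-Kerbosch).

    adj: dict mapping each node to the set of nodes it is consistent with
    """
    def expand(r, p, x):
        if not p and not x:
            yield frozenset(r)  # r cannot be extended: it is maximal
            return
        for v in list(p):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}  # v has been fully explored
            x = x | {v}  # exclude v from later cliques at this level
    yield from expand(set(), set(adj), set())

def clique_weight(clique, node_w, edge_w):
    """Sum node weights plus the weights of all edges inside the clique."""
    total = sum(node_w[v] for v in clique)
    total += sum(edge_w[frozenset(pair)] for pair in combinations(clique, 2))
    return total
```

Each node would stand for one candidate conformation of one residue, and edges would connect only mutually consistent conformations of different residues, so a clique never contains two conformations of the same residue. Selecting the clique with the best total weight then picks one compatible combination of conformations.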
Affiliation(s)
- R Samudrala
- Center for Advanced Research in Biotechnology, University of Maryland Biotechnology Institute, Rockville 20850, USA
|
16
|
Avbelj F, Fele L. Prediction of the three-dimensional structure of proteins using the electrostatic screening model and hierarchic condensation. Proteins 1998; 31:74-96. [PMID: 9552160 DOI: 10.1002/(sici)1097-0134(19980401)31:1<74::aid-prot7>3.0.co;2-h]
Abstract
We describe a method for predicting the three-dimensional (3-D) structure of proteins from their sequence alone. The method is based on the electrostatic screening model for the stability of the protein main-chain conformation. The free energy of a protein as a function of its conformation is obtained from the potentials of mean force analysis of high-resolution x-ray protein structures. The free energy function is simple and contains only 44 fitted coefficients. The minimization of the free energy is performed by the torsion space Monte Carlo procedure using the concept of hierarchic condensation. The Monte Carlo minimization procedure is applied to predict the secondary, super-secondary, and native 3-D structures of 12 proteins with 28-110 amino acids. The 3-D structures of the majority of local secondary and super-secondary structures are predicted accurately. This result suggests that control in forming the native-like local structure is distributed along the entire protein sequence. The native 3-D structure is predicted correctly for 3 of 12 proteins composed mainly of alpha-helices. The method fails to predict the native 3-D structure of proteins with a predominantly beta secondary structure. We suggest that hierarchic condensation is not an appropriate procedure for simulating the folding of proteins made up primarily of beta-strands. The method proved accurate in predicting the local secondary and super-secondary structures in the blind ab initio 3-D prediction experiment.
Affiliation(s)
- F Avbelj
- National Institute of Chemistry, Ljubljana, Slovenia.
|
17
|
Chung MS, Neuwald AF, Wilbur WJ. A free energy analysis by unfolding applied to 125-mers on a cubic lattice. FOLDING & DESIGN 1998; 3:51-65. [PMID: 9502320 DOI: 10.1016/s1359-0278(98)00008-x]
Abstract
BACKGROUND: A common approach to the protein folding problem involves computer simulation of folding using lattice models of amino acid sequences. Key factors for good performance in such models are the correct choice of the temperature and the average interaction energy between residues. In order to push the lattice approach to its limit it is important to have a method to adjust these parameters for optimal folding that is not limited by our ability to successfully simulate folding in a reasonable time.
RESULTS: In this study, we adopt a simple cubic-lattice model and present a method for calculating the free energy of a chain as a function of the number of native contacts. This does not require that we are able to fold the sequence by simulation and it provides a method of estimating the folding transition temperature. For a given set of parameters, the free energy analysis also allows an estimate of foldability. By applying the method to sequences with 27 and 125 residues, we show that optimal folding occurs near the folding transition temperature and at either zero or small negative average interaction energy. We find ourselves able to fold only 125-mers that have significant short-range native contacts.
CONCLUSIONS: A free energy analysis during unfolding is a useful tool for the study of foldability and should be applicable to a variety of folding models. In this way we are able to fold some 125-mer designed sequences and our results confirm the finding that short-range contacts contribute to foldability.
Affiliation(s)
- M S Chung
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
|
18
|
Samudrala R, Moult J. An all-atom distance-dependent conditional probability discriminatory function for protein structure prediction. J Mol Biol 1998; 275:895-916. [PMID: 9480776 DOI: 10.1006/jmbi.1997.1479]
Abstract
We present a formalism to compute the probability of an amino acid sequence conformation being native-like, given a set of pairwise atom-atom distances. The formalism is used to derive three discriminatory functions with different types of representations for the atom-atom contacts observed in a database of protein structures. These functions include two virtual atom representations and one all-heavy atom representation. When applied to six different decoy sets containing a range of correct and incorrect conformations of amino acid sequences, the all-atom distance-dependent discriminatory function is able to distinguish correct from incorrect conformations more often than the discriminatory functions using approximate representations. We illustrate the importance of using a detailed atomic description for obtaining the most accurate discrimination, and the necessity for testing discriminatory functions against a wide variety of decoys. The discriminatory function is also shown to be capable of capturing the fine details of atom-atom preferences. These results suggest that the all-atom distance-dependent discriminatory function will be useful for protein structure prediction and model refinement.
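The abstract does not give the scoring formula. A minimal sketch of a distance-dependent conditional probability score, assuming the common log-odds form in which the probability of a distance bin given an atom-type pair is compared with the overall probability of that bin, is shown below. The function names, the binning, and the zero default for unseen contacts are illustrative assumptions.

```python
import math
from collections import Counter

def build_scores(observations):
    """Build log-odds scores from (type_a, type_b, distance_bin) contacts.

    score = -ln[ P(bin | a, b) / P(bin) ], so a contact that is more
    frequent than average for its atom pair receives a negative score.
    """
    pair_bin = Counter()    # counts per (atom pair, distance bin)
    pair_total = Counter()  # counts per atom pair over all bins
    bin_total = Counter()   # counts per distance bin over all pairs
    for a, b, d in observations:
        pair = tuple(sorted((a, b)))  # treat (a, b) and (b, a) alike
        pair_bin[pair, d] += 1
        pair_total[pair] += 1
        bin_total[d] += 1
    n = sum(bin_total.values())
    return {
        key: -math.log((count / pair_total[key[0]]) / (bin_total[key[1]] / n))
        for key, count in pair_bin.items()
    }

def score_conformation(contacts, scores, default=0.0):
    """Sum the scores of all contacts; unseen contacts contribute `default`."""
    return sum(scores.get((tuple(sorted((a, b))), d), default)
               for a, b, d in contacts)
```

Under this convention a lower total score marks a conformation whose contacts resemble those seen in the reference database.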
Affiliation(s)
- R Samudrala
- Center for Advanced Research in Biotechnology, University of Maryland Biotechnology Institute, 9600, Gudelsky Drive, Rockville, MD 20850, USA
|
19
|
Prévost M, Ortmans I. Refolding simulations of an isolated fragment of barnase into a native-like beta hairpin: evidence for compactness and hydrogen bonding as concurrent stabilizing factors. Proteins 1997; 29:212-27. [PMID: 9329086 DOI: 10.1002/(sici)1097-0134(199710)29:2<212::aid-prot9>3.0.co;2-e]
Abstract
Experimental evidence and theoretical models both suggest that protein folding is initiated within specific fragments intermittently adopting conformations close to that found in the protein native structure. These folding initiation sites encompassing short portions of the protein are ideally suited for study in isolation by computational methods aimed at peering into the very early events of folding. We have used Molecular Dynamics (MD) technique to investigate the behavior of an isolated protein fragment formed by residues 85 to 102 of barnase that folds into a beta hairpin in the protein native structure. Three independent MD simulations of 1.3 to 1.8 ns starting from unfolded conformations of the peptide portrayed with an all-atom model in water were carried out at gradually decreasing temperature. A detailed analysis of the conformational preferences adopted by this peptide in the course of the simulations is presented. Two of the unfolded peptides conformations fold into a hairpin characterized by native and a larger bulk of nonnative interactions. Both refolding simulations substantiate the close relationship between interstrand compactness and hydrogen bonding network involving backbone atoms. Persistent compactness witnessed by side-chain interactions always occurs concomitantly with the formation of backbone hydrogen bonds. No highly populated conformations generated in a third simulation starting from the remotest unfolded conformer relative to the native structure are observed. However, nonnative long-range and medium-range contacts with the aromatic moiety of Trp94 are spotted, which are in fair agreement with a former nuclear magnetic resonance study of a denaturing solution of an isolated barnase fragment encompassing the beta hairpin. All this lends reason to believe that the 85-102 barnase fragment is a strong initiation site for folding.
Affiliation(s)
- M Prévost
- Unité de Conformation de Macromolécules Biologiques, Université Libre de Bruxelles, Belgium.
|
20
|
Derreumaux P. Folding a 20 amino acid αβ peptide with the diffusion process-controlled Monte Carlo method. J Chem Phys 1997. [DOI: 10.1063/1.474546]
|
21
|
Demchuk E, Bashford D, Gippert GP, Case DA. Thermodynamics of a reverse turn motif. Solvent effects and side-chain packing. J Mol Biol 1997; 270:305-17. [PMID: 9236131 DOI: 10.1006/jmbi.1997.1103]
Abstract
The linear pentapeptide, Ala-Tyr-cis-Pro-Tyr-Asp-NMA (AYPYD) is known to have a significant population of type VI turn conformers in aqueous solvent. We have carried out theoretical studies of the conformational energetics of this peptide using a potential of mean force (PMF) consisting of the AMBER/OPLS empirical potential energy function, a macroscopic electrostatic model of polar solvation, and a surface area-based model of non-polar solvation. Conformers were taken from molecular dynamics simulations reported elsewhere, or generated by a random search method reported here. The chain entropy of folding was calculated by a systematic search of accessible dihedral angle space. The intra-peptide component was found to strongly favor folding and was nearly cancelled by the polar solvation term which disfavored folding. The non-polar solvation term had little effect. Fluctuations about the average value of the PMF were small and in accord with estimates from a simple harmonic model. When applied to conformers generated by a random search, the PMF selected a conformer close to the NMR-determined structure as the lowest energy conformer. The conformer with the second-lowest energy was extended, but was found to fold rapidly to the turn state in a subsequent molecular dynamics study, and may be an important state on the folding-unfolding pathway. Averages of the PMF were combined with the entropy estimates to provide an estimate of the free energy of folding that is in reasonable agreement with experimental results. In terms of the interplay between backbone electrostatic interactions and the packing of apolar side-chains, this peptide provides a model for the energetics of protein folding, and therefore makes a useful test case for calculations.
Affiliation(s)
- E Demchuk
- Department of Molecular Biology, The Scripps Research Institute, La Jolla, California 92037, USA
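The bookkeeping described in the abstract above (a potential of mean force built from three terms, whose averages are combined with a chain entropy estimate) can be sketched as follows. The symbols are assumptions introduced here for illustration and are not taken from the paper: $W$ is the PMF, $E_{\text{intra}}$ the intra-peptide empirical energy, $G_{\text{pol}}$ and $G_{\text{np}}$ the polar and non-polar solvation terms, and $\Delta S_{\text{chain}}$ the chain entropy of folding.

```latex
% Illustrative notation only; the paper's own symbols may differ.
\begin{align}
  W(x) &= E_{\text{intra}}(x) + G_{\text{pol}}(x) + G_{\text{np}}(x) \\
  \Delta G_{\text{fold}} &\approx
    \langle W \rangle_{\text{turn}} - \langle W \rangle_{\text{unfolded}}
    - T\,\Delta S_{\text{chain}}
\end{align}
```

In this reading, the near cancellation reported in the abstract corresponds to a strongly negative change in $E_{\text{intra}}$ on folding being offset by a positive change in $G_{\text{pol}}$.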
|
22
|
Han KF, Bystroff C, Baker D. Three-dimensional structures and contexts associated with recurrent amino acid sequence patterns. Protein Sci 1997; 6:1587-90. [PMID: 9232660 PMCID: PMC2143736 DOI: 10.1002/pro.5560060723]
Abstract
We have used cluster analysis to identify recurring sequence patterns that transcend protein family boundaries. A subset of these patterns occur predominantly in a single type of local structure in proteins. Here we characterize the three-dimensional structures and contexts in which these sequence patterns occur, with particular attention to the interactions responsible for their structural selectivity.
Affiliation(s)
- K F Han
- Graduate Group in Biophysics, University of California San Francisco School of Medicine 94143-0448, USA
|
23
|
Pedersen JT, Moult J. Protein folding simulations with genetic algorithms and a detailed molecular description. J Mol Biol 1997; 269:240-59. [PMID: 9191068 DOI: 10.1006/jmbi.1997.1010]
Abstract
We have explored the application of genetic algorithms (GA) to the determination of protein structure from sequence, using a full atom representation. A free energy function with point charge electrostatics and an area based solvation model is used. The method is found to be superior to previously investigated Monte Carlo algorithms. For selected fragments, up to 14 residues long, the lowest free energy structures produced by the GA are similar in conformation to the corresponding experimental structures in most cases. There are three main conclusions from these results. First, the genetic algorithm is an effective method for searching amongst the compact conformations of a polypeptide chain. Second, the free energy function is generally able to select native-like conformations. However, some deficiencies are identified, and further development is proposed. Third, the selection of native-like conformations for some protein fragments establishes that in these cases the conformation observed in the full protein structure is largely context independent. The implications for the nature of protein folding pathways are discussed.
Affiliation(s)
- J T Pedersen
- Center for Advanced Research in Biotechnology, University of Maryland Biotechnology Institute, Rockville, MD 20850, USA
|
24
|
Simons KT, Kooperberg C, Huang E, Baker D. Assembly of protein tertiary structures from fragments with similar local sequences using simulated annealing and Bayesian scoring functions. J Mol Biol 1997; 268:209-25. [PMID: 9149153 DOI: 10.1006/jmbi.1997.0959]
Abstract
We explore the ability of a simple simulated annealing procedure to assemble native-like structures from fragments of unrelated protein structures with similar local sequences using Bayesian scoring functions. Environment and residue pair specific contributions to the scoring functions appear as the first two terms in a series expansion for the residue probability distributions in the protein database; the decoupling of the distance and environment dependencies of the distributions resolves the major problems with current database-derived scoring functions noted by Thomas and Dill. The simulated annealing procedure rapidly and frequently generates native-like structures for small helical proteins and better than random structures for small beta sheet containing proteins. Most of the simulated structures have native-like solvent accessibility and secondary structure patterns, and thus ensembles of these structures provide a particularly challenging set of decoys for evaluating scoring functions. We investigate the effects of multiple sequence information and different types of conformational constraints on the overall performance of the method, and the ability of a variety of recently developed scoring functions to recognize the native-like conformations in the ensembles of simulated structures.
Affiliation(s)
- K T Simons
- Department of Biochemistry, University of Washington, Seattle 98195, USA
|
25
|
Abstract
Recently, protein-folding models have advanced to the point where folding simulations of protein-like chains of reasonable length (up to 125 amino acids) are feasible, and the major physical features of folding proteins, such as cooperativity in thermodynamics and nucleation mechanisms in kinetics, can be reproduced. This has allowed deep insight into the physical mechanism of folding, including the solution of the so-called 'Levinthal paradox'.
Affiliation(s)
- E I Shakhnovich
- Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, MA 02138, USA
|
26
|
Pedersen JT, Moult J. Ab initio protein folding simulations with genetic algorithms: Simulations on the complete sequence of small proteins. Proteins 1997. [DOI: 10.1002/(sici)1097-0134(1997)1+<179::aid-prot23>3.0.co;2-k]
|
27
|
Abstract
Ribonuclease A (RNase A), an unusually well defined enzyme, has been a test protein in the study of a wide variety of chemical and physical methods of protein chemistry. These methods have in turn provided many insights into the functional properties of RNase A, as well as topics of general interest in protein biochemistry. The presence of four disulfide bonds and the existence of two cis peptide bonds preceding prolines in the native state have complicated the analysis of the folding pathway of RNase A. In this review, we present some new information about the folding of RNase A obtained recently by quench-flow H/D exchange combined with NMR and single-jump and double-jump stopped-flow techniques.
Affiliation(s)
- J L Neira
- Instituto de Estructura de la Materia, Consejo Superior de Investigaciones Científicas, Madrid, Spain
|
28
|
Dinner AR, Sali A, Karplus M. The folding mechanism of larger model proteins: role of native structure. Proc Natl Acad Sci U S A 1996; 93:8356-61. [PMID: 8710875 PMCID: PMC38675 DOI: 10.1073/pnas.93.16.8356]
Abstract
The folding mechanism of a 125-bead heteropolymer model for proteins is investigated with Monte Carlo simulations on a cubic lattice. Sequences that do and do not fold in a reasonable time are compared. The overall folding behavior is found to be more complex than that of models for smaller proteins. Folding begins with a rapid collapse followed by a slow search through the semi-compact globule for a sequence-dependent stable core with about 30 out of 176 native contacts which serves as the transition state for folding to a near-native structure. Efficient search for the core is dependent on structural features of the native state. Sequences that fold have large amounts of stable, cooperative structure that is accessible through short-range initiation sites, such as those in anti-parallel sheets connected by turns. Before folding is completed, the system can encounter a second bottleneck, involving the condensation and rearrangement of surface residues. Overly stable local structure of the surface residues slows this stage of the folding process. The relation of the results from the 125-mer model studies to the folding of real proteins is discussed.
Affiliation(s)
- A R Dinner
- Committee on Higher Degrees in Biophysics, Department of Chemistry, Harvard University, Cambridge, MA 02138, USA
|
29
|
Abstract
Considerable progress has been made in understanding the relationship between local amino acid sequence and local protein structure. Recent highlights include numerous studies of the structures adopted by short peptides, new approaches to correlating sequence patterns with structure patterns, and folding simulations using simple potentials.
Affiliation(s)
- C Bystroff
- Department of Biochemistry, University of Washington, Seattle 98195, USA.
|
30
|
Abstract
Future research on protein folding must confront two serious dilemmas. (1) It may never be possible to observe at high resolution the very important structures that form in the first few milliseconds of the refolding reaction. (2) The energy functions used to predict structure from sequence will always be approximations of the true energy function. One strategy to resolve both dilemmas is to view protein folding from a different perspective, one that no longer emphasizes time and unique trajectories through conformation space. Instead, free energy replaces time as the reaction coordinate, and ensembles of equilibrium states of partially folded proteins are analyzed in place of trajectories of one protein chain through conformation space, either in vitro or in silico. Initial characterization of the folding of staphylococcal nuclease within this alternative conceptual framework has led to an equilibrium folding pathway with several surprising features. In addition to the finding of two bundles of four hydrophobic segments containing both native and non-native interactions, a gradient in relative stability of different substructures has been identified, with the most stable interactions located toward the amino terminus and the least stable toward the carboxy terminus. Hydrophobic bundles with up-down topology and stability gradients may be two examples of numerous tactics used by proteins to facilitate rapid folding and minimize aggregation. As NMR methods for structural analysis of partially folded proteins are refined, higher resolution descriptions of the structure and dynamics of the polypeptide chain outside the native state may provide many insights into the processes and energetics underlying the self-assembly of folded structure.
Affiliation(s)
- D Shortle
- Department of Biological Chemistry, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205, USA
|
31
|
Abstract
Genetic algorithms are a general class of search methods that mimic natural gene-based optimization mechanisms. Mutation, cross-over and replication operations are performed on strings. When applied to structure prediction, each string describes a particular conformation of a protein molecule. There are many ways in which such search methods may be implemented. Recent results show potential for helping with protein structure prediction, but more data are needed before a complete assessment can be made.
Affiliation(s)
- J T Pedersen
- University of Maryland Biotechnology Institute, Rockville, MD 20850, USA
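The three operations named in the abstract above (mutation, cross-over and replication on strings) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the string is a plain bit list rather than an encoding of a protein conformation, `fitness` is a toy score (the count of 1s) standing in for a free energy function, and the function names and parameters are hypothetical rather than taken from any published implementation.

```python
import random

def fitness(bits):
    # Toy score (assumption): number of 1s. A real application would
    # decode the string into a conformation and evaluate an energy.
    return sum(bits)

def mutate(bits, rate, rng):
    # Flip each position independently with probability `rate`.
    return [b ^ 1 if rng.random() < rate else b for b in bits]

def crossover(a, b, rng):
    # Single-point cross-over: head of `a` joined to tail of `b`.
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def evolve(length=20, pop_size=30, generations=40, rate=0.02, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        # Replication: the fitter half is copied unchanged into the
        # next generation, so the best score never decreases.
        parents = pop[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rate, rng)
            for _ in range(pop_size - len(parents))
        ]
        pop = parents + children
    return max(pop, key=fitness)

if __name__ == "__main__":
    best = evolve()
    print(fitness(best), best)
```

Keeping the top half unchanged is one simple replication scheme among many; tournament or fitness-proportional selection would fit the same skeleton.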
|
32
|
Abstract
BACKGROUND: The beta-hairpin of barnase (residues Ser92-Leu95) has been proposed in theoretical and protein engineering studies to be an initiation site for folding [1]. There is evidence for residual structure in this region from NMR studies of the denatured protein under different denaturing conditions [2,3]. A more detailed analysis is possible by NMR studies of isolated fragments. RESULTS: Protons of fragments B(80-110) and B(69-110) in 6 M urea have non-random chemical shifts. Non-native long-range and medium-range NOE contacts with the aromatic moiety of Trp94 indicate that it is involved in a beta-turn-like or alpha-helix-like conformation. Also, the sidechains of Trp71, Tyr79, Phe82, Tyr90, Tyr97, His102, Tyr103 and Phe106 show non-native hydrophobic contacts. Non-random conformational shifts and sequential NN(i,i+1) NOE contacts are clustered to one of the beta-strands and one of the loop regions. CONCLUSIONS: The hairpin region of barnase adopts beta-turn-like or alpha-helix-like conformations, which are weakly populated even in 6 M urea. The hairpin region is a potential nucleation site in folding that may consolidate on docking with the first alpha-helix. The other residues that have conformational preferences form a beta-strand and one of the loop regions in the native intact protein, but they do not constitute a nucleation site.
Affiliation(s)
- J L Neira
- MRC Unit for Protein Function and Design, Cambridge Centre for Protein Engineering, UK
|
33
|
Abstract
The results of a protein structure prediction contest are reviewed. Twelve different groups entered predictions on 14 proteins of known sequence whose structures had been determined but not yet disseminated to the scientific community. Thus, these represent true tests of the current state of structure prediction methodologies. From this work, it is clear that accurate tertiary structure prediction is not yet possible. However, protein fold and motif prediction are possible when the motif is recognizably similar to another known structure. Internal symmetry and the information inherent in an aligned family of homologous sequences facilitate predictive efforts. Novel folds remain a major challenge for prediction efforts.
Affiliation(s)
- T Defay
- Graduate Group in Biophysics, University of California, San Francisco 94131-0450, USA
|