101
Affiliation(s)
- J G Saven
- Department of Chemistry, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
102
Chen H, Zhou X, Ou-Yang ZC. Secondary-structure-favored hydrophobic-polar lattice model of protein folding. Phys Rev E Stat Nonlin Soft Matter Phys 2001; 64:041905. [PMID: 11690050 DOI: 10.1103/physreve.64.041905] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.2] [Received: 12/12/2000] [Revised: 03/08/2001] [Indexed: 05/23/2023]
Abstract
Protein folding is studied using a two-dimensional lattice model with the Hamiltonian including both hydrophobic interactions and main chain hydrogen bond interactions of amino acids. Since compact conformations have different designabilities and only highly designable conformations can act as native structural candidates [H. Li, R. Helling, C. Tang, and N. Wingreen, Science 273, 666 (1996)], it is shown that hydrophobic interaction alone is insufficient to explain the appearance of a high proportion of regular secondary structures, especially beta sheets whose content decreases with increasing designability, but interactions of main chain hydrogen bonds can account for this. Thus the emergence of only a small number of structure types (folds) among all possible structures can be understood to some extent.
Affiliation(s)
- H Chen
- Center for Advanced Study, Tsinghua University, Beijing 100084, People's Republic of China
103
Fang F, Szleifer I. Kinetics and thermodynamics of protein adsorption: a generalized molecular theoretical approach. Biophys J 2001; 80:2568-89. [PMID: 11371435 PMCID: PMC1301446 DOI: 10.1016/s0006-3495(01)76228-5] [Citation(s) in RCA: 138] [Impact Index Per Article: 6.0] [Indexed: 11/16/2022]
Abstract
The thermodynamics and kinetics of protein adsorption are studied using a molecular theoretical approach. The cases studied include competitive adsorption from mixtures and the effect of conformational changes upon adsorption. The kinetic theory is based on a generalized diffusion equation in which the driving force for motion is the gradient of chemical potentials of the proteins. The time-dependent chemical potentials, as well as the equilibrium behavior of the system, are obtained using a molecular mean-field theory. The theory provides, within the same theoretical formulation, the diffusion and the kinetic (activated) controlled regimes. By separation of ideal and nonideal contributions to the chemical potential, the equation of motion shows a purely diffusive part and the motion of the particles in the potential of mean force resulting from the intermolecular interactions. The theory enables the calculation of the time-dependent surface coverage of proteins, the dynamic surface tension, and the structure of the adsorbed layer in contact with the approaching proteins. For the case of competitive adsorption from a solution containing a mixture of large and small proteins, a variety of different adsorption patterns are observed depending upon the bulk composition, the strength of the interaction between the particles, and the surface and size of the proteins. It is found that the experimentally observed Vroman sequence is predicted in the case that the bulk solution is at a composition with an excess of the small protein, and that the interaction between the large protein and the surface is much larger than that of the smaller protein. The effect of surface conformational changes of the adsorbed proteins in the time-dependent adsorption is studied in detail. The theory predicts regimes of constant density and dynamic surface tension that are long lived but are only intermediates before the final approach to equilibrium. 
The implications of the findings for the interpretation of experimental observations are discussed.
Affiliation(s)
- F Fang
- Department of Chemistry, Purdue University, West Lafayette, Indiana 47907, USA
104
105
Garcia LG, Treptow WL, de Araújo AF. Folding simulations of a three-dimensional protein model with a nonspecific hydrophobic energy function. Phys Rev E 2001; 64:011912. [PMID: 11461293 DOI: 10.1103/physreve.64.011912] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.6] [Received: 01/31/2001] [Indexed: 11/07/2022]
Abstract
We show that a nonspecific hydrophobic energy function can produce protein-like folding behavior of a three-dimensional protein model of 40 monomers on the cubic lattice when the native conformation is chosen judiciously. We confirm that monomer inside/outside segregation is a powerful criterion for the selection of appropriate structures, an idea that was recently proposed on the basis of a general theoretical analysis and simulations of much simpler two-dimensional models.
Affiliation(s)
- L G Garcia
- Departamento de Biologia Celular and International Center of Condensed Matter Physics, Universidade de Brasília, Brasília-DF 70910-900, Brazil
106
Sorenson JM, Head-Gordon T. Matching simulation and experiment: a new simplified model for simulating protein folding. J Comput Biol 2001; 7:469-81. [PMID: 11108474 DOI: 10.1089/106652700750050899] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.0] [Indexed: 11/13/2022]
Abstract
Simulations of simplified protein folding models have provided much insight into solving the protein folding problem. We propose here a new off-lattice bead model, capable of simulating several different fold classes of small proteins. We present the sequence for an alpha/beta protein resembling the IgG-binding proteins L and G. The thermodynamics of the folding process for this model are characterized using the multiple multihistogram method combined with constant-temperature Langevin simulations. The folding is shown to be highly cooperative, with chain collapse nearly accompanying folding. Two parallel folding pathways are shown to exist on the folding free energy landscape. One pathway contains an intermediate, similar to experiments on protein G, and the other contains no intermediates, similar to experiments on protein L. The folding kinetics are characterized by tabulating mean first-passage times, and we show that the onset of glasslike kinetics occurs at much lower temperatures than the folding temperature. This model is expected to be useful in many future contexts: investigating questions of the role of local versus nonlocal interactions in various fold classes, addressing the effect of sequence mutations affecting secondary structure propensities, and providing a computationally feasible model for studying the role of solvation forces in protein folding.
Affiliation(s)
- J M Sorenson
- Department of Chemistry, University of California, Berkeley 94720, USA
107
Sanjeev BS, Patra SM, Vishveshwara S. Sequence design in lattice models by graph theoretical methods. J Chem Phys 2001. [DOI: 10.1063/1.1332809] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.3] [Indexed: 11/14/2022]
108
Pereira de Araújo AF. Sequence rotation in N-dimensional space and the folding of hydrophobic protein models: Surpassing the diagonal unfolded state approximation. J Chem Phys 2001. [DOI: 10.1063/1.1329347] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.3] [Indexed: 11/15/2022]
109
DeGrado WF, Summa CM, Pavone V, Nastri F, Lombardi A. De novo design and structural characterization of proteins and metalloproteins. Annu Rev Biochem 2000; 68:779-819. [PMID: 10872466 DOI: 10.1146/annurev.biochem.68.1.779] [Citation(s) in RCA: 500] [Impact Index Per Article: 20.8] [Indexed: 11/09/2022]
Abstract
De novo protein design has recently emerged as an attractive approach for studying the structure and function of proteins. This approach critically tests our understanding of the principles of protein folding; only in de novo design must one truly confront the issue of how to specify a protein's fold and function. If we truly understand proteins, it should be possible to design receptors, enzymes, and ion channels from scratch. Further, as this understanding evolves and is further refined, it should be possible to design proteins and biomimetic polymers with properties unprecedented in nature.
Affiliation(s)
- W F DeGrado
- Johnson Research Foundation, Pennsylvania, Philadelphia, USA.
110
Wang J, Wang W. Modeling study on the validity of a possibly simplified representation of proteins. Phys Rev E Stat Phys Plasmas Fluids Relat Interdiscip Topics 2000; 61:6981-6986. [PMID: 11088391 DOI: 10.1103/physreve.61.6981] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.6] [Received: 11/05/1999] [Revised: 02/18/2000] [Indexed: 05/23/2023]
Abstract
The folding characteristics of sequences reduced with a possibly simplified representation of five types of residues are shown to be similar to those of the original sequences with the natural set of residues (20 types or 20 letters). The reduced sequences have good foldability and fold to the same native structure as their optimized original counterparts. A large ground state gap for the native structure shows the thermodynamic stability of the reduced sequences. The general validity of such a five-letter reduction is further studied via the correlation between the reduced sequences and the original ones. As a comparison, a reduction with two letters is found not to reproduce the native structure of the original sequences due to its homopolymeric features.
Affiliation(s)
- J Wang
- National Laboratory of Solid State Microstructure and Physics Department, Nanjing University, Nanjing 210093, China
111
Street AG, Datta D, Gordon DB, Mayo SL. Designing protein beta-sheet surfaces by Z-score optimization. Phys Rev Lett 2000; 84:5010-5013. [PMID: 10990854 DOI: 10.1103/physrevlett.84.5010] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.8] [Received: 09/27/1999] [Indexed: 05/23/2023]
Abstract
Studies of lattice models of proteins have suggested that the appropriate energy expression for protein design may include nonthermodynamic terms to accommodate negative design concerns. One method, developed in lattice model studies, maximizes a quantity known as the "Z-score," which compares the lowest energy sequence whose ground state structure is the target structure to an ensemble of random sequences. Here we show that, in certain circumstances, the technique can be applied to real proteins. The resulting energy expression is used to design the beta-sheet surfaces of two real proteins. We find experimentally that the designed proteins are stable and well folded, and that one of them is even more thermostable than the wild type.
Affiliation(s)
- A G Street
- Division of Physics, Mathematics and Astronomy, California Institute of Technology, MC 147-75, Pasadena, California 91125, USA
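The Z-score idea in the abstract above can be illustrated with a minimal sketch. It assumes the common definition of a Z-score as the gap between a candidate sequence's energy and the mean energy of a random-sequence ensemble, measured in units of the ensemble's standard deviation. The function name `z_score`, the use of the population standard deviation, and the toy numbers are illustrative assumptions and are not taken from the paper.

```python
import statistics
from typing import Sequence


def z_score(e_candidate: float, random_energies: Sequence[float]) -> float:
    """Energy of a candidate sequence relative to a random-sequence ensemble.

    Assumed definition (not from the paper): (E - mean) / std, where mean and
    std are taken over the energies of random sequences placed on the same
    target structure. More negative values indicate a more favorable design.
    """
    mean = statistics.mean(random_energies)
    std = statistics.pstdev(random_energies)  # population standard deviation
    if std == 0:
        raise ValueError("random ensemble has zero energy variance")
    return (e_candidate - mean) / std


# Toy numbers: a candidate at -4.0 against an ensemble centered on 0.0
example = z_score(-4.0, [-2.0, 2.0])
```

A design procedure in this spirit would search sequence space for the candidate with the most negative score, which is why the quantity is described as something to be optimized.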
112
Abstract
BACKGROUND: A large energy gap between the native state and the non-native folded states is required for folding into a unique three-dimensional structure. The features that define this energy gap are not well understood, but can be addressed using de novo protein design. Previously, alpha(2)D, a dimeric four-helix bundle, was designed and shown to adopt a native-like conformation. The high-resolution solution structure revealed that this protein adopted a bisecting U motif. Glu7, a solvent-exposed residue that adopts many conformations in solution, might be involved in defining the unique three-dimensional structure of alpha(2)D. RESULTS: A variety of hydrophobic and polar residues were substituted for Glu7 and the dynamic and thermodynamic properties of the resulting proteins were characterized by analytical ultracentrifugation, circular dichroism spectroscopy, and nuclear magnetic resonance spectroscopy. The majority of substitutions at this solvent-exposed position had little effect on the ability to fold into a dimeric four-helix bundle. The ability to adopt a unique conformation, however, was profoundly modulated by the residue at this position despite the similar free energies of folding of each variant. CONCLUSIONS: Although Glu7 is not involved directly in stabilizing the native state of alpha(2)D, it is involved indirectly in specifying the observed fold by modulating the energy gap between the native state and the non-native folded states. These results provide experimental support for hypothetical models arising from lattice simulations of protein folding, and underscore the importance of polar interfacial residues in defining the native conformations of proteins.
113
Zou J, Saven JG. Statistical theory of combinatorial libraries of folding proteins: energetic discrimination of a target structure. J Mol Biol 2000; 296:281-94. [PMID: 10656832 DOI: 10.1006/jmbi.1999.3426] [Citation(s) in RCA: 67] [Impact Index Per Article: 2.8] [Indexed: 11/22/2022]
Abstract
A self-consistent theory is presented that can be used to estimate the number and composition of sequences satisfying a predetermined set of constraints. The theory is formulated so as to examine the features of sequences having a particular value of $\Delta = E_f - \langle E \rangle_u$, where $E_f$ is the energy of sequences when in a target structure and $\langle E \rangle_u$ is an average energy of non-target structures. The theory yields the probabilities $w_i(\alpha)$ that each position $i$ in the sequence is occupied by a particular monomer type $\alpha$. The theory is applied to a simple lattice model of proteins. Excellent agreement is observed between the theory and the results of exact enumerations. The theory provides a quantitative framework for the design and interpretation of combinatorial experiments involving proteins, where a library of amino acid sequences is searched for sequences that fold to a desired structure.
Affiliation(s)
- J Zou
- Department of Chemistry, University of Pennsylvania, Philadelphia, PA 19104-6323, USA
114
Abstract
We have developed a fully automated protein design strategy that works on the entire sequence of the protein and uses a full atom representation. At each step of the procedure, an all-atom model of the protein is built using the template protein structure and the current designed sequence. The energy of the model is used to drive a Monte Carlo optimization in sequence space: random moves are either accepted or rejected based on the Metropolis criterion. We rely on the physical forces that stabilize native protein structures to choose the optimum sequence. Our energy function includes van der Waals interactions, electrostatics and an environment free energy. Successful protein design should be specific and generate a sequence compatible with the template fold and incompatible with competing folds. We impose specificity by maintaining the amino acid composition constant, based on the random energy model. The specificity of the optimized sequence is tested by fold recognition techniques. Successful sequence designs for the B1 domain of protein G, for the lambda repressor and for sperm whale myoglobin are presented. We show that each additional term of the energy function improves the performance of our design procedure: the van der Waals term ensures correct packing, the electrostatics term increases the specificity for the correct native fold, and the environment solvation term ensures a correct pattern of buried hydrophobic and exposed hydrophilic residues. For the globin family, we show that we can design a protein sequence that is stable in the myoglobin fold, yet incompatible with the very similar hemoglobin fold.
Affiliation(s)
- P Koehl
- Department of Structural Biology, Fairchild Building, Stanford University, Stanford, CA 94305, USA.
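The sequence-space Monte Carlo search described in the abstract above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' all-atom procedure: the energy function is a hypothetical burial score, the move set swaps two positions so that the amino acid composition stays constant, and the names `metropolis_design`, `toy_energy`, and `BURIAL` are invented for the example.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

# Hypothetical per-position burial weights for a 6-residue toy template
BURIAL = [3, 0, 2, 0, 1, 0]


def toy_energy(seq: Sequence[str]) -> float:
    """Toy energy (an assumption): hydrophobic 'H' at buried sites is favorable."""
    return -float(sum(b for s, b in zip(seq, BURIAL) if s == "H"))


def metropolis_design(
    seq: Sequence[str],
    energy: Callable[[Sequence[str]], float],
    n_steps: int,
    temperature: float,
    rng: random.Random,
) -> Tuple[List[str], float]:
    """Metropolis search in sequence space using composition-preserving swaps."""
    current = list(seq)
    e_cur = energy(current)
    best, e_best = list(current), e_cur
    for _ in range(n_steps):
        i, j = rng.sample(range(len(current)), 2)
        if current[i] == current[j]:
            continue  # swapping identical letters changes nothing
        current[i], current[j] = current[j], current[i]
        delta = energy(current) - e_cur
        # Metropolis criterion: always accept downhill, sometimes accept uphill
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            e_cur += delta
            if e_cur < e_best:
                best, e_best = list(current), e_cur
        else:
            current[i], current[j] = current[j], current[i]  # reject: undo swap
    return best, e_best
```

Using swaps rather than point mutations is one simple way to hold the composition fixed, which the abstract names as its route to specificity; other move sets would need an explicit composition constraint.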
115
Pereira De Araújo AF. Folding protein models with a simple hydrophobic energy function: the fundamental importance of monomer inside/outside segregation. Proc Natl Acad Sci U S A 1999; 96:12482-7. [PMID: 10535948 PMCID: PMC22956 DOI: 10.1073/pnas.96.22.12482] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.0] [Indexed: 11/18/2022]
Abstract
The present study explores a "hydrophobic" energy function for folding simulations of the protein lattice model. The contribution of each monomer to conformational energy is the product of its "hydrophobicity" and the number of contacts it makes, i.e., $E(\mathbf{h},\mathbf{c}) = -\sum_{i=1}^{N} c_i h_i = -(\mathbf{h}\cdot\mathbf{c})$ is the negative scalar product between two vectors in N-dimensional Cartesian space: $\mathbf{h} = (h_1, \ldots, h_N)$, which represents monomer hydrophobicities and is sequence-dependent; and $\mathbf{c} = (c_1, \ldots, c_N)$, which represents the number of contacts made by each monomer and is conformation-dependent. A simple theoretical analysis shows that restrictions are imposed concomitantly on both sequences and native structures if the stability criterion for protein-like behavior is to be satisfied. Given a conformation with vector $\mathbf{c}$, the best sequence is a vector $\mathbf{h}$ on the direction upon which the projection of $\mathbf{c} - \bar{\mathbf{c}}$ is maximal, where $\bar{\mathbf{c}}$ is the diagonal vector with components equal to $\bar{c}$, the average number of contacts per monomer in the unfolded state. Best native conformations are suggested to be not maximally compact, as assumed in many studies, but the ones with largest variance of contacts among their monomers, i.e., with monomers tending to occupy completely buried or completely exposed positions. This inside/outside segregation is reflected in an apolar/polar distribution on the corresponding sequence. Monte Carlo simulations in two dimensions corroborate this general scheme. Sequences targeted to conformations with large contact variances folded cooperatively with thermodynamics of a two-state transition. Sequences targeted to maximally compact conformations, which have lower contact variance, were either found to have degenerate ground state or to fold with much lower cooperativity.
Affiliation(s)
- A F Pereira De Araújo
- Departamento de Biologia Celular, International Center of Condensed Matter Physics, Universidade de Brasília, Brasília-DF 70910-900, Brazil.
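The energy function in the abstract above, the negative scalar product of a hydrophobicity vector and a contact-count vector, can be sketched for a two-dimensional square lattice as follows. The helper names `contact_counts` and `hydrophobic_energy` are illustrative, and the convention that only non-bonded nearest-neighbor pairs count as contacts is an assumption of this sketch rather than something the abstract specifies.

```python
from typing import List, Sequence, Tuple

Coord = Tuple[int, int]


def contact_counts(conf: Sequence[Coord]) -> List[int]:
    """Contacts per monomer for a self-avoiding chain on the square lattice.

    Assumption: a contact is a lattice nearest-neighbor pair that is not
    bonded along the chain.
    """
    index = {site: i for i, site in enumerate(conf)}
    if len(index) != len(conf):
        raise ValueError("conformation is not self-avoiding")
    counts = [0] * len(conf)
    for i, (x, y) in enumerate(conf):
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            j = index.get((x + dx, y + dy))
            # Skip empty sites and chain neighbors (bonded pairs)
            if j is not None and abs(i - j) > 1:
                counts[i] += 1
    return counts


def hydrophobic_energy(h: Sequence[float], conf: Sequence[Coord]) -> float:
    """E(h, c) = -(h . c): negative scalar product of hydrophobicities and contacts."""
    c = contact_counts(conf)
    if len(h) != len(c):
        raise ValueError("sequence and conformation lengths differ")
    return -sum(hi * ci for hi, ci in zip(h, c))


# Four monomers folded around a unit square: the two chain ends touch
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
```

For the `square` conformation the contact vector is `[1, 0, 0, 1]`, so a sequence with hydrophobic ends, `h = [1.0, 0.0, 0.0, 1.0]`, has energy -2.0 under these conventions.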
116
Tatsumi R, Chikenji G. Origin of the designability of protein structures. Phys Rev E Stat Phys Plasmas Fluids Relat Interdiscip Topics 1999; 60:4696-700. [PMID: 11970334 DOI: 10.1103/physreve.60.4696] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.2] [Received: 12/07/1998] [Revised: 05/03/1999] [Indexed: 11/07/2022]
Abstract
We examined what determines the designability of two-letter-code (H and P) lattice proteins from three points of view: first, whether the native structure is searched for among all possible structures or only among maximally compact structures; second, whether the lattice used is bipartite or not; and third, the effect of the chain length, namely, the number of monomers in the chain. We found that the bipartiteness of the lattice structure is not a main factor that determines the designability. Our results suggest that highly designable structures will be found when the chain is long enough for the hydrophobic core to consist of a sufficiently large number of monomers.
Affiliation(s)
- R Tatsumi
- Department of Physics, Graduate School of Science, Osaka University, Machikaneyama-cho 1-1, Toyonaka, Osaka 560-0043, Japan
117
Sorenson JM, Hura G, Soper AK, Pertsemlidis A, Head-Gordon T. Determining the Role of Hydration Forces in Protein Folding. J Phys Chem B 1999. [DOI: 10.1021/jp990434k] [Citation(s) in RCA: 83] [Impact Index Per Article: 3.3] [Indexed: 11/29/2022]
Affiliation(s)
- Jon M. Sorenson
- Department of Chemistry, University of California, Berkeley, Berkeley, California 94720
- Greg Hura
- Graduate Group in Biophysics, University of California, Berkeley and Life Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720
- Alan K. Soper
- ISIS Facility, Rutherford Appleton Laboratory, Chilton, Didcot, Oxon OX11 0QX, U.K.
- Alexander Pertsemlidis
- Department of Biochemistry, University of Texas Southwestern Medical Center, Dallas, Texas 75235-9038
- Teresa Head-Gordon
- Life Sciences Division & Physical Biosciences Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720
118
Micheletti C, Maritan A, Banavar JR. A comparative study of existing and new design techniques for protein models. J Chem Phys 1999. [DOI: 10.1063/1.478938] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.6] [Indexed: 11/14/2022]
119
Sorenson JM, Head-Gordon T. The importance of hydration for the kinetics and thermodynamics of protein folding: simplified lattice models. Fold Des 1999; 3:523-34. [PMID: 9889163 DOI: 10.1016/s1359-0278(98)00068-6] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.0] [Indexed: 10/25/2022]
Abstract
BACKGROUND Recent studies have proposed various sources for the origin of cooperativity in simplified protein folding models. Important contributions to cooperativity that have been discussed include backbone hydrogen bonding, sidechain packing and hydrophobic interactions. Related work has also focused on which interactions are responsible for making the free energy of the native structure a pronounced global minimum in the free energy landscape. In addition, two-flavor bead models have been found to exhibit poor folding cooperativity and often lack unique native structures. We propose a simple multibody description of hydration with expectations that it might modify the free energy surface in such a way as to increase the cooperativity of folding and improve the performance of two-flavor models. RESULTS We study the thermodynamics and kinetics of folding for designed 36-mer sequences on a cubic lattice using both our solvation model and the corresponding model without solvation terms. Degeneracies of the native states are studied by enumerating the maximally compact states. The histogram Monte Carlo method is used to obtain folding temperatures, densities of states and heat capacity curves. Folding kinetics are examined by accumulating mean first-passage times versus temperature. Sequences in the proposed solvation model are found to have more unique ground states, fold faster and fold with more cooperativity than sequences in the nonsolvation model. CONCLUSIONS We find that the addition of a multibody description of solvation can improve the poor performance of two-flavor lattice models and provide an additional source for more cooperative folding. Our results suggest that a better description of solvation will be important for future theoretical protein folding studies.
Affiliation(s)
- J M Sorenson
- Department of Chemistry, University of California, Berkeley 94720, USA
120
121
122
123
124
Shakhnovich EI. Protein design: a perspective from simple tractable models. Fold Des 1998; 3:R45-58. [PMID: 9562552 DOI: 10.1016/s1359-0278(98)00021-2] [Citation(s) in RCA: 141] [Impact Index Per Article: 5.4] [Indexed: 02/07/2023]
Abstract
Recent progress in computational approaches to protein design builds on advances in statistical mechanical protein folding theory. Here, the number of sequences folding into a given conformation is evaluated and a simple condition for a protein model's designability is outlined.
Affiliation(s)
- EI Shakhnovich
- Harvard University Department of Chemistry and Chemical Biology 12 Oxford Street, Cambridge, MA 02138, USA
125
Abstract
A structure-based, sequence-design procedure is proposed in which one considers a set of decoy structures that compete significantly with the target structure in being low energy conformations. The decoy structures are chosen to have strong overlaps in contacts with the putative native state. The procedure allows the design of sequences with large and small stability gaps in a random-bond heteropolymer model in both two and three dimensions by an appropriate assignment of the contact energies to both the native and nonnative contacts. The design procedure is also successfully applied to the two-dimensional HP model.
Affiliation(s)
- J R Banavar
- Department of Physics and Center for Materials Physics, Pennsylvania State University, University Park, USA
126
Irbäck A, Sandelin E. Local interactions and protein folding: A model study on the square and triangular lattices. J Chem Phys 1998. [DOI: 10.1063/1.475605] [Citation(s) in RCA: 22] [Impact Index Per Article: 0.8] [Indexed: 11/14/2022]
127
Abstract
The energy landscape theory of protein folding is a statistical description of a protein's potential surface. It assumes that folding occurs through organizing an ensemble of structures rather than through only a few uniquely defined structural intermediates. It suggests that the most realistic model of a protein is a minimally frustrated heteropolymer with a rugged funnel-like landscape biased toward the native structure. This statistical description has been developed using tools from the statistical mechanics of disordered systems, polymers, and phase transitions of finite systems. We review here its analytical background and contrast the phenomena in homopolymers, random heteropolymers, and protein-like heteropolymers that are kinetically and thermodynamically capable of folding. The connection between these statistical concepts and the results of minimalist models used in computer simulations is discussed. The review concludes with a brief discussion of how the theory helps in the interpretation of results from fast folding experiments and in the practical task of protein structure prediction.
Affiliation(s)
- J N Onuchic
- Department of Physics, University of California at San Diego, La Jolla 92093-0319, USA
128
Pande VS, Grosberg AY, Tanaka T. Statistical mechanics of simple models of protein folding and design. Biophys J 1997; 73:3192-210. [PMID: 9414231 PMCID: PMC1181222 DOI: 10.1016/s0006-3495(97)78345-0] [Citation(s) in RCA: 95] [Impact Index Per Article: 3.5] [Indexed: 02/05/2023]
Abstract
It is now believed that the primary equilibrium aspects of simple models of protein folding are understood theoretically. However, current theories often resort to rather heavy mathematics to overcome some technical difficulties inherent in the problem or start from a phenomenological model. To this end, we take a new approach in this pedagogical review of the statistical mechanics of protein folding. The benefit of our approach is a drastic mathematical simplification of the theory, without resort to any new approximations or phenomenological prescriptions. Indeed, the results we obtain agree precisely with previous calculations. Because of this simplification, we are able to present here a thorough and self-contained treatment of the problem. Topics discussed include the statistical mechanics of the random energy model (REM), tests of the validity of REM as a model for heteropolymer freezing, freezing transition of random sequences, phase diagram of designed ("minimally frustrated") sequences, and the degree to which errors in the interactions employed in simulations of either folding or design can still lead to correct folding behavior.
Affiliation(s)
- V S Pande
- Physics Department, University of California, Berkeley 94720-7300, USA
129
Abstract
The sequence-to-structure map for all uniquely folding sequences of short hydrophobic polar (HP) model proteins on a square lattice is analyzed to investigate aspects considered relevant to evolution. By ranking structures by their frequencies, few very frequent and many rare structures are found. The distribution can be empirically described by a generalized Zipf's law. All structures are relatively compact, yet the most compact ones are rare. Most sequences folding to the same structure belong to "neutral nets." These graphs in sequence space are connected by point mutations and centered around prototype sequences, which tolerate the largest number (up to 55%) of neutral mutations. Profiles have been derived from these homologous sequences. Frequent structures conserve hydrophobic cores only while rare ones are sensitive to surface mutations as well. Shape space covering, i.e., the ability to transform any structure into most others with few point mutations, is very unlikely. It is concluded that many characteristic features of the sequence-to-structure map of real proteins, such as the dominance of few folds, can be explained by the simple HP model. In analogy to protein families, nets are dense and well separated in sequence space. Potential implications in better understanding the evolution of proteins and applications to improving database searches are discussed.
Affiliation(s)
- E Bornberg-Bauer
- Abteilung Theoretische Bioinformatik, Deutsches Krebsforschungszentrum, Heidelberg, Germany.
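The kind of exhaustive enumeration behind the structure-frequency ranking in the abstract above can be sketched for very short chains. This is a simplified illustration, not the paper's procedure: it assumes the standard HP energy of -1 per non-bonded H-H contact, and it identifies a "structure" with its contact map so that rotations and reflections of a conformation are not counted separately. The function names `contact_maps` and `designability` are invented for the example.

```python
from collections import Counter
from itertools import product
from typing import FrozenSet, List, Set, Tuple

ContactMap = FrozenSet[Tuple[int, int]]


def contact_maps(n: int) -> Set[ContactMap]:
    """Distinct contact maps of all n-monomer self-avoiding walks (square lattice)."""
    maps: Set[ContactMap] = set()

    def extend(path: List[Tuple[int, int]], occupied: Set[Tuple[int, int]]) -> None:
        if len(path) == n:
            index = {site: i for i, site in enumerate(path)}
            contacts = set()
            for (x, y), i in index.items():
                for dx, dy in ((1, 0), (0, 1)):
                    j = index.get((x + dx, y + dy))
                    if j is not None and abs(i - j) > 1:  # non-bonded neighbors
                        contacts.add((min(i, j), max(i, j)))
            maps.add(frozenset(contacts))
            return
        x, y = path[-1]
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (x + dx, y + dy)
            if nxt not in occupied:
                occupied.add(nxt)
                path.append(nxt)
                extend(path, occupied)
                path.pop()
                occupied.remove(nxt)

    extend([(0, 0)], {(0, 0)})
    return maps


def designability(n: int) -> Counter:
    """Count, per contact map, the HP sequences having it as unique ground state."""
    maps = contact_maps(n)
    counts: Counter = Counter()
    for seq in product("HP", repeat=n):
        energies = {
            m: -sum(1 for i, j in m if seq[i] == "H" and seq[j] == "H") for m in maps
        }
        e_min = min(energies.values())
        ground = [m for m, e in energies.items() if e == e_min]
        if len(ground) == 1:  # uniquely folding sequence
            counts[ground[0]] += 1
    return counts
```

Sorting the resulting counts in descending order gives a rank-frequency list of the kind the abstract compares with a generalized Zipf's law. The enumeration grows exponentially with chain length, so this sketch is practical only for short chains.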
130
131
Affiliation(s)
- H W Hellinga
- Department of Biochemistry, Duke University Medical Center, Durham, NC 27710, USA.
132
Khimasia MM, Coveney PV. Protein Structure Prediction as a Hard Optimization Problem: The Genetic Algorithm Approach. Mol Simul 1997. [DOI: 10.1080/08927029708024151] [Citation(s) in RCA: 20] [Impact Index Per Article: 0.7] [Indexed: 10/23/2022]
133
Irbäck A, Peterson C, Potthast F, Sommelius O. Local interactions and protein folding: A three-dimensional off-lattice approach. J Chem Phys 1997. [DOI: 10.1063/1.474357] [Citation(s) in RCA: 91] [Impact Index Per Article: 3.4] [Indexed: 11/14/2022]
134
|
Abstract
If protein structure prediction methods are to make any impact on the impending onerous task of analyzing the large numbers of unknown protein sequences generated by the ongoing genome-sequencing projects, it is vital that they make the difficult transition from computational 'gedankenexperiments' to practical software tools. This has already happened in the field of comparative modelling and is currently happening in the threading field. Unfortunately, there is little evidence of this transition happening in the field of ab initio tertiary-structure prediction.
Affiliation(s)
- D T Jones
- Department of Biological Sciences, University of Warwick, Coventry, UK.
|
135
|
|
136
|
Abstract
Experiment and theory are converging on the importance of nucleation mechanisms in protein folding. These mechanisms do not use classic nuclei, which are well formed elements of structure present in ground states, but they use diffuse, extended regions, which are observed in transition states.
Affiliation(s)
- A R Fersht
- Cambridge University Chemical Laboratory, Lensfield Road, Cambridge CB2 1EW, UK.
|
137
|
Abstract
Recently, protein-folding models have advanced to the point where folding simulations of protein-like chains of reasonable length (up to 125 amino acids) are feasible, and the major physical features of folding proteins, such as cooperativity in thermodynamics and nucleation mechanisms in kinetics, can be reproduced. This has allowed deep insight into the physical mechanism of folding, including the solution of the so-called 'Levinthal paradox'.
Affiliation(s)
- E I Shakhnovich
- Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, MA 02138, USA
|
138
|
|
139
|
Mirny LA, Shakhnovich EI. How to derive a protein folding potential? A new approach to an old problem. J Mol Biol 1996; 264:1164-79. [PMID: 9000638 DOI: 10.1006/jmbi.1996.0704] [Citation(s) in RCA: 230] [Impact Index Per Article: 8.2] [Indexed: 02/03/2023]
Abstract
In this paper we introduce a novel method of deriving a pairwise potential for protein folding. The potential is obtained by an optimization procedure that simultaneously maximizes thermodynamic stability for all proteins in the database. When applied to the representative dataset of proteins and with the energy function taken in pairwise contact approximation, our potential scored somewhat better than existing ones. However, the discrimination of the native structure from decoys is still not strong enough to make the potential useful for ab initio folding. Our results suggest that the problem lies with pairwise amino acid contact approximation and/or simplified presentation of proteins rather than with the derivation of potential. We argue that more detail of protein structure and energetics should be taken into account to achieve energy gaps. The suggested method is general enough to allow us to systematically derive parameters for more sophisticated energy functions. The internal control of validity for the potential derived by our method is convergence to a unique solution upon addition of new proteins to the database. The method is tested on simple model systems where sequences are designed, using the preset "true" potential, to have low energy in a dataset of structures. Our procedure is able to recover the potential with correlation r approximately 91% with the true one and we were able to fold all model structures using the recovered potential. Other statistical knowledge-based approaches were tested using this model and the results indicate that they also can recover the true potential with high degree of accuracy.
Affiliation(s)
- L A Mirny
- Harvard University, Department of Chemistry, Cambridge, MA 02138, USA
|
140
|
Beutler TC, Dill KA. A fast conformational search strategy for finding low energy structures of model proteins. Protein Sci 1996; 5:2037-43. [PMID: 8897604 PMCID: PMC2143263 DOI: 10.1002/pro.5560051010] [Citation(s) in RCA: 50] [Impact Index Per Article: 1.8] [Indexed: 02/02/2023]
Abstract
We describe a new computer algorithm for finding low-energy conformations of proteins. It is a chain-growth method that uses a heuristic bias function to help assemble a hydrophobic core. We call it the Core-directed chain Growth method (CG). We test the CG method on several well-known literature examples of HP lattice model proteins [in which proteins are modeled as sequences of hydrophobic (H) and polar (P) monomers], ranging from 20-64 monomers in two dimensions, and up to 88-mers in three dimensions. Previous nonexhaustive methods--Monte Carlo, a Genetic Algorithm, Hydrophobic Zippers, and Contact Interactions--have been tried on these same model sequences. CG is substantially better at finding the global optima, and avoiding local optima, and it does so in comparable or shorter times. CG finds the global minimum energy of the longest HP lattice model chain for which the global optimum is known, a 3D 88-mer that has only been reachable before by the CHCC complete search method. CG has the potential advantage that it should have nonexponential scaling with chain length. We believe this is a promising method for conformational searching in protein folding algorithms.
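The general idea of biased chain growth on a 2D HP lattice can be sketched as follows. This is not the CG heuristic of the paper, whose bias function is not given in the abstract. As a stand-in, the sketch weights each candidate site by a Boltzmann-like factor in the number of new H-H contacts; the names `grow_chain`, `best_of`, `beta` and `attempts` are hypothetical.

```python
import math
import random

MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def new_contacts(seq, coords, site):
    """Count H-H contacts created by placing the next residue at `site`."""
    i = len(coords)  # index of the residue being placed
    if seq[i] != "H":
        return 0
    index = {p: k for k, p in enumerate(coords)}
    x, y = site
    count = 0
    for dx, dy in MOVES:
        k = index.get((x + dx, y + dy))
        # Skip the chain neighbour i - 1: bonded pairs are not contacts.
        if k is not None and k != i - 1 and seq[k] == "H":
            count += 1
    return count


def grow_chain(seq, beta=2.0, rng=random):
    """Grow one self-avoiding conformation; return (coords, energy) or None."""
    coords = [(0, 0)]
    energy = 0
    for _ in range(1, len(seq)):
        x, y = coords[-1]
        free = [(x + dx, y + dy) for dx, dy in MOVES
                if (x + dx, y + dy) not in coords]
        if not free:
            return None  # dead end: the chain trapped itself
        gains = [new_contacts(seq, coords, s) for s in free]
        # Assumed bias: favour sites that create more H-H contacts.
        weights = [math.exp(beta * g) for g in gains]
        pick = rng.choices(range(len(free)), weights=weights)[0]
        coords.append(free[pick])
        energy -= gains[pick]  # each H-H contact contributes -1
    return coords, energy


def best_of(seq, attempts=200, seed=0):
    """Repeat the growth and keep the lowest-energy chain found."""
    rng = random.Random(seed)
    best = None
    for _ in range(attempts):
        result = grow_chain(seq, rng=rng)
        if result is not None and (best is None or result[1] < best[1]):
            best = result
    return best
```

Calling `best_of("HPPH")` returns a list of lattice coordinates and the energy of the best chain found. A greedy bias of this kind gives no guarantee of reaching the global optimum on longer sequences.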
Affiliation(s)
- T C Beutler
- Department of Pharmaceutical Chemistry, University of California, San Francisco 94143-1204, USA
|
141
|
|
142
|
Lindgård PA, Bohr H. Magic Numbers in Protein Structures. Physical Review Letters 1996; 77:779-782. [PMID: 10062900 DOI: 10.1103/physrevlett.77.779] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.3] [Indexed: 05/23/2023]
|
143
|
Pande VS, Grosberg AY, Joerg C, Tanaka T. Is heteropolymer freezing well described by the random energy model? Physical Review Letters 1996; 76:3987-3990. [PMID: 10061163 DOI: 10.1103/physrevlett.76.3987] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.0] [Indexed: 05/23/2023]
|
144
|
Hao MH, Scheraga HA. How optimization of potential functions affects protein folding. Proc Natl Acad Sci U S A 1996; 93:4984-9. [PMID: 8643516 PMCID: PMC39392 DOI: 10.1073/pnas.93.10.4984] [Citation(s) in RCA: 76] [Impact Index Per Article: 2.7] [Indexed: 02/01/2023] Open
Abstract
The relationship between the optimization of the potential function and the foldability of theoretical protein models is studied based on investigations of a 27-mer cubic-lattice protein model and a more realistic lattice model for the protein crambin. In both the simple and the more complicated systems, optimization of the energy parameters achieves significant improvements in the statistical-mechanical characteristics of the systems and leads to foldable protein models in simulation experiments. The foldability of the protein models is characterized by their statistical-mechanical properties--e.g., by the density of states and by Monte Carlo folding simulations of the models. With optimized energy parameters, a high level of consistency exists among different interactions in the native structures of the protein models, as revealed by a correlation function between the optimized energy parameters and the native structure of the model proteins. The results of this work are relevant to the design of a general potential function for folding proteins by theoretical simulations.
Affiliation(s)
- M H Hao
- Baker Laboratory of Chemistry, Cornell University, Ithaca, NY 14853-1301, USA
|
145
|
Abstract
Proteins fold to unique compact native structures. Perhaps other polymers could be designed to fold in similar ways. The chemical nature of the monomer "alphabet" determines the "energy matrix" of monomer interactions, which defines the folding code, the relationship between sequence and structure. We study two properties of energy matrices using two-dimensional lattice models: uniqueness, the number of sequences that fold to only one structure, and encodability, the number of folds that are unique lowest-energy structures of certain monomer sequences. For the simplest model folding code, involving binary sequences of H (hydrophobic) and P (polar) monomers, only a small fraction of sequences fold uniquely, and not all structures can be encoded. Adding strong repulsive interactions results in a folding code with more sequences folding uniquely and more designable folds. Some theories suggest that the quality of a folding code depends only on the number of letters in the monomer alphabet, but we find that the energy matrix itself can be at least as important as the size of the alphabet. Certain multi-letter codes, including some with 20 letters, may be less physical or protein-like than codes with smaller numbers of letters because they neglect correlations among inter-residue interactions, treat only maximally compact conformations, or add arbitrary energies to the energy matrix.
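The notion of a uniquely folding sequence can be illustrated with a minimal exhaustive-enumeration sketch for short chains on the square lattice. It assumes the simplest HP energy matrix (each non-bonded H-H contact contributes -1, everything else 0) and counts conformations once per rotation/reflection class. The function names are hypothetical and the code is not taken from the paper.

```python
MOVES = [(1, 0), (0, 1), (-1, 0), (0, -1)]


def energy(seq, coords):
    """HP energy: -1 for every pair of H residues adjacent on the lattice
    but not adjacent along the chain."""
    index = {p: i for i, p in enumerate(coords)}
    e = 0
    for i, (x, y) in enumerate(coords):
        if seq[i] != "H":
            continue
        # Looking only right and up visits each lattice bond exactly once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index.get((x + dx, y + dy))
            if j is not None and seq[j] == "H" and abs(i - j) > 1:
                e -= 1
    return e


def conformations(n):
    """Yield self-avoiding walks with n >= 2 sites, one per symmetry class.

    Rotations are removed by fixing the first step to +x; mirror images
    are removed by forcing the first turn off the x-axis to go up.
    """
    def grow(coords, occupied, turned):
        if len(coords) == n:
            yield list(coords)
            return
        x, y = coords[-1]
        for dx, dy in MOVES:
            if dy == -1 and not turned:
                continue
            nxt = (x + dx, y + dy)
            if nxt in occupied:
                continue
            coords.append(nxt)
            occupied.add(nxt)
            yield from grow(coords, occupied, turned or dy != 0)
            coords.pop()
            occupied.remove(nxt)

    start = [(0, 0), (1, 0)]
    yield from grow(start, set(start), False)


def ground_state(seq):
    """Return (lowest energy, number of conformations reaching it)."""
    best, count = 0, 0
    for coords in conformations(len(seq)):
        e = energy(seq, coords)
        if count == 0 or e < best:
            best, count = e, 1
        elif e == best:
            count += 1
    return best, count


def folds_uniquely(seq):
    """A sequence folds uniquely if its ground state is non-degenerate."""
    return ground_state(seq)[1] == 1
```

For example, `ground_state("HPPH")` finds a single lowest-energy conformation, the U-shaped walk that brings the two H residues into contact, so that sequence counts as uniquely folding in this toy setting. The cost grows exponentially with chain length, so this approach is only practical for short chains.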
Affiliation(s)
- H S Chan
- Department of Pharmaceutical Chemistry, University of California, San Francisco 94143-1204, USA
|
146
|
Yue K, Dill KA. Folding proteins with a simple energy function and extensive conformational searching. Protein Sci 1996; 5:254-61. [PMID: 8745403 PMCID: PMC2143350 DOI: 10.1002/pro.5560050209] [Citation(s) in RCA: 75] [Impact Index Per Article: 2.7] [Indexed: 02/01/2023]
Abstract
We describe a computer algorithm for predicting the three-dimensional structures of proteins using only their amino acid sequences. The method differs from others in two ways: (1) it uses very few energy parameters, representing hydrophobic and polar interactions, and (2) it uses a new "constraint-based exhaustive" searching method, which appears to be among the fastest and most complete search methods yet available for realistic protein models. It finds a relatively small number of low-energy conformations, among which are native-like conformations, for crambin (1CRN), avian pancreatic polypeptide (1PPT), melittin (2MLT), and apamin. Thus, the lowest-energy states of very simple energy functions may predict the native structures of globular proteins.
Affiliation(s)
- K Yue
- Department of Pharmaceutical Chemistry, University of California at San Francisco, 94143, USA
|
147
|
Deutsch JM, Kurosky T. New algorithm for protein design. Physical Review Letters 1996; 76:323-326. [PMID: 10061072 DOI: 10.1103/physrevlett.76.323] [Citation(s) in RCA: 67] [Impact Index Per Article: 2.4] [Indexed: 05/23/2023]
|
148
|
Bornberg-Bauer E. Simple folding model for HP lattice proteins. Bioinformatics 1996. [DOI: 10.1007/bfb0033211] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022] Open
|
149
|
Abstract
Computer simulations of simple exact lattice models are an aid in the study of the protein folding process; they have sometimes resulted in predictions that were later confirmed experimentally. The contact interactions (CI) method is here proposed as a new algorithm for the conformational search in the low-energy regions of protein chains modeled as copolymers of hydrophobic and polar monomers configured as self-avoiding walks on square or cubic lattices. It may be regarded as an extension of the standard Monte Carlo method improved by the concept of cooperativity deriving from nonlocal contact interactions. A major difference with respect to other algorithms is that criteria for the acceptance of new conformations generated during the simulations are not based on the energy of the entire molecule, but cooling factors associated with each residue define regions of the model protein with higher or lower mobility. Nine sequences of length ranging from 20 to 64 residues were used on the square lattice and 15 sequences of length ranging from 46 to 136 residues were used on the cubic lattice. The CI algorithm proved very efficient both in two and three dimensions, and allowed us to locate energy minima not found by other searching algorithms described in the literature. Use of this algorithm is not limited to the conformational search, because it allows the exploration of thermodynamic and kinetic behavior of model protein chains.
Affiliation(s)
- L Toma
- Dipartimento di Chimica Organica, Università di Pavia, Italy.
|
150
|
Hart WE, Istrail SC. Fast protein folding in the hydrophobic-hydrophilic model within three-eighths of optimal. J Comput Biol 1996; 3:53-96. [PMID: 8697239 DOI: 10.1089/cmb.1996.3.53] [Citation(s) in RCA: 59] [Impact Index Per Article: 2.1] [Indexed: 02/01/2023] Open
Abstract
We present performance-guaranteed approximation algorithms for the protein folding problem in the hydrophobic-hydrophilic model (Dill, 1985). Our algorithms are the first approximation algorithms in the literature with guaranteed performance for this model (Dill, 1994). The hydrophobic-hydrophilic model abstracts the dominant force of protein folding: the hydrophobic interaction. The protein is modeled as a chain of amino acids of length n that are of two types: H (hydrophobic, i.e., nonpolar) and P (hydrophilic, i.e., polar). Although this model is a simplification of more complex protein folding models, the protein folding structure prediction problem is notoriously difficult for this model. Our algorithms have linear (3n) or quadratic time and achieve a three-dimensional protein conformation that has a guaranteed free energy no worse than three-eighths of optimal. This result answers the open problem of Ngo et al. (1994) about the possible existence of an efficient approximation algorithm with guaranteed performance for protein structure prediction in any well-studied model of protein folding. By achieving speed and near-optimality simultaneously, our algorithms rigorously capture salient features of the recently proposed framework of protein folding by Sali et al. (1994). Equally important, the final conformations of our algorithms have significant secondary structure (antiparallel sheets, beta-sheets, compact hydrophobic core). Furthermore, hypothetical folding pathways can be described for our algorithms that fit within the framework of diffusion-collision protein folding proposed by Karplus and Weaver (1979). Computational limitations of algorithms that compute the optimal conformation have restricted their applicability to short sequences (length ≤ 90). Because our algorithms trade computational accuracy for speed, they can construct near-optimal conformations in linear time for sequences of any size.
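To make a "fraction of optimal" guarantee concrete, the sketch below computes a simple parity-based upper bound on the number of H-H contacts and checks an approximation ratio. It is not the authors' algorithm. It assumes the standard lattice argument: residues in contact must sit at chain positions of opposite parity, and an interior residue has at most 2·dim − 2 non-bonded lattice neighbours. The function names are hypothetical.

```python
def contact_upper_bound(seq, dim=3):
    """Upper bound on H-H contacts for an HP chain on a (hyper)cubic lattice.

    Every contact pairs an even-indexed H with an odd-indexed H, so the
    scarcer parity class limits the total. Each interior residue has
    2*dim - 2 free neighbours; the two chain ends have one more each,
    hence the additive 2.
    """
    even_h = sum(1 for i, r in enumerate(seq) if r == "H" and i % 2 == 0)
    odd_h = sum(1 for i, r in enumerate(seq) if r == "H" and i % 2 == 1)
    scarce = min(even_h, odd_h)
    if scarce == 0:
        return 0  # no opposite-parity H pair exists, so no contact can form
    return (2 * dim - 2) * scarce + 2


def meets_ratio(achieved_energy, optimal_energy, ratio=3 / 8):
    """True if an energy is within `ratio` of optimal.

    Energies are non-positive (-1 per contact), so a lower value is better.
    """
    return achieved_energy <= ratio * optimal_energy
```

For instance, a conformation with 3 contacts satisfies a three-eighths guarantee against an optimum of 8 contacts (`meets_ratio(-3, -8)` is true), while one with 2 contacts does not.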
Affiliation(s)
- W E Hart
- Sandia National Laboratories, Massively Parallel Computing Research Laboratory, Albuquerque, NM 87185-1110, USA
|