1
Das S, Merz KM. Molecular Gas-Phase Conformational Ensembles. J Chem Inf Model 2024; 64:749-760. [PMID: 38206321 DOI: 10.1021/acs.jcim.3c01309] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/12/2024]
Abstract
Accurately determining the global minima of a molecular structure is important in diverse scientific fields, including drug design, materials science, and chemical synthesis. Conformational search engines serve as valuable tools for exploring the extensive conformational space of molecules and for identifying energetically favorable conformations. In this study, we present a comparison of Auto3D, CREST, Balloon, and ETKDG (from RDKit), which are freely available conformational search engines, to evaluate their effectiveness in locating global minima. These engines employ distinct methodologies, including machine learning (ML) potential-based, semiempirical, and force field-based approaches. To validate these methods, we propose the use of collisional cross-section (CCS) values obtained from ion mobility-mass spectrometry studies. We hypothesize that experimental gas-phase CCS values can provide experimental evidence that we likely have the global minimum for a given molecule. To facilitate this effort, we used our gas-phase conformation library (GPCL) which currently consists of the full ensembles of 20 small molecules and can be used by the community to validate any conformational search engine. Further members of the GPCL can be readily created for any molecule of interest using our standard workflow used to compute CCS values, expanding the ability of the GPCL in validation exercises. These innovative validation techniques enhance our understanding of the conformational landscape and provide valuable insights into the performance of conformational generation engines. Our findings shed light on the strengths and limitations of each search engine, enabling informed decisions for their utilization in various scientific fields, where accurate molecular structure determination is crucial for understanding biological activity and designing targeted interventions. 
By facilitating the identification of reliable conformations, this study significantly contributes to enhancing the efficiency and accuracy of molecular structure determination, with particular focus on metabolite structure elucidation. The findings of this research also provide valuable insights for developing effective workflows for predicting the structures of unknown compounds with high precision.
Affiliation(s)
- Susanta Das
- Department of Chemistry, Michigan State University, 578 S. Shaw Lane, East Lansing, Michigan 48824, United States
- Kenneth M Merz
- Department of Chemistry, Michigan State University, 578 S. Shaw Lane, East Lansing, Michigan 48824, United States
2
Wilson MS, Landau DP. Thermodynamics of hydrophobic-polar model proteins on the face-centered cubic lattice. Phys Rev E 2021; 104:025303. [PMID: 34525583 DOI: 10.1103/physreve.104.025303] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2021] [Accepted: 07/07/2021] [Indexed: 11/07/2022]
Abstract
The HP model, a coarse-grained protein representation with only hydrophobic (H) and polar (P) amino acids, has already been extensively studied on the simple cubic (SC) lattice. However, this geometry severely restricts possible bond angles, and a simple improvement is to instead use the face-centered cubic (fcc) lattice. In this paper, the density of states and ground state energies are calculated for several benchmark HP sequences on the fcc lattice using the replica-exchange Wang-Landau algorithm and a powerful set of Monte Carlo trial moves. Results from the fcc lattice proteins are directly compared with those obtained from a previous lattice protein folding study with a similar methodology on the SC lattice. A thermodynamic analysis shows comparable folding behavior between the two lattice geometries, but with a greater rate of hydrophobic-core formation persisting into lower temperatures on the fcc lattice.
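As a rough illustration of the energy function behind the HP model described above, the sketch below counts non-bonded H-H nearest-neighbor contacts and assigns -1 per contact, which is the standard HP convention. The function name `hp_energy` and the `contact_dist2` parameter are hypothetical and not taken from the paper. The sketch assumes integer coordinates, with squared contact distance 1 on the simple cubic lattice and 2 on the fcc lattice (sites whose coordinate sum is even).

```python
from itertools import combinations


def hp_energy(sequence, coords, contact_dist2=1):
    """Return the HP energy: -1 for each non-bonded H-H nearest-neighbor pair.

    sequence      -- string of 'H' and 'P' monomers
    coords        -- list of integer (x, y, z) lattice sites, one per monomer
    contact_dist2 -- squared nearest-neighbor distance (assumed: 1 for SC, 2 for fcc)
    """
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")

    energy = 0
    for i, j in combinations(range(len(sequence)), 2):
        if j - i < 2:
            continue  # bonded neighbors along the chain do not count
        if sequence[i] == "H" and sequence[j] == "H":
            dist2 = sum((a - b) ** 2 for a, b in zip(coords[i], coords[j]))
            if dist2 == contact_dist2:
                energy -= 1
    return energy


# A four-monomer chain folded into a square on the SC lattice:
# the two terminal H monomers touch, giving one contact.
square = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
print(hp_energy("HPPH", square))
```

On the fcc lattice each site has 12 nearest neighbors rather than 6, so the same sequence can form contacts (for example between monomers i and i+2) that the simple cubic geometry forbids.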
Affiliation(s)
- Matthew S Wilson
- Center for Simulational Physics, Department of Physics and Astronomy, The University of Georgia, Athens, Georgia 30602, USA
- David P Landau
- Center for Simulational Physics, Department of Physics and Astronomy, The University of Georgia, Athens, Georgia 30602, USA
3
Farris ACK, Seaton DT, Landau DP. Effects of lattice constraints in coarse-grained protein models. J Chem Phys 2021; 154:084903. [PMID: 33639740 DOI: 10.1063/5.0038184] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022]
Abstract
We compare and contrast folding behavior in several coarse-grained protein models, both on- and off-lattice, in an attempt to uncover the effect of lattice constraints in these kinds of models. Using modern, extended ensemble Monte Carlo methods (Wang-Landau sampling, multicanonical sampling, replica-exchange Wang-Landau sampling, and replica-exchange multicanonical sampling), we investigate the thermodynamic and structural behavior of the protein Crambin within the context of the hydrophobic-polar, hydrophobic-"neutral"-polar (H0P), and semi-flexible H0P model frameworks. We uncover the folding process in all cases; all models undergo, at least, the two major structural transitions observed in nature: the coil-globule collapse and the folding transition. As the complexity of the model increases, these two major transitions begin to split into multi-step processes, wherein the lattice coarse-graining has a significant impact on the details of these processes. The results show that the level of structural coarse-graining is coupled to the level of interaction coarse-graining.
Affiliation(s)
- Alfred C K Farris
- Department of Physics and Astronomy, Oxford College of Emory University, Oxford, Georgia 30054, USA
- Daniel T Seaton
- Open Learning, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- David P Landau
- Center for Simulational Physics, Department of Physics and Astronomy, The University of Georgia, Athens, Georgia 30602, USA
4
Solving the protein folding problem in hydrophobic-polar model using deep reinforcement learning. SN Appl Sci 2020. [DOI: 10.1007/s42452-020-2012-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/25/2022]
5
Shi G, Wüst T, Landau DP. Elucidating thermal behavior, native contacts, and folding funnels of simple lattice proteins using replica exchange Wang-Landau sampling. J Chem Phys 2018; 149:164913. [PMID: 30384708 DOI: 10.1063/1.5026256] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
We studied the folding behavior of two coarse-grained lattice models, the HP (hydrophobic-polar) model and the semi-flexible H0P model, whose 124-monomer-long sequences were derived from the protein Ribonuclease A. Taking advantage of advanced parallel computing techniques, we applied replica exchange Wang-Landau sampling and calculated the density of states over the models' entire energy ranges to high accuracy. We then determined both energetic and structural quantities in order to elucidate the folding behavior of each model completely. As a result of sufficiently long sequences and model complexity, yet computational accessibility, we were able to depict distinct free energy folding funnels for both models. In particular, we found that the HP model folds in a single-step process with a very highly degenerate native state and relatively flat low temperature folding funnel minimum. By contrast, the semi-flexible H0P model folds via a multi-step process and the native state is almost four orders of magnitude less degenerate than that for the HP model. In addition, for the H0P model, the bottom of the free energy folding funnel remains rough, even at low temperatures.
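The step from a density of states to energetic quantities can be sketched as follows. This is a generic illustration, not the authors' code: given energy levels E and an estimate of ln g(E) (as Wang-Landau sampling would provide), it reweights to the canonical ensemble at temperature T and returns the mean energy and heat capacity. The function name `canonical_averages` is hypothetical, and units with k_B = 1 are assumed.

```python
import math


def canonical_averages(energies, ln_g, temperature):
    """Return (<E>, C_V) at the given temperature from ln g(E), with k_B = 1."""
    beta = 1.0 / temperature
    # Work with log weights ln g(E) - beta * E and subtract the maximum
    # before exponentiating, so large ln g values do not overflow.
    log_weights = [lg - beta * e for e, lg in zip(energies, ln_g)]
    shift = max(log_weights)
    weights = [math.exp(lw - shift) for lw in log_weights]

    z = sum(weights)
    mean_e = sum(w * e for w, e in zip(weights, energies)) / z
    mean_e2 = sum(w * e * e for w, e in zip(weights, energies)) / z
    heat_capacity = (mean_e2 - mean_e ** 2) * beta ** 2
    return mean_e, heat_capacity


# Toy two-level system with non-degenerate levels at E = 0 and E = 1.
mean_e, c_v = canonical_averages([0.0, 1.0], [0.0, 0.0], temperature=1.0)
print(mean_e, c_v)
```

Because only differences in ln g(E) enter the averages, an unnormalized density of states is sufficient for these quantities; peaks in the heat capacity as a function of T are the usual signal of structural transitions.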
Affiliation(s)
- Guangjie Shi
- Center for Simulational Physics, The University of Georgia, Athens, Georgia 30602-0002, USA
- Thomas Wüst
- Scientific IT Services, ETH Zurich, 8092 Zurich, Switzerland
- David P Landau
- Center for Simulational Physics, The University of Georgia, Athens, Georgia 30602-0002, USA
6
Farris ACK, Shi G, Wüst T, Landau DP. The role of chain-stiffness in lattice protein models: A replica-exchange Wang-Landau study. J Chem Phys 2018; 149:125101. [PMID: 30278675 DOI: 10.1063/1.5045482] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 11/14/2022]
Abstract
Using Monte Carlo simulations, we investigate simple, physically motivated extensions to the hydrophobic-polar lattice protein model for the small (46 amino acid) protein Crambin. We use two-dimensional replica-exchange Wang-Landau sampling to study the effects of a bond angle stiffness parameter on the folding and uncover a new step in the collapse process for particular values of this stiffness parameter. A physical interpretation of the folding is developed by analysis of changes in structural quantities, and the free energy landscape is explored. For these special values of stiffness, we find non-degenerate ground states, a property that is consistent with behavior of real proteins, and we use these unique ground states to elucidate the formation of native contacts during the folding process. Through this analysis, we conclude that chain-stiffness is particularly influential in the low energy, low temperature regime of the folding process once the lattice protein has partially collapsed.
Affiliation(s)
- Alfred C K Farris
- Center for Simulational Physics, Department of Physics and Astronomy, The University of Georgia, Athens, Georgia 30602, USA
- Guangjie Shi
- Center for Simulational Physics, Department of Physics and Astronomy, The University of Georgia, Athens, Georgia 30602, USA
- Thomas Wüst
- Scientific IT Services, ETH Zürich, 8092 Zürich, Switzerland
- David P Landau
- Center for Simulational Physics, Department of Physics and Astronomy, The University of Georgia, Athens, Georgia 30602, USA
7
Biological complexity: ant colony meta-heuristic optimization algorithm for protein folding. Neural Comput Appl 2016. [DOI: 10.1007/s00521-016-2252-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 10/22/2022]
8
Martins PHL, Bachmann M. Interlocking order parameter fluctuations in structural transitions between adsorbed polymer phases. Phys Chem Chem Phys 2016; 18:2143-51. [PMID: 26690091 DOI: 10.1039/c5cp05038c] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 11/21/2022]
Abstract
By means of contact-density chain-growth simulations of a simple coarse-grained lattice model for a polymer grafted at a solid homogeneous substrate, we investigate the complementary behavior of the numbers of surface-monomer and monomer-monomer contacts under various solvent and thermal conditions. This pair of contact numbers represents an appropriate set of order parameters that enables the distinct discrimination of significantly different compact phases of polymer adsorption. Depending on the transition scenario, these order parameters can interlock in perfect cooperation. The analysis helps understand the transitions from compact filmlike adsorbed polymer conformations into layered morphologies and dissolved adsorbed structures, respectively, in more detail.
Affiliation(s)
- Paulo H L Martins
- Instituto de Física, Universidade Federal de Mato Grosso, 78060-900 Cuiabá, MT, Brazil.
9
Shi G, Vogel T, Wüst T, Li YW, Landau DP. Effect of single-site mutations on hydrophobic-polar lattice proteins. Phys Rev E Stat Nonlin Soft Matter Phys 2014; 90:033307. [PMID: 25314564 DOI: 10.1103/physreve.90.033307] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 04/10/2014] [Indexed: 06/04/2023]
Abstract
We developed a heuristic method for determining the ground-state degeneracy of hydrophobic-polar (HP) lattice proteins, based on Wang-Landau and multicanonical sampling. It is applied during comprehensive studies of single-site mutations in specific HP proteins with different sequences. The effects in which we are interested include structural changes in ground states, changes of ground-state energy, degeneracy, and thermodynamic properties of the system. With respect to mutations, both extremely sensitive and insensitive positions in the HP sequence have been found. That is, ground-state energies and degeneracies, as well as other thermodynamic and structural quantities, may be either largely unaffected or may change significantly due to mutation.
Affiliation(s)
- Guangjie Shi
- Center for Simulational Physics, The University of Georgia, Athens, Georgia 30602, USA
- Thomas Vogel
- Theoretical Division (T-1), Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA
- Thomas Wüst
- Scientific IT Services, ETH Zürich IT Services, 8092 Zürich, Switzerland
- Ying Wai Li
- National Center for Computational Sciences, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831, USA
- David P Landau
- Center for Simulational Physics, The University of Georgia, Athens, Georgia 30602, USA
10
Gabrielsen M, Kurczab R, Siwek A, Wolak M, Ravna AW, Kristiansen K, Kufareva I, Abagyan R, Nowak G, Chilmonczyk Z, Sylte I, Bojarski AJ. Identification of novel serotonin transporter compounds by virtual screening. J Chem Inf Model 2014; 54:933-43. [PMID: 24521202 PMCID: PMC3982395 DOI: 10.1021/ci400742s] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Indexed: 01/23/2023]
Abstract
The serotonin (5-hydroxytryptamine, 5-HT) transporter (SERT) plays an essential role in the termination of serotonergic neurotransmission by removing 5-HT from the synaptic cleft into the presynaptic neuron. It is also of pharmacological importance being targeted by antidepressants and psychostimulant drugs. Here, five commercial databases containing approximately 3.24 million drug-like compounds have been screened using a combination of two-dimensional (2D) fingerprint-based and three-dimensional (3D) pharmacophore-based screening and flexible docking into multiple conformations of the binding pocket detected in an outward-open SERT homology model. Following virtual screening (VS), selected compounds were evaluated using in vitro screening and full binding assays and an in silico hit-to-lead (H2L) screening was performed to obtain analogues of the identified compounds. Using this multistep VS/H2L approach, 74 active compounds, 46 of which had K(i) values of ≤1000 nM, belonging to 16 structural classes, have been identified, and multiple compounds share no structural resemblance with known SERT binders.
Affiliation(s)
- Mari Gabrielsen
- Medical Pharmacology and Toxicology, Department of Medical Biology, Faculty of Health Sciences, UiT, The Arctic University of Norway , 9037 Tromsø, Norway
11
Custódio FL, Barbosa HJ, Dardenne LE. A multiple minima genetic algorithm for protein structure prediction. Appl Soft Comput 2014. [DOI: 10.1016/j.asoc.2013.10.029] [Citation(s) in RCA: 52] [Impact Index Per Article: 5.2] [Indexed: 12/17/2022]
12
Maher B, Albrecht AA, Loomes M, Yang XS, Steinhöfel K. A firefly-inspired method for protein structure prediction in lattice models. Biomolecules 2014; 4:56-75. [PMID: 24970205 PMCID: PMC4030990 DOI: 10.3390/biom4010056] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 12/01/2013] [Revised: 12/17/2013] [Accepted: 12/27/2013] [Indexed: 02/05/2023]
Abstract
We introduce a Firefly-inspired algorithmic approach for protein structure prediction over two different lattice models in three-dimensional space. In particular, we consider three-dimensional cubic and three-dimensional face-centred-cubic (FCC) lattices. The underlying energy models are the Hydrophobic-Polar (H-P) model, the Miyazawa–Jernigan (M-J) model and a related matrix model. The implementation of our approach is tested on ten H-P benchmark problems of a length of 48 and ten M-J benchmark problems of a length ranging from 48 until 61. The key complexity parameter we investigate is the total number of objective function evaluations required to achieve the optimum energy values for the H-P model or competitive results in comparison to published values for the M-J model. For H-P instances and cubic lattices, where data for comparison are available, we obtain an average speed-up over eight instances of 2.1, leaving out two extreme values (otherwise, 8.8). For six M-J instances, data for comparison are available for cubic lattices and runs with a population size of 100, where, a priori, the minimum free energy is a termination criterion. The average speed-up over four instances is 1.2 (leaving out two extreme values, otherwise 1.1), which is achieved for a population size of only eight instances. The present study is a test case with initial results for ad hoc parameter settings, with the aim of justifying future research on larger instances within lattice model settings, eventually leading to the ultimate goal of implementations for off-lattice models.
Affiliation(s)
- Brian Maher
- Department of Informatics, King's College London, Strand, London WC2R 2LS, UK.
- Andreas A Albrecht
- School of Science and Technology, Middlesex University, The Burroughs, London, NW4 4BT, UK.
- Martin Loomes
- School of Science and Technology, Middlesex University, The Burroughs, London, NW4 4BT, UK.
- Xin-She Yang
- School of Science and Technology, Middlesex University, The Burroughs, London, NW4 4BT, UK.
- Kathleen Steinhöfel
- Department of Informatics, King's College London, Strand, London WC2R 2LS, UK.
13
Kurczab R, Bojarski AJ. New strategy for receptor-based pharmacophore query construction: a case study for 5-HT₇ receptor ligands. J Chem Inf Model 2013; 53:3233-43. [PMID: 24245803 DOI: 10.1021/ci4005207] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Indexed: 11/29/2022]
Abstract
In this paper, a new approach for generating receptor-based 3D pharmacophore models for rapid in silico virtual screening is presented. The method combines information from docking poses of known ligands of different structures and further ligand-receptor complexes analyses using structural interaction fingerprints (SIFts). Next, the best linear combination of three-, four-, and five-feature pharmacophores in terms of selected performance parameter (i.e., recall, F-score, and MCC) is constructed. The resultant queries showed significantly better VS performance and new scaffold recognition when compared with the known ligand- and receptor-based pharmacophore models. The approach was developed and validated on 5-HT₇ receptor homology models created on available crystal structure templates. The efficiency of the obtained linear combinations exhibited only a minor dependence on the template selection.
Affiliation(s)
- Rafał Kurczab
- Department of Medicinal Chemistry, Institute of Pharmacology Polish Academy of Sciences , 12 Smętna Street, Kraków, 31-343, Poland
14
Yan S, Wu G. Detailed folding structures of M-lycotoxin-Hc1a and its mutageneses using 2D HP model. Mol Simul 2012. [DOI: 10.1080/08927022.2012.654473] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/28/2022]
15
Wüst T, Landau DP. Optimized Wang-Landau sampling of lattice polymers: Ground state search and folding thermodynamics of HP model proteins. J Chem Phys 2012; 137:064903. [DOI: 10.1063/1.4742969] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.8] [Indexed: 01/06/2023]
16
Liu J, Li G, Yu J, Yao Y. Heuristic energy landscape paving for protein folding problem in the three-dimensional HP lattice model. Comput Biol Chem 2012; 38:17-26. [PMID: 22551826 DOI: 10.1016/j.compbiolchem.2012.02.001] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 11/09/2011] [Revised: 01/10/2012] [Accepted: 02/14/2012] [Indexed: 10/28/2022]
Abstract
The protein folding problem, i.e., the prediction of the tertiary structures of protein molecules from their amino acid sequences, is one of the most important problems in computational biology and biochemistry. However, the extremely difficult optimization problem arising from the energy function is a key challenge in protein folding simulation. The energy landscape paving (ELP) method has already been applied very successfully to off-lattice protein models and other optimization problems with complex energy landscapes in continuous space. By improving the ELP method, and subsequently incorporating the neighborhood strategy with the pull-move set into the improved ELP method, a heuristic ELP algorithm is proposed to find low-energy conformations of 3D HP lattice model proteins in the discrete space. The algorithm is tested on three sets of 3D HP benchmark instances consisting of 31 sequences. For eleven sequences with 27 monomers, the proposed method explores the conformation surfaces more efficiently than other methods, and finds new lower energies in several cases. For ten 48-monomer sequences, we find the lowest energies so far. With the achieved results, the algorithm converges rapidly and efficiently. For all ten 64-monomer sequences, the algorithm finds lower energies than previous methods within comparable computation times. Numeric results show that the heuristic ELP method is a competitive tool for protein folding simulation in the 3D lattice model. To the best of our knowledge, this is the first application of ELP to the 3D discrete space.
Affiliation(s)
- Jingfa Liu
- School of Computer & Software, Nanjing University of Information Science & Technology, Nanjing 210044, China.
17
Swetnam AD, Allen MP. Improving the Wang-Landau algorithm for polymers and proteins. J Comput Chem 2010; 32:816-21. [DOI: 10.1002/jcc.21660] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.6] [Received: 07/05/2010] [Accepted: 08/07/2010] [Indexed: 11/09/2022]
18
Hu X, Beratan DN, Yang W. A gradient-directed Monte Carlo method for global optimization in a discrete space: application to protein sequence design and folding. J Chem Phys 2010; 131:154117. [PMID: 20568857 DOI: 10.1063/1.3236834] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Indexed: 11/14/2022]
Abstract
We apply the gradient-directed Monte Carlo (GDMC) method to select optimal members of a discrete space, the space of chemically viable proteins described by a model Hamiltonian. In contrast to conventional Monte Carlo approaches, our GDMC method uses local property gradients with respect to chemical variables that have discrete values in the actual systems, e.g., residue types in a protein sequence. The local property gradients are obtained from the interpolation of discrete property values, following the linear combination of atomic potentials scheme developed recently [M. Wang et al., J. Am. Chem. Soc. 128, 3228 (2006)]. The local property derivative information directs the search toward the global minima while the Metropolis criterion incorporated in the method overcomes barriers between local minima. Using the simple HP lattice model, we apply the GDMC method to protein sequence design and folding. The GDMC algorithm proves to be particularly efficient, suggesting that this strategy can be extended to other discrete optimization problems in addition to inverse molecular design.
Affiliation(s)
- Xiangqian Hu
- Department of Chemistry, Duke University, Durham, North Carolina 27708-0354, USA
19
Stochastic protein folding simulation in the three-dimensional HP-model. Comput Biol Chem 2008; 32:248-55. [PMID: 18485827 DOI: 10.1016/j.compbiolchem.2008.03.004] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Received: 06/04/2007] [Accepted: 03/17/2008] [Indexed: 11/23/2022]
20
Wei W, Yanlin T. A new algorithm for 2D hydrophobic-polar model: an algorithm based on hydrophobic core in square lattice. Pak J Biol Sci 2008; 11:1815-1819. [PMID: 18817222 DOI: 10.3923/pjbs.2008.1815.1819] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/26/2023]
Abstract
This study presents a new algorithm for the protein folding problem. The conformation of the hydrophobic core is a key factor in protein structure, so the algorithm first sets up a hydrophobic core constrained by a new aggregate. The hydrophilic residues between each pair of hydrophobic residues are then arranged, and the optimal conformation is obtained when no residues overlap and the chain is continuous. The algorithm effectively avoids becoming trapped in local energy minima.
Affiliation(s)
- Wang Wei
- College of Science, Guizhou University, Guiyang, Guizhou Province, China
21
Advances on protein folding simulations based on the lattice HP models with natural computing. Appl Soft Comput 2008. [DOI: 10.1016/j.asoc.2007.03.012] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.8] [Indexed: 11/19/2022]
22
Thachuk C, Shmygelska A, Hoos HH. A replica exchange Monte Carlo algorithm for protein folding in the HP model. BMC Bioinformatics 2007; 8:342. [PMID: 17875212 PMCID: PMC2071922 DOI: 10.1186/1471-2105-8-342] [Citation(s) in RCA: 73] [Impact Index Per Article: 4.3] [Received: 02/28/2007] [Accepted: 09/17/2007] [Indexed: 12/04/2022]
Abstract
Background: The ab initio protein folding problem consists of predicting protein tertiary structure from a given amino acid sequence by minimizing an energy function; it is one of the most important and challenging problems in biochemistry, molecular biology and biophysics. The ab initio protein folding problem is computationally challenging and has been shown to be NP-hard even when conformations are restricted to a lattice. In this work, we implement and evaluate the replica exchange Monte Carlo (REMC) method, which has already been applied very successfully to more complex protein models and other optimization problems with complex energy landscapes, in combination with the highly effective pull move neighbourhood in two widely studied Hydrophobic Polar (HP) lattice models. Results: We demonstrate that REMC is highly effective for solving instances of the square (2D) and cubic (3D) HP protein folding problem. When using the pull move neighbourhood, REMC outperforms current state-of-the-art algorithms for most benchmark instances. Additionally, we show that this new algorithm provides a larger ensemble of ground-state structures than the existing state-of-the-art methods. Furthermore, it scales well with sequence length, and it finds significantly better conformations on long biological sequences and sequences with a provably unique ground-state structure, which is believed to be a characteristic of real proteins. We also present evidence that our REMC algorithm can fold sequences which exhibit significant interaction between termini in the hydrophobic core relatively easily. Conclusion: We demonstrate that REMC utilizing the pull move neighbourhood significantly outperforms current state-of-the-art methods for protein structure prediction in the HP model on 2D and 3D lattices. 
This is particularly noteworthy, since so far, the state-of-the-art methods for 2D and 3D HP protein folding – in particular, the pruned-enriched Rosenbluth method (PERM) and, to some extent, Ant Colony Optimisation (ACO) – were based on chain growth mechanisms. To the best of our knowledge, this is the first application of REMC to HP protein folding on the cubic lattice, and the first extension of the pull move neighbourhood to a 3D lattice.
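The exchange step at the heart of replica exchange Monte Carlo can be sketched as below. This is a generic textbook form, not the authors' implementation: two replicas at inverse temperatures beta_i and beta_j with energies E_i and E_j swap conformations with probability min(1, exp((beta_i - beta_j)(E_i - E_j))). The function names `swap_probability` and `attempt_neighbor_swaps` are hypothetical, and the pull-move proposals used between exchanges are not shown.

```python
import math
import random


def swap_probability(beta_i, beta_j, energy_i, energy_j):
    """Metropolis acceptance probability for exchanging two replicas."""
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0 else math.exp(delta)


def attempt_neighbor_swaps(betas, energies, states, rng):
    """Try one exchange between each pair of adjacent temperatures.

    Lists are modified in place; returns the number of accepted swaps.
    """
    accepted = 0
    for k in range(len(betas) - 1):
        p = swap_probability(betas[k], betas[k + 1], energies[k], energies[k + 1])
        if rng.random() < p:
            states[k], states[k + 1] = states[k + 1], states[k]
            energies[k], energies[k + 1] = energies[k + 1], energies[k]
            accepted += 1
    return accepted


# The colder replica (beta = 1.0) holds the higher energy, so the swap
# is always accepted and the lower-energy state moves to the cold end.
betas = [1.0, 0.5]
energies = [-3.0, -5.0]
states = ["conformation A", "conformation B"]
attempt_neighbor_swaps(betas, energies, states, random.Random(0))
print(states, energies)
```

Swaps that move a lower-energy conformation to a colder replica are always accepted, while the reverse is accepted only with Boltzmann-like probability, which lets hot replicas cross barriers and feed good conformations to the cold ones.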
Affiliation(s)
- Chris Thachuk
- School of Computing Science, Simon Fraser University, Burnaby, B.C., V5A 1S6, Canada
- Alena Shmygelska
- Department of Structural Biology, Stanford University, Stanford, CA, 94305, USA
- Holger H Hoos
- Department of Computer Science, University of British Columbia, B.C., V6T 1Z4, Canada
23
Zhang J, Kou SC, Liu JS. Biopolymer structure simulation and optimization via fragment regrowth Monte Carlo. J Chem Phys 2007; 126:225101. [PMID: 17581081 DOI: 10.1063/1.2736681] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.5] [Indexed: 11/14/2022]
Abstract
An efficient exploration of the configuration space of a biopolymer is essential for its structure modeling and prediction. In this study, the authors propose a new Monte Carlo method, fragment regrowth via energy-guided sequential sampling (FRESS), which incorporates the idea of multigrid Monte Carlo into the framework of configurational-bias Monte Carlo and is suitable for chain polymer simulations. As a by-product, the authors also found a novel extension of the Metropolis Monte Carlo framework applicable to all Monte Carlo computations. They tested FRESS on hydrophobic-hydrophilic (HP) protein folding models in both two and three dimensions. For the benchmark sequences, FRESS not only found all the minimum energies obtained by previous studies with substantially less computation time but also found new lower energies for all the three-dimensional HP models with sequence length longer than 80 residues.
Affiliation(s)
- Jinfeng Zhang
- Department of Statistics, Harvard University, Science Center, Cambridge, Massachusetts 02138, USA
|
24
|
Abstract
It has been proposed that proteins fold by a process called "Zipping and Assembly" (Z&A). Zipping refers to the growth of local substructures within the chain, and assembly refers to the coming together of already-formed pieces. Our interest here is in whether Z&A is a general method that can fold most of sequence space, to global minima, efficiently. Using the HP model, we can address this question by enumerating full conformation and sequence spaces. We find that Z&A reaches the global energy minimum native states, even though it searches only a very small fraction of conformational space, for most sequences in the full sequence space. We find that Z&A, a mechanism-based search, is more efficient in our tests than the replica exchange search method. Folding efficiency is increased for chains having: (a) small loop-closure steps, consistent with observations by Plaxco et al. 1998;277;985-994 that folding rates correlate with contact order, (b) neither too few nor too many nucleation sites per chain, and (c) assembly steps that do not occur too early in the folding process. We find that the efficiency increases with chain length, although our range of chain lengths is limited. We believe these insights may be useful for developing faster protein conformational search algorithms.
Affiliation(s)
- Vincent A Voelz
- Graduate Group in Biophysics, University of California at San Francisco, San Francisco, California 94143, USA
|
25
|
Skolnick J, Kolinski A. Monte Carlo Approaches to the Protein Folding Problem. ADVANCES IN CHEMICAL PHYSICS 2007. [DOI: 10.1002/9780470141649.ch7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 01/27/2023]
|
26
|
Abstract
The free energy landscape of protein folding is rugged, occasionally characterized by compact, intermediate states of low free energy. In computational folding, this landscape leads to trapped, compact states with incorrect secondary structure. We devised a residue-specific, protein backbone move set for efficient sampling of protein-like conformations in computational folding simulations. The move set is based on the selection of a small set of backbone dihedral angles, derived from clustering dihedral angles sampled from experimental structures. We show in both simulated annealing and replica exchange Monte Carlo (REMC) simulations that the knowledge-based move set, when compared with a conventional move set, shows a statistically significant improvement in overcoming kinetic barriers, reaching deeper energy minima, and achieving correspondingly lower RMSDs to native structures. The new move set is also more efficient, reaching low energy states considerably faster. Use of this move set in determining the energy minimum state and for calculating thermodynamic quantities is discussed.
Affiliation(s)
- William W Chen
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts 02138, USA
|
27
|
Liang F. Annealing contour Monte Carlo algorithm for structure optimization in an off-lattice protein model. J Chem Phys 2006; 120:6756-63. [PMID: 15267570 DOI: 10.1063/1.1665529] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.6] [Indexed: 11/14/2022]
Abstract
We present a space annealing version for a contour Monte Carlo algorithm and show that it can be applied successfully to finding the ground states for an off-lattice protein model. The comparison shows that the algorithm has made a significant improvement over the pruned-enriched-Rosenbluth method and the Metropolis Monte Carlo method in finding the ground states for AB models. For all sequences, the algorithm has renewed the putative ground energy values in the two-dimensional AB model and set the putative ground energy values in the three-dimensional AB model.
Affiliation(s)
- Faming Liang
- Department of Statistics, Texas A&M University, College Station, Texas 77843-3143, USA.
|
28
|
Kou SC, Oh J, Wong WH. A study of density of states and ground states in hydrophobic-hydrophilic protein folding models by equi-energy sampling. J Chem Phys 2006; 124:244903. [PMID: 16821999 DOI: 10.1063/1.2208607] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.4] [Indexed: 11/14/2022]
Abstract
We propose an equi-energy (EE) sampling approach to study protein folding in the two-dimensional hydrophobic-hydrophilic (HP) lattice model. This approach enables efficient exploration of the global energy landscape and provides accurate estimates of the density of states, which in turn allows a detailed study of the thermodynamics of HP protein folding, in particular of the temperature dependence of the transition from folding to unfolding and of how sequence composition affects this phenomenon. At no extra cost, this approach also provides estimates of global energy minima and ground states. Without using any prior structural information about the protein, the EE sampler finds ground states that match the best known results in most benchmark cases. The numerical results show that it is a powerful method for studying lattice protein folding models.
Affiliation(s)
- S C Kou
- Department of Statistics, Science Center, Harvard University, Cambridge, Massachusetts 02138, USA.
|
29
|
Abstract
A new benchmark 20-bead HP model protein sequence (on a square lattice), which has 17 distinct but degenerate global minimum (GM) energy structures, has been studied using a genetic algorithm (GA). The relative probabilities of finding particular GM conformations are determined and related to the theoretical probability of generating these structures using a recoil growth constructor operator. It is found that for longer successful GA runs, the GM probability distribution is generally very different from the constructor probability, as other GA operators have had time to overcome any initial bias in the originally generated population of structures. Structural and metric relationships (e.g., Hamming distances) between the 17 distinct GM are investigated and used, in conjunction with data on the connectivities of the GM and the pathways that link them, to explain the GM probability distributions obtained by the GA. A comparison is made of searches where the sequence is defined in the normal (forward) and reverse directions. The ease of finding mirror image solutions is also compared. Finally, this approach is applied to rationalize the ease or difficulty of finding the GM for a number of standard benchmark HP sequences on the square lattice. It is shown that the relative probabilities of finding particular members of a set of degenerate global minima depend critically on the topography of the energy landscape in the vicinity of the GM, the connections and distances between the GM, and the nature of the operators used in the chosen search method.
Affiliation(s)
- Graham A Cox
- School of Chemistry, University of Birmingham, Edgbaston, Birmingham B15 2TT, United Kingdom
|
30
|
Błazewicz J, Lukasiak P, Miłostan M. Application of tabu search strategy for finding low energy structure of protein. Artif Intell Med 2005; 35:135-45. [PMID: 16051476 DOI: 10.1016/j.artmed.2005.02.001] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Received: 11/18/2004] [Revised: 01/24/2005] [Accepted: 02/22/2005] [Indexed: 11/22/2022]
Abstract
OBJECTIVE Understanding protein functionality would mean understanding the basics of life. This functionality follows from the three-dimensional structure of proteins. Unfortunately, it has so far not been possible to obtain these structures artificially. This article offers a survey on the use of meta-heuristic methods in the context of simplified models of protein folding. METHODS Tabu search (TS) is one of the most successful meta-heuristics and has been applied to a large number of optimization problems. In this paper, the application of TS to finding low energy conformations of proteins in a simplified lattice model is proposed. RESULTS The algorithm has been extensively tested, and the tests showed good performance. It compares well with the other heuristic approaches. CONCLUSIONS The approach presented is competitive with other methods and, owing to its low computation time, can be used as a complementary tool for the analysis of three-dimensional protein structures.
Affiliation(s)
- Jacek Błazewicz
- Institute of Computing Science, Poznań University of Technology, Piotrowo 3a, 60-965 Poznań, Poland
|
31
|
Huang W, Lü Z, Shi H. Growth algorithm for finding low energy configurations of simple lattice proteins. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2005; 72:016704. [PMID: 16090131 DOI: 10.1103/physreve.72.016704] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 12/20/2004] [Indexed: 05/03/2023]
Abstract
PERM and its new variant nPERMis have been developed to optimize the energy function of protein folding based on the simple HP lattice model and were found to outperform all previous fully blind general-purpose algorithms. Using the concepts of core-guiding and life-forecasting, we propose a new version of nPERMis, called nPERMh. A major difference with respect to nPERMis is that the criteria for further growth of a new residue are based on the species of the currently growing monomer and its position in the HP sequence. Seventeen sequences of length ranging from 46 to 124 residues were tested with nPERMh on the cubic lattice, and our algorithm proved very efficient. It should be pointed out that our new version of nPERMis is exclusively designed for conformational search. We hope that similar methods will ultimately be useful for finding native states of more realistic protein models.
Affiliation(s)
- Wenqi Huang
- School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei Province, 430074, China
|
32
|
Schiemann R, Bachmann M, Janke W. Exact sequence analysis for three-dimensional hydrophobic-polar lattice proteins. J Chem Phys 2005; 122:114705. [PMID: 15836241 DOI: 10.1063/1.1814941] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Indexed: 11/14/2022]
Abstract
We have exactly enumerated all sequences and conformations of hydrophobic-polar (HP) proteins with chains of up to 19 monomers on the simple cubic lattice. For two variants of the HP model, where only two types of monomers are distinguished, we determined and statistically analyzed designing sequences, i.e., sequences that have a nondegenerate ground state. Furthermore we were interested in characteristic thermodynamic properties of HP proteins with designing sequences. In order to be able to perform these exact studies, we applied an efficient enumeration method based on contact sets.
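To make the idea of exact enumeration and ground-state degeneracy concrete, here is a minimal sketch. It is not the authors' contact-set method: it assumes the plain HP energy (-1 per non-bonded H-H lattice contact), uses the 2D square lattice instead of the cubic lattice to keep the example tiny, and all function names are hypothetical. Fixing the first bond removes rotations but not the mirror image, so a count of 2 here corresponds to a single ground state up to reflection.

```python
# Brute-force enumeration of HP conformations on the 2D square lattice.
# Assumptions (not from the source): energy = -1 per H-H contact between
# monomers that are lattice neighbours but not chain neighbours.

MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def hp_energy(seq, coords):
    """Return the HP energy of a conformation given as a tuple of (x, y) sites."""
    index_at = {site: i for i, site in enumerate(coords)}
    energy = 0
    for i, (x, y) in enumerate(coords):
        if seq[i] != "H":
            continue
        for dx, dy in MOVES:
            j = index_at.get((x + dx, y + dy))
            # j > i + 1 counts each pair once and skips bonded neighbours.
            if j is not None and j > i + 1 and seq[j] == "H":
                energy -= 1
    return energy


def enumerate_walks(n):
    """Yield all self-avoiding walks of n monomers; the first bond is fixed to +x."""
    if n < 1:
        raise ValueError("chain must contain at least one monomer")
    if n == 1:
        yield ((0, 0),)
        return

    def extend(path, occupied):
        if len(path) == n:
            yield tuple(path)
            return
        x, y = path[-1]
        for dx, dy in MOVES:
            nxt = (x + dx, y + dy)
            if nxt not in occupied:
                path.append(nxt)
                occupied.add(nxt)
                yield from extend(path, occupied)
                path.pop()
                occupied.remove(nxt)

    start = [(0, 0), (1, 0)]
    yield from extend(start, set(start))


def ground_state(seq):
    """Return (lowest energy, number of enumerated walks reaching it)."""
    best, count = 0, 0
    for walk in enumerate_walks(len(seq)):
        e = hp_energy(seq, walk)
        if e < best:
            best, count = e, 1
        elif e == best:
            count += 1
    return best, count
```

For example, `ground_state("HPPH")` enumerates the nine walks of length four and reports how many reach the lowest energy. The cost grows exponentially with chain length, which is why exact studies of this kind are limited to short chains.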
|
33
|
|
34
|
An ant colony optimisation algorithm for the 2D and 3D hydrophobic polar protein folding problem. BMC Bioinformatics 2005; 6:30. [PMID: 15710037 PMCID: PMC555464 DOI: 10.1186/1471-2105-6-30] [Citation(s) in RCA: 67] [Impact Index Per Article: 3.5] [Received: 07/16/2004] [Accepted: 02/14/2005] [Indexed: 11/24/2022]
Abstract
Background The protein folding problem is a fundamental problem in computational molecular biology and biochemical physics. Various optimisation methods have been applied to formulations of the ab-initio folding problem that are based on reduced models of protein structure, including Monte Carlo methods, Evolutionary Algorithms, Tabu Search and hybrid approaches. In our work, we have introduced an ant colony optimisation (ACO) algorithm to address the non-deterministic polynomial-time hard (NP-hard) combinatorial problem of predicting a protein's conformation from its amino acid sequence under a widely studied, conceptually simple model – the 2-dimensional (2D) and 3-dimensional (3D) hydrophobic-polar (HP) model. Results We present an improvement of our previous ACO algorithm for the 2D HP model and its extension to the 3D HP model. We show that this new algorithm, dubbed ACO-HPPFP-3, performs better than previous state-of-the-art algorithms on sequences whose native conformations do not contain structural nuclei (parts of the native fold that predominantly consist of local interactions) at the ends, but rather in the middle of the sequence, and that it generally finds a more diverse set of native conformations. Conclusions The application of ACO to this bioinformatics problem compares favourably with specialised, state-of-the-art methods for the 2D and 3D HP protein folding problem; our empirical results indicate that our rather simple ACO algorithm scales worse with sequence length but usually finds a more diverse ensemble of native states. Therefore the development of ACO algorithms for more complex and realistic models of protein structure holds significant promise.
|
35
|
Abstract
We calculate thermodynamic quantities of hydrophobic-polar (HP) lattice proteins by means of a multicanonical chain-growth algorithm that connects the new variants of the Pruned-Enriched Rosenbluth Method and flat histogram sampling of the entire energy space. Since our method directly simulates the density of states, we obtain results for thermodynamic quantities of the system for all temperatures. In particular, this algorithm enables us to accurately simulate the low-temperature region, which is usually difficult to access. Therefore, it becomes possible to perform detailed analyses of the low-temperature transition between ground states and compact globules.
Affiliation(s)
- Michael Bachmann
- Institut für Theoretische Physik, Universität Leipzig, Augustusplatz 10/11, D-04109 Leipzig, Germany.
|
36
|
Bachmann M, Janke W. Multicanonical chain-growth algorithm. PHYSICAL REVIEW LETTERS 2003; 91:208105. [PMID: 14683403 DOI: 10.1103/physrevlett.91.208105] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.3] [Received: 04/27/2003] [Indexed: 05/24/2023]
Abstract
We present a temperature-independent Monte Carlo method for the determination of the density of states of lattice proteins that combines the fast ground-state search strategy of the new pruned-enriched Rosenbluth chain-growth method and multicanonical reweighting for sampling the complete energy space. Since the density of states contains all energetic information of a statistical system, we can directly calculate the mean energy, specific heat, Helmholtz free energy, and entropy for all temperatures. We apply this method to lattice proteins consisting of hydrophobic and polar monomers, and for the examples of sequences considered, we identify the transitions between native, globule, and random coil states. Since no special properties of heteropolymers are involved in this algorithm, the method applies to polymer models as well.
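The step from the density of states to the thermodynamic quantities named above can be written out with the standard canonical relations. This is a textbook sketch rather than notation taken from the paper: $g(E)$ denotes the density of states, $k_B$ the Boltzmann constant and $\beta = 1/(k_B T)$, all assumed here for illustration.

```latex
\begin{align}
Z(T) &= \sum_{E} g(E)\, e^{-\beta E}, \qquad \beta = \frac{1}{k_B T},\\
\langle E \rangle(T) &= \frac{1}{Z(T)} \sum_{E} E\, g(E)\, e^{-\beta E},\\
C_V(T) &= \frac{\langle E^2 \rangle - \langle E \rangle^2}{k_B T^2},\\
F(T) &= -k_B T \ln Z(T),\\
S(T) &= \frac{\langle E \rangle(T) - F(T)}{T}.
\end{align}
```

Because $g(E)$ does not depend on temperature, a single estimate of it gives all of these quantities at every $T$ by re-evaluating the sums.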
Affiliation(s)
- Michael Bachmann
- Institut für Theoretische Physik, Universität Leipzig, Augustusplatz 10/11, D-04109 Leipzig, Germany.
|
37
|
Hsu HP, Mehra V, Grassberger P. Structure optimization in an off-lattice protein model. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2003; 68:037703. [PMID: 14524935 DOI: 10.1103/physreve.68.037703] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.4] [Received: 05/21/2003] [Indexed: 05/24/2023]
Abstract
We study an off-lattice protein toy model with two species of monomers interacting through modified Lennard-Jones interactions. Low energy configurations are optimized using the pruned-enriched-Rosenbluth method (PERM), hitherto employed to native state searches only for off-lattice models. For two dimensions we found states with lower energy than previously proposed putative ground states for all chain lengths ≥13. This indicates that PERM has the potential to produce native states also for more realistic protein models. For d=3, where no published ground states exist, we present some putative lowest energy states for future comparison with other methods.
Affiliation(s)
- Hsiao-Ping Hsu
- John-von-Neumann Institute for Computing, Forschungszentrum Jülich, D-52425 Jülich, Germany
|
38
|
Hsu HP, Mehra V, Nadler W, Grassberger P. Growth-based optimization algorithm for lattice heteropolymers. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2003; 68:021113. [PMID: 14524959 DOI: 10.1103/physreve.68.021113] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.2] [Received: 09/16/2002] [Indexed: 11/07/2022]
Abstract
An improved version of the pruned-enriched-Rosenbluth method (PERM) is proposed and tested on finding lowest energy states in simple models of lattice heteropolymers. It is found to outperform not only the previous version of PERM, but also all other fully blind general purpose stochastic algorithms which have been employed on this problem. In many cases, it found new lowest energy states missed in previous papers. Limitations are discussed.
Affiliation(s)
- Hsiao-Ping Hsu
- John-von-Neumann Institute for Computing, Forschungszentrum Jülich, D-52425 Jülich, Germany
|
39
|
Chou CI, Han RS, Li SP, Lee TK. Guided simulated annealing method for optimization problems. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2003; 67:066704. [PMID: 16241377 DOI: 10.1103/physreve.67.066704] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.1] [Received: 01/23/2003] [Revised: 03/28/2003] [Indexed: 11/07/2022]
Abstract
Incorporating the concept of order parameter of the mean-field theory into the simulated annealing method, we present an optimization algorithm, the guided simulated annealing method. In this method mean-field order parameters are calculated to guide the configuration search for the global minimum. Allowing fluctuations and improvement of mean-field values iteratively, this method successfully identifies global minima for several difficult optimization problems. Application of this method to the HP lattice-protein model has found another lowest-energy state for an N=100 sequence that was not found by other methods before. Results for spin glass models are also presented which show improvement over the previous results.
Affiliation(s)
- C I Chou
- Institute of Physics, Academia Sinica, Taipei, Taiwan
|
40
|
|
41
|
Hsu HP, Mehra V, Nadler W, Grassberger P. Growth algorithms for lattice heteropolymers at low temperatures. J Chem Phys 2003. [DOI: 10.1063/1.1522710] [Citation(s) in RCA: 116] [Impact Index Per Article: 5.5] [Indexed: 11/14/2022]
|
42
|
Zhang JL, Liu JS. A new sequential importance sampling method and its application to the two-dimensional hydrophobic–hydrophilic model. J Chem Phys 2002. [DOI: 10.1063/1.1494415] [Citation(s) in RCA: 52] [Impact Index Per Article: 2.4] [Indexed: 11/14/2022]
|
43
|
Dominy BN, Brooks CL. Identifying native-like protein structures using physics-based potentials. J Comput Chem 2002; 23:147-60. [PMID: 11913380 DOI: 10.1002/jcc.10018] [Citation(s) in RCA: 85] [Impact Index Per Article: 3.9] [Indexed: 11/06/2022]
Abstract
As the field of structural genomics matures, new methods will be required that can accurately and rapidly distinguish reliable structure predictions from those that are more dubious. We present a method based on the CHARMM gas phase implicit hydrogen force field in conjunction with a generalized Born implicit solvation term that allows one to make such discrimination. We begin by analyzing pairs of threaded structures from the EMBL database, and find that it is possible to identify the misfolded structures with over 90% accuracy. Further, we find that misfolded states are generally favored by the solvation term due to the mispairing of favorable intramolecular ionic contacts. We also examine 29 sets of 29 misfolded globin sequences from Levitt's "Decoys 'R' Us" database generated using a sequence homology-based method. Again, we find that discrimination is possible with approximately 90% accuracy. Also, even in these less distorted structures, mispairing of ionic contacts results in a more favorable solvation energy for misfolded states. This is also found to be the case for collapsed, partially folded conformations of CspA and protein G taken from folding free energy calculations. We also find that the inclusion of the generalized Born solvation term, in postprocess energy evaluation, improves the correlation between structural similarity and energy in the globin database. This significantly improves the reliability of the hypothesis that more energetically favorable structures are also more similar to the native conformation. Additionally, we examine seven extensive collections of misfolded structures created by Park and Levitt using a four-state reduced model also contained in the "Decoys 'R' Us" database. 
Results from these large databases confirm those obtained in the EMBL and misfolded globin databases concerning predictive accuracy, the energetic advantage of misfolded proteins regarding the solvation component, and the improved correlation between energy and structural similarity due to implicit solvation. Z-scores computed for these databases are improved by including the generalized Born implicit solvation term, and are found to be comparable to trained and knowledge-based scoring functions. Finally, we briefly explore the dynamic behavior of a misfolded protein relative to properly folded conformations. We demonstrate that the misfolded conformation diverges quickly from its initial structure while the properly folded states remain stable. Proteins in this study are shown to be more stable than their misfolded counterparts and readily identified based on energetic as well as dynamic criteria. In summary, we demonstrate the utility of physics-based force fields in identifying native-like conformations in a variety of preconstructed structural databases. The details of this discrimination are shown to be dependent on the construction of the structural database.
Affiliation(s)
- Brian N Dominy
- Department of Molecular Biology, The Scripps Research Institute, La Jolla, California 92037, USA
|
44
|
Gibbs N, Clarke AR, Sessions RB. Ab initio protein structure prediction using physicochemical potentials and a simplified off-lattice model. Proteins 2001; 43:186-202. [PMID: 11276088 DOI: 10.1002/1097-0134(20010501)43:2<186::aid-prot1030>3.0.co;2-l] [Citation(s) in RCA: 39] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/28/2023]
Abstract
This study describes a computational method for ab initio protein structure prediction. Protein conformation has been modeled by using six optimized backbone torsion angles and fixed side chains approximating rotationally averaged real side chains. The approximations aim to keep complexity of the structure description to a minimum without seriously compromising the accuracy of the structural representation. An evolutionary Monte Carlo algorithm has been developed to search through this restricted conformational space to locate low-energy protein structures. A simple physicochemical force field has been developed to assess the energies of different conformations within this structural description. The corresponding residue interaction energies are based on hydrophobic, hydrophilic, steric, and hydrogen-bonding potentials. The search procedure has been used to locate native energy minima from primary sequence alone. The 3-D structures of polypeptides up to 38 residues with both beta and alpha secondary structural elements have been accurately predicted. The search procedure has been found to be highly efficient and follows an energetically and structurally plausible pathway to locate native populations. The simple force field described in the study has been compared with a more complex all-atom model and been found to be similarly effective in predicting the structures of proposed independent folding units. Proteins 2001;43:186-202.
Affiliation(s)
- N Gibbs
- Department of Biochemistry, School of Medical Sciences, University of Bristol, University Walk, Bristol BS8 1TD, United Kingdom
|
45
|
Convex Global Underestimation for Molecular Structure Prediction. 2001. [DOI: 10.1007/978-1-4757-5284-7_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
|
46
|
|
47
|
Application of the Gaussian theory of elastomeric networks to native proteins: analysis of fluctuations and the dynamic scattering function. 1999. [DOI: 10.1016/s1089-3156(99)00018-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.0] [Indexed: 11/23/2022]
|
48
|
|
49
|
Derreumaux P. Finding the low-energy forms of avian pancreatic polypeptide with the diffusion-process-controlled Monte Carlo method. J Chem Phys 1998. [DOI: 10.1063/1.476708] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.4] [Indexed: 11/14/2022]
|
50
|
Abstract
If protein structure prediction methods are to make any impact on the impending onerous task of analyzing the large numbers of unknown protein sequences generated by the ongoing genome-sequencing projects, it is vital that they make the difficult transition from computational 'gedankenexperiments' to practical software tools. This has already happened in the field of comparative modelling and is currently happening in the threading field. Unfortunately, there is little evidence of this transition happening in the field of ab initio tertiary-structure prediction.
Affiliation(s)
- D T Jones
- Department of Biological Sciences, University of Warwick, Coventry, UK.
|