1
Chew PY, Joseph JA, Collepardo-Guevara R, Reinhardt A. Aromatic and arginine content drives multiphasic condensation of protein-RNA mixtures. Biophys J 2024; 123:1342-1355. [PMID: 37408305 PMCID: PMC11163273 DOI: 10.1016/j.bpj.2023.06.024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/16/2023] [Revised: 06/20/2023] [Accepted: 06/30/2023] [Indexed: 07/07/2023]
Abstract
Multiphasic architectures are found ubiquitously in biomolecular condensates and are thought to have important implications for the organization of multiple chemical reactions within the same compartment. Many of these multiphasic condensates contain RNA in addition to proteins. Here, we investigate the importance of different interactions in multiphasic condensates comprising two different proteins and RNA using computer simulations with a residue-resolution coarse-grained model of proteins and RNA. We find that in multilayered condensates containing RNA in both phases, protein-RNA interactions dominate, with aromatic residues and arginine forming the key stabilizing interactions. The total aromatic and arginine content of the two proteins must be appreciably different for distinct phases to form, and we show that this difference increases as the system is driven toward greater multiphasicity. Using the trends observed in the different interaction energies of this system, we demonstrate that we can also construct multilayered condensates with RNA preferentially concentrated in one phase. The "rules" identified can thus enable the design of synthetic multiphasic condensates to facilitate further study of their organization and function.
Affiliation(s)
- Pin Yu Chew
  - Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge, United Kingdom
- Jerelle A Joseph
  - Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey
- Rosana Collepardo-Guevara
  - Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge, United Kingdom
  - Department of Physics, University of Cambridge, Cambridge, United Kingdom
  - Department of Genetics, University of Cambridge, Cambridge, United Kingdom
- Aleks Reinhardt
  - Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge, United Kingdom
2
Scalzitti N, Miralavy I, Korenchan DE, Farrar CT, Gilad AA, Banzhaf W. Computational peptide discovery with a genetic programming approach. J Comput Aided Mol Des 2024; 38:17. [PMID: 38570405 DOI: 10.1007/s10822-024-00558-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Accepted: 03/07/2024] [Indexed: 04/05/2024]
Abstract
The development of peptides for therapeutic targets or biomarkers for disease diagnosis is a challenging task in protein engineering. Current approaches are tedious, often time-consuming and require complex laboratory data due to the vast search spaces that need to be considered. In silico methods can accelerate research and substantially reduce costs. Evolutionary algorithms are a promising approach for exploring large search spaces and can facilitate the discovery of new peptides. This study presents the development and use of a new variant of the genetic-programming-based POET algorithm, called POETRegex, where individuals are represented by a list of regular expressions. This algorithm was trained on a small curated dataset and employed to generate new peptides improving the sensitivity of peptides in magnetic resonance imaging with chemical exchange saturation transfer (CEST). The resulting model achieves a performance gain of 20% over the initial POET models and is able to predict a candidate peptide with a 58% performance increase compared to the gold-standard peptide. By combining the power of genetic programming with the flexibility of regular expressions, new peptide targets were identified that improve the sensitivity of detection by CEST. This approach provides a promising research direction for the efficient identification of peptides with therapeutic or diagnostic potential.
Affiliation(s)
- Nicolas Scalzitti
  - BEACON Center of Evolution in Action, Michigan State University, East Lansing, MI, USA
  - Department of Computer Science and Engineering, Michigan State University, East Lansing, MI, USA
- Iliya Miralavy
  - BEACON Center of Evolution in Action, Michigan State University, East Lansing, MI, USA
  - Department of Computer Science and Engineering, Michigan State University, East Lansing, MI, USA
- David E Korenchan
  - Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Christian T Farrar
  - Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Assaf A Gilad
  - BEACON Center of Evolution in Action, Michigan State University, East Lansing, MI, USA
  - Department of Chemical Engineering, Michigan State University, East Lansing, MI, USA
  - Department of Radiology, Michigan State University, East Lansing, MI, USA
- Wolfgang Banzhaf
  - BEACON Center of Evolution in Action, Michigan State University, East Lansing, MI, USA
  - Department of Computer Science and Engineering, Michigan State University, East Lansing, MI, USA
3
Chew PY, Joseph JA, Collepardo-Guevara R, Reinhardt A. Thermodynamic origins of two-component multiphase condensates of proteins. Chem Sci 2023; 14:1820-1836. [PMID: 36819870 PMCID: PMC9931050 DOI: 10.1039/d2sc05873a] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2022] [Accepted: 01/06/2023] [Indexed: 01/26/2023]
Abstract
Intracellular condensates are highly multi-component systems in which complex phase behaviour can ensue, including the formation of architectures comprising multiple immiscible condensed phases. Relying solely on physical intuition to manipulate such condensates is difficult because of the complexity of their composition, and systematically learning the underlying rules experimentally would be extremely costly. We address this challenge by developing a computational approach to design pairs of protein sequences that result in well-separated multilayered condensates and elucidate the molecular origins of these compartments. Our method couples a genetic algorithm to a residue-resolution coarse-grained protein model. We demonstrate that we can design protein partners to form multiphase condensates containing naturally occurring proteins, such as the low-complexity domain of hnRNPA1 and its mutants, and show how homo- and heterotypic interactions must differ between proteins to result in multiphasicity. We also show that in some cases the specific pattern of amino-acid residues plays an important role. Our findings have wide-ranging implications for understanding and controlling the organisation, functions and material properties of biomolecular condensates.
Affiliation(s)
- Pin Yu Chew
  - Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, UK
- Jerelle A. Joseph
  - Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, UK
  - Department of Physics, University of Cambridge, Cambridge CB3 0HE, UK
  - Department of Genetics, University of Cambridge, Cambridge CB2 3EH, UK
- Rosana Collepardo-Guevara
  - Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, UK
  - Department of Physics, University of Cambridge, Cambridge CB3 0HE, UK
  - Department of Genetics, University of Cambridge, Cambridge CB2 3EH, UK
- Aleks Reinhardt
  - Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, UK
4
Villard J, Kılıç M, Rothlisberger U. Surrogate Based Genetic Algorithm Method for Efficient Identification of Low-Energy Peptide Structures. J Chem Theory Comput 2023; 19:1080-1097. [PMID: 36692853 PMCID: PMC9933449 DOI: 10.1021/acs.jctc.2c01078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/25/2023]
Abstract
Identification of the most stable structure(s) of a system is a prerequisite for the calculation of any of its properties from first-principles. However, even for relatively small molecules, exhaustive explorations of the potential energy surface (PES) are severely hampered by the dimensionality bottleneck. In this work, we address the challenging task of efficiently sampling realistic low-lying peptide coordinates by resorting to a surrogate based genetic algorithm (GA)/density functional theory (DFT) approach (sGADFT) in which promising candidates provided by the GA are ultimately optimized with DFT. We provide a benchmark of several computational methods (GAFF, AMOEBApro13, PM6, PM7, DFTB3-D3(BJ)) as possible prescanning surrogates and apply sGADFT to two test case systems that are (i) two isomer families of the protonated Gly-Pro-Gly-Gly tetrapeptide (Masson, A.; J. Am. Soc. Mass Spectrom. 2015, 26, 1444-1454) and (ii) the doubly protonated cyclic decapeptide gramicidin S (Nagornova, N. S.; J. Am. Chem. Soc. 2010, 132, 4040-4041). We show that our GA procedure can correctly identify low-energy minima in as little as a few hours. Subsequent refinement of surrogate low-energy structures within a given energy threshold (≤10 kcal/mol (i), ≤5 kcal/mol (ii)) via DFT relaxation invariably led to the identification of the most stable structures as determined from high-resolution infrared (IR) spectroscopy at low temperature. The sGADFT method therefore constitutes a highly efficient route for the screening of realistic low-lying peptide structures in the gas phase as needed for instance for the interpretation and assignment of experimental IR spectra.
5
Boumedine N, Bouroubi S. Protein folding in 3D lattice HP model using a combining cuckoo search with the Hill-Climbing algorithms. Appl Soft Comput 2022. [DOI: 10.1016/j.asoc.2022.108564] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/02/2022]
6
Lichtinger SM, Garaizar A, Collepardo-Guevara R, Reinhardt A. Targeted modulation of protein liquid-liquid phase separation by evolution of amino-acid sequence. PLoS Comput Biol 2021; 17:e1009328. [PMID: 34428231 PMCID: PMC8415608 DOI: 10.1371/journal.pcbi.1009328] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 04/29/2021] [Revised: 09/03/2021] [Accepted: 08/07/2021] [Indexed: 12/27/2022]
Abstract
Rationally and efficiently modifying the amino-acid sequence of proteins to control their ability to undergo liquid-liquid phase separation (LLPS) on demand is not only highly desirable, but can also help to elucidate which protein features are important for LLPS. Here, we propose a computational method that couples a genetic algorithm to a sequence-dependent coarse-grained protein model to evolve the amino-acid sequences of phase-separating intrinsically disordered protein regions (IDRs), and purposely enhance or inhibit their capacity to phase-separate. We validate the predicted critical solution temperatures of the mutated sequences with ABSINTH, a more accurate all-atom model. We apply the algorithm to the phase-separating IDRs of three naturally occurring proteins, namely FUS, hnRNPA1 and LAF1, as prototypes of regions that exist in cells and undergo homotypic LLPS driven by different types of intermolecular interaction, and we find that the evolution of amino-acid sequences towards enhanced LLPS is driven in these three cases, among other factors, by an increase in the average size of the amino acids. However, the direction of change in the molecular driving forces that enhance LLPS (such as hydrophobicity, aromaticity and charge) depends on the initial amino-acid sequence. Finally, we show that the evolution of amino-acid sequences to modulate LLPS is strongly coupled to the make-up of the medium (e.g. the presence or absence of RNA), which may have significant implications for our understanding of phase separation within the many-component mixtures of biological systems.
Affiliation(s)
- Simon M. Lichtinger
  - Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge, United Kingdom
- Adiran Garaizar
  - Department of Physics, Cavendish Laboratory, Maxwell Centre, University of Cambridge, Cambridge, United Kingdom
- Rosana Collepardo-Guevara
  - Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge, United Kingdom
  - Department of Physics, Cavendish Laboratory, Maxwell Centre, University of Cambridge, Cambridge, United Kingdom
  - Department of Genetics, University of Cambridge, Cambridge, United Kingdom
- Aleks Reinhardt
  - Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge, United Kingdom
7
Nazmul R, Chetty M, Chowdhury AR. An improved memetic approach for protein structure prediction incorporating maximal hydrophobic core estimation concept. Knowl Based Syst 2021. [DOI: 10.1016/j.knosys.2018.06.022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/28/2022]
8
Solving a novel designed second order nonlinear Lane–Emden delay differential model using the heuristic techniques. Appl Soft Comput 2021. [DOI: 10.1016/j.asoc.2021.107105] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Indexed: 11/23/2022]
9
Farris ACK, Seaton DT, Landau DP. Effects of lattice constraints in coarse-grained protein models. J Chem Phys 2021; 154:084903. [PMID: 33639740 DOI: 10.1063/5.0038184] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022]
Abstract
We compare and contrast folding behavior in several coarse-grained protein models, both on- and off-lattice, in an attempt to uncover the effect of lattice constraints in these kinds of models. Using modern, extended ensemble Monte Carlo methods (Wang-Landau sampling, multicanonical sampling, replica-exchange Wang-Landau sampling, and replica-exchange multicanonical sampling), we investigate the thermodynamic and structural behavior of the protein Crambin within the context of the hydrophobic-polar, hydrophobic-"neutral"-polar (H0P), and semi-flexible H0P model frameworks. We uncover the folding process in all cases; all models undergo, at least, the two major structural transitions observed in nature: the coil-globule collapse and the folding transition. As the complexity of the model increases, these two major transitions begin to split into multi-step processes, wherein the lattice coarse-graining has a significant impact on the details of these processes. The results show that the level of structural coarse-graining is coupled to the level of interaction coarse-graining.
Affiliation(s)
- Alfred C K Farris
  - Department of Physics and Astronomy, Oxford College of Emory University, Oxford, Georgia 30054, USA
- Daniel T Seaton
  - Open Learning, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- David P Landau
  - Center for Simulational Physics, Department of Physics and Astronomy, The University of Georgia, Athens, Georgia 30602, USA
10
Solving the protein folding problem in hydrophobic-polar model using deep reinforcement learning. SN Applied Sciences 2020. [DOI: 10.1007/s42452-020-2012-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/25/2022]
11
Abdelhalim MB, Mabrouk MS, Sayed AY. HPS_PSP: High Performance System for Protein Structure Prediction. J Biol Syst 2019. [DOI: 10.1142/s0218339019500190] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
Prediction of least energy conformation of a protein from its primary structure (chain of amino acids) is an optimization problem associated with a large complex energy landscape. In this study, a simple 2D hydrophobic–hydrophilic model was used to model the protein sequence, which allows the fast and efficient design of genetic algorithm-based protein structure prediction approach. The neighborhood search strategy is integrated into the genetic operator. The neighborhood search guides the genetic operator to regions in the computational space with good solutions. To prevent convergence to local optima, the proposed method employs crowding-based parent replacement strategy, which improves the performance of the algorithm and the ability to deal with multiple numbers of solutions. The proposed algorithm was tested with a standard benchmark of HP sequences and comparative results demonstrate that the proposed system beats most of the evolutionary algorithms for seven sequences. It finds the best energy for a sequence of length [Formula: see text], [Formula: see text], [Formula: see text], [Formula: see text], [Formula: see text], [Formula: see text] and [Formula: see text].
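As a rough illustration of the kind of objective such a 2D HP search minimizes, the sketch below scores a conformation on a square lattice. It assumes the conventional HP energy of -1 for every pair of hydrophobic (H) residues that are lattice neighbours but not consecutive in the chain; the function name `hp_energy` and the coordinate-list representation are illustrative choices, not taken from the cited paper.

```python
def hp_energy(sequence, coords):
    """Energy of a 2D HP conformation on the square lattice.

    Assumption (conventional HP model, not taken from the cited paper):
    each pair of H residues that are lattice neighbours but not adjacent
    along the chain contributes -1.
    """
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")

    # Map each occupied lattice site to the index of the residue sitting on it.
    site_to_index = {site: i for i, site in enumerate(coords)}

    energy = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that every neighbouring pair is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = site_to_index.get((x + dx, y + dy))
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                energy -= 1
    return energy


# A four-residue chain folded into a unit square: residues 0 and 3 touch.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(hp_energy("HPPH", square))
```

A genetic algorithm of the sort described would use a function like this as its fitness measure, with lower energy being better.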
Affiliation(s)
- Mohamed B. Abdelhalim
  - College of Computing and Information Technology (CCIT), Arab Academy for Science Technology and Maritime Transport (AASTMT), Cairo, Egypt
- Mai S. Mabrouk
  - Biomedical Engineering Department, Misr University for Science and Technology, 6 October City, Giza, Egypt
- Ahmed Y. Sayed
  - Physics and Engineering Mathematics Department, Faculty of Engineering at Mataria, Helwan University, Cairo, Egypt
12
Li J, Zhang H, Chen JZY. Structural Prediction and Inverse Design by a Strongly Correlated Neural Network. Phys Rev Lett 2019; 123:108002. [PMID: 31573310 DOI: 10.1103/physrevlett.123.108002] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 02/14/2019] [Revised: 06/01/2019] [Indexed: 06/10/2023]
Abstract
Macromolecules contain molecular units as the coding information for their correlated structures in physical dimensions. The relationship between these two features is governed by the interaction energies of the involved molecular units and their encoded sequences. We present a neural network algorithm that treats molecular units themselves as neural networks, which has the flexibility to allow each unit to respond to its own environment and to influence others in the system. Through a deep neural network and a self-consistent procedure, molecular units in the network establish a strong correlation to produce the desirable features in the physical world. The proposed framework is applied to the HP model. Both the forward problem of predicting folded structures from given sequences and the inverse problem of predicting required sequences for a given structure are examined.
Affiliation(s)
- Jianfeng Li
  - The State Key Laboratory of Molecular Engineering of Polymers, Department of Macromolecular Science, Fudan University, Shanghai 200433, China
- Hongdong Zhang
  - The State Key Laboratory of Molecular Engineering of Polymers, Department of Macromolecular Science, Fudan University, Shanghai 200433, China
- Jeff Z Y Chen
  - Department of Physics and Astronomy, University of Waterloo, Waterloo, Ontario, Canada N2L 3G1
13
Sakae Y, Straub JE, Okamoto Y. Enhanced sampling method in molecular simulations using genetic algorithm for biomolecular systems. J Comput Chem 2018; 40:475-481. [PMID: 30414195 DOI: 10.1002/jcc.25735] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 06/05/2018] [Revised: 09/23/2018] [Accepted: 09/25/2018] [Indexed: 12/13/2022]
Abstract
We propose a molecular simulation method using genetic algorithm (GA) for biomolecular systems to obtain ensemble averages efficiently. In this method, we incorporate the genetic crossover, which is one of the operations of GA, to any simulation method such as conventional molecular dynamics (MD), Monte Carlo, and other simulation methods. The genetic crossover proposes candidate conformations by exchanging parts of conformations of a target molecule between a pair of conformations during the simulation. If the candidate conformations are accepted, the simulation resumes from the accepted ones. While conventional simulations are based on local update of conformations, the genetic crossover introduces global update of conformations. As an example of the present approach, we incorporated genetic crossover to MD simulations. We tested the validity of the method by calculating ensemble averages and the sampling efficiency by using two kinds of peptides, ALA3 and (AAQAA)3. The results show that for ALA3 system, the distribution probabilities of backbone dihedral angles are in good agreement with those of the conventional MD and replica-exchange MD simulations. In the case of (AAQAA)3 system, our method showed lower structural correlation of α-helix structures than the other two methods and more flexibility in the backbone ψ angles than the conventional MD simulation. These results suggest that our method gives more efficient conformational sampling than conventional simulation methods based on local update of conformations. © 2018 Wiley Periodicals, Inc.
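The exchange step described in this abstract can be pictured with a minimal sketch. It assumes each conformation is stored as a list of per-residue backbone dihedral pairs (φ, ψ) in degrees and that a contiguous residue segment is swapped; both the representation and the name `segment_crossover` are hypothetical, and the acceptance test that the authors apply to the candidates is not reproduced here.

```python
def segment_crossover(parent_a, parent_b, start, stop):
    """Swap residues [start, stop) between two conformations.

    Each conformation is a list of (phi, psi) tuples, one per residue.
    This representation is an assumption made for illustration only.
    """
    if len(parent_a) != len(parent_b):
        raise ValueError("conformations must describe the same molecule")
    if not 0 <= start < stop <= len(parent_a):
        raise ValueError("segment must lie inside the chain")

    # Each child keeps its own flanking residues and takes the other
    # parent's dihedral angles inside the exchanged segment.
    child_a = parent_a[:start] + parent_b[start:stop] + parent_a[stop:]
    child_b = parent_b[:start] + parent_a[start:stop] + parent_b[stop:]
    return child_a, child_b


# Two four-residue conformations: one helix-like, one extended.
helix = [(-60.0, -45.0)] * 4
strand = [(-120.0, 130.0)] * 4
candidate_a, candidate_b = segment_crossover(helix, strand, 1, 3)
print(candidate_a)
print(candidate_b)
```

Because a whole segment changes at once, a single step of this kind moves the chain much further in conformational space than a local update does, which is the contrast the abstract draws.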
Affiliation(s)
- Yoshitake Sakae
  - Department of Physics, Graduate School of Science, Nagoya University, Nagoya, Aichi, 464-8602, Japan
- John E Straub
  - Department of Chemistry, Boston University, Boston, Massachusetts, 02215-2521
- Yuko Okamoto
  - Department of Physics, Graduate School of Science, Nagoya University, Nagoya, Aichi, 464-8602, Japan
  - Information Technology Center, Nagoya University, Nagoya, Aichi, 464-8601, Japan
  - Structural Biology Research Center, Graduate School of Science, Nagoya University, Nagoya, Aichi, 464-8602, Japan
  - Center for Computational Science, Graduate School of Engineering, Nagoya University, Nagoya, Aichi, 464-8603, Japan
  - JST-CREST, Nagoya, Aichi, 464-8602, Japan
14
Shi G, Wüst T, Landau DP. Elucidating thermal behavior, native contacts, and folding funnels of simple lattice proteins using replica exchange Wang-Landau sampling. J Chem Phys 2018; 149:164913. [PMID: 30384708 DOI: 10.1063/1.5026256] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
We studied the folding behavior of two coarse-grained lattice models, the HP (hydrophobic-polar) model and the semi-flexible H0P model, whose 124-monomer-long sequences were derived from the protein Ribonuclease A. Taking advantage of advanced parallel computing techniques, we applied replica exchange Wang-Landau sampling and calculated the density of states over the models' entire energy ranges to high accuracy. We then determined both energetic and structural quantities in order to elucidate the folding behavior of each model completely. As a result of sufficiently long sequences and model complexity, yet computational accessibility, we were able to depict distinct free energy folding funnels for both models. In particular, we found that the HP model folds in a single-step process with a very highly degenerate native state and relatively flat low temperature folding funnel minimum. By contrast, the semi-flexible H0P model folds via a multi-step process and the native state is almost four orders of magnitude less degenerate than that for the HP model. In addition, for the H0P model, the bottom of the free energy folding funnel remains rough, even at low temperatures.
Affiliation(s)
- Guangjie Shi
  - Center for Simulational Physics, The University of Georgia, Athens, Georgia 30602-0002, USA
- Thomas Wüst
  - Scientific IT Services, ETH Zurich, 8092 Zurich, Switzerland
- David P Landau
  - Center for Simulational Physics, The University of Georgia, Athens, Georgia 30602-0002, USA
15
Farris ACK, Shi G, Wüst T, Landau DP. The role of chain-stiffness in lattice protein models: A replica-exchange Wang-Landau study. J Chem Phys 2018; 149:125101. [PMID: 30278675 DOI: 10.1063/1.5045482] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 11/14/2022]
Abstract
Using Monte Carlo simulations, we investigate simple, physically motivated extensions to the hydrophobic-polar lattice protein model for the small (46 amino acid) protein Crambin. We use two-dimensional replica-exchange Wang-Landau sampling to study the effects of a bond angle stiffness parameter on the folding and uncover a new step in the collapse process for particular values of this stiffness parameter. A physical interpretation of the folding is developed by analysis of changes in structural quantities, and the free energy landscape is explored. For these special values of stiffness, we find non-degenerate ground states, a property that is consistent with behavior of real proteins, and we use these unique ground states to elucidate the formation of native contacts during the folding process. Through this analysis, we conclude that chain-stiffness is particularly influential in the low energy, low temperature regime of the folding process once the lattice protein has partially collapsed.
Affiliation(s)
- Alfred C K Farris
  - Center for Simulational Physics, Department of Physics and Astronomy, The University of Georgia, Athens, Georgia 30602, USA
- Guangjie Shi
  - Center for Simulational Physics, Department of Physics and Astronomy, The University of Georgia, Athens, Georgia 30602, USA
- Thomas Wüst
  - Scientific IT Services, ETH Zürich, 8092 Zürich, Switzerland
- David P Landau
  - Center for Simulational Physics, Department of Physics and Astronomy, The University of Georgia, Athens, Georgia 30602, USA
16
Wilson MS, Shi G, Wüst T, Li YW, Landau DP. Influence of substrate pattern on the adsorption of HP lattice proteins. Mol Simul 2018. [DOI: 10.1080/08927022.2018.1471691] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/16/2022]
Affiliation(s)
- Matthew S. Wilson
  - Centre for Simulational Physics, The University of Georgia, Athens, GA, USA
- Guangjie Shi
  - Centre for Simulational Physics, The University of Georgia, Athens, GA, USA
- Thomas Wüst
  - Scientific IT Services, ETH Zürich IT Services, Zürich, Switzerland
- Ying Wai Li
  - National Center for Computational Sciences, Oak Ridge National Laboratory, Oak Ridge, TN, USA
- David P. Landau
  - Centre for Simulational Physics, The University of Georgia, Athens, GA, USA
17
Dubey SP, Balaji S, Kini NG, Sathish Kumar M. A Novel Framework for Ab Initio Coarse Protein Structure Prediction. Adv Bioinformatics 2018; 2018:7607384. [PMID: 30026759 PMCID: PMC6031167 DOI: 10.1155/2018/7607384] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 01/18/2018] [Revised: 04/26/2018] [Accepted: 05/27/2018] [Indexed: 02/07/2023]
Abstract
Hydrophobic-Polar model is a simplified representation of Protein Structure Prediction (PSP) problem. However, even with the HP model, the PSP problem remains NP-complete. This work proposes a systematic and problem specific design for operators of the evolutionary program which hybrids with local search hill climbing, to efficiently explore the search space of PSP and thereby obtain an optimum conformation. The proposed algorithm achieves this by incorporating the following novel features: (i) new initialization method which generates only valid individuals with (rather than random) better fitness values; (ii) use of probability-based selection operators that limit the local convergence; (iii) use of secondary structure based mutation operator that makes the structure more closely to the laboratory determined structure; and (iv) incorporating all the above-mentioned features developed a complete two-tier framework. The developed framework builds the protein conformation on the square and triangular lattice. The test has been performed using benchmark sequences, and a comparative evaluation is done with various state-of-the-art algorithms. Moreover, in addition to hypothetical test sequences, we have tested protein sequences deposited in protein database repository. It has been observed that the proposed framework has shown superior performance regarding accuracy (fitness value) and speed (number of generations needed to attain the final conformation). The concepts used to enhance the performance are generic and can be used with any other population-based search algorithm such as genetic algorithm, ant colony optimization, and immune algorithm.
Affiliation(s)
- Sandhya Parasnath Dubey
  - Department of Computer Science & Eng., Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka 576104, India
- S. Balaji
  - Department of Biotechnology, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka 576104, India
- N. Gopalakrishna Kini
  - Department of Computer Science & Eng., Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka 576104, India
- M. Sathish Kumar
  - Department of ECE, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka 576104, India
18
Application of local rules and cellular automata in representing protein translation and enhancing protein folding approximation. Progress in Artificial Intelligence 2018. [DOI: 10.1007/s13748-018-0146-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 10/17/2022]
19
Morshedian A, Razmara J, Lotfi S. A novel approach for protein structure prediction based on an estimation of distribution algorithm. Soft Comput 2018. [DOI: 10.1007/s00500-018-3130-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 12/13/2022]
20
Li B, Fooksa M, Heinze S, Meiler J. Finding the needle in the haystack: towards solving the protein-folding problem computationally. Crit Rev Biochem Mol Biol 2018; 53:1-28. [PMID: 28976219 PMCID: PMC6790072 DOI: 10.1080/10409238.2017.1380596] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 05/16/2017] [Revised: 08/22/2017] [Accepted: 09/13/2017] [Indexed: 12/22/2022]
Abstract
Prediction of protein tertiary structures from amino acid sequence and understanding the mechanisms of how proteins fold, collectively known as "the protein folding problem," has been a grand challenge in molecular biology for over half a century. Theories have been developed that provide us with an unprecedented understanding of protein folding mechanisms. However, computational simulation of protein folding is still difficult, and prediction of protein tertiary structure from amino acid sequence is an unsolved problem. Progress toward a satisfying solution has been slow due to challenges in sampling the vast conformational space and deriving sufficiently accurate energy functions. Nevertheless, several techniques and algorithms have been adopted to overcome these challenges, and the last two decades have seen exciting advances in enhanced sampling algorithms, computational power and tertiary structure prediction methodologies. This review aims at summarizing these computational techniques, specifically conformational sampling algorithms and energy approximations that have been frequently used to study protein-folding mechanisms or to de novo predict protein tertiary structures. We hope that this review can serve as an overview on how the protein-folding problem can be studied computationally and, in cases where experimental approaches are prohibitive, help the researcher choose the most relevant computational approach for the problem at hand. We conclude with a summary of current challenges faced and an outlook on potential future directions.
Affiliation(s)
- Bian Li
  - Department of Chemistry, Vanderbilt University, Nashville, TN, USA
  - Center for Structural Biology, Vanderbilt University, Nashville, TN, USA
- Michaela Fooksa
  - Center for Structural Biology, Vanderbilt University, Nashville, TN, USA
  - Chemical and Physical Biology Graduate Program, Vanderbilt University, Nashville, TN, USA
- Sten Heinze
  - Department of Chemistry, Vanderbilt University, Nashville, TN, USA
  - Center for Structural Biology, Vanderbilt University, Nashville, TN, USA
- Jens Meiler
  - Department of Chemistry, Vanderbilt University, Nashville, TN, USA
  - Center for Structural Biology, Vanderbilt University, Nashville, TN, USA
|
21
|
Elfin: An algorithm for the computational design of custom three-dimensional structures from modular repeat protein building blocks. J Struct Biol 2018; 201:100-107. [DOI: 10.1016/j.jsb.2017.09.001] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 05/28/2017] [Revised: 08/11/2017] [Accepted: 09/02/2017] [Indexed: 11/17/2022]
|
22
|
|
23
|
González-Pérez PP, Orta DJ, Peña I, Flores EC, Ramírez JU, Beltrán HI, Alas SJ. A Computational Approach to Studying Protein Folding Problems Considering the Crucial Role of the Intracellular Environment. J Comput Biol 2017; 24:995-1013. [PMID: 28177752 DOI: 10.1089/cmb.2016.0115] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 12/28/2022] Open
Abstract
Intracellular protein folding (PF) is performed in a highly inhomogeneous, crowded, and correlated environment. Due to this inherent complexity, the study and understanding of PF phenomena is a fundamental issue in the field of computational systems biology. In particular, it is important to use a modeled medium that accurately reflects PF in natural systems. In the current study, we present a simulation wherein PF is carried out within an inhomogeneous modeled medium. Simulation resources included a two-dimensional hydrophobic-polar (HP) model, evolutionary algorithms, and the dual site-bond model. The dual site-bond model was used to develop an environment where HP beads could be folded. Our modeled medium included correlation lengths and fractal-like behavior, which were selected according to HP sequence lengths to induce folding in a crowded environment. Analysis of three benchmark HP sequences showed that the modeled inhomogeneous space played an important role in deeper energy folding and obtained better performance and convergence compared with homogeneous environments. Our computational approach also demonstrated that our correlated network provided a better space for PF. Thus, our approach represents a major advancement in PF simulations, not only for folding but also for understanding functional chemical structure and physicochemical properties of proteins in crowded molecular systems, which normally occur in nature.
Affiliation(s)
- Pedro P González-Pérez
  - Departamento de Matemáticas Aplicadas y Sistemas, Universidad Autónoma Metropolitana Unidad Cuajimalpa, Ciudad de México, México
- Daniel J Orta
  - Departamento de Matemáticas Aplicadas y Sistemas, Universidad Autónoma Metropolitana Unidad Cuajimalpa, Ciudad de México, México
- Irving Peña
  - Departamento de Matemáticas Aplicadas y Sistemas, Universidad Autónoma Metropolitana Unidad Cuajimalpa, Ciudad de México, México
- Eduardo C Flores
  - Departamento de Matemáticas Aplicadas y Sistemas, Universidad Autónoma Metropolitana Unidad Cuajimalpa, Ciudad de México, México
- José U Ramírez
  - Departamento de Matemáticas Aplicadas y Sistemas, Universidad Autónoma Metropolitana Unidad Cuajimalpa, Ciudad de México, México
- Hiram I Beltrán
  - Departamento de Ciencias Naturales, Universidad Autónoma Metropolitana Unidad Cuajimalpa, Ciudad de México, México
- Salomón J Alas
  - Departamento de Ciencias Naturales, Universidad Autónoma Metropolitana Unidad Cuajimalpa, Ciudad de México, México
|
24
|
Shin JM, Lee B, Cho KH. A New Efficient Conformational Search Method for ab initio Protein Folding Study: Window Growth Evolutionary Algorithm. B KOREAN CHEM SOC 2016. [DOI: 10.1002/bkcs.11006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022]
Affiliation(s)
- Jae-Min Shin
  - School of Systems Biomedical Science; Soongsil University; Seoul 156-743 Republic of Korea
- Byungkook Lee
  - Laboratory of Molecular Biology, Division of Basic Sciences; National Cancer Institute, National Institutes of Health; Bethesda MD 20892-4200 USA
- Kwang-Hwi Cho
  - School of Systems Biomedical Science; Soongsil University; Seoul 156-743 Republic of Korea
|
25
|
Kalikka J, Zhou X, Behera J, Nannicini G, Simpson RE. Evolutionary design of interfacial phase change van der Waals heterostructures. NANOSCALE 2016; 8:18212-18220. [PMID: 27759127 DOI: 10.1039/c6nr05539g] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 05/24/2023]
Abstract
We use an evolutionary algorithm to explore the design space of hexagonal Ge2Sb2Te5; a van der Waals layered two dimensional crystal heterostructure. The Ge2Sb2Te5 structure is more complicated than previously thought. Predominant features include layers of Ge3Sb2Te6 and Ge1Sb2Te4 two dimensional crystals that interact through Te-Te van der Waals bonds. Interestingly, (Ge/Sb)-Te-(Ge/Sb)-Te alternation is a common feature for the most stable structures of each generation's evolution. This emergent rule provides an important structural motif that must be included in the design of high performance Sb2Te3-GeTe van der Waals heterostructure superlattices with interfacial atomic switching capability. The structures predicted by the algorithm agree well with experimental measurements on highly oriented, and single crystal Ge2Sb2Te5 samples. By analysing the evolutionary algorithm optimised structures, we show that diffusive atomic switching is probable by Ge atoms undergoing a transition at the van der Waals interface from layers of Ge3Sb2Te6 to Ge1Sb2Te4 thus producing two blocks of Ge2Sb2Te5. Evolutionary methods present an efficient approach to explore the enormous multi-dimensional design parameter space of van der Waals bonded heterostructure superlattices.
Affiliation(s)
- Janne Kalikka
  - Singapore University of Technology and Design, 8 Somapah Road, Singapore.
- Xilin Zhou
  - Singapore University of Technology and Design, 8 Somapah Road, Singapore.
- Jitendra Behera
  - Singapore University of Technology and Design, 8 Somapah Road, Singapore.
- Robert E Simpson
  - Singapore University of Technology and Design, 8 Somapah Road, Singapore.
|
26
|
Walker JA, Bartels DM. A Simple ab Initio Model for the Solvated Electron in Methanol. J Phys Chem A 2016; 120:7240-7. [PMID: 27599299 DOI: 10.1021/acs.jpca.6b07955] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 11/28/2022]
Abstract
The solvation structure of a solvated electron in methanol is investigated with ab initio calculations of small anion methanol clusters in a polarized dielectric continuum. We find that the lowest-energy structure in best agreement with experiment, calculated with CCSD, MP2, and B3LYP methods with aug-cc-pvdz basis set, is a tetrahedral arrangement of four methanol molecules with OH bonds oriented toward the center. The optimum distance from the tetrahedron center to the hydroxyl protons is ∼1.8 Å, significantly smaller than previous estimates. We are able to reproduce experimental radius of gyration Rg (deduced from optical absorption), vertical detachment energy, and resonance Raman frequencies. The electron paramagnetic resonance g-factor shift is qualitatively reproduced using density functional theory.
Affiliation(s)
- J A Walker
  - Radiation Laboratory and Dept. of Chemistry & Biochemistry, Notre Dame University, Notre Dame, Indiana 46556, United States
- D M Bartels
  - Radiation Laboratory and Dept. of Chemistry & Biochemistry, Notre Dame University, Notre Dame, Indiana 46556, United States
|
27
|
Bošković B, Brest J. Genetic algorithm with advanced mechanisms applied to the protein structure prediction in a hydrophobic-polar model and cubic lattice. Appl Soft Comput 2016. [DOI: 10.1016/j.asoc.2016.04.001] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Indexed: 12/25/2022]
|
28
|
Alas SJ, González-Pérez PP. Simulating the folding of HP-sequences with a minimalist model in an inhomogeneous medium. Biosystems 2016; 142-143:52-67. [PMID: 27020756 DOI: 10.1016/j.biosystems.2016.03.010] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 11/24/2015] [Revised: 03/16/2016] [Accepted: 03/24/2016] [Indexed: 11/24/2022]
Abstract
The phenomenon of protein folding is a fundamental issue in the field of computational molecular biology. Protein folding inside cells takes place in a highly inhomogeneous, tortuous, and correlated environment. Therefore, it is important to include in theoretical studies the medium in which protein folding occurs. In this work we present the combination of three models to mimic protein folding inside an inhomogeneous medium. The models used here are the Hydrophobic-Polar (HP) model in a 2D square arrangement, Evolutionary Algorithms (EA), and the Dual Site Bond Model (DSBM). The DSBM is used to simulate the environment in which the HP beads are folded; in this case the medium is correlated and fractal-like. The analysis of five benchmark HP sequences shows that an inhomogeneous space with a given correlation length and fractal dimension plays an important role in the correct folding of these sequences, which does not occur in a homogeneous space.
Affiliation(s)
- S J Alas
  - Departamento de Ciencias Naturales, Universidad Autónoma Metropolitana Unidad Cuajimalpa, Av. Vasco de Quiroga 4871, Distrito Federal 05300, Mexico
- P P González-Pérez
  - Departamento de Matemáticas Aplicadas y Sistemas, Universidad Autónoma Metropolitana Unidad Cuajimalpa, Av. Vasco de Quiroga 4871, Distrito Federal 05300, Mexico.
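Evolutionary algorithms for 2D HP folding need a way to turn a candidate genome into lattice coordinates and to reject chains that cross themselves. The sketch below assumes an absolute-direction encoding (`U`, `D`, `L`, `R`), which is a common choice but is not stated in the abstract above; the name `decode` is hypothetical.

```python
from typing import List, Optional, Tuple

# Assumed encoding: one absolute direction per bond of the chain.
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def decode(moves: str) -> Optional[List[Tuple[int, int]]]:
    """Return lattice coordinates for a chain, or None if it self-intersects."""
    x, y = 0, 0
    coords = [(x, y)]
    seen = {(x, y)}
    for m in moves:
        dx, dy = MOVES[m]
        x, y = x + dx, y + dy
        if (x, y) in seen:
            return None  # invalid individual: two residues on one site
        seen.add((x, y))
        coords.append((x, y))
    return coords
```

A chain of n residues is described by n - 1 moves, so `decode("RUL")` places four residues around a unit square, while `decode("RL")` returns `None` because the third residue would land on the first.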
|
29
|
Biological complexity: ant colony meta-heuristic optimization algorithm for protein folding. Neural Comput Appl 2016. [DOI: 10.1007/s00521-016-2252-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 10/22/2022]
|
30
|
3D protein structure prediction using Imperialist Competitive algorithm and half sphere exposure prediction. J Theor Biol 2016; 391:81-7. [PMID: 26718864 DOI: 10.1016/j.jtbi.2015.12.002] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 06/07/2015] [Revised: 11/22/2015] [Accepted: 12/01/2015] [Indexed: 11/23/2022]
Abstract
Predicting the native structure of proteins based on half-sphere exposure and contact numbers has been studied deeply within recent years. Online predictors of these vectors and secondary structures of amino acids sequences have made it possible to design a function for the folding process. By choosing variant structures and directs for each secondary structure, a random conformation can be generated, and a potential function can then be assigned. Minimizing the potential function utilizing meta-heuristic algorithms is the final step of finding the native structure of a given amino acid sequence. In this work, Imperialist Competitive algorithm was used in order to accelerate the process of minimization. Moreover, we applied an adaptive procedure to apply revolutionary changes. Finally, we considered a more accurate tool for prediction of secondary structure. The results of the computational experiments on standard benchmark show the superiority of the new algorithm over the previous methods with similar potential function.
|
31
|
Rashid MA, Iqbal S, Khatib F, Hoque MT, Sattar A. Guided macro-mutation in a graded energy based genetic algorithm for protein structure prediction. Comput Biol Chem 2016; 61:162-77. [PMID: 26878130 DOI: 10.1016/j.compbiolchem.2016.01.008] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 05/23/2015] [Revised: 11/29/2015] [Accepted: 01/21/2016] [Indexed: 10/22/2022]
Abstract
Protein structure prediction is considered one of the most challenging and computationally intractable combinatorial problems. Thus, the efficient modeling of the convoluted search space, the clever use of energy functions, and more importantly, the use of effective sampling algorithms become crucial to address this problem. For protein structure modeling, an off-lattice model provides limited scope to exercise and evaluate algorithmic developments due to its astronomically large set of data-points. In contrast, an on-lattice model widens the scope and permits studying relatively larger proteins because of its finite set of data-points. In this work, we took full advantage of an on-lattice model by using a face-centered-cubic lattice, which has the highest packing density with the maximum degree of freedom. We proposed a genetic algorithm (GA) for conformational search based on a graded energy that strategically mixes the Miyazawa-Jernigan (MJ) energy with the hydrophobic-polar (HP) energy. In our application, we introduced a 2 × 2 HP energy guided macro-mutation operator within the GA to explore the best possible local changes exhaustively. Conversely, the 20 × 20 MJ energy model, the ultimate objective function of our GA that needs to be minimized, considers the impacts amongst the 20 different amino acids and allows searching for globally acceptable conformations. On a set of benchmark proteins, our proposed approach outperformed state-of-the-art approaches in terms of free energy levels and root-mean-square deviations.
Affiliation(s)
- Mahmood A Rashid
  - SCIMS, University of the South Pacific, Laucala Bay, Suva, Fiji; IIIS, Griffith University, Brisbane, QLD, Australia.
- Firas Khatib
  - CIS, University of Massachusetts Dartmouth, MA, USA.
- Abdul Sattar
  - IIIS, Griffith University, Brisbane, QLD, Australia.
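The high connectivity of the face-centered-cubic lattice mentioned above can be made concrete with a short sketch. It assumes the usual integer representation in which lattice sites are the points with an even coordinate sum, so that nearest neighbours differ by ±1 in exactly two coordinates; the function names are illustrative and not taken from the paper.

```python
from itertools import product
from typing import List, Tuple

Vec = Tuple[int, int, int]


def fcc_neighbour_vectors() -> List[Vec]:
    """Return the 12 nearest-neighbour offsets of the FCC lattice.

    Offsets with entries in {-1, 0, 1} whose absolute values sum to 2
    have exactly two non-zero coordinates.
    """
    return [v for v in product((-1, 0, 1), repeat=3)
            if sum(abs(c) for c in v) == 2]


def are_fcc_neighbours(a: Vec, b: Vec) -> bool:
    """True if sites a and b are nearest neighbours on the FCC lattice."""
    diff = (b[0] - a[0], b[1] - a[1], b[2] - a[2])
    return diff in fcc_neighbour_vectors()
```

Each site therefore has 12 neighbours, compared with 6 on the simple cubic lattice, which is what gives lattice chains more freedom in choosing the next step.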
|
32
|
Guo Y. The Noncompacted Folding of Proteins by Modified Elastic Net Algorithm. J Comput Biol 2015; 22:609-18. [PMID: 26161596 DOI: 10.1089/cmb.2012.0290] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022] Open
Abstract
In this article, a protein sequence of length n is embedded in a two-dimensional lattice with m (>n) points. Thus the obtained minimal energy configurations are expected to have flexible shapes in contrast to the compact rectangular conformations. To fulfill this extension, the elastic net algorithm is modified to deal with the difficulty brought by the unsymmetrical relationship between amino acids and lattice points. New set partition strategy in the embedding phase is introduced, and two local search methods are applied to overcome the multimapping phenomena. Several HP benchmark examples with up to 48 amino acids are tested to verify the effectiveness of the proposed approach.
Affiliation(s)
- Yuzhen Guo
  - Department of Mathematics, College of Science, Nanjing University of Aeronautics and Astronautics, Nanjing, China
|
33
|
A Multi-Objective Approach for Protein Structure Prediction Based on an Energy Model and Backbone Angle Preferences. Int J Mol Sci 2015; 16:15136-49. [PMID: 26151847 PMCID: PMC4519891 DOI: 10.3390/ijms160715136] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 03/28/2015] [Revised: 06/25/2015] [Accepted: 06/25/2015] [Indexed: 11/17/2022] Open
Abstract
Protein structure prediction (PSP) is concerned with the prediction of protein tertiary structure from primary structure and is a challenging calculation problem. After decades of research effort, numerous solutions have been proposed for optimisation methods based on energy models. However, further investigation and improvement is still needed to increase the accuracy and similarity of structures. This study presents a novel backbone angle preference factor, which is one of the factors inducing protein folding. The proposed multiobjective optimisation approach simultaneously considers energy models and backbone angle preferences to solve the ab initio PSP. To prove the effectiveness of the multiobjective optimisation approach based on the energy models and backbone angle preferences, 75 amino acid sequences with lengths ranging from 22 to 88 amino acids were selected from the CB513 data set to be the benchmarks. The data sets were highly dissimilar, therefore indicating that they are meaningful. The experimental results showed that the root-mean-square deviation (RMSD) of the multiobjective optimization approach based on energy model and backbone angle preferences was superior to those of typical energy models, indicating that the proposed approach can facilitate the ab initio PSP.
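The root-mean-square deviation used as the quality measure above can be sketched as follows. This minimal version assumes the two structures are already superimposed and compares matching atoms position by position; it does not perform the optimal rigid-body alignment that structure-comparison tools normally apply first, and the function name `rmsd` is illustrative.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(a: Sequence[Point], b: Sequence[Point]) -> float:
    """RMSD between two equally long, pre-aligned coordinate lists."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("coordinate lists must be non-empty and equally long")
    # Sum of squared distances between corresponding atoms.
    total = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(total / len(a))
```

A lower value means the predicted coordinates lie closer to the reference structure; identical structures give 0.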
|
34
|
Kumar M. An enhanced algorithm for multiple sequence alignment of protein sequences using genetic algorithm. EXCLI JOURNAL 2015; 14:1232-55. [PMID: 27065770 PMCID: PMC4820728 DOI: 10.17179/excli2015-302] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/01/2015] [Accepted: 11/19/2015] [Indexed: 11/10/2022]
Abstract
One of the most fundamental operations in biological sequence analysis is multiple sequence alignment (MSA). The basic aim of the multiple sequence alignment problem is to determine the most biologically plausible alignments of protein or DNA sequences. In this paper, an alignment method using a genetic algorithm for multiple sequence alignment is proposed. Two genetic operators, namely crossover and mutation, were defined and implemented with the proposed method in order to track the population evolution and the quality of the aligned sequences. The proposed method is assessed with a protein benchmark dataset, e.g., BALIBASE, by comparing the obtained results to those obtained with other alignment algorithms, e.g., SAGA, RBT-GA, PRRP, HMMT, SB-PIMA, CLUSTALX, CLUSTAL W, DIALIGN and PILEUP8. Experiments on a wide range of data have shown that the proposed algorithm is much better (in terms of score) than previously proposed algorithms in its ability to achieve high alignment quality.
Affiliation(s)
- Manish Kumar
  - Department of Computer Science and Engineering, Indian School of Mines, Dhanbad, Jharkhand, India
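A genetic algorithm for alignment needs a score to rank candidate alignments. The abstract above does not say which scoring function was used; the sketch below assumes the widely used sum-of-pairs scheme with illustrative match, mismatch, and gap values, all of which are assumptions rather than the paper's settings.

```python
from itertools import combinations
from typing import Sequence


def sum_of_pairs(alignment: Sequence[str], match: int = 1,
                 mismatch: int = -1, gap: int = -2) -> int:
    """Sum-of-pairs score of an alignment whose rows use '-' for gaps."""
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all rows must have the same length")
    score = 0
    for column in zip(*alignment):
        # Score every pair of symbols within the column.
        for a, b in combinations(column, 2):
            if a == "-" and b == "-":
                continue  # gap-gap pairs are ignored
            if a == "-" or b == "-":
                score += gap
            elif a == b:
                score += match
            else:
                score += mismatch
    return score
```

Crossover and mutation would then generate new alignments by moving or inserting gaps, and the score decides which candidates survive to the next generation.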
|
35
|
Three-dimensional protein structure prediction: Methods and computational strategies. Comput Biol Chem 2014; 53PB:251-276. [DOI: 10.1016/j.compbiolchem.2014.10.001] [Citation(s) in RCA: 121] [Impact Index Per Article: 12.1] [Received: 06/12/2014] [Revised: 10/03/2014] [Accepted: 10/07/2014] [Indexed: 01/01/2023]
|
36
|
Santos J, Villot P, Diéguez M. Emergent protein folding modeled with evolved neural cellular automata using the 3D HP model. J Comput Biol 2014; 21:823-45. [PMID: 25343217 DOI: 10.1089/cmb.2014.0077] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Indexed: 11/12/2022] Open
Abstract
We used cellular automata (CA) for the modeling of the temporal folding of proteins. Unlike the focus of the vast research already done on the direct prediction of the final folded conformations, we will model the temporal and dynamic folding process. To reduce the complexity of the interactions and the nature of the amino acid elements, lattice models like HP were used, a model that categorizes the amino acids regarding their hydrophobicity. Taking into account the restrictions of the lattice model, the CA model defines how the amino acids interact through time to obtain a folded conformation. We extended the classical CA models using artificial neural networks for their implementation (neural CA), and we used evolutionary computing to automatically obtain the models by means of Differential Evolution. As the iterative folding also provides the final folded conformation, we can compare the results with those from direct prediction methods of the final protein conformation. Finally, as the neural CA that provides the iterative folding process can be evolved using several protein sequences and used as operators in the folding of another protein with different length, this represents an advantage over the NP-hard complexity of the original problem of the direct prediction.
Affiliation(s)
- José Santos
  - Department of Computer Science, University of A Coruña, A Coruña, Spain
|
37
|
Brown MS, Bennett T, Coker JA. Niche Genetic Algorithms are better than traditional Genetic Algorithms for de novo Protein Folding. F1000Res 2014. [DOI: 10.12688/f1000research.5412.1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/20/2022] Open
Abstract
Here we demonstrate that Niche Genetic Algorithms (NGA) are better at computing de novo protein folding than traditional Genetic Algorithms (GA). Previous research has shown that proteins can fold into their active forms in a limited number of ways; however, predicting how a set of amino acids will fold starting from the primary structure is still a mystery. GAs have a unique ability to solve these types of scientific problems because of their computational efficiency. Unfortunately, GAs are generally quite poor at solving problems with multiple optima. However, there is a special group of GAs called Niche Genetic Algorithms (NGA) that are quite good at solving problems with multiple optima. In this study, we use a specific NGA: the Dynamic-radius Species-conserving Genetic Algorithm (DSGA), and show that DSGA is very adept at predicting the folded state of proteins, and that DSGA is better than a traditional GA in deriving the correct folding pattern of a protein.
|
38
|
Shi G, Vogel T, Wüst T, Li YW, Landau DP. Effect of single-site mutations on hydrophobic-polar lattice proteins. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2014; 90:033307. [PMID: 25314564 DOI: 10.1103/physreve.90.033307] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 04/10/2014] [Indexed: 06/04/2023]
Abstract
We developed a heuristic method for determining the ground-state degeneracy of hydrophobic-polar (HP) lattice proteins, based on Wang-Landau and multicanonical sampling. It is applied during comprehensive studies of single-site mutations in specific HP proteins with different sequences. The effects in which we are interested include structural changes in ground states, changes of ground-state energy, degeneracy, and thermodynamic properties of the system. With respect to mutations, both extremely sensitive and insensitive positions in the HP sequence have been found. That is, ground-state energies and degeneracies, as well as other thermodynamic and structural quantities, may be either largely unaffected or may change significantly due to mutation.
Affiliation(s)
- Guangjie Shi
  - Center for Simulational Physics, The University of Georgia, Athens, Georgia 30602, USA
- Thomas Vogel
  - Theoretical Division (T-1), Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA
- Thomas Wüst
  - Scientific IT Services, ETH Zürich IT Services, 8092 Zürich, Switzerland
- Ying Wai Li
  - National Center for Computational Sciences, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831, USA
- David P Landau
  - Center for Simulational Physics, The University of Georgia, Athens, Georgia 30602, USA
|
39
|
Vogel T, Li YW, Wüst T, Landau DP. Scalable replica-exchange framework for Wang-Landau sampling. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2014; 90:023302. [PMID: 25215846 DOI: 10.1103/physreve.90.023302] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 04/08/2014] [Indexed: 06/03/2023]
Abstract
We investigate a generic, parallel replica-exchange framework for Monte Carlo simulations based on the Wang-Landau method. To demonstrate its advantages and general applicability for massively parallel simulations of complex systems, we apply it to lattice spin models, the self-assembly process in amphiphilic solutions, and the adsorption of molecules on surfaces. While of general current interest, the latter phenomena are challenging to study computationally because of multiple structural transitions occurring over a broad temperature range. We show how the parallel framework facilitates simulations of such processes and, without any loss of accuracy or precision, gives a significant speedup and allows for the study of much larger systems and much wider temperature ranges than possible with single-walker methods.
Affiliation(s)
- Thomas Vogel
  - Center for Simulational Physics, The University of Georgia, Athens, Georgia 30602, USA
- Ying Wai Li
  - National Center for Computational Sciences, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831, USA
- Thomas Wüst
  - Scientific IT Services, ETH Zürich IT Services, 8092 Zürich, Switzerland
- David P Landau
  - Center for Simulational Physics, The University of Georgia, Athens, Georgia 30602, USA
|
40
|
Shaw D, Shohidull Islam ASM, Sohel Rahman M, Hasan M. Protein folding in HP model on hexagonal lattices with diagonals. BMC Bioinformatics 2014; 15 Suppl 2:S7. [PMID: 24564789 PMCID: PMC4016602 DOI: 10.1186/1471-2105-15-s2-s7] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 11/29/2022] Open
Abstract
Three-dimensional structure prediction of a protein from its amino acid sequence, known as protein folding, is one of the most studied computational problems in bioinformatics and computational biology. Since this is a hard problem, a number of simplified models have been proposed in the literature to capture the essential properties of this problem. In this paper we introduce hexagonal lattices with diagonals to handle the protein folding problem under the well-researched HP model. We give two approximation algorithms for protein folding on this lattice. Our first algorithm is a 5/3-approximation algorithm, which is based on the strategy of partitioning the entire protein sequence into two pieces. Our next algorithm is also based on partitioning approaches and improves upon the first algorithm.
|
41
|
Li Y, Wüst T, Landau D. Wang–Landau sampling of the interplay between surface adsorption and folding of HP lattice proteins. MOLECULAR SIMULATION 2014. [DOI: 10.1080/08927022.2013.847273] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/25/2022]
|
42
|
Custódio FL, Barbosa HJ, Dardenne LE. A multiple minima genetic algorithm for protein structure prediction. Appl Soft Comput 2014. [DOI: 10.1016/j.asoc.2013.10.029] [Citation(s) in RCA: 52] [Impact Index Per Article: 5.2] [Indexed: 12/17/2022]
|
43
|
Blavatska V, Janke W. Conformational transitions in random heteropolymer models. J Chem Phys 2014; 140:034904. [DOI: 10.1063/1.4849175] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/14/2022] Open
|
44
|
Maher B, Albrecht AA, Loomes M, Yang XS, Steinhöfel K. A firefly-inspired method for protein structure prediction in lattice models. Biomolecules 2014; 4:56-75. [PMID: 24970205 PMCID: PMC4030990 DOI: 10.3390/biom4010056] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 12/01/2013] [Revised: 12/17/2013] [Accepted: 12/27/2013] [Indexed: 02/05/2023] Open
Abstract
We introduce a Firefly-inspired algorithmic approach for protein structure prediction over two different lattice models in three-dimensional space. In particular, we consider three-dimensional cubic and three-dimensional face-centred-cubic (FCC) lattices. The underlying energy models are the Hydrophobic-Polar (H-P) model, the Miyazawa–Jernigan (M-J) model and a related matrix model. The implementation of our approach is tested on ten H-P benchmark problems of a length of 48 and ten M-J benchmark problems of a length ranging from 48 until 61. The key complexity parameter we investigate is the total number of objective function evaluations required to achieve the optimum energy values for the H-P model or competitive results in comparison to published values for the M-J model. For H-P instances and cubic lattices, where data for comparison are available, we obtain an average speed-up over eight instances of 2.1, leaving out two extreme values (otherwise, 8.8). For six M-J instances, data for comparison are available for cubic lattices and runs with a population size of 100, where, a priori, the minimum free energy is a termination criterion. The average speed-up over four instances is 1.2 (leaving out two extreme values, otherwise 1.1), which is achieved for a population size of only eight instances. The present study is a test case with initial results for ad hoc parameter settings, with the aim of justifying future research on larger instances within lattice model settings, eventually leading to the ultimate goal of implementations for off-lattice models.
Affiliation(s)
- Brian Maher
- Department of Informatics, King's College London, Strand, London WC2R 2LS, UK.
| | - Andreas A Albrecht
- School of Science and Technology, Middlesex University, The Burroughs, London, NW4 4BT, UK.
| | - Martin Loomes
- School of Science and Technology, Middlesex University, The Burroughs, London, NW4 4BT, UK.
| | - Xin-She Yang
- School of Science and Technology, Middlesex University, The Burroughs, London, NW4 4BT, UK.
| | - Kathleen Steinhöfel
- Department of Informatics, King's College London, Strand, London WC2R 2LS, UK.
|
45
|
Md Sohidull Islam AS, Rahman MS. On the protein folding problem in 2D-triangular lattices. Algorithms Mol Biol 2013; 8:30. [PMID: 24279437 PMCID: PMC4175104 DOI: 10.1186/1748-7188-8-30] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 03/06/2013] [Accepted: 11/20/2013] [Indexed: 12/02/2022] Open
Abstract
In this paper, we present a novel approximation algorithm to solve the protein folding problem in the HP model. Our algorithm is polynomial in terms of the length of the given HP string. The expected approximation ratio of our algorithm is $1 - \frac{2\log n}{n-1}$ for $n \ge 6$, where $n^2$ is the total number of H's in a given HP string. The expected approximation ratio tends to 1 for large values of $n$. Hence our algorithm is expected to perform very well for larger HP strings.
|
46
|
Tsay JJ, Su SC. An effective evolutionary algorithm for protein folding on 3D FCC HP model by lattice rotation and generalized move sets. Proteome Sci 2013; 11:S19. [PMID: 24565217 PMCID: PMC3908773 DOI: 10.1186/1477-5956-11-s1-s19] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 11/30/2022] Open
Abstract
Background: Proteins are essential biological molecules which play vital roles in nearly all biological processes. It is the tertiary structure of a protein that determines its functions. Therefore the prediction of a protein's tertiary structure based on its primary amino acid sequence has long been the most important and challenging subject in biochemistry, molecular biology and biophysics. In the past, the HP lattice model was one of the ab initio methods that many researchers used to forecast the protein structure. Although these kinds of simplified methods could not achieve high resolution, they provided a macrocosm-optimized protein structure. The model has been employed to investigate general principles of protein folding, and plays an important role in the prediction of protein structures.
Methods: In this paper, we present an improved evolutionary algorithm for the protein folding problem. We study the problem on the 3D FCC lattice HP model which has been widely used in previous research. Our focus is to develop evolutionary algorithms (EA) which are robust, easy to implement and can handle various energy functions. We propose to combine three different local search methods, including lattice rotation for crossover, K-site move for mutation, and generalized pull move; these form our key components to improve previous EA-based approaches.
Results: We have carried out experiments over several data sets which were used in previous research. The results of the experiments show that our approach is able to find optimal conformations which were not found by previous EA-based approaches.
Conclusions: We have investigated the geometric properties of the 3D FCC lattice and developed several local search techniques to improve traditional EA-based approaches to the protein folding problem. It is known that EA-based approaches are robust and can handle arbitrary energy functions. Our results further show that by extensive development of local searches, EA can also be very effective for finding optimal conformations on the 3D FCC HP model. Furthermore, the local searches developed in this paper can be integrated with other approaches such as the Monte Carlo and Tabu searches to improve their performance.
Affiliation(s)
- Jyh-Jong Tsay
- Department of Computer Science and Information Engineering, National Chung Cheng University, 168 University Road, Minhsiung Township, Chiayi County 62102, Taiwan
| | - Shih-Chieh Su
- Department of Computer Science and Information Engineering, National Chung Cheng University, 168 University Road, Minhsiung Township, Chiayi County 62102, Taiwan
|
47
|
Pattanasiri B, Li YW, Landau DP, Wüst T, Triampo W. Thermodynamics and structural properties of a confined HP protein determined by Wang-Landau simulation. J Phys Conf Ser 2013; 454:012071. [DOI: 10.1088/1742-6596/454/1/012071] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 01/05/2023]
|
48
|
Wright T, Ward J. The evolution of a visual-to-auditory sensory substitution device using interactive genetic algorithms. Q J Exp Psychol (Hove) 2013; 66:1620-38. [DOI: 10.1080/17470218.2012.754911] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 10/27/2022]
Abstract
Sensory substitution is a promising technique for mitigating the loss of a sensory modality. Sensory substitution devices (SSDs) work by converting information from the impaired sense (e.g., vision) into another, intact sense (e.g., audition). However, there are a potentially infinite number of ways of converting images into sounds, and it is important that the conversion takes into account the limits of human perception and other user-related factors (e.g., whether the sounds are pleasant to listen to). The device explored here is termed “polyglot” because it generates a very large set of solutions. Specifically, we adapt a procedure that has been in widespread use in the design of technology but has rarely been used as a tool to explore perception—namely, interactive genetic algorithms. In this procedure, a very large range of potential sensory substitution devices can be explored by creating a set of “genes” with different allelic variants (e.g., different ways of translating luminance into loudness). The most successful devices are then “bred” together, and we statistically explore the characteristics of the selected-for traits after multiple generations. The aim of the present study is to produce design guidelines for a better SSD. In three experiments, we vary the way that the fitness of the device is computed: by asking the user to rate the auditory aesthetics of different devices (Experiment 1), and by measuring the ability of participants to match sounds to images (Experiment 2) and the ability to perceptually discriminate between two sounds derived from similar images (Experiment 3). In each case, the traits selected for by the genetic algorithm represent the ideal SSD for that task. Taken together, these traits can guide the design of a better SSD.
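The breeding procedure described above can be sketched roughly as follows. This is a minimal illustration, not the authors' software: the gene names, their allelic variants, and the `simulated_rating` function (a stand-in for the human ratings or task scores that make the algorithm "interactive") are all hypothetical and chosen only to make the example runnable.

```python
import random

# Hypothetical genes and allelic variants, for illustration only.
GENES = {
    "luminance_to": ["loudness", "pitch"],
    "x_axis_to": ["time", "stereo_pan"],
    "frequency_scale": ["linear", "logarithmic"],
}


def random_device(rng):
    """Create a device by picking one allele at random for each gene."""
    return {gene: rng.choice(alleles) for gene, alleles in GENES.items()}


def breed(parent_a, parent_b, rng, mutation_rate=0.05):
    """Uniform crossover of two parents, with occasional random mutation."""
    child = {}
    for gene, alleles in GENES.items():
        child[gene] = rng.choice([parent_a[gene], parent_b[gene]])
        if rng.random() < mutation_rate:
            child[gene] = rng.choice(alleles)
    return child


def evolve(fitness, population_size=20, generations=10, seed=0):
    """Keep the better half each generation and breed it to refill the population."""
    rng = random.Random(seed)
    population = [random_device(rng) for _ in range(population_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: population_size // 2]
        children = []
        while len(parents) + len(children) < population_size:
            parent_a, parent_b = rng.sample(parents, 2)
            children.append(breed(parent_a, parent_b, rng))
        population = parents + children
    return sorted(population, key=fitness, reverse=True)


def simulated_rating(device):
    """Stand-in for a participant's score; a real study would collect this from users."""
    preferred = {
        "luminance_to": "loudness",
        "x_axis_to": "time",
        "frequency_scale": "logarithmic",
    }
    return sum(device[gene] == allele for gene, allele in preferred.items())


final_population = evolve(simulated_rating)
print(final_population[0])
```

In an interactive setting, `fitness` would be replaced by whatever measure each experiment uses, and the allele frequencies in the final population would then be analysed statistically.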
Affiliation(s)
- Thomas Wright
- School of Psychology & Sackler Centre for Consciousness Science, University of Sussex, Falmer, Brighton, UK
| | - Jamie Ward
- School of Psychology & Sackler Centre for Consciousness Science, University of Sussex, Falmer, Brighton, UK
|
49
|
Liang C, Jansen TLC. Simulation of Two-Dimensional Sum-Frequency Generation Response Functions: Application to Amide I in Proteins. J Phys Chem B 2013; 117:6937-45. [DOI: 10.1021/jp403111j] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Indexed: 02/07/2023]
Affiliation(s)
- Chungwen Liang
- Biozentrum, University of Basel, Klingelbergstrasse 50/70, CH-4056 Basel, Switzerland
| | - Thomas L. C. Jansen
- Zernike Institute for Advanced Materials, University of Groningen, Nijenborgh 4, 9747 AG Groningen, The Netherlands
|
50
|
Brasil CRS, Delbem ACB, da Silva FLB. Multiobjective evolutionary algorithm with many tables for purely ab initio protein structure prediction. J Comput Chem 2013; 34:1719-34. [DOI: 10.1002/jcc.23315] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Received: 11/14/2012] [Revised: 02/26/2013] [Accepted: 04/07/2013] [Indexed: 11/10/2022]
|