1. Mufassirin MMM, Newton MAH, Sattar A. Artificial intelligence for template-free protein structure prediction: a comprehensive review. Artif Intell Rev 2022. [DOI: 10.1007/s10462-022-10350-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2022]
2. Beltrán HI, Alas-Guardado SJ, González-Pérez PP. Improving coarse-grained models of protein folding through weighting of polar-polar/hydrophobic-hydrophobic interactions into crowded spaces. J Mol Model 2022; 28:87. [PMID: 35262807 DOI: 10.1007/s00894-022-05071-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2021] [Accepted: 02/26/2022] [Indexed: 10/18/2022]
Abstract
Seven hydrophobic-polar sequences were tested in two types of 2D square lattice, homogeneous and correlated, the latter simulating molecular crowding as a geometric boundary restriction. The 2D structures were optimized with a variant of Dill's model, inspired by a convex function, that takes into account both hydrophobic interactions (as in Dill's model) and polar interactions, thereby including more structural information to reach better folding solutions. In correlated networks the degrees of freedom available to the folding sequences were limited; as a result, in all cases more successful structural trials were found than in a homogeneous lattice. Most of the sequences were designed by our workgroup: two had been folded with other approaches, another is a modified version of a previous sequence, and initial forms of two others had been used before, but without taking polar-polar contributions into account. Three are newly proposed and are intended to test the joint hydrophobic-hydrophobic and polar-polar contributions in crowded spaces. One sequence turned out to be the most difficult of the seven to fold, perhaps because of (i) its intrinsic degrees of freedom and (ii) the motifs of the expected 2D HP structure. For two sequences, optimal folding was not achieved with either approach; nevertheless, folding in the correlated network not only produced better results than in homogeneous space, but the best values found with crowding were also very close to the expected optimal fitness. In general, five sequences folded better with medium lattice units in correlated media, whereas the other two folded better with somewhat larger lattice units, which suggests that, depending on its degrees of freedom and particular folding motifs, each sequence requires tuned crowding to fold better.
The main goal was therefore to obtain a modified 2D HP lattice model that mimics the folding of proteins or secondary structures such as β-sheets, taking into account both hydrophobic-hydrophobic and polar-polar interactions, and to fold the sequences in a crowded environment. This simple but sufficient construction is intended to determine the information needed to fold sequences within a minimal but complete heuristic model. Finally, we claim that all sequences folded in crowded spaces achieve better results than those folded in homogeneous ones.
Affiliation(s)
- Hiram Isaac Beltrán
  - Departamento de Ciencias Básicas, Universidad Autónoma Metropolitana, Unidad Azcapotzalco, CDMX 02200, Mexico, Mexico.
- Salomón J Alas-Guardado
  - Departamento de Ciencias Naturales, Universidad Autónoma Metropolitana Unidad Cuajimalpa, CDMX 05300, Mexico, Mexico.
- Pedro Pablo González-Pérez
  - Departamento de Matemáticas Aplicadas y Sistemas, Universidad Autónoma Metropolitana, Unidad Cuajimalpa, CDMX 05300, Mexico, Mexico.
3. Alas-Guardado SJ, González-Pérez PP, Beltrán HI. Contributions of topological polar-polar contacts to achieve better folding stability of 2D/3D HP lattice proteins: An in silico approach. AIMS Biophys 2021. [DOI: 10.3934/biophy.2021023] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/14/2022] [Open access]
Abstract
Many of the simplistic hydrophobic-polar lattice models, such as Dill's model (called Model 1 herein), aim to fold structures through hydrophobic-hydrophobic interactions, mimicking the well-known hydrophobic collapse present in protein structures. In this work, we studied 11 designed hydrophobic-polar sequences: S1-S8, folded in a 2D square lattice, and S9-S11, folded in a 3D cubic lattice. To better fold these structures we have developed Model 2, an approximation to a convex function that weights polar-polar as well as hydrophobic-hydrophobic contacts, as an augmented version of Model 1. In this partitioned approach the hydrophobic-hydrophobic ponderation was tuned as α-1 and the polar-polar ponderation as α. The model is centered on preserving the required hydrophobic substructure while including polar-polar interactions, otherwise absent, to reach a better folding score that now also captures the polar-polar substructure. In all tested cases the folding trials were better achieved with Model 2, using α values of 0.05, 0.1, 0.2 and 0.3 depending on sequence size, even finding optimal scores not reached with Model 1. An important result is that the better the folding score, the lower the α weighting required. When α values above 0.3 are employed, whatever the nature of the hydrophobic-polar sequence, hydrophobic-hydrophobic contacts begin to be suppressed, yielding misfolded sequences. Therefore, the value of α that correctly folds structures is the result of a careful weighting between hydrophobic-hydrophobic and polar-polar contacts.
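As a rough illustration of the weighting described in this abstract, the sketch below scores a 2D square-lattice conformation by counting non-bonded nearest-neighbour H-H and P-P contacts and combining them as (1 − α)·HH + α·PP. The convex-combination form, the "higher is better" sign convention, and the names `count_contacts` and `weighted_score` are assumptions made for illustration; the abstract writes the hydrophobic weight as α-1 and does not give the full scoring function.

```python
from typing import Dict, List, Tuple

Coord = Tuple[int, int]


def count_contacts(seq: str, coords: List[Coord]) -> Tuple[int, int]:
    """Count non-bonded nearest-neighbour H-H and P-P contacts on a square lattice."""
    if len(seq) != len(coords) or len(set(coords)) != len(coords):
        raise ValueError("conformation must be self-avoiding and match the sequence length")
    index_at: Dict[Coord, int] = {c: i for i, c in enumerate(coords)}
    hh = pp = 0
    for i, (x, y) in enumerate(coords):
        # Look only to the right and upward so each lattice pair is counted once.
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = index_at.get(neighbour)
            if j is None or abs(i - j) == 1:
                continue  # empty site, or a neighbour bonded along the chain
            if seq[i] == "H" and seq[j] == "H":
                hh += 1
            elif seq[i] == "P" and seq[j] == "P":
                pp += 1
    return hh, pp


def weighted_score(seq: str, coords: List[Coord], alpha: float = 0.1) -> float:
    """Assumed convex weighting: (1 - alpha) * HH contacts + alpha * PP contacts."""
    hh, pp = count_contacts(seq, coords)
    return (1.0 - alpha) * hh + alpha * pp


# Four residues folded around a unit square: the two chain ends touch.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
score = weighted_score("HPPH", square, alpha=0.1)
```

With α = 0 the score reduces to a plain H-H contact count, which is how the sketch relates to Model 1 as the abstract describes it.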
4. Li Y, Zhang Y, Lv J. An Effective Cumulative Torsion Angles Model for Prediction of Protein Folding Rates. Protein Pept Lett 2020; 27:321-328. [PMID: 31612815 DOI: 10.2174/0929866526666191014152207] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 04/24/2019] [Revised: 06/07/2019] [Accepted: 06/29/2019] [Indexed: 02/05/2023]
Abstract
BACKGROUND Protein folding rate is mainly determined by the size of the conformational space to search, which in turn is dictated by factors such as the size, structure and amino-acid sequence of a protein. It is important to integrate these factors effectively to form a more precise description of the conformational space, but there is no general paradigm for doing so beyond some intuitions and empirical rules. Therefore, at the present stage, predictions of the folding rate can be improved by finding new factors, and some insights into this question are given. OBJECTIVE To propose a new parameter that describes the size of the conformational space and thereby improves the prediction accuracy of protein folding rates. METHODS Based on the optimal set of amino acids in a protein, an effective cumulative backbone torsion angle (CBTAeff) was proposed to describe the size of the conformational space. A linear regression model was used to predict protein folding rate with CBTAeff as a parameter. The degree of correlation was described by the coefficient of determination and the mean absolute error (MAE) between the predicted folding rates and experimental observations. RESULTS The model achieved a high correlation (coefficient of determination of 0.70 and MAE of 1.88) between the logarithm of the experimental folding rates and (CBTAeff)^0.5 over 112 two- and multi-state folding proteins. CONCLUSION The remarkable performance of our simplistic model demonstrates that CBTA based on the optimal set is the major determinant of the conformational space of natural proteins.
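The evaluation described in METHODS and RESULTS can be sketched as an ordinary least-squares fit of the logarithm of the folding rate against (CBTAeff)^0.5, followed by the coefficient of determination and the MAE. The helper names and the four input values below are hypothetical: the numbers are synthetic placeholders chosen to lie exactly on a line, not data from the paper, and how CBTAeff itself is computed is not covered here.

```python
import math
from typing import List, Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r2_and_mae(y: Sequence[float], y_pred: Sequence[float]) -> Tuple[float, float]:
    """Coefficient of determination and mean absolute error of the predictions."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, y_pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    mae = sum(abs(yi - pi) for yi, pi in zip(y, y_pred)) / len(y)
    return 1.0 - ss_res / ss_tot, mae


# Synthetic placeholder values, NOT measurements from the paper.
cbta_eff: List[float] = [4.0, 9.0, 16.0, 25.0]
log_rate: List[float] = [6.0, 4.0, 2.0, 0.0]

x = [math.sqrt(v) for v in cbta_eff]  # the predictor is (CBTAeff)^0.5
slope, intercept = fit_line(x, log_rate)
pred = [slope * xi + intercept for xi in x]
r2, mae = r2_and_mae(log_rate, pred)
```

Because the placeholder points are perfectly collinear, the fit here is exact; real folding-rate data would give values comparable to the 0.70 and 1.88 reported in the abstract only if the same dataset and CBTAeff definition were used.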
Affiliation(s)
- Yanru Li
  - Department of Physics, College of Science, Inner Mongolia University of Technology, Hohhot, China
- Ying Zhang
  - Department of Physics, College of Science, Inner Mongolia University of Technology, Hohhot, China
- Jun Lv
  - Department of Physics, College of Science, Inner Mongolia University of Technology, Hohhot, China
5. Alas SDJ, González-Pérez PP, Beltrán HI. In silico minimalist approach to study 2D HP protein folding into an inhomogeneous space mimicking osmolyte effect: First trial in the search of foldameric backbones. Biosystems 2019; 181:31-43. [PMID: 31029589 DOI: 10.1016/j.biosystems.2019.04.005] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 09/27/2018] [Revised: 04/01/2019] [Accepted: 04/08/2019] [Indexed: 12/22/2022]
Abstract
We have employed our bioinformatics workbench, named Evolution, a Multi-Agent System based architecture with lattice-bead models, evolutionary algorithms, and correlated networks as inhomogeneous spaces with different correlation lengths, mimicking the osmolyte effect (molecular crowding), to survey protein folding in silico. Folding is resolved at the level of hydrophobic-polar (H-P) sequences in inhomogeneous 2D square lattices, since general biophysicochemical trends indicate that i) the backbone is one of the major components responsible for protein folding and ii) the osmolyte effect plays an important role in improving folding kinetics and reaching deeper optima. We have designed foldamers as square n × n (n = 3, 4, 5, 6) arrays of hydrophobic cores stabilized by H⋯H contacts, attached through short PP (P2) or long PPPP (P4) loops, giving rise to 8 sequences (S1 to S8) with known optimal scores. The designed sequences were folded in different inhomogeneous spaces, and crowded media did induce deeper optima: crowding was necessary for the best folding, but the space must be constrained just enough to induce folding without preventing chain movement. The constrained space plays an important role in reaching the optimal structure, with an optimal correlation length that depends on the size of the designed foldamer sequence, implying that the medium affects the folding pathways, as happens in real systems. The designed structures were found, and they also gave rise to degenerate states; both kinds of folded state could be surveyed by considering i) backbone information and ii) the osmolyte effect. In nature, proteins fold into different structures aiming to reach a global minimum, but a local minimum could be enough for the protein to be functional or dysfunctional.
Affiliation(s)
- Salomón de Jesús Alas
  - Departamento de Ciencias Naturales, Universidad Autónoma Metropolitana Unidad Cuajimalpa, Ciudad de México, 05300, Mexico.
- Pedro Pablo González-Pérez
  - Departamento de Matemáticas Aplicadas y Sistemas, Universidad Autónoma Metropolitana Unidad Cuajimalpa, 05300, Ciudad de México, Mexico.
- Hiram Isaac Beltrán
  - Departamento de Ciencias Básicas, Universidad Autónoma Metropolitana Unidad Azcapotzalco, Ciudad de México, 02200, Mexico.
6. Bao W, Wang D, Chen Y. Classification of Protein Structure Classes on Flexible Neutral Tree. IEEE/ACM Trans Comput Biol Bioinform 2017; 14:1122-1133. [PMID: 28113983 DOI: 10.1109/tcbb.2016.2610967] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Indexed: 06/06/2023]
Abstract
Accurate classification of protein structures plays an important role in bioinformatics, and a growing body of evidence shows that a variety of classification methods have been employed in this field. In this research, features based on amino acid composition, secondary structure, and the correlation coefficients of amino acid dimers and triplets have been used. The flexible neutral tree (FNT), a neural network with a particular tree structure, has been employed as the classification model in the protein structure classification framework. Because different feature groups play different roles in the model, impact factors for the groups are introduced. To evaluate these impact factors, an Impact Factors Scaling (IFS) algorithm, which aims to reduce redundant information in the selected features to some degree, is put forward. To examine the performance of the framework, the 640, 1189, and ASTRAL datasets are employed as low-homology protein structure benchmark datasets. Experimental results demonstrate that the proposed method outperforms the other methods on low-homology protein tertiary structures.
7. Ullah A, Ahmed N, Pappu SD, Shatabda S, Ullah AZMD, Rahman MS. Efficient conformational space exploration in ab initio protein folding simulation. R Soc Open Sci 2015; 2:150238. [PMID: 26361554 PMCID: PMC4555859 DOI: 10.1098/rsos.150238] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 06/02/2015] [Accepted: 07/27/2015] [Indexed: 06/05/2023]
Abstract
Ab initio protein folding simulation largely depends on knowledge-based energy functions that are derived from known protein structures using statistical methods. These knowledge-based energy functions provide us with a good approximation of real protein energetics. However, these energy functions are not very informative for search algorithms and fail to distinguish the types of amino acid interactions that contribute largely to the energy function from those that do not. As a result, search algorithms frequently get trapped in local minima. On the other hand, the hydrophobic-polar (HP) model considers hydrophobic interactions only. The simplified nature of the HP energy function limits it to a low-resolution model. In this paper, we present a strategy to derive a non-uniformly scaled version of the real 20×20 pairwise energy function. The non-uniform scaling helps tackle the difficulty faced by a real energy function, whereas the integration of 20×20 pairwise information overcomes the limitations faced by the HP energy function. Here, we have applied the derived energy function with a genetic algorithm on discrete lattices. On a standard set of benchmark protein sequences, our approach significantly outperforms the state-of-the-art methods for similar models. Our approach has been able to explore regions of the conformational space that all the previous methods have failed to explore. The effectiveness of the derived energy function is presented by showing qualitative differences and similarities of the sampled structures to the native structures. The number of objective function evaluations in a single run of the algorithm is used as a comparison metric to demonstrate efficiency.
Affiliation(s)
- Ahammed Ullah
  - AℓEDA Group, Department of CSE, BUET, ECE Building, Dhaka 1205, Bangladesh
  - Department of CSE, Independent University, Bangladesh, Dhaka 1229, Bangladesh
- Nasif Ahmed
  - AℓEDA Group, Department of CSE, BUET, ECE Building, Dhaka 1205, Bangladesh
- Subrata Dey Pappu
  - AℓEDA Group, Department of CSE, BUET, ECE Building, Dhaka 1205, Bangladesh
- Swakkhar Shatabda
  - AℓEDA Group, Department of CSE, BUET, ECE Building, Dhaka 1205, Bangladesh
  - Department of CSE, United International University, Dhanmondi, Dhaka 1209, Bangladesh
- M. Sohel Rahman
  - AℓEDA Group, Department of CSE, BUET, ECE Building, Dhaka 1205, Bangladesh
8. A Parallel Framework for Multipoint Spiral Search in ab Initio Protein Structure Prediction. Adv Bioinformatics 2014; 2014:985968. [PMID: 24744779 PMCID: PMC3976798 DOI: 10.1155/2014/985968] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 10/31/2013] [Revised: 02/04/2014] [Accepted: 02/06/2014] [Indexed: 11/17/2022] [Open access]
Abstract
Protein structure prediction is computationally a very challenging problem. A large number of existing search algorithms attempt to solve the problem by exploring possible structures and finding the one with the minimum free energy. However, these algorithms perform poorly on large sized proteins due to an astronomically wide search space. In this paper, we present a multipoint spiral search framework that uses parallel processing techniques to expedite exploration by starting from different points. In our approach, a set of random initial solutions are generated and distributed to different threads. We allow each thread to run for a predefined period of time. The improved solutions are stored threadwise. When the threads finish, the solutions are merged together and the duplicates are removed. A selected distinct set of solutions are then split to different threads again. In our ab initio protein structure prediction method, we use the three-dimensional face-centred-cubic lattice for structure-backbone mapping. We use both the low resolution hydrophobic-polar energy model and the high-resolution 20 × 20 energy model for search guiding. The experimental results show that our new parallel framework significantly improves the results obtained by the state-of-the-art single-point search approaches for both energy models on three-dimensional face-centred-cubic lattice. We also experimentally show the effectiveness of mixing energy models within parallel threads.
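The generate, distribute, merge, deduplicate and redistribute loop described in this abstract can be sketched roughly as follows. This is not the authors' implementation: a greedy bit-flip search stands in for spiral search, a toy energy (the number of ones in a bit tuple) stands in for the HP and 20 × 20 energy models, and a fixed step budget replaces the predefined run time so that the example is reproducible. All names and parameters are hypothetical.

```python
import random
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple

Solution = Tuple[int, ...]
Energy = Callable[[Solution], float]


def local_search(start: Solution, energy: Energy, steps: int, seed: int) -> Solution:
    """Stand-in for the per-thread search: accept single-bit flips that lower the energy."""
    rng = random.Random(seed)
    best = start
    for _ in range(steps):
        i = rng.randrange(len(best))
        candidate = best[:i] + (1 - best[i],) + best[i + 1:]
        if energy(candidate) < energy(best):
            best = candidate
    return best


def random_solution(rng: random.Random, length: int) -> Solution:
    return tuple(rng.randint(0, 1) for _ in range(length))


def multipoint_search(energy: Energy, length: int, n_threads: int = 4,
                      rounds: int = 3, steps: int = 200, seed: int = 0) -> Solution:
    rng = random.Random(seed)
    # Random initial solutions, one per thread.
    pool: List[Solution] = [random_solution(rng, length) for _ in range(n_threads)]
    with ThreadPoolExecutor(max_workers=n_threads) as executor:
        for _ in range(rounds):
            seeds = [rng.randrange(2 ** 32) for _ in pool]
            # Each thread improves its own starting point independently.
            improved = list(executor.map(
                lambda job: local_search(job[0], energy, steps, job[1]),
                zip(pool, seeds)))
            # Merge the per-thread results, drop duplicates, keep the best ones.
            merged = sorted(set(improved), key=energy)[:n_threads]
            # Top up with fresh random starts if merging left too few solutions.
            while len(merged) < n_threads:
                merged.append(random_solution(rng, length))
            pool = merged
    return min(pool, key=energy)


def toy_energy(solution: Solution) -> float:
    """Toy objective: the number of ones; the minimum is the all-zero tuple."""
    return float(sum(solution))


best = multipoint_search(toy_energy, length=12)
```

In CPython, threads mainly illustrate the control flow here, since the global interpreter lock prevents CPU-bound Python code from running in parallel; a process pool would be the usual substitute when real speed-up is needed.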