1
Wang X, Xu K, Tan Y, Liu S, Zhou J. Possibilities of Using De Novo Design for Generating Diverse Functional Food Enzymes. Int J Mol Sci 2023; 24:3827. [PMID: 36835238 PMCID: PMC9964944 DOI: 10.3390/ijms24043827] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/11/2023] [Revised: 02/03/2023] [Accepted: 02/03/2023] [Indexed: 02/17/2023] Open
Abstract
Food enzymes play an important role in improving certain food characteristics, such as improving texture, eliminating toxins and allergens, producing carbohydrates, and enhancing flavor and appearance. Recently, along with the development of artificial meats, food enzymes have been employed to achieve more diverse functions, especially in converting non-edible biomass to delicious foods. Reported food enzyme modifications for specific applications have highlighted the significance of enzyme engineering. However, directed evolution and rational design have shown inherent limitations due to mutation rates, which make it difficult to satisfy the stability or specific activity needs of certain applications. Generating functional enzymes using de novo design, which highly assembles naturally existing enzymes, provides potential solutions for screening desired enzymes. Here, we describe the functions and applications of food enzymes to introduce the need for food enzyme engineering. To illustrate the possibilities of using de novo design for generating diverse functional proteins, we review protein modelling and de novo design methods and their implementations. Future directions, namely adding structural data for de novo design model training, acquiring diversified training data, and investigating the relationship between enzyme-substrate binding and activity, are highlighted as challenges to overcome for the de novo design of food enzymes.
Affiliation(s)
- Xinglong Wang
  - Engineering Research Center of Ministry of Education on Food Synthetic Biotechnology, School of Biotechnology, Jiangnan University, Wuxi 214122, China
  - Science Center for Future Foods, Jiangnan University, Wuxi 214122, China
- Kangjie Xu
  - Engineering Research Center of Ministry of Education on Food Synthetic Biotechnology, School of Biotechnology, Jiangnan University, Wuxi 214122, China
  - Science Center for Future Foods, Jiangnan University, Wuxi 214122, China
- Yameng Tan
  - Engineering Research Center of Ministry of Education on Food Synthetic Biotechnology, School of Biotechnology, Jiangnan University, Wuxi 214122, China
  - Science Center for Future Foods, Jiangnan University, Wuxi 214122, China
- Song Liu
  - Engineering Research Center of Ministry of Education on Food Synthetic Biotechnology, School of Biotechnology, Jiangnan University, Wuxi 214122, China
  - Science Center for Future Foods, Jiangnan University, Wuxi 214122, China
- Jingwen Zhou
  - Engineering Research Center of Ministry of Education on Food Synthetic Biotechnology, School of Biotechnology, Jiangnan University, Wuxi 214122, China
  - Science Center for Future Foods, Jiangnan University, Wuxi 214122, China
  - Jiangsu Province Engineering Research Center of Food Synthetic Biotechnology, Jiangnan University, Wuxi 214122, China
2
Shannon RJ, Deeks HM, Burfoot E, Clark E, Jones AJ, Mulholland AJ, Glowacki DR. Exploring human-guided strategies for reaction network exploration: Interactive molecular dynamics in virtual reality as a tool for citizen scientists. J Chem Phys 2021; 155:154106. [PMID: 34686059 DOI: 10.1063/5.0062517] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/14/2022] Open
Abstract
The emerging fields of citizen science and gamification reformulate scientific problems as games or puzzles to be solved. Through engaging the wider non-scientific community, significant breakthroughs may be made by analyzing citizen-gathered data. In parallel, recent advances in virtual reality (VR) technology are increasingly being used within a scientific context and the burgeoning field of interactive molecular dynamics in VR (iMD-VR) allows users to interact with dynamical chemistry simulations in real time. Here, we demonstrate the utility of iMD-VR as a medium for gamification of chemistry research tasks. An iMD-VR "game" was designed to encourage users to explore the reactivity of a particular chemical system, and a cohort of 18 participants was recruited to playtest this game as part of a user study. The reaction game encouraged users to experiment with making chemical reactions between a propyne molecule and an OH radical, and "molecular snapshots" from each game session were then compiled and used to map out reaction pathways. The reaction network generated by users was compared to existing literature networks demonstrating that users in VR capture almost all the important reaction pathways. Further comparisons between humans and an algorithmic method for guiding molecular dynamics show that through using citizen science to explore these kinds of chemical problems, new approaches and strategies start to emerge.
Affiliation(s)
- Robin J Shannon
  - School of Chemistry, University of Bristol, Bristol BS8 1TS, United Kingdom
- Helen M Deeks
  - School of Chemistry, University of Bristol, Bristol BS8 1TS, United Kingdom
- Eleanor Burfoot
  - School of Chemistry, University of Bristol, Bristol BS8 1TS, United Kingdom
- Edward Clark
  - School of Chemistry, University of Bristol, Bristol BS8 1TS, United Kingdom
- Alex J Jones
  - School of Chemistry, University of Bristol, Bristol BS8 1TS, United Kingdom
- David R Glowacki
  - ArtSci Foundation International, 5th floor Mariner House, Bristol, BS1 4QD, United Kingdom
3
Trevizani R, Custódio FL, Dos Santos KB, Dardenne LE. Critical Features of Fragment Libraries for Protein Structure Prediction. PLoS One 2017; 12:e0170131. [PMID: 28085928 DOI: 10.1371/journal.pone.0170131] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 08/29/2016] [Accepted: 12/29/2016] [Indexed: 11/19/2022] Open
Abstract
The use of fragment libraries is a popular approach among protein structure prediction methods and has proven to substantially improve the quality of predicted structures. However, some vital aspects of a fragment library that influence the accuracy of modeling a native structure remain to be determined. This study investigates some of these features. In particular, we analyze the effect of using secondary structure prediction to guide fragment selection, the effect of different fragment sizes, and the effect of structural clustering of fragments within libraries. To have a clearer view of how these factors affect protein structure prediction, we isolated the process of model building by fragment assembly from some common limitations associated with prediction methods, e.g., imprecise energy functions and optimization algorithms, by employing an exact structure-based objective function under a greedy algorithm. Our results indicate that shorter fragments reproduce the native structure more accurately than longer ones. Libraries composed of multiple fragment lengths generate even better structures, with longer fragments proving more useful at the beginning of the simulations. The use of many different fragment sizes shows little improvement when compared to predictions carried out with libraries that comprise only three different fragment sizes. Models obtained from libraries built using only sequence similarity are, on average, better than those built with a secondary structure prediction bias. However, we found that the use of secondary structure prediction allows greater reduction of the search space, which is invaluable for prediction methods. The results of this study can be critical guidelines for the use of fragment libraries in protein structure prediction.
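The idea of fragment assembly under a greedy algorithm with an exact structure-based objective can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a simplified model with one torsion angle per residue, and the names `angular_error` and `greedy_assemble` are hypothetical. At each step, every fragment is tried at every position, and the single insertion that most reduces the deviation from the known native angles is accepted.

```python
def angular_error(model, native):
    """Mean absolute angular difference in degrees, wrapped to [0, 180]."""
    total = 0.0
    for a, b in zip(model, native):
        d = abs(a - b) % 360.0
        total += min(d, 360.0 - d)
    return total / len(native)


def greedy_assemble(native, library, start=None):
    """Greedily insert fragments while the error to the native decreases.

    native  : list of torsion angles, one per residue (toy assumption)
    library : list of fragments, each a tuple of angles of any length
    start   : optional initial model; defaults to an extended chain (180 degrees)
    """
    model = list(start) if start is not None else [180.0] * len(native)
    best = angular_error(model, native)
    while True:
        best_move = None
        # Scan every fragment at every position and remember the best insertion.
        for frag in library:
            for pos in range(len(native) - len(frag) + 1):
                trial = model[:pos] + list(frag) + model[pos + len(frag):]
                err = angular_error(trial, native)
                if err < best - 1e-12:
                    best, best_move = err, trial
        if best_move is None:
            # No single insertion improves the objective: greedy search stops.
            return model, best
        model = best_move


# Usage: a library mixing three fragment lengths, as discussed in the abstract.
native = [-60, -60, -60, -120, 120, 120]
library = [(-60, -60, -60), (-120, 120), (120,)]
model, err = greedy_assemble(native, library)
```

In this toy run the longest fragment is accepted first, and shorter fragments refine the remaining positions afterwards.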
4
Isaac AE, Sinha S. Analysis of core-periphery organization in protein contact networks reveals groups of structurally and functionally critical residues. J Biosci 2015; 40:683-99. [PMID: 26564971 DOI: 10.1007/s12038-015-9554-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 10/23/2022]
Abstract
The representation of proteins as networks of interacting amino acids, referred to as protein contact networks (PCN), and their subsequent analyses using graph theoretic tools, can provide novel insights into the key functional roles of specific groups of residues. We have characterized the networks corresponding to the native states of 66 proteins (belonging to different families) in terms of their core-periphery organization. The resulting hierarchical classification of the amino acid constituents of a protein arranges the residues into successive layers of increasing core order and increasing connection density, ranging from a sparsely linked periphery to a densely intra-connected core (distinct from the earlier concept of a protein core defined in terms of the three-dimensional geometry of the native state, which has the least solvent accessibility). Our results show that residues in the inner cores are more conserved than those at the periphery. Underlining the functional importance of the network core, we see that the receptor sites for known ligand molecules of most proteins occur in the innermost core. Furthermore, the association of residues with structural pockets and cavities in binding or active sites increases with the core order. From mutation sensitivity analysis, we show that the probability of deleterious or intolerant mutations also increases with the core order. We also show that stabilization centre residues are in the innermost cores, suggesting that the network core is critically important in maintaining the structural stability of the protein. A public Web resource for performing core-periphery analysis of any protein whose native state is known is available at http://www.imsc.res.in/~sitabhra/proteinKcore/index.html.
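The layering by core order described above corresponds to what graph theory calls k-core decomposition. The following is a minimal sketch of that standard peeling procedure, not the code behind the authors' Web resource. It assumes the contact network is already given as a symmetric adjacency mapping, and the function name `core_numbers` and the toy residue labels are hypothetical.

```python
def core_numbers(adjacency):
    """Return the core order of every node by repeatedly peeling low-degree nodes.

    adjacency : dict mapping each node to the set of its neighbours (symmetric)
    """
    degree = {node: len(nbrs) for node, nbrs in adjacency.items()}
    remaining = set(adjacency)
    core = {}
    k = 0
    while remaining:
        # Nodes whose remaining degree is at most k belong to the k-th layer.
        peel = [node for node in remaining if degree[node] <= k]
        if not peel:
            k += 1
            continue
        for node in peel:
            core[node] = k
            remaining.discard(node)
            # Removing a node lowers the degree of its surviving neighbours.
            for nbr in adjacency[node]:
                if nbr in remaining:
                    degree[nbr] -= 1
    return core


# Usage: four mutually contacting residues (A-D), plus E and F on the periphery.
contacts = {
    "A": {"B", "C", "D", "E"},
    "B": {"A", "C", "D", "E"},
    "C": {"A", "B", "D"},
    "D": {"A", "B", "C"},
    "E": {"A", "B", "F"},
    "F": {"E"},
}
layers = core_numbers(contacts)
```

Residues with the highest core order form the innermost core; in a real analysis the adjacency would be derived from a structure, for example with a distance cutoff between residues, which is an assumption not specified in the abstract.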
Affiliation(s)
- Arnold Emerson Isaac
  - Bioinformatics Division, School of Bio Sciences and Technology, VIT University, Vellore, India
5
Orevi T, Rahamim G, Hazan G, Amir D, Haas E. The loop hypothesis: contribution of early formed specific non-local interactions to the determination of protein folding pathways. Biophys Rev 2013; 5:85-98. [PMID: 28510159 DOI: 10.1007/s12551-013-0113-3] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 01/15/2013] [Accepted: 03/01/2013] [Indexed: 12/12/2022] Open
Abstract
The extremely fast and efficient folding transition (in seconds) of globular proteins led to the search for some unifying principles embedded in the physics of the folding polypeptides. Most of the proposed mechanisms highlight the role of local interactions that stabilize secondary structure elements or a folding nucleus as the starting point of the folding pathways, i.e., a "bottom-up" mechanism. Non-local interactions were assumed either to stabilize the nucleus or lead to the later steps of coalescence of the secondary structure elements. An alternative mechanism was proposed, an "up-down" mechanism in which it was assumed that folding starts with the formation of very few non-local interactions which form closed long loops at the initiation of folding. The possible biological advantage of this mechanism, the "loop hypothesis", is that the hydrophobic collapse is associated with ordered compactization which reduces the chance for degradation and misfolding. In the present review the experiments, simulations and theoretical consideration that either directly or indirectly support this mechanism are summarized. It is argued that experiments monitoring the time-dependent development of the formation of specifically targeted early-formed sub-domain structural elements, either long loops or secondary structure elements, are necessary. This can be achieved by the time-resolved FRET-based "double kinetics" method in combination with mutational studies. Yet, attempts to improve the time resolution of the folding initiation should be extended down to the sub-microsecond time regime in order to design experiments that would resolve the classes of proteins which first fold by local or non-local interactions.
Affiliation(s)
- Tomer Orevi
  - The Goodman Faculty of Life Sciences, Bar Ilan University, Ramat Gan, Israel, 52900
- Gil Rahamim
  - The Goodman Faculty of Life Sciences, Bar Ilan University, Ramat Gan, Israel, 52900
- Gershon Hazan
  - The Goodman Faculty of Life Sciences, Bar Ilan University, Ramat Gan, Israel, 52900
- Dan Amir
  - The Goodman Faculty of Life Sciences, Bar Ilan University, Ramat Gan, Israel, 52900
- Elisha Haas
  - The Goodman Faculty of Life Sciences, Bar Ilan University, Ramat Gan, Israel, 52900
6
Abstract
Small protein fragments, and not just residues, can be used as basic building blocks to reconstruct networks of coevolved amino acids in proteins. Fragments often come into physical contact with one another and play a major biological role in the protein. The nature of these interactions can be manifold and extends beyond binding specificity, allosteric regulation and folding constraints. Indeed, coevolving fragments are indicators of important information explaining folding intermediates, peptide assembly, key mutations with known roles in genetic diseases, distinguished subfamily-dependent motifs and differentiated evolutionary pressures on protein regions. Coevolution analysis detects networks of fragment interactions and highlights a high-order organization of fragments, demonstrating the importance of studying this structure at a deeper level. We demonstrate that it can be applied to protein families that are highly conserved or represented by few sequences, thereby enlarging the class of proteins on which coevolution analysis can be performed and making large-scale coevolution studies a feasible goal.
Affiliation(s)
- Linda Dib
  - Université Pierre et Marie Curie, UMR 7238, Équipe de Génomique Analytique, Paris, France
  - CNRS, UMR 7238, Laboratoire de Génomique des Microorganismes, Paris, France
- Alessandra Carbone
  - Université Pierre et Marie Curie, UMR 7238, Équipe de Génomique Analytique, Paris, France
  - CNRS, UMR 7238, Laboratoire de Génomique des Microorganismes, Paris, France
7
Rorick M. Quantifying protein modularity and evolvability: a comparison of different techniques. Biosystems 2012; 110:22-33. [PMID: 22796584 DOI: 10.1016/j.biosystems.2012.06.006] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 12/15/2011] [Revised: 06/20/2012] [Accepted: 06/27/2012] [Indexed: 10/28/2022]
Abstract
Modularity increases evolvability by reducing constraints on adaptation and by allowing preexisting parts to function in new contexts for novel uses. Protein evolution provides an excellent context to study the causes and consequences of biological modularity. In order to address such questions, however, an index for protein modularity is necessary. This paper proposes a simple index for protein modularity, "module density", which is the number of evolutionarily independent modules that compose a protein divided by the number of amino acids in the protein. The decomposition of proteins into constituent modules can be accomplished by either of two classes of methods. The first class of methods relies on "suppositional" criteria to assign amino acids to modules, whereas the second class of methods relies on "coevolutionary" criteria for this task. One simple and practical method from the first class consists of approximating the number of modules in a protein as the number of regular secondary structure elements (i.e., helices and sheets). Methods based on coevolutionary criteria require more elaborate data, but they have the advantage of being able to specify modules without prior assumptions about why they exist. Given the increasing availability of datasets sampling protein mutational spectra (e.g., from comparative genomics, experimental evolution, and computational prediction), methods based on coevolutionary criteria will likely become more promising in the near future. The ability to meaningfully quantify protein modularity via simple indices has the potential to aid future efforts to understand protein evolutionary rate determinants, improve molecular evolution models and engineer novel proteins.
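The "module density" index under the simple secondary-structure approximation can be sketched in a few lines. This is an illustrative sketch, not the paper's code. It assumes the input is a per-residue secondary-structure string in a DSSP-like alphabet ("H" for helix, "E" for strand, anything else for coil), and it counts each contiguous run of "H" or "E" as one module; treating individual strands as elements is an assumption, and the function name `module_density` is hypothetical.

```python
from itertools import groupby


def module_density(ss_string, element_codes=("H", "E")):
    """Approximate module density as (number of regular elements) / (length).

    ss_string     : per-residue secondary-structure string, e.g. "CCHHHHCC"
    element_codes : codes counted as regular secondary-structure elements
    """
    if not ss_string:
        raise ValueError("empty secondary-structure string")
    # groupby collapses each contiguous run of identical codes into one group.
    modules = sum(1 for code, _ in groupby(ss_string) if code in element_codes)
    return modules / len(ss_string)


# Usage: one helix and two strands in a 20-residue chain.
density = module_density("CCHHHHCCEEEECCEEEECC")
```

A coevolution-based decomposition would replace the run counting with a module assignment derived from mutational data, which this sketch does not attempt.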
Affiliation(s)
- Mary Rorick
  - University of Michigan, Department of Ecology and Evolutionary Biology, Ann Arbor, MI 48109-1048, United States
8
Olson B, Molloy K, Shehu A. In search of the protein native state with a probabilistic sampling approach. J Bioinform Comput Biol 2011; 9:383-98. [DOI: 10.1142/s0219720011005574] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Received: 01/31/2011] [Revised: 04/07/2011] [Accepted: 04/11/2011] [Indexed: 11/18/2022]
Abstract
The three-dimensional structure of a protein is a key determinant of its biological function. Given the cost and time required to acquire this structure through experimental means, computational models are necessary to complement wet-lab efforts. Many computational techniques exist for navigating the high-dimensional protein conformational search space, which is explored for low-energy conformations that comprise a protein's native states. This work proposes two strategies to enhance the sampling of conformations near the native state. An enhanced fragment library with greater structural diversity is used to expand the search space in the context of fragment-based assembly. To manage the increased complexity of the search space, only a representative subset of the sampled conformations is retained to further guide the search towards the native state. Our results make the case that these two strategies greatly enhance the sampling of the conformational space near the native state. A detailed comparative analysis shows that our approach performs as well as state-of-the-art ab initio structure prediction protocols.
Affiliation(s)
- Brian Olson
  - Department of Computer Science, George Mason University, 4400 University Drive, Fairfax, VA 22030, USA
- Kevin Molloy
  - Department of Computer Science, George Mason University, 4400 University Drive, Fairfax, VA 22030, USA
- Amarda Shehu
  - Department of Computer Science, George Mason University, 4400 University Drive, Fairfax, VA 22030, USA
  - Department of Bioinformatics and Computational Biology, George Mason University, 4400 University Drive, Fairfax, VA 22030, USA
9
Abstract
Accurate tertiary structures are very important for the functional study of non-coding RNA molecules. However, predicting RNA tertiary structures is extremely challenging because of the large conformation space to be explored and the lack of an accurate scoring function that differentiates the native structure from decoys. The fragment-based conformation sampling method (e.g. FARNA) has the shortcoming that the limited size of a fragment library makes it infeasible to represent all possible conformations well. A recent dynamic Bayesian network method, BARNACLE, overcomes this issue of fragment assembly. However, neither of these methods makes use of sequence information in sampling conformations. Here, we present a new probabilistic graphical model, conditional random fields (CRFs), to model the RNA sequence–structure relationship, which enables us to accurately estimate the probability of an RNA conformation from sequence. Coupled with a novel tree-guided sampling scheme, our CRF model is then applied to RNA conformation sampling. Experimental results show that our CRF method can model the RNA sequence–structure relationship well and that sequence information is important for conformation sampling. Our method, named TreeFolder, generates a much higher percentage of native-like decoys than FARNA and BARNACLE, although we use the same simple energy function as BARNACLE. Contact: zywang@ttic.edu; j3xu@ttic.edu. Supplementary Information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Zhiyong Wang
  - Toyota Technological Institute at Chicago, IL, USA
10
Prudhomme N, Chomilier J. Prediction of the protein folding core: application to the immunoglobulin fold. Biochimie 2009; 91:1465-74. [PMID: 19665046 DOI: 10.1016/j.biochi.2009.07.016] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 04/01/2009] [Accepted: 07/30/2009] [Indexed: 11/27/2022]
Abstract
We propose an algorithm that predicts residues important for the formation of the structure of globular proteins. It relies on a simulation that detects the amino acids presenting a maximum number of neighbours during the early steps of the folding process. They have been called MIR (Most Interacting Residues). Independently, the description of protein structures as fragments with closed ends (tightened end fragments, TEF) shows a correlation between these extremities and the core of the globules. These fragments are of rather constant length, typically between 20 and 25 amino acids, and we have previously shown that their extremities are preferentially occupied by MIR. Introducing rules derived from this fragment analysis of tertiary structures smooths the distribution of MIR, giving a better match between TEF ends and MIR. In order to assess this prediction of the folding core, a large family of structures has been used, with sequences as different as possible. A dataset of 56 immunoglobulin structures of various functions but common fold has been used in this study. This fold was chosen because it is one of the most populated, with a large amount of data available on its nucleus. In the immunoglobulin domain, "functional and structural load is clearly separated: loops are responsible for binding and recognition while interactions between several residues of the buried core provide stability and fast folding"[1]. We then determined the positions likely to be of high importance for the folding process and compared them to published data from High Throw Out Order (HTOO), Conservatism of Conservatism (CoC) or Phi value experiments. The positions that we predict are in reasonable agreement with the experimental data. Moreover, our prediction goes beyond the simple use of a null solvent accessibility of amino acids as a criterion to predict the core. We obtain the same prediction quality on the flavodoxin-like superfamily.
Affiliation(s)
- Nicolas Prudhomme
  - Protein Structure Prediction, IMPMC, CNRS UMR 7590, Paris 6 University, 75015 Paris, France
11
Abstract
Background: Here we continue our efforts to use methods developed in the folding mechanism community to both better understand and improve structure prediction. Our previous work demonstrated that Rosetta's coarse-grained potentials may actually impede accurate structure prediction at full-atom resolution. Based on this work we postulated that it may be time to work completely at full-atom resolution but that doing so may require more careful attention to the kinetics of convergence.
Methodology/Principal Findings: To explore the possibility of working entirely at full-atom resolution, we apply enhanced sampling algorithms and the free energy theory developed in the folding mechanism community to full-atom protein structure prediction with the prominent Rosetta package. We find that Rosetta's full-atom scoring function is indeed able to recognize diverse protein native states and that there is a strong correlation between score and Cα RMSD to the native state. However, we also show that there is a huge entropic barrier to folding under this potential and the kinetics of folding are extremely slow. We then exploit this new understanding to suggest ways to improve structure prediction.
Conclusions/Significance: Based on this work we hypothesize that structure prediction may be improved by taking a more physical approach, i.e. considering the nature of the model thermodynamics and kinetics which result from structure prediction simulations.
Affiliation(s)
- Gregory R. Bowman
  - Biophysics Program, Stanford University, Stanford, California, United States of America
- Vijay S. Pande
  - Department of Chemistry, Stanford University, Stanford, California, United States of America
12
Abstract
Three-dimensional structures of proteins underpin their biological functions. Their folds are maintained by inter-residue interactions, which are a main focus in understanding the mechanisms of protein folding and stability. Furthermore, protein structures can be composed of single or multiple functional domains that can fold and function independently. Hence, dividing a protein into domains is useful for accurately determining its structure and function. In previous studies, we elucidated protein contact properties according to different definitions and developed a novel methodology named Protein Peeling. Within protein structures, Protein Peeling characterizes small successive compact units along the sequence called protein units (PUs). The cutting done by Protein Peeling maximizes the number of contacts within the PUs and minimizes the number of contacts between them. This method is thus a relevant tool in the context of protein folding research, particularly regarding the hierarchical model proposed by George Rose. Here, we accurately analyze the PUs at different levels of cutting, using a non-redundant protein databank. The distribution of PU sizes, the number of PUs and their accessibility are screened to determine their common and different features. Moreover, we highlight the preferential amino acid interactions inside and between PUs. Our results show that PUs are clearly an intermediate level between secondary structures and protein structural domains.
Affiliation(s)
- Guilhem Faure
  - INSERM UMR-S 726, Equipe de Bioinformatique Génomique et Moléculaire (EBGM), DSIMB, Université Paris Diderot - Paris 7, case 7113, 2 place Jussieu, 75251 Paris, France
13
Abstract
How a one-dimensional protein sequence folds into a specific 3D structure remains a difficult challenge in structural biology. Many computational methods have been developed in an attempt to predict the tertiary structure of the protein; most of these employ approaches that are based on the accumulated knowledge of solved protein structures. Here we introduce a novel and fully automated approach for predicting the 3D structure of a protein that is based on the well accepted notion that protein folding is a hierarchical process. Our algorithm follows the hierarchical model by employing two stages: the first aims to find a match between the sequences of short independently-folding structural entities and parts of the target sequence and assigns the respective structures. The second assembles these local structural parts into a complete 3D structure, allowing for long-range interactions between them. We present the results of applying our method to a subset of the targets from CASP6 and CASP7. Our results indicate that for targets with a significant sequence similarity to known structures we are often able to provide predictions that are better than those achieved by two leading servers, and that the most significant improvements in comparison with these methods occur in regions of a gapped structural alignment between the native structure and the closest available structural template. We conclude that in addition to performing well for targets with known homologous structures, our method shows great promise for addressing the more general category of comparative modeling targets, which is our next goal.
Affiliation(s)
- Ilona Kifer
  - School of Computer Science, Raymond and Beverly Sackler Faculty of Exact Sciences, Tel Aviv University, Tel Aviv 69978, Israel
14
Santos J, Sica MP, Buslje CM, Garrote AM, Ermácora MR, Delfino JM. Structural selection of a native fold by peptide recognition. Insights into the thioredoxin folding mechanism. Biochemistry 2009; 48:595-607. [PMID: 19119857 DOI: 10.1021/bi801969w] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/28/2022]
Abstract
Thioredoxins (TRXs) are monomeric alpha/beta proteins with a fold characterized by a central twisted beta-sheet surrounded by alpha-helical elements. The interaction of the C-terminal alpha-helix 5 of TRX against the remainder of the protein involves a close packing of hydrophobic surfaces, offering the opportunity of studying a fine-tuned molecular recognition phenomenon with long-range consequences on the acquisition of tertiary structure. In this work, we focus on the significance of interactions involving residues L94, L99, E101, F102, L103 and L107 on the formation of the noncovalent complex between reduced TRX1-93 and TRX94-108. The conformational status of the system was assessed experimentally by circular dichroism, intrinsic fluorescence emission and enzymic activity; and theoretically by molecular dynamics simulations (MDS). Alterations in tertiary structure of the complexes, resulting as a consequence of site specific mutation, were also examined. To distinguish the effect of alanine scanning mutagenesis on secondary structure stability, the intrinsic helix-forming ability of the mutant peptides was monitored experimentally by far-UV CD spectroscopy upon the addition of 2,2,2-trifluoroethanol, and also theoretically by Monte Carlo conformational search and MDS. This evidence suggests a key role of residues L99, F102 and L103 on the stabilization of the secondary structure of alpha-helix 5, and on the acquisition of tertiary structure upon complex formation. We hypothesize that the transition between a partially folded and a native-like conformation of reduced TRX1-93 would fundamentally depend on the consolidation of a cooperative tertiary unit based on the interaction between alpha-helix 3 and alpha-helix 5.
Collapse
Affiliation(s)
- Javier Santos
- Department of Biological Chemistry and Institute of Biochemistry and Biophysics (IQUIFIB), School of Pharmacy and Biochemistry, University of Buenos Aires, Junín 956, C1113AAD, Buenos Aires, Argentina
|
15
|
Benros C, de Brevern AG, Hazout S. Analyzing the sequence–structure relationship of a library of local structural prototypes. J Theor Biol 2009; 256:215-26. [DOI: 10.1016/j.jtbi.2008.08.032] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Received: 04/23/2008] [Revised: 08/23/2008] [Accepted: 08/31/2008] [Indexed: 10/21/2022]
|
16
|
Abstract
In recent years, protein structure prediction using local structure information has made great progress. In this study, a novel and effective method is developed to predict the local structure and the folding fragments of proteins. First, the proteins with known structures are split into fragments. Second, these fragments, represented by dihedrals, are clustered to produce the building blocks (BBs). Third, an efficient machine learning method is used to predict the local structures of proteins from sequence profiles. Finally, a bi-gram model, trained by an iterated algorithm, is introduced to simulate the interactions of these BBs. For test proteins, the building-block lattice is constructed, which contains all the folding fragments of the proteins. The local structures and the optimal fragments are then obtained by the dynamic programming algorithm. The experiment is performed on a subset of the PDB database with sequence identity less than 25%. The results show that the performance of the method is better than the method that uses only sequence information. When multiple paths are returned, the average classification accuracy of local structures is 72.27% and the average prediction accuracy of local structures is 67.72%, which is a significant improvement in comparison with previous studies. The method can predict not only the local structures but also the folding fragments of proteins. This work is helpful for the ab initio protein structure prediction and especially, the understanding of the folding process of proteins.
Affiliation(s)
- Qiwen Dong
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China.
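The last step described in this abstract, finding the optimal fragments in a building-block lattice by dynamic programming, can be illustrated with a minimal Viterbi-style sketch. Everything here is an assumption made for illustration: the function name `best_path`, the use of additive log-scores, the `default` penalty for unseen bi-grams, and the toy block labels. It is not the authors' implementation.

```python
from typing import Dict, List, Tuple


def best_path(local_scores: List[Dict[str, float]],
              bigram: Dict[Tuple[str, str], float],
              default: float = -5.0) -> Tuple[float, List[str]]:
    """Return (score, path) of the best building-block sequence.

    local_scores[i][b] is a hypothetical log-score for block b at position i;
    bigram[(a, b)] is a hypothetical log-score for block a followed by b.
    """
    # best[b] = (score of the best path ending in block b, that path)
    best = {b: (s, [b]) for b, s in local_scores[0].items()}
    for position in local_scores[1:]:
        nxt = {}
        for b, s in position.items():
            # Pick the predecessor that maximises path score plus transition.
            prev = max(best, key=lambda a: best[a][0] + bigram.get((a, b), default))
            prev_score, prev_path = best[prev]
            nxt[b] = (prev_score + bigram.get((prev, b), default) + s,
                      prev_path + [b])
        best = nxt
    return max(best.values(), key=lambda v: v[0])


# Toy lattice with two block types, "H" and "E" (labels are illustrative only).
scores = [{"H": -0.1, "E": -2.0}, {"H": -1.0, "E": -0.9}, {"H": -0.2, "E": -1.5}]
transitions = {("H", "H"): -0.1, ("E", "E"): -0.1,
               ("H", "E"): -2.0, ("E", "H"): -2.0}
total, path = best_path(scores, transitions)
print(path, round(total, 2))
```

The bi-gram term is what lets a locally weaker block win when it fits its neighbours better, which is the role the abstract assigns to the interaction model between building blocks.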
|
17
|
Abstract
MOTIVATION The 3D structure of a protein sequence can be assembled from the substructures corresponding to small segments of this sequence. For each small sequence segment, there are only a few more likely substructures. We call them the 'structural alphabet' for this segment. Classical approaches such as ROSETTA use sequence profile and secondary structure information to predict structural fragments. In contrast, we utilize more structural information, such as solvent accessibility and contact capacity, for finding structural fragments. RESULTS An integer linear programming technique is applied to derive the best combination of these sequence and structural information items. This approach generates significantly more accurate and succinct structural alphabets with more than 50% improvement over the previous accuracies. With these novel structural alphabets, we are able to construct more accurate protein structures than state-of-the-art ab initio protein structure prediction programs such as ROSETTA. We are also able to reduce the size of Kolodny's library by a factor of 8 at the same accuracy. AVAILABILITY The online FRazor server is under construction.
Affiliation(s)
- Shuai Cheng Li
- David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, Ont. N2L 3G1, Canada.
|
18
|
Baeten L, Reumers J, Tur V, Stricher F, Lenaerts T, Serrano L, Rousseau F, Schymkowitz J. Reconstruction of protein backbones from the BriX collection of canonical protein fragments. PLoS Comput Biol 2008; 4:e1000083. [PMID: 18483555 DOI: 10.1371/journal.pcbi.1000083] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.4] [Received: 09/21/2007] [Accepted: 04/07/2008] [Indexed: 12/23/2022] Open
Abstract
As modeling of changes in backbone conformation still lacks a computationally efficient solution, we developed a discretisation of the conformational states accessible to the protein backbone similar to the successful rotamer approach in side chains. The BriX fragment database, consisting of fragments from 4 to 14 residues long, was realized through identification of recurrent backbone fragments from a non-redundant set of high-resolution protein structures. BriX contains an alphabet of more than 1,000 frequently observed conformations per peptide length for 6 different variation levels. Analysis of the performance of BriX revealed an average structural coverage of protein structures of more than 99% within a root mean square distance (RMSD) of 1 Angstrom. Globally, we are able to reconstruct protein structures with an average accuracy of 0.48 Angstrom RMSD. As expected, regular structures are well covered, but, interestingly, many loop regions that appear irregular at first glance are also found to form a recurrent structural motif, albeit with lower frequency of occurrence than regular secondary structures. Larger loop regions could be completely reconstructed from smaller recurrent elements, between 4 and 8 residues long. Finally, we observed that a significant number of short sequences tend to display strong structural ambiguity between alpha helix and extended conformations. When the sequence length increases, this so-called sequence plasticity is no longer observed, illustrating the context dependency of polypeptide structures.
Large-scale DNA sequencing efforts produce large amounts of protein sequence data. However, in order to understand the function of a protein, its tertiary three-dimensional structure is required. Despite worldwide efforts in structural biology, experimental protein structures are determined at a significantly slower pace. As a result, computational methods for protein structure prediction receive significant attention. A large part of the structure prediction problem lies in the enormous size of the problem: proteins seem to occur in an infinite variety of shapes. Here, we propose that this huge complexity may be overcome by identifying recurrent protein fragments, which are frequently reused as building blocks to construct proteins that were hitherto thought to be unrelated. The BriX database is the outcome of identifying about 2,000 canonical shapes among 1,261 protein structures. We show any given protein can be reconstructed from this library of building blocks at a very high resolution, suggesting that the modelling of protein backbones may be greatly aided by our database.
|
19
|
Manikandan K, Pal D, Ramakumar S, Brener NE, Iyengar SS, Seetharaman G. Functionally important segments in proteins dissected using Gene Ontology and geometric clustering of peptide fragments. Genome Biol 2008; 9:R52. [PMID: 18331637 PMCID: PMC2397504 DOI: 10.1186/gb-2008-9-3-r52] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.0] [Received: 11/30/2007] [Revised: 02/24/2008] [Accepted: 03/10/2008] [Indexed: 11/25/2022] Open
Abstract
A geometric clustering algorithm has been developed to dissect protein fragments based on their relevance to function. We have developed a geometric clustering algorithm using backbone φ,ψ angles to group conformationally similar peptide fragments of any length. By labeling each fragment in the cluster with the level-specific Gene Ontology 'molecular function' term of its protein, we are able to compute statistics for molecular function-propensity and p-value of individual fragments in the cluster. Clustering-cum-statistical analysis for peptide fragments 8 residues in length and with only trans peptide bonds shows that molecular function propensities ≥20 and p-values ≤0.05 can dissect fragments within a protein linked to the molecular function.
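Grouping fragments by backbone φ,ψ angles needs a distance that respects the periodicity of torsion angles, since +179° and -179° are only 2° apart. The sketch below shows one common way to do this, a root-mean-square angular deviation. The function names and the choice of RMS as the metric are assumptions for illustration; the abstract does not specify the distance used by the authors.

```python
import math
from typing import Sequence, Tuple

Torsions = Sequence[Tuple[float, float]]  # one (phi, psi) pair per residue, in degrees


def angle_diff(a: float, b: float) -> float:
    """Smallest absolute difference between two angles in degrees (0..180)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def fragment_distance(f1: Torsions, f2: Torsions) -> float:
    """RMS angular deviation over all phi and psi angles of two fragments."""
    if len(f1) != len(f2):
        raise ValueError("fragments must have the same number of residues")
    squared = 0.0
    for (phi1, psi1), (phi2, psi2) in zip(f1, f2):
        squared += angle_diff(phi1, phi2) ** 2 + angle_diff(psi1, psi2) ** 2
    return math.sqrt(squared / (2 * len(f1)))


# Two helix-like fragments (illustrative angles) are close; a strand-like one is not.
helix_a = [(-60.0, -45.0)] * 8
helix_b = [(-65.0, -40.0)] * 8
strand = [(-120.0, 130.0)] * 8
print(fragment_distance(helix_a, helix_b) < fragment_distance(helix_a, strand))
```

Any clustering scheme that accepts a pairwise distance could then be applied to fragments of equal length, such as the 8-residue fragments analysed in the abstract.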
|
20
|
|
21
|
Abstract
The "protein folding problem" consists of three closely related puzzles: (a) What is the folding code? (b) What is the folding mechanism? (c) Can we predict the native structure of a protein from its amino acid sequence? Once regarded as a grand challenge, protein folding has seen great progress in recent years. Now, foldable proteins and nonbiological polymers are being designed routinely and moving toward successful applications. The structures of small proteins are now often well predicted by computer methods. And, there is now a testable explanation for how a protein can fold so quickly: A protein solves its large global optimization problem as a series of smaller local optimization problems, growing and assembling the native structure from peptide fragments, local structures first.
Affiliation(s)
- Ken A. Dill
- Department of Pharmaceutical Chemistry, University of California, San Francisco, California 94143
- Graduate Group in Biophysics, University of California, San Francisco, California 94143;
- S. Banu Ozkan
- Department of Physics, Arizona State University, Tempe, Arizona 85287;
- M. Scott Shell
- Department of Chemical Engineering, University of California, Santa Barbara, California 93106;
- Thomas R. Weikl
- Max Planck Institute of Colloids and Interfaces, Department of Theory and Bio-Systems, 14424 Potsdam, Germany;
|
22
|
Abstract
Currently there is increasing interest in nanostructures and their design. Nanostructure design involves the ability to predictably manipulate the properties of the self-assembly of autonomous units. Autonomous units have preferred conformational states. The units can be synthetic material science-based or derived from functional biological macromolecules. Autonomous biological building blocks with available structures provide an extremely rich and useful resource for design. For proteins, the structural databases contain large libraries of protein molecules and their building blocks with a range of shapes, surfaces, and chemical properties. The introduction of engineered synthetic residues or short peptides into these can expand the available chemical space and enhance the desired properties. Here we focus on the principles of nanostructure design with protein building blocks.
Affiliation(s)
- Chung-Jung Tsai
- Basic Research Program, SAIC-Frederick, Inc., Center for Cancer Research Nanobiology Program, NCI-Frederick, Frederick, Maryland 21702, USA
|
23
|
Minary P, Levitt M. Probing protein fold space with a simplified model. J Mol Biol 2008; 375:920-33. [PMID: 18054792 DOI: 10.1016/j.jmb.2007.10.087] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.9] [Received: 08/15/2007] [Revised: 10/15/2007] [Accepted: 10/31/2007] [Indexed: 11/24/2022]
Abstract
We probe the stability and near-native energy landscape of protein fold space using powerful conformational sampling methods together with simple reduced models and statistical potentials. Fold space is represented by a set of 280 protein domains spanning all topological classes and having a wide range of lengths (33-300 residues), amino acid compositions and numbers of secondary structural elements. The degrees of freedom are taken as the loop torsion angles. This choice preserves the native secondary structure but allows the tertiary structure to change. The proteins are represented by three-point-per-residue, three-dimensional models with statistical potentials derived from a knowledge-based study of known protein structures. When this space is sampled by a combination of parallel tempering and equi-energy Monte Carlo, we find that the three-point model captures the known stability of protein native structures with stable energy basins that are near-native (all-alpha: 4.77 Å, all-beta: 2.93 Å, alpha/beta: 3.09 Å, alpha+beta: 4.89 Å on average, and within 6 Å for 71.41%, 92.85%, 94.29% and 64.28% of the all-alpha, all-beta, alpha/beta and alpha+beta classes, respectively). Denatured structures also occur and these have interesting structural properties that shed light on the different landscape characteristics of alpha and beta folds. We find that alpha/beta proteins with alternating alpha and beta segments (such as the beta-barrel) are more stable than proteins in other fold classes.
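Parallel tempering, one of the two sampling methods named in this abstract, runs replicas at different temperatures and occasionally swaps their configurations. A minimal sketch of the standard Metropolis exchange criterion follows. The function names, the reduced units with `k_b = 1.0`, and the example energies are assumptions for illustration, and the equi-energy component of the authors' sampler is not shown.

```python
import math
import random


def swap_probability(e_i: float, e_j: float, t_i: float, t_j: float,
                     k_b: float = 1.0) -> float:
    """Acceptance probability min(1, exp((beta_i - beta_j) * (E_i - E_j)))."""
    beta_i = 1.0 / (k_b * t_i)
    beta_j = 1.0 / (k_b * t_j)
    delta = (beta_i - beta_j) * (e_i - e_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_swap(e_i: float, e_j: float, t_i: float, t_j: float,
                 rng: random.Random) -> bool:
    """Decide stochastically whether replicas i and j exchange configurations."""
    return rng.random() < swap_probability(e_i, e_j, t_i, t_j)


rng = random.Random(0)
# The cold replica (T=1) holds the higher energy, so the swap is always accepted.
print(swap_probability(-5.0, -10.0, 1.0, 2.0))
# The cold replica already holds the lower energy, so the swap is accepted only sometimes.
print(swap_probability(-10.0, -5.0, 1.0, 2.0))
print(attempt_swap(-5.0, -10.0, 1.0, 2.0, rng))
```

Swaps let low-energy configurations found at high temperature migrate to the cold replicas, which is how the method escapes local minima on a rugged energy landscape.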
|
24
|
Friedberg I, Godzik A. Connecting the protein structure universe by using sparse recurring fragments. Structure 2005; 13:1213-24. [PMID: 16084393 DOI: 10.1016/j.str.2005.05.009] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.8] [Received: 02/07/2005] [Revised: 04/22/2005] [Accepted: 05/11/2005] [Indexed: 10/25/2022]
Abstract
The quest to order and classify protein structures has led to various classification schemes, focusing mostly on hierarchical relationships between structural domains. At the coarsest classification level, such schemes typically identify hundreds of types of fundamental units called folds. As a result, we picture protein structure space as a collection of isolated fold islands. It is obvious, however, that many protein folds share structural and functional commonalities. Locating those commonalities is important for our understanding of protein structure, function, and evolution. Here, we present an alternative view of the protein fold space, based on an interfold similarity measure that is related to the frequency of fragments shared between folds. In this view, protein structures form a complicated, cross-connected network with very interesting topology. We show that interfold similarity based on sequence/structure fragments correlates well with similarities of functions between protein populations in different folds.
Affiliation(s)
- Iddo Friedberg
- Program in Bioinformatics and Systems Biology, The Burnham Institute, La Jolla, California 92037, USA.
|
25
|
Dallüge R, Oschmann J, Birkenmeier O, Lücke C, Lilie H, Rudolph R, Lange C. A tetrapeptide fragment-based design method results in highly stable artificial proteins. Proteins 2007; 68:839-49. [PMID: 17557327 DOI: 10.1002/prot.21493] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Indexed: 01/10/2023]
Abstract
Computational protein design has progressed rapidly in recent years. A number of design methods have been proposed and tested. In this paper, we report the successful application of a fragment-based method for protein design. The method uses statistical information on tetrapeptide backbone conformations. The previously published artificial fold of Top7 (Kuhlman et al., Science, 2003; 302:1364-1368) was chosen as template. A series of polypeptide sequences were created that were predicted to fold into this target structure. Two of the designed proteins, M5 and M7, were expressed and characterized by fluorescence spectroscopy, circular dichroism and NMR. They showed the hallmarks of well-ordered tertiary structure as well as cooperative folding/unfolding transitions. Furthermore, the two novel proteins were found to be highly stable against temperature- and denaturant-induced unfolding.
Affiliation(s)
- Roman Dallüge
- Institut für Biotechnologie, Martin-Luther-Universität Halle-Wittenberg, 06099 Halle, Saale, Germany
|
26
|
Abstract
MOTIVATION Most methods that are used to compare protein structures use three-dimensional (3D) structural information. At the same time, it has been shown that a 1D string representation of local protein structure retains a degree of structural information. This type of representation can be a powerful tool for protein structure comparison and classification, given the arsenal of sequence comparison tools developed by computational biology. However, in order to do so, there is a need to first understand how much information is contained in various possible 1D representations of protein structure. RESULTS Here we describe the use of a particular structure fragment library, denoted here as KL-strings, for the 1D representation of protein structure. Using KL-strings, we develop an infrastructure for comparing protein structures with a 1D representation. This study focuses on the added value gained from such a description. We show the new local structure language adds resolution to the traditional three-state (helix, strand and coil) secondary structure description, and provides a high degree of accuracy in recognizing structural similarities when used with a pairwise alignment benchmark. The results of this study have immediate applications towards fast structure recognition, and for fold prediction and classification.
Affiliation(s)
- Iddo Friedberg
- Program in Bioinformatics and Systems Biology, Burnham Institute for Medical Research, La Jolla, CA, USA.
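Once a structure is written as a 1D string over a fragment alphabet, ordinary sequence alignment can compare two structures. The sketch below is a plain Needleman-Wunsch global alignment score. The function name, the scoring values, and the example strings are assumptions for illustration; they are not the KL-string alphabet or the substitution scores used in the study.

```python
def global_alignment_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -2) -> int:
    """Needleman-Wunsch score of two structure-alphabet strings (linear gaps)."""
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diagonal, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


# Hypothetical strings: each letter stands for one local backbone fragment type.
print(global_alignment_score("ABCD", "ABCD"))
print(global_alignment_score("ABCD", "ABD"))
```

Because only two rows of the dynamic programming table are kept, memory grows with the shorter dimension of the table rather than with the product of the lengths, which matters when many structure pairs are screened.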
|
27
|
Tsai CJ, Zheng J, Alemán C, Nussinov R. Structure by design: from single proteins and their building blocks to nanostructures. Trends Biotechnol 2006; 24:449-54. [PMID: 16935374 DOI: 10.1016/j.tibtech.2006.08.004] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.3] [Received: 01/04/2006] [Revised: 07/12/2006] [Accepted: 08/15/2006] [Indexed: 10/24/2022]
Abstract
Nanotechnology realizes the advantages of naturally occurring biological macromolecules and their building-block nature for design. Frequently, assembly starts with the choice of a "good" molecule that is synthetically optimized towards the desired shape. By contrast, we propose starting with a pre-specified nanostructure shape, selecting candidate protein building blocks from a library and mapping them onto the shape and, finally, testing the stability of the construct. Such a shape-based, part-assembly strategy is conceptually similar to protein design through the combinatorial assembly of building blocks. If the conformational preferences of the building blocks are retained and their interactions are favorable, the nanostructure will be stable. The richness of the conformations, shapes and chemistries of the protein building blocks suggests a broad range of potential applications; at the same time, it also highlights their complexity. In this Opinion article, we focus on the first step: validating such a strategy against experimental data.
Affiliation(s)
- Chung-Jung Tsai
- Basic Research Program, SAIC-Frederick, Inc., Center for Cancer Research Nanobiology Program, NCI-Frederick, Frederick, MD 21702, USA
|
28
|
Abstract
Here our goal is to carry out nanotube design using naturally occurring protein building blocks. Inspection of the protein structural database reveals the richness of the conformations of proteins, their parts, and their chemistry. Given a target functional protein nanotube geometry, our strategy involves scanning a library of candidate building blocks, combinatorially assembling them into the shape and testing its stability. Since self-assembly takes place on time scales not affordable for computations, here we propose a strategy for the very first step in protein nanotube design: we map the candidate building blocks onto a planar sheet and wrap the sheet around a cylinder with the target dimensions. We provide examples of three nanotubes, two peptide and one protein, in atomistic model detail for which there are experimental data. The nanotube models can be used to verify a nanostructure observed by low-resolution experiments, and to study the mechanism of tube formation.
Affiliation(s)
- Chung-Jung Tsai
- Basic Research Program, SAIC-Frederick, Inc., Center for Cancer Research, Nanobiology Program, National Cancer Institute-Frederick, Frederick, Maryland, United States of America
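The geometric step of wrapping a planar sheet around a cylinder can be written as a simple coordinate transform. The sketch below assumes the sheet's x axis runs around the circumference and its y axis runs along the tube; the function name and the example radius are hypothetical, and the authors' atomistic mapping of building blocks is certainly more involved than this.

```python
import math
from typing import Tuple


def wrap_on_cylinder(x: float, y: float, radius: float) -> Tuple[float, float, float]:
    """Map a point (x, y) on a flat sheet to 3D coordinates on a cylinder.

    The arc length x becomes the angle theta = x / radius around the tube axis,
    and y becomes the position along the axis (returned as z).
    """
    theta = x / radius
    return (radius * math.cos(theta), radius * math.sin(theta), y)


radius = 10.0                      # hypothetical target radius
sheet_width = 2 * math.pi * radius  # width at which the sheet closes on itself
print(wrap_on_cylinder(0.0, 3.0, radius))
print(wrap_on_cylinder(sheet_width / 2, 0.0, radius))
```

A sheet whose width equals the circumference closes seamlessly, so the target tube dimensions fix how many building blocks fit around one turn.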
|
29
|
Wainreb G, Haspel N, Wolfson HJ, Nussinov R. A permissive secondary structure-guided superposition tool for clustering of protein fragments toward protein structure prediction via fragment assembly. Bioinformatics 2006; 22:1343-52. [PMID: 16543273 DOI: 10.1093/bioinformatics/btl098] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/15/2022]
Abstract
MOTIVATION Secondary-Structure Guided Superposition tool (SSGS) is a permissive secondary structure-based algorithm for matching of protein structures and in particular their fragments. The algorithm was developed towards protein structure prediction via fragment assembly. RESULTS In a fragment-based structural prediction scheme, a protein sequence is cut into building blocks (BBs). The BBs are assembled to predict their relative 3D arrangement. Finally, the assemblies are refined. To implement this prediction scheme, a clustered structural library representing sequence patterns for protein fragments is essential. To create a library, BBs generated by cutting proteins from the PDB are compared and structurally similar BBs are clustered. To allow structural comparison and clustering of the BBs, which are often relatively short with flexible loops, we have devised SSGS. SSGS maintains high similarity between cluster members and is highly efficient. When it comes to comparing BBs for clustering purposes, the algorithm obtains better results than other, non-secondary structure guided protein superimposition algorithms.
Affiliation(s)
- Gilad Wainreb
- Sackler Institute of Molecular Medicine, Department of Human Genetics, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv 69978, Israel
|
30
|
Jacobs DJ, Livesay DR, Hules J, Tasayco ML. Elucidating quantitative stability/flexibility relationships within thioredoxin and its fragments using a distance constraint model. J Mol Biol 2006; 358:882-904. [PMID: 16542678 PMCID: PMC4667950 DOI: 10.1016/j.jmb.2006.02.015] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.7] [Received: 10/31/2005] [Revised: 01/17/2006] [Accepted: 02/07/2006] [Indexed: 11/21/2022]
Abstract
Numerous quantitative stability/flexibility relationships within Escherichia coli thioredoxin (Trx) and its fragments are determined using a minimal distance constraint model (DCM). A one-dimensional free energy landscape as a function of global flexibility reveals Trx to fold in a low-barrier two-state process, with a voluminous transition state. Near the folding transition temperature, the native free energy basin is markedly skewed to allow partially unfolded forms. Under native conditions the skewed shape is lost, and the protein forms a compact structure with some flexibility. Predictions on ten Trx fragments are generally consistent with experimental observations that they are disordered, and that complementary fragments reconstitute. A hierarchical unfolding pathway is uncovered using an exhaustive computational procedure of breaking interfacial cross-linking hydrogen bonds that span over a series of fragment dissociations. The unfolding pathway leads to a stable core structure (residues 22-90), predicted to act as a kinetic trap. Direct connection between degree of rigidity within molecular structure and non-additivity of free energy is demonstrated using a thermodynamic cycle involving fragments and their hierarchical unfolding pathway. Additionally, the model provides insight about molecular cooperativity within Trx in its native state, and about intermediate states populating the folding/unfolding pathways. Native state cooperativity correlation plots highlight several flexibly correlated regions, giving insight into the catalytic mechanism that facilitates access to the active site disulfide bond. Residual native cooperativity correlations are present in the core substructure, suggesting that Trx can function when it is partly unfolded. This natively disordered kinetic trap, interpreted as a molten globule, has a wide temperature range of metastability, and it is identified as the "slow intermediate state" observed in kinetic experiments. These computational results are found to be in overall agreement with a large array of experimental data.
Affiliation(s)
- Donald J Jacobs
- Department of Physics and Optical Science, University of North Carolina, Charlotte, 9201 University City Blvd, Charlotte, NC 28227, USA.
|
31
|
Abstract
MOTIVATION The object of this study is to propose a new method to identify small compact units that compose protein three-dimensional structures. These fragments, called 'protein units (PU)', are a new level of description to better understand and analyze the organization of protein structures. The method works only from the contact probability matrix, i.e. the inter-Calpha distances translated into probabilities. It uses the principle of conventional hierarchical clustering, leading to a series of nested partitions of the 3D structure. Every step aims at optimally dividing a unit into 2 or 3 subunits according to a criterion called 'partition index', which assesses the structural independence of the newly defined subunits. Moreover, an entropy-derived squared correlation R is used for assessing globally the protein structure dissection. The method is compared to other splitting algorithms and shows relevant performance. AVAILABILITY An Internet server with dedicated tools is available at http://www.ebgm.jussieu.fr/~gelly/
Affiliation(s)
- Jean-Christophe Gelly
- INSERM U726, Equipe de Bioinformatique Génomique et Moléculaire (EBGM), Université Denis Diderot-Paris 7, case 7113, 75251 Paris Cedex 05, France
|
32
|
Dell'Orco D, Seeber M, De Benedetti PG, Fanelli F. Probing Fragment Complementation by Rigid-Body Docking: in Silico Reconstitution of Calbindin D9k. J Chem Inf Model 2005; 45:1429-38. [PMID: 16180920 DOI: 10.1021/ci0501995] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Indexed: 11/29/2022]
Abstract
Fragment complementation is gaining an increasing impact as a nonperturbing method to probe noncovalent interactions within protein supersecondary structures. In this study, the fast Fourier transform rigid-body docking algorithm ZDOCK has been employed for in silico reconstitution of the calcium binding protein calbindin D9k from its two EF-hand subdomains, namely, EF1 (residues 1-43) and EF2 (residues 44-75). The EF1 fragment has been used both in its wild type and in nine mutant forms, in line with in vitro experiments. Consistent with in vitro data, ZDOCK reconstituted the proper fold of wild-type and mutated calbindin, locating the nativelike structures (i.e., holding a root-mean-square deviation < 1 Å with respect to the X-ray structure) among the first 10 top-scored solutions out of 4000. Moreover, the three independent in silico reconstitutions of wild-type calbindin ranked a nativelike structure at the top of the output list, that is, the best scored one. The algorithm has also been successfully challenged in reconstituting the EF2 homodimer from two identical copies of the monomer. Furthermore, quantitative models consisting of linear correlations between thermodynamic data and ZDOCK scores were built, providing a tested tool for very fast in silico predictions of the free energy of association of protein-protein complexes solved at the atomic level and known not to undergo significant conformational changes upon binding.
Affiliation(s)
- Daniele Dell'Orco
- Department of Chemistry and Dulbecco Telethon Institute, University of Modena and Reggio Emilia, via Campi 183, 41100 Modena, Italy
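The quantitative models mentioned at the end of this abstract are linear correlations between docking scores and thermodynamic data. A minimal least-squares sketch of such a fit is given below. The function name and the numbers are invented purely for illustration: they are not ZDOCK scores or measured free energies from the study.

```python
import math
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float, float]:
    """Ordinary least-squares fit y = slope * x + intercept, plus Pearson r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept, sxy / math.sqrt(sxx * syy)


# Hypothetical docking scores and association free energies (arbitrary units).
scores = [10.0, 20.0, 30.0, 40.0]
free_energies = [-3.0, -8.0, -13.0, -18.0]
slope, intercept, r = fit_line(scores, free_energies)
# Predict the free energy of a new complex from its docking score alone.
print(slope * 25.0 + intercept, r)
```

Such a model is only as good as the assumption stated in the abstract, namely that the complexes do not change conformation significantly upon binding.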
|
33
|
Coinçon M, Heitz A, Chiche L, Derreumaux P. The βαβαβ elementary supersecondary structure of the Rossmann fold from porcine lactate dehydrogenase exhibits characteristics of a molten globule. Proteins 2005; 60:740-5. [PMID: 16001419 DOI: 10.1002/prot.20507] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/07/2022]
Abstract
Protein classifications show that the Rossmann fold, which consists of two βαβαβ motifs (BABAB) related by a rough twofold axis, is the most populated α/β fold, and that the βαβ submotif (BAB) is a widespread elementary structural arrangement. Herein, we report MD simulations, circular dichroism and NMR analyses on BAB and BABAB from porcine lactate dehydrogenase to evaluate their intrinsic stability. Our results demonstrate that BAB is not stable in solution and is not a folding nucleus. We also find that BABAB, despite its appearance of a functional and structural unit, is not an independent and thermodynamically stable folding unit. Rather, we show that BABAB retains most native secondary structure but very little tertiary structure, thus displaying characteristics of a molten globule.
Affiliation(s)
- Mathieu Coinçon
- Information Génomique et Structurale, CNRS UPR 2589, Marseille Cedex, France
|
34
|
Abstract
Utilizing concepts of protein building blocks, we propose a de novo computational algorithm that is similar to combinatorial shuffling experiments. Our goal is to engineer new naturally occurring folds with low homology to existing proteins. A selected protein is first partitioned into its building blocks based on their compactness, degree of isolation from the rest of the structure, and hydrophobicity. Next, the protein building blocks are substituted by fragments taken from other proteins with overall low sequence identity, but with a similar hydrophobic/hydrophilic pattern and a high structural similarity. These criteria ensure that the designed protein has a similar fold, low sequence identity, and a good hydrophobic core compared with its native counterpart. Here, we have selected two proteins for engineering, protein G B1 domain and ubiquitin. The two engineered proteins share approximately 20% and approximately 25% amino acid sequence identities with their native counterparts, respectively. The stabilities of the engineered proteins are tested by explicit water molecular dynamics simulations. The algorithm implements a strategy of designing a protein using relatively stable fragments, with a high population time. Here, we have selected the fragments by searching for local minima along the polypeptide chain using the protein building block model. Such an approach provides a new method for engineering new proteins with similar folds and low homology.
Affiliation(s)
- Hui-Hsu Gavin Tsai
- Basic Research Program, SAIC-Frederick, Inc., Laboratory of Experimental and Computational Biology, NCI-Frederick, Building 469, Room 145, Frederick, MD 21702, USA
|
35
|
Abstract
We address the possibility that protein folding and function are related via regions that are critical for both. This approach is based on the building blocks folding model, which describes protein folding as binding events of conformationally fluctuating building blocks. Within these, we identify building block fragments that are critical for achieving the native fold. A library of such critical building blocks (CBBs) is constructed. We then ask whether the functionally important residues fall in these CBB fragments. We find that for over two-thirds of the proteins in our library with available functional information, the catalytic or binding site residues lie within the CBB regions. From the evolutionary standpoint, a folding-function relationship is advantageous, since the need to guard against mutations is limited to one region. Furthermore, conformationally similar CBBs are found in globally unrelated proteins with different functions. Hence, substituting CBBs may lead to designed proteins with altered functions. We further find that the CBBs in our library are conformationally unstable.
Affiliation(s)
- Adi Barzilai
- Sackler Institute of Molecular Medicine, Department of Human Genetics and Molecular Medicine, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
|
36
|
Camproux AC, Gautier R, Tufféry P. A hidden Markov model derived structural alphabet for proteins. J Mol Biol 2004; 339:591-605. [PMID: 15147844 DOI: 10.1016/j.jmb.2004.04.005] [Citation(s) in RCA: 103] [Impact Index Per Article: 5.2] [Received: 10/08/2003] [Revised: 03/30/2004] [Accepted: 04/05/2004] [Indexed: 10/26/2022]
Abstract
Understanding and predicting protein structures depends on the complexity and the accuracy of the models used to represent them. We have set up a hidden Markov model that discretizes protein backbone conformation as a series of overlapping fragments (states) four residues in length. This approach simultaneously learns the geometry of the states and their connections. Using a statistical criterion, we obtain an optimal systematic decomposition of the conformational variability of the protein peptidic chain into 27 states with strong connection logic. This result is stable over different protein sets. Our model fits previous knowledge of protein architecture organisation well and seems able to capture some subtle details of protein organisation, such as helix sub-level organisation schemes. Taking into account the dependence between the states results in a description of local protein structure of low complexity. On average, the model makes use of only 8.3 states among 27 to describe each position of a protein structure. Although we use short fragments, the learning process on entire protein conformations captures the logic of the assembly on a larger scale. Using such a model, the structure of proteins can be reconstructed with an average accuracy close to 1.1 Å root-mean-square deviation and a complexity of only 3. Finally, we also observe that sequence specificity increases with the number of states of the structural alphabet. Such models can constitute a very relevant approach to the analysis of protein architecture, in particular for protein structure prediction.
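To illustrate why accounting for the dependence between states matters when encoding a chain, the following minimal Python sketch runs generic Viterbi decoding on a toy two-state model. It is a sketch under stated assumptions, not the authors' method or their 27-state alphabet: the probabilities are invented, and `log_emit[t][s]` stands in for the log-likelihood of fragment `t` under state `s`, which in the actual model would come from fragment geometry.

```python
import math


def viterbi(log_start, log_trans, log_emit):
    """Most probable state path for a sequence of fragments.

    log_start[s]    : log prior of starting in state s
    log_trans[p][s] : log probability of moving from state p to state s
    log_emit[t][s]  : log-likelihood of fragment t under state s (assumed given)
    """
    n_states = len(log_start)
    score = [log_start[s] + log_emit[0][s] for s in range(n_states)]
    back = []  # back[t-1][s] = best predecessor of state s at position t
    for t in range(1, len(log_emit)):
        new_score, pointers = [], []
        for s in range(n_states):
            prev = max(range(n_states), key=lambda p: score[p] + log_trans[p][s])
            pointers.append(prev)
            new_score.append(score[prev] + log_trans[prev][s] + log_emit[t][s])
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def to_log(rows):
    """Convert a nested list of probabilities to log space."""
    return [[math.log(p) for p in row] for row in rows]


# Toy numbers (assumptions): fragment 1 weakly prefers state 1 in isolation.
start = [math.log(0.5), math.log(0.5)]
emit = to_log([[0.9, 0.1], [0.4, 0.6], [0.9, 0.1]])
sticky = to_log([[0.8, 0.2], [0.2, 0.8]])    # states tend to persist
uniform = to_log([[0.5, 0.5], [0.5, 0.5]])   # no dependence between states

path_sticky = viterbi(start, sticky, emit)
path_uniform = viterbi(start, uniform, emit)
```

With uniform transitions the decoded path simply follows the best state at each position, whereas the persistent transitions override the weak local preference at the middle fragment and yield a single-state path.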
Affiliation(s)
- A C Camproux
- Equipe de Bioinformatique Génomique et Moléculaire, INSERM E0436, Université Paris 7, case 7113, 2 place Jussieu, 75251 Paris, France.
|