1
Peng CX, Liang F, Xia YH, Zhao KL, Hou MH, Zhang GJ. Recent Advances and Challenges in Protein Structure Prediction. J Chem Inf Model 2024; 64:76-95. [PMID: 38109487 DOI: 10.1021/acs.jcim.3c01324] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/20/2023]
Abstract
Artificial intelligence has made significant advances in the field of protein structure prediction in recent years. In particular, DeepMind's end-to-end model, AlphaFold2, has demonstrated the capability to predict three-dimensional structures of numerous unknown proteins with accuracy levels comparable to those of experimental methods. This breakthrough has opened up new possibilities for understanding protein structure and function as well as accelerating drug discovery and other applications in the field of biology and medicine. Despite the remarkable achievements of artificial intelligence in the field, there are still some challenges and limitations. In this Review, we discuss the recent progress and some of the challenges in protein structure prediction. These challenges include predicting multidomain protein structures, protein complex structures, multiple conformational states of proteins, and protein folding pathways. Furthermore, we highlight directions in which further improvements can be made.
Affiliation(s)
- Chun-Xiang Peng
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Fang Liang
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Yu-Hao Xia
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Kai-Long Zhao
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Ming-Hua Hou
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Gui-Jun Zhang
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
2
Accurate prediction of protein torsion angles using evolutionary signatures and recurrent neural network. Sci Rep 2021; 11:21033. [PMID: 34702851 PMCID: PMC8548351 DOI: 10.1038/s41598-021-00477-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/02/2021] [Accepted: 09/27/2021] [Indexed: 11/08/2022] Open
Abstract
The amino acid sequence of a protein contains all the necessary information to specify its shape, which dictates its biological activities. However, it is challenging and expensive to experimentally determine the three-dimensional structure of proteins. The backbone torsion angles play a critical role in protein structure prediction, and accurately predicting the angles can considerably advance the tertiary structure prediction by accelerating efficient sampling of the large conformational space for low energy structures. Here, for the first time, we propose evolutionary signatures computed from protein sequence profiles, and a novel recurrent architecture, termed ESIDEN, that adopts a straightforward architecture of recurrent neural networks with a small number of learnable parameters. The ESIDEN can capture efficient information from both the classic and new features benefiting from different recurrent architectures in processing information. On the other hand, compared to widely used classic features, the new features, especially the Ramachandran basin potential, provide statistical and evolutionary information to improve prediction accuracy. On four widely used benchmark datasets, the ESIDEN significantly improves the accuracy in predicting the torsion angles by comparison to the best-so-far methods. As demonstrated in the present study, the predicted angles can be used as structural constraints to accurately infer protein tertiary structures. Moreover, the proposed features would pave the way to improve machine learning-based methods in protein folding and structure prediction, as well as function prediction. The source code and data are available at the website https://kornmann.bioch.ox.ac.uk/leri/resources/download.html .
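The backbone torsion angles discussed in this abstract are dihedral angles defined by four consecutive backbone atoms. As a minimal illustration of that geometric quantity, and not of the ESIDEN method itself, the following Python sketch computes a signed dihedral angle from four 3D points. The function name `dihedral` and the toy coordinates are assumptions introduced here for illustration.

```python
import math

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for four 3D points (illustrative helper)."""
    # Bond vectors between consecutive points.
    b0 = [p1[i] - p0[i] for i in range(3)]
    b1 = [p2[i] - p1[i] for i in range(3)]
    b2 = [p3[i] - p2[i] for i in range(3)]

    def cross(a, b):
        return [a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0]]

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    # Normals of the two planes that share the central bond b1.
    n1 = cross(b0, b1)
    n2 = cross(b1, b2)
    b1_len = math.sqrt(dot(b1, b1))

    # atan2 form keeps the sign and is stable near 0 and 180 degrees.
    y = b1_len * dot(b0, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Toy coordinates (assumed): central bond along z, outer points in the xy directions.
cis = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
trans = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
quarter = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
```

For a protein backbone, φ of residue i would be obtained by passing the coordinates of C(i-1), N(i), CA(i), C(i), and ψ by passing N(i), CA(i), C(i), N(i+1).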
3
Bhatia S, Krishnamoorthy G, Udgaonkar JB. Mapping Distinct Sequences of Structure Formation Differentiating Multiple Folding Pathways of a Small Protein. J Am Chem Soc 2021; 143:1447-1457. [PMID: 33430589 DOI: 10.1021/jacs.0c11097] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 12/16/2022]
Abstract
To determine experimentally how the multiple folding pathways of a protein differ, in the order in which the structural parts are assembled, has been a long-standing challenge. To resolve whether structure formation during folding can progress in multiple ways, the complex folding landscape of monellin has been characterized, structurally and temporally, using the multisite time-resolved FRET methodology. After an initial heterogeneous polypeptide chain collapse, structure formation proceeds on parallel pathways. Kinetic analysis of the population evolution data across various protein segments provides a clear structural distinction between the parallel pathways. The analysis leads to a phenomenological model that describes how and when discrete segments acquire structure independently of each other in different subensembles of protein molecules. When averaged over all molecules, structure formation is seen to progress as α-helix formation, followed by core consolidation, then β-sheet formation, and last end-to-end distance compaction. Parts of the protein that are closer in the primary sequence acquire structure before parts separated by longer sequence.
Affiliation(s)
- Sandhya Bhatia
- National Centre for Biological Sciences, Tata Institute of Fundamental Research, Bengaluru 560 065, India; Indian Institute of Science Education and Research, Pune 411 008, India
- Jayant B Udgaonkar
- National Centre for Biological Sciences, Tata Institute of Fundamental Research, Bengaluru 560 065, India; Indian Institute of Science Education and Research, Pune 411 008, India
4
Becerra D, Butyaev A, Waldispühl J. Fast and flexible coarse-grained prediction of protein folding routes using ensemble modeling and evolutionary sequence variation. Bioinformatics 2020; 36:1420-1428. [PMID: 31584628 DOI: 10.1093/bioinformatics/btz743] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/02/2019] [Revised: 09/22/2019] [Accepted: 09/28/2019] [Indexed: 11/15/2022] Open
Abstract
MOTIVATION Protein folding is a dynamic process through which polypeptide chains reach their native 3D structures. Although the importance of this mechanism is widely acknowledged, very few high-throughput computational methods have been developed to study it. RESULTS In this paper, we report a computational platform named P3Fold that combines statistical and evolutionary information for predicting and analyzing protein folding routes. P3Fold uses coarse-grained modeling and efficient combinatorial schemes to predict residue contacts and evaluate the folding routes of a protein sequence within minutes or hours. To facilitate access to this technology, we devise graphical representations and implement an interactive web interface that allows end-users to leverage P3Fold predictions. Finally, we use P3Fold to conduct large and short scale experiments on the human proteome that reveal the broad conservation and variations of structural intermediates within protein families. AVAILABILITY AND IMPLEMENTATION A Web server of P3Fold is freely available at http://csb.cs.mcgill.ca/P3Fold. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- David Becerra
- School of Computer Science, McGill University, Montréal, QC H3A 0E9, Canada
- Alexander Butyaev
- School of Computer Science, McGill University, Montréal, QC H3A 0E9, Canada
- Jérôme Waldispühl
- School of Computer Science, McGill University, Montréal, QC H3A 0E9, Canada
5
Abstract
Proteins are molecular machines whose function depends on their ability to achieve complex folds with precisely defined structural and dynamic properties. The rational design of proteins from first-principles, or de novo, was once considered to be impossible, but today proteins with a variety of folds and functions have been realized. We review the evolution of the field from its earliest days, placing particular emphasis on how this endeavor has illuminated our understanding of the principles underlying the folding and function of natural proteins, and is informing the design of macromolecules with unprecedented structures and properties. An initial set of milestones in de novo protein design focused on the construction of sequences that folded in water and membranes to adopt folded conformations. The first proteins were designed from first-principles using very simple physical models. As computers became more powerful, the use of the rotamer approximation allowed one to discover amino acid sequences that stabilize the desired fold. As the crystallographic database of protein structures expanded in subsequent years, it became possible to construct proteins by assembling short backbone fragments that frequently recur in Nature. The second set of milestones in de novo design involves the discovery of complex functions. Proteins have been designed to bind a variety of metals, porphyrins, and other cofactors. The design of proteins that catalyze hydrolysis and oxygen-dependent reactions has progressed significantly. However, de novo design of catalysts for energetically demanding reactions, or even proteins that bind with high affinity and specificity to highly functionalized complex polar molecules remains an important challenge that is now being achieved. Finally, protein design has contributed significantly to our understanding of membrane protein folding and transport of ions across membranes.
The area of membrane protein design, or more generally of biomimetic polymers that function in mixed or non-aqueous environments, is now becoming increasingly possible.
6
Clark PL, Plaxco KW, Sosnick TR. Water as a Good Solvent for Unfolded Proteins: Folding and Collapse are Fundamentally Different. J Mol Biol 2020; 432:2882-2889. [PMID: 32044346 DOI: 10.1016/j.jmb.2020.01.031] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 10/22/2019] [Revised: 01/28/2020] [Accepted: 01/29/2020] [Indexed: 12/30/2022]
Abstract
The argument that the hydrophobic effect is the primary effect driving the folding of globular proteins is nearly universally accepted (including by the authors). But does this view also imply that water is a "poor" solvent for the unfolded states of these same proteins? Here we argue that the answer is "no," that is, folding to a well-packed, extensively hydrogen-bonded native structure differs fundamentally from the nonspecific chain collapse that defines a poor solvent. Thus, the observation that a protein folds in water does not necessitate that water is a poor solvent for its unfolded state. Indeed, chain-solvent interactions that are marginally more favorable than nonspecific intrachain interactions are beneficial to protein function because they destabilize deleterious misfolded conformations and inter-chain interactions.
Affiliation(s)
- Patricia L Clark
- Department of Chemistry and Biochemistry, University of Notre Dame, Notre Dame, IN, 46556, USA.
- Kevin W Plaxco
- Department of Chemistry and Biochemistry, University of California, Santa Barbara, CA, 93106, USA.
- Tobin R Sosnick
- Department of Biochemistry and Molecular Biology, Institute for Biophysical Dynamics, Pritzker School of Molecular Engineering, University of Chicago, Chicago, IL, 60637, USA.
7
Cheung NJ, Yu W. Sibe: a computation tool to apply protein sequence statistics to predict folding and design in silico. BMC Bioinformatics 2019; 20:455. [PMID: 31492097 PMCID: PMC6728967 DOI: 10.1186/s12859-019-2984-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 04/17/2019] [Accepted: 07/03/2019] [Indexed: 02/01/2023] Open
Abstract
BACKGROUND Evolutionary information contained in the amino acid sequences of proteins specifies the biological function and fold, but exactly what information contained in the protein sequence drives both of these processes? Considerable progress has been made to answer this fundamental question, but it remains challenging to explore the potential space of cooperative interactions between amino acids. Statistical analysis plays a significant role in studying such interactions and its use has expanded in recent years to studies ranging from coevolution-guided rational protein design to protein folding in silico. RESULTS Here we describe a computational tool named Sibe for use in studies of protein sequence, folding, and design using evolutionary coupling between amino acids as a driving factor. In this study, Sibe is used to identify positionally conserved couplings between pairwise amino acids and aid rational protein design. In this process, pairwise couplings are filtered according to the relative entropy computed from the positional conservations and grouped into several 'blocks', which could contribute to driving protein folding and design. A human β2-adrenergic receptor (β2AR) was used to demonstrate that those 'blocks' contribute to the rational design for specifying functional residues. Sibe also provides folding modules based on both the positionally conserved couplings and well-established statistical potentials for simulating protein folding in silico and predicting tertiary structure. Our results show that statistical inferences of basic evolutionary principles, such as conservations and coupled-mutations, can be used to rapidly design a diverse set of proteins and study protein folding. CONCLUSIONS The developed software Sibe provides a computational tool for systematic analysis from a protein's primary structure to its tertiary structure using the evolutionary couplings as a driving factor.
Sibe, written in C++, accounts for compatibility with the 'big data' era in biological science, and it primarily focuses on protein sequence analysis, but it is also applicable to extend to other modeling and predictions of experimental measurements.
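The abstract mentions filtering couplings by a relative entropy computed from positional conservation. As a hedged sketch of what such a conservation score can look like, and not Sibe's actual implementation, the Python snippet below computes the relative entropy (Kullback-Leibler divergence) of the residue frequencies in one alignment column against a background distribution. The function name, the two-letter toy alphabet, and the background frequencies are all assumptions made for illustration.

```python
import math
from collections import Counter

def relative_entropy(column, background):
    """KL divergence (in nats) of observed residue frequencies in one
    alignment column from the given background frequencies."""
    counts = Counter(column)
    total = len(column)
    score = 0.0
    for residue, n in counts.items():
        freq = n / total
        # Residues absent from the column contribute zero and are skipped.
        score += freq * math.log(freq / background[residue])
    return score

# Toy two-letter alphabet with a uniform background (assumed values).
background = {"A": 0.5, "G": 0.5}
conserved = relative_entropy("AAAA", background)  # fully conserved column
mixed = relative_entropy("AAGG", background)      # column matching the background
```

A fully conserved column scores highest and a column that mirrors the background scores zero, so a threshold on this value is one plausible way to keep only couplings between conserved positions.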
Affiliation(s)
- Ngaam J. Cheung
- Department of Brain and Cognitive Science, DGIST, Daegu, 42988 South Korea
- Cavendish Laboratory, Department of Physics, University of Cambridge, Cambridge, CB3 0HA UK
- Wookyung Yu
- Department of Brain and Cognitive Science, DGIST, Daegu, 42988 South Korea
- Core Protein Resources Center, DGIST, Daegu, 42988 South Korea
8
Adhikari AN. Gene-specific features enhance interpretation of mutational impact on acid α-glucosidase enzyme activity. Hum Mutat 2019; 40:1507-1518. [PMID: 31228295 DOI: 10.1002/humu.23846] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 01/14/2019] [Revised: 05/21/2019] [Accepted: 06/17/2019] [Indexed: 01/30/2023]
Abstract
We present a computational model for predicting mutational impact on enzymatic activity of human acid α-glucosidase (GAA), an enzyme associated with Pompe disease. Using a model that combines features specific to GAA with other general evolutionary and physicochemical features, we made blind predictions of enzymatic activity relative to wildtype human GAA for >300 GAA mutants, as part of the Critical Assessment of Genome Interpretation 5 GAA challenge. We found that gene-specific features can improve the performance of existing impact prediction tools that mostly rely on general features for pathogenicity prediction. The majority of the poorly predicted mutants that lower wildtype GAA enzyme activity occurred on the surface of the GAA protein. We also found that gene-specific features were uncorrelated with existing methods and provided orthogonal information for interpreting the origin of pathogenicity, particularly in variants that are poorly predicted by existing general methods. Specific variants in GAA, when investigated in the context of its protein structure, suggested gene-specific information like the disruption of local backbone torsional geometry and disruption of particular sidechain-sidechain hydrogen bonds as some potential sources for pathogenicity.
Affiliation(s)
- Aashish N Adhikari
- Department of Plant and Microbial Biology, University of California, Berkeley, California
9
Ciemny MP, Badaczewska-Dawid AE, Pikuzinska M, Kolinski A, Kmiecik S. Modeling of Disordered Protein Structures Using Monte Carlo Simulations and Knowledge-Based Statistical Force Fields. Int J Mol Sci 2019; 20:E606. [PMID: 30708941 PMCID: PMC6386871 DOI: 10.3390/ijms20030606] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Received: 12/13/2018] [Revised: 01/23/2019] [Accepted: 01/29/2019] [Indexed: 12/20/2022] Open
Abstract
The description of protein disordered states is important for understanding protein folding mechanisms and their functions. In this short review, we briefly describe a simulation approach to modeling protein interactions, which involve disordered peptide partners or intrinsically disordered protein regions, and unfolded states of globular proteins. It is based on the CABS coarse-grained protein model that uses a Monte Carlo (MC) sampling scheme and a knowledge-based statistical force field. We review several case studies showing that description of protein disordered states resulting from CABS simulations is consistent with experimental data. The case studies comprise investigations of protein-peptide binding and protein folding processes. The CABS model has been recently made available as the simulation engine of multiscale modeling tools enabling studies of protein-peptide docking and protein flexibility. Those tools offer customization of the modeling process, driving the conformational search using distance restraints, reconstruction of selected models to all-atom resolution, and simulation of large protein systems in a reasonable computational time. Therefore, CABS can be combined in integrative modeling pipelines incorporating experimental data and other modeling tools of various resolution.
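Monte Carlo sampling schemes of the kind mentioned in this abstract are commonly built on the Metropolis acceptance rule. The Python sketch below shows that generic rule on a toy one-dimensional energy function; it is not the CABS move set or force field, and the function names, the quadratic energy, the step size, and the reduced temperature units (Boltzmann constant set to 1) are assumptions chosen for illustration.

```python
import math
import random

def acceptance_probability(delta_e, temperature):
    """Metropolis rule: always accept downhill moves, accept uphill
    moves with probability exp(-delta_e / T) (reduced units, k_B = 1)."""
    if delta_e <= 0:
        return 1.0
    return math.exp(-delta_e / temperature)

def metropolis_step(state, energy, propose, temperature, rng):
    """Propose a new state and keep it or the old one per the Metropolis rule."""
    candidate = propose(state, rng)
    delta_e = energy(candidate) - energy(state)
    if rng.random() < acceptance_probability(delta_e, temperature):
        return candidate
    return state

# Toy system (assumed): a single coordinate in a harmonic well.
rng = random.Random(0)  # fixed seed so the run is reproducible
x = 5.0
for _ in range(2000):
    x = metropolis_step(
        x,
        energy=lambda v: v * v,
        propose=lambda v, r: v + r.uniform(-0.5, 0.5),
        temperature=1.0,
        rng=rng,
    )
```

In a coarse-grained protein model, the state would be a chain conformation, the proposal a local or multi-residue move, and the energy a knowledge-based statistical potential, but the accept/reject logic has the same shape.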
Affiliation(s)
- Maciej Pawel Ciemny
- Faculty of Chemistry, Biological and Chemical Research Center, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland.
- Faculty of Physics, University of Warsaw, Pasteura 5, 02-093 Warsaw, Poland.
- Monika Pikuzinska
- Faculty of Chemistry, Biological and Chemical Research Center, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland.
- Andrzej Kolinski
- Faculty of Chemistry, Biological and Chemical Research Center, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland.
- Sebastian Kmiecik
- Faculty of Chemistry, Biological and Chemical Research Center, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland.
10
Lai JK, Kubelka GS, Kubelka J. Effect of Mutations on the Global and Site-Specific Stability and Folding of an Elementary Protein Structural Motif. J Phys Chem B 2018; 122:11083-11094. [PMID: 29985619 DOI: 10.1021/acs.jpcb.8b05280] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 11/28/2022]
Abstract
Understanding the folding mechanism of proteins requires detailed knowledge of the roles of individual amino acid residues in stabilization of specific elements and local segments of the native structure. Recently, we have utilized the combination of circular dichroism (CD) and site-specific 13C isotopically edited infrared spectroscopy (IR) coupled with the Ising-like model for protein folding to map the thermal unfolding at the residue level of a de novo designed helix-turn-helix motif αtα. Here we use the same methodology to study how the sequence of local thermal unfolding is affected by selected mutations introduced into the most and least stable parts of the motif. Seven different mutants of αtα are screened to find substitutions with the most pronounced effects on the overall stability. Subsequently, thermal unfolding of two mutated αtα sequences is studied with site-specific resolution, using four distinct 13C isotopologues of each. The data are analyzed with the Ising-like model, which builds on a previous parametrization for the original αtα sequence and tests different ways of incorporating the amino acid substitution. We show that for both more and less stable mutants only the adjustment of all interaction parameters of the model can yield a satisfactory fit to the experimental data. The stabilizing and destabilizing mutations result, respectively, in a similar increase and decrease of the stability of all probed local segments, irrespective of their position with respect to the mutation site. Consequently, the relative order of their unfolding remains essentially unchanged. These results underline the importance of the interconnectivity of the stabilizing interaction network and cooperativity of the protein structure, which is evident even in a small motif with apparently noncooperative, heterogeneous unfolding. 
Overall, our findings are consistent with the native structure being the dominant factor in determining the folding mechanism, regardless of the details of its overall or local thermodynamic stabilization.
Affiliation(s)
- Jason K Lai
- Department of Chemistry, University of Wyoming, Laramie, Wyoming 82071, United States
- Ginka S Kubelka
- Department of Chemistry, University of Wyoming, Laramie, Wyoming 82071, United States
- Jan Kubelka
- Department of Chemistry, University of Wyoming, Laramie, Wyoming 82071, United States
11
Jumper JM, Faruk NF, Freed KF, Sosnick TR. Trajectory-based training enables protein simulations with accurate folding and Boltzmann ensembles in cpu-hours. PLoS Comput Biol 2018; 14:e1006578. [PMID: 30589834 PMCID: PMC6307714 DOI: 10.1371/journal.pcbi.1006578] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Received: 09/01/2017] [Accepted: 10/08/2018] [Indexed: 01/01/2023] Open
Abstract
An ongoing challenge in protein chemistry is to identify the underlying interaction energies that capture protein dynamics. The traditional trade-off in biomolecular simulation between accuracy and computational efficiency is predicated on the assumption that detailed force fields are typically well-parameterized, obtaining a significant fraction of possible accuracy. We re-examine this trade-off in the more realistic regime in which parameterization is a greater source of error than the level of detail in the force field. To address parameterization of coarse-grained force fields, we use the contrastive divergence technique from machine learning to train from simulations of 450 proteins. In our procedure, the computational efficiency of the model enables high accuracy through the precise tuning of the Boltzmann ensemble. This method is applied to our recently developed Upside model, where the free energy for side chains is rapidly calculated at every time-step, allowing for a smooth energy landscape without steric rattling of the side chains. After this contrastive divergence training, the model is able to de novo fold proteins up to 100 residues on a single core in days. This improved Upside model provides a starting point both for investigation of folding dynamics and as an inexpensive Bayesian prior for protein physics that can be integrated with additional experimental or bioinformatic data.
Affiliation(s)
- John M. Jumper
- Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, Illinois, USA
- Department of Chemistry, and The James Franck Institute, University of Chicago, Chicago, Illinois, USA
- Nabil F. Faruk
- Graduate Program in Biophysical Sciences, University of Chicago, Chicago, Illinois, USA
- Karl F. Freed
- Department of Chemistry, and The James Franck Institute, University of Chicago, Chicago, Illinois, USA
- Tobin R. Sosnick
- Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, Illinois, USA
- Institute for Biophysical Dynamics, University of Chicago, Chicago, Illinois, USA
12
Wang Z, Jumper JM, Wang S, Freed KF, Sosnick TR. A Membrane Burial Potential with H-Bonds and Applications to Curved Membranes and Fast Simulations. Biophys J 2018; 115:1872-1884. [PMID: 30413241 DOI: 10.1016/j.bpj.2018.10.012] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 05/25/2018] [Revised: 09/21/2018] [Accepted: 10/10/2018] [Indexed: 10/28/2022] Open
Abstract
We use the statistics of a large and curated training set of transmembrane helical proteins to develop a knowledge-based potential that accounts for the dependence on both the depth of burial of the protein in the membrane and the degree of side-chain exposure. Additionally, the statistical potential includes depth-dependent energies for unsatisfied backbone hydrogen bond donors and acceptors, which are found to be relatively small, ∼2 RT. Our potential accurately places known proteins within the bilayer. The potential is applied to the mechanosensing MscL channel in membranes of varying thickness and curvature, as well as to the prediction of protein structure. The potential is incorporated into our new Upside molecular dynamics algorithm. Notably, we account for the exchange of protein-lipid interactions for protein-protein interactions as helices contact each other, thereby avoiding overestimating the energetics of helix association within the membrane. Simulations of most multimeric complexes find that isolated monomers and the oligomers retain the same orientation in the membrane, suggesting that the assembly of prepositioned monomers presents a viable mechanism of oligomerization.
Affiliation(s)
- Zongan Wang
- Department of Chemistry, The University of Chicago, Chicago, Illinois; James Franck Institute, The University of Chicago, Chicago, Illinois
- John M Jumper
- Department of Chemistry, The University of Chicago, Chicago, Illinois; James Franck Institute, The University of Chicago, Chicago, Illinois; Department of Biochemistry and Molecular Biology, The University of Chicago, Chicago, Illinois
- Sheng Wang
- Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia; Toyota Technological Institute at Chicago, Chicago, Illinois
- Karl F Freed
- Department of Chemistry, The University of Chicago, Chicago, Illinois; James Franck Institute, The University of Chicago, Chicago, Illinois.
- Tobin R Sosnick
- Department of Biochemistry and Molecular Biology, The University of Chicago, Chicago, Illinois; Institute for Biophysical Dynamics, The University of Chicago, Chicago, Illinois.
13
Jacobs WM, Shakhnovich EI. Accurate Protein-Folding Transition-Path Statistics from a Simple Free-Energy Landscape. J Phys Chem B 2018; 122:11126-11136. [PMID: 30091592 DOI: 10.1021/acs.jpcb.8b05842] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 11/28/2022]
Abstract
A central goal of protein-folding theory is to predict the stochastic dynamics of transition paths (the rare trajectories that transit between the folded and unfolded ensembles) using only thermodynamic information, such as a low-dimensional equilibrium free-energy landscape. However, commonly used one-dimensional landscapes typically fall short of this aim, because an empirical coordinate-dependent diffusion coefficient has to be fit to transition-path trajectory data in order to reproduce the transition-path dynamics. We show that an alternative, first-principles free-energy landscape predicts transition-path statistics that agree well with simulations and single-molecule experiments without requiring dynamical data as an input. This "topological configuration" model assumes that distinct, native-like substructures assemble on a time scale that is slower than native-contact formation but faster than the folding of the entire protein. Using only equilibrium simulation data to determine the free energies of these coarse-grained intermediate states, we predict a broad distribution of transition-path transit times that agrees well with the transition-path durations observed in simulations. We further show that both the distribution of finite-time displacements on a one-dimensional order parameter and the ensemble of transition-path trajectories generated by the model are consistent with the simulated transition paths. These results indicate that a landscape based on transient folding intermediates, which are often hidden by one-dimensional projections, can form the basis of a predictive model of protein-folding transition-path dynamics.
Affiliation(s)
- William M Jacobs
- Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, Massachusetts 02138, United States
- Eugene I Shakhnovich
- Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, Massachusetts 02138, United States
14
Wang T, Yang Y, Zhou Y, Gong H. LRFragLib: an effective algorithm to identify fragments for de novo protein structure prediction. Bioinformatics 2017; 33:677-684. [PMID: 27797773 DOI: 10.1093/bioinformatics/btw668] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 06/23/2016] [Accepted: 10/18/2016] [Indexed: 11/13/2022] Open
Abstract
Motivation The quality of the fragment library determines the efficiency of fragment assembly, an approach that is widely used in most de novo protein-structure prediction algorithms. Conventional fragment libraries are constructed mainly based on the identities of amino acids, sometimes facilitated by predicted information including dihedral angles and secondary structures. However, it remains challenging to identify near-native fragment structures with low sequence homology. Results We introduce a novel fragment-library-construction algorithm, LRFragLib, to improve the detection of near-native low-homology fragments of 7-10 residues, using a multi-stage, flexible selection protocol. Based on logistic regression scoring models, LRFragLib outperforms existing techniques by achieving a significantly higher precision and a comparable coverage on recent CASP protein sets in sampling near-native structures. The method also has a comparable computational efficiency to the fastest existing techniques with substantially reduced memory usage. Availability and Implementation The source code is available for download at http://166.111.152.91/Downloads.html. Contact hgong@tsinghua.edu.cn. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Tong Wang
- MOE Key Laboratory of Bioinformatics, School of Life Sciences; Beijing Innovation Center of Structural Biology, Tsinghua University, Beijing 100084, China
| | - Yuedong Yang
- Institute for Glycomics and School of Information and Communication Technology, Griffith University, Gold Coast, QLD 4222, Australia
| | - Yaoqi Zhou
- Institute for Glycomics and School of Information and Communication Technology, Griffith University, Gold Coast, QLD 4222, Australia
| | - Haipeng Gong
- MOE Key Laboratory of Bioinformatics, School of Life Sciences; Beijing Innovation Center of Structural Biology, Tsinghua University, Beijing 100084, China
| |
|
15
|
Carpinteri A, Lacidogna G, Piana G, Bassani A. Terahertz mechanical vibrations in lysozyme: Raman spectroscopy vs modal analysis. J Mol Struct 2017. [DOI: 10.1016/j.molstruc.2017.02.099] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 10/19/2022]
|
16
|
Abstract
We consider the differences between the many-pathway protein folding model derived from theoretical energy landscape considerations and the defined-pathway model derived from experiment. A basic tenet of the energy landscape model is that proteins fold through many heterogeneous pathways by way of amino acid-level dynamics biased toward selecting native-like interactions. The many pathways imagined in the model are not observed in the structure-formation stage of folding by experiments that would have found them, but they have now been detected and characterized for one protein in the initial prenucleation stage. Analysis presented here shows that these many microscopic trajectories are not distinct in any functionally significant way, and they have neither the structural information nor the biased energetics needed to select native vs. nonnative interactions during folding. The opposed defined-pathway model stems from experimental results that show that proteins are assemblies of small cooperative units called foldons and that a number of proteins fold in a reproducible pathway one foldon unit at a time. Thus, the same foldon interactions that encode the native structure of any given protein also naturally encode its particular foldon-based folding pathway, and they collectively sum to produce the energy bias toward native interactions that is necessary for efficient folding. Available information suggests that quantized native structure and stepwise folding coevolved in ancient repeat proteins and were retained as a functional pair due to their utility for solving the difficult protein folding problem.
|
17
|
Anderson JM, Jurban B, Huggins KNL, Shcherbakov AA, Shu I, Kier B, Andersen NH. Nascent Hairpins in Proteins: Identifying Turn Loci and Quantitating Turn Contributions to Hairpin Stability. Biochemistry 2016; 55:5537-5553. [DOI: 10.1021/acs.biochem.6b00732] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Indexed: 12/21/2022]
Affiliation(s)
- Jordan M. Anderson
- Department of Chemistry, University of Washington, Seattle, Washington 98105, United States
| | - Brice Jurban
- Department of Chemistry, University of Washington, Seattle, Washington 98105, United States
| | - Kelly N. L. Huggins
- Department of Chemistry, University of Washington, Seattle, Washington 98105, United States
| | | | - Irene Shu
- Department of Chemistry, University of Washington, Seattle, Washington 98105, United States
| | - Brandon Kier
- Department of Chemistry, University of Washington, Seattle, Washington 98105, United States
| | - Niels H. Andersen
- Department of Chemistry, University of Washington, Seattle, Washington 98105, United States
| |
|
18
|
Wang F, Cazzolli G, Wintrode P, Faccioli P. Folding Mechanism of Proteins Im7 and Im9: Insight from All-Atom Simulations in Implicit and Explicit Solvent. J Phys Chem B 2016; 120:9297-307. [PMID: 27532482 DOI: 10.1021/acs.jpcb.6b05819] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Indexed: 02/03/2023]
Abstract
Im7 and Im9 are evolutionarily related proteins with almost identical native structures. In spite of their structural similarity, experiments show that Im7 folds through a long-lived on-pathway intermediate, while Im9 folds according to two-state kinetics. In this work, we use a recently developed enhanced path sampling method to generate many folding trajectories for these proteins, using realistic atomistic force fields, in both implicit and explicit solvent. Overall, our results are in good agreement with the experimental ϕ values and with the results of ϕ-value-restrained molecular dynamics (MD) simulations. However, our implicit solvent simulations fail to predict a qualitative difference in the folding pathways of Im7 and Im9. In contrast, our simulations in explicit solvent correctly reproduce the fact that only protein Im7 folds through an on-pathway intermediate. By analyzing our atomistic trajectories, we provide a physical picture that explains the observed difference in the folding kinetics of these chains.
Affiliation(s)
- F Wang
- Department of Pharmaceutical Sciences, University of Maryland School of Pharmacy, Baltimore, Maryland 21201, United States
| | - G Cazzolli
- Physics Department, University of Trento, via Sommarive 14 Povo, Trento 38128, Italy
| | - P Wintrode
- Department of Pharmaceutical Sciences, University of Maryland School of Pharmacy, Baltimore, Maryland 21201, United States
| | - P Faccioli
- Physics Department, University of Trento, via Sommarive 14 Povo, Trento 38128, Italy
| |
|
19
|
Kmiecik S, Gront D, Kolinski M, Wieteska L, Dawid AE, Kolinski A. Coarse-Grained Protein Models and Their Applications. Chem Rev 2016; 116:7898-936. [DOI: 10.1021/acs.chemrev.6b00163] [Citation(s) in RCA: 555] [Impact Index Per Article: 69.4] [Indexed: 02/07/2023]
Affiliation(s)
- Sebastian Kmiecik
- Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
| | - Dominik Gront
- Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
| | - Michal Kolinski
- Bioinformatics Laboratory, Mossakowski Medical Research Center of the Polish Academy of Sciences, Pawinskiego 5, 02-106 Warsaw, Poland
| | - Lukasz Wieteska
- Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
- Department of Medical Biochemistry, Medical University of Lodz, Mazowiecka 6/8, 92-215 Lodz, Poland
| | | | - Andrzej Kolinski
- Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
| |
|
20
|
Duan M, Liu H, Li M, Huo S. Network representation of conformational transitions between hidden intermediates of Rd-apocytochrome b562. J Chem Phys 2016; 143:135101. [PMID: 26450332 DOI: 10.1063/1.4931921] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 11/14/2022] Open
Abstract
The folding kinetics of Rd-apocytochrome b562 is two-state, but native-state hydrogen exchange experiments show that there are discrete partially unfolded (PUF) structures in equilibrium with the native state. These PUF structures are called hidden intermediates because they are not detected in kinetic experiments and they exist after the rate-limiting step. Structures of the mimics of hidden intermediates of Rd-apocytochrome b562 have been resolved by NMR. Based upon their relative stability and structural features, the folding mechanism was proposed to follow a specific pathway (unfolded → rate-limiting transition state → PUF1 → PUF2 → native). Investigating the roles of equilibrium PUF structures in folding kinetics and their interrelationship not only deepens our understanding of the details of the folding mechanism but also provides guidance for protein design and the prevention of misfolding. We performed molecular dynamics simulations starting from a hidden intermediate and the native state of Rd-apocytochrome b562 in explicit solvent, for a total of 37.18 μs, mainly with Anton. We validated our simulations by detailed comparison with experimental data and other computations. We have verified that we sampled the post rate-limiting transition state region only. A Markov state model was used to analyze the simulation results. We replace the specific pathway model with a network model. Transition-path theory was employed to calculate the net effective flux from the most unfolded state towards the most folded state in the network. The proposed sequential folding pathway via PUF1 and then the more stable, more native-like PUF2 is one of the routes in our network, but it is not dominant. The dominant path visits PUF2 without going through PUF1. There is also a route from PUF1 directly to the most folded state in the network without visiting PUF2. Our results indicate that the PUF states are not necessarily sequential in the folding. The major routes predicted in our network are testable by future experiments, such as single-molecule experiments.
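The network analysis described in this abstract can be illustrated with a minimal sketch. It assumes a reversible four-state Markov state model built from a symmetric count matrix and computes the forward committor and the net reactive flux of transition-path theory. The state labels, the count values in `COUNTS`, and the function name `net_fluxes` are hypothetical and chosen only for illustration; none of them are taken from the paper.

```python
# Toy transition-path-theory calculation on a reversible 4-state model.
# States (hypothetical labels): 0 = most unfolded, 1 = PUF1, 2 = PUF2, 3 = native.
# COUNTS is an invented symmetric count matrix, so detailed balance holds
# by construction and the stationary distribution is proportional to row sums.
COUNTS = [
    [80, 5, 15, 0],
    [5, 60, 10, 5],
    [15, 10, 50, 25],
    [0, 5, 25, 70],
]


def net_fluxes(counts, source, sink, sweeps=2000):
    """Return the forward committor and the net reactive flux per edge."""
    n = len(counts)
    row_sums = [sum(row) for row in counts]
    total = sum(row_sums)
    pi = [r / total for r in row_sums]  # stationary distribution
    T = [[counts[i][j] / row_sums[i] for j in range(n)] for i in range(n)]

    # Forward committor: q = 0 on the source, q = 1 on the sink,
    # and q_i = sum_j T_ij q_j elsewhere (solved by Gauss-Seidel sweeps).
    q = [0.0] * n
    q[sink] = 1.0
    for _ in range(sweeps):
        for i in range(n):
            if i in (source, sink):
                continue
            off_diag = sum(T[i][j] * q[j] for j in range(n) if j != i)
            q[i] = off_diag / (1.0 - T[i][i])

    # For a reversible chain the net flux i -> j is pi_i T_ij (q_j - q_i)
    # whenever q_j > q_i, and zero otherwise.
    flux = {}
    for i in range(n):
        for j in range(n):
            if i != j and q[j] > q[i]:
                f = pi[i] * T[i][j] * (q[j] - q[i])
                if f > 0.0:
                    flux[(i, j)] = f
    return q, flux


q, flux = net_fluxes(COUNTS, source=0, sink=3)
for edge, f in sorted(flux.items(), key=lambda item: -item[1]):
    print(edge, round(f, 5))
```

With these invented counts, most of the net flux runs from state 0 to state 2 and on to state 3, while smaller fluxes pass through state 1, which is the kind of non-sequential route structure a network model can express and a single fixed pathway cannot.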
Affiliation(s)
- Mojie Duan
- Gustaf H. Carlson School of Chemistry and Biochemistry, Clark University, 950 Main Street, Worcester, Massachusetts 01610, USA
| | - Hanzhong Liu
- Gustaf H. Carlson School of Chemistry and Biochemistry, Clark University, 950 Main Street, Worcester, Massachusetts 01610, USA
| | - Minghai Li
- Gustaf H. Carlson School of Chemistry and Biochemistry, Clark University, 950 Main Street, Worcester, Massachusetts 01610, USA
| | - Shuanghong Huo
- Gustaf H. Carlson School of Chemistry and Biochemistry, Clark University, 950 Main Street, Worcester, Massachusetts 01610, USA
| |
|
21
|
Cooperative folding near the downhill limit determined with amino acid resolution by hydrogen exchange. Proc Natl Acad Sci U S A 2016; 113:4747-52. [PMID: 27078098 DOI: 10.1073/pnas.1522500113] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 12/22/2022] Open
Abstract
The relationship between folding cooperativity and downhill, or barrier-free, folding of proteins under highly stabilizing conditions remains an unresolved topic, especially for proteins such as λ-repressor that fold on the microsecond timescale. Under aqueous conditions where downhill folding is most likely to occur, we measure the stability of multiple H bonds, using hydrogen exchange (HX) in a λYA variant that is suggested to be an incipient downhill folder having an extrapolated folding rate constant of 2 × 10⁵ s⁻¹ and a stability of 7.4 kcal·mol⁻¹ at 298 K. At least one H bond on each of the three largest helices (α1, α3, and α4) breaks during a common unfolding event that reflects global denaturation. The use of HX enables us to both examine folding under highly stabilizing, native-like conditions and probe the pretransition state region for stable species without the need to initiate the folding reaction. The equivalence of the stability determined at zero and high denaturant indicates that any residual denatured state structure minimally affects the stability even under native conditions. Using our ψ analysis method along with mutational ϕ analysis, we find that the three aforementioned helices are all present in the folding transition state. Hence, the free energy surface has a sufficiently high barrier separating the denatured and native states that folding appears cooperative even under extremely stable and fast folding conditions.
|
22
|
Faísca PF. Knotted proteins: A tangled tale of Structural Biology. Comput Struct Biotechnol J 2015; 13:459-68. [PMID: 26380658 PMCID: PMC4556803 DOI: 10.1016/j.csbj.2015.08.003] [Citation(s) in RCA: 58] [Impact Index Per Article: 6.4] [Received: 05/28/2015] [Revised: 07/31/2015] [Accepted: 08/07/2015] [Indexed: 01/19/2023] Open
Abstract
Knotted proteins have their native structures arranged in the form of an open knot. In the last ten years researchers have been making significant efforts to reveal their folding mechanism and understand which functional advantage(s) knots convey to their carriers. Molecular simulations have been playing a fundamental role in this endeavor, and early computational predictions about the knotting mechanism have just been confirmed in wet lab experiments. Here we review a collection of simulation results that allow outlining the current status of the field of knotted proteins, and discuss directions for future research.
|
23
|
Sequence, structure, and cooperativity in folding of elementary protein structural motifs. Proc Natl Acad Sci U S A 2015. [PMID: 26216963 DOI: 10.1073/pnas.1506309112] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Indexed: 11/18/2022] Open
Abstract
Residue-level unfolding of two helix-turn-helix proteins, one naturally occurring and one de novo designed, is reconstructed from multiple sets of site-specific ¹³C isotopically edited infrared (IR) and circular dichroism (CD) data using Ising-like statistical-mechanical models. Several model variants are parameterized to test the importance of sequence-specific interactions (approximated by Miyazawa-Jernigan statistical potentials), local structural flexibility (derived from the ensemble of NMR structures), interhelical hydrogen bonds, and native contacts separated by intervening disordered regions (through the Wako-Saitô-Muñoz-Eaton scheme, which disallows such configurations). The models are optimized by directly simulating experimental observables: CD ellipticity at 222 nm for model proteins and their fragments and ¹³C-amide I' bands for multiple isotopologues of each protein. We find that data can be quantitatively reproduced by the model that allows two interacting segments flanking a disordered loop (double sequence approximation) and incorporates flexibility in the native contact maps, but neither sequence-specific interactions nor hydrogen bonds are required. The near-identical free energy profiles as a function of the global order parameter are consistent with expected similar folding kinetics for nearly identical structures. However, the predicted folding mechanism for the two motifs is different, reflecting the order of local stability. We introduce free energy profiles for "experimental" reaction coordinates, namely the degree of local folding as sensed by site-specific ¹³C-edited IR, which highlight folding heterogeneity and contrast its overall, average description with the detailed, local picture.
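For orientation, the Wako-Saitô-Muñoz-Eaton scheme mentioned in this abstract is commonly written in the following textbook form. This is a sketch of the standard model, not the specific parameterization fitted in the paper; the symbols $m_k$, $\Delta_{ij}$, $\varepsilon_{ij}$, and $\Delta S_i$ are introduced here as assumptions for illustration.

```latex
% m_k = 1 if residue k is in its native conformation, 0 otherwise
% \Delta_{ij} = 1 if residues i and j are in contact in the native structure
% \varepsilon_{ij} < 0 : contact energy;  \Delta S_i < 0 : entropy cost of ordering residue i
\[
  H(\{m\}) \;=\; \sum_{i<j} \varepsilon_{ij}\,\Delta_{ij} \prod_{k=i}^{j} m_k ,
  \qquad
  F(\{m\}) \;=\; H(\{m\}) \;-\; T \sum_{i} \Delta S_i\, m_i .
\]
```

The product over $m_k$ is what disallows native contacts across a disordered stretch: a contact between residues $i$ and $j$ contributes only if every residue from $i$ to $j$ is native-like.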
|
24
|
Even with nonnative interactions, the updated folding transition states of the homologs Proteins G & L are extensive and similar. Proc Natl Acad Sci U S A 2015; 112:8302-7. [PMID: 26100906 DOI: 10.1073/pnas.1503613112] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Indexed: 01/16/2023] Open
Abstract
Experimental and computational folding studies of Proteins L & G and NuG2 typically find that sequence differences determine which of the two hairpins is formed in the transition state ensemble (TSE). However, our recent work on Protein L finds that its TSE contains both hairpins, compelling a reassessment of the influence of sequence on the folding behavior of the other two homologs. We characterize the TSEs for Protein G and NuG2b, a triple mutant of NuG2, using ψ analysis, a method for identifying contacts in the TSE. All three homologs are found to share a common and near-native TSE topology with interactions between all four strands. However, the helical content varies in the TSE, being largely absent in Proteins G & L but partially present in NuG2b. The variability likely arises from competing propensities for the formation of nonnative β turns in the naturally occurring proteins, as observed in our TerItFix folding algorithm. All-atom folding simulations of NuG2b recapitulate the observed TSEs with four strands for 5 of 27 transition paths [Lindorff-Larsen K, Piana S, Dror RO, Shaw DE (2011) Science 334(6055):517-520]. Our data support the view that homologous proteins have similar folding mechanisms, even when nonnative interactions are present in the transition state. These findings emphasize the ongoing challenge of accurately characterizing and predicting TSEs, even for relatively simple proteins.
|
25
|
Chen T, Chan HS. Native contact density and nonnative hydrophobic effects in the folding of bacterial immunity proteins. PLoS Comput Biol 2015; 11:e1004260. [PMID: 26016652 PMCID: PMC4446218 DOI: 10.1371/journal.pcbi.1004260] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.7] [Received: 02/19/2015] [Accepted: 03/29/2015] [Indexed: 11/18/2022] Open
Abstract
The bacterial colicin-immunity proteins Im7 and Im9 fold by different mechanisms. Experimentally, at pH 7.0 and 10°C, Im7 folds in a three-state manner via an intermediate but Im9 folding is two-state-like. Accordingly, Im7 exhibits a chevron rollover, whereas the chevron arm for Im9 folding is linear. Here we address the biophysical basis of their different behaviors by using native-centric models with and without additional transferable, sequence-dependent energies. The Im7 chevron rollover is not captured by either a pure native-centric model or a model augmented by nonnative hydrophobic interactions with a uniform strength irrespective of residue type. By contrast, a more realistic nonnative interaction scheme that accounts for the difference in hydrophobicity among residues leads simultaneously to a chevron rollover for Im7 and an essentially linear folding chevron arm for Im9. Hydrophobic residues identified by published experiments to be involved in nonnative interactions during Im7 folding are found to participate in the strongest nonnative contacts in this model. Thus our observations support the experimental perspective that the Im7 folding intermediate is largely underpinned by nonnative interactions involving large hydrophobics. Our simulation suggests further that nonnative effects in Im7 are facilitated by a lower local native contact density relative to that of Im9. In a one-dimensional diffusion picture of Im7 folding with a coordinate- and stability-dependent diffusion coefficient, a significant chevron rollover is consistent with a diffusion coefficient that depends strongly on native stability at the conformational position of the folding intermediate.
In order to fold correctly, a globular protein must avoid being trapped in wrong, i.e., nonnative conformations. Thus a biophysical account of how attractive nonnative interactions are bypassed by some amino acid sequences but not others is key to deciphering protein structure and function. We examine two closely related bacterial immunity proteins, Im7 and Im9, that are experimentally known to fold very differently: Whereas Im9 folds directly, Im7 folds through a mispacked conformational intermediate. A simple model we developed accounts for their intriguingly different folding kinetics in terms of a balance between the density of native-promoting contacts and the hydrophobicity of local amino acid sequences. This emergent principle is extensible to other biomolecular recognition processes.
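As background for the chevron terminology used in this abstract, the standard two-state expression for the observed relaxation rate as a function of denaturant concentration is sketched below. It is the generic textbook relation, not a formula taken from the paper; the symbols $k_f^{0}$, $k_u^{0}$, $m_f$, and $m_u$ are assumed names for the rate constants in the absence of denaturant and the kinetic m-values.

```latex
% [D]: denaturant concentration; m_f, m_u >= 0: kinetic m-values
\[
  \ln k_{\mathrm{obs}}([D]) \;=\;
  \ln\!\left[\, k_f^{0}\, e^{-m_f [D]/RT} \;+\; k_u^{0}\, e^{\,m_u [D]/RT} \right]
\]
```

In a two-state folder each exponential dominates on one side of the midpoint, giving two linear arms in a plot of $\ln k_{\mathrm{obs}}$ against $[D]$. A chevron rollover is a downward deviation of the folding arm from this linear extrapolation at low denaturant, which is the feature reported for Im7 and absent for Im9.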
Affiliation(s)
- Tao Chen
- Departments of Biochemistry, of Molecular Genetics, and of Physics, University of Toronto, Toronto, Ontario M5S 1A8, Canada
| | - Hue Sun Chan
- Departments of Biochemistry, of Molecular Genetics, and of Physics, University of Toronto, Toronto, Ontario M5S 1A8, Canada
| |
|
26
|
Chen T, Song J, Chan HS. Theoretical perspectives on nonnative interactions and intrinsic disorder in protein folding and binding. Curr Opin Struct Biol 2014; 30:32-42. [PMID: 25544254 DOI: 10.1016/j.sbi.2014.12.002] [Citation(s) in RCA: 64] [Impact Index Per Article: 6.4] [Received: 09/29/2014] [Revised: 12/02/2014] [Accepted: 12/02/2014] [Indexed: 11/29/2022]
Abstract
The diverse biological functions of intrinsically disordered proteins (IDPs) have markedly raised our appreciation of protein conformational versatility, whereas the existence of energetically favorable yet functionally detrimental nonnative interactions underscores the physical limitations of evolutionary optimization. Here we survey recent advances in using biophysical modeling to gain insight into experimentally observed nonnative behaviors and IDP properties. Simulations of IDP interactions to date focus mostly on coupled folding-binding, which follows essentially the same organizing principle as the local-nonlocal coupling mechanism in cooperative folding of monomeric globular proteins. By contrast, more innovative theories of electrostatic and aromatic interactions are needed for the conceptually novel but less-explored 'fuzzy' complexes in which the functionally bound IDPs remain largely disordered.
Affiliation(s)
- Tao Chen
- Department of Biochemistry, University of Toronto, Toronto, Ontario M5S 1A8, Canada; Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 1A8, Canada; Department of Physics, University of Toronto, Toronto, Ontario M5S 1A7, Canada
| | - Jianhui Song
- Department of Biochemistry, University of Toronto, Toronto, Ontario M5S 1A8, Canada; Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 1A8, Canada; Department of Physics, University of Toronto, Toronto, Ontario M5S 1A7, Canada
| | - Hue Sun Chan
- Department of Biochemistry, University of Toronto, Toronto, Ontario M5S 1A8, Canada; Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 1A8, Canada; Department of Physics, University of Toronto, Toronto, Ontario M5S 1A7, Canada.
| |
|
27
|
Zhou CY, Jiang F, Wu YD. Residue-Specific Force Field Based on Protein Coil Library. RSFF2: Modification of AMBER ff99SB. J Phys Chem B 2014; 119:1035-47. [DOI: 10.1021/jp5064676] [Citation(s) in RCA: 78] [Impact Index Per Article: 7.8] [Indexed: 12/22/2022]
Affiliation(s)
- Chen-Yang Zhou
- College of Chemistry and Molecular Engineering, Peking University, Beijing, 100871, China
| | - Fan Jiang
- Laboratory of Computational Chemistry and Drug Design, Laboratory of Chemical Genomics, Peking University Shenzhen Graduate School, Shenzhen, 518055, China
| | - Yun-Dong Wu
- College of Chemistry and Molecular Engineering, Peking University, Beijing, 100871, China
- Laboratory of Computational Chemistry and Drug Design, Laboratory of Chemical Genomics, Peking University Shenzhen Graduate School, Shenzhen, 518055, China
| |
|
28
|
Rollins GC, Dill KA. General mechanism of two-state protein folding kinetics. J Am Chem Soc 2014; 136:11420-7. [PMID: 25056406 DOI: 10.1021/ja5049434] [Citation(s) in RCA: 50] [Impact Index Per Article: 5.0] [Indexed: 12/17/2022]
Abstract
We describe here a general model of the kinetic mechanism of protein folding. In the Foldon Funnel Model, proteins fold in units of secondary structures, which form sequentially along the folding pathway, stabilized by tertiary interactions. The model predicts that the free energy landscape has a volcano shape, rather than a simple funnel, that folding is two-state (single-exponential) when secondary structures are intrinsically unstable, and that each structure along the folding path is a transition state for the previous structure. It shows how sequential pathways are consistent with multiple stochastic routes on funnel landscapes, and it gives good agreement with the 9-orders-of-magnitude dependence of folding rates on protein size for a set of 93 proteins, while remaining consistent with the near independence of the folding equilibrium constant on size. This model gives estimates of folding rates of proteomes, leading to a median folding time in Escherichia coli of about 5 s.
Affiliation(s)
- Geoffrey C Rollins
- Department of Biochemistry and Biophysics, University of California, San Francisco, California 94143, United States
| | | |
|
29
|
Shao Q. Probing Sequence Dependence of Folding Pathway of α-Helix Bundle Proteins through Free Energy Landscape Analysis. J Phys Chem B 2014; 118:5891-900. [DOI: 10.1021/jp5043393] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Indexed: 02/07/2023]
Affiliation(s)
- Qiang Shao
- Drug Discovery and Design Center, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
| |
|
30
|
Kubelka GS, Kubelka J. Site-Specific Thermodynamic Stability and Unfolding of a de Novo Designed Protein Structural Motif Mapped by 13C Isotopically Edited IR Spectroscopy. J Am Chem Soc 2014; 136:6037-48. [DOI: 10.1021/ja500918k] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Indexed: 11/29/2022]
Affiliation(s)
- Ginka S. Kubelka
- Department of Chemistry, University of Wyoming, Laramie, Wyoming 82071, United States
| | - Jan Kubelka
- Department of Chemistry, University of Wyoming, Laramie, Wyoming 82071, United States
| |
|
31
|
Piana S, Klepeis JL, Shaw DE. Assessing the accuracy of physical models used in protein-folding simulations: quantitative evidence from long molecular dynamics simulations. Curr Opin Struct Biol 2014; 24:98-105. [DOI: 10.1016/j.sbi.2013.12.006] [Citation(s) in RCA: 294] [Impact Index Per Article: 29.4] [Received: 11/15/2013] [Revised: 12/19/2013] [Accepted: 12/20/2013] [Indexed: 01/15/2023]
|
32
|
|
33
|
Simoncini D, Zhang KYJ. Efficient sampling in fragment-based protein structure prediction using an estimation of distribution algorithm. PLoS One 2013; 8:e68954. [PMID: 23935913 PMCID: PMC3723781 DOI: 10.1371/journal.pone.0068954] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Received: 04/18/2013] [Accepted: 06/07/2013] [Indexed: 11/19/2022] Open
Abstract
Fragment assembly is a powerful method of protein structure prediction that builds protein models from a pool of candidate fragments taken from known structures. Stochastic sampling is subsequently used to refine the models. The structures are first represented as coarse-grained models and then as all-atom models for computational efficiency. Many models have to be generated independently due to the stochastic nature of the sampling methods used to search for the global minimum in a complex energy landscape. In this paper we present EdaFold(AA), a fragment-based approach which shares information between the generated models and steers the search towards native-like regions. A distribution over fragments is estimated from a pool of low energy all-atom models. This iteratively-refined distribution is used to guide the selection of fragments during the building of models for subsequent rounds of structure prediction. The use of an estimation of distribution algorithm enabled EdaFold(AA) to reach lower energy levels and to generate a higher percentage of near-native models. [Formula: see text] uses an all-atom energy function and produces models with atomic resolution. We observed an improvement in energy-driven blind selection of models on a benchmark of EdaFold(AA) in comparison with the [Formula: see text] AbInitioRelax protocol.
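The estimation-of-distribution idea in this abstract (estimate a distribution over fragments from a pool of low-energy models, then use it to guide the next round of sampling) can be sketched with a toy example. This is not the EdaFold(AA) implementation: the per-position categorical distribution, the mismatch-count energy, the hidden `TARGET`, and all function and parameter names below are assumptions made purely for illustration.

```python
import random


def eda_fragment_search(n_positions, n_fragments, energy, rounds=30,
                        models_per_round=200, elite_fraction=0.1,
                        smoothing=0.05, seed=0):
    """Toy estimation-of-distribution search over per-position fragment choices."""
    rng = random.Random(seed)
    fragment_ids = list(range(n_fragments))
    # Start from a uniform distribution over fragments at every position.
    probs = [[1.0 / n_fragments] * n_fragments for _ in range(n_positions)]
    n_elite = max(1, int(elite_fraction * models_per_round))
    best = None
    for _ in range(rounds):
        # Build models by sampling one fragment per position.
        models = [
            [rng.choices(fragment_ids, weights=probs[p])[0]
             for p in range(n_positions)]
            for _ in range(models_per_round)
        ]
        models.sort(key=energy)
        if best is None or energy(models[0]) < energy(best):
            best = models[0]
        # Re-estimate the distribution from the low-energy pool, mixing in
        # a little uniform probability so that no fragment is ever excluded.
        elite = models[:n_elite]
        for p in range(n_positions):
            counts = [0] * n_fragments
            for model in elite:
                counts[model[p]] += 1
            probs[p] = [
                (1.0 - smoothing) * c / n_elite + smoothing / n_fragments
                for c in counts
            ]
    return best, probs


# Hypothetical "native" fragment choice per position, used only by the toy energy.
TARGET = [3, 1, 4, 1, 5, 2, 0, 3]


def toy_energy(model):
    """Number of positions whose fragment differs from the hidden target."""
    return sum(1 for a, b in zip(model, TARGET) if a != b)


best, probs = eda_fragment_search(len(TARGET), 6, toy_energy)
print("best energy:", toy_energy(best))
```

The design choice worth noting is the sharing of information between models: instead of running many independent trajectories, each round biases fragment selection toward what the current low-energy pool has in common, while the smoothing term keeps some exploration alive.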
Affiliation(s)
- David Simoncini
- Zhang Initiative Research Unit, Institute Laboratories, RIKEN, Wako, Saitama, Japan
| | - Kam Y. J. Zhang
- Zhang Initiative Research Unit, Institute Laboratories, RIKEN, Wako, Saitama, Japan
| |
|
34
|
Adhikari AN, Freed KF, Sosnick TR. Simplified protein models: predicting folding pathways and structure using amino acid sequences. Phys Rev Lett 2013; 111:028103. [PMID: 23889448 PMCID: PMC4047675 DOI: 10.1103/physrevlett.111.028103] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Received: 04/27/2013] [Indexed: 06/02/2023]
Abstract
We demonstrate the ability of simultaneously determining a protein's folding pathway and structure using a properly formulated model without prior knowledge of the native structure. Our model employs a natural coordinate system for describing proteins and a search strategy inspired by the observation that real proteins fold in a sequential fashion by incrementally stabilizing nativelike substructures or "foldons." Comparable folding pathways and structures are obtained for the twelve proteins recently studied using atomistic molecular dynamics simulations [K. Lindorff-Larsen, S. Piana, R. O. Dror, D. E. Shaw, Science 334, 517 (2011)], with our calculations running several orders of magnitude faster. We find that nativelike propensities in the unfolded state do not necessarily determine the order of structure formation, a departure from a major conclusion of the molecular dynamics study. Instead, our results support a more expansive view wherein intrinsic local structural propensities may be enhanced or overridden in the folding process by environmental context. The success of our search strategy validates it as an expedient mechanism for folding both in silico and in vivo.
Affiliation(s)
- Aashish N. Adhikari
- Department of Chemistry, University of Chicago, Chicago, IL 60637 USA
- James Franck Institute, University of Chicago, Chicago, IL 60637 USA
- Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, IL 60637 USA
| | - Karl F. Freed
- Department of Chemistry, University of Chicago, Chicago, IL 60637 USA
- James Franck Institute, University of Chicago, Chicago, IL 60637 USA
- Computation Institute, University of Chicago, Chicago, IL 60637 USA
| | - Tobin R. Sosnick
- Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, IL 60637 USA
- Computation Institute, University of Chicago, Chicago, IL 60637 USA
- Institute for Biophysical Dynamics, University of Chicago, Chicago, IL 60637 USA
| |
|
35
|
Stepwise protein folding at near amino acid resolution by hydrogen exchange and mass spectrometry. Proc Natl Acad Sci U S A 2013; 110:7684-9. [PMID: 23603271 DOI: 10.1073/pnas.1305887110] [Citation(s) in RCA: 143] [Impact Index Per Article: 13.0] [Indexed: 11/18/2022] Open
Abstract
The kinetic folding of ribonuclease H was studied by hydrogen exchange (HX) pulse labeling with analysis by an advanced fragment separation mass spectrometry technology. The results show that folding proceeds through distinct intermediates in a stepwise pathway that sequentially incorporates cooperative native-like structural elements to build the native protein. Each step is seen as a concerted transition of one or more segments from an HX-unprotected to an HX-protected state. Deconvolution of the data to near amino acid resolution shows that each step corresponds to the folding of a secondary structural element of the native protein, termed a "foldon." Each folded segment is retained through subsequent steps of foldon addition, revealing a stepwise buildup of the native structure via a single dominant pathway. Analysis of the pertinent literature suggests that this model is consistent with experimental results for many proteins and some current theoretical results. Two biophysical principles appear to dictate this behavior. The principle of cooperativity determines the central role of native-like foldon units. An interaction principle termed "sequential stabilization" based on native-like interfoldon interactions orders the pathway.
|