1
Strahan J, Lorpaiboon C, Weare J, Dinner AR. BAD-NEUS: Rapidly converging trajectory stratification. J Chem Phys 2024; 161:084109. [PMID: 39185846 PMCID: PMC11349377 DOI: 10.1063/5.0215975] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2024] [Accepted: 07/25/2024] [Indexed: 08/27/2024]
Abstract
An issue for molecular dynamics simulations is that events of interest often involve timescales that are much longer than the simulation time step, which is set by the fastest timescales of the model. Because of this timescale separation, direct simulation of many events is prohibitively computationally costly. This issue can be overcome by aggregating information from many relatively short simulations that sample segments of trajectories involving events of interest. This is the strategy of Markov state models (MSMs) and related approaches, but such methods suffer from approximation error because the variables defining the states generally do not capture the dynamics fully. By contrast, once converged, the weighted ensemble (WE) method aggregates information from trajectory segments so as to yield unbiased estimates of both thermodynamic and kinetic statistics. Unfortunately, errors decay no faster than unbiased simulation in WE as originally formulated and commonly deployed. Here, we introduce a theoretical framework for describing WE that shows that the introduction of an approximate stationary distribution on top of the stratification, as in nonequilibrium umbrella sampling (NEUS), accelerates convergence. Building on ideas from MSMs and related methods, we generalize the NEUS approach in such a way that the approximation error can be reduced systematically. We show that the improved algorithm can decrease the simulation time required to achieve the desired precision by orders of magnitude.
Affiliation(s)
- John Strahan
  - Department of Chemistry and James Franck Institute, University of Chicago, Chicago, Illinois 60637, USA
- Chatipat Lorpaiboon
  - Department of Chemistry and James Franck Institute, University of Chicago, Chicago, Illinois 60637, USA
- Jonathan Weare
  - Courant Institute of Mathematical Sciences, New York University, New York, New York 10012, USA
- Aaron R. Dinner
  - Department of Chemistry and James Franck Institute, University of Chicago, Chicago, Illinois 60637, USA
2
Fersht AR. From covalent transition states in chemistry to noncovalent in biology: from β- to Φ-value analysis of protein folding. Q Rev Biophys 2024; 57:e4. [PMID: 38597675 DOI: 10.1017/s0033583523000045] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/11/2024]
Abstract
Solving the mechanism of a chemical reaction requires determining the structures of all the ground states on the pathway and the elusive transition states linking them. 2024 is the centenary of Brønsted's landmark paper that introduced the β-value and structure-activity studies as the only experimental means to infer the structures of transition states. It involves making systematic small changes in the covalent structure of the reactants and analysing changes in activation and equilibrium free energies. Protein engineering was introduced for an analogous procedure, Φ-value analysis, to analyse the noncovalent interactions in proteins central to biological chemistry. The methodology was developed first by analysing noncovalent interactions in transition states in enzyme catalysis. The mature procedure was then applied to study transition states in the pathway of protein folding - 'part (b) of the protein folding problem'. This review describes the development of Φ-value analysis of transition states and compares and contrasts the interpretation of β- and Φ-values and their limitations. Φ-analysis afforded the first description of transition states in protein folding at the level of individual residues. It revealed the nucleation-condensation folding mechanism of protein domains with the transition state as an expanded, distorted native structure, containing little fully formed secondary structure but many weak tertiary interactions. A spectrum of transition states with various degrees of structural polarisation was then uncovered that spanned from nucleation-condensation to the framework mechanism of fully formed secondary structure. Φ-analysis revealed how movement of the expanded transition state on an energy landscape accommodates the transition from framework to nucleation-condensation mechanisms with a malleability of structure as a unifying feature of folding mechanisms. Such movement follows the rubric of analysis of classical covalent chemical mechanisms that began with Brønsted. Φ-values are used to benchmark computer simulation, and Φ and simulation combine to describe folding pathways at atomic resolution.
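Note: the two quantities compared in this abstract are conventionally defined as follows. This is a sketch in standard textbook notation; the labels D, N and ‡ for the denatured, native and transition states, and k and K for the rate and equilibrium constants, are assumptions introduced here and do not appear in the abstract.

```latex
% Bronsted beta-value: sensitivity of the rate constant k to the
% equilibrium constant K across a series of covalent modifications.
\[
  \beta = \frac{\partial \log k}{\partial \log K}
\]

% Phi-value: change in activation free energy on mutation, relative to
% the change in equilibrium folding free energy for the same mutation.
\[
  \Phi = \frac{\Delta\Delta G_{\ddagger-\mathrm{D}}}{\Delta\Delta G_{\mathrm{N}-\mathrm{D}}}
\]
```

Under these definitions, a Φ-value near 1 is usually read as the mutated residue's interactions being as formed in the transition state as in the native state, and a value near 0 as their being as unformed as in the denatured state; intermediate values are harder to interpret.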
Affiliation(s)
- Alan R Fersht
  - MRC Laboratory of Molecular Biology, Cambridge, UK
  - Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge, UK
  - Gonville and Caius College, University of Cambridge, Cambridge, UK
3
Rothfuss MT, Becht DC, Zeng B, McClelland LJ, Yates-Hansen C, Bowler BE. High-Accuracy Prediction of Stabilizing Surface Mutations to the Three-Helix Bundle, UBA(1), with EmCAST. J Am Chem Soc 2023; 145:22979-22992. [PMID: 37815921 PMCID: PMC10626973 DOI: 10.1021/jacs.3c04966] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 10/12/2023]
Abstract
The accurate modeling of energetic contributions to protein structure is a fundamental challenge in computational approaches to protein analysis and design. We describe a general computational method, EmCAST (empirical Cα stabilization), to score and optimize the sequence to the structure in proteins. The method relies on an empirical potential derived from the database of the Cα dihedral angle preferences for all possible four-residue sequences, using the data available in the Protein Data Bank. Our method produces stability predictions that naturally correlate one-to-one with the experimental results for solvent-exposed mutation sites. EmCAST predicted four mutations that increased the stability of a three-helix bundle, UBA(1), from 2.4 to 4.8 kcal/mol by optimizing residues in both helices and turns. For a set of eight variants, the predicted and experimental stabilizations correlate very well (R² = 0.97) with a slope near 1 and with a 0.16 kcal/mol standard error for EmCAST predictions. Tests against literature data for the stability effects of surface-exposed mutations show that EmCAST outperforms the existing stability prediction methods. UBA(1) variants were crystallized to verify and analyze their structures at an atomic resolution. Thermodynamic and kinetic folding experiments were performed to determine the magnitude and mechanism of stabilization. Our method has the potential to enable the rapid, rational optimization of natural proteins, expand the analysis of the sequence/structure relationship, and supplement the existing protein design strategies.
Affiliation(s)
- Michael T. Rothfuss
  - Department of Chemistry and Biochemistry, University of Montana, Missoula, MT 59812, United States
- Dustin C. Becht
  - Department of Chemistry and Biochemistry, University of Montana, Missoula, MT 59812, United States
- Baisen Zeng
  - Center for Biomolecular Structure and Dynamics, University of Montana, Missoula, MT 59812, United States
- Levi J. McClelland
  - Center for Biomolecular Structure and Dynamics, University of Montana, Missoula, MT 59812, United States
  - Division of Biological Sciences, University of Montana, Missoula, MT 59812, United States
- Cindee Yates-Hansen
  - Center for Biomolecular Structure and Dynamics, University of Montana, Missoula, MT 59812, United States
- Bruce E. Bowler
  - Department of Chemistry and Biochemistry, University of Montana, Missoula, MT 59812, United States
  - Center for Biomolecular Structure and Dynamics, University of Montana, Missoula, MT 59812, United States
4
Tsuboyama K, Dauparas J, Chen J, Laine E, Mohseni Behbahani Y, Weinstein JJ, Mangan NM, Ovchinnikov S, Rocklin GJ. Mega-scale experimental analysis of protein folding stability in biology and design. Nature 2023; 620:434-444. [PMID: 37468638 PMCID: PMC10412457 DOI: 10.1038/s41586-023-06328-6] [Citation(s) in RCA: 56] [Impact Index Per Article: 56.0] [Received: 01/05/2023] [Accepted: 06/14/2023] [Indexed: 07/21/2023]
Abstract
Advances in DNA sequencing and machine learning are providing insights into protein sequences and structures on an enormous scale [1]. However, the energetics driving folding are invisible in these structures and remain largely unknown [2]. The hidden thermodynamics of folding can drive disease [3,4], shape protein evolution [5-7] and guide protein engineering [8-10], and new approaches are needed to reveal these thermodynamics for every sequence and structure. Here we present cDNA display proteolysis, a method for measuring thermodynamic folding stability for up to 900,000 protein domains in a one-week experiment. From 1.8 million measurements in total, we curated a set of around 776,000 high-quality folding stabilities covering all single amino acid variants and selected double mutants of 331 natural and 148 de novo designed protein domains 40-72 amino acids in length. Using this extensive dataset, we quantified (1) environmental factors influencing amino acid fitness, (2) thermodynamic couplings (including unexpected interactions) between protein sites, and (3) the global divergence between evolutionary amino acid usage and protein folding stability. We also examined how our approach could identify stability determinants in designed proteins and evaluate design methods. The cDNA display proteolysis method is fast, accurate and uniquely scalable, and promises to reveal the quantitative rules for how amino acid sequences encode folding stability.
Affiliation(s)
- Kotaro Tsuboyama
  - Department of Pharmacology, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
  - Center for Synthetic Biology, Northwestern University, Evanston, IL, USA
  - PRESTO, Japan Science and Technology Agency, Tokyo, Japan
  - Institute of Industrial Science, The University of Tokyo, Tokyo, Japan
- Justas Dauparas
  - Department of Biochemistry, University of Washington, Seattle, WA, USA
  - Institute for Protein Design, University of Washington, Seattle, WA, USA
- Jonathan Chen
  - Department of Pharmacology, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
  - Center for Synthetic Biology, Northwestern University, Evanston, IL, USA
  - McCormick School of Engineering, Northwestern University, Evanston, IL, USA
- Elodie Laine
  - Sorbonne Université, CNRS, IBPS, Laboratory of Computational and Quantitative Biology (LCQB), UMR 7238, Paris, France
- Yasser Mohseni Behbahani
  - Sorbonne Université, CNRS, IBPS, Laboratory of Computational and Quantitative Biology (LCQB), UMR 7238, Paris, France
- Jonathan J Weinstein
  - Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot, Israel
- Niall M Mangan
  - Center for Synthetic Biology, Northwestern University, Evanston, IL, USA
  - Department of Engineering Sciences and Applied Mathematics, Northwestern University, Evanston, IL, USA
- Sergey Ovchinnikov
  - John Harvard Distinguished Science Fellowship Program, Harvard University, Cambridge, MA, USA
- Gabriel J Rocklin
  - Department of Pharmacology, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
  - Center for Synthetic Biology, Northwestern University, Evanston, IL, USA
5
Abstract
Salts differ in their ability to stabilize protein conformations, thereby affecting the thermodynamics and kinetics of protein folding. We developed a coarse-grained protein model that can predict salt-induced changes in protein properties by using the transfer free-energy data of various chemical groups from water to salt solutions. Using this model and molecular dynamics simulations, we probed the effect of seven different salts on the folding thermodynamics of the DNA binding domain of lac repressor protein (lac-DBD) and N-terminal domain of ribosomal protein (NTL9). We show that a salt can act as a protein stabilizing or destabilizing agent depending on the protein sequence and folded state topology. The computed thermodynamic properties, especially the m values for various salts, which reveal the relative ability of a salt to stabilize the protein folded state, are in quantitative agreement with the experimentally measured values. The computations show that the degree of protein compaction in the denatured ensemble strongly depends on the salt identity, and for the same variation in salt concentration, the compaction in the protein dimensions varies from ∼4% to ∼30% depending on the salt. The transition-state ensemble (TSE) of lac-DBD is homogeneous and polarized, while the TSE of NTL9 is heterogeneous and diffusive. Salts induce subtle structural changes in the TSE that are in agreement with Hammond's postulate. The barrier to protein folding tends to disappear in the presence of moderate concentrations (∼3-4 m) of strongly stabilizing salts.
Affiliation(s)
- Hiranmay Maity
  - Solid State and Structural Chemistry Unit, Indian Institute of Science, Bengaluru, Karnataka, India 560012
- Aswathy N Muttathukattil
  - Solid State and Structural Chemistry Unit, Indian Institute of Science, Bengaluru, Karnataka, India 560012
- Govardhan Reddy
  - Solid State and Structural Chemistry Unit, Indian Institute of Science, Bengaluru, Karnataka, India 560012