1
|
Functional protein dynamics in a crystal. Nat Commun 2024; 15:3244. [PMID: 38622111 PMCID: PMC11018856 DOI: 10.1038/s41467-024-47473-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2023] [Accepted: 04/02/2024] [Indexed: 04/17/2024] Open
Abstract
Proteins are molecular machines and to understand how they work, we need to understand how they move. New pump-probe time-resolved X-ray diffraction methods open up ways to initiate and observe protein motions with atomistic detail in crystals on biologically relevant timescales. However, practical limitations of these experiments demand parallel development of effective molecular dynamics approaches to accelerate progress and extract meaning. Here, we establish robust and accurate methods for simulating dynamics in protein crystals, a nontrivial process requiring careful attention to equilibration, environmental composition, and choice of force fields. With more than seven milliseconds of sampling of a single chain, we identify critical factors controlling agreement between simulation and experiments and show that simulated motions recapitulate ligand-induced conformational changes. This work enables a virtuous cycle between simulation and experiments for visualizing and understanding the basic functional motions of proteins.
|
2
|
Learning Markovian dynamics with spectral maps. J Chem Phys 2024; 160:091102. [PMID: 38436438 DOI: 10.1063/5.0189241] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Accepted: 02/05/2024] [Indexed: 03/05/2024] Open
Abstract
The long-time behavior of many complex molecular systems can often be described by Markovian dynamics in a slow subspace spanned by a few reaction coordinates referred to as collective variables (CVs). However, determining CVs poses a fundamental challenge in chemical physics. Depending on intuition or trial and error to construct CVs can lead to non-Markovian dynamics with long memory effects, hindering analysis. To address this problem, we continue to develop a recently introduced deep-learning technique called spectral map [J. Rydzewski, J. Phys. Chem. Lett. 14, 5216-5220 (2023)]. Spectral map learns slow CVs by maximizing a spectral gap of a Markov transition matrix describing anisotropic diffusion. Here, to represent heterogeneous and multiscale free-energy landscapes with spectral map, we implement an adaptive algorithm to estimate transition probabilities. Through a Markov state model analysis, we validate that spectral map learns slow CVs related to the dominant relaxation timescales and discerns between long-lived metastable states.
|
3
|
Main role of fractal-like nature of conformational space in subdiffusion in proteins. Phys Rev E 2024; 109:034402. [PMID: 38632804 DOI: 10.1103/physreve.109.034402] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2023] [Accepted: 02/05/2024] [Indexed: 04/19/2024]
Abstract
Protein dynamics involves a myriad of mechanical movements happening at different time and space scales, which makes it highly complex. One of the less understood features of protein dynamics is subdiffusivity, defined as sublinear dependence between displacement and time. Here, we use all-atom molecular dynamics (MD) simulations to directly interrogate an already well-established theory and demonstrate that subdiffusivity arises from the fractal nature of the network of metastable conformations over which the dynamics, thought of as a diffusion process, takes place.
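As an illustration of the sublinear dependence mentioned above, subdiffusion is commonly quantified by fitting the mean squared displacement to a power law, MSD(t) ∝ t^α, with α < 1. The sketch below is not taken from the article: the function name and the synthetic data are assumptions chosen only to show how such an exponent could be estimated with a log-log least-squares fit.

```python
import math

def fit_msd_exponent(times, msd):
    """Return the least-squares slope of log(MSD) versus log(t).

    For MSD(t) = K * t**alpha the slope equals alpha;
    alpha < 1 indicates subdiffusion, alpha = 1 normal diffusion.
    """
    xs = [math.log(t) for t in times]
    ys = [math.log(m) for m in msd]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Synthetic, noise-free example data (hypothetical): MSD = 2 * t**0.5
times = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
msd = [2.0 * t ** 0.5 for t in times]
alpha = fit_msd_exponent(times, msd)
print(f"fitted exponent alpha = {alpha:.3f}")
```

With real trajectories the MSD would first have to be averaged over time origins, and the fit restricted to the time window where the power law holds.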
|
4
|
Toward physics-based precision medicine: Exploiting protein dynamics to design new therapeutics and interpret variants. Protein Sci 2024; 33:e4902. [PMID: 38358129 PMCID: PMC10868452 DOI: 10.1002/pro.4902] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2023] [Revised: 12/01/2023] [Accepted: 01/04/2024] [Indexed: 02/16/2024]
Abstract
The goal of precision medicine is to utilize our knowledge of the molecular causes of disease to better diagnose and treat patients. However, there is a substantial mismatch between the small number of Food and Drug Administration (FDA)-approved drugs and annotated coding variants compared to the needs of precision medicine. This review introduces the concept of physics-based precision medicine, a scalable framework that promises to improve our understanding of sequence-function relationships and accelerate drug discovery. We show that accounting for the ensemble of structures a protein adopts in solution with computer simulations overcomes many of the limitations imposed by assuming a single protein structure. We highlight studies of protein dynamics and recent methods for the analysis of structural ensembles. These studies demonstrate that differences in conformational distributions predict functional differences within protein families and between variants. Thanks to new computational tools that are providing unprecedented access to protein structural ensembles, this insight may enable accurate predictions of variant pathogenicity for entire libraries of variants. We further show that explicitly accounting for protein ensembles, with methods like alchemical free energy calculations or docking to Markov state models, can uncover novel lead compounds. To conclude, we demonstrate that cryptic pockets, or cavities absent in experimental structures, provide an avenue to target proteins that are currently considered undruggable. Taken together, our review provides a roadmap for the field of protein science to accelerate precision medicine.
|
5
|
Acceleration of Molecular Simulations by Parametric Time-Lagged tSNE Metadynamics. J Phys Chem B 2024; 128:903-913. [PMID: 38237064 PMCID: PMC10839826 DOI: 10.1021/acs.jpcb.3c05669] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2023] [Revised: 12/22/2023] [Accepted: 12/28/2023] [Indexed: 02/02/2024]
Abstract
The potential of molecular simulations is limited by their computational costs. There is often a need to accelerate simulations using some of the enhanced sampling methods. Metadynamics applies a history-dependent bias potential that disfavors previously visited states. To apply metadynamics, it is necessary to select a few properties of the system, the collective variables (CVs), that can be used to define the bias potential. Over the past few years, there have been emerging opportunities for machine learning and, in particular, artificial neural networks within this domain. In this broad context, a specific unsupervised machine learning method was utilized, namely, parametric time-lagged t-distributed stochastic neighbor embedding (ptltSNE) to design CVs. The approach was tested on a Trp-cage trajectory (tryptophan cage) from the literature. The trajectory was used to generate a map of conformations, distinguish fast conformational changes from slow ones, and design CVs. Then, metadynamics simulations were performed. To accelerate the formation of the α-helix, we added the α-RMSD collective variable. This simulation led to one folding event in a 350 ns metadynamics simulation. To accelerate degrees of freedom not addressed by CVs, we performed parallel tempering metadynamics. This simulation led to 10 folding events in a 200 ns simulation with 32 replicas.
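The history-dependent bias described above is usually built as a sum of Gaussian hills deposited at previously visited CV values. The following is a minimal one-dimensional sketch of that idea only; the hill height, hill width, and the list of visited CV values are hypothetical and are not parameters from the article, and no simulation engine is involved.

```python
import math

def bias_potential(s, centers, height=1.0, width=0.2):
    """Sum of Gaussian hills centered on previously visited CV values.

    s       : CV value at which the bias is evaluated
    centers : CV values where hills were deposited (the history)
    height  : hill height (hypothetical, arbitrary energy units)
    width   : hill standard deviation (hypothetical, CV units)
    """
    return sum(
        height * math.exp(-((s - c) ** 2) / (2.0 * width ** 2))
        for c in centers
    )

# Hypothetical deposition history: the system lingered near s = 0
visited = [0.0, 0.05, 0.1, 0.0]

v_visited = bias_potential(0.0, visited)  # large bias where hills accumulated
v_far = bias_potential(2.0, visited)      # essentially zero in unvisited regions
print(v_visited, v_far)
```

Because the bias is largest where the system has already been, adding it to the potential energy pushes the simulation toward unexplored CV values.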
|
6
|
Markov State Models: To Optimize or Not to Optimize. J Chem Theory Comput 2024; 20:977-988. [PMID: 38163961 PMCID: PMC10809420 DOI: 10.1021/acs.jctc.3c01134] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2023] [Revised: 12/10/2023] [Accepted: 12/11/2023] [Indexed: 01/03/2024]
Abstract
Markov state models (MSMs) are a popular statistical method for analyzing the conformational dynamics of proteins including protein folding. With all statistical and machine learning (ML) models, choices must be made about the modeling pipeline that cannot be directly learned from the data. These choices, or hyperparameters, are often evaluated by expert judgment or, in the case of MSMs, by maximizing variational scores such as the VAMP-2 score. Modern ML and statistical pipelines often use automatic hyperparameter selection techniques ranging from the simple, choosing the best score from a random selection of hyperparameters, to the complex, optimization via, e.g., Bayesian optimization. In this work, we ask whether it is possible to automatically select MSMs this way by estimating and analyzing over 16,000,000 observations from over 280,000 estimated MSMs. We find that differences in hyperparameters can change the physical interpretation of the optimization objective, making automatic selection difficult. In addition, we find that enforcing conditions of equilibrium in the VAMP scores can result in inconsistent model selection. However, other parameters that specify the VAMP-2 score (lag time and number of relaxation processes scored) have only a negligible influence on model selection. We suggest that model observables and variational scores should be only a guide to model selection and that a full investigation of the MSM properties should be undertaken when selecting hyperparameters.
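For readers unfamiliar with the objects being scored, the sketch below shows the simplest possible MSM estimate: counting transitions in a discretized trajectory at a chosen lag time and row-normalizing the counts. It is an illustrative assumption, not the pipeline of the article: it does not enforce detailed balance, does not compute a VAMP-2 score, and the toy two-state trajectory is made up. The lag time and the number of states are exactly the kind of hyperparameters the abstract refers to.

```python
import math

def estimate_transition_matrix(dtraj, n_states, lag):
    """Row-normalized transition counts at the given lag (in frames)."""
    counts = [[0] * n_states for _ in range(n_states)]
    for i in range(len(dtraj) - lag):
        counts[dtraj[i]][dtraj[i + lag]] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix

def two_state_implied_timescale(T, lag):
    """Implied timescale of a 2x2 row-stochastic matrix.

    Its eigenvalues are 1 and 1 - T[0][1] - T[1][0]; the timescale is
    -lag / ln(second eigenvalue), defined only if that eigenvalue is in (0, 1).
    """
    lam = 1.0 - T[0][1] - T[1][0]
    return -lag / math.log(lam)

# Hypothetical discretized trajectory visiting two states
dtraj = [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1]
T = estimate_transition_matrix(dtraj, n_states=2, lag=1)
timescale = two_state_implied_timescale(T, lag=1)
print(T, timescale)
```

Repeating the estimate at several lag times and checking that the implied timescale levels off is a common sanity check before trusting any single model.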
|
7
|
Conformational Dynamics in Proteins: Entangled Slow Fluctuations and Nonequilibrium Reaction Events. J Phys Chem B 2024; 128:20-32. [PMID: 38133567 DOI: 10.1021/acs.jpcb.3c05307] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2023]
Abstract
Proteins exhibit conformational fluctuations and changes over various time scales, ranging from rapid picosecond-scale local atomic motions to slower microsecond-scale global conformational transformations. In the presence of these intricate fluctuations, chemical reactions occur and functions emerge. These conformational fluctuations of proteins are not merely stochastic random motions but possess distinct spatiotemporal characteristics. Moreover, chemical reactions do not always proceed along a single reaction coordinate in a quasi-equilibrium manner. Therefore, it is essential to understand spatiotemporal conformational fluctuations of proteins and the conformational change processes associated with reactions. In this Perspective, we shed light on the complex dynamics of proteins and their role in enzyme catalysis by presenting recent results regarding dynamic couplings and disorder in the conformational dynamics of proteins and rare but rapid enzymatic reaction events obtained from molecular dynamics simulations.
|
8
|
Abstract
Molecular dynamics (MD) simulations are fundamental computational tools for the study of proteins and their free energy landscapes. However, sampling protein conformational changes through MD simulations is challenging due to the relatively long time scales of these processes. Many enhanced sampling approaches have emerged to tackle this problem, including biased sampling and path-sampling methods. In this Perspective, we focus on adaptive sampling algorithms. These techniques differ from other approaches because the thermodynamic ensemble is preserved and the sampling is enhanced solely by restarting MD trajectories at particularly chosen seeds rather than introducing biasing forces. We begin our treatment with an overview of theoretically transparent methods, where we discuss principles and guidelines for adaptive sampling. Then, we present a brief summary of select methods that have been applied to realistic systems in the past. Finally, we discuss recent advances in adaptive sampling methodology powered by deep learning techniques, as well as their shortcomings.
|
9
|
Uncertainties in Markov State Models of Small Proteins. J Chem Theory Comput 2023; 19:5516-5524. [PMID: 37540193 PMCID: PMC10448719 DOI: 10.1021/acs.jctc.3c00372] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2023] [Indexed: 08/05/2023]
Abstract
Markov state models are widely used to describe and analyze protein dynamics based on molecular dynamics simulations, specifically to extract functionally relevant characteristic time scales and motions. Particularly for larger biomolecules such as proteins, however, insufficient sampling is a notorious concern and often the source of large uncertainties that are difficult to quantify. Furthermore, there are several other sources of uncertainty, such as choice of the number of Markov states and lag time, choice and parameters of dimension reduction preprocessing step, and uncertainty due to the limited number of observed transitions; the latter is often estimated via a Bayesian approach. Here, we quantified and ranked all of these uncertainties for four small globular test proteins. We found that the largest uncertainty is due to insufficient sampling and initially increases with the total trajectory length T up to a critical tipping point, after which it decreases as 1/T, thus providing guidelines for how much sampling is required for given accuracy. We also found that single long trajectories yielded better sampling accuracy than many shorter trajectories starting from the same structure. In comparison, the remaining sources of the above uncertainties are generally smaller by a factor of about 5, rendering them less of a concern but certainly not negligible. Importantly, the Bayes uncertainty, commonly used as the only uncertainty estimate, captures only a relatively small part of the true uncertainty, which is thus often drastically underestimated.
|
10
|
Abstract
Adopting a 300 μs long MD trajectory of the folding of villin headpiece (HP35) by D. E. Shaw Research, we recently constructed a Markov state model (MSM) based on inter-residue contacts. The model reproduces the folding time and predicts that the native basin and unfolded region consist of metastable substates that are structurally well-characterized. Recognizing the need to establish well-defined benchmark problems, we study to what extent and in what sense this MSM can be employed as a reference model. Hence, we test the robustness of the MSM by comparing it to models that use alternative combinations of features, dimensionality reduction methods, and clustering schemes. The study suggests some main characteristics of the folding of HP35 that should be reproduced by other competitive models. Moreover, the discussion reveals which parts of the MSM workflow matter most for the considered problem and illustrates the promises and pitfalls of state-based models for the interpretation of biomolecular simulations.
|
11
|
Modeling diffuse scattering with simple, physically interpretable models. Methods Enzymol 2023; 688:169-194. [PMID: 37748826 DOI: 10.1016/bs.mie.2023.06.022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/27/2023]
Abstract
Diffuse scattering has long been proposed to probe protein dynamics relevant for biological function, and more recently, as a tool to aid structure determination. Despite recent advances in measuring and modeling this signal, the field has not been able to routinely use experimental diffuse scattering for either application. A persistent challenge has been to devise models that are sophisticated enough to robustly reproduce experimental diffuse features but remain readily interpretable from the standpoint of structural biology. This chapter presents eryx, a suite of computational tools to evaluate the primary models of disorder that have been used to analyze protein diffuse scattering. By facilitating comparative modeling, eryx aims to provide insights into the physical origins of this signal and help identify the sources of disorder that are critical for reproducing experimental features. This framework also lays the groundwork for the development of more advanced models that integrate different types of disorder without loss of interpretability.
|
12
|
Fast conformational clustering of extensive molecular dynamics simulation data. J Chem Phys 2023; 158:144109. [PMID: 37061476 DOI: 10.1063/5.0142797] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/17/2023] Open
Abstract
We present an unsupervised data processing workflow that is specifically designed to obtain a fast conformational clustering of long molecular dynamics simulation trajectories. In this approach, we combine two dimensionality reduction algorithms (cc_analysis and encodermap) with a density-based spatial clustering algorithm (hierarchical density-based spatial clustering of applications with noise). The proposed scheme benefits from the strengths of the three algorithms while avoiding most of the drawbacks of the individual methods. Here, the cc_analysis algorithm is applied for the first time to molecular simulation data. The encodermap algorithm complements cc_analysis by providing an efficient way to process and assign large amounts of data to clusters. The main goal of the procedure is to maximize the number of assigned frames of a given trajectory while keeping a clear conformational identity of the clusters that are found. In practice, we achieve this by using an iterative clustering approach and a tunable root-mean-square-deviation-based criterion in the final cluster assignment. This allows us to find clusters of different densities and different degrees of structural identity. With the help of four protein systems, we illustrate the capability and performance of this clustering workflow: wild-type and thermostable mutant of the Trp-cage protein (TC5b and TC10b), NTL9, and Protein B. Each of these test systems poses its own challenges to the scheme, which together give a good overview of the advantages and potential difficulties that can arise when using the proposed method.
|
13
|
Optimization of non-equilibrium self-assembly protocols using Markov state models. J Chem Phys 2022; 157:244901. [PMID: 36586982 PMCID: PMC9788858 DOI: 10.1063/5.0130407] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2022] [Accepted: 12/04/2022] [Indexed: 12/12/2022] Open
Abstract
The promise of self-assembly to enable the bottom-up formation of materials with prescribed architectures and functions has driven intensive efforts to uncover rational design principles for maximizing the yield of a target structure. Yet, despite many successful examples of self-assembly, ensuring kinetic accessibility of the target structure remains an unsolved problem in many systems. In particular, long-lived kinetic traps can result in assembly times that vastly exceed experimentally accessible timescales. One proposed solution is to design non-equilibrium assembly protocols in which system parameters change over time to avoid such kinetic traps. Here, we develop a framework to combine Markov state model (MSM) analysis with optimal control theory to compute a time-dependent protocol that maximizes the yield of the target structure at a finite time. We present an adjoint-based gradient descent method that, in conjunction with MSMs for a system as a function of its control parameters, enables efficient optimization of the assembly protocol. We also describe an interpolation approach to significantly reduce the number of simulations required to construct the MSMs. We demonstrate our approach with two examples: a simple semi-analytic model for the folding of a polymer of colloidal particles, and a more complex model for capsid assembly. Our results show that optimizing time-dependent protocols can achieve significant improvements in the yields of selected structures, including equilibrium free energy minima, long-lived metastable structures, and transient states.
|
14
|
Automated Path Searching Reveals the Mechanism of Hydrolysis Enhancement by T4 Lysozyme Mutants. Int J Mol Sci 2022; 23:14628. [PMID: 36498954 PMCID: PMC9736071 DOI: 10.3390/ijms232314628] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2022] [Revised: 11/16/2022] [Accepted: 11/19/2022] [Indexed: 11/25/2022] Open
Abstract
Bacteriophage T4 lysozyme (T4L) is a glycosidase that is widely applied as a natural antimicrobial agent in the food industry. Due to its wide applications and small size, T4L has been regarded as a model system for understanding protein dynamics and for large-scale protein engineering. Through structural insights from the single conformation of T4L, a series of mutations (L99A,G113A,R119P) have been introduced, which have successfully raised the fractional population of its only hydrolysis-competent excited state to 96%. However, the actual impact of these substitutions on its dynamics remains unclear, largely due to the lack of highly efficient sampling algorithms. Here, using our recently developed travelling-salesman-based automated path searching (TAPS), we located the minimum-free-energy path (MFEP) for the transition of three T4L mutants from their ground states to their excited states. All three mutants share a three-step transition: the flipping of F114, the rearrangement of α0/α1 helices, and final refinement. Remarkably, the MFEP revealed that the effects of the mutations are drastically beyond the expectations of their original design: (a) the G113A substitution not only enhances helicity but also fills the hydrophobic Cavity I and reduces the free energy barrier for flipping F114; (b) R119P barely changes the stability of the ground state but stabilizes the excited state through rarely reported polar contacts S117OG:N132ND2, E11OE1:R145NH1, and E11OE2:Q105NE2; (c) the residue W138 flips into Cavity I and further stabilizes the excited state for the triple mutant L99A,G113A,R119P. These novel insights that were unexpected in the original mutant design indicated the necessity of incorporating path searching into the workflow of rational protein engineering.
|
15
|
Conformational transitions in BTG1 antiproliferative protein and their modulation by disease mutants. Biophys J 2022; 121:3753-3764. [PMID: 35459639 PMCID: PMC9617077 DOI: 10.1016/j.bpj.2022.04.023] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/25/2022] [Revised: 04/01/2022] [Accepted: 04/15/2022] [Indexed: 12/01/2022] Open
Abstract
B cell translocation gene 1 (BTG1) protein belongs to the BTG/transducer of ERBB2 (TOB) family of antiproliferative proteins whose members regulate various key cellular processes such as cell cycle progression, apoptosis, and differentiation. Somatic missense mutations in BTG1 are found in ∼70% of a particularly malignant and disseminated subtype of diffuse large B cell lymphoma (DLBCL). Antiproliferative activity of BTG1 has been linked to its ability to associate with transcriptional cofactors and various enzymes. However, molecular mechanisms underlying these functional interactions and how the disease-linked mutations in BTG1 affect these mechanisms are currently unknown. To start filling these knowledge gaps, here, using atomistic molecular dynamics (MD) simulations, we explored structural, dynamic, and kinetic characteristics of BTG1 protein, and studied how various DLBCL mutations affect these characteristics. We focused on the protein region formed by α2 and α4 helices, as this interface has been reported not only to serve as a binding hotspot for several cellular partners but also to harbor sites for the majority of known DLBCL mutations. Markov state modeling analysis of extensive MD simulations revealed that the α2-α4 interface in the wild-type (WT) BTG1 undergoes conformational transitions between closed and open metastable states. Importantly, we show that some of the mutations in this region that are observed in DLBCL, such as Q36H, F40C, Q45P, E50K (in α2), and A83T and A84E (in α4), either overstabilize one of these two metastable states or give rise to new conformations in which these helices are distorted (i.e., kinked or unfolded). Based on these results, we conclude that the rapid interconversion between the closed and open conformations of the α2-α4 interface is an essential component of the BTG1 functional dynamics that can prime the protein for functional associations with its binding partners. 
Disruption of the native dynamic equilibrium by DLBCL mutants leads to the ensemble of conformations in BTG1 that are unlikely structurally and/or kinetically to enable productive functional interactions with the binding proteins.
|
16
|
Mathematical, Thermodynamical, and Experimental Necessity for Coarse Graining Empirical Densities and Currents in Continuous Space. Phys Rev Lett 2022; 129:140601. [PMID: 36240401 DOI: 10.1103/physrevlett.129.140601] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/24/2021] [Revised: 07/19/2022] [Accepted: 07/28/2022] [Indexed: 06/16/2023]
Abstract
We present general results on fluctuations and spatial correlations of the coarse-grained empirical density and current of Markovian diffusion in equilibrium or nonequilibrium steady states on all timescales. We unravel a deep connection between current fluctuations and generalized time-reversal symmetry, providing new insight into time-averaged observables. We highlight the essential role of coarse graining in space from mathematical, thermodynamical, and experimental points of view. Spatial coarse graining is required to uncover salient features of currents that break detailed balance, and a thermodynamically "optimal" coarse graining ensures the most precise inference of dissipation. Defined without coarse graining, the fluctuations of empirical density and current are proven to diverge on all timescales in dimensions higher than one, which has far-reaching consequences for the central-limit regime in continuous space. We apply the results to examples of irreversible diffusion. Our findings provide new intuition about time-averaged observables and allow for a more efficient analysis of single-molecule experiments.
|
17
|
Enhanced-Sampling Simulations for the Estimation of Ligand Binding Kinetics: Current Status and Perspective. Front Mol Biosci 2022; 9:899805. [PMID: 35755817 PMCID: PMC9216551 DOI: 10.3389/fmolb.2022.899805] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/19/2022] [Accepted: 05/09/2022] [Indexed: 12/12/2022] Open
Abstract
The dissociation rate (k_off) associated with ligand unbinding events from proteins is a parameter of fundamental importance in drug design. Here we review recent major advancements in molecular simulation methodologies for the prediction of k_off. Next, we discuss the impact of the potential energy function models on the accuracy of calculated k_off values. Finally, we provide a perspective from high-performance computing and machine learning which might help improve such predictions.
|
18
|
Abstract
The stabilization of native states of proteins is a powerful drug discovery strategy. It is still unclear, however, whether this approach can be applied to intrinsically disordered proteins. Here, we report a small molecule that stabilizes the native state of the Aβ42 peptide, an intrinsically disordered protein fragment associated with Alzheimer’s disease. We show that this stabilization takes place by a disordered binding mechanism, in which both the small molecule and the Aβ42 peptide remain disordered. This disordered binding mechanism involves enthalpically favorable local π-stacking interactions coupled with entropically advantageous global effects. These results indicate that small molecules can stabilize disordered proteins in their native states through transient non-specific interactions that provide enthalpic gain while simultaneously increasing the conformational entropy of the proteins.
|
19
|
Estimation of binding rates and affinities from multiensemble Markov models and ligand decoupling. J Chem Phys 2022; 156:134115. [PMID: 35395889 PMCID: PMC8993428 DOI: 10.1063/5.0088024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022] Open
Abstract
Accurate and efficient simulation of the thermodynamics and kinetics of protein-ligand interactions is crucial for computational drug discovery. Multiensemble Markov Model (MEMM) estimators can provide estimates of both binding rates and affinities from collections of short trajectories but have not been systematically explored for situations when a ligand is decoupled through scaling of non-bonded interactions. In this work, we compare the performance of two MEMM approaches for estimating ligand binding affinities and rates: (1) the transition-based reweighting analysis method (TRAM) and (2) a Maximum Caliber (MaxCal) based method. As a test system, we construct a small host-guest system where the ligand is a single uncharged Lennard-Jones (LJ) particle, and the receptor is an 11-particle icosahedral pocket made from the same atom type. To realistically mimic a protein-ligand binding system, the LJ ϵ parameter was tuned, and the system was placed in a periodic box with 860 TIP3P water molecules. A benchmark was performed using over 80 µs of unbiased simulation, and an 18-state Markov state model was used to estimate reference binding affinities and rates. We then tested the performance of TRAM and MaxCal when challenged with limited data. Both TRAM and MaxCal approaches perform better than conventional Markov state models, with TRAM showing better convergence and accuracy. We find that subsampling of trajectories to remove time correlation improves the accuracy of both TRAM and MaxCal and that in most cases, only a single biased ensemble to enhance sampled transitions is required to make accurate estimates.
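The link between the two quantities estimated above, binding rates and binding affinities, can be illustrated with the standard two-state relations K_d = k_off / k_on and ΔG° = RT ln(K_d / c°). The sketch below only evaluates these textbook formulas; the rate values are hypothetical and are not results from the article, and TRAM and MaxCal themselves are not implemented here.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol*K)

def dissociation_constant(k_on, k_off):
    """K_d in M, given k_on in 1/(M*s) and k_off in 1/s."""
    return k_off / k_on

def binding_free_energy(k_d, temperature=300.0, standard_conc=1.0):
    """Standard binding free energy in kcal/mol: RT * ln(K_d / c0).

    standard_conc is the standard-state concentration (1 M by convention).
    """
    return R_KCAL * temperature * math.log(k_d / standard_conc)

# Hypothetical rates chosen only for illustration
k_d = dissociation_constant(k_on=1.0e7, k_off=1.0e-1)
dg = binding_free_energy(k_d)
print(f"K_d = {k_d:.2e} M, dG = {dg:.2f} kcal/mol")
```

A more negative ΔG° corresponds to tighter binding; the same K_d can arise from very different pairs of rates, which is why kinetics and affinity are estimated separately.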
|
20
|
Atomistic description of molecular binding processes based on returning probability theory. J Chem Phys 2021; 155:204503. [PMID: 34852475 DOI: 10.1063/5.0070308] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/14/2022] Open
Abstract
The efficiency of molecular binding, such as host-guest binding, is commonly evaluated in terms of kinetics, for example through rate coefficients. In general, to compute the rate coefficient of the overall binding process, we need to consider both the diffusion of the reactants and the barrier crossing to reach the bound state. Here, we develop a methodology for quantifying the rate coefficient of binding based on molecular dynamics simulation and the returning probability (RP) theory proposed by Kim and Lee [J. Chem. Phys. 131, 014503 (2009)]. RP theory provides a tractable formula for the rate coefficient in terms of the thermodynamic stability and kinetics of the intermediate state on a predefined reaction coordinate. In this study, the interaction energy between reactants is used as the reaction coordinate, enabling us to effectively describe the reactants' relative position and orientation in a one-dimensional space. Application of this method to host-guest binding systems consisting of β-cyclodextrin and small guest molecules yields rate coefficients consistent with experimental results.
|
21
|
Variational embedding of protein folding simulations using Gaussian mixture variational autoencoders. J Chem Phys 2021; 155:194108. [PMID: 34800961 DOI: 10.1063/5.0069708] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 11/14/2022] Open
Abstract
Conformational sampling of biomolecules using molecular dynamics simulations often produces a large amount of high-dimensional data that is difficult to interpret using conventional analysis techniques. Dimensionality reduction methods are thus required to extract useful and relevant information. Here, we devise a machine learning method, Gaussian mixture variational autoencoder (GMVAE), that can simultaneously perform dimensionality reduction and clustering of biomolecular conformations in an unsupervised way. We show that GMVAE can learn a reduced representation of the free energy landscape of protein folding with highly separated clusters that correspond to the metastable states during folding. Since GMVAE uses a mixture of Gaussians as its prior, it can directly acknowledge the multi-basin nature of the protein folding free energy landscape. To make the model end-to-end differentiable, we use a Gumbel-softmax distribution. We test the model on three long-timescale protein folding trajectories and show that the GMVAE embedding resembles the folding funnel, with folded states down the funnel and unfolded states outside the funnel path. Additionally, we show that the latent space of GMVAE can be used for kinetic analysis, and that Markov state models built on this embedding produce folding and unfolding timescales in close agreement with other rigorous dynamical embeddings such as time-lagged independent component analysis.
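The Gumbel-softmax step mentioned in this abstract can be sketched as follows. This is a minimal NumPy illustration of the general technique, not the authors' implementation: the function name, the temperature value, and the clipping constant are assumptions. The idea is to add Gumbel noise to the cluster logits and apply a temperature-scaled softmax, which yields a continuous relaxation of a categorical sample that gradients can pass through in an autodiff framework.

```python
import numpy as np

def gumbel_softmax(logits, temperature, rng):
    """Draw relaxed one-hot samples from categorical logits (last axis = classes)."""
    # Uniform noise bounded away from 0 so that log(-log(u)) stays finite.
    u = rng.uniform(low=1e-12, high=1.0, size=logits.shape)
    gumbel = -np.log(-np.log(u))
    z = (logits + gumbel) / temperature
    z = z - z.max(axis=-1, keepdims=True)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)
```

Lower temperatures push each sample toward a one-hot vector, while higher temperatures give smoother, more uniform samples. The argmax of each sample follows the categorical distribution defined by the softmax of the logits.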
|
22
|
Along the allostery stream: Recent advances in computational methods for allosteric drug discovery. WIREs Computational Molecular Science 2021. [DOI: 10.1002/wcms.1585] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 12/11/2022]
|
23
|
Computational strategies for protein conformational ensemble detection. Curr Opin Struct Biol 2021; 72:79-87. [PMID: 34563946 DOI: 10.1016/j.sbi.2021.08.007] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/15/2021] [Revised: 08/13/2021] [Accepted: 08/17/2021] [Indexed: 01/18/2023]
Abstract
Protein function is constrained by the three-dimensional structure but is delineated by its dynamics. This framework must satisfy specificity of function along with adaptability to changing environments and evolvability under external constraints. The accessibility of the available regions of the energy landscape for a set of conditions and shifts in the populations upon their modulation have effects propagating across scales, from biomolecular interactions, to organisms, to populations. Developing the ability to detect and juggle protein conformations supplemented by a physics-based understanding has implications for not only in vivo problems but also for resistance impeding drug discovery and bionano-sensor design.
|
24
|
The Next Frontier for Designing Switchable Proteins: Rational Enhancement of Kinetics. J Phys Chem B 2021; 125:9069-9077. [PMID: 34324338 DOI: 10.1021/acs.jpcb.1c04082] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
Designing proteins that can switch between active (ON) and inactive (OFF) conformations in response to signals such as ligand binding and incident light has been a tantalizing endeavor in protein engineering for over a decade. While such designs have yielded novel biosensors, therapeutic agents, and smart biomaterials, the response times (times for switching ON and OFF) of many switches have been too slow to be of practical use. Among the defining properties of such switches, the kinetics of switching has been the most challenging to optimize. This is largely due to the difficulty of characterizing the structures of transient states, which are required for manipulating the height of the effective free energy barrier between the ON and OFF states. We share our perspective of the most promising new experimental and computational strategies over the past several years for tackling this next frontier for designing switchable proteins.
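To make the link between barrier height and switching time concrete, the following sketch assumes a simple Arrhenius or transition-state-theory form, k ∝ exp(−ΔG‡/RT), with an unchanged prefactor. That rate law and the function name are assumptions for illustration and are not taken from the article. Under this assumption, the factor by which switching speeds up depends only on how much the barrier is lowered.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def rate_speedup(barrier_reduction_kcal, temperature_k=298.15):
    """Fold increase in rate when the barrier drops by the given amount (kcal/mol)."""
    return math.exp(barrier_reduction_kcal / (R_KCAL * temperature_k))
```

For example, `rate_speedup(1.0)` gives a factor of roughly 5.4 at 298.15 K, so each 1 kcal/mol removed from the effective barrier shortens the response time about fivefold under these assumptions.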
|
25
|
Abstract
β-coronaviruses (β-CoVs) alone have been responsible for three major global outbreaks in the 21st century. The current crisis has created an urgent need to develop therapeutics. Even though a number of vaccines are available, alternative strategies targeting essential viral components are required as a backup against the emergence of lethal viral variants. One such target is the main protease (Mpro), which plays an indispensable role in viral replication. The availability of over 270 Mpro X-ray structures in complex with inhibitors provides unique insights into ligand-protein interactions. Herein, we provide a comprehensive comparison of all nonredundant ligand-binding sites available for SARS-CoV-2, SARS-CoV, and MERS-CoV Mpro. Extensive adaptive sampling has been used to investigate the structural conservation of ligand-binding sites using Markov state models (MSMs) and to compare conformational dynamics using deep learning based on a convolutional variational autoencoder. Our results indicate that not all ligand-binding sites are dynamically conserved despite high sequence and structural conservation across β-CoV homologs. This highlights the complexity of targeting all three Mpro enzymes with a single pan-inhibitor.
|