1
Cao S, Nüske F, Liu B, Soley MB, Huang X. AMUSET-TICA: A Tensor-Based Approach for Identifying Slow Collective Variables in Biomolecular Dynamics. J Chem Theory Comput 2025; 21:4855-4866. [PMID: 40254940 DOI: 10.1021/acs.jctc.5c00076] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/22/2025]
Abstract
Elucidating collective variables (CVs) for biomolecular dynamics is crucial for understanding numerous biological processes. By leveraging the tensor-train data structure, a multilinear version of the AMUSE (Algorithm for Multiple Unknown Signals) algorithm for Koopman approximation (AMUSEt) was recently developed to identify CVs for biomolecular dynamics. To find slow CVs, AMUSEt transforms input features (e.g., pairwise atomic distances) into nonlinear basis functions (e.g., Gaussian functions) and encodes these nonlinear basis functions within a tensor-train structure via time-lagged correlation functions. Due to the need to fit these tensor-train data structures into computer memory, AMUSEt can handle only a limited number of input features. Consequently, AMUSEt relies on manually selecting and ranking features based on physical intuition to fully capture the slow dynamics. However, when applied to complex biological systems with numerous features, this selection and ranking process becomes increasingly challenging. To address this challenge, here we present AMUSET-TICA (AMUSEt-based Time-lagged Independent Component Analysis), a CV-identification method using time-structure-independent components (tICs) as the input features for AMUSEt. The key insight of AMUSET-TICA lies in its highly effective embedding of high-dimensional atomistic protein conformations, achieved by expanding orthogonal tICs into overlapping Gaussian basis functions through a tensor-product data structure. This eliminates the need for manually selecting and ranking input features for a wide range of biomolecular systems. We demonstrate that AMUSET-TICA consistently and significantly outperforms AMUSEt and tICA in identifying slow CVs for three different biomolecular systems: alanine dipeptide, the N-terminal domain of L9 (NTL9), and the FIP35 WW domain. 
For all these systems, the CVs generated by AMUSET-TICA accurately describe the slowest dynamical modes underlying these biological conformational changes. Furthermore, we show that AMUSET-TICA achieves performance comparable to deep-learning approaches like VAMPnets in identifying the slowest dynamical modes, while being significantly more computationally efficient in terms of CPU time. In addition, the CVs yielded by AMUSET-TICA provide insights into the folding mechanisms of NTL9 and the FIP35 WW domain, including CV3 and CV4 of the WW domain, which capture its two parallel folding pathways. We expect AMUSET-TICA can be widely applied to facilitate the investigation of biomolecular dynamics.
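For orientation, the plain tICA step that the abstract above uses to produce input features can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the AMUSET-TICA implementation: the function names `tica` and `demo`, the symmetrized covariance estimator, the whitening threshold `eps`, and the synthetic two-feature signal are all hypothetical choices made for the example, and the Gaussian basis expansion and tensor-train steps are not shown.

```python
import numpy as np


def tica(X, lag, n_components=2, eps=1e-10):
    """Plain tICA on a (n_frames, n_features) array.

    Returns the leading eigenvalues (autocorrelations at the given lag)
    and the matrix whose columns project mean-free data onto the tICs.
    """
    X = np.asarray(X, dtype=float)
    X = X - X.mean(axis=0)              # remove the mean of each feature
    A, B = X[:-lag], X[lag:]            # time-lagged pairs (x_t, x_{t+lag})
    n = A.shape[0]
    # Symmetrized instantaneous and time-lagged covariance matrices
    C0 = (A.T @ A + B.T @ B) / (2 * n)
    Ct = (A.T @ B + B.T @ A) / (2 * n)
    # Whiten with C0, dropping near-singular directions
    s, U = np.linalg.eigh(C0)
    keep = s > eps * s.max()
    W = U[:, keep] / np.sqrt(s[keep])
    # Ordinary symmetric eigenproblem in the whitened space
    lam, V = np.linalg.eigh(W.T @ Ct @ W)
    order = np.argsort(lam)[::-1][:n_components]
    return lam[order], W @ V[:, order]


def demo(n_frames=20000, lag=10, seed=0):
    """Mix one slow and one fast AR(1) signal and recover the slow one."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal((n_frames, 2))
    slow = np.zeros(n_frames)
    fast = np.zeros(n_frames)
    for t in range(1, n_frames):
        slow[t] = 0.99 * slow[t - 1] + noise[t, 0]   # long memory
        fast[t] = 0.50 * fast[t - 1] + noise[t, 1]   # short memory
    X = np.column_stack([slow + fast, slow - fast])  # observed features
    eigvals, proj = tica(X, lag)
    tic1 = (X - X.mean(axis=0)) @ proj[:, 0]
    corr = abs(np.corrcoef(tic1, slow)[0, 1])
    return eigvals, corr
```

Calling `demo()` returns the two tICA eigenvalues and the absolute correlation between the leading tIC and the hidden slow signal. In the method described above, such tICs would then be expanded into Gaussian basis functions rather than used directly.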
Affiliation(s)
- Siqin Cao
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Feliks Nüske
  - Max-Planck-Institute for Dynamics of Complex Technical Systems, Magdeburg 39106, Germany
- Bojun Liu
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Micheline B Soley
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Xuhui Huang
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
2
Xi K, Liu J, Zhu L. Locating Multiple Transition Pathways for Complex Biomolecules. J Chem Inf Model 2025; 65:2961-2973. [PMID: 40064618 DOI: 10.1021/acs.jcim.4c01604] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/25/2025]
Abstract
Locating the low free energy paths (LFEPs) connecting different conformational states is among the major tasks in simulations of complex biomolecules, as the pathways encode the physical essence and, therefore, the underlying mechanism of their functional dynamics. Finding the LFEPs remains challenging due to the numerous degrees of freedom of the molecules and expensive force calculations. To alleviate this issue, we have previously introduced a Traveling-Salesman-based Automated Path Searching (TAPS) approach that requires minimal input information to locate the LFEP closest to a given initial guess path. Despite its high efficiency for large biomolecules, it remains, like all path-searching methods, incapable of revealing multiple parallel LFEPs simultaneously, even though such parallel paths are near-ubiquitous. This work describes a comprehensive protocol that efficiently yields parallel LFEPs. Our protocol starts with a modified version of the parallel cascade approach, which extensively searches for a large pool of geometrically distinct paths of the target molecule in implicit solvent. These paths are clustered and then filtered by their cumulative barriers, yielding a smaller set of initial paths for subsequent optimization by TAPS in explicit solvent. Through this protocol, we successfully sampled eight LFEPs for the transition of Met-enkephalin from its 3₁₀-helix to the β-turn form, whose highest barriers range from 4.57 to 14.72 k_BT. Remarkably, for the activation of the L99A variant of T4 lysozyme (T4L-L99A), our approach revealed four LFEPs. Among them, the dominant and second most favorable paths (barriers of 11.8 and 19.2 k_BT) resemble previously reported mechanisms, while the other two (barriers of 23.7 and 25.3 k_BT) offer novel mechanistic insights into the flipping of residues M102/M106 and the anticlockwise flipping of F114. These results demonstrate our protocol's robustness and efficiency in providing multiple transition paths for complex conformational changes of biomolecules.
Affiliation(s)
- Kun Xi
  - School of Medicine and Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen 518172, China
- Jinchu Liu
  - School of Medicine and Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen 518172, China
- Lizhe Zhu
  - School of Medicine and Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen 518172, China
3
Goonetilleke EC, Huang X. Targeting Bacterial RNA Polymerase: Harnessing Simulations and Machine Learning to Design Inhibitors for Drug-Resistant Pathogens. Biochemistry 2025; 64:1169-1179. [PMID: 40014017 PMCID: PMC12016775 DOI: 10.1021/acs.biochem.4c00751] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/28/2025]
Abstract
The increase in antimicrobial resistance presents a major challenge in treating bacterial infections, underscoring the need for innovative drug discovery approaches and novel inhibitors. Bacterial RNA polymerase (RNAP) has emerged as a crucial target for antibiotic development due to its essential role in transcription. RNAP is a molecular motor, and its function relies heavily on the dynamic shifts between multiple conformational states. While biochemical and structural experimental methods offer crucial insights into static RNAP-drug interactions, they fall short in capturing the dynamics at a molecular level. By integrating experimental data with advanced computational techniques like Markov State Models (MSMs), Generalized Master Equation (GME) Models and other machine-learning models constructed from molecular dynamics (MD) simulations, researchers can elucidate novel cryptic pockets that open transiently for antibiotic compounds and gain a more nuanced and comprehensive understanding of RNAP-drug interactions. This integrated approach not only deepens our fundamental knowledge but also enables more targeted and efficient antibiotic design strategies. In this Perspective, we highlight how this synergy between experimental and computational methods has the potential to open new pathways for innovative drug design and combination therapies that may help turn the tide in the ongoing battle against antibiotic-resistant bacteria.
Affiliation(s)
- Eshani C. Goonetilleke
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Xuhui Huang
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
4
Liu B, Boysen JG, Unarta IC, Du X, Li Y, Huang X. Exploring transition states of protein conformational changes via out-of-distribution detection in the hyperspherical latent space. Nat Commun 2025; 16:349. [PMID: 39753544 PMCID: PMC11699157 DOI: 10.1038/s41467-024-55228-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2024] [Accepted: 12/05/2024] [Indexed: 01/06/2025]
Abstract
Identifying transition states is crucial for understanding protein conformational changes that underlie numerous biological processes. Markov state models (MSMs), built from molecular dynamics (MD) simulations, capture these dynamics through transitions among metastable conformational states and have demonstrated success in studying protein conformational changes. However, MSMs face challenges in identifying transition states, as they partition MD conformations into discrete metastable states (or free energy minima) and thus lack a description of transition states located at the free energy barriers. Here, we introduce Transition State identification via Dispersion and vAriational principle Regularized neural networks (TS-DAR), a deep learning framework inspired by out-of-distribution (OOD) detection in trustworthy artificial intelligence (AI). TS-DAR offers an end-to-end pipeline that can simultaneously detect all transition states between multiple free energy minima from MD simulations using regularized hyperspherical embeddings in latent space. The key insight of TS-DAR lies in treating transition state structures as OOD data, recognizing that they are sparsely populated and exhibit a distributional shift from metastable states. We demonstrate the power of TS-DAR by applying it to a 2D potential, alanine dipeptide, and the translocation of a DNA motor protein on DNA, where it outperforms previous methods in identifying transition states.
Affiliation(s)
- Bojun Liu
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, WI, 53706, USA
  - Data Science Institute, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Jordan G Boysen
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Ilona Christy Unarta
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, WI, 53706, USA
  - Data Science Institute, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Xuefeng Du
  - Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Yixuan Li
  - Data Science Institute, University of Wisconsin-Madison, Madison, WI, 53706, USA
  - Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Xuhui Huang
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, WI, 53706, USA
  - Data Science Institute, University of Wisconsin-Madison, Madison, WI, 53706, USA
5
Xu T, Li Y, Gao X, Zhang L. Understanding the Fast-Triggering Unfolding Dynamics of FK-11 upon Photoexcitation of Azobenzene. J Phys Chem Lett 2024; 15:3531-3540. [PMID: 38526058 DOI: 10.1021/acs.jpclett.4c00091] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/26/2024]
Abstract
Photoswitchable molecules can control the activity and functions of biomolecules by triggering conformational changes. However, it is still challenging to fully understand such fast-triggering conformational evolution from nonequilibrium to equilibrium distribution at the molecular level. Herein, we successfully simulated the unfolding of the FK-11 peptide upon the photoinduced trans-to-cis isomerization of azobenzene based on the Markov state model. We found that the ensemble of FK-11 contains five conformational states, constituting two unfolding pathways. More intriguingly, we observed the microsecond-scale conformational propagation of the FK-11 peptide from the fully folded state to the equilibrium populations of the five states. The computed CD spectra match well with the experimental data, validating our simulation method. Overall, our study not only offers a protocol to study the photoisomerization-induced conformational changes of enzymes but also could orientate the rational design of a photoswitchable molecule to manipulate biological functions.
Affiliation(s)
- Tiantian Xu
  - State Key Laboratory of Structural Chemistry, Fujian Institute of Research on the Structure of Matter, Chinese Academy of Sciences, Fuzhou, Fujian 350002, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Yongfang Li
  - State Key Laboratory of Structural Chemistry, Fujian Institute of Research on the Structure of Matter, Chinese Academy of Sciences, Fuzhou, Fujian 350002, China
- Xin Gao
  - Computational Bioscience Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal 23955, Saudi Arabia
  - Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955, Saudi Arabia
- Lu Zhang
  - State Key Laboratory of Structural Chemistry, Fujian Institute of Research on the Structure of Matter, Chinese Academy of Sciences, Fuzhou, Fujian 350002, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
  - Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Fuzhou, Fujian 361005, China
6
Wu Y, Cao S, Qiu Y, Huang X. Tutorial on how to build non-Markovian dynamic models from molecular dynamics simulations for studying protein conformational changes. J Chem Phys 2024; 160:121501. [PMID: 38516972 PMCID: PMC10964226 DOI: 10.1063/5.0189429] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Accepted: 02/20/2024] [Indexed: 03/23/2024]
Abstract
Protein conformational changes play crucial roles in their biological functions. In recent years, the Markov State Model (MSM) constructed from extensive Molecular Dynamics (MD) simulations has emerged as a powerful tool for modeling complex protein conformational changes. In MSMs, dynamics are modeled as a sequence of Markovian transitions among metastable conformational states at discrete time intervals (called lag time). A major challenge for MSMs is that the lag time must be long enough to allow transitions among states to become memoryless (or Markovian). However, this lag time is constrained by the length of individual MD simulations available to track these transitions. To address this challenge, we have recently developed Generalized Master Equation (GME)-based approaches, encoding non-Markovian dynamics using a time-dependent memory kernel. In this Tutorial, we introduce the theory behind two recently developed GME-based non-Markovian dynamic models: the quasi-Markov State Model (qMSM) and the Integrative Generalized Master Equation (IGME). We subsequently outline the procedures for constructing these models and provide a step-by-step tutorial on applying qMSM and IGME to study two peptide systems: alanine dipeptide and villin headpiece. This Tutorial is available at https://github.com/xuhuihuang/GME_tutorials. The protocols detailed in this Tutorial aim to be accessible for non-experts interested in studying the biomolecular dynamics using these non-Markovian dynamic models.
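The role of the lag time described above can be made concrete with a small sketch of a standard MSM estimate. This is only the Markovian baseline, not the qMSM or IGME procedure from the Tutorial (for those, see the repository linked in the abstract). The function names `estimate_msm` and `implied_timescales` are hypothetical, and the simple row-normalized count estimator is an assumption chosen for brevity.

```python
import numpy as np


def estimate_msm(dtraj, n_states, lag):
    """Row-normalized transition matrix from a discrete state trajectory.

    Counts every transition dtraj[t] -> dtraj[t + lag] (sliding window)
    and divides each row by its total count.
    """
    dtraj = np.asarray(dtraj, dtype=int)
    counts = np.zeros((n_states, n_states))
    np.add.at(counts, (dtraj[:-lag], dtraj[lag:]), 1.0)
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows with no observed transitions are left as zeros
    return np.divide(counts, row_sums,
                     out=np.zeros_like(counts), where=row_sums > 0)


def implied_timescales(T, lag):
    """Implied timescales t_i = -lag / ln|lambda_i| for i >= 2.

    The stationary eigenvalue (lambda_1 = 1) is skipped. Timescales are
    returned in the same time unit as `lag`.
    """
    eigvals = np.linalg.eigvals(T)
    eigvals = eigvals[np.argsort(-eigvals.real)]
    return -lag / np.log(np.abs(eigvals[1:]))


# Tiny worked example: a 7-frame, two-state trajectory at lag 1
T_example = estimate_msm([0, 0, 1, 0, 1, 1, 0], n_states=2, lag=1)
# A known two-state matrix whose non-stationary eigenvalue is 0.7
ts_example = implied_timescales(np.array([[0.9, 0.1], [0.2, 0.8]]), lag=1)
```

In practice one would repeat the estimate over a range of lag times and look for the lag beyond which the implied timescales level off. That plateau is the point at which the Markovian assumption becomes reasonable, and it is the requirement that the GME-based models aim to relax.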
Affiliation(s)
- Yue Wu
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Siqin Cao
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Yunrui Qiu
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Xuhui Huang
  - Author to whom correspondence should be addressed:
7
Wang X, Xu T, Yao Y, Cheung PPH, Gao X, Zhang L. SARS-CoV-2 RNA-Dependent RNA Polymerase Follows Asynchronous Translocation Pathway for Viral Transcription and Replication. J Phys Chem Lett 2023; 14:10119-10128. [PMID: 37922192 DOI: 10.1021/acs.jpclett.3c01249] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/05/2023]
Abstract
Translocation is one essential step for the SARS-CoV-2 RNA-dependent RNA polymerase (RdRp) to exert viral replication and transcription. Although cryo-EM structures of SARS-CoV-2 RdRp are available, the molecular mechanisms of dynamic translocation remain elusive. Herein, we constructed a Markov state model based on extensive molecular dynamics simulations to elucidate the translocation dynamics of the SARS-CoV-2 RdRp. We identified two intermediates that pinpoint the rate-limiting step of translocation and characterize the asynchronous movement of the template-primer duplex. The 3'-terminal nucleotide in the primer strand lags behind due to the uneven distribution of protein-RNA interactions, while the translocation of the template strand is delayed by the hurdle residue K500. Even so, the two strands share the same "ratchet" to stabilize the polymerase in the post-translocation state, suggesting a Brownian-ratchet model. Overall, our study provides intriguing insights into SARS-CoV-2 replication and transcription, which would open a new avenue for drug discoveries.
Affiliation(s)
- Xiaowei Wang
  - Department of Chemical and Biological Engineering and Department of Mathematics, Hong Kong University of Science and Technology, Kowloon, Clear Water Bay, Hong Kong
- Tiantian Xu
  - State Key Laboratory of Structural Chemistry, Fujian Institute of Research on the Structure of Matter, Chinese Academy of Sciences, Fuzhou, Fujian 350002, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Yuan Yao
  - Department of Chemical and Biological Engineering and Department of Mathematics, Hong Kong University of Science and Technology, Kowloon, Clear Water Bay, Hong Kong
- Peter Pak-Hang Cheung
  - Li Ka Shing Institute of Health Sciences, Department of Chemical Pathology, Chinese University of Hong Kong, 999077, Hong Kong
- Xin Gao
  - Computational Bioscience Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal 23955, Saudi Arabia
  - Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955, Saudi Arabia
- Lu Zhang
  - State Key Laboratory of Structural Chemistry, Fujian Institute of Research on the Structure of Matter, Chinese Academy of Sciences, Fuzhou, Fujian 350002, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
  - Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Fuzhou, Fujian 361005, China
8
Cao S, Qiu Y, Kalin ML, Huang X. Integrative generalized master equation: A method to study long-timescale biomolecular dynamics via the integrals of memory kernels. J Chem Phys 2023; 159:134106. [PMID: 37787134 PMCID: PMC11005468 DOI: 10.1063/5.0167287] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 07/11/2023] [Accepted: 09/18/2023] [Indexed: 10/04/2023]
Abstract
The generalized master equation (GME) provides a powerful approach to study biomolecular dynamics via non-Markovian dynamic models built from molecular dynamics (MD) simulations. Previously, we have implemented the GME, namely the quasi Markov State Model (qMSM), where we explicitly calculate the memory kernel and propagate dynamics using a discretized GME. qMSM can be constructed with much shorter MD trajectories than the MSM. However, since qMSM needs to explicitly compute the time-dependent memory kernels, it is heavily affected by the numerical fluctuations of simulation data when applied to study biomolecular conformational changes. This can lead to numerical instability of predicted long-time dynamics, greatly limiting the applicability of qMSM in complicated biomolecules. We present a new method, the Integrative GME (IGME), in which we analytically solve the GME under the condition when the memory kernels have decayed to zero. Our IGME overcomes the challenges of the qMSM by using the time integrations of memory kernels, thereby avoiding the numerical instability caused by explicit computation of time-dependent memory kernels. Using our solutions of the GME, we have developed a new approach to compute long-time dynamics based on MD simulations in a numerically stable, accurate and efficient way. To demonstrate its effectiveness, we have applied the IGME in three biomolecules: the alanine dipeptide, FIP35 WW-domain, and Taq RNA polymerase. In each system, the IGME achieves significantly smaller fluctuations for both memory kernels and long-time dynamics compared to the qMSM. We anticipate that the IGME can be widely applied to investigate biomolecular conformational changes.
Affiliation(s)
- Siqin Cao
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Yunrui Qiu
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Michael L. Kalin
  - Biophysics Graduate Program, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Xuhui Huang
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
9
Hellemann E, Durrant JD. Worth the Weight: Sub-Pocket EXplorer (SubPEx), a Weighted Ensemble Method to Enhance Binding-Pocket Conformational Sampling. J Chem Theory Comput 2023; 19:5677-5689. [PMID: 37585617 PMCID: PMC10500992 DOI: 10.1021/acs.jctc.3c00478] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2023] [Indexed: 08/18/2023]
Abstract
Structure-based virtual screening (VS) is an effective method for identifying potential small-molecule ligands, but traditional VS approaches consider only a single binding-pocket conformation. Consequently, they struggle to identify ligands that bind to alternate conformations. Ensemble docking helps address this issue by incorporating multiple conformations into the docking process, but it depends on methods that can thoroughly explore pocket flexibility. We here introduce Sub-Pocket EXplorer (SubPEx), an approach that uses weighted ensemble (WE) path sampling to accelerate binding-pocket sampling. As proof of principle, we apply SubPEx to three proteins relevant to drug discovery: heat shock protein 90, influenza neuraminidase, and yeast hexokinase 2. SubPEx is available free of charge without registration under the terms of the open-source MIT license: http://durrantlab.com/subpex/.
Affiliation(s)
- Erich Hellemann
  - Department of Biological Sciences, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, United States
- Jacob D. Durrant
  - Department of Biological Sciences, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, United States
10
Qiu Y, O’Connor MS, Xue M, Liu B, Huang X. An Efficient Path Classification Algorithm Based on Variational Autoencoder to Identify Metastable Path Channels for Complex Conformational Changes. J Chem Theory Comput 2023; 19:4728-4742. [PMID: 37382437 PMCID: PMC11042546 DOI: 10.1021/acs.jctc.3c00318] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Indexed: 06/30/2023]
Abstract
Conformational changes (i.e., dynamic transitions between pairs of conformational states) play important roles in many chemical and biological processes. Constructing the Markov state model (MSM) from extensive molecular dynamics (MD) simulations is an effective approach to dissect the mechanism of conformational changes. When combined with transition path theory (TPT), MSM can be applied to elucidate the ensemble of kinetic pathways connecting pairs of conformational states. However, the application of TPT to analyze complex conformational changes often results in a vast number of kinetic pathways with comparable fluxes. This obstacle is particularly pronounced in heterogeneous self-assembly and aggregation processes. The large number of kinetic pathways makes it challenging to comprehend the molecular mechanisms underlying conformational changes of interest. To address this challenge, we have developed a path classification algorithm named latent-space path clustering (LPC) that efficiently lumps parallel kinetic pathways into distinct metastable path channels, making them easier to comprehend. In our algorithm, MD conformations are first projected onto a low-dimensional space containing a small set of collective variables (CVs) by time-structure-based independent component analysis (tICA) with kinetic mapping. Then, MSM and TPT are constructed to obtain the ensemble of pathways, and a deep learning architecture named the variational autoencoder (VAE) is used to learn the spatial distributions of kinetic pathways in the continuous CV space. Based on the trained VAE model, the TPT-generated ensemble of kinetic pathways can be embedded into a latent space, where the classification becomes clear. We show that LPC can efficiently and accurately identify the metastable path channels in three systems: a 2D potential, the aggregation of two hydrophobic particles in water, and the folding of the Fip35 WW domain. 
Using the 2D potential, we further demonstrate that our LPC algorithm outperforms the previous path-lumping algorithms by making substantially fewer incorrect assignments of individual pathways to four path channels. We expect that LPC can be widely applied to identify the dominant kinetic pathways underlying complex conformational changes.
Affiliation(s)
- Yunrui Qiu
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Michael S. O’Connor
  - Biophysics Graduate Program, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Mingyi Xue
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Bojun Liu
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Xuhui Huang
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, WI, 53706, USA
  - Biophysics Graduate Program, University of Wisconsin-Madison, Madison, WI, 53706, USA
11
Hellemann E, Durrant JD. Worth the weight: Sub-Pocket EXplorer (SubPEx), a weighted-ensemble method to enhance binding-pocket conformational sampling. bioRxiv 2023:2023.05.03.539330. [PMID: 37251500 PMCID: PMC10214482 DOI: 10.1101/2023.05.03.539330] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2023]
Abstract
Structure-based virtual screening (VS) is an effective method for identifying potential small-molecule ligands, but traditional VS approaches consider only a single binding-pocket conformation. Consequently, they struggle to identify ligands that bind to alternate conformations. Ensemble docking helps address this issue by incorporating multiple conformations into the docking process, but it depends on methods that can thoroughly explore pocket flexibility. We here introduce Sub-Pocket EXplorer (SubPEx), an approach that uses weighted ensemble (WE) path sampling to accelerate binding-pocket sampling. As proof of principle, we apply SubPEx to three proteins relevant to drug discovery: heat shock protein 90, influenza neuraminidase, and yeast hexokinase 2. SubPEx is available free of charge without registration under the terms of the open-source MIT license: http://durrantlab.com/subpex/.
Affiliation(s)
- Erich Hellemann
  - Department of Biological Sciences, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, United States
- Jacob D. Durrant
  - Department of Biological Sciences, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, United States
12
Dominic AJ, Cao S, Montoya-Castillo A, Huang X. Memory Unlocks the Future of Biomolecular Dynamics: Transformative Tools to Uncover Physical Insights Accurately and Efficiently. J Am Chem Soc 2023; 145:9916-9927. [PMID: 37104720 DOI: 10.1021/jacs.3c01095] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Indexed: 04/29/2023]
Abstract
Conformational changes underpin function and encode complex biomolecular mechanisms. Gaining atomic-level detail of how such changes occur has the potential to reveal these mechanisms and is of critical importance in identifying drug targets, facilitating rational drug design, and enabling bioengineering applications. While the past two decades have brought Markov state model techniques to the point where practitioners can regularly use them to glimpse the long-time dynamics of slow conformations in complex systems, many systems are still beyond their reach. In this Perspective, we discuss how including memory (i.e., non-Markovian effects) can reduce the computational cost to predict the long-time dynamics in these complex systems by orders of magnitude and with greater accuracy and resolution than state-of-the-art Markov state models. We illustrate how memory lies at the heart of successful and promising techniques, ranging from the Fokker-Planck and generalized Langevin equations to deep-learning recurrent neural networks and generalized master equations. We delineate how these techniques work, identify insights that they can offer in biomolecular systems, and discuss their advantages and disadvantages in practical settings. We show how generalized master equations can enable the investigation of, for example, the gate-opening process in RNA polymerase II and demonstrate how our recent advances tame the deleterious influence of statistical underconvergence of the molecular dynamics simulations used to parameterize these techniques. This represents a significant leap forward that will enable our memory-based techniques to interrogate systems that are currently beyond the reach of even the best Markov state models. We conclude by discussing some current challenges and future prospects for how exploiting memory will open the door to many exciting opportunities.
Affiliation(s)
- Anthony J Dominic
- Department of Chemistry, University of Colorado Boulder, Boulder, Colorado 80309, USA
| | - Siqin Cao
- Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
| | | | - Xuhui Huang
- Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
|
13
|
Dominic AJ, Sayer T, Cao S, Markland TE, Huang X, Montoya-Castillo A. Building insightful, memory-enriched models to capture long-time biochemical processes from short-time simulations. Proc Natl Acad Sci U S A 2023; 120:e2221048120. [PMID: 36920924 PMCID: PMC10041170 DOI: 10.1073/pnas.2221048120] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 12/11/2022] [Accepted: 02/21/2023] [Indexed: 03/16/2023] Open
Abstract
The ability to predict and understand complex molecular motions occurring over diverse timescales ranging from picoseconds to seconds and even hours in biological systems remains one of the largest challenges to chemical theory. Markov state models (MSMs), which provide a memoryless description of the transitions between different states of a biochemical system, have provided numerous important physically transparent insights into biological function. However, constructing these models often necessitates performing extremely long molecular simulations to converge the rates. Here, we show that by incorporating memory via the time-convolutionless generalized master equation (TCL-GME) one can build a theoretically transparent and physically intuitive memory-enriched model of biochemical processes with up to a three order of magnitude reduction in the simulation data required while also providing a higher temporal resolution. We derive the conditions under which the TCL-GME provides a more efficient means to capture slow dynamics than MSMs and rigorously prove when the two provide equally valid and efficient descriptions of the slow configurational dynamics. We further introduce a simple averaging procedure that enables our TCL-GME approach to quickly converge and accurately predict long-time dynamics even when parameterized with noisy reference data arising from short trajectories. We illustrate the advantages of the TCL-GME using alanine dipeptide, the human argonaute complex, and FiP35 WW domain.
Affiliation(s)
| | - Thomas Sayer
- Department of Chemistry, University of Colorado, Boulder, CO 80309
| | - Siqin Cao
- Department of Chemistry, University of Wisconsin-Madison, Madison, WI 53706
| | | | - Xuhui Huang
- Department of Chemistry, University of Wisconsin-Madison, Madison, WI 53706
|
14
|
Unarta IC, Goonetilleke EC, Wang D, Huang X. Nucleotide addition and cleavage by RNA polymerase II: Coordination of two catalytic reactions using a single active site. J Biol Chem 2022; 299:102844. [PMID: 36581202 PMCID: PMC9860460 DOI: 10.1016/j.jbc.2022.102844] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 07/03/2022] [Revised: 12/19/2022] [Accepted: 12/22/2022] [Indexed: 12/28/2022] Open
Abstract
RNA polymerase II (Pol II) incorporates complementary ribonucleotides into the growing RNA chain one at a time via the nucleotide addition cycle. The nucleotide addition cycle, however, is prone to misincorporation of noncomplementary nucleotides. Thus, to ensure transcriptional fidelity, Pol II backtracks and then cleaves the misincorporated nucleotides. These two reverse reactions, nucleotide addition and cleavage, are catalyzed in the same active site of Pol II, which is different from DNA polymerases or other endonucleases. Recently, substantial progress has been made to understand how Pol II effectively performs its dual role in the same active site. Our review highlights these recent studies and provides an overall model of the catalytic mechanisms of Pol II. In particular, RNA extension follows the two-metal-ion mechanism, and several Pol II residues play important roles to facilitate the catalysis. In sharp contrast, the cleavage reaction is independent of any Pol II residues. Interestingly, Pol II relies on its residues to recognize the misincorporated nucleotides during the backtracking process, prior to cleavage. In this way, Pol II efficiently compartmentalizes its two distinct catalytic functions using the same active site. Lastly, we also discuss a new perspective on the potential third Mg2+ in the nucleotide addition and intrinsic cleavage reactions.
Affiliation(s)
- Ilona Christy Unarta
- Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin, USA
| | - Eshani C Goonetilleke
- Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin, USA
| | - Dong Wang
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California, USA; Department of Cellular and Molecular Medicine, School of Medicine, University of California, San Diego, La Jolla, California, USA; Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, California, USA.
| | - Xuhui Huang
- Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin, USA.
|
15
|
Gu H, Wang W, Cao S, Unarta IC, Yao Y, Sheong FK, Huang X. RPnet: a reverse-projection-based neural network for coarse-graining metastable conformational states for protein dynamics. Phys Chem Chem Phys 2022; 24:1462-1474. [PMID: 34985469 DOI: 10.1039/d1cp03622j] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 12/31/2022]
Abstract
The Markov State Model (MSM) is a powerful tool for modeling long timescale dynamics based on numerous short molecular dynamics (MD) simulation trajectories, which makes it a useful tool for elucidating the conformational changes of biological macromolecules. By partitioning the phase space into discretized states and estimating the probabilities of inter-state transitions based on short MD trajectories, one can construct a kinetic network model that could be used to extrapolate long-timescale kinetics if the Markovian condition is met. However, meeting the Markovian condition often requires hundreds or even thousands of states (microstates), which greatly hinders the comprehension of the conformational dynamics of complex biomolecules. Kinetic lumping algorithms can coarse grain numerous microstates into a handful of metastable states (macrostates), which would greatly facilitate the elucidation of biological mechanisms. In this work, we have developed a reverse-projection-based neural network (RPnet) to lump microstates into macrostates, by making use of a physics-based loss function that is based on the projection operator framework of conformational dynamics. By recognizing that microstate and macrostate transition modes can be related through a projection process, we have developed a reverse-projection scheme to directly compare the microstate and macrostate dynamics. Based on this reverse-projection scheme, we designed a loss function that allows the effective assessment of the quality of a given kinetic lumping. We then make use of a neural network to efficiently minimize this loss function to obtain an optimized set of macrostates. We have demonstrated the power of our RPnet in analyzing the dynamics of a numerical 2D potential, alanine dipeptide, and the clamp opening of an RNA polymerase. 
In all these systems, we have illustrated that our method could yield comparable or better results than competing methods in terms of state partitioning and reproduction of slow dynamics. We expect that our RPnet holds promise in analyzing the conformational dynamics of biological macromolecules.
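The coarse-graining step described in this abstract can be illustrated with a plain population-weighted lumping of a microstate transition matrix. This is only a sketch of what "lumping microstates into macrostates" means numerically; it is not RPnet, it contains no neural network or reverse-projection loss, and the function name, the toy matrix, and the hand-picked assignment are all assumptions made for illustration.

```python
import numpy as np

def lump_transition_matrix(T_micro, pi_micro, assignment, n_macro):
    """Coarse-grain a row-stochastic microstate matrix into macrostates.

    T_macro[A, B] = sum_{i in A} pi_i * sum_{j in B} T_ij / sum_{i in A} pi_i
    (hypothetical helper; a generic lumping rule, not the RPnet loss).
    """
    T_micro = np.asarray(T_micro, dtype=float)
    pi_micro = np.asarray(pi_micro, dtype=float)
    assignment = np.asarray(assignment)
    # Membership matrix: M[i, A] = 1 if microstate i belongs to macrostate A.
    M = np.zeros((len(assignment), n_macro))
    M[np.arange(len(assignment)), assignment] = 1.0
    # Equilibrium flux between macrostates, then normalize by macrostate population.
    flux = M.T @ (pi_micro[:, None] * T_micro) @ M
    pi_macro = M.T @ pi_micro
    return flux / pi_macro[:, None], pi_macro

# Toy example: four microstates forming two metastable pairs (assumed values).
T_micro = np.array([
    [0.89, 0.10, 0.01, 0.00],
    [0.10, 0.89, 0.00, 0.01],
    [0.01, 0.00, 0.89, 0.10],
    [0.00, 0.01, 0.10, 0.89],
])
pi_micro = np.full(4, 0.25)          # uniform, because T_micro is symmetric
assignment = np.array([0, 0, 1, 1])  # microstate -> macrostate label
T_macro, pi_macro = lump_transition_matrix(T_micro, pi_micro, assignment, 2)
```

In this toy case the fast exchange inside each pair disappears and only the slow hop between the two pairs remains in `T_macro`. A lumping method such as the one in the abstract would search over `assignment` rather than take it as given.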
Affiliation(s)
- Hanlin Gu
- Department of Mathematics, Hong Kong University of Science and Technology, Kowloon, Hong Kong
| | - Wei Wang
- Department of Chemistry, Hong Kong University of Science and Technology, Kowloon, Hong Kong.
| | - Siqin Cao
- Department of Chemistry, Hong Kong University of Science and Technology, Kowloon, Hong Kong.
| | - Ilona Christy Unarta
- Department of Chemical and Biological Engineering, Hong Kong University of Science and Technology, Kowloon, Hong Kong
| | - Yuan Yao
- Department of Mathematics, Hong Kong University of Science and Technology, Kowloon, Hong Kong
| | - Fu Kit Sheong
- Department of Chemistry, Hong Kong University of Science and Technology, Kowloon, Hong Kong; Institute for Advanced Study, Hong Kong University of Science and Technology, Kowloon, Hong Kong
| | - Xuhui Huang
- Department of Chemistry, Hong Kong University of Science and Technology, Kowloon, Hong Kong; Department of Chemical and Biological Engineering, Hong Kong University of Science and Technology, Kowloon, Hong Kong
|
16
|
Saikia N, Yanez-Orozco IS, Qiu R, Hao P, Milikisiyants S, Ou E, Hamilton GL, Weninger KR, Smirnova TI, Sanabria H, Ding F. Integrative structural dynamics probing of the conformational heterogeneity in synaptosomal-associated protein 25. Cell Rep Phys Sci 2021; 2:100616. [PMID: 34888535 PMCID: PMC8654206 DOI: 10.1016/j.xcrp.2021.100616] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 05/28/2023]
Abstract
SNAP-25 (synaptosomal-associated protein of 25 kDa) is a prototypical intrinsically disordered protein (IDP) that is unstructured by itself but forms coiled-coil helices in the SNARE complex. With high conformational heterogeneity, detailed structural dynamics of unbound SNAP-25 remain elusive. Here, we report an integrative method to probe the structural dynamics of SNAP-25 by combining replica-exchange discrete molecular dynamics (rxDMD) simulations and label-based experiments at ensemble and single-molecule levels. The rxDMD simulations systematically characterize the coil-to-molten globular transition and reconstruct structural ensemble consistent with prior ensemble experiments. Label-based experiments using Förster resonance energy transfer and double electron-electron resonance further probe the conformational dynamics of SNAP-25. Agreements between simulations and experiments under both ensemble and single-molecule conditions allow us to assign specific helix-coil transitions in SNAP-25 that occur in submillisecond timescales and potentially play a vital role in forming the SNARE complex. We expect that this integrative approach may help further our understanding of IDPs.
Affiliation(s)
- Nabanita Saikia
- Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA
- Department of Chemistry, Navajo Technical University, Chinle, AZ 86503, USA
| | | | - Ruoyi Qiu
- Department of Physics, North Carolina State University, Raleigh, NC 27695, USA
| | - Pengyu Hao
- Department of Physics, North Carolina State University, Raleigh, NC 27695, USA
| | - Sergey Milikisiyants
- Department of Chemistry, North Carolina State University, Raleigh, NC 27695, USA
| | - Erkang Ou
- Department of Chemistry, North Carolina State University, Raleigh, NC 27695, USA
| | - George L. Hamilton
- Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA
| | - Keith R. Weninger
- Department of Physics, North Carolina State University, Raleigh, NC 27695, USA
| | - Tatyana I. Smirnova
- Department of Chemistry, North Carolina State University, Raleigh, NC 27695, USA
| | - Hugo Sanabria
- Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA
| | - Feng Ding
- Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA
- Lead contact
|
17
|
Konovalov K, Unarta IC, Cao S, Goonetilleke EC, Huang X. Markov State Models to Study the Functional Dynamics of Proteins in the Wake of Machine Learning. JACS Au 2021; 1:1330-1341. [PMID: 34604842 PMCID: PMC8479766 DOI: 10.1021/jacsau.1c00254] [Citation(s) in RCA: 76] [Impact Index Per Article: 19.0] [Received: 06/07/2021] [Indexed: 05/19/2023]
Abstract
Markov state models (MSMs) based on molecular dynamics (MD) simulations are routinely employed to study protein folding, however, their application to functional conformational changes of biomolecules is still limited. In the past few years, the field of computational chemistry has experienced a surge of advancements stemming from machine learning algorithms, and MSMs have not been left out. Unlike global processes, such as protein folding, the application of MSMs to functional conformational changes is challenging because they mostly consist of localized structural transitions. Therefore, it is critical to properly select a subset of structural features that can describe the slowest dynamics of these functional conformational changes. To address this challenge, we recommend several automatic feature selection methods such as Spectral-OASIS. To identify states in MSMs, the chosen features can be subject to dimensionality reduction methods such as TICA or deep learning based VAMPNets to project MD conformations onto a few collective variables for subsequent clustering. Another challenge for the application of MSMs to the study of functional conformational changes is the ability to comprehend their biophysical mechanisms, as MSMs built for these processes often require a large number of states. We recommend the recently developed quasi-MSMs (qMSMs) to address this issue. Compared to MSMs, qMSMs encode the non-Markovian dynamics via the generalized master equation and can significantly reduce the number of states. As a result, qMSMs can be built with a handful of states to facilitate the interpretation of functional conformational changes. In the wake of machine learning, we believe that the rapid advancement in the MSM methodology will lead to their wider application in studying functional conformational changes of biomolecules.
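The dimensionality-reduction step mentioned in this abstract (projecting conformations onto a few collective variables with TICA) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions: the function name `tica`, the symmetrized covariance estimators, and the synthetic two-feature data are not taken from the cited work, and production analyses would normally rely on an established MSM package.

```python
import numpy as np

def tica(X, lag, n_components=2):
    """Minimal TICA sketch: returns (eigenvalues, projection matrix)."""
    X = np.asarray(X, dtype=float)
    X = X - X.mean(axis=0)                  # remove the mean of each feature
    X0, Xt = X[:-lag], X[lag:]              # time-lagged pairs (x_t, x_{t+lag})
    n = len(X0)
    # Symmetrized estimators, which assume reversible (equilibrium) dynamics.
    C0 = (X0.T @ X0 + Xt.T @ Xt) / (2 * n)  # instantaneous covariance
    Ct = (X0.T @ Xt + Xt.T @ X0) / (2 * n)  # time-lagged covariance
    # Solve Ct v = lambda * C0 v by whitening with C0 first.
    s, U = np.linalg.eigh(C0)
    keep = s > 1e-10 * s.max()              # drop near-singular directions
    W = U[:, keep] / np.sqrt(s[keep])
    evals, V = np.linalg.eigh(W.T @ Ct @ W)
    order = np.argsort(evals)[::-1][:n_components]
    return evals[order], W @ V[:, order]

# Toy data (assumed): one slow AR(1) signal and one fast white-noise signal,
# linearly mixed into two observed features.
rng = np.random.default_rng(0)
n_frames = 20000
slow = np.zeros(n_frames)
kicks = rng.normal(size=n_frames)
for t in range(1, n_frames):
    slow[t] = 0.99 * slow[t - 1] + kicks[t]
fast = rng.normal(size=n_frames)
features = np.column_stack([slow + fast, slow - fast])

eigenvalues, components = tica(features, lag=1)
# Projection onto the slowest independent component (the first collective variable).
slow_cv = (features - features.mean(axis=0)) @ components[:, 0]
```

The leading eigenvalue approximates the autocorrelation of the slowest process at the chosen lag, so the first component should recover the hidden slow signal while the second captures the fast noise. Clustering for MSM construction would then be done in the space of the leading components.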
Affiliation(s)
- Kirill A. Konovalov
- Department of Chemistry, State Key Laboratory of Molecular Neuroscience, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
- Hong Kong Center for Neurodegenerative Diseases, Hong Kong Science Park, Hong Kong
| | - Ilona Christy Unarta
- Department of Chemical and Biological Engineering, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
- Hong Kong Center for Neurodegenerative Diseases, Hong Kong Science Park, Hong Kong
| | - Siqin Cao
- Department of Chemistry, State Key Laboratory of Molecular Neuroscience, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
- Hong Kong Center for Neurodegenerative Diseases, Hong Kong Science Park, Hong Kong
| | - Eshani C. Goonetilleke
- Department of Chemistry, State Key Laboratory of Molecular Neuroscience, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
- Hong Kong Center for Neurodegenerative Diseases, Hong Kong Science Park, Hong Kong
| | - Xuhui Huang
- Department of Chemistry, State Key Laboratory of Molecular Neuroscience, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
- Department of Chemical and Biological Engineering, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
- Hong Kong Center for Neurodegenerative Diseases, Hong Kong Science Park, Hong Kong
|
18
|
Kolimi N, Pabbathi A, Saikia N, Ding F, Sanabria H, Alper J. Out-of-Equilibrium Biophysical Chemistry: The Case for Multidimensional, Integrated Single-Molecule Approaches. J Phys Chem B 2021; 125:10404-10418. [PMID: 34506140 PMCID: PMC8474109 DOI: 10.1021/acs.jpcb.1c02424] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 12/22/2022]
Abstract
Out-of-equilibrium processes are ubiquitous across living organisms and all structural hierarchies of life. At the molecular scale, out-of-equilibrium processes (for example, enzyme catalysis, gene regulation, and motor protein functions) cause biological macromolecules to sample an ensemble of conformations over a wide range of time scales. Quantifying and conceptualizing the structure–dynamics to function relationship is challenging because continuously evolving multidimensional energy landscapes are necessary to describe nonequilibrium biological processes in biological macromolecules. In this perspective, we explore the challenges associated with state-of-the-art experimental techniques to understanding biological macromolecular function. We argue that it is time to revisit how we probe and model functional out-of-equilibrium biomolecular dynamics. We suggest that developing integrated single-molecule multiparametric force–fluorescence instruments and using advanced molecular dynamics simulations to study out-of-equilibrium biomolecules will provide a path towards understanding the principles of and mechanisms behind the structure–dynamics to function paradigm in biological macromolecules.
Affiliation(s)
- Narendar Kolimi
- Department of Physics and Astronomy, Clemson University, Clemson, South Carolina 29634, United States
| | - Ashok Pabbathi
- Department of Physics and Astronomy, Clemson University, Clemson, South Carolina 29634, United States
| | - Nabanita Saikia
- Department of Physics and Astronomy, Clemson University, Clemson, South Carolina 29634, United States
| | - Feng Ding
- Department of Physics and Astronomy, Clemson University, Clemson, South Carolina 29634, United States
| | - Hugo Sanabria
- Department of Physics and Astronomy, Clemson University, Clemson, South Carolina 29634, United States
| | - Joshua Alper
- Department of Physics and Astronomy, Clemson University, Clemson, South Carolina 29634, United States; Department of Biological Sciences, Clemson University, Clemson, South Carolina 29634, United States
|
19
|
Xi K, Hu Z, Wu Q, Wei M, Qian R, Zhu L. Assessing the Performance of Traveling-salesman based Automated Path Searching (TAPS) on Complex Biomolecular Systems. J Chem Theory Comput 2021; 17:5301-5311. [PMID: 34270241 DOI: 10.1021/acs.jctc.1c00182] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 12/12/2022]
Abstract
Though crucial for understanding the function of large biomolecular systems, locating the minimum free energy paths (MFEPs) between their key conformational states is far from trivial due to their high-dimensional nature. Most existing path-searching methods require a static collective variable space as input, encoding intuition or prior knowledge of the transition mechanism. Such information is, however, hardly available a priori and expensive to validate. To alleviate this issue, we have previously introduced a Traveling-salesman based Automated Path Searching method (TAPS) and demonstrated its efficiency on simple peptide systems. Having implemented a parallel version of this method, here we assess the performance of TAPS on three realistic systems (tens to hundreds of residues) in explicit solvents. We show that TAPS successfully located the MFEP for the ground/excited state transition of the T4 lysozyme L99A variant, consistent with previous findings. TAPS also helped identifying the important role of the two polar contacts in directing the loop-in/loop-out transition of the mitogen-activated protein kinase kinase (MEK1), which explained previous mutant experiments. Remarkably, at a minimal cost of 126 ns sampling, TAPS revealed that the Ltn40/Ltn10 transition of lymphotactin needs no complete unfolding/refolding of its β-sheets and that five polar contacts are sufficient to stabilize the various partially unfolded intermediates along the MFEP. These results present TAPS as a general and promising tool for studying the functional dynamics of complex biomolecular systems.
Affiliation(s)
- Kun Xi
- Warshel Institute for Computational Biology, School of Life and Health Sciences, The Chinese University of Hong Kong (Shenzhen), Shenzhen, Guangdong 518172, P. R. China; School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, P. R. China
| | - Zhenquan Hu
- Warshel Institute for Computational Biology, School of Life and Health Sciences, The Chinese University of Hong Kong (Shenzhen), Shenzhen, Guangdong 518172, P. R. China; School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, P. R. China
| | - Qiang Wu
- School of Science and Engineering, The Chinese University of Hong Kong (Shenzhen), Shenzhen, Guangdong 518172, P. R. China
| | - Meihan Wei
- Warshel Institute for Computational Biology, School of Life and Health Sciences, The Chinese University of Hong Kong (Shenzhen), Shenzhen, Guangdong 518172, P. R. China
| | - Runtong Qian
- Warshel Institute for Computational Biology, School of Life and Health Sciences, The Chinese University of Hong Kong (Shenzhen), Shenzhen, Guangdong 518172, P. R. China
| | - Lizhe Zhu
- Warshel Institute for Computational Biology, School of Life and Health Sciences, The Chinese University of Hong Kong (Shenzhen), Shenzhen, Guangdong 518172, P. R. China
|
20
|
A comprehensive mechanism for 5-carboxylcytosine-induced transcriptional pausing revealed by Markov state models. J Biol Chem 2021; 296:100735. [PMID: 33991521 PMCID: PMC8191312 DOI: 10.1016/j.jbc.2021.100735] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 12/17/2020] [Revised: 04/27/2021] [Accepted: 04/28/2021] [Indexed: 11/23/2022] Open
Abstract
RNA polymerase II (Pol II) surveils the genome, pausing as it encounters DNA lesions and base modifications and initiating signals for DNA repair among other important regulatory events. Recent work suggests that Pol II pauses at 5-carboxycytosine (5caC), an epigenetic modification of cytosine, because of a specific hydrogen bond between the carboxyl group of 5caC and a specific residue in fork loop 3 of Pol II. This hydrogen bond compromises productive NTP binding and slows down elongation. Apart from this specific interaction, the carboxyl group of 5caC can potentially interact with numerous charged residues in the cleft of Pol II. However, it is not clear how other interactions between Pol II and 5caC contribute to pausing. In this study, we use Markov state models (a type of kinetic network models) built from extensive molecular dynamics simulations to comprehensively study the impact of 5caC on Pol II translocation. We describe two translocation intermediates with specific interactions that prevent the template base from loading into the Pol II active site. In addition to the previously observed state with 5caC constrained by fork loop 3, we discovered a new intermediate state with a hydrogen bond between 5caC and fork loop 2. Surprisingly, we find that 5caC may curb translocation by suppressing kinking of the helix bordering the active site (the bridge helix) because its high flexibility is critical to translocation. Our work provides new insights into how epigenetic modifications of genomic DNA can modulate Pol II translocation, inducing pauses in transcription.
|
21
|
Role of bacterial RNA polymerase gate opening dynamics in DNA loading and antibiotics inhibition elucidated by quasi-Markov State Model. Proc Natl Acad Sci U S A 2021; 118:2024324118. [PMID: 33883282 DOI: 10.1073/pnas.2024324118] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Indexed: 11/18/2022] Open
Abstract
To initiate transcription, the holoenzyme (RNA polymerase [RNAP] in complex with σ factor) loads the promoter DNA via the flexible loading gate created by the clamp and β-lobe, yet their roles in DNA loading have not been characterized. We used a quasi-Markov State Model (qMSM) built from extensive molecular dynamics simulations to elucidate the dynamics of Thermus aquaticus holoenzyme's gate opening. We showed that during gate opening, β-lobe oscillates four orders of magnitude faster than the clamp, whose opening depends on the Switch 2's structure. Myxopyronin, an antibiotic that binds to Switch 2, was shown to undergo a conformational selection mechanism to inhibit clamp opening. Importantly, we reveal a critical but undiscovered role of β-lobe, whose opening is sufficient for DNA loading even when the clamp is partially closed. These findings open the opportunity for the development of antibiotics targeting β-lobe of RNAP. Finally, we have shown that our qMSMs, which encode non-Markovian dynamics based on the generalized master equation formalism, hold great potential to be widely applied to study biomolecular dynamics.
|
22
|
Cao S, Montoya-Castillo A, Wang W, Markland TE, Huang X. On the advantages of exploiting memory in Markov state models for biomolecular dynamics. J Chem Phys 2020; 153:014105. [PMID: 32640825 DOI: 10.1063/5.0010787] [Citation(s) in RCA: 36] [Impact Index Per Article: 9.0] [Indexed: 01/24/2023] Open
Abstract
Biomolecular dynamics play an important role in numerous biological processes. Markov State Models (MSMs) provide a powerful approach to study these dynamic processes by predicting long time scale dynamics based on many short molecular dynamics (MD) simulations. In an MSM, protein dynamics are modeled as a kinetic process consisting of a series of Markovian transitions between different conformational states at discrete time intervals (called "lag time"). To achieve this, a master equation must be constructed with a sufficiently long lag time to allow interstate transitions to become truly Markovian. This imposes a major challenge for MSM studies of proteins since the lag time is bound by the length of relatively short MD simulations available to estimate the frequency of transitions. Here, we show how one can employ the generalized master equation formalism to obtain an exact description of protein conformational dynamics both at short and long time scales without the time resolution restrictions imposed by the MSM lag time. Using a simple kinetic model, alanine dipeptide, and WW domain, we demonstrate that it is possible to construct these quasi-Markov State Models (qMSMs) using MD simulations that are 5-10 times shorter than those required by MSMs. These qMSMs only contain a handful of metastable states and, thus, can greatly facilitate the interpretation of mechanisms associated with protein dynamics. A qMSM opens the door to the study of conformational changes of complex biomolecules where a Markovian model with a few states is often difficult to construct due to the limited length of available MD simulations.
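The conventional MSM construction that this abstract takes as its starting point (transitions counted between discrete states at a chosen lag time, then row-normalized) can be sketched as follows. This is a plain MSM estimate, not the qMSM or generalized master equation procedure of the paper; the helper names, the naive count symmetrization, and the two-state toy trajectory are assumptions for illustration.

```python
import numpy as np

def count_matrix(dtraj, n_states, lag):
    """Count transitions i -> j observed `lag` frames apart (sliding window)."""
    dtraj = np.asarray(dtraj)
    C = np.zeros((n_states, n_states))
    np.add.at(C, (dtraj[:-lag], dtraj[lag:]), 1.0)
    return C

def transition_matrix(C):
    """Row-normalize symmetrized counts (a simple way to impose detailed balance)."""
    C_sym = 0.5 * (C + C.T)
    return C_sym / C_sym.sum(axis=1, keepdims=True)

def implied_timescales(T, lag):
    """t_k = -lag / ln(lambda_k) for the non-stationary eigenvalues of T."""
    evals = np.sort(np.linalg.eigvals(T).real)[::-1][1:]
    evals = evals[evals > 0]            # the logarithm needs positive eigenvalues
    return -lag / np.log(evals)

# Toy discrete trajectory from a known two-state chain (assumed values).
rng = np.random.default_rng(1)
T_true = np.array([[0.95, 0.05],
                   [0.10, 0.90]])
n_frames = 50000
u = rng.random(n_frames)
dtraj = np.zeros(n_frames, dtype=int)
for t in range(1, n_frames):
    # Jump to (or stay in) state 1 with the probability given by the current row.
    dtraj[t] = 1 if u[t] < T_true[dtraj[t - 1], 1] else 0

T_est = transition_matrix(count_matrix(dtraj, n_states=2, lag=1))
timescales = implied_timescales(T_est, lag=1)
```

For this chain the exact second eigenvalue is 0.85, so the implied timescale should come out near -1/ln(0.85), about 6.2 frames. In practice one repeats the estimate over a range of lag times and looks for the plateau; the abstract's point is that reaching this plateau can demand a lag time that short trajectories cannot support.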
Affiliation(s)
- Siqin Cao
- Department of Chemistry, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
| | | | - Wei Wang
- Department of Chemistry, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
| | - Thomas E Markland
- Department of Chemistry, Stanford University, Stanford, California 94305, USA
| | - Xuhui Huang
- Department of Chemistry, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
|
23
|
Wang X, Unarta IC, Cheung PPH, Huang X. Elucidating molecular mechanisms of functional conformational changes of proteins via Markov state models. Curr Opin Struct Biol 2020; 67:69-77. [PMID: 33126140 DOI: 10.1016/j.sbi.2020.10.005] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 08/29/2020] [Revised: 09/28/2020] [Accepted: 10/07/2020] [Indexed: 01/01/2023]
Abstract
Functional conformational changes of proteins can facilitate numerous biological events in cells. The Markov state model (MSM) built from molecular dynamics simulations provides a powerful approach to study them. We here introduce a protocol that is tailor-made for constructing MSMs to study the functional conformational changes of proteins. In this protocol, one of the important steps is to select proper molecular features that can collectively describe the slowest timescales of conformational changes of interest. We recommend spectral oASIS, the modified version of oASIS, as a promising approach for automatic feature selection. Recently developed deep learning methods could also serve as efficient approaches for selecting features and finding collective variables. Using DNA repair enzymes and RNA polymerases as examples, we review recent applications of MSMs to elucidate molecular mechanisms of functional conformational changes. Finally, we discuss remaining challenges and future perspectives for constructing MSMs to study functional conformational changes of proteins.
Affiliation(s)
- Xiaowei Wang
- The Hong Kong University of Science and Technology-Shenzhen Research Institute, Hi-Tech Park, Nanshan, Shenzhen 518057, China; Department of Chemistry, Center of Systems Biology and Human Health, State Key Laboratory of Molecular Neuroscience, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
| | - Ilona Christy Unarta
- Bioengineering Graduate Program, The Hong Kong University of Science and Technology, Kowloon, Hong Kong Center for Neurodegenerative Diseases, Hong Kong
| | - Peter Pak-Hang Cheung
- The Hong Kong University of Science and Technology-Shenzhen Research Institute, Hi-Tech Park, Nanshan, Shenzhen 518057, China; Department of Chemistry, Center of Systems Biology and Human Health, State Key Laboratory of Molecular Neuroscience, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
| | - Xuhui Huang
- The Hong Kong University of Science and Technology-Shenzhen Research Institute, Hi-Tech Park, Nanshan, Shenzhen 518057, China; Department of Chemistry, Center of Systems Biology and Human Health, State Key Laboratory of Molecular Neuroscience, The Hong Kong University of Science and Technology, Kowloon, Hong Kong; Bioengineering Graduate Program, The Hong Kong University of Science and Technology, Kowloon, Hong Kong Center for Neurodegenerative Diseases, Hong Kong
|
24
|
König G, Glaser N, Schroeder B, Kubincová A, Hünenberger PH, Riniker S. An Alternative to Conventional λ-Intermediate States in Alchemical Free Energy Calculations: λ-Enveloping Distribution Sampling. J Chem Inf Model 2020; 60:5407-5423. [PMID: 32794763 DOI: 10.1021/acs.jcim.0c00520] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Indexed: 01/01/2023]
Abstract
Alchemical free energy calculations typically rely on intermediate states to bridge between the relevant phase spaces of the two end states. These intermediate states are usually created by mixing the energies or parameters of the end states according to a coupling parameter λ. The choice of the procedure has a strong impact on the efficiency of the calculation, as it affects both the encountered energy barriers and the phase space overlap between the states. The present work builds on the connection between the minimum variance pathway (MVP) and enveloping distribution sampling (EDS). It is shown that both methods can be regarded as special cases of a common scheme referred to as λ-EDS, which can also reproduce the behavior of conventional λ-intermediate states. A particularly attractive feature of λ-EDS is its ability to emulate the use of soft core potentials (SCP) while avoiding the associated computational overhead when applying efficient free energy estimators such as the multistate Bennett's acceptance ratio (MBAR). The method is illustrated for both relative and absolute free energy calculations considering five benchmark systems. The first two systems (charge inversion and cavity creation in a dipolar solvent) demonstrate the use of λ-EDS as an alternative coupling scheme in the context of thermodynamic integration (TI). The three other systems (change of bond length, change of dihedral angles, and cavity creation in water) investigate the efficiency and optimal choice of parameters in the context of free energy perturbation (FEP) and Bennett's acceptance ratio (BAR). It is shown that λ-EDS allows larger steps along the alchemical pathway than conventional intermediate states.
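The two coupling schemes contrasted in this abstract can be made concrete with a small numerical sketch: conventional linear mixing of the end-state energies by λ, and the standard two-state EDS reference energy built from a log-sum-exp envelope. The specific λ-EDS functional form of the paper is not reproduced here; the function names, the smoothness parameter `s`, the energy offset `offset_b`, and the harmonic toy potentials are assumptions for illustration.

```python
import numpy as np

def linear_mixing(u_a, u_b, lam):
    """Conventional lambda-intermediate state: U(lam) = (1 - lam) * U_A + lam * U_B."""
    return (1.0 - lam) * np.asarray(u_a) + lam * np.asarray(u_b)

def eds_reference(u_a, u_b, beta, s=1.0, offset_b=0.0):
    """Standard two-state EDS reference energy (not the lambda-EDS variant).

    U_R = -1/(beta*s) * ln[ exp(-beta*s*U_A) + exp(-beta*s*(U_B - offset_b)) ]
    """
    a = -beta * s * np.asarray(u_a)
    b = -beta * s * (np.asarray(u_b) - offset_b)
    # logaddexp evaluates ln(e^a + e^b) without overflow for large energies.
    return -np.logaddexp(a, b) / (beta * s)

# Toy end states (assumed): two displaced harmonic wells along one coordinate.
x = np.linspace(-3.0, 5.0, 161)
u_a = 0.5 * 10.0 * x**2
u_b = 0.5 * 10.0 * (x - 2.0)**2
beta = 1.0 / 2.494  # 1/(kT) in mol/kJ at roughly 300 K

u_half = linear_mixing(u_a, u_b, lam=0.5)
u_env = eds_reference(u_a, u_b, beta, s=0.3)
```

The envelope `u_env` never exceeds the lower of the two end-state energies, so a single simulation on it visits the important regions of both states, whereas `u_half` describes one intermediate state along a path of several λ values.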
Affiliation(s)
- Gerhard König
- Laboratory of Physical Chemistry, Department of Chemistry and Applied Biosciences, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
- Nina Glaser
- Laboratory of Physical Chemistry, Department of Chemistry and Applied Biosciences, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
- Benjamin Schroeder
- Laboratory of Physical Chemistry, Department of Chemistry and Applied Biosciences, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
- Alžbeta Kubincová
- Laboratory of Physical Chemistry, Department of Chemistry and Applied Biosciences, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
- Philippe H Hünenberger
- Laboratory of Physical Chemistry, Department of Chemistry and Applied Biosciences, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
- Sereina Riniker
- Laboratory of Physical Chemistry, Department of Chemistry and Applied Biosciences, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
|
25
|
Target search and recognition mechanisms of glycosylase AlkD revealed by scanning FRET-FCS and Markov state models. Proc Natl Acad Sci U S A 2020; 117:21889-21895. [PMID: 32820079 PMCID: PMC7486748 DOI: 10.1073/pnas.2002971117] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Indexed: 12/14/2022] Open
Abstract
DNA glycosylase repairs DNA damage to maintain the genome integrity, and thus it is essential for the survival of all organisms. However, it remains a long-standing puzzle how glycosylase diffuses along the genomic DNA to locate the sparse and aberrant lesion sites efficiently and accurately in the genome containing numerous base pairs. Previously, only the high-speed–low-accuracy search mode has been characterized experimentally, while the low-speed–high-accuracy mode is undetectable. Here, we observed the low-speed mode of glycosylase AlkD translocating, and further dissected its molecular mechanisms. To achieve this, we developed an integrated platform by combining scanning FRET-FCS with Markov state model. We expect that this platform can be widely applied to investigate other glycosylases and DNA-binding proteins. DNA glycosylase is responsible for repairing DNA damage to maintain the genome stability and integrity. However, how glycosylase can efficiently and accurately recognize DNA lesions across the enormous DNA genome remains elusive. It has been hypothesized that glycosylase translocates along the DNA by alternating between a fast but low-accuracy diffusion mode and a slow but high-accuracy mode when searching for DNA lesions. However, the slow mode has not been successfully characterized due to the limitation in the spatial and temporal resolutions of current experimental techniques. Using a newly developed scanning fluorescence resonance energy transfer (FRET)–fluorescence correlation spectroscopy (FCS) platform, we were able to observe both slow and fast modes of glycosylase AlkD translocating on double-stranded DNA (dsDNA), reaching the temporal resolution of microsecond and spatial resolution of subnanometer. The underlying molecular mechanism of the slow mode was further elucidated by Markov state model built from extensive all-atom molecular dynamics simulations. 
We found that in the slow mode, AlkD follows an asymmetric diffusion pathway, i.e., rotation followed by translation. Furthermore, the essential role of Y27 in AlkD diffusion dynamics was identified both experimentally and computationally. Our results provided mechanistic insights on how conformational dynamics of AlkD–dsDNA complex coordinate different diffusion modes to accomplish the search for DNA lesions with high efficiency and accuracy. We anticipate that the mechanism adopted by AlkD to search for DNA lesions could be a general one utilized by other glycosylases and DNA binding proteins.
|
26
|
George A, Purnaprajna M, Athri P. Laplacian score and genetic algorithm based automatic feature selection for Markov State Models in adaptive sampling based molecular dynamics. PeerJ Physical Chemistry 2020. [DOI: 10.7717/peerj-pchem.9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/18/2022] Open
Abstract
Adaptive sampling molecular dynamics based on Markov State Models (MSMs) uses short parallel MD simulations to accelerate simulations and has been shown to identify hidden conformers. The accuracy of an MSM's predictions depends on the features extracted from the simulated data used to construct it. Identifying the most important features in the trajectories of the simulated system therefore has a considerable effect on the results.
Methods
In this study, we use a combination of Laplacian scoring and genetic algorithms to obtain an optimized feature subset for the construction of the MSM. The approach is validated on simulations of three protein folding systems and two protein-ligand binding complexes.
Results
Our experiments show that this approach produces better results when the number of samples is significantly smaller than the number of features extracted. We also observe that this method mitigates the overfitting that occurs due to the high dimensionality of large biosystems with shorter simulation times.
Affiliation(s)
- Anu George
- Department of Computer Science & Engineering, Amrita School of Engineering, Bengaluru, Amrita Vishwa Vidyapeetham, India
- Madhura Purnaprajna
- Department of Computer Science & Engineering, Amrita School of Engineering, Bengaluru, Amrita Vishwa Vidyapeetham, India
- Prashanth Athri
- Department of Computer Science & Engineering, Amrita School of Engineering, Bengaluru, Amrita Vishwa Vidyapeetham, India
|
27
|
Hahn DF, Zarotiadis RA, Hünenberger PH. The Conveyor Belt Umbrella Sampling (CBUS) Scheme: Principle and Application to the Calculation of the Absolute Binding Free Energies of Alkali Cations to Crown Ethers. J Chem Theory Comput 2020; 16:2474-2493. [DOI: 10.1021/acs.jctc.9b00998] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/30/2022]
Affiliation(s)
- David F. Hahn
- Laboratory of Physical Chemistry, Department of Chemistry and Applied Biosciences, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
- Rhiannon A. Zarotiadis
- Laboratory of Physical Chemistry, Department of Chemistry and Applied Biosciences, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
- Philippe H. Hünenberger
- Laboratory of Physical Chemistry, Department of Chemistry and Applied Biosciences, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
|
28
|
Hahn DF, König G, Hünenberger PH. Overcoming Orthogonal Barriers in Alchemical Free Energy Calculations: On the Relative Merits of λ-Variations, λ-Extrapolations, and Biasing. J Chem Theory Comput 2020; 16:1630-1645. [DOI: 10.1021/acs.jctc.9b00853] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 01/05/2023]
Affiliation(s)
- David F. Hahn
- Laboratory of Physical Chemistry, Department of Chemistry and Applied Biosciences, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
- Gerhard König
- Laboratory of Physical Chemistry, Department of Chemistry and Applied Biosciences, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
- Philippe H. Hünenberger
- Laboratory of Physical Chemistry, Department of Chemistry and Applied Biosciences, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
|
29
|
Cui D, Zhang BW, Tan Z, Levy RM. Ligand Binding Thermodynamic Cycles: Hysteresis, the Locally Weighted Histogram Analysis Method, and the Overlapping States Matrix. J Chem Theory Comput 2019; 16:67-79. [PMID: 31743019 DOI: 10.1021/acs.jctc.9b00740] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 11/28/2022]
Abstract
Free energy perturbation (FEP) simulations have been widely applied to obtain predictions of the relative binding free energy for a series of congeneric ligands binding to the same receptor, which is an essential component for the lead optimization process in computer-aided drug discovery. In the case of several congeneric ligands forming a perturbation map involving a closed thermodynamic cycle, the summation of the estimated free energy change along each edge in the cycle using Bennett acceptance ratio (BAR) usually will deviate from zero due to systematic and random errors, which is the hysteresis of cycle closure. In this work, the advanced reweighting techniques binless weighted histogram analysis method (UWHAM) and locally weighted histogram analysis method (LWHAM) are applied to provide statistical estimators of the free energy change along each edge in order to eliminate the hysteresis effect. As an example, we analyze a closed thermodynamic cycle involving four congeneric ligands which bind to HIV-1 integrase, a promising target which has emerged for antiviral therapy. We demonstrate that, compared with FEP and BAR, more accurate and hysteresis-free estimates of free energy differences can be achieved by using UWHAM to find a single estimate of the density of states based on all of the data in the cycle. Furthermore, by comparison of LWHAM results obtained from the inclusion of different numbers of neighboring states with UWHAM estimation involving all the states, we show how to determine the optimal neighborhood size in the LWHAM analysis to balance the trade-offs between computational cost and accuracy of the free energy prediction. Even with the smallest neighborhood, LWHAM can improve the BAR free energy estimates using the same input data as BAR. We introduce an overlapping states matrix that is constructed by using the global jump formula of LWHAM and plot its heat map. 
The heat map provides a quantitative measure of the overlap between pairs of alchemical/thermodynamic states. We explain how to identify and improve the FEP calculations along the edges that most likely cause large systematic errors by using the heat map of the overlapping states matrix and by comparing the BAR and UWHAM estimates of the free energy change.
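The cycle-closure hysteresis described in this abstract can be illustrated with a few lines of arithmetic. The sketch below is not UWHAM or LWHAM: it only computes the residual of a closed cycle from per-edge estimates and applies a naive correction that spreads the residual equally over the edges. The edge values and function names are hypothetical and chosen purely for illustration.

```python
def cycle_closure(edge_dg):
    """Sum of free energy changes around a closed cycle (zero for exact estimates)."""
    return sum(edge_dg)


def spread_closure_error(edge_dg):
    """Naive correction (an assumption, not UWHAM): subtract an equal share
    of the hysteresis from every edge so that the cycle closes exactly."""
    share = cycle_closure(edge_dg) / len(edge_dg)
    return [dg - share for dg in edge_dg]


# Hypothetical per-edge estimates (kcal/mol) for a cycle A -> B -> C -> D -> A.
edges = [1.2, -0.7, 0.9, -1.0]
hysteresis = cycle_closure(edges)        # non-zero: the cycle does not close
corrected = spread_closure_error(edges)  # closes by construction
```

Unlike this equal-share correction, a reweighting estimator that uses all the data in the cycle at once, as the abstract describes for UWHAM, yields edge estimates that are consistent by construction.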
Affiliation(s)
- Di Cui
- Center for Biophysics and Computational Biology, Department of Chemistry, and Institute for Computational Molecular Science, Temple University, Philadelphia, Pennsylvania 19122, United States
- Bin W Zhang
- Center for Biophysics and Computational Biology, Department of Chemistry, and Institute for Computational Molecular Science, Temple University, Philadelphia, Pennsylvania 19122, United States
- Zhiqiang Tan
- Department of Statistics, Rutgers, The State University of New Jersey, Piscataway, New Jersey 08854, United States
- Ronald M Levy
- Center for Biophysics and Computational Biology, Department of Chemistry, and Institute for Computational Molecular Science, Temple University, Philadelphia, Pennsylvania 19122, United States
|
30
|
Xia J, Flynn W, Gallicchio E, Uplinger K, Armstrong JD, Forli S, Olson AJ, Levy RM. Massive-Scale Binding Free Energy Simulations of HIV Integrase Complexes Using Asynchronous Replica Exchange Framework Implemented on the IBM WCG Distributed Network. J Chem Inf Model 2019; 59:1382-1397. [PMID: 30758197 PMCID: PMC6496938 DOI: 10.1021/acs.jcim.8b00817] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 11/29/2022]
Abstract
To perform massive-scale replica exchange molecular dynamics (REMD) simulations for calculating binding free energies of protein-ligand complexes, we implemented the asynchronous replica exchange (AsyncRE) framework of the binding energy distribution analysis method (BEDAM) in implicit solvent on the IBM World Community Grid (WCG) and optimized the simulation parameters to reduce the overhead and improve the predictive power of the WCG AsyncRE simulations. We also performed the first massive-scale binding free energy calculations using the WCG distributed computing grid and 301 ligands from the SAMPL4 challenge for large-scale binding free energy predictions of HIV-1 integrase complexes. In total there are ∼10000 simulated complexes, ∼1 million replicas, and ∼2000 μs of aggregated MD simulations. Running AsyncRE MD simulations on the WCG requires accepting a trade-off between the number of replicas that can be run (breadth) and the number of full RE cycles that can be completed per replica (depth). As compared with synchronous Replica Exchange (SyncRE) running on tightly coupled clusters like XSEDE, on the WCG many more replicas can be launched simultaneously on heterogeneous distributed hardware, but each full RE cycle requires more overhead. We compared the WCG results with those from AutoDock and from more advanced RE simulations, including the use of flattening potentials to accelerate sampling of selected degrees of freedom of ligands and/or receptors related to slow dynamics due to high energy barriers. We propose a suitable strategy of RE simulations to refine high throughput docking results which can be matched to corresponding computing resources: from HPC clusters, to small or medium-size distributed campus grids, and finally to massive-scale computing networks including millions of CPUs like the resources available on the WCG.
Affiliation(s)
- Junchao Xia
- Center for Biophysics and Computational Biology and Department of Physics, Temple University, Philadelphia, Pennsylvania 19122, United States
- William Flynn
- Center for Biophysics and Computational Biology and Department of Chemistry, Temple University, Philadelphia, Pennsylvania 19122, United States
- Emilio Gallicchio
- Department of Chemistry, CUNY Brooklyn College, Brooklyn, New York 11210, United States
- Keith Uplinger
- IBM WCG Team, 1177 South Belt Line Road, Coppell, Texas 75019, United States
- Jonathan D Armstrong
- IBM WCG Team, 11400 Burnet Road, 0453B129, Austin, Texas 78758, United States
- Stefano Forli
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, California 92037-1000, United States
- Arthur J Olson
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, California 92037-1000, United States
- Ronald M Levy
- Center for Biophysics and Computational Biology and Department of Chemistry, Temple University, Philadelphia, Pennsylvania 19122, United States
|
31
|
Zhu L, Sheong FK, Cao S, Liu S, Unarta IC, Huang X. TAPS: A traveling-salesman based automated path searching method for functional conformational changes of biological macromolecules. J Chem Phys 2019; 150:124105. [DOI: 10.1063/1.5082633] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Indexed: 01/18/2023] Open
Affiliation(s)
- Lizhe Zhu
- Department of Chemistry, State Key Laboratory of Molecular Neuroscience, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
- Warshel Institute for Computational Biology, School of Life and Health Sciences, The Chinese University of Hong Kong (Shenzhen), Shenzhen, Guangdong 518172, China
- Fu Kit Sheong
- Department of Chemistry, State Key Laboratory of Molecular Neuroscience, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
- Siqin Cao
- Department of Chemistry, State Key Laboratory of Molecular Neuroscience, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
- Song Liu
- Department of Chemistry, State Key Laboratory of Molecular Neuroscience, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
- Ilona C. Unarta
- Department of Chemistry, State Key Laboratory of Molecular Neuroscience, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
- Xuhui Huang
- Department of Chemistry, State Key Laboratory of Molecular Neuroscience, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
- Bioengineering Program, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
- HKUST-Shenzhen Research Institute, Hi-Tech Park, Nanshan, Shenzhen 518057, China
|
32
|
Hahn DF, Hünenberger PH. Alchemical Free-Energy Calculations by Multiple-Replica λ-Dynamics: The Conveyor Belt Thermodynamic Integration Scheme. J Chem Theory Comput 2019; 15:2392-2419. [PMID: 30821973 DOI: 10.1021/acs.jctc.8b00782] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Indexed: 02/06/2023]
Abstract
A new method is proposed to calculate alchemical free-energy differences based on molecular dynamics (MD) simulations, called the conveyor belt thermodynamic integration (CBTI) scheme. As in thermodynamic integration (TI), K replicas of the system are simulated at different values of the alchemical coupling parameter λ. The number K is taken to be even, and the replicas are equally spaced on a forward-turn-backward-turn path, akin to a conveyor belt (CB) between the two physical end-states; and as in λ-dynamics (λD), the λ-values associated with the individual systems evolve in time along the simulation. However, they do so in a concerted fashion, determined by the evolution of a single dynamical variable Λ of period 2π controlling the advance of the entire CB. Thus, a change of Λ is always associated with K/2 equispaced replicas moving forward and K/2 equispaced replicas moving backward along λ. As a result, the effective free-energy profile of the replica system along Λ is periodic of period 2πK^(-1), and the magnitude of its variations decreases rapidly upon increasing K, at least as K^(-1) in the limit of large K. When a sufficient number of replicas is used, these variations become small, which enables a complete and quasi-homogeneous coverage of the λ-range by the replica system, without application of any biasing potential. If desired, a memory-based biasing potential can still be added to further homogenize the sampling, the preoptimization of which is computationally inexpensive. The final free-energy profile along λ is calculated similarly to TI, by binning of the Hamiltonian λ-derivative as a function of λ considering all replicas simultaneously, followed by quadrature integration. The associated quadrature error can be kept very low owing to the continuous and quasi-homogeneous λ-sampling. 
The CBTI scheme can be viewed as a continuous/deterministic/dynamical analog of the Hamiltonian replica-exchange/permutation (HRE/HRP) schemes or as a correlated multiple-replica analog of the λD or λ-local elevation umbrella sampling (λ-LEUS) schemes. Compared to TI, it shares the advantage of the latter schemes in terms of enhanced orthogonal sampling, i.e. the availability of variable-λ paths to circumvent conformational barriers present at specific λ-values. Compared to HRE/HRP, it permits a deterministic and continuous sampling of the λ-range, is expected to be less sensitive to possible artifacts of the thermo- and barostating schemes, and bypasses the need to carefully preselect a λ-ladder and a swapping-attempt frequency. Compared to λ-LEUS, it eliminates (or drastically reduces) the dead time associated with the preoptimization of a biasing potential. The goal of this article is to provide the mathematical/physical formulation of the proposed CBTI scheme, along with an initial application of the method to the calculation of the hydration free energy of methanol.
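The conveyor-belt geometry described above can be sketched as a mapping from the single variable Λ to the K replica λ-values. The code assumes a simple triangle-wave (zigzag) mapping with λ rising from 0 to 1 over the first half period and falling back over the second half; the exact functional form used by the authors is not given in the abstract, and the names `replica_lambdas` and `big_lambda` are hypothetical.

```python
import math


def replica_lambdas(big_lambda, n_replicas):
    """Map the conveyor-belt variable (period 2*pi) to per-replica lambda values.

    Assumed zigzag mapping: replica k has phase big_lambda + 2*pi*k/K; phases in
    [0, pi) move forward along lambda, phases in [pi, 2*pi) move backward.
    """
    lambdas, directions = [], []
    for k in range(n_replicas):
        phase = (big_lambda + 2.0 * math.pi * k / n_replicas) % (2.0 * math.pi)
        if phase < math.pi:
            lambdas.append(phase / math.pi)        # forward branch, 0 -> 1
            directions.append(+1)
        else:
            lambdas.append(2.0 - phase / math.pi)  # backward branch, 1 -> 0
            directions.append(-1)
    return lambdas, directions


K = 8
lams, dirs = replica_lambdas(0.3, K)
# Advancing the belt by 2*pi/K moves every replica onto its neighbour's position,
# so the set of lambda values is unchanged.
shifted, _ = replica_lambdas(0.3 + 2.0 * math.pi / K, K)
```

With K even, half of the replicas sit on the forward branch and half on the backward branch, and the set of λ-values repeats after a shift of 2π/K, which is the origin of the 2πK^(-1) periodicity stated in the abstract.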
Affiliation(s)
- David F Hahn
- Laboratory of Physical Chemistry, Department of Chemistry and Applied Biosciences, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
- Philippe H Hünenberger
- Laboratory of Physical Chemistry, Department of Chemistry and Applied Biosciences, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
|
33
|
Zhang BW, Arasteh S, Levy RM. The UWHAM and SWHAM Software Package. Sci Rep 2019; 9:2803. [PMID: 30808938 PMCID: PMC6391495 DOI: 10.1038/s41598-019-39420-x] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 10/25/2018] [Accepted: 01/21/2019] [Indexed: 11/09/2022] Open
Abstract
We introduce the UWHAM (binless weighted histogram analysis method) and SWHAM (stochastic UWHAM) software package that can be used to estimate the density of states and free energy differences based on the data generated by multi-state simulations. The programs used to solve the UWHAM equations are written in the C++ language and operated via the command line interface. In this paper, we first review the theoretical bases of UWHAM and its stochastic solvers RE-SWHAM (replica exchange-like SWHAM) and ST-SWHAM (serial tempering-like SWHAM). Then we provide a tutorial with examples that explains how to apply the UWHAM program package to analyze the data generated by different types of multi-state simulations: umbrella sampling, replica exchange, free energy perturbation simulations, etc. The tutorial examples also show that the UWHAM equations can be solved stochastically by applying the RE-SWHAM and ST-SWHAM programs when the data ensemble is large. If the simulations at some states are far from equilibrium, the Stratified RE-SWHAM program can be applied to obtain the equilibrium distribution of the state of interest. All the source codes and the tutorial examples are available from our group's web page: https://ronlevygroup.cst.temple.edu/software/UWHAM_and_SWHAM_webpage/index.html.
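As a rough illustration of what solving binless WHAM equations involves, the sketch below iterates the standard self-consistent multistate reweighting equations on a toy problem. It is not the authors' C++ package and does not use its interface: the function name, the plain fixed-point iteration, and the two harmonic test states are assumptions made for this example.

```python
import math
import random


def uwham_free_energies(u, n_samples, n_iter=50):
    """Self-consistent binless WHAM sketch.

    u[k][n]   : reduced energy of pooled sample n evaluated in state k
    n_samples : number of samples drawn from each state
    Returns reduced free energies with f[0] fixed to zero.
    """
    n_states, n_total = len(u), len(u[0])
    f = [0.0] * n_states
    for _ in range(n_iter):
        # Mixture weight of every pooled sample under the current estimates.
        denom = [sum(n_samples[j] * math.exp(f[j] - u[j][n]) for j in range(n_states))
                 for n in range(n_total)]
        new_f = [-math.log(sum(math.exp(-u[k][n]) / denom[n] for n in range(n_total)))
                 for k in range(n_states)]
        f = [fk - new_f[0] for fk in new_f]  # remove the arbitrary additive constant
    return f


# Toy data: two harmonic states, u0 = x^2/2 and u1 = 4*x^2/2, sampled exactly.
rng = random.Random(1)
n0 = n1 = 1000
xs = [rng.gauss(0.0, 1.0) for _ in range(n0)] + [rng.gauss(0.0, 0.5) for _ in range(n1)]
u = [[0.5 * x * x for x in xs], [2.0 * x * x for x in xs]]
f = uwham_free_energies(u, [n0, n1])
```

For these two states the exact reduced free energy difference is ln 2, so `f[1]` should land close to 0.69 up to statistical noise. For large data ensembles, the abstract points to the stochastic solvers RE-SWHAM and ST-SWHAM instead of a direct iteration like this one.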
Affiliation(s)
- Bin W Zhang
- Center for Biophysics and Computational Biology, Department of Chemistry and Institute for Computational Molecular Science, Temple University, Philadelphia, Pennsylvania, 19122, United States
- Shima Arasteh
- Center for Biophysics and Computational Biology, Department of Chemistry and Institute for Computational Molecular Science, Temple University, Philadelphia, Pennsylvania, 19122, United States
- Ronald M Levy
- Center for Biophysics and Computational Biology, Department of Chemistry and Institute for Computational Molecular Science, Temple University, Philadelphia, Pennsylvania, 19122, United States
|
34
|
Demuynck R, Wieme J, Rogge SMJ, Dedecker KD, Vanduyfhuys L, Waroquier M, Van Speybroeck V. Protocol for Identifying Accurate Collective Variables in Enhanced Molecular Dynamics Simulations for the Description of Structural Transformations in Flexible Metal-Organic Frameworks. J Chem Theory Comput 2018; 14:5511-5526. [PMID: 30336016 PMCID: PMC6236469 DOI: 10.1021/acs.jctc.8b00725] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 07/15/2018] [Indexed: 01/05/2023]
Abstract
Various kinds of flexibility have been observed in metal-organic frameworks, which may originate from the topology of the material or the presence of flexible ligands. The construction of free energy profiles describing the full dynamical behavior along the phase transition path is challenging since it is not trivial to identify collective variables able to identify all metastable states along the reaction path. In this work, a systematic three-step protocol to uniquely identify the dominant order parameters for structural transformations in flexible metal-organic frameworks and subsequently construct accurate free energy profiles is presented. Methodologically, this protocol is rooted in the time-structure based independent component analysis (tICA), a well-established statistical modeling technique embedded in the Markov state model methodology and often employed to study protein folding, that allows for the identification of the slowest order parameters characterizing the structural transformation. To ensure an unbiased and systematic identification of these order parameters, the tICA decomposition is performed based on information from a prior replica exchange (RE) simulation, as this technique enhances the sampling along all degrees of freedom of the system simultaneously. From this simulation, the tICA procedure extracts the order parameters (often structural parameters) that characterize the slowest transformations in the material. Subsequently, these order parameters are adopted in traditional enhanced sampling methods such as umbrella sampling, thermodynamic integration, and variationally enhanced sampling to construct accurate free energy profiles capturing the flexibility in these nanoporous materials. In this work, the applicability of this tICA-RE protocol is demonstrated by determining the slowest order parameters in both MIL-53(Al) and CAU-13, which exhibit a strongly different type of flexibility. 
The obtained free energy profiles as a function of this extracted order parameter are furthermore compared to the profiles obtained when adopting less-suited collective variables, indicating the importance of systematically selecting the relevant order parameters to construct accurate free energy profiles for flexible metal-organic frameworks, which is in correspondence with experimental findings. The method succeeds in mapping the full free energy surface in terms of appropriate collective variables for MOFs exhibiting linker flexibility. For CAU-13, we show the decreased stability of the closed pore phase by systematically adding adsorbed xylene molecules in the framework.
Affiliation(s)
- Ruben Demuynck
- Center for Molecular Modeling, Ghent University, Technologiepark 903, B-9052 Zwijnaarde, Belgium
- Jelle Wieme
- Center for Molecular Modeling, Ghent University, Technologiepark 903, B-9052 Zwijnaarde, Belgium
- Sven M. J. Rogge
- Center for Molecular Modeling, Ghent University, Technologiepark 903, B-9052 Zwijnaarde, Belgium
- Karen D. Dedecker
- Center for Molecular Modeling, Ghent University, Technologiepark 903, B-9052 Zwijnaarde, Belgium
- Louis Vanduyfhuys
- Center for Molecular Modeling, Ghent University, Technologiepark 903, B-9052 Zwijnaarde, Belgium
- Michel Waroquier
- Center for Molecular Modeling, Ghent University, Technologiepark 903, B-9052 Zwijnaarde, Belgium
- Veronique Van Speybroeck
- Center for Molecular Modeling, Ghent University, Technologiepark 903, B-9052 Zwijnaarde, Belgium
|
35
|
Wang W, Liang T, Sheong FK, Fan X, Huang X. An efficient Bayesian kinetic lumping algorithm to identify metastable conformational states via Gibbs sampling. J Chem Phys 2018; 149:072337. [PMID: 30134698 DOI: 10.1063/1.5027001] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Indexed: 01/29/2023] Open
Abstract
Markov State Model (MSM) has become a popular approach to study the conformational dynamics of complex biological systems in recent years. Built upon a large number of short molecular dynamics simulation trajectories, MSM is able to predict the long time scale dynamics of complex systems. However, to achieve Markovianity, an MSM often contains hundreds or thousands of states (microstates), hindering human interpretation of the underlying system mechanism. One way to reduce the number of states is to lump kinetically similar states together and thus coarse-grain the microstates into macrostates. In this work, we introduce a probabilistic lumping algorithm, the Gibbs lumping algorithm, to assign a probability to any given kinetic lumping using the Bayesian inference. In our algorithm, the transitions among kinetically distinct macrostates are modeled by Poisson processes, which will well reflect the separation of time scales in the underlying free energy landscape of biomolecules. Furthermore, to facilitate the search for the optimal kinetic lumping (i.e., the lumped model with the highest probability), a Gibbs sampling algorithm is introduced. To demonstrate the power of our new method, we apply it to three systems: a 2D potential, alanine dipeptide, and a WW protein domain. In comparison with six other popular lumping algorithms, we show that our method can persistently produce the lumped macrostate model with the highest probability as well as the largest metastability. We anticipate that our Gibbs lumping algorithm holds great promise to be widely applied to investigate conformational changes in biological macromolecules.
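The idea of coarse-graining microstates into macrostates and comparing lumpings by metastability can be made concrete with a small example. This is not the Gibbs lumping algorithm from the abstract: it only scores two given lumpings. Metastability is taken here to be the sum of macrostate self-transition probabilities, a common definition that the abstract does not spell out, and the count matrix and function names are hypothetical.

```python
def lump_counts(counts, assignment, n_macro):
    """Sum microstate transition counts into macrostate transition counts."""
    lumped = [[0.0] * n_macro for _ in range(n_macro)]
    for i, row in enumerate(counts):
        for j, c in enumerate(row):
            lumped[assignment[i]][assignment[j]] += c
    return lumped


def metastability(counts, assignment, n_macro):
    """Trace of the row-normalised lumped transition matrix (assumed definition)."""
    lumped = lump_counts(counts, assignment, n_macro)
    return sum(lumped[a][a] / sum(lumped[a]) for a in range(n_macro))


# Hypothetical 4-microstate count matrix: states 0,1 and 2,3 interconvert quickly,
# while transitions between the two pairs are rare.
counts = [
    [90, 10, 0, 0],
    [10, 88, 2, 0],
    [0, 2, 88, 10],
    [0, 0, 10, 90],
]
q_kinetic = metastability(counts, [0, 0, 1, 1], 2)  # groups kinetically similar states
q_mixed = metastability(counts, [0, 1, 1, 0], 2)    # groups kinetically distant states
```

The kinetically sensible lumping scores 1.98 against 1.80 for the mixed one, which is the kind of separation a lumping algorithm searches for across a much larger space of assignments.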
Affiliation(s)
- Wei Wang
- HKUST-Shenzhen Research Institute, Hi-Tech Park, Nanshan, Shenzhen 518057, China
- Tong Liang
- Department of Statistics, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
- Fu Kit Sheong
- Department of Chemistry, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
- Xiaodan Fan
- Department of Statistics, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
- Xuhui Huang
- HKUST-Shenzhen Research Institute, Hi-Tech Park, Nanshan, Shenzhen 518057, China
|
36
|
Tran DP, Takemura K, Kuwata K, Kitao A. Protein-Ligand Dissociation Simulated by Parallel Cascade Selection Molecular Dynamics. J Chem Theory Comput 2017; 14:404-417. [PMID: 29182324 DOI: 10.1021/acs.jctc.7b00504] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Indexed: 11/30/2022]
Abstract
We investigated the dissociation process of tri-N-acetyl-d-glucosamine from hen egg white lysozyme using parallel cascade selection molecular dynamics (PaCS-MD), which comprises cycles of multiple unbiased MD simulations using a selection of MD snapshots as the initial structures for the next cycle. Dissociation was significantly accelerated by PaCS-MD, in which the probability of rare event occurrence toward dissociation was enhanced by the selection and rerandomization of the initial velocities. Although this complex was stable during 1 μs of conventional MD, PaCS-MD easily induced dissociation within 10^0-10^1 ns. We found that velocity rerandomization enhances the dissociation of triNAG from the bound state, whereas diffusion plays a more important role in the unbound state. We calculated the dissociation free energy by analyzing all PaCS-MD trajectories using the Markov state model (MSM), compared the results to those obtained by combinations of PaCS-MD and umbrella sampling (US), steered MD (SMD) and US, and SMD and the Jarzynski equality, and experimentally determined binding free energy. PaCS-MD/MSM yielded results most comparable to the experimentally determined binding free energy, independent of simulation parameter variations, and also gave the lowest standard errors.
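The cycle structure of PaCS-MD (run several short unbiased simulations, rank the snapshots, restart from the top-ranked ones) can be sketched on a toy model. The example below replaces MD with a 1D Gaussian random walk whose coordinate stands in for a ligand-protein distance; the step size, replica count, ranking criterion, and function names are all assumptions for illustration and not the protocol used in the paper.

```python
import random


def short_walk(start, n_steps, rng):
    """Unbiased 1D random walk; frame 0 is the starting structure itself."""
    frames = [start]
    for _ in range(n_steps):
        frames.append(frames[-1] + rng.gauss(0.0, 0.1))
    return frames


def pacs_cycles(n_cycles=20, n_replicas=10, n_steps=50, seed=0):
    """Toy cascade selection: each cycle restarts from the most 'dissociated' snapshots."""
    rng = random.Random(seed)
    starts = [0.0] * n_replicas  # all replicas begin in the bound state (x = 0)
    best_per_cycle = []
    for _ in range(n_cycles):
        snapshots = []
        for s in starts:
            snapshots.extend(short_walk(s, n_steps, rng))
        snapshots.sort(reverse=True)     # rank by distance, largest first
        starts = snapshots[:n_replicas]  # selected initial structures for the next cycle
        best_per_cycle.append(starts[0])
    return best_per_cycle


best = pacs_cycles()
```

Because each trajectory keeps its starting frame, the best distance can never decrease from one cycle to the next, which is how selection biases the ensemble toward the rare event without adding any force. Fresh random steps at every restart play the role of velocity rerandomization in this toy setting.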
Collapse
Affiliation(s)
- Duy Phuoc Tran
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa-shi, Chiba 277-8562, Japan
- Kazuhiro Takemura
- Institute of Molecular and Cellular Biosciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo-ku, Tokyo 113-0032, Japan
- Kazuo Kuwata
- Center for Emerging Infectious Diseases, Gifu University, 1-1 Yanagido, Gifu-shi, Gifu 501-1194, Japan
- Akio Kitao
- School of Life Science and Technology, Tokyo Institute of Technology, 2-12-1 Ookayama, Meguro-ku, Tokyo 152-8550, Japan
|
37
|
Lee KH, Chen J. Efficacy of independence sampling in replica exchange simulations of ordered and disordered proteins. J Comput Chem 2017; 38:2632-2640. [PMID: 28841239 PMCID: PMC5752115 DOI: 10.1002/jcc.24923] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 03/01/2017] [Revised: 05/24/2017] [Accepted: 08/03/2017] [Indexed: 01/23/2023]
Abstract
Recasting temperature replica exchange (T-RE) as a special case of Gibbs sampling has led to a simple and efficient scheme for enhanced mixing (Chodera and Shirts, J. Chem. Phys., 2011, 135, 194110). To critically examine whether T-RE with independence sampling (T-REis) improves conformational sampling, we performed T-RE and T-REis simulations of ordered and disordered proteins using coarse-grained and atomistic models. The results demonstrate that T-REis effectively increases replica mobility in temperature space with minimal computational overhead, especially for folded proteins. However, enhanced mixing does not translate well into improved conformational sampling. The convergence of the thermodynamic properties of interest is similar, with slight improvements for T-REis of ordered systems. The study re-affirms that the efficiency of T-RE does not appear to be limited by temperature diffusion, but by the inherent rates of spontaneous large-scale conformational re-arrangements. Owing to its simplicity and efficacy in enhancing mixing, T-REis is expected to be more effective when incorporated with various Hamiltonian-RE protocols. © 2017 Wiley Periodicals, Inc.
Affiliation(s)
- Kuo Hao Lee
- Department of Chemistry, University of Massachusetts, Amherst, MA 01003, USA
- Jianhan Chen
- Department of Chemistry, University of Massachusetts, Amherst, MA 01003, USA
- Department of Biochemistry and Molecular Biology, University of Massachusetts, Amherst, MA 01003, USA
|
38
|
Zhang BW, Deng N, Tan Z, Levy RM. Stratified UWHAM and Its Stochastic Approximation for Multicanonical Simulations Which Are Far from Equilibrium. J Chem Theory Comput 2017; 13:4660-4674. [PMID: 28902500 PMCID: PMC5897113 DOI: 10.1021/acs.jctc.7b00651] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Indexed: 11/30/2022]
Abstract
We describe a new analysis tool called Stratified unbinned Weighted Histogram Analysis Method (Stratified-UWHAM), which can be used to compute free energies and expectations from a multicanonical ensemble when a subset of the parallel simulations is far from being equilibrated because of barriers between free energy basins which are only rarely (or never) crossed at some states. The Stratified-UWHAM equations can be obtained in the form of UWHAM equations but with an expanded set of states. We also provide a stochastic solver, Stratified RE-SWHAM, for Stratified-UWHAM to remove its computational bottleneck. Stratified-UWHAM and Stratified RE-SWHAM are applied to three test cases: the free energy landscape of alanine dipeptide, the binding affinity of a host-guest complex, and path sampling for a two-dimensional double well potential. The examples show that when some of the parallel simulations are only locally equilibrated, the estimates of free energies and equilibrium distributions provided by the conventional UWHAM (or MBAR) solutions exhibit considerable biases, but the estimates provided by Stratified-UWHAM and Stratified RE-SWHAM agree with the benchmark very well. Lastly, we discuss features of the Stratified-UWHAM approach, which is based on coarse-graining, in relation to two other recently proposed maximum likelihood-based methods that also coarse-grain the multicanonical data.
|
39
|
Wang W, Cao S, Zhu L, Huang X. Constructing Markov State Models to elucidate the functional conformational changes of complex biomolecules. Wiley Interdisciplinary Reviews: Computational Molecular Science 2017. [DOI: 10.1002/wcms.1343] [Citation(s) in RCA: 52] [Impact Index Per Article: 6.5] [Indexed: 01/08/2023]
Affiliation(s)
- Wei Wang
- Department of Chemistry, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
- Center of Systems Biology and Human Health, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
- Siqin Cao
- Department of Chemistry, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
- Lizhe Zhu
- Department of Chemistry, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
- Center of Systems Biology and Human Health, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
- Xuhui Huang
- Department of Chemistry, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
- Center of Systems Biology and Human Health, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
- Hong Kong Branch of Chinese National Engineering Research Center for Tissue Restoration & Reconstruction, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
- HKUST-Shenzhen Research Institute, Shenzhen, China
|
40
|
Ding X, Vilseck JZ, Hayes RL, Brooks CL. Gibbs Sampler-Based λ-Dynamics and Rao-Blackwell Estimator for Alchemical Free Energy Calculation. J Chem Theory Comput 2017; 13:2501-2510. [PMID: 28510433 DOI: 10.1021/acs.jctc.7b00204] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Indexed: 11/30/2022]
Abstract
λ-dynamics is a generalized ensemble method for alchemical free energy calculations. In traditional λ-dynamics, the alchemical switch variable λ is treated as a continuous variable ranging from 0 to 1 and an empirical estimator is utilized to approximate the free energy. In the present article, we describe an alternative formulation of λ-dynamics that utilizes the Gibbs sampler framework, which we call Gibbs sampler-based λ-dynamics (GSLD). GSLD, like traditional λ-dynamics, can be readily extended to calculate free energy differences between multiple ligands in one simulation. We also introduce a new free energy estimator, the Rao-Blackwell estimator (RBE), for use in conjunction with GSLD. Compared with the current empirical estimator, the advantage of RBE is that it is an unbiased estimator and its variance is usually smaller than that of the current empirical estimator. We also show that the multistate Bennett acceptance ratio equation or the unbinned weighted histogram analysis method equation can be derived using the RBE. We illustrate the use and performance of this new free energy computational framework by application to a simple harmonic system as well as relevant calculations of small molecule relative free energies of solvation and binding to a protein receptor. Our findings demonstrate consistent and improved performance compared with conventional alchemical free energy methods.
Affiliation(s)
- Xinqiang Ding
- Department of Computational Medicine & Bioinformatics, Department of Chemistry, and Biophysics Program, University of Michigan, Ann Arbor, Michigan 48109, United States
- Jonah Z Vilseck
- Department of Computational Medicine & Bioinformatics, Department of Chemistry, and Biophysics Program, University of Michigan, Ann Arbor, Michigan 48109, United States
- Ryan L Hayes
- Department of Computational Medicine & Bioinformatics, Department of Chemistry, and Biophysics Program, University of Michigan, Ann Arbor, Michigan 48109, United States
- Charles L Brooks
- Department of Computational Medicine & Bioinformatics, Department of Chemistry, and Biophysics Program, University of Michigan, Ann Arbor, Michigan 48109, United States
|
41
|
Yu TQ, Lu J, Abrams CF, Vanden-Eijnden E. Multiscale implementation of infinite-swap replica exchange molecular dynamics. Proc Natl Acad Sci U S A 2016; 113:11744-11749. [PMID: 27698148 PMCID: PMC5081654 DOI: 10.1073/pnas.1605089113] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Indexed: 11/18/2022]
Abstract
Replica exchange molecular dynamics (REMD) is a popular method to accelerate conformational sampling of complex molecular systems. The idea is to run several replicas of the system in parallel at different temperatures that are swapped periodically. These swaps are typically attempted every few MD steps and accepted or rejected according to a Metropolis-Hastings criterion. This guarantees that the joint distribution of the composite system of replicas is the normalized sum of the symmetrized product of the canonical distributions of these replicas at the different temperatures. Here we propose a different implementation of REMD in which (i) the swaps obey a continuous-time Markov jump process implemented via Gillespie's stochastic simulation algorithm (SSA), which also samples exactly the aforementioned joint distribution and has the advantage of being rejection free, and (ii) this REMD-SSA is combined with the heterogeneous multiscale method to accelerate the rate of the swaps and reach the so-called infinite-swap limit that is known to optimize sampling efficiency. The method is easy to implement and can be trivially parallelized. Here we illustrate its accuracy and efficiency on the examples of alanine dipeptide in vacuum and C-terminal β-hairpin of protein G in explicit solvent. In this latter example, our results indicate that the landscape of the protein is a triple funnel with two folded structures and one misfolded structure that are stabilized by H-bonds.
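For orientation, the conventional swap step that this abstract takes as its starting point can be sketched as a Metropolis test on the energies and inverse temperatures of two replicas. This illustrates only the standard accept/reject criterion, not the rejection-free REMD-SSA scheme the authors propose; the function names, the kcal/mol unit choice, and the example energies are assumptions.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K); unit choice is an assumption

def swap_probability(energy_i, energy_j, temp_i, temp_j):
    """Metropolis acceptance probability for exchanging the configurations
    of replica i (at temp_i) and replica j (at temp_j).

    P = min(1, exp[(beta_i - beta_j) * (E_i - E_j)])
    """
    beta_i = 1.0 / (K_B * temp_i)
    beta_j = 1.0 / (K_B * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    # Avoid overflow in exp() by short-circuiting the always-accept case.
    return 1.0 if delta >= 0.0 else math.exp(delta)

def attempt_swap(energy_i, energy_j, temp_i, temp_j, rng=random):
    """Return True if the swap is accepted for this random draw."""
    return rng.random() < swap_probability(energy_i, energy_j, temp_i, temp_j)

# Hypothetical potential energies (kcal/mol) for replicas at 300 K and 320 K.
p_downhill = swap_probability(-100.0, -105.0, 300.0, 320.0)
p_uphill = swap_probability(-105.0, -100.0, 300.0, 320.0)
```

A swap that moves the lower-energy configuration to the colder replica is always accepted, while the reverse case is accepted only with probability below one.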
Affiliation(s)
- Tang-Qing Yu
- Courant Institute of Mathematical Sciences, New York University, New York, NY 10012
- Jianfeng Lu
- Department of Mathematics, Duke University, Durham, NC 27708; Department of Physics, Duke University, Durham, NC 27708; Department of Chemistry, Duke University, Durham, NC 27708
- Cameron F Abrams
- Department of Chemical and Biological Engineering, Drexel University, Philadelphia, PA 19104
- Eric Vanden-Eijnden
- Courant Institute of Mathematical Sciences, New York University, New York, NY 10012
|