For: Zhang BW, Dai W, Gallicchio E, He P, Xia J, Tan Z, Levy RM. Simulating Replica Exchange: Markov State Models, Proposal Schemes, and the Infinite Swapping Limit. J Phys Chem B 2016;120:8289-301. [PMID: 27079355 DOI: 10.1021/acs.jpcb.6b02015] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.3] [Indexed: 01/18/2023]

Cited by Other Article(s)

Cao S, Nüske F, Liu B, Soley MB, Huang X. AMUSET-TICA: A Tensor-Based Approach for Identifying Slow Collective Variables in Biomolecular Dynamics. J Chem Theory Comput 2025;21:4855-4866. [PMID: 40254940 DOI: 10.1021/acs.jctc.5c00076] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/22/2025]

Abstract

Elucidating collective variables (CVs) for biomolecular dynamics is crucial for understanding numerous biological processes. By leveraging the tensor-train data structure, a multilinear version of the AMUSE (Algorithm for Multiple Unknown Signals) algorithm for Koopman approximation (AMUSEt) was recently developed to identify CVs for biomolecular dynamics. To find slow CVs, AMUSEt transforms input features (e.g., pairwise atomic distances) into nonlinear basis functions (e.g., Gaussian functions) and encodes these nonlinear basis functions within a tensor-train structure via time-lagged correlation functions. Due to the need to fit these tensor-train data structures into computer memory, AMUSEt can handle only a limited number of input features. Consequently, AMUSEt relies on manually selecting and ranking features based on physical intuition to fully capture the slow dynamics. However, when applied to complex biological systems with numerous features, this selection and ranking process becomes increasingly challenging. To address this challenge, here we present AMUSET-TICA (AMUSEt-based Time-lagged Independent Component Analysis), a CV-identification method using time-structure-independent components (tICs) as the input features for AMUSEt. The key insight of AMUSET-TICA lies in its highly effective embedding of high-dimensional atomistic protein conformations, achieved by expanding orthogonal tICs into overlapping Gaussian basis functions through a tensor-product data structure. This eliminates the need for manually selecting and ranking input features for a wide range of biomolecular systems. We demonstrate that AMUSET-TICA consistently and significantly outperforms AMUSEt and tICA in identifying slow CVs for three different biomolecular systems: alanine dipeptide, the N-terminal domain of L9 (NTL9), and the FIP35 WW domain. 
For all these systems, the CVs generated by AMUSET-TICA accurately describe the slowest dynamical modes underlying these biological conformational changes. Furthermore, we show that AMUSET-TICA achieves performance comparable to deep-learning approaches like VAMPnets in identifying the slowest dynamical modes, while being significantly more computationally efficient in terms of CPU time. In addition, the CVs yielded by AMUSET-TICA provide insights into the folding mechanisms of NTL9 and the FIP35 WW domain, including CV3 and CV4 of the WW domain, which capture its two parallel folding pathways. We expect AMUSET-TICA can be widely applied to facilitate the investigation of biomolecular dynamics.

Xi K, Liu J, Zhu L. Locating Multiple Transition Pathways for Complex Biomolecules. J Chem Inf Model 2025;65:2961-2973. [PMID: 40064618 DOI: 10.1021/acs.jcim.4c01604] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/25/2025]

Abstract

Locating the low free energy paths (LFEPs) connecting different conformational states is among the major tasks for the simulations of complex biomolecules as the pathways encode the physical essence and, therefore, the underlying mechanism for their functional dynamics. Finding the LFEPs is yet challenging due to the numerous degrees of freedom of the molecules and expensive force calculations. To alleviate this issue, we have previously introduced a Traveling-Salesman-based Automated Path Searching (TAPS) approach that requires minimal input information to locate the LFEP closest to a given initial guess path. Despite its high efficiency for large biomolecules, it remains, as all path-searching methods, incapable of revealing multiple parallel LFEPs simultaneously, which are, however, near-ubiquitous. This work describes a comprehensive protocol that offers parallel LFEPs efficiently. Our protocol starts with a modified version of the parallel cascade approach, which extensively searches for a large pile of geometrically distinct paths of the target molecule in implicit solvents. These paths are clustered and then filtered by their cumulative barriers, yielding a smaller set of initial paths for subsequent optimization by TAPS in explicit solvents. Through this protocol, we successfully sampled eight LFEPs for the transition of Met-enkephalin from its 3₁₀-helix to the β-turn form, whose highest barriers range from 4.57 to 14.72 k_BT. Remarkably, for the activation of the L99A variant of T4 Lysozyme (T4L-L99A), our approach revealed four LFEPs. Among them, the dominant and second preferable paths (barrier of 11.8 and 19.2 k_BT) resemble previously reported mechanisms, while the other two (barrier of 23.7 and 25.3 k_BT) offer novel mechanistic insights of the flipping of residues M102/M106 and anticlock flipping of F114. These results demonstrate our protocol's robustness and efficiency in providing multiple transition paths for complex conformational changes of biomolecules.

Goonetilleke EC, Huang X. Targeting Bacterial RNA Polymerase: Harnessing Simulations and Machine Learning to Design Inhibitors for Drug-Resistant Pathogens. Biochemistry 2025;64:1169-1179. [PMID: 40014017 PMCID: PMC12016775 DOI: 10.1021/acs.biochem.4c00751] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/28/2025]

Liu B, Boysen JG, Unarta IC, Du X, Li Y, Huang X. Exploring transition states of protein conformational changes via out-of-distribution detection in the hyperspherical latent space. Nat Commun 2025;16:349. [PMID: 39753544 PMCID: PMC11699157 DOI: 10.1038/s41467-024-55228-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2024] [Accepted: 12/05/2024] [Indexed: 01/06/2025] Open

Xu T, Li Y, Gao X, Zhang L. Understanding the Fast-Triggering Unfolding Dynamics of FK-11 upon Photoexcitation of Azobenzene. J Phys Chem Lett 2024;15:3531-3540. [PMID: 38526058 DOI: 10.1021/acs.jpclett.4c00091] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/26/2024]

Wu Y, Cao S, Qiu Y, Huang X. Tutorial on how to build non-Markovian dynamic models from molecular dynamics simulations for studying protein conformational changes. J Chem Phys 2024;160:121501. [PMID: 38516972 PMCID: PMC10964226 DOI: 10.1063/5.0189429] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Accepted: 02/20/2024] [Indexed: 03/23/2024] Open

Wang X, Xu T, Yao Y, Cheung PPH, Gao X, Zhang L. SARS-CoV-2 RNA-Dependent RNA Polymerase Follows Asynchronous Translocation Pathway for Viral Transcription and Replication. J Phys Chem Lett 2023;14:10119-10128. [PMID: 37922192 DOI: 10.1021/acs.jpclett.3c01249] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/05/2023]

Cao S, Qiu Y, Kalin ML, Huang X. Integrative generalized master equation: A method to study long-timescale biomolecular dynamics via the integrals of memory kernels. J Chem Phys 2023;159:134106. [PMID: 37787134 PMCID: PMC11005468 DOI: 10.1063/5.0167287] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 07/11/2023] [Accepted: 09/18/2023] [Indexed: 10/04/2023] Open

Hellemann E, Durrant JD. Worth the Weight: Sub-Pocket EXplorer (SubPEx), a Weighted Ensemble Method to Enhance Binding-Pocket Conformational Sampling. J Chem Theory Comput 2023;19:5677-5689. [PMID: 37585617 PMCID: PMC10500992 DOI: 10.1021/acs.jctc.3c00478] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2023] [Indexed: 08/18/2023]

Qiu Y, O’Connor MS, Xue M, Liu B, Huang X. An Efficient Path Classification Algorithm Based on Variational Autoencoder to Identify Metastable Path Channels for Complex Conformational Changes. J Chem Theory Comput 2023;19:4728-4742. [PMID: 37382437 PMCID: PMC11042546 DOI: 10.1021/acs.jctc.3c00318] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Indexed: 06/30/2023]

Abstract

Conformational changes (i.e., dynamic transitions between pairs of conformational states) play important roles in many chemical and biological processes. Constructing the Markov state model (MSM) from extensive molecular dynamics (MD) simulations is an effective approach to dissect the mechanism of conformational changes. When combined with transition path theory (TPT), MSM can be applied to elucidate the ensemble of kinetic pathways connecting pairs of conformational states. However, the application of TPT to analyze complex conformational changes often results in a vast number of kinetic pathways with comparable fluxes. This obstacle is particularly pronounced in heterogeneous self-assembly and aggregation processes. The large number of kinetic pathways makes it challenging to comprehend the molecular mechanisms underlying conformational changes of interest. To address this challenge, we have developed a path classification algorithm named latent-space path clustering (LPC) that efficiently lumps parallel kinetic pathways into distinct metastable path channels, making them easier to comprehend. In our algorithm, MD conformations are first projected onto a low-dimensional space containing a small set of collective variables (CVs) by time-structure-based independent component analysis (tICA) with kinetic mapping. Then, MSM and TPT are constructed to obtain the ensemble of pathways, and a deep learning architecture named the variational autoencoder (VAE) is used to learn the spatial distributions of kinetic pathways in the continuous CV space. Based on the trained VAE model, the TPT-generated ensemble of kinetic pathways can be embedded into a latent space, where the classification becomes clear. We show that LPC can efficiently and accurately identify the metastable path channels in three systems: a 2D potential, the aggregation of two hydrophobic particles in water, and the folding of the Fip35 WW domain. 
Using the 2D potential, we further demonstrate that our LPC algorithm outperforms the previous path-lumping algorithms by making substantially fewer incorrect assignments of individual pathways to four path channels. We expect that LPC can be widely applied to identify the dominant kinetic pathways underlying complex conformational changes.

Hellemann E, Durrant JD. Worth the weight: Sub-Pocket EXplorer (SubPEx), a weighted-ensemble method to enhance binding-pocket conformational sampling. bioRxiv 2023:2023.05.03.539330. [PMID: 37251500 PMCID: PMC10214482 DOI: 10.1101/2023.05.03.539330] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2023]

Dominic AJ, Cao S, Montoya-Castillo A, Huang X. Memory Unlocks the Future of Biomolecular Dynamics: Transformative Tools to Uncover Physical Insights Accurately and Efficiently. J Am Chem Soc 2023;145:9916-9927. [PMID: 37104720 DOI: 10.1021/jacs.3c01095] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Indexed: 04/29/2023]

Abstract

Conformational changes underpin function and encode complex biomolecular mechanisms. Gaining atomic-level detail of how such changes occur has the potential to reveal these mechanisms and is of critical importance in identifying drug targets, facilitating rational drug design, and enabling bioengineering applications. While the past two decades have brought Markov state model techniques to the point where practitioners can regularly use them to glimpse the long-time dynamics of slow conformations in complex systems, many systems are still beyond their reach. In this Perspective, we discuss how including memory (i.e., non-Markovian effects) can reduce the computational cost to predict the long-time dynamics in these complex systems by orders of magnitude and with greater accuracy and resolution than state-of-the-art Markov state models. We illustrate how memory lies at the heart of successful and promising techniques, ranging from the Fokker-Planck and generalized Langevin equations to deep-learning recurrent neural networks and generalized master equations. We delineate how these techniques work, identify insights that they can offer in biomolecular systems, and discuss their advantages and disadvantages in practical settings. We show how generalized master equations can enable the investigation of, for example, the gate-opening process in RNA polymerase II and demonstrate how our recent advances tame the deleterious influence of statistical underconvergence of the molecular dynamics simulations used to parameterize these techniques. This represents a significant leap forward that will enable our memory-based techniques to interrogate systems that are currently beyond the reach of even the best Markov state models. We conclude by discussing some current challenges and future prospects for how exploiting memory will open the door to many exciting opportunities.

Dominic AJ, Sayer T, Cao S, Markland TE, Huang X, Montoya-Castillo A. Building insightful, memory-enriched models to capture long-time biochemical processes from short-time simulations. Proc Natl Acad Sci U S A 2023;120:e2221048120. [PMID: 36920924 PMCID: PMC10041170 DOI: 10.1073/pnas.2221048120] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 12/11/2022] [Accepted: 02/21/2023] [Indexed: 03/16/2023] Open

Unarta IC, Goonetilleke EC, Wang D, Huang X. Nucleotide addition and cleavage by RNA polymerase II: Coordination of two catalytic reactions using a single active site. J Biol Chem 2022;299:102844. [PMID: 36581202 PMCID: PMC9860460 DOI: 10.1016/j.jbc.2022.102844] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 07/03/2022] [Revised: 12/19/2022] [Accepted: 12/22/2022] [Indexed: 12/28/2022] Open

Gu H, Wang W, Cao S, Unarta IC, Yao Y, Sheong FK, Huang X. RPnet: a reverse-projection-based neural network for coarse-graining metastable conformational states for protein dynamics. Phys Chem Chem Phys 2022;24:1462-1474. [PMID: 34985469 DOI: 10.1039/d1cp03622j] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 12/31/2022]

Abstract

The Markov State Model (MSM) is a powerful tool for modeling long timescale dynamics based on numerous short molecular dynamics (MD) simulation trajectories, which makes it a useful tool for elucidating the conformational changes of biological macromolecules. By partitioning the phase space into discretized states and estimating the probabilities of inter-state transitions based on short MD trajectories, one can construct a kinetic network model that could be used to extrapolate long-timescale kinetics if the Markovian condition is met. However, meeting the Markovian condition often requires hundreds or even thousands of states (microstates), which greatly hinders the comprehension of the conformational dynamics of complex biomolecules. Kinetic lumping algorithms can coarse grain numerous microstates into a handful of metastable states (macrostates), which would greatly facilitate the elucidation of biological mechanisms. In this work, we have developed a reverse-projection-based neural network (RPnet) to lump microstates into macrostates, by making use of a physics-based loss function that is based on the projection operator framework of conformational dynamics. By recognizing that microstate and macrostate transition modes can be related through a projection process, we have developed a reverse-projection scheme to directly compare the microstate and macrostate dynamics. Based on this reverse-projection scheme, we designed a loss function that allows the effective assessment of the quality of a given kinetic lumping. We then make use of a neural network to efficiently minimize this loss function to obtain an optimized set of macrostates. We have demonstrated the power of our RPnet in analyzing the dynamics of a numerical 2D potential, alanine dipeptide, and the clamp opening of an RNA polymerase. 
In all these systems, we have illustrated that our method could yield comparable or better results than competing methods in terms of state partitioning and reproduction of slow dynamics. We expect that our RPnet holds promise in analyzing the conformational dynamics of biological macromolecules.
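
The MSM construction step summarized in the abstract above (discretize conformations into states, then estimate inter-state transition probabilities from short trajectories) can be illustrated with a minimal sketch. This is not the RPnet method or the authors' code. It is a plain row-normalized count estimator, and the function name `estimate_transition_matrix`, the lag parameter, and the toy trajectories are all assumptions made for illustration.

```python
import numpy as np

def estimate_transition_matrix(dtrajs, n_states, lag=1):
    """Row-normalized transition matrix from discrete state trajectories.

    Hypothetical helper: a simple maximum-likelihood count estimator
    without any reversibility constraint.
    """
    counts = np.zeros((n_states, n_states))
    for traj in dtrajs:
        traj = np.asarray(traj)
        # Count every observed transition i -> j separated by `lag` steps.
        for i, j in zip(traj[:-lag], traj[lag:]):
            counts[i, j] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # States with no outgoing counts keep an all-zero row.
    return np.divide(counts, row_sums,
                     out=np.zeros_like(counts), where=row_sums > 0)

# Toy discrete trajectories over three states (made-up data).
dtrajs = [[0, 0, 1, 1, 0], [1, 2, 2, 1, 1]]
T = estimate_transition_matrix(dtrajs, n_states=3)
print(T)
```

Each row of `T` holds the estimated probabilities of moving from one state to every other state after one lag time. Kinetic lumping methods such as the one described above start from a microstate model of this kind.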

Saikia N, Yanez-Orozco IS, Qiu R, Hao P, Milikisiyants S, Ou E, Hamilton GL, Weninger KR, Smirnova TI, Sanabria H, Ding F. Integrative structural dynamics probing of the conformational heterogeneity in synaptosomal-associated protein 25. Cell Rep Phys Sci 2021;2:100616. [PMID: 34888535 PMCID: PMC8654206 DOI: 10.1016/j.xcrp.2021.100616] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 05/28/2023]

Konovalov K, Unarta IC, Cao S, Goonetilleke EC, Huang X. Markov State Models to Study the Functional Dynamics of Proteins in the Wake of Machine Learning. JACS Au 2021;1:1330-1341. [PMID: 34604842 PMCID: PMC8479766 DOI: 10.1021/jacsau.1c00254] [Citation(s) in RCA: 76] [Impact Index Per Article: 19.0] [Received: 06/07/2021] [Indexed: 05/19/2023]

Abstract

Markov state models (MSMs) based on molecular dynamics (MD) simulations are routinely employed to study protein folding, however, their application to functional conformational changes of biomolecules is still limited. In the past few years, the field of computational chemistry has experienced a surge of advancements stemming from machine learning algorithms, and MSMs have not been left out. Unlike global processes, such as protein folding, the application of MSMs to functional conformational changes is challenging because they mostly consist of localized structural transitions. Therefore, it is critical to properly select a subset of structural features that can describe the slowest dynamics of these functional conformational changes. To address this challenge, we recommend several automatic feature selection methods such as Spectral-OASIS. To identify states in MSMs, the chosen features can be subject to dimensionality reduction methods such as TICA or deep learning based VAMPNets to project MD conformations onto a few collective variables for subsequent clustering. Another challenge for the application of MSMs to the study of functional conformational changes is the ability to comprehend their biophysical mechanisms, as MSMs built for these processes often require a large number of states. We recommend the recently developed quasi-MSMs (qMSMs) to address this issue. Compared to MSMs, qMSMs encode the non-Markovian dynamics via the generalized master equation and can significantly reduce the number of states. As a result, qMSMs can be built with a handful of states to facilitate the interpretation of functional conformational changes. In the wake of machine learning, we believe that the rapid advancement in the MSM methodology will lead to their wider application in studying functional conformational changes of biomolecules.

Kolimi N, Pabbathi A, Saikia N, Ding F, Sanabria H, Alper J. Out-of-Equilibrium Biophysical Chemistry: The Case for Multidimensional, Integrated Single-Molecule Approaches. J Phys Chem B 2021;125:10404-10418. [PMID: 34506140 PMCID: PMC8474109 DOI: 10.1021/acs.jpcb.1c02424] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 12/22/2022]

Xi K, Hu Z, Wu Q, Wei M, Qian R, Zhu L. Assessing the Performance of Traveling-salesman based Automated Path Searching (TAPS) on Complex Biomolecular Systems. J Chem Theory Comput 2021;17:5301-5311. [PMID: 34270241 DOI: 10.1021/acs.jctc.1c00182] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 12/12/2022]

A comprehensive mechanism for 5-carboxylcytosine-induced transcriptional pausing revealed by Markov state models. J Biol Chem 2021;296:100735. [PMID: 33991521 PMCID: PMC8191312 DOI: 10.1016/j.jbc.2021.100735] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 12/17/2020] [Revised: 04/27/2021] [Accepted: 04/28/2021] [Indexed: 11/23/2022] Open

Role of bacterial RNA polymerase gate opening dynamics in DNA loading and antibiotics inhibition elucidated by quasi-Markov State Model. Proc Natl Acad Sci U S A 2021;118:2024324118. [PMID: 33883282 DOI: 10.1073/pnas.2024324118] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Indexed: 11/18/2022] Open

Cao S, Montoya-Castillo A, Wang W, Markland TE, Huang X. On the advantages of exploiting memory in Markov state models for biomolecular dynamics. J Chem Phys 2020;153:014105. [PMID: 32640825 DOI: 10.1063/5.0010787] [Citation(s) in RCA: 36] [Impact Index Per Article: 9.0] [Indexed: 01/24/2023] Open

Wang X, Unarta IC, Cheung PPH, Huang X. Elucidating molecular mechanisms of functional conformational changes of proteins via Markov state models. Curr Opin Struct Biol 2020;67:69-77. [PMID: 33126140 DOI: 10.1016/j.sbi.2020.10.005] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 08/29/2020] [Revised: 09/28/2020] [Accepted: 10/07/2020] [Indexed: 01/01/2023]

König G, Glaser N, Schroeder B, Kubincová A, Hünenberger PH, Riniker S. An Alternative to Conventional λ-Intermediate States in Alchemical Free Energy Calculations: λ-Enveloping Distribution Sampling. J Chem Inf Model 2020;60:5407-5423. [PMID: 32794763 DOI: 10.1021/acs.jcim.0c00520] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Indexed: 01/01/2023]

Target search and recognition mechanisms of glycosylase AlkD revealed by scanning FRET-FCS and Markov state models. Proc Natl Acad Sci U S A 2020;117:21889-21895. [PMID: 32820079 PMCID: PMC7486748 DOI: 10.1073/pnas.2002971117] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Indexed: 12/14/2022] Open

Abstract

DNA glycosylase repairs DNA damage to maintain the genome integrity, and thus it is essential for the survival of all organisms. However, it remains a long-standing puzzle how glycosylase diffuses along the genomic DNA to locate the sparse and aberrant lesion sites efficiently and accurately in the genome containing numerous base pairs. Previously, only the high-speed–low-accuracy search mode has been characterized experimentally, while the low-speed–high-accuracy mode is undetectable. Here, we observed the low-speed mode of glycosylase AlkD translocating, and further dissected its molecular mechanisms. To achieve this, we developed an integrated platform by combining scanning FRET-FCS with Markov state model. We expect that this platform can be widely applied to investigate other glycosylases and DNA-binding proteins.

DNA glycosylase is responsible for repairing DNA damage to maintain the genome stability and integrity. However, how glycosylase can efficiently and accurately recognize DNA lesions across the enormous DNA genome remains elusive. It has been hypothesized that glycosylase translocates along the DNA by alternating between a fast but low-accuracy diffusion mode and a slow but high-accuracy mode when searching for DNA lesions. However, the slow mode has not been successfully characterized due to the limitation in the spatial and temporal resolutions of current experimental techniques. Using a newly developed scanning fluorescence resonance energy transfer (FRET)–fluorescence correlation spectroscopy (FCS) platform, we were able to observe both slow and fast modes of glycosylase AlkD translocating on double-stranded DNA (dsDNA), reaching the temporal resolution of microsecond and spatial resolution of subnanometer. The underlying molecular mechanism of the slow mode was further elucidated by Markov state model built from extensive all-atom molecular dynamics simulations. We found that in the slow mode, AlkD follows an asymmetric diffusion pathway, i.e., rotation followed by translation. Furthermore, the essential role of Y27 in AlkD diffusion dynamics was identified both experimentally and computationally. Our results provided mechanistic insights on how conformational dynamics of AlkD–dsDNA complex coordinate different diffusion modes to accomplish the search for DNA lesions with high efficiency and accuracy. We anticipate that the mechanism adopted by AlkD to search for DNA lesions could be a general one utilized by other glycosylases and DNA binding proteins.

George A, Purnaprajna M, Athri P. Laplacian score and genetic algorithm based automatic feature selection for Markov State Models in adaptive sampling based molecular dynamics. PeerJ Phys Chem 2020. [DOI: 10.7717/peerj-pchem.9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/18/2022] Open

Hahn DF, Zarotiadis RA, Hünenberger PH. The Conveyor Belt Umbrella Sampling (CBUS) Scheme: Principle and Application to the Calculation of the Absolute Binding Free Energies of Alkali Cations to Crown Ethers. J Chem Theory Comput 2020;16:2474-2493. [DOI: 10.1021/acs.jctc.9b00998] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/30/2022]

Hahn DF, König G, Hünenberger PH. Overcoming Orthogonal Barriers in Alchemical Free Energy Calculations: On the Relative Merits of λ-Variations, λ-Extrapolations, and Biasing. J Chem Theory Comput 2020;16:1630-1645. [DOI: 10.1021/acs.jctc.9b00853] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 01/05/2023]

Cui D, Zhang BW, Tan Z, Levy RM. Ligand Binding Thermodynamic Cycles: Hysteresis, the Locally Weighted Histogram Analysis Method, and the Overlapping States Matrix. J Chem Theory Comput 2019;16:67-79. [PMID: 31743019 DOI: 10.1021/acs.jctc.9b00740] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 11/28/2022]

Abstract

Free energy perturbation (FEP) simulations have been widely applied to obtain predictions of the relative binding free energy for a series of congeneric ligands binding to the same receptor, which is an essential component for the lead optimization process in computer-aided drug discovery. In the case of several congeneric ligands forming a perturbation map involving a closed thermodynamic cycle, the summation of the estimated free energy change along each edge in the cycle using Bennett acceptance ratio (BAR) usually will deviate from zero due to systematic and random errors, which is the hysteresis of cycle closure. In this work, the advanced reweighting techniques binless weighted histogram analysis method (UWHAM) and locally weighted histogram analysis method (LWHAM) are applied to provide statistical estimators of the free energy change along each edge in order to eliminate the hysteresis effect. As an example, we analyze a closed thermodynamic cycle involving four congeneric ligands which bind to HIV-1 integrase, a promising target which has emerged for antiviral therapy. We demonstrate that, compared with FEP and BAR, more accurate and hysteresis-free estimates of free energy differences can be achieved by using UWHAM to find a single estimate of the density of states based on all of the data in the cycle. Furthermore, by comparison of LWHAM results obtained from the inclusion of different numbers of neighboring states with UWHAM estimation involving all the states, we show how to determine the optimal neighborhood size in the LWHAM analysis to balance the trade-offs between computational cost and accuracy of the free energy prediction. Even with the smallest neighborhood, LWHAM can improve the BAR free energy estimates using the same input data as BAR. We introduce an overlapping states matrix that is constructed by using the global jump formula of LWHAM and plot its heat map. 
The heat map provides a quantitative measure of the overlap between pairs of alchemical/thermodynamic states. We explain how to identify and improve the FEP calculations along the edges that most likely cause large systematic errors by using the heat map of the overlapping states matrix and by comparing the BAR and UWHAM estimates of the free energy change.
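
The cycle-closure hysteresis described in this abstract can be illustrated with a minimal sketch. All numbers, state labels, and function names below are hypothetical and are not taken from the paper. The sketch only shows why edge estimates computed independently (as with pairwise BAR) can sum to a nonzero value around a closed cycle, whereas edge values derived from a single set of per-state free energies (the kind of global estimate UWHAM is described as providing) close by construction.

```python
import math

def cycle_closure(edge_dG):
    """Signed sum of free energy changes around a closed cycle (ideally zero)."""
    return math.fsum(edge_dG)

def edges_from_state_free_energies(state_f, cycle):
    """Edge free energy changes derived from one free energy value per state."""
    successors = cycle[1:] + cycle[:1]  # wrap around to close the cycle
    return [state_f[b] - state_f[a] for a, b in zip(cycle, successors)]

# Hypothetical per-edge estimates (kcal/mol), each obtained independently
# for the edges A->B, B->C, C->D, D->A.
pairwise_dG = [1.2, -0.7, 0.4, -0.6]
hysteresis = cycle_closure(pairwise_dG)

# Hypothetical per-state free energies from a single global estimate.
state_f = {"A": 0.0, "B": 1.1, "C": 0.5, "D": 0.8}
global_dG = edges_from_state_free_energies(state_f, ["A", "B", "C", "D"])
global_closure = cycle_closure(global_dG)

print(f"pairwise closure: {hysteresis:.2f} kcal/mol")
print(f"global closure magnitude: {abs(global_closure):.2f} kcal/mol")
```

The second closure vanishes because the per-state values telescope around the cycle; this says nothing about the accuracy of either estimator, only about closure.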

Xia J, Flynn W, Gallicchio E, Uplinger K, Armstrong JD, Forli S, Olson AJ, Levy RM. Massive-Scale Binding Free Energy Simulations of HIV Integrase Complexes Using Asynchronous Replica Exchange Framework Implemented on the IBM WCG Distributed Network. J Chem Inf Model 2019;59:1382-1397. [PMID: 30758197 PMCID: PMC6496938 DOI: 10.1021/acs.jcim.8b00817] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]

Abstract

To perform massive-scale replica exchange molecular dynamics (REMD) simulations for calculating binding free energies of protein-ligand complexes, we implemented the asynchronous replica exchange (AsyncRE) framework of the binding energy distribution analysis method (BEDAM) in implicit solvent on the IBM World Community Grid (WCG) and optimized the simulation parameters to reduce the overhead and improve the prediction power of the WCG AsyncRE simulations. We also performed the first massive-scale binding free energy calculations using the WCG distributed computing grid and 301 ligands from the SAMPL4 challenge for large-scale binding free energy predictions of HIV-1 integrase complexes. In total there are ∼10000 simulated complexes, ∼1 million replicas, and ∼2000 μs of aggregated MD simulations. Running AsyncRE MD simulations on the WCG requires accepting a trade-off between the number of replicas that can be run (breadth) and the number of full RE cycles that can be completed per replica (depth). As compared with synchronous Replica Exchange (SyncRE) running on tightly coupled clusters like XSEDE, on the WCG many more replicas can be launched simultaneously on heterogeneous distributed hardware, but each full RE cycle requires more overhead. We compared the WCG results with those from AutoDock and more advanced RE simulations including the use of flattening potentials to accelerate sampling of selected degrees of freedom of ligands and/or receptors related to slow dynamics due to high energy barriers. We propose a suitable strategy of RE simulations to refine high throughput docking results which can be matched to corresponding computing resources: from HPC clusters, to small or medium-size distributed campus grids, and finally to massive-scale computing networks including millions of CPUs like the resources available on the WCG.

Zhu L, Sheong FK, Cao S, Liu S, Unarta IC, Huang X. TAPS: A traveling-salesman based automated path searching method for functional conformational changes of biological macromolecules. J Chem Phys 2019;150:124105. [DOI: 10.1063/1.5082633] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/18/2023] Open

Hahn DF, Hünenberger PH. Alchemical Free-Energy Calculations by Multiple-Replica λ-Dynamics: The Conveyor Belt Thermodynamic Integration Scheme. J Chem Theory Comput 2019;15:2392-2419. [PMID: 30821973 DOI: 10.1021/acs.jctc.8b00782] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/06/2023]

Abstract

A new method is proposed to calculate alchemical free-energy differences based on molecular dynamics (MD) simulations, called the conveyor belt thermodynamic integration (CBTI) scheme. As in thermodynamic integration (TI), K replicas of the system are simulated at different values of the alchemical coupling parameter λ. The number K is taken to be even, and the replicas are equally spaced on a forward-turn-backward-turn path, akin to a conveyor belt (CB) between the two physical end-states; and as in λ-dynamics (λD), the λ-values associated with the individual systems evolve in time along the simulation. However, they do so in a concerted fashion, determined by the evolution of a single dynamical variable Λ of period 2π controlling the advance of the entire CB. Thus, a change of Λ is always associated with K/2 equispaced replicas moving forward and K/2 equispaced replicas moving backward along λ. As a result, the effective free-energy profile of the replica system along Λ is periodic of period 2πK^-1, and the magnitude of its variations decreases rapidly upon increasing K, at least as K^-1 in the limit of large K. When a sufficient number of replicas is used, these variations become small, which enables a complete and quasi-homogeneous coverage of the λ-range by the replica system, without application of any biasing potential. If desired, a memory-based biasing potential can still be added to further homogenize the sampling, the preoptimization of which is computationally inexpensive. The final free-energy profile along λ is calculated similarly to TI, by binning of the Hamiltonian λ-derivative as a function of λ considering all replicas simultaneously, followed by quadrature integration. The associated quadrature error can be kept very low owing to the continuous and quasi-homogeneous λ-sampling.
The CBTI scheme can be viewed as a continuous/deterministic/dynamical analog of the Hamiltonian replica-exchange/permutation (HRE/HRP) schemes or as a correlated multiple-replica analog of the λD or λ-local elevation umbrella sampling (λ-LEUS) schemes. Compared to TI, it shares the advantage of the latter schemes in terms of enhanced orthogonal sampling, i.e. the availability of variable-λ paths to circumvent conformational barriers present at specific λ-values. Compared to HRE/HRP, it permits a deterministic and continuous sampling of the λ-range, is expected to be less sensitive to possible artifacts of the thermo- and barostating schemes, and bypasses the need to carefully preselect a λ-ladder and a swapping-attempt frequency. Compared to λ-LEUS, it eliminates (or drastically reduces) the dead time associated with the preoptimization of a biasing potential. The goal of this article is to provide the mathematical/physical formulation of the proposed CBTI scheme, along with an initial application of the method to the calculation of the hydration free energy of methanol.
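
The conveyor-belt geometry described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the zigzag mapping from phase to λ and the names `zigzag`, `replica_lambdas`, and `count_forward` are hypothetical choices made here. The sketch places K replicas at equally spaced phases offset from a single variable Λ, folds each phase onto λ in [0, 1] with a forward branch and a backward branch, and checks the two properties stated in the abstract: K/2 replicas are on each branch, and advancing Λ by 2π/K reproduces the same set of λ-values.

```python
import math

TWO_PI = 2.0 * math.pi

def zigzag(phase):
    """Assumed fold of a phase onto lambda in [0, 1]: forward, then backward."""
    phase = phase % TWO_PI
    if phase < math.pi:
        return phase / math.pi        # forward branch: lambda rises from 0 to 1
    return 2.0 - phase / math.pi      # backward branch: lambda falls from 1 to 0

def replica_lambdas(big_lambda, n_replicas):
    """Lambda values of K equispaced replicas for a given belt position."""
    if n_replicas % 2 != 0:
        raise ValueError("the number of replicas K is taken to be even")
    step = TWO_PI / n_replicas
    return [zigzag(big_lambda + k * step) for k in range(n_replicas)]

def count_forward(big_lambda, n_replicas):
    """Number of replicas currently on the forward branch."""
    step = TWO_PI / n_replicas
    phases = [(big_lambda + k * step) % TWO_PI for k in range(n_replicas)]
    return sum(1 for p in phases if p < math.pi)

K = 8
belt_position = 0.3
lams = replica_lambdas(belt_position, K)
shifted = replica_lambdas(belt_position + TWO_PI / K, K)

print(sorted(round(x, 3) for x in lams))
print(count_forward(belt_position, K), "of", K, "replicas moving forward")
```

Because a shift of Λ by 2π/K only relabels the replicas, any quantity that depends on the set of λ-values alone repeats with that period.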

Zhang BW, Arasteh S, Levy RM. The UWHAM and SWHAM Software Package. Sci Rep 2019;9:2803. [PMID: 30808938 PMCID: PMC6391495 DOI: 10.1038/s41598-019-39420-x] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/25/2018] [Accepted: 01/21/2019] [Indexed: 11/09/2022] Open

Demuynck R, Wieme J, Rogge SMJ, Dedecker KD, Vanduyfhuys L, Waroquier M, Van Speybroeck V. Protocol for Identifying Accurate Collective Variables in Enhanced Molecular Dynamics Simulations for the Description of Structural Transformations in Flexible Metal-Organic Frameworks. J Chem Theory Comput 2018;14:5511-5526. [PMID: 30336016 PMCID: PMC6236469 DOI: 10.1021/acs.jctc.8b00725] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/15/2018] [Indexed: 01/05/2023]

Abstract

Various kinds of flexibility have been observed in metal-organic frameworks, which may originate from the topology of the material or the presence of flexible ligands. The construction of free energy profiles describing the full dynamical behavior along the phase transition path is challenging since it is not trivial to identify collective variables able to identify all metastable states along the reaction path. In this work, a systematic three-step protocol to uniquely identify the dominant order parameters for structural transformations in flexible metal-organic frameworks and subsequently construct accurate free energy profiles is presented. Methodologically, this protocol is rooted in the time-structure based independent component analysis (tICA), a well-established statistical modeling technique embedded in the Markov state model methodology and often employed to study protein folding, that allows for the identification of the slowest order parameters characterizing the structural transformation. To ensure an unbiased and systematic identification of these order parameters, the tICA decomposition is performed based on information from a prior replica exchange (RE) simulation, as this technique enhances the sampling along all degrees of freedom of the system simultaneously. From this simulation, the tICA procedure extracts the order parameters (often structural parameters) that characterize the slowest transformations in the material. Subsequently, these order parameters are adopted in traditional enhanced sampling methods such as umbrella sampling, thermodynamic integration, and variationally enhanced sampling to construct accurate free energy profiles capturing the flexibility in these nanoporous materials. In this work, the applicability of this tICA-RE protocol is demonstrated by determining the slowest order parameters in both MIL-53(Al) and CAU-13, which exhibit a strongly different type of flexibility.
The obtained free energy profiles as a function of this extracted order parameter are furthermore compared to the profiles obtained when adopting less-suited collective variables, indicating the importance of systematically selecting the relevant order parameters to construct accurate free energy profiles for flexible metal-organic frameworks, which is in correspondence with experimental findings. The method succeeds in mapping the full free energy surface in terms of appropriate collective variables for MOFs exhibiting linker flexibility. For CAU-13, we show the decreased stability of the closed pore phase by systematically adding adsorbed xylene molecules in the framework.

Wang W, Liang T, Sheong FK, Fan X, Huang X. An efficient Bayesian kinetic lumping algorithm to identify metastable conformational states via Gibbs sampling. J Chem Phys 2018;149:072337. [PMID: 30134698 DOI: 10.1063/1.5027001] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/29/2023] Open

Tran DP, Takemura K, Kuwata K, Kitao A. Protein-Ligand Dissociation Simulated by Parallel Cascade Selection Molecular Dynamics. J Chem Theory Comput 2017;14:404-417. [PMID: 29182324 DOI: 10.1021/acs.jctc.7b00504] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]

Lee KH, Chen J. Efficacy of independence sampling in replica exchange simulations of ordered and disordered proteins. J Comput Chem 2017;38:2632-2640. [PMID: 28841239 PMCID: PMC5752115 DOI: 10.1002/jcc.24923] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/01/2017] [Revised: 05/24/2017] [Accepted: 08/03/2017] [Indexed: 01/23/2023]

Zhang BW, Deng N, Tan Z, Levy RM. Stratified UWHAM and Its Stochastic Approximation for Multicanonical Simulations Which Are Far from Equilibrium. J Chem Theory Comput 2017;13:4660-4674. [PMID: 28902500 PMCID: PMC5897113 DOI: 10.1021/acs.jctc.7b00651] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]

Wang W, Cao S, Zhu L, Huang X. Constructing Markov State Models to elucidate the functional conformational changes of complex biomolecules. Wiley Interdiscip Rev Comput Mol Sci 2017. [DOI: 10.1002/wcms.1343] [Citation(s) in RCA: 52] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/08/2023]

Ding X, Vilseck JZ, Hayes RL, Brooks CL. Gibbs Sampler-Based λ-Dynamics and Rao-Blackwell Estimator for Alchemical Free Energy Calculation. J Chem Theory Comput 2017;13:2501-2510. [PMID: 28510433 DOI: 10.1021/acs.jctc.7b00204] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]

Yu TQ, Lu J, Abrams CF, Vanden-Eijnden E. Multiscale implementation of infinite-swap replica exchange molecular dynamics. Proc Natl Acad Sci U S A 2016;113:11744-11749. [PMID: 27698148 PMCID: PMC5081654 DOI: 10.1073/pnas.1605089113] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open