1
Ishizone T, Matsunaga Y, Fuchigami S, Nakamura K. Representation of Protein Dynamics Disentangled by Time-Structure-Based Prior. J Chem Theory Comput 2024; 20:436-450. [PMID: 38151233 DOI: 10.1021/acs.jctc.3c01025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/29/2023]
Abstract
Representation learning (RL) is a universal technique for deriving low-dimensional disentangled representations from high-dimensional observations, aiding in a multitude of downstream tasks. RL has been extensively applied to various data types, including images and natural language. Here, we analyze molecular dynamics (MD) simulation data of biomolecules in terms of RL. Currently, state-of-the-art RL techniques, mainly motivated by the variational principle, try to capture slow motions in the representation (latent) space. Here, we propose two methods based on an alternative perspective on the disentanglement in the latent space. By disentanglement, we here mean the separation of underlying factors in the simulation data, aiding in detecting physically important coordinates for conformational transitions. The proposed methods introduce a simple prior that imposes temporal constraints in the latent space, serving as a regularization term to facilitate the capture of disentangled representations of dynamics. Comparison with other methods via the analysis of MD simulation trajectories for alanine dipeptide and chignolin validates that the proposed methods construct Markov state models (MSMs) whose implied time scales are comparable to those of the state-of-the-art methods. Using a measure based on total variation, we quantitatively evaluated that the proposed methods successfully disentangle physically important coordinates, aiding the interpretation of folding/unfolding transitions of chignolin. Overall, our methods provide good representations of complex biomolecular dynamics for downstream tasks, allowing for better interpretations of the conformational transitions.
Affiliation(s)
- Tsuyoshi Ishizone
  - Mathematical Sciences Program, Graduate School of Advanced Mathematical Sciences, Meiji University, Nakano 4-21-1, Nakano-ku, Tokyo 164-8525, Japan
- Yasuhiro Matsunaga
  - Graduate School of Science and Engineering, Saitama University, Shimo-Okubo 255, Sakura-ku, Saitama-shi, Saitama 338-8570, Japan
- Sotaro Fuchigami
  - Physical Biochemistry Laboratory, Division of Pharmaceutical Sciences, School of Pharmaceutical Sciences, University of Shizuoka, 52-1 Yada, Suruga-ku, Shizuoka 422-8526, Japan
- Kazuyuki Nakamura
  - Department of Mathematical Sciences Based on Modeling and Analysis, School of Interdisciplinary Mathematical Sciences, Meiji University, Nakano 4-21-1, Nakano-ku, Tokyo 164-8525, Japan
2
Low K, Coote ML, Izgorodina EI. Inclusion of More Physics Leads to Less Data: Learning the Interaction Energy as a Function of Electron Deformation Density with Limited Training Data. J Chem Theory Comput 2022; 18:1607-1618. [PMID: 35175045 DOI: 10.1021/acs.jctc.1c01264] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 02/06/2023]
Abstract
Machine learning (ML) approaches to predicting quantum mechanical (QM) properties have made great strides toward achieving the computational chemist's holy grail of structure-based property prediction. In contrast to direct ML methods, which encode a molecule with only structural information, in this work, we show that QM descriptors improve ML predictions of dimer interaction energy, both in terms of accuracy and data efficiency, by incorporating electronic information into the descriptor. We present the electron deformation density interaction energy machine learning (EDDIE-ML) model, which predicts the interaction energy as a function of Hartree-Fock electron deformation density. We compare its performance with leading direct ML schemes and modern DFT methods for the prediction of interaction energies for dimers of varying charge type, size, and intermolecular separation. Under a low-data regime, EDDIE-ML outperforms other direct ML schemes and is the only model readily transferrable to larger, more complex systems including base pair trimers and porous cages. The underlying physical connection between the density and interaction energy enables EDDIE-ML to reach an accuracy comparable to modern DFT functionals in fewer training data points compared to other ML methods.
Affiliation(s)
- Kaycee Low
  - Monash Computational Chemistry Group, School of Chemistry, Monash University, Clayton, Victoria 3800, Australia
- Michelle L Coote
  - Research School of Chemistry, Australian National University, Canberra, Australian Capital Territory 0200, Australia
- Ekaterina I Izgorodina
  - Monash Computational Chemistry Group, School of Chemistry, Monash University, Clayton, Victoria 3800, Australia
3
Mitxelena I, López X, de Sancho D. Markov state models from hierarchical density-based assignment. J Chem Phys 2021; 155:054102. [PMID: 34364321 DOI: 10.1063/5.0056748] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022]
Abstract
Markov state models (MSMs) have become one of the preferred methods for the analysis and interpretation of molecular dynamics (MD) simulations of conformational transitions in biopolymers. While there is great variation in terms of implementation, a well-defined workflow involving multiple steps is often adopted. Typically, molecular coordinates are first subjected to dimensionality reduction and then clustered into small "microstates," which are subsequently lumped into "macrostates" using the information from the slowest eigenmodes. However, the microstate dynamics is often non-Markovian, and long lag times are required to converge the relevant slow dynamics in the MSM. Here, we propose a variation on this typical workflow, taking advantage of hierarchical density-based clustering. When applied to simulation data, this type of clustering separates high population regions of conformational space from others that are rarely visited. In this way, density-based clustering naturally implements assignment of the data based on transitions between metastable states, resulting in a core-set MSM. As a result, the state definition becomes more consistent with the assumption of Markovianity, and the timescales of the slow dynamics of the system are recovered more effectively. We present results of this simplified workflow for a model potential and MD simulations of the alanine dipeptide and the FiP35 WW domain.
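The last steps of the typical workflow summarized in this abstract, estimating a transition matrix from a discretized trajectory and reading off its slow timescales, can be sketched in a few lines. This is a minimal illustration in Python with NumPy, not the authors' implementation: the function names, the symmetrized-count estimator, and the toy two-state trajectory are all assumptions made for the example, and the density-based assignment step proposed in the paper is not reproduced here.

```python
import numpy as np

def count_matrix(dtraj, n_states, lag):
    """Count transitions i -> j separated by `lag` frames (sliding window)."""
    counts = np.zeros((n_states, n_states))
    for i, j in zip(dtraj[:-lag], dtraj[lag:]):
        counts[i, j] += 1
    return counts

def transition_matrix(counts):
    """Row-normalize symmetrized counts (a simple, assumed reversible estimator)."""
    sym = counts + counts.T
    return sym / sym.sum(axis=1, keepdims=True)

def implied_timescales(T, lag):
    """Return t_i = -lag / ln(lambda_i) for eigenvalues strictly between 0 and 1."""
    eigvals = np.sort(np.linalg.eigvals(T).real)[::-1]
    slow = eigvals[1:]                      # drop the stationary eigenvalue (1.0)
    slow = slow[(slow > 0) & (slow < 1)]    # the logarithm needs 0 < lambda < 1
    return -lag / np.log(slow)

# Toy data (assumption): a two-state chain that switches with probability 0.05.
rng = np.random.default_rng(0)
dtraj = np.zeros(10_000, dtype=int)
for t in range(1, dtraj.size):
    switch = rng.random() < 0.05
    dtraj[t] = 1 - dtraj[t - 1] if switch else dtraj[t - 1]

lag = 1
T = transition_matrix(count_matrix(dtraj, n_states=2, lag=lag))
timescales = implied_timescales(T, lag)
print(T)
print(timescales)
```

In practice the timescales are recomputed over a range of lag times; a plateau suggests that the chosen state definition is approximately Markovian at that lag.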
Affiliation(s)
- Ion Mitxelena
  - Polimero eta Material Aurreratuak: Fisika, Kimika eta Teknologia, Kimika Fakultatea, UPV/EHU & Donostia International Physics Center (DIPC), PK 1072, 20018 Donostia-San Sebastian, Euskadi, Spain
- Xabier López
  - Polimero eta Material Aurreratuak: Fisika, Kimika eta Teknologia, Kimika Fakultatea, UPV/EHU & Donostia International Physics Center (DIPC), PK 1072, 20018 Donostia-San Sebastian, Euskadi, Spain
- David de Sancho
  - Polimero eta Material Aurreratuak: Fisika, Kimika eta Teknologia, Kimika Fakultatea, UPV/EHU & Donostia International Physics Center (DIPC), PK 1072, 20018 Donostia-San Sebastian, Euskadi, Spain
4
Jiang H, Fan X. The Two-Step Clustering Approach for Metastable States Learning. Int J Mol Sci 2021; 22:6576. [PMID: 34205252 PMCID: PMC8233889 DOI: 10.3390/ijms22126576] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/31/2021] [Revised: 06/14/2021] [Accepted: 06/14/2021] [Indexed: 01/20/2023]
Abstract
Understanding the energy landscape and the conformational dynamics is crucial for studying many biological or chemical processes, such as protein-protein interaction and RNA folding. Molecular dynamics (MD) simulations have been a major source of dynamic structure. Although many methods have been proposed for learning metastable states from MD data, some key problems still need further investigation. Here, we give a brief review of recent progress in this field, with an emphasis on some popular methods belonging to a two-step clustering framework, and hope to draw more researchers to contribute to this area.
Affiliation(s)
- Hangjin Jiang
  - Center for Data Science, Zhejiang University, Hangzhou 310058, China
- Xiaodan Fan
  - Department of Statistics, The Chinese University of Hong Kong, Hong Kong, China
5
Determination of stable structure of a cluster using convolutional neural network and particle swarm optimization. Theor Chem Acc 2021. [DOI: 10.1007/s00214-021-02726-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 12/12/2022]
6
Wetherell J, Costamagna A, Gatti M, Reining L. Insights into one-body density matrices using deep learning. Faraday Discuss 2020; 224:265-291. [PMID: 32936199 DOI: 10.1039/d0fd00061b] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/12/2022]
Abstract
The one-body reduced density matrix (1-RDM) of a many-body system at zero temperature gives direct access to many observables, such as the charge density, kinetic energy and occupation numbers. It would be desirable to express it as a simple functional of the density or of other local observables, but to date satisfactory approximations have not yet been found. Deep learning is the state of the art approach to performing high dimensional regressions and classification tasks, and is becoming widely used in the condensed matter community to develop increasingly accurate density functionals. Autoencoders are deep learning models that perform efficient dimensionality reduction, allowing the distillation of data to the fundamental features needed to represent it. By training autoencoders on a large data-set of 1-RDMs from exactly solvable real-space model systems, and performing principal component analysis, the machine learns to what extent the data can be compressed and hence how it is constrained. We gain insight into these machine learned constraints and employ them to inform approximations to the 1-RDM as a functional of the charge density. We exploit known physical properties of the 1-RDM in the simplest possible cases to perform feature engineering, where we inform the structure of the models from known mathematical relations, allowing us to integrate existing understanding into the machine learning methods. By comparing various deep learning approaches we gain insight into what physical features of the density matrix are most amenable to machine learning, utilising both known and learned characteristics.
Affiliation(s)
- Jack Wetherell
  - Laboratoire des Solides Irradiés, École Polytechnique, CNRS, CEA/DRF/IRAMIS, Institut Polytechnique de Paris, F-91128 Palaiseau, France
7
Tan Q, Duan M, Li M, Han L, Huo S. Approximating dynamic proximity with a hybrid geometry energy-based kernel for diffusion maps. J Chem Phys 2019; 151:105101. [PMID: 31521094 DOI: 10.1063/1.5100968] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022]
Abstract
The diffusion map is a dimensionality reduction method. The reduction coordinates are associated with the leading eigenfunctions of the backward Fokker-Planck operator, providing a dynamic meaning for these coordinates. One of the key factors that affect the accuracy of diffusion map embedding is the dynamic measure implemented in the Gaussian kernel. A common practice in diffusion map study of molecular systems is to approximate dynamic proximity with RMSD (root-mean-square deviation). In this paper, we present a hybrid geometry-energy based kernel. Since high energy-barriers may exist between geometrically similar conformations, taking both RMSD and energy difference into account in the kernel can better describe conformational transitions between neighboring conformations and lead to accurate embedding. We applied our diffusion map method to the β-hairpin of the B1 domain of streptococcal protein G and to Trp-cage. Our results in β-hairpin show that the diffusion map embedding achieves better results with the hybrid kernel than that with the RMSD-based kernel in terms of free energy landscape characterization and a new correlation measure between the cluster center Euclidean distances in the reduced-dimension space and the reciprocals of the total net flow between these clusters. In addition, our diffusion map analysis of the ultralong molecular dynamics trajectory of Trp-cage has provided a unified view of its folding mechanism. These promising results demonstrate the effectiveness of our diffusion map approach in the analysis of the dynamics and thermodynamics of molecular systems. The hybrid geometry-energy criterion could be also useful as a general dynamic measure for other purposes.
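To make the role of the kernel concrete, the sketch below builds a standard diffusion map from a matrix of squared pairwise distances. The abstract does not give the functional form of the hybrid kernel, so `hybrid_sq_distance` (squared geometric distance plus a weighted squared energy difference), the `weight` parameter, and the one-dimensional toy data standing in for RMSD are hypothetical choices for illustration only; the published kernel may well differ.

```python
import numpy as np

def hybrid_sq_distance(coords, energies, weight):
    """Hypothetical hybrid measure: squared geometric distance plus a weighted
    squared energy difference. An assumption, not the published kernel."""
    geo = (coords[:, None] - coords[None, :]) ** 2
    ene = (energies[:, None] - energies[None, :]) ** 2
    return geo + weight * ene

def diffusion_map(sq_dist, eps, n_coords=1):
    """Standard diffusion map from a matrix of squared distances."""
    K = np.exp(-sq_dist / eps)                  # Gaussian kernel
    d = K.sum(axis=1)                           # row sums (degrees)
    S = K / np.sqrt(np.outer(d, d))             # symmetric conjugate of D^-1 K
    vals, vecs = np.linalg.eigh(S)
    order = np.argsort(vals)[::-1]              # largest eigenvalue first
    vals, vecs = vals[order], vecs[:, order]
    psi = vecs / np.sqrt(d)[:, None]            # right eigenvectors of D^-1 K
    coords = psi[:, 1:n_coords + 1] * vals[1:n_coords + 1]
    return vals, coords

# Toy data (assumption): two clusters on a line with a double-well energy.
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(0.0, 0.2, 50), rng.normal(3.0, 0.2, 50)])
energy = 0.1 * (x * (x - 3.0)) ** 2

sq_dist = hybrid_sq_distance(x, energy, weight=1.0)
vals, dc = diffusion_map(sq_dist, eps=1.0, n_coords=1)
print(vals[:3])
print(dc[:3, 0], dc[-3:, 0])
```

The leading eigenvalue is the trivial stationary mode; the first non-trivial coordinate takes opposite signs on the two clusters, which is the sense in which the embedding reflects slow transitions between them.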
Affiliation(s)
- Qingzhe Tan
  - Gustaf H. Carlson School of Chemistry and Biochemistry, Clark University, 950 Main Street, Worcester, Massachusetts 01610, USA
- Mojie Duan
  - Gustaf H. Carlson School of Chemistry and Biochemistry, Clark University, 950 Main Street, Worcester, Massachusetts 01610, USA
- Minghai Li
  - Gustaf H. Carlson School of Chemistry and Biochemistry, Clark University, 950 Main Street, Worcester, Massachusetts 01610, USA
- Li Han
  - Department of Math and Computer Science, Clark University, Worcester, Massachusetts 01610, USA
- Shuanghong Huo
  - Gustaf H. Carlson School of Chemistry and Biochemistry, Clark University, 950 Main Street, Worcester, Massachusetts 01610, USA
8
Artificial neural networks for density-functional optimizations in fermionic systems. Sci Rep 2019; 9:1886. [PMID: 30760812 PMCID: PMC6374439 DOI: 10.1038/s41598-018-37999-1] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 08/09/2018] [Accepted: 12/17/2018] [Indexed: 11/08/2022]
Abstract
In this work we propose an artificial neural network functional for the ground-state energy of fermionic interacting particles in homogeneous chains described by the Hubbard model. Our neural network functional performs excellently: it deviates from numerically exact calculations by less than 0.15% for a vast regime of interactions and for all regimes of filling factors and magnetizations. When compared to analytical functionals, the neural functional was found to be more precise in all parameter regimes, and particularly superior in the weakly interacting regime, where the analytical parametrization fails the most (~7%, against only ~0.1% for the neural network). We have also applied our homogeneous functional to finite systems, localized impurities, and harmonically confined systems within density-functional theory (DFT) methods. The results show that while our artificial neural network approach is substantially more accurate than other equivalently simple and fast DFT treatments, it performs similarly to more costly DFT calculations and other independent many-body calculations, at a fraction of the computational cost.
9
Wang W, Liang T, Sheong FK, Fan X, Huang X. An efficient Bayesian kinetic lumping algorithm to identify metastable conformational states via Gibbs sampling. J Chem Phys 2018; 149:072337. [PMID: 30134698 DOI: 10.1063/1.5027001] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Indexed: 01/29/2023]
Abstract
Markov State Model (MSM) has become a popular approach to study the conformational dynamics of complex biological systems in recent years. Built upon a large number of short molecular dynamics simulation trajectories, MSM is able to predict the long time scale dynamics of complex systems. However, to achieve Markovianity, an MSM often contains hundreds or thousands of states (microstates), hindering human interpretation of the underlying system mechanism. One way to reduce the number of states is to lump kinetically similar states together and thus coarse-grain the microstates into macrostates. In this work, we introduce a probabilistic lumping algorithm, the Gibbs lumping algorithm, to assign a probability to any given kinetic lumping using the Bayesian inference. In our algorithm, the transitions among kinetically distinct macrostates are modeled by Poisson processes, which will well reflect the separation of time scales in the underlying free energy landscape of biomolecules. Furthermore, to facilitate the search for the optimal kinetic lumping (i.e., the lumped model with the highest probability), a Gibbs sampling algorithm is introduced. To demonstrate the power of our new method, we apply it to three systems: a 2D potential, alanine dipeptide, and a WW protein domain. In comparison with six other popular lumping algorithms, we show that our method can persistently produce the lumped macrostate model with the highest probability as well as the largest metastability. We anticipate that our Gibbs lumping algorithm holds great promise to be widely applied to investigate conformational changes in biological macromolecules.
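The idea of scoring a lumping by its metastability can be illustrated without the Gibbs sampler itself. The sketch below coarse-grains a microstate transition matrix under a given assignment and scores the result by the trace of the macrostate transition matrix, a commonly used metastability measure; whether the paper uses exactly this definition is an assumption, and the four-state matrix and function names are invented for the example.

```python
import numpy as np

def stationary_distribution(T):
    """Left eigenvector of T with eigenvalue 1, normalized to sum to 1."""
    vals, vecs = np.linalg.eig(T.T)
    pi = np.real(vecs[:, np.argmax(np.real(vals))])
    return pi / pi.sum()

def lump(T, assignment, n_macro):
    """Coarse-grain T by summing stationary fluxes within each macrostate."""
    pi = stationary_distribution(T)
    A = np.zeros((T.shape[0], n_macro))
    A[np.arange(T.shape[0]), assignment] = 1.0   # membership matrix
    flux = A.T @ (pi[:, None] * T) @ A
    return flux / flux.sum(axis=1, keepdims=True)

def metastability(T_macro):
    """Sum of macrostate self-transition probabilities."""
    return np.trace(T_macro)

# Toy microstate model (assumption): states {0, 1} and {2, 3} interconvert
# quickly within each pair and slowly between pairs.
T_micro = np.array([
    [0.89, 0.10, 0.01, 0.00],
    [0.10, 0.89, 0.00, 0.01],
    [0.01, 0.00, 0.89, 0.10],
    [0.00, 0.01, 0.10, 0.89],
])

good = metastability(lump(T_micro, np.array([0, 0, 1, 1]), n_macro=2))
bad = metastability(lump(T_micro, np.array([0, 1, 0, 1]), n_macro=2))
print(good, bad)
```

The lumping that groups kinetically similar microstates scores higher than the one that mixes them, which is the property a lumping algorithm tries to maximize.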
Affiliation(s)
- Wei Wang
  - HKUST-Shenzhen Research Institute, Hi-Tech Park, Nanshan, Shenzhen 518057, China
- Tong Liang
  - Department of Statistics, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
- Fu Kit Sheong
  - Department of Chemistry, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
- Xiaodan Fan
  - Department of Statistics, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
- Xuhui Huang
  - HKUST-Shenzhen Research Institute, Hi-Tech Park, Nanshan, Shenzhen 518057, China
10
Bhoutekar A, Ghosh S, Bhattacharya S, Chatterjee A. A new class of enhanced kinetic sampling methods for building Markov state models. J Chem Phys 2018; 147:152702. [PMID: 29055344 DOI: 10.1063/1.4984932] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 12/27/2022]
Abstract
Markov state models (MSMs) and other related kinetic network models are frequently used to study the long-timescale dynamical behavior of biomolecular and materials systems. MSMs are often constructed bottom-up using brute-force molecular dynamics (MD) simulations when the model contains a large number of states and kinetic pathways that are not known a priori. However, the resulting network generally encompasses only parts of the configurational space, and regardless of any additional MD performed, several states and pathways will still remain missing. This implies that the duration for which the MSM can faithfully capture the true dynamics, which we term as the validity time for the MSM, is always finite and unfortunately much shorter than the MD time invested to construct the model. A general framework that relates the kinetic uncertainty in the model to the validity time, missing states and pathways, network topology, and statistical sampling is presented. Performing additional calculations for frequently-sampled states/pathways may not alter the MSM validity time. A new class of enhanced kinetic sampling techniques is introduced that aims at targeting rare states/pathways that contribute most to the uncertainty so that the validity time is boosted in an effective manner. Examples including straightforward 1D energy landscapes, lattice models, and biomolecular systems are provided to illustrate the application of the method. Developments presented here will be of interest to the kinetic Monte Carlo community as well.
Affiliation(s)
- Arti Bhoutekar
  - Department of Chemical Engineering, Indian Institute of Technology Bombay, Mumbai 400076, India
- Susmita Ghosh
  - Department of Physics, Indian Institute of Technology Guwahati, Guwahati 781039, India
- Swati Bhattacharya
  - Department of Chemical Engineering, Indian Institute of Technology Bombay, Mumbai 400076, India
- Abhijit Chatterjee
  - Department of Chemical Engineering, Indian Institute of Technology Bombay, Mumbai 400076, India
11
Grazioli G, Butts CT, Andricioaei I. Automated placement of interfaces in conformational kinetics calculations using machine learning. J Chem Phys 2018; 147:152727. [PMID: 29055331 DOI: 10.1063/1.4989857] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 12/23/2022]
Abstract
Several recent implementations of algorithms for sampling reaction pathways employ a strategy for placing interfaces or milestones across the reaction coordinate manifold. Interfaces can be introduced such that the full feature space describing the dynamics of a macromolecule is divided into Voronoi (or other) cells, and the global kinetics of the molecular motions can be calculated from the set of fluxes through the interfaces between the cells. Although some methods of this type are exact for an arbitrary set of cells, in practice, the calculations will converge fastest when the interfaces are placed in regions where they can best capture transitions between configurations corresponding to local minima. The aim of this paper is to introduce a fully automated machine-learning algorithm for defining a set of cells for use in kinetic sampling methodologies based on subdividing the dynamical feature space; the algorithm requires no intuition about the system or input from the user and scales to high-dimensional systems.
Affiliation(s)
- Gianmarc Grazioli
  - Department of Chemistry, University of California, Irvine, California 92697, USA
- Carter T Butts
  - California Institute for Telecommunications and Information Technology, University of California, Irvine, California 92697, USA
- Ioan Andricioaei
  - Department of Chemistry, University of California, Irvine, California 92697, USA
12
Chen J, Chen J, Pinamonti G, Clementi C. Learning Effective Molecular Models from Experimental Observables. J Chem Theory Comput 2018; 14:3849-3858. [DOI: 10.1021/acs.jctc.8b00187] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Indexed: 12/21/2022]
Affiliation(s)
- Justin Chen
  - Department of Physics and Astronomy, Rice University, Houston, Texas 77005, United States
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
- Jiming Chen
  - Department of Chemical and Biomolecular Engineering, Rice University, Houston, Texas 77005, United States
- Giovanni Pinamonti
  - Department of Mathematics and Computer Science, Freie Universität, Berlin, Germany
- Cecilia Clementi
  - Department of Chemical and Biomolecular Engineering, Rice University, Houston, Texas 77005, United States
  - Department of Mathematics and Computer Science, Freie Universität, Berlin, Germany
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
  - Department of Chemistry, Rice University, Houston, Texas 77005, United States
13
Affiliation(s)
- Brooke E. Husic
  - Department of Chemistry, Stanford University, Stanford, California 94305, United States
- Vijay S. Pande
  - Department of Chemistry, Stanford University, Stanford, California 94305, United States
14
Zuñiga C, Zaramela L, Zengler K. Elucidation of complexity and prediction of interactions in microbial communities. Microb Biotechnol 2017; 10:1500-1522. [PMID: 28925555 PMCID: PMC5658597 DOI: 10.1111/1751-7915.12855] [Citation(s) in RCA: 77] [Impact Index Per Article: 9.6] [Received: 07/11/2017] [Revised: 08/10/2017] [Accepted: 08/11/2017] [Indexed: 12/11/2022]
Abstract
Microorganisms engage in complex interactions with other members of the microbial community, higher organisms as well as their environment. However, determining the exact nature of these interactions can be challenging due to the large number of members in these communities and the manifold of interactions they can engage in. Various omic data, such as 16S rRNA gene sequencing, shotgun metagenomics, metatranscriptomics, metaproteomics and metabolomics, have been deployed to unravel the community structure, interactions and resulting community dynamics in situ. Interpretation of these multi-omic data often requires advanced computational methods. Modelling approaches are powerful tools to integrate, contextualize and interpret experimental data, thus shedding light on the underlying processes shaping the microbiome. Here, we review current methods and approaches, both experimental and computational, to elucidate interactions in microbial communities and to predict their responses to perturbations.
Affiliation(s)
- Cristal Zuñiga
  - Department of Pediatrics, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0760, USA
- Livia Zaramela
  - Department of Pediatrics, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0760, USA
- Karsten Zengler
  - Department of Pediatrics, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0760, USA
15
Brockherde F, Vogt L, Li L, Tuckerman ME, Burke K, Müller KR. Bypassing the Kohn-Sham equations with machine learning. Nat Commun 2017; 8:872. [PMID: 29021555 PMCID: PMC5636838 DOI: 10.1038/s41467-017-00839-3] [Citation(s) in RCA: 310] [Impact Index Per Article: 38.8] [Received: 08/09/2016] [Accepted: 07/26/2017] [Indexed: 11/25/2022]
Abstract
Last year, at least 30,000 scientific papers used the Kohn–Sham scheme of density functional theory to solve electronic structure problems in a wide variety of scientific fields. Machine learning holds the promise of learning the energy functional via examples, bypassing the need to solve the Kohn–Sham equations. This should yield substantial savings in computer time, allowing larger systems and/or longer time-scales to be tackled, but attempts to machine-learn this functional have been limited by the need to find its derivative. The present work overcomes this difficulty by directly learning the density-potential and energy-density maps for test systems and various molecules. We perform the first molecular dynamics simulation with a machine-learned density functional on malonaldehyde and are able to capture the intramolecular proton transfer process. Learning density models now allows the construction of accurate density functionals for realistic molecular systems. Machine learning allows electronic structure calculations to access larger system sizes and, in dynamical simulations, longer time scales. Here, the authors perform such a simulation using a machine-learned density functional that avoids direct solution of the Kohn-Sham equations.
Affiliation(s)
- Felix Brockherde
  - Machine Learning Group, Technische Universität Berlin, Marchstraße 23, 10587, Berlin, Germany
  - Max-Planck-Institut für Mikrostrukturphysik, Weinberg 2, 06120, Halle, Germany
- Leslie Vogt
  - Department of Chemistry, New York University, New York, NY, 10003, USA
- Li Li
  - Department of Physics and Astronomy, University of California, Irvine, CA, 92697, USA
- Mark E Tuckerman
  - Department of Chemistry, New York University, New York, NY, 10003, USA
  - Courant Institute of Mathematical Science, New York University, New York, NY, 10003, USA
  - NYU-ECNU Center for Computational Chemistry at NYU Shanghai, 3663 Zhongshan Road North, Shanghai, 200062, China
- Kieron Burke
  - Department of Physics and Astronomy, University of California, Irvine, CA, 92697, USA
  - Department of Chemistry, University of California, Irvine, CA, 92697, USA
- Klaus-Robert Müller
  - Machine Learning Group, Technische Universität Berlin, Marchstraße 23, 10587, Berlin, Germany
  - Department of Brain and Cognitive Engineering, Korea University, Anam-dong, Seongbuk-gu, Seoul, 136-713, Republic of Korea
  - Max-Planck-Institut für Informatik, Stuhlsatzenhausweg, 66123, Saarbrücken, Germany
16
Bernetti M, Cavalli A, Mollica L. Protein-ligand (un)binding kinetics as a new paradigm for drug discovery at the crossroad between experiments and modelling. MedChemComm 2017; 8:534-550. [PMID: 30108770 PMCID: PMC6072069 DOI: 10.1039/c6md00581k] [Citation(s) in RCA: 64] [Impact Index Per Article: 8.0] [Received: 10/18/2016] [Accepted: 01/25/2017] [Indexed: 12/14/2022]
Abstract
In the last three decades, protein and nucleic acid structure determination and comprehension of the mechanisms, leading to their physiological and pathological functions, have become a cornerstone of biomedical sciences. A deep understanding of the principles governing the fates of cells and tissue at the molecular level has been gained over the years, offering a solid basis for the rational design of drugs aimed at the pharmacological treatment of numerous diseases. Historically, affinity indicators (i.e. Kd and IC50/EC50) have been assumed to be valid indicators of the in vivo efficacy of a drug. However, recent studies pointed out that the kinetics of the drug-receptor binding process could be as important or even more important than affinity in determining the drug efficacy. This eventually led to a growing interest in the characterisation and prediction of the rate constants of protein-ligand association and dissociation. For instance, a drug with a longer residence time can kinetically select a given receptor over another, even if the affinity for both receptors is comparable, thus increasing its therapeutic index. Therefore, understanding the molecular features underlying binding and unbinding processes is of central interest towards the rational control of drug binding kinetics. In this review, we report the theoretical framework behind protein-ligand association and highlight the latest advances in the experimental and computational approaches exploited to investigate the binding kinetics.
Collapse
Affiliation(s)
- M Bernetti
- Department of Pharmacy and Biotechnology , University of Bologna , via Belmeloro 6 , 40126 Bologna , Italy
- CompuNet , Istituto Italiano di Tecnologia , via Morego 30 , 16163 Genova , Italy .
| | - A Cavalli
- Department of Pharmacy and Biotechnology , University of Bologna , via Belmeloro 6 , 40126 Bologna , Italy
- CompuNet , Istituto Italiano di Tecnologia , via Morego 30 , 16163 Genova , Italy .
| | - L Mollica
- CompuNet , Istituto Italiano di Tecnologia , via Morego 30 , 16163 Genova , Italy .
|
17
|
Wan H, Zhou G, Voelz VA. A Maximum-Caliber Approach to Predicting Perturbed Folding Kinetics Due to Mutations. J Chem Theory Comput 2016; 12:5768-5776. [PMID: 27951664 DOI: 10.1021/acs.jctc.6b00938] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.3] [Indexed: 11/29/2022]
Abstract
We present a maximum-caliber method for inferring transition rates of a Markov state model (MSM) with perturbed equilibrium populations given estimates of state populations and rates for an unperturbed MSM. It is similar in spirit to previous approaches, but given the inclusion of prior information, it is more robust and simple to implement. We examine its performance in simple biased diffusion models of kinetics and then apply the method to predicting changes in folding rates for several highly nontrivial protein folding systems for which non-native interactions play a significant role, including (1) tryptophan variants of the GB1 hairpin, (2) salt-bridge mutations of the Fs peptide helix, and (3) MSMs built from ultralong folding trajectories of FiP35 and GTT variants of the WW domain. In all cases, the method correctly predicts changes in folding rates, suggesting the wide applicability of maximum-caliber approaches to efficiently predict how mutations perturb protein conformational dynamics.
Affiliation(s)
- Hongbin Wan
- Department of Chemistry, Temple University , Philadelphia, Pennsylvania 19122, United States
| | - Guangfeng Zhou
- Department of Chemistry, Temple University , Philadelphia, Pennsylvania 19122, United States
| | - Vincent A Voelz
- Department of Chemistry, Temple University , Philadelphia, Pennsylvania 19122, United States
|
18
|
Schor M, Mey ASJS, MacPhee CE. Analytical methods for structural ensembles and dynamics of intrinsically disordered proteins. Biophys Rev 2016; 8:429-439. [PMID: 28003858 PMCID: PMC5135723 DOI: 10.1007/s12551-016-0234-6] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 08/26/2016] [Accepted: 10/14/2016] [Indexed: 01/02/2023] Open
Abstract
Intrinsically disordered proteins, proteins that do not have a well-defined three-dimensional structure, make up a significant proportion of our proteome and are particularly prevalent in signaling and regulation. Although their importance has been realized for two decades, there is a lack of high-resolution experimental data. Molecular dynamics simulations have been crucial in reaching our current understanding of the dynamical structural ensemble sampled by intrinsically disordered proteins. In this review, we discuss enhanced sampling simulation methods that are particularly suitable to characterize the structural ensemble, along with examples of applications and limitations. The dynamics within the ensemble can be rigorously analyzed using Markov state models. We discuss recent developments that make Markov state modeling a viable approach for studying intrinsically disordered proteins. Finally, we briefly discuss challenges and future directions when applying molecular dynamics simulations to study intrinsically disordered proteins.
Affiliation(s)
- Marieke Schor
- School of Physics and Astronomy, University of Edinburgh, Edinburgh, UK
| | | | - Cait E. MacPhee
- School of Physics and Astronomy, University of Edinburgh, Edinburgh, UK
|
19
|
Lemke O, Keller BG. Density-based cluster algorithms for the identification of core sets. J Chem Phys 2016; 145:164104. [DOI: 10.1063/1.4965440] [Citation(s) in RCA: 48] [Impact Index Per Article: 5.3] [Indexed: 01/12/2023] Open
Affiliation(s)
- Oliver Lemke
- Department of Biology, Chemistry, Pharmacy, Freie Universität Berlin, Takustraße 3, D-14195 Berlin, Germany
| | - Bettina G. Keller
- Department of Biology, Chemistry, Pharmacy, Freie Universität Berlin, Takustraße 3, D-14195 Berlin, Germany
|
20
|
Noskov SY, Rostovtseva TK, Chamberlin AC, Teijido O, Jiang W, Bezrukov SM. Current state of theoretical and experimental studies of the voltage-dependent anion channel (VDAC). Biochim Biophys Acta 2016; 1858:1778-90. [PMID: 26940625 PMCID: PMC4877207 DOI: 10.1016/j.bbamem.2016.02.026] [Citation(s) in RCA: 57] [Impact Index Per Article: 6.3] [Received: 12/19/2015] [Revised: 02/09/2016] [Accepted: 02/10/2016] [Indexed: 01/04/2023]
Abstract
Voltage-dependent anion channel (VDAC), the major channel of the mitochondrial outer membrane, provides a controlled pathway for respiratory metabolites in and out of the mitochondria. In spite of the wealth of experimental data from structural, biochemical, and biophysical investigations, the exact mechanisms governing selective ion and metabolite transport, especially the role of titratable charged residues and interactions with soluble cytosolic proteins, remain hotly debated in the field. Computational advances hold promise to provide a much sought-after solution to many of the scientific disputes around solute and ion transport through VDAC and hence, across the mitochondrial outer membrane. In this review, we examine how Molecular Dynamics, Free Energy, and Brownian Dynamics simulations of the large β-barrel channel, VDAC, have advanced our understanding. We will provide a short overview of non-conventional techniques and also discuss examples of how the modeling excursions into VDAC biophysics prospectively aid experimental efforts. This article is part of a Special Issue entitled: Membrane Proteins edited by J.C. Gumbart and Sergei Noskov.
Affiliation(s)
- Sergei Yu Noskov
- Department of Biological Sciences and Centre for Molecular Simulation, University of Calgary, 2500 University Drive N.W., Calgary, Alberta T2N1N4, Canada.
| | - Tatiana K Rostovtseva
- Section on Molecular Transport, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, MD 20892, USA.
| | | | - Oscar Teijido
- Section on Molecular Transport, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, MD 20892, USA; Department of Medical Epigenetics, Institute of Medical Sciences and Genomic Medicine, EuroEspes Sta. Marta de Babío S/N, 15165 Bergondo, A Coruña, Spain
| | - Wei Jiang
- Leadership Computing Facility, Argonne National Laboratory, 9700S Cass Avenue, Lemont, IL 60439, USA
| | - Sergey M Bezrukov
- Section on Molecular Transport, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, MD 20892, USA.
|
21
|
Li Y, Dong Z. Effect of Clustering Algorithm on Establishing Markov State Model for Molecular Dynamics Simulations. J Chem Inf Model 2016; 56:1205-15. [DOI: 10.1021/acs.jcim.6b00181] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Indexed: 11/28/2022]
Affiliation(s)
- Yan Li
- The Hormel Institute, University of Minnesota, Austin, Minnesota 55912, United States
| | - Zigang Dong
- The Hormel Institute, University of Minnesota, Austin, Minnesota 55912, United States
|
22
|
Liu H, Li M, Fan J, Huo S. Inherent structure versus geometric metric for state space discretization. J Comput Chem 2016; 37:1251-8. [PMID: 26915811 DOI: 10.1002/jcc.24315] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 07/16/2015] [Revised: 11/17/2015] [Accepted: 01/06/2016] [Indexed: 01/13/2023]
Abstract
Inherent structure (IS) and geometry-based clustering methods are commonly used for analyzing molecular dynamics trajectories. ISs are obtained by minimizing the sampled conformations into local minima on the potential/effective energy surface. The conformations that are minimized into the same energy basin belong to one cluster. We investigate the influence of the applications of these two methods of trajectory decomposition on our understanding of the thermodynamics and kinetics of alanine tetrapeptide. We find that at the microcluster level, the IS approach and root-mean-square deviation (RMSD)-based clustering method give totally different results. Depending on the local features of the energy landscape, the conformations with close RMSDs can be minimized into different minima, while the conformations with large RMSDs could be minimized into the same basin. However, the relaxation timescales calculated based on the transition matrices built from the microclusters are similar. The discrepancy at the microcluster level leads to different macroclusters. Although the dynamic models established through both clustering methods are validated as approximately Markovian, the IS approach seems to give a meaningful state space discretization at the macrocluster level in terms of conformational features and kinetics.
Affiliation(s)
- Hanzhong Liu
- Gustaf H. Carlson School of Chemistry and Biochemistry, Clark University, 950 Main Street, Worcester, Massachusetts, 01610
| | - Minghai Li
- Gustaf H. Carlson School of Chemistry and Biochemistry, Clark University, 950 Main Street, Worcester, Massachusetts, 01610
| | - Jue Fan
- Gustaf H. Carlson School of Chemistry and Biochemistry, Clark University, 950 Main Street, Worcester, Massachusetts, 01610
| | - Shuanghong Huo
- Gustaf H. Carlson School of Chemistry and Biochemistry, Clark University, 950 Main Street, Worcester, Massachusetts, 01610
|
23
|
Davis CM, Dyer RB. The Role of Electrostatic Interactions in Folding of β-Proteins. J Am Chem Soc 2016; 138:1456-64. [PMID: 26750867 DOI: 10.1021/jacs.5b13201] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Indexed: 12/21/2022]
Abstract
Atomic-level molecular dynamics simulations are capable of fully folding structurally diverse proteins; however, they are limited in their ability to accurately represent electrostatic interactions. Here we have experimentally tested the role of charged residues on stability and folding kinetics of one of the most widely simulated β-proteins, the WW domain. The folding of wild type Pin1 WW domain, which has two positively charged residues in the first turn, was compared to the fast folding mutant FiP35 Pin1, which introduces a negative charge into the first turn. A combination of FTIR spectroscopy and laser-induced temperature-jump coupled with infrared spectroscopy was used to probe changes in the amide I region. The relaxation dynamics of the peptide backbone, β-sheets and β-turns, and negatively charged aspartic acid side chain of FiP35 were measured independently by probing the corresponding bands assigned in the amide I region. Folding is initiated in the turns and the β-sheets form last. While the global folding mechanism is in good agreement with simulation predictions, we observe changes in the protonation state of aspartic acid during folding that have not been captured by simulation methods. The protonation state of aspartic acid is coupled to protein folding; the apparent pKa of aspartic acid in the folded protein is 6.4. The dynamics of the aspartic acid follow the dynamics of the intermediate phase, supporting assignment of this phase to formation of the first hairpin. These results demonstrate the importance of electrostatic interactions in turn stability and formation of extended β-sheet structures.
Affiliation(s)
- Caitlin M Davis
- Department of Chemistry, Emory University , Atlanta, Georgia 30322, United States
| | - R Brian Dyer
- Department of Chemistry, Emory University , Atlanta, Georgia 30322, United States
|
24
|
Boninsegna L, Gobbo G, Noé F, Clementi C. Investigating Molecular Kinetics by Variationally Optimized Diffusion Maps. J Chem Theory Comput 2015; 11:5947-60. [DOI: 10.1021/acs.jctc.5b00749] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.1] [Indexed: 12/19/2022]
Affiliation(s)
- Lorenzo Boninsegna
- Center for Theoretical Biological Physics and Department of Chemistry, Rice University, 6100 Main Street, Houston, Texas 77005, United States
| | - Gianpaolo Gobbo
- Maxwell Institute for Mathematical Sciences and School of Mathematics, The University of Edinburgh, Peter Guthrie Tait Road, Edinburgh EH9 3FD, United Kingdom
| | - Frank Noé
- Department of Mathematics, Computer Science and Bioinformatics, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany
| | - Cecilia Clementi
- Center for Theoretical Biological Physics and Department of Chemistry, Rice University, 6100 Main Street, Houston, Texas 77005, United States
|
25
|
Blöchliger N, Caflisch A, Vitalis A. Weighted Distance Functions Improve Analysis of High-Dimensional Data: Application to Molecular Dynamics Simulations. J Chem Theory Comput 2015; 11:5481-92. [DOI: 10.1021/acs.jctc.5b00618] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Indexed: 11/29/2022]
Affiliation(s)
- Nicolas Blöchliger
- Department of Biochemistry, University of Zurich, Winterthurerstrasse 190, CH-8057 Zurich, Zurich, Switzerland
| | - Amedeo Caflisch
- Department of Biochemistry, University of Zurich, Winterthurerstrasse 190, CH-8057 Zurich, Zurich, Switzerland
| | - Andreas Vitalis
- Department of Biochemistry, University of Zurich, Winterthurerstrasse 190, CH-8057 Zurich, Zurich, Switzerland
|
26
|
Noé F, Clementi C. Kinetic distance and kinetic maps from molecular dynamics simulation. J Chem Theory Comput 2015; 11:5002-11. [PMID: 26574285 DOI: 10.1021/acs.jctc.5b00553] [Citation(s) in RCA: 140] [Impact Index Per Article: 14.0] [Indexed: 01/05/2023]
Abstract
Characterizing macromolecular kinetics from molecular dynamics (MD) simulations requires a distance metric that can distinguish slowly interconverting states. Here, we build upon diffusion map theory and define a kinetic distance metric for irreducible Markov processes that quantifies how slowly molecular conformations interconvert. The kinetic distance can be computed given a model that approximates the eigenvalues and eigenvectors (reaction coordinates) of the MD Markov operator. Here, we employ the time-lagged independent component analysis (TICA). The TICA components can be scaled to provide a kinetic map in which the Euclidean distance corresponds to the kinetic distance. As a result, the question of how many TICA dimensions should be kept in a dimensionality reduction approach becomes obsolete, and one parameter less needs to be specified in the kinetic model construction. We demonstrate the approach using TICA and Markov state model (MSM) analyses for illustrative models, protein conformation dynamics in bovine pancreatic trypsin inhibitor and protein-inhibitor association in trypsin and benzamidine. We find that the total kinetic variance (TKV) is an excellent indicator of model quality and can be used to rank different input feature sets.
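The scaling idea in this abstract can be sketched numerically. The example below assumes the kinetic distance takes the form D²(x, y) = Σ_{i≥2} λ_i² (ψ_i(x) − ψ_i(y))², with λ_i and ψ_i the eigenvalues and π-normalized right eigenvectors of the dynamics. Where the abstract approximates these with TICA, this sketch substitutes a small reversible transition matrix built from a hypothetical count matrix, so it illustrates the principle rather than the authors' workflow.

```python
import numpy as np


def kinetic_map(T, pi):
    """Scale the nontrivial eigenvectors of a reversible transition matrix by
    their eigenvalues, so Euclidean distance equals kinetic distance."""
    d = np.sqrt(pi)
    # Similarity transform: S is symmetric when T satisfies detailed balance.
    S = (d[:, None] * T) / d[None, :]
    S = 0.5 * (S + S.T)  # remove round-off asymmetry
    evals, evecs = np.linalg.eigh(S)
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]
    psi = evecs / d[:, None]        # right eigenvectors of T, normalized w.r.t. pi
    return psi[:, 1:] * evals[1:]   # drop the stationary eigenvector (lambda = 1)


# Hypothetical symmetric count matrix for a 4-state model; symmetry
# guarantees that the row-normalized matrix is reversible.
C = np.array([[90.0, 10.0, 0.0, 0.0],
              [10.0, 80.0, 5.0, 0.0],
              [0.0, 5.0, 70.0, 20.0],
              [0.0, 0.0, 20.0, 60.0]])
pi = C.sum(axis=1) / C.sum()
T = C / C.sum(axis=1, keepdims=True)

Y = kinetic_map(T, pi)

# Squared kinetic distance between states 0 and 3, computed two ways:
# in the kinetic map, and directly from the transition probabilities.
d2_map = np.sum((Y[0] - Y[3]) ** 2)
d2_direct = np.sum((T[0] - T[3]) ** 2 / pi)
print(d2_map, d2_direct)
```

Because every nontrivial coordinate is weighted by its eigenvalue, fast-decaying components contribute little to the distance, which is why no hard cutoff on the number of retained dimensions is needed in this picture.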
Affiliation(s)
- Frank Noé
- FU Berlin , Department of Mathematics, Computer Science and Bioinformatics, Arnimallee 6, 14195 Berlin, Germany
| | - Cecilia Clementi
- Center for Theoretical Biological Physics, and Department of Chemistry, Rice University , Houston, Texas 77005, United States
|
27
|
Bai Q, Pérez-Sánchez H, Zhang Y, Shao Y, Shi D, Liu H, Yao X. Ligand induced change of β2 adrenergic receptor from active to inactive conformation and its implication for the closed/open state of the water channel: insight from molecular dynamics simulation, free energy calculation and Markov state model analysis. Phys Chem Chem Phys 2015; 16:15874-85. [PMID: 24962153 DOI: 10.1039/c4cp01185f] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Indexed: 01/08/2023]
Abstract
The reported crystal structures of β2 adrenergic receptor (β2AR) reveal that the open and closed states of the water channel are correlated with the inactive and active conformations of β2AR. However, more details about the process by which the water channel states are affected by the active to inactive conformational change of β2AR remain elusive. In this work, molecular dynamics simulations are performed to study the dynamical inactive and active conformational change of β2AR induced by inverse agonist ICI 118,551. Markov state model analysis and free energy calculation are employed to explore the open and closed states of the water channel. The simulation results show that inverse agonist ICI 118,551 can induce water channel opening during the conformational transition of β2AR. Markov state model (MSM) analysis proves that the energy contour can be divided into seven states. States S1, S2 and S5, which represent the active conformation of β2AR, show that the water channel is in the closed state, while states S4 and S6, which correspond to the intermediate state conformation of β2AR, indicate the water channel opens gradually. State S7, which represents the inactive structure of β2AR, corresponds to the fully open state of the water channel. The opening mechanism of the water channel is involved in the ligand-induced conformational change of β2AR. These results can provide useful information for understanding the opening mechanism of the water channel and will be useful for the rational design of potent inverse agonists of β2AR.
Affiliation(s)
- Qifeng Bai
- Department of Chemistry, Lanzhou University, Lanzhou 730000, China.
|
28
|
Schwantes C, Pande VS. Modeling molecular kinetics with tICA and the kernel trick. J Chem Theory Comput 2015; 11:600-8. [PMID: 26528090 PMCID: PMC4610300 DOI: 10.1021/ct5007357] [Citation(s) in RCA: 87] [Impact Index Per Article: 8.7] [Received: 08/12/2014] [Indexed: 11/28/2022]
Abstract
The allure of a molecular dynamics simulation is that, given a sufficiently accurate force field, it can provide an atomic-level view of many interesting phenomena in biology. However, the result of a simulation is a large, high-dimensional time series that is difficult to interpret. Recent work has introduced the time-structure based Independent Components Analysis (tICA) method for analyzing MD, which attempts to find the slowest decorrelating linear functions of the molecular coordinates. This method has been used in conjunction with Markov State Models (MSMs) to provide estimates of the characteristic eigenprocesses contained in a simulation (e.g., protein folding, ligand binding). Here, we extend the tICA method using the kernel trick to arrive at nonlinear solutions. This is a substantial improvement as it allows for kernel-tICA (ktICA) to provide estimates of the characteristic eigenprocesses directly without building an MSM.
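A minimal sketch of the linear tICA step that this abstract builds on may help make "slowest decorrelating linear functions" concrete. It covers only ordinary tICA, not the kernel extension the paper introduces. It assumes a symmetrized time-lagged covariance estimate, and the two-feature trajectory is synthetic: a slow and a fast autoregressive process mixed linearly, with all parameters hypothetical.

```python
import numpy as np


def tica(X, lag):
    """Linear tICA: solve C_tau v = lambda C_0 v for features X of shape
    (n_frames, n_features). Returns eigenvalues (descending) and components."""
    X = X - X.mean(axis=0)
    X0, Xt = X[:-lag], X[lag:]
    n = len(X0)
    C0 = (X0.T @ X0 + Xt.T @ Xt) / (2 * n)
    Ct = (X0.T @ Xt + Xt.T @ X0) / (2 * n)  # symmetrized (assumes reversibility)
    # Whiten with C0, then diagonalize the whitened time-lagged covariance.
    s, U = np.linalg.eigh(C0)
    W = U / np.sqrt(s)                      # satisfies W.T @ C0 @ W = identity
    evals, V = np.linalg.eigh(W.T @ Ct @ W)
    order = np.argsort(evals)[::-1]
    return evals[order], (W @ V)[:, order]


# Synthetic data: one slow and one fast AR(1) process (hypothetical parameters).
rng = np.random.default_rng(0)
n_frames = 5000
slow = np.zeros(n_frames)
fast = np.zeros(n_frames)
for t in range(1, n_frames):
    slow[t] = 0.99 * slow[t - 1] + rng.normal()
    fast[t] = 0.50 * fast[t - 1] + rng.normal()

# Observed features are linear mixtures of the two hidden processes.
X = np.column_stack([slow + fast, slow - 2.0 * fast])

evals, components = tica(X, lag=10)
print(evals)
```

The leading eigenvalue estimates the autocorrelation of the slowest linear combination at the chosen lag; projecting the mean-free data onto `components[:, 0]` gives the corresponding coordinate.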
Affiliation(s)
- Christian R. Schwantes
- Department of Chemistry, Department of Computer Science, Department of Structural Biology, and Program in Biophysics, Stanford University, Stanford, California 94305, United States
| | - Vijay S. Pande
- Department of Chemistry, Department of Computer Science, Department of Structural Biology, and Program in Biophysics, Stanford University, Stanford, California 94305, United States
|
29
|
Sheong FK, Silva DA, Meng L, Zhao Y, Huang X. Automatic state partitioning for multibody systems (APM): an efficient algorithm for constructing Markov state models to elucidate conformational dynamics of multibody systems. J Chem Theory Comput 2014; 11:17-27. [PMID: 26574199 DOI: 10.1021/ct5007168] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.4] [Indexed: 12/16/2022]
Abstract
The conformational dynamics of multibody systems plays crucial roles in many important problems. Markov state models (MSMs) are powerful kinetic network models that can predict long-time-scale dynamics using many short molecular dynamics simulations. Although MSMs have been successfully applied to conformational changes of individual proteins, the analysis of multibody systems is still a challenge because of the complexity of the dynamics that occur on a mixture of drastically different time scales. In this work, we have developed a new algorithm, automatic state partitioning for multibody systems (APM), for constructing MSMs to elucidate the conformational dynamics of multibody systems. The APM algorithm effectively addresses different time scales in the multibody systems by directly incorporating dynamics into geometric clustering when identifying the metastable conformational states. We have applied the APM algorithm to a 2D potential that can mimic a protein-ligand binding system and the aggregation of two hydrophobic particles in water and have shown that it can yield tremendous enhancements in the computational efficiency of MSM construction and the accuracy of the models.
Affiliation(s)
- Fu Kit Sheong
- HKUST Shenzhen Research Institute , Nanshan, Shenzhen 518057, China
| | | | - Luming Meng
- HKUST Shenzhen Research Institute , Nanshan, Shenzhen 518057, China
| | | | - Xuhui Huang
- HKUST Shenzhen Research Institute , Nanshan, Shenzhen 518057, China
|
30
|
Schwantes CR, McGibbon RT, Pande VS. Perspective: Markov models for long-timescale biomolecular dynamics. J Chem Phys 2014; 141:090901. [PMID: 25194354 PMCID: PMC4156582 DOI: 10.1063/1.4895044] [Citation(s) in RCA: 70] [Impact Index Per Article: 6.4] [Received: 07/08/2014] [Accepted: 08/27/2014] [Indexed: 01/24/2023] Open
Abstract
Molecular dynamics simulations have the potential to provide atomic-level detail and insight to important questions in chemical physics that cannot be observed in typical experiments. However, simply generating a long trajectory is insufficient, as researchers must be able to transform the data in a simulation trajectory into specific scientific insights. Although this analysis step has often been taken for granted, it deserves further attention as large-scale simulations become increasingly routine. In this perspective, we discuss the application of Markov models to the analysis of large-scale biomolecular simulations. We draw attention to recent improvements in the construction of these models as well as several important open issues. In addition, we highlight recent theoretical advances that pave the way for a new generation of models of molecular kinetics.
Affiliation(s)
- C R Schwantes
- Department of Chemistry, Stanford University, Stanford, California 94305, USA
| | - R T McGibbon
- Department of Chemistry, Stanford University, Stanford, California 94305, USA
| | - V S Pande
- Department of Chemistry, Stanford University, Stanford, California 94305, USA
|
31
|
High-resolution visualisation of the states and pathways sampled in molecular dynamics simulations. Sci Rep 2014; 4:6264. [PMID: 25179558 PMCID: PMC4151098 DOI: 10.1038/srep06264] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 04/25/2014] [Accepted: 08/15/2014] [Indexed: 11/20/2022] Open
Abstract
We have recently developed a scalable algorithm for ordering the instantaneous observations of a dynamical system evolving continuously in time. Here, we apply the method to long molecular dynamics trajectories. The procedure requires only a pairwise, geometrical distance as input. Suitable annotations of both structural and kinetic nature reveal the free energy basins visited by biomolecules. The profile is supplemented by a trace of the temporal evolution of the system highlighting the sequence of events. We demonstrate that the resultant SAPPHIRE (States And Pathways Projected with HIgh REsolution) plots provide a comprehensive picture of the thermodynamics and kinetics of complex, molecular systems exhibiting dynamics covering a range of time and length scales. Information on pathways connecting states and the level of recurrence are quickly inferred from the visualisation. The considerable advantages of our approach are speed and resolution: the SAPPHIRE plot is scalable to very large data sets and represents every single snapshot. This minimizes the risk of missing states because of overlap or prior coarse-graining of the data.
|
32
|
Larsson P, Pouya I, Lindahl E. From Side Chains Rattling on Picoseconds to Ensemble Simulations of Protein Folding. Isr J Chem 2014. [DOI: 10.1002/ijch.201400020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2022]
|
33
|
Gil VA, Guallar V. pyProCT: Automated Cluster Analysis for Structural Bioinformatics. J Chem Theory Comput 2014; 10:3236-43. [PMID: 26588293 DOI: 10.1021/ct500306s] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 01/14/2023]
Abstract
Cluster analysis is becoming a relevant tool in structural bioinformatics. It allows analyzing large conformational ensembles in order to extract features or diminish redundancy, or just as a first step for other methods. Unfortunately, the success of this analysis strongly depends on the data set traits, the chosen algorithm, and its parameters, which can lead to poor or even erroneous results not easily detected. In order to overcome this problem, we have developed pyProCT, a Python open source cluster analysis toolkit specially designed to be used with ensembles of biomolecule conformations. pyProCT implements an automated protocol to choose the clustering algorithm and parameters that produce the best results for a particular data set. It offers different levels of customization according to users' expertise. Moreover, pyProCT has been designed as a collection of interchangeable libraries, making it easier to reuse it as part of other programs.
Affiliation(s)
- Víctor A Gil
- Joint BSC-CRG-IRB Research Program in Computational Biology, Barcelona Supercomputing Center, Jordi Girona 29, 08034 Barcelona, Spain
| | - Víctor Guallar
- Joint BSC-CRG-IRB Research Program in Computational Biology, Barcelona Supercomputing Center, Jordi Girona 29, 08034 Barcelona, Spain.,Institució Catalana de Recerca i Estudis Avançats (ICREA), Passeig Lluís Companys 23, E-08010 Barcelona, Spain
|
34
|
Malmstrom RD, Lee CT, Van Wart A, Amaro RE. On the Application of Molecular-Dynamics Based Markov State Models to Functional Proteins. J Chem Theory Comput 2014; 10:2648-2657. [PMID: 25473382 PMCID: PMC4248791 DOI: 10.1021/ct5002363] [Citation(s) in RCA: 73] [Impact Index Per Article: 6.6] [Indexed: 01/10/2023]
Abstract
Owing to recent developments in computational algorithms and architectures, it is now computationally tractable to explore biologically relevant, equilibrium dynamics of realistically sized functional proteins using all-atom molecular dynamics simulations. Molecular dynamics simulations coupled with Markov state models is a nascent but rapidly growing technology that is enabling robust exploration of equilibrium dynamics. The objective of this work is to explore the challenges of coupling molecular dynamics simulations and Markov state models in the study of functional proteins. Using recent studies as a framework, we explore progress in sampling, model building, model selection, and coarse-grained analysis of models. Our goal is to highlight some of the current challenges in applying Markov state models to realistically sized proteins and spur discussion on advances in the field.
Affiliation(s)
- Robert D Malmstrom
- Department of Chemistry and Biochemistry, University of California, San Diego ; National Biomedical Computational Resource
| | - Christopher T Lee
- Department of Chemistry and Biochemistry, University of California, San Diego
| | - Adam Van Wart
- Department of Chemistry and Biochemistry, University of California, San Diego
| | - Rommie E Amaro
- Department of Chemistry and Biochemistry, University of California, San Diego ; National Biomedical Computational Resource
|
35
|
Chodera JD, Noé F. Markov state models of biomolecular conformational dynamics. Curr Opin Struct Biol 2014; 25:135-44. [PMID: 24836551 DOI: 10.1016/j.sbi.2014.04.002] [Citation(s) in RCA: 524] [Impact Index Per Article: 47.6] [Received: 03/11/2014] [Revised: 04/08/2014] [Accepted: 04/12/2014] [Indexed: 10/25/2022]
Abstract
It has recently become practical to construct Markov state models (MSMs) that reproduce the long-time statistical conformational dynamics of biomolecules using data from molecular dynamics simulations. MSMs can predict both stationary and kinetic quantities on long timescales (e.g. milliseconds) using a set of atomistic molecular dynamics simulations that are individually much shorter, thus addressing the well-known sampling problem in molecular dynamics simulation. In addition to providing predictive quantitative models, MSMs greatly facilitate both the extraction of insight into biomolecular mechanism (such as folding and functional dynamics) and quantitative comparison with single-molecule and ensemble kinetics experiments. A variety of methodological advances and software packages now bring the construction of these models closer to routine practice. Here, we review recent progress in this field, considering theoretical and methodological advances, new software tools, and recent applications of these approaches in several domains of biochemistry and biophysics, commenting on remaining challenges.
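How an MSM turns short-time transition counts into a long-time kinetic quantity can be sketched for the simplest possible case. The example below assumes a two-state model estimated by plain row normalization of sliding-window counts, and uses the standard relation t = −τ / ln λ between an eigenvalue λ of the lag-τ transition matrix and its implied timescale. The discrete trajectory is a hypothetical toy sequence, and real workflows would use many more states and a reversible estimator.

```python
import math


def count_matrix(dtraj, n_states, lag):
    """Count transitions i -> j separated by `lag` steps (sliding window)."""
    counts = [[0] * n_states for _ in range(n_states)]
    for i, j in zip(dtraj[:-lag], dtraj[lag:]):
        counts[i][j] += 1
    return counts


def transition_matrix(counts):
    """Row-normalize counts (estimate without a detailed-balance constraint)."""
    return [[c / sum(row) for c in row] for row in counts]


def two_state_implied_timescale(T, lag):
    """A 2x2 stochastic matrix has eigenvalues 1 and trace(T) - 1.
    Valid only when the second eigenvalue lies strictly between 0 and 1."""
    lam2 = T[0][0] + T[1][1] - 1.0
    return -lag / math.log(lam2)


# Hypothetical discrete trajectory: 0 = unfolded, 1 = folded.
dtraj = [0] * 40 + [1] * 60 + [0] * 30 + [1] * 70
lag = 1

counts = count_matrix(dtraj, 2, lag)
T = transition_matrix(counts)
t2 = two_state_implied_timescale(T, lag)
print(counts, T, t2)
```

The implied timescale is expressed in units of trajectory steps. In practice it is recomputed over a range of lag times, and a plateau is taken as evidence that the model is approximately Markovian at that lag.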
Affiliation(s)
- John D Chodera
- Computational Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA.
| | - Frank Noé
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany.
|
36
|
Klippenstein SJ, Pande VS, Truhlar DG. Chemical Kinetics and Mechanisms of Complex Systems: A Perspective on Recent Theoretical Advances. J Am Chem Soc 2014; 136:528-46. [DOI: 10.1021/ja408723a] [Citation(s) in RCA: 187] [Impact Index Per Article: 17.0] [Indexed: 12/11/2022]
Affiliation(s)
- Stephen J. Klippenstein
- Chemical Sciences and Engineering Division, Argonne National Laboratory, Argonne, Illinois 60439, United States
| | - Vijay S. Pande
- Department of Chemistry and Structural Biology, Stanford University, Stanford, California 94305, United States
| | - Donald G. Truhlar
- Department of Chemistry, Chemical Theory Center, and Supercomputing Institute, University of Minnesota, Minneapolis, Minnesota 55455-0431, United States
|