1
Peña Ccoa WJ, Mukadum F, Ramon A, Stirnemann G, Hocky GM. A direct computational assessment of vinculin-actin unbinding kinetics reveals catch-bonding behavior. Proc Natl Acad Sci U S A 2025; 122:e2425982122. [PMID: 40397673 DOI: 10.1073/pnas.2425982122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2024] [Accepted: 04/16/2025] [Indexed: 05/23/2025] Open
Abstract
Vinculin forms a catch bond with the cytoskeletal polymer actin, displaying an increased bond lifetime upon force application. Notably, this behavior depends on the direction of the applied force, which has significant implications for cellular mechanotransduction. In this work, we present a comprehensive molecular dynamics simulation study, employing enhanced sampling techniques to investigate the thermodynamic, kinetic, and mechanistic aspects of this phenomenon at physiologically relevant forces. We dissect a catch bond mechanism in which force shifts vinculin between a weakly bound and a strongly bound state. Our results demonstrate that models for these states have unbinding times consistent with those from single-molecule studies, and suggest that both have some intrinsic catch-bonding behavior. We provide atomistic insight into this behavior, and show how a directional pulling force can promote the strong or weak state. Crucially, our strategy can be extended to measure the difficult-to-capture effects of small mechanical forces on biomolecular systems in general, and those involved in mechanotransduction more specifically.
Affiliation(s)
- Fatemah Mukadum
  - Department of Chemistry, New York University, New York, NY 10003
- Aubin Ramon
  - Centre for Misfolding Diseases, Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, United Kingdom
  - Chimie Physique et Chimie pour le Vivant Laboratory, Department of Chemistry, École Normale Supérieure, Paris Sciences et Lettres (PSL) University, Sorbonne University, CNRS, Paris 75005, France
- Guillaume Stirnemann
  - Chimie Physique et Chimie pour le Vivant Laboratory, Department of Chemistry, École Normale Supérieure, Paris Sciences et Lettres (PSL) University, Sorbonne University, CNRS, Paris 75005, France
- Glen M Hocky
  - Department of Chemistry, New York University, New York, NY 10003
  - Simons Center For Computational Physical Chemistry, New York University, New York, NY 10003

2
Yamamoto Y. Algorithm for Efficient Superposition and Clustering of Molecular Assemblies Using the Branch-and-Bound Method. J Chem Inf Model 2025; 65:4512-4530. [PMID: 40276894 DOI: 10.1021/acs.jcim.4c02217] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/26/2025]
Abstract
The root-mean-square deviation (RMSD) is one of the most common metrics for comparing the similarity of three-dimensional chemical structures. The chemical structure similarity plays an important role in data chemistry because it is closely related to chemical reactivity, physical properties, and bioactivity. Despite the wide applicability of the RMSD, the simultaneous determination of atom mapping and spatial superposition of RMSD remains a challenging problem to solve in polynomial time. We introduce an algorithm called mobbRMSD, which is formulated in molecular-oriented coordinates and uses the branch-and-bound method to obtain an exact solution for the RMSD. mobbRMSD can efficiently handle a wide range of chemical systems, such as molecular liquids, solute solvations, and self-assembly of large molecules, using chemical knowledge such as atom types, chemical bonding, and chirality. In benchmarks involving small molecular aggregates, mobbRMSD nearly doubles the limiting system size reachable by existing exact solution methods. Furthermore, mobbRMSD demonstrated the ability to analyze the structural similarity of large molecular micelles, which has been difficult with previous methods. We also propose a mobbRMSD-based structural clustering method designed for molecular dynamics trajectories, which reduces the average computational cost of the branch-and-bound search asymptotically to polynomial time as the number of data points increases. Our algorithm is freely available at https://github.com/yymmt742/mobbrmsd.
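For orientation, the joint problem described in this abstract can be written in one common form. The notation below is an assumption made here for illustration and is not taken from the paper: X = {x_i} and Y = {y_i} are two centered sets of N atomic positions, R is a proper rotation, and P ranges over a set Π of atom permutations allowed by chemical equivalence (for example, same atom type and bonding pattern).

```latex
\mathrm{RMSD}(X, Y) \;=\; \min_{R \in SO(3)} \; \min_{P \in \Pi}
  \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left\lVert x_i - R\, y_{P(i)} \right\rVert^{2}}
```

For a fixed permutation P, the optimal rotation has a well-known closed-form solution (the Kabsch algorithm). The difficulty lies in the outer minimization over P, which is combinatorial; this is the part an exact search such as branch-and-bound has to handle.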
Affiliation(s)
- Yuki Yamamoto
  - Department of Chemistry, Graduate School of Science, Kyoto University, Kitashirakawa Oiwake-cho, Sakyo-ku, Kyoto 606-8502, Japan

3
Zakir M, LeVatte MA, Wishart DS. RT-Pred: A web server for accurate, customized liquid chromatography retention time prediction of chemicals. J Chromatogr A 2025; 1747:465816. [PMID: 40023050 DOI: 10.1016/j.chroma.2025.465816] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2024] [Revised: 02/21/2025] [Accepted: 02/23/2025] [Indexed: 03/04/2025]
Abstract
High-performance liquid chromatography (HPLC) together with mass spectrometry (MS) is routinely used to separate, identify and quantify chemicals. HPLC data also provides retention time (RT) which can be aligned with structural data. Recent developments in machine learning (ML) have improved our ability to predict RTs from known or postulated chemical structures, allowing RT data to be used more effectively in LC-MS-based compound identification. However, RT data is highly specific to each chromatographic method (CM) and hundreds of different CMs with interdependent parameters are used in separations. This has limited the application of ML-based RT predictions in compound identification. Here we introduce an easy-to-use RT prediction webserver (called RT-Pred) that predicts RTs for molecules across most chromatographic setups. RT-Pred not only supports its own in-house CM-specific RT predictors, it allows users to easily train a custom RT-Pred model using their own RT data on their own CM and to predict RTs with that custom model. RT-Pred also supports RT and compound searches against its own database of millions of predicted RTs spanning >40 different CMs. RT-Pred is also uniquely capable of accurately identifying compounds that will elute in the void volume or be retained on the column. Including this void/retained/eluted classifier significantly improves RT-Pred's performance. Tests indicate that RT-Pred had an average coefficient of determination (R²) of 0.95 over 20 different CMs. Comparisons of RT-Pred against other RT predictors showed that RT-Pred achieved lower mean absolute errors and higher R² scores than any other published RT predictor. RT-Pred is freely available at https://rtpred.ca.
Affiliation(s)
- Mahi Zakir
  - Department of Computing Science, University of Alberta, Edmonton, AB T6G 2E8, Canada
- Marcia A LeVatte
  - Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
- David S Wishart
  - Department of Computing Science, University of Alberta, Edmonton, AB T6G 2E8, Canada
  - Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
  - Department of Laboratory Medicine and Pathology, University of Alberta, Edmonton, AB T6G 2B7, Canada
  - Faculty of Pharmacy and Pharmaceutical Sciences, University of Alberta, Edmonton, AB T6G 2H7, Canada

4
Sasmal S, McCullagh M, Hocky GM. Improved data-driven collective variables for biased sampling through iteration on biased data. bioRxiv: the preprint server for biology 2025:2025.03.25.644418. [PMID: 40196594 PMCID: PMC11974779 DOI: 10.1101/2025.03.25.644418] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/09/2025]
Abstract
Our ability to efficiently sample conformational transitions between two known states of a biomolecule using collective variable (CV) based sampling depends strongly on the choice of the CV. We previously reported a data-driven approach to clustering biomolecular configurations with a probabilistic clustering model termed ShapeGMM. ShapeGMM is a Gaussian Mixture Model in Cartesian coordinates, with means and covariances in each cluster representing the harmonic approximation to the conformational ensemble around a metastable state. We subsequently showed that Linear Discriminant Analysis on positions (posLDA) is a good reaction coordinate to characterize the transition between two of these states, and moreover can be biased to produce transitions between the states using Metadynamics-like approaches. However, the quality of these LDA coordinates depends on the amount of data used to characterize the states, and here we demonstrate the ability to systematically improve them using enhanced sampling data. Specifically, we demonstrate that improved CVs for sampling can be generated by iteratively performing biased sampling along a posLDA coordinate and then generating a new ShapeGMM model from biased data in the previous iteration. The new coordinates derived from our iterative approach show a substantial improvement in being able to induce transitions between metastable states, and to converge a free energy surface.

5
Chen L, Santos JBW, Gaza J, Perez A, Miranda-Quintana RA. Hierarchical Extended Linkage Method (HELM)'s Deep Dive into Hybrid Clustering Strategies. bioRxiv: the preprint server for biology 2025:2025.03.05.641742. [PMID: 40161705 PMCID: PMC11952300 DOI: 10.1101/2025.03.05.641742] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/02/2025]
Abstract
Clustering remains a key tool in the analysis of molecular dynamics (MD) simulations, from the preparation of kinetic models to the study of mechanistic pathways and structural determination. It is no surprise then that multiple algorithms are currently used in the MD community, with k-means and hierarchical approaches being arguably the two most popular approaches. The former is very attractive from a purely computational point of view, demanding minimal memory and time resources, but at the price of being able to partition the data in very restrictive ways. Hierarchical strategies, on the other hand, can generate arbitrary partitions, but with steep memory and time requirements due to their need to build a pairwise distance matrix for all the considered conformations/frames. Here we propose a new hybrid paradigm, the Hierarchical Extended Linkage Method (HELM), that retains the efficiency of k-means while incorporating the flexibility of hierarchical methods. The key ingredient is the use of n-ary difference functions as a way to stabilize the k-means results and efficiently build the hierarchy of subsets. We showcase the applicability of this strategy over protein-DNA and protein folding studies, including the complete analysis of simulations with over 1.5 million frames. HELM is freely available in our MDANCE clustering package.
Affiliation(s)
- Lexin Chen
  - Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, Florida, 32611, USA
- Jokent Gaza
  - Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, Florida, 32611, USA
- Alberto Perez
  - Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, Florida, 32611, USA

6
Liu B, Boysen JG, Unarta IC, Du X, Li Y, Huang X. Exploring transition states of protein conformational changes via out-of-distribution detection in the hyperspherical latent space. Nat Commun 2025; 16:349. [PMID: 39753544 PMCID: PMC11699157 DOI: 10.1038/s41467-024-55228-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2024] [Accepted: 12/05/2024] [Indexed: 01/06/2025] Open
Abstract
Identifying transitional states is crucial for understanding protein conformational changes that underlie numerous biological processes. Markov state models (MSMs), built from Molecular Dynamics (MD) simulations, capture these dynamics through transitions among metastable conformational states, and have demonstrated success in studying protein conformational changes. However, MSMs face challenges in identifying transition states, as they partition MD conformations into discrete metastable states (or free energy minima), lacking description of transition states located at the free energy barriers. Here, we introduce Transition State identification via Dispersion and vAriational principle Regularized neural networks (TS-DAR), a deep learning framework inspired by out-of-distribution (OOD) detection in trustworthy artificial intelligence (AI). TS-DAR offers an end-to-end pipeline that can simultaneously detect all transition states between multiple free minima from MD simulations using the regularized hyperspherical embeddings in latent space. The key insight of TS-DAR lies in treating transition state structures as OOD data, recognizing that they are sparsely populated and exhibit a distributional shift from metastable states. We demonstrate the power of TS-DAR by applying it to a 2D potential, alanine dipeptide, and the translocation of a DNA motor protein on DNA, where it outperforms previous methods in identifying transition states.
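As background for the MSM limitation mentioned in this abstract, the sketch below shows the simplest way a transition matrix is often estimated from a trajectory that has already been discretized into states: count transitions at a lag time and normalize each row. It is a generic illustration, not TS-DAR and not the authors' pipeline; the function name, the toy trajectory, and the choice of a non-reversible estimator are assumptions made here.

```python
from collections import Counter


def transition_matrix(dtraj, n_states, lag=1):
    """Row-normalized transition counts at the given lag (non-reversible estimate)."""
    # Count every observed (state at t, state at t + lag) pair.
    counts = Counter(zip(dtraj[:-lag], dtraj[lag:]))
    matrix = []
    for i in range(n_states):
        row_total = sum(counts[(i, j)] for j in range(n_states))
        if row_total == 0:
            # State never left within the data: leave the row empty.
            matrix.append([0.0] * n_states)
        else:
            matrix.append([counts[(i, j)] / row_total for j in range(n_states)])
    return matrix


# Toy discrete trajectory hopping between two metastable states.
toy_dtraj = [0, 0, 1, 1, 0, 0, 1, 1]
T = transition_matrix(toy_dtraj, n_states=2)
```

Every frame is assigned to exactly one state in this picture, so configurations near a barrier top are absorbed into a neighboring state rather than described separately, which is the gap the abstract points to.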
Affiliation(s)
- Bojun Liu
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, WI, 53706, USA
  - Data Science Institute, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Jordan G Boysen
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Ilona Christy Unarta
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, WI, 53706, USA
  - Data Science Institute, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Xuefeng Du
  - Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Yixuan Li
  - Data Science Institute, University of Wisconsin-Madison, Madison, WI, 53706, USA
  - Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Xuhui Huang
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, WI, 53706, USA
  - Data Science Institute, University of Wisconsin-Madison, Madison, WI, 53706, USA

7
Mondal K, Klauda JB. Physically interpretable performance metrics for clustering. J Chem Phys 2024; 161:244106. [PMID: 39723706 DOI: 10.1063/5.0241122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2024] [Accepted: 11/21/2024] [Indexed: 12/28/2024] Open
Abstract
Clustering is a type of machine learning technique, which is used to group huge amounts of data based on their similarity into separate groups or clusters. Clustering is a very important task that is nowadays used to analyze the huge and diverse amount of data coming out of molecular dynamics (MD) simulations. Typically, the data from the MD simulations in terms of their various frames in the trajectory are clustered into different groups and a representative element from each group is studied separately. Now, a very important question coming in this process is: what is the quality of the clusters that are obtained? There are several performance metrics that are available in the literature such as the silhouette index and the Davies-Bouldin Index that are often used to analyze the quality of clustering. However, most of these metrics focus on the overlap or the similarity of the clusters in the reduced dimension that is used for clustering and do not focus on the physically important properties or the parameters of the system. To address this issue, we have developed two physically interpretable scoring metrics that focus on the physical parameters of the system that we are analyzing. We have used and tested our algorithm on four different systems: (1) Ising model, (2) peptide folding and unfolding of WT HP35, (3) a protein-ligand trajectory of an enzyme and substrate, and (4) a protein-ligand dissociated trajectory. We show that the scoring metrics provide us clusters that match with our physical intuition about the systems.
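For readers unfamiliar with the silhouette index named in this abstract, a minimal pure-Python sketch follows. It computes the usual mean silhouette coefficient with Euclidean distances; it is not one of the physically interpretable metrics proposed in the paper, and the function name, toy points, and handling of singleton clusters are assumptions made here for illustration.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")

    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Two well-separated pairs of points, labeled sensibly and then badly.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_score(toy_points, [0, 0, 1, 1])
bad = silhouette_score(toy_points, [0, 1, 0, 1])
```

The score depends only on distances in whatever space was used for clustering, which is the limitation the abstract raises: a high value says the groups are geometrically separated, not that they differ in a physically meaningful property.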
Affiliation(s)
- Kinjal Mondal
  - Institute for Physical Science and Technology, Biophysics Program, University of Maryland, College Park, Maryland 20742, USA
- Jeffery B Klauda
  - Institute for Physical Science and Technology, Biophysics Program, University of Maryland, College Park, Maryland 20742, USA
  - Department of Chemical and Biomolecular Engineering, University of Maryland, College Park, Maryland 20742, USA

8
França VLB, Amaral JL, do Ó Pessoa C, Carvalho HF, Freire VN. Shedding light on cancer immunology at the molecular level: A quantum biochemistry study of representative PD-1/PD-L1 conformations. Biochem Biophys Res Commun 2024; 735:150832. [PMID: 39423575 DOI: 10.1016/j.bbrc.2024.150832] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2024] [Revised: 09/06/2024] [Accepted: 10/12/2024] [Indexed: 10/21/2024]
Abstract
BACKGROUND: Programmed death 1 (PD-1) binding to PD-L1 is a potent mechanism used by immunogenic tumors to evade the immune system, and the immune checkpoint PD-1/PD-L1 has emerged as a promising target in the search for new drugs to improve cancer treatment. The crystallographic structure of human PD-1/human PD-L1 shed light on the molecular characterization of this system and allowed computational studies to be carried out to characterize structural behaviors. METHODS: This study demonstrated the importance of analyzing the flexibility of protein systems through molecular dynamics simulations (MDS) and its impacts on the interaction energy obtained through quantum biochemistry. RESULTS: The computational results obtained provide a description of the flexibility and energetic profile of the PD-1/PD-L1 contact surface using representative conformations from MDS. Variations of up to 50% in the total interaction energy values were detected depending on the scrutinized conformation, which can be mainly attributed to the flexibility of the CC' loop, FG loop and ASP85-GLN91 of PD-1 and the MET58-LYS62 segment of PD-L1. Quantum biochemistry revealed three hot spots in PD-L1: ARG113L-ARG125L > ILE54L-VAL76L > ALA18L-ASP26L; and two energetic hot spots in PD-1: ALA125-ARG139 > VAL63-GLN88. Nonetheless, VAL63-GLN88 and GLY124-ARG139 exhibit significant variation in interaction energy between different conformations, while ARG113L-ARG125L is the only hot spot with high energetic fluctuation on the PD-L1 surface. CONCLUSION: This is the first application of MDS coupled to dimensionality reduction and density functional theory (DFT) demonstrating new structural and energetic features that might be useful in discovering/designing more potent PD-1/PD-L1 inhibitors.
Affiliation(s)
- Victor L B França
  - Department of Physiology and Pharmacology, Federal University of Ceará, 60430-270, Fortaleza, Ceará, Brazil
  - Department of Physics, Federal University of Ceará, Fortaleza, 60440-900, Brazil
- Jackson L Amaral
  - Department of Biological Sciences, Federal University of Piauí, Bom Jesus, 64900-000, Brazil
- Cláudia do Ó Pessoa
  - Department of Physiology and Pharmacology, Federal University of Ceará, Fortaleza, 60430-275, Brazil
- Hernandes F Carvalho
  - Department of Structural and Functional Biology, Institute of Biology, State University of Campinas, 13083-864, Campinas, São Paulo, Brazil
- Valder N Freire
  - Department of Physics, Federal University of Ceará, Fortaleza, 60440-900, Brazil

9
França VLB, Bezerra EM, da Costa RF, Carvalho HF, Freire VN, Matos G. Alzheimer's Disease Immunotherapy and Mimetic Peptide Design for Drug Development: Mutation Screening, Molecular Dynamics, and a Quantum Biochemistry Approach Focusing on Aducanumab::Aβ2-7 Binding Affinity. ACS Chem Neurosci 2024; 15:3543-3562. [PMID: 39302203 PMCID: PMC11450751 DOI: 10.1021/acschemneuro.4c00453] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2024] [Revised: 09/06/2024] [Accepted: 09/09/2024] [Indexed: 09/22/2024] Open
Abstract
Seven treatments are approved for Alzheimer's disease, but five of them only relieve symptoms and do not alter the course of the disease. Aducanumab (Adu) and lecanemab are novel disease-modifying antiamyloid-β (Aβ) human monoclonal antibodies that specifically target the pathophysiology of Alzheimer's disease (AD) and were recently approved for its treatment. However, their administration is associated with serious side effects, and their use is limited to early stages of the disease. Therefore, drug discovery remains of great importance in AD research. To gain new insights into the development of novel drugs for Alzheimer's disease, a combination of techniques was employed, including mutation screening, molecular dynamics, and quantum biochemistry. These were used to outline the interfacial interactions of the Aducanumab::Aβ2-7 complex. Our analysis identified critical stabilizing contacts, revealing up to 40% variation in the affinity of the Adu chains for Aβ2-7 depending on the conformation outlined. Remarkably, two complementarity determining regions (CDRs) of the Adu heavy chain (HCDR3 and HCDR2) and one CDR of the Adu light chain (LCDR3) accounted for approximately 77% of the affinity of Adu for Aβ2-7, confirming their critical role in epitope recognition. A single mutation, originally reported to have the potential to increase the affinity of Adu for Aβ2-7, was shown to decrease its structural stability without increasing the overall binding affinity. Mimetic peptides that have the potential to inhibit Aβ aggregation were designed by using computational outcomes. Our results support the use of these peptides as promising drugs with great potential as inhibitors of Aβ aggregation.
Affiliation(s)
- Victor L. B. França
  - Department of Physiology and Pharmacology, Federal University of Ceará, 60430-270 Fortaleza, Ceará, Brazil
- Eveline M. Bezerra
  - Department of Sciences, Mathematics and Statistics, Federal Rural University of Semi-Arid (UFERSA), 59625-900 Mossoró, RN, Brazil
- Roner F. da Costa
  - Department of Sciences, Mathematics and Statistics, Federal Rural University of Semi-Arid (UFERSA), 59625-900 Mossoró, RN, Brazil
- Hernandes F. Carvalho
  - Department of Structural and Functional Biology, Institute of Biology, State University of Campinas, 13083-864 Campinas, São Paulo, Brazil
- Valder N. Freire
  - Department of Physics, Federal University of Ceará, 60430-270 Fortaleza, Ceará, Brazil
- Geanne Matos
  - Department of Physiology and Pharmacology, Federal University of Ceará, 60430-270 Fortaleza, Ceará, Brazil

10
Akgüller Ö, Balcı MA, Cioca G. Clustering Molecules at a Large Scale: Integrating Spectral Geometry with Deep Learning. Molecules 2024; 29:3902. [PMID: 39202980 PMCID: PMC11357287 DOI: 10.3390/molecules29163902] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2024] [Revised: 08/14/2024] [Accepted: 08/14/2024] [Indexed: 09/03/2024] Open
Abstract
This study conducts an in-depth analysis of clustering small molecules using spectral geometry and deep learning techniques. We applied a spectral geometric approach to convert molecular structures into triangulated meshes and used the Laplace-Beltrami operator to derive significant geometric features. By examining the eigenvectors of these operators, we captured the intrinsic geometric properties of the molecules, aiding their classification and clustering. The research utilized four deep learning methods: Deep Belief Network, Convolutional Autoencoder, Variational Autoencoder, and Adversarial Autoencoder, each paired with k-means clustering at different cluster sizes. Clustering quality was evaluated using the Calinski-Harabasz and Davies-Bouldin indices, Silhouette Score, and standard deviation. Nonparametric tests were used to assess the impact of topological descriptors on clustering outcomes. Our results show that the DBN + k-means combination is the most effective, particularly at lower cluster counts, demonstrating significant sensitivity to structural variations. This study highlights the potential of integrating spectral geometry with deep learning for precise and efficient molecular clustering.
Affiliation(s)
- Ömer Akgüller
  - Faculty of Science, Department of Mathematics, Mugla Sitki Kocman University, Muğla 48000, Turkey
- Mehmet Ali Balcı
  - Faculty of Science, Department of Mathematics, Mugla Sitki Kocman University, Muğla 48000, Turkey
- Gabriela Cioca
  - Faculty of Medicine, Preclinical Department, Lucian Blaga University of Sibiu, 550024 Sibiu, Romania

11
Mukadum F, Ccoa WJP, Hocky GM. Molecular simulation approaches to probing the effects of mechanical forces in the actin cytoskeleton. Cytoskeleton (Hoboken) 2024; 81:318-327. [PMID: 38334204 PMCID: PMC11310368 DOI: 10.1002/cm.21837] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/21/2023] [Revised: 01/24/2024] [Accepted: 01/25/2024] [Indexed: 02/10/2024]
Abstract
In this article we give our perspective on the successes and promise of various molecular and coarse-grained simulation approaches to probing the effect of mechanical forces in the actin cytoskeleton.
Affiliation(s)
- Fatemah Mukadum
  - Department of Chemistry, New York University, New York, NY 10003, USA
- Glen M. Hocky
  - Department of Chemistry, New York University, New York, NY 10003, USA
  - Simons Center for Computational Physical Chemistry, New York, NY 10003, USA

12
Mazzaferro N, Sasmal S, Cossio P, Hocky GM. Good Rates From Bad Coordinates: The Exponential Average Time-dependent Rate Approach. J Chem Theory Comput 2024; 20:5901-5912. [PMID: 38954555 PMCID: PMC11270837 DOI: 10.1021/acs.jctc.4c00425] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2024] [Revised: 06/11/2024] [Accepted: 06/12/2024] [Indexed: 07/04/2024]
Abstract
Our ability to calculate rate constants of biochemical processes using molecular dynamics simulations is severely limited by the fact that the time scales for reactions, or changes in conformational state, scale exponentially with the relevant free-energy barrier heights. In this work, we improve upon a recently proposed rate estimator that allows us to predict transition times with molecular dynamics simulations biased to rapidly explore one or several collective variables (CVs). This approach relies on the idea that not all bias goes into promoting transitions, and along with the rate, it estimates a concomitant scale factor for the bias termed the "CV biasing efficiency" γ. First, we demonstrate mathematically that our new formulation allows us to derive the commonly used Infrequent Metadynamics (iMetaD) estimator when using a perfect CV, where γ = 1. After testing it on a model potential, we then study the unfolding behavior of a previously well characterized coarse-grained protein, which is sufficiently complex that we can choose many different CVs to bias, but which is sufficiently simple that we are able to compute the unbiased rate directly. For this system, we demonstrate that predictions from our new Exponential Average Time-Dependent Rate (EATR) estimator converge to the true rate constant more rapidly as a function of bias deposition time than does the previous iMetaD approach, even for bias deposition times that are short. We also show that the γ parameter can serve as a good metric for assessing the quality of the biasing coordinate. We demonstrate that these results hold when applying the methods to an atomistic protein folding example. Finally, we demonstrate that our approach works when combining multiple less-than-optimal bias coordinates, and adapt our method to the related "OPES flooding" approach. Overall, our time-dependent rate approach offers a powerful framework for predicting rate constants from biased simulations.
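The Infrequent Metadynamics (iMetaD) estimator that this abstract uses as its baseline is usually described as rescaling each biased simulation step by the exponential of the instantaneous bias. The sketch below illustrates only that rescaling; it does not reproduce the EATR estimator or its γ parameter. The function names, the kJ/mol units, the evenly spaced time step, and the 300 K default are assumptions made here for illustration.

```python
import math

KB_KJ_PER_MOL_K = 0.0083144626  # Boltzmann constant in kJ/(mol K)


def rescaled_time(bias_kj_mol, dt, temperature=300.0):
    """Sum of dt * exp(beta * V) over the bias values sampled up to the transition."""
    beta = 1.0 / (KB_KJ_PER_MOL_K * temperature)
    return sum(dt * math.exp(beta * v) for v in bias_kj_mol)


def acceleration_factor(bias_kj_mol, dt, temperature=300.0):
    """Ratio of rescaled (unbiased-equivalent) time to biased simulation time."""
    biased_time = dt * len(bias_kj_mol)
    return rescaled_time(bias_kj_mol, dt, temperature) / biased_time


# Hypothetical check: a constant bias of kT * ln(10) accelerates time tenfold.
kT = KB_KJ_PER_MOL_K * 300.0
constant_bias = [kT * math.log(10.0)] * 1000
alpha = acceleration_factor(constant_bias, dt=0.002)
```

In practice the rescaled escape times from many independent biased runs are then compared against an exponential distribution to extract a rate. The abstract's point is that this rescaling assumes all of the bias promotes the transition, which holds only for a perfect CV.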
Affiliation(s)
- Nicodemo Mazzaferro
  - Department of Chemistry, New York University, New York, New York 10003, United States
- Subarna Sasmal
  - Department of Chemistry, New York University, New York, New York 10003, United States
- Pilar Cossio
  - Center for Computational Mathematics, Flatiron Institute, New York, New York 10010, United States
  - Center for Computational Biology, Flatiron Institute, New York, New York 10010, United States
- Glen M. Hocky
  - Department of Chemistry, New York University, New York, New York 10003, United States
  - Simons Center for Computational Physical Chemistry, New York University, New York, New York 10003, United States

13
Roy P, Walter Z, Berish L, Ramage H, McCullagh M. Motif-VI loop acts as a nucleotide valve in the West Nile Virus NS3 Helicase. Nucleic Acids Res 2024; 52:7447-7464. [PMID: 38884215 PMCID: PMC11260461 DOI: 10.1093/nar/gkae500] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2023] [Revised: 05/11/2024] [Accepted: 06/04/2024] [Indexed: 06/18/2024] Open
Abstract
The Orthoflavivirus NS3 helicase (NS3h) is crucial in virus replication, representing a potential drug target for pathogenesis. NS3h utilizes nucleotide triphosphate (ATP) for hydrolysis energy to translocate on single-stranded nucleic acids, which is an important step in the unwinding of double-stranded nucleic acids. Intermediate states along the ATP hydrolysis cycle, and the conformational changes between these states, represent important yet difficult-to-identify targets for potential inhibitors. Extensive molecular dynamics simulations of West Nile virus NS3h+ssRNA in the apo, ATP, ADP+Pi and ADP bound states were used to model the conformational ensembles along this cycle. Energetic and structural clustering analyses depict a clear trend of differential enthalpic affinity of NS3h with ADP, demonstrating a probable mechanism of hydrolysis turnover regulated by the motif-VI loop (MVIL). Based on these results, MVIL mutants (D471L, D471N and D471E) were found to have a substantial reduction in ATPase activity and RNA replication compared to the wild-type. Simulations of the mutants in the apo state indicate a shift in MVIL populations favoring either a closed or open 'valve' conformation, affecting ATP entry or stabilization, respectively. Combining our molecular modeling with experimental evidence highlights a conformation-dependent role for MVIL as a 'valve' for the ATP-pocket, presenting a promising target for antiviral development.
Affiliation(s)
- Priti Roy
- Department of Chemistry, Oklahoma State University, Stillwater, OK 74078, USA
- Zachary Walter
- Department of Microbiology and Immunology, Thomas Jefferson University, Philadelphia, PA 19107, USA
- Lauren Berish
- Department of Microbiology and Immunology, Thomas Jefferson University, Philadelphia, PA 19107, USA
- Holly Ramage
- Department of Microbiology and Immunology, Thomas Jefferson University, Philadelphia, PA 19107, USA
- Martin McCullagh
- Department of Chemistry, Oklahoma State University, Stillwater, OK 74078, USA
14
Chen L, Roe DR, Kochert M, Simmerling C, Miranda-Quintana RA. k-Means NANI: An Improved Clustering Algorithm for Molecular Dynamics Simulations. J Chem Theory Comput 2024; 20:5583-5597. [PMID: 38905589 PMCID: PMC11541788 DOI: 10.1021/acs.jctc.4c00308] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Indexed: 06/23/2024]
Abstract
One of the key challenges of k-means clustering is the seed selection or the initial centroid estimation since the clustering result depends heavily on this choice. Alternatives such as k-means++ have mitigated this limitation by estimating the centroids using an empirical probability distribution. However, with high-dimensional and complex data sets such as those obtained from molecular simulation, k-means++ fails to partition the data in an optimal manner. Furthermore, stochastic elements in all flavors of k-means++ will lead to a lack of reproducibility. K-means N-Ary Natural Initiation (NANI) is presented as an alternative to tackle this challenge by using efficient n-ary comparisons to both identify high-density regions in the data and select a diverse set of initial conformations. Centroids generated from NANI are not only representative of the data and different from one another, helping k-means to partition the data accurately, but also deterministic, providing consistent cluster populations across replicates. From peptide and protein folding molecular simulations, NANI was able to create compact and well-separated clusters as well as accurately find the metastable states that agree with the literature. NANI can cluster diverse data sets and be used as a standalone tool or as part of our MDANCE clustering package.
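The point about seed selection can be illustrated with a minimal Python sketch that runs Lloyd's k-means from a deterministic farthest-first seeding, so repeated runs give identical partitions. This is not the NANI algorithm; its n-ary comparison scheme is not reproduced here. The function names `farthest_first_seeds` and `kmeans` and the toy points are hypothetical and chosen only for illustration.

```python
import math


def farthest_first_seeds(points, k):
    """Deterministic seeding (an illustrative stand-in, NOT NANI):
    start at the point closest to the data mean, then repeatedly add
    the point farthest from all seeds chosen so far."""
    dim = len(points[0])
    mean = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    first = min(range(len(points)), key=lambda i: math.dist(points[i], mean))
    seeds = [first]
    while len(seeds) < k:
        nxt = max(
            range(len(points)),
            key=lambda i: min(math.dist(points[i], points[s]) for s in seeds),
        )
        seeds.append(nxt)
    return [list(points[i]) for i in seeds]


def kmeans(points, k, n_iter=50):
    """Plain Lloyd iterations starting from the deterministic seeds."""
    centroids = farthest_first_seeds(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest centroid for every point.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: centroid becomes the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append([sum(x) / len(members) for x in zip(*members)])
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


# Toy data: three well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (0, 10), (1, 10), (0, 11)]
labels, centroids = kmeans(points, k=3)
```

Because no random numbers are drawn, calling `kmeans(points, k=3)` again returns the same labels, which is the reproducibility property the abstract contrasts with k-means++.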
Affiliation(s)
- Lexin Chen
- Department of Chemistry, University of Florida, Gainesville, Florida 32611, United States
- Quantum Theory Project, University of Florida, Gainesville, Florida 32611, United States
- Daniel R Roe
- Laboratory of Computational Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
- Matthew Kochert
- Laufer Center for Physical & Quantitative Biology, Stony Brook University, Stony Brook, New York 11794, United States
- Department of Chemistry, Stony Brook University, Stony Brook, New York 11794, United States
- Carlos Simmerling
- Department of Chemistry, Stony Brook University, Stony Brook, New York 11794, United States
- Laufer Center for Physical & Quantitative Biology, Stony Brook University, Stony Brook, New York 11794, United States
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, New York 11794, United States
- Ramón Alain Miranda-Quintana
- Department of Chemistry, University of Florida, Gainesville, Florida 32611, United States
- Quantum Theory Project, University of Florida, Gainesville, Florida 32611, United States
15
Sasmal S, Pal T, Hocky GM, McCullagh M. Quantifying Unbiased Conformational Ensembles from Biased Simulations Using ShapeGMM. J Chem Theory Comput 2024; 20:3492-3502. [PMID: 38662196 PMCID: PMC11104435 DOI: 10.1021/acs.jctc.4c00223] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2024] [Revised: 04/05/2024] [Accepted: 04/05/2024] [Indexed: 04/26/2024]
Abstract
Quantifying the conformational ensembles of biomolecules is fundamental to describing mechanisms of processes such as protein folding, interconversion between folded states, ligand binding, and allosteric regulation. Accurate quantification of these ensembles remains a challenge for conventional molecular simulations of all but the simplest molecules due to insufficient sampling. Enhanced sampling approaches, such as metadynamics, were designed to overcome this challenge; however, the nonuniform frame weights that result from many of these approaches present an additional challenge to ensemble quantification techniques such as Markov State Modeling or structural clustering. Here, we present rigorous inclusion of nonuniform frame weights into a structural clustering method entitled shapeGMM. The result of frame-weighted shapeGMM is a high dimensional probability density and generative model for the unbiased system from which we can compute important thermodynamic properties such as relative free energies and configurational entropy. The accuracy of this approach is demonstrated by the quantitative agreement between GMMs computed by Hamiltonian reweighting and direct simulation of a coarse-grained helix model system. Furthermore, the relative free energy computed from a shapeGMM probability density of alanine dipeptide reweighted from a metadynamics simulation quantitatively reproduces the underlying free energy in the basins. Finally, the method identifies hidden structures along the actin globular to filamentous-like structural transition from a metadynamics simulation on a linear discriminant analysis coordinate trained on GMM states, illustrating how structural clustering of biased data can lead to biophysical insight. Combined, these results demonstrate that frame-weighted shapeGMM is a powerful approach to quantifying biomolecular ensembles from biased simulations.
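As a rough illustration of how nonuniform frame weights enter an ensemble estimate, the Python sketch below turns per-frame weights and cluster labels into weighted populations and relative free energies, ΔF = -kT ln(p / p_ref). It is not the shapeGMM implementation. The helper names, the assumption of a static bias potential in `weights_from_bias`, and the toy numbers are all hypothetical.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def weights_from_bias(bias, kT):
    """Frame weights proportional to exp(+V_bias / kT).
    Assumption: a static bias potential; time-dependent biases (for
    example in metadynamics) need additional corrections not shown here."""
    v_max = max(bias)  # subtract the maximum for numerical stability
    return [math.exp((v - v_max) / kT) for v in bias]


def weighted_populations(labels, weights):
    """Population of each cluster as its normalized sum of frame weights."""
    total = sum(weights)
    pops = {}
    for lab, w in zip(labels, weights):
        pops[lab] = pops.get(lab, 0.0) + w / total
    return pops


def relative_free_energies(pops, temperature=300.0):
    """Free energy of each cluster relative to the most populated one."""
    kT = K_B * temperature
    p_ref = max(pops.values())
    return {lab: -kT * math.log(p / p_ref) for lab, p in pops.items()}


# Toy example: three frames, two clusters, hand-picked weights.
labels = ["A", "A", "B"]
weights = [1.0, 1.0, 6.0]
pops = weighted_populations(labels, weights)
free_energies = relative_free_energies(pops, temperature=300.0)
```

With uniform weights the populations reduce to simple frame counts, which is the unweighted clustering case.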
Affiliation(s)
- Subarna Sasmal
- Department of Chemistry, New York University, New York, New York 10003, United States
- Triasha Pal
- Department of Chemistry, New York University, New York, New York 10003, United States
- Glen M. Hocky
- Department of Chemistry, New York University, New York, New York 10003, United States
- Simons Center for Computational Physical Chemistry, New York University, New York, New York 10003, United States
- Martin McCullagh
- Department of Chemistry, Oklahoma State University, Stillwater, Oklahoma 74078, United States
16
Gallegos M, Isamura BK, Popelier PLA, Martín Pendás Á. An Unsupervised Machine Learning Approach for the Automatic Construction of Local Chemical Descriptors. J Chem Inf Model 2024; 64:3059-3079. [PMID: 38498942 PMCID: PMC11040729 DOI: 10.1021/acs.jcim.3c01906] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Revised: 03/06/2024] [Accepted: 03/07/2024] [Indexed: 03/20/2024]
Abstract
Condensing the many physical variables defining a chemical system into a fixed-size array poses a significant challenge in the development of chemical Machine Learning (ML). Atom Centered Symmetry Functions (ACSFs) offer an intuitive featurization approach by means of a tedious and labor-intensive selection of tunable parameters. In this work, we implement an unsupervised ML strategy relying on a Gaussian Mixture Model (GMM) to automatically optimize the ACSF parameters. GMMs effortlessly decompose the vastness of the chemical and conformational spaces into well-defined radial and angular clusters, which are then used to build tailor-made ACSFs. The unsupervised exploration of the space has demonstrated general applicability across a diverse range of systems, spanning from various unimolecular landscapes to heterogeneous databases. The impact of the sampling technique and temperature on space exploration is also addressed, highlighting the particularly advantageous role of high-temperature Molecular Dynamics (MD) simulations. The reliability of the resulting features is assessed through the estimation of the atomic charges of a prototypical capped amino acid and a heterogeneous collection of CHON molecules. The automatically constructed ACSFs serve as high-quality descriptors, consistently yielding typical prediction errors below 0.010 electrons bound for the reported atomic charges. Altering the spatial distribution of the functions with respect to the cluster highlights the critical role of symmetry rupture in achieving significantly improved features. More specifically, using two separate functions to describe the lower and upper tails of the cluster results in the best performing models with errors as low as 0.006 electrons. 
Finally, the effectiveness of finely tuned features was checked across different architectures, unveiling the superior performance of Gaussian Process (GP) models over Feed Forward Neural Networks (FFNNs), particularly in low-data regimes, with nearly a 2-fold increase in prediction quality. Altogether, this approach paves the way toward an easier construction of local chemical descriptors, while providing valuable insights into how radial and angular spaces should be mapped. Finally, this work opens the possibility of encoding many-body information beyond angular terms into upcoming ML features.
Affiliation(s)
- Miguel Gallegos
- Department of Analytical and Physical Chemistry, University of Oviedo, Oviedo E-33006, Spain
- Paul L. A. Popelier
- Department of Chemistry, The University of Manchester, Oxford Road, Manchester M13 9PL, U.K.
- Ángel Martín Pendás
- Department of Analytical and Physical Chemistry, University of Oviedo, Oviedo E-33006, Spain
17
Wu Y, Cao S, Qiu Y, Huang X. Tutorial on how to build non-Markovian dynamic models from molecular dynamics simulations for studying protein conformational changes. J Chem Phys 2024; 160:121501. [PMID: 38516972 PMCID: PMC10964226 DOI: 10.1063/5.0189429] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Accepted: 02/20/2024] [Indexed: 03/23/2024]
Abstract
Protein conformational changes play crucial roles in their biological functions. In recent years, the Markov State Model (MSM) constructed from extensive Molecular Dynamics (MD) simulations has emerged as a powerful tool for modeling complex protein conformational changes. In MSMs, dynamics are modeled as a sequence of Markovian transitions among metastable conformational states at discrete time intervals (called lag time). A major challenge for MSMs is that the lag time must be long enough to allow transitions among states to become memoryless (or Markovian). However, this lag time is constrained by the length of individual MD simulations available to track these transitions. To address this challenge, we have recently developed Generalized Master Equation (GME)-based approaches, encoding non-Markovian dynamics using a time-dependent memory kernel. In this Tutorial, we introduce the theory behind two recently developed GME-based non-Markovian dynamic models: the quasi-Markov State Model (qMSM) and the Integrative Generalized Master Equation (IGME). We subsequently outline the procedures for constructing these models and provide a step-by-step tutorial on applying qMSM and IGME to study two peptide systems: alanine dipeptide and villin headpiece. This Tutorial is available at https://github.com/xuhuihuang/GME_tutorials. The protocols detailed in this Tutorial aim to be accessible for non-experts interested in studying the biomolecular dynamics using these non-Markovian dynamic models.
Affiliation(s)
- Yue Wu
- Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Siqin Cao
- Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Yunrui Qiu
- Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Xuhui Huang
- Author to whom correspondence should be addressed:
18
Daniel DT, Mitra S, Eichel RA, Diddens D, Granwehr J. Machine Learning Isotropic g Values of Radical Polymers. J Chem Theory Comput 2024; 20:2592-2604. [PMID: 38456629 DOI: 10.1021/acs.jctc.3c01252] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/09/2024]
Abstract
Methods for electronic structure computations, such as density functional theory (DFT), are routinely used for the calculation of spectroscopic parameters to establish and validate structure-parameter correlations. DFT calculations, however, are computationally expensive for large systems such as polymers. This work explores the machine learning (ML) of isotropic g values, g_iso, obtained from electron paramagnetic resonance (EPR) experiments of an organic radical polymer. An ML model based on regression trees is trained on DFT-calculated g values of poly(2,2,6,6-tetramethylpiperidinyloxy-4-yl methacrylate) (PTMA) polymer structures extracted from different time frames of a molecular dynamics trajectory. The DFT-derived g values, g_iso^calc, for different radical densities of PTMA, are compared against experimentally derived g values obtained from in operando EPR measurements of a PTMA-based organic radical battery. The ML-predicted g_iso values, g_iso^pred, were compared with g_iso^calc to evaluate the performance of the model. Mean deviations of g_iso^pred from g_iso^calc were found to be on the order of 0.0001. Furthermore, a performance evaluation on test structures from a separate MD trajectory indicated that the model is sensitive to the radical density and efficiently learns to predict g_iso values even for radical densities that were not part of the training data set. Since our trained model can reproduce the changes in g_iso along the MD trajectory and is sensitive to the extent of equilibration of the polymer structure, it is a promising alternative to computationally more expensive DFT methods, particularly for large systems that cannot be easily represented by a smaller model system.
Affiliation(s)
- Davis Thomas Daniel
- Institute of Energy and Climate Research (IEK-9), Forschungszentrum Jülich GmbH, 52425 Jülich, Germany
- Institute of Technical and Macromolecular Chemistry, RWTH Aachen University, 52056 Aachen, Germany
- Souvik Mitra
- Institute of Physical Chemistry, University of Münster, 48149 Münster, Germany
- Rüdiger-A Eichel
- Institute of Energy and Climate Research (IEK-9), Forschungszentrum Jülich GmbH, 52425 Jülich, Germany
- Institute of Physical Chemistry, RWTH Aachen University, Aachen 52056, Germany
- Diddo Diddens
- Helmholtz Institute Münster (IEK-12), Forschungszentrum Jülich GmbH, 48149 Münster, Germany
- Josef Granwehr
- Institute of Energy and Climate Research (IEK-9), Forschungszentrum Jülich GmbH, 52425 Jülich, Germany
- Institute of Technical and Macromolecular Chemistry, RWTH Aachen University, 52056 Aachen, Germany
19
Chen L, Roe DR, Kochert M, Simmerling C, Miranda-Quintana RA. k-Means NANI: an improved clustering algorithm for Molecular Dynamics simulations. bioRxiv 2024:2024.03.07.583975. [PMID: 38496504 PMCID: PMC10942464 DOI: 10.1101/2024.03.07.583975] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/19/2024]
Abstract
One of the key challenges of k-means clustering is the seed selection or the initial centroid estimation since the clustering result depends heavily on this choice. Alternatives such as k-means++ have mitigated this limitation by estimating the centroids using an empirical probability distribution. However, with high-dimensional and complex datasets such as those obtained from molecular simulation, k-means++ fails to partition the data in an optimal manner. Furthermore, stochastic elements in all flavors of k-means++ will lead to a lack of reproducibility. K-means N-Ary Natural Initiation (NANI) is presented as an alternative to tackle this challenge by using efficient n-ary comparisons to both identify high-density regions in the data and select a diverse set of initial conformations. Centroids generated from NANI are not only representative of the data and different from one another, helping k-means to partition the data accurately, but also deterministic, providing consistent cluster populations across replicates. From peptide and protein folding molecular simulations, NANI was able to create compact and well-separated clusters as well as accurately find the metastable states that agree with the literature. NANI can cluster diverse datasets and be used as a standalone tool or as part of our MDANCE clustering package.
Affiliation(s)
- Lexin Chen
- Department of Chemistry, University of Florida, FL, USA
- Quantum Theory Project, University of Florida, FL, USA
- Daniel R Roe
- Laboratory of Computational Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland, USA
- Matthew Kochert
- Laufer Center for Physical & Quantitative Biology, Stony Brook University, Stony Brook, 11794, USA
- Department of Chemistry, Stony Brook University, Stony Brook 11794, USA
- Carlos Simmerling
- Laufer Center for Physical & Quantitative Biology, Stony Brook University, Stony Brook, 11794, USA
- Department of Chemistry, Stony Brook University, Stony Brook 11794, USA
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook 11794, USA
20
Lawal MM, Roy P, McCullagh M. Role of ATP Hydrolysis and Product Release in the Translocation Mechanism of SARS-CoV-2 NSP13. J Phys Chem B 2024; 128:492-503. [PMID: 38175211 PMCID: PMC11256563 DOI: 10.1021/acs.jpcb.3c06714] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/05/2024]
Abstract
In response to the emergence of COVID-19, caused by SARS-CoV-2, there has been a growing interest in understanding the functional mechanisms of the viral proteins to aid in the development of new therapeutics. Nonstructural protein 13 (nsp13) helicase is an attractive target for antivirals because it is essential for viral replication and has a low mutation rate, yet the structural mechanisms by which this enzyme binds and hydrolyzes ATP to cause unidirectional RNA translocation remain elusive. Using Gaussian accelerated molecular dynamics (GaMD), we generated comprehensive conformational ensembles of all substrate states along the ATP-dependent cycle. Shape-GMM clustering of the protein yields four protein conformations that describe an opening and closing of both the ATP pocket and the RNA cleft that is achieved through a combination of conformational selection and induction along the ATP hydrolysis cycle. Furthermore, three protein-RNA conformations are observed that implicate motifs Ia, IV, and V as playing a pivotal role in an ATP-dependent inchworm translocation mechanism. Finally, based on a linear discriminant analysis of protein conformations, we identify L405 as a pivotal residue for the opening and closing mechanism and propose a L405D mutation as a way to disrupt translocation. This research enhances our understanding of nsp13's role in viral replication and could contribute to the development of antiviral strategies.
Affiliation(s)
- Monsurat M. Lawal
- Department of Chemistry, Oklahoma State University, Stillwater, OK, 74074, USA
- These authors contributed equally to this work
- Priti Roy
- Department of Chemistry, Oklahoma State University, Stillwater, OK, 74074, USA
- These authors contributed equally to this work
- Martin McCullagh
- Department of Chemistry, Oklahoma State University, Stillwater, OK, 74074, USA
21
Roy P, Walter Z, Berish L, Ramage H, McCullagh M. Motif-VI Loop Acts as a Nucleotide Valve in the West Nile Virus NS3 Helicase. bioRxiv 2023:2023.11.30.569434. [PMID: 38077049 PMCID: PMC10705498 DOI: 10.1101/2023.11.30.569434] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/18/2023]
Abstract
The flavivirus NS3 helicase (NS3h), a highly conserved protein, plays a pivotal role in virus replication and thus represents a potential drug target for flavivirus pathogenesis. NS3h utilizes nucleotide triphosphate, such as ATP, for hydrolysis energy (ATPase) to translocate on single-stranded nucleic acids, which is an important step in the unwinding of double-stranded nucleic acids. The intermediate states along the ATP binding and hydrolysis cycle, as well as the conformational changes between these states, represent important yet difficult-to-identify targets for potential inhibitors. We use extensive molecular dynamics simulations of apo, ATP, ADP+Pi, and ADP bound to WNV NS3h+ssRNA to model the conformational ensembles along this cycle. Energetic and structural clustering analyses on these trajectories depict a clear trend of differential enthalpic affinity of NS3h with ADP, demonstrating a probable mechanism of hydrolysis turnover regulated by the motif-VI loop (MVIL). These findings were experimentally corroborated using viral replicons encoding three mutations at the D471 position. Replication assays using these mutants demonstrated a substantial reduction in viral replication compared to the wild-type. Molecular simulations of the D471 mutants in the apo state indicate a shift in MVIL populations favoring either a closed or open 'valve' conformation, affecting ATP entry or stabilization, respectively. Combining our molecular modeling with experimental evidence highlights a conformation-dependent role for MVIL as a 'valve' for the ATP-pocket, presenting a promising target for antiviral development.
Affiliation(s)
- Priti Roy
- Department of Chemistry, Oklahoma State University, Stillwater, OK, USA, 74078
- Zachary Walter
- Department of Microbiology and Immunology, Thomas Jefferson University, Philadelphia, PA, USA, 19107
- Lauren Berish
- Department of Microbiology and Immunology, Thomas Jefferson University, Philadelphia, PA, USA, 19107
- Holly Ramage
- Department of Microbiology and Immunology, Thomas Jefferson University, Philadelphia, PA, USA, 19107
- Martin McCullagh
- Department of Chemistry, Oklahoma State University, Stillwater, OK, USA, 74078
22
Lawal MM, Roy P, McCullagh M. The Role of ATP Hydrolysis and Product Release in the Translocation Mechanism of SARS-CoV-2 NSP13. bioRxiv 2023:2023.09.28.560057. [PMID: 37808802 PMCID: PMC10557736 DOI: 10.1101/2023.09.28.560057] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/10/2023]
Abstract
In response to the emergence of COVID-19, caused by SARS-CoV-2, there has been a growing interest in understanding the functional mechanisms of the viral proteins to aid in the development of new therapeutics. Non-structural protein 13 (Nsp13) helicase is an attractive target for antivirals because it is essential for viral replication and has a low mutation rate; yet, the structural mechanisms by which this enzyme binds and hydrolyzes ATP to cause unidirectional RNA translocation remain elusive. Using Gaussian accelerated molecular dynamics (GaMD), we generated a comprehensive conformational ensemble of all substrate states along the ATP-dependent cycle. ShapeGMM clustering of the protein yields four protein conformations that describe an opening and closing of both the ATP pocket and RNA cleft. This opening and closing is achieved through a combination of conformational selection and induction along the ATP cycle. Furthermore, three protein-RNA conformations are observed that implicate motifs Ia, IV, and V as playing a pivotal role in an ATP-dependent inchworm translocation mechanism. Finally, based on a linear discriminant analysis of protein conformations, we identify L405 as a pivotal residue for the opening and closing mechanism and propose a L405D mutation as a way of testing our proposed mechanism. This research enhances our understanding of nsp13's role in viral replication and could contribute to the development of antiviral strategies.
Affiliation(s)
- Monsurat M. Lawal
- Department of Chemistry, Oklahoma State University, Stillwater OK
- These authors contributed equally to this work
- Priti Roy
- Department of Chemistry, Oklahoma State University, Stillwater OK
- These authors contributed equally to this work
- Martin McCullagh
- Department of Chemistry, Oklahoma State University, Stillwater OK
23
Wellawatte GP, Hocky GM, White AD. Neural potentials of proteins extrapolate beyond training data. J Chem Phys 2023; 159:085103. [PMID: 37642255 PMCID: PMC10474891 DOI: 10.1063/5.0147240] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/20/2023] [Accepted: 07/31/2023] [Indexed: 08/31/2023]
Abstract
We evaluate neural network (NN) coarse-grained (CG) force fields compared to traditional CG molecular mechanics force fields. We conclude that NN force fields are able to extrapolate and sample from unseen regions of the free energy surface when trained with limited data. Our results come from 88 NN force fields trained on different combinations of clustered free energy surfaces from four protein mapped trajectories. We used a statistical measure named total variation similarity to assess the agreement between reference free energy surfaces from mapped atomistic simulations and CG simulations from trained NN force fields. Our conclusions support the hypothesis that NN CG force fields trained with samples from one region of the proteins' free energy surface can, indeed, extrapolate to unseen regions. Additionally, the force matching error was found to only be weakly correlated with a force field's ability to reconstruct the correct free energy surface.
Affiliation(s)
- Geemi P. Wellawatte
- Department of Chemistry, University of Rochester, Rochester, New York 14627, USA
- Glen M. Hocky
- Department of Chemistry, Simons Center for Computational Physical Chemistry, New York University, New York, New York 10003, USA
- Andrew D. White
- Department of Chemical Engineering, University of Rochester, Rochester, New York 14627, USA
24
Appadurai R, Koneru JK, Bonomi M, Robustelli P, Srivastava A. Clustering Heterogeneous Conformational Ensembles of Intrinsically Disordered Proteins with t-Distributed Stochastic Neighbor Embedding. J Chem Theory Comput 2023; 19:4711-4727. [PMID: 37338049 PMCID: PMC11108026 DOI: 10.1021/acs.jctc.3c00224] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Indexed: 06/21/2023]
Abstract
Intrinsically disordered proteins (IDPs) populate a range of conformations that are best described by a heterogeneous ensemble. Grouping an IDP ensemble into "structurally similar" clusters for visualization, interpretation, and analysis purposes is a much-desired but formidable task, as the conformational space of IDPs is inherently high-dimensional and reduction techniques often result in ambiguous classifications. Here, we employ the t-distributed stochastic neighbor embedding (t-SNE) technique to generate homogeneous clusters of IDP conformations from the full heterogeneous ensemble. We illustrate the utility of t-SNE by clustering conformations of two disordered proteins, Aβ42, and α-synuclein, in their APO states and when bound to small molecule ligands. Our results shed light on ordered substates within disordered ensembles and provide structural and mechanistic insights into binding modes that confer specificity and affinity in IDP ligand binding. t-SNE projections preserve the local neighborhood information, provide interpretable visualizations of the conformational heterogeneity within each ensemble, and enable the quantification of cluster populations and their relative shifts upon ligand binding. Our approach provides a new framework for detailed investigations of the thermodynamics and kinetics of IDP ligand binding and will aid rational drug design for IDPs.
Affiliation(s)
- Rajeswari Appadurai
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore, Karnataka 560012, India
- Massimiliano Bonomi
- Structural Bioinformatics Unit, Department of Structural Biology and Chemistry. CNRS UMR 3528, C3BI, CNRS USR 3756, Institut Pasteur, Paris, France
- Paul Robustelli
- Dartmouth College, Department of Chemistry, Hanover, NH, 03755, USA
- Anand Srivastava
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore, Karnataka 560012, India
25
Sasmal S, McCullagh M, Hocky GM. Reaction Coordinates for Conformational Transitions Using Linear Discriminant Analysis on Positions. J Chem Theory Comput 2023; 19:4427-4435. [PMID: 37130367 PMCID: PMC10373481 DOI: 10.1021/acs.jctc.3c00051] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 01/11/2023] [Indexed: 05/04/2023]
Abstract
In this work, we demonstrate that Linear Discriminant Analysis (LDA) applied to atomic positions in two different states of a biomolecule produces a good reaction coordinate between those two states. Atomic coordinates of a macromolecule are a direct representation of a macromolecular configuration, and yet, they are not used in enhanced sampling studies due to a lack of rotational and translational invariance. We resolve this issue using the technique of our prior work, whereby a molecular configuration is considered a member of an equivalence class in size-and-shape space, which is the set of all configurations that can be translated and rotated to a single point within a reference multivariate Gaussian distribution characterizing a single molecular state. The reaction coordinates produced by LDA applied to positions are shown to be good reaction coordinates both in terms of characterizing the transition between two states of a system within a long molecular dynamics (MD) simulation and also ones that allow us to readily produce free energy estimates along that reaction coordinate using enhanced sampling MD techniques.
Collapse
Affiliation(s)
- Subarna Sasmal
- Department
of Chemistry and Simons Center for Computational Physical Chemistry, New York University, New York, New York 10003, United States
| | - Martin McCullagh
- Department
of Chemistry, Oklahoma State University, Stillwater, Oklahoma 74078, United States
| | - Glen M. Hocky
- Department
of Chemistry and Simons Center for Computational Physical Chemistry, New York University, New York, New York 10003, United States
| |
|
26
|
Nagel D, Sartore S, Stock G. Selecting Features for Markov Modeling: A Case Study on HP35. J Chem Theory Comput 2023. [PMID: 37167425 DOI: 10.1021/acs.jctc.3c00240] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 05/13/2023]
Abstract
Markov state models represent a popular means to interpret molecular dynamics trajectories in terms of memoryless transitions between metastable conformational states. To provide a mechanistic understanding of the considered biomolecular process, these states should reflect structurally distinct conformations and ensure a time scale separation between fast intrastate and slow interstate dynamics. Adopting the folding of villin headpiece (HP35) as a well-established model problem, here we discuss the selection of suitable input coordinates or "features", such as backbone dihedral angles and interresidue distances. We show that dihedral angles account accurately for the structure of the native energy basin of HP35, while the unfolded region of the free energy landscape and the folding process are best described by tertiary contacts of the protein. To construct a contact-based model, we consider various ways to define and select contact distances and introduce a low-pass filtering of the feature trajectory as well as a correlation-based characterization of states. Relying on input data that faithfully account for the mechanistic origin of the studied process, the states of the resulting Markov model are clearly discriminated by the features, describe consistently the hierarchical structure of the free energy landscape, and, as a consequence, correctly reproduce the slow time scales of the process.
Affiliation(s)
- Daniel Nagel
- Biomolecular Dynamics, Institute of Physics, University of Freiburg, 79104 Freiburg, Germany
| | - Sofia Sartore
- Biomolecular Dynamics, Institute of Physics, University of Freiburg, 79104 Freiburg, Germany
| | - Gerhard Stock
- Biomolecular Dynamics, Institute of Physics, University of Freiburg, 79104 Freiburg, Germany
| |
|
27
|
Dominic AJ, Cao S, Montoya-Castillo A, Huang X. Memory Unlocks the Future of Biomolecular Dynamics: Transformative Tools to Uncover Physical Insights Accurately and Efficiently. J Am Chem Soc 2023; 145:9916-9927. [PMID: 37104720 DOI: 10.1021/jacs.3c01095] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Indexed: 04/29/2023]
Abstract
Conformational changes underpin function and encode complex biomolecular mechanisms. Gaining atomic-level detail of how such changes occur has the potential to reveal these mechanisms and is of critical importance in identifying drug targets, facilitating rational drug design, and enabling bioengineering applications. While the past two decades have brought Markov state model techniques to the point where practitioners can regularly use them to glimpse the long-time dynamics of slow conformations in complex systems, many systems are still beyond their reach. In this Perspective, we discuss how including memory (i.e., non-Markovian effects) can reduce the computational cost to predict the long-time dynamics in these complex systems by orders of magnitude and with greater accuracy and resolution than state-of-the-art Markov state models. We illustrate how memory lies at the heart of successful and promising techniques, ranging from the Fokker-Planck and generalized Langevin equations to deep-learning recurrent neural networks and generalized master equations. We delineate how these techniques work, identify insights that they can offer in biomolecular systems, and discuss their advantages and disadvantages in practical settings. We show how generalized master equations can enable the investigation of, for example, the gate-opening process in RNA polymerase II and demonstrate how our recent advances tame the deleterious influence of statistical underconvergence of the molecular dynamics simulations used to parameterize these techniques. This represents a significant leap forward that will enable our memory-based techniques to interrogate systems that are currently beyond the reach of even the best Markov state models. We conclude by discussing some current challenges and future prospects for how exploiting memory will open the door to many exciting opportunities.
Affiliation(s)
- Anthony J Dominic
- Department of Chemistry, University of Colorado Boulder, Boulder, Colorado 80309, USA
| | - Siqin Cao
- Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
| | | | - Xuhui Huang
- Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
| |
|
28
|
Dominic AJ, Sayer T, Cao S, Markland TE, Huang X, Montoya-Castillo A. Building insightful, memory-enriched models to capture long-time biochemical processes from short-time simulations. Proc Natl Acad Sci U S A 2023; 120:e2221048120. [PMID: 36920924 PMCID: PMC10041170 DOI: 10.1073/pnas.2221048120] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 12/11/2022] [Accepted: 02/21/2023] [Indexed: 03/16/2023] Open
Abstract
The ability to predict and understand complex molecular motions occurring over diverse timescales ranging from picoseconds to seconds and even hours in biological systems remains one of the largest challenges to chemical theory. Markov state models (MSMs), which provide a memoryless description of the transitions between different states of a biochemical system, have provided numerous important physically transparent insights into biological function. However, constructing these models often necessitates performing extremely long molecular simulations to converge the rates. Here, we show that by incorporating memory via the time-convolutionless generalized master equation (TCL-GME) one can build a theoretically transparent and physically intuitive memory-enriched model of biochemical processes with up to a three order of magnitude reduction in the simulation data required while also providing a higher temporal resolution. We derive the conditions under which the TCL-GME provides a more efficient means to capture slow dynamics than MSMs and rigorously prove when the two provide equally valid and efficient descriptions of the slow configurational dynamics. We further introduce a simple averaging procedure that enables our TCL-GME approach to quickly converge and accurately predict long-time dynamics even when parameterized with noisy reference data arising from short trajectories. We illustrate the advantages of the TCL-GME using alanine dipeptide, the human argonaute complex, and FiP35 WW domain.
Affiliation(s)
| | - Thomas Sayer
- Department of Chemistry, University of Colorado, Boulder, CO 80309
| | - Siqin Cao
- Department of Chemistry, University of Wisconsin-Madison, Madison, WI 53706
| | | | - Xuhui Huang
- Department of Chemistry, University of Wisconsin-Madison, Madison, WI 53706
| | | |
|