1
Peña Ccoa WJ, Mukadum F, Ramon A, Stirnemann G, Hocky GM. A direct computational assessment of vinculin-actin unbinding kinetics reveals catch-bonding behavior. Proc Natl Acad Sci U S A 2025; 122:e2425982122. [PMID: 40397673 DOI: 10.1073/pnas.2425982122] [Received: 12/11/2024] [Accepted: 04/16/2025] [Indexed: 05/23/2025] Open
Abstract
Vinculin forms a catch bond with the cytoskeletal polymer actin, displaying an increased bond lifetime upon force application. Notably, this behavior depends on the direction of the applied force, which has significant implications for cellular mechanotransduction. In this work, we present a comprehensive molecular dynamics simulation study, employing enhanced sampling techniques to investigate the thermodynamic, kinetic, and mechanistic aspects of this phenomenon at physiologically relevant forces. We dissect a catch bond mechanism in which force shifts vinculin between a weakly bound and a strongly bound state. Our results demonstrate that models for these states have unbinding times consistent with those from single-molecule studies, and suggest that both have some intrinsic catch-bonding behavior. We provide atomistic insight into this behavior, and show how a directional pulling force can promote the strong or weak state. Crucially, our strategy can be extended to measure the difficult-to-capture effects of small mechanical forces on biomolecular systems in general, and those involved in mechanotransduction more specifically.
Affiliation(s)
- Fatemah Mukadum
  - Department of Chemistry, New York University, New York, NY 10003
- Aubin Ramon
  - Centre for Misfolding Diseases, Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, United Kingdom
  - Chimie Physique et Chimie pour le Vivant Laboratory, Department of Chemistry, École Normale Supérieure, Paris Sciences et Lettres (PSL) University, Sorbonne University, CNRS, Paris 75005, France
- Guillaume Stirnemann
  - Chimie Physique et Chimie pour le Vivant Laboratory, Department of Chemistry, École Normale Supérieure, Paris Sciences et Lettres (PSL) University, Sorbonne University, CNRS, Paris 75005, France
- Glen M Hocky
  - Department of Chemistry, New York University, New York, NY 10003
  - Simons Center For Computational Physical Chemistry, New York University, New York, NY 10003
2
Bera P, Mondal J. Accurate prediction of the kinetic sequence of physicochemical states using generative artificial intelligence. Chem Sci 2025; 16:8735-8751. [PMID: 40271036 PMCID: PMC12012632 DOI: 10.1039/d5sc00108k] [Received: 01/07/2025] [Accepted: 04/10/2025] [Indexed: 04/25/2025] Open
Abstract
Capturing the time evolution and predicting kinetic sequences of states of physicochemical systems present significant challenges due to the precision and computational effort required. In this study, we demonstrate that 'Generative Pre-trained Transformer (GPT)', an artificial intelligence model renowned for machine translation and natural language processing, can be effectively adapted to predict the dynamical state-to-state transition kinetics of biologically relevant physicochemical systems. Specifically, by using sequences of time-discretized states from Molecular Dynamics (MD) simulation trajectories akin to the vocabulary corpus of a language, we show that a GPT-based model can learn the complex syntactic and semantic relationships within the trajectory. This enables GPT to predict kinetically accurate sequences of states for a diverse set of biomolecules of varying complexity, at a much quicker pace than traditional MD simulations and with a better efficiency than other baseline time-series prediction approaches. More significantly, the approach is found to be equally adept at forecasting the time evolution of out-of-equilibrium active systems that do not maintain detailed balance. An analysis of the mechanism inherent in GPT reveals the crucial role of the 'self-attention mechanism' in capturing the long-range correlations necessary for accurate state-to-state transition predictions. Together, our results highlight generative artificial intelligence's ability to generate kinetic sequences of states of physicochemical systems with statistical precision.
Affiliation(s)
- Palash Bera
  - Tata Institute of Fundamental Research, Hyderabad, Telangana 500046, India
- Jagannath Mondal
  - Tata Institute of Fundamental Research, Hyderabad, Telangana 500046, India
3
Ansari N, Jing ZF, Gagelin A, Hédin F, Aviat F, Hénin J, Piquemal JP, Lagardère L. Lambda-ABF-OPES: Faster Convergence with High Accuracy in Alchemical Free Energy Calculations. J Phys Chem Lett 2025; 16:4626-4634. [PMID: 40312308 DOI: 10.1021/acs.jpclett.5c00683] [Indexed: 05/03/2025]
Abstract
Predicting the binding affinity between small molecules and target macromolecules while combining both speed and accuracy is a cornerstone of modern computational drug discovery, which is critical for accelerating therapeutic development. Despite recent progress in molecular dynamics (MD) simulations, such as advanced polarizable force fields and enhanced sampling techniques, estimating absolute binding free energies (ABFEs) remains computationally challenging. To overcome these difficulties, we introduce a highly efficient hybrid methodology that couples the Lambda-adaptive biasing force (Lambda-ABF) scheme with on-the-fly probability enhanced sampling (OPES). This approach achieves up to a 9-fold improvement in sampling efficiency and computational speed compared to the original Lambda-ABF when used in conjunction with the AMOEBA polarizable force field, yielding converged results at a fraction of the cost of standard techniques.
Affiliation(s)
- Narjes Ansari
  - Qubit Pharmaceuticals, 29 rue du Faubourg Saint Jacques, 75014 Paris, France
- Zhifeng Francis Jing
  - Qubit Pharmaceuticals, 31 Saint James Avenue, Suite 810, Boston, Massachusetts 02116, United States
- Antoine Gagelin
  - Qubit Pharmaceuticals, 29 rue du Faubourg Saint Jacques, 75014 Paris, France
- Florent Hédin
  - Qubit Pharmaceuticals, 29 rue du Faubourg Saint Jacques, 75014 Paris, France
- Félix Aviat
  - Qubit Pharmaceuticals, 29 rue du Faubourg Saint Jacques, 75014 Paris, France
- Jérôme Hénin
  - Laboratoire de Biochimie Théorique, UPR 9080 CNRS, Université de Paris Cité, 75005 Paris, France
- Jean-Philip Piquemal
  - Qubit Pharmaceuticals, 29 rue du Faubourg Saint Jacques, 75014 Paris, France
  - Laboratoire de Chimie Théorique, Sorbonne Université, UMR 7616 CNRS, 75005 Paris, France
- Louis Lagardère
  - Qubit Pharmaceuticals, 29 rue du Faubourg Saint Jacques, 75014 Paris, France
  - Laboratoire de Chimie Théorique, Sorbonne Université, UMR 7616 CNRS, 75005 Paris, France
4
Yang J, Yin Z, Li S. Accounting for the vibrational contribution to the configurational entropy in disordered solids with machine learned forcefields: a case study of garnet electrolyte Li7La3Zr2O12. Phys Chem Chem Phys 2025; 27:9095-9111. [PMID: 40227832 DOI: 10.1039/d5cp00138b] [Indexed: 04/15/2025]
Abstract
Accounting for lattice vibrations to accurately determine the phase stabilities of site-disordered solids is a long-standing challenge in computational material designs, due to the high computational cost associated with sampling the vast configurational space to obtain the converged thermodynamic quantities. One example is the garnet electrolyte Li7La3Zr2O12, the high-temperature and high-ion-mobility cubic phase of which is disordered in its Li+ site occupations, such that both the vibrational and configurational entropic contributions to its phase stability cannot be ignored. Understanding the subtle interplay between vibrational and configurational entropies in this material will therefore play a critical role in the rational manipulation of dopants and defects to stabilise cubic Li7La3Zr2O12 at room temperature for practical applications. Here, by developing machine learned forcefields based on an equivariant message-passing neural network SO3KRATES, we follow a strict statistical thermodynamic protocol to quantify the phase stability of cubic Li7La3Zr2O12 through structural optimisations, as well as molecular dynamics simulations at 300 and 1500 K, for a total of 70,120 configurations of cubic Li7La3Zr2O12. Although this only covers a tiny fraction of the configurational space (∼7 × 10^34 configurations in total), we are able to deterministically show that the vibrational contributions to the total configurational free energy at 1500 K are significant (on the order of 1 eV per atom) in correctly ordering the stability of the cubic Li7La3Zr2O12 over its tetragonal counterpart, thanks to the high data efficiency, accuracy, stability and good transferability of the transformer-based equivariant network architecture behind SO3KRATES. Therefore, our work opens up new avenues to accelerate the accurate computational designs of disordered solids, such as solid electrolytes, for technologically important applications.
Affiliation(s)
- Jack Yang
  - Materials and Manufacturing Futures Institute, School of Material Science and Engineering, University of New South Wales, Sydney, New South Wales 2052, Australia.
- Ziqi Yin
  - Materials and Manufacturing Futures Institute, School of Material Science and Engineering, University of New South Wales, Sydney, New South Wales 2052, Australia.
- Sean Li
  - Materials and Manufacturing Futures Institute, School of Material Science and Engineering, University of New South Wales, Sydney, New South Wales 2052, Australia.
5
Vargas-Rosales PA, Caflisch A. The physics-AI dialogue in drug design. RSC Med Chem 2025; 16:1499-1515. [PMID: 39906313 PMCID: PMC11788922 DOI: 10.1039/d4md00869c] [Received: 11/07/2024] [Accepted: 01/16/2025] [Indexed: 02/06/2025] Open
Abstract
A long path has led from the determination of the first protein structure in 1960 to the recent breakthroughs in protein science. Protein structure prediction and design methodologies based on machine learning (ML) have been recognized with the 2024 Nobel Prize in Chemistry, but they would not have been possible without previous work and the input of many domain scientists. Challenges remain in the application of ML tools for the prediction of structural ensembles and their usage within the software pipelines for structure determination by crystallography or cryogenic electron microscopy. In the drug discovery workflow, ML techniques are being used in diverse areas such as the scoring of docked poses or the generation of molecular descriptors. As ML techniques become more widespread, novel applications emerge which can profit from the large amounts of data available. Nevertheless, it is essential to balance the potential advantages against the environmental costs of ML deployment to decide if and when it is best to apply it. For hit-to-lead optimization, ML tools can efficiently interpolate between compounds in large chemical series, but free energy calculations by molecular dynamics simulations seem to be superior for designing novel derivatives. Importantly, the potential complementarity and/or synergism of physics-based methods (e.g., force field-based simulation models) and data-hungry ML techniques is growing strongly. Current ML methods have evolved from decades of research. It is now necessary for biologists, physicists, and computer scientists to fully understand the advantages and limitations of ML techniques to ensure that the complementarity of physics-based methods and ML tools can be fully exploited for drug design.
Affiliation(s)
- Amedeo Caflisch
  - Department of Biochemistry, University of Zurich, Winterthurerstrasse 190, 8057 Zürich, Switzerland
6
Ghorbaninia M, Doroudgar S, Ganjalikhany MR. Delving into the crucial role of the initial structure in the dynamic and self-assembly of amyloid beta. Biochem Biophys Res Commun 2025; 758:151652. [PMID: 40117973 DOI: 10.1016/j.bbrc.2025.151652] [Received: 11/10/2024] [Revised: 03/11/2025] [Accepted: 03/15/2025] [Indexed: 03/23/2025]
Abstract
Alzheimer's disease involves the accumulation of amyloid beta (Aβ) monomers that form oligomers and fibrils in the brain. Studying the Aβ monomer is critical for understanding Aβ assembly and peptide behavior and has implications for drug design. Choosing a starting structure with a higher aggregation tendency is crucial for cost-effective MD studies and drug design. Previous studies have utilized distinct initial conformations, leading to varying results. Hence, this study was conducted to compare different initial conformations using the same MD simulation protocol, investigating the behavior and oligomerization propensity of different starting structures of Aβ over 1 μs. The behavior of the monomers and their self-assembly systems were studied thoroughly, and the results revealed that highly helical Aβ monomers used as starting structures retain a high helix content during the simulation, and that their tautomerization states did not cause significant changes in the structure. On the other hand, the Aβ extended and S-shaped monomers displayed the fingerprints of the fibril structure, which is believed to be more favorable for self-assembly. Self-assembly behaviors were seen for three S-shaped and three Aβ extended peptides. However, neither conformation showed stable intermolecular β-sheet interactions. For the Aβ16-22 monomer, a fragment of Aβ that can assemble into fibrils, the impacts of capping and uncapping on the initial structure were also investigated. The results showed that both capped and uncapped structures can form oligomers with β-sheets at the termini. However, in the capped state, β-sheet interactions were more stable and persisted longer than in the uncapped state.
Affiliation(s)
- Maryam Ghorbaninia
  - Department of Cell and Molecular Biology & Microbiology, Faculty of Biological Science and Technology, University of Isfahan, Isfahan, Iran
- Shirin Doroudgar
  - Department of Internal Medicine and the Translational Cardiovascular Research Center, University of Arizona College of Medicine - Phoenix, Phoenix, AZ, United States
- Mohamad Reza Ganjalikhany
  - Department of Cell and Molecular Biology & Microbiology, Faculty of Biological Science and Technology, University of Isfahan, Isfahan, Iran.
7
Ma A, Li H. Reaction Coordinates Are Optimal Channels of Energy Flow. Annu Rev Phys Chem 2025; 76:153-179. [PMID: 39903861 DOI: 10.1146/annurev-physchem-082423-010652] [Indexed: 02/06/2025]
Abstract
Reaction coordinates (RCs) are the few essential coordinates of a protein that control its functional processes, such as allostery, enzymatic reaction, and conformational change. They are critical for understanding protein function and provide optimal enhanced sampling of protein conformational changes and states. Since the pioneering work in the late 1990s, identifying the correct and objectively provable RCs has been a central topic in molecular biophysics and chemical physics. This review summarizes the major advances in identifying RCs over the past 25 years, focusing on methods aimed at finding RCs that meet the rigorous committor criterion, widely accepted as the true RCs. Notably, the newly developed physics-based energy flow theory and generalized work functional method provide a general and rigorous approach for identifying true RCs, revealing their physical nature as the optimal channels of energy flow in biomolecules.
Affiliation(s)
- Ao Ma
  - Center for Bioinformatics and Quantitative Biology, Richard and Loan Hill Department of Biomedical Engineering, The University of Illinois Chicago, Chicago, Illinois, USA
- Huiyu Li
  - Center for Bioinformatics and Quantitative Biology, Richard and Loan Hill Department of Biomedical Engineering, The University of Illinois Chicago, Chicago, Illinois, USA
8
Singh AN, Das A, Limmer DT. Variational Path Sampling of Rare Dynamical Events. Annu Rev Phys Chem 2025; 76:639-662. [PMID: 39971385 DOI: 10.1146/annurev-physchem-083122-115001] [Indexed: 02/21/2025]
Abstract
This article reviews the concepts and methods of variational path sampling. These methods allow computational studies of rare events in systems driven arbitrarily far from equilibrium. Based upon a statistical mechanics of trajectory space and leveraging the theory of large deviations, they provide a perspective from which dynamical phenomena can be studied with the same types of ensemble reweighting ideas that have been used for static equilibrium properties. Applications to chemical, material, and biophysical systems are highlighted.
Affiliation(s)
- Aditya N Singh
  - Department of Chemistry, University of California, Berkeley, California, USA
- Avishek Das
  - Department of Chemistry, University of California, Berkeley, California, USA
  - Current affiliation: Fundamental Research on Matter Institute for Atomic and Molecular Physics (AMOLF), Amsterdam, The Netherlands
- David T Limmer
  - Department of Chemistry, University of California, Berkeley, California, USA
  - Chemical Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, California, USA
  - Kavli Energy Nanoscience Institute, University of California, Berkeley, California, USA
  - Material Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, California, USA
9
Aranganathan A, Gu X, Wang D, Vani BP, Tiwary P. Modeling Boltzmann-weighted structural ensembles of proteins using artificial intelligence-based methods. Curr Opin Struct Biol 2025; 91:103000. [PMID: 39923288 PMCID: PMC12011212 DOI: 10.1016/j.sbi.2025.103000] [Received: 09/23/2024] [Revised: 01/09/2025] [Accepted: 01/20/2025] [Indexed: 02/11/2025]
Abstract
This review highlights recent advances in AI-driven methods for generating Boltzmann-weighted structural ensembles, which are crucial for understanding biomolecular dynamics and drug discovery. With the rise of deep learning models such as AlphaFold2, there has been a shift toward more accurate and efficient sampling of structural ensembles. The review discusses the integration of AI with traditional molecular dynamics techniques as well as experiments, the challenges of conformational sampling, and future directions for AI-driven research in structural biology, particularly in drug discovery and protein dynamics.
Affiliation(s)
- Akashnathan Aranganathan
  - Biophysics Program, University of Maryland, College Park, 20742, MD, USA
  - Institute of Physical Science and Technology, University of Maryland, College Park, 20742, MD, USA
- Xinyu Gu
  - Institute of Physical Science and Technology, University of Maryland, College Park, 20742, MD, USA
  - University of Maryland Institute for Health Computing, Bethesda, 20852, MD, USA
- Dedi Wang
  - Genentech, 1 DNA Way, South San Francisco, 94080, CA, USA
- Bodhi P Vani
  - Genentech, 1 DNA Way, South San Francisco, 94080, CA, USA
- Pratyush Tiwary
  - Institute of Physical Science and Technology, University of Maryland, College Park, 20742, MD, USA
  - University of Maryland Institute for Health Computing, Bethesda, 20852, MD, USA
  - Department of Chemistry and Biochemistry, University of Maryland, College Park, 20742, MD, USA
10
Ruzmetov T, Hung TI, Jonnalagedda SP, Chen SH, Fasihianifard P, Guo Z, Bhanu B, Chang CEA. Sampling Conformational Ensembles of Highly Dynamic Proteins via Generative Deep Learning. J Chem Inf Model 2025; 65:2487-2502. [PMID: 39984300 DOI: 10.1021/acs.jcim.4c01838] [Indexed: 02/23/2025]
Abstract
Proteins are inherently dynamic, and their conformational ensembles play a crucial role in biological function. Large-scale motions may govern the protein structure-function relationship, and numerous transient but stable conformations of intrinsically disordered proteins (IDPs) can play a crucial role in biological function. Investigating conformational ensembles to understand regulations and disease-related aggregations of IDPs is challenging, both experimentally and computationally. In this paper, we first introduce a deep learning-based model, termed Internal Coordinate Net (ICoN), which learns the physical principles of conformational changes from molecular dynamics simulation data. Second, we selected data points through interpolation in the learned latent space to rapidly identify novel synthetic conformations with sophisticated and large-scale side chains and backbone arrangements. Third, with the highly dynamic amyloid-β1-42 (Aβ42) monomer, our deep learning model provided a comprehensive sampling of Aβ42's conformational landscape. Analysis of these synthetic conformations revealed conformational clusters that could be used to rationalize experimental findings. Additionally, the method can identify novel conformations with important interactions in atomistic details that are not included in the training data. New synthetic conformations showed distinct side chain rearrangements that are probed by our electron paramagnetic resonance and amino acid substitution studies. This approach is highly transferable and can be used for any available data for training. The work also demonstrated the ability of deep learning to utilize natural atomistic motions in protein conformation sampling.
Affiliation(s)
- Talant Ruzmetov
  - Department of Chemistry, University of California, Riverside, California 92521, United States
- Ta I Hung
  - Department of Chemistry, University of California, Riverside, California 92521, United States
  - Department of Bioengineering, University of California, Riverside, California 92521, United States
- Saisri Padmaja Jonnalagedda
  - Department of Electrical and Computer Engineering, University of California, Riverside, California 92521, United States
- Si-Han Chen
  - Department of Chemistry, University of California, Riverside, California 92521, United States
- Parisa Fasihianifard
  - Department of Chemistry, University of California, Riverside, California 92521, United States
- Zhefeng Guo
  - Department of Neurology, Brain Research Institute, University of California, Los Angeles, California 90095, United States
- Bir Bhanu
  - Department of Bioengineering, University of California, Riverside, California 92521, United States
  - Department of Electrical and Computer Engineering, University of California, Riverside, California 92521, United States
- Chia-En A Chang
  - Department of Chemistry, University of California, Riverside, California 92521, United States
  - Department of Bioengineering, University of California, Riverside, California 92521, United States
11
Paloncýová M, Valério M, Dos Santos RN, Kührová P, Šrejber M, Čechová P, Dobchev DA, Balsubramani A, Banáš P, Agarwal V, Souza PCT, Otyepka M. Computational Methods for Modeling Lipid-Mediated Active Pharmaceutical Ingredient Delivery. Mol Pharm 2025; 22:1110-1141. [PMID: 39879096 PMCID: PMC11881150 DOI: 10.1021/acs.molpharmaceut.4c00744] [Received: 07/06/2024] [Revised: 01/06/2025] [Accepted: 01/06/2025] [Indexed: 01/31/2025]
Abstract
Lipid-mediated delivery of active pharmaceutical ingredients (API) opened new possibilities in advanced therapies. By encapsulating an API into a lipid nanocarrier (LNC), one can safely deliver APIs not soluble in water, those with otherwise strong adverse effects, or very fragile ones such as nucleic acids. However, for the rational design of LNCs, a detailed understanding of the composition-structure-function relationships is missing. This review presents currently available computational methods for LNC investigation, screening, and design. The state-of-the-art physics-based approaches are described, with the focus on molecular dynamics simulations in all-atom and coarse-grained resolution. Their strengths and weaknesses are discussed, highlighting the aspects necessary for obtaining reliable results in the simulations. Furthermore, a machine learning, i.e., data-based learning, approach to the design of lipid-mediated API delivery is introduced. The data produced by the experimental and theoretical approaches provide valuable insights. Processing these data can help optimize the design of LNCs for better performance. In the final section of this Review, the state of the art in computer simulations of LNCs is reviewed, specifically addressing the compatibility of experimental and computational insights.
Affiliation(s)
- Markéta Paloncýová
  - Regional Center of Advanced Technologies and Materials, Czech Advanced Technology and Research Institute (CATRIN), Palacký University Olomouc, Šlechtitelů 27, 779 00 Olomouc, Czech Republic
- Mariana Valério
  - Laboratoire de Biologie et Modélisation de la Cellule, CNRS, UMR 5239, Inserm, U1293, Université Claude Bernard Lyon 1, Ecole Normale Supérieure de Lyon, 46 Allée d’Italie, 69364 Lyon, France
  - Centre Blaise Pascal de Simulation et de Modélisation Numérique, Ecole Normale Supérieure de Lyon, 46 Allée d’Italie, 69364 Lyon, France
- Petra Kührová
  - Regional Center of Advanced Technologies and Materials, Czech Advanced Technology and Research Institute (CATRIN), Palacký University Olomouc, Šlechtitelů 27, 779 00 Olomouc, Czech Republic
- Martin Šrejber
  - Regional Center of Advanced Technologies and Materials, Czech Advanced Technology and Research Institute (CATRIN), Palacký University Olomouc, Šlechtitelů 27, 779 00 Olomouc, Czech Republic
- Petra Čechová
  - Regional Center of Advanced Technologies and Materials, Czech Advanced Technology and Research Institute (CATRIN), Palacký University Olomouc, Šlechtitelů 27, 779 00 Olomouc, Czech Republic
- Akshay Balsubramani
  - mRNA Center of Excellence, Sanofi, Waltham, Massachusetts 02451, United States
- Pavel Banáš
  - Regional Center of Advanced Technologies and Materials, Czech Advanced Technology and Research Institute (CATRIN), Palacký University Olomouc, Šlechtitelů 27, 779 00 Olomouc, Czech Republic
- Vikram Agarwal
  - mRNA Center of Excellence, Sanofi, Waltham, Massachusetts 02451, United States
- Paulo C. T. Souza
  - Laboratoire de Biologie et Modélisation de la Cellule, CNRS, UMR 5239, Inserm, U1293, Université Claude Bernard Lyon 1, Ecole Normale Supérieure de Lyon, 46 Allée d’Italie, 69364 Lyon, France
  - Centre Blaise Pascal de Simulation et de Modélisation Numérique, Ecole Normale Supérieure de Lyon, 46 Allée d’Italie, 69364 Lyon, France
- Michal Otyepka
  - Regional Center of Advanced Technologies and Materials, Czech Advanced Technology and Research Institute (CATRIN), Palacký University Olomouc, Šlechtitelů 27, 779 00 Olomouc, Czech Republic
  - IT4Innovations, VŠB − Technical University of Ostrava, 17. listopadu 2172/15, 708 00 Ostrava-Poruba, Czech Republic
12
Cui Q. Machine learning in molecular biophysics: Protein allostery, multi-level free energy simulations, and lipid phase transitions. Biophysics Reviews 2025; 6:011305. [PMID: 39957913 PMCID: PMC11825181 DOI: 10.1063/5.0248589] [Received: 11/12/2024] [Accepted: 01/14/2025] [Indexed: 02/18/2025]
Abstract
Machine learning (ML) techniques have been making major impacts on all areas of science and engineering, including biophysics. In this review, we discuss several applications of ML to biophysical problems based on our recent research. The topics include the use of ML techniques to identify hotspot residues in allosteric proteins using deep mutational scanning data and to analyze how mutations of these hotspots perturb co-operativity in the framework of a statistical thermodynamic model, to improve the accuracy of free energy simulations by integrating data from different levels of potential energy functions, and to determine the phase transition temperature of lipid membranes. Through these examples, we illustrate the unique value of ML in extracting patterns or parameters from complex data sets, as well as the remaining limitations. By implementing the ML approaches in the context of physically motivated models or computational frameworks, we are able to gain a deeper mechanistic understanding or better convergence in numerical simulations. We conclude by briefly discussing how the introduced models can be further expanded to tackle more complex problems.
Affiliation(s)
- Qiang Cui
  - Author to whom correspondence should be addressed:
13
Cui Q. Identification and understanding of allostery hotspots in proteins: Integration of deep mutational scanning and multi-faceted computational analyses. J Mol Biol 2025:168998. [PMID: 39952349 DOI: 10.1016/j.jmb.2025.168998] [Received: 11/24/2024] [Revised: 01/19/2025] [Accepted: 02/08/2025] [Indexed: 02/17/2025]
Abstract
Motivated by recent deep mutational scanning (DMS) experiments, we have carried out a diverse set of computations to better understand the distribution and contributions of allostery hotspot residues in a transcription factor, TetR. These include extensive atomistic simulations and free energy computations for different functional states of TetR, machine learning analysis of the DMS data, and a statistical thermodynamic model for the experimental induction data for the WT protein and a handful of hotspot mutants. Collectively, these computations provided insights into the structural and energetic basis of allostery in TetR, and the distinct contributions of allostery hotspots. The results highlight that the allostery function (i.e., the induction activity) of TetR can be modulated by perturbing both inter-domain coupling and intra-domain properties, such as the population of the binding-competent conformation of each domain. This mechanistic degeneracy qualitatively explains the broad distribution of allostery hotspots across the protein structure observed in the DMS experiments, and also informs the design of strategies aimed at identifying allostery hotspots. The mechanistic framework and the multi-faceted computational approaches are expected to be applicable to the analysis of other allostery systems, especially those sharing a similar two-domain structural topology, and to the design of allostery modulators.
Affiliation(s)
- Qiang Cui
- Departments of Chemistry, Physics and Biomedical Engineering, Boston University, 590 Commonwealth Avenue, Boston 02215, MA, USA

14
Li J, Knijff L, Zhang ZY, Andersson L, Zhang C. PiNN: Equivariant Neural Network Suite for Modeling Electrochemical Systems. J Chem Theory Comput 2025; 21:1382-1395. [PMID: 39883580 PMCID: PMC11823406 DOI: 10.1021/acs.jctc.4c01570] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2024] [Revised: 01/07/2025] [Accepted: 01/23/2025] [Indexed: 02/01/2025]
Abstract
Electrochemical energy storage and conversion play increasingly important roles in electrification and sustainable development across the globe. A key challenge therein is to understand, control, and design electrochemical energy materials with atomistic precision. This requires inputs from molecular modeling powered by machine learning (ML) techniques. In this work, we have upgraded our pairwise interaction neural network Python package PiNN by introducing equivariant features to the PiNet2 architecture for fitting potential energy surfaces, along with PiNet2-dipole for dipole and charge predictions and PiNet2-χ for generating atom-condensed charge response kernels. By benchmarking publicly accessible data sets of small molecules, crystalline materials, and liquid electrolytes, we found that the equivariant PiNet2 shows significant improvements over the original PiNet architecture and provides a state-of-the-art overall performance. Furthermore, leveraging plug-ins such as PiNNAcLe for an adaptive learn-on-the-fly workflow in generating ML potentials and PiNNwall for modeling heterogeneous electrodes under external bias, we expect PiNN to serve as a versatile and high-performing ML-accelerated platform for molecular modeling of electrochemical systems.
Affiliation(s)
- Jichen Li
- Department of Chemistry-Ångström Laboratory, Uppsala University, Lägerhyddsvägen 1, P.O. Box 538, 75121 Uppsala, Sweden
- Lisanne Knijff
- Department of Chemistry-Ångström Laboratory, Uppsala University, Lägerhyddsvägen 1, P.O. Box 538, 75121 Uppsala, Sweden
- Zhan-Yun Zhang
- Department of Chemistry-Ångström Laboratory, Uppsala University, Lägerhyddsvägen 1, P.O. Box 538, 75121 Uppsala, Sweden
- Wallenberg Initiative Materials Science for Sustainability, Uppsala University, 75121 Uppsala, Sweden
- Linnéa Andersson
- Department of Chemistry-Ångström Laboratory, Uppsala University, Lägerhyddsvägen 1, P.O. Box 538, 75121 Uppsala, Sweden
- Chao Zhang
- Department of Chemistry-Ångström Laboratory, Uppsala University, Lägerhyddsvägen 1, P.O. Box 538, 75121 Uppsala, Sweden
- Wallenberg Initiative Materials Science for Sustainability, Uppsala University, 75121 Uppsala, Sweden

15
|
Mitra S, Biswas R, Chakrabarty S. WeTICA: A directed search weighted ensemble based enhanced sampling method to estimate rare event kinetics in a reduced dimensional space. J Chem Phys 2025; 162:034106. [PMID: 39812249 DOI: 10.1063/5.0239713] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2024] [Accepted: 12/30/2024] [Indexed: 01/16/2025] Open
Abstract
Estimating rare event kinetics from molecular dynamics simulations is a non-trivial task despite the great advances in enhanced sampling methods. Weighted Ensemble (WE) simulation, a special class of enhanced sampling techniques, offers a way to directly calculate kinetic rate constants from biased trajectories without the need to modify the underlying energy landscape using bias potentials. Conventional WE algorithms use different binning schemes to partition the collective variable (CV) space separating the two metastable states of interest. In this work, we have developed a new "binless" WE simulation algorithm to bypass the hurdles of optimizing binning procedures. Our proposed protocol (WeTICA) uses a low-dimensional CV space to drive the WE simulation toward the specified target state. We have applied this new algorithm to recover the unfolding kinetics of three proteins: (A) TC5b Trp-cage mutant, (B) TC10b Trp-cage mutant, and (C) Protein G, with unfolding times spanning the range between 3 and 40 μs using projections along predefined fixed Time-lagged Independent Component Analysis (TICA) eigenvectors as CVs. Calculated unfolding times converge to the reported values with good accuracy with more than one order of magnitude less cumulative WE simulation time than the unfolding time scales with or without a priori knowledge of the CVs that can capture unfolding. Our algorithm can be used with other linear CVs, not limited to TICA. Moreover, the new walker selection criteria for resampling employed in this algorithm can be used on more sophisticated nonlinear CV space for further improvements of binless WE methods.
Affiliation(s)
- Sudipta Mitra
- Department of Chemical and Biological Sciences, S. N. Bose National Centre for Basic Sciences, Block-JD, Sector-III, Salt Lake, Kolkata 700106, India
- Ranjit Biswas
- Department of Chemical and Biological Sciences, S. N. Bose National Centre for Basic Sciences, Block-JD, Sector-III, Salt Lake, Kolkata 700106, India
- Suman Chakrabarty
- Department of Chemical and Biological Sciences, S. N. Bose National Centre for Basic Sciences, Block-JD, Sector-III, Salt Lake, Kolkata 700106, India

16
Temmerman W, Goeminne R, Rawat KS, Van Speybroeck V. Computational Modeling of Reticular Materials: The Past, the Present, and the Future. Adv Mater 2024:e2412005. [PMID: 39723710 DOI: 10.1002/adma.202412005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2024] [Revised: 11/22/2024] [Indexed: 12/28/2024]
Abstract
Reticular materials rely on a unique building concept where inorganic and organic building units are stitched together, giving access to an almost limitless number of structured ordered porous materials. Given the versatility of chemical elements, underlying nets, and topologies, reticular materials provide a unique platform to design materials for timely technological applications. Reticular materials have now found their way into important societal applications, like carbon capture to address climate change, water harvesting to extract atmospheric moisture in arid environments, and clean energy applications. Combining predictions from computational materials chemistry with advanced experimental characterization and synthesis procedures unlocks a design strategy to synthesize new materials with the desired properties and functions. Within this review, the current status of modeling reticular materials is addressed and supplemented with topical examples highlighting the necessity of advanced molecular modeling to design materials for technological applications. This review is structured as a templated molecular modeling study starting from the molecular structure of a realistic material towards the prediction of properties and functions of the materials. At the end, the authors provide their perspective on the past, present, and future of modeling reticular materials and formulate open challenges to inspire future model and method developments.
Affiliation(s)
- Wim Temmerman
- Center for Molecular Modeling (CMM), Ghent University, Technologiepark 46, Zwijnaarde, 9052, Belgium
- Ruben Goeminne
- Center for Molecular Modeling (CMM), Ghent University, Technologiepark 46, Zwijnaarde, 9052, Belgium
- Kuber Singh Rawat
- Center for Molecular Modeling (CMM), Ghent University, Technologiepark 46, Zwijnaarde, 9052, Belgium
- Veronique Van Speybroeck
- Center for Molecular Modeling (CMM), Ghent University, Technologiepark 46, Zwijnaarde, 9052, Belgium

17
Ali AAAI, Dorbath E, Stock G. Allosteric Communication Mediated by Protein Contact Clusters: A Dynamical Model. J Chem Theory Comput 2024; 20:10731-10739. [PMID: 39576941 DOI: 10.1021/acs.jctc.4c01188] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2024]
Abstract
Describing the puzzling phenomenon of long-range communication between distant protein sites, allostery is of paramount importance in biomolecular regulation and signal transduction. It is commonly assumed to arise from a conformational rearrangement of the protein, although the underlying dynamical process has remained largely elusive. This study introduces a dynamical model of allosteric communication based on "contact clusters" (localized groups of highly correlated contacts that facilitate interactions between secondary structures). The model shows that allostery involves a multistep process with cooperative contact changes within clusters and communication between distant clusters mediated by rigid secondary structures. Considering time-dependent experiments on a photoswitchable PDZ3 domain, extensive (in total ∼500 μs) molecular dynamics simulations are conducted that directly monitor the photoinduced allosteric transition. The structural reorganization is illustrated by the time evolution of the contact clusters and the ligand, which effects the nonlocal coupling between distant clusters. A time scale analysis reveals dynamics from nano- to microseconds, which are in excellent agreement with the experimentally measured time scales. While the simulation of larger systems may require enhanced sampling techniques, it is expected that the general picture of allostery mediated by communicating contact clusters will still be applicable.
Affiliation(s)
- Ahmed A A I Ali
- Biomolecular Dynamics, Institute of Physics, University of Freiburg, 79104 Freiburg, Germany
- Emanuel Dorbath
- Biomolecular Dynamics, Institute of Physics, University of Freiburg, 79104 Freiburg, Germany
- Gerhard Stock
- Biomolecular Dynamics, Institute of Physics, University of Freiburg, 79104 Freiburg, Germany

18
Wang D, Tiwary P. Augmenting Human Expertise in Weighted Ensemble Simulations through Deep Learning-Based Information Bottleneck. J Chem Theory Comput 2024; 20:10371-10383. [PMID: 39589127 DOI: 10.1021/acs.jctc.4c00919] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/27/2024]
Abstract
The weighted ensemble (WE) method stands out as a widely used segment-based sampling technique renowned for its rigorous treatment of kinetics. The WE framework typically involves initially mapping the configuration space onto a low-dimensional collective variable (CV) space and then partitioning it into bins. The efficacy of WE simulations heavily depends on the selection of CVs and binning schemes. The recently proposed state predictive information bottleneck (SPIB) method has emerged as a promising tool for automatically constructing CVs from data and guiding enhanced sampling in an iterative manner. In this work, we advance this data-driven pipeline by incorporating prior expert knowledge. Our hybrid approach combines SPIB-learned CVs to enhance sampling in explored regions with expert-based CVs to guide exploration in regions of interest, synergizing the strengths of both methods. Through benchmarking on alanine dipeptide and chignolin systems, we demonstrate that our hybrid approach effectively guides WE simulations to sample states of interest and reduces run-to-run variances. Moreover, our integration of the SPIB model also enhances the analysis and interpretation of WE simulation data by effectively identifying metastable states and pathways and offering direct visualization of dynamics.
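The bin-then-resample step that this abstract describes can be illustrated with a minimal sketch. It is a toy, one-dimensional version of generic WE split/merge resampling, not the authors' SPIB-guided workflow; the function names `resample_bin` and `we_resample`, the bin edges, and the target count per bin are all hypothetical choices made for illustration.

```python
import random

def resample_bin(walkers, target, rng):
    """Split or merge weighted walkers in one bin until `target` remain.

    walkers: list of (cv_value, weight) tuples. Total weight is conserved.
    """
    walkers = list(walkers)
    # Split: replace the heaviest walker with two copies of half its weight.
    while len(walkers) < target:
        walkers.sort(key=lambda w: w[1])
        x, w = walkers.pop()
        walkers += [(x, w / 2), (x, w / 2)]
    # Merge: combine the two lightest walkers; the survivor is picked with
    # probability proportional to its weight and inherits the summed weight.
    while len(walkers) > target:
        walkers.sort(key=lambda w: w[1])
        (xa, wa), (xb, wb) = walkers[0], walkers[1]
        keep = xa if rng.random() < wa / (wa + wb) else xb
        walkers = [(keep, wa + wb)] + walkers[2:]
    return walkers

def we_resample(walkers, edges, target, rng):
    """Assign walkers to CV bins (sorted `edges`) and resample each occupied bin."""
    bins = {}
    for x, w in walkers:
        idx = sum(x >= e for e in edges)  # index of the bin containing x
        bins.setdefault(idx, []).append((x, w))
    out = []
    for idx in sorted(bins):
        out += resample_bin(bins[idx], target, rng)
    return out

# Hypothetical example: two bins split at CV = 0.5, two walkers per bin.
rng = random.Random(0)
ensemble = [(0.1, 0.5), (0.2, 0.3), (0.9, 0.2)]
resampled = we_resample(ensemble, edges=[0.5], target=2, rng=rng)
```

Because splitting and merging only redistribute weight, the total probability carried by the ensemble is unchanged, which is why WE can report unbiased kinetics without adding a bias potential.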
Affiliation(s)
- Dedi Wang
- Biophysics Program and Institute for Physical Science and Technology, University of Maryland, College Park, Maryland 20742, United States
- Pratyush Tiwary
- Department of Chemistry and Biochemistry and Institute for Physical Science and Technology, University of Maryland, College Park, Maryland 20742, United States
- University of Maryland Institute for Health Computing, Bethesda, Maryland 20852, United States

19
Ruzmetov T, Hung TI, Jonnalagedda SP, Chen SH, Fasihianifard P, Guo Z, Bhanu B, Chang CEA. Sampling Conformational Ensembles of Highly Dynamic Proteins via Generative Deep Learning. bioRxiv 2024:2024.05.05.592587. [PMID: 38979147 PMCID: PMC11230202 DOI: 10.1101/2024.05.05.592587] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/10/2024]
Abstract
Proteins are inherently dynamic, and their conformational ensembles are functionally important in biology. Large-scale motions may govern protein structure-function relationships, and numerous transient but stable conformations of Intrinsically Disordered Proteins (IDPs) can play a crucial role in biological function. Investigating conformational ensembles to understand regulations and disease-related aggregations of IDPs is challenging both experimentally and computationally. In this paper, we first introduce a deep learning-based model, termed Internal Coordinate Net (ICoN), which learns the physical principles of conformational changes from Molecular Dynamics (MD) simulation data. Second, we selected interpolating data points in the learned latent space that rapidly identify novel synthetic conformations with sophisticated and large-scale sidechain and backbone arrangements. Third, with the highly dynamic amyloid-β 1-42 (Aβ42) monomer, our deep learning model provided a comprehensive sampling of Aβ42's conformational landscape. Analysis of these synthetic conformations revealed conformational clusters that can be used to rationalize experimental findings. Additionally, the method can identify novel conformations with important interactions in atomistic details that are not included in the training data. New synthetic conformations showed distinct sidechain rearrangements that are probed by our EPR and amino acid substitution studies. This approach is highly transferable and can be used for any available data for training. The work also demonstrated the ability of deep learning to utilize learned natural atomistic motions in protein conformation sampling.

20
Wang D, Tiwary P. Augmenting Human Expertise in Weighted Ensemble Simulations through Deep Learning based Information Bottleneck. arXiv 2024:arXiv:2406.14839v2. [PMID: 38947925 PMCID: PMC11213147] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/02/2024]
Abstract
The weighted ensemble (WE) method stands out as a widely used segment-based sampling technique renowned for its rigorous treatment of kinetics. The WE framework typically involves initially mapping the configuration space onto a low-dimensional collective variable (CV) space and then partitioning it into bins. The efficacy of WE simulations heavily depends on the selection of CVs and binning schemes. The recently proposed State Predictive Information Bottleneck (SPIB) method has emerged as a promising tool for automatically constructing CVs from data and guiding enhanced sampling in an iterative manner. In this work, we advance this data-driven pipeline by incorporating prior expert knowledge. Our hybrid approach combines SPIB-learned CVs to enhance sampling in explored regions with expert-based CVs to guide exploration in regions of interest, synergizing the strengths of both methods. Through benchmarking on alanine dipeptide and chignolin systems, we demonstrate that our hybrid approach effectively guides WE simulations to sample states of interest, and reduces run-to-run variances. Moreover, our integration of the SPIB model also enhances the analysis and interpretation of WE simulation data by effectively identifying metastable states and pathways, and offering direct visualization of dynamics.
Affiliation(s)
- Dedi Wang
- Biophysics Program and Institute for Physical Science and Technology, University of Maryland, College Park 20742, USA
- Pratyush Tiwary
- Department of Chemistry and Biochemistry and Institute for Physical Science and Technology, University of Maryland, College Park 20742, USA
- University of Maryland Institute for Health Computing, Bethesda 20852, USA

21
Mitchell AR, Rotskoff GM. Committor Guided Estimates of Molecular Transition Rates. J Chem Theory Comput 2024; 20:9378-9393. [PMID: 39420582 DOI: 10.1021/acs.jctc.4c00997] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2024]
Abstract
The probability that a configuration of a physical system reacts, or transitions from one metastable state to another, is quantified by the committor function. This function contains richly detailed mechanistic information about transition pathways, but a full parametrization of the committor requires the construction of a high-dimensional function, a generically challenging task. Recent efforts to leverage neural networks as a means to solve high-dimensional partial differential equations, often called "physics-informed" machine learning, have brought the committor into computational reach. Here, we build on the semigroup approach to learning the committor and assess its utility for predicting dynamical quantities such as transition rates. We show that a careful reframing of the objective function and improved adaptive sampling strategies provide highly accurate representations of the committor. Furthermore, by directly applying the Hill relation, we show that these committors provide accurate transition rates for molecular systems.
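The Hill relation mentioned in this abstract states that the mean first-passage time from A to B equals the inverse of the steady-state probability flux into B when trajectories reaching B are immediately recycled to A. The sketch below checks this on a small discrete-time Markov chain. It is only an illustration of the relation itself, not the committor-based estimator of the paper; the three-state transition matrix and the function names `hill_rate` and `mfpt` are hypothetical.

```python
def hill_rate(T, source, target, n_iter=5000):
    """Steady-state flux into `target` with recycling to `source` (per step)."""
    n = len(T)
    # Recycling: any step that would enter the target is redirected to the source.
    R = [row[:] for row in T]
    for i in range(n):
        R[i][source] += R[i][target]
        R[i][target] = 0.0
    # Power iteration for the stationary distribution of the recycled chain.
    pi = [0.0] * n
    pi[source] = 1.0
    for _ in range(n_iter):
        pi = [sum(pi[i] * R[i][j] for i in range(n)) for j in range(n)]
    # Flux into the target uses the original transition probabilities.
    return sum(pi[i] * T[i][target] for i in range(n) if i != target)

def mfpt(T, source, target, n_iter=5000):
    """Mean first-passage time (in steps) via fixed-point iteration of m = 1 + Q m."""
    n = len(T)
    m = [0.0] * n
    for _ in range(n_iter):
        m = [0.0 if i == target else
             1.0 + sum(T[i][j] * m[j] for j in range(n) if j != target)
             for i in range(n)]
    return m[source]

# Hypothetical chain: state 0 = A, state 1 = intermediate, state 2 = B.
T = [[0.9, 0.1, 0.0],
     [0.2, 0.7, 0.1],
     [0.0, 0.0, 1.0]]
rate = hill_rate(T, source=0, target=2)
time_to_B = mfpt(T, source=0, target=2)
```

For this matrix the two routes agree: the recycled steady state puts probability 1/4 on the intermediate state, giving a flux of 0.025 per step, and the directly computed mean first-passage time is 40 steps.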
Affiliation(s)
- Andrew R Mitchell
- Department of Chemistry, Stanford University, Stanford, California 94305, United States
- Grant M Rotskoff
- Department of Chemistry, Stanford University, Stanford, California 94305, United States

22
Yang DT, Goldberg AM, Chong LT. Rare-Event Sampling using a Reinforcement Learning-Based Weighted Ensemble Method. bioRxiv 2024:2024.10.09.617475. [PMID: 39416089 PMCID: PMC11482931 DOI: 10.1101/2024.10.09.617475] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2024]
Abstract
Despite the power of path sampling strategies in enabling simulations of rare events, such strategies have not reached their full potential. A common challenge that remains is the identification of a progress coordinate that captures the slow relevant motions of a rare event. Here we have developed a weighted ensemble (WE) path sampling strategy that exploits reinforcement learning to automatically identify an effective progress coordinate among a set of potential coordinates during a simulation. We apply our WE strategy with reinforcement learning to three benchmark systems: (i) an egg carton-shaped toy potential, (ii) an S-shaped toy potential, and (iii) a dimer of the HIV-1 capsid protein (C-terminal domain). To enable rapid testing of the latter system at the atomic level, we employed discrete-state synthetic molecular dynamics trajectories using a generative, fine-grained Markov state model that was based on extensive conventional simulations. Our results demonstrate that using concepts from reinforcement learning with a weighted ensemble of trajectories automatically identifies relevant progress coordinates among multiple candidates at a given time during a simulation. Due to the rigorous weighting of trajectories, the simulations maintain rigorous kinetics.
Affiliation(s)
- Darian T. Yang
- Molecular Biophysics and Structural Biology Graduate Program, University of Pittsburgh and Carnegie Mellon University, Pittsburgh, Pennsylvania 15260
- Department of Structural Biology, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania 15260
- Department of Chemistry, University of Pittsburgh, Pittsburgh, Pennsylvania 15260
- Alex M. Goldberg
- Department of Chemistry, University of Pittsburgh, Pittsburgh, Pennsylvania 15260
- Lillian T. Chong
- Department of Chemistry, University of Pittsburgh, Pittsburgh, Pennsylvania 15260

23
Javed R, Kapakayala AB, Nair NN. Buckets Instead of Umbrellas for Enhanced Sampling and Free Energy Calculations. J Chem Theory Comput 2024; 20:8450-8460. [PMID: 39344058 DOI: 10.1021/acs.jctc.4c00776] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/01/2024]
Abstract
Umbrella sampling has been a workhorse for free energy calculations in molecular simulations for several decades. In conventional umbrella sampling, restraining bias potentials are strategically applied along one or several collective variables. Major drawbacks associated with this method are the requirement of a large number of bias windows and the poor sampling of the transverse coordinates. In this work, we propose an alternate formalism that departs from the traditional umbrella sampling to mitigate these issues, where we replace umbrella-type restraining bias potentials with bucket-type wall potentials. This modification permits one to formulate an efficient computational strategy leveraging wall potentials and metadynamics sampling. This new method, called "bucket sampling", can significantly reduce the computational cost of obtaining converged high-dimensional free energy surfaces. Extensions of the proposed method with temperature acceleration and replica exchange solute tempering are also demonstrated.
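The contrast between an umbrella-type restraint and a bucket-type wall can be made concrete with a short sketch. The umbrella bias is the standard harmonic restraint centred on a window; the bucket form shown here, flat inside an interval with harmonic walls outside it, is an assumption made for illustration, since the abstract does not give the functional form used in the paper. The function names and parameter values are hypothetical.

```python
def umbrella_bias(s, center, k):
    """Harmonic umbrella restraint: penalizes any displacement from `center`."""
    return 0.5 * k * (s - center) ** 2

def bucket_bias(s, lower, upper, k):
    """Assumed bucket-type wall: zero inside [lower, upper], harmonic outside."""
    if s < lower:
        return 0.5 * k * (lower - s) ** 2
    if s > upper:
        return 0.5 * k * (s - upper) ** 2
    return 0.0

# Hypothetical collective-variable values and force constant.
k = 100.0
inside = bucket_bias(1.5, lower=1.0, upper=2.0, k=k)    # no bias inside the bucket
outside = bucket_bias(2.5, lower=1.0, upper=2.0, k=k)   # wall penalty beyond the edge
restrained = umbrella_bias(1.5, center=1.0, k=k)        # umbrella always penalizes
```

Inside the bucket the system feels no restraining force, so another method such as metadynamics can drive sampling there, whereas an umbrella window pulls the system toward its centre everywhere.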
Affiliation(s)
- Ramsha Javed
- Department of Chemistry, Indian Institute of Technology Kanpur, Kanpur 208016, India
- Anji Babu Kapakayala
- Department of Chemistry, Indian Institute of Technology Kanpur, Kanpur 208016, India
- Nisanth N Nair
- Department of Chemistry, Indian Institute of Technology Kanpur, Kanpur 208016, India

24
Parise A, Cresca S, Magistrato A. Molecular dynamics simulations for the structure-based drug design: targeting small-GTPases proteins. Expert Opin Drug Discov 2024; 19:1259-1279. [PMID: 39105536 DOI: 10.1080/17460441.2024.2387856] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2024] [Accepted: 07/30/2024] [Indexed: 08/07/2024]
Abstract
INTRODUCTION: Molecular Dynamics (MD) simulations can support mechanism-based drug design. Indeed, by capturing biomolecule motions at finite temperatures, MD simulations can reveal hidden binding sites, accurately predict drug-binding poses, and estimate thermodynamics and kinetics, all crucial information for drug discovery campaigns. Small-Guanosine Triphosphate Phosphohydrolases (GTPases) regulate a cascade of signaling events that affect most cellular processes. Their deregulation is linked to several diseases, making them appealing drug targets. The broad roles of small-GTPases in cellular processes and the recent approval of a covalent KRas inhibitor as an anticancer agent renewed interest in targeting small-GTPases with small molecules.
AREA COVERED: This review emphasizes the role of MD simulations in elucidating small-GTPase mechanisms, assessing the impact of cancer-related variants, and discovering novel inhibitors.
EXPERT OPINION: The application of MD simulations to small-GTPases exemplifies the role of MD simulations in the structure-based drug design process for challenging biomolecular targets. Furthermore, AI and machine learning-enhanced MD simulations, coupled with the upcoming power of quantum computing, are promising instruments to target elusive small-GTPase mutations and splice variants. This powerful synergy will aid in developing innovative therapeutic strategies associated with small-GTPase deregulation, which could potentially be used for personalized therapies and in a tissue-agnostic manner to treat tumors with mutations in small-GTPases.
Affiliation(s)
- Angela Parise
- Consiglio Nazionale delle Ricerche (CNR) - Istituto Officina dei Materiali (IOM), c/o International School for Advanced Studies (SISSA), Trieste, Italy
- Sofia Cresca
- Consiglio Nazionale delle Ricerche (CNR) - Istituto Officina dei Materiali (IOM), c/o International School for Advanced Studies (SISSA), Trieste, Italy
- Alessandra Magistrato
- Consiglio Nazionale delle Ricerche (CNR) - Istituto Officina dei Materiali (IOM), c/o International School for Advanced Studies (SISSA), Trieste, Italy

25
Fullenkamp CR, Mehdi S, Jones CP, Tenney L, Pichling P, Prestwood PR, Ferré-D’Amaré AR, Tiwary P, Schneekloth JS. Machine learning-augmented molecular dynamics simulations (MD) reveal insights into the disconnect between affinity and activation of ZTP riboswitch ligands. bioRxiv 2024:2024.09.13.612887. [PMID: 39314358 PMCID: PMC11419147 DOI: 10.1101/2024.09.13.612887] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/25/2024]
Abstract
The challenge of targeting RNA with small molecules necessitates a better understanding of RNA-ligand interaction mechanisms. However, the dynamic nature of nucleic acids, their ligand-induced stabilization, and how conformational changes influence gene expression pose significant difficulties for experimental investigation. This work employs a combination of computational and experimental methods to address these challenges. By integrating structure-informed design, crystallography, and machine learning-augmented all-atom molecular dynamics simulations (MD), we synthesized, biophysically and biochemically characterized, and studied the dissociation of a library of small molecule activators of the ZTP riboswitch, a ligand-binding RNA motif that regulates bacterial gene expression. We uncovered key interaction mechanisms, revealing valuable insights into the role of ligand binding kinetics in riboswitch activation. Further, we established that ligand on-rates, as opposed to binding affinity, determine activation potency, and elucidated RNA structural differences, which provide mechanistic insights into the influence of RNA structure on riboswitch activation.
Affiliation(s)
- Shams Mehdi
- Biophysics Program and Institute for Physical Science and Technology, University of Maryland, College Park 20742, USA
- Christopher P. Jones
- Laboratory of Nucleic Acids, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
- Logan Tenney
- Chemical Biology Laboratory, National Cancer Institute, Frederick, MD, USA
- Patricio Pichling
- Laboratory of Nucleic Acids, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
- Peri R. Prestwood
- Chemical Biology Laboratory, National Cancer Institute, Frederick, MD, USA
- Adrian R. Ferré-D’Amaré
- Laboratory of Nucleic Acids, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
- Pratyush Tiwary
- Department of Chemistry and Biochemistry and Institute for Physical Science and Technology, University of Maryland, College Park 20742, USA
- University of Maryland Institute for Health Computing, Bethesda, Maryland 20852, USA

26
Rydzewski J. Spectral Map for Slow Collective Variables, Markovian Dynamics, and Transition State Ensembles. J Chem Theory Comput 2024; 20. [PMID: 39265157 PMCID: PMC11428138 DOI: 10.1021/acs.jctc.4c00428] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2024] [Revised: 08/14/2024] [Accepted: 08/14/2024] [Indexed: 09/14/2024]
Abstract
Understanding the behavior of complex molecular systems is a fundamental problem in physical chemistry. To describe the long-time dynamics of such systems, which is responsible for their most informative characteristics, we can identify a few slow collective variables (CVs) while treating the remaining fast variables as thermal noise. This enables us to simplify the dynamics and treat it as diffusion in a free-energy landscape spanned by slow CVs, effectively rendering the dynamics Markovian. Our recent statistical learning technique, spectral map [Rydzewski, J. J. Phys. Chem. Lett. 2023, 14(22), 5216-5220], explores this strategy to learn slow CVs by maximizing a spectral gap of a transition matrix. In this work, we introduce several advancements into our framework, using a high-dimensional reversible folding process of a protein as an example. We implement an algorithm for coarse-graining Markov transition matrices to partition the reduced space of slow CVs kinetically and use it to define a transition state ensemble. We show that slow CVs learned by spectral map closely approach the Markovian limit for an overdamped diffusion. We demonstrate that coordinate-dependent diffusion coefficients only slightly affect the constructed free-energy landscapes. Finally, we present how spectral maps can be used to quantify the importance of features and compare slow CVs with structural descriptors commonly used in protein folding. Overall, we demonstrate that a single slow CV learned by spectral map can be used as a physical reaction coordinate to capture essential characteristics of protein folding.
Affiliation(s)
- Jakub Rydzewski
- Institute of Physics, Faculty of Physics, Astronomy and Informatics, Nicolaus Copernicus University, Grudziadzka 5, 87-100 Toruń, Poland

27
Mehdi S, Tiwary P. Thermodynamics-inspired explanations of artificial intelligence. Nat Commun 2024; 15:7859. [PMID: 39251574 PMCID: PMC11385982 DOI: 10.1038/s41467-024-51970-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Accepted: 08/20/2024] [Indexed: 09/11/2024] Open
Abstract
In recent years, predictive machine learning models have gained prominence across various scientific domains. However, their black-box nature necessitates establishing trust in them before accepting their predictions as accurate. One promising strategy involves employing explanation techniques that elucidate the rationale behind a model's predictions in a way that humans can understand. However, assessing the degree of human interpretability of these explanations is a nontrivial challenge. In this work, we introduce interpretation entropy as a universal solution for evaluating the human interpretability of any linear model. Using this concept and drawing inspiration from classical thermodynamics, we present Thermodynamics-inspired Explainable Representations of AI and other black-box Paradigms, a method for generating optimally human-interpretable explanations in a model-agnostic manner. We demonstrate the wide-ranging applicability of this method by explaining predictions from various black-box model architectures across diverse domains, including molecular simulations, text, and image classification.
Affiliation(s)
- Shams Mehdi
  - Biophysics Program and Institute for Physical Science and Technology, University of Maryland, College Park, 20742, USA
- Pratyush Tiwary
  - Department of Chemistry and Biochemistry and Institute for Physical Science and Technology, University of Maryland, College Park, 20742, USA
  - University of Maryland Institute for Health Computing, Bethesda, Maryland, 20852, USA
28
Son A, Kim W, Park J, Lee W, Lee Y, Choi S, Kim H. Utilizing Molecular Dynamics Simulations, Machine Learning, Cryo-EM, and NMR Spectroscopy to Predict and Validate Protein Dynamics. Int J Mol Sci 2024; 25:9725. [PMID: 39273672 PMCID: PMC11395565 DOI: 10.3390/ijms25179725] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2024] [Revised: 09/06/2024] [Accepted: 09/07/2024] [Indexed: 09/15/2024] Open
Abstract
Protein dynamics play a crucial role in biological function, encompassing motions ranging from atomic vibrations to large-scale conformational changes. Recent advancements in experimental techniques, computational methods, and artificial intelligence have revolutionized our understanding of protein dynamics. Nuclear magnetic resonance spectroscopy provides atomic-resolution insights, while molecular dynamics simulations offer detailed trajectories of protein motions. Computational methods applied to X-ray crystallography and cryo-electron microscopy (cryo-EM) have enabled the exploration of protein dynamics, capturing conformational ensembles that were previously unattainable. The integration of machine learning, exemplified by AlphaFold2, has accelerated structure prediction and dynamics analysis. These approaches have revealed the importance of protein dynamics in allosteric regulation, enzyme catalysis, and intrinsically disordered proteins. The shift towards ensemble representations of protein structures and the application of single-molecule techniques have further enhanced our ability to capture the dynamic nature of proteins. Understanding protein dynamics is essential for elucidating biological mechanisms, designing drugs, and developing novel biocatalysts, marking a significant paradigm shift in structural biology and drug discovery.
Affiliation(s)
- Ahrum Son
  - Department of Molecular Medicine, Scripps Research, San Diego, CA 92037, USA
- Woojin Kim
  - Department of Bio-AI Convergence, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34134, Republic of Korea
- Jongham Park
  - Department of Bio-AI Convergence, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34134, Republic of Korea
- Wonseok Lee
  - Department of Bio-AI Convergence, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34134, Republic of Korea
- Yerim Lee
  - Department of Bio-AI Convergence, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34134, Republic of Korea
- Seongyun Choi
  - Department of Convergent Bioscience and Informatics, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34134, Republic of Korea
- Hyunsoo Kim
  - Department of Bio-AI Convergence, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34134, Republic of Korea
  - Department of Convergent Bioscience and Informatics, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34134, Republic of Korea
  - Protein AI Design Institute, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34134, Republic of Korea
  - SCICS, 99 Daehak-ro, Yuseong-gu, Daejeon 34134, Republic of Korea
29
Gu X, Aranganathan A, Tiwary P. Empowering AlphaFold2 for protein conformation selective drug discovery with AlphaFold2-RAVE. eLife 2024; 13:RP99702. [PMID: 39240197 PMCID: PMC11379456 DOI: 10.7554/elife.99702] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/07/2024] Open
Abstract
Small-molecule drug design hinges on obtaining co-crystallized ligand-protein structures. Despite AlphaFold2's strides in protein native structure prediction, its focus on apo structures overlooks ligands and associated holo structures. Moreover, designing selective drugs often benefits from the targeting of diverse metastable conformations. Therefore, direct application of AlphaFold2 models in virtual screening and drug discovery remains tentative. Here, we demonstrate an AlphaFold2-based framework combined with all-atom enhanced sampling molecular dynamics and Induced Fit docking, named AF2RAVE-Glide, to conduct computational model-based small-molecule binding of metastable protein kinase conformations, initiated from protein sequences. We demonstrate the AF2RAVE-Glide workflow on three different mammalian protein kinases and their type I and II inhibitors, with special emphasis on binding of known type II kinase inhibitors which target the metastable classical DFG-out state. These states are not easy to sample from AlphaFold2. Here, we demonstrate how with AF2RAVE these metastable conformations can be sampled for different kinases with high enough accuracy to enable subsequent docking of known type II kinase inhibitors with more than 50% success rates across docking calculations. We believe the protocol should be deployable for other kinases and more proteins generally.
Affiliation(s)
- Xinyu Gu
  - Institute for Physical Science and Technology, University of Maryland, College Park, United States
  - University of Maryland Institute for Health Computing, Bethesda, United States
- Akashnathan Aranganathan
  - Institute for Physical Science and Technology, University of Maryland, College Park, United States
  - Biophysics Program, University of Maryland, College Park, United States
- Pratyush Tiwary
  - Institute for Physical Science and Technology, University of Maryland, College Park, United States
  - University of Maryland Institute for Health Computing, Bethesda, United States
  - Department of Chemistry and Biochemistry, University of Maryland, College Park, United States
30
Meraz VJ, Zou Z, Tiwary P. Simulating Crystallization in a Colloidal System Using State Predictive Information Bottleneck Based Enhanced Sampling. J Phys Chem B 2024; 128:8207-8214. [PMID: 39163635 DOI: 10.1021/acs.jpcb.4c02740] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/22/2024]
Abstract
We investigate crystal nucleation in supersaturated colloid suspensions using enhanced molecular dynamics simulations augmented with machine learning techniques. The simulations reveal that crystallization in the model colloidal system studied here, with particles interacting through a repulsive screened Coulomb Yukawa potential, proceeds from vapor to dense liquid droplet to crystalline phases across multiple high barriers. Employing a one-dimensional reaction coordinate derived from the State Predictive Information Bottleneck framework, our simulations capture back-and-forth phase transitions across multiple barriers effectively in biased metadynamics simulations. We obtain relative free energy differences between different phases and also quantify the roles of different molecular level features in driving the phase changes.
Affiliation(s)
- Vanessa J Meraz
  - Institute for Physical Science and Technology, University of Maryland, College Park, Maryland 20742, United States
- Ziyue Zou
  - Department of Chemistry and Biochemistry, University of Maryland, College Park, Maryland 20742, United States
- Pratyush Tiwary
  - Institute for Physical Science and Technology, University of Maryland, College Park, Maryland 20742, United States
  - Department of Chemistry and Biochemistry, University of Maryland, College Park, Maryland 20742, United States
  - University of Maryland Institute for Health Computing, Bethesda, Maryland 20852, United States
31
Aupič J, Pokorná P, Ruthstein S, Magistrato A. Predicting Conformational Ensembles of Intrinsically Disordered Proteins: From Molecular Dynamics to Machine Learning. J Phys Chem Lett 2024; 15:8177-8186. [PMID: 39093570 DOI: 10.1021/acs.jpclett.4c01544] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/04/2024]
Abstract
Intrinsically disordered proteins and regions (IDP/IDRs) are ubiquitous across all domains of life. Characterized by a lack of a stable tertiary structure, IDP/IDRs populate a diverse set of transiently formed structural states that can promiscuously adapt upon binding with specific interaction partners and/or certain alterations in environmental conditions. This malleability is foundational for their role as tunable interaction hubs in core cellular processes such as signaling, transcription, and translation. Tracing the conformational ensemble of an IDP/IDR and its perturbation in response to regulatory cues is thus paramount for illuminating its function. However, the conformational heterogeneity of IDP/IDRs poses several challenges. Here, we review experimental and computational methods devised to disentangle the conformational landscape of IDP/IDRs, highlighting recent computational advances that permit proteome-wide scans of IDP/IDRs conformations. We briefly evaluate selected computational methods using the disordered N-terminal of the human copper transporter 1 as a test case and outline further challenges in IDP/IDRs ensemble prediction.
Affiliation(s)
- Jana Aupič
  - CNR-IOM at International School for Advanced Studies (SISSA/ISAS), via Bonomea 265, 34136 Trieste, Italy
- Pavlína Pokorná
  - CNR-IOM at International School for Advanced Studies (SISSA/ISAS), via Bonomea 265, 34136 Trieste, Italy
- Sharon Ruthstein
  - Department of Chemistry, Faculty of Exact Sciences and the Institute for Nanotechnology and Advanced Materials (BINA), Bar-Ilan University, 5290002 Ramat-Gan, Israel
- Alessandra Magistrato
  - CNR-IOM at International School for Advanced Studies (SISSA/ISAS), via Bonomea 265, 34136 Trieste, Italy
32
Zia SR, Coricello A, Bottegoni G. Increased throughput in methods for simulating protein ligand binding and unbinding. Curr Opin Struct Biol 2024; 87:102871. [PMID: 38924980 DOI: 10.1016/j.sbi.2024.102871] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2024] [Revised: 06/03/2024] [Accepted: 06/04/2024] [Indexed: 06/28/2024]
Abstract
By incorporating full flexibility and enabling the quantification of crucial parameters such as binding free energies and residence times, methods for investigating protein-ligand binding and unbinding via molecular dynamics provide details on the involved mechanisms at the molecular level. While these advancements hold promise for impacting drug discovery, a notable drawback persists: their relatively time-consuming nature limits throughput. Herein, we survey recent implementations which, employing a blend of enhanced sampling techniques, a clever choice of collective variables, and often machine learning, strive to enhance the efficiency of new and previously reported methods without compromising accuracy. Particularly noteworthy is the validation of these methods that was often performed on systems mirroring real-world drug discovery scenarios.
Affiliation(s)
- Syeda Rehana Zia
  - Department of Paediatrics and Child Health, Faculty of Health Sciences, Medical College, The Aga Khan University, Karachi, 74800, Pakistan
- Adriana Coricello
  - Department of Biomolecular Sciences, University of Urbino Carlo Bo, Urbino, 61029, Italy
- Giovanni Bottegoni
  - Department of Biomolecular Sciences, University of Urbino Carlo Bo, Urbino, 61029, Italy; Institute of Clinical Sciences, College of Medical and Dental Sciences, University of Birmingham, B15 2TT, United Kingdom
33
Lee S, Wang D, Seeliger MA, Tiwary P. Calculating Protein-Ligand Residence Times through State Predictive Information Bottleneck Based Enhanced Sampling. J Chem Theory Comput 2024; 20:6341-6349. [PMID: 38991145 PMCID: PMC11990086 DOI: 10.1021/acs.jctc.4c00503] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/13/2024]
Abstract
Understanding drug residence times in target proteins is key to improving drug efficacy and understanding target recognition in biochemistry. While drug residence time is just as important as binding affinity, atomic-level understanding of drug residence times through molecular dynamics (MD) simulations has been difficult primarily due to the extremely long time scales. Recent advances in rare event sampling have allowed us to reach these time scales, yet predicting protein-ligand residence times remains a significant challenge. Here we present a semi-automated protocol to calculate the ligand residence times across 12 orders of magnitude of time scales. In our proposed framework, we integrate a deep learning-based method, the state predictive information bottleneck (SPIB), to learn an approximate reaction coordinate (RC) and use it to guide the enhanced sampling method metadynamics. We demonstrate the performance of our algorithm by applying it to six different protein-ligand complexes with available benchmark residence times, including the dissociation of the widely studied anticancer drug Imatinib (Gleevec) from both wild-type Abl kinase and drug-resistant mutants. We show how our protocol can recover quantitatively accurate residence times, potentially opening avenues for deeper insights into drug development possibilities and ligand recognition mechanisms.
Affiliation(s)
- Suemin Lee
  - Biophysics Program and Institute for Physical Science and Technology, University of Maryland, College Park 20742, USA
- Dedi Wang
  - Biophysics Program and Institute for Physical Science and Technology, University of Maryland, College Park 20742, USA
- Markus A. Seeliger
  - Department of Pharmacological Sciences, Stony Brook University, Stony Brook, NY 11794-8651, USA
- Pratyush Tiwary
  - Biophysics Program and Institute for Physical Science and Technology, University of Maryland, College Park 20742, USA
  - Department of Chemistry and Biochemistry and Institute for Physical Science and Technology, University of Maryland, College Park 20742, USA
  - University of Maryland Institute for Health Computing, Bethesda, Maryland 20852, USA
34
Ruzmetov T, Hung TI, Jonnalagedda SP, Chen SH, Fasihianifard P, Guo Z, Bhanu B, Chang CEA. Sampling Conformational Ensembles of Highly Dynamic Proteins via Generative Deep Learning. RESEARCH SQUARE 2024:rs.3.rs-4301803. [PMID: 38978607 PMCID: PMC11230488 DOI: 10.21203/rs.3.rs-4301803/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/10/2024]
Abstract
Proteins are inherently dynamic, and their conformational ensembles are functionally important in biology. Large-scale motions may govern protein structure-function relationship, and numerous transient but stable conformations of intrinsically disordered proteins (IDPs) can play a crucial role in biological function. Investigating conformational ensembles to understand regulations and disease-related aggregations of IDPs is challenging both experimentally and computationally. In this paper, first, an unsupervised deep learning-based model, termed Internal Coordinate Net (ICoN), is developed that learns the physical principles of conformational changes from molecular dynamics (MD) simulation data. Second, interpolating data points in the learned latent space are selected that rapidly identify novel synthetic conformations with sophisticated and large-scale sidechains and backbone arrangements. Third, with the highly dynamic amyloid-β1-42 (Aβ42) monomer, our deep learning model provided a comprehensive sampling of Aβ42's conformational landscape. Analysis of these synthetic conformations revealed conformational clusters that can be used to rationalize experimental findings. Additionally, the method can identify novel conformations with important interactions in atomistic details that are not included in the training data. New synthetic conformations showed distinct sidechain rearrangements that are probed by our EPR and amino acid substitution studies. The proposed approach is highly transferable and can be used for any available data for training. The work also demonstrated the ability for deep learning to utilize learned natural atomistic motions in protein conformation sampling.
Affiliation(s)
- Talant Ruzmetov
  - Department of Chemistry, University of California, Riverside, CA 92521
- Ta I Hung
  - Department of Chemistry, University of California, Riverside, CA 92521
  - Department of Bioengineering, University of California, Riverside, CA 92521
- Si-Han Chen
  - Department of Chemistry, University of California, Riverside, CA 92521
- Zhefeng Guo
  - Department of Neurology, Brain Research Institute, University of California, Los Angeles, CA 90095
- Bir Bhanu
  - Department of Bioengineering, University of California, Riverside, CA 92521
  - Department of Electrical and Computer Engineering, University of California, Riverside, CA 92521
- Chia-En A Chang
  - Department of Chemistry, University of California, Riverside, CA 92521
  - Department of Bioengineering, University of California, Riverside, CA 92521
35
Tänzel V, Jäger M, Wolf S. Learning Protein-Ligand Unbinding Pathways via Single-Parameter Community Detection. J Chem Theory Comput 2024; 20:5058-5067. [PMID: 38865714 DOI: 10.1021/acs.jctc.4c00250] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2024]
Abstract
Understanding the dynamics of biomolecular complexes, e.g., of protein-ligand (un)binding, requires the comprehension of paths such systems take between metastable states. In MD simulations, paths are usually not observable per se, but they need to be inferred from simulation trajectories. Here, we present a novel approach to cluster trajectories based on a community detection algorithm that necessitates only the definition of a single parameter. The unbinding of the streptavidin-biotin complex is used as a benchmark system and the A2a adenosine receptor in complex with the inhibitor ZM241385 as an elaborate application. We demonstrate how such clusters of trajectories correspond to pathways and how the approach helps in the identification of reaction coordinates for a considered (un)binding process.
Affiliation(s)
- Victor Tänzel
  - Biomolecular Dynamics, Institute of Physics, University of Freiburg, Freiburg 79104, Germany
- Miriam Jäger
  - Biomolecular Dynamics, Institute of Physics, University of Freiburg, Freiburg 79104, Germany
- Steffen Wolf
  - Biomolecular Dynamics, Institute of Physics, University of Freiburg, Freiburg 79104, Germany
36
Tiwary P. Modeling prebiotic chemistries with quantum accuracy at classical costs. Proc Natl Acad Sci U S A 2024; 121:e2408742121. [PMID: 38809708 PMCID: PMC11161769 DOI: 10.1073/pnas.2408742121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2024] Open
Affiliation(s)
- Pratyush Tiwary
  - Institute for Physical Science and Technology, University of Maryland, College Park, MD 20742
  - Department of Chemistry and Biochemistry, University of Maryland, College Park, MD 20742
  - University of Maryland Institute for Health Computing, Bethesda, MD 20852
37
Wang D, Wang Y, Evans L, Tiwary P. From Latent Dynamics to Meaningful Representations. J Chem Theory Comput 2024; 20:3503-3513. [PMID: 38649368 DOI: 10.1021/acs.jctc.4c00249] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/25/2024]
Abstract
While representation learning has been central to the rise of machine learning and artificial intelligence, a key problem remains in making the learned representations meaningful. For this, the typical approach is to regularize the learned representation through prior probability distributions. However, such priors are usually unavailable or are ad hoc. To deal with this, recent efforts have shifted toward leveraging the insights from physical principles to guide the learning process. In this spirit, we propose a purely dynamics-constrained representation learning framework. Instead of relying on predefined probabilities, we restrict the latent representation to follow overdamped Langevin dynamics with a learnable transition density, a prior driven by statistical mechanics. We show that this is a more natural constraint for representation learning in stochastic dynamical systems, with the crucial ability to uniquely identify the ground truth representation. We validate our framework for different systems including a real-world fluorescent DNA movie data set. We show that our algorithm can uniquely identify orthogonal, isometric, and meaningful latent representations.
Affiliation(s)
- Dedi Wang
  - Biophysics Program and Institute for Physical Science and Technology, University of Maryland, College Park, Maryland 20742, United States
- Yihang Wang
  - Biophysics Program and Institute for Physical Science and Technology, University of Maryland, College Park, Maryland 20742, United States
- Luke Evans
  - Department of Mathematics, University of Maryland, College Park, Maryland 20742, United States
- Pratyush Tiwary
  - Department of Chemistry and Biochemistry and Institute for Physical Science and Technology, University of Maryland, College Park, Maryland 20742, United States
38
Lee S, Wang D, Seeliger MA, Tiwary P. Calculating Protein-Ligand Residence Times Through State Predictive Information Bottleneck based Enhanced Sampling. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.04.16.589710. [PMID: 38659748 PMCID: PMC11042289 DOI: 10.1101/2024.04.16.589710] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/26/2024]
Abstract
Understanding drug residence times in target proteins is key to improving drug efficacy and understanding target recognition in biochemistry. While drug residence time is just as important as binding affinity, atomic-level understanding of drug residence times through molecular dynamics (MD) simulations has been difficult primarily due to the extremely long timescales. Recent advances in rare event sampling have allowed us to reach these timescales, yet predicting protein-ligand residence times remains a significant challenge. Here we present a semi-automated protocol to calculate the ligand residence times across 12 orders of magnitude of timescales. In our proposed framework, we integrate a deep learning-based method, the state predictive information bottleneck (SPIB), to learn an approximate reaction coordinate (RC) and use it to guide the enhanced sampling method metadynamics. We demonstrate the performance of our algorithm by applying it to six different protein-ligand complexes with available benchmark residence times, including the dissociation of the widely studied anti-cancer drug Imatinib (Gleevec) from both wild-type Abl kinase and drug-resistant mutants. We show how our protocol can recover quantitatively accurate residence times, potentially opening avenues for deeper insights into drug development possibilities and ligand recognition mechanisms.
Affiliation(s)
- Suemin Lee
  - Biophysics Program and Institute for Physical Science and Technology, University of Maryland, College Park 20742, USA
- Dedi Wang
  - Biophysics Program and Institute for Physical Science and Technology, University of Maryland, College Park 20742, USA
- Markus A. Seeliger
  - Department of Pharmacological Sciences, Stony Brook University, Stony Brook, NY 11794-8651, USA
- Pratyush Tiwary
  - Biophysics Program and Institute for Physical Science and Technology, University of Maryland, College Park 20742, USA
  - Department of Chemistry and Biochemistry and Institute for Physical Science and Technology, University of Maryland, College Park 20742, USA
  - University of Maryland Institute for Health Computing, Rockville, United States
39
Zou Z, Tiwary P. Enhanced Sampling of Crystal Nucleation with Graph Representation Learnt Variables. J Phys Chem B 2024. [PMID: 38502931 DOI: 10.1021/acs.jpcb.4c00080] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Indexed: 03/21/2024]
Abstract
In this study, we present a graph neural network (GNN)-based learning approach using an autoencoder setup to derive low-dimensional variables from features observed in experimental crystal structures. These variables are then biased in enhanced sampling to observe state-to-state transitions and reliable thermodynamic weights. In our approach, we used simple convolution and pooling methods. To verify the effectiveness of our protocol, we examined the nucleation of various allotropes and polymorphs of iron and glycine in their molten states. Our graph latent variables, when biased in well-tempered metadynamics, consistently show transitions between states and achieve accurate thermodynamic rankings in agreement with experiments, both of which are indicators of dependable sampling. This underscores the strength and promise of our GNN variables for improved sampling. The protocol shown here should be applicable for other systems and other sampling methods.
Affiliation(s)
- Ziyue Zou
  - Department of Chemistry and Biochemistry, University of Maryland, College Park 20742, Maryland, United States
- Pratyush Tiwary
  - Department of Chemistry and Biochemistry, University of Maryland, College Park 20742, Maryland, United States
  - Institute for Physical Science and Technology, University of Maryland, College Park 20742, Maryland, United States
  - University of Maryland Institute for Health Computing, Rockville, Maryland 20852, United States