1
|
Cao S, Nüske F, Liu B, Soley MB, Huang X. AMUSET-TICA: A Tensor-Based Approach for Identifying Slow Collective Variables in Biomolecular Dynamics. J Chem Theory Comput 2025; 21:4855-4866. [PMID: 40254940 DOI: 10.1021/acs.jctc.5c00076] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/22/2025]
Abstract
Elucidating collective variables (CVs) for biomolecular dynamics is crucial for understanding numerous biological processes. By leveraging the tensor-train data structure, a multilinear version of the AMUSE (Algorithm for Multiple Unknown Signals) algorithm for Koopman approximation (AMUSEt) was recently developed to identify CVs for biomolecular dynamics. To find slow CVs, AMUSEt transforms input features (e.g., pairwise atomic distances) into nonlinear basis functions (e.g., Gaussian functions) and encodes these nonlinear basis functions within a tensor-train structure via time-lagged correlation functions. Due to the need to fit these tensor-train data structures into computer memory, AMUSEt can handle only a limited number of input features. Consequently, AMUSEt relies on manually selecting and ranking features based on physical intuition to fully capture the slow dynamics. However, when applied to complex biological systems with numerous features, this selection and ranking process becomes increasingly challenging. To address this challenge, here we present AMUSET-TICA (AMUSEt-based Time-lagged Independent Component Analysis), a CV-identification method using time-structure-independent components (tICs) as the input features for AMUSEt. The key insight of AMUSET-TICA lies in its highly effective embedding of high-dimensional atomistic protein conformations, achieved by expanding orthogonal tICs into overlapping Gaussian basis functions through a tensor-product data structure. This eliminates the need for manually selecting and ranking input features for a wide range of biomolecular systems. We demonstrate that AMUSET-TICA consistently and significantly outperforms AMUSEt and tICA in identifying slow CVs for three different biomolecular systems: alanine dipeptide, the N-terminal domain of L9 (NTL9), and the FIP35 WW domain. 
For all these systems, the CVs generated by AMUSET-TICA accurately describe the slowest dynamical modes underlying these biological conformational changes. Furthermore, we show that AMUSET-TICA achieves performance comparable to deep-learning approaches like VAMPnets in identifying the slowest dynamical modes, while being significantly more computationally efficient in terms of CPU time. In addition, the CVs yielded by AMUSET-TICA provide insights into the folding mechanisms of NTL9 and the FIP35 WW domain, including CV3 and CV4 of the WW domain, which capture its two parallel folding pathways. We expect AMUSET-TICA can be widely applied to facilitate the investigation of biomolecular dynamics.
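As a rough illustration of the two ingredients the abstract names (tICs as input features, then expansion into overlapping Gaussian basis functions), the following is a minimal NumPy sketch. It is not the authors' implementation and omits the tensor-train step entirely; the function names `tica` and `gaussian_basis`, the symmetrized covariance estimator, and the choice of centers and width are all assumptions made for illustration.

```python
import numpy as np

def tica(X, lag, n_components=2):
    """Linear tICA sketch: solve C_tau v = lambda C_0 v for features X (frames x features)."""
    X = X - X.mean(axis=0)                      # work with mean-free features
    X0, Xt = X[:-lag], X[lag:]
    n = len(X0)
    # Symmetrized covariance estimates (assumes reversible dynamics).
    C0 = (X0.T @ X0 + Xt.T @ Xt) / (2 * n)
    Ct = (X0.T @ Xt + Xt.T @ X0) / (2 * n)
    # Whiten with C0^(-1/2), then diagonalize the whitened time-lagged covariance.
    s, U = np.linalg.eigh(C0)
    keep = s > 1e-10 * s.max()                  # drop numerically null directions
    W = U[:, keep] / np.sqrt(s[keep])
    lam, V = np.linalg.eigh(W.T @ Ct @ W)
    order = np.argsort(lam)[::-1][:n_components]
    # Return eigenvalues (slowest first) and the projected tICs (frames x components).
    return lam[order], X @ (W @ V[:, order])

def gaussian_basis(y, centers, width):
    """Expand one tIC (1-D array) into overlapping Gaussian basis functions."""
    return np.exp(-((y[:, None] - centers[None, :]) ** 2) / (2 * width ** 2))
```

With an eigenvalue `lam` obtained at lag time `lag`, the corresponding implied timescale is `-lag / np.log(lam)`. Because the tICs returned here have unit variance, a handful of centers spread over roughly -2 to 2 is a plausible (assumed) starting point for the Gaussian expansion.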
Affiliation(s)
- Siqin Cao
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Feliks Nüske
  - Max-Planck-Institute for Dynamics of Complex Technical Systems, Magdeburg 39106, Germany
- Bojun Liu
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Micheline B Soley
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Xuhui Huang
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
|
2
|
Cao J, Zhang J, Yu Q, Ji J, Li J, He S, Zhu Z. TG-CDDPM: text-guided antimicrobial peptides generation based on conditional denoising diffusion probabilistic model. Brief Bioinform 2024; 26:bbae644. [PMID: 39668337 PMCID: PMC11637771 DOI: 10.1093/bib/bbae644] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2024] [Revised: 11/13/2024] [Accepted: 11/27/2024] [Indexed: 12/14/2024]
Abstract
Antimicrobial peptides (AMPs) have emerged as a promising substitute for antibiotics thanks to their broader range of activities, lower likelihood of drug resistance, and low toxicity. Traditional biochemical methods for AMP discovery are costly and inefficient. Deep generative models, including the long short-term memory model, variational autoencoder model, and generative adversarial model, have been widely introduced to expedite AMP discovery. However, these models tend to suffer from a lack of diversity in the AMPs they generate. The denoising diffusion probabilistic model serves as a good candidate for solving this issue. We proposed a three-stage Text-Guided Conditional Denoising Diffusion Probabilistic Model (TG-CDDPM) to generate novel and homologous AMPs. In the first two stages, contrastive learning and inferring models are crafted to create better conditions for guiding AMP generation, respectively. In the last stage, a pre-trained conditional denoising diffusion probabilistic model is leveraged to enrich the peptide knowledge and fine-tuned to learn feature representation in downstream. TG-CDDPM was compared to the state-of-the-art generative models for AMP generation, and it demonstrated competitive or better performance with the assistance of text description as supervised information. The membrane penetration capabilities of the identified candidate AMPs by TG-CDDPM were also validated through molecular weight dynamics experiments.
Affiliation(s)
- Junhang Cao
  - College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060, China
- Jun Zhang
  - National Engineering Laboratory for Big Data System Computing Technology, Shenzhen University, Shenzhen 518060, China
- Qiyuan Yu
  - College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060, China
- Junkai Ji
  - National Engineering Laboratory for Big Data System Computing Technology, Shenzhen University, Shenzhen 518060, China
- Jianqiang Li
  - National Engineering Laboratory for Big Data System Computing Technology, Shenzhen University, Shenzhen 518060, China
- Shan He
  - School of Computer Science, University of Birmingham, Birmingham B15 2TT, United Kingdom
- Zexuan Zhu
  - National Engineering Laboratory for Big Data System Computing Technology, Shenzhen University, Shenzhen 518060, China
|
3
|
Qiu Y, Liu S, Lin X, Unarta IC, Huang X, Zhang B. Nucleosome condensate and linker DNA alter chromatin folding pathways and rates. bioRxiv 2024:2024.11.15.623891. [PMID: 39605526 PMCID: PMC11601296 DOI: 10.1101/2024.11.15.623891] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2024]
Abstract
Chromatin organization is essential for DNA packaging and gene regulation in eukaryotic genomes. While significant progress has been made, the exact atomistic arrangement of nucleosomes remains controversial. Using a well-calibrated residue-level coarse-grained model and advanced dynamics modeling techniques, particularly the non-Markovian dynamics model, we map the free energy landscape of tetra-nucleosome systems, identify both metastable conformations and intermediate states in folding pathways, and quantify the folding kinetics. Our findings show that chromatin with DNA linker lengths of 10n base pairs (bp) favors zigzag fibril structures. However, longer linker lengths destabilize this conformation. When the linker length is 10n + 5 bp, chromatin loses unique conformations, favoring a dynamic ensemble of structures resembling folding intermediates. Embedding the tetra-nucleosome in a nucleosome condensate similarly shifts stability towards folding intermediates as a result of the competition of inter-nucleosomal contacts. These results suggest that chromatin organization observed in vivo arises from the unfolding of fibril structures due to nucleosome crowding and linker length variation. This perspective aids in unifying experimental studies to develop atomistic models for chromatin.
Significance
Atomic structures of chromatin have become increasingly accessible, largely through cryo-EM techniques. Nonetheless, these approaches often face limitations in addressing how intrinsic in vivo factors influence chromatin organization. We present a structural characterization of chromatin under the combined effects of nucleosome condensate crowding and linker DNA length variation, two critical in vivo features that have remained challenging to capture experimentally. This work leverages a novel application of non-Markovian dynamical modeling, providing accurate mapping of chromatin folding kinetics and pathways. Our findings support a hypothesis that in vivo chromatin organization arises from folding intermediates advancing toward a stable fibril configuration, potentially resolving longstanding questions surrounding chromatin atomic structure.
Affiliation(s)
- Yunrui Qiu
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, WI, USA
  - Data Science Institute, University of Wisconsin-Madison, Madison, WI, USA
  - Contributed equally to this work
- Shuming Liu
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA, USA
  - Contributed equally to this work
- Xingcheng Lin
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA, USA
- Ilona Christy Unarta
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, WI, USA
  - Data Science Institute, University of Wisconsin-Madison, Madison, WI, USA
- Xuhui Huang
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, WI, USA
  - Data Science Institute, University of Wisconsin-Madison, Madison, WI, USA
- Bin Zhang
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA, USA
|
4
|
Qiu Y, Wiewiora RP, Izaguirre JA, Xu H, Sherman W, Tang W, Huang X. Non-Markovian Dynamic Models Identify Non-Canonical KRAS-VHL Encounter Complex Conformations for Novel PROTAC Design. JACS Au 2024; 4:3857-3868. [PMID: 39483225 PMCID: PMC11522902 DOI: 10.1021/jacsau.4c00503] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2024] [Revised: 08/26/2024] [Accepted: 09/16/2024] [Indexed: 11/03/2024]
Abstract
Targeted protein degradation (TPD) is emerging as a promising therapeutic approach for cancer and other diseases, with an increasing number of programs demonstrating its efficacy in human clinical trials. One notable method for TPD is Proteolysis Targeting Chimeras (PROTACs) that selectively degrade a protein of interest (POI) through E3-ligase induced ubiquitination followed by proteasomal degradation. PROTACs utilize a warhead-linker-ligand architecture to bring the POI (bound to the warhead) and the E3 ligase (bound to the ligand) into proximity. The resulting non-native protein-protein interactions (PPIs) formed between the POI and E3 ligase lead to the formation of a stable ternary complex, enhancing cooperativity for TPD. A significant challenge in PROTAC design is the screening of the linkers to induce favorable non-native PPIs between POI and E3 ligase. Here, we present a physics-based computational protocol to predict noncanonical and metastable PPI interfaces between an E3 ligase and a given POI, aiding in the design of linkers to stabilize the ternary complex and enhance degradation. Specifically, we build the non-Markovian dynamic model using the Integrative Generalized Master equation (IGME) method from ∼1.5 ms all-atom molecular dynamics simulations of the linker-less encounter complex, to systematically explore the inherent PPIs between the oncogene homologue protein and the von Hippel-Lindau E3 ligase. Our protocol revealed six metastable states, each containing a different PPI interface. We selected three of these metastable states containing promising PPIs for linker design. Our selection criteria included thermodynamic and kinetic stabilities of PPIs and the accessibility between the solvent-exposed sites on the warheads and E3 ligand. One of the selected PPIs closely matches a recent cocrystal PPI interface structure induced by an experimentally designed PROTAC with potent degradation efficacy. We anticipate that our protocol has significant potential for widespread application in predicting metastable POI-ligase interfaces that can enable rational design of PROTACs.
Affiliation(s)
- Yunrui Qiu
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
  - Data Science Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Huafeng Xu
  - Atommap Corporation, NY, New York 10013, United States
- Woody Sherman
  - Psivant Therapeutics, Boston, Massachusetts 02210, United States
- Weiping Tang
  - Lachman Institute for Pharmaceutical Development, School of Pharmacy, University of Wisconsin-Madison, Madison, Wisconsin 53705, United States
- Xuhui Huang
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
  - Data Science Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
|
5
|
Wang Y, Li C, Zheng X. Markov State Models Reveal How Folding Kinetics Influence Absorption Spectra of Foldamers. J Chem Theory Comput 2024; 20:5396-5407. [PMID: 38900275 DOI: 10.1021/acs.jctc.4c00202] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/21/2024]
Abstract
Self-assembly of platinum(II) complex foldamers is an essential approach to fabricate advanced luminescent materials. However, a comprehensive understanding of folding kinetics and their absorption spectra remains elusive. By constructing Markov state models (MSMs) from large-scale molecular dynamics simulations, we reveal that two largely similar dinuclear alkynylplatinum(II) terpyridine foldamers, Pt-PEG and Pt-PE with slightly different bridges, exhibit distinctive folding kinetics. Particularly, Pt-PEG bears bridge-dominant, plane-dominant, and cooperative pathways, while Pt-PE only prefers the plane-dominant pathway. Such preference originates from their difference in intrabridge electrostatic interactions, leading to contrastive distributions of metastable states. We also found that the bridge-dominant pathway for Pt-PEG becomes more favorable when lowering the temperature. Interestingly, based on the comprehensive conformation ensembles from our MSMs, we reveal the conformation-dependent absorption spectra of Pt-PEG and Pt-PE. Our theoretical spectra not only align with experimental results but also reveal the contributions of diverse conformations to the overall absorption bands explicitly, facilitating the rational design of stimuli-responsive smart luminescent molecules.
Affiliation(s)
- Yijia Wang
  - Key Laboratory of Cluster Science of Ministry of Education, Beijing Key Laboratory of Photoelectronic/Electrophotonic Conversion Materials, Key Laboratory of Medicinal Molecule Science and Pharmaceutics Engineering of Ministry of Industry and Information Technology, School of Chemistry and Chemical Engineering, Beijing Institute of Technology, Beijing 100081, China
- Chu Li
  - Department of Chemistry, The Hong Kong University of Science and Technology, Hong Kong, China
- Xiaoyan Zheng
  - Key Laboratory of Cluster Science of Ministry of Education, Beijing Key Laboratory of Photoelectronic/Electrophotonic Conversion Materials, Key Laboratory of Medicinal Molecule Science and Pharmaceutics Engineering of Ministry of Industry and Information Technology, School of Chemistry and Chemical Engineering, Beijing Institute of Technology, Beijing 100081, China
|
6
|
Wang D, Qiu Y, Beyerle ER, Huang X, Tiwary P. Information Bottleneck Approach for Markov Model Construction. J Chem Theory Comput 2024; 20:5352-5367. [PMID: 38859575 PMCID: PMC11199095 DOI: 10.1021/acs.jctc.4c00449] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2024]
Abstract
Markov state models (MSMs) have proven valuable in studying the dynamics of protein conformational changes via statistical analysis of molecular dynamics simulations. In MSMs, the complex configuration space is coarse-grained into conformational states, with dynamics modeled by a series of Markovian transitions among these states at discrete lag times. Constructing the Markovian model at a specific lag time necessitates defining states that circumvent significant internal energy barriers, enabling internal dynamics relaxation within the lag time. This process effectively coarse-grains time and space, integrating out rapid motions within metastable states. Thus, MSMs possess a multiresolution nature, where the granularity of states can be adjusted according to the time-resolution, offering flexibility in capturing system dynamics. This work introduces a continuous embedding approach for molecular conformations using the state predictive information bottleneck (SPIB), a framework that unifies dimensionality reduction and state space partitioning via a continuous, machine learned basis set. Without explicit optimization of the VAMP-based scores, SPIB demonstrates state-of-the-art performance in identifying slow dynamical processes and constructing predictive multiresolution Markovian models. Through applications to well-validated mini-proteins, SPIB showcases unique advantages compared to competing methods. It autonomously and self-consistently adjusts the number of metastable states based on a specified minimal time resolution, eliminating the need for manual tuning. While maintaining efficacy in dynamical properties, SPIB excels in accurately distinguishing metastable states and capturing numerous well-populated macrostates. This contrasts with existing VAMP-based methods, which often emphasize slow dynamics at the expense of incorporating numerous sparsely populated states. 
Furthermore, SPIB's ability to learn a low-dimensional continuous embedding of the underlying MSMs enhances the interpretation of dynamic pathways. With these benefits, we propose SPIB as an easy-to-implement methodology for end-to-end MSM construction.
Affiliation(s)
- Dedi Wang
  - Biophysics Program and Institute for Physical Science and Technology, University of Maryland, College Park, MD 20742, United States
- Yunrui Qiu
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, WI 53706, United States
  - Data Science Institute, University of Wisconsin-Madison, Madison, WI 53706, United States
- Eric R. Beyerle
  - Institute for Physical Science and Technology, University of Maryland, College Park, MD 20742, United States
- Xuhui Huang
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, WI 53706, United States
  - Data Science Institute, University of Wisconsin-Madison, Madison, WI 53706, United States
- Pratyush Tiwary
  - Department of Chemistry and Biochemistry and Institute for Physical Science and Technology, University of Maryland, College Park, MD 20742, United States
  - University of Maryland Institute for Health Computing, Bethesda, MD 20852, United States
|
7
|
Wang D, Qiu Y, Beyerle ER, Huang X, Tiwary P. An Information Bottleneck Approach for Markov Model Construction. arXiv 2024:arXiv:2404.02856v2. [PMID: 38947932 PMCID: PMC11213129] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/02/2024]
Abstract
Markov state models (MSMs) have proven valuable in studying dynamics of protein conformational changes via statistical analysis of molecular dynamics (MD) simulations. In MSMs, the complex configuration space is coarse-grained into conformational states, with dynamics modeled by a series of Markovian transitions among these states at discrete lag times. Constructing the Markovian model at a specific lag time necessitates defining states that circumvent significant internal energy barriers, enabling internal dynamics relaxation within the lag time. This process effectively coarse-grains time and space, integrating out rapid motions within metastable states. Thus, MSMs possess a multi-resolution nature, where the granularity of states can be adjusted according to the time-resolution, offering flexibility in capturing system dynamics. This work introduces a continuous embedding approach for molecular conformations using the state predictive information bottleneck (SPIB), a framework that unifies dimensionality reduction and state space partitioning via a continuous, machine learned basis set. Without explicit optimization of the VAMP-based scores, SPIB demonstrates state-of-the-art performance in identifying slow dynamical processes and constructing predictive multi-resolution Markovian models. Through applications to well-validated mini-proteins, SPIB showcases unique advantages compared to competing methods. It autonomously and self-consistently adjusts the number of metastable states based on specified minimal time resolution, eliminating the need for manual tuning. While maintaining efficacy in dynamical properties, SPIB excels in accurately distinguishing metastable states and capturing numerous well-populated macrostates. This contrasts with existing VAMP-based methods, which often emphasize slow dynamics at the expense of incorporating numerous sparsely populated states. 
Furthermore, SPIB's ability to learn a low-dimensional continuous embedding of the underlying MSMs enhances the interpretation of dynamic pathways. With these benefits, we propose SPIB as an easy-to-implement methodology for end-to-end MSMs construction.
Affiliation(s)
- Dedi Wang
  - Biophysics Program and Institute for Physical Science and Technology, University of Maryland, College Park, MD 20742, United States
- Yunrui Qiu
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, WI 53706, United States
  - Data Science Institute, University of Wisconsin-Madison, Madison, WI 53706, United States
- Eric R. Beyerle
  - Institute for Physical Science and Technology, University of Maryland, College Park, MD 20742, United States
- Xuhui Huang
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, WI 53706, United States
  - Data Science Institute, University of Wisconsin-Madison, Madison, WI 53706, United States
- Pratyush Tiwary
  - Department of Chemistry and Biochemistry and Institute for Physical Science and Technology, University of Maryland, College Park, MD 20742, United States
  - University of Maryland Institute for Health Computing, Bethesda, MD 20852, United States
|
8
|
Xu T, Li Y, Gao X, Zhang L. Understanding the Fast-Triggering Unfolding Dynamics of FK-11 upon Photoexcitation of Azobenzene. J Phys Chem Lett 2024; 15:3531-3540. [PMID: 38526058 DOI: 10.1021/acs.jpclett.4c00091] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/26/2024]
Abstract
Photoswitchable molecules can control the activity and functions of biomolecules by triggering conformational changes. However, it is still challenging to fully understand such fast-triggering conformational evolution from nonequilibrium to equilibrium distribution at the molecular level. Herein, we successfully simulated the unfolding of the FK-11 peptide upon the photoinduced trans-to-cis isomerization of azobenzene based on the Markov state model. We found that the ensemble of FK-11 contains five conformational states, constituting two unfolding pathways. More intriguingly, we observed the microsecond-scale conformational propagation of the FK-11 peptide from the fully folded state to the equilibrium populations of the five states. The computed CD spectra match well with the experimental data, validating our simulation method. Overall, our study not only offers a protocol to study the photoisomerization-induced conformational changes of enzymes but also could orientate the rational design of a photoswitchable molecule to manipulate biological functions.
Affiliation(s)
- Tiantian Xu
  - State Key Laboratory of Structural Chemistry, Fujian Institute of Research on the Structure of Matter, Chinese Academy of Sciences, Fuzhou, Fujian 350002, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Yongfang Li
  - State Key Laboratory of Structural Chemistry, Fujian Institute of Research on the Structure of Matter, Chinese Academy of Sciences, Fuzhou, Fujian 350002, China
- Xin Gao
  - Computational Bioscience Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal 23955, Saudi Arabia
  - Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955, Saudi Arabia
- Lu Zhang
  - State Key Laboratory of Structural Chemistry, Fujian Institute of Research on the Structure of Matter, Chinese Academy of Sciences, Fuzhou, Fujian 350002, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
  - Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Fuzhou, Fujian 361005, China
|
9
|
Wu Y, Cao S, Qiu Y, Huang X. Tutorial on how to build non-Markovian dynamic models from molecular dynamics simulations for studying protein conformational changes. J Chem Phys 2024; 160:121501. [PMID: 38516972 PMCID: PMC10964226 DOI: 10.1063/5.0189429] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Accepted: 02/20/2024] [Indexed: 03/23/2024]
Abstract
Protein conformational changes play crucial roles in their biological functions. In recent years, the Markov State Model (MSM) constructed from extensive Molecular Dynamics (MD) simulations has emerged as a powerful tool for modeling complex protein conformational changes. In MSMs, dynamics are modeled as a sequence of Markovian transitions among metastable conformational states at discrete time intervals (called lag time). A major challenge for MSMs is that the lag time must be long enough to allow transitions among states to become memoryless (or Markovian). However, this lag time is constrained by the length of individual MD simulations available to track these transitions. To address this challenge, we have recently developed Generalized Master Equation (GME)-based approaches, encoding non-Markovian dynamics using a time-dependent memory kernel. In this Tutorial, we introduce the theory behind two recently developed GME-based non-Markovian dynamic models: the quasi-Markov State Model (qMSM) and the Integrative Generalized Master Equation (IGME). We subsequently outline the procedures for constructing these models and provide a step-by-step tutorial on applying qMSM and IGME to study two peptide systems: alanine dipeptide and villin headpiece. This Tutorial is available at https://github.com/xuhuihuang/GME_tutorials. The protocols detailed in this Tutorial aim to be accessible for non-experts interested in studying the biomolecular dynamics using these non-Markovian dynamic models.
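To make the role of the lag time concrete, here is a minimal sketch of the plain MSM step the abstract describes: counting transitions between discrete states at a chosen lag time and row-normalizing the counts. It does not implement qMSM or IGME, and it is not taken from the linked tutorial repository; the function names and the simple non-reversible estimator are assumptions for illustration.

```python
import numpy as np

def transition_matrix(dtraj, n_states, lag):
    """Row-normalized transition matrix from one discrete trajectory at a given lag time."""
    d = np.asarray(dtraj)
    counts = np.zeros((n_states, n_states))
    # Count every observed transition i -> j separated by `lag` frames.
    np.add.at(counts, (d[:-lag], d[lag:]), 1)
    row_sums = counts.sum(axis=1, keepdims=True)
    # States that were never left keep an all-zero row instead of dividing by zero.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)

def implied_timescales(T, lag):
    """Implied timescales t_i = -lag / ln|lambda_i| for the non-stationary eigenvalues."""
    mags = np.sort(np.abs(np.linalg.eigvals(T)))[::-1]
    return -lag / np.log(mags[1:])
```

A common check is to recompute `implied_timescales` over a range of lag times: in an MSM the timescales should level off once the lag time is long enough for the dynamics to be approximately Markovian, which is the constraint the abstract refers to.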
Affiliation(s)
- Yue Wu
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Siqin Cao
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Yunrui Qiu
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Xuhui Huang
  - Author to whom correspondence should be addressed:
|
10
|
Ray D, Parrinello M. Data-driven classification of ligand unbinding pathways. Proc Natl Acad Sci U S A 2024; 121:e2313542121. [PMID: 38412121 PMCID: PMC10927508 DOI: 10.1073/pnas.2313542121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2023] [Accepted: 01/26/2024] [Indexed: 02/29/2024]
Abstract
Studying the pathways of ligand-receptor binding is essential to understand the mechanism of target recognition by small molecules. The binding free energy and kinetics of protein-ligand complexes can be computed using molecular dynamics (MD) simulations, often in quantitative agreement with experiments. However, only a qualitative picture of the ligand binding/unbinding paths can be obtained through a conventional analysis of the MD trajectories. Besides, the higher degree of manual effort involved in analyzing pathways limits its applicability in large-scale drug discovery. Here, we address this limitation by introducing an automated approach for analyzing molecular transition paths with a particular focus on protein-ligand dissociation. Our method is based on the dynamic time-warping algorithm, originally designed for speech recognition. We accurately classified molecular trajectories using a very generic descriptor set of contacts or distances. Our approach outperforms manual classification by distinguishing between parallel dissociation channels, within the pathways identified by visual inspection. Most notably, we could compute exit-path-specific ligand-dissociation kinetics. The unbinding timescale along the fastest path agrees with the experimental residence time, providing a physical interpretation to our entirely data-driven protocol. In combination with appropriate enhanced sampling algorithms, this technique can be used for the initial exploration of ligand-dissociation pathways as well as for calculating path-specific thermodynamic and kinetic properties.
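For readers unfamiliar with dynamic time warping, the sketch below shows the textbook dynamic-programming form of the algorithm applied to two trajectories of descriptor vectors (for example, contacts or distances per frame). It is a generic illustration, not the authors' code; the function name `dtw_distance` and the Euclidean per-frame cost are assumptions.

```python
import numpy as np

def dtw_distance(a, b):
    """Classic O(n*m) dynamic time warping distance between two trajectories.

    Each trajectory is an array of shape (frames, descriptors); 1-D input is
    treated as a single descriptor per frame.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.ndim == 1:
        a = a[:, None]
    if b.ndim == 1:
        b = b[:, None]
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)   # D[i, j]: best cost aligning a[:i] with b[:j]
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])   # per-frame Euclidean cost
            # Extend the cheapest of the three allowed predecessor alignments.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```

Because the alignment may stretch or compress time, two trajectories that visit the same descriptor values at different speeds get a distance of zero, which is what makes the measure suitable for comparing transition paths of unequal length. A pairwise distance matrix built this way could then be passed to any standard clustering method.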
Affiliation(s)
- Dhiman Ray
  - Simulations Research Line, Italian Institute of Technology, Via Enrico Melen 83, Genova GE 16152, Italy
- Michele Parrinello
  - Simulations Research Line, Italian Institute of Technology, Via Enrico Melen 83, Genova GE 16152, Italy
|
11
|
Bogetti AT, Leung JMG, Chong LT. LPATH: A Semiautomated Python Tool for Clustering Molecular Pathways. J Chem Inf Model 2023; 63:7610-7616. [PMID: 38048485 PMCID: PMC10751797 DOI: 10.1021/acs.jcim.3c01318] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2023] [Revised: 10/14/2023] [Accepted: 11/09/2023] [Indexed: 12/06/2023]
Abstract
The pathways by which a molecular process transitions to a target state are highly sought-after as direct views of a transition mechanism. While great strides have been made in the physics-based simulation of such pathways, the analysis of these pathways can be a major challenge due to their diversity and variable lengths. Here, we present the LPATH Python tool, which implements a semiautomated method for linguistics-assisted clustering of pathways into distinct classes (or routes). This method involves three steps: 1) discretizing the configurational space into key states, 2) extracting a text-string sequence of key visited states for each pathway, and 3) pairwise matching of pathways based on a text-string similarity score. To circumvent the prohibitive memory requirements of the first step, we have implemented a general two-stage method for clustering conformational states that exploits machine learning. LPATH is primarily designed for use with the WESTPA software for weighted ensemble simulations; however, the tool can also be applied to conventional simulations. As demonstrated for the C7eq to C7ax conformational transition of the alanine dipeptide, LPATH provides physically reasonable classes of pathways and corresponding probabilities.
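Steps 2 and 3 of the method can be sketched with the Python standard library alone. This is not the LPATH API: the function names are hypothetical, collapsing consecutive repeats is an assumed way of extracting the "key visited states", and `difflib.SequenceMatcher.ratio()` is used as a stand-in similarity score rather than whatever metric the tool actually applies.

```python
from difflib import SequenceMatcher
from itertools import groupby

def key_state_string(state_sequence):
    """Collapse consecutive repeats so 'AAABBC' becomes 'ABC' (order of visited states)."""
    return "".join(key for key, _ in groupby(state_sequence))

def path_similarity(p, q):
    """Similarity in [0, 1] between two pathways given as strings of state labels."""
    return SequenceMatcher(None, key_state_string(p), key_state_string(q)).ratio()

def similarity_matrix(paths):
    """Pairwise similarity scores, suitable as input to a clustering step."""
    return [[path_similarity(p, q) for q in paths] for p in paths]
```

For example, `path_similarity("AABBC", "ABCCC")` gives 1.0 because both pathways visit A, then B, then C, while pathways that pass through different intermediate states score lower. Converting the scores to distances (for instance `1 - similarity`) would allow hierarchical clustering into pathway classes.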
Affiliation(s)
- Anthony T. Bogetti
  - Department of Chemistry, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, United States
- Jeremy M. G. Leung
  - Department of Chemistry, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, United States
- Lillian T. Chong
  - Department of Chemistry, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, United States
|
12
|
Bogetti AT, Leung JMG, Chong LT. LPATH: A semi-automated Python tool for clustering molecular pathways. bioRxiv 2023:2023.08.17.553774. [PMID: 37645995 PMCID: PMC10462149 DOI: 10.1101/2023.08.17.553774] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/01/2023]
Abstract
The pathways by which a molecular process transitions to a target state are highly sought-after as direct views of a transition mechanism. While great strides have been made in the physics-based simulation of such pathways, the analysis of these pathways can be a major challenge due to their diversity and variable lengths. Here we present the LPATH Python tool, which implements a semi-automated method for linguistics-assisted clustering of pathways into distinct classes (or routes). This method involves three steps: 1) discretizing the configurational space into key states, 2) extracting a text-string sequence of key visited states for each pathway, and 3) pairwise matching of pathways based on a text-string similarity score. To circumvent the prohibitive memory requirements of the first step, we have implemented a general two-stage method for clustering conformational states that exploits machine learning. LPATH is primarily designed for use with the WESTPA software for weighted ensemble simulations; however, the tool can also be applied to conventional simulations. As demonstrated for the C7eq to C7ax conformational transition of alanine dipeptide, LPATH provides physically reasonable classes of pathways and corresponding probabilities.
Affiliation(s)
- Anthony T. Bogetti
  - Department of Chemistry, University of Pittsburgh, Pittsburgh, Pennsylvania 15260
- Jeremy M. G. Leung
  - Department of Chemistry, University of Pittsburgh, Pittsburgh, Pennsylvania 15260
- Lillian T. Chong
  - Department of Chemistry, University of Pittsburgh, Pittsburgh, Pennsylvania 15260
|