1
Viegas RG, Martins IBS, Leite VBP. Understanding the Energy Landscape of Intrinsically Disordered Protein Ensembles. J Chem Inf Model 2024; 64:4149-4157. [PMID: 38713459 DOI: 10.1021/acs.jcim.4c00080] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/08/2024]
Abstract
A substantial portion of various organisms' proteomes comprises intrinsically disordered proteins (IDPs) that lack a defined three-dimensional structure. These IDPs exhibit a diverse array of conformations, displaying remarkable spatiotemporal heterogeneity and exceptional conformational flexibility. Characterizing the structure or structural ensemble of IDPs presents significant conceptual and methodological challenges owing to the absence of a well-defined native structure. While databases such as the Protein Ensemble Database (PED) provide IDP ensembles obtained through a combination of experimental data and molecular modeling, the absence of reaction coordinates makes it difficult to understand pertinent aspects of the system comprehensively. In this study, we leverage the energy landscape visualization method (ELViM; JCTC, 6482, 2019) to scrutinize four IDP ensembles sourced from PED. ELViM, a methodology that circumvents the need for a priori reaction coordinates, aids in analyzing the ensembles. The specific IDP ensembles investigated are as follows: two fragments of nucleoporin (NUL: 884-993 and NUS: 1313-1390), the yeast Sic1 N-terminal region (1-90), and the N-terminal SH3 domain of Drk (1-59). Utilizing ELViM enables the comprehensive validation of ensembles, facilitating the detection of potential inconsistencies in the sampling process. Additionally, it allows for identifying and characterizing the most prevalent conformations within an ensemble. Moreover, ELViM facilitates the comparative analysis of ensembles obtained under diverse conditions, thereby providing a powerful tool for investigating the functional mechanisms of IDPs.
Affiliation(s)
- Rafael G Viegas
  - Federal Institute of Education, Science and Technology of São Paulo (IFSP), Catanduva, São Paulo 15.808-305, Brazil
  - Department of Physics, São Paulo State University (UNESP), Institute of Biosciences, Humanities and Exact Sciences, São José do Rio Preto, São Paulo 15054-000, Brazil
- Ingrid B S Martins
  - Department of Physics, São Paulo State University (UNESP), Institute of Biosciences, Humanities and Exact Sciences, São José do Rio Preto, São Paulo 15054-000, Brazil
- Vitor B P Leite
  - Department of Physics, São Paulo State University (UNESP), Institute of Biosciences, Humanities and Exact Sciences, São José do Rio Preto, São Paulo 15054-000, Brazil
2
Sun L, Vandermause J, Batzner S, Xie Y, Clark D, Chen W, Kozinsky B. Multitask Machine Learning of Collective Variables for Enhanced Sampling of Rare Events. J Chem Theory Comput 2022; 18:2341-2353. [PMID: 35274958 DOI: 10.1021/acs.jctc.1c00143] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Indexed: 11/28/2022]
Abstract
Computing accurate reaction rates is a central challenge in computational chemistry and biology because of the high cost of free energy estimation with unbiased molecular dynamics. In this work, a data-driven machine learning algorithm is devised to learn collective variables with a multitask neural network, where a common upstream part reduces the high dimensionality of atomic configurations to a low-dimensional latent space and separate downstream parts map the latent space to predictions of basin class labels and potential energies. The resulting latent space is shown to be an effective low-dimensional representation, capturing the reaction progress and guiding effective umbrella sampling to obtain accurate free energy landscapes. This approach is successfully applied to model systems including a 5D Müller-Brown model, a 5D three-well model, the alanine dipeptide in vacuum, and an Au(110) surface reconstruction unit reaction. It enables automated dimensionality reduction for energy-controlled reactions in complex systems, offers a unified and data-efficient framework that can be trained with limited data, and outperforms single-task learning approaches, including autoencoders.
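The architecture outlined in this abstract (a shared upstream encoder feeding one head for basin class labels and one head for potential energies) can be illustrated with a minimal NumPy sketch. The layer sizes, random weights, linear task heads, loss weighting `alpha`, and every function name below are assumptions made for illustration only; none of them is taken from the paper, and no training loop is shown.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_params(n_in, n_hidden, n_latent, n_classes, scale=0.1):
    """Random weights for a shared encoder and two task heads (hypothetical sizes)."""
    return {
        "W1": scale * rng.standard_normal((n_in, n_hidden)), "b1": np.zeros(n_hidden),
        "W2": scale * rng.standard_normal((n_hidden, n_latent)), "b2": np.zeros(n_latent),
        "Wc": scale * rng.standard_normal((n_latent, n_classes)), "bc": np.zeros(n_classes),
        "We": scale * rng.standard_normal((n_latent, 1)), "be": np.zeros(1),
    }

def forward(params, x):
    """Shared encoder -> latent z, then separate class and energy heads."""
    h = np.tanh(x @ params["W1"] + params["b1"])
    z = h @ params["W2"] + params["b2"]                # latent space (candidate CVs)
    logits = z @ params["Wc"] + params["bc"]           # basin-class head
    energy = (z @ params["We"] + params["be"])[:, 0]   # potential-energy head
    return z, logits, energy

def multitask_loss(logits, energy, labels, e_ref, alpha=0.5):
    """Weighted sum of cross-entropy (class labels) and MSE (energies)."""
    shifted = logits - logits.max(axis=1, keepdims=True)   # numerically stable softmax
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    cross_entropy = -log_probs[np.arange(len(labels)), labels].mean()
    mse = np.mean((energy - e_ref) ** 2)
    return alpha * cross_entropy + (1.0 - alpha) * mse

# Toy data: 8 configurations with 5 coordinates, 3 basins, 2 latent dimensions.
x = rng.standard_normal((8, 5))
labels = rng.integers(0, 3, size=8)
e_ref = rng.standard_normal(8)

params = init_params(n_in=5, n_hidden=16, n_latent=2, n_classes=3)
z, logits, energy = forward(params, x)
loss = multitask_loss(logits, energy, labels, e_ref)
```

Because both heads read the same latent `z`, minimizing the combined loss would push the encoder toward coordinates that are informative for basin membership and energy at once, which is the multitask idea the abstract describes.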
Affiliation(s)
- Lixin Sun
  - John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts 02138, United States
- Jonathan Vandermause
  - John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts 02138, United States
- Simon Batzner
  - John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts 02138, United States
- Yu Xie
  - John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts 02138, United States
- David Clark
  - John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts 02138, United States
- Wei Chen
  - John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts 02138, United States
- Boris Kozinsky
  - John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts 02138, United States
3
Abstract
This chapter discusses how dimensionality reduction algorithms such as diffusion maps and sketch-map can be used to analyze molecular dynamics trajectories. The first part discusses how these algorithms function, along with practical issues such as landmark selection and the use of these algorithms on data from enhanced sampling trajectories. The later part compares and discusses the results obtained by applying various algorithms to two sets of sample data. This is followed by a summary of how one algorithm in particular, sketch-map, has been applied to a range of problems. The chapter concludes with a discussion of the directions in which we believe this field is currently moving.
4
Cuendet MA, Margul DT, Schneider E, Vogt-Maranto L, Tuckerman ME. Endpoint-restricted adiabatic free energy dynamics approach for the exploration of biomolecular conformational equilibria. J Chem Phys 2018; 149:072316. [DOI: 10.1063/1.5027479] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 01/21/2023]
Affiliation(s)
- Michel A. Cuendet
  - Molecular Modeling Group, Swiss Institute of Bioinformatics, UNIL Sorge, 1015 Lausanne, Switzerland
  - Department of Physiology and Biophysics, Weill Cornell Medicine, New York, New York 10065, USA
- Daniel T. Margul
  - Department of Chemistry, New York University, New York, New York 10003, USA
- Elia Schneider
  - Department of Chemistry, New York University, New York, New York 10003, USA
- Mark E. Tuckerman
  - Department of Chemistry, New York University, New York, New York 10003, USA
  - Courant Institute of Mathematical Sciences, New York University, New York, New York 10012, USA
  - NYU-ECNU Center for Computational Chemistry at NYU Shanghai, 3663 Zhongshan Road North, Shanghai 200062, China
5
Peter EK. Adaptive enhanced sampling with a path-variable for the simulation of protein folding and aggregation. J Chem Phys 2017; 147:214902. [PMID: 29221375 DOI: 10.1063/1.5000930] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 12/19/2022]
Abstract
In this article, we present a novel adaptive enhanced sampling molecular dynamics (MD) method for the accelerated simulation of protein folding and aggregation. We introduce a path-variable L based on the unbiased momenta p and displacements dq for the definition of the bias s applied to the system and derive three algorithms: general adaptive bias MD, adaptive path-sampling, and a hybrid method that combines the first two methodologies. Through the analysis of the correlations between the bias and the unbiased gradient in the system, we find that the hybrid methodology leads to an improved force correlation and accelerated sampling of the phase space. We apply our method to SPC/E water, where we find that the average water structure is conserved. We then use our method to sample dialanine and the folding of TrpCage, where we find good agreement with simulation data reported in the literature. Finally, we apply our methodologies to the initial stages of aggregation of a hexamer of Alzheimer's amyloid β fragment 25-35 (Aβ 25-35) and find that transitions within the hexameric aggregate are dominated by entropic barriers. We speculate that conformational entropy in particular plays a major role in the formation of the fibril as a rate-limiting factor.
Affiliation(s)
- Emanuel K Peter
  - Department of Pharmacy and Chemistry, Institute of Physical and Theoretical Chemistry, University of Regensburg, Regensburg, Germany
6
Wagner JW, Dannenhoffer-Lafage T, Jin J, Voth GA. Extending the range and physical accuracy of coarse-grained models: Order parameter dependent interactions. J Chem Phys 2017; 147:044113. [PMID: 28764380 DOI: 10.1063/1.4995946] [Citation(s) in RCA: 47] [Impact Index Per Article: 7.8] [Indexed: 12/15/2022]
Abstract
Order parameters (i.e., collective variables) are often used to describe the behavior of systems as they capture different features of the free energy surface. Yet, most coarse-grained (CG) models only employ two- or three-body non-bonded interactions between the CG particles. In situations where these interactions are insufficient for the CG model to reproduce the structural distributions of the underlying fine-grained (FG) model, additional interactions must be included. In this paper, we introduce an approach to expand the basis sets available in the multiscale coarse-graining (MS-CG) methodology by including order parameters. Then, we investigate the ability of an additive local order parameter (e.g., density) and an additive global order parameter (i.e., distance from a hard wall) to improve the description of CG models in interfacial systems. Specifically, we study methanol liquid-vapor coexistence, acetonitrile liquid-vapor coexistence, and acetonitrile liquid confined by hard-wall plates, all using single-site CG models. We find that, compared with pairwise CG interactions alone, the use of order parameters dramatically improves the reproduction of the structural properties of interfacial CG systems relative to the FG reference.
Affiliation(s)
- Jacob W Wagner
  - Department of Chemistry, James Franck Institute, and Institute for Biophysical Dynamics, The University of Chicago, Chicago, Illinois 60637, USA
- Thomas Dannenhoffer-Lafage
  - Department of Chemistry, James Franck Institute, and Institute for Biophysical Dynamics, The University of Chicago, Chicago, Illinois 60637, USA
- Jaehyeok Jin
  - Department of Chemistry, James Franck Institute, and Institute for Biophysical Dynamics, The University of Chicago, Chicago, Illinois 60637, USA
- Gregory A Voth
  - Department of Chemistry, James Franck Institute, and Institute for Biophysical Dynamics, The University of Chicago, Chicago, Illinois 60637, USA