1
|
Martini 3 Force Field Parameters for Protein Lipidation Post-Translational Modifications. J Chem Theory Comput 2023; 19:8901-8918. [PMID: 38019969 DOI: 10.1021/acs.jctc.3c00604] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/01/2023]
Abstract
Protein lipidations are vital co/post-translational modifications that tether lipid tails to specific protein amino acids, allowing them to anchor to biological membranes, switch their subcellular localization, and modulate association with other proteins. Such lipidations are thus crucial for multiple biological processes including signal transduction, protein trafficking, and membrane localization and are implicated in various diseases as well. Examples of lipid-anchored proteins include the Ras family of proteins that undergo farnesylation; actin and gelsolin that are myristoylated; phospholipase D that is palmitoylated; glycosylphosphatidylinositol-anchored proteins; and others. Here, we develop parameters for cysteine-targeting farnesylation, geranylgeranylation, and palmitoylation, as well as glycine-targeting myristoylation for the latest version of the Martini 3 coarse-grained force field. The parameters are developed using the CHARMM36m all-atom force field parameters as reference. The behavior of the coarse-grained models is consistent with that of the all-atom force field for all lipidations and reproduces key dynamical and structural features of lipid-anchored peptides, such as the solvent-accessible surface area, bilayer penetration depth, and representative conformations of the anchors. The parameters are also validated in simulations of the lipid-anchored peripheral membrane proteins Rheb and Arf1, after comparison with independent all-atom simulations. The parameters, along with mapping schemes for the popular martinize2 tool, are available for download at 10.5281/zenodo.7849262 and also as supporting information.
|
2
|
A novel antifolate suppresses growth of FPGS-deficient cells and overcomes methotrexate resistance. Life Sci Alliance 2023; 6:e202302058. [PMID: 37591722 PMCID: PMC10435995 DOI: 10.26508/lsa.202302058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2023] [Revised: 08/07/2023] [Accepted: 08/07/2023] [Indexed: 08/19/2023]
Abstract
Cancer cells make extensive use of the folate cycle to sustain increased anabolic metabolism. Multiple chemotherapeutic drugs interfere with the folate cycle, including methotrexate and 5-fluorouracil that are commonly applied for the treatment of leukemia and colorectal cancer (CRC), respectively. Despite high success rates, therapy-induced resistance causes relapse at later disease stages. Depletion of folylpolyglutamate synthetase (FPGS), which normally promotes intracellular accumulation and activity of natural folates and methotrexate, is linked to methotrexate and 5-fluorouracil resistance and its association with relapse illustrates the need for improved intervention strategies. Here, we describe a novel antifolate (C1) that, like methotrexate, potently inhibits dihydrofolate reductase and downstream one-carbon metabolism. Contrary to methotrexate, C1 displays optimal efficacy in FPGS-deficient contexts, due to decreased competition with intracellular folates for interaction with dihydrofolate reductase. We show that FPGS-deficient patient-derived CRC organoids display enhanced sensitivity to C1, whereas FPGS-high CRC organoids are more sensitive to methotrexate. Our results argue that polyglutamylation-independent antifolates can be applied to exert selective pressure on FPGS-deficient cells during chemotherapy, using a vulnerability created by polyglutamylation deficiency.
|
3
|
Abstract
Many pathogens exploit host cell-surface glycans. However, precise analyses of glycan ligands binding with heavily modified pathogen proteins can be confounded by overlapping sugar signals and/or compounded with known experimental constraints. Universal saturation transfer analysis (uSTA) builds on existing nuclear magnetic resonance spectroscopy to provide an automated workflow for quantitating protein-ligand interactions. uSTA reveals that early-pandemic, B-origin-lineage severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) spike trimer binds sialoside sugars in an "end-on" manner. uSTA-guided modeling and a high-resolution cryo-electron microscopy structure implicate the spike N-terminal domain (NTD) and confirm end-on binding. This finding rationalizes the effect of NTD mutations that abolish sugar binding in SARS-CoV-2 variants of concern. Together with genetic variance analyses in early pandemic patient cohorts, this binding implicates a sialylated polylactosamine motif found on tetraantennary N-linked glycoproteins deep in the human lung as potentially relevant to virulence and/or zoonosis.
|
4
|
Prediction of protein assemblies, the next frontier: The CASP14-CAPRI experiment. Proteins 2021; 89:1800-1823. [PMID: 34453465 PMCID: PMC8616814 DOI: 10.1002/prot.26222] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 05/20/2021] [Revised: 07/24/2021] [Accepted: 08/05/2021] [Indexed: 12/19/2022]
Abstract
We present the results for CAPRI Round 50, the fourth joint CASP-CAPRI protein assembly prediction challenge. The Round comprised a total of twelve targets, including six dimers, three trimers, and three higher-order oligomers. Four of these were easy targets, for which good structural templates were available either for the full assembly, or for the main interfaces (of the higher-order oligomers). Eight were difficult targets for which only distantly related templates were found for the individual subunits. Twenty-five CAPRI groups, including eight automatic servers, submitted ~1250 models per target. Twenty groups, including six servers, participated in the CAPRI scoring challenge and submitted ~190 models per target. The accuracy of the predicted models was evaluated using the classical CAPRI criteria. The prediction performance was measured by a weighted scoring scheme that takes into account the number of models of acceptable quality or higher submitted by each group as part of their five top-ranking models. Compared to the previous CASP-CAPRI challenge, top performing groups submitted such models for a larger fraction (70-75%) of the targets in this Round, but fewer of these models were of high accuracy. Scorer groups achieved stronger performance with more groups submitting correct models for 70-80% of the targets or achieving high accuracy predictions. Servers performed less well in general, except for the MDOCKPP and LZERD servers, which performed on par with human groups. In addition to these results, major advances in methodology are discussed, providing an informative overview of where the prediction of protein assemblies currently stands.
|
5
|
Shape-Restrained Modeling of Protein-Small-Molecule Complexes with High Ambiguity Driven DOCKing. J Chem Inf Model 2021; 61:4807-4818. [PMID: 34436890 PMCID: PMC8479858 DOI: 10.1021/acs.jcim.1c00796] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 01/20/2023]
Abstract
Small-molecule docking remains one of the most valuable computational techniques for the structure prediction of protein-small-molecule complexes. It allows us to study the interactions between compounds and the protein receptors they target at atomic detail in a timely and efficient manner. Here, we present a new protocol in HADDOCK (High Ambiguity Driven DOCKing), our integrative modeling platform, which incorporates homology information for both receptor and compounds. It makes use of HADDOCK's unique ability to integrate information in the simulation to drive it toward conformations that agree with the provided data. The focal point is the use of shape restraints derived from homologous compounds bound to the target receptors. We have developed two protocols: in the first, the shape is composed of dummy atom beads based on the position of the heavy atoms of the homologous template compound, whereas in the second, the shape is additionally annotated with pharmacophore data for some or all beads. For both protocols, ambiguous distance restraints are subsequently defined between those beads and the heavy atoms of the ligand to be docked. We have benchmarked the performance of these protocols with a fully unbound version of the widely used DUD-E (Database of Useful Decoys-Enhanced) dataset. In this unbound docking scenario, our template/shape-based docking protocol reaches an overall success rate of 81% when a reliable template can be identified (which was the case for 99 out of 102 complexes in the DUD-E dataset), which is close to the best results reported for bound docking on the DUD-E dataset.
|
6
|
Structural Biology in the Clouds: The WeNMR-EOSC Ecosystem. Front Mol Biosci 2021; 8:729513. [PMID: 34395534 PMCID: PMC8356364 DOI: 10.3389/fmolb.2021.729513] [Citation(s) in RCA: 252] [Impact Index Per Article: 84.0] [Received: 06/23/2021] [Accepted: 07/13/2021] [Indexed: 12/05/2022]
Abstract
Structural biology aims at characterizing the structural and dynamic properties of biological macromolecules in atomic detail. Gaining insight into three-dimensional structures of biomolecules and their interactions is critical for understanding the vast majority of cellular processes, with direct applications in health and food sciences. Since 2010, the WeNMR project (www.wenmr.eu) has implemented numerous web-based services to facilitate the use of advanced computational tools by researchers in the field, using the high throughput computing infrastructure provided by EGI. These services have been further developed in subsequent initiatives under H2020 projects and are now operating as Thematic Services in the European Open Science Cloud portal (www.eosc-portal.eu), sending >12 million jobs and using around 4,000 CPU-years per year. Here we review 10 years of successful e-infrastructure solutions serving a large worldwide community of over 23,000 users to date, providing them with user-friendly, web-based solutions that run complex workflows in structural biology. The current set of active WeNMR portals is described, together with the complex backend machinery that allows distributed computing resources to be harvested efficiently.
|
7
|
MENSAdb: a thorough structural analysis of membrane protein dimers. Database (Oxford) 2021; 2021:baab013. [PMID: 33822911 PMCID: PMC8023553 DOI: 10.1093/database/baab013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2020] [Revised: 01/19/2021] [Accepted: 03/01/2021] [Indexed: 11/14/2022]
Abstract
Membrane proteins (MPs) are key players in a variety of different cellular processes and constitute the target of around 60% of all Food and Drug Administration-approved drugs. Despite their importance, there is still a massive lack of relevant structural, biochemical and mechanistic information mainly due to their localization within the lipid bilayer. To help fill this gap, we developed the MEmbrane protein dimer Novel Structure Analyser database (MENSAdb). This interactive web application summarizes the evolutionary and physicochemical properties of dimeric MPs to expand the available knowledge on the fundamental principles underlying their formation. Currently, MENSAdb contains features of 167 unique MPs (63% homo- and 37% heterodimers) and brings insights into the conservation of residues, accessible solvent area descriptors, average B-factors, intermolecular contacts at 2.5 Å and 4.0 Å distance cut-offs, hydrophobic contacts, hydrogen bonds, salt bridges, π-π stacking, T-stacking and cation-π interactions. The regular update and organization of all these data into a unique platform will allow a broad community of researchers to collect and analyse a large number of features efficiently, thus facilitating their use in the development of prediction models associated with MPs. Database URL: http://www.moreiralab.com/resources/mensadb.
|
8
|
A click-flipped enzyme substrate boosts the performance of the diagnostic screening for Hunter syndrome. Chem Sci 2020; 11:12671-12676. [PMID: 34094461 PMCID: PMC8163285 DOI: 10.1039/d0sc04696e] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/25/2020] [Accepted: 10/23/2020] [Indexed: 11/23/2022]
Abstract
We report on the unexpected finding that click modification of iduronyl azides results in a conformational flip of the pyranose ring, which led to the development of a new strategy for the design of superior enzyme substrates for the diagnostic assaying of iduronate-2-sulfatase (I2S), a lysosomal enzyme related to Hunter syndrome. Synthetic substrates are essential in testing newborns for metabolic disorders to enable early initiation of therapy. Our click-flipped iduronyl triazole showed a remarkably better performance with I2S than commonly used O-iduronates. We found that both O- and triazole-linked substrates are accepted by the enzyme, irrespective of their different conformations, but only the O-linked product inhibits the activity of I2S. Thus, in the long reaction times required for clinical assays, the triazole substrate substantially outperforms the O-iduronate. Applying our click-flipped substrate to assay I2S in dried blood spots sampled from affected patients and random newborns significantly increased the confidence in discriminating between these groups, clearly indicating the potential of the click-flip strategy to control the biomolecular function of carbohydrates.
|
9
|
PRODIGY-crystal: a web-tool for classification of biological interfaces in protein complexes. Bioinformatics 2020; 35:4821-4823. [PMID: 31141126 PMCID: PMC9186318 DOI: 10.1093/bioinformatics/btz437] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 02/18/2019] [Revised: 05/06/2019] [Accepted: 05/25/2019] [Indexed: 01/11/2023]
Abstract
Summary: Distinguishing biologically relevant interfaces from crystallographic ones in biological complexes is fundamental in order to associate cellular functions to the correct macromolecular assemblies. Recently, we described a detailed study reporting the differences in the type of intermolecular residue–residue contacts between biological and crystallographic interfaces. Our findings allowed us to develop a fast predictor of biological interfaces reaching an accuracy of 0.92 and competitive to the current state of the art. Here we present its web-server implementation, PRODIGY-CRYSTAL, aimed at the classification of biological and crystallographic interfaces. PRODIGY-CRYSTAL has the advantage of being fast, accurate and simple. This, together with its user-friendly interface and user support forum, ensures its broad accessibility. Availability and implementation: PRODIGY-CRYSTAL is freely available without registration requirements at https://haddock.science.uu.nl/services/PRODIGY-CRYSTAL.
|
10
|
An overview of data-driven HADDOCK strategies in CAPRI rounds 38-45. Proteins 2019; 88:1029-1036. [PMID: 31886559 DOI: 10.1002/prot.25869] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 07/29/2019] [Revised: 12/17/2019] [Accepted: 12/26/2019] [Indexed: 01/18/2023]
Abstract
Our information-driven docking approach HADDOCK has demonstrated a sustained performance since the start of its participation in CAPRI. This is due, in part, to its ability to integrate data into the modeling process, and to the robustness of its scoring function. We participated in CAPRI both as server and manual predictors. In CAPRI rounds 38-45, we have used various strategies depending on the available information. These ranged from imposing restraints to a few residues identified from literature as being important for the interaction, to binding pockets identified from homologous complexes or template-based refinement/CA-CA restraint-guided docking from identified templates. When relevant, symmetry restraints were used to limit the conformational sampling. We also tested a new implementation of the MARTINI coarse-grained force field in HADDOCK on a large decamer target. Overall, we obtained acceptable or better predictions for 13 and 11 server and manual submissions, respectively, out of the 22 interfaces. Our server performance (acceptable or higher-quality models when considering the top 10) was better (59%) than the manual (50%) one, in which we typically experiment with various combinations of protocols and data sources. Again, our simple scoring function based on a linear combination of intermolecular van der Waals and electrostatic energies and an empirical desolvation term demonstrated a good performance in the scoring experiment with a 63% success rate across all 22 interfaces. An analysis of model quality indicates that, while we are consistently performing well in generating acceptable models, there is room for improvement for generating/identifying higher quality models.
|
11
|
Blind prediction of homo- and hetero-protein complexes: The CASP13-CAPRI experiment. Proteins 2019; 87:1200-1221. [PMID: 31612567 PMCID: PMC7274794 DOI: 10.1002/prot.25838] [Citation(s) in RCA: 79] [Impact Index Per Article: 15.8] [Received: 05/27/2019] [Revised: 09/26/2019] [Accepted: 09/27/2019] [Indexed: 12/28/2022]
Abstract
We present the results for CAPRI Round 46, the third joint CASP-CAPRI protein assembly prediction challenge. The Round comprised a total of 20 targets including 14 homo-oligomers and 6 heterocomplexes. Eight of the homo-oligomer targets and one heterodimer comprised proteins that could be readily modeled using templates from the Protein Data Bank, often available for the full assembly. The remaining 11 targets comprised 5 homodimers, 3 heterodimers, and two higher-order assemblies. These were more difficult to model, as their prediction mainly involved "ab-initio" docking of subunit models derived from distantly related templates. A total of ~30 CAPRI groups, including 9 automatic servers, submitted on average ~2000 models per target. About 17 groups participated in the CAPRI scoring rounds, offered for most targets, submitting ~170 models per target. The prediction performance, measured by the fraction of models of acceptable quality or higher submitted across all predictor groups, was very good to excellent for the nine easy targets. Poorer performance was achieved by predictors for the 11 difficult targets, with medium and high quality models submitted for only 3 of these targets. A similar performance "gap" was displayed by scorer groups, highlighting yet again the unmet challenge of modeling the conformational changes of the protein components that occur upon binding or that must be accounted for in template-based modeling. Our analysis also indicates that residues in binding interfaces were less well predicted in this set of targets than in previous Rounds, providing useful insights for directions of future improvements.
|
12
|
Protein-ligand pose and affinity prediction: Lessons from D3R Grand Challenge 3. J Comput Aided Mol Des 2019; 33:83-91. [PMID: 30128928 PMCID: PMC6373529 DOI: 10.1007/s10822-018-0148-4] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 05/01/2018] [Accepted: 08/09/2018] [Indexed: 12/30/2022]
Abstract
We report the performance of HADDOCK in the 2018 iteration of the Grand Challenge organised by the D3R consortium. Building on the findings of our participation in last year's challenge, we significantly improved our pose prediction protocol, which resulted in a mean RMSD for the top scoring pose of 3.04 and 2.67 Å for the cross-docking and self-docking experiments respectively, which corresponds to an overall success rate of 63% and 71% when considering the top1 and top5 models respectively. This performance ranks HADDOCK as the 6th and 3rd best performing group (excluding multiple submissions from the same group) out of a total of 44 and 47 submissions respectively. Our ligand-based binding affinity predictor is the 3rd best predictor overall, behind only the two leading structure-based implementations, and the best ligand-based one with a Kendall's Tau correlation of 0.36 for the Cathepsin challenge. It also performed well in the classification part of the Kinase challenges, with Matthews Correlation Coefficients of 0.49 (ranked 1st), 0.39 (ranked 4th) and 0.21 (ranked 4th) for the JAK2, vEGFR2 and p38a targets respectively. Through our participation in last year's competition we came to the conclusion that template selection is of critical importance for the successful outcome of the docking. This year we have made improvements in two additional areas of importance: ligand conformer selection and initial positioning, which have been key to our excellent pose prediction performance this year.
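For readers unfamiliar with the classification metric quoted in this abstract, the following is a minimal sketch of how a Matthews Correlation Coefficient is computed from a binary confusion matrix. It is a generic illustration, not the evaluation code used by D3R or the authors; the function name and the example counts are hypothetical.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews Correlation Coefficient from binary confusion-matrix counts.

    Returns a value in [-1, 1]; by convention 0.0 is returned when any
    marginal total is zero and the coefficient is undefined.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return numerator / denominator


# Hypothetical counts for a binder / non-binder classification of 12 ligands.
example_mcc = matthews_corrcoef(tp=6, tn=3, fp=1, fn=2)
print(f"MCC = {example_mcc:.2f}")
```

A value of 1 means perfect classification, 0 is no better than chance, and -1 is total disagreement, which is why the metric is often preferred over plain accuracy on imbalanced ligand sets.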
|
13
|
A Membrane Protein Complex Docking Benchmark. J Mol Biol 2018; 430:5246-5256. [PMID: 30414967 DOI: 10.1016/j.jmb.2018.11.005] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 07/19/2018] [Revised: 11/02/2018] [Accepted: 11/05/2018] [Indexed: 01/15/2023]
Abstract
We report the first membrane protein-protein docking benchmark consisting of 37 targets of diverse functions and folds. The structures were chosen based on a set of parameters such as the availability of unbound structures, the modeling difficulty and their uniqueness. They have been cleaned and consistently numbered to facilitate their use in docking. Using this benchmark, we establish the baseline performance of HADDOCK, without any specific optimization for membrane proteins, for two scenarios: true interface-driven docking and ab initio docking. Despite the fact that HADDOCK has been developed for soluble complexes, it shows promising docking performance for membrane systems, but there is clearly room for further optimization. The resulting set of docking decoys, together with analysis scripts, is made freely available. These can serve as a basis for the optimization of membrane complex-specific scoring functions.
|
14
|
Performance of HADDOCK and a simple contact-based protein-ligand binding affinity predictor in the D3R Grand Challenge 2. J Comput Aided Mol Des 2018; 32:175-185. [PMID: 28831657 PMCID: PMC5767195 DOI: 10.1007/s10822-017-0049-y] [Citation(s) in RCA: 77] [Impact Index Per Article: 12.8] [Received: 06/02/2017] [Accepted: 08/18/2017] [Indexed: 10/28/2022]
Abstract
We present the performance of HADDOCK, our information-driven docking software, in the second edition of the D3R Grand Challenge. In this blind experiment, participants were requested to predict the structures and binding affinities of complexes between the Farnesoid X nuclear receptor and 102 different ligands. The models obtained in Stage1 with HADDOCK and a ligand-specific protocol show an average ligand RMSD of 5.1 Å from the crystal structure. Only 6/35 targets were within 2.5 Å RMSD from the reference, which prompted us to investigate the limiting factors and revise our protocol for Stage2. The choice of the receptor conformation appeared to have the strongest influence on the results. Our Stage2 models were of higher quality (13 out of 35 were within 2.5 Å), with an average RMSD of 4.1 Å. The docking protocol was applied to all 102 ligands to generate poses for binding affinity prediction. We developed a modified version of our contact-based binding affinity predictor PRODIGY, using the number of interatomic contacts classified by their type and the intermolecular electrostatic energy. This simple structure-based binding affinity predictor shows a Kendall's Tau correlation of 0.37 in ranking the ligands (7th best out of 77 methods, 5th/25 groups). Those results were obtained from the average prediction over the top10 poses, irrespective of their similarity/correctness, underscoring the robustness of our simple predictor. This results in an enrichment factor of 2.5 compared to a random predictor for ranking ligands within the top 25%, making it a promising approach to identify lead compounds in virtual screening.
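The enrichment factor mentioned at the end of this abstract can be illustrated with a short sketch. The version below uses one common definition (hit rate in the top-ranked fraction divided by the hit rate expected from random selection); whether the authors used exactly this definition, and how they labelled ligands as "active", is an assumption here. The function name, scores, and labels are hypothetical.

```python
def enrichment_factor(scores, is_active, fraction=0.25, higher_is_better=True):
    """Enrichment factor of a ranking within the top `fraction` of items.

    EF = (actives in top fraction / size of top fraction)
         / (total actives / total items)
    An EF of 1.0 corresponds to a random predictor.
    """
    n = len(scores)
    if n == 0 or n != len(is_active):
        raise ValueError("scores and is_active must be non-empty and equal length")
    total_actives = sum(1 for flag in is_active if flag)
    if total_actives == 0:
        raise ValueError("at least one active item is required")
    n_top = max(1, int(n * fraction))
    # Rank item indices by predicted score, best first.
    order = sorted(range(n), key=lambda i: scores[i], reverse=higher_is_better)
    hits = sum(1 for i in order[:n_top] if is_active[i])
    return (hits / n_top) / (total_actives / n)


# Hypothetical predicted affinities for 8 ligands, 2 of which are true actives.
predicted = [9.1, 8.7, 6.0, 5.5, 5.2, 4.8, 4.1, 3.9]
actives = [True, True, False, False, False, False, False, False]
ef_top25 = enrichment_factor(predicted, actives, fraction=0.25)
print(f"EF(25%) = {ef_top25:.1f}")
```

With a 25% cutoff the maximum achievable value is 4.0, so the reported 2.5 should be read against that ceiling rather than as an unbounded score.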
|
15
|
SpotOn: High Accuracy Identification of Protein-Protein Interface Hot-Spots. Sci Rep 2017; 7:8007. [PMID: 28808256 PMCID: PMC5556074 DOI: 10.1038/s41598-017-08321-2] [Citation(s) in RCA: 55] [Impact Index Per Article: 7.9] [Received: 04/18/2017] [Accepted: 07/07/2017] [Indexed: 12/21/2022]
Abstract
We present SpotOn, a web server to identify and classify interfacial residues as Hot-Spots (HS) and Null-Spots (NS). SpotOn implements a robust algorithm with a demonstrated accuracy of 0.95 and sensitivity of 0.98 on an independent test set. The predictor was developed using an ensemble machine learning approach with up-sampling of the minor class. It was trained on 53 complexes using various features, based on both protein 3D structure and sequence. The SpotOn web interface is freely available at: http://milou.science.uu.nl/services/SPOTON/.
|
16
|
Folding Molecular Dynamics Simulations Accurately Predict the Effect of Mutations on the Stability and Structure of a Vammin-Derived Peptide. J Phys Chem B 2014; 118:10076-84. [DOI: 10.1021/jp5046113] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Indexed: 12/11/2022]
|
17
|
On the application of Good-Turing statistics to quantify convergence of biomolecular simulations. J Chem Inf Model 2014; 54:209-17. [PMID: 24358959 DOI: 10.1021/ci4005817] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Indexed: 11/28/2022]
Abstract
Quantifying convergence and sufficient sampling of macromolecular molecular dynamics simulations is more often than not a source of controversy (and of various ad hoc solutions) in the field. Clearly, the only reasonable, consistent, and satisfying way to infer convergence (or otherwise) of a molecular dynamics trajectory must be based on probability theory. Ideally, the question we would wish to answer is the following: "What is the probability that a molecular configuration important for the analysis in hand has not yet been observed?". Here we propose a method for answering a variant of this question by using the Good-Turing formalism for frequency estimation of unobserved species in a sample. Although several approaches may be followed in order to deal with the problem of discretizing the configurational space, for this work we use the classical RMSD matrix as a means of answering the following question: "What is the probability that a molecular configuration with an RMSD (from all other already observed configurations) higher than a given threshold has not actually been observed?". We apply the proposed method to several different trajectories and show that the procedure appears to be both computationally stable and internally consistent. A free, open-source program implementing these ideas is immediately available for download via public repositories.
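The basic Good-Turing idea referred to in this abstract can be sketched in a few lines: the probability that the next observation belongs to a species not yet seen is estimated as the number of species seen exactly once divided by the total number of observations. The sketch below applies that textbook estimator to discrete cluster labels, which is a simplifying assumption; the paper itself works from the RMSD matrix, and this is not the authors' program. The function name and frame labels are hypothetical.

```python
from collections import Counter


def good_turing_unseen_probability(labels):
    """Good-Turing estimate of the probability mass of unobserved species.

    P(unseen) is approximated by N1 / N, where N1 is the number of species
    observed exactly once and N is the total number of observations.
    """
    labels = list(labels)
    if not labels:
        raise ValueError("at least one observation is required")
    counts = Counter(labels)
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / len(labels)


# Hypothetical conformational-cluster assignments for 10 trajectory frames.
frame_clusters = ["A", "A", "A", "B", "B", "C", "A", "D", "B", "A"]
p_unseen = good_turing_unseen_probability(frame_clusters)
print(f"Estimated probability of an unobserved cluster: {p_unseen:.2f}")
```

In this toy example clusters C and D are each seen once among ten frames, so the estimate is 0.2; a well-sampled trajectory would be expected to drive this value toward zero as frames accumulate.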
|
18
|
Grcarma: A fully automated task-oriented interface for the analysis of molecular dynamics trajectories. J Comput Chem 2013; 34:2310-2. [DOI: 10.1002/jcc.23381] [Citation(s) in RCA: 95] [Impact Index Per Article: 8.6] [Indexed: 12/14/2022]
|