1
Briand E, Kohnke B, Kutzner C, Grubmüller H. Constant pH Simulation with FMM Electrostatics in GROMACS. (A) Design and Applications. J Chem Theory Comput 2025; 21:1762-1786. [PMID: 39919102 PMCID: PMC11866755 DOI: 10.1021/acs.jctc.4c01318] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2024] [Revised: 01/14/2025] [Accepted: 01/16/2025] [Indexed: 02/09/2025]
Abstract
The structural dynamics of biological macromolecules, such as proteins, DNA/RNA, or complexes thereof, are strongly influenced by protonation changes of their typically many titratable groups, which explains their sensitivity to pH changes. Conversely, conformational and environmental changes of the biomolecule affect the protonation state of these groups. With few exceptions, conventional force field-based molecular dynamics (MD) simulations neither account for these effects nor do they allow for coupling to a pH buffer. Here, we present design decisions and applications of a rigorous Hamiltonian interpolation λ-dynamics constant pH method in GROMACS, which rests on GPU-accelerated Fast Multipole Method (FMM) electrostatics. Our implementation supports both CHARMM36m and Amber99sb*-ILDN force fields and is largely automated to enable seamless switching from regular MD to constant pH MD, involving minimal changes to the input files. Here, the first of two companion papers describes the underlying constant pH protocol and sample applications to several prototypical benchmark systems such as cardiotoxin V, lysozyme, and staphylococcal nuclease. Enhanced convergence is achieved through a new dynamic barrier height optimization method, and high pKa accuracy is demonstrated. We use Functional Mode Analysis (FMA) and Mutual Information (MI) to explore the complex intra- and intermolecular couplings between the protonation states of titratable groups as well as those between protonation states and conformational dynamics. We identify striking conformation-dependent pKa variations and unexpected inter-residue couplings. Conformation-protonation coupling is identified as a primary cause of the slow protonation convergence notorious to constant pH simulations involving multiple titratable groups, suggesting enhanced sampling methods to accelerate convergence.
Affiliation(s)
- Eliane Briand
  - Theoretical and Computational Biophysics, Max Planck Institute for Multidisciplinary Sciences, Am Fassberg 11, 37077 Göttingen, Germany
- Bartosz Kohnke
  - Theoretical and Computational Biophysics, Max Planck Institute for Multidisciplinary Sciences, Am Fassberg 11, 37077 Göttingen, Germany
- Carsten Kutzner
  - Theoretical and Computational Biophysics, Max Planck Institute for Multidisciplinary Sciences, Am Fassberg 11, 37077 Göttingen, Germany
- Helmut Grubmüller
  - Theoretical and Computational Biophysics, Max Planck Institute for Multidisciplinary Sciences, Am Fassberg 11, 37077 Göttingen, Germany
2
Wilson CJ, de Groot BL, Gapsys V. Resolving coupled pH titrations using alchemical free energy calculations. J Comput Chem 2024; 45:1444-1455. [PMID: 38471815 DOI: 10.1002/jcc.27318] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2023] [Revised: 01/11/2024] [Accepted: 01/16/2024] [Indexed: 03/14/2024]
Abstract
In a protein, nearby titratable sites can be coupled: the (de)protonation of one may affect the other. The degree of this interaction depends on several factors and can influence the measured pKa. Here, we derive a formalism based on double free energy differences (ΔΔG) for quantifying the individual site pKa values of coupled residues. As ΔΔG values can be obtained by means of alchemical free energy calculations, the presented approach allows for a convenient estimation of coupled residue pKas in practice. We demonstrate that our approach and a previously proposed microscopic pKa formalism can be combined with alchemical free energy calculations to resolve pH-dependent protein pKa values. Toy models and both regular and constant-pH molecular dynamics simulations, alongside experimental data, are used to validate this approach. Our results highlight the insights gleaned when coupling and microstate probabilities are analyzed and suggest extensions to more complex enzymatic contexts. Furthermore, we find that naïvely computed pKa values that ignore coupling can be significantly improved when coupling is accounted for, in some cases reducing the error by half. In short, alchemical free energy methods can resolve the pKa values of both uncoupled and coupled residues.
Affiliation(s)
- Carter J Wilson
  - Department of Mathematics, The University of Western Ontario, London, Ontario, Canada
  - Centre for Advanced Materials and Biomaterials Research (CAMBR), The University of Western Ontario, London, Ontario, Canada
  - Computational Biomolecular Dynamics Group, Department of Theoretical and Computational Biophysics, Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
- Bert L de Groot
  - Computational Biomolecular Dynamics Group, Department of Theoretical and Computational Biophysics, Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
- Vytautas Gapsys
  - Computational Biomolecular Dynamics Group, Department of Theoretical and Computational Biophysics, Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
  - Computational Chemistry, Janssen Research & Development, Beerse, Belgium
3
Wilson C, Karttunen M, de Groot BL, Gapsys V. Accurately Predicting Protein pKa Values Using Nonequilibrium Alchemy. J Chem Theory Comput 2023; 19:7833-7845. [PMID: 37820376 PMCID: PMC10653114 DOI: 10.1021/acs.jctc.3c00721] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/29/2023] [Indexed: 10/13/2023]
Abstract
The stability, solubility, and function of a protein depend on both its net charge and the protonation states of its individual residues. pKa is a measure of the tendency for a given residue to (de)protonate at a specific pH. Although pKa values can be resolved experimentally, theory and computation provide a compelling alternative. To this end, we assess the applicability of a nonequilibrium (NEQ) alchemical free energy method to the problem of pKa prediction. On a data set of 144 residues that span 13 proteins, we report an average unsigned error of 0.77 ± 0.09, 0.69 ± 0.09, and 0.52 ± 0.04 pK for aspartate, glutamate, and lysine, respectively. This is comparable to current state-of-the-art predictors and the accuracy recently reached using free energy perturbation methods (e.g., FEP+). Moreover, we demonstrate that our open-source, pmx-based approach can accurately resolve the pKa values of coupled residues and observe a substantial performance disparity associated with the lysine partial charges in Amber14SB/Amber99SB*-ILDN, for which an underused fix already exists.
Affiliation(s)
- Carter J. Wilson
  - Department of Mathematics, The University of Western Ontario, N6A 5B7 London, Canada
  - Centre for Advanced Materials and Biomaterials Research (CAMBR), The University of Western Ontario, N6A 5B7 London, Canada
- Mikko Karttunen
  - Centre for Advanced Materials and Biomaterials Research (CAMBR), The University of Western Ontario, N6A 5B7 London, Canada
  - Department of Physics & Astronomy, The University of Western Ontario, N6A 5B7 London, Canada
  - Department of Chemistry, The University of Western Ontario, N6A 5B7 London, Canada
- Bert L. de Groot
  - Computational Biomolecular Dynamics Group, Department of Theoretical and Computational Biophysics, Max Planck Institute for Multidisciplinary Sciences, 37077 Göttingen, Germany
- Vytautas Gapsys
  - Computational Biomolecular Dynamics Group, Department of Theoretical and Computational Biophysics, Max Planck Institute for Multidisciplinary Sciences, 37077 Göttingen, Germany
  - Computational Chemistry, Janssen Research & Development, Janssen Pharmaceutica N. V., Turnhoutseweg 30, B-2340 Beerse, Belgium
4
Awoonor-Williams E, Golosov AA, Hornak V. Benchmarking In Silico Tools for Cysteine pKa Prediction. J Chem Inf Model 2023; 63:2170-2180. [PMID: 36996330 DOI: 10.1021/acs.jcim.3c00004] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Indexed: 04/01/2023]
Abstract
Accurate estimation of the pKa's of cysteine residues in proteins could inform targeted approaches in hit discovery. The pKa of a targetable cysteine residue in a disease-related protein is an important physicochemical parameter in covalent drug discovery, as it influences the fraction of nucleophilic thiolate amenable to chemical protein modification. Traditional structure-based in silico tools are limited in their predictive accuracy of cysteine pKa's relative to other titratable residues. Additionally, there are limited comprehensive benchmark assessments for cysteine pKa predictive tools. This raises the need for extensive assessment and evaluation of methods for cysteine pKa prediction. Here, we report the performance of several computational pKa methods, including single-structure and ensemble-based approaches, on a diverse test set of experimental cysteine pKa's retrieved from the PKAD database. The dataset consisted of 16 wildtype and 10 mutant proteins with experimentally measured cysteine pKa values. Our results highlight that these methods are varied in their overall predictive accuracies. Among the test set of wildtype proteins evaluated, the best method (MOE) yielded a mean absolute error of 2.3 pK units, highlighting the need for improvement of existing pKa methods for accurate cysteine pKa estimation. Given the limited accuracy of these methods, further development is needed before these approaches can be routinely employed to drive design decisions in early drug discovery efforts.
Affiliation(s)
- Ernest Awoonor-Williams
  - Novartis Institutes for BioMedical Research, 181 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
- Andrei A Golosov
  - Novartis Institutes for BioMedical Research, 181 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
- Viktor Hornak
  - Novartis Institutes for BioMedical Research, 181 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
5
Aho N, Buslaev P, Jansen A, Bauer P, Groenhof G, Hess B. Scalable Constant pH Molecular Dynamics in GROMACS. J Chem Theory Comput 2022; 18:6148-6160. [PMID: 36128977 PMCID: PMC9558312 DOI: 10.1021/acs.jctc.2c00516] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Indexed: 10/26/2022]
Affiliation(s)
- Noora Aho
  - Nanoscience Center and Department of Chemistry, University of Jyväskylä, 40014 Jyväskylä, Finland
- Pavel Buslaev
  - Nanoscience Center and Department of Chemistry, University of Jyväskylä, 40014 Jyväskylä, Finland
- Anton Jansen
  - Department of Applied Physics and Swedish e-Science Research Center, Science for Life Laboratory, KTH Royal Institute of Technology, 100 44 Stockholm, Sweden
- Paul Bauer
  - Department of Applied Physics and Swedish e-Science Research Center, Science for Life Laboratory, KTH Royal Institute of Technology, 100 44 Stockholm, Sweden
- Gerrit Groenhof
  - Nanoscience Center and Department of Chemistry, University of Jyväskylä, 40014 Jyväskylä, Finland
- Berk Hess
  - Department of Applied Physics and Swedish e-Science Research Center, Science for Life Laboratory, KTH Royal Institute of Technology, 100 44 Stockholm, Sweden
6
Chen AY, Lee J, Damjanovic A, Brooks BR. Protein pKa Prediction by Tree-Based Machine Learning. J Chem Theory Comput 2022; 18:2673-2686. [PMID: 35289611 PMCID: PMC10510853 DOI: 10.1021/acs.jctc.1c01257] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Indexed: 12/22/2022]
Abstract
Protonation states of ionizable protein residues modulate many essential biological processes. For correct modeling and understanding of these processes, it is crucial to accurately determine their pKa values. Here, we present four tree-based machine learning models for protein pKa prediction. The four models, Random Forest, Extra Trees, eXtreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LightGBM), were trained on three experimental PDB and pKa datasets, two of which included a notable portion of internal residues. We observed similar performance among the four machine learning algorithms. The best model trained on the largest dataset performs 37% better than the widely used empirical pKa prediction tool PROPKA and 15% better than the published result from the pKa prediction method DelPhiPKa. The overall root-mean-square error (RMSE) for this model is 0.69, with surface and buried RMSE values being 0.56 and 0.78, respectively, considering six residue types (Asp, Glu, His, Lys, Cys, and Tyr), and 0.63 when considering Asp, Glu, His, and Lys only. We provide pKa predictions for proteins in the human proteome from the AlphaFold Protein Structure Database and observe that 1% of Asp/Glu/Lys residues have highly shifted pKa values close to the physiological pH.
Affiliation(s)
- Ada Y. Chen
  - Department of Physics & Astronomy, Johns Hopkins University, Baltimore, Maryland, 21218
  - Laboratory of Computational Biology, National Heart, Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland, 20892
- Juyong Lee
  - Department of Chemistry, Division of Chemistry and Biochemistry, Kangwon National University, 1 Gangwondaehak-gil, Chuncheon, 24341, Republic of Korea
- Ana Damjanovic
  - Department of Biophysics, Johns Hopkins University, Baltimore, Maryland, 21218
- Bernard R. Brooks
  - Laboratory of Computational Biology, National Heart, Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland, 20892