1. Giese TJ, Zeng J, York DM. Transferability of MACE Graph Neural Network for Range Corrected Δ-Machine Learning Potential QM/MM Applications. J Phys Chem B 2025. [PMID: 40418048 DOI: 10.1021/acs.jpcb.5c02006] [Indexed: 05/27/2025]
Abstract
We previously introduced a "range corrected" Δ-machine learning potential (ΔMLP) that used deep neural networks to improve the accuracy of combined quantum mechanical/molecular mechanical (QM/MM) simulations by correcting both the internal QM and QM/MM interaction energies and forces [J. Chem. Theory Comput. 2021, 17, 6993-7009]. The present work extends this approach to include graph neural networks. Specifically, the approach is applied to the MACE message passing neural network architecture, and a series of AM1/d + MACE models are trained to reproduce PBE0/6-31G* QM/MM energies and forces of model phosphoryl transesterification reactions. Several models are designed to test the transferability of AM1/d + MACE by varying the amount of training data and calculating free energy surfaces of reactions that were not included in the parameter refinement. The transferability is compared to AM1/d + DP models that use the DeepPot-SE (DP) deep neural network architecture. The AM1/d + MACE models are found to reproduce the target free energy surfaces even in instances where the AM1/d + DP models exhibit inaccuracies. We train "end-state" models that include data only from the reactant and product states of the 6 reactions. Unlike the uncorrected AM1/d profiles, the AM1/d + MACE method correctly reproduces a stable pentacoordinated phosphorus intermediate even though the training did not include structures with a similar bonding pattern. Furthermore, the message passing mechanism hyperparameters defining the MACE network are varied to explore their effect on the model's accuracy and performance. The AM1/d + MACE simulations are 28% slower than AM1/d QM/MM when the ΔMLP correction is performed on a graphics processing unit. Our results suggest that the MACE architecture may lead to ΔMLP models with improved transferability.
Affiliation(s)
- Timothy J Giese
  - Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine, and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway 08854, New Jersey, United States
- Jinzhe Zeng
  - School of Artificial Intelligence and Data Science, University of Science and Technology of China, Hefei 230026, China
  - Suzhou Institute for Advanced Research, University of Science and Technology of China, Suzhou Big Data & AI Research and Engineering Center, Suzhou 215123, China
- Darrin M York
  - Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine, and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway 08854, New Jersey, United States
2. Sha X, Chen Z, Xie D, Zhou Y. Modeling Enzyme Reaction and Mutation by Direct Machine Learning/Molecular Mechanics Simulations. J Chem Theory Comput 2025; 21:4335-4346. [PMID: 40273117 DOI: 10.1021/acs.jctc.5c00149] [Indexed: 04/26/2025]
Abstract
Accurately modeling enzyme reactions through direct machine learning/molecular mechanics simulations remains challenging in describing the electrostatic coupling between the QM and MM subsystems. In this work, we proposed a reweighting ME (mechanic embedding) REANN (recursively embedded atom neural network) method that trains the potential and point charges of the QM subsystem in vacuo. The charge equilibration approach has been encoded into REANN to ensure conservation of the total charge of the QM subsystem. Electrostatic coupling is measured by point charges, and the polarization of the MM subsystem on the coupling can be corrected by thermodynamic perturbation after molecular dynamics simulations. We first constructed the REANN surfaces of potential energy and charges for the acylation of cyclooxygenase-1 (COX-1) and cyclooxygenase-2 (COX-2) by aspirin. These surfaces allowed us to reproduce the free energy curves of B3LYP/MM-MD with chemical accuracy. Subsequently, they were successfully applied to R513A of COX-2, reproducing the free energy barrier simulated by B3LYP/MM MD with a difference of less than 0.5 kcal mol⁻¹ and a speedup of 80-fold, revealing that our method can predict the activity of mutants accurately and rapidly. This method is expected to be applied in virtual screening in the future.
Affiliation(s)
- Xinhu Sha
  - Institute of Theoretical and Computational Chemistry, State Key Laboratory of Coordination Chemistry, Key Laboratory of Mesoscopic Chemistry, School of Chemistry and Chemical Engineering, Nanjing University, Nanjing 210023, China
- Zhuo Chen
  - Institute of Theoretical and Computational Chemistry, State Key Laboratory of Coordination Chemistry, Key Laboratory of Mesoscopic Chemistry, School of Chemistry and Chemical Engineering, Nanjing University, Nanjing 210023, China
- Daiqian Xie
  - Institute of Theoretical and Computational Chemistry, State Key Laboratory of Coordination Chemistry, Key Laboratory of Mesoscopic Chemistry, School of Chemistry and Chemical Engineering, Nanjing University, Nanjing 210023, China
  - Hefei National Laboratory, Hefei 230088, China
- Yanzi Zhou
  - Institute of Theoretical and Computational Chemistry, State Key Laboratory of Coordination Chemistry, Key Laboratory of Mesoscopic Chemistry, School of Chemistry and Chemical Engineering, Nanjing University, Nanjing 210023, China
3. Tokita AM, Devergne T, Saitta AM, Behler J. Free energy profiles for chemical reactions in solution from high-dimensional neural network potentials: The case of the Strecker synthesis. J Chem Phys 2025; 162:174120. [PMID: 40326597 DOI: 10.1063/5.0268948] [Received: 03/05/2025] [Accepted: 04/14/2025] [Indexed: 05/07/2025]
Abstract
Machine learning potentials (MLPs) have become a popular tool in chemistry and materials science as they combine the accuracy of electronic structure calculations with the high computational efficiency of analytic potentials. MLPs are particularly useful for computationally demanding simulations such as the determination of free energy profiles governing chemical reactions in solution, but to date, such applications are still rare. In this work, we show how umbrella sampling simulations can be combined with active learning of high-dimensional neural network potentials (HDNNPs) to construct free energy profiles in a systematic way. For the example of the first step of Strecker synthesis of glycine in aqueous solution, we provide a detailed analysis of the improving quality of HDNNPs for datasets of increasing size. We find that, in addition to the typical quantification of energy and force errors with respect to the underlying density functional theory data, the long-term stability of the simulations and the convergence of physical properties should be rigorously monitored to obtain reliable and converged free energy profiles of chemical reactions in solution.
Affiliation(s)
- Alea Miako Tokita
  - Lehrstuhl für Theoretische Chemie II, Ruhr-Universität Bochum, 44780 Bochum, Germany
  - Research Center Chemical Sciences and Sustainability, Research Alliance Ruhr, 44780 Bochum, Germany
- Timothée Devergne
  - Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, F-75005 Paris, France
  - Atomistic Simulations, Italian Institute of Technology, Genova, Italy and Computational Statistics and Machine Learning, Italian Institute of Technology, Genova, Italy
- A Marco Saitta
  - Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, F-75005 Paris, France
- Jörg Behler
  - Lehrstuhl für Theoretische Chemie II, Ruhr-Universität Bochum, 44780 Bochum, Germany
  - Research Center Chemical Sciences and Sustainability, Research Alliance Ruhr, 44780 Bochum, Germany
4. Xie Z, Li Y, Xia Y, Zhang J, Yuan S, Fan C, Yang YI, Gao YQ. Multiscale Force Field Model Based on a Graph Neural Network for Complex Chemical Systems. J Chem Theory Comput 2025; 21:2501-2514. [PMID: 40012469 DOI: 10.1021/acs.jctc.4c01449] [Indexed: 02/28/2025]
Abstract
Inspired by the QM/MM methodology, the ML/MM approach introduces a new opportunity for multiscale simulation, improving the balance between accuracy and computational efficiency. Benefiting from the rapid advancements in molecular embedding methods, density functional theory level quantum mechanical (QM) calculations within the QM/MM framework can be accelerated by several orders of magnitude through the application of machine learning (ML) potential energy surfaces. As a problem inherited from the QM/MM methodology, challenges exist in designing the interactions between machine learning and molecular mechanics (MM) regions. In this study, electrostatic interactions between machine learning and MM atoms are treated by using a graphical neural network based on stationary perturbation theory. In this protocol, we process coordinates and MM charges to yield electrostatic energy and forces, resulting in a high-performance electrostatic embedding ML/MM architecture. The accuracy of the ML/MM energy was validated in aqueous solutions of alanine dipeptide and allyl vinyl ether (AVE). We investigated the transferability of parameters trained from AVE in a single solvent to various other solvents, including water, methanol, dimethyl sulfoxide, toluene, ionic liquids, and water-toluene interface environments. We then established a solvent-free protocol for data set preparation. Comparison of the free energy landscapes of the Claisen rearrangement of AVE in different solvation environments showed the catalytic effect of aqueous solutions, consistent with experiments.
Affiliation(s)
- Zhaoxin Xie
  - Institute of Theoretical and Computational Chemistry, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
  - Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518132, China
- Yanheng Li
  - Institute of Theoretical and Computational Chemistry, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
  - Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518132, China
- Yijie Xia
  - Institute of Theoretical and Computational Chemistry, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Jun Zhang
  - Changping Laboratory, Beijing 102200, China
- Sihao Yuan
  - Institute of Theoretical and Computational Chemistry, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
  - Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518132, China
- Cheng Fan
  - Institute of Theoretical and Computational Chemistry, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
  - Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518132, China
- Yi Isaac Yang
  - Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518132, China
- Yi Qin Gao
  - Institute of Theoretical and Computational Chemistry, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
  - Changping Laboratory, Beijing 102200, China
5. Crha R, Poliak P, Gillhofer M, Oostenbrink C. Alchemical Free-Energy Calculations at Quantum-Chemical Precision. J Phys Chem Lett 2025; 16:863-869. [PMID: 39818976 PMCID: PMC11789145 DOI: 10.1021/acs.jpclett.4c03213] [Received: 11/06/2024] [Revised: 01/08/2025] [Accepted: 01/13/2025] [Indexed: 01/19/2025]
Abstract
In the past decade, machine-learned potentials (MLP) have demonstrated the capability to predict various QM properties learned from a set of reference QM calculations. Accordingly, hybrid QM/MM simulations can be accelerated by replacement of expensive QM calculations with efficient MLP energy predictions. At the same time, alchemical free-energy perturbations (FEP) remain unachievable at the QM level of theory. In this work, we extend the capabilities of the Buffer Region Neural Network (BuRNN) QM/MM scheme toward FEP. BuRNN introduces a buffer region that experiences full electronic polarization by the QM region to minimize artifacts at the QM/MM interface. An MLP is used to predict the energies for the QM region and its interactions with the buffer region. Furthermore, BuRNN allows us to implement FEP directly into the MLP Hamiltonian. Here, we describe the alchemical change from methanol to methane in water at the MLP/MM level as a proof of concept.
Affiliation(s)
- Radek Crha
  - Institute for Molecular Modeling and Simulation, Department of Material Sciences and Process Engineering, University of Natural Resources and Life Sciences, Vienna, Muthgasse 18, Vienna 1190, Austria
  - Christian Doppler Laboratory for Molecular Informatics in the Biosciences, University of Natural Resources and Life Sciences, Vienna 1190, Austria
- Peter Poliak
  - Institute for Molecular Modeling and Simulation, Department of Material Sciences and Process Engineering, University of Natural Resources and Life Sciences, Vienna, Muthgasse 18, Vienna 1190, Austria
  - Institute of Physical Chemistry and Chemical Physics, Faculty of Chemical and Food Technology, Slovak University of Technology in Bratislava, Radlinského 9, Bratislava 812 37, Slovakia
- Michael Gillhofer
  - Institute for Molecular Modeling and Simulation, Department of Material Sciences and Process Engineering, University of Natural Resources and Life Sciences, Vienna, Muthgasse 18, Vienna 1190, Austria
  - Christian Doppler Laboratory for Molecular Informatics in the Biosciences, University of Natural Resources and Life Sciences, Vienna 1190, Austria
- Chris Oostenbrink
  - Institute for Molecular Modeling and Simulation, Department of Material Sciences and Process Engineering, University of Natural Resources and Life Sciences, Vienna, Muthgasse 18, Vienna 1190, Austria
  - Christian Doppler Laboratory for Molecular Informatics in the Biosciences, University of Natural Resources and Life Sciences, Vienna 1190, Austria
6. Arattu Thodika A, Pan X, Shao Y, Nam K. Machine Learning Quantum Mechanical/Molecular Mechanical Potentials: Evaluating Transferability in Dihydrofolate Reductase-Catalyzed Reactions. J Chem Theory Comput 2025; 21:817-832. [PMID: 39815393 PMCID: PMC11781312 DOI: 10.1021/acs.jctc.4c01487] [Received: 11/04/2024] [Revised: 12/30/2024] [Accepted: 01/03/2025] [Indexed: 01/18/2025]
Abstract
Integrating machine learning potentials (MLPs) with quantum mechanical/molecular mechanical (QM/MM) free energy simulations has emerged as a powerful approach for studying enzymatic catalysis. However, its practical application has been hindered by the time-consuming process of generating the necessary training, validation, and test data for MLP models through QM/MM simulations. Furthermore, the entire process needs to be repeated for each specific enzyme system and reaction. To overcome this bottleneck, it is required that trained MLPs exhibit transferability across different enzyme environments and reacting species, thereby eliminating the need for retraining with each new enzyme variant. In this study, we explore this potential by evaluating the transferability of a pretrained ΔMLP model across different enzyme mutations within the MM environment using the QM/MM-based ML architecture developed by Pan, X. J. Chem. Theory Comput. 2021, 17(9), 5745-5758. The study includes scenarios such as single point substitutions, a homologous enzyme from different species, and even a transition to an aqueous environment, where the last two systems have an MM environment that is substantially different from that used in MLP training. The results show that the ΔMLP model effectively captures and predicts the effects of enzyme mutations on electrostatic interactions, producing reliable free energy profiles of enzyme-catalyzed reactions without the need for retraining. The study also identified notable limitations in transferability, particularly when transitioning from enzyme to water-rich MM environments. Overall, this study demonstrates the robustness of Pan et al.'s QM/MM-based ML architecture for application to diverse enzyme systems, as well as the need for further research and the development of more sophisticated MLP models and training methods.
Affiliation(s)
- Abdul Raafik Arattu Thodika
  - Department of Chemistry and Biochemistry, University of Texas at Arlington, Arlington, Texas 76019, United States
- Xiaoliang Pan
  - Department of Chemistry and Biochemistry, University of Oklahoma, Norman, Oklahoma 73019, United States
- Yihan Shao
  - Department of Chemistry and Biochemistry, University of Oklahoma, Norman, Oklahoma 73019, United States
- Kwangho Nam
  - Department of Chemistry and Biochemistry, University of Texas at Arlington, Arlington, Texas 76019, United States
  - Division of Data Science, University of Texas at Arlington, Arlington, Texas 76019, United States
7. Zhang H, Juraskova V, Duarte F. Modelling chemical processes in explicit solvents with machine learning potentials. Nat Commun 2024; 15:6114. [PMID: 39030199 PMCID: PMC11271496 DOI: 10.1038/s41467-024-50418-6] [Received: 08/06/2023] [Accepted: 07/08/2024] [Indexed: 07/21/2024]
Abstract
Solvent effects influence all stages of the chemical processes, modulating the stability of intermediates and transition states, as well as altering reaction rates and product ratios. However, accurately modelling these effects remains challenging. Here, we present a general strategy for generating reactive machine learning potentials to model chemical processes in solution. Our approach combines active learning with descriptor-based selectors and automation, enabling the construction of data-efficient training sets that span the relevant chemical and conformational space. We apply this strategy to investigate a Diels-Alder reaction in water and methanol. The generated machine learning potentials enable us to obtain reaction rates that are in agreement with experimental data and analyse the influence of these solvents on the reaction mechanism. Our strategy offers an efficient approach to the routine modelling of chemical reactions in solution, opening up avenues for studying complex chemical processes in an efficient manner.
Affiliation(s)
- Hanwen Zhang
  - Chemistry Research Laboratory, Oxford, United Kingdom
8. Yang Y, Zhang S, Ranasinghe KD, Isayev O, Roitberg AE. Machine Learning of Reactive Potentials. Annu Rev Phys Chem 2024; 75:371-395. [PMID: 38941524 DOI: 10.1146/annurev-physchem-062123-024417] [Indexed: 06/30/2024]
Abstract
In the past two decades, machine learning potentials (MLPs) have driven significant developments in chemical, biological, and material sciences. The construction and training of MLPs enable fast and accurate simulations and analysis of thermodynamic and kinetic properties. This review focuses on the application of MLPs to reaction systems with consideration of bond breaking and formation. We review the development of MLP models, primarily with neural network and kernel-based algorithms, and recent applications of reactive MLPs (RMLPs) to systems at different scales. We show how RMLPs are constructed, how they speed up the calculation of reactive dynamics, and how they facilitate the study of reaction trajectories, reaction rates, free energy calculations, and many other calculations. Different data sampling strategies applied in building RMLPs are also discussed with a focus on how to collect structures for rare events and how to further improve their performance with active learning.
Affiliation(s)
- Yinuo Yang
  - Department of Chemistry, University of Florida, Gainesville, Florida
- Shuhao Zhang
  - Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania
- Olexandr Isayev
  - Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania
- Adrian E Roitberg
  - Department of Chemistry, University of Florida, Gainesville, Florida
9. Nam K, Shao Y, Major DT, Wolf-Watz M. Perspectives on Computational Enzyme Modeling: From Mechanisms to Design and Drug Development. ACS Omega 2024; 9:7393-7412. [PMID: 38405524 PMCID: PMC10883025 DOI: 10.1021/acsomega.3c09084] [Received: 11/14/2023] [Revised: 01/15/2024] [Accepted: 01/19/2024] [Indexed: 02/27/2024]
Abstract
Understanding enzyme mechanisms is essential for unraveling the complex molecular machinery of life. In this review, we survey the field of computational enzymology, highlighting key principles governing enzyme mechanisms and discussing ongoing challenges and promising advances. Over the years, computer simulations have become indispensable in the study of enzyme mechanisms, with the integration of experimental and computational exploration now established as a holistic approach to gain deep insights into enzymatic catalysis. Numerous studies have demonstrated the power of computer simulations in characterizing reaction pathways, transition states, substrate selectivity, product distribution, and dynamic conformational changes for various enzymes. Nevertheless, significant challenges remain in investigating the mechanisms of complex multistep reactions, large-scale conformational changes, and allosteric regulation. Beyond mechanistic studies, computational enzyme modeling has emerged as an essential tool for computer-aided enzyme design and the rational discovery of covalent drugs for targeted therapies. Overall, enzyme design/engineering and covalent drug development can greatly benefit from our understanding of the detailed mechanisms of enzymes, such as protein dynamics, entropy contributions, and allostery, as revealed by computational studies. Such a convergence of different research approaches is expected to continue, creating synergies in enzyme research. This review, by outlining the ever-expanding field of enzyme research, aims to provide guidance for future research directions and facilitate new developments in this important and evolving field.
Affiliation(s)
- Kwangho Nam
  - Department of Chemistry and Biochemistry, University of Texas at Arlington, Arlington, Texas 76019, United States
- Yihan Shao
  - Department of Chemistry and Biochemistry, University of Oklahoma, Norman, Oklahoma 73019-5251, United States
- Dan T. Major
  - Department of Chemistry and Institute for Nanotechnology & Advanced Materials, Bar-Ilan University, Ramat-Gan 52900, Israel