1
Greener JG. Reversible molecular simulation for training classical and machine-learning force fields. Proc Natl Acad Sci U S A 2025; 122:e2426058122. [PMID: 40434635 DOI: 10.1073/pnas.2426058122] [Received: 12/12/2024] [Accepted: 04/22/2025] [Indexed: 05/29/2025]
Abstract
The next generation of force fields for molecular dynamics will be developed using a wealth of data. Training systematically with experimental data remains a challenge, however, especially for machine-learning potentials. Differentiable molecular simulation calculates gradients of observables with respect to parameters through molecular dynamics trajectories. Here, we improve this approach by explicitly calculating gradients using a reverse-time simulation with effectively constant memory cost and a computation count similar to the forward simulation. The method is applied to learn all-atom water and gas diffusion models with different functional forms and to train a machine-learning potential for diamond from scratch. Comparison to ensemble reweighting indicates that reversible simulation can provide more accurate gradients and train to match time-dependent observables.
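As a rough illustration of why a reverse-time pass can avoid storing the whole trajectory: a time-reversible integrator such as velocity Verlet can be stepped backward from its final state to recover earlier states, up to floating-point round-off. The sketch below is not the paper's algorithm. The 1D harmonic oscillator, the spring constant `k`, and the function names are assumptions chosen purely for illustration.

```python
def force(x, k=1.0):
    """Force of a 1D harmonic spring (hypothetical toy potential U = k * x**2 / 2)."""
    return -k * x


def velocity_verlet(x, v, dt, n_steps, mass=1.0):
    """Advance (x, v) by n_steps of velocity Verlet; a negative dt steps backward."""
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass  # half kick
        x = x + dt * v_half               # drift
        f = force(x)                      # force at the new position
        v = v_half + 0.5 * dt * f / mass  # second half kick
    return x, v


# Run forward, then run backward from the final state with the sign of dt flipped.
x0, v0 = 1.0, 0.0
x1, v1 = velocity_verlet(x0, v0, dt=0.01, n_steps=1000)
x_back, v_back = velocity_verlet(x1, v1, dt=-0.01, n_steps=1000)

# The recovered state should match the initial state up to round-off.
print(abs(x_back - x0), abs(v_back - v0))
```

Because earlier states can be regenerated on the fly in this way, a gradient calculation that walks the trajectory in reverse does not need to keep every forward step in memory.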
Affiliation(s)
- Joe G Greener
- Medical Research Council Laboratory of Molecular Biology, Cambridge CB2 0QH, United Kingdom
2
Gilardoni I, Piomponi V, Fröhlking T, Bussi G. MDRefine: A Python package for refining molecular dynamics trajectories with experimental data. J Chem Phys 2025; 162:192501. [PMID: 40371829 DOI: 10.1063/5.0256841] [Received: 01/07/2025] [Accepted: 04/28/2025] [Indexed: 05/16/2025]
Abstract
Molecular dynamics (MD) simulations play a crucial role in resolving the underlying conformational dynamics of molecular systems. However, their capability to correctly reproduce and predict dynamics in agreement with experiments is limited by the accuracy of the force-field model. This capability can be improved by refining the structural ensembles or the force-field parameters. Furthermore, discrepancies with experimental data can be due to imprecise forward models, namely, functions mapping simulated structures to experimental observables. Here, we introduce MDRefine, a Python package aimed at implementing the refinement of the ensemble, the force field, and/or the forward model by comparing MD-generated trajectories with the experimental data. The software consists of several tools that can be employed separately from each other or combined together in different ways, providing a seamless interpolation between these three different types of refinement. We use some benchmark cases to show that the combined approach is superior to separately applied refinements. MDRefine has been released as an open-source package under the LGPLv2+ license. Source code, documentation, and examples are available at https://pypi.org/project/MDRefine and https://github.com/bussilab/MDRefine.
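One common form of ensemble refinement is maximum-entropy reweighting, in which each trajectory frame receives a weight proportional to exp(-λ·f), where f is the observable computed for that frame and λ is tuned until the reweighted average matches the experimental value. The sketch below shows that idea for a single observable. It does not use MDRefine's API. The function names, the bisection bounds, and the toy data are all assumptions made for illustration.

```python
import math


def reweight(values, lam):
    """Return normalized frame weights proportional to exp(-lam * values[i])."""
    shift = min(lam * v for v in values)  # subtract the smallest exponent for stability
    raw = [math.exp(-(lam * v - shift)) for v in values]
    total = sum(raw)
    return [r / total for r in raw]


def weighted_mean(values, weights):
    """Average of the per-frame observable under the given weights."""
    return sum(v * w for v, w in zip(values, weights))


def fit_lambda(values, target, lo=-50.0, hi=50.0, tol=1e-10):
    """Bisect on lam so that the reweighted mean matches target.

    The reweighted mean decreases monotonically as lam increases,
    so bisection is sufficient for one observable.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if weighted_mean(values, reweight(values, mid)) > target:
            lo = mid  # mean still too high: a larger lam is needed
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical per-frame observable values and a hypothetical experimental target.
values = [1.0, 2.0, 3.0, 4.0]
target = 2.0
lam = fit_lambda(values, target)
weights = reweight(values, lam)
print(lam, weighted_mean(values, weights))
```

With several observables and experimental uncertainties, the same idea becomes a regularized multidimensional optimization over the λ coefficients rather than a one-dimensional bisection.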
Affiliation(s)
- Ivan Gilardoni
- Scuola Internazionale Superiore di Studi Avanzati, SISSA, Via Bonomea, 265, 34136 Trieste, Italy
- Valerio Piomponi
- Area Science Park, Località Padriciano, 99, 34149 Trieste, Italy
- Giovanni Bussi
- Scuola Internazionale Superiore di Studi Avanzati, SISSA, Via Bonomea, 265, 34136 Trieste, Italy
3
Zeng J, Zhang D, Peng A, Zhang X, He S, Wang Y, Liu X, Bi H, Li Y, Cai C, Zhang C, Du Y, Zhu JX, Mo P, Huang Z, Zeng Q, Shi S, Qin X, Yu Z, Luo C, Ding Y, Liu YP, Shi R, Wang Z, Bore SL, Chang J, Deng Z, Ding Z, Han S, Jiang W, Ke G, Liu Z, Lu D, Muraoka K, Oliaei H, Singh AK, Que H, Xu W, Xu Z, Zhuang YB, Dai J, Giese TJ, Jia W, Xu B, York DM, Zhang L, Wang H. DeePMD-kit v3: A Multiple-Backend Framework for Machine Learning Potentials. J Chem Theory Comput 2025; 21:4375-4385. [PMID: 40315155 DOI: 10.1021/acs.jctc.5c00340] [Indexed: 05/04/2025]
Abstract
In recent years, machine learning potentials (MLPs) have become indispensable tools in physics, chemistry, and materials science, driving the development of software packages for molecular dynamics (MD) simulations and related applications. These packages, typically built on specific machine learning frameworks, such as TensorFlow, PyTorch, or JAX, face integration challenges when advanced applications demand communication across different frameworks. The previous TensorFlow-based implementation of the DeePMD-kit exemplified these limitations. In this work, we introduce DeePMD-kit version 3, a significant update featuring a multibackend framework that supports TensorFlow, PyTorch, JAX, and PaddlePaddle backends, and demonstrate the versatility of this architecture through the integration of other MLP packages and of differentiable molecular force fields. This architecture allows seamless back-end switching with minimal modifications, enabling users and developers to integrate DeePMD-kit with other packages using different machine learning frameworks. This innovation facilitates the development of more complex and interoperable workflows, paving the way for broader applications of MLPs in scientific research.
Affiliation(s)
- Jinzhe Zeng
- School of Artificial Intelligence and Data Science, University of Science and Technology of China, Hefei 230026, P. R. China
- Suzhou Institute for Advanced Research, University of Science and Technology of China, Suzhou 215123, P. R. China
- Suzhou Big Data and AI Research and Engineering Center, Suzhou 215123, P. R. China
- Duo Zhang
- AI for Science Institute, Beijing 100080, P. R. China
- DP Technology, Beijing 100080, P. R. China
- Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, P. R. China
- Anyang Peng
- AI for Science Institute, Beijing 100080, P. R. China
- Xiangyu Zhang
- State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100871, P.R. China
- University of Chinese Academy of Sciences, Beijing 100871, P. R. China
- Sensen He
- Baidu Inc., Beijing 100085, P. R. China
- Yan Wang
- State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100871, P.R. China
- University of Chinese Academy of Sciences, Beijing 100871, P. R. China
- Hangrui Bi
- Department of Computer Science, University of Toronto, Toronto, Ontario M5S 1A1, Canada
- Yifan Li
- Department of Chemistry, Princeton University, Princeton, New Jersey 08540, United States
- Chun Cai
- AI for Science Institute, Beijing 100080, P. R. China
- Chengqian Zhang
- Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, P. R. China
- Yiming Du
- State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100871, P.R. China
- University of Chinese Academy of Sciences, Beijing 100871, P. R. China
- Jia-Xin Zhu
- State Key Laboratory of Physical Chemistry of Solid Surfaces, iChEM, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P.R. China
- Pinghui Mo
- College of Integrated Circuits, Hunan University, Changsha 410082, P.R. China
- Zhengtao Huang
- State Key Laboratory of Advanced Technology for Materials Synthesis and Processing, Center for Smart Materials and Device Integration, School of Material Science and Engineering, Wuhan University of Technology, Wuhan 430070, P.R. China
- Qiyu Zeng
- College of Science, National University of Defense Technology, Changsha 410073, P.R. China
- Hunan Key Laboratory of Extreme Matter and Applications, National University of Defense Technology, Changsha 410073, P.R. China
- Xuejian Qin
- Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo 315201, P.R. China
- College of Materials Science and Optoelectronic Technology, University of Chinese Academy of Sciences, Beijing 100049, P.R. China
- Zhaoxi Yu
- Key Laboratory of Theoretical and Computational Photochemistry of Ministry of Education, College of Chemistry, Beijing Normal University, Beijing 100875, P. R. China
- Chenxing Luo
- Department of Geosciences, Princeton University, Princeton, New Jersey 08544, United States
- Department of Applied Physics and Applied Mathematics, Columbia University, New York, New York 10027, United States
- Ye Ding
- DP Technology, Beijing 100080, P. R. China
- Yun-Pei Liu
- Laboratory of AI for Electrochemistry (AI4EC), IKKEM, Xiamen 361005, Fujian, P. R. China
- Ruosong Shi
- Graduate School of China Academy of Engineering Physics, Beijing 100088, P. R. China
- Zhenyu Wang
- Key Laboratory of Material Simulation Methods and Software of Ministry of Education, College of Physics, Jilin University, Changchun 130012, P.R. China
- International Center of Future Science, Jilin University, Changchun 130012, P.R. China
- Sigbjørn Løland Bore
- Department of Chemistry and Hylleraas Centre for Quantum Molecular Sciences, University of Oslo, 0315 Oslo, Norway
- Junhan Chang
- DP Technology, Beijing 100080, P. R. China
- College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, P. R. China
- Zhe Deng
- College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, P. R. China
- Siyuan Han
- State Key Laboratory for Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
- Wanrun Jiang
- AI for Science Institute, Beijing 100080, P. R. China
- Guolin Ke
- DP Technology, Beijing 100080, P. R. China
- Zhaoqing Liu
- College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, P. R. China
- Denghui Lu
- Department of Mechanics and Engineering Science, and HEDPS and CAPT, College of Engineering, Peking University, Beijing 100871, P. R. China
- Koki Muraoka
- Department of Chemical System Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
- Hananeh Oliaei
- Department of Mechanical Science and Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
- Anurag Kumar Singh
- Department of Data Science, Indian Institute of Technology Palakkad, Kerala 678623, India
- Haohui Que
- Shanghai Astronomical Observatory, Chinese Academy of Sciences, Shanghai 200030, P.R. China
- Weihong Xu
- Laboratory of AI for Electrochemistry (AI4EC), IKKEM, Xiamen 361005, Fujian, P. R. China
- Zhangmancang Xu
- International School of Materials Science and Engineering, Wuhan University of Technology, Wuhan 430070, P. R. China
- Yong-Bin Zhuang
- Chaire de Simulation à l'Echelle Atomique (CSEA), Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne CH-1015, Switzerland
- Jiayu Dai
- College of Science, National University of Defense Technology, Changsha 410073, P.R. China
- Hunan Key Laboratory of Extreme Matter and Applications, National University of Defense Technology, Changsha 410073, P.R. China
- Timothy J Giese
- Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, New Jersey 08854, United States
- Weile Jia
- State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100871, P.R. China
- Ben Xu
- Graduate School of Chinese Academy of Engineering Physics, Beijing 100088, P.R. China
- Darrin M York
- Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, New Jersey 08854, United States
- Linfeng Zhang
- AI for Science Institute, Beijing 100080, P. R. China
- DP Technology, Beijing 100080, P. R. China
- Han Wang
- National Key Laboratory of Computational Physics, Institute of Applied Physics and Computational Mathematics, Fenghao East Road 2, Beijing 100094, P.R. China
- HEDPS, CAPT, College of Engineering, Peking University, Beijing 100871, P.R. China
4
Feng W, Zhang L, Cheng Y, Wu J, Wei C, Zhang J, Yu K. Screening and Design of Aqueous Zinc Battery Electrolytes Based on the Multimodal Optimization of Molecular Simulation. J Phys Chem Lett 2025; 16:3326-3335. [PMID: 40130824 DOI: 10.1021/acs.jpclett.5c00341] [Indexed: 03/26/2025]
Abstract
Aqueous batteries, such as aqueous zinc-ion batteries (AZIB), have garnered significant attention because of their advantages in intrinsic safety, low cost, and eco-friendliness. However, aqueous electrolytes tend to freeze at low temperatures, which limits their potential industrial applications. Thus, one of the core challenges in aqueous electrolyte design is optimizing the formula to prevent freezing while maintaining good ion conductivity. However, the experimental trial-and-error approach is inefficient for this purpose, and existing simulation tools are either inaccurate or too expensive for high-throughput phase transition predictions. In this work, we employ a small amount of experimental data and differentiable simulation techniques to develop a multimodal optimization workflow. With minimal human intervention, this workflow significantly enhances the prediction power of classical force fields for electrical conductivity. Most importantly, the simulated electrical conductivity can serve as an effective predictor of electrolyte freezing at low temperatures. Generally, the workflow developed in this work introduces a new paradigm for electrolyte design. This paradigm leverages both easily measurable experimental data and fast simulation techniques to predict properties that are challenging to access by using either approach alone.
Affiliation(s)
- Wei Feng
- Tsinghua Shenzhen International Graduate School, Shenzhen 518055, Guangdong, P. R. China
- Luyan Zhang
- Tsinghua Shenzhen International Graduate School, Shenzhen 518055, Guangdong, P. R. China
- Yaobo Cheng
- Shenzhen Cubic-Science Company, Ltd., Shenzhen 518052, Guangdong, P. R. China
- Jin Wu
- Shenzhen Cubic-Science Company, Ltd., Shenzhen 518052, Guangdong, P. R. China
- Chunguang Wei
- Shenzhen Cubic-Science Company, Ltd., Shenzhen 518052, Guangdong, P. R. China
- Junwei Zhang
- Shenzhen Cubic-Science Company, Ltd., Shenzhen 518052, Guangdong, P. R. China
- Kuang Yu
- Tsinghua Shenzhen International Graduate School, Shenzhen 518055, Guangdong, P. R. China
5
Yati, Kokane Y, Mondal A. Active-Learning Assisted General Framework for Efficient Parameterization of Force-Fields. J Chem Theory Comput 2025; 21:2638-2654. [PMID: 39999292 DOI: 10.1021/acs.jctc.5c00061] [Indexed: 02/27/2025]
Abstract
This work presents an efficient approach to optimizing force field parameters for sulfone molecules using a combination of genetic algorithms (GA) and Gaussian process regression (GPR). Sulfone-based electrolytes are of significant interest in energy storage applications, where accurate modeling of their structural and transport properties is essential. Traditional force field parametrization methods are often computationally expensive and require extensive manual intervention. By integrating GA and GPR, our active learning framework addresses these challenges by achieving optimized parameters in 12 iterations using only 300 data points, significantly outperforming previous attempts requiring thousands of iterations and parameters. We demonstrate the efficiency of our method through a comparison with state-of-the-art techniques, including Bayesian Optimization. The optimized GA-GPR force field was validated against experimental and reference data, including density, viscosity, diffusion coefficients, and surface tension. The results demonstrated excellent agreement between GA-GPR predictions and experimental values, outperforming the widely used OPLS force field. The GA-GPR model accurately captured both bulk and interfacial properties, effectively describing molecular mobility, caging effects, and interfacial arrangements. Furthermore, the transferability of the GA-GPR force field across different temperatures and sulfone structures underscores its robustness and versatility. Our study provides a reliable and transferable force field for sulfone molecules, significantly enhancing the accuracy and efficiency of molecular simulations. This work establishes a strong foundation for future machine learning-driven force field development, applicable to complex molecular systems.
Affiliation(s)
- Yati
- Department of Chemistry, Indian Institute of Technology Gandhinagar, Gandhinagar, Gujarat 382355, India
- Yash Kokane
- Department of Materials Engineering, Indian Institute of Technology Gandhinagar, Gandhinagar, Gujarat 382355, India
- Anirban Mondal
- Department of Chemistry, Indian Institute of Technology Gandhinagar, Gandhinagar, Gujarat 382355, India
6
Chen J, Gao Q, Huang M, Yu K. Application of modern artificial intelligence techniques in the development of organic molecular force fields. Phys Chem Chem Phys 2025; 27:2294-2319. [PMID: 39820957 DOI: 10.1039/d4cp02989e] [Indexed: 01/19/2025]
Abstract
The molecular force field (FF) determines the accuracy of molecular dynamics (MD) and is one of the major bottlenecks that limits the application of MD in molecular design. Recently, artificial intelligence (AI) techniques, such as machine-learning potentials (MLPs), have been rapidly reshaping the landscape of MD. Meanwhile, organic molecular systems feature unique characteristics, and require more careful treatment in both model construction, optimization, and validation. While an accurate and generic organic molecular force field is still missing, significant progress has been made with the facilitation of AI, warranting a promising future. In this review, we provide an overview of the various types of AI techniques used in molecular FF development and discuss both the advantages and weaknesses of these methodologies. We show how AI methods provide unprecedented capabilities in many tasks such as potential fitting, atom typification, and automatic optimization. Meanwhile, it is also worth noting that more efforts are needed to improve the transferability of the model, develop a more comprehensive database, and establish more standardized validation procedures. With these discussions, we hope to inspire more efforts to solve the existing problems, eventually leading to the birth of next-generation generic organic FFs.
Affiliation(s)
- Junmin Chen
- Institute of Materials Research (IMR), Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, China
- Tsinghua-Berkeley Shenzhen Institute (TBSI), Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, China
- Qian Gao
- Institute of Materials Research (IMR), Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, China
- Miaofei Huang
- Institute of Materials Research (IMR), Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, China
- Kuang Yu
- Institute of Materials Research (IMR), Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, China
- Tsinghua-Berkeley Shenzhen Institute (TBSI), Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, China
7
Zhong Z, Xu L, Jiang J. A Neural-Network-Based Mapping and Optimization Framework for High-Precision Coarse-Grained Simulation. J Chem Theory Comput 2025; 21:859-870. [PMID: 39782000 DOI: 10.1021/acs.jctc.4c01466] [Indexed: 01/12/2025]
Abstract
The accuracy and efficiency of a coarse-grained (CG) force field are pivotal for high-precision molecular simulations of large systems with complex molecules. We present an automated mapping and optimization framework for molecular simulation (AMOFMS), which is designed to streamline and improve the force field optimization process. It features a neural-network-based mapping function, DSGPM-TP (deep supervised graph partitioning model with type prediction). This model can accurately and efficiently convert atomistic structures to CG mappings, reducing the need for manual intervention. By integrating bottom-up and top-down methodologies, AMOFMS allows users to freely combine these approaches or use them independently as optimization targets. Moreover, users can select and combine different optimizers to meet their specific mission. With its parallel optimizer, AMOFMS significantly accelerates the optimization process, reducing the time required to achieve optimal results. Successful applications of AMOFMS include parameter optimizations for systems such as POPC and PEO, demonstrating its robustness and effectiveness. Overall, AMOFMS provides a general and flexible framework for the automated development of high-precision CG force fields.
Affiliation(s)
- Zhixuan Zhong
- Beijing National Laboratory for Molecular Sciences, State Key Laboratory of Polymer Physics and Chemistry, Institute of Chemistry, Chinese Academy of Sciences, Beijing 100190, P. R. China
- University of Chinese Academy of Sciences, Beijing 100049, P. R. China
- Lifeng Xu
- Beijing National Laboratory for Molecular Sciences, State Key Laboratory of Polymer Physics and Chemistry, Institute of Chemistry, Chinese Academy of Sciences, Beijing 100190, P. R. China
- University of Chinese Academy of Sciences, Beijing 100049, P. R. China
- Jian Jiang
- Beijing National Laboratory for Molecular Sciences, State Key Laboratory of Polymer Physics and Chemistry, Institute of Chemistry, Chinese Academy of Sciences, Beijing 100190, P. R. China
- University of Chinese Academy of Sciences, Beijing 100049, P. R. China
8
Han B, Yu K. Refining potential energy surface through dynamical properties via differentiable molecular simulation. Nat Commun 2025; 16:816. [PMID: 39827185 PMCID: PMC11742923 DOI: 10.1038/s41467-025-56061-z] [Received: 07/01/2024] [Accepted: 01/08/2025] [Indexed: 01/22/2025]
Abstract
Recently, machine learning potentials (MLPs) have greatly enhanced the reliability of molecular dynamics, but their accuracy is limited by the underlying ab initio methods. A viable approach to overcome this limitation is to refine the potential by learning from experimental data, which can now be done efficiently using modern automatic differentiation techniques. However, potential refinement is mostly performed using thermodynamic properties, leaving the most accessible and informative dynamical data (like spectroscopy) unexploited. In this work, through a comprehensive application of adjoint and gradient truncation methods, we show that both memory and gradient explosion issues can be circumvented in many situations, so the dynamical property differentiation is well-behaved. Consequently, both transport coefficients and spectroscopic data can be used to improve the density functional theory based MLP towards higher accuracy. Essentially, this work contributes to the solution of the inverse problem of spectroscopy by extracting microscopic interactions from vibrational spectroscopic data.
Affiliation(s)
- Bin Han
- Institute of Materials Research, Tsinghua Shenzhen International Graduate School (TSIGS), Shenzhen, PR China
- Kuang Yu
- Institute of Materials Research, Tsinghua Shenzhen International Graduate School (TSIGS), Shenzhen, PR China
9
Li Y, Jin X, Moubarak E, Smit B. A Refined Set of Universal Force Field Parameters for Some Metal Nodes in Metal-Organic Frameworks. J Chem Theory Comput 2024; 20:10540-10552. [PMID: 39601035 PMCID: PMC11635978 DOI: 10.1021/acs.jctc.4c01113] [Received: 08/23/2024] [Revised: 11/14/2024] [Accepted: 11/14/2024] [Indexed: 11/29/2024]
Abstract
Metal-organic frameworks (MOFs) exhibit promise as porous materials for carbon capture due to their design versatility and large pore sizes. The generic force fields (e.g., UFF and Dreiding) use one universal set of Lennard-Jones parameters for each element, while MOFs have a much richer local chemical environment than those chemical environments used to fit the UFF. When MOFs contain hard-Lewis acid metals, UFF systematically overestimates CO2 uptakes. To address this, we developed a workflow to affordably and efficiently generate reliable force fields to predict CO2 adsorption isotherms of MOFs containing metals from groups IIA (Mg, Ca, Sr, and Ba) and IIIA (Al, Ga, and In), connected to various carboxylate ligands. This method uses experimental isotherms as input. The optimal parameters are obtained by minimizing the loss function of the experimental and simulated isotherms, in which we use the Multistate Bennett Acceptance Ratio (MBAR) theory to derive the functionality relationship of loss functions in terms of force field parameters.
Affiliation(s)
- Yutao Li
- Laboratory of molecular simulation (LSMO), Institut des Sciences et Ingénierie Chimiques, École Polytechnique Fédérale de Lausanne (EPFL), Rue de l’Industrie 17, CH-1951 Sion, Switzerland
- Xin Jin
- Laboratory of molecular simulation (LSMO), Institut des Sciences et Ingénierie Chimiques, École Polytechnique Fédérale de Lausanne (EPFL), Rue de l’Industrie 17, CH-1951 Sion, Switzerland
- Elias Moubarak
- Laboratory of molecular simulation (LSMO), Institut des Sciences et Ingénierie Chimiques, École Polytechnique Fédérale de Lausanne (EPFL), Rue de l’Industrie 17, CH-1951 Sion, Switzerland
- Berend Smit
- Laboratory of molecular simulation (LSMO), Institut des Sciences et Ingénierie Chimiques, École Polytechnique Fédérale de Lausanne (EPFL), Rue de l’Industrie 17, CH-1951 Sion, Switzerland
10
Neikha K, Puzari A. Metal-Organic Frameworks through the Lens of Artificial Intelligence: A Comprehensive Review. Langmuir 2024; 40:21957-21975. [PMID: 39382843 DOI: 10.1021/acs.langmuir.4c03126] [Indexed: 10/10/2024]
Abstract
Metal-organic frameworks (MOFs) are a class of hybrid porous materials that have gained prominence as a noteworthy material with varied applications. Currently, MOFs are in extensive use, particularly in the realms of energy and catalysis. The synthesis of these materials poses considerable challenges, and their computational analysis is notably intricate due to their complex structure and versatile applications in the field of material science. Density functional theory (DFT) has helped researchers in understanding reactions and mechanisms, but it is costly and time-consuming and requires bigger systems to perform these calculations. Machine learning (ML) techniques were adopted in order to overcome these problems by implementing ML in material data sets for synthesis, structure, and property predictions of MOFs. These predictions are fast, efficient, and accurate and do not require heavy computing. In this review, we discuss ML models used in MOF and their incorporation with artificial intelligence (AI) in structure and property predictions. The advantage of AI in this field would accelerate research, particularly in synthesizing novel MOFs with multiple properties and applications oriented with minimum information.
Affiliation(s)
- Kevizali Neikha
- Department of Chemistry, National Institute of Technology Nagaland, Chumoukedima, Nagaland 797103, India
- Amrit Puzari
- Department of Chemistry, National Institute of Technology Nagaland, Chumoukedima, Nagaland 797103, India
11
Xu L, Jiang J. Synergistic Integration of Physical Embedding and Machine Learning Enabling Precise and Reliable Force Field. J Chem Theory Comput 2024. [PMID: 39264358 DOI: 10.1021/acs.jctc.4c00618] [Indexed: 09/13/2024]
Abstract
Machine-learning force fields have achieved significant strides in accurately reproducing the potential energy surface with quantum chemical accuracy. However, this approach still faces several challenges, e.g., extrapolating to uncharted chemical spaces, interpreting long-range electrostatics, and mapping complex macroscopic properties. To address these issues, we advocate for a synergistic integration of physical principles and machine learning techniques within the framework of a physically informed neural network (PINN). This approach involves incorporating physical knowledge into the parameters of the neural network, coupled with an efficient global optimizer, the Tabu-Adam algorithm, proposed in this work to augment optimization under strict physical constraint. We choose the AMOEBA+ force field as the physics-based model for embedding and then train and test it using the diethylene glycol dimethyl ether (DEGDME) data set as a case study. The results reveal a breakthrough in constructing a precise and noise-robust machine learning force field. Utilizing two training sets with hundreds of samples, our model exhibits remarkable generalization and density functional theory (DFT) accuracy in describing molecular interactions and enables a precise prediction of the macroscopic properties such as the diffusion coefficient with minimal cost. This work provides valuable insight into establishing a fundamental framework of the PINN force field.
Affiliation(s)
- Lifeng Xu
- Beijing National Laboratory for Molecular Sciences, State Key Laboratory of Polymer Physics and Chemistry, Institute of Chemistry, Chinese Academy of Sciences, Beijing 100190, P. R. China
- University of Chinese Academy of Sciences, Beijing 100049, P. R. China
- Jian Jiang
- Beijing National Laboratory for Molecular Sciences, State Key Laboratory of Polymer Physics and Chemistry, Institute of Chemistry, Chinese Academy of Sciences, Beijing 100190, P. R. China
- University of Chinese Academy of Sciences, Beijing 100049, P. R. China
12
Takaba K, Friedman AJ, Cavender CE, Behara PK, Pulido I, Henry MM, MacDermott-Opeskin H, Iacovella CR, Nagle AM, Payne AM, Shirts MR, Mobley DL, Chodera JD, Wang Y. Machine-learned molecular mechanics force fields from large-scale quantum chemical data. Chem Sci 2024; 15:12861-12878. [PMID: 39148808 PMCID: PMC11322960 DOI: 10.1039/d4sc00690a] [Received: 01/29/2024] [Accepted: 06/17/2024] [Indexed: 08/17/2024]
Abstract
The development of reliable and extensible molecular mechanics (MM) force fields (fast, empirical models characterizing the potential energy surface of molecular systems) is indispensable for biomolecular simulation and computer-aided drug design. Here, we introduce a generalized and extensible machine-learned MM force field, espaloma-0.3, and an end-to-end differentiable framework using graph neural networks to overcome the limitations of traditional rule-based methods. Trained in a single GPU-day to fit a large and diverse quantum chemical dataset of over 1.1 M energy and force calculations, espaloma-0.3 reproduces quantum chemical energetic properties of chemical domains highly relevant to drug discovery, including small molecules, peptides, and nucleic acids. Moreover, this force field maintains the quantum chemical energy-minimized geometries of small molecules and preserves the condensed phase properties of peptides and folded proteins, self-consistently parametrizing proteins and ligands to produce stable simulations leading to highly accurate predictions of binding free energies. This methodology demonstrates significant promise as a path forward for systematically building more accurate force fields that are easily extensible to new chemical domains of interest.
Affiliation(s)
- Kenichiro Takaba
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
- Pharmaceuticals Research Center, Advanced Drug Discovery, Asahi Kasei Pharma Corporation, Shizuoka 410-2321, Japan
- Anika J Friedman
- Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO 80309, USA
- Chapin E Cavender
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA
- Pavan Kumar Behara
- Center for Neurotherapeutics, Department of Pathology and Laboratory Medicine, University of California, Irvine, CA 92697, USA
- Iván Pulido
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
- Michael M Henry
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
- Christopher R Iacovella
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
- Arnav M Nagle
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
- Department of Bioengineering, University of California, Berkeley, Berkeley, CA 94720, USA
- Alexander Matthew Payne
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
- Tri-Institutional PhD Program in Chemical Biology, Memorial Sloan Kettering Cancer Center, New York 10065, USA
- Michael R Shirts
- Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO 80309, USA
- David L Mobley
- Department of Pharmaceutical Sciences, University of California, Irvine, California 92697, USA
- John D Chodera
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
- Yuanqing Wang
- Simons Center for Computational Physical Chemistry and Center for Data Science, New York University, New York, NY 10004, USA
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
13
Wang L, Behara PK, Thompson MW, Gokey T, Wang Y, Wagner JR, Cole DJ, Gilson MK, Shirts MR, Mobley DL. The Open Force Field Initiative: Open Software and Open Science for Molecular Modeling. J Phys Chem B 2024; 128:7043-7067. [PMID: 38989715 DOI: 10.1021/acs.jpcb.4c01558] [Indexed: 07/12/2024]
Abstract
Force fields are a key component of physics-based molecular modeling, describing the energies and forces in a molecular system as a function of the positions of the atoms and molecules involved. Here, we provide a review and scientific status report on the work of the Open Force Field (OpenFF) Initiative, which focuses on the science, infrastructure and data required to build the next generation of biomolecular force fields. We introduce the OpenFF Initiative and the related OpenFF Consortium, describe its approach to force field development and software, and discuss accomplishments to date as well as future plans. OpenFF releases both software and data under open and permissive licensing agreements to enable rapid application, validation, extension, and modification of its force fields and software tools. We discuss lessons learned to date in this new approach to force field development. We also highlight ways that other force field researchers can get involved, as well as some recent successes of outside researchers taking advantage of OpenFF tools and data.
Affiliation(s)
- Lily Wang
- Open Force Field, Open Molecular Software Foundation, Davis, California 95616, United States
- Pavan Kumar Behara
- Center for Neurotherapeutics, University of California, Irvine, California 92697, United States
- Matthew W Thompson
- Open Force Field, Open Molecular Software Foundation, Davis, California 95616, United States
- Trevor Gokey
- Department of Chemistry, University of California, Irvine, California 92697, United States
- Yuanqing Wang
- Simons Center for Computational Physical Chemistry and Center for Data Science, New York, New York 10004, United States
- Jeffrey R Wagner
- Open Force Field, Open Molecular Software Foundation, Davis, California 95616, United States
- Daniel J Cole
- School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne NE1 7RU, United Kingdom
- Michael K Gilson
- Skaggs School of Pharmacy and Pharmaceutical Sciences, The University of California at San Diego, La Jolla, California 92093, United States
- Michael R Shirts
- Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, Colorado 80305, United States
- David L Mobley
- Department of Chemistry, University of California, Irvine, California 92697, United States
- Department of Pharmaceutical Sciences, University of California, Irvine, California 92697, United States
14
Cheng Z, Bi H, Liu S, Chen J, Misquitta AJ, Yu K. Developing a Differentiable Long-Range Force Field for Proteins with E(3) Neural Network-Predicted Asymptotic Parameters. J Chem Theory Comput 2024; 20:5598-5608. [PMID: 38888427 DOI: 10.1021/acs.jctc.4c00337] [Indexed: 06/20/2024]
Abstract
Accurately describing long-range interactions is a significant challenge in molecular dynamics (MD) simulations of proteins. A high-quality long-range potential is also an important component of the range-separated machine learning force field. This study introduces a comprehensive asymptotic parameter database encompassing atomic multipole moments, polarizabilities, and dispersion coefficients. Leveraging active learning, our database comprehensively represents protein fragments with up to 8 heavy atoms, capturing their conformational diversity with merely 78,000 data points. Additionally, the E(3) neural network (E3NN) is employed to predict the asymptotic parameters directly from the local geometry. The E3NN models demonstrate exceptional accuracy and transferability across all asymptotic parameters, achieving an R² of 0.999 for both protein fragments and 20 amino acid dipeptide test sets. The long-range electrostatic and dispersion energies can be obtained using the E3NN-predicted parameters, with errors of 0.07 and 0.02 kcal/mol, respectively, when compared to symmetry-adapted perturbation theory (SAPT). Therefore, our force fields demonstrate the capability to accurately describe long-range interactions in proteins, paving the way for next-generation protein force fields.
Affiliation(s)
- Zheng Cheng
- School of Mathematical Sciences, Peking University, Beijing 100871, China
- AI for Science Institute, Beijing 100084, P. R. China
- Hangrui Bi
- School of Mathematical Sciences, Peking University, Beijing 100871, China
- DP Technology, Beijing 100080, P. R. China
- Siyuan Liu
- DP Technology, Beijing 100080, P. R. China
- Junmin Chen
- Tsinghua-Berkeley Shenzhen Institute, Shenzhen 518055, Guangdong, P. R. China
- Tsinghua Shenzhen International Graduate School, Shenzhen 518055, Guangdong, P. R. China
- Alston J Misquitta
- School of Physics and Astronomy, Queen Mary, University of London, London E1 4NS, U.K.
- Kuang Yu
- Tsinghua-Berkeley Shenzhen Institute, Shenzhen 518055, Guangdong, P. R. China
- Tsinghua Shenzhen International Graduate School, Shenzhen 518055, Guangdong, P. R. China
15
Kumar A, MacKerell AD. FFParam-v2.0: A Comprehensive Tool for CHARMM Additive and Drude Polarizable Force-Field Parameter Optimization and Validation. J Phys Chem B 2024; 128:4385-4395. [PMID: 38690986 PMCID: PMC11260432 DOI: 10.1021/acs.jpcb.4c01314] [Indexed: 05/03/2024]
Abstract
Developing production quality CHARMM force-field (FF) parameters is a very detailed process involving a variety of calculations, many of which are specific for the molecule of interest. The first version of FFParam was developed as a standalone Python package designed for the optimization of electrostatic and bonded parameters of the CHARMM additive and polarizable Drude FFs by using quantum mechanical (QM) target data. The new version of FFParam has multiple new capabilities for FF parameter optimization and validation, with an emphasis on the ability to use condensed-phase target data in optimization. FFParam-v2 allows optimization of Lennard-Jones (LJ) parameters using potential energy scans of interactions between selected atoms in a molecule and noble gases, viz., He and Ne, and through condensed-phase calculations, from which experimental observables such as heats of vaporization and free energies of solvation may be obtained. This functionality serves as a gold standard for both optimizing parameters and validating the performance of the final parameters. A new bonded parameter optimization algorithm has been introduced to account for simultaneously optimizing multiple molecules sharing parameters. FFParam-v2 also supports the comparison of normal modes and the potential energy distribution of internal coordinates towards each normal mode obtained from QM and molecular mechanics calculations. Such comparison capability is vital to validate the balance among various bonded parameters that contribute to the complex normal modes of molecules. User interaction has been extended beyond the original graphical user interface to include command-line interface capabilities that allow for integration of FFParam in workflows, thereby facilitating the automation of parameter optimization. With these new functionalities, FFParam is a more comprehensive parameter optimization tool for both beginners and advanced users.
Affiliation(s)
- Anmol Kumar
- Department of Pharmaceutical Sciences, School of Pharmacy, University of Maryland, Baltimore, MD 21201, USA
- Alexander D. MacKerell
- Department of Pharmaceutical Sciences, School of Pharmacy, University of Maryland, Baltimore, MD 21201, USA
16
Orlando G, Serrano L, Schymkowitz J, Rousseau F. Integrating physics in deep learning algorithms: a force field as a PyTorch module. Bioinformatics 2024; 40:btae160. [PMID: 38514422 PMCID: PMC11007235 DOI: 10.1093/bioinformatics/btae160] [Received: 09/01/2023] [Revised: 02/08/2024] [Accepted: 03/19/2024] [Indexed: 03/23/2024] Open
Abstract
MOTIVATION: Deep learning algorithms applied to structural biology often struggle to converge to meaningful solutions when limited data is available, since they are required to learn complex physical rules from examples. State-of-the-art force fields, however, cannot interface with deep learning algorithms due to their implementation.
RESULTS: We present MadraX, a force field implemented as a differentiable PyTorch module, able to interact with deep learning algorithms in an end-to-end fashion.
AVAILABILITY AND IMPLEMENTATION: MadraX documentation, together with tutorials and an installation guide, is available at madrax.readthedocs.io.
Affiliation(s)
- Gabriele Orlando
- Switch Laboratory, VIB Center for Brain and Disease Research, VIB, Leuven 3000, Belgium
- Switch Laboratory, Department of Cellular and Molecular Medicine, KU Leuven, Leuven 3000, Belgium
- Switch Laboratory, VIB Center for AI & Computational Biology, VIB, Leuven 3000, Belgium
- Luis Serrano
- Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Dr Aiguader 88, Barcelona 08003, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- ICREA, Pg. Lluis Companys 23, Barcelona 08010, Spain
- Joost Schymkowitz
- Switch Laboratory, VIB Center for Brain and Disease Research, VIB, Leuven 3000, Belgium
- Switch Laboratory, Department of Cellular and Molecular Medicine, KU Leuven, Leuven 3000, Belgium
- Switch Laboratory, VIB Center for AI & Computational Biology, VIB, Leuven 3000, Belgium
- Frederic Rousseau
- Switch Laboratory, VIB Center for Brain and Disease Research, VIB, Leuven 3000, Belgium
- Switch Laboratory, Department of Cellular and Molecular Medicine, KU Leuven, Leuven 3000, Belgium
- Switch Laboratory, VIB Center for AI & Computational Biology, VIB, Leuven 3000, Belgium
17
Chen J, Yu K. PhyNEO: A Neural-Network-Enhanced Physics-Driven Force Field Development Workflow for Bulk Organic Molecule and Polymer Simulations. J Chem Theory Comput 2024; 20:253-265. [PMID: 38118076 DOI: 10.1021/acs.jctc.3c01045] [Indexed: 12/22/2023]
Abstract
An accurate, generalizable, and transferable force field plays a crucial role in the molecular dynamics simulations of organic polymers and biomolecules. Conventional empirical force fields often fail to capture precise intermolecular interactions due to their neglect of important physics, such as polarization, charge penetration, and many-body dispersion. Moreover, the parameterization of these force fields relies heavily on top-down fittings, limiting their transferability to new systems where the experimental data are often unavailable. To address these challenges, we introduce a general and fully ab initio force field construction strategy, named PhyNEO. It features a hybrid approach that combines both the physics-driven and the data-driven methods and is able to generate a bulk potential with chemical accuracy using only quantum chemistry data of very small clusters. Careful separations of long-/short-range interactions and nonbonding/bonding interactions are the key to the success of PhyNEO. By such a strategy, we mitigate the limitations of pure data-driven methods in long-range interactions, thus largely increasing the data efficiency and the scalability of machine learning models. The new approach is thoroughly tested on poly(ethylene oxide) and polyethylene glycol systems, giving superior accuracies in both microscopic and bulk properties compared to conventional force fields. This work thus offers a promising framework for the development of advanced force fields in a wide range of organic molecular systems.
Affiliation(s)
- Junmin Chen
- Tsinghua-Berkeley Shenzhen Institute, Tsinghua University, Shenzhen, Guangdong 518055, P. R. China
- Kuang Yu
- Tsinghua-Berkeley Shenzhen Institute, Tsinghua University, Shenzhen, Guangdong 518055, P. R. China
- Institute of Materials Research (iMR), Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, Guangdong 518055, P. R. China