1
Kalayan J, Ramzan I, Williams CD, Bryce RA, Burton NA. A neural network potential based on pairwise resolved atomic forces and energies. J Comput Chem 2024; 45:1143-1151. [PMID: 38284556 DOI: 10.1002/jcc.27313] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 12/23/2023] [Accepted: 01/05/2024] [Indexed: 01/30/2024]
Abstract
Molecular simulations have become a key tool in molecular and materials design. Machine learning (ML)-based potential energy functions offer the prospect of simulating complex molecular systems efficiently at quantum chemical accuracy. In previous work, we have introduced the ML-based PairF-Net approach to neural network potentials, that adopts a pairwise interatomic scheme to predicting forces within a molecular system. Here, we further develop the PairF-Net model to intrinsically incorporate energy conservation and couple the model to a molecular mechanical (MM) environment within the OpenMM package. The updated PairF-Net model yields energy and force predictions and dynamical distributions in good agreement with the rMD17 dataset of ten small organic molecules in the gas-phase. We further show that these in vacuo ML models of small molecules can be applied to force predictions in aqueous solution via hybrid ML/MM simulations. We present a new benchmark dataset for these ten molecules in solution, obtained from QM/MM simulations, which we denote as rMD17-aq (https://zenodo.org/records/10048644); and assess the ability of PairF-Net to reproduce the molecular energy, atomic forces and dynamical distributions of these solution conformations via ML/MM simulations.
Affiliation(s)
- Jas Kalayan
  - Division of Pharmacy and Optometry, School of Health Sciences, University of Manchester, Manchester, UK
- Ismaeel Ramzan
  - Division of Pharmacy and Optometry, School of Health Sciences, University of Manchester, Manchester, UK
  - Neural Circuits and Computations Unit, RIKEN Center for Brain Science, Wako, Japan
- Christopher D Williams
  - Division of Pharmacy and Optometry, School of Health Sciences, University of Manchester, Manchester, UK
- Richard A Bryce
  - Division of Pharmacy and Optometry, School of Health Sciences, University of Manchester, Manchester, UK
- Neil A Burton
  - Department of Chemistry, University of Manchester, Manchester, UK
2
Chen M, Jiang X, Zhang L, Chen X, Wen Y, Gu Z, Li X, Zheng M. The emergence of machine learning force fields in drug design. Med Res Rev 2024; 44:1147-1182. [PMID: 38173298 DOI: 10.1002/med.22008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2023] [Revised: 11/29/2023] [Accepted: 12/05/2023] [Indexed: 01/05/2024]
Abstract
In the field of molecular simulation for drug design, traditional molecular mechanic force fields and quantum chemical theories have been instrumental but limited in terms of scalability and computational efficiency. To overcome these limitations, machine learning force fields (MLFFs) have emerged as a powerful tool capable of balancing accuracy with efficiency. MLFFs rely on the relationship between molecular structures and potential energy, bypassing the need for a preconceived notion of interaction representations. Their accuracy depends on the machine learning models used, and the quality and volume of training data sets. With recent advances in equivariant neural networks and high-quality datasets, MLFFs have significantly improved their performance. This review explores MLFFs, emphasizing their potential in drug design. It elucidates MLFF principles, provides development and validation guidelines, and highlights successful MLFF implementations. It also addresses potential challenges in developing and applying MLFFs. The review concludes by illuminating the path ahead for MLFFs, outlining the challenges to be overcome and the opportunities to be harnessed. This inspires researchers to embrace MLFFs in their investigations as a new tool to perform molecular simulations in drug design.
Affiliation(s)
- Mingan Chen
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
  - School of Physical Science and Technology, ShanghaiTech University, Shanghai, China
  - Lingang Laboratory, Shanghai, China
- Xinyu Jiang
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
  - School of Pharmacy, University of Chinese Academy of Sciences, Beijing, China
- Lehan Zhang
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
  - School of Pharmacy, University of Chinese Academy of Sciences, Beijing, China
- Xiaoxu Chen
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
  - School of Pharmacy, University of Chinese Academy of Sciences, Beijing, China
  - School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou, China
- Yiming Wen
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
  - School of Pharmacy, University of Chinese Academy of Sciences, Beijing, China
  - School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou, China
- Zhiyong Gu
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
  - School of Pharmacy, University of Chinese Academy of Sciences, Beijing, China
  - School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou, China
- Xutong Li
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
  - School of Pharmacy, University of Chinese Academy of Sciences, Beijing, China
- Mingyue Zheng
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
  - School of Pharmacy, University of Chinese Academy of Sciences, Beijing, China
  - School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou, China
3
Pandey P, Arandhara M, Houston PL, Qu C, Conte R, Bowman JM, Ramesh SG. Assessing Permutationally Invariant Polynomial and Symmetric Gradient Domain Machine Learning Potential Energy Surfaces for H3O2-. J Phys Chem A 2024; 128:3212-3219. [PMID: 38624168 PMCID: PMC11056970 DOI: 10.1021/acs.jpca.4c01044] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2024] [Revised: 03/15/2024] [Accepted: 03/20/2024] [Indexed: 04/17/2024]
Abstract
The singly hydrated hydroxide anion OH-(H2O) is of central importance to a detailed molecular understanding of water; therefore, there is strong motivation to develop a highly accurate potential to describe this anion. While this is a small molecule, it is necessary to have an extensive data set of energies and, if possible, forces to span several important stationary points. Here, we assess two machine-learned potentials, one using the symmetric gradient domain machine learning (sGDML) method and one based on permutationally invariant polynomials (PIPs). These are successors to a PIP potential energy surface (PES) reported in 2004. We describe the details of both fitting methods and then compare the two PESs with respect to precision, properties, and speed of evaluation. While the precision of the potentials is similar, the PIP PES is much faster to evaluate for energies and energies plus gradient than the sGDML one. Diffusion Monte Carlo calculations of the ground vibrational state, using both potentials, produce similar large anharmonic downshift of the zero-point energy compared to the harmonic approximation of the PIP and sGDML potentials. The computational time for these calculations using the sGDML PES is roughly 300 times greater than using the PIP one.
Affiliation(s)
- Priyanka Pandey
  - Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, United States
- Mrinal Arandhara
  - Department of Inorganic and Physical Chemistry, Indian Institute of Science, Bangalore 560012, India
- Paul L. Houston
  - Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York 14853, United States
  - Department of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Chen Qu
  - Independent Researcher, Toronto, Ontario M9B0E3, Canada
- Riccardo Conte
  - Dipartimento di Chimica, Università degli Studi di Milano, Milano 20133, Italy
- Joel M. Bowman
  - Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, United States
- Sai G. Ramesh
  - Department of Inorganic and Physical Chemistry, Indian Institute of Science, Bangalore 560012, India
4
Houston PL, Qu C, Yu Q, Pandey P, Conte R, Nandi A, Bowman JM. No Headache for PIPs: A PIP Potential for Aspirin Runs Much Faster and with Similar Precision Than Other Machine-Learned Potentials. J Chem Theory Comput 2024; 20:3008-3018. [PMID: 38593438 DOI: 10.1021/acs.jctc.4c00054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/11/2024]
Abstract
Assessments of machine-learning (ML) potentials are an important aspect of the rapid development of this field. We recently reported an assessment of the linear-regression permutationally invariant polynomial (PIP) method for ethanol, using the widely used (revised) rMD17 data set. We demonstrated that the PIP approach outperformed numerous other methods, e.g., ANI, PhysNet, sGDML, and p-KRR, with respect to precision and notably with respect to speed [Houston et al., J. Chem. Phys. 2022, 156, 044120]. Here, we extend this assessment to the 21-atom aspirin molecule, using the rMD17 data set, with a focus on the speed of evaluation. Both energies and forces are used for training, and the precision of several PIPs is examined for both. Normal mode frequencies, the methyl torsional potential, and 1d vibrational energies for an OH stretch are presented. We show that the PIP approach achieves the level of precision obtained from other ML methods, e.g., atom-centered neural network methods, linear regression ACE, and kernel methods, as reported by Kovács et al. in J. Chem. Theory Comput. 2021, 17, 7696-7711. More significantly, we show that the PIP PESs run much faster than all other ML methods, whose timings were evaluated in that paper. We also show that the PIP PES extrapolates well enough to describe several internal motions of aspirin, including an OH stretch.
Affiliation(s)
- Paul L Houston
  - Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York 14853, United States
  - Department of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Chen Qu
  - Independent Researcher, Toronto, Ontario M9B0E3, Canada
- Qi Yu
  - Department of Chemistry, Fudan University, Shanghai 200438, P. R. China
- Priyanka Pandey
  - Department of Chemistry, Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, United States
- Riccardo Conte
  - Dipartimento di Chimica, Università degli Studi di Milano, via Golgi 19, 20133 Milano, Italy
- Apurba Nandi
  - Department of Chemistry, Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, United States
  - Department of Physics and Materials Science, University of Luxembourg, Luxembourg City L-1511, Luxembourg
- Joel M Bowman
  - Department of Chemistry, Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, United States
5
Unke OT, Stöhr M, Ganscha S, Unterthiner T, Maennel H, Kashubin S, Ahlin D, Gastegger M, Medrano Sandonas L, Berryman JT, Tkatchenko A, Müller KR. Biomolecular dynamics with machine-learned quantum-mechanical force fields trained on diverse chemical fragments. Sci Adv 2024; 10:eadn4397. [PMID: 38579003 DOI: 10.1126/sciadv.adn4397] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2023] [Accepted: 02/29/2024] [Indexed: 04/07/2024]
Abstract
The GEMS method enables molecular dynamics simulations of large heterogeneous systems at ab initio quality.
Affiliation(s)
- Oliver T Unke
  - Google DeepMind, Tucholskystraße 2, 10117 Berlin, Germany and Brandschenkestrasse 110, 8002 Zürich, Switzerland
  - Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
  - DFG Cluster of Excellence "Unifying Systems in Catalysis" (UniSysCat), Technische Universität Berlin, 10623 Berlin, Germany
- Martin Stöhr
  - Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
- Stefan Ganscha
  - Google DeepMind, Tucholskystraße 2, 10117 Berlin, Germany and Brandschenkestrasse 110, 8002 Zürich, Switzerland
- Thomas Unterthiner
  - Google DeepMind, Tucholskystraße 2, 10117 Berlin, Germany and Brandschenkestrasse 110, 8002 Zürich, Switzerland
- Hartmut Maennel
  - Google DeepMind, Tucholskystraße 2, 10117 Berlin, Germany and Brandschenkestrasse 110, 8002 Zürich, Switzerland
- Sergii Kashubin
  - Google DeepMind, Tucholskystraße 2, 10117 Berlin, Germany and Brandschenkestrasse 110, 8002 Zürich, Switzerland
- Daniel Ahlin
  - Google DeepMind, Tucholskystraße 2, 10117 Berlin, Germany and Brandschenkestrasse 110, 8002 Zürich, Switzerland
- Michael Gastegger
  - Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
  - DFG Cluster of Excellence "Unifying Systems in Catalysis" (UniSysCat), Technische Universität Berlin, 10623 Berlin, Germany
  - BASLEARN - TU Berlin/BASF Joint Lab for Machine Learning, Technische Universität Berlin, 10587 Berlin, Germany
- Leonardo Medrano Sandonas
  - Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
- Joshua T Berryman
  - Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
- Alexandre Tkatchenko
  - Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
- Klaus-Robert Müller
  - Google DeepMind, Tucholskystraße 2, 10117 Berlin, Germany and Brandschenkestrasse 110, 8002 Zürich, Switzerland
  - Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
  - Department of Artificial Intelligence, Korea University, Anam-dong, Seongbuk-gu, Seoul 02841, Korea
  - Max Planck Institute for Informatics, Stuhlsatzenhausweg, 66123 Saarbrücken, Germany
  - BIFOLD - Berlin Institute for the Foundations of Learning and Data, Berlin, Germany
6
Arandhara M, Ramesh SG. Nuclear Quantum Effects in Hydroxide Hydrate Along the H-Bond Bifurcation Pathway. J Phys Chem A 2024; 128:1600-1610. [PMID: 38393819 DOI: 10.1021/acs.jpca.3c08027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/25/2024]
Abstract
Path integral (PI) simulations are used to explore nuclear quantum effects (NQEs) in hydroxide hydrate and its perdeuterated isotopomer along the H-bond bifurcation pathway. Toward this, a new potential energy surface using the symmetric gradient domain machine learning method with ab initio data at the CCSD(T)/aug-cc-pVTZ level is built. From PI umbrella sampling (US) simulations, free energy profiles along the bifurcation coordinate are explored as a function of temperature. At ambient temperature, the bifurcation barrier is increased upon inclusion of NQEs. At low temperatures in the deep tunneling regime, the barrier is strongly decreased and flattened. These trends are examined, and the role of the O-O distance is also investigated through two-dimensional US simulations.
Affiliation(s)
- Mrinal Arandhara
  - Department of Inorganic and Physical Chemistry, Indian Institute of Science, Bangalore 560012, India
- Sai G Ramesh
  - Department of Inorganic and Physical Chemistry, Indian Institute of Science, Bangalore 560012, India
7
Célerse F, Wodrich MD, Vela S, Gallarati S, Fabregat R, Juraskova V, Corminboeuf C. From Organic Fragments to Photoswitchable Catalysts: The OFF-ON Structural Repository for Transferable Kernel-Based Potentials. J Chem Inf Model 2024; 64:1201-1212. [PMID: 38319296 PMCID: PMC10900300 DOI: 10.1021/acs.jcim.3c01953] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Revised: 01/18/2024] [Accepted: 01/22/2024] [Indexed: 02/07/2024]
Abstract
Structurally and conformationally diverse databases are needed to train accurate neural networks or kernel-based potentials capable of exploring the complex free energy landscape of flexible functional organic molecules. Curating such databases for species beyond "simple" drug-like compounds or molecules composed of well-defined building blocks (e.g., peptides) is challenging as it requires thorough chemical space mapping and evaluation of both chemical and conformational diversities. Here, we introduce the OFF-ON (organic fragments from organocatalysts that are non-modular) database, a repository of 7869 equilibrium and 67,457 nonequilibrium geometries of organic compounds and dimers aimed at describing conformationally flexible functional organic molecules, with an emphasis on photoswitchable organocatalysts. The relevance of this database is then demonstrated by training a local kernel regression model on a low-cost semiempirical baseline and comparing it with a PBE0-D3 reference for several known catalysts, notably the free energy surfaces of exemplary photoswitchable organocatalysts. Our results demonstrate that the OFF-ON data set offers reliable predictions for simulating the conformational behavior of virtually any (photoswitchable) organocatalyst or organic compound composed of H, C, N, O, F, and S atoms, thereby opening a computationally feasible route to explore complex free energy surfaces in order to rationalize and predict catalytic behavior.
Affiliation(s)
- Frédéric Célerse
  - Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne 1015, Switzerland
- Matthew D. Wodrich
  - Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne 1015, Switzerland
  - National Center for Competence in Research-Catalysis (NCCR-Catalysis), Ecole Polytechnique Fédérale de Lausanne, Lausanne 1015, Switzerland
- Sergi Vela
  - Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne 1015, Switzerland
- Simone Gallarati
  - Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne 1015, Switzerland
- Raimon Fabregat
  - Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne 1015, Switzerland
- Veronika Juraskova
  - Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne 1015, Switzerland
- Clémence Corminboeuf
  - Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne 1015, Switzerland
  - National Center for Competence in Research-Catalysis (NCCR-Catalysis), Ecole Polytechnique Fédérale de Lausanne, Lausanne 1015, Switzerland
  - National Centre for Computational Design and Discovery of Novel Materials (MARVEL), Ecole Polytechnique Fédérale de Lausanne, Lausanne 1015, Switzerland
8
Wu J, Lv J, Zhao L, Zhao R, Gao T, Xu Q, Liu D, Yu Q, Ma F. Exploring the role of microbial proteins in controlling environmental pollutants based on molecular simulation. Sci Total Environ 2023; 905:167028. [PMID: 37704131 DOI: 10.1016/j.scitotenv.2023.167028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2023] [Revised: 09/03/2023] [Accepted: 09/10/2023] [Indexed: 09/15/2023]
Abstract
Molecular simulation has been widely used to study microbial proteins' structural composition and dynamic properties, such as volatility, flexibility, and stability at the microscopic scale. Herein, this review describes the key elements of molecular docking and molecular dynamics (MD) simulations in molecular simulation; reviews the techniques combined with molecular simulation, such as crystallography, spectroscopy, molecular biology, and machine learning, to validate simulation results and bridge information gaps in the structure, microenvironmental changes, expression mechanisms, and intensity quantification; illustrates the application of molecular simulation, in characterizing the molecular mechanisms of interaction of microbial proteins with four different types of contaminants, namely heavy metals (HMs), pesticides, dyes and emerging contaminants (ECs). Finally, the review outlines the important role of molecular simulations in the study of microbial proteins for controlling environmental contamination and provides ideas for the application of molecular simulation in screening microbial proteins and incorporating targeted mutagenesis to obtain more effective contaminant control proteins.
Affiliation(s)
- Jieting Wu
  - School of Environmental Science, Liaoning University, Shenyang 110036, China
- Jin Lv
  - School of Environmental Science, Liaoning University, Shenyang 110036, China
- Lei Zhao
  - State Key Laboratory of Urban Water Resources & Environment, Harbin Institute of Technology, Harbin 150090, China
- Ruofan Zhao
  - School of Environment, Beijing Normal University, Beijing 100875, China
- Tian Gao
  - Key Laboratory of Integrated Regulation and Resource Development of Shallow Lakes, Ministry of Education, College of Environment, Hohai University, Xikang Road #1, Nanjing 210098, China
- Qi Xu
  - PetroChina Fushun Petrochemical Company, Fushun 113000, China
- Dongbo Liu
  - School of Environmental Science, Liaoning University, Shenyang 110036, China
- Qiqi Yu
  - School of Environmental Science, Liaoning University, Shenyang 110036, China
- Fang Ma
  - State Key Laboratory of Urban Water Resources & Environment, Harbin Institute of Technology, Harbin 150090, China
9
Abstract
The introduction of modern Machine Learning Potentials (MLPs) has led to a paradigm change in the development of potential energy surfaces for atomistic simulations. By providing efficient access to energies and forces, they allow us to perform large-scale simulations of extended systems, which are not directly accessible by demanding first-principles methods. In these simulations, MLPs can reach the accuracy of electronic structure calculations, provided that they have been properly trained and validated using a suitable set of reference data. Due to their highly flexible functional form, the construction of MLPs has to be done with great care. In this Tutorial, we describe the necessary key steps for training reliable MLPs, from data generation via training to final validation. The procedure, which is illustrated for the example of a high-dimensional neural network potential, is general and applicable to many types of MLPs.
Affiliation(s)
- Alea Miako Tokita
  - Lehrstuhl für Theoretische Chemie II, Ruhr-Universität Bochum, 44780 Bochum, Germany and Research Center Chemical Sciences and Sustainability, Research Alliance Ruhr, 44780 Bochum, Germany
- Jörg Behler
  - Lehrstuhl für Theoretische Chemie II, Ruhr-Universität Bochum, 44780 Bochum, Germany and Research Center Chemical Sciences and Sustainability, Research Alliance Ruhr, 44780 Bochum, Germany
10
Gandolfi M, Ceotto M. Molecular Dynamics of Artificially Pair-Decoupled Systems: An Accurate Tool for Investigating the Importance of Intramolecular Couplings. J Chem Theory Comput 2023; 19:6093-6108. [PMID: 37698951 PMCID: PMC10536992 DOI: 10.1021/acs.jctc.3c00553] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2023] [Indexed: 09/14/2023]
Abstract
We propose a numerical technique to accurately simulate the vibrations of organic molecules in the gas phase, when pairs of atoms (or, in general, groups of degrees of freedom) are artificially decoupled, so that their motion is instantaneously decorrelated. The numerical technique we have developed is a symplectic integration algorithm that never requires computation of the force but requires estimates of the Hessian matrix. The theory we present to support our technique postulates a pair-decoupling Hamiltonian function, which parametrically depends on a decoupling coefficient α ∈ [0, 1]. The closer α is to 0, the more decoupled the selected atoms. We test the correctness of our numerical method on small molecular systems, and we apply it to study the vibrational spectroscopic features of salicylic acid at the Density Functional Theory ab initio level on a fitted potential. Our pair-decoupled simulations of salicylic acid show that decoupling hydrogen-bonded atoms do not significantly influence the frequencies of stretching modes, but enhance enormously the out-of-plane wagging and twisting motions of the hydroxyl and carboxyl groups to the point that the carboxyl and hydroxyl groups may overcome high potential energy barriers and change the salicylic acid conformation after a short simulation time. In addition, we found that the acidity of salicylic acid is more influenced by the dynamical couplings of the proton of the carboxylic group with the carbon ring than with the hydroxyl group.
Affiliation(s)
- Michele Gandolfi
  - Dipartimento di Chimica, Università degli Studi di Milano, via Golgi 19, 20133 Milano, Italy
- Michele Ceotto
  - Dipartimento di Chimica, Università degli Studi di Milano, via Golgi 19, 20133 Milano, Italy
11
Zhang P, Yang W. Toward a general neural network force field for protein simulations: Refining the intramolecular interaction in protein. J Chem Phys 2023; 159:024118. [PMID: 37431910 PMCID: PMC10481389 DOI: 10.1063/5.0142280] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 01/12/2023] [Accepted: 06/22/2023] [Indexed: 07/12/2023]
Abstract
Molecular dynamics (MD) is an extremely powerful, highly effective, and widely used approach to understanding the nature of chemical processes in atomic details for proteins. The accuracy of results from MD simulations is highly dependent on force fields. Currently, molecular mechanical (MM) force fields are mainly utilized in MD simulations because of their low computational cost. Quantum mechanical (QM) calculation has high accuracy, but it is exceedingly time consuming for protein simulations. Machine learning (ML) provides the capability for generating accurate potential at the QM level without increasing much computational effort for specific systems that can be studied at the QM level. However, the construction of general machine learned force fields, needed for broad applications and large and complex systems, is still challenging. Here, general and transferable neural network (NN) force fields based on CHARMM force fields, named CHARMM-NN, are constructed for proteins by training NN models on 27 fragments partitioned from the residue-based systematic molecular fragmentation (rSMF) method. The NN for each fragment is based on atom types and uses new input features that are similar to MM inputs, including bonds, angles, dihedrals, and non-bonded terms, which enhance the compatibility of CHARMM-NN to MM MD and enable the implementation of CHARMM-NN force fields in different MD programs. While the main part of the energy of the protein is based on rSMF and NN, the nonbonded interactions between the fragments and with water are taken from the CHARMM force field through mechanical embedding. The validations of the method for dipeptides on geometric data, relative potential energies, and structural reorganization energies demonstrate that the CHARMM-NN local minima on the potential energy surface are very accurate approximations to QM, showing the success of CHARMM-NN for bonded interactions. 
However, the MD simulations on peptides and proteins indicate that more accurate methods to represent protein-water interactions in fragments and non-bonded interactions between fragments should be considered in the future improvement of CHARMM-NN, which can increase the accuracy of approximation beyond the current mechanical embedding QM/MM level.
Affiliation(s)
- Pan Zhang
  - Department of Chemistry, Duke University, Durham, North Carolina 27708, USA
- Weitao Yang
  - Department of Chemistry, Duke University, Durham, North Carolina 27708, USA
12
Jaffrelot Inizan T, Plé T, Adjoua O, Ren P, Gökcan H, Isayev O, Lagardère L, Piquemal JP. Scalable hybrid deep neural networks/polarizable potentials biomolecular simulations including long-range effects. Chem Sci 2023; 14:5438-5452. [PMID: 37234902 PMCID: PMC10208042 DOI: 10.1039/d2sc04815a] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 08/30/2022] [Accepted: 04/03/2023] [Indexed: 07/28/2023]
Abstract
Deep-HP is a scalable extension of the Tinker-HP multi-GPU molecular dynamics (MD) package enabling the use of Pytorch/TensorFlow Deep Neural Network (DNN) models. Deep-HP increases DNNs' MD capabilities by orders of magnitude offering access to ns simulations for 100k-atom biosystems while offering the possibility of coupling DNNs to any classical (FFs) and many-body polarizable (PFFs) force fields. It allows therefore the introduction of the ANI-2X/AMOEBA hybrid polarizable potential designed for ligand binding studies where solvent-solvent and solvent-solute interactions are computed with the AMOEBA PFF while solute-solute ones are computed by the ANI-2X DNN. ANI-2X/AMOEBA explicitly includes AMOEBA's physical long-range interactions via an efficient Particle Mesh Ewald implementation while preserving ANI-2X's solute short-range quantum mechanical accuracy. The DNN/PFF partition can be user-defined allowing for hybrid simulations to include key ingredients of biosimulation such as polarizable solvents, polarizable counter ions, etc.… ANI-2X/AMOEBA is accelerated using a multiple-timestep strategy focusing on the model's contributions to low-frequency modes of nuclear forces. It primarily evaluates AMOEBA forces while including ANI-2X ones only via correction-steps resulting in an order of magnitude acceleration over standard Velocity Verlet integration. Simulating more than 10 μs, we compute charged/uncharged ligand solvation free energies in 4 solvents, and absolute binding free energies of host-guest complexes from SAMPL challenges. ANI-2X/AMOEBA average errors are discussed in terms of statistical uncertainty and appear in the range of chemical accuracy compared to experiment. The availability of the Deep-HP computational platform opens the path towards large-scale hybrid DNN simulations, at force-field cost, in biophysics and drug discovery.
Affiliation(s)
- Théo Jaffrelot Inizan
- Sorbonne Université, Laboratoire de Chimie Théorique UMR 7616 CNRS Paris 75005 France
| | - Thomas Plé
- Sorbonne Université, Laboratoire de Chimie Théorique UMR 7616 CNRS Paris 75005 France
| | - Olivier Adjoua
- Sorbonne Université, Laboratoire de Chimie Théorique UMR 7616 CNRS Paris 75005 France
| | - Pengyu Ren
- Department of Biomedical Engineering, University of Texas at Austin Austin Texas USA
| | - Hatice Gökcan
- Department of Chemistry, Carnegie Mellon University Pittsburgh Pennsylvania USA
| | - Olexandr Isayev
- Department of Chemistry, Carnegie Mellon University Pittsburgh Pennsylvania USA
| | - Louis Lagardère
- Sorbonne Université, Laboratoire de Chimie Théorique UMR 7616 CNRS Paris 75005 France
- Sorbonne Université, Institut Parisien de Chimie Physique et Théorique FR 2622 CNRS Paris France
| | - Jean-Philip Piquemal
- Sorbonne Université, Laboratoire de Chimie Théorique UMR 7616 CNRS Paris 75005 France
- Department of Biomedical Engineering, University of Texas at Austin Austin Texas USA
|
13
|
Hammes-Schiffer S. Exploring Proton-Coupled Electron Transfer at Multiple Scales. Nat Comput Sci 2023; 3:291-300. [PMID: 37577057 PMCID: PMC10416817 DOI: 10.1038/s43588-023-00422-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/22/2022] [Accepted: 02/23/2023] [Indexed: 08/15/2023]
Abstract
The coupling of electron and proton transfer is critical for chemical and biological processes spanning a wide range of length and time scales and often occurring in complex environments. Thus, diverse modeling strategies, including analytical theories, quantum chemistry, molecular dynamics, and kinetic modeling, are essential for a comprehensive understanding of such proton-coupled electron transfer reactions. Each of these computational methods provides one piece of the puzzle, and all these pieces must be viewed together to produce the full picture.
|
14
|
Mousavi SZ, Shadman HR, Habibi M, Didandeh M, Nikzad A, Golmohammadi M, Maleki R, Suwaileh WA, Khataee A, Zargar M, Razmjou A. Elucidating the Sorption Mechanisms of Environmental Pollutants Using Molecular Simulation. Ind Eng Chem Res 2023. [DOI: 10.1021/acs.iecr.2c02333] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/22/2023]
Affiliation(s)
- Seyedeh Zahra Mousavi
- Department of Chemical Engineering, Tarbiat Modares University, Tehran, 1411944961, Iran
| | - Hamid Reza Shadman
- Department of Polymer Engineering & Color Technology, Amirkabir University of Technology, Tehran, 6351713178, Iran
| | - Meysam Habibi
- Department of Chemical Engineering, University of Tehran, Tehran, 6718773654, Iran
| | - Mohsen Didandeh
- Department of Chemical Engineering, Tarbiat Modares University, Tehran, 1411944961, Iran
| | - Arash Nikzad
- Mechanical Engineering Department, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
| | - Mahsa Golmohammadi
- Department of Polymer Engineering & Color Technology, Amirkabir University of Technology, Tehran, 6351713178, Iran
| | - Reza Maleki
- Department of Chemical Technologies, Iranian Research Organization for Science and Technology (IROST), P.O. Box 33535111, Tehran, 3313193685, Iran
| | - Wafa Ali Suwaileh
- Chemical Engineering Program, Texas A&M University at Qatar, Education City, Doha 23874, Qatar
| | - Alireza Khataee
- Research Laboratory of Advanced Water and Wastewater Treatment Processes, Department of Applied Chemistry, Faculty of Chemistry, University of Tabriz, 51666-16471 Tabriz, Iran
- Department of Materials Science and Nanotechnology Engineering, Faculty of Engineering, Near East University, 99138 Nicosia, Mersin 10 Turkey
| | - Masoumeh Zargar
- Mineral Recovery Research Center (MRRC), School of Engineering, Edith Cowan University, Joondalup, Perth WA 6027, Australia
| | - Amir Razmjou
- Mineral Recovery Research Center (MRRC), School of Engineering, Edith Cowan University, Joondalup, Perth WA 6027, Australia
- UNESCO Centre for Membrane Science and Technology, School of Chemical Engineering, University of New South Wales, Sydney, NSW 2052, Australia
|
15
|
Zaverkin V, Holzmüller D, Bonfirraro L, Kästner J. Transfer learning for chemically accurate interatomic neural network potentials. Phys Chem Chem Phys 2023; 25:5383-5396. [PMID: 36748821 DOI: 10.1039/d2cp05793j] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 02/01/2023]
Abstract
Developing machine learning-based interatomic potentials from ab initio electronic structure methods remains a challenging task for computational chemistry and materials science. This work studies the capability of transfer learning, in particular discriminative fine-tuning, for efficiently generating chemically accurate interatomic neural network potentials on organic molecules from the MD17 and ANI data sets. We show that pre-training the network parameters on data obtained from density functional calculations considerably improves the sample efficiency of models trained on more accurate ab initio data. Additionally, we show that fine-tuning with energy labels alone can suffice to obtain accurate atomic forces and run large-scale atomistic simulations, provided a well-designed fine-tuning data set. We also investigate possible limitations of transfer learning, especially regarding the design and size of the pre-training and fine-tuning data sets. Finally, we provide GM-NN potentials pre-trained and fine-tuned on the ANI-1x and ANI-1ccx data sets, which can easily be fine-tuned on and applied to organic molecules.
Affiliation(s)
- Viktor Zaverkin
- Faculty of Chemistry, Institute for Theoretical Chemistry, University of Stuttgart, Germany.
| | - David Holzmüller
- Faculty of Mathematics and Physics, Institute for Stochastics and Applications, University of Stuttgart, Germany.
| | - Luca Bonfirraro
- Faculty of Chemistry, Institute for Theoretical Chemistry, University of Stuttgart, Germany.
| | - Johannes Kästner
- Faculty of Chemistry, Institute for Theoretical Chemistry, University of Stuttgart, Germany.
|
16
|
Kříž K, Schmidt L, Andersson AT, Walz MM, van der Spoel D. An Imbalance in the Force: The Need for Standardized Benchmarks for Molecular Simulation. J Chem Inf Model 2023; 63:412-431. [PMID: 36630710 PMCID: PMC9875315 DOI: 10.1021/acs.jcim.2c01127] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 09/06/2022] [Indexed: 01/12/2023]
Abstract
Force fields (FFs) for molecular simulation have been under development for more than half a century. As with any predictive model, rigorous testing and comparisons of models critically depends on the availability of standardized data sets and benchmarks. While such benchmarks are rather common in the fields of quantum chemistry, this is not the case for empirical FFs. That is, few benchmarks are reused to evaluate FFs, and development teams rather use their own training and test sets. Here we present an overview of currently available tests and benchmarks for computational chemistry, focusing on organic compounds, including halogens and common ions, as FFs for these are the most common ones. We argue that many of the benchmark data sets from quantum chemistry can in fact be reused for evaluating FFs, but new gas phase data is still needed for compounds containing phosphorus and sulfur in different valence states. In addition, more nonequilibrium interaction energies and forces, as well as molecular properties such as electrostatic potentials around compounds, would be beneficial. For the condensed phases there is a large body of experimental data available, and tools to utilize these data in an automated fashion are under development. If FF developers, as well as researchers in artificial intelligence, would adopt a number of these data sets, it would become easier to compare the relative strengths and weaknesses of different models and to, eventually, restore the balance in the force.
Affiliation(s)
- Kristian Kříž
- Department of Cell and Molecular Biology, Uppsala University, Box 596, SE-75124 Uppsala, Sweden
| | - Lisa Schmidt
- Faculty of Biosciences, University of Heidelberg, Heidelberg 69117, Germany
| | - Alfred T. Andersson
- Department of Cell and Molecular Biology, Uppsala University, Box 596, SE-75124 Uppsala, Sweden
| | - Marie-Madeleine Walz
- Department of Cell and Molecular Biology, Uppsala University, Box 596, SE-75124 Uppsala, Sweden
| | - David van der Spoel
- Department of Cell and Molecular Biology, Uppsala University, Box 596, SE-75124 Uppsala, Sweden
|
17
|
Chmiela S, Vassilev-Galindo V, Unke OT, Kabylda A, Sauceda HE, Tkatchenko A, Müller KR. Accurate global machine learning force fields for molecules with hundreds of atoms. Sci Adv 2023; 9:eadf0873. [PMID: 36630510 PMCID: PMC9833674 DOI: 10.1126/sciadv.adf0873] [Citation(s) in RCA: 16] [Impact Index Per Article: 16.0] [Received: 09/29/2022] [Accepted: 11/28/2022] [Indexed: 05/25/2023]
Abstract
Global machine learning force fields, with the capacity to capture collective interactions in molecular systems, now scale up to a few dozen atoms due to considerable growth of model complexity with system size. For larger molecules, locality assumptions are introduced, with the consequence that nonlocal interactions are not described. Here, we develop an exact iterative approach to train global symmetric gradient domain machine learning (sGDML) force fields (FFs) for several hundred atoms, without resorting to any potentially uncontrolled approximations. All atomic degrees of freedom remain correlated in the global sGDML FF, allowing the accurate description of complex molecules and materials that present phenomena with far-reaching characteristic correlation lengths. We assess the accuracy and efficiency of sGDML on a newly developed MD22 benchmark dataset containing molecules from 42 to 370 atoms. The robustness of our approach is demonstrated in nanosecond path-integral molecular dynamics simulations for supramolecular complexes in the MD22 dataset.
Affiliation(s)
- Stefan Chmiela
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
- Berlin Institute for the Foundations of Learning and Data – BIFOLD, Germany
| | - Valentin Vassilev-Galindo
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
| | - Oliver T. Unke
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
- Google Research, Brain Team, Berlin, Germany
| | - Adil Kabylda
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
| | - Huziel E. Sauceda
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
- Berlin Institute for the Foundations of Learning and Data – BIFOLD, Germany
- Departamento de Materia Condensada, Instituto de Física, Universidad Nacional Autónoma de México, Cd. de México C.P. 04510, Mexico
- BASLEARN - TU Berlin/BASF Joint Lab for Machine Learning, Technische Universität Berlin, 10587 Berlin, Germany
| | - Alexandre Tkatchenko
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
| | - Klaus-Robert Müller
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
- Berlin Institute for the Foundations of Learning and Data – BIFOLD, Germany
- Google Research, Brain Team, Berlin, Germany
- Max Planck Institute for Informatics, Stuhlsatzenhausweg, 66123 Saarbrücken, Germany
- Department of Artificial Intelligence, Korea University, Anam-dong, Seongbuk-gu, Seoul 02841, Korea
|
18
|
Huang C, Rubenstein BM. Machine Learning Diffusion Monte Carlo Forces. J Phys Chem A 2023; 127:339-355. [PMID: 36576803 DOI: 10.1021/acs.jpca.2c05904] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/29/2022]
Abstract
Diffusion Monte Carlo (DMC) is one of the most accurate techniques available for calculating the electronic properties of molecules and materials, yet it often remains a challenge to economically compute forces using this technique. As a result, ab initio molecular dynamics simulations and geometry optimizations that employ Diffusion Monte Carlo forces are often out of reach. One potential approach for accelerating the computation of "DMC forces" is to machine learn these forces from DMC energy calculations. In this work, we employ Behler-Parrinello Neural Networks to learn DMC forces from DMC energy calculations for geometry optimization and molecular dynamics simulations of small molecules. We illustrate the unique challenges that stem from learning forces without explicit force data and from noisy energy data by making rigorous comparisons of potential energy surface, dynamics, and optimization predictions among ab initio density functional theory (DFT) simulations and machine-learning models trained on DFT energies with forces, DFT energies without forces, and DMC energies without forces. We show for three small molecules (C2, H2O, and CH3Cl) that machine-learned DMC dynamics can reproduce average bond lengths and angles within a few percent of known experimental results at one hundredth of the typical cost. Our work describes a much-needed means of performing dynamics simulations on high-accuracy, DMC PESs and for generating DMC-quality molecular geometries given current algorithmic constraints.
Affiliation(s)
- Cancan Huang
- Department of Chemistry, Brown University, Providence, Rhode Island 02912, United States
| | - Brenda M Rubenstein
- Department of Chemistry, Brown University, Providence, Rhode Island 02912, United States
|
19
|
Conte R, Nandi A, Qu C, Yu Q, Houston PL, Bowman JM. Semiclassical and VSCF/VCI Calculations of the Vibrational Energies of trans- and gauche-Ethanol Using a CCSD(T) Potential Energy Surface. J Phys Chem A 2022; 126:7709-7718. [PMID: 36240438 DOI: 10.1021/acs.jpca.2c06322] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
A recent full-dimensional Δ-Machine learning potential energy surface (PES) for ethanol is employed in semiclassical and vibrational self-consistent field (VSCF) and virtual-state configuration interaction (VCI) calculations, using MULTIMODE, to determine the anharmonic vibrational frequencies of vibration for both the trans and gauche conformers of ethanol. Both semiclassical and VSCF/VCI energies agree well with the experimental data. We find significant mixing between the VSCF basis states due to Fermi resonances between bending and stretching modes. The same effects are also accurately described by the full-dimensional semiclassical calculations. These are the first high-level anharmonic calculations using a PES, in particular a "gold-standard" CCSD(T) one.
Affiliation(s)
- Riccardo Conte
- Dipartimento di Chimica, Università degli Studi di Milano, via Golgi 19, 20133 Milano, Italy
| | - Apurba Nandi
- Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, United States
| | - Chen Qu
- Independent Researcher, Toronto, Ontario M9B0E3, Canada
| | - Qi Yu
- Department of Chemistry, Yale University, New Haven, Connecticut 06520, United States
| | - Paul L Houston
- Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York 14853, United States
- Department of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
| | - Joel M Bowman
- Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, United States
|
20
|
Nandi A, Conte R, Qu C, Houston PL, Yu Q, Bowman JM. Quantum Calculations on a New CCSD(T) Machine-Learned Potential Energy Surface Reveal the Leaky Nature of Gas-Phase Trans and Gauche Ethanol Conformers. J Chem Theory Comput 2022; 18:5527-5538. [PMID: 35951990 PMCID: PMC9476654 DOI: 10.1021/acs.jctc.2c00760] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/24/2022]
Abstract
Ethanol is a molecule of fundamental interest in combustion, astrochemistry, and condensed phase as a solvent. It is characterized by two methyl rotors and trans (anti) and gauche conformers, which are known to be very close in energy. Here we show that based on rigorous quantum calculations of the vibrational zero-point state, using a new ab initio potential energy surface (PES), the ground state resembles the trans conformer, but substantial delocalization to the gauche conformer is present. This explains experimental issues about identification and isolation of the two conformers. This “leak” effect is partially quenched when deuterating the OH group, which further demonstrates the need for a quantum mechanical approach. Diffusion Monte Carlo and full-dimensional semiclassical dynamics calculations are employed. The new PES is obtained by means of a Δ-machine learning approach starting from a pre-existing low level density functional theory surface. This surface is brought to the CCSD(T) level of theory using a relatively small number of ab initio CCSD(T) energies. Agreement between the corrected PES and direct ab initio results for standard tests is excellent. One- and two-dimensional discrete variable representation calculations focusing on the trans–gauche torsional motion are also reported, in reasonable agreement with experiment.
Affiliation(s)
- Apurba Nandi
- Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, United States
| | - Riccardo Conte
- Dipartimento di Chimica, Università Degli Studi di Milano, via Golgi 19, 20133 Milano, Italy
| | - Chen Qu
- Independent Researcher, Toronto 66777, Canada
| | - Paul L Houston
- Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York 14853, United States
- Department of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
| | - Qi Yu
- Department of Chemistry, Yale University, New Haven, Connecticut 06520, United States
| | - Joel M Bowman
- Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, United States
|
21
|
Sauceda HE, Gálvez-González LE, Chmiela S, Paz-Borbón LO, Müller KR, Tkatchenko A. BIGDML-Towards accurate quantum machine learning force fields for materials. Nat Commun 2022; 13:3733. [PMID: 35768400 DOI: 10.1038/s41467-022-31093-x] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 06/17/2021] [Accepted: 06/01/2022] [Indexed: 12/16/2022] Open
Abstract
Machine-learning force fields (MLFF) should be accurate, computationally and data efficient, and applicable to molecules, materials, and interfaces thereof. Currently, MLFFs often introduce tradeoffs that restrict their practical applicability to small subsets of chemical space or require exhaustive datasets for training. Here, we introduce the Bravais-Inspired Gradient-Domain Machine Learning (BIGDML) approach and demonstrate its ability to construct reliable force fields using a training set with just 10-200 geometries for materials including pristine and defect-containing 2D and 3D semiconductors and metals, as well as chemisorbed and physisorbed atomic and molecular adsorbates on surfaces. The BIGDML model employs the full relevant symmetry group for a given material, does not assume artificial atom types or localization of atomic interactions and exhibits high data efficiency and state-of-the-art energy accuracies (errors substantially below 1 meV per atom) for an extended set of materials. Extensive path-integral molecular dynamics carried out with BIGDML models demonstrate the counterintuitive localization of benzene-graphene dynamics induced by nuclear quantum effects and their strong contributions to the hydrogen diffusion coefficient in a Pd crystal for a wide range of temperatures.
|
22
|
Bowman JM, Qu C, Conte R, Nandi A, Houston PL, Yu Q. The MD17 datasets from the perspective of datasets for gas-phase “small” molecule potentials. J Chem Phys 2022; 156:240901. [DOI: 10.1063/5.0089200] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022] Open
Abstract
There has been great progress in developing methods for machine-learned potential energy surfaces. There have also been important assessments of these methods by comparing so-called learning curves on datasets of electronic energies and forces, notably the MD17 database. The dataset for each molecule in this database generally consists of tens of thousands of energies and forces obtained from DFT direct dynamics at 500 K. We contrast the datasets from this database for three “small” molecules, ethanol, malonaldehyde, and glycine, with datasets we have generated with specific targets for the potential energy surfaces (PESs) in mind: a rigorous calculation of the zero-point energy and wavefunction, the tunneling splitting in malonaldehyde, and, in the case of glycine, a description of all eight low-lying conformers. We found that the MD17 datasets are too limited for these targets. We also examine recent datasets for several PESs that describe small-molecule but complex chemical reactions. Finally, we introduce a new database, “QM-22,” which contains datasets of molecules ranging from 4 to 15 atoms that extend to high energies and a large span of configurations.
Affiliation(s)
- Joel M. Bowman
- Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, USA
| | - Chen Qu
- Independent Researcher, Toronto, Canada
| | - Riccardo Conte
- Dipartimento di Chimica, Università Degli Studi di Milano, via Golgi 19, 20133 Milano, Italy
| | - Apurba Nandi
- Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, USA
| | - Paul L. Houston
- Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York 14853, USA
- Department of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332, USA
| | - Qi Yu
- Department of Chemistry, Yale University, New Haven, Connecticut 06520, USA
|
23
|
Liu Y, Li J. Permutation-Invariant-Polynomial Neural-Network-Based Δ-Machine Learning Approach: A Case for the HO2 Self-Reaction and Its Dynamics Study. J Phys Chem Lett 2022; 13:4729-4738. [PMID: 35609295 DOI: 10.1021/acs.jpclett.2c01064] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 06/15/2023]
Abstract
Δ-machine learning, or the hierarchical construction scheme, is a highly cost-effective method, as only a small number of high-level ab initio energies are required to improve a potential energy surface (PES) fit to a large number of low-level points. However, there is no efficient and systematic way to select as few points as possible from the low-level data set. We here propose a permutation-invariant-polynomial neural-network (PIP-NN)-based Δ-machine learning approach to construct full-dimensional accurate PESs of complicated reactions efficiently. Particularly, the high flexibility of the NN is exploited to efficiently sample points from the low-level data set. This approach is applied to the challenging case of a HO2 self-reaction with a large configuration space. Only 14% of the DFT data set is used to successfully bring a newly fitted DFT PES to the UCCSD(T)-F12a/AVTZ quality. Then, the quasiclassical trajectory (QCT) calculations are performed to study its dynamics, particularly the mode specificity.
Affiliation(s)
- Yang Liu
- School of Chemistry and Chemical Engineering & Chongqing Key Laboratory of Theoretical and Computational Chemistry, Chongqing University, Chongqing 401331, China
| | - Jun Li
- School of Chemistry and Chemical Engineering & Chongqing Key Laboratory of Theoretical and Computational Chemistry, Chongqing University, Chongqing 401331, China
|
24
|
Winkler L, Müller KR, Sauceda HE. High-fidelity molecular dynamics trajectory reconstruction with bi-directional neural networks. Mach Learn: Sci Technol 2022. [DOI: 10.1088/2632-2153/ac6ec6] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/12/2022] Open
Abstract
Molecular dynamics (MD) simulations are a cornerstone in science, enabling the investigation of a system’s thermodynamics all the way to analyzing intricate molecular interactions. In general, creating extended molecular trajectories can be a computationally expensive process, for example, when running ab-initio simulations. Hence, repeating such calculations to either obtain more accurate thermodynamics or to get a higher resolution in the dynamics generated by a fine-grained quantum interaction can be time- and computational resource-consuming. In this work, we explore different machine learning methodologies to increase the resolution of MD trajectories on-demand within a post-processing step. As a proof of concept, we analyse the performance of bi-directional neural networks (NNs) such as neural ODEs, Hamiltonian networks, recurrent NNs and long short-term memories, as well as the uni-directional variants as a reference, for MD simulations (here: the MD17 dataset). We have found that Bi-LSTMs are the best performing models; by utilizing the local time-symmetry of thermostated trajectories they can even learn long-range correlations and display high robustness to noisy dynamics across molecular complexity. Our models can reach accuracies of up to 10⁻⁴ Å in trajectory interpolation, which leads to the faithful reconstruction of several unseen high-frequency molecular vibration cycles. This renders the comparison between the learned and reference trajectories indistinguishable. The results reported in this work can serve (1) as a baseline for larger systems, as well as (2) for the construction of better MD integrators.
|
25
|
Fujioka K, Sun R. Interpolating Moving Ridge Regression (IMRR): A machine learning algorithm to predict energy gradients for ab initio molecular dynamics simulations. Chem Phys 2022. [DOI: 10.1016/j.chemphys.2022.111482] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022]
|
26
|
Abstract
Recent work has demonstrated the promise of using machine-learned surrogates, in particular, Gaussian process (GP) surrogates, in reducing the number of electronic structure calculations (ESCs) needed to perform surrogate model based (SMB) geometry optimization. In this paper, we study geometry meta-optimization with GP surrogates where a SMB optimizer additionally learns from its past "experience" performing geometry optimization. To validate this idea, we start with the simplest setting where a geometry meta-optimizer learns from previous optimizations of the same molecule with different initial-guess geometries. We give empirical evidence that geometry meta-optimization with GP surrogates is effective and requires less tuning compared to SMB optimization with GP surrogates on the ANI-1 dataset of off-equilibrium initial structures of small organic molecules. Unlike SMB optimization where a surrogate should be immediately useful for optimizing a given geometry, a surrogate in geometry meta-optimization has more flexibility because it can distribute its ESC savings across a set of geometries. Indeed, we find that GP surrogates that preserve rotational invariance provide increased marginal ESC savings across geometries. As a more stringent test, we also apply geometry meta-optimization to conformational search on a hand-constructed dataset of hydrocarbons and alcohols. We observe that while SMB optimization and geometry meta-optimization do save on ESCs, they also tend to miss higher energy conformers compared to standard geometry optimization. We believe that further research into characterizing the divergence between GP surrogates and potential energy surfaces is critical not only for advancing geometry meta-optimization but also for exploring the potential of machine-learned surrogates in geometry optimization in general.
Affiliation(s)
- Daniel Huang
- Department of Computer Science, San Francisco State University, San Francisco, California 94132, USA
| | - Junwei Lucas Bao
- Department of Chemistry, Boston College, Chestnut Hill, Massachusetts 02467, USA
| | - Jean-Baptiste Tristan
- Department of Computer Science, Boston College, Chestnut Hill, Massachusetts 02467, USA
| |
|
27
|
Li X, Fan J, Chen Y, Xie X, Liu C, Yin Y, Kou J, Wu L, Chen Z. The structure and performance study of PP random impact resistance copolymer. Polym Bull (Berl) 2022. [DOI: 10.1007/s00289-022-04187-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
|
28
|
Fabregat R, Fabrizio A, Engel EA, Meyer B, Juraskova V, Ceriotti M, Corminboeuf C. Local Kernel Regression and Neural Network Approaches to the Conformational Landscapes of Oligopeptides. J Chem Theory Comput 2022; 18:1467-1479. [PMID: 35179897 PMCID: PMC8908737 DOI: 10.1021/acs.jctc.1c00813] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/30/2022]
Abstract
The application of machine learning to theoretical chemistry has made it possible to combine the accuracy of quantum chemical energetics with the thorough sampling of finite-temperature fluctuations. To reach this goal, a diverse set of methods has been proposed, ranging from simple linear models to kernel regression and highly nonlinear neural networks. Here we apply two widely different approaches to the same, challenging problem: the sampling of the conformational landscape of polypeptides at finite temperature. We develop a local kernel regression (LKR) coupled with a supervised sparsity method and compare it with a more established approach based on Behler-Parrinello type neural networks. In the context of the LKR, we discuss how the supervised selection of the reference pool of environments is crucial to achieve accurate potential energy surfaces at a competitive computational cost and leverage the locality of the model to infer which chemical environments are poorly described by the DFTB baseline. We then discuss the relative merits of the two frameworks and perform Hamiltonian-reservoir replica-exchange Monte Carlo sampling and metadynamics simulations, respectively, to demonstrate that both frameworks can achieve converged and transferable sampling of the conformational landscape of complex and flexible biomolecules with comparable accuracy and computational cost.
Affiliation(s)
- Edgar A Engel
- Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
| | - Michele Ceriotti
- Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
| |
|
29
|
Houston PL, Qu C, Nandi A, Conte R, Yu Q, Bowman JM. Permutationally invariant polynomial regression for energies and gradients, using reverse differentiation, achieves orders of magnitude speed-up with high precision compared to other machine learning methods. J Chem Phys 2022; 156:044120. [DOI: 10.1063/5.0080506] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Indexed: 12/28/2022]
Affiliation(s)
- Paul L. Houston
- Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York 14853, USA and Department of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332, USA
| | - Chen Qu
- Department of Chemistry and Biochemistry, University of Maryland, College Park, Maryland 20742, USA
| | - Apurba Nandi
- Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, USA
| | - Riccardo Conte
- Dipartimento di Chimica, Università Degli Studi di Milano, via Golgi 19, 20133 Milano, Italy
| | - Qi Yu
- Department of Chemistry, Yale University, New Haven, Connecticut 06511, USA
| | - Joel M. Bowman
- Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, USA
| |
|
30
|
Desgranges C, Delhommelle J. Machine-Learned Free Energy Surfaces for Capillary Condensation and Evaporation in Mesopores. Entropy (Basel) 2022; 24:97. [PMID: 35052123 DOI: 10.3390/e24010097] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2021] [Revised: 12/29/2021] [Accepted: 01/05/2022] [Indexed: 12/04/2022]
Abstract
Using molecular simulations, we study the processes of capillary condensation and capillary evaporation in model mesopores. To determine the phase transition pathway, as well as the corresponding free energy profile, we carry out enhanced sampling molecular simulations using entropy as a reaction coordinate to map the onset of order during the condensation process and of disorder during the evaporation process. The structural analysis shows the role played by intermediate states, characterized by the onset of capillary liquid bridges and bubbles. We also analyze the dependence of the free energy barrier on the pore width. Furthermore, we propose a method to build a machine learning model for the prediction of the free energy surfaces underlying capillary phase transition processes in mesopores.
|
31
|
Botti G, Ceotto M, Conte R. On-the-fly adiabatically switched semiclassical initial value representation molecular dynamics for vibrational spectroscopy of biomolecules. J Chem Phys 2021; 155:234102. [PMID: 34937370 DOI: 10.1063/5.0075220] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 11/14/2022]
Abstract
Semiclassical (SC) vibrational spectroscopy is a technique capable of reproducing quantum effects (such as zero-point energies, quantum resonances, and anharmonic overtones) from classical dynamics runs even in the case of very large dimensional systems. In a previous study [Conte et al. J. Chem. Phys. 151, 214107 (2019)], a preliminary sampling based on adiabatic switching has been shown to be able to improve the precision and accuracy of semiclassical results for challenging model potentials and small molecular systems. In this paper, we investigate the possibility to extend the technique to larger (bio)molecular systems whose dynamics must be integrated by means of ab initio "on-the-fly" calculations. After some preliminary tests on small molecules, we obtain the vibrational frequencies of glycine improving on pre-existing SC calculations. Finally, the new approach is applied to 17-atom proline, an amino acid characterized by a strong intramolecular hydrogen bond.
Affiliation(s)
- Giacomo Botti
- Dipartimento di Chimica, Università degli Studi di Milano, via Golgi 19, 20133 Milano, Italy
| | - Michele Ceotto
- Dipartimento di Chimica, Università degli Studi di Milano, via Golgi 19, 20133 Milano, Italy
| | - Riccardo Conte
- Dipartimento di Chimica, Università degli Studi di Milano, via Golgi 19, 20133 Milano, Italy
| |
|
32
|
Xu M, Zhu T, Zhang JZH. Automated Construction of Neural Network Potential Energy Surface: The Enhanced Self-Organizing Incremental Neural Network Deep Potential Method. J Chem Inf Model 2021; 61:5425-5437. [PMID: 34752095 DOI: 10.1021/acs.jcim.1c01125] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/30/2022]
Abstract
In recent years, the use of deep learning (neural network) potential energy surface (NNPES) in molecular dynamics simulation has experienced explosive growth as it can be as accurate as quantum chemistry methods while being as efficient as classical mechanics methods. However, the development of NNPES is highly nontrivial. In particular, it has been difficult to construct a dataset that is as small as possible yet can cover the target chemical space. In this work, an ESOINN-DP method is developed, which has the enhanced self-organizing incremental neural network (ESOINN) and a newly proposed error indicator at its core. With ESOINN-DP, one can construct the NNPES with little human intervention, and this method ensures that the constructed reference dataset covers the target chemical space with minimum redundancy. The performance of the ESOINN-DP method has been well validated by developing neural network potential energy surfaces for water clusters, tripeptides, and by de-redundancy of a sub-dataset of the ANI-1 database. We believe that the ESOINN-DP method provides a novel idea for the construction of NNPES and, especially, the reference datasets, and it can be used for molecular dynamics (MD) simulations of various gas-phase and condensed-phase chemical systems.
Affiliation(s)
- Mingyuan Xu
- Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, Shanghai Key Laboratory of Green Chemistry & Chemical Process, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, China
| | - Tong Zhu
- Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, Shanghai Key Laboratory of Green Chemistry & Chemical Process, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, China
- NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
| | - John Z H Zhang
- Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, Shanghai Key Laboratory of Green Chemistry & Chemical Process, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, China
- NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
- Department of Chemistry, New York University, New York, New York 10003, United States
- Collaborative Innovation Center of Extreme Optics, Shanxi University, Taiyuan, Shanxi 030006, China
| |
|
33
|
Pinheiro M, Ge F, Ferré N, Dral PO, Barbatti M. Choosing the right molecular machine learning potential. Chem Sci 2021; 12:14396-14413. [PMID: 34880991 PMCID: PMC8580106 DOI: 10.1039/d1sc03564a] [Citation(s) in RCA: 45] [Impact Index Per Article: 15.0] [Received: 06/29/2021] [Accepted: 09/14/2021] [Indexed: 11/21/2022]
Abstract
Quantum-chemistry simulations based on potential energy surfaces of molecules provide invaluable insight into the physicochemical processes at the atomistic level and yield such important observables as reaction rates and spectra. Machine learning potentials promise to significantly reduce the computational cost and hence enable otherwise unfeasible simulations. However, the surging number of such potentials begs the question of which one to choose or whether we still need to develop yet another one. Here, we address this question by evaluating the performance of popular machine learning potentials in terms of accuracy and computational cost. In addition, we deliver structured information for non-specialists in machine learning to guide them through the maze of acronyms, recognize each potential's main features, and judge what they could expect from each one.
Affiliation(s)
- Max Pinheiro
- Aix Marseille University, CNRS, ICR Marseille France
| | - Fuchun Ge
- State Key Laboratory of Physical Chemistry of Solid Surfaces, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Department of Chemistry, College of Chemistry and Chemical Engineering, Xiamen University China
| | - Nicolas Ferré
- Aix Marseille University, CNRS, ICR Marseille France
| | - Pavlo O Dral
- State Key Laboratory of Physical Chemistry of Solid Surfaces, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Department of Chemistry, College of Chemistry and Chemical Engineering, Xiamen University China
| | - Mario Barbatti
- Aix Marseille University, CNRS, ICR Marseille France
- Institut Universitaire de France 75231 Paris France
| |
|
34
|
Wang X, Xu Y, Zheng H, Yu K. A Scalable Graph Neural Network Method for Developing an Accurate Force Field of Large Flexible Organic Molecules. J Phys Chem Lett 2021; 12:7982-7987. [PMID: 34433274 DOI: 10.1021/acs.jpclett.1c02214] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 06/13/2023]
Abstract
An accurate force field is the key to the success of all molecular mechanics simulations on organic polymers and biomolecules. Accurate correlated wave function (CW) methods scale poorly with system size, so this poses a great challenge to the development of an extendible ab initio force field for large flexible organic molecules at the CW level of accuracy. In this work, we combine the physics-driven nonbonding potential with a data-driven subgraph neural network bonding model (named sGNN). Tests on polyethylene glycol, polyethene, and their block polymers show that our strategy is highly accurate and robust for molecules of different sizes and chemical compositions. Therefore, one can develop a parameter library of small molecular fragments (with sizes easily accessible to CW methods) and assemble them to predict the energy of large polymers, thus opening a new path to next-generation organic force fields.
Affiliation(s)
- Xufei Wang
- Two Sigma Investments, New York, New York 10013, United States
| | - Yuanda Xu
- The Program in Applied & Computational Mathematics, Princeton University, Princeton, New Jersey 08544-1000, United States
| | - Han Zheng
- Tsinghua-Berkeley Shenzhen Institute (TBSI), Institute of Materials Research (iMR), Tsinghua Shenzhen International Graduate School (TSIGS), Tsinghua University, Shenzhen 518055, P. R. China
| | - Kuang Yu
- Tsinghua-Berkeley Shenzhen Institute (TBSI), Institute of Materials Research (iMR), Tsinghua Shenzhen International Graduate School (TSIGS), Tsinghua University, Shenzhen 518055, P. R. China
| |
|
35
|
Keith JA, Vassilev-Galindo V, Cheng B, Chmiela S, Gastegger M, Müller KR, Tkatchenko A. Combining Machine Learning and Computational Chemistry for Predictive Insights Into Chemical Systems. Chem Rev 2021; 121:9816-9872. [PMID: 34232033 PMCID: PMC8391798 DOI: 10.1021/acs.chemrev.1c00107] [Citation(s) in RCA: 170] [Impact Index Per Article: 56.7] [Received: 02/04/2021] [Indexed: 12/23/2022]
Abstract
Machine learning models are poised to make a transformative impact on chemical sciences by dramatically accelerating computational algorithms and amplifying insights available from computational chemistry methods. However, achieving this requires a confluence and coaction of expertise in computer science and physical sciences. This Review is written for new and experienced researchers working at the intersection of both fields. We first provide concise tutorials of computational chemistry and machine learning methods, showing how insights involving both can be achieved. We follow with a critical review of noteworthy applications that demonstrate how computational chemistry and machine learning can be used together to provide insightful (and useful) predictions in molecular and materials modeling, retrosyntheses, catalysis, and drug design.
Affiliation(s)
- John A. Keith
- Department of Chemical and Petroleum Engineering, Swanson School of Engineering, University of Pittsburgh, Pittsburgh, Pennsylvania 15261, United States
| | - Valentin Vassilev-Galindo
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
| | - Bingqing Cheng
- Accelerate Programme for Scientific Discovery, Department of Computer Science and Technology, 15 J. J. Thomson Avenue, Cambridge CB3 0FD, United Kingdom
| | - Stefan Chmiela
- Department of Software Engineering and Theoretical Computer Science, Technische Universität Berlin, 10587, Berlin, Germany
| | - Michael Gastegger
- Department of Software Engineering and Theoretical Computer Science, Technische Universität Berlin, 10587, Berlin, Germany
| | - Klaus-Robert Müller
- Machine Learning Group, Technische Universität Berlin, 10587, Berlin, Germany
- Department of Artificial Intelligence, Korea University, Anam-dong, Seongbuk-gu, Seoul, 02841, Korea
- Max-Planck-Institut für Informatik, 66123 Saarbrücken, Germany
- Google Research, Brain Team, 10117 Berlin, Germany
| | - Alexandre Tkatchenko
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
| |
|
36
|
Abstract
In recent years, the use of machine learning (ML) in computational chemistry has enabled numerous advances previously out of reach due to the computational complexity of traditional electronic-structure methods. One of the most promising applications is the construction of ML-based force fields (FFs), with the aim to narrow the gap between the accuracy of ab initio methods and the efficiency of classical FFs. The key idea is to learn the statistical relation between chemical structure and potential energy without relying on a preconceived notion of fixed chemical bonds or knowledge about the relevant interactions. Such universal ML approximations are in principle only limited by the quality and quantity of the reference data used to train them. This review gives an overview of applications of ML-FFs and the chemical insights that can be obtained from them. The core concepts underlying ML-FFs are described in detail, and a step-by-step guide for constructing and testing them from scratch is given. The text concludes with a discussion of the challenges that remain to be overcome by the next generation of ML-FFs.
Affiliation(s)
- Oliver T. Unke
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
- DFG Cluster of Excellence “Unifying Systems in Catalysis” (UniSysCat), Technische Universität Berlin, 10623 Berlin, Germany
| | - Stefan Chmiela
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
| | - Huziel E. Sauceda
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
- BASLEARN, BASF-TU Joint Lab, Technische Universität Berlin, 10587 Berlin, Germany
| | - Michael Gastegger
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
- DFG Cluster of Excellence “Unifying Systems in Catalysis” (UniSysCat), Technische Universität Berlin, 10623 Berlin, Germany
- BASLEARN, BASF-TU Joint Lab, Technische Universität Berlin, 10587 Berlin, Germany
| | - Igor Poltavsky
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
| | - Kristof T. Schütt
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
| | - Alexandre Tkatchenko
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
| | - Klaus-Robert Müller
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
- BIFOLD−Berlin Institute for the Foundations of Learning and Data, Berlin, Germany
- Department of Artificial Intelligence, Korea University, Anam-dong, Seongbuk-gu, Seoul 02841, Korea
- Max Planck Institute for Informatics, Stuhlsatzenhausweg, 66123 Saarbrücken, Germany
- Google Research, Brain Team, Berlin, Germany
| |
|
37
|
Abstract
In chemistry and physics, machine learning (ML) methods promise transformative impacts by advancing modeling and improving our understanding of complex molecules and materials. Each ML method comprises a mathematically well-defined procedure, and an increasingly larger number of easy-to-use ML packages for modeling atomistic systems are becoming available. In this Perspective, we discuss the general aspects of ML techniques in the context of creating ML force fields. We describe common features of ML modeling and quantum-mechanical approximations, so-called global and local ML models, and the physical differences behind these two classes of approaches. Finally, we describe the recent developments and emerging directions in the field of ML-driven molecular modeling. This Perspective aims to inspire interdisciplinary collaborations crossing the borders between physical chemistry, chemical physics, computer science, and data science.
Affiliation(s)
- Igor Poltavsky
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
| | - Alexandre Tkatchenko
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
| |
|
38
|
Xu M, Zhu T, Zhang JZH. Automatically Constructed Neural Network Potentials for Molecular Dynamics Simulation of Zinc Proteins. Front Chem 2021; 9:692200. [PMID: 34222200 PMCID: PMC8249736 DOI: 10.3389/fchem.2021.692200] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 04/07/2021] [Accepted: 05/10/2021] [Indexed: 11/13/2022]
Abstract
The development of accurate and efficient potential energy functions for the molecular dynamics simulation of metalloproteins has long been a great challenge for the theoretical chemistry community. An artificial neural network provides the possibility to develop potential energy functions with both the efficiency of the classical force fields and the accuracy of the quantum chemical methods. In this work, neural network potentials were automatically constructed by using the ESOINN-DP method for typical zinc proteins. For the four most common zinc coordination modes in proteins, the potential energy, atomic forces, and atomic charges predicted by neural network models show great agreement with quantum mechanics calculations and the neural network potential can maintain the coordination geometry correctly. In addition, MD simulation and energy optimization with the neural network potential can be readily used for structural refinement. The neural network potential is not limited by the function form and complex parameterization process, and important quantum effects such as polarization and charge transfer can be accurately considered. The algorithm proposed in this work can also be directly applied to proteins containing other metal ions.
Affiliation(s)
- Mingyuan Xu
- Shanghai Engineering Research Center of Molecular Therapeutics and New Drug Development, Shanghai Key Laboratory of Green Chemistry and Chemical Process, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, China
| | - Tong Zhu
- Shanghai Engineering Research Center of Molecular Therapeutics and New Drug Development, Shanghai Key Laboratory of Green Chemistry and Chemical Process, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, China
- NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai, China
| | - John Z. H. Zhang
- Shanghai Engineering Research Center of Molecular Therapeutics and New Drug Development, Shanghai Key Laboratory of Green Chemistry and Chemical Process, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, China
- NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai, China
- Department of Chemistry, New York University, New York, NY, United States
- Collaborative Innovation Center of Extreme Optics, Shanxi University, Taiyuan, China
| |
|
39
|
Qu C, Houston PL, Conte R, Nandi A, Bowman JM. Breaking the Coupled Cluster Barrier for Machine-Learned Potentials of Large Molecules: The Case of 15-Atom Acetylacetone. J Phys Chem Lett 2021; 12:4902-4909. [PMID: 34006096 PMCID: PMC8279733 DOI: 10.1021/acs.jpclett.1c01142] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Indexed: 05/05/2023]
Abstract
Machine-learned potential energy surfaces (PESs) for molecules with more than 10 atoms are typically forced to use lower-level electronic structure methods such as density functional theory (DFT) and second-order Møller-Plesset perturbation theory (MP2). While these are efficient and realistic, they fall short of the accuracy of the "gold standard" coupled-cluster method, especially with respect to reaction and isomerization barriers. We report a major step forward in applying a Δ-machine learning method to the challenging case of acetylacetone, whose MP2 barrier height for H-atom transfer is low by roughly 1.1 kcal/mol relative to the benchmark CCSD(T) barrier of 3.2 kcal/mol. From a database of 2151 local CCSD(T) energies and training with as few as 430 energies, we obtain a new PES with a barrier of 3.5 kcal/mol in agreement with the LCCSD(T) barrier of 3.5 kcal/mol and close to the benchmark value. Tunneling splittings due to H-atom transfer are calculated using this new PES, providing improved estimates over previous ones obtained using an MP2-based PES.
Affiliation(s)
- Chen Qu
- Department of Chemistry & Biochemistry, University of Maryland, College Park, Maryland 20742, United States
| | - Paul L. Houston
- Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York 14853, United States
- Department of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
| | - Riccardo Conte
- Dipartimento di Chimica, Università degli Studi di Milano, via Golgi 19, 20133 Milano, Italy
| | - Apurba Nandi
- Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, United States
| | - Joel M. Bowman
- Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, United States
| |
|
40
|
Laurens G, Rabary M, Lam J, Peláez D, Allouche AR. Infrared spectra of neutral polycyclic aromatic hydrocarbons based on machine learning potential energy surface and dipole mapping. Theor Chem Acc 2021. [DOI: 10.1007/s00214-021-02773-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/16/2022]
|
41
|
Ueno S, Tanimura Y. Modeling and Simulating the Excited-State Dynamics of a System with Condensed Phases: A Machine Learning Approach. J Chem Theory Comput 2021; 17:3618-3628. [PMID: 33999606 DOI: 10.1021/acs.jctc.1c00104] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 01/03/2023]
Abstract
Simulating the irreversible quantum dynamics of exciton- and electron-transfer problems poses a nontrivial challenge. Because the irreversibility of the system dynamics is a result of quantum thermal activation and dissipation caused by the surrounding environment, it is necessary to include infinite environmental degrees of freedom in the simulation. Because the capabilities of full quantum dynamics simulations that include the surrounding molecular degrees of freedom are limited, employing a system-bath model is a practical approach. In such a model, the dynamics of excitons or electrons are described by a system Hamiltonian, while the other degrees of freedom that arise from the environmental molecules are described by a harmonic oscillator bath (HOB) and system-bath interaction parameters. By extending on a previous study of molecular liquids [ J. Chem. Theory Comput. 2020, 16, 2099], here, we construct a system-bath model for exciton- and electron-transfer problems by means of a machine learning approach. We determine both the system and system-bath interaction parameters, including the spectral distribution of the bath, using the electronic excitation energies obtained from a quantum mechanics/molecular mechanics (QM/MM) simulation that is conducted as a function of time. Using the analytical expressions of optical response functions, we calculate linear and two-dimensional electronic spectra (2DES) for indocarbocyanine dimers in methanol. From these results, we demonstrate the capability of our approach to elucidate the nonequilibrium exciton dynamics of a quantum system in a nonintuitive manner.
|
42
|
Affiliation(s)
- Jörg Behler
- Universität Göttingen, Institut für Physikalische Chemie, Theoretische Chemie, Tammannstraße 6, 37077 Göttingen, Germany
| |
|
43
|
Vassilev-Galindo V, Fonseca G, Poltavsky I, Tkatchenko A. Challenges for machine learning force fields in reproducing potential energy surfaces of flexible molecules. J Chem Phys 2021; 154:094119. [DOI: 10.1063/5.0038516] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Indexed: 12/12/2022]
Affiliation(s)
- Valentin Vassilev-Galindo
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
| | - Gregory Fonseca
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
| | - Igor Poltavsky
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
| | - Alexandre Tkatchenko
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
| |
|
44
|
Nandi A, Qu C, Houston PL, Conte R, Bowman JM. Δ-machine learning for potential energy surfaces: A PIP approach to bring a DFT-based PES to CCSD(T) level of theory. J Chem Phys 2021; 154:051102. [DOI: 10.1063/5.0038301] [Citation(s) in RCA: 43] [Impact Index Per Article: 14.3] [Indexed: 01/09/2023]
Affiliation(s)
- Apurba Nandi
- Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, USA
| | - Chen Qu
- Department of Chemistry and Biochemistry, University of Maryland, College Park, Maryland 20742, USA
| | - Paul L. Houston
- Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York 14853, USA and Department of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332, USA
| | - Riccardo Conte
- Dipartimento di Chimica, Università Degli Studi di Milano, Via Golgi 19, 20133 Milano, Italy
| | - Joel M. Bowman
- Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, USA
| |
|
45
|
Sauceda HE, Vassilev-Galindo V, Chmiela S, Müller KR, Tkatchenko A. Dynamical strengthening of covalent and non-covalent molecular interactions by nuclear quantum effects at finite temperature. Nat Commun 2021; 12:442. [PMID: 33469007 PMCID: PMC7815839 DOI: 10.1038/s41467-020-20212-1] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 06/12/2020] [Accepted: 11/12/2020] [Indexed: 11/08/2022] Open
Abstract
Nuclear quantum effects (NQE) tend to generate delocalized molecular dynamics due to the inclusion of the zero point energy and its coupling with the anharmonicities in interatomic interactions. Here, we present evidence that NQE often enhance electronic interactions and, in turn, can result in dynamical molecular stabilization at finite temperature. The underlying physical mechanism promoted by NQE depends on the particular interaction under consideration. First, the effective reduction of interatomic distances between functional groups within a molecule can enhance the n → π* interaction by increasing the overlap between molecular orbitals or by strengthening electrostatic interactions between neighboring charge densities. Second, NQE can localize methyl rotors by temporarily changing molecular bond orders and leading to the emergence of localized transient rotor states. Third, for noncovalent van der Waals interactions the strengthening comes from the increase of the polarizability given the expanded average interatomic distances induced by NQE. The implications of these boosted interactions include counterintuitive hydroxyl-hydroxyl bonding, hindered methyl rotor dynamics, and molecular stiffening which generates smoother free-energy surfaces. Our findings yield new insights into the versatile role of nuclear quantum fluctuations in molecules and materials.
Affiliation(s)
- Huziel E Sauceda
- Department of Physics and Materials Science, University of Luxembourg, L-1511, Luxembourg City, Luxembourg.
- Machine Learning Group, Technische Universität Berlin, 10587, Berlin, Germany.
- BASLEARN, BASF-TU joint Lab, Technische Universität Berlin, 10587, Berlin, Germany.
| | - Valentin Vassilev-Galindo
- Department of Physics and Materials Science, University of Luxembourg, L-1511, Luxembourg City, Luxembourg
| | - Stefan Chmiela
- Machine Learning Group, Technische Universität Berlin, 10587, Berlin, Germany
| | - Klaus-Robert Müller
- Machine Learning Group, Technische Universität Berlin, 10587, Berlin, Germany.
- Department of Artificial Intelligence, Korea University, Anam-dong, Seongbuk-gu, Seoul, 02841, Korea.
- Max Planck Institute for Informatics, Stuhlsatzenhausweg, 66123, Saarbrücken, Germany.
- Google Research, Brain team, Berlin, Germany.
| | - Alexandre Tkatchenko
- Department of Physics and Materials Science, University of Luxembourg, L-1511, Luxembourg City, Luxembourg.
| |
|
46
|
Han R, Rodríguez-Mayorga M, Luber S. A Machine Learning Approach for MP2 Correlation Energies and Its Application to Organic Compounds. J Chem Theory Comput 2021; 17:777-790. [DOI: 10.1021/acs.jctc.0c00898] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/28/2022]
Affiliation(s)
- Ruocheng Han
- Department of Chemistry A, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland
| | | | - Sandra Luber
- Department of Chemistry A, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland
| |
|
47
|
Benoit M, Amodeo J, Combettes S, Khaled I, Roux A, Lam J. Measuring transferability issues in machine-learning force fields: the example of gold–iron interactions with linearized potentials. Mach Learn: Sci Technol 2020. [DOI: 10.1088/2632-2153/abc9fd] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 11/12/2022] Open
Abstract
Machine-learning force fields have been increasingly employed in order to extend the possibility of current first-principles calculations. However, the transferability of the obtained potential cannot always be guaranteed in situations that are outside the original database. To study such limitation, we examined the very difficult case of the interactions in gold–iron nanoparticles. For the machine-learning potential, we employed a linearized formulation that is parameterized using a penalizing regression scheme which allows us to control the complexity of the obtained potential. We showed that while having a more complex potential allows for a better agreement with the training database, it can also lead to overfitting issues and a lower accuracy in untrained systems.
|
48
|
Chen Y, Zhang L, Wang H, E W. DeePKS: A Comprehensive Data-Driven Approach toward Chemically Accurate Density Functional Theory. J Chem Theory Comput 2020; 17:170-181. [DOI: 10.1021/acs.jctc.0c00872] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Indexed: 12/31/2022]
Affiliation(s)
- Yixiao Chen
- Program in Applied and Computational Mathematics, Princeton University, Princeton, New Jersey 08544, United States
| | - Linfeng Zhang
- Program in Applied and Computational Mathematics, Princeton University, Princeton, New Jersey 08544, United States
| | - Han Wang
- Laboratory of Computational Physics, Institute of Applied Physics and Computational Mathematics, Huayuan Road 6, Beijing 100088, People’s Republic of China
| | - Weinan E
- Program in Applied and Computational Mathematics, Princeton University, Princeton, New Jersey 08544, United States
- Department of Mathematics, Princeton University, Princeton, New Jersey 08544, United States
| |
|
49
|
Zeng J, Cao L, Xu M, Zhu T, Zhang JZH. Complex reaction processes in combustion unraveled by neural network-based molecular dynamics simulation. Nat Commun 2020; 11:5713. [PMID: 33177517 DOI: 10.1038/s41467-020-19497-z] [Citation(s) in RCA: 58] [Impact Index Per Article: 14.5] [Received: 03/10/2020] [Accepted: 10/06/2020] [Indexed: 12/21/2022] Open
Abstract
Combustion is a complex chemical system which involves thousands of chemical reactions and generates hundreds of molecular species and radicals during the process. In this work, a neural network-based molecular dynamics (MD) simulation is carried out to simulate the benchmark combustion of methane. During MD simulation, detailed reaction processes leading to the creation of specific molecular species including various intermediate radicals and the products are intimately revealed and characterized. Overall, a total of 798 different chemical reactions were recorded and some new chemical reaction pathways were discovered. We believe that the present work heralds the dawn of a new era in which neural network-based reactive MD simulation can be practically applied to simulating important complex reaction systems at ab initio level, which provides atomic-level understanding of chemical reaction processes as well as discovery of new reaction pathways at an unprecedented level of detail beyond what laboratory experiments could accomplish. Gaining insights into combustion processes is challenging due to the complex reactions involved. The present work proposes a neural network potential model, trained on ab initio data, that enables simulation of the combustion of methane by predicting reactants, products and reaction intermediates.
|
50
|
Bogojeski M, Vogt-Maranto L, Tuckerman ME, Müller KR, Burke K. Quantum chemical accuracy from density functional approximations via machine learning. Nat Commun 2020; 11:5223. [PMID: 33067479 PMCID: PMC7567867 DOI: 10.1038/s41467-020-19093-1] [Citation(s) in RCA: 121] [Impact Index Per Article: 30.3] [Received: 06/01/2020] [Accepted: 09/24/2020] [Indexed: 12/21/2022] Open
Abstract
Kohn-Sham density functional theory (DFT) is a standard tool in most branches of chemistry, but accuracies for many molecules are limited to 2-3 kcal·mol⁻¹ with presently-available functionals. Ab initio methods, such as coupled-cluster, routinely produce much higher accuracy, but computational costs limit their application to small molecules. In this paper, we leverage machine learning to calculate coupled-cluster energies from DFT densities, reaching quantum chemical accuracy (errors below 1 kcal·mol⁻¹) on test data. Moreover, density-based Δ-learning (learning only the correction to a standard DFT calculation, termed Δ-DFT) significantly reduces the amount of training data required, particularly when molecular symmetries are included. The robustness of Δ-DFT is highlighted by correcting "on the fly" DFT-based molecular dynamics (MD) simulations of resorcinol (C₆H₄(OH)₂) to obtain MD trajectories with coupled-cluster accuracy. We conclude, therefore, that Δ-DFT facilitates running gas-phase MD simulations with quantum chemical accuracy, even for strained geometries and conformer changes where standard DFT fails.
Affiliation(s)
- Mihail Bogojeski
- Machine Learning Group, Technische Universität Berlin, Marchstr. 23, 10587, Berlin, Germany
| | | | - Mark E Tuckerman
- Department of Chemistry, New York University, New York, NY, 10003, USA.
- Courant Institute of Mathematical Science, New York University, New York, NY, 10012, USA.
- NYU-ECNU Center for Computational Chemistry at NYU Shanghai, 3663 Zhongshan Road North, Shanghai, 200062, China.
| | - Klaus-Robert Müller
- Machine Learning Group, Technische Universität Berlin, Marchstr. 23, 10587, Berlin, Germany.
- Department of Artificial Intelligence, Korea University, Anam-dong, Seongbuk-gu, Seoul, 02841, Korea.
- Max-Planck-Institut für Informatik, Stuhlsatzenhausweg, 66123, Saarbrücken, Germany.
| | - Kieron Burke
- Department of Physics and Astronomy, University of California, Irvine, CA, 92697, USA.
- Department of Chemistry, University of California, Irvine, CA, 92697, USA.
| |
|