1
Tokita AM, Devergne T, Saitta AM, Behler J. Free energy profiles for chemical reactions in solution from high-dimensional neural network potentials: The case of the Strecker synthesis. J Chem Phys 2025; 162:174120. [PMID: 40326597 DOI: 10.1063/5.0268948] [Received: 03/05/2025] [Accepted: 04/14/2025] [Indexed: 05/07/2025] Open
Abstract
Machine learning potentials (MLPs) have become a popular tool in chemistry and materials science as they combine the accuracy of electronic structure calculations with the high computational efficiency of analytic potentials. MLPs are particularly useful for computationally demanding simulations such as the determination of free energy profiles governing chemical reactions in solution, but to date, such applications are still rare. In this work, we show how umbrella sampling simulations can be combined with active learning of high-dimensional neural network potentials (HDNNPs) to construct free energy profiles in a systematic way. For the example of the first step of Strecker synthesis of glycine in aqueous solution, we provide a detailed analysis of the improving quality of HDNNPs for datasets of increasing size. We find that, in addition to the typical quantification of energy and force errors with respect to the underlying density functional theory data, the long-term stability of the simulations and the convergence of physical properties should be rigorously monitored to obtain reliable and converged free energy profiles of chemical reactions in solution.
Affiliation(s)
- Alea Miako Tokita
  - Lehrstuhl für Theoretische Chemie II, Ruhr-Universität Bochum, 44780 Bochum, Germany
  - Research Center Chemical Sciences and Sustainability, Research Alliance Ruhr, 44780 Bochum, Germany
- Timothée Devergne
  - Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, F-75005 Paris, France
  - Atomistic Simulations, Italian Institute of Technology, Genova, Italy and Computational Statistics and Machine Learning, Italian Institute of Technology, Genova, Italy
- A Marco Saitta
  - Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, F-75005 Paris, France
- Jörg Behler
  - Lehrstuhl für Theoretische Chemie II, Ruhr-Universität Bochum, 44780 Bochum, Germany
  - Research Center Chemical Sciences and Sustainability, Research Alliance Ruhr, 44780 Bochum, Germany
2
D'Hondt S, Oramas J, De Winter H. A beginner's approach to deep learning applied to VS and MD techniques. J Cheminform 2025; 17:47. [PMID: 40200329 PMCID: PMC11980327 DOI: 10.1186/s13321-025-00985-7] [Received: 12/20/2024] [Accepted: 03/12/2025] [Indexed: 04/10/2025] Open
Abstract
It has become impossible to imagine the fields of biochemistry and medicinal chemistry without computational chemistry and molecular modelling techniques. In many steps of the drug development process in silico methods have become indispensable. Virtual screening (VS) can tremendously expedite the early discovery phase, whilst the use of molecular dynamics (MD) simulations forms a powerful additional tool to in vitro methods throughout the entire drug discovery process. In the field of biochemistry, MD has also become a compelling method for studying biophysical systems (e.g., protein folding) complementary to experimental techniques. However, both VS and MD come with their own limitations and methodological difficulties, from hardware limitations to restrictions in algorithmic capabilities. One solution to overcoming these difficulties lies in the field of machine learning (ML), and more specifically deep learning (DL). There are many ways in which DL can be applied to these molecular modelling techniques to achieve more accurate results in a more efficient manner or expedite the data analysis of the acquired results. Despite steadily increasing interest in DL amidst computational chemists, knowledge is still limited and scattered over different resources. This review is aimed at computational chemists with knowledge of molecular modelling, who wish to possibly integrate DL approaches in their research and already have a basic understanding of the fundamentals of DL. This review focusses on a survey of recent applications of DL in molecular modelling techniques. The different sections are logically subdivided, based on where DL is integrated in the research: (1) for the improvement of VS workflows, (2) for the improvement of certain workflows in MD simulations, (3) for aiding in the calculations of interatomic forces, or (4) for data analysis of MD trajectories. 
It will become clear that DL has the capacity to completely transform the way molecular modelling is carried out.
Affiliation(s)
- Stijn D'Hondt
  - Laboratory of Medicinal Chemistry, Department of Pharmaceutical Sciences, IDLab, University of Antwerp, Universiteitsplein 1, 2610, Wilrijk, Belgium
- José Oramas
  - Department of Computer Science, Sint-Pietersvliet 7, 2000, Antwerp, Belgium
- Hans De Winter
  - Laboratory of Medicinal Chemistry, Department of Pharmaceutical Sciences, IDLab, University of Antwerp, Universiteitsplein 1, 2610, Wilrijk, Belgium
3
Stolte N, Daru J, Forbert H, Marx D, Behler J. Random Sampling Versus Active Learning Algorithms for Machine Learning Potentials of Quantum Liquid Water. J Chem Theory Comput 2025; 21:886-899. [PMID: 39808506 DOI: 10.1021/acs.jctc.4c01382] [Indexed: 01/16/2025]
Abstract
Training accurate machine learning potentials requires electronic structure data comprehensively covering the configurational space of the system of interest. As the construction of this data is computationally demanding, many schemes for identifying the most important structures have been proposed. Here, we compare the performance of high-dimensional neural network potentials (HDNNPs) for quantum liquid water at ambient conditions trained to data sets constructed using random sampling as well as various flavors of active learning based on query by committee. Contrary to the common understanding of active learning, we find that for a given data set size, random sampling leads to smaller test errors for structures not included in the training process. In our analysis, we show that this can be related to small energy offsets caused by a bias in structures added in active learning, which can be overcome by using instead energy correlations as an error measure that is invariant to such shifts. Still, all HDNNPs yield very similar and accurate structural properties of quantum liquid water, which demonstrates the robustness of the training procedure with respect to the training set construction algorithm even when trained to as few as 200 structures. However, we find that for active learning based on preliminary potentials, a reasonable initial data set is important to avoid an unnecessary extension of the covered configuration space to less relevant regions.
Affiliation(s)
- Nore Stolte
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, Bochum 44780, Germany
- János Daru
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, Bochum 44780, Germany
  - Department of Organic Chemistry, Eötvös Loránd University, Budapest 1117, Hungary
- Harald Forbert
  - Center for Solvation Science ZEMOS, Ruhr-Universität Bochum, Bochum 44780, Germany
- Dominik Marx
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, Bochum 44780, Germany
- Jörg Behler
  - Lehrstuhl für Theoretische Chemie II, Ruhr-Universität Bochum, Bochum 44780, Germany
  - Research Center Chemical Sciences and Sustainability, Research Alliance Ruhr, Bochum 44780, Germany
4
Schienbein P, Blumberger J. Data-Efficient Active Learning for Thermodynamic Integration: Acidity Constants of BiVO4 in Water. Chemphyschem 2025; 26:e202400490. [PMID: 39365878 DOI: 10.1002/cphc.202400490] [Received: 04/29/2024] [Revised: 10/02/2024] [Accepted: 10/02/2024] [Indexed: 10/06/2024]
Abstract
The protonation state of molecules and surfaces is pivotal in various disciplines, including (electro-)catalysis, geochemistry, biochemistry, and pharmaceutics. Accurately and efficiently determining acidity constants is critical yet challenging, particularly when explicitly considering the electronic structure, thermal fluctuations, anharmonic vibrations, and solvation effects. In this research, we employ thermodynamic integration accelerated by committee neural network potentials, training a single machine learning model that accurately describes the relevant protonated, deprotonated, and intermediate states. We investigate two deprotonation reactions at the BiVO4 (010)-water interface, a promising candidate for efficient photocatalytic water splitting. Our results illustrate the convergence of the required ensemble averages over simulation time and of the final acidity constant as a function of the Kirkwood coupling parameter. We demonstrate that simulation times on the order of nanoseconds are required for statistical convergence. This time scale is currently unachievable with explicit ab-initio molecular dynamics simulations at the hybrid DFT level of theory. In contrast, our machine learning workflow only requires a few hundred DFT single point calculations for training and testing. Exploiting the extended time scales accessible, we furthermore assess the effect of commonly applied bias potentials. Thus, our study significantly advances calculating free energy differences with ab-initio accuracy.
Affiliation(s)
- Philipp Schienbein
  - Department of Physics and Astronomy and Thomas Young Centre, University College London, London, WC1E 6BT, United Kingdom
  - Present address: Lehrstuhl für Theoretische Chemie II, Ruhr-Universität Bochum, Bochum, 44780, Germany
  - Research Center Chemical Sciences and Sustainability, Research Alliance Ruhr, Bochum, 44780, Germany
- Jochen Blumberger
  - Department of Physics and Astronomy and Thomas Young Centre, University College London, London, WC1E 6BT, United Kingdom
5
Stolte N, Daru J, Forbert H, Behler J, Marx D. Nuclear Quantum Effects in Liquid Water Are Marginal for Its Average Structure but Significant for Dynamics. J Phys Chem Lett 2024; 15:12144-12150. [PMID: 39607891 DOI: 10.1021/acs.jpclett.4c02925] [Indexed: 11/30/2024]
Abstract
Isotopic substitution, which can be realized in both experiment and computer simulations, is a direct approach to assess the role of nuclear quantum effects on the structure and dynamics of matter. However, the impact of nuclear quantum effects on the structure of liquid water as probed in experiment by comparing normal to heavy water has remained controversial. To settle this issue, we employ a highly accurate machine-learned high-dimensional neural network potential to perform converged coupled cluster-quality path integral simulations of liquid H2O versus D2O at ambient conditions. We find substantial H/D quantum effects on the rotational and translational dynamics of water, in close agreement with the experimental benchmarks. However, in stark contrast to the role for dynamics, H/D quantum effects turn out to be small, on the order of 1/1000 Å, on both average intramolecular and H-bonding structures of water. The most probable structure of water remains nearly unaffected by nuclear quantum effects, but effects on fluctuations away from average are appreciable, rendering H2O substantially more "liquid" than D2O.
Affiliation(s)
- Nore Stolte
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, 44780 Bochum, Germany
- János Daru
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, 44780 Bochum, Germany
  - Department of Organic Chemistry, Eötvös Loránd University, 1117 Budapest, Hungary
- Harald Forbert
  - Center for Solvation Science ZEMOS, Ruhr-Universität Bochum, 44780 Bochum, Germany
- Jörg Behler
  - Lehrstuhl für Theoretische Chemie II, Ruhr-Universität Bochum, 44780 Bochum, Germany
  - Research Center Chemical Sciences and Sustainability, Research Alliance Ruhr, 44780 Bochum, Germany
- Dominik Marx
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, 44780 Bochum, Germany
6
Thiemann FL, O'Neill N, Kapil V, Michaelides A, Schran C. Introduction to machine learning potentials for atomistic simulations. J Phys Condens Matter 2024; 37:073002. [PMID: 39577092 DOI: 10.1088/1361-648x/ad9657] [Received: 07/16/2024] [Accepted: 11/22/2024] [Indexed: 11/24/2024]
Abstract
Machine learning potentials have revolutionised the field of atomistic simulations in recent years and are becoming a mainstay in the toolbox of computational scientists. This paper aims to provide an overview of and introduction to machine learning potentials and their practical application to scientific problems. We provide a systematic guide for developing machine learning potentials, reviewing chemical descriptors, regression models, data generation and validation approaches. We begin with an emphasis on the earlier generation of models, such as high-dimensional neural network potentials and Gaussian approximation potentials, to provide historical perspective and guide the reader towards the understanding of recent developments, which are discussed in detail thereafter. Furthermore, we refer to relevant expert reviews, open-source software, and practical examples, further lowering the barrier to exploring these methods. The paper ends with selected showcase examples, highlighting the capabilities of machine learning potentials and how they can be applied to push the boundaries in atomistic simulations.
Affiliation(s)
- Fabian L Thiemann
  - IBM Research Europe, Daresbury, Warrington WA4 4AD, United Kingdom
  - Cavendish Laboratory, Department of Physics, University of Cambridge, Cambridge CB3 0HE, United Kingdom
- Niamh O'Neill
  - Cavendish Laboratory, Department of Physics, University of Cambridge, Cambridge CB3 0HE, United Kingdom
  - Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
  - Lennard-Jones Centre, University of Cambridge, Trinity Ln, Cambridge CB2 1TN, United Kingdom
- Venkat Kapil
  - Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
  - Lennard-Jones Centre, University of Cambridge, Trinity Ln, Cambridge CB2 1TN, United Kingdom
  - Department of Physics and Astronomy, University College London, London, United Kingdom
  - Thomas Young Centre and London Centre for Nanotechnology, London, United Kingdom
- Angelos Michaelides
  - Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
  - Lennard-Jones Centre, University of Cambridge, Trinity Ln, Cambridge CB2 1TN, United Kingdom
- Christoph Schran
  - Cavendish Laboratory, Department of Physics, University of Cambridge, Cambridge CB3 0HE, United Kingdom
  - Lennard-Jones Centre, University of Cambridge, Trinity Ln, Cambridge CB2 1TN, United Kingdom
7
Yamaguchi K, Miyagawa K, Shoji M, Kawakami T, Isobe H, Yamanaka S, Nakajima T. Theoretical elucidation of the structure, bonding, and reactivity of the CaMn4Ox clusters in the whole Kok cycle for water oxidation embedded in the oxygen evolving center of photosystem II. New molecular and quantum insights into the mechanism of the O-O bond formation. Photosynth Res 2024; 162:291-330. [PMID: 37945776 PMCID: PMC11614991 DOI: 10.1007/s11120-023-01053-7] [Received: 07/05/2023] [Accepted: 09/25/2023] [Indexed: 11/12/2023]
Abstract
This paper reviews our historical developments of broken-symmetry (BS) and beyond-BS methods that are applicable to theoretical investigations of metalloenzymes such as the OEC in PSII. BS hybrid DFT (HDFT) calculations starting from the high-resolution (HR) XRD structure in the most stable S1 state have been performed to elucidate the structure and bonding of all possible intermediates of the CaMn4Ox cluster (1) in the Si (i = 0-4) states of the Kok cycle. Large-scale HDFT/MM computations starting from the HR XRD structure have been performed to elucidate the biomolecular system structures that are crucial for examining possible water inlet and proton release pathways for water oxidation in the OEC of PSII. DLPNO CCSD(T0) computations have been performed to assess the scope and reliability of the relative energies among the intermediates obtained by HDFT. These computations, combined with EXAFS, XRD, XFEL, and EPR experimental results, have elucidated the structure, bonding, and reactivity of the key intermediates, which are indispensable for understanding and explaining the mechanism of water oxidation in the OEC of PSII. The interplay between theory and experiment has elucidated the important roles of four degrees of freedom (spin, charge, orbital, and nuclear motion) in understanding and explaining the chemical reactivity of 1 embedded in the protein matrix, indicating the participation of the Ca(H2O)n ion and the tyrosine (Yz)-O radical as a one-electron acceptor in O-O bond formation. The Ca-assisted, Yz-coupled O-O bond formation mechanisms for water oxidation are consistent with recent XES and very recent time-resolved SFX XFEL and FTIR results.
Affiliation(s)
- Kizashi Yamaguchi
  - Center for Quantum Information and Quantum Biology, Osaka University, Toyonaka, Osaka, 560-0043, Japan
  - RIKEN Center for Computational Science, Kobe, Hyogo, 650-0047, Japan
  - SANKEN, Osaka University, Ibaraki, Osaka, 567-0047, Japan
- Koichi Miyagawa
  - Center of Computational Sciences, University of Tsukuba, Tsukuba, Ibaraki, 305-8577, Japan
- Mitsuo Shoji
  - Center of Computational Sciences, University of Tsukuba, Tsukuba, Ibaraki, 305-8577, Japan
- Takashi Kawakami
  - RIKEN Center for Computational Science, Kobe, Hyogo, 650-0047, Japan
  - Graduate School of Science, Osaka University, Toyonaka, Osaka, 560-0043, Japan
- Hiroshi Isobe
  - Research Institute for Interdisciplinary Science, and Graduate School of Natural Science and Technology, Okayama University, Okayama, 700-8530, Japan
- Shusuke Yamanaka
  - Graduate School of Science, Osaka University, Toyonaka, Osaka, 560-0043, Japan
- Takahito Nakajima
  - RIKEN Center for Computational Science, Kobe, Hyogo, 650-0047, Japan
8
Méndez E, Laria D, Hunt D. Proton quantal delocalization and H/D translocations in (MeOH)nH+ (n = 2, 3). J Chem Phys 2024; 161:174303. [PMID: 39484904 DOI: 10.1063/5.0234264] [Received: 08/20/2024] [Accepted: 10/11/2024] [Indexed: 11/03/2024] Open
Abstract
In this study, we present results from path integral molecular dynamics simulations that describe the characteristics of the quantum spatial delocalizations of protons participating in OH bonds in (MeOH)2H+ and in (MeOH)3H+. The characterization was carried out by examining the overall structures of the corresponding isomorphic polymers. To introduce full flexibility in the force treatment, we have adopted a neural network fitting procedure based on second-order Møller-Plesset perturbation theory predictions. For the dimer case, we found that the spatial extent of the shared connective proton can be portrayed in terms of a prolate-like structure with typical dimensions of ∼0.1 Å. On the other hand, the dangling polymers lie confined within a thin spherical layer, spread over length scales of the order of ∼0.25 Å. In contrast, connective protons in (MeOH)3H+ exhibit larger delocalizations along the O-H bond and more localized ones along perpendicular directions, compared to their dangling counterparts. We also examined the characteristics of the relative propensities of H and D isotopes to be localized in dangling and connective positions. Physical interpretations of the different thermodynamic trends are provided in terms of the local geometrical characteristics and of the strengths of the corresponding intermolecular connectivities.
Affiliation(s)
- Emilio Méndez
  - Sorbonne Université CNRS, Physico-chimie des Electrolytes et Nanosystèmes Interfaciaux, PHENIX, F-75005 Paris, France
- Daniel Laria
  - Departamento de Física de la Materia Condensada, GIyA, CAC-CNEA, 1650 San Martín, Buenos Aires, Argentina and Departamento de Química Inorgánica, Analítica y Química-Física, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Ciudad Universitaria, Pabellón II, 1428 Buenos Aires, Argentina
- Diego Hunt
  - Departamento de Física de la Materia Condensada, GIyA, CAC-CNEA, 1650 San Martín, Buenos Aires, Argentina, Instituto de Nanociencia y Nanotecnología, CNEA-CONICET, Buenos Aires, Argentina
9
Beckmann R, Schran C, Brieuc F, Marx D. Theoretical infrared spectroscopy of protonated methane isotopologues. Phys Chem Chem Phys 2024; 26:22846-22852. [PMID: 39171731 DOI: 10.1039/d4cp02295e] [Indexed: 08/23/2024]
Abstract
The vibrational spectroscopy of protonated methane and its mixed hydrogen/deuterium isotopologues remains a challenge to both experimental and computational spectroscopy due to the iconic floppiness of CH5+. Here, we compute the finite-temperature broadband infrared spectra of CH5+ and all its isotopologues, i.e. CHnD5-n+ up to CD5+, from path integral molecular dynamics in conjunction with interactions and dipoles computed consistently at CCSD(T) coupled cluster accuracy. The potential energy and dipole moment surfaces have been accurately represented in full dimensionality in terms of high-dimensional neural networks. The resulting computational efficiency allows us to establish CCSD(T) accuracy at the level of converged path integral simulations. For all six isotopologues, the computed broadband spectra compare very favorably to the available experimental broadband spectra obtained from laser induced reactions action vibrational spectroscopy. The current approach is found to consistently and significantly improve on previous calculations of these broadband vibrational spectra and defines the new cutting-edge for what has been dubbed the "enfant terrible" of molecular spectroscopy in view of its pronounced large-amplitude motion that involves all intramolecular degrees of freedom.
Affiliation(s)
- Richard Beckmann
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, 44780 Bochum, Germany
- Christoph Schran
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, 44780 Bochum, Germany
- Fabien Brieuc
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, 44780 Bochum, Germany
- Dominik Marx
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, 44780 Bochum, Germany
10
Tu NTP, Williamson S, Johnson ER, Rowley CN. Modeling Intermolecular Interactions with Exchange-Hole Dipole Moment Dispersion Corrections to Neural Network Potentials. J Phys Chem B 2024; 128:8290-8302. [PMID: 39166778 DOI: 10.1021/acs.jpcb.4c02882] [Indexed: 08/23/2024]
Abstract
Neural network potentials (NNPs) are an innovative approach for calculating the potential energy and forces of a chemical system. In principle, these methods are capable of modeling large systems with an accuracy approaching that of a high-level ab initio calculation, but with a much smaller computational cost. Due to their training to density-functional theory (DFT) data and neglect of long-range interactions, some classes of NNPs require an additional term to include London dispersion physics. In this Perspective, we discuss the requirements for a dispersion model for use with an NNP, focusing on the MLXDM (Machine Learned eXchange-Hole Dipole Moment) model developed by our groups. This model is based on the DFT-based XDM dispersion correction, which calculates interatomic dispersion coefficients in terms of atomic moments and polarizabilities, both of which can be approximated effectively using neural networks.
Affiliation(s)
- Siri Williamson
  - Department of Chemistry, Carleton University, Ottawa, Ontario K1S 5B6, Canada
- Erin R Johnson
  - Department of Chemistry, Dalhousie University, Halifax, Nova Scotia B3H 4J3, Canada
11
Nagy PR. State-of-the-art local correlation methods enable affordable gold standard quantum chemistry for up to hundreds of atoms. Chem Sci 2024:d4sc04755a. [PMID: 39246365 PMCID: PMC11376132 DOI: 10.1039/d4sc04755a] [Received: 03/27/2024] [Accepted: 07/30/2024] [Indexed: 09/10/2024] Open
Abstract
In this feature, we review the current capabilities of local electron correlation methods up to the coupled cluster model with single, double, and perturbative triple excitations [CCSD(T)], which is a gold standard in quantum chemistry. The main computational aspects of the local method types are assessed from the perspective of applications, but the focus is kept on how to achieve chemical accuracy (i.e., <1 kcal mol⁻¹ uncertainty), as well as on the broad scope of chemical problems made accessible. The performance of state-of-the-art methods is also compared, including the most employed DLPNO and, in particular, our local natural orbital (LNO) CCSD(T) approach. The high accuracy and efficiency of the LNO method make chemically accurate CCSD(T) computations accessible for molecules of hundreds of atoms with resources affordable to a broad computational community (days on a single CPU and 10-100 GB of memory). Recent developments in LNO-CCSD(T) enable systematic convergence and robust error estimates even for systems of complicated electronic structure or larger size (up to 1000 atoms). The predictive power of current local CCSD(T) methods, usually at about 1-2 orders of magnitude higher cost than hybrid density functional theory (DFT), has become outstanding on the palette of computational chemistry applicable for molecules of practical interest. We also review more than 50 LNO-based and other advanced local-CCSD(T) applications for realistic, large systems across molecular interactions as well as main group, transition metal, bio-, and surface chemistry. The examples show that properly executed local-CCSD(T) can contribute to binding, reaction equilibrium, rate constants, etc. which are able to match measurements within the error estimates.
These applications demonstrate that modern, open-access, and broadly affordable local methods, such as LNO-CCSD(T), already enable predictive computations and atomistic insight for complicated, real-life molecular processes in realistic environments.
Affiliation(s)
- Péter R Nagy
  - Department of Physical Chemistry and Materials Science, Faculty of Chemical Technology and Biotechnology, Budapest University of Technology and Economics, Műegyetem rkp. 3., H-1111 Budapest, Hungary
  - HUN-REN-BME Quantum Chemistry Research Group, Műegyetem rkp. 3., H-1111 Budapest, Hungary
  - MTA-BME Lendület Quantum Chemistry Research Group, Műegyetem rkp. 3., H-1111 Budapest, Hungary
12
Butin O, Pereyaslavets L, Kamath G, Illarionov A, Sakipov S, Kurnikov IV, Voronina E, Ivahnenko I, Leontyev I, Nawrocki G, Darkhovskiy M, Olevanov M, Cherniavskyi YK, Lock C, Greenslade S, Kornberg RD, Levitt M, Fain B. The Determination of Free Energy of Hydration of Water Ions from First Principles. J Chem Theory Comput 2024; 20:5215-5224. [PMID: 38842599 PMCID: PMC11881599 DOI: 10.1021/acs.jctc.3c01411] [Indexed: 06/07/2024]
Abstract
We model the autoionization of water by determining the free energy of hydration of the major intermediate species of water ions. We represent the smallest ions (the hydroxide ion OH-, the hydronium ion H3O+, and the Zundel ion H5O2+) by bonded models and the more extended ionic structures by strong nonbonded interactions (e.g., the Eigen H9O4+ = H3O+ + 3(H2O) and the Stoyanov H13O6+ = H5O2+ + 4(H2O)). Our models are faithful to the precise QM energies and their components to within 1% or less. Using the calculated free energies and atomization energies, we compute the pKa of pure water from first principles as a consistency check and arrive at a value within 1.3 log units of the experimental one. From these calculations, we conclude that the hydronium ion, and its hydrated state, the Eigen cation, are the dominant species in the water autoionization process.
Affiliation(s)
- Oleg Butin
  - InterX, Inc. (a subsidiary of NeoTX Therapeutics, Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Leonid Pereyaslavets
  - InterX, Inc. (a subsidiary of NeoTX Therapeutics, Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Ganesh Kamath
  - InterX, Inc. (a subsidiary of NeoTX Therapeutics, Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Alexey Illarionov
  - InterX, Inc. (a subsidiary of NeoTX Therapeutics, Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Serzhan Sakipov
  - InterX, Inc. (a subsidiary of NeoTX Therapeutics, Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Igor V Kurnikov
  - InterX, Inc. (a subsidiary of NeoTX Therapeutics, Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Ekaterina Voronina
  - InterX, Inc. (a subsidiary of NeoTX Therapeutics, Ltd.), 805 Allston Way, Berkeley, California 94710, United States
  - Skobeltsyn Institute of Nuclear Physics, Lomonosov Moscow State University, Moscow 119991, Russia
- Ilya Ivahnenko
  - InterX, Inc. (a subsidiary of NeoTX Therapeutics, Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Igor Leontyev
  - InterX, Inc. (a subsidiary of NeoTX Therapeutics, Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Grzegorz Nawrocki
  - InterX, Inc. (a subsidiary of NeoTX Therapeutics, Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Mikhail Darkhovskiy
  - InterX, Inc. (a subsidiary of NeoTX Therapeutics, Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Michael Olevanov
  - InterX, Inc. (a subsidiary of NeoTX Therapeutics, Ltd.), 805 Allston Way, Berkeley, California 94710, United States
  - Department of Physics, Lomonosov Moscow State University, Moscow 119991, Russia
- Yevhen K Cherniavskyi
  - InterX, Inc. (a subsidiary of NeoTX Therapeutics, Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Christopher Lock
  - InterX, Inc. (a subsidiary of NeoTX Therapeutics, Ltd.), 805 Allston Way, Berkeley, California 94710, United States
  - Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Palo Alto, California 94304, United States
- Sean Greenslade
  - InterX, Inc. (a subsidiary of NeoTX Therapeutics, Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Roger D Kornberg
  - Department of Structural Biology, Stanford University School of Medicine, Stanford, California 94305, United States
- Michael Levitt
  - Department of Structural Biology, Stanford University School of Medicine, Stanford, California 94305, United States
- Boris Fain
  - InterX, Inc. (a subsidiary of NeoTX Therapeutics, Ltd.), 805 Allston Way, Berkeley, California 94710, United States
13
|
Pollak E. A personal perspective of the present status and future challenges facing thermal reaction rate theory. J Chem Phys 2024; 160:150902. [PMID: 38639316 DOI: 10.1063/5.0199557] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/23/2024] [Accepted: 03/06/2024] [Indexed: 04/20/2024] Open
Abstract
Reaction rate theory has been at the center of physical chemistry for well over one hundred years. The evolution of the theory is not only of historical interest. Reliable and accurate computation of reaction rates remains a challenge to this very day, especially in view of the development of quantum chemistry methods, which predict the relevant force fields. It is still not possible to compute the numerically exact rate on the fly when the system has more than at most a few dozen anharmonic degrees of freedom, so one must consider various approximate methods, not only from the practical point of view of constructing numerical algorithms but also on conceptual and formal levels. In this Perspective, I present some of the recent analytical results concerning leading order terms in an ℏ²ᵐ series expansion of the exact rate and their implications on various approximate theories. A second aspect has to do with the crossover temperature between tunneling and thermal activation. Using a uniform semiclassical transmission probability rather than the "primitive" semiclassical theory leads to the conclusion that there is no divergence problem associated with a "crossover temperature." If one defines a semiclassical crossover temperature as the point at which the tunneling energy of the instanton equals the barrier height, then it is a factor of two higher than its previous estimate based on the "primitive" semiclassical approximation. In the low temperature tunneling regime, the uniform semiclassical theory as well as the "primitive" semiclassical theory were based on the classical Euclidean action of a periodic orbit on the inverted potential. The uniform semiclassical theory wrongly predicts that the "half-point," which is the energy at which the transmission probability equals 1/2, for any barrier potential, is always the barrier energy. We describe here how augmenting the Euclidean action with constant terms of order ℏ² can significantly improve the accuracy of the semiclassical theory and correct this deficiency. This also leads to a deep connection with and improvement of vibrational perturbation theory. The uniform semiclassical theory also enables an extension of the quantum version of Kramers' turnover theory to temperatures below the "crossover temperature." The implications of these recent advances on various approximate methods used to date are discussed at length, leading to the conclusion that reaction rate theory will continue to challenge us both on conceptual and practical levels for years to come.
Affiliation(s)
- Eli Pollak
  - Chemical and Biological Physics Department, Weizmann Institute of Science, 76100 Rehovoth, Israel
14
Maxson T, Szilvási T. Transferable Water Potentials Using Equivariant Neural Networks. J Phys Chem Lett 2024; 15:3740-3747. [PMID: 38547514 DOI: 10.1021/acs.jpclett.4c00605] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/12/2024]
Abstract
Machine learning interatomic potentials (MLIPs) have emerged as a technique that promises quantum theory accuracy for reduced cost. It has been proposed [J. Chem. Phys. 2023, 158, 084111] that MLIPs trained solely on liquid water data cannot accurately transfer to the vapor-liquid equilibrium while recovering the many-body decomposition (MBD) analysis of gas-phase water clusters. This suggests that MLIPs do not directly learn the physically correct interactions of water molecules, limiting transferability. In this work, we show that MLIPs using an equivariant architecture and trained on 3200 liquid water structures reproduce liquid-phase water properties (e.g., density within 0.003 g/cm³ between 230 and 365 K), vapor-liquid equilibrium properties up to 550 K, the MBD analysis of gas-phase water clusters up to six-body interactions, and the relative energy and the vibrational density of states of ice phases. We show that potentials developed using equivariant MLIPs allow transferability for arbitrary phases of water that remain stable in nanosecond-long simulations.
Affiliation(s)
- Tristan Maxson
  - Department of Chemical and Biological Engineering, University of Alabama, Tuscaloosa, Alabama 35487, United States
- Tibor Szilvási
  - Department of Chemical and Biological Engineering, University of Alabama, Tuscaloosa, Alabama 35487, United States
15
Brezina K, Beck H, Marsalek O. Reducing the Cost of Neural Network Potential Generation for Reactive Molecular Systems. J Chem Theory Comput 2023; 19:6589-6604. [PMID: 37747971 PMCID: PMC10569056 DOI: 10.1021/acs.jctc.3c00391] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 04/06/2023] [Indexed: 09/27/2023]
Abstract
Although machine learning potentials have recently had a substantial impact on molecular simulations, the construction of a robust training set can still become a limiting factor, especially due to the requirement of a reference ab initio simulation that covers all the relevant geometries of the system. Recognizing that this can be prohibitive for certain systems, we develop the method of transition tube sampling that mitigates the computational cost of training set and model generation. In this approach, we generate classical or quantum thermal geometries around a transition path describing a conformational change or a chemical reaction using only a sparse set of local normal mode expansions along this path and select from these geometries by an active learning protocol. This yields a training set with geometries that characterize the whole transition without the need for a costly reference trajectory. The performance of the method is evaluated on different molecular systems with the complexity of the potential energy landscape increasing from a single minimum to a double proton-transfer reaction with high barriers. Our results show that the method leads to training sets that give rise to models applicable in classical and path integral simulations alike that are on par with those based directly on ab initio calculations while providing the computational speedup we have come to expect from machine learning potentials.
Affiliation(s)
- Krystof Brezina
  - Charles University, Faculty of Mathematics and Physics, Ke Karlovu 3, 121 16, Prague 2, Czech Republic
- Hubert Beck
  - Charles University, Faculty of Mathematics and Physics, Ke Karlovu 3, 121 16, Prague 2, Czech Republic
- Ondrej Marsalek
  - Charles University, Faculty of Mathematics and Physics, Ke Karlovu 3, 121 16, Prague 2, Czech Republic
16
Tokita AM, Behler J. How to train a neural network potential. J Chem Phys 2023; 159:121501. [PMID: 38127396 DOI: 10.1063/5.0160326] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 05/31/2023] [Accepted: 07/24/2023] [Indexed: 12/23/2023]
Abstract
The introduction of modern Machine Learning Potentials (MLPs) has led to a paradigm change in the development of potential energy surfaces for atomistic simulations. By providing efficient access to energies and forces, they allow us to perform large-scale simulations of extended systems, which are not directly accessible by demanding first-principles methods. In these simulations, MLPs can reach the accuracy of electronic structure calculations, provided that they have been properly trained and validated using a suitable set of reference data. Due to their highly flexible functional form, the construction of MLPs has to be done with great care. In this Tutorial, we describe the necessary key steps for training reliable MLPs, from data generation via training to final validation. The procedure, which is illustrated for the example of a high-dimensional neural network potential, is general and applicable to many types of MLPs.
Affiliation(s)
- Alea Miako Tokita
  - Lehrstuhl für Theoretische Chemie II, Ruhr-Universität Bochum, 44780 Bochum, Germany and Research Center Chemical Sciences and Sustainability, Research Alliance Ruhr, 44780 Bochum, Germany
- Jörg Behler
  - Lehrstuhl für Theoretische Chemie II, Ruhr-Universität Bochum, 44780 Bochum, Germany and Research Center Chemical Sciences and Sustainability, Research Alliance Ruhr, 44780 Bochum, Germany
17
Atsango AO, Morawietz T, Marsalek O, Markland TE. Developing machine-learned potentials to simultaneously capture the dynamics of excess protons and hydroxide ions in classical and path integral simulations. J Chem Phys 2023; 159:074101. [PMID: 37581418 DOI: 10.1063/5.0162066] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2023] [Accepted: 07/31/2023] [Indexed: 08/16/2023]
Abstract
The transport of excess protons and hydroxide ions in water underlies numerous important chemical and biological processes. Accurately simulating the associated transport mechanisms ideally requires utilizing ab initio molecular dynamics simulations to model the bond breaking and formation involved in proton transfer and path-integral simulations to model the nuclear quantum effects relevant to light hydrogen atoms. These requirements result in a prohibitive computational cost, especially at the time and length scales needed to converge proton transport properties. Here, we present machine-learned potentials (MLPs) that can model both excess protons and hydroxide ions at the generalized gradient approximation and hybrid density functional theory levels of accuracy and use them to perform multiple nanoseconds of both classical and path-integral proton defect simulations at a fraction of the cost of the corresponding ab initio simulations. We show that the MLPs are able to reproduce ab initio trends and converge properties such as the diffusion coefficients of both excess protons and hydroxide ions. We use our multi-nanosecond simulations, which allow us to monitor large numbers of proton transfer events, to analyze the role of hypercoordination in the transport mechanism of the hydroxide ion and provide further evidence for the asymmetry in diffusion between excess protons and hydroxide ions.
Affiliation(s)
- Austin O Atsango
  - Department of Chemistry, Stanford University, Stanford, California 94305, USA
- Tobias Morawietz
  - Department of Chemistry, Stanford University, Stanford, California 94305, USA
- Ondrej Marsalek
  - Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic
- Thomas E Markland
  - Department of Chemistry, Stanford University, Stanford, California 94305, USA
18
Riera M, Knight C, Bull-Vulpe EF, Zhu X, Agnew H, Smith DGA, Simmonett AC, Paesani F. MBX: A many-body energy and force calculator for data-driven many-body simulations. J Chem Phys 2023; 159:054802. [PMID: 37526156 PMCID: PMC10550339 DOI: 10.1063/5.0156036] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 04/25/2023] [Accepted: 07/11/2023] [Indexed: 08/02/2023]
Abstract
Many-Body eXpansion (MBX) is a C++ library that implements many-body potential energy functions (PEFs) within the "many-body energy" (MB-nrg) formalism. MB-nrg PEFs integrate an underlying polarizable model with explicit machine-learned representations of many-body interactions to achieve chemical accuracy from the gas to the condensed phases. MBX can be employed either as a stand-alone package or as an energy/force engine that can be integrated with generic software for molecular dynamics and Monte Carlo simulations. MBX is parallelized internally using Open Multi-Processing and can utilize Message Passing Interface when available in interfaced molecular simulation software. MBX enables classical and quantum molecular simulations with MB-nrg PEFs, as well as hybrid simulations that combine conventional force fields and MB-nrg PEFs, for diverse systems ranging from small gas-phase clusters to aqueous solutions and molecular fluids to biomolecular systems and metal-organic frameworks.
Affiliation(s)
- Marc Riera
  - Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, USA
- Christopher Knight
  - Argonne National Laboratory, Computational Science Division, Lemont, Illinois 60439, USA
- Ethan F. Bull-Vulpe
  - Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, USA
- Xuanyu Zhu
  - Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, USA
- Henry Agnew
  - Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, USA
- Andrew C. Simmonett
  - Laboratory of Computational Biology, National Heart, Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
19
Chen MS, Lee J, Ye HZ, Berkelbach TC, Reichman DR, Markland TE. Data-Efficient Machine Learning Potentials from Transfer Learning of Periodic Correlated Electronic Structure Methods: Liquid Water at AFQMC, CCSD, and CCSD(T) Accuracy. J Chem Theory Comput 2023; 19:4510-4519. [PMID: 36730728 DOI: 10.1021/acs.jctc.2c01203] [Citation(s) in RCA: 26] [Impact Index Per Article: 13.0] [Indexed: 02/04/2023]
Abstract
Obtaining the atomistic structure and dynamics of disordered condensed-phase systems from first-principles remains one of the forefront challenges of chemical theory. Here we exploit recent advances in periodic electronic structure and provide a data-efficient approach to obtain machine-learned condensed-phase potential energy surfaces using AFQMC, CCSD, and CCSD(T) from a very small number (≤200) of energies by leveraging a transfer learning scheme starting from lower-tier electronic structure methods. We demonstrate the effectiveness of this approach for liquid water by performing both classical and path integral molecular dynamics simulations on these machine-learned potential energy surfaces. By doing this, we uncover the interplay of dynamical electron correlation and nuclear quantum effects across the entire liquid range of water while providing a general strategy for efficiently utilizing periodic correlated electronic structure methods to explore disordered condensed-phase systems.
Affiliation(s)
- Michael S Chen
  - Department of Chemistry, Stanford University, Stanford, California 94305, United States
- Joonho Lee
  - Department of Chemistry, Columbia University, New York, New York 10027, United States
- Hong-Zhou Ye
  - Department of Chemistry, Columbia University, New York, New York 10027, United States
- Timothy C Berkelbach
  - Department of Chemistry, Columbia University, New York, New York 10027, United States
  - Center for Computational Quantum Physics, Flatiron Institute, New York, New York 10010, United States
- David R Reichman
  - Department of Chemistry, Columbia University, New York, New York 10027, United States
- Thomas E Markland
  - Department of Chemistry, Stanford University, Stanford, California 94305, United States
20
Tang Z, Bromley ST, Hammer B. A machine learning potential for simulating infrared spectra of nanosilicate clusters. J Chem Phys 2023; 158:2895243. [PMID: 37290080 DOI: 10.1063/5.0150379] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2023] [Accepted: 05/23/2023] [Indexed: 06/10/2023]
Abstract
The use of machine learning (ML) in chemical physics has enabled the construction of interatomic potentials having the accuracy of ab initio methods and a computational cost comparable to that of classical force fields. Training an ML model requires an efficient method for the generation of training data. Here, we apply an accurate and efficient protocol to collect training data for constructing a neural network-based ML interatomic potential for nanosilicate clusters. Initial training data are taken from normal modes and farthest point sampling. Later on, the set of training data is extended via an active learning strategy in which new data are identified by the disagreement between an ensemble of ML models. The whole process is further accelerated by parallel sampling over structures. We use the ML model to run molecular dynamics simulations of nanosilicate clusters with various sizes, from which infrared spectra with anharmonicity included can be extracted. Such spectroscopic data are needed for understanding the properties of silicate dust grains in the interstellar medium and in circumstellar environments.
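The ensemble-disagreement selection described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the use of the population standard deviation as the disagreement measure, the threshold, and the energies are all hypothetical choices made for illustration.

```python
from statistics import pstdev


def select_by_disagreement(ensemble_predictions, threshold):
    """Return indices of structures whose committee spread exceeds threshold.

    ensemble_predictions: list with one entry per model; each entry is a
    list holding that model's predicted energy for every candidate structure.
    """
    n_structures = len(ensemble_predictions[0])
    selected = []
    for i in range(n_structures):
        # Gather every model's prediction for structure i.
        committee = [model[i] for model in ensemble_predictions]
        # A large spread means the models disagree, so the structure is a
        # candidate for a new reference calculation.
        if pstdev(committee) > threshold:
            selected.append(i)
    return selected


# Hypothetical energies (eV) from three models for four candidate structures.
predictions = [
    [-1.00, -2.00, -3.00, -4.00],
    [-1.01, -2.30, -3.00, -4.02],
    [-0.99, -1.70, -3.01, -3.50],
]
picked = select_by_disagreement(predictions, threshold=0.1)
print(picked)  # indices of structures to label and add to the training set
```

In a full active learning loop the selected structures would be recomputed with the reference method, appended to the training set, and the ensemble retrained before the next sampling round.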
Affiliation(s)
- Zeyuan Tang
  - Center for Interstellar Catalysis, Department of Physics and Astronomy, Aarhus University, Ny Munkegade 120, Aarhus C 8000, Denmark
- Stefan T Bromley
  - Departament de Ciència de Materials i Química Física and Institut de Química Teòrica i Computacional (IQTCUB), Universitat de Barcelona, c/Martí i Franquès 1-11, 08028 Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Passeig Lluis Companys 23, 08010 Barcelona, Spain
- Bjørk Hammer
  - Center for Interstellar Catalysis, Department of Physics and Astronomy, Aarhus University, Ny Munkegade 120, Aarhus C 8000, Denmark
21
Shepherd S, Tribello GA, Wilkins DM. A fully quantum-mechanical treatment for kaolinite. J Chem Phys 2023; 158:2892274. [PMID: 37220200 DOI: 10.1063/5.0152361] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2023] [Accepted: 05/03/2023] [Indexed: 05/25/2023]
Abstract
Neural network potentials for kaolinite minerals have been fitted to data extracted from density functional theory calculations that were performed using the revPBE + D3 and revPBE + vdW functionals. These potentials have then been used to calculate the static and dynamic properties of the mineral. We show that revPBE + vdW is better at reproducing the static properties. However, revPBE + D3 does a better job of reproducing the experimental IR spectrum. We also consider what happens to these properties when a fully quantum treatment of the nuclei is employed. We find that nuclear quantum effects (NQEs) do not make a substantial difference to the static properties. However, when NQEs are included, the dynamic properties of the material change substantially.
Affiliation(s)
- Sam Shepherd
  - Centre for Quantum Materials and Technologies, School of Mathematics and Physics, Queen's University Belfast, Belfast BT7 1NN, Northern Ireland, United Kingdom
- Gareth A Tribello
  - Centre for Quantum Materials and Technologies, School of Mathematics and Physics, Queen's University Belfast, Belfast BT7 1NN, Northern Ireland, United Kingdom
- David M Wilkins
  - Centre for Quantum Materials and Technologies, School of Mathematics and Physics, Queen's University Belfast, Belfast BT7 1NN, Northern Ireland, United Kingdom
22
Heindel JP, Herman KM, Xantheas SS. Many-Body Effects in Aqueous Systems: Synergies Between Interaction Analysis Techniques and Force Field Development. Annu Rev Phys Chem 2023; 74:337-360. [PMID: 37093659 DOI: 10.1146/annurev-physchem-062422-023532] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 04/25/2023]
Abstract
Interaction analysis techniques, including the many-body expansion (MBE), symmetry-adapted perturbation theory, and energy decomposition analysis, allow for an intuitive understanding of complex molecular interactions. We review these methods by first providing a historical context for the study of many-body interactions and discussing how nonadditivities emerge from Hamiltonians containing strictly pairwise-additive interactions. We then elaborate on the synergy between these interaction analysis techniques and the development of advanced force fields aimed at accurately reproducing the Born-Oppenheimer potential energy surface. In particular, we focus on ab initio-based force fields that aim to explicitly reproduce many-body terms and are fitted to high-level electronic structure results. These force fields generally incorporate many-body effects through (a) parameterization of distributed multipoles, (b) explicit fitting of the MBE, (c) inclusion of many-atom features in a neural network, and (d) coarse-graining of many-body terms into an effective two-body term. We also discuss the emerging use of the MBE to improve the accuracy and speed of ab initio molecular dynamics.
Affiliation(s)
- Joseph P Heindel
  - Department of Chemistry, University of Washington, Seattle, Washington, USA
- Kristina M Herman
  - Department of Chemistry, University of Washington, Seattle, Washington, USA
- Sotiris S Xantheas
  - Department of Chemistry, University of Washington, Seattle, Washington, USA
  - Advanced Computing, Mathematics and Data Division, Pacific Northwest National Laboratory, Richland, Washington, USA
23
Schmitz G, Schnieder B. Adaptive regularized Gaussian process regression for application in the context of hydrogen adsorption on graphene sheets. J Comput Chem 2023; 44:732-744. [PMID: 36382688 DOI: 10.1002/jcc.27035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2022] [Revised: 09/21/2022] [Accepted: 09/28/2022] [Indexed: 11/17/2022]
Abstract
We present a Gaussian process regression (GPR) scheme with an adaptive regularization scheme applied to the QM7 and QM9 test set, several protonated water clusters and specifically to the problem of atomic hydrogen adsorption on graphene sheets. For the last system our goal is to achieve good predictive accuracy with only a few training points. Therefore, we assess for these systems a self-correcting multilayer GPR model, in which the prediction is corrected by a chain of additional GPR models. In our adaptive regularization scheme, we impose no noise on the training data, but use an approach based on the data itself to account for its impurity. The strength of this strategy is that the data points are treated differently based on their importance and that the regularization can still be controlled by a single parameter. We assess how the accuracy of the prediction depends on this parameter. We can show that the new regularization scheme as well as the multilayer approach results in more robust predictors. Furthermore, we demonstrate that the predictor can be in good agreement with the density-functional theory results.
Affiliation(s)
- Gunnar Schmitz
  - Theoretische Chemie, Ruhr-Universität Bochum, Bochum, Germany
24
Zhai Y, Caruso A, Bore SL, Luo Z, Paesani F. A "short blanket" dilemma for a state-of-the-art neural network potential for water: Reproducing experimental properties or the physics of the underlying many-body interactions? J Chem Phys 2023; 158:084111. [PMID: 36859071 DOI: 10.1063/5.0142843] [Citation(s) in RCA: 33] [Impact Index Per Article: 16.5] [Indexed: 02/05/2023]
Abstract
Deep neural network (DNN) potentials have recently gained popularity in computer simulations of a wide range of molecular systems, from liquids to materials. In this study, we explore the possibility of combining the computational efficiency of the DeePMD framework and the demonstrated accuracy of the MB-pol data-driven, many-body potential to train a DNN potential for large-scale simulations of water across its phase diagram. We find that the DNN potential is able to reliably reproduce the MB-pol results for liquid water, but provides a less accurate description of the vapor-liquid equilibrium properties. This shortcoming is traced back to the inability of the DNN potential to correctly represent many-body interactions. An attempt to explicitly include information about many-body effects results in a new DNN potential that exhibits the opposite performance, being able to correctly reproduce the MB-pol vapor-liquid equilibrium properties, but losing accuracy in the description of the liquid properties. These results suggest that DeePMD-based DNN potentials are not able to correctly "learn" and, consequently, represent many-body interactions, which implies that DNN potentials may have limited ability to predict the properties for state points that are not explicitly included in the training process. The computational efficiency of the DeePMD framework can still be exploited to train DNN potentials on data-driven many-body potentials, which can thus enable large-scale, "chemically accurate" simulations of various molecular systems, with the caveat that the target state points must have been adequately sampled by the reference data-driven many-body potential in order to guarantee a faithful representation of the associated properties.
Affiliation(s)
- Yaoguang Zhai
  - Department of Computer Science and Engineering, University of California San Diego, La Jolla, California 92093, USA
- Alessandro Caruso
  - Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, USA
- Sigbjørn Løland Bore
  - Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, USA
- Zhishang Luo
  - Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, USA
- Francesco Paesani
  - Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, USA
25
Mendive-Tapia D, Meyer HD, Vendrell O. Optimal Mode Combination in the Multiconfiguration Time-Dependent Hartree Method through Multivariate Statistics: Factor Analysis and Hierarchical Clustering. J Chem Theory Comput 2023; 19:1144-1156. [PMID: 36716214 DOI: 10.1021/acs.jctc.2c01089] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 01/31/2023]
Abstract
The multiconfiguration time-dependent Hartree (MCTDH) method and its multilayer extension (ML-MCTDH) are powerful algorithms for the efficient computation of nuclear quantum dynamics in high-dimensional systems. By providing time-dependent variational orbitals and an optimal choice of layered effective degrees of freedom, one is able to reduce the computational cost to an amenable number of configurations. However, choices related to properly selecting the mode grouping and tensor tree are strongly system dependent and, thus far, subjectively based on intuition and/or experience. Therefore, herein we detail a new protocol based on multivariate statistics (more specifically, factor analysis and hierarchical clustering) for reliable and convenient guidance in the optimal design of such complex "system-of-systems" tensor-network decompositions. The advantages of employing the new algorithm and its applicability are tested on water and two floppy protonated water clusters with large amplitude motions.
Affiliation(s)
- David Mendive-Tapia
  - Theoretische Chemie, Universität Heidelberg, Im Neuenheimer Feld 229, D-69120 Heidelberg, Germany
- Hans-Dieter Meyer
  - Theoretische Chemie, Universität Heidelberg, Im Neuenheimer Feld 229, D-69120 Heidelberg, Germany
- Oriol Vendrell
  - Theoretische Chemie, Universität Heidelberg, Im Neuenheimer Feld 229, D-69120 Heidelberg, Germany
26
Davies JA, Schran C, Brieuc F, Marx D, Ellis AM. Onset of Rotational Decoupling for a Molecular Ion Solvated in Helium: From Tags to Rings and Shells. Phys Rev Lett 2023; 130:083001. [PMID: 36898117 DOI: 10.1103/physrevlett.130.083001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2022] [Revised: 12/05/2022] [Accepted: 01/18/2023] [Indexed: 06/18/2023]
Abstract
Little is known about how rotating molecular ions interact with multiple ⁴He atoms and how this relates to microscopic superfluidity. Here, we use infrared spectroscopy to investigate ⁴He_N⋯H₃O⁺ complexes and find that H₃O⁺ undergoes dramatic changes in rotational behavior as ⁴He atoms are added. We present evidence of clear rotational decoupling of the ion core from the surrounding helium for N > 3, with sudden changes in rotational constants at N = 6 and 12. In sharp contrast to studies on small neutral molecules microsolvated in helium, accompanying path integral simulations show that an incipient superfluid effect is not needed to account for these findings.
Affiliation(s)
- Julia A Davies
  - School of Chemistry, University of Leicester, University Road, Leicester, LE1 7RH, United Kingdom
- Christoph Schran
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, 44780 Bochum, Germany
- Fabien Brieuc
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, 44780 Bochum, Germany
- Dominik Marx
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, 44780 Bochum, Germany
- Andrew M Ellis
  - School of Chemistry, University of Leicester, University Road, Leicester, LE1 7RH, United Kingdom
27
Zaverkin V, Holzmüller D, Bonfirraro L, Kästner J. Transfer learning for chemically accurate interatomic neural network potentials. Phys Chem Chem Phys 2023; 25:5383-5396. [PMID: 36748821 DOI: 10.1039/d2cp05793j] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Indexed: 02/01/2023]
Abstract
Developing machine learning-based interatomic potentials from ab initio electronic structure methods remains a challenging task for computational chemistry and materials science. This work studies the capability of transfer learning, in particular discriminative fine-tuning, for efficiently generating chemically accurate interatomic neural network potentials on organic molecules from the MD17 and ANI data sets. We show that pre-training the network parameters on data obtained from density functional calculations considerably improves the sample efficiency of models trained on more accurate ab initio data. Additionally, we show that fine-tuning with energy labels alone can suffice to obtain accurate atomic forces and run large-scale atomistic simulations, provided a well-designed fine-tuning data set. We also investigate possible limitations of transfer learning, especially regarding the design and size of the pre-training and fine-tuning data sets. Finally, we provide GM-NN potentials pre-trained and fine-tuned on the ANI-1x and ANI-1ccx data sets, which can easily be fine-tuned on and applied to organic molecules.
Affiliation(s)
- Viktor Zaverkin
  - Faculty of Chemistry, Institute for Theoretical Chemistry, University of Stuttgart, Germany.
- David Holzmüller
  - Faculty of Mathematics and Physics, Institute for Stochastics and Applications, University of Stuttgart, Germany.
- Luca Bonfirraro
  - Faculty of Chemistry, Institute for Theoretical Chemistry, University of Stuttgart, Germany.
- Johannes Kästner
  - Faculty of Chemistry, Institute for Theoretical Chemistry, University of Stuttgart, Germany.

28
Käser S, Vazquez-Salazar LI, Meuwly M, Töpfer K. Neural network potentials for chemistry: concepts, applications and prospects. DIGITAL DISCOVERY 2023; 2:28-58. [PMID: 36798879 PMCID: PMC9923808 DOI: 10.1039/d2dd00102k] [Citation(s) in RCA: 28] [Impact Index Per Article: 14.0] [Received: 09/23/2022] [Accepted: 12/20/2022] [Indexed: 12/24/2022]
Abstract
Artificial Neural Networks (NN) are already heavily involved in methods and applications for frequent tasks in the field of computational chemistry such as representation of potential energy surfaces (PES) and spectroscopic predictions. This perspective provides an overview of the foundations of neural network-based full-dimensional potential energy surfaces, their architectures, underlying concepts, their representation and applications to chemical systems. Methods for data generation and training procedures for PES construction are discussed and means for error assessment and refinement through transfer learning are presented. A selection of recent results illustrates the latest improvements regarding accuracy of PES representations and system size limitations in dynamics simulations, but also NN application enabling direct prediction of physical results without dynamics simulations. The aim is to provide an overview for the current state-of-the-art NN approaches in computational chemistry and also to point out the current challenges in enhancing reliability and applicability of NN methods on a larger scale.
Affiliation(s)
- Silvan Käser
  - Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
- Markus Meuwly
  - Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
- Kai Töpfer
  - Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland

29
Schienbein P. Spectroscopy from Machine Learning by Accurately Representing the Atomic Polar Tensor. J Chem Theory Comput 2023; 19:705-712. [PMID: 36695707 PMCID: PMC9933433 DOI: 10.1021/acs.jctc.2c00788] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Indexed: 01/26/2023]
Abstract
Vibrational spectroscopy is a key technique to elucidate microscopic structure and dynamics. Without the aid of theoretical approaches, it is, however, often difficult to understand such spectra at a microscopic level. Ab initio molecular dynamics has repeatedly proved to be suitable for this purpose; however, the computational cost can be daunting. Here, the E(3)-equivariant neural network e3nn is used to fit the atomic polar tensor of liquid water a posteriori on top of existing molecular dynamics simulations. Notably, the introduced methodology is general and thus transferable to any other system as well. The target property is most fundamental and gives access to the IR spectrum, and more importantly, it is a highly powerful tool to directly assign IR spectral features to nuclear motion, a connection which has been pursued in the past but only using severe approximations due to the prohibitive computational cost. The herein introduced methodology overcomes this bottleneck. To benchmark the machine learning model, the IR spectrum of liquid water is calculated, indeed showing excellent agreement with the explicit reference calculation. In conclusion, the presented methodology gives a new route to calculate accurate IR spectra from molecular dynamics simulations and will facilitate the understanding of such spectra on a microscopic level.

30
Ryczko K, Krogel JT, Tamblyn I. Machine Learning Diffusion Monte Carlo Energies. J Chem Theory Comput 2022; 18:7695-7701. [PMID: 36317712 DOI: 10.1021/acs.jctc.2c00483] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 11/06/2022]
Abstract
We present two machine learning methodologies that are capable of predicting diffusion Monte Carlo (DMC) energies with small data sets (≈60 DMC calculations in total). The first uses voxel deep neural networks (VDNNs) to predict DMC energy densities using Kohn-Sham density functional theory (DFT) electron densities as input. The second uses kernel ridge regression (KRR) to predict atomic contributions to the DMC total energy using atomic environment vectors as input (we used atom-centered symmetry functions, atomic environment vectors from the ANI models, and smooth overlap of atomic positions). We first compare the methodologies on pristine graphene lattices, where we find that the KRR methodology performs best in comparison to gradient boosted decision trees, random forest, Gaussian process regression, and multilayer perceptrons. In addition, KRR outperforms VDNNs by an order of magnitude. Afterward, we study the generalizability of KRR to predict the energy barrier associated with a Stone-Wales defect. Lastly, we move from 2D to 3D materials and use KRR to predict total energies of liquid water. In all cases, we find that the KRR models are more accurate than Kohn-Sham DFT and all mean absolute errors are less than chemical accuracy.
Affiliation(s)
- Kevin Ryczko
  - Good Chemistry Company, Vancouver, British Columbia V6E 4B1, Canada
- Jaron T Krogel
  - Materials Science and Technology Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831, United States
- Isaac Tamblyn
  - Department of Physics, University of Ottawa, Ottawa, Ontario K1N 6N5, Canada
  - Vector Institute for Artificial Intelligence, Toronto, Ontario M5G 1M1, Canada

31
Daru J, Forbert H, Behler J, Marx D. Coupled Cluster Molecular Dynamics of Condensed Phase Systems Enabled by Machine Learning Potentials: Liquid Water Benchmark. PHYSICAL REVIEW LETTERS 2022; 129:226001. [PMID: 36493459 DOI: 10.1103/physrevlett.129.226001] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Received: 05/17/2022] [Revised: 09/05/2022] [Accepted: 10/05/2022] [Indexed: 06/17/2023]
Abstract
Coupled cluster theory is a general and systematic electronic structure method, but in particular the highly accurate "gold standard" coupled cluster singles, doubles and perturbative triples, CCSD(T), can only be applied to small systems. To overcome this limitation, we introduce a framework to transfer CCSD(T) accuracy of finite molecular clusters to extended condensed phase systems using a high-dimensional neural network potential. This approach, which is automated, allows one to perform high-quality coupled cluster molecular dynamics, CCMD, as we demonstrate for liquid water including nuclear quantum effects. The machine learning strategy is very efficient, generic, can be systematically improved, and is applicable to a variety of complex systems.
Affiliation(s)
- János Daru
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, 44780 Bochum, Germany
- Harald Forbert
  - Center for Solvation Science ZEMOS, Ruhr-Universität Bochum, 44780 Bochum, Germany
- Jörg Behler
  - Universität Göttingen, Institut für Physikalische Chemie, Theoretische Chemie, Tammannstrasse 6, 37077 Göttingen, Germany
- Dominik Marx
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, 44780 Bochum, Germany

32
Larsson HR, Schröder M, Beckmann R, Brieuc F, Schran C, Marx D, Vendrell O. State-resolved infrared spectrum of the protonated water dimer: revisiting the characteristic proton transfer doublet peak. Chem Sci 2022; 13:11119-11125. [PMID: 36320484 PMCID: PMC9517273 DOI: 10.1039/d2sc03189b] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 06/07/2022] [Accepted: 08/26/2022] [Indexed: 11/29/2023] Open
Abstract
The infrared (IR) spectra of protonated water clusters encode precise information on the dynamics and structure of the hydrated proton. However, the strong anharmonic coupling and quantum effects of these elusive species remain puzzling up to the present day. Here, we report unequivocal evidence that the interplay between the proton transfer and the water wagging motions in the protonated water dimer (Zundel ion) giving rise to the characteristic doublet peak is both more complex and more sensitive to subtle energetic changes than previously thought. In particular, hitherto overlooked low-intensity satellite peaks in the experimental spectrum are now unveiled and mechanistically assigned. Our findings rely on the comparison of IR spectra obtained using two highly accurate potential energy surfaces in conjunction with highly accurate state-resolved quantum simulations. We demonstrate that these high-accuracy simulations are important for providing definite assignments of the complex IR signals of fluxional molecules.
Affiliation(s)
- Henrik R Larsson
  - Department of Chemistry and Biochemistry, University of California, Merced, CA 95343, USA
  - Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA 91125, USA
- Markus Schröder
  - Theoretische Chemie, Physikalisch-Chemisches Institut, Universität Heidelberg, Im Neuenheimer Feld 229, D-69120 Heidelberg, Germany
- Richard Beckmann
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, 44780 Bochum, Germany
- Fabien Brieuc
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, 44780 Bochum, Germany
- Christoph Schran
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, 44780 Bochum, Germany
- Dominik Marx
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, 44780 Bochum, Germany
- Oriol Vendrell
  - Theoretische Chemie, Physikalisch-Chemisches Institut, Universität Heidelberg, Im Neuenheimer Feld 229, D-69120 Heidelberg, Germany

33
Kiss O, Tacchino F, Vallecorsa S, Tavernelli I. Quantum neural networks force fields generation. MACHINE LEARNING: SCIENCE AND TECHNOLOGY 2022. [DOI: 10.1088/2632-2153/ac7d3c] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/11/2022] Open
Abstract
Accurate molecular force fields are of paramount importance for the efficient implementation of molecular dynamics techniques at large scales. In the last decade, machine learning (ML) methods have demonstrated impressive performances in predicting accurate values for energy and forces when trained on finite size ensembles generated with ab initio techniques. At the same time, quantum computers have recently started to offer new viable computational paradigms to tackle such problems. On the one hand, quantum algorithms may notably be used to extend the reach of electronic structure calculations. On the other hand, quantum ML is also emerging as an alternative and promising path to quantum advantage. Here we follow this second route and establish a direct connection between classical and quantum solutions for learning neural network (NN) potentials. To this end, we design a quantum NN architecture and apply it successfully to different molecules of growing complexity. The quantum models exhibit larger effective dimension with respect to classical counterparts and can reach competitive performances, thus pointing towards potential quantum advantages in natural science applications via quantum ML.

34
Abstract
Machine-learning force fields have become increasingly popular because of their balance of accuracy and speed. However, a significant limitation is the use of element-specific features, leading to poor scalability with the number of elements. This work introduces the Gaussian multipole (GMP) featurization scheme that utilizes physically relevant multipole expansions of the electron density around atoms to yield feature vectors that interpolate between element types and have a fixed dimension regardless of the number of elements present. We combine GMP with neural networks and apply these models to the MD17 and QM9 data sets, revealing high computational efficiency, systematically improvable accuracy, and the ability to make reasonable predictions on elements not included in the training set. Finally, we test GMP-based models for the OCP data set, demonstrating comparable performance to graph-convolutional models. The results indicate that this featurization scheme fills a critical gap in the construction of efficient and transferable machine-learned force fields.
Affiliation(s)
- Xiangyun Lei
  - School of Chemical & Biomolecular Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Andrew J Medford
  - School of Chemical & Biomolecular Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States

35
Fan Z, Wang Y, Ying P, Song K, Wang J, Wang Y, Zeng Z, Xu K, Lindgren E, Rahm JM, Gabourie AJ, Liu J, Dong H, Wu J, Chen Y, Zhong Z, Sun J, Erhart P, Su Y, Ala-Nissila T. GPUMD: A package for constructing accurate machine-learned potentials and performing highly efficient atomistic simulations. J Chem Phys 2022; 157:114801. [DOI: 10.1063/5.0106617] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022] Open
Abstract
We present our latest advancements of machine-learned potentials (MLPs) based on the neuroevolution potential (NEP) framework introduced in [Fan et al., Phys. Rev. B 104, 104309 (2021)] and their implementation in the open-source package GPUMD. We increase the accuracy of NEP models both by improving the radial functions in the atomic-environment descriptor using a linear combination of Chebyshev basis functions and by extending the angular descriptor with some four-body and five-body contributions as in the atomic cluster expansion approach. We also detail our efficient implementation of the NEP approach in graphics processing units as well as our workflow for the construction of NEP models, and we demonstrate their application in large-scale atomistic simulations. By comparing to state-of-the-art MLPs, we show that the NEP approach not only achieves above-average accuracy but also is far more computationally efficient. These results demonstrate that the GPUMD package is a promising tool for solving challenging problems requiring highly accurate, large-scale atomistic simulations. To enable the construction of MLPs using a minimal training set, we propose an active-learning scheme based on the latent space of a pre-trained NEP model. Finally, we introduce three separate Python packages, GPYUMD, CALORINE, and PYNEP, which enable the integration of GPUMD into Python workflows.
Affiliation(s)
- Zheyong Fan
  - School of Mathematics and Physics, Bohai University, China
- Penghua Ying
  - School of Science, Harbin Institute of Technology Shenzhen, China
- Keke Song
  - University of Science and Technology Beijing, China
- Ke Xu
  - Xiamen University, China
- Jiahui Liu
  - University of Science and Technology Beijing, China
- Jianyang Wu
  - Department of Physics, Xiamen University, China
- Yue Chen
  - Department of Mechanical Engineering, University of Hong Kong, Hong Kong
- Zheng Zhong
  - Harbin Institute of Technology, Shenzhen, China
- Jian Sun
  - Department of Physics and National Laboratory of Solid State Microstructures, Nanjing University, China
- Yanjing Su
  - Corrosion and Protection Center, Key Laboratory for Environmental Fracture (MOE), University of Science and Technology Beijing, China
- Tapio Ala-Nissila
  - Department of Applied Physics, Aalto University, Finland

36
Beckmann R, Brieuc F, Schran C, Marx D. Infrared Spectra at Coupled Cluster Accuracy from Neural Network Representations. J Chem Theory Comput 2022; 18:5492-5501. [PMID: 35998360 DOI: 10.1021/acs.jctc.2c00511] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Indexed: 12/22/2022]
Abstract
Infrared spectroscopy is key to elucidating molecular structures, monitoring reactions, and observing conformational changes, while providing information on both structural and dynamical properties. This makes the accurate prediction of infrared spectra based on first-principle theories a highly desirable pursuit. Molecular dynamics simulations have proven to be a particularly powerful approach for this task, albeit requiring the computation of energies, forces and dipole moments for a large number of molecular configurations as a function of time. This explains why highly accurate first-principles methods, such as coupled cluster theory, have so far been inapplicable for the prediction of fully anharmonic vibrational spectra of large systems at finite temperatures. Here, we push cutting-edge machine learning techniques forward by using neural network representations of energies, forces, and in particular dipoles to predict such infrared spectra fully at "gold standard" coupled cluster accuracy as demonstrated for protonated water clusters as large as the protonated water hexamer, in its extended Zundel configuration. Furthermore, we show that this methodology can be used beyond the scope of the data considered during the development of the neural network models, allowing for the computation of finite-temperature infrared spectra of large systems inaccessible to explicit coupled cluster calculations. This substantially expands the hitherto existing limits of accuracy, speed, and system size for theoretical spectroscopy and opens up a multitude of avenues for the prediction of vibrational spectra and the understanding of complex intra- and intermolecular couplings.
Affiliation(s)
- Richard Beckmann
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, 44780 Bochum, Germany
- Fabien Brieuc
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, 44780 Bochum, Germany
- Christoph Schran
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, 44780 Bochum, Germany
- Dominik Marx
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, 44780 Bochum, Germany

37
Zhu X, Iyengar SS. Graph Theoretic Molecular Fragmentation for Multidimensional Potential Energy Surfaces Yield an Adaptive and General Transfer Machine Learning Protocol. J Chem Theory Comput 2022; 18:5125-5144. [PMID: 35994592 DOI: 10.1021/acs.jctc.1c01241] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 12/19/2022]
Abstract
Over a series of publications we have introduced a graph-theoretic description for molecular fragmentation. Here, a system is divided into a set of nodes, or vertices, that are then connected through edges, faces, and higher-order simplexes to represent a collection of spatially overlapping and locally interacting subsystems. Each such subsystem is treated at two levels of electronic structure theory, and the result is used to construct many-body expansions that are then embedded within an ONIOM-scheme. These expansions converge rapidly with many-body order (or graphical rank) of subsystems and have been previously used for ab initio molecular dynamics (AIMD) calculations and for computing multidimensional potential energy surfaces. Specifically, in all these cases we have shown that CCSD and MP2 level AIMD trajectories and potential surfaces may be obtained at density functional theory cost. The approach has been demonstrated for gas-phase studies, for condensed phase electronic structure, and also for basis set extrapolation-based AIMD. Recently, this approach has also been used to derive new quantum-computing algorithms that enormously reduce the quantum circuit depth in a circuit-based computation of correlated electronic structure. In this publication, we introduce (a) a family of neural networks that act in parallel to represent, efficiently, the post-Hartree-Fock electronic structure energy contributions for all simplexes (fragments), and (b) a new k-means-based tessellation strategy to glean training data for high-dimensional molecular spaces and minimize the extent of training needed to construct this family of neural networks. The approach is particularly useful when coupled cluster accuracy is desired and when fragment sizes grow in order to capture nonlocal interactions accurately. 
The unique multidimensional k-means tessellation/clustering algorithm used to determine our training data for all fragments is shown to be extremely efficient and reduces the needed training to only 10% of data for all fragments to obtain accurate neural networks for each fragment. These fully connected dense neural networks are then used to extrapolate the potential energy surface for all molecular fragments, and these are then combined as per our graph-theoretic procedure to transfer the learning process to a full system energy for the entire AIMD trajectory at less than one-tenth the cost as compared to a regular fragmentation-based AIMD calculation.
Affiliation(s)
- Xiao Zhu
  - Department of Chemistry and Department of Physics, Indiana University, 800 E. Kirkwood Avenue, Bloomington 47405, Indiana, United States
- Srinivasan S Iyengar
  - Department of Chemistry and Department of Physics, Indiana University, 800 E. Kirkwood Avenue, Bloomington 47405, Indiana, United States

38
Durán Caballero L, Schran C, Brieuc F, Marx D. Neural network interaction potentials for para-hydrogen with flexible molecules. J Chem Phys 2022; 157:074302. [DOI: 10.1063/5.0100953] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022] Open
Abstract
The study of molecular impurities in para-hydrogen (pH2) clusters is key to push forward our understanding of intra- and intermolecular interactions, including their impact on the superfluid response of this bosonic quantum solvent. This includes tagging with only one or very few pH2, the microsolvation regime for intermediate particle numbers, and matrix isolation with many solvent molecules. However, the fundamental coupling between the bosonic pH2 environment and the (ro-)vibrational motion of molecular impurities remains poorly understood. Quantum simulations can, in principle, provide the necessary atomistic insight, but they require very accurate descriptions of the involved interactions. Here, we present a data-driven approach for the generation of impurity⋯pH2 interaction potentials based on machine learning techniques, which retain the full flexibility of the dopant species. We employ the well-established adiabatic hindered rotor (AHR) averaging technique to include the impact of the nuclear spin statistics on the symmetry-allowed rotational quantum numbers of pH2. Embedding this averaging procedure within the high-dimensional neural network potential (NNP) framework enables the generation of highly accurate AHR-averaged NNPs at coupled cluster accuracy, namely, explicitly correlated coupled cluster single, double, and scaled perturbative triples, CCSD(T*)-F12a/aVTZcp, in an automated manner. We apply this methodology to the water and protonated water molecules as representative cases for quasi-rigid and highly flexible molecules, respectively, and obtain AHR-averaged NNPs that reliably describe the corresponding H2O⋯pH2 and H3O+⋯pH2 interactions. Using path integral simulations, we show for the hydronium cation, H3O+, that umbrella-like tunneling inversion has a strong impact on the first and second pH2 microsolvation shells.
The automated and data-driven nature of our protocol opens the door to the study of bosonic pH2 quantum solvation for a wide range of embedded impurities.
Affiliation(s)
- Laura Durán Caballero
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, 44780 Bochum, Germany
- Christoph Schran
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, 44780 Bochum, Germany
- Fabien Brieuc
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, 44780 Bochum, Germany
- Dominik Marx
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, 44780 Bochum, Germany

39
Waters MJ, Rondinelli JM. Benchmarking structural evolution methods for training of machine learned interatomic potentials. JOURNAL OF PHYSICS. CONDENSED MATTER : AN INSTITUTE OF PHYSICS JOURNAL 2022; 34:385901. [PMID: 35797983 DOI: 10.1088/1361-648x/ac7f73] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/27/2022] [Accepted: 07/07/2022] [Indexed: 06/15/2023]
Abstract
When creating training data for machine-learned interatomic potentials (MLIPs), it is common to create initial structures and evolve them using molecular dynamics (MD) to sample a larger configuration space. We benchmark two other modalities of evolving structures, contour exploration (CE) and dimer-method (DM) searches against MD for their ability to produce diverse and robust density functional theory training data sets for MLIPs. We also discuss the generation of initial structures which are either from known structures or from random structures in detail to further formalize the structure-sourcing processes in the future. The polymorph-rich zirconium-oxygen composition space is used as a rigorous benchmark system for comparing the performance of MLIPs trained on structures generated from these structural evolution methods. Using Behler-Parrinello neural networks as our MLIP models, we find that CE and the DM searches are generally superior to MD in terms of spatial descriptor diversity and statistical accuracy.
Affiliation(s)
- Michael J Waters
  - Department of Materials Science and Engineering, Northwestern University, Evanston, IL 60208, United States of America
- James M Rondinelli
  - Department of Materials Science and Engineering, Northwestern University, Evanston, IL 60208, United States of America

40
Schienbein P, Blumberger J. Nanosecond solvation dynamics of the hematite/liquid water interface at hybrid DFT accuracy using committee neural network potentials. Phys Chem Chem Phys 2022; 24:15365-15375. [PMID: 35703465 DOI: 10.1039/d2cp01708c] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]
Abstract
Metal oxide/water interfaces play an important role in biology, catalysis, energy storage and photocatalytic water splitting. The atomistic structure at these interfaces is often difficult to characterize by experimental techniques, whilst results from ab initio molecular dynamics simulations tend to be uncertain due to the limited length and time scales accessible. In this work, we train a committee neural network potential (c-NNP) to simulate the hematite/water interface at the hybrid DFT level of theory to reach the nanosecond timescale and systems containing more than 3000 atoms. The c-NNP enables us to converge dynamical properties, which is not possible with brute-force ab initio molecular dynamics. Our simulations uncover rich solvation dynamics at the hematite/water interface spanning three different time scales: picosecond H-bond dynamics between surface hydroxyls and the first water layer, in-plane/out-of-plane tilt motion of surface hydroxyls on the 10 ps time scale, and diffusion of water molecules from the oxide surface characterized by a mean residence lifetime of about 60 ps. Calculations of vibrational spectra confirm that H-bonds between surface hydroxyls and first layer water molecules are stronger than H-bonds in bulk water. Our study showcases how state-of-the-art machine learning approaches can routinely be utilized to explore the structural dynamics at transition metal oxide interfaces with complex electronic structure. It foreshadows that c-NNPs are a promising tool to tackle the sampling problem in ab initio electrochemistry with explicit solvent molecules.
Affiliation(s)
- Philipp Schienbein
  - Department of Physics and Astronomy and Thomas Young Centre, University College London, London, WC1E 6BT, UK.
- Jochen Blumberger
  - Department of Physics and Astronomy and Thomas Young Centre, University College London, London, WC1E 6BT, UK.

41
Zaverkin V, Holzmüller D, Schuldt R, Kästner J. Predicting properties of periodic systems from cluster data: A case study of liquid water. J Chem Phys 2022; 156:114103. [DOI: 10.1063/5.0078983] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/14/2022] Open
Abstract
The accuracy of the training data limits the accuracy of bulk properties from machine-learned potentials. For example, hybrid functionals or wave-function-based quantum chemical methods are readily available for cluster data but effectively out of scope for periodic structures. We show that local, atom-centered descriptors for machine-learned potentials enable the prediction of bulk properties from cluster model training data, agreeing reasonably well with predictions from bulk training data. We demonstrate such transferability by studying structural and dynamical properties of bulk liquid water with density functional theory and have found an excellent agreement with experimental and theoretical counterparts.
Affiliation(s)
- Viktor Zaverkin
  - Institute for Theoretical Chemistry, University of Stuttgart, Pfaffenwaldring 55, 70569 Stuttgart, Germany
- David Holzmüller
  - Institute for Stochastics and Applications, University of Stuttgart, Pfaffenwaldring 57, 70569 Stuttgart, Germany
- Robin Schuldt
  - Institute for Theoretical Chemistry, University of Stuttgart, Pfaffenwaldring 55, 70569 Stuttgart, Germany
- Johannes Kästner
  - Institute for Theoretical Chemistry, University of Stuttgart, Pfaffenwaldring 55, 70569 Stuttgart, Germany

42
Gokcan H, Isayev O. Learning molecular potentials with neural networks. WIREs Computational Molecular Science 2022. [DOI: 10.1002/wcms.1564]
Affiliation(s)
- Hatice Gokcan
  - Department of Chemistry, Mellon College of Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA
- Olexandr Isayev
  - Department of Chemistry, Mellon College of Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA

43
Gupta AK, Raghavachari K. Three-Dimensional Convolutional Neural Networks Utilizing Molecular Topological Features for Accurate Atomization Energy Predictions. J Chem Theory Comput 2022; 18:2132-2143. [PMID: 35226496 DOI: 10.1021/acs.jctc.1c00504]
Abstract
Deep learning methods provide a novel way to establish a correlation between two quantities. In this context, computer vision techniques such as three-dimensional (3D)-convolutional neural networks become a natural choice to associate a molecular property with its structure due to the inherent 3D nature of a molecule. However, traditional 3D input data structures are intrinsically sparse, which tends to induce instabilities during the learning process, which in turn may lead to underfitted results. To address this deficiency, in this project, we propose to use quantum-chemically derived molecular topological features, namely, localized orbital locator and electron localization function, as molecular descriptors, which provide a relatively denser input representation in a 3D space. Such topological features provide a detailed picture of the atomic and electronic configuration and interatomic interactions in the molecule and hence are ideal for predicting properties that are highly dependent on the physical or electronic structure of the molecule. Herein, we demonstrate the efficacy of our proposed model by applying it to the task of predicting atomization energies for the QM9-G4MP2 data set, which contains ∼134k molecules. Furthermore, we incorporated the Δ-machine learning approach into our model, which enabled us to reach beyond benchmark accuracy levels (∼1.0 kJ mol⁻¹). As a result, we consistently obtain impressive mean absolute errors of the order 0.1 kcal mol⁻¹ (∼0.42 kJ mol⁻¹) versus the G4(MP2) theory using relatively modest models, which could potentially be improved further in a systematic manner using additional compute resources.
Affiliation(s)
- Ankur Kumar Gupta
  - Department of Chemistry, Indiana University, Bloomington, Indiana 47405, United States
- Krishnan Raghavachari
  - Department of Chemistry, Indiana University, Bloomington, Indiana 47405, United States

44
Abstract
In the past two decades, machine learning potentials (MLPs) have reached a level of maturity that now enables applications to large-scale atomistic simulations of a wide range of systems in chemistry, physics, and materials science. Different machine learning algorithms have been used with great success in the construction of these MLPs. In this review, we discuss an important group of MLPs relying on artificial neural networks to establish a mapping from the atomic structure to the potential energy. In spite of this common feature, there are important conceptual differences among MLPs, which concern the dimensionality of the systems, the inclusion of long-range electrostatic interactions, global phenomena like nonlocal charge transfer, and the type of descriptor used to represent the atomic structure, which can be either predefined or learnable. A concise overview is given along with a discussion of the open challenges in the field. Expected final online publication date for the Annual Review of Physical Chemistry, Volume 73 is April 2022. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
Affiliation(s)
- Emir Kocer
  - Institut für Physikalische Chemie, Theoretische Chemie, Universität Göttingen, Göttingen, Germany
- Tsz Wai Ko
  - Institut für Physikalische Chemie, Theoretische Chemie, Universität Göttingen, Göttingen, Germany
- Jörg Behler
  - Institut für Physikalische Chemie, Theoretische Chemie, Universität Göttingen, Göttingen, Germany

45
Shao Y, Dietrich FM, Nettelblad C, Zhang C. Training algorithm matters for the performance of neural network potential: A case study of Adam and the Kalman filter optimizers. J Chem Phys 2021; 155:204108. [PMID: 34852491 DOI: 10.1063/5.0070931] [Open access]
Abstract
One hidden yet important issue for developing neural network potentials (NNPs) is the choice of training algorithm. In this article, we compare the performance of two popular training algorithms, the adaptive moment estimation algorithm (Adam) and the extended Kalman filter algorithm (EKF), using the Behler-Parrinello neural network and two publicly accessible datasets of liquid water [Morawietz et al., Proc. Natl. Acad. Sci. U. S. A. 113, 8368-8373, (2016) and Cheng et al., Proc. Natl. Acad. Sci. U. S. A. 116, 1110-1115, (2019)]. This is achieved by implementing EKF in TensorFlow. It is found that NNPs trained with EKF are more transferable and less sensitive to the value of the learning rate, as compared to Adam. In both cases, error metrics of the validation set do not always serve as a good indicator for the actual performance of NNPs. Instead, we show that their performance correlates well with a Fisher information based similarity measure.
Affiliation(s)
- Yunqi Shao
  - Department of Chemistry-Ångström Laboratory, Uppsala University, Lägerhyddsvägen 1, P.O. Box 538, 75121 Uppsala, Sweden
- Florian M Dietrich
  - Department of Chemistry-Ångström Laboratory, Uppsala University, Lägerhyddsvägen 1, P.O. Box 538, 75121 Uppsala, Sweden
- Carl Nettelblad
  - Division of Scientific Computing, Department of Information Technology, SciLifeLab, Uppsala University, Lägerhyddsvägen 2, P.O. Box 337, 75105 Uppsala, Sweden
- Chao Zhang
  - Department of Chemistry-Ångström Laboratory, Uppsala University, Lägerhyddsvägen 1, P.O. Box 538, 75121 Uppsala, Sweden

46
Gaigeot MP. Some opinions on MD-based vibrational spectroscopy of gas phase molecules and their assembly: An overview of what has been achieved and where to go. Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy 2021; 260:119864. [PMID: 34052762 DOI: 10.1016/j.saa.2021.119864] [Received: 02/14/2021] [Revised: 04/13/2021] [Accepted: 04/18/2021]
Abstract
We hereby review molecular dynamics simulations for anharmonic gas phase spectroscopy and provide some of our opinions of where the field is heading. With these new directions, the theoretical IR/Raman spectroscopy of large (bio)-molecular systems will be more easily achievable over longer time-scale MD trajectories, increasing the accuracy of the MD-IR and MD-Raman calculated spectra. With the new directions presented here, the high-throughput 'decoding' of experimental IR/Raman spectra into 3D structures should thus be possible, hence advancing, e.g., the field of MS-IR for structural characterization by spectroscopy. We also review the assignment of vibrational spectra in terms of anharmonic molecular modes from the MD trajectories, and especially introduce our recent developments based on graph theory algorithms. Graph theory algorithms are also introduced in this review for the identification of the molecular 3D structures sampled over MD trajectories.
Affiliation(s)
- Marie-Pierre Gaigeot
  - Université Paris-Saclay, Univ Evry, CNRS, LAMBE UMR8587, 91025 Evry-Courcouronnes, France.

47
Saleh Y, Sanjay V, Iske A, Yachmenev A, Küpper J. Active learning of potential-energy surfaces of weakly bound complexes with regression-tree ensembles. J Chem Phys 2021; 155:144109. [PMID: 34654290 DOI: 10.1063/5.0057051] [Open access]
Abstract
Several pool-based active learning (AL) algorithms were employed to model potential-energy surfaces (PESs) with a minimum number of electronic structure calculations. Theoretical and empirical results suggest that superior strategies can be obtained by sampling molecular structures corresponding to large uncertainties in their predictions while at the same time not deviating much from the true distribution of the data. To model PESs in an AL framework, we propose to use a regression version of stochastic query by forest, a hybrid method that samples points corresponding to large uncertainties while avoiding collecting too many points from sparse regions of space. The algorithm is implemented with decision trees that come with relatively small computational costs. We empirically show that this algorithm requires around half the data to converge to the same accuracy in comparison to the uncertainty-based query-by-committee algorithm. Moreover, the algorithm is fully automatic and does not require any prior knowledge of the PES. Simulations on a 6D PES of pyrrole(H₂O) show that <15 000 configurations are enough to build a PES with a generalization error of 16 cm⁻¹, whereas the final model with around 50 000 configurations has a generalization error of 11 cm⁻¹.
Affiliation(s)
- Yahya Saleh
  - Center for Free-Electron Laser Science CFEL, Deutsches Elektronen-Synchrotron DESY, Notkestraße 85, 22607 Hamburg, Germany
- Vishnu Sanjay
  - Center for Free-Electron Laser Science CFEL, Deutsches Elektronen-Synchrotron DESY, Notkestraße 85, 22607 Hamburg, Germany
- Armin Iske
  - Department of Mathematics, Universität Hamburg, Bundesstraße 55, 20146 Hamburg, Germany
- Andrey Yachmenev
  - Center for Free-Electron Laser Science CFEL, Deutsches Elektronen-Synchrotron DESY, Notkestraße 85, 22607 Hamburg, Germany
- Jochen Küpper
  - Center for Free-Electron Laser Science CFEL, Deutsches Elektronen-Synchrotron DESY, Notkestraße 85, 22607 Hamburg, Germany

48
Bull-Vulpe EF, Riera M, Götz AW, Paesani F. MB-Fit: Software infrastructure for data-driven many-body potential energy functions. J Chem Phys 2021; 155:124801. [PMID: 34598567 DOI: 10.1063/5.0063198] [Open access]
Abstract
Many-body potential energy functions (MB-PEFs), which integrate data-driven representations of many-body short-range quantum mechanical interactions with physics-based representations of many-body polarization and long-range interactions, have recently been shown to provide high accuracy in the description of molecular interactions from the gas to the condensed phase. Here, we present MB-Fit, a software infrastructure for the automated development of MB-PEFs for generic molecules within the TTM-nrg (Thole-type model energy) and MB-nrg (many-body energy) theoretical frameworks. Besides providing all the necessary computational tools for generating TTM-nrg and MB-nrg PEFs, MB-Fit provides a seamless interface with the MBX software, a many-body energy and force calculator for computer simulations. Given the demonstrated accuracy of the MB-PEFs, particularly within the MB-nrg framework, we believe that MB-Fit will enable routine predictive computer simulations of generic (small) molecules in the gas, liquid, and solid phases, including, but not limited to, the modeling of quantum isomeric equilibria in molecular clusters, solvation processes, molecular crystals, and phase diagrams.
Affiliation(s)
- Ethan F Bull-Vulpe
  - Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, USA
- Marc Riera
  - Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, USA
- Andreas W Götz
  - San Diego Supercomputer Center, University of California San Diego, La Jolla, California 92093, USA
- Francesco Paesani
  - Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, USA

49
Machine learning potentials for complex aqueous systems made simple. Proc Natl Acad Sci U S A 2021; 118:2110077118. [PMID: 34518232 PMCID: PMC8463804 DOI: 10.1073/pnas.2110077118] [Accepted: 07/27/2021] [Open access]
Abstract
Understanding complex materials, in particular those with solid–liquid interfaces, such as water on surfaces or under confinement, is a key challenge for technological and scientific progress. Although established simulation approaches have been able to provide important atomistic insight, ab initio techniques struggle with the required time and length scales, while force field methods can often be limited in terms of their accuracy. Here we show how these limitations can be overcome in a simple and automated machine learning procedure to provide accurate models of interactions at the ab initio level, as illustrated for a variety of complex aqueous systems. These developments open up the prospect of the straightforward exploration of many technologically relevant systems by molecular simulations. Simulation techniques based on accurate and efficient representations of potential energy surfaces are urgently needed for the understanding of complex systems such as solid–liquid interfaces. Here we present a machine learning framework that enables the efficient development and validation of models for complex aqueous systems. Instead of trying to deliver a globally optimal machine learning potential, we propose to develop models applicable to specific thermodynamic state points in a simple and user-friendly process. After an initial ab initio simulation, a machine learning potential is constructed with minimum human effort through a data-driven active learning protocol. Such models can afterward be applied in exhaustive simulations to provide reliable answers for the scientific question at hand or to systematically explore the thermal performance of ab initio methods. We showcase this methodology on a diverse set of aqueous systems comprising bulk water with different ions in solution, water on a titanium dioxide surface, and water confined in nanotubes and between molybdenum disulfide sheets. 
Highlighting the accuracy of our approach with respect to the underlying ab initio reference, the resulting models are evaluated in detail with an automated validation protocol that includes structural and dynamical properties and the precision of the force prediction of the models. Finally, we demonstrate the capabilities of our approach for the description of water on the rutile titanium dioxide (110) surface to analyze the structure and mobility of water on this surface. Such machine learning models provide a straightforward and uncomplicated but accurate extension of simulation time and length scales for complex systems.

50
Morrow Z, Kwon HY, Kelley CT, Jakubikova E. Efficient Approximation of Potential Energy Surfaces with Mixed-Basis Interpolation. J Chem Theory Comput 2021; 17:5673-5683. [PMID: 34351740 DOI: 10.1021/acs.jctc.1c00569]
Abstract
The potential energy surface (PES) describes the energy of a chemical system as a function of its geometry and is a fundamental concept in modern chemistry. A PES provides much useful information about the system, including the structures and energies of various stationary points, such as stable conformers (local minima) and transition states (first-order saddle points) connected by a minimum-energy path. Our group has previously produced surrogate reduced-dimensional PESs using sparse interpolation along chemically significant reaction coordinates, such as bond lengths, bond angles, and torsion angles. These surrogates used a single interpolation basis, either polynomials or trigonometric functions, in every dimension. However, relevant molecular dynamics (MD) simulations often involve some combination of both periodic and nonperiodic coordinates. Using a trigonometric basis on nonperiodic coordinates, such as bond lengths, leads to inaccuracies near the domain boundary. Conversely, polynomial interpolation on the periodic coordinates does not enforce the periodicity of the surrogate PES gradient, leading to nonconservation of total energy even in a microcanonical ensemble. In this work, we present an interpolation method that uses trigonometric interpolation on the periodic reaction coordinates and polynomial interpolation on the nonperiodic coordinates. We apply this method to MD simulations of possible isomerization pathways of azomethane between cis and trans conformers. This method is the only known interpolative method that appropriately conserves total energy in systems with both periodic and nonperiodic reaction coordinates. In addition, compared to all-polynomial interpolation, the mixed basis requires fewer electronic structure calculations to obtain a given level of accuracy, is an order of magnitude faster, and is freely available on GitHub.
Affiliation(s)
- Zachary Morrow
  - Department of Mathematics, North Carolina State University, Raleigh, North Carolina 27695, United States
- Hyuk-Yong Kwon
  - Department of Chemistry, North Carolina State University, Raleigh, North Carolina 27695, United States
- C T Kelley
  - Department of Mathematics, North Carolina State University, Raleigh, North Carolina 27695, United States
- Elena Jakubikova
  - Department of Chemistry, North Carolina State University, Raleigh, North Carolina 27695, United States