1
Kelly J, Hu F, Damiani A, Chen MS, Snider A, Son M, Lee A, Gupta P, Montoya-Castillo A, Zuehlsdorff TJ, Schlau-Cohen GS, Isborn CM, Markland TE. Two-Dimensional Electronic Spectroscopy in the Condensed Phase Using Equivariant Transformer Accelerated Molecular Dynamics Simulations. J Phys Chem Lett 2025:5561-5569. [PMID: 40434198 DOI: 10.1021/acs.jpclett.5c00911] [Indexed: 05/29/2025]
Abstract
Two-dimensional electronic spectroscopy (2DES) provides rich information about how the electronic states of molecules, proteins, and solid-state materials interact with each other and their surrounding environment. Atomistic molecular dynamics simulations offer an appealing route to uncover how nuclear motions mediate electronic energy relaxation and their manifestation in electronic spectroscopies but are computationally expensive. Here we show that by using an equivariant transformer-based machine learning architecture trained with only 2500 ground state and 100 excited state electronic structure calculations, one can construct accurate machine-learned potential energy surfaces for both the ground-state electronic surface and excited-state energy gap. We demonstrate the utility of this approach for simulating the dynamics of Nile blue in ethanol, where we experimentally validate and decompose the simulated 2DES to establish the nuclear motions of the chromophore and the solvent that couple to the excited state, connecting the spectroscopic signals to their molecular origin.
Affiliation(s)
- Joseph Kelly
  - Department of Chemistry, Stanford University, Stanford, California 94305, United States
- Frank Hu
  - Department of Chemistry, Stanford University, Stanford, California 94305, United States
- Arianna Damiani
  - Department of Chemistry, Stanford University, Stanford, California 94305, United States
- Michael S Chen
  - Simons Center for Computational Physical Chemistry, Department of Chemistry, New York University, New York, New York 10003, United States
- Andrew Snider
  - Department of Chemistry and Biochemistry, University of California Merced, Merced, California 95343, United States
- Minjung Son
  - Department of Chemistry, Boston University, Boston, Massachusetts 02215, United States
- Angela Lee
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Prachi Gupta
  - Department of Chemistry and Biochemistry, University of California Merced, Merced, California 95343, United States
- Andrés Montoya-Castillo
  - Department of Chemistry, University of Colorado, Boulder, Boulder, Colorado 80309, United States
- Tim J Zuehlsdorff
  - Department of Chemistry, Oregon State University, Corvallis, Oregon 97331, United States
- Gabriela S Schlau-Cohen
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Christine M Isborn
  - Department of Chemistry and Biochemistry, University of California Merced, Merced, California 95343, United States
- Thomas E Markland
  - Department of Chemistry, Stanford University, Stanford, California 94305, United States
2
Tokita AM, Devergne T, Saitta AM, Behler J. Free energy profiles for chemical reactions in solution from high-dimensional neural network potentials: The case of the Strecker synthesis. J Chem Phys 2025; 162:174120. [PMID: 40326597 DOI: 10.1063/5.0268948] [Received: 03/05/2025] [Accepted: 04/14/2025] [Indexed: 05/07/2025] Open
Abstract
Machine learning potentials (MLPs) have become a popular tool in chemistry and materials science as they combine the accuracy of electronic structure calculations with the high computational efficiency of analytic potentials. MLPs are particularly useful for computationally demanding simulations such as the determination of free energy profiles governing chemical reactions in solution, but to date, such applications are still rare. In this work, we show how umbrella sampling simulations can be combined with active learning of high-dimensional neural network potentials (HDNNPs) to construct free energy profiles in a systematic way. For the example of the first step of Strecker synthesis of glycine in aqueous solution, we provide a detailed analysis of the improving quality of HDNNPs for datasets of increasing size. We find that, in addition to the typical quantification of energy and force errors with respect to the underlying density functional theory data, the long-term stability of the simulations and the convergence of physical properties should be rigorously monitored to obtain reliable and converged free energy profiles of chemical reactions in solution.
Affiliation(s)
- Alea Miako Tokita
  - Lehrstuhl für Theoretische Chemie II, Ruhr-Universität Bochum, 44780 Bochum, Germany
  - Research Center Chemical Sciences and Sustainability, Research Alliance Ruhr, 44780 Bochum, Germany
- Timothée Devergne
  - Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, F-75005 Paris, France
  - Atomistic Simulations, Italian Institute of Technology, Genova, Italy and Computational Statistics and Machine Learning, Italian Institute of Technology, Genova, Italy
- A Marco Saitta
  - Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, F-75005 Paris, France
- Jörg Behler
  - Lehrstuhl für Theoretische Chemie II, Ruhr-Universität Bochum, 44780 Bochum, Germany
  - Research Center Chemical Sciences and Sustainability, Research Alliance Ruhr, 44780 Bochum, Germany
3
Ng WP, Zhang Z, Yang J. Accurate Neural Network Fine-Tuning Approach for Transferable Ab Initio Energy Prediction across Varying Molecular and Crystalline Scales. J Chem Theory Comput 2025; 21:1602-1614. [PMID: 39902570 DOI: 10.1021/acs.jctc.4c01261] [Indexed: 02/05/2025]
Abstract
Existing machine learning models attempt to predict the energies of large molecules by training on small molecules, but eventually fail to retain high accuracy as the errors increase with system size. Through an orbital pairwise decomposition of the correlation energy, a neural network model pretrained on hundred-scale data containing small molecules is demonstrated to be sufficiently transferable for accurately predicting large systems, including molecules and crystals. Our model introduces a residual connection to explicitly learn the pairwise energy corrections, and employs various low-rank retraining techniques to modestly adjust the learned network parameters. We demonstrate that by retraining the base model, originally trained only on small molecules of (H2O)6, with as few as one larger molecule, the MP2 correlation energy of large liquid water (H2O)64 in a periodic supercell can be predicted at chemical accuracy. Similar performance is observed for large protonated clusters and periodic polyglycine chains. A demonstrative application is presented to predict the energy ordering of symmetrically inequivalent sublattices for distinct hydrogen orientations in the ice XV phase. Our work represents an important step forward in the quest for cost-effective, highly accurate and transferable neural network models in quantum chemistry, bridging the electronic structure patterns between small and large systems.
Affiliation(s)
- Wai-Pan Ng
  - Department of Chemistry, The University of Hong Kong, Hong Kong 999077, P. R. China
- Zili Zhang
  - Department of Chemistry, The University of Hong Kong, Hong Kong 999077, P. R. China
- Jun Yang
  - Department of Chemistry, The University of Hong Kong, Hong Kong 999077, P. R. China
  - Hong Kong Quantum AI Lab Limited, Hong Kong 999077, P. R. China
4
Moon SW, Willow SY, Park TH, Min SK, Myung CW. Machine Learning Nonadiabatic Dynamics: Eliminating Phase Freedom of Nonadiabatic Couplings with the State-Interaction State-Averaged Spin-Restricted Ensemble-Referenced Kohn-Sham Approach. J Chem Theory Comput 2025; 21:1521-1529. [PMID: 39904753 DOI: 10.1021/acs.jctc.4c01475] [Indexed: 02/06/2025]
Abstract
Excited-state molecular dynamics (ESMD) simulations near conical intersections (CIs) pose significant challenges when using machine learning potentials (MLPs). Although MLPs have gained recognition for their integration into mixed quantum-classical (MQC) methods, such as trajectory surface hopping (TSH), and their capacity to model correlated electron-nuclear dynamics efficiently, difficulties persist in managing nonadiabatic dynamics. Specifically, singularities at CIs and double-valued coupling elements result in discontinuities that disrupt the smoothness of predictive functions. Partial solutions to these challenges have been provided by learning diabatic Hamiltonians with phaseless loss functions. However, a definitive method for addressing the discontinuities caused by CIs and double-valued coupling elements has yet to be developed. Here, we introduce the phaseless coupling term, Δ², derived from the square of the off-diagonal elements of the diabatic Hamiltonian in the state-interaction state-averaged spin-restricted ensemble-referenced Kohn-Sham (SI-SA-REKS, briefly SSR)(2,2) formalism. This approach improves the stability and accuracy of the MLP model by addressing the issues arising from CI singularities and double-valued coupling functions. We apply this method to the penta-2,4-dieniminium cation (PSB3), demonstrating its effectiveness in improving MLP training for ML-based nonadiabatic dynamics. Our results show that the Δ²-based ML-ESMD method can reproduce ab initio ESMD simulations, underscoring its potential and efficiency for broader applications, particularly in large-scale and long-time scale ESMD simulations.
Affiliation(s)
- Sung Wook Moon
  - Department of Chemistry, School of Natural Science, Ulsan National Institute of Science and Technology (UNIST), 50 UNIST-gil, Ulju-gun, Ulsan 44919, Republic of Korea
- Soohaeng Yoo Willow
  - Department of Energy Science, Sungkyunkwan University, Seobu-ro 2066, Suwon 16419, Republic of Korea
- Tae Hyeon Park
  - Department of Energy Science, Sungkyunkwan University, Seobu-ro 2066, Suwon 16419, Republic of Korea
  - Center for 2D Quantum Heterostructures, Institute for Basic Science (IBS), Suwon 16419, Republic of Korea
- Seung Kyu Min
  - Department of Chemistry, School of Natural Science, Ulsan National Institute of Science and Technology (UNIST), 50 UNIST-gil, Ulju-gun, Ulsan 44919, Republic of Korea
  - Center for 2D Quantum Heterostructures, Institute for Basic Science (IBS), Suwon 16419, Republic of Korea
- Chang Woo Myung
  - Department of Energy Science, Sungkyunkwan University, Seobu-ro 2066, Suwon 16419, Republic of Korea
  - Center for 2D Quantum Heterostructures, Institute for Basic Science (IBS), Suwon 16419, Republic of Korea
5
Paschek D, Busch J, Chiramel Tony AM, Ludwig R, Strate A, Stolte N, Forbert H, Marx D. When theory meets experiment: What does it take to accurately predict 1H NMR dipolar relaxation rates in neat liquid water from theory? J Chem Phys 2025; 162:054501. [PMID: 39898566 DOI: 10.1063/5.0249826] [Received: 11/20/2024] [Accepted: 01/10/2025] [Indexed: 02/04/2025] Open
Abstract
In this contribution, we compute the 1H nuclear magnetic resonance (NMR) relaxation rate of liquid water at ambient conditions. We are using structural and dynamical information from Coupled Cluster Molecular Dynamics (CCMD) trajectories generated at CCSD(T) electronic structure accuracy while also considering nuclear quantum effects in addition to consulting information from x-ray and neutron scattering experiments. Our analysis is based on a recently presented computational framework for determining the frequency-dependent NMR dipole-dipole relaxation rate of spin 1/2 nuclei from Molecular Dynamics (MD) simulations, which allows for an effective disentanglement of its structural and dynamical contributions and includes a correction for finite-size effects inherent to MD simulations with periodic boundary conditions. A close to perfect agreement with experimental relaxation data is achieved if structural and dynamical information from CCMD trajectories is considered, leading to a re-balancing of the rotational and translational dynamics, which can also be expressed by the product of the self-diffusion coefficient and the reorientational correlation time of the H-H vector D0 × τHH. The simulations show that this balance is significantly altered when nuclear quantum effects are taken into account. Our analysis suggests that the intermolecular and intramolecular contributions to the 1H NMR relaxation rate of liquid water are almost similar in magnitude, unlike what was predicted earlier from fully classical MD simulations.
Affiliation(s)
- Dietmar Paschek
  - Institut für Chemie, Abteilung Physikalische und Theoretische Chemie, Universität Rostock, Albert-Einstein-Str. 27, D-18059 Rostock, Germany
- Johanna Busch
  - Institut für Chemie, Abteilung Physikalische und Theoretische Chemie, Universität Rostock, Albert-Einstein-Str. 27, D-18059 Rostock, Germany
- Angel Mary Chiramel Tony
  - Institut für Chemie, Abteilung Physikalische und Theoretische Chemie, Universität Rostock, Albert-Einstein-Str. 27, D-18059 Rostock, Germany
- Ralf Ludwig
  - Institut für Chemie, Abteilung Physikalische und Theoretische Chemie, Universität Rostock, Albert-Einstein-Str. 27, D-18059 Rostock, Germany
- Anne Strate
  - Institut für Chemie, Abteilung Physikalische und Theoretische Chemie, Universität Rostock, Albert-Einstein-Str. 27, D-18059 Rostock, Germany
- Nore Stolte
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, D-44780 Bochum, Germany
- Harald Forbert
  - Center for Solvation Science ZEMOS, Ruhr-Universität Bochum, D-44780 Bochum, Germany
- Dominik Marx
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, D-44780 Bochum, Germany
6
Stolte N, Daru J, Forbert H, Marx D, Behler J. Random Sampling Versus Active Learning Algorithms for Machine Learning Potentials of Quantum Liquid Water. J Chem Theory Comput 2025; 21:886-899. [PMID: 39808506 DOI: 10.1021/acs.jctc.4c01382] [Indexed: 01/16/2025]
Abstract
Training accurate machine learning potentials requires electronic structure data comprehensively covering the configurational space of the system of interest. As the construction of this data is computationally demanding, many schemes for identifying the most important structures have been proposed. Here, we compare the performance of high-dimensional neural network potentials (HDNNPs) for quantum liquid water at ambient conditions trained to data sets constructed using random sampling as well as various flavors of active learning based on query by committee. Contrary to the common understanding of active learning, we find that for a given data set size, random sampling leads to smaller test errors for structures not included in the training process. In our analysis, we show that this can be related to small energy offsets caused by a bias in structures added in active learning, which can be overcome by using instead energy correlations as an error measure that is invariant to such shifts. Still, all HDNNPs yield very similar and accurate structural properties of quantum liquid water, which demonstrates the robustness of the training procedure with respect to the training set construction algorithm even when trained to as few as 200 structures. However, we find that for active learning based on preliminary potentials, a reasonable initial data set is important to avoid an unnecessary extension of the covered configuration space to less relevant regions.
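The two ideas in this abstract, query by committee and an error measure that is invariant to constant energy offsets, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of the committee standard deviation as the disagreement score, and the use of the Pearson correlation as the shift-invariant measure are all assumptions made here for illustration.

```python
import math

def rmse(pred, ref):
    """Root-mean-square error; sensitive to a constant energy offset."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref))

def pearson(pred, ref):
    """Pearson correlation; unchanged if a constant is added to pred (assumed measure)."""
    n = len(ref)
    mean_p = sum(pred) / n
    mean_r = sum(ref) / n
    cov = sum((p - mean_p) * (r - mean_r) for p, r in zip(pred, ref))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_r = sum((r - mean_r) ** 2 for r in ref)
    return cov / math.sqrt(var_p * var_r)

def committee_disagreement(predictions):
    """predictions[m][i] is the energy of structure i from committee member m.
    Returns the population standard deviation across members for each structure."""
    n_models = len(predictions)
    scores = []
    for i in range(len(predictions[0])):
        values = [predictions[m][i] for m in range(n_models)]
        mean = sum(values) / n_models
        scores.append(math.sqrt(sum((v - mean) ** 2 for v in values) / n_models))
    return scores

def select_by_committee(predictions, k):
    """Indices of the k structures on which the committee disagrees most."""
    scores = committee_disagreement(predictions)
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]

# Hypothetical numbers: a prediction that is perfect up to a constant shift.
reference = [0.0, 1.0, 2.0, 3.0]
shifted = [e + 0.5 for e in reference]
print(rmse(shifted, reference), pearson(shifted, reference))

# Hypothetical committee of three models on three candidate structures.
committee = [[1.0, 2.0, 3.0], [1.0, 2.5, 3.0], [1.0, 1.5, 3.1]]
print(select_by_committee(committee, 1))
```

In this toy case the RMSE reports the full offset while the correlation does not, which is the kind of difference between error measures the abstract refers to.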
Affiliation(s)
- Nore Stolte
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, Bochum 44780, Germany
- János Daru
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, Bochum 44780, Germany
  - Department of Organic Chemistry, Eötvös Loránd University, Budapest 1117, Hungary
- Harald Forbert
  - Center for Solvation Science ZEMOS, Ruhr-Universität Bochum, Bochum 44780, Germany
- Dominik Marx
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, Bochum 44780, Germany
- Jörg Behler
  - Lehrstuhl für Theoretische Chemie II, Ruhr-Universität Bochum, Bochum 44780, Germany
  - Research Center Chemical Sciences and Sustainability, Research Alliance Ruhr, Bochum 44780, Germany
7
Kulichenko M, Nebgen B, Lubbers N, Smith JS, Barros K, Allen AEA, Habib A, Shinkle E, Fedik N, Li YW, Messerly RA, Tretiak S. Data Generation for Machine Learning Interatomic Potentials and Beyond. Chem Rev 2024; 124:13681-13714. [PMID: 39572011 DOI: 10.1021/acs.chemrev.4c00572] [Indexed: 12/26/2024]
Abstract
The field of data-driven chemistry is undergoing an evolution, driven by innovations in machine learning models for predicting molecular properties and behavior. Recent strides in machine learning interatomic potentials (MLIPs) have paved the way for accurate modeling of diverse chemical and structural properties at the atomic level. The key determinant defining MLIP reliability remains the quality of the training data. A paramount challenge lies in constructing training sets that capture specific domains in the vast chemical and structural space. This Review navigates the intricate landscape of essential components and integrity of training data that ensure the extensibility and transferability of the resulting models. We delve into the details of active learning, discussing its various facets and implementations. We outline different types of uncertainty quantification applied to atomistic data acquisition and the correlations between estimated uncertainty and true error. The role of atomistic data samplers in generating diverse and informative structures is highlighted. Furthermore, we discuss data acquisition via modified and surrogate potential energy surfaces as an innovative approach to diversify training data. The Review also provides a list of publicly available data sets that cover essential domains of chemical space.
Affiliation(s)
- Maksim Kulichenko
  - Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Benjamin Nebgen
  - Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Nicholas Lubbers
  - Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Justin S Smith
  - NVIDIA Corporation, Santa Clara, California 95051, United States
- Kipton Barros
  - Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
  - Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Alice E A Allen
  - Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
  - Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Adela Habib
  - Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Emily Shinkle
  - Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Nikita Fedik
  - Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Ying Wai Li
  - Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Richard A Messerly
  - Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Sergei Tretiak
  - Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
  - Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
  - Center for Integrated Nanotechnologies, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
8
Stolte N, Daru J, Forbert H, Behler J, Marx D. Nuclear Quantum Effects in Liquid Water Are Marginal for Its Average Structure but Significant for Dynamics. J Phys Chem Lett 2024; 15:12144-12150. [PMID: 39607891 DOI: 10.1021/acs.jpclett.4c02925] [Indexed: 11/30/2024]
Abstract
Isotopic substitution, which can be realized in both experiment and computer simulations, is a direct approach to assess the role of nuclear quantum effects on the structure and dynamics of matter. However, the impact of nuclear quantum effects on the structure of liquid water as probed in experiment by comparing normal to heavy water has remained controversial. To settle this issue, we employ a highly accurate machine-learned high-dimensional neural network potential to perform converged coupled cluster-quality path integral simulations of liquid H2O versus D2O at ambient conditions. We find substantial H/D quantum effects on the rotational and translational dynamics of water, in close agreement with the experimental benchmarks. However, in stark contrast to the role for dynamics, H/D quantum effects turn out to be small, on the order of 1/1000 Å, on both average intramolecular and H-bonding structures of water. The most probable structure of water remains nearly unaffected by nuclear quantum effects, but effects on fluctuations away from average are appreciable, rendering H2O substantially more "liquid" than D2O.
Affiliation(s)
- Nore Stolte
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, 44780 Bochum, Germany
- János Daru
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, 44780 Bochum, Germany
  - Department of Organic Chemistry, Eötvös Loránd University, 1117 Budapest, Hungary
- Harald Forbert
  - Center for Solvation Science ZEMOS, Ruhr-Universität Bochum, 44780 Bochum, Germany
- Jörg Behler
  - Lehrstuhl für Theoretische Chemie II, Ruhr-Universität Bochum, 44780 Bochum, Germany
  - Research Center Chemical Sciences and Sustainability, Research Alliance Ruhr, 44780 Bochum, Germany
- Dominik Marx
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, 44780 Bochum, Germany
9
Tian W, Wang C, Zhou K. The Dynamic Diversity and Invariance of Ab Initio Water. J Chem Theory Comput 2024; 20:10667-10675. [PMID: 39558782 DOI: 10.1021/acs.jctc.4c01191] [Indexed: 11/20/2024]
Abstract
Comprehending water dynamics is crucial in various fields, such as water desalination, ion separation, electrocatalysis, and biochemical processes. While ab initio molecular dynamics (AIMD) accurately portrays water's structure, computing its dynamic properties over nanosecond time scales proves cost-prohibitive. This study employs machine learning potentials (MLPs) to accurately determine the dynamic properties of liquid water with ab initio accuracy. Our findings reveal diversity in the calculated diffusion coefficient (D) and viscosity of water (η) across different methodologies. Specifically, while the GGA, meta-GGA, and hybrid functional methods struggle to predict dynamic properties under ambient conditions, methods on the higher level of Jacob's ladder of DFT approximation perform significantly better. Intriguingly, we discovered that both D and η adhere to the established Stokes-Einstein (SE) relation for all of the ab initio water models. The diversity observed across different methods can be attributed to distinct structural entropy, affirming the applicability of excess entropy scaling relations across all functionals. The correlation between D and η provides valuable insights for identifying the ideal temperature to accurately replicate the dynamic properties of liquid water. Furthermore, our findings can validate the rationale behind employing artificially high temperatures in the simulation of water via AIMD. These outcomes not only pave the path to designing better functionals for water but also underscore the significance of water's many-body characteristics.
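For readers unfamiliar with the Stokes-Einstein relation mentioned in this abstract, a minimal sketch follows. It assumes the common stick-boundary form D = k_B T / (6 π η r); the abstract does not state which prefactor the authors use, so the factor 6 is exposed as a parameter. The function names are hypothetical, and the example numbers are approximate textbook values for water near 25 °C, not results from the cited paper.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)

def stokes_einstein_radius(diffusion, viscosity, temperature, c=6.0):
    """Effective hydrodynamic radius (m) from D (m^2/s), eta (Pa s), and T (K).
    c = 6 corresponds to stick boundary conditions (an assumption here)."""
    return K_B * temperature / (c * math.pi * viscosity * diffusion)

def stokes_einstein_diffusion(radius, viscosity, temperature, c=6.0):
    """Diffusion coefficient (m^2/s) predicted for a given effective radius."""
    return K_B * temperature / (c * math.pi * viscosity * radius)

def se_product(diffusion, viscosity, temperature):
    """D * eta / T, which is constant across models if they share one effective radius."""
    return diffusion * viscosity / temperature

# Approximate textbook values for liquid water near 298 K (illustrative only).
D_WATER = 2.3e-9      # m^2/s
ETA_WATER = 0.89e-3   # Pa s
T_WATER = 298.15      # K

radius = stokes_einstein_radius(D_WATER, ETA_WATER, T_WATER)
print(f"effective radius: {radius * 1e10:.2f} Angstrom")
```

Comparing `se_product` across simulations run with different functionals is one simple way to check whether D and η move together as the relation implies.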
Affiliation(s)
- Wei Tian
  - College of Energy, SIEMIS, Soochow University, Suzhou 215006, China
- Chenyu Wang
  - College of Energy, SIEMIS, Soochow University, Suzhou 215006, China
- Ke Zhou
  - College of Energy, SIEMIS, Soochow University, Suzhou 215006, China
10
Thiemann FL, O'Neill N, Kapil V, Michaelides A, Schran C. Introduction to machine learning potentials for atomistic simulations. J Phys Condens Matter 2024; 37:073002. [PMID: 39577092 DOI: 10.1088/1361-648x/ad9657] [Received: 07/16/2024] [Accepted: 11/22/2024] [Indexed: 11/24/2024]
Abstract
Machine learning potentials have revolutionised the field of atomistic simulations in recent years and are becoming a mainstay in the toolbox of computational scientists. This paper aims to provide an overview and introduction into machine learning potentials and their practical application to scientific problems. We provide a systematic guide for developing machine learning potentials, reviewing chemical descriptors, regression models, data generation and validation approaches. We begin with an emphasis on the earlier generation of models, such as high-dimensional neural network potentials and Gaussian approximation potentials, to provide historical perspective and guide the reader towards the understanding of recent developments, which are discussed in detail thereafter. Furthermore, we refer to relevant expert reviews, open-source software, and practical examples, further lowering the barrier to exploring these methods. The paper ends with selected showcase examples, highlighting the capabilities of machine learning potentials and how they can be applied to push the boundaries in atomistic simulations.
Affiliation(s)
- Fabian L Thiemann
  - IBM Research Europe, Daresbury, Warrington WA4 4AD, United Kingdom
  - Cavendish Laboratory, Department of Physics, University of Cambridge, Cambridge CB3 0HE, United Kingdom
- Niamh O'Neill
  - Cavendish Laboratory, Department of Physics, University of Cambridge, Cambridge CB3 0HE, United Kingdom
  - Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
  - Lennard-Jones Centre, University of Cambridge, Trinity Ln, Cambridge CB2 1TN, United Kingdom
- Venkat Kapil
  - Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
  - Lennard-Jones Centre, University of Cambridge, Trinity Ln, Cambridge CB2 1TN, United Kingdom
  - Department of Physics and Astronomy, University College London, London, United Kingdom
  - Thomas Young Centre and London Centre for Nanotechnology, London, United Kingdom
- Angelos Michaelides
  - Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
  - Lennard-Jones Centre, University of Cambridge, Trinity Ln, Cambridge CB2 1TN, United Kingdom
- Christoph Schran
  - Cavendish Laboratory, Department of Physics, University of Cambridge, Cambridge CB3 0HE, United Kingdom
  - Lennard-Jones Centre, University of Cambridge, Trinity Ln, Cambridge CB2 1TN, United Kingdom
11
Thomsen B, Nagai Y, Kobayashi K, Hamada I, Shiga M. Self-learning path integral hybrid Monte Carlo with mixed ab initio and machine learning potentials for modeling nuclear quantum effects in water. J Chem Phys 2024; 161:204109. [PMID: 39601285 DOI: 10.1063/5.0230464] [Received: 07/24/2024] [Accepted: 10/28/2024] [Indexed: 11/29/2024] Open
Abstract
The introduction of machine learned potentials (MLPs) has greatly expanded the space available for studying Nuclear Quantum Effects computationally with ab initio path integral (PI) accuracy, with the MLPs' promise of an accuracy comparable to that of ab initio at a fraction of the cost. One of the challenges in development of MLPs is the need for a large and diverse training set calculated by ab initio methods. This dataset should ideally cover the entire phase space, while not searching this space using ab initio methods, as this would be counterproductive and generally intractable with respect to computational time. In this paper, we present the self-learning PI hybrid Monte Carlo Method using a mixed ab initio and ML potential (SL-PIHMC-MIX), where the mixed potential allows for the study of larger systems and the extension of the original SL-HMC method [Nagai et al., Phys. Rev. B 102, 041124 (2020)] to PI methods and larger systems. While the MLPs generated by this method can be directly applied to run long-time ML-PIMD simulations, we demonstrate that using PIHMC-MIX with the trained MLPs allows for an exact reproduction of the structure obtained from ab initio PIMD. Specifically, we find that the PIHMC-MIX simulations require only 5000 evaluations of the 32-bead structure, compared to the 100 000 evaluations needed for the ab initio PIMD result.
Affiliation(s)
- Bo Thomsen
  - CCSE, Japan Atomic Energy Agency, 178-4-4, Wakashiba, Kashiwa, Chiba 277-0871, Japan
- Yuki Nagai
  - Information Technology Center, The University of Tokyo, 6-2-3 Kashiwanoha, Kashiwa, Chiba 277-0882, Japan
- Keita Kobayashi
  - CCSE, Japan Atomic Energy Agency, 178-4-4, Wakashiba, Kashiwa, Chiba 277-0871, Japan
- Ikutaro Hamada
  - Department of Precision Engineering, Graduate School of Engineering, Osaka University, 2-1, Yamadaoka, Suita, Osaka 565-0871, Japan
- Motoyuki Shiga
  - CCSE, Japan Atomic Energy Agency, 178-4-4, Wakashiba, Kashiwa, Chiba 277-0871, Japan
12
Jiang T, Baumgarten MKA, Loos PF, Mahajan A, Scemama A, Ung SF, Zhang J, Malone FD, Lee J. Improved modularity and new features in ipie: Toward even larger AFQMC calculations on CPUs and GPUs at zero and finite temperatures. J Chem Phys 2024; 161:162502. [PMID: 39450727 DOI: 10.1063/5.0225596] [Received: 06/25/2024] [Accepted: 10/04/2024] [Indexed: 10/26/2024] Open
Abstract
ipie is a Python-based auxiliary-field quantum Monte Carlo (AFQMC) package that has undergone substantial improvements since its initial release [Malone et al., J. Chem. Theory Comput. 19(1), 109-121 (2023)]. This paper outlines the improved modularity and new capabilities implemented in ipie. We highlight the ease of incorporating different trial and walker types and the seamless integration of ipie with external libraries. We enable distributed Hamiltonian simulations of large systems that otherwise would not fit on a single central processing unit node or graphics processing unit (GPU) card. This development enabled us to compute the interaction energy of a benzene dimer with 84 electrons and 1512 orbitals with multi-GPUs. Using CUDA and cupy for NVIDIA GPUs, ipie supports GPU-accelerated multi-slater determinant trial wavefunctions [Huang et al. arXiv:2406.08314 (2024)] to enable efficient and highly accurate simulations of large-scale systems. This allows for near-exact ground state energies of multi-reference clusters, [Cu₂O₂]²⁺ and [Fe₂S₂(SCH₃)₄]²⁻. We also describe implementations of free projection AFQMC, finite temperature AFQMC, AFQMC for electron-phonon systems, and automatic differentiation in AFQMC for calculating physical properties. These advancements position ipie as a leading platform for AFQMC research in quantum chemistry, facilitating more complex and ambitious computational method development and their applications.
Affiliation(s)
- Tong Jiang
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts 02138, USA
| | - Moritz K A Baumgarten
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts 02138, USA
| | - Pierre-François Loos
- Laboratoire de Chimie et Physique Quantiques (UMR 5626), Université de Toulouse, CNRS, UPS, Toulouse, France
| | - Ankit Mahajan
- Department of Chemistry, Columbia University, New York, New York 10027, USA
| | - Anthony Scemama
- Laboratoire de Chimie et Physique Quantiques (UMR 5626), Université de Toulouse, CNRS, UPS, Toulouse, France
| | - Shu Fay Ung
- Department of Chemistry, Columbia University, New York, New York 10027, USA
| | - Jinghong Zhang
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts 02138, USA
| | | | - Joonho Lee
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts 02138, USA
| |
|
13
|
Ye HZ, Berkelbach TC. Periodic Local Coupled-Cluster Theory for Insulators and Metals. J Chem Theory Comput 2024; 20:8948-8959. [PMID: 39376105 DOI: 10.1021/acs.jctc.4c00936] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/09/2024]
Abstract
We describe the implementation details of periodic local coupled-cluster theory with single and double excitations (CCSD) and perturbative triple excitations [CCSD(T)] using local natural orbitals (LNOs) and k-point symmetry. We discuss and compare several choices for orbital localization, fragmentation, and LNO construction. By studying diamond and lithium, we demonstrate that periodic LNO-CC theory can be applied with equal success to both insulators and metals, achieving speedups of 2 to 3 orders of magnitude even for moderately sized k-point meshes. Our final predictions of the equilibrium cohesive energy, lattice constant, and bulk modulus for diamond and lithium are in good agreement with previous theoretical predictions and experimental results.
Affiliation(s)
- Hong-Zhou Ye
- Department of Chemistry, Columbia University, New York, New York 10027, United States
| | - Timothy C Berkelbach
- Department of Chemistry, Columbia University, New York, New York 10027, United States
- Initiative for Computational Catalysis, Flatiron Institute, New York, New York 10010, United States
| |
|
14
|
Isamura BK, Popelier PLA. Transfer learning of hyperparameters for fast construction of anisotropic GPR models: design and application to the machine-learned force field FFLUX. Phys Chem Chem Phys 2024; 26:23677-23691. [PMID: 39224929 PMCID: PMC11369757 DOI: 10.1039/d4cp01862a] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2024] [Accepted: 08/22/2024] [Indexed: 09/04/2024]
Abstract
The polarisable machine-learned force field FFLUX requires pre-trained anisotropic Gaussian process regression (GPR) models of atomic energies and multipole moments to propagate unbiased molecular dynamics simulations. The outcome of FFLUX simulations is highly dependent on the predictive accuracy of the underlying models whose training entails determining the optimal set of model hyperparameters. Unfortunately, traditional direct learning (DL) procedures do not scale well on this task, especially when the hyperparameter search is initiated from a (set of) random guess solution(s). Additionally, the complexity of the hyperparameter space (HS) increases with the number of geometrical input features, at least for anisotropic kernels, making the optimization of hyperparameters even more challenging. In this study, we propose a transfer learning (TL) protocol that accelerates the training process of anisotropic GPR models by facilitating access to promising regions of the HS. The protocol is based on a seeding-relaxation mechanism in which an excellent guess solution is identified by rapidly building one or several small source models over a subset of the target training set before readjusting the previous guess over the entire set. We demonstrate the performance of this protocol by building and assessing the performance of DL and TL models of atomic energies and charges in various conformations of benzene, ethanol, formic acid dimer and the drug fomepizole. Our experiments suggest that TL models can be built one order of magnitude faster while preserving the quality of their DL analogs. Most importantly, when deployed in FFLUX simulations, TL models compete with or even outperform their DL analogs when it comes to performing FFLUX geometry optimization and computing harmonic vibrational modes.
Affiliation(s)
- Bienfait K Isamura
- Department of Chemistry, The University of Manchester, Manchester, M13 9PL, UK.
| | - Paul L A Popelier
- Department of Chemistry, The University of Manchester, Manchester, M13 9PL, UK.
| |
|
15
|
Wang J, Hei H, Zheng Y, Zhang H, Ye H. Five-Site Water Models for Ice and Liquid Water Generated by a Series-Parallel Machine Learning Strategy. J Chem Theory Comput 2024; 20:7533-7545. [PMID: 39133036 DOI: 10.1021/acs.jctc.4c00440] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/13/2024]
Abstract
Icing, a common natural phenomenon, always originates from a molecule. Molecular simulation is crucial for understanding the relevant process but still faces a great challenge in obtaining a uniform and accurate description of ice and liquid water with limited model parameters. Here, we propose a series-parallel machine learning (ML) approach consisting of a classification back-propagation neural network (BPNN), parallel regression BPNNs, and a genetic algorithm to establish conventional TIP5P-BG and temperature-dependent TIP5P-BGT models. The established water models exhibit a comprehensive balance among the crucial physical properties (melting point, density, vaporization enthalpy, self-diffusion coefficient, and viscosity) with mean absolute percentage errors of 2.65 and 2.40%, respectively, and excellent predictive performance on the related properties of liquid water. For ice, the simulation results on the critical nucleus size and growth rate are in good accordance with experiments. This work offers a powerful molecular model for phase transition and icing in nanoconfinement and a construction strategy for a complex molecular model in the extreme case.
Affiliation(s)
- Jian Wang
- International Research Center for Computational Mechanics, State Key Laboratory of Structural Analysis, Optimization and CAE Software for Industrial Equipment, Department of Engineering Mechanics, Faculty of Vehicle Engineering and Mechanics, Dalian University of Technology, Dalian 116024, P. R. China
| | - Haitao Hei
- International Research Center for Computational Mechanics, State Key Laboratory of Structural Analysis, Optimization and CAE Software for Industrial Equipment, Department of Engineering Mechanics, Faculty of Vehicle Engineering and Mechanics, Dalian University of Technology, Dalian 116024, P. R. China
| | - Yonggang Zheng
- International Research Center for Computational Mechanics, State Key Laboratory of Structural Analysis, Optimization and CAE Software for Industrial Equipment, Department of Engineering Mechanics, Faculty of Vehicle Engineering and Mechanics, Dalian University of Technology, Dalian 116024, P. R. China
- DUT-BSU Joint Institute, Dalian University of Technology, Dalian 116024, P. R. China
| | - Hongwu Zhang
- International Research Center for Computational Mechanics, State Key Laboratory of Structural Analysis, Optimization and CAE Software for Industrial Equipment, Department of Engineering Mechanics, Faculty of Vehicle Engineering and Mechanics, Dalian University of Technology, Dalian 116024, P. R. China
| | - Hongfei Ye
- International Research Center for Computational Mechanics, State Key Laboratory of Structural Analysis, Optimization and CAE Software for Industrial Equipment, Department of Engineering Mechanics, Faculty of Vehicle Engineering and Mechanics, Dalian University of Technology, Dalian 116024, P. R. China
| |
|
16
|
Iyer GR, Whelpley N, Tiihonen J, Kent PRC, Krogel JT, Rubenstein BM. Force-Free Identification of Minimum-Energy Pathways and Transition States for Stochastic Electronic Structure Theories. J Chem Theory Comput 2024; 20:7416-7429. [PMID: 39172163 DOI: 10.1021/acs.jctc.4c00214] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/23/2024]
Abstract
The accurate mapping of potential energy surfaces (PESs) is crucial to our understanding of the numerous physical and chemical processes mediated by atomic rearrangements, such as conformational changes and chemical reactions, and the thermodynamic and kinetic feasibility of these processes. Stochastic electronic structure theories, e.g., Quantum Monte Carlo (QMC) methods, enable highly accurate total energy calculations that in principle can be used to construct the PES. However, their stochastic nature poses a challenge to the computation and use of forces and Hessians, which are typically required in algorithms for minimum-energy pathway (MEP) and transition state (TS) identification, such as the nudged elastic band (NEB) algorithm and its climbing image formulation. Here, we present strategies that utilize the surrogate Hessian line-search method, previously developed for QMC structural optimization, to efficiently identify MEP and TS structures without requiring force calculations at the level of the stochastic electronic structure theory. By modifying the surrogate Hessian algorithm to operate in path-orthogonal subspaces and at saddle points, we show that it is possible to identify MEPs and TSs by using a force-free QMC approach. We demonstrate these strategies via two examples, the inversion of the ammonia (NH₃) molecule and the nucleophilic substitution (SN2) reaction F⁻ + CH₃F → FCH₃ + F⁻. We validate our results using Density Functional Theory (DFT)- and Coupled Cluster (CCSD, CCSD(T))-based NEB calculations. We then introduce a hybrid DFT-QMC approach to compute thermodynamic and kinetic quantities, free energy differences, rate constants, and equilibrium constants that incorporates stochastically optimized structures and their energies, and show that this scheme improves upon DFT accuracy. Our methods generalize straightforwardly to other systems and other high-accuracy theories that similarly face challenges computing energy gradients, paving the way for highly accurate PES mapping, transition state determination, and thermodynamic and kinetic calculations at significantly reduced computational expense.
Affiliation(s)
- Gopal R Iyer
- Department of Chemistry, Brown University, Providence, Rhode Island 02912, United States
| | - Noah Whelpley
- Department of Chemistry, Brown University, Providence, Rhode Island 02912, United States
| | - Juha Tiihonen
- Department of Physics, Nanoscience Center, University of Jyväskylä, Jyväskylä 40014, Finland
| | - Paul R C Kent
- Computational Sciences and Engineering Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831, United States
| | - Jaron T Krogel
- Materials Science and Technology Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831, United States
| | - Brenda M Rubenstein
- Department of Chemistry, Brown University, Providence, Rhode Island 02912, United States
| |
|
17
|
Willow SY, Kim DG, Sundheep R, Hajibabaei A, Kim KS, Myung CW. Active sparse Bayesian committee machine potential for isothermal-isobaric molecular dynamics simulations. Phys Chem Chem Phys 2024; 26:22073-22082. [PMID: 39113586 DOI: 10.1039/d4cp01801j] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/23/2024]
Abstract
Recent advancements in machine learning potentials (MLPs) have significantly impacted the fields of chemistry, physics, and biology by enabling large-scale first-principles simulations. Among different machine learning approaches, kernel-based MLPs distinguish themselves through their ability to handle small datasets, quantify uncertainties, and minimize over-fitting. Nevertheless, their extensive computational requirements present considerable challenges. To alleviate these, sparsification methods have been developed, aiming to reduce computational scaling without compromising accuracy. In the context of isothermal and isobaric ML molecular dynamics (MD) simulations, achieving precise pressure estimation is crucial for reproducing reliable system behavior under constant pressure. Despite progress, sparse kernel MLPs struggle with precise pressure prediction. Here, we introduce a virial kernel function that significantly enhances the pressure estimation accuracy of MLPs. Additionally, we propose the active sparse Bayesian committee machine (BCM) potential, an on-the-fly MLP architecture that aggregates local sparse Gaussian process regression (SGPR) MLPs. The sparse BCM potential overcomes the steep computational scaling with the kernel size, and a predefined restriction on the size of kernel allows for fast and efficient on-the-fly training. Our advancements facilitate accurate and computationally efficient machine learning-enhanced MD (MLMD) simulations across diverse systems, including ice-liquid coexisting phases, Li₁₀Ge(PS₆)₂ lithium solid electrolyte, and high-pressure liquid boron nitride.
Affiliation(s)
- Soohaeng Yoo Willow
- Department of Energy Science, Sungkyunkwan University, Seobu-ro 2066, Suwon 16419, Korea.
| | - Dong Geon Kim
- Department of Energy Science, Sungkyunkwan University, Seobu-ro 2066, Suwon 16419, Korea.
| | - R Sundheep
- Department of Energy Science, Sungkyunkwan University, Seobu-ro 2066, Suwon 16419, Korea.
| | - Amir Hajibabaei
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK
| | - Kwang S Kim
- Center for Superfunctional Materials, Department of Chemistry, Ulsan National Institute of Science and Technology, Ulsan 44919, Korea
| | - Chang Woo Myung
- Department of Energy Science, Sungkyunkwan University, Seobu-ro 2066, Suwon 16419, Korea.
| |
|
18
|
Shi BX, Wales DJ, Michaelides A, Myung CW. Going for Gold(-Standard): Attaining Coupled Cluster Accuracy in Oxide-Supported Nanoclusters. J Chem Theory Comput 2024; 20:5306-5316. [PMID: 38856017 DOI: 10.1021/acs.jctc.4c00379] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2024]
Abstract
The structure of oxide-supported metal nanoclusters plays an essential role in their sharply enhanced catalytic activity over that of bulk metals. Simulations provide the atomic-scale resolution needed to understand these systems. However, the sensitive mix of metal-metal and metal-support interactions, which govern their structure, puts stringent requirements on the method used, requiring calculations beyond standard density functional theory (DFT). The method of choice is coupled cluster theory [specifically CCSD(T)], but its computational cost has so far prevented its application to these systems. In this work, we showcase two approaches to make CCSD(T) accuracy readily achievable in oxide-supported nanoclusters. First, we leverage the SKZCAM protocol to provide the first benchmarks of oxide-supported nanoclusters, revealing that it is specifically metal-metal interactions that are challenging to capture with DFT. Second, we propose a CCSD(T) correction (ΔCC) to the metal-metal interaction errors in DFT, reaching accuracy comparable to that of the SKZCAM protocol at significantly lower cost. This approach forges a path toward studying larger systems at reliable accuracy, which we highlight by identifying a ground-state structure in agreement with experiments for Au₂₀ on MgO, a challenging system where DFT models have yielded conflicting predictions.
Affiliation(s)
- Benjamin X Shi
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K.
| | - David J Wales
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K.
| | - Angelos Michaelides
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K.
| | - Chang Woo Myung
- Department of Energy Science, Sungkyunkwan University, Seobu-ro 2066, Suwon 16419, Korea
| |
|
19
|
O’Neill N, Shi BX, Fong K, Michaelides A, Schran C. To Pair or not to Pair? Machine-Learned Explicitly-Correlated Electronic Structure for NaCl in Water. J Phys Chem Lett 2024; 15:6081-6091. [PMID: 38820256 PMCID: PMC11181334 DOI: 10.1021/acs.jpclett.4c01030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2024] [Revised: 05/23/2024] [Accepted: 05/24/2024] [Indexed: 06/02/2024]
Abstract
The extent of ion pairing in solution is an important phenomenon to rationalize transport and thermodynamic properties of electrolytes. A fundamental measure of this pairing is the potential of mean force (PMF) between solvated ions. The relative stabilities of the paired and solvent shared states in the PMF and the barrier between them are highly sensitive to the underlying potential energy surface. However, direct application of accurate electronic structure methods is challenging, since long simulations are required. We develop wave function based machine learning potentials with the random phase approximation (RPA) and second order Møller-Plesset (MP2) perturbation theory for the prototypical system of Na and Cl ions in water. We show both methods in agreement, predicting the paired and solvent shared states to have similar energies (within 0.2 kcal/mol). We also provide the same benchmarks for different DFT functionals as well as insight into the PMF based on simple analyses of the interactions in the system.
Affiliation(s)
- Niamh O’Neill
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
- Cavendish Laboratory, Department of Physics, University of Cambridge, Cambridge CB3 0HE, United Kingdom
- Lennard-Jones Centre, University of Cambridge, Trinity Ln, Cambridge CB2 1TN, United Kingdom
| | - Benjamin X. Shi
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
- Lennard-Jones Centre, University of Cambridge, Trinity Ln, Cambridge CB2 1TN, United Kingdom
| | - Kara Fong
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
- Lennard-Jones Centre, University of Cambridge, Trinity Ln, Cambridge CB2 1TN, United Kingdom
| | - Angelos Michaelides
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
- Lennard-Jones Centre, University of Cambridge, Trinity Ln, Cambridge CB2 1TN, United Kingdom
| | - Christoph Schran
- Cavendish Laboratory, Department of Physics, University of Cambridge, Cambridge CB3 0HE, United Kingdom
- Lennard-Jones Centre, University of Cambridge, Trinity Ln, Cambridge CB2 1TN, United Kingdom
| |
|
20
|
Althorpe SC. Path Integral Simulations of Condensed-Phase Vibrational Spectroscopy. Annu Rev Phys Chem 2024; 75:397-420. [PMID: 38941531 DOI: 10.1146/annurev-physchem-090722-124705] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/30/2024]
Abstract
Recent theoretical and algorithmic developments have improved the accuracy with which path integral dynamics methods can include nuclear quantum effects in simulations of condensed-phase vibrational spectra. Such methods are now understood to be approximations to the delocalized classical Matsubara dynamics of smooth Feynman paths, which dominate the dynamics of systems such as liquid water at room temperature. Focusing mainly on simulations of liquid water and hexagonal ice, we explain how the recently developed quasicentroid molecular dynamics (QCMD), fast-QCMD, and temperature-elevated path integral coarse-graining simulations (Te PIGS) methods generate classical dynamics on potentials of mean force obtained by averaging over quantum thermal fluctuations. These new methods give very close agreement with one another, and the Te PIGS method has recently yielded excellent agreement with experimentally measured vibrational spectra for liquid water, ice, and the liquid-air interface. We also discuss the limitations of such methods.
Affiliation(s)
- Stuart C Althorpe
- Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge, United Kingdom
| |
|
21
|
Pollak E. A personal perspective of the present status and future challenges facing thermal reaction rate theory. J Chem Phys 2024; 160:150902. [PMID: 38639316 DOI: 10.1063/5.0199557] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2024] [Accepted: 03/06/2024] [Indexed: 04/20/2024] Open
Abstract
Reaction rate theory has been at the center of physical chemistry for well over one hundred years. The evolution of the theory is not only of historical interest. Reliable and accurate computation of reaction rates remains a challenge to this very day, especially in view of the development of quantum chemistry methods, which predict the relevant force fields. It is still not possible to compute the numerically exact rate on the fly when the system has more than at most a few dozen anharmonic degrees of freedom, so one must consider various approximate methods, not only from the practical point of view of constructing numerical algorithms but also on conceptual and formal levels. In this Perspective, I present some of the recent analytical results concerning leading order terms in an ℏ²ᵐ series expansion of the exact rate and their implications on various approximate theories. A second aspect has to do with the crossover temperature between tunneling and thermal activation. Using a uniform semiclassical transmission probability rather than the "primitive" semiclassical theory leads to the conclusion that there is no divergence problem associated with a "crossover temperature." If one defines a semiclassical crossover temperature as the point at which the tunneling energy of the instanton equals the barrier height, then it is a factor of two higher than its previous estimate based on the "primitive" semiclassical approximation. In the low temperature tunneling regime, the uniform semiclassical theory as well as the "primitive" semiclassical theory were based on the classical Euclidean action of a periodic orbit on the inverted potential. The uniform semiclassical theory wrongly predicts that the "half-point," which is the energy at which the transmission probability equals 1/2, for any barrier potential, is always the barrier energy. We describe here how augmenting the Euclidean action with constant terms of order ℏ² can significantly improve the accuracy of the semiclassical theory and correct this deficiency. This also leads to a deep connection with and improvement of vibrational perturbation theory. The uniform semiclassical theory also enables an extension of the quantum version of Kramers' turnover theory to temperatures below the "crossover temperature." The implications of these recent advances on various approximate methods used to date are discussed at length, leading to the conclusion that reaction rate theory will continue to challenge us both on conceptual and practical levels for years to come.
Affiliation(s)
- Eli Pollak
- Chemical and Biological Physics Department, Weizmann Institute of Science, 76100 Rehovoth, Israel
| |
|
22
|
Tokita AM, Behler J. How to train a neural network potential. J Chem Phys 2023; 159:121501. [PMID: 38127396 DOI: 10.1063/5.0160326] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 05/31/2023] [Accepted: 07/24/2023] [Indexed: 12/23/2023] Open
Abstract
The introduction of modern Machine Learning Potentials (MLPs) has led to a paradigm change in the development of potential energy surfaces for atomistic simulations. By providing efficient access to energies and forces, they allow us to perform large-scale simulations of extended systems, which are not directly accessible by demanding first-principles methods. In these simulations, MLPs can reach the accuracy of electronic structure calculations, provided that they have been properly trained and validated using a suitable set of reference data. Due to their highly flexible functional form, the construction of MLPs has to be done with great care. In this Tutorial, we describe the necessary key steps for training reliable MLPs, from data generation via training to final validation. The procedure, which is illustrated for the example of a high-dimensional neural network potential, is general and applicable to many types of MLPs.
Affiliation(s)
- Alea Miako Tokita
- Lehrstuhl für Theoretische Chemie II, Ruhr-Universität Bochum, 44780 Bochum, Germany and Research Center Chemical Sciences and Sustainability, Research Alliance Ruhr, 44780 Bochum, Germany
| | - Jörg Behler
- Lehrstuhl für Theoretische Chemie II, Ruhr-Universität Bochum, 44780 Bochum, Germany and Research Center Chemical Sciences and Sustainability, Research Alliance Ruhr, 44780 Bochum, Germany
| |
|
23
|
Yu Q, Qu C, Houston PL, Nandi A, Pandey P, Conte R, Bowman JM. A Status Report on "Gold Standard" Machine-Learned Potentials for Water. J Phys Chem Lett 2023; 14:8077-8087. [PMID: 37656898 PMCID: PMC10510435 DOI: 10.1021/acs.jpclett.3c01791] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 06/30/2023] [Accepted: 08/28/2023] [Indexed: 09/03/2023]
Abstract
Owing to the central importance of water to life as well as its unusual properties, potentials for water have been the subject of extensive research over the past 50 years. Recently, five potentials based on different machine learning approaches have been reported that are at or near the "gold standard" CCSD(T) level of theory. The development of such high-level potentials enables efficient and accurate simulations of water systems using classical and quantum dynamical approaches. This Perspective serves as a status report of these potentials, focusing on their methodology and applications to water systems across different phases. Their performances on the energies of gas phase water clusters, as well as condensed phase structural and dynamical properties, are discussed.
Affiliation(s)
- Qi Yu
- Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, United States
| | - Chen Qu
- Independent Researcher, Toronto, Ontario M9B 0E3, Canada
| | - Paul L. Houston
- Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York 14853, United States
- Department of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
| | - Apurba Nandi
- Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, United States
- Department of Physics and Materials Science, University of Luxembourg, L-1511, Luxembourg City, Luxembourg
| | - Priyanka Pandey
- Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, United States
| | - Riccardo Conte
- Dipartimento di Chimica, Università degli Studi di Milano, via Golgi 19, 20133 Milano, Italy
| | - Joel M. Bowman
- Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, United States
| |
|
24
|
Atsango AO, Morawietz T, Marsalek O, Markland TE. Developing machine-learned potentials to simultaneously capture the dynamics of excess protons and hydroxide ions in classical and path integral simulations. J Chem Phys 2023; 159:074101. [PMID: 37581418 DOI: 10.1063/5.0162066] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2023] [Accepted: 07/31/2023] [Indexed: 08/16/2023] Open
Abstract
The transport of excess protons and hydroxide ions in water underlies numerous important chemical and biological processes. Accurately simulating the associated transport mechanisms ideally requires utilizing ab initio molecular dynamics simulations to model the bond breaking and formation involved in proton transfer and path-integral simulations to model the nuclear quantum effects relevant to light hydrogen atoms. These requirements result in a prohibitive computational cost, especially at the time and length scales needed to converge proton transport properties. Here, we present machine-learned potentials (MLPs) that can model both excess protons and hydroxide ions at the generalized gradient approximation and hybrid density functional theory levels of accuracy and use them to perform multiple nanoseconds of both classical and path-integral proton defect simulations at a fraction of the cost of the corresponding ab initio simulations. We show that the MLPs are able to reproduce ab initio trends and converge properties such as the diffusion coefficients of both excess protons and hydroxide ions. We use our multi-nanosecond simulations, which allow us to monitor large numbers of proton transfer events, to analyze the role of hypercoordination in the transport mechanism of the hydroxide ion and provide further evidence for the asymmetry in diffusion between excess protons and hydroxide ions.
Affiliation(s)
- Austin O Atsango
- Department of Chemistry, Stanford University, Stanford, California 94305, USA
| | - Tobias Morawietz
- Department of Chemistry, Stanford University, Stanford, California 94305, USA
| | - Ondrej Marsalek
- Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic
| | - Thomas E Markland
- Department of Chemistry, Stanford University, Stanford, California 94305, USA
|
25
|
Sukurma Z, Schlipf M, Humer M, Taheridehkordi A, Kresse G. Benchmark Phaseless Auxiliary-Field Quantum Monte Carlo Method for Small Molecules. J Chem Theory Comput 2023; 19:4921-4934. [PMID: 37470356 PMCID: PMC10413869 DOI: 10.1021/acs.jctc.3c00322] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/20/2023] [Indexed: 07/21/2023]
Abstract
We report a scalable Fortran implementation of the phaseless auxiliary-field quantum Monte Carlo (ph-AFQMC) method and demonstrate its excellent performance and beneficial scaling with respect to system size. Furthermore, we investigate modifications of the phaseless approximation that can help to reduce the overcorrelation problems common to ph-AFQMC. We apply the method to the 26 molecules in the HEAT set, the benzene molecule, and water clusters. We observe a mean absolute deviation of the total energy of 1.15 kcal/mol for the molecules in the HEAT set, close to chemical accuracy. For the benzene molecule, the modified algorithm, despite using a single-Slater-determinant trial wavefunction, yields the same accuracy as the original phaseless scheme with 400 Slater determinants. Despite these improvements, we find systematic errors for the CN, CO2, and O2 molecules that need to be addressed with more accurate trial wavefunctions. For water clusters, we find that ph-AFQMC yields excellent binding energies that typically differ from CCSD(T) by less than 0.5 kcal/mol.
Affiliation(s)
- Zoran Sukurma
- Faculty of Physics and Center for Computational Materials Science, University of Vienna, Kolingasse 14-16, A-1090 Vienna, Austria
- Faculty of Physics & Vienna Doctoral School in Physics, University of Vienna, Boltzmanngasse 5, A-1090 Vienna, Austria
| | | | - Moritz Humer
- Faculty of Physics and Center for Computational Materials Science, University of Vienna, Kolingasse 14-16, A-1090 Vienna, Austria
- Faculty of Physics & Vienna Doctoral School in Physics, University of Vienna, Boltzmanngasse 5, A-1090 Vienna, Austria
| | - Amir Taheridehkordi
- Faculty of Physics and Center for Computational Materials Science, University of Vienna, Kolingasse 14-16, A-1090 Vienna, Austria
| | - Georg Kresse
- Faculty of Physics and Center for Computational Materials Science, University of Vienna, Kolingasse 14-16, A-1090 Vienna, Austria
- VASP Software GmbH, Sensengasse 8, 1090 Vienna, Austria
|
26
|
Ruth M, Gerbig D, Schreiner PR. Machine Learning for Bridging the Gap between Density Functional Theory and Coupled Cluster Energies. J Chem Theory Comput 2023. [PMID: 37418619 DOI: 10.1021/acs.jctc.3c00274] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 07/09/2023]
Abstract
Accurate electronic energies and properties are crucial for successful reaction design and mechanistic investigations. Computing energies and properties of molecular structures has proven extremely useful, and, with increasing computational power, the limits of high-level approaches (such as coupled cluster theory) are expanding to ever larger systems. However, because scaling is highly unfavorable, these methods are still not universally applicable to larger systems. To address the need for fast and accurate electronic energies of larger systems, we created a database of around 8000 small organic monomers (2000 dimers) optimized at the B3LYP-D3(BJ)/cc-pVTZ level of theory. This database also includes single-point energies computed at various levels of theory, including PBE1PBE, ωB97X, M06-2X, revTPSS, B3LYP, and BP86, for density functional theory as well as DLPNO-CCSD(T) and CCSD(T) for coupled cluster theory, all in conjunction with a cc-pVTZ basis. We used this database to train machine learning models based on graph neural networks using two different graph representations. Our models are able to make energy predictions from B3LYP-D3(BJ)/cc-pVTZ inputs to CCSD(T)/cc-pVTZ outputs with a mean absolute error of 0.78 and to DLPNO-CCSD(T)/cc-pVTZ with a mean absolute error of 0.50 and 0.18 kcal mol⁻¹ for monomers and dimers, respectively. The model for dimers was further validated on the S22 database, and the monomer model was tested on challenging systems, including those with highly conjugated or functionally complex molecules.
Affiliation(s)
- Marcel Ruth
- Institute of Organic Chemistry, Justus Liebig University, Heinrich-Buff-Ring 17, 35392 Giessen, Germany
| | - Dennis Gerbig
- Institute of Organic Chemistry, Justus Liebig University, Heinrich-Buff-Ring 17, 35392 Giessen, Germany
| | - Peter R Schreiner
- Institute of Organic Chemistry, Justus Liebig University, Heinrich-Buff-Ring 17, 35392 Giessen, Germany
|
27
|
Guidarelli Mattioli F, Sciortino F, Russo J. Are Neural Network Potentials Trained on Liquid States Transferable to Crystal Nucleation? A Test on Ice Nucleation in the mW Water Model. J Phys Chem B 2023; 127:3894-3901. [PMID: 37075256 PMCID: PMC10165654 DOI: 10.1021/acs.jpcb.3c00693] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/31/2023] [Revised: 04/06/2023] [Indexed: 04/21/2023]
Abstract
Neural network potentials (NNPs) are increasingly being used to study processes that happen on long time scales. A typical example is crystal nucleation, whose rate is controlled by the occurrence of a rare fluctuation, i.e., the appearance of the critical nucleus. Because the properties of this nucleus are far from those of the bulk crystal, it is still unclear whether NNPs trained on equilibrium liquid states can accurately describe nucleation processes. So far, nucleation studies on NNPs have been limited to ab initio models whose nucleation properties are unknown, preventing an accurate comparison. Here we train an NNP on the mW model of water, a classical three-body potential whose nucleation time scale is accessible in standard simulations. We show that an NNP trained only on a small number of liquid state points can reproduce with great accuracy the nucleation rates and free energy barriers of the original model, computed from both spontaneous and biased trajectories, strongly supporting the use of NNPs to study nucleation events.
Affiliation(s)
| | | | - John Russo
- Sapienza University of Rome, Piazzale Aldo Moro 2, 00185 Rome, Italy
|