1
Deng Z, Liu C, Li Z, Zhang Y. An efficient method by combining different basis sets and SAPT levels. J Comput Chem 2024. [PMID: 38703182 DOI: 10.1002/jcc.27386] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2023] [Revised: 04/15/2024] [Accepted: 04/18/2024] [Indexed: 05/06/2024]
Abstract
In symmetry-adapted perturbation theory (SAPT), accurate calculations on non-covalent interaction (NCI) for large complexes with more than 50 atoms are time-consuming using large basis sets. More efficient ones with smaller basis sets usually result in poor prediction in terms of dispersion and overall energies. In this study, we propose two composite methods with baseline calculated at SAPT2/aug-cc-pVDZ and SAPT2/aug-cc-pVTZ with dispersion term corrected at SAPT2+ level using bond functions and smaller basis set with δMP2 corrections, respectively. Benchmark results on representative NCI data sets, such as S22, S66, and so forth, show significant improvements on the accuracy compared to the original SAPT Silver standard and comparable to SAPT Gold standard in some cases with much less computational cost.
Affiliation(s)
- Zhihao Deng
  - Beijing StoneWise Technology Co Ltd., Beijing, China
- Chang Liu
  - Beijing StoneWise Technology Co Ltd., Beijing, China
- Zhongwei Li
  - Yantai Gogetter Technology Co Ltd., Yantai, China
2
Kurnikov IV, Pereyaslavets L, Kamath G, Sakipov SN, Voronina E, Butin O, Illarionov A, Leontyev I, Nawrocki G, Darkhovskiy M, Olevanov M, Ivahnenko I, Chen Y, Lock CB, Levitt M, Kornberg RD, Fain B. Neural Network Corrections to Intermolecular Interaction Terms of a Molecular Force Field Capture Nuclear Quantum Effects in Calculations of Liquid Thermodynamic Properties. J Chem Theory Comput 2024; 20:1347-1357. [PMID: 38240485 PMCID: PMC11042917 DOI: 10.1021/acs.jctc.3c00921] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/19/2024]
Abstract
We incorporate nuclear quantum effects (NQE) in condensed matter simulations by introducing short-range neural network (NN) corrections to the ab initio fitted molecular force field ARROW. Force field NN corrections are fitted to average interaction energies and forces of molecular dimers, which are simulated using the Path Integral Molecular Dynamics (PIMD) technique with restrained centroid positions. The NN-corrected force field allows reproduction of the NQE for computed liquid water and methane properties such as density, radial distribution function (RDF), heat of evaporation (HVAP), and solvation free energy. Accounting for NQE through molecular force field corrections circumvents the need for explicit computationally expensive PIMD simulations in accurate calculations of the properties of chemical and biological systems. The accuracy and locality of pairwise NN NQE corrections indicate that this approach could be applicable to complex heterogeneous systems, such as proteins.
Affiliation(s)
- Igor V Kurnikov
  - InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Leonid Pereyaslavets
  - InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Ganesh Kamath
  - InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Serzhan N Sakipov
  - InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Ekaterina Voronina
  - InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Oleg Butin
  - InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Alexey Illarionov
  - InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Igor Leontyev
  - InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Grzegorz Nawrocki
  - InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Mikhail Darkhovskiy
  - InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Michael Olevanov
  - InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Ilya Ivahnenko
  - InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- YuChun Chen
  - InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Christopher B Lock
  - InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
  - Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Palo Alto, California 94304, United States
- Michael Levitt
  - Department of Structural Biology, Stanford University School of Medicine, Stanford, California 94305, United States
- Roger D Kornberg
  - Department of Structural Biology, Stanford University School of Medicine, Stanford, California 94305, United States
- Boris Fain
  - InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
3
Fan ZX, Chao SD. A Machine Learning Force Field for Bio-Macromolecular Modeling Based on Quantum Chemistry-Calculated Interaction Energy Datasets. Bioengineering (Basel) 2024; 11:51. [PMID: 38247928 DOI: 10.3390/bioengineering11010051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Revised: 12/23/2023] [Accepted: 12/25/2023] [Indexed: 01/23/2024] Open
Abstract
Accurate energy data from noncovalent interactions are essential for constructing force fields for molecular dynamics simulations of bio-macromolecular systems. There are two important practical issues in the construction of a reliable force field with the hope of balancing the desired chemical accuracy and working efficiency. One is to determine a suitable quantum chemistry level of theory for calculating interaction energies. The other is to use a suitable continuous energy function to model the quantum chemical energy data. For the first issue, we have recently calculated the intermolecular interaction energies using the SAPT0 level of theory, and we have systematically organized these energies into the ab initio SOFG-31 (homodimer) and SOFG-31-heterodimer datasets. In this work, we re-calculate these interaction energies by using the more advanced SAPT2 level of theory with a wider series of basis sets. Our purpose is to determine the SAPT level of theory proper for interaction energies with respect to the CCSD(T)/CBS benchmark chemical accuracy. Next, to utilize these energy datasets, we employ one of the well-developed machine learning techniques, called the CLIFF scheme, to construct a general-purpose force field for biomolecular dynamics simulations. Here we use the SOFG-31 dataset and the SOFG-31-heterodimer dataset as the training and test sets, respectively. Our results demonstrate that using the CLIFF scheme can reproduce a diverse range of dimeric interaction energy patterns with only a small training set. The overall errors for each SAPT energy component, as well as the SAPT total energy, are all well below the desired chemical accuracy of ~1 kcal/mol.
Affiliation(s)
- Zhen-Xuan Fan
  - Institute of Applied Mechanics, National Taiwan University, Taipei 106, Taiwan
- Sheng D Chao
  - Institute of Applied Mechanics, National Taiwan University, Taipei 106, Taiwan
  - Center for Quantum Science and Engineering, National Taiwan University, Taipei 106, Taiwan
4
Demir Gİ, Tekin A. NICE-FF: A non-empirical, intermolecular, consistent, and extensible force field for nucleic acids and beyond. J Chem Phys 2023; 159:244117. [PMID: 38153156 DOI: 10.1063/5.0176641] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Accepted: 12/04/2023] [Indexed: 12/29/2023] Open
Abstract
A new non-empirical ab initio intermolecular force field (NICE-FF in buffered 14-7 potential form) has been developed for nucleic acids and beyond based on the dimer interaction energies (IEs) calculated at the spin component scaled-MI-second order Møller-Plesset perturbation theory. A fully automatic framework has been implemented for this purpose, capable of generating well-polished computational grids, performing the necessary ab initio calculations, conducting machine learning (ML) assisted force field (FF) parametrization, and extending existing FF parameters by incorporating new atom types. For the ML-assisted parametrization of NICE-FF, interaction energies of ∼18 000 dimer geometries (with IE < 0) were used, and the best fit gave a mean square deviation of about 0.46 kcal/mol. During this parametrization, atom types apparent in four deoxyribonucleic acid (DNA) bases have been first trained using the generated DNA base datasets. Both uracil and hypoxanthine, which contain the same atom types found in DNA bases, have been considered as test molecules. Three new atom types have been added to the DNA atom types by using IE datasets of both pyrazinamide and 9-methylhypoxanthine. Finally, the last test molecule, theophylline, has been selected, which contains already-fitted atom-type parameters. The performance of NICE-FF has been investigated on the S22 dataset, and it has been found that NICE-FF outperforms the well-known FFs by generating the most consistent IEs with the high-level ab initio ones. Moreover, NICE-FF has been integrated into our in-house developed crystal structure prediction (CSP) tool [called FFCASP (Fast and Flexible CrystAl Structure Predictor)], aiming to find the experimental crystal structures of all considered molecules. 
CSPs, which were performed up to 4 formula units (Z), resulted in NICE-FF being able to locate almost all the known experimental crystal structures with sufficiently low RMSD20 values to provide good starting points for density functional theory optimizations.
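
For readers unfamiliar with the buffered 14-7 form named in this abstract, the following is a minimal sketch of Halgren's general pair expression. The buffering constants `delta = 0.07` and `gamma = 0.12` are Halgren's commonly used defaults and are an assumption here: the abstract does not say which constants or combining rules NICE-FF adopts, and the function and parameter names are purely illustrative.

```python
def buffered_14_7(r, eps, r_star, delta=0.07, gamma=0.12):
    """Buffered 14-7 pair energy (Halgren's general form).

    r      : interatomic distance
    eps    : well-depth parameter (sets the energy unit of the result)
    r_star : reference distance at which the energy equals -eps
             (close to, but not exactly at, the minimum when delta != gamma)
    delta, gamma : buffering constants (assumed defaults, not taken from NICE-FF)
    """
    rho = r / r_star
    # Buffered repulsive factor: stays finite as r -> 0 because of delta.
    repulsive = ((1.0 + delta) / (rho + delta)) ** 7
    # Buffered attractive/shape factor: equals -1 at rho = 1.
    shape = (1.0 + gamma) / (rho ** 7 + gamma) - 2.0
    return eps * repulsive * shape
```

At `r == r_star` both buffered factors reduce to simple constants and the energy is `-eps` by construction; at short range the value is positive and finite rather than divergent, which is the practical reason for the buffering.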
Affiliation(s)
- Gözde İniş Demir
  - Informatics Institute, Istanbul Technical University, 34469 Maslak, Istanbul, Türkiye
- Adem Tekin
  - Informatics Institute, Istanbul Technical University, 34469 Maslak, Istanbul, Türkiye
  - Research Institute for Fundamental Sciences (TÜBİTAK-TBAE), Kocaeli, Türkiye
5
Chen JA, Chao SD. Intermolecular Non-Bonded Interactions from Machine Learning Datasets. Molecules 2023; 28:7900. [PMID: 38067629 PMCID: PMC10707888 DOI: 10.3390/molecules28237900] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/31/2023] [Revised: 11/22/2023] [Accepted: 11/29/2023] [Indexed: 04/04/2024] Open
Abstract
Accurate determination of intermolecular non-covalent-bonded or non-bonded interactions is the key to potentially useful molecular dynamics simulations of polymer systems. However, it is challenging to balance both the accuracy and computational cost in force field modelling. One of the main difficulties is properly representing the calculated energy data as a continuous force function. In this paper, we employ well-developed machine learning techniques to construct a general purpose intermolecular non-bonded interaction force field for organic polymers. The original ab initio dataset SOFG-31 was calculated by us and has been well documented, and here we use it as our training set. The CLIFF kernel type machine learning scheme is used for predicting the interaction energies of heterodimers selected from the SOFG-31 dataset. Our test results show that the overall errors are well below the chemical accuracy of about 1 kcal/mol, thus demonstrating the promising feasibility of machine learning techniques in force field modelling.
Affiliation(s)
- Jia-An Chen
  - Institute of Applied Mechanics, National Taiwan University, Taipei 106, Taiwan
- Sheng D. Chao
  - Institute of Applied Mechanics, National Taiwan University, Taipei 106, Taiwan
  - Center for Quantum Science and Engineering, National Taiwan University, Taipei 106, Taiwan
6
Spronk SA, Glick ZL, Metcalf DP, Sherrill CD, Cheney DL. A quantum chemical interaction energy dataset for accurately modeling protein-ligand interactions. Sci Data 2023; 10:619. [PMID: 37699937 PMCID: PMC10497680 DOI: 10.1038/s41597-023-02443-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/02/2023] [Accepted: 08/03/2023] [Indexed: 09/14/2023] Open
Abstract
Fast and accurate calculation of intermolecular interaction energies is desirable for understanding many chemical and biological processes, including the binding of small molecules to proteins. The Splinter ["Symmetry-adapted perturbation theory (SAPT0) protein-ligand interaction"] dataset has been created to facilitate the development and improvement of methods for performing such calculations. Molecular fragments representing commonly found substructures in proteins and small-molecule ligands were paired into >9000 unique dimers, assembled into numerous configurations using an approach designed to adequately cover the breadth of the dimers' potential energy surfaces while enhancing sampling in favorable regions. ~1.5 million configurations of these dimers were randomly generated, and a structurally diverse subset of these were minimized to obtain an additional ~80 thousand local and global minima. For all >1.6 million configurations, SAPT0 calculations were performed with two basis sets to complete the dataset. It is expected that Splinter will be a useful benchmark dataset for training and testing various methods for the calculation of intermolecular interaction energies.
Affiliation(s)
- Steven A Spronk
  - Molecular Structure and Design, Bristol Myers Squibb Company, P. O. Box 5400, Princeton, NJ, 08543, USA
- Zachary L Glick
  - Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, and School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA, 30332-0400, USA
- Derek P Metcalf
  - Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, and School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA, 30332-0400, USA
- C David Sherrill
  - Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, and School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA, 30332-0400, USA
- Daniel L Cheney
  - Molecular Structure and Design, Bristol Myers Squibb Company, P. O. Box 5400, Princeton, NJ, 08543, USA
7
Hagg A, Kirschner KN. Open-Source Machine Learning in Computational Chemistry. J Chem Inf Model 2023; 63:4505-4532. [PMID: 37466636 PMCID: PMC10430767 DOI: 10.1021/acs.jcim.3c00643] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/04/2023] [Indexed: 07/20/2023]
Abstract
The field of computational chemistry has seen a significant increase in the integration of machine learning concepts and algorithms. In this Perspective, we surveyed 179 open-source software projects, with corresponding peer-reviewed papers published within the last 5 years, to better understand the topics within the field being investigated by machine learning approaches. For each project, we provide a short description, the link to the code, the accompanying license type, and whether the training data and resulting models are made publicly available. Based on those deposited in GitHub repositories, the most popular employed Python libraries are identified. We hope that this survey will serve as a resource to learn about machine learning or specific architectures thereof by identifying accessible codes with accompanying papers on a topic basis. To this end, we also include computational chemistry open-source software for generating training data and fundamental Python libraries for machine learning. Based on our observations and considering the three pillars of collaborative machine learning work, open data, open source (code), and open models, we provide some suggestions to the community.
Affiliation(s)
- Alexander Hagg
  - Institute of Technology, Resource and Energy-Efficient Engineering (TREE), University of Applied Sciences Bonn-Rhein-Sieg, 53757 Sankt Augustin, Germany
  - Department of Electrical Engineering, Mechanical Engineering and Technical Journalism, University of Applied Sciences Bonn-Rhein-Sieg, 53757 Sankt Augustin, Germany
- Karl N. Kirschner
  - Institute of Technology, Resource and Energy-Efficient Engineering (TREE), University of Applied Sciences Bonn-Rhein-Sieg, 53757 Sankt Augustin, Germany
  - Department of Computer Science, University of Applied Sciences Bonn-Rhein-Sieg, 53757 Sankt Augustin, Germany
8
Perrella F, Coppola F, Rega N, Petrone A. An Expedited Route to Optical and Electronic Properties at Finite Temperature via Unsupervised Learning. Molecules 2023; 28:3411. [PMID: 37110644 PMCID: PMC10144358 DOI: 10.3390/molecules28083411] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/26/2023] [Revised: 04/06/2023] [Accepted: 04/07/2023] [Indexed: 04/29/2023] Open
Abstract
Electronic properties and absorption spectra are the grounds to investigate molecular electronic states and their interactions with the environment. Modeling and computations are required for the molecular understanding and design strategies of photo-active materials and sensors. However, the interpretation of such properties demands expensive computations and dealing with the interplay of electronic excited states with the conformational freedom of the chromophores in complex matrices (i.e., solvents, biomolecules, crystals) at finite temperature. Computational protocols combining time dependent density functional theory and ab initio molecular dynamics (MD) have become very powerful in this field, although they require still a large number of computations for a detailed reproduction of electronic properties, such as band shapes. Besides the ongoing research in more traditional computational chemistry fields, data analysis and machine learning methods have been increasingly employed as complementary approaches for efficient data exploration, prediction and model development, starting from the data resulting from MD simulations and electronic structure calculations. In this work, dataset reduction capabilities by unsupervised clustering techniques applied to MD trajectories are proposed and tested for the ab initio modeling of electronic absorption spectra of two challenging case studies: a non-covalent charge-transfer dimer and a ruthenium complex in solution at room temperature. The K-medoids clustering technique is applied and is proven to be able to reduce by ∼100 times the total cost of excited state calculations on an MD sampling with no loss in the accuracy and it also provides an easier understanding of the representative structures (medoids) to be analyzed on the molecular scale.
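
The K-medoids step described in this abstract can be illustrated with a minimal pure-Python sketch that works on a precomputed distance matrix. This is a simple alternating assign/update variant and is not the authors' implementation; the choice of distance (for MD frames, something like a pairwise RMSD would be a natural assumption), the function name, and the random initialization are all illustrative.

```python
import random


def _assign(dist, medoids):
    """Label every item with the position of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
            for i in range(len(dist))]


def k_medoids(dist, k, n_iter=100, seed=0):
    """Cluster n items given a symmetric n x n distance matrix (list of lists).

    Returns (medoids, labels): medoids are item indices, labels[i] is the
    position in `medoids` of the cluster that item i belongs to.
    """
    n = len(dist)
    medoids = random.Random(seed).sample(range(n), k)
    for _ in range(n_iter):
        labels = _assign(dist, medoids)
        new_medoids = list(medoids)
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if members:  # an empty cluster keeps its previous medoid
                # New medoid: member with the smallest summed distance
                # to all other members of the same cluster.
                new_medoids[c] = min(
                    members, key=lambda i: sum(dist[i][j] for j in members))
        if new_medoids == medoids:
            break  # converged
        medoids = new_medoids
    return medoids, _assign(dist, medoids)
```

Because each medoid is an actual member of the data set, the expensive excited-state calculation would only need to be run on the `k` returned frames, with each result weighted by the size of its cluster.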
Affiliation(s)
- Fulvio Perrella
  - Scuola Superiore Meridionale, Largo San Marcellino 10, I-80138 Napoli, Italy
- Federico Coppola
  - Scuola Superiore Meridionale, Largo San Marcellino 10, I-80138 Napoli, Italy
- Nadia Rega
  - Scuola Superiore Meridionale, Largo San Marcellino 10, I-80138 Napoli, Italy
  - Department of Chemical Sciences, University of Napoli Federico II, Complesso Universitario di M.S. Angelo, via Cintia 21, I-80126 Napoli, Italy
  - Istituto Nazionale di Fisica Nucleare, Sezione di Napoli, Complesso Universitario di M.S. Angelo ed. 6, via Cintia 21, I-80126 Napoli, Italy
- Alessio Petrone
  - Scuola Superiore Meridionale, Largo San Marcellino 10, I-80138 Napoli, Italy
  - Department of Chemical Sciences, University of Napoli Federico II, Complesso Universitario di M.S. Angelo, via Cintia 21, I-80126 Napoli, Italy
  - Istituto Nazionale di Fisica Nucleare, Sezione di Napoli, Complesso Universitario di M.S. Angelo ed. 6, via Cintia 21, I-80126 Napoli, Italy
9
Low K, Coote ML, Izgorodina EI. Accurate Prediction of Three-Body Intermolecular Interactions via Electron Deformation Density-Based Machine Learning. J Chem Theory Comput 2023; 19:1466-1475. [PMID: 36787280 DOI: 10.1021/acs.jctc.2c00984] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/15/2023]
Abstract
This work extends the electron deformation density-based descriptor, originally developed in the electron deformation density-based interaction energy machine learning (EDDIE-ML) algorithm to predict dimer interaction energies, to the prediction of three-body interactions in trimers. Using a sequential learning process to select the training data, the resulting Gaussian process regression (GPR) model predicts the three-body interaction energy within 0.2 kcal mol⁻¹ of the SRS-MP2/cc-pVTZ reference values for the 3B69 and S22-3 trimer data sets. A hybrid kernel function is introduced, which combines contributions from the average and individual atomic environments, allowing the total trimer interaction energy to be predicted in addition to the three-body contribution using the same descriptor. To extend the range and diversity of trimer interaction energies available in the literature, a new data set based on a protein-ligand crystal structure is introduced, consisting of 509 structures of a central ligand with two protein fragments. Benchmark calculations are provided for the new data set, which contains significantly larger molecular interactions than current databases in the literature in addition to charged fragments. Compared to density functional theory (DFT)- and wavefunction-based methods for calculating the three-body interaction energy, our model makes predictions in a significantly shorter time frame by reducing the number of required SCF calculations from 7 to 4 performed at the PBE0 level of theory, showcasing the utility and efficiency of our Δ-ML method particularly when applied to larger systems.
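
As background to the "7 SCF calculations" mentioned in this abstract, the conventional supermolecular three-body term of a trimer ABC is assembled from seven total energies: the trimer, its three dimers, and its three monomers. The sketch below shows that standard many-body bookkeeping only; it is not the EDDIE-ML model, and the function names are illustrative. Reading the reduced count of 4 as the trimer plus three monomers is an assumption, since the abstract does not list the calculations.

```python
def three_body_energy(e_abc, e_ab, e_ac, e_bc, e_a, e_b, e_c):
    """Non-additive three-body term of trimer ABC from seven total energies.

    E_3B = E_ABC - (E_AB + E_AC + E_BC) + (E_A + E_B + E_C)
    """
    return e_abc - (e_ab + e_ac + e_bc) + (e_a + e_b + e_c)


def total_interaction_energy(e_abc, e_a, e_b, e_c):
    """Total trimer interaction energy: needs only the trimer and monomers."""
    return e_abc - (e_a + e_b + e_c)
```

For a trimer whose energy is exactly the sum of monomer energies and pairwise interactions, `three_body_energy` returns zero, which is a convenient sanity check when wiring such a function to real electronic-structure output.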
Affiliation(s)
- Kaycee Low
  - Monash Computational Chemistry Group, School of Chemistry, Monash University, Clayton, Victoria 3800, Australia
- Michelle L Coote
  - Institute for Nanoscale Science and Technology, College of Science and Engineering, Flinders University, Bedford Park, South Australia 5042, Australia
- Ekaterina I Izgorodina
  - Monash Computational Chemistry Group, School of Chemistry, Monash University, Clayton, Victoria 3800, Australia
10
Thürlemann M, Böselt L, Riniker S. Regularized by Physics: Graph Neural Network Parametrized Potentials for the Description of Intermolecular Interactions. J Chem Theory Comput 2023; 19:562-579. [PMID: 36633918 PMCID: PMC9878731 DOI: 10.1021/acs.jctc.2c00661] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 06/24/2022] [Indexed: 01/13/2023]
Abstract
Simulations of molecular systems using electronic structure methods are still not feasible for many systems of biological importance. As a result, empirical methods such as force fields (FF) have become an established tool for the simulation of large and complex molecular systems. The parametrization of FF is, however, time-consuming and has traditionally been based on experimental data. Recent years have therefore seen increasing efforts to automatize FF parametrization or to replace FF with machine-learning (ML) based potentials. Here, we propose an alternative strategy to parametrize FF, which makes use of ML and gradient-descent based optimization while retaining a functional form founded in physics. Using a predefined functional form is shown to enable interpretability, robustness, and efficient simulations of large systems over long time scales. To demonstrate the strength of the proposed method, a fixed-charge and a polarizable model are trained on ab initio potential-energy surfaces. Given only information about the constituting elements, the molecular topology, and reference potential energies, the models successfully learn to assign atom types and corresponding FF parameters from scratch. The resulting models and parameters are validated on a wide range of experimentally and computationally derived properties of systems including dimers, pure liquids, and molecular crystals.
Affiliation(s)
- Moritz Thürlemann
  - Laboratory of Physical Chemistry, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
- Lennard Böselt
  - Laboratory of Physical Chemistry, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
- Sereina Riniker
  - Laboratory of Physical Chemistry, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
11
Cheng Y, Verstraelen T. A new framework for frequency-dependent polarizable force fields. J Chem Phys 2022; 157:124106. [PMID: 36182425 DOI: 10.1063/5.0115151] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022] Open
Abstract
A frequency-dependent extension of the polarizable force field "Atom-Condensed Kohn-Sham density functional theory approximated to the second-order" (ACKS2) [Verstraelen et al., J. Chem. Phys. 141, 194114 (2014)] is proposed, referred to as ACKS2ω. The method enables theoretical predictions of dynamical response properties of finite systems after partitioning of the frequency-dependent molecular response function. Parameters in this model are computed simply as expectation values of an electronic wavefunction, and the hardness matrix is entirely reused from ACKS2 as an adiabatic approximation is used. A numerical validation shows that accurate models can already be obtained with atomic monopoles and dipoles. Absorption spectra of 42 organic and inorganic molecular monomers are evaluated using ACKS2ω, and our results agree well with the time-dependent DFT calculations. Also for the calculation of C6 dispersion coefficients, ACKS2ω closely reproduces its TDDFT reference. When parameters for ACKS2ω are derived from a PBE/aug-cc-pVDZ ground state, it reproduces experimental values for 903 organic and inorganic intermolecular pairs with an MAPE of 3.84%. Our results confirm that ACKS2ω offers a solid connection between the quantum-mechanical description of frequency-dependent response and computationally efficient force-field models.
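
The MAPE quoted in this abstract is the mean absolute percentage error. A minimal sketch of that metric follows, assuming the usual definition relative to the reference values; the function name and argument order are illustrative and not taken from the paper.

```python
def mape(predicted, reference):
    """Mean absolute percentage error, in percent, relative to `reference`."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    relative_errors = (abs((p - r) / r) for p, r in zip(predicted, reference))
    return 100.0 * sum(relative_errors) / len(reference)
```

Because each error is divided by its own reference value, the metric weights small and large C6 coefficients equally, which is presumably why it is a common choice for quantities spanning several orders of magnitude.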
Affiliation(s)
- YingXing Cheng
  - Center for Molecular Modeling (CMM), Ghent University, Technologiepark-Zwijnaarde 46, B-9052 Gent, Belgium
- Toon Verstraelen
  - Center for Molecular Modeling (CMM), Ghent University, Technologiepark-Zwijnaarde 46, B-9052 Gent, Belgium
12
Gray M, Herbert JM. Comprehensive Basis-Set Testing of Extended Symmetry-Adapted Perturbation Theory and Assessment of Mixed-Basis Combinations to Reduce Cost. J Chem Theory Comput 2022; 18:2308-2330. [PMID: 35289608 DOI: 10.1021/acs.jctc.1c01302] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 01/04/2023]
Abstract
Hybrid or "extended" symmetry-adapted perturbation theory (XSAPT) replaces traditional SAPT's treatment of dispersion with better performing alternatives while at the same time extending two-body (dimer) SAPT to a many-body treatment of polarization using a self-consistent charge embedding procedure. The present work presents a systematic study of how XSAPT interaction energies and energy components converge with respect to the choice of Gaussian basis set. Errors can be reduced in a systematic way using correlation-consistent basis sets, with aug-cc-pVTZ results converged within <0.1 kcal/mol. Similar (if slightly less systematic) behavior is obtained using Karlsruhe basis sets at much lower cost, and we introduce new versions with limited augmentation that are even more efficient. Pople-style basis sets, which are more efficient still, often afford good results if a large number of polarization functions are included. The dispersion models used in XSAPT afford much faster basis-set convergence as compared to the perturbative description of dispersion in conventional SAPT, meaning that "compromise" basis sets (such as jun-cc-pVDZ) are no longer required and benchmark-quality results can be obtained using triple-ζ basis sets. The use of diffuse functions proves to be essential, especially for the description of hydrogen bonds. The "δ(Hartree-Fock)" correction for high-order induction can be performed in double-ζ basis sets without significant loss of accuracy, leading to a mixed-basis approach that offers 4× speedup over the existing (cubic scaling) XSAPT approach.
Affiliation(s)
- Montgomery Gray
  - Department of Chemistry and Biochemistry, The Ohio State University, Columbus, Ohio 43210, United States
- John M Herbert
  - Department of Chemistry and Biochemistry, The Ohio State University, Columbus, Ohio 43210, United States
13
Low K, Coote ML, Izgorodina EI. Inclusion of More Physics Leads to Less Data: Learning the Interaction Energy as a Function of Electron Deformation Density with Limited Training Data. J Chem Theory Comput 2022; 18:1607-1618. [PMID: 35175045 DOI: 10.1021/acs.jctc.1c01264] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 02/06/2023]
Abstract
Machine learning (ML) approaches to predicting quantum mechanical (QM) properties have made great strides toward achieving the computational chemist's holy grail of structure-based property prediction. In contrast to direct ML methods, which encode a molecule with only structural information, in this work, we show that QM descriptors improve ML predictions of dimer interaction energy, both in terms of accuracy and data efficiency, by incorporating electronic information into the descriptor. We present the electron deformation density interaction energy machine learning (EDDIE-ML) model, which predicts the interaction energy as a function of Hartree-Fock electron deformation density. We compare its performance with leading direct ML schemes and modern DFT methods for the prediction of interaction energies for dimers of varying charge type, size, and intermolecular separation. Under a low-data regime, EDDIE-ML outperforms other direct ML schemes and is the only model readily transferrable to larger, more complex systems including base pair trimers and porous cages. The underlying physical connection between the density and interaction energy enables EDDIE-ML to reach an accuracy comparable to modern DFT functionals in fewer training data points compared to other ML methods.
Affiliation(s)
- Kaycee Low
- Monash Computational Chemistry Group, School of Chemistry, Monash University, Clayton, Victoria 3800, Australia
- Michelle L Coote
- Research School of Chemistry, Australian National University, Canberra, Australian Capital Territory 0200, Australia
- Ekaterina I Izgorodina
- Monash Computational Chemistry Group, School of Chemistry, Monash University, Clayton, Victoria 3800, Australia
|
14
|
Symons BCB, Bane MK, Popelier PLA. DL_FFLUX: A Parallel, Quantum Chemical Topology Force Field. J Chem Theory Comput 2021; 17:7043-7055. [PMID: 34617748 PMCID: PMC8582247 DOI: 10.1021/acs.jctc.1c00595] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 11/29/2022]
Abstract
DL_FFLUX is a force field based on quantum chemical topology that can perform molecular dynamics for flexible molecules endowed with polarizable atomic multipole moments (up to hexadecapole). Using the machine learning method kriging (aka Gaussian process regression), DL_FFLUX has access to atomic properties (energy, charge, dipole moment, etc.) with quantum mechanical accuracy. Newly optimized and parallelized using domain decomposition Message Passing Interface (MPI), DL_FFLUX is now able to deliver this rigorous methodology at scale while still in reasonable time frames. DL_FFLUX is delivered as an add-on to the widely distributed molecular dynamics code DL_POLY 4.08. For the systems studied here (10³–10⁵ atoms), DL_FFLUX is shown to add minimal computational cost to the standard DL_POLY package. In fact, the optimization of the electrostatics in DL_FFLUX means that, when high-rank multipole moments are enabled, DL_FFLUX is up to 1.25× faster than standard DL_POLY. The parallel DL_FFLUX preserves the quality of the scaling of MPI implementation in standard DL_POLY. For the first time, it is feasible to use the full capability of DL_FFLUX to study systems that are large enough to be of real-world interest. For example, a fully flexible, high-rank polarized (up to and including quadrupole moments) 1 ns simulation of a system of 10 125 atoms (3375 water molecules) takes 30 h (wall time) on 18 cores.
Affiliation(s)
- Benjamin C B Symons
- Manchester Institute of Biotechnology (MIB), 131 Princess Street, Manchester M1 7DN, Great Britain; Department of Chemistry, University of Manchester, Oxford Road, Manchester M13 9PL, Great Britain
- Michael K Bane
- High End Compute LTD, 23 Welby Street, Manchester M13 0EL, Great Britain (https://highendcompute.co.uk); Department of Computing and Mathematics, Manchester Metropolitan University, Manchester M15 6BH, Great Britain
- Paul L A Popelier
- Manchester Institute of Biotechnology (MIB), 131 Princess Street, Manchester M1 7DN, Great Britain; Department of Chemistry, University of Manchester, Oxford Road, Manchester M13 9PL, Great Britain
|
15
|
Stoppelman JP, McDaniel JG. Physics-based, neural network force fields for reactive molecular dynamics: Investigation of carbene formation from [EMIM⁺][OAc⁻]. J Chem Phys 2021; 155:104112. [PMID: 34525833 DOI: 10.1063/5.0063187] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/14/2022]
Abstract
Reactive molecular dynamics simulations enable a detailed understanding of solvent effects on chemical reaction mechanisms and reaction rates. While classical molecular dynamics using reactive force fields allows significantly longer simulation time scales and larger system sizes compared with ab initio molecular dynamics, constructing reactive force fields is a difficult and complex task. In this work, we describe a general approach following the empirical valence bond framework for constructing ab initio reactive force fields for condensed phase simulations by combining physics-based methods with neural networks (PB/NNs). The physics-based terms ensure the correct asymptotic behavior of electrostatic, polarization, and dispersion interactions and are compatible with existing solvent force fields. NNs are utilized for a versatile description of short-range orbital interactions within the transition state region and accurate rendering of vibrational motion of the reacting complex. We demonstrate our methodology for a simple deprotonation reaction of the 1-ethyl-3-methylimidazolium cation with acetate to form 1-ethyl-3-methylimidazol-2-ylidene and acetic acid. Our PB/NN force field exhibits ∼1 kJ mol⁻¹ mean absolute error accuracy within the transition state region for the gas-phase complex. To characterize the solvent modulation of the reaction profile, we compute potentials of mean force for the gas-phase reaction as well as the reaction within a four-ion cluster and benchmark against ab initio molecular dynamics simulations. We find that the surrounding ionic environment significantly destabilizes the formation of the carbene product, and we show that this effect is accurately captured by the reactive force field. By construction, the PB/NN potential may be directly employed for simulations of other solvents/chemical environments without additional parameterization.
Affiliation(s)
- John P Stoppelman
- School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332-0400, USA
- Jesse G McDaniel
- School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332-0400, USA
|
16
|
Herbert JM. Neat, Simple, and Wrong: Debunking Electrostatic Fallacies Regarding Noncovalent Interactions. J Phys Chem A 2021; 125:7125-7137. [PMID: 34388340 DOI: 10.1021/acs.jpca.1c05962] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 12/11/2022]
Abstract
Multipole moments such as charge, dipole, and quadrupole are often invoked to rationalize intermolecular phenomena, but a low-order multipole expansion is rarely a valid description of electrostatics at the length scales that characterize nonbonded interactions. This is illustrated by examining several common misunderstandings rooted in erroneous electrostatic arguments. First, the notion that steric repulsion originates in Coulomb interactions is easily disproved by dissecting the interaction potential for Ar₂. Second, the Hunter-Sanders model of π-π interactions, which is based on quadrupolar electrostatics, is shown to have no basis in accurate calculations. Third, curved "buckybowls" exhibit unusually large dipole moments, but these are ancillary to the forces that control their intermolecular interactions, as illustrated by two examples involving corannulene. Finally, the assumption that interactions between water and small anions are dictated by the dipole moment of H₂O is shown to be false in the case of binary halide-water complexes. These examples present a compelling case that electrostatic explanations based on low-order multipole moments are very often counterfactual for nonbonded interactions at close range and should not be taken seriously in the absence of additional justification.
Affiliation(s)
- John M Herbert
- Department of Chemistry and Biochemistry, The Ohio State University, Columbus, Ohio 43210, United States
|