1
Pathirage PDVS, Quebedeaux B, Akram S, Vogiatzis KD. Transferability Across Different Molecular Systems and Levels of Theory with the Data-Driven Coupled-Cluster Scheme. J Phys Chem A 2025; 129:2988-2997. [PMID: 40132101 DOI: 10.1021/acs.jpca.4c05718] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/27/2025]
Abstract
Machine learning has recently been introduced into the arsenal of tools that are available to computational chemists. In the past few years, we have seen an increase in the applicability of these tools on a plethora of applications, including the automated exploration of a large fraction of the chemical space, the reduction of repetitive computational tasks, the detection of outliers on large databases, and the acceleration of molecular simulations. An attractive application of machine learning in molecular electronic structure theory is the "recycling" of molecular wave functions for faster and more accurate completion of complex quantum chemical calculations. Along these lines, we have developed hybrid quantum chemical/machine learning workflows that utilize information from low-level wave functions for the accurate prediction of higher-level wave functions. The data-driven coupled-cluster (DDCC) family of methods is discussed in this article together with the importance of the inclusion of physical properties in such hybrid workflows. After a short introduction to the philosophy and the capabilities of DDCC, we present our recent progress in extending its applicability to larger and more complex molecular structures and data sets. A significant advantage offered by DDCC is its transferability, with respect to different molecular systems and different excitation levels. As we show here, predicted wave functions at the coupled-cluster singles and doubles level of theory can be used for the accurate prediction of the perturbative triples of the CCSD(T) scheme. We conclude with some personal considerations with respect to future directions related to the development of the next generation of such hybrid quantum chemical/machine learning models.
Affiliation(s)
- P D Varuna S Pathirage
  - Department of Chemistry, University of Tennessee, Knoxville, Tennessee 37996-1600, United States
- Brody Quebedeaux
  - Department of Chemistry, University of Tennessee, Knoxville, Tennessee 37996-1600, United States
- Shahzad Akram
  - Department of Chemistry, University of Tennessee, Knoxville, Tennessee 37996-1600, United States
- Konstantinos D Vogiatzis
  - Department of Chemistry, University of Tennessee, Knoxville, Tennessee 37996-1600, United States
2
de Giovanetti M, Cormanich RA, Sauer SPA. On the Performance of Second-Order Polarization Propagator Methods in the Calculation of 1JFC and nJFH NMR Spin-Spin Coupling Constants. J Chem Theory Comput 2024; 20:10453-10467. [PMID: 39611783 DOI: 10.1021/acs.jctc.4c01043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2024]
Abstract
This study evaluates the performance of doubles-corrected random phase approximation (RPA) and higher random phase approximation (HRPA) approaches in predicting nuclear magnetic resonance (NMR) coupling constants involving fluorine. Their performance is benchmarked against experimental data and compared with that of higher-level theoretical methods, specifically second-order polarization propagator (SOPPA) and SOPPA(CCSD). Additionally, we discuss their performance relative to density functional theory (DFT). We find that RPA(D) is severely constrained by (near) triplet instabilities, while HRPA(D) demonstrates markedly improved stability. Statistical analysis reveals stronger patterns for carbon-fluorine couplings across the methods and systems investigated compared with fluorine-hydrogen couplings. While SOPPA-based methodologies prove to be superior in accuracy, HRPA(D) shows promising performance in reducing the computational burden of these calculations, albeit with a tendency to underestimate the coupling strength. These findings highlight the potential of HRPA(D) as a practical alternative to SOPPA methods, even for such difficult properties as NMR spin-spin coupling constants involving fluorine, emphasizing its role in improving predictive accuracy and efficiency across diverse chemical environments.
Affiliation(s)
- Marinella de Giovanetti
  - Department of Chemistry and Hylleraas Centre for Quantum Molecular Sciences, University of Oslo, 0315 Oslo, Norway
- Rodrigo A Cormanich
  - Chemistry Institute, State University of Campinas, P.O. Box 6154, 13083-970 Campinas, SP, Brazil
- Stephan P A Sauer
  - Department of Chemistry, University of Copenhagen, Universitetsparken 5, DK-2100 Copenhagen Ø, Denmark
3
Aldossary A, Campos-Gonzalez-Angulo JA, Pablo-García S, Leong SX, Rajaonson EM, Thiede L, Tom G, Wang A, Avagliano D, Aspuru-Guzik A. In Silico Chemical Experiments in the Age of AI: From Quantum Chemistry to Machine Learning and Back. Advanced Materials (Deerfield Beach, Fla.) 2024; 36:e2402369. [PMID: 38794859 DOI: 10.1002/adma.202402369] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2024] [Revised: 04/28/2024] [Indexed: 05/26/2024]
Abstract
Computational chemistry is an indispensable tool for understanding molecules and predicting chemical properties. However, traditional computational methods face significant challenges due to the difficulty of solving the Schrödinger equation and the increasing computational cost with the size of the molecular system. In response, there has been a surge of interest in applying artificial intelligence (AI) and machine learning (ML) techniques to in silico experiments. Integrating AI and ML into computational chemistry increases the scalability and speed of the exploration of chemical space. However, challenges remain, particularly regarding the reproducibility and transferability of ML models. This review highlights the evolution of ML in learning from, complementing, or replacing traditional computational chemistry for energy and property predictions. Starting from models trained entirely on numerical data, it traces a journey toward the ideal model incorporating or learning the physical laws of quantum mechanics. This paper also reviews existing computational methods and ML models and their intertwining, outlines a roadmap for future research, and identifies areas for improvement and innovation. Ultimately, the goal is to develop AI architectures capable of predicting accurate and transferable solutions to the Schrödinger equation, thereby revolutionizing in silico experiments within chemistry and materials science.
Affiliation(s)
- Abdulrahman Aldossary
  - Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Sergio Pablo-García
  - Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
  - Department of Computer Science, University of Toronto, 40 St. George Street, Toronto, ON, M5S 2E4, Canada
- Shi Xuan Leong
  - Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Ella Miray Rajaonson
  - Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
  - Vector Institute for Artificial Intelligence, 661 University Ave. Suite 710, Toronto, ON, M5G 1M1, Canada
- Luca Thiede
  - Department of Computer Science, University of Toronto, 40 St. George Street, Toronto, ON, M5S 2E4, Canada
  - Vector Institute for Artificial Intelligence, 661 University Ave. Suite 710, Toronto, ON, M5G 1M1, Canada
- Gary Tom
  - Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
  - Vector Institute for Artificial Intelligence, 661 University Ave. Suite 710, Toronto, ON, M5G 1M1, Canada
- Andrew Wang
  - Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Davide Avagliano
  - Chimie ParisTech, PSL University, CNRS, Institute of Chemistry for Life and Health Sciences (iCLeHS UMR 8060), Paris, F-75005, France
- Alán Aspuru-Guzik
  - Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
  - Department of Computer Science, University of Toronto, 40 St. George Street, Toronto, ON, M5S 2E4, Canada
  - Vector Institute for Artificial Intelligence, 661 University Ave. Suite 710, Toronto, ON, M5G 1M1, Canada
  - Department of Materials Science & Engineering, University of Toronto, 184 College St., Toronto, ON, M5S 3E4, Canada
  - Department of Chemical Engineering & Applied Chemistry, University of Toronto, 200 College St., Toronto, ON, M5S 3E5, Canada
  - Lebovic Fellow, Canadian Institute for Advanced Research (CIFAR), 66118 University Ave., Toronto, M5G 1M1, Canada
  - Acceleration Consortium, 80 St George St, Toronto, M5S 3H6, Canada
4
Pathirage PDVS, Phillips JT, Vogiatzis KD. Exploration of the Two-Electron Excitation Space with Data-Driven Coupled Cluster. J Phys Chem A 2024. [PMID: 38422511 DOI: 10.1021/acs.jpca.3c06600] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/02/2024]
Abstract
Computational cost limits the applicability of post-Hartree-Fock methods such as coupled-cluster to larger molecular systems. The data-driven coupled-cluster (DDCC) method applies machine learning to predict the coupled-cluster two-electron amplitudes (t₂) using data from second-order perturbation theory (MP2). One major limitation of the DDCC models is the size of the training sets, which increases exponentially with the system size. Effective sampling of the amplitude space can resolve this issue. Five different amplitude selection techniques that reduce the amount of data used for training were evaluated, an approach that also prevents model overfitting and increases the portability of data-driven coupled-cluster singles and doubles to more complex molecules or larger basis sets. In combination with a localized orbital formalism to predict the CCSD t₂ amplitudes, we have achieved a 10-fold error reduction for energy calculations.
Affiliation(s)
- P D Varuna S Pathirage
  - Department of Chemistry, University of Tennessee, Knoxville, Tennessee 37996-1600, United States
- Justin T Phillips
  - Department of Chemistry, University of Tennessee, Knoxville, Tennessee 37996-1600, United States
- Konstantinos D Vogiatzis
  - Department of Chemistry, University of Tennessee, Knoxville, Tennessee 37996-1600, United States
5
Venturella C, Hillenbrand C, Li J, Zhu T. Machine Learning Many-Body Green's Functions for Molecular Excitation Spectra. J Chem Theory Comput 2024; 20:143-154. [PMID: 38150268 DOI: 10.1021/acs.jctc.3c01146] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/28/2023]
Abstract
We present a machine learning (ML) framework for predicting Green's functions of molecular systems, from which photoemission spectra and quasiparticle energies at quantum many-body level can be obtained. Kernel ridge regression is adopted to predict self-energy matrix elements on compact imaginary frequency grids from static and dynamical mean-field electronic features, which gives direct access to real-frequency many-body Green's functions through analytic continuation and Dyson's equation. Feature and self-energy matrices are represented in a symmetry-adapted intrinsic atomic orbital plus projected atomic orbital basis to enforce rotational invariance. We demonstrate good transferability and high data efficiency of the proposed ML method across molecular sizes and chemical species by showing accurate predictions of density of states (DOS) and quasiparticle energies at the level of many-body perturbation theory (GW) or full configuration interaction. For the ML model trained on 48 out of 1995 molecules randomly sampled from the QM7 and QM9 data sets, we report the mean absolute errors of ML-predicted highest occupied and lowest unoccupied molecular orbital energies to be 0.13 and 0.10 eV, respectively, compared to GW@PBE0. We further showcase the capability of this method by applying the same ML model to predict DOS for significantly larger organic molecules with up to 44 heavy atoms.
Affiliation(s)
- Christian Venturella
  - Department of Chemistry, Yale University, New Haven, Connecticut 06520, United States
- Jiachen Li
  - Department of Chemistry, Yale University, New Haven, Connecticut 06520, United States
- Tianyu Zhu
  - Department of Chemistry, Yale University, New Haven, Connecticut 06520, United States
6
Ng WP, Liang Q, Yang J. Low-Data Deep Quantum Chemical Learning for Accurate MP2 and Coupled-Cluster Correlations. J Chem Theory Comput 2023; 19:5439-5449. [PMID: 37506400 DOI: 10.1021/acs.jctc.3c00518] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/30/2023]
Abstract
Accurate ab initio prediction of electronic energies is very expensive for macromolecules by explicitly solving post-Hartree-Fock equations. We here exploit the physically justified local correlation feature in a compact basis of small molecules and construct an expressive low-data deep neural network (dNN) model to obtain machine-learned electron correlation energies on par with MP2 and CCSD levels of theory for more complex molecules and different datasets that are not represented in the training set. We show that our dNN-powered model is data efficient and makes highly transferable predictions across alkanes of various lengths, organic molecules with non-covalent and biomolecular interactions, as well as water clusters of different sizes and morphologies. In particular, by training 800 (H₂O)₈ clusters with the local correlation descriptors, accurate MP2/cc-pVTZ correlation energies up to (H₂O)₁₂₈ can be predicted with a small random error within chemical accuracy from exact values, while a majority of prediction deviations are attributed to an intrinsically systematic error. Our results reveal that an extremely compact local correlation feature set, which is poor for any direct post-Hartree-Fock calculations, has however a prominent advantage in reserving important electron correlation patterns for making accurate transferable predictions across distinct molecular compositions, bond types, and geometries.
Affiliation(s)
- Wai-Pan Ng
  - Department of Chemistry, The University of Hong Kong, Hong Kong 999077, P. R. China
  - Hong Kong Quantum AI Lab Limited, Hong Kong 999077, P. R. China
- Qiujiang Liang
  - Department of Chemistry, The University of Hong Kong, Hong Kong 999077, P. R. China
- Jun Yang
  - Department of Chemistry, The University of Hong Kong, Hong Kong 999077, P. R. China
  - Hong Kong Quantum AI Lab Limited, Hong Kong 999077, P. R. China
7
Ruth M, Gerbig D, Schreiner PR. Machine Learning for Bridging the Gap between Density Functional Theory and Coupled Cluster Energies. J Chem Theory Comput 2023. [PMID: 37418619 DOI: 10.1021/acs.jctc.3c00274] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 07/09/2023]
Abstract
Accurate electronic energies and properties are crucial for successful reaction design and mechanistic investigations. Computing energies and properties of molecular structures has proven extremely useful, and, with increasing computational power, the limits of high-level approaches (such as coupled cluster theory) are expanding to ever larger systems. However, because scaling is highly unfavorable, these methods are still not universally applicable to larger systems. To address the need for fast and accurate electronic energies of larger systems, we created a database of around 8000 small organic monomers (2000 dimers) optimized at the B3LYP-D3(BJ)/cc-pVTZ level of theory. This database also includes single-point energies computed at various levels of theory, including PBE1PBE, ωB97X, M06-2X, revTPSS, B3LYP, and BP86, for density functional theory as well as DLPNO-CCSD(T) and CCSD(T) for coupled cluster theory, all in conjunction with a cc-pVTZ basis. We used this database to train machine learning models based on graph neural networks using two different graph representations. Our models are able to make energy predictions from B3LYP-D3(BJ)/cc-pVTZ inputs to CCSD(T)/cc-pVTZ outputs with a mean absolute error of 0.78 and to DLPNO-CCSD(T)/cc-pVTZ with a mean absolute error of 0.50 and 0.18 kcal mol⁻¹ for monomers and dimers, respectively. The model for dimers was further validated on the S22 database, and the monomer model was tested on challenging systems, including those with highly conjugated or functionally complex molecules.
Affiliation(s)
- Marcel Ruth
  - Institute of Organic Chemistry, Justus Liebig University, Heinrich-Buff-Ring 17, 35392 Giessen, Germany
- Dennis Gerbig
  - Institute of Organic Chemistry, Justus Liebig University, Heinrich-Buff-Ring 17, 35392 Giessen, Germany
- Peter R Schreiner
  - Institute of Organic Chemistry, Justus Liebig University, Heinrich-Buff-Ring 17, 35392 Giessen, Germany
8
Jones GM, Li RR, DePrince AE, Vogiatzis KD. Data-Driven Refinement of Electronic Energies from Two-Electron Reduced-Density-Matrix Theory. J Phys Chem Lett 2023:6377-6385. [PMID: 37418691 DOI: 10.1021/acs.jpclett.3c01382] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/09/2023]
Abstract
The exponential computational cost of describing strongly correlated electrons can be mitigated by adopting a reduced-density matrix (RDM)-based description of the electronic structure. While variational two-electron RDM (v2RDM) methods can enable large-scale calculations on such systems, the quality of the solution is limited by the fact that only a subset of known necessary N-representability constraints can be applied to the 2RDM in practical calculations. Here, we demonstrate that violations of partial three-particle (T1 and T2) N-representability conditions, which can be evaluated with knowledge of only the 2RDM, can serve as physics-based features in a machine-learning (ML) protocol for improving energies from v2RDM calculations that consider only two-particle (PQG) conditions. Proof-of-principle calculations demonstrate that the model yields substantially improved energies relative to reference values from configuration-interaction-based calculations.
Affiliation(s)
- Grier M Jones
  - Department of Chemistry, University of Tennessee, Knoxville, Tennessee 37996, United States
- Run R Li
  - Department of Chemistry and Biochemistry, Florida State University, Tallahassee, Florida 32306-4390, United States
- A Eugene DePrince
  - Department of Chemistry and Biochemistry, Florida State University, Tallahassee, Florida 32306-4390, United States
9
Schrader SE, Kvaal S. Accelerated coupled cluster calculations with Procrustes orbital interpolation. J Chem Phys 2023; 158:114116. [PMID: 36948808 DOI: 10.1063/5.0141145] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/02/2023] Open
Abstract
The coupled cluster method is considered a gold standard in quantum chemistry, reliably giving energies that are exact within chemical accuracy (1.6 mhartree). However, even in the coupled cluster single-double (CCSD) approximation, where the cluster operator is truncated to include only single and double excitations, the method scales as O(N^6) in the number of electrons, and the cluster operator needs to be solved for iteratively, increasing the computation time. Inspired by eigenvector continuation, we present here an algorithm making use of the Gaussian processes that provides an improved initial guess for the coupled cluster amplitudes. The cluster operator is written as a linear combination of sample cluster operators that are obtained at particular sample geometries. By reusing the cluster operators from previous calculations in that way, it is possible to obtain a start guess for the amplitudes that surpasses both MP2 guesses and "previous geometry"-guesses in terms of the number of necessary iterations. As this improved guess is very close to the exact cluster operator, it can be used directly to calculate the CCSD energy to chemical accuracy, giving approximate CCSD energies scaling as O(N^5).
Affiliation(s)
- Simon Elias Schrader
  - Hylleraas Centre for Quantum Molecular Sciences, Department of Chemistry, University of Oslo, P.O. Box 1033 Blindern, N-0315 Oslo, Norway
- Simen Kvaal
  - Hylleraas Centre for Quantum Molecular Sciences, Department of Chemistry, University of Oslo, P.O. Box 1033 Blindern, N-0315 Oslo, Norway
10
Bowman JM, Qu C, Conte R, Nandi A, Houston PL, Yu Q. Δ-Machine Learned Potential Energy Surfaces and Force Fields. J Chem Theory Comput 2023; 19:1-17. [PMID: 36527383 DOI: 10.1021/acs.jctc.2c01034] [Citation(s) in RCA: 36] [Impact Index Per Article: 18.0] [Indexed: 12/23/2022]
Abstract
There has been great progress in developing machine-learned potential energy surfaces (PESs) for molecules and clusters with more than 10 atoms. Unfortunately, this number of atoms generally limits the level of electronic structure theory to less than the "gold standard" CCSD(T) level. Indeed, for the well-known MD17 dataset for molecules with 9-20 atoms, all of the energies and forces were obtained with DFT calculations (PBE). This Perspective is focused on a Δ-machine learning method that we recently proposed and applied to bring DFT-based PESs to close to CCSD(T) accuracy. This is demonstrated for hydronium, N-methylacetamide, acetyl acetone, and ethanol. For 15-atom tropolone, it appears that special approaches (e.g., molecular tailoring, local CCSD(T)) are needed to obtain the CCSD(T) energies. A new aspect of this approach is the extension of Δ-machine learning to force fields. The approach is based on many-body corrections to polarizable force field potentials. This is examined in detail using the TTM2.1 water potential. The corrections make use of our recent CCSD(T) datasets for 2-b, 3-b, and 4-b interactions for water. These datasets were used to develop a new fully ab initio potential for water, termed q-AQUA.
Affiliation(s)
- Joel M Bowman
  - Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, United States
- Chen Qu
  - Independent Researcher, Toronto, Canada 66777
- Riccardo Conte
  - Dipartimento di Chimica, Università Degli Studi di Milano, via Golgi 19, 20133 Milano, Italy
- Apurba Nandi
  - Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, United States
- Paul L Houston
  - Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York 14853, United States
  - Department of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Qi Yu
  - Department of Chemistry, Yale University, New Haven, Connecticut 06520, United States
11
Cheng L, Sun J, Miller TF. Accurate Molecular-Orbital-Based Machine Learning Energies via Unsupervised Clustering of Chemical Space. J Chem Theory Comput 2022; 18:4826-4835. [PMID: 35858242 DOI: 10.1021/acs.jctc.2c00396] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 11/29/2022]
Abstract
We introduce an unsupervised clustering algorithm to improve training efficiency and accuracy in predicting energies using molecular-orbital-based machine learning (MOB-ML). This work determines clusters via the Gaussian mixture model (GMM) in an entirely automatic manner and simplifies an earlier supervised clustering approach [ J. Chem. Theory Comput. 2019, 15, 6668] by eliminating both the necessity for user-specified parameters and the training of an additional classifier. Unsupervised clustering results from GMM have the advantages of accurately reproducing chemically intuitive groupings of frontier molecular orbitals and exhibiting improved performance with an increasing number of training examples. The resulting clusters from supervised or unsupervised clustering are further combined with scalable Gaussian process regression (GPR) or linear regression (LR) to learn molecular energies accurately by generating a local regression model in each cluster. Among all four combinations of regressors and clustering methods, GMM combined with scalable exact GPR (GMM/GPR) is the most efficient training protocol for MOB-ML. The numerical tests of molecular energy learning on thermalized data sets of drug-like molecules demonstrate the improved accuracy, transferability, and learning efficiency of GMM/GPR over other training protocols for MOB-ML, i.e., supervised regression clustering combined with GPR (RC/GPR) and GPR without clustering. GMM/GPR also provides the best molecular energy predictions compared with ones from the literature on the same benchmark data sets. With a lower scaling, GMM/GPR has a 10.4-fold speedup in wall-clock training time compared with scalable exact GPR with a training size of 6500 QM7b-T molecules.
Affiliation(s)
- Lixue Cheng
  - Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States
- Jiace Sun
  - Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States
- Thomas F Miller
  - Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States
12
Ruth M, Gerbig D, Schreiner PR. Machine Learning of Coupled Cluster (T)-Energy Corrections via Delta (Δ)-Learning. J Chem Theory Comput 2022; 18:4846-4855. [PMID: 35816588 DOI: 10.1021/acs.jctc.2c00501] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/12/2022]
Abstract
Accurate thermochemistry is essential in many chemical disciplines, such as astro-, atmospheric, or combustion chemistry. These areas often involve fleetingly existent intermediates whose thermochemistry is difficult to assess. Whenever direct calorimetric experiments are infeasible, accurate computational estimates of relative molecular energies are required. However, high-level computations, often using coupled cluster theory, are generally resource-intensive. To expedite the process using machine learning techniques, we generated a database of energies for small organic molecules at the CCSD(T)/cc-pVDZ, CCSD(T)/aug-cc-pVDZ, and CCSD(T)/cc-pVTZ levels of theory. Leveraging the power of deep learning by employing graph neural networks, we are able to predict the effect of perturbatively included triples (T), that is, the difference between CCSD and CCSD(T) energies, with a mean absolute error of 0.25, 0.25, and 0.28 kcal mol⁻¹ (R² of 0.998, 0.997, and 0.998) with the cc-pVDZ, aug-cc-pVDZ, and cc-pVTZ basis sets, respectively. Our models were further validated by application to three validation sets taken from the S22 Database as well as to a selection of known theoretically challenging cases.
Affiliation(s)
- Marcel Ruth
  - Institute of Organic Chemistry, Justus Liebig University, Heinrich-Buff-Ring 17, 35392 Giessen, Germany
- Dennis Gerbig
  - Institute of Organic Chemistry, Justus Liebig University, Heinrich-Buff-Ring 17, 35392 Giessen, Germany
- Peter R Schreiner
  - Institute of Organic Chemistry, Justus Liebig University, Heinrich-Buff-Ring 17, 35392 Giessen, Germany
13
Nandy A, Duan C, Kulik HJ. Audacity of huge: overcoming challenges of data scarcity and data quality for machine learning in computational materials discovery. Curr Opin Chem Eng 2022. [DOI: 10.1016/j.coche.2021.100778] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 01/22/2023]
14
Jeong W, Gaggioli CA, Gagliardi L. Active Learning Configuration Interaction for Excited-State Calculations of Polycyclic Aromatic Hydrocarbons. J Chem Theory Comput 2021; 17:7518-7530. [PMID: 34787422 PMCID: PMC8675132 DOI: 10.1021/acs.jctc.1c00769] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 07/31/2021] [Indexed: 11/30/2022]
Abstract
We present the active learning configuration interaction (ALCI) method for multiconfigurational calculations based on large active spaces. ALCI leverages the use of an active learning procedure to find important electronic configurations among the full configurational space generated within an active space. We tested it for the calculation of singlet-singlet excited states of acenes and pyrene using different machine learning algorithms. The ALCI method yields excitation energies within 0.2-0.3 eV from those obtained by traditional complete active-space configuration interaction (CASCI) calculations (affordable for active spaces up to 16 electrons in 16 orbitals) by including only a small fraction of the CASCI configuration space in the calculations. For larger active spaces (we tested up to 26 electrons in 26 orbitals), not affordable with traditional CI methods, ALCI captures the trends of experimental excitation energies. Overall, ALCI provides satisfactory approximations to large active-space wave functions with up to 10 orders of magnitude fewer determinants for the systems presented here. These ALCI wave functions are promising and affordable starting points for the subsequent second-order perturbation theory or pair-density functional theory calculations.
Affiliation(s)
- WooSeok Jeong
  - Department of Chemistry, Nanoporous Materials Genome Center, Chemical Theory Center, and Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, Minnesota 55455, United States
- Carlo Alberto Gaggioli
  - Department of Chemistry, Pritzker School of Molecular Engineering, James Franck Institute, Chicago Center for Theoretical Chemistry, University of Chicago, Chicago, Illinois 60637, United States
- Laura Gagliardi
  - Department of Chemistry, Pritzker School of Molecular Engineering, James Franck Institute, Chicago Center for Theoretical Chemistry, University of Chicago, Chicago, Illinois 60637, United States
  - Argonne National Laboratory, Lemont, Illinois 60439, United States
15
Westermayr J, Marquetand P. Machine Learning for Electronically Excited States of Molecules. Chem Rev 2021; 121:9873-9926. [PMID: 33211478 PMCID: PMC8391943 DOI: 10.1021/acs.chemrev.0c00749] [Citation(s) in RCA: 198] [Impact Index Per Article: 49.5] [Received: 07/17/2020] [Indexed: 12/11/2022]
Abstract
Electronically excited states of molecules are at the heart of photochemistry, photophysics, as well as photobiology and also play a role in material science. Their theoretical description requires highly accurate quantum chemical calculations, which are computationally expensive. In this review, we focus on not only how machine learning is employed to speed up such excited-state simulations but also how this branch of artificial intelligence can be used to advance this exciting research field in all its aspects. Discussed applications of machine learning for excited states include excited-state dynamics simulations, static calculations of absorption spectra, as well as many others. In order to put these studies into context, we discuss the promises and pitfalls of the involved machine learning techniques. Since the latter are mostly based on quantum chemistry calculations, we also provide a short introduction into excited-state electronic structure methods and approaches for nonadiabatic dynamics simulations and describe tricks and problems when using them in machine learning for excited states of molecules.
Affiliation(s)
- Julia Westermayr
- Institute of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Währinger Strasse 17, 1090 Vienna, Austria
| | - Philipp Marquetand
- Institute of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Währinger Strasse 17, 1090 Vienna, Austria
- Vienna Research Platform on Accelerating Photoreaction Discovery, University of Vienna, Währinger Strasse 17, 1090 Vienna, Austria
- Data Science @ Uni Vienna, University of Vienna, Währinger Strasse 29, 1090 Vienna, Austria
| |
|
16
|
Nandy A, Duan C, Taylor MG, Liu F, Steeves AH, Kulik HJ. Computational Discovery of Transition-metal Complexes: From High-throughput Screening to Machine Learning. Chem Rev 2021; 121:9927-10000. [PMID: 34260198 DOI: 10.1021/acs.chemrev.1c00347] [Citation(s) in RCA: 104] [Impact Index Per Article: 26.0] [Indexed: 12/13/2022]
Abstract
Transition-metal complexes are attractive targets for the design of catalysts and functional materials. The behavior of the metal-organic bond, while very tunable for achieving target properties, is challenging to predict and necessitates searching a wide and complex space to identify needles in haystacks for target applications. This review will focus on the techniques that make high-throughput search of transition-metal chemical space feasible for the discovery of complexes with desirable properties. The review will cover the development, promise, and limitations of "traditional" computational chemistry (i.e., force field, semiempirical, and density functional theory methods) as it pertains to data generation for inorganic molecular discovery. The review will also discuss the opportunities and limitations in leveraging experimental data sources. We will focus on how advances in statistical modeling, artificial intelligence, multiobjective optimization, and automation accelerate discovery of lead compounds and design rules. The overall objective of this review is to showcase how bringing together advances from diverse areas of computational chemistry and computer science have enabled the rapid uncovering of structure-property relationships in transition-metal chemistry. We aim to highlight how unique considerations in motifs of metal-organic bonding (e.g., variable spin and oxidation state, and bonding strength/nature) set them and their discovery apart from more commonly considered organic molecules. We will also highlight how uncertainty and relative data scarcity in transition-metal chemistry motivate specific developments in machine learning representations, model training, and in computational chemistry. Finally, we will conclude with an outlook of areas of opportunity for the accelerated discovery of transition-metal complexes.
Affiliation(s)
- Aditya Nandy
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States.,Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Chenru Duan
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States.,Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Michael G Taylor
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Fang Liu
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Adam H Steeves
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Heather J Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| |
|
17
|
|
18
|
Westermayr J, Gastegger M, Schütt KT, Maurer RJ. Perspective on integrating machine learning into computational chemistry and materials science. J Chem Phys 2021; 154:230903. [PMID: 34241249 DOI: 10.1063/5.0047760] [Citation(s) in RCA: 81] [Impact Index Per Article: 20.3] [Indexed: 01/12/2023]
Abstract
Machine learning (ML) methods are being used in almost every conceivable area of electronic structure theory and molecular simulation. In particular, ML has become firmly established in the construction of high-dimensional interatomic potentials. Not a day goes by without another proof of principle being published on how ML methods can represent and predict quantum mechanical properties-be they observable, such as molecular polarizabilities, or not, such as atomic charges. As ML is becoming pervasive in electronic structure theory and molecular simulation, we provide an overview of how atomistic computational modeling is being transformed by the incorporation of ML approaches. From the perspective of the practitioner in the field, we assess how common workflows to predict structure, dynamics, and spectroscopy are affected by ML. Finally, we discuss how a tighter and lasting integration of ML methods with computational chemistry and materials science can be achieved and what it will mean for research practice, software development, and postgraduate training.
Affiliation(s)
- Julia Westermayr
- Department of Chemistry, University of Warwick, Gibbet Hill Road, Coventry CV4 7AL, United Kingdom
| | - Michael Gastegger
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
| | - Kristof T Schütt
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
| | - Reinhard J Maurer
- Department of Chemistry, University of Warwick, Gibbet Hill Road, Coventry CV4 7AL, United Kingdom
| |
|
19
|
King DS, Gagliardi L. A Ranked-Orbital Approach to Select Active Spaces for High-Throughput Multireference Computation. J Chem Theory Comput 2021; 17:2817-2831. [PMID: 33860669 DOI: 10.1021/acs.jctc.1c00037] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 11/28/2022]
Abstract
The past decade has seen a great increase in the application of high-throughput computation to a variety of important problems in chemistry. However, one area which has been resistant to the high-throughput approach is multireference wave function methods, in large part due to the technicalities of setting up these calculations and in particular the not always intuitive challenge of active space selection. As we look toward a future of applying high-throughput computation to all areas of chemistry, it is important to prepare these methods for large-scale automation. Here, we propose a ranked-orbital approach to select active spaces with the goal of standardizing multireference methods for high-throughput computation. This method allows for the meaningful comparison of different active space selection schemes and orbital localizations, and we demonstrate the utility of this approach across 1120 multireference calculations for the excitation energies of small molecules. Our results reveal that it is helpful to distinguish the method used to generate orbitals from the method of ranking orbitals in terms of importance for the active space. Additionally, we propose our own orbital ranking scheme that estimates the importance of an orbital for the active space through a pair-interaction framework from orbital energies and features of the Hartree-Fock exchange matrix. We call this new scheme the "approximate pair coefficient" (APC) method and we show that it performs quite well for the test systems presented.
Affiliation(s)
- Daniel S King
- Department of Chemistry, University of Chicago, Chicago, Illinois 60637, United States
| | - Laura Gagliardi
- Department of Chemistry, University of Chicago, Chicago, Illinois 60637, United States
| |
|
20
|
Agarawal V, Roy S, Chakraborty A, Maitra R. Accelerating coupled cluster calculations with nonlinear dynamics and supervised machine learning. J Chem Phys 2021; 154:044110. [PMID: 33514076 DOI: 10.1063/5.0037090] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 02/06/2023]
Abstract
In this paper, the iteration scheme associated with single reference coupled cluster theory has been analyzed using nonlinear dynamics. The phase space analysis indicates the presence of a few significant cluster amplitudes, mostly involving valence excitations, that dictate the dynamics, while all other amplitudes are enslaved. Starting with a few initial iterations to establish the inter-relationship among the cluster amplitudes, a supervised machine learning scheme with a polynomial kernel ridge regression model has been employed to express each of the enslaved amplitudes uniquely in terms of the former set of amplitudes. The subsequent coupled cluster iterations are restricted solely to determine those significant excitations, and the enslaved amplitudes are determined through the already established functional mapping. We will show that our hybrid scheme leads to a significant reduction in the computational time without sacrificing the accuracy.
Affiliation(s)
- Valay Agarawal
- Department of Chemistry, Indian Institute of Technology Bombay, Mumbai, India
| | - Samrendra Roy
- Department of Energy Science and Engineering, Indian Institute of Technology Bombay, Mumbai, India
| | - Anish Chakraborty
- Department of Chemistry, Indian Institute of Technology Bombay, Mumbai, India
| | - Rahul Maitra
- Department of Chemistry, Indian Institute of Technology Bombay, Mumbai, India
| |
|
21
|
Head-Marsden K, Flick J, Ciccarino CJ, Narang P. Quantum Information and Algorithms for Correlated Quantum Matter. Chem Rev 2020; 121:3061-3120. [PMID: 33326218 DOI: 10.1021/acs.chemrev.0c00620] [Citation(s) in RCA: 40] [Impact Index Per Article: 8.0] [Indexed: 12/13/2022]
Abstract
Discoveries in quantum materials, which are characterized by the strongly quantum-mechanical nature of electrons and atoms, have revealed exotic properties that arise from correlations. It is the promise of quantum materials for quantum information science superimposed with the potential of new computational quantum algorithms to discover new quantum materials that inspires this Review. We anticipate that quantum materials to be discovered and developed in the next years will transform the areas of quantum information processing including communication, storage, and computing. Simultaneously, efforts toward developing new quantum algorithmic approaches for quantum simulation and advanced calculation methods for many-body quantum systems enable major advances toward functional quantum materials and their deployment. The advent of quantum computing brings new possibilities for eliminating the exponential complexity that has stymied simulation of correlated quantum systems on high-performance classical computers. Here, we review new algorithms and computational approaches to predict and understand the behavior of correlated quantum matter. The strongly interdisciplinary nature of the topics covered necessitates a common language to integrate ideas from these fields. We aim to provide this common language while weaving together fields across electronic structure theory, quantum electrodynamics, algorithm design, and open quantum systems. Our Review is timely in presenting the state-of-the-art in the field toward algorithms with nonexponential complexity for correlated quantum matter with applications in grand-challenge problems. 
Looking to the future, at the intersection of quantum information science and algorithms for correlated quantum matter, we envision seminal advances in predicting many-body quantum states and describing excitonic quantum matter and large-scale entangled states, a better understanding of high-temperature superconductivity, and quantifying open quantum system dynamics.
Affiliation(s)
- Kade Head-Marsden
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts 02138, United States
| | - Johannes Flick
- Center for Computational Quantum Physics, Flatiron Institute, New York, New York 10010, United States
| | - Christopher J Ciccarino
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts 02138, United States.,Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts 02138, United States
| | - Prineha Narang
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts 02138, United States
| |
|
22
|
|
23
|
Ikabata Y, Fujisawa R, Seino J, Yoshikawa T, Nakai H. Machine-learned electron correlation model based on frozen core approximation. J Chem Phys 2020; 153:184108. [DOI: 10.1063/5.0021281] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 12/16/2022]
Affiliation(s)
- Yasuhiro Ikabata
- Waseda Research Institute for Science and Engineering, Waseda University, 3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
| | - Ryo Fujisawa
- Department of Chemistry and Biochemistry, School of Advanced Science and Engineering, Waseda University, 3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
| | - Junji Seino
- Waseda Research Institute for Science and Engineering, Waseda University, 3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
- PRESTO, Japan Science and Technology Agency, 4-1-8 Honcho, Kawaguchi, Saitama 332-0012, Japan
| | - Takeshi Yoshikawa
- Waseda Research Institute for Science and Engineering, Waseda University, 3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
- Faculty of Pharmaceutical Sciences, Toho University, 2-2-1 Miyama, Funabashi, Chiba 274-8510, Japan
| | - Hiromi Nakai
- Waseda Research Institute for Science and Engineering, Waseda University, 3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
- Department of Chemistry and Biochemistry, School of Advanced Science and Engineering, Waseda University, 3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
- Elements Strategy Initiative for Catalysts and Batteries (ESICB), Kyoto University, Katsura, Kyoto 615-8520, Japan
| |
|
24
|
Townsend J, Vogiatzis KD. Transferable MP2-Based Machine Learning for Accurate Coupled-Cluster Energies. J Chem Theory Comput 2020; 16:7453-7461. [DOI: 10.1021/acs.jctc.0c00927] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 12/31/2022]
Affiliation(s)
- Jacob Townsend
- Department of Chemistry, University of Tennessee, Knoxville, Tennessee 37996, United States
| | | |
|
25
|
Westermayr J, Marquetand P. Machine learning and excited-state molecular dynamics. Machine Learning: Science and Technology 2020. [DOI: 10.1088/2632-2153/ab9c3e] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Indexed: 12/21/2022]
|
26
|
Yang PJ, Sugiyama M, Tsuda K, Yanai T. Artificial Neural Networks Applied as Molecular Wave Function Solvers. J Chem Theory Comput 2020; 16:3513-3529. [PMID: 32320233 DOI: 10.1021/acs.jctc.9b01132] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Indexed: 11/28/2022]
Abstract
We use artificial neural networks (ANNs) based on the Boltzmann machine (BM) architectures as an encoder of ab initio molecular many-electron wave functions represented with the complete active space configuration interaction (CAS-CI) model. As first introduced by the work of Carleo and Troyer for physical systems, the coefficients of the electronic configurations in the CI expansion are parametrized with the BMs as a function of their occupancies that act as descriptors. This ANN-based wave function ansatz is referred to as the neural-network quantum state (NQS). The machine learning is used for training the BMs in terms of finding a variationally optimal form of the ground-state wave function on the basis of the energy minimization. It is relevant to reinforcement learning and does not use any reference data nor prior knowledge of the wave function, while the Hamiltonian is given based on a user-specified chemical structure in the first-principles manner. Carleo and Troyer used the restricted Boltzmann machine (RBM), which has hidden units, for the neural network architecture of NQS, while, in this study, we further introduce its replacement with the BM that has only visible units but with different orders of connectivity. For this hidden-node free BM, the second- and third-order BMs based on quadratic and cubic energy functions, respectively, were implemented. We denote these second- and third-order BMs as BM2 and BM3, respectively. The pilot implementation of the NQS solver into an exact diagonalization module of the quantum chemistry program was made to assess the capability of variants of the BM-based NQS. The test calculations were performed by determining the CAS-CI wave functions of illustrative molecular systems, indocyanine green, and dinitrogen dissociation. The simulated energies have been shown to converge to CAS-CI energy in most cases by improving RBM with an increasing number of hidden nodes. 
BM3 systematically yields lower energies than BM2, reproducing the CAS-CI energies of dinitrogen across potential energy curves within an error of 50 μEh.
Affiliation(s)
- Peng-Jian Yang
- Department of Chemistry, Nagoya University, Furocho, Chikusa Ward, Nagoya, Aichi 464-8601, Japan
| | - Mahito Sugiyama
- National Institute of Informatics, 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo 101-8430, Japan.,JST, PRESTO, 4-1-8 Honcho, Kawaguchi, Saitama 332-0012, Japan
| | - Koji Tsuda
- Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwa-no-ha, Kashiwa, Chiba 277-8561, Japan.,RIKEN Center for Advanced Intelligence Project, 1-4-1 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan.,Research and Services Division of Materials Data and Integrated System, National Institute for Materials Science, Ibaraki 305-0047, Japan
| | - Takeshi Yanai
- Department of Chemistry, Nagoya University, Furocho, Chikusa Ward, Nagoya, Aichi 464-8601, Japan.,Institute of Transformative Bio-Molecules (WPI-ITbM), Nagoya University, Furocho, Chikusa Ward, Nagoya, Aichi 464-8601, Japan.,JST, PRESTO, 4-1-8 Honcho, Kawaguchi, Saitama 332-0012, Japan
| |
|
27
|
Peyton BG, Briggs C, D’Cunha R, Margraf JT, Crawford TD. Machine-Learning Coupled Cluster Properties through a Density Tensor Representation. J Phys Chem A 2020; 124:4861-4871. [DOI: 10.1021/acs.jpca.0c02804] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Indexed: 01/27/2023]
Affiliation(s)
- Benjamin G. Peyton
- Department of Chemistry, Virginia Tech, Blacksburg, Virginia 24061, United States
| | - Connor Briggs
- Department of Chemistry, Virginia Tech, Blacksburg, Virginia 24061, United States
| | - Ruhee D’Cunha
- Department of Chemistry, Virginia Tech, Blacksburg, Virginia 24061, United States
| | - Johannes T. Margraf
- Chair for Theoretical Chemistry, Technische Universität München, Lichtenbergstrasse 4, D-85747 Garching, Germany
| | - T. Daniel Crawford
- Department of Chemistry, Virginia Tech, Blacksburg, Virginia 24061, United States
| |
|
28
|
Smith DGA, Burns LA, Simmonett AC, Parrish RM, Schieber MC, Galvelis R, Kraus P, Kruse H, Di Remigio R, Alenaizan A, James AM, Lehtola S, Misiewicz JP, Scheurer M, Shaw RA, Schriber JB, Xie Y, Glick ZL, Sirianni DA, O’Brien JS, Waldrop JM, Kumar A, Hohenstein EG, Pritchard BP, Brooks BR, Schaefer HF, Sokolov AY, Patkowski K, DePrince AE, Bozkaya U, King RA, Evangelista FA, Turney JM, Crawford TD, Sherrill CD. Psi4 1.4: Open-source software for high-throughput quantum chemistry. J Chem Phys 2020; 152:184108. [PMID: 32414239 PMCID: PMC7228781 DOI: 10.1063/5.0006002] [Citation(s) in RCA: 486] [Impact Index Per Article: 97.2] [Received: 02/26/2020] [Accepted: 04/12/2020] [Indexed: 12/13/2022]
Abstract
PSI4 is a free and open-source ab initio electronic structure program providing implementations of Hartree-Fock, density functional theory, many-body perturbation theory, configuration interaction, density cumulant theory, symmetry-adapted perturbation theory, and coupled-cluster theory. Most of the methods are quite efficient, thanks to density fitting and multi-core parallelism. The program is a hybrid of C++ and Python, and calculations may be run with very simple text files or using the Python API, facilitating post-processing and complex workflows; method developers also have access to most of PSI4's core functionalities via Python. Job specification may be passed using The Molecular Sciences Software Institute (MolSSI) QCSCHEMA data format, facilitating interoperability. A rewrite of our top-level computation driver, and concomitant adoption of the MolSSI QCARCHIVE INFRASTRUCTURE project, makes the latest version of PSI4 well suited to distributed computation of large numbers of independent tasks. The project has fostered the development of independent software components that may be reused in other quantum chemistry programs.
Affiliation(s)
| | - Lori A. Burns
- Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0400, USA
| | - Andrew C. Simmonett
- National Institutes of Health – National Heart, Lung and Blood Institute, Laboratory of Computational Biology, Bethesda, Maryland 20892, USA
| | - Robert M. Parrish
- Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0400, USA
| | - Matthew C. Schieber
- Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0400, USA
| | | | - Peter Kraus
- School of Molecular and Life Sciences, Curtin University, Kent St., Bentley, Perth, Western Australia 6102, Australia
| | - Holger Kruse
- Institute of Biophysics of the Czech Academy of Sciences, Královopolská 135, 612 65 Brno, Czech Republic
| | - Roberto Di Remigio
- Department of Chemistry, Centre for Theoretical and Computational Chemistry, UiT, The Arctic University of Norway, N-9037 Tromsø, Norway
| | - Asem Alenaizan
- Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0400, USA
| | - Andrew M. James
- Department of Chemistry, Virginia Tech, Blacksburg, Virginia 24061, USA
| | - Susi Lehtola
- Department of Chemistry, University of Helsinki, P.O. Box 55 (A. I. Virtasen aukio 1), FI-00014 Helsinki, Finland
| | - Jonathon P. Misiewicz
- Center for Computational Quantum Chemistry, University of Georgia, Athens, Georgia 30602, USA
| | - Maximilian Scheurer
- Interdisciplinary Center for Scientific Computing, Heidelberg University, D-69120 Heidelberg, Germany
| | - Robert A. Shaw
- ARC Centre of Excellence in Exciton Science, School of Science, RMIT University, Melbourne, VIC 3000, Australia
| | - Jeffrey B. Schriber
- Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0400, USA
| | - Yi Xie
- Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0400, USA
| | - Zachary L. Glick
- Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0400, USA
| | - Dominic A. Sirianni
- Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0400, USA
| | - Joseph Senan O’Brien
- Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0400, USA
| | - Jonathan M. Waldrop
- Department of Chemistry and Biochemistry, Auburn University, Auburn, Alabama 36849, USA
| | - Ashutosh Kumar
- Department of Chemistry, Virginia Tech, Blacksburg, Virginia 24061, USA
| | - Edward G. Hohenstein
- SLAC National Accelerator Laboratory, Stanford PULSE Institute, Menlo Park, California 94025, USA
| | | | - Bernard R. Brooks
- National Institutes of Health – National Heart, Lung and Blood Institute, Laboratory of Computational Biology, Bethesda, Maryland 20892, USA
| | - Henry F. Schaefer
- Center for Computational Quantum Chemistry, University of Georgia, Athens, Georgia 30602, USA
| | - Alexander Yu. Sokolov
- Department of Chemistry and Biochemistry, The Ohio State University, Columbus, Ohio 43210, USA
| | - Konrad Patkowski
- Department of Chemistry and Biochemistry, Auburn University, Auburn, Alabama 36849, USA
| | - A. Eugene DePrince
- Department of Chemistry and Biochemistry, Florida State University, Tallahassee, Florida 32306-4390, USA
| | - Uğur Bozkaya
- Department of Chemistry, Hacettepe University, Ankara 06800, Turkey
| | - Rollin A. King
- Department of Chemistry, Bethel University, St. Paul, Minnesota 55112, USA
| | | | - Justin M. Turney
- Center for Computational Quantum Chemistry, University of Georgia, Athens, Georgia 30602, USA
| | | | - C. David Sherrill
- Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0400, USA
| |
|
29
|
Manzhos S. Machine learning for the solution of the Schrödinger equation. Machine Learning: Science and Technology 2020. [DOI: 10.1088/2632-2153/ab7d30] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Indexed: 11/11/2022]
|
30
|
Abstract
As the quantum chemistry (QC) community embraces machine learning (ML), the number of new methods and applications based on the combination of QC and ML is surging. In this Perspective, a view of the current state of affairs in this new and exciting research field is offered, challenges of using machine learning in quantum chemistry applications are described, and potential future developments are outlined. Specifically, examples of how machine learning is used to improve the accuracy and accelerate quantum chemical research are shown. Generalization and classification of existing techniques are provided to ease the navigation in the sea of literature and to guide researchers entering the field. The emphasis of this Perspective is on supervised machine learning.
Affiliation(s)
- Pavlo O Dral
- State Key Laboratory of Physical Chemistry of Solid Surfaces, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Department of Chemistry, and College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
| |
|
31
|
Jeong W, Stoneburner SJ, King D, Li R, Walker A, Lindh R, Gagliardi L. Automation of Active Space Selection for Multireference Methods via Machine Learning on Chemical Bond Dissociation. J Chem Theory Comput 2020; 16:2389-2399. [DOI: 10.1021/acs.jctc.9b01297] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Indexed: 12/22/2022]
Affiliation(s)
- WooSeok Jeong
- Department of Chemistry, Nanoporous Materials Genome Center, Minnesota Supercomputing Institute, and Chemical Theory Center, University of Minnesota, 207 Pleasant Street Southeast, Minneapolis, Minnesota 55455, United States
- Samuel J. Stoneburner
- Department of Chemistry, Nanoporous Materials Genome Center, Minnesota Supercomputing Institute, and Chemical Theory Center, University of Minnesota, 207 Pleasant Street Southeast, Minneapolis, Minnesota 55455, United States
- Daniel King
- Department of Chemistry, Nanoporous Materials Genome Center, Minnesota Supercomputing Institute, and Chemical Theory Center, University of Minnesota, 207 Pleasant Street Southeast, Minneapolis, Minnesota 55455, United States
- Ruye Li
- Department of Chemistry, Nanoporous Materials Genome Center, Minnesota Supercomputing Institute, and Chemical Theory Center, University of Minnesota, 207 Pleasant Street Southeast, Minneapolis, Minnesota 55455, United States
- Andrew Walker
- Department of Computer Science and Engineering, University of Minnesota, 200 Union Street Southeast, Minneapolis, Minnesota 55455, United States
- Roland Lindh
- Department of Chemistry—BMC, and Uppsala Center for Computational Chemistry—UC3, Uppsala University, 751 23 Uppsala, Sweden
- Laura Gagliardi
- Department of Chemistry, Nanoporous Materials Genome Center, Minnesota Supercomputing Institute, and Chemical Theory Center, University of Minnesota, 207 Pleasant Street Southeast, Minneapolis, Minnesota 55455, United States
|
32
|
|
33
|
Schütt KT, Gastegger M, Tkatchenko A, Müller KR, Maurer RJ. Unifying machine learning and quantum chemistry with a deep neural network for molecular wavefunctions. Nat Commun 2019; 10:5024. [PMID: 31729373 PMCID: PMC6858523 DOI: 10.1038/s41467-019-12875-2] [Citation(s) in RCA: 214] [Impact Index Per Article: 35.7] [Received: 07/15/2019] [Accepted: 09/25/2019] [Indexed: 12/03/2022] [Open Access]
Abstract
Machine learning advances chemistry and materials science by enabling large-scale exploration of chemical space based on quantum chemical calculations. While these models supply fast and accurate predictions of atomistic chemical properties, they do not explicitly capture the electronic degrees of freedom of a molecule, which limits their applicability for reactive chemistry and chemical analysis. Here we present a deep learning framework for the prediction of the quantum mechanical wavefunction in a local basis of atomic orbitals from which all other ground-state properties can be derived. This approach retains full access to the electronic structure via the wavefunction at force-field-like efficiency and captures quantum mechanics in an analytically differentiable representation. On several examples, we demonstrate that this opens promising avenues to perform inverse design of molecular structures for targeting electronic property optimisation and a clear path towards increased synergy of machine learning and quantum chemistry.
Affiliation(s)
- K T Schütt
- Machine Learning Group, Technische Universität Berlin, 10587, Berlin, Germany
- M Gastegger
- Machine Learning Group, Technische Universität Berlin, 10587, Berlin, Germany
- A Tkatchenko
- Physics and Materials Science Research Unit, University of Luxembourg, L-1511, Luxembourg, Luxembourg
- K-R Müller
- Machine Learning Group, Technische Universität Berlin, 10587, Berlin, Germany
- Department of Brain and Cognitive Engineering, Korea University, Anam-dong, Seongbuk-gu, Seoul, 02841, Korea
- Max-Planck-Institut für Informatik, Saarbrücken, Germany
- R J Maurer
- Department of Chemistry, University of Warwick, Gibbet Hill Road, CV4 7AL, Coventry, UK
|
34
|
Cheng L, Kovachki NB, Welborn M, Miller TF. Regression Clustering for Improved Accuracy and Training Costs with Molecular-Orbital-Based Machine Learning. J Chem Theory Comput 2019; 15:6668-6677. [DOI: 10.1021/acs.jctc.9b00884] [Citation(s) in RCA: 39] [Impact Index Per Article: 6.5] [Indexed: 01/19/2023]
Affiliation(s)
- Lixue Cheng
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States
- Nikola B. Kovachki
- Computing and Mathematical Sciences, California Institute of Technology, Pasadena, California 91125, United States
- Matthew Welborn
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States
- Thomas F. Miller
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States
|
35
|
Coe JP. Machine Learning Configuration Interaction for ab Initio Potential Energy Curves. J Chem Theory Comput 2019; 15:6179-6189. [DOI: 10.1021/acs.jctc.9b00828] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Indexed: 12/22/2022]
Affiliation(s)
- Jeremy P. Coe
- Institute of Chemical Sciences, School of Engineering and Physical Sciences, Heriot-Watt University, Edinburgh EH14 4AS, United Kingdom
|