1
Dense-sparse quantum Monte Carlo algebraic diagrammatic construction and importance ranking. J Chem Phys 2024; 160:204111. [PMID: 38785284 DOI: 10.1063/5.0209137] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2024] [Accepted: 05/07/2024] [Indexed: 05/25/2024] Open
Abstract
Quantum Monte Carlo Algebraic Diagrammatic Construction (QMCADC) has been proposed as a reformulation of the second-order ADC scheme for the polarization propagator within the projection quantum Monte Carlo formalism. Dense-sparse partitioning and importance ranking filtering strategies are now exploited to accelerate its convergence and to alleviate the sign problem inherent in such calculations. By splitting the configuration space into dense and sparse subsets, the corresponding projection operator is decomposed into four distinct blocks. Deterministic calculations handle the dense-to-dense and sparse-to-dense blocks, while the remaining blocks, dense-to-sparse and sparse-to-sparse, are stochastically evaluated. The dense set is efficiently stored in a fixed-size array, and the sparse set is represented through conventional floating random Monte Carlo walks. The stochastic projection is further refined through importance ranking criteria, enabling a reduction in the required number of walkers with a controllable bias. Our results demonstrate the integration of dense-sparse partitioning with importance ranking filtering to significantly enhance the efficiency of QMCADC, enabling large-scale molecular excited-state calculations. Furthermore, this novel approach maximizes the utilization of the sparsity of ADC(2), transforming QMCADC into a tailored framework for ADC calculations.
2
In Silico Chemical Experiments in the Age of AI: From Quantum Chemistry to Machine Learning and Back. ADVANCED MATERIALS (DEERFIELD BEACH, FLA.) 2024:e2402369. [PMID: 38794859 DOI: 10.1002/adma.202402369] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2024] [Revised: 04/28/2024] [Indexed: 05/26/2024]
Abstract
Computational chemistry is an indispensable tool for understanding molecules and predicting chemical properties. However, traditional computational methods face significant challenges due to the difficulty of solving the Schrödinger equation and the increasing computational cost with the size of the molecular system. In response, there has been a surge of interest in leveraging artificial intelligence (AI) and machine learning (ML) techniques for in silico experiments. Integrating AI and ML into computational chemistry increases the scalability and speed of the exploration of chemical space. However, challenges remain, particularly regarding the reproducibility and transferability of ML models. This review highlights the evolution of ML in learning from, complementing, or replacing traditional computational chemistry for energy and property predictions. Starting from models trained entirely on numerical data, a journey sets forth toward the ideal model incorporating or learning the physical laws of quantum mechanics. This paper also reviews existing computational methods and ML models and their intertwining, outlines a roadmap for future research, and identifies areas for improvement and innovation. Ultimately, the goal is to develop AI architectures capable of predicting accurate and transferable solutions to the Schrödinger equation, thereby revolutionizing in silico experiments within chemistry and materials science.
3
Water-Hydrocarbon Interactions in Anionic Pyrene Monohydrate. J Phys Chem B 2024; 128:3200-3210. [PMID: 38526297 DOI: 10.1021/acs.jpcb.3c07777] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/26/2024]
Abstract
Interactions between water and polycyclic aromatic hydrocarbons are essential in many aspects of chemistry, from interstellar and atmospheric processes to interfacial hydrophobicity and wetting phenomena. Despite their growing importance, the intermolecular potentials of the water-hydrocarbon interactions are underdeveloped compared to the water-water potentials, and there are similarly few experimental probes that are sensitive to the details of the water-hydrocarbon potential. We present a combined experimental and computational study of anionic pyrene monohydrate, one of the simplest water/hydrocarbon clusters. The action spectrum in the OH region of the mass-selected cluster ion provides a rigorous benchmark for intermolecular potentials and computational methodologies. We identify missing intermolecular interactions and shortcomings in conventional dynamics calculations by comparing experimental data to density functional theory and classical molecular dynamics calculations. Kinetic trapping is prevalent, even for one water molecule and one pyrene molecule, leading to slow equilibration in conventional molecular dynamics calculations, even on nanosecond time scales and at low temperatures (50 K). At constant energy, temperature fluctuations for the pair of molecules are substantial. Immersing the system in a bath of soft spheres and employing parallel tempering alleviates kinetic trapping and dampens temperature fluctuations, bringing the system closer to the thermodynamic limit. With such augmented sampling, a simple, flexible water model reproduces the line width and the asymmetric broadening of the symmetric OH stretching mode, which we assign to spectral diffusion. In the OH stretching region, dynamics calculations predict a more intense antisymmetric peak than experiments observe but do not predict the bimodal split symmetric peak that the experiments show. Our work suggests that electronic polarization, missing in the empirical force field, is responsible for the first discrepancy and that quantum nuclear effects, captured neither in density functional theory nor in classical dynamics, may be responsible for the second.
4
Isomerization pathway of a C-C sigma bond in a bis(octaazamacrocycle)dinickel(II) complex activated by deprotonation: a DFT study. Theor Chem Acc 2024; 143:26. [PMID: 38495857 PMCID: PMC10937780 DOI: 10.1007/s00214-024-03100-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2023] [Accepted: 02/07/2024] [Indexed: 03/19/2024]
Abstract
The anti (a) to syn (s) isomerization pathway of the deprotonated form of the dimer with two nickel(II) 15-membered octaazamacrocyclic units connected via a carbon-carbon (C-C) σ bond was investigated. For the initial anti (a) structure, a deprotonation of one of the bridging (sp³ hybridized) carbon atoms is suggested to allow for an a to s geometry twist. A 360° scan around the bridging C-C dihedral angle was performed first to find an intermediate geometry. Subsequently, the isomerization pathway was explored via individual steps using a series of mode redundant geometry optimizations (internal coordinates potential energy surface scans) and geometry relaxations leading to the s structure. The prominent geometries (intermediates) of the isomerization pathway are chosen and compared to the a and s structures, and geometry relaxations of the protonated forms of selected intermediates are considered. Supplementary Information: The online version contains supplementary material available at 10.1007/s00214-024-03100-5.
5
How well do one-electron self-interaction-correction methods perform for systems with fractional electrons? J Chem Phys 2024; 160:084102. [PMID: 38385511 DOI: 10.1063/5.0182773] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2023] [Accepted: 01/28/2024] [Indexed: 02/23/2024] Open
Abstract
The recently developed locally scaled self-interaction correction (LSIC) is a one-electron SIC method that, when used with a ratio of kinetic energy densities (zσ) as the iso-orbital indicator, performs remarkably well for both thermochemical properties and barrier heights, overcoming the paradoxical behavior of the well-known Perdew-Zunger self-interaction correction (PZSIC) method. In this work, we examine how well the LSIC method performs for the delocalization error. Our results show that both the LSIC and PZSIC methods correctly describe the dissociation of H₂⁺ and He₂⁺, but LSIC is overall more accurate than PZSIC. Likewise, in the case of the vertical ionization energy of an ensemble of isolated He atoms, the LSIC and PZSIC methods do not exhibit delocalization errors. For fractional charges, both LSIC and PZSIC significantly reduce the deviation from linearity in the energy vs number of electrons curve, with PZSIC performing better for C, Ne, and Ar atoms, while for Kr the two perform similarly. LSIC performs well at the endpoints (integer occupations) while substantially reducing the deviation. For the dissociation of LiF, both LSIC and PZSIC dissociate the molecule into neutral Li and F, but only LSIC exhibits charge transfer from Li⁺ to F⁻ at the distance expected from experimental data and accurate ab initio data. Overall, both the PZSIC and LSIC methods reduce the delocalization errors substantially.
6
Building an ab initio solvated DNA model using Euclidean neural networks. PLoS One 2024; 19:e0297502. [PMID: 38358990 PMCID: PMC10868815 DOI: 10.1371/journal.pone.0297502] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2023] [Accepted: 01/06/2024] [Indexed: 02/17/2024] Open
Abstract
Accurately modeling large biomolecules such as DNA from first principles is fundamentally challenging due to the steep computational scaling of ab initio quantum chemistry methods. This limitation becomes even more prominent when modeling biomolecules in solution due to the need to include large numbers of solvent molecules. We present a machine-learned electron density model based on a Euclidean neural network framework that includes a built-in understanding of equivariance to model explicitly solvated double-stranded DNA. By training the machine learning model using molecular fragments that sample the key DNA and solvent interactions, we show that the model predicts electron densities of arbitrary systems of solvated DNA accurately, resolves polarization effects that are neglected by classical force fields, and captures the physics of the DNA-solvent interaction at the ab initio level.
7
All-Purpose Measure of Electron Correlation for Multireference Diagnostics. J Chem Theory Comput 2024; 20:721-727. [PMID: 38157841 PMCID: PMC10809408 DOI: 10.1021/acs.jctc.3c01073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2023] [Revised: 11/24/2023] [Accepted: 11/27/2023] [Indexed: 01/03/2024]
Abstract
We present an analytical relationship between two natural orbital occupancy-based indices, $\overline{I_{\mathrm{ND}}}$ and $I_{\mathrm{ND}}^{\max}$, and two established electron correlation metrics: the leading term of a configuration interaction expansion, $c_0$, and the $D_2$ diagnostic. Numerical validation revealed that $\overline{I_{\mathrm{ND}}}$ and $I_{\mathrm{ND}}^{\max}$ can effectively substitute for $c_0$ and $D_2$, respectively. These indices offer three distinct advantages: (i) they are universally applicable across all electronic structure methods, (ii) their interpretation is more intuitive, and (iii) they can be readily incorporated into the development of hybrid electronic structure methods. Additionally, we draw a distinction between correlation measures and correlation diagnostics, establishing MP2 and CCSD numerical thresholds for $I_{\mathrm{ND}}^{\max}$, which are to be used as a multireference diagnostic. Our findings further demonstrate that establishing thresholds for other electronic structure methods can be easily accomplished using small data sets.
8
Triple Synergism Effect of Ammonium Nitrilotriacetate on the Chemical Mechanical Polishing Performance of Ruthenium Barrier Layers. SMALL (WEINHEIM AN DER BERGSTRASSE, GERMANY) 2024:e2309965. [PMID: 38247206 DOI: 10.1002/smll.202309965] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2023] [Revised: 12/26/2023] [Indexed: 01/23/2024]
Abstract
As the feature size of integrated circuits continues to decrease, ruthenium (Ru) has been suggested as the successor to traditional Ta/TaN bilayers for barrier layer materials due to its unique properties. This research delves into the effects of ammonium nitrilotriacetate (NTA(NH₄)₃) on the chemical mechanical polishing (CMP) performance of Ru in H₂O₂-based slurry. The removal rate (RR) of Ru surged from 47 to 890 Å min⁻¹, marking an increase of about 17 times. The essence of this mechanism lies in the triple synergistic effects of NTA(NH₄)₃ in promoting Ru removal: 1) the interaction between NH₄⁺ from NTA(NH₄)₃ and SiO₂ abrasives; 2) the chelating action of [(NH₄)N(CH₂COO)₃]²⁻ from NTA(NH₄)₃ on Ru and its oxides; 3) the ammoniation and chelation of Ru and its oxides by NH₄⁺ from NTA(NH₄)₃, which enhance the dissolution and corrosion of oxidized Ru, making its removal during the barrier layer CMP process more efficient through mechanical means. This research introduces a synergistic approach for the effective removal of Ru, shedding light on potential applications of CMP in the field of integrated circuits.
9
Degeneration of kernel regression with Matern kernels into low-order polynomial regression in high dimension. J Chem Phys 2024; 160:021101. [PMID: 38189605 DOI: 10.1063/5.0187867] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2023] [Accepted: 12/17/2023] [Indexed: 01/09/2024] Open
Abstract
Kernel methods such as kernel ridge regression and Gaussian process regression with Matern-type kernels have been increasingly used, in particular, to fit potential energy surfaces (PES) and density functionals, and for materials informatics. When the dimensionality of the feature space is high, these methods are used with necessarily sparse data. In this regime, the optimal length parameter of a Matern-type kernel may become so large that the method effectively degenerates into a low-order polynomial regression and, therefore, loses any advantage over such regression. This is demonstrated theoretically as well as numerically in the examples of six- and fifteen-dimensional molecular PES using squared exponential and simple exponential kernels. The results shed additional light on the success of polynomial approximations such as PIP for medium-size molecules and on the importance of orders-of-coupling-based models for preserving the advantages of kernel methods with Matern-type kernels or on the use of physically motivated (reproducing) kernels.
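The degeneration described in this abstract can be illustrated with a minimal numerical sketch. It is not the paper's own derivation or data: it uses only the standard Taylor expansion of the squared exponential kernel, exp(-r²/(2l²)) ≈ 1 - r²/(2l²) for large l, which is quadratic in the features. The random points, the two length parameters, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def sq_exp_kernel(X, length):
    """Squared exponential kernel matrix and the squared distances it uses."""
    sq_norms = np.sum(X**2, axis=1)
    r2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    r2 = np.maximum(r2, 0.0)  # clip tiny negative values caused by rounding
    return np.exp(-r2 / (2.0 * length**2)), r2

def quadratic_truncation(r2, length):
    """Taylor expansion of the kernel truncated after the term linear in r^2."""
    return 1.0 - r2 / (2.0 * length**2)

# Hypothetical sparse sample: 20 points in a six-dimensional unit cube.
rng = np.random.default_rng(0)
n_points, n_dim = 20, 6
X = rng.uniform(0.0, 1.0, size=(n_points, n_dim))

def truncation_error(length):
    """Largest elementwise gap between the kernel and its quadratic truncation."""
    K, r2 = sq_exp_kernel(X, length)
    return np.max(np.abs(K - quadratic_truncation(r2, length)))

err_short = truncation_error(1.0)   # length comparable to the data spread
err_long = truncation_error(10.0)   # length much larger than the data spread

# The truncated kernel is spanned by 1, |x|^2 and the D linear features,
# so its matrix rank cannot exceed D + 2 however many points are used.
_, r2 = sq_exp_kernel(X, 10.0)
rank_long = np.linalg.matrix_rank(quadratic_truncation(r2, 10.0), tol=1e-10)

print(f"short length error: {err_short:.3e}")
print(f"long length error:  {err_long:.3e}")
print(f"rank of truncated kernel: {rank_long} (bound {n_dim + 2})")
```

With u = r²/(2l²), the gap between the kernel and its truncation is bounded by u²/2, so it shrinks rapidly as the length parameter grows. Once the gap is negligible, the kernel model can represent little beyond a second-order polynomial in the features, which is the sense in which the regression degenerates.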
10
Development of a machine learning finite-range nonlocal density functional. J Chem Phys 2024; 160:014105. [PMID: 38180254 DOI: 10.1063/5.0179149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2023] [Accepted: 12/12/2023] [Indexed: 01/06/2024] Open
Abstract
Kohn-Sham density functional theory has been the most popular method in electronic structure calculations. To fulfill the increasing accuracy requirements, new approximate functionals are needed to address key issues in existing approximations. It is well known that nonlocal components are crucial. Current nonlocal functionals mostly require orbital dependence such as in Hartree-Fock exchange and many-body perturbation correlation energy, which, however, leads to higher computational costs. Deviating from this pathway, we describe functional nonlocality in a new approach. By partitioning the total density to atom-centered local densities, a many-body expansion is proposed. This many-body expansion can be truncated at one-body contributions, if a base functional is used and an energy correction is approximated. The contribution from each atom-centered local density is a single finite-range nonlocal functional that is universal for all atoms. We then use machine learning to develop this universal atom-centered functional. Parameters in this functional are determined by fitting to data that are produced by high-level theories. Extensive tests on several different test sets, which include reaction energies, reaction barrier heights, and non-covalent interaction energies, show that the new functional, with only the density as the basic variable, can produce results comparable to the best-performing double-hybrid functionals with a lower computational cost (for example, for the thermochemistry test set selected from the GMTKN55 database, the BLYP-based machine learning functional gives a weighted total mean absolute deviation of 3.33 kcal/mol, while DSD-BLYP-D3(BJ) gives 3.28 kcal/mol). This opens a new pathway to nonlocal functional development and applications.
11
Understanding the release, migration, and risk of heavy metals in coal gangue: An approach by combining experimental and computational investigations. JOURNAL OF HAZARDOUS MATERIALS 2024; 461:132707. [PMID: 37813031 DOI: 10.1016/j.jhazmat.2023.132707] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/16/2023] [Revised: 09/21/2023] [Accepted: 10/02/2023] [Indexed: 10/11/2023]
Abstract
The lack of understanding of the environmental fate and implications of heavy metals in coal gangue (CG) has restrained its utilization. Conventional extraction methods provide empirical measures of heavy metal speciation but lack a detailed description of binding strength, which limits long-term risk assessment. In this study, the release and migration behavior of six heavy metals (Cd, As, Pb, Ni, Cu, and Cr) was investigated by combining experimental and computational investigations. The corresponding mechanisms and risks were examined and discussed on a molecular level. The results suggested that CG consists primarily of natural kaolinite, α-quartz, and anatase minerals. The sequential extraction results showed that heavy metals in CG are mainly distributed in stable silicate and iron manganese oxide-bound states. The toxicity characteristic leaching procedure test indicated that Cu, Cr, Ni, and Pb had high toxicity levels and thus required long-term monitoring and control. A quantum chemical calculation demonstrated that the heavy metals were more likely to be embedded in silicate minerals with high binding energy than to bind on the anatase surface. The findings of this research provide a promising approach to comprehensively evaluate the stability mechanism and potential long-term risks of heavy metals in solid waste.
12
Rapid and accurate predictions of perfect and defective material properties in atomistic simulation using the power of 3D CNN-based trained artificial neural networks. Sci Rep 2024; 14:36. [PMID: 38167883 PMCID: PMC10762098 DOI: 10.1038/s41598-023-50893-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2023] [Accepted: 12/27/2023] [Indexed: 01/05/2024] Open
Abstract
This article introduces an innovative approach that utilizes machine learning (ML) to address the computational challenges of accurate atomistic simulations in materials science. Focusing on the field of molecular dynamics (MD), which offers insight into material behavior at the atomic level, the study demonstrates the potential of trained artificial neural networks (tANNs) as surrogate models. These tANNs capture complex patterns from built datasets, enabling fast and accurate predictions of material properties. The article highlights the application of 3D convolutional neural networks (CNNs) to incorporate atomistic details and defects in predictions, a significant advancement compared to current 2D image-based, or descriptor-based methods. Through a dataset of atomistic structures and MD simulations, the trained 3D CNN achieves impressive accuracy, predicting material properties with a root-mean-square error below 0.65 GPa for the prediction of elastic constants and a speed-up of approximately 185 to 2100 times compared to traditional MD simulations. This breakthrough promises to expedite materials design processes and facilitate scale-bridging in materials science, offering a new perspective on addressing computational demands in atomistic simulations.
13
Kohn-Sham accuracy from orbital-free density functional theory via Δ-machine learning. J Chem Phys 2023; 159:244106. [PMID: 38147461 DOI: 10.1063/5.0180541] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Accepted: 11/30/2023] [Indexed: 12/28/2023] Open
Abstract
We present a Δ-machine learning model for obtaining Kohn-Sham accuracy from orbital-free density functional theory (DFT) calculations. In particular, we employ a machine-learned force field (MLFF) scheme based on the kernel method to capture the difference between Kohn-Sham and orbital-free DFT energies/forces. We implement this model in the context of on-the-fly molecular dynamics simulations and study its accuracy, performance, and sensitivity to parameters for representative systems. We find that the formalism not only improves the accuracy of Thomas-Fermi-von Weizsäcker orbital-free energies and forces by more than two orders of magnitude but is also more accurate than MLFFs based solely on Kohn-Sham DFT while being more efficient and less sensitive to model parameters. We apply the framework to study the structure of molten Al0.88Si0.12, the results suggesting no aggregation of Si atoms, in agreement with a previous Kohn-Sham study performed at an order of magnitude smaller length and time scales.
14
Orbital-Free Density Functional Theory: An Attractive Electronic Structure Method for Large-Scale First-Principles Simulations. Chem Rev 2023; 123:12039-12104. [PMID: 37870767 DOI: 10.1021/acs.chemrev.2c00758] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 10/24/2023]
Abstract
Kohn-Sham Density Functional Theory (KSDFT) is the most widely used electronic structure method in chemistry, physics, and materials science, with thousands of calculations cited annually. This ubiquity is rooted in the favorable accuracy vs cost balance of KSDFT. Nonetheless, the ambitions and expectations of researchers for use of KSDFT in predictive simulations of large, complicated molecular systems are confronted with an intrinsic computational cost-scaling challenge. Particularly evident in the context of first-principles molecular dynamics, the challenge is the high cost-scaling associated with the computation of the Kohn-Sham orbitals. Orbital-free DFT (OFDFT), as the name suggests, circumvents entirely the explicit use of those orbitals. Without them, the structural and algorithmic complexity of KSDFT simplifies dramatically and near-linear scaling with system size irrespective of system state is achievable. Thus, much larger system sizes and longer simulation time scales (compared to conventional KSDFT) become accessible; hence, new chemical phenomena and new materials can be explored. In this review, we introduce the historical contexts of OFDFT, its theoretical basis, and the challenge of realizing its promise via approximate kinetic energy density functionals (KEDFs). We review recent progress on that challenge for an array of KEDFs, such as one-point, two-point, and machine-learnt, as well as some less explored forms. We emphasize use of exact constraints and the inevitability of design choices. Then, we survey the associated numerical techniques and implemented algorithms specific to OFDFT. We conclude with an illustrative sample of applications to showcase the power of OFDFT in materials science, chemistry, and physics.
15
Tailoring Ag Electron Donating Ability for Organohalide Reduction: A Bilayer Electrode Design. LANGMUIR : THE ACS JOURNAL OF SURFACES AND COLLOIDS 2023; 39:15705-15715. [PMID: 37885069 DOI: 10.1021/acs.langmuir.3c02260] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/28/2023]
Abstract
Electrochemical reduction of organohalides provides a green approach in the reduction of environmental pollutants, the synthesis of new organic molecules, and many other applications. The presence of a catalytic electrode can make the process more energetically efficient. Ag is known to be a very good electrode for the reduction of a wide range of organohalides. Herein, we examine the elementary adsorption and reaction steps that occur on Ag and the changes that result from changes in the Ag-coated metal, strain in Ag, solvent, and substrate geometry. The results are used to develop an electrode design strategy that can possibly be used to further increase the catalytic activity of pure Ag electrodes. We have shown how epitaxially depositing one to three layers of Ag on catalytically inert or less active support metal can increase the surface electron donating ability, thus increasing the adsorption of organic halide and the catalytic activity. Many factors, such as molecular geometry, lattice mismatches, work function, and solvents, contribute to the adsorption of organic halide molecules over the bilayer electrode surface. To isolate and rank these factors, we examined three model organic halides, namely, halothane, bromobenzene (BrBz), and benzyl bromide (BzBr) adsorption on Ag/metal (metal = Au, Bi, Pt, and Ti) bilayer electrodes in both vacuum and acetonitrile (ACN) solvent. The different metal supports offer a range of lattice mismatches and work function differences with Ag. Our calculations show that the surface of Ag becomes more electron donating and accessible to adsorption when it forms a bilayer with Ti as it has a lower work function and almost zero lattice mismatch with Ag. We believe this study will help to increase the electron donating ability of the Ag surface by choosing the right metal support, which in turn can improve the catalytic activity of the working electrode.
16
A Δ-learning strategy for interpretation of spectroscopic observables. STRUCTURAL DYNAMICS (MELVILLE, N.Y.) 2023; 10:064101. [PMID: 37941993 PMCID: PMC10629969 DOI: 10.1063/4.0000215] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2023] [Accepted: 10/17/2023] [Indexed: 11/10/2023]
Abstract
Accurate computations of experimental observables are essential for interpreting the high information content held within x-ray spectra. However, for complicated systems this can be difficult, a challenge compounded when dynamics becomes important owing to the large number of calculations required to capture the time-evolving observable. While machine learning architectures have been shown to represent a promising approach for rapidly predicting spectral lineshapes, achieving simultaneously accurate and sufficiently comprehensive training data is challenging. Herein, we introduce Δ-learning for x-ray spectroscopy. Instead of directly learning the structure-spectrum relationship, the Δ-model learns the structure dependent difference between a higher and lower level of theory. Consequently, once developed these models can be used to translate spectral shapes obtained from lower levels of theory to mimic those corresponding to higher levels of theory. Ultimately, this achieves accurate simulations with a much reduced computational burden as only the lower level of theory is computed, while the model can instantaneously transform this to a spectrum equivalent to a higher level of theory. Our present model, demonstrated herein, learns the difference between TDDFT(BLYP) and TDDFT(B3LYP) spectra. Its effectiveness is illustrated using simulations of Rh L3-edge spectra tracking the C-H activation of octane by a cyclopentadienyl rhodium carbonyl complex.
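The Δ-learning idea summarized in this abstract, learning the difference between two levels of theory and adding it to the cheaper result, can be sketched on a toy problem. Everything below is an assumption made for illustration: the two analytic functions standing in for the lower and higher levels of theory, the one-dimensional "structure" coordinate, and the polynomial ridge fit used as the Δ-model. It does not reproduce the architecture or the TDDFT data used in the paper.

```python
import numpy as np

def low_level(x):
    """Hypothetical cheap model, standing in for the lower level of theory."""
    return np.sin(x)

def high_level(x):
    """Hypothetical expensive reference, standing in for the higher level."""
    return np.sin(x) + 0.3 * x - 0.05 * x**2

def features(x, degree=3):
    """Polynomial features 1, x, ..., x^degree of the structure coordinate."""
    return np.vander(x, degree + 1, increasing=True)

# Training set: a handful of structures where both levels are evaluated.
rng = np.random.default_rng(0)
x_train = rng.uniform(-2.0, 2.0, size=15)
delta_train = high_level(x_train) - low_level(x_train)

# Ridge-regularized least squares fit of the structure-dependent difference.
ridge = 1e-8
A = features(x_train)
coeffs = np.linalg.solve(A.T @ A + ridge * np.eye(A.shape[1]), A.T @ delta_train)

# Prediction: only the cheap model is evaluated, then the learned Δ is added.
x_test = np.linspace(-2.0, 2.0, 50)
corrected = low_level(x_test) + features(x_test) @ coeffs

err_low = np.max(np.abs(low_level(x_test) - high_level(x_test)))
err_corrected = np.max(np.abs(corrected - high_level(x_test)))
print(f"max error, low level only: {err_low:.3e}")
print(f"max error, Δ-corrected:    {err_corrected:.3e}")
```

The point of the design is that the difference between the two levels is often smoother and simpler than either quantity alone, so a small model trained on few points can suffice. In this toy case the difference is an exact low-order polynomial, which is far more favorable than a real spectrum would be.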
17
Machine Learning Density Functionals from the Random-Phase Approximation. J Chem Theory Comput 2023; 19:7287-7299. [PMID: 37800677 PMCID: PMC10601474 DOI: 10.1021/acs.jctc.3c00848] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2023] [Indexed: 10/07/2023]
Abstract
Kohn-Sham density functional theory (DFT) is the standard method for first-principles calculations in computational chemistry and materials science. More accurate theories such as the random-phase approximation (RPA) are limited in application due to their large computational cost. Here, we use machine learning to map the RPA to a pure Kohn-Sham density functional. The machine learned RPA model (ML-RPA) is a nonlocal extension of the standard gradient approximation. The density descriptors used as ingredients for the enhancement factor are nonlocal counterparts of the local density and its gradient. Rather than fitting only RPA exchange-correlation energies, we also include derivative information in the form of RPA optimized effective potentials. We train a single ML-RPA functional for diamond, its surfaces, and liquid water. The accuracy of ML-RPA for the formation energies of 28 diamond surfaces reaches that of state-of-the-art van der Waals functionals. For liquid water, however, ML-RPA cannot yet improve upon the standard gradient approximation. Overall, our work demonstrates how machine learning can extend the applicability of the RPA to larger system sizes, time scales, and chemical spaces.
18
Untapped Potential of Deep Eutectic Solvents for the Synthesis of Bioinspired Inorganic-Organic Materials. CHEMISTRY OF MATERIALS : A PUBLICATION OF THE AMERICAN CHEMICAL SOCIETY 2023; 35:7878-7903. [PMID: 37840775 PMCID: PMC10568971 DOI: 10.1021/acs.chemmater.3c00847] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/11/2023] [Revised: 08/02/2023] [Indexed: 10/17/2023]
Abstract
Since the discovery of deep eutectic solvents (DESs) in 2003, significant progress has been made in the field, specifically advancing aspects of their preparation and physicochemical characterization. Their low cost and unique tailored properties are reasons for their growing importance as a sustainable medium for the resource-efficient processing and synthesis of advanced materials. In this paper, the significance of these designer solvents and their beneficial features, in particular with respect to biomimetic materials chemistry, is discussed. Finally, this article explores the unrealized potential and advantageous aspects of DESs, focusing on the development of biomineralization-inspired hybrid materials. It is anticipated that this article can stimulate new concepts and advances, providing a reference for breaking down the multidisciplinary borders in the field of bioinspired materials chemistry, especially at the nexus of computation and experiment, and for developing a rigorous materials-by-design paradigm.
|
19
|
Machine learning electronic structure methods based on the one-electron reduced density matrix. Nat Commun 2023; 14:6281. [PMID: 37805614 PMCID: PMC10560258 DOI: 10.1038/s41467-023-41953-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/21/2023] [Accepted: 09/18/2023] [Indexed: 10/09/2023] Open
Abstract
The theorems of density functional theory (DFT) establish bijective maps between the local external potential of a many-body system and its electron density, wavefunction and, therefore, one-particle reduced density matrix. Building on this foundation, we show that machine learning models based on the one-electron reduced density matrix can be used to generate surrogate electronic structure methods. We generate surrogates of local and hybrid DFT, Hartree-Fock and full configuration interaction theories for systems ranging from small molecules such as water to more complex compounds like benzene and propanol. The surrogate models use the one-electron reduced density matrix as the central quantity to be learned. From the predicted density matrices, we show that either standard quantum chemistry or a second machine-learning model can be used to compute molecular observables, energies, and atomic forces. The surrogate models can generate essentially anything that a standard electronic structure method can, ranging from band gaps and Kohn-Sham orbitals to energy-conserving ab-initio molecular dynamics simulations and infrared spectra, which account for anharmonicity and thermal effects, without the need to employ computationally expensive algorithms such as self-consistent field theory. The algorithms are packaged in an efficient and easy to use Python code, QMLearn, accessible on popular platforms.
|
20
|
Spin Polarization and Flat Bands in Eu-Doped Nanoporous and Twisted Bilayer Graphenes. MICROMACHINES 2023; 14:1889. [PMID: 37893326 PMCID: PMC10609095 DOI: 10.3390/mi14101889] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/10/2023] [Revised: 09/25/2023] [Accepted: 09/28/2023] [Indexed: 10/29/2023]
Abstract
Advanced two-dimensional spin-polarized heterostructures based on twisted (TBG) and nanoporous (NPBG) bilayer graphenes doped with Eu ions were theoretically proposed and studied using periodic-boundary-condition density functional theory electronic structure calculations. Significant polarization of the electronic states at the Fermi level was found for both Eu/NPBG(AA) and Eu/TBG lattices. Chemi- and physisorption of Eu ions on both graphenes may lead to structural deformations, symmetry lowering of the low-dimensional lattices, interlayer fusion, and mutual sliding of TBG graphene fragments. The frontier bands in the valence region in the vicinity of the Fermi level of both spin-polarized 2D Eu/NPBG(AA) and Eu/TBG lattices clearly exhibit flat dispersion caused by localized electronic states formed by TBG Moiré patterns, which could lead to strong electron correlations and the formation of exotic quantum phases.
|
21
|
Making the Case for Quantum Mechanics in Predictive Toxicology─Nearly 100 Years Too Late? Chem Res Toxicol 2023; 36:1444-1450. [PMID: 37676849 DOI: 10.1021/acs.chemrestox.3c00171] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/09/2023]
Abstract
The use of quantum mechanics (QM) has long been the norm to study covalent-binding phenomena in chemistry and biochemistry. The pharmaceutical industry leverages QM models explicitly in covalent drug discovery and implicitly to characterize short-range interactions in noncovalent binding. Predictive toxicology has resisted widespread adoption of QM, including in the pharmaceutical industry, despite its obvious relevance to the metabolic processes in the upstream of adverse outcome pathways and advances in both QM methods and computational resources, which support fit-for-purpose applications in reasonable timeframes. Here, we make the case for embracing QM as an indispensable part of a toxicologist's toolkit. We argue that QM provides the necessary orthogonality to alert-based expert systems and traditional QSARs, consistent with calls for animal-free integrated testing strategies for safety assessments of commercial chemicals. We outline existing roadblocks to this transition, including the need to train model developers in QM and the shift toward service-based toxicity models that utilize high-performance computing clusters. Lastly, we describe recent examples of successful implementations of QM in hazard assessments and propose how in silico toxicology can be further advanced by integrating QM with artificial intelligence.
|
22
|
Abstract
The field of computational chemistry has seen a significant increase in the integration of machine learning concepts and algorithms. In this Perspective, we surveyed 179 open-source software projects, with corresponding peer-reviewed papers published within the last 5 years, to better understand the topics within the field being investigated by machine learning approaches. For each project, we provide a short description, the link to the code, the accompanying license type, and whether the training data and resulting models are made publicly available. Based on those deposited in GitHub repositories, the most popular employed Python libraries are identified. We hope that this survey will serve as a resource to learn about machine learning or specific architectures thereof by identifying accessible codes with accompanying papers on a topic basis. To this end, we also include computational chemistry open-source software for generating training data and fundamental Python libraries for machine learning. Based on our observations and considering the three pillars of collaborative machine learning work, open data, open source (code), and open models, we provide some suggestions to the community.
|
23
|
The central role of density functional theory in the AI age. Science 2023; 381:170-175. [PMID: 37440654 DOI: 10.1126/science.abn3445] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Received: 04/27/2023] [Accepted: 05/30/2023] [Indexed: 07/15/2023]
Abstract
Density functional theory (DFT) plays a pivotal role in chemical and materials science because of its relatively high predictive power, applicability, versatility, and computational efficiency. We review recent progress in machine learning (ML) model developments, which have relied heavily on DFT for synthetic data generation and for the design of model architectures. The general relevance of these developments is placed in a broader context for chemical and materials sciences. DFT-based ML models have reached high efficiency, accuracy, scalability, and transferability and pave the way to the routine use of successful experimental planning software within self-driving laboratories.
|
24
|
Abstract
The introduction of organic ligands is one of the effective strategies to improve the stability and reactivity of metal clusters. Herein, the enhanced reactivity of benzene-ligated cluster anions Fe₂VC(C₆H₆)⁻ with respect to naked Fe₂VC⁻ is identified. Structural characterization suggests that C₆H₆ is molecularly bound to the dual metal site in Fe₂VC(C₆H₆)⁻. Mechanistic details reveal that the cleavage of N≡N is feasible in Fe₂VC(C₆H₆)⁻/N₂ but hindered by an overall positive barrier in the Fe₂VC⁻/N₂ system. Further analysis discloses that the ligated C₆H₆ regulates the compositions and energy levels of the active orbitals of the metal clusters. More importantly, C₆H₆ serves as an electron reservoir for the reduction of N₂ to lower the crucial energy barrier of N≡N splitting. This work demonstrates that the flexibility of C₆H₆ in terms of withdrawing and donating electrons is crucial to regulating the electronic structures of the metal cluster and enhancing the reactivity.
|
25
|
Challenges in modelling dynamic processes in realistic nanostructured materials at operating conditions. PHILOSOPHICAL TRANSACTIONS. SERIES A, MATHEMATICAL, PHYSICAL, AND ENGINEERING SCIENCES 2023; 381:20220239. [PMID: 37211031 DOI: 10.1098/rsta.2022.0239] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 11/01/2022] [Accepted: 01/23/2023] [Indexed: 05/23/2023]
Abstract
The question of how far current modelling strategies are capable of modelling dynamic phenomena in realistic nanostructured materials at operating conditions is addressed. Nanostructured materials used in applications are far from perfect; they possess a broad range of heterogeneities in space and time extending over several orders of magnitude. Spatial heterogeneities from the subnanometre to the micrometre scale in crystal particles with a finite size and specific morphology impact the material's dynamics. Furthermore, the material's functional behaviour is largely determined by the operating conditions. Currently, there exists a huge gap between the length and time scales attainable in theory and the experimentally relevant scales. Within this perspective, three key challenges are highlighted within the molecular modelling chain to bridge this length-time scale gap. Methods are needed that enable (i) building structural models for realistic crystal particles having mesoscale dimensions with isolated defects, correlated nanoregions, mesoporosity, internal and external surfaces; (ii) evaluating interatomic forces with quantum mechanical accuracy at much lower computational cost than the currently used density functional theory methods; and (iii) deriving the kinetics of phenomena taking place in a multi-length-time scale window to obtain an overall view of the dynamics of the process. This article is part of a discussion meeting issue 'Supercomputing simulations of advanced materials'.
|
26
|
Accurate Prediction of Adiabatic Ionization Potentials of Organic Molecules using Quantum Chemistry Assisted Machine Learning. J Phys Chem A 2023. [PMID: 37406209 DOI: 10.1021/acs.jpca.3c00823] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/07/2023]
Abstract
In previous work (Dandu et al., J. Phys. Chem. A, 2022, 126, 4528-4536), we were successful in predicting accurate atomization energies of organic molecules using machine learning (ML) models, obtaining errors as low as 0.1 kcal/mol relative to the G4MP2 method. In this work, we extend the use of these ML models to adiabatic ionization potentials (IPs) on data sets of energies generated using quantum chemical calculations. Atom-specific corrections that were found to improve atomization energies from quantum chemical calculations have also been used in this study to improve ionization potentials. The quantum chemical calculations were performed on 3405 molecules containing eight or fewer non-hydrogen atoms derived from the QM9 data set, using the B3LYP functional with the 6-31G(2df,p) basis set for optimization. Low-fidelity IPs for these structures were obtained using two density functional methods: B3LYP/6-31+G(2df,p) and ωB97XD/6-311+G(3df,2p). Highly accurate G4MP2 calculations were performed on these optimized structures to obtain high-fidelity IPs to use in ML models based on the low-fidelity IPs. Our best performing ML methods gave IPs of organic molecules within a mean absolute deviation of 0.035 eV from the G4MP2 IPs for the whole data set. This work demonstrates that ML predictions assisted by quantum chemical calculations can be used to successfully predict IPs of organic molecules for use in high throughput screening.
|
27
|
Machine Learning Electron Density Prediction Using Weighted Smooth Overlap of Atomic Positions. NANOMATERIALS (BASEL, SWITZERLAND) 2023; 13:1853. [PMID: 37368284 DOI: 10.3390/nano13121853] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2023] [Revised: 05/29/2023] [Accepted: 06/11/2023] [Indexed: 06/28/2023]
Abstract
Having access to accurate electron densities in chemical systems, especially for dynamical systems involving chemical reactions, ion transport, and other charge transfer processes, is crucial for numerous applications in materials chemistry. Traditional methods for computationally predicting electron density data for such systems include quantum mechanical (QM) techniques, such as density functional theory. However, poor scaling of these QM methods restricts their use to relatively small system sizes and short dynamic time scales. To overcome this limitation, we have developed a deep neural network machine learning formalism, which we call deep charge density prediction (DeepCDP), for predicting charge densities by only using atomic positions for molecules and condensed phase (periodic) systems. Our method uses the weighted smooth overlap of atomic positions to fingerprint environments on a grid-point basis and map them to electron density data generated from QM simulations. We trained models for bulk systems of copper, LiF, and silicon; for a molecular system, water; and for two-dimensional charged and uncharged systems, hydroxyl-functionalized graphane, with and without an added proton. We showed that DeepCDP achieves prediction R² values greater than 0.99 and mean squared error values on the order of 10⁻⁵ e² Å⁻⁶ for most systems. DeepCDP scales linearly with system size, is highly parallelizable, and is capable of accurately predicting the excess charge in protonated hydroxyl-functionalized graphane. We demonstrate how DeepCDP can be used to accurately track the location of charges (protons) by computing electron densities at a few selected grid points in the materials, thus significantly reducing the computational cost. We also show that our models can be transferable, allowing prediction of electron densities for systems on which they have not been trained but that contain a subset of atomic species on which they have been trained. Our approach can be used to develop models that span different chemical systems and train them for the study of large-scale charge transport and chemical reactions.
|
28
|
Single-Point Extrapolation to the Complete Basis Set Limit through Deep Learning. J Chem Theory Comput 2023. [PMID: 37192428 DOI: 10.1021/acs.jctc.2c01298] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/18/2023]
Abstract
Machine learning (ML) offers an attractive method for making predictions about molecular systems while circumventing the need to run expensive electronic structure calculations. Once trained on ab initio data, the promise of ML is to deliver accurate predictions of molecular properties that were previously computationally infeasible. In this work, we develop and train a graph neural network model to correct the basis set incompleteness error (BSIE) between a small and large basis set at the RHF and B3LYP levels of theory. Our results show that, when compared to fitting to the total potential, an ML model fitted to correct the BSIE is better at generalizing to systems not seen during training. We test this ability by training on single molecules while evaluating on molecular complexes. We also show that ensemble models yield better behaved potentials in situations where the training data is insufficient. However, even when only fitting to the BSIE, acceptable performance is only achieved when the training data sufficiently resemble the systems one wants to make predictions on. The test error of the final model trained to predict the difference between the cc-pVDZ and cc-pV5Z potential is 0.184 kcal/mol for the B3LYP density functional, and the ensemble model accurately reproduces the large basis set interaction energy curves on the S66x8 dataset.
|
29
|
Fragments-in-fragments method for efficient and reliable estimates of individual hydrogen bond energies in large molecular clusters. J Comput Chem 2023. [PMID: 37191018 DOI: 10.1002/jcc.27133] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2023] [Revised: 05/01/2023] [Accepted: 05/03/2023] [Indexed: 05/17/2023]
Abstract
Knowledge of individual hydrogen bond (HB) strengths in molecular clusters is indispensable for gaining insight into the bulk properties of condensed systems. Recently, we developed a molecular tailoring approach-based (MTA-based) method for estimating individual HB energies in molecular clusters. However, direct application of this MTA-based method to large molecular clusters becomes progressively more difficult as cluster size increases. To overcome this limitation, we propose using a linear scaling method (such as the original MTA method) to estimate the single-point (SP) energies of the large parent molecular cluster and its respective fragments. Because the fragments of the MTA-based method for HB energy estimation are themselves further fragmented, the proposed strategy is called the Fragments-in-Fragments (Frags-in-Frags) method. The SP energies of the fragments and parent cluster calculated by the Frags-in-Frags approach were used to estimate the individual HB energies. The individual HB energies estimated by the Frags-in-Frags method for various molecular clusters are in excellent linear agreement with their MTA-based counterparts (R² = 0.9975 for 348 data points), with the difference being less than 0.5 kcal/mol in most cases. Furthermore, the RMSD is 0.43 kcal/mol, the MAE is 0.33 kcal/mol, and the standard deviation is 0.44 kcal/mol. Importantly, the Frags-in-Frags method not only enables reliable estimation of HB energies in large molecular clusters but also requires less computational time and can be run even on off-the-shelf hardware.
|
30
|
Molecular modelling of the thermophysical properties of fluids: expectations, limitations, gaps and opportunities. Phys Chem Chem Phys 2023; 25:12607-12628. [PMID: 37114325 DOI: 10.1039/d2cp05423j] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 04/29/2023]
Abstract
This manuscript provides an overview of the current state of the art in terms of the molecular modelling of the thermophysical properties of fluids. It is intended to manage the expectations and serve as guidance to practising physical chemists, chemical physicists and engineers in terms of the scope and accuracy of the more commonly available intermolecular potentials along with the peculiarities of the software and methods employed in molecular simulations while providing insights on the gaps and opportunities available in this field. The discussion is focused around case studies which showcase both the precision and the limitations of frequently used workflows.
|
31
|
Barrier Height Prediction by Machine Learning Correction of Semiempirical Calculations. J Phys Chem A 2023; 127:2274-2283. [PMID: 36877614 PMCID: PMC10845151 DOI: 10.1021/acs.jpca.2c08340] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2022] [Revised: 02/19/2023] [Indexed: 03/07/2023]
Abstract
Different machine learning (ML) models are proposed in the present work to predict density functional theory-quality barrier heights (BHs) from semiempirical quantum mechanical (SQM) calculations. The ML models include a multitask deep neural network, gradient-boosted trees by means of the XGBoost interface, and Gaussian process regression. The obtained mean absolute errors are similar to those of previous models considering the same number of data points. The ML corrections proposed in this paper could be useful for rapid screening of the large reaction networks that appear in combustion chemistry or in astrochemistry. Finally, our results show that 70% of the features with the highest impact on model output are bespoke predictors. This custom-made set of predictors could be employed by future Δ-ML models to improve the quantitative prediction of other reaction properties.
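The Δ-ML idea mentioned in this abstract can be illustrated with a minimal sketch: rather than learning the high-level barrier height directly, a model learns the difference between the high-level and the cheap low-level value, and the learned correction is added back at prediction time. The sketch below uses synthetic data and a ridge-regularised linear model as a stand-in; the function names, features, and numbers are hypothetical and are not the models or descriptors used in the paper.

```python
import numpy as np

def fit_delta_model(features, low_level, high_level, ridge=1e-8):
    """Fit a linear model to the residual (high_level - low_level)."""
    X = np.column_stack([features, np.ones(len(features))])  # append bias column
    delta = high_level - low_level                           # learning target
    A = X.T @ X + ridge * np.eye(X.shape[1])                 # regularised normal equations
    return np.linalg.solve(A, X.T @ delta)

def predict_high_level(weights, features, low_level):
    """Low-level value plus the learned correction."""
    X = np.column_stack([features, np.ones(len(features))])
    return low_level + X @ weights

# Synthetic stand-in data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
features = rng.normal(size=(50, 3))                # made-up reaction descriptors
low = rng.normal(loc=20.0, scale=5.0, size=50)     # "semiempirical" barrier heights
high = low + features @ np.array([1.5, -0.7, 0.3]) + 2.0  # "DFT-quality" targets

weights = fit_delta_model(features, low, high)
pred = predict_high_level(weights, features, low)
```

Because the correction is usually smoother and smaller in magnitude than the total quantity, a Δ-model of this kind tends to need fewer training points than a model fitted to the high-level values directly.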
|
32
|
Accurate Prediction of Three-Body Intermolecular Interactions via Electron Deformation Density-Based Machine Learning. J Chem Theory Comput 2023; 19:1466-1475. [PMID: 36787280 DOI: 10.1021/acs.jctc.2c00984] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/15/2023]
Abstract
This work extends the electron deformation density-based descriptor, originally developed in the electron deformation density-based interaction energy machine learning (EDDIE-ML) algorithm to predict dimer interaction energies, to the prediction of three-body interactions in trimers. Using a sequential learning process to select the training data, the resulting Gaussian process regression (GPR) model predicts the three-body interaction energy within 0.2 kcal mol⁻¹ of the SRS-MP2/cc-pVTZ reference values for the 3B69 and S22-3 trimer data sets. A hybrid kernel function is introduced, which combines contributions from the average and individual atomic environments, allowing the total trimer interaction energy to be predicted in addition to the three-body contribution using the same descriptor. To extend the range and diversity of trimer interaction energies available in the literature, a new data set based on a protein-ligand crystal structure is introduced, consisting of 509 structures of a central ligand with two protein fragments. Benchmark calculations are provided for the new data set, which contains significantly larger molecular interactions than current databases in the literature in addition to charged fragments. Compared to density functional theory (DFT)- and wavefunction-based methods for calculating the three-body interaction energy, our model makes predictions in a significantly shorter time frame by reducing the number of required SCF calculations from 7 to 4 performed at the PBE0 level of theory, showcasing the utility and efficiency of our Δ-ML method particularly when applied to larger systems.
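The count of seven calculations presumably reflects the standard supermolecular many-body expansion, in which the three-body term of a trimer ABC needs the energies of one trimer, three dimers, and three monomers. A minimal sketch of that bookkeeping follows; the function names and the energies are made up for illustration, and how the paper's model arrives at four calculations is not specified in the abstract.

```python
def three_body_energy(e_abc, e_ab, e_ac, e_bc, e_a, e_b, e_c):
    """Non-additive three-body term from 1 trimer, 3 dimer, and 3 monomer energies."""
    return e_abc - (e_ab + e_ac + e_bc) + (e_a + e_b + e_c)

def total_interaction_energy(e_abc, e_a, e_b, e_c):
    """Total trimer interaction energy relative to the isolated monomers."""
    return e_abc - (e_a + e_b + e_c)

# Made-up energies (arbitrary units): pair interactions of -1, -2, -3
# and a three-body contribution of -0.5.
e_a, e_b, e_c = -10.0, -20.0, -30.0
e_ab, e_ac, e_bc = -31.0, -42.0, -53.0
e_abc = -66.5

print(three_body_energy(e_abc, e_ab, e_ac, e_bc, e_a, e_b, e_c))  # -0.5
print(total_interaction_energy(e_abc, e_a, e_b, e_c))             # -6.5
```

The total interaction energy is the sum of the three pair interactions and the three-body term, which is why the two quantities can be reported from the same set of calculations.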
|
33
|
Size Effects in Gas-phase C-H Activation. Chemphyschem 2023; 24:e202200769. [PMID: 36420565 DOI: 10.1002/cphc.202200769] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/14/2022] [Revised: 11/23/2022] [Accepted: 11/23/2022] [Indexed: 11/25/2022]
Abstract
Gas-phase cluster reactions permit addressing fundamental aspects of the challenges related to C-H activation. The size effect plays a key role in the activation processes as it may substantially affect both the reactivity and selectivity. In this paper, we review the size effect in hydrocarbon oxidation by early transition metal oxides and main group metal oxides and in methane activation mediated by late transition metals. Based on mass-spectrometry experiments in conjunction with quantum chemical calculations, mechanistic discussions are reviewed to present how and why the size greatly regulates the reactivity and product distribution.
|
34
|
Haloboration of o-Alkynyl Phenols Generates Halogenated Bicyclic-Boronates. Angew Chem Int Ed Engl 2023; 62:e202301463. [PMID: 36856077 DOI: 10.1002/anie.202301463] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2023] [Revised: 02/28/2023] [Accepted: 03/01/2023] [Indexed: 03/02/2023]
Abstract
Benzoxaborinines are intermediates en route to bicyclic boronates that are important active pharmaceutical ingredients (APIs). Herein, the haloboration of o-alkynyl-phenols using BX3 (X=Cl or Br) is disclosed as a route to form C4-X-benzoxaborinines with good functional group tolerance. Computational studies indicated that there are two mechanisms with similar barriers: (i) double alkyne haloboration followed by retro-haloboration; (ii) concerted trans-haloboration involving an exogenous chloride source. The C4-halide in these benzoxaborinines is useful, with a one-pot haloboration-Negishi cross coupling protocol effective to form benzoxaborinines with an alkyl or an aryl at C4. Therefore, this method is a useful addition to the toolbox for synthesising bicyclic-boronates that are attracting increasing attention as APIs.
|
35
|
Generalizable Descriptors of Highly Sensitive Detection of As(III) over Transition-Metal Single Atoms: A Combined Density Function Theory and Gradient Boosting Regression Approach. Anal Chem 2023; 95:3666-3674. [PMID: 36656141 DOI: 10.1021/acs.analchem.2c04617] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 01/20/2023]
Abstract
Traditional nanomodified electrodes have made great achievements in electrochemical stripping voltammetry of sensing materials for As(III) detection. However, the intermediate states are complicated to probe because of the ultrashort lifetime and complex reaction conditions of the electron transfer process in electroanalysis, which seriously hinders the identification of the actual active site. Herein, the intrinsic interaction of highly sensitive analytical behavior of nanomaterials is elucidated from the perspective of electronic structure through density functional theory (DFT) and gradient boosting regression (GBR). It is revealed that the atomic radius, d-band center (εd), and the largest coordinative TM-N bond length play a crucial role in regulating the arsenic reduction reaction (ARR) performance by the established ARR process for 27 sets of transition-metal single atoms supported on N-doped graphene. Furthermore, the database composed of filtered intrinsic electronic structural properties and the calculated descriptors of the central metal atom in TM-N4-Gra was also successfully extended to oxygen evolution reaction (OER) systems, which effectively verified the reliability of the whole approach. Overall, a multistep workflow is developed through GBR models combined with DFT for valid screening of sensing materials, which will effectively upgrade the traditional trial-and-error mode for electrochemical interface designing.
|
36
|
Data-Driven Machine Learning for Understanding Surface Structures of Heterogeneous Catalysts. Angew Chem Int Ed Engl 2023; 62:e202216383. [PMID: 36509704 DOI: 10.1002/anie.202216383] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 11/07/2022] [Revised: 12/11/2022] [Accepted: 12/12/2022] [Indexed: 12/15/2022]
Abstract
The design of heterogeneous catalysts is necessarily surface-focused, generally achieved via optimization of adsorption energy and microkinetic modelling. A prerequisite to ensure the adsorption energy is physically meaningful is the stable existence of the conceived active-site structure on the surface. The development of improved understanding of the catalyst surface, however, is challenging practically because of the complex nature of dynamic surface formation and evolution under in-situ reactions. We therefore propose data-driven machine-learning (ML) approaches as a solution. In this Minireview we summarize recent progress in using machine learning to search and predict (meta)stable structures, assist operando simulation under reaction conditions and micro-environments, and critically analyze experimental characterization data. We conclude that ML will become the new norm to lower costs associated with discovery and design of optimal heterogeneous catalysts.
|
37
|
Neural network potentials for chemistry: concepts, applications and prospects. DIGITAL DISCOVERY 2023; 2:28-58. [PMID: 36798879 PMCID: PMC9923808 DOI: 10.1039/d2dd00102k] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 09/23/2022] [Accepted: 12/20/2022] [Indexed: 12/24/2022]
Abstract
Artificial neural networks (NNs) are already heavily involved in methods and applications for frequent tasks in the field of computational chemistry such as representation of potential energy surfaces (PES) and spectroscopic predictions. This perspective provides an overview of the foundations of neural network-based full-dimensional potential energy surfaces, their architectures, underlying concepts, their representation and applications to chemical systems. Methods for data generation and training procedures for PES construction are discussed, and means for error assessment and refinement through transfer learning are presented. A selection of recent results illustrates the latest improvements regarding accuracy of PES representations and system size limitations in dynamics simulations, as well as NN applications enabling direct prediction of physical results without dynamics simulations. The aim is to provide an overview of the current state-of-the-art NN approaches in computational chemistry and also to point out the current challenges in enhancing reliability and applicability of NN methods on a larger scale.
|
38
|
An Imbalance in the Force: The Need for Standardized Benchmarks for Molecular Simulation. J Chem Inf Model 2023; 63:412-431. [PMID: 36630710 PMCID: PMC9875315 DOI: 10.1021/acs.jcim.2c01127] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 09/06/2022] [Indexed: 01/12/2023]
Abstract
Force fields (FFs) for molecular simulation have been under development for more than half a century. As with any predictive model, rigorous testing and comparisons of models critically depend on the availability of standardized data sets and benchmarks. While such benchmarks are rather common in the field of quantum chemistry, this is not the case for empirical FFs. That is, few benchmarks are reused to evaluate FFs, and development teams rather use their own training and test sets. Here we present an overview of currently available tests and benchmarks for computational chemistry, focusing on organic compounds, including halogens and common ions, as FFs for these are the most common ones. We argue that many of the benchmark data sets from quantum chemistry can in fact be reused for evaluating FFs, but new gas phase data are still needed for compounds containing phosphorus and sulfur in different valence states. In addition, more nonequilibrium interaction energies and forces, as well as molecular properties such as electrostatic potentials around compounds, would be beneficial. For the condensed phases there is a large body of experimental data available, and tools to utilize these data in an automated fashion are under development. If FF developers, as well as researchers in artificial intelligence, would adopt a number of these data sets, it would become easier to compare the relative strengths and weaknesses of different models and to, eventually, restore the balance in the force.
|
39
|
Predictive chemistry: machine learning for reaction deployment, reaction development, and reaction discovery. Chem Sci 2023; 14:226-244. [PMID: 36743887 PMCID: PMC9811563 DOI: 10.1039/d2sc05089g] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 09/12/2022] [Accepted: 11/25/2022] [Indexed: 11/29/2022] Open
Abstract
The field of predictive chemistry relates to the development of models able to describe how molecules interact and react. It encompasses the long-standing task of computer-aided retrosynthesis, but is far more reaching and ambitious in its goals. In this review, we summarize several areas where predictive chemistry models hold the potential to accelerate the deployment, development, and discovery of organic reactions and advance synthetic chemistry.
|
40
|
Insight into the micro-mechanism of Co doping to improve the deNOx performance and H2O resistance of β-MnO2 catalysts. Colloids Surf A Physicochem Eng Asp 2023. [DOI: 10.1016/j.colsurfa.2023.130983] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/19/2023]
|
41
|
Fusing 2D and 3D molecular graphs as unambiguous molecular descriptors for conformational and chiral stereoisomers. Brief Bioinform 2022; 24:6931719. [PMID: 36528804 PMCID: PMC9851338 DOI: 10.1093/bib/bbac560] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2022] [Revised: 10/28/2022] [Accepted: 11/15/2022] [Indexed: 12/23/2022] Open
Abstract
The rapid progress of machine learning (ML) in predicting molecular properties enables high-precision predictions to be routinely achieved. However, many ML models, such as conventional molecular graphs, cannot differentiate stereoisomers of certain types, particularly conformational and chiral ones that share the same bonding connectivity but differ in spatial arrangement. Here, we designed a hybrid molecular graph network, Chemical Feature Fusion Network (CFFN), to address the issue by integrating planar and stereo information of molecules in an interweaved fashion. The three-dimensional (3D, i.e., stereo) modality guarantees precision and completeness by providing unabridged information, while the two-dimensional (2D, i.e., planar) modality brings in chemical intuitions as prior knowledge for guidance. The zipper-like arrangement of 2D and 3D information processing promotes cooperativity between them, and their synergy is the key to our model's success. Experiments on various molecular and conformational datasets, including a newly created chiral molecule dataset comprising various configurations and conformations, demonstrate the superior performance of CFFN. The advantage of CFFN is even more significant in datasets made of small samples. Ablation experiments confirm that fusing 2D and 3D molecular graphs as unambiguous molecular descriptors can not only effectively distinguish molecules and their conformations, but also achieve more accurate and robust prediction of quantum chemical properties.
|
42
|
Abstract
We present two machine learning methodologies that are capable of predicting diffusion Monte Carlo (DMC) energies with small data sets (≈60 DMC calculations in total). The first uses voxel deep neural networks (VDNNs) to predict DMC energy densities using Kohn-Sham density functional theory (DFT) electron densities as input. The second uses kernel ridge regression (KRR) to predict atomic contributions to the DMC total energy using atomic environment vectors as input (we used atom-centered symmetry functions, atomic environment vectors from the ANI models, and smooth overlap of atomic positions). We first compare the methodologies on pristine graphene lattices, where we find that the KRR methodology performs best in comparison to gradient boosted decision trees, random forest, Gaussian process regression, and multilayer perceptrons. In addition, KRR outperforms VDNNs by an order of magnitude. Afterward, we study the generalizability of KRR to predict the energy barrier associated with a Stone-Wales defect. Lastly, we move from 2D to 3D materials and use KRR to predict total energies of liquid water. In all cases, we find that the KRR models are more accurate than Kohn-Sham DFT and all mean absolute errors are less than chemical accuracy.
|
43
|
Electronic-Structure Properties from Atom-Centered Predictions of the Electron Density. J Chem Theory Comput 2022. [PMID: 36453538 DOI: 10.1021/acs.jctc.2c00850] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/05/2022]
Abstract
The electron density of a molecule or material has recently received major attention as a target quantity of machine-learning models. A natural choice to construct a model that yields transferable and linear-scaling predictions is to represent the scalar field using a multicentered atomic basis analogous to that routinely used in density fitting approximations. However, the nonorthogonality of the basis poses challenges for the learning exercise, as it requires accounting for all the atomic density components at once. We devise a gradient-based approach to directly minimize the loss function of the regression problem in an optimized and highly sparse feature space. In so doing, we overcome the limitations associated with adopting an atom-centered model to learn the electron density over arbitrarily complex data sets, obtaining very accurate predictions using a comparatively small training set. The enhanced framework is tested on 32-molecule periodic cells of liquid water, presenting enough complexity to require an optimal balance between accuracy and computational efficiency. We show that starting from the predicted density a single Kohn-Sham diagonalization step can be performed to access total energy components that carry an error of just 0.1 meV/atom with respect to the reference density functional calculations. Finally, we test our method on the highly heterogeneous QM9 benchmark data set, showing that a small fraction of the training data is enough to derive ground-state total energies within chemical accuracy.
|
44
|
Hydrothermal Synthesis, crystal structure, DFT studies, and molecular docking of Zn-BTC MOF as potential antiprotozoal agents. J Mol Struct 2022. [DOI: 10.1016/j.molstruc.2022.134825] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2022]
|
45
|
Coupled Cluster Molecular Dynamics of Condensed Phase Systems Enabled by Machine Learning Potentials: Liquid Water Benchmark. PHYSICAL REVIEW LETTERS 2022; 129:226001. [PMID: 36493459 DOI: 10.1103/physrevlett.129.226001] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 05/17/2022] [Revised: 09/05/2022] [Accepted: 10/05/2022] [Indexed: 06/17/2023]
Abstract
Coupled cluster theory is a general and systematic electronic structure method, but in particular the highly accurate "gold standard" coupled cluster singles, doubles and perturbative triples, CCSD(T), can only be applied to small systems. To overcome this limitation, we introduce a framework to transfer CCSD(T) accuracy of finite molecular clusters to extended condensed phase systems using a high-dimensional neural network potential. This approach, which is automated, allows one to perform high-quality coupled cluster molecular dynamics, CCMD, as we demonstrate for liquid water including nuclear quantum effects. The machine learning strategy is very efficient, generic, can be systematically improved, and is applicable to a variety of complex systems.
|
46
|
Machine learning the Hohenberg-Kohn map for molecular excited states. Nat Commun 2022; 13:7044. [DOI: 10.1038/s41467-022-34436-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2022] [Accepted: 10/25/2022] [Indexed: 11/18/2022] Open
Abstract
The Hohenberg-Kohn theorem of density-functional theory establishes the existence of a bijection between the ground-state electron density and the external potential of a many-body system. This guarantees a one-to-one map from the electron density to all observables of interest including electronic excited-state energies. Time-Dependent Density-Functional Theory (TDDFT) provides one framework to resolve this map; however, the approximations inherent in practical TDDFT calculations, together with their computational expense, motivate finding a cheaper, more direct map for electronic excitations. Here, we show that determining density and energy functionals via machine learning allows the equations of TDDFT to be bypassed. The framework we introduce is used to perform the first excited-state molecular dynamics simulations with a machine-learned functional on malonaldehyde and correctly capture the kinetics of its excited-state intramolecular proton transfer, allowing insight into how mechanical constraints can be used to control the proton transfer reaction in this molecule. This development opens the door to using machine-learned functionals for highly efficient excited-state dynamics simulations.
|
47
|
The fourth-order expansion of the exchange hole and neural networks to construct exchange–correlation functionals. J Chem Phys 2022; 157:171103. [DOI: 10.1063/5.0122761] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022] Open
Abstract
The curvature $Q_\sigma$ of spherically averaged exchange (X) holes $\rho_{X,\sigma}(\mathbf{r}, u)$ is one of the crucial variables for the construction of approximations to the exchange–correlation energy of Kohn–Sham theory, the most prominent example being the Becke–Roussel model [A. D. Becke and M. R. Roussel, Phys. Rev. A 39, 3761 (1989)]. Here, we consider the next higher nonzero derivative of the spherically averaged X hole, the fourth-order term $T_\sigma$. This variable contains information about the nonlocality of the X hole and we employ it to approximate hybrid functionals, eliminating the sometimes demanding calculation of the exact X energy. The new functional is constructed using machine learning; having identified a physical correlation between $T_\sigma$ and the nonlocality of the X hole, we employ a neural network to express this relation. While we only modify the X functional of the Perdew–Burke–Ernzerhof functional [Perdew et al., Phys. Rev. Lett. 77, 3865 (1996)], a significant improvement over this method is achieved.
|
48
|
Predicting accurate ab initio DNA electron densities with equivariant neural networks. Biophys J 2022; 121:3883-3895. [PMID: 36057785 PMCID: PMC9674991 DOI: 10.1016/j.bpj.2022.08.045] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2022] [Revised: 07/22/2022] [Accepted: 08/29/2022] [Indexed: 11/19/2022] Open
Abstract
One of the fundamental limitations of accurately modeling biomolecules like DNA is the inability to perform quantum chemistry calculations on large molecular structures. We present a machine learning model based on an equivariant Euclidean neural network framework to obtain accurate ab initio electron densities for arbitrary DNA structures that are much too large for conventional quantum methods. The model is trained on representative B-DNA basepair steps that capture both base pairing and base stacking interactions. The model produces accurate electron densities for arbitrary B-DNA structures with typical errors of less than 1%. Crucially, the error does not increase with system size, which suggests that the model can extrapolate to large DNA structures with negligible loss of accuracy. The model also generalizes reasonably to other DNA structural motifs such as the A- and Z-DNA forms, despite being trained on only B-DNA configurations. The model is used to calculate electron densities of several large-scale DNA structures, and we show that the computational scaling for this model is essentially linear. We also show that this machine learning electron density model can be used to calculate accurate electrostatic potentials for DNA. These electrostatic potentials produce more accurate results compared with classical force fields and do not show the usual deficiencies at short range.
|
49
|
Computational design of magnetic molecules and their environment using quantum chemistry, machine learning and multiscale simulations. Nat Rev Chem 2022; 6:761-781. [PMID: 37118096 DOI: 10.1038/s41570-022-00424-3] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Accepted: 08/15/2022] [Indexed: 11/09/2022]
Abstract
Having served as a playground for fundamental studies on the physics of d and f electrons for almost a century, magnetic molecules are now becoming increasingly important for technological applications, such as magnetic resonance, data storage, spintronics and quantum information. All of these applications require the preservation and control of spins in time, an ability hampered by the interaction with the environment, namely with other spins, conduction electrons, molecular vibrations and electromagnetic fields. Thus, the design of a novel magnetic molecule with tailored properties is a formidable task, which concerns not only its electronic structure but also calls for a deep understanding of the interaction among all the degrees of freedom at play. This Review describes how state-of-the-art ab initio computational methods, combined with data-driven approaches to materials modelling, can be integrated into a fully multiscale strategy capable of defining design rules for magnetic molecules.
|
50
|
Abstract
We report a reliable way to manipulate the dynamic, axial chirality in perylene diimide (PDI)-based twistacenes. Specifically, we reveal how chiral substituents on the imide position induce the helicity in a series of PDI-based twistacenes. We demonstrate that this remote chirality is able to control the helicity of flexible [4]helicene subunits by UV-vis, CD spectroscopy, X-ray crystallography, and TDDFT calculations. Furthermore, we have discovered that the chiral substituent and the solvent each have a strong impact on the sign and intensity of the CD signals, highlighting the control of the dynamic helicity in this flexible system. DFT calculations suggest that the steric interaction of the chiral substituents is the important factor in how effective a particular group is at inducing a preferred helicity.
|