1
Strandgaard M, Linjordet T, Kneiding H, Burnage AL, Nova A, Jensen JH, Balcells D. A Deep Generative Model for the Inverse Design of Transition Metal Ligands and Complexes. JACS Au 2025; 5:2294-2308. [PMID: 40443902 PMCID: PMC12117439 DOI: 10.1021/jacsau.5c00242] [Received: 03/04/2025] [Revised: 04/15/2025] [Accepted: 04/15/2025] [Indexed: 06/02/2025]
Abstract
Deep generative models yielding transition metal complexes (TMCs) remain scarce despite the key role of these compounds in industrial catalytic processes, anticancer therapies, and the energy transition. Compared to drug discovery within the chemical space of organic molecules, TMCs pose further challenges, including the encoding of chemical bonds of higher complexity and the need to optimize multiple properties. In this work, we developed a generative model for the inverse design of transition metal ligands and complexes, based on the junction tree variational autoencoder (JT-VAE). After implementing a SMILES-based encoding of the metal-ligand bonds, the model was trained with the tmQMg-L ligand library, allowing for the generation of thousands of novel, highly diverse monodentate (κ1) and bidentate (κ2) ligands, including imines, phosphines, and carbenes. Further, the generated ligands were labeled with two target properties reflecting the stability and electron density of the associated homoleptic iridium TMCs: the HOMO-LUMO gap (ϵ) and the charge of the metal center (q_Ir). This data was used to implement a conditional model that generated ligands from a prompt, with the single or dual objective of optimizing either or both of the ϵ and q_Ir properties, and allowing for chemical interpretation based on the optimization trajectories. The optimizations also had an impact on other chemical properties, including ligand dissociation energies and oxidative addition barriers. A similar model was implemented to condition ligand generation by solubility and steric bulk.
Affiliation(s)
- Magnus Strandgaard
  - Hylleraas Centre for Quantum Molecular Sciences, Department of Chemistry, University of Oslo, P.O. Box 1033, Blindern, Oslo 0315, Norway
  - Department of Chemistry, University of Copenhagen, Copenhagen 2100, Denmark
- Trond Linjordet
  - Hylleraas Centre for Quantum Molecular Sciences, Department of Chemistry, University of Oslo, P.O. Box 1033, Blindern, Oslo 0315, Norway
- Hannes Kneiding
  - Hylleraas Centre for Quantum Molecular Sciences, Department of Chemistry, University of Oslo, P.O. Box 1033, Blindern, Oslo 0315, Norway
- Arron L. Burnage
  - Hylleraas Centre for Quantum Molecular Sciences, Department of Chemistry, University of Oslo, P.O. Box 1033, Blindern, Oslo 0315, Norway
- Ainara Nova
  - Hylleraas Centre for Quantum Molecular Sciences, Department of Chemistry, University of Oslo, P.O. Box 1033, Blindern, Oslo 0315, Norway
  - Centre for Materials Science and Nanotechnology, Department of Chemistry, University of Oslo, Oslo N-0315, Norway
- Jan Halborg Jensen
  - Department of Chemistry, University of Copenhagen, Copenhagen 2100, Denmark
- David Balcells
  - Hylleraas Centre for Quantum Molecular Sciences, Department of Chemistry, University of Oslo, P.O. Box 1033, Blindern, Oslo 0315, Norway
2
Li C, Sharir O, Yuan S, Chan GKL. Image super-resolution inspired electron density prediction. Nat Commun 2025; 16:4811. [PMID: 40410201 PMCID: PMC12102193 DOI: 10.1038/s41467-025-60095-8] [Received: 03/06/2024] [Accepted: 05/07/2025] [Indexed: 05/25/2025]
Abstract
Predicting ground-state electron densities of chemical systems has recently received growing attention in machine learning quantum chemistry, given their fundamental importance as highlighted by the Hohenberg-Kohn theorem. Drawing inspiration from the domain of image super-resolution, we view the electron density as a 3D grayscale image and use a convolutional residual network to transform a crude and trivially generated guess of the molecular density into an accurate ground-state quantum mechanical density. Here we show that this model produces more accurate predictions than all prior density prediction approaches. Due to its simplicity, the model is directly applicable to unseen molecular conformations and chemical elements. We show that fine-tuning on limited new data provides high accuracy even in challenging cases of exotic elements and charge states.
Affiliation(s)
- Chenghan Li
  - Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, 91125, CA, USA
- Or Sharir
  - Division of Engineering and Applied Sciences, California Institute of Technology, Pasadena, 91125, CA, USA
- Shunyue Yuan
  - Division of Engineering and Applied Sciences, California Institute of Technology, Pasadena, 91125, CA, USA
- Garnet Kin-Lic Chan
  - Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, 91125, CA, USA
3
Cao B, Dong J, Wang Z, Wang L. Large-Scale Non-Adiabatic Dynamics Simulation Based on Machine Learning Hamiltonian and Force Field: The Case of Charge Transport in Monolayer MoS2. J Phys Chem Lett 2025; 16:4907-4920. [PMID: 40346030 DOI: 10.1021/acs.jpclett.5c01037] [Indexed: 05/11/2025]
Abstract
We present an efficient and reliable large-scale non-adiabatic dynamics simulation method based on machine learning Hamiltonian and force field. The quasi-diabatic Hamiltonian network (DHNet) is trained in the Wannier basis based on well-designed translation and rotation invariant structural descriptors, which can effectively capture both local and nonlocal environmental information. Using the representative two-dimensional transition metal dichalcogenide MoS2 as an illustration, we show that density functional theory (DFT) calculations of only ten structures are sufficient to generate the training set for DHNet due to the high efficiency of Wannier analysis and orbital classification in sampling the interorbital couplings. DHNet demonstrates good transferability, thus enabling direct construction of the electronic Hamiltonian matrices for large systems. Compared with direct DFT calculations, DHNet significantly reduces the computational cost by about 5 orders of magnitude. By combining DHNet with the DeePMD machine learning force field, we successfully simulate electron transport in monolayer MoS2 with up to 3675 atoms and 13475 electronic levels by using a state-of-the-art surface hopping method. The electron mobility is calculated to be 110 cm²/(V s), which is in good agreement with the extensive experimental results in the range of 3-200 cm²/(V s) during 2013-2023. Due to the high performance, the proposed DHNet and large-scale non-adiabatic dynamics methods have great potential to be applied to study charge carrier dynamics in a wide range of material systems.
Affiliation(s)
- Bichuan Cao
  - Zhejiang Key Laboratory of Excited-State Energy Conversion and Energy Storage, Department of Chemistry, Zhejiang University, Hangzhou 310058, China
- Jiawei Dong
  - Zhejiang Key Laboratory of Excited-State Energy Conversion and Energy Storage, Department of Chemistry, Zhejiang University, Hangzhou 310058, China
- Zedong Wang
  - Zhejiang Key Laboratory of Excited-State Energy Conversion and Energy Storage, Department of Chemistry, Zhejiang University, Hangzhou 310058, China
- Linjun Wang
  - Zhejiang Key Laboratory of Excited-State Energy Conversion and Energy Storage, Department of Chemistry, Zhejiang University, Hangzhou 310058, China
4
Zhou Y, Zhu H, Yuan Y, Song Z, Mort BC. Machine Learning Classification of Chirality and Optical Rotation Using a Simple One-Hot Encoded Cartesian Coordinate Molecular Representation. J Chem Inf Model 2025; 65:4281-4292. [PMID: 40311114 PMCID: PMC12076508 DOI: 10.1021/acs.jcim.4c02374] [Received: 12/20/2024] [Revised: 04/17/2025] [Accepted: 04/18/2025] [Indexed: 05/03/2025]
Abstract
Absolute stereochemical configurations and optical rotations were computed for 121,416 molecular structures from the QM9 quantum chemistry data set using density functional theory. A representation for the molecules was developed using Cartesian coordinate geometries and encoded atom types to serve as input for various machine learning algorithms. Classifiers were developed and trained to predict the chirality and signs of optical rotations using a variety of machine learning methods. These methods are compared, and the results demonstrate that machine learning is a viable tool for making predictions of the stereochemical properties of molecules.
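The representation described in this abstract (Cartesian coordinates plus encoded atom types) can be sketched roughly as follows. This is only an illustration, not the authors' code: the function names `encode_molecule` and `mirror`, the per-atom layout (one-hot element vector followed by x, y, z), and the zero padding are assumptions. The element list and the 29-atom limit reflect the QM9 data set, which contains H, C, N, O, and F and molecules of at most 29 atoms.

```python
ELEMENTS = ("H", "C", "N", "O", "F")  # elements that occur in QM9
MAX_ATOMS = 29                        # size of the largest QM9 molecule


def encode_molecule(atoms, max_atoms=MAX_ATOMS):
    """Flatten a molecule into a fixed-length feature vector.

    atoms: list of (symbol, x, y, z) tuples.
    Layout per atom (an assumption): one-hot element type, then x, y, z.
    Molecules with fewer than max_atoms atoms are padded with zeros.
    """
    if len(atoms) > max_atoms:
        raise ValueError("molecule has more atoms than max_atoms")
    width = len(ELEMENTS) + 3
    features = []
    for symbol, x, y, z in atoms:
        one_hot = [0.0] * len(ELEMENTS)
        one_hot[ELEMENTS.index(symbol)] = 1.0  # raises ValueError for unknown elements
        features.extend(one_hot + [float(x), float(y), float(z)])
    features.extend([0.0] * (width * (max_atoms - len(atoms))))
    return features


def mirror(atoms):
    """Reflect a geometry through the yz-plane, which swaps enantiomers."""
    return [(s, -x, y, z) for s, x, y, z in atoms]


# Hypothetical three-atom fragment, used only to show the shapes involved.
fragment = [("C", 0.0, 0.0, 0.0), ("O", 1.2, 0.0, 0.0), ("H", -0.5, 0.9, 0.1)]
vector = encode_molecule(fragment)
mirrored_vector = encode_molecule(mirror(fragment))
```

Because the coordinates enter the vector directly, a structure and its mirror image receive different encodings, which is the kind of information a chirality classifier needs. The same property means the encoding is not invariant to rotation or translation, so in practice the geometries would need a consistent orientation or augmentation.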
Affiliation(s)
- Yilin Zhou
  - Center for Integrated Research Computing, University of Rochester, Rochester, New York 14627, United States
- Haoran Zhu
  - Center for Integrated Research Computing, University of Rochester, Rochester, New York 14627, United States
- Yijie Yuan
  - Center for Integrated Research Computing, University of Rochester, Rochester, New York 14627, United States
- Ziyu Song
  - Center for Integrated Research Computing, University of Rochester, Rochester, New York 14627, United States
- Brendan C. Mort
  - Center for Integrated Research Computing, University of Rochester, Rochester, New York 14627, United States
5
Chowdhury C. Quantum support vector classifier for phase diagram prediction in quinary systems. Materials Horizons 2025. [PMID: 40326405 DOI: 10.1039/d5mh00027k] [Indexed: 05/07/2025]
Abstract
The integration of machine learning (ML) in materials science has accelerated the discovery and optimization of novel materials. However, classical ML approaches often face limitations in handling the increasing complexity and scale of modern datasets. Quantum machine learning (QML), leveraging quantum computing principles, offers a promising avenue to address these challenges. This study explores the application of the quantum support vector classifier (QSVC) to predict phase diagrams in the Al-Cu-Mg-Si-Zn quinary system. Predicting the phase diagrams of quinary systems supports sustainable development goals by facilitating the creation of sophisticated materials with exceptional characteristics. This, in turn, promotes innovation and sustainable practices within the industrial sector. We used a comprehensive dataset from high-throughput CALPHAD calculations, and employed QSVC with advanced quantum feature transformations and kernel methods. Our results demonstrate significant improvements in predictive accuracy and efficiency compared to classical SVC, highlighting the potential of QML to advance material design.
Affiliation(s)
- Chandra Chowdhury
  - Advanced Materials Laboratory, CSIR-Central Leather Research Institute, Sardar Patel Road, Adyar, Chennai, 600020, India
6
Mroz AM, Basford AR, Hastedt F, Jayasekera IS, Mosquera-Lois I, Sedgwick R, Ballester PJ, Bocarsly JD, Antonio Del Río Chanona E, Evans ML, Frost JM, Ganose AM, Greenaway RL, Kuok Mimi Hii K, Li Y, Misener R, Walsh A, Zhang D, Jelfs KE. Cross-disciplinary perspectives on the potential for artificial intelligence across chemistry. Chem Soc Rev 2025. [PMID: 40278836 PMCID: PMC12024683 DOI: 10.1039/d5cs00146c] [Received: 02/07/2025] [Indexed: 04/26/2025]
Abstract
From accelerating simulations and exploring chemical space, to experimental planning and integrating automation within experimental labs, artificial intelligence (AI) is changing the landscape of chemistry. We are seeing a significant increase in the number of publications leveraging these powerful data-driven insights and models to accelerate all aspects of chemical research. For example, how we represent molecules and materials to computer algorithms for predictive and generative models, as well as the physical mechanisms by which we perform experiments in the lab for automation. Here, we present ten diverse perspectives on the impact of AI coming from those with a range of backgrounds from experimental chemistry, computational chemistry, computer science, engineering and across different areas of chemistry, including drug discovery, catalysis, chemical automation, chemical physics, materials chemistry. The ten perspectives presented here cover a range of themes, including AI for computation, facilitating discovery, supporting experiments, and enabling technologies for transformation. We highlight and discuss imminent challenges and ways in which we are redefining problems to accelerate the impact of chemical research via AI.
Affiliation(s)
- Austin M Mroz
  - Department of Chemistry, Imperial College London, London W12 0BZ, UK
  - I-X Centre for AI in Science, Imperial College London, London W12 0BZ, UK
- Annabel R Basford
  - Department of Chemistry, Imperial College London, London W12 0BZ, UK
- Friedrich Hastedt
  - Department of Chemical Engineering, Imperial College London, London SW7 2AZ, UK
- Ruby Sedgwick
  - Department of Computing, Imperial College London, London SW7 2AZ, UK
- Pedro J Ballester
  - Department of Bioengineering, Imperial College London, London SW7 2AZ, UK
- Joshua D Bocarsly
  - Department of Chemistry and Texas Center for Superconductivity, University of Houston, Houston, USA
- Matthew L Evans
  - UCLouvain, Institute of Condensed Matter and Nanosciences (IMCN), Chemin des Étoiles 8, Louvain-la-Neuve 1348, Belgium
  - Matgenix SRL, A6K Advanced Engineering Center, Charleroi, Belgium
  - Datalab Industries Ltd, King's Lynn, Norfolk, UK
- Jarvist M Frost
  - Department of Chemistry, Imperial College London, London W12 0BZ, UK
- Alex M Ganose
  - Department of Chemistry, Imperial College London, London W12 0BZ, UK
- Yingzhen Li
  - Department of Computing, Imperial College London, London SW7 2AZ, UK
- Ruth Misener
  - Department of Computing, Imperial College London, London SW7 2AZ, UK
- Aron Walsh
  - Department of Materials, Imperial College London, London SW7 2AZ, UK
- Dandan Zhang
  - I-X Centre for AI in Science, Imperial College London, London W12 0BZ, UK
  - Department of Bioengineering, Imperial College London, London SW7 2AZ, UK
- Kim E Jelfs
  - Department of Chemistry, Imperial College London, London W12 0BZ, UK
7
Loche P, Huguenin-Dumittan KK, Honarmand M, Xu Q, Rumiantsev E, How WB, Langer MF, Ceriotti M. Fast and flexible long-range models for atomistic machine learning. J Chem Phys 2025; 162:142501. [PMID: 40197567 DOI: 10.1063/5.0251713] [Received: 12/04/2024] [Accepted: 03/03/2025] [Indexed: 04/10/2025]
Abstract
Most atomistic machine learning (ML) models rely on a locality ansatz and decompose the energy into a sum of short-ranged, atom-centered contributions. This leads to clear limitations when trying to describe problems that are dominated by long-range physical effects, most notably electrostatics. Many approaches have been proposed to overcome these limitations, but efforts to make them efficient and widely available are hampered by the need to incorporate an ad hoc implementation of methods to treat long-range interactions. We develop a framework aiming to bring some of the established algorithms to evaluate non-bonded interactions (including Ewald summation, classical particle-mesh Ewald, and particle-particle/particle-mesh Ewald) into atomistic ML. We provide a reference implementation for PyTorch as well as an experimental one for JAX. Beyond Coulomb and more general long-range potentials, we introduce purified descriptors that disregard the immediate neighborhood of each atom and are more suitable for general long-range ML applications. Our implementations are fast, feature-rich, and modular: they provide an accurate evaluation of physical long-range forces that can be used in the construction of (semi)empirical baseline potentials; they exploit the availability of automatic differentiation to seamlessly combine long-range models with conventional, local ML schemes; and they are sufficiently flexible to implement more complex architectures that use physical interactions as building blocks. We benchmark and demonstrate our torch-pme and jax-pme libraries to perform molecular dynamics simulations, train ML potentials, and evaluate long-range equivariant descriptors of atomic structures.
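As a rough illustration of the Ewald summation named in this abstract, the sketch below evaluates the electrostatic energy of a small periodic cell in plain Python. It does not use or reproduce the torch-pme or jax-pme API. The function names `ewald_energy` and `rock_salt_cell`, their parameters, the Gaussian-unit convention, the cubic-cell restriction, and the rock-salt test system are all assumptions chosen for illustration.

```python
import math
from itertools import product


def ewald_energy(positions, charges, box, alpha, n_real=2, n_recip=5):
    """Electrostatic energy of a neutral cubic periodic cell (Gaussian units)."""
    volume = box ** 3
    n = len(charges)

    # Real-space term: erfc-screened pair sum over nearby periodic images.
    e_real = 0.0
    for shift in product(range(-n_real, n_real + 1), repeat=3):
        for i in range(n):
            for j in range(n):
                if i == j and shift == (0, 0, 0):
                    continue  # an ion does not interact with itself
                d = [positions[i][a] - positions[j][a] + shift[a] * box
                     for a in range(3)]
                r = math.sqrt(sum(x * x for x in d))
                e_real += 0.5 * charges[i] * charges[j] * math.erfc(alpha * r) / r

    # Reciprocal-space term: smooth long-range part summed over k-vectors.
    e_recip = 0.0
    for m in product(range(-n_recip, n_recip + 1), repeat=3):
        if m == (0, 0, 0):
            continue
        k = [2.0 * math.pi * mi / box for mi in m]
        k2 = sum(x * x for x in k)
        phases = [sum(k[a] * p[a] for a in range(3)) for p in positions]
        s_re = sum(q * math.cos(ph) for q, ph in zip(charges, phases))
        s_im = sum(q * math.sin(ph) for q, ph in zip(charges, phases))
        weight = math.exp(-k2 / (4.0 * alpha ** 2)) / k2
        e_recip += (2.0 * math.pi / volume) * weight * (s_re ** 2 + s_im ** 2)

    # Self term: removes each screening Gaussian's interaction with its own ion.
    e_self = -alpha / math.sqrt(math.pi) * sum(q * q for q in charges)
    return e_real + e_recip + e_self


def rock_salt_cell():
    """Hypothetical test system: 8 unit charges, nearest-neighbour distance 1."""
    positions = list(product((0.0, 1.0), repeat=3))
    charges = [1.0 if sum(p) % 2 == 0 else -1.0 for p in positions]
    return positions, charges, 2.0


positions, charges, box = rock_salt_cell()
energy_per_ion_pair = ewald_energy(positions, charges, box, alpha=1.5) / 4
print(energy_per_ion_pair)
```

For this rock-salt arrangement the energy per ion pair should come out close to the negative of the NaCl Madelung constant (about 1.7476), and the total should not depend on the splitting parameter `alpha` once both sums are converged. The triple loops make the cost of this sketch grow quickly with system size, which is the motivation for the mesh-based variants the abstract lists.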
Affiliation(s)
- Philip Loche
  - Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
  - National Centre for Computational Design and Discovery of Novel Materials (MARVEL), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Kevin K Huguenin-Dumittan
  - Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Melika Honarmand
  - Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
  - National Centre for Computational Design and Discovery of Novel Materials (MARVEL), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Qianjun Xu
  - Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Egor Rumiantsev
  - Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
  - National Centre for Computational Design and Discovery of Novel Materials (MARVEL), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Wei Bin How
  - Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
  - National Centre for Computational Design and Discovery of Novel Materials (MARVEL), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Marcel F Langer
  - Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Michele Ceriotti
  - Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
  - National Centre for Computational Design and Discovery of Novel Materials (MARVEL), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
8
Pathirage PDVS, Quebedeaux B, Akram S, Vogiatzis KD. Transferability Across Different Molecular Systems and Levels of Theory with the Data-Driven Coupled-Cluster Scheme. J Phys Chem A 2025; 129:2988-2997. [PMID: 40132101 DOI: 10.1021/acs.jpca.4c05718] [Indexed: 03/27/2025]
Abstract
Machine learning has recently been introduced into the arsenal of tools that are available to computational chemists. In the past few years, we have seen an increase in the applicability of these tools on a plethora of applications, including the automated exploration of a large fraction of the chemical space, the reduction of repetitive computational tasks, the detection of outliers on large databases, and the acceleration of molecular simulations. An attractive application of machine learning in molecular electronic structure theory is the "recycling" of molecular wave functions for faster and more accurate completion of complex quantum chemical calculations. Along these lines, we have developed hybrid quantum chemical/machine learning workflows that utilize information from low-level wave functions for the accurate prediction of higher-level wave functions. The data-driven coupled-cluster (DDCC) family of methods is discussed in this article together with the importance of the inclusion of physical properties in such hybrid workflows. After a short introduction to the philosophy and the capabilities of DDCC, we present our recent progress in extending its applicability to larger and more complex molecular structures and data sets. A significant advantage offered by DDCC is its transferability, with respect to different molecular systems and different excitation levels. As we show here, predicted wave functions at the coupled-cluster singles and doubles level of theory can be used for the accurate prediction of the perturbative triples of the CCSD(T) scheme. We conclude with some personal considerations with respect to future directions related to the development of the next generation of such hybrid quantum chemical/machine learning models.
Affiliation(s)
- P D Varuna S Pathirage
  - Department of Chemistry, University of Tennessee, Knoxville, Tennessee 37996-1600, United States
- Brody Quebedeaux
  - Department of Chemistry, University of Tennessee, Knoxville, Tennessee 37996-1600, United States
- Shahzad Akram
  - Department of Chemistry, University of Tennessee, Knoxville, Tennessee 37996-1600, United States
- Konstantinos D Vogiatzis
  - Department of Chemistry, University of Tennessee, Knoxville, Tennessee 37996-1600, United States
9
Blinov A, Rekhman Z, Yasnaya M, Gvozdenko A, Golik A, Kravtsov A, Shevchenko I, Askerova A, Prasolova A, Pirogov M, Piskov S, Rzhepakovsky I, Nagdalian A. Enhancement of stability and activity of zinc carbonate nanoparticles using chitosan, hydroxyethyl cellulose, methyl cellulose and hyaluronic acid for multifaceted applications in medicine. Int J Biol Macromol 2025; 298:139768. [PMID: 39818387 DOI: 10.1016/j.ijbiomac.2025.139768] [Received: 10/05/2024] [Revised: 12/30/2024] [Accepted: 01/09/2025] [Indexed: 01/18/2025]
Abstract
Currently, biopolymer-based Zn-containing nanoforms are of great interest for medical applications. However, there is a lack of information on the optimal synthesis parameters, reagents and stabilizing agents for the production of zinc carbonate nanoparticles (ZnC-NPs). In this work, ZnC-NPs were synthesized by chemical precipitation using chitosan, hydroxyethyl cellulose, methyl cellulose and hyaluronic acid as stabilizing agents. The optimal precursor (Zn(CH3COO)2) and the optimal precipitator ((NH₄)₂CO₃) were identified. ZnC-NPs consisted of a single phase (Zn5(OH)6(CO3)2) with diameters from 35 to 120 nm. The optimal synthesis parameters were set as a stoichiometric ratio of precursor to precipitator and the maximum concentration of biopolymer. It was found that the polymers are sorbed on different crystallographic planes of the crystallites, which affects the morphology of Zn5(OH)6(CO3)2. Quantum chemical modelling revealed that all models of interaction are energetically favorable (∆E > 9788.910 kcal/mol) and that the interaction preferably occurs through the OH group, which was confirmed by FTIR spectroscopy of the synthesized samples. Notably, the CAM assay and histological evaluation showed that ZnC-NPs stabilized with chitosan (as a representative of the considered biopolymers) have no toxic effect and are compatible with the CAM biological environment, which opens great potential for further studies of biopolymer-stabilized ZnC-NPs for multifaceted applications in medicine.
Affiliation(s)
- Andrey Blinov
  - North Caucasus Federal University, 355000 Stavropol, Russia
- Zafar Rekhman
  - North Caucasus Federal University, 355000 Stavropol, Russia
- Mariya Yasnaya
  - North Caucasus Federal University, 355000 Stavropol, Russia
- Alexey Golik
  - North Caucasus Federal University, 355000 Stavropol, Russia
- Alina Askerova
  - North Caucasus Federal University, 355000 Stavropol, Russia
- Maksim Pirogov
  - North Caucasus Federal University, 355000 Stavropol, Russia
- Sergey Piskov
  - North Caucasus Federal University, 355000 Stavropol, Russia
10
Jiang M, Wang Z, Chen Y, Zhang W, Zhu Z, Yan W, Wu J, Xu X. X2-PEC: A Neural Network Model Based on Atomic Pair Energy Corrections. J Comput Chem 2025; 46:e70081. [PMID: 40099806 DOI: 10.1002/jcc.70081] [Received: 02/10/2025] [Revised: 02/27/2025] [Accepted: 02/28/2025] [Indexed: 03/20/2025]
Abstract
With the development of artificial neural networks (ANNs), their applications in chemistry have become increasingly widespread, especially in the prediction of various molecular properties. This work introduces the X2-PEC method, that is, the second generalization of the X1 series of ANN methods developed in our group, utilizing pair energy correction (PEC). The essence of the X2 model lies in its feature vector construction, using overlap integrals and core Hamiltonian integrals to incorporate physical and chemical information into the feature vectors to describe atomic interactions. It aims to enhance the accuracy of low-rung density functional theory (DFT) calculations, such as those from the widely used BLYP/6-31G(d) or B3LYP/6-31G(2df,p) methods, to the level of top-rung DFT calculations, such as those from the highly accurate doubly hybrid XYGJ-OS/GTLarge method. Trained on the QM9 dataset, X2-PEC excels in predicting the atomization energies of isomers such as C6H8 and C4H4N2O with varying bonding structures. The performance of the X2-PEC model on standard enthalpies of formation for datasets such as G2-HCNOF, PSH36, ALKANE28, BIGMOL20, and HEDM45, as well as a HCNOF subset of BH9 for reaction barriers, is equally commendable, demonstrating its good generalization ability and predictive accuracy, as well as its potential for further development to achieve greater accuracy. These outcomes highlight the practical significance of the X2-PEC model in elevating the results from lower-rung DFT calculations to the level of higher-rung DFT calculations through deep learning.
Affiliation(s)
- Minghong Jiang
  - Collaborative Innovation Center of Chemistry for Energy Materials, Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, MOE Key Laboratory of Computational Physical Sciences, Department of Chemistry, Fudan University, Shanghai, China
- Zhanfeng Wang
  - Collaborative Innovation Center of Chemistry for Energy Materials, Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, MOE Key Laboratory of Computational Physical Sciences, Department of Chemistry, Fudan University, Shanghai, China
- Yicheng Chen
  - Collaborative Innovation Center of Chemistry for Energy Materials, Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, MOE Key Laboratory of Computational Physical Sciences, Department of Chemistry, Fudan University, Shanghai, China
- Wenhao Zhang
  - Collaborative Innovation Center of Chemistry for Energy Materials, Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, MOE Key Laboratory of Computational Physical Sciences, Department of Chemistry, Fudan University, Shanghai, China
- Zhenyu Zhu
  - Collaborative Innovation Center of Chemistry for Energy Materials, Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, MOE Key Laboratory of Computational Physical Sciences, Department of Chemistry, Fudan University, Shanghai, China
- Wenjie Yan
  - Collaborative Innovation Center of Chemistry for Energy Materials, Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, MOE Key Laboratory of Computational Physical Sciences, Department of Chemistry, Fudan University, Shanghai, China
- Jianming Wu
  - Collaborative Innovation Center of Chemistry for Energy Materials, Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, MOE Key Laboratory of Computational Physical Sciences, Department of Chemistry, Fudan University, Shanghai, China
- Xin Xu
  - Collaborative Innovation Center of Chemistry for Energy Materials, Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, MOE Key Laboratory of Computational Physical Sciences, Department of Chemistry, Fudan University, Shanghai, China
  - Hefei National Laboratory, Hefei, China
11
Souza Mattos R, Mukherjee S, Barbatti M. Legion: A Platform for Gaussian Wavepacket Nonadiabatic Dynamics. J Chem Theory Comput 2025; 21:2189-2205. [PMID: 40025765 PMCID: PMC11948330 DOI: 10.1021/acs.jctc.4c01697] [Received: 12/11/2024] [Revised: 02/18/2025] [Accepted: 02/19/2025] [Indexed: 03/04/2025]
Abstract
Nonadiabatic molecular dynamics is crucial in investigating the time evolution of excited states in molecular systems. Among the various methods for performing such dynamics, those employing frozen Gaussian wavepacket propagation, particularly the multiple spawning approach, offer a favorable balance between computational cost and reliability. It propagates on-the-fly trajectories used to build and propagate the nuclear wavepacket. Despite its potential, efficient, flexible, and easily accessible software for Gaussian wavepacket propagation is less common compared to other methods, such as surface hopping. To address this, we present Legion, a software that facilitates the development and application of classical-trajectory-guided quantum wavepacket methods. The version presented here already contains a highly flexible and fully functional ab initio multiple spawning implementation, with different strategies to improve efficiency. Legion is written in Python for data management and NumPy/Fortran for numerical operations. It is created under the umbrella of the Newton-X platform and inherits all of its electronic structure interfaces beyond other direct interfaces. It also contains new approximations that allow it to circumvent the computation of the nonadiabatic coupling, extending the electronic structure methods that can be used for multiple spawning dynamics. We test, validate, and demonstrate Legion's functionalities through multiple spawning dynamics of fulvene (CASSCF and CASPT2) and DMABN (TDDFT).
Affiliation(s)
- Saikat Mukherjee
  - Aix Marseille University, CNRS, ICR, Marseille 13397, France
  - Faculty of Chemistry, Nicolaus Copernicus University in Torun, Torun 87100, Poland
- Mario Barbatti
  - Aix Marseille University, CNRS, ICR, Marseille 13397, France
  - Institut Universitaire de France, Paris 75231, France
12
Rath Y, Booth GH. Interpolating numerically exact many-body wave functions for accelerated molecular dynamics. Nat Commun 2025; 16:2005. [PMID: 40011445 DOI: 10.1038/s41467-025-57134-9] [Received: 02/29/2024] [Accepted: 02/07/2025] [Indexed: 02/28/2025]
Abstract
While there have been many developments in computational probes of both strongly-correlated molecular systems and machine-learning accelerated molecular dynamics, there remains a significant gap in capabilities in simulating accurate non-local electronic structure over timescales on which atoms move. We develop an approach to bridge these fields with a practical interpolation scheme for the correlated many-electron state through the space of atomic configurations, whilst avoiding the exponential complexity of these underlying electronic states. With a small number of accurate correlated wave functions as a training set, we demonstrate provable convergence to near-exact potential energy surfaces for subsequent dynamics with propagation of a valid many-body wave function and inference of its variational energy whilst retaining a mean-field computational scaling. This represents a profoundly different paradigm to the direct interpolation of potential energy surfaces in established machine-learning approaches. We combine this with modern electronic structure approaches to systematically resolve molecular dynamics trajectories and converge thermodynamic quantities with a high throughput of several million interpolated wave functions, with explicit validation of their accuracy from only a few numerically exact quantum chemical calculations. We also highlight the comparison to traditional machine-learned potentials or dynamics on mean-field surfaces.
Affiliation(s)
- Yannic Rath
- National Physical Laboratory, Teddington, UK.
- Department of Physics and Thomas Young Centre, King's College London, London, UK.
| | - George H Booth
- Department of Physics and Thomas Young Centre, King's College London, London, UK.
13
Ng WP, Zhang Z, Yang J. Accurate Neural Network Fine-Tuning Approach for Transferable Ab Initio Energy Prediction across Varying Molecular and Crystalline Scales. J Chem Theory Comput 2025; 21:1602-1614. [PMID: 39902570 DOI: 10.1021/acs.jctc.4c01261] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/05/2025]
Abstract
Existing machine learning models attempt to predict the energies of large molecules by training on small molecules, but they eventually fail to retain high accuracy as the errors increase with system size. Through an orbital pairwise decomposition of the correlation energy, a neural network model pretrained on hundred-scale data containing small molecules is demonstrated to be sufficiently transferable for accurately predicting large systems, including molecules and crystals. Our model introduces a residual connection to explicitly learn the pairwise energy corrections, and employs various low-rank retraining techniques to modestly adjust the learned network parameters. We demonstrate that by retraining the base model, originally trained only on small molecules of (H2O)6, with as few as one larger molecule, the MP2 correlation energy of large liquid water (H2O)64 in a periodic supercell can be predicted at chemical accuracy. Similar performance is observed for large protonated clusters and periodic polyglycine chains. A demonstrative application is presented to predict the energy ordering of symmetrically inequivalent sublattices for distinct hydrogen orientations in the ice XV phase. Our work represents an important step forward in the quest for cost-effective, highly accurate and transferable neural network models in quantum chemistry, bridging the electronic structure patterns between small and large systems.
Affiliation(s)
- Wai-Pan Ng
- Department of Chemistry, The University of Hong Kong, Hong Kong 999077, P. R. China
| | - Zili Zhang
- Department of Chemistry, The University of Hong Kong, Hong Kong 999077, P. R. China
| | - Jun Yang
- Department of Chemistry, The University of Hong Kong, Hong Kong 999077, P. R. China
- Hong Kong Quantum AI Lab Limited, Hong Kong 999077, P. R. China
14
Delgado-Granados LH, Sager-Smith LM, Trifonova K, Mazziotti DA. Machine Learning of Two-Electron Reduced Density Matrices for Many-Body Problems. J Phys Chem Lett 2025:2231-2237. [PMID: 39983757 DOI: 10.1021/acs.jpclett.4c03366] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/23/2025]
Abstract
We present a novel machine learning algorithm for the many-electron problem, predicting the convex combination of two-electron reduced density matrices (2-RDMs), obtained from upper- and lower-bound energy calculations, that closely approximates the exact energy. In contrast to other recently developed approaches based on the wave function or one-electron density, our 2-RDM machine-learning approach predicts energies and properties without steep scaling or functional approximation. As conjectured by Preskill and co-workers, a small amount of data in a physics-based machine learning algorithm (in this case, information about the RDMs and their violation of selected higher-order N-representability conditions) yields highly accurate electronic energies that capture both dynamic and static correlation. We demonstrate the method by predicting the potential energy curves for BH and N2 within a few millihartrees of results from exact diagonalization. This machine learning algorithm provides a general framework for improving electronic structure calculations, with the potential for wide-reaching applications to both moderately and strongly correlated molecular systems.
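The convex combination mentioned in this abstract can be written out as a short sketch. The symbols used here (the weight $w$, the bounding matrices ${}^{2}D_{\mathrm{upper}}$ and ${}^{2}D_{\mathrm{lower}}$, and the reduced Hamiltonian ${}^{2}K$) are notation assumed for illustration and are not taken from the paper:

```latex
\begin{align}
  {}^{2}D(w) &= w\,{}^{2}D_{\mathrm{upper}} + (1 - w)\,{}^{2}D_{\mathrm{lower}},
  \qquad 0 \le w \le 1, \\
  E(w) &= \operatorname{Tr}\!\left[{}^{2}K\,{}^{2}D(w)\right]
        = w\,E_{\mathrm{upper}} + (1 - w)\,E_{\mathrm{lower}} .
\end{align}
```

Because the energy is linear in the 2-RDM, $E(w)$ varies linearly between the two bounds, so whenever $E_{\mathrm{lower}} \le E_{\mathrm{exact}} \le E_{\mathrm{upper}}$ some weight reproduces the exact energy. Under this reading, the learning task amounts to predicting that weight; how the paper parametrizes it is not stated in the abstract.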
Affiliation(s)
- Luis H Delgado-Granados
- Department of Chemistry and The James Franck Institute, The University of Chicago, Chicago, Illinois 60637, United States
| | - LeeAnn M Sager-Smith
- Department of Chemistry and Physics, Saint Mary's College, Notre Dame, Indiana 46556, United States
| | - Kristina Trifonova
- Department of Chemistry and The James Franck Institute, The University of Chicago, Chicago, Illinois 60637, United States
| | - David A Mazziotti
- Department of Chemistry and The James Franck Institute, The University of Chicago, Chicago, Illinois 60637, United States
15
Shi H, Shen C, Huang Z, Dong K. Machine Learning-Guided Prediction of Hydroformylation. Chemphyschem 2025; 26:e202400773. [PMID: 39468908 DOI: 10.1002/cphc.202400773] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2024] [Revised: 09/30/2024] [Accepted: 10/28/2024] [Indexed: 10/30/2024]
Abstract
A holistic model for predicting the yield and linear selectivity of the hydroformylation of 1-octene was developed by machine learning using experimental data collected from the literature. Physical organic chemistry (POC) parameter-based descriptors were adopted to represent pre-catalyst molecular features. Machine learning models trained with the Random Forest (RF) and Extreme Gradient Boosting (XGBoost) algorithms showed remarkable performance in predicting linear selectivity. The method can also comprehensively map the correlation between reaction conditions and the results. The accuracy of the predictions was verified by experimental data.
Affiliation(s)
- Haonan Shi
- Chang-Kung Chuang Institute, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, 200241, China
- Shanghai Frontiers Science Center of Molecule Intelligent Syntheses, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, 200062, China
| | - Chaoren Shen
- Chang-Kung Chuang Institute, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, 200241, China
- Shanghai Frontiers Science Center of Molecule Intelligent Syntheses, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, 200062, China
| | - Zheng Huang
- Chang-Kung Chuang Institute, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, 200241, China
- State Key Laboratory of Organometallic Chemistry, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, 345 Lingling Road, Shanghai, 200032, China
| | - Kaiwu Dong
- Chang-Kung Chuang Institute, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, 200241, China
- Shanghai Frontiers Science Center of Molecule Intelligent Syntheses, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, 200062, China
16
Tang H, Xiao B, He W, Subasic P, Harutyunyan AR, Wang Y, Liu F, Xu H, Li J. Approaching coupled-cluster accuracy for molecular electronic structures with multi-task learning. NATURE COMPUTATIONAL SCIENCE 2025; 5:144-154. [PMID: 39730875 DOI: 10.1038/s43588-024-00747-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2024] [Accepted: 11/21/2024] [Indexed: 12/29/2024]
Abstract
Machine learning plays an important role in quantum chemistry, providing fast-to-evaluate predictive models for various properties of molecules; however, most existing machine learning models for molecular electronic properties use density functional theory (DFT) databases as ground truth in training, and their prediction accuracy cannot surpass that of DFT. In this work we developed a unified machine learning method for electronic structures of organic molecules using the gold-standard CCSD(T) calculations as training data. Tested on hydrocarbon molecules, our model outperforms DFT with several widely used hybrid and double-hybrid functionals in terms of both computational cost and prediction accuracy of various quantum chemical properties. We apply the model to aromatic compounds and semiconducting polymers, evaluating both ground- and excited-state properties. The results demonstrate the model's accuracy and generalization capability to complex systems that cannot be calculated using CCSD(T)-level methods due to scaling.
Affiliation(s)
- Hao Tang
- Department of Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Brian Xiao
- Department of Physics, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Wenhao He
- The Center for Computational Science and Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Yao Wang
- Department of Chemistry, Emory University, Atlanta, GA, USA
| | - Fang Liu
- Department of Chemistry, Emory University, Atlanta, GA, USA
| | - Haowei Xu
- Department of Nuclear Science and Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA.
| | - Ju Li
- Department of Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA.
- Department of Nuclear Science and Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA.
17
Cheng AH, Ser CT, Skreta M, Guzmán-Cordero A, Thiede L, Burger A, Aldossary A, Leong SX, Pablo-García S, Strieth-Kalthoff F, Aspuru-Guzik A. Spiers Memorial Lecture: How to do impactful research in artificial intelligence for chemistry and materials science. Faraday Discuss 2025; 256:10-60. [PMID: 39400305 DOI: 10.1039/d4fd00153b] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/15/2024]
Abstract
Machine learning has been pervasively touching many fields of science. Chemistry and materials science are no exception. While machine learning has been making a great impact, it is still not reaching its full potential or maturity. In this perspective, we first outline current applications across a diversity of problems in chemistry. Then, we discuss how machine learning researchers view and approach problems in the field. Finally, we provide our considerations for maximizing impact when researching machine learning for chemistry.
Affiliation(s)
- Austin H Cheng
- Department of Chemistry, University of Toronto, Toronto, Ontario M5S 3H6, Canada.
- Department of Computer Science, University of Toronto, Toronto, Ontario M5S 2E4, Canada
- Vector Institute for Artificial Intelligence, Toronto, Ontario M5G 1M1, Canada
| | - Cher Tian Ser
- Department of Chemistry, University of Toronto, Toronto, Ontario M5S 3H6, Canada.
- Department of Computer Science, University of Toronto, Toronto, Ontario M5S 2E4, Canada
- Vector Institute for Artificial Intelligence, Toronto, Ontario M5G 1M1, Canada
| | - Marta Skreta
- Department of Computer Science, University of Toronto, Toronto, Ontario M5S 2E4, Canada
- Vector Institute for Artificial Intelligence, Toronto, Ontario M5G 1M1, Canada
| | - Andrés Guzmán-Cordero
- Vector Institute for Artificial Intelligence, Toronto, Ontario M5G 1M1, Canada
- Tinbergen Institute, University of Amsterdam, Amsterdam, Netherlands
| | - Luca Thiede
- Department of Computer Science, University of Toronto, Toronto, Ontario M5S 2E4, Canada
- Vector Institute for Artificial Intelligence, Toronto, Ontario M5G 1M1, Canada
| | - Andreas Burger
- Department of Computer Science, University of Toronto, Toronto, Ontario M5S 2E4, Canada
- Vector Institute for Artificial Intelligence, Toronto, Ontario M5G 1M1, Canada
| | - Shi Xuan Leong
- Department of Chemistry, University of Toronto, Toronto, Ontario M5S 3H6, Canada.
- School of Chemistry, Chemical Engineering and Biotechnology, Nanyang Technological University, Singapore 63737, Singapore
| | - Alán Aspuru-Guzik
- Department of Chemistry, University of Toronto, Toronto, Ontario M5S 3H6, Canada.
- Department of Computer Science, University of Toronto, Toronto, Ontario M5S 2E4, Canada
- Vector Institute for Artificial Intelligence, Toronto, Ontario M5G 1M1, Canada
- Acceleration Consortium, Toronto, Ontario M5G 1X6, Canada
- Department of Chemical Engineering and Applied Chemistry, University of Toronto, Canada
- Department of Materials Science and Engineering, University of Toronto, Canada
- Lebovic Fellow, Canadian Institute for Advanced Research (CIFAR), Canada
18
Kulichenko M, Nebgen B, Lubbers N, Smith JS, Barros K, Allen AEA, Habib A, Shinkle E, Fedik N, Li YW, Messerly RA, Tretiak S. Data Generation for Machine Learning Interatomic Potentials and Beyond. Chem Rev 2024; 124:13681-13714. [PMID: 39572011 DOI: 10.1021/acs.chemrev.4c00572] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/26/2024]
Abstract
The field of data-driven chemistry is undergoing an evolution, driven by innovations in machine learning models for predicting molecular properties and behavior. Recent strides in ML-based interatomic potentials have paved the way for accurate modeling of diverse chemical and structural properties at the atomic level. The key determinant defining MLIP reliability remains the quality of the training data. A paramount challenge lies in constructing training sets that capture specific domains in the vast chemical and structural space. This Review navigates the intricate landscape of essential components and integrity of training data that ensure the extensibility and transferability of the resulting models. We delve into the details of active learning, discussing its various facets and implementations. We outline different types of uncertainty quantification applied to atomistic data acquisition and the correlations between estimated uncertainty and true error. The role of atomistic data samplers in generating diverse and informative structures is highlighted. Furthermore, we discuss data acquisition via modified and surrogate potential energy surfaces as an innovative approach to diversify training data. The Review also provides a list of publicly available data sets that cover essential domains of chemical space.
Affiliation(s)
- Maksim Kulichenko
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
| | - Benjamin Nebgen
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
| | - Nicholas Lubbers
- Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
| | - Justin S Smith
- NVIDIA Corporation, Santa Clara, California 95051, United States
| | - Kipton Barros
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
| | - Alice E A Allen
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
| | - Adela Habib
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
| | - Emily Shinkle
- Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
| | - Nikita Fedik
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
| | - Ying Wai Li
- Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
| | - Richard A Messerly
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
| | - Sergei Tretiak
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Center for Integrated Nanotechnologies, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
19
Chan B, Dawson W, Nakajima T. Data Quality in the Fitting of Approximate Models: A Computational Chemistry Perspective. J Chem Theory Comput 2024; 20:10468-10476. [PMID: 39556867 DOI: 10.1021/acs.jctc.4c01063] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2024]
Abstract
Empirical parametrization underpins many scientific methodologies including certain quantum-chemistry protocols [e.g., density functional theory (DFT), machine-learning (ML) models]. In some cases, the fitting requires a large amount of data, necessitating the use of data obtained using low-cost, and thus low-quality, means. Here we examine the effect of using low-quality data on the resulting method in the context of DFT methods. We use multiple G2/97 data sets of different qualities to fit the DFT-type methods. Encouragingly, this fitting can tolerate a relatively large proportion of low-quality fitting data, which may be attributed to the physical foundations of the DFT models and the use of a modest number of parameters. Further examination using "ML-quality" data shows that adding a large amount of low-quality data to a small amount of high-quality data may not offer tangible benefits. On the other hand, when the high-quality data is limited in scope, diversification by a modest amount of low-quality data improves the performance. Quantitatively, for parametrizing DFT (and perhaps also quantum-chemistry ML models), caution should be taken when more than 50% of the fitting set is questionable data and the average error of the full set exceeds 20 kJ mol⁻¹. One may also follow the recently proposed transferability principles to ensure diversity in the fitting set.
Affiliation(s)
- Bun Chan
- Graduate School of Engineering, Nagasaki University, Bunkyo 1-14, Nagasaki 852-8521, Japan
- RIKEN Center for Computational Science, 7-1-26, Minatojima-minami-machi, Chuo-ku, Kobe 650-0047, Japan
| | - William Dawson
- RIKEN Center for Computational Science, 7-1-26, Minatojima-minami-machi, Chuo-ku, Kobe 650-0047, Japan
| | - Takahito Nakajima
- RIKEN Center for Computational Science, 7-1-26, Minatojima-minami-machi, Chuo-ku, Kobe 650-0047, Japan
20
Thiemann FL, O'Neill N, Kapil V, Michaelides A, Schran C. Introduction to machine learning potentials for atomistic simulations. JOURNAL OF PHYSICS. CONDENSED MATTER : AN INSTITUTE OF PHYSICS JOURNAL 2024; 37:073002. [PMID: 39577092 DOI: 10.1088/1361-648x/ad9657] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/16/2024] [Accepted: 11/22/2024] [Indexed: 11/24/2024]
Abstract
Machine learning potentials have revolutionised the field of atomistic simulations in recent years and are becoming a mainstay in the toolbox of computational scientists. This paper aims to provide an overview of and introduction to machine learning potentials and their practical application to scientific problems. We provide a systematic guide for developing machine learning potentials, reviewing chemical descriptors, regression models, data generation and validation approaches. We begin with an emphasis on the earlier generation of models, such as high-dimensional neural network potentials and Gaussian approximation potentials, to provide historical perspective and guide the reader towards the understanding of recent developments, which are discussed in detail thereafter. Furthermore, we refer to relevant expert reviews, open-source software, and practical examples, further lowering the barrier to exploring these methods. The paper ends with selected showcase examples, highlighting the capabilities of machine learning potentials and how they can be applied to push the boundaries in atomistic simulations.
Affiliation(s)
- Fabian L Thiemann
- IBM Research Europe, Daresbury, Warrington WA4 4AD, United Kingdom
- Cavendish Laboratory, Department of Physics, University of Cambridge, Cambridge CB3 0HE, United Kingdom
| | - Niamh O'Neill
- Cavendish Laboratory, Department of Physics, University of Cambridge, Cambridge CB3 0HE, United Kingdom
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
- Lennard-Jones Centre, University of Cambridge, Trinity Ln, Cambridge CB2 1TN, United Kingdom
| | - Venkat Kapil
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
- Lennard-Jones Centre, University of Cambridge, Trinity Ln, Cambridge CB2 1TN, United Kingdom
- Department of Physics and Astronomy, University College London, London, United Kingdom
- Thomas Young Centre and London Centre for Nanotechnology, London, United Kingdom
| | - Angelos Michaelides
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
- Lennard-Jones Centre, University of Cambridge, Trinity Ln, Cambridge CB2 1TN, United Kingdom
| | - Christoph Schran
- Cavendish Laboratory, Department of Physics, University of Cambridge, Cambridge CB3 0HE, United Kingdom
- Lennard-Jones Centre, University of Cambridge, Trinity Ln, Cambridge CB2 1TN, United Kingdom
21
Yuan K, Zhou S, Li N, Li T, Ding B, Guo D, Ma Y. Fault-tolerant quantum chemical calculations with improved machine-learning models. J Comput Chem 2024; 45:2640-2658. [PMID: 39072777 DOI: 10.1002/jcc.27459] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2023] [Revised: 05/30/2024] [Accepted: 06/18/2024] [Indexed: 07/30/2024]
Abstract
Easy and effective usage of computational resources is crucial for scientific calculations. Following our recent work on machine-learning (ML) assisted scheduling optimization [J. Comput. Chem. 2023, 44, 1174], we further propose (1) improved ML models for better prediction of computational loads, from which more elaborate load-balancing calculations can be expected; (2) the idea of coded computation, that is, the integration of gradient coding, in order to introduce fault tolerance during distributed calculations; and (3) their application, together with the renormalized exciton model with time-dependent density functional theory (REM-TDDFT), to the calculation of excited states. Illustrative benchmark calculations include the P38 protein and solvent models with one or several excitable centers. The results show that the improved ML-assisted coded calculations can further improve load balancing and cluster utilization, with the main benefit being fault tolerance, a step toward automated quantum chemical calculations for both ground and excited states.
Affiliation(s)
- Kai Yuan
- College of Information Science and Technology, Beijing University of Chemical Technology, Beijing, China
- Computer Network Information Center, Chinese Academy of Sciences, Beijing, China
| | - Shuai Zhou
- Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
- School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing, China
| | - Ning Li
- College of Chemistry and Materials Engineering, Wenzhou University, Wenzhou, China
| | - Tianyan Li
- Computer Network Information Center, Chinese Academy of Sciences, Beijing, China
| | - Bowen Ding
- Institute of Chemistry, Chinese Academy of Sciences, Beijing, China
| | - Danhuai Guo
- College of Information Science and Technology, Beijing University of Chemical Technology, Beijing, China
| | - Yingjin Ma
- Computer Network Information Center, Chinese Academy of Sciences, Beijing, China
22
Liang C, Rouzhahong Y, Yao S, Liang J, Yu C, Wang B, Li H. A Cluster-Based Deep Learning Model Perceiving Series Correlation for Accurate Prediction of Phonon Spectrum. ADVANCED SCIENCE (WEINHEIM, BADEN-WURTTEMBERG, GERMANY) 2024; 11:e2406183. [PMID: 39422637 PMCID: PMC11633492 DOI: 10.1002/advs.202406183] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2024] [Revised: 09/30/2024] [Indexed: 10/19/2024]
Abstract
Spectral properties are the most prevalent continuous representation for characterizing transport phenomena and excitation responses, yet their accurate prediction remains a challenge because existing machine learning (ML) models cannot perceive series correlations. Herein, an ML model named cluster-based series graph networks (CSGN) is developed based on the dynamical theory of crystal lattices to predict the phonon density of states (PDOS) spectrum of crystalline materials. A multiple atomic cluster representation is constructed to capture the diverse vibration modes, while a mixture Gaussian process and a dynamic time warping mechanism are combined to project from clusters to the PDOS spectrum. Accurate predictions of complicated spectra with multiple or overlapping peaks are achieved. The high performance of the CSGN model can be attributed to the pertinent feature extraction and the appropriate similarity evaluation, which enable the natural perception of the structure-property relation and intrinsic series correlations, as confirmed by the predictive results. The transferable and interpretable CSGN model advances ML predictions of spectral properties and reveals the potential of designing ML methods based on physical mechanisms.
Affiliation(s)
- Chao Liang
- School of Physics, Sun Yat-Sen University, Guangzhou 510275, China
| | - Yilimiranmu Rouzhahong
- School of Materials Science and Engineering, Dongguan University of Technology, Dongguan 523808, China
| | - Shunwei Yao
- School of Physics, Sun Yat-Sen University, Guangzhou 510275, China
| | - Junhao Liang
- School of Physics, Sun Yat-Sen University, Guangzhou 510275, China
| | - Chunlin Yu
- School of Physics, Sun Yat-Sen University, Guangzhou 510275, China
| | - Biao Wang
- School of Materials Science and Engineering, Dongguan University of Technology, Dongguan 523808, China
| | - Huashan Li
- School of Physics, Sun Yat-Sen University, Guangzhou 510275, China
- Guangdong Provincial Key Laboratory of Magnetoelectric Physics and Devices, School of Physics, Sun Yat-sen University, Guangzhou 510275, China
- Center for Neutron Science and Technology, School of Physics, Sun Yat-sen University, Guangzhou 510275, China
23
Lai J, Kan B, Wu Y, Fu Q, Shang H, Li Z, Yang J. Accurate Calculation of Interatomic Forces with Neural Networks Based on a Generative Transformer Architecture. J Chem Theory Comput 2024; 20:9478-9487. [PMID: 39440863 DOI: 10.1021/acs.jctc.4c01205] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2024]
Abstract
Using neural networks to express electronic wave functions represents a new paradigm for solving the Schrödinger equation in quantum chemistry. For practical quantum chemistry simulations, one needs to know not only energies of molecules, but also accurate forces acting on constituent atoms. In this work, we achieve the accurate calculation of interatomic forces on QiankunNet, a platform that combines transformer-based deep neural networks with efficient batched autoregressive sampling. Our approach permits the application of the Hellmann-Feynman theorem to force calculations without introducing corrective Pulay terms. The results show that the calculated interatomic forces are in close agreement with those derived from the full configuration interaction method, irrespective of whether the system is a simple molecule or a strongly correlated electron system like a linear hydrogen chain. Furthermore, the calculated interatomic forces are employed for atomic relaxation in the torsional rotation process of ethylene, and the energy barrier obtained from the scanned potential energy surface is in excellent agreement with the experiment. Our work contributes to the application of artificial intelligence to broader quantum chemistry simulations, such as modeling challenging chemical transformations where electron correlations are difficult to describe.
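For reference, the Hellmann-Feynman expression that this abstract relies on can be sketched in its standard textbook form. The notation ($\mathbf{F}_I$ for the force on nucleus $I$ at position $\mathbf{R}_I$, $\Psi$ for the normalized electronic wave function) is assumed here and is not taken from the paper:

```latex
\begin{equation}
  \mathbf{F}_I = -\frac{\mathrm{d}E}{\mathrm{d}\mathbf{R}_I}
  = -\left\langle \Psi \,\middle|\,
      \frac{\partial \hat{H}}{\partial \mathbf{R}_I}
    \,\middle|\, \Psi \right\rangle ,
  \qquad \langle \Psi | \Psi \rangle = 1 .
\end{equation}
```

The equality holds for an exact eigenstate, or for a fully variational wave function whose basis does not depend on the nuclear positions. When those conditions fail, additional terms from the derivative of the wave function (the Pulay terms mentioned in the abstract) generally appear.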
Affiliation(s)
- Juntao Lai
- School of Future Technology, University of Science and Technology of China, Hefei 230026, China
- Hefei National Laboratory, University of Science and Technology of China, Hefei 230088, China
| | - Bowen Kan
- Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
| | - Yangjun Wu
- Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
| | - Qiang Fu
- School of Future Technology, University of Science and Technology of China, Hefei 230026, China
- Hefei National Laboratory, University of Science and Technology of China, Hefei 230088, China
| | - Honghui Shang
- Key Laboratory of Precision and Intelligent Chemistry, University of Science and Technology of China, Hefei 230026, China
| | - Zhenyu Li
- Hefei National Laboratory, University of Science and Technology of China, Hefei 230088, China
- Key Laboratory of Precision and Intelligent Chemistry, University of Science and Technology of China, Hefei 230026, China
| | - Jinlong Yang
- Hefei National Laboratory, University of Science and Technology of China, Hefei 230088, China
- Key Laboratory of Precision and Intelligent Chemistry, University of Science and Technology of China, Hefei 230026, China
24
Hwang W, Austin SL, Blondel A, Boittier ED, Boresch S, Buck M, Buckner J, Caflisch A, Chang HT, Cheng X, Choi YK, Chu JW, Crowley MF, Cui Q, Damjanovic A, Deng Y, Devereux M, Ding X, Feig MF, Gao J, Glowacki DR, Gonzales JE, Hamaneh MB, Harder ED, Hayes RL, Huang J, Huang Y, Hudson PS, Im W, Islam SM, Jiang W, Jones MR, Käser S, Kearns FL, Kern NR, Klauda JB, Lazaridis T, Lee J, Lemkul JA, Liu X, Luo Y, MacKerell AD, Major DT, Meuwly M, Nam K, Nilsson L, Ovchinnikov V, Paci E, Park S, Pastor RW, Pittman AR, Post CB, Prasad S, Pu J, Qi Y, Rathinavelan T, Roe DR, Roux B, Rowley CN, Shen J, Simmonett AC, Sodt AJ, Töpfer K, Upadhyay M, van der Vaart A, Vazquez-Salazar LI, Venable RM, Warrensford LC, Woodcock HL, Wu Y, Brooks CL, Brooks BR, Karplus M. CHARMM at 45: Enhancements in Accessibility, Functionality, and Speed. J Phys Chem B 2024; 128:9976-10042. [PMID: 39303207 PMCID: PMC11492285 DOI: 10.1021/acs.jpcb.4c04100] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2024] [Revised: 08/15/2024] [Accepted: 08/22/2024] [Indexed: 09/22/2024]
Abstract
Since its inception nearly a half century ago, CHARMM has been playing a central role in computational biochemistry and biophysics. Commensurate with the developments in experimental research and advances in computer hardware, the range of methods and applicability of CHARMM have also grown. This review summarizes major developments that occurred after 2009 when the last review of CHARMM was published. They include the following: new faster simulation engines, accessible user interfaces for convenient workflows, and a vast array of simulation and analysis methods that encompass quantum mechanical, atomistic, and coarse-grained levels, as well as extensive coverage of force fields. In addition to providing the current snapshot of the CHARMM development, this review may serve as a starting point for exploring relevant theories and computational methods for tackling contemporary and emerging problems in biomolecular systems. CHARMM is freely available for academic and nonprofit research at https://academiccharmm.org/program.
Affiliation(s)
- Wonmuk Hwang
- Department of Biomedical Engineering, Texas A&M University, College Station, Texas 77843, United States
- Department of Materials Science and Engineering, Texas A&M University, College Station, Texas 77843, United States
- Department of Physics and Astronomy, Texas A&M University, College Station, Texas 77843, United States
- Center for AI and Natural Sciences, Korea Institute for Advanced Study, Seoul 02455, Republic of Korea
| | - Steven L. Austin
- Department of Chemistry, University of South Florida, Tampa, Florida 33620, United States
| | - Arnaud Blondel
- Institut Pasteur, Université Paris Cité, CNRS UMR3825, Structural Bioinformatics Unit, 28 rue du Dr. Roux F-75015 Paris, France
| | - Eric D. Boittier
- Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
| | - Stefan Boresch
- Faculty of Chemistry, Department of Computational Biological Chemistry, University of Vienna, Wahringerstrasse 17, 1090 Vienna, Austria
| | - Matthias Buck
- Department of Physiology and Biophysics, Case Western Reserve University, School of Medicine, Cleveland, Ohio 44106, United States
| | - Joshua Buckner
- Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States
| | - Amedeo Caflisch
- Department of Biochemistry, University of Zürich, CH-8057 Zürich, Switzerland
| | - Hao-Ting Chang
- Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu 30010, Taiwan, ROC
| | - Xi Cheng
- Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
| | - Yeol Kyo Choi
- Department of Biological Sciences, Lehigh University, Bethlehem, Pennsylvania 18015, United States
| | - Jhih-Wei Chu
- Institute of Bioinformatics and Systems Biology, Department of Biological Science and Technology, Institute of Molecular Medicine and Bioengineering, and Center for Intelligent Drug Systems and Smart Bio-devices (IDSB), National Yang Ming Chiao Tung University, Hsinchu 30010, Taiwan, ROC
| | - Michael F. Crowley
- Renewable Resources and Enabling Sciences Center, National Renewable Energy Laboratory, Golden, Colorado 80401, United States
| | - Qiang Cui
- Department of Chemistry, Boston University, 590 Commonwealth Avenue, Boston, Massachusetts 02215, United States
- Department of Physics, Boston University, 590 Commonwealth Avenue, Boston, Massachusetts 02215, United States
- Department of Biomedical Engineering, Boston University, 44 Cummington Mall, Boston, Massachusetts 02215, United States
| | - Ana Damjanovic
- Department of Biophysics, Johns Hopkins University, Baltimore, Maryland 21218, United States
- Department of Physics and Astronomy, Johns Hopkins University, Baltimore, Maryland 21218, United States
- Laboratory of Computational Biology, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
| | - Yuqing Deng
- Shanghai R&D Center, DP Technology, Ltd., Shanghai 201210, China
| | - Mike Devereux
- Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
| | - Xinqiang Ding
- Department of Chemistry, Tufts University, Medford, Massachusetts 02155, United States
| | - Michael F. Feig
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States
| | - Jiali Gao
- School of Chemical Biology & Biotechnology, Peking University Shenzhen Graduate School, Shenzhen, Guangdong 518055, China
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, Guangdong 518055, China
- Department of Chemistry and Supercomputing Institute, University of Minnesota, Minneapolis, Minnesota 55455, United States
| | - David R. Glowacki
- CiTIUS Centro Singular de Investigación en Tecnoloxías Intelixentes da USC, 15705 Santiago de Compostela, Spain
| | - James E. Gonzales
- Department of Biomedical Engineering, Texas A&M University, College Station, Texas 77843, United States
- Laboratory of Computational Biology, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
| | - Mehdi Bagerhi Hamaneh
- Department of Physiology and Biophysics, Case Western Reserve University, School of Medicine, Cleveland, Ohio 44106, United States
| | | | - Ryan L. Hayes
- Department of Chemical and Biomolecular Engineering, University of California, Irvine, Irvine, California 92697, United States
- Department of Pharmaceutical Sciences, University of California, Irvine, Irvine, California 92697, United States
| | - Jing Huang
- Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang 310024, China
| | - Yandong Huang
- College of Computer Engineering, Jimei University, Xiamen 361021, China
| | - Phillip S. Hudson
- Department of Chemistry, University of South Florida, Tampa, Florida 33620, United States
- Medicine Design, Pfizer Inc., Cambridge, Massachusetts 02139, United States
| | - Wonpil Im
- Department of Biological Sciences, Lehigh University, Bethlehem, Pennsylvania 18015, United States
| | - Shahidul M. Islam
- Department of Chemistry, Delaware State University, Dover, Delaware 19901, United States
| | - Wei Jiang
- Computational Science Division, Argonne National Laboratory, Argonne, Illinois 60439, United States
| | - Michael R. Jones
- Laboratory of Computational Biology, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
| | - Silvan Käser
- Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
| | - Fiona L. Kearns
- Department of Chemistry, University of South Florida, Tampa, Florida 33620, United States
| | - Nathan R. Kern
- Department of Biological Sciences, Lehigh University, Bethlehem, Pennsylvania 18015, United States
| | - Jeffery B. Klauda
- Department of Chemical and Biomolecular Engineering, Institute for Physical Science and Technology, Biophysics Program, University of Maryland, College Park, Maryland 20742, United States
| | - Themis Lazaridis
- Department of Chemistry, City College of New York, New York, New York 10031, United States
| | - Jinhyuk Lee
- Disease Target Structure Research Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon 34141, Republic of Korea
- Department of Bioinformatics, KRIBB School of Bioscience, University of Science and Technology, Daejeon 34141, Republic of Korea
| | - Justin A. Lemkul
- Department of Biochemistry, Virginia Polytechnic Institute and State University, Blacksburg, Virginia 24061, United States
| | - Xiaorong Liu
- Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States
| | - Yun Luo
- Department of Biotechnology and Pharmaceutical Sciences, College of Pharmacy, Western University of Health Sciences, Pomona, California 91766, United States
| | - Alexander D. MacKerell
- Department of Pharmaceutical Sciences, University of Maryland School of Pharmacy, Baltimore, Maryland 21201, United States
| | - Dan T. Major
- Department of Chemistry and Institute for Nanotechnology & Advanced Materials, Bar-Ilan University, Ramat-Gan 52900, Israel
| | - Markus Meuwly
- Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
- Department of Chemistry, Brown University, Providence, Rhode Island 02912, United States
| | - Kwangho Nam
- Department of Chemistry and Biochemistry, University of Texas at Arlington, Arlington, Texas 76019, United States
| | - Lennart Nilsson
- Karolinska Institutet, Department of Biosciences and Nutrition, SE-14183 Huddinge, Sweden
| | - Victor Ovchinnikov
- Harvard University, Department of Chemistry and Chemical Biology, Cambridge, Massachusetts 02138, United States
| | - Emanuele Paci
- Dipartimento di Fisica e Astronomia, Universitá di Bologna, Bologna 40127, Italy
| | - Soohyung Park
- Department of Biological Sciences, Lehigh University, Bethlehem, Pennsylvania 18015, United States
| | - Richard W. Pastor
- Laboratory of Computational Biology, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
| | - Amanda R. Pittman
- Department of Chemistry, University of South Florida, Tampa, Florida 33620, United States
| | - Carol Beth Post
- Borch Department of Medicinal Chemistry and Molecular Pharmacology, Purdue University, West Lafayette, Indiana 47907, United States
| | - Samarjeet Prasad
- Laboratory of Computational Biology, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
| | - Jingzhi Pu
- Department of Chemistry and Chemical Biology, Indiana University Indianapolis, Indianapolis, Indiana 46202, United States
| | - Yifei Qi
- School of Pharmacy, Fudan University, Shanghai 201203, China
| | | | - Daniel R. Roe
- Laboratory of Computational Biology, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
| | - Benoit Roux
- Department of Chemistry, University of Chicago, Chicago, Illinois 60637, United States
| | | | - Jana Shen
- Department of Pharmaceutical Sciences, University of Maryland School of Pharmacy, Baltimore, Maryland 21201, United States
| | - Andrew C. Simmonett
- Laboratory of Computational Biology, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
| | - Alexander J. Sodt
- Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, Maryland 20892, United States
| | - Kai Töpfer
- Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
| | - Meenu Upadhyay
- Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
| | - Arjan van der Vaart
- Department of Chemistry, University of South Florida, Tampa, Florida 33620, United States
| | | | - Richard M. Venable
- Laboratory of Computational Biology, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
| | - Luke C. Warrensford
- Department of Chemistry, University of South Florida, Tampa, Florida 33620, United States
| | - H. Lee Woodcock
- Department of Chemistry, University of South Florida, Tampa, Florida 33620, United States
| | - Yujin Wu
- Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States
| | - Charles L. Brooks
- Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States
| | - Bernard R. Brooks
- Laboratory of Computational Biology, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
| | - Martin Karplus
- Harvard University, Department of Chemistry and Chemical Biology, Cambridge, Massachusetts 02138, United States
- Laboratoire de Chimie Biophysique, ISIS, Université de Strasbourg, 67000 Strasbourg, France
| |
|
25
|
Tang Z, Li H, Lin P, Gong X, Jin G, He L, Jiang H, Ren X, Duan W, Xu Y. A deep equivariant neural network approach for efficient hybrid density functional calculations. Nat Commun 2024; 15:8815. [PMID: 39394190 PMCID: PMC11470148 DOI: 10.1038/s41467-024-53028-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2024] [Accepted: 09/24/2024] [Indexed: 10/13/2024] Open
Abstract
Hybrid density functional calculations are essential for accurate description of electronic structure, yet their widespread use is restricted by the substantial computational cost. Here we develop DeepH-hybrid, a deep equivariant neural network method for learning the hybrid-functional Hamiltonian as a function of material structure, which circumvents the time-consuming self-consistent field iterations and enables the study of large-scale materials with hybrid-functional accuracy. Our extensive experiments demonstrate good reliability as well as effective transferability and efficiency of the method. As a notable application, DeepH-hybrid is applied to study large-supercell Moiré-twisted materials, offering the first case study on how the inclusion of exact exchange affects flat bands in magic-angle twisted bilayer graphene. The work generalizes deep-learning electronic structure methods to beyond conventional density functional theory, facilitating the development of deep-learning-based ab initio methods.
Affiliation(s)
- Zechen Tang
- State Key Laboratory of Low Dimensional Quantum Physics and Department of Physics, Tsinghua University, 100084, Beijing, China
| | - He Li
- State Key Laboratory of Low Dimensional Quantum Physics and Department of Physics, Tsinghua University, 100084, Beijing, China
- Institute for Advanced Study, Tsinghua University, 100084, Beijing, China
| | - Peize Lin
- Beijing National Laboratory for Condensed Matter Physics, Institute of Physics, Chinese Academy of Sciences, 100190, Beijing, China
- Songshan Lake Materials Laboratory, 523808, Dongguan, Guangdong, China
- Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, 230026, Hefei, Anhui, China
| | - Xiaoxun Gong
- State Key Laboratory of Low Dimensional Quantum Physics and Department of Physics, Tsinghua University, 100084, Beijing, China
- School of Physics, Peking University, 100871, Beijing, China
| | - Gan Jin
- Key Laboratory of Quantum Information, University of Science and Technology of China, 230026, Hefei, Anhui, China
| | - Lixin He
- Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, 230026, Hefei, Anhui, China
- Key Laboratory of Quantum Information, University of Science and Technology of China, 230026, Hefei, Anhui, China
| | - Hong Jiang
- College of Chemistry and Molecular Engineering, Peking University, 100871, Beijing, China
| | - Xinguo Ren
- Beijing National Laboratory for Condensed Matter Physics, Institute of Physics, Chinese Academy of Sciences, 100190, Beijing, China.
- Songshan Lake Materials Laboratory, 523808, Dongguan, Guangdong, China.
| | - Wenhui Duan
- State Key Laboratory of Low Dimensional Quantum Physics and Department of Physics, Tsinghua University, 100084, Beijing, China.
- Institute for Advanced Study, Tsinghua University, 100084, Beijing, China.
- Frontier Science Center for Quantum Information, Beijing, China.
| | - Yong Xu
- State Key Laboratory of Low Dimensional Quantum Physics and Department of Physics, Tsinghua University, 100084, Beijing, China.
- Frontier Science Center for Quantum Information, Beijing, China.
- RIKEN Center for Emergent Matter Science (CEMS), Wako, Saitama, 351-0198, Japan.
| |
|
26
|
Gong X, Louie SG, Duan W, Xu Y. Generalizing deep learning electronic structure calculation to the plane-wave basis. NATURE COMPUTATIONAL SCIENCE 2024; 4:752-760. [PMID: 39363113 PMCID: PMC11499277 DOI: 10.1038/s43588-024-00701-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2024] [Accepted: 09/04/2024] [Indexed: 10/05/2024]
Abstract
Deep neural networks capable of representing the density functional theory (DFT) Hamiltonian as a function of material structure hold great promise for revolutionizing future electronic structure calculations. However, a notable limitation of previous neural networks is their compatibility solely with the atomic-orbital (AO) basis, excluding the widely used plane-wave (PW) basis. Here we overcome this critical limitation by proposing an accurate and efficient real-space reconstruction method for directly computing AO Hamiltonian matrices from PW DFT results. The reconstruction method is orders of magnitude faster than traditional projection-based methods to convert PW results to the AO basis, and the reconstructed Hamiltonian matrices can faithfully reproduce the PW electronic structure, thus bridging the longstanding gap between the AO basis deep learning electronic structure approach and PW DFT. Advantages of the PW methods, such as high accuracy, high flexibility and wide applicability, thus can be all integrated into deep learning electronic structure methods without sacrificing these methods' inherent benefits. This allows for the construction of large-scale and high-fidelity training datasets with the help of PW DFT results towards the development of precise and broadly applicable deep learning electronic structure models.
Affiliation(s)
- Xiaoxun Gong
- State Key Laboratory of Low Dimensional Quantum Physics and Department of Physics, Tsinghua University, Beijing, China
- Department of Physics, University of California at Berkeley, Berkeley, CA, USA
- Materials Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Steven G Louie
- Department of Physics, University of California at Berkeley, Berkeley, CA, USA.
- Materials Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA.
| | - Wenhui Duan
- State Key Laboratory of Low Dimensional Quantum Physics and Department of Physics, Tsinghua University, Beijing, China.
- Institute for Advanced Study, Tsinghua University, Beijing, China.
- Frontier Science Center for Quantum Information, Beijing, China.
| | - Yong Xu
- State Key Laboratory of Low Dimensional Quantum Physics and Department of Physics, Tsinghua University, Beijing, China.
- Frontier Science Center for Quantum Information, Beijing, China.
- RIKEN Center for Emergent Matter Science (CEMS), Wako, Japan.
| |
|
27
|
Wang Y, Li Y, Tang Z, Li H, Yuan Z, Tao H, Zou N, Bao T, Liang X, Chen Z, Xu S, Bian C, Xu Z, Wang C, Si C, Duan W, Xu Y. Universal materials model of deep-learning density functional theory Hamiltonian. Sci Bull (Beijing) 2024; 69:2514-2521. [PMID: 38942699 DOI: 10.1016/j.scib.2024.06.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2024] [Revised: 06/07/2024] [Accepted: 06/08/2024] [Indexed: 06/30/2024]
Abstract
Realizing large materials models has emerged as a critical endeavor for materials research in the new era of artificial intelligence, but how to achieve this fantastic and challenging objective remains elusive. Here, we propose a feasible pathway to address this paramount pursuit by developing universal materials models of deep-learning density functional theory Hamiltonian (DeepH), enabling computational modeling of the complicated structure-property relationship of materials in general. By constructing a large materials database and substantially improving the DeepH method, we obtain a universal materials model of DeepH capable of handling diverse elemental compositions and material structures, achieving remarkable accuracy in predicting material properties. We further showcase a promising application of fine-tuning universal materials models for enhancing specific materials models. This work not only demonstrates the concept of DeepH's universal materials model but also lays the groundwork for developing large materials models, opening up significant opportunities for advancing artificial intelligence-driven materials discovery.
Affiliation(s)
- Yuxiang Wang
- State Key Laboratory of Low Dimensional Quantum Physics and Department of Physics, Tsinghua University, Beijing 100084, China
| | - Yang Li
- State Key Laboratory of Low Dimensional Quantum Physics and Department of Physics, Tsinghua University, Beijing 100084, China
| | - Zechen Tang
- State Key Laboratory of Low Dimensional Quantum Physics and Department of Physics, Tsinghua University, Beijing 100084, China
| | - He Li
- State Key Laboratory of Low Dimensional Quantum Physics and Department of Physics, Tsinghua University, Beijing 100084, China; Institute for Advanced Study, Tsinghua University, Beijing 100084, China
| | - Zilong Yuan
- State Key Laboratory of Low Dimensional Quantum Physics and Department of Physics, Tsinghua University, Beijing 100084, China
| | - Honggeng Tao
- State Key Laboratory of Low Dimensional Quantum Physics and Department of Physics, Tsinghua University, Beijing 100084, China
| | - Nianlong Zou
- State Key Laboratory of Low Dimensional Quantum Physics and Department of Physics, Tsinghua University, Beijing 100084, China
| | - Ting Bao
- State Key Laboratory of Low Dimensional Quantum Physics and Department of Physics, Tsinghua University, Beijing 100084, China
| | - Xinghao Liang
- State Key Laboratory of Low Dimensional Quantum Physics and Department of Physics, Tsinghua University, Beijing 100084, China
| | - Zezhou Chen
- State Key Laboratory of Low Dimensional Quantum Physics and Department of Physics, Tsinghua University, Beijing 100084, China
| | - Shanghua Xu
- State Key Laboratory of Low Dimensional Quantum Physics and Department of Physics, Tsinghua University, Beijing 100084, China
| | - Ce Bian
- State Key Laboratory of Low Dimensional Quantum Physics and Department of Physics, Tsinghua University, Beijing 100084, China
| | - Zhiming Xu
- State Key Laboratory of Low Dimensional Quantum Physics and Department of Physics, Tsinghua University, Beijing 100084, China
| | - Chong Wang
- State Key Laboratory of Low Dimensional Quantum Physics and Department of Physics, Tsinghua University, Beijing 100084, China
| | - Chen Si
- School of Materials Science and Engineering, Beihang University, Beijing 100191, China
| | - Wenhui Duan
- State Key Laboratory of Low Dimensional Quantum Physics and Department of Physics, Tsinghua University, Beijing 100084, China; Institute for Advanced Study, Tsinghua University, Beijing 100084, China; Frontier Science Center for Quantum Information, Beijing 100084, China.
| | - Yong Xu
- State Key Laboratory of Low Dimensional Quantum Physics and Department of Physics, Tsinghua University, Beijing 100084, China; Frontier Science Center for Quantum Information, Beijing 100084, China; RIKEN Center for Emergent Matter Science (CEMS), Wako 351-0198, Japan.
| |
|
28
|
Schatz GC, Wodtke AM, Yang X. Spiers Memorial Lecture: New directions in molecular scattering. Faraday Discuss 2024; 251:9-62. [PMID: 38764350 DOI: 10.1039/d4fd00015c] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/21/2024]
Abstract
The field of molecular scattering is reviewed as it pertains to gas-gas as well as gas-surface chemical reaction dynamics. We emphasize the importance of collaboration of experiment and theory, from which new directions of research are being pursued on increasingly complex problems. We review both experimental and theoretical advances that provide the modern toolbox available to molecular-scattering studies. We distinguish between two classes of work. The first involves simple systems and uses experiment to validate theory so that from the validated theory, one may learn far more than could ever be measured in the laboratory. The second class involves problems of great complexity that would be difficult or impossible to understand without a partnership of experiment and theory. Key topics covered in this review include crossed-beams reactive scattering and scattering at extremely low energies, where quantum effects dominate. They also include scattering from surfaces, reactive scattering and kinetics at surfaces, and scattering work done at liquid surfaces. The review closes with thoughts on future promising directions of research.
Affiliation(s)
- George C Schatz
- Dept of Chemistry, Northwestern University, Evanston, Illinois 60208, USA
| | - Alec M Wodtke
- Institute for Physical Chemistry, Georg August University, Goettingen, Germany
- Max Planck Institute for Multidisciplinary Natural Sciences, Goettingen, Germany.
- International Center for the Advanced Studies of Energy Conversion, Georg August University, Goettingen, Germany
| | - Xueming Yang
- Dalian Institute for Chemical Physics, Chinese Academy of Sciences, Dalian, China
- Department of Chemistry, College of Science, Southern University of Science and Technology, Shenzhen, China
| |
|
29
|
Li Y, Tang Z, Chen Z, Sun M, Zhao B, Li H, Tao H, Yuan Z, Duan W, Xu Y. Neural-network Density Functional Theory Based on Variational Energy Minimization. PHYSICAL REVIEW LETTERS 2024; 133:076401. [PMID: 39213576 DOI: 10.1103/physrevlett.133.076401] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2024] [Accepted: 07/12/2024] [Indexed: 09/04/2024]
Abstract
Deep-learning density functional theory (DFT) shows great promise to significantly accelerate material discovery and potentially revolutionize materials research. However, current research in this field primarily relies on data-driven supervised learning, making the developments of neural networks and DFT isolated from each other. In this work, we present a theoretical framework of neural-network DFT, which unifies the optimization of neural networks with the variational computation of DFT, enabling physics-informed unsupervised learning. Moreover, we develop a differential DFT code incorporated with deep-learning DFT Hamiltonian, and introduce algorithms of automatic differentiation and backpropagation into DFT, demonstrating the capability of neural-network DFT. The physics-informed neural-network architecture not only surpasses conventional approaches in accuracy and efficiency, but also offers a new paradigm for developing deep-learning DFT methods.
Affiliation(s)
- Yang Li
- State Key Laboratory of Low Dimensional Quantum Physics and Department of Physics, Tsinghua University, Beijing 100084, China
| | | | | | | | | | | | | | | | - Wenhui Duan
- State Key Laboratory of Low Dimensional Quantum Physics and Department of Physics, Tsinghua University, Beijing 100084, China
- Institute for Advanced Study, Tsinghua University, Beijing 100084, China
- Frontier Science Center for Quantum Information, Beijing, China
| | | |
|
30
|
Gu Q, Zhouyin Z, Pandey SK, Zhang P, Zhang L, E W. Deep learning tight-binding approach for large-scale electronic simulations at finite temperatures with ab initio accuracy. Nat Commun 2024; 15:6772. [PMID: 39117636 PMCID: PMC11310461 DOI: 10.1038/s41467-024-51006-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/27/2023] [Accepted: 07/17/2024] [Indexed: 08/10/2024] Open
Abstract
Simulating electronic behavior in materials and devices with realistic large system sizes remains a formidable task within the ab initio framework due to its computational intensity. Here we show DeePTB, an efficient deep learning-based tight-binding approach with ab initio accuracy to address this issue. By training on structural data and corresponding ab initio eigenvalues, the DeePTB model can efficiently predict tight-binding Hamiltonians for unseen structures, enabling efficient simulations of large-size systems under external perturbations such as finite temperatures and strain. This capability is vital for semiconductor band gap engineering and materials design. When combined with molecular dynamics, DeePTB facilitates efficient and accurate finite-temperature simulations of both atomic and electronic behavior simultaneously. This is demonstrated by computing the temperature-dependent electronic properties of a gallium phosphide system with 10^6 atoms. The availability of DeePTB bridges the gap between accuracy and scalability in electronic simulations, potentially advancing materials science and related fields by enabling large-scale electronic structure calculations.
Affiliation(s)
- Qiangqiang Gu
- AI for Science Institute, 100080, Beijing, China.
- School of Mathematical Science, Peking University, 100871, Beijing, China.
| | - Zhanghao Zhouyin
- AI for Science Institute, 100080, Beijing, China
- College of Intelligence and Computing, Tianjin University, 300350, Tianjin, China
| | - Shishir Kumar Pandey
- AI for Science Institute, 100080, Beijing, China
- Birla Institute of Technology & Science, Pilani-Dubai Campus, Dubai, 345055, UAE
| | - Peng Zhang
- College of Intelligence and Computing, Tianjin University, 300350, Tianjin, China
| | - Linfeng Zhang
- AI for Science Institute, 100080, Beijing, China
- DP Technology, 100080, Beijing, China
| | - Weinan E
- AI for Science Institute, 100080, Beijing, China
- School of Mathematical Science, Peking University, 100871, Beijing, China
- Center for Machine Learning Research, Peking University, 100871, Beijing, China
| |
|
31
|
A machine learning tool to efficiently calculate electron-phonon coupling. NATURE COMPUTATIONAL SCIENCE 2024; 4:565-566. [PMID: 39117917 DOI: 10.1038/s43588-024-00680-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/10/2024]
|
32
|
Zhong Y, Liu S, Zhang B, Tao Z, Sun Y, Chu W, Gong XG, Yang JH, Xiang H. Accelerating the calculation of electron-phonon coupling strength with machine learning. NATURE COMPUTATIONAL SCIENCE 2024; 4:615-625. [PMID: 39117916 DOI: 10.1038/s43588-024-00668-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2024] [Accepted: 07/09/2024] [Indexed: 08/10/2024]
Abstract
The calculation of electron-phonon couplings (EPCs) is essential for understanding various fundamental physical properties, including electrical transport, optical and superconducting behaviors in materials. However, obtaining EPCs through fully first-principles methods is notably challenging, particularly for large systems or when employing advanced functionals. Here we introduce a machine learning framework to accelerate EPC calculations by utilizing atomic orbital-based Hamiltonian matrices and gradients predicted by an equivariant graph neural network. We demonstrate that our method not only yields EPC values in close agreement with first-principles results but also enhances calculation efficiency by several orders of magnitude. Application to GaAs using the Heyd-Scuseria-Ernzerhof functional reveals the necessity of advanced functionals for accurate carrier mobility predictions, while for the large Kagome crystal CsV3Sb5, our framework reproduces the experimentally observed double domes in pressure-induced superconducting phase diagrams. This machine learning framework offers a powerful and efficient tool for the investigation of diverse EPC-related phenomena in complex materials.
Affiliation(s)
- Yang Zhong
- Key Laboratory of Computational Physical Sciences (Ministry of Education), Institute of Computational Physical Sciences, State Key Laboratory of Surface Physics, and Department of Physics, Fudan University, Shanghai, China
- Shanghai Qi Zhi Institute, Shanghai, China
| | - Shixu Liu
- Key Laboratory of Computational Physical Sciences (Ministry of Education), Institute of Computational Physical Sciences, State Key Laboratory of Surface Physics, and Department of Physics, Fudan University, Shanghai, China
- Shanghai Qi Zhi Institute, Shanghai, China
| | - Binhua Zhang
- Key Laboratory of Computational Physical Sciences (Ministry of Education), Institute of Computational Physical Sciences, State Key Laboratory of Surface Physics, and Department of Physics, Fudan University, Shanghai, China
- Shanghai Qi Zhi Institute, Shanghai, China
| | - Zhiguo Tao
- Key Laboratory of Computational Physical Sciences (Ministry of Education), Institute of Computational Physical Sciences, State Key Laboratory of Surface Physics, and Department of Physics, Fudan University, Shanghai, China
- Shanghai Qi Zhi Institute, Shanghai, China
| | - Yuting Sun
- Key Laboratory of Computational Physical Sciences (Ministry of Education), Institute of Computational Physical Sciences, State Key Laboratory of Surface Physics, and Department of Physics, Fudan University, Shanghai, China
- Shanghai Qi Zhi Institute, Shanghai, China
| | - Weibin Chu
- Key Laboratory of Computational Physical Sciences (Ministry of Education), Institute of Computational Physical Sciences, State Key Laboratory of Surface Physics, and Department of Physics, Fudan University, Shanghai, China
- Shanghai Qi Zhi Institute, Shanghai, China
| | - Xin-Gao Gong
- Key Laboratory of Computational Physical Sciences (Ministry of Education), Institute of Computational Physical Sciences, State Key Laboratory of Surface Physics, and Department of Physics, Fudan University, Shanghai, China
- Shanghai Qi Zhi Institute, Shanghai, China
| | - Ji-Hui Yang
- Key Laboratory of Computational Physical Sciences (Ministry of Education), Institute of Computational Physical Sciences, State Key Laboratory of Surface Physics, and Department of Physics, Fudan University, Shanghai, China.
- Shanghai Qi Zhi Institute, Shanghai, China.
| | - Hongjun Xiang
- Key Laboratory of Computational Physical Sciences (Ministry of Education), Institute of Computational Physical Sciences, State Key Laboratory of Surface Physics, and Department of Physics, Fudan University, Shanghai, China.
- Shanghai Qi Zhi Institute, Shanghai, China.
| |
|
33
|
Liu H, Yin H, Luo Z, Wang X. Integrating chemistry knowledge in large language models via prompt engineering. Synth Syst Biotechnol 2024; 10:23-38. [PMID: 39206087 PMCID: PMC11350497 DOI: 10.1016/j.synbio.2024.07.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2024] [Revised: 07/08/2024] [Accepted: 07/20/2024] [Indexed: 09/04/2024] Open
Abstract
This paper presents a study on the integration of domain-specific knowledge in prompt engineering to enhance the performance of large language models (LLMs) in scientific domains. The proposed domain-knowledge embedded prompt engineering method outperforms traditional prompt engineering strategies on various metrics, including capability, accuracy, F1 score, and hallucination drop. The effectiveness of the method is demonstrated through case studies on complex materials including the MacMillan catalyst, paclitaxel, and lithium cobalt oxide. The results suggest that domain-knowledge prompts can guide LLMs to generate more accurate and relevant responses, highlighting the potential of LLMs as powerful tools for scientific discovery and innovation when equipped with domain-specific prompts. The study also discusses limitations and future directions for domain-specific prompt engineering development.
Affiliation(s)
- Hongxuan Liu
- Department of Chemical Engineering, Tsinghua University, Beijing, 100084, China
| | - Haoyu Yin
- Department of Chemical Engineering, Tsinghua University, Beijing, 100084, China
| | - Zhiyao Luo
- Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Old Road Campus Research Building, Headington, Oxford, OX3 7DQ, United Kingdom
| | - Xiaonan Wang
- Department of Chemical Engineering, Tsinghua University, Beijing, 100084, China
- Key Laboratory for Industrial Biocatalysis, Ministry of Education, Tsinghua University, Beijing, 100084, China
| |
|
34
|
Manzhos S, Chen QG, Lee WY, Heejoo Y, Ihara M, Chueh CC. Computational Investigation of the Potential and Limitations of Machine Learning with Neural Network Circuits Based on Synaptic Transistors. J Phys Chem Lett 2024; 15:6974-6985. [PMID: 38941557 PMCID: PMC11247485 DOI: 10.1021/acs.jpclett.4c01413] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/30/2024]
Abstract
Synaptic transistors have been proposed to implement neuron activation functions of neural networks (NNs). While promising to enable compact, fast, inexpensive, and energy-efficient dedicated NN circuits, they also have limitations compared to digital NNs (realized as codes for digital processors), including shape choices of the activation function using particular types of transistor implementation, and instabilities due to noise and other factors present in analog circuits. We present a computational study of the effects of these factors on NN performance and find that, while accuracy competitive with traditional NNs can be realized for many applications, there is high sensitivity to the instability in the shape of the activation function, suggesting that, when highly accurate NNs are required, high-precision circuitry should be developed beyond what has been reported for synaptic transistors to date.
Affiliation(s)
- Sergei Manzhos
- School of Materials and Chemical Technology, Tokyo Institute of Technology, Ookayama 2-12-1, Meguro-ku, Tokyo 152-8552, Japan
| | - Qun Gao Chen
- Department of Chemical Engineering and Biotechnology, National Taipei University of Technology, Taipei 106, Taiwan
| | - Wen-Ya Lee
- Department of Chemical Engineering and Biotechnology, National Taipei University of Technology, Taipei 106, Taiwan
| | - Yoon Heejoo
- School of Materials and Chemical Technology, Tokyo Institute of Technology, Ookayama 2-12-1, Meguro-ku, Tokyo 152-8552, Japan
| | - Manabu Ihara
- School of Materials and Chemical Technology, Tokyo Institute of Technology, Ookayama 2-12-1, Meguro-ku, Tokyo 152-8552, Japan
| | - Chu-Chen Chueh
- Department of Chemical Engineering, National Taiwan University, Taipei 10617, Taiwan
| |
|
35
|
Stishenko P, McSloy A, Onat B, Hourahine B, Maurer RJ, Kermode JR, Logsdail A. Integrated workflows and interfaces for data-driven semi-empirical electronic structure calculations. J Chem Phys 2024; 161:012502. [PMID: 38958157 DOI: 10.1063/5.0209742] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2024] [Accepted: 06/07/2024] [Indexed: 07/04/2024] Open
Abstract
Modern software engineering of electronic structure codes has seen a paradigm shift from monolithic workflows toward object-based modularity. Software objectivity allows for greater flexibility in the application of electronic structure calculations, with particular benefits when integrated with approaches for data-driven analysis. Here, we discuss different approaches to create deep modular interfaces that connect big-data workflows and electronic structure codes and explore the diversity of use cases that they can enable. We present two such interface approaches for the semi-empirical electronic structure package, DFTB+. In one case, DFTB+ is applied as a library and provides data to an external workflow; in another, DFTB+ receives data via external bindings and processes the information subsequently within an internal workflow. We provide a general framework to enable data exchange workflows for embedding new machine-learning-based Hamiltonians within DFTB+ or enabling deep integration of DFTB+ in multiscale embedding workflows. These modular interfaces demonstrate opportunities in emergent software and workflows to accelerate scientific discovery by harnessing existing software capabilities.
Affiliation(s)
- Pavel Stishenko
- Cardiff Catalysis Institute, School of Chemistry, Cardiff University, Park Place, Cardiff CF10 3AT, United Kingdom
| | - Adam McSloy
- Warwick Centre for Predictive Modelling, School of Engineering, University of Warwick, Coventry CV4 7AL, United Kingdom
| | - Berk Onat
- Warwick Centre for Predictive Modelling, School of Engineering, University of Warwick, Coventry CV4 7AL, United Kingdom
| | - Ben Hourahine
- SUPA, Department of Physics, John Anderson Building, University of Strathclyde, 107 Rottenrow, Glasgow G4 0NG, United Kingdom
| | - Reinhard J Maurer
- Department of Chemistry, University of Warwick, Coventry CV4 7AL, United Kingdom and Department of Physics, University of Warwick, Coventry CV4 7AL, United Kingdom
| | - James R Kermode
- Warwick Centre for Predictive Modelling, School of Engineering, University of Warwick, Coventry CV4 7AL, United Kingdom
| | - Andrew Logsdail
- Cardiff Catalysis Institute, School of Chemistry, Cardiff University, Park Place, Cardiff CF10 3AT, United Kingdom
| |
|
36
|
Aldossary A, Campos-Gonzalez-Angulo JA, Pablo-García S, Leong SX, Rajaonson EM, Thiede L, Tom G, Wang A, Avagliano D, Aspuru-Guzik A. In Silico Chemical Experiments in the Age of AI: From Quantum Chemistry to Machine Learning and Back. ADVANCED MATERIALS (DEERFIELD BEACH, FLA.) 2024; 36:e2402369. [PMID: 38794859 DOI: 10.1002/adma.202402369] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2024] [Revised: 04/28/2024] [Indexed: 05/26/2024]
Abstract
Computational chemistry is an indispensable tool for understanding molecules and predicting chemical properties. However, traditional computational methods face significant challenges due to the difficulty of solving the Schrödinger equation and the increasing computational cost with the size of the molecular system. In response, there has been a surge of interest in leveraging artificial intelligence (AI) and machine learning (ML) techniques for in silico experiments. Integrating AI and ML into computational chemistry increases the scalability and speed of the exploration of chemical space. However, challenges remain, particularly regarding the reproducibility and transferability of ML models. This review highlights the evolution of ML in learning from, complementing, or replacing traditional computational chemistry for energy and property predictions. Starting from models trained entirely on numerical data, the field has moved toward the ideal of a model that incorporates or learns the physical laws of quantum mechanics. This paper also reviews existing computational methods and ML models and their intertwining, outlines a roadmap for future research, and identifies areas for improvement and innovation. Ultimately, the goal is to develop AI architectures capable of predicting accurate and transferable solutions to the Schrödinger equation, thereby revolutionizing in silico experiments within chemistry and materials science.
Affiliation(s)
- Abdulrahman Aldossary
- Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
| | | | - Sergio Pablo-García
- Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Department of Computer Science, University of Toronto, 40 St. George Street, Toronto, ON, M5S 2E4, Canada
| | - Shi Xuan Leong
- Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
| | - Ella Miray Rajaonson
- Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Vector Institute for Artificial Intelligence, 661 University Ave. Suite 710, Toronto, ON, M5G 1M1, Canada
| | - Luca Thiede
- Department of Computer Science, University of Toronto, 40 St. George Street, Toronto, ON, M5S 2E4, Canada
- Vector Institute for Artificial Intelligence, 661 University Ave. Suite 710, Toronto, ON, M5G 1M1, Canada
| | - Gary Tom
- Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Vector Institute for Artificial Intelligence, 661 University Ave. Suite 710, Toronto, ON, M5G 1M1, Canada
| | - Andrew Wang
- Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
| | - Davide Avagliano
- Chimie ParisTech, PSL University, CNRS, Institute of Chemistry for Life and Health Sciences (iCLeHS UMR 8060), Paris, F-75005, France
| | - Alán Aspuru-Guzik
- Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Department of Computer Science, University of Toronto, 40 St. George Street, Toronto, ON, M5S 2E4, Canada
- Vector Institute for Artificial Intelligence, 661 University Ave. Suite 710, Toronto, ON, M5G 1M1, Canada
- Department of Materials Science & Engineering, University of Toronto, 184 College St., Toronto, ON, M5S 3E4, Canada
- Department of Chemical Engineering & Applied Chemistry, University of Toronto, 200 College St., Toronto, ON, M5S 3E5, Canada
- Lebovic Fellow, Canadian Institute for Advanced Research (CIFAR), 66118 University Ave., Toronto, M5G 1M1, Canada
- Acceleration Consortium, 80 St George St, Toronto, M5S 3H6, Canada
| |
|
37
|
Hazra S, Patil U, Sanvito S. Predicting the One-Particle Density Matrix with Machine Learning. J Chem Theory Comput 2024; 20:4569-4578. [PMID: 38818782 PMCID: PMC11171273 DOI: 10.1021/acs.jctc.4c00042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2024] [Revised: 05/14/2024] [Accepted: 05/15/2024] [Indexed: 06/01/2024]
Abstract
Two of the most widely used electronic-structure theory methods, namely, Hartree-Fock and Kohn-Sham density functional theory, require the iterative solution of a set of Schrödinger-like equations. The speed of convergence of such a process depends on the complexity of the system under investigation, the self-consistent-field algorithm employed, and the initial guess for the density matrix. An initial density matrix close to the ground-state matrix will effectively allow one to cut out many of the self-consistent steps necessary to achieve convergence. Here, we predict the density matrix of Kohn-Sham density functional theory by constructing a neural network that uses only the atomic positions as information. Such a neural network provides an initial guess for the density matrix far superior to that of any other recipes available. Furthermore, the quality of such a neural-network density matrix is good enough for the evaluation of interatomic forces. This allows us to run accelerated ab initio molecular dynamics with little to no self-consistent steps.
Affiliation(s)
- S. Hazra
- School of Physics and CRANN Institute, Trinity College, Dublin 2, Ireland
| | - U. Patil
- School of Physics and CRANN Institute, Trinity College, Dublin 2, Ireland
| | - S. Sanvito
- School of Physics and CRANN Institute, Trinity College, Dublin 2, Ireland
| |
|
38
|
Lim H. Development of scoring-assisted generative exploration (SAGE) and its application to dual inhibitor design for acetylcholinesterase and monoamine oxidase B. J Cheminform 2024; 16:59. [PMID: 38790018 PMCID: PMC11127438 DOI: 10.1186/s13321-024-00845-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Accepted: 04/26/2024] [Indexed: 05/26/2024] Open
Abstract
De novo molecular design is the process of searching chemical space for drug-like molecules with desired properties, and deep learning has been recognized as a promising solution. In this study, I developed an effective computational method called Scoring-Assisted Generative Exploration (SAGE) to enhance chemical diversity and property optimization through virtual synthesis simulation, the generation of bridged bicyclic rings, and multiple scoring models for drug-likeness. For six protein targets, SAGE generated molecules with high scores within reasonable numbers of steps by optimizing target specificity without a constraint and even with multiple constraints such as synthetic accessibility, solubility, and metabolic stability. Furthermore, I proposed a top-ranked molecule from SAGE as a dual inhibitor of acetylcholinesterase and monoamine oxidase B through multiple desired property optimization. Therefore, SAGE can generate molecules with desired properties by optimizing multiple properties simultaneously, indicating the importance of de novo design strategies in the future of drug discovery and development. SCIENTIFIC CONTRIBUTION: The scientific contribution of this study lies in the development of the Scoring-Assisted Generative Exploration (SAGE) method, a novel computational approach that significantly enhances de novo molecular design. SAGE uniquely integrates virtual synthesis simulation, the generation of complex bridged bicyclic rings, and multiple scoring models to optimize drug-like properties comprehensively. By efficiently generating molecules that meet a broad spectrum of pharmacological criteria, including target specificity, synthetic accessibility, solubility, and metabolic stability, within a reasonable number of steps, SAGE represents a substantial advancement over traditional methods. 
Additionally, the application of SAGE to discover dual inhibitors for acetylcholinesterase and monoamine oxidase B not only demonstrates its potential to streamline and enhance the drug development process but also highlights its capacity to create more effective and precisely targeted therapies. This study emphasizes the critical and evolving role of de novo design strategies in reshaping the future of drug discovery and development, providing promising avenues for innovative therapeutic discoveries.
Affiliation(s)
- Hocheol Lim
- Bioinformatics and Molecular Design Research Center (BMDRC), Incheon, Republic of Korea.
| |
|
39
|
Shakiba M, Akimov AV. Machine-Learned Kohn-Sham Hamiltonian Mapping for Nonadiabatic Molecular Dynamics. J Chem Theory Comput 2024; 20:2992-3007. [PMID: 38581699 DOI: 10.1021/acs.jctc.4c00008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/08/2024]
Abstract
In this work, we report a simple, efficient, and scalable machine-learning (ML) approach for mapping non-self-consistent Kohn-Sham Hamiltonians constructed with one kind of density functional to the nearly self-consistent Hamiltonians constructed with another kind of density functional. This approach is designed as a fast surrogate Hamiltonian calculator for use in long nonadiabatic dynamics simulations of large atomistic systems. In this approach, the input and output features are Hamiltonian matrices computed from different levels of theory. We demonstrate that the developed ML-based Hamiltonian mapping method (1) speeds up the calculations by several orders of magnitude, (2) is conceptually simpler than alternative ML approaches, (3) is applicable to different systems and sizes and can be used for mapping Hamiltonians constructed with arbitrary density functionals, (4) requires only a modest amount of training data, learns fast, and generates molecular orbitals and their energies with an accuracy nearly matching that of conventional calculations, and (5) when applied to nonadiabatic dynamics simulation of excitation energy relaxation in large systems, yields the corresponding time scales within the margin of error of the conventional calculations. Using this approach, we explore the excitation energy relaxation in C60 fullerene and Si75H64 quantum dot structures and derive qualitative and quantitative insights into dynamics in these systems.
Affiliation(s)
- Mohammad Shakiba
- Department of Chemistry, University at Buffalo, The State University of New York, Buffalo, New York 14260, United States
| | - Alexey V Akimov
- Department of Chemistry, University at Buffalo, The State University of New York, Buffalo, New York 14260, United States
| |
|
40
|
Gallegos M, Isamura BK, Popelier PLA, Martín Pendás Á. An Unsupervised Machine Learning Approach for the Automatic Construction of Local Chemical Descriptors. J Chem Inf Model 2024; 64:3059-3079. [PMID: 38498942 PMCID: PMC11040729 DOI: 10.1021/acs.jcim.3c01906] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Revised: 03/06/2024] [Accepted: 03/07/2024] [Indexed: 03/20/2024]
Abstract
Condensing the many physical variables defining a chemical system into a fixed-size array poses a significant challenge in the development of chemical Machine Learning (ML). Atom Centered Symmetry Functions (ACSFs) offer an intuitive featurization approach by means of a tedious and labor-intensive selection of tunable parameters. In this work, we implement an unsupervised ML strategy relying on a Gaussian Mixture Model (GMM) to automatically optimize the ACSF parameters. GMMs effortlessly decompose the vastness of the chemical and conformational spaces into well-defined radial and angular clusters, which are then used to build tailor-made ACSFs. The unsupervised exploration of the space has demonstrated general applicability across a diverse range of systems, spanning from various unimolecular landscapes to heterogeneous databases. The impact of the sampling technique and temperature on space exploration is also addressed, highlighting the particularly advantageous role of high-temperature Molecular Dynamics (MD) simulations. The reliability of the resulting features is assessed through the estimation of the atomic charges of a prototypical capped amino acid and a heterogeneous collection of CHON molecules. The automatically constructed ACSFs serve as high-quality descriptors, consistently yielding typical prediction errors below 0.010 electrons bound for the reported atomic charges. Altering the spatial distribution of the functions with respect to the cluster highlights the critical role of symmetry rupture in achieving significantly improved features. More specifically, using two separate functions to describe the lower and upper tails of the cluster results in the best performing models with errors as low as 0.006 electrons. 
Finally, the effectiveness of finely tuned features was checked across different architectures, unveiling the superior performance of Gaussian Process (GP) models over Feed Forward Neural Networks (FFNNs), particularly in low-data regimes, with nearly a 2-fold increase in prediction quality. Altogether, this approach paves the way toward an easier construction of local chemical descriptors, while providing valuable insights into how radial and angular spaces should be mapped. Finally, this work opens the possibility of encoding many-body information beyond angular terms into upcoming ML features.
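As a rough illustration of the idea in this abstract, the sketch below builds one radial ACSF per cluster of interatomic distances. It uses the standard Behler-Parrinello radial form with a cosine cutoff. The cluster means and standard deviations are assumed to come from a previously fitted Gaussian mixture, and the mapping from a cluster to ACSF parameters (center at the cluster mean, eta = 1/(2·std²)) is an assumption made for illustration, not necessarily the authors' recipe. All function names and numbers are hypothetical.

```python
import math

def cosine_cutoff(r, r_c):
    """Smooth cutoff: 1 at r = 0, decaying to 0 at r = r_c and beyond."""
    if r >= r_c:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / r_c) + 1.0)

def radial_acsf(distances, eta, r_s, r_c):
    """Radial symmetry function: sum_j exp(-eta (r_ij - r_s)^2) * f_c(r_ij)."""
    return sum(
        math.exp(-eta * (r - r_s) ** 2) * cosine_cutoff(r, r_c)
        for r in distances
    )

def acsf_params_from_cluster(mean, std):
    """Assumed mapping: Gaussian cluster (mean, std) -> (eta, r_s)."""
    eta = 1.0 / (2.0 * std ** 2)
    return eta, mean

def featurize(distances, clusters, r_c):
    """One radial feature per cluster for a single central atom."""
    features = []
    for mean, std in clusters:
        eta, r_s = acsf_params_from_cluster(mean, std)
        features.append(radial_acsf(distances, eta, r_s, r_c))
    return features

# Hypothetical clusters (mean, std) in angstrom and neighbor distances.
clusters = [(1.0, 0.1), (1.5, 0.2)]
neighbors = [1.0, 1.5, 2.5]
features = featurize(neighbors, clusters, r_c=2.0)
```

Each entry of `features` responds mainly to neighbors near the corresponding cluster center, and neighbors beyond the cutoff contribute nothing.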
Affiliation(s)
- Miguel Gallegos
- Department of Analytical and Physical Chemistry, University of Oviedo, Oviedo E-33006, Spain
| | | | - Paul L. A. Popelier
- Department of Chemistry, The University of Manchester, Oxford Road, Manchester M13 9PL, U.K.
| | - Ángel Martín Pendás
- Department of Analytical and Physical Chemistry, University of Oviedo, Oviedo E-33006, Spain
| |
|
41
|
Ming Z, Liu D, Xiao L, Yang L, Cheng Y, Yang H, Zhou J, Ding H, Yang Z, Wang K. Nondestructive measurement of terahertz optical thin films by machine learning based on physical consistency. OPTICS EXPRESS 2024; 32:16426-16436. [PMID: 38859269 DOI: 10.1364/oe.521609] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2024] [Accepted: 04/06/2024] [Indexed: 06/12/2024]
Abstract
Optical scattering measurement is one of the most commonly used methods for non-contact online measurement of film properties in industrial film manufacturing. Terahertz photons have low energy and are non-ionizing, so combining terahertz waves with optical scattering measurement can enable online nondestructive testing of thin films. Some materials are transparent in the visible light band, so their thickness and material properties cannot be measured there. This paper therefore proposes a method based on physical consistency modeling and machine learning that obtains high-precision thin-film parameters from single-frequency terahertz wave measurements and shows good performance. Experimental measurements of organic thin films show that the proposed method is an effective terahertz online detection technology with high precision and high throughput.
|
42
|
Unke OT, Stöhr M, Ganscha S, Unterthiner T, Maennel H, Kashubin S, Ahlin D, Gastegger M, Medrano Sandonas L, Berryman JT, Tkatchenko A, Müller KR. Biomolecular dynamics with machine-learned quantum-mechanical force fields trained on diverse chemical fragments. SCIENCE ADVANCES 2024; 10:eadn4397. [PMID: 38579003 PMCID: PMC11809612 DOI: 10.1126/sciadv.adn4397] [Citation(s) in RCA: 24] [Impact Index Per Article: 24.0] [Received: 12/10/2023] [Accepted: 02/29/2024] [Indexed: 04/07/2024]
Abstract
The GEMS method enables molecular dynamics simulations of large heterogeneous systems at ab initio quality.
Affiliation(s)
- Oliver T. Unke
- Google DeepMind, Tucholskystraße 2, 10117 Berlin, Germany and Brandschenkestrasse 110, 8002 Zürich, Switzerland
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
- DFG Cluster of Excellence “Unifying Systems in Catalysis” (UniSysCat), Technische Universität Berlin, 10623 Berlin, Germany
| | - Martin Stöhr
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
| | - Stefan Ganscha
- Google DeepMind, Tucholskystraße 2, 10117 Berlin, Germany and Brandschenkestrasse 110, 8002 Zürich, Switzerland
| | - Thomas Unterthiner
- Google DeepMind, Tucholskystraße 2, 10117 Berlin, Germany and Brandschenkestrasse 110, 8002 Zürich, Switzerland
| | - Hartmut Maennel
- Google DeepMind, Tucholskystraße 2, 10117 Berlin, Germany and Brandschenkestrasse 110, 8002 Zürich, Switzerland
| | - Sergii Kashubin
- Google DeepMind, Tucholskystraße 2, 10117 Berlin, Germany and Brandschenkestrasse 110, 8002 Zürich, Switzerland
| | - Daniel Ahlin
- Google DeepMind, Tucholskystraße 2, 10117 Berlin, Germany and Brandschenkestrasse 110, 8002 Zürich, Switzerland
| | - Michael Gastegger
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
- DFG Cluster of Excellence “Unifying Systems in Catalysis” (UniSysCat), Technische Universität Berlin, 10623 Berlin, Germany
- BASLEARN — TU Berlin/BASF Joint Lab for Machine Learning, Technische Universität Berlin, 10587 Berlin, Germany
| | - Leonardo Medrano Sandonas
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
| | - Joshua T. Berryman
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
| | - Alexandre Tkatchenko
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
| | - Klaus-Robert Müller
- Google DeepMind, Tucholskystraße 2, 10117 Berlin, Germany and Brandschenkestrasse 110, 8002 Zürich, Switzerland
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
- Department of Artificial Intelligence, Korea University, Anam-dong, Seongbuk-gu, Seoul 02841, Korea
- Max Planck Institute for Informatics, Stuhlsatzenhausweg, 66123 Saarbrücken, Germany
- BIFOLD — Berlin Institute for the Foundations of Learning and Data, Berlin, Germany
| |
|
43
|
Cignoni E, Suman D, Nigam J, Cupellini L, Mennucci B, Ceriotti M. Electronic Excited States from Physically Constrained Machine Learning. ACS CENTRAL SCIENCE 2024; 10:637-648. [PMID: 38559300 PMCID: PMC10979507 DOI: 10.1021/acscentsci.3c01480] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2023] [Revised: 01/16/2024] [Accepted: 01/30/2024] [Indexed: 04/04/2024]
Abstract
Data-driven techniques are increasingly used to replace electronic-structure calculations of matter. In this context, a relevant question is whether machine learning (ML) should be applied directly to predict the desired properties or combined explicitly with physically grounded operations. We present an example of an integrated modeling approach in which a symmetry-adapted ML model of an effective Hamiltonian is trained to reproduce electronic excitations from a quantum-mechanical calculation. The resulting model can make predictions for molecules that are much larger and more complex than those on which it is trained and allows for dramatic computational savings by indirectly targeting the outputs of well-converged calculations while using a parametrization corresponding to a minimal atom-centered basis. These results emphasize the merits of intertwining data-driven techniques with physical approximations, improving the transferability and interpretability of ML models without affecting their accuracy and computational efficiency and providing a blueprint for developing ML-augmented electronic-structure methods.
Affiliation(s)
- Edoardo Cignoni
- Dipartimento di Chimica e Chimica Industriale, Università di Pisa, 56126 Pisa, Italy
| | - Divya Suman
- Laboratory of Computational Science and Modeling, Institut des Matériaux, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
| | - Jigyasa Nigam
- Laboratory of Computational Science and Modeling, Institut des Matériaux, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
| | - Lorenzo Cupellini
- Dipartimento di Chimica e Chimica Industriale, Università di Pisa, 56126 Pisa, Italy
| | - Benedetta Mennucci
- Dipartimento di Chimica e Chimica Industriale, Università di Pisa, 56126 Pisa, Italy
| | - Michele Ceriotti
- Laboratory of Computational Science and Modeling, Institut des Matériaux, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States
| |
|
44
|
Dral PO. AI in computational chemistry through the lens of a decade-long journey. Chem Commun (Camb) 2024; 60:3240-3258. [PMID: 38444290 DOI: 10.1039/d4cc00010b] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/07/2024]
Abstract
This article gives a perspective on the progress of AI tools in computational chemistry through the lens of the author's decade-long contributions put in the wider context of the trends in this rapidly expanding field. This progress over the last decade is tremendous: while a decade ago we had a glimpse of what was to come through many proof-of-concept studies, now we witness the emergence of many AI-based computational chemistry tools that are mature enough to make faster and more accurate simulations increasingly routine. Such simulations in turn allow us to validate and even revise experimental results, deepen our understanding of the physicochemical processes in nature, and design better materials, devices, and drugs. The rapid introduction of powerful AI tools gives rise to unique challenges and opportunities that are discussed in this article too.
Affiliation(s)
- Pavlo O Dral
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China
45
Issa AA, Kamel MD, El-Sayed DS. Depicted simulation model for removal of second-generation antipsychotic drugs adsorbed on Zn-MOF: adsorption locator assessment. J Mol Model 2024; 30:106. [PMID: 38491151 DOI: 10.1007/s00894-024-05896-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/07/2024] [Accepted: 03/02/2024] [Indexed: 03/18/2024]
Abstract
CONTEXT: Electronic durable behavior on the material surface was accompanied by a class of antipsychotic drugs (APD) to describe the surface modification in the designed adsorption model. A hierarchical Zn-MOF system was used to estimate its capacity for drug molecule removal. Geometry optimization of the studied systems was performed using DFT/GGA/PBE. Frontier molecular orbital (FMO) analysis was carried out at the same level of calculation, and the molecular electrostatic potential (MEP) surface was generated for the unadsorbed and adsorbed systems to illustrate the variation in the surface-active sites. From the electronic density of states (DOS), the contribution of each atomic orbital to the electronic distribution can be identified as major or minor in the PDOS graph. Adsorption locating behavior was considered to detect the significant surface interaction mode between the APD and the Zn-MOF surface based on the lower adsorption energy. The stability of the adsorbed model was assessed through dynamic simulation over time at elevated temperatures. The non-covalent interactions were described using RDG/NCI analysis to show the major favorable surface interaction predicting the highly stable adsorption system.

METHODS: The most accurate geometrical computations were performed using the Materials Studio software, followed by surface cleavage and vacuum slab generation. First-principles DFT calculations with the CASTEP module and the GGA/PBE method were used for band structure and DOS calculations. Three antipsychotic drug systems were studied computationally using the CASTEP simulation package and adsorbed on an optimized Zn-MOF surface. The Adsorption Locator module predicted the preferred adsorption models, of which the first was found to be more stable, confirming the occurrence of some interactions in the adsorption mechanism.
Affiliation(s)
- Ali Abdullah Issa
  - Department of Applied Sciences, University of Technology, Baghdad, Iraq
- Doaa S El-Sayed
  - Chemistry Department, Faculty of Science, Alexandria University, Alexandria, Egypt
46
Pathirage PDVS, Phillips JT, Vogiatzis KD. Exploration of the Two-Electron Excitation Space with Data-Driven Coupled Cluster. J Phys Chem A 2024. [PMID: 38422511 DOI: 10.1021/acs.jpca.3c06600] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/02/2024]
Abstract
Computational cost limits the applicability of post-Hartree-Fock methods such as coupled-cluster on larger molecular systems. The data-driven coupled-cluster (DDCC) method applies machine learning to predict the coupled-cluster two-electron amplitudes (t2) using data from second-order perturbation theory (MP2). One major limitation of the DDCC models is the size of training sets that increases exponentially with the system size. Effective sampling of the amplitude space can resolve this issue. Five different amplitude selection techniques that reduce the amount of data used for training were evaluated, an approach that also prevents model overfitting and increases the portability of data-driven coupled-cluster singles and doubles to more complex molecules or larger basis sets. In combination with a localized orbital formalism to predict the CCSD t2 amplitudes, we have achieved a 10-fold error reduction for energy calculations.
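The idea of trimming the amplitude training set can be illustrated with a minimal sketch. It assumes one simple selection rule, a magnitude cutoff on the MP2 amplitudes plus a random subsample of the small ones. This rule is an illustration only and is not claimed to be one of the five techniques evaluated in the paper; the function name `select_amplitudes`, the cutoff, and the sample values are all hypothetical.

```python
import random
from typing import List, Sequence


def select_amplitudes(
    mp2_t2: Sequence[float],
    cutoff: float,
    keep_fraction: float,
    seed: int = 0,
) -> List[int]:
    """Return sorted indices of the amplitudes kept for training.

    Hypothetical rule: keep every amplitude with |t| >= cutoff and a
    random fraction of the remaining (small) amplitudes.
    """
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    large = [i for i, t in enumerate(mp2_t2) if abs(t) >= cutoff]
    small = [i for i, t in enumerate(mp2_t2) if abs(t) < cutoff]
    n_small = int(len(small) * keep_fraction)
    return sorted(large + rng.sample(small, n_small))


# Made-up MP2 amplitudes, for illustration only.
amps = [0.12, -0.003, 0.0004, -0.08, 0.0009, 0.002]
kept = select_amplitudes(amps, cutoff=0.01, keep_fraction=0.5)
print(f"kept {len(kept)} of {len(amps)} amplitudes: {kept}")
```

The two large amplitudes are always retained, while only half of the small ones enter the training set, which is the kind of data reduction the abstract refers to.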
Affiliation(s)
- P D Varuna S Pathirage
  - Department of Chemistry, University of Tennessee, Knoxville, Tennessee 37996-1600, United States
- Justin T Phillips
  - Department of Chemistry, University of Tennessee, Knoxville, Tennessee 37996-1600, United States
- Konstantinos D Vogiatzis
  - Department of Chemistry, University of Tennessee, Knoxville, Tennessee 37996-1600, United States
47
Briling K, Calvino Alonso Y, Fabrizio A, Corminboeuf C. SPAHM(a,b): Encoding the Density Information from Guess Hamiltonian in Quantum Machine Learning Representations. J Chem Theory Comput 2024; 20:1108-1117. [PMID: 38227222 PMCID: PMC10867806 DOI: 10.1021/acs.jctc.3c01040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2023] [Revised: 12/20/2023] [Accepted: 12/26/2023] [Indexed: 01/17/2024]
Abstract
Recently, we introduced a class of molecular representations for kernel-based regression methods, the spectrum of approximated Hamiltonian matrices (SPAHM), that takes advantage of lightweight one-electron Hamiltonians traditionally used as a self-consistent field initial guess. The original SPAHM variant is built from occupied-orbital energies (i.e., eigenvalues) and naturally contains all of the information about nuclear charges, atomic positions, and symmetry requirements. Its advantages were demonstrated on data sets featuring a wide variation of charge and spin, for which traditional structure-based representations commonly fail. SPAHM(a,b), as introduced here, expand the eigenvalue SPAHM into local and transferable representations. They rely upon one-electron density matrices to build fingerprints from atomic and bond density overlap contributions inspired by preceding state-of-the-art representations. The performance and efficiency of SPAHM(a,b) are assessed on predictions for data sets of prototypical organic molecules (QM7) of different charges and of azoheteroarene dyes in an excited state. Overall, both SPAHM(a) and SPAHM(b) outperform state-of-the-art representations on difficult prediction tasks such as the atomic properties of charged open-shell species and of π-conjugated systems.
Affiliation(s)
- Ksenia R. Briling
  - Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Yannick Calvino Alonso
  - Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Alberto Fabrizio
  - Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
  - National Centre for Computational Design and Discovery of Novel Materials (MARVEL), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Clemence Corminboeuf
  - Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
  - National Centre for Computational Design and Discovery of Novel Materials (MARVEL), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
48
Jijila B, Nirmala V, Selvarengan P, Kavitha D, Arun Muthuraj V, Rajagopal A. Employing neural density functionals to generate potential energy surfaces. J Mol Model 2024; 30:65. [PMID: 38340208 DOI: 10.1007/s00894-024-05834-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2023] [Accepted: 01/04/2024] [Indexed: 02/12/2024]
Abstract
CONTEXT: With the union of machine learning (ML) and quantum chemistry, and amid the debate between machine-learned and human-designed functionals in density functional theory (DFT), this paper demonstrates the generation of potential energy surfaces (PES) using a machine-learned density functional approximation (ML-DFA). A recent research trend is the application of ML to the design of density functionals, such as DeepMind's deep learning functional (DeepMind21, DM21). Although Science reported the state-of-the-art performance of DM21, the opportunity to use DeepMind's pretrained DM21 neural networks in quantum chemistry computations has not yet been tapped: so far, DM21 has not been applied to generate potential energy surfaces, and publications that apply DM21 in calculations remain scarce. In this context, this paper contributes, for the first time in the literature, neural density functionals inferring 2D potential energy surfaces (ML-DFA-PES). It reports the ML-DFA-generated PES for C4H8, H2O, H2, and H2+ obtained with a pretrained DM21m TensorFlow model and the cc-pVDZ basis set. We also analyze the long-range behavior of the DM21-based PES to investigate its ability to describe a system at long ranges, and we compare PES diagrams from DM21 with those from popular DFT functionals (B3LYP, PW6B95) and CCSD(T).

METHODS: The 2D potential energy surfaces are obtained with a method that relies on the neural network's ability to accurately learn the mapping between the 3D electron density and the exchange-correlation potential. Deep learning inference with a pretrained neural network is inserted into DFT: first, the electron density is computed; this 3D electron density is then used as an ML feature vector to predict the exchange-correlation potential through a forward pass of the pretrained DM21 TensorFlow computational graph; the self-consistent field (SCF) energy is then computed at multiple geometries along the coordinates of interest; and finally the SCF energies at different bond lengths/angles are plotted as a 2D PES. We implement this in Python source code using frameworks such as PySCF and DM21. The implementation is contributed as open source, and the source code and DM21-DFA-based PES are available at https://sites.google.com/view/MLfunctionals-DeepMind-PES.
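The scan procedure described in the METHODS (evaluate an energy at a series of geometries along one coordinate, then collect the points as a curve) can be sketched as follows. This is a minimal illustration, not the authors' code: the Morse function `morse_energy` and its parameters are a hypothetical stand-in for the SCF energy call, chosen so the example runs without any quantum chemistry package. In an actual calculation, `energy_fn` would wrap an SCF run at the given bond length.

```python
import math
from typing import Callable, List, Tuple


def morse_energy(r: float, d_e: float = 0.17, a: float = 1.0, r_e: float = 0.74) -> float:
    """Hypothetical stand-in for an SCF energy at bond length r.

    A Morse curve with arbitrary parameters, shifted so that the energy
    tends to zero at large separation.
    """
    return d_e * (1.0 - math.exp(-a * (r - r_e))) ** 2 - d_e


def scan_pes(
    energy_fn: Callable[[float], float],
    start: float,
    stop: float,
    n_points: int,
) -> List[Tuple[float, float]]:
    """Evaluate energy_fn on an evenly spaced grid of bond lengths."""
    if n_points < 2:
        raise ValueError("n_points must be at least 2")
    step = (stop - start) / (n_points - 1)
    grid = [start + i * step for i in range(n_points)]
    return [(r, energy_fn(r)) for r in grid]


# Scan from 0.4 to 3.0 in steps of 0.1 (27 points) and locate the minimum.
curve = scan_pes(morse_energy, 0.4, 3.0, 27)
r_min, e_min = min(curve, key=lambda point: point[1])
print(f"lowest grid point: r = {r_min:.2f}, E = {e_min:.4f}")
```

Extending the scan to large separations, as the last grid points do here, is also how the long-range behavior of a functional would be inspected.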
Affiliation(s)
- B Jijila
  - Queen Mary's College, Chennai, India
- V Nirmala
  - Queen Mary's College, Chennai, India
- P Selvarengan
  - Kalasalingam Academy of Research & Education, Krishnankoil, India
- D Kavitha
  - Dr. MGR Educational and Research Institute, Chennai, India
- A Rajagopal
  - Indian Institute of Technology, Madras, India
49
Lewis L, Huang HY, Tran VT, Lehner S, Kueng R, Preskill J. Improved machine learning algorithm for predicting ground state properties. Nat Commun 2024; 15:895. [PMID: 38291046 PMCID: PMC10828424 DOI: 10.1038/s41467-024-45014-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2023] [Accepted: 01/08/2024] [Indexed: 02/01/2024]
Abstract
Finding the ground state of a quantum many-body system is a fundamental problem in quantum physics. In this work, we give a classical machine learning (ML) algorithm for predicting ground state properties with an inductive bias encoding geometric locality. The proposed ML model can efficiently predict ground state properties of an n-qubit gapped local Hamiltonian after learning from only $\mathcal{O}(\log(n))$ data about other Hamiltonians in the same quantum phase of matter. This improves substantially upon previous results that require $\mathcal{O}(n^c)$ data for a large constant c. Furthermore, the training and prediction time of the proposed ML model scale as $\mathcal{O}(n \log n)$ in the number of qubits n. Numerical experiments on physical systems with up to 45 qubits confirm the favorable scaling in predicting ground state properties using a small training dataset.
Affiliation(s)
- Laura Lewis
  - California Institute of Technology, Pasadena, CA, USA
  - University of Cambridge, Cambridge, UK
- Hsin-Yuan Huang
  - California Institute of Technology, Pasadena, CA, USA
  - Massachusetts Institute of Technology, Cambridge, MA, USA
  - Google Quantum AI, Venice, CA, USA
- John Preskill
  - California Institute of Technology, Pasadena, CA, USA
  - AWS Center for Quantum Computing, Pasadena, CA, USA
50
Xu X, Soriano-Agueda L, López X, Ramos-Cordoba E, Matito E. All-Purpose Measure of Electron Correlation for Multireference Diagnostics. J Chem Theory Comput 2024; 20:721-727. [PMID: 38157841 PMCID: PMC10809408 DOI: 10.1021/acs.jctc.3c01073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2023] [Revised: 11/24/2023] [Accepted: 11/27/2023] [Indexed: 01/03/2024]
Abstract
We present an analytical relationship between two natural orbital occupancy-based indices, $\bar{I}_{\mathrm{ND}}$ and $I_{\mathrm{ND}}^{\max}$, and two established electron correlation metrics: the leading term of a configuration interaction expansion, $c_0$, and the $D_2$ diagnostic. Numerical validation revealed that $\bar{I}_{\mathrm{ND}}$ and $I_{\mathrm{ND}}^{\max}$ can effectively substitute for $c_0$ and $D_2$, respectively. These indices offer three distinct advantages: (i) they are universally applicable across all electronic structure methods, (ii) their interpretation is more intuitive, and (iii) they can be readily incorporated into the development of hybrid electronic structure methods. Additionally, we draw a distinction between correlation measures and correlation diagnostics, establishing MP2 and CCSD numerical thresholds for $I_{\mathrm{ND}}^{\max}$, which are to be used as a multireference diagnostic. Our findings further demonstrate that establishing thresholds for other electronic structure methods can be easily accomplished using small data sets.
Affiliation(s)
- Xiang Xu
  - Donostia International Physics Center (DIPC), 20018 Donostia, Euskadi, Spain
  - Polimero eta Material Aurreratuak: Fisika, Kimika eta Teknologia, Kimika Fakultatea, Euskal Herriko Unibertsitatea UPV/EHU, P.K. 1072, 20080 Donostia, Euskadi, Spain
- Luis Soriano-Agueda
  - Donostia International Physics Center (DIPC), 20018 Donostia, Euskadi, Spain
- Xabier López
  - Donostia International Physics Center (DIPC), 20018 Donostia, Euskadi, Spain
  - Polimero eta Material Aurreratuak: Fisika, Kimika eta Teknologia, Kimika Fakultatea, Euskal Herriko Unibertsitatea UPV/EHU, P.K. 1072, 20080 Donostia, Euskadi, Spain
- Eloy Ramos-Cordoba
  - Donostia International Physics Center (DIPC), 20018 Donostia, Euskadi, Spain
  - Polimero eta Material Aurreratuak: Fisika, Kimika eta Teknologia, Kimika Fakultatea, Euskal Herriko Unibertsitatea UPV/EHU, P.K. 1072, 20080 Donostia, Euskadi, Spain
  - Ikerbasque Foundation for Science, Plaza Euskadi 5, 48009 Bilbao, Spain
- Eduard Matito
  - Donostia International Physics Center (DIPC), 20018 Donostia, Euskadi, Spain
  - Ikerbasque Foundation for Science, Plaza Euskadi 5, 48009 Bilbao, Spain