1
Steiner M, Reiher M. A human-machine interface for automatic exploration of chemical reaction networks. Nat Commun 2024; 15:3680. [PMID: 38693117 PMCID: PMC11063077 DOI: 10.1038/s41467-024-47997-9] [Received: 08/31/2023] [Accepted: 04/15/2024] [Indexed: 05/03/2024] Open
Abstract
Autonomous reaction network exploration algorithms offer a systematic approach to explore mechanisms of complex chemical processes. However, the resulting reaction networks are so vast that an exploration of all potentially accessible intermediates is computationally too demanding. This renders brute-force explorations unfeasible, while explorations with completely pre-defined intermediates or hard-wired chemical constraints, such as element-specific coordination numbers, are not flexible enough for complex chemical systems. Here, we introduce a STEERING WHEEL to guide an otherwise unbiased automated exploration. The STEERING WHEEL algorithm is intuitive, generally applicable, and enables one to focus on specific regions of an emerging network. It also allows for guiding automated data generation in the context of mechanism exploration, catalyst design, and other chemical optimization challenges. The algorithm is demonstrated for reaction mechanism elucidation of transition metal catalysts. We highlight how to explore catalytic cycles in a systematic and reproducible way. The exploration objectives are fully adjustable, allowing one to harness the STEERING WHEEL for both structure-specific (accurate) calculations as well as for broad high-throughput screening of possible reaction intermediates.
Affiliation(s)
- Miguel Steiner
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093, Zurich, Switzerland
- ETH Zurich, NCCR Catalysis, Vladimir-Prelog-Weg 2, 8093, Zurich, Switzerland
- Markus Reiher
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093, Zurich, Switzerland
- ETH Zurich, NCCR Catalysis, Vladimir-Prelog-Weg 2, 8093, Zurich, Switzerland
2
Stan-Bernhardt A, Glinkina L, Hulm A, Ochsenfeld C. Exploring Chemical Space Using Ab Initio Hyperreactor Dynamics. ACS Cent Sci 2024; 10:302-314. [PMID: 38435517 PMCID: PMC10906254 DOI: 10.1021/acscentsci.3c01403] [Received: 11/13/2023] [Revised: 12/20/2023] [Accepted: 12/21/2023] [Indexed: 03/05/2024]
Abstract
In recent years, first-principles exploration of chemical reaction space has provided valuable insights into intricate reaction networks. Here, we introduce ab initio hyperreactor dynamics, which enables rapid screening of the accessible chemical space from a given set of initial molecular species, predicting new synthetic routes that can potentially guide subsequent experimental studies. For this purpose, different hyperdynamics derived bias potentials are applied along with pressure-inducing spherical confinement of the molecular system in ab initio molecular dynamics simulations to efficiently enhance reactivity under mild conditions. To showcase the advantages and flexibility of the hyperreactor approach, we present a systematic study of the method's parameters on a HCN toy model and apply it to a recently introduced experimental model for the prebiotic formation of glycinal and acetamide in interstellar ices, which yields results in line with experimental findings. In addition, we show how the developed framework enables the study of complicated transitions like the first step of a nonenzymatic DNA nucleoside synthesis in an aqueous environment, where the molecular fragmentation problem of earlier nanoreactor approaches is avoided.
Affiliation(s)
- Alexandra Stan-Bernhardt
- Chair of Theoretical Chemistry, Department of Chemistry, University of Munich (LMU), Butenandtstrasse 5, D-81377 München, Germany
- Liubov Glinkina
- Chair of Theoretical Chemistry, Department of Chemistry, University of Munich (LMU), Butenandtstrasse 5, D-81377 München, Germany
- Andreas Hulm
- Chair of Theoretical Chemistry, Department of Chemistry, University of Munich (LMU), Butenandtstrasse 5, D-81377 München, Germany
- Christian Ochsenfeld
- Chair of Theoretical Chemistry, Department of Chemistry, University of Munich (LMU), Butenandtstrasse 5, D-81377 München, Germany
- Max Planck Institute for Solid State Research, Heisenbergstrasse 1, D-70569 Stuttgart, Germany
3
Nicolle A, Deng S, Ihme M, Kuzhagaliyeva N, Ibrahim EA, Farooq A. Mixtures Recomposition by Neural Nets: A Multidisciplinary Overview. J Chem Inf Model 2024; 64:597-620. [PMID: 38284618 DOI: 10.1021/acs.jcim.3c01633] [Indexed: 01/30/2024]
Abstract
Artificial Neural Networks (ANNs) are transforming how we understand chemical mixtures, providing an expressive view of the chemical space and multiscale processes. Their hybridization with physical knowledge can bridge the gap between predictivity and understanding of the underlying processes. This overview explores recent progress in ANNs, particularly their potential in the 'recomposition' of chemical mixtures. Graph-based representations reveal patterns among mixture components, and deep learning models excel in capturing complexity and symmetries when compared to traditional Quantitative Structure-Property Relationship models. Key components, such as Hamiltonian networks and convolution operations, play a central role in representing multiscale mixtures. The integration of ANNs with Chemical Reaction Networks and Physics-Informed Neural Networks for inverse chemical kinetic problems is also examined. The combination of sensors with ANNs shows promise in optical and biomimetic applications. A common ground is identified in the context of statistical physics, where ANN-based methods iteratively adapt their models by blending their initial states with training data. The concept of mixture recomposition unveils a reciprocal inspiration between ANNs and reactive mixtures, highlighting learning behaviors influenced by the training environment.
Affiliation(s)
- Andre Nicolle
- Aramco Fuel Research Center, Rueil-Malmaison 92852, France
- Sili Deng
- Massachusetts Institute of Technology, Cambridge 02139, Massachusetts, United States
- Matthias Ihme
- Stanford University, Stanford 94305, California, United States
- Emad Al Ibrahim
- King Abdullah University of Science and Technology, Thuwal 23955, Saudi Arabia
- Aamir Farooq
- King Abdullah University of Science and Technology, Thuwal 23955, Saudi Arabia
4
Kim S, Woo J, Kim WY. Diffusion-based generative AI for exploring transition states from 2D molecular graphs. Nat Commun 2024; 15:341. [PMID: 38184661 PMCID: PMC10771475 DOI: 10.1038/s41467-023-44629-6] [Received: 05/10/2023] [Accepted: 12/21/2023] [Indexed: 01/08/2024] Open
Abstract
The exploration of transition state (TS) geometries is crucial for elucidating chemical reaction mechanisms and modeling their kinetics. Recently, machine learning (ML) models have shown remarkable performance for prediction of TS geometries. However, they require 3D conformations of reactants and products, often with their appropriate orientations, as input, which demands substantial effort and computational cost. Here, we propose a generative approach based on the stochastic diffusion method, namely TSDiff, for prediction of TS geometries just from 2D molecular graphs. TSDiff outperforms the existing ML models with 3D geometries in terms of both accuracy and efficiency. Moreover, it enables sampling of various TS conformations, because it learns the distribution of TS geometries for diverse reactions in training. Thus, TSDiff finds more favorable reaction pathways with lower barrier heights than those in the reference database. These results demonstrate that TSDiff shows promising potential for an efficient and reliable TS exploration.
Affiliation(s)
- Seonghwan Kim
- Department of Chemistry, KAIST, 291 Daehak-ro, Yuseong-gu, 34141, Daejeon, Republic of Korea
- Jeheon Woo
- Department of Chemistry, KAIST, 291 Daehak-ro, Yuseong-gu, 34141, Daejeon, Republic of Korea
- Woo Youn Kim
- Department of Chemistry, KAIST, 291 Daehak-ro, Yuseong-gu, 34141, Daejeon, Republic of Korea
- AI Institute, KAIST, 291 Daehak-ro, Yuseong-gu, 34141, Daejeon, Republic of Korea
5
Day EC, Chittari SS, Bogen MP, Knight AS. Navigating the Expansive Landscapes of Soft Materials: A User Guide for High-Throughput Workflows. ACS Polym Au 2023; 3:406-427. [PMID: 38107416 PMCID: PMC10722570 DOI: 10.1021/acspolymersau.3c00025] [Received: 09/15/2023] [Revised: 11/02/2023] [Accepted: 11/07/2023] [Indexed: 12/19/2023]
Abstract
Synthetic polymers are highly customizable with tailored structures and functionality, yet this versatility generates challenges in the design of advanced materials due to the size and complexity of the design space. Thus, exploration and optimization of polymer properties using combinatorial libraries has become increasingly common, which requires careful selection of synthetic strategies, characterization techniques, and rapid processing workflows to obtain fundamental principles from these large data sets. Herein, we provide guidelines for strategic design of macromolecule libraries and workflows to efficiently navigate these high-dimensional design spaces. We describe synthetic methods for multiple library sizes and structures as well as characterization methods to rapidly generate data sets, including tools that can be adapted from biological workflows. We further highlight relevant insights from statistics and machine learning to aid in data featurization, representation, and analysis. This Perspective acts as a "user guide" for researchers interested in leveraging high-throughput screening toward the design of multifunctional polymers and predictive modeling of structure-property relationships in soft materials.
Affiliation(s)
- Matthew P. Bogen
- Department of Chemistry, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, United States
- Abigail S. Knight
- Department of Chemistry, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, United States
6
Rasmussen MH, Seumer J, Jensen JH. Toward De Novo Catalyst Discovery: Fast Identification of New Catalyst Candidates for Alcohol-Mediated Morita-Baylis-Hillman Reactions. Angew Chem Int Ed Engl 2023; 62:e202310580. [PMID: 37830522 DOI: 10.1002/anie.202310580] [Received: 07/25/2023] [Revised: 09/15/2023] [Accepted: 10/13/2023] [Indexed: 10/14/2023]
Abstract
Recently we have demonstrated how a genetic algorithm (GA) starting from random tertiary amines can be used to discover a new and efficient catalyst for the alcohol-mediated Morita-Baylis-Hillman (MBH) reaction. In particular, the discovered catalyst was shown experimentally to be eight times more active than DABCO, commonly used to catalyze the MBH reaction. This represents a breakthrough in using generative models for catalyst optimization. However, the GA procedure, and hence discovery, relied on two important pieces of information; 1) the knowledge that tertiary amines catalyze the reaction and 2) the mechanism and reaction profile for the catalyzed reaction, in particular the transition state structure of the rate-determining step. Thus, truly de novo catalyst discovery must include these steps. Here we present such a method for discovering catalyst candidates for a specific reaction while simultaneously proposing a mechanism for the catalyzed reaction. We show that tertiary amines and phosphines are potential catalysts for the MBH reaction by screening 11 molecular templates representing common functional groups. The method relies on an automated reaction discovery workflow using meta-dynamics calculations. Combining this method for catalyst candidate discovery with our GA-based catalyst optimization method results in an algorithm for truly de novo catalyst discovery.
Affiliation(s)
- Maria H Rasmussen
- Department of Chemistry, University of Copenhagen, Universitetsparken 5, 2100, Copenhagen, Denmark
- Julius Seumer
- Department of Chemistry, University of Copenhagen, Universitetsparken 5, 2100, Copenhagen, Denmark
- Jan H Jensen
- Department of Chemistry, University of Copenhagen, Universitetsparken 5, 2100, Copenhagen, Denmark
7
Zhao Q, Anstine DM, Isayev O, Savoie BM. Δ² machine learning for reaction property prediction. Chem Sci 2023; 14:13392-13401. [PMID: 38033903 PMCID: PMC10686042 DOI: 10.1039/d3sc02408c] [Received: 05/10/2023] [Accepted: 07/11/2023] [Indexed: 12/02/2023] Open
Abstract
The emergence of Δ-learning models, whereby machine learning (ML) is used to predict a correction to a low-level energy calculation, provides a versatile route to accelerate high-level energy evaluations at a given geometry. However, Δ-learning models are inapplicable to reaction properties like heats of reaction and activation energies that require both a high-level geometry and energy evaluation. Here, a Δ²-learning model is introduced that can predict high-level activation energies based on low-level critical-point geometries. The Δ² model uses an atom-wise featurization typical of contemporary ML interatomic potentials (MLIPs) and is trained on a dataset of ∼167 000 reactions, using the GFN2-xTB energy and critical-point geometry as a low-level input and the B3LYP-D3/TZVP energy calculated at the B3LYP-D3/TZVP critical point as a high-level target. The excellent performance of the Δ² model on unseen reactions demonstrates the surprising ease with which the model implicitly learns the geometric deviations between the low-level and high-level geometries that condition the activation energy prediction. The transferability of the Δ² model is validated on several external testing sets where it shows near chemical accuracy, illustrating the benefits of combining ML models with readily available physical-based information from semi-empirical quantum chemistry calculations. Fine-tuning of the Δ² model on a small number of Gaussian-4 calculations produced a 35% accuracy improvement over DFT activation energy predictions while retaining xTB-level cost. The Δ² model approach proves to be an efficient strategy for accelerating chemical reaction characterization with minimal sacrifice in prediction accuracy.
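As a rough illustration of the plain Δ-learning idea that the abstract starts from (a learned correction added to a low-level energy), the sketch below fits a one-descriptor linear correction by least squares. It is not the model described in the article: the atom-wise featurization, the training data, and the use of low-level geometries are not reproduced here, and the function names, the single descriptor, and all numbers are hypothetical.

```python
from statistics import mean


def fit_linear_correction(xs, low, high):
    """Fit (high - low) ~ a * x + b by ordinary least squares on one descriptor."""
    deltas = [h - l for h, l in zip(high, low)]
    x_bar, d_bar = mean(xs), mean(deltas)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxd = sum((x - x_bar) * (d - d_bar) for x, d in zip(xs, deltas))
    a = sxd / sxx
    b = d_bar - a * x_bar
    return a, b


def predict_high(a, b, x, low_value):
    """Delta-learning prediction: low-level value plus the learned correction."""
    return low_value + a * x + b


# Made-up training data (arbitrary energy units), for illustration only.
xs = [0.0, 1.0, 2.0, 3.0]          # hypothetical descriptor per reaction
low = [10.0, 12.0, 15.0, 11.0]     # hypothetical low-level energies
high = [l + 0.5 * x + 2.0 for x, l in zip(xs, low)]  # synthetic high-level targets

a, b = fit_linear_correction(xs, low, high)
estimate = predict_high(a, b, 4.0, 13.0)  # new reaction: descriptor 4.0, low-level 13.0
```

In practice the linear fit would be replaced by a trained ML model; the point is only that the model learns the difference between two levels of theory rather than the target itself.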
Affiliation(s)
- Qiyuan Zhao
- Davidson School of Chemical Engineering, Purdue University West Lafayette IN 47906 USA
- Dylan M Anstine
- Department of Chemistry, Carnegie Mellon University Pittsburgh PA 15213 USA
- Olexandr Isayev
- Department of Chemistry, Carnegie Mellon University Pittsburgh PA 15213 USA
- Brett M Savoie
- Davidson School of Chemical Engineering, Purdue University West Lafayette IN 47906 USA
8
Petrus E, Garay-Ruiz D, Reiher M, Bo C. Multi-Time-Scale Simulation of Complex Reactive Mixtures: How Do Polyoxometalates Form? J Am Chem Soc 2023; 145:18920-18930. [PMID: 37496164 DOI: 10.1021/jacs.3c05514] [Indexed: 07/28/2023]
Abstract
Understanding the dynamics of reactive mixtures still challenges both experiments and theory. A relevant example can be found in the chemistry of molecular metal-oxide nanoclusters, also known as polyoxometalates. The high number of species potentially involved, the interconnectivity of the reaction network, and the precise control of the pH and concentrations needed in the synthesis of such species make the theoretical/computational treatment of such processes cumbersome. This work addresses this issue relying on a unique combination of recently developed computational methods that tackle the construction, kinetic simulation, and analysis of complex chemical reaction networks. By using the Bell-Evans-Polanyi approximation for estimating activation energies, and an accurate and robust linear scaling for correcting the computed pKa values, we report herein multi-time-scale kinetic simulations for the self-assembly processes of polyoxotungstates that comprise 22 orders of magnitude, from tens of femtoseconds to months of reaction time. This very large time span was required to reproduce very fast processes such as the acid/base equilibria (at 10⁻¹² s), relatively slow reactions such as the formation of key clusters such as the metatungstate (at 10³ s), and the very slow assembly of the decatungstate (at 10⁶ s). Analysis of the kinetic data and of the reaction network topology shed light onto the details of the main reaction mechanisms, which explains the origin of kinetic and thermodynamic control followed by the reaction. Simulations at alkaline pH fully reproduce experimental evidence since clusters do not form under those conditions.
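The Bell-Evans-Polanyi approximation mentioned above estimates an activation energy as a linear function of the reaction energy, E_a = E_0 + α·ΔE. The sketch below shows how such an estimate could be turned into a characteristic time using the standard Eyring expression. The intrinsic barrier E_0, the slope α, the temperature, and the function names are hypothetical placeholders; the article's actual parameters and kinetic model are not given in the abstract.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J/K
H = 6.62607015e-34    # Planck constant, J*s
R = 8.314462618       # gas constant, J/(mol*K)


def bep_activation_energy(reaction_energy, intrinsic_barrier, alpha):
    """Bell-Evans-Polanyi estimate: E_a = E_0 + alpha * dE (all in kJ/mol)."""
    return intrinsic_barrier + alpha * reaction_energy


def eyring_rate(e_a_kj_mol, temperature=298.15):
    """First-order Eyring rate constant in 1/s for a barrier given in kJ/mol."""
    prefactor = K_B * temperature / H
    return prefactor * math.exp(-e_a_kj_mol * 1000.0 / (R * temperature))


# Hypothetical inputs: exothermic step of -20 kJ/mol, E_0 = 50 kJ/mol, alpha = 0.5.
e_a = bep_activation_energy(-20.0, 50.0, 0.5)
rate = eyring_rate(e_a)
timescale = 1.0 / rate  # characteristic time of the step, in seconds
```

Because the rate depends exponentially on the barrier, modest changes in the estimated activation energy shift the characteristic time by orders of magnitude, which is consistent with the very wide time span the abstract reports.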
Affiliation(s)
- Enric Petrus
- Institute of Chemical Research of Catalonia (ICIQ), The Barcelona Institute of Science and Technology (BIST), Avenida Països Catalans, 16, Tarragona 43007, Spain
- Diego Garay-Ruiz
- Institute of Chemical Research of Catalonia (ICIQ), The Barcelona Institute of Science and Technology (BIST), Avenida Països Catalans, 16, Tarragona 43007, Spain
- Markus Reiher
- Department of Chemistry and Applied Biosciences, ETH Zürich, Vladimir-Prelog-Weg 2, Zürich 8093, Switzerland
- Carles Bo
- Institute of Chemical Research of Catalonia (ICIQ), The Barcelona Institute of Science and Technology (BIST), Avenida Països Catalans, 16, Tarragona 43007, Spain
- Departament de Química Física i Inorgànica, Universitat Rovira i Virgili, Marcel·li Domingo s/n, Tarragona 43007, Spain