1
Mandal A, Maurer C, Plett C, Chandramohan KRK, Fleischer R, Schnakenburg G, Grimme S, Bunescu A. Selective C-H Borylation of Polyaromatic Compounds Enabled by Metal-Arene π-Complexation. J Am Chem Soc 2025; 147:15281-15293. [PMID: 40265718 DOI: 10.1021/jacs.5c00774] [Indexed: 04/24/2025]
Abstract
The undirected Ir-catalyzed C-H borylation usually occurs preferentially at the least hindered and more acidic C-H bond of the aromatic ring. In the case of polyaromatic compounds possessing multiple unbiased and sterically accessible C-H bonds, the site selectivity for the nondirected C-H borylation is low. Here, we report the dramatic effect exerted by the π-complexation of a chromium tricarbonyl unit on the aromatic ring in the context of Ir-catalyzed C-H borylation. Competition experiments demonstrate that the C-H bonds of an aromatic ring bound to the chromium tricarbonyl unit react on average two orders in magnitude faster toward the C-H borylation than the unbound arenes. This enables an unprecedented C-H borylation with high site selectivity of the aromatic ring π-complexed with a chromium tripod in a series of organic polyaromatic compounds. Besides, the drastic enhancement of the reactivity of C-H bonds induced by the chromium tripod allows the C-H borylation to occur at room temperature with the substrate as a limiting reagent. The DFT studies indicate that the oxidative addition of the C-H bonds has lower activation barriers when the arenes are complexed with a chromium tricarbonyl unit, explaining the observed exceptional site selectivity. This study will further spearhead the development of nondirected C-H borylation with a bimetallic system to harness the effect of the noncovalent metal-arene π-type interactions on the reactivity and the selectivity of the C-H functionalization.
Affiliation(s)
- Anup Mandal
  - Kekulé Institute of Organic Chemistry and Biochemistry, University of Bonn, Gerhard-Domagk-Straße 1, 53121 Bonn, Germany
- Clemens Maurer
  - Kekulé Institute of Organic Chemistry and Biochemistry, University of Bonn, Gerhard-Domagk-Straße 1, 53121 Bonn, Germany
- Christoph Plett
  - Mulliken Center for Theoretical Chemistry, Clausius Institute for Physical and Theoretical Chemistry, University of Bonn, Beringstraße 4, 53115 Bonn, Germany
- Kavin Raj Kumar Chandramohan
  - Kekulé Institute of Organic Chemistry and Biochemistry, University of Bonn, Gerhard-Domagk-Straße 1, 53121 Bonn, Germany
- Ruben Fleischer
  - Kekulé Institute of Organic Chemistry and Biochemistry, University of Bonn, Gerhard-Domagk-Straße 1, 53121 Bonn, Germany
- Gregor Schnakenburg
  - Institute of Inorganic Chemistry, University of Bonn, Gerhard-Domagk-Straße 1, 53121 Bonn, Germany
- Stefan Grimme
  - Mulliken Center for Theoretical Chemistry, Clausius Institute for Physical and Theoretical Chemistry, University of Bonn, Beringstraße 4, 53115 Bonn, Germany
- Ala Bunescu
  - Kekulé Institute of Organic Chemistry and Biochemistry, University of Bonn, Gerhard-Domagk-Straße 1, 53121 Bonn, Germany
2
Wang S, Hou X, Li Y, Zhou C, Zhang P, Hu C. From Single-Atom to Dual-Atom: A Universal Principle for the Rational Design of Heterogeneous Fenton-like Catalysts. Environ Sci Technol 2025; 59:8822-8833. [PMID: 40261206 DOI: 10.1021/acs.est.4c13826] [Indexed: 04/24/2025]
Abstract
Developing efficient heterogeneous Fenton-like catalysts is the key point to accelerating the removal of organic micropollutants in the advanced oxidation process. However, a general principle guiding the reasonable design of highly efficient heterogeneous Fenton-like catalysts has not been constructed up to now. In this work, a total of 16 single-atom and 272 dual-atom transition metal/nitrogen/carbon (TM/N/C) catalysts for H2O2 dissociation were explored systematically based on high-throughput density functional theory and machine learning. It was found that H2O2 dissociation on single-atom TM/N/C exhibited a distinct volcano-type relationship between catalytic activity and •OH adsorption energy. The favorable •OH adsorption energies were in the range of -3.11 ∼ -2.20 eV. Three different descriptors, namely, energetic, electronic, and structural descriptors, were found, which can correlate the intrinsic properties of catalysts and their catalytic activity. Using adsorption energy, stability, and activation energy as the evaluation criteria, two dual-atom CoCu/N/C and CoRu/N/C catalysts were screened out from 272 candidates, which exhibited higher catalytic activity than the best single-atom TM/N/C catalyst due to the synergistic effect. This work could present a conceptually novel understanding of H2O2 dissociation on TM/N/C and inspire the structure-oriented catalyst design from the viewpoint of volcano relationship.
Affiliation(s)
- Shengbo Wang
  - Key Laboratory for Water Quality and Conservation of the Pearl River Delta, Ministry of Education, Institute of Environmental Research at Greater Bay, Guangzhou University, Guangzhou 510006, China
- Xiuli Hou
  - School of Physics and Materials Science, Guangzhou University, Guangzhou 510006, China
- Yichan Li
  - Key Laboratory for Water Quality and Conservation of the Pearl River Delta, Ministry of Education, Institute of Environmental Research at Greater Bay, Guangzhou University, Guangzhou 510006, China
- Chen Zhou
  - Key Laboratory for Water Quality and Conservation of the Pearl River Delta, Ministry of Education, Institute of Environmental Research at Greater Bay, Guangzhou University, Guangzhou 510006, China
- Peng Zhang
  - Key Laboratory for Water Quality and Conservation of the Pearl River Delta, Ministry of Education, Institute of Environmental Research at Greater Bay, Guangzhou University, Guangzhou 510006, China
- Chun Hu
  - Key Laboratory for Water Quality and Conservation of the Pearl River Delta, Ministry of Education, Institute of Environmental Research at Greater Bay, Guangzhou University, Guangzhou 510006, China
3
Stephens S, Lambert KM. The Importance of Atomic Charges for Predicting Site-Selective Ir-, Ru-, and Rh-Catalyzed C-H Borylations. J Org Chem 2025; 90:6000-6012. [PMID: 40268690 PMCID: PMC12053941 DOI: 10.1021/acs.joc.5c00343] [Received: 02/13/2025] [Revised: 04/04/2025] [Accepted: 04/15/2025] [Indexed: 04/25/2025]
Abstract
A supervised machine learning model has been developed that allows for the prediction of site selectivity in late-stage C-H borylations. Model development was accomplished using literature data for the site-selective (≥95%) C-H borylation of 189 unique arene, heteroarene, and aliphatic substrates that feature a total of 971 possible sp2 or sp3 C-H borylation sites. The reported experimental data was supplemented with additional chemoinformatic descriptors, computed atomic charges at the C-H borylation sites, and data from parameterization of catalytically active tris-boryl complexes resulting from the combination of seven different Ir-, Ru-, and Rh-based precatalysts with eight different ligands. Of the over 1600 parameters investigated, the computed atomic charges (e.g., Hirshfeld, ChelpG, and Mulliken charges) on the hydrogen and carbon atoms at the site of borylation were identified as the most important features that allow for the successful prediction of whether a particular C-H bond will undergo a site-selective borylation. The overall accuracy of the developed model was 88.9% ± 2.5% with precision, recall, and F1 scores of 92-95% for the nonborylating sites and 65-75% for the sites of borylation. The model was demonstrated to be generalizable to molecules outside of the training/test sets with an additional validation set of 12 electronically and structurally diverse systems.
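The per-class precision, recall, and F1 scores quoted in this abstract are standard classification metrics. As a minimal sketch of how such scores follow from confusion counts, the snippet below uses hypothetical counts for the minority "site of borylation" class; the numbers and the function name are illustrative assumptions and are not taken from the paper.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) for one class from raw counts.

    tp: sites correctly predicted to borylate
    fp: sites predicted to borylate that do not
    fn: borylating sites the model missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts for the minority class (NOT data from the paper).
p, r, f1 = precision_recall_f1(tp=70, fp=30, fn=30)
print(f"precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```

Because most candidate C-H sites in a substrate do not react, a model can reach a high overall accuracy while scoring noticeably lower on the borylating class, which is the pattern the abstract reports.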
Affiliation(s)
- Shannon M. Stephens
  - Department of Chemistry and Biochemistry, Old Dominion University, 4501 Elkhorn Ave, Norfolk, Virginia 23529, United States
- Kyle M. Lambert
  - Department of Chemistry and Biochemistry, Old Dominion University, 4501 Elkhorn Ave, Norfolk, Virginia 23529, United States
4
Singh S, Hernández-Lobato JM. A meta-learning approach for selectivity prediction in asymmetric catalysis. Nat Commun 2025; 16:3599. [PMID: 40234410 PMCID: PMC12000603 DOI: 10.1038/s41467-025-58854-8] [Received: 10/01/2024] [Accepted: 03/31/2025] [Indexed: 04/17/2025]
Abstract
Transition metal-catalyzed asymmetric reactions are of high contemporary importance in organic synthesis. Recently, machine learning (ML) has shown promise in accelerating the development of newer catalytic protocols. However, the need for large amount of experimental data can present a bottleneck for implementing ML models. Here, we propose a meta-learning workflow that can harness the literature-derived data to extract shared reaction features and requires only a few examples to predict the outcome of new reactions. Prototypical networks are used as a meta-learning method to predict the enantioselectivity of asymmetric hydrogenation of olefins. This meta-learning model consistently provides significant performance improvement over other popular ML methods such as random forests and graph neural networks. The performance of our meta-model is analyzed with varying sizes of training examples to demonstrate its utility even with limited data. A good model performance on an out-of-sample test set further indicates the general applicability of our approach. We believe this work will provide a leap forward in identifying promising reactions in the early phases of reaction development when minimal data is available.
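For readers unfamiliar with prototypical networks, the core classification step can be sketched as follows: each class is represented by the mean of its few support-example embeddings (its prototype), and a query is assigned to the class with the nearest prototype. In a real prototypical network the embeddings come from a trained neural encoder; here fixed two-dimensional vectors stand in for them, and the class labels `high_ee` and `low_ee` are hypothetical choices for illustration, not the paper's actual setup.

```python
import math


def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def build_prototypes(support):
    """Map each class label to the mean of its support embeddings."""
    return {label: mean_vector(vecs) for label, vecs in support.items()}


def classify(query, prototypes):
    """Assign the query to the class whose prototype is closest (Euclidean)."""
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))


# Hypothetical embeddings for a few support reactions per class.
support = {
    "high_ee": [[0.9, 0.1], [0.8, 0.2]],
    "low_ee": [[0.1, 0.9], [0.2, 0.7]],
}
prototypes = build_prototypes(support)
label = classify([0.7, 0.3], prototypes)
print(label)
```

Only a handful of support examples per class are needed to form a prototype, which is why the approach suits the low-data setting the abstract describes.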
Affiliation(s)
- Sukriti Singh
  - Department of Engineering, University of Cambridge, Cambridge, UK.
5
Sigmund LM, Assante M, Johansson MJ, Norrby PO, Jorner K, Kabeshov M. Computational tools for the prediction of site- and regioselectivity of organic reactions. Chem Sci 2025; 16:5383-5412. [PMID: 40070469 PMCID: PMC11891785 DOI: 10.1039/d5sc00541h] [Received: 01/21/2025] [Accepted: 03/03/2025] [Indexed: 03/14/2025]
Abstract
The regio- and site-selectivity of organic reactions is one of the most important aspects when it comes to synthesis planning. Due to that, massive research efforts were invested into computational models for regio- and site-selectivity prediction, and the introduction of machine learning to the chemical sciences within the past decade has added a whole new dimension to these endeavors. This review article walks through the currently available predictive tools for regio- and site-selectivity with a particular focus on machine learning models while being organized along the individual reaction classes of organic chemistry. Respective featurization techniques and model architectures are described and compared to each other; applications of the tools to critical real-world examples are highlighted. This paper aims to serve as an overview of the field's status quo for both the intended users of the tools, that is synthetic chemists, as well as for developers to find potential new research avenues.
Affiliation(s)
- Lukas M Sigmund
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca Gothenburg, Pepparedsleden 1, 43183 Mölndal, Sweden
- Michele Assante
  - Innovation Centre in Digital Molecular Technologies, Department of Chemistry, University of Cambridge, Lensfield Rd, Cambridge CB2 1EW, UK
  - Compound Synthesis & Management, The Discovery Centre, AstraZeneca Cambridge, Cambridge Biomedical Campus, 1 Francis Crick Avenue, CB2 0AA Cambridge, UK
- Magnus J Johansson
  - Medicinal Chemistry, Research and Early Development, Cardiovascular, Renal and Metabolism (CVRM), BioPharmaceuticals, R&D, AstraZeneca Gothenburg, Pepparedsleden 1, 43183 Mölndal, Sweden
- Per-Ola Norrby
  - Data Science & Modelling, Pharmaceutical Sciences, R&D, AstraZeneca Gothenburg, Pepparedsleden 1, 43183 Mölndal, Sweden
- Kjell Jorner
  - ETH Zürich, Institute of Chemical and Bioengineering, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 1, CH-8093 Zürich, Switzerland
  - National Centre of Competence in Research (NCCR) Catalysis, ETH Zurich, Zurich, Switzerland
- Mikhail Kabeshov
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca Gothenburg, Pepparedsleden 1, 43183 Mölndal, Sweden
6
Sánchez-Fernández D, Torres T, García-Calvo J. Controlling the Symmetry of Perylene Derivatives via Selective ortho-Borylation. J Org Chem 2025; 90:3202-3208. [PMID: 40013334 DOI: 10.1021/acs.joc.4c02669] [Indexed: 02/28/2025]
Abstract
This work presents a systematic and rational approach to the synthesis of previously reported as well as novel tetra- and di-ortho-borylated perylene, perylenediimide, and perylenemonoimide scaffolds. Through optimization of the reaction conditions, employing [Ir(OMe)(COD)]2 as a catalyst and suitable ligands, efficient tetraborylation and regioselective diborylations were achieved. Additionally, the reaction times were reduced from days to hours under microwave irradiation, rendering this methodology a practical and scalable route for the ortho-functionalization of perylene derivatives.
Affiliation(s)
- David Sánchez-Fernández
  - Department of Organic Chemistry, Universidad Autónoma de Madrid, Campus de Cantoblanco, 28049 Madrid, Spain
- Tomás Torres
  - Department of Organic Chemistry, Universidad Autónoma de Madrid, Campus de Cantoblanco, 28049 Madrid, Spain
  - Institute for Advanced Research in Chemical Sciences (IAdChem), Universidad Autónoma de Madrid, Campus de Cantoblanco, 28049 Madrid, Spain
  - IMDEA-Nanociencia, c/Faraday 9, Campus de Cantoblanco, 28049 Madrid, Spain
- José García-Calvo
  - Department of Organic Chemistry, Universidad Autónoma de Madrid, Campus de Cantoblanco, 28049 Madrid, Spain
  - Institute for Advanced Research in Chemical Sciences (IAdChem), Universidad Autónoma de Madrid, Campus de Cantoblanco, 28049 Madrid, Spain
  - IMDEA-Nanociencia, c/Faraday 9, Campus de Cantoblanco, 28049 Madrid, Spain
7
Schleinitz J, Carretero-Cerdán A, Gurajapu A, Harnik Y, Lee G, Pandey A, Milo A, Reisman SE. Designing Target-specific Data Sets for Regioselectivity Predictions on Complex Substrates. J Am Chem Soc 2025; 147:7476-7484. [PMID: 39982221 PMCID: PMC11887056 DOI: 10.1021/jacs.4c15902] [Received: 11/10/2024] [Revised: 02/05/2025] [Accepted: 02/06/2025] [Indexed: 02/22/2025]
Abstract
The development of machine learning models to predict the regioselectivity of C(sp3)-H functionalization reactions is reported. A data set for dioxirane oxidations was curated from the literature and used to generate a model to predict the regioselectivity of C-H oxidation. To assess whether smaller, intentionally designed data sets could provide accuracy on complex targets, a series of acquisition functions were developed to select the most informative molecules for the specific target. Active learning-based acquisition functions that leverage predicted reactivity and model uncertainty were found to outperform those based on molecular and site similarity alone. The use of acquisition functions for data set elaboration significantly reduced the number of data points needed to perform accurate prediction, and it was found that smaller, machine-designed data sets can give accurate predictions when larger, randomly selected data sets fail. Finally, the workflow was experimentally validated on five complex substrates and shown to be applicable to predicting the regioselectivity of arene C-H radical borylation. These studies provide a quantitative alternative to the intuitive extrapolation from "model substrates" that is frequently used to estimate reactivity on complex molecules.
Affiliation(s)
- Jules Schleinitz
  - The Warren and Katharine Schlinger Laboratory for Chemistry and Chemical Engineering, Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States
- Alba Carretero-Cerdán
  - The Warren and Katharine Schlinger Laboratory for Chemistry and Chemical Engineering, Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States
  - Division of Theoretical Chemistry & Biology, CBH School, KTH Royal Institute of Technology, Teknikringen 30, S-10044 Stockholm, Sweden
- Anjali Gurajapu
  - The Warren and Katharine Schlinger Laboratory for Chemistry and Chemical Engineering, Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States
- Yonatan Harnik
  - Department of Chemistry, Ben-Gurion University of the Negev, Beer-Sheva 841051, Israel
- Gina Lee
  - The Warren and Katharine Schlinger Laboratory for Chemistry and Chemical Engineering, Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States
- Amitesh Pandey
  - The Warren and Katharine Schlinger Laboratory for Chemistry and Chemical Engineering, Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States
- Anat Milo
  - Department of Chemistry, Ben-Gurion University of the Negev, Beer-Sheva 841051, Israel
- Sarah E. Reisman
  - The Warren and Katharine Schlinger Laboratory for Chemistry and Chemical Engineering, Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States
8
Sojdak C, Polefrone DA, Shah HM, Vu CD, Orzolek BJ, Jimenez Antenucci PM, Bush MV, Kozlowski MC. Direct (LC-)MS Identification of Regioisomers from C-H Functionalization by Partial Isotopic Labeling. ACS Cent Sci 2025; 11:272-278. [PMID: 40028360 PMCID: PMC11868960 DOI: 10.1021/acscentsci.4c01765] [Received: 10/16/2024] [Revised: 12/24/2024] [Accepted: 01/02/2025] [Indexed: 03/05/2025]
Abstract
C-H functionalization of complex substrates is highly enabling in total synthesis and in the development of late-stage drug candidates. Much work has been dedicated to developing new methods as well as predictive modeling to accelerate route scouting. However, workflows to identify regioisomeric products are arduous, typically requiring chromatographic separation and/or nuclear magnetic resonance spectroscopy analysis. In addition, most reports focus on major products or do not assign regioisomeric products, which biases predictive models constructed from such data. Herein, we present a novel approach to complex reaction analysis utilizing partial deuterium labels, which enables direct product identification via liquid chromatography-mass spectrometry. When combined with spectral deconvolution, the method generates product ratios while circumventing chromatography altogether. Competitive kinetic isotope effects can also be determined. The resultant data are expected to be useful in the construction of predictive models across several dimensions including reaction selectivity, the impact of structure on mechanism, and mass spectral ionization patterns and expedite the identification of drug metabolites.
Affiliation(s)
- Christopher A. Sojdak
  - Department of Chemistry, Roy and Diana Vagelos Laboratories, University of Pennsylvania, Philadelphia, Pennsylvania 19104-6323, United States
- David A. Polefrone
  - Department of Chemistry, Roy and Diana Vagelos Laboratories, University of Pennsylvania, Philadelphia, Pennsylvania 19104-6323, United States
- Hriday M. Shah
  - Department of Chemistry, Roy and Diana Vagelos Laboratories, University of Pennsylvania, Philadelphia, Pennsylvania 19104-6323, United States
- Cassandra D. Vu
  - Department of Chemistry, Roy and Diana Vagelos Laboratories, University of Pennsylvania, Philadelphia, Pennsylvania 19104-6323, United States
- Brandon J. Orzolek
  - Department of Chemistry, Roy and Diana Vagelos Laboratories, University of Pennsylvania, Philadelphia, Pennsylvania 19104-6323, United States
- Pedro M. Jimenez Antenucci
  - Department of Chemistry, Roy and Diana Vagelos Laboratories, University of Pennsylvania, Philadelphia, Pennsylvania 19104-6323, United States
- Micah Valadez Bush
  - Department of Chemistry, Roy and Diana Vagelos Laboratories, University of Pennsylvania, Philadelphia, Pennsylvania 19104-6323, United States
- Marisa C. Kozlowski
  - Department of Chemistry, Roy and Diana Vagelos Laboratories, University of Pennsylvania, Philadelphia, Pennsylvania 19104-6323, United States
9
Zakis J, Lipina RA, Bell S, Williams SR, Mathis M, Johansson MJ, Wencel-Delord J, Smejkal T. High-Throughput Enabled Iridium-Catalyzed C-H Borylation Platform for Late-Stage Functionalization. ACS Catal 2025; 15:3525-3534. [PMID: 40013248 PMCID: PMC11851780 DOI: 10.1021/acscatal.4c07711] [Received: 12/13/2024] [Revised: 01/23/2025] [Accepted: 01/24/2025] [Indexed: 02/28/2025]
Abstract
In this work, we present a dedicated, high-throughput reaction optimization platform allowing for the rapid evaluation of regiodivergent C-H borylation protocols while minimizing the amount of starting material required. The workflow was applied to a diverse set of fragment-like compounds, pharmaceuticals, and agrochemicals, and its practicality was demonstrated by successfully isolating 36 derivatives of bioactive compounds. Leveraging the informer library approach, we provide a comprehensive, side-by-side comparison of catalytic methods, revealing insights into the strengths, limitations, and versatility of each borylation protocol. Surprising reactivity patterns, effectiveness of ligand-free C-H borylation, and the utility of previously reported directed C-H borylation catalysts outside of their expected substrate scope have been noticed. This study highlights the potential of dedicated high-throughput optimization platforms to expand the practical utility of late-stage functionalization protocols for pharmaceutical and agrochemical research.
Affiliation(s)
- Janis M. Zakis
  - Research Chemistry, Syngenta Crop Protection AG, Schaffhauserstrasse 101, AG 4332 Stein, Switzerland
  - Institut für Organische Chemie, Universität Würzburg, Am Hubland, 97074 Würzburg, Germany
- Rebeka A. Lipina
  - Research Chemistry, Syngenta Crop Protection AG, Schaffhauserstrasse 101, AG 4332 Stein, Switzerland
- Sharon Bell
  - Research Chemistry, Syngenta Crop Protection AG, Schaffhauserstrasse 101, AG 4332 Stein, Switzerland
- Simon R. Williams
  - Research Chemistry, Syngenta Crop Protection AG, Schaffhauserstrasse 101, AG 4332 Stein, Switzerland
- Maurus Mathis
  - Research Chemistry, Syngenta Crop Protection AG, Schaffhauserstrasse 101, AG 4332 Stein, Switzerland
- Magnus J. Johansson
  - Medicinal Chemistry, Research and Early Development, Cardiovascular, Renal and Metabolism (CVRM), BioPharmaceuticals R&D, AstraZeneca, Pepparedsleden 1, Mölndal, 431 50 Gothenburg, Sweden
- Joanna Wencel-Delord
  - Institut für Organische Chemie, Universität Würzburg, Am Hubland, 97074 Würzburg, Germany
- Tomas Smejkal
  - Research Chemistry, Syngenta Crop Protection AG, Schaffhauserstrasse 101, AG 4332 Stein, Switzerland
10
Nakajima H, Murata C, Noto N, Saito S. Database Construction for the Virtual Screening of the Ruthenium-Catalyzed Hydrogenation of Ketones. J Org Chem 2025; 90:1054-1060. [PMID: 39762115 DOI: 10.1021/acs.joc.4c02347] [Indexed: 01/18/2025]
Abstract
During the recent development of machine-learning (ML) methods for organic synthesis, the value of "failed experiments" has increasingly been acknowledged. Accordingly, we have developed an exhaustive database comprising 300 entries of experimental data obtained by performing ruthenium-catalyzed hydrogenation reactions using 10 ketones as substrates and 30 phosphine ligands. After evaluating the predictive performance of ML models using the constructed database, we conducted a virtual screening of commercially available phosphine ligands. For the virtual screening, we utilized several models, such as histogram-based gradient boosting and Ridge regression, combined with the Mordred descriptors and MACCSKeys, respectively. The disclosed approach resulted in the identification of high-performance phosphine ligands, and the rationale behind the predictions in the virtual screening was analyzed using SHAP.
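The 300 entries are consistent with a full factorial grid of 10 ketones by 30 phosphine ligands, in which every combination is recorded whether or not it worked. A minimal sketch of laying out such a grid is shown below; the placeholder identifiers and the `outcome` field are hypothetical, since the abstract does not specify the database schema.

```python
from itertools import product

# Placeholder identifiers; the real substrates and ligands are not listed here.
ketones = [f"ketone_{i:02d}" for i in range(1, 11)]   # 10 substrates
ligands = [f"ligand_{j:02d}" for j in range(1, 31)]   # 30 phosphine ligands

# One row per (ketone, ligand) pair; "outcome" stays None until measured,
# so failed or low-performing experiments keep their place in the table.
entries = [
    {"ketone": k, "ligand": l, "outcome": None}
    for k, l in product(ketones, ligands)
]

print(len(entries))
```

Keeping every cell of the grid, including negative results, is what lets a model learn from "failed experiments" rather than from successes only.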
Affiliation(s)
- Haruno Nakajima
  - Graduate School of Science, Nagoya University, Nagoya 464-8602, Japan
- Chihaya Murata
  - Graduate School of Science, Nagoya University, Nagoya 464-8602, Japan
- Naoki Noto
  - Integrated Research Consortium on Chemical Sciences (IRCCS), Nagoya University, Nagoya 464-8602, Japan
- Susumu Saito
  - Graduate School of Science, Nagoya University, Nagoya 464-8602, Japan
  - Integrated Research Consortium on Chemical Sciences (IRCCS), Nagoya University, Nagoya 464-8602, Japan
11
Nippa DF, Müller AT, Atz K, Konrad DB, Grether U, Martin RE, Schneider G. Simple User-Friendly Reaction Format. Mol Inform 2025; 44:e202400361. [PMID: 39846425 PMCID: PMC11755691 DOI: 10.1002/minf.202400361] [Received: 11/29/2024] [Revised: 01/03/2025] [Accepted: 01/06/2025] [Indexed: 01/24/2025]
Abstract
Utilizing the growing wealth of chemical reaction data can boost synthesis planning and increase success rates. Yet, the effectiveness of machine learning tools for retrosynthesis planning and forward reaction prediction relies on accessible, well-curated data presented in a structured format. Although some public and licensed reaction databases exist, they often lack essential information about reaction conditions. To address this issue and promote the principles of findable, accessible, interoperable, and reusable (FAIR) data reporting and sharing, we introduce the Simple User-Friendly Reaction Format (SURF). SURF standardizes the documentation of reaction data through a structured tabular format, requiring only a basic understanding of spreadsheets. This format enables chemists to record the synthesis of molecules in a format that is understandable by both humans and machines, which facilitates seamless sharing and integration directly into machine learning pipelines. SURF files are designed to be interoperable, easily imported into relational databases, and convertible into other formats. This complements existing initiatives like the Open Reaction Database (ORD) and Unified Data Model (UDM). At Roche, SURF plays a crucial role in democratizing FAIR reaction data sharing and expediting the chemical synthesis process.
Affiliation(s)
- David F. Nippa
  - Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, 4070 Basel, Switzerland
  - Department of Pharmacy, Ludwig-Maximilians-Universität München, Butenandtstrasse 5, 81377 Munich, Germany
- Alex T. Müller
  - Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, 4070 Basel, Switzerland
- Kenneth Atz
  - Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, 4070 Basel, Switzerland
- David B. Konrad
  - Department of Pharmacy, Ludwig-Maximilians-Universität München, Butenandtstrasse 5, 81377 Munich, Germany
- Uwe Grether
  - Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, 4070 Basel, Switzerland
- Rainer E. Martin
  - Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, 4070 Basel, Switzerland
- Gisbert Schneider
  - Department of Biosystems Science and Engineering, ETH Zurich, Klingelbergstrasse 48, 4056 Basel, Switzerland
12
Xu L, Zhu J, Shen X, Chai J, Shi L, Wu B, Li W, Ma D. 6-Hydroxy Picolinohydrazides Promoted Cu(I)-Catalyzed Hydroxylation Reaction in Water: Machine-Learning Accelerated Ligands Design and Reaction Optimization. Angew Chem Int Ed Engl 2024; 63:e202412552. [PMID: 39189301 DOI: 10.1002/anie.202412552] [Received: 07/03/2024] [Revised: 08/19/2024] [Accepted: 08/25/2024] [Indexed: 08/28/2024]
Abstract
Hydroxylated (hetero)arenes are privileged motifs in natural products, materials, small-molecule pharmaceuticals and serve as versatile intermediates in synthetic organic chemistry. Herein, we report an efficient Cu(I)/6-hydroxy picolinohydrazide-catalyzed hydroxylation reaction of (hetero)aryl halides (Br, Cl) in water. By establishing machine learning (ML) models, the design of ligands and optimization of reaction conditions were effectively accelerated. The N-(1,3-dimethyl-9H-carbazol-9-yl)-6-hydroxypicolinamide (L32, 6-HPA-DMCA) demonstrated high efficiency for (hetero)aryl bromides, promoting hydroxylation reactions with a minimal catalyst loading of 0.01 mol % (100 ppm) at 80 °C to reach 10000 TON; for substrates containing sensitive functional groups, the catalyst loading needs to be increased to 3.0 mol % under near-room temperature conditions. N-(2,7-Di-tert-butyl-9H-carbazol-9-yl)-6-hydroxypicolinamide (L42, 6-HPA-DTBCA) displayed superior reaction activity for chloride substrates, enabling hydroxylation reactions at 100 °C with 2-3 mol % catalyst loading. These represent the state of art for both lowest catalyst loading and temperature in the copper-catalyzed hydroxylation reactions. Furthermore, this method features a sustainable and environmentally friendly solvent system, accommodates a wide range of substrates, and shows potential for developing robust and scalable synthesis processes for key pharmaceutical intermediates.
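The loading and turnover figures in this abstract are linked by simple arithmetic: the turnover number (TON) is moles of product per mole of catalyst, so at quantitative yield a loading of 0.01 mol % (equivalently 100 ppm on a molar basis) gives a TON of about 10000. The sketch below illustrates that relation; the function names are illustrative, and the 100% yield default is an assumption made to reproduce the upper bound rather than a reported value.

```python
def turnover_number(loading_mol_percent, yield_percent=100.0):
    """TON = mol product / mol catalyst = yield % / catalyst loading (mol %)."""
    return yield_percent / loading_mol_percent


def mol_percent_to_ppm(loading_mol_percent):
    """Convert a molar loading in percent to parts per million (1 % = 10^4 ppm)."""
    return loading_mol_percent * 1e4


ton_low_loading = turnover_number(0.01)   # about 10000 at quantitative yield
ppm_low_loading = mol_percent_to_ppm(0.01)  # about 100 ppm
ton_mild = turnover_number(3.0)           # about 33 at 3.0 mol % loading
print(round(ton_low_loading), round(ppm_low_loading), round(ton_mild))
```

The same relation shows the trade-off the abstract mentions: raising the loading to 3.0 mol % for sensitive substrates caps the achievable TON at roughly 33.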
Affiliation(s)
- Lanting Xu
  - State Key Laboratory of Chemical Biology, Shanghai Institute of Organic Chemistry, University of Chinese Academy of Sciences, Chinese Academy of Sciences, 345 Lingling Lu, Shanghai, 200032, China
- Jiazhou Zhu
  - Suzhou Novartis Technical Development Co., Ltd., #18-1, Tonglian Road, Bixi Subdistrict, Changshu, Jiangsu, 215537, China
- Xiaodong Shen
  - Suzhou Novartis Technical Development Co., Ltd., #18-1, Tonglian Road, Bixi Subdistrict, Changshu, Jiangsu, 215537, China
- Jiashuang Chai
  - Chang-Kung Chuang Institute, School of Chemistry and Molecular Engineering, East China Normal University, 500 Dongchuang Lu, Shanghai, 200062, China
- Lei Shi
  - Suzhou Novartis Technical Development Co., Ltd., #18-1, Tonglian Road, Bixi Subdistrict, Changshu, Jiangsu, 215537, China
- Bin Wu
  - Suzhou Novartis Technical Development Co., Ltd., #18-1, Tonglian Road, Bixi Subdistrict, Changshu, Jiangsu, 215537, China
- Wei Li
  - Suzhou Novartis Technical Development Co., Ltd., #18-1, Tonglian Road, Bixi Subdistrict, Changshu, Jiangsu, 215537, China
- Dawei Ma
  - State Key Laboratory of Chemical Biology, Shanghai Institute of Organic Chemistry, University of Chinese Academy of Sciences, Chinese Academy of Sciences, 345 Lingling Lu, Shanghai, 200032, China
13
|
Dong K, Wu T, Wang M, Lin L. Spirobipyridine Ligand Enabled Iridium-Catalyzed Site-Selective C-H Activation via Non-Covalent Interactions. Angew Chem Int Ed Engl 2024; 63:e202411158. [PMID: 39008194 DOI: 10.1002/anie.202411158] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2024] [Revised: 07/15/2024] [Accepted: 07/15/2024] [Indexed: 07/16/2024]
Abstract
The selective borylation of specific C-H bonds in organic synthesis remains a formidable challenge. In this study, we present a novel spirobipyridine ligand that features a binaphthyl backbone. This ligand facilitates the iridium-catalyzed selective C-H borylation of benzene derivatives. The ligand is designed with "side-arm-wall" substituents that allow vicinal di- or multi-substituted benzene derivatives to approach the metal center and effectively block other reactive sites through non-covalent interactions with substrates. The effectiveness of this strategy is demonstrated by the successful selective distal C-H activation of various alkaloids and its broad compatibility with functional groups.
Affiliation(s)
- Kun Dong
- School of Chemistry, Dalian University of Technology, Dalian, Liaoning, 116024, China
- Tianbao Wu
- State Key Laboratory of Coordination Chemistry, Chemistry and Biomedicine Innovation Center (ChemBIC), School of Chemistry and Chemical Engineering, Nanjing University, Nanjing, 210023, China
- Minyan Wang
- State Key Laboratory of Coordination Chemistry, Chemistry and Biomedicine Innovation Center (ChemBIC), School of Chemistry and Chemical Engineering, Nanjing University, Nanjing, 210023, China
- Luqing Lin
- School of Chemistry, Dalian University of Technology, Dalian, Liaoning, 116024, China
14
Singh S, Hernández-Lobato JM. Data-Driven Insights into the Transition-Metal-Catalyzed Asymmetric Hydrogenation of Olefins. J Org Chem 2024; 89:12467-12478. [PMID: 39149801 PMCID: PMC11382158 DOI: 10.1021/acs.joc.4c01396] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/17/2024]
Abstract
The transition-metal-catalyzed asymmetric hydrogenation of olefins is one of the key transformations with great utility in various industrial applications. The field has been dominated by the use of noble metal catalysts, such as iridium and rhodium. The reactions with the earth-abundant cobalt metal have increased only in recent years. In this work, we analyze the large amount of literature data available on iridium- and rhodium-catalyzed asymmetric hydrogenation. The limited data on reactions using Co catalysts are then examined in the context of Ir and Rh to obtain a better understanding of the reactivity pattern. A detailed data-driven study of the types of olefins, ligands, and reaction conditions such as solvent, temperature, and pressure is carried out. Our analysis provides an understanding of the literature trends and demonstrates that only a few olefin-ligand combinations or reaction conditions are frequently used. The knowledge of this bias in the literature data toward a certain group of substrates or reaction conditions can be useful for practitioners to design new reaction data sets that are suitable to obtain meaningful predictions from machine-learning models.
Affiliation(s)
- Sukriti Singh
- Department of Engineering, University of Cambridge, Cambridge CB2 1PZ, U.K.
15
Atz K, Nippa DF, Müller AT, Jost V, Anelli A, Reutlinger M, Kramer C, Martin RE, Grether U, Schneider G, Wuitschik G. Geometric deep learning-guided Suzuki reaction conditions assessment for applications in medicinal chemistry. RSC Med Chem 2024; 15:2310-2321. [PMID: 39026644 PMCID: PMC11253849 DOI: 10.1039/d4md00196f] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2024] [Accepted: 05/25/2024] [Indexed: 07/20/2024] [Open access]
Abstract
Suzuki cross-coupling reactions are considered a valuable tool for constructing carbon-carbon bonds in small molecule drug discovery. However, the synthesis of chemical matter often represents a time-consuming and labour-intensive bottleneck. We demonstrate how machine learning methods trained on high-throughput experimentation (HTE) data can be leveraged to enable fast reaction condition selection for novel coupling partners. We show that the trained models support chemists in determining suitable catalyst-solvent-base combinations for individual transformations including an evaluation of the need for HTE screening. We introduce an algorithm for designing 96-well plates optimized towards reaction yields and discuss the model performance of zero- and few-shot machine learning. The best-performing machine learning model achieved a three-category classification accuracy of 76.3% (±0.2%) and an F1-score for a binary classification of 79.1% (±0.9%). Validation on eight reactions revealed an area under the receiver operating characteristic curve (ROC-AUC) value of 0.82 (±0.07) for few-shot machine learning. On the other hand, zero-shot machine learning models achieved a mean ROC-AUC value of 0.63 (±0.16). This study positively advocates the application of few-shot machine learning-guided reaction condition selection for HTE campaigns in medicinal chemistry and highlights practical applications as well as challenges associated with zero-shot machine learning.
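The two binary-classification metrics reported in the abstract above can be computed from first principles. The following is a minimal sketch in plain Python on made-up labels and scores (none of the numbers are from the study); the function names `f1_score` and `roc_auc` are illustrative helpers, not the authors' code.

```python
def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """ROC-AUC as the probability that a random positive outranks a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    # Ties between a positive and a negative score count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
scores = [0.9, 0.2, 0.4, 0.8, 0.6, 0.1]
print(f1_score(y_true, y_pred), roc_auc(y_true, scores))
```

The pairwise formulation of ROC-AUC is quadratic in the number of samples, which is fine for small validation sets such as the eight reactions mentioned above; a rank-based implementation would be preferable for large data sets.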
Affiliation(s)
- Kenneth Atz
- Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, 4070 Basel, Switzerland
- David F Nippa
- Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, 4070 Basel, Switzerland
- Alex T Müller
- Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, 4070 Basel, Switzerland
- Vera Jost
- Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, 4070 Basel, Switzerland
- Andrea Anelli
- Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, 4070 Basel, Switzerland
- Michael Reutlinger
- Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, 4070 Basel, Switzerland
- Christian Kramer
- Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, 4070 Basel, Switzerland
- Rainer E Martin
- Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, 4070 Basel, Switzerland
- Uwe Grether
- Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, 4070 Basel, Switzerland
- Gisbert Schneider
- Department of Chemistry and Applied Biosciences, ETH Zurich, Vladimir-Prelog-Weg 4, 8093 Zurich, Switzerland
- Georg Wuitschik
- Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, 4070 Basel, Switzerland
16
Borup RM, Ree N, Jensen JH. pKalculator: A pKa predictor for C-H bonds. Beilstein J Org Chem 2024; 20:1614-1622. [PMID: 39076289 PMCID: PMC11285060 DOI: 10.3762/bjoc.20.144] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2024] [Accepted: 07/02/2024] [Indexed: 07/31/2024] [Open access]
Abstract
Determining the pKa values of various C-H sites in organic molecules offers valuable insights for synthetic chemists in predicting reaction sites. As molecular complexity increases, this task becomes more challenging. This paper introduces pKalculator, a quantum chemistry (QM)-based workflow for automatic computations of C-H pKa values, which is used to generate a training dataset for a machine learning (ML) model. The QM workflow is benchmarked against 695 experimentally determined C-H pKa values in DMSO. The ML model is trained on a diverse dataset of 775 molecules with 3910 C-H sites. Our ML model predicts C-H pKa values with a mean absolute error (MAE) and a root mean squared error (RMSE) of 1.24 and 2.15 pKa units, respectively. Furthermore, we employ our model on 1043 pKa-dependent reactions (aldol, Claisen, and Michael) and successfully indicate the reaction sites with a Matthews correlation coefficient (MCC) of 0.82.
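The three error measures quoted in the abstract above (MAE, RMSE, MCC) have compact definitions. The following is a minimal sketch in plain Python with invented toy values; the helper names `mae`, `rmse`, and `mcc` are illustrative and are not taken from pKalculator.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error; penalizes large deviations more strongly than MAE."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom


# Toy pKa-like values, invented for illustration only.
reference = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 6.0]
print(mae(reference, predicted), rmse(reference, predicted))
print(mcc(tp=5, tn=5, fp=0, fn=0))
```

Because RMSE is never smaller than MAE and grows faster with outliers, a gap such as 2.15 versus 1.24 pKa units is what one would expect when a minority of sites carry comparatively large errors.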
Affiliation(s)
- Rasmus M Borup
- Department of Chemistry, University of Copenhagen, Copenhagen, DK-2100, Denmark
- Nicolai Ree
- Department of Chemistry, University of Copenhagen, Copenhagen, DK-2100, Denmark
- Jan H Jensen
- Department of Chemistry, University of Copenhagen, Copenhagen, DK-2100, Denmark
17
Raghavan P, Rago AJ, Verma P, Hassan MM, Goshu GM, Dombrowski AW, Pandey A, Coley CW, Wang Y. Incorporating Synthetic Accessibility in Drug Design: Predicting Reaction Yields of Suzuki Cross-Couplings by Leveraging AbbVie's 15-Year Parallel Library Data Set. J Am Chem Soc 2024; 146:15070-15084. [PMID: 38768950 PMCID: PMC11157529 DOI: 10.1021/jacs.4c00098] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2024] [Revised: 04/24/2024] [Accepted: 04/25/2024] [Indexed: 05/22/2024]
Abstract
Despite the increased use of computational tools to supplement medicinal chemists' expertise and intuition in drug design, predicting synthetic yields in medicinal chemistry endeavors remains an unsolved challenge. Existing design workflows could profoundly benefit from reaction yield prediction, as precious material waste could be reduced, and a greater number of relevant compounds could be delivered to advance the design, make, test, analyze (DMTA) cycle. In this work, we detail the evaluation of AbbVie's medicinal chemistry library data set to build machine learning models for the prediction of Suzuki coupling reaction yields. The combination of density functional theory (DFT)-derived features and Morgan fingerprints was identified to perform better than one-hot encoded baseline modeling, furnishing encouraging results. Overall, we observe modest generalization to unseen reactant structures within the 15-year retrospective library data set. Additionally, we compare predictions made by the model to those made by expert medicinal chemists, finding that the model can often predict both reaction success and reaction yields with greater accuracy. Finally, we demonstrate the application of this approach to suggest structurally and electronically similar building blocks to replace those predicted or observed to be unsuccessful prior to or after synthesis, respectively. The yield prediction model was used to select similar monomers predicted to have higher yields, resulting in greater synthesis efficiency of relevant drug-like molecules.
Affiliation(s)
- Priyanka Raghavan
- Department of Chemical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, Massachusetts 02139, United States
- Alexander J. Rago
- Advanced Chemistry Technologies Group, AbbVie, Inc., 1 N Waukegan Rd, North Chicago, Illinois 60064, United States
- Pritha Verma
- Advanced Chemistry Technologies Group, AbbVie, Inc., 1 N Waukegan Rd, North Chicago, Illinois 60064, United States
- Majdi M. Hassan
- RAIDERS Group, AbbVie, Inc., 1 N Waukegan Rd, North Chicago, Illinois 60064, United States
- Gashaw M. Goshu
- Advanced Chemistry Technologies Group, AbbVie, Inc., 1 N Waukegan Rd, North Chicago, Illinois 60064, United States
- Amanda W. Dombrowski
- Advanced Chemistry Technologies Group, AbbVie, Inc., 1 N Waukegan Rd, North Chicago, Illinois 60064, United States
- Abhishek Pandey
- RAIDERS Group, AbbVie, Inc., 1 N Waukegan Rd, North Chicago, Illinois 60064, United States
- Connor W. Coley
- Department of Chemical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, Massachusetts 02139, United States
- Ying Wang
- Advanced Chemistry Technologies Group, AbbVie, Inc., 1 N Waukegan Rd, North Chicago, Illinois 60064, United States
18
Kotlyarov R, Papachristos K, Wood GPF, Goodman JM. Leveraging Language Model Multitasking To Predict C-H Borylation Selectivity. J Chem Inf Model 2024; 64:4286-4297. [PMID: 38708520 PMCID: PMC11134489 DOI: 10.1021/acs.jcim.4c00137] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Revised: 04/05/2024] [Accepted: 04/23/2024] [Indexed: 05/07/2024]
Abstract
C-H borylation is a high-value transformation in the synthesis of lead candidates for the pharmaceutical industry because a wide array of downstream coupling reactions is available. However, predicting its regioselectivity, especially in drug-like molecules that may contain multiple heterocycles, is not a trivial task. Using a data set of borylation reactions from Reaxys, we explored how a language model originally trained on USPTO_500_MT, a broad-scope set of patent data, can be used to predict the C-H borylation reaction product in different modes: product generation and site reactivity classification. Our fine-tuned T5Chem multitask language model can generate the correct product in 79% of cases. It can also classify the reactive aromatic C-H bonds with 95% accuracy and 88% positive predictive value, exceeding purpose-developed graph-based neural networks.
Affiliation(s)
- Ruslan Kotlyarov
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K.
- Geoffrey P. F. Wood
- Exscientia Plc, The Schrödinger Building, Oxford Science Park, Oxford OX4 4GE, U.K.
- Jonathan M. Goodman
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K.
19
Xu J, Ye X, Lv Z, Chen YH, Wang XS. The Role of Base in Reaction Performance of Photochemical Synthesis of Thiazoles: An Integrated Theoretical and Experimental Study. Chemistry 2024; 30:e202304279. [PMID: 38409580 DOI: 10.1002/chem.202304279] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Revised: 02/25/2024] [Accepted: 02/26/2024] [Indexed: 02/28/2024]
Abstract
Artificial intelligence (AI)/machine learning (ML) is emerging as pivotal in synthetic chemistry, offering revolutionary potential in retrosynthetic analysis, reaction conditions, and reaction prediction. We have combined chemical descriptors, primarily based on Density Functional Theory (DFT) calculations, with various AI/ML tools such as Multi-Layer Perceptron (MLP) and Random Forest (RF), to predict the synthesis of 2-arylbenzothiazole in photoredox reactions. Significantly, our models underscore the critical role of the molecular structure and physicochemical characteristics of the base, especially the total atomic polarizabilities, in the rate-determining steps involving cyclohexyl and phenethyl moieties of the substrate. Moreover, we validated our findings in articles through experimental studies. This work showcases the power of AI/ML and quantum chemistry in shaping the future of organic chemistry.
Affiliation(s)
- Jiaxin Xu
- The Institute for Advanced Studies (IAS), Wuhan University, Wuhan, 430072, China
- Xiaoyu Ye
- The Institute for Advanced Studies (IAS), Wuhan University, Wuhan, 430072, China
- Zongchao Lv
- The Institute for Advanced Studies (IAS), Wuhan University, Wuhan, 430072, China
- CMC Pharmaceutical Research Center, Wuhan RS Pharmaceutical Co., Ltd., Wuhan, 430073, China
- Yi-Hung Chen
- The Institute for Advanced Studies (IAS), Wuhan University, Wuhan, 430072, China
- Xiang Simon Wang
- Howard University College of Pharmacy, 2300 Fourth Street NW, Washington, DC 20059, United States
20
Nippa DF, Atz K, Hohler R, Müller AT, Marx A, Bartelmus C, Wuitschik G, Marzuoli I, Jost V, Wolfard J, Binder M, Stepan AF, Konrad DB, Grether U, Martin RE, Schneider G. Enabling late-stage drug diversification by high-throughput experimentation with geometric deep learning. Nat Chem 2024; 16:239-248. [PMID: 37996732 PMCID: PMC10849962 DOI: 10.1038/s41557-023-01360-5] [Citation(s) in RCA: 24] [Impact Index Per Article: 24.0] [Received: 10/21/2022] [Accepted: 10/03/2023] [Indexed: 11/25/2023]
Abstract
Late-stage functionalization is an economical approach to optimize the properties of drug candidates. However, the chemical complexity of drug molecules often makes late-stage diversification challenging. To address this problem, a late-stage functionalization platform based on geometric deep learning and high-throughput reaction screening was developed. Considering borylation as a critical step in late-stage functionalization, the computational model predicted reaction yields for diverse reaction conditions with a mean absolute error margin of 4-5%, while the reactivity of novel reactions with known and unknown substrates was classified with a balanced accuracy of 92% and 67%, respectively. The regioselectivity of the major products was accurately captured with a classifier F-score of 67%. When applied to 23 diverse commercial drug molecules, the platform successfully identified numerous opportunities for structural diversification. The influence of steric and electronic information on model performance was quantified, and a comprehensive simple user-friendly reaction format was introduced that proved to be a key enabler for seamlessly integrating deep learning and high-throughput experimentation for late-stage functionalization.
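Balanced accuracy, the reactivity metric cited in the abstract above, is the mean of the per-class recalls and so is not inflated by class imbalance. The following is a minimal sketch in plain Python on invented labels; `balanced_accuracy` is an illustrative helper and not part of the published platform.

```python
def balanced_accuracy(y_true, y_pred):
    """Average the recall of each class that occurs in y_true."""
    recalls = []
    for cls in sorted(set(y_true)):
        indices = [i for i, t in enumerate(y_true) if t == cls]
        correct = sum(1 for i in indices if y_pred[i] == cls)
        recalls.append(correct / len(indices))
    return sum(recalls) / len(recalls)


# Toy, imbalanced example: four reactive substrates, one unreactive.
y_true = [1, 1, 1, 1, 0]
y_pred = [1, 1, 1, 1, 1]  # a model that always predicts "reactive"
plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain_accuracy, balanced_accuracy(y_true, y_pred))
```

In this toy case the always-positive model scores 0.8 on plain accuracy but only 0.5 on balanced accuracy, which is why the latter is the more informative figure when reactive and unreactive outcomes are unevenly represented.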
Affiliation(s)
- David F Nippa
- Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Basel, Switzerland
- Department of Pharmacy, Ludwig-Maximilians-Universität München, Munich, Germany
- Kenneth Atz
- Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland
- Remo Hohler
- Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Basel, Switzerland
- Alex T Müller
- Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Basel, Switzerland
- Andreas Marx
- Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Basel, Switzerland
- Christian Bartelmus
- Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Basel, Switzerland
- Georg Wuitschik
- Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Basel, Switzerland
- Irene Marzuoli
- Process Chemistry and Catalysis (PCC), F. Hoffmann-La Roche Ltd., Basel, Switzerland
- Vera Jost
- Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Basel, Switzerland
- Jens Wolfard
- Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Basel, Switzerland
- Martin Binder
- Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Basel, Switzerland
- Antonia F Stepan
- Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Basel, Switzerland
- David B Konrad
- Department of Pharmacy, Ludwig-Maximilians-Universität München, Munich, Germany
- Uwe Grether
- Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Basel, Switzerland
- Rainer E Martin
- Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Basel, Switzerland
- Gisbert Schneider
- Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland
- ETH Singapore SEC Ltd, Singapore, Singapore
21
Raghavan P, Haas BC, Ruos ME, Schleinitz J, Doyle AG, Reisman SE, Sigman MS, Coley CW. Dataset Design for Building Models of Chemical Reactivity. ACS Cent Sci 2023; 9:2196-2204. [PMID: 38161380 PMCID: PMC10755851 DOI: 10.1021/acscentsci.3c01163] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 09/20/2023] [Revised: 11/06/2023] [Accepted: 11/15/2023] [Indexed: 01/03/2024]
Abstract
Models can codify our understanding of chemical reactivity and serve a useful purpose in the development of new synthetic processes via, for example, evaluating hypothetical reaction conditions or in silico substrate tolerance. Perhaps the most determining factor is the composition of the training data and whether it is sufficient to train a model that can make accurate predictions over the full domain of interest. Here, we discuss the design of reaction datasets in ways that are conducive to data-driven modeling, emphasizing the idea that training set diversity and model generalizability rely on the choice of molecular or reaction representation. We additionally discuss the experimental constraints associated with generating common types of chemistry datasets and how these considerations should influence dataset design and model building.
Affiliation(s)
- Priyanka Raghavan
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Brittany C. Haas
- Department of Chemistry, University of Utah, Salt Lake City, Utah 84112, United States
- Madeline E. Ruos
- Department of Chemistry & Biochemistry, University of California, Los Angeles, Los Angeles, California 90095, United States
- Jules Schleinitz
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States
- Abigail G. Doyle
- Department of Chemistry & Biochemistry, University of California, Los Angeles, Los Angeles, California 90095, United States
- Sarah E. Reisman
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States
- Matthew S. Sigman
- Department of Chemistry, University of Utah, Salt Lake City, Utah 84112, United States
- Connor W. Coley
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
22
Roque JB, Shimozono AM, Pabst TP, Hierlmeier G, Peterson PO, Chirik PJ. Kinetic and thermodynamic control of C(sp2)-H activation enables site-selective borylation. Science 2023; 382:1165-1170. [PMID: 38060669 PMCID: PMC10898344 DOI: 10.1126/science.adj6527] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 07/08/2023] [Accepted: 10/10/2023] [Indexed: 02/29/2024]
Abstract
Catalysts that distinguish between electronically distinct carbon-hydrogen (C-H) bonds without relying on steric effects or directing groups are challenging to design. In this work, cobalt precatalysts supported by N-alkyl-imidazole-substituted pyridine dicarbene (ACNC) pincer ligands are described that enable undirected, remote borylation of fluoroaromatics and expansion of scope to include electron-rich arenes, pyridines, and tri- and difluoromethoxylated arenes, thereby addressing one of the major limitations of first-row transition metal C-H functionalization catalysts. Mechanistic studies established a kinetic preference for C-H bond activation at the meta-position despite cobalt-aryl complexes resulting from ortho C-H activation being thermodynamically preferred. Switchable site selectivity in C-H borylation as a function of the boron reagent was thereby preliminarily demonstrated using a single precatalyst.
Affiliation(s)
- Jose B. Roque
- Department of Chemistry, Princeton University, Princeton, New Jersey, 08544, U.S.A
- Alex M. Shimozono
- Department of Chemistry, Princeton University, Princeton, New Jersey, 08544, U.S.A
- Tyler P. Pabst
- Department of Chemistry, Princeton University, Princeton, New Jersey, 08544, U.S.A
- Gabriele Hierlmeier
- Department of Chemistry, Princeton University, Princeton, New Jersey, 08544, U.S.A
- Paul O. Peterson
- Department of Chemistry, Princeton University, Princeton, New Jersey, 08544, U.S.A
- Paul J. Chirik
- Department of Chemistry, Princeton University, Princeton, New Jersey, 08544, U.S.A
23
Nippa DF, Atz K, Müller AT, Wolfard J, Isert C, Binder M, Scheidegger O, Konrad DB, Grether U, Martin RE, Schneider G. Identifying opportunities for late-stage C-H alkylation with high-throughput experimentation and in silico reaction screening. Commun Chem 2023; 6:256. [PMID: 37985850 PMCID: PMC10661846 DOI: 10.1038/s42004-023-01047-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 06/08/2023] [Accepted: 10/30/2023] [Indexed: 11/22/2023] [Open access]
Abstract
Enhancing the properties of advanced drug candidates is aided by the direct incorporation of specific chemical groups, avoiding the need to construct the entire compound from the ground up. Nevertheless, their chemical intricacy often poses challenges in predicting reactivity for C-H activation reactions and planning their synthesis. We adopted a reaction screening approach that combines high-throughput experimentation (HTE) at a nanomolar scale with computational graph neural networks (GNNs). This approach aims to identify suitable substrates for late-stage C-H alkylation using Minisci-type chemistry. GNNs were trained using experimentally generated reactions derived from in-house HTE and literature data. These trained models were then used to predict, in a forward-looking manner, the coupling of 3180 advanced heterocyclic building blocks with a diverse set of sp3-rich carboxylic acids. This predictive approach aimed to explore the substrate landscape for Minisci-type alkylations. Promising candidates were chosen, their production was scaled up, and they were subsequently isolated and characterized. This process led to the creation of 30 novel, functionally modified molecules that hold potential for further refinement. These results positively advocate the application of HTE-based machine learning to virtual reaction screening.
Affiliation(s)
- David F Nippa
- Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, 4070, Basel, Switzerland
- Department of Pharmacy, Ludwig-Maximilians-Universität München, Butenandtstrasse 5, 81377, Munich, Germany
- Kenneth Atz
- Department of Chemistry and Applied Biosciences, ETH Zurich, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
- Alex T Müller
- Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, 4070, Basel, Switzerland
- Jens Wolfard
- Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, 4070, Basel, Switzerland
- Clemens Isert
- Department of Chemistry and Applied Biosciences, ETH Zurich, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
- Martin Binder
- Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, 4070, Basel, Switzerland
- Oliver Scheidegger
- Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, 4070, Basel, Switzerland
- David B Konrad
- Department of Pharmacy, Ludwig-Maximilians-Universität München, Butenandtstrasse 5, 81377, Munich, Germany
- Uwe Grether
- Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, 4070, Basel, Switzerland
- Rainer E Martin
- Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, 4070, Basel, Switzerland
- Gisbert Schneider
- Department of Chemistry and Applied Biosciences, ETH Zurich, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
24
Yu IF, Wilson JW, Hartwig JF. Transition-Metal-Catalyzed Silylation and Borylation of C-H Bonds for the Synthesis and Functionalization of Complex Molecules. Chem Rev 2023; 123:11619-11663. [PMID: 37751601 DOI: 10.1021/acs.chemrev.3c00207] [Citation(s) in RCA: 46] [Impact Index Per Article: 23.0] [Indexed: 09/28/2023]
Abstract
The functionalization of C-H bonds in organic molecules containing functional groups has been one of the holy grails of catalysis. One synthetically important approach to the diverse functionalization of C-H bonds is the catalytic silylation or borylation of C-H bonds, which enables a broad array of downstream transformations to afford diverse structures. Advances in both undirected and directed methods for the transition-metal-catalyzed silylation and borylation of C-H bonds have led to their rapid adoption in the early, middle, and late stages of the synthesis of complex molecules. In this Review, we survey the application of the transition-metal-catalyzed silylation and borylation of C-H bonds to the synthesis of bioactive molecules, organic materials, and ligands. Overall, we aim to provide a picture of the state of the art of the silylation and borylation of C-H bonds as applied to the synthesis and modification of diverse architectures that will spur further application and development of these reactions.
Affiliation(s)
- Isaac F Yu
- Department of Chemistry, University of California, Berkeley, California 94720, United States
- Jake W Wilson
- Department of Chemistry, University of California, Berkeley, California 94720, United States
- John F Hartwig
- Department of Chemistry, University of California, Berkeley, California 94720, United States