1
Hisata Y, Washio T, Takizawa S, Ogoshi S, Hoshimoto Y. In-silico-assisted derivatization of triarylboranes for the catalytic reductive functionalization of aniline-derived amino acids and peptides with H2. Nat Commun 2024; 15:3708. [PMID: 38714662 PMCID: PMC11076482 DOI: 10.1038/s41467-024-47984-0] [Received: 11/08/2023] [Accepted: 04/16/2024] [Indexed: 05/10/2024] Open
Abstract
Cheminformatics-based machine learning (ML) has been employed to determine optimal reaction conditions, including catalyst structures, in the field of synthetic chemistry. However, such ML-focused strategies have remained largely unexplored in the context of catalytic molecular transformations using Lewis-acidic main-group elements, probably due to the absence of a candidate library and effective guidelines (parameters) for the prediction of the activity of main-group elements. Here, the construction of a triarylborane library and its application to an ML-assisted approach for the catalytic reductive alkylation of aniline-derived amino acids and C-terminal-protected peptides with aldehydes and H2 is reported. A combined theoretical and experimental approach identified the optimal borane, i.e., B(2,3,5,6-Cl4-C6H)(2,6-F2-3,5-(CF3)2-C6H)2, which exhibits remarkable functional-group compatibility toward aniline derivatives in the presence of 4-methyltetrahydropyran. The present catalytic system generates H2O as the sole byproduct.
Affiliation(s)
- Yusei Hisata
  - Department of Applied Chemistry, Faculty of Engineering, Osaka University, Suita, Osaka, 565-0871, Japan
- Takashi Washio
  - Department of Reasoning for Intelligence and Artificial Intelligence Research Center, SANKEN, Osaka University, Ibaraki, Osaka, 567-0047, Japan
- Shinobu Takizawa
  - Department of Synthetic Organic Chemistry and Artificial Intelligence Research Center, SANKEN, Osaka University, Ibaraki, Osaka, 567-0047, Japan
- Sensuke Ogoshi
  - Department of Applied Chemistry, Faculty of Engineering, Osaka University, Suita, Osaka, 565-0871, Japan
- Yoichi Hoshimoto
  - Department of Applied Chemistry, Faculty of Engineering, Osaka University, Suita, Osaka, 565-0871, Japan
  - Division of Applied Chemistry, Center for Future Innovation (CFi), Faculty of Engineering, Osaka University, Suita, Osaka, 565-0871, Japan
2
Tcyrulnikov S, Hubbell AK, Pedro D, Reyes GP, Monfette S, Weix DJ, Hansen EC. Computationally Guided Ligand Discovery from Compound Libraries and Discovery of a New Class of Ligands for Ni-Catalyzed Cross-Electrophile Coupling of Challenging Quinoline Halides. J Am Chem Soc 2024; 146:6947-6954. [PMID: 38427582 DOI: 10.1021/jacs.3c14607] [Indexed: 03/03/2024]
Abstract
Although screening technology has heavily impacted the fields of metal catalysis and drug discovery, its application to the discovery of new catalyst classes has been limited. The diversity of on- and off-cycle pathways, combined with incomplete mechanistic understanding, means that screens of potential new ligands have thus far been guided by intuitive analysis of the metal binding potential. This has resulted in the discovery of new classes of ligands, but the low hit rates have limited the use of this strategy because large screens require considerable cost and effort. Here, we demonstrate a method to identify promising screening directions via simple and scalable computational and linear regression tools that leads to a substantial improvement in hit rate, enabling the use of smaller screens to find new ligands. The application of this approach to a particular example of Ni-catalyzed cross-electrophile coupling of aryl halides with alkyl halides revealed a previously overlooked trend: reactions with more electron-poor amidine ligands result in a higher yield. Focused screens utilizing this trend were more successful than serendipity-based screening and led to the discovery of two new types of ligands, pyridyl oxadiazoles and pyridyl oximes. These ligands are especially effective for couplings of bromo- and chloroquinolines and isoquinolines, where they are now the state of the art. The simplicity of these models with parameters derived from metal-free ligand structures should make this approach scalable and widely accessible.
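The regression-guided screening idea described in this abstract can be illustrated with a minimal sketch. It assumes a single hypothetical electronic descriptor per ligand (a larger value meaning a more electron-poor ligand) and invented training yields; the names `fit_line` and `rank_candidates` and every number are illustrative assumptions, not the authors' model or data.

```python
from typing import Dict, List, Tuple


def fit_line(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired observations")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("descriptor values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rank_candidates(
    descriptors: Dict[str, float], slope: float, intercept: float
) -> List[str]:
    """Order untested ligands from highest to lowest predicted yield."""
    predicted = {name: slope * d + intercept for name, d in descriptors.items()}
    return sorted(predicted, key=lambda name: predicted[name], reverse=True)


if __name__ == "__main__":
    # Hypothetical descriptor values and yields (%), invented for illustration.
    slope, intercept = fit_line([0.1, 0.3, 0.5, 0.7], [12.0, 25.0, 41.0, 55.0])
    untested = {"ligand_A": 0.2, "ligand_B": 0.9, "ligand_C": 0.6}
    print(rank_candidates(untested, slope, intercept))
```

In a workflow of this kind, only the top-ranked candidates would be carried into a smaller, focused experimental screen.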
Affiliation(s)
- Sergei Tcyrulnikov
  - Chemical Research and Development, Pfizer Worldwide Research and Development, Eastern Point Road, Groton, Connecticut 06340, United States
- Aran K Hubbell
  - Chemical Research and Development, Pfizer Worldwide Research and Development, Eastern Point Road, Groton, Connecticut 06340, United States
- Dylan Pedro
  - Chemical Research and Development, Pfizer Worldwide Research and Development, Eastern Point Road, Groton, Connecticut 06340, United States
- Giselle P Reyes
  - Chemical Research and Development, Pfizer Worldwide Research and Development, Eastern Point Road, Groton, Connecticut 06340, United States
- Sebastien Monfette
  - Chemical Research and Development, Pfizer Worldwide Research and Development, Eastern Point Road, Groton, Connecticut 06340, United States
- Daniel J Weix
  - University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Eric C Hansen
  - Chemical Research and Development, Pfizer Worldwide Research and Development, Eastern Point Road, Groton, Connecticut 06340, United States
3
Gallarati S, van Gerwen P, Laplaza R, Brey L, Makaveev A, Corminboeuf C. A genetic optimization strategy with generality in asymmetric organocatalysis as a primary target. Chem Sci 2024; 15:3640-3660. [PMID: 38455002 PMCID: PMC10915838 DOI: 10.1039/d3sc06208b] [Received: 11/20/2023] [Accepted: 01/30/2024] [Indexed: 03/09/2024] Open
Abstract
A catalyst possessing a broad substrate scope, in terms of both turnover and enantioselectivity, is sometimes called "general". Despite their great utility in asymmetric synthesis, truly general catalysts are difficult or expensive to discover via traditional high-throughput screening and are, therefore, rare. Existing computational tools accelerate the evaluation of reaction conditions from a pre-defined set of experiments to identify the most general ones, but cannot generate entirely new catalysts with enhanced substrate breadth. For these reasons, we report an inverse design strategy based on the open-source genetic algorithm NaviCatGA and on the OSCAR database of organocatalysts to simultaneously probe the catalyst and substrate scope and optimize generality as a primary target. We apply this strategy to the Pictet-Spengler condensation, for which we curate a database of 820 reactions, used to train statistical models of selectivity and activity. Starting from OSCAR, we define a combinatorial space of millions of catalyst possibilities, and perform evolutionary experiments on a diverse substrate scope that is representative of the whole chemical space of tetrahydro-β-carboline products. While privileged catalysts emerge, we show how genetic optimization can address the broader question of generality in asymmetric synthesis, extracting structure-performance relationships from the challenging areas of chemical space.
Affiliation(s)
- Simone Gallarati
  - Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland
- Puck van Gerwen
  - Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland
  - National Center for Competence in Research - Catalysis (NCCR-Catalysis), Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland
- Ruben Laplaza
  - Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland
  - National Center for Competence in Research - Catalysis (NCCR-Catalysis), Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland
- Lucien Brey
  - Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland
- Alexander Makaveev
  - Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland
- Clemence Corminboeuf
  - Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland
  - National Center for Competence in Research - Catalysis (NCCR-Catalysis), Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland
  - National Center for Computational Design and Discovery of Novel Materials (MARVEL), Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland
4
Raghavan P, Haas BC, Ruos ME, Schleinitz J, Doyle AG, Reisman SE, Sigman MS, Coley CW. Dataset Design for Building Models of Chemical Reactivity. ACS Cent Sci 2023; 9:2196-2204. [PMID: 38161380 PMCID: PMC10755851 DOI: 10.1021/acscentsci.3c01163] [Received: 09/20/2023] [Revised: 11/06/2023] [Accepted: 11/15/2023] [Indexed: 01/03/2024]
Abstract
Models can codify our understanding of chemical reactivity and serve a useful purpose in the development of new synthetic processes via, for example, evaluating hypothetical reaction conditions or in silico substrate tolerance. Perhaps the most determining factor is the composition of the training data and whether it is sufficient to train a model that can make accurate predictions over the full domain of interest. Here, we discuss the design of reaction datasets in ways that are conducive to data-driven modeling, emphasizing the idea that training set diversity and model generalizability rely on the choice of molecular or reaction representation. We additionally discuss the experimental constraints associated with generating common types of chemistry datasets and how these considerations should influence dataset design and model building.
Affiliation(s)
- Priyanka Raghavan
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Brittany C. Haas
  - Department of Chemistry, University of Utah, Salt Lake City, Utah 84112, United States
- Madeline E. Ruos
  - Department of Chemistry & Biochemistry, University of California, Los Angeles, Los Angeles, California 90095, United States
- Jules Schleinitz
  - Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States
- Abigail G. Doyle
  - Department of Chemistry & Biochemistry, University of California, Los Angeles, Los Angeles, California 90095, United States
- Sarah E. Reisman
  - Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States
- Matthew S. Sigman
  - Department of Chemistry, University of Utah, Salt Lake City, Utah 84112, United States
- Connor W. Coley
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
  - Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
5
Pancoast AR, McCormack SL, Galinat S, Walser-Kuntz R, Jett BM, Sanford MS, Sigman MS. Data science enabled discovery of a highly soluble 2,2'-bipyrimidine anolyte for application in a flow battery. Chem Sci 2023; 14:13734-13742. [PMID: 38075655 PMCID: PMC10699568 DOI: 10.1039/d3sc04084d] [Received: 08/05/2023] [Accepted: 11/01/2023] [Indexed: 02/12/2024] Open
Abstract
Development of non-aqueous redox flow batteries as a viable energy storage solution relies upon the identification of soluble charge carriers capable of storing large amounts of energy over extended time periods. A combination of metrics including number of electrons stored per molecule, redox potential, stability, and solubility of the charge carrier impact performance. In this context, we recently reported a 2,2'-bipyrimidine charge carrier that stores two electrons per molecule with reduction near -2.0 V vs. Fc/Fc+ and high stability. However, these first-generation derivatives showed a modest solubility of 0.17 M (0.34 M e-). Seeking to improve solubility without sacrificing stability, we harnessed the synthetic modularity of this scaffold to design a library of sixteen candidates. Using computed molecular descriptors and a single node decision tree, we found that minimization of the solvent accessible surface area (SASA) can be used to predict derivatives with enhanced solubility. This parameter was used in combination with a heatmap describing stability to de-risk a virtual screen that ultimately identified a 2,2'-bipyrimidine with significantly increased solubility and good stability metrics in the reduced states. This molecule was paired with a cyclopropenium catholyte in a prototype all-organic redox flow battery, achieving a cell potential up to 3 V.
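A single-node decision tree of the kind mentioned in this abstract is a decision stump: one threshold on one descriptor, with a constant prediction on each side. The sketch below fits such a stump by minimizing the squared error over all candidate thresholds. The function names `fit_stump` and `predict`, the SASA values, and the solubilities are hypothetical illustrations, not the authors' code or data.

```python
from typing import List, Tuple


def fit_stump(x: List[float], y: List[float]) -> Tuple[float, float, float]:
    """Fit a one-split regression tree.

    Returns (threshold, mean_low, mean_high), where mean_low is the
    prediction for x <= threshold and mean_high for x > threshold.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired observations")
    pairs = sorted(zip(x, y))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical descriptor values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        low = [v for _, v in pairs[:i]]
        high = [v for _, v in pairs[i:]]
        mean_low = sum(low) / len(low)
        mean_high = sum(high) / len(high)
        sse = sum((v - mean_low) ** 2 for v in low)
        sse += sum((v - mean_high) ** 2 for v in high)
        if best is None or sse < best[0]:
            best = (sse, threshold, mean_low, mean_high)
    if best is None:
        raise ValueError("all descriptor values are identical")
    return best[1], best[2], best[3]


def predict(value: float, threshold: float, mean_low: float, mean_high: float) -> float:
    """Predict the target for one descriptor value using the fitted stump."""
    return mean_low if value <= threshold else mean_high


if __name__ == "__main__":
    # Hypothetical SASA values and solubilities (M), invented for illustration.
    sasa = [400.0, 420.0, 450.0, 520.0, 560.0, 600.0]
    solubility = [0.9, 1.0, 0.8, 0.2, 0.3, 0.1]
    threshold, mean_low, mean_high = fit_stump(sasa, solubility)
    print(threshold, predict(430.0, threshold, mean_low, mean_high))
```

With toy data like this, candidates whose descriptor falls below the fitted threshold are predicted to be more soluble, which mirrors the abstract's point that a smaller SASA was associated with enhanced solubility.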
Affiliation(s)
- Adam R Pancoast
  - Department of Chemistry, University of Utah, 315 South 1400 East, Salt Lake City, Utah 84112, USA
  - Joint Center for Energy Storage Research, 9700 S. Cass Avenue, Argonne, Illinois 60439, USA
- Sara L McCormack
  - Department of Chemistry, University of Utah, 315 South 1400 East, Salt Lake City, Utah 84112, USA
  - Joint Center for Energy Storage Research, 9700 S. Cass Avenue, Argonne, Illinois 60439, USA
- Shelby Galinat
  - Department of Chemistry, University of Utah, 315 South 1400 East, Salt Lake City, Utah 84112, USA
- Ryan Walser-Kuntz
  - Department of Chemistry, University of Michigan, 930 North University Avenue, Ann Arbor, Michigan 48109, USA
  - Joint Center for Energy Storage Research, 9700 S. Cass Avenue, Argonne, Illinois 60439, USA
- Brianna M Jett
  - Department of Chemistry, University of Michigan, 930 North University Avenue, Ann Arbor, Michigan 48109, USA
  - Joint Center for Energy Storage Research, 9700 S. Cass Avenue, Argonne, Illinois 60439, USA
- Melanie S Sanford
  - Department of Chemistry, University of Michigan, 930 North University Avenue, Ann Arbor, Michigan 48109, USA
  - Joint Center for Energy Storage Research, 9700 S. Cass Avenue, Argonne, Illinois 60439, USA
- Matthew S Sigman
  - Department of Chemistry, University of Utah, 315 South 1400 East, Salt Lake City, Utah 84112, USA
  - Department of Chemistry, University of Michigan, 930 North University Avenue, Ann Arbor, Michigan 48109, USA
6
Shim E, Tewari A, Cernak T, Zimmerman PM. Machine Learning Strategies for Reaction Development: Toward the Low-Data Limit. J Chem Inf Model 2023; 63:3659-3668. [PMID: 37312524 DOI: 10.1021/acs.jcim.3c00577] [Indexed: 06/15/2023]
Abstract
Machine learning models are increasingly being utilized to predict outcomes of organic chemical reactions. A large amount of reaction data is used to train these models, which is in stark contrast to how expert chemists discover and develop new reactions by leveraging information from a small number of relevant transformations. Transfer learning and active learning are two strategies that can operate in low-data situations, which may help fill this gap and promote the use of machine learning for tackling real-world challenges in organic synthesis. This Perspective introduces active and transfer learning and connects these to potential opportunities and directions for further research, especially in the area of prospective development of chemical transformations.
Affiliation(s)
- Eunjae Shim
  - Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States
- Ambuj Tewari
  - Department of Statistics, University of Michigan, Ann Arbor, Michigan 48109, United States
  - Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, Michigan 48109, United States
- Tim Cernak
  - Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States
  - Department of Medicinal Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States
- Paul M Zimmerman
  - Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States