1
Salimi A, Lee JY. Hybrid intelligence for environmental pollution: biodegradability assessment of organic compounds through multimodal integration of graph attention networks and QSAR models. Environ Sci Process Impacts 2025; 27:981-991. [PMID: 40052292 DOI: 10.1039/d4em00594e] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/17/2025]
Abstract
Computational methods are crucial for assessing chemical biodegradability, given their significant impact on both environmental and human health. Organic compounds that are not biodegradable can persist in the environment, contributing to pollution. Our novel approach leverages graph attention networks (GATs) and incorporates node and edge attributes for biodegradability prediction. Quantitative Structure-Activity Relationship (QSAR) models using two-dimensional descriptors alongside weighted average and stacking approaches were employed to generate ensemble models. The GAT models demonstrated a stable function and generally higher specificity on the validation set compared to a graph convolutional network, although definitive superiority is challenging to establish owing to overlapping standard deviations. However, the sensitivities tended to decrease with potential performance overlap owing to the interval intersection. Ensemble learning enhanced several performance metrics compared with individual models and base models, with the combination of extreme Gradient Boosting and GAT achieving the highest precision and specificity. Combining GAT with random forest and Gradient Boosting may be preferable for accurately predicting biodegradable molecules, whereas the stacking approach may be suitable for prioritizing the correct classification of nonbiodegradable substances. Important descriptors, such as SpMax1_Bh(m) and SAscore, were identified in at least two QSAR models. Despite inherent complexities, the ease of implementation depends on factors such as data availability and domain knowledge. Assessing the biodegradability of organic compounds is essential for reducing their environmental impact, assessing risks, ensuring regulatory compliance, promoting sustainable development, and supporting effective pollution remediation. It assists in making informed decisions about chemical use, waste management, and environmental protection.
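The weighted-average ensembling mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the probabilities, labels, weights, and the 0.5 decision threshold below are hypothetical values chosen only to show how two models' class probabilities might be blended and then scored by sensitivity and specificity.

```python
def weighted_average(prob_sets, weights):
    """Blend per-model probabilities of the positive class using normalized weights."""
    total = sum(weights)
    n_samples = len(prob_sets[0])
    return [
        sum(w * probs[i] for w, probs in zip(weights, prob_sets)) / total
        for i in range(n_samples)
    ]


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 = biodegradable."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical outputs of two base models (not data from the paper).
gat_probs = [0.9, 0.4, 0.2, 0.7, 0.1, 0.6]
xgb_probs = [0.8, 0.6, 0.1, 0.3, 0.2, 0.4]
labels = [1, 1, 0, 1, 0, 0]

# Assumed weights and threshold, for illustration only.
blended = weighted_average([gat_probs, xgb_probs], weights=[0.6, 0.4])
predictions = [1 if p >= 0.5 else 0 for p in blended]
sens, spec = sensitivity_specificity(labels, predictions)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In practice the weights would typically be tuned on a validation set; a stacking approach would instead train a second-level model on the base models' outputs.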
Affiliation(s)
- Abbas Salimi
  - Department of Chemistry, Sungkyunkwan University, Suwon 16419, Korea.
- Jin Yong Lee
  - Department of Chemistry, Sungkyunkwan University, Suwon 16419, Korea.
2
Ree N, Wollschläger JM, Göller AH, Jensen JH. Atom-based machine learning for estimating nucleophilicity and electrophilicity with applications to retrosynthesis and chemical stability. Chem Sci 2025; 16:5676-5687. [PMID: 40041802 PMCID: PMC11875096 DOI: 10.1039/d4sc07297a] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2024] [Accepted: 02/23/2025] [Indexed: 03/28/2025] Open
Abstract
Nucleophilicity and electrophilicity are important properties for evaluating the reactivity and selectivity of chemical reactions. They allow the ranking of nucleophiles and electrophiles on reactivity scales, enabling a better understanding and prediction of reaction outcomes. Building upon our recent work (N. Ree, A. H. Göller and J. H. Jensen, Automated quantum chemistry for estimating nucleophilicity and electrophilicity with applications to retrosynthesis and covalent inhibitors, Digit. Discov., 2024, 3, 347-354), we introduce an atom-based machine learning (ML) approach for predicting methyl cation affinities (MCAs) and methyl anion affinities (MAAs) to estimate nucleophilicity and electrophilicity, respectively. The ML models are trained and validated on QM-derived data from around 50 000 neutral drug-like molecules, achieving Pearson correlation coefficients of 0.97 for MCA and 0.95 for MAA on the held-out test sets. In addition, we demonstrate the ML approach on two different applications: first, as a general tool for filtering retrosynthetic routes based on chemical selectivity predictions, and second, as a tool for assessing the chemical stability of esters and carbamates towards hydrolysis reactions. The code is freely available on GitHub under the MIT open source license and as a web application at https://www.esnuel.org.
Affiliation(s)
- Nicolai Ree
  - Department of Chemistry, University of Copenhagen, Universitetsparken 5, 2100 Copenhagen Ø, Denmark
- Jan M Wollschläger
  - Bayer AG, Pharmaceuticals, R&D, Machine Learning Research, 13353 Berlin, Germany
- Andreas H Göller
  - Bayer AG, Pharmaceuticals, R&D, Computational Molecular Design, 42096 Wuppertal, Germany
- Jan H Jensen
  - Department of Chemistry, University of Copenhagen, Universitetsparken 5, 2100 Copenhagen Ø, Denmark
3
Tavakoli M, Chiu YTT, Carlton AM, Van Vranken D, Baldi P. Chemically Informed Deep Learning for Interpretable Radical Reaction Prediction. J Chem Inf Model 2025; 65:1228-1242. [PMID: 39871741 PMCID: PMC11815866 DOI: 10.1021/acs.jcim.4c01901] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2024] [Revised: 01/14/2025] [Accepted: 01/15/2025] [Indexed: 01/29/2025]
Abstract
Organic radical reactions are crucial in many areas of chemistry, including synthetic, biological, and atmospheric chemistry. We develop a predictive framework based on the interaction of molecular orbitals that operates on mechanistic-level radical reactions. Given our chemistry-aware model, all predictions are provided with different levels of interpretability. Our models are trained and evaluated using the RMechDB database of radical reaction steps. Our model predicts the correct orbital interaction and products for 96% of the test reactions in RMechDB. By chaining these predictions, we perform a pathway search capable of identifying all intermediates and byproducts of a radical reaction. We test the pathway search on two classes of problems in atmospheric and polymerization chemistry. RMechRP is publicly available online at https://deeprxn.ics.uci.edu/rmechrp/.
Affiliation(s)
- Mohammadamin Tavakoli
  - Department of Computer Science, University of California, Irvine, Irvine, California 92697, United States
- Yin Ting T. Chiu
  - Department of Chemistry, University of California, Irvine, Irvine, California 92697, United States
- Ann Marie Carlton
  - Department of Chemistry, University of California, Irvine, Irvine, California 92697, United States
- David Van Vranken
  - Department of Chemistry, University of California, Irvine, Irvine, California 92697, United States
- Pierre Baldi
  - Department of Computer Science, University of California, Irvine, Irvine, California 92697, United States
4
Gross C, Eitzinger A, Hampel N, Mayer P, Ofial AR. Defining the Synthetic Scope of ortho-Quinone Methides by Quantifying their Electrophilicity. Chemistry 2025; 31:e202403785. [PMID: 39531351 DOI: 10.1002/chem.202403785] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2024] [Revised: 10/31/2024] [Accepted: 11/11/2024] [Indexed: 11/16/2024]
Abstract
A series of aryl-substituted ortho-quinone methides (oQMs) was synthesised and structurally characterised. Kinetic studies of the nucleophilic additions of carbanions (reference nucleophiles) to oQMs were used to determine second-order rate constants k2 for the carbon-carbon bond forming reactions (20 °C, DMSO) at the oQMs' exocyclic π-bond. Analysing the kinetic data by the linear free energy relationship lg k2 = sN(N + E) revealed the Mayr electrophilicities E of the oQMs. The electrophilicities E of oQMs correlate linearly with Hammett substituent constants and experimentally determined reduction potentials Ep(red) as well as with quantum-chemically calculated methyl anion affinities (MAAs), which provides valuable tools for predicting the reactivity of further types of oQMs. Embedding the oQMs in Mayr's reactivity scales enables the prediction of novel nucleophilic reaction partners for oQMs and can productively be used to prepare simple Michael adducts as well as 4+2 or 4+1 cyclisation products, as demonstrated in this work by several novel reactions with neutral or negatively charged C-, N-, and S-nucleophiles.
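As an illustration of how an electrophilicity parameter E might be extracted from such kinetic data, the sketch below minimizes the squared deviations between measured lg k2 values and sN(N + E) over a set of reference nucleophiles, which gives the closed form E = Σ sN(lg k2 − sN·N) / Σ sN². The (N, sN, k2) triples are hypothetical values constructed to be consistent with E = −15; they are not data from the paper, and the function name is an assumption made for this example.

```python
import math


def estimate_electrophilicity(measurements):
    """Least-squares estimate of E from (N, sN, k2) triples, assuming lg k2 = sN * (N + E)."""
    numerator = 0.0
    denominator = 0.0
    for N, sN, k2 in measurements:
        lg_k2 = math.log10(k2)
        # Setting the derivative of sum((lg_k2 - sN*(N + E))**2) with respect to E to zero
        # gives E = sum(sN * (lg_k2 - sN*N)) / sum(sN**2).
        numerator += sN * (lg_k2 - sN * N)
        denominator += sN * sN
    return numerator / denominator


# Hypothetical reference nucleophiles: (N, sN, k2 in M^-1 s^-1).
measurements = [
    (18.0, 0.70, 10 ** 2.10),
    (20.0, 0.65, 10 ** 3.25),
    (16.5, 0.80, 10 ** 1.20),
]

E_est = estimate_electrophilicity(measurements)
print(f"Estimated E = {E_est:.2f}")
```

With real measurements the residuals would not vanish, and their size gives a rough indication of how well a single E describes the electrophile across the reference nucleophiles.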
Affiliation(s)
- Christoph Gross
  - Department Chemie, Ludwig-Maximilians-Universität München, Butenandtstr. 5-13, 81377, München, Germany
- Andreas Eitzinger
  - Department Chemie, Ludwig-Maximilians-Universität München, Butenandtstr. 5-13, 81377, München, Germany
  - Current address: Institute of Organic Chemistry, Johannes Kepler University Linz, Austria
- Nathalie Hampel
  - Department Chemie, Ludwig-Maximilians-Universität München, Butenandtstr. 5-13, 81377, München, Germany
- Peter Mayer
  - Department Chemie, Ludwig-Maximilians-Universität München, Butenandtstr. 5-13, 81377, München, Germany
- Armin R Ofial
  - Department Chemie, Ludwig-Maximilians-Universität München, Butenandtstr. 5-13, 81377, München, Germany
5
Qian W, Wang X, Kang Y, Pan P, Hou T, Hsieh CY. A general model for predicting enzyme functions based on enzymatic reactions. J Cheminform 2024; 16:38. [PMID: 38556873 PMCID: PMC10983695 DOI: 10.1186/s13321-024-00827-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2023] [Accepted: 03/16/2024] [Indexed: 04/02/2024] Open
Abstract
Accurate prediction of the enzyme commission (EC) numbers for chemical reactions is essential for the understanding and manipulation of enzyme functions, biocatalytic processes and biosynthetic planning. A number of machine learning (ML)-based models have been developed to classify enzymatic reactions, showing great advantages over costly and long-winded experimental verifications. However, the prediction accuracy for most available models trained on the records of chemical reactions without specifying the enzymatic catalysts is rather limited. In this study, we introduced BEC-Pred, a BERT-based multiclassification model, for predicting EC numbers associated with reactions. Leveraging transfer learning, our approach achieves precise forecasting across a wide variety of Enzyme Commission (EC) numbers solely through analysis of the SMILES sequences of substrates and products. The BEC-Pred model outperformed other sequence and graph-based ML methods, attaining a higher accuracy of 91.6%, surpassing them by 5.5%, and exhibiting superior F1 scores with improvements of 6.6% and 6.0%, respectively. The enhanced performance highlights the potential of BEC-Pred to serve as a reliable foundational tool to accelerate the cutting-edge research in synthetic biology and drug metabolism. Moreover, we discussed a few examples of how BEC-Pred could accurately predict the enzymatic classification for the Novozym 435-induced hydrolysis and lipase efficient catalytic synthesis. We anticipate that BEC-Pred will have a positive impact on the progression of enzymatic research.
Affiliation(s)
- Wenjia Qian
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
- Xiaorui Wang
  - Dr. Neher's Biophysics Laboratory for Innovative Drug Discovery, State Key Laboratory of Quality Research in Chinese Medicine, Macau Institute for Applied Research in Medicine and Health, Macau University of Science and Technology, Macao, 999078, China
  - CarbonSilicon AI Technology Co., Ltd, Hangzhou, 310018, Zhejiang, China
- Yu Kang
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
- Peichen Pan
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
- Tingjun Hou
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China.
- Chang-Yu Hsieh
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China.
6
Tavakoli M, Miller RJ, Angel MC, Pfeiffer MA, Gutman ES, Mood AD, Van Vranken D, Baldi P. PMechDB: A Public Database of Elementary Polar Reaction Steps. J Chem Inf Model 2024; 64:1975-1983. [PMID: 38483315 PMCID: PMC10966657 DOI: 10.1021/acs.jcim.3c01810] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2023] [Revised: 02/15/2024] [Accepted: 02/16/2024] [Indexed: 03/26/2024]
Abstract
Most online chemical reaction databases are neither publicly accessible nor fully downloadable. These databases tend to contain reactions in noncanonicalized formats and often lack comprehensive information regarding reaction pathways, intermediates, and byproducts. Within the few publicly available databases, reactions are typically stored in the form of unbalanced, overall transformations with minimal interpretability of the underlying chemistry. These limitations present significant obstacles to data-driven applications including the development of machine learning models. As an effort to overcome these challenges, we introduce PMechDB, a publicly accessible platform designed to curate, aggregate, and share polar chemical reaction data in the form of elementary reaction steps. Our initial version of PMechDB consists of over 100,000 such steps. In PMechDB, all reactions are stored as canonicalized and balanced elementary steps, featuring accurate atom mapping and arrow-pushing mechanisms. As an online interactive database, PMechDB provides multiple interfaces that enable users to search, download, and upload chemical reactions. We anticipate that the public availability of PMechDB and its standardized data representation will prove beneficial for chemoinformatics research and education and the development of data-driven, interpretable models for predicting reactions and pathways. The PMechDB platform is accessible online at https://deeprxn.ics.uci.edu/pmechdb.
Affiliation(s)
- Mohammadamin Tavakoli
  - Department of Computer Science, University of California, Irvine, Irvine, California 92697, United States
- Ryan J. Miller
  - Department of Computer Science, University of California, Irvine, Irvine, California 92697, United States
- Mirana Claire Angel
  - Department of Computer Science, University of California, Irvine, Irvine, California 92697, United States
- Michael A. Pfeiffer
  - Department of Chemistry, University of California, Irvine, Irvine, California 92697, United States
- Eugene S. Gutman
  - Department of Chemistry, University of California, Irvine, Irvine, California 92697, United States
- Aaron D. Mood
  - Department of Chemistry, University of California, Irvine, Irvine, California 92697, United States
- David Van Vranken
  - Department of Chemistry, University of California, Irvine, Irvine, California 92697, United States
- Pierre Baldi
  - Department of Computer Science, University of California, Irvine, Irvine, California 92697, United States
7
Dou B, Zhu Z, Merkurjev E, Ke L, Chen L, Jiang J, Zhu Y, Liu J, Zhang B, Wei GW. Machine Learning Methods for Small Data Challenges in Molecular Science. Chem Rev 2023; 123:8736-8780. [PMID: 37384816 PMCID: PMC10999174 DOI: 10.1021/acs.chemrev.3c00189] [Citation(s) in RCA: 79] [Impact Index Per Article: 39.5] [Indexed: 07/01/2023]
Abstract
Small data are often used in scientific and engineering research due to the presence of various constraints, such as time, cost, ethics, privacy, security, and technical limitations in data acquisition. However, because big data have been the focus for the past decade, small data and their challenges have received little attention, even though they are technically more severe in machine learning (ML) and deep learning (DL) studies. Overall, the small data challenge is often compounded by issues such as data diversity, imputation, noise, imbalance, and high dimensionality. Fortunately, the current big data era is characterized by technological breakthroughs in ML, DL, and artificial intelligence (AI), which enable data-driven scientific discovery, and many advanced ML and DL technologies developed for big data have inadvertently provided solutions for small data problems. As a result, significant progress has been made in ML and DL for small data challenges in the past decade. In this review, we summarize and analyze several emerging potential solutions to small data challenges in molecular science, including chemical and biological sciences. We review both basic machine learning algorithms, such as linear regression, logistic regression (LR), k-nearest neighbor (KNN), support vector machine (SVM), kernel learning (KL), random forest (RF), and gradient boosting trees (GBT), and more advanced techniques, including artificial neural network (ANN), convolutional neural network (CNN), U-Net, graph neural network (GNN), Generative Adversarial Network (GAN), long short-term memory (LSTM), autoencoder, transformer, transfer learning, active learning, graph-based semi-supervised learning, combining deep learning with traditional machine learning, and physical model-based data augmentation. We also briefly discuss the latest advances in these methods. Finally, we conclude the survey with a discussion of promising trends in small data challenges in molecular science.
Affiliation(s)
- Bozheng Dou
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Zailiang Zhu
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Ekaterina Merkurjev
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Lu Ke
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Long Chen
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Jian Jiang
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Yueying Zhu
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Jie Liu
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Bengong Zhang
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Guo-Wei Wei
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
  - Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan 48824, United States
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States
8
He Y, Liu G, Hu S, Wang X, Jia J, Zhou H, Yan X. Implementing comprehensive machine learning models of multispecies toxicity assessment to improve regulation of organic compounds. J Hazard Mater 2023; 458:131942. [PMID: 37390684 DOI: 10.1016/j.jhazmat.2023.131942] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 04/15/2023] [Revised: 06/12/2023] [Accepted: 06/24/2023] [Indexed: 07/02/2023]
Abstract
Machine learning has made significant progress in assessing the risk associated with hazardous chemicals. However, most models were constructed by randomly selecting one algorithm and one toxicity endpoint towards single species, which may cause biased regulation of chemicals. In the present study, we implemented comprehensive prediction models involving multiple advanced machine learning and end-to-end deep learning to assess the aquatic toxicity of chemicals. The generated optimal models accurately unravel the quantitative structure-toxicity relationships, with the correlation coefficients of all training sets from 0.59 to 0.81 and of the test sets from 0.56 to 0.83. For each chemical, its ecological risk was determined from the toxicity information towards multiple species. The results also revealed the toxicity mechanism of chemicals was species sensitivity, and the high-level organisms were faced with more serious side effects from hazardous substances. The proposed approach was finally applied to screen over 16,000 compounds and identify high-risk chemicals. We believe that the current approach can provide a useful tool for predicting the toxicity of diverse organic chemicals and help regulatory authorities make more reasonable decisions.
Affiliation(s)
- Ying He
  - Institute of Environmental Research at Greater Bay Area, Key Laboratory for Water Quality and Conservation of the Pearl River Delta, Ministry of Education, Guangzhou University, Guangzhou 510006, China
- Guohong Liu
  - Institute of Environmental Research at Greater Bay Area, Key Laboratory for Water Quality and Conservation of the Pearl River Delta, Ministry of Education, Guangzhou University, Guangzhou 510006, China; School of Agriculture and Biological Sciences, Qiannan Normal University for Nationalities, Duyun 558000, China
- Song Hu
  - School of Environmental Science and Engineering, Shandong University, Qingdao 266237, China
- Xiaohong Wang
  - Institute of Environmental Research at Greater Bay Area, Key Laboratory for Water Quality and Conservation of the Pearl River Delta, Ministry of Education, Guangzhou University, Guangzhou 510006, China
- Jianbo Jia
  - Institute of Environmental Research at Greater Bay Area, Key Laboratory for Water Quality and Conservation of the Pearl River Delta, Ministry of Education, Guangzhou University, Guangzhou 510006, China
- Hongyu Zhou
  - Institute of Environmental Research at Greater Bay Area, Key Laboratory for Water Quality and Conservation of the Pearl River Delta, Ministry of Education, Guangzhou University, Guangzhou 510006, China.
- Xiliang Yan
  - Institute of Environmental Research at Greater Bay Area, Key Laboratory for Water Quality and Conservation of the Pearl River Delta, Ministry of Education, Guangzhou University, Guangzhou 510006, China; School of Agriculture and Biological Sciences, Qiannan Normal University for Nationalities, Duyun 558000, China.
9
Abstract
Cyclopropanes that carry an electron-accepting group react as electrophiles in polar, ring-opening reactions. Analogous reactions at cyclopropanes with additional C2 substituents allow one to access difunctionalized products. Consequently, functionalized cyclopropanes are frequently used building blocks in organic synthesis. The polarization of the C1-C2 bond in 1-acceptor-2-donor-substituted cyclopropanes not only favorably enhances reactivity toward nucleophiles but also directs the nucleophilic attack toward the already substituted C2 position. Monitoring the kinetics of non-catalytic ring-opening reactions with a series of thiophenolates and other strong nucleophiles, such as azide ions, in DMSO provided the inherent SN2 reactivity of electrophilic cyclopropanes. The experimentally determined second-order rate constants k2 for cyclopropane ring-opening reactions were then compared to those of related Michael additions. Interestingly, cyclopropanes with aryl substituents at the C2 position reacted faster than their unsubstituted analogues. Variation of the electronic properties of the aryl groups at C2 gave rise to parabolic Hammett relationships.
Affiliation(s)
- Andreas Eitzinger
  - Department Chemie, Ludwig-Maximilians-Universität München, Butenandtstr. 5–13, 81377 München, Germany
- Armin R. Ofial
  - Department Chemie, Ludwig-Maximilians-Universität München, Butenandtstr. 5–13, 81377 München, Germany
10
Li L, Mayer RJ, Ofial AR, Mayr H. One-Bond-Nucleophilicity and -Electrophilicity Parameters: An Efficient Ordering System for 1,3-Dipolar Cycloadditions. J Am Chem Soc 2023; 145:7416-7434. [PMID: 36952671 DOI: 10.1021/jacs.2c13872] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/25/2023]
Abstract
Diazoalkanes are ambiphilic 1,3-dipoles that undergo fast Huisgen cycloadditions with both electron-rich and electron-poor dipolarophiles but react slowly with alkenes of low polarity. Frontier molecular orbital (FMO) theory considering the 3-center-4-electron π-system of the propargyl fragment of diazoalkanes is commonly applied to rationalize these reactivity trends. However, we recently found that a change in the mechanism from cycloadditions to azo couplings takes place due to the existence of a previously overlooked lower-lying unoccupied molecular orbital. We now propose an alternative approach to analyze 1,3-dipolar cycloaddition reactions, which relies on the linear free energy relationship lg k2(20 °C) = sN(N + E) (eq 1) with two solvent-dependent parameters (N, sN) to characterize nucleophiles and one parameter (E) for electrophiles. Rate constants for the cycloadditions of diazoalkanes with dipolarophiles were measured and compared with those calculated for the formation of zwitterions by eq 1. The difference between experimental and predicted Gibbs energies of activation is interpreted as the energy of concert, i.e., the stabilization of the transition states by the concerted formation of two new bonds. By linking the plot of lg k2 vs N for nucleophilic dipolarophiles with that of lg k2 vs E for electrophilic dipolarophiles, one obtains V-shaped plots which provide absolute rate constants for the stepwise reactions on the borderlines. These plots furthermore predict relative reactivities of dipolarophiles in concerted, highly asynchronous cycloadditions more precisely than the classical correlations of rate constants with FMO energies or ionization potentials. DFT calculations using the SMD solvent model confirm these interpretations.
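The comparison described in this abstract can be sketched numerically: eq 1 gives a predicted rate constant for the stepwise (zwitterion-forming) pathway, the Eyring equation converts rate constants at 20 °C into Gibbs energies of activation, and the difference between the predicted and experimental values is read as the energy of concert. The reactivity parameters and the experimental rate constant below are hypothetical, and the sign convention (positive when the observed reaction is faster than the stepwise prediction) is an assumption made for this example.

```python
import math

R = 8.314462618       # gas constant, J mol^-1 K^-1
K_B = 1.380649e-23    # Boltzmann constant, J K^-1
H = 6.62607015e-34    # Planck constant, J s
T = 293.15            # 20 degrees C in kelvin


def lg_k2(N, sN, E):
    """Eq 1: lg k2(20 C) = sN * (N + E)."""
    return sN * (N + E)


def gibbs_activation(k2, temperature=T):
    """Eyring equation solved for the Gibbs energy of activation, in kJ/mol."""
    return R * temperature * math.log(K_B * temperature / (H * k2)) / 1000.0


# Hypothetical parameters for a nucleophile/electrophile pair (not from the paper).
lg_k_pred = lg_k2(N=5.0, sN=1.0, E=-10.0)
k_pred = 10 ** lg_k_pred   # predicted stepwise rate constant, M^-1 s^-1
k_exp = 1e-2               # hypothetical measured cycloaddition rate constant, M^-1 s^-1

# Assumed convention: energy of concert = predicted (stepwise) minus experimental barrier.
energy_of_concert = gibbs_activation(k_pred) - gibbs_activation(k_exp)
print(f"lg k2 (predicted) = {lg_k_pred:.1f}")
print(f"energy of concert = {energy_of_concert:.1f} kJ/mol")
```

Because the Eyring prefactor cancels in the difference, the energy of concert reduces to RT·ln(k_exp/k_pred), so each order of magnitude in rate corresponds to roughly 5.6 kJ/mol at 20 °C.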
Affiliation(s)
- Le Li
  - Department Chemie, Ludwig-Maximilians-Universität München, Butenandtstr. 5-13, 81377 München, Germany
- Robert J Mayer
  - CNRS, ISIS, Université de Strasbourg, 8 Allee Gaspard Monge, 67000 Strasbourg, France
- Armin R Ofial
  - Department Chemie, Ludwig-Maximilians-Universität München, Butenandtstr. 5-13, 81377 München, Germany
- Herbert Mayr
  - Department Chemie, Ludwig-Maximilians-Universität München, Butenandtstr. 5-13, 81377 München, Germany
11
Tavakoli M, Chiu YTT, Baldi P, Carlton AM, Van Vranken D. RMechDB: A Public Database of Elementary Radical Reaction Steps. J Chem Inf Model 2023; 63:1114-1123. [PMID: 36799778 PMCID: PMC9976277 DOI: 10.1021/acs.jcim.2c01359] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 02/18/2023]
Abstract
We introduce RMechDB, an open-access platform for aggregating, curating, and distributing reliable data about elementary radical reaction steps for computational radical reaction modeling and prediction. RMechDB contains over 5,300 elementary radical reaction steps, each with a single transition state at or around room temperature. These elementary step reactions are manually curated plausible arrow-pushing steps for organic radical reactions. The steps were taken from a variety of sources. Over 2,000 mechanistic steps were extracted from textbooks and/or constructed from research publications. Another 3,000 were taken from gas-phase atmospheric reactions of isoprene and other organic molecules on the MCM (Master Chemical Mechanism) Web site. Reactions are encoded in the SMIRKS format with accurate atom mapping and annotations for arrow-pushing mechanisms. At its core, RMechDB consists of a database schema with an online interactive search interface and a request portal for downloading the raw form of elementary step reactions with their metadata. It also offers an interface for submitting new reactions to RMechDB and expanding the data set through community contributions. Although there are several applications for RMechDB, it is primarily designed as a central platform of radical elementary steps with a unified and structured representation. We believe that this open access to this data and platform enables the extension of data-driven models for chemical reaction predictions and other chemoinformatics predictive tasks.
Affiliation(s)
- Mohammadamin Tavakoli
  - Department of Computer Science, University of California, Irvine, Irvine, California 92697, United States
- Yin Ting T. Chiu
  - Department of Chemistry, University of California, Irvine, Irvine, California 92697, United States
- Pierre Baldi
  - Department of Computer Science, University of California, Irvine, Irvine, California 92697, United States
- Ann Marie Carlton
  - Department of Chemistry, University of California, Irvine, Irvine, California 92697, United States
- David Van Vranken
  - Department of Chemistry, University of California, Irvine, Irvine, California 92697, United States
12
Abstract
Reactivity scales are useful research tools for chemists, both experimental and computational. However, to determine the reactivity of a single molecule, multiple measurements need to be carried out, which is a time-consuming and resource-intensive task. In this Tutorial Review, we present alternative approaches for the efficient generation of quantitative structure-reactivity relationships that are based on quantum chemistry, supervised learning, and uncertainty quantification. We observe a tendency for these relationships, first published in 2002, to become not only more predictive but also more interpretable over time.
Affiliation(s)
- Maike Vahl
  - Institute of Physical and Theoretical Chemistry, Technische Universität Braunschweig, Gaußstraße 17, 38106 Braunschweig, Germany.
- Jonny Proppe
  - Institute of Physical and Theoretical Chemistry, Technische Universität Braunschweig, Gaußstraße 17, 38106 Braunschweig, Germany.
13
McAulay K, Bilsland A, Bon M. Reactivity of Covalent Fragments and Their Role in Fragment Based Drug Discovery. Pharmaceuticals (Basel) 2022; 15:1366. [PMID: 36355538 PMCID: PMC9694498 DOI: 10.3390/ph15111366] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 09/27/2022] [Revised: 10/30/2022] [Accepted: 11/04/2022] [Indexed: 09/27/2023] Open
Abstract
Fragment based drug discovery has long been used for the identification of new ligands, and interest in targeted covalent inhibitors has continued to grow in recent years, with high profile drugs such as osimertinib and sotorasib gaining FDA approval. It is therefore unsurprising that covalent fragment-based approaches have become popular and have recently led to the identification of novel targets and binding sites, as well as ligands for targets previously thought to be 'undruggable'. Understanding the properties of such covalent fragments is important, and characterizing and/or predicting reactivity can be highly useful. This review aims to discuss the requirements for an electrophilic fragment library and the importance of differing warhead reactivity. Successful case studies from the world of drug discovery are then examined.
Affiliation(s)
- Kirsten McAulay
  - Cancer Research Horizons—Therapeutic Innovation, Cancer Research UK Beatson Institute, Garscube Estate, Switchback Road, Glasgow G61 1BD, UK
  - Centre for Targeted Protein Degradation, University of Dundee, Nethergate, Dundee DD1 4HN, UK
- Alan Bilsland
  - Cancer Research Horizons—Therapeutic Innovation, Cancer Research UK Beatson Institute, Garscube Estate, Switchback Road, Glasgow G61 1BD, UK
- Marta Bon
  - Cancer Research Horizons—Therapeutic Innovation, Cancer Research UK Beatson Institute, Garscube Estate, Switchback Road, Glasgow G61 1BD, UK
  - Exscientia, The Schrödinger Building, Oxford Science Park, Oxford OX4 4GE, UK
14
Liu T, Chu X, Fan D, Ma Z, Dai Y, Zhu Z, Wang Y, Gao J. Intelligent prediction model of ammonia solubility in designable green solvents based on microstructure group contribution. Mol Phys 2022. [DOI: 10.1080/00268976.2022.2124203] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/14/2022]
Affiliation(s)
- Tianxiong Liu
  - College of Chemical Engineering, Qingdao University of Science and Technology, Qingdao, People’s Republic of China
- Xiaojun Chu
  - College of Chemical Engineering, Qingdao University of Science and Technology, Qingdao, People’s Republic of China
- Dingchao Fan
  - College of Chemical Engineering, Qingdao University of Science and Technology, Qingdao, People’s Republic of China
- Zhaoyuan Ma
  - College of Chemical Engineering, Qingdao University of Science and Technology, Qingdao, People’s Republic of China
- Yasen Dai
  - College of Chemical Engineering, Qingdao University of Science and Technology, Qingdao, People’s Republic of China
- Zhaoyou Zhu
  - College of Chemical Engineering, Qingdao University of Science and Technology, Qingdao, People’s Republic of China
- Yinglong Wang
  - College of Chemical Engineering, Qingdao University of Science and Technology, Qingdao, People’s Republic of China
- Jun Gao
  - College of Chemical and Environmental Engineering, Shandong University of Science and Technology, Qingdao, People’s Republic of China
15
Rarey M, Nicklaus MC, Warr W. Special Issue on Reaction Informatics and Chemical Space. J Chem Inf Model 2022; 62:2009-2010. [DOI: 10.1021/acs.jcim.2c00390] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/13/2023]
Affiliation(s)
- Matthias Rarey
  - Universität Hamburg, ZBH − Center for Bioinformatics, 20146 Hamburg, Germany
- Marc C. Nicklaus
  - NCI, NIH, CADD Group, NCI-Frederick, Frederick, Maryland 21702, United States
- Wendy Warr
  - Wendy Warr & Associates, Cheshire CW4 7HZ, U.K.