1
|
|
2
|
Pereira F. Machine Learning Methods to Predict the Terrestrial and Marine Origin of Natural Products. Mol Inform 2021; 40:e2060034. [PMID: 33787065 DOI: 10.1002/minf.202060034] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/03/2021] [Accepted: 02/04/2021] [Indexed: 12/23/2022]
Abstract
In recent years there has been a growing interest in studying the differences between the chemical and biological space represented by natural products (NPs) of terrestrial and marine origin. In order to learn more about these two chemical spaces, marine natural products (MNPs) and terrestrial natural products (TNPs), a machine learning (ML) approach was developed in the current work to predict three classes: MNPs, TNPs and a third class of NPs that appear in both the terrestrial and marine environments. In total 22,398 NPs were retrieved from the Reaxys® database, of which 10,790 molecules are recorded as MNPs, 10,857 as TNPs, and 761 NPs appear registered as both MNPs and TNPs. Several ML algorithms such as Random Forest, Support Vector Machines, and deep learning Multilayer Perceptron networks have been benchmarked. The best performance was achieved with a consensus classification model, which predicted the external test set with an overall predictive accuracy of up to 81%. As far as we know, this approach has not been attempted before; it can be used to better understand the chemical space defined by MNPs, TNPs or both, and also in virtual screening to define the applicability domain of QSAR models of MNPs and TNPs.
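The abstract does not state how the consensus model combines its member classifiers. As a minimal sketch only, assuming a plain majority vote over the predicted labels of three models, the idea could look like the following. The class labels `MNP`, `TNP` and `BOTH`, the helper `consensus_vote`, and the toy predictions are all illustrative assumptions and are not taken from the paper.

```python
from collections import Counter

def consensus_vote(predictions):
    """Combine per-model label lists by majority vote.

    predictions: list of label lists, one list per model, all of equal length.
    Returns one label per sample, or None when the top vote is tied.
    """
    consensus = []
    for labels in zip(*predictions):
        counts = Counter(labels)
        best = max(counts.values())
        winners = [label for label, n in counts.items() if n == best]
        # A unique winner is the consensus; a tie is left undecided.
        consensus.append(winners[0] if len(winners) == 1 else None)
    return consensus

# Hypothetical predictions from three models on four molecules.
rf_pred = ["MNP", "TNP", "BOTH", "MNP"]
svm_pred = ["MNP", "MNP", "BOTH", "TNP"]
mlp_pred = ["TNP", "TNP", "BOTH", "BOTH"]

consensus = consensus_vote([rf_pred, svm_pred, mlp_pred])
print(consensus)
```

Returning `None` on a tie keeps undecided molecules visible instead of silently assigning them to a class; other tie-breaking rules, such as weighting by predicted probability, are equally plausible.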
Affiliation(s)
- Florbela Pereira
- LAQV and REQUIMTE, Departamento de Química, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, 2829-516, Caparica, Portugal
|
3
|
Venkatraman V, Evjen S, Knuutila HK, Fiksdahl A, Alsberg BK. Predicting ionic liquid melting points using machine learning. J Mol Liq 2020. [DOI: 10.1016/j.molliq.2020.114686] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/19/2022]
|
4
|
A review on created QSPR models for predicting ionic liquids properties and their reliability from chemometric point of view. J Mol Liq 2020. [DOI: 10.1016/j.molliq.2019.112013] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Indexed: 11/22/2022]
|
5
|
Pereira F. Machine learning methods to predict the crystallization propensity of small organic molecules. CrystEngComm 2020. [DOI: 10.1039/d0ce00070a] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 01/08/2023]
Abstract
Machine learning algorithms were explored for the prediction of the crystallization propensity based on molecular descriptors and fingerprints generated from 2D chemical structures and 3D chemical structures optimized with empirical methods.
Affiliation(s)
- Florbela Pereira
- LAQV and REQUIMTE, Departamento de Química, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, Caparica
|
6
|
Cai G, Yang S, Zhou Q, Liu L, Lu X, Xu J, Zhang S. Physicochemical Properties of Various 2-Hydroxyethylammonium Sulfonate-Based Protic Ionic Liquids and Their Potential Application in Hydrodeoxygenation. Front Chem 2019; 7:196. [PMID: 31024888 PMCID: PMC6460099 DOI: 10.3389/fchem.2019.00196] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 01/15/2019] [Accepted: 03/14/2019] [Indexed: 11/13/2022] Open
Abstract
In order to obtain the regularities of physicochemical properties of hydroxy protic ionic liquids (PILs) and broaden their potential application, a series of 2-hydroxyethylammonium sulfonate-based PILs were synthesized through proton transfer reaction and characterized by NMR, FT-IR and elemental analysis. Their phase transfer behavior (Tm) and initial decomposition point (Td) were characterized by differential scanning calorimetry (DSC) and thermogravimetric analysis (TGA), respectively. Meanwhile, the regularities of density (ρ), viscosity (η) and electrical conductivity (σ) of synthesized PILs at different temperatures were measured. The results indicated that their physicochemical properties were closely related to their structures and the interactions between cations and anions. In addition, the dissociation constants (pKa) of synthesized PILs were obtained by acid-base titration, which revealed that all synthesized PILs had pKa exceeding 7 and that their cations were the key factor determining the pKa value. Moreover, several synthesized PILs with a low melting temperature also showed potential application in the deoxidation reaction of cyclohexanol, as they had conversion rates approximating 100% and the selectivity of cyclohexane or cyclohexene was about 80%.
Affiliation(s)
- Guangming Cai
- Beijing Key Laboratory of Ionic Liquids Clean Process, CAS Key Laboratory of Green Process and Engineering, State Key Laboratory of Multiphase Complex Systems, Institute of Process Engineering, Chinese Academy of Sciences, Beijing, China; School of Chemical Engineering, University of Chinese Academy of Sciences, Beijing, China
- Shaoqi Yang
- Beijing Key Laboratory of Ionic Liquids Clean Process, CAS Key Laboratory of Green Process and Engineering, State Key Laboratory of Multiphase Complex Systems, Institute of Process Engineering, Chinese Academy of Sciences, Beijing, China; School of Chemical Engineering, University of Chinese Academy of Sciences, Beijing, China
- Qing Zhou
- Beijing Key Laboratory of Ionic Liquids Clean Process, CAS Key Laboratory of Green Process and Engineering, State Key Laboratory of Multiphase Complex Systems, Institute of Process Engineering, Chinese Academy of Sciences, Beijing, China; School of Chemical Engineering, University of Chinese Academy of Sciences, Beijing, China
- Lifei Liu
- Beijing Key Laboratory of Ionic Liquids Clean Process, CAS Key Laboratory of Green Process and Engineering, State Key Laboratory of Multiphase Complex Systems, Institute of Process Engineering, Chinese Academy of Sciences, Beijing, China
- Xingmei Lu
- Beijing Key Laboratory of Ionic Liquids Clean Process, CAS Key Laboratory of Green Process and Engineering, State Key Laboratory of Multiphase Complex Systems, Institute of Process Engineering, Chinese Academy of Sciences, Beijing, China; School of Chemical Engineering, University of Chinese Academy of Sciences, Beijing, China
- Junli Xu
- Beijing Key Laboratory of Ionic Liquids Clean Process, CAS Key Laboratory of Green Process and Engineering, State Key Laboratory of Multiphase Complex Systems, Institute of Process Engineering, Chinese Academy of Sciences, Beijing, China
- Suojiang Zhang
- Beijing Key Laboratory of Ionic Liquids Clean Process, CAS Key Laboratory of Green Process and Engineering, State Key Laboratory of Multiphase Complex Systems, Institute of Process Engineering, Chinese Academy of Sciences, Beijing, China; School of Chemical Engineering, University of Chinese Academy of Sciences, Beijing, China
|
7
|
Glavatskikh M, Madzhidov T, Horvath D, Nugmanov R, Gimadiev T, Malakhova D, Marcou G, Varnek A. Predictive Models for Kinetic Parameters of Cycloaddition Reactions. Mol Inform 2018; 38:e1800077. [DOI: 10.1002/minf.201800077] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 06/11/2018] [Accepted: 07/22/2018] [Indexed: 01/10/2023]
Affiliation(s)
- Marta Glavatskikh
- Laboratoire de Chémoinformatique; UMR 7140 CNRS; Université de Strasbourg; 1, rue Blaise Pascal 67000 Strasbourg France
- Laboratory of Chemoinformatics and Molecular Modeling; Butlerov Institute of Chemistry; Kazan Federal University; Kremlyovskaya str. 18 Kazan Russia
- Timur Madzhidov
- Laboratory of Chemoinformatics and Molecular Modeling; Butlerov Institute of Chemistry; Kazan Federal University; Kremlyovskaya str. 18 Kazan Russia
- Dragos Horvath
- Laboratoire de Chémoinformatique; UMR 7140 CNRS; Université de Strasbourg; 1, rue Blaise Pascal 67000 Strasbourg France
- Ramil Nugmanov
- Laboratory of Chemoinformatics and Molecular Modeling; Butlerov Institute of Chemistry; Kazan Federal University; Kremlyovskaya str. 18 Kazan Russia
- Timur Gimadiev
- Laboratoire de Chémoinformatique; UMR 7140 CNRS; Université de Strasbourg; 1, rue Blaise Pascal 67000 Strasbourg France
- Laboratory of Chemoinformatics and Molecular Modeling; Butlerov Institute of Chemistry; Kazan Federal University; Kremlyovskaya str. 18 Kazan Russia
- Daria Malakhova
- Laboratory of Chemoinformatics and Molecular Modeling; Butlerov Institute of Chemistry; Kazan Federal University; Kremlyovskaya str. 18 Kazan Russia
- Gilles Marcou
- Laboratoire de Chémoinformatique; UMR 7140 CNRS; Université de Strasbourg; 1, rue Blaise Pascal 67000 Strasbourg France
- Alexandre Varnek
- Laboratoire de Chémoinformatique; UMR 7140 CNRS; Université de Strasbourg; 1, rue Blaise Pascal 67000 Strasbourg France
|
8
|
Venkatraman V, Evjen S, Knuutila HK, Fiksdahl A, Alsberg BK. Predicting ionic liquid melting points using machine learning. J Mol Liq 2018. [DOI: 10.1016/j.molliq.2018.03.090] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.0] [Indexed: 12/28/2022]
|
9
|
Glavatskikh M, Madzhidov T, Baskin II, Horvath D, Nugmanov R, Gimadiev T, Marcou G, Varnek A. Visualization and Analysis of Complex Reaction Data: The Case of Tautomeric Equilibria. Mol Inform 2018; 37:e1800056. [DOI: 10.1002/minf.201800056] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 04/24/2018] [Accepted: 06/29/2018] [Indexed: 11/07/2022]
Affiliation(s)
- Marta Glavatskikh
- Laboratoire de Chémoinformatique, UMR 7140 CNRS; Université de Strasbourg; 1, rue Blaise Pascal 67000 Strasbourg France
- Laboratory of Chemoinformatics and Molecular Modeling, Butlerov Institute of Chemistry; Kazan Federal University; Kremlevskaya str. 18 Kazan Russia
- Timur Madzhidov
- Laboratory of Chemoinformatics and Molecular Modeling, Butlerov Institute of Chemistry; Kazan Federal University; Kremlevskaya str. 18 Kazan Russia
- Igor I. Baskin
- Laboratory of Chemoinformatics and Molecular Modeling, Butlerov Institute of Chemistry; Kazan Federal University; Kremlevskaya str. 18 Kazan Russia
- Faculty of Physics; Lomonosov Moscow State University; Leninskie Gory 1/2 119991 Moscow Russia
- Dragos Horvath
- Laboratoire de Chémoinformatique, UMR 7140 CNRS; Université de Strasbourg; 1, rue Blaise Pascal 67000 Strasbourg France
- Ramil Nugmanov
- Laboratory of Chemoinformatics and Molecular Modeling, Butlerov Institute of Chemistry; Kazan Federal University; Kremlevskaya str. 18 Kazan Russia
- Timur Gimadiev
- Laboratoire de Chémoinformatique, UMR 7140 CNRS; Université de Strasbourg; 1, rue Blaise Pascal 67000 Strasbourg France
- Laboratory of Chemoinformatics and Molecular Modeling, Butlerov Institute of Chemistry; Kazan Federal University; Kremlevskaya str. 18 Kazan Russia
- Gilles Marcou
- Laboratoire de Chémoinformatique, UMR 7140 CNRS; Université de Strasbourg; 1, rue Blaise Pascal 67000 Strasbourg France
- Alexandre Varnek
- Laboratoire de Chémoinformatique, UMR 7140 CNRS; Université de Strasbourg; 1, rue Blaise Pascal 67000 Strasbourg France
|
10
|
Kireeva N, Pervov VS. Materials space of solid-state electrolytes: unraveling chemical composition–structure–ionic conductivity relationships in garnet-type metal oxides using cheminformatics virtual screening approaches. Phys Chem Chem Phys 2017; 19:20904-20918. [DOI: 10.1039/c7cp00518k] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Indexed: 12/12/2022]
Abstract
Several candidate garnet-related compounds have been recommended for synthesis as potential materials for solid-state electrolytes.
Affiliation(s)
- Natalia Kireeva
- Frumkin Institute of Physical Chemistry and Electrochemistry, Russian Academy of Sciences, Moscow, Russia
- Moscow Institute of Physics and Technology (State University), Dolgoprudny
- Vladislav S. Pervov
- Kurnakov Institute of General and Inorganic Chemistry, Russian Academy of Sciences, Moscow, Russia
|
11
|
Venkatraman V, Alsberg BK. Quantitative structure-property relationship modelling of thermal decomposition temperatures of ionic liquids. J Mol Liq 2016. [DOI: 10.1016/j.molliq.2016.08.023] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Indexed: 10/21/2022]
|
12
|
|
13
|
Gaspar HA, Baskin II, Marcou G, Horvath D, Varnek A. Stargate GTM: Bridging Descriptor and Activity Spaces. J Chem Inf Model 2015; 55:2403-10. [DOI: 10.1021/acs.jcim.5b00398] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Indexed: 11/29/2022]
Affiliation(s)
- Héléna A. Gaspar
- Laboratoire de Chemoinformatique, UMR 7140, Université de Strasbourg, 1 rue Blaise Pascal, Strasbourg 67000, France
- Igor I. Baskin
- Faculty of Physics, M.V. Lomonosov Moscow State University, Leninskie Gory, Moscow 119991, Russia
- Laboratory of Chemoinformatics, Butlerov Institute of Chemistry, Kazan Federal University, Kazan 420008, Russia
- Gilles Marcou
- Laboratoire de Chemoinformatique, UMR 7140, Université de Strasbourg, 1 rue Blaise Pascal, Strasbourg 67000, France
- Dragos Horvath
- Laboratoire de Chemoinformatique, UMR 7140, Université de Strasbourg, 1 rue Blaise Pascal, Strasbourg 67000, France
- Alexandre Varnek
- Laboratoire de Chemoinformatique, UMR 7140, Université de Strasbourg, 1 rue Blaise Pascal, Strasbourg 67000, France
- Laboratory of Chemoinformatics, Butlerov Institute of Chemistry, Kazan Federal University, Kazan 420008, Russia
|
14
|
Osolodkin DI, Radchenko EV, Orlov AA, Voronkov AE, Palyulin VA, Zefirov NS. Progress in visual representations of chemical space. Expert Opin Drug Discov 2015; 10:959-73. [DOI: 10.1517/17460441.2015.1060216] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.8] [Indexed: 01/06/2023]
|
15
|
Gaspar HA, Baskin II, Marcou G, Horvath D, Varnek A. GTM-Based QSAR Models and Their Applicability Domains. Mol Inform 2015; 34:348-56. [DOI: 10.1002/minf.201400153] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.5] [Received: 10/29/2014] [Accepted: 11/28/2014] [Indexed: 11/06/2022]
|
16
|
Gaspar HA, Baskin II, Marcou G, Horvath D, Varnek A. Chemical data visualization and analysis with incremental generative topographic mapping: big data challenge. J Chem Inf Model 2014; 55:84-94. [PMID: 25423612 DOI: 10.1021/ci500575y] [Citation(s) in RCA: 61] [Impact Index Per Article: 5.5] [Indexed: 01/31/2023]
Abstract
This paper is devoted to the analysis and visualization in 2-dimensional space of large data sets of millions of compounds using the incremental version of generative topographic mapping (iGTM). The iGTM algorithm implemented in the in-house ISIDA-GTM program was applied to a database of more than 2 million compounds combining data sets of 36 chemicals suppliers and the NCI collection, encoded either by MOE descriptors or by MACCS keys. Taking advantage of the probabilistic nature of GTM, several approaches to data analysis were proposed. The chemical space coverage was evaluated using the normalized Shannon entropy. Different views of the data (property landscapes) were obtained by mapping various physical and chemical properties (molecular weight, aqueous solubility, LogP, etc.) onto the iGTM map. The superposition of these views helped to identify the regions in the chemical space populated by compounds with desirable physicochemical profiles and the suppliers providing them. The data sets similarity in the latent space was assessed by applying several metrics (Euclidean distance, Tanimoto and Bhattacharyya coefficients) to data probability distributions based on cumulated responsibility vectors. As a complementary approach, data sets were compared by considering them as individual objects on a meta-GTM map, built on cumulated responsibility vectors or property landscapes produced with iGTM. We believe that the iGTM methodology described in this article represents a fast and reliable way to analyze and visualize large chemical databases.
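The abstract names the normalized Shannon entropy and the Tanimoto and Bhattacharyya coefficients but gives no formulas. The sketch below uses the standard textbook definitions, applied to a probability distribution over map nodes obtained by normalizing a cumulated responsibility vector. It is an assumption that the paper uses exactly these forms; the function names and the toy four-node vectors are hypothetical.

```python
import math

def normalize(vector):
    """Scale a non-negative vector so that its entries sum to 1."""
    total = sum(vector)
    if total <= 0:
        raise ValueError("vector must have a positive sum")
    return [x / total for x in vector]

def normalized_shannon_entropy(p):
    """Shannon entropy divided by log(K): 0 for a single node, 1 for uniform coverage."""
    k = len(p)
    if k < 2:
        return 0.0
    entropy = sum(-x * math.log(x) for x in p if x > 0)
    return entropy / math.log(k)

def bhattacharyya_coefficient(p, q):
    """Overlap of two distributions: 1 if identical, 0 if disjoint."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))

def tanimoto_coefficient(p, q):
    """Continuous Tanimoto similarity: p.q / (|p|^2 + |q|^2 - p.q)."""
    dot = sum(a * b for a, b in zip(p, q))
    return dot / (sum(a * a for a in p) + sum(b * b for b in q) - dot)

# Toy cumulated responsibilities of two data sets on a four-node map.
library_a = normalize([2.0, 1.0, 1.0, 0.0])
library_b = normalize([0.0, 0.0, 0.0, 4.0])

print(normalized_shannon_entropy(library_a))
print(bhattacharyya_coefficient(library_a, library_b))
print(tanimoto_coefficient(library_a, library_a))
```

Under these definitions a data set concentrated on one node has zero entropy, and two data sets occupying disjoint nodes have zero overlap by either coefficient.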
Affiliation(s)
- Héléna A Gaspar
- Laboratory of Chemoinformatics, University of Strasbourg, 67081 Strasbourg, France
|
17
|
Ovchinnikova SI, Bykov AA, Tsivadze AY, Dyachkov EP, Kireeva NV. Supervised extensions of chemography approaches: case studies of chemical liabilities assessment. J Cheminform 2014; 6:20. [PMID: 24868246 PMCID: PMC4018504 DOI: 10.1186/1758-2946-6-20] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 11/25/2013] [Accepted: 04/28/2014] [Indexed: 12/04/2022] Open
Abstract
Chemical liabilities, such as adverse effects and toxicity, play a significant role in the modern drug discovery process. In silico assessment of chemical liabilities is an important step aimed at reducing costs and animal testing by complementing or replacing in vitro and in vivo experiments. Herein, we propose an approach combining several classification and chemography methods to predict chemical liabilities and to interpret the obtained results in the context of the impact of structural changes of compounds on their pharmacological profile. To our knowledge, the supervised extension of Generative Topographic Mapping is proposed here for the first time as an effective new chemography method. A new approach for mapping new data using supervised Isomap without re-building models from scratch has been proposed. Two approaches for estimating a model's applicability domain are used in our study, to our knowledge for the first time in chemoinformatics. The structural alerts responsible for the negative characteristics of the pharmacological profile of chemical compounds have been found as a result of model interpretation.
Affiliation(s)
- Svetlana I Ovchinnikova
- Frumkin Institute of Physical Chemistry and Electrochemistry RAS, Leninsky pr-t 31-4, 119071 Moscow, Russia
- Moscow Institute of Physics and Technology, Institutsky per., 9, 141700 Dolgoprudny, Russia
- Arseniy A Bykov
- Frumkin Institute of Physical Chemistry and Electrochemistry RAS, Leninsky pr-t 31-4, 119071 Moscow, Russia
- Moscow Institute of Physics and Technology, Institutsky per., 9, 141700 Dolgoprudny, Russia
- Aslan Yu Tsivadze
- Frumkin Institute of Physical Chemistry and Electrochemistry RAS, Leninsky pr-t 31-4, 119071 Moscow, Russia
- Evgeny P Dyachkov
- Kurnakov Institute of General and Inorganic Chemistry RAS, Leninsky pr-t 31, 119071 Moscow, Russia
- Natalia V Kireeva
- Frumkin Institute of Physical Chemistry and Electrochemistry RAS, Leninsky pr-t 31-4, 119071 Moscow, Russia
- Moscow Institute of Physics and Technology, Institutsky per., 9, 141700 Dolgoprudny, Russia
|
18
|
Nonlinear Dimensionality Reduction for Visualizing Toxicity Data: Distance-Based Versus Topology-Based Approaches. ChemMedChem 2014; 9:1047-59. [DOI: 10.1002/cmdc.201400027] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 01/10/2014] [Indexed: 01/11/2023]
|
19
|
Impact of distance-based metric learning on classification and visualization model performance and structure–activity landscapes. J Comput Aided Mol Des 2014; 28:61-73. [DOI: 10.1007/s10822-014-9719-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 03/15/2013] [Accepted: 01/24/2014] [Indexed: 10/25/2022]
|