1. Schietgat L, Cuissart B, De Grave K, Efthymiadis K, Bureau R, Crémilleux B, Ramon J, Lepailleur A. Automated detection of toxicophores and prediction of mutagenicity using PMCSFG algorithm. Mol Inform 2023; 42:e2200232. [PMID: 36529710 DOI: 10.1002/minf.202200232] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2022] [Revised: 12/13/2022] [Accepted: 12/18/2022] [Indexed: 12/23/2022]
Abstract
Maximum common substructures (MCS) have received a lot of attention in the chemoinformatics community. They are typically used as a similarity measure between molecules, showing high predictive performance when used in classification tasks, while being easily explainable substructures. In the present work, we applied the Pairwise Maximum Common Subgraph Feature Generation (PMCSFG) algorithm to automatically detect toxicophores (structural alerts) and to compute fingerprints based on MCS. We present a comparison between our MCS-based fingerprints and 12 well-known chemical fingerprints when used as features in machine learning models. We provide an experimental evaluation and discuss the usefulness of the different methods on mutagenicity data. The features generated by the MCS method have a state-of-the-art performance when predicting mutagenicity, while they are more interpretable than the traditional chemical fingerprints.
Affiliation(s)
- Leander Schietgat
- Artificial Intelligence Lab, Vrije Universiteit Brussel, Brussel, Belgium; Department of Computer Science, KU Leuven, Leuven, Belgium
- Bertrand Cuissart
- Groupe de Recherche en Informatique, Image, Automatique et Instrumentation de Caen, UNICAEN, ENSICAEN, CNRS - UMR GREYC, Normandie Univ., Caen, France
- Ronan Bureau
- Centre d'Etudes et de Recherche sur le Médicament de Normandie, UNICAEN, CERMN, Normandie Univ., Caen, France
- Bruno Crémilleux
- Groupe de Recherche en Informatique, Image, Automatique et Instrumentation de Caen, UNICAEN, ENSICAEN, CNRS - UMR GREYC, Normandie Univ., Caen, France
- Jan Ramon
- INRIA Lille Nord Europe, Lille, France
- Alban Lepailleur
- Centre d'Etudes et de Recherche sur le Médicament de Normandie, UNICAEN, CERMN, Normandie Univ., Caen, France
2. Coley CW, Eyke NS, Jensen KF. Autonomous Discovery in the Chemical Sciences Part I: Progress. Angew Chem Int Ed Engl 2020; 59:22858-22893. [DOI: 10.1002/anie.201909987] [Citation(s) in RCA: 100] [Impact Index Per Article: 25.0] [Received: 08/06/2019] [Indexed: 01/05/2023]
Affiliation(s)
- Connor W. Coley
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Natalie S. Eyke
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Klavs F. Jensen
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
3. Coley CW, Eyke NS, Jensen KF. Autonome Entdeckung in den chemischen Wissenschaften, Teil I: Fortschritt. Angew Chem Int Ed Engl 2020. [DOI: 10.1002/ange.201909987] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 12/14/2022]
Affiliation(s)
- Connor W. Coley
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Natalie S. Eyke
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Klavs F. Jensen
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
4. Hemmerich J, Ecker GF. In silico toxicology: From structure–activity relationships towards deep learning and adverse outcome pathways. Wiley Interdisciplinary Reviews: Computational Molecular Science 2020; 10:e1475. [PMID: 35866138 PMCID: PMC9286356 DOI: 10.1002/wcms.1475] [Citation(s) in RCA: 51] [Impact Index Per Article: 12.8] [Received: 01/07/2020] [Revised: 03/09/2020] [Accepted: 03/10/2020] [Indexed: 12/18/2022]
Abstract
In silico toxicology is an emerging field. It is gaining importance as research aims to decrease the use of animal experiments, as suggested in the 3R principles by Russell and Burch. In silico toxicology is a means to identify hazards of compounds before synthesis, and thus in very early stages of drug development. For chemical industries as well as regulatory agencies, it can aid in gap-filling and guide risk minimization strategies. Techniques such as structural alerts, read-across, quantitative structure–activity relationships, machine learning, and deep learning allow in silico toxicology to be used in many cases, some even when data are scarce. In particular, the concept of adverse outcome pathways puts all techniques into a broader context and can elucidate predictions through mechanistic insights. This article is categorized under: Structure and Mechanism > Computational Biochemistry and Biophysics; Data Science > Chemoinformatics
Affiliation(s)
- Jennifer Hemmerich
- Department of Pharmaceutical Chemistry, University of Vienna, Vienna, Austria
- Gerhard F. Ecker
- Department of Pharmaceutical Chemistry, University of Vienna, Vienna, Austria
5. Finding the Key Structure of Mechanical Parts with Formal Concept Analysis. Information 2020. [DOI: 10.3390/info11020116] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022] Open
Abstract
To address the difficulty of classifying and retrieving assembly models (large information redundancy and poor data consistency), this paper presents an assembly retrieval method oriented to key structures. A decision formal context is derived from the 3D structure model. The 3D assembly structure model of parts is defined by the adjacency graph of function surfaces and a qualitative geometric constraint graph. The assembly structure is encoded using the linear symbolic representation of compounds used in chemical databases. A rough set method defines an importance, or cohesion, that serves as the weight of a decision-making objective on the context, and a weighted concept lattice is then built on it. An important formal concept corresponds to a key structure, since the concept represents the relations between parts' function surfaces. The method can greatly improve query efficiency.
6. Benigni R, Laura Battistelli C, Bossa C, Giuliani A, Fioravanzo E, Bassan A, Fuart Gatnik M, Rathman J, Yang C, Tcheremenskaia O. Evaluation of the applicability of existing (Q)SAR models for predicting the genotoxicity of pesticides and similarity analysis related with genotoxicity of pesticides for facilitating of grouping and read across. EFSA Supporting Publications 2019. [DOI: 10.2903/sp.efsa.2019.en-1598] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Indexed: 12/18/2022]
7. García-Vico ÁM, González P, Carmona CJ, del Jesus MJ. A Big Data Approach for the Extraction of Fuzzy Emerging Patterns. Cognit Comput 2019. [DOI: 10.1007/s12559-018-9612-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 10/27/2022]
8. Yang H, Sun L, Li W, Liu G, Tang Y. Identification of Nontoxic Substructures: A New Strategy to Avoid Potential Toxicity Risk. Toxicol Sci 2018; 165:396-407. [DOI: 10.1093/toxsci/kfy146] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Indexed: 12/23/2022] Open
Affiliation(s)
- Hongbin Yang
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Lixia Sun
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Weihua Li
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Guixia Liu
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Yun Tang
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
9. Métivier JP, Cuissart B, Bureau R, Lepailleur A. The Pharmacophore Network: A Computational Method for Exploring Structure–Activity Relationships from a Large Chemical Data Set. J Med Chem 2018; 61:3551-3564. [DOI: 10.1021/acs.jmedchem.7b01890] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Indexed: 01/27/2023]
Affiliation(s)
- Jean-Philippe Métivier
- Centre d’Etudes et de Recherche sur le Médicament de Normandie, Normandie Univ, UNICAEN, CERMN, 14000 Caen, France
- Groupe de Recherche en Informatique, Image, Automatique et Instrumentation de Caen, Normandie Univ, UNICAEN, ENSICAEN, CNRS, GREYC, 14000 Caen, France
- Bertrand Cuissart
- Groupe de Recherche en Informatique, Image, Automatique et Instrumentation de Caen, Normandie Univ, UNICAEN, ENSICAEN, CNRS, GREYC, 14000 Caen, France
- Ronan Bureau
- Centre d’Etudes et de Recherche sur le Médicament de Normandie, Normandie Univ, UNICAEN, CERMN, 14000 Caen, France
- Alban Lepailleur
- Centre d’Etudes et de Recherche sur le Médicament de Normandie, Normandie Univ, UNICAEN, CERMN, 14000 Caen, France
10. Afolabi LT, Saeed F, Hashim H, Petinrin OO. Ensemble learning method for the prediction of new bioactive molecules. PLoS One 2018; 13:e0189538. [PMID: 29329334 PMCID: PMC5766097 DOI: 10.1371/journal.pone.0189538] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 07/24/2017] [Accepted: 11/27/2017] [Indexed: 12/31/2022] Open
Abstract
Pharmacologically active molecules can provide remedies for a range of different illnesses and infections. Therefore, the search for such bioactive molecules has been an enduring mission. As such, there is a need to employ a more suitable, reliable, and robust classification method for enhancing the prediction of the existence of new bioactive molecules. In this paper, we adopt a recently developed combination of different boosting methods (Adaboost) for the prediction of new bioactive molecules. We conducted the research experiments utilizing the widely used MDL Drug Data Report (MDDR) database. The proposed boosting method generated better results than other machine learning methods. This finding suggests that the method is suitable for inclusion among the in silico tools for use in cheminformatics, computational chemistry and molecular biology.
Affiliation(s)
- Faisal Saeed
- College of Computer Science and Engineering, Taibah University, Medina, Saudi Arabia
- Information Systems Department, Faculty of Computing, Universiti Teknologi Malaysia, Skudai, Johor, Malaysia
- Haslinda Hashim
- Information Systems Department, Faculty of Computing, Universiti Teknologi Malaysia, Skudai, Johor, Malaysia
- Kolej Yayasan Pelajaran Johor, KM16, Jalan Kulai-Kota Tinggi, Kota Tinggi, Johor, Malaysia
11. Rabatel J, Fannes T, Lepailleur A, Le Goff J, Crémilleux B, Ramon J, Bureau R, Cuissart B. Non a Priori Automatic Discovery of 3D Chemical Patterns: Application to Mutagenicity. Mol Inform 2017; 36. [PMID: 28590546 DOI: 10.1002/minf.201700022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2017] [Accepted: 05/22/2017] [Indexed: 11/11/2022]
Abstract
This article introduces a new type of structural fragment called a geometrical pattern. Such geometrical patterns are defined as molecular graphs that include a labelling of atoms together with constraints on interatomic distances. The discovery of geometrical patterns in a chemical dataset relies on the induction of multiple decision trees combined in random forests. Each computational step corresponds to a refinement of a preceding set of constraints, extending a previous geometrical pattern. This paper focuses on the mutagenicity of chemicals via the definition of structural alerts in relation with these geometrical patterns. It follows an experimental assessment of the main geometrical patterns to show how they can efficiently originate the definition of a chemical feature related to a chemical function or a chemical property. Geometrical patterns have provided a valuable and innovative approach to bring new pieces of information for discovering and assessing structural characteristics in relation to a particular biological phenotype.
Affiliation(s)
- Julien Rabatel
- Normandie Univ, France; UNICAEN, GREYC, UMR CNRS, F-14032, Caen, France
- Thomas Fannes
- Declarative Languages and Artificial Intelligence (DTAI), Department of Computer Science, KU Leuven, Belgium
- Alban Lepailleur
- Normandie Univ, France; UNICAEN, CERMN, UPRES EA 4258-FR CNRS 3038 INC3 M, Bd Becquerel, F-14032, Caen, France
- Bruno Crémilleux
- Normandie Univ, France; UNICAEN, GREYC, UMR CNRS, F-14032, Caen, France
- Jan Ramon
- Declarative Languages and Artificial Intelligence (DTAI), Department of Computer Science, KU Leuven, Belgium
- Ronan Bureau
- Normandie Univ, France; UNICAEN, CERMN, UPRES EA 4258-FR CNRS 3038 INC3 M, Bd Becquerel, F-14032, Caen, France
- Bertrand Cuissart
- Normandie Univ, France; UNICAEN, GREYC, UMR CNRS, F-14032, Caen, France
12. Zhang H, Kang YL, Zhu YY, Zhao KX, Liang JY, Ding L, Zhang TG, Zhang J. Novel naïve Bayes classification models for predicting the chemical Ames mutagenicity. Toxicol In Vitro 2017; 41:56-63. [DOI: 10.1016/j.tiv.2017.02.016] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Received: 06/27/2016] [Revised: 01/04/2017] [Accepted: 02/18/2017] [Indexed: 10/20/2022]
13. Floris M, Raitano G, Medda R, Benfenati E. Fragment Prioritization on a Large Mutagenicity Dataset. Mol Inform 2016; 36. [PMID: 28032691 DOI: 10.1002/minf.201600133] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 10/12/2016] [Accepted: 12/11/2016] [Indexed: 11/08/2022]
Abstract
The identification of structural alerts is one of the simplest tools used for the identification of potentially toxic chemical compounds. Structural alerts have served as an aid to quickly identify chemicals that should be either prioritized for testing or eliminated from further consideration and use. In recent years, the availability of larger datasets, often growing in the context of collaborative efforts and competitions, has created the raw material needed to identify new and more accurate structural alerts. This work applied a method to efficiently mine a large toxicological dataset for structural alerts showing a strong statistical association with mutagenicity. In detail, we processed a large Ames mutagenicity dataset comprising 14,015 unique molecules obtained by joining different data sources. After correction for multiple testing, we were able to assign a probability value to each fragment. A total of 51 rules were identified, with p-value < 0.05. Using the same method, we also confirmed the statistical significance of several mutagenicity rules already present and largely recognized in the literature. In addition, we have extended the application of our method by predicting the mutagenicity of an external data set.
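The abstract does not say which statistical test or which multiple-testing correction was used. Purely as an illustration of the general idea (scoring each fragment for association with mutagenicity, then correcting across fragments), the sketch below assumes a one-sided Fisher exact test and a Bonferroni correction. The function names, fragment names, and counts are all hypothetical and are not taken from the paper.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    a: mutagens containing the fragment
    b: non-mutagens containing the fragment
    c: mutagens lacking the fragment
    d: non-mutagens lacking the fragment
    Returns P(X >= a) under the hypergeometric null of no association.
    """
    with_fragment = a + b
    mutagens = a + c
    total = a + b + c + d
    upper = min(with_fragment, mutagens)
    favourable = sum(
        comb(mutagens, k) * comb(total - mutagens, with_fragment - k)
        for k in range(a, upper + 1)
    )
    return favourable / comb(total, with_fragment)


def bonferroni(p_values):
    """Bonferroni adjustment: multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical (a, b, c, d) counts per fragment; illustrative only.
counts = {
    "aromatic nitro": (40, 5, 60, 95),
    "carboxylic acid": (10, 12, 90, 88),
}
raw = {name: fisher_exact_greater(*table) for name, table in counts.items()}
adjusted = dict(zip(raw, bonferroni(list(raw.values()))))

# Fragments whose adjusted p-value stays below 0.05 are kept as candidate alerts.
alerts = [name for name, p in adjusted.items() if p < 0.05]
print(alerts)
```

Bonferroni is the most conservative common choice; a false-discovery-rate procedure would be a reasonable alternative when many fragments are tested.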
Affiliation(s)
- Matteo Floris
- CRS4 - Center for advanced studies, research and development in Sardinia, Loc. Piscina Manna, Building 1, 09010, Pula (CA), Italy; Department of Biomedical Sciences, University of Sassari, Sassari, Italy
- Giuseppa Raitano
- IRCCS - Istituto di Ricerche Farmacologiche "Mario Negri", Department of Environmental Health Sciences, Laboratory of Environmental Chemistry and Toxicology, Via La Masa 19, 20159, Milan, Italy
- Ricardo Medda
- CRS4 - Center for advanced studies, research and development in Sardinia, Loc. Piscina Manna, Building 1, 09010, Pula (CA), Italy
- Emilio Benfenati
- IRCCS - Istituto di Ricerche Farmacologiche "Mario Negri", Department of Environmental Health Sciences, Laboratory of Environmental Chemistry and Toxicology, Via La Masa 19, 20159, Milan, Italy
14. Cortes-Ciriano I. Bioalerts: a python library for the derivation of structural alerts from bioactivity and toxicity data sets. J Cheminform 2016; 8:13. [PMID: 26949417 PMCID: PMC4779235 DOI: 10.1186/s13321-016-0125-7] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 10/13/2015] [Accepted: 02/22/2016] [Indexed: 12/31/2022] Open
Abstract
BACKGROUND: Assessing compound toxicity at early stages of the drug discovery process is a crucial task to dismiss drug candidates likely to fail in clinical trials. Screening drug candidates against structural alerts, i.e. chemical fragments associated with a toxicological response before or after being metabolized (bioactivation), has proved a valuable approach for this task. During the last decades, diverse algorithms have been proposed for the automatic derivation of structural alerts from categorical toxicity data sets. RESULTS AND CONCLUSIONS: Here, the python library bioalerts is presented, which comprises functionalities for the automatic derivation of structural alerts from categorical (dichotomous), e.g. toxic/non-toxic, and continuous bioactivity data sets, e.g. [Formula: see text] or [Formula: see text] values. The library bioalerts relies on the RDKit implementation of the circular Morgan fingerprint algorithm to compute chemical substructures, which are derived by considering radial atom neighbourhoods of increasing bond radius. In addition to the derivation of structural alerts, bioalerts provides functionalities for the calculation of unhashed (keyed) Morgan fingerprints, which can be used in predictive bioactivity modelling with the advantage of allowing for a chemically meaningful deconvolution of the chemical space. Finally, bioalerts provides functionalities for the easy visualization of the derived structural alerts.
Affiliation(s)
- Isidro Cortes-Ciriano
- Unité de Bioinformatique Structurale, CNRS UMR 3825, Département de Biologie Structurale et Chimie, Institut Pasteur, 25, rue du Dr. Roux, 75015 Paris, France
15. Tomberg A, Pottel J, Liu Z, Labute P, Moitessier N. Understanding P450-mediated Bio-transformations into Epoxide and Phenolic Metabolites. Angew Chem Int Ed Engl 2015. [DOI: 10.1002/ange.201506131] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]
16. Tomberg A, Pottel J, Liu Z, Labute P, Moitessier N. Understanding P450-mediated Bio-transformations into Epoxide and Phenolic Metabolites. Angew Chem Int Ed Engl 2015; 54:13743-7. [PMID: 26418278 DOI: 10.1002/anie.201506131] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 07/03/2015] [Revised: 09/10/2015] [Indexed: 11/06/2022]
Abstract
Adverse drug reactions are commonly the result of cytochrome P450 enzymes (CYPs) converting the drugs into reactive metabolites. Thus, information about the CYP bioactivation of drugs would not only provide insight into metabolic stability, but also into the potential toxicity. For example, oxidation of phenyl rings may lead to either toxic epoxides or safer phenols. Herein, we demonstrate that the potential to form reactive metabolites is encoded primarily in the properties of the molecule to be oxidized. While the enzyme positions the molecule inside the binding pocket (selects the site of metabolism), the subsequent reaction is only dependent on the substrate itself. To test this hypothesis, we used this observation as a predictor of drug inherent toxicity. This approach was used to successfully identify the formation of reactive metabolites in over 100 drug molecules. These results provide a new perspective on the impact of functional groups on aromatic oxidation of drugs and their effects on toxicity.
Affiliation(s)
- Anna Tomberg
- Department of Chemistry, McGill University, 801 Sherbrooke Street West, Montreal, QC, H3A 0B8 (Canada)
- Joshua Pottel
- Department of Chemistry, McGill University, 801 Sherbrooke Street West, Montreal, QC, H3A 0B8 (Canada)
- Zhaomin Liu
- Department of Chemistry, McGill University, 801 Sherbrooke Street West, Montreal, QC, H3A 0B8 (Canada)
- Paul Labute
- Chemical Computing Group Inc., 1010 Sherbrooke Street West, Montreal, QC, H3A 2R7 (Canada)
- Nicolas Moitessier
- Department of Chemistry, McGill University, 801 Sherbrooke Street West, Montreal, QC, H3A 0B8 (Canada).