1
Hayek-Orduz Y, Acevedo-Castro DA, Saldarriaga Escobar JS, Ortiz-Domínguez BE, Villegas-Torres MF, Caicedo PA, Barrera-Ocampo Á, Cortes N, Osorio EH, González Barrios AF. dyphAI dynamic pharmacophore modeling with AI: a tool for efficient screening of new acetylcholinesterase inhibitors. Front Chem 2025; 13:1479763. [PMID: 40017724 PMCID: PMC11865752 DOI: 10.3389/fchem.2025.1479763] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2024] [Accepted: 01/06/2025] [Indexed: 03/01/2025] Open
Abstract
Therapeutic strategies for Alzheimer's disease (AD) often involve inhibiting acetylcholinesterase (AChE), underscoring the need for novel inhibitors with high selectivity and minimal side effects. A detailed analysis of the protein-ligand pharmacophore dynamics can facilitate this. In this study, we developed and employed dyphAI, an innovative approach integrating machine learning models, ligand-based pharmacophore models, and complex-based pharmacophore models into a pharmacophore model ensemble. This ensemble captures key protein-ligand interactions, including π-cation interactions with Trp-86 and several π-π interactions with residues Tyr-341, Tyr-337, Tyr-124, and Tyr-72. The protocol identified 18 novel molecules from the ZINC database with binding energy values ranging from -62 to -115 kJ/mol, suggesting their strong potential as AChE inhibitors. To further validate the predictions, nine molecules were acquired and tested for their inhibitory activity against human AChE. Experimental results revealed that molecule 4 (P-1894047), with its complex multi-ring structure and numerous hydrogen bond acceptors, and molecule 7 (P-2652815), characterized by a flexible, polar framework with ten hydrogen bond donors and acceptors, exhibited IC₅₀ values lower than or equal to that of the control (galantamine), indicating potent inhibitory activity. Similarly, molecules 5 (P-1205609), 6 (P-1206762), 8 (P-2026435), and 9 (P-533735) also demonstrated strong inhibition. In contrast, molecule 3 (P-617769798) showed a higher IC₅₀ value, and molecules 1 (P-14421887) and 2 (P-25746649) yielded inconsistent results, likely due to solubility issues in the experimental setup. These findings underscore the value of integrating computational predictions with experimental validation, enhancing the reliability of virtual screening in the discovery of potent enzyme inhibitors.
Affiliation(s)
- Yasser Hayek-Orduz
  - Grupo de Diseño de Productos y Procesos (GDPP), Department of Chemical and Food Engineering, Universidad de los Andes, Bogotá, Colombia
- Dorian Armando Acevedo-Castro
  - Grupo de Diseño de Productos y Procesos (GDPP), Department of Chemical and Food Engineering, Universidad de los Andes, Bogotá, Colombia
  - Computational Bio-Organic Chemistry (COBO), Department of Chemistry, Universidad de los Andes, Bogotá, Colombia
- Juan Sebastián Saldarriaga Escobar
  - Grupo Natura, Facultad de Ingenieria, Diseño y Ciencias Aplicadas, Departamento de Ciencias Biológicas, Bioprocesos y Biotecnología, Universidad ICESI, Cali, Colombia
- Brandon Eli Ortiz-Domínguez
  - Grupo Natura, Facultad de Ingenieria, Diseño y Ciencias Aplicadas, Departamento de Ciencias Biológicas, Bioprocesos y Biotecnología, Universidad ICESI, Cali, Colombia
- María Francisca Villegas-Torres
  - Centro de Investigaciones Microbiológicas (CIMIC), Department of Biological Sciences, Universidad de los Andes, Bogotá, Colombia
- Paola A. Caicedo
  - Grupo Natura, Facultad de Ingenieria, Diseño y Ciencias Aplicadas, Departamento de Ciencias Biológicas, Bioprocesos y Biotecnología, Universidad ICESI, Cali, Colombia
- Álvaro Barrera-Ocampo
  - Grupo Natura, Facultad de Ingenieria, Diseño y Ciencias Aplicadas, Departamento de Ciencias Farmacéuticas y Químicas, Universidad ICESI, Cali, Colombia
- Natalie Cortes
  - Grupo de Investigación en Química Bioorgánica y Sistemas Moleculares (QBOSMO), Faculty of Natural Sciences and Mathematics, Universidad de Ibagué, Ibagué, Colombia
- Edison H. Osorio
  - Grupo de Investigación en Química Bioorgánica y Sistemas Moleculares (QBOSMO), Faculty of Natural Sciences and Mathematics, Universidad de Ibagué, Ibagué, Colombia
- Andrés Fernando González Barrios
  - Grupo de Diseño de Productos y Procesos (GDPP), Department of Chemical and Food Engineering, Universidad de los Andes, Bogotá, Colombia
2
MORTAR: a rich client application for in silico molecule fragmentation. J Cheminform 2023; 15:1. [PMID: 36593523 PMCID: PMC9809053 DOI: 10.1186/s13321-022-00674-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/14/2022] [Accepted: 12/17/2022] [Indexed: 01/03/2023] Open
Abstract
Developing and implementing computational algorithms for the extraction of specific substructures from molecular graphs (in silico molecule fragmentation) is an iterative process. It involves repeated sequences of implementing a rule set, applying it to relevant structural data, checking the results, and adjusting the rules. This requires a computational workflow with data import, fragmentation algorithm integration, and result visualisation. The described workflow is normally unavailable for a new algorithm and must be set up individually. This work presents an open Java rich client Graphical User Interface (GUI) application to support the development of new in silico molecule fragmentation algorithms and make them readily available upon release. The MORTAR (MOlecule fRagmenTAtion fRamework) application visualises fragmentation results of a set of molecules in various ways and provides basic analysis features. Fragmentation algorithms can be integrated and developed within MORTAR by using a specific wrapper class. In addition, fragmentation pipelines with any combination of the available fragmentation methods can be executed. Upon release, three fragmentation algorithms are already integrated: ErtlFunctionalGroupsFinder, Sugar Removal Utility, and Scaffold Generator. These algorithms, as well as all cheminformatics functionalities in MORTAR, are implemented based on the Chemistry Development Kit (CDK).
3
Dai X, Xu Y, Qiu H, Qian X, Lin M, Luo L, Zhao Y, Huang D, Zhang Y, Chen Y, Liu H, Jiang Y. KID: A Kinase-Focused Interaction Database and Its Application in the Construction of Kinase-Focused Molecule Databases. J Chem Inf Model 2022; 62:6022-6034. [PMID: 36447388 DOI: 10.1021/acs.jcim.2c00908] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/02/2022]
Abstract
Protein kinases are important drug targets for the treatment of several diseases. The interaction between kinases and ligands is vital in the process of small-molecule kinase inhibitor (SMKI) design. In this study, we propose a method to extract fragments and amino acid residues from crystal structures for kinase-ligand interactions. In addition, core fragments that interact with the important hinge region of kinases were extracted along with their decorations. Based on the superimposed structural data of kinases from the kinase-ligand interaction fingerprint and structure database, we obtained two libraries, namely, a hinge-unfocused fragment-amino acid pair library (FAP Lib) that contains 6672 pairs of fragments and corresponding amino acids, and a hinge-focused hinge binder library (HB Lib) of 3560 pairs of hinge-binding scaffolds with their corresponding decorations. These two libraries constitute a kinase-focused interaction database (KID). In-depth analysis was conducted on KID to explore important characteristics of fragments in the design of SMKIs. With KID, we built two kinase-focused molecule databases, one called Recomb_DB, which contains 172,346 molecules generated through fragment recombination based on the FAP Lib, and another called RsdHB_DB, which contains 93,030 molecules generated based on our HB Lib using molecular generation methods. Compared with five commercial and non-commercial databases, these two databases both ranked in the top 3 for scaffold diversity and the top 4 for molecule fingerprint diversity, and are more focused on the chemical space of kinase inhibitors. Hence, KID presents a useful addition to existing databases for the exploration of novel SMKIs.
Affiliation(s)
- Xiaowen Dai
  - Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
- Yuan Xu
  - Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
- Haodi Qiu
  - Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
- Xu Qian
  - Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
- Mingde Lin
  - Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
- Lin Luo
  - Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
- Yang Zhao
  - Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
- Dingfang Huang
  - Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
- Yanmin Zhang
  - Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
- Yadong Chen
  - Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
- Haichun Liu
  - Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
- Yulei Jiang
  - Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
4
Schaub J, Zander J, Zielesny A, Steinbeck C. Scaffold Generator: a Java library implementing molecular scaffold functionalities in the Chemistry Development Kit (CDK). J Cheminform 2022; 14:79. [PMID: 36357931 PMCID: PMC9650898 DOI: 10.1186/s13321-022-00656-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/24/2022] [Accepted: 10/30/2022] [Indexed: 11/12/2022] Open
Abstract
The concept of molecular scaffolds as defining core structures of organic molecules is utilised in many areas of chemistry and cheminformatics, e.g. drug design, chemical classification, or the analysis of high-throughput screening data. Here, we present Scaffold Generator, a comprehensive open library for the generation, handling, and display of molecular scaffolds, scaffold trees and networks. The new library is based on the Chemistry Development Kit (CDK) and highly customisable through multiple settings, e.g. five different structural framework definitions are available. For display of scaffold hierarchies, the open GraphStream Java library is utilised. Performance snapshots with natural products (NP) from the COCONUT (COlleCtion of Open Natural prodUcTs) database and drug molecules from DrugBank are reported. The generation of a scaffold network from more than 450,000 NP can be achieved within a single day.
Affiliation(s)
- Jonas Schaub
  - Institute for Inorganic and Analytical Chemistry, Friedrich-Schiller-University Jena, Lessing Strasse 8, 07743 Jena, Germany (GRID: grid.9613.d; ISNI: 0000 0001 1939 2794)
- Julian Zander
  - Institute for Bioinformatics and Chemoinformatics, Westphalian University of Applied Sciences, August-Schmidt-Ring 10, 45665 Recklinghausen, Germany (GRID: grid.454254.6; ISNI: 0000 0004 0647 4362)
- Achim Zielesny
  - Institute for Bioinformatics and Chemoinformatics, Westphalian University of Applied Sciences, August-Schmidt-Ring 10, 45665 Recklinghausen, Germany (GRID: grid.454254.6; ISNI: 0000 0004 0647 4362)
- Christoph Steinbeck
  - Institute for Inorganic and Analytical Chemistry, Friedrich-Schiller-University Jena, Lessing Strasse 8, 07743 Jena, Germany (GRID: grid.9613.d; ISNI: 0000 0001 1939 2794)
5
Shearer J, Castro JL, Lawson ADG, MacCoss M, Taylor RD. Rings in Clinical Trials and Drugs: Present and Future. J Med Chem 2022; 65:8699-8712. [PMID: 35730680 PMCID: PMC9289879 DOI: 10.1021/acs.jmedchem.2c00473] [Citation(s) in RCA: 167] [Impact Index Per Article: 55.7] [Indexed: 11/28/2022]
Abstract
We present a comprehensive analysis of all ring systems (both heterocyclic and nonheterocyclic) in clinical trial compounds and FDA-approved drugs. We show 67% of small molecules in clinical trials comprise only ring systems found in marketed drugs, which mirrors previously published findings for newly approved drugs. We also show there are approximately 450 000 unique ring systems derived from 2.24 billion molecules currently available in synthesized chemical space, and molecules in clinical trials utilize only 0.1% of this available pool. Moreover, there are fewer ring systems in drugs compared with those in clinical trials, but this is balanced by the drug ring systems being reused more often. Furthermore, systematic changes of up to two atoms on existing drug and clinical trial ring systems give a set of 3902 future clinical trial ring systems, which are predicted to cover approximately 50% of the novel ring systems entering clinical trials.
Affiliation(s)
- Malcolm MacCoss
  - Bohicket Pharma Consulting Limited Liability Company, 2556 Seabrook Island Road, Seabrook Island, South Carolina 29455, United States
6
Manelfi C, Gemei M, Talarico C, Cerchia C, Fava A, Lunghini F, Beccari AR. "Molecular Anatomy": a new multi-dimensional hierarchical scaffold analysis tool. J Cheminform 2021; 13:54. [PMID: 34301327 PMCID: PMC8299179 DOI: 10.1186/s13321-021-00526-y] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 12/30/2020] [Accepted: 06/13/2021] [Indexed: 11/10/2022] Open
Abstract
The scaffold representation is widely employed to classify bioactive compounds on the basis of common core structures or to correlate compound classes with specific biological activities. In this paper, we present a novel approach called "Molecular Anatomy" as a flexible and unbiased molecular scaffold-based metric to cluster large sets of compounds. We introduce a set of nine molecular representations at different abstraction levels, combined with fragmentation rules, to define a multi-dimensional network of hierarchically interconnected molecular frameworks. We demonstrate that the introduction of a flexible scaffold definition and multiple pruning rules is an effective method to identify relevant chemical moieties. This approach allows active molecules belonging to different molecular classes to be clustered together, capturing most of the structure-activity information, in particular when libraries containing a huge number of singletons are analyzed. We also propose a procedure to derive a network visualization that allows a full graphical representation of compound datasets, permitting efficient navigation in the scaffold space and contributing significantly to high-quality SAR analysis. The protocol is freely available as a web interface at https://ma.exscalate.eu.
Affiliation(s)
- Candida Manelfi
  - Dompé Farmaceutici SpA, Via Campo di Pile, 67100, L'Aquila, Italy
- Marica Gemei
  - Dompé Farmaceutici SpA, Via Campo di Pile, 67100, L'Aquila, Italy
- Carmine Talarico
  - Dompé Farmaceutici SpA, Via Campo di Pile, 67100, L'Aquila, Italy
- Carmen Cerchia
  - Department of Pharmacy, University of Naples "Federico II", 80131, Napoli, Italy
- Anna Fava
  - Dompé Farmaceutici SpA, Via Campo di Pile, 67100, L'Aquila, Italy
- Filippo Lunghini
  - Dompé Farmaceutici SpA, Via Campo di Pile, 67100, L'Aquila, Italy
7
Naveja JJ, Pilón-Jiménez BA, Bajorath J, Medina-Franco JL. A general approach for retrosynthetic molecular core analysis. J Cheminform 2019; 11:61. [PMID: 33430974 PMCID: PMC6760108 DOI: 10.1186/s13321-019-0380-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 03/04/2019] [Accepted: 08/04/2019] [Indexed: 11/13/2022] Open
Abstract
Scaffold analysis of compound data sets has reemerged as a chemically interpretable alternative to machine learning for chemical space and structure–activity relationships analysis. In this context, analog series-based scaffolds (ASBS) are synthetically relevant core structures that represent individual series of analogs. As an extension to ASBS, we herein introduce the development of a general conceptual framework that considers all putative cores of molecules in a compound data set, thus softening the often applied “single molecule–single scaffold” correspondence. A putative core is here defined as any substructure of a molecule complying with two basic rules: (a) the size of the core is a significant proportion of the whole molecule size and (b) the substructure can be reached from the original molecule through a succession of retrosynthesis rules. Thereafter, a bipartite network consisting of molecules and cores can be constructed for a database of chemical structures. Compounds linked to the same cores are considered analogs. We present case studies illustrating the potential of the general framework. The applications range from inter- and intra-core diversity analysis of compound data sets, structure–property relationships, and identification of analog series and ASBS. The molecule–core network herein presented is a general methodology with multiple applications in scaffold analysis. New statistical methods are envisioned that will be able to draw quantitative conclusions from these data. The code to use the method presented in this work is freely available as an additional file. Follow-up applications include analog searching and core structure–property relationships analyses.
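The molecule–core network described in this abstract can be illustrated with a small Python sketch. It is not the authors' code: the molecule and core names, the heavy-atom counts, and the 2/3 size threshold used for rule (a) are all hypothetical, and the candidate cores are assumed to have been produced beforehand by retrosynthetic fragmentation (rule b), which is not implemented here.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical input: molecule -> (heavy-atom count, {candidate core: heavy-atom count}).
# Candidate cores are assumed to come from a retrosynthetic fragmentation step
# (rule b) that is outside the scope of this sketch.
MOLECULES = {
    "mol_A": (24, {"core_1": 18, "core_2": 9}),
    "mol_B": (26, {"core_1": 18}),
    "mol_C": (20, {"core_2": 9, "core_3": 15}),
    "mol_D": (21, {"core_3": 15}),
}

# Assumed threshold for rule (a): the core must make up at least this
# fraction of the molecule. The value is illustrative only.
MIN_CORE_FRACTION = 2 / 3


def build_core_network(molecules, min_fraction=MIN_CORE_FRACTION):
    """Return the bipartite network as {core: set of molecules} for cores passing rule (a)."""
    core_to_mols = defaultdict(set)
    for mol, (mol_size, cores) in molecules.items():
        for core, core_size in cores.items():
            if core_size / mol_size >= min_fraction:
                core_to_mols[core].add(mol)
    return dict(core_to_mols)


def analog_pairs(core_to_mols):
    """Molecules linked to the same core are treated as analogs."""
    pairs = set()
    for mols in core_to_mols.values():
        pairs.update(combinations(sorted(mols), 2))
    return pairs


network = build_core_network(MOLECULES)
analogs = analog_pairs(network)
```

With this toy input, `core_2` is dropped because it is too small relative to both molecules that contain it, so `mol_A` and `mol_C` are not linked. Because a molecule may keep several cores, the one-molecule-one-scaffold restriction is relaxed, as the abstract describes.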
Affiliation(s)
- J Jesús Naveja
  - PECEM, School of Medicine, Universidad Nacional Autónoma de México, Avenida Universidad 3000, 04510, Mexico City, Mexico
  - Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Avenida Universidad 3000, 04510, Mexico City, Mexico
- B Angélica Pilón-Jiménez
  - Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Avenida Universidad 3000, 04510, Mexico City, Mexico
- Jürgen Bajorath
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Endenicher Allee 19c, 53115, Bonn, Germany
- José L Medina-Franco
  - Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Avenida Universidad 3000, 04510, Mexico City, Mexico
8
Bandyopadhyay D, Kreatsoulas C, Brady PG, Boyer J, He Z, Scavello G, Peryea T, Jadhav A, Nguyen DT, Guha R. Scaffold-Based Analytics: Enabling Hit-to-Lead Decisions by Visualizing Chemical Series Linked across Large Datasets. J Chem Inf Model 2019; 59:4880-4892. [DOI: 10.1021/acs.jcim.9b00243] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 01/20/2023]
Affiliation(s)
- Deepak Bandyopadhyay
  - GlaxoSmithKline, 1250 S. Collegeville Rd, Collegeville, Pennsylvania 19426, United States
- Pat G. Brady
  - GlaxoSmithKline, 1250 S. Collegeville Rd, Collegeville, Pennsylvania 19426, United States
- Joseph Boyer
  - GlaxoSmithKline, 1250 S. Collegeville Rd, Collegeville, Pennsylvania 19426, United States
- Zangdong He
  - GlaxoSmithKline, 1250 S. Collegeville Rd, Collegeville, Pennsylvania 19426, United States
- Genaro Scavello
  - GlaxoSmithKline, 1250 S. Collegeville Rd, Collegeville, Pennsylvania 19426, United States
- Tyler Peryea
  - National Center for Advancing Translational Science, 9800 Medical Center Drive, Rockville, Maryland 20850, United States
- Ajit Jadhav
  - National Center for Advancing Translational Science, 9800 Medical Center Drive, Rockville, Maryland 20850, United States
- Dac-Trung Nguyen
  - National Center for Advancing Translational Science, 9800 Medical Center Drive, Rockville, Maryland 20850, United States
- Rajarshi Guha
  - National Center for Advancing Translational Science, 9800 Medical Center Drive, Rockville, Maryland 20850, United States
9
Dang X, Liu Z, Zhou Y, Chen P, Liu J, Yao X, Lei B. Steroids-specific target library for steroids target prediction. Steroids 2018; 140:83-91. [PMID: 30296544 DOI: 10.1016/j.steroids.2018.10.002] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 06/29/2018] [Revised: 09/14/2018] [Accepted: 10/01/2018] [Indexed: 01/07/2023]
Abstract
Steroids exist universally and play critical roles in various biological processes. Identifying potential targets of steroids is of great significance for studying their physiological and biochemical activities and side effects, and for drug repurposing. Herein, aiming at more precise steroid target prediction, a steroids-specific target library integrating 3325 PDB or homology modeling structures categorized into 196 proteins was built by considering chemical similarity from DrugBank and biological processes from KEGG. The main properties of this library include: (1) It was manually prepared and checked to eliminate mistakes. (2) The library enriched the possible steroid targets and could decrease the false positives of structure-based target screening for steroids. (3) Ranking by protein name instead of PDB ID could make the screening more efficient and precise. (4) Protein flexibility was partially taken into account through the different active conformations provided by the structural redundancy of each protein category, which leads to more accurate prediction. The case studies of glycocholic acid and 24-epibrassinolide demonstrated its strong predictive accuracy. In summary, our strategy to build the steroids-specific protein library for steroid target prediction is a promising approach, and it provides a novel idea for the target prediction of small molecules.
Affiliation(s)
- Xiaoxue Dang
  - Center of Bioinformatics, College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
- Zheng Liu
  - Center of Bioinformatics, College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
- Yanzhuo Zhou
  - Center of Bioinformatics, College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
- Peizi Chen
  - Center of Bioinformatics, College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
- Jiyuan Liu
  - Key Laboratory of Plant Protection Resources & Pest Management of the Ministry of Education, Northwest A&F University, Yangling, Shaanxi, China
- Xiaojun Yao
  - State Key Laboratory of Applied Organic Chemistry and Department of Chemistry, Lanzhou University, Lanzhou, China
- Beilei Lei
  - Center of Bioinformatics, College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
10
Jacoby E, Wroblowski B, Buyck C, Neefs JM, Meyer C, Cummings MD, van Vlijmen H. Protocols for the Design of Kinase-focused Compound Libraries. Mol Inform 2017; 37:e1700119. [PMID: 29116686 DOI: 10.1002/minf.201700119] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 09/26/2017] [Accepted: 10/20/2017] [Indexed: 01/12/2023]
Abstract
Protocols for the design of kinase-focused compound libraries are presented. Kinase-focused compound libraries can be differentiated based on the design goal. Depending on whether the library should be a discovery library specific for one particular kinase, a general discovery library for multiple distinct kinase projects, or even a library for phenotypic screening, there exists today a variety of in silico methods to design candidate compound libraries. We address the following scenarios: 1) Datamining of SAR databases and kinase-focused vendor catalogues; 2) Predictions and virtual screening; 3) Structure-based design of combinatorial kinase inhibitors; 4) Design of covalent kinase inhibitors; 5) Design of macrocyclic kinase inhibitors; and 6) Design of allosteric kinase inhibitors and activators.
Affiliation(s)
- Edgar Jacoby
  - Janssen Research & Development, Turnhoutseweg 30, 2340, Beerse, Belgium
- Christophe Buyck
  - Janssen Research & Development, Turnhoutseweg 30, 2340, Beerse, Belgium
- Jean-Marc Neefs
  - Janssen Research & Development, Turnhoutseweg 30, 2340, Beerse, Belgium
- Maxwell D Cummings
  - Janssen Research & Development, 1400 McKean Rd, Spring House, PA 19477, USA
11
Hall RJ, Murray CW, Verdonk ML. The Fragment Network: A Chemistry Recommendation Engine Built Using a Graph Database. J Med Chem 2017; 60:6440-6450. [PMID: 28712298 DOI: 10.1021/acs.jmedchem.7b00809] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Indexed: 12/17/2022]
Abstract
The hit validation stage of a fragment-based drug discovery campaign involves probing the SAR around one or more fragment hits. This often requires a search for similar compounds in a corporate collection or from commercial suppliers. The Fragment Network is a graph database that allows a user to efficiently search chemical space around a compound of interest. The result set is chemically intuitive, naturally grouped by substitution pattern and meaningfully sorted according to the number of observations of each transformation in medicinal chemistry databases. This paper describes the algorithms used to construct and search the Fragment Network and provides examples of how it may be used in a drug discovery context.
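The grouping and ordering of results mentioned in this abstract can be pictured with a short Python sketch. It makes no claim about the actual Fragment Network algorithms or schema: the compound identifiers, transformation labels, and observation counts are invented for illustration, and each transformation label is assumed to stand in for one substitution pattern.

```python
from collections import defaultdict

# Hypothetical search hits around a query compound:
# (neighbouring compound, transformation label, number of times that
# transformation is observed in some reference database). All values are made up.
HITS = [
    ("cmpd_1", "add methyl at position 4", 120),
    ("cmpd_2", "add methyl at position 4", 120),
    ("cmpd_3", "replace phenyl with pyridyl", 310),
    ("cmpd_4", "add chloro at position 2", 45),
]


def group_and_rank(hits):
    """Group hits by transformation, then order groups by observation count (descending)."""
    groups = defaultdict(list)
    counts = {}
    for compound, transformation, observations in hits:
        groups[transformation].append(compound)
        counts[transformation] = observations
    ranked = sorted(groups, key=lambda t: counts[t], reverse=True)
    return [(t, counts[t], groups[t]) for t in ranked]


ranked_groups = group_and_rank(HITS)
```

In this toy example the most frequently observed transformation is listed first, which is one simple way to read "meaningfully sorted according to the number of observations of each transformation".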
Affiliation(s)
- Richard J Hall
  - Astex Pharmaceuticals, 436 Cambridge Science Park, Milton Road, Cambridge CB4 0QA, United Kingdom
- Christopher W Murray
  - Astex Pharmaceuticals, 436 Cambridge Science Park, Milton Road, Cambridge CB4 0QA, United Kingdom
- Marcel L Verdonk
  - Astex Pharmaceuticals, 436 Cambridge Science Park, Milton Road, Cambridge CB4 0QA, United Kingdom
12
Velkoborsky J, Hoksza D. Scaffold analysis of PubChem database as background for hierarchical scaffold-based visualization. J Cheminform 2016; 8:74. [PMID: 28090217 PMCID: PMC5199768 DOI: 10.1186/s13321-016-0186-7] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 09/19/2016] [Accepted: 12/02/2016] [Indexed: 11/25/2022] Open
Abstract
Background: Visualization of large molecular datasets is a challenging yet important topic utilised in diverse fields of chemistry ranging from material engineering to drug design. Especially in drug design, modern methods of high-throughput screening generate large amounts of molecular data that call for methods enabling their analysis. One such method is classification of compounds based on their molecular scaffolds, a concept widely used by medicinal chemists to group molecules of similar properties. This classification can then be utilized for intuitive visualization of compounds. Results: In this paper, we propose a scaffold hierarchy as a result of large-scale analysis of the PubChem Compound database. The analysis not only provided insights into scaffold diversity of the PubChem Compound database, but also enables scaffold-based hierarchical visualization of user compound data sets on the background of empirical chemical space, as defined by the PubChem data, or on the background of any other user-defined data set. The visualization is performed by a web-based client-server application called Scaffvis. It provides an interactive zoomable tree map visualization of data sets of up to hundreds of thousands of molecules. Scaffvis is free to use and its source code has been published under an open source license. Electronic supplementary material: The online version of this article (doi:10.1186/s13321-016-0186-7) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Jakub Velkoborsky
  - Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic
- David Hoksza
  - Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic
13
ClassyFire: automated chemical classification with a comprehensive, computable taxonomy. J Cheminform 2016; 8:61. [PMID: 27867422 PMCID: PMC5096306 DOI: 10.1186/s13321-016-0174-y] [Citation(s) in RCA: 797] [Impact Index Per Article: 88.6] [Received: 06/30/2016] [Accepted: 10/18/2016] [Indexed: 12/03/2022] Open
Abstract
Background: Scientists have long been driven by the desire to describe, organize, classify, and compare objects using taxonomies and/or ontologies. In contrast to biology, geology, and many other scientific disciplines, the world of chemistry still lacks a standardized chemical ontology or taxonomy. Several attempts at chemical classification have been made; but they have mostly been limited to either manual, or semi-automated proof-of-principle applications. This is regrettable as comprehensive chemical classification and description tools could not only improve our understanding of chemistry but also improve the linkage between chemistry and many other fields. For instance, the chemical classification of a compound could help predict its metabolic fate in humans, its druggability or potential hazards associated with it, among others. However, the sheer number (tens of millions of compounds) and complexity of chemical structures is such that any manual classification effort would prove to be near impossible. Results: We have developed a comprehensive, flexible, and computable, purely structure-based chemical taxonomy (ChemOnt), along with a computer program (ClassyFire) that uses only chemical structures and structural features to automatically assign all known chemical compounds to a taxonomy consisting of >4800 different categories. This new chemical taxonomy consists of up to 11 different levels (Kingdom, SuperClass, Class, SubClass, etc.) with each of the categories defined by unambiguous, computable structural rules. Furthermore each category is named using a consensus-based nomenclature and described (in English) based on the characteristic common structural properties of the compounds it contains. The ClassyFire webserver is freely accessible at http://classyfire.wishartlab.com/. Moreover, a Ruby API version is available at https://bitbucket.org/wishartlab/classyfire_api, which provides programmatic access to the ClassyFire server and database.
ClassyFire has been used to annotate over 77 million compounds and has already been integrated into other software packages to automatically generate textual descriptions for, and/or infer biological properties of over 100,000 compounds. Additional examples and applications are provided in this paper. Conclusion: ClassyFire, in combination with ChemOnt (ClassyFire’s comprehensive chemical taxonomy), now allows chemists and cheminformaticians to perform large-scale, rapid and automated chemical classification. Moreover, a freely accessible API allows easy access to more than 77 million “ClassyFire” classified compounds. The results can be used to help annotate well studied, as well as lesser-known compounds. In addition, these chemical classifications can be used as input for data integration, and many other cheminformatics-related tasks. Electronic supplementary material: The online version of this article (doi:10.1186/s13321-016-0174-y) contains supplementary material, which is available to authorized users.
|
14
|
Grygorenko OO, Babenko P, Volochnyuk DM, Raievskyi O, Komarov IV. Following Ramachandran: exit vector plots (EVP) as a tool to navigate chemical space covered by 3D bifunctional scaffolds. The case of cycloalkanes. RSC Adv 2016. [DOI: 10.1039/c5ra19958a] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Indexed: 12/19/2022]
Abstract
An approach to the analysis and visualization of chemical space covered by disubstituted scaffolds, based on exit vector plots (EVP), is used to analyze cycloalkanes. Four clearly defined regions (α, β, γ and δ) are found in their EVP.
Affiliation(s)
- Pavlo Babenko
- Taras Shevchenko National University of Kyiv
- Kyiv 01601
- Ukraine

- Dmitry M. Volochnyuk
- Institute of Organic Chemistry National Academy of Sciences of Ukraine
- Kyiv 02094
- Ukraine

- Oleksii Raievskyi
- Institute of Molecular Biology and Genetics National Academy of Sciences of Ukraine
- Kyiv 03680
- Ukraine
- Life Chemicals
- Life Chemicals Group

- Igor V. Komarov
- Taras Shevchenko National University of Kyiv
- Kyiv 01601
- Ukraine
|
15
|
Jacoby E, Tresadern G, Bembenek S, Wroblowski B, Buyck C, Neefs JM, Rassokhin D, Poncelet A, Hunt J, van Vlijmen H. Extending kinome coverage by analysis of kinase inhibitor broad profiling data. Drug Discov Today 2015; 20:652-8. [PMID: 25596550 DOI: 10.1016/j.drudis.2015.01.002] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.1] [Received: 09/19/2014] [Revised: 12/03/2014] [Accepted: 01/08/2015] [Indexed: 01/09/2023]
Abstract
The explored kinome was extended with broad profiling using the DiscoveRx and Millipore assay panels. The analysis of the profiling of 3368 selected inhibitors on 456 kinases in the DiscoveRx format delivered several insights. First, the coverage depended on the threshold of the selectivity parameter. Second, the relation between hit confirmation rates and inhibitor selectivity showed unexpectedly that higher selectivity can increase the likelihood of false positives. Third, comparing the coverage of a focused to that of a random library showed that the design based on a maximum number of scaffolds was superior to a limited number of scaffolds. Therefore, selective compounds can be used in target validation, enable the jumpstarting of new kinase drug discovery projects, and chart new biological space via phenotypic screening.
Affiliation(s)
- Edgar Jacoby
- Janssen Research & Development, Turnhoutseweg 30, 2340 Beerse, Belgium

- Gary Tresadern
- Janssen Research & Development, Turnhoutseweg 30, 2340 Beerse, Belgium

- Scott Bembenek
- Janssen Research & Development, La Jolla, 3210 Merryfield Row, San Diego, CA 92121, USA

- Christophe Buyck
- Janssen Research & Development, Turnhoutseweg 30, 2340 Beerse, Belgium

- Jean-Marc Neefs
- Janssen Research & Development, Turnhoutseweg 30, 2340 Beerse, Belgium

- Dmitrii Rassokhin
- Janssen Research & Development, Spring House, 1400 McKean Road, Spring House, PA 19002, USA

- Alain Poncelet
- Janssen Research & Development, Val de Reuil, Campus de Maigremont, B.P. 615, 27106 Val De Reuil Cedex, France

- Jeremy Hunt
- KINOMEscan Division of DiscoveRx Corporation, 11180 Roselle Street, Suite D, San Diego, CA 92121, USA
|
16
|
Schwartz J, Awale M, Reymond JL. SMIfp (SMILES fingerprint) chemical space for virtual screening and visualization of large databases of organic molecules. J Chem Inf Model 2013; 53:1979-89. [PMID: 23845040 DOI: 10.1021/ci400206h] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.0] [Indexed: 01/16/2023]
Abstract
SMIfp (SMILES fingerprint) is defined here as a scalar fingerprint describing organic molecules by counting the occurrences of 34 different symbols in their SMILES strings, which creates a 34-dimensional chemical space. Ligand-based virtual screening using the city-block distance CBD(SMIfp) as similarity measure provides good AUC values and enrichment factors for recovering series of actives from the directory of useful decoys (DUD-E) and from ZINC. DrugBank, ChEMBL, ZINC, PubChem, GDB-11, GDB-13, and GDB-17 can be searched by CBD(SMIfp) using an online SMIfp-browser at www.gdb.unibe.ch. Visualization of the SMIfp chemical space was performed by principal component analysis and color-coded maps of the (PC1, PC2)-planes, with interactive access to the molecules enabled by the Java application SMIfp-MAPPLET available from www.gdb.unibe.ch. These maps spread molecules according to their fraction of aromatic atoms, size and polarity. SMIfp provides a new and relevant entry to explore the small molecule chemical space.
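The counting scheme and city-block ranking described in this abstract can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not enumerate the 34 symbols, so the `SYMBOLS` list below is a hypothetical, reduced set, and the helper names `smifp`, `cbd`, and `rank_by_cbd` are introduced here only for illustration. Counting single characters also miscounts two-letter elements such as Cl or Br, which a real implementation would need to tokenize.

```python
from collections import Counter

# Hypothetical, reduced symbol set for illustration only. The published
# SMIfp counts 34 symbols, which the abstract does not list.
SYMBOLS = ["C", "c", "N", "n", "O", "o", "S", "=", "#", "(", "[", "1", "2"]


def smifp(smiles):
    """Return a count vector: occurrences of each symbol in the SMILES string."""
    counts = Counter(smiles)  # character-level counts
    return [counts[symbol] for symbol in SYMBOLS]


def cbd(fp_a, fp_b):
    """City-block (Manhattan) distance between two count vectors."""
    return sum(abs(a - b) for a, b in zip(fp_a, fp_b))


def rank_by_cbd(query, library):
    """Sort library SMILES by increasing city-block distance to the query."""
    query_fp = smifp(query)
    return sorted(library, key=lambda smi: cbd(query_fp, smifp(smi)))


if __name__ == "__main__":
    # Toy library: benzene, acetic acid, ethylamine; query: ethanol.
    library = ["c1ccccc1", "CC(=O)O", "CCN"]
    print(rank_by_cbd("CCO", library))
```

Because the fingerprint is a plain count vector, the distance is cheap to compute, which is consistent with the abstract's use of it to search very large databases.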
Affiliation(s)
- Julian Schwartz
- Department of Chemistry and Biochemistry, University of Berne, Freiestrasse 3, 3012 Berne, Switzerland
|
17
|
Awale M, van Deursen R, Reymond JL. MQN-mapplet: visualization of chemical space with interactive maps of DrugBank, ChEMBL, PubChem, GDB-11, and GDB-13. J Chem Inf Model 2013; 53:509-18. [PMID: 23297797 DOI: 10.1021/ci300513m] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.0] [Indexed: 01/21/2023]
Abstract
The MQN-mapplet is a Java application giving access to the structure of small molecules in large databases via color-coded maps of their chemical space. These maps are projections from a 42-dimensional property space defined by 42 integer value descriptors called molecular quantum numbers (MQN), which count different categories of atoms, bonds, polar groups, and topological features and categorize molecules by size, rigidity, and polarity. Despite its simplicity, MQN-space is relevant to biological activities. The MQN-mapplet allows localization of any molecule on the color-coded images, visualization of the molecules, and identification of analogs as neighbors on the MQN-map or in the original 42-dimensional MQN-space. No query molecule is necessary to start the exploration, which may be particularly attractive for nonchemists. To our knowledge, this type of interactive exploration tool is unprecedented for very large databases such as PubChem and GDB-13 (almost one billion molecules). The application is freely available for download at www.gdb.unibe.ch.
Affiliation(s)
- Mahendra Awale
- Department of Chemistry and Biochemistry, NCCR TransCure, University of Berne, Freiestrasse 3, 3012 Berne, Switzerland
|
18
|
Takigawa I, Mamitsuka H. Graph mining: procedure, application to drug discovery and recent advances. Drug Discov Today 2012; 18:50-7. [PMID: 22889967 DOI: 10.1016/j.drudis.2012.07.016] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Received: 11/07/2011] [Revised: 04/20/2012] [Accepted: 07/26/2012] [Indexed: 10/28/2022]
Abstract
Combinatorial chemistry has generated chemical libraries and databases with a huge number of chemical compounds, which include prospective drugs. Chemical structures of compounds can be represented as molecular graphs, to which a variety of graph-based techniques in computer science, specifically graph mining, can be applied. The most basic way of analyzing molecular graphs is to use structural fragments, so-called subgraphs in graph theory. The mainstream technique in graph mining is frequent subgraph mining, by which we can retrieve essential subgraphs in given molecular graphs. In this article we explain the idea and procedure of mining frequent subgraphs from given molecular graphs, present some real applications, and describe recent advances in graph mining.
Affiliation(s)
- Ichigaku Takigawa
- Bioinformatics Center, Institute for Chemical Research Kyoto University, Gokasho, Uji 6110011, Japan.
|
19
|
Cheng T, Li Q, Wang Y, Bryant SH. Identifying compound-target associations by combining bioactivity profile similarity search and public databases mining. J Chem Inf Model 2011; 51:2440-8. [PMID: 21834535 PMCID: PMC3180241 DOI: 10.1021/ci200192v] [Citation(s) in RCA: 58] [Impact Index Per Article: 4.1] [Indexed: 12/14/2022]
Abstract
Molecular target identification is of central importance to drug discovery. Here, we developed a computational approach, named bioactivity profile similarity search (BASS), for associating targets to small molecules by using the known target annotations of related compounds from public databases. To evaluate BASS, a bioactivity profile database was constructed using 4296 compounds that were commonly tested in the US National Cancer Institute 60 human tumor cell line anticancer drug screen (NCI-60). Each compound was used as a query to search against the entire bioactivity profile database, and reference compounds with similar bioactivity profiles above a threshold of 0.75 were considered as neighbor compounds of the query. Potential targets were subsequently linked to the identified neighbor compounds by using the known targets of the query compound. About 45% of the predicted compound-target associations were successfully verified retrospectively, suggesting the possible application of BASS in identifying the targets of uncharacterized compounds and thus providing insight into the study of promiscuity and polypharmacology. Furthermore, BASS identified a significant fraction of structurally diverse compounds with similar bioactivities, indicating its feasibility of “scaffold hopping” in searching novel molecules against the target of interest.
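The neighbor search and target transfer described in this abstract can be sketched as follows. This is a minimal sketch under stated assumptions, not the published BASS code: the abstract does not say which similarity measure underlies the 0.75 threshold, so Pearson correlation is an assumption here, as are the function names `pearson`, `neighbors`, and `predict_targets` and the toy profiles.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (assumed similarity measure)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / (norm_x * norm_y)


def neighbors(query_id, profiles, threshold=0.75):
    """Compounds whose profile similarity to the query exceeds the threshold."""
    query = profiles[query_id]
    return [
        cid
        for cid, profile in profiles.items()
        if cid != query_id and pearson(query, profile) > threshold
    ]


def predict_targets(profiles, known_targets, threshold=0.75):
    """Link each annotated query's known targets to its neighbor compounds."""
    predicted = {}
    for query_id, targets in known_targets.items():
        for cid in neighbors(query_id, profiles, threshold):
            predicted.setdefault(cid, set()).update(targets)
    return predicted


if __name__ == "__main__":
    # Toy activity profiles across four hypothetical cell lines.
    profiles = {"a": [1, 2, 3, 4], "b": [2, 4, 6, 8.5], "c": [4, 3, 2, 1]}
    print(predict_targets(profiles, {"a": {"T1"}}))
```

In this toy data, compound `b` tracks `a` closely and inherits its target, while `c` is anti-correlated and is left unannotated.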
Affiliation(s)
- Tiejun Cheng
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, United States
|
20
|
Varin T, Schuffenhauer A, Ertl P, Renner S. Mining for bioactive scaffolds with scaffold networks: improved compound set enrichment from primary screening data. J Chem Inf Model 2011; 51:1528-38. [PMID: 21615076 DOI: 10.1021/ci2000924] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.2] [Indexed: 11/29/2022]
Abstract
Identification of meaningful chemical patterns in the increasing amounts of high-throughput-generated bioactivity data available today is an increasingly important challenge for successful drug discovery. Herein, we present the scaffold network as a novel approach for mapping and navigation of chemical and biological space. A scaffold network represents the chemical space of a library of molecules consisting of all molecular scaffolds and smaller "parent" scaffolds generated therefrom by the pruning of rings, effectively leading to a network of common scaffold substructure relationships. This algorithm provides an extension of the scaffold tree algorithm that, instead of a network, generates a tree relationship between a heuristically rule-based selected subset of parent scaffolds. The approach was evaluated for the identification of statistically significantly active scaffolds from primary screening data for which the scaffold tree approach has already been shown to be successful. Because of the exhaustive enumeration of smaller scaffolds and the full enumeration of relationships between them, about twice as many statistically significantly active scaffolds were identified compared to the scaffold-tree-based approach. We suggest visualizing scaffold networks as islands of active scaffolds.
Affiliation(s)
- Thibault Varin
- Novartis Institutes for BioMedical Research, Forum 1, Novartis Campus, CH-4056 Basel, Switzerland
|