1
López-Pérez K, Miranda-Quintana RA. iCliff Taylor's Version: Robust and Efficient Activity Cliff Determination. J Chem Inf Model 2025. [PMID: 40400300 DOI: 10.1021/acs.jcim.5c00506] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/23/2025]
Abstract
Activity cliffs represent an important challenge in cheminformatics and drug design. One of the most common indicators used to quantify them is the Structure-Activity Landscape Index (SALI). Here, we expose the mathematical limitations of SALI's formulation, the most evident being that it is undefined when the similarity between two molecules is one. We show how a simple Taylor series can address this main problem, yielding a well-defined expression that captures the ranking information of the original SALI. The second issue is the quadratic complexity of using SALI to describe the roughness of the activity landscape of a set. Here, we propose iCliff, an indicator that quantifies this roughness in linear complexity. For this, we leverage the iSIM framework to obtain the average similarity of the set and a rearrangement to obtain the average of the squared property differences. The calculations for 30 different AC-focused databases suggest that there is a strong correlation between iCliff and the average of the pairwise Taylor-series SALI values. To further explore the individual effect of removing each molecule from the activity landscape, we propose complementary iCliff. With this tool, we were able to identify the molecules that form a high number of activity cliffs with the rest of the molecules in the set.
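The two ideas in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The pairwise SALI is taken in its usual form, |A_i - A_j| / (1 - sim). The function `taylor_sali` and its `order` parameter are assumptions made for illustration: they replace 1 / (1 - sim) with a truncated geometric series, and the abstract does not state which truncation the paper uses. The linear-time step relies only on the identity sum over i<j of (p_i - p_j)^2 = N * sum(p_i^2) - (sum p_i)^2. The iSIM average similarity is not reproduced here.

```python
from itertools import combinations


def sali(a_i, a_j, sim):
    """Classic pairwise SALI; raises ZeroDivisionError when sim == 1."""
    return abs(a_i - a_j) / (1.0 - sim)


def taylor_sali(a_i, a_j, sim, order=3):
    """Assumed surrogate: 1/(1 - sim) replaced by 1 + sim + ... + sim**order."""
    series = sum(sim ** k for k in range(order + 1))
    return abs(a_i - a_j) * series


def mean_sq_diff_pairwise(props):
    """O(N^2) reference: mean of (p_i - p_j)^2 over all pairs i < j."""
    pairs = list(combinations(props, 2))
    return sum((p - q) ** 2 for p, q in pairs) / len(pairs)


def mean_sq_diff_linear(props):
    """O(N) version using N * sum(p^2) - (sum p)^2 for the pairwise sum."""
    n = len(props)
    s1 = sum(props)
    s2 = sum(p * p for p in props)
    return (n * s2 - s1 * s1) / (n * (n - 1) / 2)
```

The surrogate stays finite for identical molecules (sim = 1), where the classic expression fails, and the linear routine returns the same average as the explicit loop over pairs.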
Affiliation(s)
- Kenneth López-Pérez
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, Florida 32611, United States
- Ramón Alain Miranda-Quintana
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, Florida 32611, United States

2
Serrano-Morrás Á, Westermaier Y, Majewski M, Barril X. The Quasi-Bound State as a Predictor of Relative Binding Free Energy. J Chem Inf Model 2025. [PMID: 40392679 DOI: 10.1021/acs.jcim.5c00289] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/22/2025]
Abstract
Relative binding free energy (ΔΔGbind) predictions have become the main approach to evaluate the potency of a congeneric series of compounds. They are enabled by alchemical transformations coupled to free energy methods, tools that have become essential in drug design. Yet, they are computationally expensive and are limited to small compound sets and relatively simple transformations. The ever-increasing size of virtual screening databases demands faster methods to assess virtual hits. Here, we show that the structural robustness of protein-ligand complexes, measured as the free energy necessary to reach a quasi-bound state (ΔGQB) by Dynamic Undocking (DUck), is well suited to detect outliers in the structure-activity continuum (i.e., activity cliffs), which are particularly challenging for knowledge-based approaches. On different congeneric series of HSP90α, CDK2, and BACE1 inhibitors, we demonstrate that ΔGQB can deliver excellent predictions. Despite the local nature of the measurement, these are in some cases comparable to the much more computationally demanding alchemical transformation methods. We find that for systems following a one-step dissociation model, ΔGQB informs about the free energy of the transition state, allowing us to predict relative binding kinetics and, when the series present relatively constant on-rates, also ΔΔGbind. This work has important implications for drug discovery, as it shows that within a well-defined applicability domain, high-throughput computational dissociation studies can deliver ΔΔGbind predictions that compare well with rigorous alchemical transformation methods at a fraction of the cost.
Affiliation(s)
- Álvaro Serrano-Morrás
- Facultat de Farmàcia and Institut de Biomedicina, Universitat de Barcelona, Av. Joan XXIII, 27-31, Barcelona 08028, Spain
- Yvonne Westermaier
- Facultat de Farmàcia and Institut de Biomedicina, Universitat de Barcelona, Av. Joan XXIII, 27-31, Barcelona 08028, Spain
- Maciej Majewski
- Facultat de Farmàcia and Institut de Biomedicina, Universitat de Barcelona, Av. Joan XXIII, 27-31, Barcelona 08028, Spain
- Xavier Barril
- Facultat de Farmàcia and Institut de Biomedicina, Universitat de Barcelona, Av. Joan XXIII, 27-31, Barcelona 08028, Spain
- Catalan Institution for Research and Advanced Studies (ICREA), Passeig Lluís Companys 23, Barcelona 08010, Spain

3
Li K, Wu Y, Li Y, Guo Y, Kong Y, Wang Y, Liang Y, Fan Y, Huang L, Zhang R, Zhou F. AMPCliff: Quantitative definition and benchmarking of activity cliffs in antimicrobial peptides. J Adv Res 2025:S2090-1232(25)00292-9. [PMID: 40318764 DOI: 10.1016/j.jare.2025.04.046] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2025] [Revised: 04/09/2025] [Accepted: 04/29/2025] [Indexed: 05/07/2025] Open
Abstract
INTRODUCTION An activity cliff (AC) is a phenomenon in which a pair of similar molecules differ by a small structural alteration but exhibit a large difference in their biochemical activities. This phenomenon affects various tasks ranging from virtual screening to lead optimization in drug development. The AC of small molecules has been extensively investigated, but limited knowledge has accumulated about the AC phenomenon in pharmaceutical peptides with canonical amino acids. OBJECTIVES This study introduces AMPCliff, a quantitative definition and benchmarking framework for the AC phenomenon in antimicrobial peptides (AMPs) composed of canonical amino acids. METHODS This study establishes a benchmark dataset of paired AMPs in Staphylococcus aureus from the publicly available AMP dataset GRAMPA, and conducts a rigorous procedure to evaluate various AMP AC prediction models, including nine machine learning algorithms, four deep learning algorithms, four masked language models, and four generative language models. RESULTS A comprehensive analysis of the existing AMP dataset reveals a significant prevalence of AC within AMPs. AMPCliff quantifies the activities of AMPs by the minimum inhibitory concentration (MIC), and defines 0.9 as the minimum threshold for the normalized BLOSUM62 similarity score between a pair of aligned peptides with at least two-fold MIC changes. Our analysis reveals that these models are capable of detecting AMP AC events and that the pre-trained protein language model ESM2 demonstrates superior performance across the evaluations. The predictive performance for AMP activity cliffs remains to be further improved, considering that ESM2 with 33 layers only achieves a Spearman correlation coefficient of 0.4669 for the regression task of the -log(MIC) values on the benchmark dataset. CONCLUSION Our findings highlight limitations in current deep learning-based representation models. To more accurately capture the properties of antimicrobial peptides (AMPs), it is essential to integrate atomic-level dynamic information that reflects their underlying mechanisms of action.
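The pair criterion stated in the Results can be sketched as a small Python check. This is an illustrative reading of the abstract, not the AMPCliff code. The function name and its parameters are hypothetical, and the normalized BLOSUM62 similarity is assumed to be computed elsewhere and passed in, because the abstract does not describe the alignment or the normalization.

```python
def is_amp_activity_cliff(similarity, mic_a, mic_b,
                          sim_threshold=0.9, fold_threshold=2.0):
    """Flag a peptide pair as an activity cliff (hypothetical helper).

    similarity: precomputed normalized BLOSUM62 score of the aligned pair.
    mic_a, mic_b: minimum inhibitory concentrations, in the same units.
    """
    if mic_a <= 0 or mic_b <= 0:
        raise ValueError("MIC values must be positive")
    # Fold change is taken as the larger MIC over the smaller one,
    # so the order of the two peptides does not matter.
    fold_change = max(mic_a, mic_b) / min(mic_a, mic_b)
    return similarity >= sim_threshold and fold_change >= fold_threshold
```

For example, a pair with similarity 0.95 and MICs of 4 and 16 is flagged, while the same pair with MICs of 4 and 6 is not.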
Affiliation(s)
- Kewei Li
- College of Computer Science and Technology, and Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, China
- Yuqian Wu
- School of Software, Jilin University, Changchun 130012 Jilin, China
- Yinheng Li
- Department of Computer Science, Columbia University, 116th and Broadway, New York City, NY 10027, United States
- Yutong Guo
- School of Life Sciences, Jilin University, Changchun 130012 Jilin, China
- Yanwen Kong
- College of Computer Science and Technology, and Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, China
- Yan Wang
- School of Computer Engineering, Changchun University of Engineering, Changchun 130103 Jilin, China
- Yiyang Liang
- Changchun Wenli High School, Changchun 130062 Jilin, China
- Yusi Fan
- College of Computer Science and Technology, and Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, China
- Lan Huang
- College of Computer Science and Technology, and Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, China
- Ruochi Zhang
- School of Artificial Intelligence, and Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012 Jilin, China.
- Fengfeng Zhou
- College of Computer Science and Technology, and Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, China; School of Biology and Engineering, Guizhou Medical University, Guiyang 550025 Guizhou, China.

4
López-Pérez K, Miranda-Quintana RA. iCliff Taylor's version: Robust and Efficient Activity Cliff Determination. bioRxiv 2025:2025.03.09.642269. [PMID: 40161667 PMCID: PMC11952408 DOI: 10.1101/2025.03.09.642269] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/02/2025]
Abstract
Activity cliffs represent an important challenge in cheminformatics and drug design. One of the most common indicators used to quantify them is the Structure-Activity Landscape Index (SALI). Here, we expose the mathematical limitations of SALI's formulation, the most evident being that it is undefined when the similarity between two molecules is one. We show how a simple Taylor series can address this main problem, yielding a well-defined expression that captures the ranking information of the original SALI. The second issue is the quadratic complexity of using SALI to describe the roughness of the activity landscape of a set. Here, we propose iCliff, an indicator that quantifies this roughness in linear complexity. For this, we leverage the iSIM framework to obtain the average similarity of the set and a rearrangement to obtain the average of the squared property differences. The calculations for 30 different AC-focused databases suggest that there is a strong correlation between iCliff and the average of the pairwise Taylor-series SALI values. To further explore the individual effect of removing each molecule from the activity landscape, we propose complementary iCliff. With this tool, we were able to identify the molecules that form a high number of activity cliffs with the rest of the molecules in the set.
Affiliation(s)
- Kenneth López-Pérez
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL 32611, USA

5
Feng B, Yu H, Dong X, Díaz-Holguín A, Hu H. Identification of bioactive compounds with popular single-atom modifications: Comprehensive analysis and implications for compound design. Eur J Med Chem 2025; 283:117051. [PMID: 39631098 DOI: 10.1016/j.ejmech.2024.117051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2024] [Revised: 10/28/2024] [Accepted: 11/11/2024] [Indexed: 12/07/2024]
Abstract
The extensive bioactivity data available in public databases, such as ChEMBL, have facilitated in-depth structure-activity relationship (SAR) analyses, which are essential for understanding the impact of molecular modifications on biological activity in a comprehensive manner. A central strategy in SAR analysis is the assessment of molecular similarity. Several approaches preferred by medicinal chemists have been developed to efficiently capture structurally related compounds on a large scale. We previously introduced four types of single-atom modifications (SAMs), a popular molecular editing strategy in hit-to-lead and lead optimization processes, as a chemical similarity criterion and conducted a systematic analysis of their application in compound design. In this study, we expanded the analysis to cover 10 common SAMs, including carbon-nitrogen (N↔C), O↔C, N↔O, S↔O, as well as simpler modifications such as OH↔H, CH3↔H, and halogen-hydrogen (F, Cl, Br, I↔H) exchanges. Leveraging high-confidence bioactivity data from ChEMBL (version 34), we assembled a comprehensive dataset comprising 374,979 SAM pairs. Following an evaluation of the frequency of these SAM types in medicinal chemistry efforts, we focused on SAM-induced activity cliffs (ACs), yielding over 7400 ACs and substantially expanding the current knowledge base of ACs associated with single-atom changes. Furthermore, structural analysis of these ACs, supported by experimental data, provides critical insights into the role of single-atom modifications in modulating compound activity, offering practical guidance for the structure-based optimization of molecular properties in drug development. As a result, we are providing open access to all identified ACs along with their associated structural information.
Affiliation(s)
- Bo Feng
- Department of Pharmacy, The Affiliated Hospital of Yangzhou University, Yangzhou University, Yangzhou, 225000, PR China
- Hui Yu
- Information School, University of Sheffield, 211 Portobello, Sheffield, S1 4DP, UK
- Xu Dong
- Department of Pharmacy, The Affiliated Hospital of Yangzhou University, Yangzhou University, Yangzhou, 225000, PR China
- Alejandro Díaz-Holguín
- Science for Life Laboratory, Department of Cell and Molecular Biology, Uppsala University, BMC, Box 596, SE-751 24, Uppsala, Sweden
- Huabin Hu
- Department of Pharmacy, The Affiliated Hospital of Yangzhou University, Yangzhou University, Yangzhou, 225000, PR China; Science for Life Laboratory, Department of Cell and Molecular Biology, Uppsala University, BMC, Box 596, SE-751 24, Uppsala, Sweden; Centre for Cancer Drug Discovery, Division of Cancer Therapeutics, The Institute of Cancer Research, London, UK.

6
de O. Viana J, Weber KC, da Cruz LEG, Santos RDO, Rocha GB, Jordão AK, Barbosa EG. In Silico Structural Insights and Potential Inhibitor Identification Based on the Benzothiazole Core for Targeting Leishmania major Pteridine Reductase 1. ACS Omega 2025; 10:306-317. [PMID: 39829523 PMCID: PMC11740253 DOI: 10.1021/acsomega.4c06146] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2024] [Revised: 11/23/2024] [Accepted: 12/11/2024] [Indexed: 01/22/2025]
Abstract
Leishmaniasis is reported as the second most common protozoonosis, with the highest prevalence and mortality rate. Among the Leishmania drug targets, Pteridine Reductase 1 of Leishmania major (LmPTR1) proved to be promising because Leishmania is auxotrophic for folates. Thus, this study employed a combination of ligand- and structure-based approaches to screen new benzothiazole compounds as LmPTR1 inhibitor candidates. Initially, a highly predictive quantitative structure-activity relationship (QSAR) model was able to identify the relevant hybrid descriptors, with an accuracy of over 93%. Insights into the mechanism of action indicated Phe113, His241, Leu188, Met183, and Leu226 as key residues. New commercially available compounds were screened using QSAR, docking, and pharmacokinetic properties as filters. Molecular dynamics, non-covalent interactions analysis, and quantum chemical calculation of binding enthalpy demonstrated that the lead compound (ZINC 72229720) forms a stable complex with LmPTR1, indicating it as a new promising LmPTR1 inhibitor.
Affiliation(s)
- Jéssika de O. Viana
- Department of Chemistry, Federal University of Paraíba, João Pessoa 58051-900, Brazil
- Karen C. Weber
- Department of Chemistry, Federal University of Paraíba, João Pessoa 58051-900, Brazil
- Luiz E. G. da Cruz
- Department of Chemistry, Federal University of Paraíba, João Pessoa 58051-900, Brazil
- Rhayane de O. Santos
- Department of Chemistry, Federal University of Paraíba, João Pessoa 58051-900, Brazil
- Gerd B. Rocha
- Department of Chemistry, Federal University of Paraíba, João Pessoa 58051-900, Brazil
- Alessandro K. Jordão
- Department of Pharmacy, Federal University of Rio Grande do Norte, General Cordeiro de Farias Street, CEP, 59012-570 Natal, RN, Brazil
- Euzébio G. Barbosa
- Department of Pharmacy, Federal University of Rio Grande do Norte, General Cordeiro de Farias Street, CEP, 59012-570 Natal, RN, Brazil

7
López-Pérez K, Avellaneda-Tamayo JF, Chen L, López-López E, Juárez-Mercado KE, Medina-Franco JL, Miranda-Quintana RA. Molecular similarity: Theory, applications, and perspectives. Artificial Intelligence Chemistry 2024; 2:100077. [PMID: 40124654 PMCID: PMC11928018 DOI: 10.1016/j.aichem.2024.100077] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Indexed: 03/25/2025]
Abstract
Molecular similarity pervades much of our understanding and rationalization of chemistry. This has become particularly evident in the current data-intensive era of chemical research, with similarity measures serving as the backbone of many Machine Learning (ML) supervised and unsupervised procedures. Here, we present a discussion on the role of molecular similarity in drug design, chemical space exploration, chemical "art" generation, molecular representations, and many more. We also discuss more recent topics in molecular similarity, like the ability to efficiently compare large molecular libraries.
Affiliation(s)
- Kenneth López-Pérez
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL 32611, USA
- Juan F. Avellaneda-Tamayo
- DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Avenida Universidad 3000, Mexico City 04510, Mexico
- Lexin Chen
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL 32611, USA
- Edgar López-López
- DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Avenida Universidad 3000, Mexico City 04510, Mexico
- Department of Chemistry and Graduate Program in Pharmacology, Center for Research and Advanced Studies of the National Polytechnic Institute, Section 14-740, Mexico City 07000, Mexico
- K. Eurídice Juárez-Mercado
- DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Avenida Universidad 3000, Mexico City 04510, Mexico
- José L. Medina-Franco
- DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Avenida Universidad 3000, Mexico City 04510, Mexico

8
Zhao B, Xu W, Guan J, Zhou S. Molecular property prediction based on graph structure learning. Bioinformatics 2024; 40:btae304. [PMID: 38710497 PMCID: PMC11112045 DOI: 10.1093/bioinformatics/btae304] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2023] [Revised: 04/06/2024] [Accepted: 05/03/2024] [Indexed: 05/08/2024] Open
Abstract
MOTIVATION Molecular property prediction (MPP) is a fundamental but challenging task in the computer-aided drug discovery process. A growing number of recent works employ graph-based models for MPP and have achieved considerable progress in improving prediction performance. However, current models often ignore relationships between molecules, which could also be helpful for MPP. RESULTS To this end, in this article we propose a graph structure learning (GSL) based MPP approach, called GSL-MPP. Specifically, we first apply a graph neural network (GNN) over molecular graphs to extract molecular representations. Then, with molecular fingerprints, we construct a molecule similarity graph (MSG). Following that, we conduct GSL on the MSG, i.e. molecule-level GSL, to get the final molecular embeddings, which are the result of fusing the GNN-encoded molecular representations with the relationships among molecules. That is, the approach combines both intra-molecule and inter-molecule information. Finally, we use these molecular embeddings to perform MPP. Extensive experiments on 10 benchmark datasets show that our method achieves state-of-the-art performance in most cases, especially on classification tasks. Further visualization studies also demonstrate the good molecular representations of our method. AVAILABILITY AND IMPLEMENTATION Source code is available at https://github.com/zby961104/GSL-MPP.
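The step of building a molecule similarity graph from fingerprints can be sketched as follows. This Python snippet is an assumption-laden illustration, not the GSL-MPP implementation. Fingerprints are represented as sets of on-bit indices, similarity is the Tanimoto coefficient, and edges are kept above a fixed `threshold`; the abstract does not say how edges are chosen, and the graph structure learning step itself is not shown.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0


def similarity_graph(fingerprints, threshold=0.5):
    """Adjacency list linking molecules whose similarity meets the threshold.

    The threshold rule is an illustrative choice (hypothetical parameter).
    """
    n = len(fingerprints)
    graph = {i: [] for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if tanimoto(fingerprints[i], fingerprints[j]) >= threshold:
                graph[i].append(j)
                graph[j].append(i)
    return graph
```

With three toy fingerprints `{1, 2, 3}`, `{2, 3, 4}`, and `{10, 11}`, only the first two molecules are connected, since they share two of four distinct bits.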
Affiliation(s)
- Bangyi Zhao
- Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, Shanghai 200438, China
- Weixia Xu
- Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, Shanghai 200438, China
- Jihong Guan
- Department of Computer Science and Technology, Tongji University, Shanghai 201804, China
- Shuigeng Zhou
- Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, Shanghai 200438, China

9
Daoud S, Taha M. Protein characteristics substantially influence the propensity of activity cliffs among kinase inhibitors. Sci Rep 2024; 14:9058. [PMID: 38643174 PMCID: PMC11032345 DOI: 10.1038/s41598-024-59501-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2023] [Accepted: 04/11/2024] [Indexed: 04/22/2024] Open
Abstract
Activity cliffs (ACs) are pairs of structurally similar molecules with significantly different affinities for a biotarget, posing a challenge in computer-assisted drug discovery. This study focuses on protein kinases, significant therapeutic targets, with some exhibiting ACs while others do not despite numerous inhibitors. The hypothesis that the presence of ACs is dependent on the target protein and its complete structural context is explored. Machine learning models were developed to link protein properties to ACs, revealing specific tripeptide sequences and overall protein properties as critical factors in ACs occurrence. The study highlights the importance of considering the entire protein matrix rather than just the binding site in understanding ACs. This research provides valuable insights for drug discovery and design, paving the way for addressing ACs-related challenges in modern computational approaches.
Affiliation(s)
- Safa Daoud
- Department of Pharmaceutical Chemistry and Pharmacognosy, Faculty of Pharmacy, Applied Sciences Private University, Amman, Jordan.
- Mutasem Taha
- Department of Pharmaceutical Sciences, Faculty of Pharmacy, University of Jordan, Amman, Jordan.

10
Danishuddin, Malik MZ, Kashif M, Haque S, Kim JJ. Exploring chemical space, scaffold diversity, and activity landscape of spleen tyrosine kinase active inhibitors. SAR and QSAR in Environmental Research 2024; 35:325-342. [PMID: 38690773 DOI: 10.1080/1062936x.2024.2345618] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2024] [Accepted: 04/14/2024] [Indexed: 05/03/2024]
Abstract
This study aims to comprehensively characterize 576 inhibitors targeting Spleen Tyrosine Kinase (SYK), a non-receptor tyrosine kinase primarily found in haematopoietic cells, with significant relevance to B-cell receptor function. The objective is to gain insights into the structural requirements essential for potent activity, with implications for various therapeutic applications. Through chemoinformatic analyses, we focus on exploring the chemical space, scaffold diversity, and structure-activity relationships (SAR). By leveraging ECFP4 and MACCS fingerprints, we elucidate the relationships between chemical compounds and visualize the resulting network using RDKit and NetworkX. Additionally, compound clustering and visualization of the associated chemical space aid in understanding overall diversity. The outcomes include identifying consensus diversity patterns to assess global chemical space diversity. Furthermore, incorporating pairwise activity differences enhances the activity landscape visualization, revealing heterogeneous SAR patterns. The dataset analysed in this work has three activity cliff generators (CHEMBL3415598, CHEMBL4780257, and CHEMBL3265037): compounds with high affinity to SYK that are very similar to analogues with reasonable potency differences. Overall, this study provides a critical analysis of SYK inhibitors, uncovering potential scaffolds and chemical moieties crucial for their activity, thereby advancing the understanding of their therapeutic potential.
Affiliation(s)
- Danishuddin
- Department of Biotechnology, Yeungnam University, Gyeongsan, Republic of Korea
- M Z Malik
- Department of Genetics and Bioinformatics, Dasman Diabetes Institute (DDI), Dasman, Kuwait
- M Kashif
- School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi, India
- S Haque
- Research and Scientific Studies Unit, College of Nursing and Health Sciences, Jazan University, Jazan, Saudi Arabia
- Centre of Medical and Bio-Allied Health Sciences Research, Ajman University, Ajman, United Arab Emirates
- J J Kim
- Department of Biotechnology, Yeungnam University, Gyeongsan, Republic of Korea

11
Martinez-Mayorga K, Rosas-Jiménez JG, Gonzalez-Ponce K, López-López E, Neme A, Medina-Franco JL. The pursuit of accurate predictive models of the bioactivity of small molecules. Chem Sci 2024; 15:1938-1952. [PMID: 38332817 PMCID: PMC10848664 DOI: 10.1039/d3sc05534e] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 10/18/2023] [Accepted: 01/09/2024] [Indexed: 02/10/2024] Open
Abstract
Property prediction is a key interest in chemistry. For several decades there has been a continued and incremental development of mathematical models to predict properties. As more data is generated and accumulated, there seem to be more areas of opportunity to develop models with increased accuracy. The same is true if one considers the large developments in machine and deep learning models. However, along with these areas of opportunity and development, issues and challenges remain and, with more data, new challenges emerge, such as the quality, quantity, and reliability of the data, and model reproducibility. Herein, we discuss the status of the accuracy of predictive models and present the authors' perspective on the direction of the field, emphasizing good practices. We focus on predictive models of bioactive properties of small molecules relevant for drug discovery, agrochemical, food chemistry, natural product research, and related fields.
Affiliation(s)
- Karina Martinez-Mayorga
- Institute of Chemistry, Merida Unit, National Autonomous University of Mexico Merida-Tetiz Highway, Km. 4.5 Ucu Yucatan Mexico
- Institute for Applied Mathematics and Systems, Merida Research Unit, National Autonomous University of Mexico Sierra Papacal Merida Yucatan Mexico
- José G Rosas-Jiménez
- Department of Theoretical Biophysics, IMPRS on Cellular Biophysics Max-von-Laue Strasse 3 Frankfurt am Main 60438 Germany
- Karla Gonzalez-Ponce
- Institute of Chemistry, Merida Unit, National Autonomous University of Mexico Merida-Tetiz Highway, Km. 4.5 Ucu Yucatan Mexico
- Edgar López-López
- Department of Chemistry and Graduate Program in Pharmacology, Center for Research and Advanced Studies of the National Polytechnic Institute Mexico City 07000 Mexico
- DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry National Autonomous University of Mexico Mexico City 04510 Mexico
- Antonio Neme
- Institute for Applied Mathematics and Systems, Merida Research Unit, National Autonomous University of Mexico Sierra Papacal Merida Yucatan Mexico
- José L Medina-Franco
- DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry National Autonomous University of Mexico Mexico City 04510 Mexico

12
Bertin P, Rector-Brooks J, Sharma D, Gaudelet T, Anighoro A, Gross T, Martínez-Peña F, Tang EL, Suraj MS, Regep C, Hayter JBR, Korablyov M, Valiante N, van der Sloot A, Tyers M, Roberts CES, Bronstein MM, Lairson LL, Taylor-King JP, Bengio Y. RECOVER identifies synergistic drug combinations in vitro through sequential model optimization. Cell Reports Methods 2023; 3:100599. [PMID: 37797618 PMCID: PMC10626197 DOI: 10.1016/j.crmeth.2023.100599] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/03/2022] [Revised: 08/30/2023] [Accepted: 09/06/2023] [Indexed: 10/07/2023]
Abstract
For large libraries of small molecules, exhaustive combinatorial chemical screens become infeasible to perform when considering a range of disease models, assay conditions, and dose ranges. Deep learning models have achieved state-of-the-art results in silico for the prediction of synergy scores. However, databases of drug combinations are biased toward synergistic agents and results do not generalize out of distribution. During 5 rounds of experimentation, we employ sequential model optimization with a deep learning model to select drug combinations increasingly enriched for synergism and active against a cancer cell line, evaluating only ∼5% of the total search space. Moreover, we find that learned drug embeddings (using structural information) begin to reflect biological mechanisms. In silico benchmarking suggests search queries are ∼5-10× enriched for highly synergistic drug combinations by using sequential rounds of evaluation when compared with random selection or ∼3× when using a pretrained model.
Affiliation(s)
- Paul Bertin
- Mila, the Quebec AI Institute, Montreal, QC, Canada
- Eileen L Tang
- Department of Chemistry, The Scripps Research Institute, La Jolla, CA, USA
- Almer van der Sloot
- IRIC, Institute for Research in Immunology and Cancer, Université de Montréal, Montreal, QC, Canada
- Mike Tyers
- Program in Molecular Medicine, Peter Gilgan Centre for Research and Learning, The Hospital for Sick Children, 686 Bay Street, Toronto, ON M5G 0A4, Canada
- Michael M Bronstein
- Relation Therapeutics, London, UK; Department of Computer Science, University of Oxford, Oxford, UK
- Luke L Lairson
- Department of Chemistry, The Scripps Research Institute, La Jolla, CA, USA

13
Schür C, Gasser L, Perez-Cruz F, Schirmer K, Baity-Jesi M. A benchmark dataset for machine learning in ecotoxicology. Sci Data 2023; 10:718. [PMID: 37853023 PMCID: PMC10584858 DOI: 10.1038/s41597-023-02612-2] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 05/31/2023] [Accepted: 09/28/2023] [Indexed: 10/20/2023] Open
Abstract
The use of machine learning for predicting ecotoxicological outcomes is promising but underutilized. The curation of data with informative features requires both expertise in machine learning and a strong biological and ecotoxicological background, which we consider a barrier to entry for this kind of research. Additionally, model performances can only be compared across studies when the same dataset, cleaning, and splittings were used. Therefore, we provide ADORE, an extensive and well-described dataset on acute aquatic toxicity in three relevant taxonomic groups (fish, crustaceans, and algae). The core dataset describes ecotoxicological experiments and is expanded with phylogenetic and species-specific data on the species as well as chemical properties and molecular representations. Apart from challenging other researchers to try and achieve the best model performances across the whole dataset, we propose specific relevant challenges on subsets of the data and include datasets and splittings corresponding to each of these challenges as well as in-depth characterization and discussion of train-test splitting approaches.
Affiliation(s)
- Christoph Schür
- Eawag, Swiss Federal Institute of Aquatic Science and Technology, Dübendorf, Switzerland.
| | - Lilian Gasser
- Swiss Data Science Center (SDSC), Zürich, Switzerland
| | - Fernando Perez-Cruz
- Swiss Data Science Center (SDSC), Zürich, Switzerland
- ETH Zürich: Department of Computer Science, Zürich, Switzerland
| | - Kristin Schirmer
- Eawag, Swiss Federal Institute of Aquatic Science and Technology, Dübendorf, Switzerland
- ETH Zürich: Department of Environmental Systems Science, Zürich, Switzerland
- EPF Lausanne, School of Architecture, Civil and Environmental Engineering, Lausanne, Switzerland
| | - Marco Baity-Jesi
- Eawag, Swiss Federal Institute of Aquatic Science and Technology, Dübendorf, Switzerland
| |
|
14
|
Han R, Yoon H, Kim G, Lee H, Lee Y. Revolutionizing Medicinal Chemistry: The Application of Artificial Intelligence (AI) in Early Drug Discovery. Pharmaceuticals (Basel) 2023; 16:1259. [PMID: 37765069 PMCID: PMC10537003 DOI: 10.3390/ph16091259] [Citation(s) in RCA: 26] [Impact Index Per Article: 13.0] [Received: 07/27/2023] [Revised: 08/24/2023] [Accepted: 09/04/2023] [Indexed: 09/29/2023] Open
Abstract
Artificial intelligence (AI) has permeated various sectors, including the pharmaceutical industry and research, where it has been utilized to efficiently identify new chemical entities with desirable properties. The application of AI algorithms to drug discovery presents both remarkable opportunities and challenges. This review article focuses on the transformative role of AI in medicinal chemistry. We delve into the applications of machine learning and deep learning techniques in drug screening and design, discussing their potential to expedite the early drug discovery process. In particular, we provide a comprehensive overview of the use of AI algorithms in predicting protein structures, drug-target interactions, and molecular properties such as drug toxicity. While AI has accelerated the drug discovery process, data quality issues and technological constraints remain challenges. Nonetheless, new relationships and methods have been unveiled, demonstrating AI's expanding potential in predicting and understanding drug interactions and properties. For its full potential to be realized, interdisciplinary collaboration is essential. This review underscores AI's growing influence on the future trajectory of medicinal chemistry and stresses the importance of ongoing synergies between computational and domain experts.
Affiliation(s)
| | | | | | | | - Yoonji Lee
- College of Pharmacy, Chung-Ang University, Seoul 06974, Republic of Korea
| |
|
15
|
Dablander M, Hanser T, Lambiotte R, Morris GM. Exploring QSAR models for activity-cliff prediction. J Cheminform 2023; 15:47. [PMID: 37069675 PMCID: PMC10107580 DOI: 10.1186/s13321-023-00708-w] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 01/26/2023] [Accepted: 03/10/2023] [Indexed: 04/19/2023] Open
Abstract
INTRODUCTION AND METHODOLOGY: Pairs of similar compounds that only differ by a small structural modification but exhibit a large difference in their binding affinity for a given target are known as activity cliffs (ACs). It has been hypothesised that QSAR models struggle to predict ACs and that ACs thus form a major source of prediction error. However, the AC-prediction power of modern QSAR methods and its quantitative relationship to general QSAR-prediction performance is still underexplored. We systematically construct nine distinct QSAR models by combining three molecular representation methods (extended-connectivity fingerprints, physicochemical-descriptor vectors and graph isomorphism networks) with three regression techniques (random forests, k-nearest neighbours and multilayer perceptrons); we then use each resulting model to classify pairs of similar compounds as ACs or non-ACs and to predict the activities of individual molecules in three case studies: dopamine receptor D2, factor Xa, and SARS-CoV-2 main protease. RESULTS AND CONCLUSIONS: Our results provide strong support for the hypothesis that indeed QSAR models frequently fail to predict ACs. We observe low AC-sensitivity amongst the evaluated models when the activities of both compounds are unknown, but a substantial increase in AC-sensitivity when the actual activity of one of the compounds is given. Graph isomorphism features are found to be competitive with or superior to classical molecular representations for AC-classification and can thus be employed as baseline AC-prediction models or simple compound-optimisation tools. For general QSAR-prediction, however, extended-connectivity fingerprints still consistently deliver the best performance amongst the tested input representations. A potential future pathway to improve QSAR-modelling performance might be the development of techniques to increase AC-sensitivity.
Affiliation(s)
- Markus Dablander
- Mathematical Institute, University of Oxford, Andrew Wiles Building, Radcliffe Observatory Quarter (550), Woodstock Road, Oxford, OX2 6GG, UK
| | - Thierry Hanser
- Lhasa Limited, Granary Wharf House, 2 Canal Wharf, Leeds, LS11 5PS, UK
| | - Renaud Lambiotte
- Mathematical Institute, University of Oxford, Andrew Wiles Building, Radcliffe Observatory Quarter (550), Woodstock Road, Oxford, OX2 6GG, UK
| | - Garrett M Morris
- Department of Statistics, University of Oxford, 24-29 St Giles', Oxford, OX1 3LB, UK.
| |
|
16
|
Chiodi D, Ishihara Y. "Magic Chloro": Profound Effects of the Chlorine Atom in Drug Discovery. J Med Chem 2023; 66:5305-5331. [PMID: 37014977 DOI: 10.1021/acs.jmedchem.2c02015] [Citation(s) in RCA: 81] [Impact Index Per Article: 40.5] [Indexed: 04/06/2023]
Abstract
Chlorine is one of the most common atoms present in small-molecule drugs beyond carbon, hydrogen, nitrogen, and oxygen. There are currently more than 250 FDA-approved chlorine-containing drugs, yet the beneficial effect of the chloro substituent has not yet been reviewed. The seemingly simple substitution of a hydrogen atom (R = H) with a chlorine atom (R = Cl) can result in remarkable improvements in potency of up to 100,000-fold and can lead to profound effects on pharmacokinetic parameters including clearance, half-life, and drug exposure in vivo. Following the literature terminology of the "magic methyl effect" in drugs, the term "magic chloro effect" has been coined herein. Although reports of 500-fold or 1000-fold potency improvements are often serendipitous discoveries that can be considered "magical" rather than planned, hypotheses made to explain the magic chloro effect can lead to lessons that accelerate the cycle of drug discovery.
Affiliation(s)
- Debora Chiodi
- Department of Chemistry, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037, United States
| | - Yoshihiro Ishihara
- Department of Chemistry, Vividion Therapeutics, 5820 Nancy Ridge Drive, San Diego, California 92121, United States
| |
|
17
|
Béquignon OJM, Bongers BJ, Jespers W, IJzerman AP, van der Water B, van Westen GJP. Papyrus: a large-scale curated dataset aimed at bioactivity predictions. J Cheminform 2023; 15:3. [PMID: 36609528 PMCID: PMC9824924 DOI: 10.1186/s13321-022-00672-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2022] [Accepted: 12/17/2022] [Indexed: 01/07/2023] Open
Abstract
With the ongoing rapid growth of publicly available ligand-protein bioactivity data, there is a trove of valuable data that can be used to train a plethora of machine-learning algorithms. However, not all data is equal in terms of size and quality and a significant portion of researchers' time is needed to adapt the data to their needs. On top of that, finding the right data for a research question can often be a challenge on its own. To meet these challenges, we have constructed the Papyrus dataset. Papyrus is comprised of around 60 million data points. This dataset contains multiple large publicly available datasets such as ChEMBL and ExCAPE-DB combined with several smaller datasets containing high-quality data. The aggregated data has been standardised and normalised in a manner that is suitable for machine learning. We show how data can be filtered in a variety of ways and also perform some examples of quantitative structure-activity relationship analyses and proteochemometric modelling. Our ambition is that this pruned data collection constitutes a benchmark set that can be used for constructing predictive models, while also providing an accessible data source for research.
Affiliation(s)
- O. J. M. Béquignon
- Division of Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden University, Leiden, The Netherlands
| | - B. J. Bongers
- Division of Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden University, Leiden, The Netherlands
| | - W. Jespers
- Division of Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden University, Leiden, The Netherlands
| | - A. P. IJzerman
- Division of Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden University, Leiden, The Netherlands
| | - B. van der Water
- Division of Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden University, Leiden, The Netherlands
| | - G. J. P. van Westen
- Division of Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden University, Leiden, The Netherlands
| |
|
18
|
Isomeric Activity Cliffs-A Case Study for Fluorine Substitution of Aminergic G Protein-Coupled Receptor Ligands. Molecules 2023; 28:490. [PMID: 36677547 PMCID: PMC9863698 DOI: 10.3390/molecules28020490] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/18/2022] [Revised: 12/30/2022] [Accepted: 01/01/2023] [Indexed: 01/06/2023] Open
Abstract
Currently, G protein-coupled receptors (GPCRs) constitute a significant group of membrane-bound receptors representing more than 30% of therapeutic targets. Fluorine is commonly used in designing highly active biological compounds, as evidenced by the steadily increasing number of drugs approved by the Food and Drug Administration (FDA). Herein, we identified and analyzed 898 target-based F-containing isomeric analog sets for SAR analysis in the ChEMBL database (FiSAR sets), active against 33 different aminergic GPCRs and comprising a total of 2163 fluorinated (1201 unique) compounds. We found that 30 FiSAR sets contain activity cliffs (ACs), defined as pairs of structurally similar compounds showing significant differences in affinity (≥50-fold change), where the change of fluorine position may lead to up to a 1300-fold change in potency. The analysis of matched molecular pair (MMP) networks indicated that the fluorination of aromatic rings showed no clear trend toward a positive or negative effect on affinity. Additionally, we propose an in silico workflow (including induced-fit docking, molecular dynamics, quantum polarized ligand docking, and binding free energy calculations based on the Generalized-Born Surface-Area (GBSA) model) to score the fluorine positions in the molecule.
|
19
|
van Tilborg D, Alenicheva A, Grisoni F. Exposing the Limitations of Molecular Machine Learning with Activity Cliffs. J Chem Inf Model 2022; 62:5938-5951. [PMID: 36456532 PMCID: PMC9749029 DOI: 10.1021/acs.jcim.2c01073] [Citation(s) in RCA: 66] [Impact Index Per Article: 22.0] [Received: 08/23/2022] [Indexed: 12/03/2022]
Abstract
Machine learning has become a crucial tool in drug discovery and chemistry at large, e.g., to predict molecular properties, such as bioactivity, with high accuracy. However, activity cliffs (pairs of molecules that are highly similar in their structure but exhibit large differences in potency) have received limited attention for their effect on model performance. Not only are these edge cases informative for molecule discovery and optimization but also models that are well equipped to accurately predict the potency of activity cliffs have increased potential for prospective applications. Our work aims to fill the current knowledge gap on best-practice machine learning methods in the presence of activity cliffs. We benchmarked a total of 24 machine and deep learning approaches on curated bioactivity data from 30 macromolecular targets for their performance on activity cliff compounds. While all methods struggled in the presence of activity cliffs, machine learning approaches based on molecular descriptors outperformed more complex deep learning methods. Our findings highlight large case-by-case differences in performance, advocating for (a) the inclusion of dedicated "activity-cliff-centered" metrics during model development and evaluation and (b) the development of novel algorithms to better predict the properties of activity cliffs. To this end, the methods, metrics, and results of this study have been encapsulated into an open-access benchmarking platform named MoleculeACE (Activity Cliff Estimation, available on GitHub at: https://github.com/molML/MoleculeACE). MoleculeACE is designed to steer the community toward addressing the pressing but overlooked limitation of molecular machine learning models posed by activity cliffs.
Affiliation(s)
- Derek van Tilborg
- Institute for Complex Molecular Systems and Dept. Biomedical Engineering, Eindhoven University of Technology, 5612AZ Eindhoven, The Netherlands
- Centre for Living Technologies, Alliance TU/e, WUR, UU, UMC Utrecht, 3584CB Utrecht, The Netherlands
| | | | - Francesca Grisoni
- Institute for Complex Molecular Systems and Dept. Biomedical Engineering, Eindhoven University of Technology, 5612AZ Eindhoven, The Netherlands
- Centre for Living Technologies, Alliance TU/e, WUR, UU, UMC Utrecht, 3584CB Utrecht, The Netherlands
| |
|
20
|
López-López E, Fernández-de Gortari E, Medina-Franco JL. Yes SIR! On the structure-inactivity relationships in drug discovery. Drug Discov Today 2022; 27:2353-2362. [PMID: 35561964 DOI: 10.1016/j.drudis.2022.05.005] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 02/02/2022] [Revised: 04/09/2022] [Accepted: 05/05/2022] [Indexed: 12/12/2022]
Abstract
In analogy with structure-activity relationships (SARs), which are at the core of medicinal chemistry, studying structure-inactivity relationships (SIRs) is essential to understanding and predicting biological activity. Current computational methods should predict or distinguish 'activity' and 'inactivity' with the same confidence because both concepts are complementary. However, the lack of inactivity data, in particular in the public domain, limits the development of predictive models and their broad application. In this review, we encourage the scientific community to disclose and analyze high-confidence activity data considering both the labeled 'active' and 'inactive' compounds.
Affiliation(s)
- Edgar López-López
- DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico; Department of Chemistry and Graduate Program in Pharmacology, Center for Research and Advanced Studies of the National Polytechnic Institute, Mexico City 07000, Mexico.
| | - Eli Fernández-de Gortari
- Department of Nanosafety, International Iberian Nanotechnology Laboratory, Braga 4715-330, Portugal
| | - José L Medina-Franco
- DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico.
| |
|
21
|
Mousa LA, Hatmal MM, Taha M. Exploiting activity cliffs for building pharmacophore models and comparison with other pharmacophore generation methods: sphingosine kinase 1 as case study. J Comput Aided Mol Des 2022; 36:39-62. [PMID: 35059939 DOI: 10.1007/s10822-021-00435-0] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 07/14/2021] [Accepted: 11/24/2021] [Indexed: 12/20/2022]
Abstract
Activity cliffs (ACs) are defined as closely analogous compounds with significant affinity discrepancies against a certain biotarget. In this paper we propose to use AC pair(s) for extracting valid binding pharmacophores through exposing corresponding protein complexes to stochastic deformation/relaxation followed by applying genetic algorithm/machine learning (GA-ML) for selecting optimal pharmacophore(s) that best classify a long list of inhibitors. We compared the performances of ligand-based and structure-based pharmacophores with counterparts generated by this newly introduced technique. Sphingosine kinase 1 (SPHK-1) was used as a case study. SPHK-1 is a lipid kinase that plays a pivotal role in the regulation of a variety of biological processes including cell growth, apoptosis, and inflammation. The new approach proved to yield pharmacophore and ML models of comparable accuracies to established ligand-based and structure-based pharmacophores. The resulting pharmacophores and ML models were used to capture hits from the National Cancer Institute list of compounds and predict their bioactivity categories. Two hits of novel chemotypes showed selective and low micromolar inhibitory IC50 values against SPHK-1.
Affiliation(s)
- Lubabah A Mousa
- Department of Pharmaceutical Sciences, Faculty of Pharmacy, University of Jordan, Amman, 11942, Jordan
| | - Ma'mon M Hatmal
- Department of Medical Laboratory Sciences, Faculty of Applied Medical Sciences, The Hashemite University, PO Box 330127, Zarqa, 13133, Jordan
| | - Mutasem Taha
- Department of Pharmaceutical Sciences, Faculty of Pharmacy, University of Jordan, Amman, 11942, Jordan.
| |
|
22
|
Congenericity of Claimed Compounds in Patent Applications. Molecules 2021; 26:5253. [PMID: 34500686 PMCID: PMC8433967 DOI: 10.3390/molecules26175253] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/23/2021] [Revised: 08/17/2021] [Accepted: 08/18/2021] [Indexed: 12/04/2022] Open
Abstract
A method is presented to analyze quantitatively the degree of congenericity of claimed compounds in patent applications. The approach successfully differentiates patents exemplified with highly congeneric compounds of a structurally compact and well defined chemical series from patents containing a more diverse set of compounds around a more vaguely described patent claim. An application to 750 common patents available in SureChEMBL, SureChEMBLccs and ChEMBL is presented and the congenericity of patent compounds in those different sources discussed.
|
23
|
Medina-Franco JL, Sánchez-Cruz N, López-López E, Díaz-Eufracio BI. Progress on open chemoinformatic tools for expanding and exploring the chemical space. J Comput Aided Mol Des 2021; 36:341-354. [PMID: 34143323 PMCID: PMC8211976 DOI: 10.1007/s10822-021-00399-1] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 05/10/2021] [Accepted: 06/14/2021] [Indexed: 01/10/2023]
Abstract
The concept of chemical space is a cornerstone in chemoinformatics, and it has broad conceptual and practical applicability in many areas of chemistry, including drug design and discovery. One of the most considerable impacts is in the study of structure-property relationships where the property can be a biological activity or any other characteristic of interest to a particular chemistry discipline. The chemical space is highly dependent on the molecular representation that is also a cornerstone concept in computational chemistry. Herein, we discuss the recent progress on chemoinformatic tools developed to expand and characterize the chemical space of compound data sets using different types of molecular representations, generate visual representations of such spaces, and explore structure-property relationships in the context of chemical spaces. We emphasize the development of methods and freely available tools focusing on drug discovery applications. We also comment on the general advantages and shortcomings of using freely available and easy-to-use tools and discuss the value of using such open resources for research, education, and scientific dissemination.
Affiliation(s)
- José L Medina-Franco
- DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, 04510, Mexico City, Mexico.
| | - Norberto Sánchez-Cruz
- DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, 04510, Mexico City, Mexico
| | - Edgar López-López
- DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, 04510, Mexico City, Mexico; Departamento de Química y Programa de Posgrado en Farmacología, Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional, Apartado 14-740, 07000, Mexico City, Mexico
| | - Bárbara I Díaz-Eufracio
- DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, 04510, Mexico City, Mexico
| |
|
24
|
Cheminformatic Profiling and Hit Prioritization of Natural Products with Activities against Methicillin-Resistant Staphylococcus aureus (MRSA). Molecules 2021; 26:3674. [PMID: 34208597 PMCID: PMC8246317 DOI: 10.3390/molecules26123674] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 03/15/2021] [Revised: 04/28/2021] [Accepted: 05/08/2021] [Indexed: 12/14/2022] Open
Abstract
Several natural products (NPs) have displayed varying in vitro activities against methicillin-resistant Staphylococcus aureus (MRSA). However, few of these compounds have been developed into potential antimicrobial drug candidates. This may be due to the high cost and the tedious, time-consuming process of conducting the necessary preclinical tests on these compounds. In this study, cheminformatic profiling was performed on 111 anti-MRSA NPs (AMNPs), using a few orally administered conventional drugs for MRSA (CDs) as reference, to identify compounds with prospects to become drug candidates. This was followed by prioritizing these hits and identifying the liabilities among the AMNPs for possible optimization. Cheminformatic profiling revealed that most of the AMNPs were within the required drug-like region of the investigated properties. For example, more than 76% of the AMNPs showed compliance with the Lipinski, Veber, and Egan predictive rules for oral absorption and permeability. About 34% of the AMNPs showed the prospect to penetrate the blood–brain barrier (BBB), an advantage over the CDs, which are generally non-permeant of BBB. The analysis of toxicity revealed that 59% of the AMNPs might have negligible or no toxicity risks. Structure–activity relationship (SAR) analysis revealed chemical groups that may be determinants of the reported bioactivity of the compounds. A hit prioritization strategy using a novel “desirability scoring function” was able to identify AMNPs with the desired drug-likeness. Hit optimization strategies implemented on AMNPs with poor desirability scores led to the design of two compounds with improved desirability scores.
|