1
Warapande V, Meng F, Bozan A, Graff DE, Fromer JC, Mughal K, Mohideen FK, Shivangi, Paruchuri S, Johnston ML, Sharma P, Crea TR, Rudraraju RS, George A, Folvar C, Nelson AM, Neiditch MB, Zimmerman MD, Coley CW, Freundlich JS. Identification of Antituberculars with Favorable Potency and Pharmacokinetics through Structure-Based and Ligand-Based Modeling. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2025:2025.02.03.636334. [PMID: 39974961 PMCID: PMC11838534 DOI: 10.1101/2025.02.03.636334] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/21/2025]
Abstract
Drug discovery is inherently challenged by a multiple criteria decision making problem. The arduous path from hit discovery through lead optimization and preclinical candidate selection necessitates the evolution of a plethora of molecular properties. In this study, we focus on the hit discovery phase while beginning to address multiple criteria critical to the development of novel therapeutics to treat Mycobacterium tuberculosis infection. We develop a hybrid structure- and ligand-based pipeline for nominating diverse inhibitors targeting the β-ketoacyl synthase KasA by employing a Bayesian optimization-guided docking method and an ensemble model for compound nominations based on machine learning models for in vitro antibacterial efficacy, as characterized by minimum inhibitory concentration (MIC), and mouse pharmacokinetic (PK) plasma exposure. The application of our pipeline to the Enamine HTS library of 2.1M molecules resulted in the selection of 93 compounds, the experimental validation of which revealed exceptional PK (41%) and MIC (19%) success rates. Twelve compounds meet hit-like criteria in terms of MIC and PK profile and represent promising seeds for future drug discovery programs.
2
Jamir L, P H. Employing Machine Learning Models to Predict Potential α-Glucosidase Inhibitory Plant Secondary Metabolites Targeting Type-2 Diabetes and Their In Vitro Validation. J Chem Inf Model 2024; 64:9150-9162. [PMID: 39352297 DOI: 10.1021/acs.jcim.4c00955] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/11/2024]
Abstract
The need for new antidiabetic drugs is evident, considering the ongoing global burden of type-2 diabetes mellitus despite notable progress in drug discovery from laboratory research to clinical application. This study aimed to build machine learning (ML) models to predict potential α-glucosidase inhibitors based on a data set comprising over 537 reported plant secondary metabolite (PSM) α-glucosidase inhibitors. We assessed 35 ML models by using seven different fingerprints. The random forest with the RDKit fingerprint was the best-performing model, with an accuracy (ACC) of 83.74% and an area under the ROC curve (AUC) of 0.803. The resulting robust ML model encompasses all reported α-glucosidase inhibitory PSMs. The model was employed to predict potential α-glucosidase inhibitors from an in-house database of 5810 PSMs. The model identified 965 PSMs with a predicted activity ≥0.90 for α-glucosidase inhibition. Twenty-four predicted PSMs were subjected to in vitro assay, and 13 were found to inhibit α-glucosidase with IC50 values ranging from 0.63 to 7 mg/mL. Among them, seven compounds recorded IC50 values lower than that of the standard drug acarbose and were further investigated and found to have optimal drug-likeness and medicinal chemistry characteristics. The ML model and in vitro experiments have identified nervonic acid as a promising α-glucosidase inhibitor. This compound should be further investigated for its potential integration into the diabetes treatment system.
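The two headline metrics in this abstract (ACC and AUC) and the ≥0.90 selection cutoff can be illustrated with a minimal pure-Python sketch. It is not the authors' code: the function names `roc_auc`, `accuracy`, and `select_candidates`, the compound names, and the toy scores are all hypothetical, and the AUC is computed with the standard pairwise-ranking definition rather than any particular library.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive is scored higher; tied pairs count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def select_candidates(names, scores, cutoff=0.90):
    """Keep compounds whose predicted activity meets the cutoff
    (0.90 is the cutoff quoted in the abstract)."""
    return [name for name, s in zip(names, scores) if s >= cutoff]


# Toy data (hypothetical): 1 = inhibitor, 0 = non-inhibitor.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.95, 0.80, 0.60, 0.55, 0.30, 0.10]
names = ["cpd_a", "cpd_b", "cpd_c", "cpd_d", "cpd_e", "cpd_f"]

auc = roc_auc(labels, scores)
acc = accuracy(labels, scores)
hits = select_candidates(names, scores)
```

On real data the scores would come from a trained classifier's predicted probabilities; the quadratic pairwise loop is only suitable for small sets.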
Affiliation(s)
- Lemnaro Jamir
- Centre for Rural Development and Technology, Indian Institute of Technology Delhi, New Delhi 110016, India
- Hariprasad P
- Centre for Rural Development and Technology, Indian Institute of Technology Delhi, New Delhi 110016, India
3
Liu Q, He D, Fan M, Wang J, Cui Z, Wang H, Mi Y, Li N, Meng Q, Hou Y. Prediction and Interpretation Microglia Cytotoxicity by Machine Learning. J Chem Inf Model 2024; 64:9306-9326. [PMID: 38949724 DOI: 10.1021/acs.jcim.4c00366] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/02/2024]
Abstract
Ameliorating microglia-mediated neuroinflammation is a crucial strategy in developing new drugs for neurodegenerative diseases. Plant compounds are an important screening target for the discovery of drugs for the treatment of neurodegenerative diseases. However, due to the spatial complexity of phytochemicals, it becomes particularly important to evaluate the effectiveness of compounds while avoiding the inclusion of cytotoxic substances in the early stages of compound screening. Traditional high-throughput screening methods suffer from high cost and low efficiency. A computational model based on machine learning provides a novel avenue for cytotoxicity determination. In this study, a microglia cytotoxicity classifier was developed using a machine learning approach. First, we proposed a data splitting strategy based on the Murcko generic scaffold of each molecule. Under this split, three machine learning approaches were coupled with three kinds of molecular representation methods to construct microglia cytotoxicity classifiers, which were then compared and assessed by predictive accuracy, balanced accuracy, F1-score, and Matthews correlation coefficient. Then, a dimension reduction method, recursive feature elimination integrated with a support vector machine (RFE-SVC), was applied to the high-dimensional molecular fingerprints to further improve model performance. Among all the microglial cytotoxicity classifiers, the SVM coupled with the ECFP4 fingerprint after feature selection (ECFP4-RFE-SVM) obtained the most accurate classification for the test set (ACC of 0.99, BA of 0.99, F1-score of 0.99, MCC of 0.97). Finally, the Shapley additive explanations (SHAP) method was used to interpret the microglia cytotoxicity classifier, and key substructure SMARTS were identified as structural alerts. Experimental results show that ECFP4-RFE-SVM has reliable classification capability for microglia cytotoxicity, and that SHAP can not only provide a rational explanation for microglia cytotoxicity predictions but also offer a guideline for subsequent modifications to reduce molecular cytotoxicity.
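Two ideas from this abstract lend themselves to a short sketch: splitting data so that no scaffold appears in both training and test sets, and computing the four reported metrics (ACC, BA, F1, MCC) from a confusion matrix. The code below is an illustration under stated assumptions, not the authors' implementation. The scaffold keys are assumed to be precomputed strings (for example, generic Murcko scaffold SMILES from a cheminformatics toolkit), the greedy largest-group-first assignment is one common convention, and the function names and toy data are hypothetical.

```python
import math
from collections import defaultdict


def scaffold_split(scaffolds, test_fraction=0.2):
    """Assign whole scaffold groups to train or test so that no scaffold
    is shared between the two sets. Groups are placed largest first."""
    groups = defaultdict(list)
    for idx, key in enumerate(scaffolds):
        groups[key].append(idx)
    # Sort by descending group size, then by key for determinism.
    ordered = sorted(groups.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    max_train = len(scaffolds) - round(len(scaffolds) * test_fraction)
    train, test = [], []
    for _, members in ordered:
        if len(train) + len(members) <= max_train:
            train.extend(members)
        else:
            test.extend(members)
    return train, test


def classification_metrics(y_true, y_pred):
    """Accuracy, balanced accuracy, F1, and Matthews correlation
    coefficient for binary labels (1 = cytotoxic, 0 = not)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    f1_denom = 2 * tp + fp + fn
    mcc_denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "ACC": (tp + tn) / len(pairs),
        "BA": (sensitivity + specificity) / 2,
        "F1": 2 * tp / f1_denom if f1_denom else 0.0,
        "MCC": (tp * tn - fp * fn) / mcc_denom if mcc_denom else 0.0,
    }


# Toy data (hypothetical): one scaffold key per molecule.
scaffolds = ["A", "A", "A", "B", "B", "C", "C", "D", "E", "A"]
train_idx, test_idx = scaffold_split(scaffolds, test_fraction=0.2)

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
```

Because whole groups are assigned together, the realized test fraction can deviate from the requested one when scaffold groups are large.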
Affiliation(s)
- Qing Liu
- College of Information Science and Engineering, State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University, Shenyang 110819, P. R. China
- Dakuo He
- College of Information Science and Engineering, State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University, Shenyang 110819, P. R. China
- Mengmeng Fan
- College of Information Science and Engineering, State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University, Shenyang 110819, P. R. China
- Jinpeng Wang
- College of Information Science and Engineering, State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University, Shenyang 110819, P. R. China
- Zeyu Cui
- College of Information Science and Engineering, State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University, Shenyang 110819, P. R. China
- Hao Wang
- College of Information Science and Engineering, State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University, Shenyang 110819, P. R. China
- Yan Mi
- Key Laboratory of Bioresource Research and Development of Liaoning Province, College of Life and Health Sciences, National Frontiers Science Center for Industrial Intelligence and Systems Optimization, Key Laboratory of Data Analytics and Optimization for Smart Industry, Ministry of Education, Northeastern University, Shenyang 110169, P. R. China
- Ning Li
- School of Traditional Chinese Materia Medica, Key Laboratory for TCM Material Basis Study and Innovative Drug Development of Shenyang City, Shenyang Pharmaceutical University, Shenyang 110016, P. R. China
- Qingqi Meng
- Key Laboratory of Bioresource Research and Development of Liaoning Province, College of Life and Health Sciences, National Frontiers Science Center for Industrial Intelligence and Systems Optimization, Key Laboratory of Data Analytics and Optimization for Smart Industry, Ministry of Education, Northeastern University, Shenyang 110169, P. R. China
- Yue Hou
- Key Laboratory of Bioresource Research and Development of Liaoning Province, College of Life and Health Sciences, National Frontiers Science Center for Industrial Intelligence and Systems Optimization, Key Laboratory of Data Analytics and Optimization for Smart Industry, Ministry of Education, Northeastern University, Shenyang 110169, P. R. China
4
Obrezanova O. Artificial intelligence for compound pharmacokinetics prediction. Curr Opin Struct Biol 2023; 79:102546. [PMID: 36804676 DOI: 10.1016/j.sbi.2023.102546] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 09/15/2022] [Revised: 01/04/2023] [Accepted: 01/13/2023] [Indexed: 02/17/2023]
Abstract
Optimisation of compound pharmacokinetics (PK) is an integral part of drug discovery and development. Animal in vivo PK data as well as human and animal in vitro systems are routinely utilised to evaluate PK in humans. In recent years machine learning and artificial intelligence (AI) emerged as a major tool for modelling of in vivo animal and human PK, enabling prediction from chemical structure early in drug discovery, and therefore offering opportunities to guide the design and prioritisation of molecules based on relevant in vivo properties and, ultimately, predicting human PK at the point of design. This review presents recent advances in machine learning and AI models for in vivo animal and human PK for small-molecule compounds as well as some examples for antibody therapeutics.
Affiliation(s)
- Olga Obrezanova
- Imaging and Data Analytics, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, Cambridge, CB4 0WJ, UK.
5
Machine Learning Models to Predict Protein-Protein Interaction Inhibitors. MOLECULES (BASEL, SWITZERLAND) 2022; 27:molecules27227986. [PMID: 36432086 PMCID: PMC9694076 DOI: 10.3390/molecules27227986] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/20/2022] [Revised: 11/09/2022] [Accepted: 11/16/2022] [Indexed: 11/19/2022]
Abstract
Protein-protein interaction (PPI) inhibitors have an increasing role in drug discovery. It is hypothesized that machine learning (ML) algorithms can classify or identify PPI inhibitors. This work describes the performance of different algorithms and molecular fingerprints used in chemoinformatics to develop a classification model to identify PPI inhibitors, making the code freely available to the community, particularly medicinal chemistry research groups working with PPI inhibitors. We found that classification algorithms perform differently depending on the features employed in the training process. Random forest (RF) models with the extended connectivity fingerprint of radius 2 (ECFP4) had the best classification ability compared with models trained with ECFP6 or MACCS keys (166 bits). In general, logistic regression (LR) models had lower performance metrics than RF models, but ECFP4 was the most appropriate representation for LR. ECFP4 also generated models with high performance metrics with support vector machines (SVM). We also constructed ensemble models based on the top-performing models. As part of this work, and to help non-computational experts, we developed freely available pipeline code.
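The abstract does not say how the ensemble models combine their members, so the following is only a generic sketch of two common schemes: soft voting (averaging predicted probabilities) and hard voting (majority of thresholded labels). The function names, the 0.5 threshold, and the per-model probabilities labeled RF, LR, and SVM are hypothetical values chosen to show that the two schemes can disagree.

```python
def soft_vote(prob_lists, threshold=0.5):
    """Average each compound's probability across models, then threshold."""
    n_models = len(prob_lists)
    n_samples = len(prob_lists[0])
    if any(len(p) != n_samples for p in prob_lists):
        raise ValueError("all models must score the same compounds")
    mean_probs = [
        sum(p[i] for p in prob_lists) / n_models for i in range(n_samples)
    ]
    labels = [1 if m >= threshold else 0 for m in mean_probs]
    return mean_probs, labels


def hard_vote(prob_lists, threshold=0.5):
    """Threshold each model separately, then take a strict majority."""
    n_models = len(prob_lists)
    n_samples = len(prob_lists[0])
    if any(len(p) != n_samples for p in prob_lists):
        raise ValueError("all models must score the same compounds")
    labels = []
    for i in range(n_samples):
        votes = sum(1 for p in prob_lists if p[i] >= threshold)
        labels.append(1 if 2 * votes > n_models else 0)
    return labels


# Toy predicted probabilities (hypothetical) for three compounds.
rf_probs = [0.9, 0.2, 0.6]
lr_probs = [0.7, 0.4, 0.3]
svm_probs = [0.8, 0.1, 0.5]

mean_probs, soft_labels = soft_vote([rf_probs, lr_probs, svm_probs])
hard_labels = hard_vote([rf_probs, lr_probs, svm_probs])
```

For the third compound, two of three models vote active, so hard voting labels it active, while its averaged probability falls below 0.5 and soft voting labels it inactive. Which behavior is preferable depends on how well calibrated the member models are.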
6
Mughal H, Bell EC, Mughal K, Derbyshire ER, Freundlich JS. Random Forest Model Predictions Afford Dual-Stage Antimalarial Agents. ACS Infect Dis 2022; 8:1553-1562. [PMID: 35894649 PMCID: PMC9987178 DOI: 10.1021/acsinfecdis.2c00189] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/28/2022]
Abstract
The need for novel antimalarials is apparent given the continuing disease burden worldwide, despite significant drug discovery advances from the bench to the bedside. In particular, small-molecule agents with potent efficacy against both the liver and blood stages of Plasmodium parasite infection are critical for clinical settings as they would simultaneously prevent and treat malaria with a reduced selection pressure for resistance. While experimental screens for such dual-stage inhibitors have been conducted, the time and cost of these efforts limit their scope. Here, we have focused on leveraging machine learning approaches to discover novel antimalarials with such properties. A random forest modeling approach was taken to predict small molecules with in vitro efficacy versus liver-stage Plasmodium berghei parasites and a lack of human liver cell cytotoxicity. Empirical validation of the model was achieved with the realization of hits with liver-stage efficacy after prospective scoring of a commercial diversity library and consideration of structural diversity. A subset of these hits also demonstrated promising blood-stage Plasmodium falciparum efficacy. These 18 validated dual-stage antimalarials represent novel starting points for drug discovery and mechanism of action studies with significant potential for seeding a new generation of therapies.
Affiliation(s)
- Haseeb Mughal
- Department of Pharmacology, Physiology, and Neuroscience, Rutgers University – New Jersey Medical School, 185 South Orange Ave, Newark, NJ, 07103
- Elise C. Bell
- Department of Chemistry, Duke University, 124 Science Drive, Durham, NC 27708, USA
- Khadija Mughal
- Department of Pharmacology, Physiology, and Neuroscience, Rutgers University – New Jersey Medical School, 185 South Orange Ave, Newark, NJ, 07103
- Emily R. Derbyshire
- Department of Chemistry, Duke University, 124 Science Drive, Durham, NC 27708, USA
- Department of Molecular Genetics and Microbiology, Duke University Medical Center, 213 Research Drive, Durham, NC 27710, USA
- Joel S. Freundlich
- Department of Pharmacology, Physiology, and Neuroscience, Rutgers University – New Jersey Medical School, 185 South Orange Ave, Newark, NJ, 07103
- Department of Medicine, Center for Emerging and Re-emerging Pathogens, Rutgers University – New Jersey Medical School, Newark, NJ, 07103