1
Chutia H, Borah GS, Mahanta HJ, Nagamani S. BoostDILI: Extreme Gradient Boost-Powered Drug-Induced Liver Injury Prediction and Structural Alerts Generation. Chem Res Toxicol 2025; 38:865-876. [PMID: 40241442 DOI: 10.1021/acs.chemrestox.4c00532] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/18/2025]
Abstract
Over the past 60 years, drug-induced liver injury (DILI) has played a key role in the withdrawal of marketed drugs due to safety concerns. Early prediction of DILI is crucial for developing safer pharmaceuticals, yet current in vitro and in vivo testing methods are complex and cumbersome. In this study, we developed an extreme gradient boosting (XGB)-powered machine learning (ML) model for DILI prediction. Comparing various DILI prediction models is challenging because they rely on different public data sets. We comprehensively evaluated the proposed BoostDILI model to address two crucial questions: 1. Can insights derived from public data sets help in DILI prediction for Food and Drug Administration (FDA) approved drugs? 2. Can we generate structural alerts to improve the model's explainability? To address the first question, we developed a DILI prediction model using four publicly available data sets. This effort led to the creation of the BoostDILI model, which achieved a 5-fold CV accuracy of 0.70. A sequential feature selection method was employed to identify relevant descriptors. This model integrates feature-level representations derived from RDKit (12 features) and Mordred (23 features). Bayesian statistics was applied to identify high-performance substructures iteratively, and a structural alerts model was developed to address the second question. The developed model was further validated with two FDA-approved drug data sets, DILIst and DILIRank. The BoostDILI model offers a trustworthy solution for evaluating the DILI risk in preclinical research. The structural alerts help in identifying the substructures that may be responsible for DILI. The data set and the source code are available at https://github.com/Naga270588/BoostDILI.
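The abstract mentions a sequential feature selection step but does not say whether it ran forward or backward, or which score guided it. As a minimal sketch, assuming greedy forward selection, the idea can be illustrated as follows. The descriptor names, the `toy_score` function, and all numbers are hypothetical stand-ins and are not taken from the paper; in practice the score would be something like the cross-validated accuracy of the classifier.

```python
from typing import Callable, List, Sequence


def sequential_forward_selection(
    candidates: Sequence[str],
    score_fn: Callable[[List[str]], float],
    n_select: int,
) -> List[str]:
    """Greedily add the feature that gives the best score until n_select are chosen."""
    selected: List[str] = []
    remaining = list(candidates)
    while remaining and len(selected) < n_select:
        # Try each remaining feature together with the ones already chosen.
        best = max(remaining, key=lambda f: score_fn(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical per-descriptor usefulness and a redundancy penalty (illustrative only).
UTILITY = {"logP": 0.30, "TPSA": 0.25, "MolWt": 0.20, "nRing": 0.05}
REDUNDANT_PAIR = {"logP", "MolWt"}
PENALTY = 0.18


def toy_score(subset: List[str]) -> float:
    """Stand-in for a cross-validated model score on the given feature subset."""
    score = sum(UTILITY[f] for f in subset)
    if REDUNDANT_PAIR.issubset(subset):
        score -= PENALTY  # two correlated descriptors add little together
    return score


chosen = sequential_forward_selection(list(UTILITY), toy_score, n_select=3)
print(chosen)
```

The greedy loop evaluates each candidate in the context of the features already selected, which is why a descriptor that looks useful alone can be passed over when it is redundant with an earlier pick.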
Affiliation(s)
- Hillul Chutia
  - CSIR-North East Institute of Science and Technology, Jorhat 785006, India
- Gori Sankar Borah
  - School of Computer Science, The Assam Kaziranga University, Jorhat 785006, India
- Hridoy Jyoti Mahanta
  - CSIR-North East Institute of Science and Technology, Jorhat 785006, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad 201002, India
- Selvaraman Nagamani
  - CSIR-North East Institute of Science and Technology, Jorhat 785006, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad 201002, India
2
Rehman AU, Li M, Wu B, Ali Y, Rasheed S, Shaheen S, Liu X, Luo R, Zhang J. Role of artificial intelligence in revolutionizing drug discovery. FUNDAMENTAL RESEARCH 2025; 5:1273-1287. [PMID: 40528990 PMCID: PMC12167903 DOI: 10.1016/j.fmre.2024.04.021] [Citation(s) in RCA: 16] [Impact Index Per Article: 16.0] [Received: 12/30/2023] [Revised: 04/09/2024] [Accepted: 04/24/2024] [Indexed: 06/20/2025] Open
Abstract
The application of artificial intelligence (AI) in medicine, particularly through machine learning (ML), marked a significant progression in drug discovery. AI acts as a powerful catalyst in narrowing the gap between disease understanding and the identification of potential therapeutic agents. This review provides an inclusive summary of the latest advancements in AI and its application in drug discovery. We examine the various stages of the drug discovery process, starting from disease identification and encompassing diagnosis, target identification, screening, and lead discovery. AI's capability to analyze extensive datasets and discern patterns is essential in these stages, enhancing predictions and efficiencies in disease identification, drug discovery, and clinical trial management. The role of AI in expediting drug development is emphasized, highlighting its potential to analyze vast data volumes, thus reducing the time and costs associated with new drug market introduction. The importance of data quality, algorithm training, and ethical considerations, especially in patient data handling during clinical trials, is addressed. By considering these factors, AI promises to transform drug development, offering significant benefits to patients and society.
Affiliation(s)
- Ashfaq Ur Rehman
  - Medicinal Chemistry and Bioinformatics Center, Shanghai Jiao Tong University School of Medicine, Shanghai 200025, China
  - Departments of Molecular Biology and Biochemistry, University of California Irvine, Irvine, CA 92697, United States
- Mingyu Li
  - Medicinal Chemistry and Bioinformatics Center, Shanghai Jiao Tong University School of Medicine, Shanghai 200025, China
- Binjian Wu
  - Medicinal Chemistry and Bioinformatics Center, Shanghai Jiao Tong University School of Medicine, Shanghai 200025, China
- Yasir Ali
  - Institute of Chemistry, Slovak Academy of Sciences, 845 38 Bratislava, Slovakia
- Salman Rasheed
  - National Center for Bioinformatics, Quaid-e-Azam University, Islamabad 44000, Pakistan
- Sana Shaheen
  - Key Department of Biochemistry, Abdul Wali Khan University Mardan, Khyber-Pakhtunkhwa 23200, Pakistan
- Xinyi Liu
  - Medicinal Chemistry and Bioinformatics Center, Shanghai Jiao Tong University School of Medicine, Shanghai 200025, China
  - State Key Laboratory of Medical Genomics, National Research Center for Translational Medicine at Shanghai, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200025, China
- Ray Luo
  - Departments of Molecular Biology and Biochemistry, University of California Irvine, Irvine, CA 92697, United States
- Jian Zhang
  - Medicinal Chemistry and Bioinformatics Center, Shanghai Jiao Tong University School of Medicine, Shanghai 200025, China
  - State Key Laboratory of Medical Genomics, National Research Center for Translational Medicine at Shanghai, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200025, China
3
Kim S, Yang S, Jung J, Choi J, Kang M, Joo J. Psychedelic Drugs in Mental Disorders: Current Clinical Scope and Deep Learning-Based Advanced Perspectives. ADVANCED SCIENCE (WEINHEIM, BADEN-WURTTEMBERG, GERMANY) 2025; 12:e2413786. [PMID: 40112231 PMCID: PMC12005819 DOI: 10.1002/advs.202413786] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2024] [Revised: 02/13/2025] [Indexed: 03/22/2025]
Abstract
Mental disorders are a representative type of brain disorder, including anxiety, major depressive disorder (MDD), and autism spectrum disorder (ASD), that are caused by multiple etiologies, including genetic heterogeneity, epigenetic dysregulation, and aberrant morphological and biochemical conditions. Psychedelic drugs such as psilocybin and lysergic acid diethylamide (LSD) have been renewed as fascinating treatment options and have gradually demonstrated potential therapeutic effects in mental disorders. However, the multifaceted conditions of psychiatric disorders resulting from individuality, complex genetic interplay, and intricate neural circuits impact the systemic pharmacology of psychedelics, which disturbs the integration of mechanisms that may result in dissimilar medicinal efficiency. The precise prescription of psychedelic drugs remains unclear, and advanced approaches are needed to optimize drug development. Here, recent studies demonstrating the diverse pharmacological effects of psychedelics in mental disorders are reviewed, and emerging perspectives on structural function, the microbiota-gut-brain axis, and the transcriptome are discussed. Moreover, the applicability of deep learning is highlighted for the development of drugs on the basis of big data. These approaches may provide insight into pharmacological mechanisms and interindividual factors to enhance drug discovery and development for advanced precision medicine.
Affiliation(s)
- Sung‐Hyun Kim
  - Department of Pharmacy, College of Pharmacy, Hanyang University, Ansan, Gyeonggi‐do 15588, Republic of Korea
- Sumin Yang
  - Department of Pharmacy, College of Pharmacy, Hanyang University, Ansan, Gyeonggi‐do 15588, Republic of Korea
- Jeehye Jung
  - Department of Pharmacy, College of Pharmacy, Hanyang University, Ansan, Gyeonggi‐do 15588, Republic of Korea
- Jeonghyeon Choi
  - Department of Pharmacy, College of Pharmacy, Hanyang University, Ansan, Gyeonggi‐do 15588, Republic of Korea
- Mingon Kang
  - Department of Computer Science, University of Nevada, Las Vegas, NV 89154, USA
- Jae‐Yeol Joo
  - Department of Pharmacy, College of Pharmacy, Hanyang University, Ansan, Gyeonggi‐do 15588, Republic of Korea
4
OréMaldonado KA, Cuesta SA, Mora JR, Loroño MA, Paz JL. Discovering New Tyrosinase Inhibitors by Using In Silico Modelling, Molecular Docking, and Molecular Dynamics. Pharmaceuticals (Basel) 2025; 18:418. [PMID: 40143194 PMCID: PMC11946302 DOI: 10.3390/ph18030418] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/18/2024] [Revised: 03/09/2025] [Accepted: 03/13/2025] [Indexed: 03/28/2025] Open
Abstract
Background/Objectives: This study used in silico modelling to search for potential tyrosinase protein inhibitors from a database of different core structures for IC50 prediction. Methods: Four machine learning algorithms and topographical descriptors were tested for model construction. Results: A model based on multiple linear regression was the most robust, with only six descriptors, and was validated by the Tropsha test with statistical parameters R2 = 0.8687, Q2LOO = 0.8030, and Q2ext = 0.9151. From the screening of FDA-approved drugs and natural products, the pIC50 values for 15,424 structures were calculated. The applicability domain analysis covered 100% of the external dataset and 71.22% and 73.26% of the two screening datasets. Fifteen candidates with pIC50 above 7.6 were identified, with five structures proposed as potential tyrosinase enzyme inhibitors, which underwent ADME analysis. Conclusions: The molecular docking analysis was performed for the dataset used in the training-test process and for the fifteen structures from the screening dataset with potential pharmaceutical tyrosinase inhibition, followed by molecular dynamics studies for the top five candidates with the highest predicted pIC50 values. The new use of these five candidates in tyrosinase inhibition is highlighted based on their promising application in melanoma treatment.
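The leave-one-out statistic reported above (Q2LOO) is commonly defined as 1 - PRESS/TSS, where PRESS sums the squared errors of predictions made for each compound by a model fitted without it. The following is a minimal sketch under that assumption, using a single descriptor and ordinary least squares rather than the six-descriptor model of the paper; the data values are invented for illustration, and some authors use a slightly different denominator.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def total_sum_of_squares(ys):
    mean_y = sum(ys) / len(ys)
    return sum((y - mean_y) ** 2 for y in ys)


def r_squared(xs, ys):
    """Fitted R^2: every point is both used for fitting and for scoring."""
    a, b = fit_line(xs, ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return 1.0 - ss_res / total_sum_of_squares(ys)


def q2_loo(xs, ys):
    """Leave-one-out Q^2 = 1 - PRESS / TSS."""
    press = 0.0
    for i in range(len(xs)):
        # Refit without sample i, then predict sample i.
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a + b * xs[i])) ** 2
    return 1.0 - press / total_sum_of_squares(ys)


# Invented descriptor values and pIC50-like responses (illustration only).
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
activity = [2.1, 3.9, 6.2, 7.8, 10.1]
print(r_squared(descriptor, activity), q2_loo(descriptor, activity))
```

For least-squares models Q2LOO cannot exceed the fitted R2, which is consistent with the ordering of the two values in the abstract.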
Affiliation(s)
- Kevin A. OréMaldonado
  - Departamento Académico de Química Fisicoquímica, Facultad de Química e Ingeniería Química, Universidad Nacional Mayor de San Marcos, Lima 15081, Peru
- Sebastián A. Cuesta
  - Grupo de Química Computacional y Teórica (QCT-USFQ), Departamento de Ingeniería Química, Universidad San Francisco de Quito, Diego de Robles y Vía Interoceánica, Quito 170901, Ecuador
  - Department of Chemistry, Manchester Institute of Biotechnology, The University of Manchester, Manchester M1 7DN, UK
- José R. Mora
  - Grupo de Química Computacional y Teórica (QCT-USFQ), Departamento de Ingeniería Química, Universidad San Francisco de Quito, Diego de Robles y Vía Interoceánica, Quito 170901, Ecuador
- Marcos A. Loroño
  - Departamento Académico de Química Fisicoquímica, Facultad de Química e Ingeniería Química, Universidad Nacional Mayor de San Marcos, Lima 15081, Peru
- José L. Paz
  - Departamento Académico de Química Inorgánica, Facultad de Química e Ingeniería Química, Universidad Nacional Mayor de San Marcos, Lima 15081, Peru
5
García MC, Cuesta SA, Mora JR, Paz JL, Marrero-Ponce Y, Alexis F, Márquez EA. Using computer modeling to find new LRRK2 inhibitors for parkinson's disease. Sci Rep 2025; 15:4085. [PMID: 39900949 PMCID: PMC11790940 DOI: 10.1038/s41598-025-86926-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2024] [Accepted: 01/15/2025] [Indexed: 02/05/2025] Open
Abstract
Parkinson's disease (PD) is a complex neurodegenerative disorder that affects multiple neurotransmitters, and its exact cause is still unknown. Developing new drugs for PD is a lengthy and expensive process, making it difficult to find new treatments. This study aims to create a detailed dataset to build strong predictive models with various machine learning algorithms. An ensemble modeling approach was employed to screen the DrugBank database, aiming to repurpose approved medications as potential treatments for PD. The dataset was constructed using pIC50 values of various compounds targeting the inhibition of leucine-rich repeat kinase 2 (LRRK2). The best ensemble model showed exceptional predictive performance, with five-fold cross-validation and external validation metrics exceeding 0.8 (Q2cv = 0.864 and Q2ext = 0.873). The DrugBank screening resulted in three promising drugs (triamterene, phenazopyridine, and CRA_1801) with predicted pIC50 values greater than 7, warranting further investigation as novel PD treatments. Molecular docking and molecular dynamics simulations were performed to provide a comprehensive understanding of the interactions between LRRK2 and the inhibitors in the data set and the best molecules from the screening. Free energy of binding calculations, along with hydrogen bond occupancy analysis and the RMSD of the ligand in the pocket, show CRA_1801 to be the best candidate for repurposing as an LRRK2 inhibitor.
Affiliation(s)
- María C García
  - Departamento de Ingeniería Química, Diego de Robles y Vía Interoceánica, Universidad San Francisco de Quito, 170901, Quito, Ecuador
- Sebastián A Cuesta
  - Departamento de Ingeniería Química, Diego de Robles y Vía Interoceánica, Universidad San Francisco de Quito, 170901, Quito, Ecuador
  - Department of Chemistry, Manchester Institute of Biotechnology, The University of Manchester, 131 Princess Street, Manchester, M1 7DN, UK
- José R Mora
  - Departamento de Ingeniería Química, Diego de Robles y Vía Interoceánica, Universidad San Francisco de Quito, 170901, Quito, Ecuador
- Jose L Paz
  - Departamento Académico de Química Inorgánica, Facultad de Química e Ingeniería Química, Universidad Nacional Mayor de San Marcos, Lima, Perú
- Yovani Marrero-Ponce
  - Grupo de Medicina Molecular y Traslacional (MeM&T), Universidad San Francisco de Quito, Escuela de Medicina, Colegio de Ciencias de la Salud (COCSA), Av. Interoceánica Km 12 1/2 y Av. Florencia, 17, 1200-841, Quito, Ecuador
- Frank Alexis
  - Departamento de Ingeniería Química, Diego de Robles y Vía Interoceánica, Universidad San Francisco de Quito, 170901, Quito, Ecuador
- Edgar A Márquez
  - Grupo de Investigaciones en Química y Biología, Departamento de Química y Biología, Facultad de Ciencias Básicas, Universidad del Norte, Carrera 51B, Km 5, vía Puerto Colombia, Barranquilla, 081007, Colombia
6
Yang Y, Yang Z, Pang X, Cao H, Sun Y, Wang L, Zhou Z, Wang P, Liang Y, Wang Y. Molecular designing of potential environmentally friendly PFAS based on deep learning and generative models. THE SCIENCE OF THE TOTAL ENVIRONMENT 2024; 953:176095. [PMID: 39245376 DOI: 10.1016/j.scitotenv.2024.176095] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2024] [Revised: 09/03/2024] [Accepted: 09/04/2024] [Indexed: 09/10/2024]
Abstract
Perfluoroalkyl and polyfluoroalkyl substances (PFAS) are widely used across a spectrum of industrial and consumer goods. Nonetheless, their persistent nature and tendency to accumulate in biological systems pose substantial environmental and health threats. Consequently, striking a balance between maximizing product efficiency and minimizing environmental and health risks by tailoring the molecular structure of PFAS has become a pivotal challenge in the fields of environmental chemistry and sustainable development. To address this issue, a computational workflow was proposed for designing an environmentally friendly PFAS by incorporating deep learning (DL) and molecular generative models. The hybrid DL architecture MolHGT+ based on heterogeneous graph neural network with transformer-like attention was applied to predict the surface tension, bioaccumulation, and hepatotoxicity of the molecules. Through virtual screening of the PFAS master database using MolHGT+, the findings indicate that incorporating the siloxane group and betaine fragment can effectively decrease both the bioaccumulation and hepatotoxicity of PFAS while preserving low surface tension. In addition, molecular generative models were employed to create a structurally diverse pool of novel PFASs with the aforementioned hit molecules serving as the initial template structures. Overall, our study presents a promising AI-driven method for advancing the development of environmentally friendly PFAS.
Affiliation(s)
- Ying Yang
  - Hubei Key Laboratory of Environmental and Health Effects of Persistent Toxic Substances, School of Environment and Health, Jianghan University, Wuhan 430056, China
- Zeguo Yang
  - Hubei Key Laboratory of Environmental and Health Effects of Persistent Toxic Substances, School of Environment and Health, Jianghan University, Wuhan 430056, China
- Xudi Pang
  - Hubei Key Laboratory of Environmental and Health Effects of Persistent Toxic Substances, School of Environment and Health, Jianghan University, Wuhan 430056, China
- Huiming Cao
  - Hubei Key Laboratory of Environmental and Health Effects of Persistent Toxic Substances, School of Environment and Health, Jianghan University, Wuhan 430056, China
- Yuzhen Sun
  - Hubei Key Laboratory of Environmental and Health Effects of Persistent Toxic Substances, School of Environment and Health, Jianghan University, Wuhan 430056, China
- Ling Wang
  - Hubei Key Laboratory of Environmental and Health Effects of Persistent Toxic Substances, School of Environment and Health, Jianghan University, Wuhan 430056, China
- Zhen Zhou
  - Hubei Key Laboratory of Environmental and Health Effects of Persistent Toxic Substances, School of Environment and Health, Jianghan University, Wuhan 430056, China
- Pu Wang
  - Hubei Key Laboratory of Environmental and Health Effects of Persistent Toxic Substances, School of Environment and Health, Jianghan University, Wuhan 430056, China
- Yong Liang
  - Hubei Key Laboratory of Environmental and Health Effects of Persistent Toxic Substances, School of Environment and Health, Jianghan University, Wuhan 430056, China
- Yawei Wang
  - Hubei Key Laboratory of Environmental and Health Effects of Persistent Toxic Substances, School of Environment and Health, Jianghan University, Wuhan 430056, China
  - State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China
7
De La Torre S, Cuesta SA, Calle L, Mora JR, Paz JL, Espinoza-Montero PJ, Flores-Sumoza M, Márquez EA. Computational approaches for lead compound discovery in dipeptidyl peptidase-4 inhibition using machine learning and molecular dynamics techniques. Comput Biol Chem 2024; 112:108145. [PMID: 39002224 DOI: 10.1016/j.compbiolchem.2024.108145] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2024] [Revised: 07/01/2024] [Accepted: 07/08/2024] [Indexed: 07/15/2024]
Abstract
The prediction of possible lead compounds from already-known drugs that may present DPP-4 inhibition activity implies an advantage in drug development in terms of the time and cost needed to find alternative medicines for the treatment of Type 2 Diabetes Mellitus (T2DM). The inhibition of dipeptidyl peptidase-4 (DPP-4) has been one of the most explored strategies to develop potential drugs against this condition. A diverse dataset of molecules with known experimental inhibitory activity against DPP-4 was constructed and used to develop predictive models using different machine-learning algorithms. Model M36 is the most promising one based on its internal and external performance, showing values of Q2CV = 0.813 and Q2EXT = 0.803. The applicability domain evaluation and Tropsha's analysis were conducted to validate M36, indicating its robustness and accuracy in predicting pIC50 values for organic molecules within the established domain. The physicochemical properties of the ligands, including electronegativity, polarizability, and van der Waals volume, were relevant to predicting the inhibition process. The model was then employed in the virtual screening of potential DPP-4 inhibitors, finding 448 compounds from DrugBank and 9 from DiaNat with potential inhibitory activity. Molecular docking and molecular dynamics simulations were used to gain insight into the ligand-protein interaction. From the screening and the favorable molecular dynamics results, several compounds, including Skimmin (pIC50 = 3.54, binding energy = -8.86 kcal/mol), bergenin (pIC50 = 2.69, binding energy = -13.90 kcal/mol), and DB07272 (pIC50 = 3.97, binding energy = -25.28 kcal/mol), seem to be promising hits to be tested and optimized in the treatment of T2DM. These results imply an important reduction in cost and time in the application of these drugs because all the information about their metabolism is already available.
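The pIC50 values quoted above are on a negative base-10 logarithmic scale: pIC50 = -log10(IC50), with IC50 expressed in mol/L. A minimal sketch of the conversion follows; the function names are illustrative, and the three inputs are simply the pIC50 values reported in the abstract.

```python
import math


def pic50_to_ic50_molar(pic50: float) -> float:
    """Convert pIC50 to IC50 in mol/L (IC50 = 10 ** -pIC50)."""
    return 10.0 ** (-pic50)


def ic50_molar_to_pic50(ic50_molar: float) -> float:
    """Convert IC50 in mol/L to pIC50 (pIC50 = -log10(IC50))."""
    return -math.log10(ic50_molar)


# pIC50 values as reported in the abstract.
reported = {"Skimmin": 3.54, "bergenin": 2.69, "DB07272": 3.97}

for name, pic50 in reported.items():
    ic50_micromolar = pic50_to_ic50_molar(pic50) * 1e6  # mol/L -> micromol/L
    print(f"{name}: pIC50 = {pic50:.2f}, IC50 = {ic50_micromolar:.1f} uM")
```

Because the scale is logarithmic, each unit of pIC50 corresponds to a tenfold change in IC50.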
Affiliation(s)
- Sandra De La Torre
  - Grupo de Química Computacional y Teórica (QCT-USFQ), Departamento de Ingeniería Química, Universidad San Francisco de Quito, Diego de Robles y Vía Interoceánica, Quito 170901, Ecuador
- Sebastián A Cuesta
  - Grupo de Química Computacional y Teórica (QCT-USFQ), Departamento de Ingeniería Química, Universidad San Francisco de Quito, Diego de Robles y Vía Interoceánica, Quito 170901, Ecuador
  - Department of Chemistry, Manchester Institute of Biotechnology, The University of Manchester, 131 Princess Street, Manchester M1 7DN, UK
- Luis Calle
  - Facultad de Ciencias Médicas, Instituto de Investigación e Innovación en Salud Integral, Universidad Católica Santiago de Guayaquil, Guayaquil 09013493, Ecuador
- José R Mora
  - Grupo de Química Computacional y Teórica (QCT-USFQ), Departamento de Ingeniería Química, Universidad San Francisco de Quito, Diego de Robles y Vía Interoceánica, Quito 170901, Ecuador
- Jose L Paz
  - Departamento Académico de Química Inorgánica, Facultad de Química e Ingeniería Química, Universidad Nacional Mayor de San Marcos, Lima, Peru
- Máryury Flores-Sumoza
  - Facultad de Ciencias Básicas y Biomédicas, Programa de Química y Farmacia, Universidad Simón Bolívar, carrera 59 N° 59-65, Barranquilla 080002, Colombia
- Edgar A Márquez
  - Grupo de Investigaciones en Química y Biología, Departamento de Química y Biología, Facultad de Ciencias Básicas, Universidad del Norte, Carrera 51B, Km 5, vía Puerto Colombia, Barranquilla 081007, Colombia
8
Khan MZI, Ren JN, Cao C, Ye HYX, Wang H, Guo YM, Yang JR, Chen JZ. Comprehensive hepatotoxicity prediction: ensemble model integrating machine learning and deep learning. Front Pharmacol 2024; 15:1441587. [PMID: 39234116 PMCID: PMC11373136 DOI: 10.3389/fphar.2024.1441587] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2024] [Accepted: 07/24/2024] [Indexed: 09/06/2024] Open
Abstract
Background: Chemicals may lead to acute liver injuries, posing a serious threat to human health. Achieving the precise safety profile of a compound is challenging due to the complex and expensive testing procedures. In silico approaches will aid in identifying the potential risk of drug candidates in the initial stage of drug development and thus mitigate the developmental cost. Methods: In the current study, QSAR models were developed for hepatotoxicity prediction using an ensemble strategy to integrate machine learning (ML) and deep learning (DL) algorithms with various molecular features. A large dataset of 2588 chemicals and drugs was randomly divided into training (80%) and test (20%) sets, followed by the training of individual base models using diverse machine learning or deep learning algorithms based on three different kinds of descriptors and fingerprints. Feature selection approaches were employed to optimize the models based on model performance. Hybrid ensemble approaches were further utilized to determine the method with the best performance. Results: The voting ensemble classifier emerged as the optimal model, achieving an excellent prediction accuracy of 80.26%, AUC of 82.84%, and recall of over 93%, followed by the bagging and stacking ensemble classifiers. The model was further verified by an external test set, internal 10-fold cross-validation, and rigorous benchmark training, exhibiting much better reliability than the published models. Conclusion: The proposed ensemble model offers a dependable assessment with good performance for predicting the risk of chemicals and drugs inducing liver damage.
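The abstract does not state whether the voting ensemble used hard (majority) or soft (probability-averaged) voting. As a minimal sketch, assuming hard voting over three base models, the combination step and the two reported metric types (accuracy and recall) can be written as follows. The prediction vectors and labels are invented for illustration and do not come from the paper.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(model_predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine per-model label lists by taking the most common label per sample."""
    n_samples = len(model_predictions[0])
    voted = []
    for i in range(n_samples):
        votes = Counter(preds[i] for preds in model_predictions)
        voted.append(votes.most_common(1)[0][0])
    return voted


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of true positives (label 1) that were predicted positive."""
    true_pos = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    actual_pos = sum(t == 1 for t in y_true)
    return true_pos / actual_pos


# Invented predictions from three hypothetical base models (1 = hepatotoxic).
model_a = [1, 0, 1, 1, 0]
model_b = [1, 1, 0, 1, 0]
model_c = [0, 0, 1, 1, 1]
labels = [1, 0, 1, 1, 1]

ensemble = majority_vote([model_a, model_b, model_c])
print(ensemble, accuracy(labels, ensemble), recall(labels, ensemble))
```

Using an odd number of base models avoids ties in the binary case; with an even number, a tie-breaking rule would need to be chosen explicitly.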
Affiliation(s)
- Jia-Nan Ren
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
- Cheng Cao
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
  - Polytechnic Institute, Zhejiang University, Hangzhou, China
- Hong-Yu-Xiang Ye
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
- Hao Wang
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
- Ya-Min Guo
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
- Jin-Rong Yang
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
  - Polytechnic Institute, Zhejiang University, Hangzhou, China
- Jian-Zhong Chen
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
9
Seal S, Williams D, Hosseini-Gerami L, Mahale M, Carpenter AE, Spjuth O, Bender A. Improved Detection of Drug-Induced Liver Injury by Integrating Predicted In Vivo and In Vitro Data. Chem Res Toxicol 2024; 37:1290-1305. [PMID: 38981058 PMCID: PMC11337212 DOI: 10.1021/acs.chemrestox.4c00015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2024] [Revised: 06/27/2024] [Accepted: 07/01/2024] [Indexed: 07/11/2024]
Abstract
Drug-induced liver injury (DILI) has been a significant challenge in drug discovery, often leading to clinical trial failures and necessitating drug withdrawals. Over the last decade, the existing suite of in vitro proxy-DILI assays has generally improved at identifying compounds with hepatotoxicity. However, there is considerable interest in enhancing the in silico prediction of DILI because it allows for evaluating large sets of compounds more quickly and cost-effectively, particularly in the early stages of projects. In this study, we aim to study ML models for DILI prediction that first predict nine proxy-DILI labels and then use them as features in addition to chemical structural features to predict DILI. The features include in vitro (e.g., mitochondrial toxicity, bile salt export pump inhibition) data, in vivo (e.g., preclinical rat hepatotoxicity studies) data, pharmacokinetic parameters of maximum concentration, structural fingerprints, and physicochemical parameters. We trained DILI-prediction models on 888 compounds from the DILI data set (composed of DILIst and DILIrank) and tested them on a held-out external test set of 223 compounds from the DILI data set. The best model, DILIPredictor, attained an AUC-PR of 0.79. This model enabled the detection of the top 25 toxic compounds (2.68 LR+, positive likelihood ratio) compared to models using only structural features (1.65 LR+ score). Using feature interpretation from DILIPredictor, we identified the chemical substructures causing DILI and differentiated cases of DILI caused by compounds in animals but not in humans. For example, DILIPredictor correctly recognized 2-butoxyethanol as nontoxic in humans despite its hepatotoxicity in mice models. Overall, the DILIPredictor model improves the detection of compounds causing DILI with an improved differentiation between animal and human sensitivity and the potential for mechanism evaluation. 
DILIPredictor required only chemical structures as input for prediction and is publicly available at https://broad.io/DILIPredictor for use via web interface and with all code available for download.
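The positive likelihood ratio (LR+) quoted in the abstract is the ratio of the true positive rate to the false positive rate, LR+ = sensitivity / (1 - specificity). A minimal sketch of that calculation from confusion-matrix counts is shown below; the counts are invented for illustration and are not the study's data.

```python
def positive_likelihood_ratio(tp: int, fp: int, tn: int, fn: int) -> float:
    """LR+ = sensitivity / (1 - specificity) = TPR / FPR."""
    sensitivity = tp / (tp + fn)          # true positive rate
    false_positive_rate = fp / (fp + tn)  # equals 1 - specificity
    if false_positive_rate == 0:
        return float("inf")  # no false positives: the ratio is unbounded
    return sensitivity / false_positive_rate


# Invented counts: 50 toxic compounds (40 flagged), 100 non-toxic (15 flagged).
lr_plus = positive_likelihood_ratio(tp=40, fp=15, tn=85, fn=10)
print(round(lr_plus, 2))
```

An LR+ above 1 means a positive prediction is more likely for a truly toxic compound than for a non-toxic one, so a larger value indicates a more informative positive call.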
Affiliation(s)
- Srijit Seal
  - Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Rd, Cambridge CB2 1EW, United Kingdom
  - Imaging Platform, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02141, United States
- Dominic Williams
  - Safety Innovation, Clinical Pharmacology and Safety Sciences, AstraZeneca, Cambridge CB4 0FZ, United Kingdom
  - Quantitative Biology, Discovery Sciences, R&D, AstraZeneca, Cambridge CB4 0FZ, United Kingdom
- Layla Hosseini-Gerami
  - Ignota Laboratories, County Hall, Westminster Bridge Rd, London SE1 7PB, United Kingdom
- Manas Mahale
  - Bombay College of Pharmacy Kalina Santacruz (E), Mumbai 400 098, India
- Anne E. Carpenter
  - Imaging Platform, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02141, United States
- Ola Spjuth
  - Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, Box 591, Uppsala SE-75124, Sweden
- Andreas Bender
  - Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Rd, Cambridge CB2 1EW, United Kingdom
10
Zhao Y, Zhang Z, Kong X, Wang K, Wang Y, Jia J, Li H, Tian S. Prediction of Drug-Induced Liver Injury: From Molecular Physicochemical Properties and Scaffold Architectures to Machine Learning Approaches. Chem Biol Drug Des 2024; 104:e14607. [PMID: 39179521 DOI: 10.1111/cbdd.14607] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2024] [Revised: 07/24/2024] [Accepted: 08/01/2024] [Indexed: 08/26/2024]
Abstract
The process of developing new drugs is widely acknowledged as being time-intensive and requiring substantial financial investment. Despite ongoing efforts to reduce time and expenses in drug development, ensuring medication safety remains an urgent problem. One of the major problems involved in drug development is hepatotoxicity, specifically known as drug-induced liver injury (DILI). The popularity of new drugs often poses a significant barrier during development and frequently leads to their recall after launch. In silico methods have many advantages compared with traditional in vivo and in vitro assays. To establish a more precise and reliable prediction model, it is necessary to utilize an extensive and high-quality database consisting of information on drug molecule properties and structural patterns. In addition, we should also carefully select appropriate molecular descriptors that can be used to accurately depict compound characteristics. The aim of this study was to conduct a comprehensive investigation into the prediction of DILI. First, we conducted a comparative analysis of the physicochemical properties of extensively well-prepared DILI-positive and DILI-negative compounds. Then, we used classic substructure dissection methods to identify structural pattern differences between these two different types of chemical molecules. These findings indicate that it is not feasible to establish property or substructure-based rules for distinguishing between DILI-positive and DILI-negative compounds. Finally, we developed quantitative classification models for predicting DILI using the naïve Bayes classifier (NBC) and recursive partitioning (RP) machine learning techniques. The optimal DILI prediction model was obtained using NBC, which combines 21 physicochemical properties, the VolSurf descriptors and the LCFP_10 fingerprint set. 
This model achieved a global accuracy (GA) of 0.855 and an area under the curve (AUC) of 0.704 for the training set, while the corresponding values were 0.619 and 0.674 for the test set, respectively. Moreover, indicative substructural fragments favorable or unfavorable for DILI were identified from the best naïve Bayesian classification model. These findings may help prioritize lead compounds in the early stage of drug development pipelines.
Affiliation(s)
- Yulong Zhao
  - College of Pharmaceutical Sciences, Soochow University, Suzhou, China
- Zhoudong Zhang
  - College of Pharmaceutical Sciences, Soochow University, Suzhou, China
- Xiaotian Kong
  - Jiangsu Province Engineering Research Center of Precision Diagnostics and Therapeutics Development, Soochow University, Suzhou, China
- Kai Wang
  - College of Pharmaceutical Sciences, Soochow University, Suzhou, China
- Yaxuan Wang
  - College of Pharmaceutical Sciences, Soochow University, Suzhou, China
- Jie Jia
  - College of Pharmaceutical Sciences, Soochow University, Suzhou, China
- Huanqiu Li
  - College of Pharmaceutical Sciences, Soochow University, Suzhou, China
- Sheng Tian
  - College of Pharmaceutical Sciences, Soochow University, Suzhou, China
  - College of Chemistry and Life Science, Beijing University of Technology, Beijing, China

11
Żandarek J, Żmudzki P, Obradović D, Lazović S, Bogojević A, Koszła O, Sołek P, Maciąg M, Płazińska A, Starek M, Dąbrowska M. Analysis of pharmacokinetic profile and ecotoxicological character of cefepime and its photodegradation products. CHEMOSPHERE 2024; 353:141529. [PMID: 38428534 DOI: 10.1016/j.chemosphere.2024.141529] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2024] [Revised: 02/20/2024] [Accepted: 02/21/2024] [Indexed: 03/03/2024]
Abstract
An important problem is the impact of photodegradation on product toxicity in biological tests, which may be complex and context-dependent. Previous studies have described the pharmacology of cefepime, but the toxicological effects of its photodegradation products remain largely unknown. Therefore, photodegradation studies were undertaken in conditions similar to those occurring in biological systems in silico, in vitro, in vivo and ecotoxicological experiments. The structures of four cefepime photodegradation products were determined by UPLC-MS/MS method. The calculated in silico ADMET profile indicates that carcinogenic potential is expected for compounds CP-1, cefepime, CP-2 and CP-3. The Cell Line Cytotoxicity Predictor 2.0 tool was used to predict the cytotoxic effects of cefepime and related compounds in non-transformed and cancer cell lines. The results indicate that possible actions include: non-small cell lung cancer, breast adenocarcinoma, prostate cancer and papillary renal cell carcinoma. OPERA models were used to predict absorption, distribution, metabolism and excretion (ADME) endpoints, and potential bioactivity of CP-2, cefepime and CP-4. The results obtained in silico show that after 96 h of exposure, cefepime, CP-1, CP-2, and CP-3 are moderately toxic in the zebrafish model, while CP-4 is highly toxic. On the contrary, cefepime is more toxic to T. platyurus (highly toxic) compared to the zebrafish model, similar to products CP-4, CP-3 and CP-2. In vitro cytotoxicity studies were performed by MTT assay and in vivo acute embryo toxicity studies using Danio rerio embryos and larvae. In vitro studies showed an increase in the cytotoxicity of products with the longest exposure period, i.e., 8 h. Additionally, at a concentration of 200 μg/mL, statistically significant changes in metabolic activity were observed depending on the irradiation time.
In vivo studies conducted with zebrafish showed that both cefepime and its photodegradation products have only low toxicity. Assessment of potential ecotoxicity included microbiotests on invertebrates (Thamnotoxkit F and Daphtoxkit F), and luminescence inhibition tests (LumiMara). The observed toxicity of the tested solutions towards both Thamnocephalus platyurus and Daphnia magna indicates that the parent substance (unexposed) has lower toxicity, which increases during irradiation. The acute toxicity (LumiMara) of nonirradiated cefepime solution is low for all tested strains (<10%), but mixtures of cefepime and its photoproducts showed growth inhibition against all tested strains (except #6, Photobacterium phosphoreum). Generally, it can be concluded that after UV-Vis irradiation, the mixture of cefepime phototransformation products shows a significant increase in toxicity.
Affiliation(s)
- Joanna Żandarek
  - Department of Inorganic and Analytical Chemistry, Faculty of Pharmacy, Jagiellonian University Medical College, 9 Medyczna St, 30-688, Kraków, Poland; Doctoral School of Medical and Health Sciences, Jagiellonian University Medical College, 16 Łazarza St, 31-530, Kraków, Poland
- Paweł Żmudzki
  - Department of Medicinal Chemistry, Medical College, Jagiellonian University, 9 Medyczna, 30-688 Kraków, Poland
- Darija Obradović
  - Institute of Physics Belgrade, National Institute of the Republic of Serbia, Pregrevica 118, 11080 Belgrade, Serbia
- Saša Lazović
  - Institute of Physics Belgrade, National Institute of the Republic of Serbia, Pregrevica 118, 11080 Belgrade, Serbia
- Aleksandar Bogojević
  - Institute of Physics Belgrade, National Institute of the Republic of Serbia, Pregrevica 118, 11080 Belgrade, Serbia
- Oliwia Koszła
  - Department of Biopharmacy, Medical University of Lublin, 4a Chodźki St, 20-093 Lublin, Poland
- Przemysław Sołek
  - Department of Biopharmacy, Medical University of Lublin, 4a Chodźki St, 20-093 Lublin, Poland; Department of Biochemistry and Toxicology, University of Life Sciences, 13 Akademicka St, 20-950 Lublin, Poland
- Monika Maciąg
  - Department of Biopharmacy, Medical University of Lublin, 4a Chodźki St, 20-093 Lublin, Poland; Independent Laboratory of Behavioral Studies, Medical University of Lublin, 4a Chodźki St, 20-093 Lublin, Poland
- Anita Płazińska
  - Department of Biopharmacy, Medical University of Lublin, 4a Chodźki St, 20-093 Lublin, Poland
- Małgorzata Starek
  - Department of Inorganic and Analytical Chemistry, Faculty of Pharmacy, Jagiellonian University Medical College, 9 Medyczna St, 30-688, Kraków, Poland
- Monika Dąbrowska
  - Department of Inorganic and Analytical Chemistry, Faculty of Pharmacy, Jagiellonian University Medical College, 9 Medyczna St, 30-688, Kraków, Poland

12
|
Mostafa F, Chen M. Computational models for predicting liver toxicity in the deep learning era. FRONTIERS IN TOXICOLOGY 2024; 5:1340860. [PMID: 38312894 PMCID: PMC10834666 DOI: 10.3389/ftox.2023.1340860] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 11/19/2023] [Accepted: 12/22/2023] [Indexed: 02/06/2024] [Open Access]
Abstract
Drug-induced liver injury (DILI) is a severe adverse reaction caused by drugs and may result in acute liver failure and even death. Many efforts have centered on mitigating risks associated with potential DILI in humans. Among these, quantitative structure-activity relationship (QSAR) was proven to be a valuable tool for early-stage hepatotoxicity screening. Its advantages include no requirement for physical substances and rapid delivery of results. Deep learning (DL) made rapid advancements recently and has been used for developing QSAR models. This review discusses the use of DL in predicting DILI, focusing on the development of QSAR models employing extensive chemical structure datasets alongside their corresponding DILI outcomes. We undertake a comprehensive evaluation of various DL methods, comparing with those of traditional machine learning (ML) approaches, and explore the strengths and limitations of DL techniques regarding their interpretability, scalability, and generalization. Overall, our review underscores the potential of DL methodologies to enhance DILI prediction and provides insights into future avenues for developing predictive models to mitigate DILI risk in humans.
Affiliation(s)
- Fahad Mostafa
  - Department of Mathematics and Statistics, Texas Tech University, Lubbock, TX, United States
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, United States
- Minjun Chen
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, United States

13
|
Wu W, Qian J, Liang C, Yang J, Ge G, Zhou Q, Guan X. GeoDILI: A Robust and Interpretable Model for Drug-Induced Liver Injury Prediction Using Graph Neural Network-Based Molecular Geometric Representation. Chem Res Toxicol 2023; 36:1717-1730. [PMID: 37839069 DOI: 10.1021/acs.chemrestox.3c00199] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Indexed: 10/17/2023]
Abstract
Drug-induced liver injury (DILI) is a significant cause of drug failure and withdrawal due to liver damage. Accurate prediction of hepatotoxic compounds is crucial for safe drug development. Several DILI prediction models have been published, but they are built on different data sets, making it difficult to compare model performance. Moreover, most existing models are based on molecular fingerprints or descriptors, neglecting molecular geometric properties and lacking interpretability. To address these limitations, we developed GeoDILI, an interpretable graph neural network that uses a molecular geometric representation. First, we utilized a geometry-based pretrained molecular representation and optimized it on the DILI data set to improve predictive performance. Second, we leveraged gradient information to obtain high-precision atomic-level weights and deduce the dominant substructure. We benchmarked GeoDILI against recently published DILI prediction models, as well as popular GNN models and fingerprint-based machine learning models using the same data set, showing superior predictive performance of our proposed model. We applied the interpretable method in the DILI data set and derived seven precise and mechanistically elucidated structural alerts. Overall, GeoDILI provides a promising approach for accurate and interpretable DILI prediction with potential applications in drug discovery and safety assessment. The data and source code are available at GitHub repository (https://github.com/CSU-QJY/GeoDILI).
Affiliation(s)
- Wenxuan Wu
  - Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai 201203, China
- Jiayu Qian
  - School of Mathematics and Statistics, Central South University, Changsha, Hunan 410083, China
- Changjie Liang
  - Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai 201203, China
- Jingya Yang
  - School of Mathematics and Statistics, Central South University, Changsha, Hunan 410083, China
- Guangbo Ge
  - Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai 201203, China
- Qingping Zhou
  - School of Mathematics and Statistics, Central South University, Changsha, Hunan 410083, China
- Xiaoqing Guan
  - Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai 201203, China

14
|
Béquignon OM, Gómez-Tamayo JC, Lenselink EB, Wink S, Hiemstra S, Lam CC, Gadaleta D, Roncaglioni A, Norinder U, Water BVD, Pastor M, van Westen GJP. Collaborative SAR Modeling and Prospective In Vitro Validation of Oxidative Stress Activation in Human HepG2 Cells. J Chem Inf Model 2023; 63:5433-5445. [PMID: 37616385 PMCID: PMC10498489 DOI: 10.1021/acs.jcim.3c00220] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/11/2023] [Indexed: 08/26/2023]
Abstract
Oxidative stress is the consequence of an abnormal increase of reactive oxygen species (ROS). ROS are generated mainly during the metabolism in both normal and pathological conditions as well as from exposure to xenobiotics. Xenobiotics can, on the one hand, disrupt molecular machinery involved in redox processes and, on the other hand, reduce the effectiveness of the antioxidant activity. Such dysregulation may lead to oxidative damage when combined with oxidative stress overpassing the cell capacity to detoxify ROS. In this work, a green fluorescent protein (GFP)-tagged nuclear factor erythroid 2-related factor 2 (NRF2)-regulated sulfiredoxin reporter (Srxn1-GFP) was used to measure the antioxidant response of HepG2 cells to a large series of drug and drug-like compounds (2230 compounds). These compounds were then classified as positive or negative depending on cellular response and distributed among different modeling groups to establish structure-activity relationship (SAR) models. A selection of models was used to prospectively predict oxidative stress induced by a new set of compounds subsequently experimentally tested to validate the model predictions. Altogether, this exercise exemplifies the different challenges of developing SAR models of a phenotypic cellular readout, model combination, chemical space selection, and results interpretation.
Affiliation(s)
- Olivier J. M. Béquignon
  - Leiden Academic Centre for Drug Research, Leiden University, Wassenaarseweg 76, 2333 AL Leiden, The Netherlands
- Jose C. Gómez-Tamayo
  - Research Programme on Biomedical Informatics (GRIB), Department of Medicine and Life Sciences, Hospital del Mar Medical Research Institute, Universitat Pompeu Fabra, Carrer del Dr. Aiguader 88, 08002 Barcelona, Spain
- Eelke B. Lenselink
  - Leiden Academic Centre for Drug Research, Leiden University, Wassenaarseweg 76, 2333 AL Leiden, The Netherlands
- Steven Wink
  - Leiden Academic Centre for Drug Research, Leiden University, Wassenaarseweg 76, 2333 AL Leiden, The Netherlands
- Steven Hiemstra
  - Leiden Academic Centre for Drug Research, Leiden University, Wassenaarseweg 76, 2333 AL Leiden, The Netherlands
- Chi Chung Lam
  - Leiden Academic Centre for Drug Research, Leiden University, Wassenaarseweg 76, 2333 AL Leiden, The Netherlands
- Domenico Gadaleta
  - Laboratory of Environmental Chemistry and Toxicology, Department of Environmental Health Sciences, IRCCS - Istituto di Ricerche Farmacologiche Mario Negri, Via la Masa 19, 20156 Milano, Italy
- Alessandra Roncaglioni
  - Laboratory of Environmental Chemistry and Toxicology, Department of Environmental Health Sciences, IRCCS - Istituto di Ricerche Farmacologiche Mario Negri, Via la Masa 19, 20156 Milano, Italy
- Ulf Norinder
  - MTM Research Centre, School of Science and Technology, Örebro University, SE-70182 Örebro, Sweden
- Bob van de Water
  - Leiden Academic Centre for Drug Research, Leiden University, Wassenaarseweg 76, 2333 AL Leiden, The Netherlands
- Manuel Pastor
  - Research Programme on Biomedical Informatics (GRIB), Department of Medicine and Life Sciences, Hospital del Mar Medical Research Institute, Universitat Pompeu Fabra, Carrer del Dr. Aiguader 88, 08002 Barcelona, Spain
- Gerard J. P. van Westen
  - Leiden Academic Centre for Drug Research, Leiden University, Wassenaarseweg 76, 2333 AL Leiden, The Netherlands

15
|
Tran TTV, Surya Wibowo A, Tayara H, Chong KT. Artificial Intelligence in Drug Toxicity Prediction: Recent Advances, Challenges, and Future Perspectives. J Chem Inf Model 2023; 63:2628-2643. [PMID: 37125780 DOI: 10.1021/acs.jcim.3c00200] [Citation(s) in RCA: 59] [Impact Index Per Article: 29.5] [Indexed: 05/02/2023]
Abstract
Toxicity prediction is a critical step in the drug discovery process that helps identify and prioritize compounds with the greatest potential for safe and effective use in humans, while also reducing the risk of costly late-stage failures. It is estimated that over 30% of drug candidates are discarded owing to toxicity. Recently, artificial intelligence (AI) has been used to improve drug toxicity prediction as it provides more accurate and efficient methods for identifying the potentially toxic effects of new compounds before they are tested in human clinical trials, thus saving time and money. In this review, we present an overview of recent advances in AI-based drug toxicity prediction, including the use of various machine learning algorithms and deep learning architectures, of six major toxicity properties and Tox21 assay end points. Additionally, we provide a list of public data sources and useful toxicity prediction tools for the research community and highlight the challenges that must be addressed to enhance model performance. Finally, we discuss future perspectives for AI-based drug toxicity prediction. This review can aid researchers in understanding toxicity prediction and pave the way for new methods of drug discovery.
Affiliation(s)
- Thi Tuyet Van Tran
  - Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju 54896, Republic of Korea
  - Faculty of Information Technology, An Giang University, Long Xuyen 880000, Vietnam
  - Vietnam National University - Ho Chi Minh City, Ho Chi Minh 700000, Vietnam
- Agung Surya Wibowo
  - Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju 54896, Republic of Korea
  - Department of Electrical Engineering, Telkom University, Bandung 40257, Indonesia
- Hilal Tayara
  - School of International Engineering and Science, Jeonbuk National University, Jeonju 54896, Republic of Korea
- Kil To Chong
  - Advances Electronics and Information Research Center, Jeonbuk National University, Jeonju 54896, Republic of Korea

16
|
Cuesta SA, Moreno M, López RA, Mora JR, Paz JL, Márquez EA. ElectroPredictor: An Application to Predict Mayr's Electrophilicity E through Implementation of an Ensemble Model Based on Machine Learning Algorithms. J Chem Inf Model 2023; 63:507-521. [PMID: 36594600 DOI: 10.1021/acs.jcim.2c01367] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 01/04/2023]
Abstract
Electrophilicity (E) is one of the most important parameters to understand the reactivity of an organic molecule. Although the theoretical electrophilicity index (ω) has been associated with E in a small homologous series, the use of ω to predict E in a structurally heterogeneous set of compounds is not a trivial task. In this study, a robust ensemble model is created using Mayr's database of reactivity parameters. A combination of topological and quantum mechanical descriptors and different machine learning algorithms are employed for the model's development. The predictability of the model is assessed using different statistical parameters, and its validation is examined, including a training/test partition, an applicability domain, and a y-scrambling test. The global ensemble model presents a Q²(5-fold) of 0.909 and a Q²(ext) of 0.912, demonstrating an excellent predictability performance of E values and showing that ω is not a good descriptor for the prediction of E, especially for the case of neutral compounds. ElectroPredictor, a noncommercial Python application (https://github.com/mmoreno1/ElectroPredictor), is developed to predict E. QM9, a well-known large dataset containing 133885 neutral molecules, is used to perform a virtual screening (94.0% coverage). Finally, the 10 most electrophilic molecules are analyzed as possible new Mayr's electrophiles, which have not yet been experimentally tested. This study confirms the necessity to build an ensemble model using nonlinear machine learning algorithms, topographic descriptors, and separating molecules into charged and neutral compounds to predict E with precision.
Affiliation(s)
- Sebastián A Cuesta
  - Instituto de Simulación Computacional (ISC-USFQ), Departamento de Ingeniería Química, Universidad San Francisco de Quito, Diego de Robles y Vía Interoceánica, Quito 170901, Ecuador
  - Department of Chemistry, Manchester Institute of Biotechnology, The University of Manchester, 131 Princess Street, Manchester M1 7DN, U.K.
- Martín Moreno
  - Instituto de Simulación Computacional (ISC-USFQ), Departamento de Ingeniería Química, Universidad San Francisco de Quito, Diego de Robles y Vía Interoceánica, Quito 170901, Ecuador
- Romina A López
  - Colegio San Ignacio de Loyola - Fe y Alegría, Ministerio de Educación, Quito 170901, Ecuador
- José R Mora
  - Instituto de Simulación Computacional (ISC-USFQ), Departamento de Ingeniería Química, Universidad San Francisco de Quito, Diego de Robles y Vía Interoceánica, Quito 170901, Ecuador
- José Luis Paz
  - Departamento Académico de Química Inorgánica, Facultad de Química e Ingeniería Química, Universidad Nacional Mayor de San Marcos, Cercado de Lima, Lima 15081, Peru
- Edgar A Márquez
  - Grupo de Investigaciones en Química y Biología, Departamento de Química y Biología, Facultad de Ciencias Exactas, Universidad del Norte, Carrera 51B, Km 5, vía Puerto Colombia, Barranquilla 081007, Colombia

17
|
Searching glycolate oxidase inhibitors based on QSAR, molecular docking, and molecular dynamic simulation approaches. Sci Rep 2022; 12:19969. [PMID: 36402831 PMCID: PMC9675741 DOI: 10.1038/s41598-022-24196-4] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 09/11/2022] [Accepted: 11/11/2022] [Indexed: 11/21/2022] [Open Access]
Abstract
Primary hyperoxaluria type 1 (PHT1) treatment is mainly focused on inhibiting the enzyme glycolate oxidase, which plays a pivotal role in the production of glyoxylate, which undergoes oxidation to produce oxalate. When renal secretion capacity is exceeded, calcium oxalate forms stones that accumulate in the kidneys. In this respect, detailed QSAR analysis, molecular docking, and dynamics simulations of a series of inhibitors containing glycolic, glyoxylic, and salicylic acid groups have been performed employing different regression machine learning techniques. Three robust models with fewer than 9 descriptors, based on tenfold cross-validation (Q2 CV) and external validation (Q2 EXT), were found, i.e., MLR1 (Q2 CV = 0.893, Q2 EXT = 0.897), RF1 (Q2 CV = 0.889, Q2 EXT = 0.907), and IBK1 (Q2 CV = 0.891, Q2 EXT = 0.907). An ensemble model was built by averaging the predicted pIC50 of the three models, obtaining a Q2 EXT = 0.933. Physicochemical properties such as charge, electronegativity, hardness, softness, van der Waals volume, and polarizability were considered as attributes to build the models. To get more insight into the potential biological activity of the compounds studied herein, docking and dynamic analysis were carried out, finding that the hydrophobic and polar residues show important interactions with the ligands. A screening of the DrugBank database V.5.1.7 was performed, leading to the proposal of seven commercial drugs within the applicability domain of the models that can be suggested as possible PHT1 treatment.
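The ensemble step described in this abstract (averaging the pIC50 predicted by the three models, then scoring the consensus on an external set) could be sketched roughly as follows. This is an illustration only, not the authors' code: the function names, the pIC50 values, and the particular Q2 EXT formula (one common definition, referenced to the training-set mean) are all assumptions.

```python
from statistics import fmean


def ensemble_average(*model_predictions):
    """Equal-weight consensus: average the per-compound predictions of each model."""
    return [fmean(values) for values in zip(*model_predictions)]


def q2_ext(y_true, y_pred, y_train_mean):
    """External Q2 as 1 - PRESS / SS, with SS taken about the training-set mean.

    Assumption: several Q2 EXT variants exist; the abstract does not say which one was used.
    """
    press = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss = sum((yt - y_train_mean) ** 2 for yt in y_true)
    return 1.0 - press / ss


# Made-up pIC50 predictions for three external compounds (illustrative only).
mlr1 = [5.0, 6.0, 7.0]
rf1 = [5.2, 6.1, 6.8]
ibk1 = [4.8, 5.9, 7.2]

consensus = ensemble_average(mlr1, rf1, ibk1)
observed = [5.1, 6.0, 6.9]  # hypothetical measured pIC50 values
score = q2_ext(observed, consensus, y_train_mean=6.0)
```

Equal weighting is the simplest reading of "averaging"; a weighted consensus would need per-model weights that the abstract does not report.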

18
Ivanov SM, Lagunin AA, Filimonov DA, Poroikov VV. Relationships between the Structure and Severe Drug-Induced Liver Injury for Low, Medium, and High Doses of Drugs. Chem Res Toxicol 2022; 35:402-411. [PMID: 35172101 DOI: 10.1021/acs.chemrestox.1c00307] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/29/2022]
Abstract
Assessment of structure-activity relationships (SARs) for predicting severe drug-induced liver injury (DILI) is essential since in vivo and in vitro preclinical methods cannot detect many druglike compounds disrupting liver functions. To date, plenty of SAR models for the prediction of DILI have been developed; however, none of them considered the route of drug administration and daily dose, which may introduce significant bias into prediction results. We have created a dataset of 617 drugs with parenteral and oral administration routes and consistent information on DILI severity. We have found a clear relationship between route, dose, and DILI severity. According to SAR, nearly 40% of moderate- and non-DILI-causing drugs would cause severe DILI if they were administered at high oral doses. We have proposed the following approach to predict severe DILI. New compounds recommended to be used at low oral doses (<∼10 mg daily), or parenterally, can be considered not causing severe DILI. DILI for compounds administered at medium oral doses (∼10-100 mg daily; 22.2% of drugs under consideration) can be considered unpredictable because reasonable SAR models were not obtained due to the small size and heterogeneity of the corresponding dataset. The DILI potential of the compounds recommended to be used at high oral doses (more than ∼100 mg daily) can be estimated using SAR modeling. The balanced accuracy of the approach calculated by a 10-fold cross-validation procedure is 0.803. The developed approach can be used to estimate severe DILI for druglike compounds proposed to use at low and high oral doses or parenterally at the early stages of drug development.
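The dose- and route-based triage proposed in this abstract can be written as a small decision rule. The sketch below is a hedged reading of it: the abstract gives the cut-offs only approximately (about 10 and about 100 mg daily), so the exact thresholds, the handling of boundary values, and all names (`Triage`, `triage`, `LOW_MG`, `HIGH_MG`) are assumptions made for illustration.

```python
from enum import Enum


class Triage(Enum):
    NOT_SEVERE = "considered not causing severe DILI"
    UNPREDICTABLE = "considered unpredictable"
    USE_SAR_MODEL = "estimate with a SAR model"


# Approximate cut-offs from the abstract; treating them as exact is an assumption.
LOW_MG = 10.0
HIGH_MG = 100.0


def triage(route: str, daily_dose_mg: float) -> Triage:
    """Map administration route and daily dose to the abstract's three outcomes."""
    if route == "parenteral":
        return Triage.NOT_SEVERE
    if route != "oral":
        raise ValueError(f"unsupported route: {route!r}")
    if daily_dose_mg < LOW_MG:
        return Triage.NOT_SEVERE
    if daily_dose_mg <= HIGH_MG:  # boundary handling is an assumption
        return Triage.UNPREDICTABLE
    return Triage.USE_SAR_MODEL


print(triage("oral", 400).value)
```

Only the high-dose oral branch hands off to a predictive model; the other branches are fixed rules, which is how the abstract frames the approach.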
Affiliation(s)
- Sergey M Ivanov
  - Institute of Biomedical Chemistry, Pogodinskaya Str., 10/8, Moscow 119121, Russia
  - Pirogov Russian National Research Medical University, Ostrovityanova Str., 1, Moscow 117997, Russia
- Alexey A Lagunin
  - Institute of Biomedical Chemistry, Pogodinskaya Str., 10/8, Moscow 119121, Russia
  - Pirogov Russian National Research Medical University, Ostrovityanova Str., 1, Moscow 117997, Russia
- Dmitry A Filimonov
  - Institute of Biomedical Chemistry, Pogodinskaya Str., 10/8, Moscow 119121, Russia
- Vladimir V Poroikov
  - Institute of Biomedical Chemistry, Pogodinskaya Str., 10/8, Moscow 119121, Russia

19
Cabrera N, Cuesta SA, Mora JR, Calle L, Márquez EA, Kaunas R, Paz JL. In Silico Searching for Alternative Lead Compounds to Treat Type 2 Diabetes through a QSAR and Molecular Dynamics Study. Pharmaceutics 2022; 14:232. [PMID: 35213965 PMCID: PMC8879932 DOI: 10.3390/pharmaceutics14020232] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/17/2021] [Revised: 12/28/2021] [Accepted: 01/07/2022] [Indexed: 02/01/2023] [Open Access]
Abstract
Free fatty acid receptor 1 (FFA1) stimulates insulin secretion in pancreatic β-cells. An advantage of therapies that target FFA1 is their reduced risk of hypoglycemia relative to common type 2 diabetes treatments. In this work, quantitative structure-activity relationship (QSAR) approach was used to construct models to identify possible FFA1 agonists by applying four different machine-learning algorithms. The best model (M2) meets the Tropsha's test requirements and has the statistics parameters R2 = 0.843, Q2CV = 0.785, and Q2ext = 0.855. Also, coverage of 100% of the test set based on the applicability domain analysis was obtained. Furthermore, a deep analysis based on the ADME predictions, molecular docking, and molecular dynamics simulations was performed. The lipophilicity and the residue interactions were used as relevant criteria for selecting a candidate from the screening of the DiaNat and DrugBank databases. Finally, the FDA-approved drugs bilastine, bromfenac, and fenofibric acid are suggested as potential and lead FFA1 agonists.
Affiliation(s)
- Nicolás Cabrera
  - Department of Biomedical Engineering, Texas A&M University, College Station, TX 77843, USA
- Sebastián A. Cuesta
  - Department of Chemistry, Manchester Institute of Biotechnology, The University of Manchester, 131 Princess Street, Manchester M1 7DN, UK
  - Grupo de Química Computacional y Teórica (QCT-USFQ), Departamento de Ingeniería Química, Universidad San Francisco de Quito, Diego de Robles y vía Interoceánica, Quito 170901, Ecuador
- José R. Mora
  - Grupo de Química Computacional y Teórica (QCT-USFQ), Departamento de Ingeniería Química, Universidad San Francisco de Quito, Diego de Robles y vía Interoceánica, Quito 170901, Ecuador
- Luis Calle
  - Faculty of Pharmacy, University of Granada, 18011 Granada, Spain
  - Facultad de Ciencias Médicas, Instituto de Investigación e Innovación en Salud Integral, Universidad Católica Santiago de Guayaquil, Guayaquil 09013493, Ecuador
- Edgar A. Márquez
  - Grupo de Investigaciones en Química y Biología, Departamento de Química y Biología, Facultad de Ciencias Exactas, Universidad del Norte, Carrera 51B, Km 5, vía Puerto Colombia, Barranquilla 081007, Colombia
- Roland Kaunas
  - Department of Biomedical Engineering, Texas A&M University, College Station, TX 77843, USA
- José Luis Paz
  - Departamento Académico de Química Inorgánica, Facultad de Química e Ingeniería Química, Universidad Nacional Mayor de San Marcos, Cercado de Lima 15081, Peru

20
Liu J, Guo W, Sakkiah S, Ji Z, Yavas G, Zou W, Chen M, Tong W, Patterson TA, Hong H. Machine Learning Models for Predicting Liver Toxicity. Methods Mol Biol 2022; 2425:393-415. [PMID: 35188640 DOI: 10.1007/978-1-0716-1960-5_15] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 06/14/2023]
Abstract
Liver toxicity is a major adverse drug reaction that accounts for drug failure in clinical trials and withdrawal from the market. Therefore, predicting potential liver toxicity at an early stage in drug discovery is crucial to reduce costs and the potential for drug failure. However, current in vivo animal toxicity testing is very expensive and time consuming. As an alternative approach, various machine learning models have been developed to predict potential liver toxicity in humans. This chapter reviews current advances in the development and application of machine learning models for prediction of potential liver toxicity in humans and discusses possible improvements to liver toxicity prediction.
Affiliation(s)
- Jie Liu
  - National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
- Wenjing Guo
  - National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
- Sugunadevi Sakkiah
  - National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
- Zuowei Ji
  - National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
- Gokhan Yavas
  - National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
- Wen Zou
  - National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
- Minjun Chen
  - National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
- Weida Tong
  - National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
- Tucker A Patterson
  - National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
- Huixiao Hong
  - National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA

21
Ellison C, Hewitt M, Przybylak K. In Silico Models for Hepatotoxicity. Methods Mol Biol 2022; 2425:355-392. [PMID: 35188639 DOI: 10.1007/978-1-0716-1960-5_14] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
In this chapter, we review the state of the art of predicting human hepatotoxicity using in silico techniques. There has been significant progress in this area over the past 20 years but there are still some challenges ahead. Principally, these challenges are our partial understanding of a very complex biochemical system and our ability to emulate that in a predictive capacity. Here, we provide an overview of the published modeling approaches in this area to date and discuss their design, strengths and weaknesses. It is interesting to note the diversity in modeling approaches, whether they be statistical algorithms or evidence-based approaches including structural alerts and pharmacophore models. Irrespective of modeling approach, access to appropriate, relevant, and high-quality data appears to be a limitation common to all and is likely to continue to be the focus of future research.
Affiliation(s)
- Claire Ellison
- Human and Natural Sciences Directorate, School of Science, Engineering and Environment, University of Salford, Manchester, UK
- Mark Hewitt
- School of Pharmacy, Faculty of Science and Engineering, University of Wolverhampton, Wolverhampton, UK.
|
22
|
Joint Decision-Making Model Based on Consensus Modeling Technology for the Prediction of Drug-Induced Liver Injury. J CHEM-NY 2021. [DOI: 10.1155/2021/2293871] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022] Open
Abstract
Drug-induced liver injury (DILI) is the major cause of clinical trial failure and postmarketing withdrawals of approved drugs. It is very expensive and time-consuming to evaluate hepatotoxicity using animal or cell-based experiments in the early stage of drug development. In this study, an in silico model based on the joint decision-making strategy was developed for DILI assessment using a relatively large dataset of 2608 compounds. Five consensus models were developed with PaDEL descriptors and PubChem, Substructure, Estate, and Klekota–Roth fingerprints, respectively. Submodels for each consensus model were obtained through joint optimization. The parameters and features of each submodel were optimized jointly based on the hybrid quantum particle swarm optimization (HQPSO) algorithm. The application domain (AD) based on the frequency-weighted and distance (FWD)-based method and Tanimoto similarity index showed the wide AD of the qualified consensus models. A joint decision-making model was integrated from the qualified consensus models, and the overwhelming majority principle was used to improve the performance of consensus models. The application scope narrowing caused by the overwhelming majority principle was successfully solved by joint decision-making. The proposed model successfully predicted 99.2% of the compounds in the test set, with an accuracy of 80.0%, a sensitivity of 83.9%, and a specificity of 73.3%. For an external validation set containing 390 compounds collected from DILIrank, 98.2% of the compounds were successfully predicted with an accuracy of 79.9%, a sensitivity of 97.1%, and a specificity of 66.0%. Furthermore, 25 privileged substructures responsible for DILI were identified from Substructure, PubChem, and Klekota–Roth fingerprints. These privileged substructures can be regarded as structural alerts in hepatotoxicity evaluation.
Compared with the main published studies, our method exhibits certain advantages in data size, in the transparency and standardization of the modeling process, and in the accuracy and credibility of prediction results. It is a promising tool for virtual screening in the early stage of drug development.
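The abstract above reports both a coverage figure (the share of compounds "successfully predicted") and accuracy, sensitivity, and specificity on the covered compounds. The following is a minimal sketch of how a majority-agreement rule can produce that trade-off. It is not the authors' implementation: the function names, the 0.8 agreement threshold, and the abstain-on-disagreement behaviour are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Sequence


def overwhelming_majority(votes: Sequence[int], threshold: float = 0.8) -> Optional[int]:
    """Return 1 or 0 when at least `threshold` of the votes agree, else None.

    `threshold` is a hypothetical agreement level, not a value from the paper.
    """
    n = len(votes)
    if n == 0:
        return None
    positives = sum(votes)
    if positives >= threshold * n:
        return 1
    if (n - positives) >= threshold * n:
        return 0
    return None  # abstain: the compound falls outside the "predicted" set


def evaluate(predictions: List[Optional[int]], labels: List[int]) -> Dict[str, float]:
    """Coverage over all compounds; other metrics over non-abstained ones."""
    pairs = [(p, y) for p, y in zip(predictions, labels) if p is not None]
    tp = sum(1 for p, y in pairs if p == 1 and y == 1)
    tn = sum(1 for p, y in pairs if p == 0 and y == 0)
    fp = sum(1 for p, y in pairs if p == 1 and y == 0)
    fn = sum(1 for p, y in pairs if p == 0 and y == 1)

    def ratio(num: int, den: int) -> float:
        return num / den if den else float("nan")

    return {
        "coverage": ratio(len(pairs), len(labels)),
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


# Toy data: five compounds, each voted on by five hypothetical consensus models.
votes_per_compound = [
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0],
]
labels = [1, 0, 1, 0, 0]
predictions = [overwhelming_majority(v) for v in votes_per_compound]
metrics = evaluate(predictions, labels)
```

Raising the threshold tends to improve accuracy on the covered compounds while shrinking coverage, which is the "application scope narrowing" the abstract refers to.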
|
23
|
Venkatraman V. FP-ADMET: a compendium of fingerprint-based ADMET prediction models. J Cheminform 2021; 13:75. [PMID: 34583740 PMCID: PMC8479898 DOI: 10.1186/s13321-021-00557-5] [Citation(s) in RCA: 50] [Impact Index Per Article: 12.5] [Received: 04/07/2021] [Accepted: 09/20/2021] [Indexed: 12/11/2022] Open
Abstract
MOTIVATION The absorption, distribution, metabolism, excretion, and toxicity (ADMET) of drugs plays a key role in determining which among the potential candidates are to be prioritized. In silico approaches based on machine learning methods are becoming increasingly popular, but are nonetheless limited by the availability of data. With a view to making both data and models available to the scientific community, we have developed FPADMET which is a repository of molecular fingerprint-based predictive models for ADMET properties. In this article, we have examined the efficacy of fingerprint-based machine learning models for a large number of ADMET-related properties. The predictive ability of a set of 20 different binary fingerprints (based on substructure keys, atom pairs, local path environments, as well as custom fingerprints such as all-shortest paths) for over 50 ADMET and ADMET-related endpoints has been evaluated as part of the study. We find that for a majority of the properties, fingerprint-based random forest models yield comparable or better performance compared with traditional 2D/3D molecular descriptors. AVAILABILITY The models are made available as part of open access software that can be downloaded from https://gitlab.com/vishsoft/fpadmet .
Affiliation(s)
- Vishwesh Venkatraman
- Norwegian University of Science and Technology, Realfagbygget, Gløshaugen, Høgskoleringen, 7491, Trondheim, Norway.
|
24
|
Calle L, Marrero-Ponce Y, Mora JR. Molecular simulation of the (GPx)-like antioxidant activity of ebselen derivatives through machine learning techniques. MOLECULAR SIMULATION 2021. [DOI: 10.1080/08927022.2021.1975039] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/20/2022]
Affiliation(s)
- Luis Calle
- Facultad de Ciencias Médicas, Instituto de Investigación e Innovación en Salud Integral (ISAIN), Universidad Católica Santiago de Guayaquil, Guayaquil, Ecuador
- Faculty of Pharmacy, University of Granada, Granada, Spain
- Yovani Marrero-Ponce
- Grupo de Química Computacional y Teórica (QCT-USFQ), Departamento de Ingeniería Química, Universidad San Francisco de Quito, Quito, Ecuador
- Grupo de Medicina Molecular y Traslacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Universidad San Francisco de Quito, Quito, Ecuador
- José R. Mora
- Grupo de Química Computacional y Teórica (QCT-USFQ), Departamento de Ingeniería Química, Universidad San Francisco de Quito, Quito, Ecuador
|
25
|
Pérez Santín E, Rodríguez Solana R, González García M, García Suárez MDM, Blanco Díaz GD, Cima Cabal MD, Moreno Rojas JM, López Sánchez JI. Toxicity prediction based on artificial intelligence: A multidisciplinary overview. WIRES COMPUTATIONAL MOLECULAR SCIENCE 2021. [DOI: 10.1002/wcms.1516] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 12/29/2022]
Affiliation(s)
- Efrén Pérez Santín
- Escuela Superior de Ingeniería y Tecnología (ESIT) Universidad Internacional de La Rioja (UNIR) Logroño Spain
- Raquel Rodríguez Solana
- Department of Food Science and Health Andalusian Institute of Agricultural and Fisheries Research and Training (IFAPA), Alameda del Obispo Avda Córdoba, Andalucía Spain
- Mariano González García
- Escuela Superior de Ingeniería y Tecnología (ESIT) Universidad Internacional de La Rioja (UNIR) Logroño Spain
- María Del Mar García Suárez
- Escuela Superior de Ingeniería y Tecnología (ESIT) Universidad Internacional de La Rioja (UNIR) Logroño Spain
- Gerardo David Blanco Díaz
- Escuela Superior de Ingeniería y Tecnología (ESIT) Universidad Internacional de La Rioja (UNIR) Logroño Spain
- María Dolores Cima Cabal
- Escuela Superior de Ingeniería y Tecnología (ESIT) Universidad Internacional de La Rioja (UNIR) Logroño Spain
- José Manuel Moreno Rojas
- Department of Food Science and Health Andalusian Institute of Agricultural and Fisheries Research and Training (IFAPA), Alameda del Obispo Avda Córdoba, Andalucía Spain
- José Ignacio López Sánchez
- Escuela Superior de Ingeniería y Tecnología (ESIT) Universidad Internacional de La Rioja (UNIR) Logroño Spain
|
26
|
Fernandes PO, Martins DM, de Souza Bozzi A, Martins JPA, de Moraes AH, Maltarollo VG. Molecular insights on ABL kinase activation using tree-based machine learning models and molecular docking. Mol Divers 2021; 25:1301-1314. [PMID: 34191245 PMCID: PMC8241884 DOI: 10.1007/s11030-021-10261-z] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 03/31/2021] [Accepted: 06/18/2021] [Indexed: 12/14/2022]
Abstract
Abelson kinase (c-Abl) is a non-receptor tyrosine kinase involved in several biological processes essential for cell differentiation, migration, proliferation, and survival. This enzyme's activation might be an alternative strategy for treating diseases such as chemotherapy-induced neutropenia, prostate cancer, and breast cancer. Recently, a series of compounds that promote the activation of c-Abl has been identified, opening a promising ground for c-Abl drug development. Structure-based drug design (SBDD) and ligand-based drug design (LBDD) methodologies have significantly impacted recent drug development initiatives. Here, we combined SBDD and LBDD approaches to characterize critical chemical properties and interactions of the identified c-Abl activators. We used molecular docking simulations combined with tree-based machine learning models (decision tree, AdaBoost, and random forest) to understand the c-Abl activators' structural features required for binding to the myristoyl pocket and, consequently, for promoting enzyme and cellular activation. We obtained predictive and robust models with Matthews correlation coefficient values higher than 0.4 for all endpoints and identified characteristics that led to constructing a structure-activity relationship (SAR) model.
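For readers unfamiliar with the metric quoted above, the Matthews correlation coefficient (MCC) is computed from the four cells of a binary confusion matrix. The sketch below uses the standard definition; the function name and the toy counts are illustrative and are not taken from the paper.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Toy confusion matrix (hypothetical counts): 70% accuracy gives MCC of about 0.41.
mcc = matthews_corrcoef(tp=40, tn=30, fp=20, fn=10)
```

MCC ranges from -1 to 1, with 0 corresponding to chance-level prediction, so it is less flattering than accuracy on imbalanced data sets.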
Affiliation(s)
- Philipe Oliveira Fernandes
- Departamento de Produtos Farmacêuticos, Faculdade de Farmácia, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
- Diego Magno Martins
- Departamento de Química, Instituto de Ciências Exatas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
- Aline de Souza Bozzi
- Departamento de Química, Instituto de Ciências Exatas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
- João Paulo A Martins
- Departamento de Química, Instituto de Ciências Exatas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
- Adolfo Henrique de Moraes
- Departamento de Química, Instituto de Ciências Exatas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
- Vinícius Gonçalves Maltarollo
- Departamento de Produtos Farmacêuticos, Faculdade de Farmácia, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil.
|
27
|
Cuesta SA, Mora JR, Márquez EA. In Silico Screening of the DrugBank Database to Search for Possible Drugs against SARS-CoV-2. Molecules 2021; 26:1100. [PMID: 33669720 PMCID: PMC7923184 DOI: 10.3390/molecules26041100] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 01/13/2021] [Revised: 02/11/2021] [Accepted: 02/16/2021] [Indexed: 12/29/2022] Open
Abstract
Coronavirus disease 2019 (COVID-19) is responsible for more than 1.80 M deaths worldwide. A Quantitative Structure-Activity Relationships (QSAR) model is developed based on experimental pIC50 values reported for a structurally diverse dataset. A robust model with only five descriptors is found, with values of $R^2 = 0.897$, $Q^2_{\mathrm{LOO}} = 0.854$, and $Q^2_{\mathrm{ext}} = 0.876$ and complying with all the parameters established in Tropsha's validation test. The analysis of the applicability domain (AD) reveals coverage of about 90% for the external test set. Docking and molecular dynamic analysis are performed on the three most relevant biological targets for SARS-CoV-2: main protease, papain-like protease, and RNA-dependent RNA polymerase. A screening of the DrugBank database is executed, predicting the pIC50 value of 6664 drugs, which are in the AD of the model (coverage = 79%). Fifty-seven possible potent anti-COVID-19 candidates with pIC50 values > 6.6 are identified, and based on a pharmacophore modelling analysis, four compounds of this set are suggested as potential inhibitors of SARS-CoV-2. Finally, the biological activity of the compounds was related to the shapes of the frontier molecular orbitals.
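The fitted and leave-one-out statistics quoted in this abstract differ in which residuals they use: the coefficient of determination uses residuals from a model fitted on all compounds, while the leave-one-out variant predicts each compound from a model fitted without it. The sketch below illustrates this for a single-descriptor least-squares line. This is a simplifying assumption (the paper's model has five descriptors), and the data, function names, and the use of the full-set mean in the denominator are illustrative choices; other conventions exist.

```python
from typing import List, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def total_sum_of_squares(ys: List[float]) -> float:
    mean_y = sum(ys) / len(ys)
    return sum((y - mean_y) ** 2 for y in ys)


def r_squared(xs: List[float], ys: List[float]) -> float:
    """1 - RSS/TSS with the model fitted on every data point."""
    slope, intercept = fit_line(xs, ys)
    rss = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return 1 - rss / total_sum_of_squares(ys)


def q2_loo(xs: List[float], ys: List[float]) -> float:
    """1 - PRESS/TSS, where each point is predicted by a model fitted without it."""
    press = 0.0
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    return 1 - press / total_sum_of_squares(ys)


# Hypothetical descriptor values and activities.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
activity = [2.1, 3.9, 6.2, 7.8, 10.1]
r2 = r_squared(descriptor, activity)
q2 = q2_loo(descriptor, activity)
```

Because each held-out prediction cannot benefit from its own data point, the leave-one-out value never exceeds the fitted value for a least-squares model, and a large gap between the two is a common warning sign of overfitting.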
Affiliation(s)
- Sebastián A. Cuesta
- Grupo de Química Computacional y Teórica (QCT-USFQ), Departamento de Ingeniería Química, Colegio Politécnico, Universidad San Francisco de Quito, Diego de Robles y Vía Interoceánica, Quito 170901, Ecuador
- José R. Mora
- Grupo de Química Computacional y Teórica (QCT-USFQ), Departamento de Ingeniería Química, Colegio Politécnico, Universidad San Francisco de Quito, Diego de Robles y Vía Interoceánica, Quito 170901, Ecuador
- Edgar A. Márquez
- Grupo de Investigaciones en Química y Biología, Departamento de Química y Biología, Facultad de Ciencias Exactas, Universidad del Norte, Carrera 51B, Km 5, vía Puerto Colombia, Barranquilla 081007, Colombia
|
28
|
Ma H, An W, Wang Y, Sun H, Huang R, Huang J. Deep Graph Learning with Property Augmentation for Predicting Drug-Induced Liver Injury. Chem Res Toxicol 2021; 34:495-506. [PMID: 33347312 PMCID: PMC9887540 DOI: 10.1021/acs.chemrestox.0c00322] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 02/02/2023]
Abstract
Drug-induced liver injury (DILI) is a crucial factor in determining the qualification of potential drugs. However, the DILI property is excessively difficult to obtain due to the complex testing process. Consequently, an in silico screening in the early stage of drug discovery would help to reduce the total development cost by filtering those drug candidates with a high risk of causing DILI. To serve the screening goal, we apply several computational techniques to predict the DILI property, including traditional machine learning methods and graph-based deep learning techniques. While deep learning models require large training data to tune huge model parameters, the DILI data set only contains a few hundred annotated molecules. To alleviate the data scarcity problem, we propose a property augmentation strategy to include massive training data with other property information. Extensive experiments demonstrate that our proposed method significantly outperforms all existing baselines on the DILI data set by obtaining an 81.4% accuracy using cross-validation with random splitting, 78.7% using leave-one-out cross-validation, and 76.5% using cross-validation with scaffold splitting.
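The gap between the random-split and scaffold-split accuracies reported above reflects how the folds are built: scaffold splitting keeps every molecule that shares a core structure in the same fold, so the model is always tested on unseen chemotypes. A minimal sketch of that grouping step follows. It assumes scaffold identifiers have already been computed by a cheminformatics toolkit; the scaffold strings, the function name, and the greedy largest-group-first assignment are illustrative assumptions, not the authors' procedure.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def scaffold_kfold(scaffolds: Sequence[str], k: int) -> List[List[int]]:
    """Assign molecule indices to k folds so that no scaffold spans two folds.

    Scaffold groups are placed largest first into whichever fold is
    currently smallest, which keeps fold sizes roughly balanced.
    """
    groups: Dict[str, List[int]] = defaultdict(list)
    for index, scaffold in enumerate(scaffolds):
        groups[scaffold].append(index)

    folds: List[List[int]] = [[] for _ in range(k)]
    for members in sorted(groups.values(), key=len, reverse=True):
        smallest_fold = min(folds, key=len)
        smallest_fold.extend(members)
    return folds


# Hypothetical precomputed scaffold labels for seven molecules.
scaffolds = ["benzene", "benzene", "benzene", "pyridine", "pyridine", "indole", "furan"]
folds = scaffold_kfold(scaffolds, k=3)
```

With random splitting, close analogues of a test molecule usually appear in the training folds, which tends to inflate the measured accuracy relative to the scaffold-based estimate.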
Affiliation(s)
- Hehuan Ma
- Department of Computer Science, University of Texas at Arlington, Arlington, Texas, USA
- Weizhi An
- Department of Computer Science, University of Texas at Arlington, Arlington, Texas, USA
- Yuhong Wang
- National Center for Advancing Translational Sciences, NIH, Rockville, Maryland, USA
- Hongmao Sun
- National Center for Advancing Translational Sciences, NIH, Rockville, Maryland, USA
- Ruili Huang
- National Center for Advancing Translational Sciences, NIH, Rockville, Maryland, USA
- Junzhou Huang
- Department of Computer Science, University of Texas at Arlington, Arlington, Texas, USA
|
29
|
Feng H, Zhang L, Li S, Liu L, Yang T, Yang P, Zhao J, Arkin IT, Liu H. Predicting the reproductive toxicity of chemicals using ensemble learning methods and molecular fingerprints. Toxicol Lett 2021; 340:4-14. [PMID: 33421549 DOI: 10.1016/j.toxlet.2021.01.002] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 07/11/2020] [Revised: 10/29/2020] [Accepted: 01/03/2021] [Indexed: 12/20/2022]
Abstract
Reproductive toxicity endpoints are a significant safety concern in the assessment of the adverse effects of chemicals in drug discovery. Computational models that can accurately predict a chemical's toxic potential are increasingly pursued to replace traditional animal experiments. Thus, ensemble learning models were built to predict the reproductive toxicity of compounds. Our ensemble models were developed using support vector machine, random forest, and extreme gradient boosting methods and 9 molecular fingerprints calculated for a dataset containing 1823 chemicals. The best prediction performance was achieved by the Ensemble-Top12 model, with an accuracy (ACC) of 86.33%, a sensitivity (SEN) of 82.02%, a specificity (SPE) of 90.19%, and an area under the receiver operating characteristic curve (AUC) of 0.937 in 5-fold cross-validation and ACC, SEN, SPE, and AUC values of 84.38%, 86.90%, 90.67%, and 0.920, respectively, in external validation. We also defined the applicability domain (AD) of the ensemble model by calculating the Tanimoto distance of the training set. Compared with models in existing literature, our ensemble model achieves relatively high ACC, SPE and AUC values. We also identified several fingerprint features related to chemical reproductive toxicity. Considering the performance of the model, we recommend using the Ensemble-Top12 model to predict reproductive toxicity in early drug development.
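A Tanimoto-distance applicability domain, as mentioned above, asks whether a query compound is close enough to the training set for the model's prediction to be trusted. The abstract does not give the exact rule, so the sketch below makes explicit assumptions: fingerprints are represented as sets of "on" bit positions, a query is in-domain when its distance to the nearest training compound is at most a cutoff, and the 0.6 cutoff and all names are hypothetical.

```python
from typing import Iterable, Set


def tanimoto_similarity(a: Set[int], b: Set[int]) -> float:
    """|A intersect B| / |A union B| for two sets of on-bit positions."""
    union = len(a | b)
    if union == 0:
        return 1.0  # two empty fingerprints are treated as identical
    return len(a & b) / union


def tanimoto_distance(a: Set[int], b: Set[int]) -> float:
    return 1.0 - tanimoto_similarity(a, b)


def in_applicability_domain(
    query: Set[int],
    training_fingerprints: Iterable[Set[int]],
    max_distance: float = 0.6,  # hypothetical cutoff, not from the paper
) -> bool:
    """True when the nearest training compound lies within max_distance."""
    nearest = min(tanimoto_distance(query, fp) for fp in training_fingerprints)
    return nearest <= max_distance


# Toy fingerprints given as on-bit indices.
training = [{1, 2, 3, 4}, {2, 3, 5}]
similar_query = {1, 2, 3}      # shares three of four bits with the first compound
dissimilar_query = {7, 8, 9}   # shares no bits with any training compound

inside = in_applicability_domain(similar_query, training)
outside = in_applicability_domain(dissimilar_query, training)
```

Other AD definitions compare the query against the average distance to its k nearest neighbours or against a distance percentile of the training set; the nearest-neighbour rule is used here only because it is the shortest to state.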
Affiliation(s)
- Huawei Feng
- School of Life Science, Liaoning University, Shenyang, 110036, China
- Li Zhang
- School of Life Science, Liaoning University, Shenyang, 110036, China; Technology Innovation Center for Computer Simulating and Information Processing of Bio-macromolecules of Shenyang, Shenyang, 110036, China; Engineering Laboratory for Molecular Simulation and Designing of Drug Molecules of Liaoning, Liaoning University, Shenyang, 110036, China
- Shimeng Li
- School of Life Science, Liaoning University, Shenyang, 110036, China
- Lili Liu
- School of Life Science, Liaoning University, Shenyang, 110036, China
- Tianzhou Yang
- School of Life Science, Liaoning University, Shenyang, 110036, China
- Pengyu Yang
- School of Information, Liaoning University, Shenyang, 110036, China
- Jian Zhao
- School of Life Science, Liaoning University, Shenyang, 110036, China
- Isaiah Tuvia Arkin
- Department of Biological Chemistry, The Hebrew University of Jerusalem, Edmond J. Safra Campus, Givat-Ram, Jerusalem, 91904, Israel
- Hongsheng Liu
- Technology Innovation Center for Computer Simulating and Information Processing of Bio-macromolecules of Shenyang, Shenyang, 110036, China; Engineering Laboratory for Molecular Simulation and Designing of Drug Molecules of Liaoning, Liaoning University, Shenyang, 110036, China; School of Pharmaceutical Science, Liaoning University, Shenyang, 110036, China.
|
30
|
Cabrera N, Mora JR, Márquez E, Flores-Morales V, Calle L, Cortés E. QSAR and molecular docking modelling of anti-leishmanial activities of organic selenium and tellurium compounds. SAR AND QSAR IN ENVIRONMENTAL RESEARCH 2021; 32:29-50. [PMID: 33241943 DOI: 10.1080/1062936x.2020.1848914] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 08/28/2020] [Accepted: 11/05/2020] [Indexed: 06/11/2023]
Abstract
Leishmaniasis affects mainly rural areas and the poorest people in the world. A computational study of the antileishmanial activity of organic selenium and tellurium compounds was performed. The 3D structures of the compounds were optimized at the wb97xd/lanl2dz level and used in the quantitative structure-activity relationship (QSAR) analysis. The antileishmanial activity was measured by L. donovani β carbonic anhydrase inhibition (Ki) and the half-maximal inhibitory concentration (IC50) against L. infantum amastigotes. The dataset was divided into training (75%) and test sets (25%) by using a k-means clustering algorithm. For pKi prediction, model M3 with seven 3D topographic descriptors was characterized by the following statistical parameters: $r^2 = 0.879$, $Q^2_{\mathrm{LOO}} = 0.822$, and $Q^2_{\mathrm{ext}} = 0.840$. For pIC50 prediction, model M12 with six attributes was characterized by the following statistical parameters: $r^2 = 0.907$, $Q^2_{\mathrm{LOO}} = 0.824$, and $Q^2_{\mathrm{ext}} = 0.795$. Both models met all the requirements of Tropsha's test, which implies predictions of pIC50 and pKi activities with high accuracy. Concomitantly, favourable interactions of the sulphonamide group with the Zn atom in the protein were revealed by the docking analysis.
Affiliation(s)
- N Cabrera
- Department of Biomedical Engineering, Texas A&M University, College Station, TX, USA
- J R Mora
- Grupo de Química Computacional y Teórica (QCT-USFQ), Departamento de Ingeniería Química, Universidad San Francisco de Quito, Quito, Ecuador
- E Márquez
- Grupo de Investigaciones en Química y Biología, Departamento de Química y Biología, Facultad de Ciencias Exactas, Universidad del Norte, Barranquilla, Colombia
- V Flores-Morales
- Laboratorio de Síntesis Asimétrica y Bioenergética (LSAyB), Ingeniería Química (UACQ), Program of Doctorate in Sciences with Orientation in Molecular Medicine, Academic Unit of Human Medicine and Health Sciences, Universidad Autónoma de Zacatecas, Zacatecas, Mexico
- L Calle
- Instituto de Investigación e Innovación en Salud Integral (ISAIN), Facultad de Ciencias Medicas, Universidad Católica Santiago de Guayaquil, Guayaquil, Ecuador
- E Cortés
- Grupo de Investigación en Ciencias Naturales y Exactas, Departamento de Ciencias Naturales y Exactas, Universidad de la Costa, Barranquilla, Colombia
|
31
|
Li T, Tong W, Roberts R, Liu Z, Thakkar S. DeepDILI: Deep Learning-Powered Drug-Induced Liver Injury Prediction Using Model-Level Representation. Chem Res Toxicol 2020; 34:550-565. [PMID: 33356151 DOI: 10.1021/acs.chemrestox.0c00374] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Indexed: 12/14/2022]
Abstract
Drug-induced liver injury (DILI) is the most frequently reported single cause of safety-related withdrawal of marketed drugs. It is essential to identify drugs with DILI potential at the early stages of drug development. In this study, we describe a deep learning-powered DILI (DeepDILI) prediction model created by combining model-level representation generated by conventional machine learning (ML) algorithms with a deep learning framework based on Mold2 descriptors. We conducted a comprehensive evaluation of the proposed DeepDILI model performance by posing several critical questions: (1) Could the DILI potential of newly approved drugs be predicted by accumulated knowledge of early approved ones? (2) Is model-level representation more informative than molecule-based representation for DILI prediction? and (3) Could improved model explainability be established? For question 1, we developed the DeepDILI model using drugs approved before 1997 to predict the DILI potential of those approved thereafter. As a result, the DeepDILI model outperformed the five conventional ML algorithms and two state-of-the-art ensemble methods with a Matthews correlation coefficient (MCC) value of 0.331. For question 2, we demonstrated that the DeepDILI model's performance was significantly improved (i.e., an MCC improvement of 25.86% in the test set) compared with deep neural networks based on molecule-based representation. For question 3, we found 21 chemical descriptors that were enriched, suggesting a strong association with DILI outcome. Furthermore, we found that the DeepDILI model has more discrimination power to identify the DILI potential of drugs belonging to the World Health Organization therapeutic category of 'alimentary tract and metabolism'. Moreover, the DeepDILI model based on Mold2 descriptors outperformed the ones with Mol2vec and MACCS descriptors.
Finally, the DeepDILI model was applied to the recent real-world problem of predicting any DILI concern for potential COVID-19 treatments from repositioning drug candidates. Altogether, this developed DeepDILI model could serve as a promising tool for screening for DILI risk of compounds in the preclinical setting, and the DeepDILI model is publicly available through https://github.com/TingLi2016/DeepDILI.
Affiliation(s)
- Ting Li
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, Arkansas 72079, United States; University of Arkansas at Little Rock and University of Arkansas for Medical Sciences Joint Bioinformatics Program, Little Rock, Arkansas 72204, United States
- Weida Tong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, Arkansas 72079, United States
- Ruth Roberts
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, Arkansas 72079, United States; ApconiX Ltd., Alderley Park, Alderley Edge SK10 4TG, United Kingdom; University of Birmingham, Edgbaston, Birmingham B15 2TT, United Kingdom
- Zhichao Liu
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, Arkansas 72079, United States
- Shraddha Thakkar
- Office of Translational Sciences, Center for Drug Evaluation and Research, United States Food and Drug Administration, Silver Spring, Maryland 20993, United States
|