1
Zhao X, Zhang S, Zhang T, Cao Y, Liu J. A small-scale data driven and graph neural network based toxicity prediction method of compounds. Comput Biol Chem 2025; 117:108393. [PMID: 40048921 DOI: 10.1016/j.compbiolchem.2025.108393] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2024] [Revised: 02/12/2025] [Accepted: 02/16/2025] [Indexed: 04/22/2025]
Abstract
Toxicity prediction is crucial in drug discovery, helping identify safe compounds and reduce development risks. However, the lack of known toxicity data for most compounds is a major challenge. Recently, data-driven models have gained attention as a more efficient alternative to traditional in vivo and in vitro experiments. In this paper, we propose a small-scale, data-driven toxicity prediction method based on a graph neural network (GNN). We introduce a joint learning strategy for multiple toxicity types and construct a graph-based model, JLGCN-MTT, to improve prediction accuracy. In addition, we integrate a transfer learning strategy that leverages data from multiple toxicity types, allowing the model to make reliable predictions even when data for a specific toxicity type are limited. We conducted experiments using data from 3566 compounds in the Tox21 dataset, which contains 12 types of toxicity-related bioactivity data. The experimental results show that JLGCN-MTT outperforms traditional machine learning methods and single-task GNNs in all 12 toxicity prediction tasks, with AUC improving by over 10% in 11 tasks. For small-scale data with 50, 100, and 300 training samples, the AUC improved in all cases, with the highest improvement of 11% observed when the sample size was 50. These results demonstrate that the proposed small-scale, data-driven toxicity prediction method can achieve high prediction accuracy.
Affiliation(s)
- Xin Zhao
  - School of Electronic and Information Engineering, Tianjin University, 92 Weijin Road, Tianjin, 300072, Tianjin, China
- Shuyi Zhang
  - School of Electronic and Information Engineering, Tianjin University, 92 Weijin Road, Tianjin, 300072, Tianjin, China
- Tao Zhang
  - School of Electronic and Information Engineering, Tianjin University, 92 Weijin Road, Tianjin, 300072, Tianjin, China
- Yahui Cao
  - School of Electronic and Information Engineering, Tianjin University, 92 Weijin Road, Tianjin, 300072, Tianjin, China
- Jingjing Liu
  - International Engineering Institute, Tianjin University, 92 Weijin Road, Tianjin, 300072, Tianjin, China
2
Mustafa FM, Al-Hussainy AF, Doshi H, Yadav A, Rekha MM, Kundlas M, Sabarivani A, Kubaev A, Taher SG, Alwan M, Jawad M, Mushtaq H, Farhood B. TabNet and TabTransformer: Novel Deep Learning Models for Chemical Toxicity Prediction in Comparison With Machine Learning. J Appl Toxicol 2025. [PMID: 40309751 DOI: 10.1002/jat.4803] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2025] [Revised: 04/17/2025] [Accepted: 04/21/2025] [Indexed: 05/02/2025]
Abstract
The prediction of chemical toxicity is crucial for applications in drug discovery, environmental safety, and regulatory assessments. This study aims to evaluate the performance of advanced deep learning architectures, TabNet and TabTransformer, in comparison to traditional machine learning methods, for predicting the toxicity of chemical compounds across 12 toxicological endpoints. The dataset consisted of 12,228 training and 3057 test samples, each characterized by 801 molecular descriptors representing chemical and structural features. Traditional machine learning models, including XGBoost, CatBoost, SVM, and a voting classifier, were paired with feature selection techniques such as principal component analysis (PCA), recursive feature elimination (RFE), and mutual information (MI). Advanced architectures, TabNet and TabTransformer, were trained directly on the full feature set without dimensionality reduction. Model performance was assessed using accuracy, F1-score, AUC-ROC, AUPR, and Matthews correlation coefficient (MCC), alongside SHAP analysis to interpret feature importance and enhance model transparency under class imbalance conditions. Cross-validation and test set evaluations ensured robust comparisons across all models and toxicological endpoints. TabNet and TabTransformer consistently outperformed traditional classifiers, achieving AUC-ROC values up to 96% for endpoints such as SR.ARE and SR.p53. TabTransformer showed the highest performance on complex labels, benefiting from self-attention mechanisms that captured intricate feature relationships, while TabNet achieved competitive outcomes with an efficient, dynamic feature selection. In addition to standard metrics, we reported AUPR and MCC to better evaluate model performance under class imbalance, with both models maintaining high scores across endpoints. 
Although traditional classifiers, particularly the voting classifier, performed well when combined with feature selection (achieving up to 94% AUC-ROC on SR.p53), they lagged behind the deep learning models in generalizability and feature interaction modeling. SHAP analysis further highlighted the interpretability of the proposed architectures by identifying influential descriptors such as VSAEstate6 and MoRSEE8. This study highlights the superiority of TabNet and TabTransformer in predicting chemical toxicity while ensuring interpretability through SHAP analysis. These models offer a promising alternative to traditional in vitro and in vivo approaches, paving the way for cost-effective and ethical toxicity assessments.
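The abstract above reports MCC alongside accuracy because of class imbalance. The following is only an illustrative sketch of why that matters, not the authors' evaluation code: the confusion-matrix counts are hypothetical, and the convention of returning 0.0 when the MCC denominator is zero is an assumption made here.

```python
import math


def accuracy(tp, fp, tn, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fp + tn + fn)


def matthews_corrcoef(tp, fp, tn, fn):
    """MCC from binary confusion-matrix counts.

    Returns 0.0 when the denominator is zero (an assumed convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical imbalanced endpoint: 950 inactive and 50 active compounds.
# A trivial classifier that labels every compound inactive:
trivial = dict(tp=0, fp=0, tn=950, fn=50)
# A classifier that recovers 30 of the 50 actives with 20 false alarms:
useful = dict(tp=30, fp=20, tn=930, fn=20)

for name, counts in (("trivial", trivial), ("useful", useful)):
    print(name, round(accuracy(**counts), 3), round(matthews_corrcoef(**counts), 3))
```

In this made-up example the two classifiers differ by only one point of accuracy, while MCC separates them clearly, which is the usual reason to report it for imbalanced toxicity endpoints.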
Affiliation(s)
- Hardik Doshi
  - Marwadi University Research Center, Department of Computer Engineering, Faculty of Engineering and Technology, Marwadi University, Rajkot, Gujarat, India
- Anupam Yadav
  - Department of Computer Engineering and Application, GLA University, Mathura, India
- M M Rekha
  - Department of Chemistry and Biochemistry, School of Sciences, JAIN (Deemed to be University), Bangalore, Karnataka, India
- Mayank Kundlas
  - Centre for Research Impact and Outcome, Chitkara University Institute of Engineering and Technology, Chitkara University, Rajpura, Punjab, India
- A Sabarivani
  - Department of Biomedical, Sathyabama Institute of Science and Technology, Chennai, Tamil Nadu, India
- Aziz Kubaev
  - Department of Maxillofacial Surgery, Samarkand State Medical University, Samarkand, Uzbekistan
- Sada Ghalib Taher
  - College of Health and Medical Technology, National University of Science and Technology, Nasiriyah, Dhi Qar, Iraq
- Mariem Alwan
  - Pharmacy College, Al-Farahidi University, Baghdad, Iraq
- Mahmood Jawad
  - Department of Pharmacy, Al-Zahrawi University College, Karbala, Iraq
- Bagher Farhood
  - Department of Medical Physics and Radiology, Faculty of Paramedical Sciences, Kashan University of Medical Sciences, Kashan, Iran
3
Hao Y, Duan Z, Liu L, Xue Q, Pan W, Liu X, Zhang A, Fu J. Development of an Interpretable Machine Learning Model for Neurotoxicity Prediction of Environmentally Related Compounds. Environmental Science & Technology 2025. [PMID: 40307185 DOI: 10.1021/acs.est.5c03311] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/02/2025]
Abstract
The rising prevalence of nervous system disorders has become a significant global health challenge, with environmental pollutants identified as key contributors. However, the large number of environmentally related compounds, combined with the low efficiency of traditional methods, has resulted in substantial gaps in neurotoxicity data. In this study, we developed a robust and interpretable neurotoxicity prediction model using a high-quality data set. To identify the best predictive model, three molecular representation methods (molecular fingerprints, molecular descriptors, and molecular graphs) combined with six traditional machine learning (ML) algorithms and two deep learning (DL) approaches were evaluated. The optimal model, combining molecular fingerprints and descriptors with eXtreme Gradient Boosting (XGBoost), achieved a training accuracy of 0.93 and an area under the curve (AUC) of 0.99, outperforming other ML and DL models while maintaining interpretability. The model was used to screen 1170 compounds detected in human blood, predicting 1145 successfully. Among 89 compounds with known neurotoxicity data, the model achieved an accuracy of 0.74. It identified 821 potentially neurotoxic compounds, including 36 with high detection concentrations, warranting further study. An online platform (http://www.envwind.site/tools.html) was developed to expand accessibility. This model offers an efficient tool for predicting neurotoxicity and managing environmental health risks.
Affiliation(s)
- Yuxing Hao
  - State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, P. R. China
  - Sino-Danish College, University of Chinese Academy of Sciences, Beijing 100190, P. R. China
  - School of Environment, Hangzhou Institute for Advanced Study, UCAS, Hangzhou 310012, P. R. China
- Zhihui Duan
  - State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, P. R. China
  - College of Resources and Environment, University of Chinese Academy of Sciences, Beijing 100049, China
- Lizheng Liu
  - State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, P. R. China
  - School of Environment, Hangzhou Institute for Advanced Study, UCAS, Hangzhou 310012, P. R. China
- Qiao Xue
  - State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, P. R. China
- Wenxiao Pan
  - State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, P. R. China
- Xian Liu
  - State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, P. R. China
- Aiqian Zhang
  - State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, P. R. China
  - Sino-Danish College, University of Chinese Academy of Sciences, Beijing 100190, P. R. China
  - School of Environment, Hangzhou Institute for Advanced Study, UCAS, Hangzhou 310012, P. R. China
  - College of Resources and Environment, University of Chinese Academy of Sciences, Beijing 100049, China
- Jianjie Fu
  - State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, P. R. China
  - School of Environment, Hangzhou Institute for Advanced Study, UCAS, Hangzhou 310012, P. R. China
  - College of Resources and Environment, University of Chinese Academy of Sciences, Beijing 100049, China
4
Gangwal A, Lavecchia A. Artificial intelligence in preclinical research: enhancing digital twins and organ-on-chip to reduce animal testing. Drug Discov Today 2025; 30:104360. [PMID: 40252989 DOI: 10.1016/j.drudis.2025.104360] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2025] [Revised: 03/28/2025] [Accepted: 04/10/2025] [Indexed: 04/21/2025]
Abstract
Artificial intelligence (AI) is reshaping preclinical drug research, offering innovative alternatives to traditional animal testing. Advanced techniques, including machine learning (ML), deep learning (DL), AI-powered digital twins (DTs), and AI-enhanced organ-on-a-chip (OoC) platforms, enable precise simulations of complex biological systems. AI plays a critical role in overcoming the limitations of DTs and OoC, improving their predictive power and scalability. These technologies facilitate early-stage, reliable evaluations of drug safety and efficacy, addressing ethical concerns, reducing costs, and accelerating drug development while adhering to the 3Rs principle (Replace, Reduce, Refine). By integrating AI with these advanced models, preclinical research can achieve greater accuracy and efficiency in drug discovery. This review examines the transformative impact of AI in preclinical research, highlighting its advancements, challenges, and the critical steps needed to establish AI as a cornerstone of ethical and efficient drug discovery.
Affiliation(s)
- Amit Gangwal
  - Department of Natural Product Chemistry, Shri Vile Parle Kelavani Mandal's Institute of Pharmacy, Dhule 424001 Maharashtra, India
- Antonio Lavecchia
  - "Drug Discovery" Laboratory, Department of Pharmacy, University of Naples Federico II, I-80131 Naples, Italy
5
Li J, Zhang J, Guo R, Dai J, Niu Z, Wang Y, Wang T, Jiang X, Hu W. Progress of machine learning in the application of small molecule druggability prediction. Eur J Med Chem 2025; 285:117269. [PMID: 39808972 DOI: 10.1016/j.ejmech.2025.117269] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2024] [Revised: 01/07/2025] [Accepted: 01/08/2025] [Indexed: 01/16/2025]
Abstract
Machine learning (ML) has become an important tool for predicting the pharmaceutical properties of small molecules. Recent advancements in ML algorithms enable the rapid and accurate evaluation of solubility, activity, toxicity, pharmacokinetics, and other molecular properties through ML-based models. By conducting virtual screening of drug targets and elucidating drug-target protein interactions, researchers can conduct preliminary evaluations of the activity and safety of compounds from ultra-large compound libraries, thereby accelerating the screening process for lead compounds. Moreover, ML leverages existing experimental data to train and generate new datasets, addressing the challenge of limited compound and protein target data. This review provides a concise overview of ML applications in predicting small molecule properties, focusing on model construction principles, molecular feature selection, and other essential aspects. It also discusses the potential applications of ML in the screening of pharmaceutical small molecules.
Affiliation(s)
- Junyao Li
  - School of Chemistry and Life Sciences, Suzhou University of Science and Technology, Suzhou, China; School of Life Sciences, Huaiyin Normal University, Huaian, 223300, China; Institute of Translational Medicine, School of Medicine, Yangzhou University, Yangzhou, 225009, China
- Jianmei Zhang
  - School of Chemistry and Life Sciences, Suzhou University of Science and Technology, Suzhou, China
- Rui Guo
  - School of Chemistry and Life Sciences, Suzhou University of Science and Technology, Suzhou, China; Institute of Translational Medicine, School of Medicine, Yangzhou University, Yangzhou, 225009, China
- Jiawei Dai
  - Institute of Translational Medicine, School of Medicine, Yangzhou University, Yangzhou, 225009, China
- Zhiqiang Niu
  - Institute of Translational Medicine, School of Medicine, Yangzhou University, Yangzhou, 225009, China
- Yan Wang
  - School of Chemistry and Life Sciences, Suzhou University of Science and Technology, Suzhou, China
- Taoyun Wang
  - School of Chemistry and Life Sciences, Suzhou University of Science and Technology, Suzhou, China
- Xiaojian Jiang
  - School of Life Sciences, Huaiyin Normal University, Huaian, 223300, China
- Weicheng Hu
  - Institute of Translational Medicine, School of Medicine, Yangzhou University, Yangzhou, 225009, China
6
Monem S, Abdel-Hamid AH, Hassanien AE. Drug toxicity prediction model based on enhanced graph neural network. Comput Biol Med 2025; 185:109614. [PMID: 39721415 DOI: 10.1016/j.compbiomed.2024.109614] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2024] [Revised: 12/15/2024] [Accepted: 12/21/2024] [Indexed: 12/28/2024]
Abstract
Prediction of drug toxicity remains a significant challenge and an essential process in drug discovery. Traditional machine learning algorithms struggle to capture the full scope of molecular structure features, limiting their effectiveness in toxicity prediction. Graph neural networks offer a promising solution by effectively extracting drug features from their molecular graphs. However, existing graph learning algorithms fail to account for the interaction features between graph nodes and the indirect edges connecting them. This paper proposes an enhanced graph neural network algorithm that employs multi-view features for each node, capturing the feature interactions between each node and its neighbors. Additionally, the adjacency matrix is preprocessed to handle indirect edge interactions. A pooling technique is then applied to aggregate node features, followed by normalization and an activation layer. To further enhance the proposed algorithm, multi-scale attention is applied to learn graph features at different scales, utilizing weights to understand intricate relationships among node feature vectors. The proposed algorithm is evaluated using eight toxicity datasets, covering binary classification, multi-task multi-class, and regression tasks. For binary classification, the Tox21, AMES, Skin reaction, Carcinogens, and DILI datasets are tested. For multi-task multi-class, the ToxCast dataset is applied, and for regression, the LD50 and hERG datasets are tested. The proposed algorithm is compared with five well-known algorithms: Graph Convolution Network, Graph Attention Network, Graph Isomorphism Network, Enhanced Graph Isomorphism Network, and Graph Total Variation. For the classification task, the proposed algorithm achieves ROC-AUC scores of 0.752 for Tox21, 0.775 for AMES, 0.707 for Skin reaction, 0.845 for Carcinogens, 0.92 for DILI, and 0.691 for the ToxCast dataset. For the regression task, the algorithm attains mean square errors of 0.896 for the LD50 dataset and 0.766 for the hERG dataset. These results demonstrate an improvement over the compared algorithms across all evaluated datasets.
Affiliation(s)
- Samar Monem
  - Mathematics and Computer Science Department, Faculty of Science, Beni-Suef University, 62521, Beni-Suef, Egypt
- Alaa H Abdel-Hamid
  - Mathematics and Computer Science Department, Faculty of Science, Beni-Suef University, 62521, Beni-Suef, Egypt
7
Hou Q, Li Y. Dual inhibition of AChE and MAO-B in Alzheimer's disease: machine learning approaches and model interpretations. Mol Divers 2025:10.1007/s11030-024-11061-x. [PMID: 39838228 DOI: 10.1007/s11030-024-11061-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2024] [Accepted: 11/20/2024] [Indexed: 01/23/2025]
Abstract
Alzheimer's disease (AD) is one of the most prevalent neurodegenerative diseases. Given the multifactorial pathophysiology of AD, monotargeted agents can only alleviate symptoms but not cure AD. Acetylcholinesterase (AChE) and monoamine oxidase B (MAO-B) are two key targets in the treatment of AD, and molecules that inhibit both targets are considered a promising avenue for developing more effective AD therapies. In the present work, a dual inhibition dataset containing 449 molecules was established. Based on it, five machine learning algorithms (KNN, SVM, RF, GBDT, and LGBM), four fingerprints (MACCS, ECFP4, RDKitFP, PubChemFP), and DRAGON descriptors were combined to develop 25 classification models, among which GBDT paired with ECFP4 and RF paired with PubChemFP achieved the same best performance across multiple metrics (Accuracy = 0.92, F1 Score = 0.94, MCC = 0.81). Moreover, based on the curated bioactivity datasets of AChE and MAO-B, regression models were developed to predict pIC50 values. For the AChE inhibition task, GBDT demonstrated the best performance (RMSE = 0.683, MAE = 0.500, R² = 0.721). The SVM algorithm emerged as the most effective for MAO-B inhibition (RMSE = 0.668, MAE = 0.507, R² = 0.675). The SHAP algorithm was used to interpret the optimal models, identifying and analyzing the key substructures and properties for both dual-target and single-target inhibitors. Moreover, molecular docking provided further insight into the potential mechanism and structure-activity relationships (SAR) of dual-target inhibition.
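The regression results above are reported as RMSE, MAE, and the coefficient of determination. As a hedged illustration of how these three numbers are conventionally computed (the pIC50 values below are made up, and this is not the authors' code), a minimal sketch:

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (RMSE, MAE, R^2) for paired measured and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n
    r2 = 1.0 - ss_res / ss_tot
    return rmse, mae, r2


# Hypothetical measured and predicted pIC50 values for four compounds.
measured = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 5.5, 7.5, 7.5]

rmse, mae, r2 = regression_metrics(measured, predicted)
print(f"RMSE={rmse:.3f}  MAE={mae:.3f}  R2={r2:.3f}")
```

Because pIC50 is the negative base-10 logarithm of the molar IC50, an RMSE of about 0.68 corresponds to predictions that are typically within a factor of roughly five of the measured IC50.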
Affiliation(s)
- Qinghe Hou
  - State Key Laboratory of Fine Chemicals, Dalian University of Technology, Dalian, 116024, Liaoning, China
- Yan Li
  - State Key Laboratory of Fine Chemicals, Dalian University of Technology, Dalian, 116024, Liaoning, China
8
Wang N, Li X, Xiao J, Liu S, Cao D. Data-driven toxicity prediction in drug discovery: Current status and future directions. Drug Discov Today 2024; 29:104195. [PMID: 39357621 DOI: 10.1016/j.drudis.2024.104195] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2024] [Revised: 09/13/2024] [Accepted: 09/26/2024] [Indexed: 10/04/2024]
Abstract
Early toxicity assessment plays a vital role in the drug discovery process on account of its significant influence on the attrition rate of candidates. Recently, constant upgrading of information technology has greatly promoted the continuous development of toxicity prediction. To give an overview of the current state of data-driven toxicity prediction, we reviewed relevant studies and summarized them in three main respects: the features and difficulties of toxicity prediction, the evolution of modeling approaches, and the available tools for toxicity prediction. For each part, we expound the research status, existing challenges, and feasible solutions. Finally, several new directions and suggestions for toxicity prediction are also put forward.
Affiliation(s)
- Ningning Wang
  - Department of Pharmacy, Xiangya Hospital, Central South University, Changsha 410008 Hunan, PR China; National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha 410008 Hunan, PR China; The Hunan Institute of Pharmacy Practice and Clinical Research, Changsha 410008 Hunan, PR China
- Xinliang Li
  - Department of Pharmacy, Xiangya Hospital, Central South University, Changsha 410008 Hunan, PR China; National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha 410008 Hunan, PR China; The Hunan Institute of Pharmacy Practice and Clinical Research, Changsha 410008 Hunan, PR China
- Jing Xiao
  - Hunan Institute for Drug Control, Changsha 410001 Hunan, PR China
- Shao Liu
  - Department of Pharmacy, Xiangya Hospital, Central South University, Changsha 410008 Hunan, PR China; National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha 410008 Hunan, PR China; The Hunan Institute of Pharmacy Practice and Clinical Research, Changsha 410008 Hunan, PR China
- Dongsheng Cao
  - Department of Pharmacy, Xiangya Hospital, Central South University, Changsha 410008 Hunan, PR China; National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha 410008 Hunan, PR China; Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, PR China
9
Monem S, Hassanien AE, Abdel-Hamid AH. A multi-task graph deep learning model to predict drugs combination of synergy and sensitivity scores. BMC Bioinformatics 2024; 25:327. [PMID: 39390357 PMCID: PMC11468365 DOI: 10.1186/s12859-024-05925-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2024] [Accepted: 09/06/2024] [Indexed: 10/12/2024] Open
Abstract
BACKGROUND Drug combination treatments have proven to be a realistic technique for treating challenging diseases such as cancer by enhancing efficacy and mitigating side effects. To achieve the therapeutic goals of these combinations, it is essential to employ multi-targeted drug combinations, which maximize effectiveness and synergistic effects. RESULTS This paper proposes 'MultiComb', a multi-task deep learning (MTDL) model designed to simultaneously predict the synergy and sensitivity of drug combinations. The model utilizes a graph convolution network to represent the Simplified Molecular-Input Line-Entry System (SMILES) strings of two drugs, generating their respective features. In addition, three fully connected subnetworks extract features of the cancer cell line. These drug and cell line features are then concatenated and processed through an attention mechanism, which outputs two optimized feature representations for the target tasks. The cross-stitch model learns the relationship between these tasks. Finally, each learned task feature is fed into fully connected subnetworks to predict the synergy and sensitivity scores. The proposed model is validated using the O'Neil benchmark dataset, which includes 38 unique drugs combined to form 17,901 drug combination pairs and tested across 37 unique cancer cells. The model's performance is assessed using mean square error (MSE), mean absolute error (MAE), coefficient of determination (R²), Spearman, and Pearson scores. For synergy, the proposed model achieves mean values of 232.37, 9.59, 0.57, 0.76, and 0.73 on these metrics, respectively. For sensitivity, the corresponding mean values are 15.59, 2.74, 0.90, 0.95, and 0.95. CONCLUSION This paper proposes an MTDL model to predict synergy and sensitivity scores for drug combinations targeting specific cancer cell lines. The MTDL model demonstrates superior performance compared to existing approaches.
Affiliation(s)
- Samar Monem
  - Mathematics and Computer Science Department, Faculty of Science, Beni-Suef University, Beni Suef, 62521, Egypt
  - Scientific Research School of Egypt (SRSEG), Cairo, Egypt
- Aboul Ella Hassanien
  - Faculty of Computer and AI, Cairo University, Cairo, Egypt
  - Scientific Research School of Egypt (SRSEG), Cairo, Egypt
- Alaa H Abdel-Hamid
  - Mathematics and Computer Science Department, Faculty of Science, Beni-Suef University, Beni Suef, 62521, Egypt
10
Toopradab B, Xie W, Duan L, Hengphasatporn K, Harada R, Sinsulpsiri S, Shigeta Y, Shi L, Maitarad P, Rungrotmongkol T. Machine learning-based QSAR and LB-PaCS-MD guided design of SARS-CoV-2 main protease inhibitors. Bioorg Med Chem Lett 2024; 110:129852. [PMID: 38925524 DOI: 10.1016/j.bmcl.2024.129852] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2024] [Revised: 05/17/2024] [Accepted: 06/19/2024] [Indexed: 06/28/2024]
Abstract
The global outbreak of the COVID-19 pandemic caused by the SARS-CoV-2 virus has led to profound respiratory health implications. This study focused on designing organoselenium-based inhibitors targeting the SARS-CoV-2 main protease (Mpro). The ligand-binding pathway sampling method based on parallel cascade selection molecular dynamics (LB-PaCS-MD) simulations was employed to elucidate plausible paths and conformations of ebselen, a synthetic organoselenium drug, within the Mpro catalytic site. Ebselen effectively engaged the active site, adopting proximity to H41 and interacting through the benzoisoselenazole ring in a π-π T-shaped arrangement, with an additional π-sulfur interaction with C145. In addition, ligand-based drug design using QSAR with GFA-MLR, RF, and ANN models was employed for biological activity prediction. The QSAR-ANN model showed robust statistical performance, with a training r² exceeding 0.98 and a test RMSE of 0.21, indicating its suitability for predicting biological activities. Integrating the ANN model with the LB-PaCS-MD insights enabled the rational design of novel compounds anchored in the ebselen core structure, identifying promising candidates with favorable predicted IC50 values. The designed compounds exhibited suitable drug-like characteristics and adopted an active conformation similar to ebselen, inhibiting Mpro function. These findings represent a synergistic approach merging ligand-based and structure-based drug design, with the potential to guide experimental synthesis and enzyme assay testing.
Affiliation(s)
- Borwornlak Toopradab
  - Program in Bioinformatics and Computational Biology, Graduate School, Chulalongkorn University, Bangkok 10330, Thailand; Center of Excellence in Biocatalyst and Sustainable Biotechnology, Department of Biochemistry, Faculty of Science, Chulalongkorn University, Bangkok 10330, Thailand
- Wanting Xie
  - Research Center of Nano Science and Technology, Department of Chemistry, College of Science, Shanghai University, Shanghai, 200444 PR China
- Lian Duan
  - Center for Computational Sciences (CCS), University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki 305-8577, Japan; Graduate School of Pure and Applied Sciences, University of Tsukuba, 1-1-1 Tennodai, Ibaraki 305-8571, Japan
- Kowit Hengphasatporn
  - Center for Computational Sciences (CCS), University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki 305-8577, Japan
- Ryuhei Harada
  - Center for Computational Sciences (CCS), University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki 305-8577, Japan
- Silpsiri Sinsulpsiri
  - Center of Excellence in Biocatalyst and Sustainable Biotechnology, Department of Biochemistry, Faculty of Science, Chulalongkorn University, Bangkok 10330, Thailand
- Yasuteru Shigeta
  - Center for Computational Sciences (CCS), University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki 305-8577, Japan
- Liyi Shi
  - Research Center of Nano Science and Technology, Department of Chemistry, College of Science, Shanghai University, Shanghai, 200444 PR China; Emerging Industries Institute Shanghai University, Jiaxing, Zhejiang 314006, PR China
- Phornphimon Maitarad
  - Research Center of Nano Science and Technology, Department of Chemistry, College of Science, Shanghai University, Shanghai, 200444 PR China
- Thanyada Rungrotmongkol
  - Program in Bioinformatics and Computational Biology, Graduate School, Chulalongkorn University, Bangkok 10330, Thailand; Center of Excellence in Biocatalyst and Sustainable Biotechnology, Department of Biochemistry, Faculty of Science, Chulalongkorn University, Bangkok 10330, Thailand
11
Chakraborty C, Bhattacharya M, Lee SS, Wen ZH, Lo YH. The changing scenario of drug discovery using AI to deep learning: Recent advancement, success stories, collaborations, and challenges. Molecular Therapy. Nucleic Acids 2024; 35:102295. [PMID: 39257717 PMCID: PMC11386122 DOI: 10.1016/j.omtn.2024.102295] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/12/2024]
Abstract
Due to the transformation of artificial intelligence (AI) tools and technologies, AI-driven drug discovery has come to the forefront. It reduces time and expenditure. Due to these advantages, pharmaceutical industries are concentrating on AI-driven drug discovery. Several drug molecules have been discovered using AI-based techniques and tools, and several newly AI-discovered drug molecules have already entered clinical trials. In this review, we first present the data and their resources in the pharmaceutical sector for AI-driven drug discovery and illustrate some significant AI and ML algorithms and techniques used in this field. We give an overview of deep neural network (NN) models and compare them with artificial NNs. Then, we illustrate recent advances in drug discovery using AI and deep learning, such as the identification of drug targets, prediction of their structure, estimation of drug-target interaction, estimation of drug-target binding affinity, de novo drug design, prediction of drug toxicity, estimation of absorption, distribution, metabolism, excretion, and toxicity, and estimation of drug-drug interaction. Moreover, we highlight the success stories of AI-driven drug discovery and discuss several collaborations and the challenges in this area. The discussions in the article will enrich the pharmaceutical industry.
Affiliation(s)
- Chiranjib Chakraborty
- Department of Biotechnology, School of Life Science and Biotechnology, Adamas University, Kolkata, West Bengal 700126, India
| | - Manojit Bhattacharya
- Department of Zoology, Fakir Mohan University, Vyasa Vihar, Balasore, Odisha 756020, India
| | - Sang-Soo Lee
- Institute for Skeletal Aging & Orthopedic Surgery, Hallym University-Chuncheon Sacred Heart Hospital, Chuncheon, Gangwon-Do 24252, Republic of Korea
| | - Zhi-Hong Wen
- Department of Marine Biotechnology and Resources, National Sun Yat-sen University, Kaohsiung 80424, Taiwan
| | - Yi-Hao Lo
- Department of Family Medicine, Zuoying Armed Forces General Hospital, Kaohsiung 813204, Taiwan
- Shu-Zen Junior College of Medicine and Management, Kaohsiung 821004, Taiwan
- Institute of Medical Science and Technology, National Sun Yat-sen University, Kaohsiung 804201, Taiwan
| |
|
12
|
Li B, Tan K, Lao AR, Wang H, Zheng H, Zhang L. A comprehensive review of artificial intelligence for pharmacology research. Front Genet 2024; 15:1450529. [PMID: 39290983 PMCID: PMC11405247 DOI: 10.3389/fgene.2024.1450529] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2024] [Accepted: 08/26/2024] [Indexed: 09/19/2024] Open
Abstract
With the innovation and advancement of artificial intelligence, more and more artificial intelligence techniques are employed in drug research, biomedical frontier research, and clinical medicine practice, especially in the field of pharmacology research. Thus, this review focuses on the applications of artificial intelligence in drug discovery, compound pharmacokinetic prediction, and clinical pharmacology. We briefly introduced the basic knowledge and development of artificial intelligence, presented a comprehensive review, and then summarized the latest studies and discussed the strengths and limitations of artificial intelligence models. Additionally, we highlighted several important studies and pointed out possible research directions.
Affiliation(s)
- Bing Li
- College of Computer Science, Sichuan University, Chengdu, China
| | - Kan Tan
- College of Computer Science, Sichuan University, Chengdu, China
| | - Angelyn R Lao
- Department of Mathematics and Statistics, De La Salle University, Manila, Philippines
| | - Haiying Wang
- School of Computing, Ulster University, Belfast, United Kingdom
| | - Huiru Zheng
- School of Computing, Ulster University, Belfast, United Kingdom
| | - Le Zhang
- College of Computer Science, Sichuan University, Chengdu, China
| |
|
13
|
Arab I, Laukens K, Bittremieux W. Semisupervised Learning to Boost hERG, Nav1.5, and Cav1.2 Cardiac Ion Channel Toxicity Prediction by Mining a Large Unlabeled Small Molecule Data Set. J Chem Inf Model 2024; 64:6410-6420. [PMID: 39110924 DOI: 10.1021/acs.jcim.4c01102] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/27/2024]
Abstract
Predicting drug toxicity is a critical aspect of ensuring patient safety during the drug design process. Although conventional machine learning techniques have shown some success in this field, the scarcity of annotated toxicity data poses a significant challenge in enhancing models' performance. In this study, we explore the potential of leveraging large unlabeled small molecule data sets using semisupervised learning to improve drug cardiotoxicity predictive performance across three cardiac ion channel targets: the voltage-gated potassium channel (hERG), the voltage-gated sodium channel (Nav1.5), and the voltage-gated calcium channel (Cav1.2). We extensively mined the ChEMBL database, comprising approximately 2 million small molecules, and then employed semisupervised learning to construct robust classification models for this purpose. We achieved a performance boost on highly diverse (i.e., structurally dissimilar) test data sets across all three targets. Using our built models, we screened the whole ChEMBL database and a large set of FDA-approved drugs, identifying several compounds with potential cardiac ion channel activity. To ensure broad accessibility and usability for both technical and nontechnical users, we developed a cross-platform graphical user interface that allows users to make predictions and gain insights into the cardiotoxicity of drugs and other small molecules. The software is made available as open source under the permissive MIT license at https://github.com/issararab/CToxPred2.
Affiliation(s)
- Issar Arab
- Department of Computer Science, University of Antwerp, 2020 Antwerp, Belgium
- Biomedical Informatics Network Antwerpen (biomina), 2020 Antwerp, Belgium
| | - Kris Laukens
- Department of Computer Science, University of Antwerp, 2020 Antwerp, Belgium
- Biomedical Informatics Network Antwerpen (biomina), 2020 Antwerp, Belgium
| | - Wout Bittremieux
- Department of Computer Science, University of Antwerp, 2020 Antwerp, Belgium
- Biomedical Informatics Network Antwerpen (biomina), 2020 Antwerp, Belgium
| |
|
14
|
Oh M, Shen M, Liu R, Stavitskaya L, Shen J. Machine Learned Classification of Ligand Intrinsic Activities at Human μ-Opioid Receptor. ACS Chem Neurosci 2024; 15:2842-2852. [PMID: 38990780 DOI: 10.1021/acschemneuro.4c00212] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/13/2024] Open
Abstract
Opioids are small-molecule agonists of μ-opioid receptor (μOR), while reversal agents such as naloxone are antagonists of μOR. Here, we developed machine learning (ML) models to classify the intrinsic activities of ligands at the human μOR based on the SMILES strings and two-dimensional molecular descriptors. We first manually curated a database of 983 small molecules with measured Emax values at the human μOR. Analysis of the chemical space allowed identification of dominant scaffolds and structurally similar agonists and antagonists. Decision tree models and directed message passing neural networks (MPNNs) were then trained to classify agonistic and antagonistic ligands. The hold-out test AUCs (areas under the receiver operator curves) of the extra-tree (ET) and MPNN models are 91.5 ± 3.9% and 91.8 ± 4.4%, respectively. To overcome the challenge of a small data set, a student-teacher learning method called tritraining with disagreement was tested using an unlabeled data set comprised of 15,816 ligands of human, mouse, and rat μOR, κOR, and δOR. We found that the tritraining scheme was able to increase the hold-out AUC of MPNN models to as high as 95.7%. Our work demonstrates the feasibility of developing ML models to accurately predict the intrinsic activities of μOR ligands, even with limited data. We envisage potential applications of these models in evaluating uncharacterized substances for public safety risks and discovering new therapeutic agents to counteract opioid overdoses.
Affiliation(s)
- Myongin Oh
- Division of Applied Regulatory Science, Office of Clinical Pharmacology, Center for Drug Evaluation and Research, United States Food and Drug Administration, Silver Spring, Maryland 20993, United States
- Department of Pharmaceutical Sciences, University of Maryland School of Pharmacy, Baltimore, Maryland 21201, United States
| | - Maximilian Shen
- Department of Electrical and Computer Engineering, University of Maryland, College Park, Maryland 20742, United States
| | - Ruibin Liu
- Department of Pharmaceutical Sciences, University of Maryland School of Pharmacy, Baltimore, Maryland 21201, United States
| | - Lidiya Stavitskaya
- Division of Applied Regulatory Science, Office of Clinical Pharmacology, Center for Drug Evaluation and Research, United States Food and Drug Administration, Silver Spring, Maryland 20993, United States
| | - Jana Shen
- Department of Pharmaceutical Sciences, University of Maryland School of Pharmacy, Baltimore, Maryland 21201, United States
| |
|
15
|
Yang Z, Wang Y, Du G, Zhan Y, Zhan W. Prediction method of pharmacokinetic parameters of small molecule drugs based on GCN network model. J Mol Model 2024; 30:264. [PMID: 38995407 DOI: 10.1007/s00894-024-06051-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2024] [Accepted: 06/26/2024] [Indexed: 07/13/2024]
Abstract
CONTEXT: Accurately predicting plasma protein binding rate (PPBR) and oral bioavailability (OBA) helps to better reveal the absorption and distribution of drugs in the human body and subsequent drug design. Although machine learning models have achieved good results in prediction accuracy, they often suffer from insufficient accuracy when dealing with data with irregular topological structures. METHODS: In view of this, this study proposes a pharmacokinetic parameter prediction framework based on graph convolutional networks (GCN), which predicts the PPBR and OBA of small molecule drugs. In the framework, GCN is first used to extract spatial feature information on the topological structure of drug molecules, in order to better learn node features and association information between nodes. Then, based on the principle of drug similarity, this study calculates the similarity between small molecule drugs, selects different thresholds to construct datasets, and establishes a prediction model centered on the GCN algorithm. The experimental results show that compared with traditional machine learning prediction models, the prediction model constructed based on the GCN method performs best on PPBR and OBA datasets with an inter-molecular similarity threshold of 0.25, with MAE of 0.155 and 0.167, respectively. In addition, in order to further improve the accuracy of the prediction model, GCN is combined with other algorithms. Compared to using a single GCN method, the distribution of the predicted values obtained by the combined model is highly consistent with the true values. In summary, this work provides a new method for improving the rate of early drug screening in the future.
Affiliation(s)
- Zhihua Yang
- Department of Radiation Oncology, General Hospital of Ningxia Medical University, Yinchuan, 750004, China
| | - Ying Wang
- Engineering Research Center of Molecular and Neuro Imaging of the Ministry of Education, School of Life Science and Technology, Xidian University, Xi'an, 710071, Shaanxi, China
| | - Getao Du
- Engineering Research Center of Molecular and Neuro Imaging of the Ministry of Education, School of Life Science and Technology, Xidian University, Xi'an, 710071, Shaanxi, China
| | - Yonghua Zhan
- Engineering Research Center of Molecular and Neuro Imaging of the Ministry of Education, School of Life Science and Technology, Xidian University, Xi'an, 710071, Shaanxi, China.
| | - Wenhua Zhan
- Department of Radiation Oncology, General Hospital of Ningxia Medical University, Yinchuan, 750004, China.
| |
|
16
|
Zhou Y, Ning C, Tan Y, Li Y, Wang J, Shu Y, Liang S, Liu Z, Wang Y. ToxMPNN: A deep learning model for small molecule toxicity prediction. J Appl Toxicol 2024; 44:953-964. [PMID: 38409892 DOI: 10.1002/jat.4591] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2023] [Revised: 01/23/2024] [Accepted: 02/02/2024] [Indexed: 02/28/2024]
Abstract
Machine learning (ML) has shown great promise in predicting toxicity of small molecules. However, the availability of data for such predictions is often limited. Because of the unsatisfactory performance of models trained on a single toxicity endpoint, we collected toxic small molecules with multiple toxicity endpoints from a previous study. The dataset comprises 27 toxic endpoints categorized into seven toxicity classes, namely, carcinogenicity and mutagenicity, acute oral toxicity, respiratory toxicity, irritation and corrosion, cardiotoxicity, CYP450, and endocrine disruption. In addition, a binary classification Common-Toxicity task was added based on the aforementioned dataset. To improve the performance of the models, we added marketed drugs as negative samples. This study presents a toxicity predictive model, ToxMPNN, based on the message passing neural network (MPNN) architecture, aiming to predict the toxicity of small molecules. The results demonstrate that ToxMPNN outperforms other models in capturing toxic features within the molecular structure, resulting in more precise predictions with a ROC_AUC testing score of 0.886 for the Toxicity_drug dataset. Furthermore, it was observed that adding marketed drugs as negative samples not only improves the predictive performance of the binary classification Common-Toxicity task but also enhances the stability of the model prediction. This shows that the graph-based deep learning (DL) algorithms in this study can be used as a trustworthy and effective tool to assess small molecule toxicity in the development of new drugs.
Affiliation(s)
- Yini Zhou
- The National and Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha, China
- Institute of Interdisciplinary Studies, Hunan Normal University, Changsha, China
- Peptide and Small Molecule Drug R&D Platform, Furong Laboratory, Hunan Normal University, Changsha, China
| | - Chao Ning
- The National and Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha, China
- Institute of Interdisciplinary Studies, Hunan Normal University, Changsha, China
- Peptide and Small Molecule Drug R&D Platform, Furong Laboratory, Hunan Normal University, Changsha, China
| | - Yijun Tan
- School of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha, China
| | - Yaqi Li
- The National and Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha, China
- Institute of Interdisciplinary Studies, Hunan Normal University, Changsha, China
- Peptide and Small Molecule Drug R&D Platform, Furong Laboratory, Hunan Normal University, Changsha, China
| | - Jiaxu Wang
- The National and Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha, China
- Institute of Interdisciplinary Studies, Hunan Normal University, Changsha, China
- Peptide and Small Molecule Drug R&D Platform, Furong Laboratory, Hunan Normal University, Changsha, China
| | - Yuanyuan Shu
- The National and Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha, China
- Institute of Interdisciplinary Studies, Hunan Normal University, Changsha, China
- Peptide and Small Molecule Drug R&D Platform, Furong Laboratory, Hunan Normal University, Changsha, China
| | - Songping Liang
- The National and Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha, China
- Institute of Interdisciplinary Studies, Hunan Normal University, Changsha, China
- Peptide and Small Molecule Drug R&D Platform, Furong Laboratory, Hunan Normal University, Changsha, China
| | - Zhonghua Liu
- The National and Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha, China
- Institute of Interdisciplinary Studies, Hunan Normal University, Changsha, China
- Peptide and Small Molecule Drug R&D Platform, Furong Laboratory, Hunan Normal University, Changsha, China
| | - Ying Wang
- The National and Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha, China
- Institute of Interdisciplinary Studies, Hunan Normal University, Changsha, China
- Peptide and Small Molecule Drug R&D Platform, Furong Laboratory, Hunan Normal University, Changsha, China
| |
|
17
|
Anandhi G, Iyapparaja M. Systematic approaches to machine learning models for predicting pesticide toxicity. Heliyon 2024; 10:e28752. [PMID: 38576573 PMCID: PMC10990867 DOI: 10.1016/j.heliyon.2024.e28752] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2023] [Revised: 03/13/2024] [Accepted: 03/24/2024] [Indexed: 04/06/2024] Open
Abstract
Pesticides play an important role in modern agriculture by protecting crops from pests and diseases. However, the negative consequences of pesticides, such as environmental contamination and adverse effects on human and ecological health, underscore the importance of accurate toxicity predictions. To address this issue, artificial intelligence models have emerged as valuable methods for predicting the toxicity of organic compounds. In this review article, we explore the application of machine learning (ML) for pesticide toxicity prediction. This review provides a detailed summary of recent developments, prediction models, and datasets used for pesticide toxicity prediction. In this analysis, we compared the results of several algorithms that predict the harmfulness of various classes of pesticides. Furthermore, this review article identified emerging trends and areas for future direction, showcasing the transformative potential of machine learning in promoting safer pesticide usage and sustainable agriculture.
Affiliation(s)
- Ganesan Anandhi
- Department of Smart Computing, School of Computer Science Engineering and Information Systems, Vellore Institute of Technology, Vellore 632014, Tamil Nadu, India
| | - M. Iyapparaja
- Department of Smart Computing, School of Computer Science Engineering and Information Systems, Vellore Institute of Technology, Vellore 632014, Tamil Nadu, India
| |
|
18
|
Di Stefano M, Galati S, Piazza L, Granchi C, Mancini S, Fratini F, Macchia M, Poli G, Tuccinardi T. VenomPred 2.0: A Novel In Silico Platform for an Extended and Human Interpretable Toxicological Profiling of Small Molecules. J Chem Inf Model 2024; 64:2275-2289. [PMID: 37676238 PMCID: PMC11005041 DOI: 10.1021/acs.jcim.3c00692] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Received: 05/08/2023] [Indexed: 09/08/2023]
Abstract
The application of artificial intelligence and machine learning (ML) methods is becoming increasingly popular in computational toxicology and drug design; it is considered a promising solution for assessing the safety profile of compounds, particularly in lead optimization and ADMET studies, and to meet the principles of the 3Rs, which call for the replacement, reduction, and refinement of animal testing. In this context, we herein present the development of VenomPred 2.0 (http://www.mmvsl.it/wp/venompred2/), the new and improved version of our free-of-charge web tool for toxicological predictions, which now represents a powerful web-based platform for multifaceted and human-interpretable in silico toxicity profiling of chemicals. VenomPred 2.0 presents an extended set of toxicity endpoints (androgenicity, skin irritation, eye irritation, and acute oral toxicity, in addition to the already available carcinogenicity, mutagenicity, hepatotoxicity, and estrogenicity) that can be evaluated through an exhaustive consensus prediction strategy based on multiple ML models. Moreover, we also implemented a new utility based on the Shapley Additive exPlanations (SHAP) method that allows human interpretable toxicological profiling of small molecules, highlighting the features that strongly contribute to the toxicological predictions in order to derive structural toxicophores.
Affiliation(s)
- Miriana Di Stefano
- Department of Pharmacy, University of Pisa, Via Bonanno 6, 56126 Pisa, Italy
- Department of Life Sciences, University of Siena, 53100 Siena, Italy
| | - Salvatore Galati
- Department of Pharmacy, University of Pisa, Via Bonanno 6, 56126 Pisa, Italy
| | - Lisa Piazza
- Department of Pharmacy, University of Pisa, Via Bonanno 6, 56126 Pisa, Italy
| | - Carlotta Granchi
- Department of Pharmacy, University of Pisa, Via Bonanno 6, 56126 Pisa, Italy
| | - Simone Mancini
- Department of Veterinary Sciences, University of Pisa, Viale Delle Piagge 2, 56124 Pisa, Italy
| | - Filippo Fratini
- Department of Veterinary Sciences, University of Pisa, Viale Delle Piagge 2, 56124 Pisa, Italy
| | - Marco Macchia
- Department of Pharmacy, University of Pisa, Via Bonanno 6, 56126 Pisa, Italy
| | - Giulio Poli
- Department of Pharmacy, University of Pisa, Via Bonanno 6, 56126 Pisa, Italy
| | - Tiziano Tuccinardi
- Department of Pharmacy, University of Pisa, Via Bonanno 6, 56126 Pisa, Italy
| |
|
19
|
Song Z, Chen J, Cheng J, Chen G, Qi Z. Computer-Aided Molecular Design of Ionic Liquids as Advanced Process Media: A Review from Fundamentals to Applications. Chem Rev 2024; 124:248-317. [PMID: 38108629 DOI: 10.1021/acs.chemrev.3c00223] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/19/2023]
Abstract
The unique physicochemical properties, flexible structural tunability, and giant chemical space of ionic liquids (ILs) provide them a great opportunity to match different target properties to work as advanced process media. The crux of the matter is how to efficiently and reliably tailor suitable ILs toward a specific application. In this regard, the computer-aided molecular design (CAMD) approach has been widely adapted to cover this family of high-profile chemicals, that is, to perform computer-aided IL design (CAILD). This review discusses the past developments that have contributed to the state-of-the-art of CAILD and provides a perspective about how future works could pursue the acceleration of the practical application of ILs. In a broad context of CAILD, key aspects related to the forward structure-property modeling and reverse molecular design of ILs are overviewed. For the former forward task, diverse IL molecular representations, modeling algorithms, as well as representative models on physical properties, thermodynamic properties, among others of ILs are introduced. For the latter reverse task, representative works formulating different molecular design scenarios are summarized. Beyond the substantial progress made, some future perspectives to move CAILD a step forward are finally provided.
Affiliation(s)
- Zhen Song
- State Key Laboratory of Chemical Engineering, School of Chemical Engineering, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
| | - Jiahui Chen
- State Key Laboratory of Chemical Engineering, School of Chemical Engineering, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
| | - Jie Cheng
- State Key Laboratory of Chemical Engineering, School of Chemical Engineering, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
| | - Guzhong Chen
- State Key Laboratory of Chemical Engineering, School of Chemical Engineering, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
| | - Zhiwen Qi
- State Key Laboratory of Chemical Engineering, School of Chemical Engineering, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
| |
|
20
|
Lv Q, Zhou F, Liu X, Zhi L. Artificial intelligence in small molecule drug discovery from 2018 to 2023: Does it really work? Bioorg Chem 2023; 141:106894. [PMID: 37776682 DOI: 10.1016/j.bioorg.2023.106894] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/15/2023] [Revised: 09/24/2023] [Accepted: 09/25/2023] [Indexed: 10/02/2023]
Abstract
Utilizing artificial intelligence (AI) in drug design represents an advanced approach for identifying targets and developing new drugs. Integrating AI techniques significantly reduces the workload involved in drug development and enhances the efficiency of early-stage drug discovery. This review aims to present a comprehensive overview of the utilization of AI methods in the field of small drug design, with a specific focus on four key areas: protein structure prediction, molecular virtual screening, molecular design, and absorption, distribution, metabolism, excretion, and toxicity (ADMET) prediction. Additionally, the role and limitations of AI in drug development are explored, and the impact of AI on decision-making processes is studied. It is important to note that while AI can bring numerous benefits to the early stage of drug development, the direction and quality of decision-making should still be emphasized, as AI should be considered as a tool rather than a decisive factor.
Affiliation(s)
- Qi Lv
- School of Pharmacy, Inflammation and Immune Mediated Diseases Laboratory of Anhui Province, Hefei 230032, PR China
| | - Feilong Zhou
- School of Pharmacy, Inflammation and Immune Mediated Diseases Laboratory of Anhui Province, Hefei 230032, PR China
| | - Xinhua Liu
- School of Pharmacy, Inflammation and Immune Mediated Diseases Laboratory of Anhui Province, Hefei 230032, PR China.
| | - Liping Zhi
- School of Health Management, Anhui Medical University, Hefei 230032, PR China.
| |
|
21
|
Guo W, Liu J, Dong F, Song M, Li Z, Khan MKH, Patterson TA, Hong H. Review of machine learning and deep learning models for toxicity prediction. Exp Biol Med (Maywood) 2023; 248:1952-1973. [PMID: 38057999 PMCID: PMC10798180 DOI: 10.1177/15353702231209421] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/08/2023] Open
Abstract
The ever-increasing number of chemicals has raised public concerns due to their adverse effects on human health and the environment. To protect public health and the environment, it is critical to assess the toxicity of these chemicals. Traditional in vitro and in vivo toxicity assays are complicated, costly, and time-consuming and may face ethical issues. These constraints raise the need for alternative methods for assessing the toxicity of chemicals. Recently, due to the advancement of machine learning algorithms and the increase in computational power, many toxicity prediction models have been developed using various machine learning and deep learning algorithms such as support vector machine, random forest, k-nearest neighbors, ensemble learning, and deep neural network. This review summarizes the machine learning- and deep learning-based toxicity prediction models developed in recent years. Support vector machine and random forest are the most popular machine learning algorithms, and hepatotoxicity, cardiotoxicity, and carcinogenicity are the frequently modeled toxicity endpoints in predictive toxicology. It is known that datasets impact model performance. The quality of datasets used in the development of toxicity prediction models using machine learning and deep learning is vital to the performance of the developed models. Different toxicity assignments for the same chemicals have been observed among different datasets of the same type of toxicity, indicating that benchmarking datasets are needed for developing reliable toxicity prediction models using machine learning and deep learning algorithms. This review provides insights into current machine learning models in predictive toxicology, which are expected to promote the development and application of toxicity prediction models in the future.
Affiliation(s)
- Wenjing Guo
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
| | - Jie Liu
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
| | - Fan Dong
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
| | - Meng Song
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
| | - Zoe Li
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
| | - Md Kamrul Hasan Khan
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
| | - Tucker A Patterson
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
| | - Huixiao Hong
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
| |
|
22
|
Sinha K, Ghosh N, Sil PC. A Review on the Recent Applications of Deep Learning in Predictive Drug Toxicological Studies. Chem Res Toxicol 2023; 36:1174-1205. [PMID: 37561655 DOI: 10.1021/acs.chemrestox.2c00375] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Indexed: 08/12/2023]
Abstract
Drug toxicity prediction is an important step in ensuring patient safety during drug design studies. While traditional preclinical studies have historically relied on animal models to evaluate toxicity, recent advances in deep-learning approaches have shown great promise in advancing drug safety science and reducing animal use in preclinical studies. However, deep-learning-based approaches also face challenges in handling large biological data sets, model interpretability, and regulatory acceptance. In this review, we provide an overview of recent developments in deep-learning-based approaches for predicting drug toxicity, highlighting their potential advantages over traditional methods and the need to address their limitations. Deep-learning models have demonstrated excellent performance in predicting toxicity outcomes from various data sources such as chemical structures, genomic data, and high-throughput screening assays. The potential of deep learning for automated feature engineering is also discussed. This review emphasizes the need to address ethical concerns related to the use of deep learning in drug toxicity studies, including the reduction of animal use and ensuring regulatory acceptance. Furthermore, emerging applications of deep learning in drug toxicity prediction, such as predicting drug-drug interactions and toxicity in rare subpopulations, are highlighted. The integration of deep-learning-based approaches with traditional methods is discussed as a way to develop more reliable and efficient predictive models for drug safety assessment, paving the way for safer and more effective drug discovery and development. Overall, this review highlights the critical role of deep learning in predictive toxicology and drug safety evaluation, emphasizing the need for continued research and development in this rapidly evolving field. By addressing the limitations of traditional methods, leveraging the potential of deep learning for automated feature engineering, and addressing ethical concerns, deep-learning-based approaches have the potential to revolutionize drug toxicity prediction and improve patient safety in drug discovery and development.
Affiliation(s)
- Krishnendu Sinha
  - Department of Zoology, Jhargram Raj College, Jhargram 721507, West Bengal, India
- Nabanita Ghosh
  - Department of Zoology, Maulana Azad College, Kolkata 700013, West Bengal, India
- Parames C Sil
  - Division of Molecular Medicine, Bose Institute, Kolkata 700054, West Bengal, India
|
23
|
Zhang H, Saravanan KM, Zhang JZH. DeepBindGCN: Integrating Molecular Vector Representation with Graph Convolutional Neural Networks for Protein-Ligand Interaction Prediction. Molecules 2023; 28:4691. [PMID: 37375246 PMCID: PMC10301867 DOI: 10.3390/molecules28124691] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2023] [Revised: 06/08/2023] [Accepted: 06/09/2023] [Indexed: 06/29/2023] Open
Abstract
The core of large-scale virtual drug screening is to accurately and efficiently select high-affinity binders from large libraries of small molecules in which non-binders usually dominate. The binding affinity is significantly influenced by the protein pocket, ligand spatial information, and residue types/atom types. Here, we used the pocket residues or ligand atoms as the nodes and constructed edges from the neighboring information to comprehensively represent the protein pocket or ligand. Moreover, the model with pre-trained molecular vectors performed better than the one with a one-hot representation. The main advantage of DeepBindGCN is that it is independent of the docking conformation and concisely preserves the spatial information and physical-chemical features. Using TIPE3 and the PD-L1 dimer as proof-of-concept examples, we proposed a screening pipeline integrating DeepBindGCN and other methods to identify compounds with strong binding affinity. This is the first time a non-complex-dependent model has achieved a root mean square error (RMSE) of 1.4190 and a Pearson r of 0.7584 on the PDBbind v.2016 core set, showing prediction power comparable to that of state-of-the-art affinity prediction models that rely on the 3D complex. DeepBindGCN provides a powerful tool for predicting protein-ligand interactions and can be used in many important large-scale virtual screening scenarios.
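The abstract above reports two evaluation metrics, RMSE and Pearson r. As a minimal sketch of how such values are computed from measured and predicted affinities, the following uses only the Python standard library. The function names and the toy numbers are assumptions for illustration; they are not taken from the DeepBindGCN paper or its code.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(y_true)
    if n != len(y_pred) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    if var_t == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_t * var_p)


# Hypothetical affinity values, for illustration only.
measured = [4.0, 5.0, 6.0, 7.0]
predicted = [4.5, 4.5, 6.5, 6.5]

print(f"RMSE = {rmse(measured, predicted):.4f}")
print(f"Pearson r = {pearson_r(measured, predicted):.4f}")
```

A lower RMSE and a Pearson r closer to 1 both indicate better agreement between predicted and measured affinities, which is how the figures quoted in the abstract should be read.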
Affiliation(s)
- Haiping Zhang
  - Shenzhen Institute of Synthetic Biology, Faculty of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Konda Mani Saravanan
  - Department of Biotechnology, Bharath Institute of Higher Education and Research, Chennai 600073, Tamil Nadu, India
- John Z. H. Zhang
  - Shenzhen Institute of Synthetic Biology, Faculty of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
  - School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, China
  - NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
|
24
|
Wang X, Wang L, Wang S, Ren Y, Chen W, Li X, Han P, Song T. QuantumTox: Utilizing quantum chemistry with ensemble learning for molecular toxicity prediction. Comput Biol Med 2023; 157:106744. [PMID: 36947905 DOI: 10.1016/j.compbiomed.2023.106744] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/26/2022] [Revised: 02/16/2023] [Accepted: 03/04/2023] [Indexed: 03/11/2023]
Abstract
Molecular toxicity prediction plays an important role in drug discovery and is directly related to human health and drug fate. Accurately determining the toxicity of molecules can help weed out low-quality molecules early in the drug discovery process and avoid attrition later in drug development. A growing number of researchers use machine learning methods to predict the toxicity of molecules, but these models do not fully exploit the 3D information of molecules. Quantum chemical information, which provides stereo structural information of molecules, can influence their toxicity. To this end, we propose QuantumTox, which, in contrast to existing work, applies quantum chemistry to drug molecule toxicity prediction for the first time. We extract the quantum chemical information of molecules as their 3D features. In the downstream prediction phase, we use Gradient Boosting Decision Tree and Bagging ensemble learning methods together to improve the accuracy and generalization of the model. A series of experiments on various tasks shows that our model consistently outperforms the baseline model and still performs well on small datasets of fewer than 300 samples.
Affiliation(s)
- Xun Wang
  - College of Computer Science and Technology, China University of Petroleum, Qingdao 266580, China
- Lulu Wang
  - College of Computer Science and Technology, China University of Petroleum, Qingdao 266580, China
- Shuang Wang
  - College of Computer Science and Technology, China University of Petroleum, Qingdao 266580, China
- Yongqi Ren
  - College of Computer Science and Technology, China University of Petroleum, Qingdao 266580, China
- Wenqi Chen
  - College of Computer Science and Technology, China University of Petroleum, Qingdao 266580, China
- Xue Li
  - College of Computer Science and Technology, China University of Petroleum, Qingdao 266580, China
- Peifu Han
  - College of Computer Science and Technology, China University of Petroleum, Qingdao 266580, China
- Tao Song
  - College of Computer Science and Technology, China University of Petroleum, Qingdao 266580, China
|
25
|
Liu J, Lei X, Zhang Y, Pan Y. The prediction of molecular toxicity based on BiGRU and GraphSAGE. Comput Biol Med 2023; 153:106524. [PMID: 36623439 DOI: 10.1016/j.compbiomed.2022.106524] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 11/21/2022] [Revised: 12/10/2022] [Accepted: 12/31/2022] [Indexed: 01/04/2023]
Abstract
The prediction of molecular toxicity properties plays a crucial role in drug discovery, since it can swiftly screen for the expected drug molecules. The conventional method for predicting toxicity is to run in vivo or in vitro biological experiments in the laboratory, which can easily incur significant time and financial costs and even raise ethical issues. Therefore, using computational approaches to predict molecular toxicity has become a common strategy in modern drug discovery. In this article, we propose a novel model named MTBG, which primarily makes use of both SMILES (simplified molecular-input line-entry system) strings and graph structures of molecules to extract drug molecular features for drug molecular toxicity prediction. To verify the performance of the MTBG model, we select the Tox21 dataset and several widely used baseline models. Experimental results demonstrate that our model performs better than these baseline models.
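The title names GraphSAGE as the graph component of this model, but the abstract does not describe how it is configured. As a rough illustration only, the sketch below shows one GraphSAGE-style mean aggregation step on a toy molecular graph: each atom's feature vector is concatenated with the mean of its neighbours' vectors. The function name, the one-hot atom features, and the ethanol example are assumptions, and the learned weight matrix and nonlinearity of a real GraphSAGE layer are deliberately omitted.

```python
def mean_aggregate(features, adjacency):
    """One simplified GraphSAGE-style step: concat(own features, mean of neighbours).

    features:  dict mapping node id -> list of floats (all the same length)
    adjacency: dict mapping node id -> list of neighbouring node ids
    """
    updated = {}
    for node, own in features.items():
        neighbours = adjacency.get(node, [])
        if neighbours:
            mean = [
                sum(features[n][i] for n in neighbours) / len(neighbours)
                for i in range(len(own))
            ]
        else:
            # Isolated node: use a zero vector as the neighbourhood summary.
            mean = [0.0] * len(own)
        updated[node] = own + mean  # list concatenation
    return updated


# Toy graph for the heavy atoms of ethanol (SMILES "CCO"): C0 - C1 - O2.
# Hypothetical one-hot features: [is_carbon, is_oxygen].
atom_features = {0: [1.0, 0.0], 1: [1.0, 0.0], 2: [0.0, 1.0]}
bonds = {0: [1], 1: [0, 2], 2: [1]}

for atom, vector in mean_aggregate(atom_features, bonds).items():
    print(atom, vector)
```

In this toy case the middle carbon ends up with a neighbourhood summary of `[0.5, 0.5]`, reflecting one carbon and one oxygen neighbour, which is the kind of local structural signal a graph layer passes on to later stages.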
Affiliation(s)
- Jianping Liu
  - School of Computer Science, Shaanxi Normal University, Xi'an, 710119, China
- Xiujuan Lei
  - School of Computer Science, Shaanxi Normal University, Xi'an, 710119, China
- Yuchen Zhang
  - School of Computer Science, Shaanxi Normal University, Xi'an, 710119, China
- Yi Pan
  - Faculty of Computer Science and Control Engineering, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
|
26
|
Xie Y, Zhang Y, Wong KC, Shi M, Peng C. Improving Chemical Reaction Prediction with Unlabeled Data. Molecules 2022; 27:5967. [PMID: 36144703 PMCID: PMC9506495 DOI: 10.3390/molecules27185967] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2022] [Revised: 09/04/2022] [Accepted: 09/08/2022] [Indexed: 11/18/2022] Open
Abstract
Predicting products of organic chemical reactions is useful in chemical sciences, especially when one or more reactants are new organics. However, the performance of traditional learning models heavily relies on high-quality labeled data. In this work, to utilize unlabeled data for better prediction performance, we propose a method that combines semi-supervised learning with graph convolutional neural networks for chemical reaction prediction. First, we propose a Mean Teacher Weisfeiler–Lehman Network to find the reaction centers. Then, we construct the candidate product set. Finally, we use an Improved Weisfeiler–Lehman Difference Network to rank candidate products. Experimental results demonstrate that, with 400k labeled data, our framework can improve the top-5 accuracy by 0.7% using 35k unlabeled data. When the proportion of unlabeled data increases, the performance gain can be larger. For example, with 80k labeled data and 35k unlabeled data, the performance gain with our framework can be 1.8%.
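The abstract above builds on the Mean Teacher idea from semi-supervised learning, in which a teacher model's weights track an exponential moving average (EMA) of the student's weights and a consistency loss penalises disagreement between the two on unlabeled inputs. The sketch below illustrates only those two generic ingredients with plain Python dictionaries and lists. The function names, the smoothing factor `alpha`, and the toy values are assumptions; the paper's Weisfeiler–Lehman networks are not reproduced here.

```python
def ema_update(teacher, student, alpha=0.99):
    """Return new teacher weights as an EMA of the student weights.

    teacher, student: dicts mapping parameter name -> float
    alpha: smoothing factor (assumed value); higher means a slower-moving teacher
    """
    if teacher.keys() != student.keys():
        raise ValueError("teacher and student must share parameter names")
    return {
        name: alpha * teacher[name] + (1.0 - alpha) * student[name]
        for name in teacher
    }


def consistency_loss(student_probs, teacher_probs):
    """Mean squared difference between student and teacher predictions."""
    if len(student_probs) != len(teacher_probs) or not student_probs:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [(s - t) ** 2 for s, t in zip(student_probs, teacher_probs)]
    return sum(diffs) / len(diffs)


# Toy parameters and predictions, for illustration only.
teacher_weights = {"w": 1.0, "b": 0.0}
student_weights = {"w": 0.0, "b": 1.0}

teacher_weights = ema_update(teacher_weights, student_weights, alpha=0.9)
print(teacher_weights)

# On an unlabeled example, the student is pulled towards the teacher's output.
print(consistency_loss([0.8, 0.2], [0.6, 0.4]))
```

Because the consistency loss needs no labels, it can be evaluated on the unlabeled reactions, which is what lets the extra 35k unlabeled examples mentioned in the abstract contribute to training.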
Affiliation(s)
- Yu Xie
  - College of Information Science and Engineering, Ningbo University, Ningbo 315211, China
- Yuyang Zhang
  - College of Information Science and Engineering, Ningbo University, Ningbo 315211, China
- Ka-Chun Wong
  - Department of Computer Science, City University of Hong Kong, Hong Kong 999077, China
- Meixia Shi
  - College of Chemical Engineering, Ningbo Polytechnic, Ningbo 315000, China
- Chengbin Peng
  - College of Information Science and Engineering, Ningbo University, Ningbo 315211, China
  - Correspondence:
|