1
Tran TTV, Tayara H, Chong KT. AMPred-CNN: Ames mutagenicity prediction model based on convolutional neural networks. Comput Biol Med 2024; 176:108560. [PMID: 38754218 DOI: 10.1016/j.compbiomed.2024.108560] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2024] [Revised: 04/15/2024] [Accepted: 05/05/2024] [Indexed: 05/18/2024]
Abstract
Mutagenicity assessment plays a pivotal role in the safety evaluation of chemicals, pharmaceuticals, and environmental compounds. In recent years, the development of robust computational models for predicting chemical mutagenicity has gained significant attention, driven by the need for efficient and cost-effective toxicity assessments. In this paper, we proposed AMPred-CNN, an innovative Ames mutagenicity prediction model based on Convolutional Neural Networks (CNNs), uniquely employing molecular structures as images to leverage CNNs' powerful feature extraction capabilities. The study employs the widely used benchmark mutagenicity dataset from Hansen et al. for model development and evaluation. Comparative analyses with traditional ML models on different molecular features reveal substantial performance enhancements. AMPred-CNN outshines these models, demonstrating superior accuracy, AUC, F1 score, MCC, sensitivity, and specificity on the test set. Notably, AMPred-CNN is further benchmarked against seven recent ML and DL models, consistently showcasing superior performance with an impressive AUC of 0.954. Our study highlights the effectiveness of CNNs in advancing mutagenicity prediction, paving the way for broader applications in toxicology and drug development.
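The metrics named in this abstract (accuracy, F1 score, MCC, sensitivity, specificity) all follow from a binary confusion matrix. The sketch below shows one way to compute them; the function name `binary_metrics` and the counts are hypothetical illustrations, not values or code from the paper, and it assumes both classes are present and at least one positive prediction is made.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true positive rate (recall)
    specificity = tn / (tn + fp)          # true negative rate
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    # Matthews correlation coefficient; defined as 0 when the denominator vanishes
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "f1": f1,
        "mcc": mcc,
        "sensitivity": sensitivity,
        "specificity": specificity,
    }

# Hypothetical counts for a 200-compound test set (not from the paper)
metrics = binary_metrics(tp=80, fp=10, tn=90, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

AUC is not included here because it depends on the ranking of predicted scores rather than on a single thresholded confusion matrix.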
Affiliation(s)
- Thi Tuyet Van Tran
- Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju 54896, Republic of Korea; Faculty of Information Technology, An Giang University, Long Xuyen 880000, Viet Nam; Vietnam National University-Ho Chi Minh City, Ho Chi Minh 700000, Viet Nam.
- Hilal Tayara
- School of International Engineering and Science, Jeonbuk National University, Jeonju 54896, Republic of Korea.
- Kil To Chong
- Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju 54896, Republic of Korea; Advances Electronics and Information Research Center, Jeonbuk National University, Jeonju 54896, Republic of Korea.
2
Amorim AMB, Piochi LF, Gaspar AT, Preto AJ, Rosário-Ferreira N, Moreira IS. Advancing Drug Safety in Drug Development: Bridging Computational Predictions for Enhanced Toxicity Prediction. Chem Res Toxicol 2024. [PMID: 38758610 DOI: 10.1021/acs.chemrestox.3c00352] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/19/2024]
Abstract
The attrition rate of drugs in clinical trials is generally quite high, with estimates suggesting that approximately 90% of drugs fail to make it through the process. The identification of unexpected toxicity issues during preclinical stages is a significant factor contributing to this high rate of failure. These issues can have a major impact on the success of a drug and must be carefully considered throughout the development process. These late-stage rejections or withdrawals of drug candidates significantly increase the costs associated with drug development, particularly when toxicity is detected during clinical trials or after market release. Understanding drug-biological target interactions is essential for evaluating compound toxicity and safety, as well as predicting therapeutic effects and potential off-target effects that could lead to toxicity. This will enable scientists to predict and assess the safety profiles of drug candidates more accurately. Evaluation of toxicity and safety is a critical aspect of drug development, and biomolecules, particularly proteins, play vital roles in complex biological networks and often serve as targets for various chemicals. Therefore, a better understanding of these interactions is crucial for the advancement of drug development. The development of computational methods for evaluating protein-ligand interactions and predicting toxicity is emerging as a promising approach that adheres to the 3Rs principles (replace, reduce, and refine) and has garnered significant attention in recent years. In this review, we present a thorough examination of the latest breakthroughs in drug toxicity prediction, highlighting the significance of drug-target binding affinity in anticipating and mitigating possible adverse effects. In doing so, we aim to contribute to the development of more effective and secure drugs.
Affiliation(s)
- Ana M B Amorim
- Department of Life Sciences, University of Coimbra, Calçada Martim de Freitas, 3000-456 Coimbra, Portugal
- CNC-UC─Center for Neuroscience and Cell Biology, University of Coimbra, Calçada Martim de Freitas, 3000-456 Coimbra, Portugal
- CIBB─Centre for Innovative Biomedicine and Biotechnology, University of Coimbra, Calçada Martim de Freitas, 3000-456 Coimbra, Portugal
- PhD Programme in Biosciences, Department of Life Sciences, University of Coimbra, Calçada Martim de Freitas, 3000-456 Coimbra, Portugal
- PURR.AI, Rua Pedro Nunes, IPN Incubadora, Ed C, 3030-199 Coimbra, Portugal
- Luiz F Piochi
- Department of Life Sciences, University of Coimbra, Calçada Martim de Freitas, 3000-456 Coimbra, Portugal
- CNC-UC─Center for Neuroscience and Cell Biology, University of Coimbra, Calçada Martim de Freitas, 3000-456 Coimbra, Portugal
- CIBB─Centre for Innovative Biomedicine and Biotechnology, University of Coimbra, Calçada Martim de Freitas, 3000-456 Coimbra, Portugal
- Ana T Gaspar
- Department of Life Sciences, University of Coimbra, Calçada Martim de Freitas, 3000-456 Coimbra, Portugal
- CNC-UC─Center for Neuroscience and Cell Biology, University of Coimbra, Calçada Martim de Freitas, 3000-456 Coimbra, Portugal
- CIBB─Centre for Innovative Biomedicine and Biotechnology, University of Coimbra, Calçada Martim de Freitas, 3000-456 Coimbra, Portugal
- António J Preto
- CNC-UC─Center for Neuroscience and Cell Biology, University of Coimbra, Calçada Martim de Freitas, 3000-456 Coimbra, Portugal
- CIBB─Centre for Innovative Biomedicine and Biotechnology, University of Coimbra, Calçada Martim de Freitas, 3000-456 Coimbra, Portugal
- PhD Programme in Experimental Biology and Biomedicine, Institute for Interdisciplinary Research (IIIUC), University of Coimbra, Casa Costa Alemão, 3030-789 Coimbra, Portugal
- Nícia Rosário-Ferreira
- CNC-UC─Center for Neuroscience and Cell Biology, University of Coimbra, Calçada Martim de Freitas, 3000-456 Coimbra, Portugal
- CIBB─Centre for Innovative Biomedicine and Biotechnology, University of Coimbra, Calçada Martim de Freitas, 3000-456 Coimbra, Portugal
- Irina S Moreira
- Department of Life Sciences, University of Coimbra, Calçada Martim de Freitas, 3000-456 Coimbra, Portugal
- CNC-UC─Center for Neuroscience and Cell Biology, University of Coimbra, Calçada Martim de Freitas, 3000-456 Coimbra, Portugal
- CIBB─Centre for Innovative Biomedicine and Biotechnology, University of Coimbra, Calçada Martim de Freitas, 3000-456 Coimbra, Portugal
3
Umemori Y, Handa K, Yoshimura S, Kageyama M, Iijima T. Development of a Novel In Silico Classification Model to Assess Reactive Metabolite Formation in the Cysteine Trapping Assay and Investigation of Important Substructures. Biomolecules 2024; 14:535. [PMID: 38785942 PMCID: PMC11117661 DOI: 10.3390/biom14050535] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2024] [Revised: 04/25/2024] [Accepted: 04/26/2024] [Indexed: 05/25/2024] [Open Access]
Abstract
Predicting whether a compound can cause drug-induced liver injury (DILI) is difficult due to the complexity of drug mechanisms. The cysteine trapping assay is a method for detecting reactive metabolites that bind to microsomes covalently. However, it is cumbersome to use 35S isotope-labeled cysteine for this assay. Therefore, we constructed an in silico classification model for predicting a positive/negative outcome in the cysteine trapping assay. We collected 475 compounds (436 in-house compounds and 39 publicly available drugs) based on experimental data performed in this study, and the composition of the results showed 248 positives and 227 negatives. Using a Message Passing Neural Network (MPNN) and Random Forest (RF) with extended connectivity fingerprint (ECFP) 4, we built machine learning models to predict the covalent binding risk of compounds. In the time-split dataset, AUC-ROC of MPNN and RF were 0.625 and 0.559 in the hold-out test, respectively. This result suggests that the MPNN model has a higher predictivity than RF in the time-split dataset. Hence, we conclude that the in silico MPNN classification model for the cysteine trapping assay has a better predictive power. Furthermore, most of the substructures that contributed positively to the cysteine trapping assay were consistent with previous results.
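The evaluation protocol mentioned in this abstract, a time-split hold-out scored by AUC-ROC, can be illustrated with a short sketch. This is not the authors' code: the record layout, dates, labels, scores, and the function names `time_split` and `auc_roc` are all hypothetical, and no MPNN or RF model is trained here. AUC-ROC is computed as the fraction of positive/negative pairs in which the positive compound receives the higher score, with ties counted as one half.

```python
def time_split(records, cutoff):
    """Split (date, label, score) records: dates before cutoff form the training part."""
    earlier = [r for r in records if r[0] < cutoff]
    later = [r for r in records if r[0] >= cutoff]
    return earlier, later

def auc_roc(labels, scores):
    """AUC-ROC as the probability that a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical records: (ISO assay date, assay outcome, model score)
records = [
    ("2019-03-01", 1, 0.9),
    ("2019-08-15", 0, 0.2),
    ("2020-01-10", 1, 0.7),
    ("2021-02-01", 1, 0.8),
    ("2021-05-20", 0, 0.6),
    ("2021-09-12", 1, 0.4),
    ("2022-01-03", 0, 0.3),
]

# ISO date strings compare correctly as plain strings
train, held_out = time_split(records, cutoff="2021-01-01")
auc = auc_roc([r[1] for r in held_out], [r[2] for r in held_out])
print(len(train), len(held_out), auc)
```

A time split of this kind evaluates the model only on compounds measured after the training data, which tends to give lower but more realistic scores than a random split.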
Affiliation(s)
- Koichi Handa
- DMPK Research Department, Teijin Institute for Bio-Medical Research, TEIJIN PHARMA LIMITED, 4-3-2 Asahigaoka, Hino-shi, Tokyo 191-8512, Japan; (Y.U.); (S.Y.); (M.K.); (T.I.)
4
Mostafa F, Chen M. Computational models for predicting liver toxicity in the deep learning era. Frontiers in Toxicology 2024; 5:1340860. [PMID: 38312894 PMCID: PMC10834666 DOI: 10.3389/ftox.2023.1340860] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2023] [Accepted: 12/22/2023] [Indexed: 02/06/2024] [Open Access]
Abstract
Drug-induced liver injury (DILI) is a severe adverse reaction caused by drugs and may result in acute liver failure and even death. Many efforts have centered on mitigating risks associated with potential DILI in humans. Among these, quantitative structure-activity relationship (QSAR) was proven to be a valuable tool for early-stage hepatotoxicity screening. Its advantages include no requirement for physical substances and rapid delivery of results. Deep learning (DL) made rapid advancements recently and has been used for developing QSAR models. This review discusses the use of DL in predicting DILI, focusing on the development of QSAR models employing extensive chemical structure datasets alongside their corresponding DILI outcomes. We undertake a comprehensive evaluation of various DL methods, comparing with those of traditional machine learning (ML) approaches, and explore the strengths and limitations of DL techniques regarding their interpretability, scalability, and generalization. Overall, our review underscores the potential of DL methodologies to enhance DILI prediction and provides insights into future avenues for developing predictive models to mitigate DILI risk in humans.
Affiliation(s)
- Fahad Mostafa
- Department of Mathematics and Statistics, Texas Tech University, Lubbock, TX, United States
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, United States
- Minjun Chen
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, United States
5
Zhang R, Xie X, Ni D, Wang H, Li J, Xiao W. MT-EpiPred: Multitask Learning for Prediction of Small-Molecule Epigenetic Modulators. J Chem Inf Model 2024; 64:110-118. [PMID: 38109786 DOI: 10.1021/acs.jcim.3c01368] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/20/2023]
Abstract
Epigenetic modulators play an increasingly crucial role in the treatment of various diseases. In this case, it is imperative to systematically investigate the activity of these agents and understand their influence on the entire epigenetic regulatory network rather than solely concentrate on individual targets. This work introduces MT-EpiPred, a multitask learning method capable of predicting the activity of compounds against 78 epigenetic targets. MT-EpiPred demonstrated outstanding performance, boasting an average auROC of 0.915 and the ability to handle few-shot targets. In comparison to the existing method, MT-EpiPred not only expands the target pool but also achieves superior predictive performance with the same data set. MT-EpiPred was then applied to predict the epigenetic target of a newly synthesized compound (1), where the molecular target was unknown. The method identified KDM4D as a potential target, which was subsequently validated through an in vitro enzyme inhibition assay, revealing an IC50 of 4.8 μM. The MT-EpiPred method has been implemented in the web server MT-EpiPred (http://epipred.com), providing free accessibility. In summary, this work presents a convenient and accurate tool for discovering novel small-molecule epigenetic modulators, particularly in the development of selective inhibitors and evaluating the impact of these inhibitors over a broad epigenetic network.
Affiliation(s)
- Ruihan Zhang
- Key Laboratory of Medicinal Chemistry for Natural Resource, Ministry of Education; Yunnan Key Laboratory of Research and Development for Natural Products; The Cloud Computing Engineering Research Center of Yunnan Province; Key Laboratory of Software Engineering of Yunnan Province; School of Software; School of Pharmacy, Yunnan University, Kunming 650500, P. R. China
- Xingran Xie
- Key Laboratory of Medicinal Chemistry for Natural Resource, Ministry of Education; Yunnan Key Laboratory of Research and Development for Natural Products; The Cloud Computing Engineering Research Center of Yunnan Province; Key Laboratory of Software Engineering of Yunnan Province; School of Software; School of Pharmacy, Yunnan University, Kunming 650500, P. R. China
- Dongxuan Ni
- Key Laboratory of Medicinal Chemistry for Natural Resource, Ministry of Education; Yunnan Key Laboratory of Research and Development for Natural Products; The Cloud Computing Engineering Research Center of Yunnan Province; Key Laboratory of Software Engineering of Yunnan Province; School of Software; School of Pharmacy, Yunnan University, Kunming 650500, P. R. China
- Hairong Wang
- Key Laboratory of Medicinal Chemistry for Natural Resource, Ministry of Education; Yunnan Key Laboratory of Research and Development for Natural Products; The Cloud Computing Engineering Research Center of Yunnan Province; Key Laboratory of Software Engineering of Yunnan Province; School of Software; School of Pharmacy, Yunnan University, Kunming 650500, P. R. China
- Jin Li
- Key Laboratory of Medicinal Chemistry for Natural Resource, Ministry of Education; Yunnan Key Laboratory of Research and Development for Natural Products; The Cloud Computing Engineering Research Center of Yunnan Province; Key Laboratory of Software Engineering of Yunnan Province; School of Software; School of Pharmacy, Yunnan University, Kunming 650500, P. R. China
- Weilie Xiao
- Key Laboratory of Medicinal Chemistry for Natural Resource, Ministry of Education; Yunnan Key Laboratory of Research and Development for Natural Products; The Cloud Computing Engineering Research Center of Yunnan Province; Key Laboratory of Software Engineering of Yunnan Province; School of Software; School of Pharmacy, Yunnan University, Kunming 650500, P. R. China
6
Lee S, Yoo S. InterDILI: interpretable prediction of drug-induced liver injury through permutation feature importance and attention mechanism. J Cheminform 2024; 16:1. [PMID: 38173043 PMCID: PMC10765872 DOI: 10.1186/s13321-023-00796-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2023] [Accepted: 12/17/2023] [Indexed: 01/05/2024] [Open Access]
Abstract
Safety is one of the important factors constraining the distribution of clinical drugs on the market. Drug-induced liver injury (DILI) is the leading cause of safety problems produced by drug side effects. Therefore, the DILI risk of approved drugs and potential drug candidates should be assessed. Currently, in vivo and in vitro methods are used to test DILI risk, but both methods are labor-intensive, time-consuming, and expensive. To overcome these problems, many in silico methods for DILI prediction have been suggested. Previous studies have shown that DILI prediction models can be utilized as prescreening tools, and they achieved a good performance. However, there are still limitations in interpreting the prediction results. Therefore, this study focused on interpreting the model prediction to analyze which features could potentially cause DILI. For this, five publicly available datasets were collected to train and test the model. Then, various machine learning methods were applied using substructure and physicochemical descriptors as inputs and the DILI label as the output. The interpretation of feature importance was analyzed by recognizing the following general-to-specific patterns: (i) identifying general important features of the overall DILI predictions, and (ii) highlighting specific molecular substructures which were highly related to the DILI prediction for each compound. The results indicated that the model not only captured the previously known properties to be related to DILI but also proposed a new DILI potential substructural of physicochemical properties. The models for the DILI prediction achieved an area under the receiver operating characteristic (AUROC) of 0.88-0.97 and an area under the Precision-Recall curve (AUPRC) of 0.81-0.95. From this, we hope the proposed models can help identify the potential DILI risk of drug candidates at an early stage and offer valuable insights for drug development.
Affiliation(s)
- Soyeon Lee
- Department of ICT Convergence System Engineering, Chonnam National University, Gwangju, 61186, Republic of Korea
- Division of Bioresources Bank, Honam National Institute of Biological Resources, Mokpo, 58762, Republic of Korea
- Sunyong Yoo
- Department of ICT Convergence System Engineering, Chonnam National University, Gwangju, 61186, Republic of Korea.
7
Wang R, Li L, Chen M, Li X, Liu Y, Xue Z, Ma Q, Chen J. Gene expression insights: Chronic stress and bipolar disorder: A bioinformatics investigation. Mathematical Biosciences and Engineering: MBE 2024; 21:392-414. [PMID: 38303428 DOI: 10.3934/mbe.2024018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/03/2024]
Abstract
Bipolar disorder (BD) is a psychiatric disorder that affects an increasing number of people worldwide. The mechanisms of BD are unclear, but some studies have suggested that it may be related to genetic factors with high heritability. Moreover, research has shown that chronic stress can contribute to the development of major illnesses. In this paper, we used bioinformatics methods to analyze the possible mechanisms of chronic stress affecting BD through various aspects. We obtained gene expression data from postmortem brains of BD patients and healthy controls in datasets GSE12649 and GSE53987, and we identified 11 chronic stress-related genes (CSRGs) that were differentially expressed in BD. Then, we screened five biomarkers (IGFBP6, ALOX5AP, MAOA, AIF1 and TRPM3) using machine learning models. We further validated the expression and diagnostic value of the biomarkers in other datasets (GSE5388 and GSE78936) and performed functional enrichment analysis, regulatory network analysis and drug prediction based on the biomarkers. Our bioinformatics analysis revealed that chronic stress can affect the occurrence and development of BD through many aspects, including monoamine oxidase production and decomposition, neuroinflammation, ion permeability, pain perception and others. In this paper, we confirm the importance of studying the genetic influences of chronic stress on BD and other psychiatric disorders and suggested that biomarkers related to chronic stress may be potential diagnostic tools and therapeutic targets for BD.
Affiliation(s)
- Rongyanqi Wang
- School of Chinese Medicine, Beijing University of Chinese Medicine, Beijing 100029, China
- Lan Li
- College of Basic Medicine, Hubei University of Chinese Medicine, Wuhan 430065, China
- Man Chen
- College of Basic Medicine, Hubei University of Chinese Medicine, Wuhan 430065, China
- Xiaojuan Li
- Guangzhou Key Laboratory of Formula-Pattern of Traditional Chinese Medicine, Formula-Pattern Research Center, School of Traditional Chinese Medicine, Jinan University, Guangzhou 510632, China
- Yueyun Liu
- School of Chinese Medicine, Beijing University of Chinese Medicine, Beijing 100029, China
- Zhe Xue
- School of Chinese Medicine, Beijing University of Chinese Medicine, Beijing 100029, China
- Qingyu Ma
- Guangzhou Key Laboratory of Formula-Pattern of Traditional Chinese Medicine, Formula-Pattern Research Center, School of Traditional Chinese Medicine, Jinan University, Guangzhou 510632, China
- Jiaxu Chen
- School of Chinese Medicine, Beijing University of Chinese Medicine, Beijing 100029, China
8
Wu W, Qian J, Liang C, Yang J, Ge G, Zhou Q, Guan X. GeoDILI: A Robust and Interpretable Model for Drug-Induced Liver Injury Prediction Using Graph Neural Network-Based Molecular Geometric Representation. Chem Res Toxicol 2023; 36:1717-1730. [PMID: 37839069 DOI: 10.1021/acs.chemrestox.3c00199] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2023]
Abstract
Drug-induced liver injury (DILI) is a significant cause of drug failure and withdrawal due to liver damage. Accurate prediction of hepatotoxic compounds is crucial for safe drug development. Several DILI prediction models have been published, but they are built on different data sets, making it difficult to compare model performance. Moreover, most existing models are based on molecular fingerprints or descriptors, neglecting molecular geometric properties and lacking interpretability. To address these limitations, we developed GeoDILI, an interpretable graph neural network that uses a molecular geometric representation. First, we utilized a geometry-based pretrained molecular representation and optimized it on the DILI data set to improve predictive performance. Second, we leveraged gradient information to obtain high-precision atomic-level weights and deduce the dominant substructure. We benchmarked GeoDILI against recently published DILI prediction models, as well as popular GNN models and fingerprint-based machine learning models using the same data set, showing superior predictive performance of our proposed model. We applied the interpretable method in the DILI data set and derived seven precise and mechanistically elucidated structural alerts. Overall, GeoDILI provides a promising approach for accurate and interpretable DILI prediction with potential applications in drug discovery and safety assessment. The data and source code are available at GitHub repository (https://github.com/CSU-QJY/GeoDILI).
Affiliation(s)
- Wenxuan Wu
- Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai 201203, China
- Jiayu Qian
- School of Mathematics and Statistics, Central South University, Changsha, Hunan 410083, China
- Changjie Liang
- Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai 201203, China
- Jingya Yang
- School of Mathematics and Statistics, Central South University, Changsha, Hunan 410083, China
- Guangbo Ge
- Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai 201203, China
- Qingping Zhou
- School of Mathematics and Statistics, Central South University, Changsha, Hunan 410083, China
- Xiaoqing Guan
- Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai 201203, China
9
Guo W, Liu J, Dong F, Song M, Li Z, Khan MKH, Patterson TA, Hong H. Review of machine learning and deep learning models for toxicity prediction. Exp Biol Med (Maywood) 2023; 248:1952-1973. [PMID: 38057999 PMCID: PMC10798180 DOI: 10.1177/15353702231209421] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 12/08/2023] [Open Access]
Abstract
The ever-increasing number of chemicals has raised public concerns due to their adverse effects on human health and the environment. To protect public health and the environment, it is critical to assess the toxicity of these chemicals. Traditional in vitro and in vivo toxicity assays are complicated, costly, and time-consuming and may face ethical issues. These constraints raise the need for alternative methods for assessing the toxicity of chemicals. Recently, due to the advancement of machine learning algorithms and the increase in computational power, many toxicity prediction models have been developed using various machine learning and deep learning algorithms such as support vector machine, random forest, k-nearest neighbors, ensemble learning, and deep neural network. This review summarizes the machine learning- and deep learning-based toxicity prediction models developed in recent years. Support vector machine and random forest are the most popular machine learning algorithms, and hepatotoxicity, cardiotoxicity, and carcinogenicity are the frequently modeled toxicity endpoints in predictive toxicology. It is known that datasets impact model performance. The quality of datasets used in the development of toxicity prediction models using machine learning and deep learning is vital to the performance of the developed models. Different toxicity assignments for the same chemicals have been observed among datasets of the same type of toxicity, indicating that benchmark datasets are needed for developing reliable toxicity prediction models using machine learning and deep learning algorithms. This review provides insights into current machine learning models in predictive toxicology, which are expected to promote the development and application of toxicity prediction models in the future.
Affiliation(s)
- Wenjing Guo
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
- Jie Liu
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
- Fan Dong
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
- Meng Song
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
- Zoe Li
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
- Md Kamrul Hasan Khan
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
- Tucker A Patterson
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
- Huixiao Hong
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
10
Sinha K, Ghosh N, Sil PC. A Review on the Recent Applications of Deep Learning in Predictive Drug Toxicological Studies. Chem Res Toxicol 2023; 36:1174-1205. [PMID: 37561655 DOI: 10.1021/acs.chemrestox.2c00375] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/12/2023]
Abstract
Drug toxicity prediction is an important step in ensuring patient safety during drug design studies. While traditional preclinical studies have historically relied on animal models to evaluate toxicity, recent advances in deep-learning approaches have shown great promise in advancing drug safety science and reducing animal use in preclinical studies. However, deep-learning-based approaches also face challenges in handling large biological data sets, model interpretability, and regulatory acceptance. In this review, we provide an overview of recent developments in deep-learning-based approaches for predicting drug toxicity, highlighting their potential advantages over traditional methods and the need to address their limitations. Deep-learning models have demonstrated excellent performance in predicting toxicity outcomes from various data sources such as chemical structures, genomic data, and high-throughput screening assays. The potential of deep learning for automated feature engineering is also discussed. This review emphasizes the need to address ethical concerns related to the use of deep learning in drug toxicity studies, including the reduction of animal use and ensuring regulatory acceptance. Furthermore, emerging applications of deep learning in drug toxicity prediction, such as predicting drug-drug interactions and toxicity in rare subpopulations, are highlighted. The integration of deep-learning-based approaches with traditional methods is discussed as a way to develop more reliable and efficient predictive models for drug safety assessment, paving the way for safer and more effective drug discovery and development. Overall, this review highlights the critical role of deep learning in predictive toxicology and drug safety evaluation, emphasizing the need for continued research and development in this rapidly evolving field. 
By addressing the limitations of traditional methods, leveraging the potential of deep learning for automated feature engineering, and addressing ethical concerns, deep-learning-based approaches have the potential to revolutionize drug toxicity prediction and improve patient safety in drug discovery and development.
Affiliation(s)
- Krishnendu Sinha
- Department of Zoology, Jhargram Raj College, Jhargram 721507, West Bengal, India
- Nabanita Ghosh
- Department of Zoology, Maulana Azad College, Kolkata 700013, West Bengal, India
- Parames C Sil
- Division of Molecular Medicine, Bose Institute, Kolkata 700054, West Bengal, India
11
Dou B, Zhu Z, Merkurjev E, Ke L, Chen L, Jiang J, Zhu Y, Liu J, Zhang B, Wei GW. Machine Learning Methods for Small Data Challenges in Molecular Science. Chem Rev 2023; 123:8736-8780. [PMID: 37384816 PMCID: PMC10999174 DOI: 10.1021/acs.chemrev.3c00189] [Citation(s) in RCA: 21] [Impact Index Per Article: 21.0] [Indexed: 07/01/2023]
Abstract
Small data are often used in scientific and engineering research due to the presence of various constraints, such as time, cost, ethics, privacy, security, and technical limitations in data acquisition. However, while big data have been the focus for the past decade, small data and their challenges have received little attention, even though they are technically more severe in machine learning (ML) and deep learning (DL) studies. Overall, the small data challenge is often compounded by issues such as data diversity, imputation, noise, imbalance, and high dimensionality. Fortunately, the current big data era is characterized by technological breakthroughs in ML, DL, and artificial intelligence (AI), which enable data-driven scientific discovery, and many advanced ML and DL technologies developed for big data have inadvertently provided solutions for small data problems. As a result, significant progress has been made in ML and DL for small data challenges in the past decade. In this review, we summarize and analyze several emerging potential solutions to small data challenges in molecular science, including chemical and biological sciences. We review both basic machine learning algorithms, such as linear regression, logistic regression (LR), k-nearest neighbor (KNN), support vector machine (SVM), kernel learning (KL), random forest (RF), and gradient boosting trees (GBT), and more advanced techniques, including artificial neural network (ANN), convolutional neural network (CNN), U-Net, graph neural network (GNN), Generative Adversarial Network (GAN), long short-term memory (LSTM), autoencoder, transformer, transfer learning, active learning, graph-based semi-supervised learning, combining deep learning with traditional machine learning, and physical model-based data augmentation. We also briefly discuss the latest advances in these methods. Finally, we conclude the survey with a discussion of promising trends in small data challenges in molecular science.
Affiliation(s)
- Bozheng Dou
- Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Zailiang Zhu
- Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Ekaterina Merkurjev
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Lu Ke
- Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Long Chen
- Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Jian Jiang
- Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Yueying Zhu
- Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Jie Liu
- Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Bengong Zhang
- Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Guo-Wei Wei
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan 48824, United States
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States
|
12
|
Nguyen-Vo TH, Trinh QH, Nguyen L, Nguyen-Hoang PU, Rahardja S, Nguyen BP. i4mC-GRU: Identifying DNA N4-Methylcytosine sites in mouse genomes using bidirectional gated recurrent unit and sequence-embedded features. Comput Struct Biotechnol J 2023; 21:3045-3053. [PMID: 37273848 PMCID: PMC10238585 DOI: 10.1016/j.csbj.2023.05.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2022] [Revised: 05/12/2023] [Accepted: 05/12/2023] [Indexed: 06/06/2023]
Abstract
N4-methylcytosine (4mC) is one of the most common DNA methylation modifications found in both prokaryotic and eukaryotic genomes. Since the 4mC has various essential biological roles, determining its location helps reveal unexplored physiological and pathological pathways. In this study, we propose an effective computational method called i4mC-GRU using a gated recurrent unit and duplet sequence-embedded features to predict potential 4mC sites in mouse (Mus musculus) genomes. To fairly assess the performance of the model, we compared our method with several state-of-the-art methods using two different benchmark datasets. Our results showed that i4mC-GRU achieved area under the receiver operating characteristic curve values of 0.97 and 0.89 and area under the precision-recall curve values of 0.98 and 0.90 on the first and second benchmark datasets, respectively. Briefly, our method outperformed existing methods in predicting 4mC sites in mouse genomes. Also, we deployed i4mC-GRU as an online web server, supporting users in genomics studies.
Affiliation(s)
- Thanh-Hoang Nguyen-Vo
- School of Mathematics and Statistics, Victoria University of Wellington, Wellington 6140, New Zealand
- School of Innovation, Design and Technology, Wellington Institute of Technology, Wellington 5012, New Zealand
- Quang H. Trinh
- School of Information and Communication Technology, Hanoi University of Science and Technology, Hanoi 100000, Vietnam
- Loc Nguyen
- School of Mathematics and Statistics, Victoria University of Wellington, Wellington 6140, New Zealand
- Phuong-Uyen Nguyen-Hoang
- Computational Biology Center, International University - VNU HCMC, Ho Chi Minh City 700000, Vietnam
- Susanto Rahardja
- School of Marine Science and Technology, Northwestern Polytechnical University, Xi’an 710072, China
- Infocomm Technology Cluster, Singapore Institute of Technology, Singapore 138683, Singapore
- Binh P. Nguyen
- School of Mathematics and Statistics, Victoria University of Wellington, Wellington 6140, New Zealand
|
13
|
Tran TTV, Surya Wibowo A, Tayara H, Chong KT. Artificial Intelligence in Drug Toxicity Prediction: Recent Advances, Challenges, and Future Perspectives. J Chem Inf Model 2023; 63:2628-2643. [PMID: 37125780 DOI: 10.1021/acs.jcim.3c00200] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Indexed: 05/02/2023]
Abstract
Toxicity prediction is a critical step in the drug discovery process that helps identify and prioritize compounds with the greatest potential for safe and effective use in humans, while also reducing the risk of costly late-stage failures. It is estimated that over 30% of drug candidates are discarded owing to toxicity. Recently, artificial intelligence (AI) has been used to improve drug toxicity prediction as it provides more accurate and efficient methods for identifying the potentially toxic effects of new compounds before they are tested in human clinical trials, thus saving time and money. In this review, we present an overview of recent advances in AI-based drug toxicity prediction, including the use of various machine learning algorithms and deep learning architectures, of six major toxicity properties and Tox21 assay end points. Additionally, we provide a list of public data sources and useful toxicity prediction tools for the research community and highlight the challenges that must be addressed to enhance model performance. Finally, we discuss future perspectives for AI-based drug toxicity prediction. This review can aid researchers in understanding toxicity prediction and pave the way for new methods of drug discovery.
Affiliation(s)
- Thi Tuyet Van Tran
- Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju 54896, Republic of Korea
- Faculty of Information Technology, An Giang University, Long Xuyen 880000, Vietnam
- Vietnam National University - Ho Chi Minh City, Ho Chi Minh 700000, Vietnam
- Agung Surya Wibowo
- Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju 54896, Republic of Korea
- Department of Electrical Engineering, Telkom University, Bandung 40257, Indonesia
- Hilal Tayara
- School of International Engineering and Science, Jeonbuk National University, Jeonju 54896, Republic of Korea
- Kil To Chong
- Advanced Electronics and Information Research Center, Jeonbuk National University, Jeonju 54896, Republic of Korea
|
14
|
Nguyen QH, Ngo HH, Nguyen-Vo TH, Do TT, Rahardja S, Nguyen BP. eMIC-AntiKP: Estimating minimum inhibitory concentrations of antibiotics towards Klebsiella pneumoniae using deep learning. Comput Struct Biotechnol J 2022; 21:751-757. [PMID: 36659924 PMCID: PMC9827358 DOI: 10.1016/j.csbj.2022.12.041] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/06/2022] [Revised: 12/22/2022] [Accepted: 12/23/2022] [Indexed: 12/27/2022]
Abstract
Antibiotic resistance has become one of the most concerning problems directly affecting patient recovery. For years, numerous efforts have been made to use antimicrobial drugs efficiently at appropriate doses, not only to exterminate microbes but also to stringently constrain any chance of bacterial evolution. However, choosing proper antibiotics is neither straightforward nor fast, because well-defined drugs can be given to patients only after determining the microbial taxonomy and evaluating minimum inhibitory concentrations (MICs). Besides conventional methods, numerous computer-aided frameworks have recently been developed using computational advances and public data sources of clinical antimicrobial resistance. In this study, we introduce eMIC-AntiKP, a computational framework specifically designed to predict the MIC values of 20 antibiotics towards Klebsiella pneumoniae. Our prediction models were constructed using convolutional neural networks and k-mer counting-based features. The model for cefepime has the most limited performance, with a test 1-tier accuracy of 0.49, while the model for ampicillin has the highest performance, with a test 1-tier accuracy of 1.00. Most models have satisfactory performance, with test accuracies ranging from about 0.70 to 0.90. The significance of eMIC-AntiKP is its effective utilization of computing resources, which makes it a compact and portable tool for most moderately configured computers. We provide users with two options: an online web server for basic analysis and an offline package for deeper analysis and technical modification.
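The k-mer counting-based features mentioned in this abstract can be illustrated with a minimal sketch. The function name `kmer_counts`, the choice of k, and the toy sequence are assumptions made for illustration only; they are not taken from eMIC-AntiKP, whose actual feature pipeline is not described here.

```python
from collections import Counter
from itertools import product


def kmer_counts(sequence: str, k: int = 3) -> list[int]:
    """Count every possible DNA k-mer in `sequence` (hypothetical helper).

    Returns a fixed-length vector of 4**k counts, ordered lexicographically
    over the alphabet A, C, G, T. Windows containing other characters
    (for example N) do not contribute to the vector.
    """
    sequence = sequence.upper()
    # Slide a window of length k over the sequence and tally each k-mer.
    observed = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    vocabulary = ("".join(p) for p in product("ACGT", repeat=k))
    return [observed[kmer] for kmer in vocabulary]


# Toy input, not real genomic data.
features = kmer_counts("ACGTACGT", k=2)
```

A fixed-length count vector of this kind is one common way to turn variable-length genome sequences into inputs for a neural network.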
Affiliation(s)
- Quang H. Nguyen
- School of Information and Communication Technology, Hanoi University of Science and Technology, Hanoi 100000, Viet Nam
- Hoang H. Ngo
- School of Information and Communication Technology, Hanoi University of Science and Technology, Hanoi 100000, Viet Nam
- Thanh-Hoang Nguyen-Vo
- School of Mathematics and Statistics, Victoria University of Wellington, Wellington 6140, New Zealand
- Trang T.T. Do
- School of Innovation, Design and Technology, Wellington Institute of Technology, Lower Hutt 5012, New Zealand
- Susanto Rahardja
- School of Marine Science and Technology, Northwestern Polytechnical University, Xi’an 710072, China
- Infocomm Technology Cluster, Singapore Institute of Technology, Singapore 138683, Singapore
- Corresponding author at: School of Marine Science and Technology, Northwestern Polytechnical University, Xi’an 710072, China.
- Binh P. Nguyen
- School of Mathematics and Statistics, Victoria University of Wellington, Wellington 6140, New Zealand
- Corresponding author.
|
15
|
Lin J, Li M, Mak W, Shi Y, Zhu X, Tang Z, He Q, Xiang X. Applications of In Silico Models to Predict Drug-Induced Liver Injury. Toxics 2022; 10:788. [PMID: 36548621 PMCID: PMC9785299 DOI: 10.3390/toxics10120788] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2022] [Revised: 12/09/2022] [Accepted: 12/13/2022] [Indexed: 06/17/2023]
Abstract
Drug-induced liver injury (DILI) is a major cause of the withdrawal of pre-marketed drugs, typically attributed to oxidative stress, mitochondrial damage, disrupted bile acid homeostasis, and innate immune-related inflammation. DILI can be divided into intrinsic and idiosyncratic DILI, with cholestatic liver injury as an important manifestation. The diagnosis of DILI remains a challenge today and relies on clinical judgment and knowledge of the insulting agent. Early prediction of hepatotoxicity is an important but still unfulfilled component of drug development. In response, in silico modeling has shown good potential to fill in the missing piece of the puzzle. Computer algorithms, with machine learning and artificial intelligence as representatives, can be established to initiate a reaction on the given condition to predict DILI. DILIsym is a mechanistic approach that integrates physiologically based pharmacokinetic modeling with the mechanisms of hepatotoxicity and has gained increasing popularity for DILI prediction. This article reviews existing in silico approaches utilized to predict DILI risks in clinical medication and provides an overview of the underlying principles and related practical applications.
Affiliation(s)
- Qingfeng He
- Correspondence: (Q.H.); (X.X.); Tel.: +86-21-51980024 (X.X.)
- Xiaoqiang Xiang
- Correspondence: (Q.H.); (X.X.); Tel.: +86-21-51980024 (X.X.)
|
16
|
Nguyen-Vo TH, Trinh QH, Nguyen L, Nguyen-Hoang PU, Rahardja S, Nguyen BP. iPromoter-Seqvec: identifying promoters using bidirectional long short-term memory and sequence-embedded features. BMC Genomics 2022; 23:681. [PMID: 36192696 PMCID: PMC9531353 DOI: 10.1186/s12864-022-08829-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/01/2022] [Accepted: 08/08/2022] [Indexed: 11/30/2022]
Abstract
Background: Promoters, non-coding DNA sequences located at upstream regions of the transcription start site of genes/gene clusters, are essential regulatory elements for the initiation and regulation of transcriptional processes. Furthermore, identifying promoters in DNA sequences and genomes significantly contributes to discovering entire structures of genes of interest. Therefore, exploration of promoter regions is one of the most imperative topics in molecular genetics and biology. Besides experimental techniques, computational methods have been developed to predict promoters. In this study, we propose iPromoter-Seqvec, an efficient computational model to predict TATA and non-TATA promoters in human and mouse genomes using bidirectional long short-term memory neural networks in combination with sequence-embedded features extracted from input sequences. The promoter and non-promoter sequences were retrieved from the Eukaryotic Promoter database and then were refined to create four benchmark datasets.
Results: The area under the receiver operating characteristic curve (AUCROC) and the area under the precision-recall curve (AUCPR) were used as two key metrics to evaluate model performance. Results on independent test sets showed that iPromoter-Seqvec outperformed other state-of-the-art methods with AUCROC values ranging from 0.85 to 0.99 and AUCPR values ranging from 0.86 to 0.99. Models predicting TATA promoters in both species had slightly higher predictive power compared to those predicting non-TATA promoters. With a novel idea of constructing artificial non-promoter sequences based on promoter sequences, our models were able to learn highly specific characteristics discriminating promoters from non-promoters to improve predictive efficiency.
Conclusions: iPromoter-Seqvec is a stable and robust model for predicting both TATA and non-TATA promoters in human and mouse genomes. Our proposed method was also deployed as an online web server with a user-friendly interface to support research communities. Links to our source codes and web server are available at https://github.com/mldlproject/2022-iPromoter-Seqvec.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12864-022-08829-6.
Affiliation(s)
- Thanh-Hoang Nguyen-Vo
- School of Mathematics and Statistics, Victoria University of Wellington, Gate 7, Kelburn Parade, 6140, Wellington, New Zealand
- Quang H Trinh
- School of Information and Communication Technology, Hanoi University of Science and Technology, 1 Dai Co Viet, 100000, Hanoi, Vietnam
- Loc Nguyen
- School of Mathematics and Statistics, Victoria University of Wellington, Gate 7, Kelburn Parade, 6140, Wellington, New Zealand
- Phuong-Uyen Nguyen-Hoang
- Computational Biology Center, International University - VNU HCMC, Quarter 6, Linh Trung Ward, Thu Duc District, 700000, Ho Chi Minh City, Vietnam
- Susanto Rahardja
- School of Marine Science and Technology, Northwestern Polytechnical University, 127 West Youyi Road, 710072, Xi'an, China
- Infocomm Technology Cluster, Singapore Institute of Technology, 10 Dover Drive, 138683, Singapore, Singapore
- Binh P Nguyen
- School of Mathematics and Statistics, Victoria University of Wellington, Gate 7, Kelburn Parade, 6140, Wellington, New Zealand
|
17
|
Morita K, Mizuno T, Kusuhara H. Investigation of a Data Split Strategy Involving the Time Axis in Adverse Event Prediction Using Machine Learning. J Chem Inf Model 2022; 62:3982-3992. [PMID: 35971760 DOI: 10.1021/acs.jcim.2c00765] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 01/18/2023]
Abstract
Adverse events are a serious issue in drug development, and many prediction methods using machine learning have been developed. The random split cross-validation is the de facto standard for model building and evaluation in machine learning, but care should be taken in adverse event prediction because this approach does not strictly match the real-world situation. The time split, which uses the time axis, is considered suitable for real-world prediction. However, the differences in model performance obtained using the time and random splits are not clear due to the lack of comparable studies. To understand the differences, we compared the model performance between the time and random splits using nine types of compound information as input, eight adverse events as targets, and six machine learning algorithms. The random split showed higher area under the curve values than did the time split for six of eight targets. The chemical spaces of the training and test datasets of the time split were similar, suggesting that the concept of applicability domain is insufficient to explain the differences derived from the splitting. The area under the curve differences were smaller for the protein interaction than for the other datasets. Subsequent detailed analyses suggested the danger of confounding in the use of knowledge-based information in the time split. These findings indicate the importance of understanding the differences between the time and random splits in adverse event prediction and suggest that appropriate use of the splitting strategies and interpretation of results are necessary for the real-world prediction of adverse events. We provide the analysis code and datasets used in the present study at https://github.com/mizuno-group/AE_prediction.
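The difference between the two splitting strategies compared in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the `year` field, the 20% test fraction, and the toy records are all hypothetical and are not taken from the authors' repository.

```python
import random


def random_split(records, test_fraction=0.2, seed=0):
    """Shuffle the records and hold out a random fraction as the test set."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def time_split(records, test_fraction=0.2, key=lambda r: r["year"]):
    """Train on the oldest records and test on the most recent ones."""
    ordered = sorted(records, key=key)
    n_test = int(len(ordered) * test_fraction)
    cut = len(ordered) - n_test
    return ordered[:cut], ordered[cut:]


# Toy records with a hypothetical "year" field standing in for the time axis.
records = [{"compound": f"c{i}", "year": 2000 + i} for i in range(10)]
rand_train, rand_test = random_split(records)
time_train, time_test = time_split(records)
```

With the time split, every test record is newer than every training record, which mirrors predicting for compounds that appear after the model was built. The random split gives no such guarantee.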
Affiliation(s)
- Katsuhisa Morita
- Graduate School of Pharmaceutical Sciences, The University of Tokyo, Bunkyo-ku, Tokyo 113-0033, Japan
- Tadahaya Mizuno
- Graduate School of Pharmaceutical Sciences, The University of Tokyo, Bunkyo-ku, Tokyo 113-0033, Japan
- Hiroyuki Kusuhara
- Graduate School of Pharmaceutical Sciences, The University of Tokyo, Bunkyo-ku, Tokyo 113-0033, Japan
|
18
|
Chen Z, Jiang Y, Zhang X, Zheng R, Qiu R, Sun Y, Zhao C, Shang H. The prediction approach of drug-induced liver injury: response to the issues of reproducible science of artificial intelligence in real-world applications. Brief Bioinform 2022; 23:6598880. [PMID: 35656709 DOI: 10.1093/bib/bbac196] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/18/2022] [Revised: 04/12/2022] [Accepted: 04/27/2022] [Indexed: 11/12/2022]
Abstract
In a previous study, we developed a generalized drug-induced liver injury (DILI) prediction model, ResNet18DNN, to predict DILI based on a multi-source combined DILI dataset, and it achieved better performance than previously published DILI prediction models. Recently, we were honored to receive an invitation from the editor to respond to the Letter to the Editor by Liu Zhichao, et al. We are glad that our research has attracted the attention of Liu’s team and that they have put forward their opinions on our research. In this response to the Letter to the Editor, we respond to these comments.
Affiliation(s)
- Zhao Chen
- Key Laboratory of Chinese Internal Medicine of Ministry of Education, Dongzhimen Hospital, Beijing University of Chinese Medicine, Beijing, China
- Yin Jiang
- Key Laboratory of Chinese Internal Medicine of Ministry of Education, Dongzhimen Hospital, Beijing University of Chinese Medicine, Beijing, China
- Xiaoyu Zhang
- Key Laboratory of Chinese Internal Medicine of Ministry of Education, Dongzhimen Hospital, Beijing University of Chinese Medicine, Beijing, China
- Rui Zheng
- Key Laboratory of Chinese Internal Medicine of Ministry of Education, Dongzhimen Hospital, Beijing University of Chinese Medicine, Beijing, China
- Ruijin Qiu
- Key Laboratory of Chinese Internal Medicine of Ministry of Education, Dongzhimen Hospital, Beijing University of Chinese Medicine, Beijing, China
- Yang Sun
- Key Laboratory of Chinese Internal Medicine of Ministry of Education, Dongzhimen Hospital, Beijing University of Chinese Medicine, Beijing, China
- Chen Zhao
- Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, Beijing, China
- Hongcai Shang
- Key Laboratory of Chinese Internal Medicine of Ministry of Education, Dongzhimen Hospital, Beijing University of Chinese Medicine, Beijing, China
- College of Integrated Traditional Chinese and Western Medicine, Hunan University of Chinese Medicine, Changsha, Hunan 410208, China
|
19
|
Ivanov SM, Lagunin AA, Filimonov DA, Poroikov VV. Relationships between the Structure and Severe Drug-Induced Liver Injury for Low, Medium, and High Doses of Drugs. Chem Res Toxicol 2022; 35:402-411. [PMID: 35172101 DOI: 10.1021/acs.chemrestox.1c00307] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/29/2022]
Abstract
Assessment of structure-activity relationships (SARs) for predicting severe drug-induced liver injury (DILI) is essential since in vivo and in vitro preclinical methods cannot detect many druglike compounds disrupting liver functions. To date, plenty of SAR models for the prediction of DILI have been developed; however, none of them considered the route of drug administration and daily dose, which may introduce significant bias into prediction results. We have created a dataset of 617 drugs with parenteral and oral administration routes and consistent information on DILI severity. We have found a clear relationship between route, dose, and DILI severity. According to SAR, nearly 40% of moderate- and non-DILI-causing drugs would cause severe DILI if they were administered at high oral doses. We have proposed the following approach to predict severe DILI. New compounds recommended to be used at low oral doses (<∼10 mg daily), or parenterally, can be considered not causing severe DILI. DILI for compounds administered at medium oral doses (∼10-100 mg daily; 22.2% of drugs under consideration) can be considered unpredictable because reasonable SAR models were not obtained due to the small size and heterogeneity of the corresponding dataset. The DILI potential of the compounds recommended to be used at high oral doses (more than ∼100 mg daily) can be estimated using SAR modeling. The balanced accuracy of the approach calculated by a 10-fold cross-validation procedure is 0.803. The developed approach can be used to estimate severe DILI for druglike compounds proposed to use at low and high oral doses or parenterally at the early stages of drug development.
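The dose- and route-based approach described in this abstract can be sketched as a simple decision rule. The thresholds in the abstract are approximate (∼10 and ∼100 mg daily), so the exact boundary handling below is an assumption, as are the function name `triage_severe_dili`, the returned labels, the treatment of every non-parenteral route as oral, and the `sar_model` callable, which is a placeholder for whatever SAR model is available rather than the authors' model.

```python
def triage_severe_dili(route, daily_dose_mg, sar_model=None, structure=None):
    """Hypothetical triage following the abstract's three dose bands.

    Assumed boundaries: below 10 mg is "low", 10 to 100 mg is "medium",
    above 100 mg is "high". Any route other than parenteral is treated
    as oral, which is also an assumption.
    """
    if route.lower() == "parenteral" or daily_dose_mg < 10:
        # Low oral dose or parenteral route: considered not causing severe DILI.
        return "not expected to cause severe DILI"
    if daily_dose_mg <= 100:
        # Medium oral dose: no reasonable SAR model was obtained for this band.
        return "unpredictable"
    if sar_model is None:
        return "SAR model required"
    # High oral dose: defer to a structure-based SAR prediction.
    return sar_model(structure)


# Placeholder model used only to show the call pattern; it is not a real predictor.
example = triage_severe_dili("oral", 500, sar_model=lambda s: "severe DILI", structure="CCO")
```

The rule only routes a compound to the appropriate band; any actual prediction for high oral doses comes from the supplied model.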
Affiliation(s)
- Sergey M Ivanov
- Institute of Biomedical Chemistry, Pogodinskaya Str., 10/8, Moscow 119121, Russia
- Pirogov Russian National Research Medical University, Ostrovityanova Str., 1, Moscow 117997, Russia
- Alexey A Lagunin
- Institute of Biomedical Chemistry, Pogodinskaya Str., 10/8, Moscow 119121, Russia
- Pirogov Russian National Research Medical University, Ostrovityanova Str., 1, Moscow 117997, Russia
- Dmitry A Filimonov
- Institute of Biomedical Chemistry, Pogodinskaya Str., 10/8, Moscow 119121, Russia
- Vladimir V Poroikov
- Institute of Biomedical Chemistry, Pogodinskaya Str., 10/8, Moscow 119121, Russia
|
20
|
Liu J, Guo W, Sakkiah S, Ji Z, Yavas G, Zou W, Chen M, Tong W, Patterson TA, Hong H. Machine Learning Models for Predicting Liver Toxicity. Methods Mol Biol 2022; 2425:393-415. [PMID: 35188640 DOI: 10.1007/978-1-0716-1960-5_15] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 06/14/2023]
Abstract
Liver toxicity is a major adverse drug reaction that accounts for drug failure in clinical trials and withdrawal from the market. Therefore, predicting potential liver toxicity at an early stage in drug discovery is crucial to reduce costs and the potential for drug failure. However, current in vivo animal toxicity testing is very expensive and time consuming. As an alternative approach, various machine learning models have been developed to predict potential liver toxicity in humans. This chapter reviews current advances in the development and application of machine learning models for prediction of potential liver toxicity in humans and discusses possible improvements to liver toxicity prediction.
Affiliation(s)
- Jie Liu
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
- Wenjing Guo
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
- Sugunadevi Sakkiah
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
- Zuowei Ji
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
- Gokhan Yavas
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
- Wen Zou
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
- Minjun Chen
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
- Weida Tong
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
- Tucker A Patterson
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
- Huixiao Hong
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
|
21
|
Chen Z, Jiang Y, Zhang X, Zheng R, Qiu R, Sun Y, Zhao C, Shang H. ResNet18DNN: prediction approach of drug-induced liver injury by deep neural network with ResNet18. Brief Bioinform 2021; 23:6457162. [PMID: 34882224 DOI: 10.1093/bib/bbab503] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/12/2021] [Revised: 09/27/2021] [Accepted: 11/02/2021] [Indexed: 01/22/2023]
Abstract
Drug-induced liver injury (DILI) has long been a focus of clinicians and drug researchers, and improving the performance of DILI prediction models so that they predict liver injury accurately remains an urgent problem in medical research. To address this problem, this study collected a comprehensive, high-quality DILI dataset of 1446 chemical compounds based on clinically confirmed DILI compound datasets. A model combining the 18-layer residual neural network, which uses more 5-layer blocks (ResNet18), with a deep neural network (ResNet18DNN) was then proposed to predict DILI through vectorization of compound structure images. In predicting DILI, ResNet18DNN outperformed the existing state-of-the-art DILI predictors. The AUC (area under the curve), accuracy, recall, precision, F1-score and specificity on the training set were 0.973, 0.992, 0.995, 0.994, 0.995 and 0.975, respectively; those on the test set were 0.958, 0.976, 0.935, 0.947, 0.926 and 0.913, respectively, which were better than the performance of previously published DILI prediction models. The method adopted the ResNet18 embedding method to vectorize molecular structure images, and the evaluation indicators of ResNet18DNN were obtained after 10 000 iterations. This prediction approach is expected to improve the performance of DILI predictive models and to provide an accurate and precise early warning method for DILI in drug development and clinical medication.
Affiliation(s)
- Zhao Chen
- Key Laboratory of Chinese Internal Medicine of Ministry of Education, Dongzhimen Hospital, Beijing University of Chinese Medicine, Beijing 100700, China
- Yin Jiang
- Key Laboratory of Chinese Internal Medicine of Ministry of Education, Dongzhimen Hospital, Beijing University of Chinese Medicine, Beijing 100700, China
- Xiaoyu Zhang
- Key Laboratory of Chinese Internal Medicine of Ministry of Education, Dongzhimen Hospital, Beijing University of Chinese Medicine, Beijing 100700, China
- Rui Zheng
- Key Laboratory of Chinese Internal Medicine of Ministry of Education, Dongzhimen Hospital, Beijing University of Chinese Medicine, Beijing 100700, China
- Ruijin Qiu
- Key Laboratory of Chinese Internal Medicine of Ministry of Education, Dongzhimen Hospital, Beijing University of Chinese Medicine, Beijing 100700, China
- Yang Sun
- Key Laboratory of Chinese Internal Medicine of Ministry of Education, Dongzhimen Hospital, Beijing University of Chinese Medicine, Beijing 100700, China
- Chen Zhao
- Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, Beijing 100700, China
- Hongcai Shang
- Key Laboratory of Chinese Internal Medicine of Ministry of Education, Dongzhimen Hospital, Beijing University of Chinese Medicine, Beijing 100700, China
- College of Integrated Traditional Chinese and Western Medicine, Hunan University of Chinese Medicine, Changsha, Hunan 410208, China
|
22
|
Muller C, Rabal O, Diaz Gonzalez C. Artificial Intelligence, Machine Learning, and Deep Learning in Real-Life Drug Design Cases. Methods Mol Biol 2021; 2390:383-407. [PMID: 34731478 DOI: 10.1007/978-1-0716-1787-8_16] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 02/06/2023]
Abstract
The discovery and development of drugs is a long and expensive process with a high attrition rate. Computational drug discovery contributes to ligand discovery and optimization, by using models that describe the properties of ligands and their interactions with biological targets. In recent years, artificial intelligence (AI) has made remarkable modeling progress, driven by new algorithms and by the increase in computing power and storage capacities, which allow the processing of large amounts of data in a short time. This review provides the current state of the art of AI methods applied to drug discovery, with a focus on structure- and ligand-based virtual screening, library design and high-throughput analysis, drug repurposing and drug sensitivity, de novo design, chemical reactions and synthetic accessibility, ADMET, and quantum mechanics.
Collapse
Affiliation(s)
- Christophe Muller
- Evotec (France) SAS, Computational Drug Discovery, Integrated Drug Discovery, Toulouse, France
| | - Obdulia Rabal
- Evotec (France) SAS, Computational Drug Discovery, Integrated Drug Discovery, Toulouse, France
23
Nguyen-Vo TH, Trinh QH, Nguyen L, Nguyen-Hoang PU, Nguyen TN, Nguyen DT, Nguyen BP, Le L. iCYP-MFE: Identifying Human Cytochrome P450 Inhibitors Using Multitask Learning and Molecular Fingerprint-Embedded Encoding. J Chem Inf Model 2021; 62:5059-5068. [PMID: 34672553 DOI: 10.1021/acs.jcim.1c00628] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 01/20/2023]
Abstract
The human cytochrome P450 (CYP) superfamily holds responsibilities for the metabolism of both endogenous and exogenous compounds such as drugs, cellular metabolites, and toxins. The inhibition exerted on the CYP enzymes is closely associated with adverse drug reactions encompassing metabolic failures and induced side effects. In modern drug discovery, identification of potential CYP inhibitors is, therefore, highly essential. Alongside experimental approaches, numerous computational models have been proposed to address this biochemical issue. In this study, we introduce iCYP-MFE, a computational framework for virtual screening on CYP inhibitors toward 1A2, 2C9, 2C19, 2D6, and 3A4 isoforms. iCYP-MFE contains a set of five robust, stable, and effective prediction models developed using multitask learning incorporated with molecular fingerprint-embedded features. The results show that multitask learning can remarkably leverage useful information from related tasks to promote global performance. Comparative analysis indicates that iCYP-MFE achieves three predominant tasks, one equivalent task, and one less effective task compared to state-of-the-art methods. The area under the receiver operating characteristic curve (AUC-ROC) and the area under the precision-recall curve (AUC-PR) were two decisive metrics used for model evaluation. The prediction task for CYP2D6-inhibition achieves the highest AUC-ROC value of 0.93 while the prediction task for CYP1A2-inhibition obtains the highest AUC-PR value of 0.92. The substructural analysis preliminarily explains the nature of the CYP-inhibitory activity of compounds. An online web server for iCYP-MFE with a user-friendly interface was also deployed to support scientific communities in identifying CYP inhibitors.
Affiliation(s)
- Thanh-Hoang Nguyen-Vo
- School of Mathematics and Statistics, Victoria University of Wellington, Kelburn Parade, Wellington 6140, New Zealand
- Quang H Trinh
- Computational Biology Center, International University-VNU HCMC, Ho Chi Minh City 700000, Vietnam
- Loc Nguyen
- Computational Biology Center, International University-VNU HCMC, Ho Chi Minh City 700000, Vietnam
- Phuong-Uyen Nguyen-Hoang
- Computational Biology Center, International University-VNU HCMC, Ho Chi Minh City 700000, Vietnam
- Thien-Ngan Nguyen
- Computational Biology Center, International University-VNU HCMC, Ho Chi Minh City 700000, Vietnam
- Dung T Nguyen
- School of Information and Communication Technology, Hanoi University of Science and Technology, Hanoi 100000, Vietnam
- Binh P Nguyen
- School of Mathematics and Statistics, Victoria University of Wellington, Kelburn Parade, Wellington 6140, New Zealand
- Ly Le
- Computational Biology Center, International University-VNU HCMC, Ho Chi Minh City 700000, Vietnam; Vingroup Big Data Institute, Ha Noi 100000, Vietnam
24
Nguyen-Vo TH, Trinh QH, Nguyen L, Do TTT, Chua MCH, Nguyen BP. Predicting Antimalarial Activity in Natural Products Using Pretrained Bidirectional Encoder Representations from Transformers. J Chem Inf Model 2021; 62:5050-5058. [DOI: 10.1021/acs.jcim.1c00584] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/29/2022]
Affiliation(s)
- Thanh-Hoang Nguyen-Vo
- School of Mathematics and Statistics, Victoria University of Wellington, Kelburn Parade, Wellington 6140, New Zealand
- Quang H. Trinh
- Computational Biology Center, International University-VNU HCMC, Ho Chi Minh City 700000, Vietnam
- Loc Nguyen
- Computational Biology Center, International University-VNU HCMC, Ho Chi Minh City 700000, Vietnam
- Trang T. T. Do
- School of Business and Information Technology, Wellington Institute of Technology, 21 Kensington Avenue, Lower Hutt 5012, New Zealand
- Matthew Chin Heng Chua
- Institute of Systems Science, National University of Singapore, 29 Heng Mui Keng Terrace, Singapore 119620, Singapore
- Binh P. Nguyen
- School of Mathematics and Statistics, Victoria University of Wellington, Kelburn Parade, Wellington 6140, New Zealand