1
Lee SH, Choi E, Park J, Yoon S, Song MH, Lee JY, Seo J, Shin SK, Lee SH, Oh HB. Prediction of reproductive and developmental toxicity using an attention and gate augmented graph convolutional network. Sci Rep 2025; 15:18186. [PMID: 40415056 DOI: 10.1038/s41598-025-02590-y] [Received: 11/18/2024] [Accepted: 05/14/2025] [Indexed: 05/27/2025] Open
Abstract
Due to the diverse molecular structures of chemical compounds and their intricate biological pathways of toxicity, predicting their reproductive and developmental toxicity remains a challenge. Traditional Quantitative Structure-Activity Relationship models that rely on molecular descriptors have limitations in capturing the complexity of reproductive and developmental toxicity to achieve high predictive performance. In this study, we developed a descriptor-free deep learning model by constructing a Graph Convolutional Network designed with multi-head attention and gated skip-connections to predict reproductive and developmental toxicity. By integrating structural alerts directly related to toxicity into the model, we enabled more effective learning of toxicologically relevant substructures. We built a dataset of 4,514 diverse compounds, including both organic and inorganic substances. The model was trained and validated using stratified 5-fold cross-validation. It demonstrated excellent predictive performance, achieving an accuracy of 81.19% on the test set. To address the interpretability of the deep learning model, we identified subgraphs corresponding to known structural alerts, providing insights into the model's decision-making process. This study was conducted in accordance with the OECD principles for reliable Quantitative Structure-Activity Relationship modeling and contributes to the development of robust in silico models for toxicity prediction.
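The abstract above mentions stratified 5-fold cross-validation. As a rough illustration of what stratification means, the sketch below splits a labeled dataset into folds that each keep roughly the same class proportions. The function name `stratified_k_fold` and the toy labels are assumptions made for this example; they are not taken from the paper, and this is not the authors' pipeline.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=5, seed=0):
    """Return k lists of indices with roughly equal class proportions per fold."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's indices round-robin so every fold gets a similar share.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


# Hypothetical toy labels: 1 = toxic, 0 = non-toxic (not the paper's data).
labels = [1] * 20 + [0] * 30
folds = stratified_k_fold(labels, k=5)
for i, test_idx in enumerate(folds):
    # Every fold serves once as the held-out set; the rest form the training set.
    train_idx = [j for f in folds if f is not test_idx for j in f]
    positives = sum(labels[j] for j in test_idx)
    print(f"fold {i}: {len(train_idx)} train, {len(test_idx)} test, {positives} positives")
```

In practice a library routine such as scikit-learn's `StratifiedKFold` would normally be used instead of a hand-written split.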
Affiliation(s)
- Si Hoon Lee
  - Department of Chemistry, School of Natural Sciences, Sogang University, Seoul, 04107, Republic of Korea
- Eunwoo Choi
  - Department of Chemistry, School of Natural Sciences, Sogang University, Seoul, 04107, Republic of Korea
- JunHo Park
  - Department of Chemistry, School of Natural Sciences, Sogang University, Seoul, 04107, Republic of Korea
- Seohwi Yoon
  - Department of Chemistry, School of Natural Sciences, Sogang University, Seoul, 04107, Republic of Korea
- Myung-Ha Song
  - Environmental Risk Research Division, Environmental Health Research Department, National Institute of Environmental Research, Incheon, 22689, Republic of Korea
- Ji Young Lee
  - Environmental Risk Research Division, Environmental Health Research Department, National Institute of Environmental Research, Incheon, 22689, Republic of Korea
- Jungkwan Seo
  - Environmental Risk Research Division, Environmental Health Research Department, National Institute of Environmental Research, Incheon, 22689, Republic of Korea
- Sun Kyung Shin
  - Environmental Risk Research Division, Environmental Health Research Department, National Institute of Environmental Research, Incheon, 22689, Republic of Korea
- Sang Hee Lee
  - Environmental Risk Research Division, Environmental Health Research Department, National Institute of Environmental Research, Incheon, 22689, Republic of Korea
- Han Bin Oh
  - Department of Chemistry, School of Natural Sciences, Sogang University, Seoul, 04107, Republic of Korea
2
Liu J, Li J, Li Z, Dong F, Guo W, Ge W, Patterson TA, Hong H. Developing predictive models for µ opioid receptor binding using machine learning and deep learning techniques. Exp Biol Med (Maywood) 2025; 250:10359. [PMID: 40177220 PMCID: PMC11961360 DOI: 10.3389/ebm.2025.10359] [Received: 08/30/2024] [Accepted: 02/25/2025] [Indexed: 04/05/2025] Open
Abstract
Opioids exert their analgesic effect by binding to the µ opioid receptor (MOR), which initiates a downstream signaling pathway, eventually inhibiting pain transmission in the spinal cord. However, current opioids are addictive, often leading to overdose and contributing to the opioid crisis in the United States. Therefore, understanding the structure-activity relationship between MOR and its ligands is essential for predicting MOR binding of chemicals, which could assist in the development of non-addictive or less-addictive opioid analgesics. This study aimed to develop machine learning and deep learning models for predicting MOR binding activity of chemicals. Chemicals with MOR binding activity data were first curated from public databases and the literature. Molecular descriptors of the curated chemicals were calculated using the Mold2 software. The chemicals were then split into training and external validation datasets. Random forest, k-nearest neighbors, support vector machine, multi-layer perceptron, and long short-term memory models were developed and evaluated using 5-fold cross-validations and external validations, resulting in Matthews correlation coefficients of 0.528-0.654 and 0.408, respectively. Furthermore, prediction confidence and applicability domain analyses highlighted their importance to the models' applicability. Our results suggest that the developed models could be useful for identifying MOR binders, potentially aiding in the development of non-addictive or less-addictive drugs targeting MOR.
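The abstract above reports model quality as Matthews correlation coefficients (MCC). For readers unfamiliar with the metric, the following is a minimal sketch of the standard binary MCC formula computed from a confusion matrix. The function name and the toy labels are assumptions for illustration only and have no connection to the study's data.

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """Binary MCC: (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical labels: 1 = binder, 0 = non-binder.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
mcc = matthews_corrcoef(y_true, y_pred)
print(mcc)  # 0.5 (TP = 3, TN = 3, FP = 1, FN = 1)
```

MCC ranges from -1 to 1, with 0 corresponding to chance-level prediction, which is why it is often preferred over plain accuracy on imbalanced binding datasets.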
Affiliation(s)
- Jie Liu
  - U.S. Food and Drug Administration, National Center for Toxicological Research, Jefferson, AR, United States
- Jerry Li
  - Department of Computer Science, Rice University, Houston, TX, United States
- Zoe Li
  - U.S. Food and Drug Administration, National Center for Toxicological Research, Jefferson, AR, United States
- Fan Dong
  - U.S. Food and Drug Administration, National Center for Toxicological Research, Jefferson, AR, United States
- Wenjing Guo
  - U.S. Food and Drug Administration, National Center for Toxicological Research, Jefferson, AR, United States
- Weigong Ge
  - U.S. Food and Drug Administration, National Center for Toxicological Research, Jefferson, AR, United States
- Tucker A. Patterson
  - U.S. Food and Drug Administration, National Center for Toxicological Research, Jefferson, AR, United States
- Huixiao Hong
  - U.S. Food and Drug Administration, National Center for Toxicological Research, Jefferson, AR, United States
3
Miller LB, Feuz MB, Meyer RG, Meyer-Ficca ML. Reproductive toxicology: keeping up with our changing world. Frontiers in Toxicology 2024; 6:1456687. [PMID: 39463893 PMCID: PMC11502475 DOI: 10.3389/ftox.2024.1456687] [Received: 06/29/2024] [Accepted: 09/26/2024] [Indexed: 10/29/2024] Open
Abstract
Reproductive toxicology testing is essential to safeguard public health of current and future generations. Traditional toxicological testing of male reproduction has focused on evaluating substances for acute toxicity to the reproductive system, with fertility assessment as a main endpoint and infertility a main adverse outcome. Newer studies in the last few decades have significantly widened our understanding of what represents an adverse event in reproductive toxicology, and thus changed our perspective of what constitutes a reproductive toxicant, such as endocrine disrupting chemicals that affect fertility and offspring health in an intergenerational manner. Besides infertility or congenital abnormalities, adverse outcomes can present as increased likelihood for various health problems in offspring, including metabolic syndrome, neurodevelopmental problems like autism and increased cancer predisposition, among others. To enable toxicologic studies to accurately represent the population, toxicologic testing designs need to model changing population characteristics and exposure circumstances. Current trends of increasing importance in human reproduction include increased paternal age, with an associated decline of nicotinamide adenine dinucleotide (NAD), and a higher prevalence of obesity, both of which are factors that toxicological testing study design should account for. In this perspective article, we highlighted some limitations of standard testing protocols, the need for expanding the assessed reproductive endpoint by including genetic and epigenetic sperm parameters, and the potential of recent developments, including mixture testing, novel animal models, in vitro systems like organoids, multigenerational testing protocols, as well as in silico modelling, machine learning and artificial intelligence.
Affiliation(s)
- Mirella L. Meyer-Ficca
  - Department of Veterinary, Clinical and Life Sciences, College of Veterinary Medicine, Utah State University, Logan, UT, United States
4
Guo W, Liu J, Dong F, Hong H. Unlocking the potential of AI: Machine learning and deep learning models for predicting carcinogenicity of chemicals. Journal of Environmental Science and Health. Part C, Toxicology and Carcinogenesis 2024; 43:23-50. [PMID: 39228157 DOI: 10.1080/26896583.2024.2396731] [Indexed: 09/05/2024]
Abstract
The escalating apprehension surrounding the carcinogenic potential of chemicals emphasizes the imperative need for efficient methods of assessing carcinogenicity. Conventional experimental approaches such as in vitro and in vivo assays, albeit effective, suffer from being costly and time-consuming. In response to this challenge, new alternative methodologies, notably machine learning and deep learning techniques, have attracted attention for their potential in developing carcinogenicity prediction models. This article reviews the progress in predicting carcinogenicity using various machine learning and deep learning algorithms. A comparative analysis on these developed models reveals that support vector machine, random forest, and ensemble learning are commonly preferred for their robustness and effectiveness in predicting chemical carcinogenicity. Conversely, models based on deep learning algorithms, such as feedforward neural network, convolutional neural network, graph convolutional neural network, capsule neural network, and hybrid neural networks, exhibit promising capabilities but are limited by the size of available carcinogenicity datasets. This review provides a comprehensive analysis of current machine learning and deep learning models for carcinogenicity prediction, underscoring the importance of high-quality and large datasets. These observations are anticipated to catalyze future advancements in developing effective and generalizable machine learning and deep learning models for predicting chemical carcinogenicity.
Affiliation(s)
- Wenjing Guo
  - National Center for Toxicological Research (NCTR), U.S. Food & Drug Administration (FDA), Jefferson, AR
- Jie Liu
  - National Center for Toxicological Research (NCTR), U.S. Food & Drug Administration (FDA), Jefferson, AR
- Fan Dong
  - National Center for Toxicological Research (NCTR), U.S. Food & Drug Administration (FDA), Jefferson, AR
- Huixiao Hong
  - National Center for Toxicological Research (NCTR), U.S. Food & Drug Administration (FDA), Jefferson, AR
5
Liu J, Khan MKH, Guo W, Dong F, Ge W, Zhang C, Gong P, Patterson TA, Hong H. Machine learning and deep learning approaches for enhanced prediction of hERG blockade: a comprehensive QSAR modeling study. Expert Opin Drug Metab Toxicol 2024; 20:665-684. [PMID: 38968091 DOI: 10.1080/17425255.2024.2377593] [Received: 02/27/2024] [Accepted: 06/26/2024] [Indexed: 07/07/2024]
Abstract
BACKGROUND: Cardiotoxicity is a major cause of drug withdrawal. The hERG channel, regulating ion flow, is pivotal for heart and nervous system function. Its blockade is a concern in drug development. Predicting hERG blockade is essential for identifying cardiac safety issues. Various QSAR models exist, but their performance varies. Ongoing improvements show promise, necessitating continued efforts to enhance accuracy using emerging deep learning algorithms in predicting potential hERG blockade. STUDY DESIGN AND METHOD: Using a large training dataset, six individual QSAR models were developed. Additionally, three ensemble models were constructed. All models were evaluated using 10-fold cross-validations and two external datasets. RESULTS: The 10-fold cross-validations resulted in Matthews correlation coefficient (MCC) values from 0.682 to 0.730, surpassing the best-reported model on the same dataset (0.689). External validations yielded MCC values from 0.520 to 0.715 for the first dataset, exceeding those of previously reported models (0-0.599). For the second dataset, MCC values fell between 0.025 and 0.215, aligning with those of reported models (0.112-0.220). CONCLUSIONS: The developed models can assist the pharmaceutical industry and regulatory agencies in predicting hERG blockade activity, thereby enhancing safety assessments and reducing the risk of adverse cardiac events associated with new drug candidates.
Affiliation(s)
- Jie Liu
  - National Center for Toxicological Research, US Food & Drug Administration, Jefferson, AR, USA
- Md Kamrul Hasan Khan
  - National Center for Toxicological Research, US Food & Drug Administration, Jefferson, AR, USA
- Wenjing Guo
  - National Center for Toxicological Research, US Food & Drug Administration, Jefferson, AR, USA
- Fan Dong
  - National Center for Toxicological Research, US Food & Drug Administration, Jefferson, AR, USA
- Weigong Ge
  - National Center for Toxicological Research, US Food & Drug Administration, Jefferson, AR, USA
- Chaoyang Zhang
  - School of Computing Sciences and Computer Engineering, University of Southern Mississippi, Hattiesburg, MS, USA
- Ping Gong
  - Environmental Laboratory, US Army Engineer Research and Development Center, Vicksburg, MS, USA
- Tucker A Patterson
  - National Center for Toxicological Research, US Food & Drug Administration, Jefferson, AR, USA
- Huixiao Hong
  - National Center for Toxicological Research, US Food & Drug Administration, Jefferson, AR, USA
6
De Gregorio V, La Pietra A, Candela A, Oliviero C, Ferrandino I, Tesauro D. Insight on cytotoxic NHC gold(I) halide complexes evaluated in multifaceted culture systems. Curr Res Toxicol 2024; 6:100174. [PMID: 38841651 PMCID: PMC11152893 DOI: 10.1016/j.crtox.2024.100174] [Received: 02/12/2024] [Revised: 04/24/2024] [Accepted: 05/22/2024] [Indexed: 06/07/2024] Open
Abstract
Gold complexes can be a useful system in the fight against cancer. Although many studies have been carried out on in vitro 2D cell culture models, embryotoxicity assays are particularly lacking. Embryotoxicity and DNA damage are critical concerns in drug development. In this study, the effects of a new N-Heterocyclic carbene (NHC)-Au compound (Bromo[1,3-di-4-methoxybenzyl-4,5-bis(4-methoxyphenyl)imidazol-2-ylidene]gold(I)) at different concentrations were explored using a multifaceted approach, encompassing 2D cancer cell cultures, in vivo zebrafish and in vitro bovine models, and compared with a consolidated similar complex (Bromo[1,3-diethyl-4,5-bis(4-methoxyphenyl)imidazol-2-ylidene]gold(I)). The results obtained from 2D cancer cell cultures revealed concentration-dependent effects of the gold compounds by estimating the cytotoxicity with the MTT assay and cellular damage as indicated by LDH release. Selected concentrations of gold complexes demonstrated no adverse effects on zebrafish embryo development. However, in bovine embryos, these same concentrations led to significant impairments in the early developmental stages, triggering cell apoptosis and reducing blastocyst competence. These findings underscore the importance of evaluating drug effects across different model systems to comprehensively assess their safety and potential impact on embryonic development.
Affiliation(s)
- Vincenza De Gregorio
  - Department of Biology University of Naples “Federico II”, Via Cinthia 80126, Napoli, Italy
- Alessandra La Pietra
  - Department of Biology University of Naples “Federico II”, Via Cinthia 80126, Napoli, Italy
- Andrea Candela
  - Department of Biology University of Naples “Federico II”, Via Cinthia 80126, Napoli, Italy
- Carlo Oliviero
  - Department of Experimental Medicine, Section of Biotechnology, Medical Histology and Molecular Biology, University of Campania “Luigi Vanvitelli”, 80138 Naples, Italy
- Ida Ferrandino
  - Department of Biology University of Naples “Federico II”, Via Cinthia 80126, Napoli, Italy
- Diego Tesauro
  - Department of Pharmacy and Interuniversity Research Centre on Bioactive Peptides (CIRPeB), University of Naples “Federico II”, Via Montesano 49, 80131 Naples, Italy
7
Dong F, Guo W, Liu J, Patterson TA, Hong H. BERT-based language model for accurate drug adverse event extraction from social media: implementation, evaluation, and contributions to pharmacovigilance practices. Front Public Health 2024; 12:1392180. [PMID: 38716250 PMCID: PMC11074401 DOI: 10.3389/fpubh.2024.1392180] [Received: 02/27/2024] [Accepted: 04/11/2024] [Indexed: 05/18/2024] Open
Abstract
Introduction: Social media platforms serve as a valuable resource for users to share health-related information, aiding in the monitoring of adverse events linked to medications and treatments in drug safety surveillance. However, extracting drug-related adverse events accurately and efficiently from social media poses challenges in both natural language processing research and the pharmacovigilance domain. Method: Recognizing the lack of detailed implementation and evaluation of Bidirectional Encoder Representations from Transformers (BERT)-based models for drug adverse event extraction on social media, we developed a BERT-based language model tailored to identifying drug adverse events in this context. Our model utilized publicly available labeled adverse event data from the ADE-Corpus-V2. Constructing the BERT-based model involved optimizing key hyperparameters, such as the number of training epochs, batch size, and learning rate. Through ten hold-out evaluations on ADE-Corpus-V2 data and external social media datasets, our model consistently demonstrated high accuracy in drug adverse event detection. Result: The hold-out evaluations resulted in average F1 scores of 0.8575, 0.9049, and 0.9813 for detecting words of adverse events, words in adverse events, and words not in adverse events, respectively. External validation using human-labeled adverse event tweets data from SMM4H further substantiated the effectiveness of our model, yielding F1 scores of 0.8127, 0.8068, and 0.9790 for detecting words of adverse events, words in adverse events, and words not in adverse events, respectively. Discussion: This study not only showcases the effectiveness of BERT-based language models in accurately identifying drug-related adverse events in the dynamic landscape of social media data, but also addresses the need for the implementation of a comprehensive study design and evaluation. By doing so, we contribute to the advancement of pharmacovigilance practices and methodologies in the context of emerging information sources like social media.
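The abstract above reports a separate F1 score for each of three word categories. A minimal sketch of per-class F1 over token labels follows. The label names "B", "I" and "O" are an assumption chosen for this example (a common tagging convention), and the toy sequences are invented; the abstract does not state the label scheme the authors used.

```python
def per_class_f1(y_true, y_pred):
    """Return {label: F1} where F1 = 2*TP / (2*TP + FP + FN) for each label."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        # A label that never occurs and is never predicted gets 0 by convention.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


# Hypothetical token labels for a six-word sentence.
y_true = ["B", "I", "O", "O", "B", "O"]
y_pred = ["B", "O", "O", "O", "B", "I"]
f1 = per_class_f1(y_true, y_pred)
for label, score in f1.items():
    print(label, round(score, 4))
```

Reporting the score per class, rather than one pooled number, keeps the usually dominant "not an adverse event" class from masking weaker performance on the rarer classes.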
Affiliation(s)
- Huixiao Hong
  - National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, United States
8
Anandhi G, Iyapparaja M. Systematic approaches to machine learning models for predicting pesticide toxicity. Heliyon 2024; 10:e28752. [PMID: 38576573 PMCID: PMC10990867 DOI: 10.1016/j.heliyon.2024.e28752] [Received: 07/06/2023] [Revised: 03/13/2024] [Accepted: 03/24/2024] [Indexed: 04/06/2024] Open
Abstract
Pesticides play an important role in modern agriculture by protecting crops from pests and diseases. However, the negative consequences of pesticides, such as environmental contamination and adverse effects on human and ecological health, underscore the importance of accurate toxicity predictions. To address this issue, artificial intelligence models have emerged as valuable methods for predicting the toxicity of organic compounds. In this review article, we explore the application of machine learning (ML) for pesticide toxicity prediction. This review provides a detailed summary of recent developments, prediction models, and datasets used for pesticide toxicity prediction. In this analysis, we compared the results of several algorithms that predict the harmfulness of various classes of pesticides. Furthermore, this review article identified emerging trends and areas for future direction, showcasing the transformative potential of machine learning in promoting safer pesticide usage and sustainable agriculture.
Affiliation(s)
- Ganesan Anandhi
  - Department of Smart Computing, School of Computer Science Engineering and Information Systems, Vellore Institute of Technology, Vellore 632014, Tamil Nadu, India
- M. Iyapparaja
  - Department of Smart Computing, School of Computer Science Engineering and Information Systems, Vellore Institute of Technology, Vellore 632014, Tamil Nadu, India
9
Li Z, Huang R, Xia M, Patterson TA, Hong H. Fingerprinting Interactions between Proteins and Ligands for Facilitating Machine Learning in Drug Discovery. Biomolecules 2024; 14:72. [PMID: 38254672 PMCID: PMC10813698 DOI: 10.3390/biom14010072] [Received: 11/02/2023] [Revised: 12/26/2023] [Accepted: 12/28/2023] [Indexed: 01/24/2024] Open
Abstract
Molecular recognition is fundamental in biology, underpinning intricate processes through specific protein-ligand interactions. This understanding is pivotal in drug discovery, yet traditional experimental methods face limitations in exploring the vast chemical space. Computational approaches, notably quantitative structure-activity/property relationship analysis, have gained prominence. Molecular fingerprints encode molecular structures and serve as property profiles, which are essential in drug discovery. While two-dimensional (2D) fingerprints are commonly used, three-dimensional (3D) structural interaction fingerprints offer enhanced structural features specific to target proteins. Machine learning models trained on interaction fingerprints enable precise binding prediction. Recent focus has shifted to structure-based predictive modeling, with machine-learning scoring functions excelling due to feature engineering guided by key interactions. Notably, 3D interaction fingerprints are gaining ground due to their robustness. Various structural interaction fingerprints have been developed and used in drug discovery, each with unique capabilities. This review recapitulates the developed structural interaction fingerprints and provides two case studies to illustrate the power of interaction fingerprint-driven machine learning. The first elucidates structure-activity relationships in β2 adrenoceptor ligands, demonstrating the ability to differentiate agonists and antagonists. The second employs a retrosynthesis-based pre-trained molecular representation to predict protein-ligand dissociation rates, offering insights into binding kinetics. Despite remarkable progress, challenges persist in interpreting complex machine learning models built on 3D fingerprints, emphasizing the need for strategies to make predictions interpretable. Binding site plasticity and induced fit effects pose additional complexities. Interaction fingerprints are promising but require continued research to harness their full potential.
Affiliation(s)
- Zoe Li
  - National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR 72079, USA
- Ruili Huang
  - National Center for Advancing Translational Sciences, National Institutes of Health, Bethesda, MD 20892, USA
- Menghang Xia
  - National Center for Advancing Translational Sciences, National Institutes of Health, Bethesda, MD 20892, USA
- Tucker A. Patterson
  - National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR 72079, USA
- Huixiao Hong
  - National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR 72079, USA
10
Khan MKH, Guo W, Liu J, Dong F, Li Z, Patterson TA, Hong H. Machine learning and deep learning for brain tumor MRI image segmentation. Exp Biol Med (Maywood) 2023; 248:1974-1992. [PMID: 38102956 PMCID: PMC10798183 DOI: 10.1177/15353702231214259] [Indexed: 12/17/2023] Open
Abstract
Brain tumors are often fatal. Therefore, accurate brain tumor image segmentation is critical for the diagnosis, treatment, and monitoring of patients with these tumors. Magnetic resonance imaging (MRI) is a commonly used imaging technique for capturing brain images. Both machine learning and deep learning techniques are popular in analyzing MRI images. This article reviews some commonly used machine learning and deep learning techniques for brain tumor MRI image segmentation. The limitations and advantages of the reviewed machine learning and deep learning methods are discussed. Even though each of these methods has a well-established status in their individual domains, the combination of two or more techniques is currently an emerging trend.
Affiliation(s)
- Md Kamrul Hasan Khan
  - National Center for Toxicological Research, U.S. Food & Drug Administration, Jefferson, AR 72079, USA
- Wenjing Guo
  - National Center for Toxicological Research, U.S. Food & Drug Administration, Jefferson, AR 72079, USA
- Jie Liu
  - National Center for Toxicological Research, U.S. Food & Drug Administration, Jefferson, AR 72079, USA
- Fan Dong
  - National Center for Toxicological Research, U.S. Food & Drug Administration, Jefferson, AR 72079, USA
- Zoe Li
  - National Center for Toxicological Research, U.S. Food & Drug Administration, Jefferson, AR 72079, USA
- Tucker A Patterson
  - National Center for Toxicological Research, U.S. Food & Drug Administration, Jefferson, AR 72079, USA
- Huixiao Hong
  - National Center for Toxicological Research, U.S. Food & Drug Administration, Jefferson, AR 72079, USA
11
Liu J, Xu L, Guo W, Li Z, Khan MKH, Ge W, Patterson TA, Hong H. Developing a SARS-CoV-2 main protease binding prediction random forest model for drug repurposing for COVID-19 treatment. Exp Biol Med (Maywood) 2023; 248:1927-1936. [PMID: 37997891 PMCID: PMC10798185 DOI: 10.1177/15353702231209413] [Received: 08/24/2023] [Accepted: 09/26/2023] [Indexed: 11/25/2023] Open
Abstract
The coronavirus disease 2019 (COVID-19) global pandemic resulted in millions of people becoming infected with the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) virus and close to seven million deaths worldwide. It is essential to further explore and design effective COVID-19 treatment drugs that target the main protease of SARS-CoV-2, a major target for COVID-19 drugs. In this study, machine learning was applied for predicting the SARS-CoV-2 main protease binding of Food and Drug Administration (FDA)-approved drugs to assist in the identification of potential repurposing candidates for COVID-19 treatment. Ligands bound to the SARS-CoV-2 main protease in the Protein Data Bank and compounds experimentally tested in SARS-CoV-2 main protease binding assays in the literature were curated. These chemicals were divided into training (516 chemicals) and testing (360 chemicals) data sets. To identify SARS-CoV-2 main protease binders as potential candidates for repurposing to treat COVID-19, 1188 FDA-approved drugs from the Liver Toxicity Knowledge Base were obtained. A random forest algorithm was used for constructing predictive models based on molecular descriptors calculated using Mold2 software. Model performance was evaluated using 100 iterations of fivefold cross-validations which resulted in 78.8% balanced accuracy. The random forest model that was constructed from the whole training dataset was used to predict SARS-CoV-2 main protease binding on the testing set and the FDA-approved drugs. Model applicability domain and prediction confidence on drugs predicted as the main protease binders discovered 10 FDA-approved drugs as potential candidates for repurposing to treat COVID-19. Our results demonstrate that machine learning is an efficient method for drug repurposing and, thus, may accelerate drug development targeting SARS-CoV-2.
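The abstract above summarizes cross-validation performance as balanced accuracy. The sketch below shows the usual binary definition, the mean of sensitivity and specificity, next to plain accuracy on an imbalanced toy set. The function name and the labels are assumptions for illustration and are unrelated to the study's data.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity (recall on class 1) and specificity (recall on class 0)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return (sensitivity + specificity) / 2


# Hypothetical imbalanced labels: 2 binders (1) and 8 non-binders (0).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
bal_acc = balanced_accuracy(y_true, y_pred)
print(accuracy, bal_acc)  # 0.8 0.6875
```

Here plain accuracy looks high because most non-binders are classified correctly, while balanced accuracy exposes that only half of the binders were found.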
Affiliation(s)
- Wenjing Guo
  - National Center for Toxicological Research, U.S. Food & Drug Administration, Jefferson, AR 72079, USA
- Zoe Li
  - National Center for Toxicological Research, U.S. Food & Drug Administration, Jefferson, AR 72079, USA
- Md Kamrul Hasan Khan
  - National Center for Toxicological Research, U.S. Food & Drug Administration, Jefferson, AR 72079, USA
- Weigong Ge
  - National Center for Toxicological Research, U.S. Food & Drug Administration, Jefferson, AR 72079, USA
- Tucker A Patterson
  - National Center for Toxicological Research, U.S. Food & Drug Administration, Jefferson, AR 72079, USA
- Huixiao Hong
  - National Center for Toxicological Research, U.S. Food & Drug Administration, Jefferson, AR 72079, USA