Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Guo W, Liu J, Dong F, Song M, Li Z, Khan MKH, Patterson TA, Hong H. Review of machine learning and deep learning models for toxicity prediction. Exp Biol Med (Maywood) 2023;248:1952-1973. [PMID: 38057999 PMCID: PMC10798180 DOI: 10.1177/15353702231209421] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/08/2023] Open

Cited by Other Article(s)

Seal S, Mahale M, García-Ortegón M, Joshi CK, Hosseini-Gerami L, Beatson A, Greenig M, Shekhar M, Patra A, Weis C, Mehrjou A, Badré A, Paisley B, Lowe R, Singh S, Shah F, Johannesson B, Williams D, Rouquie D, Clevert DA, Schwab P, Richmond N, Nicolaou CA, Gonzalez RJ, Naven R, Schramm C, Vidler LR, Mansouri K, Walters WP, Wilk DD, Spjuth O, Carpenter AE, Bender A. Machine Learning for Toxicity Prediction Using Chemical Structures: Pillars for Success in the Real World. Chem Res Toxicol 2025. [PMID: 40314361 DOI: 10.1021/acs.chemrestox.5c00033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/03/2025]

Abstract

Machine learning (ML) is increasingly valuable for predicting molecular properties and toxicity in drug discovery. However, toxicity-related end points have always been challenging to evaluate experimentally with respect to in vivo translation due to the required resources for human and animal studies; this has impacted data availability in the field. ML can augment or even potentially replace traditional experimental processes depending on the project phase and specific goals of the prediction. For instance, models can be used to select promising compounds for on-target effects or to deselect those with undesirable characteristics (e.g., off-target or ineffective due to unfavorable pharmacokinetics). However, reliance on ML is not without risks, due to biases stemming from nonrepresentative training data, incompatible choice of algorithm to represent the underlying data, or poor model building and validation approaches. This might lead to inaccurate predictions, misinterpretation of the confidence in ML predictions, and ultimately suboptimal decision-making. Hence, understanding the predictive validity of ML models is of utmost importance to enable faster drug development timelines while improving the quality of decisions. This perspective emphasizes the need to enhance the understanding and application of machine learning models in drug discovery, focusing on well-defined data sets for toxicity prediction based on small molecule structures. We focus on five crucial pillars for success with ML-driven molecular property and toxicity prediction: (1) data set selection, (2) structural representations, (3) model algorithm, (4) model validation, and (5) translation of predictions to decision-making. Understanding these key pillars will foster collaboration and coordination between ML researchers and toxicologists, which will help to advance drug discovery and development.

Affiliation(s)

Srijit Seal: Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, United States; Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, U.K.
Manas Mahale: Department of Pharmaceutical Chemistry, Bombay College of Pharmacy, Mumbai 400098, India
Miguel García-Ortegón: Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, U.K.
Chaitanya K Joshi: Department of Computer Science and Technology, University of Cambridge, Cambridge CB3 0FD, U.K.
Layla Hosseini-Gerami: IgnotaLabs, Cambridge CB4 0GA, U.K.
Alex Beatson: Axiom Bio, San Francisco, California 94107, United States
Matthew Greenig: Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, U.K.
Mrinal Shekhar: Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, United States
Arijit Patra: UCB Pharma U.K., Slough SL1 3WE, U.K.
Caroline Weis: GSK, London WC1A 1DG, U.K.
Arash Mehrjou: GSK, London WC1A 1DG, U.K.
Adrien Badré: Novartis Biomedical Research, Cambridge, Massachusetts 02139, United States
Brianna Paisley: Eli Lilly & Company, Indianapolis, Indiana 46285, United States
Rhiannon Lowe: Relation Therapeutics, London NW1 3BG, U.K.
Shantanu Singh: Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, United States
Falgun Shah: Non Clinical Drug Safety, Merck Inc., West Point, Pennsylvania 19486, United States
Bjarki Johannesson: AstraZeneca, Pepparedsleden 1 43183 Molndal, Sweden
Dominic Williams: AstraZeneca, Pepparedsleden 1 43183 Molndal, Sweden
David Rouquie: Toxicology Data Science, Bayer SAS Crop Science Division, Valbonne Sophia-Antipolis 06560, France
Djork-Arné Clevert: Pfizer, Worldwide Research, Development and Medical, Machine Learning & Computational Sciences, Berlin 10922, Germany
Patrick Schwab: GSK, London WC1A 1DG, U.K.
Nicola Richmond: Recursion, London N1C 4AG, U.K.
Christos A Nicolaou: Computational Drug Design, Digital Science & Innovation, Novo Nordisk US R&D, Lexington, Massachusetts 02421, United States
Raymond J Gonzalez: Non Clinical Drug Safety, Merck Inc., West Point, Pennsylvania 19486, United States
Russell Naven: Novartis Biomedical Research, Cambridge, Massachusetts 02139, United States
Carolin Schramm: Sanofi, Babraham Research Campus, Cambridge CB22 3AT, U.K.
Lewis R Vidler: Eli Lilly and Company, Bracknell RG12 1PU, U.K.
Kamel Mansouri: NIH/NIEHS/DTT/NICEATM, Research Triangle Park, North Carolina 27709, United States
W Patrick Walters: Relay Therapeutics, Cambridge, Massachusetts 02141, United States
Deidre Dalmas Wilk: Nonclinical Safety, Collegeville, Pennsylvania 19426, United States
Ola Spjuth: Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, Uppsala 751 24, Sweden; Phenaros Pharmaceuticals AB, Uppsala 75239, Sweden
Anne E Carpenter: Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, United States
Andreas Bender: Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, U.K.; College of Medicine and Health Sciences, Khalifa University of Science and Technology, Abu Dhabi 127788, United Arab Emirates

Chen YQ, Yu T, Song ZQ, Wang CY, Luo JT, Xiao Y, Qiu H, Wang QQ, Jin HM. Application of Large Language Models in Drug-Induced Osteotoxicity Prediction. J Chem Inf Model 2025;65:3370-3379. [PMID: 40114317 DOI: 10.1021/acs.jcim.5c00275] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/22/2025]

Kim S, Yang S, Jung J, Choi J, Kang M, Joo J. Psychedelic Drugs in Mental Disorders: Current Clinical Scope and Deep Learning-Based Advanced Perspectives. Adv Sci (Weinh) 2025;12:e2413786. [PMID: 40112231 PMCID: PMC12005819 DOI: 10.1002/advs.202413786] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2024] [Revised: 02/13/2025] [Indexed: 03/22/2025]

Duy H, Srisongkram T. Bidirectional Long Short-Term Memory (BiLSTM) Neural Networks with Conjoint Fingerprints: Application in Predicting Skin-Sensitizing Agents in Natural Compounds. J Chem Inf Model 2025;65:3035-3047. [PMID: 40029998 PMCID: PMC11938345 DOI: 10.1021/acs.jcim.5c00032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/07/2025] [Revised: 02/19/2025] [Accepted: 02/20/2025] [Indexed: 03/25/2025]

Abstract

Skin sensitization, or allergic contact dermatitis, represents a critical end point in toxicity assessment, with profound implications for drug safety and regulatory decision-making. This study aims to develop a robust deep-learning-based quantitative structure-activity relationship framework for accurately predicting skin sensitization toxicity, particularly in the context of natural-product-derived compounds. To achieve this, we explored advanced recurrent neural network architectures, including long short-term memory (LSTM), bidirectional LSTM (BiLSTM), gated recurrent unit (GRU), and bidirectional GRU, to model the intricate structure-toxicity relationships inherent in molecular compounds. We aim to optimize and improve predictive performance by training a cohort of 55 models with a diverse set of molecular fingerprints. Notably, the BiLSTM model, which integrates SMILES tokens with RDKit fingerprints, achieved superior predictive performance, underscoring its capability to effectively capture key molecular determinants of skin sensitization. An extensive applicability domain analysis coupled with an in-depth evaluation of feature importance provided new insights into the key molecular attributes that influence sensitization propensity. We further evaluated the BiLSTM model using a natural product data set, where it demonstrated exceptional generalization capabilities. The model achieved an accuracy of 86.5%, a Matthews correlation coefficient of 75.2%, a sensitivity of 100%, an area under the curve of 88%, a specificity of 75%, and an F1-score of 88.8%. Remarkably, the model effectively categorized natural products by discriminating sensitizing from non-sensitizing agents across various natural product subcategories. These results underscore the potential of BiLSTM-based models as powerful in silico tools for modern drug discovery efforts and regulatory assessments, especially in the field of natural products.
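The evaluation measures named in the abstract above (accuracy, Matthews correlation coefficient, sensitivity, specificity, F1-score) can all be derived from a binary confusion matrix. The following is a minimal sketch of those standard formulas, not the authors' evaluation code; the function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
from math import sqrt


def binary_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    accuracy = (tp + tn) / total if total else 0.0
    f1_denominator = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denominator if f1_denominator else 0.0
    mcc_denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denominator if mcc_denominator else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts chosen only for illustration (not the study's data).
metrics = binary_metrics(tp=8, fp=2, tn=6, fn=0)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With no false negatives, sensitivity reaches 1.0 while specificity and MCC still reflect the false positives, which is why the abstract reports several measures rather than accuracy alone.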

Liu J, Li J, Li Z, Dong F, Guo W, Ge W, Patterson TA, Hong H. Developing predictive models for µ opioid receptor binding using machine learning and deep learning techniques. Exp Biol Med (Maywood) 2025;250:10359. [PMID: 40177220 PMCID: PMC11961360 DOI: 10.3389/ebm.2025.10359] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2024] [Accepted: 02/25/2025] [Indexed: 04/05/2025] Open

Duy H, Srisongkram T. Comparative Analysis of Recurrent Neural Networks with Conjoint Fingerprints for Skin Corrosion Prediction. J Chem Inf Model 2025;65:1305-1317. [PMID: 39835935 PMCID: PMC11815816 DOI: 10.1021/acs.jcim.4c02062] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2024] [Revised: 12/28/2024] [Accepted: 01/03/2025] [Indexed: 01/22/2025]

Kakar M, Huynh BN, Zlygosteva O, Juvkam IS, Edin N, Tomic O, Futsaether CM, Malinen E. Attention-based Vision Transformer Enables Early Detection of Radiotherapy-Induced Toxicity in Magnetic Resonance Images of a Preclinical Model. Technol Cancer Res Treat 2025;24:15330338251333018. [PMID: 40183426 PMCID: PMC11970093 DOI: 10.1177/15330338251333018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2025] [Revised: 03/07/2025] [Accepted: 03/20/2025] [Indexed: 04/05/2025] Open

Wei Y, Qiu T, Ai Y, Zhang Y, Xie J, Zhang D, Luo X, Sun X, Wang X, Qiu J. Advances of computational methods enhance the development of multi-epitope vaccines. Brief Bioinform 2024;26:bbaf055. [PMID: 39951549 PMCID: PMC11827616 DOI: 10.1093/bib/bbaf055] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2024] [Revised: 11/28/2024] [Accepted: 01/27/2025] [Indexed: 02/16/2025] Open

Affiliation(s)

Yiwen Wei: School of Health Science and Engineering, University of Shanghai for Science and Technology, No. 334, Jungong Road, Yangpu District, Shanghai 200093, China
Tianyi Qiu: Institute of Clinical Science, Zhongshan Hospital; Intelligent Medicine Institute; Shanghai Institute of Infectious Disease and Biosecurity, Shanghai Medical College, Fudan University, No. 180, Fenglin Road, Xuhui District, Shanghai 200032, China
Yisi Ai: School of Health Science and Engineering, University of Shanghai for Science and Technology, No. 334, Jungong Road, Yangpu District, Shanghai 200093, China
Yuxi Zhang: School of Health Science and Engineering, University of Shanghai for Science and Technology, No. 334, Jungong Road, Yangpu District, Shanghai 200093, China
Junting Xie: School of Health Science and Engineering, University of Shanghai for Science and Technology, No. 334, Jungong Road, Yangpu District, Shanghai 200093, China
Dong Zhang: School of Health Science and Engineering, University of Shanghai for Science and Technology, No. 334, Jungong Road, Yangpu District, Shanghai 200093, China
Xiaochuan Luo: School of Health Science and Engineering, University of Shanghai for Science and Technology, No. 334, Jungong Road, Yangpu District, Shanghai 200093, China
Xiulan Sun: State Key Laboratory of Food Science and Technology, School of Food Science and Technology, National Engineering Research Center for Functional Foods, Synergetic Innovation Center of Food Safety and Nutrition, Jiangnan University, Lihu Avenue 1800, Wuxi, Jiangsu 214122, China
Xin Wang: School of Health Science and Engineering, University of Shanghai for Science and Technology, No. 334, Jungong Road, Yangpu District, Shanghai 200093, China; Shanghai Collaborative Innovation Center of Energy Therapy for Tumors, No. 334, Jungong Road, Yangpu District, Shanghai 200093, China
Jingxuan Qiu: School of Health Science and Engineering, University of Shanghai for Science and Technology, No. 334, Jungong Road, Yangpu District, Shanghai 200093, China; Shanghai Collaborative Innovation Center of Energy Therapy for Tumors, No. 334, Jungong Road, Yangpu District, Shanghai 200093, China

Guo W, Liu J, Dong F, Hong H. Unlocking the potential of AI: Machine learning and deep learning models for predicting carcinogenicity of chemicals. J Environ Sci Health C Toxicol Carcinog 2024;43:23-50. [PMID: 39228157 DOI: 10.1080/26896583.2024.2396731] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/05/2024]

Arab I, Laukens K, Bittremieux W. Semisupervised Learning to Boost hERG, Nav1.5, and Cav1.2 Cardiac Ion Channel Toxicity Prediction by Mining a Large Unlabeled Small Molecule Data Set. J Chem Inf Model 2024;64:6410-6420. [PMID: 39110924 DOI: 10.1021/acs.jcim.4c01102] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/27/2024]

Huang ETC, Yang JS, Liao KYK, Tseng WCW, Lee CK, Gill M, Compas C, See S, Tsai FJ. Predicting blood-brain barrier permeability of molecules with a large language model and machine learning. Sci Rep 2024;14:15844. [PMID: 38982309 PMCID: PMC11233737 DOI: 10.1038/s41598-024-66897-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2024] [Accepted: 07/05/2024] [Indexed: 07/11/2024] Open

Abstract

Predicting the blood-brain barrier (BBB) permeability of small-molecule compounds using a novel artificial intelligence platform is necessary for drug discovery. Machine learning and a large language model on artificial intelligence (AI) tools improve the accuracy and shorten the time for new drug development. The primary goal of this research is to develop artificial intelligence (AI) computing models and novel deep learning architectures capable of predicting whether molecules can permeate the human blood-brain barrier (BBB). The in silico (computational) and in vitro (experimental) results were validated by the Natural Products Research Laboratories (NPRL) at China Medical University Hospital (CMUH). The transformer-based MegaMolBART was used as the simplified molecular input line entry system (SMILES) encoder with an XGBoost classifier as an in silico method to check if a molecule could cross through the BBB. We used Morgan or Circular fingerprints to apply the Morgan algorithm to a set of atomic invariants as a baseline encoder also with an XGBoost classifier to compare the results. BBB permeability was assessed in vitro using three-dimensional (3D) human BBB spheroids (human brain microvascular endothelial cells, brain vascular pericytes, and astrocytes). Using multiple BBB databases, the results of the final in silico transformer and XGBoost model achieved an area under the receiver operating characteristic curve of 0.88 on the held-out test dataset. Temozolomide (TMZ) and 21 randomly selected BBB permeable compounds (Pred scores = 1, indicating BBB-permeable) from the NPRL penetrated human BBB spheroid cells. No evidence suggests that ferulic acid or five BBB-impermeable compounds (Pred scores < 1.29423E-05, which designate compounds that pass through the human BBB) can pass through the spheroid cells of the BBB. Our validation of in vitro experiments indicated that the in silico prediction of small-molecule permeation in the BBB model is accurate. 
Transformer-based models like MegaMolBART, leveraging the SMILES representations of molecules, show great promise for applications in new drug discovery. These models have the potential to accelerate the development of novel targeted treatments for disorders of the central nervous system.

Liu J, Khan MKH, Guo W, Dong F, Ge W, Zhang C, Gong P, Patterson TA, Hong H. Machine learning and deep learning approaches for enhanced prediction of hERG blockade: a comprehensive QSAR modeling study. Expert Opin Drug Metab Toxicol 2024;20:665-684. [PMID: 38968091 DOI: 10.1080/17425255.2024.2377593] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2024] [Accepted: 06/26/2024] [Indexed: 07/07/2024]

Khan MK, Raza M, Shahbaz M, Hussain I, Khan MF, Xie Z, Shah SSA, Tareen AK, Bashir Z, Khan K. The recent advances in the approach of artificial intelligence (AI) towards drug discovery. Front Chem 2024;12:1408740. [PMID: 38882215 PMCID: PMC11176507 DOI: 10.3389/fchem.2024.1408740] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2024] [Accepted: 04/26/2024] [Indexed: 06/18/2024] Open

Dong F, Guo W, Liu J, Patterson TA, Hong H. BERT-based language model for accurate drug adverse event extraction from social media: implementation, evaluation, and contributions to pharmacovigilance practices. Front Public Health 2024;12:1392180. [PMID: 38716250 PMCID: PMC11074401 DOI: 10.3389/fpubh.2024.1392180] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2024] [Accepted: 04/11/2024] [Indexed: 05/18/2024] Open

Abstract

Introduction

Social media platforms serve as a valuable resource for users to share health-related information, aiding in the monitoring of adverse events linked to medications and treatments in drug safety surveillance. However, extracting drug-related adverse events accurately and efficiently from social media poses challenges in both natural language processing research and the pharmacovigilance domain.

Method

Recognizing the lack of detailed implementation and evaluation of Bidirectional Encoder Representations from Transformers (BERT)-based models for drug adverse event extraction on social media, we developed a BERT-based language model tailored to identifying drug adverse events in this context. Our model utilized publicly available labeled adverse event data from the ADE-Corpus-V2. Constructing the BERT-based model involved optimizing key hyperparameters, such as the number of training epochs, batch size, and learning rate. Through ten hold-out evaluations on ADE-Corpus-V2 data and external social media datasets, our model consistently demonstrated high accuracy in drug adverse event detection.

Result

The hold-out evaluations resulted in average F1 scores of 0.8575, 0.9049, and 0.9813 for detecting words of adverse events, words in adverse events, and words not in adverse events, respectively. External validation using human-labeled adverse event tweets data from SMM4H further substantiated the effectiveness of our model, yielding F1 scores 0.8127, 0.8068, and 0.9790 for detecting words of adverse events, words in adverse events, and words not in adverse events, respectively.

Discussion

This study not only showcases the effectiveness of BERT-based language models in accurately identifying drug-related adverse events in the dynamic landscape of social media data, but also addresses the need for the implementation of a comprehensive study design and evaluation. By doing so, we contribute to the advancement of pharmacovigilance practices and methodologies in the context of emerging information sources like social media.
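The Result section of this abstract reports a separate F1 score for each of three word-level classes. A minimal sketch of how per-class F1 can be computed from aligned token labels is given below. It assumes a B/I/O-style tagging scheme, where `B` stands for the first word of an adverse event, `I` for a word inside one, and `O` for any other word; that mapping, the function name `per_class_f1`, and the toy label sequences are illustrative assumptions and do not come from the paper.

```python
from collections import Counter


def per_class_f1(true_labels, pred_labels):
    """Return {label: F1} for two aligned sequences of token labels."""
    if len(true_labels) != len(pred_labels):
        raise ValueError("label sequences must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(true_labels, pred_labels):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = {}
    for label in set(true_labels) | set(pred_labels):
        denominator = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denominator if denominator else 0.0
    return scores


# Toy sequences for illustration only (hypothetical, not from ADE-Corpus-V2 or SMM4H).
true_tags = ["O", "B", "I", "O", "O", "B", "O"]
pred_tags = ["O", "B", "O", "O", "O", "B", "I"]
scores = per_class_f1(true_tags, pred_tags)
for label in sorted(scores):
    print(label, round(scores[label], 4))
```

Reporting F1 per class, as the abstract does, keeps the dominant "not in adverse events" class from masking weaker performance on the rarer adverse-event words.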

Hong H, Slikker W. Integrating artificial intelligence with bioinformatics promotes public health. Exp Biol Med (Maywood) 2023;248:1905-1907. [PMID: 38179798 PMCID: PMC10798184 DOI: 10.1177/15353702231223575] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/06/2024] Open