1
Chutia H, Borah GS, Mahanta HJ, Nagamani S. BoostDILI: Extreme Gradient Boost-Powered Drug-Induced Liver Injury Prediction and Structural Alerts Generation. Chem Res Toxicol 2025; 38:865-876. [PMID: 40241442 DOI: 10.1021/acs.chemrestox.4c00532] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/18/2025]
Abstract
Over the past 60 years, drug-induced liver injury (DILI) has played a key role in the withdrawal of marketed drugs due to safety concerns. Early prediction of DILI is crucial for developing safer pharmaceuticals, yet current in vitro and in vivo testing methods are complex and cumbersome. In this study, we developed an extreme gradient boosting (XGB)-powered machine learning (ML) model for DILI prediction. Comparing DILI prediction models is challenging because they rely on different public data sets. We comprehensively evaluated the proposed BoostDILI model to address two questions: (1) Can insights derived from public data sets help in DILI prediction for Food and Drug Administration (FDA)-approved drugs? (2) Can we generate structural alerts to improve the model's explainability? To address the first question, we developed a DILI prediction model using four publicly available data sets. This effort led to the BoostDILI model, which achieved a 5-fold cross-validation (CV) accuracy of 0.70. A sequential feature selection method was employed to identify relevant descriptors, and the model integrates descriptors computed with RDKit (12 features) and Mordred (23 features). To address the second question, Bayesian statistics were applied to iteratively identify high-performance substructures, and a structural alerts model was developed. The model was further validated with two FDA-approved drug data sets, DILIst and DILIrank. The BoostDILI model offers a trustworthy solution for evaluating DILI risk in preclinical research, and the structural alerts help identify the substructures that may be responsible for DILI. The data set and the source code are available at https://github.com/Naga270588/BoostDILI.
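The abstract mentions sequential feature selection without giving details. As a rough illustration only, the sketch below shows greedy forward selection, assuming the forward direction and a user-supplied scoring callback. The names `score_fn`, `toy_score`, and the descriptor names are hypothetical and are not taken from the BoostDILI code; in practice the callback might return the mean 5-fold cross-validated accuracy of a gradient boosting classifier trained on the given subset.

```python
from typing import Callable, Sequence


def sequential_forward_selection(
    candidates: Sequence[str],
    score_fn: Callable[[list[str]], float],
    max_features: int,
) -> list[str]:
    """Greedily add the descriptor that most improves score_fn.

    score_fn is a hypothetical callback that evaluates a descriptor subset,
    for example by cross-validated accuracy of a classifier.
    """
    selected: list[str] = []
    best_score = float("-inf")
    remaining = list(candidates)
    while remaining and len(selected) < max_features:
        # Score every one-descriptor extension of the current subset.
        scored = [(score_fn(selected + [name]), name) for name in remaining]
        top_score, top_name = max(scored)
        if top_score <= best_score:
            break  # no candidate improves the score, so stop early
        selected.append(top_name)
        remaining.remove(top_name)
        best_score = top_score
    return selected


# Toy scoring function (illustrative only): rewards two "useful"
# descriptors and mildly penalizes subset size.
USEFUL = {"logp": 0.10, "tpsa": 0.05}


def toy_score(subset: list[str]) -> float:
    return 0.5 + sum(USEFUL.get(name, 0.0) for name in subset) - 0.01 * len(subset)


chosen = sequential_forward_selection(
    ["mol_weight", "logp", "tpsa", "n_rings"], toy_score, max_features=3
)
print(chosen)
```

The early stop means the selected subset can be smaller than `max_features` when additional descriptors no longer help, which is one common way to keep a descriptor set compact.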
Affiliation(s)
- Hillul Chutia
  - CSIR-North East Institute of Science and Technology, Jorhat 785006, India
- Gori Sankar Borah
  - School of Computer Science, The Assam Kaziranga University, Jorhat 785006, India
- Hridoy Jyoti Mahanta
  - CSIR-North East Institute of Science and Technology, Jorhat 785006, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad 201002, India
- Selvaraman Nagamani
  - CSIR-North East Institute of Science and Technology, Jorhat 785006, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad 201002, India
2
Ha S, Bang D, Kim S. Fate-tox: fragment attention transformer for E(3)-equivariant multi-organ toxicity prediction. J Cheminform 2025; 17:74. [PMID: 40369624 PMCID: PMC12080013 DOI: 10.1186/s13321-025-01012-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2025] [Accepted: 04/11/2025] [Indexed: 05/16/2025] [Open Access]
Abstract
Toxicity is a critical hurdle in drug development, often causing the late-stage failure of promising compounds. Existing computational prediction models often focus on single-organ toxicity. However, avoiding toxicity in one organ, such as reducing gastrointestinal side effects, may inadvertently lead to toxicity in another organ, as in the case of rofecoxib, which was withdrawn due to increased cardiovascular risks. Thus, simultaneous prediction of multi-organ toxicity is a desirable but challenging task. The main challenges are (1) the variability of substructures that contribute to toxicity in different organs, (2) the insufficient power of molecular representations from diverse perspectives, and (3) the explainability of prediction results, especially in terms of substructures or potential toxicophores. To address these challenges, we developed FATE-Tox, a novel multi-view deep learning framework for multi-organ toxicity prediction. For the variability of substructures, we used three fragmentation methods (BRICS, Bemis-Murcko scaffolds, and RDKit functional groups) to formulate fragment-level graphs so that diverse substructures can be used to identify toxicity for different organs. For the insufficient power of molecular representations, we used molecular representations from both 2D and 3D perspectives. For explainability, our fragment attention transformer identifies potential 3D toxicophores using attention coefficients. Scientific contribution: Our framework achieved significant improvements in prediction performance, with up to 3.01% gains over prior baseline methods on toxicity benchmark datasets from MoleculeNet (BBBP, SIDER, ClinTox) and TDC (DILI, Skin Reaction, Carcinogens, and hERG), while the multi-task learning approach further enhanced performance by up to 1.44% compared to the single-task learning framework that had already surpassed these baselines. Additionally, attention visualizations that align with the literature contribute to greater transparency in predictive modeling. Our approach has the potential to provide scientists and clinicians with a more interpretable and clinically meaningful tool to assess systemic toxicity, ultimately supporting safer and more informed drug development processes.
Affiliation(s)
- Sumin Ha
  - Interdisciplinary Program in Artificial Intelligence, Seoul National University, Seoul, 08826, Republic of Korea
- Dongmin Bang
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, 08826, Republic of Korea
  - AIGENDRUG Co., Ltd., Seoul, 08758, Republic of Korea
- Sun Kim
  - Interdisciplinary Program in Artificial Intelligence, Seoul National University, Seoul, 08826, Republic of Korea
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, 08826, Republic of Korea
  - AIGENDRUG Co., Ltd., Seoul, 08758, Republic of Korea
  - Department of Computer Science and Engineering, Seoul National University, Seoul, 08826, Republic of Korea
3
Bai C, Wu L, Li R, Cao Y, He S, Bo X. Machine Learning-Enabled Drug-Induced Toxicity Prediction. Adv Sci (Weinh) 2025; 12:e2413405. [PMID: 39899688 PMCID: PMC12021114 DOI: 10.1002/advs.202413405] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2024] [Revised: 12/25/2024] [Indexed: 02/05/2025]
Abstract
Unexpected toxicity has become a significant obstacle to drug candidate development, accounting for 30% of drug discovery failures. Traditional toxicity assessment through animal testing is costly and time-consuming. Big data and artificial intelligence (AI), especially machine learning (ML), are contributing substantially to innovation and progress in toxicology research. However, the optimal AI model usually varies with the type of toxicity, making it essential to conduct comparative analyses of AI methods across toxicity domains. The diverse data sources also pose challenges for researchers focusing on specific toxicity studies. In this review, 10 categories of drug-induced toxicity are examined, summarizing their characteristics and applicable ML models, including both predictive and interpretable algorithms, striking a balance between breadth and depth. Key databases and tools used in toxicity prediction are also highlighted, including toxicology, chemical, multi-omics, and benchmark databases, organized by their focus and function to clarify their roles in drug-induced toxicity prediction. Finally, strategies to turn challenges into opportunities are analyzed and discussed. This review may provide researchers with a valuable reference for understanding and utilizing the available resources to bridge prediction and mechanistic insights, and further advance the application of ML in drug-induced toxicity prediction.
Affiliation(s)
- Changsen Bai
  - Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin 300072, China
  - Department of Advanced & Interdisciplinary Biotechnology, Academy of Military Medical Sciences, Beijing 100850, China
  - Tianjin Medical University Cancer Institute and Hospital, Tianjin 300060, China
- Lianlian Wu
  - Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin 300072, China
  - Department of Advanced & Interdisciplinary Biotechnology, Academy of Military Medical Sciences, Beijing 100850, China
- Ruijiang Li
  - Department of Advanced & Interdisciplinary Biotechnology, Academy of Military Medical Sciences, Beijing 100850, China
- Yang Cao
  - Department of Environmental Medicine, Academy of Military Medical Sciences, Tianjin 300050, China
- Song He
  - Department of Advanced & Interdisciplinary Biotechnology, Academy of Military Medical Sciences, Beijing 100850, China
- Xiaochen Bo
  - Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin 300072, China
  - Department of Advanced & Interdisciplinary Biotechnology, Academy of Military Medical Sciences, Beijing 100850, China
4
Cho C, Lee S, Bang D, Piao Y, Kim S. ChemAP: predicting drug approval with chemical structures before clinical trial phase by leveraging multi-modal embedding space and knowledge distillation. Sci Rep 2024; 14:23010. [PMID: 39362916 PMCID: PMC11449903 DOI: 10.1038/s41598-024-72868-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2024] [Accepted: 09/11/2024] [Indexed: 10/05/2024] [Open Access]
Abstract
Recent studies have shown that the likelihood of drug approval can be predicted from clinical data and drug structure information using computational approaches. Predicting the likelihood of drug approval can be innovative and of high impact. However, models that leverage clinical data are applicable only in clinical stages, which is not very practical. Prioritizing drug candidates and early-stage decision-making in the de novo drug development process is crucial in pharmaceutical research to optimize resource allocation. For early-stage decision-making, we need a computational model that uses only chemical structures. This seemingly impossible task may draw on the predictive power of multi-modal features, including clinical data. In this work, we introduce ChemAP (Chemical structure-based drug Approval Predictor), a novel deep learning scheme for drug approval prediction in the early-stage drug discovery phase. ChemAP aims to enhance the possibility of early-stage decision-making by enriching semantic knowledge to fill the gap between multi-modal and single-modal chemical spaces through knowledge distillation techniques. This approach facilitates the effective construction of a chemical space solely from chemical structure data, guided by multi-modal knowledge related to efficacy, such as clinical trials and patents of drugs. In this study, ChemAP achieved state-of-the-art performance, outperforming both traditional machine learning and deep learning models in drug approval prediction, with AUROC and AUPRC scores of 0.782 and 0.842, respectively, on the drug approval benchmark dataset. Additionally, we demonstrated its generalizability by outperforming baseline models on a recent external dataset, which included drugs from the 2023 FDA-approved list and the 2024 clinical trial failure drug list, achieving AUROC and AUPRC scores of 0.694 and 0.851. These results demonstrate that ChemAP is an effective method for predicting drug approval using only chemical structure information, so that decisions can be made at the early stages of the drug development process. To the best of our knowledge, our work is the first of its kind to show that prediction of drug approval is possible with only the structural information of drugs, by defining the chemical space of approved and unapproved drugs using deep learning technology.
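For readers unfamiliar with the two metrics reported above, the following is a minimal pure-Python sketch of how AUROC and AUPRC are commonly computed from binary labels and predicted scores. It is not taken from the ChemAP code: AUPRC is approximated here by average precision, which is one common estimator and an assumption on our part, and the toy labels and scores are invented for illustration.

```python
def auroc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of precision@k over the ranks k at which a positive appears.

    Tied scores are ranked in arbitrary (stable-sort) order in this sketch.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive example")
    return total / hits


# Invented toy data: 1 = approved, 0 = not approved.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(toy_labels, toy_scores))
print(average_precision(toy_labels, toy_scores))
```

AUROC is insensitive to class balance, whereas average precision depends on the fraction of positives, which is one reason the two scores in an abstract like this one need not move together across datasets.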
Affiliation(s)
- Changyun Cho
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, 08826, Republic of Korea
  - AIGENDRUG Co., Ltd, Seoul, Republic of Korea
- Sangseon Lee
  - Institute of Computer Technology, Seoul National University, Seoul, 08826, Republic of Korea
  - Department of Artificial Intelligence, Inha University, Incheon, 22212, Republic of Korea
- Dongmin Bang
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, 08826, Republic of Korea
  - AIGENDRUG Co., Ltd, Seoul, Republic of Korea
- Yinhua Piao
  - Department of Computer Science and Engineering, Seoul National University, Seoul, 08826, Republic of Korea
- Sun Kim
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, 08826, Republic of Korea
  - Department of Computer Science and Engineering, Seoul National University, Seoul, 08826, Republic of Korea
  - AIGENDRUG Co., Ltd, Seoul, Republic of Korea
  - Interdisciplinary Program in Artificial Intelligence, Seoul National University, Seoul, 08826, Republic of Korea
5
Shkil DO, Muhamedzhanova AA, Petrov PI, Skorb EV, Aliev TA, Steshin IS, Tumanov AV, Kislinskiy AS, Fedorov MV. Expanding Predictive Capacities in Toxicology: Insights from Hackathon-Enhanced Data and Model Aggregation. Molecules 2024; 29:1826. [PMID: 38675645 PMCID: PMC11055041 DOI: 10.3390/molecules29081826] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2024] [Revised: 04/11/2024] [Accepted: 04/15/2024] [Indexed: 04/28/2024] [Open Access]
Abstract
In the realm of predictive toxicology for small molecules, the applicability domain of QSAR models is often limited by the coverage of the chemical space in the training set. Consequently, classical models fail to provide reliable predictions for wide classes of molecules. However, the emergence of innovative data collection methods such as intensive hackathons holds promise for quickly expanding the chemical space available for model construction. Combined with algorithmic refinement methods, these tools can address the challenges of toxicity prediction, enhancing both the robustness and applicability of the corresponding models. This study aimed to investigate the roles of gradient boosting and strategic data aggregation in enhancing the predictive ability of models for the toxicity of small organic molecules. We focused on evaluating the impact of incorporating fragment features and expanding the chemical space, facilitated by a comprehensive dataset procured in an open hackathon. We used gradient boosting techniques, accounting for critical features such as the structural fragments or functional groups often associated with manifestations of toxicity.
Affiliation(s)
- Dmitrii O. Shkil
  - Syntelly LLC, Moscow 121205, Russia (A.A.M., I.S.S., A.V.T., A.S.K.)
  - Moscow Institute of Physics and Technology, Moscow 141700, Russia
- Ekaterina V. Skorb
  - Infochemistry Scientific Center, ITMO University, Saint-Petersburg 191002, Russia (E.V.S., T.A.A.)
- Timur A. Aliev
  - Infochemistry Scientific Center, ITMO University, Saint-Petersburg 191002, Russia (E.V.S., T.A.A.)
- Ilya S. Steshin
  - Syntelly LLC, Moscow 121205, Russia (A.A.M., I.S.S., A.V.T., A.S.K.)
- Maxim V. Fedorov
  - Kharkevich Institute for Information Transmission Problems of Russian Academy of Sciences, Moscow 127994, Russia
6
Mostafa F, Chen M. Computational models for predicting liver toxicity in the deep learning era. Front Toxicol 2024; 5:1340860. [PMID: 38312894 PMCID: PMC10834666 DOI: 10.3389/ftox.2023.1340860] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2023] [Accepted: 12/22/2023] [Indexed: 02/06/2024] [Open Access]
Abstract
Drug-induced liver injury (DILI) is a severe adverse reaction caused by drugs and may result in acute liver failure and even death. Many efforts have centered on mitigating risks associated with potential DILI in humans. Among these, quantitative structure-activity relationship (QSAR) modeling has proven to be a valuable tool for early-stage hepatotoxicity screening. Its advantages include no requirement for physical substances and rapid delivery of results. Deep learning (DL) has advanced rapidly in recent years and has been used for developing QSAR models. This review discusses the use of DL in predicting DILI, focusing on the development of QSAR models employing extensive chemical structure datasets alongside their corresponding DILI outcomes. We undertake a comprehensive evaluation of various DL methods, comparing them with traditional machine learning (ML) approaches, and explore the strengths and limitations of DL techniques regarding their interpretability, scalability, and generalization. Overall, our review underscores the potential of DL methodologies to enhance DILI prediction and provides insights into future avenues for developing predictive models to mitigate DILI risk in humans.
Affiliation(s)
- Fahad Mostafa
  - Department of Mathematics and Statistics, Texas Tech University, Lubbock, TX, United States
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, United States
- Minjun Chen
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, United States
7
Wu W, Qian J, Liang C, Yang J, Ge G, Zhou Q, Guan X. GeoDILI: A Robust and Interpretable Model for Drug-Induced Liver Injury Prediction Using Graph Neural Network-Based Molecular Geometric Representation. Chem Res Toxicol 2023; 36:1717-1730. [PMID: 37839069 DOI: 10.1021/acs.chemrestox.3c00199] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 10/17/2023]
Abstract
Drug-induced liver injury (DILI) is a significant cause of drug failure and withdrawal due to liver damage. Accurate prediction of hepatotoxic compounds is crucial for safe drug development. Several DILI prediction models have been published, but they are built on different data sets, making it difficult to compare model performance. Moreover, most existing models are based on molecular fingerprints or descriptors, neglecting molecular geometric properties and lacking interpretability. To address these limitations, we developed GeoDILI, an interpretable graph neural network that uses a molecular geometric representation. First, we utilized a geometry-based pretrained molecular representation and optimized it on the DILI data set to improve predictive performance. Second, we leveraged gradient information to obtain high-precision atomic-level weights and deduce the dominant substructure. We benchmarked GeoDILI against recently published DILI prediction models, as well as popular GNN models and fingerprint-based machine learning models, using the same data set, and it showed superior predictive performance. We applied the interpretability method to the DILI data set and derived seven precise and mechanistically elucidated structural alerts. Overall, GeoDILI provides a promising approach for accurate and interpretable DILI prediction with potential applications in drug discovery and safety assessment. The data and source code are available in the GitHub repository (https://github.com/CSU-QJY/GeoDILI).
Affiliation(s)
- Wenxuan Wu
  - Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai 201203, China
- Jiayu Qian
  - School of Mathematics and Statistics, Central South University, Changsha, Hunan 410083, China
- Changjie Liang
  - Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai 201203, China
- Jingya Yang
  - School of Mathematics and Statistics, Central South University, Changsha, Hunan 410083, China
- Guangbo Ge
  - Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai 201203, China
- Qingping Zhou
  - School of Mathematics and Statistics, Central South University, Changsha, Hunan 410083, China
- Xiaoqing Guan
  - Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai 201203, China
8
Han R, Yoon H, Kim G, Lee H, Lee Y. Revolutionizing Medicinal Chemistry: The Application of Artificial Intelligence (AI) in Early Drug Discovery. Pharmaceuticals (Basel) 2023; 16:1259. [PMID: 37765069 PMCID: PMC10537003 DOI: 10.3390/ph16091259] [Citation(s) in RCA: 26] [Impact Index Per Article: 13.0] [Received: 07/27/2023] [Revised: 08/24/2023] [Accepted: 09/04/2023] [Indexed: 09/29/2023] [Open Access]
Abstract
Artificial intelligence (AI) has permeated various sectors, including the pharmaceutical industry and research, where it has been utilized to efficiently identify new chemical entities with desirable properties. The application of AI algorithms to drug discovery presents both remarkable opportunities and challenges. This review article focuses on the transformative role of AI in medicinal chemistry. We delve into the applications of machine learning and deep learning techniques in drug screening and design, discussing their potential to expedite the early drug discovery process. In particular, we provide a comprehensive overview of the use of AI algorithms in predicting protein structures, drug-target interactions, and molecular properties such as drug toxicity. While AI has accelerated the drug discovery process, data quality issues and technological constraints remain challenges. Nonetheless, new relationships and methods have been unveiled, demonstrating AI's expanding potential in predicting and understanding drug interactions and properties. For its full potential to be realized, interdisciplinary collaboration is essential. This review underscores AI's growing influence on the future trajectory of medicinal chemistry and stresses the importance of ongoing synergies between computational and domain experts.
Affiliation(s)
- Yoonji Lee
  - College of Pharmacy, Chung-Ang University, Seoul 06974, Republic of Korea
9
Li X, Ni J, Chen L. Advances in the study of acetaminophen-induced liver injury. Front Pharmacol 2023; 14:1239395. [PMID: 37601069 PMCID: PMC10436315 DOI: 10.3389/fphar.2023.1239395] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/13/2023] [Accepted: 07/28/2023] [Indexed: 08/22/2023] [Open Access]
Abstract
Acetaminophen (APAP) overdose is a significant cause of drug-induced liver injury and acute liver failure. The diagnosis, screening, and management of APAP-induced liver injury (AILI) are challenging because of the complex mechanisms involved. Starting from current studies on the mechanisms of AILI, this review focuses on novel findings in the diagnosis, screening, and management of AILI and highlights the issues that still need to be addressed. It aims to summarize recent research progress and make recommendations for future research.
Affiliation(s)
- Xinghui Li
  - West China School of Pharmacy, Sichuan University, Chengdu, China
- Jiaqi Ni
  - West China School of Pharmacy, Sichuan University, Chengdu, China
  - Department of Pharmacy, Evidence-Based Pharmacy Center, West China Second University Hospital, Sichuan University, Chengdu, China
  - Key Laboratory of Birth Defects and Related Diseases of Women and Children, Sichuan University, Ministry of Education, Chengdu, China
- Li Chen
  - Department of Pharmacy, Evidence-Based Pharmacy Center, West China Second University Hospital, Sichuan University, Chengdu, China
  - Key Laboratory of Birth Defects and Related Diseases of Women and Children, Sichuan University, Ministry of Education, Chengdu, China