1
Khatun S, Dasgupta I, Islam R, Amin SA, Jha T, Dhaked DK, Gayen S. Unveiling critical structural features for effective HDAC8 inhibition: a comprehensive study using quantitative read-across structure-activity relationship (q-RASAR) and pharmacophore modeling. Mol Divers 2024:10.1007/s11030-024-10903-y. [PMID: 38871969 DOI: 10.1007/s11030-024-10903-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2024] [Accepted: 05/20/2024] [Indexed: 06/15/2024]
Abstract
Histone deacetylases constitute a group of enzymes that participate in several biological processes. Notably, inhibiting HDAC8 has become a therapeutic strategy for various diseases. The current inhibitors for HDAC8 lack selectivity and target multiple HDACs. Consequently, there is a growing recognition of the need for selective HDAC8 inhibitors to enhance the effectiveness of therapeutic interventions. In our current study, we have utilized a multi-faceted approach, including Quantitative Structure-Activity Relationship (QSAR) combined with Quantitative Read-Across Structure-Activity Relationship (q-RASAR) modeling, pharmacophore mapping, molecular docking, and molecular dynamics (MD) simulations. The developed q-RASAR model has a high statistical significance and predictive ability (Q2F1:0.778, Q2F2:0.775). The contributions of important descriptors are discussed in detail to gain insight into the crucial structural features in HDAC8 inhibition. The best pharmacophore hypothesis exhibits a high regression coefficient (0.969) and a low root mean square deviation (0.944), highlighting the importance of correctly orienting hydrogen bond acceptor (HBA), ring aromatic (RA), and zinc-binding group (ZBG) features in designing potent HDAC8 inhibitors. To confirm the results of q-RASAR and pharmacophore mapping, molecular docking analysis of the five potent compounds (44, 54, 82, 102, and 118) was performed to gain further insights into these structural features crucial for interaction with the HDAC8 enzyme. Lastly, MD simulation studies of the most active compound (54, mapped correctly with the pharmacophore hypothesis) and the least active compound (34, mapped poorly with the pharmacophore hypothesis) were carried out to validate the observations of the studies above. 
This study not only refines our understanding of essential structural features for HDAC8 inhibition but also provides a robust framework for the rational design of novel selective HDAC8 inhibitors which may offer insights to medicinal chemists and researchers engaged in the development of HDAC8-targeted therapeutics.
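The external validation metrics quoted in the abstract (Q2F1 and Q2F2) can be illustrated with a minimal sketch. It assumes the commonly used definitions, in which the prediction error on the test set is scaled by the spread of the test responses around the training-set mean (Q2F1) or around the test-set mean (Q2F2). The function names and the toy numbers are hypothetical and are not taken from the paper.

```python
from statistics import mean


def q2_f1(y_test, y_pred, y_train):
    """Q2F1: test-set error relative to deviation from the TRAINING-set mean."""
    train_mean = mean(y_train)
    press = sum((y - p) ** 2 for y, p in zip(y_test, y_pred))
    spread = sum((y - train_mean) ** 2 for y in y_test)
    return 1.0 - press / spread


def q2_f2(y_test, y_pred):
    """Q2F2: test-set error relative to deviation from the TEST-set mean."""
    test_mean = mean(y_test)
    press = sum((y - p) ** 2 for y, p in zip(y_test, y_pred))
    spread = sum((y - test_mean) ** 2 for y in y_test)
    return 1.0 - press / spread


# Toy values for illustration only (not data from the study).
y_train = [1.0, 2.0, 3.0]
y_test = [2.0, 4.0]
y_pred = [2.5, 3.5]
print(q2_f1(y_test, y_pred, y_train), q2_f2(y_test, y_pred))
```

The two metrics differ only in the reference mean, which is why they are usually reported together.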
Affiliation(s)
- Samima Khatun
- Laboratory of Drug Design and Discovery, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, 700032, India
- Indrasis Dasgupta
- Laboratory of Drug Design and Discovery, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, 700032, India
- Rakibul Islam
- Department of Pharmacoinformatics, National Institute of Pharmaceutical Education and Research (NIPER), Kolkata, West Bengal, 700054, India
- Sk Abdul Amin
- Department of Pharmaceutical Technology, JIS University, 81, Nilgunj Road, Agarpara, Kolkata, West Bengal, India
- Tarun Jha
- Natural Science Laboratory, Division of Medicinal and Pharmaceutical Chemistry, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, 700032, India
- Devendra Kumar Dhaked
- Department of Pharmacoinformatics, National Institute of Pharmaceutical Education and Research (NIPER), Kolkata, West Bengal, 700054, India
- Shovanlal Gayen
- Laboratory of Drug Design and Discovery, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, 700032, India.
2
Srisongkram T. DeepRA: A novel deep learning-read-across framework and its application in non-sugar sweeteners mutagenicity prediction. Comput Biol Med 2024; 178:108731. [PMID: 38870727 DOI: 10.1016/j.compbiomed.2024.108731] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2024] [Revised: 05/07/2024] [Accepted: 06/08/2024] [Indexed: 06/15/2024]
Abstract
Non-sugar sweeteners (NSSs), or artificial sweeteners, have been used as food chemicals since World War II. NSSs, however, also raise concerns about their mutagenicity. Evaluating the mutagenic potential of NSSs is crucial for food safety; this step is needed for every new chemical registration in the food and pharmaceutical industries. A computational assessment requires less time, less money, and fewer animals than in vivo experiments; thus, this study developed a novel computational method, called DeepRA, that combines an ensemble convolutional deep neural network with read-across algorithms to classify the mutagenicity of chemicals. The mutagenicity data were obtained from the curated Ames test data set. The DeepRA model was developed using both molecular descriptors and molecular fingerprints. The resulting DeepRA model provided accurate and reliable mutagenicity classification on an independent test set. This model was then used to examine NSS-related chemicals, enabling the evaluation of the mutagenicity of NSS-like substances. Finally, the model is publicly available at https://github.com/taraponglab/deepra for further use in chemical regulation and risk assessment.
Affiliation(s)
- Tarapong Srisongkram
- Division of Pharmaceutical Chemistry, Faculty of Pharmaceutical Sciences, Khon Kaen University, 40002, Thailand.
3
Ghosh S, Roy K. Quantitative read-across structure-activity relationship (q-RASAR): A novel approach to estimate the subchronic oral safety (NOAEL) of diverse organic chemicals in rats. Toxicology 2024; 505:153824. [PMID: 38705560 DOI: 10.1016/j.tox.2024.153824] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2024] [Revised: 04/28/2024] [Accepted: 04/29/2024] [Indexed: 05/07/2024]
Abstract
We have developed a quantitative safety prediction model for subchronic repeated doses of diverse organic chemicals on rats using the novel quantitative read-across structure-activity relationship (q-RASAR) approach, which uses similarity-based descriptors for predictive model generation. The experimental -Log (NOAEL) values have been used here as a potential indicator of oral subchronic safety on rats as it determines the maximum dose level for which no observed adverse effects of chemicals are found. A total of 186 data points of diverse organic chemicals have been used for the model generation using structural and physicochemical (0D-2D) descriptors. The read-across-derived similarity, error, and concordance measures (RASAR descriptors) have been extracted from the preliminary 0D-2D descriptors. Then, the combined pool of RASAR and the identified 0D-2D descriptors of the training set were employed to develop the final models by using the partial least squares (PLS) algorithm. The developed PLS model was rigorously validated by various internal and external validation metrics as suggested by the Organization for Economic Co-operation and Development (OECD). The final q-RASAR model is proven to be statistically sound, robust and externally predictive (R2 = 0.85, Q2LOO = 0.82 and Q2F1 = 0.94), superseding the internal as well as external predictivity of the corresponding quantitative structure-activity relationship (QSAR) model as well as previously reported subchronic repeated dose toxicity model found in the literature. In a nutshell, the q-RASAR is an effective approach that has the potential to be used as a good alternative way to improve external predictivity, interpretability, and transferability for subchronic oral safety prediction as well as ecotoxicity risk identification.
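The idea of read-across-derived similarity and error measures can be sketched as follows. This is a simplified illustration, not the authors' tool: it assumes a Gaussian-kernel similarity on descriptor vectors, selects the k most similar training compounds, and returns a similarity-weighted average response, a weighted standard deviation, and the maximum similarity. The function names, the `sigma` and `k` parameters, and the toy data are all hypothetical.

```python
import math


def gaussian_similarity(a, b, sigma=1.0):
    """Gaussian-kernel similarity between two descriptor vectors (assumed kernel)."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def rasar_descriptors(query, train_X, train_y, k=3, sigma=1.0):
    """Similarity-based descriptors for one query compound from its k closest neighbours."""
    scored = sorted(
        ((gaussian_similarity(query, x, sigma), y) for x, y in zip(train_X, train_y)),
        reverse=True,
    )[:k]
    total = sum(s for s, _ in scored)
    # Similarity-weighted average of the neighbours' responses (read-across estimate).
    weighted_avg = sum(s * y for s, y in scored) / total
    # Similarity-weighted standard deviation of the neighbours' responses.
    weighted_sd = math.sqrt(sum(s * (y - weighted_avg) ** 2 for s, y in scored) / total)
    return {
        "ra_prediction": weighted_avg,
        "weighted_sd": weighted_sd,
        "max_similarity": scored[0][0],
    }


# Toy descriptor matrix and responses for illustration only.
train_X = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]]
train_y = [1.0, 2.0, 9.0]
print(rasar_descriptors([0.0, 0.0], train_X, train_y, k=2))
```

In a q-RASAR workflow, values of this kind would be appended to the selected 0D-2D descriptors before fitting the final regression model.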
Affiliation(s)
- Shilpayan Ghosh
- Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700032, India
- Kunal Roy
- Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700032, India.
4
Zhou Y, Wang Z, Huang Z, Li W, Chen Y, Yu X, Tang Y, Liu G. In silico prediction of ocular toxicity of compounds using explainable machine learning and deep learning approaches. J Appl Toxicol 2024; 44:892-907. [PMID: 38329145 DOI: 10.1002/jat.4586] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Revised: 01/16/2024] [Accepted: 01/16/2024] [Indexed: 02/09/2024]
Abstract
The accurate identification of chemicals with ocular toxicity is of paramount importance in health hazard assessment. In contemporary chemical toxicology, there is a growing emphasis on refining, reducing, and replacing animal testing in safety evaluations. Therefore, the development of robust computational tools is crucial for regulatory applications. The performance of predictive models is heavily reliant on the quality and quantity of data. In this investigation, we amalgamated the most extensive dataset (4901 compounds) sourced from governmental GHS-compliant databases and literature to develop binary classification models of chemical ocular toxicity. We employed 12 molecular representations in conjunction with six machine learning algorithms and two deep learning algorithms to create a series of binary classification models. The findings indicated that the deep learning method GCN outperformed the machine learning models in cross-validation, achieving an impressive AUC of 0.915. However, the top-performing machine learning model (RF-Descriptor) demonstrated excellent performance with an AUC of 0.869 on the test set and was therefore selected as the best model. To enhance model interpretability, we conducted the SHAP method and attention weights analysis. The two approaches offered visual depictions of the relevance of key descriptors and substructures in predicting ocular toxicity of chemicals. Thus, we successfully struck a delicate balance between data quality and model interpretability, rendering our model valuable for predicting and comprehending potential ocular-toxic compounds in the early stages of drug discovery.
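The AUC values reported for the binary classifiers can be read as the probability that a randomly chosen toxic compound receives a higher score than a randomly chosen non-toxic one. A minimal sketch of that pairwise definition is given below; the function name and the toy labels and scores are hypothetical, and ties are counted as half a win.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise-ranking definition."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


# Toy predictions for illustration only.
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

This quadratic-time form is meant for clarity; library implementations use sorting to reach the same value more efficiently.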
Affiliation(s)
- Yiqing Zhou
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
- Ze Wang
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
- Zejun Huang
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
- Weihua Li
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
- Yuanting Chen
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
- Xinxin Yu
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
- Yun Tang
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
- Guixia Liu
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
5
Kumar V, Banerjee A, Roy K. Breaking the Barriers: Machine-Learning-Based c-RASAR Approach for Accurate Blood-Brain Barrier Permeability Prediction. J Chem Inf Model 2024; 64:4298-4309. [PMID: 38700741 DOI: 10.1021/acs.jcim.4c00433] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 05/28/2024]
Abstract
The intricate nature of the blood-brain barrier (BBB) poses a significant challenge in predicting drug permeability, which is crucial for assessing central nervous system (CNS) drug efficacy and safety. This research utilizes an innovative approach, the classification read-across structure-activity relationship (c-RASAR) framework, that leverages machine learning (ML) to enhance the accuracy of BBB permeability predictions. The c-RASAR framework seamlessly integrates principles from both read-across and QSAR methodologies, underscoring the need to consider similarity-related aspects during the development of the c-RASAR model. It is crucial to note that the primary goal of this research is not to introduce yet another model for predicting BBB permeability but rather to showcase the refinement in predicting the BBB permeability of organic compounds through the introduction of a c-RASAR approach. This groundbreaking methodology aims to elevate the accuracy of assessing neuropharmacological implications and streamline the process of drug development. In this study, an ML-based c-RASAR linear discriminant analysis (LDA) model was developed using a dataset of 7807 compounds, encompassing both BBB-permeable and -nonpermeable substances sourced from the B3DB database (freely accessible from https://github.com/theochem/B3DB), for predicting BBB permeability in lead discovery for CNS drugs. The model's predictive capability was then validated using three external sets: one containing 276,518 natural products (NPs) from the LOTUS database (accessible from https://lotus.naturalproducts.net/download) for data gap filling, another comprising 13,002 drug-like/drug compounds from the DrugBank database (available from https://go.drugbank.com/), and a third set of 56 FDA-approved drugs to assess the model's reliability. Further diversifying the predictive arsenal, various other ML-based c-RASAR models were also developed for comparison purposes. 
The proposed c-RASAR framework emerged as a powerful tool for predicting BBB permeability. This research not only advances the understanding of molecular determinants influencing CNS drug permeability but also provides a versatile computational platform for the rapid assessment of diverse compounds, facilitating informed decision-making in drug development and design.
Affiliation(s)
- Vinay Kumar
- Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700032, India
- Arkaprava Banerjee
- Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700032, India
- Kunal Roy
- Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700032, India
6
Banerjee A, Roy K. ARKA: a framework of dimensionality reduction for machine-learning classification modeling, risk assessment, and data gap-filling of sparse environmental toxicity data. Environ Sci Process Impacts 2024. [PMID: 38743054 DOI: 10.1039/d4em00173g] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
Due to the lack of experimental toxicity data for environmental chemicals, there arises a need to fill data gaps by in silico approaches. One of the most commonly used in silico approaches for toxicity assessment of small datasets is the Quantitative Structure-Activity Relationship (QSAR), which generates predictive models for the efficient prediction of query compounds. However, the reliability of the predictions from QSARs derived from small datasets is often questionable from a statistical point of view. This is due to the presence of a larger number of descriptors as compared to the number of training compounds, which reduces the degree of freedom of the developed model. To reduce the overall prediction error for a particular QSAR model, we have proposed here the computation of the novel Arithmetic Residuals in K-groups Analysis (ARKA) descriptors. We have reduced the number of modeling descriptors in a supervised manner by partitioning them into K classes (K = 2 here) depending on the higher mean normalized values of the descriptors to a particular response class, thus preventing the loss of chemical information. A scatter plot of the data points using the values of two ARKA descriptors (ARKA_2 vs. ARKA_1) can potentially identify activity cliffs, less confident data points, and less modelable data points. We have used here five representative environmentally relevant endpoints (skin sensitization, earthworm toxicity, milk/plasma partitioning, algal toxicity, and rodent carcinogenicity of hazardous chemicals) with graded responses to which the ARKA framework was applied for classification modeling. 
On comparing the performance of the models generated using conventional QSAR descriptors and the ARKA descriptors, the prediction quality of the models derived from ARKA descriptors was found, based on multiple graded-data validation metrics-derived decision criteria, much better than the models derived from QSAR descriptors signifying the potential of ARKA descriptors in ecotoxicological classification modeling of small data sets. Additionally, this holds true for the Read-Across approach as well, since the Read-Across predictions using ARKA descriptors supersede the predictions generated from QSAR descriptors. For the ease of users, a Java-based expert system has been developed that computes the ARKA descriptors from the input of QSAR descriptors.
Affiliation(s)
- Arkaprava Banerjee
- Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700 032, India.
- Kunal Roy
- Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700 032, India.
7
Pore S, Banerjee A, Roy K. Application of machine learning-based read-across structure-property relationship (RASPR) as a new tool for predictive modelling: Prediction of power conversion efficiency (PCE) for selected classes of organic dyes in dye-sensitized solar cells (DSSCs). Mol Inform 2024; 43:e202300210. [PMID: 38374528 DOI: 10.1002/minf.202300210] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Revised: 12/31/2023] [Accepted: 02/04/2024] [Indexed: 02/21/2024]
Abstract
The application of various in silico approaches for the prediction of material properties has been an effective alternative to experimental methods. Recently, the concepts of quantitative structure-property relationship (QSPR) and read-across (RA) methods were merged to develop an emerging chemoinformatic tool: read-across structure-property relationship (RASPR). The RASPR method is applicable to both large and small datasets as it uses various similarity and error-based measures. It has also been observed that RASPR models tend to have an increased external predictivity compared to the corresponding QSPR models. In this study, we have modeled the power conversion efficiency (PCE) of organic dyes used in dye-sensitized solar cells (DSSCs) by using the quantitative RASPR (q-RASPR) method. We have used relatively large classes of organic dyes, namely phenothiazines (n=207), porphyrins (n=281), and triphenylamines (n=229), for modelling. We have divided each of the datasets into training and test sets in 3 different combinations, and with the training sets we have developed three different QSPR models with structural and physicochemical descriptors and validated them with the corresponding test sets. The descriptors from these models were used to calculate the RASPR descriptors using the Java-based tool RASAR Descriptor Calculator v2.0 (https://sites.google.com/jadavpuruniversity.in/dtc-lab-software/home), and then data fusion was performed by pooling the previously selected structural and physicochemical descriptors with the calculated RASPR descriptors. A further feature selection algorithm was then employed to develop the final RASPR PLS models.
Here, we also developed different machine learning (ML) models with the descriptors selected in the QSPR PLS and RASPR PLS models, and found that models with RASPR descriptors surpassed models with only structural and physicochemical descriptors in external predictivity: RMSEP decreased for phenothiazines from 1.16-1.25 to 1.07-1.18, for porphyrins from 1.60-1.79 to 1.45-1.53, and for triphenylamines from 1.27-1.54 to 1.20-1.47.
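The RMSEP figures compared above are root mean square errors of prediction on the test set. A minimal sketch of the calculation follows; the function name and the toy values are hypothetical and do not reproduce any result from the paper.

```python
import math


def rmsep(y_true, y_pred):
    """Root mean square error of prediction over a held-out test set."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Toy PCE values for illustration only.
print(rmsep([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

Because RMSEP is expressed in the units of the response, a drop such as the ones quoted corresponds directly to a smaller typical error in predicted PCE.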
Affiliation(s)
- Souvik Pore
- Drug Theoretics and Chemoinformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, 188 Raja S C Mullick Road, 700032, Kolkata, India
- Arkaprava Banerjee
- Drug Theoretics and Chemoinformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, 188 Raja S C Mullick Road, 700032, Kolkata, India
- Kunal Roy
- Drug Theoretics and Chemoinformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, 188 Raja S C Mullick Road, 700032, Kolkata, India
8
Wu X, Gong J, Ren S, Tan F, Wang Y, Zhao H. A machine learning-based QSAR model reveals important molecular features for understanding the potential inhibition mechanism of ionic liquids to acetylcholinesterase. Sci Total Environ 2024; 915:169974. [PMID: 38199350 DOI: 10.1016/j.scitotenv.2024.169974] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Revised: 01/02/2024] [Accepted: 01/04/2024] [Indexed: 01/12/2024]
Abstract
The broad application of ionic liquids (ILs) has been hindered by uncertainties surrounding their ecotoxicity. In this work, a Quantitative Structure-Activity Relationship (QSAR) model was devised to predict the inhibition of acetylcholinesterase (AChE) activity by ILs, employing both Random Forest (RF) and eXtreme Gradient Boosting (XGBoost) machine learning approaches. Fourteen kinds of essential molecular descriptors were screened from an initial roster of 244 descriptors through the application of a feature importance index, and they showed a significant impact on AChE activity. The two models based solely on the 14 most critical molecular descriptors maintained their robustness and reliability. The correlation analysis between these 14 descriptors and the inhibition of AChE activity revealed the potential impact of the molecular characteristics on IL toxicity. The results underscored the main influence of the cations in ILs on the inhibitory activity towards the AChE enzyme. Specifically, hydrophobic cations were found to exert more potent inhibitory effects on the AChE enzyme. In addition, some other properties of the cations, such as the degree of branching, atomic weight, and partial charge, also modulated their inhibition potential. This study enhances the comprehension of the structure-activity relationship between ILs and AChE inhibition, providing a reference for designing safer and greener ILs.
Affiliation(s)
- Xuri Wu
- Key Laboratory of Industrial Ecology and Environmental Engineering (MOE), School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
- Jixiang Gong
- Key Laboratory of Industrial Ecology and Environmental Engineering (MOE), School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
- Suyu Ren
- School of Environmental and Material Engineering, Yantai University, Yantai 264005, China
- Feng Tan
- Key Laboratory of Industrial Ecology and Environmental Engineering (MOE), School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China.
- Yan Wang
- Key Laboratory of Industrial Ecology and Environmental Engineering (MOE), School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
- Hongxia Zhao
- Key Laboratory of Industrial Ecology and Environmental Engineering (MOE), School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
9
Huang Z, Yu J, He W, Yu J, Deng S, Yang C, Zhu W, Shao X. AI-enhanced chemical paradigm: From molecular graphs to accurate prediction and mechanism. J Hazard Mater 2024; 465:133355. [PMID: 38198864 DOI: 10.1016/j.jhazmat.2023.133355] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2023] [Revised: 12/19/2023] [Accepted: 12/21/2023] [Indexed: 01/12/2024]
Abstract
The development of accurate and interpretable models for predicting reaction constants of organic compounds with hydroxyl radicals is vital for advancing quantitative structure-activity relationships (QSAR) in pollutant degradation. Methods like molecular descriptors, molecular fingerprinting, and group contribution methods have limitations, as traditional machine learning struggles to capture all intramolecular information simultaneously. To address this, we established an integrated graph neural network (GNN) with approximately 12 million learnable parameters. The GNN represents atoms as nodes and chemical bonds as edges, thus transforming molecules into graph structures, effectively capturing microscopic properties while depicting atom connectivity in non-Euclidean space. Using a dataset of 1401 pollutants, we developed an integrated GNN model with Bayesian optimization; the model achieves root mean square errors of 0.165, 0.172, and 0.189 on the training, validation, and test datasets, respectively. Furthermore, we assess molecular structure similarity using molecular fingerprints to enhance the model's applicability. Finally, we propose a gradient weight mapping method for model explainability, uncovering the key functional groups in chemical reactions from an artificial intelligence perspective, which could advance chemistry through the computing power of artificial intelligence.
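The atoms-as-nodes, bonds-as-edges representation can be illustrated with a small sketch. It encodes the heavy atoms of ethanol as a graph and applies one round of sum-aggregation message passing without any learned weights. This is a toy illustration of the general idea only, not the authors' integrated GNN; the element vocabulary, function names, and update rule are assumptions.

```python
# Heavy-atom graph of ethanol (C-C-O); hydrogens are omitted for brevity.
ATOMS = ["C", "C", "O"]
BONDS = [(0, 1), (1, 2)]
ELEMENTS = ["C", "O"]  # assumed feature vocabulary


def one_hot(symbol):
    """Node feature: one-hot encoding of the element symbol."""
    return [1.0 if symbol == e else 0.0 for e in ELEMENTS]


def build_neighbours(n_atoms, bonds):
    """Undirected adjacency list: each bond links both of its atoms."""
    neighbours = {i: [] for i in range(n_atoms)}
    for i, j in bonds:
        neighbours[i].append(j)
        neighbours[j].append(i)
    return neighbours


def message_passing_step(features, neighbours):
    """New feature of each atom = its own feature + sum of its neighbours' features."""
    updated = []
    for i, h in enumerate(features):
        aggregated = list(h)
        for j in neighbours[i]:
            aggregated = [a + b for a, b in zip(aggregated, features[j])]
        updated.append(aggregated)
    return updated


features = [one_hot(a) for a in ATOMS]
print(message_passing_step(features, build_neighbours(len(ATOMS), BONDS)))
```

After one step each atom's vector already reflects its bonded environment; a trained GNN would add learnable transformations and stack several such steps.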
Affiliation(s)
- Zhi Huang
- Department of Environmental Science and Engineering, College of Architecture and Environment, Sichuan University, Chengdu 610065, PR China
- Jiang Yu
- Department of Environmental Science and Engineering, College of Architecture and Environment, Sichuan University, Chengdu 610065, PR China; Institute of New Energy and Low Carbon Technology, Sichuan University, Chengdu 610065, PR China; Yibin Institute of Industrial Technology, Sichuan University, Yibin 644000, PR China.
- Wei He
- Chengdu Jin Sheng Water Engineering Co, PR China
- Jie Yu
- Department of Environmental Science and Engineering, College of Architecture and Environment, Sichuan University, Chengdu 610065, PR China; Institute of New Energy and Low Carbon Technology, Sichuan University, Chengdu 610065, PR China
- Siwei Deng
- Department of Environmental Science and Engineering, College of Architecture and Environment, Sichuan University, Chengdu 610065, PR China
- Chun Yang
- Ministry of Education and School of Mathematics Sciences, Sichuan Normal University, PR China
- Weiwei Zhu
- Department of Environmental Science and Engineering, College of Architecture and Environment, Sichuan University, Chengdu 610065, PR China
- Xiao Shao
- School of Agriculture and Environment, University of Western Australia, Perth 6907, Western Australia, Australia
10
Gomatam A, Hirlekar BU, Singh KD, Murty US, Dixit VA. Improved QSAR models for PARP-1 inhibition using data balancing, interpretable machine learning, and matched molecular pair analysis. Mol Divers 2024:10.1007/s11030-024-10809-9. [PMID: 38374474 DOI: 10.1007/s11030-024-10809-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Accepted: 01/07/2024] [Indexed: 02/21/2024]
Abstract
The poly (ADP-ribose) polymerase-1 (PARP-1) enzyme is an important target in the treatment of breast cancer. Currently, treatment options include the drugs Olaparib, Niraparib, Rucaparib, and Talazoparib; however, these drugs can cause severe side effects including hematological toxicity and cardiotoxicity. Although in silico models for the prediction of PARP-1 activity have been developed, the drawbacks of these models include low specificity, a narrow applicability domain, and a lack of interpretability. To address these issues, a comprehensive machine learning (ML)-based quantitative structure-activity relationship (QSAR) approach for the informed prediction of PARP-1 activity is presented. Classification models built with the K-nearest neighbor algorithm, using the Synthetic Minority Oversampling Technique (SMOTE) for data balancing, were robust and predictive (accuracy 0.86, sensitivity 0.88, specificity 0.80). Regression models were built on structurally congeneric datasets, with the models for the phthalazinone class and fused cyclic compounds giving the best performance. In accordance with the Organization for Economic Cooperation and Development (OECD) guidelines, a mechanistic interpretation is proposed using Shapley Additive Explanations (SHAP) to identify the important topological features that differentiate between PARP-1 actives and inactives. Moreover, an analysis of the PARP-1 dataset revealed the prevalence of activity cliffs, which may negatively impact the model's predictive performance. Finally, a set of chemical transformation rules was extracted using matched molecular pair analysis (MMPA), which provided mechanistic insights and can guide medicinal chemists in the design of novel PARP-1 inhibitors.
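The data-balancing step mentioned in the abstract can be sketched in a few lines. SMOTE creates a synthetic minority sample by picking a minority point, choosing one of its k nearest minority neighbours, and interpolating at a random position on the segment between them. The code below is a simplified illustration under those assumptions, not the implementation used in the study; the function name, parameters, and toy points are hypothetical.

```python
import random


def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic minority samples by neighbour interpolation."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (squared Euclidean distance).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        neighbour = rng.choice(neighbours)
        u = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + u * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Toy minority-class descriptor vectors for illustration only.
minority_points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
for sample in smote(minority_points, n_new=3):
    print(sample)
```

Oversampling of this kind is normally applied to the training set only, so that the test set still reflects the original class balance.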
Affiliation(s)
- Anish Gomatam
- Department of Medicinal Chemistry, National Institute of Pharmaceutical Education and Research, (NIPER Guwahati), Department of Pharmaceuticals, Ministry of Chemicals and Fertilizers, Govt. of India, Sila Katamur (Halugurisuk), Dist: Kamrup, P.O.: Changsari, Guwahati, Assam, 781101, India
- Bhakti Umesh Hirlekar
- Department of Medicinal Chemistry, National Institute of Pharmaceutical Education and Research, (NIPER Guwahati), Department of Pharmaceuticals, Ministry of Chemicals and Fertilizers, Govt. of India, Sila Katamur (Halugurisuk), Dist: Kamrup, P.O.: Changsari, Guwahati, Assam, 781101, India
- Krishan Dev Singh
- Department of Medicinal Chemistry, National Institute of Pharmaceutical Education and Research, (NIPER Guwahati), Department of Pharmaceuticals, Ministry of Chemicals and Fertilizers, Govt. of India, Sila Katamur (Halugurisuk), Dist: Kamrup, P.O.: Changsari, Guwahati, Assam, 781101, India
| | - Upadhyayula Suryanarayana Murty
- Department of Medicinal Chemistry, National Institute of Pharmaceutical Education and Research, (NIPER Guwahati), Department of Pharmaceuticals, Ministry of Chemicals and Fertilizers, Govt. of India, Sila Katamur (Halugurisuk), Dist: Kamrup, P.O.: Changsari, Guwahati, Assam, 781101, India
| | - Vaibhav A Dixit
- Department of Medicinal Chemistry, National Institute of Pharmaceutical Education and Research, (NIPER Guwahati), Department of Pharmaceuticals, Ministry of Chemicals and Fertilizers, Govt. of India, Sila Katamur (Halugurisuk), Dist: Kamrup, P.O.: Changsari, Guwahati, Assam, 781101, India.
| |
Collapse
|
11
|
Banjare P, Singh R, Pandey NK, Matore BW, Murmu A, Singh J, Roy PP. In silico soil degradation and ecotoxicity analysis of veterinary pharmaceuticals on terrestrial species: first report. Toxicol Res (Camb) 2024; 13:tfae020. [PMID: 38496320 PMCID: PMC10939401 DOI: 10.1093/toxres/tfae020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2023] [Revised: 02/01/2024] [Accepted: 02/02/2024] [Indexed: 03/19/2024] Open
Abstract
To analyze the persistence and ecotoxicological impact of veterinary pharmaceuticals on different terrestrial species, veterinary pharmaceuticals of different classes (n = 37) with soil degradation data (DT50) were gathered and subjected to QSAR and q-RASAR model development. The models were developed from 2D descriptors under Organisation for Economic Co-operation and Development (OECD) guidelines using multiple linear regression with a genetic algorithm. All developed QSAR and q-RASAR models were statistically significant (internal: R2adj = 0.721-0.861, Q2LOO = 0.609-0.757; external: Q2Fn = 0.597-0.933, MAEext = 0.174-0.260). Further, the leverage approach to the applicability domain assured the models' reliability. The veterinary pharmaceuticals with no experimental values were classified based on their persistence level. The terrestrial toxicity of persistent veterinary pharmaceuticals was then analyzed using toxicity prediction by computer assisted technology and in-house quantitative structure-toxicity relationship models to prioritize the toxic and persistent veterinary pharmaceuticals. This study will be helpful in estimating the persistence and toxicity of existing and upcoming veterinary pharmaceuticals.
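Editorial illustration: the external validation metrics quoted above (Q2Fn, MAEext) can be computed as in the sketch below, assuming the commonly used definitions in which Q2F1 references the training-set mean and Q2F2 references the test-set mean. All numeric values are hypothetical and are not taken from the study.

```python
def q2_f1(y_test, y_pred, y_train_mean):
    """Q2F1: 1 - PRESS / SS, with SS taken about the TRAINING-set mean."""
    press = sum((obs - pred) ** 2 for obs, pred in zip(y_test, y_pred))
    ss = sum((obs - y_train_mean) ** 2 for obs in y_test)
    return 1.0 - press / ss


def q2_f2(y_test, y_pred):
    """Q2F2: 1 - PRESS / SS, with SS taken about the TEST-set mean."""
    test_mean = sum(y_test) / len(y_test)
    press = sum((obs - pred) ** 2 for obs, pred in zip(y_test, y_pred))
    ss = sum((obs - test_mean) ** 2 for obs in y_test)
    return 1.0 - press / ss


def mae(y_test, y_pred):
    """Mean absolute error of the external predictions."""
    return sum(abs(obs - pred) for obs, pred in zip(y_test, y_pred)) / len(y_test)


# Hypothetical endpoint values (e.g. log-transformed DT50); not real data.
y_train = [1.0, 2.0, 3.0, 4.0, 5.0]
y_test = [1.5, 2.5, 3.5]
y_pred = [1.7, 2.4, 3.2]

q2f1 = q2_f1(y_test, y_pred, sum(y_train) / len(y_train))
q2f2 = q2_f2(y_test, y_pred)
mae_ext = mae(y_test, y_pred)
print(round(q2f1, 3), round(q2f2, 3), round(mae_ext, 3))
```

The two Q2 variants differ only in the reference mean, so they diverge when the test set is centred away from the training set, as in this toy example.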
Affiliation(s)
- Purusottam Banjare
  - Department of Pharmacy, Guru Ghasidas Vishwavidyalaya (A Central University), Bilaspur 495009, Chhattisgarh, India
  - Department of Pharmaceutical Chemistry, Apollo College of Pharmacy, Anjora, Durg 491001, Chhattisgarh, India
- Rekha Singh
  - Department of Pharmacy, Guru Ghasidas Vishwavidyalaya (A Central University), Bilaspur 495009, Chhattisgarh, India
- Nilesh Kumar Pandey
  - Department of Pharmacy, Guru Ghasidas Vishwavidyalaya (A Central University), Bilaspur 495009, Chhattisgarh, India
- Balaji Wamanrao Matore
  - Department of Pharmacy, Guru Ghasidas Vishwavidyalaya (A Central University), Bilaspur 495009, Chhattisgarh, India
- Anjali Murmu
  - Department of Pharmacy, Guru Ghasidas Vishwavidyalaya (A Central University), Bilaspur 495009, Chhattisgarh, India
- Jagadish Singh
  - Department of Pharmacy, Guru Ghasidas Vishwavidyalaya (A Central University), Bilaspur 495009, Chhattisgarh, India
- Partha Pratim Roy
  - Department of Pharmacy, Guru Ghasidas Vishwavidyalaya (A Central University), Bilaspur 495009, Chhattisgarh, India
|
12
|
Pandey NK, Murmu A, Banjare P, Matore BW, Singh J, Roy PP. Integrated predictive QSAR, Read Across, and q-RASAR analysis for diverse agrochemical phytotoxicity in oat and corn: A consensus-based approach for risk assessment and prioritization. Environ Sci Pollut Res Int 2024; 31:12371-12386. [PMID: 38228952 DOI: 10.1007/s11356-024-31872-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2023] [Accepted: 01/02/2024] [Indexed: 01/18/2024]
Abstract
In the modern fast-paced lifestyle, time-efficient and nutritionally rich foods like corn and oat have gained popularity for their amino acid and antioxidant content. The increasing demand for these cereals necessitates higher production, which leads to dependency on agrochemicals that can pose health risks through residues present in plant products. To provide a first report on phytotoxicity in corn and oat, our study employs QSAR, quantitative read-across, and quantitative RASAR (q-RASAR). All developed QSAR and q-RASAR models were equally robust (R2 = 0.680-0.762, Q2LOO = 0.593-0.693, Q2F1 = 0.680-0.860), and each proved superior for either the oat or the corn model based on MAE criteria. Applicability domain (AD) and PRI analyses were performed, confirming the reliability and predictability of the models. The mechanistic interpretation reveals that the symmetrical arrangement of electronegative atoms and polar groups directly influences the toxicity of compounds. The final phytotoxicity assessment and prioritization were performed using a consensus approach, which resulted in the selection of the 15 most toxic compounds for both species.
Affiliation(s)
- Nilesh Kumar Pandey
  - Department of Pharmacy, Guru Ghasidas Vishwavidyalaya (A Central University), Bilaspur, 495009, India
- Anjali Murmu
  - Department of Pharmacy, Guru Ghasidas Vishwavidyalaya (A Central University), Bilaspur, 495009, India
- Balaji Wamanrao Matore
  - Department of Pharmacy, Guru Ghasidas Vishwavidyalaya (A Central University), Bilaspur, 495009, India
- Jagadish Singh
  - Department of Pharmacy, Guru Ghasidas Vishwavidyalaya (A Central University), Bilaspur, 495009, India
- Partha Pratim Roy
  - Department of Pharmacy, Guru Ghasidas Vishwavidyalaya (A Central University), Bilaspur, 495009, India
|
13
|
Ahmadi M, Ayyoubzadeh SM, Ghorbani-Bidkorpeh F. Toxicity prediction of nanoparticles using machine learning approaches. Toxicology 2024; 501:153697. [PMID: 38056590 DOI: 10.1016/j.tox.2023.153697] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Revised: 11/21/2023] [Accepted: 12/01/2023] [Indexed: 12/08/2023]
Abstract
Nanoparticle toxicity analysis is critical for evaluating the safety of nanomaterials because of their potential harm to biological systems. However, traditional experimental methods for evaluating nanoparticle toxicity are expensive and time-consuming. As an alternative approach, machine learning (ML) offers a solution for predicting cellular responses to nanoparticles. This study focuses on developing ML models for nanoparticle toxicity prediction. The training dataset used for building these models includes the physicochemical properties of nanoparticles, exposure conditions, and cellular responses of different cell lines. The impact of each parameter on cell death was assessed using the Gini index. Five classifiers, namely Decision Tree, Random Forest, Support Vector Machine, Naïve Bayes, and Artificial Neural Network, were employed to predict toxicity. The models' performance was compared based on accuracy, sensitivity, specificity, area under the curve, F measure, K-fold validation, and classification error. The Gini index indicated that cell line, exposure dose, and tissue are the most influential factors in cell death. Among the models tested, Random Forest exhibited the highest performance on the given dataset. Researchers can use the Random Forest model to predict nanoparticle toxicity, saving cost and time in toxicity analysis.
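Editorial illustration: the abstract does not say how the Gini index was computed. The sketch below assumes the Gini impurity used by tree-based learners and scores a categorical parameter by how much splitting on it reduces impurity. The toy records and the names `dose`, `shape`, and `outcome` are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 - sum over classes of p_k squared."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(feature_values, labels):
    """Impurity of all labels minus the size-weighted impurity after
    grouping the labels by the value of one categorical feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    weighted = sum(len(group) / n * gini(group) for group in groups.values())
    return gini(labels) - weighted


# Hypothetical exposure records; not data from the study.
dose = ["low", "low", "high", "high"]
shape = ["rod", "sphere", "rod", "sphere"]
outcome = ["alive", "alive", "dead", "dead"]

dose_score = gini_decrease(dose, outcome)
shape_score = gini_decrease(shape, outcome)
print(dose_score, shape_score)
```

In this toy set, dose separates the outcomes perfectly and removes all impurity, while shape removes none, which is the sense in which a parameter is called more or less influential under this criterion.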
Affiliation(s)
- Mahnaz Ahmadi
  - Medical Nanotechnology and Tissue Engineering Research Center, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Seyed Mohammad Ayyoubzadeh
  - Department of Health Information Management, School of Allied Medical Sciences, Tehran University of Medical Sciences, Tehran, Iran; Health Information Management Research Center, Tehran University of Medical Sciences, Tehran, Iran
- Fatemeh Ghorbani-Bidkorpeh
  - Department of Pharmaceutics and Pharmaceutical Nanotechnology, School of Pharmacy, Shahid Beheshti University of Medical Sciences, Tehran, Iran
|
14
|
Duchowicz PR, Fioressi SE, Bacelo DE, Quispe AQ, Yapu EL, Castañeta H. QSPR predicting the vapor pressure of pesticides into high/low volatility classes. Environ Sci Pollut Res Int 2024; 31:1395-1402. [PMID: 38038924 DOI: 10.1007/s11356-023-31235-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2023] [Accepted: 11/21/2023] [Indexed: 12/02/2023]
Abstract
In this work, the vapor pressure of pesticides is employed as an indicator of their volatility potential. Quantitative Structure-Property Relationship (QSPR) models are established to classify compounds according to their volatility into high and low binary classes separated by the 1-mPa limit. A large dataset of 1005 structurally diverse pesticides with known experimental vapor pressure data at 20 °C is compiled from the publicly available Pesticide Properties DataBase (PPDB) and used for model development. The freely available PaDEL-Descriptor and ISIDA/Fragmentor molecular descriptor programs provide 19,947 non-conformational molecular descriptors, which are analyzed through multivariable linear regressions and the Replacement Method technique. Through the selection of appropriate molecular descriptors of the substructure fragment type and the use of different standard classification metrics of model quality, the structure-property relationship achieves acceptable results in discerning between the high and low volatility classes. Finally, the obtained QSPR model is applied to predict the classes of 504 pesticides that lack experimentally measured vapor pressures.
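Editorial illustration: the sketch below shows how a 1-mPa limit turns vapor pressures into binary classes and how standard classification metrics follow from the resulting confusion counts. It assumes that values strictly above the limit count as "high" (the abstract does not say how a value exactly at the limit is treated). The vapor pressures and predicted labels are hypothetical.

```python
THRESHOLD_MPA = 1.0  # class boundary named in the abstract


def volatility_class(vapor_pressure_mpa):
    """Assumption: strictly above the limit is 'high', otherwise 'low'."""
    return "high" if vapor_pressure_mpa > THRESHOLD_MPA else "low"


def classification_metrics(observed, predicted, positive="high"):
    """Accuracy, sensitivity, and specificity from paired class labels."""
    pairs = list(zip(observed, predicted))
    tp = sum(o == positive and p == positive for o, p in pairs)
    tn = sum(o != positive and p != positive for o, p in pairs)
    fp = sum(o != positive and p == positive for o, p in pairs)
    fn = sum(o == positive and p != positive for o, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # fraction of true 'high' recovered
        "specificity": tn / (tn + fp),  # fraction of true 'low' recovered
    }


# Hypothetical vapor pressures in mPa and hypothetical model predictions.
vapor_pressures = [0.02, 5.0, 1.3, 0.4, 12.0, 0.9]
observed = [volatility_class(vp) for vp in vapor_pressures]
predicted = ["low", "high", "low", "low", "high", "high"]

metrics = classification_metrics(observed, predicted)
print(observed, metrics)
```

The toy data contain both classes, so the ratios are defined; a production version would need to guard against a class being absent from the observed labels.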
Affiliation(s)
- Pablo R Duchowicz
  - Instituto de Investigaciones Fisicoquímicas Teóricas y Aplicadas (INIFTA), CONICET, UNLP, Diag. 113 y 64, C.C. 16, Sucursal 4, 1900, La Plata, Argentina
- Silvina E Fioressi
  - Facultad de Ciencias Exactas y Naturales, Universidad de Belgrano, CONICET, Villanueva 1324, 1426, Buenos Aires, Argentina
- Daniel E Bacelo
  - Facultad de Ciencias Exactas y Naturales, Universidad de Belgrano, CONICET, Villanueva 1324, 1426, Buenos Aires, Argentina
- Alexander Q Quispe
  - Carrera de Ciencias Químicas, Universidad Mayor de San Andrés, 303, La Paz, Bolivia
- Ebbe L Yapu
  - Carrera de Ciencias Químicas, Universidad Mayor de San Andrés, 303, La Paz, Bolivia
- Heriberto Castañeta
  - Instituto de Investigaciones Químicas, Universidad Mayor de San Andrés, 303, La Paz, Bolivia
|
15
|
Ghosh V, Bhattacharjee A, Kumar A, Ojha PK. q-RASTR modelling for prediction of diverse toxic chemicals towards T. pyriformis. SAR QSAR Environ Res 2024; 35:11-30. [PMID: 38193248 DOI: 10.1080/1062936x.2023.2298452] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2023] [Accepted: 12/16/2023] [Indexed: 01/10/2024]
Abstract
A wide range of organic compounds have serious detrimental effects on the health of living organisms and the environment. Determining and evaluating the structural features of compounds that impart toxicity is crucial before public usage. The present study aims to determine the structural characteristics of compounds responsible for Tetrahymena pyriformis toxicity using the q-RASTR (Quantitative Read-Across Structure-Toxicity Relationship) model. It was developed using RASTR and 2-D descriptors for a dataset of 1792 compounds with a defined endpoint (pIGC50) against a model organism, T. pyriformis. For the current study, the whole dataset was divided based on activity/property into training and test sets, and the q-RASTR model was developed employing six descriptors (three latent variables), with r2, Q2F1, and Q2 values of 0.739, 0.767, and 0.735, respectively. The generated model was thoroughly validated using internationally recognized internal and external validation criteria to assess its dependability and predictability. The results highlight that high molecular weight, aromatic hydroxyls, nitrogen, double bonds, and hydrophobicity increase the toxicity of organic compounds. The current study demonstrates the applicability of the RASTR algorithm in QSTR model development for the prediction of the toxicity (pIGC50) of chemicals towards T. pyriformis.
Affiliation(s)
- V Ghosh
  - Drug Discovery and Development Laboratory (DDD Lab), Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India
- A Bhattacharjee
  - Drug Discovery and Development Laboratory (DDD Lab), Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India
- A Kumar
  - Drug Discovery and Development Laboratory (DDD Lab), Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India
- P K Ojha
  - Drug Discovery and Development Laboratory (DDD Lab), Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India
|
16
|
Srisongkram T. Ensemble Quantitative Read-Across Structure-Activity Relationship Algorithm for Predicting Skin Cytotoxicity. Chem Res Toxicol 2023; 36:1961-1972. [PMID: 38047785 DOI: 10.1021/acs.chemrestox.3c00238] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 12/05/2023]
Abstract
Read-across (RA) and quantitative structure-activity relationship (QSAR) are two alternative methods commonly used to fill data gaps in chemical registrations. These approaches use physicochemical properties or molecular fingerprints of source substances to predict the properties of unknown substances that have similar chemical structures or physicochemical properties. Research on RA and QSAR is essential to minimize the time, money, and animal testing needed to determine biological properties that are not currently known. This study developed a stacked ensemble quantitative read-across structure-activity relationship algorithm (enQRASAR) for predicting skin irritation toxicity, using the negative logarithm of the 50% cell-viability inhibitory concentration (pIC50) against skin keratinocytes as the end point. The goodness-of-fit and predictability of this algorithm were validated using leave-one-out cross-validation and external test data sets. The results obtained were statistically reliable in terms of goodness-of-fit, robustness, and predictability metrics. Additionally, the developed model demonstrated a low prediction error when predicting FDA-approved drugs. These results confirm that the enQRASAR algorithm can be used to predict the skin cytotoxicity of chemicals. The model has therefore been made publicly available to further facilitate toxicity predictions of unknown compounds in chemical registrations.
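Editorial illustration: a generic read-across prediction can be sketched as a similarity-weighted average over the most similar source compounds. The sketch below assumes Tanimoto similarity on sets of fingerprint features and a fixed number of neighbours `k`. It is not the enQRASAR algorithm, and the fingerprints and activity values are hypothetical.

```python
def tanimoto(features_a, features_b):
    """Tanimoto (Jaccard) similarity between two sets of fingerprint features."""
    a, b = set(features_a), set(features_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def read_across_predict(query_features, sources, k=2):
    """Similarity-weighted mean activity of the k most similar source compounds.

    sources is a list of (feature_set, activity) pairs.
    """
    scored = sorted(
        ((tanimoto(query_features, features), activity) for features, activity in sources),
        reverse=True,
    )[:k]
    total_similarity = sum(similarity for similarity, _ in scored)
    if total_similarity == 0:
        raise ValueError("query shares no features with any source compound")
    return sum(similarity * activity for similarity, activity in scored) / total_similarity


# Hypothetical source compounds: (set of on-bits, measured pIC50).
sources = [({1, 2, 3}, 5.0), ({1, 2, 4}, 6.0), ({7, 8}, 2.0)]
query = {1, 2, 3, 4}

prediction = read_across_predict(query, sources, k=2)
print(prediction)
```

Here the two closest sources are equally similar to the query, so the prediction falls midway between their activities, and the dissimilar third compound contributes nothing.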
Affiliation(s)
- Tarapong Srisongkram
  - Division of Pharmaceutical Chemistry, Faculty of Pharmaceutical Sciences, Khon Kaen University, Khon Kaen 40000, Thailand
|
17
|
Baran K, Kloskowski A. Graph Neural Networks and Structural Information on Ionic Liquids: A Cheminformatics Study on Molecular Physicochemical Property Prediction. J Phys Chem B 2023; 127:10542-10555. [PMID: 38015981 PMCID: PMC10726349 DOI: 10.1021/acs.jpcb.3c05521] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2023] [Revised: 11/01/2023] [Accepted: 11/16/2023] [Indexed: 11/30/2023]
Abstract
Ionic liquids (ILs) provide a promising solution in many industrial applications, such as solvents, absorbents, electrolytes, catalysts, lubricants, and many others. However, due to the enormous variety of their structures, uncovering or designing those with optimal attributes requires expensive and exhaustive simulations and experiments. For these reasons, searching for an efficient theoretical tool for finding the relationship between the IL structure and properties has been the subject of many research studies. Recently, special attention has been paid to machine learning tools, especially multilayer perceptron and convolutional neural networks, among many other algorithms in the field of artificial neural networks. For the latter, graph neural networks (GNNs) seem to be a powerful cheminformatic tool yet not well enough studied for dual molecular systems such as ILs. In this work, the usage of GNNs in structure-property studies is critically evaluated for predicting the density, viscosity, and surface tension of ILs. The problem of data availability and integrity is discussed to show how well GNNs deal with mislabeled chemical data. Providing more training data is proven to be more important than ensuring that they are immaculate. Great attention is paid to how GNNs process different ions to give graph transformations and electrostatic information. Clues on how GNNs should be applied to predict the properties of ILs are provided. Differences, especially regarding handling mislabeled data, favoring the use of GNNs over classical quantitative structure-property models are discussed.
Affiliation(s)
- Karol Baran
  - Department of Physical Chemistry, Faculty of Chemistry, Gdansk University of Technology, Narutowicza Street 11/12, 80-233 Gdansk, Poland
- Adam Kloskowski
  - Department of Physical Chemistry, Faculty of Chemistry, Gdansk University of Technology, Narutowicza Street 11/12, 80-233 Gdansk, Poland
|
18
|
Pandey SK, Roy K. Development of a read-across-derived classification model for the predictions of mutagenicity data and its comparison with traditional QSAR models and expert systems. Toxicology 2023; 500:153676. [PMID: 37993082 DOI: 10.1016/j.tox.2023.153676] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2023] [Revised: 11/06/2023] [Accepted: 11/17/2023] [Indexed: 11/24/2023]
Abstract
Mutagenicity is considered an important endpoint from the regulatory, environmental, and medical points of view. Because of the large number of compounds that may be of concern and the enormous expense (in terms of time, money, and animals) associated with rodent mutagenicity bioassays, this endpoint is a major target for the development of alternative approaches for screening and prediction. Most older expert systems and quantitative structure-activity relationship (QSAR) models may show reduced performance over time when applied to newer chemical candidates; thus, researchers constantly try to improve the modeling strategies. In our report, we initially performed traditional classification-based linear discriminant analysis (LDA) QSAR modeling using the benchmark Ames dataset of diverse chemicals (6512 compounds) to recognize the relationship between the molecules and their potential mutagenic behavior. The classical LDA QSAR model was developed from a selected set of 31 2D descriptors identified from the analysis of the most discriminating features. Additionally, we used similarity-derived features obtained from read-across (RA) to develop an RA-based QSAR model. The developed RA-based LDA QSAR model has better predictivity, transferability, and interpretability than the classical LDA QSAR model, and it uses far fewer descriptors. Different machine learning (ML) models were also developed using the descriptors appearing in the read-across-based LDA QSAR model for comparative studies. We checked the prediction quality for 216 true external set compounds using the novel similarity-derived RA model. The performance of the OECD toolbox was also compared with that of the RA-derived LDA QSAR model for the true external set. The current study aimed to explore the significance of the read-across-based algorithm and its application to the most current experimental mutagenicity data to complement already available expert systems.
Affiliation(s)
- Sapna Kumari Pandey
  - Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700032, India
- Kunal Roy
  - Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700032, India
|
19
|
Banerjee A, Roy K. Read-across-based intelligent learning: development of a global q-RASAR model for the efficient quantitative predictions of skin sensitization potential of diverse organic chemicals. Environ Sci Process Impacts 2023; 25:1626-1644. [PMID: 37682520 DOI: 10.1039/d3em00322a] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/09/2023]
Abstract
Environmental chemicals and contaminants cause a wide array of harmful effects on terrestrial and aquatic life, ranging from skin sensitization to acute oral toxicity. The current study aims to assess the quantitative skin sensitization potential of a large set of industrial and environmental chemicals acting through different mechanisms using the novel quantitative Read-Across Structure-Activity Relationship (q-RASAR) approach. Based on the identified set of important structural and physicochemical features, Read-Across-based hyperparameters were optimized using the training set compounds, followed by the calculation of similarity and error-based RASAR descriptors. Data fusion, further feature selection, and removal of prediction confidence outliers were performed to generate a partial least squares (PLS) q-RASAR model, followed by the application of various Machine Learning (ML) tools to check the quality of predictions. The PLS model was found to be the best among the different models. A simple, user-friendly Java-based software tool was developed based on the PLS model; it efficiently predicts the toxicity value(s) of query compound(s) along with their Applicability Domain (AD) status in terms of leverage values. This model has been developed using structurally diverse compounds and is expected to predict efficiently and quantitatively the skin sensitization potential of environmental chemicals to estimate their occupational and health hazards.
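Editorial illustration: a leverage-based Applicability Domain check can be sketched for the simplest case of a model with an intercept and a single descriptor, where the leverage of a compound has a closed form. The warning threshold h* = 3(p + 1)/n is assumed here as a commonly used convention (p descriptors, n training compounds); the abstract does not state which threshold the software tool applies. The descriptor values are hypothetical.

```python
def leverage(x_query, x_train):
    """Leverage of a query compound for an intercept-plus-one-descriptor model:
    h = 1/n + (x - mean)^2 / sum((x_i - mean)^2)."""
    n = len(x_train)
    mean = sum(x_train) / n
    sxx = sum((x - mean) ** 2 for x in x_train)
    return 1.0 / n + (x_query - mean) ** 2 / sxx


def warning_leverage(n_train, n_descriptors):
    """Assumed convention: h* = 3 * (number of descriptors + 1) / n."""
    return 3.0 * (n_descriptors + 1) / n_train


# Hypothetical training descriptor values.
x_train = [float(i) for i in range(1, 11)]
h_star = warning_leverage(len(x_train), n_descriptors=1)

inside = leverage(6.0, x_train) <= h_star    # close to the training data
outside = leverage(15.0, x_train) <= h_star  # far beyond the training range
print(h_star, inside, outside)
```

A multi-descriptor model would use the diagonal of the hat matrix instead of this closed form, but the interpretation is the same: a query compound with leverage above h* lies far from the training descriptor space, so its prediction is an extrapolation.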
Affiliation(s)
- Arkaprava Banerjee
  - Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700032, India
- Kunal Roy
  - Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700032, India
|