1
Banerjee A, Roy K. The multiclass ARKA framework for developing improved q-RASAR models for environmental toxicity endpoints. Environmental Science. Processes & Impacts 2025; 27:1229-1243. [PMID: 40227888 DOI: 10.1039/d5em00068h] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/16/2025]
Abstract
Quick, accurate, and efficient methods for filling the gaps in the toxicity data of commercial chemicals are urgently needed. It has therefore become essential to develop simple and improved modeling strategies that generate more accurate predictions. Recently, quantitative Read-Across Structure-Activity Relationship (q-RASAR) modeling has been reported to enhance the external predictivity of QSAR models. However, the cross-validation metrics of some q-RASAR models show compromised values compared to those of the corresponding QSAR models. We report here an improved q-RASAR workflow coupled with the Arithmetic Residuals in K-groups Analysis (ARKA) framework. This improved workflow (ARKA-RASAR) considers two important aspects: the contribution of different QSAR descriptors to different experimental response ranges, and the identification of similarity among close congeners based on both the selected QSAR descriptors and the contribution of different QSAR descriptors to different experimental response ranges. A simple, free, and user-friendly Java-based tool, Multiclass ARKA-v1.0, has been developed to compute the multiclass ARKA descriptors. In this study, five different toxicity datasets previously used for the development of QSAR and q-RASAR models were considered. We developed hybrid ARKA models that consist of a combination of QSAR descriptors and ARKA descriptors. These hybrid feature spaces were used to compute RASAR descriptors and develop ARKA-RASAR models. For a fair comparison, we used the same modeling strategies as those used to develop the previously reported QSAR and q-RASAR models. Additionally, these modeling algorithms are straightforward, reproducible, and transferable. A multi-criteria decision-making statistical approach, the Sum of Ranking Differences (SRD), indicated that the ARKA-RASAR models are the best-performing models, considering training, test, and cross-validation statistics.
The least significant difference procedure ensured that the SRD values were significantly different for most models, presenting an unbiased workflow. True external validation using a set of pesticide metabolites and predicting their early-stage acute fish toxicity using relevant ARKA-RASAR models was also carried out and yielded encouraging results. The promising results and the ease of computation of ARKA and RASAR descriptors using our tools suggest that the ARKA-RASAR modeling framework may be a potential choice for developing highly robust and predictive models for filling the gaps in environmental toxicity data.
Affiliation(s)
- Arkaprava Banerjee
- Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700 032, India.
- Kunal Roy
- Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700 032, India.
2
Jyoti S, Murmu A, Matore BW, Singh J, Roy PP. Exploring QSTR and q-RASTR modeling of agrochemical toxicity on cabbage for environmental safety and human health. Environmental Science and Pollution Research International 2025; 32:5504-5520. [PMID: 39930099 DOI: 10.1007/s11356-025-36033-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2024] [Accepted: 01/26/2025] [Indexed: 02/28/2025]
Abstract
Cabbage is a widely consumed vegetable in the human diet because of its low cost, broad availability and high nutritional value. The rising use of pesticides in food production creates a need to assess vegetable toxicity, which primarily results from residues in food products and environmental exposure. This study explores vegetable toxicity in cabbage using reliable QSTR and q-RASTR models. All the developed models were robust and predictive (Q2LOO = 0.7491-0.8164, Q2F1 = 0.5243-0.6253, Q2F2 = 0.513-0.617, MAEext = 0.495-0.690). Furthermore, the reliability and predictability of the models were assessed and confirmed by applicability domain and prediction reliability indicator analysis. Additionally, different machine learning models were developed to make effective predictions and to allow comparison with multiple linear regression (MLR). A consensus approach was advocated for data gap filling for USEPA ECOTOX database compounds. The most and least toxic compounds from both MLR model predictions were prioritized and analyzed. Mechanistic interpretation highlighted the structural features or fragments responsible for agrochemical toxicity and suggested a safe approach for designing green chemicals with minimal toxicity. This first reported study can be useful for toxicity profiling, data gap filling, and designing safer and greener agrochemicals to minimize vegetable toxicity and support healthy human life and environmental safety.
Affiliation(s)
- Surbhi Jyoti
- Laboratory of Drug Discovery and Ecotoxicology, Department of Pharmacy, Guru Ghasidas Vishwavidyalaya (A Central University), Bilaspur, 495009, India
- Anjali Murmu
- Laboratory of Drug Discovery and Ecotoxicology, Department of Pharmacy, Guru Ghasidas Vishwavidyalaya (A Central University), Bilaspur, 495009, India
- Balaji Wamanrao Matore
- Laboratory of Drug Discovery and Ecotoxicology, Department of Pharmacy, Guru Ghasidas Vishwavidyalaya (A Central University), Bilaspur, 495009, India
- Jagadish Singh
- Laboratory of Drug Discovery and Ecotoxicology, Department of Pharmacy, Guru Ghasidas Vishwavidyalaya (A Central University), Bilaspur, 495009, India
- Partha Pratim Roy
- Laboratory of Drug Discovery and Ecotoxicology, Department of Pharmacy, Guru Ghasidas Vishwavidyalaya (A Central University), Bilaspur, 495009, India.
3
Dasgupta I, Barik H, Gayen S. Modelling of intrinsic membrane permeability of drug molecules by explainable ML-based q-RASPR approach towards better pharmacokinetics and toxicokinetics properties. SAR and QSAR in Environmental Research 2025; 36:127-143. [PMID: 40190164 DOI: 10.1080/1062936x.2025.2478118] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2025] [Accepted: 03/04/2025] [Indexed: 05/17/2025]
Abstract
Drug discovery's success lies in potent inhibition against a target and optimum pharmacokinetic and toxicokinetic properties of drug molecules. Membrane permeability is a crucial factor in determining the absorption, distribution, metabolism, and excretion of drug molecules, thereby determining the pharmacokinetic and toxicokinetic properties important for drug development. Intrinsic permeability (P0) is more crucial than apparent permeability (Papp) in assessing the transport of drug molecules across a membrane. It gives more consistent results due to its non-dependency on external/site-specific factors. In the present work, our focus is on the construction of a machine learning (ML)-based quantitative read-across structure-property relationship (q-RASPR) model of intrinsic permeability of drug molecules by utilizing both linear and non-linear algorithms. The Support Vector Regression (SVR) q-RASPR model was found to be the best model having superior predictive ability (Q2F1 = 0.788, Q2F2 = 0.785, MAEtest = 0.637). The contribution of important descriptors in the final model is explained to get a mechanistic interpretation of intrinsic permeability. Overall, the present study unveils the application of the q-RASPR framework for significant improvement of the external predictivity of the traditional QSPR model in the case of intrinsic permeability to get a better assessment of the total permeability of drug molecules.
Affiliation(s)
- I Dasgupta
- Laboratory of Drug Design and Discovery, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India
- H Barik
- Laboratory of Drug Design and Discovery, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India
- S Gayen
- Laboratory of Drug Design and Discovery, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India
4
Banerjee A, Roy K. Machine learning assisted classification RASAR modeling for the nephrotoxicity potential of a curated set of orally active drugs. Sci Rep 2025; 15:808. [PMID: 39755865 PMCID: PMC11700179 DOI: 10.1038/s41598-024-85063-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2024] [Accepted: 12/30/2024] [Indexed: 01/06/2025]
Abstract
We have adopted the classification Read-Across Structure-Activity Relationship (c-RASAR) approach in the present study for machine-learning (ML)-based model development from a recently reported curated dataset of nephrotoxicity potential of orally active drugs. We initially developed ML models using nine different algorithms separately on topological descriptors (referred to simply as "descriptors" in the subsequent sections of the manuscript) and MACCS fingerprints (referred to as "fingerprints" in the subsequent sections of the manuscript), thus generating 18 different ML QSAR models. Using the chemical spaces defined by the modeling descriptors and fingerprints, the similarity and error-based RASAR descriptors were computed, and the most discriminating RASAR descriptors were used to develop another set of 18 different ML c-RASAR models. All 36 models were cross-validated 20 times with a fivefold cross-validation strategy, and their predictivity was checked on the test set data. A multi-criteria decision-making strategy, the Sum of Ranking Differences (SRD) approach, was adopted to identify the best-performing model based on robustness and external validation parameters. This statistical analysis suggested that the c-RASAR models had an overall good performance, while the best-performing model was also a c-RASAR model (LDA c-RASAR model derived from topological descriptors, with MCC values of 0.229 and 0.431 for the training and test sets, respectively). This model was used to screen a true external data set prepared from the known nephrotoxic compounds of DrugBankDB, demonstrating good predictivity.
Affiliation(s)
- Arkaprava Banerjee
- Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, 700 032, India
- Kunal Roy
- Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, 700 032, India.
5
Das S, Bhattacharjee A, Ojha PK. First report on q-RASTR modelling of hazardous dose (HD5) for acute toxicity of pesticides: an efficient and reliable approach towards safeguarding the sensitive avian species. SAR and QSAR in Environmental Research 2025; 36:39-55. [PMID: 39931931 DOI: 10.1080/1062936x.2025.2462559] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2024] [Accepted: 01/27/2025] [Indexed: 02/25/2025]
Abstract
Pesticides are crucial in modern agriculture, significantly enhancing crop productivity by managing pests. It is important to evaluate their toxicity to minimize health risks to bird species and preserve ecosystem balance. Traditional parameters such as lethal concentration (LC50) or median lethal dose (LD50) often underestimate hazards due to limited data and uncertainty about the most sensitive species tested. This limitation can be addressed using extrapolation factors such as HD5, which accounts for 50% mortality of the most sensitive 5% of bird species. In this research, a QSTR model was developed for a diverse set of 480 pesticides using partial least squares (PLS) regression with 2D descriptors. Additionally, a PLS-based quantitative read-across structure-toxicity relationship (q-RASTR) model and classification-based models were constructed. The q-RASTR model outperformed traditional QSTR approaches, achieving robust statistical performance with internal validation metrics r2 = 0.623, Q2 = 0.569 and external validation metrics Q2F1 = 0.541, Q2F2 = 0.540. Key factors influencing avian toxicity were identified. The q-RASTR model was used to screen the Pesticide Properties Database (PPDB) to recognize the most and least toxic pesticides for avian species, aligning well with real-world data. This work provides a more economical and ethical alternative to conventional in vivo testing methods, aiding regulatory bodies and industries in developing safer, environmentally friendly pesticides.
Affiliation(s)
- S Das
- Drug Discovery and Development Laboratory (DDD Lab), Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India
- A Bhattacharjee
- Drug Discovery and Development Laboratory (DDD Lab), Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India
- P K Ojha
- Drug Discovery and Development Laboratory (DDD Lab), Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India
6
Xiao X, Trinh TX, Gerelkhuu Z, Ha E, Yoon TH. Automated machine learning in nanotoxicity assessment: A comparative study of predictive model performance. Comput Struct Biotechnol J 2024; 25:9-19. [PMID: 38414794 PMCID: PMC10899003 DOI: 10.1016/j.csbj.2024.02.003] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/30/2023] [Revised: 01/24/2024] [Accepted: 02/05/2024] [Indexed: 02/29/2024]
Abstract
Computational modeling has earned significant interest as an alternative to animal testing for toxicity assessment. However, the process of selecting an appropriate algorithm and fine-tuning hyperparameters for developing optimized models takes considerable time, expertise, and an intensive search. The recent emergence of automated machine learning (autoML) approaches, available as user-friendly platforms, has proven beneficial for individuals with limited knowledge of ML-based predictive model development. These autoML platforms automate crucial steps in model development, including data preprocessing, algorithm selection, and hyperparameter tuning. In this study, we used seven previously published and publicly available datasets for oxides and metals to develop nanotoxicity prediction models. AutoML platforms, namely Vertex AI, Azure, and Dataiku, were employed, and performance measures such as accuracy, F1 score, precision, and recall for these autoML-based models were then compared with those of conventional ML-based models. The results demonstrated clearly that the autoML platforms produced more reliable nanotoxicity prediction models, outperforming those built with conventional ML algorithms. While none of the three autoML platforms significantly outperformed the others, distinctions exist among them in terms of the available options for choosing technical features throughout the model development steps. This allows users to select an autoML platform that aligns with their knowledge of predictive model development and its technical features. Additionally, prediction models constructed from datasets with better data quality displayed better performance than those built from datasets with lower data quality, indicating that future studies with high-quality datasets can further improve the performance of those autoML-based prediction models.
Affiliation(s)
- Xiao Xiao
- Department of Chemistry, College of Natural Sciences, Hanyang University, Seoul 04763, the Republic of Korea
- Tung X Trinh
- Department of Chemistry, College of Natural Sciences, Hanyang University, Seoul 04763, the Republic of Korea
- Institute of Next Generation Material Design, Hanyang University, Seoul 04763, the Republic of Korea
- Zayakhuu Gerelkhuu
- Institute of Next Generation Material Design, Hanyang University, Seoul 04763, the Republic of Korea
- Yoon Idea Lab. Co. Ltd, Seoul 04763, the Republic of Korea
- Eunyong Ha
- Department of Chemistry, College of Natural Sciences, Hanyang University, Seoul 04763, the Republic of Korea
- Tae Hyun Yoon
- Department of Chemistry, College of Natural Sciences, Hanyang University, Seoul 04763, the Republic of Korea
- Institute of Next Generation Material Design, Hanyang University, Seoul 04763, the Republic of Korea
- Yoon Idea Lab. Co. Ltd, Seoul 04763, the Republic of Korea
7
Pandey SK, Roy K. Development of hybrid models by the integration of the read-across hypothesis with the QSAR framework for the assessment of developmental and reproductive toxicity (DART) tested according to OECD TG 414. Toxicol Rep 2024; 13:101822. [PMID: 39649380 PMCID: PMC11621937 DOI: 10.1016/j.toxrep.2024.101822] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2024] [Revised: 11/15/2024] [Accepted: 11/18/2024] [Indexed: 12/10/2024]
Abstract
The governing laws mandate animal testing guidelines (TG) to assess the developmental and reproductive toxicity (DART) potential of new and current chemical compounds for the categorization, hazard identification, and labeling. In silico modeling has evolved as a promising, economical, and animal-friendly technique for assessing a chemical's potential for DART testing. The complexity of the endpoint has presented a problem for Quantitative Structure-Activity Relationship (QSAR) model developers as various facets of the chemical have to be appropriately analyzed to predict the DART. For the next-generation risk assessment (NGRA) studies, researchers and governing bodies are exploring various new approach methodologies (NAMs) integrated to address complex endpoints like repeated dose toxicity and DART. We have developed four hybrid computational models for DART studies of rodents and rabbits for their adult and fetal life stages separately. The hybrid models were created by integrating QSAR features with similarities-derived features (obtained from read-across hypotheses). This analysis has identified that this integrated method gives a better statistical quality compared to the traditional QSAR models, and the predictivity and transferability of the model are also enhanced in this new approach.
8
Sobańska AW, Banerjee A, Roy K. Organic Sunscreens and Their Products of Degradation in Biotic and Abiotic Conditions-In Silico Studies of Drug-Likeness and Human Placental Transport. Int J Mol Sci 2024; 25:12373. [PMID: 39596438 PMCID: PMC11595199 DOI: 10.3390/ijms252212373] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2024] [Revised: 11/06/2024] [Accepted: 11/15/2024] [Indexed: 11/28/2024]
Abstract
A total of 16 organic sunscreens and over 160 products of their degradation in biotic and abiotic conditions were investigated in the context of their safety during pregnancy. Drug-likeness and the ability of the studied compounds to be absorbed from the gastrointestinal tract and cross the human placenta were predicted in silico using the SwissADME software (for drug-likeness and oral absorption) and multiple linear regression and "ARKA" models (for placenta permeability expressed as fetus-to-mother blood concentration in the state of equilibrium), with the latter outperforming the MLR models. It was established that most of the studied compounds can be absorbed from the gastrointestinal tract. The drug-likeness of the studied compounds (expressed as a binary descriptor, Lipinski) is closely related to their ability to cross the placenta (most likely by a passive diffusion mechanism). The organic sunscreens and their degradation products are likely to cross the placenta, except for very bulky and highly lipophilic 1,3,5-triazine derivatives; an avobenzone degradation product, 1,2-bis(4-tert-butylphenyl)ethane-1,2-dione; diethylamino hydroxybenzoyl hexyl benzoate; and dimerization products of sunscreens from the 4-methoxycinnamate group.
Affiliation(s)
- Anna W. Sobańska
- Department of Analytical Chemistry, Medical University of Lodz, Muszyńskiego 1, 90-151 Lodz, Poland
- Arkaprava Banerjee
- Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700032, India
- Kunal Roy
- Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700032, India
9
Qin LT, Zhang JY, Nong QY, Xu XCL, Zeng HH, Liang YP, Mo LY. Classification and regression machine learning models for predicting the combined toxicity and interactions of antibiotics and fungicides mixtures. Environmental Pollution (Barking, Essex : 1987) 2024; 360:124565. [PMID: 39033842 DOI: 10.1016/j.envpol.2024.124565] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2024] [Revised: 07/13/2024] [Accepted: 07/15/2024] [Indexed: 07/23/2024]
Abstract
Antibiotics and triazole fungicides coexist in varying concentrations in natural aquatic environments, resulting in complex mixtures. These mixtures can potentially affect aquatic ecosystems. Accurately distinguishing synergistic and antagonistic mixtures and predicting mixture toxicity are crucial for effective mixture risk assessment. We tested the toxicities of 75 binary mixtures of antibiotics and fungicides against Auxenochlorella pyrenoidosa. Both regression and classification models for these mixtures were developed using machine learning models: random forest (RF), k-nearest neighbors (KNN), and kernel k-nearest neighbors (KKNN). The KKNN model emerged as the best regression model with high values of determination coefficient (R2 = 0.977), explained variance in prediction leave-one-out (Q2LOO = 0.894), and explained variance in external prediction (Q2F1 = 0.929, Q2F2 = 0.929, and Q2F3 = 0.923). The RF model, the leading classifier, exhibited high accuracy (accuracy = 1 for the training set and 0.905 for the test set) in distinguishing the synergistic and antagonistic mixtures. These results provide crucial value for the risk assessment of mixtures.
Affiliation(s)
- Li-Tang Qin
- College of Environmental Science and Engineering, Guilin University of Technology, Guilin, 541004, China; Guangxi Key Laboratory of Environmental Pollution Control Theory and Technology, Guilin University of Technology, Guilin, 541004, China; Collaborative Innovation Center for Water Pollution Control and Water Safety in Karst Area, Guilin University of Technology, Guilin, 541004, China
- Jun-Yao Zhang
- College of Environmental Science and Engineering, Guilin University of Technology, Guilin, 541004, China
- Qiong-Yuan Nong
- College of Environmental Science and Engineering, Guilin University of Technology, Guilin, 541004, China
- Xia-Chang-Li Xu
- College of Environmental Science and Engineering, Guilin University of Technology, Guilin, 541004, China
- Hong-Hu Zeng
- College of Environmental Science and Engineering, Guilin University of Technology, Guilin, 541004, China; Guangxi Key Laboratory of Environmental Pollution Control Theory and Technology, Guilin University of Technology, Guilin, 541004, China; Collaborative Innovation Center for Water Pollution Control and Water Safety in Karst Area, Guilin University of Technology, Guilin, 541004, China
- Yan-Peng Liang
- College of Environmental Science and Engineering, Guilin University of Technology, Guilin, 541004, China; Guangxi Key Laboratory of Environmental Pollution Control Theory and Technology, Guilin University of Technology, Guilin, 541004, China; Collaborative Innovation Center for Water Pollution Control and Water Safety in Karst Area, Guilin University of Technology, Guilin, 541004, China.
- Ling-Yun Mo
- Guangxi Key Laboratory of Environmental Pollution Control Theory and Technology, Guilin University of Technology, Guilin, 541004, China; Collaborative Innovation Center for Water Pollution Control and Water Safety in Karst Area, Guilin University of Technology, Guilin, 541004, China; Technical Innovation Center of Mine Geological Environmental Restoration Engineering in Southern Karst Area, Nanjing, China.
10
Banerjee A, Kar S, Roy K, Patlewicz G, Charest N, Benfenati E, Cronin MTD. Molecular similarity in chemical informatics and predictive toxicity modeling: from quantitative read-across (q-RA) to quantitative read-across structure-activity relationship (q-RASAR) with the application of machine learning. Crit Rev Toxicol 2024; 54:659-684. [PMID: 39225123 PMCID: PMC12010357 DOI: 10.1080/10408444.2024.2386260] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2024] [Revised: 07/25/2024] [Accepted: 07/25/2024] [Indexed: 09/04/2024]
Abstract
This article aims to provide a comprehensive critical, yet readable, review of general interest to the chemistry community on molecular similarity as applied to chemical informatics and predictive modeling with a special focus on read-across (RA) and read-across structure-activity relationships (RASAR). Molecular similarity-based computational tools, such as quantitative structure-activity relationships (QSARs) and RA, are routinely used to fill the data gaps for a wide range of properties including toxicity endpoints for regulatory purposes. This review will explore the background of RA starting from how structural information has been used through to how other similarity contexts such as physicochemical, absorption, distribution, metabolism, and elimination (ADME) properties, and biological aspects are being characterized. More recent developments of RA's integration with QSAR have resulted in the emergence of novel models such as ToxRead, generalized read-across (GenRA), and quantitative RASAR (q-RASAR). Conventional QSAR techniques have been excluded from this review except where necessary for context.
Affiliation(s)
- Arkaprava Banerjee
- Department of Pharmaceutical Technology, Drug Theoretics and Cheminformatics (DTC) Laboratory, Jadavpur University, Kolkata, India
- Supratik Kar
- Department of Chemistry and Physics, Chemometrics & Molecular Modeling Laboratory, Kean University, Union, NJ, USA
- Kunal Roy
- Department of Pharmaceutical Technology, Drug Theoretics and Cheminformatics (DTC) Laboratory, Jadavpur University, Kolkata, India
- Grace Patlewicz
- Center for Computational Toxicology and Exposure, US Environmental Protection Agency, Research Triangle Park, NC, USA
- Nathaniel Charest
- Center for Computational Toxicology and Exposure, US Environmental Protection Agency, Research Triangle Park, NC, USA
- Emilio Benfenati
- Department of Environmental Health Sciences, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Milan, Italy
- Mark T. D. Cronin
- School of Pharmacy and Biomolecular Sciences, Liverpool John Moores University, Liverpool, UK
11
Kelleci Çelik F, Karaduman G. Computational modeling of air pollutants for aquatic risk: Prediction of ecological toxicity and exploring structural characteristics. Chemosphere 2024; 366:143501. [PMID: 39384138 DOI: 10.1016/j.chemosphere.2024.143501] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2024] [Revised: 09/22/2024] [Accepted: 10/05/2024] [Indexed: 10/11/2024]
Abstract
Assessing the aquatic toxicity originating from air pollutants is essential in sustaining water resources and maintaining the ecosystem's safety. Quantitative structure-activity relationship (QSAR) models provide a computational tool for predicting pollutant toxicity, facilitating the identification/evaluation of the contaminants and identifying responsible structural fragments. One-vs-all (OvA) QSAR is a tailored approach to address multi-class QSAR problems. The study aims to determine five distinct levels of aquatic hazard categories for airborne pollutants using OvA-QSAR modeling of 254 air contaminants. This QSAR analysis reveals the critical descriptors of air pollutants to target for molecular modification. Various factors, including the selection of relevant mechanistic descriptors, data quality, and outliers, determine the reliability of QSAR models. By employing feature selection and outlier identification approaches, the robustness and accuracy of our QSAR models were significantly increased, leading to more reliable predictions in chemical hazard assessment. The results revealed that models using the Random Forest algorithm performed the best based on the selected descriptors, with internal and external validation accuracy ranging from 71.90% to 97.53% and from 76.47% to 98.03%, respectively. This study indicated that the aquatic risk of air contaminants might be attributed predominantly to their sp3/sp2 carbon ratio, hydrogen-bond acceptor capability, hydrophilicity/lipophilicity, and van der Waals volumes. These structural features can be critical in developing innovative strategies to mitigate or avoid the chemicals' harmful effects. Supporting air quality improvement, this study contributes to the rapid implementation of measures to protect aquatic ecosystems affected by air pollution.
Affiliation(s)
- Feyza Kelleci Çelik
- Karamanoglu Mehmetbey University, Vocational School of Health Services, 70200, Karaman, Turkey.
- Gul Karaduman
- Karamanoglu Mehmetbey University, Department of Mathematics, 70100, Karaman, Turkey.
12
Banerjee A, Roy K. The application of chemical similarity measures in an unconventional modeling framework c-RASAR along with dimensionality reduction techniques to a representative hepatotoxicity dataset. Sci Rep 2024; 14:20812. [PMID: 39242880 PMCID: PMC11379871 DOI: 10.1038/s41598-024-71892-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2024] [Accepted: 09/02/2024] [Indexed: 09/09/2024]
Abstract
With the exponential progress in the field of cheminformatics, the conventional modeling approaches have so far been to employ supervised and unsupervised machine learning (ML) and deep learning models, utilizing the standard molecular descriptors, which represent the structural, physicochemical, and electronic properties of a particular compound. Deviating from the conventional approach, in this investigation, we have employed the classification Read-Across Structure-Activity Relationship (c-RASAR), which involves the amalgamation of the concepts of classification-based quantitative structure-activity relationship (QSAR) and Read-Across to incorporate Read-Across-derived similarity and error-based descriptors into a statistical and machine learning modeling framework. ML models developed from these RASAR descriptors use similarity-based information from the close source neighbors of a particular query compound. We have employed different classification modeling algorithms on the selected QSAR and RASAR descriptors to develop predictive models for efficient prediction of query compounds' hepatotoxicity. The predictivity of each of these models was evaluated on a large number of test set compounds. The best-performing model was also used to screen a true external data set. The concepts of explainable AI (XAI) coupled with Read-Across were used to interpret the contributions of the RASAR descriptors in the best c-RASAR model and to explain the chemical diversity in the dataset. The application of various unsupervised dimensionality reduction techniques like t-SNE and UMAP and the supervised ARKA framework showed the usefulness of the RASAR descriptors over the selected QSAR descriptors in their ability to group similar compounds, enhancing the modelability of the dataset and efficiently identifying activity cliffs. 
Furthermore, the activity cliffs were also identified from Read-Across by observing the nature of compounds constituting the nearest neighbors for a particular query compound. On comparing our simple linear c-RASAR model with the previously reported models developed using the same dataset derived from the US FDA Orange Book ( https://www.accessdata.fda.gov/scripts/cder/ob/index.cfm ), it was observed that our model is simple, reproducible, transferable, and highly predictive. The performance of the LDA c-RASAR model on the true external set supersedes that of the previously reported work. Therefore, the present simple LDA c-RASAR model can efficiently be used to predict the hepatotoxicity of query chemicals.
Affiliation(s)
- Arkaprava Banerjee
  - Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, 700 032, India
- Kunal Roy
  - Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, 700 032, India

13
Banerjee A, Roy K. How to correctly develop q-RASAR models for predictive cheminformatics. Expert Opin Drug Discov 2024; 19:1017-1022. [PMID: 38966910 DOI: 10.1080/17460441.2024.2376651] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2024] [Accepted: 07/02/2024] [Indexed: 07/06/2024]
Affiliation(s)
- Arkaprava Banerjee
  - Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India
- Kunal Roy
  - Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India

14
Srisongkram T. DeepRA: A novel deep learning-read-across framework and its application in non-sugar sweeteners mutagenicity prediction. Comput Biol Med 2024; 178:108731. [PMID: 38870727 DOI: 10.1016/j.compbiomed.2024.108731] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2024] [Revised: 05/07/2024] [Accepted: 06/08/2024] [Indexed: 06/15/2024]
Abstract
Non-sugar sweeteners (NSSs), or artificial sweeteners, have been used as food chemicals since World War II. NSSs, however, also raise concerns about their mutagenicity. Evaluating the mutagenic potential of NSSs is crucial for food safety; this step is required for every new chemical registration in the food and pharmaceutical industries. A computational assessment requires less time, less money, and fewer animals than in vivo experiments; thus, this study developed a novel computational method, called DeepRA, that combines an ensemble convolutional deep neural network with read-across algorithms to classify the mutagenicity of chemicals. The mutagenicity data were obtained from the curated Ames test data set. The DeepRA model was developed using both molecular descriptors and molecular fingerprints. The resulting DeepRA model provided accurate and reliable mutagenicity classification on an independent test set. This model was then used to examine NSS-related chemicals, enabling the evaluation of the mutagenicity of NSS-like substances. Finally, the model has been made publicly available at https://github.com/taraponglab/deepra for further use in chemical regulation and risk assessment.
Affiliation(s)
- Tarapong Srisongkram
  - Division of Pharmaceutical Chemistry, Faculty of Pharmaceutical Sciences, Khon Kaen University, 40002, Thailand

15
Banerjee A, Roy K. ARKA: a framework of dimensionality reduction for machine-learning classification modeling, risk assessment, and data gap-filling of sparse environmental toxicity data. ENVIRONMENTAL SCIENCE. PROCESSES & IMPACTS 2024; 26:991-1007. [PMID: 38743054 DOI: 10.1039/d4em00173g] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
Due to the lack of experimental toxicity data for environmental chemicals, there arises a need to fill data gaps by in silico approaches. One of the most commonly used in silico approaches for toxicity assessment of small datasets is the Quantitative Structure-Activity Relationship (QSAR), which generates predictive models for the efficient prediction of query compounds. However, the reliability of the predictions from QSARs derived from small datasets is often questionable from a statistical point of view. This is due to the presence of a larger number of descriptors as compared to the number of training compounds, which reduces the degree of freedom of the developed model. To reduce the overall prediction error for a particular QSAR model, we have proposed here the computation of the novel Arithmetic Residuals in K-groups Analysis (ARKA) descriptors. We have reduced the number of modeling descriptors in a supervised manner by partitioning them into K classes (K = 2 here) depending on the higher mean normalized values of the descriptors to a particular response class, thus preventing the loss of chemical information. A scatter plot of the data points using the values of two ARKA descriptors (ARKA_2 vs. ARKA_1) can potentially identify activity cliffs, less confident data points, and less modelable data points. We have used here five representative environmentally relevant endpoints (skin sensitization, earthworm toxicity, milk/plasma partitioning, algal toxicity, and rodent carcinogenicity of hazardous chemicals) with graded responses to which the ARKA framework was applied for classification modeling. 
On comparing the performance of the models generated using conventional QSAR descriptors and the ARKA descriptors, the prediction quality of the models derived from ARKA descriptors was found to be much better than that of the models derived from QSAR descriptors, based on decision criteria derived from multiple graded-data validation metrics. This signifies the potential of ARKA descriptors in ecotoxicological classification modeling of small data sets. Additionally, this holds true for the Read-Across approach as well, since the Read-Across predictions using ARKA descriptors outperform the predictions generated from QSAR descriptors. For the ease of users, a Java-based expert system has been developed that computes the ARKA descriptors from the input of QSAR descriptors.
Affiliation(s)
- Arkaprava Banerjee
  - Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700 032, India
- Kunal Roy
  - Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700 032, India

16
Das S, Samal A, Ojha PK. Chemometrics-driven prediction and prioritization of diverse pesticides on chickens for addressing hazardous effects on public health. JOURNAL OF HAZARDOUS MATERIALS 2024; 471:134326. [PMID: 38636230 DOI: 10.1016/j.jhazmat.2024.134326] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2024] [Revised: 04/09/2024] [Accepted: 04/15/2024] [Indexed: 04/20/2024]
Abstract
The extensive use of various pesticides in agriculture adversely affects both chickens and humans, primarily through residues in food products and environmental exposure. This study offers the first quantitative structure-toxicity relationship (QSTR) and quantitative read-across-structure toxicity relationship (q-RASTR) models encompassing the LOEL and NOEL endpoints for acute toxicity in chicken, a widely consumed protein source. The study's significance lies in the direct link between chemical toxicity in chicken, human intake, and environmental damage. Both the QSTR and the similarity-based read-across algorithms are applied concurrently to improve the predictability of the models. The q-RASTR models were generated by combining read-across derived similarity and error-based parameters with structural and physicochemical descriptors. Machine learning approaches (SVM and RR) were also employed, with the relevant hyperparameters optimized by cross-validation, and the final test set prediction results were compared. The PLS-based q-RASTR models for the NOEL and LOEL endpoints showed good statistical performance, as indicated by the external validation metrics (Q2F1: 0.762-0.844; Q2F2: 0.759-0.831; MAEtest: 0.195-0.214). The developed models were further used to screen the Pesticide Properties DataBase (PPDB) for potential toxicants in chickens. Thus, the established models can help address ecotoxicological data gaps and support the development of novel, safe, and eco-friendly pesticides.
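The external validation metrics quoted in this abstract (Q2F1, Q2F2, and MAEtest) can be illustrated with a minimal sketch. It assumes the commonly used definitions, in which both Q2 variants share the test-set sum of squared prediction errors and differ only in the reference mean (training-set mean for Q2F1, test-set mean for Q2F2). The function names and the toy data are hypothetical and are not taken from the paper.

```python
from statistics import mean


def q2_f1(y_test, y_pred, y_train_mean):
    """Q2F1: 1 - PRESS / sum of squared deviations of the test
    responses from the TRAINING-set mean."""
    press = sum((y - p) ** 2 for y, p in zip(y_test, y_pred))
    ss = sum((y - y_train_mean) ** 2 for y in y_test)
    return 1.0 - press / ss


def q2_f2(y_test, y_pred):
    """Q2F2: same numerator, but deviations are taken from the
    TEST-set mean."""
    y_test_mean = mean(y_test)
    press = sum((y - p) ** 2 for y, p in zip(y_test, y_pred))
    ss = sum((y - y_test_mean) ** 2 for y in y_test)
    return 1.0 - press / ss


def mae(y_test, y_pred):
    """Mean absolute error of the test-set predictions."""
    return mean(abs(y - p) for y, p in zip(y_test, y_pred))


# Hypothetical toy responses, for illustration only.
y_train = [1.0, 2.0, 3.0, 4.0]
y_test = [2.0, 3.0, 5.0]
y_pred = [2.5, 3.0, 4.5]

print("Q2F1:", q2_f1(y_test, y_pred, mean(y_train)))
print("Q2F2:", q2_f2(y_test, y_pred))
print("MAE :", mae(y_test, y_pred))
```

Because the two Q2 variants differ only in the denominator, a test set whose mean lies far from the training mean tends to give a higher Q2F1 than Q2F2, which is one reason both values are usually reported together.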
Affiliation(s)
- Shubha Das
  - Drug Discovery and Development Laboratory (DDD Lab), Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700032, India
- Abhisek Samal
  - Drug Discovery and Development Laboratory (DDD Lab), Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700032, India
- Probir Kumar Ojha
  - Drug Discovery and Development Laboratory (DDD Lab), Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700032, India

17
Ghosh S, Roy K. Quantitative read-across structure-activity relationship (q-RASAR): A novel approach to estimate the subchronic oral safety (NOAEL) of diverse organic chemicals in rats. Toxicology 2024; 505:153824. [PMID: 38705560 DOI: 10.1016/j.tox.2024.153824] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2024] [Revised: 04/28/2024] [Accepted: 04/29/2024] [Indexed: 05/07/2024]
Abstract
We have developed a quantitative safety prediction model for subchronic repeated doses of diverse organic chemicals on rats using the novel quantitative read-across structure-activity relationship (q-RASAR) approach, which uses similarity-based descriptors for predictive model generation. The experimental -Log (NOAEL) values have been used here as a potential indicator of oral subchronic safety on rats as it determines the maximum dose level for which no observed adverse effects of chemicals are found. A total of 186 data points of diverse organic chemicals have been used for the model generation using structural and physicochemical (0D-2D) descriptors. The read-across-derived similarity, error, and concordance measures (RASAR descriptors) have been extracted from the preliminary 0D-2D descriptors. Then, the combined pool of RASAR and the identified 0D-2D descriptors of the training set were employed to develop the final models by using the partial least squares (PLS) algorithm. The developed PLS model was rigorously validated by various internal and external validation metrics as suggested by the Organization for Economic Co-operation and Development (OECD). The final q-RASAR model is proven to be statistically sound, robust and externally predictive (R2 = 0.85, Q2LOO = 0.82 and Q2F1 = 0.94), superseding the internal as well as external predictivity of the corresponding quantitative structure-activity relationship (QSAR) model as well as previously reported subchronic repeated dose toxicity model found in the literature. In a nutshell, the q-RASAR is an effective approach that has the potential to be used as a good alternative way to improve external predictivity, interpretability, and transferability for subchronic oral safety prediction as well as ecotoxicity risk identification.
Affiliation(s)
- Shilpayan Ghosh
  - Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700032, India
- Kunal Roy
  - Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700032, India

18
Karaduman G, Kelleci Çelik F. Towards safer pesticide management: A quantitative structure-activity relationship based hazard prediction model. THE SCIENCE OF THE TOTAL ENVIRONMENT 2024; 916:170173. [PMID: 38266732 DOI: 10.1016/j.scitotenv.2024.170173] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 11/19/2023] [Revised: 01/07/2024] [Accepted: 01/13/2024] [Indexed: 01/26/2024]
Abstract
Pesticides are recognized as common environmental contaminants. The potential pesticide hazard to non-target organisms, including various mammal species, is a global concern. The global problem requires a comprehensive risk assessment. To assess the toxic effects of pesticides at the early stage, a toxicological risk analysis is conducted to determine pesticide hazard levels. World Health Organization (WHO) has established five pesticide hazard classes based on lethal dose (LD50) values to perform these assessments. In this paper, we have developed one-vs-all quantitative structure-activity relationship (OvA-QSAR) models using five machine-learning techniques with the selected optimum molecular descriptors. Descriptor selection was conducted based on correlation to evaluate the relevance and significance of individual features in our dataset. Our OvA-QSAR model was built using a dataset obtained from the WHO, covering a wide range of chemical pesticides. These models can predict the hazard category for a pesticide within the five available categories. Notably, our experiments demonstrate the outstanding performance and robustness of the Random Forest (RF) model in addressing the challenge of multi-class classification with the selected descriptors.
Affiliation(s)
- Gül Karaduman
  - Karamanoğlu Mehmetbey University, Vocational School of Health Services, 70200 Karaman, Turkey; University of Texas at Arlington, Department of Mathematics, Arlington, TX 76019-0408, USA
- Feyza Kelleci Çelik
  - Karamanoğlu Mehmetbey University, Vocational School of Health Services, 70200 Karaman, Turkey

19
Kumar V, Roy K. Protein-protein interaction network analysis for the identification of novel multi-target inhibitors and target miRNAs against Alzheimer's disease. ADVANCES IN PROTEIN CHEMISTRY AND STRUCTURAL BIOLOGY 2024; 139:405-467. [PMID: 38448142 DOI: 10.1016/bs.apcsb.2023.11.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/08/2024]
Abstract
This study presents a strategy for extracting significant gene complexes and then provides prospective therapeutics for AD. In this research, a total of 7905 reports published from 1981 to 2022 were retrieved. Following a review of all those articles, only the genetic association studies on AD were considered. Finally, there is a list of 453 Alzheimer-related genes in our dataset for network analysis. To this end, an experimentally derived protein-protein interaction (PPI) network from the String database was utilized to extract four meaningful gene complexes functionally interconnected using Cytoscape v3.9.1 software. The acquired gene complexes were subjected to an enrichment analysis using the ClueGO v2.5.9 tool to emphasize the most significant biological processes and pathways. Afterward, extracted gene complexes were used to extract the drugs related to AD from DGI v3.0 database and introduce some new drugs which may be helpful for this disease. Finally, a comprehensive network that included every gene connected to each gene complex group as well as the drug targets for each gene has been shown. Moreover, molecular docking studies have been performed with the selected compounds to identify the interaction pattern with the respective targets. Finally, we proposed a list of 62 compounds as multi-targeted directed drug-like compounds with a degree value between 2 and 5 and 30 compounds as target-specific drug-like compounds, which have not been proclaimed as AD-related drugs in prior scientific and medical investigations. Then, new drugs were suggested that can be experimentally examined for future work. In addition to this, four bipartite networks representing each group's genes and target miRNAs were established to introduce target miRNAs by using the miRWalk v3 server.
Affiliation(s)
- Vinay Kumar
  - Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India
- Kunal Roy
  - Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India

20
Banjare P, Singh R, Pandey NK, Matore BW, Murmu A, Singh J, Roy PP. In silico soil degradation and ecotoxicity analysis of veterinary pharmaceuticals on terrestrial species: first report. Toxicol Res (Camb) 2024; 13:tfae020. [PMID: 38496320 PMCID: PMC10939401 DOI: 10.1093/toxres/tfae020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2023] [Revised: 02/01/2024] [Accepted: 02/02/2024] [Indexed: 03/19/2024] Open
Abstract
To analyze the persistence and ecotoxicological impact of veterinary pharmaceuticals on different terrestrial species, different classes of veterinary pharmaceuticals (n = 37) with soil degradation (DT50) data were gathered and subjected to QSAR and q-RASAR model development. The models were developed from 2D descriptors following the Organization for Economic Co-operation and Development (OECD) guidelines, using multiple linear regression along with a genetic algorithm. All developed QSAR and q-RASAR models were statistically significant (internal: R2adj = 0.721-0.861, Q2LOO = 0.609-0.757; external: Q2Fn = 0.597-0.933, MAEext = 0.174-0.260). Further, the leverage approach to the applicability domain supported the models' reliability. The veterinary pharmaceuticals with no experimental values were classified based on their persistence level. The terrestrial toxicity of the persistent veterinary pharmaceuticals was then analyzed using Toxicity Prediction by Komputer Assisted Technology (TOPKAT) and in-house quantitative structure-toxicity relationship models to prioritize the toxic and persistent veterinary pharmaceuticals. This study will be helpful in estimating the persistence and toxicity of existing and upcoming veterinary pharmaceuticals.
Affiliation(s)
- Purusottam Banjare
  - Department of Pharmacy, Guru Ghasidas Vishwavidyalaya (A Central University), Bilaspur 495009, Chhattisgarh, India
  - Department of Pharmaceutical Chemistry, Apollo College of Pharmacy, Anjora, Durg 491001, Chhattisgarh, India
- Rekha Singh
  - Department of Pharmacy, Guru Ghasidas Vishwavidyalaya (A Central University), Bilaspur 495009, Chhattisgarh, India
- Nilesh Kumar Pandey
  - Department of Pharmacy, Guru Ghasidas Vishwavidyalaya (A Central University), Bilaspur 495009, Chhattisgarh, India
- Balaji Wamanrao Matore
  - Department of Pharmacy, Guru Ghasidas Vishwavidyalaya (A Central University), Bilaspur 495009, Chhattisgarh, India
- Anjali Murmu
  - Department of Pharmacy, Guru Ghasidas Vishwavidyalaya (A Central University), Bilaspur 495009, Chhattisgarh, India
- Jagadish Singh
  - Department of Pharmacy, Guru Ghasidas Vishwavidyalaya (A Central University), Bilaspur 495009, Chhattisgarh, India
- Partha Pratim Roy
  - Department of Pharmacy, Guru Ghasidas Vishwavidyalaya (A Central University), Bilaspur 495009, Chhattisgarh, India

21
Habiballah S, Heath LS, Reisfeld B. A deep-learning approach for identifying prospective chemical hazards. Toxicology 2024; 501:153708. [PMID: 38104655 DOI: 10.1016/j.tox.2023.153708] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2023] [Revised: 12/11/2023] [Accepted: 12/13/2023] [Indexed: 12/19/2023]
Abstract
With the aim of helping to set safe exposure limits for the general population, various techniques have been implemented to conduct risk assessments for chemicals and other environmental stressors; however, none of these tools facilitate the identification of completely new chemicals that are likely hazardous and elicit an adverse biological effect. Here, we detail a novel in silico, deep-learning framework that is designed to systematically generate structures for new chemical compounds that are predicted to be chemical hazards. To assess the utility of the framework, we applied the tool to four endpoints related to environmental toxicants and their impacts on human and animal health: (i) toxicity to honeybees, (ii) immunotoxicity, (iii) endocrine disruption via ER-α antagonism, and (iv) mutagenicity. In addition, we characterized the predicted potency of these compounds and examined their structural relationship to existing chemicals of concern. As part of the array of emerging new approach methodologies (NAMs), we anticipate that such a framework will be a significant asset to risk assessors and other environmental scientists when planning and forecasting. Though not in the scope of the present study, we expect that the methodology detailed here could also be useful in the de novo design of more environmentally-friendly industrial chemicals.
Affiliation(s)
- Sohaib Habiballah
  - Department of Chemical and Biological Engineering, Colorado State University, Fort Collins, CO 80523-1370, USA
- Lenwood S Heath
  - Department of Computer Science, Virginia Tech, Blacksburg, VA 24061-0106, USA
- Brad Reisfeld
  - Department of Chemical and Biological Engineering, Colorado State University, Fort Collins, CO 80523-1370, USA; Colorado School of Public Health, Colorado State University, Fort Collins, CO 80523-1612, USA

22
Suciu I, Delp J, Gutbier S, Suess J, Henschke L, Celardo I, Mayer TU, Amelio I, Leist M. Definition of the Neurotoxicity-Associated Metabolic Signature Triggered by Berberine and Other Respiratory Chain Inhibitors. Antioxidants (Basel) 2023; 13:49. [PMID: 38247474 PMCID: PMC10812665 DOI: 10.3390/antiox13010049] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/24/2023] [Revised: 12/06/2023] [Accepted: 12/19/2023] [Indexed: 01/23/2024] Open
Abstract
To characterize the hits from a phenotypic neurotoxicity screen, we obtained transcriptomics data for valinomycin, diethylstilbestrol, colchicine, rotenone, 1-methyl-4-phenylpyridinium (MPP), carbaryl and berberine (Ber). For all compounds, the concentration triggering neurite degeneration correlated with the onset of gene expression changes. The mechanistically diverse toxicants caused similar patterns of gene regulation: the responses were dominated by cell de-differentiation and a triggering of canonical stress response pathways driven by ATF4 and NRF2. To obtain more detailed and specific information on the modes-of-action, the effects on energy metabolism (respiration and glycolysis) were measured. Ber, rotenone and MPP inhibited the mitochondrial respiratory chain and they shared complex I as the target. This group of toxicants was further evaluated by metabolomics under experimental conditions that did not deplete ATP. Ber (204 changed metabolites) showed similar effects as MPP and rotenone. The overall metabolic situation was characterized by oxidative stress, an over-abundance of NADH (>1000% increase) and a re-routing of metabolism in order to dispose of the nitrogen resulting from increased amino acid turnover. This unique overall pattern led to the accumulation of metabolites known as biomarkers of neurodegeneration (saccharopine, aminoadipate and branched-chain ketoacids). These findings suggest that neurotoxicity of mitochondrial inhibitors may result from an ensemble of metabolic changes rather than from a simple ATP depletion. The combi-omics approach used here provided richer and more specific MoA data than the more common transcriptomics analysis alone. As Ber, a human drug and food supplement, mimicked closely the mode-of-action of known neurotoxicants, its potential hazard requires further investigation.
Affiliation(s)
- Ilinca Suciu
  - In Vitro Toxicology and Biomedicine, Department Inaugurated by the Doerenkamp-Zbinden Foundation, University of Konstanz, 78464 Konstanz, Germany
  - Graduate School of Chemical Biology, University of Konstanz, 78464 Konstanz, Germany
- Johannes Delp
  - In Vitro Toxicology and Biomedicine, Department Inaugurated by the Doerenkamp-Zbinden Foundation, University of Konstanz, 78464 Konstanz, Germany
- Simon Gutbier
  - In Vitro Toxicology and Biomedicine, Department Inaugurated by the Doerenkamp-Zbinden Foundation, University of Konstanz, 78464 Konstanz, Germany
- Julian Suess
  - In Vitro Toxicology and Biomedicine, Department Inaugurated by the Doerenkamp-Zbinden Foundation, University of Konstanz, 78464 Konstanz, Germany
- Lars Henschke
  - Graduate School of Chemical Biology, University of Konstanz, 78464 Konstanz, Germany
  - Department of Molecular Genetics, University of Konstanz, 78464 Konstanz, Germany
- Ivana Celardo
  - In Vitro Toxicology and Biomedicine, Department Inaugurated by the Doerenkamp-Zbinden Foundation, University of Konstanz, 78464 Konstanz, Germany
- Thomas U. Mayer
  - Department of Molecular Genetics, University of Konstanz, 78464 Konstanz, Germany
- Ivano Amelio
  - Division for Systems Toxicology, Department of Biology, University of Konstanz, 78464 Konstanz, Germany
- Marcel Leist
  - In Vitro Toxicology and Biomedicine, Department Inaugurated by the Doerenkamp-Zbinden Foundation, University of Konstanz, 78464 Konstanz, Germany

23
Srisongkram T. Ensemble Quantitative Read-Across Structure-Activity Relationship Algorithm for Predicting Skin Cytotoxicity. Chem Res Toxicol 2023; 36:1961-1972. [PMID: 38047785 DOI: 10.1021/acs.chemrestox.3c00238] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 12/05/2023]
Abstract
Read-across (RA) and quantitative structure-activity relationship (QSAR) are two alternative methods commonly used to fill data gaps in chemical registrations. These approaches use physicochemical properties or molecular fingerprints of source substances to predict the properties of unknown substances that have similar chemical structures or physicochemical properties. Research on RA and QSAR is essential to minimize the time, money, and animal testing needed to determine biological properties that are not currently known. This study developed a stacked ensemble quantitative read-across structure-activity relationship algorithm (enQRASAR) for predicting skin irritation toxicity based on negative log cell viability inhibition concentration at 50% (pIC50) against skin keratinocytes as the end point. The goodness-of-fit and predictability of this algorithm were validated using leave-one-out cross-validation and external test data sets. The results obtained were statistically reliable in terms of goodness-of-fit, robustness, and predictability metrics. Additionally, the developed model demonstrated a low prediction error when predicting FDA-approved drugs. These results confirm that the enQRASAR algorithm can be used to predict skin cytotoxicity of chemicals. Therefore, this model has been made publicly available to further facilitate toxicity predictions of unknown compounds in chemical registrations.
Affiliation(s)
- Tarapong Srisongkram
  - Division of Pharmaceutical Chemistry, Faculty of Pharmaceutical Sciences, Khon Kaen University, Khon Kaen 40000, Thailand

24
Banerjee A, Roy K. Read-across-based intelligent learning: development of a global q-RASAR model for the efficient quantitative predictions of skin sensitization potential of diverse organic chemicals. ENVIRONMENTAL SCIENCE. PROCESSES & IMPACTS 2023; 25:1626-1644. [PMID: 37682520 DOI: 10.1039/d3em00322a] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 09/09/2023]
Abstract
Environmental chemicals and contaminants cause a wide array of harmful implications to terrestrial and aquatic life which ranges from skin sensitization to acute oral toxicity. The current study aims to assess the quantitative skin sensitization potential of a large set of industrial and environmental chemicals acting through different mechanisms using the novel quantitative Read-Across Structure-Activity Relationship (q-RASAR) approach. Based on the identified important set of structural and physicochemical features, Read-Across-based hyperparameters were optimized using the training set compounds followed by the calculation of similarity and error-based RASAR descriptors. Data fusion, further feature selection, and removal of prediction confidence outliers were performed to generate a partial least squares (PLS) q-RASAR model, followed by the application of various Machine Learning (ML) tools to check the quality of predictions. The PLS model was found to be the best among different models. A simple user-friendly Java-based software tool was developed based on the PLS model, which efficiently predicts the toxicity value(s) of query compound(s) along with their status of Applicability Domain (AD) in terms of leverage values. This model has been developed using structurally diverse compounds and is expected to predict efficiently and quantitatively the skin sensitization potential of environmental chemicals to estimate their occupational and health hazards.
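The read-across step mentioned in this abstract (similarity to close source compounds, followed by similarity- and error-based descriptors) can be illustrated with a minimal sketch. It assumes a Laplacian kernel similarity, exp(-gamma * L1 distance), and a similarity-weighted average over the k most similar training compounds. The function names, dictionary keys, toy data, and the values of k and gamma are all hypothetical; they are not the descriptor definitions or settings of the authors' tool.

```python
import math


def laplacian_similarity(a, b, gamma=1.0):
    """Laplacian kernel similarity: exp(-gamma * L1 distance)."""
    l1 = sum(abs(x - y) for x, y in zip(a, b))
    return math.exp(-gamma * l1)


def read_across(query, sources, responses, k=3, gamma=1.0):
    """Similarity-weighted read-across from the k closest source compounds.

    Returns the weighted prediction plus a few illustrative
    similarity/dispersion measures (hypothetical names, not the
    official RASAR descriptor set).
    """
    scored = sorted(
        ((laplacian_similarity(query, s, gamma), y)
         for s, y in zip(sources, responses)),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    total = sum(sim for sim, _ in scored)
    prediction = sum(sim * y for sim, y in scored) / total
    # Similarity-weighted spread of the neighbours' responses.
    weighted_sd = math.sqrt(
        sum(sim * (y - prediction) ** 2 for sim, y in scored) / total
    )
    return {
        "prediction": prediction,
        "mean_similarity": total / len(scored),
        "max_similarity": scored[0][0],
        "weighted_sd": weighted_sd,
    }


# Hypothetical descriptor vectors and responses, for illustration only.
sources = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [5.0, 5.0]]
responses = [1.0, 2.0, 3.0, 10.0]
result = read_across([0.0, 0.0], sources, responses, k=2)
print(result)
```

In a RASAR-style workflow, values such as these would be computed for every training and test compound (each training compound using the remaining training compounds as sources) and then used as additional model descriptors; that step is omitted here.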
Affiliation(s)
- Arkaprava Banerjee
  - Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700032, India
- Kunal Roy
  - Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700032, India

25
Chatterjee M, Banerjee A, Tosi S, Carnesecchi E, Benfenati E, Roy K. Machine learning - based q-RASAR modeling to predict acute contact toxicity of binary organic pesticide mixtures in honey bees. JOURNAL OF HAZARDOUS MATERIALS 2023; 460:132358. [PMID: 37634379 DOI: 10.1016/j.jhazmat.2023.132358] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Received: 12/22/2022] [Revised: 08/02/2023] [Accepted: 08/20/2023] [Indexed: 08/29/2023]
Abstract
We report here a quantitative read-across structure-activity relationship (q-RASAR) model for the prediction of binary mixture toxicity (acute contact toxicity) in honey bees. Both the quantitative structure-activity relationship (QSAR) and the similarity-based read-across algorithms are used simultaneously to enhance the predictability of the model. Several similarity and error-based parameters, obtained from the read-across prediction tool, have been combined with the structural and physicochemical descriptors to develop the final q-RASAR model. The calculated statistical and validation metrics indicate the goodness-of-fit, robustness, and good predictability of the partial least squares (PLS) regression model. Machine learning algorithms like ridge regression, linear support vector machine (SVM), and non-linear SVM have been used to further enhance the predictability of the q-RASAR model. The prediction quality of the q-RASAR models exceeds that of the previously reported quasi-SMILES-based QSAR model in terms of external correlation coefficient (Q2F1 SVM q-RASAR: 0.935 vs. Q2VLD QSAR: 0.89). In this research, the toxicity values of several new untested binary mixtures have been predicted with the new models, and the reliability of the PLS predictions has been validated by the prediction reliability indicator tool. The q-RASAR approach can be used as a reliable and integrative complement to the conventional experimental approaches of pesticide mixture risk assessment.
Affiliation(s)
- Mainak Chatterjee
  - Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700032, India
- Arkaprava Banerjee
  - Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700032, India
- Simone Tosi
  - Department of Agricultural, Forest, and Food Sciences, University of Turin, Turin, Italy
- Emilio Benfenati
  - Department of Environmental Health Sciences, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, via Mario Negri 2, 20156 Milano, Italy
- Kunal Roy
  - Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700032, India
|
26
|
Banerjee A, Roy K. Prediction-Inspired Intelligent Training for the Development of Classification Read-across Structure-Activity Relationship (c-RASAR) Models for Organic Skin Sensitizers: Assessment of Classification Error Rate from Novel Similarity Coefficients. Chem Res Toxicol 2023; 36:1518-1531. [PMID: 37584642 DOI: 10.1021/acs.chemrestox.3c00155] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Indexed: 08/17/2023]
Abstract
The advancements in the field of cheminformatics have led to a reduction in animal testing to estimate the activity, property, and toxicity of query chemicals. Read-across structure-activity relationship (RASAR) is an emerging concept that utilizes various similarity functions derived from chemical information to develop highly predictive models. Unlike quantitative structure-activity relationship (QSAR) models, RASAR descriptors of a query compound are computed from its close congeners instead of the compound itself, thus targeting predictions in the model training phase. The objective of the present study is not to propose new QSAR models for skin sensitization but to demonstrate the enhancement in the quality of predictions of the skin-sensitizing potential of organic compounds by developing classification-based RASAR (c-RASAR) models. A diverse, previously curated data set was collected from the literature for which 2D descriptors were computed. The extracted essential features were then used to develop a classification-based linear discriminant analysis (LDA) QSAR model. Furthermore, from the read-across-based predictions, RASAR descriptors were calculated using the basic settings of the hyperparameters for the Laplacian Kernel-based optimum similarity measure. After feature selection, an LDA c-RASAR model was developed, which superseded the prediction quality of the LDA-QSAR model. Various other combinations of RASAR descriptors were also taken to develop additional c-RASAR models, all showing better prediction quality than the LDA QSAR model while using a lower number of descriptors. Various other machine learning c-RASAR models were also developed for comparison purposes. In this work, we have proposed and analyzed three new similarity metrics: gm_class, sm1, and sm2. The first one is an indicator variable used to generate a simple univariate c-RASAR model with good prediction ability, while the remaining two are similarity indices used to analyze possible activity cliffs in the training and test sets and are believed to play an important role in the modelability analysis of data sets.
Affiliation(s)
- Arkaprava Banerjee
  - Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700 032, India
- Kunal Roy
  - Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700 032, India
|
27
|
Nguyen HD. In silico identification of novel heterocyclic compounds combats Alzheimer's disease through inhibition of butyrylcholinesterase enzymatic activity. J Biomol Struct Dyn 2023; 42:10890-10910. [PMID: 37723904 DOI: 10.1080/07391102.2023.2259482] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/31/2023] [Accepted: 09/09/2023] [Indexed: 09/20/2023]
Abstract
Increasing evidence indicates that heterocyclic molecules possess inhibitory properties against butyrylcholinesterase (BChE) enzymatic activity, which is a potential therapeutic target for Alzheimer's disease (AD). Thus, this study aimed to further evaluate the relationship between heterocyclic molecules and their biological activities. A dataset of 38 selective and potent heterocyclic compounds, with -log(half-maximal inhibitory concentration) (pIC50) values ranging from 8.02 to 10.05, was used to construct a quantitative structure-activity relationship (QSAR) study, including Bayesian model average (BMA), artificial neural network (ANN), multiple nonlinear regression (MNLR), and multiple linear regression (MLR) models. Four models met statistical acceptance in internal and external validation. The ANN model was superior to the other models in predicting the pIC50 of the outcome. The descriptors put into the models were found to be comparable with the target-ligand complex X-ray structures, making these models interpretable. Three selected molecules possess drug-like properties (pIC50 values ranged from 9.19 to 9.54). The docking score between candidates and the BChE receptor (RCSB ID 6EYF) ranged from -8.4 to -9.0 kcal/mol. Remarkably, the pharmacokinetics, biological activities, molecular dynamics, and physicochemical properties of compound 18 (C20H22N4O, pIC50 value = 9.33, oxadiazole derivative group) support its protective effects in AD treatment due to its non-toxic, non-carcinogenic, and cholinergic nature, its capability to penetrate the blood-brain barrier, and its high gastrointestinal absorption. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Hai Duc Nguyen
  - Department of Pharmacy, College of Pharmacy, Research Institute of Life and Pharmaceutical Sciences, Sunchon National University, Suncheon, South Korea
|
28
|
Klambauer G, Clevert DA, Shah I, Benfenati E, Tetko IV. Introduction to the Special Issue: AI Meets Toxicology. Chem Res Toxicol 2023; 36:1163-1167. [PMID: 37599584 DOI: 10.1021/acs.chemrestox.3c00217] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 08/22/2023]
Affiliation(s)
- Günter Klambauer
  - ELLIS Unit Linz, LIT AI Lab & Institute for Machine Learning, Johannes Kepler University Linz, Altenbergerstraße 69, Linz 4040, Austria
- Djork-Arné Clevert
  - Machine Learning Research, Pfizer Worldwide Research Development and Medical, Linkstr. 10, Berlin 10785, Germany
- Imran Shah
  - Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina 27711, United States
- Emilio Benfenati
  - Department of Environmental Health Sciences, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Milano 20156, Italy
- Igor V Tetko
  - Institute of Structural Biology, Molecular Targets and Therapeutics Center, Helmholtz Munich - Deutsches Forschungszentrum für Gesundheit und Umwelt (GmbH), 85764 Neuherberg, Germany
  - BIGCHEM GmbH, Valerystr. 49, 85716 Unterschleißheim, Germany
|
29
|
Tan Z, Zhao Y, Zhou T, Lin K. Hi-MGT: A hybrid molecule graph transformer for toxicity identification. JOURNAL OF HAZARDOUS MATERIALS 2023; 457:131808. [PMID: 37307723 DOI: 10.1016/j.jhazmat.2023.131808] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/02/2023] [Revised: 05/18/2023] [Accepted: 06/07/2023] [Indexed: 06/14/2023]
Abstract
Conventional toxicity testing methods that rely on animal experimentation are resource-intensive, time-consuming, and ethically controversial. Therefore, the development of alternative non-animal testing approaches is crucial. This study proposes a novel hybrid graph transformer architecture, termed Hi-MGT, for toxicity identification. An innovative aggregation strategy, referred to as the GNN-GT combination, enables Hi-MGT to simultaneously and comprehensively aggregate local and global structural information of molecules, thus extracting more informative toxicity information hidden in molecule graphs. The results show that the model outperforms current baseline CML and DL models on a diverse range of toxicity endpoints and is even comparable to large-scale pretrained GNNs with geometry enhancement. Additionally, the impact of hyperparameters on model performance is investigated, and a systematic ablation study is conducted to demonstrate the effectiveness of the GNN-GT combination. Moreover, this study provides valuable insights into the learning process on molecules and proposes a novel similarity-based method for toxic site detection, which could potentially facilitate toxicity identification and analysis. Overall, the Hi-MGT model represents a significant advancement in the development of alternative non-animal testing approaches for toxicity identification, with promising implications for enhancing human safety in the use of chemical compounds.
Affiliation(s)
- Zhichao Tan
  - The State Key Laboratory of Pollution Control and Resource Reuse, School of Environmental Science and Engineering, Tongji University, 1239 Siping Road, Shanghai 200092, China; Shanghai Institute of Pollution Control and Ecological Security, 1515 North Zhongshan Rd. (No. 2), Shanghai 200092, China
- Youcai Zhao
  - The State Key Laboratory of Pollution Control and Resource Reuse, School of Environmental Science and Engineering, Tongji University, 1239 Siping Road, Shanghai 200092, China; Shanghai Institute of Pollution Control and Ecological Security, 1515 North Zhongshan Rd. (No. 2), Shanghai 200092, China
- Tao Zhou
  - The State Key Laboratory of Pollution Control and Resource Reuse, School of Environmental Science and Engineering, Tongji University, 1239 Siping Road, Shanghai 200092, China; Shanghai Institute of Pollution Control and Ecological Security, 1515 North Zhongshan Rd. (No. 2), Shanghai 200092, China
- Kunsen Lin
  - The State Key Laboratory of Pollution Control and Resource Reuse, School of Environmental Science and Engineering, Tongji University, 1239 Siping Road, Shanghai 200092, China; Shanghai Institute of Pollution Control and Ecological Security, 1515 North Zhongshan Rd. (No. 2), Shanghai 200092, China
|
30
|
Erturan AM, Karaduman G, Durmaz H. Machine learning-based approach for efficient prediction of toxicity of chemical gases using feature selection. JOURNAL OF HAZARDOUS MATERIALS 2023; 455:131616. [PMID: 37201279 DOI: 10.1016/j.jhazmat.2023.131616] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2023] [Revised: 05/09/2023] [Accepted: 05/10/2023] [Indexed: 05/20/2023]
Abstract
Toxic gases can be fatal as they damage many living tissues, especially the nervous and respiratory systems. They can cause permanent damage for many years by harming the environment and living organisms. They can also cause mass deaths when used as chemical weapons. These chemical agents consist of organophosphates, namely ester, amide, or thiol derivatives of phosphoric, phosphonic, or phosphinic acids, or can be synthesized independently. In this study, machine learning models were used to predict the toxicity of chemical gases. A total of 144 toxic and non-toxic gases were identified according to the United States Environmental Protection Agency, the Occupational Safety and Health Administration, and the Centers for Disease Control and Prevention. Six machine-learning models were used to predict the toxicity of these chemical gases. The performance of the models was verified through internal and external validation. The results showed that the internal validation accuracy was 86.96% with the Relief-J48 algorithm, and the external validation accuracy was 89.65% with the Bayes Net algorithm. Our results reveal that identifying the toxicity of existing and potential chemicals is essential for the early detection of these chemicals in nature.
Affiliation(s)
- Ahmet Murat Erturan
  - Konya Technical University, Department of Electrical and Electronics Engineering, Konya, Republic of Turkey
- Gül Karaduman
  - Karamanoğlu Mehmetbey University, Vocational School of Health Services, Karaman, Republic of Turkey
- Habibe Durmaz
  - Karamanoğlu Mehmetbey University, Department of Electrical and Electronics Engineering, Karaman, Republic of Turkey
|