1. Le NQK, Tran TX, Nguyen PA, Ho TT, Nguyen VN. Recent progress in machine learning approaches for predicting carcinogenicity in drug development. Expert Opin Drug Metab Toxicol 2024:1-8. [PMID: 38742542 DOI: 10.1080/17425255.2024.2356162] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2024] [Accepted: 05/13/2024] [Indexed: 05/16/2024]
Abstract
INTRODUCTION: This review explores the transformative impact of machine learning (ML) on carcinogenicity prediction within drug development. It discusses the historical context and recent advancements, emphasizing the significance of ML methodologies in overcoming challenges related to data interpretation, ethical considerations, and regulatory acceptance. AREAS COVERED: The review comprehensively examines the integration of ML, deep learning, and diverse artificial intelligence (AI) approaches in various aspects of drug development safety assessments. It explores applications ranging from early-phase compound screening to clinical trial optimization, highlighting the versatility of ML in enhancing predictive accuracy and efficiency. EXPERT OPINION: Through the analysis of traditional approaches such as in vivo rodent bioassays and in vitro assays, the review underscores the limitations and resource intensity associated with these methods. It provides expert insights into how ML offers innovative solutions to address these challenges, revolutionizing safety assessments in drug development.
Affiliation(s)
- Nguyen Quoc Khanh Le
- Professional Master Program in Artificial Intelligence in Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan
- AIBioMed Research Group, Taipei Medical University, Taipei, Taiwan
- Research Center for Artificial Intelligence in Medicine, Taipei Medical University, Taipei, Taiwan
- Translational Imaging Research Center, Taipei Medical University Hospital, Taipei, Taiwan
- Thi-Xuan Tran
- University of Economics and Business Administration, Thai Nguyen University, Thai Nguyen, Vietnam
- Phung-Anh Nguyen
- Clinical Data Center, Office of Data Science, Taipei Medical University, Taipei, Taiwan
- Clinical Big Data Research Center, Taipei Medical University Hospital, Taipei Medical University, Taipei, Taiwan
- Trang-Thi Ho
- Department of Computer Science and Information Engineering, TamKang University, New Taipei, Taiwan
- Van-Nui Nguyen
- University of Information and Communication Technology, Thai Nguyen University, Thai Nguyen, Vietnam
2. Guo W, Liu J, Dong F, Song M, Li Z, Khan MKH, Patterson TA, Hong H. Review of machine learning and deep learning models for toxicity prediction. Exp Biol Med (Maywood) 2023; 248:1952-1973. [PMID: 38057999 PMCID: PMC10798180 DOI: 10.1177/15353702231209421] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 12/08/2023] Open
Abstract
The ever-increasing number of chemicals has raised public concerns due to their adverse effects on human health and the environment. To protect public health and the environment, it is critical to assess the toxicity of these chemicals. Traditional in vitro and in vivo toxicity assays are complicated, costly, and time-consuming and may face ethical issues. These constraints raise the need for alternative methods for assessing the toxicity of chemicals. Recently, due to the advancement of machine learning algorithms and the increase in computational power, many toxicity prediction models have been developed using various machine learning and deep learning algorithms such as support vector machine, random forest, k-nearest neighbors, ensemble learning, and deep neural network. This review summarizes the machine learning- and deep learning-based toxicity prediction models developed in recent years. Support vector machine and random forest are the most popular machine learning algorithms, and hepatotoxicity, cardiotoxicity, and carcinogenicity are the most frequently modeled toxicity endpoints in predictive toxicology. It is known that datasets impact model performance. The quality of the datasets used to develop toxicity prediction models with machine learning and deep learning is vital to the performance of the resulting models. Different toxicity assignments for the same chemicals have been observed among datasets for the same type of toxicity, indicating that benchmark datasets are needed to develop reliable toxicity prediction models using machine learning and deep learning algorithms. This review provides insights into current machine learning models in predictive toxicology, which are expected to promote the development and application of toxicity prediction models in the future.
Affiliation(s)
- Wenjing Guo
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
- Jie Liu
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
- Fan Dong
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
- Meng Song
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
- Zoe Li
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
- Md Kamrul Hasan Khan
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
- Tucker A Patterson
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
- Huixiao Hong
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
3. Hao N, Sun P, Zhao W, Li X. Application of a developed triple-classification machine learning model for carcinogenic prediction of hazardous organic chemicals to the US, EU, and WHO based on Chinese database. Ecotoxicology and Environmental Safety 2023; 255:114806. [PMID: 36948010 DOI: 10.1016/j.ecoenv.2023.114806] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2022] [Revised: 03/04/2023] [Accepted: 03/16/2023] [Indexed: 06/18/2023]
Abstract
Cancer, the second largest human disease, has become a major public health problem. Predicting the carcinogenicity of chemicals before their synthesis is crucial. In this paper, seven machine learning algorithms (i.e., Random Forest (RF), Logistic Regression (LR), Support Vector Machines (SVM), Complement Naive Bayes (CNB), K-Nearest Neighbor (KNN), XGBoost, and Multilayer Perceptron (MLP)) were used to construct the carcinogenicity triple classification prediction (TCP) model (i.e., 1A, 1B, Category 2). A total of 1444 descriptors of 118 hazardous organic chemicals were calculated by Discovery Studio 2020, Sybyl X-2.0 and PaDEL-Descriptor software. The constructed carcinogenicity TCP model was evaluated through five model evaluation indicators (i.e., Accuracy, Precision, Recall, F1 Score and AUC). The evaluation results show that the Accuracy, Precision, Recall, F1 Score and AUC indicators all meet requirements (greater than 0.6). The accuracy of the RF, LR, XGBoost, and MLP models for predicting carcinogenicity of Category 2 is 91.67%, 79.17%, 100%, and 100%, respectively. In addition, the machine learning model constructed in this study has potential for error correction. Taking the XGBoost model as an example, the predicted carcinogenicity level of 1,2,3-Trichloropropane (96-18-4) is Category 2, whereas the actual carcinogenicity level is 1B. However, the difference between Category 2 and 1B is only 0.004, indicating that XGBoost is an optimum model among the seven constructed machine learning models. Besides, the results showed that functional groups such as chlorine and the benzene ring might influence the prediction of carcinogenic classification. Therefore, considering the functional group characteristics of chemicals before constructing a carcinogenicity prediction model of organic chemicals is recommended. The carcinogenicity of the organic chemicals predicted by the optimum machine learning model (i.e., XGBoost) was also evaluated and verified by toxicokinetics.
The RF and XGBoost TCP models constructed in this paper can be used for carcinogenicity detection before synthesizing new organic substances. The work also provides technical support for the subsequent management of organic chemicals.
Affiliation(s)
- Ning Hao
- College of New Energy and Environment, Jilin University, Changchun 130012, China
- Peixuan Sun
- College of New Energy and Environment, Jilin University, Changchun 130012, China
- Wenjin Zhao
- College of New Energy and Environment, Jilin University, Changchun 130012, China
- Xixi Li
- State Environmental Protection Key Laboratory of Ecological Effect and Risk Assessment of Chemicals, Chinese Research Academy of Environmental Sciences, Beijing 100012, China; Northern Region Persistent Organic Pollution Control (NRPOP) Laboratory, Faculty of Engineering and Applied Science, Memorial University, St. John's, A1B 3X5, Canada
4. Limbu S, Dakshanamurthy S. Predicting Chemical Carcinogens Using a Hybrid Neural Network Deep Learning Method. Sensors (Basel, Switzerland) 2022; 22:s22218185. [PMID: 36365881 PMCID: PMC9653664 DOI: 10.3390/s22218185] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/13/2022] [Revised: 10/11/2022] [Accepted: 10/23/2022] [Indexed: 05/28/2023]
Abstract
Determining environmental chemical carcinogenicity is urgently needed as humans are increasingly exposed to these chemicals. In this study, we developed a hybrid neural network (HNN) method called HNN-Cancer to predict potential carcinogens of real-life chemicals. The HNN-Cancer included a new SMILES feature representation method by modifying our previous 3D array representation of 1D SMILES simulated by the convolutional neural network (CNN). We developed binary classification, multiclass classification, and regression models based on diverse non-congeneric chemicals. Along with the HNN-Cancer model, we developed models based on the random forest (RF), bootstrap aggregating (Bagging), and adaptive boosting (AdaBoost) methods for binary and multiclass classification. We developed regression models using HNN-Cancer, RF, support vector regressor (SVR), gradient boosting (GB), kernel ridge (KR), decision tree with AdaBoost (DT), KNeighbors (KN), and a consensus method. The performance of the models for all classifications was assessed using various statistical metrics. The accuracy of the HNN-Cancer, RF, and Bagging models was 74%, and their AUC was ~0.81, for binary classification models developed with 7994 chemicals. The sensitivity was 79.5% and the specificity was 67.3% for HNN-Cancer, which outperforms the other methods. In the case of multiclass classification models with 1618 chemicals, we obtained an optimal accuracy of 70% with an AUC of 0.7 for HNN-Cancer, RF, Bagging, and AdaBoost. In the case of regression models, the correlation coefficient (R) was around 0.62 for HNN-Cancer and RF, higher than for the SVM, GB, KR, DTBoost, and NN machine learning methods. Overall, the HNN-Cancer performed better for the majority of the known carcinogen experimental datasets. Further, the predictive performance of HNN-Cancer on diverse chemicals is comparable to the literature-reported models that included similar and less diverse molecules.
Our HNN-Cancer could be used in identifying potentially carcinogenic chemicals for a wide variety of chemical classes.
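The accuracy, sensitivity, and specificity figures quoted in the abstract of entry 4 follow the usual confusion-matrix definitions. The following is a minimal sketch of those definitions only, not the authors' evaluation code; the function name `binary_metrics` and the toy label vectors are hypothetical.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, sensitivity, and specificity for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # carcinogens correctly flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-carcinogens correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Sensitivity: fraction of true carcinogens that were predicted positive.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Specificity: fraction of true non-carcinogens that were predicted negative.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Hypothetical labels (1 = carcinogen, 0 = non-carcinogen), for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Reporting sensitivity and specificity alongside accuracy matters here because carcinogenicity datasets are often imbalanced, so accuracy alone can hide a model that misses most positives.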
5. Ozbuyukkaya G, Parker RS, Veser G. Determining robust reaction kinetics from limited data. AIChE J 2021. [DOI: 10.1002/aic.17538] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Affiliation(s)
- Gizem Ozbuyukkaya
- Department of Chemical Engineering, Swanson School of Engineering, and Center for Energy, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
- Robert S. Parker
- Department of Chemical Engineering, Swanson School of Engineering, and Center for Energy, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
- Goetz Veser
- Department of Chemical Engineering, Swanson School of Engineering, and Center for Energy, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
6. Jiao Z, Hu P, Xu H, Wang Q. Machine Learning and Deep Learning in Chemical Health and Safety: A Systematic Review of Techniques and Applications. ACS Chemical Health & Safety 2020. [DOI: 10.1021/acs.chas.0c00075] [Citation(s) in RCA: 61] [Impact Index Per Article: 15.3] [Indexed: 12/17/2022]
Affiliation(s)
- Zeren Jiao
- Mary Kay O’Connor Process Safety Center, Artie McFerrin Department of Chemical Engineering, Texas A&M University, College Station, Texas 77843-3122, United States
- Pingfan Hu
- Mary Kay O’Connor Process Safety Center, Artie McFerrin Department of Chemical Engineering, Texas A&M University, College Station, Texas 77843-3122, United States
- Hongfei Xu
- Mary Kay O’Connor Process Safety Center, Artie McFerrin Department of Chemical Engineering, Texas A&M University, College Station, Texas 77843-3122, United States
- Qingsheng Wang
- Mary Kay O’Connor Process Safety Center, Artie McFerrin Department of Chemical Engineering, Texas A&M University, College Station, Texas 77843-3122, United States
7. Gupta A, Kahali B. Machine learning-based cognitive impairment classification with optimal combination of neuropsychological tests. Alzheimer's & Dementia (New York, N. Y.) 2020; 6:e12049. [PMID: 32699817 PMCID: PMC7369403 DOI: 10.1002/trc2.12049] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 11/04/2019] [Revised: 01/16/1800] [Accepted: 01/28/2020] [Indexed: 11/09/2022]
Abstract
INTRODUCTION: An extensive battery of neuropsychological tests is currently used to classify individuals as healthy (HV), mild cognitively impaired (MCI), and with Alzheimer's disease (AD). We used machine learning models for effective cognitive impairment classification and optimized the number of tests for expeditious and inexpensive implementation. METHODS: Using random forests (RF) and support vector machine, we classified cognitive impairment in multi-class data sets from the Rush Religious Orders Study Memory and Aging Project and the National Alzheimer's Coordinating Center. We applied Fisher's linear discriminant and iteratively assessed the importance of each test for feature selection. RESULTS: RF has the best accuracy, with increased sensitivity and specificity, in this first-ever multi-class classification of HV, MCI, and AD. Moreover, a subset of six to eight tests shows classification accuracy equivalent to that of the entire battery of tests. DISCUSSION: A fully automated feature selection approach reveals that six to eight tests comprising episodic and semantic memory, perceptual orientation, and executive functioning can accurately classify cognitive status, ensuring minimal subject burden.
Affiliation(s)
- Abhay Gupta
- Undergraduate Program (Physics), Indian Institute of Science, Bengaluru, Karnataka, India
- Bratati Kahali
- Centre for Brain Research, Indian Institute of Science, Bengaluru, Karnataka, India
8. Guan D, Fan K, Spence I, Matthews S. Combining machine learning models of in vitro and in vivo bioassays improves rat carcinogenicity prediction. Regul Toxicol Pharmacol 2018; 94:8-15. [PMID: 29337192 DOI: 10.1016/j.yrtph.2018.01.008] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 08/10/2017] [Revised: 01/09/2018] [Accepted: 01/10/2018] [Indexed: 12/18/2022]
Abstract
In vitro genotoxicity bioassays are cost-efficient methods of assessing potential carcinogens. However, many genotoxicity bioassays are inappropriate for detecting chemicals eliciting non-genotoxic mechanisms, such as tumour promotion, which necessitates the use of in vivo rodent carcinogenicity (IVRC) assays. In silico IVRC modelling could potentially address the low throughput and high cost of this assay. We aimed to develop and combine computational QSAR models of novel bioassays for the prediction of IVRC results and compare them with existing software. QSAR models were generated from existing Ames (n = 6512), Syrian Hamster Embryonic (SHE, n = 410), ISSCAN rodent carcinogenicity (ISC, n = 834) and GreenScreen GADD45a-GFP (n = 1415) chemical datasets. These models mapped the molecular descriptors of each compound to their respective assay result using machine learning algorithms (AdaBoost, k-Nearest Neighbours, C4.5 Decision Tree, Multilayer Perceptron, Random Forest). The best performing models were combined with k-Nearest Neighbours to create a cascade model for IVRC prediction. High QSAR model performance was observed from ten times 10-fold cross-validation, with above 80% accuracy and 0.85 AUC for each assay dataset. The cascade model predicted rat carcinogenicity with 69.3% accuracy and 0.700 AUC. This study demonstrates the novelty of a combined approach for IVRC prediction, with higher performance than existing software.
Affiliation(s)
- Davy Guan
- Sydney Medical School, The University of Sydney, Australia
- Kevin Fan
- Sydney Medical School, The University of Sydney, Australia
- Ian Spence
- Sydney Medical School, The University of Sydney, Australia
- Slade Matthews
- Sydney Medical School, The University of Sydney, Australia
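The "ten times 10-fold cross-validation" mentioned in the abstract of entry 8 means that the data are reshuffled and split into ten folds on ten separate occasions, giving 100 train/test splits. The sketch below shows one way to generate such splits with the standard library; it is an illustration under stated assumptions rather than the authors' procedure. The function name `repeated_kfold` and the seed are hypothetical, the split is not stratified by class, and the sample count of 410 is borrowed from the SHE dataset size given in the abstract.

```python
import random


def repeated_kfold(n_samples, n_splits=10, n_repeats=10, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV."""
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        # Reshuffle the sample indices once per repeat.
        indices = list(range(n_samples))
        rng.shuffle(indices)
        for fold in range(n_splits):
            # Every n_splits-th shuffled index goes to this fold's test set.
            test_idx = indices[fold::n_splits]
            held_out = set(test_idx)
            train_idx = [i for i in indices if i not in held_out]
            yield repeat, fold, train_idx, test_idx


splits = list(repeated_kfold(n_samples=410))
print(len(splits), "train/test splits")
```

A model would be fitted on `train_idx` and scored on `test_idx` for each split, and the 100 scores averaged; repeating the shuffle reduces the dependence of the estimate on any single partition.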
9. Ford KA. Refinement, Reduction, and Replacement of Animal Toxicity Tests by Computational Methods. ILAR J 2017; 57:226-233. [DOI: 10.1093/ilar/ilw031] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Received: 03/11/2016] [Revised: 10/12/2016] [Indexed: 12/16/2022] Open
10. Zhang H, Cao ZX, Li M, Li YZ, Peng C. Novel naïve Bayes classification models for predicting the carcinogenicity of chemicals. Food Chem Toxicol 2016; 97:141-149. [PMID: 27597133 DOI: 10.1016/j.fct.2016.09.005] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Received: 01/18/2016] [Revised: 08/02/2016] [Accepted: 09/01/2016] [Indexed: 02/05/2023]
Abstract
Carcinogenicity prediction has become a significant issue for the pharmaceutical industry. The purpose of this investigation was to develop a novel prediction model of the carcinogenicity of chemicals by using a naïve Bayes classifier. The established model was validated by internal 5-fold cross validation and an external test set. The naïve Bayes classifier gave an average overall prediction accuracy of 90 ± 0.8% for the training set and 68 ± 1.9% for the external test set. Moreover, five simple molecular descriptors (AlogP, molecular weight (MW), number of H donors, Apol and Wiener) considered important for the carcinogenicity of chemicals were identified, along with some substructures related to carcinogenicity. Thus, we hope the established naïve Bayes prediction model could be applied to filter early-stage molecules for this potential carcinogenicity adverse effect, and that the five identified simple molecular descriptors and the substructures of carcinogens will give a better understanding of the carcinogenicity of chemicals and provide guidance for medicinal chemists in the design of new candidate drugs and lead optimization, ultimately reducing the attrition rate in later stages of drug development.
Affiliation(s)
- Hui Zhang
- College of Life Science, Northwest Normal University, Lanzhou, Gansu, 730070, PR China; State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, West China Medical School, Sichuan University, Chengdu, Sichuan, 610041, PR China.
- Zhi-Xing Cao
- Pharmacy College, Chengdu University of Traditional Chinese Medicine, Key Laboratory of Systematic Research, Development and Utilization of Chinese Medicine Resources in Sichuan Province-key Laboratory Breeding Base of Co-founded by Sichuan Province and MOST, Chengdu, Sichuan, PR China; State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, West China Medical School, Sichuan University, Chengdu, Sichuan, 610041, PR China
- Meng Li
- College of Life Science, Northwest Normal University, Lanzhou, Gansu, 730070, PR China
- Yu-Zhi Li
- Pharmacy College, Chengdu University of Traditional Chinese Medicine, Key Laboratory of Systematic Research, Development and Utilization of Chinese Medicine Resources in Sichuan Province-key Laboratory Breeding Base of Co-founded by Sichuan Province and MOST, Chengdu, Sichuan, PR China
- Cheng Peng
- Pharmacy College, Chengdu University of Traditional Chinese Medicine, Key Laboratory of Systematic Research, Development and Utilization of Chinese Medicine Resources in Sichuan Province-key Laboratory Breeding Base of Co-founded by Sichuan Province and MOST, Chengdu, Sichuan, PR China
11. Li X, Du Z, Wang J, Wu Z, Li W, Liu G, Shen X, Tang Y. In Silico Estimation of Chemical Carcinogenicity with Binary and Ternary Classification Methods. Mol Inform 2015; 34:228-35. [PMID: 27490168 DOI: 10.1002/minf.201400127] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.4] [Received: 09/12/2014] [Accepted: 01/11/2015] [Indexed: 11/07/2022]
Abstract
Carcinogenicity is one of the chemical properties of greatest concern for human health, so it is important to identify chemical carcinogenicity as early as possible. In this study, 829 diverse compounds with rat carcinogenicity data were collected from the Carcinogenic Potency Database (CPDB). Using six types of fingerprints to represent the molecules, 30 binary and ternary classification models were generated to predict chemical carcinogenicity by five machine learning methods. The models were evaluated on an external validation set containing 87 chemicals from the ISSCAN database. The best binary model was developed with MACCS keys and the kNN algorithm, with a predictive accuracy of 83.91%, while the best ternary model was also generated with MACCS keys and the kNN algorithm, with an overall accuracy of 80.46%. Furthermore, the best binary and ternary classification models were used to estimate the carcinogenicity of tobacco smoke components, comprising 2251 compounds. Of these, 981 were predicted as carcinogens by the binary classification model, while 110 compounds were predicted as strong carcinogens and 807 as weak carcinogens by the ternary classification model. The results indicated that our models would be helpful for the prediction of chemical carcinogenicity.
Affiliation(s)
- Xiao Li
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, P. R. China; phone: +86-21-6425-1052; fax: +86-21-6425-1033; Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, P. R. China
- Zheng Du
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, P. R. China; phone: +86-21-6425-1052; fax: +86-21-6425-1033
- Jie Wang
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, P. R. China; phone: +86-21-6425-1052; fax: +86-21-6425-1033
- Zengrui Wu
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, P. R. China; phone: +86-21-6425-1052; fax: +86-21-6425-1033
- Weihua Li
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, P. R. China; phone: +86-21-6425-1052; fax: +86-21-6425-1033
- Guixia Liu
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, P. R. China; phone: +86-21-6425-1052; fax: +86-21-6425-1033
- Xu Shen
- Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, P. R. China
- Yun Tang
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, P. R. China; phone: +86-21-6425-1052; fax: +86-21-6425-1033; Key Laboratory of Cigarette Smoke, Technical Center, Shanghai Tobacco Group Co. Ltd., Shanghai 200082, P. R. China
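The best models in entry 11 pair a binary structural-key fingerprint (MACCS keys) with a k-nearest-neighbour classifier. The abstract does not state which similarity measure was used, so the sketch below assumes Tanimoto similarity, a common choice for bit fingerprints. The fingerprints are hypothetical sets of "on" bit positions rather than real MACCS keys, and `tanimoto` and `knn_predict` are illustrative names, not the authors' code.

```python
from collections import Counter


def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def knn_predict(query, training, k=3):
    """Return the majority label among the k training items most similar to query."""
    ranked = sorted(training, key=lambda item: tanimoto(query, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy fingerprints; real MACCS keys have 166 defined bits.
training = [
    ({1, 2, 3, 4}, "carcinogen"),
    ({1, 2, 3, 5}, "carcinogen"),
    ({2, 3, 4, 6}, "carcinogen"),
    ({10, 11, 12}, "non-carcinogen"),
    ({10, 11, 13}, "non-carcinogen"),
    ({11, 12, 14}, "non-carcinogen"),
]
prediction = knn_predict({1, 2, 3, 6}, training, k=3)
print(prediction)
```

A ternary variant (strong, weak, non-carcinogen) needs no change to the code beyond using three label values in `training`.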
12. Application of radial basis function neural network and DFT quantum mechanical calculations for the prediction of the activity of 2-biarylethylimidazole derivatives as bombesin receptor subtype-3 (BRS-3) agonists. Med Chem Res 2014. [DOI: 10.1007/s00044-014-0948-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/25/2022]
13. Singh KP, Gupta S, Rai P. Predicting carcinogenicity of diverse chemicals using probabilistic neural network modeling approaches. Toxicol Appl Pharmacol 2013; 272:465-75. [PMID: 23856075 DOI: 10.1016/j.taap.2013.06.029] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.5] [Received: 06/11/2013] [Accepted: 06/22/2013] [Indexed: 01/31/2023]
Abstract
Robust global models capable of discriminating positive and non-positive carcinogens and of predicting the carcinogenic potency of chemicals in rodents were developed. A dataset of 834 structurally diverse chemicals extracted from the Carcinogenic Potency Database (CPDB) was used, which contained 466 positive and 368 non-positive carcinogens. Twelve non-quantum mechanical molecular descriptors were derived. Structural diversity of the chemicals and nonlinearity in the data were evaluated using the Tanimoto similarity index and Brock-Dechert-Scheinkman statistics. Probabilistic neural network (PNN) and generalized regression neural network (GRNN) models were constructed for classification and function optimization problems using the carcinogenicity end point in rat. Validation of the models was performed using internal and external procedures employing a wide series of statistical checks. The PNN constructed using five descriptors rendered a classification accuracy of 92.09% in complete rat data. The PNN model rendered classification accuracies of 91.77%, 80.70% and 92.08% in mouse, hamster and pesticide data, respectively. The GRNN constructed with nine descriptors yielded a correlation coefficient of 0.896 between the measured and predicted carcinogenic potency, with a mean squared error (MSE) of 0.44, in complete rat data. The rat carcinogenicity model (GRNN) applied to the mouse and hamster data yielded correlation coefficient and MSE values of 0.758, 0.71 and 0.760, 0.46, respectively. The results suggest wide applicability of the inter-species models in predicting the carcinogenic potency of chemicals. Both the PNN and GRNN (inter-species) models constructed here can be useful tools in predicting the carcinogenicity of new chemicals for regulatory purposes.
Affiliation(s)
- Kunwar P Singh
- Academy of Scientific and Innovative Research, Council of Scientific & Industrial Research, New Delhi, India; Environmental Chemistry Division, CSIR-Indian Institute of Toxicology Research, Post Box 80, Mahatma Gandhi Marg, Lucknow 226 001, India.
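In its textbook form, the probabilistic neural network (PNN) named in entry 13 is a Parzen-window classifier: each training chemical contributes a Gaussian kernel centred on its descriptor vector, the kernel responses are averaged per class, and the class with the largest average wins. The sketch below illustrates that decision rule only and is not the authors' model. The kernel width `sigma`, the two-descriptor toy data, and the name `pnn_predict` are assumptions made for the example.

```python
import math


def pnn_predict(x, train_X, train_y, sigma=0.5):
    """Classify x by the class with the largest mean Gaussian kernel response."""
    sums, counts = {}, {}
    for xi, yi in zip(train_X, train_y):
        # Pattern layer: one Gaussian kernel per training sample.
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
        kernel = math.exp(-sq_dist / (2 * sigma ** 2))
        # Summation layer: accumulate kernel responses per class.
        sums[yi] = sums.get(yi, 0.0) + kernel
        counts[yi] = counts.get(yi, 0) + 1
    scores = {label: sums[label] / counts[label] for label in sums}
    # Decision layer: pick the class with the highest average response.
    return max(scores, key=scores.get), scores


# Hypothetical, already-scaled descriptor vectors (two descriptors per chemical).
train_X = [(0.0, 0.0), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1)]
train_y = ["non-positive", "non-positive", "positive", "positive"]

label, scores = pnn_predict((0.1, 0.0), train_X, train_y)
print(label, scores)
```

The only tunable quantity in this form is `sigma`; descriptors would normally be scaled to comparable ranges first, since the kernel depends on Euclidean distance.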
14.
Abstract
Structure-activity relationship (SAR) and quantitative structure-activity relationship (QSAR) models are increasingly used in toxicology, ecotoxicology, and pharmacology for predicting the activity of the molecules from their physicochemical properties and/or their structural characteristics. However, the design of such models has many traps for unwary practitioners. Consequently, the purpose of this chapter is to give a practical guide for the computation of SAR and QSAR models, point out problems that may be encountered, and suggest ways of solving them. Attempts are also made to see how these models can be validated and interpreted.
15. Devillers J, Doucet JP, Doucet-Panaye A, Decourtye A, Aupinel P. Linear and non-linear QSAR modelling of juvenile hormone esterase inhibitors. SAR and QSAR in Environmental Research 2012; 23:357-369. [PMID: 22443267 DOI: 10.1080/1062936x.2012.664562] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 05/31/2023]
Abstract
Tight control of juvenile hormone (JH) titre is crucial during the life cycle of a holometabolous insect. JH metabolism occurs through the action of enzymes, particularly juvenile hormone esterase (JHE). Trifluoromethylketones (TFKs) are able to inhibit this enzyme and so disrupt the endocrine function of the targeted insect. In this context, a set of 96 TFKs, tested on Trichoplusia ni for their JHE inhibition, was split into a training set (n = 77) and a test set (n = 19) to derive a QSAR model. TFKs were initially described by 42 CODESSA (Comprehensive Descriptors for Structural and Statistical Analysis) descriptors, but a feature selection process allowed us to consider only five descriptors encoding the structural characteristics of the TFKs and their reactivity. A classical and a spline regression analysis, a three-layer perceptron, a radial basis function network and a support vector regression were tested as statistical tools. The best results were obtained with the support vector regression (r² and r²(test) = 0.91). The model provides information on the structural features and properties responsible for the high JHE inhibition activity of TFKs.
16. Qin Y, Deng H, Yan H, Zhong R. An accurate nonlinear QSAR model for the antitumor activities of chloroethylnitrosoureas using neural networks. J Mol Graph Model 2011; 29:826-33. [DOI: 10.1016/j.jmgm.2011.01.007] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 12/11/2010] [Revised: 01/11/2011] [Accepted: 01/17/2011] [Indexed: 10/18/2022]
17. Tanabe K, Lučić B, Amić D, Kurita T, Kaihara M, Onodera N, Suzuki T. Prediction of carcinogenicity for diverse chemicals based on substructure grouping and SVM modeling. Mol Divers 2010; 14:789-802. [PMID: 20186479 DOI: 10.1007/s11030-010-9232-y] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Received: 08/03/2009] [Accepted: 02/05/2010] [Indexed: 01/22/2023]
Abstract
The Carcinogenicity Reliability Database (CRDB) was constructed by collecting experimental carcinogenicity data on about 1,500 chemicals from six sources, including the IARC and NTP databases, and then by ranking their reliabilities into six unified categories. A wide variety of 911 organic chemicals were selected from the database for QSAR modeling, and 1,504 different molecular descriptors were calculated, based on their 3D molecular structures as modeled by the Dragon software. Positive (carcinogenic) and negative (non-carcinogenic) chemicals containing various substructures were counted using atom and functional group count descriptors, and the statistical significance of ratios of positives to negatives was tested for those substructures. Among substructures known from biomedical studies to be responsible for carcinogenicity, very few were judged to be strongly related to it. In order to develop QSAR models for the prediction of the carcinogenicities of a wide variety of chemicals with a satisfactory performance level, the relationship between the carcinogenicity data with improved reliability and a subset of significant descriptors selected from 1,504 Dragon descriptors was analyzed with a support vector machine (SVM) method: the classification function (SVC) for weighted data in the LIBSVM program was used to classify chemicals into two carcinogenic categories (positive or negative), where weights were set depending on the reliabilities of the carcinogenicity data. The quality and stability of the models presented were tested by performing a dual cross-validation procedure. As a first step, a single SVM model was developed for all 911 chemicals using 250 selected descriptors, achieving an overall accuracy, i.e., correct estimates for both positives and negatives, of about 70%.
In order to improve the accuracy of the final model, the 911 chemicals were classified into 20 mutually overlapping subgroups according to contained substructures, a specific SVM model was optimized for each subgroup, and the predicted carcinogenicities of the 911 chemicals were determined by the majorities of the outputs of the corresponding SVM models. The model developed on the basis of grouping of chemicals into 20 substructures predicts the carcinogenicities of a wide variety of chemicals with a satisfactory overall accuracy of approximately 80%.
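The subgroup voting step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the substructure names, the stand-in "models" (plain functions rather than trained SVMs), the tie-breaking rule, and the default label for chemicals that match no subgroup are all hypothetical.

```python
from collections import Counter


def has(substructure):
    """Membership test: does the chemical contain this substructure?"""
    return lambda chemical: substructure in chemical


def always(label):
    """Stand-in for a trained subgroup model (hypothetical constant classifier)."""
    return lambda chemical: label


def predict_by_majority(chemical, subgroup_models, default="negative"):
    """Collect votes from every subgroup the chemical belongs to and take the majority."""
    votes = [
        model(chemical)
        for belongs_to, model in subgroup_models
        if belongs_to(chemical)
    ]
    if not votes:
        return default  # assumption: no matching subgroup falls back to a default label
    counts = Counter(votes)
    # Assumption: ties are resolved toward "positive"; the abstract does not specify this.
    return "positive" if counts["positive"] >= counts["negative"] else "negative"


# Chemicals are represented as sets of substructure names (hypothetical encoding).
subgroup_models = [
    (has("nitroso"), always("positive")),
    (has("halogen"), always("positive")),
    (has("aromatic"), always("negative")),
    (has("amide"), always("positive")),
]
chemical = {"nitroso", "halogen", "aromatic"}
result = predict_by_majority(chemical, subgroup_models)
print(result)  # positive
```

Because the subgroups overlap, a single chemical can receive several votes; here the example chemical belongs to three subgroups and is labeled by the two-to-one majority.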
Affiliation(s)
- Kazutoshi Tanabe
- Neuroscience Research Institute, National Institute of Advanced Industrial Science and Technology, Umezono 1-1-1, Tsukuba, 305-8568, Japan.
|
18
|
Devillers J, Devillers H. Prediction of acute mammalian toxicity from QSARs and interspecies correlations. SAR and QSAR in Environmental Research 2009; 20:467-500. [PMID: 19916110 DOI: 10.1080/10629360903278651] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.8] [Indexed: 05/28/2023]
Abstract
With the ever-growing number of xenobiotics that can potentially contaminate the environment, the determination of their mammalian toxicity is of prime importance. In this context, LD50 tests on rats and mice have long been used to express the relative hazard associated with the acute toxicity of inorganic and organic chemicals. However, these laboratory tests face significant hurdles. They are costly, time consuming and actively opposed by animal rights activists. Moreover, new legislative policies, such as REACH (Registration, Evaluation, Authorization and Restriction of Chemicals), aim at reducing the use of toxicity tests on vertebrates. Consequently, there is a need to find alternative methods for estimating the acute mammalian toxicity of chemicals. Quantitative structure-activity relationships (QSARs) and interspecies correlations appear particularly suited to reaching this goal. In this context, this paper reviews more than 150 models aiming at predicting rat and mouse LD50 values from molecular descriptors and/or ecotoxicity data. The value of these computational tools is discussed.
|