Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Tan NX, Rao HB, Li ZR, Li XY. Prediction of chemical carcinogenicity by machine learning approaches. SAR QSAR Environ Res 2009;20:27-75. [PMID: 19343583 DOI: 10.1080/10629360902724085] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Indexed: 05/27/2023]

Cited by Other Article(s)

Le NQK, Tran TX, Nguyen PA, Ho TT, Nguyen VN. Recent progress in machine learning approaches for predicting carcinogenicity in drug development. Expert Opin Drug Metab Toxicol 2024:1-8. [PMID: 38742542 DOI: 10.1080/17425255.2024.2356162] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2024] [Accepted: 05/13/2024] [Indexed: 05/16/2024]

Guo W, Liu J, Dong F, Song M, Li Z, Khan MKH, Patterson TA, Hong H. Review of machine learning and deep learning models for toxicity prediction. Exp Biol Med (Maywood) 2023;248:1952-1973. [PMID: 38057999 PMCID: PMC10798180 DOI: 10.1177/15353702231209421] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 12/08/2023]

Abstract

The ever-increasing number of chemicals has raised public concerns due to their adverse effects on human health and the environment. To protect public health and the environment, it is critical to assess the toxicity of these chemicals. Traditional in vitro and in vivo toxicity assays are complicated, costly, and time-consuming and may face ethical issues. These constraints raise the need for alternative methods for assessing the toxicity of chemicals. Recently, due to the advancement of machine learning algorithms and the increase in computational power, many toxicity prediction models have been developed using various machine learning and deep learning algorithms such as support vector machine, random forest, k-nearest neighbors, ensemble learning, and deep neural network. This review summarizes the machine learning- and deep learning-based toxicity prediction models developed in recent years. Support vector machine and random forest are the most popular machine learning algorithms, and hepatotoxicity, cardiotoxicity, and carcinogenicity are the most frequently modeled toxicity endpoints in predictive toxicology. Datasets are known to affect model performance: the quality of the datasets used to develop machine learning and deep learning toxicity prediction models is vital to the performance of those models. Different toxicity assignments for the same chemicals have been observed among datasets for the same type of toxicity, indicating that benchmark datasets are needed for developing reliable toxicity prediction models using machine learning and deep learning algorithms. This review provides insights into current machine learning models in predictive toxicology, which are expected to promote the development and application of toxicity prediction models in the future.

Hao N, Sun P, Zhao W, Li X. Application of a developed triple-classification machine learning model for carcinogenic prediction of hazardous organic chemicals to the US, EU, and WHO based on Chinese database. Ecotoxicology and Environmental Safety 2023;255:114806. [PMID: 36948010 DOI: 10.1016/j.ecoenv.2023.114806] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2022] [Revised: 03/04/2023] [Accepted: 03/16/2023] [Indexed: 06/18/2023]

Abstract

Cancer, the second largest human disease, has become a major public health problem. The prediction of chemicals' carcinogenicity before their synthesis is crucial. In this paper, seven machine learning algorithms (i.e., Random Forest (RF), Logistic Regression (LR), Support Vector Machines (SVM), Complement Naive Bayes (CNB), K-Nearest Neighbor (KNN), XGBoost, and Multilayer Perceptron (MLP)) were used to construct the carcinogenicity triple classification prediction (TCP) model (i.e., 1A, 1B, Category 2). A total of 1444 descriptors of 118 hazardous organic chemicals were calculated by Discovery Studio 2020, Sybyl X-2.0 and PaDEL-Descriptor software. The constructed carcinogenicity TCP model was evaluated through five model evaluation indicators (i.e., Accuracy, Precision, Recall, F1 Score and AUC). The model evaluation results show that the Accuracy, Precision, Recall, F1 Score and AUC evaluation indicators meet requirements (greater than 0.6). The accuracy of the RF, LR, XGBoost, and MLP models for predicting carcinogenicity of Category 2 is 91.67%, 79.17%, 100%, and 100%, respectively. In addition, the constructed machine learning model in this study has potential for error correction. Taking the XGBoost model as an example, the predicted carcinogenicity level of 1,2,3-Trichloropropane (96-18-4) is Category 2, but the actual carcinogenicity level is 1B. However, the difference between Category 2 and 1B is only 0.004, indicating that XGBoost is one of the optimum models among the seven constructed machine learning models. Besides, results showed that functional groups like chlorine and benzene ring might influence the prediction of carcinogenic classification. Therefore, considering functional group characteristics of chemicals before constructing the carcinogenicity prediction model of organic chemicals is recommended. The predicted carcinogenicity of the organic chemicals using the optimum machine learning model (i.e., XGBoost) was also evaluated and verified by toxicokinetics.
The RF and XGBoost TCP models constructed in this paper can be used for carcinogenicity detection before synthesizing new organic substances. It also provides technical support for the subsequent management of organic chemicals.
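The evaluation indicators named in the abstract above (Accuracy, Precision, Recall, F1 Score) can be illustrated for a three-class problem with a minimal sketch. Macro-averaging over the classes is an assumption made here for illustration; the abstract does not say how the per-class values were combined, and the labels and predictions below are made-up toy data, not results from the paper. AUC is omitted because it requires predicted scores rather than class labels.

```python
from collections import Counter


def macro_metrics(y_true, y_pred, labels):
    """Return accuracy plus macro-averaged precision, recall and F1."""
    # Count how often each (true label, predicted label) pair occurs.
    pairs = Counter(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = pairs[(label, label)]
        predicted = sum(n for (_, p), n in pairs.items() if p == label)
        actual = sum(n for (t, _), n in pairs.items() if t == label)
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    correct = sum(pairs[(label, label)] for label in labels)
    return {
        "accuracy": correct / len(y_true),
        "precision": sum(precisions) / len(labels),
        "recall": sum(recalls) / len(labels),
        "f1": sum(f1s) / len(labels),
    }


# Toy data (hypothetical): six chemicals over the three categories.
labels = ["1A", "1B", "2"]
y_true = ["1A", "1A", "1B", "1B", "2", "2"]
y_pred = ["1A", "1B", "1B", "1B", "2", "2"]
scores = macro_metrics(y_true, y_pred, labels)
```

With this toy input, one 1A chemical is misassigned to 1B, which lowers recall for 1A and precision for 1B while leaving Category 2 unaffected.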

Limbu S, Dakshanamurthy S. Predicting Chemical Carcinogens Using a Hybrid Neural Network Deep Learning Method. Sensors (Basel, Switzerland) 2022;22:8185. [PMID: 36365881 PMCID: PMC9653664 DOI: 10.3390/s22218185] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/13/2022] [Revised: 10/11/2022] [Accepted: 10/23/2022] [Indexed: 05/28/2023]

Abstract

Determining environmental chemical carcinogenicity is urgently needed as humans are increasingly exposed to these chemicals. In this study, we developed a hybrid neural network (HNN) method called HNN-Cancer to predict potential carcinogens of real-life chemicals. The HNN-Cancer included a new SMILES feature representation method by modifying our previous 3D array representation of 1D SMILES simulated by the convolutional neural network (CNN). We developed binary classification, multiclass classification, and regression models based on diverse non-congeneric chemicals. Along with the HNN-Cancer model, we developed models based on the random forest (RF), bootstrap aggregating (Bagging), and adaptive boosting (AdaBoost) methods for binary and multiclass classification. We developed regression models using HNN-Cancer, RF, support vector regressor (SVR), gradient boosting (GB), kernel ridge (KR), decision tree with AdaBoost (DT), KNeighbors (KN), and a consensus method. The performance of the models for all classifications was assessed using various statistical metrics. The accuracy of the HNN-Cancer, RF, and Bagging models was 74%, and their AUC was ~0.81 for binary classification models developed with 7994 chemicals. The sensitivity was 79.5% and the specificity was 67.3% for the HNN-Cancer, which outperforms the other methods. In the case of multiclass classification models with 1618 chemicals, we obtained the optimal accuracy of 70% with an AUC of 0.7 for HNN-Cancer, RF, Bagging, and AdaBoost. In the case of regression models, the correlation coefficient (R) was around 0.62 for HNN-Cancer and RF, higher than for the SVM, GB, KR, DTBoost, and NN machine learning methods. Overall, the HNN-Cancer performed better for the majority of the known carcinogen experimental datasets. Further, the predictive performance of HNN-Cancer on diverse chemicals is comparable to the literature-reported models that included similar and less diverse molecules.
Our HNN-Cancer could be used in identifying potentially carcinogenic chemicals for a wide variety of chemical classes.
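For readers unfamiliar with the sensitivity and specificity figures quoted in the abstract above, the following is a minimal sketch of how the two quantities are computed for a binary carcinogen/non-carcinogen classifier. The function name and the toy labels are hypothetical and do not come from the paper; 1 denotes a carcinogen and 0 a non-carcinogen.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels 1 and 0."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    # Sensitivity: fraction of true carcinogens that were flagged.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of true non-carcinogens that were cleared.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Toy data (hypothetical): four carcinogens, four non-carcinogens.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
```

A model with higher sensitivity than specificity, as reported for HNN-Cancer, misses fewer carcinogens at the cost of more false alarms on non-carcinogens.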

Ozbuyukkaya G, Parker RS, Veser G. Determining robust reaction kinetics from limited data. AIChE J 2021. [DOI: 10.1002/aic.17538] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]

Jiao Z, Hu P, Xu H, Wang Q. Machine Learning and Deep Learning in Chemical Health and Safety: A Systematic Review of Techniques and Applications. ACS Chemical Health & Safety 2020. [DOI: 10.1021/acs.chas.0c00075] [Citation(s) in RCA: 61] [Impact Index Per Article: 15.3] [Indexed: 12/17/2022]

Gupta A, Kahali B. Machine learning-based cognitive impairment classification with optimal combination of neuropsychological tests. Alzheimer's & Dementia (New York, N. Y.) 2020;6:e12049. [PMID: 32699817 PMCID: PMC7369403 DOI: 10.1002/trc2.12049] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 11/04/2019] [Revised: 01/16/1800] [Accepted: 01/28/2020] [Indexed: 11/09/2022]

Guan D, Fan K, Spence I, Matthews S. Combining machine learning models of in vitro and in vivo bioassays improves rat carcinogenicity prediction. Regul Toxicol Pharmacol 2018;94:8-15. [PMID: 29337192 DOI: 10.1016/j.yrtph.2018.01.008] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 08/10/2017] [Revised: 01/09/2018] [Accepted: 01/10/2018] [Indexed: 12/18/2022]

Ford KA. Refinement, Reduction, and Replacement of Animal Toxicity Tests by Computational Methods. ILAR J 2017;57:226-233. [DOI: 10.1093/ilar/ilw031] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Received: 03/11/2016] [Revised: 10/12/2016] [Indexed: 12/16/2022]

Zhang H, Cao ZX, Li M, Li YZ, Peng C. Novel naïve Bayes classification models for predicting the carcinogenicity of chemicals. Food Chem Toxicol 2016;97:141-149. [PMID: 27597133 DOI: 10.1016/j.fct.2016.09.005] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Received: 01/18/2016] [Revised: 08/02/2016] [Accepted: 09/01/2016] [Indexed: 02/05/2023]

Li X, Du Z, Wang J, Wu Z, Li W, Liu G, Shen X, Tang Y. In Silico Estimation of Chemical Carcinogenicity with Binary and Ternary Classification Methods. Mol Inform 2015;34:228-35. [PMID: 27490168 DOI: 10.1002/minf.201400127] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.4] [Received: 09/12/2014] [Accepted: 01/11/2015] [Indexed: 11/07/2022]

Affiliation(s)

Xiao Li: Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, P. R. China (phone: +86-21-6425-1052; fax: +86-21-6425-1033); Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, P. R. China
Zheng Du: Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, P. R. China (phone: +86-21-6425-1052; fax: +86-21-6425-1033)
Jie Wang: Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, P. R. China (phone: +86-21-6425-1052; fax: +86-21-6425-1033)
Zengrui Wu: Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, P. R. China (phone: +86-21-6425-1052; fax: +86-21-6425-1033)
Weihua Li: Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, P. R. China (phone: +86-21-6425-1052; fax: +86-21-6425-1033)
Guixia Liu: Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, P. R. China (phone: +86-21-6425-1052; fax: +86-21-6425-1033)
Xu Shen: Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, P. R. China
Yun Tang: Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, P. R. China (phone: +86-21-6425-1052; fax: +86-21-6425-1033); Key Laboratory of Cigarette Smoke, Technical Center, Shanghai Tobacco Group Co. Ltd., Shanghai 200082, P. R. China

Application of radial basis function neural network and DFT quantum mechanical calculations for the prediction of the activity of 2-biarylethylimidazole derivatives as bombesin receptor subtype-3 (BRS-3) agonists. Med Chem Res 2014. [DOI: 10.1007/s00044-014-0948-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/25/2022]

Singh KP, Gupta S, Rai P. Predicting carcinogenicity of diverse chemicals using probabilistic neural network modeling approaches. Toxicol Appl Pharmacol 2013;272:465-75. [PMID: 23856075 DOI: 10.1016/j.taap.2013.06.029] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.5] [Received: 06/11/2013] [Accepted: 06/22/2013] [Indexed: 01/31/2023]

Devillers J. Methods for building QSARs. Methods Mol Biol 2013;930:3-27. [PMID: 23086835 DOI: 10.1007/978-1-62703-059-5_1] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Indexed: 01/26/2023]

Devillers J, Doucet JP, Doucet-Panaye A, Decourtye A, Aupinel P. Linear and non-linear QSAR modelling of juvenile hormone esterase inhibitors. SAR and QSAR in Environmental Research 2012;23:357-369. [PMID: 22443267 DOI: 10.1080/1062936x.2012.664562] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 05/31/2023]

Qin Y, Deng H, Yan H, Zhong R. An accurate nonlinear QSAR model for the antitumor activities of chloroethylnitrosoureas using neural networks. J Mol Graph Model 2011;29:826-33. [DOI: 10.1016/j.jmgm.2011.01.007] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 12/11/2010] [Revised: 01/11/2011] [Accepted: 01/17/2011] [Indexed: 10/18/2022]

Tanabe K, Lučić B, Amić D, Kurita T, Kaihara M, Onodera N, Suzuki T. Prediction of carcinogenicity for diverse chemicals based on substructure grouping and SVM modeling. Mol Divers 2010;14:789-802. [PMID: 20186479 DOI: 10.1007/s11030-010-9232-y] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Received: 08/03/2009] [Accepted: 02/05/2010] [Indexed: 01/22/2023]

Abstract

The Carcinogenicity Reliability Database (CRDB) was constructed by collecting experimental carcinogenicity data on about 1,500 chemicals from six sources, including the IARC and NTP databases, and then by ranking their reliabilities into six unified categories. A wide variety of 911 organic chemicals were selected from the database for QSAR modeling, and 1,504 different molecular descriptors were calculated, based on their 3D molecular structures as modeled by the Dragon software. Positive (carcinogenic) and negative (non-carcinogenic) chemicals containing various substructures were counted using atom and functional group count descriptors, and the statistical significance of ratios of positives to negatives was tested for those substructures. Among substructures known from biomedical studies to be responsible for carcinogenicity, very few were judged to be strongly related to it. In order to develop QSAR models for the prediction of the carcinogenicities of a wide variety of chemicals with a satisfactory performance level, the relationship between the carcinogenicity data with improved reliability and a subset of significant descriptors selected from 1,504 Dragon descriptors was analyzed with a support vector machine (SVM) method: the classification function (SVC) for weighted data in the LIBSVM program was used to classify chemicals into two carcinogenic categories (positive or negative), where weights were set depending on the reliabilities of the carcinogenicity data. The quality and stability of the models presented were tested by performing a dual cross-validation procedure. As a first step, a single SVM model was developed for all 911 chemicals using 250 selected descriptors, achieving an overall accuracy level, i.e., positive and negative correct estimate, of about 70%.
In order to improve the accuracy of the final model, the 911 chemicals were classified into 20 mutually overlapping subgroups according to contained substructures, a specific SVM model was optimized for each subgroup, and the predicted carcinogenicities of the 911 chemicals were determined by the majorities of the outputs of the corresponding SVM models. The model developed on the basis of grouping of chemicals into 20 substructures predicts the carcinogenicities of a wide variety of chemicals with a satisfactory overall accuracy of approximately 80%.
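The subgroup voting scheme described in the abstract above (one model per substructure subgroup, final label by majority of the applicable models) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: a chemical is represented as a set of substructure names, each subgroup model is a plain callable standing in for a trained SVM, and falling back to a single global model on ties or when no subgroup applies is a choice made here that the abstract does not specify. All names are hypothetical.

```python
from collections import Counter


def predict_by_subgroup_vote(chemical, subgroup_models, fallback_model):
    """Majority vote over the models of every subgroup the chemical is in.

    subgroup_models: list of (belongs, model) pairs, where belongs(chemical)
    returns True if the chemical contains the subgroup's substructure and
    model(chemical) returns "positive" or "negative".
    """
    votes = [model(chemical) for belongs, model in subgroup_models
             if belongs(chemical)]
    counts = Counter(votes)
    # Assumption: use the global model when there is no vote or a tie.
    if counts["positive"] == counts["negative"]:
        return fallback_model(chemical)
    return "positive" if counts["positive"] > counts["negative"] else "negative"


# Toy stand-ins for trained subgroup classifiers (hypothetical).
toy_models = [
    (lambda c: "nitroso" in c, lambda c: "positive"),
    (lambda c: "halogen" in c, lambda c: "negative"),
    (lambda c: "aromatic_amine" in c, lambda c: "positive"),
]
global_model = lambda c: "negative"

example = {"nitroso", "halogen", "aromatic_amine"}
result = predict_by_subgroup_vote(example, toy_models, global_model)
```

Because the subgroups overlap, a chemical with several substructures receives several votes, which is what lets the ensemble correct an individual subgroup model's error.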

Devillers J, Devillers H. Prediction of acute mammalian toxicity from QSARs and interspecies correlations. SAR and QSAR in Environmental Research 2009;20:467-500. [PMID: 19916110 DOI: 10.1080/10629360903278651] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.8] [Indexed: 05/28/2023]