1
Zhu T, Zhang Y, Li Y, Tao T, Tao C. Contribution of molecular structures and quantum chemistry technique to root concentration factor: An innovative application of interpretable machine learning. Journal of Hazardous Materials 2023; 459:132320. [PMID: 37604035 DOI: 10.1016/j.jhazmat.2023.132320] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2023] [Revised: 08/03/2023] [Accepted: 08/15/2023] [Indexed: 08/23/2023]
Abstract
Root concentration factor (RCF) is a significant parameter to characterize uptake and accumulation of hazardous organic contaminants (HOCs) by plant roots. However, complex interactions among chemicals, plant roots and soil make it challenging to identify underlying mechanisms of uptake and accumulation of HOCs. Here, nine machine learning techniques were applied to investigate major factors controlling RCF based on variable combinations of molecular descriptors (MD), MACCS fingerprints, quantum chemistry descriptors (QCD) and three physicochemical properties related to the chemical-soil-plant system. Compared to models with variables including MACCS fingerprints or solitary physicochemical properties, the XGBoost-6 model developed by the variable combination of MD, QCD and three physicochemical properties achieved the most remarkable performance, with R² of 0.977. Model interpretation achieved by permutation variable importance and partial dependence plots revealed the vital importance of HOCs lipophilicity, lipid content of plant roots, soil organic matter content, the overall deformability and the molecular dispersive ability of HOCs for regulating RCF. The integration of MD and QCD with physicochemical properties could improve our knowledge of underlying mechanisms regarding HOCs accumulation in plant roots from innovative structural perspectives. Improving model performance by combining multiple variable types can be extended to the prediction of other parameters in the field of environmental risk assessment.
Affiliation(s)
- Tengyi Zhu
  - School of Environmental Science and Engineering, Yangzhou University, Yangzhou 225127, Jiangsu, China
- Yu Zhang
  - School of Environmental Science and Engineering, Yangzhou University, Yangzhou 225127, Jiangsu, China
- Yi Li
  - School of Environmental Science and Engineering, Yangzhou University, Yangzhou 225127, Jiangsu, China
- Tianyun Tao
  - College of Agriculture, Yangzhou University, Yangzhou 225009, Jiangsu, China
- Cuicui Tao
  - School of Environmental Science and Engineering, Yangzhou University, Yangzhou 225127, Jiangsu, China
2
Application of multi-objective optimization in the study of anti-breast cancer candidate drugs. Sci Rep 2022; 12:19347. [PMID: 36369522 PMCID: PMC9652409 DOI: 10.1038/s41598-022-23851-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/19/2022] [Accepted: 11/07/2022] [Indexed: 11/13/2022]
Abstract
In the development of anti-breast cancer drugs, the quantitative structure-activity relationship model of compounds is usually used to select potential active compounds. However, the existing methods often have problems such as low model prediction performance, lack of overall consideration of the biological activity and related properties of compounds, and difficulty in directly selecting candidate drugs. Therefore, this paper constructs a complete compound selection framework from three aspects: feature selection, relationship mapping and multi-objective optimization problem solving. In the feature selection part, a feature selection method based on unsupervised spectral clustering is proposed. The selected features have more comprehensive information expression ability. In the relationship mapping part, a variety of machine learning algorithms are used for comparative experiments. Finally, the CatBoost algorithm is selected to perform the relationship mapping, and better prediction performance is achieved. In the multi-objective optimization part, based on the analysis of the conflict relationship between the objectives, the AGE-MOEA algorithm is improved and used to solve this problem. Compared with various algorithms, the improved algorithm has better search performance.
3
Lau WY, Wang K, Zhang XP, Li LQ, Wen TF, Chen MS, Jia WD, Xu L, Shi J, Guo WX, Sun JX, Chen ZH, Guo L, Wei XB, Lu CD, Xue J, Zhou LP, Zheng YX, Wang M, Wu MC, Cheng SQ. A new staging system for hepatocellular carcinoma associated with portal vein tumor thrombus. Hepatobiliary Surg Nutr 2021; 10:782-795. [PMID: 35004945 DOI: 10.21037/hbsn-19-810] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 11/15/2019] [Accepted: 04/28/2020] [Indexed: 02/05/2023]
Abstract
BACKGROUND A new staging system for patients with hepatocellular carcinoma (HCC) associated with portal vein tumor thrombus (PVTT) was developed by incorporating the good points of the BCLC classification of HCC, and by improving on the currently existing classifications of HCC associated with PVTT. METHODS Univariate and multivariate analysis with Wald χ² test were used to determine the clinical prognostic factors for overall survival (OS) in patients with HCC and PVTT in the training cohort. Then the conditional inference trees analysis was applied to establish a new staging system. RESULTS A training cohort of 2,179 patients from the Eastern Hepatobiliary Surgery Hospital and a validation cohort of 1,550 patients from four major liver centers in China were enrolled into establishing and validating a new staging system. The system was established by incorporating liver function, general health status, tumor resectability, extrahepatic metastasis and extent of PVTT. This staging system had a good discriminatory ability to separate patients into different stages and substages. The median OS for the two cohorts were 57.1 (37.2-76.9), 12.1 (11.0-13.2), 5.7 (5.1-6.2), 4.0 (3.3-4.6) and 2.5 (1.7-3.3) months for the stages 0 to IV, respectively (P<0.001) in the training cohort. The corresponding figures for the validation cohort were 6.4 (4.9-7.9), 2.8 (1.3-4.4), 10.8 (9.3-12.4), and 1.5 (1.3-1.7) months for the stages II to IV, respectively (P<0.001). The mean survival for stage 0 to 1 were 37.6 (35.9-39.2) and 30.4 (27.4-33.4), respectively (P<0.001). CONCLUSIONS A new staging system was established which provided a good discriminatory ability to separate patients into different stages and substages after treatment. It can be used to supplement the other HCC staging systems.
Affiliation(s)
- Wan Yee Lau
  - Department of Hepatic Surgery VI, Eastern Hepatobiliary Surgery Hospital, The Second Military Medical University, Shanghai, China; Faculty of Medicine, The Chinese University of Hong Kong, Hong Kong, China
- Kang Wang
  - Department of Hepatic Surgery VI, Eastern Hepatobiliary Surgery Hospital, The Second Military Medical University, Shanghai, China
- Xiu-Ping Zhang
  - Department of Hepatic Surgery VI, Eastern Hepatobiliary Surgery Hospital, The Second Military Medical University, Shanghai, China; Faculty of Hepato-Pancreato-Biliary Surgery, The First Medical Center of Chinese People's Liberation Army (PLA) General Hospital, Beijing, China
- Le-Qun Li
  - Department of Hepatobiliary Surgery, Affiliated Tumor Hospital of Guangxi Medical University, Nanning, China
- Tian-Fu Wen
  - Department of Liver Surgery & Transplantation Center, West China Hospital of Sichuan University, Chengdu, China
- Min-Shan Chen
  - Department of Hepatobiliary Surgery, Sun Yat-sen University Cancer Center, Guangzhou, China
- Wei-Dong Jia
  - Department of General Surgery, Affiliated Provincial Hospital, Anhui Medical University, Hefei, China; Anhui Province Key Laboratory of Hepatopancreatobiliary Surgery, Hefei, China
- Li Xu
  - Department of Hepatobiliary Surgery, Sun Yat-sen University Cancer Center, Guangzhou, China
- Jie Shi
  - Department of Hepatic Surgery VI, Eastern Hepatobiliary Surgery Hospital, The Second Military Medical University, Shanghai, China
- Wei-Xing Guo
  - Department of Hepatic Surgery VI, Eastern Hepatobiliary Surgery Hospital, The Second Military Medical University, Shanghai, China
- Ju-Xian Sun
  - Department of Hepatic Surgery VI, Eastern Hepatobiliary Surgery Hospital, The Second Military Medical University, Shanghai, China
- Zhen-Hua Chen
  - Department of Hepatic Surgery VI, Eastern Hepatobiliary Surgery Hospital, The Second Military Medical University, Shanghai, China
- Lei Guo
  - Department of Hepatic Surgery VI, Eastern Hepatobiliary Surgery Hospital, The Second Military Medical University, Shanghai, China
- Xu-Biao Wei
  - Department of Hepatic Surgery VI, Eastern Hepatobiliary Surgery Hospital, The Second Military Medical University, Shanghai, China
- Chong-De Lu
  - Department of Hepatic Surgery VI, Eastern Hepatobiliary Surgery Hospital, The Second Military Medical University, Shanghai, China
- Jie Xue
  - Department of Hepatic Surgery VI, Eastern Hepatobiliary Surgery Hospital, The Second Military Medical University, Shanghai, China
- Li-Ping Zhou
  - Department of Hepatic Surgery VI, Eastern Hepatobiliary Surgery Hospital, The Second Military Medical University, Shanghai, China
- Ya-Xing Zheng
  - Department of Hepatic Surgery VI, Eastern Hepatobiliary Surgery Hospital, The Second Military Medical University, Shanghai, China
- Meng Wang
  - Department of Hepatic Surgery VI, Eastern Hepatobiliary Surgery Hospital, The Second Military Medical University, Shanghai, China
- Meng-Chao Wu
  - Department of Hepatic Surgery VI, Eastern Hepatobiliary Surgery Hospital, The Second Military Medical University, Shanghai, China
- Shu-Qun Cheng
  - Department of Hepatic Surgery VI, Eastern Hepatobiliary Surgery Hospital, The Second Military Medical University, Shanghai, China
4
Li J, Wilkinson JL, Boxall ABA. Use of a large dataset to develop new models for estimating the sorption of active pharmaceutical ingredients in soils and sediments. Journal of Hazardous Materials 2021; 415:125688. [PMID: 34088186 DOI: 10.1016/j.jhazmat.2021.125688] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 01/25/2021] [Revised: 03/11/2021] [Accepted: 03/16/2021] [Indexed: 06/12/2023]
Abstract
Information on the sorption of active pharmaceutical ingredients (APIs) in soils and sediments is needed for assessing the environmental risks of these substances yet these data are unavailable for many APIs in use. Predictive models for estimating sorption could provide a solution. The performance of existing models is, however, often poor and most models do not account for the effects of soil/sediment properties which are known to significantly affect API sorption. Therefore, here, we use a high-quality dataset on the sorption behavior of 54 APIs in 13 soils and sediments to develop new models for estimating sorption coefficients for APIs in soils and sediments using three machine learning approaches (artificial neural network, random forest and support vector machine) and linear regression. A random forest-based model, with chemical and solid descriptors as the input, was the best performing model. Evaluation of this model using an independent sorption dataset from the literature showed that the model was able to predict sorption coefficients of 90% of the test set to within a factor of 10 of the experimental values. This new model could be invaluable in assessing the sorption behavior of molecules that have yet to be tested and in landscape-level risk assessments.
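The "within a factor of 10" criterion quoted in this abstract can be illustrated with a minimal Python sketch. The function name `fraction_within_factor` and the sample sorption coefficients are hypothetical and are not taken from the paper; the sketch only assumes that the criterion means the predicted-to-observed ratio lies between 1/10 and 10.

```python
import math


def fraction_within_factor(predicted, observed, factor=10.0):
    """Return the share of predictions within `factor`-fold of the observations.

    Assumes all values are strictly positive (e.g. sorption coefficients).
    """
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    limit = math.log10(factor)
    hits = sum(
        1
        for p, o in zip(predicted, observed)
        if abs(math.log10(p) - math.log10(o)) <= limit
    )
    return hits / len(predicted)


# Hypothetical sorption coefficients, for illustration only.
kd_pred = [12.0, 150.0, 3.0, 900.0, 0.5]
kd_obs = [10.0, 20.0, 45.0, 1000.0, 0.4]
print(fraction_within_factor(kd_pred, kd_obs))
```

Working in log space treats over- and under-prediction symmetrically, which is why the comparison uses the absolute difference of the logarithms.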
Affiliation(s)
- Jun Li
  - Department of Environment and Geography, University of York, Heslington, York YO10 5NG, UK
- John L Wilkinson
  - Department of Environment and Geography, University of York, Heslington, York YO10 5NG, UK
- Alistair B A Boxall
  - Department of Environment and Geography, University of York, Heslington, York YO10 5NG, UK
5
Hatten KM, Amin J, Isaiah A. Machine Learning Prediction of Extracapsular Extension in Human Papillomavirus-Associated Oropharyngeal Squamous Cell Carcinoma. Otolaryngol Head Neck Surg 2020; 163:992-999. [PMID: 32600154 DOI: 10.1177/0194599820935446] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 12/25/2022]
Abstract
OBJECTIVE To determine whether machine learning (ML) can predict the presence of extracapsular extension (ECE) prior to treatment, using common oncologic variables, in patients with human papillomavirus (HPV)-associated oropharyngeal squamous cell carcinoma (OPSCC). STUDY DESIGN Retrospective database review. SETTING National Cancer Database study. METHODS All patients with HPV-associated OPSCC treated surgically between January 1, 2010, and December 31, 2015, were selected from the National Cancer Database. Patients were excluded if surgical pathology reports did not include information regarding primary tumor stage, number of metastatic regional lymph nodes, size of largest metastatic regional lymph node, and tumor grade. The data were split into a random distribution of 80% for training and 20% for testing with ML methods. RESULTS A total of 3753 adults with surgically treated HPV-associated OPSCC met criteria for inclusion in the study. Approximately 38% of these patients treated with surgical management demonstrated ECE. ML models demonstrated modest accuracy in predicting ECE, with the areas under the receiver operating characteristic curves ranging from 0.58 to 0.68. The conditional inference tree model (0.66) predicted the metastatic lymph node number to be the most important predictor of ECE. CONCLUSION Despite a large cohort and the use of ML algorithms, the power of clinical and oncologic variables to predict ECE in HPV-associated OPSCC remains limited.
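The area under the receiver operating characteristic curve reported in this abstract can be computed from labels and model scores as the probability that a randomly chosen positive case is scored above a randomly chosen negative one. The following Python sketch illustrates that pairwise definition; the function name `roc_auc` and the sample data are hypothetical, and this is not the authors' pipeline.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical ECE labels (1 = present) and predicted scores, for illustration only.
ece_labels = [1, 0, 1, 0, 0]
ece_scores = [0.9, 0.4, 0.35, 0.35, 0.1]
print(roc_auc(ece_labels, ece_scores))
```

A value of 0.5 corresponds to chance-level ranking, which puts the reported range of 0.58 to 0.68 in context.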
Affiliation(s)
- Kyle M Hatten
  - Department of Otorhinolaryngology-Head and Neck Surgery, School of Medicine, University of Maryland, Baltimore, Maryland, USA
- Julian Amin
  - Department of Otorhinolaryngology-Head and Neck Surgery, School of Medicine, University of Maryland, Baltimore, Maryland, USA
- Amal Isaiah
  - Department of Otorhinolaryngology-Head and Neck Surgery, School of Medicine, University of Maryland, Baltimore, Maryland, USA
6
Bagheri M, Al-Jabery K, Wunsch D, Burken JG. Examining plant uptake and translocation of emerging contaminants using machine learning: Implications to food security. The Science of the Total Environment 2020; 698:133999. [PMID: 31499345 DOI: 10.1016/j.scitotenv.2019.133999] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 06/13/2019] [Revised: 08/16/2019] [Accepted: 08/18/2019] [Indexed: 05/24/2023]
Abstract
When water and solutes enter the plant root through the epidermis, organic contaminants in solution either cross the root membranes and transport through the vascular pathways to the aerial tissues or accumulate in the plant roots. The accumulation of contaminants in plant roots and edible tissues is measured by root concentration factor (RCF) and fruit concentration factor (FCF). In this paper, 1) a neural network (NN) was applied to model RCF based on physicochemical properties of organic compounds, 2) correlation and significance of physicochemical properties were assessed using statistical analysis, 3) fuzzy logic was used to examine the simultaneous impacts of significant compound properties on RCF and FCF, 4) a clustering algorithm (k-means) was used to identify unique groups and discover hidden relationships within contaminants in various parts of the plants. The physicochemical cutoffs achieved by fuzzy logic for the RCF and the FCF were compared versus the cutoffs for compounds that crossed the plant root membranes and found their way into transpiration stream (measured by transpiration stream concentration factor, TSCF). The NN predicted the RCF with improved accuracy compared to mechanistic models. The analysis indicated that log Kow, molecular weight, and rotatable bonds are the most important properties for predicting the RCF. These significant compound properties are positively correlated with RCF while they are negatively correlated with TSCF. Comparing the relationships between compound properties in various plant tissues showed that compounds detected in the edible parts have physicochemical cutoffs that are more like the compounds crossing the plant root membranes (into xylem tissues) than the compounds accumulating in the plant roots, with clear relationships to food security. The cluster analysis placed the contaminants into three meaningful groups that were in agreement with the results of fuzzy logic.
Affiliation(s)
- Majid Bagheri
  - Civil, Architectural and Environmental Engineering Department, Missouri University of Science and Technology, Rolla, MO, United States
- Khalid Al-Jabery
  - Applied Computational Intelligence Laboratory, Electrical and Computer Engineering Department, Missouri University of Science and Technology, Rolla, MO, United States
- Donald Wunsch
  - Applied Computational Intelligence Laboratory, Electrical and Computer Engineering Department, Missouri University of Science and Technology, Rolla, MO, United States
- Joel G Burken
  - Civil, Architectural and Environmental Engineering Department, Missouri University of Science and Technology, Rolla, MO, United States
7
Miller TH, Gallidabino MD, MacRae JI, Owen SF, Bury NR, Barron LP. Prediction of bioconcentration factors in fish and invertebrates using machine learning. The Science of the Total Environment 2019; 648:80-89. [PMID: 30114591 PMCID: PMC6234108 DOI: 10.1016/j.scitotenv.2018.08.122] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 07/10/2018] [Revised: 08/08/2018] [Accepted: 08/09/2018] [Indexed: 04/14/2023]
Abstract
The application of machine learning has recently gained interest from ecotoxicological fields for its ability to model and predict chemical and/or biological processes, such as the prediction of bioconcentration. However, comparison of different models and the prediction of bioconcentration in invertebrates has not been previously evaluated. A comparison of 24 linear and machine learning models is presented herein for the prediction of bioconcentration in fish and important factors that influenced accumulation identified. R² and root mean square error (RMSE) for the test data (n = 110 cases) ranged from 0.23-0.73 and 0.34-1.20, respectively. Model performance was critically assessed with neural networks and tree-based learners showing the best performance. An optimised 4-layer multi-layer perceptron (14 descriptors) was selected for further testing. The model was applied for cross-species prediction of bioconcentration in a freshwater invertebrate, Gammarus pulex. The model for G. pulex showed good performance with R² of 0.99 and 0.93 for the verification and test data, respectively. Important molecular descriptors determined to influence bioconcentration were molecular mass (MW), octanol-water distribution coefficient (logD), topological polar surface area (TPSA) and number of nitrogen atoms (nN) among others. Modelling of hazard criteria such as PBT, showed potential to replace the need for animal testing. However, the use of machine learning models in the regulatory context has been minimal to date and is critically discussed herein. The movement away from experimental estimations of accumulation to in silico modelling would enable rapid prioritisation of contaminants that may pose a risk to environmental health and the food chain.
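The two test-set statistics quoted in this abstract can be illustrated with a short Python sketch. It assumes the coefficient-of-determination form of R² (one minus the ratio of residual to total sum of squares); the paper may use a different definition, such as the squared correlation. The function names and sample values are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (an assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical log BCF values, for illustration only.
log_bcf_obs = [1.0, 2.0, 3.0, 4.0]
log_bcf_pred = [1.5, 2.0, 3.0, 3.5]
print(r_squared(log_bcf_obs, log_bcf_pred), rmse(log_bcf_obs, log_bcf_pred))
```

Reporting both is useful because R² is scale-free while RMSE is expressed in the units of the predicted quantity.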
Affiliation(s)
- Thomas H Miller
  - Department of Analytical, Environmental & Forensic Sciences, School of Population Health & Environmental Sciences, Faculty of Life Sciences and Medicine, King's College London, 150 Stamford Street, London SE1 9NH, UK
- Matteo D Gallidabino
  - Department of Applied Sciences, Northumbria University, Newcastle Upon Tyne NE1 8ST, UK
- James I MacRae
  - Metabolomics Laboratory, The Francis Crick Institute, 1 Midland Road, London, NW1 1AT, UK
- Stewart F Owen
  - AstraZeneca, Global Environment, Alderley Park, Macclesfield, Cheshire SK10 4TF, UK
- Nicolas R Bury
  - Division of Diabetes and Nutritional Sciences, Faculty of Life Sciences and Medicine, King's College London, Franklin Wilkins Building, 150 Stamford Street, London SE1 9NH, UK; Faculty of Science, Health and Technology, University of Suffolk, James Hehir Building, University Avenue, Ipswich, Suffolk IP3 0FS, UK
- Leon P Barron
  - Department of Analytical, Environmental & Forensic Sciences, School of Population Health & Environmental Sciences, Faculty of Life Sciences and Medicine, King's College London, 150 Stamford Street, London SE1 9NH, UK
8
Nendza M, Kühne R, Lombardo A, Strempel S, Schüürmann G. PBT assessment under REACH: Screening for low aquatic bioaccumulation with QSAR classifications based on physicochemical properties to replace BCF in vivo testing on fish. The Science of the Total Environment 2018; 616-617:97-106. [PMID: 29107783 DOI: 10.1016/j.scitotenv.2017.10.317] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 09/02/2017] [Revised: 10/30/2017] [Accepted: 10/30/2017] [Indexed: 06/07/2023]
Abstract
Aquatic bioconcentration factors (BCFs) are critical in PBT (persistent, bioaccumulative, toxic) and risk assessment of chemicals. High costs and use of more than 100 fish per standard BCF study (OECD 305) call for alternative methods to replace as much in vivo testing as possible. The BCF waiving scheme is a screening tool combining QSAR classifications based on physicochemical properties related to the distribution (hydrophobicity, ionisation), persistence (biodegradability, hydrolysis), solubility and volatility (Henry's law constant) of substances in water bodies and aquatic biota to predict substances with low aquatic bioaccumulation (nonB, BCF<2000). The BCF waiving scheme was developed with a dataset of reliable BCFs for 998 compounds and externally validated with another 181 substances. It performs with 100% sensitivity (no false negatives), >50% efficacy (waiving potential), and complies with the OECD principles for valid QSARs. The chemical applicability domain of the BCF waiving scheme is given by the structures of the training set, with some compound classes explicitly excluded like organometallics, poly- and perfluorinated compounds, aromatic triphenylphosphates, surfactants. The prediction confidence of the BCF waiving scheme is based on applicability domain compliance, consensus modelling, and the structural similarity with known nonB and B/vB substances. Compounds classified as nonB by the BCF waiving scheme are candidates for waiving of BCF in vivo testing on fish due to low concern with regard to the B criterion. The BCF waiving scheme supports the 3Rs with a possible reduction of >50% of BCF in vivo testing on fish. If the target chemical is outside the applicability domain of the BCF waiving scheme or not classified as nonB, further assessments with in silico, in vitro or in vivo methods are necessary to either confirm or reject bioaccumulative behaviour.
Affiliation(s)
- Monika Nendza
  - Analytical Laboratory AL-Luhnstedt, Bahnhofstraße 1, 24816 Luhnstedt, Germany
- Ralph Kühne
  - UFZ Department of Ecological Chemistry, Helmholtz Centre for Environmental Research, Permoserstr. 15, 04318 Leipzig, Germany
- Anna Lombardo
  - IRCCS - Istituto di Ricerche Farmacologiche "Mario Negri", Environmental Chemistry and Toxicology Laboratory, via La Masa 19, 20156 Milan, Italy
- Gerrit Schüürmann
  - UFZ Department of Ecological Chemistry, Helmholtz Centre for Environmental Research, Permoserstr. 15, 04318 Leipzig, Germany; Institute for Organic Chemistry, Technical University Bergakademie Freiberg, Leipziger Strasse 29, 09596 Freiberg, Germany
9
Pirovano A, Brandmaier S, Huijbregts MAJ, Ragas AMJ, Veltman K, Hendriks AJ. QSARs for estimating intrinsic hepatic clearance of organic chemicals in humans. Environmental Toxicology and Pharmacology 2016; 42:190-197. [PMID: 26874337 DOI: 10.1016/j.etap.2016.01.017] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 09/08/2015] [Revised: 01/19/2016] [Accepted: 01/21/2016] [Indexed: 06/05/2023]
Abstract
Quantitative structure-activity relationships (QSARs) were developed to predict the in vitro clearance (CLINT) of xenobiotics metabolised in human hepatocytes (118 compounds) and microsomes (115 compounds). Clearance values were gathered from the scientific literature and multiple linear models were built and validated selecting at most 6 predictors from a pool of over 2000 potential molecular descriptors. For the hepatocytes QSAR, the explained variance (R²adj) was 67% and the predictive ability (R²ext) was 62%. For the microsomes QSAR, R²adj was 50% and R²ext 30%. For both liver assays, the most important descriptor relates to electronic properties of the compound. Functional groups of fragments were useful to identify specific compounds that have a deviating reaction rate compared to the others, such as polychlorobiphenyls (PCBs) and organic amides which were poorly metabolised by hepatocytes and microsomes, respectively. For hepatocytes, clearance was predominantly determined by electronic characteristics, while size and shape characteristics were less important and partitioning properties were absent. This may suggest that uptake across the membrane and enzyme binding are not rate-limiting steps. Particularly for hepatocytes the QSAR statistics are encouraging, allowing application of the outcomes in in vitro to in vivo extrapolation.
Affiliation(s)
- Alessandra Pirovano
  - Radboud University Nijmegen, Institute for Wetland and Water Research, Department of Environmental Science, Heyendaalseweg 135, 6525 AJ Nijmegen, The Netherlands
- Stefan Brandmaier
  - Helmholtz-Zentrum München - German Research Centre for Environmental Health (GmbH), Research Unit of Molecular Epidemiology, Institute of Epidemiology II, Munich, Germany
- Mark A J Huijbregts
  - Radboud University Nijmegen, Institute for Wetland and Water Research, Department of Environmental Science, Heyendaalseweg 135, 6525 AJ Nijmegen, The Netherlands
- Ad M J Ragas
  - Radboud University Nijmegen, Institute for Wetland and Water Research, Department of Environmental Science, Heyendaalseweg 135, 6525 AJ Nijmegen, The Netherlands; Faculty of Management, Science and Technology, Open University, Heerlen, The Netherlands
- Karin Veltman
  - University of Michigan, School of Public Health, Department of Environmental Health Sciences, 1415 Washington Heights, Ann Arbor, MI, USA
- A Jan Hendriks
  - Radboud University Nijmegen, Institute for Wetland and Water Research, Department of Environmental Science, Heyendaalseweg 135, 6525 AJ Nijmegen, The Netherlands
10
Piir G, Sild S, Maran U. Classifying bio-concentration factor with random forest algorithm, influence of the bio-accumulative vs. non-bio-accumulative compound ratio to modelling result, and applicability domain for random forest model. SAR and QSAR in Environmental Research 2014; 25:967-81. [PMID: 25482723 DOI: 10.1080/1062936x.2014.969310] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 06/10/2014] [Accepted: 08/03/2014] [Indexed: 05/27/2023]
Abstract
In environmental risk assessment, the bio-concentration factor (BCF) is a widely used parameter in the estimation of the bio-accumulation potential of chemicals. BCF data often have an uneven distribution of classes (bio-accumulative vs. non-bio-accumulative), which could severely bias the classification results towards the prevailing class. The present study focuses on the influence of uneven distribution of the classes in training phase of Random Forest (RF) classification models. Three different training set designs were used and descriptors selected to the models based on the occurrence frequency in RF trees and considering the mechanistic aspects they reflect. Models were compared and their classification performance was analysed, indicating good predictive characteristics (sensitivity = 0.90 and specificity = 0.83) for the balanced set; also imbalanced sets have their strengths in certain application scenarios. The confidence of classifications was assessed with a new schema for the applicability domain that makes use of the RF proximity matrix by analysing the similarity between the predicted compound and the training set of the model. All developed models were made available in the transparent, accessible and reproducible way in QsarDB repository (http://dx.doi.org/10.15152/QDB.116).
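The sensitivity and specificity figures in this abstract are the per-class recall values of a binary classifier, which is why they are informative when the bio-accumulative and non-bio-accumulative classes are unevenly represented. A minimal Python sketch of the calculation follows; the function name and the example labels are hypothetical and do not come from the study.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels.

    Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical labels (1 = bio-accumulative), for illustration only.
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
preds = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
print(sensitivity_specificity(truth, preds))
```

Unlike overall accuracy, neither value changes when the class ratio of the evaluation set changes, provided the classifier's behaviour on each class stays the same.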
Affiliation(s)
- G Piir
  - Institute of Chemistry, University of Tartu, Tartu, Estonia
11
Nendza M, Gabbert S, Kühne R, Lombardo A, Roncaglioni A, Benfenati E, Benigni R, Bossa C, Strempel S, Scheringer M, Fernández A, Rallo R, Giralt F, Dimitrov S, Mekenyan O, Bringezu F, Schüürmann G. A comparative survey of chemistry-driven in silico methods to identify hazardous substances under REACH. Regul Toxicol Pharmacol 2013; 66:301-14. [DOI: 10.1016/j.yrtph.2013.05.007] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.3] [Received: 11/19/2012] [Revised: 05/09/2013] [Accepted: 05/11/2013] [Indexed: 11/29/2022]