1
Hamidi H, Hamidi S, Vaez H. A quantitative structure–mobility relationship of organic acids using solvation parameters. J Liq Chromatogr Relat Technol 2017. [DOI: 10.1080/10826076.2017.1398171] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 12/20/2022]
Affiliation(s)
- Hossein Hamidi
- Department of Control Engineering, Faculty of Electrical and Computer Engineering, University of Tabriz, Tabriz, Iran
- Samin Hamidi
- Food and Drug Safety Research Center, Tabriz University of Medical Sciences, Tabriz, Iran
- Haleh Vaez
- Department of Pharmacology, Faculty of Pharmacy, Tabriz University of Medical Sciences, Tabriz, Iran
2
Gunturi SB, Ramamurthi N. A novel approach to generate robust classification models to predict developmental toxicity from imbalanced datasets. SAR QSAR Environ Res 2014; 25:711-727. [PMID: 25102768 DOI: 10.1080/1062936x.2014.942357] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Indexed: 06/03/2023]
Abstract
Computational models to predict the developmental toxicity of compounds are built on imbalanced datasets wherein the toxicants outnumber the non-toxicants. Consequently, the results are biased towards the majority class (toxicants). To overcome this problem and to obtain sensitive but also accurate classifiers, we followed an integrated approach wherein (i) the Synthetic Minority Over-sampling Technique (SMOTE) is used for re-sampling, (ii) a genetic algorithm (GA) is used for variable selection and (iii) support vector machines (SVM) are used for model development. The best model, M3, has (i) sensitivity (SE) = 85.54% and specificity (SP) = 85.62% in leave-one-out validation, (ii) classification accuracy of the training set = 99.67%, (iii) classification accuracy of the test set = 92.59%, and (iv) sensitivity = 92.68% and specificity = 92.31% on the test set. Consensus prediction based on models M3-M5 improved these percentages by 5% over M3. From the analysis of results we infer that data imbalance in toxicity studies can be effectively addressed by the application of re-sampling techniques.
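The abstract above reports sensitivity (SE) and specificity (SP) rather than accuracy alone, because accuracy can be misleading on imbalanced data. As a minimal sketch of how these two figures are usually derived from predictions, assuming the standard definitions SE = TP / (TP + FN) and SP = TN / (TN + FP): the function name `sensitivity_specificity` and the label vectors are hypothetical illustrations, not data or code from the cited study.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels.

    Hypothetical helper for illustration; `positive` marks the class
    treated as positive (e.g. toxicant).
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # positive correctly flagged
            else:
                fn += 1  # positive missed
        else:
            if pred == positive:
                fp += 1  # negative wrongly flagged
            else:
                tn += 1  # negative correctly passed
    se = tp / (tp + fn) if (tp + fn) else float("nan")
    sp = tn / (tn + fp) if (tn + fp) else float("nan")
    return se, sp


# Made-up labels: 4 positives and 4 negatives, one error in each class.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]
se, sp = sensitivity_specificity(y_true, y_pred)
print(f"SE = {se:.2%}, SP = {sp:.2%}")
```

Reporting both values per class makes it visible when a classifier favours the majority class, which is the bias the abstract describes.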
Affiliation(s)
- S B Gunturi
- Innovation Labs Hyderabad, Tata Consultancy Services Limited, Madhapur, Hyderabad, India
3
Cao DS, Liang YZ, Yan J, Tan GS, Xu QS, Liu S. PyDPI: Freely Available Python Package for Chemoinformatics, Bioinformatics, and Chemogenomics Studies. J Chem Inf Model 2013; 53:3086-96. [PMID: 24047419 DOI: 10.1021/ci400127q] [Citation(s) in RCA: 60] [Impact Index Per Article: 5.5] [Indexed: 12/20/2022]
Affiliation(s)
- Dong-Sheng Cao
- School of Pharmaceutical Sciences, Central South University, Changsha 410013, P.R. China
- Gui-Shan Tan
- School of Pharmaceutical Sciences, Central South University, Changsha 410013, P.R. China
- Shao Liu
- Xiangya Hospital, Central South University, Changsha 410008, P.R. China
4
Electrocardiographic signals and swarm-based support vector machine for hypoglycemia detection. Ann Biomed Eng 2011; 40:934-45. [PMID: 22012087 DOI: 10.1007/s10439-011-0446-7] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.7] [Received: 05/08/2011] [Accepted: 10/11/2011] [Indexed: 10/16/2022]
Abstract
Cardiac arrhythmia relating to hypoglycemia is suggested as a cause of death in diabetic patients. This article introduces electrocardiographic (ECG) parameters for artificially induced hypoglycemia detection. In addition, a hybrid technique of swarm-based support vector machine (SVM) is introduced for hypoglycemia detection using the ECG parameters as inputs. In this technique, particle swarm optimization (PSO) is used to optimize the SVM for hypoglycemia detection. In an experiment using medical data of patients with Type 1 diabetes, the introduced ECG parameters show significant contributions to the performance of the hypoglycemia detection and the proposed detection technique performs well in terms of sensitivity and specificity.
5
Sugimoto M, Hirayama A, Robert M, Abe S, Soga T, Tomita M. Prediction of metabolite identity from accurate mass, migration time prediction and isotopic pattern information in CE-TOFMS data. Electrophoresis 2010; 31:2311-8. [PMID: 20568260 DOI: 10.1002/elps.200900584] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.8] [Indexed: 11/09/2022]
Abstract
CE-TOFMS is a powerful method for profiling charged metabolites. However, the limited availability of metabolite standards hinders the process of identifying compounds from detected features in CE-TOFMS data sets. To overcome this problem, we developed a method to identify unknown peaks based on the predicted migration time (t(m)) and accurate m/z values. We developed a predictive model using 375 standard cationic metabolites and support vector regression. The model yielded good correlations between the predicted and measured t(m) (R=0.952 and 0.905 using complete and cross-validation data sets, respectively). Using the trained model, we subsequently predicted the t(m) for 2938 metabolites available from the public databases and assigned tentative identities to noise-filtered features in human urine samples. While 38.9% of the peaks were assigned metabolite names by matching with the standard library alone, the proportion increased to 52.2%. The proposed methodology increases the value of metabolomic data sets obtained from CE-TOFMS profiling.
Affiliation(s)
- Masahiro Sugimoto
- Institute for Advanced Biosciences, Keio University, Yamagata, Japan.
6
Luan F, Liu HT, Wen YY, Zhang XY. Classification of the fragrance properties of chemical compounds based on support vector machine and linear discriminant analysis. Flavour Fragr J 2008. [DOI: 10.1002/ffj.1876] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 01/04/2023]
7
Considerations and recent advances in QSAR models for cytochrome P450-mediated drug metabolism prediction. J Comput Aided Mol Des 2008; 22:843-55. [DOI: 10.1007/s10822-008-9225-4] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.3] [Received: 12/19/2007] [Accepted: 06/08/2008] [Indexed: 02/07/2023]
8
Luan F, Liu HT, Ma WP, Fan BT. Classification of estrogen receptor-β ligands on the basis of their binding affinities using support vector machine and linear discriminant analysis. Eur J Med Chem 2008; 43:43-52. [PMID: 17459530 DOI: 10.1016/j.ejmech.2007.03.002] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Received: 09/04/2006] [Revised: 03/03/2007] [Accepted: 03/06/2007] [Indexed: 01/22/2023]
Abstract
Classification models of estrogen receptor-beta ligands were proposed using linear and nonlinear methods. The data set was divided into active and inactive classes on the basis of binding affinity. The two-class problem (active, inactive) was first explored with a linear classifier, linear discriminant analysis (LDA). To obtain a more accurate prediction model, a nonlinear machine learning technique, the support vector machine (SVM), was subsequently applied. The heuristic method (HM) was used to pre-select descriptors from the whole descriptor set. The eight-descriptor model built by SVM showed better predictive ability than LDA. The prediction accuracies for the training, test and overall data sets are 92.9%, 85.8% and 91.4% for SVM and 83.1%, 76.1% and 81.9% for LDA, respectively. The results indicate that SVM can be used as a powerful modeling tool for QSAR studies.
Affiliation(s)
- F Luan
- Department of Applied Chemistry, Yantai University, Yantai, Shandong 264005, PR China.
9
Zheng G, Xiao M, Lu X. Quantitative Structure–Activity Relationships Study on the Ah Receptor Binding Affinities of Polybrominated Diphenyl Ethers Using a Support Vector Machine. QSAR Comb Sci 2007. [DOI: 10.1002/qsar.200610078] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Indexed: 01/04/2023]
10
Wang J, Liu H, Qin S, Yao X, Liu M, Hu Z, Fan B. Study on the Structure-Activity Relationship of New Anti-HIV Nucleoside Derivatives Based on the Support Vector Machine Method. QSAR Comb Sci 2007. [DOI: 10.1002/qsar.200510166] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/11/2022]
11
Yuan S, Xiao M, Zheng G, Tian M, Lu X. Quantitative structure-property relationship studies on electrochemical degradation of substituted phenols using a support vector machine. SAR QSAR Environ Res 2006; 17:473-81. [PMID: 17050187 DOI: 10.1080/10629360600934044] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Indexed: 05/12/2023]
Abstract
A quantitative structure-property relationship (QSPR) model has been developed for the electrochemical degradation of substituted phenols using a support vector machine (SVM). Thirty descriptors, including quantum chemical parameters, steric effect descriptors and half wave potential (E1/2), were used to describe twelve substituted phenols, including mono- and multi-substituent phenols. A leave-one-out (LOO) cross validation procedure resulted in the selection of three descriptors: the total of electron and nuclear energies of the two-center terms for the carbon-chlorine or carbon-nitrogen bond (TE2), the net atomic charges on the chlorine or nitrogen (qx), and the largest negative atomic charge on an atom (q-). The model based on SVM yielded a Q2 value of 0.892, indicating a high predictive ability. Compared with models developed with partial least squares (PLS) and multiple linear regression (MLR), for which Q2 was 0.804 and 0.799, respectively, SVM performed better.
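The Q2 statistic quoted above comes from leave-one-out (LOO) cross-validation. The sketch below illustrates one common definition, Q2 = 1 - PRESS / SS_tot, where PRESS is the sum of squared errors on the held-out points and SS_tot uses the mean of all observed values (some authors use the training-fold mean instead). It deliberately swaps in a one-descriptor least-squares line for the SVM used in the study, and the helper names `fit_line` and `loo_q2` and all data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept (one descriptor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_q2(xs, ys):
    """Leave-one-out Q2 = 1 - PRESS / SS_tot (assumed definition, see lead-in)."""
    press = 0.0
    for i in range(len(xs)):
        # Refit the model with sample i held out, then predict sample i.
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_tot


# Synthetic descriptor/property values, for illustration only.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
noisy_property = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
print(round(loo_q2(descriptor, noisy_property), 3))
```

Because every prediction is made on a sample the model has not seen, Q2 is normally lower than the fitted R2, which is why it is read as a measure of predictive ability rather than goodness of fit.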
Affiliation(s)
- S Yuan
- Environmental Science Research Institute, Huazhong University of Science and Technology, Wuhan, Hubei 430074, PR China
12
Luan F, Ma W, Zhang X, Zhang H, Liu M, Hu Z, Fan BT. Quantitative structure-activity relationship models for prediction of sensory irritants (logRD50) of volatile organic chemicals. Chemosphere 2006; 63:1142-53. [PMID: 16307788 DOI: 10.1016/j.chemosphere.2005.09.053] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Received: 06/24/2005] [Revised: 09/14/2005] [Accepted: 09/14/2005] [Indexed: 05/05/2023]
Abstract
Quantitative classification and regression models for prediction of sensory irritants (logRD50) of volatile organic chemicals (VOCs) have been developed. Each compound was represented by calculated structural descriptors encoding constitutional, topological, geometrical, electrostatic, and quantum-chemical features. The heuristic method (HM) was then used to search the descriptor space and select the descriptors responsible for activity. The best classification results were found using support vector machine (SVM): the accuracy for the training, test and overall data sets is 96.5%, 85.7% and 94.4%, respectively. The nonlinear regression models were built by radial basis function neural networks (RBFNN) and SVM. The root mean squared (RMS) errors in prediction for the training, test and overall data sets by RBFNN are 0.4755, 0.6322 and 0.5009 for the reactive group and 0.2430, 0.4798 and 0.3064 for the nonreactive group. The corresponding results obtained by SVM are 0.4415, 0.7430 and 0.5140 for the reactive group and 0.3920, 0.4520 and 0.4050 for the nonreactive group. This paper proposes an effective method for the screening and consideration of poisonous chemicals.
Affiliation(s)
- Feng Luan
- Department of Chemistry, Lanzhou University, Lanzhou 730000, China
13
Ma W, Luan F, Zhang H, Zhang X, Liu M, Hu Z, Fan B. Quantitative structure–property relationships for pesticides in biopartitioning micellar chromatography. J Chromatogr A 2006; 1113:140-7. [PMID: 16490199 DOI: 10.1016/j.chroma.2006.01.136] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.3] [Received: 09/18/2005] [Revised: 12/19/2005] [Accepted: 01/31/2006] [Indexed: 01/04/2023]
Abstract
The retention factor (log k) in the biopartitioning micellar chromatography (BMC) of 79 heterogeneous pesticides was studied using the quantitative structure-property relationship (QSPR) method. The heuristic method (HM) and the support vector machine (SVM) method were used to build linear and nonlinear models, respectively. Comparing the results of the two methods, those obtained by the SVM model are much better. For the test set, a predictive correlation coefficient (R) of 0.9755 and a root-mean-square (RMS) error of 0.1403 were obtained. The proposed QSPR models, both by HM and SVM, contain the same descriptors, which agree with the classical Abraham parameters of well-known linear solvation energy relationships (LSER).
Affiliation(s)
- Weiping Ma
- Department of Chemistry, Lanzhou University, Lanzhou 730000, Gansu, PR China
14
Xue C, Yao X, Liu H, Liu M, Hu Z, Fan B. Development of migration models for acids in capillary electrophoresis using heuristic and radial basis function neural network methods. Electrophoresis 2005; 26:2154-64. [PMID: 15852353 DOI: 10.1002/elps.200410175] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/09/2022]
Abstract
A quantitative structure-mobility relationship (QSMR) was developed for the absolute mobilities of a diverse set of 277 organic and inorganic acids in capillary electrophoresis based on descriptors calculated from the structure alone. The heuristic method (HM) and radial basis function neural networks (RBFNN) were used to construct the linear and nonlinear prediction models, respectively. The prediction results were in agreement with the experimental values. The HM model gave a root-mean-square (RMS) error of 3.66 electrophoretic mobility units for the training set, 4.67 for the test set, and 3.88 for the whole data set, while the RBFNN gave RMS errors of 2.49, 3.19, and 2.65, respectively. The heuristic linear model could give some insights into the factors that are likely to govern the mobilities of the compounds; however, the prediction results of the RBFNN model seem to be better than those of the HM.
Affiliation(s)
- Chunxia Xue
- Department of Chemistry, Lanzhou University, Lanzhou, China
15
Luan F, Zhang R, Yao X, Liu M, Hu Z, Fan B. Support Vector Machine-based QSPR for the Prediction of Van der Waals' Constants. QSAR Comb Sci 2005. [DOI: 10.1002/qsar.200430890] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Indexed: 11/10/2022]
16
Cytochrome P450 Classification of Drugs with Support Vector Machines Implementing the Nearest Point Algorithm. 2004. [DOI: 10.1007/978-3-540-30478-4_17] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1]