Reference Citation Analysis

For: Hua S, Sun Z. Support vector machine approach for protein subcellular localization prediction. Bioinformatics 2001;17:721-8. [PMID: 11524373 DOI: 10.1093/bioinformatics/17.8.721] [Citation(s) in RCA: 479] [Impact Index Per Article: 20.0] [Indexed: 11/12/2022]

Number

Cited by Other Article(s)

401

Dönnes P, Höglund A. Predicting protein subcellular localization: past, present, and future. Genomics Proteomics & Bioinformatics 2005;2:209-15. [PMID: 15901249 PMCID: PMC5187447 DOI: 10.1016/s1672-0229(04)02027-3] [Citation(s) in RCA: 78] [Impact Index Per Article: 3.9] [Indexed: 11/04/2022]

402

Guda C, Subramaniam S. pTARGET [corrected] a new method for predicting protein subcellular localization in eukaryotes. Bioinformatics 2005;21:3963-9. [PMID: 16144808 DOI: 10.1093/bioinformatics/bti650] [Citation(s) in RCA: 77] [Impact Index Per Article: 3.9] [Indexed: 11/15/2022]

403

Sharabiani MTA, Siermala M, Lehtinen TO, Vihinen M. Dynamic covariation between gene expression and proteome characteristics. BMC Bioinformatics 2005;6:215. [PMID: 16131395 PMCID: PMC1236912 DOI: 10.1186/1471-2105-6-215] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Received: 09/30/2004] [Accepted: 08/30/2005] [Indexed: 02/07/2023]

404

Kulkarni OC, Vigneshwar R, Jayaraman VK, Kulkarni BD. Identification of coding and non-coding sequences using local Holder exponent formalism. Bioinformatics 2005;21:3818-23. [PMID: 16118261 DOI: 10.1093/bioinformatics/bti639] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Indexed: 11/13/2022]

405

Bhasin M, Raghava GPS. GPCRsclass: a web tool for the classification of amine type of G-protein-coupled receptors. Nucleic Acids Res 2005;33:W143-7. [PMID: 15980444 PMCID: PMC1160112 DOI: 10.1093/nar/gki351] [Citation(s) in RCA: 37] [Impact Index Per Article: 1.9] [Indexed: 11/24/2022]

406

Xie D, Li A, Wang M, Fan Z, Feng H. LOCSVMPSI: a web server for subcellular localization of eukaryotic proteins using SVM and profile of PSI-BLAST. Nucleic Acids Res 2005;33:W105-10. [PMID: 15980436 PMCID: PMC1160120 DOI: 10.1093/nar/gki359] [Citation(s) in RCA: 119] [Impact Index Per Article: 6.0] [Indexed: 11/26/2022]

407

Stochastic molecular descriptors for polymers. 3. Markov electrostatic moments as polymer 2D-folding descriptors: RNA–QSAR for mycobacterial promoters. Polymer 2005. [DOI: 10.1016/j.polymer.2005.04.104] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Indexed: 11/18/2022]

408

Cai YD, Chou KC. Predicting membrane protein type by functional domain composition and pseudo-amino acid composition. J Theor Biol 2005;238:395-400. [PMID: 16040052 DOI: 10.1016/j.jtbi.2005.05.035] [Citation(s) in RCA: 72] [Impact Index Per Article: 3.6] [Received: 04/11/2005] [Revised: 05/25/2005] [Accepted: 05/26/2005] [Indexed: 10/25/2022]

409

González-Díaz H, Molina R, Uriarte E. Recognition of stable protein mutants with 3D stochastic average electrostatic potentials. FEBS Lett 2005;579:4297-301. [PMID: 16081074 DOI: 10.1016/j.febslet.2005.06.065] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.0] [Received: 11/03/2004] [Revised: 06/07/2005] [Accepted: 06/23/2005] [Indexed: 11/15/2022]

410

Wang J, Sung WK, Krishnan A, Li KB. Protein subcellular localization prediction for Gram-negative bacteria using amino acid subalphabets and a combination of multiple support vector machines. BMC Bioinformatics 2005;6:174. [PMID: 16011808 PMCID: PMC1190155 DOI: 10.1186/1471-2105-6-174] [Citation(s) in RCA: 63] [Impact Index Per Article: 3.2] [Received: 02/14/2005] [Accepted: 07/13/2005] [Indexed: 11/10/2022]

411

van Diepen MT, Spencer GE, van Minnen J, Gouwenberg Y, Bouwman J, Smit AB, van Kesteren RE. The molluscan RING-finger protein L-TRIM is essential for neuronal outgrowth. Mol Cell Neurosci 2005;29:74-81. [PMID: 15866048 DOI: 10.1016/j.mcn.2005.01.005] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.0] [Received: 12/24/2004] [Accepted: 01/17/2005] [Indexed: 01/23/2023]

412

Huang N, Chen H, Sun Z. CTKPred: an SVM-based method for the prediction and classification of the cytokine superfamily. Protein Eng Des Sel 2005;18:365-8. [PMID: 15980017 DOI: 10.1093/protein/gzi041] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Indexed: 11/13/2022]

413

Sarda D, Chua GH, Li KB, Krishnan A. pSLIP: SVM based protein subcellular localization prediction using multiple physicochemical properties. BMC Bioinformatics 2005;6:152. [PMID: 15963230 PMCID: PMC1182350 DOI: 10.1186/1471-2105-6-152] [Citation(s) in RCA: 70] [Impact Index Per Article: 3.5] [Received: 12/01/2004] [Accepted: 06/17/2005] [Indexed: 12/04/2022]

414

Saíz-Urra L, González-Díaz H, Uriarte E. Proteins Markovian 3D-QSAR with spherically-truncated average electrostatic potentials. Bioorg Med Chem 2005;13:3641-7. [PMID: 15862992 DOI: 10.1016/j.bmc.2005.03.041] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.3] [Received: 12/18/2004] [Revised: 03/16/2005] [Accepted: 03/21/2005] [Indexed: 11/17/2022]

Abstract

Proteins 3D-QSAR is an emerging field of bioorganic chemistry. However, the large dimensions of the structures to be handled may become a bottleneck to scaling up classic QSAR problems for proteins. In this sense, truncation approach could be used as in molecular dynamic to perform timely calculations. The spherical truncation of electrostatic field with different functions breaks down long-range interactions at a given cutoff distance (r(off)) resulting in short-range ones. Consequently, a Markov chain model may approach to the average electrostatic potentials of spatial distribution of charges within the protein backbone. These average electrostatic potentials can be used to predict proteins properties. Herein, we explore the effect of abrupt, shifting, force shifting, and switching truncation functions on 3D-QSAR models classifying 26 proteins with different functions: lysozymes, dihydrofolate reductases, and alcohol dehydrogenases. Almost all methods have shown overall accuracies higher than 73%. The present result points to an acceptable robustness of the MC for different truncation schemes and r(off) values. The results of best accuracy 92% with abrupt truncation coincide with our recent communication. We also developed models with the same accuracy value for other truncation functions; however they are more complex functions. PCA analysis for 152 non-homologous proteins has shown that there are five main eigenvalues, which explain more than 87% of the variance of the studied properties. The present molecular descriptors may encode structural information not totally accounted for the previous ones, so success with these descriptors could be expected when classic fails. The present result confirms the utility of our Markov models combined with truncation approach to generate bioorganic structure protein molecular descriptors for QSAR.

415

Gao QB, Wang ZZ, Yan C, Du YH. Prediction of protein subcellular location using a combined feature of sequence. FEBS Lett 2005;579:3444-8. [PMID: 15949806 DOI: 10.1016/j.febslet.2005.05.021] [Citation(s) in RCA: 56] [Impact Index Per Article: 2.8] [Received: 04/09/2005] [Revised: 05/10/2005] [Accepted: 05/10/2005] [Indexed: 11/20/2022]

416

Bodén M, Hawkins J. Prediction of subcellular localization using sequence-biased recurrent networks. Bioinformatics 2005;21:2279-86. [PMID: 15746276 DOI: 10.1093/bioinformatics/bti372] [Citation(s) in RCA: 112] [Impact Index Per Article: 5.6] [Indexed: 11/12/2022]

417

Rey S, Acab M, Gardy JL, Laird MR, deFays K, Lambert C, Brinkman FSL. PSORTdb: a protein subcellular localization database for bacteria. Nucleic Acids Res 2005;33:D164-8. [PMID: 15608169 PMCID: PMC539981 DOI: 10.1093/nar/gki027] [Citation(s) in RCA: 100] [Impact Index Per Article: 5.0] [Indexed: 11/17/2022]

418

Chou KC, Cai YD. Using GO-PseAA predictor to identify membrane proteins and their types. Biochem Biophys Res Commun 2005;327:845-7. [PMID: 15649422 DOI: 10.1016/j.bbrc.2004.12.069] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.3] [Received: 12/04/2004] [Indexed: 11/21/2022]

419

Huang SW, Hwang JK. Computation of conformational entropy from protein sequences using the machine-learning method-Application to the study of the relationship between structural conservation and local structural stability. Proteins 2005;59:802-9. [PMID: 15828008 DOI: 10.1002/prot.20462] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 11/11/2022]

420

Lu Z, Hunter L. GO molecular function terms are predictive of subcellular localization. Pacific Symposium on Biocomputing 2005:151-61. [PMID: 15759622 PMCID: PMC2652875 DOI: 10.1142/9789812702456_0015] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.9] [Indexed: 11/18/2022]

421

Wang ML, Yao H, Xu WB. Prediction by support vector machines and analysis by Z-score of poly-l-proline type II conformation based on local sequence. Comput Biol Chem 2005;29:95-100. [PMID: 15833437 DOI: 10.1016/j.compbiolchem.2005.02.002] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Received: 09/15/2004] [Revised: 01/08/2005] [Accepted: 02/18/2005] [Indexed: 11/25/2022]

422

Huang J, Shi F. Support vector machines for predicting apoptosis proteins types. Acta Biotheor 2005;53:39-47. [PMID: 15906142 DOI: 10.1007/s10441-005-7002-5] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.8] [Received: 01/15/2004] [Revised: 05/17/2004] [Accepted: 10/07/2004] [Indexed: 10/25/2022]

423

Zheng L, Yang J, Landwehr C, Fan F, Ji Y. Identification of an essential glycoprotease in Staphylococcus aureus. FEMS Microbiol Lett 2005;245:279-85. [PMID: 15837383 DOI: 10.1016/j.femsle.2005.03.017] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Received: 01/24/2005] [Revised: 03/11/2005] [Accepted: 03/13/2005] [Indexed: 11/19/2022]

424

Nair R, Rost B. Mimicking Cellular Sorting Improves Prediction of Subcellular Localization. J Mol Biol 2005;348:85-100. [PMID: 15808855 DOI: 10.1016/j.jmb.2005.02.025] [Citation(s) in RCA: 219] [Impact Index Per Article: 11.0] [Received: 12/27/2004] [Revised: 02/08/2005] [Accepted: 02/09/2005] [Indexed: 11/24/2022]

425

González-Díaz H, Saíz-Urra L, Molina R, Uriarte E. Stochastic molecular descriptors for polymers. 2. Spherical truncation of electrostatic interactions on entropy based polymers 3D-QSAR. Polymer 2005. [DOI: 10.1016/j.polymer.2005.01.066] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.3] [Indexed: 11/30/2022]

426

Chou KC, Cai YD. Prediction of Membrane Protein Types by Incorporating Amphipathic Effects. J Chem Inf Model 2005;45:407-13. [PMID: 15807506 DOI: 10.1021/ci049686v] [Citation(s) in RCA: 147] [Impact Index Per Article: 7.4] [Indexed: 11/30/2022]

427

Drabkin HJ, Hollenbeck C, Hill DP, Blake JA. Ontological visualization of protein-protein interactions. BMC Bioinformatics 2005;6:29. [PMID: 15707487 PMCID: PMC550656 DOI: 10.1186/1471-2105-6-29] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Received: 12/09/2004] [Accepted: 02/11/2005] [Indexed: 11/10/2022]

428

Bhasin M, Garg A, Raghava GPS. PSLpred: prediction of subcellular localization of bacterial proteins. Bioinformatics 2005;21:2522-4. [PMID: 15699023 DOI: 10.1093/bioinformatics/bti309] [Citation(s) in RCA: 168] [Impact Index Per Article: 8.4] [Indexed: 11/12/2022]

429

González-Díaz H, Cruz-Monteagudo M, Molina R, Tenorio E, Uriarte E. Predicting multiple drugs side effects with a general drug-target interaction thermodynamic Markov model. Bioorg Med Chem 2005;13:1119-29. [PMID: 15670920 DOI: 10.1016/j.bmc.2004.11.030] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.6] [Received: 11/04/2004] [Revised: 11/09/2004] [Accepted: 11/12/2004] [Indexed: 10/26/2022]

Abstract

Most of present molecular descriptors just consider the molecular structure. In the present article we pretend extending the use of Markov chain models to define novel molecular descriptors, which consider in addition to molecular structure other parameters like target site or toxic effect. Specifically, this molecular descriptor takes into consideration not only the molecular structure but the specific system the drug affects too. Herein, it is developed a general Markov model that describes 39 different drugs side effects grouped in 11 affected systems for 301 drugs, being 686 cases finally. The data was processed by linear discriminant analysis (LDA) classifying drugs according to their specific side effects, forward stepwise was fixed as strategy for variables selection. The average percentage of good classification and number of compounds used in the training/predicting sets were 100/100% for systemic phenomena (47 out of 47)/(12 out of 12) and metabolic (18 out of 18)/(5 out of 5), muscular-skeletal (23 out of 23)/(6 out of 6) and neurological manifestations (33 out of 33)/(8 out of 8); 97.6/96.7% for cardiovascular manifestation (122 out of 125)/(30 out of 31); 97.1/97.5% for breathing manifestations (34 out of 35)/(8 out of 9); 97/99.4% for gastrointestinal manifestations (159 out of 164)/(40 out of 41); 97/95% for endocrine manifestations (32 out of 33)/(7 out of 8); 96.4/94.6% for psychiatric manifestations (53 out of 55)/(13 out of 14); 95.1/99.1% for hematological manifestations (98 out of 103)/(25 out of 26) and 88/92.3% for dermal manifestations (44 out of 50)/(12 out of 13). In addition, we report preliminary experimental reversible decrease of lymphocytes differential count after administration of the antibacterial drug G-1 in mice, which coincide with a posterior probability (P%=74.91) predicted by the model. 
This article develops a model that encompasses a large number of side effects grouped in specific organ systems in a single stochastic framework for the first time.

430

de Armas RR, Díaz HG, Molina R, Uriarte E. Stochastic-based descriptors studying biopolymers biological properties: Extended MARCH-INSIDE methodology describing antibacterial activity of lactoferricin derivatives. Biopolymers 2005;77:247-56. [PMID: 15682438 DOI: 10.1002/bip.20202] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.8] [Indexed: 11/08/2022]

431

Garg A, Bhasin M, Raghava GPS. Support vector machine-based method for subcellular localization of human proteins using amino acid compositions, their order, and similarity search. J Biol Chem 2005;280:14427-32. [PMID: 15647269 DOI: 10.1074/jbc.m411789200] [Citation(s) in RCA: 153] [Impact Index Per Article: 7.7] [Indexed: 11/06/2022]

432

Predicting Subcellular Localization of Proteins Using Support Vector Machine with N-Terminal Amino Composition. ACTA ACUST UNITED AC 2005. [DOI: 10.1007/11527503_73] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1]

433

Wang L, Chen K, Ong YS. Bio-kernel Self-organizing Map for HIV Drug Resistance Classification. Lecture Notes in Computer Science 2005. [PMCID: PMC7122014 DOI: 10.1007/11539087_20] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/28/2022]

434

González-Díaz H, Uriarte E, Ramos de Armas R. Predicting stability of Arc repressor mutants with protein stochastic moments. Bioorg Med Chem 2005;13:323-31. [PMID: 15598555 DOI: 10.1016/j.bmc.2004.10.024] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.0] [Received: 09/15/2004] [Revised: 10/08/2004] [Accepted: 10/09/2004] [Indexed: 11/18/2022]

435

Heazlewood JL, Millar AH. AMPDB: the Arabidopsis Mitochondrial Protein Database. Nucleic Acids Res 2005;33:D605-10. [PMID: 15608271 PMCID: PMC540002 DOI: 10.1093/nar/gki048] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.2] [Received: 08/12/2004] [Revised: 09/30/2004] [Accepted: 09/30/2004] [Indexed: 12/02/2022]

436

Voting Fuzzy k-NN to Predict Protein Subcellular Localization from Normalized Amino Acid Pair Compositions. ACTA ACUST UNITED AC 2005. [DOI: 10.1007/11430919_23] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1]

437

Millar AH, Heazlewood JL, Kristensen BK, Braun HP, Møller IM. The plant mitochondrial proteome. Trends in Plant Science 2005;10:36-43. [PMID: 15642522 DOI: 10.1016/j.tplants.2004.12.002] [Citation(s) in RCA: 127] [Impact Index Per Article: 6.4] [Indexed: 05/08/2023]

438

Yu CS, Lin CJ, Hwang JK. Predicting subcellular localization of proteins for Gram-negative bacteria by support vector machines based on n-peptide compositions. Protein Sci 2004;13:1402-6. [PMID: 15096640 PMCID: PMC2286765 DOI: 10.1110/ps.03479604] [Citation(s) in RCA: 628] [Impact Index Per Article: 29.9] [Indexed: 10/26/2022]

439

Chou KC, Cai YD. Using GO-PseAA predictor to predict enzyme sub-class. Biochem Biophys Res Commun 2004;325:506-9. [PMID: 15530421 DOI: 10.1016/j.bbrc.2004.10.058] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.7] [Received: 09/30/2004] [Indexed: 11/25/2022]

440

Collier N, Takeuchi K. Comparison of character-level and part of speech features for name recognition in biomedical texts. J Biomed Inform 2004;37:423-35. [PMID: 15542016 DOI: 10.1016/j.jbi.2004.08.008] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Received: 07/23/2004] [Indexed: 10/26/2022]

Abstract

The immense volume of data which is now available from experiments in molecular biology has led to an explosion in reported results most of which are available only in unstructured text format. For this reason there has been great interest in the task of text mining to aid in fact extraction, document screening, citation analysis, and linkage with large gene and gene-product databases. In particular there has been an intensive investigation into the named entity (NE) task as a core technology in all of these tasks which has been driven by the availability of high volume training sets such as the GENIA v3.02 corpus. Despite such large training sets accuracy for biology NE has proven to be consistently far below the high levels of performance in the news domain where F scores above 90 are commonly reported which can be considered near to human performance. We argue that it is crucial that more rigorous analysis of the factors that contribute to the model's performance be applied to discover where the underlying limitations are and what our future research direction should be. Our investigation in this paper reports on variations of two widely used feature types, part of speech (POS) tags and character-level orthographic features, and makes a comparison of how these variations influence performance. We base our experiments on a proven state-of-the-art model, support vector machines using a high quality subset of 100 annotated MEDLINE abstracts. Experiments reveal that the best performing features are orthographic features with F score of 72.6. Although the Brill tagger trained in-domain on the GENIA v3.02p POS corpus gives the best overall performance of any POS tagger, at an F score of 68.6, this is still significantly below the orthographic features. In combination these two features types appear to interfere with each other and degrade performance slightly to an F score of 72.3.

441

Nucleic Acid Quadratic Indices of the “Macromolecular Graph’s Nucleotides Adjacency Matrix”. Modeling of Footprints after the Interaction of Paromomycin with the HIV-1 Ψ-RNA Packaging Region. Int J Mol Sci 2004. [DOI: 10.3390/i5110276] [Citation(s) in RCA: 37] [Impact Index Per Article: 1.8] [Indexed: 11/17/2022]

442

Scott MS, Thomas DY, Hallett MT. Predicting subcellular localization via protein motif co-occurrence. Genome Res 2004;14:1957-66. [PMID: 15466294 PMCID: PMC524420 DOI: 10.1101/gr.2650004] [Citation(s) in RCA: 84] [Impact Index Per Article: 4.0] [Indexed: 12/20/2022]

443

Jiang-Ning S, Wei-Jiang L, Wen-Bo X. Cooperativity of the oxidization of cysteines in globular proteins. J Theor Biol 2004;231:85-95. [PMID: 15363931 DOI: 10.1016/j.jtbi.2004.06.002] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.2] [Received: 11/25/2003] [Revised: 06/01/2004] [Accepted: 06/07/2004] [Indexed: 11/17/2022]

444

Reczko M, Hatzigeorgiou A. Prediction of the subcellular localization of eukaryotic proteins using sequence signals and composition. Proteomics 2004;4:1591-6. [PMID: 15174129 DOI: 10.1002/pmic.200300769] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.8] [Indexed: 12/11/2022]

445

Chou KC, Cai YD. Predicting protein localization in budding yeast. Bioinformatics 2004;21:944-50. [PMID: 15513989 DOI: 10.1093/bioinformatics/bti104] [Citation(s) in RCA: 75] [Impact Index Per Article: 3.6] [Indexed: 11/13/2022]

Abstract

MOTIVATION

Most of the existing methods in predicting protein subcellular location were used to deal with the cases limited within the scope from two to five localizations, and only a few of them can be effectively extended to cover the cases of 12-14 localizations. This is because the more the locations involved are, the poorer the success rate would be. Besides, some proteins may occur in several different subcellular locations, i.e. bear the feature of 'multiplex locations'. So far there is no method that can be used to effectively treat the difficult multiplex location problem. The present study was initiated in an attempt to address (1) how to efficiently identify the localization of a query protein among many possible subcellular locations, and (2) how to deal with the case of multiplex locations.

RESULTS

By hybridizing gene ontology, functional domain and pseudo amino acid composition approaches, a new method has been developed that can be used to predict subcellular localization of proteins with multiplex location feature. A global analysis of the proteins in budding yeast classified into 22 locations was performed by jack-knife cross-validation with the new method. The overall success identification rate thus obtained is 70%. In contrast to this, the corresponding rates obtained by some other existing methods were only 13-14%, indicating that the new method is very powerful and promising. Furthermore, predictions were made for the four proteins whose localizations could not be determined by experiments, as well as for the 236 proteins whose localizations in budding yeast were ambiguous according to experimental observations. However, according to our predicted results, many of these 'ambiguous proteins' were found to have the same score and ranking for several different subcellular locations, implying that they may simultaneously exist, or move around, in these locations. This finding is intriguing because it reflects the dynamic feature of these proteins in a cell that may be associated with some special biological functions.

446

Jiang XS, Dai J, Sheng QH, Zhang L, Xia QC, Wu JR, Zeng R. A comparative proteomic strategy for subcellular proteome research: ICAT approach coupled with bioinformatics prediction to ascertain rat liver mitochondrial proteins and indication of mitochondrial localization for catalase. Mol Cell Proteomics 2004;4:12-34. [PMID: 15507458 DOI: 10.1074/mcp.m400079-mcp200] [Citation(s) in RCA: 65] [Impact Index Per Article: 3.1] [Indexed: 11/06/2022]

Abstract

Subcellular proteomics, as an important step to functional proteomics, has been a focus in proteomic research. However, the co-purification of "contaminating" proteins has been the major problem in all the subcellular proteomic research including all kinds of mitochondrial proteome research. It is often difficult to conclude whether these "contaminants" represent true endogenous partners or artificial associations induced by cell disruption or incomplete purification. To solve such a problem, we applied a high-throughput comparative proteome experimental strategy, ICAT approach performed with two-dimensional LC-MS/MS analysis, coupled with combinational usage of different bioinformatics tools, to study the proteome of rat liver mitochondria prepared with traditional centrifugation (CM) or further purified with a Nycodenz gradient (PM). A total of 169 proteins were identified and quantified convincingly in the ICAT analysis, in which 90 proteins have an ICAT ratio of PM:CM>1.0, while another 79 proteins have an ICAT ratio of PM:CM<1.0. Almost all the proteins annotated as mitochondrial according to Swiss-Prot annotation, bioinformatics prediction, and literature reports have a ratio of PM:CM>1.0, while proteins annotated as extracellular or secreted, cytoplasmic, endoplasmic reticulum, ribosomal, and so on have a ratio of PM:CM<1.0. Catalase and AP endonuclease 1, which have been known as peroxisomal and nuclear, respectively, have shown a ratio of PM:CM>1.0, confirming the reports about their mitochondrial location. Moreover, the 125 proteins with subcellular location annotation have been used as a testing dataset to evaluate the efficiency for ascertaining mitochondrial proteins by ICAT analysis and the bioinformatics tools such as PSORT, TargetP, SubLoc, MitoProt, and Predotar. 
The results indicated that ICAT analysis coupled with combinational usage of different bioinformatics tools could effectively ascertain mitochondrial proteins and distinguish contaminant proteins and even multilocation proteins. Using such a strategy, many novel proteins, known proteins without subcellular location annotation, and even known proteins that have been annotated as other locations have been strongly indicated for their mitochondrial location.

447

Gardy JL, Laird MR, Chen F, Rey S, Walsh CJ, Ester M, Brinkman FSL. PSORTb v.2.0: Expanded prediction of bacterial protein subcellular localization and insights gained from comparative proteome analysis. Bioinformatics 2004;21:617-23. [PMID: 15501914 DOI: 10.1093/bioinformatics/bti057] [Citation(s) in RCA: 573] [Impact Index Per Article: 27.3] [Indexed: 11/14/2022]

448

Huff T, Rosorius O, Otto AM, Müller CSG, Ballweber E, Hannappel E, Mannherz HG. Nuclear localisation of the G-actin sequestering peptide thymosin β4. J Cell Sci 2004;117:5333-41. [PMID: 15466884 DOI: 10.1242/jcs.01404] [Citation(s) in RCA: 74] [Impact Index Per Article: 3.5] [Indexed: 11/20/2022]

449

Cai YD, Chou KC. Predicting 22 protein localizations in budding yeast. Biochem Biophys Res Commun 2004;323:425-8. [PMID: 15369769 DOI: 10.1016/j.bbrc.2004.08.113] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.6] [Received: 07/20/2004] [Indexed: 11/30/2022]

450

Chou KC, Cai YD. Prediction of protein subcellular locations by GO-FunD-PseAA predictor. Biochem Biophys Res Commun 2004;320:1236-9. [PMID: 15249222 DOI: 10.1016/j.bbrc.2004.06.073] [Citation(s) in RCA: 123] [Impact Index Per Article: 5.9] [Received: 06/10/2004] [Indexed: 11/18/2022]