Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Huang WL, Tung CW, Huang HL, Ho SY. Predicting protein subnuclear localization using GO-amino-acid composition features. Biosystems 2009;98:73-9. [DOI: 10.1016/j.biosystems.2009.06.007] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Received: 02/18/2009] [Revised: 06/10/2009] [Accepted: 06/26/2009] [Indexed: 10/20/2022]

Cited by Other Article(s)

Ismail H, White C, Al-Barakati H, Newman RH, Kc DB. FEPS: A Tool for Feature Extraction from Protein Sequence. Methods Mol Biol 2022;2499:65-104. [PMID: 35696075 DOI: 10.1007/978-1-0716-2317-6_3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/15/2023]

Furutani Y, Yoshihara Y. Proteomic Analysis of Dendritic Filopodia-Rich Fraction Isolated by Telencephalin and Vitronectin Interaction. Front Synaptic Neurosci 2018;10:27. [PMID: 30147651 PMCID: PMC6097459 DOI: 10.3389/fnsyn.2018.00027] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 12/21/2017] [Accepted: 07/19/2018] [Indexed: 01/13/2023]

Hoseini ASH, Mirzarezaee M. Prediction of Protein Sub-Mitochondria Locations Using Protein Interaction Networks. Iran J Biotechnol 2018;16:e1933. [PMID: 31457027 PMCID: PMC6697825 DOI: 10.15171/ijb.1933] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 05/29/2017] [Revised: 01/11/2018] [Accepted: 01/13/2018] [Indexed: 01/09/2023]

Abstract

Background

Predicting protein localization, that is, the cellular compartment or organelle (such as the mitochondrion) in which a protein resides, is among the most important problems in bioinformatics. In this study, several machine learning algorithms are applied to predict intracellular protein locations. These algorithms use features extracted from protein sequences; protein interactions, in contrast, have been less investigated.

Objectives

Because protein interactions usually occur in the same or adjacent locations, using this feature to infer location should be effective. This study did not aim to increase the overall accuracy of earlier work; it focused on protein interaction features and on how their use leads to higher accuracy.

Materials and Methods

In this study, we examined the protein interaction network as a feature for predicting protein localization, and its effect on the prediction results. To this end, we gathered some of the most common features, including amino acid composition, dipeptide composition, pseudo amino acid composition (PseAAC), position-specific scoring matrix (PSSM), functional domain, Gene Ontology information, and pairwise sequence alignment. The classification results obtained with these features were compared with those obtained using protein interactions. Different machine learning algorithms were tested for this purpose.
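As an illustration of the two simplest sequence features named above, the following minimal sketch computes amino acid composition (20 values) and dipeptide composition (400 values). It is not the authors' code: the function names and the toy sequence are hypothetical, and residues outside the 20 standard amino acids are simply ignored, which is an assumption.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(seq):
    """Return the fraction of each standard residue in seq (20 values)."""
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return the fraction of each of the 400 ordered residue pairs."""
    seq = seq.upper()
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    # Count every overlapping window of two consecutive residues.
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts[p] for p in pairs)
    if total == 0:
        return [0.0] * len(pairs)
    return [counts[p] / total for p in pairs]


example = "MKVLAAGIVK"  # hypothetical toy sequence, not from the study
aac = amino_acid_composition(example)
dpc = dipeptide_composition(example)
```

Concatenating `aac` and `dpc` would give a fixed-length vector for any sequence, which is the form most classifiers expect as input.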

Results

Among single feature sets, the best results were obtained with the SVM classifier on the PseAAC feature. Combining all features with PPI data gave accuracies of 82.49% and 83.35% with the Decision Tree and Random Forest classifiers, respectively. In another experiment, using only protein interaction data with different cut-off points yielded an accuracy of 93.035% for protein location prediction.

Conclusion

Overall, protein interaction was shown to have a significant impact on predicting the location of mitochondrial proteins. This feature can distinguish the locations well on its own. Using it raises the accuracy of the results by up to 5%.

Gong AGW, Duan R, Wang HY, Dong TTX, Tsim KWK. Calycosin Orchestrates Osteogenesis of Danggui Buxue Tang in Cultured Osteoblasts: Evaluating the Mechanism of Action by Omics and Chemical Knock-out Methodologies. Front Pharmacol 2018;9:36. [PMID: 29449812 PMCID: PMC5799702 DOI: 10.3389/fphar.2018.00036] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 10/12/2017] [Accepted: 01/12/2018] [Indexed: 01/12/2023]

Arrighetti N, Cossa G, De Cecco L, Stucchi S, Carenini N, Corna E, Gandellini P, Zaffaroni N, Perego P, Gatti L. PKC-alpha modulation by miR-483-3p in platinum-resistant ovarian carcinoma cells. Toxicol Appl Pharmacol 2016;310:9-19. [DOI: 10.1016/j.taap.2016.08.005] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Received: 06/10/2016] [Revised: 07/27/2016] [Accepted: 08/05/2016] [Indexed: 12/19/2022]

TargetFreeze: Identifying Antifreeze Proteins via a Combination of Weights using Sequence Evolutionary Information and Pseudo Amino Acid Composition. J Membr Biol 2015;248:1005-14. [PMID: 26058944 DOI: 10.1007/s00232-015-9811-z] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.1] [Received: 03/26/2015] [Accepted: 05/19/2015] [Indexed: 11/26/2022]

Lin YC, Wang CC, Tung CW. An in silico toxicogenomics approach for inferring potential diseases associated with maleic acid. Chem Biol Interact 2014;223:38-44. [PMID: 25239558 DOI: 10.1016/j.cbi.2014.09.004] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 05/07/2014] [Revised: 07/25/2014] [Accepted: 09/05/2014] [Indexed: 10/24/2022]

Zuo YC, Peng Y, Liu L, Chen W, Yang L, Fan GL. Predicting peroxidase subcellular location by hybridizing different descriptors of Chou's pseudo amino acid patterns. Anal Biochem 2014;458:14-9. [DOI: 10.1016/j.ab.2014.04.032] [Citation(s) in RCA: 81] [Impact Index Per Article: 7.4] [Received: 02/23/2014] [Revised: 04/22/2014] [Accepted: 04/25/2014] [Indexed: 11/28/2022]

Li X, Wu X, Wu G. Robust feature generation for protein subchloroplast location prediction with a weighted GO transfer model. J Theor Biol 2014;347:84-94. [PMID: 24423409 DOI: 10.1016/j.jtbi.2014.01.003] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 03/12/2013] [Revised: 10/17/2013] [Accepted: 01/03/2014] [Indexed: 10/25/2022]

Mei S. SVM ensemble based transfer learning for large-scale membrane proteins discrimination. J Theor Biol 2013;340:105-10. [PMID: 24050851 DOI: 10.1016/j.jtbi.2013.09.007] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Received: 03/28/2013] [Revised: 09/04/2013] [Accepted: 09/06/2013] [Indexed: 11/16/2022]

Butler GS, Overall CM. Matrix metalloproteinase processing of signaling molecules to regulate inflammation. Periodontol 2000 2013;63:123-48. [DOI: 10.1111/prd.12035] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.9] [Accepted: 02/19/2013] [Indexed: 12/12/2022]

Using over-represented tetrapeptides to predict protein submitochondria locations. Acta Biotheor 2013;61:259-68. [PMID: 23475502 DOI: 10.1007/s10441-013-9181-9] [Citation(s) in RCA: 69] [Impact Index Per Article: 5.8] [Received: 07/03/2012] [Accepted: 02/23/2013] [Indexed: 01/25/2023]

Wan S, Mak MW, Kung SY. GOASVM: a subcellular location predictor by incorporating term-frequency gene ontology into the general form of Chou's pseudo-amino acid composition. J Theor Biol 2013;323:40-8. [PMID: 23376577 DOI: 10.1016/j.jtbi.2013.01.012] [Citation(s) in RCA: 82] [Impact Index Per Article: 6.8] [Received: 09/07/2012] [Revised: 01/16/2013] [Accepted: 01/16/2013] [Indexed: 01/03/2023]

Predicting plant protein subcellular multi-localization by Chou's PseAAC formulation based multi-label homolog knowledge transfer learning. J Theor Biol 2012;310:80-7. [DOI: 10.1016/j.jtbi.2012.06.028] [Citation(s) in RCA: 98] [Impact Index Per Article: 7.5] [Received: 01/25/2012] [Revised: 05/12/2012] [Accepted: 06/18/2012] [Indexed: 11/21/2022]

Mei S. Multi-label multi-kernel transfer learning for human protein subcellular localization. PLoS One 2012;7:e37716. [PMID: 22719847 PMCID: PMC3374840 DOI: 10.1371/journal.pone.0037716] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.2] [Received: 09/29/2011] [Accepted: 04/28/2012] [Indexed: 11/19/2022]

Abstract

Recent years have witnessed much progress in computational modelling for protein subcellular localization. However, the existing sequence-based predictive models demonstrate moderate or unsatisfactory performance, and the gene ontology (GO) based models risk overestimating performance for novel proteins. Furthermore, many human proteins have multiple subcellular locations, which makes the computational modelling more complicated. To date, far too few studies have specialized in predicting the subcellular localization of human proteins that may reside in multiple cellular compartments. In this paper, we propose a multi-label multi-kernel transfer learning model for human protein subcellular localization (MLMK-TLM). MLMK-TLM proposes a multi-label confusion matrix, formally formulates three multi-labelling performance measures and adapts one-against-all multi-class probabilistic outputs to the multi-label learning scenario. On this basis, it further extends our published work GO-TLM (gene ontology based transfer learning model for protein subcellular localization) and MK-TLM (multi-kernel transfer learning based on Chou's PseAAC formulation for protein submitochondria localization) to multiplex human protein subcellular localization. With the advantages of proper homolog knowledge transfer, a comprehensive survey of model performance for novel proteins and multi-labelling capability, MLMK-TLM will gain more practical applicability. The experiments on a human protein benchmark dataset show that MLMK-TLM significantly outperforms the baseline model and demonstrates good multi-labelling ability for novel human proteins. Some findings (predictions) are validated by the latest Swiss-Prot database. The software can be freely downloaded at http://soft.synu.edu.cn/upload/msy.rar.

Li L, Zhang Y, Zou L, Li C, Yu B, Zheng X, Zhou Y. An ensemble classifier for eukaryotic protein subcellular location prediction using gene ontology categories and amino acid hydrophobicity. PLoS One 2012;7:e31057. [PMID: 22303481 PMCID: PMC3268814 DOI: 10.1371/journal.pone.0031057] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.0] [Received: 10/31/2011] [Accepted: 12/31/2011] [Indexed: 02/05/2023]

Mei S. Multi-kernel transfer learning based on Chou's PseAAC formulation for protein submitochondria localization. J Theor Biol 2012;293:121-30. [DOI: 10.1016/j.jtbi.2011.10.015] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.8] [Received: 08/03/2011] [Revised: 10/09/2011] [Accepted: 10/13/2011] [Indexed: 10/16/2022]

Du P, Li T, Wang X. Recent progress in predicting protein sub-subcellular locations. Expert Rev Proteomics 2011;8:391-404. [PMID: 21679119 DOI: 10.1586/epr.11.20] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.4] [Indexed: 11/08/2022]

Mei S, Fei W, Zhou S. Gene ontology based transfer learning for protein subcellular localization. BMC Bioinformatics 2011;12:44. [PMID: 21284890 PMCID: PMC3039576 DOI: 10.1186/1471-2105-12-44] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.9] [Received: 12/30/2010] [Accepted: 02/02/2011] [Indexed: 11/10/2022]

Abstract

BACKGROUND

Prediction of protein subcellular localization generally involves many complex factors, and using only one or two aspects of the data may not tell the true story. For this reason, some recent predictive models are deliberately designed to integrate multiple heterogeneous data sources to exploit multi-aspect protein feature information. Gene ontology, hereinafter referred to as GO, uses a controlled vocabulary to depict biological molecules or gene products in terms of biological process, molecular function and cellular component. With the rapid expansion of annotated protein sequences, gene ontology has become a general protein feature that can be used to construct predictive models in computational biology. Existing models generally either concatenate the GO terms into a flat binary vector or apply majority-vote based ensemble learning for protein subcellular localization; neither approach can estimate the individual discriminative abilities of the three aspects of gene ontology.

RESULTS

In this paper, we propose a Gene Ontology Based Transfer Learning Model (GO-TLM) for large-scale protein subcellular localization. The model transfers the signature-based homologous GO terms to the target proteins, and further constructs a reliable learning system to reduce the adverse effect of potentially false GO terms that result from evolutionary divergence. We derive three GO kernels from the three aspects of gene ontology to measure the GO similarity of two proteins, and derive two other spectrum kernels to measure the similarity of two protein sequences. We use simple non-parametric cross validation to explicitly weigh the discriminative abilities of the five kernels, such that the time and space complexities are greatly reduced compared with complicated semi-definite programming and semi-infinite linear programming. The five kernels are then linearly merged into a single kernel for protein subcellular localization. We evaluate GO-TLM against three baseline models, MultiLoc, MultiLoc-GO and Euk-mPLoc, on the benchmark datasets those models adopted. 5-fold cross validation experiments show that GO-TLM achieves substantial accuracy improvements over the baseline models: 80.38% versus 67.40% for Euk-mPLoc, an increase of 12.98%; 96.65% and 96.27% versus 89.60% and 89.60% for MultiLoc-GO, increases of 7.05% and 6.67% on the MultiLoc plant and MultiLoc animal datasets, respectively; and 97.14%, 95.90% and 96.85% versus 83.70%, 90.10% and 85.70% for MultiLoc-GO, increases of 13.44%, 5.80% and 11.15% on the BaCelLoc plant, BaCelLoc fungi and BaCelLoc animal datasets, respectively. For the BaCelLoc independent sets, GO-TLM achieves 81.25%, 80.45% and 79.46% on the BaCelLoc plant holdout, BaCelLoc fungi holdout and BaCelLoc animal holdout datasets, respectively, compared with 76.00%, 60.00% and 73.00% for the baseline model MultiLoc-GO, increases of 5.25%, 20.45% and 6.46%.
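The linear merging of several kernels into one can be sketched as a weighted sum of kernel matrices. The abstract does not give the weighting formula, so the rule below (each kernel weighted by its cross-validation score, normalized so the weights sum to 1) is an assumption for illustration only; the function name, the toy matrices and the scores are all hypothetical.

```python
def combine_kernels(kernels, scores):
    """Linearly merge kernel matrices, weighting each by its normalized score.

    kernels: list of square matrices (lists of lists) of equal size.
    scores:  one non-negative score per kernel, e.g. a cross-validation
             accuracy (assumed weighting rule, not taken from the paper).
    """
    if not kernels or len(kernels) != len(scores):
        raise ValueError("need one score per kernel")
    total = sum(scores)
    if total <= 0:
        raise ValueError("scores must sum to a positive value")
    weights = [s / total for s in scores]
    n = len(kernels[0])
    combined = [[0.0] * n for _ in range(n)]
    for w, k in zip(weights, kernels):
        for i in range(n):
            for j in range(n):
                combined[i][j] += w * k[i][j]
    return weights, combined


# Hypothetical 2x2 kernels for two proteins, with made-up scores.
k_go = [[1.0, 0.2], [0.2, 1.0]]   # stands in for a GO kernel
k_seq = [[1.0, 0.6], [0.6, 1.0]]  # stands in for a spectrum kernel
weights, merged = combine_kernels([k_go, k_seq], [0.75, 0.25])
```

Because a non-negative weighted sum of valid kernels is itself a valid kernel, the merged matrix can be passed to any kernel classifier that accepts a precomputed kernel.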

CONCLUSIONS

Since direct homology-based GO term transfer may be prone to introducing noise and outliers to the target protein, we design an explicitly weighted kernel learning system (called Gene Ontology Based Transfer Learning Model, GO-TLM) to transfer to the target protein the known knowledge about related homologous proteins, which can reduce the risk of outliers and share knowledge between homologous proteins, and thus achieve better predictive performance for protein subcellular localization. Cross validation and independent test experimental results show that the homology-based GO term transfer and explicitly weighing the GO kernels substantially improve the prediction performance.

Scott MS, Boisvert FM, Lamond AI, Barton GJ. PNAC: a protein nucleolar association classifier. BMC Genomics 2011;12:74. [PMID: 21272300 PMCID: PMC3038921 DOI: 10.1186/1471-2164-12-74] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 09/30/2010] [Accepted: 01/27/2011] [Indexed: 01/11/2023]

Abstract

Background

Although primarily known as the site of ribosome subunit production, the nucleolus is involved in numerous and diverse cellular processes. Recent large-scale proteomics projects have identified thousands of human proteins that associate with the nucleolus. However, in most cases, we know neither the fraction of each protein pool that is nucleolus-associated nor whether their association is permanent or conditional.

Results

To describe the dynamic localisation of proteins in the nucleolus, we investigated the extent of nucleolar association of proteins by first collating an extensively curated literature-derived dataset. This dataset then served to train a probabilistic predictor which integrates gene and protein characteristics. Unlike most previous experimental and computational studies of the nucleolar proteome that produce large static lists of nucleolar proteins regardless of their extent of nucleolar association, our predictor models the fluidity of the nucleolus by considering different classes of nucleolar-associated proteins. The new method predicts all human proteins as either nucleolar-enriched, nucleolar-nucleoplasmic, nucleolar-cytoplasmic or non-nucleolar. Leave-one-out cross validation tests reveal sensitivity values for these four classes ranging from 0.72 to 0.90 and positive predictive values ranging from 0.63 to 0.94. The overall accuracy of the classifier was measured to be 0.85 on an independent literature-based test set and 0.74 using a large independent quantitative proteomics dataset. While the three nucleolar-association groups display vastly different Gene Ontology biological process signatures and evolutionary characteristics, they collectively represent the most well characterised nucleolar functions.
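For readers unfamiliar with the per-class measures quoted above, the sketch below shows how sensitivity, positive predictive value and overall accuracy are computed from a multi-class confusion matrix. The four class names come from the abstract; the counts are made up for illustration and are not the study's data, and the function name is hypothetical.

```python
def per_class_metrics(confusion, labels):
    """Compute (sensitivity, PPV) per class and overall accuracy.

    confusion[i][j] is the number of items of true class i predicted as class j.
    """
    n = len(labels)
    metrics = {}
    for c, name in enumerate(labels):
        tp = confusion[c][c]
        actual = sum(confusion[c])                          # all items truly in class c
        predicted = sum(confusion[r][c] for r in range(n))  # all items predicted as c
        sensitivity = tp / actual if actual else 0.0
        ppv = tp / predicted if predicted else 0.0
        metrics[name] = (sensitivity, ppv)
    correct = sum(confusion[i][i] for i in range(n))
    total = sum(sum(row) for row in confusion)
    accuracy = correct / total if total else 0.0
    return metrics, accuracy


labels = [
    "nucleolar-enriched",
    "nucleolar-nucleoplasmic",
    "nucleolar-cytoplasmic",
    "non-nucleolar",
]
# Made-up counts for illustration only.
confusion = [
    [8, 1, 1, 0],
    [1, 7, 0, 2],
    [0, 1, 9, 0],
    [1, 1, 0, 18],
]
metrics, accuracy = per_class_metrics(confusion, labels)
```

Sensitivity is computed along a row (true class) and positive predictive value along a column (predicted class), which is why the two can differ widely for the same class.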

Conclusions

Our proteome-wide classification of nucleolar association provides a novel representation of the dynamic content of the nucleolus. This model of nucleolar localisation thus increases the coverage while providing accurate and specific annotations of the nucleolar proteome. It will be instrumental in better understanding the central role of the nucleolus in the cell and its interaction with other subcellular compartments.

Ma J, Gu H. A novel method for predicting protein subcellular localization based on pseudo amino acid composition. BMB Rep 2010;43:670-6. [DOI: 10.5483/bmbrep.2010.43.10.670] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Indexed: 11/20/2022]