151
|
Moretti M, Minerdi D, Gehrig P, Garibaldi A, Gullino ML, Riedel K. A bacterial-fungal metaproteomic analysis enlightens an intriguing multicomponent interaction in the rhizosphere of Lactuca sativa. J Proteome Res 2012; 11:2061-77. [PMID: 22360353 DOI: 10.1021/pr201204v] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Indexed: 11/28/2022]
Abstract
Fusarium oxysporum MSA 35 [wild-type (WT) strain] is an antagonistic isolate that protects plants against pathogenic Fusaria. This strain lives in association with ectosymbiotic bacteria. When cured of the prokaryotic symbionts [cured (CU) form], the fungus is pathogenic, causing wilt symptoms similar to those of F. oxysporum f.sp. lactucae. The aim of this study was to understand if and how the host plant Lactuca sativa contributes to the expression of the antagonistic/pathogenic behaviors of MSA 35 strains. A time-course comparative analysis of the proteomic profiles of WT and CU strains was performed. Fungal proteins expressed during the early stages of plant-fungus interaction were involved in stress defense, energy metabolism, and virulence and were equally induced in both strains. In the late phase of the interkingdom interaction, only the CU strain continued the production of virulence- and energy-related proteins. The expression analysis of lettuce genes coding for proteins involved in resistance-related processes corroborated proteomic data by showing that, at the beginning of the interaction, both fungi are perceived by the plant as pathogens. On the contrary, after 8 days, only the CU strain is able to induce plant gene expression. For the first time, it was demonstrated that an antagonistic F. oxysporum behaves initially as a pathogen, showing an interesting similarity with other beneficial organisms such as mycorrhizae.
Affiliation(s)
- Marino Moretti
- Agroinnova-Centre of Competence for the Innovation in the Agro-Environmental Field, University of Torino, Italy
|
152
|
Mourad GS, Tippmann-Crosby J, Hunt KA, Gicheru Y, Bade K, Mansfield TA, Schultes NP. Genetic and molecular characterization reveals a unique nucleobase cation symporter 1 in Arabidopsis. FEBS Lett 2012; 586:1370-8. [PMID: 22616996 DOI: 10.1016/j.febslet.2012.03.058] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.5] [Received: 02/15/2012] [Revised: 03/25/2012] [Accepted: 03/26/2012] [Indexed: 11/13/2022]
Abstract
Locus At5g03555 encodes a nucleobase cation symporter 1 (AtNCS1) in the Arabidopsis genome. Arabidopsis insertion mutants, AtNcs1-1 and AtNcs1-3, were used for in planta toxic nucleobase analog growth studies and radio-labeled nucleobase uptake assays to characterize solute transport specificities. These results correlate with similar growth and uptake studies of AtNCS1 expressed in Saccharomyces cerevisiae. Both in planta and heterologous expression studies in yeast revealed a unique solute transport profile for AtNCS1 in moving adenine, guanine and uracil. This is in stark contrast to the canonical transport profiles determined for the well-characterized S. cerevisiae NCS1 proteins FUR4 (uracil transport) or FCY2 (adenine, guanine, and cytosine transport).
Affiliation(s)
- George S Mourad
- Department of Biology, Indiana University-Purdue University Fort Wayne, Fort Wayne, IN 46805, USA.
|
153
|
Law SR, Narsai R, Taylor NL, Delannoy E, Carrie C, Giraud E, Millar AH, Small I, Whelan J. Nucleotide and RNA metabolism prime translational initiation in the earliest events of mitochondrial biogenesis during Arabidopsis germination. PLANT PHYSIOLOGY 2012; 158:1610-27. [PMID: 22345507 PMCID: PMC3320173 DOI: 10.1104/pp.111.192351] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.5] [Received: 12/18/2011] [Accepted: 02/13/2012] [Indexed: 05/18/2023]
Abstract
Mitochondria play a crucial role in germination and early seedling growth in Arabidopsis (Arabidopsis thaliana). Morphological observations of mitochondria revealed that mitochondrial numbers, typical size, and oval morphology were evident after 12 h of imbibition in continuous light (following 48 h of stratification). The transition from a dormant to an active metabolic state was punctuated by an early molecular switch, characterized by a transient burst in the expression of genes encoding mitochondrial proteins. Factors involved in mitochondrial transcription and RNA processing were overrepresented among these early-expressed genes. This was closely followed by an increase in the transcript abundance of genes encoding proteins involved in mitochondrial DNA replication and translation. This burst in the expression of factors implicated in mitochondrial RNA and DNA metabolism was accompanied by an increase in transcripts encoding components required for nucleotide biosynthesis in the cytosol and increases in transcript abundance of specific members of the mitochondrial carrier protein family that have previously been associated with nucleotide transport into mitochondria. Only after these genes peaked in expression and largely declined were typical mitochondrial numbers and morphology observed. Subsequently, there was an increase in transcript abundance for various bioenergetic and metabolic functions of mitochondria. The coordination of nucleus- and organelle-encoded gene expression was also examined by quantitative reverse transcription-polymerase chain reaction, specifically for components of the mitochondrial electron transport chain and the chloroplastic photosynthetic machinery. Analysis of protein abundance using western-blot analysis and mass spectrometry revealed that for many proteins, patterns of protein and transcript abundance changes displayed significant positive correlations. 
A model for mitochondrial biogenesis during germination is proposed, in which an early increase in the abundance of transcripts encoding biogenesis functions (RNA metabolism and import components) precedes a later cascade of gene expression encoding the bioenergetic and metabolic functions of mitochondria.
Affiliation(s)
- James Whelan
- Australian Research Council Centre of Excellence in Plant Energy Biology (S.R.L., R.N., N.L.T., E.D., C.C., E.G., A.H.M., I.S., J.W.), Centre for Computational Systems Biology (R.N., I.S.), and Centre for Comparative Analysis of Biomolecular Networks (N.L.T., A.H.M.), University of Western Australia, Crawley 6009, Western Australia, Australia
|
154
|
Zhang S, Ye F, Yuan X. Using principal component analysis and support vector machine to predict protein structural class for low-similarity sequences via PSSM. J Biomol Struct Dyn 2012; 29:634-42. [PMID: 22545994 DOI: 10.1080/07391102.2011.672627] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.8] [Indexed: 10/28/2022]
|
155
|
Nag A, Karpinets TV, Chang CH, Bar-Peled M. Enhancing a Pathway-Genome Database (PGDB) to capture subcellular localization of metabolites and enzymes: the nucleotide-sugar biosynthetic pathways of Populus trichocarpa. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2012; 2012:bas013. [PMID: 22465851 PMCID: PMC3316911 DOI: 10.1093/database/bas013] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/15/2022]
Abstract
Understanding how cellular metabolism works and is regulated requires that the underlying biochemical pathways be adequately represented and integrated with large metabolomic data sets to establish a robust network model. Genetically engineering energy crops to be less recalcitrant to saccharification requires detailed knowledge of plant polysaccharide structures and a thorough understanding of the metabolic pathways involved in forming and regulating cell-wall synthesis. Nucleotide-sugars are building blocks for synthesis of cell wall polysaccharides. The biosynthesis of nucleotide-sugars is catalyzed by a multitude of enzymes that reside in different subcellular organelles, and precise representation of these pathways requires accurate capture of this biological compartmentalization. The lack of simple localization cues in genomic sequence data and annotations however leads to missing compartmentalization information for eukaryotes in automatically generated databases, such as the Pathway-Genome Databases (PGDBs) of the SRI Pathway Tools software that drives much biochemical knowledge representation on the internet. In this report, we provide an informal mechanism using the existing Pathway Tools framework to integrate protein and metabolite sub-cellular localization data with the existing representation of the nucleotide-sugar metabolic pathways in a prototype PGDB for Populus trichocarpa. The enhanced pathway representations have been successfully used to map SNP abundance data to individual nucleotide-sugar biosynthetic genes in the PGDB. The manually curated pathway representations are more conducive to the construction of a computational platform that will allow the simulation of natural and engineered nucleotide-sugar precursor fluxes into specific recalcitrant polysaccharide(s). 
Database URL: The curated Populus PGDB is available in the BESC public portal at http://cricket.ornl.gov/cgi-bin/beocyc_home.cgi and the nucleotide-sugar biosynthetic pathways can be directly accessed at http://cricket.ornl.gov:1555/PTR/new-image?object=SUGAR-NUCLEOTIDES.
Affiliation(s)
- Ambarish Nag
- Computational Sciences Center, National Renewable Energy Laboratory, 1617 Cole Boulevard, Golden, CO 80401, USA
|
156
|
Characterization of a Trypanosoma brucei Alkb homolog capable of repairing alkylated DNA. Exp Parasitol 2012; 131:92-100. [PMID: 22465611 DOI: 10.1016/j.exppara.2012.03.011] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 01/16/2012] [Revised: 02/29/2012] [Accepted: 03/12/2012] [Indexed: 11/20/2022]
Abstract
Trypanosoma brucei encodes a protein (denoted TbABH) that is homologous to AlkB of Escherichia coli and AlkB homolog (ABH) proteins in other organisms, raising the possibility that trypanosomes catalyze oxidative repair of alkylation-damaged DNA. TbABH was cloned and expressed in E. coli, and the recombinant protein was purified and characterized. Incubation of anaerobic TbABH with Fe(II) and α-ketoglutarate (αKG) produces a characteristic metal-to-ligand charge-transfer chromophore, confirming its membership in the Fe(II)/αKG dioxygenase superfamily. The protein binds to DNA, with a clear preference for alkylated oligonucleotides according to results from electrophoretic mobility shift assays. Finally, the protozoan gene was shown to partially complement E. coli alkB cells when stressed with methylmethanesulfonate, thus confirming the assignment of TbABH as a functional AlkB protein in T. brucei.
|
157
|
Abstract
With the development of ultra-high-throughput technologies, the cost of sequencing bacterial genomes has been vastly reduced. As more genomes are sequenced, less time can be spent manually annotating those genomes, resulting in an increased reliance on automatic annotation pipelines. However, automatic pipelines can produce inaccurate genome annotation and their results often require manual curation. Here, we discuss the automatic and manual annotation of bacterial genomes, identify common problems introduced by the current genome annotation process and suggest potential solutions.
Affiliation(s)
- Emily J Richardson
- The Roslin Institute, University of Edinburgh, Easter Bush, EH25 9RG, UK
|
158
|
Ding S, Zhang S, Li Y, Wang T. A novel protein structural classes prediction method based on predicted secondary structure. Biochimie 2012; 94:1166-71. [PMID: 22353242 DOI: 10.1016/j.biochi.2012.01.022] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.8] [Received: 04/28/2011] [Accepted: 01/31/2012] [Indexed: 10/14/2022]
Abstract
Knowledge of structural classes plays an important role in understanding protein folding patterns. In this paper, features based on the predicted secondary structure sequence and the corresponding E-H sequence are extracted. Then, an 11-dimensional feature vector is selected based on a wrapper feature selection algorithm and a support vector machine (SVM). Among the 11 selected features, 4 novel features are newly designed to model the differences between the α/β class and the α + β class, and the other 7 rational features were proposed by previous researchers. To examine the performance of our method, a total of 5 datasets are used to design and test the proposed method. The results show that competitive prediction accuracies can be achieved by the proposed method compared to existing methods (SCPRED, RKS-PPSC and MODAS), and the 4 new features are demonstrated to be essential for differentiating the α/β and α + β classes. A standalone version of the proposed method is written in Java and can be downloaded from http://web.xidian.edu.cn/slzhang/paper.html.
Affiliation(s)
- Shuyan Ding
- School of Mathematical Sciences, Dalian University of Technology, Linggong Road, Dalian, 116024, PR China.
|
159
|
Characterization of the 55-residue protein encoded by the 9S E1A mRNA of species C adenovirus. J Virol 2012; 86:4222-33. [PMID: 22301148 DOI: 10.1128/jvi.06399-11] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/20/2022] Open
Abstract
Early region 1A (E1A) of human adenovirus (HAdV) has been the focus of over 30 years of investigation and is required for the oncogenic capacity of HAdV in rodents. Alternative splicing of the E1A transcript generates mRNAs encoding multiple E1A proteins. The 55-residue (55R) E1A protein, which is encoded by the 9S mRNA, is particularly interesting due to the unique properties it displays relative to all other E1A isoforms. 55R E1A does not contain any of the conserved regions (CRs) present in the other E1A isoforms. The C-terminal region of the 55R E1A protein contains a unique sequence compared to all other E1A isoforms, which results from a frameshift generated by alternative splicing. The 55R E1A protein is thought to be produced preferentially at the late stages of infection. Here we report the first study to directly investigate the function of the species C HAdV 55R E1A protein during infection. Polyclonal rabbit antibodies (Abs) have been generated that are capable of immunoprecipitating HAdV-2 55R E1A. These Abs can also detect HAdV-2 55R E1A by immunoblotting and indirect immunofluorescence assay. These studies indicate that 55R E1A is expressed late and is localized to the cytoplasm and to the nucleus. 55R E1A was able to activate the expression of viral genes during infection and could also promote productive replication of species C HAdV. 55R E1A was also found to interact with the S8 component of the proteasome, and knockdown of S8 was detrimental to viral replication dependent on 55R E1A.
|
160
|
Martínez-Turiño S, Hernández C. Analysis of the subcellular targeting of the smaller replicase protein of Pelargonium flower break virus. Virus Res 2012; 163:580-91. [PMID: 22222362 DOI: 10.1016/j.virusres.2011.12.011] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 10/03/2011] [Revised: 12/13/2011] [Accepted: 12/16/2011] [Indexed: 12/30/2022]
Abstract
Replication of all positive RNA viruses occurs in association with intracellular membranes. In many cases, the mechanism of membrane targeting is unknown and there appears to be no correlation between virus phylogeny and the membrane systems recruited for replication. Pelargonium flower break virus (PFBV, genus Carmovirus, family Tombusviridae) encodes two proteins, p27 and its read-through product p86 (the viral RNA-dependent RNA polymerase), that are essential for replication. Recent reports with other members of the family Tombusviridae have shown that the smaller replicase protein is targeted to specific intracellular membranes and it is assumed to determine the subcellular localization of the replication complex. Using in vivo expression of green fluorescent protein (GFP) fusions in plant and yeast cells, we show here that PFBV p27 localizes in mitochondria. The same localization pattern was found for p86, which contains the p27 sequence at its N-terminus. Cellular fractionation of p27GFP-expressing cells confirmed the confocal microscopy observations and biochemical treatments suggested a tight association of the protein with membranes. Analysis of deletion mutants allowed identification of two regions required for targeting of p27 to mitochondria. These regions mapped toward the N- and C-terminus of the protein, respectively, and could function independently though with distinct efficiency. In an attempt to search for putative cellular factors involved in p27 localization, the subcellular distribution of the protein was checked in a selected series of knockout yeast strains and the outcome of this approach is discussed.
Affiliation(s)
- Sandra Martínez-Turiño
- Instituto de Biología Molecular y Celular de Plantas (CSIC-Universidad Politécnica de Valencia), Ciudad Politécnica de Innovación, Ed. 8E, Camino de Vera s/n, 46022 Valencia, Spain
|
161
|
A novel algorithm combining support vector machine with the discrete wavelet transform for the prediction of protein subcellular localization. Comput Biol Med 2012; 42:180-7. [DOI: 10.1016/j.compbiomed.2011.11.006] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Received: 01/19/2011] [Revised: 09/29/2011] [Accepted: 11/15/2011] [Indexed: 02/03/2023]
|
162
|
Identification of voltage-gated potassium channel subfamilies from sequence information using support vector machine. Comput Biol Med 2012; 42:504-7. [PMID: 22297432 DOI: 10.1016/j.compbiomed.2012.01.003] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.1] [Received: 12/31/2010] [Revised: 10/16/2011] [Accepted: 01/12/2012] [Indexed: 02/05/2023]
Abstract
Proteins belonging to different subfamilies of voltage-gated K(+) channels (VKC) are functionally divergent. The traditional experimental approach to classifying ion channels is time consuming. Thus, it is highly desirable to develop novel computational methods for VKC subfamily classification. In this study, a support vector machine based method was proposed to predict VKC subfamilies using amino acid and dipeptide compositions. In order to remove redundant information, a novel feature selection technique was employed to single out optimized features. In the jackknife cross-validation, the proposed method (VKCPred) achieved an overall accuracy of 93.09% with 93.22% average sensitivity and 98.34% average specificity, which are superior to those of two other state-of-the-art classifiers. These results indicate that VKCPred can be efficiently used to identify and annotate subfamilies of voltage-gated K(+) channels. The VKCPred software and dataset are freely available at http://cobi.uestc.edu.cn/people/hlin/tools/VKCPred/.
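As an illustration of the sequence encoding named in this abstract, the following is a minimal sketch of amino acid and dipeptide composition features. It is not the VKCPred implementation: the function names, the handling of non-standard residues (ignored in dipeptide counts), and the concatenation into a single 420-dimensional vector are assumptions made here for illustration, and the feature selection and SVM steps are not shown.

```python
from itertools import product

# The 20 standard amino acids, in a fixed order so feature positions are stable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = [a + b for a, b in product(AMINO_ACIDS, repeat=2)]  # 400 pairs


def amino_acid_composition(seq):
    """Return the 20 relative frequencies of the standard amino acids."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return the 400 relative frequencies of overlapping residue pairs."""
    seq = seq.upper()
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two residues")
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = len(seq) - 1  # number of overlapping pairs in the sequence
    for i in range(total):
        pair = seq[i:i + 2]
        if pair in counts:  # assumption: pairs with non-standard residues are ignored
            counts[pair] += 1
    return [counts[d] / total for d in DIPEPTIDES]


def composition_features(seq):
    """Concatenate both encodings into one 420-dimensional feature vector."""
    return amino_acid_composition(seq) + dipeptide_composition(seq)


# Hypothetical toy peptide, used only to show the call.
features = composition_features("MKVLAAGIV")
```

A vector of this kind could then be passed to any standard classifier; the abstract's feature selection step would reduce the 420 dimensions before training.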
|
163
|
|
164
|
Identification of human protein complexes from local sub-graphs of protein-protein interaction network based on random forest with topological structure features. Anal Chim Acta 2012; 718:32-41. [PMID: 22305895 DOI: 10.1016/j.aca.2011.12.069] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 10/09/2011] [Revised: 12/28/2011] [Accepted: 12/30/2011] [Indexed: 11/20/2022]
Abstract
In the post-genomic era, one of the most important and challenging tasks is to identify protein complexes and further elucidate their molecular mechanisms in specific biological processes. Previous computational approaches usually identify protein complexes from protein interaction networks based on dense sub-graphs and incomplete prior information. Additionally, these computational approaches pay little attention to the biological properties of proteins, and there is no common metric for evaluating their performance. It is therefore necessary to construct a novel method for identifying protein complexes and elucidating their function. In this study, a novel approach is proposed to identify protein complexes using random forest and topological structure. Each protein complex is represented by a graph of interactions, in which a descriptor of the protein primary structure is used to characterize the biological properties of each protein and each vertex is weighted by the descriptor. Topological structure features are developed and used to characterize protein complexes. The random forest algorithm is used to build a prediction model and identify protein complexes from local sub-graphs instead of dense sub-graphs. As a demonstration, the proposed approach is applied to human protein interaction data, and satisfactory results are obtained, with an accuracy of 80.24%, sensitivity of 81.94%, specificity of 80.07%, and Matthews correlation coefficient of 0.4087 in a 10-fold cross-validation test. Some new protein complexes are identified, and analysis based on Gene Ontology shows that the complexes are likely to be true complexes and play important roles in the pathogenesis of some diseases. PCI-RFTS, a corresponding executable program for protein complex identification, can be acquired freely on request from the authors.
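The four evaluation measures quoted in this abstract are all functions of a binary confusion matrix. The sketch below shows the standard definitions; the function name and the example counts are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Compute accuracy, sensitivity, specificity and MCC from confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Hypothetical, deliberately imbalanced counts (10 positives, 100 negatives).
metrics = binary_metrics(tp=8, tn=80, fp=20, fn=2)
```

With imbalanced classes such as these hypothetical counts, accuracy, sensitivity and specificity can all be about 0.8 while the Matthews correlation coefficient is considerably lower, because MCC also penalizes the many false positives relative to the few true positives.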
|
165
|
Funck D, Clauß K, Frommer WB, Hellmann HA. The Arabidopsis CstF64-Like RSR1/ESP1 Protein Participates in Glucose Signaling and Flowering Time Control. FRONTIERS IN PLANT SCIENCE 2012; 3:80. [PMID: 22629280 PMCID: PMC3355569 DOI: 10.3389/fpls.2012.00080] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 02/22/2012] [Accepted: 04/10/2012] [Indexed: 05/03/2023]
Abstract
Mechanisms for sensing and regulating metabolic processes at the cellular level are critical for the general physiology and development of living organisms. In higher plants, sugar signaling is crucial for adequate regulation of carbon and energy metabolism and affects virtually every aspect of development. Although many genes are regulated by sugar levels, little is known about how sugar levels are measured by plants. Several components of the sugar signaling network have been unraveled and demonstrated to have extensive overlap with hormone signaling networks. Here we describe the reduced sugar response1-1 (rsr1-1) mutant as a new early flowering mutant that displays decreased sensitivity to abscisic acid. Both hexokinase1 (HXK1)-dependent and glucose phosphorylation-independent signaling are reduced in rsr1-1. Map-based identification of the affected locus demonstrated that rsr1-1 carries a premature stop codon in the gene for a CstF64-like putative RNA processing factor, ESP1, which is involved in mRNA 3'-end formation. The identification of RSR1/ESP1 as a nuclear protein with a potential threonine phosphorylation site may explain the impact of protein phosphorylation cascades on sugar-dependent signal transduction. Additionally, RSR1/ESP1 may be a crucial factor in linking sugar signaling to the control of flowering time.
Affiliation(s)
- Dietmar Funck
- Department of Plant Physiology and Biochemistry, University Konstanz, Konstanz, Germany
- Karen Clauß
- Department of Plant Biology, Carnegie Institution for Science, Stanford, CA, USA
- Wolf B. Frommer
- Department of Plant Biology, Carnegie Institution for Science, Stanford, CA, USA
- *Correspondence: Wolf B. Frommer, Department of Plant Biology, Carnegie Institution for Science, 260 Panama Street, Stanford, CA 94306, USA. e-mail:
- Hanjo A. Hellmann
- School of Biological Sciences, Washington State University, Pullman, WA, USA
|
166
|
Jin L, Tang H, Fang W. Prediction of protein subcellular locations using a new measure of information discrepancy. J Bioinform Comput Biol 2011; 3:915-27. [PMID: 16078367 DOI: 10.1142/s0219720005001399] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 09/21/2004] [Accepted: 12/27/2004] [Indexed: 11/18/2022]
Abstract
Given a raw protein sequence, knowing its subcellular location is an important step toward understanding its function and designing further experiments. A novel method is proposed for the prediction of protein subcellular locations from sequences. For four categories of eukaryotic proteins, the overall predictive accuracy is 82.0%, which is 2.6% higher than that obtained with an SVM approach. For three subcellular locations of prokaryotic proteins, an overall accuracy of 89.9% is obtained. In accordance with the architecture of cells, a hierarchical prediction approach is designed. Based on amino acid composition, extracellular and intracellular proteins can be distinguished with an accuracy of 97%.
Affiliation(s)
- Lixia Jin
- Bioinformatics and Computational Biology, Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University, Ames, IA 50010, USA.
|
167
|
Assfalg J, Gong J, Kriegel HP, Pryakhin A, Wei T, Zimek A. Supervised ensembles of prediction methods for subcellular localization. J Bioinform Comput Biol 2011; 7:269-85. [DOI: 10.1142/s0219720009004072] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 05/30/2008] [Revised: 10/15/2008] [Accepted: 10/18/2008] [Indexed: 11/18/2022]
Abstract
In the past decade, many automated prediction methods for the subcellular localization of proteins have been proposed, utilizing a wide range of principles and learning approaches. Based on an experimental evaluation of different methods and their theoretical properties, we propose to combine a well-balanced set of existing approaches into new, ensemble-based prediction methods. The experimental evaluation shows that our ensembles improve substantially over the underlying base methods.
Affiliation(s)
- Johannes Assfalg
- Institute for Informatics, Ludwig-Maximilians-Universität München, Oettingenstrasse 67, 80538 Munich, Germany
- Jing Gong
- Institute for Informatics, Ludwig-Maximilians-Universität München, Oettingenstrasse 67, 80538 Munich, Germany
- Hans-Peter Kriegel
- Institute for Informatics, Ludwig-Maximilians-Universität München, Oettingenstrasse 67, 80538 Munich, Germany
- Alexey Pryakhin
- Institute for Informatics, Ludwig-Maximilians-Universität München, Oettingenstrasse 67, 80538 Munich, Germany
- Tiandi Wei
- Institute for Informatics, Ludwig-Maximilians-Universität München, Oettingenstrasse 67, 80538 Munich, Germany
- Arthur Zimek
- Institute for Informatics, Ludwig-Maximilians-Universität München, Oettingenstrasse 67, 80538 Munich, Germany
|
168
|
Gao QB, Zhao H, Ye X, He J. Prediction of pattern recognition receptor family using pseudo-amino acid composition. Biochem Biophys Res Commun 2011; 417:73-7. [PMID: 22138239 DOI: 10.1016/j.bbrc.2011.11.057] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Received: 11/02/2011] [Accepted: 11/12/2011] [Indexed: 01/21/2023]
Abstract
Pattern recognition receptors (PRRs) play a key role in the innate immune response by recognizing pathogen-associated molecular patterns derived from a diverse collection of microbial pathogens. PRRs form a superfamily of proteins related to host health and disease. Thus, prediction of the PRR family might supply biologically significant information for functional annotation of PRRs and development of novel drugs. In this paper, a computational method is proposed for predicting the families of PRRs. The prediction was performed on the basis of amino acid composition and pseudo-amino acid composition (PseAAC) from primary sequences of proteins using support vector machines. A non-redundant dataset consisting of 332 PRRs in seven families was constructed for training and testing. It was demonstrated that different families of PRRs were quite closely correlated with amino acid composition as well as PseAAC. In the jackknife test, overall accuracies of amino acid composition-based and PseAAC-based classifiers reached 96.1% and 97.9%, respectively. The results indicate that families of PRRs are predictable with high accuracy. It is anticipated that this computational method might be a powerful tool for the automated assignment of families of PRRs.
Affiliation(s)
- Qing-Bin Gao
- Department of Health Statistics, Second Military Medical University, Shanghai, China
|
169
|
Du P, Li T, Wang X. Recent progress in predicting protein sub-subcellular locations. Expert Rev Proteomics 2011; 8:391-404. [PMID: 21679119 DOI: 10.1586/epr.11.20] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.4] [Indexed: 11/08/2022]
Abstract
In the last two decades, the number of known protein sequences has increased very rapidly. However, knowledge of protein function exists for only a small portion of these sequences. Since the experimental approaches for determining protein functions are costly and time consuming, in silico methods have been introduced to bridge the gap between knowledge of protein sequences and their functions. Knowing the subcellular location of a protein is considered to be a critical step in understanding its biological functions. Many efforts have been undertaken to predict protein subcellular locations in silico. With the accumulation of available data, the substructures of some subcellular organelles, such as the cell nucleus, mitochondria and chloroplasts, have been taken into consideration by several studies in recent years. These studies create a new research topic, namely 'protein sub-subcellular location prediction', which goes one level deeper than classic protein subcellular location prediction.
Affiliation(s)
- Pufeng Du
- School of Computer Science and Technology, Tianjin University, Tianjin 300072, China
|
170
|
Qu W, Yang B, Jiang W, Wang L. HYBP_PSSP: a hybrid back propagation method for predicting protein secondary structure. Neural Comput Appl 2011. [DOI: 10.1007/s00521-011-0739-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/29/2022]
|
171
|
Mooney C, Wang YH, Pollastri G. SCLpred: protein subcellular localization prediction by N-to-1 neural networks. Bioinformatics 2011; 27:2812-9. [PMID: 21873639 DOI: 10.1093/bioinformatics/btr494] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.2] [Indexed: 11/13/2022] Open
Abstract
SUMMARY: Knowledge of the subcellular location of a protein provides valuable information about its function and possible interaction with other proteins. In the post-genomic era, fast and accurate predictors of subcellular location are required if this abundance of sequence data is to be fully exploited. We have developed a subcellular localization predictor (SCLpred), which predicts the location of a protein into four classes for animals and fungi and five classes for plants (secreted, cytoplasm, nucleus, mitochondrion and chloroplast) using machine learning models trained on large non-redundant sets of protein sequences. The algorithm powering SCLpred is a novel Neural Network (N-to-1 Neural Network, or N1-NN) we have developed, which is capable of mapping whole sequences into single properties (a functional class, in this work) without resorting to predefined transformations, but rather by adaptively compressing the sequence into a hidden feature vector. We benchmark SCLpred against other publicly available predictors using two benchmarks including a new subset of Swiss-Prot Release 2010_06. We show that SCLpred surpasses the state of the art. The N1-NN algorithm is fully general and may be applied to a host of problems of similar shape, that is, in which a whole sequence needs to be mapped into a fixed-size array of properties, and the adaptive compression it operates may shed light on the space of protein sequences. AVAILABILITY: The predictive systems described in this article are publicly available as a web server at http://distill.ucd.ie/distill/. CONTACT: gianluca.pollastri@ucd.ie.
Affiliation(s)
- Catherine Mooney
- School of Computer Science and Informatics, University College Dublin, Belfield, Ireland
|
172
|
Jiao Y, D'haeseleer P, Dill BD, Shah M, VerBerkmoes NC, Hettich RL, Banfield JF, Thelen MP. Identification of biofilm matrix-associated proteins from an acid mine drainage microbial community. Appl Environ Microbiol 2011; 77:5230-7. [PMID: 21685158 PMCID: PMC3147463 DOI: 10.1128/aem.03005-10] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.0] [Received: 12/22/2010] [Accepted: 06/03/2011] [Indexed: 01/01/2023] Open
Abstract
In microbial communities, extracellular polymeric substances (EPS), also called the extracellular matrix, provide the spatial organization and structural stability during biofilm development. One of the major components of EPS is protein, but it is not clear what specific functions these proteins contribute to the extracellular matrix or to microbial physiology. To investigate this in biofilms from an extremely acidic environment, we used shotgun proteomics analyses to identify proteins associated with EPS in biofilms at two developmental stages, designated DS1 and DS2. The proteome composition of the EPS was significantly different from that of the cell fraction, with more than 80% of the cellular proteins underrepresented or undetectable in EPS. In contrast, predicted periplasmic, outer membrane, and extracellular proteins were overrepresented by 3- to 7-fold in EPS. Also, EPS proteins were more basic by ∼2 pH units on average and about half the length. When categorized by predicted function, proteins involved in motility, defense, cell envelope, and unknown functions were enriched in EPS. Chaperones, such as histone-like DNA binding protein and cold shock protein, were overrepresented in EPS. Enzymes, such as protein peptidases, disulfide-isomerases, and those associated with cell wall and polysaccharide metabolism, were also detected. Two of these enzymes, identified as β-N-acetylhexosaminidase and cellulase, were confirmed in the EPS fraction by enzymatic activity assays. Compared to the differences between EPS and cellular fractions, the relative differences in the EPS proteomes between DS1 and DS2 were smaller and consistent with expected physiological changes during biofilm development.
Affiliation(s)
- Patrik D'haeseleer
- Computations Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550
- Manesh Shah
- Biosciences Divisions, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831
- Jillian F. Banfield
- Department of Environmental Science, Policy, and Management, University of California, Berkeley, California 94720
|
173
|
Robust prediction of protein subcellular localization combining PCA and WSVMs. Comput Biol Med 2011; 41:648-52. [PMID: 21722885 DOI: 10.1016/j.compbiomed.2011.05.016] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 09/18/2009] [Revised: 04/09/2011] [Accepted: 05/28/2011] [Indexed: 11/21/2022]
Abstract
Automated prediction of protein subcellular localization is an important tool for genome annotation and drug discovery, and Support Vector Machines (SVMs) can effectively solve this problem in a supervised manner. However, datasets obtained from real experiments are likely to contain outliers or noise, which can lead to poor generalization ability and classification accuracy. To address this problem, we adopt two strategies to lower the effect of outliers. First, we design a method based on Weighted SVMs (WSVMs): different weights are assigned to different data points, so the training algorithm learns the decision boundary according to the relative importance of the data points. Second, we analyse the influence of Principal Component Analysis (PCA) on WSVM classification and propose a hybrid classifier combining the merits of both PCA and WSVM. After performing dimension reduction operations on the datasets, a kernel-based possibilistic c-means algorithm can generate more suitable weights for the training, as PCA transforms the data into a new coordinate system with largest variances affected greatly by the outliers. Experiments on benchmark datasets show promising results, which confirm the effectiveness of the proposed method in terms of prediction accuracy.
|
174
|
Mallika V, Sivakumar KC, Soniya EV. Evolutionary Implications and Physicochemical Analyses of Selected Proteins of Type III Polyketide Synthase Family. Evol Bioinform Online 2011; 7:41-53. [PMID: 21697991 PMCID: PMC3118698 DOI: 10.4137/ebo.s6854] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 11/05/2022] Open
Abstract
Type III polyketide synthases have a substantial role in the biosynthesis of various polyketides in plants and microorganisms. Comparative proteomic analysis of type III polyketide synthases showed evolutionarily and structurally related positions in a compilation of amino acid sequences from different families. Bacterial and fungal type III polyketide synthase proteins showed <50% similarity, but in higher plants similarity was >80% among chalcone synthases and >70% among non-chalcone synthases. In a consensus phylogenetic tree based on 1000 replicates, bacterial, fungal and plant proteins were clustered in separate groups. Proteins from bryophytes and pteridophytes grouped immediately next to the fungal cluster, indicating how the evolutionary lineage has developed among type III polyketide synthase proteins. Upon physicochemical analysis, it was observed that the proteins localized in the cytoplasm and were hydrophobic in nature. Molecular structural analysis revealed a comparatively stable structure comprising alpha helices and random coils as major structural components. In silico mutation studies predicted a decline in structural stability with active site mutation.
|
175
|
Lythgow KT, Hudson G, Andras P, Chinnery PF. A critical analysis of the combined usage of protein localization prediction methods: Increasing the number of independent data sets can reduce the accuracy of predicted mitochondrial localization. Mitochondrion 2011; 11:444-9. [PMID: 21195798 PMCID: PMC3081538 DOI: 10.1016/j.mito.2010.12.016] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/07/2010] [Revised: 12/14/2010] [Accepted: 12/21/2010] [Indexed: 11/16/2022]
Abstract
In the absence of a comprehensive experimentally derived mitochondrial proteome, several bioinformatic approaches have been developed to aid the identification of novel mitochondrial disease genes within mapped nuclear genetic loci. Often, many classifiers are combined to increase the sensitivity and specificity of the predictions. Here we show that the greatest sensitivity and specificity are obtained by using a combination of seven carefully selected classifiers. We also show that increasing the number of independent prediction methods can paradoxically decrease the accuracy of predicting mitochondrial localization. This approach will help to accelerate the identification of new mitochondrial disease genes by providing a principled way for the selection for combination of appropriate prediction methods of mitochondrial localization of proteins.
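The idea of combining several localization classifiers and scoring the result can be illustrated with a minimal sketch. It assumes a simple "at least k of n" voting rule, which is not necessarily the combination scheme used in the paper; the function names `combine_votes` and `sensitivity_specificity` and the toy data are hypothetical.

```python
from typing import List, Sequence, Tuple


def combine_votes(predictions: Sequence[Sequence[bool]], min_votes: int) -> List[bool]:
    """Call a protein mitochondrial if at least `min_votes` classifiers agree.

    predictions[i][j] is classifier i's call for protein j (hypothetical layout).
    """
    n_proteins = len(predictions[0])
    return [
        sum(clf[j] for clf in predictions) >= min_votes
        for j in range(n_proteins)
    ]


def sensitivity_specificity(calls: Sequence[bool], truth: Sequence[bool]) -> Tuple[float, float]:
    """Return (sensitivity, specificity); assumes both classes occur in `truth`."""
    tp = sum(c and t for c, t in zip(calls, truth))
    fn = sum((not c) and t for c, t in zip(calls, truth))
    tn = sum((not c) and (not t) for c, t in zip(calls, truth))
    fp = sum(c and (not t) for c, t in zip(calls, truth))
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Toy, made-up labels: three true mitochondrial proteins, three that are not.
    truth = [True, True, True, False, False, False]
    classifiers = [
        [True, True, False, False, False, True],
        [True, False, True, False, True, False],
        [True, True, True, True, False, False],
    ]
    for k in (1, 2, 3):
        calls = combine_votes(classifiers, k)
        print(k, sensitivity_specificity(calls, truth))
```

In this toy data the loosest rule (any single vote) accepts every protein, so adding classifiers under that rule can only lower specificity, which is one simple way a larger ensemble can hurt accuracy.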
Affiliation(s)
- Kieren T. Lythgow
- Institute of Human Genetics, Newcastle University, Central Parkway, Newcastle upon Tyne, NE1 3BZ, UK
- Gavin Hudson
- Institute of Human Genetics, Newcastle University, Central Parkway, Newcastle upon Tyne, NE1 3BZ, UK
- Peter Andras
- School of Computing Science, Newcastle University, Newcastle upon Tyne, NE1 7RU, UK
- Patrick F. Chinnery
- Institute of Human Genetics, Newcastle University, Central Parkway, Newcastle upon Tyne, NE1 3BZ, UK
|
176
|
Xu Q, Pan SJ, Xue HH, Yang Q. Multitask learning for protein subcellular location prediction. IEEE/ACM Trans Comput Biol Bioinform 2011; 8:748-759. [PMID: 20421687 DOI: 10.1109/tcbb.2010.22] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.6] [Indexed: 05/29/2023]
Abstract
Protein subcellular localization is concerned with predicting the location of a protein within a cell using computational methods. The location information can indicate key functionalities of proteins. Thus, accurate prediction of subcellular localizations of proteins can help the prediction of protein functions and genome annotations, as well as the identification of drug targets. Machine learning methods such as Support Vector Machines (SVMs) have been used in the past for the problem of protein subcellular localization, but have been shown to suffer from a lack of annotated training data in each species under study. To overcome this data sparsity problem, we observe that because some of the organisms may be related to each other, there may be some commonalities across different organisms that can be discovered and used to help boost the data in each localization task. In this paper, we formulate the protein subcellular localization problem as one of multitask learning across different organisms. We adapt and compare two specializations of the multitask learning algorithms on 20 different organisms. Our experimental results show that multitask learning performs much better than the traditional single-task methods. Among the different multitask learning methods, we found that the multitask kernels and supertype kernels under multitask learning that share parameters perform slightly better than multitask learning by sharing latent features. The most significant improvement in terms of localization accuracy is about 25 percent. We find that if the organisms are very different or are only remotely related from a biological point of view, then jointly training the multiple models cannot lead to significant improvement. However, if they are closely related biologically, multitask learning can do much better than individual learning.
Affiliation(s)
- Qian Xu
- Bioengineering Program, Hong Kong University of Science and Technology, Clearwater Bay, Kowloon, Hong Kong.
|
177
|
Mito-GSAAC: mitochondria prediction using genetic ensemble classifier and split amino acid composition. Amino Acids 2011; 42:1443-54. [DOI: 10.1007/s00726-011-0888-0] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.2] [Received: 09/08/2010] [Accepted: 03/09/2011] [Indexed: 12/15/2022]
|
178
|
Arenas NE, Salazar LM, Soto CY, Vizcaíno C, Patarroyo ME, Patarroyo MA, Gómez A. Molecular modeling and in silico characterization of Mycobacterium tuberculosis TlyA: possible misannotation of this tubercle bacilli-hemolysin. BMC Struct Biol 2011; 11:16. [PMID: 21443791 PMCID: PMC3072309 DOI: 10.1186/1472-6807-11-16] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.7] [Received: 12/30/2010] [Accepted: 03/28/2011] [Indexed: 11/24/2022]
Abstract
Background: The TlyA protein has a controversial function as a virulence factor in Mycobacterium tuberculosis (M. tuberculosis). At present, its dual activity as hemolysin and RNA methyltransferase in M. tuberculosis has been indirectly proposed based on in vitro results. There is no evidence, however, for TlyA relevance in the survival of tubercle bacilli inside host cells or whether both activities are functionally linked. A thorough analysis of structure prediction for this mycobacterial protein in this study shows the need for reevaluating TlyA's function in virulence. Results: Bioinformatics analysis of TlyA identified a ribosomal protein binding domain (S4 domain), located between residues 5 and 68 as well as an FtsJ-like methyltransferase domain encompassing residues 62 and 247, all of which have been previously described in translation machinery-associated proteins. Subcellular localization prediction showed that TlyA lacks a signal peptide and its hydrophobicity profile showed no evidence of transmembrane helices. These findings suggested that it may not be attached to the membrane, which is consistent with a cytoplasmic localization. Three-dimensional modeling of TlyA showed a consensus structure, having a common core formed by a six-stranded β-sheet between two α-helix layers, which is consistent with an RNA methyltransferase structure. Phylogenetic analyses showed high conservation of the tlyA gene among Mycobacterium species. Additionally, the nucleotide substitution rates suggested purifying selection during tlyA gene evolution and the absence of a common ancestor between TlyA proteins and bacterial pore-forming proteins. Conclusion: Altogether, our manual in silico curation suggested that TlyA is involved in ribosomal biogenesis and that there is a functional annotation error regarding this protein family in several microbial and plant genomes, including the M. tuberculosis genome.
Affiliation(s)
- Nelson E Arenas
- Departamento de Química, Facultad de Ciencias, Universidad Nacional de Colombia, Carrera 45 No. 26-85 Bogotá, DC. Colombia
|
179
|
Predicting protein secondary structure using a mixed-modal SVM method in a compound pyramid model. Knowl Based Syst 2011. [DOI: 10.1016/j.knosys.2010.10.002] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.9] [Indexed: 11/23/2022]
|
180
|
Yang WY, Lu BL, Kwok JT. Incorporating cellular sorting structure for better prediction of protein subcellular locations. J Exp Theor Artif Intell 2011. [DOI: 10.1080/0952813x.2010.506303] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/18/2022]
|
181
|
Yu X, Zheng X, Liu T, Dou Y, Wang J. Predicting subcellular location of apoptosis proteins with pseudo amino acid composition: approach from amino acid substitution matrix and auto covariance transformation. Amino Acids 2011; 42:1619-25. [DOI: 10.1007/s00726-011-0848-8] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.4] [Received: 09/10/2010] [Accepted: 02/09/2011] [Indexed: 12/13/2022]
|
182
|
Moretti M, Grunau A, Minerdi D, Gehrig P, Roschitzki B, Eberl L, Garibaldi A, Gullino ML, Riedel K. A proteomics approach to study synergistic and antagonistic interactions of the fungal-bacterial consortium Fusarium oxysporum wild-type MSA 35. Proteomics 2011; 10:3292-320. [PMID: 20707000 DOI: 10.1002/pmic.200900716] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Indexed: 11/09/2022]
Abstract
Fusarium oxysporum is an important plant pathogen that causes severe damage to many economically important crop species. Various microorganisms have been shown to inhibit this soil-borne plant pathogen, including non-pathogenic F. oxysporum strains. In this study, F. oxysporum wild-type (WT) MSA 35, a biocontrol multispecies consortium that consists of a fungus and numerous rhizobacteria mainly belonging to gamma-proteobacteria, was analyzed by two complementary metaproteomic approaches (2-DE combined with MALDI-TOF/TOF MS and 1-D PAGE combined with LC-ESI-MS/MS) to identify fungal or bacterial factors potentially involved in antagonistic or synergistic interactions between the consortium members. Moreover, the proteome profiles of F. oxysporum WT MSA 35 and its cured counterpart CU MSA 35 (WT treated with antibiotics) were compared to unravel the bacterial impact on consortium functioning. Our study presents the first proteome mapping of an antagonistic F. oxysporum strain and proposes candidate proteins that might play an important role in the biocontrol activity and the close interrelationship between the fungus and its bacterial partners.
Affiliation(s)
- Marino Moretti
- Agroinnova-Centre of Competence for the Innovation in the Agro-Environmental Field, University of Torino, Torino, Italy.
|
183
|
Cui J, Liu J, Li Y, Shi T. Integrative identification of Arabidopsis mitochondrial proteome and its function exploitation through protein interaction network. PLoS One 2011; 6:e16022. [PMID: 21297957 PMCID: PMC3031521 DOI: 10.1371/journal.pone.0016022] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.1] [Received: 09/10/2010] [Accepted: 12/03/2010] [Indexed: 02/07/2023] Open
Abstract
Mitochondria are major players in the production of energy, and host several key reactions involved in basic metabolism and biosynthesis of essential molecules. Currently, the majority of nucleus-encoded mitochondrial proteins are unknown even for the model plant Arabidopsis. We report a computational framework for predicting Arabidopsis mitochondrial proteins based on a probabilistic model, called Naive Bayesian Network, which integrates disparate genomic data generated from eight bioinformatics tools, multiple orthologous mappings, protein domain properties and co-expression patterns using 1,027 microarray profiles. Through this approach, we predicted 2,311 candidate mitochondrial proteins with 84.67% accuracy and a 2.53% false positive rate (FPR). Together with experimentally confirmed proteins, 2,585 mitochondrial proteins (named CoreMitoP) were identified. We explored the proteins with unknown functions based on a protein-protein interaction network (PIN) and annotated novel functions for 26.65% of CoreMitoP proteins. Moreover, we found newly predicted mitochondrial proteins embedded in particular subnetworks of the PIN, mainly functioning in response to diverse environmental stresses such as salt, drought, cold, and wounding. Candidate mitochondrial proteins involved in those physiological activities provide useful targets for further investigation. Assigned functions also provide comprehensive information for the Arabidopsis mitochondrial proteome.
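How a naive Bayes model integrates independent lines of evidence can be shown with a minimal sketch. It assumes binary evidence features and conditional independence given the class; the feature names, likelihood values, and prior below are invented for illustration and are not taken from the paper.

```python
import math
from typing import Dict, Tuple


def posterior_mitochondrial(
    evidence: Dict[str, bool],
    likelihoods: Dict[str, Tuple[float, float]],
    prior: float,
) -> float:
    """Posterior probability that a protein is mitochondrial.

    likelihoods[name] = (P(feature present | mitochondrial),
                         P(feature present | not mitochondrial)).
    Features are treated as conditionally independent (the naive Bayes assumption).
    """
    log_odds = math.log(prior / (1.0 - prior))
    for name, present in evidence.items():
        p_mito, p_other = likelihoods[name]
        if present:
            log_odds += math.log(p_mito / p_other)
        else:
            log_odds += math.log((1.0 - p_mito) / (1.0 - p_other))
    return 1.0 / (1.0 + math.exp(-log_odds))


if __name__ == "__main__":
    # Hypothetical evidence sources and made-up likelihoods.
    likelihoods = {
        "predicted_by_targeting_tool": (0.70, 0.05),
        "ortholog_is_mitochondrial": (0.60, 0.02),
        "coexpressed_with_known_mito": (0.50, 0.10),
    }
    evidence = {
        "predicted_by_targeting_tool": True,
        "ortholog_is_mitochondrial": False,
        "coexpressed_with_known_mito": True,
    }
    print(posterior_mitochondrial(evidence, likelihoods, prior=0.08))
```

Working in log-odds keeps the update additive: each evidence source contributes one log-likelihood-ratio term, so sources can be added or dropped without changing the rest of the calculation.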
Affiliation(s)
- Jian Cui
- College of Life Sciences, Center for Bioinformatics and Institute of Biomedical Sciences, East China Normal University, Shanghai, China
- College of Life Sciences, Northeast Forestry University, Harbin, Heilongjiang, China
- Daqing Institute of Biotechnology, Northeast Forestry University, Daqing, Heilongjiang, China
- Jinghua Liu
- Southern Medical University, Guangzhou, Guangdong, China
- Daqing Institute of Biotechnology, Northeast Forestry University, Daqing, Heilongjiang, China
- Yuhua Li
- College of Life Sciences, Center for Bioinformatics and Institute of Biomedical Sciences, East China Normal University, Shanghai, China
- Daqing Institute of Biotechnology, Northeast Forestry University, Daqing, Heilongjiang, China
- Tieliu Shi
- College of Life Sciences, Northeast Forestry University, Harbin, Heilongjiang, China
- Shanghai Information Center for Life Sciences, Chinese Academy of Sciences, Shanghai, China
- Daqing Institute of Biotechnology, Northeast Forestry University, Daqing, Heilongjiang, China
|
184
|
Gianazza E, Eberini I, Sensi C, Barile M, Vergani L, Vanoni MA. Energy matters: mitochondrial proteomics for biomedicine. Proteomics 2011; 11:657-74. [PMID: 21241019 DOI: 10.1002/pmic.201000412] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 07/05/2010] [Revised: 09/22/2010] [Accepted: 11/03/2010] [Indexed: 12/16/2022]
Abstract
This review compiles results of medical relevance from mitochondrial proteomics, grouped either according to the type of disease (genetic or degenerative) or to the mechanism involved (oxidative stress or apoptosis). The findings are discussed in the light of our current understanding of uniformity/variability in cell responses to different stimuli. Specificities in the conceptual and technical approaches to human mitochondrial proteomics are also outlined.
Affiliation(s)
- Elisabetta Gianazza
- Dipartimento di Scienze Farmacologiche, Università degli Studi di Milano, Milano, Italy.
|
185
|
Discrimination of Golgi type II membrane proteins based on their hydropathy profiles and the amino acid propensities of their transmembrane regions. Biosci Biotechnol Biochem 2011; 75:82-8. [PMID: 21228484 DOI: 10.1271/bbb.100571] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 12/30/2022]
Abstract
Membrane proteins in the Golgi apparatus play important roles in biological functions, predominantly as catalysts related to post-translational modification of protein oligosaccharides. We succeeded in extracting the characteristics of Golgi type II membrane proteins computationally by comparison with those of Golgi no retention proteins, which are mainly localized in the plasma membrane. Golgi type II membrane proteins were detected by combining hydropathy alignment and a position-specific score matrix of the amino acid propensities around the transmembrane region. We achieved 96.2% sensitivity, 93.5% specificity, and a 0.949 success rate in a self-consistency test. In a 5-fold cross-validation test, 88.0% sensitivity, 85.5% specificity, and a 0.867 success rate were achieved.
|
186
|
Zhang S, Ding S, Wang T. High-accuracy prediction of protein structural class for low-similarity sequences based on predicted secondary structure. Biochimie 2011; 93:710-4. [PMID: 21237245 DOI: 10.1016/j.biochi.2011.01.001] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.6] [Received: 12/04/2010] [Accepted: 01/04/2011] [Indexed: 11/30/2022]
Abstract
Information on the structural classes of proteins has been proven to be important in many fields of bioinformatics. Prediction of protein structural class for low-similarity sequences is a challenging problem. In this study, 11 features (including 8 re-used features and 3 newly-designed features) are rationally utilized to reflect the general contents and spatial arrangements of the secondary structural elements of a given protein sequence. To evaluate the performance of the proposed method, jackknife cross-validation tests are performed on two widely used benchmark datasets, 1189 and 25PDB, with sequence similarity lower than 40% and 25%, respectively. Comparison of our results with other methods shows that our proposed method is very promising and may provide a cost-effective alternative for predicting protein structural class, in particular for low-similarity datasets.
Affiliation(s)
- Shengli Zhang
- School of Mathematical Sciences, Dalian University of Technology, Ganjingzi District, Dalian, Liaoning, PR China.
|
187
|
Tedelind S, Poliakova K, Valeta A, Hunegnaw R, Yemanaberhan EL, Heldin NE, Kurebayashi J, Weber E, Kopitar-Jerala N, Turk B, Bogyo M, Brix K. Nuclear cysteine cathepsin variants in thyroid carcinoma cells. Biol Chem 2011; 391:923-35. [PMID: 20536394 DOI: 10.1515/bc.2010.109] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.6] [Indexed: 02/07/2023]
Abstract
The cysteine peptidase cathepsin B is important in thyroid physiology by being involved in thyroid prohormone processing initiated in the follicular lumen and completed in endo-lysosomal compartments. However, cathepsin B has also been localized to the extrafollicular space and is therefore suggested to promote invasiveness and metastasis in thyroid carcinomas through, e.g., ECM degradation. In this study, immunofluorescence and biochemical data from subcellular fractionation revealed that cathepsin B, in its single- and two-chain forms, is localized to endo-lysosomes in the papillary thyroid carcinoma cell line KTC-1 and in the anaplastic thyroid carcinoma cell lines HTh7 and HTh74. This distribution is not affected by thyroid stimulating hormone (TSH) incubation of HTh74, the only cell line that expresses a functional TSH-receptor. Immunofluorescence data disclosed an additional nuclear localization of cathepsin B immunoreactivity. This was supported by biochemical data showing a proteolytically active variant slightly smaller than the cathepsin B proform in nuclear fractions. We also demonstrate that immunoreactions specific for cathepsin V, but not cathepsin L, are localized to the nucleus in HTh74 in peri-nucleolar patterns. As deduced from co-localization studies and in vitro degradation assays, we suggest that nuclear variants of cathepsins are involved in the development of thyroid malignancies through modification of DNA-associated proteins.
Affiliation(s)
- Sofia Tedelind
- Research Center of Molecular Life Science, School of Engineering and Science, Jacobs University Bremen, Bremen, Germany.
|
188
|
Interleukin-4-inducing principle from Schistosoma mansoni eggs contains a functional C-terminal nuclear localization signal necessary for nuclear translocation in mammalian cells but not for its uptake. Infect Immun 2011; 79:1779-88. [PMID: 21220486 DOI: 10.1128/iai.01048-10] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.1] [Indexed: 12/22/2022] Open
Abstract
Interleukin-4-inducing principle from schistosome eggs (IPSE/alpha-1) is a protein produced exclusively by the eggs of the trematode Schistosoma mansoni. IPSE/alpha-1 is a secretory glycoprotein which activates human basophils via an IgE-dependent but non-antigen-specific mechanism. Sequence analyses revealed a potential nuclear localization signal (NLS) at the C terminus of IPSE/alpha-1. Here we show that this sequence (125-PKRRRTY-131) is both necessary and sufficient for nuclear localization of IPSE or IPSE-enhanced green fluorescent protein (EGFP) fusions. While transiently expressed EGFP-IPSE/alpha-1 was exclusively nuclear in the Huh7 and U-2 OS cell lines, a mutant lacking amino acids 125 to 134 showed both nuclear and cytoplasmic staining. Moreover, insertion of the IPSE/alpha-1 NLS into a tetra-EGFP construct rendered the protein nuclear. Alanine scanning mutagenesis revealed a requirement for the KRRR residues. Fluorescence microscopy depicted, and Western blotting further confirmed, that recombinant IPSE/alpha-1 protein added exogenously is rapidly internalized by CHO cells and accumulates in nuclei in an NLS-dependent manner. A mutant protein in which the NLS motif was disrupted by triple mutation (RRR to AAA) was able to penetrate CHO cells but did not translocate to the nucleus. Furthermore, the uptake of native glycosylated IPSE/alpha-1 was confirmed in human primary monocyte-derived dendritic cells and was found to be a calcium- and temperature-dependent process. Live-cell imaging showed that IPSE/alpha-1 is not targeted to lysosomes. In contrast, peripheral blood basophils do not take up IPSE/alpha-1 and do not require the presence of an intact NLS for activation. Taken together, our results suggest that IPSE/alpha-1 may have additional nuclear functions in host cells.
|
189
|
Ryngajllo M, Childs L, Lohse M, Giorgi FM, Lude A, Selbig J, Usadel B. SLocX: Predicting Subcellular Localization of Arabidopsis Proteins Leveraging Gene Expression Data. Front Plant Sci 2011; 2:43. [PMID: 22639594 PMCID: PMC3355584 DOI: 10.3389/fpls.2011.00043] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.8] [Received: 05/18/2011] [Accepted: 08/12/2011] [Indexed: 05/08/2023]
Abstract
Despite the growing volume of experimentally validated knowledge about the subcellular localization of plant proteins, a well performing in silico prediction tool is still a necessity. Existing tools, which employ information derived from protein sequence alone, offer limited accuracy and/or rely on full sequence availability. We explored whether gene expression profiling data can be harnessed to enhance prediction performance. To achieve this, we trained several support vector machines to predict the subcellular localization of Arabidopsis thaliana proteins using sequence derived information, expression behavior, or a combination of these data and compared their predictive performance through a cross-validation test. We show that gene expression carries information about the subcellular localization not available in sequence information, yielding dramatic benefits for plastid localization prediction, and some notable improvements for other compartments such as the mitochondrion, the Golgi, and the plasma membrane. Based on these results, we constructed a novel subcellular localization prediction engine, SLocX, combining gene expression profiling data with protein sequence-based information. We then validated the results of this engine using an independent test set of annotated proteins and a transient expression of GFP fusion proteins. Here, we present the prediction framework and a website of predicted localizations for Arabidopsis. The relatively good accuracy of our prediction engine, even in cases where only partial protein sequence is available (e.g., in sequences lacking the N-terminal region), offers a promising opportunity for similar application to non-sequenced or poorly annotated plant species. Although the prediction scope of our method is currently limited by the availability of expression information on the ATH1 array, we believe that the advances in measuring gene expression technology will make our method applicable for all Arabidopsis proteins.
Affiliation(s)
- Liam Childs
- Max Planck Institute of Molecular Plant Physiology, Potsdam, Germany
- Marc Lohse
- Max Planck Institute of Molecular Plant Physiology, Potsdam, Germany
- Anja Lude
- Max Planck Institute of Molecular Plant Physiology, Potsdam, Germany
- Joachim Selbig
- Department of Bioinformatics, Institute of Biochemistry and Biology, University of Potsdam, Potsdam, Germany
- Björn Usadel
- Max Planck Institute of Molecular Plant Physiology, Potsdam, Germany
- *Correspondence: Björn Usadel, Max Planck Institute of Molecular Plant Physiology, Am Muehlenberg 1, Golm, 14476 Potsdam, Germany. e-mail:
|
190
|
Wu ZC, Xiao X, Chou KC. iLoc-Plant: a multi-label classifier for predicting the subcellular localization of plant proteins with both single and multiple sites. Molecular BioSystems 2011; 7:3287-97. [PMID: 21984117 DOI: 10.1039/c1mb05232b] [Citation(s) in RCA: 163] [Impact Index Per Article: 11.6] [Indexed: 11/21/2022]
Affiliation(s)
- Zhi-Cheng Wu
- Computer Department, Jing-De-Zhen Ceramic Institute, Jing-De-Zhen 333046, China
|
191
|
Mooney C, Wang YH, Pollastri G. De Novo Protein Subcellular Localization Prediction by N-to-1 Neural Networks. Computational Intelligence Methods for Bioinformatics and Biostatistics 2011. [DOI: 10.1007/978-3-642-21946-7_3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/26/2023]
|
192
|
Mak MW, Wang W, Kung SY. Fast subcellular localization by cascaded fusion of signal-based and homology-based methods. Proteome Sci 2011; 9 Suppl 1:S8. [PMID: 22166017 PMCID: PMC3289086 DOI: 10.1186/1477-5956-9-s1-s8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022] Open
|
193
|
Naik PK, Ranjan P, Kesari P, Jain S. MetalloPred: A tool for hierarchical prediction of metal ion binding proteins using cluster of neural networks and sequence derived features. Journal of Biophysical Chemistry 2011. [DOI: 10.4236/jbpc.2011.22014] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/20/2022]
|
194
|
Shen YQ, Burger G. TESTLoc: protein subcellular localization prediction from EST data. BMC Bioinformatics 2010; 11:563. [PMID: 21078192 PMCID: PMC3000424 DOI: 10.1186/1471-2105-11-563] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 07/01/2010] [Accepted: 11/15/2010] [Indexed: 11/25/2022] Open
Abstract
Background: The eukaryotic cell has an intricate architecture with compartments and substructures dedicated to particular biological processes. Knowing the subcellular location of proteins not only indicates how bio-processes are organized in different cellular compartments, but also contributes to unravelling the function of individual proteins. Computational localization prediction is possible based on sequence information alone, and has been successfully applied to proteins from virtually all subcellular compartments and all domains of life. However, we realized that current prediction tools do not perform well on partial protein sequences such as those inferred from Expressed Sequence Tag (EST) data, limiting the exploitation of the large and taxonomically most comprehensive body of sequence information from eukaryotes. Results: We developed a new predictor, TESTLoc, suited for subcellular localization prediction of proteins based on their partial sequence conceptually translated from ESTs (EST-peptides). Support Vector Machine (SVM) is used as computational method and EST-peptides are represented by different features such as amino acid composition and physicochemical properties. When TESTLoc was applied to the most challenging test case (plant data), it yielded high accuracy (~85%). Conclusions: TESTLoc is a localization prediction tool tailored for EST data. It provides a variety of models for the users to choose from, and is available for download at http://megasun.bch.umontreal.ca/~shenyq/TESTLoc/TESTLoc.html
Affiliation(s)
- Yao-Qing Shen
- Robert-Cedergren Center for Bioinformatics and Genomics; Biochemistry Department, Université de Montréal, 2900 Edouard-Montpetit, Montreal, QC, H3T 1J4, Canada.
|
195
|
Lee YH, Tan HT, Chung MCM. Subcellular fractionation methods and strategies for proteomics. Proteomics 2010; 10:3935-56. [DOI: 10.1002/pmic.201000289] [Citation(s) in RCA: 74] [Impact Index Per Article: 4.9] [Indexed: 12/21/2022]
|
196
|
Zakeri P, Moshiri B, Sadeghi M. Prediction of protein submitochondria locations based on data fusion of various features of sequences. J Theor Biol 2010; 269:208-16. [PMID: 21040732 DOI: 10.1016/j.jtbi.2010.10.026] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.8] [Received: 08/10/2010] [Revised: 10/16/2010] [Accepted: 10/22/2010] [Indexed: 01/16/2023]
Abstract
In this study, predictors of protein submitochondria locations are developed based on various sequence features. Knowing the submitochondria location of a mitochondrial protein can provide a much better understanding of its function. We use ten representative models of protein samples, such as pseudo amino acid composition, dipeptide composition, functional domain composition, a combined discrete model based on predicted solvent accessibility and secondary structure elements, the discrete model of pairwise sequence similarity, etc. We construct a support vector machine (SVM)-based predictor for each representative model. In the leave-one-out cross-validation test, the overall prediction accuracy of the predictor based on the discrete model of pairwise sequence similarity is 1% better than that of the best existing computational system for this problem. Moreover, we develop a method based on ordered weighted averaging (OWA), a data fusion operator. OWA is applied to the 11 best SVM-based classifiers constructed from various sequence features. This method is called Mito-Loc. The overall leave-one-out cross-validation accuracy obtained by Mito-Loc is about 95%, which indicates that the proposed approach is superior to the best existing approach reported so far.
Affiliation(s)
- Pooya Zakeri
- Department of Electrical and Computer Engineering, Isfahan University of Technology, Isfahan, Iran
|
197
|
Desvaux M, Dumas E, Chafsey I, Chambon C, Hébraud M. Comprehensive appraisal of the extracellular proteins from a monoderm bacterium: theoretical and empirical exoproteomes of Listeria monocytogenes EGD-e by secretomics. J Proteome Res 2010; 9:5076-92. [PMID: 20839850 DOI: 10.1021/pr1003642] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.6] [Indexed: 01/01/2023]
Abstract
Defined as proteins actively transported via secretion systems, secreted proteins can have radically different subcellular destinations in monoderm (Gram-positive) bacteria. From degradative enzymes in saprophytes to virulence factors in pathogens, secreted proteins are the main tools used by bacteria to interact with their surroundings. The etiological agent of listeriosis, Listeria monocytogenes, is a Gram-positive facultative intracellular foodborne pathogen, whose ecological niche is the soil and as such should be primarily considered as a ubiquitous saprophyte. Recent advances in protein secretion systems in this species prompted us to investigate the exoproteome. First, an original and rational bioinformatic strategy was developed to mimic the protein exportation steps leading to the extracellular localization of secreted proteins; 79 exoproteins were predicted as secreted via Sec, 1 exoprotein via Tat, 4 bacteriocins via ABC exporters, 3 exoproteins via holins, and 3 exoproteins via the WXG100 system. This bioinformatic analysis allowed for defining a databank of the mature protein set in L. monocytogenes, which was used for generating the theoretical exoproteome and for subsequent protein identification by proteomics. 2-DE proteomic analyses were performed over a wide pI range to experimentally cover the largest protein spectrum possible. A total of 120 spots could be resolved and identified, which corresponded to 50 distinct proteins. These exoproteins were essentially virulence factors, degradative enzymes, and proteins of unknown functions, whose exportation would essentially rely on the Sec pathway or nonclassical secretion. This investigation resulted in the first comprehensive appraisal of the exoproteome of L. monocytogenes EGD-e based on theoretical and experimental secretomic analyses, which further provided indications on listerial physiology in relation to its habitat and lifestyle. The novel and rational strategy described here is generic and has been purposely designed for the prediction of proteins localized extracellularly in monoderm bacteria.
Affiliation(s)
- Mickaël Desvaux
- INRA, UR454 Microbiology, Food Quality and Safety Team, Saint-Genès Champanelle, France.
|
198
|
Marfori M, Mynott A, Ellis JJ, Mehdi AM, Saunders NFW, Curmi PM, Forwood JK, Bodén M, Kobe B. Molecular basis for specificity of nuclear import and prediction of nuclear localization. Biochimica et Biophysica Acta - Molecular Cell Research 2010; 1813:1562-77. [PMID: 20977914 DOI: 10.1016/j.bbamcr.2010.10.013] [Citation(s) in RCA: 315] [Impact Index Per Article: 21.0] [Received: 06/15/2010] [Revised: 10/15/2010] [Accepted: 10/19/2010] [Indexed: 01/03/2023]
Abstract
Although proteins are translated on cytoplasmic ribosomes, many of these proteins play essential roles in the nucleus, mediating key cellular processes including but not limited to DNA replication and repair as well as transcription and RNA processing. Thus, understanding how these critical nuclear proteins are accurately targeted to the nucleus is of paramount importance in biology. Interaction and structural studies in recent years have jointly revealed some general rules on the specificity determinants of the recognition of nuclear targeting signals by their specific receptors, at least for two nuclear import pathways: (i) the classical pathway, which involves the classical nuclear localization sequences (cNLSs) and the receptors importin-α/karyopherin-α and importin-β/karyopherin-β1; and (ii) the karyopherin-β2 pathway, which employs the proline-tyrosine (PY)-NLSs and the receptor transportin-1/karyopherin-β2. The understanding of specificity rules allows the prediction of protein nuclear localization. We review the current understanding of the molecular determinants of the specificity of nuclear import, focusing on the importin-α•cargo recognition, as well as the currently available databases and predictive tools relevant to nuclear localization. This article is part of a Special Issue entitled: Regulation of Signaling and Cellular Fate through Modulation of Nuclear Protein Import.
Affiliation(s)
- Mary Marfori
- School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, Queensland 4072, Australia
|
199
|
Liu T, Zheng X, Wang J. Prediction of protein structural class for low-similarity sequences using support vector machine and PSI-BLAST profile. Biochimie 2010; 92:1330-4. [DOI: 10.1016/j.biochi.2010.06.013] [Citation(s) in RCA: 98] [Impact Index Per Article: 6.5] [Received: 03/02/2010] [Accepted: 06/16/2010] [Indexed: 11/25/2022]
|
200
|
Prediction of midbody, centrosome and kinetochore proteins based on gene ontology information. Biochem Biophys Res Commun 2010; 401:382-4. [PMID: 20854791 DOI: 10.1016/j.bbrc.2010.09.061] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.0] [Received: 09/02/2010] [Accepted: 09/14/2010] [Indexed: 01/21/2023]
Abstract
In the process of cell division, a large number of proteins are assembled into three distinct organelles, namely the midbody, centrosome and kinetochore. Knowing the localization of microkit (midbody, centrosome and kinetochore) proteins will facilitate drug target discovery and provide novel insights into their functions. In this study, a support vector machine (SVM) model, MicekiPred, was presented to predict the localization of microkit proteins based on gene ontology (GO) information. A total accuracy of 77.51% was achieved using jackknife cross-validation. This result shows that the model will be an effective complementary tool for future experimental study. The prediction model and dataset used in this article can be freely downloaded from http://cobi.uestc.edu.cn/people/hlin/tools/MicekiPred/.
|