1
|
Ahammad RU, Nishioka T, Yoshimoto J, Kannon T, Amano M, Funahashi Y, Tsuboi D, Faruk MO, Yamahashi Y, Yamada K, Nagai T, Kaibuchi K. KANPHOS: A Database of Kinase-Associated Neural Protein Phosphorylation in the Brain. Cells 2021; 11:47. [PMID: 35011609 PMCID: PMC8750479 DOI: 10.3390/cells11010047] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/03/2021] [Revised: 12/19/2021] [Accepted: 12/21/2021] [Indexed: 12/15/2022]
Abstract
Protein phosphorylation plays critical roles in a variety of intracellular signaling pathways and physiological functions that are controlled by neurotransmitters and neuromodulators in the brain. Dysregulation of these signaling pathways has been implicated in neurodevelopmental disorders, including autism spectrum disorder, attention deficit hyperactivity disorder and schizophrenia. While recent advances in mass spectrometry-based proteomics have allowed us to identify approximately 280,000 phosphorylation sites, it remains largely unknown which sites are phosphorylated by which kinases. To overcome this issue, previously, we developed methods for comprehensive screening of the target substrates of given kinases, such as PKA and Rho-kinase, upon stimulation by extracellular signals and identified many candidate substrates for specific kinases and their phosphorylation sites. Here, we developed a novel online database to provide information about the phosphorylation signals identified by our methods, as well as those previously reported in the literature. The "KANPHOS" (Kinase-Associated Neural Phospho-Signaling) database and its web portal were built based on a next-generation XooNIps neuroinformatics tool. To explore the functionality of the KANPHOS database, we obtained phosphoproteomics data for adenosine-A2A-receptor signaling and its downstream MAPK-mediated signaling in the striatum/nucleus accumbens, registered them in KANPHOS, and analyzed the related pathways.
Grants
- JP18dm0207005, JP21dm0207075, JP21wm0425017 and JP21wm0425008 Japan Agency for Medical Research and Development
- JP16K18393, JP17H01380, JP17K07383, JP17H02220, JP17K19483, JP18K14849, JP19K16370, JP21K06428 and JP21K06427 Japan Society for the Promotion of Science
- JP17H05561, JP19H05209 and JP21H00196 Ministry of Education, Culture, Sports, Science and Technology
Affiliation(s)
- Rijwan Uddin Ahammad
- Department of Cell Pharmacology, Graduate School of Medicine, Nagoya University, 65 Tsurumai, Nagoya 466-8550, Japan
| | - Tomoki Nishioka
- Institute for Comprehensive Medical Science, Fujita Health University, Toyoake 470-1192, Japan
| | - Junichiro Yoshimoto
- Division of Information Science, Graduate School of Science and Technology, Nara Institute of Science and Technology, Ikoma 630-0192, Japan
| | - Takayuki Kannon
- Department of Bioinformatics and Genomics, Graduate School of Advanced Preventive Medical Science, Kanazawa University, Kanazawa 920-8640, Japan
| | - Mutsuki Amano
- Department of Cell Pharmacology, Graduate School of Medicine, Nagoya University, 65 Tsurumai, Nagoya 466-8550, Japan
| | - Yasuhiro Funahashi
- Institute for Comprehensive Medical Science, Fujita Health University, Toyoake 470-1192, Japan
| | - Daisuke Tsuboi
- Institute for Comprehensive Medical Science, Fujita Health University, Toyoake 470-1192, Japan
| | - Md Omar Faruk
- Department of Cell Pharmacology, Graduate School of Medicine, Nagoya University, 65 Tsurumai, Nagoya 466-8550, Japan
| | - Yukie Yamahashi
- Institute for Comprehensive Medical Science, Fujita Health University, Toyoake 470-1192, Japan
| | - Kiyofumi Yamada
- Department of Neuropsychopharmacology and Hospital Pharmacy, Graduate School of Medicine, Nagoya University, 65 Tsurumai, Nagoya 466-8550, Japan
| | - Taku Nagai
- Division of Behavioral Neuropharmacology, International Center for Brain Science (ICBS), Fujita Health University, Toyoake 470-1192, Japan
| | - Kozo Kaibuchi
- Department of Cell Pharmacology, Graduate School of Medicine, Nagoya University, 65 Tsurumai, Nagoya 466-8550, Japan
- Institute for Comprehensive Medical Science, Fujita Health University, Toyoake 470-1192, Japan
|
2
|
Pérez-Mejías G, Velázquez-Cruz A, Guerra-Castellano A, Baños-Jaime B, Díaz-Quintana A, González-Arzola K, Ángel De la Rosa M, Díaz-Moreno I. Exploring protein phosphorylation by combining computational approaches and biochemical methods. Comput Struct Biotechnol J 2020; 18:1852-1863. [PMID: 32728408 PMCID: PMC7369424 DOI: 10.1016/j.csbj.2020.06.043] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 03/30/2020] [Revised: 06/29/2020] [Accepted: 06/30/2020] [Indexed: 12/14/2022]
Abstract
Post-translational modifications of proteins expand their functional diversity, regulating the response of cells to a variety of stimuli. Among these modifications, phosphorylation is the most ubiquitous and plays a prominent role in cell signaling. The addition of a phosphate often affects the function of a protein by altering its structure and dynamics. However, these alterations are often difficult to study and the functional and structural implications remain unresolved. New approaches are emerging to overcome common obstacles related to the production and manipulation of these samples. Here, we summarize the available methods for phosphoprotein purification and phosphomimetic engineering, highlighting the advantages and disadvantages of each. We propose a general workflow for protein phosphorylation analysis combining computational and biochemical approaches, building on recent advances that enable user-friendly and easy-to-access Molecular Dynamics simulations. We hope this innovative workflow will inform the best experimental approach to explore such post-translational modifications. We have applied this workflow to two different human protein models: the hemeprotein cytochrome c and the RNA binding protein HuR. Our results illustrate the usefulness of Molecular Dynamics as a decision-making tool to design the most appropriate phosphomimetic variant.
Affiliation(s)
- Gonzalo Pérez-Mejías
- Instituto de Investigaciones Químicas (IIQ), Centro de Investigaciones Científicas Isla de la Cartuja (cicCartuja), Universidad de Sevilla, Consejo Superior de Investigaciones Científicas (CSIC), Avda., Américo Vespucio 49, Sevilla 41092, Spain
| | - Alejandro Velázquez-Cruz
- Instituto de Investigaciones Químicas (IIQ), Centro de Investigaciones Científicas Isla de la Cartuja (cicCartuja), Universidad de Sevilla, Consejo Superior de Investigaciones Científicas (CSIC), Avda., Américo Vespucio 49, Sevilla 41092, Spain
| | - Alejandra Guerra-Castellano
- Instituto de Investigaciones Químicas (IIQ), Centro de Investigaciones Científicas Isla de la Cartuja (cicCartuja), Universidad de Sevilla, Consejo Superior de Investigaciones Científicas (CSIC), Avda., Américo Vespucio 49, Sevilla 41092, Spain
| | - Blanca Baños-Jaime
- Instituto de Investigaciones Químicas (IIQ), Centro de Investigaciones Científicas Isla de la Cartuja (cicCartuja), Universidad de Sevilla, Consejo Superior de Investigaciones Científicas (CSIC), Avda., Américo Vespucio 49, Sevilla 41092, Spain
| | - Antonio Díaz-Quintana
- Instituto de Investigaciones Químicas (IIQ), Centro de Investigaciones Científicas Isla de la Cartuja (cicCartuja), Universidad de Sevilla, Consejo Superior de Investigaciones Científicas (CSIC), Avda., Américo Vespucio 49, Sevilla 41092, Spain
| | - Katiuska González-Arzola
- Instituto de Investigaciones Químicas (IIQ), Centro de Investigaciones Científicas Isla de la Cartuja (cicCartuja), Universidad de Sevilla, Consejo Superior de Investigaciones Científicas (CSIC), Avda., Américo Vespucio 49, Sevilla 41092, Spain
| | - Miguel Ángel De la Rosa
- Instituto de Investigaciones Químicas (IIQ), Centro de Investigaciones Científicas Isla de la Cartuja (cicCartuja), Universidad de Sevilla, Consejo Superior de Investigaciones Científicas (CSIC), Avda., Américo Vespucio 49, Sevilla 41092, Spain
| | - Irene Díaz-Moreno
- Instituto de Investigaciones Químicas (IIQ), Centro de Investigaciones Científicas Isla de la Cartuja (cicCartuja), Universidad de Sevilla, Consejo Superior de Investigaciones Científicas (CSIC), Avda., Américo Vespucio 49, Sevilla 41092, Spain
|
3
|
Huang KY, Lee TY, Kao HJ, Ma CT, Lee CC, Lin TH, Chang WC, Huang HD. dbPTM in 2019: exploring disease association and cross-talk of post-translational modifications. Nucleic Acids Res 2020; 47:D298-D308. [PMID: 30418626 PMCID: PMC6323979 DOI: 10.1093/nar/gky1074] [Citation(s) in RCA: 134] [Impact Index Per Article: 33.5] [Received: 09/15/2018] [Accepted: 10/19/2018] [Indexed: 12/25/2022]
Abstract
The dbPTM (http://dbPTM.mbc.nctu.edu.tw/) has been maintained for over 10 years with the aim to provide functional and structural analyses for post-translational modifications (PTMs). In this update, dbPTM not only integrates more experimentally validated PTMs from available databases and through manual curation of literature but also provides PTM-disease associations based on non-synonymous single nucleotide polymorphisms (nsSNPs). The high-throughput deep sequencing technology has led to a surge in the data generated through analysis of association between SNPs and diseases, both in terms of growth amount and scope. This update thus integrated disease-associated nsSNPs from dbSNP based on genome-wide association studies. The PTM substrate sites located at a specified distance in terms of the amino acids encoded from nsSNPs were deemed to have an association with the involved diseases. In recent years, increasing evidence for crosstalk between PTMs has been reported. Although mass spectrometry-based proteomics has substantially improved our knowledge about substrate site specificity of single PTMs, the fact that the crosstalk of combinatorial PTMs may act in concert with the regulation of protein function and activity is neglected. Because of the relatively limited information about concurrent frequency and functional relevance of PTM crosstalk, in this update, the PTM sites neighboring other PTM sites in a specified window length were subjected to motif discovery and functional enrichment analysis. This update highlights the current challenges in PTM crosstalk investigation and breaks the bottleneck of how proteomics may contribute to understanding PTM codes, revealing the next level of data complexity and proteomic limitation in prospective PTM research.
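The idea of flagging PTM sites that neighbor other PTM sites within a window can be illustrated with a minimal sketch. The abstract does not state the window length or the exact pairing rule, so the `window` value, the requirement that the two PTM types differ, and the sample sites below are all assumptions made for illustration, not dbPTM's actual procedure.

```python
from itertools import combinations
from typing import List, Tuple

Site = Tuple[int, str]  # (residue position, PTM type) on a single protein


def neighboring_ptm_pairs(sites: List[Site], window: int) -> List[Tuple[Site, Site]]:
    """Return pairs of sites carrying different PTM types that lie
    within `window` residues of each other (assumed crosstalk candidates)."""
    pairs = []
    for (pos_a, type_a), (pos_b, type_b) in combinations(sorted(sites), 2):
        if type_a != type_b and abs(pos_a - pos_b) <= window:
            pairs.append(((pos_a, type_a), (pos_b, type_b)))
    return pairs


# Hypothetical sites on one protein; positions and types are made up.
example_sites = [
    (10, "phosphorylation"),
    (14, "acetylation"),
    (40, "ubiquitination"),
    (12, "phosphorylation"),
]
candidate_pairs = neighboring_ptm_pairs(example_sites, window=5)
for pair in candidate_pairs:
    print(pair)
```

With the hypothetical sites above and a window of 5, only the acetylation site at position 14 pairs with the two nearby phosphorylation sites; the distant ubiquitination site is excluded.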
Affiliation(s)
- Kai-Yao Huang
- Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen 518172, China.,School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen 518172, China.,School of Life and Health Science, The Chinese University of Hong Kong, Shenzhen 518172, China
| | - Tzong-Yi Lee
- Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen 518172, China.,School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen 518172, China.,School of Life and Health Science, The Chinese University of Hong Kong, Shenzhen 518172, China
| | - Hui-Ju Kao
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 32003, Taiwan
| | - Chen-Tse Ma
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 32003, Taiwan
| | - Chao-Chun Lee
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 32003, Taiwan
| | - Tsai-Hsuan Lin
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 32003, Taiwan
| | - Wen-Chi Chang
- Institute of Tropical Plant Sciences, College of Biosciences and Biotechnology, National Cheng Kung University, Tainan 70101, Taiwan
| | - Hsien-Da Huang
- Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen 518172, China.,School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen 518172, China.,School of Life and Health Science, The Chinese University of Hong Kong, Shenzhen 518172, China
|
4
|
Abstract
Proteomics and phosphoproteomics have been emerging as new dimensions of omics. Phosphorylation has a profound impact on the biological functions and applications of proteins. It influences everything from intrinsic activity and extrinsic executions to cellular localization. This post-translational modification has been subjected to detailed study and has been an object of analytical curiosity with the advent of faster instrumentation. The major strength of phosphoproteomic research lies in the fact that it gives an overall picture of the workforce of the cell. Phosphoproteomics gives deeper insights into the mechanisms behind the development and progression of a disease. This review for the first time consolidates the list of existing bioinformatics tools developed for phosphoproteomics. The gap between the development of bioinformatics tools and their implementation in clinical research is highlighted. The main obstacle to progress is believed to be the interdisciplinary nature of this field of research. For meaningful solutions and deliverables, these tools need to be implemented in clinical studies to answer pharmacodynamic questions while saving time, costs and energy. This review hopes to provoke some thought in this direction.
|
5
|
Huang KY, Kao HJ, Hsu JBK, Weng SL, Lee TY. Characterization and identification of lysine glutarylation based on intrinsic interdependence between positions in the substrate sites. BMC Bioinformatics 2019; 19:384. [PMID: 30717647 PMCID: PMC7394328 DOI: 10.1186/s12859-018-2394-9] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 05/23/2018] [Accepted: 09/25/2018] [Indexed: 01/06/2023]
Abstract
Background: Glutarylation, the addition of a glutaryl group (five carbons) to a lysine residue of a protein molecule, is an important post-translational modification and plays a regulatory role in a variety of physiological and biological processes. As the number of experimentally identified glutarylated peptides increases, it becomes imperative to investigate substrate motifs to enhance the study of protein glutarylation. We carried out a bioinformatics investigation of glutarylation sites based on amino acid composition using a public database containing information on 430 non-homologous glutarylation sites. Results: The TwoSampleLogo analysis indicates that positively charged and polar amino acids surrounding glutarylated sites may be associated with the specificity in substrate site of protein glutarylation. Additionally, the chi-squared test was utilized to explore the intrinsic interdependence between two positions around glutarylation sites. Further, maximal dependence decomposition (MDD), which consists of partitioning a large-scale dataset into subgroups with statistically significant amino acid conservation, was used to capture motif signatures of glutarylation sites. We considered single features, such as amino acid composition (AAC), amino acid pair composition (AAPC), and composition of k-spaced amino acid pairs (CKSAAP), as well as the effectiveness of incorporating MDD-identified substrate motifs into an integrated prediction model. Evaluation by five-fold cross-validation showed that AAC was most effective in discriminating between glutarylation and non-glutarylation sites, according to support vector machine (SVM). Conclusions: The SVM model integrating MDD-identified substrate motifs performed well, with a sensitivity of 0.677, a specificity of 0.619, an accuracy of 0.638, and a Matthews Correlation Coefficient (MCC) value of 0.28. Using an independent testing dataset (46 glutarylated and 92 non-glutarylated sites) obtained from the literature, we demonstrated that the integrated SVM model could improve the predictive performance effectively, yielding a balanced sensitivity and specificity of 0.652 and 0.739, respectively. This integrated SVM model has been implemented as a web-based system (MDDGlutar), which is now freely available at http://csb.cse.yzu.edu.tw/MDDGlutar/.
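The performance measures quoted in this abstract (sensitivity, specificity, accuracy, MCC) all derive from a 2×2 confusion matrix. The sketch below shows the standard formulas. The counts are not reported in the abstract; they are back-calculated here as an assumption from the independent test set size (46 positives, 92 negatives) and the stated sensitivity and specificity, so the resulting accuracy and MCC are illustrative only.

```python
import math


def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Assumed counts: 30/46 is about 0.652 and 68/92 is about 0.739,
# matching the reported values; the abstract itself gives no counts.
metrics = binary_metrics(tp=30, fn=16, tn=68, fp=24)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because MCC uses all four cells of the matrix, it stays informative when the classes are imbalanced, which is the usual reason it is reported alongside accuracy.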
Affiliation(s)
- Kai-Yao Huang
- School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, 518172, China.,Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen, 518172, China
| | - Hui-Ju Kao
- Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen, 518172, China.,Department of Computer Science and Engineering, Yuan Ze University, Taoyuan city, 320, Taiwan
| | - Justin Bo-Kai Hsu
- Department of Medical Research, Taipei Medical University Hospital, Taipei city, 110, Taiwan
| | - Shun-Long Weng
- Department of Medicine, Mackay Medical College, New Taipei City, 252, Taiwan.,Mackay Medicine, Nursing and Management College, Taipei, 112, Taiwan.,Department of Obstetrics and Gynecology, Hsinchu Mackay Memorial Hospital, Hsin-Chu, 300, Taiwan
| | - Tzong-Yi Lee
- School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, 518172, China.,Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen, 518172, China
|
6
|
Zhou X, Qiu YH, He P, Jiang F, Wu LF, Lu X, Lei SF, Deng FY. Why SNP rs227584 is associated with human BMD and fracture risk? A molecular and cellular study in bone cells. J Cell Mol Med 2018; 23:898-907. [PMID: 30370607 PMCID: PMC6349212 DOI: 10.1111/jcmm.13991] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 07/07/2018] [Revised: 09/03/2018] [Accepted: 09/29/2018] [Indexed: 11/28/2022]
Abstract
A large number of SNPs significant for osteoporosis (OP) have been identified by genome-wide association studies. However, the underlying association mechanisms remain largely unknown. From the perspective of protein phosphorylation, gene expression regulation, and bone cell activity, this study aims to illustrate association mechanisms for representative SNPs of interest. We utilized public databases and bioinformatics tools to identify OP-associated SNPs that potentially influence protein phosphorylation (phosSNPs). Associations with hip/spine BMD, as well as fracture risk, in human populations for one significant phosSNP, rs227584 (major/minor allele: C/A, EAS population), located in the C17orf53 gene, were suggested in prior meta-analyses. Specifically, carriers of allele C had significantly higher BMD and a lower risk of low-trauma fractures than carriers of allele A. We tested the molecular and cellular functions of rs227584 in bone through osteoblastic cell culture and multiple assays. We identified five phosSNPs significant for OP (P < 0.01). Osteoblastic cells transfected with wild-type C17orf53 (allele C at rs227584, P126) demonstrated specific interaction with NEK2 kinase, significantly increased expression levels of osteoblastic genes (OPN, OCN, COL1A1, P < 0.05), and promoted osteoblast growth and ALP activity, in contrast to those transfected with mutant C17orf53 (allele A at rs227584, T126). In light of the consistent evidence from the present functional study in human bone cells and the prior association studies in human populations, we conclude that SNP rs227584, by altering protein-kinase interaction, regulates osteoblastic gene expression and influences osteoblast growth and activity, thereby affecting BMD and fracture risk in humans.
Affiliation(s)
- Xu Zhou
- Center for Genetic Epidemiology and Genomics, School of Public Health, Soochow University, Suzhou, Jiangsu, China.,Jiangsu Key Laboratory of Preventive and Translational Medicine for Geriatric Diseases, School of Public Health, Soochow University, Suzhou, Jiangsu, China
| | - Ying-Hua Qiu
- Center for Genetic Epidemiology and Genomics, School of Public Health, Soochow University, Suzhou, Jiangsu, China.,Jiangsu Key Laboratory of Preventive and Translational Medicine for Geriatric Diseases, School of Public Health, Soochow University, Suzhou, Jiangsu, China
| | - Pei He
- Center for Genetic Epidemiology and Genomics, School of Public Health, Soochow University, Suzhou, Jiangsu, China.,Jiangsu Key Laboratory of Preventive and Translational Medicine for Geriatric Diseases, School of Public Health, Soochow University, Suzhou, Jiangsu, China
| | - Fei Jiang
- Center for Genetic Epidemiology and Genomics, School of Public Health, Soochow University, Suzhou, Jiangsu, China.,Jiangsu Key Laboratory of Preventive and Translational Medicine for Geriatric Diseases, School of Public Health, Soochow University, Suzhou, Jiangsu, China
| | - Long-Fei Wu
- Center for Genetic Epidemiology and Genomics, School of Public Health, Soochow University, Suzhou, Jiangsu, China.,Jiangsu Key Laboratory of Preventive and Translational Medicine for Geriatric Diseases, School of Public Health, Soochow University, Suzhou, Jiangsu, China
| | - Xin Lu
- Center for Genetic Epidemiology and Genomics, School of Public Health, Soochow University, Suzhou, Jiangsu, China.,Jiangsu Key Laboratory of Preventive and Translational Medicine for Geriatric Diseases, School of Public Health, Soochow University, Suzhou, Jiangsu, China
| | - Shu-Feng Lei
- Center for Genetic Epidemiology and Genomics, School of Public Health, Soochow University, Suzhou, Jiangsu, China.,Jiangsu Key Laboratory of Preventive and Translational Medicine for Geriatric Diseases, School of Public Health, Soochow University, Suzhou, Jiangsu, China
| | - Fei-Yan Deng
- Center for Genetic Epidemiology and Genomics, School of Public Health, Soochow University, Suzhou, Jiangsu, China.,Jiangsu Key Laboratory of Preventive and Translational Medicine for Geriatric Diseases, School of Public Health, Soochow University, Suzhou, Jiangsu, China
|
7
|
Weng SL, Kao HJ, Huang CH, Lee TY. MDD-Palm: Identification of protein S-palmitoylation sites with substrate motifs based on maximal dependence decomposition. PLoS One 2017; 12:e0179529. [PMID: 28662047 PMCID: PMC5491019 DOI: 10.1371/journal.pone.0179529] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Received: 11/04/2016] [Accepted: 05/31/2017] [Indexed: 12/14/2022]
Abstract
S-palmitoylation, the covalent attachment of 16-carbon palmitic acids to a cysteine residue via a thioester linkage, is an important reversible lipid modification that plays a regulatory role in a variety of physiological and biological processes. As the number of experimentally identified S-palmitoylated peptides increases, it is imperative to investigate substrate motifs to facilitate the study of protein S-palmitoylation. Based on 710 non-homologous S-palmitoylation sites obtained from published databases and the literature, we carried out a bioinformatics investigation of S-palmitoylation sites based on amino acid composition. Two Sample Logo indicates that positively charged and polar amino acids surrounding S-palmitoylated sites may be associated with the substrate site specificity of protein S-palmitoylation. Additionally, maximal dependence decomposition (MDD) was applied to explore the motif signatures of S-palmitoylation sites by categorizing a large-scale dataset into subgroups with statistically significant conservation of amino acids. Single features such as amino acid composition (AAC), amino acid pair composition (AAPC), position specific scoring matrix (PSSM), position weight matrix (PWM), amino acid substitution matrix (BLOSUM62), and accessible surface area (ASA) were considered, along with the effectiveness of incorporating MDD-identified substrate motifs into a two-layered prediction model. Evaluation by five-fold cross-validation showed that a hybrid of AAC and PSSM performs best at discriminating between S-palmitoylation and non-S-palmitoylation sites, according to the support vector machine (SVM). The two-layered SVM model integrating MDD-identified substrate motifs performed well, with a sensitivity of 0.79, specificity of 0.80, accuracy of 0.80, and Matthews Correlation Coefficient (MCC) value of 0.45. 
Using an independent testing dataset (613 S-palmitoylated and 5412 non-S-palmitoylated sites) obtained from the literature, we demonstrated that the two-layered SVM model could outperform other prediction tools, yielding a balanced sensitivity and specificity of 0.690 and 0.694, respectively. This two-layered SVM model has been implemented as a web-based system (MDD-Palm), which is now freely available at http://csb.cse.yzu.edu.tw/MDDPalm/.
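Amino acid composition (AAC), one of the single features named in this abstract, is commonly computed as the relative frequency of each of the 20 standard residues in a sequence window around the candidate site. The sketch below follows that common definition. The window string, the handling of non-standard characters, and the function name are assumptions for illustration; the encoding actually used by MDD-Palm may differ.

```python
from collections import Counter
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, fixed order


def aac_features(window: str) -> List[float]:
    """Return a 20-dimensional amino acid composition vector.

    Characters outside the 20 standard residues (e.g. '-' padding at
    sequence termini) are ignored, which is an assumption of this sketch.
    """
    residues = [r for r in window.upper() if r in AMINO_ACIDS]
    if not residues:
        return [0.0] * len(AMINO_ACIDS)
    counts = Counter(residues)
    total = len(residues)
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical window centered on a cysteine; not taken from the paper.
features = aac_features("AACKK")
print(dict(zip(AMINO_ACIDS, features)))
```

A vector like this can be passed directly to a classifier such as an SVM, one row per candidate site.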
Affiliation(s)
- Shun-Long Weng
- Department of Medicine, Mackay Medical College, New Taipei City, Taiwan
- Department of Obstetrics and Gynecology, Hsinchu Mackay Memorial Hospital, Hsinchu city, Taiwan
- Mackay Junior College of Medicine, Nursing and Management, Taipei, Taiwan
| | - Hui-Ju Kao
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan, Taiwan
| | - Chien-Hsun Huang
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan, Taiwan
- Tao-Yuan Hospital, Ministry of Health & Welfare, Taoyuan, Taiwan
| | - Tzong-Yi Lee
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan, Taiwan
- Innovation Center for Big Data and Digital Convergence, Yuan Ze University, Taoyuan, Taiwan
|
8
|
Niu T, Liu N, Yu X, Zhao M, Choi HJ, Leo PJ, Brown MA, Zhang L, Pei YF, Shen H, He H, Fu X, Lu S, Chen XD, Tan LJ, Yang TL, Guo Y, Cho NH, Shen J, Guo YF, Nicholson GC, Prince RL, Eisman JA, Jones G, Sambrook PN, Tian Q, Zhu XZ, Papasian CJ, Duncan EL, Uitterlinden AG, Shin CS, Xiang S, Deng HW. Identification of IDUA and WNT16 Phosphorylation-Related Non-Synonymous Polymorphisms for Bone Mineral Density in Meta-Analyses of Genome-Wide Association Studies. J Bone Miner Res 2016; 31:358-68. [PMID: 26256109 PMCID: PMC5362379 DOI: 10.1002/jbmr.2687] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 01/23/2015] [Revised: 07/29/2015] [Accepted: 08/06/2015] [Indexed: 11/06/2022]
Abstract
Protein phosphorylation regulates a wide variety of cellular processes. Thus, we hypothesize that single-nucleotide polymorphisms (SNPs) that may modulate protein phosphorylation could affect osteoporosis risk. Based on a previous conventional genome-wide association (GWA) study, we conducted a three-stage meta-analysis targeting phosphorylation-related SNPs (phosSNPs) for femoral neck (FN)-bone mineral density (BMD), total hip (HIP)-BMD, and lumbar spine (LS)-BMD phenotypes. In stage 1, 9593 phosSNPs were meta-analyzed in 11,140 individuals of various ancestries. Genome-wide significance (GWS) and suggestive significance were defined by α = 5.21 × 10^-6 (0.05/9593) and 1.00 × 10^-4, respectively. In stage 2, nine stage 1-discovered phosSNPs (based on α = 1.00 × 10^-4) were in silico meta-analyzed in Dutch, Korean, and Australian cohorts. In stage 3, four phosSNPs that replicated in stage 2 (based on α = 5.56 × 10^-3, 0.05/9) were de novo genotyped in two independent cohorts. IDUA rs3755955 and rs6831280, and WNT16 rs2707466 were associated with BMD phenotypes in each respective stage, and in three stages combined, achieving GWS for both FN-BMD (p = 8.36 × 10^-10, p = 5.26 × 10^-10, and p = 3.01 × 10^-10, respectively) and HIP-BMD (p = 3.26 × 10^-6, p = 1.97 × 10^-6, and p = 1.63 × 10^-12, respectively). Although in vitro studies demonstrated no differences in expression of wild-type and mutant forms of IDUA and WNT16B proteins, in silico analyses predict that WNT16 rs2707466 directly abolishes a phosphorylation site, which could cause a deleterious effect on WNT16 protein, and that IDUA phosSNPs rs3755955 and rs6831280 could exert indirect effects on nearby phosphorylation sites. Further studies will be required to determine the detailed and specific molecular effects of these BMD-associated non-synonymous variants.
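The significance thresholds in this abstract are Bonferroni corrections: the family-wise α of 0.05 divided by the number of tests in each stage. The arithmetic can be checked with a short sketch; the function and variable names are illustrative and not taken from the paper.

```python
def bonferroni_alpha(family_alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be a positive integer")
    return family_alpha / n_tests


stage1_alpha = bonferroni_alpha(0.05, 9593)  # 9593 phosSNPs tested in stage 1
stage2_alpha = bonferroni_alpha(0.05, 9)     # nine phosSNPs carried into stage 2

print(f"stage 1 threshold: {stage1_alpha:.2e}")  # 5.21e-06
print(f"stage 2 threshold: {stage2_alpha:.2e}")  # 5.56e-03

# The combined HIP-BMD p-values quoted in the abstract, checked against
# the stage 1 genome-wide threshold.
hip_bmd_p_values = {
    "rs3755955": 3.26e-6,
    "rs6831280": 1.97e-6,
    "rs2707466": 1.63e-12,
}
for snp, p in hip_bmd_p_values.items():
    print(snp, p < stage1_alpha)
```

All three HIP-BMD p-values fall below the stage 1 threshold, consistent with the abstract's statement that they reach genome-wide significance.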
Affiliation(s)
- Tianhua Niu
- Department of Biostatistics and Bioinformation, School of Public Health and Tropical Medicine, Tulane University, New Orleans, LA, USA
| | - Ning Liu
- College of Life Science, Hunan Normal University, Changsha, P.R. China
| | - Xun Yu
- College of Life Science, Hunan Normal University, Changsha, P.R. China
| | - Ming Zhao
- Department of Biostatistics and Bioinformation, School of Public Health and Tropical Medicine, Tulane University, New Orleans, LA, USA
| | - Hyung Jin Choi
- Department of Internal Medicine, College of Medicine, Seoul National University, Seoul, Korea.,Department of Internal Medicine, Chungbuk National University Hospital, Cheongju, Korea
| | - Paul J Leo
- University of Queensland Diamantina Institute, Translational Research Institute, Princess Alexandra Hospital, Brisbane, Australia
- Matthew A Brown
- University of Queensland Diamantina Institute, Translational Research Institute, Princess Alexandra Hospital, Brisbane, Australia
- Lei Zhang
- Department of Biostatistics and Bioinformation, School of Public Health and Tropical Medicine, Tulane University, New Orleans, LA, USA; Center of System Biomedical Sciences, University of Shanghai for Science and Technology, Shanghai, P.R. China
- Yu-Fang Pei
- Department of Biostatistics and Bioinformation, School of Public Health and Tropical Medicine, Tulane University, New Orleans, LA, USA
- Hui Shen
- Department of Biostatistics and Bioinformation, School of Public Health and Tropical Medicine, Tulane University, New Orleans, LA, USA
- Hao He
- Department of Biostatistics and Bioinformation, School of Public Health and Tropical Medicine, Tulane University, New Orleans, LA, USA
- Xiaoying Fu
- Department of Biostatistics and Bioinformation, School of Public Health and Tropical Medicine, Tulane University, New Orleans, LA, USA
- Shan Lu
- College of Life Science, Hunan Normal University, Changsha, P.R. China
- Xiang-Ding Chen
- College of Life Science, Hunan Normal University, Changsha, P.R. China
- Li-Jun Tan
- College of Life Science, Hunan Normal University, Changsha, P.R. China
- Tie-Lin Yang
- School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, P.R. China
- Yan Guo
- School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, P.R. China
- Nam H Cho
- Department of Preventive Medicine, Ajou University School of Medicine, Youngtong-Gu, Korea
- Jie Shen
- Third Affiliated Hospital of Southern Medical University, Guangzhou, P.R. China
- Yan-Fang Guo
- Third Affiliated Hospital of Southern Medical University, Guangzhou, P.R. China
- Richard L Prince
- School of Medicine and Pharmacology, University of Western Australia, Perth, Australia; Department of Endocrinology and Diabetes, Sir Charles Gairdner Hospital, Perth, Australia
- John A Eisman
- Garvan Institute of Medical Research, University of New South Wales, Sydney, Australia
- Graeme Jones
- Menzies Institute for Medical Research, University of Tasmania, Hobart, Australia
- Philip N Sambrook
- Kolling Institute of Medical Research, Royal North Shore Hospital, University of Sydney, Sydney, Australia
- Qing Tian
- Department of Biostatistics and Bioinformation, School of Public Health and Tropical Medicine, Tulane University, New Orleans, LA, USA
- Xue-Zhen Zhu
- School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, P.R. China
- Christopher J Papasian
- Department of Basic Medical Science, University of Missouri-Kansas City, Kansas City, MO, USA
- Emma L Duncan
- University of Queensland Diamantina Institute, Translational Research Institute, Princess Alexandra Hospital, Brisbane, Australia; Department of Endocrinology, Royal Brisbane and Women's Hospital, Brisbane, Australia
- André G Uitterlinden
- Department of Internal Medicine, Erasmus Medical Center, Rotterdam, The Netherlands; Department of Epidemiology, Erasmus Medical Center, Rotterdam, The Netherlands; Netherlands Genomics Initiative (NGI)-sponsored Netherlands Consortium for Healthy Aging (NCHA), Leiden, The Netherlands
- Chan Soo Shin
- Department of Internal Medicine, College of Medicine, Seoul National University, Seoul, Korea
- Shuanglin Xiang
- College of Life Science, Hunan Normal University, Changsha, P.R. China
- Hong-Wen Deng
- Department of Biostatistics and Bioinformation, School of Public Health and Tropical Medicine, Tulane University, New Orleans, LA, USA; College of Life Science, Hunan Normal University, Changsha, P.R. China
|
9
|
Huang CH, Su MG, Kao HJ, Jhong JH, Weng SL, Lee TY. UbiSite: incorporating two-layered machine learning method with substrate motifs to predict ubiquitin-conjugation site on lysines. BMC Systems Biology 2016; 10 Suppl 1:6. [PMID: 26818456 PMCID: PMC4895383 DOI: 10.1186/s12918-015-0246-z] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.9] [Indexed: 01/20/2023]
Abstract
Background: The conjugation of ubiquitin to a substrate protein (protein ubiquitylation), which proceeds through sequential E1 activation, E2 conjugation and E3 ligation, is crucial to the regulation of protein function and activity in eukaryotes. This process typically links the last amino acid of ubiquitin (glycine 76) to a lysine residue of a target protein. High-throughput mass spectrometry-based proteomics has stimulated large-scale identification of ubiquitin-conjugated peptides. Hence, a new web resource, UbiSite, was developed to identify ubiquitin-conjugation sites on lysines based on a large-scale proteome dataset.
Results: Given a total of 37,647 ubiquitin-conjugated proteins, including 128,026 ubiquitylated peptides, obtained from various resources, this study carries out a large-scale investigation of ubiquitin-conjugation sites based on sequence and structural characteristics. A TwoSampleLogo reveals that a significant depletion of histidine (H), arginine (R) and cysteine (C) residues around ubiquitylation sites may impact the conjugation of ubiquitin in closed three-dimensional environments. Based on the large-scale ubiquitylation dataset, a motif discovery tool, MDDLogo, was adopted to characterize the potential substrate motifs for ubiquitin conjugation. The study considers single features such as amino acid composition (AAC), positional weighted matrix (PWM), position-specific scoring matrix (PSSM) and solvent-accessible surface area (SASA), and also evaluates the effectiveness of incorporating MDDLogo-identified substrate motifs into a two-layered prediction model. Evaluation by five-fold cross-validation showed that, with a support vector machine (SVM), PSSM is the best single feature for discriminating between ubiquitylation and non-ubiquitylation sites. Additionally, the two-layered SVM model integrating MDDLogo-identified substrate motifs obtained an accuracy of 81.06% and a Matthews correlation coefficient (MCC) of 0.586. Furthermore, independent testing showed that the two-layered SVM model could outperform other prediction tools, reaching 85.10% sensitivity, 69.69% specificity, 73.69% accuracy and an MCC of 0.483.
Conclusion: The independent testing result indicated the effectiveness of incorporating MDDLogo-identified motifs into the prediction of ubiquitylation sites. To assist researchers interested in large-scale ubiquitinome data, the two-layered SVM model has been implemented as a web-based system (UbiSite), which is freely available at http://csb.cse.yzu.edu.tw/UbiSite/. Two cases given in UbiSite demonstrate effective identification of ubiquitylation sites with reference to substrate motifs.
Electronic supplementary material: The online version of this article (doi:10.1186/s12918-015-0246-z) contains supplementary material, which is available to authorized users.
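The sensitivity, specificity, accuracy and MCC figures quoted in abstracts such as the one above are all derived from the four cells of a binary confusion matrix. The following is a minimal sketch of those standard definitions; the function name `binary_metrics` and the example counts are illustrative assumptions and are not taken from the UbiSite paper.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                 # true positive rate
    specificity = tn / (tn + fp)                 # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # Matthews correlation coefficient; defined as 0 when the denominator vanishes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts chosen only to exercise the formulas.
example = binary_metrics(tp=90, fp=30, tn=70, fn=10)
```

Because MCC uses all four cells, it can be much lower than accuracy on imbalanced test sets, which is one reason such abstracts tend to report both.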
Affiliation(s)
- Chien-Hsun Huang
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan, 320, Taiwan; Ministry of Health & Welfare, Tao-Yuan Hospital, Taoyuan, 320, Taiwan.
- Min-Gang Su
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan, 320, Taiwan.
- Hui-Ju Kao
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan, 320, Taiwan.
- Jhih-Hua Jhong
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan, 320, Taiwan.
- Shun-Long Weng
- Department of Obstetrics and Gynecology, Hsinchu Mackay Memorial Hospital, Hsin-Chu, 300, Taiwan; Mackay Junior College of Medicine, Nursing and Management, Taipei, 112, Taiwan; Department of Medicine, Mackay Medical College, New Taipei City, 252, Taiwan.
- Tzong-Yi Lee
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan, 320, Taiwan; Innovation Center for Big Data and Digital Convergence, Yuan Ze University, Taoyuan, 320, Taiwan.
|
10
|
Kao HJ, Huang CH, Bretaña NA, Lu CT, Huang KY, Weng SL, Lee TY. A two-layered machine learning method to identify protein O-GlcNAcylation sites with O-GlcNAc transferase substrate motifs. BMC Bioinformatics 2015; 16 Suppl 18:S10. [PMID: 26680539 PMCID: PMC4682369 DOI: 10.1186/1471-2105-16-s18-s10] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.9] [Indexed: 12/21/2022]
Abstract
Protein O-GlcNAcylation, involving the β-attachment of a single N-acetylglucosamine (GlcNAc) to the hydroxyl group of serine or threonine residues, is an O-linked glycosylation catalyzed by O-GlcNAc transferase (OGT). Molecular-level investigation of the basis for OGT's substrate specificity should aid in understanding how O-GlcNAc contributes to diverse cellular processes. Owing to an increasing number of O-GlcNAcylated peptides with site-specific information identified by mass spectrometry (MS)-based proteomics, we were motivated to characterize substrate site motifs of O-GlcNAc transferases. In this investigation, a non-redundant dataset of 410 experimentally verified O-GlcNAcylation sites was manually extracted from dbOGAP, OGlycBase and UniProtKB. After detection of conserved motifs using maximal dependence decomposition, a profile hidden Markov model (profile HMM) was adopted to learn a first-layer model for each identified OGT substrate motif. A support vector machine (SVM) was then used to generate a second-layer model learned from the output values of the first-layer profile HMMs. The two-layered predictive model was evaluated using five-fold cross-validation, which yielded a sensitivity of 85.4%, a specificity of 84.1%, and an accuracy of 84.7%. Additionally, an independent testing set from PhosphoSitePlus, which was non-homologous to the training data of the predictive model, was used to demonstrate that the proposed method could provide a promising accuracy (84.05%) and outperform other O-GlcNAcylation site prediction tools. A case study indicated that the proposed method could be a feasible means of conducting preliminary analyses of protein O-GlcNAcylation. The method has been implemented as a web-based system, OGTSite, which is now freely available at http://csb.cse.yzu.edu.tw/OGTSite/.
|
11
|
Huang KY, Su MG, Kao HJ, Hsieh YC, Jhong JH, Cheng KH, Huang HD, Lee TY. dbPTM 2016: 10-year anniversary of a resource for post-translational modification of proteins. Nucleic Acids Res 2015; 44:D435-46. [PMID: 26578568 PMCID: PMC4702878 DOI: 10.1093/nar/gkv1240] [Citation(s) in RCA: 136] [Impact Index Per Article: 15.1] [Received: 09/16/2015] [Accepted: 11/02/2015] [Indexed: 01/23/2023]
Abstract
Owing to the importance of the post-translational modifications (PTMs) of proteins in regulating biological processes, the dbPTM (http://dbPTM.mbc.nctu.edu.tw/) was developed as a comprehensive database of experimentally verified PTMs from several databases with annotations of potential PTMs for all UniProtKB protein entries. For this 10th anniversary of dbPTM, the updated resource provides not only a comprehensive dataset of experimentally verified PTMs, supported by the literature, but also an integrative interface for accessing all available databases and tools that are associated with PTM analysis. As well as collecting experimental PTM data from 14 public databases, this update manually curates over 12 000 modified peptides, including the emerging S-nitrosylation, S-glutathionylation and succinylation, from approximately 500 research articles, which were retrieved by text mining. As the number of available PTM prediction methods increases, this work compiles a non-homologous benchmark dataset to evaluate the predictive power of online PTM prediction tools. An increasing interest in the structural investigation of PTM substrate sites motivated the mapping of all experimental PTM peptides to protein entries of Protein Data Bank (PDB) based on database identifier and sequence identity, which enables users to examine spatially neighboring amino acids, solvent-accessible surface area and side-chain orientations for PTM substrate sites on tertiary structures. Since drug binding in PDB is annotated, this update identified over 1100 PTM sites that are associated with drug binding. The update also integrates metabolic pathways and protein-protein interactions to support the PTM network analysis for a group of proteins. Finally, the web interface is redesigned and enhanced to facilitate access to this resource.
Affiliation(s)
- Kai-Yao Huang
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 320, Taiwan
- Min-Gang Su
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 320, Taiwan
- Hui-Ju Kao
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 320, Taiwan
- Yun-Chung Hsieh
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 320, Taiwan
- Jhih-Hua Jhong
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 320, Taiwan
- Kuang-Hao Cheng
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 320, Taiwan
- Hsien-Da Huang
- Department of Biological Science and Technology, National Chiao Tung University, Hsinchu 300, Taiwan; Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu 300, Taiwan
- Tzong-Yi Lee
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 320, Taiwan; Innovation Center for Big Data and Digital Convergence, Yuan Ze University, Taoyuan 320, Taiwan
|
12
|
Bui VM, Lu CT, Ho TT, Lee TY. MDD-SOH: exploiting maximal dependence decomposition to identify S-sulfenylation sites with substrate motifs. Bioinformatics 2015; 32:165-72. [PMID: 26411868 DOI: 10.1093/bioinformatics/btv558] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 06/22/2015] [Accepted: 09/18/2015] [Indexed: 01/12/2023]
Abstract
S-sulfenylation (S-sulphenylation, or sulfenic acid), the covalent attachment of S-hydroxyl (-SOH) to cysteine thiol, plays a significant role in redox regulation of protein functions. Although sulfenic acid is transient and labile, most of its physiological activities occur under control of S-hydroxylation. Therefore, discriminating the substrate sites of S-sulfenylated proteins is an essential task in computational biology for furthering the understanding of protein structures and functions. Research into S-sulfenylated proteins is currently very limited, and no dedicated tools are available for the computational identification of SOH sites. Given a total of 1096 experimentally verified S-sulfenylated proteins from humans, this study carries out a bioinformatics investigation of SOH sites based on amino acid composition and solvent-accessible surface area. A TwoSampleLogo indicates that the positively and negatively charged amino acids flanking the SOH sites may impact the formation of S-sulfenylation in closed three-dimensional environments. In addition, the substrate motifs of SOH sites are studied using maximal dependence decomposition (MDD). Treating the task as binary classification between SOH and non-SOH sites, a support vector machine (SVM) is applied to learn the predictive model from MDD-identified substrate motifs. According to 5-fold cross-validation, the integrated SVM model learned from substrate motifs yields an average accuracy of 0.87, significantly improving the prediction of SOH sites. Furthermore, the integrated SVM model also effectively improves the predictive performance on an independent testing set. Finally, the integrated SVM model is used to implement a web resource, named MDD-SOH, to identify SOH sites with their corresponding substrate motifs.
Availability and implementation: MDD-SOH is freely available to all interested users at http://csb.cse.yzu.edu.tw/MDDSOH/. All data sets used in this work are also available for download from the website.
Supplementary information: Supplementary data are available at Bioinformatics online.
Contact: francis@saturn.yzu.edu.tw.
Affiliation(s)
- Van-Minh Bui
- Department of Computer Science and Engineering and
- Thi-Trang Ho
- Department of Computer Science and Engineering and
- Tzong-Yi Lee
- Department of Computer Science and Engineering and Innovation Center for Big Data and Digital Convergence, Yuan Ze University, Taoyuan 320, Taiwan
|
13
|
Chen YJ, Lu CT, Huang KY, Wu HY, Chen YJ, Lee TY. GSHSite: exploiting an iteratively statistical method to identify s-glutathionylation sites with substrate specificity. PLoS One 2015; 10:e0118752. [PMID: 25849935 PMCID: PMC4388702 DOI: 10.1371/journal.pone.0118752] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Received: 05/12/2014] [Accepted: 01/06/2015] [Indexed: 01/13/2023]
Abstract
S-glutathionylation, the covalent attachment of a glutathione (GSH) to the sulfur atom of cysteine, is a selective and reversible protein post-translational modification (PTM) that regulates protein activity, localization, and stability. Despite its implication in the regulation of protein functions and cell signaling, the substrate specificity of cysteine S-glutathionylation remains unknown. Based on a total of 1783 experimentally identified S-glutathionylation sites from mouse macrophages, this work presents an informatics investigation of S-glutathionylation sites, including structural factors such as the composition of flanking amino acids and the accessible surface area (ASA). TwoSampleLogo shows that positively charged amino acids flanking the S-glutathionylated cysteine may influence the formation of S-glutathionylation in a closed three-dimensional environment. A statistical method is further applied to iteratively detect conserved substrate motifs with statistical significance. A support vector machine (SVM) is then applied to generate a predictive model that considers the substrate motifs. According to five-fold cross-validation, the SVMs trained with substrate motifs achieve enhanced sensitivity, specificity, and accuracy, and provide promising performance on an independent test set. The effectiveness of the proposed method is demonstrated by the correct identification of previously reported S-glutathionylation sites of mouse thioredoxin (TXN) and human protein tyrosine phosphatase 1b (PTP1B). Finally, the constructed models are used to implement a web-based tool, named GSHSite (http://csb.cse.yzu.edu.tw/GSHSite/), for identifying uncharacterized GSH substrate sites on protein sequences.
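Site-prediction studies like the one above typically begin by cutting a fixed-length window of flanking residues around every candidate residue (here, cysteine). The sketch below illustrates that preprocessing step only; the function name `flanking_windows`, the window size and the `-` padding character are assumptions made for illustration, not details reported for GSHSite.

```python
def flanking_windows(sequence, residue="C", flank=3, pad="-"):
    """Return (1-based position, window) pairs centred on each target residue.

    Windows that run past either end of the sequence are padded so that every
    window has the same length, 2 * flank + 1.
    """
    padded = pad * flank + sequence + pad * flank
    windows = []
    for i, aa in enumerate(sequence):
        if aa == residue:
            # In the padded string the centre sits at i + flank, so the
            # window spans padded[i : i + 2 * flank + 1].
            windows.append((i + 1, padded[i:i + 2 * flank + 1]))
    return windows


# Toy sequence (hypothetical) with cysteines at positions 2 and 5.
toy = flanking_windows("MCAKCG", flank=2)
```

Each window can then be encoded (for example by amino acid composition) and labelled as modified or unmodified before training a classifier.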
Affiliation(s)
- Yi-Ju Chen
- Institute of Chemistry, Academia Sinica, Taipei, Taiwan
- Cheng-Tsung Lu
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan, Taiwan
- Kai-Yao Huang
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan, Taiwan
- Hsin-Yi Wu
- Institute of Chemistry, Academia Sinica, Taipei, Taiwan
- Yu-Ju Chen
- Institute of Chemistry, Academia Sinica, Taipei, Taiwan
- Tzong-Yi Lee
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan, Taiwan; Innovation Center for Big Data and Digital Convergence, Yuan Ze University, Taoyuan, Taiwan
|
14
|
Huang SY, Shi SP, Qiu JD, Liu MC. Using support vector machines to identify protein phosphorylation sites in viruses. J Mol Graph Model 2015; 56:84-90. [DOI: 10.1016/j.jmgm.2014.12.005] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 04/18/2014] [Revised: 12/13/2014] [Accepted: 12/16/2014] [Indexed: 10/24/2022]
|
15
|
Ding TB, Zhong R, Jiang XZ, Liao CY, Xia WK, Liu B, Dou W, Wang JJ. Molecular characterisation of a sodium channel gene and identification of a Phe1538 to Ile mutation in citrus red mite, Panonychus citri. Pest Management Science 2015; 71:266-277. [PMID: 24753229 DOI: 10.1002/ps.3802] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 12/16/2013] [Revised: 03/02/2014] [Accepted: 04/12/2014] [Indexed: 06/03/2023]
Abstract
Background: The citrus red mite, Panonychus citri (McGregor), is regarded as one of the most serious citrus pests in many countries and has developed high resistance to pyrethroids as a result of the intensive use of these acaricides.
Results: The para sodium channel gene of P. citri (named PcNav), containing an entire coding region of 6729 bp, was cloned in this study. Three alternative splicing sites and 12 potential RNA editing sites were identified in PcNav. Thus, exons alt 1 and alt 3-v3 were found to be unique to PcNav. Comparison of field fenpropathrin-resistant (WZ) and susceptible (LS) strains identified the point mutation F1538I in IIIS6 of the sodium channel, which is known to confer strong resistance to pyrethroids in mites. Moreover, it was also found that the PcNav mRNA was present during all life stages, and the transcript seems to be more abundant in larvae than in other developmental stages.
Conclusion: These results suggest that the F1538I mutation plays an important role in fenpropathrin resistance in citrus red mites. This is the first study of the sodium channel in P. citri and provides abundant information for further research on the mechanism of pyrethroid resistance.
Affiliation(s)
- Tian-Bo Ding
- Key Laboratory of Entomology and Pest Control Engineering, College of Plant Protection, Southwest University, Chongqing, 400716, China
|
16
|
Wu HY, Lu CT, Kao HJ, Chen YJ, Chen YJ, Lee TY. Characterization and identification of protein O-GlcNAcylation sites with substrate specificity. BMC Bioinformatics 2014; 15 Suppl 16:S1. [PMID: 25521204 PMCID: PMC4290634 DOI: 10.1186/1471-2105-15-s16-s1] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Indexed: 01/23/2023]
Abstract
Background: Protein O-GlcNAcylation involves the attachment of a single N-acetylglucosamine (GlcNAc) to the hydroxyl group of serine or threonine residues. Elucidation of O-GlcNAcylation sites on proteins is required in order to decipher its crucial roles in regulating cellular processes and to aid in drug design. With an increasing number of O-GlcNAcylation sites identified by mass spectrometry (MS)-based proteomics, several methods have been proposed for the computational identification of O-GlcNAcylation sites. However, no existing method focuses on the investigation of O-GlcNAcylated substrate motifs. Thus, we were motivated to design a new method for the identification of protein O-GlcNAcylation sites that considers substrate site specificity.
Results: In this study, 375 experimentally verified O-GlcNAcylation sites were collected from dbOGAP, an integrated resource for protein O-GlcNAcylation. Because the substrate motifs are difficult to characterize by conventional sequence logo analysis, a recursively statistical method was applied to obtain significant conserved motifs. Support vector machines (SVMs) were adopted to construct predictive models learned from the identified substrate motifs. Five-fold cross-validation was used to evaluate the predictive model, which achieved sensitivity, specificity, and accuracy of 0.76, 0.80, and 0.78, respectively. Additionally, an independent testing set, which was blind to the training data of the predictive model, was used to demonstrate that the proposed method could provide a promising accuracy (0.94) and outperform three other O-GlcNAcylation site prediction tools.
Conclusion: This work proposed a computational method to identify informative substrate motifs for O-GlcNAcylation sites. Cross-validation and independent testing indicated that the identified motifs were effective in the identification of O-GlcNAcylation sites. A case study demonstrated that the proposed method could be a feasible means of conducting preliminary analyses of protein O-GlcNAcylation. We also anticipate that the revealed substrate motifs may facilitate the study of the extensive crosstalk between O-GlcNAcylation and phosphorylation. This method may help unravel their mechanisms and roles in signaling, transcription, chronic disease, and cancer.
|
17
|
An intelligent system for identifying acetylated lysine on histones and nonhistone proteins. BioMed Research International 2014; 2014:528650. [PMID: 25147802 PMCID: PMC4132336 DOI: 10.1155/2014/528650] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Received: 04/11/2014] [Revised: 06/23/2014] [Accepted: 06/24/2014] [Indexed: 01/15/2023]
Abstract
Lysine acetylation is an important and ubiquitous posttranslational modification conserved in prokaryotes and eukaryotes. This process, which is dynamically and temporally regulated by histone acetyltransferases and deacetylases, is crucial for numerous essential biological processes such as transcriptional regulation, cellular signaling, and stress response. Since the experimental identification of lysine acetylation sites within proteins is time-consuming and laboratory-intensive, several computational approaches have been developed to identify candidates for experimental validation. In this work, acetylated protein data collected from UniProtKB were categorized into histone or nonhistone proteins. Support vector machines (SVMs) were applied to build predictive models by using amino acid pair composition (AAPC) as a feature in a histone model. We combined BLOSUM62 and AAPC features in a nonhistone model. Furthermore, using maximal dependence decomposition (MDD) clustering can enhance the performance of the model on a fivefold cross-validation evaluation to yield a sensitivity of 0.863, specificity of 0.885, accuracy of 0.880, and MCC of 0.706. Additionally, the proposed method is evaluated using independent test sets resulting in a predictive accuracy of 74%. This indicates that the performance of our method is comparable with that of other acetylation prediction methods.
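Amino acid pair composition (AAPC), the feature named in the abstract above, is commonly defined as the normalized frequency of each of the 400 ordered pairs of adjacent standard amino acids in a sequence fragment. The following is a minimal sketch under that assumed definition; the function name `aapc` and the choice to skip pairs containing non-standard characters are illustrative and may differ from the encoding used in the paper.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered pairs, in a fixed order so every vector is comparable.
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def aapc(fragment):
    """Return a 400-dimensional amino acid pair composition vector."""
    counts = dict.fromkeys(PAIRS, 0)
    total = 0
    for first, second in zip(fragment, fragment[1:]):
        pair = first + second
        if pair in counts:  # skip pairs with gaps or non-standard residues
            counts[pair] += 1
            total += 1
    return [counts[p] / total if total else 0.0 for p in PAIRS]


# Toy fragment (hypothetical): adjacent pairs are AA, AK, KA, AA.
vector = aapc("AAKAA")
```

A vector like this can be concatenated with other encodings (the abstract mentions BLOSUM62 for the nonhistone model) before being passed to an SVM.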
|
18
|
Huang KY, Wu HY, Chen YJ, Lu CT, Su MG, Hsieh YC, Tsai CM, Lin KI, Huang HD, Lee TY, Chen YJ. RegPhos 2.0: an updated resource to explore protein kinase-substrate phosphorylation networks in mammals. Database (Oxford) 2014; 2014:bau034. [PMID: 24771658 PMCID: PMC3999940 DOI: 10.1093/database/bau034] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Received: 10/15/2014] [Revised: 03/27/2014] [Accepted: 03/30/2014] [Indexed: 11/13/2022]
Abstract
Protein phosphorylation catalyzed by kinases plays crucial roles in regulating a variety of intracellular processes. Owing to an increasing number of in vivo phosphorylation sites identified by mass spectrometry (MS)-based proteomics, RegPhos, available online at http://csb.cse.yzu.edu.tw/RegPhos2/, was developed to explore protein phosphorylation networks in human. In this update, we not only enhance the data content for human but also investigate kinase-substrate phosphorylation networks in mouse and rat. The experimentally validated phosphorylation sites and their catalytic kinases were extracted from public resources, and MS/MS phosphopeptides were manually curated from research articles. RegPhos 2.0 aims to provide a more comprehensive view of intracellular signaling networks by integrating information on metabolic pathways and protein-protein interactions. A case study shows that, by analyzing the phosphoproteome profile of time-dependent cell activation obtained from liquid chromatography-mass spectrometry (LC-MS/MS) analysis, RegPhos deciphered not only the consistent scheme of the B cell receptor (BCR) signaling pathway but also novel regulatory molecules that may be involved in it. To help users efficiently identify candidate biomarkers in cancers, 30 microarray experiments, including 39 cancerous versus normal cells, were analyzed to detect cancer-specific expressed genes coding for kinases and their substrates. Furthermore, this update features an improved web interface to facilitate exploration of phosphorylation networks for a group of genes/proteins. Database URL: http://csb.cse.yzu.edu.tw/RegPhos2/
Affiliation(s)
- Kai-Yao Huang
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 320, Taiwan, Institute of Chemistry, Academia Sinica, Taipei 115, Taiwan, Genomics Research Center, Academia Sinica, Taipei 115, Taiwan, Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsin-Chu 300, Taiwan and Department of Biological Science and Technology, National Chiao Tung University, Hsin-Chu 300, Taiwan
| | - Hsin-Yi Wu
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 320, Taiwan, Institute of Chemistry, Academia Sinica, Taipei 115, Taiwan, Genomics Research Center, Academia Sinica, Taipei 115, Taiwan, Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsin-Chu 300, Taiwan and Department of Biological Science and Technology, National Chiao Tung University, Hsin-Chu 300, Taiwan
| | - Yi-Ju Chen
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 320, Taiwan, Institute of Chemistry, Academia Sinica, Taipei 115, Taiwan, Genomics Research Center, Academia Sinica, Taipei 115, Taiwan, Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsin-Chu 300, Taiwan and Department of Biological Science and Technology, National Chiao Tung University, Hsin-Chu 300, Taiwan
| | - Cheng-Tsung Lu
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 320, Taiwan, Institute of Chemistry, Academia Sinica, Taipei 115, Taiwan, Genomics Research Center, Academia Sinica, Taipei 115, Taiwan, Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsin-Chu 300, Taiwan and Department of Biological Science and Technology, National Chiao Tung University, Hsin-Chu 300, Taiwan
| | - Min-Gang Su
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 320, Taiwan, Institute of Chemistry, Academia Sinica, Taipei 115, Taiwan, Genomics Research Center, Academia Sinica, Taipei 115, Taiwan, Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsin-Chu 300, Taiwan and Department of Biological Science and Technology, National Chiao Tung University, Hsin-Chu 300, Taiwan
| | - Yun-Chung Hsieh
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 320, Taiwan, Institute of Chemistry, Academia Sinica, Taipei 115, Taiwan, Genomics Research Center, Academia Sinica, Taipei 115, Taiwan, Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsin-Chu 300, Taiwan and Department of Biological Science and Technology, National Chiao Tung University, Hsin-Chu 300, Taiwan
| | - Chih-Ming Tsai
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 320, Taiwan, Institute of Chemistry, Academia Sinica, Taipei 115, Taiwan, Genomics Research Center, Academia Sinica, Taipei 115, Taiwan, Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsin-Chu 300, Taiwan and Department of Biological Science and Technology, National Chiao Tung University, Hsin-Chu 300, Taiwan
| | - Kuo-I Lin
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 320, Taiwan, Institute of Chemistry, Academia Sinica, Taipei 115, Taiwan, Genomics Research Center, Academia Sinica, Taipei 115, Taiwan, Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsin-Chu 300, Taiwan and Department of Biological Science and Technology, National Chiao Tung University, Hsin-Chu 300, Taiwan
| | - Hsien-Da Huang
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 320, Taiwan, Institute of Chemistry, Academia Sinica, Taipei 115, Taiwan, Genomics Research Center, Academia Sinica, Taipei 115, Taiwan, Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsin-Chu 300, Taiwan and Department of Biological Science and Technology, National Chiao Tung University, Hsin-Chu 300, Taiwan
| | - Tzong-Yi Lee
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 320, Taiwan, Institute of Chemistry, Academia Sinica, Taipei 115, Taiwan, Genomics Research Center, Academia Sinica, Taipei 115, Taiwan, Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsin-Chu 300, Taiwan and Department of Biological Science and Technology, National Chiao Tung University, Hsin-Chu 300, Taiwan
- Yu-Ju Chen
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 320, Taiwan, Institute of Chemistry, Academia Sinica, Taipei 115, Taiwan, Genomics Research Center, Academia Sinica, Taipei 115, Taiwan, Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsin-Chu 300, Taiwan and Department of Biological Science and Technology, National Chiao Tung University, Hsin-Chu 300, Taiwan
|
19
|
Xu Y, Wang X, Wang Y, Tian Y, Shao X, Wu LY, Deng N. Prediction of posttranslational modification sites from amino acid sequences with kernel methods. J Theor Biol 2014; 344:78-87. [DOI: 10.1016/j.jtbi.2013.11.012] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Received: 06/28/2013] [Revised: 09/13/2013] [Accepted: 11/16/2013] [Indexed: 01/12/2023]
|
20
|
Huang KY, Lu CT, Bretaña N, Lee TY, Chang TH. ViralPhos: incorporating a recursively statistical method to predict phosphorylation sites on virus proteins. BMC Bioinformatics 2013; 14 Suppl 16:S10. [PMID: 24564381 PMCID: PMC3853219 DOI: 10.1186/1471-2105-14-s16-s10] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Indexed: 11/17/2022]
Abstract
Background: The phosphorylation of virus proteins by host kinases is linked to viral replication. This leads to an inhibition of normal host-cell functions. Further elucidation of phosphorylation in virus proteins is required in order to aid in drug design and treatment. However, only a few studies have investigated substrate motifs in identifying virus phosphorylation sites. Additionally, existing bioinformatics tools do not consider potential host kinases that may initiate the phosphorylation of a virus protein. Results: 329 experimentally verified phosphorylation fragments on 111 virus proteins were collected from virPTM. These were clustered into subgroups of significantly conserved motifs using a recursively statistical method. Two-layered Support Vector Machines (SVMs) were then applied to train a predictive model for the identified substrate motifs. The SVM models were evaluated using five-fold cross-validation, which yielded an average accuracy of 0.86 for serine and 0.81 for threonine. Furthermore, the proposed method is shown to perform on par with three other phosphorylation site prediction tools: PPSP, KinasePhos 2.0 and GPS 2.1. Conclusion: In this study, we propose a computational method, ViralPhos, which aims to investigate virus substrate site motifs and identify potential phosphorylation sites on virus proteins. We identified informative substrate motifs that matched with several well-studied kinase groups as potential catalytic kinases for virus protein substrates. The identified substrate motifs were further exploited to identify potential virus phosphorylation sites. The proposed method is shown to be capable of predicting virus phosphorylation sites and has been implemented as a web server at http://csb.cse.yzu.edu.tw/ViralPhos/.
|
21
|
Su MG, Lee TY. Incorporating substrate sequence motifs and spatial amino acid composition to identify kinase-specific phosphorylation sites on protein three-dimensional structures. BMC Bioinformatics 2013; 14 Suppl 16:S2. [PMID: 24564522 PMCID: PMC3853090 DOI: 10.1186/1471-2105-14-s16-s2] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Indexed: 12/29/2022]
Abstract
BACKGROUND: Protein phosphorylation catalyzed by kinases plays crucial regulatory roles in cellular processes. With the growth of high-throughput mass spectrometry-based experiments, there is a strong desire to annotate the catalytic kinases for in vivo phosphorylation sites. Thus, a variety of computational methods have been developed for performing large-scale prediction of kinase-specific phosphorylation sites. However, most of the proposed methods rely solely on the local amino acid sequences surrounding the phosphorylation sites. An increasing number of three-dimensional structures makes it possible to physically investigate the structural environment of phosphorylation sites. RESULTS: In this work, all of the experimental phosphorylation sites are mapped to the protein entries of the Protein Data Bank by sequence identity. This resulted in a total of 4508 phosphorylation sites with protein three-dimensional (3D) structures. To identify phosphorylation sites on protein 3D structures, this work incorporates support vector machines (SVMs) with the information of linear motifs and spatial amino acid composition, which is determined for each kinase group by calculating the relative frequencies of the 20 amino acid types within a specific radial distance from the central phosphorylated amino acid residue. After the cross-validation evaluation, most of the kinase-specific models trained with the consideration of structural information outperform the models considering only the sequence information. Furthermore, an independent testing set that is not included in the training set demonstrates that the proposed method provides performance comparable to other popular tools. CONCLUSION: The proposed method is shown to be capable of predicting kinase-specific phosphorylation sites on 3D structures and has been implemented as a web server which is freely accessible at http://csb.cse.yzu.edu.tw/PhosK3D/. Because kinase-specific phosphorylation sites with similar sequence motifs are difficult to distinguish, this work also integrates 3D structural information to improve cross-classification specificity.
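The spatial amino acid composition described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the input layout (one representative coordinate per residue) and the radius value are assumptions made for illustration only.

```python
from collections import Counter
from math import dist

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes


def spatial_composition(residues, center_index, radius):
    """Relative frequency of each amino acid type around a central residue.

    residues: list of (one_letter_code, (x, y, z)) tuples; using a single
        representative coordinate per residue is an assumption of this sketch.
    center_index: position of the phosphorylated residue in `residues`.
    radius: radial cutoff, in the same units as the coordinates.
    """
    _, center_xyz = residues[center_index]
    # Collect every other residue that lies within the radial cutoff.
    neighbours = [
        aa
        for i, (aa, xyz) in enumerate(residues)
        if i != center_index and dist(xyz, center_xyz) <= radius
    ]
    counts = Counter(neighbours)
    total = len(neighbours)
    # A 20-dimensional feature vector that could be passed to a classifier.
    return {aa: (counts[aa] / total if total else 0.0) for aa in AMINO_ACIDS}
```

The resulting frequencies form a fixed-length vector, which is the kind of input an SVM expects alongside sequence-motif features.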
|
22
|
Lu CT, Huang KY, Su MG, Lee TY, Bretaña NA, Chang WC, Chen YJ, Chen YJ, Huang HD. dbPTM 3.0: an informative resource for investigating substrate site specificity and functional association of protein post-translational modifications. Nucleic Acids Res 2012. [PMID: 23193290 PMCID: PMC3531199 DOI: 10.1093/nar/gks1229] [Citation(s) in RCA: 165] [Impact Index Per Article: 13.8] [Indexed: 01/10/2023]
Abstract
Protein modification is an extremely important post-translational regulation that adjusts the physical and chemical properties, conformation, stability and activity of a protein, thus altering protein function. Due to the high throughput of mass spectrometry (MS)-based methods in identifying site-specific post-translational modifications (PTMs), dbPTM (http://dbPTM.mbc.nctu.edu.tw/) is updated to integrate experimental PTMs obtained from public resources as well as manually curated MS/MS peptides associated with PTMs from research articles. Version 3.0 of dbPTM aims to be an informative resource for investigating the substrate specificity of PTM sites and functional association of PTMs between substrates and their interacting proteins. In order to investigate the substrate specificity for modification sites, a newly developed statistical method has been applied to identify the significant substrate motifs for each type of PTM with sufficient experimental data. According to the data statistics in dbPTM, >60% of PTM sites are located in the functional domains of proteins. It is known that most PTMs can create binding sites for specific protein-interaction domains that work together for cellular function. Thus, this update integrates protein–protein interaction and domain–domain interaction to determine the functional association of PTM sites located in protein-interacting domains. Additionally, information on the structural topologies of transmembrane (TM) proteins is integrated in dbPTM in order to delineate the structural correlation between the reported PTM sites and TM topologies. To facilitate the investigation of PTMs on TM proteins, the PTM substrate sites and the structural topology are graphically represented. Literature information related to PTMs, orthologous conservation and substrate motifs of PTMs are also provided in the resource. Finally, this version features an improved web interface to facilitate convenient access to the resource.
Affiliation(s)
- Cheng-Tsung Lu
- Department of Computer Science and Engineering, Yuan Ze University, Chung-Li 320, Taiwan
|
23
|
Functional analyses of endometriosis-related polymorphisms in the estrogen synthesis and metabolism-related genes. PLoS One 2012; 7:e47374. [PMID: 23139742 PMCID: PMC3490981 DOI: 10.1371/journal.pone.0047374] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 05/04/2012] [Accepted: 09/12/2012] [Indexed: 11/19/2022]
Abstract
Endometriosis is determined by genetic factors, and the prevalence of genetic polymorphisms varies greatly depending on the ethnic group studied. The objective of this study was to investigate the relationship between single nucleotide polymorphisms (SNPs) of 9 genes involved in estrogen biosynthesis and metabolism and the risks of endometriosis. Three hundred patients with endometriosis and 337 non-endometriotic controls were recruited. Thirty-four non-synonymous SNPs, which change amino acid residues, were analyzed using matrix-assisted laser desorption-ionization time-of-flight mass spectrometry (MALDI-TOF MS). The functions of SNP-induced amino acid changes were analyzed using multiple web-accessible databases and phosphorylation predicting algorithms. Among the 34 NCBI-listed SNPs, 22 did not exhibit polymorphism in this study of more than 600 Taiwanese Chinese women. However, homozygous and heterozygous mutants of 4 SNPs - rs6165 (genotype GG+GA, 307(Ala/Ala)+307(Ala/Thr)) of FSHR, rs6166 (genotype GG+GA, 680(Ser/Asn)+680(Ser/Ser)) of FSHR, rs2066479 (genotype AA+AG, 289(Ser/Ser)+289(Ser/Gly)) of HSD17B3 and rs700519 (genotype TT+TC, 264(Cys/Cys)+264(Cys/Arg)) of CYP19, alone or in combination, were significantly associated with decreased risks of endometriosis. Bioinformatics results identified 307(Thr) of FSHR to be a site for O-linked glycosylation, 680(Ser) of FSHR a phosphorylated site by protein kinase B, and 289(Ser) of HSD17B3 a phosphorylated site by protein kinase B or ribosomal protein S6 kinase 1. Results of this study suggest that non-synonymous polymorphisms of FSHR, HSD17B3 and CYP19 genes may modulate the risk of endometriosis in Taiwanese Chinese women. Identification of the endometriosis-preferential non-synonymous SNPs and the conformational changes in those proteins may pave the way for the development of more disease-specific drugs.
|
24
|
Huang JH, Cao DS, Yan J, Xu QS, Hu QN, Liang YZ. Using core hydrophobicity to identify phosphorylation sites of human G protein-coupled receptors. Biochimie 2012; 94:1697-704. [PMID: 22503742 DOI: 10.1016/j.biochi.2012.03.022] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 12/10/2011] [Accepted: 03/28/2012] [Indexed: 01/23/2023]
Abstract
As the most frequent drug target, G protein-coupled receptors (GPCRs) are a large family of seven trans-membrane receptors that sense molecules outside the cell and activate inside signal transduction pathways. The activity and lifetime of activated receptors are regulated by receptor phosphorylation. Therefore, investigating the exact positions of phosphorylation sites in GPCR sequences could provide useful clues for drug design and other biotechnology applications. Experimental identification of phosphorylation sites is expensive and laborious. Hence, there is significant interest in the development of computational methods for reliable prediction of phosphorylation sites from amino acid sequences. In this article, we present a simple and effective method to recognize phosphorylation sites of human GPCRs by combining amino acid hydrophobicity and a support vector machine. The prediction accuracy, sensitivity, specificity, Matthews correlation coefficient and area under the curve were 0.964, 0.790, 0.999, 0.866 and 0.941 for phosphoserine; 0.954, 0.800, 0.985, 0.828 and 0.958 for phosphothreonine; and 0.976, 0.820, 0.993, 0.861 and 0.959 for phosphotyrosine. The establishment of such a fast and accurate prediction method will speed up the pace of identifying relevant GPCR sites to facilitate drug discovery.
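Four of the five measures quoted in this abstract follow directly from a confusion matrix; the area under the curve needs ranked scores and is left out here. The sketch below uses the standard textbook definitions. The function name and any counts passed to it are illustrative assumptions, not data from the paper.

```python
from math import sqrt


def binary_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity and Matthews correlation coefficient.

    tp, fp, tn, fn: counts of true positives, false positives,
    true negatives and false negatives for one residue type.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    # Sensitivity: fraction of real phosphorylation sites that were found.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of non-sites correctly rejected.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally reported as 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }
```

Reporting the Matthews correlation coefficient alongside accuracy is useful when non-sites greatly outnumber phosphorylation sites, since accuracy alone can look high for a predictor that rarely calls a site.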
Affiliation(s)
- Jian-Hua Huang
- Research center of Modernization of Traditional Chinese Medicines, Central South University, Changsha 410083, PR China
|
25
|
An Integrated Bayesian Framework for Identifying Phosphorylation Networks in Stimulated Cells. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2012; 736:59-80. [DOI: 10.1007/978-1-4419-7210-1_3] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 01/06/2023]
|
26
|
Trost B, Kusalik A. Computational prediction of eukaryotic phosphorylation sites. Bioinformatics 2011; 27:2927-35. [DOI: 10.1093/bioinformatics/btr525] [Citation(s) in RCA: 121] [Impact Index Per Article: 9.3] [Indexed: 11/14/2022]
|
27
|
Lee TY, Bretaña NA, Lu CT. PlantPhos: using maximal dependence decomposition to identify plant phosphorylation sites with substrate site specificity. BMC Bioinformatics 2011; 12:261. [PMID: 21703007 PMCID: PMC3228547 DOI: 10.1186/1471-2105-12-261] [Citation(s) in RCA: 57] [Impact Index Per Article: 4.4] [Received: 03/14/2011] [Accepted: 06/26/2011] [Indexed: 01/18/2023]
Abstract
BACKGROUND: Protein phosphorylation catalyzed by kinases plays crucial regulatory roles in intracellular signal transduction. Due to the difficulty in performing high-throughput mass spectrometry-based experiments, there is a desire to predict phosphorylation sites using computational methods. However, previous studies regarding in silico prediction of plant phosphorylation sites lack the consideration of kinase-specific phosphorylation data. Thus, we are motivated to propose a new method that investigates different substrate specificities in plant phosphorylation sites. RESULTS: Experimentally verified phosphorylation data were extracted from TAIR9, a protein database containing 3006 phosphorylation data from the plant species Arabidopsis thaliana. In an attempt to investigate the various substrate motifs in plant phosphorylation, maximal dependence decomposition (MDD) is employed to cluster a large set of phosphorylation data into subgroups containing significantly conserved motifs. A profile hidden Markov model (HMM) is then applied to learn a predictive model for each subgroup. Cross-validation evaluation on the MDD-clustered HMMs yields an average accuracy of 82.4% for serine, 78.6% for threonine, and 89.0% for tyrosine models. Moreover, independent test results using Arabidopsis thaliana phosphorylation data from UniProtKB/Swiss-Prot show that the proposed models are able to correctly predict 81.4% of phosphoserine, 77.1% of phosphothreonine, and 83.7% of phosphotyrosine sites. Interestingly, several MDD-clustered subgroups are observed to have similar amino acid conservation with the substrate motifs of well-known kinases from Phospho.ELM, a database containing kinase-specific phosphorylation data from multiple organisms. CONCLUSIONS: This work presents a novel method for identifying plant phosphorylation sites with various substrate motifs. Based on cross-validation and independent testing, results show that the MDD-clustered models outperform models trained without using MDD. The proposed method has been implemented as a web-based plant phosphorylation prediction tool, PlantPhos (http://csb.cse.yzu.edu.tw/PlantPhos/). Additionally, two case studies have been demonstrated to further evaluate the effectiveness of PlantPhos.
Affiliation(s)
- Tzong-Yi Lee
- Department of Computer Science and Engineering, Yuan Ze University, Chungli 320, Taiwan.
|
28
|
Lee TY, Lin ZQ, Hsieh SJ, Bretaña NA, Lu CT. Exploiting maximal dependence decomposition to identify conserved motifs from a group of aligned signal sequences. Bioinformatics 2011; 27:1780-7. [DOI: 10.1093/bioinformatics/btr291] [Citation(s) in RCA: 78] [Impact Index Per Article: 6.0] [Indexed: 11/14/2022]
|
29
|
Lee TY, Bo-Kai Hsu J, Chang WC, Huang HD. RegPhos: a system to explore the protein kinase-substrate phosphorylation network in humans. Nucleic Acids Res 2011; 39:D777-87. [PMID: 21037261 PMCID: PMC3013804 DOI: 10.1093/nar/gkq970] [Citation(s) in RCA: 60] [Impact Index Per Article: 4.6] [Indexed: 11/23/2022]
Abstract
Protein phosphorylation catalyzed by kinases plays crucial regulatory roles in intracellular signal transduction. With the increasing number of experimental phosphorylation sites identified by mass spectrometry-based proteomics, there is growing interest in exploring the networks of protein kinases and substrates. Manning et al. have identified 518 human kinase genes, which provide a starting point for comprehensive analysis of protein phosphorylation networks. In this study, a knowledgebase is developed to integrate experimentally verified protein phosphorylation data and protein-protein interaction data for constructing the protein kinase-substrate phosphorylation networks in humans. A total of 21,110 experimentally verified phosphorylation sites within 5092 human proteins are collected. However, only 4138 phosphorylation sites (∼20%) have the annotation of catalytic kinases from the public domain. In order to fully investigate how protein kinases regulate intracellular processes, a published kinase-specific phosphorylation site prediction tool named KinasePhos is incorporated for assigning the potential kinase. The web-based system, RegPhos, lets users input a group of human proteins; consequently, the phosphorylation network associated with the protein subcellular localization can be explored. Additionally, time-course microarray expression data are used to represent the degree of similarity in the expression profiles of network members. A case study demonstrates that the proposed scheme not only identifies the correct network of insulin signaling but also detects a novel signaling pathway that may cross-talk with the insulin signaling network. This effective system is now freely available at http://RegPhos.mbc.nctu.edu.tw.
Affiliation(s)
- Tzong-Yi Lee
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 320, Taiwan
|
30
|
Annan RB, Lee AY, Reid ID, Sayad A, Whiteway M, Hallett M, Thomas DY. A biochemical genomics screen for substrates of Ste20p kinase enables the in silico prediction of novel substrates. PLoS One 2009; 4:e8279. [PMID: 20020052 PMCID: PMC2791418 DOI: 10.1371/journal.pone.0008279] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 08/17/2009] [Accepted: 11/19/2009] [Indexed: 01/13/2023]
Abstract
The Ste20/PAK family is involved in many cellular processes, including the regulation of actin-based cytoskeletal dynamics and the activation of MAPK signaling pathways. Despite its numerous roles, few of its substrates have been identified. To better characterize the roles of the yeast Ste20p kinase, we developed an in vitro biochemical genomics screen to identify its substrates. When applied to 539 purified yeast proteins, the screen reported 14 targets of Ste20p phosphorylation. We used the data resulting from our screen to build an in silico predictor to identify Ste20p substrates on a proteome-wide basis. Since kinase-substrate specificity is often mediated by additional binding events at sites distal to the phosphorylation site, the predictor uses the presence/absence of multiple sequence motifs to evaluate potential substrates. Statistical validation estimates a threefold improvement in substrate recovery over random predictions, despite the lack of a single dominant motif that can characterize Ste20p phosphorylation. The set of predicted substrates significantly overrepresents elements of the genetic and physical interaction networks surrounding Ste20p, suggesting that some of the predicted substrates are in vivo targets. We validated this combined experimental and computational approach for identifying kinase substrates by confirming the in vitro phosphorylation of polarisome components Bni1p and Bud6p, thus suggesting a mechanism by which Ste20p effects polarized growth.
Affiliation(s)
- Robert B Annan
- Department of Biochemistry, McGill University, Montreal, Quebec, Canada.
|
31
|
Chang WC, Lee TY, Shien DM, Hsu JBK, Horng JT, Hsu PC, Wang TY, Huang HD, Pan RL. Incorporating support vector machine for identifying protein tyrosine sulfation sites. J Comput Chem 2009; 30:2526-37. [PMID: 19373826 DOI: 10.1002/jcc.21258] [Citation(s) in RCA: 71] [Impact Index Per Article: 4.7] [Indexed: 11/06/2022]
Abstract
Tyrosine sulfation is a post-translational modification of many secreted and membrane-bound proteins. It governs protein-protein interactions that are involved in leukocyte adhesion, hemostasis, and chemokine signaling. However, the intrinsic features of sulfated proteins remain to be delineated. This investigation presents SulfoSite, a computational method based on a support vector machine (SVM) for predicting protein sulfotyrosine sites. The approach was developed to consider structural information, such as the secondary structure and solvent accessibility of the amino acids that surround the sulfotyrosine sites. One hundred sixty-two experimentally verified tyrosine sulfation sites were identified using UniProtKB/SwissProt release 53.0. The results of a five-fold cross-validation evaluation suggest that the solvent accessibility around the sulfotyrosine sites contributes substantially to predictive accuracy. The SVM classifier can achieve an accuracy of 94.2% in five-fold cross-validation when the sequence positional weighted matrix (PWM) is coupled with values of the accessible surface area (ASA). The proposed method significantly outperforms previous methods for accurately predicting the location of tyrosine sulfation sites.
Affiliation(s)
- Wen-Chi Chang
- Department of Biological Science and Technology, National Chiao Tung University, Hsin-Chu, Taiwan
|
32
|
Shien DM, Lee TY, Chang WC, Hsu JBK, Horng JT, Hsu PC, Wang TY, Huang HD. Incorporating structural characteristics for identification of protein methylation sites. J Comput Chem 2009; 30:1532-43. [PMID: 19263424 DOI: 10.1002/jcc.21232] [Citation(s) in RCA: 75] [Impact Index Per Article: 5.0] [Indexed: 11/06/2022]
Abstract
Studies over the last few years have identified protein methylation on histones and other proteins that are involved in the regulation of gene transcription. Several works have developed approaches to identify computationally the potential methylation sites on lysine and arginine. Studies of protein tertiary structure have demonstrated that the sites of protein methylation are preferentially in regions that are easily accessible. However, previous studies have not taken into account the solvent-accessible surface area (ASA) that surrounds the methylation sites. This work presents a method named MASA that combines the support vector machine with the sequence and structural characteristics of proteins to identify methylation sites on lysine, arginine, glutamate, and asparagine. Since most experimental methylation sites are not associated with corresponding protein tertiary structures in the Protein Data Bank, the effective solvent-accessible prediction tools have been adopted to determine the potential ASA values of amino acids in proteins. Evaluation of predictive performance by cross-validation indicates that the ASA values around the methylation sites can improve the accuracy of prediction. Additionally, an independent test reveals that the prediction accuracies for methylated lysine and arginine are 80.8 and 85.0%, respectively. Finally, the proposed method is implemented as an effective system for identifying protein methylation sites. The developed web server is freely available at http://MASA.mbc.nctu.edu.tw/.
Affiliation(s)
- Dray-Ming Shien
- Department of Computer Science and Information Engineering, National Central University, Chung-Li 320, Taiwan
|
33
|
Lee TY, Hsu JBK, Chang WC, Wang TY, Hsu PC, Huang HD. A comprehensive resource for integrating and displaying protein post-translational modifications. BMC Res Notes 2009; 2:111. [PMID: 19549291 PMCID: PMC2713254 DOI: 10.1186/1756-0500-2-111] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.3] [Received: 11/18/2008] [Accepted: 06/23/2009] [Indexed: 11/22/2022]
Abstract
Background: Protein Post-Translational Modification (PTM) plays an essential role in cellular control mechanisms that adjust protein physical and chemical properties, folding, conformation, stability and activity, thus also altering protein function. Findings: dbPTM (version 1.0), which was developed previously, aimed at a comprehensive collection of protein post-translational modifications. In this updated version (dbPTM2.0), we developed a PTM database towards an expert system of protein post-translational modifications. The database comprehensively collects experimental and predictive protein PTM sites. In addition, dbPTM2.0 was extended to a knowledge base comprising the modified sites, solvent accessibility of substrate, protein secondary and tertiary structures, protein domains, protein intrinsic disorder region, and protein variations. Moreover, this work compiles a benchmark to construct evaluation datasets for computational studies identifying PTM sites, such as phosphorylated sites, glycosylated sites, acetylated sites and methylated sites. Conclusion: The current release not only provides the sequence-based information, but also annotates the structure-based information for protein post-translational modification. The interface is also designed to facilitate access to the resource. This effective database is now freely accessible at .
Affiliation(s)
- Tzong-Yi Lee
- Department of Biological Science and Technology, Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsin-Chu 300, Taiwan.
|
34
|
Garbino A, van Oort RJ, Dixit SS, Landstrom AP, Ackerman MJ, Wehrens XHT. Molecular evolution of the junctophilin gene family. Physiol Genomics 2009; 37:175-86. [PMID: 19318539 DOI: 10.1152/physiolgenomics.00017.2009] [Citation(s) in RCA: 65] [Impact Index Per Article: 4.3] [Indexed: 11/22/2022]
Abstract
Junctophilins (JPHs) are members of a junctional membrane complex protein family important for the physical approximation of plasmalemmal and sarcoplasmic/endoplasmic reticulum membranes. As such, JPHs facilitate signal transduction in excitable cells between plasmalemmal voltage-gated calcium channels and intracellular calcium release channels. To determine the molecular evolution of the JPH gene family, we performed a phylogenetic analysis of over 60 JPH genes from over 40 species and compared conservation across species and different isoforms. We found that JPHs are evolutionarily highly conserved, in particular the membrane occupation and recognition nexus motifs found in all species. Our data suggest that an ancestral form of JPH arose at the latest in a common metazoan ancestor and that in vertebrates four isoforms arose, probably following two rounds of whole genome duplications. By combining multiple prediction techniques with sequence alignments, we also postulate the presence of new important functional regions and candidate sites for posttranslational modifications. The increasing number of available sequences yields significant insight into the molecular evolution of JPHs. Our analysis is consistent with the emerging concept that JPHs serve dual important functions in excitable cells: structural assembly of junctional membrane complexes and regulation of intracellular calcium signaling pathways.
Affiliation(s)
- Alejandro Garbino
- Department of Molecular Physiology and Biophysics, Baylor College of Medicine, Houston, Texas 77030, USA
|
35
|
Dang TH, Van Leemput K, Verschoren A, Laukens K. Prediction of kinase-specific phosphorylation sites using conditional random fields. Bioinformatics 2008; 24:2857-64. [PMID: 18940828 PMCID: PMC2639296 DOI: 10.1093/bioinformatics/btn546] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.4] [Indexed: 12/22/2022]
Abstract
Motivation: Phosphorylation is a crucial post-translational protein modification mechanism with important regulatory functions in biological systems. It is catalyzed by a group of enzymes called kinases, each of which recognizes certain target sites in its substrate proteins. Several authors have built computational models trained from sets of experimentally validated phosphorylation sites to predict these target sites for each given kinase. All of these models suffer from certain limitations, such as the fact that they do not take into account the dependencies between amino acid motifs within protein sequences in a global fashion. Results: We propose a novel approach to predict phosphorylation sites from the protein sequence. The method uses a positive dataset to train a conditional random field (CRF) model. The negative training dataset is used to specify the decision threshold corresponding to a desired false positive rate. Application of the method on experimentally verified benchmark phosphorylation data (Phospho.ELM) shows that it performs well compared to existing methods for most kinases. To our knowledge, this is the first report of the use of CRFs to predict post-translational modification sites in protein sequences. Availability: The source code of the implementation, called CRPhos, is available from http://www.ptools.ua.ac.be/CRPhos/ Contact: kris.laukens@ua.ac.be Supplementary Information: Supplementary data are available at http://www.ptools.ua.ac.be/CRPhos/
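The idea of using the negative set only to fix a decision threshold at a desired false positive rate can be sketched as follows. This is an illustrative reading of the abstract, not CRPhos code: the function names are hypothetical and the scores stand in for whatever the trained model outputs.

```python
from math import floor


def threshold_for_fpr(negative_scores, target_fpr):
    """Pick a threshold so that at most `target_fpr` of negatives exceed it.

    negative_scores: model scores for known non-phosphorylated sites.
    target_fpr: desired false positive rate, between 0 and 1.
    A site is later called positive when its score is strictly greater
    than the returned threshold.
    """
    if not negative_scores:
        raise ValueError("need at least one negative score")
    ranked = sorted(negative_scores, reverse=True)
    # Number of negatives we can afford to misclassify as positive.
    allowed = floor(target_fpr * len(ranked))
    if allowed >= len(ranked):
        return float("-inf")  # every negative may be called positive
    # Only scores strictly above ranked[allowed] count as positive,
    # so at most `allowed` negatives can pass.
    return ranked[allowed]


def predict_sites(scores, threshold):
    """Apply the threshold to candidate-site scores."""
    return [score > threshold for score in scores]
```

Because the comparison is strict, tied scores at the threshold are treated as negative, which keeps the realised false positive rate at or below the target.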
Affiliation(s)
- Thanh Hai Dang
- Intelligent Systems Laboratory and Advanced Database Research and Modelling, Department of Mathematics and Computer Science, Middelheimlaan 1, B-2020 Antwerpen, Belgium
|
36
|
Wan J, Kang S, Tang C, Yan J, Ren Y, Liu J, Gao X, Banerjee A, Ellis LBM, Li T. Meta-prediction of phosphorylation sites with weighted voting and restricted grid search parameter selection. Nucleic Acids Res 2008; 36:e22. [PMID: 18234718 PMCID: PMC2275094 DOI: 10.1093/nar/gkm848] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.6] [Received: 05/15/2007] [Revised: 08/28/2007] [Accepted: 09/26/2007] [Indexed: 11/21/2022] Open
Abstract
Meta-predictors make predictions by organizing and processing the predictions produced by several other predictors in a defined problem domain. A proficient meta-predictor not only offers better predicting performance than the individual predictors from which it is constructed, but it also relieves experimental researchers from making difficult judgments when faced with conflicting results made by multiple prediction programs. As increasing numbers of predicting programs are being developed in a large number of fields of life sciences, there is an urgent need for effective meta-prediction strategies to be investigated. We compiled four unbiased phosphorylation site datasets, each for one of the four major serine/threonine (S/T) protein kinase families: CDK, CK2, PKA and PKC. Using these datasets, we examined several meta-predicting strategies with 15 phosphorylation site predictors from six predicting programs: GPS, KinasePhos, NetPhosK, PPSP, PredPhospho and Scansite. Meta-predictors constructed with a generalized weighted voting meta-predicting strategy with parameters determined by restricted grid search possess the best performance, exceeding that of all individual predictors in predicting phosphorylation sites of all four kinase families. Our results demonstrate a useful decision-making tool for analysing the predictions of the various S/T phosphorylation site predictors. An implementation of these meta-predictors is available on the web at: http://MetaPred.umn.edu/MetaPredPS/.
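The idea of weighted voting with grid-searched parameters can be sketched as follows. This is only an illustration under stated assumptions, not the MetaPredPS code: each predictor is assumed to emit a binary call (1 = phosphorylation site), the names `weighted_vote`, `accuracy` and `grid_search` are hypothetical, and the small exhaustive grid stands in for the paper's restricted grid, whose exact restriction scheme is not given in the abstract.

```python
from itertools import product


def weighted_vote(votes, weights, cutoff):
    """Return 1 if the weighted share of positive votes reaches the cutoff, else 0."""
    total = sum(weights)
    share = sum(w for w, v in zip(weights, votes) if v) / total
    return 1 if share >= cutoff else 0


def accuracy(vote_rows, labels, weights, cutoff):
    """Fraction of sites where the weighted vote agrees with the known label."""
    hits = sum(weighted_vote(row, weights, cutoff) == y
               for row, y in zip(vote_rows, labels))
    return hits / len(labels)


def grid_search(vote_rows, labels, weight_grid, cutoff_grid):
    """Try every weight/cutoff combination on the grid and keep the most accurate one."""
    n_predictors = len(vote_rows[0])
    best = None
    for weights in product(weight_grid, repeat=n_predictors):
        if sum(weights) == 0:
            continue  # an all-zero weighting cannot vote
        for cutoff in cutoff_grid:
            acc = accuracy(vote_rows, labels, weights, cutoff)
            if best is None or acc > best[0]:
                best = (acc, weights, cutoff)
    return best  # (accuracy, weights, cutoff)
```

In practice the parameters would be selected on training data and the accuracy reported on held-out sites; the sketch omits that split for brevity.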
Affiliation(s)
- Ji Wan
- Department of Neuroscience, Department of Computer Science and Engineering and Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN 55455, USA and Department of Biology and Biochemistry, University of Houston, Houston, TX 77204, USA
| | - Shuli Kang
- Department of Neuroscience, Department of Computer Science and Engineering and Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN 55455, USA and Department of Biology and Biochemistry, University of Houston, Houston, TX 77204, USA
| | - Chuanning Tang
- Department of Neuroscience, Department of Computer Science and Engineering and Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN 55455, USA and Department of Biology and Biochemistry, University of Houston, Houston, TX 77204, USA
| | - Jianhua Yan
- Department of Neuroscience, Department of Computer Science and Engineering and Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN 55455, USA and Department of Biology and Biochemistry, University of Houston, Houston, TX 77204, USA
| | - Yongliang Ren
- Department of Neuroscience, Department of Computer Science and Engineering and Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN 55455, USA and Department of Biology and Biochemistry, University of Houston, Houston, TX 77204, USA
| | - Jie Liu
- Department of Neuroscience, Department of Computer Science and Engineering and Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN 55455, USA and Department of Biology and Biochemistry, University of Houston, Houston, TX 77204, USA
| | - Xiaolian Gao
- Department of Neuroscience, Department of Computer Science and Engineering and Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN 55455, USA and Department of Biology and Biochemistry, University of Houston, Houston, TX 77204, USA
| | - Arindam Banerjee
- Department of Neuroscience, Department of Computer Science and Engineering and Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN 55455, USA and Department of Biology and Biochemistry, University of Houston, Houston, TX 77204, USA
| | - Lynda B. M. Ellis
- Department of Neuroscience, Department of Computer Science and Engineering and Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN 55455, USA and Department of Biology and Biochemistry, University of Houston, Houston, TX 77204, USA
| | - Tongbin Li
- Department of Neuroscience, Department of Computer Science and Engineering and Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN 55455, USA and Department of Biology and Biochemistry, University of Houston, Houston, TX 77204, USA
|
37
|
Shonhai A, Boshoff A, Blatch GL. The structural and functional diversity of Hsp70 proteins from Plasmodium falciparum. Protein Sci 2007; 16:1803-18. [PMID: 17766381 PMCID: PMC2206976 DOI: 10.1110/ps.072918107] [Citation(s) in RCA: 74] [Impact Index Per Article: 4.4] [Indexed: 02/04/2023]
Abstract
It is becoming increasingly apparent that heat shock proteins play an important role in the survival of Plasmodium falciparum against temperature changes associated with its passage from the cold-blooded mosquito vector to the warm-blooded human host. Interest in understanding the possible role of P. falciparum Hsp70s in the life cycle of the parasite has led to the identification of six HSP70 genes. Although most research attention has focused primarily on one of the cytosolic Hsp70s (PfHsp70-1) and its endoplasmic reticulum homolog (PfHsp70-2), further functional insights could be inferred from the structural motifs exhibited by the rest of the Hsp70 family members of P. falciparum. There is increasing evidence that suggests that PfHsp70-1 could play an important role in the life cycle of P. falciparum both as a chaperone and immunogen. In addition, P. falciparum Hsp70s and Hsp40 partners are implicated in the intracellular and extracellular trafficking of proteins. This review summarizes data emerging from studies on the chaperone role of P. falciparum Hsp70s, taking advantage of inferences gleaned from their structures and information on their cellular localization. The possible associations between P. falciparum Hsp70s with their cochaperone partners as well as other chaperones and proteins are discussed.
Affiliation(s)
- Addmore Shonhai
- Department of Biochemistry, Microbiology and Biotechnology, Rhodes University, Grahamstown 6140, South Africa
|
38
|
Patterson EK, Watson PH, Hodsman AB, Hendy GN, Canaff L, Bringhurst FR, Poschwatta CH, Fraher LJ. Expression of PTH1R constructs in LLC-PK1 cells: protein nuclear targeting is mediated by the PTH1R NLS. Bone 2007; 41:603-10. [PMID: 17627912 DOI: 10.1016/j.bone.2007.04.201] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 11/13/2006] [Revised: 03/01/2007] [Accepted: 04/04/2007] [Indexed: 10/23/2022]
Abstract
This study demonstrates that the PTH1R NLS can target a fusion protein to the nucleus, and that this is blocked by sequences downstream of the NLS. GFP fused to the NLS showed a significant increase in nuclear targeting compared to GFP alone or GFP fused to a peptide of the same length. In previous studies, we demonstrated that the type I PTH/PTHrP receptor (PTH1R) localizes to the nucleus of cells within rat liver, kidney, uterus, ovary and gut. Similarly, nuclear localization of the PTH1R was observed in the cultured osteoblast-like cells MC3T3-E1, UMR106, ROS 17/2.8 and SaOS-2. We have identified a putative bipartite nuclear localization signal (NLS), from residues 471-488 in the protein sequence of the PTH1R. In this study, several PTH1R constructs were made in the Enhanced Green Fluorescent Protein (EGFP) expression vector (Clontech), transiently transfected into LLC-PK1 Clone 46 cells, and the resultant fusion protein expression followed by fluorescence microscopy. This particular clone of LLC-PK1 shows no biochemical response in vitro to parathyroid hormone. Constructs included the entire PTH1R sequence (PTH1R-GFP), the putative NLS fused to the C-terminus of GFP (GFP-NLS) or the NLS through to the C-terminus of the PTH1R fused to GFP (GFP-NLSCT). Deconvolution fluorescence microscopy of cells transfected with PTH1R-GFP showed abundant fluorescent signal throughout the cells with distinctly fluorescing plasma membranes. These cells also exhibited an increase in cAMP production in response to (0 to 10^-8 M) hPTH(1-34), with an increase in cAMP from 11 fmol/µg of protein to 101 fmol/µg. In contrast, cells transfected with the GFP-NLS construct showed significant nuclear sequestration of fluorescence as compared to GFP alone, GFP-NLSCT, or a short amino acid sequence fused to GFP (GFP-FFVAIYCFCNGEVQAEI).
These results indicate that the NLS at residues 471-488 of the mature rat PTH1R is functional and plays a role in targeting the PTH1R to the nucleus. In addition, fusing GFP to the C-terminus of the PTH1R still allows cAMP generation, which will be useful for further studies.
Affiliation(s)
- Eric K Patterson
- Department of Biochemistry, University of Western Ontario, and The Lawson Health Research Institute, London, Ontario, Canada N6A 4V2
|
39
|
Chang EJ, Begum R, Chait BT, Gaasterland T. Prediction of cyclin-dependent kinase phosphorylation substrates. PLoS One 2007; 2:e656. [PMID: 17668044 PMCID: PMC1924601 DOI: 10.1371/journal.pone.0000656] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.5] [Received: 04/23/2007] [Accepted: 06/24/2007] [Indexed: 11/18/2022] Open
Abstract
Protein phosphorylation, mediated by a family of enzymes called cyclin-dependent kinases (Cdks), plays a central role in the cell-division cycle of eukaryotes. Phosphorylation by Cdks directs the cell cycle by modifying the function of regulators of key processes such as DNA replication and mitotic progression. Here, we present a novel computational procedure to predict substrates of the cyclin-dependent kinase Cdc28 (Cdk1) in Saccharomyces cerevisiae. Currently, most computational phosphorylation site prediction procedures focus solely on local sequence characteristics. In the present procedure, we model Cdk substrates based on both local and global characteristics of the substrates. Thus, we define the local sequence motifs that represent the Cdc28 phosphorylation sites and subsequently model clustering of these motifs within the protein sequences. This restraint reflects the observation that many known Cdk substrates contain multiple clustered phosphorylation sites. The present strategy defines a subset of the proteome that is highly enriched for Cdk substrates, as validated by comparing it to a set of bona fide, published, experimentally characterized Cdk substrates which was, to our knowledge, comprehensive at the time of writing. To corroborate our model, we compared its predictions with three experimentally independent Cdk proteomic datasets and found significant overlap. Finally, we directly detected in vivo phosphorylation at Cdk motifs for selected putative substrates using mass spectrometry.
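The combination of a local motif with a global clustering criterion can be illustrated with a short Python sketch. It does not reproduce the authors' model: the minimal motif `[ST]P` is an assumption taken from the commonly cited Cdk consensus rather than from this abstract, and the function names and the window-based cluster measure are hypothetical.

```python
import re

# Assumed minimal Cdk consensus (Ser/Thr followed by Pro); not taken from the abstract.
MINIMAL_CDK_MOTIF = re.compile(r"[ST]P")


def motif_positions(sequence):
    """Return 0-based start positions of every minimal-motif match in the sequence."""
    return [m.start() for m in MINIMAL_CDK_MOTIF.finditer(sequence)]


def max_sites_in_window(positions, window):
    """Largest number of motif starts that fit inside any stretch of `window` residues.

    `positions` must be sorted ascending, as returned by motif_positions.
    """
    best = 0
    left = 0
    for right, pos in enumerate(positions):
        # Shrink the window from the left until it spans fewer than `window` residues.
        while pos - positions[left] >= window:
            left += 1
        best = max(best, right - left + 1)
    return best
```

A protein could then be flagged as a candidate substrate when `max_sites_in_window` exceeds some count chosen on known substrates; that cutoff is left open here.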
Affiliation(s)
- Emmanuel J Chang
- Department of Chemistry, York College of the City University of New York, Jamaica, New York, United States of America; Laboratory of Mass Spectrometry and Gaseous Ion Chemistry, Rockefeller University, New York, New York, United States of America.
|
40
|
Wong YH, Lee TY, Liang HK, Huang CM, Wang TY, Yang YH, Chu CH, Huang HD, Ko MT, Hwang JK. KinasePhos 2.0: a web server for identifying protein kinase-specific phosphorylation sites based on sequences and coupling patterns. Nucleic Acids Res 2007; 35:W588-94. [PMID: 17517770 PMCID: PMC1933228 DOI: 10.1093/nar/gkm322] [Citation(s) in RCA: 266] [Impact Index Per Article: 15.6] [Indexed: 11/18/2022] Open
Abstract
Because of the importance of protein phosphorylation in cellular control, much research has been undertaken to predict kinase-specific phosphorylation sites. Our previous work, KinasePhos 1.0, incorporated profile hidden Markov models (HMMs) with the flanking residues of kinase-specific phosphorylation sites. Herein, a new web server, KinasePhos 2.0, incorporates support vector machines (SVMs) with the protein sequence profile and the protein coupling pattern, a novel feature used for identifying phosphorylation sites. The coupling pattern [XdZ] denotes the coupling of amino acid types X and Z that are separated by d amino acids. The differences or quotients of the coupling strength CXdZ between the positive set of phosphorylation sites and the background set of whole protein sequences from Swiss-Prot are computed to determine the number of coupling patterns for training SVM models. After evaluation by k-fold cross-validation and jackknife cross-validation, the average predictive accuracies for phosphorylated serine, threonine, tyrosine and histidine are 90, 93, 88 and 93%, respectively. KinasePhos 2.0 performs better than other tools previously developed. The proposed web server is freely available at http://KinasePhos2.mbc.nctu.edu.tw/.
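To make the coupling pattern [XdZ] concrete, the following Python sketch counts how often residue X is followed by residue Z with exactly d residues in between. Two points are assumptions, not statements from the abstract: "separated by d amino acids" is read as d intervening residues (so Z sits d + 1 positions after X), and a plain relative frequency stands in for the coupling strength, whose actual definition is not given here. The function names are hypothetical.

```python
def coupling_count(sequence, x, z, d):
    """Count occurrences of residue x followed by residue z with d residues between them."""
    gap = d + 1  # assumed reading: d intervening residues
    return sum(
        1
        for i in range(len(sequence) - gap)
        if sequence[i] == x and sequence[i + gap] == z
    )


def coupling_frequency(sequences, x, z, d):
    """Relative frequency of the [x d z] pattern over all position pairs in the set."""
    gap = d + 1
    hits = sum(coupling_count(s, x, z, d) for s in sequences)
    slots = sum(max(len(s) - gap, 0) for s in sequences)
    return hits / slots if slots else 0.0


def coupling_difference(positive_set, background_set, x, z, d):
    """Difference in pattern frequency between positive sites and background sequences."""
    return (coupling_frequency(positive_set, x, z, d)
            - coupling_frequency(background_set, x, z, d))
```

Patterns with a large difference between the positive set and the background would be the natural candidates to keep as features for a classifier.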
Affiliation(s)
- Yung-Hao Wong
- Institute of Bioinformatics, Department of Biological Science and Technology, Core Facility for Structural Bioinformatics, National Chiao Tung University, Hsin-chu 300, Taiwan and Institute of Information Science, Academia Sinica, 128 Sec. 2, Academia Rd, Taipei, Taiwan
| | - Tzong-Yi Lee
- Institute of Bioinformatics, Department of Biological Science and Technology, Core Facility for Structural Bioinformatics, National Chiao Tung University, Hsin-chu 300, Taiwan and Institute of Information Science, Academia Sinica, 128 Sec. 2, Academia Rd, Taipei, Taiwan
| | - Han-Kuen Liang
- Institute of Bioinformatics, Department of Biological Science and Technology, Core Facility for Structural Bioinformatics, National Chiao Tung University, Hsin-chu 300, Taiwan and Institute of Information Science, Academia Sinica, 128 Sec. 2, Academia Rd, Taipei, Taiwan
| | - Chia-Mao Huang
- Institute of Bioinformatics, Department of Biological Science and Technology, Core Facility for Structural Bioinformatics, National Chiao Tung University, Hsin-chu 300, Taiwan and Institute of Information Science, Academia Sinica, 128 Sec. 2, Academia Rd, Taipei, Taiwan
| | - Ting-Yuan Wang
- Institute of Bioinformatics, Department of Biological Science and Technology, Core Facility for Structural Bioinformatics, National Chiao Tung University, Hsin-chu 300, Taiwan and Institute of Information Science, Academia Sinica, 128 Sec. 2, Academia Rd, Taipei, Taiwan
| | - Yi-Huan Yang
- Institute of Bioinformatics, Department of Biological Science and Technology, Core Facility for Structural Bioinformatics, National Chiao Tung University, Hsin-chu 300, Taiwan and Institute of Information Science, Academia Sinica, 128 Sec. 2, Academia Rd, Taipei, Taiwan
| | - Chia-Huei Chu
- Institute of Bioinformatics, Department of Biological Science and Technology, Core Facility for Structural Bioinformatics, National Chiao Tung University, Hsin-chu 300, Taiwan and Institute of Information Science, Academia Sinica, 128 Sec. 2, Academia Rd, Taipei, Taiwan
| | - Hsien-Da Huang
- Institute of Bioinformatics, Department of Biological Science and Technology, Core Facility for Structural Bioinformatics, National Chiao Tung University, Hsin-chu 300, Taiwan and Institute of Information Science, Academia Sinica, 128 Sec. 2, Academia Rd, Taipei, Taiwan
- *To whom correspondence should be addressed. +886 3 5712121 Ext. 56952; +886 3 5729288
| | - Ming-Tat Ko
- Institute of Bioinformatics, Department of Biological Science and Technology, Core Facility for Structural Bioinformatics, National Chiao Tung University, Hsin-chu 300, Taiwan and Institute of Information Science, Academia Sinica, 128 Sec. 2, Academia Rd, Taipei, Taiwan
| | - Jenn-Kang Hwang
- Institute of Bioinformatics, Department of Biological Science and Technology, Core Facility for Structural Bioinformatics, National Chiao Tung University, Hsin-chu 300, Taiwan and Institute of Information Science, Academia Sinica, 128 Sec. 2, Academia Rd, Taipei, Taiwan
|
41
|
Zanzoni A, Ausiello G, Via A, Gherardini PF, Helmer-Citterich M. Phospho3D: a database of three-dimensional structures of protein phosphorylation sites. Nucleic Acids Res 2006; 35:D229-31. [PMID: 17142231 PMCID: PMC1669737 DOI: 10.1093/nar/gkl922] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.1] [Indexed: 12/02/2022] Open
Abstract
Phosphorylation is the most common protein post-translational modification. Phosphorylated residues (serine, threonine and tyrosine) play critical roles in the regulation of many cellular processes. Since the amount of data produced by screening assays is growing continuously, the development of computational tools for collecting and analysing experimental data has become a pivotal task for unravelling the complex network of interactions regulating eukaryotic cell life. Here we present Phospho3D, a database of 3D structures of phosphorylation sites, which stores information retrieved from the phospho.ELM database and is enriched with structural information and annotations at the residue level. The database also collects the results of a large-scale structural comparison procedure providing clues for the identification of new putative phosphorylation sites.
Affiliation(s)
- Andreas Zanzoni
- Centre for Molecular Bioinformatics, Department of Biology University of Rome Tor Vergata, Rome 00133, Italy.
|
42
|
Xue Y, Li A, Wang L, Feng H, Yao X. PPSP: prediction of PK-specific phosphorylation site with Bayesian decision theory. BMC Bioinformatics 2006; 7:163. [PMID: 16549034 PMCID: PMC1435943 DOI: 10.1186/1471-2105-7-163] [Citation(s) in RCA: 160] [Impact Index Per Article: 8.9] [Received: 08/21/2005] [Accepted: 03/20/2006] [Indexed: 11/25/2022] Open
Abstract
Background: As a reversible and dynamic post-translational modification (PTM) of proteins, phosphorylation plays essential regulatory roles in a broad spectrum of biological processes. Although many studies have addressed the molecular mechanism of phosphorylation dynamics, the intrinsic features of substrate specificity are still elusive and remain to be delineated. Results: In this work, we present a novel, versatile and comprehensive program, PPSP (Prediction of PK-specific Phosphorylation site), based on Bayesian decision theory (BDT). PPSP can accurately predict potential phosphorylation sites for ~70 PK (Protein Kinase) groups. PPSP is more accurate and powerful than four existing tools: Scansite, NetPhosK, KinasePhos and GPS. Moreover, PPSP also provides predictions for many novel PKs, such as TRK, mTOR, SyK and MET/RON. The prediction accuracy for these novel PKs is also satisfactory. Conclusion: Taken together, we propose that PPSP could be a powerful tool for experimentalists who focus on identifying phosphorylation substrates and their PK-specific sites. Moreover, the BDT strategy could also be a general approach for other PTMs, such as sumoylation and ubiquitination.
Affiliation(s)
- Yu Xue
- School of Life Science, University of Science and Technology of China, Hefei, Anhui, 230027, China
| | - Ao Li
- Department of Electronic Science and Technology, University of Science and Technology of China, Hefei, Anhui, 230027, China
| | - Lirong Wang
- Department of Electronic Science and Technology, University of Science and Technology of China, Hefei, Anhui, 230027, China
| | - Huanqing Feng
- Department of Electronic Science and Technology, University of Science and Technology of China, Hefei, Anhui, 230027, China
| | - Xuebiao Yao
- School of Life Science, University of Science and Technology of China, Hefei, Anhui, 230027, China
- Department of Physiology, Morehouse School of Medicine, Atlanta, GA 30310, USA
|
43
|
Xue Y, Li A, Wang L, Feng H, Yao X. PPSP: prediction of PK-specific phosphorylation site with Bayesian decision theory. BMC Bioinformatics 2006. [PMID: 16549034 DOI: 10.1186/1471-2105-7-163] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: As a reversible and dynamic post-translational modification (PTM) of proteins, phosphorylation plays essential regulatory roles in a broad spectrum of biological processes. Although many studies have addressed the molecular mechanism of phosphorylation dynamics, the intrinsic features of substrate specificity are still elusive and remain to be delineated. RESULTS: In this work, we present a novel, versatile and comprehensive program, PPSP (Prediction of PK-specific Phosphorylation site), based on Bayesian decision theory (BDT). PPSP can accurately predict potential phosphorylation sites for approximately 70 PK (Protein Kinase) groups. PPSP is more accurate and powerful than four existing tools: Scansite, NetPhosK, KinasePhos and GPS. Moreover, PPSP also provides predictions for many novel PKs, such as TRK, mTOR, SyK and MET/RON. The prediction accuracy for these novel PKs is also satisfactory. CONCLUSION: Taken together, we propose that PPSP could be a powerful tool for experimentalists who focus on identifying phosphorylation substrates and their PK-specific sites. Moreover, the BDT strategy could also be a general approach for other PTMs, such as sumoylation and ubiquitination.
Affiliation(s)
- Yu Xue
- School of Life Science, University of Science and Technology of China, Hefei, Anhui, 230027, China.
|
44
|
Lee TY, Huang HD, Hung JH, Huang HY, Yang YS, Wang TH. dbPTM: an information repository of protein post-translational modification. Nucleic Acids Res 2006; 34:D622-7. [PMID: 16381945 PMCID: PMC1347446 DOI: 10.1093/nar/gkj083] [Citation(s) in RCA: 177] [Impact Index Per Article: 9.8] [Indexed: 11/30/2022] Open
Abstract
dbPTM is a database that compiles information on protein post-translational modifications (PTMs), such as the catalytic sites, solvent accessibility of amino acid residues, protein secondary and tertiary structures, protein domains and protein variations. The database includes all of the experimentally validated PTM sites from Swiss-Prot, PhosphoELM and O-GLYCBASE. Only a small fraction of Swiss-Prot proteins are annotated with experimentally verified PTMs. Although Swiss-Prot provides rich information about PTMs, other structural properties and functional information of proteins are also essential for elucidating protein mechanisms. dbPTM systematically identifies sites of three major types of protein PTM (phosphorylation, glycosylation and sulfation) in Swiss-Prot proteins by refining our previously developed prediction tool, KinasePhos. Solvent accessibility and secondary structure of residues are also computationally predicted and are mapped to the PTM sites. The resource is now freely available.
Affiliation(s)
- Tzong-Yi Lee
- Institute of Bioinformatics, National Chiao Tung University, Hsin-Chu 300, Taiwan
| | - Hsien-Da Huang
- Institute of Bioinformatics, National Chiao Tung University, Hsin-Chu 300, Taiwan
- Department of Biological Science and Technology, National Chiao Tung University, Hsin-Chu 300, Taiwan
- To whom correspondence should be addressed. Tel: +886 3 5712121, ext. 56952
| | - Jui-Hung Hung
- Institute of Bioinformatics, National Chiao Tung University, Hsin-Chu 300, Taiwan
| | - Hsi-Yuan Huang
- Institute of Bioinformatics, National Chiao Tung University, Hsin-Chu 300, Taiwan
| | - Yuh-Shyong Yang
- Department of Biological Science and Technology, National Chiao Tung University, Hsin-Chu 300, Taiwan
- Institute of Biochemical Engineering, National Chiao Tung University, Hsin-Chu 300, Taiwan
| | - Tzu-Hao Wang
- Department of Obstetrics and Gynecology, Chang Gung Memorial Hospital, Lin-Kou Medical Center, Tao-Yuan 333, Taiwan
|