Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Huang HD, Lee TY, Tzeng SW, Wu LC, Horng JT, Tsou AP, Huang KT. Incorporating hidden Markov models for identifying protein kinase-specific phosphorylation sites. J Comput Chem 2005;26:1032-41. [PMID: 15889432 DOI: 10.1002/jcc.20235] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.4] [Indexed: 11/11/2022]

Cited by Other Article(s)

Ahammad RU, Nishioka T, Yoshimoto J, Kannon T, Amano M, Funahashi Y, Tsuboi D, Faruk MO, Yamahashi Y, Yamada K, Nagai T, Kaibuchi K. KANPHOS: A Database of Kinase-Associated Neural Protein Phosphorylation in the Brain. Cells 2021;11:47. [PMID: 35011609 PMCID: PMC8750479 DOI: 10.3390/cells11010047] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/03/2021] [Revised: 12/19/2021] [Accepted: 12/21/2021] [Indexed: 12/15/2022]

Pérez-Mejías G, Velázquez-Cruz A, Guerra-Castellano A, Baños-Jaime B, Díaz-Quintana A, González-Arzola K, De la Rosa MÁ, Díaz-Moreno I. Exploring protein phosphorylation by combining computational approaches and biochemical methods. Comput Struct Biotechnol J 2020;18:1852-1863. [PMID: 32728408 PMCID: PMC7369424 DOI: 10.1016/j.csbj.2020.06.043] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 03/30/2020] [Revised: 06/29/2020] [Accepted: 06/30/2020] [Indexed: 12/14/2022]

Affiliation(s)

Gonzalo Pérez-Mejías Instituto de Investigaciones Químicas (IIQ), Centro de Investigaciones Científicas Isla de la Cartuja (cicCartuja), Universidad de Sevilla, Consejo Superior de Investigaciones Científicas (CSIC), Avda., Américo Vespucio 49, Sevilla 41092, Spain
Alejandro Velázquez-Cruz Instituto de Investigaciones Químicas (IIQ), Centro de Investigaciones Científicas Isla de la Cartuja (cicCartuja), Universidad de Sevilla, Consejo Superior de Investigaciones Científicas (CSIC), Avda., Américo Vespucio 49, Sevilla 41092, Spain
Alejandra Guerra-Castellano Instituto de Investigaciones Químicas (IIQ), Centro de Investigaciones Científicas Isla de la Cartuja (cicCartuja), Universidad de Sevilla, Consejo Superior de Investigaciones Científicas (CSIC), Avda., Américo Vespucio 49, Sevilla 41092, Spain
Blanca Baños-Jaime Instituto de Investigaciones Químicas (IIQ), Centro de Investigaciones Científicas Isla de la Cartuja (cicCartuja), Universidad de Sevilla, Consejo Superior de Investigaciones Científicas (CSIC), Avda., Américo Vespucio 49, Sevilla 41092, Spain
Antonio Díaz-Quintana Instituto de Investigaciones Químicas (IIQ), Centro de Investigaciones Científicas Isla de la Cartuja (cicCartuja), Universidad de Sevilla, Consejo Superior de Investigaciones Científicas (CSIC), Avda., Américo Vespucio 49, Sevilla 41092, Spain
Katiuska González-Arzola Instituto de Investigaciones Químicas (IIQ), Centro de Investigaciones Científicas Isla de la Cartuja (cicCartuja), Universidad de Sevilla, Consejo Superior de Investigaciones Científicas (CSIC), Avda., Américo Vespucio 49, Sevilla 41092, Spain
Miguel Ángel De la Rosa Instituto de Investigaciones Químicas (IIQ), Centro de Investigaciones Científicas Isla de la Cartuja (cicCartuja), Universidad de Sevilla, Consejo Superior de Investigaciones Científicas (CSIC), Avda., Américo Vespucio 49, Sevilla 41092, Spain
Irene Díaz-Moreno Instituto de Investigaciones Químicas (IIQ), Centro de Investigaciones Científicas Isla de la Cartuja (cicCartuja), Universidad de Sevilla, Consejo Superior de Investigaciones Científicas (CSIC), Avda., Américo Vespucio 49, Sevilla 41092, Spain

Huang KY, Lee TY, Kao HJ, Ma CT, Lee CC, Lin TH, Chang WC, Huang HD. dbPTM in 2019: exploring disease association and cross-talk of post-translational modifications. Nucleic Acids Res 2020;47:D298-D308. [PMID: 30418626 PMCID: PMC6323979 DOI: 10.1093/nar/gky1074] [Citation(s) in RCA: 134] [Impact Index Per Article: 33.5] [Received: 09/15/2018] [Accepted: 10/19/2018] [Indexed: 12/25/2022]

In Silico Tools and Phosphoproteomic Software Exclusives. Processes (Basel) 2019. [DOI: 10.3390/pr7120869] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 12/13/2022]

Huang KY, Kao HJ, Hsu JBK, Weng SL, Lee TY. Characterization and identification of lysine glutarylation based on intrinsic interdependence between positions in the substrate sites. BMC Bioinformatics 2019;19:384. [PMID: 30717647 PMCID: PMC7394328 DOI: 10.1186/s12859-018-2394-9] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 05/23/2018] [Accepted: 09/25/2018] [Indexed: 01/06/2023]

Abstract

Background

Glutarylation, the addition of a glutaryl group (five carbons) to a lysine residue of a protein molecule, is an important post-translational modification and plays a regulatory role in a variety of physiological and biological processes. As the number of experimentally identified glutarylated peptides increases, it becomes imperative to investigate substrate motifs to enhance the study of protein glutarylation. We carried out a bioinformatics investigation of glutarylation sites based on amino acid composition using a public database containing information on 430 non-homologous glutarylation sites.

Results

The TwoSampleLogo analysis indicates that positively charged and polar amino acids surrounding glutarylated sites may be associated with the substrate-site specificity of protein glutarylation. Additionally, the chi-squared test was used to explore the intrinsic interdependence between two positions around glutarylation sites. Further, maximal dependence decomposition (MDD), which partitions a large-scale dataset into subgroups with statistically significant amino acid conservation, was used to capture motif signatures of glutarylation sites. We considered single features, such as amino acid composition (AAC), amino acid pair composition (AAPC), and composition of k-spaced amino acid pairs (CKSAAP), as well as the effectiveness of incorporating MDD-identified substrate motifs into an integrated prediction model. Evaluation by five-fold cross-validation showed that, among the single features used to train a support vector machine (SVM), AAC was the most effective in discriminating between glutarylation and non-glutarylation sites.
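The composition features named in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `aac` and `kspaced_pairs` and the example peptide window are hypothetical, and the abstract does not state the window size or normalization that was actually used.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aac(window):
    """Amino acid composition: fraction of each standard residue in a window."""
    residues = [r for r in window.upper() if r in AMINO_ACIDS]
    counts = Counter(residues)
    total = len(residues)
    return {aa: (counts[aa] / total if total else 0.0) for aa in AMINO_ACIDS}


def kspaced_pairs(window, k):
    """Count residue pairs separated by exactly k intervening residues.

    With k = 0 this counts adjacent pairs, i.e. the raw counts behind an
    amino acid pair composition.
    """
    window = window.upper()
    return Counter(
        window[i] + window[i + k + 1] for i in range(len(window) - k - 1)
    )


# Hypothetical window centred on a lysine (K); not taken from the dataset.
example = "GSAELKRTEVA"
composition = aac(example)
adjacent = kspaced_pairs(example, 0)
one_gap = kspaced_pairs(example, 1)
```

Each window yields a fixed-length numeric vector (20 values for AAC), which is the kind of input a classifier such as an SVM expects.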

Conclusions

The SVM model integrating MDD-identified substrate motifs performed well, with a sensitivity of 0.677, a specificity of 0.619, an accuracy of 0.638, and a Matthews Correlation Coefficient (MCC) value of 0.28. Using an independent testing dataset (46 glutarylated and 92 non-glutarylated sites) obtained from the literature, we demonstrated that the integrated SVM model could improve the predictive performance effectively, yielding a balanced sensitivity and specificity of 0.652 and 0.739, respectively. This integrated SVM model has been implemented as a web-based system (MDDGlutar), which is now freely available at http://csb.cse.yzu.edu.tw/MDDGlutar/.
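The evaluation measures quoted above follow from a confusion matrix, as the Python sketch below shows. The counts in the example are an assumption: they are back-calculated as the only integer counts consistent with the reported independent-test rates (0.652 of 46 positives, 0.739 of 92 negatives), not figures given in the abstract. The function name is hypothetical.

```python
from math import sqrt


def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, accuracy and MCC from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Assumed counts: 30 of 46 glutarylated sites and 68 of 92 non-glutarylated
# sites predicted correctly (inferred from the rounded rates, not reported).
metrics = classification_metrics(tp=30, fn=16, tn=68, fp=24)
print({name: round(value, 3) for name, value in metrics.items()})
```

Unlike accuracy, the MCC uses all four cells of the confusion matrix, which makes it a common summary when positive and negative classes are unbalanced.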

Electronic supplementary material

The online version of this article (10.1186/s12859-018-2394-9) contains supplementary material, which is available to authorized users.

Zhou X, Qiu YH, He P, Jiang F, Wu LF, Lu X, Lei SF, Deng FY. Why SNP rs227584 is associated with human BMD and fracture risk? A molecular and cellular study in bone cells. J Cell Mol Med 2018;23:898-907. [PMID: 30370607 PMCID: PMC6349212 DOI: 10.1111/jcmm.13991] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 07/07/2018] [Revised: 09/03/2018] [Accepted: 09/29/2018] [Indexed: 11/28/2022]

Affiliation(s)

Xu Zhou Center for Genetic Epidemiology and Genomics, School of Public Health, Soochow University, Suzhou, Jiangsu, China; Jiangsu Key Laboratory of Preventive and Translational Medicine for Geriatric Diseases, School of Public Health, Soochow University, Suzhou, Jiangsu, China
Ying-Hua Qiu Center for Genetic Epidemiology and Genomics, School of Public Health, Soochow University, Suzhou, Jiangsu, China; Jiangsu Key Laboratory of Preventive and Translational Medicine for Geriatric Diseases, School of Public Health, Soochow University, Suzhou, Jiangsu, China
Pei He Center for Genetic Epidemiology and Genomics, School of Public Health, Soochow University, Suzhou, Jiangsu, China; Jiangsu Key Laboratory of Preventive and Translational Medicine for Geriatric Diseases, School of Public Health, Soochow University, Suzhou, Jiangsu, China
Fei Jiang Center for Genetic Epidemiology and Genomics, School of Public Health, Soochow University, Suzhou, Jiangsu, China; Jiangsu Key Laboratory of Preventive and Translational Medicine for Geriatric Diseases, School of Public Health, Soochow University, Suzhou, Jiangsu, China
Long-Fei Wu Center for Genetic Epidemiology and Genomics, School of Public Health, Soochow University, Suzhou, Jiangsu, China; Jiangsu Key Laboratory of Preventive and Translational Medicine for Geriatric Diseases, School of Public Health, Soochow University, Suzhou, Jiangsu, China
Xin Lu Center for Genetic Epidemiology and Genomics, School of Public Health, Soochow University, Suzhou, Jiangsu, China; Jiangsu Key Laboratory of Preventive and Translational Medicine for Geriatric Diseases, School of Public Health, Soochow University, Suzhou, Jiangsu, China
Shu-Feng Lei Center for Genetic Epidemiology and Genomics, School of Public Health, Soochow University, Suzhou, Jiangsu, China; Jiangsu Key Laboratory of Preventive and Translational Medicine for Geriatric Diseases, School of Public Health, Soochow University, Suzhou, Jiangsu, China
Fei-Yan Deng Center for Genetic Epidemiology and Genomics, School of Public Health, Soochow University, Suzhou, Jiangsu, China; Jiangsu Key Laboratory of Preventive and Translational Medicine for Geriatric Diseases, School of Public Health, Soochow University, Suzhou, Jiangsu, China

Weng SL, Kao HJ, Huang CH, Lee TY. MDD-Palm: Identification of protein S-palmitoylation sites with substrate motifs based on maximal dependence decomposition. PLoS One 2017;12:e0179529. [PMID: 28662047 PMCID: PMC5491019 DOI: 10.1371/journal.pone.0179529] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Received: 11/04/2016] [Accepted: 05/31/2017] [Indexed: 12/14/2022]

Abstract

S-palmitoylation, the covalent attachment of 16-carbon palmitic acids to a cysteine residue via a thioester linkage, is an important reversible lipid modification that plays a regulatory role in a variety of physiological and biological processes. As the number of experimentally identified S-palmitoylated peptides increases, it is imperative to investigate substrate motifs to facilitate the study of protein S-palmitoylation. Based on 710 non-homologous S-palmitoylation sites obtained from published databases and the literature, we carried out a bioinformatics investigation of S-palmitoylation sites based on amino acid composition. Two Sample Logo indicates that positively charged and polar amino acids surrounding S-palmitoylated sites may be associated with the substrate site specificity of protein S-palmitoylation. Additionally, maximal dependence decomposition (MDD) was applied to explore the motif signatures of S-palmitoylation sites by categorizing a large-scale dataset into subgroups with statistically significant conservation of amino acids. Single features such as amino acid composition (AAC), amino acid pair composition (AAPC), position specific scoring matrix (PSSM), position weight matrix (PWM), amino acid substitution matrix (BLOSUM62), and accessible surface area (ASA) were considered, along with the effectiveness of incorporating MDD-identified substrate motifs into a two-layered prediction model. Evaluation by five-fold cross-validation showed that a hybrid of AAC and PSSM performs best at discriminating between S-palmitoylation and non-S-palmitoylation sites, according to the support vector machine (SVM). The two-layered SVM model integrating MDD-identified substrate motifs performed well, with a sensitivity of 0.79, specificity of 0.80, accuracy of 0.80, and Matthews Correlation Coefficient (MCC) value of 0.45. 
Using an independent testing dataset (613 S-palmitoylated and 5412 non-S-palmitoylated sites) obtained from the literature, we demonstrated that the two-layered SVM model could outperform other prediction tools, yielding a balanced sensitivity and specificity of 0.690 and 0.694, respectively. This two-layered SVM model has been implemented as a web-based system (MDD-Palm), which is now freely available at http://csb.cse.yzu.edu.tw/MDDPalm/.

Niu T, Liu N, Yu X, Zhao M, Choi HJ, Leo PJ, Brown MA, Zhang L, Pei YF, Shen H, He H, Fu X, Lu S, Chen XD, Tan LJ, Yang TL, Guo Y, Cho NH, Shen J, Guo YF, Nicholson GC, Prince RL, Eisman JA, Jones G, Sambrook PN, Tian Q, Zhu XZ, Papasian CJ, Duncan EL, Uitterlinden AG, Shin CS, Xiang S, Deng HW. Identification of IDUA and WNT16 Phosphorylation-Related Non-Synonymous Polymorphisms for Bone Mineral Density in Meta-Analyses of Genome-Wide Association Studies. J Bone Miner Res 2016;31:358-68. [PMID: 26256109 PMCID: PMC5362379 DOI: 10.1002/jbmr.2687] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 01/23/2015] [Revised: 07/29/2015] [Accepted: 08/06/2015] [Indexed: 11/06/2022]

Abstract

Protein phosphorylation regulates a wide variety of cellular processes. Thus, we hypothesize that single-nucleotide polymorphisms (SNPs) that may modulate protein phosphorylation could affect osteoporosis risk. Based on a previous conventional genome-wide association (GWA) study, we conducted a three-stage meta-analysis targeting phosphorylation-related SNPs (phosSNPs) for femoral neck (FN)-bone mineral density (BMD), total hip (HIP)-BMD, and lumbar spine (LS)-BMD phenotypes. In stage 1, 9593 phosSNPs were meta-analyzed in 11,140 individuals of various ancestries. Genome-wide significance (GWS) and suggestive significance were defined by α = 5.21 × 10^-6 (0.05/9593) and 1.00 × 10^-4, respectively. In stage 2, nine stage 1-discovered phosSNPs (based on α = 1.00 × 10^-4) were in silico meta-analyzed in Dutch, Korean, and Australian cohorts. In stage 3, four phosSNPs that replicated in stage 2 (based on α = 5.56 × 10^-3, 0.05/9) were de novo genotyped in two independent cohorts. IDUA rs3755955 and rs6831280, and WNT16 rs2707466 were associated with BMD phenotypes in each respective stage, and in three stages combined, achieving GWS for both FN-BMD (p = 8.36 × 10^-10, p = 5.26 × 10^-10, and p = 3.01 × 10^-10, respectively) and HIP-BMD (p = 3.26 × 10^-6, p = 1.97 × 10^-6, and p = 1.63 × 10^-12, respectively). Although in vitro studies demonstrated no differences in expression of wild-type and mutant forms of IDUA and WNT16B proteins, in silico analyses predict that WNT16 rs2707466 directly abolishes a phosphorylation site, which could cause a deleterious effect on WNT16 protein, and that IDUA phosSNPs rs3755955 and rs6831280 could exert indirect effects on nearby phosphorylation sites. Further studies will be required to determine the detailed and specific molecular effects of these BMD-associated non-synonymous variants.
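The two corrected significance thresholds in this abstract are Bonferroni-style divisions of 0.05 by the number of tests. A short Python sketch reproduces the arithmetic; the function and variable names are hypothetical, and only the values 0.05, 9593 and 9 come from the abstract.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests


# Stage 1: 9593 phosSNPs tested at a family-wise alpha of 0.05.
stage1_alpha = bonferroni_threshold(0.05, 9593)

# Replication of the nine stage 1-discovered phosSNPs.
replication_alpha = bonferroni_threshold(0.05, 9)

print(f"{stage1_alpha:.2e}")       # 5.21e-06
print(f"{replication_alpha:.2e}")  # 5.56e-03
```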

Affiliation(s)

Tianhua Niu Department of Biostatistics and Bioinformation, School of Public Health and Tropical Medicine, Tulane University, New Orleans, LA, USA
Ning Liu College of Life Science, Hunan Normal University, Changsha, P.R. China
Xun Yu College of Life Science, Hunan Normal University, Changsha, P.R. China
Ming Zhao Department of Biostatistics and Bioinformation, School of Public Health and Tropical Medicine, Tulane University, New Orleans, LA, USA
Hyung Jin Choi Department of Internal Medicine, College of Medicine, Seoul National University, Seoul, Korea; Department of Internal Medicine, Chungbuk National University Hospital, Cheongju, Korea
Paul J Leo University of Queensland Diamantina Institute, Translational Research Institute, Princess Alexandra Hospital, Brisbane, Australia
Matthew A Brown University of Queensland Diamantina Institute, Translational Research Institute, Princess Alexandra Hospital, Brisbane, Australia
Lei Zhang Department of Biostatistics and Bioinformation, School of Public Health and Tropical Medicine, Tulane University, New Orleans, LA, USA; Center of System Biomedical Sciences, University of Shanghai for Science and Technology, Shanghai, P.R. China
Yu-Fang Pei Department of Biostatistics and Bioinformation, School of Public Health and Tropical Medicine, Tulane University, New Orleans, LA, USA
Hui Shen Department of Biostatistics and Bioinformation, School of Public Health and Tropical Medicine, Tulane University, New Orleans, LA, USA
Hao He Department of Biostatistics and Bioinformation, School of Public Health and Tropical Medicine, Tulane University, New Orleans, LA, USA
Xiaoying Fu Department of Biostatistics and Bioinformation, School of Public Health and Tropical Medicine, Tulane University, New Orleans, LA, USA
Shan Lu College of Life Science, Hunan Normal University, Changsha, P.R. China
Xiang-Ding Chen College of Life Science, Hunan Normal University, Changsha, P.R. China
Li-Jun Tan College of Life Science, Hunan Normal University, Changsha, P.R. China
Tie-Lin Yang School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, P.R. China
Yan Guo School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, P.R. China
Nam H Cho Department of Preventive Medicine, Ajou University School of Medicine, Youngtong-Gu, Korea
Jie Shen Third Affiliated Hospital of Southern Medical University, Guangzhou, P.R. China
Yan-Fang Guo Third Affiliated Hospital of Southern Medical University, Guangzhou, P.R. China
Geoffrey C Nicholson School of Medicine, The University of Queensland, Toowoomba, Australia
Richard L Prince School of Medicine and Pharmacology, University of Western Australia, Perth, Australia; Department of Endocrinology and Diabetes, Sir Charles Gairdner Hospital, Perth, Australia
John A Eisman Garvan Institute of Medical Research, University of New South Wales, Sydney, Australia
Graeme Jones Menzies Institute for Medical Research, University of Tasmania, Hobart, Australia
Philip N Sambrook Kolling Institute of Medical Research, Royal North Shore Hospital, University of Sydney, Sydney, Australia
Qing Tian Department of Biostatistics and Bioinformation, School of Public Health and Tropical Medicine, Tulane University, New Orleans, LA, USA
Xue-Zhen Zhu School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, P.R. China
Christopher J Papasian Department of Basic Medical Science, University of Missouri-Kansas City, Kansas City, MO, USA
Emma L Duncan University of Queensland Diamantina Institute, Translational Research Institute, Princess Alexandra Hospital, Brisbane, Australia; Department of Endocrinology, Royal Brisbane and Women's Hospital, Brisbane, Australia
André G Uitterlinden Department of Internal Medicine, Erasmus Medical Center, Rotterdam, The Netherlands; Department of Epidemiology, Erasmus Medical Center, Rotterdam, The Netherlands; Netherlands Genomics Initiative (NGI)-sponsored Netherlands Consortium for Healthy Aging (NCHA), Leiden, The Netherlands
Chan Soo Shin Department of Internal Medicine, College of Medicine, Seoul National University, Seoul, Korea
Shuanglin Xiang College of Life Science, Hunan Normal University, Changsha, P.R. China
Hong-Wen Deng Department of Biostatistics and Bioinformation, School of Public Health and Tropical Medicine, Tulane University, New Orleans, LA, USA; College of Life Science, Hunan Normal University, Changsha, P.R. China

Huang CH, Su MG, Kao HJ, Jhong JH, Weng SL, Lee TY. UbiSite: incorporating two-layered machine learning method with substrate motifs to predict ubiquitin-conjugation site on lysines. BMC Syst Biol 2016;10 Suppl 1:6. [PMID: 26818456 PMCID: PMC4895383 DOI: 10.1186/s12918-015-0246-z] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.9] [Indexed: 01/20/2023]

Abstract

Background

The conjugation of ubiquitin to a substrate protein (protein ubiquitylation), which involves a sequential process (E1 activation, E2 conjugation and E3 ligation), is crucial to the regulation of protein function and activity in eukaryotes. This ubiquitin-conjugation process typically binds the last amino acid of ubiquitin (glycine 76) to a lysine residue of a target protein. The high throughput of mass spectrometry-based proteomics has stimulated large-scale identification of ubiquitin-conjugated peptides. Hence, a new web resource, UbiSite, was developed to identify ubiquitin-conjugation sites on lysines based on a large-scale proteome dataset.

Results

Given a total of 37,647 ubiquitin-conjugated proteins, including 128,026 ubiquitylated peptides, obtained from various resources, this study carries out a large-scale investigation of ubiquitin-conjugation sites based on sequence and structural characteristics. A TwoSampleLogo reveals that a significant depletion of histidine (H), arginine (R) and cysteine (C) residues around ubiquitylation sites may impact the conjugation of ubiquitins in closed three-dimensional environments. Based on the large-scale ubiquitylation dataset, a motif discovery tool, MDDLogo, has been adopted to characterize the potential substrate motifs for ubiquitin conjugation. Not only are single features such as amino acid composition (AAC), positional weighted matrix (PWM), position-specific scoring matrix (PSSM) and solvent-accessible surface area (SASA) considered, but also the effectiveness of incorporating MDDLogo-identified substrate motifs into a two-layered prediction model is taken into account. Evaluation by five-fold cross-validation showed that PSSM is the best feature in discriminating between ubiquitylation and non-ubiquitylation sites, based on support vector machine (SVM). Additionally, the two-layered SVM model integrating MDDLogo-identified substrate motifs achieved an accuracy of 81.06% and a Matthews Correlation Coefficient (MCC) of 0.586. Furthermore, independent testing showed that the two-layered SVM model could outperform other prediction tools, reaching 85.10% sensitivity, 69.69% specificity, 73.69% accuracy and an MCC of 0.483.

Conclusion

The independent testing result indicated the effectiveness of incorporating MDDLogo-identified motifs into the prediction of ubiquitylation sites. In order to provide meaningful assistance to researchers interested in large-scale ubiquitinome data, the two-layered SVM model has been implemented onto a web-based system (UbiSite), which is freely available at http://csb.cse.yzu.edu.tw/UbiSite/. Two cases given in the UbiSite provide a demonstration of effective identification of ubiquitylation sites with reference to substrate motifs.

Electronic supplementary material

The online version of this article (doi:10.1186/s12918-015-0246-z) contains supplementary material, which is available to authorized users.

Kao HJ, Huang CH, Bretaña NA, Lu CT, Huang KY, Weng SL, Lee TY. A two-layered machine learning method to identify protein O-GlcNAcylation sites with O-GlcNAc transferase substrate motifs. BMC Bioinformatics 2015;16 Suppl 18:S10. [PMID: 26680539 PMCID: PMC4682369 DOI: 10.1186/1471-2105-16-s18-s10] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.9] [Indexed: 12/21/2022]

Abstract

Protein O-GlcNAcylation, involving the β-attachment of single N-acetylglucosamine (GlcNAc) to the hydroxyl group of serine or threonine residues, is an O-linked glycosylation catalyzed by O-GlcNAc transferase (OGT). Molecular-level investigation of the basis for OGT's substrate specificity should aid in understanding how O-GlcNAc contributes to diverse cellular processes. Due to an increasing number of O-GlcNAcylated peptides with site-specific information identified by mass spectrometry (MS)-based proteomics, we were motivated to characterize substrate site motifs of O-GlcNAc transferases. In this investigation, a non-redundant dataset of 410 experimentally verified O-GlcNAcylation sites was manually extracted from dbOGAP, OGlycBase and UniProtKB. After detection of conserved motifs by using maximal dependence decomposition, a profile hidden Markov model (profile HMM) was adopted to learn a first-layered model for each identified OGT substrate motif. A Support Vector Machine (SVM) was then used to generate a second-layered model learned from the output values of the profile HMMs in the first layer. The two-layered predictive model was evaluated using five-fold cross-validation, which yielded a sensitivity of 85.4%, a specificity of 84.1%, and an accuracy of 84.7%. Additionally, an independent testing set from PhosphoSitePlus, which was non-homologous to the training data of the predictive model, was used to demonstrate that the proposed method could provide a promising accuracy (84.05%) and outperform other O-GlcNAcylation site prediction tools. A case study indicated that the proposed method could be a feasible means of conducting preliminary analyses of protein O-GlcNAcylation. The method has been implemented as a web-based system, OGTSite, which is now freely available at http://csb.cse.yzu.edu.tw/OGTSite/.

Huang KY, Su MG, Kao HJ, Hsieh YC, Jhong JH, Cheng KH, Huang HD, Lee TY. dbPTM 2016: 10-year anniversary of a resource for post-translational modification of proteins. Nucleic Acids Res 2015;44:D435-46. [PMID: 26578568 PMCID: PMC4702878 DOI: 10.1093/nar/gkv1240] [Citation(s) in RCA: 136] [Impact Index Per Article: 15.1] [Received: 09/16/2015] [Accepted: 11/02/2015] [Indexed: 01/23/2023]

Abstract

Owing to the importance of the post-translational modifications (PTMs) of proteins in regulating biological processes, the dbPTM (http://dbPTM.mbc.nctu.edu.tw/) was developed as a comprehensive database of experimentally verified PTMs from several databases with annotations of potential PTMs for all UniProtKB protein entries. For this 10th anniversary of dbPTM, the updated resource provides not only a comprehensive dataset of experimentally verified PTMs, supported by the literature, but also an integrative interface for accessing all available databases and tools that are associated with PTM analysis. As well as collecting experimental PTM data from 14 public databases, this update manually curates over 12 000 modified peptides, including the emerging S-nitrosylation, S-glutathionylation and succinylation, from approximately 500 research articles, which were retrieved by text mining. As the number of available PTM prediction methods increases, this work compiles a non-homologous benchmark dataset to evaluate the predictive power of online PTM prediction tools. An increasing interest in the structural investigation of PTM substrate sites motivated the mapping of all experimental PTM peptides to protein entries of Protein Data Bank (PDB) based on database identifier and sequence identity, which enables users to examine spatially neighboring amino acids, solvent-accessible surface area and side-chain orientations for PTM substrate sites on tertiary structures. Since drug binding in PDB is annotated, this update identified over 1100 PTM sites that are associated with drug binding. The update also integrates metabolic pathways and protein-protein interactions to support the PTM network analysis for a group of proteins. Finally, the web interface is redesigned and enhanced to facilitate access to this resource.

Bui VM, Lu CT, Ho TT, Lee TY. MDD-SOH: exploiting maximal dependence decomposition to identify S-sulfenylation sites with substrate motifs. Bioinformatics 2015;32:165-72. [PMID: 26411868 DOI: 10.1093/bioinformatics/btv558] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 06/22/2015] [Accepted: 09/18/2015] [Indexed: 01/12/2023]

Abstract

S-sulfenylation (S-sulphenylation, or sulfenic acid), the covalent attachment of S-hydroxyl (-SOH) to cysteine thiol, plays a significant role in redox regulation of protein functions. Although sulfenic acid is transient and labile, most of its physiological activities occur under control of S-hydroxylation. Therefore, discriminating the substrate site of S-sulfenylated proteins is an essential task in computational biology for the furtherance of protein structures and functions. Research into S-sulfenylated protein is currently very limited, and no dedicated tools are available for the computational identification of SOH sites. Given a total of 1096 experimentally verified S-sulfenylated proteins from humans, this study carries out a bioinformatics investigation on SOH sites based on amino acid composition and solvent-accessible surface area. A TwoSampleLogo indicates that the positively and negatively charged amino acids flanking the SOH sites may impact the formulation of S-sulfenylation in closed three-dimensional environments. In addition, the substrate motifs of SOH sites are studied using the maximal dependence decomposition (MDD). Based on the concept of binary classification between SOH and non-SOH sites, Support vector machine (SVM) is applied to learn the predictive model from MDD-identified substrate motifs. According to the evaluation results of 5-fold cross-validation, the integrated SVM model learned from substrate motifs yields an average accuracy of 0.87, significantly improving the prediction of SOH sites. Furthermore, the integrated SVM model also effectively improves the predictive performance in an independent testing set. Finally, the integrated SVM model is applied to implement an effective web resource, named MDD-SOH, to identify SOH sites with their corresponding substrate motifs.
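The 5-fold cross-validation mentioned in this abstract can be sketched in plain Python as follows. This is an illustration only: the abstract does not say how the folds were built, so the stratification by class, the fixed random seed, and the names `stratified_folds` and `train_test_splits` are all assumptions.

```python
import random


def stratified_folds(labels, k=5, seed=0):
    """Assign sample indices to k folds, keeping class proportions similar."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin into the folds.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


def train_test_splits(labels, k=5, seed=0):
    """Yield (train_indices, test_indices); each fold is the test set once."""
    folds = stratified_folds(labels, k, seed)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Toy labels: 10 SOH sites (1) and 20 non-SOH sites (0); not real data.
toy_labels = [1] * 10 + [0] * 20
splits = list(train_test_splits(toy_labels))
```

A model would be trained on each `train` list and scored on the matching `test` list, and the five scores averaged to give a figure comparable to the average accuracy reported above.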

AVAILABILITY AND IMPLEMENTATION

MDD-SOH is now freely available to all interested users at http://csb.cse.yzu.edu.tw/MDDSOH/. All datasets used in this work are also available for download from the website.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

CONTACT

francis@saturn.yzu.edu.tw.

Chen YJ, Lu CT, Huang KY, Wu HY, Chen YJ, Lee TY. GSHSite: exploiting an iteratively statistical method to identify s-glutathionylation sites with substrate specificity. PLoS One 2015;10:e0118752. [PMID: 25849935 PMCID: PMC4388702 DOI: 10.1371/journal.pone.0118752] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/12/2014] [Accepted: 01/06/2015] [Indexed: 01/13/2023] Open

Huang SY, Shi SP, Qiu JD, Liu MC. Using support vector machines to identify protein phosphorylation sites in viruses. J Mol Graph Model 2015;56:84-90. [DOI: 10.1016/j.jmgm.2014.12.005] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/18/2014] [Revised: 12/13/2014] [Accepted: 12/16/2014] [Indexed: 10/24/2022]

Ding TB, Zhong R, Jiang XZ, Liao CY, Xia WK, Liu B, Dou W, Wang JJ. Molecular characterisation of a sodium channel gene and identification of a Phe1538 to Ile mutation in citrus red mite, Panonychus citri. PEST MANAGEMENT SCIENCE 2015;71:266-277. [PMID: 24753229 DOI: 10.1002/ps.3802] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/16/2013] [Revised: 03/02/2014] [Accepted: 04/12/2014] [Indexed: 06/03/2023]

Wu HY, Lu CT, Kao HJ, Chen YJ, Chen YJ, Lee TY. Characterization and identification of protein O-GlcNAcylation sites with substrate specificity. BMC Bioinformatics 2014;15 Suppl 16:S1. [PMID: 25521204 PMCID: PMC4290634 DOI: 10.1186/1471-2105-15-s16-s1] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/23/2023] Open

Abstract

Background

Protein O-GlcNAcylation involves the attachment of a single N-acetylglucosamine (GlcNAc) to the hydroxyl group of serine or threonine residues. Elucidation of O-GlcNAcylation sites on proteins is required in order to decipher its crucial roles in regulating cellular processes and to aid in drug design. With an increasing number of O-GlcNAcylation sites identified by mass spectrometry (MS)-based proteomics, several methods have been proposed for the computational identification of O-GlcNAcylation sites. However, no existing method focuses on the investigation of O-GlcNAcylated substrate motifs. Thus, we were motivated to design a new method for the identification of protein O-GlcNAcylation sites that takes substrate site specificity into account.

Results

In this study, 375 experimentally verified O-GlcNAcylation sites were collected from dbOGAP, which is an integrated resource for protein O-GlcNAcylation. Due to the difficulty in characterizing the substrate motifs by conventional sequence logo analysis, a recursive statistical method was applied to obtain significantly conserved motifs. To construct the predictive models learned from the identified substrate motifs, we adopted support vector machines (SVMs). Five-fold cross-validation was used to evaluate the predictive model, achieving sensitivity, specificity, and accuracy of 0.76, 0.80, and 0.78, respectively. Additionally, an independent testing set, which was truly blind to the training data of the predictive model, was used to demonstrate that the proposed method could provide a promising accuracy (0.94) and outperform three other O-GlcNAcylation site prediction tools.
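For readers unfamiliar with the three measures reported above, the following minimal sketch shows how sensitivity, specificity and accuracy are derived from the counts of a binary confusion matrix. The function name and the counts in the usage note are hypothetical and chosen only so that the arithmetic is easy to follow; they are not the study's data.

```python
def binary_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from confusion-matrix counts.

    tp/fn: positive sites predicted correctly/incorrectly.
    tn/fp: negative sites predicted correctly/incorrectly.
    """
    positives = tp + fn
    negatives = tn + fp
    total = positives + negatives
    sensitivity = tp / positives if positives else 0.0
    specificity = tn / negatives if negatives else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    return sensitivity, specificity, accuracy
```

With a hypothetical balanced set of 100 positive and 100 negative sites, `binary_metrics(76, 24, 80, 20)` gives a sensitivity of 0.76, a specificity of 0.80 and an accuracy of 0.78. Accuracy equals the mean of the other two measures only when the two classes are the same size.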

Conclusion

This work proposed a computational method to identify informative substrate motifs for O-GlcNAcylation sites. Cross-validation and independent testing indicated that the identified motifs were effective in the identification of O-GlcNAcylation sites. A case study demonstrated that the proposed method could be a feasible means of conducting preliminary analyses of protein O-GlcNAcylation. We also anticipate that the revealed substrate motifs may facilitate the study of extensive crosstalk between O-GlcNAcylation and phosphorylation. This method may help unravel their mechanisms and roles in signaling, transcription, chronic disease, and cancer.

An intelligent system for identifying acetylated lysine on histones and nonhistone proteins. BIOMED RESEARCH INTERNATIONAL 2014;2014:528650. [PMID: 25147802 PMCID: PMC4132336 DOI: 10.1155/2014/528650] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 04/11/2014] [Revised: 06/23/2014] [Accepted: 06/24/2014] [Indexed: 01/15/2023]

Huang KY, Wu HY, Chen YJ, Lu CT, Su MG, Hsieh YC, Tsai CM, Lin KI, Huang HD, Lee TY, Chen YJ. RegPhos 2.0: an updated resource to explore protein kinase-substrate phosphorylation networks in mammals. Database (Oxford) 2014;2014:bau034. [PMID: 24771658 PMCID: PMC3999940 DOI: 10.1093/database/bau034] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/15/2014] [Revised: 03/27/2014] [Accepted: 03/30/2014] [Indexed: 11/13/2022]

Affiliation(s)

Kai-Yao Huang Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 320, Taiwan, Institute of Chemistry, Academia Sinica, Taipei 115, Taiwan, Genomics Research Center, Academia Sinica, Taipei 115, Taiwan, Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsin-Chu 300, Taiwan and Department of Biological Science and Technology, National Chiao Tung University, Hsin-Chu 300, Taiwan
Hsin-Yi Wu Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 320, Taiwan, Institute of Chemistry, Academia Sinica, Taipei 115, Taiwan, Genomics Research Center, Academia Sinica, Taipei 115, Taiwan, Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsin-Chu 300, Taiwan and Department of Biological Science and Technology, National Chiao Tung University, Hsin-Chu 300, Taiwan
Yi-Ju Chen Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 320, Taiwan, Institute of Chemistry, Academia Sinica, Taipei 115, Taiwan, Genomics Research Center, Academia Sinica, Taipei 115, Taiwan, Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsin-Chu 300, Taiwan and Department of Biological Science and Technology, National Chiao Tung University, Hsin-Chu 300, Taiwan
Cheng-Tsung Lu Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 320, Taiwan, Institute of Chemistry, Academia Sinica, Taipei 115, Taiwan, Genomics Research Center, Academia Sinica, Taipei 115, Taiwan, Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsin-Chu 300, Taiwan and Department of Biological Science and Technology, National Chiao Tung University, Hsin-Chu 300, Taiwan
Min-Gang Su Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 320, Taiwan, Institute of Chemistry, Academia Sinica, Taipei 115, Taiwan, Genomics Research Center, Academia Sinica, Taipei 115, Taiwan, Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsin-Chu 300, Taiwan and Department of Biological Science and Technology, National Chiao Tung University, Hsin-Chu 300, Taiwan
Yun-Chung Hsieh Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 320, Taiwan, Institute of Chemistry, Academia Sinica, Taipei 115, Taiwan, Genomics Research Center, Academia Sinica, Taipei 115, Taiwan, Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsin-Chu 300, Taiwan and Department of Biological Science and Technology, National Chiao Tung University, Hsin-Chu 300, Taiwan
Chih-Ming Tsai Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 320, Taiwan, Institute of Chemistry, Academia Sinica, Taipei 115, Taiwan, Genomics Research Center, Academia Sinica, Taipei 115, Taiwan, Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsin-Chu 300, Taiwan and Department of Biological Science and Technology, National Chiao Tung University, Hsin-Chu 300, Taiwan
Kuo-I Lin Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 320, Taiwan, Institute of Chemistry, Academia Sinica, Taipei 115, Taiwan, Genomics Research Center, Academia Sinica, Taipei 115, Taiwan, Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsin-Chu 300, Taiwan and Department of Biological Science and Technology, National Chiao Tung University, Hsin-Chu 300, Taiwan
Hsien-Da Huang Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 320, Taiwan, Institute of Chemistry, Academia Sinica, Taipei 115, Taiwan, Genomics Research Center, Academia Sinica, Taipei 115, Taiwan, Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsin-Chu 300, Taiwan and Department of Biological Science and Technology, National Chiao Tung University, Hsin-Chu 300, Taiwan
Tzong-Yi Lee Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 320, Taiwan, Institute of Chemistry, Academia Sinica, Taipei 115, Taiwan, Genomics Research Center, Academia Sinica, Taipei 115, Taiwan, Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsin-Chu 300, Taiwan and Department of Biological Science and Technology, National Chiao Tung University, Hsin-Chu 300, Taiwan
Yu-Ju Chen Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 320, Taiwan, Institute of Chemistry, Academia Sinica, Taipei 115, Taiwan, Genomics Research Center, Academia Sinica, Taipei 115, Taiwan, Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsin-Chu 300, Taiwan and Department of Biological Science and Technology, National Chiao Tung University, Hsin-Chu 300, Taiwan

Xu Y, Wang X, Wang Y, Tian Y, Shao X, Wu LY, Deng N. Prediction of posttranslational modification sites from amino acid sequences with kernel methods. J Theor Biol 2014;344:78-87. [DOI: 10.1016/j.jtbi.2013.11.012] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/28/2013] [Revised: 09/13/2013] [Accepted: 11/16/2013] [Indexed: 01/12/2023]

Huang KY, Lu CT, Bretaña N, Lee TY, Chang TH. ViralPhos: incorporating a recursively statistical method to predict phosphorylation sites on virus proteins. BMC Bioinformatics 2013;14 Suppl 16:S10. [PMID: 24564381 PMCID: PMC3853219 DOI: 10.1186/1471-2105-14-s16-s10] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022] Open

Su MG, Lee TY. Incorporating substrate sequence motifs and spatial amino acid composition to identify kinase-specific phosphorylation sites on protein three-dimensional structures. BMC Bioinformatics 2013;14 Suppl 16:S2. [PMID: 24564522 PMCID: PMC3853090 DOI: 10.1186/1471-2105-14-s16-s2] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/29/2022] Open

Abstract

BACKGROUND

Protein phosphorylation catalyzed by kinases plays crucial regulatory roles in cellular processes. High-throughput mass spectrometry-based experiments have motivated the annotation of catalytic kinases for in vivo phosphorylation sites. Thus, a variety of computational methods have been developed for large-scale prediction of kinase-specific phosphorylation sites. However, most of the proposed methods rely solely on the local amino acid sequences surrounding the phosphorylation sites. An increasing number of three-dimensional structures makes it possible to physically investigate the structural environment of phosphorylation sites.

RESULTS

In this work, all of the experimental phosphorylation sites are mapped to the protein entries of the Protein Data Bank by sequence identity. This resulted in a total of 4508 phosphorylation sites with protein three-dimensional (3D) structures. To identify phosphorylation sites on protein 3D structures, this work combines support vector machines (SVMs) with information on linear motifs and spatial amino acid composition, which is determined for each kinase group by calculating the relative frequencies of the 20 amino acid types within a specific radial distance from the central phosphorylated residue. In cross-validation, most of the kinase-specific models trained with structural information outperform the models that consider only sequence information. Furthermore, an independent testing set that is not included in the training set demonstrates that the proposed method performs comparably to other popular tools.
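The spatial amino acid composition described above can be sketched as follows: count the residues whose coordinates fall within a given radius of the phosphorylated residue and normalise the counts to relative frequencies. This is only an illustration under stated assumptions. The function name, the input layout (one representative coordinate per neighbouring residue, with the central residue itself left out) and the 10 Å default radius are hypothetical; the abstract does not specify which atoms or which radius the authors used.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def spatial_composition(center, residues, radius=10.0):
    """Relative frequencies of the 20 amino acids within `radius` of `center`.

    center   -- (x, y, z) of the phosphorylated residue (assumed representative atom)
    residues -- iterable of (one_letter_code, (x, y, z)) for the other residues
    radius   -- cutoff in the same unit as the coordinates (10.0 is an assumption)
    """
    counts = Counter()
    for aa, coord in residues:
        # Keep only standard residues that lie inside the sphere.
        if aa in AMINO_ACIDS and math.dist(center, coord) <= radius:
            counts[aa] += 1
    total = sum(counts.values())
    return {aa: (counts[aa] / total if total else 0.0) for aa in AMINO_ACIDS}
```

The resulting 20-dimensional vector could be concatenated with sequence-based features before training a kinase-specific classifier.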

CONCLUSION

The proposed method is shown to be capable of predicting kinase-specific phosphorylation sites on 3D structures and has been implemented as a web server, which is freely accessible at http://csb.cse.yzu.edu.tw/PhosK3D/. Because kinase-specific phosphorylation sites with similar sequence motifs are difficult to distinguish, this work also integrates 3D structural information to improve the cross-classifying specificity.

Lu CT, Huang KY, Su MG, Lee TY, Bretaña NA, Chang WC, Chen YJ, Chen YJ, Huang HD. DbPTM 3.0: an informative resource for investigating substrate site specificity and functional association of protein post-translational modifications. Nucleic Acids Res 2012. [PMID: 23193290 PMCID: PMC3531199 DOI: 10.1093/nar/gks1229] [Citation(s) in RCA: 165] [Impact Index Per Article: 13.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/10/2023] Open

Abstract

Protein modification is an extremely important post-translational regulation that adjusts the physical and chemical properties, conformation, stability and activity of a protein, thus altering protein function. Due to the high throughput of mass spectrometry (MS)-based methods in identifying site-specific post-translational modifications (PTMs), dbPTM (http://dbPTM.mbc.nctu.edu.tw/) is updated to integrate experimental PTMs obtained from public resources as well as manually curated MS/MS peptides associated with PTMs from research articles. Version 3.0 of dbPTM aims to be an informative resource for investigating the substrate specificity of PTM sites and the functional association of PTMs between substrates and their interacting proteins. In order to investigate the substrate specificity for modification sites, a newly developed statistical method has been applied to identify the significant substrate motifs for each type of PTM with sufficient experimental data. According to the data statistics in dbPTM, >60% of PTM sites are located in the functional domains of proteins. It is known that most PTMs can create binding sites for specific protein-interaction domains that work together for cellular function. Thus, this update integrates protein–protein interaction and domain–domain interaction to determine the functional association of PTM sites located in protein-interacting domains. Additionally, information on the structural topologies of transmembrane (TM) proteins is integrated in dbPTM in order to delineate the structural correlation between the reported PTM sites and TM topologies. To facilitate the investigation of PTMs on TM proteins, the PTM substrate sites and the structural topology are graphically represented. Literature information related to PTMs, orthologous conservation and substrate motifs of PTMs are also provided in the resource. Finally, this version features an improved web interface to facilitate convenient access to the resource.

Functional analyses of endometriosis-related polymorphisms in the estrogen synthesis and metabolism-related genes. PLoS One 2012;7:e47374. [PMID: 23139742 PMCID: PMC3490981 DOI: 10.1371/journal.pone.0047374] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/04/2012] [Accepted: 09/12/2012] [Indexed: 11/19/2022] Open

Abstract

Endometriosis is determined by genetic factors, and the prevalence of genetic polymorphisms varies greatly depending on the ethnic group studied. The objective of this study was to investigate the relationship between single nucleotide polymorphisms (SNPs) of 9 genes involved in estrogen biosynthesis and metabolism and the risks of endometriosis. Three hundred patients with endometriosis and 337 non-endometriotic controls were recruited. Thirty-four non-synonymous SNPs, which change amino acid residues, were analyzed using matrix-assisted laser desorption-ionization time-of-flight mass spectrometry (MALDI-TOF MS). The functions of SNP-induced amino acid changes were analyzed using multiple web-accessible databases and phosphorylation-predicting algorithms. Among the 34 NCBI-listed SNPs, 22 did not exhibit polymorphism in this study of more than 600 Taiwanese Chinese women. However, homozygous and heterozygous mutants of 4 SNPs - rs6165 (genotype GG+GA, 307(Ala/Ala)+307(Ala/Thr)) of FSHR, rs6166 (genotype GG+GA, 680(Ser/Asn)+680(Ser/Ser)) of FSHR, rs2066479 (genotype AA+AG, 289(Ser/Ser)+289(Ser/Gly)) of HSD17B3 and rs700519 (genotype TT+TC, 264(Cys/Cys)+264(Cys/Arg)) of CYP19, alone or in combination, were significantly associated with decreased risks of endometriosis. Bioinformatics results identified 307(Thr) of FSHR to be a site for O-linked glycosylation, 680(Ser) of FSHR a phosphorylated site by protein kinase B, and 289(Ser) of HSD17B3 a phosphorylated site by protein kinase B or ribosomal protein S6 kinase 1. Results of this study suggest that non-synonymous polymorphisms of FSHR, HSD17B3 and CYP19 genes may modulate the risk of endometriosis in Taiwanese Chinese women. Identification of the endometriosis-preferential non-synonymous SNPs and the conformational changes in those proteins may pave the way for the development of more disease-specific drugs.

Huang JH, Cao DS, Yan J, Xu QS, Hu QN, Liang YZ. Using core hydrophobicity to identify phosphorylation sites of human G protein-coupled receptors. Biochimie 2012;94:1697-704. [PMID: 22503742 DOI: 10.1016/j.biochi.2012.03.022] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/10/2011] [Accepted: 03/28/2012] [Indexed: 01/23/2023]

An Integrated Bayesian Framework for Identifying Phosphorylation Networks in Stimulated Cells. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2012;736:59-80. [DOI: 10.1007/978-1-4419-7210-1_3] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/06/2023]

Trost B, Kusalik A. Computational prediction of eukaryotic phosphorylation sites. Bioinformatics 2011;27:2927-35. [DOI: 10.1093/bioinformatics/btr525] [Citation(s) in RCA: 121] [Impact Index Per Article: 9.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open

Lee TY, Bretaña NA, Lu CT. PlantPhos: using maximal dependence decomposition to identify plant phosphorylation sites with substrate site specificity. BMC Bioinformatics 2011;12:261. [PMID: 21703007 PMCID: PMC3228547 DOI: 10.1186/1471-2105-12-261] [Citation(s) in RCA: 57] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/14/2011] [Accepted: 06/26/2011] [Indexed: 01/18/2023] Open

Abstract

BACKGROUND

Protein phosphorylation catalyzed by kinases plays crucial regulatory roles in intracellular signal transduction. Due to the difficulty in performing high-throughput mass spectrometry-based experiments, there is a desire to predict phosphorylation sites using computational methods. However, previous studies regarding in silico prediction of plant phosphorylation sites do not consider kinase-specific phosphorylation data. Thus, we are motivated to propose a new method that investigates different substrate specificities in plant phosphorylation sites.

RESULTS

Experimentally verified phosphorylation data were extracted from TAIR9, a protein database containing 3006 phosphorylation data from the plant species Arabidopsis thaliana. In an attempt to investigate the various substrate motifs in plant phosphorylation, maximal dependence decomposition (MDD) is employed to cluster a large set of phosphorylation data into subgroups containing significantly conserved motifs. A profile hidden Markov model (HMM) is then applied to learn a predictive model for each subgroup. Cross-validation evaluation on the MDD-clustered HMMs yields an average accuracy of 82.4% for serine, 78.6% for threonine, and 89.0% for tyrosine models. Moreover, independent test results using Arabidopsis thaliana phosphorylation data from UniProtKB/Swiss-Prot show that the proposed models are able to correctly predict 81.4% of phosphoserine, 77.1% of phosphothreonine, and 83.7% of phosphotyrosine sites. Interestingly, several MDD-clustered subgroups are observed to have amino acid conservation similar to the substrate motifs of well-known kinases from Phospho.ELM, a database containing kinase-specific phosphorylation data from multiple organisms.
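To make the clustering idea concrete, here is a heavily simplified sketch of a single MDD-style split on aligned, equal-length sequence windows. For each position it tests, with a Pearson chi-square statistic, how strongly "matches the consensus residue at this position" depends on the residues at every other position, and then splits the windows at the position with the largest summed statistic. The abstract gives no implementation details, so everything here is an assumption for illustration: the function names are hypothetical, and the sketch omits the amino acid grouping, significance thresholds, minimum cluster sizes and recursion that a full MDD procedure would use.

```python
from collections import Counter

def chi_square(pairs):
    """Pearson chi-square statistic for a list of (row_label, col_label) pairs."""
    n = len(pairs)
    rows = Counter(r for r, _ in pairs)
    cols = Counter(c for _, c in pairs)
    observed = Counter(pairs)
    stat = 0.0
    for r in rows:
        for c in cols:
            expected = rows[r] * cols[c] / n
            stat += (observed[(r, c)] - expected) ** 2 / expected
    return stat

def consensus(column):
    """Most frequent residue in a column (first seen wins on ties)."""
    return Counter(column).most_common(1)[0][0]

def mdd_split(windows):
    """One simplified MDD step: return (position, matching, non_matching)."""
    length = len(windows[0])
    best_pos, best_score = 0, -1.0
    for i in range(length):
        cons = consensus([w[i] for w in windows])
        match = [w[i] == cons for w in windows]
        # Sum the dependence between "matches consensus at i" and every other position.
        score = sum(
            chi_square([(m, w[j]) for m, w in zip(match, windows)])
            for j in range(length) if j != i
        )
        if score > best_score:
            best_pos, best_score = i, score
    cons = consensus([w[best_pos] for w in windows])
    matching = [w for w in windows if w[best_pos] == cons]
    non_matching = [w for w in windows if w[best_pos] != cons]
    return best_pos, matching, non_matching
```

In a full pipeline of the kind the abstract describes, the split would be applied recursively to each subgroup, and a separate profile HMM would then be trained on every resulting cluster.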

CONCLUSIONS

This work presents a novel method for identifying plant phosphorylation sites with various substrate motifs. Cross-validation and independent testing show that the MDD-clustered models outperform models trained without using MDD. The proposed method has been implemented as a web-based plant phosphorylation prediction tool, PlantPhos (http://csb.cse.yzu.edu.tw/PlantPhos/). Additionally, two case studies are presented to further evaluate the effectiveness of PlantPhos.

Lee TY, Lin ZQ, Hsieh SJ, Bretaña NA, Lu CT. Exploiting maximal dependence decomposition to identify conserved motifs from a group of aligned signal sequences. Bioinformatics 2011;27:1780-7. [DOI: 10.1093/bioinformatics/btr291] [Citation(s) in RCA: 78] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open

Lee TY, Bo-Kai Hsu J, Chang WC, Huang HD. RegPhos: a system to explore the protein kinase-substrate phosphorylation network in humans. Nucleic Acids Res 2011;39:D777-87. [PMID: 21037261 PMCID: PMC3013804 DOI: 10.1093/nar/gkq970] [Citation(s) in RCA: 60] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/23/2022] Open

Annan RB, Lee AY, Reid ID, Sayad A, Whiteway M, Hallett M, Thomas DY. A biochemical genomics screen for substrates of Ste20p kinase enables the in silico prediction of novel substrates. PLoS One 2009;4:e8279. [PMID: 20020052 PMCID: PMC2791418 DOI: 10.1371/journal.pone.0008279] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/17/2009] [Accepted: 11/19/2009] [Indexed: 01/13/2023] Open

Chang WC, Lee TY, Shien DM, Hsu JBK, Horng JT, Hsu PC, Wang TY, Huang HD, Pan RL. Incorporating support vector machine for identifying protein tyrosine sulfation sites. J Comput Chem 2009;30:2526-37. [PMID: 19373826 DOI: 10.1002/jcc.21258] [Citation(s) in RCA: 71] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]

Shien DM, Lee TY, Chang WC, Hsu JBK, Horng JT, Hsu PC, Wang TY, Huang HD. Incorporating structural characteristics for identification of protein methylation sites. J Comput Chem 2009;30:1532-43. [PMID: 19263424 DOI: 10.1002/jcc.21232] [Citation(s) in RCA: 75] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]

Lee TY, Hsu JBK, Chang WC, Wang TY, Hsu PC, Huang HD. A comprehensive resource for integrating and displaying protein post-translational modifications. BMC Res Notes 2009;2:111. [PMID: 19549291 PMCID: PMC2713254 DOI: 10.1186/1756-0500-2-111] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/18/2008] [Accepted: 06/23/2009] [Indexed: 11/22/2022] Open

Garbino A, van Oort RJ, Dixit SS, Landstrom AP, Ackerman MJ, Wehrens XHT. Molecular evolution of the junctophilin gene family. Physiol Genomics 2009;37:175-86. [PMID: 19318539 DOI: 10.1152/physiolgenomics.00017.2009] [Citation(s) in RCA: 65] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open

Dang TH, Van Leemput K, Verschoren A, Laukens K. Prediction of kinase-specific phosphorylation sites using conditional random fields. Bioinformatics 2008;24:2857-64. [PMID: 18940828 PMCID: PMC2639296 DOI: 10.1093/bioinformatics/btn546] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/22/2022]

Wan J, Kang S, Tang C, Yan J, Ren Y, Liu J, Gao X, Banerjee A, Ellis LBM, Li T. Meta-prediction of phosphorylation sites with weighted voting and restricted grid search parameter selection. Nucleic Acids Res 2008;36:e22. [PMID: 18234718 PMCID: PMC2275094 DOI: 10.1093/nar/gkm848] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/15/2007] [Revised: 08/28/2007] [Accepted: 09/26/2007] [Indexed: 11/21/2022] Open

Affiliation(s)

Ji Wan Department of Neuroscience, Department of Computer Science and Engineering and Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN 55455, USA and Department of Biology and Biochemistry, University of Houston, Houston, TX 77204, USA
Shuli Kang Department of Neuroscience, Department of Computer Science and Engineering and Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN 55455, USA and Department of Biology and Biochemistry, University of Houston, Houston, TX 77204, USA
Chuanning Tang Department of Neuroscience, Department of Computer Science and Engineering and Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN 55455, USA and Department of Biology and Biochemistry, University of Houston, Houston, TX 77204, USA
Jianhua Yan Department of Neuroscience, Department of Computer Science and Engineering and Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN 55455, USA and Department of Biology and Biochemistry, University of Houston, Houston, TX 77204, USA
Yongliang Ren Department of Neuroscience, Department of Computer Science and Engineering and Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN 55455, USA and Department of Biology and Biochemistry, University of Houston, Houston, TX 77204, USA
Jie Liu Department of Neuroscience, Department of Computer Science and Engineering and Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN 55455, USA and Department of Biology and Biochemistry, University of Houston, Houston, TX 77204, USA
Xiaolian Gao Department of Neuroscience, Department of Computer Science and Engineering and Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN 55455, USA and Department of Biology and Biochemistry, University of Houston, Houston, TX 77204, USA
Arindam Banerjee Department of Neuroscience, Department of Computer Science and Engineering and Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN 55455, USA and Department of Biology and Biochemistry, University of Houston, Houston, TX 77204, USA
Lynda B. M. Ellis Department of Neuroscience, Department of Computer Science and Engineering and Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN 55455, USA and Department of Biology and Biochemistry, University of Houston, Houston, TX 77204, USA
Tongbin Li Department of Neuroscience, Department of Computer Science and Engineering and Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN 55455, USA and Department of Biology and Biochemistry, University of Houston, Houston, TX 77204, USA

Shonhai A, Boshoff A, Blatch GL. The structural and functional diversity of Hsp70 proteins from Plasmodium falciparum. Protein Sci 2007;16:1803-18. [PMID: 17766381 PMCID: PMC2206976 DOI: 10.1110/ps.072918107] [Citation(s) in RCA: 74] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/04/2023]

Patterson EK, Watson PH, Hodsman AB, Hendy GN, Canaff L, Bringhurst FR, Poschwatta CH, Fraher LJ. Expression of PTH1R constructs in LLC-PK1 cells: protein nuclear targeting is mediated by the PTH1R NLS. Bone 2007;41:603-10. [PMID: 17627912 DOI: 10.1016/j.bone.2007.04.201] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 11/13/2006] [Revised: 03/01/2007] [Accepted: 04/04/2007] [Indexed: 10/23/2022]

Abstract

This study demonstrates that the PTH1R NLS can target a fusion protein to the nucleus, and that this is blocked by sequences downstream of the NLS. GFP fused to the NLS showed a significant increase in nuclear targeting compared to GFP alone or GFP fused to a peptide of the same length. In previous studies, we demonstrated that the type I PTH/PTHrP receptor (PTH1R) localizes to the nucleus of cells within rat liver, kidney, uterus, ovary and gut. Similarly, nuclear localization of the PTH1R was observed in the cultured osteoblast-like cells MC3T3-E1, UMR106, ROS 17/2.8 and SaOS-2. We have identified a putative bipartite nuclear localization signal (NLS), from residues 471-488 in the protein sequence of the PTH1R. In this study, several PTH1R constructs were made in the Enhanced Green Fluorescent Protein (EGFP) expression vector (Clontech) and transiently transfected into LLC-PK1 Clone 46 cells, and the resultant fusion protein expression was followed by fluorescence microscopy. This particular clone of LLC-PK1 shows no biochemical response in vitro to parathyroid hormone. Constructs included the entire PTH1R sequence (PTH1R-GFP), the putative NLS fused to the C-terminus of GFP (GFP-NLS) or the NLS through to the C-terminus of the PTH1R fused to GFP (GFP-NLSCT). Deconvolution fluorescence microscopy of cells transfected with PTH1R-GFP showed abundant fluorescent signal throughout the cells with distinctly fluorescing plasma membranes. These cells also exhibited an increase in cAMP production in response to hPTH(1-34) (0-10^-8 M), with an increase in cAMP from 11 fmol/µg of protein to 101 fmol/µg. In contrast, cells transfected with the GFP-NLS construct showed significant nuclear sequestration of fluorescence as compared to GFP alone, GFP-NLSCT, or a short amino acid sequence fused to GFP (GFP-FFVAIYCFCNGEVQAEI). These results indicate that the NLS at residues 471-488 of the mature rat PTH1R is functional and plays a role in targeting the PTH1R to the nucleus. Moreover, adding GFP to the C-terminus of the PTH1R still allows cAMP generation, which will be useful for further studies.


Chang EJ, Begum R, Chait BT, Gaasterland T. Prediction of cyclin-dependent kinase phosphorylation substrates. PLoS One 2007;2:e656. [PMID: 17668044 PMCID: PMC1924601 DOI: 10.1371/journal.pone.0000656] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/23/2007] [Accepted: 06/24/2007] [Indexed: 11/18/2022] Open

Wong YH, Lee TY, Liang HK, Huang CM, Wang TY, Yang YH, Chu CH, Huang HD, Ko MT, Hwang JK. KinasePhos 2.0: a web server for identifying protein kinase-specific phosphorylation sites based on sequences and coupling patterns. Nucleic Acids Res 2007;35:W588-94. [PMID: 17517770 PMCID: PMC1933228 DOI: 10.1093/nar/gkm322] [Citation(s) in RCA: 266] [Impact Index Per Article: 15.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open

Affiliation(s)

Yung-Hao Wong, Institute of Bioinformatics, Department of Biological Science and Technology, Core Facility for Structural Bioinformatics, National Chiao Tung University, Hsin-chu 300, Taiwan and Institute of Information Science, Academia Sinica, 128 Sec. 2, Academia Rd, Taipei, Taiwan
Tzong-Yi Lee, Institute of Bioinformatics, Department of Biological Science and Technology, Core Facility for Structural Bioinformatics, National Chiao Tung University, Hsin-chu 300, Taiwan and Institute of Information Science, Academia Sinica, 128 Sec. 2, Academia Rd, Taipei, Taiwan
Han-Kuen Liang, Institute of Bioinformatics, Department of Biological Science and Technology, Core Facility for Structural Bioinformatics, National Chiao Tung University, Hsin-chu 300, Taiwan and Institute of Information Science, Academia Sinica, 128 Sec. 2, Academia Rd, Taipei, Taiwan
Chia-Mao Huang, Institute of Bioinformatics, Department of Biological Science and Technology, Core Facility for Structural Bioinformatics, National Chiao Tung University, Hsin-chu 300, Taiwan and Institute of Information Science, Academia Sinica, 128 Sec. 2, Academia Rd, Taipei, Taiwan
Ting-Yuan Wang, Institute of Bioinformatics, Department of Biological Science and Technology, Core Facility for Structural Bioinformatics, National Chiao Tung University, Hsin-chu 300, Taiwan and Institute of Information Science, Academia Sinica, 128 Sec. 2, Academia Rd, Taipei, Taiwan
Yi-Huan Yang, Institute of Bioinformatics, Department of Biological Science and Technology, Core Facility for Structural Bioinformatics, National Chiao Tung University, Hsin-chu 300, Taiwan and Institute of Information Science, Academia Sinica, 128 Sec. 2, Academia Rd, Taipei, Taiwan
Chia-Huei Chu, Institute of Bioinformatics, Department of Biological Science and Technology, Core Facility for Structural Bioinformatics, National Chiao Tung University, Hsin-chu 300, Taiwan and Institute of Information Science, Academia Sinica, 128 Sec. 2, Academia Rd, Taipei, Taiwan
Hsien-Da Huang, Institute of Bioinformatics, Department of Biological Science and Technology, Core Facility for Structural Bioinformatics, National Chiao Tung University, Hsin-chu 300, Taiwan and Institute of Information Science, Academia Sinica, 128 Sec. 2, Academia Rd, Taipei, Taiwan. *To whom correspondence should be addressed. +886 3 5712121 Ext. 56952; +886 3 5729288
Ming-Tat Ko, Institute of Bioinformatics, Department of Biological Science and Technology, Core Facility for Structural Bioinformatics, National Chiao Tung University, Hsin-chu 300, Taiwan and Institute of Information Science, Academia Sinica, 128 Sec. 2, Academia Rd, Taipei, Taiwan
Jenn-Kang Hwang, Institute of Bioinformatics, Department of Biological Science and Technology, Core Facility for Structural Bioinformatics, National Chiao Tung University, Hsin-chu 300, Taiwan and Institute of Information Science, Academia Sinica, 128 Sec. 2, Academia Rd, Taipei, Taiwan


Zanzoni A, Ausiello G, Via A, Gherardini PF, Helmer-Citterich M. Phospho3D: a database of three-dimensional structures of protein phosphorylation sites. Nucleic Acids Res 2006;35:D229-31. [PMID: 17142231 PMCID: PMC1669737 DOI: 10.1093/nar/gkl922] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/02/2022] Open

Xue Y, Li A, Wang L, Feng H, Yao X. PPSP: prediction of PK-specific phosphorylation site with Bayesian decision theory. BMC Bioinformatics 2006;7:163. [PMID: 16549034 PMCID: PMC1435943 DOI: 10.1186/1471-2105-7-163] [Citation(s) in RCA: 160] [Impact Index Per Article: 8.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/21/2005] [Accepted: 03/20/2006] [Indexed: 11/25/2022] Open

Xue Y, Li A, Wang L, Feng H, Yao X. PPSP: prediction of PK-specific phosphorylation site with Bayesian decision theory. BMC Bioinformatics 2006. [PMID: 16549034 DOI: 10.1186/1471-2105-7-163] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022] Open

Lee TY, Huang HD, Hung JH, Huang HY, Yang YS, Wang TH. dbPTM: an information repository of protein post-translational modification. Nucleic Acids Res 2006;34:D622-7. [PMID: 16381945 PMCID: PMC1347446 DOI: 10.1093/nar/gkj083] [Citation(s) in RCA: 177] [Impact Index Per Article: 9.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022] Open