1
|
Lv Z, Wei X, Hu S, Lin G, Qiu W. iSUMO-RsFPN: A predictor for identifying lysine SUMOylation sites based on multi-features and feature pyramid networks. Anal Biochem 2024; 687:115460. [PMID: 38191118 DOI: 10.1016/j.ab.2024.115460] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/18/2023] [Revised: 01/02/2024] [Accepted: 01/03/2024] [Indexed: 01/10/2024]
Abstract
SUMOylation is a protein post-translational modification that plays an essential role in cellular functions. For predicting SUMO sites, numerous researchers have proposed advanced methods based on ordinary machine learning algorithms. These reported methods have shown excellent predictive performance, but there is room for improvement. In this study, we constructed a novel deep neural network Residual Pyramid Network (RsFPN), and developed an ensemble deep learning predictor called iSUMO-RsFPN. Initially, three feature extraction methods were employed to extract features from samples. Following this, weak classifiers were trained based on RsFPN for each feature type. Ultimately, the weak classifiers were integrated to construct the final classifier. Moreover, the predictor underwent systematically testing on an independent test dataset, where the results demonstrated a significant improvement over the existing state-of-the-art predictors. The code of iSUMO-RsFPN is free and available at https://github.com/454170054/iSUMO-RsFPN.
Collapse
Affiliation(s)
- Zhe Lv
- School of Mega Data, Jiangxi Institute of Fashion Technology, 330201, Nanchang, Jiangxi, China
| | - Xin Wei
- Business School, Jiangxi Institute of Fashion Technology, 330201, Nanchang, Jiangxi, China
| | - Siqin Hu
- School of Mega Data, Jiangxi Institute of Fashion Technology, 330201, Nanchang, Jiangxi, China
| | - Gang Lin
- School of Mega Data, Jiangxi Institute of Fashion Technology, 330201, Nanchang, Jiangxi, China
| | - Wangren Qiu
- Computer Department, Jingdezhen Ceramic University, 333403, Jingdezhen, Jiangxi, China.
| |
Collapse
|
2
|
Chen Z, Liu X, Li F, Li C, Marquez-Lago T, Leier A, Webb GI, Xu D, Akutsu T, Song J. Systematic Characterization of Lysine Post-translational Modification Sites Using MUscADEL. Methods Mol Biol 2022; 2499:205-219. [PMID: 35696083 DOI: 10.1007/978-1-0716-2317-6_11] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]
Abstract
Among various types of protein post-translational modifications (PTMs), lysine PTMs play an important role in regulating a wide range of functions and biological processes. Due to the generation and accumulation of enormous amount of protein sequence data by ongoing whole-genome sequencing projects, systematic identification of different types of lysine PTM substrates and their specific PTM sites in the entire proteome is increasingly important and has therefore received much attention. Accordingly, a variety of computational methods for lysine PTM identification have been developed based on the combination of various handcrafted sequence features and machine-learning techniques. In this chapter, we first briefly review existing computational methods for lysine PTM identification and then introduce a recently developed deep learning-based method, termed MUscADEL (Multiple Scalable Accurate Deep Learner for lysine PTMs). Specifically, MUscADEL employs bidirectional long short-term memory (BiLSTM) recurrent neural networks and is capable of predicting eight major types of lysine PTMs in both the human and mouse proteomes. The web server of MUscADEL is publicly available at http://muscadel.erc.monash.edu/ for the research community to use.
Collapse
Affiliation(s)
- Zhen Chen
- Key Laboratory of Rice Biology in Henan Province, Henan Agricultural University, Zhengzhou, China
| | - Xuhan Liu
- Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden, The Netherlands
| | - Fuyi Li
- Department of Microbiology and Immunology, The Peter Doherty Institute for Infection and Immunity, The University of Melbourne, Melbourne, VIC, Australia
| | - Chen Li
- Biomedicine Discovery Institute, Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC, Australia
| | - Tatiana Marquez-Lago
- Department of Genetics, School of Medicine, University of Alabama at Birmingham, Birmingham, AL, USA
- Department of Cell, Developmental and Integrative Biology, School of Medicine, University of Alabama at Birmingham, Birmingham, AL, USA
| | - André Leier
- Department of Genetics, School of Medicine, University of Alabama at Birmingham, Birmingham, AL, USA
- Department of Cell, Developmental and Integrative Biology, School of Medicine, University of Alabama at Birmingham, Birmingham, AL, USA
| | - Geoffrey I Webb
- Monash Centre for Data Science, Faculty of Information Technology, Monash University, Melbourne, VIC, Australia
| | - Dakang Xu
- Faculty of Medical Laboratory Science, Ruijin Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Department of Molecular and Translational Science, Faculty of Medicine, Hudson Institute of Medical Research, Monash University, Melbourne, VIC, Australia
| | - Tatsuya Akutsu
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Kyoto, Japan.
| | - Jiangning Song
- Biomedicine Discovery Institute, Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC, Australia.
- Monash Centre for Data Science, Faculty of Information Technology, Monash University, Melbourne, VIC, Australia.
| |
Collapse
|
3
|
Zhao YW, Zhang S, Ding H. Recent development of machine learning methods in sumoylation sites prediction. Curr Med Chem 2021; 29:894-907. [PMID: 34525906 DOI: 10.2174/0929867328666210915112030] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/10/2021] [Revised: 07/24/2021] [Accepted: 08/07/2021] [Indexed: 11/22/2022]
Abstract
Sumoylation of proteins is an important reversible post-translational modification of proteins and mediates a variety of cellular processes. Sumo-modified proteins can change their subcellular localization, activity and stability. In addition, it also plays an important role in various cellular processes such as transcriptional regulation and signal transduction. The abnormal sumoylation is involved in many diseases, including neurodegeneration and immune-related diseases, as well as the development of cancer. Therefore, identification of the sumoylation site (SUMO site) is fundamental to understanding their molecular mechanisms and regulatory roles. In contrast to labor-intensive and costly experimental approaches, computational prediction of sumoylation sites in silico also attracted much attention for its accuracy, convenience and speed. At present, many computational prediction models have been used to identify SUMO sites, but these contents have not been comprehensively summarized and reviewed. Therefore, the research progress of relevant models is summarized and discussed in this paper. We will briefly summarize the development of bioinformatics methods on sumoylation site prediction. We will mainly focus on the benchmark dataset construction, feature extraction, machine learning method, published results and online tools. We hope the review will provide more help for wet-experimental scholars.
Collapse
Affiliation(s)
- Yi-Wei Zhao
- School of Medicine, University of Electronic Science and Technology of China, Chengdu 610054. China
| | - Shihua Zhang
- College of Life Science and Health, Wuhan University of Science and Technology, Wuhan 430065. China
| | - Hui Ding
- School of Life Science and Technology and Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054. China
| |
Collapse
|
4
|
Chen Z, Liu X, Li F, Li C, Marquez-Lago T, Leier A, Akutsu T, Webb GI, Xu D, Smith AI, Li L, Chou KC, Song J. Large-scale comparative assessment of computational predictors for lysine post-translational modification sites. Brief Bioinform 2019; 20:2267-2290. [PMID: 30285084 PMCID: PMC6954452 DOI: 10.1093/bib/bby089] [Citation(s) in RCA: 85] [Impact Index Per Article: 14.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/14/2018] [Revised: 08/17/2018] [Accepted: 08/18/2018] [Indexed: 12/22/2022] Open
Abstract
Lysine post-translational modifications (PTMs) play a crucial role in regulating diverse functions and biological processes of proteins. However, because of the large volumes of sequencing data generated from genome-sequencing projects, systematic identification of different types of lysine PTM substrates and PTM sites in the entire proteome remains a major challenge. In recent years, a number of computational methods for lysine PTM identification have been developed. These methods show high diversity in their core algorithms, features extracted and feature selection techniques and evaluation strategies. There is therefore an urgent need to revisit these methods and summarize their methodologies, to improve and further develop computational techniques to identify and characterize lysine PTMs from the large amounts of sequence data. With this goal in mind, we first provide a comprehensive survey on a large collection of 49 state-of-the-art approaches for lysine PTM prediction. We cover a variety of important aspects that are crucial for the development of successful predictors, including operating algorithms, sequence and structural features, feature selection, model performance evaluation and software utility. We further provide our thoughts on potential strategies to improve the model performance. Second, in order to examine the feasibility of using deep learning for lysine PTM prediction, we propose a novel computational framework, termed MUscADEL (Multiple Scalable Accurate Deep Learner for lysine PTMs), using deep, bidirectional, long short-term memory recurrent neural networks for accurate and systematic mapping of eight major types of lysine PTMs in the human and mouse proteomes. Extensive benchmarking tests show that MUscADEL outperforms current methods for lysine PTM characterization, demonstrating the potential and power of deep learning techniques in protein PTM prediction. The web server of MUscADEL, together with all the data sets assembled in this study, is freely available at http://muscadel.erc.monash.edu/. We anticipate this comprehensive review and the application of deep learning will provide practical guide and useful insights into PTM prediction and inspire future bioinformatics studies in the related fields.
Collapse
Affiliation(s)
- Zhen Chen
- School of Basic Medical Science, Qingdao University, Dengzhou Road, Qingdao, Shandong, China
| | - Xuhan Liu
- Medicinal Chemistry, Leiden Academic Centre for Drug Research,Einsteinweg, Leiden, The Netherlands
| | - Fuyi Li
- Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Faculty of Medicine, Monash University, Melbourne, VIC, Australia
- ARC Centre of Excellence in Advanced Molecular Imaging, Monash University, Melbourne, VIC, Australia
| | - Chen Li
- Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Faculty of Medicine, Monash University, Melbourne, VIC, Australia
- Institute of Molecular Systems Biology, ETH Zürich,Auguste-Piccard-Hof, Zürich, Switzerland
| | - Tatiana Marquez-Lago
- Department of Genetics, School of Medicine, University of Alabama at Birmingham, AL, USA
- Department of Cell, Developmental and Integrative Biology, School of Medicine, University of Alabama at Birmingham, AL, USA
| | - André Leier
- Department of Genetics, School of Medicine, University of Alabama at Birmingham, AL, USA
- Department of Cell, Developmental and Integrative Biology, School of Medicine, University of Alabama at Birmingham, AL, USA
| | - Tatsuya Akutsu
- Bioinformatics Center, Institute for Chemical Research,Kyoto University, Uji, Kyoto, Japan
| | - Geoffrey I Webb
- Monash Centre for Data Science, Faculty of Information Technology, Monash University, Melbourne, VIC, Australia
| | - Dakang Xu
- Faculty of Medical Laboratory Science, Ruijin Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Department of Molecular and Translational Science, Faculty of Medicine, Hudson Institute of Medical Research, Monash University, Melbourne, VIC, Australia
| | - Alexander Ian Smith
- Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Faculty of Medicine, Monash University, Melbourne, VIC, Australia
- ARC Centre of Excellence in Advanced Molecular Imaging, Monash University, Melbourne, VIC, Australia
| | - Lei Li
- School of Basic Medical Science, Qingdao University, Dengzhou Road, Qingdao, Shandong, China
| | - Kuo-Chen Chou
- Gordon Life Science Institute, Boston, MA, USA
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
| | - Jiangning Song
- Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Faculty of Medicine, Monash University, Melbourne, VIC, Australia
- ARC Centre of Excellence in Advanced Molecular Imaging, Monash University, Melbourne, VIC, Australia
- Monash Centre for Data Science, Faculty of Information Technology, Monash University, Melbourne, VIC, Australia
| |
Collapse
|
5
|
Ning Q, Ma Z, Zhao X. dForml(KNN)-PseAAC: Detecting formylation sites from protein sequences using K-nearest neighbor algorithm via Chou's 5-step rule and pseudo components. J Theor Biol 2019; 470:43-49. [DOI: 10.1016/j.jtbi.2019.03.011] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/27/2019] [Revised: 03/09/2019] [Accepted: 03/13/2019] [Indexed: 10/27/2022]
|
6
|
Sharma A, Lysenko A, López Y, Dehzangi A, Sharma R, Reddy H, Sattar A, Tsunoda T. HseSUMO: Sumoylation site prediction using half-sphere exposures of amino acids residues. BMC Genomics 2019; 19:982. [PMID: 30999862 PMCID: PMC7402407 DOI: 10.1186/s12864-018-5206-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/25/2018] [Accepted: 10/28/2018] [Indexed: 02/06/2023] Open
Abstract
Background Post-translational modifications are viewed as an important mechanism for controlling protein function and are believed to be involved in multiple important diseases. However, their profiling using laboratory-based techniques remain challenging. Therefore, making the development of accurate computational methods to predict post-translational modifications is particularly important for making progress in this area of research. Results This work explores the use of four half-sphere exposure-based features for computational prediction of sumoylation sites. Unlike most of the previously proposed approaches, which focused on patterns of amino acid co-occurrence, we were able to demonstrate that protein structural based features could be sufficiently informative to achieve good predictive performance. The evaluation of our method has demonstrated high sensitivity (0.9), accuracy (0.89) and Matthew’s correlation coefficient (0.78–0.79). We have compared these results to the recently released pSumo-CD method and were able to demonstrate better performance of our method on the same evaluation dataset. Conclusions The proposed predictor HseSUMO uses half-sphere exposures of amino acids to predict sumoylation sites. It has shown promising results on a benchmark dataset when compared with the state-of-the-art method. The extracted data of this study can be accessed at https://github.com/YosvanyLopez/HseSUMO. Electronic supplementary material The online version of this article (10.1186/s12864-018-5206-8) contains supplementary material, which is available to authorized users.
Collapse
Affiliation(s)
- Alok Sharma
- Institute for Integrated and Intelligent Systems, Griffith University, Q, Brisbane, LD-4111, Australia. .,Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, Japan. .,School of Engineering and Physics, Faculty of Science, Technology and Environment, University of the South Pacific, Suva, Fiji Islands.
| | - Artem Lysenko
- Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, Japan
| | - Yosvany López
- Genesis Institute of Genetic Research, Genesis Healthcare Co, Tokyo, Japan
| | - Abdollah Dehzangi
- Department of Computer Science, Morgan State University, Baltimore, MD, USA
| | - Ronesh Sharma
- School of Engineering and Physics, Faculty of Science, Technology and Environment, University of the South Pacific, Suva, Fiji Islands.,School of Electrical and Electronics Engineering, Fiji National University, Suva, Fiji
| | - Hamendra Reddy
- School of Engineering and Physics, Faculty of Science, Technology and Environment, University of the South Pacific, Suva, Fiji Islands
| | - Abdul Sattar
- Institute for Integrated and Intelligent Systems, Griffith University, Q, Brisbane, LD-4111, Australia
| | - Tatsuhiko Tsunoda
- Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, Japan. .,Department of Medical Science Mathematics, Medical Research Institute, Tokyo Medical and Dental University, Tokyo, Japan. .,CREST, JST, Tokyo, 113-8510, Japan.
| |
Collapse
|
7
|
Antoniou-Kourounioti M, Mimmack ML, Porter ACG, Farr CJ. The Impact of the C-Terminal Region on the Interaction of Topoisomerase II Alpha with Mitotic Chromatin. Int J Mol Sci 2019; 20:ijms20051238. [PMID: 30871006 PMCID: PMC6429393 DOI: 10.3390/ijms20051238] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/18/2019] [Revised: 03/04/2019] [Accepted: 03/08/2019] [Indexed: 02/06/2023] Open
Abstract
Type II topoisomerase enzymes are essential for resolving DNA topology problems arising through various aspects of DNA metabolism. In vertebrates two isoforms are present, one of which (TOP2A) accumulates on chromatin during mitosis. Moreover, TOP2A targets the mitotic centromere during prophase, persisting there until anaphase onset. It is the catalytically-dispensable C-terminal domain of TOP2 that is crucial in determining this isoform-specific behaviour. In this study we show that, in addition to the recently identified chromatin tether domain, several other features of the alpha-C-Terminal Domain (CTD). influence the mitotic localisation of TOP2A. Lysine 1240 is a major SUMOylation target in cycling human cells and the efficiency of this modification appears to be influenced by T1244 and S1247 phosphorylation. Replacement of K1240 by arginine results in fewer cells displaying centromeric TOP2A accumulation during prometaphase-metaphase. The same phenotype is displayed by cells expressing TOP2A in which either of the mitotic phosphorylation sites S1213 or S1247 has been substituted by alanine. Conversely, constitutive modification of TOP2A by fusion to SUMO2 exerts the opposite effect. FRAP analysis of protein mobility indicates that post-translational modification of TOP2A can influence the enzyme's residence time on mitotic chromatin, as well as its subcellular localisation.
Collapse
Affiliation(s)
- Melissa Antoniou-Kourounioti
- Department of Genetics, University of Cambridge, Downing St, Cambridge CB2 3EH, UK.
- School of Biological Sciences, University of East Anglia, Norwich Research Park, Norwich NR4 7TJ, UK.
| | - Michael L Mimmack
- Department of Genetics, University of Cambridge, Downing St, Cambridge CB2 3EH, UK.
- Metabolic Research Laboratories, Wellcome Trust-Medical Research Council Institute of Metabolic Science, University of Cambridge, Cambridge CB2 0QQ, UK.
| | - Andrew C G Porter
- Centre for Haematology, Faculty of Medicine, Imperial College London, Hammersmith Hospital Campus, Du Cane Rd, London W12 0NN, UK.
| | - Christine J Farr
- Department of Genetics, University of Cambridge, Downing St, Cambridge CB2 3EH, UK.
| |
Collapse
|
8
|
Dehzangi A, López Y, Taherzadeh G, Sharma A, Tsunoda T. SumSec: Accurate Prediction of Sumoylation Sites Using Predicted Secondary Structure. Molecules 2018; 23:E3260. [PMID: 30544729 PMCID: PMC6320791 DOI: 10.3390/molecules23123260] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/28/2018] [Revised: 11/30/2018] [Accepted: 12/05/2018] [Indexed: 12/13/2022] Open
Abstract
Post Translational Modification (PTM) is defined as the modification of amino acids along the protein sequences after the translation process. These modifications significantly impact on the functioning of proteins. Therefore, having a comprehensive understanding of the underlying mechanism of PTMs turns out to be critical in studying the biological roles of proteins. Among a wide range of PTMs, sumoylation is one of the most important modifications due to its known cellular functions which include transcriptional regulation, protein stability, and protein subcellular localization. Despite its importance, determining sumoylation sites via experimental methods is time-consuming and costly. This has led to a great demand for the development of fast computational methods able to accurately determine sumoylation sites in proteins. In this study, we present a new machine learning-based method for predicting sumoylation sites called SumSec. To do this, we employed the predicted secondary structure of amino acids to extract two types of structural features from neighboring amino acids along the protein sequence which has never been used for this task. As a result, our proposed method is able to enhance the sumoylation site prediction task, outperforming previously proposed methods in the literature. SumSec demonstrated high sensitivity (0.91), accuracy (0.94) and MCC (0.88). The prediction accuracy achieved in this study is 21% better than those reported in previous studies. The script and extracted features are publicly available at: https://github.com/YosvanyLopez/SumSec.
Collapse
Affiliation(s)
- Abdollah Dehzangi
- Department of Computer Science, Morgan State University, Baltimore, MD 21251, USA.
| | - Yosvany López
- Genesis Institute of Genetic Research, Genesis Healthcare Co., Tokyo 150-6015, Japan.
| | - Ghazaleh Taherzadeh
- School of Information and Communication Technology, Griffith University, Gold Coast 4222, Australia.
| | - Alok Sharma
- Institute for Integrated and Intelligent Systems, Griffith University, Brisbane 4111, Australia.
- School of Engineering & Physics, University of the South Pacific, Suva, Fiji.
- Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.
- CREST, JST, Tokyo 102-0076, Japan.
- Department of Medical Science Mathematics, Medical Research Institute, Tokyo Medical and Dental University, Tokyo 113-8510, Japan.
| | - Tatsuhiko Tsunoda
- Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.
- CREST, JST, Tokyo 102-0076, Japan.
- Department of Medical Science Mathematics, Medical Research Institute, Tokyo Medical and Dental University, Tokyo 113-8510, Japan.
| |
Collapse
|
9
|
SUMOgo: Prediction of sumoylation sites on lysines by motif screening models and the effects of various post-translational modifications. Sci Rep 2018; 8:15512. [PMID: 30341374 PMCID: PMC6195521 DOI: 10.1038/s41598-018-33951-5] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/25/2018] [Accepted: 10/08/2018] [Indexed: 12/14/2022] Open
Abstract
Most modern tools used to predict sites of small ubiquitin-like modifier (SUMO) binding (referred to as SUMOylation) use algorithms, chemical features of the protein, and consensus motifs. However, these tools rarely consider the influence of post-translational modification (PTM) information for other sites within the same protein on the accuracy of prediction results. This study applied the Random Forest machine learning method, as well as motif screening models and a feature selection combination mechanism, to develop a SUMOylation prediction system, referred to as SUMOgo. With regard to prediction method, PTM sites were coded as new functional features in addition to structural features, such as sequence-based binary coding, encoded chemical features of proteins, and encoded secondary structure information that is important for PTM. Twenty cycles of prediction were conducted with a 1:1 combination of positive test data and random negative data. Matthew’s correlation coefficient of SUMOgo reached 0.511, which is higher than that of current commonly used tools. This study further verified the important role of PTM in SUMOgo and includes a case study on CREB binding protein (CREBBP). The website for the final tool is http://predictor.nchu.edu.tw/SUMOgo.
Collapse
|
10
|
SVM-SulfoSite: A support vector machine based predictor for sulfenylation sites. Sci Rep 2018; 8:11288. [PMID: 30050050 PMCID: PMC6062547 DOI: 10.1038/s41598-018-29126-x] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/23/2018] [Accepted: 07/02/2018] [Indexed: 12/15/2022] Open
Abstract
Protein S-sulfenylation, which results from oxidation of free thiols on cysteine residues, has recently emerged as an important post-translational modification that regulates the structure and function of proteins involved in a variety of physiological and pathological processes. By altering the size and physiochemical properties of modified cysteine residues, sulfenylation can impact the cellular function of proteins in several different ways. Thus, the ability to rapidly and accurately identify putative sulfenylation sites in proteins will provide important insights into redox-dependent regulation of protein function in a variety of cellular contexts. Though bottom-up proteomic approaches, such as tandem mass spectrometry (MS/MS), provide a wealth of information about global changes in the sulfenylation state of proteins, MS/MS-based experiments are often labor-intensive, costly and technically challenging. Therefore, to complement existing proteomic approaches, researchers have developed a series of computational tools to identify putative sulfenylation sites on proteins. However, existing methods often suffer from low accuracy, specificity, and/or sensitivity. In this study, we developed SVM-SulfoSite, a novel sulfenylation prediction tool that uses support vector machines (SVM) to identify key determinants of sulfenylation among five feature classes: binary code, physiochemical properties, k-space amino acid pairs, amino acid composition and high-quality physiochemical indices. Using 10-fold cross-validation, SVM-SulfoSite achieved 95% sensitivity and 83% specificity, with an overall accuracy of 89% and Matthew’s correlation coefficient (MCC) of 0.79. Likewise, using an independent test set of experimentally identified sulfenylation sites, our method achieved scores of 74%, 62%, 80% and 0.42 for accuracy, sensitivity, specificity and MCC, with an area under the receiver operator characteristic (ROC) curve of 0.81. Moreover, in side-by-side comparisons, SVM-SulfoSite performed as well as or better than existing sulfenylation prediction tools. Together, these results suggest that our method represents a robust and complementary technique for advanced exploration of protein S-sulfenylation.
Collapse
|
11
|
Site-specific mapping of the human SUMO proteome reveals co-modification with phosphorylation. Nat Struct Mol Biol 2017; 24:325-336. [DOI: 10.1038/nsmb.3366] [Citation(s) in RCA: 219] [Impact Index Per Article: 27.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/20/2016] [Accepted: 12/16/2016] [Indexed: 12/18/2022]
|
12
|
Abstract
BACKGROUND Neddylation is a reversible post-translational modification that plays a vital role in maintaining cellular machinery. It is shown to affect localization, binding partners and structure of target proteins. Disruption of protein neddylation was observed in various diseases such as Alzheimer's and cancer. Therefore, understanding the neddylation mechanism and determining neddylation targets possibly bears a huge importance in further understanding the cellular processes. This study is the first attempt to predict neddylated sites from protein sequences by using several sequence and sequence-based structural features. RESULTS We have developed a neddylation site prediction method using a support vector machine based on various sequence properties, position-specific scoring matrices, and disorder. Using 21 amino acid long lysine-centred windows, our model was able to predict neddylation sites successfully, with an average 5-fold stratified cross validation performance of 0.91, 0.91, 0.75, 0.44, 0.95 for accuracy, specificity, sensitivity, Matthew's correlation coefficient and area under curve, respectively. Independent test set results validated the robustness of reported new method. Additionally, we observed that neddylation sites are commonly flexible and there is a significant positively charged amino acid presence in neddylation sites. CONCLUSIONS In this study, a neddylation site prediction method was developed for the first time in literature. Common characteristics of neddylation sites and their discriminative properties were explored for further in silico studies on neddylation. Lastly, up-to-date neddylation dataset was provided for researchers working on post-translational modifications in the accompanying supplementary material of this article.
Collapse
|
13
|
Schönbach C, Tan T, Ranganathan S. InCoB2014: mining biological data from genomics for transforming industry and health. BMC Genomics 2014; 15 Suppl 9:I1. [PMID: 25521539 PMCID: PMC4290585 DOI: 10.1186/1471-2164-15-s9-i1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022] Open
Abstract
The 13th International Conference on Bioinformatics (InCoB2014) was held for the first time in Australia, at Sydney, July 31-2 August, 2014. InCoB is the annual scientific gathering of the Asia-Pacific Bioinformatics Network (APBioNet), hosted since 2002 in the Asia-Pacific region. Of 106 full papers submitted to the BMC track of InCoB2014, 50 (47.2%) were accepted in BMC Bioinformatics, BMC Genomics and BMC Systems Biology supplements, with three papers in a new BMC Medical Genomics supplement. While the majority of presenters and authors were from Asia and Australia, the increasing number of US and European conference attendees augurs well for the international flavour of InCoB. Next year's InCoB will be held jointly with the Genome Informatics Workshop (GIW), September 9-11, 2015 in Tokyo, Japan, with a view to integrate bioinformatics communities in the region.
Collapse
|