1
Song T, Yang Q, Qu P, Qiao L, Wang X. Attenphos: General Phosphorylation Site Prediction Model Based on Attention Mechanism. Int J Mol Sci 2024; 25:1526. [PMID: 38338804 PMCID: PMC10855885 DOI: 10.3390/ijms25031526] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2024] [Revised: 01/18/2024] [Accepted: 01/23/2024] [Indexed: 02/12/2024] Open
Abstract
Phosphorylation site prediction has important application value in the field of bioinformatics. It can act as an important reference and help with protein function research, protein structure research, and drug discovery. So, it is of great significance to propose scientific and effective calculation methods to accurately predict phosphorylation sites. In this study, we propose a new method, Attenphos, based on the self-attention mechanism for predicting general phosphorylation sites in proteins. The method not only captures the long-range dependence information of proteins but also better represents the correlation between amino acids through feature vector encoding transformation. Attenphos takes advantage of the one-dimensional convolutional layer to reduce the number of model parameters, improve model efficiency and prediction accuracy, and enhance model generalization. Comparisons between our method and existing state-of-the-art prediction tools were made using balanced datasets from human proteins and unbalanced datasets from mouse proteins. We performed prediction comparisons using independent test sets. The results showed that Attenphos demonstrated the best overall performance in the prediction of Serine (S), Threonine (T), and Tyrosine (Y) sites on both balanced and unbalanced datasets. Compared to current state-of-the-art methods, Attenphos has significantly higher prediction accuracy. This proves the potential of Attenphos in accelerating the identification and functional analysis of protein phosphorylation sites and provides new tools and ideas for biological research and drug discovery.
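As a rough illustration of the self-attention idea the abstract refers to, the sketch below computes scaled dot-product self-attention over a short sequence of residue vectors. It is not the Attenphos architecture: the function names are hypothetical, and it assumes queries, keys, and values all equal the input vectors, with no learned projections and no convolutional layer.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x.

    x is a list of n residue vectors, each of dimension d.
    Returns (outputs, weights), where weights[i][j] is how much
    position i attends to position j.
    """
    d = len(x[0])
    scale = math.sqrt(d)
    outputs, weights = [], []
    for q in x:
        # Similarity of this position to every position, near or far.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in x]
        w = softmax(scores)
        # Weighted average of all value vectors.
        out = [sum(w[j] * x[j][c] for j in range(len(x))) for c in range(d)]
        weights.append(w)
        outputs.append(out)
    return outputs, weights


# Toy encoding: positions 0 and 2 share the same vector.
residues = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
outputs, weights = self_attention(residues)
```

Because every position is scored against every other position, the weight between two residues does not depend on how far apart they are in the sequence, which is the sense in which attention can capture long-range dependence.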
Affiliation(s)
- Xun Wang
  - Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum, Qingdao 266555, China; (T.S.); (Q.Y.); (P.Q.); (L.Q.)
2
Grunfeld N, Levine E, Libby E. Experimental measurement and computational prediction of bacterial Hanks-type Ser/Thr signaling system regulatory targets. Mol Microbiol 2024. [PMID: 38167835 DOI: 10.1111/mmi.15220] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2023] [Revised: 12/15/2023] [Accepted: 12/17/2023] [Indexed: 01/05/2024]
Abstract
Bacteria possess diverse classes of signaling systems that they use to sense and respond to their environments and execute properly timed developmental transitions. One widespread and evolutionarily ancient class of signaling systems are the Hanks-type Ser/Thr kinases, also sometimes termed "eukaryotic-like" due to their homology with eukaryotic kinases. In diverse bacterial species, these signaling systems function as critical regulators of general cellular processes such as metabolism, growth and division, developmental transitions such as sporulation, biofilm formation, and virulence, as well as antibiotic tolerance. This multifaceted regulation is due to the ability of a single Hanks-type Ser/Thr kinase to post-translationally modify the activity of multiple proteins, resulting in the coordinated regulation of diverse cellular pathways. However, in part due to their deep integration with cellular physiology, to date, we have a relatively limited understanding of the timing, regulatory hierarchy, the complete list of targets of a given kinase, as well as the potential regulatory overlap between the often multiple kinases present in a single organism. In this review, we discuss experimental methods and curated datasets aimed at elucidating the targets of these signaling pathways and approaches for using these datasets to develop computational models for quantitative predictions of target motifs. We emphasize novel approaches and opportunities for collecting data suitable for the creation of new predictive computational models applicable to diverse species.
Affiliation(s)
- Noam Grunfeld
  - Department of Bioengineering, Northeastern University, Boston, Massachusetts, USA
- Erel Levine
  - Department of Bioengineering, Northeastern University, Boston, Massachusetts, USA
  - Department of Chemical Engineering, Northeastern University, Boston, Massachusetts, USA
- Elizabeth Libby
  - Department of Bioengineering, Northeastern University, Boston, Massachusetts, USA
3
Varshney N, Mishra AK. Deep Learning in Phosphoproteomics: Methods and Application in Cancer Drug Discovery. Proteomes 2023; 11:proteomes11020016. [PMID: 37218921 DOI: 10.3390/proteomes11020016] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/13/2023] [Revised: 04/24/2023] [Accepted: 04/25/2023] [Indexed: 05/24/2023] Open
Abstract
Protein phosphorylation is a key post-translational modification (PTM) that is a central regulatory mechanism of many cellular signaling pathways. Several protein kinases and phosphatases precisely control this biochemical process. Defects in the functions of these proteins have been implicated in many diseases, including cancer. Mass spectrometry (MS)-based analysis of biological samples provides in-depth coverage of the phosphoproteome. A large amount of MS data available in public repositories has unveiled big data in the field of phosphoproteomics. To address the challenges associated with handling large data and expanding confidence in phosphorylation site prediction, the development of many computational algorithms and machine learning-based approaches has gained momentum in recent years. Together, the emergence of experimental methods with high resolution and sensitivity and data mining algorithms has provided robust analytical platforms for quantitative proteomics. In this review, we compile a comprehensive collection of bioinformatic resources used for the prediction of phosphorylation sites, and their potential therapeutic applications in the context of cancer.
Affiliation(s)
- Neha Varshney
  - Division of Biological Sciences, Department of Cellular and Molecular Medicine, University of California, San Diego, CA 93093, USA
  - Ludwig Institute for Cancer Research, La Jolla, CA 92093, USA
- Abhinava K Mishra
  - Molecular, Cellular and Developmental Biology Department, University of California, Santa Barbara, CA 93106, USA
4
Hou Z, Liu H. Mapping the Protein Kinome: Current Strategy and Future Direction. Cells 2023; 12:cells12060925. [PMID: 36980266 PMCID: PMC10047437 DOI: 10.3390/cells12060925] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2023] [Revised: 02/23/2023] [Accepted: 03/13/2023] [Indexed: 03/30/2023] Open
Abstract
The kinome includes over 500 different protein kinases, which form an integrated kinase network that regulates cellular phosphorylation signals. The kinome plays a central role in almost every cellular process and has strong linkages with many diseases. Thus, evaluating the cellular kinome in its physiological environment is essential for understanding biological processes and disease development and for targeting therapy. Currently, a number of strategies for kinome analysis have been developed, which are based on monitoring the phosphorylation of kinases or substrates. They have enabled researchers to tackle increasingly complex biological problems and pathological processes, and have promoted the development of kinase inhibitors. Additionally, with the increasing interest in how kinases participate in biological processes at spatial scales, it has become urgent to develop tools to estimate spatial kinome activity. With multidisciplinary efforts, a growing number of novel approaches have the potential to be applied to spatial kinome analysis. In this paper, we review widely used methods for kinome analysis and the challenges encountered in their applications. We also explore potential approaches that may benefit spatial kinome studies.
Affiliation(s)
- Zhanwu Hou
  - Center for Mitochondrial Biology and Medicine, Douglas C. Wallace Institute for Mitochondrial and Epigenetic Information Sciences, The Key Laboratory of Biomedical Information Engineering of Ministry of Education, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an 710049, China
- Huadong Liu
  - School of Health and Life Science, University of Health and Rehabilitation Sciences, Qingdao 266071, China
5
Parker Cates Z, Facciuolo A, Hogan D, Griebel PJ, Napper S, Kusalik AJ. EPIphany—A Platform for Analysis and Visualization of Peptide Immunoarray Data. Front Bioinform 2021; 1:694324. [PMID: 36303765 PMCID: PMC9581008 DOI: 10.3389/fbinf.2021.694324] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/12/2021] [Accepted: 06/25/2021] [Indexed: 11/13/2022] Open
Abstract
Antibodies are critical effector molecules of the humoral immune system. Upon infection or vaccination, populations of antibodies are generated which bind to various regions of the invading pathogen or exogenous agent. Defining the reactivity and breadth of this antibody response provides an understanding of the antigenic determinants and enables the rational development and assessment of vaccine candidates. High-resolution analysis of these populations typically requires advanced techniques such as B cell receptor repertoire sequencing, mass spectrometry of isolated immunoglobulins, or phage display libraries that are dependent upon equipment and expertise which are prohibitive for many labs. High-density peptide microarrays representing diverse populations of putative linear epitopes (immunoarrays) are an effective alternative for high-throughput examination of antibody reactivity and diversity. While a promising technology, widespread adoption of immunoarrays has been limited by the need for, and relative absence of, user-friendly tools for consideration and visualization of the emerging data. To address this limitation, we developed EPIphany, a software platform with a simple web-based user interface, aimed at biological users, that provides access to important analysis parameters, data normalization options, and a variety of unique data visualization options. This platform provides researchers the greatest opportunity to extract biologically meaningful information from the immunoarray data, thereby facilitating the discovery and development of novel immuno-therapeutics.
Affiliation(s)
- Zoe Parker Cates
  - Department of Computer Science, University of Saskatchewan, Saskatoon, SK, Canada
- Antonio Facciuolo
  - Vaccine and Infectious Disease Organization (VIDO), University of Saskatchewan, Saskatoon, SK, Canada
- Daniel Hogan
  - Department of Computer Science, University of Saskatchewan, Saskatoon, SK, Canada
- Philip J. Griebel
  - Vaccine and Infectious Disease Organization (VIDO), University of Saskatchewan, Saskatoon, SK, Canada
  - School of Public Health, University of Saskatchewan, Saskatoon, SK, Canada
- Scott Napper
  - Vaccine and Infectious Disease Organization (VIDO), University of Saskatchewan, Saskatoon, SK, Canada
  - Department of Biochemistry, Microbiology and Immunology, University of Saskatchewan, Saskatoon, SK, Canada
  - *Correspondence: Scott Napper,
- Anthony J. Kusalik
  - Department of Computer Science, University of Saskatchewan, Saskatoon, SK, Canada
6
Arafat ME, Ahmad MW, Shovan S, Dehzangi A, Dipta SR, Hasan MAM, Taherzadeh G, Shatabda S, Sharma A. Accurately Predicting Glutarylation Sites Using Sequential Bi-Peptide-Based Evolutionary Features. Genes (Basel) 2020; 11:E1023. [PMID: 32878321 PMCID: PMC7565944 DOI: 10.3390/genes11091023] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 07/01/2020] [Revised: 08/19/2020] [Accepted: 08/27/2020] [Indexed: 02/07/2023] Open
Abstract
Post-translational modification (PTM) is defined as the alteration of a protein sequence upon interaction with different macromolecules after the translation process. Glutarylation is considered one of the most important PTMs and is associated with a wide range of cellular functions, including metabolism, translation, and specific subcellular localizations. During the past few years, a wide range of computational approaches has been proposed to predict glutarylation sites. However, despite these efforts, the prediction performance for glutarylation sites has remained limited. One of the main challenges is to extract features with significant discriminatory information. To address this issue, we propose a new machine learning method called BiPepGlut, which uses a bi-peptide-based evolutionary method for feature extraction. To build this model, we also use the Extra-Trees (ET) classifier, which, to the best of our knowledge, has never been used for this task. Our results demonstrate that BiPepGlut significantly outperforms previously proposed models. BiPepGlut achieves 92.0%, 84.8%, 95.6%, 0.82, and 0.88 in accuracy, sensitivity, specificity, Matthews correlation coefficient, and F1-score, respectively. BiPepGlut is implemented as a publicly available online predictor.
Affiliation(s)
- Md. Easin Arafat
  - Department of Computer Science and Engineering, United International University, Dhaka 1212, Bangladesh; (M.E.A.); (M.W.A.); (S.R.D.)
- Md. Wakil Ahmad
  - Department of Computer Science and Engineering, United International University, Dhaka 1212, Bangladesh; (M.E.A.); (M.W.A.); (S.R.D.)
- S.M. Shovan
  - Department of Computer Science and Engineering, Rajshahi University of Engineering and Technology, Rajshahi 6204, Bangladesh; (S.M.S.); (M.A.M.H.)
- Abdollah Dehzangi
  - Department of Computer Science, Rutgers University, Camden, NJ 08102, USA
  - Center for Computational and Integrative Biology, Rutgers University, Camden, NJ 08102, USA
- Shubhashis Roy Dipta
  - Department of Computer Science and Engineering, United International University, Dhaka 1212, Bangladesh; (M.E.A.); (M.W.A.); (S.R.D.)
- Md. Al Mehedi Hasan
  - Department of Computer Science and Engineering, Rajshahi University of Engineering and Technology, Rajshahi 6204, Bangladesh; (S.M.S.); (M.A.M.H.)
- Ghazaleh Taherzadeh
  - Institute for Bioscience and Biotechnology Research, University of Maryland, College Park, MD 20742, USA
- Swakkhar Shatabda
  - Department of Computer Science and Engineering, United International University, Dhaka 1212, Bangladesh; (M.E.A.); (M.W.A.); (S.R.D.)
- Alok Sharma
  - Institute for Integrated and Intelligent Systems, Griffith University, Brisbane, QLD 4111, Australia
  - Department of Medical Science Mathematics, Tokyo Medical and Dental University (TMDU), Tokyo 113-8510, Japan
  - Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan
  - School of Engineering and Physics, Faculty of Science Technology and Environment, University of the South Pacific, Suva, Fiji
7
Savage SR, Zhang B. Using phosphoproteomics data to understand cellular signaling: a comprehensive guide to bioinformatics resources. Clin Proteomics 2020; 17:27. [PMID: 32676006 PMCID: PMC7353784 DOI: 10.1186/s12014-020-09290-x] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 12/01/2019] [Accepted: 07/04/2020] [Indexed: 12/19/2022] Open
Abstract
Mass spectrometry-based phosphoproteomics is becoming an essential methodology for the study of global cellular signaling. Numerous bioinformatics resources are available to facilitate the translation of phosphopeptide identification and quantification results into novel biological and clinical insights, a critical step in phosphoproteomics data analysis. These resources include knowledge bases of kinases and phosphatases, phosphorylation sites, kinase inhibitors, and sequence variants affecting kinase function, and bioinformatics tools that can predict phosphorylation sites in addition to the kinase that phosphorylates them, infer kinase activity, and predict the effect of mutations on kinase signaling. However, these resources exist in silos and it is challenging to select among multiple resources with similar functions. Therefore, we put together a comprehensive collection of resources related to phosphoproteomics data interpretation, compared the use of tools with similar functions, and assessed the usability from the standpoint of typical biologists or clinicians. Overall, tools could be improved by standardization of enzyme names, flexibility of data input and output format, consistent maintenance, and detailed manuals.
Affiliation(s)
- Sara R. Savage
  - Department of Biomedical Informatics, Vanderbilt University, Nashville, TN USA
  - Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX USA
- Bing Zhang
  - Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX USA
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX USA
8
Luo F, Wang M, Liu Y, Zhao XM, Li A. DeepPhos: prediction of protein phosphorylation sites with deep learning. Bioinformatics 2020; 35:2766-2773. [PMID: 30601936 PMCID: PMC6691328 DOI: 10.1093/bioinformatics/bty1051] [Citation(s) in RCA: 97] [Impact Index Per Article: 24.3] [Received: 09/14/2018] [Revised: 11/19/2018] [Accepted: 12/12/2018] [Indexed: 11/28/2022] Open
Abstract
Motivation: Phosphorylation is the most studied post-translational modification and is crucial for multiple biological processes. Recently, many efforts have been made to develop computational predictors for phosphorylation site prediction, but most of them are based on feature selection and discriminative classification. Thus, it is useful to develop a novel and highly accurate predictor that can unveil intricate patterns automatically for protein phosphorylation sites. Results: In this study we present DeepPhos, a novel deep learning architecture for prediction of protein phosphorylation. Unlike multi-layer convolutional neural networks, DeepPhos consists of densely connected convolutional neural network blocks, which capture multiple representations of sequences and make the final phosphorylation prediction through intra-block and inter-block concatenation layers. DeepPhos can also be used for kinase-specific prediction at the group, family, subfamily and individual kinase levels. The experimental results demonstrated that DeepPhos outperforms competitive predictors in general and kinase-specific phosphorylation site prediction. Availability and implementation: The source code of DeepPhos is publicly deposited at https://github.com/USTCHIlab/DeepPhos. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Fenglin Luo
  - School of Information Science and Technology
- Minghui Wang
  - School of Information Science and Technology; Centers for Biomedical Engineering, University of Science and Technology of China, Hefei AH, China
- Yu Liu
  - School of Information Science and Technology
- Xing-Ming Zhao
  - Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China
- Ao Li
  - School of Information Science and Technology; Centers for Biomedical Engineering, University of Science and Technology of China, Hefei AH, China
9
Facciuolo A, Denomy C, Lipsit S, Kusalik A, Napper S. From Beef to Bees: High-Throughput Kinome Analysis to Understand Host Responses of Livestock Species to Infectious Diseases and Industry-Associated Stress. Front Immunol 2020; 11:765. [PMID: 32499776 PMCID: PMC7243914 DOI: 10.3389/fimmu.2020.00765] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 02/04/2020] [Accepted: 04/06/2020] [Indexed: 11/13/2022] Open
Abstract
Within human health research, the remarkable utility of kinase inhibitors as therapeutics has motivated efforts to understand biology at the level of global cellular kinase activity (the kinome). In contrast, the diminished potential for using kinase inhibitors in food animals has dampened efforts to translate this research approach to livestock species. This, in our opinion, was a lost opportunity for livestock researchers given the unique potential of kinome analysis to offer insight into complex biology. To remedy this situation, our lab developed user-friendly, cost-effective approaches for kinome analysis that can be readily incorporated into most research programs but with a specific priority to enable the technology to livestock researchers. These contributions include the development of custom software programs for the creation of species-specific kinome arrays as well as comprehensive deconvolution and analysis of kinome array data. Presented in this review are examples of the application of kinome analysis to highlight the utility of the technology to further our understanding of two key complex biological events of priority to the livestock industry: host immune responses to infectious diseases and animal stress responses. These advances and examples of application aim to provide both mechanisms and motivation for researchers, particularly livestock researchers, to incorporate kinome analysis into their research programs.
Affiliation(s)
- Antonio Facciuolo
  - Vaccine and Infectious Disease Organization - International Vaccine Centre, University of Saskatchewan, Saskatoon, SK, Canada
- Connor Denomy
  - Vaccine and Infectious Disease Organization - International Vaccine Centre, University of Saskatchewan, Saskatoon, SK, Canada; Department of Computer Science, University of Saskatchewan, Saskatoon, SK, Canada
- Sean Lipsit
  - Vaccine and Infectious Disease Organization - International Vaccine Centre, University of Saskatchewan, Saskatoon, SK, Canada; Department of Biochemistry, Microbiology and Immunology, University of Saskatchewan, Saskatoon, SK, Canada
- Anthony Kusalik
  - Department of Computer Science, University of Saskatchewan, Saskatoon, SK, Canada
- Scott Napper
  - Vaccine and Infectious Disease Organization - International Vaccine Centre, University of Saskatchewan, Saskatoon, SK, Canada; Department of Biochemistry, Microbiology and Immunology, University of Saskatchewan, Saskatoon, SK, Canada
10
Pagano GJ, Arsenault RJ. Advances, challenges and tools in characterizing bacterial serine, threonine and tyrosine kinases and phosphorylation target sites. Expert Rev Proteomics 2019; 16:431-441. [DOI: 10.1080/14789450.2019.1601015] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 10/27/2022]
Affiliation(s)
- Giovanni J. Pagano
  - Center for Bioinformatics & Computational Biology, University of Delaware, Newark, DE, USA
- Ryan J. Arsenault
  - Department of Animal and Food Sciences, University of Delaware, Newark, DE, USA
11
Abstract
In vivo, one of the most efficient biological mechanisms for expanding the genetic code and regulating cellular physiology is protein post-translational modification (PTM). Because PTMs can provide very useful information for both basic research and drug development, identification of PTM sites in proteins has become a very important topic in bioinformatics. Lysine residues in proteins can be subject to many types of PTM, such as acetylation, succinylation, methylation, and propionylation. To cope with the huge number of protein sequences, the present study is devoted to developing computational techniques that can predict multiple K-type modifications of any uncharacterized protein in a timely and effective manner. In this work, we propose a method that handles acetylation and succinylation prediction as a multilabel learning problem. Three feature constructions, including sequence and physicochemical properties, were applied. The multilabel learning algorithm RankSVM was used for PTMs for the first time. In 10-fold cross-validation, the predictor with physicochemical property encoding achieved an accuracy of 73.86% and an absolute-true rate of 64.70%, both better than the other feature constructions. We compared our predictor with other multilabel algorithms and with the existing predictor iPTM-Lys, and its results were better than those of the other methods. We also analyzed the acetylation and succinylation peptides to help illustrate the results.
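The two multilabel measures quoted above are usually defined per sample on label sets: accuracy as the size of the intersection of true and predicted labels divided by the size of their union, and absolute-true as the fraction of samples whose predicted set matches the true set exactly. The sketch below follows those common definitions; the function name, the label strings, and the treatment of two empty sets as a perfect match are assumptions, and the paper's exact formulas may differ.

```python
def multilabel_scores(true_sets, pred_sets):
    """Return (accuracy, absolute_true) for paired lists of label sets."""
    if len(true_sets) != len(pred_sets):
        raise ValueError("true_sets and pred_sets must have equal length")
    n = len(true_sets)
    acc_sum = 0.0
    exact = 0
    for truth, pred in zip(true_sets, pred_sets):
        union = truth | pred
        # Assumption: two empty label sets count as full agreement.
        acc_sum += len(truth & pred) / len(union) if union else 1.0
        exact += int(truth == pred)
    return acc_sum / n, exact / n


# Hypothetical lysine sites labelled with acetylation ("ac") and/or
# succinylation ("su"); values are for illustration only.
truth = [{"ac"}, {"ac", "su"}, {"su"}, {"ac"}]
pred = [{"ac"}, {"ac"}, {"su"}, {"su"}]
accuracy, absolute_true = multilabel_scores(truth, pred)
```

Absolute-true is the stricter of the two: a site carrying both modifications that is predicted as acetylated only earns partial credit under accuracy but none under absolute-true, which is consistent with the absolute-true figure in the abstract being lower than the accuracy.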
Affiliation(s)
- Yan Xu
  - Department of Information and Computer Science, University of Science and Technology Beijing, Beijing 100083, China
- Yingxi Yang
  - Department of Information and Computer Science, University of Science and Technology Beijing, Beijing 100083, China
- Zu Wang
  - Department of Information and Computer Science, University of Science and Technology Beijing, Beijing 100083, China
- Yuanhai Shao
  - School of Economics and Management, Hainan University, Haikou 570228, China
12
Xu Y, Yang Y, Ding J, Li C. iGlu-Lys: A Predictor for Lysine Glutarylation Through Amino Acid Pair Order Features. IEEE Trans Nanobioscience 2018; 17:394-401. [DOI: 10.1109/tnb.2018.2848673] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Indexed: 11/09/2022]
13
Xu Y, Song J, Wilson C, Whisstock JC. PhosContext2vec: a distributed representation of residue-level sequence contexts and its application to general and kinase-specific phosphorylation site prediction. Sci Rep 2018; 8:8240. [PMID: 29844483 DOI: 10.1038/s41598-018-26392-7] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 11/14/2017] [Accepted: 05/10/2018] [Indexed: 11/28/2022] Open
Abstract
Phosphorylation is the most important type of protein post-translational modification. Accordingly, reliable identification of kinase-mediated phosphorylation has important implications for functional annotation of phosphorylated substrates and characterization of cellular signalling pathways. The local sequence context surrounding potential phosphorylation sites is considered to harbour the most relevant information for phosphorylation site prediction models. However, currently there is a lack of condensed vector representation for this important contextual information, despite the presence of varying residue-level features that can be constructed from sequence homology profiles, structural information, and physicochemical properties. To address this issue, we present PhosContext2vec which is a distributed representation of residue-level sequence contexts for potential phosphorylation sites and demonstrate its application in both general and kinase-specific phosphorylation site predictions. Benchmarking experiments indicate that PhosContext2vec could achieve promising predictive performance compared with several other existing methods for phosphorylation site prediction. We envisage that PhosContext2vec, as a new sequence context representation, can be used in combination with other informative residue-level features to improve the classification performance in a number of related bioinformatics tasks that require appropriate residue-level feature vector representation and extraction. The web server of PhosContext2vec is publicly available at http://phoscontext2vec.erc.monash.edu/.
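To make "local sequence context" concrete, the sketch below extracts a fixed-length window centred on every serine, threonine, or tyrosine in a protein sequence, which is a common way of preparing candidate sites for a predictor. It is not the PhosContext2vec pipeline: the function name, the flank length of 7, and the `X` padding character are illustrative assumptions.

```python
def extract_windows(sequence, flank=7, targets="STY", pad="X"):
    """Return (position, residue, window) for each candidate site.

    position is 1-based; window has length 2 * flank + 1 with the
    candidate residue at its centre. Sequence ends are padded so that
    sites near the termini still yield full-length windows.
    """
    padded = pad * flank + sequence + pad * flank
    windows = []
    for i, residue in enumerate(sequence):
        if residue in targets:
            # Index i in the sequence sits at i + flank in the padded string.
            windows.append((i + 1, residue, padded[i:i + 2 * flank + 1]))
    return windows


# Toy peptide, for illustration only.
sites = extract_windows("MSTAY", flank=2)
```

Each window can then be encoded (for example as one-hot vectors or learned embeddings) and passed to a classifier; the choice of flank length trades off context against noise and is typically tuned per dataset.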
14
Li Y, Sahni N, Yi S. Comparative analysis of protein interactome networks prioritizes candidate genes with cancer signatures. Oncotarget 2018; 7:78841-78849. [PMID: 27791983 PMCID: PMC5346681 DOI: 10.18632/oncotarget.12879] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 08/04/2016] [Accepted: 10/14/2016] [Indexed: 12/12/2022] Open
Abstract
Comprehensive understanding of human cancer mechanisms requires the identification of a thorough list of cancer-associated genes, which could serve as biomarkers for diagnoses and therapies in various types of cancer. Although substantial progress has been made in functional studies to uncover genes involved in cancer, these efforts are often time-consuming and costly. Therefore, it remains challenging to comprehensively identify cancer candidate genes. Network-based methods have accelerated this process through the analysis of complex molecular interactions in the cell. However, the extent to which various interactome networks can contribute to prediction of candidate genes responsible for cancer is still enigmatic. In this study, we evaluated different human protein-protein interactome networks and compared their application to cancer gene prioritization. Our results indicate that network analyses can increase the power to identify novel cancer genes. In particular, such predictive power can be enhanced with the use of unbiased systematic protein interaction maps for cancer gene prioritization. Functional analysis reveals that the top ranked genes from network predictions co-occur often with cancer-related terms in literature, and further, these candidate genes are indeed frequently mutated across cancers. Finally, our study suggests that integrating interactome networks with other omics datasets could provide novel insights into cancer-associated genes and underlying molecular mechanisms.
Affiliation(s)
- Yongsheng Li
  - Department of Systems Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Nidhi Sahni
  - Department of Systems Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
  - Graduate Program in Structural and Computational Biology and Molecular Biophysics, Baylor College of Medicine, Houston, TX 77030, USA
- Song Yi
  - Department of Systems Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
|
15
|
Jia C, Zuo Y, Zou Q. O-GlcNAcPRED-II: an integrated classification algorithm for identifying O-GlcNAcylation sites based on fuzzy undersampling and a K-means PCA oversampling technique. Bioinformatics 2018; 34:2029-2036. [DOI: 10.1093/bioinformatics/bty039] [Citation(s) in RCA: 97] [Impact Index Per Article: 16.2] [Received: 07/19/2017] [Accepted: 02/05/2018] [Indexed: 11/13/2022] Open
Affiliation(s)
- Cangzhi Jia
  - Department of Mathematics, Dalian Maritime University, Dalian, China
- Yun Zuo
  - Department of Mathematics, Dalian Maritime University, Dalian, China
- Quan Zou
  - School of Computer Science and Technology, Tianjin University, Tianjin, China
|
16
|
Abstract
Predicting S-sulfenylation sites in proteins based on sequence and structural features by building an ensemble model by gradient tree boosting.
Affiliation(s)
- Lei Deng
  - School of Software, Central South University, Changsha, China
- Xiaojie Xu
  - School of Software, Central South University, Changsha, China
- Hui Liu
  - School of Software, Central South University, Changsha, China
  - Lab of Information Management, Changzhou University, Jiangsu
|
17
|
Baharani A, Trost B, Kusalik A, Napper S. Technological advances for interrogating the human kinome. Biochem Soc Trans 2017; 45:65-77. [PMID: 28202660 DOI: 10.1042/BST20160163] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 06/07/2016] [Revised: 10/20/2016] [Accepted: 10/25/2016] [Indexed: 12/12/2022]
Abstract
There is increasing appreciation among researchers and clinicians of the value of investigating biology and pathobiology at the level of cellular kinase (kinome) activity. Kinome analysis provides a valuable opportunity to gain insights into complex biology (including disease pathology), identify biomarkers of critical phenotypes (including disease prognosis and evaluation of therapeutic efficacy), and identify targets for therapeutic intervention through kinase inhibitors. The growing interest in kinome analysis has fueled efforts to develop and optimize technologies that enable characterization of phosphorylation-mediated signaling events in a cost-effective, high-throughput manner. In this review, we highlight recent advances in the central technologies currently available for kinome profiling and offer our perspectives on the key challenges remaining to be addressed.
|
18
|
Abstract
Advanced high-throughput sequencing technology has accumulated a massive amount of genomics and transcriptomics data in public databases. Because they are technically accessible, DNA and RNA sequencing have huge potential for the study of gene functions in most species, including animals and crops. A proven analytic platform for converting sequencing data into gene functional information is the co-functional network. Because all genes exert their functions through interactions with others, network analysis is a legitimate way to study gene functions. The workflow of a network-based functional study is composed of three steps: (i) inferring co-functional links, (ii) evaluating and integrating the links into genome-scale networks, and (iii) generating functional hypotheses from the networks. Co-functional links can be inferred from DNA sequencing data by using phylogenetic profiling, gene neighborhood, domain profiling, and associalogs, and from RNA sequencing data by using co-expression analysis. The inferred links are then evaluated and integrated into a genome-scale network with the aid of gold-standard co-functional links. Functional hypotheses can be generated from the network based on (i) network connectivity, (ii) network propagation, and (iii) subnetwork analysis. The functional analysis pipeline described here requires only sequencing data, which are readily available for most species through next-generation sequencing technology. Therefore, co-functional networks will greatly potentiate the use of sequencing data for the study of genetics in any cellular organism.
Affiliation(s)
- Jung Eun Shim
  - Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
- Tak Lee
  - Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
- Insuk Lee
  - Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
|