1
|
Adejor J, Tumukunde E, Li G, Lin H, Xie R, Wang S. Impact of Lysine Succinylation on the Biology of Fungi. Curr Issues Mol Biol 2024; 46:1020-1046. [PMID: 38392183 PMCID: PMC10888112 DOI: 10.3390/cimb46020065] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2023] [Revised: 01/02/2024] [Accepted: 01/03/2024] [Indexed: 02/24/2024]
Abstract
Post-translational modifications (PTMs) play a crucial role in protein functionality and the control of various cellular processes and secondary metabolites (SMs) in fungi. Lysine succinylation (Ksuc) is an emerging protein PTM characterized by the addition of a succinyl group to a lysine residue, which induces substantial alteration in the chemical and structural properties of the affected protein. This chemical alteration is reversible, dynamic in nature, and evolutionarily conserved. Recent investigations of numerous proteins that undergo significant succinylation have underscored the potential significance of Ksuc in various biological processes, encompassing normal physiological functions and the development of certain pathological processes and metabolites. This review aims to elucidate the molecular mechanisms underlying Ksuc and its diverse functions in fungi. Both conventional investigation techniques and predictive tools for identifying Ksuc sites are also considered. A more profound comprehension of Ksuc and its impact on the biology of fungi has the potential to unveil new insights into post-translational modification and may pave the way for innovative approaches that can be applied across various clinical contexts in the management of mycotoxins.
Affiliation(s)
- John Adejor
- Key Laboratory of Pathogenic Fungi and Mycotoxins of Fujian Province, Key Laboratory of Biopesticide and Chemical Biology of Education Ministry, College of Life Sciences, Fujian Agriculture and Forestry University, Fuzhou 350002, China
| | - Elisabeth Tumukunde
- Key Laboratory of Pathogenic Fungi and Mycotoxins of Fujian Province, Key Laboratory of Biopesticide and Chemical Biology of Education Ministry, College of Life Sciences, Fujian Agriculture and Forestry University, Fuzhou 350002, China
| | - Guoqi Li
- Key Laboratory of Pathogenic Fungi and Mycotoxins of Fujian Province, Key Laboratory of Biopesticide and Chemical Biology of Education Ministry, College of Life Sciences, Fujian Agriculture and Forestry University, Fuzhou 350002, China
| | - Hong Lin
- Key Laboratory of Pathogenic Fungi and Mycotoxins of Fujian Province, Key Laboratory of Biopesticide and Chemical Biology of Education Ministry, College of Life Sciences, Fujian Agriculture and Forestry University, Fuzhou 350002, China
| | - Rui Xie
- Key Laboratory of Pathogenic Fungi and Mycotoxins of Fujian Province, Key Laboratory of Biopesticide and Chemical Biology of Education Ministry, College of Life Sciences, Fujian Agriculture and Forestry University, Fuzhou 350002, China
| | - Shihua Wang
- Key Laboratory of Pathogenic Fungi and Mycotoxins of Fujian Province, Key Laboratory of Biopesticide and Chemical Biology of Education Ministry, College of Life Sciences, Fujian Agriculture and Forestry University, Fuzhou 350002, China
|
2
|
Hu Q, Xu J, Wang L, Yuan Y, Luo R, Gan M, Wang K, Zhao T, Wang Y, Han T, Wang J. SUCLG2 Regulates Mitochondrial Dysfunction through Succinylation in Lung Adenocarcinoma. Advanced Science (Weinheim, Baden-Wurttemberg, Germany) 2023; 10:e2303535. [PMID: 37904651 PMCID: PMC10724390 DOI: 10.1002/advs.202303535] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2023] [Revised: 09/24/2023] [Indexed: 11/01/2023]
Abstract
Mitochondrial dysfunction and abnormal energy metabolism are major features of cancer. However, the mechanisms underlying mitochondrial dysfunction during cancer progression are far from being clarified. Here, it is demonstrated that the expression level of succinyl-coenzyme A (CoA) synthetase GDP-forming subunit β (SUCLG2) can affect the overall succinylation of lung adenocarcinoma (LUAD) cells. Succinylome analysis shows that the deletion of SUCLG2 upregulates the succinylation level of mitochondrial proteins and inhibits the function of key metabolic enzymes by reducing either enzymatic activity or protein stability, thus dampening mitochondrial function in LUAD cells. Interestingly, SUCLG2 itself is also succinylated on Lys93, and this succinylation enhances its protein stability, leading to the upregulation of SUCLG2 and promoting the proliferation and tumorigenesis of LUAD cells. Sirtuin 5 (SIRT5) desuccinylates SUCLG2 on Lys93, followed by tripartite motif-containing protein 21 (TRIM21)-mediated ubiquitination through K63-linkage and degradation in the lysosome. The findings reveal a new role for SUCLG2 in mitochondrial dysfunction and clarify the mechanism of the succinylation-mediated protein homeostasis of SUCLG2 in LUAD, thus providing a theoretical basis for developing anti-cancer drugs targeting SUCLG2.
Affiliation(s)
- Qifan Hu
- Department of Thoracic Surgery, The First Affiliated Hospital of Nanchang University, Nanchang, Jiangxi 330006, China
- School of Basic Medical Sciences, Nanchang University, Nanchang, Jiangxi 330031, China
- Jiangxi Institute of Respiratory Disease, The First Affiliated Hospital of Nanchang University, Nanchang, Jiangxi 330006, China
| | - Jing Xu
- School of Basic Medical Sciences, Nanchang University, Nanchang, Jiangxi 330031, China
| | - Lei Wang
- School of Basic Medical Sciences, Nanchang University, Nanchang, Jiangxi 330031, China
| | - Yi Yuan
- School of Huankui Academy, Nanchang University, Nanchang, Jiangxi 330031, China
| | - Ruiguang Luo
- School of Basic Medical Sciences, Nanchang University, Nanchang, Jiangxi 330031, China
| | - Mingxi Gan
- School of Basic Medical Sciences, Nanchang University, Nanchang, Jiangxi 330031, China
| | - Keru Wang
- School of Huankui Academy, Nanchang University, Nanchang, Jiangxi 330031, China
| | - Tao Zhao
- School of Basic Medical Sciences, Nanchang University, Nanchang, Jiangxi 330031, China
| | - Yawen Wang
- School of Basic Medical Sciences, Nanchang University, Nanchang, Jiangxi 330031, China
| | - Tianyu Han
- Jiangxi Institute of Respiratory Disease, The First Affiliated Hospital of Nanchang University, Nanchang, Jiangxi 330006, China
- Jiangxi Clinical Research Center for Respiratory Diseases, Nanchang, Jiangxi 330006, China
- China‐Japan Friendship Jiangxi Hospital, National Regional Center for Respiratory Medicine, Nanchang, Jiangxi 330200, China
| | - Jian‐Bin Wang
- Department of Thoracic Surgery, The First Affiliated Hospital of Nanchang University, Nanchang, Jiangxi 330006, China
- School of Basic Medical Sciences, Nanchang University, Nanchang, Jiangxi 330031, China
|
3
|
Kumari S, Gupta R, Ambasta RK, Kumar P. Emerging trends in post-translational modification: Shedding light on Glioblastoma multiforme. Biochim Biophys Acta Rev Cancer 2023; 1878:188999. [PMID: 37858622 DOI: 10.1016/j.bbcan.2023.188999] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2023] [Revised: 10/06/2023] [Accepted: 10/06/2023] [Indexed: 10/21/2023]
Abstract
Recent multi-omics studies, including proteomics, transcriptomics, genomics, and metabolomics, have revealed the critical role of post-translational modifications (PTMs) in the progression and pathogenesis of Glioblastoma multiforme (GBM). Further, PTMs alter the oncogenic signaling events and offer a novel avenue in GBM therapeutics research through PTM enzymes as potential biomarkers for drug targeting. In addition, PTMs are critical regulators of chromatin architecture, gene expression, and the tumor microenvironment (TME), which play a crucial role in tumorigenesis. Moreover, the implementation of artificial intelligence and machine learning algorithms enhances GBM therapeutics research through the identification of novel PTM enzymes and residues. Herein, we briefly explain the mechanism of protein modifications in GBM etiology, and in altering the biologics of GBM cells through chromatin remodeling, modulation of the TME, and signaling pathways. In addition, we highlight the importance of PTM enzymes as therapeutic biomarkers and the role of artificial intelligence and machine learning in protein PTM prediction.
Affiliation(s)
- Smita Kumari
- Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University, India
| | - Rohan Gupta
- Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University, India; School of Medicine, University of South Carolina, Columbia, SC, United States of America
| | - Rashmi K Ambasta
- Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University, India; Department of Biotechnology and Microbiology, SRM University, Sonepat, Haryana, India
| | - Pravir Kumar
- Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University, India
|
4
|
Luo X, Huang S, Liang M, Xue Q, Rehman SU, Ren X, Li Y, Yang T, Shi D, Li X. The freezability of Mediterranean buffalo sperm is associated with lysine succinylation and lipid metabolism. FASEB J 2022; 36:e22635. [PMID: 36333987 DOI: 10.1096/fj.202201254r] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2022] [Revised: 09/29/2022] [Accepted: 10/18/2022] [Indexed: 11/07/2022]
Abstract
Semen cryopreservation is used for the propagation of variety among species and domestic breeding. Mitochondria are implicated in sperm freezability, and their proteins are prone to succinylation, but the relationship between sperm freezability and mitochondrial protein succinylation is unclear. In this study, six bulls were classified as having good or poor freezability ejaculates (GFE or PFE, 3 bulls each). The fresh sperm mitochondrial membrane potential (MMP) and pan-succinylation level of the two groups were first measured. Then the lysine succinylome and fatty acid content of the two groups were analyzed using label-free LC-MS/MS and GC-MS/MS in multiple reaction monitoring (MRM) mode, respectively. The results indicated that the GFE sperm had significantly higher MMPs than the PFE group (p < 0.05). A total of 1393 succinylation sites corresponding to 426 proteins were assessed; 5 succinylated peptides of the GFE group were markedly upregulated, while 3 were significantly downregulated (FC > 2.0 or < 0.5 and p-value < 0.05) when compared to the PFE group. Forty-six succinylated proteins were identified as having consistent presence/absence expression. The upregulated succinylated proteins in the GFE sperm were enriched in lipid metabolic processes. A total of 31 fatty acids were further subjected to quantitative analysis, of which 23, including arachidic (C20:0), linolenic (C18:3n3), and docosahexaenoic acids (C22:6n3), were decreased in GFE sperm when compared with PFE (p < 0.05). These results suggest that lysine succinylation can potentially influence the sperm freezability of Mediterranean buffaloes through mitochondrial lipid metabolism. This study advances our understanding of sperm succinylation and of the molecular basis of sperm freezability.
Affiliation(s)
- Xi Luo
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, Guangxi University, Nanning, China
| | - Shihai Huang
- College of Life Science and Technology, Guangxi University, Nanning, China
| | - Mingming Liang
- Liuzhou Maternity and Child Healthcare Hospital, Liuzhou, China
| | - Qingsong Xue
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, Guangxi University, Nanning, China
| | - Saif Ur Rehman
- College of Life Science and Technology, Guangxi University, Nanning, China
| | - Xuan Ren
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, Guangxi University, Nanning, China
| | - Yanfang Li
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, Guangxi University, Nanning, China
| | - Ting Yang
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, Guangxi University, Nanning, China
| | - Deshun Shi
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, Guangxi University, Nanning, China
| | - Xiangping Li
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, Guangxi University, Nanning, China
|
5
|
Jia J, Wu G, Li M, Qiu W. pSuc-EDBAM: Predicting lysine succinylation sites in proteins based on ensemble dense blocks and an attention module. BMC Bioinformatics 2022; 23:450. [PMCID: PMC9620660 DOI: 10.1186/s12859-022-05001-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/16/2022] [Accepted: 10/25/2022] [Indexed: 11/10/2022]
Abstract
Background: Lysine succinylation is a newly discovered protein post-translational modification. Predicting succinylation sites helps investigate metabolic disease treatments. However, biological experimental approaches are costly and inefficient, so it is necessary to develop efficient computational approaches. Results: In this paper, we proposed a novel predictor based on ensemble dense blocks and an attention module, called pSuc-EDBAM, which adopted one-hot encoding to derive the feature maps of protein sequences and generated the low-level feature maps through a 1-D CNN. Afterward, the ensemble dense blocks were used to capture feature information at different levels in the process of feature learning. We also introduced an attention module to evaluate the importance of different features. The experimental results show that Acc reaches 74.25% and MCC reaches 0.2927 on the testing dataset, which suggests that pSuc-EDBAM outperforms the existing predictors. Conclusions: The experimental results of ten-fold cross-validation on the training dataset and an independent test on the testing dataset showed that pSuc-EDBAM outperforms the existing succinylation site predictors and can predict potential succinylation sites effectively. pSuc-EDBAM is feasible and obtains credible predictive results, which may also provide valuable references for other related research. For the convenience of experimental scientists, a user-friendly web server has been established (http://bioinfo.wugenqiang.top/pSuc-EDBAM/), by which the desired results can be easily obtained.
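The one-hot encoding step named in this abstract can be illustrated with a minimal Python sketch. It turns a peptide window centred on a candidate lysine into a binary matrix with one row per residue. The alphabet ordering, the function name `one_hot_encode`, the all-zero treatment of unknown or padding symbols, and the example window are assumptions made for illustration; they are not taken from pSuc-EDBAM.

```python
# Hypothetical alphabet ordering: the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot_encode(peptide, alphabet=AMINO_ACIDS):
    """Return one binary row per residue.

    Symbols outside the alphabet (for example an 'X' used as padding)
    are encoded as an all-zero row; this is an illustrative choice.
    """
    index = {aa: i for i, aa in enumerate(alphabet)}
    matrix = []
    for residue in peptide:
        row = [0] * len(alphabet)
        if residue in index:
            row[index[residue]] = 1
        matrix.append(row)
    return matrix


# Illustrative 5-residue window with the candidate lysine (K) in the centre.
window = "AXKGL"
encoded = one_hot_encode(window)
```

A matrix of this shape (window length by 20) is the kind of input a 1-D convolutional layer would consume; real predictors typically use longer windows.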
Affiliation(s)
- Jianhua Jia
- Computer Department, Jingdezhen Ceramic University, Jingdezhen, 333403 China
| | - Genqiang Wu
- Computer Department, Jingdezhen Ceramic University, Jingdezhen, 333403 China
| | - Meifang Li
- Computer Department, Nanchang Institute of Technology, Nanchang, 330044 China
| | - Wangren Qiu
- Computer Department, Jingdezhen Ceramic University, Jingdezhen, 333403 China
|
6
|
Bai W, Cheng L, Xiong L, Wang M, Liu H, Yu K, Wang W. Protein succinylation associated with the progress of hepatocellular carcinoma. J Cell Mol Med 2022; 26:5702-5712. [PMID: 36308411 PMCID: PMC9667522 DOI: 10.1111/jcmm.17507] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2022] [Revised: 07/09/2022] [Accepted: 07/19/2022] [Indexed: 12/01/2022]
Abstract
Although post‐translational modification is critical to tumorigenesis, how succinylation modification of lysine sites influences hepatocellular carcinoma (HCC) remains obscure. 90 tumours and paired adjacent normal tissue of liver cancer were enrolled for succinylation staining. 423 HCC samples with 20 genes related to succinylation modification from TCGA were downloaded for model construction. Statistical methods were employed to analyse the data, including the Non‐Negative Matrix Factorization (NMF) algorithm, t‐Distributed Stochastic Neighbour Embedding (t‐SNE) algorithm, and Cox regression analysis. Pan‐succinyllysine antibody staining indicated that tumour tissues had a higher succinyllysine level than adjacent tissues (p < 0.001), which could be associated with a worse prognosis (p = 0.02). Survival was associated with pathological stage, tumour recurrence status and succinyllysine intensity in the univariate or multivariable Cox survival analysis model. The risk model from 20 succinyllysine‐related genes had the best prognosis prediction. The high expression of succinylation modification in HCC contributed to the worse patient survival prognosis. Model construction of 20 genes related to succinylation modification (MEAF6, OXCT1, SIRT2, CREBBP, KAT5, SIRT4, SIRT6, SIRT7, CPT1A, GLYATL1, SDHA, SDHB, SDHC, SDHD, SIRT1, SIRT3, SIRT5, SUCLA2, SUCLG1 and SUCLG2) could be reliable in predicting prognosis in HCC.
Affiliation(s)
- Wenhui Bai
- Department of Hepatobiliary Surgery, Eastern Campus, Renmin Hospital of Wuhan University, Wuhan, China
| | - Li Cheng
- Department of Intensive Care Unit, Eastern Campus, Renmin Hospital of Wuhan University, Wuhan, China
| | - Liangkun Xiong
- Department of Hepatobiliary Surgery, Eastern Campus, Renmin Hospital of Wuhan University, Wuhan, China
| | - Maoming Wang
- Department of Hepatobiliary Surgery, Eastern Campus, Renmin Hospital of Wuhan University, Wuhan, China
| | - Hao Liu
- Department of Hepatobiliary Surgery, Eastern Campus, Renmin Hospital of Wuhan University, Wuhan, China
| | - Kaihuan Yu
- Department of Hepatobiliary Surgery, Eastern Campus, Renmin Hospital of Wuhan University, Wuhan, China
| | - Weixing Wang
- Department of Hepatobiliary Surgery, Renmin Hospital of Wuhan University, Wuhan, China
|
7
|
Liu X, Xu LL, Lu YP, Yang T, Gu XY, Wang L, Liu Y. Deep_KsuccSite: A novel deep learning method for the identification of lysine succinylation sites. Front Genet 2022; 13:1007618. [PMID: 36246655 PMCID: PMC9557156 DOI: 10.3389/fgene.2022.1007618] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2022] [Accepted: 09/08/2022] [Indexed: 11/13/2022]
Abstract
Identification of lysine (symbol Lys or K) succinylation (Ksucc) sites forms the basis for disclosing the mechanism and function of lysine succinylation modifications. Traditional experimental methods for Ksucc site identification are often costly and time-consuming. Therefore, it is necessary to construct an efficient computational method to predict the presence of Ksucc sites in protein sequences. In this study, we proposed a novel and effective predictor for the identification of Ksucc sites based on deep learning algorithms, termed Deep_KsuccSite. The predictor adopted Composition, Transition, and Distribution (CTD) Composition (CTDC), Enhanced Grouped Amino Acid Composition (EGAAC), Amphiphilic Pseudo-Amino Acid Composition (APAAC), and Embedding Encoding methods to encode peptides, then constructed three base classifiers using one-dimensional (1D) convolutional neural network (CNN) and 2D-CNN, and finally utilized a voting method to get the final results. K-fold cross-validation and independent testing showed that Deep_KsuccSite could serve as an effective tool to identify Ksucc sites in protein sequences. In addition, the ablation experiment results based on voting, feature combination, and model architecture showed that Deep_KsuccSite could make full use of the information of different features to construct an effective classifier. Taken together, we developed Deep_KsuccSite in this study, which was based on a deep learning algorithm and could achieve better prediction accuracy than current methods for lysine succinylation sites. The code and dataset involved in this methodological study are permanently available at the URL https://github.com/flyinsky6/Deep_KsuccSite.
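The voting step over three base classifiers could look roughly like the Python sketch below. The abstract does not say whether the vote is hard (majority of labels) or soft (averaged probabilities); this sketch assumes a hard majority vote with a 0.5 threshold. The function name `majority_vote` and the example probabilities are hypothetical.

```python
def majority_vote(probabilities, threshold=0.5):
    """Combine per-classifier probabilities by hard majority voting.

    Each base classifier votes 1 (succinylated) when its probability
    reaches the threshold, otherwise 0. The ensemble predicts 1 only
    when strictly more than half of the classifiers vote 1.
    """
    votes = [1 if p >= threshold else 0 for p in probabilities]
    return 1 if sum(votes) * 2 > len(votes) else 0


# Hypothetical outputs of three base classifiers for two candidate sites.
site_a = majority_vote([0.9, 0.6, 0.2])  # two of three vote positive
site_b = majority_vote([0.4, 0.6, 0.2])  # one of three votes positive
```

With an odd number of base classifiers, as here, a hard vote can never tie, which is one common reason to prefer three or five models in such ensembles.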
Affiliation(s)
- Xin Liu
- School of Medical Informatics and Engineering, Xuzhou Medical University, Xuzhou, China
- *Correspondence: Xin Liu; Liang Wang; Yong Liu
| | - Lin-Lin Xu
- School of Medical Informatics and Engineering, Xuzhou Medical University, Xuzhou, China
| | - Ya-Ping Lu
- College of Computer Science and Technology, China University of Mining and Technology, Xuzhou, China
| | - Ting Yang
- School of Medical Informatics and Engineering, Xuzhou Medical University, Xuzhou, China
| | - Xin-Yu Gu
- School of Medical Informatics and Engineering, Xuzhou Medical University, Xuzhou, China
| | - Liang Wang
- Laboratory Medicine, Guangdong Provincial People’s Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
| | - Yong Liu
- Jiangsu Center for the Collaboration and Innovation of Cancer Biotherapy, Cancer Institute, Xuzhou Medical University, Xuzhou, Jiangsu, China
|
8
|
Amerifar S, Norouzi M, Ghandi M. A tool for feature extraction from biological sequences. Brief Bioinform 2022; 23:6563937. [PMID: 35383372 DOI: 10.1093/bib/bbac108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2021] [Revised: 03/01/2022] [Accepted: 03/03/2022] [Indexed: 11/12/2022]
Abstract
With the advances in sequencing technologies, a huge amount of biological data is extracted nowadays. Analyzing this amount of data is beyond the ability of human beings, creating a splendid opportunity for machine learning methods to grow. The methods, however, are practical only when the sequences are converted into feature vectors. Many tools target this task including iLearnPlus, a Python-based tool which supports a rich set of features. In this paper, we propose a holistic tool that extracts features from biological sequences (i.e. DNA, RNA and Protein). These features are the inputs to machine learning models that predict properties, structures or functions of the input sequences. Our tool not only supports all features in iLearnPlus but also 30 additional features which exist in the literature. Moreover, our tool is based on R language which makes an alternative for bioinformaticians to transform sequences into feature vectors. We have compared the conversion time of our tool with that of iLearnPlus: we transform the sequences much faster. We convert small nucleotides by a median of 2.8X faster, while we outperform iLearnPlus by a median of 6.3X for large sequences. Finally, in amino acids, our tool achieves a median speedup of 23.9X.
Affiliation(s)
- Sare Amerifar
- Bioinformatics, Tarbiat Modares University, Jalal Al Ahmad, 14115-111, Tehran, Iran
| | - Mahammad Norouzi
- Computer Science, Technical University of Darmstadt, Hochschulstr. 1, 64293, Hesse, Germany
| | - Mahmoud Ghandi
- Bioinformatics, Monte Rosa Therapeutics, Summer Street, 02210, Boston, United States
|
9
|
Wang H, Zhao H, Zhang J, Han J, Liu Z. A parallel model of DenseCNN and ordered-neuron LSTM for generic and species-specific succinylation site prediction. Biotechnol Bioeng 2022; 119:1755-1767. [PMID: 35320585 DOI: 10.1002/bit.28091] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2021] [Revised: 03/12/2022] [Accepted: 03/19/2022] [Indexed: 11/07/2022]
Abstract
Lysine succinylation (Ksucc) regulates various metabolic processes, participates in vital life processes, and is involved in the occurrence and development of numerous diseases. Accurate recognition of succinylation sites can reveal underlying functional mechanisms and pathogenesis. However, most remain undetected. Moreover, a deep learning architecture focusing on generic and species-specific predictions is still lacking. Thus, we proposed a deep learning-based framework named Deep-Ksucc, combining a dense convolutional network (DenseCNN) and ordered-neuron long short-term memory (OnLSTM) in parallel, which took the cascading characteristics of sequence information and physicochemical properties as the input. The results of the generic and species-specific predictions indicated that Deep-Ksucc can identify sequence patterns of different organisms and recognize plenty of succinylation sites. The case study showed that Deep-Ksucc can serve as a reliable tool for biology verification and computer-aided recognition of succinylation sites.
Affiliation(s)
- Huiqing Wang
- College of Information and Computer, Taiyuan University of Technology, Taiyuan, 030024, China
| | - Hong Zhao
- College of Information and Computer, Taiyuan University of Technology, Taiyuan, 030024, China
| | - Jing Zhang
- Engineering Training Center, Taiyuan University of Technology, Taiyuan, 030024, China
| | - Jiale Han
- College of Information and Computer, Taiyuan University of Technology, Taiyuan, 030024, China
| | - Zhihao Liu
- College of Information and Computer, Taiyuan University of Technology, Taiyuan, 030024, China
|
10
|
Zhang D, Wang S. A protein succinylation sites prediction method based on the hybrid architecture of LSTM network and CNN. J Bioinform Comput Biol 2022; 20:2250003. [PMID: 35191361 DOI: 10.1142/s0219720022500032] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 11/18/2022]
Abstract
The succinylation modification of protein participates in the regulation of a variety of cellular processes. Identification of modified substrates with precise sites is the basis for understanding the molecular mechanism and regulation of succinylation. In this work, we picked and chose five superior feature codes: CKSAAP, ACF, BLOSUM62, AAindex, and one-hot, according to their performance in the problem of succinylation sites prediction. Then, LSTM network and CNN were used to construct four models: LSTM-CNN, CNN-LSTM, LSTM, and CNN. The five selected features were, respectively, input into each of these four models for training to compare the four models. Based on the performance of each model, the optimal model among them was chosen to construct a hybrid model DeepSucc that was composed of five sub-modules for integrating heterogeneous information. Under the 10-fold cross-validation, the hybrid model DeepSucc achieves 86.26% accuracy, 84.94% specificity, 87.57% sensitivity, 0.9406 AUC, and 0.7254 MCC. When compared with other prediction tools using an independent test set, DeepSucc outperformed them in sensitivity and MCC. The datasets and source codes can be accessed at https://github.com/1835174863zd/DeepSucc.
Affiliation(s)
- Die Zhang
- Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming 650504, P. R. China
| | - Shunfang Wang
- Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming 650504, P. R. China
|
11
|
Ning Q, Ma Z, Zhao X, Yin M. SSKM_Succ: A Novel Succinylation Sites Prediction Method Incorporating K-Means Clustering With a New Semi-Supervised Learning Algorithm. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:643-652. [PMID: 32750881 DOI: 10.1109/tcbb.2020.3006144] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 06/11/2023]
Abstract
Protein succinylation is a type of post-translational modification (PTM) that occurs on lysine sites and plays a key role in protein conformation regulation and cellular function control. When training a computational method, it is difficult to designate negative samples because of the uncertainty of non-succinylation lysine sites, and if not handled properly, this may affect the performance of computational models dramatically. Therefore, we propose a new semi-supervised learning method to identify reliable non-succinylation lysine sites as negative samples. This method, named SSKM_Succ, also employs K-means clustering to divide data into 5 clusters. In addition, information on proximal PTMs and three kinds of sequence features (grey pseudo amino acid composition, K-space and position-specific amino acid propensity) are utilized to represent proteins. Then, we perform a two-step feature selection to remove redundant features and construct the optimization model for each cluster. Finally, a support vector machine is applied to construct a prediction model for each cluster. Promising results are obtained by this method, with an accuracy of 80.18 percent for succinylation sites on the independent testing dataset. Meanwhile, we compare the result with other existing tools, and it shows that our method is promising for predicting succinylation sites. Through analysis, we further verify that succinylated protein has potential effects on amino acid degradation and fatty acid metabolism, and speculate that protein succinylation may be closely related to neurodegenerative diseases. The code of SSKM_Succ is available on the web https://github.com/yangyq505/SSKM_Succ.git.
|
12
|
Mu R, Ma Z, Lu C, Wang H, Cheng X, Tuo B, Fan Y, Liu X, Li T. Role of succinylation modification in thyroid cancer and breast cancer. Am J Cancer Res 2021. [PMID: 34765287 DOI: 10.2156/j.ajcr.2021.11.100] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/25/2023]
Abstract
The incidence of thyroid cancer and breast cancer is increasing year by year, and the specific pathogenesis is unclear. Posttranslational modifications constitute an important regulatory mechanism that affects the function of almost all proteins, are essential for a diverse and well-functioning proteome and can integrate metabolism with physiological and pathological processes. In recent years, posttranslational modifications, which mainly include metabolic enzyme-mediated protein posttranslational modifications, such as methylation, phosphorylation, acetylation and succinylation, have become a research hotspot. Among these modifications, lysine succinylation is a newly discovered broad-spectrum, dynamic, non-enzymatic protein post-translational modification, and it plays an important regulatory role in a variety of tumors. Studies have shown that succinylation can affect the synthesis of thyroid hormones, and the regulation of this post-translational modification can inhibit the apoptosis and migration of thyroid cancer cell lines, and promote breast cancer cell proliferation, DNA damage repair and autophagy-related regulation. However, the specific regulatory mechanism of succinylation in thyroid cancer and breast cancer is currently unclear. Therefore, this article mainly reviews the research progress of succinylation modification in thyroid cancer and breast cancer. It is expected to provide new directions and targets for the prevention and treatment of thyroid cancer and breast cancer.
Affiliation(s)
- Renmin Mu
- Department of Thyroid and Breast Surgery, Affiliated Hospital of Zunyi Medical University Zunyi 563003, Guizhou Province, China
| | - Zhiyuan Ma
- Department of Gastroenterology, Affiliated Hospital of Zunyi Medical University Zunyi 563003, Guizhou Province, China.,Digestive Disease Institute of Guizhou Province Zunyi 563003, Guizhou Province, China
| | - Chengli Lu
- Department of Thyroid and Breast Surgery, Affiliated Hospital of Zunyi Medical University Zunyi 563003, Guizhou Province, China
| | - Hu Wang
- Department of Thyroid and Breast Surgery, Affiliated Hospital of Zunyi Medical University Zunyi 563003, Guizhou Province, China
| | - Xiaoming Cheng
- Department of Thyroid and Breast Surgery, Affiliated Hospital of Zunyi Medical University Zunyi 563003, Guizhou Province, China
| | - Biguang Tuo
- Department of Gastroenterology, Affiliated Hospital of Zunyi Medical University Zunyi 563003, Guizhou Province, China.,Digestive Disease Institute of Guizhou Province Zunyi 563003, Guizhou Province, China
| | - Yi Fan
- Endoscopy Center, Affiliated Hospital of Zunyi Medical University Zunyi 563003, Guizhou Province, China
| | - Xuemei Liu
- Department of Gastroenterology, Affiliated Hospital of Zunyi Medical University Zunyi 563003, Guizhou Province, China.,Digestive Disease Institute of Guizhou Province Zunyi 563003, Guizhou Province, China
| | - Taolang Li
- Department of Thyroid and Breast Surgery, Affiliated Hospital of Zunyi Medical University Zunyi 563003, Guizhou Province, China
| |
|
13
|
Charoenkwan P, Chiangjong W, Hasan MM, Nantasenamat C, Shoombuatong W. Review and comparative analysis of machine learning-based predictors for predicting and analyzing of anti-angiogenic peptides. Curr Med Chem 2021; 29:849-864. [PMID: 34375178 DOI: 10.2174/0929867328666210810145806] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/22/2021] [Revised: 06/17/2021] [Accepted: 06/22/2021] [Indexed: 11/22/2022]
Abstract
Cancer is one of the leading causes of death worldwide, and underlying it is angiogenesis, one of the hallmarks of cancer. Efforts are under way to discover anti-angiogenic peptides (AAPs) as a promising therapeutic route that tackles the formation of new blood vessels. As such, the identification of AAPs constitutes a viable path for understanding their mechanistic properties pertinent to the discovery of new anti-cancer drugs. In spite of the abundance of peptide sequences in public databases, experimental efforts to identify anti-angiogenic peptides have progressed very slowly owing to their high cost and laborious nature. Owing to its inherent ability to make sense of large volumes of data, machine learning (ML) represents an attractive technique that can be harnessed for peptide-based drug discovery. In this review, we conducted a comprehensive and comparative analysis of ML-based AAP predictors in terms of their employed feature descriptors, ML algorithms, cross-validation methods and prediction performance. Moreover, the common framework of these AAP predictors and their inherent weaknesses are also discussed. In particular, we explore future perspectives for improving the prediction accuracy and model interpretability, which represent an interesting avenue for overcoming some of the inherent weaknesses of existing AAP predictors. We anticipate that this review will assist researchers in the rapid screening and identification of promising AAPs for clinical use.
Affiliation(s)
- Phasit Charoenkwan
- Modern Management and Information Technology, College of Arts, Media and Technology, Chiang Mai University, Chiang Mai, Thailand
| | - Wararat Chiangjong
- Pediatric Translational Research Unit, Department of Pediatrics, Faculty of Medicine, Ramathibodi Hospital, Mahidol University, Bangkok 10400, Thailand
| | - Md Mehedi Hasan
- Tulane Center for Biomedical Informatics and Genomics, Division of Biomedical Informatics and Genomics, John W. Deming Department of Medicine, School of Medicine, Tulane University, New Orleans, LA 70112, United States
| | - Chanin Nantasenamat
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, Thailand
| | - Watshara Shoombuatong
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, Thailand
| |
|
14
|
Charoenkwan P, Anuwongcharoen N, Nantasenamat C, Hasan MM, Shoombuatong W. In Silico Approaches for the Prediction and Analysis of Antiviral Peptides: A Review. Curr Pharm Des 2021; 27:2180-2188. [PMID: 33138759 DOI: 10.2174/1381612826666201102105827] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 05/10/2020] [Accepted: 08/20/2020] [Indexed: 11/22/2022]
Abstract
In light of the growing resistance toward current antiviral drugs, efforts to discover novel and effective antiviral therapeutic agents remain a pressing scientific effort. Antiviral peptides (AVPs) represent promising therapeutic agents due to their extraordinary advantages in terms of potency, efficacy and pharmacokinetic properties. The growing volume of newly discovered peptide sequences in the post-genomic era requires computational approaches for timely and accurate identification of AVPs. Machine learning (ML) methods such as random forest and support vector machine represent robust learning algorithms that are instrumental in successful peptide-based drug discovery. Therefore, this review summarizes the current state-of-the-art application of ML methods for identifying AVPs directly from the sequence information. We compare the efficiency of these methods in terms of the underlying characteristics of the dataset used along with feature encoding methods, ML algorithms, cross-validation methods and prediction performance. Finally, guidelines for the development of robust AVP models are also discussed. It is anticipated that this review will serve as a useful guide for the design and development of robust AVP and related therapeutic peptide predictors in the future.
Affiliation(s)
- Phasit Charoenkwan
- Modern Management and Information Technology, College of Arts, Media and Technology, Chiang Mai University, Chiang Mai 50200, Thailand
| | - Nuttapat Anuwongcharoen
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, 10700, Thailand
| | - Chanin Nantasenamat
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, 10700, Thailand
| | - Md Mehedi Hasan
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan
| | - Watshara Shoombuatong
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, 10700, Thailand
| |
|
15
|
Islam MKB, Rahman J, Hasan MAM, Ahmad S. predForm-Site: Formylation site prediction by incorporating multiple features and resolving data imbalance. Comput Biol Chem 2021; 94:107553. [PMID: 34384997 DOI: 10.1016/j.compbiolchem.2021.107553] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2020] [Revised: 06/22/2021] [Accepted: 07/28/2021] [Indexed: 10/20/2022]
Abstract
Formylation is one of the newly discovered post-translational modifications of lysine residues and is responsible for different kinds of diseases. In this work, a novel predictor, named predForm-Site, has been developed to predict formylation sites with higher accuracy. We have integrated multiple sequence features to develop a more informative representation of formylation sites. Moreover, the decision function of the underlying classifier has been optimized on the skewed formylation dataset during model training to improve prediction quality. On the dataset used by the LFPred and Formator predictors, predForm-Site achieved 99.5% sensitivity, 99.8% specificity and 99.8% overall accuracy with an AUC of 0.999 in the jackknife test. In the independent test, it also achieved more than 97% sensitivity and 99% specificity. Similarly, in benchmarking against the recent method CKSAAP_FormSite, the proposed predictor performed significantly better on all measures, particularly sensitivity by around 20%, specificity by nearly 30% and overall accuracy by more than 22%. These experimental results show that the proposed predForm-Site can be used as a complementary tool for the fast exploration of formylation sites. For the convenience of the scientific community, predForm-Site has been deployed as an online tool, accessible at http://103.99.176.239:8080/predForm-Site.
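For readers unfamiliar with the measures quoted in this abstract, the sketch below shows how sensitivity, specificity and overall accuracy are computed from confusion-matrix counts. The function name `binary_metrics` and the counts are hypothetical and are not taken from the predForm-Site paper.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion counts."""
    sensitivity = tp / (tp + fn)   # fraction of true sites recovered
    specificity = tn / (tn + fp)   # fraction of non-sites correctly rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Hypothetical skewed dataset: 100 positives, 1000 negatives.
sens, spec, acc = binary_metrics(tp=90, fn=10, tn=950, fp=50)
```

On a skewed dataset such as this one, accuracy is dominated by the majority class, which is why sensitivity and specificity are usually reported alongside it.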
Affiliation(s)
- Md Khaled Ben Islam
- Institute for Integrated and Intelligent Systems, Griffith University, Brisbane, Australia; Department of Computer Science & Engineering, Pabna University of Science and Technology, Pabna, Bangladesh.
| | - Julia Rahman
- Institute for Integrated and Intelligent Systems, Griffith University, Brisbane, Australia; Department of Computer Science & Engineering, Rajshahi University of Engineering and Technology, Rajshahi, Bangladesh.
| | - Md Al Mehedi Hasan
- Department of Computer Science & Engineering, Rajshahi University of Engineering and Technology, Rajshahi, Bangladesh
| | - Shamim Ahmad
- Department of Computer Science & Engineering, Rajshahi University, Rajshahi, Bangladesh
| |
|
16
|
The Mystery of Extramitochondrial Proteins Lysine Succinylation. Int J Mol Sci 2021; 22:ijms22116085. [PMID: 34199982 PMCID: PMC8200203 DOI: 10.3390/ijms22116085] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 05/20/2021] [Revised: 05/31/2021] [Accepted: 06/02/2021] [Indexed: 12/19/2022] Open
Abstract
Lysine succinylation is a post-translational modification which alters protein function in both physiological and pathological processes. Mindful that it requires succinyl-CoA, a metabolite formed within the mitochondrial matrix that cannot permeate the inner mitochondrial membrane, the question arises as to how there can be succinylation of proteins outside mitochondria. The present mini-review examines pathways participating in peroxisomal fatty acid oxidation that lead to succinyl-CoA production, potentially supporting succinylation of extramitochondrial proteins. Furthermore, the influence of the mitochondrial status on cytosolic NAD+ availability affecting the activity of cytosolic SIRT5 iso1 and iso4—in turn regulating cytosolic protein lysine succinylations—is presented. Finally, the discovery that glia in the adult human brain lack subunits of both alpha-ketoglutarate dehydrogenase complex and succinate-CoA ligase—thus being unable to produce succinyl-CoA in the matrix—and yet exhibit robust pancellular lysine succinylation, is highlighted.
|
17
|
LSTMCNNsucc: A Bidirectional LSTM and CNN-Based Deep Learning Method for Predicting Lysine Succinylation Sites. BIOMED RESEARCH INTERNATIONAL 2021; 2021:9923112. [PMID: 34159204 PMCID: PMC8188601 DOI: 10.1155/2021/9923112] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 03/15/2021] [Revised: 04/25/2021] [Accepted: 05/03/2021] [Indexed: 11/17/2022]
Abstract
Lysine succinylation is a typical protein post-translational modification and plays a crucial regulatory role in cellular processes. Identifying succinylation sites is fundamental to exploring its functions. Although many computational methods have been developed to deal with this challenge, few have considered the semantic relationships between residues. We combined long short-term memory (LSTM) and convolutional neural network (CNN) into a deep learning method for predicting succinylation sites. The proposed method obtained a Matthews correlation coefficient of 0.2508 on the independent test, outperforming state-of-the-art methods. We also performed an enrichment analysis of succinylated proteins. The results showed that the functions of succinylation were conserved across species but differed to a certain extent between species. On the basis of the proposed method, we developed a user-friendly web server for predicting succinylation sites.
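The Matthews correlation coefficient (MCC) reported here ranges from -1 to 1, with 0 corresponding to chance-level prediction. A minimal sketch of the standard formula follows; the function name and the example counts are illustrative and are not values from the paper.

```python
import math

def matthews_corrcoef(tp, fn, tn, fp):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Convention: return 0 when a row or column of the matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denominator

perfect = matthews_corrcoef(tp=10, fn=0, tn=10, fp=0)   # every call correct
chance = matthews_corrcoef(tp=5, fn=5, tn=5, fp=5)      # no better than random
```

Because MCC uses all four cells of the confusion matrix, it is less flattering than accuracy on imbalanced site-prediction data, so modest absolute values are common in this field.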
|
18
|
Dong Y, Li P, Li P, Chen C. First comprehensive analysis of lysine succinylation in paper mulberry (Broussonetia papyrifera). BMC Genomics 2021; 22:255. [PMID: 33838656 PMCID: PMC8035759 DOI: 10.1186/s12864-021-07567-5] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 12/15/2020] [Accepted: 03/26/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Lysine succinylation is a naturally occurring post-translational modification (PTM) that is ubiquitous in organisms. Lysine succinylation plays important roles in regulating protein structure and function as well as cellular metabolism. Global lysine succinylation at the proteomic level has been identified in a variety of species; however, limited information on lysine succinylation in plant species, especially paper mulberry, is available. Paper mulberry is not only an important plant in traditional Chinese medicine, but it is also a tree species with significant economic value. Paper mulberry is found in the temperate and tropical zones of China. The present study analyzed the effects of lysine succinylation on the growth, development, and physiology of paper mulberry. RESULTS A total of 2097 lysine succinylation sites were identified in 935 proteins associated with the citric acid cycle (TCA cycle), glyoxylic acid and dicarboxylic acid metabolism, ribosomes and oxidative phosphorylation; these pathways play a role in carbon fixation in photosynthetic organisms and may be regulated by lysine succinylation. The modified proteins were distributed in multiple subcellular compartments and were involved in a wide variety of biological processes, such as photosynthesis and the Calvin-Benson cycle. CONCLUSION Lysine-succinylated proteins may play key regulatory roles in metabolism, primarily in photosynthesis and oxidative phosphorylation, as well as in many other cellular processes. In addition to the large number of succinylated proteins associated with photosynthesis and oxidative phosphorylation, some proteins associated with the TCA cycle are succinylated. Our study can serve as a reference for further proteomics studies of the downstream effects of succinylation on the physiology and biochemistry of paper mulberry.
Affiliation(s)
- Yibo Dong
- College of Animal Science, Guizhou University, Guiyang, 550025, Guizhou, China
- Department of Plant Protection, Institute of Crop Protection, College of Agriculture, Guizhou University, Guiyang, 550025, Guizhou, China
| | - Ping Li
- Institute of Grassland Research, Sichuan Academy of Grassland Science, Chengdu, 610000, Sichuan, China
| | - Ping Li
- College of Animal Science, Guizhou University, Guiyang, 550025, Guizhou, China
| | - Chao Chen
- College of Animal Science, Guizhou University, Guiyang, 550025, Guizhou, China.
| |
|
19
|
Islam MM, Alam MJ, Ahmed FF, Hasan MM, Mollah MNH. Improved Prediction of Protein-Protein Interaction Mapping on Homo Sapiens by Using Amino Acid Sequence Features in a Supervised Learning Framework. Protein Pept Lett 2021; 28:74-83. [PMID: 32520672 DOI: 10.2174/0929866527666200610141258] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/27/2020] [Revised: 05/03/2020] [Accepted: 05/04/2020] [Indexed: 02/07/2023]
Abstract
BACKGROUND Protein-Protein Interaction (PPI) plays a key role in the control of many biological processes including protein function, disease incidence, and therapy design. However, the identification of PPI by wet lab experiment is a challenging task, since it is laborious, time consuming and expensive. Therefore, computational prediction of PPI is now emphasized before experimental validation, since it is less laborious, faster and cheaper. OBJECTIVE The objective of this study is to develop an improved computational method for PPI prediction mapping on Homo sapiens by using the amino acid sequence features in a supervised learning framework. METHODS The experimentally validated 91 positive-PPI pairs of human protein sequences were collected from IntAct Molecular Interaction Database. Then we constructed three balanced datasets with ratios 1:1, 1:2 and 1:3 of positive and negative PPI samples. Then we partitioned each dataset into training (80%) and independent test (20%) datasets. Each training dataset was further partitioned into four mutually exclusive groups of equal sizes so that each group could be interchanged with the independent test group to perform 5-fold cross validation (CV). Then we trained seven candidate classifiers (NN, SVM, LR, NB, KNN, AB and RF) with each ratio case to obtain the best PPI predictor by comparing their performance scores. RESULTS The random forest (RF) based predictor that was trained with a 1:2 ratio of positive-PPI and negative-PPI samples based on AAC encoding features provided the most accurate PPI prediction, producing the highest average performance scores of accuracy (93.50%), sensitivity (95.0%), MCC (85.2%), AUC (0.941) and pAUC (0.236) with the 5-fold cross-validation. It also achieved the highest average performance scores of accuracy (92.0%), sensitivity (94.0%), MCC (83.6%), AUC (0.922) and pAUC (0.207) with the independent test datasets in comparison with the other candidate and existing predictors. CONCLUSION The final results strongly suggest that the RF based predictor is a better prediction model of PPI mapping on Homo sapiens.
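The AAC (amino acid composition) encoding mentioned in the results can be sketched as follows. The abstract does not say how the two sequences of a pair are combined, so the concatenation in `pair_features` is an assumption, as are the function names and the toy sequences.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def aac(sequence):
    """Amino acid composition: fraction of each standard residue."""
    seq = sequence.upper()
    counts = [seq.count(residue) for residue in AMINO_ACIDS]
    total = sum(counts)  # non-standard symbols are ignored
    return [c / total if total else 0.0 for c in counts]

def pair_features(seq_a, seq_b):
    """Assumed pair encoding: concatenate the two 20-dimensional AAC vectors."""
    return aac(seq_a) + aac(seq_b)

vector = pair_features("MKTAYIAKQR", "GSHMKELLKA")  # toy sequences
```

The resulting 40-dimensional vector is the kind of fixed-length input that classifiers such as a random forest require, regardless of sequence length.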
Affiliation(s)
- Md Merajul Islam
- Bioinformatics Laboratory, Department of Statistics, Rajshahi University, Rajshahi-6205, Bangladesh
| | - Md Jahangir Alam
- Bioinformatics Laboratory, Department of Statistics, Rajshahi University, Rajshahi-6205, Bangladesh
| | - Fee Faysal Ahmed
- Department of Mathematics, Jashore University of Science and Technology, Jashore, Bangladesh
| | - Md Mehedi Hasan
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, Kawazu, Iizuka, Fukuoka, Japan
| | - Md Nurul Haque Mollah
- Bioinformatics Laboratory, Department of Statistics, Rajshahi University, Rajshahi-6205, Bangladesh
| |
|
20
|
Nilamyani AN, Auliah FN, Moni MA, Shoombuatong W, Hasan MM, Kurata H. PredNTS: Improved and Robust Prediction of Nitrotyrosine Sites by Integrating Multiple Sequence Features. Int J Mol Sci 2021; 22:2704. [PMID: 33800121 PMCID: PMC7962192 DOI: 10.3390/ijms22052704] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 01/21/2021] [Revised: 03/02/2021] [Accepted: 03/03/2021] [Indexed: 12/15/2022] Open
Abstract
Nitrotyrosine, which is generated by numerous reactive nitrogen species, is a type of protein post-translational modification. Identification of site-specific nitration modification on tyrosine is a prerequisite to understanding the molecular function of nitrated proteins. Thanks to the progress of machine learning, computational prediction can play a vital role before the biological experimentation. Herein, we developed a computational predictor PredNTS by integrating multiple sequence features including K-mer, composition of k-spaced amino acid pairs (CKSAAP), AAindex, and binary encoding schemes. The important features were selected by the recursive feature elimination approach using a random forest classifier. Finally, we linearly combined the successive random forest (RF) probability scores generated by the different, single encoding-employing RF models. The resultant PredNTS predictor achieved an area under a curve (AUC) of 0.910 using five-fold cross validation. It outperformed the existing predictors on a comprehensive and independent dataset. Furthermore, we investigated several machine learning algorithms to demonstrate the superiority of the employed RF algorithm. The PredNTS is a useful computational resource for the prediction of nitrotyrosine sites. The web-application with the curated datasets of the PredNTS is publicly available.
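Of the encodings listed in this abstract, CKSAAP (composition of k-spaced amino acid pairs) is the least self-explanatory. A minimal sketch under common conventions is given below; the maximum gap `k_max`, the normalisation by the number of pairs and the toy window are assumptions rather than the settings used in PredNTS.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def cksaap(window, k_max=2):
    """For each gap k = 0..k_max, return the normalised frequency of all
    400 residue pairs (a, b) where b follows a with k residues in between."""
    features = []
    for k in range(k_max + 1):
        counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
        n_pairs = len(window) - k - 1
        for i in range(n_pairs):
            pair = window[i] + window[i + k + 1]
            if pair in counts:  # skip gaps or non-standard symbols
                counts[pair] += 1
        features.extend(c / n_pairs if n_pairs > 0 else 0.0
                        for c in counts.values())
    return features

# Toy window centred on a tyrosine; yields 3 * 400 = 1200 features.
vector = cksaap("AKLEYGSTV", k_max=2)
```

Each gap contributes a 400-dimensional block, so the vector grows linearly with `k_max`, which is one reason feature selection (recursive feature elimination in the abstract) follows the encoding step.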
Affiliation(s)
- Andi Nur Nilamyani
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; (A.N.N.); (F.N.A.)
| | - Firda Nurul Auliah
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; (A.N.N.); (F.N.A.)
| | - Mohammad Ali Moni
- WHO Collaborating Centre on eHealth, UNSW Digital Health, School of Public Health and Community Medicine, Faculty of Medicine, UNSW Sydney, Sydney, NSW 2052, Australia;
| | - Watshara Shoombuatong
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand;
| | - Md Mehedi Hasan
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; (A.N.N.); (F.N.A.)
- Japan Society for the Promotion of Science, 5-3-1 Kojimachi, Chiyoda-ku, Tokyo 102-0083, Japan
| | - Hiroyuki Kurata
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; (A.N.N.); (F.N.A.)
| |
|
21
|
Auliah FN, Nilamyani AN, Shoombuatong W, Alam MA, Hasan MM, Kurata H. PUP-Fuse: Prediction of Protein Pupylation Sites by Integrating Multiple Sequence Representations. Int J Mol Sci 2021; 22:ijms22042120. [PMID: 33672741 PMCID: PMC7924619 DOI: 10.3390/ijms22042120] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 01/16/2021] [Revised: 02/12/2021] [Accepted: 02/18/2021] [Indexed: 12/30/2022] Open
Abstract
Pupylation is a type of reversible post-translational modification of proteins, which plays a key role in the cellular function of microbial organisms. Several proteomics methods have been developed for the prediction and analysis of pupylated proteins and pupylation sites. However, the traditional experimental methods are laborious and time-consuming. Hence, computational algorithms that can predict potential pupylation sites using sequence features are highly needed. In this research, a new prediction model, PUP-Fuse, has been developed for pupylation site prediction by integrating multiple sequence representations. We explored five types of feature encoding approaches and three machine learning (ML) algorithms. In the final model, we integrated the successive ML scores using a linear regression model. PUP-Fuse achieved a Matthews correlation coefficient of 0.768 in a 10-fold cross-validation test. It also outperformed existing predictors in an independent test. The web server of PUP-Fuse with curated datasets is freely available.
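The 10-fold cross-validation mentioned here partitions the samples into ten disjoint folds and holds each one out in turn. The generator below is a generic sketch of that procedure; the function name, the shuffling seed and the sample count are assumptions and do not reproduce the PUP-Fuse protocol.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)       # reproducible shuffle
    folds = [indices[i::k] for i in range(k)]  # k near-equal, disjoint folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# Each sample appears in exactly one test fold.
splits = list(k_fold_indices(n_samples=25, k=10))
```

A score such as MCC is then computed on each held-out fold and averaged, which gives a less optimistic estimate than evaluating on the training data.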
Affiliation(s)
- Firda Nurul Auliah
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; (F.N.A.); (A.N.N.); (M.M.H.)
| | - Andi Nur Nilamyani
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; (F.N.A.); (A.N.N.); (M.M.H.)
| | - Watshara Shoombuatong
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand;
| | - Md Ashad Alam
- Tulane Center for Biomedical Informatics and Genomics, Division of Biomedical Informatics and Genomics, John W. Deming Department of Medicine, School of Medicine, Tulane University, New Orleans, LA 70112, USA;
| | - Md Mehedi Hasan
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; (F.N.A.); (A.N.N.); (M.M.H.)
- Japan Society for the Promotion of Science, 5-3-1 Kojimachi, Chiyoda-ku, Tokyo 102-0083, Japan
| | - Hiroyuki Kurata
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; (F.N.A.); (A.N.N.); (M.M.H.)
- Correspondence:
| |
|
22
|
Mahmood MK, Ehsan A, Khan YD, Chou KC. iHyd-LysSite (EPSV): Identifying Hydroxylysine Sites in Protein Using Statistical Formulation by Extracting Enhanced Position and Sequence Variant Feature Technique. Curr Genomics 2020; 21:536-545. [PMID: 33214770 PMCID: PMC7604750 DOI: 10.2174/1389202921999200831142629] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 12/04/2019] [Revised: 05/14/2020] [Accepted: 05/15/2020] [Indexed: 11/22/2022] Open
Abstract
Introduction Hydroxylation is one of the most important post-translational modifications (PTM) in cellular functions and is linked to various diseases. The addition of a hydroxyl group (OH) to a lysine site through chemical modification produces hydroxylysine. Methods The method used in this study identifies hydroxylysine sites with a mathematical and statistical methodology that incorporates the sequence-order effect and the composition of each object within protein sequences. This predictor is called “iHyd-LysSite (EPSV)” (identifying hydroxylysine sites by extracting enhanced position and sequence variant technique). The identification of hydroxylysine sites by experimental methods is difficult, laborious and highly expensive. An in silico technique is an alternative approach to identify hydroxylysine sites in proteins. Results The predictive model is required to have high sensitivity and specificity values and high accuracy. Self-consistency, independent, 10-fold cross-validation and jackknife tests are performed for validation purposes. These tests are carried out using three well-known classifiers, Neural Networks (NN), Random Forest (RF) and Support Vector Machine (SVM), at the required prediction rate. The overall predictive outcomes are markedly superior to the results obtained by previous predictors. The proposed model achieved an excellent prediction rate with the NN, RF, and SVM classifiers. The sensitivity and specificity results using these classifiers for the jackknife test are 96.08%, 94.99%, 98.16% and 97.52%, 98.52%, 80.95%, respectively. Conclusion The results obtained by the proposed tool show that this method may meet the future demand for hydroxylysine site identification with a better prediction rate than the existing methods.
Affiliation(s)
- Muhammad Khalid Mahmood
- 1 Department of Mathematics, University of the Punjab, Lahore, Pakistan; 2 Faculty of Information Technology, University of Management and Technology, Lahore, Pakistan; 3 Gordon Life Science Institute, Boston, MA 02478, USA
| | - Asma Ehsan
- 1 Department of Mathematics, University of the Punjab, Lahore, Pakistan; 2 Faculty of Information Technology, University of Management and Technology, Lahore, Pakistan; 3 Gordon Life Science Institute, Boston, MA 02478, USA
| | - Yaser Daanial Khan
- 1 Department of Mathematics, University of the Punjab, Lahore, Pakistan; 2 Faculty of Information Technology, University of Management and Technology, Lahore, Pakistan; 3 Gordon Life Science Institute, Boston, MA 02478, USA
| | - Kuo-Chen Chou
- 1 Department of Mathematics, University of the Punjab, Lahore, Pakistan; 2 Faculty of Information Technology, University of Management and Technology, Lahore, Pakistan; 3 Gordon Life Science Institute, Boston, MA 02478, USA
| |
|
23
|
Hasan MM, Khatun MS, Kurata H. iLBE for Computational Identification of Linear B-cell Epitopes by Integrating Sequence and Evolutionary Features. GENOMICS PROTEOMICS & BIOINFORMATICS 2020; 18:593-600. [PMID: 33099033 PMCID: PMC8377379 DOI: 10.1016/j.gpb.2019.04.004] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 08/29/2018] [Revised: 01/13/2019] [Accepted: 04/19/2019] [Indexed: 12/17/2022]
Abstract
Linear B-cell epitopes are critically important for immunological applications, such as vaccine design, immunodiagnostic test, and antibody production, as well as disease diagnosis and therapy. The accurate identification of linear B-cell epitopes remains challenging despite several decades of research. In this work, we have developed a novel predictor, Identification of Linear B-cell Epitope (iLBE), by integrating evolutionary and sequence-based features. The successive feature vectors were optimized by a Wilcoxon-rank sum test. Then the random forest (RF) algorithm using the optimal consecutive feature vectors was applied to predict linear B-cell epitopes. We combined the RF scores by the logistic regression to enhance the prediction accuracy. iLBE yielded an area under curve score of 0.809 on the training dataset and outperformed other prediction models on a comprehensive independent dataset. iLBE is a powerful computational tool to identify the linear B-cell epitopes and would help to develop penetrating diagnostic tests. A web application with curated datasets for iLBE is freely accessible at http://kurata14.bio.kyutech.ac.jp/iLBE/.
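Combining several random forest scores by logistic regression, as described in this abstract, amounts to passing a weighted sum of the scores through a sigmoid. The sketch below shows only that final step; the function name, weights and bias are hypothetical placeholders, whereas in practice they would be fitted on training data.

```python
import math

def combine_scores(scores, weights, bias):
    """Logistic combination: sigmoid(bias + sum_i weights[i] * scores[i])."""
    z = bias + sum(w * s for w, s in zip(weights, scores))
    # Numerically stable sigmoid for large positive or negative z.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)

# Two per-encoding RF probabilities for one candidate peptide (toy values).
rf_scores = [0.82, 0.64]
probability = combine_scores(rf_scores, weights=[2.0, 1.5], bias=-1.8)
```

Letting a second-stage model weight the per-encoding scores allows more reliable feature types to contribute more than weaker ones, instead of averaging them equally.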
Affiliation(s)
- Md Mehedi Hasan
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, Iizuka, Fukuoka 820-8502, Japan
| | - Mst Shamima Khatun
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, Iizuka, Fukuoka 820-8502, Japan
| | - Hiroyuki Kurata
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, Iizuka, Fukuoka 820-8502, Japan; Biomedical Informatics R&D Center, Kyushu Institute of Technology, Iizuka, Fukuoka 820-8502, Japan.
| |
|
24
|
Charoenkwan P, Kanthawong S, Nantasenamat C, Hasan MM, Shoombuatong W. iAMY-SCM: Improved prediction and analysis of amyloid proteins using a scoring card method with propensity scores of dipeptides. Genomics 2020; 113:689-698. [PMID: 33017626 DOI: 10.1016/j.ygeno.2020.09.065] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 07/23/2020] [Revised: 09/21/2020] [Accepted: 09/30/2020] [Indexed: 01/09/2023]
Abstract
Fast, accurate identification and characterization of amyloid proteins at a large scale is essential for understanding their role in therapeutic intervention strategies. As a matter of fact, there exists only one in silico model for amyloid protein identification, namely RFAmy, which uses the random forest (RF) model in conjunction with various feature types. However, it suffers from low interpretability for biologists. Thus, it is highly desirable to develop a simple and easily interpretable prediction method with robust accuracy as compared to the existing complicated model. In this study, we propose iAMY-SCM, the first scoring card method-based predictor for predicting and analyzing amyloid proteins. Herein, iAMY-SCM made use of a simple weighted-sum function in conjunction with the propensity scores of dipeptides for amyloid protein identification. Cross-validation results indicated that iAMY-SCM provided an accuracy of 0.895, which corresponded to 10-22% higher performance than that of widely used machine learning models. Furthermore, iAMY-SCM achieved an accuracy of 0.827 as evaluated by an independent test, which was found to be comparable to that of RFAmy and was approximately 9-13% higher than widely used machine learning models. Furthermore, an analysis of the estimated propensity scores of amino acids and dipeptides was performed to provide insights into the biophysical and biochemical properties of amyloid proteins. As such, this demonstrates that the proposed iAMY-SCM is efficient and reliable in terms of simplicity, interpretability and implementation. To facilitate ease of use of the proposed iAMY-SCM, a user-friendly and publicly accessible web server at http://camt.pythonanywhere.com/iAMY-SCM has been established. We anticipate that iAMY-SCM will be an important tool for facilitating the large-scale prediction and characterization of amyloid proteins.
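The "simple weighted-sum function" with dipeptide propensity scores can be pictured with the sketch below. It is an illustration of the general scoring card idea, not the iAMY-SCM code: the propensity values, the threshold, the default of 0 for unlisted dipeptides and the function names are all made up for the example.

```python
def scm_score(sequence, propensity):
    """Average propensity over all overlapping dipeptides of the sequence,
    which equals the sum of dipeptide frequency times propensity score."""
    n_dipeptides = len(sequence) - 1
    if n_dipeptides <= 0:
        return 0.0
    total = sum(propensity.get(sequence[i:i + 2], 0.0)
                for i in range(n_dipeptides))
    return total / n_dipeptides

def classify(sequence, propensity, threshold):
    """Predict the positive class when the score exceeds the threshold."""
    return scm_score(sequence, propensity) > threshold

# Hypothetical propensity table; a real table covers all 400 dipeptides.
toy_propensity = {"VV": 900.0, "VI": 800.0, "KE": 100.0}
score = scm_score("VVI", toy_propensity)
is_positive = classify("VVI", toy_propensity, threshold=500.0)
```

Because the prediction is a plain weighted sum, the contribution of each dipeptide can be read directly from the table, which is the interpretability advantage the abstract emphasises.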
Affiliation(s)
- Phasit Charoenkwan
- Modern Management and Information Technology, College of Arts, Media and Technology, Chiang Mai University, Chiang Mai 50200, Thailand
| | - Sakawrat Kanthawong
- Department of Microbiology, Faculty of Medicine, Khon Kaen University, Khon Kaen 40002, Thailand
| | - Chanin Nantasenamat
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand
| | - Md Mehedi Hasan
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan
| | - Watshara Shoombuatong
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand.
| |
|
25
|
Khatun MS, Hasan MM, Shoombuatong W, Kurata H. ProIn-Fuse: improved and robust prediction of proinflammatory peptides by fusing of multiple feature representations. J Comput Aided Mol Des 2020; 34:1229-1236. [DOI: 10.1007/s10822-020-00343-9] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/10/2020] [Accepted: 09/16/2020] [Indexed: 12/11/2022]
|
26
|
Charoenkwan P, Kanthawong S, Nantasenamat C, Hasan MM, Shoombuatong W. iDPPIV-SCM: A Sequence-Based Predictor for Identifying and Analyzing Dipeptidyl Peptidase IV (DPP-IV) Inhibitory Peptides Using a Scoring Card Method. J Proteome Res 2020; 19:4125-4136. [PMID: 32897718 DOI: 10.1021/acs.jproteome.0c00590] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]
Abstract
The inhibition of dipeptidyl peptidase IV (DPP-IV, E.C.3.4.14.5) is well recognized as a new avenue for the treatment of Type 2 diabetes (T2D). To date, peptide-like DPP-IV inhibitors have been shown to normalize the blood glucose concentration in T2D subjects. To the best of our knowledge, there is as yet no computational model for predicting and analyzing DPP-IV inhibitory peptides using sequence information. In this study, we present for the first time a simple and easily interpretable sequence-based predictor using the scoring card method (SCM) for modeling the bioactivity of DPP-IV inhibitory peptides (iDPPIV-SCM). In particular, iDPPIV-SCM was developed by employing the SCM together with the propensity scores of amino acids. Rigorous independent test results demonstrated that the proposed iDPPIV-SCM was superior to well-known machine learning (ML) classifiers (e.g., k-nearest neighbor, logistic regression, and decision tree), with improvements of 2-11, 4-22, and 7-10% for accuracy, MCC, and AUC, respectively, while also achieving results comparable to those of the support vector machine. Furthermore, the estimated propensity scores of amino acids derived from iDPPIV-SCM were analyzed to provide a more in-depth understanding of the molecular basis for enhancing DPP-IV inhibitory potency. Taken together, these results revealed that iDPPIV-SCM was superior to other well-known ML classifiers owing to its simplicity, interpretability, and validity. For the convenience of biologists, the predictive model is deployed as a publicly accessible web server at http://camt.pythonanywhere.com/iDPPIV-SCM. It is anticipated that iDPPIV-SCM can serve as an important tool for the rapid screening of promising DPP-IV inhibitory peptides prior to their synthesis.
Affiliation(s)
- Phasit Charoenkwan
- Modern Management and Information Technology, College of Arts, Media and Technology, Chiang Mai University, Chiang Mai 50200, Thailand
| | - Sakawrat Kanthawong
- Department of Microbiology, Faculty of Medicine, Khon Kaen University, Khon Kaen 40002, Thailand
| | - Chanin Nantasenamat
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand
| | - Md Mehedi Hasan
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan
| | - Watshara Shoombuatong
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand
| |
|
27
|
Khatun MS, Shoombuatong W, Hasan MM, Kurata H. Evolution of Sequence-based Bioinformatics Tools for Protein-protein Interaction Prediction. Curr Genomics 2020; 21:454-463. [PMID: 33093807 PMCID: PMC7536797 DOI: 10.2174/1389202921999200625103936] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/17/2020] [Revised: 03/19/2020] [Accepted: 05/27/2020] [Indexed: 12/22/2022] Open
Abstract
Protein-protein interactions (PPIs) are the physical connections between two or more proteins via electrostatic forces or hydrophobic effects. Identifying PPIs is pivotal because they contribute to many biological processes, including protein function, disease incidence, and therapy design. The experimental identification of PPIs via high-throughput technology is time-consuming and expensive. Bioinformatics approaches are expected to overcome such restrictions. In this review, our main goal is to provide an inclusive view of the existing sequence-based computational prediction of PPIs. Initially, we briefly introduce the currently available PPI databases and then review the state-of-the-art bioinformatics approaches, their working principles, and their performances. Finally, we discuss the caveats and future perspectives of the next-generation algorithms for the prediction of PPIs.
Affiliation(s)
| | | | - Md. Mehedi Hasan
- Address correspondence to these authors at the Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; Japan Society for the Promotion of Science, 5-3-1 Kojimachi, Chiyoda-ku, Tokyo 102-0083, Japan; Tel: +81-948-297-828; E-mail: and Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; Biomedical Informatics R&D Center, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; Tel: +81-948-297-828; E-mail:
| | - Hiroyuki Kurata
- Address correspondence to these authors at the Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; Japan Society for the Promotion of Science, 5-3-1 Kojimachi, Chiyoda-ku, Tokyo 102-0083, Japan; Tel: +81-948-297-828; E-mail: and Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; Biomedical Informatics R&D Center, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; Tel: +81-948-297-828; E-mail:
| |
|
28
|
Li X, Zhang C, Zhao T, Su Z, Li M, Hu J, Wen J, Shen J, Wang C, Pan J, Mu X, Ling T, Li Y, Wen H, Zhang X, You Q. Lysine-222 succinylation reduces lysosomal degradation of lactate dehydrogenase a and is increased in gastric cancer. JOURNAL OF EXPERIMENTAL & CLINICAL CANCER RESEARCH : CR 2020; 39:172. [PMID: 32859246 PMCID: PMC7455916 DOI: 10.1186/s13046-020-01681-0] [Citation(s) in RCA: 55] [Impact Index Per Article: 13.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 05/17/2020] [Accepted: 08/17/2020] [Indexed: 01/07/2023]
Abstract
Background: Lysine succinylation is an emerging posttranslational modification that has garnered increased attention recently, but its role in gastric cancer (GC) remains underexplored. Methods: Proteomic quantification of lysine succinylation was performed in human GC tissues and adjacent normal tissues by mass spectrometry. The mRNA and protein levels of lactate dehydrogenase A (LDHA) in GC and adjacent normal tissues were analyzed by qRT-PCR and western blot, respectively. The expression of K222-succinylated LDHA was measured in a GC tissue microarray using a K222 succinylation-specific antibody. The interaction between LDHA and sequestosome 1 (SQSTM1) was measured by co-immunoprecipitation (co-IP) and proximity ligation assay (PLA). The binding of carnitine palmitoyltransferase 1A (CPT1A) to LDHA was determined by co-IP. The effect of K222-succinylated LDHA on tumor growth and metastasis was evaluated by in vitro and in vivo experiments. Results: Altogether, 503 lysine succinylation sites in 303 proteins were identified. LDHA, the key enzyme in the Warburg effect, was found to be highly succinylated at K222 in GC. Intriguingly, this modification did not affect LDHA ubiquitination, but reduced the binding of ubiquitinated LDHA to SQSTM1, thereby decreasing its lysosomal degradation. We demonstrated that CPT1A functions as a lysine succinyltransferase that interacts with and succinylates LDHA. Moreover, high K222-succinylation of LDHA was associated with poor prognosis in patients with GC. Finally, overexpression of a succinylation-mimic mutant of LDHA promoted cell proliferation, invasion, and migration. Conclusions: Our data revealed a novel lysosomal pathway of LDHA degradation, which is mediated by the binding of K63-ubiquitinated LDHA to SQSTM1. Strikingly, CPT1A succinylates LDHA on K222, which reduces this binding and inhibits the degradation of LDHA, and promotes GC invasion and proliferation. This study thus uncovers a new role of lysine succinylation and the mechanism underlying LDHA upregulation in GC.
Affiliation(s)
- Xiang Li
- Affiliated Cancer Hospital & Institute of Guangzhou Medical University, Guangzhou, 510095, China; Department of Biotherapy, Department of Surgery, Second Affiliated Hospital of Nanjing Medical University, Nanjing, 210011, China; Nanjing University of Chinese Medicine, Nanjing, 210023, China
| | - Chen Zhang
- Department of Biotherapy, Department of Surgery, Second Affiliated Hospital of Nanjing Medical University, Nanjing, 210011, China
| | - Ting Zhao
- Affiliated Cancer Hospital & Institute of Guangzhou Medical University, Guangzhou, 510095, China
| | - Zhongping Su
- Department of Biotherapy, Department of Surgery, Second Affiliated Hospital of Nanjing Medical University, Nanjing, 210011, China
| | - Mengjing Li
- Department of Biotherapy, Department of Surgery, Second Affiliated Hospital of Nanjing Medical University, Nanjing, 210011, China
| | - Jiancheng Hu
- Division of Cellular and Molecular Research, National Cancer Centre Singapore, Singapore, 169610, Singapore; Cancer and Stem Cell Biology Program, Duke-NUS Medical School, Singapore, 169857, Singapore
| | - Jianfei Wen
- The First Affiliated Hospital of Nanjing Medical University, Nanjing, 210029, China
| | - Jiajia Shen
- The First Affiliated Hospital of Nanjing Medical University, Nanjing, 210029, China
| | - Chao Wang
- Department of Biotherapy, Department of Surgery, Second Affiliated Hospital of Nanjing Medical University, Nanjing, 210011, China
| | - Jinshun Pan
- Department of Biotherapy, Department of Surgery, Second Affiliated Hospital of Nanjing Medical University, Nanjing, 210011, China
| | - Xianmin Mu
- Department of Biotherapy, Department of Surgery, Second Affiliated Hospital of Nanjing Medical University, Nanjing, 210011, China
| | - Tao Ling
- Department of Biotherapy, Department of Surgery, Second Affiliated Hospital of Nanjing Medical University, Nanjing, 210011, China
| | - Yingchang Li
- Affiliated Cancer Hospital & Institute of Guangzhou Medical University, Guangzhou, 510095, China
| | - Hao Wen
- Department of Biotherapy, Department of Surgery, Second Affiliated Hospital of Nanjing Medical University, Nanjing, 210011, China
| | - Xiaoren Zhang
- Affiliated Cancer Hospital & Institute of Guangzhou Medical University, Guangzhou, 510095, China; Key Laboratory of Cell Homeostasis and Cancer Research of Guangdong Higher Education Institutes, Guangzhou Medical University, Guangzhou, 510182, China
| | - Qiang You
- Affiliated Cancer Hospital & Institute of Guangzhou Medical University, Guangzhou, 510095, China; Department of Biotherapy, Department of Surgery, Second Affiliated Hospital of Nanjing Medical University, Nanjing, 210011, China; Key Laboratory of Cell Homeostasis and Cancer Research of Guangdong Higher Education Institutes, Guangzhou Medical University, Guangzhou, 510182, China.
| |
|
29
|
iBitter-SCM: Identification and characterization of bitter peptides using a scoring card method with propensity scores of dipeptides. Genomics 2020; 112:2813-2822. [DOI: 10.1016/j.ygeno.2020.03.019] [Citation(s) in RCA: 58] [Impact Index Per Article: 14.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/21/2020] [Revised: 03/19/2020] [Accepted: 03/22/2020] [Indexed: 12/21/2022]
|
30
|
Mao M, Xue Y, He Y, Zhou X, Rafique F, Hu H, Liu J, Feng L, Yang W, Li X, Sun L, Huang Z, Ma J. Systematic identification and comparative analysis of lysine succinylation between the green and white parts of chimeric leaves of Ananas comosus var. bracteatus. BMC Genomics 2020; 21:383. [PMID: 32493214 PMCID: PMC7268518 DOI: 10.1186/s12864-020-6750-6] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/29/2019] [Accepted: 04/21/2020] [Indexed: 01/26/2023] Open
Abstract
Background: Lysine succinylation, an important protein posttranslational modification (PTM), is widespread and conserved. Regulatory functions of succinylation in leaf color have been reported. The chimeric leaves of Ananas comosus var. bracteatus are composed of normal green parts and albino white parts. However, the extent and function of lysine succinylation in chimeric leaves of Ananas comosus var. bracteatus have yet to be investigated. Results: Compared to the green (Gr) parts, the global succinylation level was increased in the white (Wh) parts of chimeric leaves according to Western blot and immunohistochemistry analysis. Furthermore, we quantified the change in the succinylation profiles between the Wh and Gr parts of chimeric leaves using label-free LFQ intensity. In total, 855 succinylated sites in 335 proteins were identified, and 593 succinylated sites in 237 proteins were quantified. Compared to the Gr parts, 232 (61.1%) sites in 128 proteins were quantified as upregulated targets, and 148 (38.9%) sites in 70 proteins were quantified as downregulated targets in the Wh parts of chimeric leaves using a 1.5-fold threshold (P < 0.05). The proteins with altered succinylation levels were mainly involved in crassulacean acid metabolism (CAM) photosynthesis, photorespiration, glycolysis, the citric acid cycle (CAC) and pyruvate metabolism. Conclusions: Our results suggest that the changed succinylation levels in proteins might function in the main energy metabolism pathways: photosynthesis and respiration. Succinylation might have a significant effect on the growth of chimeric leaves and on the relationship between the Wh and Gr parts of chimeric leaves. This study not only provides a basis for further characterization of the function of succinylated proteins in chimeric leaves of Ananas comosus var. bracteatus but also provides new insight into molecular breeding for leaf color chimeras.
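The thresholding step reported in this abstract can be sketched as follows. This is only an illustration: it assumes that "downregulated" means a Wh/Gr ratio at or below 1/1.5, which the abstract does not state, and the function name is hypothetical. The last lines check that the reported percentages are shares of the 380 changed sites (232 + 148), not of the 593 quantified sites.

```python
def classify_site(ratio_wh_over_gr, p_value, fold=1.5, alpha=0.05):
    """Classify one succinylation site by fold change and significance.

    Assumption: the down threshold is the reciprocal of the up threshold.
    """
    if p_value >= alpha:
        return "unchanged"
    if ratio_wh_over_gr >= fold:
        return "up"
    if ratio_wh_over_gr <= 1.0 / fold:
        return "down"
    return "unchanged"


# Counts taken from the abstract.
up_sites, down_sites = 232, 148
changed = up_sites + down_sites
up_pct = round(100 * up_sites / changed, 1)
down_pct = round(100 * down_sites / changed, 1)
```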
Affiliation(s)
- Meiqin Mao
- College of Landscape Architecture, Sichuan Agricultural University, Chengdu, China
| | - Yanbin Xue
- College of Landscape Architecture, Sichuan Agricultural University, Chengdu, China
| | - Yehua He
- Horticultural Biotechnology College, South China Agricultural University, Guangzhou, China
| | - Xuzixing Zhou
- College of Landscape Architecture, Sichuan Agricultural University, Chengdu, China
| | - Fatima Rafique
- College of Landscape Architecture, Sichuan Agricultural University, Chengdu, China
| | - Hao Hu
- College of Landscape Architecture, Sichuan Agricultural University, Chengdu, China
| | - Jiawen Liu
- College of Landscape Architecture, Sichuan Agricultural University, Chengdu, China
| | - Lijun Feng
- College of Landscape Architecture, Sichuan Agricultural University, Chengdu, China
| | - Wei Yang
- College of Landscape Architecture, Sichuan Agricultural University, Chengdu, China
| | - Xi Li
- College of Landscape Architecture, Sichuan Agricultural University, Chengdu, China
| | - Lingxia Sun
- College of Landscape Architecture, Sichuan Agricultural University, Chengdu, China
| | - Zhuo Huang
- College of Landscape Architecture, Sichuan Agricultural University, Chengdu, China
| | - Jun Ma
- College of Landscape Architecture, Sichuan Agricultural University, Chengdu, China.
| |
|
31
|
Hasan MM, Manavalan B, Shoombuatong W, Khatun MS, Kurata H. i6mA-Fuse: improved and robust prediction of DNA 6 mA sites in the Rosaceae genome by fusing multiple feature representation. PLANT MOLECULAR BIOLOGY 2020; 103:225-234. [PMID: 32140819 DOI: 10.1007/s11103-020-00988-y] [Citation(s) in RCA: 47] [Impact Index Per Article: 11.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/06/2020] [Accepted: 02/29/2020] [Indexed: 05/28/2023]
Abstract
DNA N6-methyladenine (6 mA) is one of the most vital epigenetic modifications and is involved in controlling the expression levels of various genes. With the avalanche of DNA sequences generated in numerous databases, the accurate identification of 6 mA plays an essential role in understanding molecular mechanisms. Because the experimental approaches are time-consuming and costly, it is desirable to develop a computational model for rapidly and accurately identifying 6 mA. To the best of our knowledge, we are the first to propose a computational model, named i6mA-Fuse, to predict 6 mA sites from the Rosaceae genomes, specifically in Rosa chinensis and Fragaria vesca. We implemented five encoding schemes, i.e., mononucleotide binary, dinucleotide binary, k-space spectral nucleotide, k-mer, and electron-ion interaction pseudo potential compositions, to build five single-encoding random forest (RF) models. The i6mA-Fuse uses a linear regression model to combine the predicted probability scores of the five single-encoding RF models. The resultant species-specific i6mA-Fuse achieved remarkably high performances, with AUCs of 0.982 and 0.978 and MCCs of 0.869 and 0.858 on the independent datasets of Rosa chinensis and Fragaria vesca, respectively. In the F. vesca-specific i6mA-Fuse, MBE and EIIP contributed 75% and 25% of the total prediction; in the R. chinensis-specific i6mA-Fuse, Kmer, MBE, and EIIP contributed 15%, 65%, and 20% of the total prediction. To assist high-throughput prediction for DNA 6 mA identification, the i6mA-Fuse is publicly accessible at https://kurata14.bio.kyutech.ac.jp/i6mA-Fuse/.
Affiliation(s)
- Md Mehedi Hasan
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka, 820-8502, Japan
- Japan Society for the Promotion of Science, 5-3-1 Kojimachi, Chiyoda-ku, Tokyo, 102-0083, Japan
| | | | - Watshara Shoombuatong
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, 10700, Thailand
| | - Mst Shamima Khatun
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka, 820-8502, Japan
| | - Hiroyuki Kurata
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka, 820-8502, Japan.
- Biomedical Informatics R&D Center, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka, 820-8502, Japan.
| |
|
32
|
i4mC-Mouse: Improved identification of DNA N4-methylcytosine sites in the mouse genome using multiple encoding schemes. Comput Struct Biotechnol J 2020; 18:906-912. [PMID: 32322372 PMCID: PMC7168350 DOI: 10.1016/j.csbj.2020.04.001] [Citation(s) in RCA: 39] [Impact Index Per Article: 9.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/07/2020] [Revised: 03/31/2020] [Accepted: 04/03/2020] [Indexed: 12/12/2022] Open
Abstract
N4-methylcytosine (4mC) is one of the most important DNA modifications and is involved in regulating cell differentiation and gene expression. The accurate identification of 4mC sites is necessary to understand various biological functions. In this work, we developed a new computational predictor called i4mC-Mouse to identify 4mC sites in the mouse genome. Six encoding schemes that cover different characteristics of DNA sequence information were explored: k-space nucleotide composition (KSNC), k-mer nucleotide composition (Kmer), mono nucleotide binary encoding (MBE), dinucleotide binary encoding, electron–ion interaction pseudo potentials (EIIP) and dinucleotide physicochemical composition. Subsequently, we built six RF-based encoding models and then linearly combined their probability scores to construct the final predictor. Among the six RF-based models, the Kmer, KSNC, MBE, and EIIP encodings were sufficient, contributing 10%, 45%, 25%, and 20% of the prediction performance, respectively. On the independent test, i4mC-Mouse predicted the 4mC sites with accuracy and MCC of 0.816 and 0.633, respectively, which were approximately 2.5% and 5% higher than those of the existing method (4mCpred-EL). For experimental biologists, a freely available web application was implemented at http://kurata14.bio.kyutech.ac.jp/i4mC-Mouse/.
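The linear combination of per-encoding probability scores can be sketched as below. The sketch treats the contributions quoted in the abstract (10%, 45%, 25%, 20%) as the weights of a weighted sum, which is an interpretation rather than something the abstract confirms. The function name, the example probabilities, and the 0.5 decision cutoff are hypothetical.

```python
# Weights read from the abstract's reported contributions (an assumption).
WEIGHTS = {"Kmer": 0.10, "KSNC": 0.45, "MBE": 0.25, "EIIP": 0.20}


def fused_score(probabilities, weights=WEIGHTS):
    """Combine per-encoding model probabilities into one weighted score."""
    missing = set(weights) - set(probabilities)
    if missing:
        raise ValueError(f"missing model outputs: {sorted(missing)}")
    return sum(weights[name] * probabilities[name] for name in weights)


# Made-up probabilities from four single-encoding models for one site.
example = {"Kmer": 0.60, "KSNC": 0.80, "MBE": 0.70, "EIIP": 0.50}
score = fused_score(example)
is_4mc = score >= 0.5  # hypothetical decision cutoff
```

Because the weights sum to 1, the fused score stays in the range 0 to 1 whenever the inputs are probabilities.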
|
33
|
Rashid MM, Shatabda S, Hasan MM, Kurata H. Recent Development of Machine Learning Methods in Microbial Phosphorylation Sites. Curr Genomics 2020; 21:194-203. [PMID: 33071613 PMCID: PMC7521030 DOI: 10.2174/1389202921666200427210833] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/16/2020] [Revised: 04/12/2020] [Accepted: 04/13/2020] [Indexed: 01/10/2023] Open
Abstract
A variety of protein post-translational modifications has been identified that control many cellular functions. Phosphorylation studies in mycobacterial organisms have shown critical importance in diverse biological processes, such as intercellular communication and cell division. Recent technical advances in high-precision mass spectrometry have determined a large number of microbial phosphorylated proteins and phosphorylation sites throughout the proteome analysis. Identification of phosphorylated proteins with specific modified residues through experimentation is often labor-intensive, costly and time-consuming. All these limitations could be overcome through the application of machine learning (ML) approaches. However, only a limited number of computational phosphorylation site prediction tools have been developed so far. This work aims to present a complete survey of the existing ML-predictors for microbial phosphorylation. We cover a variety of important aspects for developing a successful predictor, including operating ML algorithms, feature selection methods, window size, and software utility. Initially, we review the currently available phosphorylation site databases of the microbiome, the state-of-the-art ML approaches, working principles, and their performances. Lastly, we discuss the limitations and future directions of the computational ML methods for the prediction of phosphorylation.
Affiliation(s)
| | | | - Md. Mehedi Hasan
- Address correspondence to these authors at the Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; Tel: +81-948-297-828;, E-mail: and Biomedical Informatics R&D Center, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; Tel: +81-948-297-828; E-mail:
| | - Hiroyuki Kurata
- Address correspondence to these authors at the Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; Tel: +81-948-297-828;, E-mail: and Biomedical Informatics R&D Center, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; Tel: +81-948-297-828; E-mail:
| |
|
34
|
Mosharaf MP, Hassan MM, Ahmed FF, Khatun MS, Moni MA, Mollah MNH. Computational prediction of protein ubiquitination sites mapping on Arabidopsis thaliana. Comput Biol Chem 2020; 85:107238. [DOI: 10.1016/j.compbiolchem.2020.107238] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/16/2018] [Revised: 01/22/2020] [Accepted: 02/18/2020] [Indexed: 02/06/2023]
|
35
|
Charoenkwan P, Kanthawong S, Schaduangrat N, Yana J, Shoombuatong W. PVPred-SCM: Improved Prediction and Analysis of Phage Virion Proteins Using a Scoring Card Method. Cells 2020; 9:E353. [PMID: 32028709 PMCID: PMC7072630 DOI: 10.3390/cells9020353] [Citation(s) in RCA: 39] [Impact Index Per Article: 9.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/26/2019] [Revised: 01/20/2020] [Accepted: 01/27/2020] [Indexed: 12/16/2022] Open
Abstract
Although existing methods have been successful in predicting phage (or bacteriophage) virion proteins (PVPs) using various types of protein features and complex classifiers, such as support vector machine and naïve Bayes, these two classifiers do not allow interpretability. However, the characterization and analysis of PVPs might be of great significance to understanding the molecular mechanisms of bacteriophage genetics and the development of antibacterial drugs. Hence, we herein propose a novel method (PVPred-SCM) based on the scoring card method (SCM) in conjunction with dipeptide composition to identify and characterize PVPs. In PVPred-SCM, the propensity scores of 400 dipeptides were calculated using the statistical discrimination approach. A rigorous independent validation test showed that PVPred-SCM utilizing only dipeptide composition yielded an accuracy of 77.56%, indicating that PVPred-SCM performed well relative to the state-of-the-art method utilizing a number of protein features. Furthermore, the propensity scores of dipeptides were used to provide insights into the biochemical and biophysical properties of PVPs. Upon comparison, it was found that PVPred-SCM was superior to the existing methods in terms of simplicity, interpretability, and implementation. Finally, in an effort to facilitate high-throughput prediction of PVPs, we provide a user-friendly web server for estimating the likelihood that query sequences are PVPs. It is anticipated that PVPred-SCM will become a useful tool, or at least a complement to existing methods, for predicting and analyzing PVPs.
Affiliation(s)
- Phasit Charoenkwan
- Modern Management and Information Technology, College of Arts, Media and Technology, Chiang Mai University, Chiang Mai 50200, Thailand;
| | - Sakawrat Kanthawong
- Department of Microbiology, Faculty of Medicine, Khon Kaen University, Khon Kaen 40002, Thailand;
| | - Nalini Schaduangrat
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand;
| | - Janchai Yana
- Department of Chemistry, Faculty of Science and Technology, Chiang Mai Rajabhat University, Chiang Mai 50300, Thailand;
| | - Watshara Shoombuatong
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand;
| |
|
36
|
Basith S, Manavalan B, Hwan Shin T, Lee G. Machine intelligence in peptide therapeutics: A next‐generation tool for rapid disease screening. Med Res Rev 2020; 40:1276-1314. [DOI: 10.1002/med.21658] [Citation(s) in RCA: 139] [Impact Index Per Article: 34.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/18/2019] [Revised: 11/26/2019] [Accepted: 12/16/2019] [Indexed: 12/12/2022]
Affiliation(s)
- Shaherin Basith
- Department of Physiology, Ajou University School of Medicine, Suwon, Republic of Korea
| | | | - Tae Hwan Shin
- Department of Physiology, Ajou University School of Medicine, Suwon, Republic of Korea
| | - Gwang Lee
- Department of Physiology, Ajou University School of Medicine, Suwon, Republic of Korea
| |
|
37
|
Hasan MM, Manavalan B, Khatun MS, Kurata H. i4mC-ROSE, a bioinformatics tool for the identification of DNA N4-methylcytosine sites in the Rosaceae genome. Int J Biol Macromol 2019; 157:752-758. [PMID: 31805335 DOI: 10.1016/j.ijbiomac.2019.12.009] [Citation(s) in RCA: 60] [Impact Index Per Article: 12.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/22/2019] [Revised: 11/29/2019] [Accepted: 12/02/2019] [Indexed: 12/18/2022]
Abstract
One of the most important epigenetic modifications is N4-methylcytosine, which regulates many biological processes including DNA replication and chromosome stability. Identification of N4-methylcytosine sites is pivotal to understanding specific biological functions. Herein, we developed the first bioinformatics tool, called i4mC-ROSE, for identifying N4-methylcytosine sites in the genomes of Fragaria vesca and Rosa chinensis in the Rosaceae. It utilizes a random forest classifier with six encoding methods that cover various aspects of DNA sequence information. The i4mC-ROSE predictor achieves area under the curve scores of 0.883 and 0.889 for the two genomes during cross-validation. Moreover, i4mC-ROSE outperforms the other classifiers tested in this study when objectively evaluated on the independent datasets. The proposed i4mC-ROSE tool can meet users' demand for the prediction of 4mC sites in the Rosaceae genome. The i4mC-ROSE predictor and the datasets used are publicly accessible at http://kurata14.bio.kyutech.ac.jp/i4mC-ROSE/.
Affiliation(s)
- Md Mehedi Hasan
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; Japan Society for the Promotion of Science, 5-3-1 Kojimachi, Chiyoda-ku, Tokyo 102-0083, Japan
| | - Balachandran Manavalan
- Department of Physiology, Ajou University School of Medicine, Suwon 443380, Republic of Korea
| | - Mst Shamima Khatun
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan
| | - Hiroyuki Kurata
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; Biomedical Informatics R&D Center, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan.
| |
|
38
|
Huang KY, Hsu JBK, Lee TY. Characterization and Identification of Lysine Succinylation Sites based on Deep Learning Method. Sci Rep 2019; 9:16175. [PMID: 31700141 PMCID: PMC6838336 DOI: 10.1038/s41598-019-52552-4] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/09/2019] [Accepted: 10/18/2019] [Indexed: 12/14/2022] Open
Abstract
Succinylation is a type of protein post-translational modification (PTM) that can play important roles in a variety of cellular processes. Due to the increasing number of site-specific succinylated peptides obtained from high-throughput mass spectrometry (MS), various tools have been developed for computationally identifying succinylated sites on proteins. However, most of these tools predict succinylation sites based on traditional machine learning methods. Hence, this work aimed to carry out succinylation site prediction based on a deep learning model. The abundance of MS-verified succinylated peptides enabled the investigation of substrate site specificity of succinylation sites through sequence-based attributes, such as position-specific amino acid composition, the composition of k-spaced amino acid pairs (CKSAAP), and position-specific scoring matrix (PSSM). Additionally, maximal dependence decomposition (MDD) was adopted to detect the substrate signatures of lysine succinylation sites by dividing all succinylated sequences into several groups with conserved substrate motifs. According to the results of ten-fold cross-validation, the deep learning model trained using PSSM and informative CKSAAP attributes reached the best predictive performance and also performed better than traditional machine-learning methods. Moreover, an independent testing dataset with no overlap with the training dataset was used to compare the proposed method with six existing prediction tools. The testing dataset comprised 218 positive and 2621 negative instances, and the proposed model yielded a promising performance with 84.40% sensitivity, 86.99% specificity, 86.79% accuracy, and an MCC value of 0.489. Finally, the proposed method has been implemented as a web-based prediction tool (CNN-SuccSite), which is now freely accessible at http://csb.cse.yzu.edu.tw/CNN-SuccSite/.
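The independent-test figures quoted above can be cross-checked with a short calculation. The sketch reconstructs the confusion matrix by rounding the reported sensitivity and specificity against the stated class sizes, which is an assumption because the abstract does not give the raw counts. It then recomputes accuracy and the Matthews correlation coefficient (MCC) from the standard formula. The helper names are hypothetical.

```python
import math


def confusion_from_rates(n_pos, n_neg, sensitivity, specificity):
    """Rebuild (TP, FN, TN, FP) by rounding the reported rates to counts."""
    tp = round(n_pos * sensitivity)
    tn = round(n_neg * specificity)
    return tp, n_pos - tp, tn, n_neg - tn


def mcc(tp, fn, tn, fp):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Class sizes and rates as stated in the abstract.
tp, fn, tn, fp = confusion_from_rates(218, 2621, 0.8440, 0.8699)
accuracy = (tp + tn) / (218 + 2621)
mcc_value = mcc(tp, fn, tn, fp)
```

Under this reconstruction the recomputed accuracy and MCC land close to the reported 86.79% and 0.489. The MCC is much lower than the accuracy because negatives outnumber positives roughly twelve to one, so even a modest false-positive rate yields more false positives than true positives.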
Affiliation(s)
- Kai-Yao Huang
- Department of Medical Research, Hsinchu Mackay Memorial Hospital, Hsinchu city, 300, Taiwan
- Justin Bo-Kai Hsu
- Department of Medical Research, Taipei Medical University Hospital, Taipei city, 110, Taiwan
- Tzong-Yi Lee
- Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen, 518172, China; School of Life and Health Sciences, The Chinese University of Hong Kong, Shenzhen, 518172, China
|
39
|
Khatun S, Hasan M, Kurata H. Efficient computational model for identification of antitubercular peptides by integrating amino acid patterns and properties. FEBS Lett 2019; 593:3029-3039. [PMID: 31297788 DOI: 10.1002/1873-3468.13536] [Citation(s) in RCA: 30] [Impact Index Per Article: 6.0] [Received: 05/21/2019] [Revised: 06/25/2019] [Accepted: 07/05/2019] [Indexed: 12/30/2022]
Abstract
Tuberculosis (TB) is a leading killer caused by Mycobacterium tuberculosis. Recently, anti-TB peptides have provided an alternative approach to combat antibiotic tolerance. We have developed an effective computational predictor, identification of antitubercular peptides (iAntiTB), by the integration of multiple feature vectors deriving from the amino acid sequences via random forest (RF) and support vector machine (SVM) classifiers. The iAntiTB combines the RF and SVM scores via linear regression to enhance the prediction accuracy. To make a robust and accurate predictor, we prepared the two datasets with different types of negative samples. The iAntiTB achieved area under the ROC curve values of 0.896 and 0.946 on the training datasets of the first and second datasets, respectively. The iAntiTB outperformed the other existing predictors.
Affiliation(s)
- Shamima Khatun
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, Iizuka, Fukuoka, Japan
- Mehedi Hasan
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, Iizuka, Fukuoka, Japan
- Hiroyuki Kurata
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, Iizuka, Fukuoka, Japan; Biomedical Informatics R&D Center, Kyushu Institute of Technology, Iizuka, Fukuoka, Japan
|
40
|
Huang H, Huang Q, Tang T, Zhou X, Gu L, Lu X, Liu F. Differentially Expressed Gene Screening, Biological Function Enrichment, and Correlation with Prognosis in Non-Small Cell Lung Cancer. Med Sci Monit 2019; 25:4333-4341. [PMID: 31181055 PMCID: PMC6582684 DOI: 10.12659/msm.916962] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 11/12/2022]
Abstract
Background: The aim of this study was to explore the differentially expressed genes and pathways in non-small cell lung cancer (NSCLC) and their correlation with prognosis. Material/Methods: Gene expression data series GSE19804, GSE101929, and GSE33532 were downloaded from the Gene Expression Omnibus (GEO) database. The overlapping differentially expressed genes (DEGs) were identified from these 3 data series. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) were used to analyze the biological functions and signal pathways of the DEGs. The protein–protein interaction (PPI) network was analyzed through the Search Tool for the Retrieval of Interacting Genes (STRING). The relationship between the expression of hub genes and the prognosis of patients was analyzed with the Kaplan-Meier Plotter online software. Results: Twenty-nine DEGs were identified, with 22 upregulated genes and 7 downregulated genes. The enriched biological processes were mainly related to diet-induced thermogenesis and actin filament binding. The KEGG pathways were enriched in calcium signaling, regulation of lipolysis in adipocytes, and PPAR signaling. Two downregulated genes (MMP1 and SPP1) were identified as hub genes by Cytohubba. Twenty-two dysregulated genes were correlated with patient prognosis. Conclusions: Differentially expressed genes are common in NSCLC patients and can be used as biomarkers for patient prognosis.
Affiliation(s)
- He Huang
- Department of Respiratory, Zhejiang Hospital, Hangzhou, Zhejiang, China (mainland)
- Qingdong Huang
- Department of Respiratory, Zhejiang Hospital, Hangzhou, Zhejiang, China (mainland)
- Tingyu Tang
- Department of Respiratory, Zhejiang Hospital, Hangzhou, Zhejiang, China (mainland)
- Xiaoxi Zhou
- Department of Respiratory, Zhejiang Hospital, Hangzhou, Zhejiang, China (mainland)
- Liang Gu
- Department of Respiratory, Zhejiang Hospital, Hangzhou, Zhejiang, China (mainland)
- Xiaoling Lu
- Department of Respiratory, Zhejiang Hospital, Hangzhou, Zhejiang, China (mainland)
- Fang Liu
- Department of Respiratory, Zhejiang Hospital, Hangzhou, Zhejiang, China (mainland)
|
41
|
Hasan MM, Manavalan B, Khatun MS, Kurata H. Prediction of S-nitrosylation sites by integrating support vector machines and random forest. Mol Omics 2019; 15:451-458. [DOI: 10.1039/c9mo00098d] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Indexed: 12/31/2022]
Abstract
Cysteine S-nitrosylation is a type of reversible post-translational modification of proteins, which controls diverse biological processes.
Affiliation(s)
- Md. Mehedi Hasan
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, Iizuka, Japan; Japan Society for the Promotion of Science
- Mst. Shamima Khatun
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, Iizuka, Japan
- Hiroyuki Kurata
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, Iizuka, Japan; Biomedical Informatics R&D Center
|