1
Zheng Y, Yu K, Lin JF, Liang Z, Zhang Q, Li J, Wu QN, He CY, Lin M, Zhao Q, Zuo ZX, Ju HQ, Xu RH, Liu ZX. Deep learning prioritizes cancer mutations that alter protein nucleocytoplasmic shuttling to drive tumorigenesis. Nat Commun 2025; 16:2511. [PMID: 40087285 PMCID: PMC11909177 DOI: 10.1038/s41467-025-57858-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2023] [Accepted: 03/05/2025] [Indexed: 03/17/2025] [Open Access]
Abstract
Genetic variants can affect protein function by driving aberrant subcellular localization. However, comprehensive analysis of how mutations promote tumor progression by influencing nuclear localization is currently lacking. Here, we systematically characterize potential shuttling-attacking mutations (SAMs) across cancers by developing the deep learning model pSAM for the ab initio decoding of the sequence determinants of nucleocytoplasmic shuttling. Leveraging cancer mutations across 11 cancer types, we find that SAMs are enriched in functional genetic variations and critical genes in cancer. We experimentally validate a dozen SAMs; several of these, including R14M in PTEN and P255L in CHFR, are identified to disrupt nuclear localization signals by interfering with their interactions with importins. Further studies confirm that the nucleocytoplasmic shuttling altered by SAMs in PTEN and CHFR rewires the downstream signaling and eliminates their tumor-suppressive function. Thus, this study will help to understand the molecular traits of nucleocytoplasmic shuttling and their dysfunctions mediated by genetic variants.
Grants
- This study was supported by the National Key R&D Program of China [2021YFA1302100], National Natural Science Foundation of China [32370698, 81972239], Program for Guangdong Introducing Innovative and Entrepreneurial Teams [2017ZT07S096], Tip-Top Scientific and Technical Innovative Youth Talents of Guangdong Special Support Program [2019TQ05Y351], Young Talents Program of Sun Yat-sen University Cancer Center [YTP-SYSUCC-0029], Science and Technology Program of Guangzhou [202206080011], Guangdong Basic and Applied Basic Research Foundation [2023B1515040030] and CAMS Innovation Fund for Medical Sciences (CIFMS) [2019-I2M-5-036].
- This study was supported by the Chih Kuang Scholarship for Outstanding Young Physician-Scientists of Sun Yat-sen University Cancer Center [CKS-SYSUCC-2024009] and the Postdoctoral Science Foundation of China [2024M763801, GZB20240907].
- This study was supported by the Noncommunicable Chronic Diseases-National Science and Technology Major Project [2023ZD0501600], National Natural Science Foundation of China [82321003, 82173128] and Cancer Innovative Research Program of Sun Yat-sen University Cancer Center [CIRP-SYSUCC-0004].
Affiliation(s)
- Yongqiang Zheng
  - State Key Laboratory of Oncology in South China, Guangdong Key Laboratory of Nasopharyngeal Carcinoma Diagnosis and Therapy, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou, 510060, China
- Kai Yu
  - State Key Laboratory of Oncology in South China, Guangdong Key Laboratory of Nasopharyngeal Carcinoma Diagnosis and Therapy, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou, 510060, China
  - Department of Genomic Medicine, The University of Texas MD Anderson Cancer Center, Houston, 77030, USA
- Jin-Fei Lin
  - State Key Laboratory of Oncology in South China, Guangdong Key Laboratory of Nasopharyngeal Carcinoma Diagnosis and Therapy, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou, 510060, China
  - Department of Clinical Laboratory, Sun Yat-sen University Cancer Center, Guangzhou, 510060, China
- Zhuoran Liang
  - State Key Laboratory of Oncology in South China, Guangdong Key Laboratory of Nasopharyngeal Carcinoma Diagnosis and Therapy, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou, 510060, China
- Qingfeng Zhang
  - State Key Laboratory of Oncology in South China, Guangdong Key Laboratory of Nasopharyngeal Carcinoma Diagnosis and Therapy, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou, 510060, China
- Junteng Li
  - State Key Laboratory of Oncology in South China, Guangdong Key Laboratory of Nasopharyngeal Carcinoma Diagnosis and Therapy, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou, 510060, China
- Qi-Nian Wu
  - State Key Laboratory of Oncology in South China, Guangdong Key Laboratory of Nasopharyngeal Carcinoma Diagnosis and Therapy, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou, 510060, China
- Cai-Yun He
  - State Key Laboratory of Oncology in South China, Guangdong Key Laboratory of Nasopharyngeal Carcinoma Diagnosis and Therapy, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou, 510060, China
- Mei Lin
  - State Key Laboratory of Oncology in South China, Guangdong Key Laboratory of Nasopharyngeal Carcinoma Diagnosis and Therapy, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou, 510060, China
- Qi Zhao
  - State Key Laboratory of Oncology in South China, Guangdong Key Laboratory of Nasopharyngeal Carcinoma Diagnosis and Therapy, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou, 510060, China
- Zhi-Xiang Zuo
  - State Key Laboratory of Oncology in South China, Guangdong Key Laboratory of Nasopharyngeal Carcinoma Diagnosis and Therapy, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou, 510060, China
- Huai-Qiang Ju
  - State Key Laboratory of Oncology in South China, Guangdong Key Laboratory of Nasopharyngeal Carcinoma Diagnosis and Therapy, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou, 510060, China
- Rui-Hua Xu
  - State Key Laboratory of Oncology in South China, Guangdong Key Laboratory of Nasopharyngeal Carcinoma Diagnosis and Therapy, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou, 510060, China
  - Research Unit of Precision Diagnosis and Treatment for Gastrointestinal Cancer, Chinese Academy of Medical Sciences, Guangzhou, 510060, China
- Ze-Xian Liu
  - State Key Laboratory of Oncology in South China, Guangdong Key Laboratory of Nasopharyngeal Carcinoma Diagnosis and Therapy, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou, 510060, China
2
Zuo Y, Fang X, Chen J, Ji J, Li Y, Wu Z, Liu X, Zeng X, Deng Z, Yin H, Zhao A. MlyPredCSED: based on extreme point deviation compensated clustering combined with cross-scale convolutional neural networks to predict multiple lysine sites in human. Brief Bioinform 2025; 26:bbaf189. [PMID: 40285360 PMCID: PMC12031725 DOI: 10.1093/bib/bbaf189] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2024] [Revised: 03/27/2025] [Accepted: 04/03/2025] [Indexed: 04/29/2025] [Open Access]
Abstract
In post-translational modification, covalent bonds on lysine and attached chemical groups significantly change proteins' physical and chemical properties. They shape protein structures, enhance function and stability, and are vital for physiological processes, affecting health and disease through mechanisms like gene expression, signal transduction, protein degradation, and cell metabolism. Although lysine (K) modification sites are considered among the most common types of post-translational modifications in proteins, research on K-PTMs has largely overlooked the synergistic effects between different modifications and lacked the techniques to address the problem of sample imbalance. Based on this, the Extreme Point Deviation Compensated Clustering (EPDCC) Undersampling algorithm was proposed in this study and combined with Cross-Scale Convolutional Neural Networks (CSCNNs) to develop a novel computational tool, MlyPredCSED, for simultaneously predicting multiple lysine modification sites. MlyPredCSED employs Multi-Label Position-Specific Triad Amino Acid Propensity and the physicochemical properties of amino acids to enhance the richness of sequence information. To address the challenge of sample imbalance, the innovative EPDCC Undersampling technique was introduced to adjust the majority class samples. The model's training and testing phase relies on the advanced CSCNN framework. MlyPredCSED, through cross-validation and testing, outperformed existing models, especially in complex categories with multiple modification sites. This research not only provides an efficient method for the identification of lysine modification sites but also demonstrates its value in biological research and drug development. To facilitate efficient use of MlyPredCSED by researchers, we have specifically developed an accessible free web tool: http://www.mlypredcsed.com.
Affiliation(s)
- Yun Zuo
  - School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214000, China
- Xingze Fang
  - School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214000, China
- Jiankang Chen
  - School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214000, China
- Jiayi Ji
  - School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214000, China
- Yuwen Li
  - School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214000, China
- Zeyu Wu
  - School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214000, China
- Xiangrong Liu
  - Department of Computer Science and Technology, National Institute for Data Science in Health and Medicine, Xiamen Key Laboratory of Intelligent Storage and Computing, Xiamen University, Xiamen 361005, China
- Xiangxiang Zeng
  - School of Information Science and Engineering, Hunan University, Changsha, China
- Zhaohong Deng
  - School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214000, China
- Hongwei Yin
  - Department of Oncology, The First Affiliated Hospital of Naval Military Medical University, Shanghai 200000, China
- Anjing Zhao
  - Department of Oncology, The First Affiliated Hospital of Naval Military Medical University, Shanghai 200000, China
3
Vinh T, Nguyen-Vo TH, Le VT, Phan-Nguyen XP, Nguyen BP. In silico identification of Histone Deacetylase inhibitors using Streamlined Masked Transformer-based Pretrained features. Methods 2025; 234:1-9. [PMID: 39581247 DOI: 10.1016/j.ymeth.2024.11.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2024] [Revised: 10/19/2024] [Accepted: 11/09/2024] [Indexed: 11/26/2024] [Open Access]
Abstract
Histone Deacetylases (HDACs) are enzymes that regulate gene expression by removing acetyl groups from histones. They are involved in various diseases, including neurodegenerative, cardiovascular, inflammatory, and metabolic disorders, as well as fibrosis in the liver, lungs, and kidneys. Successfully identifying potent HDAC inhibitors may offer a promising approach to treating these diseases. In addition to experimental techniques, researchers have introduced several in silico methods for identifying HDAC inhibitors. However, these existing computer-aided methods have shortcomings in their modeling stages, which limit their applications. In our study, we present a Streamlined Masked Transformer-based Pretrained (SMTP) encoder, which can be used to generate features for downstream tasks. The training process of the SMTP encoder was directed by masked attention-based learning, enhancing the model's generalizability in encoding molecules. The SMTP features were used to develop 11 classification models identifying 11 HDAC isoforms. We trained SMTP, a lightweight encoder, with only 1.9 million molecules, a smaller number than other known molecular encoders, yet its discriminant ability remains competitive. The results revealed that machine learning models developed using the SMTP feature set outperformed those developed using other feature sets in 8 out of 11 classification tasks. Additionally, chemical diversity analysis confirmed the encoder's effectiveness in distinguishing between two classes of molecules.
Affiliation(s)
- Tuan Vinh
  - Department of Chemistry, Emory University, 201 Dowman Drive, Atlanta, GA 30322-1007, United States
- Thanh-Hoang Nguyen-Vo
  - Faculty of Information Technology, Ho Chi Minh City Open University, 97 Vo Van Tan, District 3, Ho Chi Minh City 700000, Viet Nam
- Viet-Tuan Le
  - Faculty of Information Technology, Ho Chi Minh City Open University, 97 Vo Van Tan, District 3, Ho Chi Minh City 700000, Viet Nam
- Xuan-Phuc Phan-Nguyen
  - Faculty of Information Technology, Ho Chi Minh City Open University, 97 Vo Van Tan, District 3, Ho Chi Minh City 700000, Viet Nam
- Binh P Nguyen
  - Faculty of Information Technology, Ho Chi Minh City Open University, 97 Vo Van Tan, District 3, Ho Chi Minh City 700000, Viet Nam
  - School of Mathematics and Statistics, Victoria University of Wellington, Kelburn Parade, Wellington 6012, New Zealand
4
Kim DN, Yin T, Zhang T, Im AK, Cort JR, Rozum JC, Pollock D, Qian WJ, Feng S. Artificial Intelligence Transforming Post-Translational Modification Research. Bioengineering (Basel) 2024; 12:26. [PMID: 39851300 PMCID: PMC11762806 DOI: 10.3390/bioengineering12010026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2024] [Revised: 12/16/2024] [Accepted: 12/29/2024] [Indexed: 01/26/2025] [Open Access]
Abstract
Post-Translational Modifications (PTMs) are covalent changes to amino acids that occur after protein synthesis, including covalent modifications on side chains and peptide backbones. Many PTMs profoundly impact cellular and molecular functions and structures, and their significance extends to evolutionary studies as well. In light of these implications, we have explored how artificial intelligence (AI) can be utilized in researching PTMs. Initially, rationales for adopting AI and its advantages in understanding the functions of PTMs are discussed. Then, various deep learning architectures and programs, including recent applications of language models, for predicting PTM sites on proteins and the regulatory functions of these PTMs are compared. Finally, our high-throughput PTM-data-generation pipeline, which formats data suitably for AI training and predictions, is described. We hope this review illuminates areas where future AI models on PTMs can be improved, thereby contributing to the field of PTM bioengineering.
Affiliation(s)
- Doo Nam Kim
  - Biological Sciences Division, Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, WA 99352, USA
- Tianzhixi Yin
  - National Security Directorate, Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, WA 99352, USA
- Tong Zhang
  - Biological Sciences Division, Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, WA 99352, USA
- Alexandria K. Im
  - Biological Sciences Division, Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, WA 99352, USA
- John R. Cort
  - Biological Sciences Division, Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, WA 99352, USA
- Jordan C. Rozum
  - Biological Sciences Division, Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, WA 99352, USA
- David Pollock
  - Biological Sciences Division, Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, WA 99352, USA
  - Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, CO 80045, USA
- Wei-Jun Qian
  - Biological Sciences Division, Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, WA 99352, USA
- Song Feng
  - Biological Sciences Division, Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, WA 99352, USA
5
Qin Z, Ren H, Zhao P, Wang K, Liu H, Miao C, Du Y, Li J, Wu L, Chen Z. Current computational tools for protein lysine acylation site prediction. Brief Bioinform 2024; 25:bbae469. [PMID: 39316944 PMCID: PMC11421846 DOI: 10.1093/bib/bbae469] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2024] [Revised: 08/20/2024] [Accepted: 09/07/2024] [Indexed: 09/26/2024] [Open Access]
Abstract
As a main subtype of post-translational modification (PTM), protein lysine acylations (PLAs) play crucial roles in regulating diverse functions of proteins. With recent advancements in proteomics technology, the identification of PTM is becoming a data-rich field. A large amount of experimentally verified data is urgently required to be translated into valuable biological insights. With computational approaches, PLA can be accurately detected across the whole proteome, even for organisms with small-scale datasets. Herein, a comprehensive summary of 166 in silico PLA prediction methods is presented, including a single type of PLA site and multiple types of PLA sites. This recapitulation covers important aspects that are critical for the development of a robust predictor, including data collection and preparation, sample selection, feature representation, classification algorithm design, model evaluation, and method availability. Notably, we discuss the application of protein language models and transfer learning to solve the small-sample learning issue. We also highlight the prediction methods developed for functionally relevant PLA sites and species/substrate/cell-type-specific PLA sites. In conclusion, this systematic review could potentially facilitate the development of novel PLA predictors and offer useful insights to researchers from various disciplines.
Affiliation(s)
- Zhaohui Qin
  - Collaborative Innovation Center of Henan Grain Crops, Henan Key Laboratory of Rice Molecular Breeding and High Efficiency Production, College of Agronomy, Henan Agricultural University, Zhengzhou 450046, China
- Haoran Ren
  - Collaborative Innovation Center of Henan Grain Crops, Henan Key Laboratory of Rice Molecular Breeding and High Efficiency Production, College of Agronomy, Henan Agricultural University, Zhengzhou 450046, China
- Pei Zhao
  - State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Sciences (CAAS), Anyang 455000, China
- Kaiyuan Wang
  - Collaborative Innovation Center of Henan Grain Crops, Henan Key Laboratory of Rice Molecular Breeding and High Efficiency Production, College of Agronomy, Henan Agricultural University, Zhengzhou 450046, China
- Huixia Liu
  - Collaborative Innovation Center of Henan Grain Crops, Henan Key Laboratory of Rice Molecular Breeding and High Efficiency Production, College of Agronomy, Henan Agricultural University, Zhengzhou 450046, China
- Chunbo Miao
  - Collaborative Innovation Center of Henan Grain Crops, Henan Key Laboratory of Rice Molecular Breeding and High Efficiency Production, College of Agronomy, Henan Agricultural University, Zhengzhou 450046, China
- Yanxiu Du
  - Collaborative Innovation Center of Henan Grain Crops, Henan Key Laboratory of Rice Molecular Breeding and High Efficiency Production, College of Agronomy, Henan Agricultural University, Zhengzhou 450046, China
- Junzhou Li
  - Collaborative Innovation Center of Henan Grain Crops, Henan Key Laboratory of Rice Molecular Breeding and High Efficiency Production, College of Agronomy, Henan Agricultural University, Zhengzhou 450046, China
- Liuji Wu
  - National Key Laboratory of Wheat and Maize Crop Science, College of Agronomy, Henan Agricultural University, Zhengzhou 450046, China
- Zhen Chen
  - Collaborative Innovation Center of Henan Grain Crops, Henan Key Laboratory of Rice Molecular Breeding and High Efficiency Production, College of Agronomy, Henan Agricultural University, Zhengzhou 450046, China
6
Qin J, Huang X, Gou S, Zhang S, Gou Y, Zhang Q, Chen H, Sun L, Chen M, Liu D, Han C, Tang M, Feng Z, Niu S, Zhao L, Tu Y, Liu Z, Xuan W, Dai L, Jia D, Xue Y. Ketogenic diet reshapes cancer metabolism through lysine β-hydroxybutyrylation. Nat Metab 2024; 6:1505-1528. [PMID: 39134903 DOI: 10.1038/s42255-024-01093-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2023] [Accepted: 07/02/2024] [Indexed: 08/29/2024]
Abstract
Lysine β-hydroxybutyrylation (Kbhb) is a post-translational modification induced by the ketogenic diet (KD), a diet showing therapeutic effects on multiple human diseases. Little is known about how cellular processes are regulated by Kbhb. Here we show that protein Kbhb is strongly affected by the KD through a multi-omics analysis of mouse livers. Using a small training dataset with known functions, we developed a bioinformatics method for the prediction of functionally important lysine modification sites (pFunK), which revealed functionally relevant Kbhb sites on various proteins, including aldolase B (ALDOB) Lys108. KD consumption or β-hydroxybutyrate supplementation in hepatocellular carcinoma cells increases ALDOB Lys108bhb and inhibits the enzymatic activity of ALDOB. A Kbhb-mimicking mutation (p.Lys108Gln) attenuates ALDOB activity and its binding to substrate fructose-1,6-bisphosphate, inhibits mammalian target of rapamycin signalling and glycolysis, and markedly suppresses cancer cell proliferation. Our study reveals a critical role of Kbhb in regulating cancer cell metabolism and provides a generally applicable algorithm for predicting functionally important lysine modification sites.
Affiliation(s)
- Junhong Qin
  - Key Laboratory of Birth Defects and Related Diseases of Women and Children, Department of Paediatrics, West China Second University Hospital, State Key Laboratory of Biotherapy, Sichuan University, Chengdu, China
- Xinhe Huang
  - Key Laboratory of Molecular Biophysics of Ministry of Education, Hubei Bioinformatics and Molecular Imaging Key Laboratory, Center for Artificial Intelligence Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, China
- Shengsong Gou
  - Key Laboratory of Birth Defects and Related Diseases of Women and Children, Department of Paediatrics, West China Second University Hospital, State Key Laboratory of Biotherapy, Sichuan University, Chengdu, China
- Sitao Zhang
  - Key Laboratory of Birth Defects and Related Diseases of Women and Children, Department of Paediatrics, West China Second University Hospital, State Key Laboratory of Biotherapy, Sichuan University, Chengdu, China
- Yujie Gou
  - Key Laboratory of Molecular Biophysics of Ministry of Education, Hubei Bioinformatics and Molecular Imaging Key Laboratory, Center for Artificial Intelligence Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, China
- Qian Zhang
  - Key Laboratory of Birth Defects and Related Diseases of Women and Children, Department of Paediatrics, West China Second University Hospital, State Key Laboratory of Biotherapy, Sichuan University, Chengdu, China
- Hongyu Chen
  - Key Laboratory of Birth Defects and Related Diseases of Women and Children, Department of Paediatrics, West China Second University Hospital, State Key Laboratory of Biotherapy, Sichuan University, Chengdu, China
- Lin Sun
  - Frontiers Science Center for Synthetic Biology, Tianjin Key Laboratory of Function and Application of Biological Macromolecular Structures, School of Life Sciences, Tianjin University, Tianjin, China
- Miaomiao Chen
  - Key Laboratory of Molecular Biophysics of Ministry of Education, Hubei Bioinformatics and Molecular Imaging Key Laboratory, Center for Artificial Intelligence Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, China
- Dan Liu
  - Key Laboratory of Molecular Biophysics of Ministry of Education, Hubei Bioinformatics and Molecular Imaging Key Laboratory, Center for Artificial Intelligence Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, China
- Cheng Han
  - Key Laboratory of Molecular Biophysics of Ministry of Education, Hubei Bioinformatics and Molecular Imaging Key Laboratory, Center for Artificial Intelligence Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, China
- Min Tang
  - Key Laboratory of Birth Defects and Related Diseases of Women and Children, Department of Paediatrics, West China Second University Hospital, State Key Laboratory of Biotherapy, Sichuan University, Chengdu, China
- Zihao Feng
  - Key Laboratory of Molecular Biophysics of Ministry of Education, Hubei Bioinformatics and Molecular Imaging Key Laboratory, Center for Artificial Intelligence Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, China
- Shenghui Niu
  - Key Laboratory of Birth Defects and Related Diseases of Women and Children, Department of Paediatrics, West China Second University Hospital, State Key Laboratory of Biotherapy, Sichuan University, Chengdu, China
- Lin Zhao
  - Key Laboratory of Birth Defects and Related Diseases of Women and Children, Department of Paediatrics, West China Second University Hospital, State Key Laboratory of Biotherapy, Sichuan University, Chengdu, China
- Yingfeng Tu
  - Key Laboratory of Birth Defects and Related Diseases of Women and Children, Department of Paediatrics, West China Second University Hospital, State Key Laboratory of Biotherapy, Sichuan University, Chengdu, China
- Zexian Liu
  - Department of Medical Oncology, Sun Yat-sen University Cancer Center, State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University, Guangzhou, China
- Weimin Xuan
  - Frontiers Science Center for Synthetic Biology, Tianjin Key Laboratory of Function and Application of Biological Macromolecular Structures, School of Life Sciences, Tianjin University, Tianjin, China
- Lunzhi Dai
  - National Clinical Research Center for Geriatrics and Department of General Practice, State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, and Collaborative Innovation Center of Biotherapy, Chengdu, China
- Da Jia
  - Key Laboratory of Birth Defects and Related Diseases of Women and Children, Department of Paediatrics, West China Second University Hospital, State Key Laboratory of Biotherapy, Sichuan University, Chengdu, China
- Yu Xue
  - Key Laboratory of Molecular Biophysics of Ministry of Education, Hubei Bioinformatics and Molecular Imaging Key Laboratory, Center for Artificial Intelligence Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, China
  - Nanjing University Institute of Artificial Intelligence Biomedicine, Nanjing, China
7
Meng L, Chen X, Cheng K, Chen N, Zheng Z, Wang F, Sun H, Wong KC. TransPTM: a transformer-based model for non-histone acetylation site prediction. Brief Bioinform 2024; 25:bbae219. [PMID: 38725156 PMCID: PMC11082075 DOI: 10.1093/bib/bbae219] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2023] [Revised: 04/08/2024] [Accepted: 04/23/2024] [Indexed: 05/13/2024] [Open Access]
Abstract
Protein acetylation is one of the most extensively studied post-translational modifications (PTMs) due to its significant roles across a myriad of biological processes. Although many computational tools for acetylation site identification have been developed, there is a lack of benchmark datasets and bespoke predictors for non-histone acetylation site prediction. To address these problems, we have contributed to both dataset creation and predictor benchmarking in this study. First, we construct a non-histone acetylation site benchmark dataset, namely NHAC, which includes 11 subsets according to the sequence length ranging from 11 to 61 amino acids. In total, there are 886 positive samples and 4707 negative samples for each sequence length. Secondly, we propose TransPTM, a transformer-based neural network model for non-histone acetylation site prediction. During the data representation phase, per-residue contextualized embeddings are extracted using ProtT5 (an existing pre-trained protein language model). This is followed by the implementation of a graph neural network framework, which consists of three TransformerConv layers for feature extraction and a multilayer perceptron module for classification. The benchmark results show that TransPTM performs competitively against three state-of-the-art tools for non-histone acetylation site prediction. It improves our comprehension of the PTM mechanism and provides a theoretical basis for developing drug targets for diseases. Moreover, the created PTM dataset fills the gap in non-histone acetylation site datasets and is beneficial to the related communities. The related source code and data utilized by TransPTM are accessible at https://www.github.com/TransPTM/TransPTM.
Affiliation(s)
- Lingkuan Meng
  - Department of Computer Science, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong
- Xingjian Chen
  - Cutaneous Biology Research Center, Massachusetts General Hospital, Harvard Medical School, MA 02138, United States
- Ke Cheng
  - Department of Chemistry, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong
- Nanjun Chen
  - Department of Computer Science, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong
- Zetian Zheng
  - Department of Computer Science, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong
- Fuzhou Wang
  - Department of Computer Science, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong
- Hongyan Sun
  - Department of Chemistry, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong
- Ka-Chun Wong
  - Department of Computer Science, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong
  - Shenzhen Research Institute, City University of Hong Kong, Shenzhen, China
8
Kumari S, Gupta R, Ambasta RK, Kumar P. Emerging trends in post-translational modification: Shedding light on Glioblastoma multiforme. Biochim Biophys Acta Rev Cancer 2023; 1878:188999. [PMID: 37858622 DOI: 10.1016/j.bbcan.2023.188999] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 06/30/2023] [Revised: 10/06/2023] [Accepted: 10/06/2023] [Indexed: 10/21/2023]
Abstract
Recent multi-omics studies, including proteomics, transcriptomics, genomics, and metabolomics, have revealed the critical role of post-translational modifications (PTMs) in the progression and pathogenesis of Glioblastoma multiforme (GBM). Further, PTMs alter oncogenic signaling events and offer a novel avenue in GBM therapeutics research through PTM enzymes as potential biomarkers for drug targeting. In addition, PTMs are critical regulators of chromatin architecture, gene expression, and the tumor microenvironment (TME), which play crucial roles in tumorigenesis. Moreover, the implementation of artificial intelligence and machine learning algorithms enhances GBM therapeutics research through the identification of novel PTM enzymes and residues. Herein, we briefly explain the mechanism of protein modifications in GBM etiology and in altering the biology of GBM cells through chromatin remodeling, modulation of the TME, and signaling pathways. In addition, we highlight the importance of PTM enzymes as therapeutic biomarkers and the role of artificial intelligence and machine learning in protein PTM prediction.
Affiliation(s)
- Smita Kumari
- Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University, India
| | - Rohan Gupta
- Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University, India; School of Medicine, University of South Carolina, Columbia, SC, United States of America
| | - Rashmi K Ambasta
- Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University, India; Department of Biotechnology and Microbiology, SRM University, Sonepat, Haryana, India
| | - Pravir Kumar
- Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University, India
| |
|
9
|
Kumari S, Kumar P. Identification and characterization of putative biomarkers and therapeutic axis in Glioblastoma multiforme microenvironment. Front Cell Dev Biol 2023; 11:1236271. [PMID: 37538397 PMCID: PMC10395518 DOI: 10.3389/fcell.2023.1236271] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/07/2023] [Accepted: 06/23/2023] [Indexed: 08/05/2023] Open
Abstract
Non-cellular secretory components, including chemokines, cytokines, and growth factors in the tumor microenvironment, are often dysregulated, impacting tumorigenesis in the Glioblastoma multiforme (GBM) microenvironment, where the prognostic significance of the current treatment remains unsatisfactory. Recent studies have demonstrated the potential of post-translational modifications (PTMs), such as acetylation and ubiquitination, and their respective enzymes in GBM etiology through modulating signaling events. However, the relationship between non-cellular secretory components and PTMs remains a research void in GBM therapeutics. Therefore, we aim to bridge the gap between non-cellular secretory components and PTMs through machine learning and computational biology approaches. Herein, we highlighted the importance of BMP1, CTSB, LOX, LOXL1, PLOD1, MMP9, SERPINE1, and SERPING1 in GBM etiology. Further, we demonstrated the positive relationship between the E2 conjugating enzymes (Ube2E1, Ube2H, Ube2J2, Ube2C, Ube2J2, and Ube2S), E3 ligases (VHL and GNB2L1), and substrate (HIF1A). Additionally, we reported the novel HAT1-induced acetylation sites of Ube2S (K211) and Ube2H (K8, K52). Structural and functional characterization of Ube2S (8) and Ube2H (1) has identified their association with protein kinases. Lastly, our results identified a putative therapeutic axis, HAT1-Ube2S(K211)-GNB2L1-HIF1A, and potential predictive biomarkers (CTSB, HAT1, Ube2H, VHL, and GNB2L1) that play a critical role in GBM pathogenesis.
|
10
|
Yu K, Wang Y, Zheng Y, Liu Z, Zhang Q, Wang S, Zhao Q, Zhang X, Li X, Xu RH, Liu ZX. qPTM: an updated database for PTM dynamics in human, mouse, rat and yeast. Nucleic Acids Res 2022; 51:D479-D487. [PMID: 36165955 PMCID: PMC9825568 DOI: 10.1093/nar/gkac820] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/21/2022] [Revised: 08/26/2022] [Accepted: 09/14/2022] [Indexed: 01/29/2023] Open
Abstract
Post-translational modifications (PTMs) are critical molecular mechanisms that regulate protein functions temporally and spatially in various organisms. Since most PTMs are dynamically regulated, quantifying PTM events under different states is crucial for understanding biological processes and diseases. With the rapid development of high-throughput proteomics technologies, massive quantitative PTM proteome datasets have been generated. Thus, a comprehensive one-stop data resource for exploring these big data will benefit the community. Here, we updated our previous phosphorylation dynamics database qPhos to qPTM (http://qptm.omicsbio.info). In qPTM, 11 482 553 quantification events among six types of PTMs, including phosphorylation, acetylation, glycosylation, methylation, SUMOylation and ubiquitylation, in four different organisms were collected and integrated, and the matched proteome datasets were included if available. The false discovery rate control based on raw mass spectrometry data and the recurrence of identifications among datasets were integrated into a scoring system to assess the reliability of the PTM sites. Browse and search functions were improved to help users acquire specific information swiftly and accurately. The results page was revised with more abundant annotations, and time-course dynamics data were visualized as trend lines. We expect the qPTM database to be a much more powerful and comprehensive data repository for the PTM research community.
Affiliation(s)
- Qingfeng Zhang
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou 510060, China
| | - Siyu Wang
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou 510060, China
| | - Qi Zhao
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou 510060, China
| | - Xiaolong Zhang
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou 510060, China
| | - Xiaoxing Li
- Precision Medicine Institute, First Affiliated Hospital, Sun Yat-sen University, Guangzhou 510080, China
| | - Rui-Hua Xu
- Correspondence may also be addressed to Rui-Hua Xu. Tel: +86 20 8734 3228; Fax: +86 20 8734 3392;
| | - Ze-Xian Liu
- To whom correspondence should be addressed. Tel: +86 20 8734 2025; Fax: +86 20 8734 2522;
| |
|
11
|
Watanabe N, Yamamoto M, Murata M, Vavricka CJ, Ogino C, Kondo A, Araki M. Comprehensive Machine Learning Prediction of Extensive Enzymatic Reactions. J Phys Chem B 2022; 126:6762-6770. [PMID: 36053051 DOI: 10.1021/acs.jpcb.2c03287] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/04/2023]
Abstract
New enzyme functions exist within the increasing number of unannotated protein sequences. Novel enzyme discovery is necessary to expand the pathways that can be accessed by metabolic engineering for the biosynthesis of functional compounds. Accordingly, various machine learning models have been developed to predict enzymatic reactions. However, the ability to predict unknown reactions that are not included in the training data has not been clarified. In order to cover uncertain and unknown reactions, a wider range of reaction types must be demonstrated by the models. Here, we establish 16 expanded enzymatic reaction prediction models developed using various machine learning algorithms, including deep neural networks. Improvements in prediction performance over that of our previous study indicate that the updated methods are more effective for the prediction of enzymatic reactions. Overall, the deep neural network model trained with combined substrate-enzyme-product information exhibits the highest prediction accuracy, with Macro F1 scores up to 0.966 and robust prediction of unknown enzymatic reactions that are not included in the training data. This model can predict more extensive enzymatic reactions in comparison to previously reported models. This study will facilitate the discovery of new enzymes for the production of useful substances.
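For readers unfamiliar with the Macro F1 score cited in this abstract, the following is a minimal sketch of how the metric is conventionally computed: the per-class F1 scores are averaged without weighting by class size. The function name `macro_f1` and the toy labels are illustrative assumptions, not the authors' evaluation code.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # Every label occurs at least once in y_true or y_pred,
        # so the denominator below is never zero.
        scores.append(2 * tp / (2 * tp + fp + fn))
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical reaction-class labels for illustration only.
    truth = ["a", "a", "b", "c"]
    predicted = ["a", "b", "b", "c"]
    print(round(macro_f1(truth, predicted), 4))
```

Because each class contributes equally, rare reaction types affect the score as much as common ones, which is presumably why a macro average suits a setting with many reaction classes.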
Affiliation(s)
- Naoki Watanabe
- Department of Chemical Science and Engineering, Graduate School of Engineering, Kobe University, 1-1 Rokkodai-cho, Nada, Kobe, Hyogo 657-8501, Japan
| | - Masaki Yamamoto
- Graduate School of Medicine, Kyoto University, 54 Kawahara-cho, Shogoin Sakyo-ku, Kyoto 606-8507, Japan
| | - Masahiro Murata
- Graduate School of Medicine, Kyoto University, 54 Kawahara-cho, Shogoin Sakyo-ku, Kyoto 606-8507, Japan
| | - Christopher J Vavricka
- Graduate School of Science, Technology and Innovation, Kobe University, 1-1 Rokkodai-cho, Nada-ku, Kobe 657-8501, Japan
| | - Chiaki Ogino
- Department of Chemical Science and Engineering, Graduate School of Engineering, Kobe University, 1-1 Rokkodai-cho, Nada, Kobe, Hyogo 657-8501, Japan
| | - Akihiko Kondo
- Department of Chemical Science and Engineering, Graduate School of Engineering, Kobe University, 1-1 Rokkodai-cho, Nada, Kobe, Hyogo 657-8501, Japan; Graduate School of Science, Technology and Innovation, Kobe University, 1-1 Rokkodai-cho, Nada-ku, Kobe 657-8501, Japan
| | - Michihiro Araki
- Graduate School of Medicine, Kyoto University, 54 Kawahara-cho, Shogoin Sakyo-ku, Kyoto 606-8507, Japan; Graduate School of Science, Technology and Innovation, Kobe University, 1-1 Rokkodai-cho, Nada-ku, Kobe 657-8501, Japan; National Institutes of Biomedical Innovation, Health and Nutrition, National Institute of Health and Nutrition, 1-23-1 Toyama, Shinjuku-ku, Tokyo 162-8638, Japan; National Cerebral and Cardiovascular Center, 6-1 Kishibe-Shinmachi, Suita, Osaka 564-8565, Japan
| |
|
12
|
Wang M, Li F, Wu H, Liu Q, Li S. PredPromoter-MF(2L): A Novel Approach of Promoter Prediction Based on Multi-source Feature Fusion and Deep Forest. Interdiscip Sci 2022; 14:697-711. [PMID: 35488998 DOI: 10.1007/s12539-022-00520-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/19/2021] [Revised: 04/05/2022] [Accepted: 04/05/2022] [Indexed: 12/12/2022]
Abstract
Promoters, short DNA sequences, play vital roles in initiating gene transcription. However, it remains a challenge to identify promoters using conventional experimental techniques in a high-throughput manner. To this end, several computational predictors based on machine learning models have been developed, but their performance remains unsatisfactory. In this study, we proposed a novel two-layer predictor, called PredPromoter-MF(2L), based on multi-source feature fusion and ensemble learning. PredPromoter-MF(2L) was developed based on various deep features learned by a pre-trained deep learning network model and on sequence-derived features. Feature selection based on XGBoost was applied to reduce the dimensionality of the fused features, and a cascade deep forest model was trained on the selected feature subset for promoter prediction. The results of both fivefold cross-validation and independent testing demonstrated that PredPromoter-MF(2L) outperformed state-of-the-art methods.
Affiliation(s)
- Miao Wang
- College of Information Engineering, Northwest A&F University, Yangling, 712100, Shaanxi, China
| | - Fuyi Li
- Department of Microbiology and Immunology, The Peter Doherty Institute for Infection and Immunity, The University of Melbourne, 792 Elizabeth Street, Melbourne, VIC, 3000, Australia
| | - Hao Wu
- School of Software, Shandong University, Jinan, 250100, Shandong, China
| | - Quanzhong Liu
- College of Information Engineering, Northwest A&F University, Yangling, 712100, Shaanxi, China
| | - Shuqin Li
- College of Information Engineering, Northwest A&F University, Yangling, 712100, Shaanxi, China
| |
|
13
|
Nie J, Aweya JJ, Yu Z, Zhou H, Wang F, Yao D, Zheng Z, Li S, Ma H, Zhang Y. Deacetylation of K481 and K484 on Penaeid Shrimp Hemocyanin Is Critical for Antibacterial Activity. JOURNAL OF IMMUNOLOGY (BALTIMORE, MD. : 1950) 2022; 209:476-487. [PMID: 35851542 PMCID: PMC10580119 DOI: 10.4049/jimmunol.2200078] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/27/2022] [Accepted: 05/24/2022] [Indexed: 10/17/2023]
Abstract
Although invertebrates' innate immunity relies on several immune-like molecules, the diversity of these molecules and their immune response mechanisms are not well understood. Here, we show that Penaeus vannamei hemocyanin (PvHMC) undergoes specific deacetylation under Vibrio parahaemolyticus and LPS challenge. In vitro deacetylation of PvHMC increases its binding capacity with LPS and antibacterial activity against Gram-negative bacteria. Lysine residues K481 and K484 on the Ig-like domain of PvHMC are the main acetylation sites modulated by the acetyltransferase TIP60 and deacetylase HDAC3. Deacetylation of PvHMC on K481 and K484 allows PvHMC to form a positively charged binding pocket that interacts directly with LPS, whereas acetylation abrogates the positive charge to decrease PvHMC-LPS attraction. Besides, V. parahaemolyticus and LPS challenge increases the expression of Pvhdac3 to induce PvHMC deacetylation. This work indicates that, during bacterial infections, deacetylation of hemocyanin is crucial for binding with LPS to clear Gram-negative bacteria in crustaceans.
Affiliation(s)
- Junjie Nie
- Institute of Marine Sciences and Guangdong Provincial Key Laboratory of Marine Biotechnology, Shantou University, Shantou, China
- Shantou University-Universiti Malaysia Terengganu Joint Shellfish Research Laboratory, Shantou University, Shantou, China
| | - Jude Juventus Aweya
- Institute of Marine Sciences and Guangdong Provincial Key Laboratory of Marine Biotechnology, Shantou University, Shantou, China
- College of Ocean Food and Biological Engineering, Fujian Provincial Key Laboratory of Food Microbiology and Enzyme Engineering, Jimei University, Xiamen, Fujian, China
| | - Zhixue Yu
- Institute of Marine Sciences and Guangdong Provincial Key Laboratory of Marine Biotechnology, Shantou University, Shantou, China
| | - Hui Zhou
- Institute of Marine Sciences and Guangdong Provincial Key Laboratory of Marine Biotechnology, Shantou University, Shantou, China
| | - Fan Wang
- Institute of Marine Sciences and Guangdong Provincial Key Laboratory of Marine Biotechnology, Shantou University, Shantou, China
- Shantou University-Universiti Malaysia Terengganu Joint Shellfish Research Laboratory, Shantou University, Shantou, China
| | - Defu Yao
- Institute of Marine Sciences and Guangdong Provincial Key Laboratory of Marine Biotechnology, Shantou University, Shantou, China
- Shantou University-Universiti Malaysia Terengganu Joint Shellfish Research Laboratory, Shantou University, Shantou, China
| | - Zhihong Zheng
- Institute of Marine Sciences and Guangdong Provincial Key Laboratory of Marine Biotechnology, Shantou University, Shantou, China
- Shantou University-Universiti Malaysia Terengganu Joint Shellfish Research Laboratory, Shantou University, Shantou, China
| | - Shengkang Li
- Institute of Marine Sciences and Guangdong Provincial Key Laboratory of Marine Biotechnology, Shantou University, Shantou, China
- Shantou University-Universiti Malaysia Terengganu Joint Shellfish Research Laboratory, Shantou University, Shantou, China
- Southern Marine Science and Engineering Guangdong Laboratory, Guangzhou, China
| | - Hongyu Ma
- Institute of Marine Sciences and Guangdong Provincial Key Laboratory of Marine Biotechnology, Shantou University, Shantou, China
- Shantou University-Universiti Malaysia Terengganu Joint Shellfish Research Laboratory, Shantou University, Shantou, China
| | - Yueling Zhang
- Institute of Marine Sciences and Guangdong Provincial Key Laboratory of Marine Biotechnology, Shantou University, Shantou, China
- Shantou University-Universiti Malaysia Terengganu Joint Shellfish Research Laboratory, Shantou University, Shantou, China
- Southern Marine Science and Engineering Guangdong Laboratory, Guangzhou, China
| |
|
14
|
Mini-review: Recent advances in post-translational modification site prediction based on deep learning. Comput Struct Biotechnol J 2022; 20:3522-3532. [PMID: 35860402 PMCID: PMC9284371 DOI: 10.1016/j.csbj.2022.06.045] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/11/2022] [Revised: 06/21/2022] [Accepted: 06/21/2022] [Indexed: 11/23/2022] Open
Abstract
Post-translational modifications (PTMs) are closely linked to numerous diseases and play a significant role in regulating protein structures, activities, and functions. Therefore, the identification of PTMs is crucial for understanding the mechanisms of cell biology and disease therapy. Compared to traditional machine learning methods, deep learning approaches for PTM prediction provide accurate and rapid screening, guiding downstream wet-lab experiments to leverage the screened information for focused studies. In this paper, we review recent work in deep learning to identify phosphorylation, acetylation, ubiquitination, and other PTM types. In addition, we summarize PTM databases and discuss future directions with critical insights.
Key Words
- AAindex, Amino acid index
- ATP, Adenosine triphosphate
- AUC, Area under curve
- Ac, Acetylation
- BE, Binary encoding
- BLOSUM, Blocks substitution matrix
- Bi-LSTM, Bidirectional LSTM
- CKSAAP, Composition of k-spaced amino acid Pairs
- CNN, Convolutional neural network
- CNNOH, CNN with the one-hot encoding
- CNNWE, CNN with the word-embedding encoding
- CNNrgb, CNN red green blue
- CV, Cross-validation
- DC-CNN, Densely connected convolutional neural network
- DL, Deep learning
- DNNs, Deep neural networks
- Deep learning
- E. coli, Escherichia coli
- EBGW, Encoding based on grouped weight
- EGAAC, Enhanced grouped amino acids content
- IG, Information gain
- K, Lysine
- KNN, k nearest neighbor
- LASSO, Least absolute shrinkage and selection operator
- LSTM, Long short-term memory
- LSTMWE, LSTM with the word-embedding encoding
- M.musculus, Mus musculus
- MDC, Modular densely connected convolutional networks
- MDCAN, Multilane dense convolutional attention network
- ML, Machine learning
- MLP, Multilayer perceptron
- MMI, Multivariate mutual information
- Machine learning
- Mass spectrometry
- NMBroto, Normalized Moreau-Broto autocorrelation
- P, Proline
- PSP, PhosphoSitePlus
- PSSM, Position-specific scoring matrix
- PTM, Post-translational modifications
- Ph, Phosphorylation
- Post-translational modification
- Prediction
- PseAAC, Pseudo-amino acid composition
- R, Arginine
- RF, Random forest
- RNN, Recurrent neural network
- ROC, Receiver operating characteristic
- S, Serine
- S. typhimurium, Salmonella typhimurium
- S.cerevisiae, Saccharomyces cerevisiae
- SE, Squeeze and excitation
- SEV, Split to Equal Validation
- ST, Source and target
- SUMO, Small ubiquitin-like modifier
- SVM, Support vector machines
- T, Threonine
- Ub, Ubiquitination
- Y, Tyrosine
- ZSL, Zero-shot learning
|
15
|
Vinogradov AA, Chang JS, Onaka H, Goto Y, Suga H. Accurate Models of Substrate Preferences of Post-Translational Modification Enzymes from a Combination of mRNA Display and Deep Learning. ACS CENTRAL SCIENCE 2022; 8:814-824. [PMID: 35756369 PMCID: PMC9228559 DOI: 10.1021/acscentsci.2c00223] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/27/2022] [Indexed: 05/15/2023]
Abstract
Promiscuous post-translational modification (PTM) enzymes often display nonobvious substrate preferences by acting on diverse yet well-defined sets of peptides and/or proteins. Understanding of substrate fitness landscapes for PTM enzymes is important in many areas of contemporary science, including natural product biosynthesis, molecular biology, and biotechnology. Here, we report an integrated platform for accurate profiling of substrate preferences for PTM enzymes. The platform features (i) a combination of mRNA display with next-generation sequencing as an ultrahigh throughput technique for data acquisition and (ii) deep learning for data analysis. The high accuracy (>0.99 in each of two studies) of the resulting deep learning models enables comprehensive analysis of enzymatic substrate preferences. The models can quantify fitness across sequence space, map modification sites, and identify important amino acids in the substrate. To benchmark the platform, we performed profiling of a Ser dehydratase (LazBF) and a Cys/Ser cyclodehydratase (LazDEF), two enzymes from the lactazole biosynthesis pathway. In both studies, our results point to complex enzymatic preferences, which, particularly for LazBF, cannot be reduced to a set of simple rules. The ability of the constructed models to dissect such complexity suggests that the developed platform can facilitate a wider study of PTM enzymes.
Affiliation(s)
- Alexander A. Vinogradov
- Department of Chemistry, Graduate School of Science, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
| | - Jun Shi Chang
- Department of Chemistry, Graduate School of Science, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
| | - Hiroyasu Onaka
- Department of Biotechnology, Graduate School of Agricultural and Life Sciences, The University of Tokyo, Bunkyo-ku, Tokyo 113-8657, Japan
- Collaborative Research Institute for Innovative Microbiology, The University of Tokyo, Bunkyo-ku, Tokyo 113-8657, Japan
| | - Yuki Goto
- Department of Chemistry, Graduate School of Science, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
| | - Hiroaki Suga
- Department of Chemistry, Graduate School of Science, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
| |
|
16
|
Li G, Zhu Y, Gu J, Zhang T, Wang F, Huang K, Gu C, Xu K, Zhan R, Shen J. RNA modification patterns based on major RNA modifications define tumor microenvironment characteristics in glioblastoma. Sci Rep 2022; 12:10278. [PMID: 35717510 PMCID: PMC9206649 DOI: 10.1038/s41598-022-14539-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/10/2021] [Accepted: 06/08/2022] [Indexed: 12/11/2022] Open
Abstract
RNA modifications play a major role in tumorigenicity and progression, but their expression and function in glioblastoma (GBM) have not been well described. In this study, we developed a GBM score based on the differentially expressed genes (DEGs) between groups showing distinct RNA modification patterns. We assessed the association between the GBM score and tumor microenvironment (TME) characteristics. Based on the gene expression of these regulators, we identified two clusters with distinct RNA modification patterns. Kaplan–Meier survival curves showed that patients in cluster 1 had worse survival than those in cluster 2. Kaplan–Meier and multivariate Cox regression analyses showed that the GBM score (based on DEGs between RNA modification patterns) is an independent predictive biomarker for patient prognosis. In addition, we found that samples with high scores were significantly associated with epithelial-to-mesenchymal transition and immune checkpoints, while samples with low scores were associated with cell cycle regulation. Importantly, the GBM score correlated markedly and positively with drug resistance and negatively with drug sensitivity. Responders to anti-PD-1/PD-L1 immunotherapy tended to have a lower GBM score than non-responders. In conclusion, our comprehensive analysis of multiple RNA modifications in GBM revealed that RNA modification regulators were closely correlated with the TME.
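The Kaplan–Meier survival curves referred to in this abstract follow the standard product-limit estimator. The sketch below is a generic illustration of that estimator, assuming right-censored data given as follow-up times and event flags (1 for an observed death, 0 for censoring); the function name `kaplan_meier` and the toy data are hypothetical and unrelated to the study's cohort.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival probability)] at each time with at least one event."""
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        time = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects that share the same follow-up time.
        while i < len(data) and data[i][0] == time:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            # Product-limit step: multiply by the fraction surviving this time.
            survival *= 1 - deaths / at_risk
            curve.append((time, survival))
        # Both deaths and censored subjects leave the risk set afterwards.
        at_risk -= removed
    return curve


if __name__ == "__main__":
    # Hypothetical follow-up times (months) and event indicators.
    for time, probability in kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0]):
        print(time, round(probability, 3))
```

Comparing two such curves, for example for cluster 1 versus cluster 2, would additionally need a significance test such as the log-rank test, which this sketch does not implement.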
Affiliation(s)
- Ganglei Li
- Department of Neurosurgery, The First Affiliated Hospital, College of Medicine, Zhejiang University, No.79 Qingchun Road, Hangzhou, 310003, Zhejiang, China
| | - Yu Zhu
- Department of Neurosurgery, The First Affiliated Hospital, College of Medicine, Zhejiang University, No.79 Qingchun Road, Hangzhou, 310003, Zhejiang, China
| | - Jun Gu
- Department of Neurosurgery, The First Affiliated Hospital, College of Medicine, Zhejiang University, No.79 Qingchun Road, Hangzhou, 310003, Zhejiang, China
| | - Tiesong Zhang
- Department of Neurosurgery, The First Affiliated Hospital, College of Medicine, Zhejiang University, No.79 Qingchun Road, Hangzhou, 310003, Zhejiang, China
| | - Feng Wang
- Department of Neurosurgery, The First Affiliated Hospital, College of Medicine, Zhejiang University, No.79 Qingchun Road, Hangzhou, 310003, Zhejiang, China
| | - Kaiyuan Huang
- Department of Neurosurgery, The First Affiliated Hospital, College of Medicine, Zhejiang University, No.79 Qingchun Road, Hangzhou, 310003, Zhejiang, China
| | - Chenjie Gu
- Department of Neurosurgery, The First Affiliated Hospital, College of Medicine, Zhejiang University, No.79 Qingchun Road, Hangzhou, 310003, Zhejiang, China
| | - Kangli Xu
- Department of Neurosurgery, The First Affiliated Hospital, College of Medicine, Zhejiang University, No.79 Qingchun Road, Hangzhou, 310003, Zhejiang, China
| | - Renya Zhan
- Department of Neurosurgery, The First Affiliated Hospital, College of Medicine, Zhejiang University, No.79 Qingchun Road, Hangzhou, 310003, Zhejiang, China.
| | - Jian Shen
- Department of Neurosurgery, The First Affiliated Hospital, College of Medicine, Zhejiang University, No.79 Qingchun Road, Hangzhou, 310003, Zhejiang, China.
| |
|
17
|
Deep Learning-Based Advances In Protein Posttranslational Modification Site and Protein Cleavage Prediction. METHODS IN MOLECULAR BIOLOGY (CLIFTON, N.J.) 2022; 2499:285-322. [PMID: 35696087 DOI: 10.1007/978-1-0716-2317-6_15] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Subscribe] [Scholar Register] [Indexed: 10/18/2022]
Abstract
Posttranslational modification (PTM) is a ubiquitous phenomenon in both eukaryotes and prokaryotes that gives rise to enormous proteomic diversity. PTM mostly comes in two flavors: covalent modification of the polypeptide chain and proteolytic cleavage. Understanding and characterizing PTM is a fundamental step toward understanding the underpinnings of biology. Recent advances in experimental approaches, mainly mass-spectrometry-based approaches, have helped immensely in obtaining and characterizing PTMs. However, experimental approaches are not enough to understand and characterize more than 450 different types of PTMs, and complementary computational approaches are becoming popular. Recently, owing to advances in deep learning (DL) and the explosion of its applications in various fields, the computational prediction of PTM has also witnessed the development of a plethora of DL-based approaches. In this book chapter, we first review some recent DL-based approaches in the field of PTM site prediction. In addition, we review recent advances in a less-studied PTM, proteolytic cleavage prediction. We describe advances in PTM prediction by highlighting the deep learning architecture, feature encoding, novelty of the approaches, and availability of the tools/approaches. Finally, we provide an outlook and possible future research directions for DL-based approaches for PTM prediction.
|
18
|
Chen S, Zeng J, Huang L, Peng Y, Yan Z, Zhang A, Zhao X, Li J, Zhou Z, Wang S, Jing S, Hu M, Li Y, Wang D, Wang W, Yu H, Miao J, Li J, Deng Y, Li Y, Liu T, Xu D. RNA adenosine modifications related to prognosis and immune infiltration in osteosarcoma. J Transl Med 2022; 20:228. [PMID: 35568866 PMCID: PMC9107650 DOI: 10.1186/s12967-022-03415-6] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/03/2022] [Accepted: 04/27/2022] [Indexed: 11/14/2022] Open
Abstract
Background: RNA adenosine modifications, which are primarily mediated by “writer” enzymes (RMWs), play a key role in epigenetic regulation in various biological processes, including tumorigenesis. However, the expression and prognostic role of these genes in osteosarcoma (OS) remain unclear. Methods: Univariate and multivariate Cox analyses were used to construct the RMW signature for OS using Target datasets. RMW expression in OS tissue was detected by qPCR analysis. Xcell and GSVA were used to determine the relationship between RMWs and immune infiltration. The DGIdb and CMap databases were used for drug prediction. In vivo and in vitro experiments showed that strophanthidin elicited antitumor activity against OS. Results: A 3-RMW (CSTF2, ADAR and WTAP) prognostic signature in OS was constructed using the Target dataset and verified using GEO datasets and 63 independent OS tissues via qPCR analysis. High-risk OS patients had poor overall survival, and the prognostic signature was an independent prognostic factor for OS. Functional studies showed that tumour-, metabolism-, cell cycle- and immune-related pathways were related to high risk. Next, we found that RMW-derived high-risk patients exhibited increased infiltration of M2 macrophages and cDCs. Furthermore, we predicted the potential drugs for OS using the DGIdb and CMap databases. In vivo and in vitro experiments showed that strophanthidin elicited antitumor activity against OS by repressing cell growth and inducing cell cycle arrest at the G1 phase. Conclusion: The 3-RMW-based prognostic signature established in this study is a novel gene signature associated with immune infiltration, and strophanthidin was identified as a candidate therapy for OS by repressing OS cell growth and the cell cycle.
Affiliation(s)
- Shijie Chen
- Department of Spine Surgery, The Third Xiangya Hospital of Central South University, 138 Tongzipo Rd, Changsha, 410013, Hunan, China; Shanghai Key Laboratory of Regulatory Biology, Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, 500 Dongchuan Rd, Shanghai, 200241, China
| | - Jin Zeng
- Department of Spine Surgery, The Third Xiangya Hospital of Central South University, 138 Tongzipo Rd, Changsha, 410013, Hunan, China
| | - Liping Huang
- Department of Spine Surgery, The Third Xiangya Hospital of Central South University, 138 Tongzipo Rd, Changsha, 410013, Hunan, China
| | - Yi Peng
- Department of Spine Surgery, The Third Xiangya Hospital of Central South University, 138 Tongzipo Rd, Changsha, 410013, Hunan, China
| | - Zuyun Yan
- Department of Spine Surgery, The Third Xiangya Hospital of Central South University, 138 Tongzipo Rd, Changsha, 410013, Hunan, China
| | - Aiqian Zhang
- Department of Obstetrics and Gynecology, The Third Xiangya Hospital of Central South University, 138, Tongzipo Road, Changsha, 410013, China
| | - Xingping Zhao
- Department of Obstetrics and Gynecology, The Third Xiangya Hospital of Central South University, 138, Tongzipo Road, Changsha, 410013, China
| | - Jun Li
- Department of Orthopedics, The Second Affiliated Hospital of Anhui Medical University, 678 Furong Rd, Hefei, 230601, Anhui, China
| | - Ziting Zhou
- The Third Xiangya Hospital of Central South University, 138 Tongzipo Rd, Changsha, 410013, Hunan, China
| | - Sidan Wang
- The Third Xiangya Hospital of Central South University, 138 Tongzipo Rd, Changsha, 410013, Hunan, China
| | - Shengyu Jing
- The Third Xiangya Hospital of Central South University, 138 Tongzipo Rd, Changsha, 410013, Hunan, China
| | - Minghua Hu
- Department of Anatomy, Histology, and Embryology, Changsha Medical University, 1501 Leifeng Avenue, Changsha, 410219, Hunan, China
| | - Yuezhan Li
- Department of Spine Surgery, The Third Xiangya Hospital of Central South University, 138 Tongzipo Rd, Changsha, 410013, Hunan, China
| | - Dong Wang
- Department of Spine Surgery, The Third Xiangya Hospital of Central South University, 138 Tongzipo Rd, Changsha, 410013, Hunan, China
| | - Weiguo Wang
- Department of Spine Surgery, The Third Xiangya Hospital of Central South University, 138 Tongzipo Rd, Changsha, 410013, Hunan, China
| | - Haiyang Yu
- School of Basic Medical Science, Central South University, 172 Tongzipo Rd, Changsha, 410013, Hunan, China
| | - Jinglei Miao
- Department of Spine Surgery, The Third Xiangya Hospital of Central South University, 138 Tongzipo Rd, Changsha, 410013, Hunan, China
| | - Jinsong Li
- Department of Spine Surgery, The Third Xiangya Hospital of Central South University, 138 Tongzipo Rd, Changsha, 410013, Hunan, China
| | - Youwen Deng
- Department of Spine Surgery, The Third Xiangya Hospital of Central South University, 138 Tongzipo Rd, Changsha, 410013, Hunan, China
| | - Yusheng Li
- Department of Orthopeadics, Xiangya Hospital, Central South University, 87 Xiangya Rd, Changsha, 410008, Hunan, China.
| | - Tang Liu
- Department of Orthopaedics, The Second Xiangya Hospital of Central South University, 139 Renmin Middle Rd, Changsha, 410011, Hunan, China.
| | - Dabao Xu
- Department of Obstetrics and Gynecology, The Third Xiangya Hospital of Central South University, 138, Tongzipo Road, Changsha, 410013, China.
| |
Collapse
19
Yin T, Zhao L, Yao S. Comprehensive characterization of m6A methylation and its impact on prognosis, genome instability, and tumor microenvironment in hepatocellular carcinoma. BMC Med Genomics 2022; 15:53. [PMID: 35260168 PMCID: PMC8905789 DOI: 10.1186/s12920-022-01207-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/31/2021] [Accepted: 03/07/2022] [Indexed: 02/08/2023]
Abstract
BACKGROUND N6-methyladenosine (m6A) RNA regulation was recently reported to be important in carcinogenesis and cancer development. However, the characteristics of m6A modification and its correlations with clinical features, genome instability, tumor microenvironments (TMEs), and immunotherapy responses in hepatocellular carcinoma (HCC) have not been fully explored. METHODS We systematically analyzed the m6A regulator-based expression patterns of 486 patients with HCC from The Cancer Genome Atlas and Gene Expression Omnibus databases, and correlated these patterns with clinical outcomes, somatic mutations, TME cell infiltration, and immunotherapy responses. The m6A score was developed by principal component analysis to evaluate m6A modifications in individual patients. RESULTS M6A regulators were dysregulated in HCC samples, among which 18 m6A regulators were identified as risk factors for prognosis. Three m6A regulator-based expression patterns, namely m6A clusters, were determined among HCC patients by m6A regulators with different m6A scores, somatic mutation counts, and specific TME features. Additionally, three distinct m6A regulator-associated gene-based expression patterns were also identified based on prognosis-associated genes that were differentially expressed among the three m6A clusters, showing similar properties as the m6A regulator-based expression patterns. Higher m6A scores were correlated with older age, advanced stages, lower overall survival, higher somatic mutation counts, elevated PD-L1 expression levels, and poorer responses to immune checkpoint inhibitors. The m6A score was validated as an independent and valuable prognostic factor for HCC. CONCLUSION M6A modification is correlated with genome instability and TME in HCC. Evaluating m6A regulator-based expression patterns and the m6A score of individual tumors may help identify candidate biomarkers for prognosis prediction and immunotherapeutic strategy selection.
Affiliation(s)
- Tengfei Yin
- Peking University China-Japan Friendship School of Clinical Medicine, No. 2 Yinghua East Road, Chaoyang District, Beijing, 100029, China
- Department of Gastroenterology, China-Japan Friendship Hospital, Beijing, China
- Lang Zhao
- Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China
- Shukun Yao
- Peking University China-Japan Friendship School of Clinical Medicine, No. 2 Yinghua East Road, Chaoyang District, Beijing, 100029, China.
- Department of Gastroenterology, China-Japan Friendship Hospital, Beijing, China.
20
Zhang N, Zang T. A multi-network integration approach for measuring disease similarity based on ncRNA regulation and heterogeneous information. BMC Bioinformatics 2022; 23:89. [PMID: 35255810 PMCID: PMC8902705 DOI: 10.1186/s12859-022-04613-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2022] [Accepted: 02/14/2022] [Indexed: 11/28/2022]
Abstract
Background Measuring similarity between complex diseases has significant implications for revealing disease pathogenesis and for biomedical research. It is widely accepted that functional associations between disease-related genes, together with semantic associations, can be used to calculate disease similarity. A growing number of studies have demonstrated the profound involvement of non-coding RNA (ncRNA) in the regulation of genome organization and gene expression, so taking ncRNA into account can be useful in measuring disease similarity. However, existing methods ignore the regulatory functions of ncRNA in biological processes. In this study, we propose a novel deep-learning method to infer disease similarity. Results We propose ImpAESim, a framework that integrates multiple network embeddings to learn compact feature representations and calculate disease similarity. We first use three different disease-related information networks to build a heterogeneous network. After a network diffusion process, random walk with restart (RWR), a compact feature learning model composed of a classic autoencoder (AE) and an improved AE model extracts constraints and low-dimensional feature representations. We finally obtain an accurate, low-dimensional feature representation of diseases and employ the cosine distance as the measure of disease similarity. Conclusion ImpAESim focuses on extracting a low-dimensional vector representation of features based on ncRNA regulation and the gene–gene interaction network. Our method can significantly reduce the calculation bias resulting from the sparse disease associations derived from semantic associations.
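The final step described above, comparing diseases by the cosine measure on their low-dimensional feature vectors, can be sketched as follows. This is an illustration only and not the ImpAESim code: the disease names and embedding values are hypothetical, and the vectors stand in for whatever the autoencoder stage would produce.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def cosine_distance(u, v):
    """Cosine distance, defined here as 1 - cosine similarity."""
    return 1.0 - cosine_similarity(u, v)


# HYPOTHETICAL low-dimensional disease embeddings (placeholder values).
embeddings = {
    "disease_A": [1.0, 2.0, 0.0],
    "disease_B": [2.0, 4.0, 0.0],  # same direction as disease_A
    "disease_C": [0.0, 0.0, 3.0],  # orthogonal to disease_A
}

sim_ab = cosine_similarity(embeddings["disease_A"], embeddings["disease_B"])
sim_ac = cosine_similarity(embeddings["disease_A"], embeddings["disease_C"])
```

Because the cosine measure ignores vector length, two diseases whose embeddings point in the same direction score as maximally similar regardless of magnitude.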
Affiliation(s)
- Ningyi Zhang
- Department of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
- Tianyi Zang
- Department of Computer Science and Technology, Harbin Institute of Technology, Harbin, China.
21
Integrative analysis of OIP5-AS1/miR-129-5p/CREBBP axis as a potential therapeutic candidate in the pathogenesis of metal toxicity-induced Alzheimer's disease. GENE REPORTS 2022. [DOI: 10.1016/j.genrep.2021.101442] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/14/2022]
22
Manavalan B, Basith S, Lee G. Comparative analysis of machine learning-based approaches for identifying therapeutic peptides targeting SARS-CoV-2. Brief Bioinform 2022; 23:bbab412. [PMID: 34595489 PMCID: PMC8500067 DOI: 10.1093/bib/bbab412] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 07/01/2021] [Revised: 08/27/2021] [Accepted: 09/07/2021] [Indexed: 01/08/2023]
Abstract
Coronavirus disease 2019 (COVID-19) has impacted public health as well as societal and economic well-being. In the last two decades, various prediction algorithms and tools have been developed for predicting antiviral peptides (AVPs). The current COVID-19 pandemic has underscored the need to develop more efficient and accurate machine learning (ML)-based prediction algorithms for the rapid identification of therapeutic peptides against severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2). Several peptide-based ML approaches, including anti-coronavirus peptides (ACVPs), IL-6 inducing epitopes and other epitopes targeting SARS-CoV-2, have been implemented in COVID-19 therapeutics. Owing to the growing interest in the COVID-19 field, it is crucial to systematically compare the existing ML algorithms based on their performances. Accordingly, we comprehensively evaluated the state-of-the-art IL-6 and AVP predictors against coronaviruses in terms of core algorithms, feature encoding schemes, performance evaluation metrics and software usability. A comprehensive performance assessment was then conducted to evaluate the robustness and scalability of the existing predictors using well-constructed independent validation datasets. Additionally, we discussed the advantages and disadvantages of the existing methods, providing useful insights into the development of novel computational tools for characterizing and identifying epitopes or ACVPs. The insights gained from this review are anticipated to provide critical guidance to the scientific community in the rapid design and development of accurate and efficient next-generation in silico tools against SARS-CoV-2.
Affiliation(s)
- Shaherin Basith
- Department of Physiology, Ajou University School of Medicine, Suwon 16499, Korea
- Gwang Lee
- Department of Physiology, Ajou University School of Medicine, Suwon 16499, Korea
23
Gupta R, Kumar P. CREB1 K292 and HINFP K330 as Putative Common Therapeutic Targets in Alzheimer's and Parkinson's Disease. ACS OMEGA 2021; 6:35780-35798. [PMID: 34984308 PMCID: PMC8717564 DOI: 10.1021/acsomega.1c05827] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 10/18/2021] [Accepted: 12/07/2021] [Indexed: 05/16/2023]
Abstract
Integration of omics data and deciphering the mechanism of a biological regulatory network could be a promising approach to reveal the molecular mechanism involved in the progression of complex diseases, including Alzheimer's and Parkinson's. Despite having an overlapping mechanism in the etiology of Alzheimer's disease (AD) and Parkinson's disease (PD), the exact mechanism and signaling molecules behind them are still unknown. Further, the acetylation mechanism and histone deacetylase (HDAC) enzymes provide a positive direction toward studying the shared phenomenon between AD and PD pathogenesis. For instance, increased expression of HDACs causes a decrease in protein acetylation status, resulting in decreased cognitive and memory function. Herein, we employed an integrative approach to analyze the transcriptomics data that established a potential relationship between AD and PD. Data preprocessing and analysis of four publicly available microarray datasets revealed 10 HUB proteins, namely, CDC42, CD44, FGFR1, MYO5A, NUMA1, TUBB4B, ARHGEF9, USP5, INPP5D, and NUP93, that may be involved in the shared mechanism of AD and PD pathogenesis. Further, we identified the relationship between the HUB proteins and transcription factors that could be involved in the overlapping mechanism of AD and PD. CREB1 and HINFP were the crucial regulatory transcription factors that were involved in the AD and PD crosstalk. Further, lysine acetylation sites and HDAC enzyme prediction revealed the involvement of 15 and 27 potential lysine residues of CREB1 and HINFP, respectively. Our results highlighted the importance of HDAC1(K292) and HDAC6(K330) association with CREB1 and HINFP, respectively, in the AD and PD crosstalk. However, different datasets with a large number of samples and wet lab experimentation are required to validate and pinpoint the exact role of CREB1 and HINFP in the AD and PD crosstalk. 
Different datasets may also affect the results, depending on the analysis parameters. In conclusion, our study highlights the crucial proteins, transcription factors, biological pathways, lysine residues, and HDAC enzymes potentially shared between AD and PD at the molecular level. The findings can guide further molecular studies to identify possible relationships in the AD-PD crosstalk.
Affiliation(s)
- Rohan Gupta
- Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University (Formerly DCE), Delhi 110042, India
- Pravir Kumar
- Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University (Formerly DCE), Delhi 110042, India
24
Basith S, Lee G, Manavalan B. STALLION: a stacking-based ensemble learning framework for prokaryotic lysine acetylation site prediction. Brief Bioinform 2021; 23:6370848. [PMID: 34532736 PMCID: PMC8769686 DOI: 10.1093/bib/bbab376] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Received: 06/04/2021] [Revised: 08/22/2021] [Accepted: 08/24/2021] [Indexed: 12/13/2022]
Abstract
Protein post-translational modification (PTM) is an important regulatory mechanism that plays a key role in both normal and disease states. Acetylation on lysine residues is one of the most potent PTMs owing to its critical role in cellular metabolism and regulatory processes. Identifying protein lysine acetylation (Kace) sites is a challenging task in bioinformatics. To date, several machine learning-based methods for the in silico identification of Kace sites have been developed. Of those, a few are prokaryotic species-specific. Despite their attractive advantages and performances, these methods have certain limitations. Therefore, this study proposes a novel predictor STALLION (STacking-based Predictor for ProkAryotic Lysine AcetyLatION), containing six prokaryotic species-specific models to identify Kace sites accurately. To extract crucial patterns around Kace sites, we employed 11 different encodings representing three different characteristics. Subsequently, a systematic and rigorous feature selection approach was employed to identify the optimal feature set independently for five tree-based ensemble algorithms and to build their respective baseline models for each species. Finally, the predicted values from the baseline models were used to train an appropriate classifier with the stacking strategy to develop STALLION. Comparative benchmarking experiments showed that STALLION significantly outperformed existing predictors on independent tests. To expedite direct accessibility to the STALLION models, a user-friendly online predictor was implemented, which is available at: http://thegleelab.org/STALLION.
Affiliation(s)
- Shaherin Basith
- Department of Physiology, Ajou University School of Medicine, Republic of Korea
- Gwang Lee
- Department of Molecular Science and Technology, Ajou University, Suwon 16499, Republic of Korea
25
i6mA-VC: A Multi-Classifier Voting Method for the Computational Identification of DNA N6-methyladenine Sites. Interdiscip Sci 2021; 13:413-425. [PMID: 33834381 DOI: 10.1007/s12539-021-00429-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 12/19/2020] [Revised: 03/26/2021] [Accepted: 03/29/2021] [Indexed: 12/14/2022]
Abstract
DNA N6-methyladenine (6mA), an essential component of epigenetic modification, plays a non-negligible role in genetic regulation. The efficient and accurate prediction of 6mA sites is beneficial to the development of biological genetics. Biochemical experimental methods are considered to be time-consuming and laborious. Most of the established machine learning methods rely on a single dataset. Although some of them have achieved cross-species prediction, their results are not satisfactory. Therefore, we designed a novel statistical model called i6mA-VC to improve the prediction accuracy for 6mA sites. On the one hand, kmer and binary encoding are applied to extract features, and then the gradient boosting decision tree (GBDT) embedded method is applied as the feature selection strategy. On the other hand, DNA sequences are represented by vectors through the feature extraction method of ring-function-hydrogen-chemical properties (RFHCP) and the feature selection strategy of ExtraTree. After fusing the two optimal feature sets, a voting classifier based on GBDT, light gradient boosting machine (LightGBM) and multilayer perceptron classifier (MLPC) is constructed for final classification and prediction. The accuracies on the rice and M. musculus datasets with five-fold cross-validation are 0.888 and 0.967, respectively. The cross-species dataset is selected as an independent testing dataset, on which the accuracy reaches 0.848. Rigorous experiments demonstrate that the proposed predictor is convincing and applicable. The i6mA-VC predictor offers an effective way to recognize N6-methyladenine sites, and it will also help biological geneticists further study gene expression and DNA modification. In addition, an accessible web server for i6mA-VC is available from http://www.zhanglab.site/.
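The voting step can be illustrated with a small sketch. The abstract does not say whether i6mA-VC uses hard (majority) or soft (probability-averaging) voting, so soft voting is assumed here. The three probability lists are made-up stand-ins for the outputs of the GBDT, LightGBM and MLPC models, not real predictions.

```python
def soft_vote(prob_lists, threshold=0.5):
    """Average per-sample positive-class probabilities across models, then threshold.

    prob_lists: one list of probabilities per model, all of the same length.
    Returns (labels, averaged_probabilities).
    """
    n_models = len(prob_lists)
    n_samples = len(prob_lists[0])
    if any(len(p) != n_samples for p in prob_lists):
        raise ValueError("every model must score the same number of samples")
    averaged = [sum(p[i] for p in prob_lists) / n_models for i in range(n_samples)]
    labels = [1 if a >= threshold else 0 for a in averaged]
    return labels, averaged


# HYPOTHETICAL per-sample probabilities that a site is 6mA, one list per model.
gbdt_probs = [0.9, 0.4, 0.2]
lightgbm_probs = [0.8, 0.6, 0.1]
mlpc_probs = [0.7, 0.3, 0.6]

labels, averaged = soft_vote([gbdt_probs, lightgbm_probs, mlpc_probs])
```

In this toy input the second sample is called positive by one model only, so the averaged probability stays below the threshold and the ensemble labels it negative.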
26
Li S, Yu K, Wu G, Zhang Q, Wang P, Zheng J, Liu ZX, Wang J, Gao X, Cheng H. pCysMod: Prediction of Multiple Cysteine Modifications Based on Deep Learning Framework. Front Cell Dev Biol 2021; 9:617366. [PMID: 33732693 PMCID: PMC7959776 DOI: 10.3389/fcell.2021.617366] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 10/14/2020] [Accepted: 01/12/2021] [Indexed: 12/18/2022]
Abstract
Thiol groups on cysteines can undergo multiple post-translational modifications (PTMs), acting as a molecular switch to maintain redox homeostasis and regulate a series of cell signaling transductions. Identification of sophisticated protein cysteine modifications is crucial for dissecting their underlying regulatory mechanisms. As alternatives to time-consuming and labor-intensive experimental methods, various computational methods have attracted intense research interest due to their convenience and low cost. Here, we developed the first comprehensive deep learning-based tool, pCysMod, for multiple protein cysteine modification prediction, including S-nitrosylation, S-palmitoylation, S-sulfenylation, S-sulfhydration, and S-sulfinylation. Experimentally verified cysteine sites curated from the literature and sites collected by other databases and prediction tools were integrated as the benchmark dataset. Several protein sequence features were extracted and combined in a deep learning model, and the hyperparameters were optimized by particle swarm optimization algorithms. Cross-validations indicated that our model showed excellent robustness and outperformed existing tools, achieving average AUCs of 0.793, 0.807, 0.796, 0.793, and 0.876 for S-nitrosylation, S-palmitoylation, S-sulfenylation, S-sulfhydration, and S-sulfinylation, respectively, demonstrating that pCysMod was stable and suitable for protein cysteine modification prediction. In addition, we constructed a comprehensive protein cysteine modification prediction web server based on this model to help researchers find potential modification sites in their proteins of interest, which can be accessed at http://pcysmod.omicsbio.info. This work will undoubtedly greatly promote the study of protein cysteine modification and contribute to clarifying the biological regulation mechanisms of cysteine modification within and among cells.
Affiliation(s)
- Shihua Li
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou, China; School of Life Sciences, Zhengzhou University, Zhengzhou, China
- Kai Yu
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou, China
- Guandi Wu
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou, China
- Qingfeng Zhang
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou, China
- Panqin Wang
- School of Life Sciences, Zhengzhou University, Zhengzhou, China
- Jian Zheng
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou, China
- Ze-Xian Liu
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou, China
- Jichao Wang
- CAS Key Lab of Biobased Materials, Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, Qingdao, China
- Xinjiao Gao
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, Hefei National Laboratory for Physical Sciences at the Microscale, University of Science and Technology of China, Hefei, China
- Han Cheng
- School of Life Sciences, Zhengzhou University, Zhengzhou, China
27
Wen B, Zeng W, Liao Y, Shi Z, Savage SR, Jiang W, Zhang B. Deep Learning in Proteomics. Proteomics 2020; 20:e1900335. [PMID: 32939979 PMCID: PMC7757195 DOI: 10.1002/pmic.201900335] [Citation(s) in RCA: 78] [Impact Index Per Article: 15.6] [Received: 05/27/2020] [Revised: 09/14/2020] [Indexed: 12/17/2022]
Abstract
Proteomics, the study of all the proteins in biological systems, is becoming a data-rich science. Protein sequences and structures are comprehensively catalogued in online databases. With recent advancements in tandem mass spectrometry (MS) technology, protein expression and post-translational modifications (PTMs) can be studied in a variety of biological systems at the global scale. Sophisticated computational algorithms are needed to translate the vast amount of data into novel biological insights. Deep learning automatically extracts data representations at high levels of abstraction from data, and it thrives in data-rich scientific research domains. Here, a comprehensive overview of deep learning applications in proteomics, including retention time prediction, MS/MS spectrum prediction, de novo peptide sequencing, PTM prediction, major histocompatibility complex-peptide binding prediction, and protein structure prediction, is provided. Limitations and the future directions of deep learning in proteomics are also discussed. This review will provide readers an overview of deep learning and how it can be used to analyze proteomics data.
Affiliation(s)
- Bo Wen
- Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Wen‐Feng Zeng
- Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Chinese Academy of Sciences, Institute of Computing Technology, Beijing 100190, China
- Yuxing Liao
- Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Zhiao Shi
- Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Sara R. Savage
- Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Wen Jiang
- Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Bing Zhang
- Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
28
Liang Y, Zhou R, Liang X, Kong X, Yang B. Pharmacological targets and molecular mechanisms of plumbagin to treat colorectal cancer: A systematic pharmacology study. Eur J Pharmacol 2020; 881:173227. [PMID: 32505664 DOI: 10.1016/j.ejphar.2020.173227] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 09/27/2019] [Revised: 05/14/2020] [Accepted: 05/28/2020] [Indexed: 12/14/2022]
Abstract
Plumbagin (PL) exerts anti-proliferative effects in cancer cells, including effective suppression of colorectal cancer (CRC). However, the exact molecular mechanism by which PL acts against CRC remains unclear. Using the SwissTargetPrediction and SuperPred databases, the anti-cancer biotargets of PL were identified, and CRC-associated targets were obtained from the DisGeNET database. The biological processes and signaling pathways of PL in treating CRC were identified and visualized. Further, clinical and cell culture data were used to validate some of the bioinformatic findings. The bioinformatic analysis yielded 64 predicted biotargets of PL for treating CRC, of which the 7 most important were tumor protein p53 (TP53), glyceraldehyde-3-phosphate dehydrogenase (GAPDH), mitogen-activated protein kinase 1 (MAPK1), E1A-associated protein p300 (EP300), poly (ADP-ribose) polymerase 1 (PARP1), nuclear factor kappa p65 protein (RELA), and Bcl-2 like protein 1 (BCL2L1). In addition, the top 20 functional biological processes and signaling pathways of PL in treating CRC were screened and prioritized. In the human study, CRC samples showed elevated expression of MAPK1 and PARP1 mRNAs and a reduced EP300 mRNA level. In the cell culture study, PL treatment of CRC cells down-regulated MAPK1 and PARP1 mRNA expression and up-regulated the EP300 mRNA level, accompanied by suppressed cell proliferation. Taken together, the therapeutic biotargets and molecular mechanisms of PL in treating CRC were screened and identified using a systematic pharmacology analysis, and some bioinformatic findings were validated in clinical and cell line experiments. Potentially, these hub biotargets may serve as biomarkers for CRC detection and treatment.
Affiliation(s)
- Yujia Liang
- College of Pharmacy, Guangxi Medical University, Guangxi, Nanning, PR China
- Rui Zhou
- Department of Hepatobiliary Surgery, Guigang City People's Hospital, The Eighth Affiliated Hospital of Guangxi Medical University, Guigang, Guangxi, PR China
- Xiaoliu Liang
- College of Pharmacy, Guangxi Medical University, Guangxi, Nanning, PR China
- Xiaolong Kong
- College of Pharmacy, Guangxi Medical University, Guangxi, Nanning, PR China.
- Bin Yang
- College of Pharmacy, Guangxi Medical University, Guangxi, Nanning, PR China.