Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Kumar M, Gromiha MM, Raghava GPS. SVM based prediction of RNA-binding proteins using binding residues and evolutionary information. J Mol Recognit 2011;24:303-13. [DOI: 10.1002/jmr.1061] [Citation(s) in RCA: 96] [Impact Index Per Article: 6.9] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]

Cited by Other Article(s)

Wu S, Xu J, Guo JT. Accurate prediction of nucleic acid binding proteins using protein language model. BIOINFORMATICS ADVANCES 2025;5:vbaf008. [PMID: 39990254 PMCID: PMC11845279 DOI: 10.1093/bioadv/vbaf008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 09/22/2024] [Revised: 12/20/2024] [Accepted: 01/15/2025] [Indexed: 02/25/2025]

Pradhan UK, Naha S, Das R, Gupta A, Parsad R, Meher PK. RBProkCNN: Deep learning on appropriate contextual evolutionary information for RNA binding protein discovery in prokaryotes. Comput Struct Biotechnol J 2024;23:1631-1640. [PMID: 38660008 PMCID: PMC11039349 DOI: 10.1016/j.csbj.2024.04.034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/16/2024] [Revised: 04/12/2024] [Accepted: 04/12/2024] [Indexed: 04/26/2024] Open

Wu S, Guo JT. Improved prediction of DNA and RNA binding proteins with deep learning models. Brief Bioinform 2024;25:bbae285. [PMID: 38856168 PMCID: PMC11163377 DOI: 10.1093/bib/bbae285] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/05/2024] [Revised: 05/20/2024] [Accepted: 05/31/2024] [Indexed: 06/11/2024] Open

Kumar N, Tripathi S, Sharma N, Patiyal S, Devi NL, Raghava GPS. A method for predicting linear and conformational B-cell epitopes in an antigen from its primary sequence. Comput Biol Med 2024;170:108083. [PMID: 38295479 DOI: 10.1016/j.compbiomed.2024.108083] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/21/2023] [Revised: 12/26/2023] [Accepted: 01/27/2024] [Indexed: 02/02/2024]

Iwaniak A, Minkiewicz P, Darewicz M. Bioinformatics and bioactive peptides from foods: Do they work together? ADVANCES IN FOOD AND NUTRITION RESEARCH 2024;108:35-111. [PMID: 38461003 DOI: 10.1016/bs.afnr.2023.09.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 03/11/2024]

Avila-Lopez P, Lauberth SM. Exploring new roles for RNA-binding proteins in epigenetic and gene regulation. Curr Opin Genet Dev 2024;84:102136. [PMID: 38128453 PMCID: PMC11245729 DOI: 10.1016/j.gde.2023.102136] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/22/2023] [Revised: 11/12/2023] [Accepted: 11/15/2023] [Indexed: 12/23/2023]

Pradhan UK, Meher PK, Naha S, Pal S, Gupta S, Gupta A, Parsad R. RBPLight: a computational tool for discovery of plant-specific RNA-binding proteins using light gradient boosting machine and ensemble of evolutionary features. Brief Funct Genomics 2023;22:401-410. [PMID: 37158175 DOI: 10.1093/bfgp/elad016] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/05/2022] [Revised: 04/12/2023] [Accepted: 04/21/2023] [Indexed: 05/10/2023] Open

Arican OC, Gumus O. PredDRBP-MLP: Prediction of DNA-binding proteins and RNA-binding proteins by multilayer perceptron. Comput Biol Med 2023;164:107317. [PMID: 37562328 DOI: 10.1016/j.compbiomed.2023.107317] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/02/2023] [Revised: 07/27/2023] [Accepted: 08/07/2023] [Indexed: 08/12/2023]

Agarwal A, Kant S, Bahadur RP. Efficient mapping of RNA-binding residues in RNA-binding proteins using local sequence features of binding site residues in protein-RNA complexes. Proteins 2023;91:1361-1379. [PMID: 37254800 DOI: 10.1002/prot.26528] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/07/2022] [Revised: 04/13/2023] [Accepted: 05/02/2023] [Indexed: 06/01/2023]

Abstract

Protein-RNA interactions play vital roles in a plethora of biological processes such as regulation of gene expression, protein synthesis, mRNA processing and biogenesis. Identification of RNA-binding residues (RBRs) in proteins is essential to understand RNA-mediated protein functioning, to perform site-directed mutagenesis and to develop novel targeted drug therapies. Moreover, the extensive gap between sequence and structural data restricts the identification of binding sites in unsolved structures. However, efficient use of computational methods that require only sequence to identify binding residues can bridge this large sequence-structure gap. In this study, we have extensively studied the protein-RNA interface in known RNA-binding proteins (RBPs). We find that the interface is highly enriched in basic and polar residues, with Gly being the most common interface neighbor. We investigated several amino acid features and developed a method to predict putative RBRs from amino acid sequence. We have implemented a balanced random forest (BRF) classifier with local residue features of protein sequences for prediction. With 5-fold cross-validation, the sequence pattern derived dipeptide composition based BRF model (DCP-BRF) resulted in an accuracy of 87.9%, specificity of 88.8%, sensitivity of 82.2%, Matthews correlation coefficient of 0.60 and AUC of 0.93, performing better than a few existing methods. We further validated our prediction model on known human RBPs through RBR prediction and could map ~54% of them. Further, knowledge of binding site preferences obtained from computational predictions combined with experimental validations of potential RNA binding sites can enhance our understanding of protein-RNA interactions. This may serve to accelerate investigations on functional roles of many novel RBPs.
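
The performance figures quoted in this abstract (accuracy, specificity, sensitivity, Matthews correlation coefficient) all derive from a binary confusion matrix. The following is a minimal sketch of the standard definitions, not the authors' evaluation code; the function name `binary_metrics` and the example counts are hypothetical and chosen only for illustration.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Compute standard binary-classification metrics from confusion-matrix counts.

    tp/fn: binding residues predicted as binding / non-binding.
    tn/fp: non-binding residues predicted as non-binding / binding.
    """
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    accuracy = (tp + tn) / total
    # Sensitivity: fraction of true positives that were recovered.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of true negatives that were recovered.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0

    # Matthews correlation coefficient; defined as 0 when any margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Hypothetical counts, for illustration only (not data from the study).
example = binary_metrics(tp=8, tn=9, fp=1, fn=2)
```

Because MCC uses all four cells of the confusion matrix, it is less inflated than accuracy when non-binding residues greatly outnumber binding residues, which is the usual situation in residue-level prediction.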

Solis-Miranda J, Chodasiewicz M, Skirycz A, Fernie AR, Moschou PN, Bozhkov PV, Gutierrez-Beltran E. Stress-related biomolecular condensates in plants. THE PLANT CELL 2023;35:3187-3204. [PMID: 37162152 PMCID: PMC10473214 DOI: 10.1093/plcell/koad127] [Citation(s) in RCA: 32] [Impact Index Per Article: 16.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/08/2022] [Revised: 04/07/2023] [Accepted: 04/27/2023] [Indexed: 05/11/2023]

Jin W, Brannan KW, Kapeli K, Park SS, Tan HQ, Gosztyla ML, Mujumdar M, Ahdout J, Henroid B, Rothamel K, Xiang JS, Wong L, Yeo GW. HydRA: Deep-learning models for predicting RNA-binding capacity from protein interaction association context and protein sequence. Mol Cell 2023;83:2595-2611.e11. [PMID: 37421941 PMCID: PMC11098078 DOI: 10.1016/j.molcel.2023.06.019] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/03/2023] [Revised: 03/20/2023] [Accepted: 06/13/2023] [Indexed: 07/10/2023]

Affiliation(s)

Wenhao Jin Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA, USA; Institute for Genomic Medicine and UCSD Stem Cell Program, University of California, San Diego, La Jolla, CA, USA; Stem Cell Program, University of California, San Diego, La Jolla, CA, USA
Kristopher W Brannan Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA, USA; Institute for Genomic Medicine and UCSD Stem Cell Program, University of California, San Diego, La Jolla, CA, USA; Stem Cell Program, University of California, San Diego, La Jolla, CA, USA
Katannya Kapeli Department of Physiology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
Samuel S Park Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA, USA; Institute for Genomic Medicine and UCSD Stem Cell Program, University of California, San Diego, La Jolla, CA, USA; Stem Cell Program, University of California, San Diego, La Jolla, CA, USA
Hui Qing Tan Department of Physiology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
Maya L Gosztyla Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA, USA; Institute for Genomic Medicine and UCSD Stem Cell Program, University of California, San Diego, La Jolla, CA, USA; Stem Cell Program, University of California, San Diego, La Jolla, CA, USA
Mayuresh Mujumdar Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA, USA; Institute for Genomic Medicine and UCSD Stem Cell Program, University of California, San Diego, La Jolla, CA, USA; Stem Cell Program, University of California, San Diego, La Jolla, CA, USA
Joshua Ahdout Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA, USA; Institute for Genomic Medicine and UCSD Stem Cell Program, University of California, San Diego, La Jolla, CA, USA; Stem Cell Program, University of California, San Diego, La Jolla, CA, USA
Bryce Henroid Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA, USA; Institute for Genomic Medicine and UCSD Stem Cell Program, University of California, San Diego, La Jolla, CA, USA; Stem Cell Program, University of California, San Diego, La Jolla, CA, USA
Katherine Rothamel Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA, USA; Institute for Genomic Medicine and UCSD Stem Cell Program, University of California, San Diego, La Jolla, CA, USA; Stem Cell Program, University of California, San Diego, La Jolla, CA, USA
Joy S Xiang Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA, USA; Institute for Genomic Medicine and UCSD Stem Cell Program, University of California, San Diego, La Jolla, CA, USA; Stem Cell Program, University of California, San Diego, La Jolla, CA, USA
Limsoon Wong Department of Computer Science, National University of Singapore, Singapore, Singapore
Gene W Yeo Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA, USA; Institute for Genomic Medicine and UCSD Stem Cell Program, University of California, San Diego, La Jolla, CA, USA; Stem Cell Program, University of California, San Diego, La Jolla, CA, USA

Yan K, Feng J, Huang J, Wu H. iDRPro-SC: identifying DNA-binding proteins and RNA-binding proteins based on subfunction classifiers. Brief Bioinform 2023:bbad251. [PMID: 37405873 DOI: 10.1093/bib/bbad251] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/27/2023] [Revised: 06/10/2023] [Accepted: 06/12/2023] [Indexed: 07/07/2023] Open

Wang Z, Zhu H. Exploiting liver metabolism for tissue-specific cancer targeting. NATURE CANCER 2023;4:310-311. [PMID: 36977775 DOI: 10.1038/s43018-023-00530-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 03/30/2023]

Li YT, Liu CJ, Kao JH, Lin LF, Tu HC, Wang CC, Huang PH, Cheng HR, Chen PJ, Chen DS, Wu HL. Metastatic tumor antigen 1 contributes to hepatocarcinogenesis posttranscriptionally through RNA-binding function. Hepatology 2023;77:379-394. [PMID: 35073601 DOI: 10.1002/hep.32356] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 06/23/2021] [Revised: 01/07/2022] [Accepted: 01/12/2022] [Indexed: 01/28/2023]

Affiliation(s)

Yung-Tsung Li Hepatitis Research Center, National Taiwan University Hospital, Taipei, Taiwan; Department of Internal Medicine, National Taiwan University Hospital, Taipei, Taiwan; Graduate Institute of Clinical Medicine, National Taiwan University College of Medicine, Taipei, Taiwan
Chun-Jen Liu Hepatitis Research Center, National Taiwan University Hospital, Taipei, Taiwan; Department of Internal Medicine, National Taiwan University Hospital, Taipei, Taiwan; Graduate Institute of Clinical Medicine, National Taiwan University College of Medicine, Taipei, Taiwan
Jia-Horng Kao Hepatitis Research Center, National Taiwan University Hospital, Taipei, Taiwan; Department of Internal Medicine, National Taiwan University Hospital, Taipei, Taiwan; Graduate Institute of Clinical Medicine, National Taiwan University College of Medicine, Taipei, Taiwan
Li-Feng Lin Hepatitis Research Center, National Taiwan University Hospital, Taipei, Taiwan
Hui-Chu Tu Hepatitis Research Center, National Taiwan University Hospital, Taipei, Taiwan
Chih-Chiang Wang Graduate Institute of Clinical Medicine, National Taiwan University College of Medicine, Taipei, Taiwan
Po-Hsi Huang Hepatitis Research Center, National Taiwan University Hospital, Taipei, Taiwan
Huei-Ru Cheng Graduate Institute of Clinical Medicine, National Taiwan University College of Medicine, Taipei, Taiwan
Pei-Jer Chen Hepatitis Research Center, National Taiwan University Hospital, Taipei, Taiwan; Department of Internal Medicine, National Taiwan University Hospital, Taipei, Taiwan; Graduate Institute of Clinical Medicine, National Taiwan University College of Medicine, Taipei, Taiwan
Ding-Shinn Chen Hepatitis Research Center, National Taiwan University Hospital, Taipei, Taiwan; Department of Internal Medicine, National Taiwan University Hospital, Taipei, Taiwan; Graduate Institute of Clinical Medicine, National Taiwan University College of Medicine, Taipei, Taiwan; Genomics Research Center, Academia Sinica, Taipei, Taiwan
Hui-Lin Wu Hepatitis Research Center, National Taiwan University Hospital, Taipei, Taiwan; Graduate Institute of Clinical Medicine, National Taiwan University College of Medicine, Taipei, Taiwan

Pande A, Patiyal S, Lathwal A, Arora C, Kaur D, Dhall A, Mishra G, Kaur H, Sharma N, Jain S, Usmani SS, Agrawal P, Kumar R, Kumar V, Raghava GPS. Pfeature: A Tool for Computing Wide Range of Protein Features and Building Prediction Models. J Comput Biol 2023;30:204-222. [PMID: 36251780 DOI: 10.1089/cmb.2022.0241] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/24/2023] Open

Abstract

In the last three decades, a wide range of protein features have been discovered to annotate a protein. Numerous attempts have been made to integrate these features in a software package/platform so that the user may compute a wide range of features from a single source. To complement the existing methods, we developed a method, Pfeature, for computing a wide range of protein features. Pfeature allows users to compute more than 200,000 features required for predicting the overall function of a protein, residue-level annotation of a protein, and function of chemically modified peptides. It has six major modules, namely, composition, binary profiles, evolutionary information, structural features, patterns, and model building. The composition module computes most of the existing compositional features, plus novel features. The binary profile of amino acid sequences makes it possible to compute the fraction of each type of residue as well as its position. The evolutionary information module computes evolutionary information of a protein in the form of a position-specific scoring matrix profile generated using Position-Specific Iterative Basic Local Alignment Search Tool (PSI-BLAST), which is suitable for annotation of a protein and its residues. A structural module was developed for computing structural features/descriptors from the tertiary structure of a protein. These features are suitable to predict the therapeutic potential of a protein containing non-natural or chemically modified residues. The model-building module implements various machine learning techniques for developing classification and regression models as well as feature selection. Pfeature also allows the generation of overlapping patterns and features from a protein. A user-friendly Pfeature is available as a web server, Python library, and stand-alone package.
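
The binary profile mentioned in this abstract can be pictured as a one-hot encoding of the sequence, from which per-residue fractions follow directly. The sketch below is an independent illustration written in plain Python and does not use or reproduce the Pfeature API; the names `binary_profile` and `residue_fractions` are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def binary_profile(sequence):
    """One-hot encode a protein sequence: one 20-element row per position.

    Non-standard residues (e.g. 'X') are encoded as an all-zero row.
    """
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    profile = []
    for residue in sequence.upper():
        row = [0] * len(AMINO_ACIDS)
        if residue in index:
            row[index[residue]] = 1
        profile.append(row)
    return profile


def residue_fractions(profile):
    """Fraction of positions occupied by each residue type (column means)."""
    n = len(profile)
    if n == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [sum(column) / n for column in zip(*profile)]


# Each row marks which residue sits at that position; the column means
# recover the amino acid composition of the whole sequence.
profile = binary_profile("ACA")
fractions = residue_fractions(profile)
```

The row index preserves positional information, while averaging over rows discards it, which is why the same matrix can serve both position-aware and composition-based features.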

Affiliation(s)

Akshara Pande Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
Sumeet Patiyal Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
Anjali Lathwal Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
Chakit Arora Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
Dilraj Kaur Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
Anjali Dhall Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
Gaurav Mishra Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India; Department of Electrical Engineering, Shiv Nadar University, Greater Noida, India
Harpreet Kaur Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India; Bioinformatics Centre, CSIR-Institute of Microbial Technology, Chandigarh, India
Neelam Sharma Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
Shipra Jain Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
Salman Sadullah Usmani Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India; Bioinformatics Centre, CSIR-Institute of Microbial Technology, Chandigarh, India
Piyush Agrawal Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India; Bioinformatics Centre, CSIR-Institute of Microbial Technology, Chandigarh, India
Rajesh Kumar Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India; Bioinformatics Centre, CSIR-Institute of Microbial Technology, Chandigarh, India
Vinod Kumar Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India; Bioinformatics Centre, CSIR-Institute of Microbial Technology, Chandigarh, India
Gajendra P S Raghava Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India

Kumar N, Patiyal S, Choudhury S, Tomer R, Dhall A, Raghava GPS. DMPPred: a tool for identification of antigenic regions responsible for inducing type 1 diabetes mellitus. Brief Bioinform 2023;24:6911429. [PMID: 36524996 DOI: 10.1093/bib/bbac525] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/22/2022] [Revised: 10/27/2022] [Accepted: 11/04/2022] [Indexed: 12/23/2022] Open

Du X, Hu J. Deep Multi-Label Joint Learning for RNA and DNA-Binding Proteins Prediction. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023;20:307-320. [PMID: 35148267 DOI: 10.1109/tcbb.2022.3150280] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/04/2023]

Wu Z, Basu S, Wu X, Kurgan L. qNABpredict: Quick, accurate, and taxonomy-aware sequence-based prediction of content of nucleic acid binding amino acids. Protein Sci 2023;32:e4544. [PMID: 36519304 PMCID: PMC9798252 DOI: 10.1002/pro.4544] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/16/2022] [Revised: 12/07/2022] [Accepted: 12/08/2022] [Indexed: 12/23/2022]

Wang N, Zhang J, Liu B. iDRBP-EL: Identifying DNA- and RNA- Binding Proteins Based on Hierarchical Ensemble Learning. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023;20:432-441. [PMID: 34932484 DOI: 10.1109/tcbb.2021.3136905] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/04/2023]

Liu Y, Niu G, Zhou J, Shen W, Corriou JP, Seferlis P. Hybrid Intelligent Fault Diagnosis Model Based on Improved MPCA-V for Sensors in a Laboratory-Scale Wastewater Treatment Process. Ind Eng Chem Res 2022. [DOI: 10.1021/acs.iecr.2c02334] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]

Shahnazari M, Zakipour Z, Razi H, Moghadam A, Alemzadeh A. Bioinformatics approaches for classification and investigation of the evolution of the Na/K-ATPase alpha-subunit. BMC Ecol Evol 2022;22:122. [PMID: 36289471 PMCID: PMC9609216 DOI: 10.1186/s12862-022-02071-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/19/2022] [Accepted: 09/29/2022] [Indexed: 11/22/2022] Open

Abstract

BACKGROUND

Na,K-ATPase is a key protein in maintaining membrane potential that has numerous additional cellular functions. Its catalytic subunit (α) is found in a wide range of organisms, from prokaryotes to complex eukaryotes. Several studies have been done to identify the functions and determine the evolutionary relationships of the α-subunit. However, a survey of a larger collection of protein sequences according to sequence similarity and their attributes is very important in revealing deeper evolutionary relationships and identifying specific amino acid differences among evolutionary groups that may have a functional role.

RESULTS

In this study, phylogenetic tree classification of 753 protein sequences resulted in four groups: prokaryotes (I), fungi and various kinds of Protista and some invertebrates (II), the main group of invertebrates (III), and vertebrates (IV), which was consistent with the species tree. The percentage of sequences that acquired a specific motif for the α/β subunit assembly increased from group I to group IV. The vertebrate sequences were divided into four groups according to isoforms, with each group conforming to the evolutionary path of vertebrates from fish to tetrapods. Data mining was used to identify the most effective attributes in classification of sequences. Using 1252 attributes extracted from the sequences, the decision tree classified them into five groups: Protista, prokaryotes, fungi, invertebrates and vertebrates. Also, vertebrates were divided into four subgroups (isoforms). Generally, the counts of different dipeptides and amino acid ratios were the most significant attributes for grouping. Sequence alignment identified the effective positions of the respective dipeptides in the separation of the groups. Thus, 208GC is apparently involved in the separation of vertebrates from the four other organism groups, and 41DH, 431FK, and 451KC were involved in the separation of vertebrate isoform types.

CONCLUSION

The application of phylogenetic and decision tree analysis to Na,K-ATPase provides a better understanding of the evolutionary changes according to the amino acid sequence and its related properties, which could lead to the identification of effective attributes in the separation of sequences into different groups of the phylogenetic tree. In this study, key evolution-related dipeptides are identified which can guide future experimental studies.
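
The abstract above reports dipeptide counts as the most informative attributes and names position-specific dipeptides such as 208GC. As a rough illustration of what such attributes look like, the following sketch counts overlapping dipeptides and reads the dipeptide at a given 1-based position. It is not the authors' pipeline: the function names are hypothetical, and it assumes positions index a plain sequence string, whereas the study's positions presumably refer to an alignment.

```python
from collections import Counter


def dipeptide_counts(sequence):
    """Count every overlapping pair of adjacent residues in the sequence."""
    seq = sequence.upper()
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def dipeptide_at(sequence, position):
    """Return the dipeptide starting at a 1-based position, or None if out of range."""
    if position < 1:
        return None
    start = position - 1
    pair = sequence.upper()[start:start + 2]
    return pair if len(pair) == 2 else None


# Toy sequence, for illustration only (not taken from the study's data set).
counts = dipeptide_counts("GCGCK")
second = dipeptide_at("AGCK", 2)
```

Counts like these can be fed to a decision tree as numeric attributes, while a positional lookup corresponds to a yes/no attribute such as "is the dipeptide at position 208 equal to GC".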

Balcerak A, Macech-Klicka E, Wakula M, Tomecki R, Goryca K, Rydzanicz M, Chmielarczyk M, Szostakowska-Rodzos M, Wisniewska M, Lyczek F, Helwak A, Tollervey D, Kudla G, Grzybowska EA. The RNA-Binding Landscape of HAX1 Protein Indicates Its Involvement in Translation and Ribosome Assembly. Cells 2022;11:cells11192943. [PMID: 36230905 PMCID: PMC9564044 DOI: 10.3390/cells11192943] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/16/2022] [Revised: 09/13/2022] [Accepted: 09/15/2022] [Indexed: 11/18/2022] Open

Affiliation(s)

Anna Balcerak Molecular and Translational Oncology, Maria Sklodowska-Curie National Research Institute of Oncology, 02-781 Warsaw, Poland
Ewelina Macech-Klicka Molecular and Translational Oncology, Maria Sklodowska-Curie National Research Institute of Oncology, 02-781 Warsaw, Poland
Maciej Wakula Molecular and Translational Oncology, Maria Sklodowska-Curie National Research Institute of Oncology, 02-781 Warsaw, Poland
Rafal Tomecki Laboratory of RNA Processing and Decay, Institute of Biochemistry and Biophysics, Polish Academy of Sciences, 02-106 Warsaw, Poland; Faculty of Biology, Institute of Genetics and Biotechnology, University of Warsaw, 02-106 Warsaw, Poland
Krzysztof Goryca Genomics Core Facility, Centre of New Technologies University of Warsaw, 02-097 Warsaw, Poland
Malgorzata Rydzanicz Department of Medical Genetics, Medical University of Warsaw, 02-106 Warsaw, Poland
Mateusz Chmielarczyk Molecular and Translational Oncology, Maria Sklodowska-Curie National Research Institute of Oncology, 02-781 Warsaw, Poland
Malgorzata Szostakowska-Rodzos Molecular and Translational Oncology, Maria Sklodowska-Curie National Research Institute of Oncology, 02-781 Warsaw, Poland
Marta Wisniewska Laboratory of Biological Chemistry of Metal Ions, Institute of Biochemistry and Biophysics, Polish Academy of Sciences, 02-106 Warsaw, Poland
Filip Lyczek Molecular and Translational Oncology, Maria Sklodowska-Curie National Research Institute of Oncology, 02-781 Warsaw, Poland
Aleksandra Helwak Wellcome Centre for Cell Biology, University of Edinburgh, Edinburgh EH9 3BF, UK
David Tollervey Wellcome Centre for Cell Biology, University of Edinburgh, Edinburgh EH9 3BF, UK
Grzegorz Kudla MRC Human Genetics Unit, University of Edinburgh, Edinburgh EH4 2XU, UK
Ewa A. Grzybowska Molecular and Translational Oncology, Maria Sklodowska-Curie National Research Institute of Oncology, 02-781 Warsaw, Poland

Feng J, Wang N, Zhang J, Liu B. iDRBP-ECHF: Identifying DNA- and RNA-binding proteins based on extensible cubic hybrid framework. Comput Biol Med 2022;149:105940. [PMID: 36044786 DOI: 10.1016/j.compbiomed.2022.105940] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/01/2022] [Revised: 07/10/2022] [Accepted: 08/06/2022] [Indexed: 11/28/2022]

Wang N, Zhang J, Liu B. IDRBP-PPCT: Identifying Nucleic Acid-Binding Proteins Based on Position-Specific Score Matrix and Position-Specific Frequency Matrix Cross Transformation. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022;19:2284-2293. [PMID: 33780341 DOI: 10.1109/tcbb.2021.3069263] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]

Peng X, Wang X, Guo Y, Ge Z, Li F, Gao X, Song J. RBP-TSTL is a two-stage transfer learning framework for genome-scale prediction of RNA-binding proteins. Brief Bioinform 2022;23:6596984. [PMID: 35649392 PMCID: PMC9294422 DOI: 10.1093/bib/bbac215] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/17/2022] [Revised: 04/25/2022] [Accepted: 05/06/2022] [Indexed: 11/27/2022] Open

Parra ALC, Bezerra LP, Shawar DE, Neto NAS, Mesquita FP, da Silva GO, Souza PFN. Synthetic antiviral peptides: a new way to develop targeted antiviral drugs. Future Virol 2022. [DOI: 10.2217/fvl-2021-0308] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/06/2023]

Zhang J, Yan K, Chen Q, Liu B. PreRBP-TL: prediction of species-specific RNA-binding proteins based on transfer learning. Bioinformatics 2022;38:2135-2143. [PMID: 35176130 DOI: 10.1093/bioinformatics/btac106] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/22/2021] [Revised: 11/18/2021] [Accepted: 02/15/2022] [Indexed: 02/03/2023] Open

Zhang J, Hess WR, Zhang C. "Life is short, and art is long": RNA degradation in cyanobacteria and model bacteria. MLIFE 2022;1:21-39. [PMID: 38818322 PMCID: PMC10989914 DOI: 10.1002/mlf2.12015] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 02/08/2022] [Revised: 03/03/2022] [Accepted: 03/03/2022] [Indexed: 06/01/2024]

Xie J, Zhang X, Zheng J, Hong X, Tong X, Liu X, Xue Y, Wang X, Zhang Y, Liu S. Two novel RNA-binding proteins identification through computational prediction and experimental validation. Genomics 2021;114:149-160. [PMID: 34921931 DOI: 10.1016/j.ygeno.2021.12.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/14/2021] [Revised: 08/05/2021] [Accepted: 12/13/2021] [Indexed: 11/16/2022]

Li HL, Pang YH, Liu B. BioSeq-BLM: a platform for analyzing DNA, RNA and protein sequences based on biological language models. Nucleic Acids Res 2021;49:e129. [PMID: 34581805 PMCID: PMC8682797 DOI: 10.1093/nar/gkab829] [Citation(s) in RCA: 146] [Impact Index Per Article: 36.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/18/2021] [Revised: 08/24/2021] [Accepted: 09/09/2021] [Indexed: 01/08/2023] Open

Niu M, Wu J, Zou Q, Liu Z, Xu L. rBPDL: Predicting RNA-Binding Proteins Using Deep Learning. IEEE J Biomed Health Inform 2021;25:3668-3676. [PMID: 33780344 DOI: 10.1109/jbhi.2021.3069259] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]

Gutierrez‐Beltran E, Elander PH, Dalman K, Dayhoff GW, Moschou PN, Uversky VN, Crespo JL, Bozhkov PV. Tudor staphylococcal nuclease is a docking platform for stress granule components and is essential for SnRK1 activation in Arabidopsis. EMBO J 2021;40:e105043. [PMID: 34287990 PMCID: PMC8447601 DOI: 10.15252/embj.2020105043] [Citation(s) in RCA: 48] [Impact Index Per Article: 12.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/18/2020] [Revised: 06/23/2021] [Accepted: 07/01/2021] [Indexed: 12/19/2022] Open

Zhang J, Chen Q, Liu B. DeepDRBP-2L: A New Genome Annotation Predictor for Identifying DNA-Binding Proteins and RNA-Binding Proteins Using Convolutional Neural Network and Long Short-Term Memory. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021;18:1451-1463. [PMID: 31722485 DOI: 10.1109/tcbb.2019.2952338] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]

Riediger M, Spät P, Bilger R, Voigt K, Maček B, Hess WR. Analysis of a photosynthetic cyanobacterium rich in internal membrane systems via gradient profiling by sequencing (Grad-seq). THE PLANT CELL 2021;33:248-269. [PMID: 33793824 PMCID: PMC8136920 DOI: 10.1093/plcell/koaa017] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/10/2020] [Accepted: 11/12/2020] [Indexed: 05/23/2023]

The interactome of multifunctional HAX1 protein suggests its role in the regulation of energy metabolism, de-aggregation, cytoskeleton organization and RNA-processing. Biosci Rep 2021;40:226900. [PMID: 33146709 PMCID: PMC7670567 DOI: 10.1042/bsr20203094] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/02/2020] [Revised: 10/14/2020] [Accepted: 11/02/2020] [Indexed: 01/07/2023] Open

Zhang ZM, Guan ZX, Wang F, Zhang D, Ding H. Application of Machine Learning Methods in Predicting Nuclear Receptors and their Families. Med Chem 2021;16:594-604. [PMID: 31584374 DOI: 10.2174/1573406415666191004125551] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/20/2019] [Revised: 06/18/2019] [Accepted: 08/23/2019] [Indexed: 11/22/2022]

Abstract

Nuclear receptors (NRs) are a superfamily of ligand-dependent transcription factors that are closely related to cell development, differentiation, reproduction, homeostasis, and metabolism. According to alignments of their conserved domains, NRs are classified into either seven or eight subfamilies: (1) NR1: thyroid hormone-like (thyroid hormone, retinoic acid, RAR-related orphan receptor, peroxisome proliferator-activated, vitamin D3-like), (2) NR2: HNF4-like (hepatocyte nuclear factor 4, retinoic acid X, tailless-like, COUP-TF-like, USP), (3) NR3: estrogen-like (estrogen, estrogen-related, glucocorticoid-like), (4) NR4: nerve growth factor IB-like (NGFI-B-like), (5) NR5: fushi tarazu-F1-like (fushi tarazu-F1-like), (6) NR6: germ cell nuclear factor-like (germ cell nuclear factor), and (7) NR0: knirps-like (knirps, knirps-related, embryonic gonad protein, ODR7, trithorax) and DAX-like (DAX, SHP); in the eight-subfamily scheme, NR0 is divided into (7) NR7: knirps-like and (8) NR8: DAX-like. Different NR families have different structural features and functions. Since the function of an NR is closely correlated with the subfamily it belongs to, it is highly desirable to identify NRs and their subfamilies rapidly and effectively. The knowledge acquired is essential for a proper understanding of normal and abnormal cellular mechanisms. With the advent of the post-genomics era, the number of proteins with known sequences has increased explosively. Conventional methods for accurately classifying NR families are experimental, with high cost and low efficiency. This has created a greater need for bioinformatics tools that effectively recognize NRs and their subfamilies for the purpose of understanding their biological function. In this review, we summarized the application of machine learning methods in the prediction of NRs from different aspects. We hope that this review will provide a reference for further research on the classification of NRs and their families.

Mishra A, Khanal R, Kabir WU, Hoque T. AIRBP: Accurate identification of RNA-binding proteins using machine learning techniques. Artif Intell Med 2021;113:102034. [PMID: 33685590 DOI: 10.1016/j.artmed.2021.102034] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/11/2020] [Revised: 01/19/2021] [Accepted: 02/09/2021] [Indexed: 12/25/2022]

Abstract

Identification of RNA-binding proteins (RBPs) that bind to ribonucleic acid molecules is an important problem in Computational Biology and Bioinformatics. Identifying RBPs is indispensable because they play crucial roles in post-transcriptional control of RNAs and RNA metabolism and have diverse roles in biological processes such as splicing, mRNA stabilization, mRNA localization, translation, RNA synthesis, folding-unfolding, modification, processing, and degradation. The existing experimental techniques for identifying RBPs are time-consuming and expensive. Therefore, identifying RBPs directly from the sequence using computational methods can be useful to annotate RBPs and assist the experimental design efficiently. In this work, we present a method called AIRBP, which is designed using an advanced machine learning technique, called stacking, to effectively predict RBPs by utilizing features extracted from evolutionary information, physicochemical properties, and disordered properties. Moreover, our method, AIRBP, uses the majority vote from RBPPred, DeepRBPPred, and the stacking model for the prediction of RBPs. The results show that AIRBP attains Accuracy (ACC), Balanced Accuracy (BACC), F1-score, and Matthews Correlation Coefficient (MCC) of 95.84 %, 94.71 %, 0.928, and 0.899, respectively, based on the training dataset, using 10-fold cross-validation (CV). Further evaluation of AIRBP on independent test sets reveals that it achieves ACC, BACC, F1-score, and MCC of 94.36 %, 94.28 %, 0.897, and 0.860 for the Human test set; 91.25 %, 93.00 %, 0.896, and 0.835 for the S. cerevisiae test set; and 90.60 %, 90.41 %, 0.934, and 0.775 for the A. thaliana test set, respectively. These results indicate that AIRBP outperforms the existing Deep- and TriPepSVM methods. Therefore, the proposed better-performing AIRBP can be useful for accurate identification and annotation of RBPs directly from the sequence and help gain valuable insight to treat critical diseases. Availability: Code and data are available at http://cs.uno.edu/∼tamjid/Software/AIRBP/code_data.zip.
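
For readers unfamiliar with the four scores quoted in the abstract above, the following is a minimal sketch of how ACC, BACC, F1-score, and MCC are conventionally computed from confusion-matrix counts, together with a simple majority vote over three binary predictors. The function names `binary_metrics` and `majority_vote` and the example counts are hypothetical and are not taken from the AIRBP code; the abstract does not describe how ties or edge cases are handled, so the zero-division guards here are assumptions.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute ACC, BACC, F1 and MCC from confusion-matrix counts.

    Hypothetical helper for illustration; undefined ratios fall back to 0.0.
    """
    def ratio(num, den):
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)   # recall on the positive (RBP) class
    specificity = ratio(tn, tn + fp)   # recall on the negative class
    acc = ratio(tp + tn, tp + fp + tn + fn)
    bacc = (sensitivity + specificity) / 2
    f1 = ratio(2 * tp, 2 * tp + fp + fn)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = ratio(tp * tn - fp * fn, mcc_den)
    return {"ACC": acc, "BACC": bacc, "F1": f1, "MCC": mcc}


def majority_vote(votes):
    """Return 1 if more than half of the binary votes are 1, else 0."""
    return int(sum(votes) > len(votes) / 2)


if __name__ == "__main__":
    # Made-up counts, not results from any cited paper.
    print(binary_metrics(tp=40, fp=10, tn=45, fn=5))
    # Three predictors voting on one protein (1 = RBP, 0 = non-RBP).
    print(majority_vote([1, 0, 1]))
```

Reporting BACC and MCC alongside ACC is useful when, as is typical for RBP datasets, the positive and negative classes differ in size.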

The evolutionary relationship of S15/NS1RNA binding domains with a similar protein domain pattern - A computational approach. INFORMATICS IN MEDICINE UNLOCKED 2021. [DOI: 10.1016/j.imu.2021.100611] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022] Open

Sharma N, Patiyal S, Dhall A, Pande A, Arora C, Raghava GPS. AlgPred 2.0: an improved method for predicting allergenic proteins and mapping of IgE epitopes. Brief Bioinform 2020;22:5985292. [PMID: 33201237 DOI: 10.1093/bib/bbaa294] [Citation(s) in RCA: 152] [Impact Index Per Article: 30.4] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/21/2020] [Revised: 10/02/2020] [Accepted: 10/05/2020] [Indexed: 12/22/2022] Open

Heffron J, Mayer BK. Improved Virus Isoelectric Point Estimation by Exclusion of Known and Predicted Genome-Binding Regions. Appl Environ Microbiol 2020;86:e01674-20. [PMID: 32978129 PMCID: PMC7657617 DOI: 10.1128/aem.01674-20] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/11/2020] [Accepted: 09/18/2020] [Indexed: 01/16/2023] Open

Zhang J, Chen Q, Liu B. iDRBP_MMC: Identifying DNA-Binding Proteins and RNA-Binding Proteins Based on Multi-Label Learning Model and Motif-Based Convolutional Neural Network. J Mol Biol 2020;432:5860-5875. [DOI: 10.1016/j.jmb.2020.09.008] [Citation(s) in RCA: 46] [Impact Index Per Article: 9.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/04/2020] [Revised: 08/12/2020] [Accepted: 09/04/2020] [Indexed: 11/28/2022]

Chen YM, Zu XP, Li D. Identification of Proteins of Tobacco Mosaic Virus by Using a Method of Feature Extraction. Front Genet 2020;11:569100. [PMID: 33193664 PMCID: PMC7581905 DOI: 10.3389/fgene.2020.569100] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/03/2020] [Accepted: 09/09/2020] [Indexed: 12/03/2022] Open

Zhao Y, Du X. econvRBP: Improved ensemble convolutional neural networks for RNA binding protein prediction directly from sequence. Methods 2020;181-182:15-23. [PMID: 31513916 DOI: 10.1016/j.ymeth.2019.09.008] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/21/2019] [Revised: 08/21/2019] [Accepted: 09/05/2019] [Indexed: 10/26/2022] Open

Abstract

RNA binding proteins (RBPs) govern RNA processing from synthesis to decay and play a key role in RNA transport, translation and degradation. Therefore, exploring RBPs' function from the amino acid sequence using computational methods has become one of the important topics in genome annotation. However, some challenges remain: (1) Shallow features: although it is self-evident that sequence determines structure, it is difficult to derive the essential features from the raw sequence alone. (2) Poor understanding: feature-based prediction methods mainly emphasize feature extraction, while limited in-depth understanding of proteins restricts the application of feature engineering. (3) Feature fusion: multi-feature fusion is often used, but the features are not well integrated. In view of these challenges, we propose a novel ensemble convolutional neural network (econvRBP) to predict RBPs. To capture the local and global features of RNA binding proteins simultaneously, One Hot and Conjoint Triad encoding methods are first used to transform the amino acid sequence into local and global features, respectively. The local and global features are then combined for further high-level feature extraction using convolutional neural networks. Experiments with 10-fold cross validation were conducted to evaluate our method, and the results show that it has achieved the best performance among all the predictors so far. We correctly predicted 99% of 2875 RBPs and 99% of 6782 non-RBPs with accuracy of 0.99. In addition, the datasets provided by RBPPred are also used to validate our models with an accuracy of 0.87. These results indicate that econvRBP is the best-performing method at present and will provide reliable guidance for the detection of RBPs. econvRBP is available at http://47.100.203.218:3389/home.html/.
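
The two encodings named in the abstract above can be sketched as follows. This is an illustrative sketch only: the abstract does not give econvRBP's implementation details, so the seven-class residue grouping used here (a grouping commonly used for conjoint triad features), the 343-dimensional normalized frequency vector, the handling of non-standard residues, and the names `one_hot` and `conjoint_triad` are all assumptions.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumed seven-class grouping of residues; the grouping actually used by
# econvRBP is not stated in the abstract.
GROUPS = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
CLASS_OF = {aa: i for i, group in enumerate(GROUPS) for aa in group}


def one_hot(seq):
    """Encode each residue as a 20-dimensional indicator vector (local view).

    Non-standard residues become all-zero rows.
    """
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    rows = []
    for aa in seq:
        row = [0] * len(AMINO_ACIDS)
        if aa in index:
            row[index[aa]] = 1
        rows.append(row)
    return rows


def conjoint_triad(seq):
    """Return a 343-dim vector of triad frequencies over 7 classes (global view).

    Non-standard residues are dropped before triads are formed, which is a
    simplification made for this sketch.
    """
    classes = [CLASS_OF[aa] for aa in seq if aa in CLASS_OF]
    counts = [0] * (7 ** 3)
    for a, b, c in zip(classes, classes[1:], classes[2:]):
        counts[a * 49 + b * 7 + c] += 1
    total = sum(counts)
    return [n / total for n in counts] if total else [0.0] * len(counts)


if __name__ == "__main__":
    seq = "MKRDEAGV"  # made-up example sequence
    print(len(one_hot(seq)), len(one_hot(seq)[0]))
    print(len(conjoint_triad(seq)))
```

The one-hot matrix grows with sequence length and preserves residue order, whereas the conjoint triad vector has a fixed length regardless of sequence length, which is one reason the two are described as local and global views.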

Kaur D, Arora C, Raghava GPS. A Hybrid Model for Predicting Pattern Recognition Receptors Using Evolutionary Information. Front Immunol 2020;11:71. [PMID: 32082326 PMCID: PMC7002473 DOI: 10.3389/fimmu.2020.00071] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/18/2019] [Accepted: 01/13/2020] [Indexed: 12/17/2022] Open

Touati R, Messaoudi I, Oueslati AE, Lachiri Z, Kharrat M. Classification of intra-genomic helitrons based on features extracted from different orders of FCGS. INFORMATICS IN MEDICINE UNLOCKED 2020. [DOI: 10.1016/j.imu.2019.100271] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022] Open

Zhang Y, Xie R, Wang J, Leier A, Marquez-Lago TT, Akutsu T, Webb GI, Chou KC, Song J. Computational analysis and prediction of lysine malonylation sites by exploiting informative features in an integrative machine-learning framework. Brief Bioinform 2019;20:2185-2199. [PMID: 30351377 PMCID: PMC6954445 DOI: 10.1093/bib/bby079] [Citation(s) in RCA: 67] [Impact Index Per Article: 11.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/29/2018] [Revised: 07/28/2018] [Accepted: 08/01/2018] [Indexed: 11/15/2022] Open

Abstract

As a newly discovered post-translational modification (PTM), lysine malonylation (Kmal) regulates a myriad of cellular processes from prokaryotes to eukaryotes and has important implications in human diseases. Despite its functional significance, computational methods to accurately identify malonylation sites are still lacking and urgently needed. In particular, there is currently no comprehensive analysis and assessment of different features and machine learning (ML) methods that are required for constructing the necessary prediction models. Here, we review, analyze and compare 11 different feature encoding methods, with the goal of extracting key patterns and characteristics from residue sequences of Kmal sites. We identify optimized feature sets, with which four commonly used ML methods (random forest, support vector machines, K-nearest neighbor and logistic regression) and one recently proposed method [Light Gradient Boosting Machine (LightGBM)] are trained on data from three species, namely, Escherichia coli, Mus musculus and Homo sapiens, and compared using randomized 10-fold cross-validation tests. We show that integration of the single method-based models through ensemble learning further improves the prediction performance and model robustness on the independent test. When compared to the existing state-of-the-art predictor, MaloPred, the optimal ensemble models were more accurate for all three species (AUC: 0.930, 0.923 and 0.944 for E. coli, M. musculus and H. sapiens, respectively). Using the ensemble models, we developed an accessible online predictor, kmal-sp, available at http://kmalsp.erc.monash.edu/. We hope that this comprehensive survey and the proposed strategy for building more accurate models can serve as a useful guide for inspiring future developments of computational methods for PTM site prediction, expedite the discovery of new malonylation and other PTM types and facilitate hypothesis-driven experimental validation of novel malonylated substrates and malonylation sites.

Bressin A, Schulte-Sasse R, Figini D, Urdaneta EC, Beckmann BM, Marsico A. TriPepSVM: de novo prediction of RNA-binding proteins based on short amino acid motifs. Nucleic Acids Res 2019;47:4406-4417. [PMID: 30923827 PMCID: PMC6511874 DOI: 10.1093/nar/gkz203] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2018] [Revised: 02/20/2019] [Accepted: 03/18/2019] [Indexed: 12/26/2022] Open

Sagar A, Xue B. Recent Advances in Machine Learning Based Prediction of RNA-protein Interactions. Protein Pept Lett 2019;26:601-619. [PMID: 31215361 DOI: 10.2174/0929866526666190619103853] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/16/2018] [Revised: 04/04/2019] [Accepted: 06/01/2019] [Indexed: 12/18/2022]

Faustino AF, Martins AS, Karguth N, Artilheiro V, Enguita FJ, Ricardo JC, Santos NC, Martins IC. Structural and Functional Properties of the Capsid Protein of Dengue and Related Flavivirus. Int J Mol Sci 2019;20:E3870. [PMID: 31398956 PMCID: PMC6720645 DOI: 10.3390/ijms20163870] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/21/2019] [Revised: 08/05/2019] [Accepted: 08/06/2019] [Indexed: 02/07/2023] Open

Chauhan S, Ahmad S. Enabling full‐length evolutionary profiles based deep convolutional neural network for predicting DNA‐binding proteins from sequence. Proteins 2019;88:15-30. [DOI: 10.1002/prot.25763] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/10/2019] [Revised: 06/01/2019] [Accepted: 06/15/2019] [Indexed: 12/22/2022]