Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For:	Chou KC, Zhang CT. Prediction of protein structural classes. Crit Rev Biochem Mol Biol 1995;30:275-349. [PMID: 7587280 DOI: 10.3109/10409239509083488] [Citation(s) in RCA: 910] [Impact Index Per Article: 31.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/26/2023]

Cited by Other Article(s)

Abbass J, Parisi C. Machine learning-based prediction of proteins' architecture using sequences of amino acids and structural alphabets. J Biomol Struct Dyn 2024:1-16. [PMID: 38505995 DOI: 10.1080/07391102.2024.2328736] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/28/2023] [Accepted: 03/05/2024] [Indexed: 03/21/2024]

Chen L, Qu R, Liu X. Improved multi-label classifiers for predicting protein subcellular localization. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2024;21:214-236. [PMID: 38303420 DOI: 10.3934/mbe.2024010] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/03/2024]

Esmaili F, Pourmirzaei M, Ramazi S, Shojaeilangari S, Yavari E. A Review of Machine Learning and Algorithmic Methods for Protein Phosphorylation Site Prediction. GENOMICS, PROTEOMICS & BIOINFORMATICS 2023;21:1266-1285. [PMID: 37863385 PMCID: PMC11082408 DOI: 10.1016/j.gpb.2023.03.007] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/16/2022] [Revised: 01/16/2023] [Accepted: 03/23/2023] [Indexed: 10/22/2023]

Information entropy-based differential evolution with extremely randomized trees and LightGBM for protein structural class prediction. Appl Soft Comput 2023. [DOI: 10.1016/j.asoc.2023.110064] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/05/2023]

Shi Y, Bao XY. QSPR Modeling for the Prediction of the Triplet Yield of Singlet Fission Materials. JOURNAL OF SAUDI CHEMICAL SOCIETY 2023. [DOI: 10.1016/j.jscs.2023.101614] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/05/2023]

Chandra A, Tünnermann L, Löfstedt T, Gratz R. Transformer-based deep learning for predicting protein properties in the life sciences. eLife 2023;12:82819. [PMID: 36651724 PMCID: PMC9848389 DOI: 10.7554/elife.82819] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/22/2022] [Accepted: 01/06/2023] [Indexed: 01/19/2023] Open

The Role of Transmembrane Proteins in Plant Growth, Development, and Stress Responses. Int J Mol Sci 2022;23:ijms232113627. [PMID: 36362412 PMCID: PMC9655316 DOI: 10.3390/ijms232113627] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/21/2022] [Revised: 11/02/2022] [Accepted: 11/04/2022] [Indexed: 11/09/2022] Open

Zhu L, Wang X, Li F, Song J. PreAcrs: a machine learning framework for identifying anti-CRISPR proteins. BMC Bioinformatics 2022;23:444. [PMID: 36284264 PMCID: PMC9597991 DOI: 10.1186/s12859-022-04986-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/26/2022] [Accepted: 10/14/2022] [Indexed: 11/10/2022] Open

Cao Y, Yang ZQ, Zhang XL, Fan W, Wang Y, Shen J, Wei DQ, Li Q, Wei XY. Identifying the kind behind SMILES-anatomical therapeutic chemical classification using structure-only representations. Brief Bioinform 2022;23:6677124. [PMID: 36027578 DOI: 10.1093/bib/bbac346] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/06/2022] [Revised: 07/11/2022] [Accepted: 07/26/2022] [Indexed: 01/25/2023] Open

Ning Q, Li J. DLF-Sul: a multi-module deep learning framework for prediction of S-sulfinylation sites in proteins. Brief Bioinform 2022;23:6658856. [PMID: 35945138 DOI: 10.1093/bib/bbac323] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/11/2022] [Revised: 07/16/2022] [Accepted: 07/18/2022] [Indexed: 11/14/2022] Open

Ren ZH, Yu CQ, Li LP, You ZH, Pan J, Guan YJ, Guo LX. BioChemDDI: Predicting Drug-Drug Interactions by Fusing Biochemical and Structural Information through a Self-Attention Mechanism. BIOLOGY 2022;11:biology11050758. [PMID: 35625486 PMCID: PMC9138786 DOI: 10.3390/biology11050758] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 04/12/2022] [Revised: 05/12/2022] [Accepted: 05/13/2022] [Indexed: 01/13/2023]

Abstract

Simple Summary

Throughout history, combining drugs has been a common method in the fight against complex diseases. However, potential drug–drug interactions could give rise to unknown toxicity issues, which requires the urgent proposal of efficient methods to identify potential interactions. We use computer technology and machine learning techniques to propose a novel computational framework to calculate scores of drug–drug interaction probability for simplifying the screening process. Additionally, we built an online prescreening tool for biological researchers to further verify possible interactions in the fields of biomedicine and pharmacology. Overall, our study can provide new insights and approaches for rapidly identifying potential drug–drug interactions.

Abstract

During the development of drug and clinical applications, due to the co-administration of different drugs that have a high risk of interfering with each other’s mechanisms of action, correctly identifying potential drug–drug interactions (DDIs) is important to avoid a reduction in drug therapeutic activities and serious injuries to the organism. Therefore, to explore potential DDIs, we develop a computational method of integrating multi-level information. Firstly, the information of chemical sequence is fully captured by the Natural Language Processing (NLP) algorithm, and multiple biological function similarity information is fused by Similarity Network Fusion (SNF). Secondly, we extract deep network structure information through Hierarchical Representation Learning for Networks (HARP). Then, a highly representative comprehensive feature descriptor is constructed through the self-attention module that efficiently integrates biochemical and network features. Finally, a deep neural network (DNN) is employed to generate the prediction results. Contrasted with the previous supervision model, BioChemDDI innovatively introduced graph collapse for extracting a network structure and utilized the biochemical information during the pre-training process. The prediction results of the benchmark dataset indicate that BioChemDDI outperforms other existing models. Moreover, the case studies related to three cancer diseases, including breast cancer, hepatocellular carcinoma and malignancies, were analyzed using BioChemDDI. As a result, 24, 18 and 20 out of the top 30 predicted cancer-related drugs were confirmed by the databases. These experimental results demonstrate that BioChemDDI is a useful model to predict DDIs and can provide reliable candidates for biological experiments. The web server of BioChemDDI predictor is freely available to conduct further studies.

Ali SD, Alam W, Tayara H, Chong KT. Identification of Functional piRNAs Using a Convolutional Neural Network. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022;19:1661-1669. [PMID: 33119510 DOI: 10.1109/tcbb.2020.3034313] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]

Qiao H, Zhang S, Xue T, Wang J, Wang B. iPro-GAN: A novel model based on generative adversarial learning for identifying promoters and their strength. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2022;215:106625. [PMID: 35038653 DOI: 10.1016/j.cmpb.2022.106625] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/11/2021] [Revised: 12/13/2021] [Accepted: 01/06/2022] [Indexed: 06/14/2023]

Zhang Z, Wang L. Using Chou's 5-steps rule to identify N⁶-methyladenine sites by ensemble learning combined with multiple feature extraction methods. J Biomol Struct Dyn 2022;40:796-806. [PMID: 32948102 DOI: 10.1080/07391102.2020.1821778] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/23/2022]

Alzahrani E, Alghamdi W, Ullah MZ, Khan YD. Identification of stress response proteins through fusion of machine learning models and statistical paradigms. Sci Rep 2021;11:21767. [PMID: 34741132 PMCID: PMC8571424 DOI: 10.1038/s41598-021-99083-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/18/2021] [Accepted: 09/13/2021] [Indexed: 11/08/2022] Open

Akmal MA, Hussain W, Rasool N, Khan YD, Khan SA, Chou KC. Using CHOU'S 5-Steps Rule to Predict O-Linked Serine Glycosylation Sites by Blending Position Relative Features and Statistical Moment. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021;18:2045-2056. [PMID: 31985438 DOI: 10.1109/tcbb.2020.2968441] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]

Liang Y, Zhang S, Qiao H, Yao Y. iPromoter-ET: Identifying promoters and their strength by extremely randomized trees-based feature selection. Anal Biochem 2021;630:114335. [PMID: 34389299 DOI: 10.1016/j.ab.2021.114335] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/17/2021] [Revised: 07/24/2021] [Accepted: 08/09/2021] [Indexed: 10/20/2022]

Li Y, Pu F, Wang J, Zhou Z, Zhang C, He F, Ma Z, Zhang J. Machine Learning Methods in Prediction of Protein Palmitoylation Sites: A Brief Review. Curr Pharm Des 2021;27:2189-2198. [PMID: 33183190 DOI: 10.2174/1381612826666201112142826] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/13/2020] [Accepted: 07/27/2020] [Indexed: 11/22/2022]

Liang X, Li F, Chen J, Li J, Wu H, Li S, Song J, Liu Q. Large-scale comparative review and assessment of computational methods for anti-cancer peptide identification. Brief Bioinform 2021;22:bbaa312. [PMID: 33316035 PMCID: PMC8294543 DOI: 10.1093/bib/bbaa312] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/18/2020] [Revised: 09/30/2020] [Accepted: 08/25/2020] [Indexed: 12/13/2022] Open

Abstract

Anti-cancer peptides (ACPs) are known as potential therapeutics for cancer. Due to their unique ability to target cancer cells without affecting healthy cells directly, they have been extensively studied. Many peptide-based drugs are currently evaluated in the preclinical and clinical trials. Accurate identification of ACPs has received considerable attention in recent years; as such, a number of machine learning-based methods for in silico identification of ACPs have been developed. These methods promote the research on the mechanism of ACPs therapeutics against cancer to some extent. There is a vast difference in these methods in terms of their training/testing datasets, machine learning algorithms, feature encoding schemes, feature selection methods and evaluation strategies used. Therefore, it is desirable to summarize the advantages and disadvantages of the existing methods, provide useful insights and suggestions for the development and improvement of novel computational tools to characterize and identify ACPs. With this in mind, we firstly comprehensively investigate 16 state-of-the-art predictors for ACPs in terms of their core algorithms, feature encoding schemes, performance evaluation metrics and webserver/software usability. Then, comprehensive performance assessment is conducted to evaluate the robustness and scalability of the existing predictors using a well-prepared benchmark dataset. We provide potential strategies for the model performance improvement. Moreover, we propose a novel ensemble learning framework, termed ACPredStackL, for the accurate identification of ACPs. ACPredStackL is developed based on the stacking ensemble strategy combined with SVM, Naïve Bayesian, lightGBM and KNN. Empirical benchmarking experiments against the state-of-the-art methods demonstrate that ACPredStackL achieves a comparative performance for predicting ACPs. 
The webserver and source code of ACPredStackL are freely available at http://bigdata.biocie.cn/ACPredStackL/ and https://github.com/liangxiaoq/ACPredStackL, respectively.

Feng P, Feng L, Tang C. Comparison and Analysis of Computational Methods for Identifying N6-Methyladenosine Sites in Saccharomyces cerevisiae. Curr Pharm Des 2021;27:1219-1229. [PMID: 33167827 DOI: 10.2174/1381612826666201109110703] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/09/2020] [Accepted: 07/20/2020] [Indexed: 11/22/2022]

Support vector regression-based QSAR models for prediction of antioxidant activity of phenolic compounds. Sci Rep 2021;11:8806. [PMID: 33888843 PMCID: PMC8062522 DOI: 10.1038/s41598-021-88341-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/02/2021] [Accepted: 04/12/2021] [Indexed: 12/15/2022] Open

i6mA-VC: A Multi-Classifier Voting Method for the Computational Identification of DNA N6-methyladenine Sites. Interdiscip Sci 2021;13:413-425. [PMID: 33834381 DOI: 10.1007/s12539-021-00429-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/19/2020] [Revised: 03/26/2021] [Accepted: 03/29/2021] [Indexed: 12/14/2022]

Abstract

DNA N6-methyladenine (6mA), as an essential component of epigenetic modification, cannot be neglected in the genetic regulation mechanism. The efficient and accurate prediction of 6mA sites is beneficial to the development of biological genetics. Biochemical experimental methods are considered to be time-consuming and laborious. Most of the established machine learning methods have a single dataset. Although some of them have achieved cross-species prediction, their results are not satisfactory. Therefore, we designed a novel statistical model called i6mA-VC to improve the prediction accuracy for 6mA sites. On the one hand, kmer and binary encoding are applied to extract features, and then the gradient boosting decision tree (GBDT) embedded method is applied as the feature selection strategy. On the other hand, DNA sequences are represented by vectors through the feature extraction method of ring-function-hydrogen-chemical properties (RFHCP) and the feature selection strategy of ExtraTree. After fusing the two optimal features, a voting classifier based on gradient boosting decision tree (GBDT), light gradient boosting machine (LightGBM) and multilayer perceptron classifier (MLPC) is constructed for final classification and prediction. The accuracies on the rice dataset and the M. musculus dataset with five-fold cross-validation are 0.888 and 0.967, respectively. The cross-species dataset is selected as the independent testing dataset, and the accuracy reaches 0.848. Through rigorous experiments, it is demonstrated that the proposed predictor is convincing and applicable. The development of the i6mA-VC predictor will become an effective way for the recognition of N6-methyladenine sites, and it will also be beneficial for biological geneticists to further study gene expression and DNA modification. In addition, an accessible web-server for i6mA-VC is available from http://www.zhanglab.site/.

Zhang ZM, Guan ZX, Wang F, Zhang D, Ding H. Application of Machine Learning Methods in Predicting Nuclear Receptors and their Families. Med Chem 2021;16:594-604. [PMID: 31584374 DOI: 10.2174/1573406415666191004125551] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/20/2019] [Revised: 06/18/2019] [Accepted: 08/23/2019] [Indexed: 11/22/2022]

Abstract

Nuclear receptors (NRs) are a superfamily of ligand-dependent transcription factors that are closely related to cell development, differentiation, reproduction, homeostasis, and metabolism. According to the alignments of the conserved domains, NRs are classified and assigned the following seven subfamilies or eight subfamilies: (1) NR1: thyroid hormone like (thyroid hormone, retinoic acid, RAR-related orphan receptor, peroxisome proliferator activated, vitamin D3-like), (2) NR2: HNF4-like (hepatocyte nuclear factor 4, retinoic acid X, tailless-like, COUP-TF-like, USP), (3) NR3: estrogen-like (estrogen, estrogen-related, glucocorticoid-like), (4) NR4: nerve growth factor IB-like (NGFI-B-like), (5) NR5: fushi tarazu-F1 like (fushi tarazu-F1 like), (6) NR6: germ cell nuclear factor like (germ cell nuclear factor), and (7) NR0: knirps like (knirps, knirps-related, embryonic gonad protein, ODR7, trithorax) and DAX like (DAX, SHP), or dividing NR0 into (7) NR7: knirps like and (8) NR8: DAX like. Different NR families have different structural features and functions. Since the function of an NR is closely correlated with which subfamily it belongs to, it is highly desirable to identify NRs and their subfamilies rapidly and effectively. The knowledge acquired is essential for a proper understanding of normal and abnormal cellular mechanisms. With the advent of the post-genomics era, the number of sequence-known proteins has increased explosively. Conventional methods for accurately classifying the family of NRs are experimental means with high cost and low efficiency. This has created a greater need for bioinformatics tools to effectively recognize NRs and their subfamilies for the purpose of understanding their biological function. In this review, we summarized the application of machine learning methods in the prediction of NRs from different aspects.
We hope that this review will provide a reference for further research on the classification of NRs and their families.

Recent Advances in the Prediction of Protein Structural Classes: Feature Descriptors and Machine Learning Algorithms. CRYSTALS 2021. [DOI: 10.3390/cryst11040324] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/12/2023]

Abdennaji I, Zaied M, Girault JM. Prediction of protein structural class based on symmetrical recurrence quantification analysis. Comput Biol Chem 2021;92:107450. [PMID: 33631460 DOI: 10.1016/j.compbiolchem.2021.107450] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/08/2020] [Accepted: 02/03/2021] [Indexed: 11/19/2022]

Peng X, Chen L, Zhou JP. Identification of Carcinogenic Chemicals with Network Embedding and Deep Learning Methods. Curr Bioinform 2021. [DOI: 10.2174/1574893615999200414084317] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/21/2023]

Khan YD, Alzahrani E, Alghamdi W, Ullah MZ. Sequence-based Identification of Allergen Proteins Developed by Integration of PseAAC and Statistical Moments via 5-Step Rule. Curr Bioinform 2021. [DOI: 10.2174/1574893615999200424085947] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/24/2023]

Wang H, Xi Q, Liang P, Zheng L, Hong Y, Zuo Y. IHEC_RAAC: a online platform for identifying human enzyme classes via reduced amino acid cluster strategy. Amino Acids 2021;53:239-251. [PMID: 33486591 DOI: 10.1007/s00726-021-02941-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/17/2020] [Accepted: 01/11/2021] [Indexed: 12/18/2022]

RAM-PGK: Prediction of Lysine Phosphoglycerylation Based on Residue Adjacency Matrix. Genes (Basel) 2020;11:genes11121524. [PMID: 33419274 PMCID: PMC7766696 DOI: 10.3390/genes11121524] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/30/2020] [Revised: 12/15/2020] [Accepted: 12/16/2020] [Indexed: 11/29/2022] Open

Abstract

Background: Post-translational modification (PTM) is a biological process that is associated with the modification of the proteome, which results in the alteration of normal cell biology and pathogenesis. There have been numerous PTM reports in recent years, out of which lysine phosphoglycerylation has emerged as one of the recent developments. The traditional methods of identifying phosphoglycerylated residues, which are experimental procedures such as mass spectrometry, have been shown to be time-consuming and cost-inefficient, despite the abundance of proteins being sequenced in this post-genomic era. Due to these drawbacks, computational techniques are being sought to establish an effective identification system for phosphoglycerylated lysine residues. The development of a predictor for phosphoglycerylation prediction is not a first, but it is necessary as the latest predictor falls short in adequately detecting phosphoglycerylated and non-phosphoglycerylated lysine residues. Results: In this work, we introduce a new predictor named RAM-PGK, which uses sequence-based information relating to amino acid residues to predict phosphoglycerylated and non-phosphoglycerylated sites. A benchmark dataset was employed for this purpose, which contained experimentally identified phosphoglycerylated and non-phosphoglycerylated lysine residues. From the dataset, we extracted the residue adjacency matrix pertaining to each lysine residue in the protein sequences and converted them into feature vectors, which are used to build the phosphoglycerylation predictor. Conclusion: RAM-PGK, which is based on sequential features and support vector machine classifiers, has shown a noteworthy improvement in terms of performance in comparison to some of the recent prediction methods. The performance metrics of the RAM-PGK predictor are: 0.5741 sensitivity, 0.6436 specificity, 0.0531 precision, 0.6414 accuracy, and 0.0824 Matthews correlation coefficient.
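
The five metrics quoted at the end of this abstract are all functions of a binary confusion matrix. As a minimal sketch of their standard definitions (the function name `binary_metrics` and the example counts are hypothetical and are not taken from the RAM-PGK paper):

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute the five metrics from binary confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives
    and false negatives. Assumes each class has at least one sample.
    """
    sensitivity = tp / (tp + fn)           # recall on the positive class
    specificity = tn / (tn + fp)           # recall on the negative class
    precision = tp / (tp + fp)             # fraction of predicted positives that are correct
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # Matthews correlation coefficient; defined as 0 when the denominator vanishes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts, chosen only to exercise the function.
m = binary_metrics(tp=8, fp=2, tn=85, fn=5)
print(m)
```

Because precision depends on how rare the positive class is while sensitivity and specificity do not, a low precision alongside moderate sensitivity and specificity usually points to a strongly imbalanced test set.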

Wu C, Li Q, Xing R, Fan GL. Using the Chou’s Pseudo Component to Predict the ncRNA Locations Based on the Improved K-Nearest Neighbor (iKNN) Classifier. Curr Bioinform 2020. [DOI: 10.2174/1574893614666191003142406] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]

Yang XF, Zhou YK, Zhang L, Gao Y, Du PF. Predicting LncRNA Subcellular Localization Using Unbalanced Pseudo-k Nucleotide Compositions. Curr Bioinform 2020. [DOI: 10.2174/1574893614666190902151038] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/17/2022]

Zhang S, Qiao H. KD-KLNMF: Identification of lncRNAs subcellular localization with multiple features and nonnegative matrix factorization. Anal Biochem 2020;610:113995. [PMID: 33080214 DOI: 10.1016/j.ab.2020.113995] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/28/2020] [Revised: 09/07/2020] [Accepted: 10/12/2020] [Indexed: 12/18/2022]

Amanat S, Ashraf A, Hussain W, Rasool N, Khan YD. Identification of Lysine Carboxylation Sites in Proteins by Integrating Statistical Moments and Position Relative Features via General PseAAC. Curr Bioinform 2020. [DOI: 10.2174/1574893614666190723114923] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/13/2023]

Shah AA, Khan YD. Identification of 4-carboxyglutamate residue sites based on position based statistical feature and multiple classification. Sci Rep 2020;10:16913. [PMID: 33037248 PMCID: PMC7547663 DOI: 10.1038/s41598-020-73107-y] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/08/2020] [Accepted: 08/20/2020] [Indexed: 11/08/2022] Open

Identification of Latent Oncogenes with a Network Embedding Method and Random Forest. BIOMED RESEARCH INTERNATIONAL 2020;2020:5160396. [PMID: 33029511 PMCID: PMC7530476 DOI: 10.1155/2020/5160396] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/20/2020] [Revised: 09/09/2020] [Accepted: 09/14/2020] [Indexed: 12/29/2022]

Feng YH, Zhang SW, Shi JY. DPDDI: a deep predictor for drug-drug interactions. BMC Bioinformatics 2020;21:419. [PMID: 32972364 PMCID: PMC7513481 DOI: 10.1186/s12859-020-03724-x] [Citation(s) in RCA: 61] [Impact Index Per Article: 15.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/04/2020] [Accepted: 08/31/2020] [Indexed: 12/21/2022] Open

Abstract

Background

The treatment of complex diseases by taking multiple drugs is becoming increasingly popular. However, drug-drug interactions (DDIs) may give rise to the risk of unanticipated adverse effects and even unknown toxicity. DDI detection in the wet lab is expensive and time-consuming. Thus, it is highly desirable to develop computational methods for predicting DDIs. Generally, most of the existing computational methods predict DDIs by extracting the chemical and biological features of drugs from diverse drug-related properties; however, some drug properties are costly to obtain and are not available in many cases.

Results

In this work, we presented a novel method (namely DPDDI) to predict DDIs by extracting the network structure features of drugs from the DDI network with a graph convolution network (GCN), and using a deep neural network (DNN) model as a predictor. The GCN learns the low-dimensional feature representations of drugs by capturing the topological relationship of drugs in the DDI network. The DNN predictor concatenates the latent feature vectors of any two drugs as the feature vector of the corresponding drug pair to train a DNN for predicting the potential drug-drug interactions. Experimental results show that the newly proposed DPDDI method outperforms four other state-of-the-art methods; the GCN-derived latent features include more DDI information than other features derived from chemical, biological or anatomical properties of drugs; and the concatenation feature aggregation operator is better than two other feature aggregation operators (i.e., inner product and summation). The results in case studies confirm that DPDDI achieves reasonable performance in predicting new DDIs.

Conclusion

We proposed an effective and robust method DPDDI to predict the potential DDIs by utilizing the DDI network information without considering the drug properties (i.e., drug chemical and biological properties). The method should also be useful in other DDI-related scenarios, such as the detection of unexpected side effects, and the guidance of drug combination.
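
The Results section above compares three ways of turning the latent vectors of two drugs into one drug-pair feature: concatenation, inner product and summation. A minimal sketch of the three operators on plain Python lists follows; the function names and example vectors are hypothetical, and the inner product is read here as the scalar dot product, which may differ from the exact definition used in the paper.

```python
def concatenate(u, v):
    """Join the two drug vectors end to end (output length = len(u) + len(v))."""
    return list(u) + list(v)


def summation(u, v):
    """Add the two drug vectors element by element (output length = len(u))."""
    return [a + b for a, b in zip(u, v)]


def inner_product(u, v):
    """Collapse the pair to a single scalar via the dot product."""
    return sum(a * b for a, b in zip(u, v))


# Hypothetical 2-dimensional latent vectors for two drugs.
drug_a = [1.0, 2.0]
drug_b = [3.0, 4.0]

pair_concat = concatenate(drug_a, drug_b)   # keeps both vectors intact
pair_sum = summation(drug_a, drug_b)        # symmetric, loses which drug contributed what
pair_dot = inner_product(drug_a, drug_b)    # symmetric, reduces the pair to one number
print(pair_concat, pair_sum, pair_dot)
```

Only concatenation preserves every coordinate of both inputs, which is one plausible reason it could carry more information into the downstream DNN, although the abstract itself does not state the cause.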

Xu L, Liang G, Chen B, Tan X, Xiang H, Liao C. A Computational Method for the Identification of Endolysins and Autolysins. Protein Pept Lett 2020;27:329-336. [PMID: 31577192 DOI: 10.2174/0929866526666191002104735] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/10/2019] [Revised: 06/27/2019] [Accepted: 09/03/2019] [Indexed: 12/21/2022]

Abstract

BACKGROUND

Cell lytic enzyme is a kind of highly evolved protein, which can destroy the cell structure and kill the bacteria. Compared with antibiotics, cell lytic enzyme will not cause a serious problem of drug resistance in pathogenic bacteria. Thus, the study of cell wall lytic enzymes aims at finding an efficient way of curing bacterial infections. Compared with using antibiotics, the problem of drug resistance becomes more serious. Therefore, using cell lytic enzymes is a good choice for curing bacterial infections. Cell lytic enzyme includes endolysin and autolysin, and the difference between them is the purpose of the break of the cell wall. The identification of the type of cell lytic enzymes is meaningful for the study of cell wall enzymes.

OBJECTIVE

In this article, our motivation is to predict the type of cell lytic enzyme. Cell lytic enzyme is helpful for killing bacteria, so it is meaningful to study the type of cell lytic enzyme. However, it is time-consuming to detect the type of cell lytic enzyme by experimental methods. Thus, an efficient computational method for predicting the type of cell lytic enzyme is proposed in our work.

METHODS

We propose a computational method for the prediction of endolysin and autolysin. First, a data set containing 27 endolysins and 41 autolysins is built. Then the protein is represented by tripeptides composition. The features are selected with larger confidence degree. At last, the classifier is trained by the labeled vectors based on support vector machine. The learned classifier is used to predict the type of cell lytic enzyme.

RESULTS

With the proposed method, the overall accuracy reaches 97.06% when 44 features are selected. Compared with Ding's method, this is a relative improvement in overall accuracy of nearly 4.5% ((97.06 - 92.9)/92.9). Performance is stable when the number of selected features is between 40 and 70. The overall accuracy of the tripeptide optimal feature set is 94.12%, while that of Chou's amphiphilic PseAAC method is 76.2%, so the tripeptide optimal feature set improves overall accuracy by nearly 18 percentage points.

CONCLUSION

This paper proposes an efficient method for identifying endolysins and autolysins, using a support vector machine to predict the type of cell lytic enzyme. The experimental results show that the overall accuracy of the proposed method is 94.12%, which is better than some existing methods. In conclusion, the 44 selected features improve the overall accuracy of identifying the type of cell lytic enzyme, and the support vector machine performs better than other classifiers when using the selected feature set on the benchmark data set.

Zhou JP, Chen L, Guo ZH. iATC-NRAKEL: an efficient multi-label classifier for recognizing anatomical therapeutic chemical classes of drugs. Bioinformatics 2020;36:1391-1396. [PMID: 31593226 DOI: 10.1093/bioinformatics/btz757] [Citation(s) in RCA: 47] [Impact Index Per Article: 11.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/20/2019] [Revised: 09/10/2019] [Accepted: 10/01/2019] [Indexed: 11/13/2022] Open

Sikka P, Nath A, Paul SS, Andonissamy J, Mishra DC, Rao AR, Balhara AK, Chaturvedi KK, Yadav KK, Balhara S. Inferring Relationship of Blood Metabolic Changes and Average Daily Gain With Feed Conversion Efficiency in Murrah Heifers: Machine Learning Approach. Front Vet Sci 2020;7:518. [PMID: 32984408 PMCID: PMC7492607 DOI: 10.3389/fvets.2020.00518] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/26/2020] [Accepted: 07/06/2020] [Indexed: 11/13/2022] Open

Bi Y, Xiang D, Ge Z, Li F, Jia C, Song J. An Interpretable Prediction Model for Identifying N⁷-Methylguanosine Sites Based on XGBoost and SHAP. MOLECULAR THERAPY. NUCLEIC ACIDS 2020;22:362-372. [PMID: 33230441 PMCID: PMC7533297 DOI: 10.1016/j.omtn.2020.08.022] [Citation(s) in RCA: 58] [Impact Index Per Article: 14.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/14/2020] [Accepted: 08/20/2020] [Indexed: 12/19/2022]

Use Chou’s 5-steps rule to identify DNase I hypersensitive sites via dinucleotide property matrix and extreme gradient boosting. Mol Genet Genomics 2020;295:1431-1442. [DOI: 10.1007/s00438-020-01711-8] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/14/2020] [Accepted: 07/11/2020] [Indexed: 01/08/2023]

Chou KC. An Insightful 10-year Recollection Since the Emergence of the 5-steps Rule. Curr Pharm Des 2020;25:4223-4234. [PMID: 31782354 DOI: 10.2174/1381612825666191129164042] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/18/2019] [Accepted: 11/25/2019] [Indexed: 11/22/2022]

Dou L, Li X, Ding H, Xu L, Xiang H. Prediction of m5C Modifications in RNA Sequences by Combining Multiple Sequence Features. MOLECULAR THERAPY. NUCLEIC ACIDS 2020;21:332-342. [PMID: 32645685 PMCID: PMC7340967 DOI: 10.1016/j.omtn.2020.06.004] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 02/18/2020] [Revised: 06/03/2020] [Accepted: 06/04/2020] [Indexed: 12/14/2022]

Progresses in Predicting Post-translational Modification. Int J Pept Res Ther 2020. [DOI: 10.1007/s10989-019-09893-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 09/29/2022]

A Two-Level Computation Model Based on Deep Learning Algorithm for Identification of piRNA and Their Functions via Chou’s 5-Steps Rule. Int J Pept Res Ther 2020. [DOI: 10.1007/s10989-019-09887-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/18/2022]

Yu Y, Wang S, Wang Y, Cao Y, Yu C, Pan Y, Su D, Lu Q, Zuo Y, Yang L. Using Reduced Amino Acid Alphabet and Biological Properties to Analyze and Predict Animal Neurotoxin Protein. Curr Drug Metab 2020;21:810-817. [PMID: 32433000 DOI: 10.2174/1389200221666200520090555] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/25/2019] [Revised: 01/07/2020] [Accepted: 01/15/2020] [Indexed: 11/22/2022]

Feng CQ, Zhang ZY, Zhu XJ, Lin Y, Chen W, Tang H, Lin H. iTerm-PseKNC: a sequence-based tool for predicting bacterial transcriptional terminators. Bioinformatics 2020;35:1469-1477. [PMID: 30247625 DOI: 10.1093/bioinformatics/bty827] [Citation(s) in RCA: 142] [Impact Index Per Article: 35.5] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/12/2018] [Revised: 09/13/2018] [Accepted: 09/20/2018] [Indexed: 12/31/2022] Open

Mahmoudi O, Wahab A, Chong KT. iMethyl-Deep: N6 Methyladenosine Identification of Yeast Genome with Automatic Feature Extraction Technique by Using Deep Learning Algorithm. Genes (Basel) 2020;11:genes11050529. [PMID: 32397453 PMCID: PMC7288457 DOI: 10.3390/genes11050529] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/08/2020] [Revised: 04/30/2020] [Accepted: 05/05/2020] [Indexed: 12/12/2022] Open

QUATgo: Protein quaternary structural attributes predicted by two-stage machine learning approaches with heterogeneous feature encoding. PLoS One 2020;15:e0232087. [PMID: 32348325 PMCID: PMC7190164 DOI: 10.1371/journal.pone.0232087] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/26/2019] [Accepted: 04/07/2020] [Indexed: 11/19/2022] Open

Hussain W, Rasool N, Khan YD. A Sequence-Based Predictor of Zika Virus Proteins Developed by Integration of PseAAC and Statistical Moments. Comb Chem High Throughput Screen 2020;23:797-804. [PMID: 32342804 DOI: 10.2174/1386207323666200428115449] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/16/2019] [Revised: 03/17/2020] [Accepted: 03/19/2020] [Indexed: 12/20/2022]