Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Ding Y, Tang J, Guo F. Identification of Protein-Ligand Binding Sites by Sequence Information and Ensemble Classifier. J Chem Inf Model 2017;57:3149-3161. [PMID: 29125297 DOI: 10.1021/acs.jcim.7b00307] [Citation(s) in RCA: 53] [Impact Index Per Article: 6.6] [Indexed: 11/29/2022]

Cited by Other Article(s)

Liu Z, Qiu WR, Liu Y, Yan H, Pei W, Zhu YH, Qiu J. A comprehensive review of computational methods for Protein-DNA binding site prediction. Anal Biochem 2025;703:115862. [PMID: 40209920 DOI: 10.1016/j.ab.2025.115862] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2024] [Revised: 03/20/2025] [Accepted: 04/06/2025] [Indexed: 04/12/2025]

Wu J, Liu Y, Zhang Y, Wang X, Yan H, Zhu Y, Song J, Yu DJ. Identifying Protein-Nucleotide Binding Residues via Grouped Multi-task Learning and Pre-trained Protein Language Models. J Chem Inf Model 2025;65:1040-1052. [PMID: 39788787 DOI: 10.1021/acs.jcim.4c02092] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/12/2025]

Hao S, Li CY, Hu X, Feng Z, Zhang G, Yang C, Hu H. S-DCNN: prediction of ATP binding residues by deep convolutional neural network based on SMOTE. Front Genet 2025;15:1513201. [PMID: 39834546 PMCID: PMC11744016 DOI: 10.3389/fgene.2024.1513201] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2024] [Accepted: 12/11/2024] [Indexed: 01/22/2025]

Alnuqaydan AM. Riddelline from Tamarix articulate as a potential anti-bacterial lead compound for novel antibiotics discovery: A comprehensive computational and toxicological studies. PLoS One 2024;19:e0310319. [PMID: 39541292 PMCID: PMC11563397 DOI: 10.1371/journal.pone.0310319] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2024] [Accepted: 08/28/2024] [Indexed: 11/16/2024]

Alawam AS, M Alneghery L, Alwethaynani MS, Alamri MA. A hierarchical approach towards identification of novel inhibitors against L, D-transpeptidase YcbB as an anti-bacterial therapeutic target. J Biomol Struct Dyn 2024:1-11. [PMID: 38411016 DOI: 10.1080/07391102.2024.2322619] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2023] [Accepted: 02/16/2024] [Indexed: 02/28/2024]

Abstract

The bacterial cell wall, being a vital component for cell viability, is regarded as a promising drug target. The L, D-Transpeptidase YcbB enzyme has been implicated in cell wall polymer cross-linking during typhoid toxin release, β-lactam resistance and outer membrane defect rescue. These observations have been recorded in different bacterial pathogens such as Salmonella Typhimurium, Citrobacter rodentium, and Salmonella typhi. In this work, we performed structure-based virtual screening of diverse natural and synthetic drug libraries against the enzyme and identified three compounds: LAS_32135590, LAS_34036730 and LAS-51380924. These compounds showed highly stable energies, and the findings are very competitive with the control molecule used (1RG, or (4R,5S)-3-({(3S,5S)-5-[(3-carboxyphenyl)carbamoyl]pyrrolidin-3-yl}sulfanyl)-5-[(1S,2R)-1-formyl-2-hydroxypropyl]-4-methyl-4,5-dihydro-1H-pyrrole-2-carboxylic acid, or ertapenem). Compared to the control (which has a binding energy score of -11.63 kcal/mol), the compounds showed better binding energy. The binding energy scores of LAS_32135590, LAS_34036730 and LAS-51380924 are -12.63 kcal/mol, -12.22 kcal/mol and -12.10 kcal/mol, respectively. Further, the docked snapshots of the lead compounds and the control were investigated for stability in a time-dependent dynamics environment. All three lead complexes and the control system reached equilibrium (mean RMSD < 3 Å) in terms of both the intermolecular docked conformation and the binding interaction network. Further validation of the complexes' stability was obtained from the end-state MMPB/GBSA analysis, which showed a greater contribution from van der Waals forces and electrostatic energy and a smaller contribution from the solvation part. The compounds also showed good drug-likeness and are non-toxic and non-mutagenic. In short, the compounds can be used in experimental testing and might be subjected to structure modification to get better results.

Zhu YH, Liu Z, Liu Y, Ji Z, Yu DJ. ULDNA: integrating unsupervised multi-source language models with LSTM-attention network for high-accuracy protein-DNA binding site prediction. Brief Bioinform 2024;25:bbae040. [PMID: 38349057 PMCID: PMC10939370 DOI: 10.1093/bib/bbae040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/10/2023] [Revised: 01/02/2024] [Accepted: 01/22/2024] [Indexed: 02/15/2024]

Guan S, Zou Q, Wu H, Ding Y. Protein-DNA Binding Residues Prediction Using a Deep Learning Model With Hierarchical Feature Extraction. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023;20:2619-2628. [PMID: 35834447 DOI: 10.1109/tcbb.2022.3190933] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]

Qian Y, Shang T, Guo F, Wang C, Cui Z, Ding Y, Wu H. Identification of DNA-binding protein based multiple kernel model. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2023;20:13149-13170. [PMID: 37501482 DOI: 10.3934/mbe.2023586] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 07/29/2023]

Zhang J, Zhou F, Liang X, Yang G. SCAMPER: Accurate Type-Specific Prediction of Calcium-Binding Residues Using Sequence-Derived Features. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023;20:1406-1416. [PMID: 35536812 DOI: 10.1109/tcbb.2022.3173437] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]

Qian Y, Ding Y, Zou Q, Guo F. Multi-View Kernel Sparse Representation for Identification of Membrane Protein Types. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023;20:1234-1245. [PMID: 35857734 DOI: 10.1109/tcbb.2022.3191325] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Indexed: 05/04/2023]

Hu J, Bai YS, Zheng LL, Jia NX, Yu DJ, Zhang GJ. Protein-DNA Binding Residue Prediction via Bagging Strategy and Sequence-Based Cube-Format Feature. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022;19:3635-3645. [PMID: 34714748 DOI: 10.1109/tcbb.2021.3123828] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]

Abstract

Protein-DNA interactions play an important role in diverse biological processes. Accurately identifying protein-DNA binding residues is a critical but challenging task for protein function annotations and drug design. Although wet-lab experimental methods are the most accurate way to identify protein-DNA binding residues, they are time-consuming and labor-intensive. There is an urgent need to develop computational methods to rapidly and accurately predict protein-DNA binding residues. In this study, we propose a novel sequence-based method, named PredDBR, for predicting DNA-binding residues. In PredDBR, for each query protein, its position-specific frequency matrix (PSFM), predicted secondary structure (PSS), and predicted probabilities of ligand-binding residues (PPLBR) are first generated as three feature sources. Secondly, for each feature source, the sliding window technique is employed to extract the matrix-format feature of each residue. Then, we design two strategies, i.e., square root (SR) and average (AVE), to separately transform PSFM-based and two predicted feature source-based, i.e., PSS-based and PPLBR-based, matrix-format features of each residue into three corresponding cube-format features. Finally, after serially combining the three cube-format features, the ensemble classifier is generated via applying bagging strategy to multiple base classifiers built by the framework of 2D convolutional neural network. The computational experimental results demonstrate that the proposed PredDBR achieves an average overall accuracy of 93.7% and a Matthews correlation coefficient of 0.405 on two independent validation datasets and outperforms several state-of-the-art sequence-based protein-DNA binding residue predictors. The PredDBR web-server is available at https://jun-csbio.github.io/PredDBR/.
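
The sliding window step mentioned in this abstract can be illustrated with a minimal sketch. It is not the PredDBR implementation: the function name, the window size, the toy values, and the zero-padding at the sequence termini are all assumptions made here for illustration, since the abstract does not state how terminal residues are handled.

```python
def sliding_window_features(matrix, window_size):
    """Return one (window_size x n_cols) block per residue.

    `matrix` is a per-residue feature table (one row per residue), e.g. a
    frequency profile. Rows outside the sequence are zero-padded, which is
    an assumption of this sketch rather than a documented PredDBR choice.
    """
    if window_size % 2 == 0:
        raise ValueError("window_size must be odd so the window is centred on a residue")
    half = window_size // 2
    n_cols = len(matrix[0]) if matrix else 0
    pad_row = [0.0] * n_cols
    # Pad both termini so every residue gets a full-size window.
    padded = [pad_row] * half + [list(row) for row in matrix] + [pad_row] * half
    # Window i covers original residues i-half .. i+half.
    return [padded[i:i + window_size] for i in range(len(matrix))]


# Toy profile: 4 residues x 2 feature columns (made-up values).
profile = [[0.1, 0.9], [0.2, 0.8], [0.3, 0.7], [0.4, 0.6]]
windows = sliding_window_features(profile, window_size=3)
```

Each entry of `windows` is the matrix-format feature of one residue; a downstream model would consume these blocks in place of the single-row features.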

Navid A, Ahmad S, Sajjad R, Raza S, Azam SS. Structure Based in Silico Screening Revealed a Potent Acinetobacter Baumannii Ftsz Inhibitor From Asinex Antibacterial Library. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022;19:3008-3018. [PMID: 34375286 DOI: 10.1109/tcbb.2021.3103899] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 06/13/2023]

Xu X, Xuan P, Zhang T, Chen B, Sheng N. Inferring Drug-Target Interactions Based on Random Walk and Convolutional Neural Network. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022;19:2294-2304. [PMID: 33729947 DOI: 10.1109/tcbb.2021.3066813] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/12/2023]

Abstract

Computational strategies for identifying new drug-target interactions (DTIs) can guide the process of drug discovery, reduce the cost and time of drug development, and thus promote drug development. Most recently proposed methods predict DTIs via integration of heterogeneous data related to drugs and proteins. However, previous methods have failed to deeply integrate these heterogeneous data and learn deep feature representations of multiple original similarities and interactions related to drugs and proteins. We therefore constructed a heterogeneous network by integrating a variety of connection relationships about drugs and proteins, including drugs, proteins, and drug side effects, as well as their similarities, interactions, and associations. A DTI prediction method based on random walk and convolutional neural network was proposed and referred to as DTIPred. DTIPred not only takes advantage of various original features related to drugs and proteins, but also integrates the topological information of heterogeneous networks. The prediction model is composed of two sides and learns the deep feature representation of a drug-protein pair. On the left side, random walk with restart is applied to learn the topological vectors of drug and protein nodes. The topological representation is further learned by the constructed deep learning frame based on convolutional neural network. The right side of the model focuses on integrating multiple original similarities and interactions of drugs and proteins to learn the original representation of the drug-protein pair. The results of cross-validation experiments demonstrate that DTIPred achieves better prediction performance than several state-of-the-art methods. During the validation process, DTIPred can retrieve more actual drug-protein interactions within the top part of the predicted results, which may be more helpful to biologists. 
In addition, case studies on five drugs further demonstrate the ability of DTIPred to discover potential drug-protein interactions.
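
Random walk with restart, which this abstract uses to learn topological vectors, can be sketched on a toy graph as follows. This is a generic textbook formulation, not DTIPred's code: the function name, the restart probability of 0.5, the convergence tolerance, and the three-node graph are assumptions, and the real method operates on a heterogeneous drug-protein network rather than a plain adjacency matrix.

```python
def random_walk_with_restart(adjacency, seed, restart_prob=0.5, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - r) * W p + r * e until the L1 change falls below tol.

    W is the column-normalised adjacency matrix and e is the indicator
    vector of the seed node. Returns the converged visiting probabilities.
    """
    n = len(adjacency)
    col_sums = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]
    # transition[i][j] = probability of stepping from node j to node i.
    transition = [
        [adjacency[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    restart = [1.0 if i == seed else 0.0 for i in range(n)]
    p = restart[:]
    for _ in range(max_iter):
        nxt = [
            (1.0 - restart_prob) * sum(transition[i][j] * p[j] for j in range(n))
            + restart_prob * restart[i]
            for i in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy undirected path graph 0 - 1 - 2 (hypothetical nodes), walking from node 0.
toy_graph = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
scores = random_walk_with_restart(toy_graph, seed=0)
```

The resulting score vector ranks nodes by proximity to the seed, which is the sense in which such a vector encodes a node's topological context.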

Ma Z, Guo A, Jing P. Advances in dietary proteins binding with co-existed anthocyanins in foods: Driving forces, structure-affinity relationship, and functional and nutritional properties. Crit Rev Food Sci Nutr 2022;63:10792-10813. [PMID: 35748363 DOI: 10.1080/10408398.2022.2086211] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022]

Ranganathan S, Mahesh S, Suresh S, Nagarajan A, Z Sen T, M Yennamalli R. Experimental and computational studies of cellulases as bioethanol enzymes. Bioengineered 2022;13:14028-14046. [PMID: 35730402 PMCID: PMC9345620 DOI: 10.1080/21655979.2022.2085541] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/16/2022]

Wang W, Zhang Y, Liu D, Zhang H, Wang X, Zhou Y. Prediction of DNA-Binding Protein–Drug-Binding Sites Using Residue Interaction Networks and Sequence Feature. Front Bioeng Biotechnol 2022;10:822392. [PMID: 35519609 PMCID: PMC9065339 DOI: 10.3389/fbioe.2022.822392] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/25/2021] [Accepted: 03/14/2022] [Indexed: 11/13/2022]

A Comprehensive Review of Computation-Based Metal-Binding Prediction Approaches at the Residue Level. BIOMED RESEARCH INTERNATIONAL 2022;2022:8965712. [PMID: 35402609 PMCID: PMC8989566 DOI: 10.1155/2022/8965712] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/02/2022] [Accepted: 03/04/2022] [Indexed: 12/29/2022]

Lu W, Shen J, Zhang Y, Wu H, Qian Y, Chen X, Fu Q. Identifying Membrane Protein Types Based on Lifelong Learning With Dynamically Scalable Networks. Front Genet 2022;12:834488. [PMID: 35371189 PMCID: PMC8964460 DOI: 10.3389/fgene.2021.834488] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/13/2021] [Accepted: 12/21/2021] [Indexed: 11/13/2022]

Boumezber S, Yelekçi K. Screening of novel and selective inhibitors for neuronal nitric oxide synthase (nNOS) via structure-based drug design techniques. J Biomol Struct Dyn 2022;41:3607-3629. [PMID: 35322764 DOI: 10.1080/07391102.2022.2054471] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]

Zhao Q, Ma J, Wang Y, Xie F, Lv Z, Xu Y, Shi H, Han K. Mul-SNO: A novel prediction tool for S-nitrosylation sites based on deep learning methods. IEEE J Biomed Health Inform 2021;26:2379-2387. [PMID: 34762593 DOI: 10.1109/jbhi.2021.3123503] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 11/09/2022]

Xuan P, Fan M, Cui H, Zhang T, Nakaguchi T. GVDTI: graph convolutional and variational autoencoders with attribute-level attention for drug-protein interaction prediction. Brief Bioinform 2021;23:6412398. [PMID: 34718408 DOI: 10.1093/bib/bbab453] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 08/01/2021] [Revised: 09/14/2021] [Accepted: 10/02/2021] [Indexed: 11/12/2022]

Abstract

MOTIVATION

Identifying proteins that interact with drugs plays an important role in the initial period of developing drugs, which helps to reduce the development cost and time. Recent methods for predicting drug-protein interactions mainly focus on exploiting various data about drugs and proteins. These methods failed to completely learn and integrate the attribute information of a pair of drug and protein nodes and their attribute distribution.

RESULTS

We present a new prediction method, GVDTI, to encode multiple pairwise representations, including attention-enhanced topological representation, attribute representation and attribute distribution. First, a framework based on a graph convolutional autoencoder is constructed to learn attention-enhanced topological embeddings that integrate the topology structure of a drug-protein network for each drug and protein node. The topological embeddings of each drug and each protein are then combined and fused by multi-layer convolutional neural networks to obtain the pairwise topological representation, which reveals the hidden topological relationships between drug and protein nodes. The proposed attribute-wise attention mechanism learns and adjusts the importance of individual attributes in each topological embedding of drug and protein nodes. Secondly, a tri-layer heterogeneous network composed of drug, protein and disease nodes is created to associate the similarities, interactions and associations across the heterogeneous nodes. The attribute distribution of the drug-protein node pair is encoded by a variational autoencoder. The pairwise attribute representation is learned via a multi-layer convolutional neural network to deeply integrate the attributes of drug and protein nodes. Finally, the three pairwise representations are fused by convolutional and fully connected neural networks for drug-protein interaction prediction. The experimental results show that GVDTI outperformed seven other state-of-the-art methods in comparison. The improved recall rates indicate that GVDTI retrieved more actual drug-protein interactions in the top-ranked candidates than conventional methods. Case studies on five drugs further confirm GVDTI's ability to discover potential candidate drug-related proteins.

CONTACT

zhang@hlju.edu.cn

SUPPLEMENTARY INFORMATION

Supplementary data are available at Briefings in Bioinformatics online.

Ding Y, Yang C, Tang J, Guo F. Identification of protein-nucleotide binding residues via graph regularized k-local hyperplane distance nearest neighbor model. APPL INTELL 2021. [DOI: 10.1007/s10489-021-02737-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 10/20/2022]

Zhang J, Zhang Z, Pu L, Tang J, Guo F. AIEpred: An Ensemble Predictive Model of Classifier Chain to Identify Anti-Inflammatory Peptides. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021;18:1831-1840. [PMID: 31985437 DOI: 10.1109/tcbb.2020.2968419] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Indexed: 06/10/2023]

Ding Y, Tang J, Guo F. Protein Crystallization Identification via Fuzzy Model on Linear Neighborhood Representation. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021;18:1986-1995. [PMID: 31751248 DOI: 10.1109/tcbb.2019.2954826] [Citation(s) in RCA: 37] [Impact Index Per Article: 9.3] [Indexed: 06/10/2023]

Melse O, Hecht S, Antes I. DynaBiS: A hierarchical sampling algorithm to identify flexible binding sites for large ligands and peptides. Proteins 2021;90:18-32. [PMID: 34288078 DOI: 10.1002/prot.26182] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 12/15/2020] [Revised: 06/24/2021] [Accepted: 07/11/2021] [Indexed: 11/11/2022]

Qian Y, Jiang L, Ding Y, Tang J, Guo F. A sequence-based multiple kernel model for identifying DNA-binding proteins. BMC Bioinformatics 2021;22:291. [PMID: 34058979 PMCID: PMC8167993 DOI: 10.1186/s12859-020-03875-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 11/01/2020] [Accepted: 11/13/2020] [Indexed: 11/18/2022]

Guo X, Zhou W, Shi B, Wang X, Du A, Ding Y, Tang J, Guo F. An Efficient Multiple Kernel Support Vector Regression Model for Assessing Dry Weight of Hemodialysis Patients. Curr Bioinform 2021. [DOI: 10.2174/1574893615999200614172536] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 11/22/2022]

Xuan P, Zhang Y, Cui H, Zhang T, Guo M, Nakaguchi T. Integrating multi-scale neighbouring topologies and cross-modal similarities for drug-protein interaction prediction. Brief Bioinform 2021;22:6220173. [PMID: 33839743 DOI: 10.1093/bib/bbab119] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 01/09/2021] [Revised: 02/15/2021] [Accepted: 03/12/2021] [Indexed: 01/02/2023]

Abstract

MOTIVATION

Identifying the proteins that interact with drugs can reduce the cost and time of drug development. Existing computerized methods focus on integrating drug-related and protein-related data from multiple sources to predict candidate drug-target interactions (DTIs). However, multi-scale neighboring node sequences and various kinds of drug and protein similarities are neither fully explored nor considered in decision making.

RESULTS

We propose a drug-target interaction prediction method, DTIP, to encode and integrate multi-scale neighbouring topologies and multiple kinds of similarities, associations, and interactions related to drugs and proteins. We first construct a three-layer heterogeneous network to represent interactions and associations across drug, protein, and disease nodes. Then a learning framework based on a fully-connected autoencoder is proposed to learn the nodes' low-dimensional feature representations within the heterogeneous network. Secondly, multi-scale neighbouring sequences of drug and protein nodes are formulated by random walks. A module based on a bidirectional gated recurrent unit is designed to learn the neighbouring sequential information and integrate the low-dimensional features of nodes. Finally, we propose attention mechanisms at the feature level, neighbouring topological level and similarity level to learn more informative features, topologies and similarities. The prediction results are obtained by integrating neighbouring topologies, similarities and feature attributes using a multiple-layer CNN. Comprehensive experimental results on a public dataset demonstrated the effectiveness of our innovative features and modules. Comparison with other state-of-the-art methods and case studies of five drugs further validated DTIP's ability to discover potential candidate drug-related proteins.

Xu L, Jiao S, Zhang D, Wu S, Zhang H, Gao B. Identification of long noncoding RNAs with machine learning methods: a review. Brief Funct Genomics 2021;20:174-180. [PMID: 33758917 DOI: 10.1093/bfgp/elab017] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/12/2021] [Revised: 02/24/2021] [Accepted: 02/25/2021] [Indexed: 12/11/2022]

Yang C, Ding Y, Meng Q, Tang J, Guo F. Granular multiple kernel learning for identifying RNA-binding protein residues via integrating sequence and structure information. Neural Comput Appl 2021. [DOI: 10.1007/s00521-020-05573-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2022]

Santana CA, Silveira SDA, Moraes JPA, Izidoro SC, de Melo-Minardi RC, Ribeiro AJM, Tyzack JD, Borkakoti N, Thornton JM. GRaSP: a graph-based residue neighborhood strategy to predict binding sites. Bioinformatics 2020;36:i726-i734. [DOI: 10.1093/bioinformatics/btaa805] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Accepted: 09/08/2020] [Indexed: 01/22/2023]

Abstract

MOTIVATION

The discovery of protein–ligand-binding sites is a major step for elucidating protein function and for investigating new functional roles. Detecting protein–ligand-binding sites experimentally is time-consuming and expensive. Thus, a variety of in silico methods to detect and predict binding sites was proposed as they can be scalable, fast and present low cost.

RESULTS

We proposed Graph-based Residue neighborhood Strategy to Predict binding sites (GRaSP), a novel residue centric and scalable method to predict ligand-binding site residues. It is based on a supervised learning strategy that models the residue environment as a graph at the atomic level. Results show that GRaSP made compatible or superior predictions when compared with methods described in the literature. GRaSP outperformed six other residue-centric methods, including the one considered as state-of-the-art. Also, our method achieved better results than the method from CAMEO independent assessment. GRaSP ranked second when compared with five state-of-the-art pocket-centric methods, which we consider a significant result, as it was not devised to predict pockets. Finally, our method proved scalable as it took 10–20 s on average to predict the binding site for a protein complex whereas the state-of-the-art residue-centric method takes 2–5 h on average.

AVAILABILITY AND IMPLEMENTATION

The source code and datasets are available at https://github.com/charles-abreu/GRaSP.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Ding Y, Tang J, Guo F. Human protein subcellular localization identification via fuzzy model on Kernelized Neighborhood Representation. Appl Soft Comput 2020. [DOI: 10.1016/j.asoc.2020.106596] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Indexed: 12/29/2022]

Ao C, Zhou W, Gao L, Dong B, Yu L. Prediction of antioxidant proteins using hybrid feature representation method and random forest. Genomics 2020;112:4666-4674. [DOI: 10.1016/j.ygeno.2020.08.016] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 07/23/2020] [Revised: 08/10/2020] [Accepted: 08/13/2020] [Indexed: 12/19/2022]

Yu L, Shi Y, Zou Q, Wang S, Zheng L, Gao L. Exploring Drug Treatment Patterns Based on the Action of Drug and Multilayer Network Model. Int J Mol Sci 2020;21:E5014. [PMID: 32708644 PMCID: PMC7404256 DOI: 10.3390/ijms21145014] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 06/28/2020] [Revised: 07/13/2020] [Accepted: 07/14/2020] [Indexed: 02/01/2023]

Abstract

Some drugs can be used to treat multiple diseases, suggesting potential patterns in drug treatment. Determination of drug treatment patterns can improve our understanding of the mechanisms of drug action, enabling drug repurposing. A drug can be associated with a multilayer tissue-specific protein-protein interaction (TSPPI) network for the diseases it is used to treat. Proteins usually interact with other proteins to achieve functions that cause diseases. Hence, studying drug treatment patterns is similar to studying common module structures in multilayer TSPPI networks. Therefore, we propose a network-based model to study the treatment patterns of drugs. The method was designated SDTP (studying drug treatment pattern) and was based on drug effects and a multilayer network model. To demonstrate the application of the SDTP method, we focused on analysis of trichostatin A (TSA) in leukemia, breast cancer, and prostate cancer. We constructed a TSPPI multilayer network and obtained candidate drug-target modules from the network. Gene ontology analysis provided insights into the significance of the drug-target modules and co-expression networks. Finally, two modules were obtained as potential treatment patterns for TSA. Through analysis of the significance, composition, and functions of the selected drug-target modules, we validated the feasibility and rationality of our proposed SDTP method for identifying drug treatment patterns. In summary, our novel approach used a multilayer network model to overcome the shortcomings of single-layer networks and combined the network with information on drug activity. Based on the discovered drug treatment patterns, we can predict the potential diseases that the drug can treat. That is, if a disease-related protein module has a similar structure, then the drug is likely to be a potential drug for the treatment of the disease.

Meng C, Zhang J, Ye X, Guo F, Zou Q. Review and comparative analysis of machine learning-based phage virion protein identification methods. BIOCHIMICA ET BIOPHYSICA ACTA-PROTEINS AND PROTEOMICS 2020;1868:140406. [PMID: 32135196 DOI: 10.1016/j.bbapap.2020.140406] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 01/01/2020] [Revised: 02/14/2020] [Accepted: 02/27/2020] [Indexed: 02/01/2023]

Identification of membrane protein types via multivariate information fusion with Hilbert–Schmidt Independence Criterion. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2019.11.103] [Citation(s) in RCA: 88] [Impact Index Per Article: 17.6] [Indexed: 11/24/2022]

Wei L, Luan S, Nagai LAE, Su R, Zou Q. Exploring sequence-based features for the improved prediction of DNA N4-methylcytosine sites in multiple species. Bioinformatics 2020;35:1326-1333. [PMID: 30239627 DOI: 10.1093/bioinformatics/bty824] [Citation(s) in RCA: 130] [Impact Index Per Article: 26.0] [Received: 08/13/2018] [Revised: 09/12/2018] [Accepted: 09/18/2018] [Indexed: 12/20/2022]

Huang Q, Zhang J, Wei L, Guo F, Zou Q. 6mA-RicePred: A Method for Identifying DNA N⁶-Methyladenine Sites in the Rice Genome Based on Feature Fusion. Front Plant Sci 2020;11:4. [PMID: 32076430 PMCID: PMC7006724 DOI: 10.3389/fpls.2020.00004] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/04/2019] [Accepted: 01/06/2020] [Indexed: 06/01/2023]

Wang C, Zhao N, Yuan L, Liu X. Computational Detection of Breast Cancer Invasiveness with DNA Methylation Biomarkers. Cells 2020;9:E326. [PMID: 32019269 PMCID: PMC7072524 DOI: 10.3390/cells9020326] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/02/2020] [Revised: 01/28/2020] [Accepted: 01/28/2020] [Indexed: 12/14/2022] Open

Kumar AP, Verma CS, Lukman S. Structural dynamics and allostery of Rab proteins: strategies for drug discovery and design. Brief Bioinform 2020;22:270-287. [PMID: 31950981 DOI: 10.1093/bib/bbz161] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/17/2019] [Revised: 08/29/2019] [Accepted: 11/15/2019] [Indexed: 01/09/2023] Open

Abstract

Rab proteins represent the largest family of the Ras superfamily guanosine triphosphatase (GTPase). Aberrant human Rab proteins are associated with multiple diseases, including cancers and neurological disorders. Rab subfamily members display subtle conformational variations that render specificity in their physiological functions and can be targeted for subfamily-specific drug design. However, drug discovery efforts have not focused much on targeting Rab allosteric non-nucleotide binding sites, which are subject to less evolutionary pressure to be conserved, and hence are likely to offer subfamily specificity and may be less prone to undesirable off-target interactions and side effects. To discover druggable allosteric binding sites, Rab structural dynamics need to be first incorporated using multiple experimentally and computationally obtained structures. The high-dimensional structural data may necessitate feature extraction methods to identify manageable representative structures for subsequent analyses. We have detailed state-of-the-art computational methods to (i) identify binding sites using data on sequence, shape, energy, etc., (ii) determine the allosteric nature of these binding sites based on structural ensembles, residue networks and correlated motions and (iii) identify small molecule binders through structure- and ligand-based virtual screening. To benefit future studies for targeting Rab allosteric sites, we herein detail a refined workflow comprising multiple available computational methods, which have been successfully used alone or in combinations. This workflow is also applicable for drug discovery efforts targeting other medically important proteins. Depending on the structural dynamics of proteins of interest, researchers can select suitable strategies for allosteric drug discovery and design, from the resources of computational methods and tools listed in the workflow.

Ao C, Zhang Y, Li D, Zhao Y, Zou Q. Progress in the development of antimicrobial peptide prediction tools. Curr Protein Pept Sci 2020;22:CPPS-EPUB-103746. [PMID: 31957609 DOI: 10.2174/1389203721666200117163802] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/19/2019] [Revised: 06/12/2019] [Accepted: 07/15/2019] [Indexed: 11/22/2022]

Li Q, Dong B, Wang D, Wang S. Identification of Secreted Proteins From Malaria Protozoa With Few Features. IEEE Access 2020;8:89793-89801. [DOI: 10.1109/access.2020.2994206] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/02/2025]

Sun S, Wang C, Ding H, Zou Q. Machine learning and its applications in plant molecular studies. Brief Funct Genomics 2019;19:40-48. [DOI: 10.1093/bfgp/elz036] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/12/2019] [Revised: 09/06/2019] [Accepted: 09/15/2019] [Indexed: 01/16/2023] Open

Zhao Z, Xu Y, Zhao Y. SXGBsite: Prediction of Protein-Ligand Binding Sites Using Sequence Information and Extreme Gradient Boosting. Genes (Basel) 2019;10:E965. [PMID: 31771119 PMCID: PMC6947422 DOI: 10.3390/genes10120965] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/05/2019] [Revised: 10/19/2019] [Accepted: 11/19/2019] [Indexed: 12/13/2022] Open

Wang W, Li K, Lv H, Zhang H, Wang S, Huang J. SmoPSI: Analysis and Prediction of Small Molecule Binding Sites Based on Protein Sequence Information. Comput Math Methods Med 2019;2019:1926156. [PMID: 31814842 PMCID: PMC6877956 DOI: 10.1155/2019/1926156] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 06/27/2019] [Revised: 09/16/2019] [Accepted: 09/26/2019] [Indexed: 11/20/2022]

Ding Y, Tang J, Guo F. Identification of Drug-Side Effect Association via Semisupervised Model and Multiple Kernel Learning. IEEE J Biomed Health Inform 2019;23:2619-2632. [DOI: 10.1109/jbhi.2018.2883834] [Citation(s) in RCA: 51] [Impact Index Per Article: 8.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]

Meng C, Wei L, Zou Q. SecProMTB: Support Vector Machine-Based Classifier for Secretory Proteins Using Imbalanced Data Sets Applied to Mycobacterium tuberculosis. Proteomics 2019;19:e1900007. [DOI: 10.1002/pmic.201900007] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/08/2019] [Revised: 03/25/2019] [Indexed: 11/08/2022]

Su R, Wu H, Xu B, Liu X, Wei L. Developing a Multi-Dose Computational Model for Drug-Induced Hepatotoxicity Prediction Based on Toxicogenomics Data. IEEE/ACM Trans Comput Biol Bioinform 2019;16:1231-1239. [PMID: 30040651 DOI: 10.1109/tcbb.2018.2858756] [Citation(s) in RCA: 90] [Impact Index Per Article: 15.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]

Xuan P, Sun C, Zhang T, Ye Y, Shen T, Dong Y. Gradient Boosting Decision Tree-Based Method for Predicting Interactions Between Target Genes and Drugs. Front Genet 2019;10:459. [PMID: 31214240 PMCID: PMC6555260 DOI: 10.3389/fgene.2019.00459] [Citation(s) in RCA: 47] [Impact Index Per Article: 7.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/12/2019] [Accepted: 04/30/2019] [Indexed: 02/01/2023] Open

Abstract

Determining the target genes that interact with drugs—drug–target interactions—plays an important role in drug discovery. Identification of drug–target interactions through biological experiments is time consuming, laborious, and costly. Therefore, using computational approaches to predict candidate targets is a good way to reduce the cost of wet-lab experiments. However, the known interactions (positive samples) and the unknown interactions (negative samples) display a serious class imbalance, which has an adverse effect on the accuracy of the prediction results. To mitigate the impact of class imbalance and completely exploit the negative samples, we proposed a new method, named DTIGBDT, based on gradient boosting decision trees, for predicting candidate drug–target interactions. We constructed a drug–target heterogeneous network that contains the drug similarities based on the chemical structures of drugs, the target similarities based on target sequences, and the known drug–target interactions. The topological information of the network was captured by random walks to update the similarities between drugs or targets. The paths between drugs and targets could be divided into multiple categories, and the features of each category of paths were extracted. We constructed a prediction model based on gradient boosting decision trees. The model establishes multiple decision trees with the extracted features and obtains the interaction scores between drugs and targets. DTIGBDT is a method of ensemble learning, and it effectively reduces the impact of class imbalance. The experimental results indicate that DTIGBDT outperforms several state-of-the-art methods for drug–target interaction prediction. In addition, case studies on Quetiapine, Clozapine, Olanzapine, Aripiprazole, and Ziprasidone demonstrate the ability of DTIGBDT to discover potential drug–target interactions.

Han K, Wang M, Zhang L, Wang Y, Guo M, Zhao M, Zhao Q, Zhang Y, Zeng N, Wang C. Predicting Ion Channels Genes and Their Types With Machine Learning Techniques. Front Genet 2019;10:399. [PMID: 31130983 PMCID: PMC6510169 DOI: 10.3389/fgene.2019.00399] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/14/2019] [Accepted: 04/12/2019] [Indexed: 02/01/2023] Open