Reference Citation Analysis

For: Tsukiyama S, Hasan MM, Fujii S, Kurata H. LSTM-PHV: prediction of human-virus protein-protein interactions by LSTM with word2vec. Brief Bioinform 2021;22:bbab228. [PMID: 34160596 PMCID: PMC8574953 DOI: 10.1093/bib/bbab228] [Citation(s) in RCA: 47] [Impact Index Per Article: 11.8] [Received: 03/03/2021] [Revised: 04/27/2021] [Accepted: 05/25/2021] [Indexed: 12/30/2022]



Cited by Other Article(s)

Yuan R, Zhang J, Zhou J, Cong Q. Recent progress and future challenges in structure-based protein-protein interaction prediction. Mol Ther 2025;33:2252-2268. [PMID: 40195117 DOI: 10.1016/j.ymthe.2025.04.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2025] [Revised: 03/05/2025] [Accepted: 04/02/2025] [Indexed: 04/09/2025]

Jiang J, Zhang C, Ke L, Hayes N, Zhu Y, Qiu H, Zhang B, Zhou T, Wei GW. A review of machine learning methods for imbalanced data challenges in chemistry. Chem Sci 2025;16:7637-7658. [PMID: 40271022 PMCID: PMC12013631 DOI: 10.1039/d5sc00270b] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2025] [Accepted: 04/06/2025] [Indexed: 04/25/2025]

Shao D, Zou Y, Ma L, Yi S. Multiscale and global-local U-Net for protein-protein interaction site prediction. Comput Biol Chem 2025;118:108485. [PMID: 40306099 DOI: 10.1016/j.compbiolchem.2025.108485] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2024] [Revised: 03/18/2025] [Accepted: 04/21/2025] [Indexed: 05/02/2025]

Abstract

Precise prediction of protein-protein interaction sites (PPIS) is fundamental to deciphering cellular mechanisms and accelerating therapeutic discovery. Despite significant advancements in computational approaches, current methods frequently fail to integrate multiscale features that simultaneously capture global context and local interactions. We present Multiscale and Global-Local U-Net for Protein-Protein Interaction Site Prediction (MGU-PPIS), a novel architecture designed to address this critical limitation. Our model leverages a U-Net framework with implemented multi-level pooling to extract comprehensive multiscale features. Within each scale, we synergistically combine Transformer networks, Graph Convolutional Networks (GCNs), and Graph Attention Networks (GATs) to simultaneously capture global patterns and local structural motifs. We implement Laplacian positional encoding to effectively represent global protein structural characteristics. In our framework, proteins are conceptualized as graph structures where individual residues function as nodes and their spatial relationships define edges. The model processes information through an innovative two-stage U-Net architecture, where output features from the initial stage serve as refined inputs for the subsequent stage. This dual-stage design, coupled with our graph-based representation, enables MGU-PPIS to extract a rich spectrum of multiscale features encompassing both global context and local interactions at each scale. Comprehensive experimental validation demonstrates that MGU-PPIS significantly outperforms state-of-the-art methods in predictive accuracy. Beyond introducing a novel computational strategy for PPIS prediction, our work establishes a foundation for advances in protein functional analysis and structure-based drug design.
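The graph representation described in this abstract (residues as nodes, spatial relationships as edges) can be illustrated with a minimal sketch. The 8 Å distance cutoff, the toy coordinates, and the function names are assumptions made for illustration only; the abstract does not state how MGU-PPIS defines edges. The unnormalized graph Laplacian is included because Laplacian positional encodings are typically derived from eigenvectors of a graph Laplacian.

```python
import math

def contact_graph(coords, cutoff=8.0):
    """Adjacency matrix: residues i and j are linked when their
    coordinates lie within `cutoff` (hypothetical 8 angstrom threshold)."""
    n = len(coords)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                adj[i][j] = adj[j][i] = 1
    return adj

def graph_laplacian(adj):
    """Unnormalized Laplacian L = D - A, where D is the degree matrix."""
    n = len(adj)
    return [
        [(sum(adj[i]) if i == j else 0) - adj[i][j] for j in range(n)]
        for i in range(n)
    ]

# Toy C-alpha coordinates (made up): three residues in a row, one far away.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
adj = contact_graph(coords)
lap = graph_laplacian(adj)
```

In this toy example the first three residues are mutually connected and the fourth is isolated; every row of the Laplacian sums to zero, which is a quick sanity check on the construction.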


Han YL, Yin HH, Li C, Du J, He Y, Guan YX. Discovery of New Pentapeptide Inhibitors Against Amyloid-β Aggregation Using Word2Vec and Molecular Simulation. ACS Chem Neurosci 2025;16:1055-1065. [PMID: 39999409 DOI: 10.1021/acschemneuro.4c00661] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/27/2025]

Koul M, Kaushik S, Singh K, Sharma D. VITALdb: to select the best viroinformatics tools for a desired virus or application. Brief Bioinform 2025;26:bbaf084. [PMID: 40063348 PMCID: PMC11892104 DOI: 10.1093/bib/bbaf084] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/14/2024] [Revised: 01/14/2025] [Accepted: 02/17/2025] [Indexed: 05/13/2025]

Charoenkwan P, Chumnanpuen P, Schaduangrat N, Shoombuatong W. Deepstack-ACE: A deep stacking-based ensemble learning framework for the accelerated discovery of ACE inhibitory peptides. Methods 2025;234:131-140. [PMID: 39709069 DOI: 10.1016/j.ymeth.2024.12.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2024] [Revised: 11/27/2024] [Accepted: 12/07/2024] [Indexed: 12/23/2024]

Abstract

Identifying angiotensin-I-converting enzyme (ACE) inhibitory peptides accurately is crucial for understanding the primary factor that regulates the renin-angiotensin system and for providing guidance in developing new potential drugs. Given the inherent experimental complexities, using computational methods for in silico peptide identification could be indispensable for facilitating the high-throughput characterization of ACE inhibitory peptides. In this paper, we propose a novel deep stacking-based ensemble learning framework, termed Deepstack-ACE, to precisely identify ACE inhibitory peptides. In Deepstack-ACE, the input peptide sequences are fed into the word2vec embedding technique to generate sequence representations. Then, these representations were employed to train five powerful deep learning methods, including long short-term memory, convolutional neural network, multi-layer perceptron, gated recurrent unit network, and recurrent neural network, for the construction of base-classifiers. Finally, the optimized stacked model was constructed based on the best combination of selected base-classifiers. Benchmarking experiments showed that Deepstack-ACE attained a more accurate and robust identification of ACE inhibitory peptides compared to its base-classifiers and several conventional machine learning classifiers. Remarkably, in the independent test, our proposed model significantly outperformed the current state-of-the-art methods, with a balanced accuracy of 0.916, sensitivity of 0.911, and Matthews correlation coefficient scores of 0.826. Moreover, we developed a user-friendly web server for Deepstack-ACE, which is freely available at https://pmlabqsar.pythonanywhere.com/Deepstack-ACE. We anticipate that our proposed Deepstack-ACE model can provide a faster and reasonably accurate identification of ACE inhibitory peptides.
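The three metrics quoted in this abstract (balanced accuracy, sensitivity, and Matthews correlation coefficient) all follow from a binary confusion matrix. The sketch below uses the standard definitions; the function name and the confusion counts are hypothetical and are not taken from the Deepstack-ACE evaluation.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion counts."""
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    balanced_accuracy = (sensitivity + specificity) / 2
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally reported as 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": balanced_accuracy,
        "mcc": mcc,
    }

# Hypothetical counts for illustration only.
m = binary_metrics(tp=90, fp=10, tn=90, fn=10)
```

Balanced accuracy and MCC are commonly preferred over plain accuracy when the positive and negative classes differ in size, which is a frequent situation in peptide activity datasets.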


Feng T, Chen X, Wu S, Tang W, Zhou H, Fang Z. Predicting the bacterial host range of plasmid genomes using the language model-based one-class support vector machine algorithm. Microb Genom 2025;11. [PMID: 39932495 DOI: 10.1099/mgen.0.001355] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/08/2025]

Abstract

The prediction of the plasmid host range is crucial for investigating the dissemination of plasmids and the transfer of resistance and virulence genes mediated by plasmids. Several machine learning-based tools have been developed to predict plasmid host ranges. These tools have been trained and tested based on the bacterial host records of plasmids in related databases. Typically, a plasmid genome in databases such as the National Center for Biotechnology Information is annotated with only one or a few bacterial hosts, which does not encompass all possible hosts. Consequently, existing methods may significantly underestimate the host ranges of mobile plasmids. In this work, we propose a novel method named HRPredict, which employs a word vector model to digitally represent the encoded proteins on plasmid genomes. Since it is difficult to confirm which host a particular plasmid definitely cannot enter, we developed a machine learning approach for predicting whether a plasmid can enter a specific bacterium as a no-negative samples learning task. Using multiple one-class support vector machine (SVM) models that do not require negative samples for training, HRPredict predicts the host range of plasmids across 45 families, 56 genera and 56 species. In the benchmark test set, we constructed reliable negative samples for each host taxonomic unit via two indirect methods, and we found that the area under the curve (AUC), F1-score, recall, precision and accuracy of most taxonomic unit prediction models exceeded 0.9. Among the 13 broad-host-range plasmid types, HRPredict demonstrated greater coverage than HOTSPOT and PlasmidHostFinder, thus successfully predicting the majority of hosts previously reported. Through feature importance calculation for each SVM model, we found that genes closely related to the plasmid host range are involved in functions such as bacterial adaptability, pathogenicity and survival. 
These findings provide significant insight into the mechanisms through which bacteria adjust to diverse environments through plasmids. The HRPredict algorithm is expected to facilitate in-depth research on the spread of broad-host-range plasmids and enable host-range predictions for novel plasmids reconstructed from microbiome sequencing data.


Taha K. Protein-protein interaction detection using deep learning: A survey, comparative analysis, and experimental evaluation. Comput Biol Med 2025;185:109449. [PMID: 39644584 DOI: 10.1016/j.compbiomed.2024.109449] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2024] [Revised: 11/13/2024] [Accepted: 11/14/2024] [Indexed: 12/09/2024]

Bowyer S, Allen DJ, Furnham N. Unveiling the ghost: machine learning's impact on the landscape of virology. J Gen Virol 2025;106. [PMID: 39804261 DOI: 10.1099/jgv.0.002067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/02/2025]

Chen H, Liu J, Tang G, Hao G, Yang G. Bioinformatic Resources for Exploring Human-virus Protein-protein Interactions Based on Binding Modes. Genomics Proteomics Bioinformatics 2024;22:qzae075. [PMID: 39404802 PMCID: PMC11658832 DOI: 10.1093/gpbjnl/qzae075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2023] [Revised: 10/05/2024] [Accepted: 10/11/2024] [Indexed: 12/21/2024]

Zouari S, Ali F, Masmoudi A, Ghazalah SA, Alghamdi W, Kateb FA, Ibrahim N. Deep-GB: A novel deep learning model for globular protein prediction using CNN-BiLSTM architecture and enhanced PSSM with trisection strategy. IET Syst Biol 2024;18:208-217. [PMID: 39514139 DOI: 10.1049/syb2.12108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/03/2024] [Revised: 09/30/2024] [Accepted: 10/27/2024] [Indexed: 11/16/2024]

Zhang L, Wang S, Wang Y, Zhao T. HBFormer: a single-stream framework based on hybrid attention mechanism for identification of human-virus protein-protein interactions. Bioinformatics 2024;40:btae724. [PMID: 39673490 PMCID: PMC11648999 DOI: 10.1093/bioinformatics/btae724] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2024] [Revised: 10/27/2024] [Accepted: 12/02/2024] [Indexed: 12/16/2024]

Wang C, Zou Q. MFPSP: Identification of fungal species-specific phosphorylation site using offspring competition-based genetic algorithm. PLoS Comput Biol 2024;20:e1012607. [PMID: 39556608 DOI: 10.1371/journal.pcbi.1012607] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2024] [Accepted: 11/03/2024] [Indexed: 11/20/2024]

Beltrán JF, Belén LH, Yáñez AJ, Jimenez L. Predicting viral proteins that evade the innate immune system: a machine learning-based immunoinformatics tool. BMC Bioinformatics 2024;25:351. [PMID: 39522017 PMCID: PMC11550529 DOI: 10.1186/s12859-024-05972-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2024] [Accepted: 10/29/2024] [Indexed: 11/16/2024]

Beltrán JF, Herrera-Belén L, Yáñez AJ, Jimenez L. Prediction of viral oncoproteins through the combination of generative adversarial networks and machine learning techniques. Sci Rep 2024;14:27108. [PMID: 39511292 PMCID: PMC11543823 DOI: 10.1038/s41598-024-77028-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2024] [Accepted: 10/18/2024] [Indexed: 11/15/2024]

Tayebi Z, Ali S, Patterson M. TCellR2Vec: efficient feature selection for TCR sequences for cancer classification. PeerJ Comput Sci 2024;10:e2239. [PMID: 39650499 PMCID: PMC11622898 DOI: 10.7717/peerj-cs.2239] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2024] [Accepted: 07/14/2024] [Indexed: 12/11/2024]

Chen C, He Z, Zhao J, Zhu X, Li J, Wu X, Chen Z, Chen H, Jia G. Zoonotic outbreak risk prediction with long short-term memory models: a case study with schistosomiasis, echinococcosis, and leptospirosis. BMC Infect Dis 2024;24:1062. [PMID: 39333964 PMCID: PMC11437667 DOI: 10.1186/s12879-024-09892-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2024] [Accepted: 09/05/2024] [Indexed: 09/30/2024]

Abstract

BACKGROUND

Zoonotic infections, characterized by high pathogen diversity, wide geographic reach, and substantial harm to society, have become a major global public health problem. Early and accurate prediction of their outbreaks is crucial for disease control. This study aimed to develop risk prediction models for zoonotic diseases based on time-series incidence data, using three zoonotic diseases in mainland China as case studies.

METHODS

The incidence data for schistosomiasis, echinococcosis, and leptospirosis were downloaded from the Scientific Data Centre of the National Ministry of Health of China, and were processed by interpolation, dynamic curve reconstruction and time series decomposition. Data were decomposed into three distinct components: the trend component, the seasonal component, and the residual component. The trend component was used as input to construct the Long Short-Term Memory (LSTM) prediction model, while the seasonal component was used in the comparison of the periods and amplitudes. Finally, the accuracy of the hybrid LSTM prediction model was comprehensively evaluated.

RESULTS

This study employed trend series of incidence numbers and incidence rates of three zoonotic diseases for modeling. The prediction results of the model showed that the predicted incidence number and incidence rate were very close to the real incidence data. Model evaluation revealed that the prediction error of the hybrid LSTM model was smaller than that of the single LSTM. Thus, these results demonstrate that using trend series as model input leads to better-fitting predictive models.

CONCLUSIONS

Our study successfully developed LSTM hybrid models for disease outbreak risk prediction using three zoonotic diseases as case studies. We demonstrate that the LSTM, when combined with time series decomposition, delivers more accurate results compared to conventional LSTM models using the raw data series. Disease outbreak trends can be predicted more accurately using hybrid models.
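The decomposition step mentioned in this abstract (trend, seasonal, and residual components, with the trend passed on to the LSTM) can be sketched as follows. The abstract does not name the decomposition algorithm, so this example assumes a classical additive decomposition with a centered moving average; the function names and the synthetic series are hypothetical, and the LSTM stage is not shown.

```python
def centered_trend(series, period):
    """Centered moving average; positions where the window does not fit
    are left as None. Even periods use half weights at the window ends."""
    n, half = len(series), period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = series[t - half:t + half + 1]
        total = sum(window)
        if period % 2 == 0:
            total -= 0.5 * (window[0] + window[-1])
        trend[t] = total / period
    return trend

def decompose(series, period):
    """Additive split: series = trend + seasonal + residual.
    Assumes the series covers at least two full periods."""
    n = len(series)
    trend = centered_trend(series, period)
    detrended = [x - t if t is not None else None for x, t in zip(series, trend)]
    means = []
    for p in range(period):
        vals = [detrended[i] for i in range(p, n, period) if detrended[i] is not None]
        means.append(sum(vals) / len(vals))
    offset = sum(means) / period          # center the seasonal pattern on zero
    seasonal = [means[i % period] - offset for i in range(n)]
    residual = [
        x - t - s if t is not None else None
        for x, t, s in zip(series, trend, seasonal)
    ]
    return trend, seasonal, residual

# Synthetic series (made up): linear trend plus a period-4 seasonal pattern.
pattern = [1.0, -1.0, 2.0, -2.0]
series = [10 + 0.5 * i + pattern[i % 4] for i in range(16)]
trend, seasonal, residual = decompose(series, period=4)
```

On this noise-free synthetic input the recovered trend matches the underlying line and the residual is zero wherever the trend is defined; real incidence data would leave a nonzero residual.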


Affiliation(s)

Chunrong Chen College of Animal Science and Technology, Guangxi University, Nanning, 530004, China Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China
Zhaoyuan He College of Animal Science and Technology, Guangxi University, Nanning, 530004, China
Jin Zhao College of Animal Science and Technology, Guangxi University, Nanning, 530004, China Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China
Xuhui Zhu Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
Jiabao Li Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China College of Data Science, Taiyuan University of Technology, Taiyuan, 030024, China
Xinnan Wu Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China College of Data Science, Taiyuan University of Technology, Taiyuan, 030024, China
Zhongting Chen College of Animal Science and Technology, Guangxi University, Nanning, 530004, China Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China
Hailan Chen College of Animal Science and Technology, Guangxi University, Nanning, 530004, China.
Gengjie Jia Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China.


Kurata H, Harun-Or-Roshid M, Tsukiyama S, Maeda K. PredIL13: Stacking a variety of machine and deep learning methods with ESM-2 language model for identifying IL13-inducing peptides. PLoS One 2024;19:e0309078. [PMID: 39172871 PMCID: PMC11340954 DOI: 10.1371/journal.pone.0309078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2024] [Accepted: 08/05/2024] [Indexed: 08/24/2024]

Guevara-Barrientos D, Kaundal R. Malivhu: A Comprehensive Bioinformatics Resource for Filtering SARS and MERS Virus Proteins by Their Classification, Family and Species, and Prediction of Their Interactions Against Human Proteins. Bioinform Biol Insights 2024;18:11779322241263671. [PMID: 39148721 PMCID: PMC11325310 DOI: 10.1177/11779322241263671] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2024] [Accepted: 06/04/2024] [Indexed: 08/17/2024]

Volzhenin K, Bittner L, Carbone A. SENSE-PPI reconstructs interactomes within, across, and between species at the genome scale. iScience 2024;27:110371. [PMID: 39055916 PMCID: PMC11269938 DOI: 10.1016/j.isci.2024.110371] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2023] [Revised: 05/04/2024] [Accepted: 06/21/2024] [Indexed: 07/28/2024]

Jin R, Ye Q, Wang J, Cao Z, Jiang D, Wang T, Kang Y, Xu W, Hsieh CY, Hou T. AttABseq: an attention-based deep learning prediction method for antigen-antibody binding affinity changes based on protein sequences. Brief Bioinform 2024;25:bbae304. [PMID: 38960407 PMCID: PMC11221889 DOI: 10.1093/bib/bbae304] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2023] [Revised: 04/15/2024] [Accepted: 06/11/2024] [Indexed: 07/05/2024]

Abstract

The optimization of therapeutic antibodies through traditional techniques, such as candidate screening via hybridoma or phage display, is resource-intensive and time-consuming. In recent years, computational and artificial intelligence-based methods have been actively developed to accelerate and improve the development of therapeutic antibodies. In this study, we developed an end-to-end sequence-based deep learning model, termed AttABseq, for the predictions of the antigen-antibody binding affinity changes connected with antibody mutations. AttABseq is a highly efficient and generic attention-based model by utilizing diverse antigen-antibody complex sequences as the input to predict the binding affinity changes of residue mutations. The assessment on the three benchmark datasets illustrates that AttABseq is 120% more accurate than other sequence-based models in terms of the Pearson correlation coefficient between the predicted and experimental binding affinity changes. Moreover, AttABseq also either outperforms or competes favorably with the structure-based approaches. Furthermore, AttABseq consistently demonstrates robust predictive capabilities across a diverse array of conditions, underscoring its remarkable capacity for generalization across a wide spectrum of antigen-antibody complexes. It imposes no constraints on the quantity of altered residues, rendering it particularly applicable in scenarios where crystallographic structures remain unavailable. The attention-based interpretability analysis indicates that the causal effects of point mutations on antibody-antigen binding affinity changes can be visualized at the residue level, which might assist automated antibody sequence optimization. We believe that AttABseq provides a fiercely competitive answer to therapeutic antibody optimization.
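The evaluation measure named in this abstract, the Pearson correlation coefficient between predicted and experimental binding affinity changes, can be computed as in the sketch below. The function name and the affinity values are hypothetical and are not data from the AttABseq benchmarks.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(vx * vy)

# Hypothetical predicted vs. experimental affinity changes (kcal/mol).
predicted = [0.5, 1.2, -0.3, 2.0, 0.1]
experimental = [0.7, 1.0, -0.5, 1.8, 0.4]
r = pearson_r(predicted, experimental)
```

Because the coefficient is bounded by -1 and 1, a relative improvement such as the one reported in the abstract is only meaningful when the baseline correlation is well below 1.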


Affiliation(s)

Ruofan Jin College of Pharmaceutical Science, Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Zhejiang University, Yuhangtang Road 866, Hangzhou 310058, Zhejiang, China College of Life Science, Zhejiang University, Yuhangtang Road 866, Hangzhou 310058, Zhejiang, China
Qing Ye College of Pharmaceutical Science, Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Zhejiang University, Yuhangtang Road 866, Hangzhou 310058, Zhejiang, China
Jike Wang College of Pharmaceutical Science, Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Zhejiang University, Yuhangtang Road 866, Hangzhou 310058, Zhejiang, China
Zheng Cao College of Computer Science and Technology, Zhejiang University, Yuhangtang Road 866, Hangzhou 310058, Zhejiang, China
Dejun Jiang College of Pharmaceutical Science, Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Zhejiang University, Yuhangtang Road 866, Hangzhou 310058, Zhejiang, China
Tianyue Wang College of Pharmaceutical Science, Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Zhejiang University, Yuhangtang Road 866, Hangzhou 310058, Zhejiang, China
Yu Kang College of Pharmaceutical Science, Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Zhejiang University, Yuhangtang Road 866, Hangzhou 310058, Zhejiang, China
Wanting Xu College of Pharmaceutical Science, Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Zhejiang University, Yuhangtang Road 866, Hangzhou 310058, Zhejiang, China
Chang-Yu Hsieh College of Pharmaceutical Science, Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Zhejiang University, Yuhangtang Road 866, Hangzhou 310058, Zhejiang, China
Tingjun Hou College of Pharmaceutical Science, Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Zhejiang University, Yuhangtang Road 866, Hangzhou 310058, Zhejiang, China


Zeng X, Meng FF, Wen ML, Li SJ, Li Y. GNNGL-PPI: multi-category prediction of protein-protein interactions using graph neural networks based on global graphs and local subgraphs. BMC Genomics 2024;25:406. [PMID: 38724906 PMCID: PMC11080243 DOI: 10.1186/s12864-024-10299-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2024] [Accepted: 04/10/2024] [Indexed: 05/13/2024]

Pan J, Zhang Z, Li Y, Yu J, You Z, Li C, Wang S, Zhu M, Ren F, Zhang X, Sun Y, Wang S. A microbial knowledge graph-based deep learning model for predicting candidate microbes for target hosts. Brief Bioinform 2024;25:bbae119. [PMID: 38555472 PMCID: PMC10981679 DOI: 10.1093/bib/bbae119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2023] [Revised: 02/23/2024] [Accepted: 03/02/2024] [Indexed: 04/02/2024]

Affiliation(s)

Jie Pan Key Laboratory of Resources Biology and Biotechnology in Western China, Ministry of Education, Provincial Key Laboratory of Biotechnology of Shaanxi Province, College of Life Sciences, Northwest University, Xi’an 710069, China
Zhen Zhang Key Laboratory of Resources Biology and Biotechnology in Western China, Ministry of Education, Provincial Key Laboratory of Biotechnology of Shaanxi Province, College of Life Sciences, Northwest University, Xi’an 710069, China
Ying Li Key Laboratory of Resources Biology and Biotechnology in Western China, Ministry of Education, Provincial Key Laboratory of Biotechnology of Shaanxi Province, College of Life Sciences, Northwest University, Xi’an 710069, China
Jiaoyang Yu Key Laboratory of Resources Biology and Biotechnology in Western China, Ministry of Education, Provincial Key Laboratory of Biotechnology of Shaanxi Province, College of Life Sciences, Northwest University, Xi’an 710069, China
Zhuhong You School of Computer Science, Northwestern Polytechnical University, Xi’an 710129, China
Chenyu Li Key Laboratory of Resources Biology and Biotechnology in Western China, Ministry of Education, Provincial Key Laboratory of Biotechnology of Shaanxi Province, College of Life Sciences, Northwest University, Xi’an 710069, China
Shixu Wang Key Laboratory of Resources Biology and Biotechnology in Western China, Ministry of Education, Provincial Key Laboratory of Biotechnology of Shaanxi Province, College of Life Sciences, Northwest University, Xi’an 710069, China
Minghui Zhu Key Laboratory of Resources Biology and Biotechnology in Western China, Ministry of Education, Provincial Key Laboratory of Biotechnology of Shaanxi Province, College of Life Sciences, Northwest University, Xi’an 710069, China
Fengzhi Ren North China Pharmaceutical Group, Shijiazhuang 050015, Hebei, China National Microbial Medicine Engineering & Research Center, Shijiazhuang 050015, Hebei, China
Xuexia Zhang North China Pharmaceutical Group, Shijiazhuang 050015, Hebei, China National Microbial Medicine Engineering & Research Center, Shijiazhuang 050015, Hebei, China
Yanmei Sun Key Laboratory of Resources Biology and Biotechnology in Western China, Ministry of Education, Provincial Key Laboratory of Biotechnology of Shaanxi Province, College of Life Sciences, Northwest University, Xi’an 710069, China
Shiwei Wang Key Laboratory of Resources Biology and Biotechnology in Western China, Ministry of Education, Provincial Key Laboratory of Biotechnology of Shaanxi Province, College of Life Sciences, Northwest University, Xi’an 710069, China


Ma Y, Zhao Y, Ma Y. Kernel Bayesian nonlinear matrix factorization based on variational inference for human-virus protein-protein interaction prediction. Sci Rep 2024;14:5693. [PMID: 38454139 PMCID: PMC10920681 DOI: 10.1038/s41598-024-56208-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2023] [Accepted: 03/04/2024] [Indexed: 03/09/2024]

Zhang HQ, Liu SH, Li R, Yu JW, Ye DX, Yuan SS, Lin H, Huang CB, Tang H. MIBPred: Ensemble Learning-Based Metal Ion-Binding Protein Classifier. ACS Omega 2024;9:8439-8447. [PMID: 38405489 PMCID: PMC10882704 DOI: 10.1021/acsomega.3c09587] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2023] [Revised: 01/16/2024] [Accepted: 01/22/2024] [Indexed: 02/27/2024]

Fu X, Yuan Y, Qiu H, Suo H, Song Y, Li A, Zhang Y, Xiao C, Li Y, Dou L, Zhang Z, Cui F. AGF-PPIS: A protein-protein interaction site predictor based on an attention mechanism and graph convolutional networks. Methods 2024;222:142-151. [PMID: 38242383 DOI: 10.1016/j.ymeth.2024.01.006] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/21/2023] [Revised: 01/04/2024] [Accepted: 01/13/2024] [Indexed: 01/21/2024]

Wu S, Feng T, Tang W, Qi C, Gao J, He X, Wang J, Zhou H, Fang Z. metaProbiotics: a tool for mining probiotic from metagenomic binning data based on a language model. Brief Bioinform 2024;25:bbae085. [PMID: 38487846 PMCID: PMC10940841 DOI: 10.1093/bib/bbae085] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2023] [Revised: 01/26/2024] [Accepted: 02/15/2024] [Indexed: 03/18/2024]

Abstract

Beneficial bacteria remain largely unexplored. Lacking systematic methods, understanding probiotic community traits becomes challenging, leading to various conclusions about their probiotic effects among different publications. We developed language model-based metaProbiotics to rapidly detect probiotic bins from metagenomes, demonstrating superior performance in simulated benchmark datasets. Testing on gut metagenomes from probiotic-treated individuals, it revealed the probioticity of intervention strains-derived bins and other probiotic-associated bins beyond the training data, such as a plasmid-like bin. Analyses of these bins revealed various probiotic mechanisms and bai operon as probiotic Ruminococcaceae's potential marker. In different health-disease cohorts, these bins were more common in healthy individuals, signifying their probiotic role, but relevant health predictions based on the abundance profiles of these bins faced cross-disease challenges. To better understand the heterogeneous nature of probiotics, we used metaProbiotics to construct a comprehensive probiotic genome set from global gut metagenomic data. Module analysis of this set shows that diseased individuals often lack certain probiotic gene modules, with significant variation of the missing modules across different diseases. Additionally, different gene modules on the same probiotic have heterogeneous effects on various diseases. We thus believe that gene function integrity of the probiotic community is more crucial in maintaining gut homeostasis than merely increasing specific gene abundance, and adding probiotics indiscriminately might not boost health. We expect that the innovative language model-based metaProbiotics tool will promote novel probiotic discovery using large-scale metagenomic data and facilitate systematic research on bacterial probiotic effects. The metaProbiotics program can be freely downloaded at https://github.com/zhenchengfang/metaProbiotics.

Yang X, Wuchty S, Liang Z, Ji L, Wang B, Zhu J, Zhang Z, Dong Y. Multi-modal features-based human-herpesvirus protein-protein interaction prediction by using LightGBM. Brief Bioinform 2024;25:bbae005. [PMID: 38279649 PMCID: PMC10818167 DOI: 10.1093/bib/bbae005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2023] [Revised: 12/25/2023] [Accepted: 01/01/2024] [Indexed: 01/28/2024]

Feng T, Wu S, Zhou H, Fang Z. MOBFinder: a tool for mobilization typing of plasmid metagenomic fragments based on a language model. Gigascience 2024;13:giae047. [PMID: 39101782 PMCID: PMC11299106 DOI: 10.1093/gigascience/giae047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2024] [Revised: 05/31/2024] [Accepted: 06/24/2024] [Indexed: 08/06/2024]

Chen HM, Liu JX, Liu D, Hao GF, Yang GF. Human-virus protein-protein interactions maps assist in revealing the pathogenesis of viral infection. Rev Med Virol 2024;34:e2517. [PMID: 38282401 DOI: 10.1002/rmv.2517] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2023] [Revised: 09/12/2023] [Accepted: 01/16/2024] [Indexed: 01/30/2024]

Wang S, Liu Y, Liu Y, Zhang Y, Zhu X. BERT-5mC: an interpretable model for predicting 5-methylcytosine sites of DNA based on BERT. PeerJ 2023;11:e16600. [PMID: 38089911 PMCID: PMC10712318 DOI: 10.7717/peerj.16600] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2023] [Accepted: 11/15/2023] [Indexed: 12/18/2023]

Braun M, Gruber CC, Krassnigg A, Kummer A, Lutz S, Oberdorfer G, Siirola E, Snajdrova R. Accelerating Biocatalysis Discovery with Machine Learning: A Paradigm Shift in Enzyme Engineering, Discovery, and Design. ACS Catal 2023;13:14454-14469. [PMID: 37942268 PMCID: PMC10629211 DOI: 10.1021/acscatal.3c03417] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 07/25/2023] [Revised: 09/29/2023] [Accepted: 10/03/2023] [Indexed: 11/10/2023]

Halsana AA, Chakroborty T, Halder AK, Basu S. DensePPI: A Novel Image-Based Deep Learning Method for Prediction of Protein-Protein Interactions. IEEE Trans Nanobioscience 2023;22:904-911. [PMID: 37028059 DOI: 10.1109/tnb.2023.3251192] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/05/2023]

Liu T, Gao H, Ren X, Xu G, Liu B, Wu N, Luo H, Wang Y, Tu T, Yao B, Guan F, Teng Y, Huang H, Tian J. Protein-protein interaction and site prediction using transfer learning. Brief Bioinform 2023;24:bbad376. [PMID: 37870286 DOI: 10.1093/bib/bbad376] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2023] [Revised: 09/14/2023] [Accepted: 10/02/2023] [Indexed: 10/24/2023]

Lee M. Recent Advances in Deep Learning for Protein-Protein Interaction Analysis: A Comprehensive Review. Molecules 2023;28:5169. [PMID: 37446831 DOI: 10.3390/molecules28135169] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 05/30/2023] [Revised: 06/30/2023] [Accepted: 06/30/2023] [Indexed: 07/15/2023]

Ao C, Ye X, Sakurai T, Zou Q, Yu L. m5U-SVM: identification of RNA 5-methyluridine modification sites based on multi-view features of physicochemical features and distributed representation. BMC Biol 2023;21:93. [PMID: 37095510 PMCID: PMC10127088 DOI: 10.1186/s12915-023-01596-0] [Citation(s) in RCA: 36] [Impact Index Per Article: 18.0] [Received: 12/20/2022] [Accepted: 04/12/2023] [Indexed: 04/26/2023]

Abstract

BACKGROUND

RNA 5-methyluridine (m5U) modifications are obtained by methylation at the C₅ position of uridine catalyzed by pyrimidine methylation transferase, which is related to the development of human diseases. Accurate identification of m5U modification sites from RNA sequences can contribute to the understanding of their biological functions and the pathogenesis of related diseases. Compared to traditional experimental methods, computational methods developed based on machine learning with ease of use can identify modification sites from RNA sequences in an efficient and time-saving manner. Despite the good performance of these computational methods, there are some drawbacks and limitations.

RESULTS

In this study, we have developed a novel predictor, m5U-SVM, based on multi-view features and machine learning algorithms to construct predictive models for identifying m5U modification sites from RNA sequences. In this method, we used four traditional physicochemical features and distributed representation features. The optimized multi-view features were obtained from the four fused traditional physicochemical features by using the two-step LightGBM and IFS methods, and then the distributed representation features were fused with the optimized physicochemical features to obtain the new multi-view features. The best performing classifier, support vector machine, was identified by screening different machine learning algorithms. Compared with the results, the performance of the proposed model is better than that of the existing state-of-the-art tool.

CONCLUSIONS

m5U-SVM provides an effective tool that successfully captures sequence-related attributes of modifications and can accurately predict m5U modification sites from RNA sequences. The identification of m5U modification sites helps to understand and delve into the related biological processes and functions.

Yang S, Yang Z, Yang J. 4mCBERT: A computing tool for the identification of DNA N4-methylcytosine sites by sequence- and chemical-derived information based on ensemble learning strategies. Int J Biol Macromol 2023;231:123180. [PMID: 36646347 DOI: 10.1016/j.ijbiomac.2023.123180] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 09/19/2022] [Revised: 11/26/2022] [Accepted: 12/30/2022] [Indexed: 01/15/2023]

Karamveer, Tiwary BK. Genomic coevolution of papillomavirus and immune system in placental mammals indicates the role of IFN-γ in the emergence of new variants. Carcinogenesis 2023:bgad007. [PMID: 36827464 DOI: 10.1093/carcin/bgad007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/25/2022] [Indexed: 02/26/2023]

Kang Y, Xu Y, Wang X, Pu B, Yang X, Rao Y, Chen J. HN-PPISP: a hybrid network based on MLP-Mixer for protein-protein interaction site prediction. Brief Bioinform 2023;24:6833645. [PMID: 36403092 DOI: 10.1093/bib/bbac480] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 07/05/2022] [Revised: 09/16/2022] [Accepted: 10/09/2022] [Indexed: 11/21/2022]

Abstract

MOTIVATION

Biological experimental approaches to protein-protein interaction (PPI) site prediction are critical for understanding the mechanisms of biochemical processes but are time-consuming and laborious. With the development of Deep Learning (DL) techniques, the most popular Convolutional Neural Networks (CNN)-based methods have been proposed to address these problems. Although significant progress has been made, these methods still have limitations in encoding the characteristics of each amino acid in protein sequences. Current methods cannot efficiently explore the nature of Position Specific Scoring Matrix (PSSM), secondary structure and raw protein sequences by processing them all together. For PPI site prediction, how to effectively model the PPI context with attention to prediction remains an open problem. In addition, the long-distance dependencies of PPI features are important, which is very challenging for many CNN-based methods because the innate ability of CNN is difficult to outperform auto-regressive models like Transformers.

RESULTS

To effectively mine the properties of PPI features, a novel hybrid neural network named HN-PPISP is proposed, which integrates a Multi-layer Perceptron Mixer (MLP-Mixer) module for local feature extraction and a two-stage multi-branch module for global feature capture. The model merits Transformer, TextCNN and Bi-LSTM as a powerful alternative for PPI site prediction. On the one hand, this is the first application of an advanced Transformer (i.e. MLP-Mixer) with a hybrid network for sequence-based PPI prediction. On the other hand, unlike existing methods that treat global features altogether, the proposed two-stage multi-branch hybrid module firstly assigns different attention scores to the input features and then encodes the feature through different branch modules. In the first stage, different improved attention modules are hybridized to extract features from the raw protein sequences, secondary structure and PSSM, respectively. In the second stage, a multi-branch network is designed to aggregate information from both branches in parallel. The two branches encode the features and extract dependencies through several operations such as TextCNN, Bi-LSTM and different activation functions. Experimental results on real-world public datasets show that our model consistently achieves state-of-the-art performance over seven remarkable baselines.

AVAILABILITY

The source code of HN-PPISP model is available at https://github.com/ylxu05/HN-PPISP.

Ma Y, Zhong J. Logistic tensor decomposition with sparse subspace learning for prediction of multiple disease types of human-virus protein-protein interactions. Brief Bioinform 2023;24:6961474. [PMID: 36573486 DOI: 10.1093/bib/bbac604] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/05/2022] [Revised: 12/04/2022] [Accepted: 12/08/2022] [Indexed: 12/28/2022]

Ortiz-Vilchis P, De-la-Cruz-García JS, Ramirez-Arellano A. Identification of Relevant Protein Interactions with Partial Knowledge: A Complex Network and Deep Learning Approach. Biology (Basel) 2023;12:140. [PMID: 36671832 PMCID: PMC9856098 DOI: 10.3390/biology12010140] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2022] [Revised: 01/11/2023] [Accepted: 01/12/2023] [Indexed: 01/18/2023]

Murakami Y, Mizuguchi K. Recent developments of sequence-based prediction of protein-protein interactions. Biophys Rev 2022;14:1393-1411. [PMID: 36589735 PMCID: PMC9789376 DOI: 10.1007/s12551-022-01038-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Accepted: 12/08/2022] [Indexed: 12/25/2022]

Neumann D, Roy S, Minhas FUAA, Ben-Hur A. On the choice of negative examples for prediction of host-pathogen protein interactions. Front Bioinform 2022;2:1083292. [PMID: 36591335 PMCID: PMC9798088 DOI: 10.3389/fbinf.2022.1083292] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2022] [Accepted: 11/14/2022] [Indexed: 12/23/2022]

Asim MN, Fazeel A, Ibrahim MA, Dengel A, Ahmed S. MP-VHPPI: Meta predictor for viral host protein-protein interaction prediction in multiple hosts and viruses. Front Med (Lausanne) 2022;9:1025887. [PMID: 36465911 PMCID: PMC9709337 DOI: 10.3389/fmed.2022.1025887] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2022] [Accepted: 10/17/2022] [Indexed: 09/19/2023]

Cross-attention PHV: Prediction of human and virus protein-protein interactions using cross-attention-based neural networks. Comput Struct Biotechnol J 2022;20:5564-5573. [PMID: 36249566 PMCID: PMC9546503 DOI: 10.1016/j.csbj.2022.10.012] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/09/2022] [Revised: 10/05/2022] [Accepted: 10/05/2022] [Indexed: 11/30/2022]

Abstract

• Cross-attention PHV implements two key technologies: cross-attention mechanism and 1D-CNN.
• It accurately predicts PPIs between human and unknown influenza viruses/SARS-CoV-2.
• It extracts critical taxonomic and evolutionary differences responsible for PPI prediction.

Viral infections represent a major health concern worldwide. The alarming rate at which SARS-CoV-2 spreads, for example, led to a worldwide pandemic. Viruses incorporate genetic material into the host genome to hijack host cell functions such as the cell cycle and apoptosis. In these viral processes, protein–protein interactions (PPIs) play critical roles. Therefore, the identification of PPIs between humans and viruses is crucial for understanding the infection mechanism and host immune responses to viral infections and for discovering effective drugs. Experimental methods including mass spectrometry-based proteomics and yeast two-hybrid assays are widely used to identify human-virus PPIs, but these experimental methods are time-consuming, expensive, and laborious. To overcome this problem, we developed a novel computational predictor, named cross-attention PHV, by implementing two key technologies of the cross-attention mechanism and a one-dimensional convolutional neural network (1D-CNN). The cross-attention mechanisms were very effective in enhancing prediction and generalization abilities. Application of 1D-CNN to the word2vec-generated feature matrices reduced computational costs, thus extending the allowable length of protein sequences to 9000 amino acid residues. Cross-attention PHV outperformed existing state-of-the-art models using a benchmark dataset and accurately predicted PPIs for unknown viruses. Cross-attention PHV also predicted human–SARS-CoV-2 PPIs with area under the curve values >0.95. The Cross-attention PHV web server and source codes are freely available at https://kurata35.bio.kyutech.ac.jp/Cross-attention_PHV/ and https://github.com/kuratahiroyuki/Cross-Attention_PHV, respectively.

Madan S, Demina V, Stapf M, Ernst O, Fröhlich H. Accurate prediction of virus-host protein-protein interactions via a Siamese neural network using deep protein sequence embeddings. Patterns (N Y) 2022;3:100551. [PMID: 36124304 PMCID: PMC9481957 DOI: 10.1016/j.patter.2022.100551] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2022] [Revised: 03/28/2022] [Accepted: 06/16/2022] [Indexed: 11/13/2022]

Kumar S, Kumar GS, Maitra SS, Malý P, Bharadwaj S, Sharma P, Dwivedi VD. Viral informatics: bioinformatics-based solution for managing viral infections. Brief Bioinform 2022;23:6659740. [PMID: 35947964 DOI: 10.1093/bib/bbac326] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 02/27/2022] [Revised: 06/26/2022] [Accepted: 07/18/2022] [Indexed: 11/13/2022]

Kurata H, Tsukiyama S, Manavalan B. iACVP: markedly enhanced identification of anti-coronavirus peptides using a dataset-specific word2vec model. Brief Bioinform 2022;23:6623727. [PMID: 35772910 DOI: 10.1093/bib/bbac265] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 04/06/2022] [Revised: 05/23/2022] [Accepted: 06/06/2022] [Indexed: 01/22/2023]

Decoding the protein-ligand interactions using parallel graph neural networks. Sci Rep 2022;12:7624. [PMID: 35538084 PMCID: PMC9086424 DOI: 10.1038/s41598-022-10418-2] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 12/09/2021] [Accepted: 04/06/2022] [Indexed: 12/13/2022]

Abstract

Protein-ligand interactions (PLIs) are essential for biochemical functionality and their identification is crucial for estimating biophysical properties for rational therapeutic design. Currently, experimental characterization of these properties is the most accurate method; however, it is very time-consuming and labor-intensive. A number of computational methods have been developed in this context, but most existing PLI prediction methods depend heavily on 2D protein sequence data. Here, we present a novel parallel graph neural network (GNN) to integrate knowledge representation and reasoning for PLI prediction to perform deep learning guided by expert knowledge and informed by 3D structural data. We develop two distinct GNN architectures: [Formula: see text] is the base implementation that employs distinct featurization to enhance domain-awareness, while [Formula: see text] is a novel implementation that can predict with no prior knowledge of the intermolecular interactions. The comprehensive evaluation demonstrated that GNN can successfully capture the binary interactions between ligand and protein's 3D structure with 0.979 test accuracy for [Formula: see text] and 0.958 for [Formula: see text] for predicting activity of a protein-ligand complex. These models are further adapted for regression tasks to predict experimental binding affinities and [Formula: see text] crucial for compound's potency and efficacy. We achieve a Pearson correlation coefficient of 0.66 and 0.65 on experimental affinity and 0.50 and 0.51 on [Formula: see text] with [Formula: see text] and [Formula: see text], respectively, outperforming similar 2D sequence based models. Our method can serve as an interpretable and explainable artificial intelligence (AI) tool for predicted activity, potency, and biophysical properties of lead candidates. To this end, we show the utility of [Formula: see text] on SARS-CoV-2 protein targets by screening a large compound library and comparing the prediction with the experimentally measured data.

Yang X, Yang S, Ren P, Wuchty S, Zhang Z. Deep Learning-Powered Prediction of Human-Virus Protein-Protein Interactions. Front Microbiol 2022;13:842976. [PMID: 35495666 PMCID: PMC9051481 DOI: 10.3389/fmicb.2022.842976] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/24/2021] [Accepted: 03/25/2022] [Indexed: 11/13/2022]