For: Zhang Y, Lin J, Zhao L, Zeng X, Liu X. A novel antibacterial peptide recognition algorithm based on BERT. Brief Bioinform 2021;22:6284370. [PMID: 34037687 DOI: 10.1093/bib/bbab200] [Citation(s) in RCA: 51] [Impact Index Per Article: 12.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/08/2020] [Revised: 04/19/2021] [Accepted: 05/03/2021] [Indexed: 12/31/2022] Open

Cited by Other Article(s)

Su L, Ma Z, Ji H, Kong J, Yan W, Zhang Q, Li J, Zuo M. From prediction to design: Revealing the mechanisms of umami peptides using interpretable deep learning, quantum chemical simulations, and module substitution. Food Chem 2025;483:144301. [PMID: 40233511 DOI: 10.1016/j.foodchem.2025.144301] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/07/2025] [Revised: 03/24/2025] [Accepted: 04/08/2025] [Indexed: 04/17/2025]

Asim MN, Asif T, Hassan F, Dengel A. Protein Sequence Analysis landscape: A Systematic Review of Task Types, Databases, Datasets, Word Embeddings Methods, and Language Models. Database (Oxford) 2025;2025:baaf027. [PMID: 40448683 DOI: 10.1093/database/baaf027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/28/2024] [Revised: 02/06/2025] [Accepted: 03/26/2025] [Indexed: 06/02/2025]

Abstract

Protein sequence analysis examines the order of amino acids within protein sequences to unlock a wealth of knowledge about biological processes and genetic disorders. It helps forecast disease susceptibility by finding unique protein signatures, or biomarkers, that are linked to particular disease states. Protein sequence analysis through wet-lab experiments is expensive, time-consuming, and error-prone. To facilitate large-scale proteomics sequence analysis, the biological community is striving to use AI to move from wet-lab experiments to computer-aided applications. However, proteomics and AI are two distinct fields, and developing AI-driven protein sequence analysis applications requires knowledge of both domains. Various review articles have been written to bridge the gap between the two fields, but they focus on a few individual tasks or specific applications rather than providing a comprehensive overview of the wide range of tasks and applications. To meet the need for a comprehensive review that presents a holistic view of this wide array of tasks and applications, this manuscript makes several contributions. It bridges the gap between the proteomics and AI fields by presenting a comprehensive array of AI-driven applications for 63 distinct protein sequence analysis tasks. It equips AI researchers with the biological foundations of the 63 protein sequence analysis tasks. It supports the development of AI-driven protein sequence analysis applications by providing comprehensive details of 68 protein databases. It presents a rich data landscape encompassing 627 benchmark datasets for the 63 diverse protein sequence analysis tasks. It highlights the use of 25 unique word embedding methods and 13 language models in AI-driven protein sequence analysis applications. It accelerates the development of AI-driven applications by reporting current state-of-the-art performance across the 63 protein sequence analysis tasks.

Liang Y, Li M. A deep learning model for prediction of lysine crotonylation sites by fusing multi-features based on multi-head self-attention mechanism. Sci Rep 2025;15:18940. [PMID: 40442183 PMCID: PMC12122789 DOI: 10.1038/s41598-025-04058-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/08/2025] [Accepted: 05/23/2025] [Indexed: 06/02/2025] Open

Ji L, Hou W, Zhou H, Xiong L, Liu C, Yuan Z, Li L. EBMGP: a deep learning model for genomic prediction based on Elastic Net feature selection and bidirectional encoder representations from transformer's embedding and multi-head attention pooling. Theor Appl Genet 2025;138:103. [PMID: 40253568 PMCID: PMC12009238 DOI: 10.1007/s00122-025-04894-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 10/13/2024] [Accepted: 03/27/2025] [Indexed: 04/21/2025]

Yue Y, Fan H, Zhao J, Xia J. Protein language model-based prediction for plant miRNA encoded peptides. PeerJ Comput Sci 2025;11:e2733. [PMID: 40134870 PMCID: PMC11935769 DOI: 10.7717/peerj-cs.2733] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/18/2024] [Accepted: 02/05/2025] [Indexed: 03/27/2025]

Li F, Bin Y, Zhao J, Zheng C. DeepPD: A Deep Learning Method for Predicting Peptide Detectability Based on Multi-feature Representation and Information Bottleneck. Interdiscip Sci 2025;17:200-214. [PMID: 39661307 DOI: 10.1007/s12539-024-00665-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/18/2024] [Revised: 10/07/2024] [Accepted: 10/09/2024] [Indexed: 12/12/2024]

Clark JD, Mi X, Mitchell DA, Shukla D. Substrate prediction for RiPP biosynthetic enzymes via masked language modeling and transfer learning. Digit Discov 2025;4:343-354. [PMID: 39649639 PMCID: PMC11622008 DOI: 10.1039/d4dd00170b] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/20/2024] [Accepted: 11/28/2024] [Indexed: 12/11/2024]

Zhao J, Liu H, Kang L, Gao W, Lu Q, Rao Y, Yue Z. deep-AMPpred: A Deep Learning Method for Identifying Antimicrobial Peptides and Their Functional Activities. J Chem Inf Model 2025;65:997-1008. [PMID: 39792442 DOI: 10.1021/acs.jcim.4c01913] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/12/2025]

Affiliation(s)

Jun Zhao: School of Information and Artificial Intelligence, Anhui Provincial Engineering Research Center for Beidou Precision Agriculture Information, Key Laboratory of Agricultural Sensors for Ministry of Agriculture and Rural Affairs, Anhui Agricultural University, Hefei, Anhui 230036, China
Hangcheng Liu: School of Information and Artificial Intelligence, Anhui Provincial Engineering Research Center for Beidou Precision Agriculture Information, Key Laboratory of Agricultural Sensors for Ministry of Agriculture and Rural Affairs, Anhui Agricultural University, Hefei, Anhui 230036, China
Leyao Kang: School of Information and Artificial Intelligence, Anhui Provincial Engineering Research Center for Beidou Precision Agriculture Information, Key Laboratory of Agricultural Sensors for Ministry of Agriculture and Rural Affairs, Anhui Agricultural University, Hefei, Anhui 230036, China
Wanling Gao: School of Information and Artificial Intelligence, Anhui Provincial Engineering Research Center for Beidou Precision Agriculture Information, Key Laboratory of Agricultural Sensors for Ministry of Agriculture and Rural Affairs, Anhui Agricultural University, Hefei, Anhui 230036, China
Quan Lu: School of Information and Artificial Intelligence, Anhui Provincial Engineering Research Center for Beidou Precision Agriculture Information, Key Laboratory of Agricultural Sensors for Ministry of Agriculture and Rural Affairs, Anhui Agricultural University, Hefei, Anhui 230036, China
Yuan Rao: School of Information and Artificial Intelligence, Anhui Provincial Engineering Research Center for Beidou Precision Agriculture Information, Key Laboratory of Agricultural Sensors for Ministry of Agriculture and Rural Affairs, Anhui Agricultural University, Hefei, Anhui 230036, China
Zhenyu Yue: School of Information and Artificial Intelligence, Anhui Provincial Engineering Research Center for Beidou Precision Agriculture Information, Key Laboratory of Agricultural Sensors for Ministry of Agriculture and Rural Affairs, Anhui Agricultural University, Hefei, Anhui 230036, China; Research Center for Biological Breeding Technology, Advance Academy, Anhui Agricultural University, Hefei, Anhui 230036, China

Luo J, Zhao K, Chen J, Yang C, Qu F, Liu Y, Jin X, Yan K, Zhang Y, Liu B. iMFP-LG: Identify Novel Multi-functional Peptides Using Protein Language Models and Graph-based Deep Learning. Genomics Proteomics Bioinformatics 2025;22:qzae084. [PMID: 39585308 PMCID: PMC12011362 DOI: 10.1093/gpbjnl/qzae084] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/10/2023] [Revised: 10/25/2024] [Accepted: 11/21/2024] [Indexed: 11/26/2024]

Guan C, Fernandes FC, Franco OL, de la Fuente-Nunez C. Leveraging large language models for peptide antibiotic design. Cell Rep Phys Sci 2025;6:102359. [PMID: 39949833 PMCID: PMC11823563 DOI: 10.1016/j.xcrp.2024.102359] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Indexed: 02/16/2025]

Affiliation(s)

Changge Guan: Machine Biology Group, Departments of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; Departments of Bioengineering and Chemical and Biomolecular Engineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA, USA; Department of Chemistry, School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA, USA; Penn Institute for Computational Science, University of Pennsylvania, Philadelphia, PA, USA (these authors contributed equally)
Fabiano C. Fernandes: Centro de Análises Proteômicas e Bioquímicas, Pós-Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília, Brazil; Departamento de Ciência da Computação, Instituto Federal de Brasília, Campus Taguatinga, Brasília, Brazil (these authors contributed equally)
Octavio L. Franco: Centro de Análises Proteômicas e Bioquímicas, Pós-Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília, Brazil; S-Inova Biotech, Programa de Pós-Graduação em Biotecnologia, Universidade Católica Dom Bosco, Campo Grande, Brazil
Cesar de la Fuente-Nunez: Machine Biology Group, Departments of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; Departments of Bioengineering and Chemical and Biomolecular Engineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA, USA; Department of Chemistry, School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA, USA; Penn Institute for Computational Science, University of Pennsylvania, Philadelphia, PA, USA

Bizzotto E, Zampieri G, Treu L, Filannino P, Di Cagno R, Campanaro S. Classification of bioactive peptides: A systematic benchmark of models and encodings. Comput Struct Biotechnol J 2024;23:2442-2452. [PMID: 38867723 PMCID: PMC11168199 DOI: 10.1016/j.csbj.2024.05.040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/19/2024] [Revised: 05/10/2024] [Accepted: 05/22/2024] [Indexed: 06/14/2024] Open

Luo X, Chi ASY, Lin AH, Ong TJ, Wong L, Rahman CR. Benchmarking recent computational tools for DNA-binding protein identification. Brief Bioinform 2024;26:bbae634. [PMID: 39657630 PMCID: PMC11630855 DOI: 10.1093/bib/bbae634] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/05/2024] [Revised: 10/29/2024] [Accepted: 11/20/2024] [Indexed: 12/12/2024] Open

Liu X, Luo J, Wang X, Zhang Y, Chen J. Directed evolution of antimicrobial peptides using multi-objective zeroth-order optimization. Brief Bioinform 2024;26:bbae715. [PMID: 39800873 PMCID: PMC11725395 DOI: 10.1093/bib/bbae715] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/01/2024] [Revised: 12/08/2024] [Accepted: 12/27/2024] [Indexed: 01/16/2025] Open

Tang Q, Xiang Y, Gao W, Zhu L, Xu Z, Li Y, Yue Z. TeaTFactor: A Prediction Tool for Tea Plant Transcription Factors Based on BERT. IEEE/ACM Trans Comput Biol Bioinform 2024;21:2123-2132. [PMID: 39150804 DOI: 10.1109/tcbb.2024.3444466] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 08/18/2024]

Qi D, Song C, Liu T. PreDBP-PLMs: Prediction of DNA-binding proteins based on pre-trained protein language models and convolutional neural networks. Anal Biochem 2024;694:115603. [PMID: 38986796 DOI: 10.1016/j.ab.2024.115603] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/25/2024] [Revised: 06/15/2024] [Accepted: 07/06/2024] [Indexed: 07/12/2024]

Gao W, Zhao J, Gui J, Wang Z, Chen J, Yue Z. Comprehensive Assessment of BERT-Based Methods for Predicting Antimicrobial Peptides. J Chem Inf Model 2024;64:7772-7785. [PMID: 39316765 DOI: 10.1021/acs.jcim.4c00507] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 09/26/2024]

Zhao Y, Zhang S, Liang Y. HemoFuse: multi-feature fusion based on multi-head cross-attention for identification of hemolytic peptides. Sci Rep 2024;14:22518. [PMID: 39342017 PMCID: PMC11438874 DOI: 10.1038/s41598-024-74326-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/03/2024] [Accepted: 09/25/2024] [Indexed: 10/01/2024] Open

Zhang B, Hou Z, Yang Y, Wong KC, Zhu H, Li X. SOFB is a comprehensive ensemble deep learning approach for elucidating and characterizing protein-nucleic-acid-binding residues. Commun Biol 2024;7:679. [PMID: 38830995 PMCID: PMC11148103 DOI: 10.1038/s42003-024-06332-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/23/2023] [Accepted: 05/15/2024] [Indexed: 06/05/2024] Open

Abstract

Proteins and nucleic-acids are essential components of living organisms that interact in critical cellular processes. Accurate prediction of nucleic acid-binding residues in proteins can contribute to a better understanding of protein function. However, the discrepancy between protein sequence information and obtained structural and functional data renders most current computational models ineffective. Therefore, it is vital to design computational models based on protein sequence information to identify nucleic acid binding sites in proteins. Here, we implement an ensemble deep learning model-based nucleic-acid-binding residues on proteins identification method, called SOFB, which characterizes protein sequences by learning the semantics of biological dynamics contexts, and then develop an ensemble deep learning-based sequence network to learn feature representation and classification by explicitly modeling dynamic semantic information. Among them, the language learning model, which is constructed from natural language to biological language, captures the underlying relationships of protein sequences, and the ensemble deep learning-based sequence network consisting of different convolutional layers together with Bi-LSTM refines various features for optimal performance. Meanwhile, to address the imbalanced issue, we adopt ensemble learning to train multiple models and then incorporate them. Our experimental results on several DNA/RNA nucleic-acid-binding residue datasets demonstrate that our proposed model outperforms other state-of-the-art methods. In addition, we conduct an interpretability analysis of the identified nucleic acid binding residue sequences based on the attention weights of the language learning model, revealing novel insights into the dynamic semantic information that supports the identified nucleic acid binding residues. 
SOFB is available at https://github.com/Encryptional/SOFB and https://figshare.com/articles/online_resource/SOFB_figshare_rar/25499452 .

Chaudhari JK, Pant S, Jha R, Pathak RK, Singh DB. Biological big-data sources, problems of storage, computational issues, and applications: a comprehensive review. Knowl Inf Syst 2024;66:3159-3209. [DOI: 10.1007/s10115-023-02049-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/18/2023] [Revised: 09/12/2023] [Accepted: 12/11/2023] [Indexed: 01/03/2025]

Cordoves-Delgado G, García-Jacas CR. Predicting Antimicrobial Peptides Using ESMFold-Predicted Structures and ESM-2-Based Amino Acid Features with Graph Deep Learning. J Chem Inf Model 2024;64:4310-4321. [PMID: 38739853 DOI: 10.1021/acs.jcim.3c02061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/16/2024]

Abstract

Currently, antimicrobial resistance constitutes a serious threat to human health. Drugs based on antimicrobial peptides (AMPs) constitute one of the alternatives to address it. Shallow and deep learning (DL)-based models have mainly been built from amino acid sequences to predict AMPs. Recent advances in tertiary (3D) structure prediction have opened new opportunities in this field. In this sense, models based on graphs derived from predicted peptide structures have recently been proposed. However, these models are not in line with state-of-the-art approaches to codify evolutionary information, and, in addition, they are memory- and time-consuming because they depend on multiple sequence alignment. Herein, we present a framework to create alignment-free models based on graph representations generated from ESMFold-predicted peptide structures, whose nodes are characterized with amino acid-level evolutionary information derived from the Evolutionary Scale Modeling (ESM-2) models. A graph attention network (GAT) was implemented to assess the usefulness of the framework in AMP classification. To this end, a set comprising 67,058 peptides was used. It was demonstrated that the proposed methodology allowed building GAT models with generalization abilities consistently better than those of 20 state-of-the-art non-DL-based and DL-based models. The best GAT models were developed using evolutionary information derived from the 36- and 33-layer ESM-2 models. Similarity studies showed that the best-built GAT models codified different chemical spaces, and thus they were fused to significantly improve the classification. In general, the results suggest that esm-AxP-GDL is a promising tool to develop good, structure-dependent, and alignment-free models that can be successfully applied in the screening of large data sets. This framework should be useful not only for classifying AMPs but also for modeling other peptide and protein activities.

Lobanov MY, Slizen MV, Dovidchenko NV, Panfilov AV, Surin AA, Likhachev IV, Galzitskaya OV. Comparison of deep learning models with simple method to assess the problem of antimicrobial peptides prediction. Mol Inform 2024;43:e202200181. [PMID: 36961202 DOI: 10.1002/minf.202200181] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/27/2022] [Revised: 03/20/2023] [Accepted: 03/23/2023] [Indexed: 03/25/2023]

Li H, Meng J, Wang Z, Tang Y, Xia S, Wang Y, Qin Z, Luan Y. miPEPPred-FRL: A Novel Method for Predicting Plant MiRNA-Encoded Peptides Using Adaptive Feature Representation Learning. J Chem Inf Model 2024;64:2889-2900. [PMID: 37733290 DOI: 10.1021/acs.jcim.3c01020] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 09/22/2023]

Chen L, Hu Z, Rong Y, Lou B. Deep2Pep: A deep learning method in multi-label classification of bioactive peptide. Comput Biol Chem 2024;109:108021. [PMID: 38308955 DOI: 10.1016/j.compbiolchem.2024.108021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/22/2023] [Revised: 12/27/2023] [Accepted: 01/18/2024] [Indexed: 02/05/2024]

Palacios A, Acharya P, Peidl A, Beck M, Blanco E, Mishra A, Bawa-Khalfe T, Pakhrin S. SumoPred-PLM: human SUMOylation and SUMO2/3 sites Prediction using Pre-trained Protein Language Model. NAR Genom Bioinform 2024;6:lqae011. [PMID: 38327870 PMCID: PMC10849187 DOI: 10.1093/nargab/lqae011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/21/2023] [Revised: 11/17/2023] [Accepted: 01/17/2024] [Indexed: 02/09/2024] Open

Zhuang J, Gao W, Su R. EnAMP: A novel deep learning ensemble antibacterial peptide recognition algorithm based on multi-features. J Bioinform Comput Biol 2024;22:2450001. [PMID: 38406833 DOI: 10.1142/s021972002450001x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/27/2024]

Liu F, Yuan C, Chen H, Yang F. Prediction of linear B-cell epitopes based on protein sequence features and BERT embeddings. Sci Rep 2024;14:2464. [PMID: 38291341 PMCID: PMC10828400 DOI: 10.1038/s41598-024-53028-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/06/2023] [Accepted: 01/26/2024] [Indexed: 02/01/2024] Open

Wang R, Wang T, Zhuo L, Wei J, Fu X, Zou Q, Yao X. Diff-AMP: tailored designed antimicrobial peptide framework with all-in-one generation, identification, prediction and optimization. Brief Bioinform 2024;25:bbae078. [PMID: 38446739 PMCID: PMC10939340 DOI: 10.1093/bib/bbae078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/23/2023] [Revised: 01/25/2024] [Accepted: 02/08/2024] [Indexed: 03/08/2024] Open

Yu H, Wang R, Qiao J, Wei L. Multi-CGAN: Deep Generative Model-Based Multiproperty Antimicrobial Peptide Design. J Chem Inf Model 2024;64:316-326. [PMID: 38135439 DOI: 10.1021/acs.jcim.3c01881] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/24/2023]

Wang S, Liu Y, Liu Y, Zhang Y, Zhu X. BERT-5mC: an interpretable model for predicting 5-methylcytosine sites of DNA based on BERT. PeerJ 2023;11:e16600. [PMID: 38089911 PMCID: PMC10712318 DOI: 10.7717/peerj.16600] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/04/2023] [Accepted: 11/15/2023] [Indexed: 12/18/2023] Open

Ma Y, Pei Y, Li C. Predictive Recognition of DNA-binding Proteins Based on Pre-trained Language Model BERT. J Bioinform Comput Biol 2023;21:2350028. [PMID: 38248912 DOI: 10.1142/s0219720023500282] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/23/2024]

Le NQK. Leveraging transformers-based language models in proteome bioinformatics. Proteomics 2023;23:e2300011. [PMID: 37381841 DOI: 10.1002/pmic.202300011] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/14/2023] [Revised: 06/13/2023] [Accepted: 06/13/2023] [Indexed: 06/30/2023]

Guan C, Luo J, Li S, Tan ZL, Wang Y, Chen H, Yamamoto N, Zhang C, Lu Y, Chen J, Xing XH. Exploration of DPP-IV Inhibitory Peptide Design Rules Assisted by the Deep Learning Pipeline That Identifies the Restriction Enzyme Cutting Site. ACS Omega 2023;8:39662-39672. [PMID: 37901493 PMCID: PMC10601436 DOI: 10.1021/acsomega.3c05571] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/30/2023] [Accepted: 09/27/2023] [Indexed: 10/31/2023]

Affiliation(s)

Changge Guan: Key Laboratory for Industrial Biocatalysis, Ministry of Education of China, Department of Chemical Engineering, Tsinghua University, Beijing 100084, China
Jiawei Luo: Department of Computer Science and Technology, Harbin Institute of Technology, Shenzhen 518055, China
Shucheng Li: Key Laboratory for Industrial Biocatalysis, Ministry of Education of China, Department of Chemical Engineering, Tsinghua University, Beijing 100084, China
Zheng Lin Tan: School of Life Science and Technology, Tokyo Institute of Technology, 4259 Nagatsutacho, Midori Ward, Yokohama, Kanagawa Prefecture 226-0026, Japan
Yi Wang: Key Laboratory for Industrial Biocatalysis, Ministry of Education of China, Department of Chemical Engineering, Tsinghua University, Beijing 100084, China
Haihong Chen: Institute of Biopharmaceutical and Health Engineering, Tsinghua Shenzhen International Graduate School, Shenzhen 518055, China; Institute of Biomedical Health Technology and Engineering, Shenzhen Bay Laboratory, Shenzhen 518118, China
Naoyuki Yamamoto: School of Life Science and Technology, Tokyo Institute of Technology, 4259 Nagatsutacho, Midori Ward, Yokohama, Kanagawa Prefecture 226-0026, Japan
Chong Zhang: Key Laboratory for Industrial Biocatalysis, Ministry of Education of China, Department of Chemical Engineering, Tsinghua University, Beijing 100084, China; Center for Synthetic and Systems Biology, Tsinghua University, Beijing 100084, China
Yuan Lu: Key Laboratory for Industrial Biocatalysis, Ministry of Education of China, Department of Chemical Engineering, Tsinghua University, Beijing 100084, China
Junjie Chen: Department of Computer Science and Technology, Harbin Institute of Technology, Shenzhen 518055, China
Xin-Hui Xing: Key Laboratory for Industrial Biocatalysis, Ministry of Education of China, Department of Chemical Engineering, Tsinghua University, Beijing 100084, China; Institute of Biopharmaceutical and Health Engineering, Tsinghua Shenzhen International Graduate School, Shenzhen 518055, China; Institute of Biomedical Health Technology and Engineering, Shenzhen Bay Laboratory, Shenzhen 518118, China; Center for Synthetic and Systems Biology, Tsinghua University, Beijing 100084, China

Zhang J, Yan W, Zhang Q, Li Z, Liang L, Zuo M, Zhang Y. Umami-BERT: An interpretable BERT-based model for umami peptides prediction. Food Res Int 2023;172:113142. [PMID: 37689906 DOI: 10.1016/j.foodres.2023.113142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/04/2023] [Revised: 06/12/2023] [Accepted: 06/13/2023] [Indexed: 09/11/2023]

Yao L, Zhang Y, Li W, Chung C, Guan J, Zhang W, Chiang Y, Lee T. DeepAFP: An effective computational framework for identifying antifungal peptides based on deep learning. Protein Sci 2023;32:e4758. [PMID: 37595093 PMCID: PMC10503419 DOI: 10.1002/pro.4758] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/05/2023] [Revised: 08/02/2023] [Accepted: 08/10/2023] [Indexed: 08/20/2023]

Ju H, Bai J, Jiang J, Che Y, Chen X. Comparative evaluation and analysis of DNA N4-methylcytosine methylation sites using deep learning. Front Genet 2023;14:1254827. [PMID: 37671040 PMCID: PMC10476523 DOI: 10.3389/fgene.2023.1254827] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/07/2023] [Accepted: 07/31/2023] [Indexed: 09/07/2023] Open

Xu J, Li F, Li C, Guo X, Landersdorfer C, Shen HH, Peleg AY, Li J, Imoto S, Yao J, Akutsu T, Song J. iAMPCN: a deep-learning approach for identifying antimicrobial peptides and their functional activities. Brief Bioinform 2023;24:bbad240. [PMID: 37369638 PMCID: PMC10359087 DOI: 10.1093/bib/bbad240] [Citation(s) in RCA: 37] [Impact Index Per Article: 18.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/26/2022] [Revised: 05/30/2023] [Accepted: 06/08/2023] [Indexed: 06/29/2023] Open

Abstract

Antimicrobial peptides (AMPs) are short peptides that play crucial roles in diverse biological processes and have various functional activities against target organisms. Due to the abuse of chemical antibiotics and microbial pathogens' increasing resistance to antibiotics, AMPs have the potential to be alternatives to antibiotics. As such, the identification of AMPs has become a widely discussed topic. A variety of computational approaches have been developed to identify AMPs based on machine learning algorithms. However, most of them are not capable of predicting the functional activities of AMPs, and those predictors that can specify activities only focus on a few of them. In this study, we first surveyed 10 predictors that can identify AMPs and their functional activities in terms of the features they employed and the algorithms they utilized. Then, we constructed comprehensive AMP datasets and proposed a new deep learning-based framework, iAMPCN (identification of AMPs based on CNNs), to identify AMPs and their related 22 functional activities. Our experiments demonstrate that iAMPCN significantly improved the prediction performance of AMPs and their corresponding functional activities based on four types of sequence features. Benchmarking experiments on the independent test datasets showed that iAMPCN outperformed a number of state-of-the-art approaches for predicting AMPs and their functional activities. Furthermore, we analyzed the amino acid preferences of different AMP activities and evaluated the model on datasets of varying sequence redundancy thresholds. To facilitate the community-wide identification of AMPs and their corresponding functional types, we have made the source codes of iAMPCN publicly available at https://github.com/joy50706/iAMPCN/tree/master. We anticipate that iAMPCN can be explored as a valuable tool for identifying potential AMPs with specific functional activities for further experimental validation.

Affiliation(s)

Jing Xu: Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia; Monash Data Futures Institute, Monash University, Melbourne, VIC 3800, Australia
Fuyi Li: Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia; College of Information Engineering, Northwest A&F University, Shaanxi 712100, China; The Peter Doherty Institute for Infection and Immunity, The University of Melbourne, Melbourne, VIC 3800, Australia
Chen Li: Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia; Monash Data Futures Institute, Monash University, Melbourne, VIC 3800, Australia
Xudong Guo: College of Information Engineering, Northwest A&F University, Shaanxi 712100, China
Cornelia Landersdorfer: Monash Institute of Pharmaceutical Sciences, Monash University, Melbourne, VIC 3800, Australia
Hsin-Hui Shen: Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia; Department of Materials Science and Engineering, Faculty of Engineering, Monash University, Clayton, VIC, 3800, Australia
Anton Y Peleg: Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia; Department of Infectious Diseases, Alfred Hospital, Alfred Health, Melbourne, Victoria, Australia
Jian Li: Monash Biomedicine Discovery Institute and Department of Microbiology, Monash University, Melbourne, VIC 3800, Australia
Seiya Imoto: Division of Health Medical Intelligence, Human Genome Center, Institute of Medical Science, The University of Tokyo, Minato-ku, Tokyo, Japan; Collaborative Research Institute for Innovative Microbiology, The University of Tokyo, Bunkyo-ku, Tokyo, Japan
Jianhua Yao: Tencent AI Lab, Tencent, Shenzhen, China
Tatsuya Akutsu: Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji 611-0011, Japan
Jiangning Song: Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia; Monash Data Futures Institute, Monash University, Melbourne, VIC 3800, Australia; Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji 611-0011, Japan

Jing Y, Zhang S, Wang H. DapNet-HLA: Adaptive dual-attention mechanism network based on deep learning to predict non-classical HLA binding sites. Anal Biochem 2023;666:115075. [PMID: 36740003 DOI: 10.1016/j.ab.2023.115075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/14/2022] [Revised: 01/30/2023] [Accepted: 02/02/2023] [Indexed: 02/05/2023]

Liu Y, Wang S, Li X, Liu Y, Zhu X. NeuroPpred-SVM: A New Model for Predicting Neuropeptides Based on Embeddings of BERT. J Proteome Res 2023;22:718-728. [PMID: 36749151 DOI: 10.1021/acs.jproteome.2c00363] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/08/2023]

Deep learning drives efficient discovery of novel antihypertensive peptides from soybean protein isolate. Food Chem 2023;404:134690. [DOI: 10.1016/j.foodchem.2022.134690] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/25/2022] [Revised: 09/29/2022] [Accepted: 10/17/2022] [Indexed: 11/06/2022]

Yu H, Luo X. IPPF-FE: an integrated peptide and protein function prediction framework based on fused features and ensemble models. Brief Bioinform 2023;24:6834141. [PMID: 36403184 DOI: 10.1093/bib/bbac476] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/05/2022] [Revised: 09/23/2022] [Accepted: 10/05/2022] [Indexed: 11/21/2022] Open

Liu Y, Liu Y, Wang S, Zhu X. LBCE-XGB: A XGBoost Model for Predicting Linear B-Cell Epitopes Based on BERT Embeddings. Interdiscip Sci 2023;15:293-305. [PMID: 36646842 DOI: 10.1007/s12539-023-00549-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/14/2022] [Revised: 12/28/2022] [Accepted: 01/03/2023] [Indexed: 01/18/2023]

Application of a deep generative model produces novel and diverse functional peptides against microbial resistance. Comput Struct Biotechnol J 2022;21:463-471. [PMID: 36618982 PMCID: PMC9804011 DOI: 10.1016/j.csbj.2022.12.029] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/18/2022] [Revised: 12/13/2022] [Accepted: 12/16/2022] [Indexed: 12/23/2022] Open

IUP-BERT: Identification of Umami Peptides Based on BERT Features. Foods 2022;11:foods11223742. [PMID: 36429332 PMCID: PMC9689418 DOI: 10.3390/foods11223742] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/23/2022] [Revised: 11/14/2022] [Accepted: 11/16/2022] [Indexed: 11/23/2022] Open

García-Jacas CR, García-González LA, Martinez-Rios F, Tapia-Contreras IP, Brizuela CA. Handcrafted versus non-handcrafted (self-supervised) features for the classification of antimicrobial peptides: complementary or redundant? Brief Bioinform 2022;23:6754757. [PMID: 36215083 DOI: 10.1093/bib/bbac428] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/22/2022] [Revised: 08/28/2022] [Accepted: 09/02/2022] [Indexed: 12/14/2022] Open

Abstract

Antimicrobial peptides (AMPs) have received a great deal of attention given their potential to become a plausible option to fight multi-drug resistant bacteria as well as other pathogens. Quantitative sequence-activity models (QSAMs) have been helpful in discovering new AMPs because they allow the exploration of a large universe of peptide sequences and help reduce the number of wet lab experiments. A main aspect in the building of QSAMs based on shallow learning is to determine an optimal set of protein descriptors (features) required to discriminate between sequences with different antimicrobial activities. These features are generally handcrafted from peptide sequence datasets that are labeled with specific antimicrobial activities. However, recent developments have shown that unsupervised approaches can be used to determine features that outperform human-engineered (handcrafted) features. Thus, knowing which of these two approaches contributes to a better classification of AMPs is a fundamental question for designing more accurate models. Here, we present a systematic and rigorous study to compare both types of features. Experimental outcomes show that non-handcrafted features lead to better performance than handcrafted features. However, the experiments also prove that an improvement in performance is achieved when both types of features are merged. A relevance analysis reveals that non-handcrafted features have higher information content than handcrafted features, while an interaction-based importance analysis reveals that handcrafted features are more important. These findings suggest that there is complementarity between both types of features. Comparisons with state-of-the-art deep models show that shallow models yield better performance both when fed with non-handcrafted features alone and when fed with non-handcrafted and handcrafted features together.

Dong B, Li M, Jiang B, Gao B, Li D, Zhang T. Antimicrobial Peptides Prediction method based on sequence multidimensional feature embedding. Front Genet 2022;13:1069558. [PMID: 36468005 PMCID: PMC9714691 DOI: 10.3389/fgene.2022.1069558] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/14/2022] [Accepted: 11/02/2022] [Indexed: 09/10/2024] Open

An J, Weng X. Collectively encoding protein properties enriches protein language models. BMC Bioinformatics 2022;23:467. [DOI: 10.1186/s12859-022-05031-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/14/2022] [Accepted: 10/31/2022] [Indexed: 11/10/2022] Open

Pang Y, Yao L, Xu J, Wang Z, Lee TY. Integrating transformer and imbalanced multi-label learning to identify antimicrobial peptides and their functional activities. Bioinformatics 2022;38:5368-5374. [PMID: 36326438 PMCID: PMC9750108 DOI: 10.1093/bioinformatics/btac711] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/02/2022] [Revised: 10/08/2022] [Accepted: 11/02/2022] [Indexed: 11/06/2022] Open

Yan J, Cai J, Zhang B, Wang Y, Wong DF, Siu SWI. Recent Progress in the Discovery and Design of Antimicrobial Peptides Using Traditional Machine Learning and Deep Learning. Antibiotics (Basel) 2022;11:1451. [PMID: 36290108 PMCID: PMC9598685 DOI: 10.3390/antibiotics11101451] [Citation(s) in RCA: 41] [Impact Index Per Article: 13.7] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/20/2022] [Revised: 10/11/2022] [Accepted: 10/13/2022] [Indexed: 11/16/2022] Open

PD-BertEDL: An Ensemble Deep Learning Method Using BERT and Multivariate Representation to Predict Peptide Detectability. Int J Mol Sci 2022;23:ijms232012385. [PMID: 36293242 PMCID: PMC9604182 DOI: 10.3390/ijms232012385] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/01/2022] [Revised: 10/11/2022] [Accepted: 10/12/2022] [Indexed: 12/03/2022] Open

Chen S, Li Q, Zhao J, Bin Y, Zheng C. NeuroPred-CLQ: incorporating deep temporal convolutional networks and multi-head attention mechanism to predict neuropeptides. Brief Bioinform 2022;23:6672901. [DOI: 10.1093/bib/bbac319] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/09/2022] [Revised: 06/27/2022] [Accepted: 07/14/2022] [Indexed: 11/13/2022] Open