1
Yao Y, Zhang D, Fan H, Wu T, Su Y, Bin Y. Prediction of Chemically Modified Antimicrobial Peptides and Their Sub-functional Activities Using Hybrid Features. Probiotics Antimicrob Proteins 2025:10.1007/s12602-025-10575-6. [PMID: 40397268 DOI: 10.1007/s12602-025-10575-6] [Accepted: 04/29/2025] [Indexed: 05/22/2025]
Abstract
Antimicrobial peptides (AMPs) demonstrate a broad spectrum of activities against various pathogens, thereby offering a promising strategy to mitigate the urgent challenge of antimicrobial resistance. Recent studies indicate that chemically modified AMPs (cmAMPs), which contain chemically modified amino acids, have the potential to alleviate the adverse effects commonly associated with conventional AMPs. Nevertheless, there remains a notable lack of computational methods specifically designed for predicting cmAMPs and their sub-functional activities. In this study, we proposed a two-layer model, termed iCMAMP, for identifying cmAMPs and their sub-functional activities. The first layer, referred to as iCMAMP-1L, integrates three categories encompassing seven distinct groups of features, in conjunction with an ensemble method designed to enhance predictive accuracy for cmAMPs. This ensemble approach effectively extracts relevant insights from a heterogeneous array of feature sets while addressing potential dimensionality challenges. On the test dataset, iCMAMP-1L achieved an ACC of 0.934 and an MCC of 0.868, representing improvements of 3.4% and 6.8%, respectively, over AntiMPmod, which is the sole existing method for predicting cmAMPs. A comparative analysis between cmAMPs and their corresponding AMPs revealed that chemical modifications can significantly reduce hemolysis and toxicity associated with AMPs, while the functional characteristics of the peptides are primarily determined by their sequences. The second layer of our model, designated as iCMAMP-2L, employed a multi-label classification approach to predict the sub-functional activities of cmAMPs, with a specific focus on the dipeptide composition-based features. On the test dataset, iCMAMP-2L achieved an Accuracy of 0.390 and an Absolute true of 0.621.
The data and Python code used in the iCMAMP model are available at https://github.com/swicher123/iCMAMP/tree/master.
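As an illustration of the dipeptide composition features mentioned in the abstract, the following is a minimal sketch of the textbook definition: a 400-dimensional vector holding the relative frequency of each ordered pair of adjacent standard amino acids. It is not taken from the iCMAMP repository. The function name `dipeptide_composition` and the choice to skip pairs containing non-standard or modified residue symbols are assumptions made for this example.

```python
from itertools import product

# The 20 standard amino acids; modified residues are outside this alphabet.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered dipeptides, in a fixed order that defines the feature layout.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return the 400-dimensional dipeptide composition of a peptide.

    Hypothetical helper for illustration. Pairs that contain a symbol
    outside the 20 standard amino acids are skipped (an assumption).
    """
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        # Sequence too short or no valid pairs: return an all-zero vector.
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]
```

For the toy sequence "ACAC" the adjacent pairs are AC, CA, AC, so the AC entry is 2/3, the CA entry is 1/3, and the other 398 entries are zero.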
Affiliation(s)
- Yujie Yao
- Information Materials and Intelligent Sensing Laboratory of Anhui Province, Institutes of Physical Science and Information Technology, Anhui University, Hefei, 230601, Anhui, China
- Daijun Zhang
- Information Materials and Intelligent Sensing Laboratory of Anhui Province, Institutes of Physical Science and Information Technology, Anhui University, Hefei, 230601, Anhui, China
- Henghui Fan
- Information Materials and Intelligent Sensing Laboratory of Anhui Province, Institutes of Physical Science and Information Technology, Anhui University, Hefei, 230601, Anhui, China
- Ting Wu
- Department of Infectious Diseases & Anhui Province Key Laboratory of Infectious Diseases, The First Affiliated Hospital of Anhui Medical University, Hefei, 230022, Anhui, China.
- Institute of Bacterial Resistance & Anhui Center for Surveillance of Bacterial Resistance, Anhui Medical University, Hefei, 230022, Anhui, China.
- Yansen Su
- School of Artificial Intelligence, Anhui University, Hefei, 230601, Anhui, China.
- Yannan Bin
- Information Materials and Intelligent Sensing Laboratory of Anhui Province, Institutes of Physical Science and Information Technology, Anhui University, Hefei, 230601, Anhui, China.
2
Zhenghui L, Wenxing H, Yan W, Jihong Z, Xiaojun X, Lixin G, Mengshan L. Ensemble learning based on bi-directional gated recurrent unit and convolutional neural network with word embedding module for bioactive peptide prediction. Food Chem 2025; 468:142464. [PMID: 39675273 DOI: 10.1016/j.foodchem.2024.142464] [Received: 05/24/2024] [Revised: 11/12/2024] [Accepted: 12/11/2024] [Indexed: 12/17/2024]
Abstract
Bioactive peptides, as small protein fragments, are essential mediators of diverse physiological activities, such as antimicrobial, anti-inflammatory, anticancer, antioxidant, and immunomodulatory functions. Despite their substantial potential in pharmaceuticals and the food industry, conventional methods for peptide classification and activity prediction are limited by high costs, time-intensive procedures, and extensive data processing requirements. Here, we present BioPepPred-DLEmb, a novel computational model integrating Convolutional Neural Networks (CNNs) and Bidirectional Gated Recurrent Units (BiGRUs), augmented with natural language processing to encode amino acids into information-dense vectors. Evaluated across nine bioactive peptide datasets, BioPepPred-DLEmb demonstrates superior predictive accuracy (0.909) and sensitivity (0.911) compared to traditional methods. Through UMAP visualization and Kplogo analysis, the model effectively differentiates peptide activity states and identifies key biomarkers. The predicted antimicrobial peptides (Pred-AMPs) exhibit potent efficacy in vitro, achieving low micromolar inhibitory concentrations (2-16 μmol/L) against pathogens such as Escherichia coli and Acinetobacter baumannii. These findings establish a robust foundation for bioactive peptide development, with implications for advancements in precision medicine, personalized therapies, and functional food innovations.
Affiliation(s)
- Lai Zhenghui
- College of Physics and Electronic Information, Gannan Normal University, Ganzhou 341000, Jiangxi, China
- Hu Wenxing
- College of Physics and Electronic Information, Gannan Normal University, Ganzhou 341000, Jiangxi, China
- Wu Yan
- College of Physics and Electronic Information, Gannan Normal University, Ganzhou 341000, Jiangxi, China
- Zhu Jihong
- College of Physics and Electronic Information, Gannan Normal University, Ganzhou 341000, Jiangxi, China
- Xie Xiaojun
- College of Physics and Electronic Information, Gannan Normal University, Ganzhou 341000, Jiangxi, China
- Guan Lixin
- College of Physics and Electronic Information, Gannan Normal University, Ganzhou 341000, Jiangxi, China
- Li Mengshan
- College of Physics and Electronic Information, Gannan Normal University, Ganzhou 341000, Jiangxi, China.
3
Dong R, Liu R, Liu Z, Liu Y, Zhao G, Li H, Hou S, Ma X, Kang H, Liu J, Guo F, Zhao P, Wang J, Wang C, Wu X, Ye S, Zhu C. Exploring the repository of de novo-designed bifunctional antimicrobial peptides through deep learning. eLife 2025; 13:RP97330. [PMID: 40079572 PMCID: PMC11906162 DOI: 10.7554/elife.97330] [Indexed: 03/15/2025] Open
Abstract
Antimicrobial peptides (AMPs) are attractive candidates to combat antibiotic resistance for their capability to target biomembranes and restrict a wide range of pathogens. It is a daunting challenge to discover novel AMPs due to their sparse distributions in a vast peptide universe, especially for peptides that demonstrate potencies for both bacterial membranes and viral envelopes. Here, we establish a de novo AMP design framework by bridging a deep generative module and a graph-encoding activity regressor. The generative module learns hidden 'grammars' of AMP features and produces candidates that sequentially pass an antimicrobial predictor and antiviral classifiers. We discovered 16 bifunctional AMPs and experimentally validated their abilities to inhibit a spectrum of pathogens in vitro and in animal models. Notably, P076 is a highly potent bactericide with a minimal inhibitory concentration of 0.21 μM against multidrug-resistant Acinetobacter baumannii, while P002 broadly inhibits five enveloped viruses. Our study provides feasible means to uncover the sequences that simultaneously encode antimicrobial and antiviral activities, thus bolstering the function spectra of AMPs to combat a wide range of drug-resistant infections.
Affiliation(s)
- Ruihan Dong
- Frontiers Science Center for Synthetic Biology (Ministry of Education), Tianjin Key Laboratory of Function and Application of Biological Macromolecular Structures, School of Life Sciences, Faculty of Medicine, Tianjin University, Tianjin, China
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
- Rongrong Liu
- Department of Microbiology, School of Basic Medicine, Fourth Military Medical University, Shaanxi, China
- Ziyu Liu
- Department of Microbiology, School of Basic Medicine, Fourth Military Medical University, Shaanxi, China
- Yangang Liu
- Department of Microbiology, Second Military Medical University, Shanghai, China
- Gaomei Zhao
- State Key Laboratory of Trauma and Chemical Poisoning, Institute of Combined Injury of PLA, College of Preventive Medicine, Third Military Medical University (Army Medical University), Chongqing, China
- Honglei Li
- Tianjin Cancer Hospital Airport Hospital, Tianjin, China
- Shiyuan Hou
- Department of Microbiology, School of Basic Medicine, Fourth Military Medical University, Shaanxi, China
- Xiaohan Ma
- Department of Microbiology, School of Basic Medicine, Fourth Military Medical University, Shaanxi, China
- Huarui Kang
- Department of Microbiology, School of Basic Medicine, Fourth Military Medical University, Shaanxi, China
- Jing Liu
- Department of Microbiology, School of Basic Medicine, Fourth Military Medical University, Shaanxi, China
- Fei Guo
- School of Computer Science and Engineering, Central South University, Changsha, China
- Ping Zhao
- Department of Microbiology, Second Military Medical University, Shanghai, China
- Junping Wang
- State Key Laboratory of Trauma and Chemical Poisoning, Institute of Combined Injury of PLA, College of Preventive Medicine, Third Military Medical University (Army Medical University), Chongqing, China
- Cheng Wang
- State Key Laboratory of Trauma and Chemical Poisoning, Institute of Combined Injury of PLA, College of Preventive Medicine, Third Military Medical University (Army Medical University), Chongqing, China
- Xingan Wu
- Department of Microbiology, School of Basic Medicine, Fourth Military Medical University, Shaanxi, China
- Sheng Ye
- Frontiers Science Center for Synthetic Biology (Ministry of Education), Tianjin Key Laboratory of Function and Application of Biological Macromolecular Structures, School of Life Sciences, Faculty of Medicine, Tianjin University, Tianjin, China
- Cheng Zhu
- Frontiers Science Center for Synthetic Biology (Ministry of Education), Tianjin Key Laboratory of Function and Application of Biological Macromolecular Structures, School of Life Sciences, Faculty of Medicine, Tianjin University, Tianjin, China
4
Zhou X, Liu G, Cao S, Lv J. Deep Learning for Antimicrobial Peptides: Computational Models and Databases. J Chem Inf Model 2025; 65:1708-1717. [PMID: 39927895 DOI: 10.1021/acs.jcim.5c00006] [Indexed: 02/11/2025]
Abstract
Antimicrobial peptides are a promising strategy to combat antimicrobial resistance. However, the experimental discovery of antimicrobial peptides is both time-consuming and laborious. In recent years, the development of computational technologies (especially deep learning) has provided new opportunities for antimicrobial peptide prediction. Various computational models have been proposed to predict antimicrobial peptides. In this review, we focus on deep learning models for antimicrobial peptide prediction. We first collected and summarized available data resources for antimicrobial peptides. Subsequently, we summarized existing deep learning models for antimicrobial peptides and discussed their limitations and challenges. This study aims to help computational biologists design better deep learning models for antimicrobial peptide prediction.
Affiliation(s)
- Xiangrun Zhou
- College of Computer Science and Technology, Jilin University, Changchun, 130000, China
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130000, China
- Guixia Liu
- College of Computer Science and Technology, Jilin University, Changchun, 130000, China
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130000, China
- Shuyuan Cao
- College of Computer Science and Technology, Jilin University, Changchun, 130000, China
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130000, China
- Ji Lv
- School of Computer Science and Technology, Zhejiang Normal University, Jinhua, 321004, China
5
He M, Jiang Y, Yang Y, Gong K, Jiang X, Tian Y. MSCMamba: Prediction of Antimicrobial Peptide Activity Values by Fusing Multiscale Convolution with Mamba Module. J Phys Chem B 2025; 129:1956-1965. [PMID: 39915928 DOI: 10.1021/acs.jpcb.4c07752] [Indexed: 02/21/2025]
Abstract
Antimicrobial peptides (AMPs) have important developmental prospects as potential candidates for novel antibiotics. Although many studies have been devoted to the identification of AMPs and the qualitative prediction of their functional activities, few methods address the quantitative prediction of their activity values. In this paper, we propose a regression model called MSCMamba, which fuses a multiscale convolutional neural network with a Mamba module to accurately predict the activity values of AMPs. AMP sequences are encoded by multiple methods and fed into a multiscale convolutional network and a Mamba module to capture local and long-range dependent features, respectively. The model fuses these two outputs and predicts the activity values of AMPs through a linear layer. Experimental results show that MSCMamba outperforms the current state-of-the-art methods in several performance metrics, especially with an increase in R² from 0.422 to 0.467, representing a 10.66% improvement. Additionally, we did a series of ablation experiments to verify the validity of each part of the MSCMamba model and the performance enhancement of feature diversification. This study provides a new method for activity prediction of AMPs, which is expected to accelerate the development of novel antibiotics.
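The 10.66% figure quoted in the abstract matches a relative (not absolute) improvement over the baseline R² value. A short sketch of that arithmetic follows; the helper name `relative_improvement` is hypothetical and chosen only for this example.

```python
def relative_improvement(baseline, improved):
    """Percentage change of `improved` relative to `baseline`."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (improved - baseline) / abs(baseline) * 100.0


# Values reported in the abstract: R2 rises from 0.422 to 0.467.
print(f"{relative_improvement(0.422, 0.467):.2f}%")  # 10.66%
```

The absolute gain is 0.045 in R²; dividing by the baseline of 0.422 gives the reported relative gain.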
Affiliation(s)
- Mingyue He
- School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 610031 Sichuan, China
- Yongquan Jiang
- School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 610031 Sichuan, China
- Artificial Intelligence Research Institute, Southwest Jiaotong University, Chengdu 610031 Sichuan, China
- Yan Yang
- School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 610031 Sichuan, China
- Artificial Intelligence Research Institute, Southwest Jiaotong University, Chengdu 610031 Sichuan, China
- Kuanping Gong
- School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 610031 Sichuan, China
- Xuanpei Jiang
- School of Life Science and Technology, Southwest Jiaotong University, Chengdu 610031 Sichuan, China
- Yuan Tian
- School of Life Science and Technology, Southwest Jiaotong University, Chengdu 610031 Sichuan, China
6
Yue J, Li T, Xu J, Chen Z, Li Y, Liang S, Liu Z, Wang Y. Discovery of anticancer peptides from natural and generated sequences using deep learning. Int J Biol Macromol 2025; 290:138880. [PMID: 39706427 DOI: 10.1016/j.ijbiomac.2024.138880] [Received: 10/23/2024] [Revised: 12/10/2024] [Accepted: 12/16/2024] [Indexed: 12/23/2024]
Abstract
Anticancer peptides (ACPs) demonstrate significant potential in clinical cancer treatment due to their ability to selectively target and kill cancer cells. In recent years, numerous artificial intelligence (AI) algorithms have been developed. However, many predictive methods lack sufficient wet lab validation, thereby constraining the progress of models and impeding the discovery of novel ACPs. This study proposes a comprehensive research strategy by introducing CNBT-ACPred, an ACP prediction model based on a three-channel deep learning architecture, supported by extensive in vitro and in vivo experiments. CNBT-ACPred achieved an accuracy of 0.9554 and a Matthews Correlation Coefficient (MCC) of 0.8602. Compared with strong existing models, CNBT-ACPred increased accuracy by at least 5% and improved MCC by 15%. Predictions were conducted on over 3.8 million sequences from UniProt, along with 100,000 sequences generated by a deep generative model, ultimately identifying 37 out of 41 candidate peptides from >30 species that exhibited effective in vitro tumor inhibitory activity. Among these, tPep14 demonstrated significant anticancer effects in two mouse xenograft models without detectable toxicity. Finally, the study revealed correlations between the amino acid composition, structure, and function of the identified ACP candidates.
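For readers unfamiliar with the two metrics reported in the abstract, the sketch below computes accuracy and the Matthews correlation coefficient from the four cells of a binary confusion matrix, using the standard definitions. The function names and the example counts are illustrative assumptions and are not taken from the CNBT-ACPred study.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def matthews_corrcoef(tp, tn, fp, fn):
    """Standard MCC from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts, for illustration only.
acc = accuracy(tp=90, tn=85, fp=15, fn=10)
mcc = matthews_corrcoef(tp=90, tn=85, fp=15, fn=10)
```

MCC uses all four cells of the confusion matrix, so it is less easily inflated by class imbalance than accuracy, which is one reason the two are often reported together.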
Affiliation(s)
- Jianda Yue
- The National and Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha 410081, Hunan, China; Peptide and small molecule drug R&D plateform, Furong Laboratory, Hunan Normal University, Changsha 410081, Hunan, China; Institute of Interdisciplinary Studies, Hunan Normal University, Changsha 410081, Hunan, China.
- Tingting Li
- The National and Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha 410081, Hunan, China; Peptide and small molecule drug R&D plateform, Furong Laboratory, Hunan Normal University, Changsha 410081, Hunan, China; Institute of Interdisciplinary Studies, Hunan Normal University, Changsha 410081, Hunan, China.
- Jiawei Xu
- The National and Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha 410081, Hunan, China; Peptide and small molecule drug R&D plateform, Furong Laboratory, Hunan Normal University, Changsha 410081, Hunan, China; Institute of Interdisciplinary Studies, Hunan Normal University, Changsha 410081, Hunan, China.
- Zihui Chen
- The National and Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha 410081, Hunan, China; Peptide and small molecule drug R&D plateform, Furong Laboratory, Hunan Normal University, Changsha 410081, Hunan, China; Institute of Interdisciplinary Studies, Hunan Normal University, Changsha 410081, Hunan, China
- Yaqi Li
- The National and Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha 410081, Hunan, China; Peptide and small molecule drug R&D plateform, Furong Laboratory, Hunan Normal University, Changsha 410081, Hunan, China; Institute of Interdisciplinary Studies, Hunan Normal University, Changsha 410081, Hunan, China.
- Songping Liang
- The National and Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha 410081, Hunan, China; Peptide and small molecule drug R&D plateform, Furong Laboratory, Hunan Normal University, Changsha 410081, Hunan, China; Institute of Interdisciplinary Studies, Hunan Normal University, Changsha 410081, Hunan, China.
- Zhonghua Liu
- The National and Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha 410081, Hunan, China; Peptide and small molecule drug R&D plateform, Furong Laboratory, Hunan Normal University, Changsha 410081, Hunan, China; Institute of Interdisciplinary Studies, Hunan Normal University, Changsha 410081, Hunan, China.
- Ying Wang
- The National and Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha 410081, Hunan, China; Peptide and small molecule drug R&D plateform, Furong Laboratory, Hunan Normal University, Changsha 410081, Hunan, China; Institute of Interdisciplinary Studies, Hunan Normal University, Changsha 410081, Hunan, China.
7
Zhao J, Liu H, Kang L, Gao W, Lu Q, Rao Y, Yue Z. deep-AMPpred: A Deep Learning Method for Identifying Antimicrobial Peptides and Their Functional Activities. J Chem Inf Model 2025; 65:997-1008. [PMID: 39792442 DOI: 10.1021/acs.jcim.4c01913] [Indexed: 01/12/2025]
Abstract
Antimicrobial peptides (AMPs) are small peptides that play an important role in disease defense. As the problem of pathogen resistance caused by the misuse of antibiotics intensifies, the identification of AMPs as alternatives to antibiotics has become a hot topic. Accurately identifying AMPs using computational methods has been a key issue in the field of bioinformatics in recent years. Although there are many machine learning-based AMP identification tools, most of them either ignore functional activities or cover only a few. Predicting the multiple activities of antimicrobial peptides can help discover candidate peptides with broad-spectrum antimicrobial ability. We propose a two-stage AMP predictor, deep-AMPpred, in which the first stage distinguishes AMPs from other peptides, and the second stage solves the multilabel problem of 13 common functional activities of AMPs. deep-AMPpred combines the ESM-2 model to encode the features of AMPs and integrates CNN, BiLSTM, and CBAM models to discover AMPs and their functional activities. The ESM-2 model captures the global contextual features of the peptide sequence, while CNN, BiLSTM, and CBAM combine local feature extraction, long-term and short-term dependency modeling, and attention mechanisms to improve the performance of deep-AMPpred in predicting AMPs and their functions. Experimental results demonstrate that deep-AMPpred performs well in accurately identifying AMPs and predicting their functional activities. This confirms the effectiveness of using the ESM-2 model to capture meaningful peptide sequence features and integrating multiple deep learning models for AMP identification and activity prediction.
Affiliation(s)
- Jun Zhao
- School of Information and Artificial Intelligence, Anhui Provincial Engineering Research Center for Beidou Precision Agriculture Information, Key Laboratory of Agricultural Sensors for Ministry of Agriculture and Rural Affairs, Anhui Agricultural University, Hefei, Anhui 230036, China
- Hangcheng Liu
- School of Information and Artificial Intelligence, Anhui Provincial Engineering Research Center for Beidou Precision Agriculture Information, Key Laboratory of Agricultural Sensors for Ministry of Agriculture and Rural Affairs, Anhui Agricultural University, Hefei, Anhui 230036, China
- Leyao Kang
- School of Information and Artificial Intelligence, Anhui Provincial Engineering Research Center for Beidou Precision Agriculture Information, Key Laboratory of Agricultural Sensors for Ministry of Agriculture and Rural Affairs, Anhui Agricultural University, Hefei, Anhui 230036, China
- Wanling Gao
- School of Information and Artificial Intelligence, Anhui Provincial Engineering Research Center for Beidou Precision Agriculture Information, Key Laboratory of Agricultural Sensors for Ministry of Agriculture and Rural Affairs, Anhui Agricultural University, Hefei, Anhui 230036, China
- Quan Lu
- School of Information and Artificial Intelligence, Anhui Provincial Engineering Research Center for Beidou Precision Agriculture Information, Key Laboratory of Agricultural Sensors for Ministry of Agriculture and Rural Affairs, Anhui Agricultural University, Hefei, Anhui 230036, China
- Yuan Rao
- School of Information and Artificial Intelligence, Anhui Provincial Engineering Research Center for Beidou Precision Agriculture Information, Key Laboratory of Agricultural Sensors for Ministry of Agriculture and Rural Affairs, Anhui Agricultural University, Hefei, Anhui 230036, China
- Zhenyu Yue
- School of Information and Artificial Intelligence, Anhui Provincial Engineering Research Center for Beidou Precision Agriculture Information, Key Laboratory of Agricultural Sensors for Ministry of Agriculture and Rural Affairs, Anhui Agricultural University, Hefei, Anhui 230036, China
- Research Center for Biological Breeding Technology, Advance Academy, Anhui Agricultural University, Hefei, Anhui 230036, China
8
Guan C, Fernandes FC, Franco OL, de la Fuente-Nunez C. Leveraging large language models for peptide antibiotic design. Cell Reports Physical Science 2025; 6:102359. [PMID: 39949833 PMCID: PMC11823563 DOI: 10.1016/j.xcrp.2024.102359] [Indexed: 02/16/2025]
Abstract
Large language models (LLMs) have significantly impacted various domains of our society, including recent applications in complex fields such as biology and chemistry. These models, built on sophisticated neural network architectures and trained on extensive datasets, are powerful tools for designing, optimizing, and generating molecules. This review explores the role of LLMs in discovering and designing antibiotics, focusing on peptide molecules. We highlight advancements in drug design and outline the challenges of applying LLMs in these areas.
Affiliation(s)
- Changge Guan
- Machine Biology Group, Departments of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Departments of Bioengineering and Chemical and Biomolecular Engineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA, USA
- Department of Chemistry, School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA, USA
- Penn Institute for Computational Science, University of Pennsylvania, Philadelphia, PA, USA
- These authors contributed equally
- Fabiano C. Fernandes
- Centro de Análises Proteômicas e Bioquímicas, Pós-Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília, Brazil
- Departamento de Ciência da Computação, Instituto Federal de Brasília, Campus Taguatinga, Brasília, Brazil
- These authors contributed equally
- Octavio L. Franco
- Centro de Análises Proteômicas e Bioquímicas, Pós-Graduação em Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, Brasília, Brazil
- S-Inova Biotech, Programa de Pós-Graduação em Biotecnologia, Universidade Católica Dom Bosco, Campo Grande, Brazil
- Cesar de la Fuente-Nunez
- Machine Biology Group, Departments of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Departments of Bioengineering and Chemical and Biomolecular Engineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA, USA
- Department of Chemistry, School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA, USA
- Penn Institute for Computational Science, University of Pennsylvania, Philadelphia, PA, USA
9
Yao L, Guan J, Xie P, Chung CR, Zhao Z, Dong D, Guo Y, Zhang W, Deng J, Pang Y, Liu Y, Peng Y, Horng JT, Chiang YC, Lee TY. dbAMP 3.0: updated resource of antimicrobial activity and structural annotation of peptides in the post-pandemic era. Nucleic Acids Res 2025; 53:D364-D376. [PMID: 39540425 PMCID: PMC11701527 DOI: 10.1093/nar/gkae1019] [Received: 09/09/2024] [Revised: 10/12/2024] [Accepted: 11/06/2024] [Indexed: 11/16/2024] Open
Abstract
Antimicrobial resistance is one of the most urgent global health threats, especially in the post-pandemic era. Antimicrobial peptides (AMPs) offer a promising alternative to traditional antibiotics, driving growing interest in recent years. dbAMP is a comprehensive database offering extensive annotations on AMPs, including sequence information, functional activity data, physicochemical properties and structural annotations. In this update, dbAMP has curated data from over 5200 publications, encompassing 33,065 AMPs and 2453 antimicrobial proteins from 3534 organisms. Additionally, dbAMP utilizes ESMFold to determine the three-dimensional structures of AMPs, providing over 30,000 structural annotations that facilitate structure-based functional insights for clinical drug development. Furthermore, dbAMP employs molecular docking techniques, providing over 100 docked complexes that contribute useful insights into the potential mechanisms of AMPs. The toxicity and stability of AMPs are critical factors in assessing their potential as clinical drugs. The updated dbAMP introduced an efficient tool for evaluating the hemolytic toxicity and half-life of AMPs, alongside an AMP optimization platform for designing AMPs with high antimicrobial activity, reduced toxicity and increased stability. The updated dbAMP is freely accessible at https://awi.cuhk.edu.cn/dbAMP/. Overall, dbAMP represents a comprehensive and essential resource for AMP analysis and design, poised to advance antimicrobial strategies in the post-pandemic era.
Affiliation(s)
- Lantian Yao
- Kobilka Institute of Innovative Drug Discovery, School of Medicine, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Road, 518172, Shenzhen, China
- School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Road, 518172, Shenzhen, China
- Jiahui Guan
- Kobilka Institute of Innovative Drug Discovery, School of Medicine, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Road, 518172, Shenzhen, China
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Road, 518172, Shenzhen, China
- Peilin Xie
- Kobilka Institute of Innovative Drug Discovery, School of Medicine, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Road, 518172, Shenzhen, China
- School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Road, 518172, Shenzhen, China
- Chia-Ru Chung
- Department of Computer Science and Information Engineering, National Central University, 320317, Taoyuan, Taiwan
- Zhihao Zhao
- Kobilka Institute of Innovative Drug Discovery, School of Medicine, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Road, 518172, Shenzhen, China
- Danhong Dong
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Road, 518172, Shenzhen, China
- Yilin Guo
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Road, 518172, Shenzhen, China
- Wenyang Zhang
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Road, 518172, Shenzhen, China
- Junyang Deng
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Road, 518172, Shenzhen, China
- Yuxuan Pang
- Division of Health Medical Intelligence, Human Genome Center, The Institute of Medical Science, The University of Tokyo, 108-8639, Tokyo, Japan
- Yulan Liu
- Kobilka Institute of Innovative Drug Discovery, School of Medicine, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Road, 518172, Shenzhen, China
- Yunlu Peng
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Road, 518172, Shenzhen, China
- Jorng-Tzong Horng
- Department of Computer Science and Information Engineering, National Central University, 320317, Taoyuan, Taiwan
- Ying-Chih Chiang
- Kobilka Institute of Innovative Drug Discovery, School of Medicine, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Road, 518172, Shenzhen, China
- School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Road, 518172, Shenzhen, China
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Road, 518172, Shenzhen, China
- Tzong-Yi Lee
- Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, 300093, Hsinchu, Taiwan
- Center for Intelligent Drug Systems and Smart Bio-devices (IDS2B), National Yang Ming Chiao Tung University, 300093, Hsinchu, Taiwan
10
Brizuela CA, Liu G, Stokes JM, de la Fuente‐Nunez C. AI Methods for Antimicrobial Peptides: Progress and Challenges. Microb Biotechnol 2025; 18:e70072. [PMID: 39754551 PMCID: PMC11702388 DOI: 10.1111/1751-7915.70072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2024] [Revised: 11/18/2024] [Accepted: 12/16/2024] [Indexed: 01/06/2025] Open
Abstract
Antimicrobial peptides (AMPs) are promising candidates to combat multidrug-resistant pathogens. However, the high cost of extensive wet-lab screening has made AI methods for identifying and designing AMPs increasingly important, with machine learning (ML) techniques playing a crucial role. AI approaches have recently revolutionised this field by accelerating the discovery of new peptides with anti-infective activity, particularly in preclinical mouse models. Initially, classical ML approaches dominated the field, but recently there has been a shift towards deep learning (DL) models. Despite significant contributions, existing reviews have not thoroughly explored the potential of large language models (LLMs), graph neural networks (GNNs) and structure-guided AMP discovery and design. This review aims to fill that gap by providing a comprehensive overview of the latest advancements, challenges and opportunities in using AI methods, with a particular emphasis on LLMs, GNNs and structure-guided design. We discuss the limitations of current approaches and highlight the most relevant topics to address in the coming years for AMP discovery and design.
Affiliation(s)
- Gary Liu
- Department of Biochemistry and Biomedical Sciences, Michael G. DeGroote Institute for Infectious Disease Research, David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
- Jonathan M. Stokes
- Department of Biochemistry and Biomedical Sciences, Michael G. DeGroote Institute for Infectious Disease Research, David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
- Cesar de la Fuente‐Nunez
- Machine Biology Group, Department of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Department of Bioengineering and Chemical and Biomolecular Engineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Department of Chemistry, School of Arts and Sciences, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Penn Institute for Computational Science, University of Pennsylvania, Philadelphia, Pennsylvania, USA
11
Wang R, Liang X, Zhao Y, Xue W, Liang G. UniBioPAN: A Novel Universal Classification Architecture for Bioactive Peptides Inspired by Video Action Recognition. J Chem Inf Model 2024; 64:9276-9285. [PMID: 39571078 DOI: 10.1021/acs.jcim.4c01599] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2024]
Abstract
The classification of bioactive peptides is of great importance in protein biology, but there is still a lack of a universal and effective classifier. Inspired by video action recognition, we developed the UniBioPAN architecture to create a universal peptide classifier to solve this problem. The architecture treats the peptide sequence as a video sequence and the molecular image of each amino acid in the peptide sequence as a video frame, enabling feature extraction and classification using convolutional neural networks, bidirectional long short-term memory networks, and fully connected networks. As a novel peptide classification architecture, UniBioPAN significantly outperforms other universal architectures in ACC, AUC and MCC across 11 data sets, and in F1 score on 9 data sets. UniBioPAN is available in three ways: Python script, Jupyter notebook script and web server (https://gzliang.cqu.edu.cn/software/UniBioPAN.html). In summary, UniBioPAN is a universal, convenient, and high-performance peptide classification architecture. UniBioPAN holds significant importance in the discovery of bioactive peptides and the advancement of peptide classifiers. All the code and data sets are publicly available at https://github.com/sanwrh/UniBioPAN.
Affiliation(s)
- Ruihong Wang
- Key Laboratory of Biorheological Science and Technology (Chongqing University), Ministry of Education, Bioengineering College, Chongqing University, Chongqing 400044, China
- Xiao Liang
- Key Laboratory of Biorheological Science and Technology (Chongqing University), Ministry of Education, Bioengineering College, Chongqing University, Chongqing 400044, China
- Yi Zhao
- Key Laboratory of Biorheological Science and Technology (Chongqing University), Ministry of Education, Bioengineering College, Chongqing University, Chongqing 400044, China
- Wenjun Xue
- Key Laboratory of Biorheological Science and Technology (Chongqing University), Ministry of Education, Bioengineering College, Chongqing University, Chongqing 400044, China
- Guizhao Liang
- Key Laboratory of Biorheological Science and Technology (Chongqing University), Ministry of Education, Bioengineering College, Chongqing University, Chongqing 400044, China
- Bioengineering College of Chongqing University, No. 174, Shazheng Street, Shapingba District, Chongqing 400030, China
12
Fernández-Díaz R, Cossio-Pérez R, Agoni C, Lam HT, Lopez V, Shields DC. AutoPeptideML: a study on how to build more trustworthy peptide bioactivity predictors. Bioinformatics (Oxford, England) 2024; 40:btae555. [PMID: 39292535 PMCID: PMC11438549 DOI: 10.1093/bioinformatics/btae555] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2024] [Revised: 08/08/2024] [Accepted: 09/17/2024] [Indexed: 09/20/2024]
Abstract
MOTIVATION: Automated machine learning (AutoML) solutions can bridge the gap between new computational advances and their real-world applications by enabling experimental scientists to build their own custom models. We examine different steps in the development life-cycle of peptide bioactivity binary predictors and identify key steps where automation can result not only in a more accessible method, but also in more robust and interpretable evaluation leading to more trustworthy models. RESULTS: We present a new automated method for drawing negative peptides that achieves better balance between specificity and generalization than current alternatives. We study the effect of homology-based partitioning for generating the training and testing data subsets and demonstrate that model performance is overestimated when no such homology correction is used, which indicates that prior studies may have overestimated their performance when applied to new peptide sequences. We also conduct a systematic analysis of different protein language models as peptide representation methods and find that they can serve as better descriptors than a naive alternative, but that there is no significant difference across models with different sizes or algorithms. Finally, we demonstrate that an ensemble of optimized traditional machine learning algorithms can compete with more complex neural network models, while being more computationally efficient. We integrate these findings into AutoPeptideML, an easy-to-use AutoML tool to allow researchers without a computational background to build new predictive models for peptide bioactivity in a matter of minutes. AVAILABILITY AND IMPLEMENTATION: Source code, documentation, and data are available at https://github.com/IBM/AutoPeptideML and a dedicated web-server at http://peptide.ucd.ie/AutoPeptideML. A static version of the software to ensure the reproduction of the results is available at https://zenodo.org/records/13363975.
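As a rough illustration of homology-based partitioning, the sketch below groups peptides into similarity clusters and assigns whole clusters to either the training or the testing subset, so that no test peptide has a close relative in training. It is not the AutoPeptideML implementation: the `similarity` function uses Python's `difflib` as a crude stand-in for alignment-based sequence identity, and the function names, the 0.6 threshold and the 20% test fraction are assumptions chosen for the example.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Crude stand-in for sequence identity (assumption: not a real alignment)."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def homology_clusters(seqs, threshold):
    """Connected components of the graph linking sequences with similarity >= threshold."""
    parent = list(range(len(seqs)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(seqs)):
        for j in range(i + 1, len(seqs)):
            if similarity(seqs[i], seqs[j]) >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i, s in enumerate(seqs):
        groups.setdefault(find(i), []).append(s)
    return list(groups.values())


def split_by_cluster(seqs, threshold=0.6, test_fraction=0.2):
    """Assign whole clusters to test (smallest first) until the target size is reached."""
    clusters = sorted(homology_clusters(seqs, threshold), key=len)
    target = test_fraction * len(seqs)
    train, test = [], []
    for cluster in clusters:
        if len(test) + len(cluster) <= target:
            test.extend(cluster)
        else:
            train.extend(cluster)
    return train, test


if __name__ == "__main__":
    # Hypothetical peptides, used only to exercise the functions.
    peptides = ["GLFDIVKKVV", "GLFDIVKKVA", "KWKLFKKIGA", "KWKLFKKIGS", "AAAAAAAAAA"]
    train, test = split_by_cluster(peptides)
    print("train:", train)
    print("test:", test)
```

Because clusters are never split, any pair of sequences above the threshold ends up on the same side of the partition; a random split gives no such guarantee, which is the source of the overestimated performance the abstract describes.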
Affiliation(s)
- Raúl Fernández-Díaz
- IBM Research, Dublin, Dublin D15 HN66, Ireland
- School of Medicine, University College Dublin, Dublin D04 C1P1, Ireland
- Conway Institute of Biomolecular and Biomedical Science, University College Dublin, Dublin D04 C1P, Ireland
- The SFI Centre for Research Training in Genomics Data Science, Ireland
- Rodrigo Cossio-Pérez
- School of Medicine, University College Dublin, Dublin D04 C1P1, Ireland
- Conway Institute of Biomolecular and Biomedical Science, University College Dublin, Dublin D04 C1P, Ireland
- Department of Science and Technology, National University of Quilmes, Bernal B1876, Provincia de Buenos Aires, Argentina
- Clement Agoni
- School of Medicine, University College Dublin, Dublin D04 C1P1, Ireland
- Conway Institute of Biomolecular and Biomedical Science, University College Dublin, Dublin D04 C1P, Ireland
- Discipline of Pharmaceutical Sciences, School of Health Sciences, University of KwaZulu-Natal, Durban 4000, South Africa
- Denis C Shields
- School of Medicine, University College Dublin, Dublin D04 C1P1, Ireland
- Conway Institute of Biomolecular and Biomedical Science, University College Dublin, Dublin D04 C1P, Ireland
13
Feng R, Li S, Zhang Y. AI-powered microscopy image analysis for parasitology: integrating human expertise. Trends Parasitol 2024; 40:633-646. [PMID: 38824067 DOI: 10.1016/j.pt.2024.05.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2024] [Revised: 05/06/2024] [Accepted: 05/07/2024] [Indexed: 06/03/2024]
Abstract
Microscopy image analysis plays a pivotal role in parasitology research. Deep learning (DL), a subset of artificial intelligence (AI), has garnered significant attention. However, traditional DL-based methods for general purposes are data-driven, often lacking explainability due to their black-box nature and sparse instructional resources. To address these challenges, this article presents a comprehensive review of recent advancements in knowledge-integrated DL models tailored for microscopy image analysis in parasitology. The massive amounts of human expert knowledge from parasitologists can enhance the accuracy and explainability of AI-driven decisions. It is expected that the adoption of knowledge-integrated DL models will open up a wide range of applications in the field of parasitology.
Affiliation(s)
- Ruijun Feng
- College of Science, Harbin Institute of Technology (Shenzhen), Shenzhen, Guangdong 518055, China; School of Computer Science and Engineering, University of New South Wales, Sydney, Australia
- Sen Li
- College of Science, Harbin Institute of Technology (Shenzhen), Shenzhen, Guangdong 518055, China
- Yang Zhang
- College of Science, Harbin Institute of Technology (Shenzhen), Shenzhen, Guangdong 518055, China.
14
Xu J, Ruan X, Yang J, Hu B, Li S, Hu J. SME-MFP: A novel spatiotemporal neural network with multiangle initialization embedding toward multifunctional peptides prediction. Comput Biol Chem 2024; 109:108033. [PMID: 38412804 DOI: 10.1016/j.compbiolchem.2024.108033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2023] [Revised: 01/09/2024] [Accepted: 02/17/2024] [Indexed: 02/29/2024]
Abstract
As a promising alternative to conventional antibiotic drugs in the biomedical field, functional peptides have been widely used in disease treatment owing to their low toxicity, high absorption rate, and biological activity. Recently, several machine learning methods have been developed for functional peptide prediction. However, most research relies heavily on statistical features, and few studies consider multifunctional peptide identification. We therefore propose SME-MFP, a novel predictor for imbalanced multi-label functional peptide datasets. First, we employ physicochemical and evolutionary information to represent the initialization features of the peptide sequence from multiple perspectives. Second, the features are fused and then passed to spatial feature extractors, where residual connections and a multiscale convolutional neural network extract more discriminative features from peptide sequences of different lengths. We also design AFT-based temporal feature extractors to fully capture the global interactions of the sequences. Finally, we devise a new loss to replace the traditional cross-entropy loss and address the class imbalance problem. The results show that our framework not only enhances the model's ability to capture sequence features effectively, but also improves accuracy by 3.89% over existing methods on public peptide datasets.
Affiliation(s)
- Jing Xu
- State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, China
- Xiaoli Ruan
- State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, China.
- Jing Yang
- State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, China
- Bingqi Hu
- State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, China
- Shaobo Li
- State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, China
- Jianjun Hu
- Department of Computer Science and Engineering, University of South Carolina, Columbia 29208, USA
15
Lee B, Shin D. Contrastive learning for enhancing feature extraction in anticancer peptides. Brief Bioinform 2024; 25:bbae220. [PMID: 38725157 PMCID: PMC11082072 DOI: 10.1093/bib/bbae220] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2024] [Revised: 03/28/2024] [Accepted: 04/21/2024] [Indexed: 05/13/2024] Open
Abstract
Cancer, recognized as a primary cause of death worldwide, has profound health implications and incurs a substantial social burden. Numerous efforts have been made to develop cancer treatments, among which anticancer peptides (ACPs) are garnering recognition for their potential applications. While ACP screening is time-consuming and costly, in silico prediction tools provide a way to overcome these challenges. Herein, we present a deep learning model designed to screen ACPs using peptide sequences only. A contrastive learning technique was applied to enhance model performance, yielding better results than a model trained solely on binary classification loss. Furthermore, two independent encoders were employed as a replacement for data augmentation, a technique commonly used in contrastive learning. Our model achieved superior performance on five of six benchmark datasets against previous state-of-the-art models. As prediction tools advance, the potential in peptide-based cancer therapeutics increases, promising a brighter future for oncology research and patient care.
Affiliation(s)
- Byungjo Lee
- Research Institute, National Cancer Center, 323, Ilsan-ro, Ilsandong-gu, Goyang, 10408, Republic of Korea
- Dongkwan Shin
- Research Institute, National Cancer Center, 323, Ilsan-ro, Ilsandong-gu, Goyang, 10408, Republic of Korea
- Department of Cancer Biomedical Science, National Cancer Center Graduate School of Cancer Science and Policy, 323, Ilsan-ro, Ilsandong-gu, Goyang, 10408, Republic of Korea
16
Wang R, Wang T, Zhuo L, Wei J, Fu X, Zou Q, Yao X. Diff-AMP: tailored designed antimicrobial peptide framework with all-in-one generation, identification, prediction and optimization. Brief Bioinform 2024; 25:bbae078. [PMID: 38446739 PMCID: PMC10939340 DOI: 10.1093/bib/bbae078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2023] [Revised: 01/25/2024] [Accepted: 02/08/2024] [Indexed: 03/08/2024] Open
Abstract
Antimicrobial peptides (AMPs), short peptides with diverse functions, effectively target and combat various organisms. The widespread misuse of chemical antibiotics has led to increasing microbial resistance. Due to their low drug resistance and toxicity, AMPs are considered promising substitutes for traditional antibiotics. While existing deep learning technology enhances AMP generation, it also presents certain challenges. Firstly, AMP generation overlooks the complex interdependencies among amino acids. Secondly, current models fail to integrate crucial tasks like screening, attribute prediction and iterative optimization. Consequently, we develop an integrated deep learning framework, Diff-AMP, that automates AMP generation, identification, attribute prediction and iterative optimization. We innovatively integrate kinetic diffusion and attention mechanisms into the reinforcement learning framework for efficient AMP generation. Additionally, our prediction module incorporates pre-training and transfer learning strategies for precise AMP identification and screening. We employ a convolutional neural network for multi-attribute prediction and a reinforcement learning-based iterative optimization strategy to produce diverse AMPs. This framework automates molecule generation, screening, attribute prediction and optimization, thereby advancing AMP research. We have also deployed Diff-AMP on a web server, with code, data and server details available in the Data Availability section.
Affiliation(s)
- Rui Wang
- School of Data Science and Artificial Intelligence, Wenzhou University of Technology, 325000 Wenzhou, China
- Tao Wang
- School of Data Science and Artificial Intelligence, Wenzhou University of Technology, 325000 Wenzhou, China
- Linlin Zhuo
- School of Data Science and Artificial Intelligence, Wenzhou University of Technology, 325000 Wenzhou, China
- Jinhang Wei
- School of Data Science and Artificial Intelligence, Wenzhou University of Technology, 325000 Wenzhou, China
- Xiangzheng Fu
- College of Computer Science and Electronic Engineering, Hunan University, 410012 Changsha, China
- Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, 611730 Chengdu, China
- Xiaojun Yao
- Faculty of Applied Sciences, Macao Polytechnic University, 999078 Macao, China
17
Chung CR, Liou JT, Wu LC, Horng JT, Lee TY. Multi-label classification and features investigation of antimicrobial peptides with various functional classes. iScience 2023; 26:108250. [PMID: 38025779 PMCID: PMC10679894 DOI: 10.1016/j.isci.2023.108250] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2023] [Revised: 07/15/2023] [Accepted: 10/16/2023] [Indexed: 12/01/2023] Open
Abstract
The challenge of drug-resistant bacteria to global public health has led to increased attention on antimicrobial peptides (AMPs) as a targeted therapeutic alternative with a lower risk of resistance. However, high production costs and limitations in functional class prediction have hindered progress in this field. In this study, we used multi-label classifiers with binary relevance and algorithm adaptation techniques to predict different functions of AMPs across a wide range of pathogen categories, including bacteria, mammalian cells, fungi, viruses, and cancer cells. Our classifiers attained promising AUC scores varying from 0.8492 to 0.9126 on independent testing data. Forward feature selection identified sequence order and charge as critical, with specific amino acids (C and E) as discriminative. These findings provide valuable insights for the design of antimicrobial peptides (AMPs) with multiple functionalities, thus contributing to the broader effort to combat drug-resistant pathogens.
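The binary relevance technique named in this abstract can be sketched in a few lines: a multi-label problem is decomposed into one independent binary problem per functional class, and the per-class predictions are later merged back into label sets. The code below shows only that transformation, not the authors' classifiers or features; the function names and the example label sets are hypothetical, with class names borrowed from the target categories listed in the abstract.

```python
def to_binary_problems(label_sets, classes):
    """Decompose multi-label targets into one 0/1 target vector per class."""
    return {c: [1 if c in labels else 0 for labels in label_sets] for c in classes}


def from_binary_predictions(per_class, classes):
    """Merge per-class 0/1 predictions back into one label set per sample."""
    n_samples = len(next(iter(per_class.values()))) if per_class else 0
    return [{c for c in classes if per_class[c][i] == 1} for i in range(n_samples)]


if __name__ == "__main__":
    # Class names follow the categories in the abstract; the samples are made up.
    classes = ["bacteria", "mammalian cells", "fungi", "viruses", "cancer cells"]
    label_sets = [{"bacteria"}, {"bacteria", "fungi"}, {"viruses", "cancer cells"}]
    targets = to_binary_problems(label_sets, classes)
    for name, vector in targets.items():
        print(name, vector)
```

Each target vector can then be fitted with any binary classifier. The price of this simplicity is that correlations between functional classes are ignored, which is one reason algorithm adaptation methods are often evaluated alongside it.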
Affiliation(s)
- Chia-Ru Chung
- Department of Computer Science and Information Engineering, National Central University, Taoyuan, Taiwan
- Jhen-Ting Liou
- Department of Computer Science and Information Engineering, National Central University, Taoyuan, Taiwan
- Li-Ching Wu
- Department of Biomedical Sciences and Engineering, National Central University, Taoyuan, Taiwan
- Jorng-Tzong Horng
- Department of Computer Science and Information Engineering, National Central University, Taoyuan, Taiwan
- Department of Bioinformatics and Medical Engineering, Asia University, Taoyuan City, Taiwan
- Tzong-Yi Lee
- Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu City, Taiwan
- Center for Intelligent Drug Systems and Smart Biodevices (IDS2B), National Yang Ming Chiao Tung University, Hsinchu City, Taiwan
18
Le NQK. Leveraging transformers-based language models in proteome bioinformatics. Proteomics 2023; 23:e2300011. [PMID: 37381841 DOI: 10.1002/pmic.202300011] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 05/14/2023] [Revised: 06/13/2023] [Accepted: 06/13/2023] [Indexed: 06/30/2023]
Abstract
In recent years, the rapid growth of biological data has increased interest in using bioinformatics to analyze and interpret this data. Proteomics, which studies the structure, function, and interactions of proteins, is a crucial area of bioinformatics. Using natural language processing (NLP) techniques in proteomics is an emerging field that combines machine learning and text mining to analyze biological data. Recently, transformer-based NLP models have gained significant attention for their ability to process variable-length input sequences in parallel, using self-attention mechanisms to capture long-range dependencies. In this review paper, we discuss the recent advancements in transformer-based NLP models in proteome bioinformatics and examine their advantages, limitations, and potential applications to improve the accuracy and efficiency of various tasks. Additionally, we highlight the challenges and future directions of using these models in proteome bioinformatics research. Overall, this review provides valuable insights into the potential of transformer-based NLP models to revolutionize proteome bioinformatics.
Affiliation(s)
- Nguyen Quoc Khanh Le
- Professional Master Program in Artificial Intelligence in Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan
- AIBioMed Research Group, Taipei Medical University, Taipei, Taiwan
- Research Center for Artificial Intelligence in Medicine, Taipei Medical University, Taipei, Taiwan
- Translational Imaging Research Center, Taipei Medical University Hospital, Taipei, Taiwan
19
Xing W, Zhang J, Li C, Huo Y, Dong G. iAMP-Attenpred: a novel antimicrobial peptide predictor based on BERT feature extraction method and CNN-BiLSTM-Attention combination model. Brief Bioinform 2023; 25:bbad443. [PMID: 38055840 PMCID: PMC10699745 DOI: 10.1093/bib/bbad443] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2023] [Revised: 10/31/2023] [Accepted: 11/11/2023] [Indexed: 12/08/2023] Open
Abstract
As a kind of small molecule protein that can fight against various microorganisms in nature, antimicrobial peptides (AMPs) play an indispensable role in maintaining the health of organisms and fortifying defenses against diseases. Nevertheless, experimental approaches for AMP identification still demand substantial allocation of human resources and material inputs. Alternatively, computational approaches can help researchers predict AMPs effectively and promptly. In this study, we present a novel AMP predictor called iAMP-Attenpred. As far as we know, this is the first work that not only employs the popular BERT model from the field of natural language processing (NLP) for AMP feature encoding, but also utilizes the idea of combining multiple models to discover AMPs. Firstly, we treat each amino acid from preprocessed AMP and non-AMP sequences as a word, and then input the sequences into a pre-trained BERT model for feature extraction. Moreover, the features obtained from the BERT method are fed to a composite model composed of a one-dimensional CNN, a BiLSTM and an attention mechanism to obtain more discriminative features. Finally, a flatten layer and various fully connected layers are utilized for the final classification of AMPs. Experimental results reveal that, compared with the existing predictors, our iAMP-Attenpred predictor achieves better performance indicators, such as accuracy and precision. This further demonstrates that using the BERT approach to capture effective feature information of peptide sequences and combining multiple deep learning models are effective and meaningful for predicting AMPs.
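Treating each amino acid as a word amounts to a residue-level tokenization step before the language model. The sketch below illustrates that step under stated assumptions: the vocabulary, the special-token ids and the function names are hypothetical and do not correspond to the vocabulary of any particular pre-trained BERT checkpoint or to the authors' preprocessing code.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
SPECIAL_TOKENS = ["[PAD]", "[UNK]", "[CLS]", "[SEP]"]
# Hypothetical vocabulary: special tokens first, then the 20 standard residues.
VOCAB = {tok: idx for idx, tok in enumerate(SPECIAL_TOKENS + list(AMINO_ACIDS))}
KNOWN_RESIDUES = set(AMINO_ACIDS)


def peptide_to_tokens(seq, max_residues=None):
    """Split a peptide into one token per residue, wrapped in [CLS] ... [SEP]."""
    residues = list(seq.upper())
    if max_residues is not None:
        residues = residues[:max_residues]
    body = [aa if aa in KNOWN_RESIDUES else "[UNK]" for aa in residues]
    return ["[CLS]"] + body + ["[SEP]"]


def encode(seq, max_len):
    """Map tokens to ids and right-pad to max_len (max_len counts [CLS] and [SEP])."""
    tokens = peptide_to_tokens(seq, max_residues=max_len - 2)
    ids = [VOCAB[tok] for tok in tokens]
    return ids + [VOCAB["[PAD]"]] * (max_len - len(ids))


if __name__ == "__main__":
    print(peptide_to_tokens("GLFDIVKK"))
    print(encode("GLFDIVKK", max_len=12))
```

In practice the ids would come from the tokenizer shipped with the chosen pre-trained model, since the embedding table is tied to that model's own vocabulary.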
Affiliation(s)
- Wenxuan Xing
- School of Computer Science and Engineering, Northeastern University, No.195 Chuangxin Road, Hunnan District, Shenyang 110170, China
- Jie Zhang
- College of Computer and Information Engineering, Inner Mongolia Agricultural University, No.29 Erdos East Street, Saihan District, Hohhot 010011, China
- Chen Li
- School of Computer Science and Engineering, Northeastern University, No.195 Chuangxin Road, Hunnan District, Shenyang 110170, China
- Yujia Huo
- College of Computer and Information Engineering, Inner Mongolia Agricultural University, No.29 Erdos East Street, Saihan District, Hohhot 010011, China
- Gaifang Dong
- College of Computer and Information Engineering, Inner Mongolia Agricultural University, No.29 Erdos East Street, Saihan District, Hohhot 010011, China
20
Yao L, Zhang Y, Li W, Chung C, Guan J, Zhang W, Chiang Y, Lee T. DeepAFP: An effective computational framework for identifying antifungal peptides based on deep learning. Protein Sci 2023; 32:e4758. [PMID: 37595093 PMCID: PMC10503419 DOI: 10.1002/pro.4758] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/05/2023] [Revised: 08/02/2023] [Accepted: 08/10/2023] [Indexed: 08/20/2023]
Abstract
Fungal infections have become a significant global health issue, affecting millions worldwide. Antifungal peptides (AFPs) have emerged as a promising alternative to conventional antifungal drugs due to their low toxicity and low propensity for inducing resistance. In this study, we developed a deep learning-based framework called DeepAFP to efficiently identify AFPs. DeepAFP fully leverages and mines composition information, evolutionary information, and physicochemical properties of peptides by employing combined kernels from multiple branches of convolutional neural network with bi-directional long short-term memory layers. In addition, DeepAFP integrates a transfer learning strategy to obtain efficient representations of peptides for improving model performance. DeepAFP demonstrates strong predictive ability on carefully curated datasets, yielding an accuracy of 93.29% and an F1-score of 93.45% on the DeepAFP-Main dataset. The experimental results show that DeepAFP outperforms existing AFP prediction tools, achieving state-of-the-art performance. Finally, we provide a downloadable AFP prediction tool to meet the demands of large-scale prediction and facilitate the usage of our framework by the public or other researchers. Our framework can accurately identify AFPs in a short time without requiring significant human and material resources, and hence can accelerate the development of AFPs as well as contribute to the treatment of fungal infections. Furthermore, our method can provide new perspectives for other biological sequence analysis tasks.
Affiliation(s)
- Lantian Yao
- Kobilka Institute of Innovative Drug Discovery, School of Medicine, The Chinese University of Hong Kong, Shenzhen, China
- School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, China
- Yuntian Zhang
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, China
- Wenshuo Li
- School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, China
- Chia‐Ru Chung
- Department of Computer Science and Information Engineering, National Central University, Taoyuan, Taiwan
- Jiahui Guan
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, China
- Wenyang Zhang
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, China
- Ying‐Chih Chiang
- Kobilka Institute of Innovative Drug Discovery, School of Medicine, The Chinese University of Hong Kong, Shenzhen, China
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, China
- Tzong‐Yi Lee
- Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
- Center for Intelligent Drug Systems and Smart Bio‐devices (IDS2B), National Yang Ming Chiao Tung University, Hsinchu, Taiwan
21
Qiu S, Liu R, Liang Y. GR-m6A: Prediction of N6-methyladenosine sites in mammals with molecular graph and residual network. Comput Biol Med 2023; 163:107202. [PMID: 37450964 DOI: 10.1016/j.compbiomed.2023.107202] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/08/2023] [Revised: 06/14/2023] [Accepted: 06/25/2023] [Indexed: 07/18/2023]
Abstract
RNA N6-methyladenosine (m6A), which is produced by methylation at the N6 position of eukaryotic adenine, is a relatively common post-transcriptional modification on the surface of the molecule and frequently plays a crucial role in biological processes. Experimental methods for identifying m6A have been developed in recent years, but they have not been widely adopted because of the time they require and the cost of reagents and equipment. Researchers have therefore proposed computational strategies for identifying m6A sites, but these strategies do not account for the mechanism by which methylation occurs or for the structure of RNA molecules. This study therefore proposes GR-m6A, a novel deep learning model that predicts m6A sites by extracting features from the physicochemical properties and spatial structure of molecules via residual networks. In GR-m6A, each RNA base string is expressed in SMILES and represented as two matrices: one holding topological structure information and one holding node attributes with molecular physicochemical characteristics. The feature encoding matrix is then obtained by fusing the topology matrix and the node matrix in accordance with the graph convolutional network principle. More discriminative features are subsequently extracted from the encoding matrix by a residual neural network, and a multilayer perceptron makes the final prediction. In 5-fold cross-validation and independent validation, the GR-m6A model outperformed other existing methods. Thus, we hope that GR-m6A can aid researchers in predicting mammalian m6A loci. The source code and database are available at https://github.com/YingLiangjxau/GR-m6A.
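The fusion of a topology matrix with a node-attribute matrix described in this abstract can be illustrated with the standard graph convolution propagation rule, D^(-1/2) (A + I) D^(-1/2) X. The sketch below is a generic, unweighted version of that rule in plain Python; it is an assumption about what the "graph convolutional network principle" refers to, not the authors' implementation. The function name `gcn_fuse`, the three-atom toy graph, and the two-column attribute matrix are all hypothetical.

```python
import math


def gcn_fuse(adjacency, node_features):
    """One graph-convolution propagation step: D^-1/2 (A + I) D^-1/2 X."""
    n = len(adjacency)
    # Add self-loops so every atom keeps a share of its own attributes.
    a_hat = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    # Symmetric normalisation by the degree of each node (self-loop included).
    inv_sqrt_deg = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    norm = [[inv_sqrt_deg[i] * a_hat[i][j] * inv_sqrt_deg[j] for j in range(n)]
            for i in range(n)]
    # Multiply the normalised topology matrix by the node-attribute matrix.
    n_feat = len(node_features[0])
    return [[sum(norm[i][k] * node_features[k][f] for k in range(n))
             for f in range(n_feat)]
            for i in range(n)]


# Hypothetical three-atom chain (atom 0 - atom 1 - atom 2).
toy_adjacency = [[0, 1, 0],
                 [1, 0, 1],
                 [0, 1, 0]]
# Hypothetical node attributes: a two-column one-hot element type.
toy_features = [[1.0, 0.0],
                [0.0, 1.0],
                [1.0, 0.0]]
toy_encoding = gcn_fuse(toy_adjacency, toy_features)
```

Each row of `toy_encoding` mixes an atom's own attributes with those of its bonded neighbours. In a full model, a matrix of this kind would typically be multiplied by learned weights before being passed to later layers such as the residual network mentioned above.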
Affiliation(s)
- Shi Qiu
  - College of Engineering, Jiangxi Agricultural University, Nanchang 310045, Jiangxi, China.
- Renxin Liu
  - College of Engineering, Jiangxi Agricultural University, Nanchang 310045, Jiangxi, China.
- Ying Liang
  - College of Computer and Information Engineering, Jiangxi Agricultural University, Nanchang 310045, Jiangxi, China.
|