1
Hu Y, Zhou J, Gao Y, Chen B, Su J, Li H. Deep Learning Accelerates the Development of Antimicrobial Peptides Comprising 15 Amino Acids. Assay Drug Dev Technol 2025. [PMID: 40139786 DOI: 10.1089/adt.2025.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/29/2025] Open
Abstract
The emergence of multidrug-resistant bacteria has led to an urgent need for novel antimicrobial agents. Antimicrobial peptides (AMPs) exhibit broad-spectrum and highly effective antibacterial activity and are less prone to resistance, making them potential candidates for the next generation of antimicrobial drugs. However, screening for AMPs from a vast library of peptides through wet lab experiments is a slow and laborious process. By leveraging large datasets of labeled peptides, researchers utilize deep learning algorithms to train models that capture complex patterns and features associated with antimicrobial activity, which advance the discovery and development of novel AMPs. Since the discovery of certain lengths of AMPs has been rarely reported, we applied deep learning to accelerate the discovery of AMPs consisting of 15 amino acids and developed a model named AMPPRED15 in this article. Wet lab experiments were also conducted to evaluate the performance of the model. Fortunately, we successfully identified two AMPs, one of which demonstrated antibacterial activities comparable to the marketed antibiotic cefoperazone sodium.
Affiliation(s)
- Yuchen Hu
  - National '111' Centre for Cellular Regulation and Molecular Pharmaceutics, Key Laboratory of Fermentation Engineering (Ministry of Education), Cooperative Innovation Centre of Industrial Fermentation (Ministry of Education & Hubei Province), School of Life and Health Sciences, Hubei University of Technology, Wuhan, PR China
- Junchao Zhou
  - National '111' Centre for Cellular Regulation and Molecular Pharmaceutics, Key Laboratory of Fermentation Engineering (Ministry of Education), Cooperative Innovation Centre of Industrial Fermentation (Ministry of Education & Hubei Province), School of Life and Health Sciences, Hubei University of Technology, Wuhan, PR China
- Yuhang Gao
  - National '111' Centre for Cellular Regulation and Molecular Pharmaceutics, Key Laboratory of Fermentation Engineering (Ministry of Education), Cooperative Innovation Centre of Industrial Fermentation (Ministry of Education & Hubei Province), School of Life and Health Sciences, Hubei University of Technology, Wuhan, PR China
- Ban Chen
  - National '111' Centre for Cellular Regulation and Molecular Pharmaceutics, Key Laboratory of Fermentation Engineering (Ministry of Education), Cooperative Innovation Centre of Industrial Fermentation (Ministry of Education & Hubei Province), School of Life and Health Sciences, Hubei University of Technology, Wuhan, PR China
- Jiangtao Su
  - National '111' Centre for Cellular Regulation and Molecular Pharmaceutics, Key Laboratory of Fermentation Engineering (Ministry of Education), Cooperative Innovation Centre of Industrial Fermentation (Ministry of Education & Hubei Province), School of Life and Health Sciences, Hubei University of Technology, Wuhan, PR China
- Hong Li
  - School of Pharmacy, Guangxi Medical University, Nanning, PR China
2
He M, Jiang Y, Yang Y, Gong K, Jiang X, Tian Y. MSCMamba: Prediction of Antimicrobial Peptide Activity Values by Fusing Multiscale Convolution with Mamba Module. J Phys Chem B 2025; 129:1956-1965. [PMID: 39915928 DOI: 10.1021/acs.jpcb.4c07752] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/21/2025]
Abstract
Antimicrobial peptides (AMPs) have important developmental prospects as potential candidates for novel antibiotics. Although many studies have been devoted to the identification of AMPs and the qualitative prediction of their functional activities, few methods address the quantitative prediction of their activity values. In this paper, we propose a regression model called MSCMamba, which fuses multiscale convolutional neural network with Mamba module to accurately predict the activity values of AMPs. AMPs sequences are feature-extracted by multiple encoding methods and fed into a multiscale convolutional network and a Mamba module to capture local and long-range dependent features, respectively. The model fuses these two outputs and predicts the activity values of AMPs through a linear layer. Experimental results show that MSCMamba outperforms the current state-of-the-art methods in several performance metrics, especially with an increase in R² from 0.422 to 0.467, representing a 10.66% improvement. Additionally, we did a series of ablation experiments to verify the validity of each part of the MSCMamba model and the performance enhancement of feature diversification. This study provides a new method for activity prediction of AMPs, which is expected to accelerate the development of novel antibiotics.
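The 10.66% figure quoted in the abstract is a relative gain over the baseline R², not a difference in percentage points. A minimal sketch of that arithmetic follows, together with the textbook definition of the coefficient of determination; the function names are illustrative and are not taken from the MSCMamba code.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def relative_improvement(baseline, new):
    """Relative gain of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


# R² values quoted in the abstract: baseline 0.422, MSCMamba 0.467.
gain = relative_improvement(0.422, 0.467)
print(f"{gain:.2f}%")  # prints 10.66%
```

The absolute gain is 0.045 in R²; dividing by the baseline of 0.422 gives the relative figure.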
Affiliation(s)
- Mingyue He
  - School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 610031 Sichuan, China
- Yongquan Jiang
  - School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 610031 Sichuan, China
  - Artificial Intelligence Research Institute, Southwest Jiaotong University, Chengdu 610031 Sichuan, China
- Yan Yang
  - School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 610031 Sichuan, China
  - Artificial Intelligence Research Institute, Southwest Jiaotong University, Chengdu 610031 Sichuan, China
- Kuanping Gong
  - School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 610031 Sichuan, China
- Xuanpei Jiang
  - School of Life Science and Technology, Southwest Jiaotong University, Chengdu 610031 Sichuan, China
- Yuan Tian
  - School of Life Science and Technology, Southwest Jiaotong University, Chengdu 610031 Sichuan, China
3
Rathore AS, Choudhury S, Arora A, Tijare P, Raghava GPS. ToxinPred 3.0: An improved method for predicting the toxicity of peptides. Comput Biol Med 2024; 179:108926. [PMID: 39038391 DOI: 10.1016/j.compbiomed.2024.108926] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2023] [Revised: 05/17/2024] [Accepted: 07/17/2024] [Indexed: 07/24/2024]
Abstract
Toxicity emerges as a prominent challenge in the design of therapeutic peptides, causing the failure of numerous peptides during clinical trials. In 2013, our group developed ToxinPred, a computational method that has been extensively adopted by the scientific community for predicting peptide toxicity. In this paper, we propose a refined variant of ToxinPred that showcases improved reliability and accuracy in predicting peptide toxicity. Initially, we utilized a similarity/alignment-based approach employing BLAST to predict toxic peptides, which yielded satisfactory accuracy; however, the method suffered from inadequate coverage. Subsequently, we employed a motif-based approach using MERCI software to uncover specific patterns or motifs that are exclusively observed in toxic peptides. The search for these motifs in peptides allowed us to predict toxic peptides with a high level of specificity with poor sensitivity. To overcome the coverage limitations, we developed alignment-free methods using machine/deep learning techniques to balance sensitivity and specificity of prediction. Deep learning model (ANN - LSTM with fixed sequence length) developed using one-hot encoding achieved a maximum AUROC of 0.93 with MCC of 0.71 on an independent dataset. Machine learning model (extra tree) developed using compositional features of peptides achieved a maximum AUROC of 0.95 with MCC of 0.78. We also developed large language models and achieved maximum AUC of 0.93 using ESM2-t33. Finally, we developed hybrid or ensemble methods combining two or more methods to enhance performance. Our specific hybrid method, which combines a motif-based approach with a machine learning-based model, achieved a maximum AUROC of 0.98 with MCC 0.81 on an independent dataset. In this study, all models were trained and tested on 80 % of data using five-fold cross-validation and evaluated on the remaining 20 % of data called independent dataset. 
The evaluation of all methods on an independent dataset revealed that the method proposed in this study exhibited better performance than existing methods. To cater to the needs of the scientific community, we have developed a standalone software, pip package and web-based server ToxinPred3 (https://github.com/raghavagps/toxinpred3 and https://webs.iiitd.edu.in/raghava/toxinpred3/).
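The evaluation protocol described in the abstract (80% of the data for five-fold cross-validation, the remaining 20% held out as an independent dataset) can be sketched as below. This is a standard-library illustration under stated assumptions: the shuffling seed, the helper names, and the placeholder peptide identifiers are hypothetical, and the actual ToxinPred3 pipeline may split the data differently (for example with stratification).

```python
import random


def split_independent(items, test_fraction=0.2, seed=0):
    """Shuffle and hold out `test_fraction` of the items as an independent set."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def k_fold(items, k=5):
    """Yield (train, validation) pairs; each item is validated exactly once."""
    folds = [items[i::k] for i in range(k)]
    for i in range(k):
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, folds[i]


peptides = [f"pep{i}" for i in range(100)]  # placeholder identifiers, not real data
development, independent = split_independent(peptides)
for train, validation in k_fold(development):
    # A model would be fitted on `train` and scored on `validation` here.
    print(len(train), len(validation))
```

Only after model selection on the five folds would the held-out 20% be scored, which is what makes it an independent estimate.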
Affiliation(s)
- Anand Singh Rathore
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, 110020, India.
- Shubham Choudhury
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, 110020, India.
- Akanksha Arora
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, 110020, India.
- Purva Tijare
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, 110020, India.
- Gajendra P S Raghava
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, 110020, India.
4
Kang Y, Zhang H, Wang X, Yang Y, Jia Q. MMDB: Multimodal dual-branch model for multi-functional bioactive peptide prediction. Anal Biochem 2024; 690:115491. [PMID: 38460901 DOI: 10.1016/j.ab.2024.115491] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2023] [Revised: 01/21/2024] [Accepted: 02/19/2024] [Indexed: 03/11/2024]
Abstract
Bioactive peptides can hinder oxidative processes and microbial spoilage in foodstuffs and play important roles in treating diverse diseases and disorders. While most of the methods focus on single-functional bioactive peptides and have obtained promising prediction performance, it is still a significant challenge to accurately detect complex and diverse functions simultaneously with the quick increase of multi-functional bioactive peptides. In contrast to previous research on multi-functional bioactive peptide prediction based solely on sequence, we propose a novel multimodal dual-branch (MMDB) lightweight deep learning model that designs two different branches to effectively capture the complementary information of peptide sequence and structural properties. Specifically, a multi-scale dilated convolution with Bi-LSTM branch is presented to effectively model the different scales sequence properties of peptides while a multi-layer convolution branch is proposed to capture structural information. To the best of our knowledge, this is the first effective extraction of peptide sequence features using multi-scale dilated convolution without parameter increase. Multimodal features from both branches are integrated via a fully connected layer for multi-label classification. Compared to state-of-the-art methods, our MMDB model exhibits competitive results across metrics, with a 9.1% Coverage increase and 5.3% and 3.5% improvements in Precision and Accuracy, respectively.
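One way to read the claim about multi-scale dilated convolution "without parameter increase" is that dilation widens the receptive field while the kernel keeps the same number of weights. The sketch below illustrates that property with a plain 1D cross-correlation; the kernel size, dilation rates, and toy signal are assumptions chosen for illustration, since the abstract does not state the values used in MMDB.

```python
def receptive_field(kernel_size, dilation):
    """Number of input positions spanned by one output of a dilated kernel."""
    return (kernel_size - 1) * dilation + 1


def dilated_conv1d(signal, kernel, dilation=1):
    """Valid 1D cross-correlation with `dilation - 1` skipped inputs between taps."""
    span = receptive_field(len(kernel), dilation)
    return [
        sum(w * signal[start + i * dilation] for i, w in enumerate(kernel))
        for start in range(len(signal) - span + 1)
    ]


signal = [float(i) for i in range(10)]  # toy input, not peptide features
kernel = [1.0, 1.0, 1.0]                # three weights at every scale
multi_scale = {d: dilated_conv1d(signal, kernel, d) for d in (1, 2, 3)}

for d, out in multi_scale.items():
    print(d, receptive_field(len(kernel), d), len(kernel), out[0])
```

Each scale reuses a three-weight kernel, yet the span grows from 3 to 5 to 7 input positions as the dilation goes from 1 to 3.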
Affiliation(s)
- Yan Kang
  - National Pilot School of Software, Yunnan University, Kunming, 650091, Yunnan, China; Yunnan Key Laboratory of Software Engineering, China
- Huadong Zhang
  - National Pilot School of Software, Yunnan University, Kunming, 650091, Yunnan, China
- Xinchao Wang
  - National Pilot School of Software, Yunnan University, Kunming, 650091, Yunnan, China
- Yun Yang
  - National Pilot School of Software, Yunnan University, Kunming, 650091, Yunnan, China; Yunnan Key Laboratory of Software Engineering, China.
- Qi Jia
  - School of Information Science, Yunnan University, Kunming, 650091, Yunnan, China
5
Shaon MSH, Karim T, Sultan MF, Ali MM, Ahmed K, Hasan MZ, Moustafa A, Bui FM, Al-Zahrani FA. AMP-RNNpro: a two-stage approach for identification of antimicrobials using probabilistic features. Sci Rep 2024; 14:12892. [PMID: 38839785 PMCID: PMC11153637 DOI: 10.1038/s41598-024-63461-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2024] [Accepted: 05/29/2024] [Indexed: 06/07/2024] Open
Abstract
Antimicrobials are molecules that prevent the formation of microorganisms such as bacteria, viruses, fungi, and parasites. The necessity to detect antimicrobial peptides (AMPs) using machine learning and deep learning arises from the need for efficiency to accelerate the discovery of AMPs, and contribute to developing effective antimicrobial therapies, especially in the face of increasing antibiotic resistance. This study introduced AMP-RNNpro based on Recurrent Neural Network (RNN), an innovative model for detecting AMPs, which was designed with eight feature encoding methods that are selected according to four criteria: amino acid compositional, grouped amino acid compositional, autocorrelation, and pseudo-amino acid compositional to represent the protein sequences for efficient identification of AMPs. In our framework, two-stage predictions have been conducted. Initially, this study analyzed 33 models on these feature extractions. Then, we selected the best six models from these models using rigorous performance metrics. In the second stage, probabilistic features have been generated from the selected six models in each feature encoding and they are aggregated to be fed into our final meta-model called AMP-RNNpro. This study also introduced 20 features with SHAP, which are crucial in the drug development fields, where we discover AAC, ASDC, and CKSAAGP features are highly impactful for detection and drug discovery. Our proposed framework, AMP-RNNpro, excels in the identification of novel AMPs with 97.15% accuracy, 96.48% sensitivity, and 97.87% specificity. We built a user-friendly website for demonstrating the accurate prediction of AMPs based on the proposed approach which can be accessed at http://13.126.159.30/.
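Of the feature encodings named in the abstract, amino acid composition (AAC) is the simplest: the fraction of each of the 20 standard residues in a sequence. The sketch below shows that standard descriptor only; it is not the AMP-RNNpro implementation, and the example peptide is a made-up 15-residue sequence used purely for illustration.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def amino_acid_composition(sequence):
    """Return the fraction of each standard residue in `sequence` (AAC descriptor)."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    length = len(sequence)
    return {aa: sequence.count(aa) / length for aa in AMINO_ACIDS}


# Hypothetical peptide, not taken from the paper.
aac = amino_acid_composition("KKLLKKLLKKLLKKL")
print({aa: round(freq, 3) for aa, freq in aac.items() if freq > 0})
```

The resulting 20-dimensional vector is the kind of fixed-length input that a first-stage classifier can consume regardless of peptide length.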
Affiliation(s)
- Md Shazzad Hossain Shaon
  - Department of Computer Science and Engineering, Daffodil International University, Daffodil Smart City, Birulia, Dhaka, 1216, Bangladesh
- Tasmin Karim
  - Department of Computer Science and Engineering, Daffodil International University, Daffodil Smart City, Birulia, Dhaka, 1216, Bangladesh
- Md Fahim Sultan
  - Department of Computer Science and Engineering, Daffodil International University, Daffodil Smart City, Birulia, Dhaka, 1216, Bangladesh
- Md Mamun Ali
  - Health Informatics Research Lab, Department of Computer Science and Engineering, Daffodil International University, Daffodil Smart City, Birulia, Dhaka, 1216, Bangladesh
  - Division of Biomedical Engineering, University of Saskatchewan, 57 Campus Drive, Saskatoon, SK, S7N 5A9, Canada
  - Department of Software Engineering, Daffodil International University, Daffodil Smart City (DSC), Birulia, Savar, Dhaka, 1216, Bangladesh
- Kawsar Ahmed
  - Health Informatics Research Lab, Department of Computer Science and Engineering, Daffodil International University, Daffodil Smart City, Birulia, Dhaka, 1216, Bangladesh.
  - Department of Electrical and Computer Engineering, University of Saskatchewan, 57 Campus Drive, Saskatoon, SK, S7N 5A9, Canada.
  - Group of Bio-photomatiχ, Information and Communication Technology, Mawlana Bhashani Science and Technology University, Santosh, Tangail, 1902, Bangladesh.
- Md Zahid Hasan
  - Department of Computer Science and Engineering, Daffodil International University, Daffodil Smart City, Birulia, Dhaka, 1216, Bangladesh
  - Health Informatics Research Lab, Department of Computer Science and Engineering, Daffodil International University, Daffodil Smart City, Birulia, Dhaka, 1216, Bangladesh
- Ahmed Moustafa
  - Department of Human Anatomy and Physiology, The Faculty of Health Sciences, University of Johannesburg, Johannesburg, South Africa
  - School of Psychology, Centre for Data Analytics, Bond University, Gold Coast, QLD, Australia
- Francis M Bui
  - Department of Electrical and Computer Engineering, University of Saskatchewan, 57 Campus Drive, Saskatoon, SK, S7N 5A9, Canada
6
Szymczak P, Szczurek E. Artificial intelligence-driven antimicrobial peptide discovery. Curr Opin Struct Biol 2023; 83:102733. [PMID: 37992451 DOI: 10.1016/j.sbi.2023.102733] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/17/2023] [Revised: 10/06/2023] [Accepted: 10/30/2023] [Indexed: 11/24/2023]
Abstract
Antimicrobial peptides (AMPs) emerge as promising agents against antimicrobial resistance, providing an alternative to conventional antibiotics. Artificial intelligence (AI) revolutionized AMP discovery through both discrimination and generation approaches. The discriminators aid in the identification of promising candidates by predicting key peptide properties such as activity and toxicity, while the generators learn the distribution of peptides and enable sampling novel AMP candidates, either de novo or as analogs of a prototype peptide. Moreover, the controlled generation of AMPs with desired properties is achieved by discriminator-guided filtering, positive-only learning, latent space sampling, as well as conditional and optimized generation. Here we review recent achievements in AI-driven AMP discovery, highlighting the most exciting directions.
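Discriminator-guided filtering, the first of the controlled-generation strategies listed in the abstract, can be sketched as a generate-then-score loop. Everything concrete below is a stand-in: `toy_generator` samples uniform random peptides instead of a trained generative model, and `toy_discriminator` scores the fraction of cationic residues (K, R) instead of a learned activity or toxicity predictor. Only the control flow is the point.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_generator(n, length, seed=0):
    """Stand-in generator: uniform random peptides (not a trained model)."""
    rng = random.Random(seed)
    return ["".join(rng.choice(AMINO_ACIDS) for _ in range(length)) for _ in range(n)]


def toy_discriminator(peptide):
    """Stand-in score in [0, 1]: fraction of cationic residues (K, R)."""
    return sum(peptide.count(aa) for aa in "KR") / len(peptide)


def guided_filter(candidates, score, threshold):
    """Keep only candidates whose discriminator score reaches the threshold."""
    return [p for p in candidates if score(p) >= threshold]


candidates = toy_generator(n=200, length=15)
kept = guided_filter(candidates, toy_discriminator, threshold=0.2)
print(len(candidates), len(kept))
```

In a real pipeline the scoring function would be a trained classifier or regressor, and the threshold would trade off the number of surviving candidates against their predicted quality.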
Affiliation(s)
- Paulina Szymczak
  - Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Banacha 2, 02-097, Warsaw, Poland.
- Ewa Szczurek
  - Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Banacha 2, 02-097, Warsaw, Poland.