1
Zhang Y, Yu L, Xue L, Liu F, Jing R, Luo J. Optimizing lipocalin sequence classification with ensemble deep learning models. PLoS One 2025; 20:e0319329. [PMID: 40238838 PMCID: PMC12002463 DOI: 10.1371/journal.pone.0319329] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2024] [Accepted: 01/30/2025] [Indexed: 04/18/2025] Open
Abstract
Deep learning (DL) has become a powerful tool for the recognition and classification of biological sequences. However, conventional single-architecture models often struggle with suboptimal predictive performance and high computational costs. To address these challenges, we present EnsembleDL-Lipo, an innovative ensemble deep learning framework that combines Convolutional Neural Networks (CNNs) and Deep Neural Networks (DNNs) to enhance the identification of lipocalin sequences. Lipocalins are multifunctional extracellular proteins involved in various diseases and stress responses, and their low sequence similarity and occurrence in the 'twilight zone' of sequence alignment present significant hurdles for accurate classification. These challenges necessitate efficient computational methods to complement traditional, labor-intensive experimental approaches. EnsembleDL-Lipo overcomes these issues by leveraging a set of PSSM-based features to train a large ensemble of deep learning models. The framework integrates multiple feature representations derived from position-specific scoring matrices (PSSMs), optimizing classification performance across diverse sequence patterns. The model achieved superior results on the training dataset, with an accuracy (ACC) of 97.65%, recall of 97.10%, Matthews correlation coefficient (MCC) of 0.95, and area under the curve (AUC) of 0.99. Validation on an independent test set further confirmed the robustness of the model, yielding an ACC of 95.79%, recall of 90.48%, MCC of 0.92, and AUC of 0.97. These results demonstrate that EnsembleDL-Lipo is a highly effective and computationally efficient tool for lipocalin sequence identification, significantly outperforming existing methods and offering strong potential for applications in biomarker discovery.
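The accuracy, recall, and Matthews correlation coefficient quoted in the abstract above are all functions of a binary confusion matrix. The following is a minimal sketch of those standard formulas; the function name and the counts in the example are hypothetical and are not taken from the paper.

```python
import math


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute accuracy, recall, and MCC from binary confusion-matrix counts."""
    total = tp + fp + tn + fn
    acc = (tp + tn) / total
    recall = tp / (tp + fn)
    # MCC denominator is zero when any row or column of the matrix is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"acc": acc, "recall": recall, "mcc": mcc}


# Hypothetical counts for illustration only (not the paper's data).
metrics = classification_metrics(tp=45, fp=5, tn=45, fn=5)
print(metrics)
```

Unlike accuracy, MCC uses all four cells of the matrix, which is why it is often reported alongside accuracy when the two classes are imbalanced.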
Affiliation(s)
- Yonglin Zhang
  - Department of Pharmacy, Affiliated Hospital of North Sichuan Medical College, Nanchong, Sichuan, China
- Lezheng Yu
  - School of Chemistry and Materials Science, Guizhou Education University, Guiyang, China
- Li Xue
  - School of Public Health, Southwest Medical University, Luzhou, China
- Fengjuan Liu
  - School of Geography and Resources, Guizhou Education University, Guiyang, China
- Runyu Jing
  - School of Mathematics and Big Data, Guizhou Education University, Guiyang, China
- Jiesi Luo
  - School of Basic Medical Sciences, Southwest Medical University, Luzhou, Sichuan, China
  - Sichuan Key Medical Laboratory of New Drug Discovery and Druggability Evaluation, Luzhou Key Laboratory of Activity Screening and Druggability Evaluation for Chinese Materia Medica, Southwest Medical University, Luzhou, Sichuan, China
2
Cai J, Yan J, Un C, Wang Y, Campbell-Valois FX, Siu SWI. BERT-AmPEP60: A BERT-Based Transfer Learning Approach to Predict the Minimum Inhibitory Concentrations of Antimicrobial Peptides for Escherichia coli and Staphylococcus aureus. J Chem Inf Model 2025; 65:3186-3202. [PMID: 40086449 PMCID: PMC12004541 DOI: 10.1021/acs.jcim.4c01749] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/24/2024] [Revised: 02/06/2025] [Accepted: 02/06/2025] [Indexed: 03/16/2025]
Abstract
Antimicrobial peptides (AMPs) are a promising alternative for combating bacterial drug resistance. While current computer prediction models excel at binary classification of AMPs based on sequences, there is a lack of regression methods to accurately quantify AMP activity against specific bacteria, making the identification of highly potent AMPs a challenge. Here, we present a deep learning method, BERT-AmPEP60, based on the fine-tuned Bidirectional Encoder Representations from Transformers (BERT) architecture to extract embedding features from input sequences. Using the transfer learning strategy, we built regression models to predict the minimum inhibitory concentration (MIC) of peptides for Escherichia coli (EC) and Staphylococcus aureus (SA). In five independent experiments with 10% leave-out sequences as the test sets, the optimal EC and SA models outperformed the state-of-the-art regression method and traditional machine learning methods, achieving an average mean squared error of 0.2664 and 0.3032 (log μM), respectively. They also showed a Pearson correlation coefficient of 0.7955 and 0.7530, and a Kendall correlation coefficient of 0.5797 and 0.5222, respectively. Our models outperformed existing deep learning and machine learning methods that rely on conventional sequence features. This work underscores the effectiveness of utilizing BERT with transfer learning for training quantitative AMP prediction models specific for different bacterial species. The web server of BERT-AmPEP60 can be found at https://app.cbbio.online/ampep/home. To facilitate development, the program source codes are available at https://github.com/janecai0714/AMP_regression_EC_SA.
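The three regression metrics named in the abstract above (mean squared error, Pearson correlation, and Kendall correlation) can be sketched in plain Python as follows. The function names and the example values are hypothetical; the Kendall variant shown is tau-a, which ignores tie corrections and may differ from the variant the authors used.

```python
import math
from itertools import combinations


def mse(y_true, y_pred):
    """Mean squared error between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson(x, y):
    """Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant) / number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n = len(x)
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical log-scale MIC values, for illustration only.
observed = [0.5, 1.0, 1.5, 2.0]
predicted = [0.6, 0.9, 1.8, 1.7]
print(mse(observed, predicted), pearson(observed, predicted), kendall_tau_a(observed, predicted))
```

Pearson measures linear agreement of the values, while Kendall measures agreement of the ranking only, so the two can diverge when a model orders peptides well but misjudges magnitudes.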
Affiliation(s)
- Jianxiu Cai
  - Faculty of Applied Sciences, Macao Polytechnic University, Rua de Luís Gonzaga Gomes, Macau SAR 99078, China
  - Institute of Science and Environment, University of Saint Joseph, Rua de Luís Gonzaga Gomes, Macau SAR 99078, China
- Jielu Yan
  - Institute of Science and Environment, University of Saint Joseph, Rua de Luís Gonzaga Gomes, Macau SAR 99078, China
  - School of Computer Science, Chongqing University, Shapingba, Chongqing 400044, China
- Chonwai Un
  - T-Rex Technology HK Limited, Unit 1017-1, 10/F, Building 19W, Hongkong Science Park, Shatin, Hong Kong, New Territories
- Yapeng Wang
  - Faculty of Applied Sciences, Macao Polytechnic University, Rua de Luís Gonzaga Gomes, Macau SAR 99078, China
- François-Xavier Campbell-Valois
  - Host-Microbe Interactions Laboratory, Center for Chemical and Synthetic Biology, Department of Chemistry and Biomolecular Sciences, University of Ottawa, Ottawa, Ontario K1N 6N5, Canada
  - Centre for Infection, Immunity, and Inflammation, University of Ottawa, Ottawa K1N 6N5, Ontario, Canada
  - Department of Biochemistry, Microbiology and Immunology, University of Ottawa, Ottawa K1N 6N5, Ontario, Canada
- Shirley W. I. Siu
  - Centre for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Rua de Luís Gonzaga Gomes, Macau SAR 99078, China
3
Chen R, You Y, Liu Y, Sun X, Ma T, Lao X, Zheng H. Deep-Learning-Based Approaches for Rational Design of Stapled Peptides With High Antimicrobial Activity and Stability. Microb Biotechnol 2025; 18:e70121. [PMID: 40042163 PMCID: PMC11881016 DOI: 10.1111/1751-7915.70121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2024] [Revised: 02/09/2025] [Accepted: 02/15/2025] [Indexed: 05/12/2025] Open
Abstract
Antimicrobial peptides (AMPs) face stability and toxicity challenges in clinical use. Stapled modification enhances their stability and effectiveness, but its application in peptide design is rarely reported. This study built ten prediction models for stapled AMPs using deep and machine learning, tested their accuracy with an independent data set and wet lab experiments, and characterised stapled loop structures using structural, sequence and amino acid descriptors. AlphaFold improved stapled peptide structure prediction. The support vector machine model performed best, while two deep learning models achieved the highest accuracy of 1.0 on an external test set. Designed cysteine- and lysine-stapled peptides inhibited various bacteria at low concentrations and showed good serum stability and low haemolytic activity. This study highlights the potential of deep learning methods in peptide modification and design.
Affiliation(s)
- Ruole Chen
  - School of Life Science and Technology, China Pharmaceutical University, Nanjing, Jiangsu, China
- Yuhao You
  - School of Life Science and Technology, China Pharmaceutical University, Nanjing, Jiangsu, China
- Yanchao Liu
  - School of Life Science and Technology, China Pharmaceutical University, Nanjing, Jiangsu, China
- Xin Sun
  - School of Life Science and Technology, China Pharmaceutical University, Nanjing, Jiangsu, China
- Tianyue Ma
  - School of Life Science and Technology, China Pharmaceutical University, Nanjing, Jiangsu, China
- Xingzhen Lao
  - School of Life Science and Technology, China Pharmaceutical University, Nanjing, Jiangsu, China
- Heng Zheng
  - School of Life Science and Technology, China Pharmaceutical University, Nanjing, Jiangsu, China
4
Slavokhotova AA, Shelenkov AA, Rogozhin EA. Computational Prediction and Structural Analysis of α-Hairpinins, a Ubiquitous Family of Antimicrobial Peptides, Using the Cysmotif Searcher Pipeline. Antibiotics (Basel) 2024; 13:1019. [PMID: 39596714 PMCID: PMC11591084 DOI: 10.3390/antibiotics13111019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2024] [Revised: 10/25/2024] [Accepted: 10/26/2024] [Indexed: 11/29/2024] Open
Abstract
BACKGROUND: α-Hairpinins are a family of antimicrobial peptides and promising antimicrobial agents. Only 12 members with proven activity are currently known, although the real number is thought to be much higher. α-Hairpinins are short peptides containing four cysteine residues arranged in a specific Cys-motif. These antimicrobial peptides (AMPs) have a characteristic helix-loop-helix structure with two disulfide bonds. Isolating α-hairpinins by biochemical methods is costly and labor-intensive, so reliable preliminary in silico prediction is needed. METHODS: In this study, we developed a dedicated algorithm for predicting putative α-hairpinins on the basis of characteristic motifs with four (4C) and six (6C) cysteines deduced from translated plant transcriptome sequences. We integrated this algorithm into the Cysmotif searcher pipeline and then analyzed all transcriptomes available from the One Thousand Plant Transcriptomes project. RESULTS: We predicted more than 2000 putative α-hairpinins from various plant sources, including algae, mosses, ferns, and true flowering plants. These data make α-hairpinins one of the ubiquitous antimicrobial peptide families, widespread among various plants. The largest numbers of α-hairpinins were found in the Papaveraceae family, and in Papaver somniferum in particular. CONCLUSIONS: By analyzing the primary structure of α-hairpinins, we concluded that more predicted peptides with the 6C motif than with 4C motifs are likely to have potent antimicrobial activity. In addition, we found 30 α-hairpinin precursors containing from two to eight Cys-rich modules. A striking similarity between some α-hairpinin modules belonging to diverse plants was revealed. These data led us to propose that the evolution of α-hairpinin precursors possibly involved changes in the number of Cys-rich modules, with some middle and C-terminal modules in particular being lost.
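A motif-based search of the kind described above can be illustrated with a regular expression. This is only a sketch, not the Cysmotif searcher algorithm: the abstract does not spell out the motif, so the pattern below assumes the CX3C-Xn-CX3C arrangement commonly described for four-cysteine α-hairpinins, and the loop-length bounds, function name, and test sequence are all hypothetical.

```python
import re

# Assumed 4C motif: C-X3-C, a variable loop, then C-X3-C.
# The loop bounds (1 to 15 non-cysteine residues) are an illustrative guess.
FOUR_CYS_MOTIF = re.compile(r"C[^C]{3}C[^C]{1,15}C[^C]{3}C")


def find_4c_motifs(protein: str):
    """Return (start, matched_fragment) for each non-overlapping motif hit."""
    return [(m.start(), m.group()) for m in FOUR_CYS_MOTIF.finditer(protein.upper())]


# Made-up sequence for illustration, not a real α-hairpinin.
hits = find_4c_motifs("MKACAAACGGSGGGCKKKCLL")
print(hits)
```

A real pipeline would additionally need translation of transcripts, signal-peptide handling, and filtering of spurious hits, none of which this sketch attempts.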
Affiliation(s)
- Anna A. Slavokhotova
  - Central Research Institute of Epidemiology, Novogireevskaya Str., 3a, 111123 Moscow, Russia
  - Shemyakin and Ovchinnikov Institute of Bioorganic Chemistry RAS, Miklukho-Maklaya Str., 16/10, 117437 Moscow, Russia
- Andrey A. Shelenkov
  - Central Research Institute of Epidemiology, Novogireevskaya Str., 3a, 111123 Moscow, Russia
- Eugene A. Rogozhin
  - Shemyakin and Ovchinnikov Institute of Bioorganic Chemistry RAS, Miklukho-Maklaya Str., 16/10, 117437 Moscow, Russia
  - All-Russian Institute for Plant Protection, Podbelskogo Str., 196608 Saint-Petersburg-Pushkin, Russia
5
Isaac KS, Combe M, Potter G, Sokolenko S. Machine learning tools for peptide bioactivity evaluation - Implications for cell culture media optimization and the broader cultivated meat industry. Curr Res Food Sci 2024; 9:100842. [PMID: 39435450 PMCID: PMC11491887 DOI: 10.1016/j.crfs.2024.100842] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2024] [Accepted: 09/07/2024] [Indexed: 10/23/2024] Open
Abstract
Although bioactive peptides have traditionally been studied for their health-promoting qualities in the context of nutrition and medicine, the past twenty years have seen a steady increase in their application to cell culture media optimization. Complex natural sources of bioactive peptides, such as hydrolysates, offer a sustainable and cost-effective means of promoting cellular growth, making them an essential component of scaling-up cultivated meat production. However, the sheer diversity of hydrolysates makes product selection difficult, highlighting the need for functional characterization. Traditional wet-lab techniques for isolating and estimating peptide bioactivity cannot keep pace with peptide identification using high-throughput tools such as mass spectrometry, requiring the development and use of machine learning-based classifiers. This review provides a comprehensive list of available software tools to evaluate peptide bioactivity, classified and compared based on the algorithm, training set, functionality, and limitations of the underlying models. We curated independent test sets to compare the predictive performance of different models based on specific bioactivity classification relevant to promoting cell culture growth: antioxidant and anti-inflammatory. A comprehensive screening of all bioactivity classifiers revealed that while there are approximately fifty tools to elucidate antimicrobial activity and sixteen that predict anti-inflammatory activity, fewer tools are available for other functionalities related to cell growth - five that predict antioxidant activity and two for growth factor and/or cell signaling prediction. A thorough evaluation of the available tools revealed significant issues with sensitivity, specificity, and overall accuracy. Despite the overall interest in estimating peptide bioactivity, our work highlights key gaps in the broader adoption of existing software for the specific application of cell culture media optimization in the context of cultivated meat and beyond.
Affiliation(s)
- Kathy Sharon Isaac
  - Process Engineering and Applied Science, Dalhousie University, 5273 DaCosta Row, PO Box 15000, Halifax, B3H 4R2, NS, Canada
- Michelle Combe
  - Process Engineering and Applied Science, Dalhousie University, 5273 DaCosta Row, PO Box 15000, Halifax, B3H 4R2, NS, Canada
- Stanislav Sokolenko
  - Process Engineering and Applied Science, Dalhousie University, 5273 DaCosta Row, PO Box 15000, Halifax, B3H 4R2, NS, Canada
6
Zhang Y, Liu LH, Xu B, Zhang Z, Yang M, He Y, Chen J, Zhang Y, Hu Y, Chen X, Sun Z, Ge Q, Wu S, Lei W, Li K, Cui H, Yang G, Zhao X, Wang M, Xia J, Cao Z, Jiang A, Wu YR. Screening antimicrobial peptides and probiotics using multiple deep learning and directed evolution strategies. Acta Pharm Sin B 2024; 14:3476-3492. [PMID: 39234615 PMCID: PMC11372459 DOI: 10.1016/j.apsb.2024.05.003] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/04/2024] [Revised: 03/25/2024] [Accepted: 05/06/2024] [Indexed: 09/06/2024] Open
Abstract
Owing to their limited accuracy and narrow applicability, current antimicrobial peptide (AMP) prediction models face obstacles in industrial application. To address these limitations, we developed and improved an AMP prediction model using Comparing and Optimizing Multiple DEep Learning (COMDEL) algorithms, coupled with a high-throughput AMP screening method, reaching an accuracy of 94.8% in testing and 88% in experimental verification, surpassing other state-of-the-art models. In conjunction with COMDEL, we employed the phage-assisted evolution method to screen Sortase in vivo and developed a cell-free AMP synthesis system in vitro, ultimately increasing AMP yields to 0.5-2.1 g/L within hours. Moreover, by multi-omics analysis using COMDEL, we identified Lactobacillus plantarum as the most promising candidate for AMP generation among 35 edible probiotics. Following this, we developed a microdroplet sorting approach and successfully screened three L. plantarum mutants, each showing a twofold increase in antimicrobial ability, underscoring their substantial value for industrial application.
Affiliation(s)
- Yu Zhang
  - Tidetron Bioworks Technology (Guangzhou) Co., Ltd., Guangzhou Qianxiang Bioworks Co., Ltd., Guangzhou 510000, China
- Li-Hua Liu
  - Tidetron Bioworks Technology (Guangzhou) Co., Ltd., Guangzhou Qianxiang Bioworks Co., Ltd., Guangzhou 510000, China
  - Biology Department and Institute of Marine Sciences, College of Science, Shantou University, Shantou 515063, China
- Bo Xu
  - School of Basic Medical Sciences, Hubei University of Science and Technology, Xianning 437100, China
- Zhiqian Zhang
  - Tidetron Bioworks Technology (Guangzhou) Co., Ltd., Guangzhou Qianxiang Bioworks Co., Ltd., Guangzhou 510000, China
- Min Yang
  - Tidetron Bioworks Technology (Guangzhou) Co., Ltd., Guangzhou Qianxiang Bioworks Co., Ltd., Guangzhou 510000, China
- Yiyang He
  - School of Education, Jianghan University, Wuhan 430056, China
- Jingjing Chen
  - Yeasen Biotechnology (Shanghai) Co., Ltd., Shanghai 200000, China
- Yang Zhang
  - Tidetron Bioworks Technology (Guangzhou) Co., Ltd., Guangzhou Qianxiang Bioworks Co., Ltd., Guangzhou 510000, China
- Yucheng Hu
  - Tidetron Bioworks Technology (Guangzhou) Co., Ltd., Guangzhou Qianxiang Bioworks Co., Ltd., Guangzhou 510000, China
- Xipeng Chen
  - Tidetron Bioworks Technology (Guangzhou) Co., Ltd., Guangzhou Qianxiang Bioworks Co., Ltd., Guangzhou 510000, China
- Zitong Sun
  - Tidetron Bioworks Technology (Guangzhou) Co., Ltd., Guangzhou Qianxiang Bioworks Co., Ltd., Guangzhou 510000, China
- Qijun Ge
  - Tidetron Bioworks Technology (Guangzhou) Co., Ltd., Guangzhou Qianxiang Bioworks Co., Ltd., Guangzhou 510000, China
- Song Wu
  - Tidetron Bioworks Technology (Guangzhou) Co., Ltd., Guangzhou Qianxiang Bioworks Co., Ltd., Guangzhou 510000, China
- Wei Lei
  - Tidetron Bioworks Technology (Guangzhou) Co., Ltd., Guangzhou Qianxiang Bioworks Co., Ltd., Guangzhou 510000, China
- Kaizheng Li
  - Tidetron Bioworks Technology (Guangzhou) Co., Ltd., Guangzhou Qianxiang Bioworks Co., Ltd., Guangzhou 510000, China
- Hua Cui
  - Tidetron Bioworks Technology (Guangzhou) Co., Ltd., Guangzhou Qianxiang Bioworks Co., Ltd., Guangzhou 510000, China
- Gangzhu Yang
  - Tidetron Bioworks Technology (Guangzhou) Co., Ltd., Guangzhou Qianxiang Bioworks Co., Ltd., Guangzhou 510000, China
- Xuemei Zhao
  - Tidetron Bioworks Technology (Guangzhou) Co., Ltd., Guangzhou Qianxiang Bioworks Co., Ltd., Guangzhou 510000, China
- Man Wang
  - Yeasen Biotechnology (Shanghai) Co., Ltd., Shanghai 200000, China
- Jiaqi Xia
  - School of Basic Medicine, Jiamusi University, Jiamusi 154000, China
- Zhen Cao
  - Yeasen Biotechnology (Shanghai) Co., Ltd., Shanghai 200000, China
- Ao Jiang
  - Tidetron Bioworks Technology (Guangzhou) Co., Ltd., Guangzhou Qianxiang Bioworks Co., Ltd., Guangzhou 510000, China
- Yi-Rui Wu
  - Tidetron Bioworks Technology (Guangzhou) Co., Ltd., Guangzhou Qianxiang Bioworks Co., Ltd., Guangzhou 510000, China
7
Yu D, Andersson-Li M, Maes S, Andersson-Li L, Neumann NF, Odlare M, Jonsson A. Development of a logic regression-based approach for the discovery of host- and niche-informative biomarkers in Escherichia coli and their application for microbial source tracking. Appl Environ Microbiol 2024; 90:e0022724. [PMID: 38940567 PMCID: PMC11267920 DOI: 10.1128/aem.00227-24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2024] [Accepted: 06/07/2024] [Indexed: 06/29/2024] Open
Abstract
Microbial source tracking leverages a wide range of approaches designed to trace the origins of fecal contamination in aquatic environments. Although source tracking methods are typically employed within the laboratory setting, computational techniques can be leveraged to advance microbial source tracking methodology. Herein, we present a logic regression-based supervised learning approach for the discovery of source-informative genetic markers within intergenic regions across the Escherichia coli genome that can be used for source tracking. With just single intergenic loci, logic regression was able to identify highly source-specific (i.e., exceeding 97.00%) biomarkers for a wide range of host and niche sources, with sensitivities reaching as high as 30.00%-50.00% for certain source categories, including pig, sheep, mouse, and wastewater, depending on the specific intergenic locus analyzed. Restricting the source range to reflect the most prominent zoonotic sources of E. coli transmission (i.e., bovine, chicken, human, and pig) allowed for the generation of informative biomarkers for all host categories, with specificities of at least 90.00% and sensitivities between 12.50% and 70.00%, using the sequence data from key intergenic regions, including emrKY-evgAS, ibsB-(mdtABCD-baeSR), ompC-rcsDB, and yedS-yedR, that appear to be involved in antibiotic resistance. Remarkably, we were able to use this approach to classify 48 out of 113 river water E. coli isolates collected in Northwestern Sweden as either beaver, human, or reindeer in origin with a high degree of consensus, thus highlighting the potential of logic regression modeling as a novel approach for augmenting current source tracking efforts. IMPORTANCE: The presence of microbial contaminants, particularly from fecal sources, within water poses a serious risk to public health. The health and economic burden of waterborne pathogens can be substantial; as such, the ability to detect and identify the sources of fecal contamination in environmental waters is crucial for the control of waterborne diseases. This can be accomplished through microbial source tracking, which involves the use of various laboratory techniques to trace the origins of microbial pollution in the environment. Building on current source tracking methodology, we describe a novel workflow that uses logic regression, a supervised machine learning method, to discover genetic markers in Escherichia coli, a common fecal indicator bacterium, that can be used for source tracking efforts. Importantly, our research provides an example of how the rise in prominence of machine learning algorithms can be applied to improve upon current microbial source tracking methodology.
Affiliation(s)
- Daniel Yu
  - School of Public Health, University of Alberta, Edmonton, Alberta, Canada
- Sharon Maes
  - Department of Natural Sciences, Design and Sustainable Development, Mid Sweden University, Östersund, Sweden
- Lili Andersson-Li
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Solna, Sweden
- Norman F. Neumann
  - School of Public Health, University of Alberta, Edmonton, Alberta, Canada
- Monica Odlare
  - Department of Natural Sciences, Design and Sustainable Development, Mid Sweden University, Östersund, Sweden
- Anders Jonsson
  - Department of Natural Sciences, Design and Sustainable Development, Mid Sweden University, Östersund, Sweden
8
van Teijlingen A, Edwards DC, Hu L, Lilienkampf A, Cockroft SL, Tuttle T. An active machine learning discovery platform for membrane-disrupting and pore-forming peptides. Phys Chem Chem Phys 2024; 26:17745-17752. [PMID: 38873737 PMCID: PMC11202314 DOI: 10.1039/d4cp01404a] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2024] [Accepted: 05/30/2024] [Indexed: 06/15/2024]
Abstract
Membrane-disrupting and pore-forming peptides (PFPs) play a substantial role in bionanotechnology and can determine the life and death of cells. The control of chemical and ion transport through cell membranes is essential to maintaining concentration gradients. Likewise, the delivery of drugs and intracellular proteins aided by pore-forming agents is of interest in treating malfunctioning cells. Known PFPs tend to be up to 50 residues in length, which is commensurate with the thickness of a lipid bilayer. Accordingly, few short PFPs are known. Here we show that the discovery of PFPs can be accelerated via an active machine learning approach. The approach identified 71 potential PFPs from the 25.6 billion octapeptide sequence space; 13 sequences were tested experimentally, and all were found to have the predicted membrane-disrupting ability, with 1 forming highly stable pores. Experimental verification of the predicted pore-forming ability demonstrated that a range of short peptides can form pores in membranes, while the positioning and characteristics of residues that favour pore-forming behaviour were identified. This approach identified more ultrashort (8-residue, unmodified, non-cyclic) PFPs than previously known. We anticipate our findings and methodology will be useful in discovering new pore-forming and membrane-disrupting peptides for a range of applications from nanoreactors to therapeutics.
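The 25.6 billion figure above is the number of octapeptides that can be built from the 20 standard amino acids, 20^8. The short sketch below checks that arithmetic and shows one way to map an integer index to a sequence so that the space can be sampled without enumerating it; the helper name and the indexing scheme are illustrative assumptions, not part of the published platform.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes
PEPTIDE_LENGTH = 8

SPACE_SIZE = len(AMINO_ACIDS) ** PEPTIDE_LENGTH  # 20**8


def index_to_peptide(index: int, length: int = PEPTIDE_LENGTH) -> str:
    """Decode an integer in [0, 20**length) as a base-20 peptide sequence."""
    if not 0 <= index < len(AMINO_ACIDS) ** length:
        raise ValueError("index outside the sequence space")
    residues = []
    for _ in range(length):
        index, remainder = divmod(index, len(AMINO_ACIDS))
        residues.append(AMINO_ACIDS[remainder])
    return "".join(reversed(residues))


print(SPACE_SIZE)
print(index_to_peptide(0), index_to_peptide(SPACE_SIZE - 1))
```

Because the space is far too large to evaluate exhaustively by experiment, an active learning loop of the kind the abstract describes only ever scores a small, iteratively chosen subset of these indices.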
Affiliation(s)
- Alexander van Teijlingen
  - Pure and Applied Chemistry, University of Strathclyde, 295 Cathedral Street, Glasgow, G1 1XL, UK
- Daniel C Edwards
  - EaStCHEM School of Chemistry, Joseph Black Building, University of Edinburgh, David Brewster Road, Edinburgh, EH9 3FJ, UK
- Liao Hu
  - EaStCHEM School of Chemistry, Joseph Black Building, University of Edinburgh, David Brewster Road, Edinburgh, EH9 3FJ, UK
- Annamaria Lilienkampf
  - EaStCHEM School of Chemistry, Joseph Black Building, University of Edinburgh, David Brewster Road, Edinburgh, EH9 3FJ, UK
- Scott L Cockroft
  - EaStCHEM School of Chemistry, Joseph Black Building, University of Edinburgh, David Brewster Road, Edinburgh, EH9 3FJ, UK
- Tell Tuttle
  - Pure and Applied Chemistry, University of Strathclyde, 295 Cathedral Street, Glasgow, G1 1XL, UK
9
Cordoves-Delgado G, García-Jacas CR. Predicting Antimicrobial Peptides Using ESMFold-Predicted Structures and ESM-2-Based Amino Acid Features with Graph Deep Learning. J Chem Inf Model 2024; 64:4310-4321. [PMID: 38739853 DOI: 10.1021/acs.jcim.3c02061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
Currently, antimicrobial resistance constitutes a serious threat to human health. Drugs based on antimicrobial peptides (AMPs) constitute one of the alternatives to address it. Shallow and deep learning (DL)-based models have mainly been built from amino acid sequences to predict AMPs. Recent advances in tertiary (3D) structure prediction have opened new opportunities in this field. In this sense, models based on graphs derived from predicted peptide structures have recently been proposed. However, these models do not use state-of-the-art approaches to encode evolutionary information and, in addition, they are memory- and time-consuming because they depend on multiple sequence alignment. Herein, we present a framework to create alignment-free models based on graph representations generated from ESMFold-predicted peptide structures, whose nodes are characterized with amino acid-level evolutionary information derived from the Evolutionary Scale Modeling (ESM-2) models. A graph attention network (GAT) was implemented to assess the usefulness of the framework in AMP classification. To this end, a set comprising 67,058 peptides was used. The proposed methodology made it possible to build GAT models with generalization abilities consistently better than those of 20 state-of-the-art non-DL-based and DL-based models. The best GAT models were developed using evolutionary information derived from the 36- and 33-layer ESM-2 models. Similarity studies showed that the best GAT models encoded different chemical spaces, and thus they were fused to significantly improve the classification. In general, the results suggest that esm-AxP-GDL is a promising tool to develop good, structure-dependent, and alignment-free models that can be successfully applied in the screening of large data sets. This framework should be useful not only to classify AMPs but also to model other peptide and protein activities.
Affiliation(s)
- Greneter Cordoves-Delgado
  - Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), 22860 Ensenada, Baja California, México
- César R García-Jacas
  - Cátedras CONAHCYT - Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), 22860 Ensenada, Baja California, México
10
Lobanov MY, Slizen MV, Dovidchenko NV, Panfilov AV, Surin AA, Likhachev IV, Galzitskaya OV. Comparison of deep learning models with simple method to assess the problem of antimicrobial peptides prediction. Mol Inform 2024; 43:e202200181. [PMID: 36961202 DOI: 10.1002/minf.202200181] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2022] [Revised: 03/20/2023] [Accepted: 03/23/2023] [Indexed: 03/25/2023]
Abstract
Antibiotic-resistant strains are an emerging threat to public health. The use of antimicrobial peptides (AMPs) is one of the promising approaches to solving this problem. Developing new AMPs requires reliable prediction methods. Recently, deep learning approaches have been used to predict AMPs. In this paper, we compare simple and complex methods for this purpose. We used the BERT transformer to create sequence embeddings and the multilayer perceptron (MLP) and light attention (LA) approaches for classification. One of them reached about 80% accuracy and specificity in benchmark testing, which is on par with the best available methods. For comparison, we proposed a simple method using only the amino acid composition of proteins or peptides. This method has shown good results, at the level of the best methods. We have prepared a dedicated server for predicting AMPs from amino acid composition: http://bioproteom.protres.ru/antimicrob/.
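The amino acid composition feature that the simple method above relies on is a 20-dimensional vector of residue frequencies. The sketch below shows one straightforward way to compute it; the function name and example peptide are hypothetical, and the classifier the authors built on top of this feature is not reproduced here.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def aa_composition(sequence: str) -> list:
    """Return the fraction of each standard amino acid in the sequence.

    Non-standard characters are ignored when computing the total.
    """
    counts = Counter(sequence.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Made-up peptide for illustration only.
features = aa_composition("KKLL")
print(dict(zip(AMINO_ACIDS, features)))
```

The vector discards residue order entirely, which is what makes it a useful baseline: any gain from sequence-aware embeddings has to show up as an improvement over this order-free representation.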
Affiliation(s)
- M Y Lobanov
  - Laboratory of Bioinformatics and Proteomics, Institute of Protein Research, Russian Academy of Sciences, 142290, Pushchino, Moscow Region, Russia
- M V Slizen
  - Laboratory of Bioinformatics and Proteomics, Institute of Protein Research, Russian Academy of Sciences, 142290, Pushchino, Moscow Region, Russia
- N V Dovidchenko
  - Laboratory of Bioinformatics and Proteomics, Institute of Protein Research, Russian Academy of Sciences, 142290, Pushchino, Moscow Region, Russia
- A V Panfilov
  - Laboratory of Bioinformatics and Proteomics, Institute of Protein Research, Russian Academy of Sciences, 142290, Pushchino, Moscow Region, Russia
- A A Surin
  - Faculty of Applied Math, MIREA - Russian Technological University, Moscow, 119454, Russia
- I V Likhachev
  - Laboratory of Bioinformatics and Proteomics, Institute of Protein Research, Russian Academy of Sciences, 142290, Pushchino, Moscow Region, Russia
  - Institute of Mathematical Problems of Biology branch of Keldysh Institute of Applied Mathematics, Russian Academy of Sciences, 142290, Pushchino, Russia
- O V Galzitskaya
  - Laboratory of Bioinformatics and Proteomics, Institute of Protein Research, Russian Academy of Sciences, 142290, Pushchino, Moscow Region, Russia
  - Laboratory of Structure and Function of Muscle Proteins, Institute of Theoretical and Experimental Biophysics, Russian Academy of Sciences, 142290, Pushchino, Moscow Region, Russia
11
|
Martínez‐Mauricio KL, García‐Jacas CR, Cordoves‐Delgado G. Examining evolutionary scale modeling-derived different-dimensional embeddings in the antimicrobial peptide classification through a KNIME workflow. Protein Sci 2024; 33:e4928. [PMID: 38501511 PMCID: PMC10949403 DOI: 10.1002/pro.4928] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2023] [Revised: 01/28/2024] [Accepted: 01/30/2024] [Indexed: 03/20/2024]
Abstract
Molecular features play an important role in different bio-chem-informatics tasks, such as Quantitative Structure-Activity Relationship (QSAR) modeling. Several pre-trained models have recently been created for use in downstream tasks, either by fine-tuning a specific model or by extracting features to feed traditional classifiers. In this regard, a new family of Evolutionary Scale Modeling models (termed ESM-2 models) was recently introduced, demonstrating outstanding results in protein structure prediction benchmarks. Herein, we studied the usefulness of the different-dimensional embeddings derived from the ESM-2 models to classify antimicrobial peptides (AMPs). To this end, we built a KNIME workflow to use the same modeling methodology across experiments in order to guarantee fair analyses. As a result, the 640- and 1280-dimensional embeddings derived from the 30- and 33-layer ESM-2 models, respectively, are the most valuable since statistically better performances were achieved by the QSAR models built from them. We also fused features of the different ESM-2 models and concluded that the fusion yields better QSAR models than using features of a single ESM-2 model. Frequency studies revealed that only a portion of the ESM-2 embeddings is valuable for modeling tasks since between 43% and 66% of the features were never used. Comparisons with state-of-the-art deep learning (DL) models confirm that when performing methodologically principled studies in the prediction of AMPs, non-DL based QSAR models yield comparable-to-superior performances to DL-based QSAR models. The developed KNIME workflow is freely available at https://github.com/cicese-biocom/classification-QSAR-bioKom. This workflow can be valuable to avoid unfair comparisons regarding new computational methods, as well as to propose new non-DL based QSAR models.
Affiliation(s)
- Karla L. Martínez‐Mauricio
- Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), Ensenada, Mexico
- César R. García‐Jacas
- Cátedras CONAHCYT – Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), Ensenada, Mexico
- Greneter Cordoves‐Delgado
- Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), Ensenada, Mexico
|
12
|
Bajiya N, Choudhury S, Dhall A, Raghava GPS. AntiBP3: A Method for Predicting Antibacterial Peptides against Gram-Positive/Negative/Variable Bacteria. Antibiotics (Basel) 2024; 13:168. [PMID: 38391554 PMCID: PMC10885866 DOI: 10.3390/antibiotics13020168] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2023] [Revised: 02/03/2024] [Accepted: 02/06/2024] [Indexed: 02/24/2024] Open
Abstract
Most existing methods for predicting antibacterial peptides (ABPs) are designed to target either gram-positive or gram-negative bacteria. In this study, we describe a method that allows us to predict ABPs against gram-positive, gram-negative, and gram-variable bacteria. Firstly, we developed an alignment-based approach using BLAST to identify ABPs, which achieved poor sensitivity. Secondly, we employed a motif-based approach to predict ABPs and obtained high precision with low sensitivity. To address the issue of poor sensitivity, we developed alignment-free methods for predicting ABPs using machine/deep learning techniques. For the alignment-free methods, we utilized a wide range of peptide features that include different types of composition, binary profiles of terminal residues, and fastText word embedding. A five-fold cross-validation technique was used to build machine/deep learning models on training datasets. These models were evaluated on an independent dataset with no common peptide between training and independent datasets. Our machine learning-based model developed using the amino acid binary profile of terminal residues achieved maximum AUC of 0.93, 0.98, and 0.94 for gram-positive, gram-negative, and gram-variable bacteria, respectively, on an independent dataset. Our method performs better than existing approaches on an independent dataset. A user-friendly web server, standalone package and pip package have been developed to facilitate peptide-based therapeutics.
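As a rough illustration of what a "binary profile of terminal residues" can look like, the sketch below one-hot encodes the first and last `n` residues of a peptide. It is an assumption-laden example, not the AntiBP3 code: the window size `n=5`, the function names, and the example peptide are all hypothetical.

```python
# The 20 standard amino acids in a fixed order (an arbitrary choice here).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(residue):
    """Encode one residue as a 20-element 0/1 vector (all zeros if non-standard)."""
    return [1 if residue == aa else 0 for aa in AMINO_ACIDS]


def terminal_binary_profile(sequence, n=5):
    """Concatenate one-hot vectors for the first n and last n residues.

    `n` is a hypothetical window size; the result has 2 * n * 20 entries.
    """
    seq = sequence.upper()
    if len(seq) < n:
        raise ValueError("sequence is shorter than the terminal window")
    window = seq[:n] + seq[-n:]  # N-terminal window followed by C-terminal window
    profile = []
    for residue in window:
        profile.extend(one_hot(residue))
    return profile


# Hypothetical example peptide, used only to exercise the function.
profile = terminal_binary_profile("GIGKFLHSAKKFGKAFVGEIMNS", n=5)
```

Because only the termini are encoded, peptides of different lengths map to vectors of the same size, which is one plausible reason such profiles pair well with conventional classifiers.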
Affiliation(s)
- Nisha Bajiya
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi 110020, India
- Shubham Choudhury
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi 110020, India
- Anjali Dhall
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi 110020, India
- Gajendra P S Raghava
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi 110020, India
|
13
|
Xu J, Li F, Li C, Guo X, Landersdorfer C, Shen HH, Peleg AY, Li J, Imoto S, Yao J, Akutsu T, Song J. iAMPCN: a deep-learning approach for identifying antimicrobial peptides and their functional activities. Brief Bioinform 2023; 24:bbad240. [PMID: 37369638 PMCID: PMC10359087 DOI: 10.1093/bib/bbad240] [Citation(s) in RCA: 37] [Impact Index Per Article: 18.5] [Received: 10/26/2022] [Revised: 05/30/2023] [Accepted: 06/08/2023] [Indexed: 06/29/2023] Open
Abstract
Antimicrobial peptides (AMPs) are short peptides that play crucial roles in diverse biological processes and have various functional activities against target organisms. Due to the abuse of chemical antibiotics and microbial pathogens' increasing resistance to antibiotics, AMPs have the potential to be alternatives to antibiotics. As such, the identification of AMPs has become a widely discussed topic. A variety of computational approaches have been developed to identify AMPs based on machine learning algorithms. However, most of them are not capable of predicting the functional activities of AMPs, and those predictors that can specify activities only focus on a few of them. In this study, we first surveyed 10 predictors that can identify AMPs and their functional activities in terms of the features they employed and the algorithms they utilized. Then, we constructed comprehensive AMP datasets and proposed a new deep learning-based framework, iAMPCN (identification of AMPs based on CNNs), to identify AMPs and their related 22 functional activities. Our experiments demonstrate that iAMPCN significantly improved the prediction performance of AMPs and their corresponding functional activities based on four types of sequence features. Benchmarking experiments on the independent test datasets showed that iAMPCN outperformed a number of state-of-the-art approaches for predicting AMPs and their functional activities. Furthermore, we analyzed the amino acid preferences of different AMP activities and evaluated the model on datasets of varying sequence redundancy thresholds. To facilitate the community-wide identification of AMPs and their corresponding functional types, we have made the source codes of iAMPCN publicly available at https://github.com/joy50706/iAMPCN/tree/master. We anticipate that iAMPCN can be explored as a valuable tool for identifying potential AMPs with specific functional activities for further experimental validation.
Affiliation(s)
- Jing Xu
- Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
- Monash Data Futures Institute, Monash University, Melbourne, VIC 3800, Australia
- Fuyi Li
- Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
- College of Information Engineering, Northwest A&F University, Shaanxi 712100, China
- The Peter Doherty Institute for Infection and Immunity, The University of Melbourne, Melbourne, VIC 3800, Australia
- Chen Li
- Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
- Monash Data Futures Institute, Monash University, Melbourne, VIC 3800, Australia
- Xudong Guo
- College of Information Engineering, Northwest A&F University, Shaanxi 712100, China
- Cornelia Landersdorfer
- Monash Institute of Pharmaceutical Sciences, Monash University, Melbourne, VIC 3800, Australia
- Hsin-Hui Shen
- Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
- Department of Materials Science and Engineering, Faculty of Engineering, Monash University, Clayton, VIC, 3800, Australia
- Anton Y Peleg
- Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
- Department of Infectious Diseases, Alfred Hospital, Alfred Health, Melbourne, Victoria, Australia
- Jian Li
- Monash Biomedicine Discovery Institute and Department of Microbiology, Monash University, Melbourne, VIC 3800, Australia
- Seiya Imoto
- Division of Health Medical Intelligence, Human Genome Center, Institute of Medical Science, The University of Tokyo, Minato-ku, Tokyo, Japan
- Collaborative Research Institute for Innovative Microbiology, The University of Tokyo, Bunkyo-ku, Tokyo, Japan
- Tatsuya Akutsu
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji 611-0011, Japan
- Jiangning Song
- Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
- Monash Data Futures Institute, Monash University, Melbourne, VIC 3800, Australia
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji 611-0011, Japan
|
14
|
Yang S, Yang Z, Ni X. AMPFinder: A computational model to identify antimicrobial peptides and their functions based on sequence-derived information. Anal Biochem 2023; 673:115196. [PMID: 37236434 DOI: 10.1016/j.ab.2023.115196] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 04/13/2023] [Revised: 05/22/2023] [Accepted: 05/23/2023] [Indexed: 05/28/2023]
Abstract
Antimicrobial peptides (AMPs), also called host defense peptides, exist in all classes of life, generally contain 5-100 amino acids, and can kill mycobacteria, enveloped viruses, bacteria, fungi, cancerous cells and so on. Because AMPs rarely induce drug resistance, they are promising agents for novel therapies. Therefore, it is urgent to identify AMPs and predict their functions in a high-throughput way. In this paper, we propose a cascaded computational model, called AMPFinder, to identify AMPs and their functional types based on sequence-derived and life language embedding. Compared with other state-of-the-art methods, AMPFinder obtains higher performance on both AMP identification and AMP function prediction. AMPFinder improves F1-score (1.45%-6.13%), MCC (2.92%-12.86%), AUC (5.13%-8.56%) and AP (9.20%-21.07%) on an independent test dataset. AMPFinder also achieves lower bias of R2 on a public dataset by 10-fold cross-validation, with an improvement of 18.82%-19.46%. The comparison with other state-of-the-art methods shows that AMPFinder can accurately identify AMPs and their function types. The datasets, source code and user-friendly application are available at https://github.com/abcair/AMPFinder.
Affiliation(s)
- Sen Yang
- The Affiliated Changzhou No 2 People's Hospital of Nanjing Medical University, Changzhou, 213164, China; School of Computer Science and Artificial Intelligence Aliyun School of Big Data, School of Software, Changzhou University, Changzhou, 213164, China
- Zexi Yang
- School of Computer Science and Artificial Intelligence Aliyun School of Big Data, School of Software, Changzhou University, Changzhou, 213164, China
- Xinye Ni
- The Affiliated Changzhou No 2 People's Hospital of Nanjing Medical University, Changzhou, 213164, China.
|
15
|
Zhou W, Liu Y, Li Y, Kong S, Wang W, Ding B, Han J, Mou C, Gao X, Liu J. TriNet: A tri-fusion neural network for the prediction of anticancer and antimicrobial peptides. Patterns (New York, N.Y.) 2023; 4:100702. [PMID: 36960450 PMCID: PMC10028424 DOI: 10.1016/j.patter.2023.100702] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2022] [Revised: 12/20/2022] [Accepted: 02/03/2023] [Indexed: 03/04/2023]
Abstract
The accurate identification of anticancer peptides (ACPs) and antimicrobial peptides (AMPs) remains a computational challenge. We propose a tri-fusion neural network termed TriNet for the accurate prediction of both ACPs and AMPs. The framework first defines three kinds of features to capture the peptide information contained in serial fingerprints, sequence evolutions, and physicochemical properties, which are then fed into three parallel modules: a convolutional neural network module enhanced by channel attention, a bidirectional long short-term memory module, and an encoder module for training and final classification. To achieve a better training effect, TriNet is trained via a training approach using iterative interactions between the samples in the training and validation datasets. TriNet is tested on multiple challenging ACP and AMP datasets and exhibits significant improvements over various state-of-the-art methods. The web server and source code of TriNet are respectively available at http://liulab.top/TriNet/server and https://github.com/wanyunzh/TriNet.
Affiliation(s)
- Wanyun Zhou
- SDU-ANU Joint Science College, Shandong University (Weihai), Weihai 264209, China
- Yufei Liu
- SDU-ANU Joint Science College, Shandong University (Weihai), Weihai 264209, China
- Yingxin Li
- School of Mechanical, Electrical & Information Engineering, Shandong University (Weihai), Weihai 264209, China
- Siqi Kong
- SDU-ANU Joint Science College, Shandong University (Weihai), Weihai 264209, China
- Weilin Wang
- SDU-ANU Joint Science College, Shandong University (Weihai), Weihai 264209, China
- Boyun Ding
- SDU-ANU Joint Science College, Shandong University (Weihai), Weihai 264209, China
- Jiyun Han
- School of Mathematics and Statistics, Shandong University (Weihai), Weihai 264209, China
- Chaozhou Mou
- School of Mathematics and Statistics, Shandong University (Weihai), Weihai 264209, China
- Xin Gao
- Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955, Saudi Arabia
- Juntao Liu
- School of Mathematics and Statistics, Shandong University (Weihai), Weihai 264209, China
|
16
|
You Y, Liu H, Zhu Y, Zheng H. Rational design of stapled antimicrobial peptides. Amino Acids 2023; 55:421-442. [PMID: 36781451 DOI: 10.1007/s00726-023-03245-w] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 09/24/2022] [Accepted: 01/30/2023] [Indexed: 02/15/2023]
Abstract
The global increase in antimicrobial drug resistance has dramatically reduced the effectiveness of traditional antibiotics. Structurally diverse antibiotics are urgently needed to combat multiple-resistant bacterial infections. As part of innate immunity, antimicrobial peptides have been recognized as the most promising candidates because they comprise diverse sequences and mechanisms of action and have a relatively low induction rate of resistance. However, because of their low chemical stability, susceptibility to proteases, and high hemolytic effect, their usage is subject to many restrictions. Chemical modifications such as D-amino acid substitution, cyclization, and unnatural amino acid modification have been used to improve the stability of antimicrobial peptides for decades. Among them, a side-chain covalent bridge modification, the so-called stapled peptide, has attracted much attention. The stapled side-chain bridge stabilizes the secondary structure, induces protease resistance, and increases cell penetration and biological activity. Recent progress in computer-aided drug design and artificial intelligence methods has also been used in the design of stapled antimicrobial peptides and has led to the successful discovery of many prospective peptides. This article reviews the possible structure-activity relationships of stapled antimicrobial peptides, the physicochemical properties that influence their activity (such as net charge, hydrophobicity, helicity, and dipole moment), and computer-aided methods of stapled peptide design. Antimicrobial peptides under clinical trial: Pexiganan (NCT01594762, 2012-05-07). Omiganan (NCT02576847, 2015-10-13).
Affiliation(s)
- YuHao You
- School of Life Science and Technology, China Pharmaceutical University, Nanjing, 210009, People's Republic of China
- HongYu Liu
- School of Life Science and Technology, China Pharmaceutical University, Nanjing, 210009, People's Republic of China
- YouZhuo Zhu
- School of Life Science and Technology, China Pharmaceutical University, Nanjing, 210009, People's Republic of China
- Heng Zheng
- School of Life Science and Technology, China Pharmaceutical University, Nanjing, 210009, People's Republic of China.
|
17
|
García-Jacas CR, García-González LA, Martinez-Rios F, Tapia-Contreras IP, Brizuela CA. Handcrafted versus non-handcrafted (self-supervised) features for the classification of antimicrobial peptides: complementary or redundant? Brief Bioinform 2022; 23:6754757. [PMID: 36215083 DOI: 10.1093/bib/bbac428] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/22/2022] [Revised: 08/28/2022] [Accepted: 09/02/2022] [Indexed: 12/14/2022] Open
Abstract
Antimicrobial peptides (AMPs) have received a great deal of attention given their potential to become a plausible option to fight multi-drug resistant bacteria as well as other pathogens. Quantitative sequence-activity models (QSAMs) have been helpful in discovering new AMPs because they make it possible to explore a large universe of peptide sequences and help reduce the number of wet lab experiments. A main aspect in the building of QSAMs based on shallow learning is to determine an optimal set of protein descriptors (features) required to discriminate between sequences with different antimicrobial activities. These features are generally handcrafted from peptide sequence datasets that are labeled with specific antimicrobial activities. However, recent developments have shown that unsupervised approaches can be used to determine features that outperform human-engineered (handcrafted) features. Thus, knowing which of these two approaches contributes to a better classification of AMPs is a fundamental question for designing more accurate models. Here, we present a systematic and rigorous study to compare both types of features. Experimental outcomes show that non-handcrafted features lead to better performance than handcrafted features. However, the experiments also show that performance improves when both types of features are merged. A relevance analysis reveals that non-handcrafted features have higher information content than handcrafted features, while an interaction-based importance analysis reveals that handcrafted features are more important. These findings suggest that there is complementarity between both types of features. Comparisons with state-of-the-art deep models show that shallow models yield better performances both when fed with non-handcrafted features alone and when fed with non-handcrafted and handcrafted features together.
Affiliation(s)
- César R García-Jacas
- Cátedras CONACYT - Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), 22860 Ensenada, Baja California, México
- Luis A García-González
- Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), 22860 Ensenada, Baja California, México
- Issac P Tapia-Contreras
- Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), 22860 Ensenada, Baja California, México
- Carlos A Brizuela
- Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), 22860 Ensenada, Baja California, México
|
18
|
Dong B, Li M, Jiang B, Gao B, Li D, Zhang T. Antimicrobial Peptides Prediction method based on sequence multidimensional feature embedding. Front Genet 2022; 13:1069558. [PMID: 36468005 PMCID: PMC9714691 DOI: 10.3389/fgene.2022.1069558] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/14/2022] [Accepted: 11/02/2022] [Indexed: 09/10/2024] Open
Abstract
Antimicrobial peptides (AMPs) are alkaline substances with efficient bactericidal activity produced in living organisms. As the best substitute for antibiotics, they have received increasing attention in scientific research and clinical application. AMPs can be produced by almost all organisms and are capable of killing a wide variety of pathogenic microorganisms. In addition to being antibacterial, natural AMPs have many other therapeutically important activities, such as wound healing, antioxidant and immunomodulatory effects. Discovering new AMPs with wet-lab experimental methods is expensive and difficult, and bioinformatics technology can effectively address this problem. Recently, some deep learning methods have been applied to the prediction of AMPs and achieved good results. To further improve the prediction accuracy of AMPs, this paper designs a new deep learning method based on sequence multidimensional representation. Sequence features are encoded and embedded and then fed into the model to identify AMPs, achieving high-precision classification of AMPs and non-AMPs with lengths of 10-200. The results show that our method improved accuracy by 1.05% compared to the most advanced model in independent data validation without decreasing other indicators.
Affiliation(s)
- Benzhi Dong
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- Mengna Li
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- Bei Jiang
- Tianjin Second People's Hospital, Tianjin Institute of Hepatology, Tianjin, China
- Bo Gao
- Department of Radiology, The Second Affiliated Hospital of Harbin Medical University, Harbin, China
- Dan Li
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- Tianjiao Zhang
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
|
19
|
Yan J, Cai J, Zhang B, Wang Y, Wong DF, Siu SWI. Recent Progress in the Discovery and Design of Antimicrobial Peptides Using Traditional Machine Learning and Deep Learning. Antibiotics (Basel) 2022; 11:1451. [PMID: 36290108 PMCID: PMC9598685 DOI: 10.3390/antibiotics11101451] [Citation(s) in RCA: 41] [Impact Index Per Article: 13.7] [Received: 09/20/2022] [Revised: 10/11/2022] [Accepted: 10/13/2022] [Indexed: 11/16/2022] Open
Abstract
Antimicrobial resistance has become a critical global health problem due to the abuse of conventional antibiotics and the rise of multi-drug-resistant microbes. Antimicrobial peptides (AMPs) are a group of natural peptides that show promise as next-generation antibiotics due to their low toxicity to the host, their broad spectrum of biological activity (including antibacterial, antifungal, antiviral, and anti-parasitic activities), and their great therapeutic potential, such as anticancer and anti-inflammatory effects. Most importantly, AMPs kill bacteria by damaging cell membranes using multiple mechanisms of action rather than targeting a single molecule or pathway, making it difficult for bacterial drug resistance to develop. However, experimental approaches used to discover and design new AMPs are very expensive and time-consuming. In recent years, there has been considerable interest in using in silico methods, including traditional machine learning (ML) and deep learning (DL) approaches, for drug discovery. While there are a few papers summarizing computational AMP prediction methods, none of them focused on DL methods. In this review, we aim to survey the latest AMP prediction methods based on DL approaches. First, the biological background of AMPs is introduced, then various feature encoding methods used to represent the features of peptide sequences are presented. We explain the most popular DL techniques and highlight the recent works based on them to classify AMPs and design novel peptide sequences. Finally, we discuss the limitations and challenges of AMP prediction.
Affiliation(s)
- Jielu Yan
- PAMI Research Group, Department of Computer and Information Science, University of Macau, Taipa, Macau, China
- Jianxiu Cai
- Faculty of Applied Sciences, Macao Polytechnic University, Macau, China
- Institute of Science and Environment, University of Saint Joseph, Estr. Marginal da Ilha Verde, Macau, China
- Bob Zhang
- PAMI Research Group, Department of Computer and Information Science, University of Macau, Taipa, Macau, China
- Yapeng Wang
- Faculty of Applied Sciences, Macao Polytechnic University, Macau, China
- Derek F. Wong
- NLP2CT Lab, Department of Computer and Information Science, University of Macau, Taipa, Macau, China
- Shirley W. I. Siu
- Institute of Science and Environment, University of Saint Joseph, Estr. Marginal da Ilha Verde, Macau, China
- School of Pharmaceutical Sciences, Universiti Sains Malaysia, Pulau Pinang 11800, Malaysia
|
20
|
Jiang Y, Luo J, Huang D, Liu Y, Li DD. Machine Learning Advances in Microbiology: A Review of Methods and Applications. Front Microbiol 2022; 13:925454. [PMID: 35711777 PMCID: PMC9196628 DOI: 10.3389/fmicb.2022.925454] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 04/21/2022] [Accepted: 05/09/2022] [Indexed: 12/18/2022] Open
Abstract
Microorganisms play an important role in natural material and elemental cycles. Many common and general biology research techniques rely on microorganisms. Machine learning has been gradually integrated with multiple fields of study. Machine learning, including deep learning, aims to use mathematical insights to optimize variational functions to aid microbiology using various types of available data to help humans organize and apply collective knowledge of various research objects in a systematic and scaled manner. Classification and prediction have become the main achievements in the development of microbial community research in the direction of computational biology. This review summarizes the application and development of machine learning and deep learning in the field of microbiology and shows and compares the advantages and disadvantages of different algorithm tools in four fields: microbiome and taxonomy, microbial ecology, pathogen and epidemiology, and drug discovery.
|
21
|
Sequeira AM, Lousa D, Rocha M. ProPythia: A Python package for protein classification based on machine and deep learning. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2021.07.102] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
|
22
|
García-Jacas CR, Pinacho-Castellanos SA, García-González LA, Brizuela CA. Do deep learning models make a difference in the identification of antimicrobial peptides? Brief Bioinform 2022; 23:6563422. [PMID: 35380616 DOI: 10.1093/bib/bbac094] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 12/23/2021] [Revised: 02/16/2022] [Accepted: 02/23/2022] [Indexed: 12/21/2022] Open
Abstract
In the last few decades, antimicrobial peptides (AMPs) have been explored as an alternative to classical antibiotics, which in turn motivated the development of machine learning models to predict antimicrobial activities in peptides. The first generation of these predictors consisted of what are now known as shallow learning-based models. These models require the computation and selection of molecular descriptors to characterize each peptide sequence and train the models. The second generation, known as deep learning-based models, which no longer requires the explicit computation and selection of those descriptors, started to be used in the prediction task of AMPs just four years ago. The superior performance claimed for deep models over shallow models has created a prevalent inertia toward using deep learning to identify AMPs. However, methodological flaws and/or modeling biases in the building of deep models do not support such superiority. Here, we analyze the main pitfalls that led to biased conclusions about the leading performance of deep models. We also analyze whether deep models truly help achieve better predictions than shallow models by performing fair studies on different state-of-the-art benchmarking datasets. The experiments reveal that deep models do not outperform shallow models in the classification of AMPs, and that both types of models codify similar chemical information since their predictions are highly similar. Thus, according to the currently available datasets, we conclude that deep learning may not be the most suitable approach for developing models to identify AMPs, mainly because shallow models achieve comparable-to-superior performances and are simpler (Ockham's razor principle). Even so, we suggest the use of deep learning only when its capabilities lead to performance gains significant enough to justify the additional computational cost.
Affiliation(s)
- César R García-Jacas
- Cátedras CONACYT - Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), 22860 Ensenada, Baja California, México
- Sergio A Pinacho-Castellanos
- Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), 22860 Ensenada, Baja California, México; Centro de Investigación y Desarrollo de Tecnología Digital (CITEDI), Instituto Politécnico Nacional (IPN), 22435 Tijuana, Baja California, México
- Luis A García-González
- Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), 22860 Ensenada, Baja California, México
- Carlos A Brizuela
- Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), 22860 Ensenada, Baja California, México
|
23
|
Prediction of Linear Cationic Antimicrobial Peptides Active against Gram-Negative and Gram-Positive Bacteria Based on Machine Learning Models. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12073631] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 11/16/2022]
Abstract
Antimicrobial peptides (AMPs) are considered promising alternatives to conventional antibiotics for overcoming the growing problem of antibiotic resistance. Computational prediction approaches are receiving increasing interest as a way to identify and design the best candidate AMPs prior to in vitro tests. In this study, we focused on linear cationic peptides with non-hemolytic activity, which were downloaded from the Database of Antimicrobial Activity and Structure of Peptides (DBAASP). Based on the MIC (minimum inhibitory concentration) values, we assigned a positive label to a peptide if it shows antimicrobial activity; otherwise, the peptide is labeled as negative. We considered peptides showing antimicrobial activity against Gram-negative and against Gram-positive bacteria separately, and created two datasets accordingly. Ten different physico-chemical properties of the peptides were calculated and used as features. Following data exploration and data preprocessing steps, a variety of classification algorithms were used with 100-fold Monte Carlo Cross-Validation to build models and to predict the antimicrobial activity of the peptides. Among the generated models, Random Forest gave the best performance metrics for both the Gram-negative dataset (Accuracy: 0.98, Recall: 0.99, Specificity: 0.97, Precision: 0.97, AUC: 0.99, F1: 0.98) and the Gram-positive dataset (Accuracy: 0.95, Recall: 0.95, Specificity: 0.95, Precision: 0.90, AUC: 0.97, F1: 0.92) after outlier elimination was applied. This prediction approach might be useful for evaluating the antibacterial potential of a candidate peptide sequence before moving to experimental studies.
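As a rough illustration of the Monte Carlo cross-validation mentioned in this abstract, the sketch below draws 100 independent random train/test splits and averages a score over them. The 20% test fraction, the toy labels, and the majority-class "classifier" are assumptions made for illustration only; they are not taken from the study, which used Random Forest and other algorithms.

```python
import random

def monte_carlo_splits(n_samples, n_repeats=100, test_fraction=0.2, seed=0):
    """Yield (train_idx, test_idx) pairs from repeated random splits.

    Unlike k-fold CV, each repeat is drawn independently, so a sample
    may land in the test set of several repeats or of none.
    """
    rng = random.Random(seed)
    n_test = max(1, round(n_samples * test_fraction))
    indices = list(range(n_samples))
    for _ in range(n_repeats):
        shuffled = indices[:]
        rng.shuffle(shuffled)
        yield sorted(shuffled[n_test:]), sorted(shuffled[:n_test])

def majority_class_accuracy(labels, train_idx, test_idx):
    """Toy stand-in for a real model: predict the majority training label."""
    train_labels = [labels[i] for i in train_idx]
    majority = max(set(train_labels), key=train_labels.count)
    return sum(labels[i] == majority for i in test_idx) / len(test_idx)

# Hypothetical data: 30 positive and 20 negative peptides.
labels = [1] * 30 + [0] * 20
splits = list(monte_carlo_splits(len(labels)))
scores = [majority_class_accuracy(labels, tr, te) for tr, te in splits]
mean_accuracy = sum(scores) / len(scores)
print(f"{len(splits)} splits, mean accuracy {mean_accuracy:.3f}")
```

Averaging over many random splits reduces the dependence of the reported metric on any single partition of the data.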
|
24
|
Zhang D, Wang S. A protein succinylation sites prediction method based on the hybrid architecture of LSTM network and CNN. J Bioinform Comput Biol 2022; 20:2250003. [PMID: 35191361 DOI: 10.1142/s0219720022500032] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/18/2022]
Abstract
The succinylation modification of protein participates in the regulation of a variety of cellular processes. Identification of modified substrates with precise sites is the basis for understanding the molecular mechanism and regulation of succinylation. In this work, we picked and chose five superior feature codes: CKSAAP, ACF, BLOSUM62, AAindex, and one-hot, according to their performance in the problem of succinylation sites prediction. Then, LSTM network and CNN were used to construct four models: LSTM-CNN, CNN-LSTM, LSTM, and CNN. The five selected features were, respectively, input into each of these four models for training to compare the four models. Based on the performance of each model, the optimal model among them was chosen to construct a hybrid model DeepSucc that was composed of five sub-modules for integrating heterogeneous information. Under the 10-fold cross-validation, the hybrid model DeepSucc achieves 86.26% accuracy, 84.94% specificity, 87.57% sensitivity, 0.9406 AUC, and 0.7254 MCC. When compared with other prediction tools using an independent test set, DeepSucc outperformed them in sensitivity and MCC. The datasets and source codes can be accessed at https://github.com/1835174863zd/DeepSucc.
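The metrics reported in this abstract (accuracy, specificity, sensitivity, MCC) can all be derived from a binary confusion matrix. The sketch below uses the standard definitions; the counts in the example are made up and do not come from the paper.

```python
import math

def binary_metrics(tp, tn, fp, fn):
    """Compute accuracy, sensitivity, specificity, and MCC from counts.

    Undefined ratios (zero denominators) are reported as 0.0, which is
    a convention chosen here, not a universal rule.
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall on positives
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # recall on negatives
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }

# Hypothetical confusion matrix for illustration only.
metrics = binary_metrics(tp=90, tn=80, fp=20, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

MCC uses all four cells of the confusion matrix, which is why it is often reported alongside accuracy for imbalanced site-prediction datasets.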
Affiliation(s)
- Die Zhang
- Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming 650504, P. R. China
- Shunfang Wang
- Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming 650504, P. R. China
|
25
|
Abstract
Antibiotic resistance constitutes a global threat and could lead to a future pandemic. One strategy is to develop a new generation of antimicrobials. Naturally occurring antimicrobial peptides (AMPs) are recognized templates and some are already in clinical use. To accelerate the discovery of new antibiotics, it is useful to predict novel AMPs from the sequenced genomes of various organisms. The antimicrobial peptide database (APD) provided the first empirical peptide prediction program. It also facilitated the testing of the first machine-learning algorithms. This chapter provides an overview of machine-learning predictions of AMPs. Most of the predictors, such as AntiBP, CAMP, and iAMPpred, involve a single-label prediction of antimicrobial activity. This type of prediction has been expanded to antifungal, antiviral, antibiofilm, anti-TB, hemolytic, and anti-inflammatory peptides. The multiple functional roles of AMPs annotated in the APD also enabled multi-label predictions (iAMP-2L, MLAMP, and AMAP), which include antibacterial, antiviral, antifungal, antiparasitic, antibiofilm, anticancer, anti-HIV, antimalarial, insecticidal, antioxidant, chemotactic, spermicidal activities, and protease inhibiting activities. Also considered in predictions are peptide posttranslational modification, 3D structure, and microbial species-specific information. We compare important amino acids of AMPs implied from machine learning with the frequently occurring residues of the major classes of natural peptides. Finally, we discuss advances, limitations, and future directions of machine-learning predictions of antimicrobial peptides. Ultimately, we may assemble a pipeline of such predictions beyond antimicrobial activity to accelerate the discovery of novel AMP-based antimicrobials.
Affiliation(s)
- Guangshun Wang
- Department of Pathology and Microbiology, College of Medicine, University of Nebraska Medical Center, 985900 Nebraska Medical Center, Omaha, NE 68198-5900, USA. Corresponding authors: Dr. Monique van Hoek, Dr. Iosif Vaisman, Dr. Guangshun Wang.
- Iosif I. Vaisman
- School of Systems Biology, George Mason University, 10920 George Mason Circle, Manassas, VA, 20110, USA
- Monique L. van Hoek
- School of Systems Biology, George Mason University, 10920 George Mason Circle, Manassas, VA, 20110, USA
|
26
|
Dong S, Wang S. Assembled graph neural network using graph transformer with edges for protein model quality assessment. J Mol Graph Model 2021; 110:108053. [PMID: 34773871 DOI: 10.1016/j.jmgm.2021.108053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2021] [Revised: 10/13/2021] [Accepted: 10/13/2021] [Indexed: 10/19/2022]
Abstract
Knowing a protein's structure is vital to accurately understanding its function. Deep learning methods have made great progress in predicting protein structure from sequence and have the potential to help structural biology research. These computational methods usually require independent protein structure model quality assessment to select the best model from the model pool or to guide protein structure refinement. We construct a graph neural network assembled from a Graph Transformer Feature Extractor and message-passing layers for protein model quality assessment. The graph-based method can embody the protein structure more naturally than a sequence or voxelized representation. Although the widely used graph convolutional network has a strong ability to learn spatial patterns, it does not weigh the dependencies of different nodes on other nodes. We therefore introduce a Graph Transformer to capture the different degrees to which neighboring residue nodes contribute to their local environments and to extract local features. This is followed by message-passing layers that transmit and receive local information. Our network makes better use of edge information and is lightweight, since it uses relatively few input features and network layers; experimental results demonstrate that our model outperforms various existing methods. Core code is made freely available at: https://github.com/Crystal-Dsq/proteinqa.
Affiliation(s)
- Shiqi Dong
- Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming, 650504, China
- Shunfang Wang
- Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming, 650504, China
|
27
|
Bin Hafeez A, Jiang X, Bergen PJ, Zhu Y. Antimicrobial Peptides: An Update on Classifications and Databases. Int J Mol Sci 2021; 22:11691. [PMID: 34769122 PMCID: PMC8583803 DOI: 10.3390/ijms222111691] [Citation(s) in RCA: 154] [Impact Index Per Article: 38.5] [Received: 09/22/2021] [Revised: 10/24/2021] [Accepted: 10/25/2021] [Indexed: 02/06/2023] Open
Abstract
Antimicrobial peptides (AMPs) are distributed across all kingdoms of life and are an indispensable component of host defenses. They consist of predominantly short cationic peptides with a wide variety of structures and targets. Given the ever-emerging resistance of various pathogens to existing antimicrobial therapies, AMPs have recently attracted extensive interest as potential therapeutic agents. As the discovery of new AMPs has increased, many databases specializing in AMPs have been developed to collect both fundamental and pharmacological information. In this review, we summarize the sources, structures, modes of action, and classifications of AMPs. Additionally, we examine current AMP databases, compare valuable computational tools used to predict antimicrobial activity and mechanisms of action, and highlight new machine learning approaches that can be employed to improve AMP activity to combat global antimicrobial resistance.
Affiliation(s)
- Ahmer Bin Hafeez
- Centre of Biotechnology and Microbiology, University of Peshawar, Peshawar 25120, Pakistan
- Xukai Jiang
- Infection and Immunity Program, Department of Microbiology, Biomedicine Discovery Institute, Monash University, Clayton, VIC 3800, Australia
- National Glycoengineering Research Center, Shandong University, Qingdao 266237, China
- Phillip J. Bergen
- Infection and Immunity Program, Department of Microbiology, Biomedicine Discovery Institute, Monash University, Clayton, VIC 3800, Australia
- Yan Zhu
- Infection and Immunity Program, Department of Microbiology, Biomedicine Discovery Institute, Monash University, Clayton, VIC 3800, Australia
|
28
|
Yadav V, Misra R. A review emphasizing on utility of heptad repeat sequence as a tool to design pharmacologically safe peptide-based antibiotics. Biochimie 2021; 191:126-139. [PMID: 34492334 DOI: 10.1016/j.biochi.2021.09.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/19/2021] [Revised: 08/25/2021] [Accepted: 09/03/2021] [Indexed: 12/31/2022]
Abstract
Extensive usage of antibiotics has created an unprecedented scenario of the rapid emergence of many drug-resistant bacteria, which has become an alarming public health concern around the globe. The search for better alternatives that are as efficacious as antibiotics led to the discovery of antimicrobial peptides (AMPs). These small cationic amphiphilic peptides have emerged as a promising option as antimicrobial agents, owing to their multifaceted implications against varied pathogens. Recent years have witnessed tremendous growth in research on AMPs, resulting in their being tested in clinical trials, six of which were approved for topical application. The relatively less successful outcome has been attributed to the poor cell selectivity shown by most naturally occurring AMPs. This drawback needs to be circumvented by identifying strategies to design safe and effective peptides. In the present review, we emphasize the importance of the heptad repeat sequence (leucine and/or phenylalanine zipper motif) as a tool that has shown great promise in remodeling toxic AMPs into safe antimicrobial agents.
Affiliation(s)
- Vikas Yadav
- Department of Translational Medicine, Clinical Research Centre, Skåne University Hospital, Lund University, Malmö, Sweden; Interdisciplinary Cluster for Applied Genoproteomics (GIGA), University of Liège (ULiège), Liège, Belgium.
- Richa Misra
- Department of Zoology, Sri Venkateswara College, University of Delhi, Delhi, India
|
29
|
iEnhancer-RD: Identification of enhancers and their strength using RKPK features and deep neural networks. Anal Biochem 2021; 630:114318. [PMID: 34364858 DOI: 10.1016/j.ab.2021.114318] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 01/20/2021] [Revised: 07/02/2021] [Accepted: 07/27/2021] [Indexed: 11/20/2022]
Abstract
Enhancers are regulatory elements involved in gene expression. They are DNA segments that can enhance the transcription rate of a gene. However, identifying enhancers by biological experimental methods is time-consuming and expensive. Therefore, there is an urgent need for more efficient methods to identify them. In this study, we propose a new feature extraction method, RKPK, which combines three feature methods and uses the recursive feature elimination algorithm for feature selection, and we apply a deep neural network as the classifier to construct the iEnhancer-RD computational method for enhancer identification. It is a two-layer classification architecture in which the first layer (layer I) identifies enhancers from a set of DNA sequences, and the second layer (layer II) divides the identified enhancers into two subgroups, namely strong and weak enhancers. An independent dataset test indicates that the proposed method is significantly better than most existing methods, attaining accuracies of 78.8% and 70.5% in the two layers, respectively. Our iEnhancer-RD architecture is implemented in Python and is available at https://github.com/YangHuan639/iEnhancer-RD.
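The two-layer architecture described in this abstract is a cascade: layer II is consulted only for sequences that layer I accepts. The sketch below shows that control flow only. The function names and the GC-content threshold rules are hypothetical placeholders, not the RKPK features or the deep neural networks used by iEnhancer-RD.

```python
from typing import Callable

def cascade_classify(
    sequence: str,
    is_enhancer: Callable[[str], bool],
    is_strong: Callable[[str], bool],
) -> str:
    """Run layer I first; only sequences it accepts reach layer II."""
    if not is_enhancer(sequence):
        return "non-enhancer"
    return "strong enhancer" if is_strong(sequence) else "weak enhancer"

def gc_fraction(sequence: str) -> float:
    """Fraction of G and C bases in a DNA string (0.0 for an empty string)."""
    if not sequence:
        return 0.0
    seq = sequence.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

# Hypothetical placeholder rules standing in for trained classifiers.
def toy_layer_one(sequence: str) -> bool:
    return gc_fraction(sequence) >= 0.4

def toy_layer_two(sequence: str) -> bool:
    return gc_fraction(sequence) >= 0.6

for dna in ("ATATATATAT", "ATGCATGCAT", "GCGCGCGCAT"):
    print(dna, "->", cascade_classify(dna, toy_layer_one, toy_layer_two))
```

Because errors in layer I propagate to layer II, the accuracy of each layer is usually reported separately, as the abstract does.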
|
30
|
Singh O, Hsu WL, Su ECY. Co-AMPpred for in silico-aided predictions of antimicrobial peptides by integrating composition-based features. BMC Bioinformatics 2021; 22:389. [PMID: 34330209 PMCID: PMC8325260 DOI: 10.1186/s12859-021-04305-2] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 04/14/2021] [Accepted: 07/21/2021] [Indexed: 12/24/2022] Open
Abstract
Background: Antimicrobial peptides (AMPs) are oligopeptides that act as crucial components of innate immunity, occur naturally in all multicellular organisms, and are involved in the first line of defense. Recent studies have shown that AMPs hold great potential that is not limited to antimicrobial activity. They are also crucial regulators of host immune responses that can modulate a wide range of activities, such as immune regulation, wound healing, and apoptosis. However, the ability of microorganisms to adapt to and resist existing antibiotics has prompted the scientific community to develop alternatives to conventional antibiotics. To address this issue, we proposed Co-AMPpred, an in silico-aided AMP prediction method based on compositional features of amino acid residues to classify AMPs and non-AMPs. Results: In our study, we developed a prediction method that incorporates composition-based sequence and physicochemical features into various machine-learning algorithms. The Boruta feature-selection algorithm was then used to identify discriminative biological features, and only these features were used to develop our model. Additionally, we performed stratified tenfold cross-validation to validate the predictive performance of our AMP prediction model and evaluated it on an independent holdout test dataset. A benchmark dataset was collected from previous studies to evaluate the predictive performance of our model. Conclusions: Experimental results show that combining composition-based and physicochemical features outperformed existing methods on both the benchmark training dataset and a reduced training dataset. Finally, our proposed method achieved an accuracy of 80.8% and an area under the receiver operating characteristic curve of 0.871 when evaluated on the independent test set. Our code and datasets are available at https://github.com/onkarS23/CoAMPpred.
Supplementary Information The online version contains supplementary material available at 10.1186/s12859-021-04305-2.
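The stratified tenfold cross-validation mentioned in this abstract keeps the class ratio roughly constant across folds. A minimal standard-library sketch follows; the 40/60 label mix is made up for illustration, and the round-robin dealing scheme is one simple way to stratify, not necessarily the one used by the authors.

```python
import random
from collections import defaultdict

def stratified_kfold(labels, k=10, seed=0):
    """Yield (train_idx, test_idx) for k folds with per-class balance.

    Indices of each class are shuffled, then dealt round-robin into
    the k folds so every fold receives a similar share of each class.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    for class_indices in by_class.values():
        rng.shuffle(class_indices)
        for position, index in enumerate(class_indices):
            folds[position % k].append(index)
    for held_out in range(k):
        test_idx = sorted(folds[held_out])
        train_idx = sorted(
            i for f in range(k) if f != held_out for i in folds[f]
        )
        yield train_idx, test_idx

# Hypothetical dataset: 40 AMPs (label 1) and 60 non-AMPs (label 0).
labels = [1] * 40 + [0] * 60
folds = list(stratified_kfold(labels, k=10))
for train_idx, test_idx in folds:
    positives = sum(labels[i] for i in test_idx)
    print(f"test size {len(test_idx)}, positives {positives}")
```

Each sample appears in exactly one test fold, which distinguishes this scheme from repeated random (Monte Carlo) splitting.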
Affiliation(s)
- Onkar Singh
- Bioinformatics Program, Taiwan International Graduate Program, Institute of Information Science, Academia Sinica, Taipei, Taiwan; Institute of Biomedical Informatics, National Yang Ming Chiao Tung University, Taipei, Taiwan; Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, 250 Wu-Xing Street, Taipei, 11031, Taiwan
- Wen-Lian Hsu
- Bioinformatics Program, Taiwan International Graduate Program, Institute of Information Science, Academia Sinica, Taipei, Taiwan; Institute of Biomedical Informatics, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Emily Chia-Yu Su
- Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, 250 Wu-Xing Street, Taipei, 11031, Taiwan; Clinical Big Data Research Center, Taipei Medical University Hospital, Taipei, Taiwan
|