1
Wei Z, Shen Y, Tang X, Wen J, Song Y, Wei M, Cheng J, Zhu X. AVPpred-BWR: antiviral peptides prediction via biological words representation. Bioinformatics 2025; 41:btaf126. [PMID: 40152250 PMCID: PMC11968319 DOI: 10.1093/bioinformatics/btaf126] [Received: 09/11/2024] [Revised: 02/17/2025] [Accepted: 03/26/2025] [Indexed: 03/29/2025] Open
Abstract
MOTIVATION: Antiviral peptides (AVPs) are short chains of amino acids that show great potential as antiviral drugs. Traditional identification methods (e.g. wet-lab experiments) are time-consuming and laborious, while current computational methods predict AVPs with limited accuracy.
RESULTS: In this article, we propose an AVP prediction model based on biological words representation, dubbed AVPpred-BWR. Because the secondary structures of AVPs consist mainly of α-helices and loops, we explore the biological words of 1mer (corresponding to loops) and 4mer (4 continuous residues, corresponding to α-helix). That is, peptide sequences are decomposed into biological words, and the concealed sequential information is then represented by training Word2Vec models. Moreover, to extract multi-scale features, we leverage a CNN-Transformer framework to process the 1mer and 4mer embeddings generated by the Word2Vec models. To the best of our knowledge, this is the first word segmentation of protein primary structure sequences based on the regularity of protein secondary structure. AVPpred-BWR shows clear improvements over its competitors on the independent test set (e.g. improvements of 4.6% and 11.0% in AUROC and MCC, respectively, compared to UniDL4BioPep).
AVAILABILITY AND IMPLEMENTATION: AVPpred-BWR is publicly available at https://github.com/zyweizm/AVPpred-BWR or https://zenodo.org/records/14880447 (doi: 10.5281/zenodo.14880447).
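The decomposition of a peptide into 1mer and 4mer "biological words" described in this abstract can be illustrated with a minimal sketch. It assumes overlapping windows with a stride of one residue, which the abstract does not state; the function name `kmer_words` and the example sequence are hypothetical and are not taken from the AVPpred-BWR repository.

```python
from typing import List


def kmer_words(sequence: str, k: int) -> List[str]:
    """Split a peptide into overlapping k-mers.

    Assumption: a sliding window with stride 1. The abstract only says
    that sequences are decomposed into 1mer and 4mer words.
    """
    if k < 1:
        raise ValueError("k must be >= 1")
    seq = sequence.strip().upper()
    # A sequence shorter than k yields no words.
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


# Hypothetical example peptide, used only for illustration.
peptide = "GLFDIVKKV"
one_mers = kmer_words(peptide, 1)   # single residues
four_mers = kmer_words(peptide, 4)  # 4 continuous residues
print(one_mers)
print(four_mers)
```

Each list of words could then serve as one "sentence" when training an embedding model such as Word2Vec; that training step is left out here to keep the sketch dependency-free.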
Affiliation(s)
- Zhuoyu Wei
  - School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui 230036, China
- Yongqi Shen
  - School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui 230036, China
- Xiang Tang
  - School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui 230036, China
- Jian Wen
  - School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui 230036, China
- Youyi Song
  - School of Science, China Pharmaceutical University, Nanjing 210009, China
- Mingqiang Wei
  - School of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China
- Jing Cheng
  - School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui 230036, China
- Xiaolei Zhu
  - School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui 230036, China
2
Nawaz M, Huiyuan Y, Akhtar F, Tianyue M, Zheng H. Deep learning in the discovery of antiviral peptides and peptidomimetics: databases and prediction tools. Mol Divers 2025:10.1007/s11030-025-11173-y. [PMID: 40153158 DOI: 10.1007/s11030-025-11173-y] [Received: 01/01/2025] [Accepted: 03/18/2025] [Indexed: 03/30/2025]
Abstract
Antiviral peptides (AVPs) represent a novel and promising therapeutic alternative to conventional antiviral treatments, due to their broad-spectrum activity, high specificity, and low toxicity. The emergence of zoonotic viruses such as Zika, Ebola, and SARS-CoV-2 has accelerated AVP research, driven by advancements in data availability and artificial intelligence (AI). This review focuses on the development of AVP databases, the physicochemical properties of AVPs, and predictive tools utilizing machine learning for AVP discovery. Machine learning plays a pivotal role in advancing and developing antiviral peptides and peptidomimetics, particularly through the development of specialized databases such as DRAVP, AVPdb, and DBAASP. These resources facilitate AVP characterization but face limitations, including small datasets, incomplete annotations, and inadequate integration with multi-omics data. The antiviral efficacy of AVPs is closely linked to their physicochemical properties, such as hydrophobicity and amphipathic α-helical structures, which enable viral membrane disruption and specific target interactions. Computational prediction tools employing machine learning and deep learning have significantly advanced AVP discovery. However, challenges like overfitting, limited experimental validation, and a lack of mechanistic insights hinder clinical translation. Future advancements should focus on improved validation frameworks, integration of in vivo data, and the development of interpretable models to elucidate AVP mechanisms. Expanding predictive models to address multi-target interactions and incorporating complex biological environments will be crucial for translating AVPs into effective clinical therapies.
Affiliation(s)
- Maryam Nawaz
  - School of Life Science and Technology, China Pharmaceutical University, Nanjing, 211100, People's Republic of China
- Yao Huiyuan
  - School of Life Science and Technology, China Pharmaceutical University, Nanjing, 211100, People's Republic of China
- Fahad Akhtar
  - School of Life Science and Technology, China Pharmaceutical University, Nanjing, 211100, People's Republic of China
- Ma Tianyue
  - School of Life Science and Technology, China Pharmaceutical University, Nanjing, 211100, People's Republic of China
- Heng Zheng
  - School of Life Science and Technology, China Pharmaceutical University, Nanjing, 211100, People's Republic of China
3
Dong R, Liu R, Liu Z, Liu Y, Zhao G, Li H, Hou S, Ma X, Kang H, Liu J, Guo F, Zhao P, Wang J, Wang C, Wu X, Ye S, Zhu C. Exploring the repository of de novo-designed bifunctional antimicrobial peptides through deep learning. eLife 2025; 13:RP97330. [PMID: 40079572 PMCID: PMC11906162 DOI: 10.7554/elife.97330] [Indexed: 03/15/2025] Open
Abstract
Antimicrobial peptides (AMPs) are attractive candidates to combat antibiotic resistance for their capability to target biomembranes and restrict a wide range of pathogens. It is a daunting challenge to discover novel AMPs due to their sparse distributions in a vast peptide universe, especially for peptides that demonstrate potencies for both bacterial membranes and viral envelopes. Here, we establish a de novo AMP design framework by bridging a deep generative module and a graph-encoding activity regressor. The generative module learns hidden 'grammars' of AMP features and produces candidates that sequentially pass an antimicrobial predictor and antiviral classifiers. We discovered 16 bifunctional AMPs and experimentally validated their abilities to inhibit a spectrum of pathogens in vitro and in animal models. Notably, P076 is a highly potent bactericide with a minimal inhibitory concentration of 0.21 μM against multidrug-resistant Acinetobacter baumannii, while P002 broadly inhibits five enveloped viruses. Our study provides feasible means to uncover the sequences that simultaneously encode antimicrobial and antiviral activities, thus bolstering the function spectra of AMPs to combat a wide range of drug-resistant infections.
Affiliation(s)
- Ruihan Dong
  - Frontiers Science Center for Synthetic Biology (Ministry of Education), Tianjin Key Laboratory of Function and Application of Biological Macromolecular Structures, School of Life Sciences, Faculty of Medicine, Tianjin University, Tianjin, China
  - Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
- Rongrong Liu
  - Department of Microbiology, School of Basic Medicine, Fourth Military Medical University, Shaanxi, China
- Ziyu Liu
  - Department of Microbiology, School of Basic Medicine, Fourth Military Medical University, Shaanxi, China
- Yangang Liu
  - Department of Microbiology, Second Military Medical University, Shanghai, China
- Gaomei Zhao
  - State Key Laboratory of Trauma and Chemical Poisoning, Institute of Combined Injury of PLA, College of Preventive Medicine, Third Military Medical University (Army Medical University), Chongqing, China
- Honglei Li
  - Tianjin Cancer Hospital Airport Hospital, Tianjin, China
- Shiyuan Hou
  - Department of Microbiology, School of Basic Medicine, Fourth Military Medical University, Shaanxi, China
- Xiaohan Ma
  - Department of Microbiology, School of Basic Medicine, Fourth Military Medical University, Shaanxi, China
- Huarui Kang
  - Department of Microbiology, School of Basic Medicine, Fourth Military Medical University, Shaanxi, China
- Jing Liu
  - Department of Microbiology, School of Basic Medicine, Fourth Military Medical University, Shaanxi, China
- Fei Guo
  - School of Computer Science and Engineering, Central South University, Changsha, China
- Ping Zhao
  - Department of Microbiology, Second Military Medical University, Shanghai, China
- Junping Wang
  - State Key Laboratory of Trauma and Chemical Poisoning, Institute of Combined Injury of PLA, College of Preventive Medicine, Third Military Medical University (Army Medical University), Chongqing, China
- Cheng Wang
  - State Key Laboratory of Trauma and Chemical Poisoning, Institute of Combined Injury of PLA, College of Preventive Medicine, Third Military Medical University (Army Medical University), Chongqing, China
- Xingan Wu
  - Department of Microbiology, School of Basic Medicine, Fourth Military Medical University, Shaanxi, China
- Sheng Ye
  - Frontiers Science Center for Synthetic Biology (Ministry of Education), Tianjin Key Laboratory of Function and Application of Biological Macromolecular Structures, School of Life Sciences, Faculty of Medicine, Tianjin University, Tianjin, China
- Cheng Zhu
  - Frontiers Science Center for Synthetic Biology (Ministry of Education), Tianjin Key Laboratory of Function and Application of Biological Macromolecular Structures, School of Life Sciences, Faculty of Medicine, Tianjin University, Tianjin, China
4
Koul M, Kaushik S, Singh K, Sharma D. VITALdb: to select the best viroinformatics tools for a desired virus or application. Brief Bioinform 2025; 26:bbaf084. [PMID: 40063348 PMCID: PMC11892104 DOI: 10.1093/bib/bbaf084] [Received: 10/14/2024] [Revised: 01/14/2025] [Accepted: 02/17/2025] [Indexed: 05/13/2025] Open
Abstract
The recent pandemics of viral diseases, COVID-19/mpox (humans) and lumpy skin disease (cattle), have kept us glued to viral research. These pandemics, along with the recent human metapneumovirus outbreak, have exposed the urgency for early diagnosis of viral infections, vaccine development, and discovery of novel antiviral drugs and therapeutics. To support this, an armamentarium of virus-specific computational tools is currently available. VITALdb (VIroinformatics Tools and ALgorithms database) is a resource of ~360 viroinformatics tools encompassing all major viruses (SARS-CoV-2, influenza virus, human immunodeficiency virus, papillomavirus, herpes simplex virus, hepatitis virus, dengue virus, Ebola virus, Zika virus, etc.) and several diverse applications [structural and functional annotation, antiviral peptides development, subspecies characterization, recognition of viral recombination, inhibitors identification, phylogenetic analysis, virus-host prediction, viral metagenomics, detection of mutation(s), primer designing, etc.]. Resources, tools, and other utilities mentioned in this article will not only facilitate further developments in the realm of viroinformatics but also provide a tremendous fillip to translating fundamental knowledge into applied research. Most importantly, VITALdb is an indispensable tool for selecting the best tool(s) to carry out a desired task and hence will prove to be a vital database (VITALdb) for the scientific community. Database URL: https://compbio.iitr.ac.in/vitaldb.
Affiliation(s)
- Mira Koul
  - Computational Biology and Translational Bioinformatics (CBTB) Laboratory, Department of Biosciences and Bioengineering, Indian Institute of Technology Roorkee, Roorkee 247667, Uttarakhand, India
- Shalini Kaushik
  - Computational Biology and Translational Bioinformatics (CBTB) Laboratory, Department of Biosciences and Bioengineering, Indian Institute of Technology Roorkee, Roorkee 247667, Uttarakhand, India
- Kavya Singh
  - Computational Biology and Translational Bioinformatics (CBTB) Laboratory, Department of Biosciences and Bioengineering, Indian Institute of Technology Roorkee, Roorkee 247667, Uttarakhand, India
- Deepak Sharma
  - Computational Biology and Translational Bioinformatics (CBTB) Laboratory, Department of Biosciences and Bioengineering, Indian Institute of Technology Roorkee, Roorkee 247667, Uttarakhand, India
5
Barroso RA, Agüero-Chapin G, Sousa R, Marrero-Ponce Y, Antunes A. Unlocking Antimicrobial Peptides: In Silico Proteolysis and Artificial Intelligence-Driven Discovery from Cnidarian Omics. Molecules 2025; 30:550. [PMID: 39942653 PMCID: PMC11820242 DOI: 10.3390/molecules30030550] [Received: 12/07/2024] [Revised: 01/20/2025] [Accepted: 01/21/2025] [Indexed: 02/16/2025] Open
Abstract
The growing challenge of antimicrobial resistance (AMR), which affects millions of people worldwide, has drawn attention to marine-derived antimicrobial peptides (AMPs) as a source of innovative solutions. Cnidarians, such as corals, sea anemones, and jellyfish, are a promising resource of these bioactive peptides due to their robust innate immune systems, yet they are still poorly explored. Hence, we employed an in silico proteolysis strategy to search for novel AMPs from omics data of 111 Cnidaria species. Millions of peptides were retrieved and screened using shallow- and deep-learning models, prioritizing AMPs with reduced toxicity and structural distinctiveness from characterized AMPs. After complex network analysis, a final dataset of 3130 singular non-haemolytic and non-toxic Cnidaria AMPs was identified. These unique AMPs were mined for their putative antibacterial activity, revealing 20 favourable candidates for in vitro testing against important ESKAPEE pathogens, offering potential new avenues for antibiotic development.
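As an illustration of what an in silico proteolysis step can look like, the sketch below cuts a protein sequence with the commonly used trypsin rule (cleave after K or R unless the next residue is P) and keeps fragments within a length window. The choice of trypsin, the length bounds, the function name `tryptic_digest`, and the example sequence are all assumptions made for illustration; the abstract does not say which proteases or filters the authors used.

```python
import re
from typing import List


def tryptic_digest(protein: str, min_len: int = 5, max_len: int = 30) -> List[str]:
    """Digest a protein in silico with a simple trypsin rule.

    Assumed rule: cleave C-terminal to K or R, except when followed by P.
    min_len and max_len are hypothetical bounds on peptide length.
    """
    seq = protein.strip().upper()
    # Zero-width split points: just after K/R and not directly before P.
    fragments = re.split(r"(?<=[KR])(?!P)", seq)
    # The length filter also drops the empty string produced when the
    # sequence itself ends in K or R.
    return [f for f in fragments if min_len <= len(f) <= max_len]


# Hypothetical protein fragment, for illustration only.
print(tryptic_digest("MKWVTFISLLRPGAKDE", min_len=1))
```

In a screening pipeline of the kind the abstract describes, the resulting peptides would then be passed to the downstream predictive models.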
Affiliation(s)
- Ricardo Alexandre Barroso
  - Interdisciplinary Centre of Marine and Environmental Research (CIIMAR/CIMAR), University of Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos s/n, 4450-208 Porto, Portugal
  - Department of Biology, Faculty of Sciences of University of Porto (FCUP), Rua do Campo Alegre s/n, 4169-007 Porto, Portugal
- Guillermin Agüero-Chapin
  - Interdisciplinary Centre of Marine and Environmental Research (CIIMAR/CIMAR), University of Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos s/n, 4450-208 Porto, Portugal
  - Department of Biology, Faculty of Sciences of University of Porto (FCUP), Rua do Campo Alegre s/n, 4169-007 Porto, Portugal
- Rita Sousa
  - Interdisciplinary Centre of Marine and Environmental Research (CIIMAR/CIMAR), University of Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos s/n, 4450-208 Porto, Portugal
  - Department of Biology, Faculty of Sciences of University of Porto (FCUP), Rua do Campo Alegre s/n, 4169-007 Porto, Portugal
- Yovani Marrero-Ponce
  - Facultad de Ingeniería, Universidad Panamericana, Augusto Rodin No. 498, Insurgentes Mixcoac, Benito Juárez, Ciudad de Mexico 03920, Mexico
  - Grupo de Medicina Molecular y Traslacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas, Instituto de Simulación Computacional (ISC-USFQ), Universidad San Francisco de Quito (USFQ), Diego de Robles y vía Interoceánica, Quito 170157, Ecuador
- Agostinho Antunes
  - Interdisciplinary Centre of Marine and Environmental Research (CIIMAR/CIMAR), University of Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos s/n, 4450-208 Porto, Portugal
  - Department of Biology, Faculty of Sciences of University of Porto (FCUP), Rua do Campo Alegre s/n, 4169-007 Porto, Portugal
6
Hashemi S, Vosough P, Taghizadeh S, Savardashtaki A. Therapeutic peptide development revolutionized: Harnessing the power of artificial intelligence for drug discovery. Heliyon 2024; 10:e40265. [PMID: 39605829 PMCID: PMC11600032 DOI: 10.1016/j.heliyon.2024.e40265] [Received: 02/20/2024] [Revised: 10/07/2024] [Accepted: 11/07/2024] [Indexed: 11/29/2024] Open
Abstract
Due to the spread of antibiotic resistance, global attention is focused on its inhibition and the expansion of effective medicinal compounds. The novel functional properties of peptides have opened up new horizons in personalized medicine. With artificial intelligence methods combined with therapeutic peptide products, pharmaceuticals and biotechnology advance drug development rapidly and reduce costs. Short-chain peptides inhibit a wide range of pathogens and have great potential for targeting diseases. To address the challenges of synthesis and sustainability, artificial intelligence methods, namely machine learning, must be integrated into their production. Learning methods can use complicated computations to select the active and toxic compounds of the drug and its metabolic activity. Through this comprehensive review, we investigated the artificial intelligence method as a potential tool for finding peptide-based drugs and providing a more accurate analysis of peptides through the introduction of predictable databases for effective selection and development.
Affiliation(s)
- Samaneh Hashemi
  - Student Research Committee, Shiraz University of Medical Sciences, Shiraz, Iran
  - Department of Medical Biotechnology, School of Advanced Medical Sciences and Technologies, Shiraz University of Medical Sciences, Shiraz, Iran
- Parisa Vosough
  - Student Research Committee, Shiraz University of Medical Sciences, Shiraz, Iran
  - Department of Medical Biotechnology, School of Advanced Medical Sciences and Technologies, Shiraz University of Medical Sciences, Shiraz, Iran
- Saeed Taghizadeh
  - Department of Medical Biotechnology, School of Advanced Medical Sciences and Technologies, Shiraz University of Medical Sciences, Shiraz, Iran
  - Pharmaceutical Science Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
- Amir Savardashtaki
  - Department of Medical Biotechnology, School of Advanced Medical Sciences and Technologies, Shiraz University of Medical Sciences, Shiraz, Iran
  - Infertility Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
7
Beltrán JF, Herrera-Belén L, Yáñez AJ, Jimenez L. Prediction of viral oncoproteins through the combination of generative adversarial networks and machine learning techniques. Sci Rep 2024; 14:27108. [PMID: 39511292 PMCID: PMC11543823 DOI: 10.1038/s41598-024-77028-y] [Received: 07/07/2024] [Accepted: 10/18/2024] [Indexed: 11/15/2024] Open
Abstract
Viral oncoproteins play crucial roles in transforming normal cells into cancer cells, representing a significant factor in the etiology of various cancers. Traditionally, identifying these oncoproteins is both time-consuming and costly. With advancements in computational biology, bioinformatics tools based on machine learning have emerged as effective methods for predicting biological activities. Here, for the first time, we propose an innovative approach that combines Generative Adversarial Networks (GANs) with supervised learning methods to enhance the accuracy and generalizability of viral oncoprotein prediction. Our methodology evaluated multiple machine learning models, including Random Forest, Multilayer Perceptron, Light Gradient Boosting Machine, eXtreme Gradient Boosting, and Support Vector Machine. In ten-fold cross-validation on our training dataset, the GAN-enhanced Random Forest model demonstrated superior performance metrics: 0.976 accuracy, 0.976 F1 score, 0.977 precision, 0.976 sensitivity, and 1.0 AUC. During independent testing, this model achieved 0.982 accuracy, 0.982 F1 score, 0.982 precision, 0.982 sensitivity, and 1.0 AUC. These results establish our new tool, VirOncoTarget, accessible via a web application. We anticipate that VirOncoTarget will be a valuable resource for researchers, enabling rapid and reliable viral oncoprotein prediction and advancing our understanding of their role in cancer biology.
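The threshold-based metrics quoted in this abstract (accuracy, F1 score, precision, sensitivity) all derive from the four cells of a binary confusion matrix, as does the Matthews correlation coefficient (MCC) reported by other entries in this list. The following is a minimal sketch of those standard formulas; the function name `binary_metrics` and the example counts are hypothetical and do not reproduce any result from the paper.

```python
import math
from typing import Dict


def binary_metrics(tp: int, tn: int, fp: int, fn: int) -> Dict[str, float]:
    """Compute common binary-classification metrics from confusion counts."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # also called recall
    pr_sum = precision + sensitivity
    f1 = 2 * precision * sensitivity / pr_sum if pr_sum else 0.0
    # MCC denominator is zero when any row or column of the matrix is empty;
    # returning 0.0 in that case is a common convention.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "sensitivity": sensitivity,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
print(binary_metrics(tp=45, tn=45, fp=5, fn=5))
```

AUC is not included because it depends on the ranking of predicted scores rather than on a single confusion matrix.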
Affiliation(s)
- Jorge F Beltrán
  - Department of Chemical Engineering, Faculty of Engineering and Science, Universidad de La Frontera, Ave. Francisco Salazar 01145, Temuco, Chile
- Lisandra Herrera-Belén
  - Departamento de Ciencias Básicas, Facultad de Ciencias, Universidad Santo Tomas, Temuco, Chile
- Alejandro J Yáñez
  - Departamento de Investigación y Desarrollo, Greenvolution SpA, Puerto Varas, Chile
  - Interdisciplinary Center for Aquaculture Research (INCAR), Concepcion, Chile
- Luis Jimenez
  - Department of Chemical Engineering, Faculty of Engineering and Science, Universidad de La Frontera, Ave. Francisco Salazar 01145, Temuco, Chile
8
Kumar A, Singh D. Generative Adversarial Network-Based Augmentation With Noval 2-Step Authentication for Anti-Coronavirus Peptide Prediction. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2024; 21:1942-1954. [PMID: 39037884 DOI: 10.1109/tcbb.2024.3431688] [Indexed: 07/24/2024]
Abstract
Viruses pose a longstanding and enduring danger to various forms of life. Despite ongoing endeavors to combat viral diseases, there remains a need to explore and develop novel therapeutic options. Antiviral peptides are bioactive molecules with a favorable toxicity profile, making them promising alternatives for viral infection treatment. Therefore, this article employed a generative adversarial network (GAN) for antiviral peptide augmentation and a novel two-step authentication process for augmented synthetic peptides to enhance antiviral activity prediction. Additionally, five widely utilized deep learning models were employed for classification purposes. Initially, a GAN was used to augment the antiviral peptides. In the two-step authentication process, NCBI-BLAST was first utilized to identify the antiviral activity resemblance between the synthetic and real peptides. Subsequently, the hydrophobicity, hydrophilicity, hydroxylic nature, positive charge, and negative charge of synthetic and authentic antiviral peptides were compared before their utilization. Later, to examine the impact of authenticated peptide augmentation on the prediction of antiviral peptides, a comparison was conducted with the outcomes of prediction without peptide augmentation. The study demonstrates that the 1-D convolutional neural network with augmented peptides exhibits superior performance compared to the other employed classifiers and state-of-the-art models. The network attains a mean classification accuracy of 95.41%, an AUC value of 0.95, and an MCC value of 0.90 on the benchmark antiviral and anti-corona peptides dataset. Thus, the performance of the proposed model indicates its efficacy in predicting the antiviral activity of peptides.
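The second authentication step, comparing physicochemical properties of synthetic and authentic peptides, can be sketched as a comparison of residue-class fractions. The residue groupings below are one common convention and are an assumption here, as are the names `composition_profile` and `max_profile_gap` and the example peptides; hydrophilicity is omitted because it is usually scored on a scale rather than by class membership. This is not the authors' procedure.

```python
from typing import Dict

# Assumed residue classes; other groupings are in common use.
GROUPS = {
    "hydrophobic": set("AVLIMFWC"),
    "hydroxylic": set("STY"),
    "positive": set("KRH"),
    "negative": set("DE"),
}


def composition_profile(peptide: str) -> Dict[str, float]:
    """Return the fraction of residues falling in each assumed class."""
    seq = peptide.strip().upper()
    if not seq:
        raise ValueError("peptide is empty")
    return {
        name: sum(aa in members for aa in seq) / len(seq)
        for name, members in GROUPS.items()
    }


def max_profile_gap(synthetic: str, authentic: str) -> float:
    """Largest absolute difference between the two profiles."""
    ps, pa = composition_profile(synthetic), composition_profile(authentic)
    return max(abs(ps[name] - pa[name]) for name in GROUPS)


# Hypothetical peptides, for illustration only.
print(composition_profile("KKDE"))
print(max_profile_gap("KKDE", "AAAA"))
```

A synthetic peptide could then be accepted only if its gap to a reference peptide stays under a chosen tolerance; any such tolerance would be a design choice, not a value from the paper.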
9
Zhao M, Zhang Y, Wang M, Ma LZ. dsAMP and dsAMPGAN: Deep Learning Networks for Antimicrobial Peptides Recognition and Generation. Antibiotics (Basel) 2024; 13:948. [PMID: 39452213 PMCID: PMC11504993 DOI: 10.3390/antibiotics13100948] [Received: 08/10/2024] [Revised: 10/03/2024] [Accepted: 10/03/2024] [Indexed: 10/26/2024] Open
Abstract
Antibiotic resistance is a growing public health challenge. Antimicrobial peptides (AMPs) effectively target microorganisms through non-specific mechanisms, limiting the ability of microorganisms to develop resistance. Therefore, the prediction and design of new AMPs is crucial. Recently, deep learning has spurred interest in computational approaches to peptide drug discovery. This study presents a novel deep learning framework for AMP classification, function prediction, and generation. We developed discoverAMP (dsAMP), a robust AMP predictor using CNN Attention BiLSTM and transfer learning, which outperforms existing classifiers. In addition, dsAMPGAN, a Generative Adversarial Network (GAN)-based model, generates new AMP candidates. Our results demonstrate the superior performance of dsAMP in terms of sensitivity, specificity, Matthews correlation coefficient, accuracy, precision, F1 score, and area under the ROC curve, achieving >95% classification accuracy with transfer learning on a small dataset. Furthermore, dsAMPGAN successfully generates AMPs similar to natural ones, as confirmed by comparisons of physical and chemical properties. This model serves as a reliable tool for the identification of novel AMPs in clinical settings and supports the development of AMPs to effectively combat antibiotic resistance.
Affiliation(s)
- Min Zhao
  - State Key Laboratory of Microbial Resources, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Yu Zhang
  - State Key Laboratory of Microbial Resources, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
  - Department of Bioscience and Biotechnology, Graduate School of Bioresource and Bioenvironmental Sciences, Kyushu University, 744 Motooka, Nishi-ku, Fukuoka 819-0395, Japan
- Maolin Wang
  - CAAC Key Laboratory of General Aviation Operation, Civil Aviation Management Institute of China, Beijing 100102, China
- Luyan Z. Ma
  - State Key Laboratory of Microbial Resources, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
10
Farias JG, Herrera-Belén L, Jimenez L, Beltrán JF. PROTA: A Robust Tool for Protamine Prediction Using a Hybrid Approach of Machine Learning and Deep Learning. Int J Mol Sci 2024; 25:10267. [PMID: 39408595 PMCID: PMC11476296 DOI: 10.3390/ijms251910267] [Received: 08/23/2024] [Revised: 09/18/2024] [Accepted: 09/23/2024] [Indexed: 10/20/2024] Open
Abstract
Protamines play a critical role in DNA compaction and stabilization in sperm cells, significantly influencing male fertility and various biotechnological applications. Traditionally, identifying these proteins is a challenging and time-consuming process due to their species-specific variability and complexity. Leveraging advancements in computational biology, we present PROTA, a novel tool that combines machine learning (ML) and deep learning (DL) techniques to predict protamines with high accuracy. For the first time, we integrate Generative Adversarial Networks (GANs) with supervised learning methods to enhance the accuracy and generalizability of protamine prediction. Our methodology evaluated multiple ML models, including Light Gradient-Boosting Machine (LIGHTGBM), Multilayer Perceptron (MLP), Random Forest (RF), eXtreme Gradient Boosting (XGBOOST), k-Nearest Neighbors (KNN), Logistic Regression (LR), Naive Bayes (NB), and Radial Basis Function-Support Vector Machine (RBF-SVM). During ten-fold cross-validation on our training dataset, the MLP model with GAN-augmented data demonstrated superior performance metrics: 0.997 accuracy, 0.997 F1 score, 0.998 precision, 0.997 sensitivity, and 1.0 AUC. In the independent testing phase, this model achieved 0.999 accuracy, 0.999 F1 score, 1.0 precision, 0.999 sensitivity, and 1.0 AUC. These results establish PROTA, accessible via a user-friendly web application. We anticipate that PROTA will be a crucial resource for researchers, enabling the rapid and reliable prediction of protamines, thereby advancing our understanding of their roles in reproductive biology, biotechnology, and medicine.
Affiliation(s)
- Jorge G. Farias
  - Department of Chemical Engineering, Faculty of Engineering and Science, Universidad de La Frontera, Ave. Francisco Salazar 01145, Temuco 4811230, Chile
- Lisandra Herrera-Belén
  - Departamento de Ciencias Básicas, Facultad de Ciencias, Universidad Santo Tomas, Temuco 4780000, Chile
- Luis Jimenez
  - Department of Chemical Engineering, Faculty of Engineering and Science, Universidad de La Frontera, Ave. Francisco Salazar 01145, Temuco 4811230, Chile
- Jorge F. Beltrán
  - Department of Chemical Engineering, Faculty of Engineering and Science, Universidad de La Frontera, Ave. Francisco Salazar 01145, Temuco 4811230, Chile
11
Medina-Ortiz D, Contreras S, Fernández D, Soto-García N, Moya I, Cabas-Mora G, Olivera-Nappa Á. Protein Language Models and Machine Learning Facilitate the Identification of Antimicrobial Peptides. Int J Mol Sci 2024; 25:8851. [PMID: 39201537 PMCID: PMC11487388 DOI: 10.3390/ijms25168851] [Received: 06/19/2024] [Revised: 08/05/2024] [Accepted: 08/08/2024] [Indexed: 09/02/2024] Open
Abstract
Peptides are bioactive molecules whose functional versatility in living organisms has led to successful applications in diverse fields. In recent years, the amount of data describing peptide sequences and function collected in open repositories has substantially increased, allowing the application of more complex computational models to study the relations between the peptide composition and function. This work introduces AMP-Detector, a sequence-based classification model for the detection of peptides' functional biological activity, focusing on accelerating the discovery and de novo design of potential antimicrobial peptides (AMPs). AMP-Detector introduces a novel sequence-based pipeline to train binary classification models, integrating protein language models and machine learning algorithms. This pipeline produced 21 models targeting antimicrobial, antiviral, and antibacterial activity, achieving average precision exceeding 83%. Benchmark analyses revealed that our models outperformed existing methods for AMPs and delivered comparable results for other biological activity types. Utilizing the Peptide Atlas, we applied AMP-Detector to discover over 190,000 potential AMPs and demonstrated that it is an integrative approach with generative learning to aid in de novo design, resulting in over 500 novel AMPs. The combination of our methodology, robust models, and a generative design strategy offers a significant advancement in peptide-based drug discovery and represents a pivotal tool for therapeutic applications.
Affiliation(s)
- David Medina-Ortiz
- Departamento de Ingeniería en Computación, Universidad de Magallanes, Punta Arenas 6210005, Chile
- Centre for Biotechnology and Bioengineering, CeBiB, Universidad de Chile, Santiago 8370456, Chile
| | - Seba Contreras
- Max Planck Institute for Dynamics and Self-Organization, Am Faßberg 17, 37077 Göttingen, Germany
| | - Diego Fernández
- Departamento de Ingeniería en Computación, Universidad de Magallanes, Punta Arenas 6210005, Chile
| | - Nicole Soto-García
- Departamento de Ingeniería en Computación, Universidad de Magallanes, Punta Arenas 6210005, Chile
| | - Iván Moya
- Departamento de Ingeniería en Computación, Universidad de Magallanes, Punta Arenas 6210005, Chile
- Departamento de Ingeniería Química, Universidad de Magallanes, Punta Arenas 6210005, Chile
| | - Gabriel Cabas-Mora
- Departamento de Ingeniería en Computación, Universidad de Magallanes, Punta Arenas 6210005, Chile
| | - Álvaro Olivera-Nappa
- Centre for Biotechnology and Bioengineering, CeBiB, Universidad de Chile, Santiago 8370456, Chile
- Departamento de Ingeniería Química, Biotecnología y Materiales, Universidad de Chile, Santiago 8370456, Chile
|
12
|
de Llano García D, Marrero-Ponce Y, Agüero-Chapin G, Ferri FJ, Antunes A, Martinez-Rios F, Rodríguez H. Innovative Alignment-Based Method for Antiviral Peptide Prediction. Antibiotics (Basel) 2024; 13:768. [PMID: 39200068 PMCID: PMC11350826 DOI: 10.3390/antibiotics13080768] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2024] [Revised: 08/08/2024] [Accepted: 08/09/2024] [Indexed: 09/01/2024] Open
Abstract
Antiviral peptides (AVPs) represent a promising strategy for addressing the global challenges of viral infections and their growing resistance to traditional drugs. Lab-based AVP discovery methods are resource-intensive, highlighting the need for efficient computational alternatives. In this study, we developed five non-trained but supervised multi-query similarity search models (MQSSMs) integrated into the StarPep toolbox. Rigorous testing and validation across diverse AVP datasets confirmed the models' robustness and reliability. The top-performing model, M13+, demonstrated impressive results, with an accuracy of 0.969 and a Matthews correlation coefficient of 0.71. To assess their competitiveness, the top five models were benchmarked against 14 publicly available machine-learning and deep-learning AVP predictors. The MQSSMs outperformed these predictors, highlighting their efficiency in terms of resource demand and public accessibility. Another significant achievement of this study is the creation of the most comprehensive dataset of antiviral sequences to date. In general, these results suggest that MQSSMs are promising tools to develop good alignment-based models that can be successfully applied in the screening of large datasets for new AVP discovery.
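An accuracy of 0.969 alongside a Matthews correlation coefficient (MCC) of 0.71 is a pattern that often appears on imbalanced benchmarks, where the majority class dominates accuracy. The sketch below shows how both metrics are derived from the same confusion matrix. The counts are hypothetical and chosen only for illustration; they are not taken from the study.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Conventional fallback when a row or column of the matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical, imbalanced test set: 50 positives (AVPs), 950 negatives.
tp, fn = 40, 10
tn, fp = 940, 10

acc_value = accuracy(tp, tn, fp, fn)
mcc_value = mcc(tp, tn, fp, fn)
print(f"accuracy = {acc_value:.3f}, MCC = {mcc_value:.3f}")
```

Because MCC uses all four cells of the confusion matrix, it drops well below accuracy when the minority class is predicted less reliably than the majority class.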
Affiliation(s)
- Daniela de Llano García
- School of Chemical Sciences and Engineering, Yachay Tech University, Hda. San José s/n y Proyecto Yachay, Urcuquí 100119, Imbabura, Ecuador; (D.d.L.G.); (H.R.)
| | - Yovani Marrero-Ponce
- Universidad San Francisco de Quito (USFQ), Grupo de Medicina Molecular y Traslacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas, Instituto de Simulación Computacional (ISC-USFQ), Diego de Robles y vía Interoceánica, Quito 170157, Pichincha, Ecuador
- Facultad de Ingeniería, Universidad Panamericana, Augusto Rodin 498, Benito Juárez 03920, Ciudad de México, Mexico;
- Computer Science Department, Universitat de València, 46100 Valencia, Burjassot, Spain;
| | - Guillermin Agüero-Chapin
- CIIMAR—Centro Interdisciplinar de Investigação Marinha e Ambiental, Universidade do Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos, s/n, 4450-208 Porto, Portugal;
- Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, Rua do Campo Alegre, 4169-007 Porto, Portugal
| | - Francesc J. Ferri
- Computer Science Department, Universitat de València, 46100 Valencia, Burjassot, Spain;
| | - Agostinho Antunes
- CIIMAR—Centro Interdisciplinar de Investigação Marinha e Ambiental, Universidade do Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos, s/n, 4450-208 Porto, Portugal;
- Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, Rua do Campo Alegre, 4169-007 Porto, Portugal
| | - Felix Martinez-Rios
- Facultad de Ingeniería, Universidad Panamericana, Augusto Rodin 498, Benito Juárez 03920, Ciudad de México, Mexico;
| | - Hortensia Rodríguez
- School of Chemical Sciences and Engineering, Yachay Tech University, Hda. San José s/n y Proyecto Yachay, Urcuquí 100119, Imbabura, Ecuador; (D.d.L.G.); (H.R.)
|
13
|
Lefin N, Herrera-Belén L, Farias JG, Beltrán JF. Review and perspective on bioinformatics tools using machine learning and deep learning for predicting antiviral peptides. Mol Divers 2024; 28:2365-2374. [PMID: 37626205 DOI: 10.1007/s11030-023-10718-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2023] [Accepted: 08/15/2023] [Indexed: 08/27/2023]
Abstract
Viruses constitute a constant threat to global health and have caused millions of human and animal deaths throughout human history. Despite advances in the discovery of antiviral compounds that help fight these pathogens, finding a solution to this problem continues to be a task that consumes time and financial resources. Currently, artificial intelligence (AI) has revolutionized many areas of the biological sciences, making it possible to decipher patterns in amino acid sequences that encode different functions and activities. Within the field of AI, machine learning and deep learning algorithms have been used to discover antimicrobial peptides. Due to their effectiveness and specificity, antimicrobial peptides (AMPs) hold excellent promise for treating various infections caused by pathogens. Antiviral peptides (AVPs) are a specific type of AMPs that have activity against certain viruses. Unlike research on tools and methods for the prediction of antimicrobial peptides, work on the prediction of AVPs is still scarce. Given the significance of AVPs as potential pharmaceutical options for human and animal health and the ongoing AI revolution, we have reviewed and summarized the current machine learning and deep learning-based tools and methods available for predicting these types of peptides.
Affiliation(s)
- Nicolás Lefin
- Department of Chemical Engineering, Faculty of Engineering and Science, University of La Frontera, Ave. Francisco Salazar, 01145, Temuco, Chile
| | - Lisandra Herrera-Belén
- Departamento de Ciencias Básicas, Facultad de Ciencias, Universidad Santo Tomás, Temuco, Chile
| | - Jorge G Farias
- Department of Chemical Engineering, Faculty of Engineering and Science, University of La Frontera, Ave. Francisco Salazar, 01145, Temuco, Chile
| | - Jorge F Beltrán
- Department of Chemical Engineering, Faculty of Engineering and Science, University of La Frontera, Ave. Francisco Salazar, 01145, Temuco, Chile.
|
14
|
Goles M, Daza A, Cabas-Mora G, Sarmiento-Varón L, Sepúlveda-Yañez J, Anvari-Kazemabad H, Davari MD, Uribe-Paredes R, Olivera-Nappa Á, Navarrete MA, Medina-Ortiz D. Peptide-based drug discovery through artificial intelligence: towards an autonomous design of therapeutic peptides. Brief Bioinform 2024; 25:bbae275. [PMID: 38856172 PMCID: PMC11163380 DOI: 10.1093/bib/bbae275] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2024] [Revised: 04/23/2024] [Accepted: 06/04/2024] [Indexed: 06/11/2024] Open
Abstract
With their diverse biological activities, peptides are promising candidates for therapeutic applications, showing antimicrobial, antitumour and hormonal signalling capabilities. Despite their advantages, therapeutic peptides face challenges such as short half-life, limited oral bioavailability and susceptibility to plasma degradation. The rise of computational tools and artificial intelligence (AI) in peptide research has spurred the development of advanced methodologies and databases that are pivotal in the exploration of these complex macromolecules. This perspective delves into integrating AI in peptide development, encompassing classifier methods, predictive systems and the avant-garde design facilitated by deep-generative models like generative adversarial networks and variational autoencoders. There are still challenges, such as the need for processing optimization and careful validation of predictive models. This work outlines traditional strategies for machine learning model construction and training techniques and proposes a comprehensive AI-assisted peptide design and validation pipeline. The evolving landscape of peptide design using AI is emphasized, showcasing the practicality of these methods in expediting the development and discovery of novel peptides within the context of peptide-based drug discovery.
Affiliation(s)
- Montserrat Goles
- Departamento de Ingeniería en Computación, Universidad de Magallanes, Av. Pdte. Manuel Bulnes 01855, 6210427, Punta Arenas, Chile
- Departamento de Ingeniería Química, Biotecnología y Materiales, Universidad de Chile, Beauchef 851, 8370456, Santiago, Chile
| | - Anamaría Daza
- Centre for Biotechnology and Bioengineering, CeBiB, Universidad de Chile, Beauchef 851, 8370456, Santiago, Chile
| | - Gabriel Cabas-Mora
- Departamento de Ingeniería en Computación, Universidad de Magallanes, Av. Pdte. Manuel Bulnes 01855, 6210427, Punta Arenas, Chile
| | - Lindybeth Sarmiento-Varón
- Centro Asistencial de Docencia e Investigación, CADI, Universidad de Magallanes, Av. Los Flamencos 01364, 6210005, Punta Arenas, Chile
| | - Julieta Sepúlveda-Yañez
- Facultad de Ciencias de la Salud, Universidad de Magallanes, Av. Pdte. Manuel Bulnes 01855, 6210427, Punta Arenas, Chile
| | - Hoda Anvari-Kazemabad
- Departamento de Ingeniería en Computación, Universidad de Magallanes, Av. Pdte. Manuel Bulnes 01855, 6210427, Punta Arenas, Chile
| | - Mehdi D Davari
- Department of Bioorganic Chemistry, Leibniz Institute of Plant Biochemistry, Weinberg 3, 06120, Halle, Germany
| | - Roberto Uribe-Paredes
- Departamento de Ingeniería en Computación, Universidad de Magallanes, Av. Pdte. Manuel Bulnes 01855, 6210427, Punta Arenas, Chile
| | - Álvaro Olivera-Nappa
- Centre for Biotechnology and Bioengineering, CeBiB, Universidad de Chile, Beauchef 851, 8370456, Santiago, Chile
| | - Marcelo A Navarrete
- Centro Asistencial de Docencia e Investigación, CADI, Universidad de Magallanes, Av. Los Flamencos 01364, 6210005, Punta Arenas, Chile
- Escuela de Medicina, Universidad de Magallanes, Av. Pdte. Manuel Bulnes 01855, 6210427, Punta Arenas, Chile
| | - David Medina-Ortiz
- Departamento de Ingeniería en Computación, Universidad de Magallanes, Av. Pdte. Manuel Bulnes 01855, 6210427, Punta Arenas, Chile
- Centre for Biotechnology and Bioengineering, CeBiB, Universidad de Chile, Beauchef 851, 8370456, Santiago, Chile
|
15
|
Ullah M, Akbar S, Raza A, Zou Q. DeepAVP-TPPred: identification of antiviral peptides using transformed image-based localized descriptors and binary tree growth algorithm. Bioinformatics 2024; 40:btae305. [PMID: 38710482 PMCID: PMC11256913 DOI: 10.1093/bioinformatics/btae305] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2024] [Revised: 04/08/2024] [Accepted: 05/03/2024] [Indexed: 05/08/2024] Open
Abstract
MOTIVATION Despite the extensive manufacturing of antiviral drugs and vaccination, viral infections continue to be a major human ailment. Antiviral peptides (AVPs) have emerged as potential candidates in the pursuit of novel antiviral drugs. These peptides show vigorous antiviral activity against a diverse range of viruses by targeting different phases of the viral life cycle. Therefore, the accurate prediction of AVPs is an essential yet challenging task. Lately, many machine learning-based approaches have been developed for this purpose; however, their limited capabilities in terms of feature engineering, accuracy, and generalization restrict their usefulness. RESULTS In the present study, we aim to develop an efficient machine learning-based approach for the identification of AVPs, referred to as DeepAVP-TPPred, to address the aforementioned problems. First, we extract two new transformed feature sets using our designed image-based feature extraction algorithms and integrate them with an evolutionary information-based feature. Next, these feature sets are optimized using a novel feature selection approach called the binary tree growth algorithm. Finally, the optimal feature space from the training dataset is fed to a deep neural network to build the final classification model. The proposed model DeepAVP-TPPred was tested using stringent 5-fold cross-validation and two independent dataset testing methods, which achieved the maximum performance and showed enhanced efficiency over existing predictors in terms of both accuracy and generalization capabilities. AVAILABILITY AND IMPLEMENTATION https://github.com/MateeullahKhan/DeepAVP-TPPred.
Affiliation(s)
- Matee Ullah
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, Sichuan 610054, China
| | - Shahid Akbar
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, Sichuan 610054, China
- Department of Computer Science, Abdul Wali Khan University Mardan, Mardan 23200, Pakistan
| | - Ali Raza
- Department of Computer Science, MY University, Islamabad 45750, Pakistan
| | - Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, Sichuan 610054, China
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, Zhejiang 324003, China
|
16
|
Kao HJ, Weng TH, Chen CH, Chen YC, Huang KY, Weng SL. iDVEIP: A computer-aided approach for the prediction of viral entry inhibitory peptides. Proteomics 2024; 24:e2300257. [PMID: 38263811 DOI: 10.1002/pmic.202300257] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2023] [Revised: 01/03/2024] [Accepted: 01/05/2024] [Indexed: 01/25/2024]
Abstract
With the notable surge in therapeutic peptide development, various peptides have emerged as potential agents against virus-induced diseases. Viral entry inhibitory peptides (VEIPs), a subset of antiviral peptides (AVPs), offer a promising avenue as entry inhibitors (EIs) with distinct advantages over chemical counterparts. Despite this, a comprehensive analytical platform for characterizing these peptides and their effectiveness in blocking viral entry remains lacking. In this study, we introduce a groundbreaking in silico approach that leverages bioinformatics analysis and machine learning to characterize and identify novel VEIPs. Cross-validation results demonstrate the efficacy of a model combining sequence-based features in predicting VEIPs with high accuracy, validated through independent testing. Additionally, an EI type model has been developed to distinguish peptides specifically acting as EIs from AVPs with alternative activities. Notably, we present iDVEIP, a web-based tool accessible at http://mer.hc.mmh.org.tw/iDVEIP/, designed for automatic analysis and prediction of VEIPs. Emphasizing its capabilities, the tool facilitates comprehensive analyses of peptide characteristics, providing detailed amino acid composition data for each prediction. Furthermore, we showcase the tool's utility in identifying EIs against severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2).
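Amino acid composition, mentioned above as a per-prediction output, is one of the simplest sequence-based features. The following is a minimal sketch of how such a composition vector can be computed. It is not the iDVEIP implementation; the function name and the example sequence are hypothetical.

```python
from collections import Counter

# The 20 standard amino acids, in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fraction of each standard residue in a peptide sequence.

    Non-standard symbols are ignored (an assumption of this sketch).
    """
    sequence = sequence.upper()
    counts = Counter(aa for aa in sequence if aa in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return {aa: 0.0 for aa in AMINO_ACIDS}
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


# Hypothetical example peptide, used only to exercise the function.
composition = amino_acid_composition("AKKLAG")
for residue, fraction in composition.items():
    if fraction > 0:
        print(residue, round(fraction, 3))
```

The resulting 20-dimensional vector always has the same length regardless of peptide length, which is why composition features are convenient inputs for conventional classifiers.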
Affiliation(s)
- Hui-Ju Kao
- Department of Medical Research, Hsinchu MacKay Memorial Hospital, Hsinchu City, Taiwan
| | - Tzu-Hsiang Weng
- Department of Obstetrics and Gynecology, MacKay Memorial Hospital, Taipei City, Taiwan
| | - Chia-Hung Chen
- Department of Medical Research, Hsinchu MacKay Memorial Hospital, Hsinchu City, Taiwan
| | - Yu-Chi Chen
- Department of Medical Research, Hsinchu MacKay Memorial Hospital, Hsinchu City, Taiwan
| | - Kai-Yao Huang
- Department of Medical Research, Hsinchu MacKay Memorial Hospital, Hsinchu City, Taiwan
- Department of Medicine, MacKay Medical College, New Taipei City, Taiwan
- Institute of Biomedical Sciences, MacKay Medical College, New Taipei City, Taiwan
| | - Shun-Long Weng
- Department of Medicine, MacKay Medical College, New Taipei City, Taiwan
- Department of Obstetrics and Gynecology, Hsinchu MacKay Memorial Hospital, Hsinchu City, Taiwan
- MacKay Junior College of Medicine, Nursing and Management, Taipei City, Taiwan
|
17
|
Beltrán JF, Herrera-Belén L, Parraguez-Contreras F, Farías JG, Machuca-Sepúlveda J, Short S. MultiToxPred 1.0: a novel comprehensive tool for predicting 27 classes of protein toxins using an ensemble machine learning approach. BMC Bioinformatics 2024; 25:148. [PMID: 38609877 PMCID: PMC11010298 DOI: 10.1186/s12859-024-05748-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2023] [Accepted: 03/14/2024] [Indexed: 04/14/2024] Open
Abstract
Protein toxins are defense mechanisms and adaptations found in various organisms and microorganisms, and their use in scientific research as therapeutic candidates is gaining relevance due to their effectiveness and specificity against cellular targets. However, discovering these toxins is time-consuming and expensive. In silico tools, particularly those based on machine learning and deep learning, have emerged as valuable resources to address this challenge. Existing tools primarily focus on binary classification, determining whether a protein is a toxin or not, and occasionally identifying specific types of toxins. For the first time, we propose a novel approach capable of classifying protein toxins into 27 distinct categories based on their mode of action within cells. To accomplish this, we assessed multiple machine learning techniques and found that an ensemble model incorporating the Light Gradient Boosting Machine and Quadratic Discriminant Analysis algorithms exhibited the best performance. During the tenfold cross-validation on the training dataset, our model exhibited notable metrics: 0.840 accuracy, 0.827 F1 score, 0.836 precision, 0.840 sensitivity, and 0.989 AUC. In the testing stage, using an independent dataset, the model achieved 0.846 accuracy, 0.838 F1 score, 0.847 precision, 0.849 sensitivity, and 0.991 AUC. These results present a powerful next-generation tool called MultiToxPred 1.0, accessible through a web application. We believe that MultiToxPred 1.0 has the potential to become an indispensable resource for researchers, facilitating the efficient identification of protein toxins. By leveraging this tool, scientists can accelerate their search for these toxins and advance their understanding of their therapeutic potential.
Affiliation(s)
- Jorge F Beltrán
- Department of Chemical Engineering, Faculty of Engineering and Science, Universidad de La Frontera, Ave. Francisco Salazar, 01145, Temuco, Chile.
| | - Lisandra Herrera-Belén
- Departamento de Ciencias Básicas, Facultad de Ciencias, Universidad Santo Tomas, Temuco, Chile
| | - Fernanda Parraguez-Contreras
- Department of Chemical Engineering, Faculty of Engineering and Science, Universidad de La Frontera, Ave. Francisco Salazar, 01145, Temuco, Chile
| | - Jorge G Farías
- Department of Chemical Engineering, Faculty of Engineering and Science, Universidad de La Frontera, Ave. Francisco Salazar, 01145, Temuco, Chile
| | - Jorge Machuca-Sepúlveda
- Department of Chemical Engineering, Faculty of Engineering and Science, Universidad de La Frontera, Ave. Francisco Salazar, 01145, Temuco, Chile
| | - Stefania Short
- Department of Chemical Engineering, Faculty of Engineering and Science, Universidad de La Frontera, Ave. Francisco Salazar, 01145, Temuco, Chile
|
18
|
Akbar S, Raza A, Zou Q. Deepstacked-AVPs: predicting antiviral peptides using tri-segment evolutionary profile and word embedding based multi-perspective features with deep stacking model. BMC Bioinformatics 2024; 25:102. [PMID: 38454333 PMCID: PMC10921744 DOI: 10.1186/s12859-024-05726-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2023] [Accepted: 03/01/2024] [Indexed: 03/09/2024] Open
Abstract
BACKGROUND Viral infections have been the main health issue in the last decade. Antiviral peptides (AVPs) are a subclass of antimicrobial peptides (AMPs) with substantial potential to protect the human body against various viral diseases. However, there has been significant production of antiviral vaccines and medications. Recently, the development of AVPs as an antiviral agent suggests an effective way to treat virus-affected cells. In addition, the use of intelligent machine learning techniques for developing peptide-based therapeutic agents is attracting increasing interest due to its significant outcomes. The existing wet-laboratory-based drugs are expensive, time-consuming, and cannot effectively perform in screening and predicting the targeted motif of antiviral peptides. METHODS In this paper, we propose a novel computational model called Deepstacked-AVPs to discriminate AVPs accurately. The training sequences are numerically encoded using a novel Tri-segmentation-based position-specific scoring matrix (PSSM-TS) and word2vec-based semantic features. Composition/Transition/Distribution-Transition (CTDT) is also employed to represent the physicochemical properties based on structural features. Apart from these, the fused vector is formed using PSSM-TS features, semantic information, and CTDT descriptors to compensate for the limitations of single encoding methods. Information gain (IG) is applied to choose the optimal feature set. The selected features are trained using a stacked-ensemble classifier. RESULTS The proposed Deepstacked-AVPs model achieved a predictive accuracy of 96.60%, an area under the curve (AUC) of 0.98, and a precision-recall (PR) value of 0.97 using training samples. In the case of the independent samples, our model obtained an accuracy of 95.15%, an AUC of 0.97, and a PR value of 0.97.
CONCLUSION Our Deepstacked-AVPs model outperformed existing models with ~4% and ~2% higher accuracy using training and independent samples, respectively. The reliability and efficacy of the proposed Deepstacked-AVPs model make it a valuable tool for scientists and may perform a beneficial role in pharmaceutical design and research academia.
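The information gain (IG) step named in the METHODS can be illustrated with a small sketch. IG scores a feature by how much knowing its value reduces the entropy of the class label, IG = H(Y) - H(Y | X). The code below assumes features have already been discretized (continuous descriptors such as PSSM scores would need binning first); the function names and the toy data are hypothetical and are not the authors' implementation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """IG = H(Y) - H(Y | X) for one discrete feature X."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def select_top_k(feature_columns, labels, k):
    """Return the names of the k features with the highest information gain."""
    ranked = sorted(
        feature_columns.items(),
        key=lambda item: information_gain(item[1], labels),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


# Toy data: f_a separates the classes perfectly, f_b carries no information.
labels = [1, 1, 0, 0]
features = {"f_a": [1, 1, 0, 0], "f_b": [1, 0, 1, 0]}
print(select_top_k(features, labels, k=1))
```

Ranking features this way scores each one independently of the others, so redundant features can all rank highly; that is a known limitation of univariate filters rather than something specific to this sketch.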
Affiliation(s)
- Shahid Akbar
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, 610054, People's Republic of China
- Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, 23200, KP, Pakistan
| | - Ali Raza
- Department of Physical and Numerical Sciences, Qurtuba University of Science and Information Technology, Peshawar, 25124, KP, Pakistan
| | - Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, 610054, People's Republic of China.
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, 324000, People's Republic of China.
|
19
|
Feijoo-Coronel ML, Mendes B, Ramírez D, Peña-Varas C, de los Monteros-Silva NQE, Proaño-Bolaños C, de Oliveira LC, Lívio DF, da Silva JA, da Silva JMSF, Pereira MGAG, Rodrigues MQRB, Teixeira MM, Granjeiro PA, Patel K, Vaiyapuri S, Almeida JR. Antibacterial and Antiviral Properties of Chenopodin-Derived Synthetic Peptides. Antibiotics (Basel) 2024; 13:78. [PMID: 38247637 PMCID: PMC10812719 DOI: 10.3390/antibiotics13010078] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 11/29/2023] [Revised: 01/10/2024] [Accepted: 01/11/2024] [Indexed: 01/23/2024] Open
Abstract
Antimicrobial peptides have been developed based on plant-derived molecular scaffolds for the treatment of infectious diseases. Chenopodin is an abundant seed storage protein in quinoa, an Andean plant with high nutritional and therapeutic properties. Here, we used computer- and physicochemical-based strategies and designed four peptides derived from the primary structure of Chenopodin. Two peptides reproduce natural fragments of 14 amino acids from Chenopodin, named Chen1 and Chen2, and two engineered peptides of the same length were designed based on the Chen1 sequence. The two amino acids of Chen1 containing amide side chains were replaced by arginine (ChenR) or tryptophan (ChenW) to generate engineered cationic and hydrophobic peptides. The evaluation of these 14-mer peptides on Staphylococcus aureus and Escherichia coli showed that Chen1 does not have antibacterial activity up to 512 µM against these strains, while other peptides exhibited antibacterial effects at lower concentrations. The chemical substitutions of glutamine and asparagine by amino acids with cationic or aromatic side chains significantly favoured their antibacterial effects. These peptides did not show significant hemolytic activity. The fluorescence microscopy analysis highlighted the membranolytic nature of Chenopodin-derived peptides. Using molecular dynamics simulations, we found that a pore is formed when multiple peptides are assembled in the membrane. Some of them form secondary structures when interacting with the membrane, allowing water translocation during the simulations. Finally, Chen2 and ChenR significantly reduced SARS-CoV-2 infection. These findings demonstrate that Chenopodin is a highly useful template for the design, engineering, and manufacturing of non-toxic, antibacterial, and antiviral peptides.
Affiliation(s)
- Marcia L. Feijoo-Coronel
- Biomolecules Discovery Group, Universidad Regional Amazónica Ikiam, Km 7 Via Muyuna, Tena 150101, Ecuador
| | - Bruno Mendes
- Biomolecules Discovery Group, Universidad Regional Amazónica Ikiam, Km 7 Via Muyuna, Tena 150101, Ecuador
| | - David Ramírez
- Departamento de Farmacología, Facultad de Ciencias Biológicas, Universidad de Concepción, Concepción 4030000, Chile
| | - Carlos Peña-Varas
- Departamento de Farmacología, Facultad de Ciencias Biológicas, Universidad de Concepción, Concepción 4030000, Chile
| | | | - Carolina Proaño-Bolaños
- Biomolecules Discovery Group, Universidad Regional Amazónica Ikiam, Km 7 Via Muyuna, Tena 150101, Ecuador
| | - Leonardo Camilo de Oliveira
- Centro de Pesquisa e Desenvolvimento de Fármacos, Departamento de Bioquímica e Imunologia, Instituto de Ciências Biológicas, Federal University of Minas Gerais, Belo Horizonte 31270-901, Brazil
| | - Diego Fernandes Lívio
- Campus Centro Oeste, Federal University of São João Del-Rei, Rua Sebastião Gonçalves Filho, n 400, Chanadour, Divinópolis 35501-296, Brazil
| | - José Antônio da Silva
- Campus Centro Oeste, Federal University of São João Del-Rei, Rua Sebastião Gonçalves Filho, n 400, Chanadour, Divinópolis 35501-296, Brazil
| | - José Maurício S. F. da Silva
- Departamento de Bioquímica, Centro de Ciências Biomédicas, Federal University of Alfenas, Rua Gabriel Monteiro da Silva, 700, Sala E209, Alfenas 37130-001, Brazil
| | - Marília Gabriella A. G. Pereira
- Departamento de Bioquímica, Centro de Ciências Biomédicas, Federal University of Alfenas, Rua Gabriel Monteiro da Silva, 700, Sala E209, Alfenas 37130-001, Brazil
| | - Marina Q. R. B. Rodrigues
- Departamento de Bioquímica, Centro de Ciências Biomédicas, Federal University of Alfenas, Rua Gabriel Monteiro da Silva, 700, Sala E209, Alfenas 37130-001, Brazil
- Departamento de Engenharia de Biossistemas, Campus Dom Bosco, Federal University of São João Del-Rei, Praça Dom Helvécio, 74, Fábricas, São João del-Rei 36301-160, Brazil
| | - Mauro M. Teixeira
- Centro de Pesquisa e Desenvolvimento de Fármacos, Departamento de Bioquímica e Imunologia, Instituto de Ciências Biológicas, Federal University of Minas Gerais, Belo Horizonte 31270-901, Brazil
| | - Paulo Afonso Granjeiro
- Campus Centro Oeste, Federal University of São João Del-Rei, Rua Sebastião Gonçalves Filho, n 400, Chanadour, Divinópolis 35501-296, Brazil
| | - Ketan Patel
- School of Biological Sciences, University of Reading, Reading RG6 6UB, UK
| | | | - José R. Almeida
- Biomolecules Discovery Group, Universidad Regional Amazónica Ikiam, Km 7 Via Muyuna, Tena 150101, Ecuador
- School of Pharmacy, University of Reading, Reading RG6 6UB, UK
|
20
|
Ma C, Wolfinger R. A prediction model for blood-brain barrier penetrating peptides based on masked peptide transformers with dynamic routing. Brief Bioinform 2023; 24:bbad399. [PMID: 37985456 DOI: 10.1093/bib/bbad399] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2023] [Revised: 09/26/2023] [Accepted: 10/17/2023] [Indexed: 11/22/2023] Open
Abstract
Blood-brain barrier penetrating peptides (BBBPs) are short peptide sequences that possess the ability to traverse the selective blood-brain interface, making them valuable drug candidates or carriers for various payloads. However, the in vivo or in vitro validation of BBBPs is resource-intensive and time-consuming, driving the need for accurate in silico prediction methods. Unfortunately, the scarcity of experimentally validated BBBPs hinders the efficacy of current machine-learning approaches in generating reliable predictions. In this paper, we present DeepB3P3, a novel framework for BBBPs prediction. Our contribution encompasses four key aspects. Firstly, we propose a novel deep learning model consisting of a transformer encoder layer, a convolutional network backbone, and a capsule network classification head. This integrated architecture effectively learns representative features from peptide sequences. Secondly, we introduce masked peptides as a powerful data augmentation technique to compensate for small training set sizes in BBBP prediction. Thirdly, we develop a novel threshold-tuning method to handle imbalanced data by approximating the optimal decision threshold using the training set. Lastly, DeepB3P3 provides an accurate estimation of the uncertainty level associated with each prediction. Through extensive experiments, we demonstrate that DeepB3P3 achieves state-of-the-art accuracy of up to 98.31% on a benchmarking dataset, solidifying its potential as a promising computational tool for the prediction and discovery of BBBPs.
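The masked-peptide augmentation described above can be pictured with a short sketch: each training peptide yields several copies in which a random subset of residues is replaced by a mask symbol. This is an illustration under stated assumptions, not the DeepB3P3 code. The mask symbol `X`, the 15% mask fraction, the function names, and the example sequence are all hypothetical choices.

```python
import random

MASK_TOKEN = "X"  # hypothetical mask symbol; the paper's token may differ


def mask_peptide(sequence, mask_fraction=0.15, rng=None):
    """Replace a random subset of residues with MASK_TOKEN.

    At least one residue is masked for any non-empty sequence.
    """
    rng = rng or random.Random()
    n_mask = min(len(sequence), max(1, round(len(sequence) * mask_fraction)))
    positions = set(rng.sample(range(len(sequence)), n_mask))
    return "".join(
        MASK_TOKEN if i in positions else aa for i, aa in enumerate(sequence)
    )


def augment(sequence, n_copies=3, mask_fraction=0.15, seed=0):
    """Generate n_copies masked variants of one peptide, reproducibly."""
    rng = random.Random(seed)
    return [mask_peptide(sequence, mask_fraction, rng) for _ in range(n_copies)]


# Example peptide used only to exercise the functions.
variants = augment("YGRKKRRQRRR", n_copies=3)
for variant in variants:
    print(variant)
```

Each variant keeps the label of its source peptide, so a small training set is enlarged without new experimental data; the mask fraction controls how far the variants drift from the original sequence.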
Affiliation(s)
- Chunwei Ma
- JMP Statistical Discovery, LLC, Cary, 27513, NC, USA
- Department of Computer Science and Engineering, University at Buffalo, Buffalo, 14260, NY, USA
21
Ali F, Kumar H, Alghamdi W, Kateb FA, Alarfaj FK. Recent Advances in Machine Learning-Based Models for Prediction of Antiviral Peptides. ARCHIVES OF COMPUTATIONAL METHODS IN ENGINEERING : STATE OF THE ART REVIEWS 2023; 30:1-12. [PMID: 37359746 PMCID: PMC10148704 DOI: 10.1007/s11831-023-09933-w] [Received: 01/05/2023] [Accepted: 04/19/2023] [Indexed: 06/28/2023]
Abstract
Viruses have infected and killed millions of people across the world. They cause several chronic diseases such as COVID-19, HIV, and hepatitis. To cope with such diseases and viral infections, antiviral peptides (AVPs) have been applied in drug design. Given their significant role in the pharmaceutical industry and other research fields, the identification of AVPs is essential. To this end, both experimental and computational methods have been proposed to identify AVPs. However, more accurate predictors for AVP identification are still highly desirable. This work presents a thorough survey of the available AVP predictors. We explain the datasets used, the feature representation approaches, the classification algorithms, and the performance evaluation parameters. The limitations of existing studies and the best-performing methods are highlighted, and the pros and cons of the applied classifiers are provided. The future insights point to efficient feature encoding approaches, feature optimization schemes, and effective classification techniques that can improve the performance of novel methods for accurate prediction of AVPs.
Affiliation(s)
- Farman Ali
- Sarhad University of Science and Information Technology Peshawar, Mardan Campus, Khyber Pakhtunkhwa, Pakistan
- Harish Kumar
- Department of Computer Science, College of Computer Science, King Khalid University, Abha, Saudi Arabia
- Wajdi Alghamdi
- Department of Information Technology, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, 21589 Saudi Arabia
- Faris A. Kateb
- Department of Information Technology, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, 21589 Saudi Arabia
- Fawaz Khaled Alarfaj
- Department of Management Information Systems, King Faisal University, Hufof, Saudi Arabia