1
Shahid, Hayat M, Raza A, Akbar S, Alghamdi W, Iqbal N, Zou Q. pACPs-DNN: Predicting anticancer peptides using novel peptide transformation into evolutionary and structure matrix-based images with self-attention deep learning model. Comput Biol Chem 2025; 117:108441. [PMID: 40168838 DOI: 10.1016/j.compbiolchem.2025.108441] [Received: 02/10/2025] [Revised: 03/18/2025] [Accepted: 03/22/2025] [Indexed: 04/03/2025]
Abstract
Globally, cancer remains a major health challenge due to its high mortality rates. Traditional experimental approaches and therapies are resource-intensive and often cause significant side effects. Anticancer peptides (ACPs) have emerged as alternative therapeutic agents owing to their selectivity, safety, and potential to mitigate drug resistance. In this paper, we propose pACPs-DNN, a novel attention mechanism-based deep learning model developed for the accurate discrimination of ACPs from non-ACPs. The pACPs-DNN model transforms input peptides into image representations using residue-wise energy contact matrix (RECM), Substitution Matrix Representation (SMR), and Position Specific Scoring Matrix (PSSM) embeddings, followed by local binary pattern (LBP)-based decomposition to capture enhanced structural and local semantic features. These transformations generate novel feature sets, including RECM_LBP, LBP_SMR, and LBP_PSSM. Subsequently, a two-tier feature selection approach is employed to identify a high-ranking optimal feature set, which is then used to train an attention-based deep neural network. The proposed pACPs-DNN model achieves a training accuracy of 96.91% and an AUC of 0.98. To evaluate its generalization capability, the model was validated on independent datasets, demonstrating improvements of 5% and 3.5% in accuracy over existing models on the Ind-I and Ind-II datasets, respectively. The demonstrated efficacy and robustness of pACPs-DNN highlight its potential as a valuable tool for advancing drug discovery and academic research in cancer-related therapeutic development.
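The local binary pattern (LBP) step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the basic 8-neighbour LBP applied to a small numeric matrix; the function names, the toy matrix, and the histogram step are illustrative assumptions and are not taken from the pACPs-DNN paper.

```python
def lbp_codes(matrix):
    """Return basic 8-neighbour LBP codes for the interior cells of a 2D list."""
    # Neighbour offsets, clockwise from the top-left cell (an arbitrary but fixed order).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    rows, cols = len(matrix), len(matrix[0])
    codes = []
    for r in range(1, rows - 1):
        row_codes = []
        for c in range(1, cols - 1):
            center = matrix[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # A neighbour that is >= the center sets the bit at this position.
                if matrix[r + dr][c + dc] >= center:
                    code |= 1 << bit
            row_codes.append(code)
        codes.append(row_codes)
    return codes


def lbp_histogram(matrix):
    """256-bin histogram of LBP codes, giving a fixed-length feature vector."""
    hist = [0] * 256
    for row in lbp_codes(matrix):
        for code in row:
            hist[code] += 1
    return hist


# Toy 3x3 matrix with made-up scores (not real PSSM, SMR, or RECM values).
toy = [[1, 5, 2],
       [4, 3, 0],
       [3, 6, 1]]
print(lbp_codes(toy))
```

The histogram turns a matrix of any size into a vector of fixed length, which is one common way LBP output is passed to a downstream classifier.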
Affiliation(s)
- Shahid
  - Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, KP 23200, Pakistan
- Maqsood Hayat
  - Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, KP 23200, Pakistan
- Ali Raza
  - Department of Computer Science, Bahria University, Islamabad 44220, Pakistan
- Shahid Akbar
  - Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, KP 23200, Pakistan; Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China
- Wajdi Alghamdi
  - Department of Information Technology, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah 21589, Saudi Arabia
- Nadeem Iqbal
  - Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, KP 23200, Pakistan
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China; Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, China
2
Dražić E, Jelušić D, Janković Bevandić P, Mauša G, Kalafatovic D. Using Machine Learning to Fast-Track Peptide Nanomaterial Discovery. ACS Nano 2025; 19:20295-20320. [PMID: 40440125 DOI: 10.1021/acsnano.5c00670] [Indexed: 06/11/2025]
Abstract
Peptides can serve as building blocks for supramolecular materials because of their unique ability to self-assemble, offering potential applications in drug delivery, tissue engineering, and nanotechnology. In this review, we describe peptide self-assembly as a sequence- and context-dependent process and its resulting complexity due to the heterogeneity of the sequences and experimental conditions, which makes cross-laboratory reproducibility a serious challenge and standardized reporting a necessity. Given the large number of possible peptide permutations, machine learning (ML) is suitable for navigating the peptide search space with the aim of reducing trial-and-error experimentation and speeding up the discovery of self-assembling peptides. However, we point out that ML is not a point-and-shoot tool that can be applied directly to any problem and requires careful consideration, domain knowledge, and proper data preparation to achieve meaningful results. In addition, we discuss the lack of negative data reported to be the main limiting factor in the effective application of ML. Considering the transformative potential of artificial intelligence, we conclude that grasping the power of large language models and generative approaches, coupled with explainability techniques, will expedite peptide nanomaterials discovery.
Affiliation(s)
- Ena Dražić
  - University of Rijeka, Center for Artificial Intelligence and Cybersecurity, 51000 Rijeka, Croatia
- Darijan Jelušić
  - University of Rijeka, Center for Artificial Intelligence and Cybersecurity, 51000 Rijeka, Croatia
  - University of Rijeka, Faculty of Engineering, 51000 Rijeka, Croatia
- Goran Mauša
  - University of Rijeka, Center for Artificial Intelligence and Cybersecurity, 51000 Rijeka, Croatia
  - University of Rijeka, Faculty of Engineering, 51000 Rijeka, Croatia
- Daniela Kalafatovic
  - University of Rijeka, Center for Artificial Intelligence and Cybersecurity, 51000 Rijeka, Croatia
  - University of Rijeka, Faculty of Engineering, 51000 Rijeka, Croatia
3
Pandey A, Chen W, Keten S. COLOR: A Compositional Linear Operation-Based Representation of Protein Sequences for Identification of Monomer Contributions to Properties. J Chem Inf Model 2025; 65:4320-4333. [PMID: 40272990 DOI: 10.1021/acs.jcim.5c00205] [Indexed: 04/26/2025]
Abstract
The properties of biological materials like proteins and nucleic acids are largely determined by their primary sequence. Certain segments in the sequence strongly influence specific functions, but identifying these segments, or so-called motifs, is challenging due to the complexity of sequential data. While deep learning (DL) models can accurately capture sequence-property relationships, the degree of nonlinearity in these models limits the assessment of monomer contributions to a property, a critical step in identifying key motifs. Recent advances in explainable AI (XAI) offer attention and gradient-based methods for estimating monomeric contributions. However, these methods are primarily applied to classification tasks, such as binding site identification, where they achieve limited accuracy (40-45%) and rely on qualitative evaluations. To address these limitations, we introduce a DL model with interpretable steps, enabling direct tracing of monomeric contributions. Inspired by the masking technique commonly used in vision and natural language processing domains, we propose a new metric (I) for quantitative analysis on datasets mainly containing distinct properties of anticancer peptides (ACP), antimicrobial peptides (AMP), and collagen. Our model exhibits 22% higher explainability than the gradient and attention-based state-of-the-art models, recognizes critical motifs (RRR, RRI, and RSS) that significantly destabilize ACPs, and identifies motifs in AMPs that are 50% more effective in converting non-AMPs to AMPs. These findings highlight the potential of our model in guiding mutation strategies for designing protein-based biomaterials.
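The masking idea referred to in this abstract can be sketched generically: replace one residue at a time with a mask token and record how much a model's score changes. This is only an occlusion-style illustration and not the paper's metric I, whose definition is not given here; `toy_score` is a hypothetical stand-in for a trained model, and the mask token "X" is an assumption.

```python
def toy_score(seq):
    """Hypothetical stand-in for a trained model: counts cationic residues (K, R)."""
    return sum(1.0 for aa in seq if aa in "KR")


def masking_contributions(seq, score_fn, mask="X"):
    """Score drop observed when each residue is replaced by the mask token."""
    base = score_fn(seq)
    contributions = []
    for i in range(len(seq)):
        # Replace only position i, leaving the rest of the sequence intact.
        masked = seq[:i] + mask + seq[i + 1:]
        contributions.append(base - score_fn(masked))
    return contributions


print(masking_contributions("GRRKA", toy_score))
```

Positions with a large score drop are the ones the scoring function depends on most; with a real model, runs of such positions would be candidate motifs.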
Affiliation(s)
- Akash Pandey
  - Department of Mechanical Engineering, Northwestern University, Evanston, Illinois 60208, United States
- Wei Chen
  - Department of Mechanical Engineering, Northwestern University, Evanston, Illinois 60208, United States
- Sinan Keten
  - Department of Mechanical Engineering, Northwestern University, Evanston, Illinois 60208, United States
  - Department of Civil and Environmental Engineering, Northwestern University, Evanston, Illinois 60208, United States
4
Varela-Quitián YF, Mendez-Rivera FE, Bernal-Estévez DA. Cationic antimicrobial peptides: potential templates for anticancer agents. Front Med (Lausanne) 2025; 12:1548603. [PMID: 40342581 PMCID: PMC12058764 DOI: 10.3389/fmed.2025.1548603] [Received: 12/19/2024] [Accepted: 04/07/2025] [Indexed: 05/11/2025] Open
Abstract
Cancer is a major global health concern and one of the leading causes of death worldwide. According to the World Health Organization (WHO), there is an urgent need for novel therapeutic agents to treat this disease. Some antimicrobial peptides (AMPs) have demonstrated activity against both microbial pathogens and cancer cells. Among these, cationic AMPs (CAMPs) have garnered significant attention because of their ability to selectively interact with the negatively charged surfaces of cancer cell membranes. CAMPs present several advantages, such as high specificity for targeting cancer cells, minimal toxicity to normal cells, reduced probability of inducing resistance, stability under physiological conditions, ease of chemical modification, and low production costs. This review focuses on CAMPs with anticancer properties, such as KLA, bovine lactoferricin derivatives, and LTX-315, and briefly explores common bioinformatics tools used in pipelines for selecting anticancer peptides (ACPs) from AMPs.
5
Cai J, Yan J, Un C, Wang Y, Campbell-Valois FX, Siu SWI. BERT-AmPEP60: A BERT-Based Transfer Learning Approach to Predict the Minimum Inhibitory Concentrations of Antimicrobial Peptides for Escherichia coli and Staphylococcus aureus. J Chem Inf Model 2025; 65:3186-3202. [PMID: 40086449 PMCID: PMC12004541 DOI: 10.1021/acs.jcim.4c01749] [Received: 09/24/2024] [Revised: 02/06/2025] [Accepted: 02/06/2025] [Indexed: 03/16/2025]
Abstract
Antimicrobial peptides (AMPs) are a promising alternative for combating bacterial drug resistance. While current computer prediction models excel at binary classification of AMPs based on sequences, there is a lack of regression methods to accurately quantify AMP activity against specific bacteria, making the identification of highly potent AMPs a challenge. Here, we present a deep learning method, BERT-AmPEP60, based on the fine-tuned Bidirectional Encoder Representations from Transformers (BERT) architecture to extract embedding features from input sequences. Using the transfer learning strategy, we built regression models to predict the minimum inhibitory concentration (MIC) of peptides for Escherichia coli (EC) and Staphylococcus aureus (SA). In five independent experiments with 10% leave-out sequences as the test sets, the optimal EC and SA models outperformed the state-of-the-art regression method and traditional machine learning methods, achieving an average mean squared error of 0.2664 and 0.3032 (log μM), respectively. They also showed a Pearson correlation coefficient of 0.7955 and 0.7530, and a Kendall correlation coefficient of 0.5797 and 0.5222, respectively. Our models outperformed existing deep learning and machine learning methods that rely on conventional sequence features. This work underscores the effectiveness of utilizing BERT with transfer learning for training quantitative AMP prediction models specific for different bacterial species. The web server of BERT-AmPEP60 can be found at https://app.cbbio.online/ampep/home. To facilitate development, the program source codes are available at https://github.com/janecai0714/AMP_regression_EC_SA.
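The three regression measures reported in this abstract (mean squared error, Pearson correlation, and Kendall correlation) can be computed as in the following sketch. The example values are made up and are not data from the paper, and the Kendall variant shown is tau-a, which ignores ties; the abstract does not say which variant the authors used.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant) pairs over all pairs, no tie correction."""
    n = len(x)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


# Made-up log-scale MIC values for illustration only.
observed = [0.5, 1.0, 1.5, 2.0]
predicted = [0.7, 0.9, 1.8, 1.6]
print(mse(observed, predicted), pearson(observed, predicted),
      kendall_tau_a(observed, predicted))
```

Reporting a rank-based measure alongside MSE is useful for potency screening, where the ordering of candidates often matters more than the exact predicted value.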
Affiliation(s)
- Jianxiu Cai
  - Faculty of Applied Sciences, Macao Polytechnic University, Rua de Luís Gonzaga Gomes, Macau SAR 99078, China
  - Institute of Science and Environment, University of Saint Joseph, Rua de Luís Gonzaga Gomes, Macau SAR 99078, China
- Jielu Yan
  - Institute of Science and Environment, University of Saint Joseph, Rua de Luís Gonzaga Gomes, Macau SAR 99078, China
  - School of Computer Science, Chongqing University, Shapingba, Chongqing 400044, China
- Chonwai Un
  - T-Rex Technology HK Limited, Unit 1017-1, 10/F, Building 19W, Hongkong Science Park, Shatin, Hong Kong, New Territories
- Yapeng Wang
  - Faculty of Applied Sciences, Macao Polytechnic University, Rua de Luís Gonzaga Gomes, Macau SAR 99078, China
- François-Xavier Campbell-Valois
  - Host-Microbe Interactions Laboratory, Center for Chemical and Synthetic Biology, Department of Chemistry and Biomolecular Sciences, University of Ottawa, Ottawa, Ontario K1N 6N5, Canada
  - Centre for Infection, Immunity, and Inflammation, University of Ottawa, Ottawa K1N 6N5, Ontario, Canada
  - Department of Biochemistry, Microbiology and Immunology, University of Ottawa, Ottawa K1N 6N5, Ontario, Canada
- Shirley W. I. Siu
  - Centre for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Rua de Luís Gonzaga Gomes, Macau SAR 99078, China
6
Asim MN, Asif T, Mehmood F, Dengel A. Peptide classification landscape: An in-depth systematic literature review on peptide types, databases, datasets, predictors architectures and performance. Comput Biol Med 2025; 188:109821. [PMID: 39987697 DOI: 10.1016/j.compbiomed.2025.109821] [Received: 09/28/2024] [Revised: 02/03/2025] [Accepted: 02/05/2025] [Indexed: 02/25/2025]
Abstract
Peptides are gaining significant attention in diverse fields; the pharmaceutical market, for example, has seen a steady rise in peptide-based therapeutics over the past six decades. Peptides have been utilized in the development of distinct applications, including inhibitors of SARS-CoV-2 and treatments for conditions like cancer and diabetes. Distinct types of peptides possess unique characteristics, and the development of peptide-specific applications requires the discrimination of one peptide type from others. To the best of our knowledge, approximately 230 Artificial Intelligence (AI)-driven applications have been developed for 22 distinct types of peptides, yet there remains significant room for the development of new predictors. This comprehensive review addresses this critical gap by providing a consolidated platform for the development of AI-driven peptide classification applications. This paper offers several key contributions. It presents the biological foundations of 22 unique peptide types and categorizes them into four main classes: Regulatory, Therapeutic, Nutritional, and Delivery Peptides. It offers an in-depth overview of 47 databases that have been used to develop peptide classification benchmark datasets. It summarizes details of 288 benchmark datasets that are used in the development of diverse types of AI-driven peptide classification applications. It provides a detailed summary of 197 sequence representation learning methods and 94 classifiers that have been used to develop 230 distinct AI-driven peptide classification applications. Across 22 distinct types of peptide classification tasks related to 288 benchmark datasets, it reports performance values of 230 AI-driven peptide classification applications. It summarizes experimental settings and various evaluation measures that have been employed to assess the performance of AI-driven peptide classification applications.
The primary focus of this manuscript is to consolidate scattered information into a single comprehensive platform. This resource will greatly assist researchers who are interested in developing new AI-driven peptide classification applications.
Affiliation(s)
- Muhammad Nabeel Asim
  - German Research Center for Artificial Intelligence, Kaiserslautern, 67663, Germany; Intelligentx GmbH (intelligentx.com), Kaiserslautern, Germany
- Tayyaba Asif
  - Department of Computer Science, Rhineland-Palatinate Technical University of Kaiserslautern-Landau, Kaiserslautern, 67663, Germany
- Faiza Mehmood
  - Department of Computer Science, Rhineland-Palatinate Technical University of Kaiserslautern-Landau, Kaiserslautern, 67663, Germany; Institute of Data Sciences, University of Engineering and Technology, Lahore, Pakistan
- Andreas Dengel
  - German Research Center for Artificial Intelligence, Kaiserslautern, 67663, Germany; Department of Computer Science, Rhineland-Palatinate Technical University of Kaiserslautern-Landau, Kaiserslautern, 67663, Germany; Intelligentx GmbH (intelligentx.com), Kaiserslautern, Germany
7
Wang S, Ma B. Anti-Cancer Peptides Identification and Activity Type Classification With Protein Sequence Pre-Training. IEEE J Biomed Health Inform 2025; 29:1692-1701. [PMID: 40048353 DOI: 10.1109/jbhi.2024.3358632] [Indexed: 03/09/2025]
Abstract
Cancer remains a significant global health challenge, responsible for millions of deaths annually. Addressing this issue necessitates the discovery of novel anti-cancer drugs. Anti-cancer peptides (ACPs), with their unique ability to selectively target cancer cells, offer new hope in discovering low side-effect anti-cancer drugs. However, the process of discovering novel ACPs is both time-consuming and costly. Therefore, there is an urgent need for a computational method that can predict whether a given peptide is an ACP and classify its specific functional types. In this paper, we introduce DUO-ACP, a model serving dual roles in ACP prediction: identification and functional type classification. DUO-ACP employs two embedding modules to acquire knowledge about global protein features and local ACP characteristics, complemented by a prediction module. When assessed on two publicly available datasets for each task, DUO-ACP surpasses all existing methods, achieving outstanding results: an ACP identification accuracy of 89.5% and a Macro-averaged AUC of 88.6% in ACP functional type classification. We further interpret the contribution of each part of our model, including the two types of embeddings as well as ensemble learning. On a new curated dataset, the prediction results of DUO-ACP closely match existing literature, highlighting DUO-ACP's generalization capabilities on previously unseen data and displaying the potential capability of discovering novel ACP.
8
Zhang R, Li Y, Jiang Q, Li Y, Cai Z, Zhang H. ESMR4FBP: A pLM-based regression prediction model for specific properties of food-derived peptides optimized multiple bionic metaheuristic algorithms. Food Chem 2025; 464:141840. [PMID: 39509883 DOI: 10.1016/j.foodchem.2024.141840] [Received: 06/07/2024] [Revised: 09/12/2024] [Accepted: 10/27/2024] [Indexed: 11/15/2024]
Abstract
Due to the growing emphasis on food safety, peptide research is increasingly focusing on food sources. Traditional methods for determining peptide properties are expensive. While artificial intelligence (AI) models can reduce cost, existing peptide models often lack accuracy. This study aimed to develop a regression model capable of predicting peptide properties. We integrated the ESM-2 model with the LSTM architecture and optimized the model structure using three metaheuristic algorithms: WOA, SSA, and HHO. Using an antioxidant tripeptide dataset, our model achieved an R² of 0.9458 and an RMSE of 0.3135, outperforming the state-of-the-art (SOTA) model by 11.66% and 50.00%, respectively. The developed model was further applied to the bitter peptide dataset, resulting in an R² of 0.8385 and an RMSE of 0.4414. These results suggest that our model has the potential to accurately predict the properties of various types of peptides.
Affiliation(s)
- Ruihao Zhang
  - College of Biosystems Engineering and Food Science, Zhejiang University, Hangzhou 310058, PR China; Future Food Laboratory, Innovation Center of Yangtze River Delta, Zhejiang University, Jiaxing 314100, PR China
- Yonghui Li
  - Department of Grain Science and Industry, Kansas State University, Manhattan, KS 66506, USA
- Qinbo Jiang
  - College of Biosystems Engineering and Food Science, Zhejiang University, Hangzhou 310058, PR China
- Yang Li
  - College of Biosystems Engineering and Food Science, Zhejiang University, Hangzhou 310058, PR China
- Zhe Cai
  - College of Biosystems Engineering and Food Science, Zhejiang University, Hangzhou 310058, PR China
- Hui Zhang
  - College of Biosystems Engineering and Food Science, Zhejiang University, Hangzhou 310058, PR China
9
Yue J, Li T, Xu J, Chen Z, Li Y, Liang S, Liu Z, Wang Y. Discovery of anticancer peptides from natural and generated sequences using deep learning. Int J Biol Macromol 2025; 290:138880. [PMID: 39706427 DOI: 10.1016/j.ijbiomac.2024.138880] [Received: 10/23/2024] [Revised: 12/10/2024] [Accepted: 12/16/2024] [Indexed: 12/23/2024]
Abstract
Anticancer peptides (ACPs) demonstrate significant potential in clinical cancer treatment due to their ability to selectively target and kill cancer cells. In recent years, numerous artificial intelligence (AI) algorithms have been developed. However, many predictive methods lack sufficient wet lab validation, thereby constraining the progress of models and impeding the discovery of novel ACPs. This study proposes a comprehensive research strategy by introducing CNBT-ACPred, an ACP prediction model based on a three-channel deep learning architecture, supported by extensive in vitro and in vivo experiments. CNBT-ACPred achieved an accuracy of 0.9554 and a Matthews Correlation Coefficient (MCC) of 0.8602. Compared to existing excellent models, CNBT-ACPred increased accuracy by at least 5 % and improved MCC by 15 %. Predictions were conducted on over 3.8 million sequences from Uniprot, along with 100,000 sequences generated by a deep generative model, ultimately identifying 37 out of 41 candidate peptides from >30 species that exhibited effective in vitro tumor inhibitory activity. Among these, tPep14 demonstrated significant anticancer effects in two mouse xenograft models without detectable toxicity. Finally, the study revealed correlations between the amino acid composition, structure, and function of the identified ACP candidates.
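Accuracy and the Matthews Correlation Coefficient (MCC), the two measures quoted in this abstract, are both derived from the binary confusion matrix. The sketch below uses the standard definitions; the counts in the example are made up and are unrelated to the CNBT-ACPred results.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Made-up confusion-matrix counts for illustration only.
print(accuracy(45, 40, 10, 5), mcc(45, 40, 10, 5))
```

MCC uses all four cells of the confusion matrix, so it stays informative on imbalanced ACP and non-ACP datasets where accuracy alone can look deceptively high.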
Affiliation(s)
- Jianda Yue
  - The National and Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha 410081, Hunan, China; Peptide and small molecule drug R&D plateform, Furong Laboratory, Hunan Normal University, Changsha 410081, Hunan, China; Institute of Interdisciplinary Studies, Hunan Normal University, Changsha 410081, Hunan, China
- Tingting Li
  - The National and Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha 410081, Hunan, China; Peptide and small molecule drug R&D plateform, Furong Laboratory, Hunan Normal University, Changsha 410081, Hunan, China; Institute of Interdisciplinary Studies, Hunan Normal University, Changsha 410081, Hunan, China
- Jiawei Xu
  - The National and Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha 410081, Hunan, China; Peptide and small molecule drug R&D plateform, Furong Laboratory, Hunan Normal University, Changsha 410081, Hunan, China; Institute of Interdisciplinary Studies, Hunan Normal University, Changsha 410081, Hunan, China
- Zihui Chen
  - The National and Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha 410081, Hunan, China; Peptide and small molecule drug R&D plateform, Furong Laboratory, Hunan Normal University, Changsha 410081, Hunan, China; Institute of Interdisciplinary Studies, Hunan Normal University, Changsha 410081, Hunan, China
- Yaqi Li
  - The National and Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha 410081, Hunan, China; Peptide and small molecule drug R&D plateform, Furong Laboratory, Hunan Normal University, Changsha 410081, Hunan, China; Institute of Interdisciplinary Studies, Hunan Normal University, Changsha 410081, Hunan, China
- Songping Liang
  - The National and Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha 410081, Hunan, China; Peptide and small molecule drug R&D plateform, Furong Laboratory, Hunan Normal University, Changsha 410081, Hunan, China; Institute of Interdisciplinary Studies, Hunan Normal University, Changsha 410081, Hunan, China
- Zhonghua Liu
  - The National and Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha 410081, Hunan, China; Peptide and small molecule drug R&D plateform, Furong Laboratory, Hunan Normal University, Changsha 410081, Hunan, China; Institute of Interdisciplinary Studies, Hunan Normal University, Changsha 410081, Hunan, China
- Ying Wang
  - The National and Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha 410081, Hunan, China; Peptide and small molecule drug R&D plateform, Furong Laboratory, Hunan Normal University, Changsha 410081, Hunan, China; Institute of Interdisciplinary Studies, Hunan Normal University, Changsha 410081, Hunan, China
10
Barroso RA, Agüero-Chapin G, Sousa R, Marrero-Ponce Y, Antunes A. Unlocking Antimicrobial Peptides: In Silico Proteolysis and Artificial Intelligence-Driven Discovery from Cnidarian Omics. Molecules 2025; 30:550. [PMID: 39942653 PMCID: PMC11820242 DOI: 10.3390/molecules30030550] [Received: 12/07/2024] [Revised: 01/20/2025] [Accepted: 01/21/2025] [Indexed: 02/16/2025] Open
Abstract
The growing challenge of antimicrobial resistance (AMR), which affects millions of people worldwide, has drawn attention to marine-derived antimicrobial peptides (AMPs) as a source of innovative solutions. Cnidarians, such as corals, sea anemones, and jellyfish, are a promising resource of these bioactive peptides due to their robust innate immune systems, yet they are still poorly explored. Hence, we employed an in silico proteolysis strategy to search for novel AMPs from omics data of 111 Cnidaria species. Millions of peptides were retrieved and screened using shallow- and deep-learning models, prioritizing AMPs with reduced toxicity and structural distinctiveness from characterized AMPs. After complex network analysis, a final dataset of 3130 singular, non-haemolytic, and non-toxic Cnidaria AMPs was identified. These unique AMPs were mined for their putative antibacterial activity, revealing 20 favourable candidates for in vitro testing against important ESKAPEE pathogens, offering potential new avenues for antibiotic development.
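In silico proteolysis, as used in this abstract, means cutting protein sequences computationally according to a protease's cleavage rule and collecting the resulting peptides. The abstract does not name the proteases involved, so the sketch below assumes the commonly cited trypsin-like rule (cleave after K or R unless the next residue is P); the function name and the example sequence are hypothetical.

```python
def digest_trypsin_like(protein, min_len=1):
    """Split a protein after K or R, except when the following residue is P."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        nxt = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and nxt != "P":
            # Cleavage site: the peptide ends at position i (inclusive).
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        # Keep the C-terminal remainder that has no trailing cleavage site.
        peptides.append(protein[start:])
    # Optionally discard fragments shorter than min_len residues.
    return [p for p in peptides if len(p) >= min_len]


# Made-up sequence for illustration only.
print(digest_trypsin_like("MKRPAGKLLR"))
```

In a screening pipeline, the fragments returned here would then be filtered by length and passed to the AMP and toxicity classifiers.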
Affiliation(s)
- Ricardo Alexandre Barroso
  - Interdisciplinary Centre of Marine and Environmental Research (CIIMAR/CIMAR), University of Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos s/n, 4450-208 Porto, Portugal
  - Department of Biology, Faculty of Sciences of University of Porto (FCUP), Rua do Campo Alegre s/n, 4169-007 Porto, Portugal
- Guillermin Agüero-Chapin
  - Interdisciplinary Centre of Marine and Environmental Research (CIIMAR/CIMAR), University of Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos s/n, 4450-208 Porto, Portugal
  - Department of Biology, Faculty of Sciences of University of Porto (FCUP), Rua do Campo Alegre s/n, 4169-007 Porto, Portugal
- Rita Sousa
  - Interdisciplinary Centre of Marine and Environmental Research (CIIMAR/CIMAR), University of Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos s/n, 4450-208 Porto, Portugal
  - Department of Biology, Faculty of Sciences of University of Porto (FCUP), Rua do Campo Alegre s/n, 4169-007 Porto, Portugal
- Yovani Marrero-Ponce
  - Facultad de Ingeniería, Universidad Panamericana, Augusto Rodin No. 498, Insurgentes Mixcoac, Benito Juárez, Ciudad de Mexico 03920, Mexico
  - Grupo de Medicina Molecular y Traslacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas, Instituto de Simulación Computacional (ISC-USFQ), Universidad San Francisco de Quito (USFQ), Diego de Robles y vía Interoceánica, Quito 170157, Ecuador
- Agostinho Antunes
  - Interdisciplinary Centre of Marine and Environmental Research (CIIMAR/CIMAR), University of Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos s/n, 4450-208 Porto, Portugal
  - Department of Biology, Faculty of Sciences of University of Porto (FCUP), Rua do Campo Alegre s/n, 4169-007 Porto, Portugal
11
Kim J, Woo J, Park JY, Kim KJ, Kim D. Deep learning for NAD/NADP cofactor prediction and engineering using transformer attention analysis in enzymes. Metab Eng 2025; 87:86-94. [PMID: 39571721 DOI: 10.1016/j.ymben.2024.11.007] [Received: 03/28/2024] [Revised: 09/25/2024] [Accepted: 11/17/2024] [Indexed: 12/13/2024]
Abstract
Understanding and manipulating the cofactor preferences of NAD(P)-dependent oxidoreductases, the most widely distributed enzyme group in nature, is increasingly crucial in bioengineering. However, large-scale identification of the cofactor preferences and the design of mutants to switch cofactor specificity remain as complex tasks. Here, we introduce DISCODE (Deep learning-based Iterative pipeline to analyze Specificity of COfactors and to Design Enzyme), a novel transformer-based deep learning model to predict NAD(P) cofactor preferences. For model training, a total of 7,132 NAD(P)-dependent enzyme sequences were collected. Leveraging whole-length sequence information, DISCODE classifies the cofactor preferences of NAD(P)-dependent oxidoreductase protein sequences without structural or taxonomic limitation. The model showed 97.4% and 97.3% of accuracy and F1 score, respectively. A notable feature of DISCODE is the interpretability of its transformer layers. Analysis of attention layers in the model enables identification of several residues that showed significantly higher attention weights. They were well aligned with structurally important residues that closely interact with NAD(P), facilitating the identification of key residues for determining cofactor specificities. These key residues showed high consistency with verified cofactor switching mutants. Integrated into an enzyme design pipeline, DISCODE coupled with attention analysis, enables a fully automated approach to redesign cofactor specificity.
Affiliation(s)
- Jaehyung Kim
- School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, 44919, Republic of Korea
| | - Jihoon Woo
- School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, 44919, Republic of Korea
| | - Joon Young Park
- School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, 44919, Republic of Korea
| | - Kyung-Jin Kim
- School of Life Sciences, BK21 FOUR KNU Creative BioResearch Group, KNU Institute of Microbiology, Kyungpook National University, Daegu, 41566, Republic of Korea
| | - Donghyuk Kim
- School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, 44919, Republic of Korea.
|
12
|
Huang G, Cao Y, Dai Q, Chen W. ACP-DPE: A Dual-Channel Deep Learning Model for Anticancer Peptide Prediction. IET Syst Biol 2025; 19:e70010. [PMID: 40119615 PMCID: PMC11928748 DOI: 10.1049/syb2.70010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2024] [Revised: 02/13/2025] [Accepted: 02/20/2025] [Indexed: 03/24/2025] Open
Abstract
Cancer is a serious and complex disease caused by uncontrolled cell growth and is becoming one of the leading causes of death worldwide. Anticancer peptides (ACPs), bioactive peptides with lower toxicity, have emerged as a promising means of effectively treating cancer. Identifying ACPs is challenging owing to the limitations of experimental conditions. To address this, we propose a dual-channel deep learning method, termed ACP-DPE, for ACP prediction. ACP-DPE consists of two parallel channels: one is an embedding layer followed by a bi-directional gated recurrent unit (Bi-GRU) module, and the other is an adaptive embedding layer followed by a dilated convolution module. The Bi-GRU module captures the peptide sequence dependencies, whereas the dilated convolution module characterises the local relationships between amino acids. Experimental results show that ACP-DPE achieves an accuracy of 82.81% and a sensitivity of 86.63%, surpassing the state-of-the-art method by 3.86% and 5.1%, respectively. These findings demonstrate the effectiveness of ACP-DPE for ACP prediction and highlight its potential as a valuable tool in cancer treatment research.
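To make the dilated convolution channel concrete, the sketch below implements a plain 1D dilated convolution in pure Python. It is a generic illustration of the operation, not the ACP-DPE layer; the kernel values and the input are toy assumptions.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (unpadded) 1D dilated convolution in cross-correlation form."""
    # Number of input positions covered by one output value.
    span = (len(kernel) - 1) * dilation + 1
    return [
        sum(kernel[k] * signal[start + k * dilation] for k in range(len(kernel)))
        for start in range(len(signal) - span + 1)
    ]

signal = [1, 2, 3, 4, 5, 6, 7]   # toy per-residue feature
kernel = [1, 0, -1]              # toy 3-tap filter

dense = dilated_conv1d(signal, kernel, dilation=1)   # each output spans 3 positions
sparse = dilated_conv1d(signal, kernel, dilation=2)  # each output spans 5 positions
```

With the same three weights, increasing the dilation widens the window each output covers, which is why dilated filters can relate residues that are further apart without adding parameters.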
Affiliation(s)
- Guohua Huang
- College of Information Science and Engineering, Shaoyang University, Shaoyang, China
- Hunan Provincial Key Laboratory of Finance & Economics Big Data Science and Technology, Hunan University of Finance and Economics, Changsha, China
| | - Yujie Cao
- College of Information Science and Engineering, Shaoyang University, Shaoyang, China
| | - Qi Dai
- College of Life Science and Medicine, Zhejiang Sci‐Tech University, Hangzhou, China
| | - Weihong Chen
- Hunan Provincial Key Laboratory of Finance & Economics Big Data Science and Technology, Hunan University of Finance and Economics, Changsha, China
|
13
|
Dhoriyani J, Bergman MT, Hall CK, You F. Integrating biophysical modeling, quantum computing, and AI to discover plastic-binding peptides that combat microplastic pollution. PNAS NEXUS 2025; 4:pgae572. [PMID: 39871828 PMCID: PMC11770337 DOI: 10.1093/pnasnexus/pgae572] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2024] [Accepted: 12/16/2024] [Indexed: 01/29/2025]
Abstract
Methods are needed to mitigate microplastic (MP) pollution and minimize its harm to the environment and human health. Given the ability of polypeptides to adsorb strongly to materials of micro- or nanometer size, plastic-binding peptides (PBPs) could help create bio-based tools for detecting, filtering, or degrading micro- and nanoplastic (MNP) pollution. However, the development of such tools is prevented by the lack of PBPs. In this work, we discover and evaluate PBPs for several common plastics by combining biophysical modeling, molecular dynamics (MD), quantum computing, and reinforcement learning. We frame peptide affinity for a given plastic through a Potts model that is a function of the amino acid sequence and then search for the amino acid sequences with the greatest predicted affinity using quantum annealing. We also use proximal policy optimization to find PBPs with a broader range of physicochemical properties, such as isoelectric point or solubility. Evaluation of the discovered PBPs in MD simulations demonstrates that the peptides have high affinity for two of the plastics: polyethylene and polypropylene. We conclude by describing how our computational approach could be paired with experimental approaches to create a nexus for designing and optimizing peptide-based tools that aid the detection, capture, or biodegradation of MPs. We thus hope that this study will aid in the fight against MP pollution.
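A Potts-style score over a sequence can be sketched as a sum of per-position terms and pairwise coupling terms. The example below is a toy under stated assumptions: the two-letter alphabet, the field and coupling values, and the convention that lower energy means stronger predicted binding are all hypothetical, and exhaustive search stands in for quantum annealing, which is only feasible here because the toy search space is tiny.

```python
from itertools import product

def potts_energy(seq, fields, couplings):
    """E(seq) = sum_i h_i(a_i) + sum_(i,j) J_ij(a_i, a_j).

    fields: one dict per position mapping residue -> h value.
    couplings: dict mapping (i, j) -> {(res_i, res_j): J value}; missing pairs count as 0.
    """
    energy = sum(fields[i][a] for i, a in enumerate(seq))
    for (i, j), table in couplings.items():
        energy += table.get((seq[i], seq[j]), 0.0)
    return energy

def best_sequence(alphabet, length, fields, couplings):
    """Brute-force minimiser (a classical stand-in for quantum annealing)."""
    return min(
        product(alphabet, repeat=length),
        key=lambda s: potts_energy(s, fields, couplings),
    )

# Hypothetical parameters for a 3-residue peptide over the alphabet {A, W}.
toy_fields = [
    {"A": 0.0, "W": -1.0},
    {"A": -0.5, "W": 0.0},
    {"A": 0.0, "W": -1.0},
]
toy_couplings = {(0, 2): {("W", "W"): -0.5}}

best = "".join(best_sequence("AW", 3, toy_fields, toy_couplings))
best_energy = potts_energy(best, toy_fields, toy_couplings)
```

For realistic peptide lengths over 20 amino acids the search space grows as 20^L, which is the motivation for using an annealer or another heuristic optimiser instead of enumeration.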
Affiliation(s)
- Jeet Dhoriyani
- Systems Engineering, College of Engineering, Cornell University, Ithaca, NY 14853, USA
| | - Michael T Bergman
- Department of Chemical and Biomolecular Engineering, North Carolina State University, Raleigh, NC 27606, USA
| | - Carol K Hall
- Department of Chemical and Biomolecular Engineering, North Carolina State University, Raleigh, NC 27606, USA
| | - Fengqi You
- Systems Engineering, College of Engineering, Cornell University, Ithaca, NY 14853, USA
- Robert Frederick Smith School of Chemical and Biomolecular Engineering, Cornell University, Ithaca, NY 14853, USA
- Cornell University AI for Science Institute, Cornell University, Ithaca, NY 14853, USA
|
14
|
Bizzotto E, Zampieri G, Treu L, Filannino P, Di Cagno R, Campanaro S. Classification of bioactive peptides: A systematic benchmark of models and encodings. Comput Struct Biotechnol J 2024; 23:2442-2452. [PMID: 38867723 PMCID: PMC11168199 DOI: 10.1016/j.csbj.2024.05.040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2024] [Revised: 05/10/2024] [Accepted: 05/22/2024] [Indexed: 06/14/2024] Open
Abstract
Bioactive peptides are short amino acid chains possessing biological activity and exerting physiological effects relevant to human health. Despite their therapeutic value, their identification remains a major problem, as it mainly relies on time-consuming in vitro tests. While bioinformatic tools for the identification of bioactive peptides are available, they are focused on specific functional classes and have not been systematically tested in realistic settings. To tackle this problem, bioactive peptide sequences and functions were gathered here from a variety of databases to generate a unified collection of bioactive peptides from microbial fermentation. This collection was organized into nine functional classes, including some previously studied and some unexplored, such as immunomodulatory, opioid, and cardiovascular peptides. Upon assessing their sequence properties, four alternative encoding methods were tested in combination with a multitude of machine learning algorithms, from basic classifiers like logistic regression to advanced algorithms like BERT. Tests on a total of 171 models showed that, while some functions are intrinsically easier to detect, no single combination of classifiers and encoders worked universally well for all classes. For this reason, we unified all the best individual models for each class and generated CICERON (Classification of bIoaCtive pEptides fRom micrObial fermeNtation), a classification tool for the functional classification of peptides. State-of-the-art classifiers were found to underperform on our realistic benchmark dataset compared to the models included in CICERON. Altogether, our work provides a tool for real-world peptide classification and can serve as a benchmark for future model development.
Affiliation(s)
- Edoardo Bizzotto
- Department of Biology, University of Padua, Via U. Bassi 58/b, Padova 35131, Italy
| | - Guido Zampieri
- Department of Biology, University of Padua, Via U. Bassi 58/b, Padova 35131, Italy
| | - Laura Treu
- Department of Biology, University of Padua, Via U. Bassi 58/b, Padova 35131, Italy
| | - Pasquale Filannino
- Department of Soil, Plant and Food Science, University of Bari Aldo Moro, Via G. Amendola 165/a, Bari 70126, Italy
| | - Raffaella Di Cagno
- Faculty of Agricultural, Environmental and Food Sciences, Free University of Bolzano, Piazza Universita, 5, Bolzano 39100, Italy
| | - Stefano Campanaro
- Department of Biology, University of Padua, Via U. Bassi 58/b, Padova 35131, Italy
|
15
|
Hashemi S, Vosough P, Taghizadeh S, Savardashtaki A. Therapeutic peptide development revolutionized: Harnessing the power of artificial intelligence for drug discovery. Heliyon 2024; 10:e40265. [PMID: 39605829 PMCID: PMC11600032 DOI: 10.1016/j.heliyon.2024.e40265] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 02/20/2024] [Revised: 10/07/2024] [Accepted: 11/07/2024] [Indexed: 11/29/2024] Open
Abstract
Due to the spread of antibiotic resistance, global attention is focused on inhibiting it and on expanding the range of effective medicinal compounds. The novel functional properties of peptides have opened up new horizons in personalized medicine. By combining artificial intelligence methods with therapeutic peptide products, the pharmaceutical and biotechnology sectors can advance drug development rapidly and reduce costs. Short-chain peptides inhibit a wide range of pathogens and have great potential for targeting diseases. To address the challenges of synthesis and sustainability, artificial intelligence methods, namely machine learning, must be integrated into their production. Learning methods can use complicated computations to select the active and toxic compounds of the drug and its metabolic activity. In this comprehensive review, we examine artificial intelligence as a potential tool for finding peptide-based drugs and for providing a more accurate analysis of peptides through the introduction of predictive databases for effective selection and development.
Affiliation(s)
- Samaneh Hashemi
- Student Research Committee, Shiraz University of Medical Sciences, Shiraz, Iran
- Department of Medical Biotechnology, School of Advanced Medical Sciences and Technologies, Shiraz University of Medical Sciences, Shiraz, Iran
| | - Parisa Vosough
- Student Research Committee, Shiraz University of Medical Sciences, Shiraz, Iran
- Department of Medical Biotechnology, School of Advanced Medical Sciences and Technologies, Shiraz University of Medical Sciences, Shiraz, Iran
| | - Saeed Taghizadeh
- Department of Medical Biotechnology, School of Advanced Medical Sciences and Technologies, Shiraz University of Medical Sciences, Shiraz, Iran
- Pharmaceutical Science Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
| | - Amir Savardashtaki
- Department of Medical Biotechnology, School of Advanced Medical Sciences and Technologies, Shiraz University of Medical Sciences, Shiraz, Iran
- Infertility Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
|
16
|
Abdelbaky I, Elhakeem M, Tayara H, Badr E, Abdul Salam M. Enhanced prediction of hemolytic activity in antimicrobial peptides using deep learning-based sequence analysis. BMC Bioinformatics 2024; 25:368. [PMID: 39604856 PMCID: PMC11603801 DOI: 10.1186/s12859-024-05983-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2024] [Accepted: 11/11/2024] [Indexed: 11/29/2024] Open
Abstract
Antimicrobial peptides (AMPs) are a promising class of antimicrobial drugs due to their broad-spectrum activity against microorganisms. However, their clinical application is limited by their potential to cause hemolysis, the destruction of red blood cells. To address this issue, we propose a deep learning model based on convolutional neural networks (CNNs) for predicting the hemolytic activity of AMPs. Peptide sequences are represented using one-hot encoding, and the CNN architecture consists of multiple convolutional and fully connected layers. The model was trained on six different datasets: HemoPI-1, HemoPI-2, HemoPI-3, RNN-Hem, Hlppredfuse, and AMP-Combined, achieving Matthews correlation coefficients of 0.9274, 0.5614, 0.6051, 0.6142, 0.8799, and 0.7484, respectively. Our model outperforms previously reported methods and can facilitate the development of novel AMPs with reduced hemolytic activity, which is crucial for their therapeutic use in treating bacterial infections.
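Two ingredients named in this abstract, one-hot encoding of peptide sequences and the Matthews correlation coefficient (MCC), can be sketched in a few lines. The padding scheme, the fixed `max_len`, and the residue ordering below are assumptions for illustration; the abstract does not specify them.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues; ordering is arbitrary

def one_hot(peptide, max_len):
    """Encode a peptide as max_len rows of 20 indicators.

    Shorter peptides are padded with all-zero rows and longer ones are truncated
    (both are assumed conventions). Non-standard residues raise ValueError.
    """
    rows = []
    for pos in range(max_len):
        row = [0] * len(AMINO_ACIDS)
        if pos < len(peptide):
            row[AMINO_ACIDS.index(peptide[pos])] = 1
        rows.append(row)
    return rows

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

encoded = one_hot("ACK", max_len=5)  # 5 x 20 matrix, last two rows are padding
```

MCC ranges from -1 to 1 and stays informative on imbalanced data, which is a common reason for reporting it alongside accuracy in hemolysis benchmarks.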
Affiliation(s)
- Ibrahim Abdelbaky
- Artificial Intelligence Department, Faculty of Computers and Artificial Intelligence, Benha University, Benha, Egypt.
| | - Mohamed Elhakeem
- Artificial Intelligence Department, Faculty of Computers and Artificial Intelligence, Benha University, Benha, Egypt.
| | - Hilal Tayara
- School of International Engineering and Science, Jeonbuk National University, Jeonju, 54896, South Korea.
| | - Elsayed Badr
- Scientific Computing Department, Faculty of Computers and Artificial Intelligence, Benha University, Benha, Egypt
- The Egyptian School of Data Science (ESDS), Benha, Egypt
- Department of Information Systems, College of Information Technology, Misr University for Science and Technology, Giza, Egypt
| | - Mustafa Abdul Salam
- Artificial Intelligence Department, Faculty of Computers and Artificial Intelligence, Benha University, Benha, Egypt
- Department of Computer Engineering and Information, College of Engineering, Wadi Ad Dwaser, Prince Sattam Bin Abdulaziz University, Al-Kharj, 16273, Saudi Arabia
|
17
|
Kalemati M, Noroozi A, Shahbakhsh A, Koohi S. ParaAntiProt provides paratope prediction using antibody and protein language models. Sci Rep 2024; 14:29141. [PMID: 39587231 PMCID: PMC11589832 DOI: 10.1038/s41598-024-80940-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/18/2024] [Accepted: 11/22/2024] [Indexed: 11/27/2024] Open
Abstract
Efficiently predicting the paratope holds immense potential for enhancing antibody design, treating cancers and other serious diseases, and advancing personalized medicine. Although traditional methods are highly accurate, they are often time-consuming, labor-intensive, and reliant on 3D structures, restricting their broader use. On the other hand, machine learning-based methods, besides relying on structural data, entail descriptor computation, consideration of diverse physicochemical properties, and feature engineering. Here, we develop a deep learning-assisted prediction method for paratope identification that relies solely on amino acid sequences and is antigen-agnostic. Built on the ProtTrans architecture and utilizing pre-trained protein and antibody language models, our method extracts efficient embeddings for paratope prediction. By incorporating positional encoding for Complementarity Determining Regions (CDRs), our model gains a deeper structural understanding, achieving remarkable performance with a 0.904 ROC AUC, 0.701 F1-score, and 0.585 MCC on benchmark datasets. In addition to yielding accurate antibody paratope predictions, our method exhibits strong performance in predicting nanobody paratopes, achieving a ROC AUC of 0.912 and a PR AUC of 0.665 on the nanobody dataset. Notably, our approach outperforms structure-based prediction methods, boasting a PR AUC of 0.731. Ablation studies, which examine the impact of each part of the model on the prediction task, show that the improvement in prediction performance obtained by applying CDR positional encoding together with CNNs depends on the specific protein and antibody language models used. These results highlight the potential of our method to advance disease understanding and aid in the discovery of new diagnostics and antibody therapies.
Affiliation(s)
- Mahmood Kalemati
- Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
| | - Alireza Noroozi
- Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
| | - Aref Shahbakhsh
- Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
| | - Somayyeh Koohi
- Department of Computer Engineering, Sharif University of Technology, Tehran, Iran.
|
18
|
Srivastava P, Steuer A, Ferri F, Nicoli A, Schultz K, Bej S, Di Pizio A, Wolkenhauer O. Bitter peptide prediction using graph neural networks. J Cheminform 2024; 16:111. [PMID: 39375808 PMCID: PMC11459932 DOI: 10.1186/s13321-024-00909-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Accepted: 09/22/2024] [Indexed: 10/09/2024] Open
Abstract
Bitter taste is an unpleasant taste modality that affects food consumption. Bitter peptides are generated during enzymatic processes that produce functional, bioactive protein hydrolysates or during the aging process of fermented products such as cheese, soybean protein, and wine. Understanding the underlying peptide sequences responsible for bitter taste can pave the way for more efficient identification of these peptides. This paper presents BitterPep-GCN, a feature-agnostic graph convolution network for bitter peptide prediction. The graph-based model learns the embedding of amino acids in the bitter peptide sequences and uses mixed pooling for bitter classification. BitterPep-GCN was benchmarked using BTP640, a publicly available bitter peptide dataset. The latent peptide embeddings generated by the trained model were used to analyze the activity of sequence motifs responsible for the bitter taste of the peptides. Particularly, we calculated the activity for individual amino acids and dipeptide, tripeptide, and tetrapeptide sequence motifs present in the peptides. Our analyses pinpoint specific amino acids, such as F, G, P, and R, as well as sequence motifs, notably tripeptide and tetrapeptide motifs containing FF, as key bitter signatures in peptides. This work not only provides a new predictor of bitter taste for a more efficient identification of bitter peptides in various food products but also gives a hint into the molecular basis of bitterness.
Scientific Contribution
Our work provides the first application of Graph Neural Networks for the prediction of peptide bitter taste. The best-developed model, BitterPep-GCN, learns the embedding of amino acids in the bitter peptide sequences and uses mixed pooling for bitter classification. The embeddings were used to analyze the sequence motifs responsible for the bitter taste.
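The motif analysis mentioned above starts from enumerating the dipeptide, tripeptide, and tetrapeptide motifs present in each peptide. The sketch below covers only that enumeration step with simple occurrence counts; it does not reproduce the embedding-based activity score of the paper, and the two example peptides are hypothetical.

```python
from collections import Counter

def motif_counts(peptides, k):
    """Count every contiguous length-k motif across a list of peptide strings."""
    counts = Counter()
    for pep in peptides:
        for i in range(len(pep) - k + 1):
            counts[pep[i:i + k]] += 1
    return counts

def motifs_containing(counts, core):
    """Keep only motifs that contain a given core, e.g. the dipeptide 'FF'."""
    return {motif: n for motif, n in counts.items() if core in motif}

toy_peptides = ["GFFP", "RFFG"]              # hypothetical sequences
tripeptides = motif_counts(toy_peptides, k=3)
ff_tripeptides = motifs_containing(tripeptides, "FF")
```

A ranking by activity would replace the raw counts with a per-motif score derived from the trained model, which is outside the scope of this sketch.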
Affiliation(s)
- Prashant Srivastava
- Institute of Computer Science, University of Rostock, 18051, Rostock, Germany
| | - Alexandra Steuer
- Section III In Silico Biology & Machine Learning, Leibniz Institute for Food Systems Biology at the Technical University of Munich, 85354, Freising, Germany
- Professorship for Chemoinformatics and Protein Modelling, TUM School of Life Sciences, Technical University of Munich, 85354, Freising, Germany
| | - Francesco Ferri
- Section III In Silico Biology & Machine Learning, Leibniz Institute for Food Systems Biology at the Technical University of Munich, 85354, Freising, Germany
- Professorship for Chemoinformatics and Protein Modelling, TUM School of Life Sciences, Technical University of Munich, 85354, Freising, Germany
| | - Alessandro Nicoli
- Section III In Silico Biology & Machine Learning, Leibniz Institute for Food Systems Biology at the Technical University of Munich, 85354, Freising, Germany
- Professorship for Chemoinformatics and Protein Modelling, TUM School of Life Sciences, Technical University of Munich, 85354, Freising, Germany
| | - Kristian Schultz
- Institute of Computer Science, University of Rostock, 18051, Rostock, Germany
| | - Saptarshi Bej
- Indian Institute of Science Education and Research Thiruvananthapuram, Maruthamala P. O, Vithura, 695551, Kerala, India
| | - Antonella Di Pizio
- Section III In Silico Biology & Machine Learning, Leibniz Institute for Food Systems Biology at the Technical University of Munich, 85354, Freising, Germany.
- Professorship for Chemoinformatics and Protein Modelling, TUM School of Life Sciences, Technical University of Munich, 85354, Freising, Germany.
| | - Olaf Wolkenhauer
- Institute of Computer Science, University of Rostock, 18051, Rostock, Germany.
- Section III In Silico Biology & Machine Learning, Leibniz Institute for Food Systems Biology at the Technical University of Munich, 85354, Freising, Germany.
|
19
|
Kilimci ZH, Yalcin M. ACP-ESM: A novel framework for classification of anticancer peptides using protein-oriented transformer approach. Artif Intell Med 2024; 156:102951. [PMID: 39173421 DOI: 10.1016/j.artmed.2024.102951] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2023] [Revised: 07/19/2024] [Accepted: 08/13/2024] [Indexed: 08/24/2024]
Abstract
Anticancer peptides (ACPs) are a class of molecules that have gained significant attention in the field of cancer research and therapy. ACPs are short chains of amino acids, the building blocks of proteins, and they possess the ability to selectively target and kill cancer cells. One of the key advantages of ACPs is their ability to selectively target cancer cells while sparing healthy cells to a greater extent. This selectivity is often attributed to differences in the surface properties of cancer cells compared to normal cells. That is why ACPs are being investigated as potential candidates for cancer therapy. ACPs may be used alone or in combination with other treatment modalities like chemotherapy and radiation therapy. While ACPs hold promise as a novel approach to cancer treatment, challenges remain, including optimizing their stability, improving selectivity, enhancing their delivery to cancer cells, coping with the continuously increasing number of peptide sequences, and developing a reliable and precise prediction model. In this work, we propose an efficient transformer-based framework that identifies ACPs with a reliable and precise prediction model. For this purpose, four different transformer models, namely ESM, ProtBERT, BioBERT, and SciBERT, are employed to detect ACPs from amino acid sequences. To demonstrate the contribution of the proposed framework, extensive experiments are carried out on datasets widely used in the literature: two versions of AntiCp2, cACP-DeepGram, and ACP-740. Experimental results show that the proposed model improves classification accuracy compared with previous studies. The proposed framework, ESM, achieves 96.45% accuracy on the AntiCp2 dataset, 97.66% accuracy on the cACP-DeepGram dataset, and 88.51% accuracy on the ACP-740 dataset, thereby setting a new state of the art. The code of the proposed framework is publicly available on GitHub (https://github.com/mstf-yalcin/acp-esm).
Affiliation(s)
- Zeynep Hilal Kilimci
- Department of Information Systems Engineering, Kocaeli University, 41001, Kocaeli, Turkey.
| | - Mustafa Yalcin
- Department of Information Systems Engineering, Kocaeli University, 41001, Kocaeli, Turkey.
|
20
|
Wang X, Wang S. ACP-PDAFF: Pretrained model and dual-channel attentional feature fusion for anticancer peptides prediction. Comput Biol Chem 2024; 112:108141. [PMID: 38996756 DOI: 10.1016/j.compbiolchem.2024.108141] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/01/2024] [Revised: 05/26/2024] [Accepted: 06/28/2024] [Indexed: 07/14/2024]
Abstract
Anticancer peptides (ACPs) have attracted significant interest as a novel method of treating cancer due to their ability to selectively kill cancer cells without damaging normal cells. Many artificial intelligence-based methods have demonstrated impressive performance in predicting ACPs. Nevertheless, the limitations of existing methods in feature engineering include handcrafted features driven by prior knowledge, insufficient feature extraction, and inefficient feature fusion. In this study, we propose ACP-PDAFF, a model based on a pretrained model and dual-channel attentional feature fusion (DAFF). Firstly, to reduce the heavy dependence on expert knowledge-based handcrafted features, binary profile features (BPF) and physicochemical properties features (PCPF) are used as inputs to the transformer model. Secondly, to learn more diverse feature information about ACPs, the pretrained model ProtBert is utilized. Thirdly, for better fusion of the different feature channels, DAFF is employed. Finally, to evaluate the performance of the model, we compare it with other methods on five benchmark datasets: the ACP-Mixed-80 dataset, the Main and Alternate datasets of AntiCP 2.0, the LEE and Independent dataset, and the ACPred-Fuse dataset. The accuracies obtained by ACP-PDAFF on the five datasets are 0.86, 0.80, 0.94, 0.97, and 0.95, respectively, higher than existing methods by 1% to 12%. Therefore, by learning rich feature information and effectively fusing different feature channels, ACP-PDAFF achieves outstanding performance. Our code and the datasets are available at https://github.com/wongsing/ACP-PDAFF.
Affiliation(s)
- Xinyi Wang
- Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming, 650504, Yunnan, China
| | - Shunfang Wang
- Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming, 650504, Yunnan, China.
|
21
|
Chung CR, Chien CY, Tang Y, Wu LC, Hsu JBK, Lu JJ, Lee TY, Bai C, Horng JT. An ensemble deep learning model for predicting minimum inhibitory concentrations of antimicrobial peptides against pathogenic bacteria. iScience 2024; 27:110718. [PMID: 39262770 PMCID: PMC11388163 DOI: 10.1016/j.isci.2024.110718] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2023] [Revised: 07/09/2024] [Accepted: 08/08/2024] [Indexed: 09/13/2024] Open
Abstract
The rise of antibiotic resistance necessitates effective alternative therapies. Antimicrobial peptides (AMPs) are promising due to their broad inhibitory effects. This study focuses on predicting the minimum inhibitory concentration (MIC) of AMPs against WHO-priority pathogens: Staphylococcus aureus ATCC 25923, Escherichia coli ATCC 25922, and Pseudomonas aeruginosa ATCC 27853. We developed a comprehensive regression model integrating AMP sequence-based and genomic features. Using eight AI-based architectures, including deep learning with protein language model embeddings, we created an ensemble model combining bi-directional long short-term memory (BiLSTM), convolutional neural network (CNN), and multi-branch model (MBM). The ensemble model showed superior performance with Pearson correlation coefficients of 0.756, 0.781, and 0.802 for the bacterial strains, demonstrating its accuracy in predicting MIC values. This work sets a foundation for future studies to enhance model performance and advance AMP applications in combating antibiotic resistance.
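The evaluation described above, combining several regressors and scoring the result with the Pearson correlation coefficient, can be sketched as follows. Unweighted averaging is an assumption made for illustration, since the abstract does not state how the BiLSTM, CNN, and MBM outputs are combined, and all numbers are toy values.

```python
import math

def ensemble_mean(predictions):
    """Average per-peptide predictions across models (one row per model)."""
    return [sum(column) / len(column) for column in zip(*predictions)]

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

# Toy predictions from three hypothetical models for three peptides.
model_outputs = [
    [1.0, 2.0, 4.0],
    [1.0, 4.0, 4.0],
    [1.0, 3.0, 4.0],
]
combined = ensemble_mean(model_outputs)
observed = [2.0, 6.0, 8.0]  # toy measured values
r = pearson(combined, observed)
```

`pearson` divides by zero when either input is constant, so a production version would need to guard that case.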
Affiliation(s)
- Chia-Ru Chung
- Department of Computer Science and Information Engineering, National Central University, Taoyuan, Taiwan
| | - Chung-Yu Chien
- Department of Computer Science and Information Engineering, National Central University, Taoyuan, Taiwan
| | - Yun Tang
- Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
| | - Li-Ching Wu
- Department of Biomedical Sciences and Engineering, National Central University, Taoyuan, Taiwan
| | - Justin Bo-Kai Hsu
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan, Taiwan
| | - Jang-Jih Lu
- Department of Laboratory Medicine, Chang Gung Memorial Hospital at Linkou, Taoyuan City, Taiwan
- School of Medicine, Chang Gung University, Taoyuan City, Taiwan
- Department of Medical Biotechnology and Laboratory Science, Chang Gung University, Taoyuan City, Taiwan
| | - Tzong-Yi Lee
- Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
- Center for Intelligent Drug Systems and Smart Biodevices (IDS2B), National Yang Ming Chiao Tung University, Hsinchu City, Taiwan
| | - Chen Bai
- Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong (Shenzhen), Shenzhen 518172, China
| | - Jorng-Tzong Horng
- Department of Computer Science and Information Engineering, National Central University, Taoyuan, Taiwan
- Department of Laboratory Medicine, Chang Gung Memorial Hospital at Linkou, Taoyuan City, Taiwan
|
22
|
Kamble P, Nagar PR, Bhakhar KA, Garg P, Sobhia ME, Naidu S, Bharatam PV. Cancer pharmacoinformatics: Databases and analytical tools. Funct Integr Genomics 2024; 24:166. [PMID: 39294509 DOI: 10.1007/s10142-024-01445-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2024] [Revised: 08/26/2024] [Accepted: 09/03/2024] [Indexed: 09/20/2024]
Abstract
Cancer is a subject of extensive investigation, and the utilization of omics technology has resulted in the generation of substantial volumes of big data in cancer research. Numerous databases are being developed to manage and organize this data effectively. These databases encompass various domains such as genomics, transcriptomics, proteomics, metabolomics, immunology, and drug discovery. The application of computational tools to various core components of pharmaceutical sciences constitutes "Pharmacoinformatics", an emerging paradigm in rational drug discovery. The three major features of pharmacoinformatics are (i) structure modelling of putative drugs and targets, (ii) compilation of databases and analysis using statistical approaches, and (iii) use of artificial intelligence/machine learning algorithms for the discovery of novel therapeutic molecules. The development, updating, and analysis of databases using statistical approaches play a pivotal role in pharmacoinformatics. Multiple software tools are associated with oncoinformatics research. This review catalogs the databases and computational tools related to cancer drug discovery and highlights their potential implications in the pharmacoinformatics of cancer.
Affiliation(s)
- Pradnya Kamble
  - Department of Pharmacoinformatics, National Institute of Pharmaceutical Education and Research, S.A.S. Nagar, Punjab, India
- Prinsa R Nagar
  - Department of Pharmacoinformatics, National Institute of Pharmaceutical Education and Research, S.A.S. Nagar, Punjab, India
- Kaushikkumar A Bhakhar
  - Department of Pharmacoinformatics, National Institute of Pharmaceutical Education and Research, S.A.S. Nagar, Punjab, India
- Prabha Garg
  - Department of Pharmacoinformatics, National Institute of Pharmaceutical Education and Research, S.A.S. Nagar, Punjab, India
- M Elizabeth Sobhia
  - Department of Pharmacoinformatics, National Institute of Pharmaceutical Education and Research, S.A.S. Nagar, Punjab, India
- Srivatsava Naidu
  - Center of Biomedical Engineering, Indian Institute of Technology Ropar, Rupnagar, Punjab, India
- Prasad V Bharatam
  - Department of Pharmacoinformatics, National Institute of Pharmaceutical Education and Research, S.A.S. Nagar, Punjab, India
  - Department of Medicinal Chemistry, National Institute of Pharmaceutical Education and Research, S.A.S. Nagar, Punjab, India
23
Cheong HH, Zuo W, Chen J, Un CW, Si YW, Wong KH, Kwok HF, Siu SWI. Identification of Anticancer Peptides from the Genome of Candida albicans: in Silico Screening, in Vitro and in Vivo Validations. J Chem Inf Model 2024; 64:6174-6189. [PMID: 39008832 DOI: 10.1021/acs.jcim.4c00501] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/17/2024]
Abstract
Anticancer peptides (ACPs) are promising future therapeutics, but their experimental discovery remains time-consuming and costly. To accelerate the discovery process, we propose a computational screening workflow to identify, filter, and prioritize peptide sequences based on predicted class probability, antitumor activity, and toxicity. The workflow was applied to identify novel ACPs with potent activity against colorectal cancer from the genome sequences of Candida albicans. As a result, four candidates were identified and validated in the HCT116 colon cancer cell line. Among them, PCa1 and PCa2 emerged as the most potent, displaying IC50 values of 3.75 and 56.06 μM, respectively, and demonstrating a 4-fold selectivity for cancer cells over normal cells. In the colon xenograft nude mice model, the administration of both peptides resulted in substantial inhibition of tumor growth without causing significant adverse effects. In conclusion, this work not only contributes a proven computational workflow for ACP discovery but also introduces two peptides, PCa1 and PCa2, as promising candidates poised for further development as targeted therapies for colon cancer. The method as a web service is available at https://app.cbbio.online/acpep/home and the source code at https://github.com/cartercheong/AcPEP_classification.git.
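The filter-and-prioritize step described in this abstract can be pictured with a minimal Python sketch. It is not the authors' code: the `Candidate` fields, the thresholds `min_prob` and `max_toxicity`, and the ranking rule are all assumptions made for illustration, and the real workflow is in the repository linked above. The `selectivity_index` helper uses the common definition (IC50 on normal cells divided by IC50 on cancer cells), which is assumed here to be what the reported "4-fold selectivity" refers to.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical record for one screened peptide (field names are assumptions)."""
    sequence: str
    acp_probability: float     # predicted probability of being an ACP, 0..1
    predicted_ic50_um: float   # predicted antitumor activity in uM; lower is more potent
    toxicity_score: float      # predicted toxicity, 0..1; lower is safer


def prioritize(candidates, min_prob=0.9, max_toxicity=0.5):
    """Filter by class probability and toxicity, then rank by predicted potency.

    The thresholds are illustrative defaults, not values from the paper.
    """
    kept = [
        c for c in candidates
        if c.acp_probability >= min_prob and c.toxicity_score <= max_toxicity
    ]
    # Most potent first; ties broken by higher ACP probability.
    return sorted(kept, key=lambda c: (c.predicted_ic50_um, -c.acp_probability))


def selectivity_index(ic50_normal_um, ic50_cancer_um):
    """Ratio of IC50 on normal cells to IC50 on cancer cells (higher is more selective)."""
    return ic50_normal_um / ic50_cancer_um
```

Ranking by predicted potency after hard filters is only one possible design; a weighted score over all three predictions would be an equally plausible reading of "prioritize".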
Affiliation(s)
- Hong-Hin Cheong
  - Department of Computer and Information Science, Faculty of Science and Technology, University of Macau, Avenida de Universidade, Taipa, Macau SAR 999078, China
- Weimin Zuo
  - Department of Biomedical Sciences, Faculty of Health Sciences, University of Macau, Avenida de Universidade, Taipa, Macau SAR 999078, China
  - Cancer Centre, Faculty of Health Sciences, University of Macau, Avenida de Universidade, Taipa, Macau SAR 999078, China
- Jiarui Chen
  - Department of Computer and Information Science, Faculty of Science and Technology, University of Macau, Avenida de Universidade, Taipa, Macau SAR 999078, China
- Chon-Wai Un
  - Department of Computer and Information Science, Faculty of Science and Technology, University of Macau, Avenida de Universidade, Taipa, Macau SAR 999078, China
- Yain-Whar Si
  - Department of Computer and Information Science, Faculty of Science and Technology, University of Macau, Avenida de Universidade, Taipa, Macau SAR 999078, China
- Koon Ho Wong
  - Department of Biomedical Sciences, Faculty of Health Sciences, University of Macau, Avenida de Universidade, Taipa, Macau SAR 999078, China
  - MoE Frontiers Science Center for Precision Oncology, University of Macau, Avenida de Universidade, Taipa, Macau SAR 999078, China
  - Cancer Centre, Faculty of Health Sciences, University of Macau, Avenida de Universidade, Taipa, Macau SAR 999078, China
- Hang Fai Kwok
  - Department of Biomedical Sciences, Faculty of Health Sciences, University of Macau, Avenida de Universidade, Taipa, Macau SAR 999078, China
  - MoE Frontiers Science Center for Precision Oncology, University of Macau, Avenida de Universidade, Taipa, Macau SAR 999078, China
  - Cancer Centre, Faculty of Health Sciences, University of Macau, Avenida de Universidade, Taipa, Macau SAR 999078, China
- Shirley W I Siu
  - Centre for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, R. de Luís Gonzaga Gomes, Macau SAR 999078, China
  - Institute of Science and Environment, University of Saint Joseph, Estrada Marginal da Ilha Verde 14-17, Macau SAR 999078, China
24
Garai S, Thomas J, Dey P, Das D. LGBM-ACp: an ensemble model for anticancer peptide prediction and in silico screening with potential drug targets. Mol Divers 2024; 28:1965-1981. [PMID: 36637711 DOI: 10.1007/s11030-023-10602-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/07/2022] [Accepted: 01/06/2023] [Indexed: 01/14/2023]
Abstract
Conventional cancer therapies are highly expensive and have serious complications. An alternative approach now emphasizes the development of small, biologically active peptides without acute toxicity. Experimental screening to find curative anticancer peptides (ACPs) often runs into multiple obstacles and is time-consuming. Consequently, an effective computational technique to identify promising ACP candidates prior to preclinical research is in high demand. This study proposed a machine-learning framework that used the light gradient-boosting machine as a classifier and two compositional and two binary profile features as input. The ensemble model displayed an accuracy, MCC, and AUROC of 97.52%, 0.91, and 0.98, respectively, which outclassed most of the existing sequence-based computational tools. A distinct dataset of peptides that are non-mutagenic, non-toxic, and do not inhibit cytochrome P-450 was used to validate the hybrid model. The most relevant ACP in the alternative dataset was compared with two standard ACPs, beta defensin 2 and cecropin-A. Molecular docking of the predicted peptide revealed that it has a strong binding affinity with twenty-five anticancer drug targets, most notably phosphoenolpyruvate carboxykinase (-7.2 kcal/mol). Additionally, molecular dynamics simulation and principal component analysis supported the stability of the peptide-receptor complex. Overall, the present findings take a step forward in rational drug design through rapid identification and screening of therapeutic peptides.
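For readers comparing the accuracy and MCC figures quoted in this abstract, the following Python sketch shows how both metrics are computed from a binary confusion matrix. It uses the standard textbook definitions and is not taken from the paper; the function names are illustrative.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    Ranges from -1 (total disagreement) to +1 (perfect prediction).
    Returns 0.0 when any marginal total is zero, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom
```

MCC is often reported alongside accuracy for ACP predictors because it stays informative when the ACP and non-ACP classes are imbalanced.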
Affiliation(s)
- Swarnava Garai
  - Department of Bioengineering, NIT Agartala, Tripura, 799046, India
- Juanit Thomas
  - Department of Bioengineering, NIT Agartala, Tripura, 799046, India
- Palash Dey
  - Civil Engineering Department, The ICFAI University, Tripura, 799210, India
- Deeplina Das
  - Department of Bioengineering, NIT Agartala, Tripura, 799046, India
25
Catacutan DB, Alexander J, Arnold A, Stokes JM. Machine learning in preclinical drug discovery. Nat Chem Biol 2024:10.1038/s41589-024-01679-1. [PMID: 39030362 DOI: 10.1038/s41589-024-01679-1] [Citation(s) in RCA: 31] [Impact Index Per Article: 31.0] [Received: 04/27/2023] [Accepted: 06/13/2024] [Indexed: 07/21/2024]
Abstract
Drug-discovery and drug-development endeavors are laborious, costly and time consuming. These programs can take upward of 12 years and cost US $2.5 billion, with a failure rate of more than 90%. Machine learning (ML) presents an opportunity to improve the drug-discovery process. Indeed, with the growing abundance of public and private large-scale biological and chemical datasets, ML techniques are becoming well positioned as useful tools that can augment the traditional drug-development process. In this Perspective, we discuss the integration of algorithmic methods throughout the preclinical phases of drug discovery. Specifically, we highlight an array of ML-based efforts, across diverse disease areas, to accelerate initial hit discovery, mechanism-of-action (MOA) elucidation and chemical property optimization. With advances in the application of ML across diverse therapeutic areas, we posit that fully ML-integrated drug-discovery pipelines will define the future of drug-development programs.
Affiliation(s)
- Denise B Catacutan
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
- Jeremie Alexander
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
- Autumn Arnold
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
- Jonathan M Stokes
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
26
Bhattarai S, Tayara H, Chong KT. Advancing Peptide-Based Cancer Therapy with AI: In-Depth Analysis of State-of-the-Art AI Models. J Chem Inf Model 2024; 64:4941-4957. [PMID: 38874445 DOI: 10.1021/acs.jcim.4c00295] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2024]
Abstract
Anticancer peptides (ACPs) play a vital role in selectively targeting and eliminating cancer cells. Evaluating and comparing predictions from various machine learning (ML) and deep learning (DL) techniques is challenging but crucial for anticancer drug research. We conducted a comprehensive analysis of 15 ML and 10 DL models, including the models released after 2022, and found that support vector machines (SVMs) with feature combination and selection significantly enhance overall performance. DL models, especially convolutional neural networks (CNNs) with light gradient boosting machine (LGBM) based feature selection approaches, demonstrate improved characterization. Assessment using a new test data set (ACP10) identifies ACPred, MLACP 2.0, AI4ACP, mACPred, and AntiCP2.0_AAC as successive optimal predictors, showcasing robust performance. Our review underscores current prediction tool limitations and advocates for an omnidirectional ACP prediction framework to propel ongoing research.
Affiliation(s)
- Sadik Bhattarai
  - Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju-si, 54896 Jeollabuk-do, South Korea
- Hilal Tayara
  - School of International Engineering and Science, Jeonbuk National University, Jeonju-si, 54896 Jeollabuk-do, South Korea
- Kil To Chong
  - Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju-si, 54896 Jeollabuk-do, South Korea
  - Advanced Electronics and Information Research Center, Jeonbuk National University, Jeonju-si, 54896 Jeollabuk-do, South Korea
27
Xu J, Ruan X, Yang J, Hu B, Li S, Hu J. SME-MFP: A novel spatiotemporal neural network with multiangle initialization embedding toward multifunctional peptides prediction. Comput Biol Chem 2024; 109:108033. [PMID: 38412804 DOI: 10.1016/j.compbiolchem.2024.108033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2023] [Revised: 01/09/2024] [Accepted: 02/17/2024] [Indexed: 02/29/2024]
Abstract
As a promising alternative to conventional antibiotic drugs in the biomedical field, functional peptides have been widely used in disease treatment owing to their low toxicity, high absorption rate, and biological activity. Recently, several machine learning methods have been developed for functional peptide prediction. However, most existing work relies heavily on statistical features, and few studies consider multifunctional peptide identification. We therefore propose SME-MFP, a novel predictor for imbalanced multi-label functional peptide datasets. First, we employ physicochemical and evolutionary information to represent the peptide sequence's initialization features from multiple perspectives. Second, the features are fused and then fed into spatial feature extractors, where residual connections and a multiscale convolutional neural network extract more discriminative features from peptide sequences of different lengths. We also design AFT-based temporal feature extractors to fully capture the global interactions of the sequences. Finally, we devise a new loss to replace the traditional cross-entropy loss and address the class imbalance problem. The results show that our framework not only enhances the model's ability to capture sequence features effectively, but also improves accuracy by 3.89% over existing methods on public peptide datasets.
Affiliation(s)
- Jing Xu
  - State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, China
- Xiaoli Ruan
  - State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, China
- Jing Yang
  - State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, China
- Bingqi Hu
  - State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, China
- Shaobo Li
  - State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, China
- Jianjun Hu
  - Department of Computer Science and Engineering, University of South Carolina, Columbia 29208, USA
28
Azad H, Akbar MY, Sarfraz J, Haider W, Riaz MN, Ali GM, Ghazanfar S. G-ACP: a machine learning approach to the prediction of therapeutic peptides for gastric cancer. J Biomol Struct Dyn 2024:1-14. [PMID: 38450672 DOI: 10.1080/07391102.2024.2323141] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2023] [Accepted: 02/15/2024] [Indexed: 03/08/2024]
Abstract
Conventional gastrointestinal (GI) cancer treatments are expensive and carry major hazards. A newer strategy places more emphasis on creating small, biologically active peptides that do not cause severe toxicity. Anticancer peptides (ACPs) are found through experimental screening, which is time-consuming and frequently fraught with difficulties. Gastric ACPs are emerging as a promising GI cancer treatment. Identifying novel gastric ACPs is crucial for a better understanding of their mechanisms of action and for the treatment of gastric cancer. Given the massive production of peptide sequences in the post-genomic era, rapid and effective computational identification of ACPs is essential. Several statistical techniques for distinguishing ACPs from non-ACPs have recently been developed. Despite significant progress, there is no specific model for the prediction of gastric ACPs, even though a specific model would predict a particular type of peptide more accurately and quickly. To address this, a reliable framework is developed for the accurate identification of gastric ACPs. The technique uses four feature encodings, namely Amino Acid Composition, Dipeptide Composition, Tripeptide Composition (TPC), and Pseudo Amino Acid Composition (PAAC), along with one hybrid encoding that combines them. Machine learning algorithms are used to build high-performance, accurate prediction tools. Feature selection is done using an ANOVA-based F score for feature pruning. Because learning algorithms differ in nature, experiments on a range of algorithms are carried out to identify the best-performing strategy. Following analysis of the empirical results, Naïve Bayes with the TPC and Hybrid feature spaces outperforms other methods, with an accuracy of 0.99 on the testing dataset. To assess model generalization, an external validation is carried out. On the external datasets, Extra Trees with PAAC features performs best, with an accuracy of 0.94. The comparison study shows that the suggested model predicts gastric ACPs more accurately and will be useful in drug development and gastric cancer research. The predictive model can be freely accessed at https://github.com/humeraazad10/G-ACP.git.
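Two of the sequence encodings named in this abstract, amino acid composition and dipeptide composition, are simple enough to sketch in a few lines of Python. The code below follows the usual textbook definitions (residue or residue-pair counts normalised by the number of positions) and is not taken from the G-ACP repository; the function names, and a `hybrid_features` helper that concatenates only these two encodings, are assumptions for illustration. The paper's hybrid also includes TPC and PAAC, which are omitted here.

```python
from itertools import product

# The 20 standard amino acids in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(seq):
    """20-dimensional vector: fraction of each standard residue in the sequence."""
    seq = seq.upper()
    if not seq:
        return [0.0] * len(AMINO_ACIDS)
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """400-dimensional vector: fraction of each adjacent residue pair."""
    seq = seq.upper()
    total = len(seq) - 1  # number of overlapping adjacent pairs
    if total <= 0:
        return [0.0] * len(DIPEPTIDES)
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(total):
        pair = seq[i:i + 2]
        if pair in counts:  # pairs with non-standard residues are skipped
            counts[pair] += 1
    return [counts[d] / total for d in DIPEPTIDES]


def hybrid_features(seq):
    """Illustrative hybrid: concatenation of the two encodings above (420 values)."""
    return amino_acid_composition(seq) + dipeptide_composition(seq)
```

Vectors like these would typically be pruned with a feature-selection score (the abstract mentions an ANOVA-based F score) before being passed to a classifier.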
Affiliation(s)
- Humera Azad
  - Department of Biosciences (Bioinformatics) Islamabad, Comsats University Islamabad, Pakistan
- Muhammad Yasir Akbar
  - National Institute for Genomics and Advanced Biotechnology (NIGAB), National Agricultural Research Center (NARC), Pakistan
- Waseem Haider
  - Department of Biosciences (Bioinformatics) Islamabad, Comsats University Islamabad, Pakistan
- Muhammad Naeem Riaz
  - National Institute for Genomics and Advanced Biotechnology (NIGAB), National Agricultural Research Center (NARC), Pakistan
- Ghulam Muhammad Ali
  - Department of Biosciences (Bioinformatics) Islamabad, Comsats University Islamabad, Pakistan
- Shakira Ghazanfar
  - National Institute for Genomics and Advanced Biotechnology (NIGAB), National Agricultural Research Center (NARC), Pakistan
29
Liu M, Wu T, Li X, Zhu Y, Chen S, Huang J, Zhou F, Liu H. ACPPfel: Explainable deep ensemble learning for anticancer peptides prediction based on feature optimization. Front Genet 2024; 15:1352504. [PMID: 38487252 PMCID: PMC10937565 DOI: 10.3389/fgene.2024.1352504] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2023] [Accepted: 02/19/2024] [Indexed: 03/17/2024] Open
Abstract
Background: Cancer is a significant global health problem that continues to cause a high number of deaths worldwide. Traditional cancer treatments often come with risks that can compromise the functionality of vital organs. As a potential alternative to these conventional therapies, anticancer peptides (ACPs) have garnered attention for their small size, high specificity, and reduced toxicity, making them a promising option for cancer treatment. Methods: However, identifying effective ACPs through wet-lab screening experiments is time-consuming and labor-intensive. To overcome this challenge, a deep ensemble learning method is constructed to predict ACPs in this study. To evaluate the reliability of the framework, four different datasets are used for training and testing. During model training, feature selection methods, feature dimensionality reduction measures, and optimization of the deep ensemble model are integrated. Finally, we explored the interpretability of the features that affected the final prediction results and built a web server platform to facilitate anticancer peptide prediction, which can be used by all researchers for further studies. This web server can be accessed at http://lmylab.online:5001/. Results: The method achieves an accuracy of 98.53% and an AUC (area under the curve) of 0.9972 on the ACPfel dataset, and it improves results on other datasets as well.
Affiliation(s)
- Mingyou Liu
  - School of Biology and Engineering (School of Health Medicine Modern Industry), Guizhou Medical University, Guiyang, China
  - Engineering Research Center of Health Medicine Biotechnology of Guizhou Province, Guizhou Medical University, Guiyang, China
- Tao Wu
  - School of Biology and Engineering (School of Health Medicine Modern Industry), Guizhou Medical University, Guiyang, China
- Xue Li
  - School of Biology and Engineering (School of Health Medicine Modern Industry), Guizhou Medical University, Guiyang, China
  - Engineering Research Center of Health Medicine Biotechnology of Guizhou Province, Guizhou Medical University, Guiyang, China
- Yingxue Zhu
  - School of Biology and Engineering (School of Health Medicine Modern Industry), Guizhou Medical University, Guiyang, China
  - Engineering Research Center of Health Medicine Biotechnology of Guizhou Province, Guizhou Medical University, Guiyang, China
- Sen Chen
  - School of Biology and Engineering (School of Health Medicine Modern Industry), Guizhou Medical University, Guiyang, China
- Jian Huang
  - School of Life Science and Technology, University of Electronic Science and Technology, Chengdu, China
  - School of Healthcare Technology, Chengdu Neusoft University, Chengdu, China
- Fengfeng Zhou
  - School of Biology and Engineering (School of Health Medicine Modern Industry), Guizhou Medical University, Guiyang, China
  - College of Computer Science and Technology, and Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China
- Hongmei Liu
  - School of Biology and Engineering (School of Health Medicine Modern Industry), Guizhou Medical University, Guiyang, China
  - Engineering Research Center of Health Medicine Biotechnology of Guizhou Province, Guizhou Medical University, Guiyang, China
  - College of Computer Science and Technology, and Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China
30
Purohit K, Reddy N, Sunna A. Exploring the Potential of Bioactive Peptides: From Natural Sources to Therapeutics. Int J Mol Sci 2024; 25:1391. [PMID: 38338676 PMCID: PMC10855437 DOI: 10.3390/ijms25031391] [Citation(s) in RCA: 36] [Impact Index Per Article: 36.0] [Received: 12/01/2023] [Revised: 01/18/2024] [Accepted: 01/21/2024] [Indexed: 02/12/2024] Open
Abstract
Bioactive peptides, specific protein fragments with positive health effects, are gaining traction in drug development for advantages like enhanced penetration, low toxicity, and rapid clearance. This comprehensive review navigates the intricate landscape of peptide science, covering discovery to functional characterization. Beginning with a peptidomic exploration of natural sources, the review emphasizes the search for novel peptides. Extraction approaches, including enzymatic hydrolysis, microbial fermentation, and specialized methods for disulfide-linked peptides, are extensively covered. Mass spectrometric analysis techniques for data acquisition and identification, such as liquid chromatography, capillary electrophoresis, untargeted peptide analysis, and bioinformatics, are thoroughly outlined. The exploration of peptide bioactivity incorporates various methodologies, from in vitro assays to in silico techniques, including advanced approaches like phage display and cell-based assays. The review also discusses the structure-activity relationship in the context of antimicrobial peptides (AMPs), ACE-inhibitory peptides (ACEs), and antioxidative peptides (AOPs). Concluding with key findings and future research directions, this interdisciplinary review serves as a comprehensive reference, offering a holistic understanding of peptides and their potential therapeutic applications.
Affiliation(s)
- Kruttika Purohit
  - School of Natural Sciences, Macquarie University, Sydney, NSW 2109, Australia
  - Australian Research Council Industrial Transformation Training Centre for Facilitated Advancement of Australia’s Bioactives (FAAB), Sydney, NSW 2109, Australia
- Narsimha Reddy
  - Australian Research Council Industrial Transformation Training Centre for Facilitated Advancement of Australia’s Bioactives (FAAB), Sydney, NSW 2109, Australia
  - School of Science, Parramatta Campus, Western Sydney University, Penrith, NSW 2751, Australia
- Anwar Sunna
  - School of Natural Sciences, Macquarie University, Sydney, NSW 2109, Australia
  - Australian Research Council Industrial Transformation Training Centre for Facilitated Advancement of Australia’s Bioactives (FAAB), Sydney, NSW 2109, Australia
  - Biomolecular Discovery Research Centre, Macquarie University, Sydney, NSW 2109, Australia
31
Zuo W, Kwok HF. Design of Bioengineered Peptides/Proteases as Anti-cancer Reagents with Integrated Omics and Machine Learning Approaches. Methods Mol Biol 2024; 2747:295-309. [PMID: 38038948 DOI: 10.1007/978-1-0716-3589-6_22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/02/2023]
Abstract
Cancer is a heterogeneous disorder of uncontrolled growth of cells, which has proven to be a major burden worldwide. Many treatment options are available for cancer therapy, yet side effects and drug resistance remain major hurdles. Therefore, it is necessary to develop novel drugs for cancer therapy. Anti-cancer peptides (ACPs) are attractive candidates with remarkable potency, low toxicity, and high specificity advantages. However, traditional experimental identification of ACPs is time-consuming and expensive. Integrated omics combined with machine learning (ML) is considered a new powerful and cost-effective strategy to discover ACPs from natural products. In this chapter, we describe in detail experimental procedures for collecting both transcriptomic and proteomic data from venoms, followed by descriptive approaches to ML prediction.
Affiliation(s)
- Weimin Zuo
  - Cancer Centre, Faculty of Health Sciences, University of Macau, Avenida de Universidade, Taipa, Macau SAR, China
  - School of Biomedical Sciences, Faculty of Health Sciences, University of Macau, Avenida de Universidade, Taipa, Macau SAR, China
- Hang Fai Kwok
  - Cancer Centre, Faculty of Health Sciences, University of Macau, Avenida de Universidade, Taipa, Macau SAR, China
  - School of Biomedical Sciences, Faculty of Health Sciences, University of Macau, Avenida de Universidade, Taipa, Macau SAR, China
  - MoE Frontiers Science Center for Precision Oncology, University of Macau, Avenida de Universidade, Taipa, Macau SAR, China
32
Chang L, Mondal A, Singh B, Martínez-Noa Y, Perez A. Revolutionizing Peptide-Based Drug Discovery: Advances in the Post-AlphaFold Era. Wiley Interdiscip Rev Comput Mol Sci 2024; 14:e1693. [PMID: 38680429 PMCID: PMC11052547 DOI: 10.1002/wcms.1693] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2023] [Accepted: 09/18/2023] [Indexed: 05/01/2024]
Abstract
Peptide-based drugs offer high specificity, potency, and selectivity. However, their inherent flexibility and differences in conformational preferences between their free and bound states create unique challenges that have hindered progress in effective drug discovery pipelines. The emergence of AlphaFold (AF) and Artificial Intelligence (AI) presents new opportunities for enhancing peptide-based drug discovery. We explore recent advancements that facilitate a successful peptide drug discovery pipeline, considering peptides' attractive therapeutic properties and strategies to enhance their stability and bioavailability. AF enables efficient and accurate prediction of peptide-protein structures, addressing a critical requirement in computational drug discovery pipelines. In the post-AF era, we are witnessing rapid progress with the potential to revolutionize peptide-based drug discovery such as the ability to rank peptide binders or classify them as binders/non-binders and the ability to design novel peptide sequences. However, AI-based methods are struggling due to the lack of well-curated datasets, for example to accommodate modified amino acids or unconventional cyclization. Thus, physics-based methods, such as docking or molecular dynamics simulations, continue to hold a complementary role in peptide drug discovery pipelines. Moreover, MD-based tools offer valuable insights into binding mechanisms, as well as the thermodynamic and kinetic properties of complexes. As we navigate this evolving landscape, a synergistic integration of AI and physics-based methods holds the promise of reshaping the landscape of peptide-based drug discovery.
Affiliation(s)
- Liwei Chang
  - Department of Chemistry, University of Florida, Gainesville, FL 32611
- Arup Mondal
  - Department of Chemistry, University of Florida, Gainesville, FL 32611
- Bhumika Singh
  - Department of Chemistry, University of Florida, Gainesville, FL 32611
- Alberto Perez
  - Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL 32611
33
Lee YJ. Examining the functional space of gut microbiome-derived peptides. Microbiologyopen 2023; 12:e1393. [PMID: 38129980 PMCID: PMC10714122 DOI: 10.1002/mbo3.1393] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2023] [Revised: 12/04/2023] [Accepted: 12/05/2023] [Indexed: 12/23/2023] Open
Abstract
The human gut microbiome contains thousands of small, novel peptides that could play a role in microbe-microbe and host-microbe interactions, contributing to human health and disease. Although these peptides have not yet been systematically characterized, computational tools can be used to elucidate the bioactivities they may have. This article proposes probing the functional space of gut microbiome-derived peptides (MDPs) using in silico approaches for three bioactivities: antimicrobial, anticancer, and nucleomodulins. Machine learning programs that support peptide and protein queries are provided for each bioactivity. Considering the biases of an activity-centric approach, activity-agnostic tools using structural and chemical similarity and target prediction are also described. Gut MDPs represent a vast functional space that can not only contribute to our understanding of microbiome interactions but potentially even serve as a source of life-changing therapeutics.
Affiliation(s)
- Ying‐Chiang J. Lee
  - Department of Molecular Biology, Princeton University, Princeton, New Jersey, USA
34
Sun M, Hu H, Pang W, Zhou Y. ACP-BC: A Model for Accurate Identification of Anticancer Peptides Based on Fusion Features of Bidirectional Long Short-Term Memory and Chemically Derived Information. Int J Mol Sci 2023; 24:15447. [PMID: 37895128 PMCID: PMC10607064 DOI: 10.3390/ijms242015447] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 08/12/2023] [Revised: 09/10/2023] [Accepted: 10/20/2023] [Indexed: 10/29/2023] Open
Abstract
Anticancer peptides (ACPs) have been proven to possess potent anticancer activities. Although computational methods have emerged for rapid ACPs identification, their accuracy still needs improvement. In this study, we propose a model called ACP-BC, a three-channel end-to-end model that utilizes various combinations of data augmentation techniques. In the first channel, features are extracted from the raw sequence using a bidirectional long short-term memory network. In the second channel, the entire sequence is converted into a chemical molecular formula, which is further simplified using Simplified Molecular Input Line Entry System notation to obtain deep abstract features through a bidirectional encoder representation transformer (BERT). In the third channel, we manually selected four effective features according to dipeptide composition, binary profile feature, k-mer sparse matrix, and pseudo amino acid composition. Notably, the application of chemical BERT in predicting ACPs is novel and successfully integrated into our model. To validate the performance of our model, we selected two benchmark datasets, ACPs740 and ACPs240. ACP-BC achieved prediction accuracy with 87% and 90% on these two datasets, respectively, representing improvements of 1.3% and 7% compared to existing state-of-the-art methods on these datasets. Therefore, systematic comparative experiments have shown that the ACP-BC can effectively identify anticancer peptides.
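As a rough illustration of one of the hand-selected features named above, dipeptide composition can be computed as the relative frequency of each ordered pair of adjacent residues. The sketch below is not the ACP-BC implementation; the function name `dipeptide_composition`, the 20-letter alphabet constant, and the example sequence are assumptions made for illustration.

```python
from itertools import product

# The 20 standard amino acids in one-letter code (assumed alphabet).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def dipeptide_composition(sequence: str) -> dict[str, float]:
    """Return the relative frequency of each of the 400 ordered residue pairs."""
    seq = sequence.upper()
    counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # pairs containing non-standard residues are skipped
            counts[pair] += 1
            total += 1
    if total == 0:
        # Sequences shorter than two valid residues yield an all-zero vector.
        return {pair: 0.0 for pair in counts}
    return {pair: n / total for pair, n in counts.items()}


# Hypothetical example peptide, used only to show the call.
features = dipeptide_composition("KLAKLAKKLAKLAK")
```

The result is a fixed-length vector of 400 values regardless of peptide length, which is why this kind of feature is convenient to concatenate with learned representations.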
Affiliation(s)
- Mingwei Sun
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China
- Haoyuan Hu
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China
- Wei Pang
- School of Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh EH14 4AS, UK
- You Zhou
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China
- College of Software, Jilin University, Changchun 130012, China
35
Yao L, Zhang Y, Li W, Chung C, Guan J, Zhang W, Chiang Y, Lee T. DeepAFP: An effective computational framework for identifying antifungal peptides based on deep learning. Protein Sci 2023; 32:e4758. [PMID: 37595093 PMCID: PMC10503419 DOI: 10.1002/pro.4758] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/05/2023] [Revised: 08/02/2023] [Accepted: 08/10/2023] [Indexed: 08/20/2023]
Abstract
Fungal infections have become a significant global health issue, affecting millions worldwide. Antifungal peptides (AFPs) have emerged as a promising alternative to conventional antifungal drugs due to their low toxicity and low propensity for inducing resistance. In this study, we developed a deep learning-based framework called DeepAFP to efficiently identify AFPs. DeepAFP fully leverages and mines composition information, evolutionary information, and physicochemical properties of peptides by employing combined kernels from multiple branches of convolutional neural network with bi-directional long short-term memory layers. In addition, DeepAFP integrates a transfer learning strategy to obtain efficient representations of peptides for improving model performance. DeepAFP demonstrates strong predictive ability on carefully curated datasets, yielding an accuracy of 93.29% and an F1-score of 93.45% on the DeepAFP-Main dataset. The experimental results show that DeepAFP outperforms existing AFP prediction tools, achieving state-of-the-art performance. Finally, we provide a downloadable AFP prediction tool to meet the demands of large-scale prediction and facilitate the usage of our framework by the public or other researchers. Our framework can accurately identify AFPs in a short time without requiring significant human and material resources, and hence can accelerate the development of AFPs as well as contribute to the treatment of fungal infections. Furthermore, our method can provide new perspectives for other biological sequence analysis tasks.
Affiliation(s)
- Lantian Yao
- Kobilka Institute of Innovative Drug Discovery, School of Medicine, The Chinese University of Hong Kong, Shenzhen, China
- School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, China
- Yuntian Zhang
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, China
- Wenshuo Li
- School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, China
- Chia‐Ru Chung
- Department of Computer Science and Information Engineering, National Central University, Taoyuan, Taiwan
- Jiahui Guan
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, China
- Wenyang Zhang
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, China
- Ying‐Chih Chiang
- Kobilka Institute of Innovative Drug Discovery, School of Medicine, The Chinese University of Hong Kong, Shenzhen, China
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, China
- Tzong‐Yi Lee
- Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
- Center for Intelligent Drug Systems and Smart Bio‐devices (IDS2B), National Yang Ming Chiao Tung University, Hsinchu, Taiwan
36
Yan J, Zhang B, Zhou M, Campbell-Valois FX, Siu SWI. A deep learning method for predicting the minimum inhibitory concentration of antimicrobial peptides against Escherichia coli using Multi-Branch-CNN and Attention. mSystems 2023; 8:e0034523. [PMID: 37431995 PMCID: PMC10506472 DOI: 10.1128/msystems.00345-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2023] [Accepted: 05/31/2023] [Indexed: 07/12/2023]
Abstract
Antimicrobial peptides (AMPs) are a promising alternative to antibiotics to combat drug resistance in pathogenic bacteria. However, the development of AMPs with high potency and specificity remains a challenge, and new tools to evaluate antimicrobial activity are needed to accelerate the discovery process. Therefore, we proposed MBC-Attention, a combination of a multi-branch convolution neural network architecture and attention mechanisms to predict the experimental minimum inhibitory concentration of peptides against Escherichia coli. The optimal MBC-Attention model achieved an average Pearson correlation coefficient (PCC) of 0.775 and a root mean squared error (RMSE) of 0.533 (log μM) in three independent tests of randomly drawn sequences from the data set. This results in a 5-12% improvement in PCC and a 6-13% improvement in RMSE compared to 17 traditional machine learning models and 2 optimally tuned models using random forest and support vector machine. Ablation studies confirmed that the two proposed attention mechanisms, global attention and local attention, contributed largely to performance improvement. IMPORTANCE Antimicrobial peptides (AMPs) are potential candidates for replacing conventional antibiotics to combat drug resistance in pathogenic bacteria. Therefore, it is necessary to evaluate the antimicrobial activity of AMPs quantitatively. However, wet-lab experiments are labor-intensive and time-consuming. To accelerate the evaluation process, we develop a deep learning method called MBC-Attention to regress the experimental minimum inhibitory concentration of AMPs against Escherichia coli. The proposed model outperforms traditional machine learning methods. Data, scripts to reproduce experiments, and the final production models are available on GitHub.
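The two metrics reported above, the Pearson correlation coefficient and the root mean squared error, can be computed as in the following minimal sketch. It is not the MBC-Attention evaluation code; the function names and the example values are hypothetical, and the inputs are assumed to be log-transformed MIC values.

```python
import math


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical measured and predicted log MIC values, for illustration only.
measured = [0.5, 1.0, 1.5, 2.0]
predicted = [0.7, 0.9, 1.6, 1.8]
pcc_value = pearson_r(measured, predicted)
rmse_value = rmse(measured, predicted)
```

A higher PCC and a lower RMSE both indicate better agreement; the two are reported together because PCC is insensitive to a constant offset in the predictions, whereas RMSE is not.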
Affiliation(s)
- Jielu Yan
- PAMI Research Group, Department of Computer and Information Science, University of Macau, Taipa, Macau, China
- Bob Zhang
- PAMI Research Group, Department of Computer and Information Science, University of Macau, Taipa, Macau, China
- Mingliang Zhou
- School of Computer Science, Chongqing University, Shapingba, Chongqing, China
- François-Xavier Campbell-Valois
- Host-Microbe Interactions Laboratory, Center for Chemical and Synthetic Biology, Department of Chemistry and Biomolecular Sciences, University of Ottawa, Ottawa, Ontario, Canada
- Centre for Infection, Immunity, and Inflammation, University of Ottawa, Ottawa, Ontario, Canada
- Department of Biochemistry, Microbiology and Immunology, University of Ottawa, Ottawa, Ontario, Canada
- Shirley W. I. Siu
- Institute of Science and Environment, University of Saint Joseph, Macau, China
37
Saini S, Rathore A, Sharma S, Saini A. Exploratory data analysis of physicochemical parameters of natural antimicrobial and anticancer peptides: Unraveling the patterns and trends for the rational design of novel peptides. BIOIMPACTS: BI 2023; 14:26438. [PMID: 38327633 PMCID: PMC10844588 DOI: 10.34172/bi.2023.26438] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2022] [Revised: 11/04/2022] [Accepted: 12/04/2022] [Indexed: 02/09/2024]
Abstract
Introduction: Peptide-based research has attained new avenues in the antibiotics and cancer drug resistance era. The basis of peptide design research lies in playing with or altering physicochemical parameters. Here in this work, we have done exploratory data analysis (EDA) of physicochemical parameters of antimicrobial peptides (AMPs) and anticancer peptides (ACPs), two promising therapeutics for microbial and cancer drug resistance to deduce patterns and trends. Methods: Briefly, we have captured the natural AMPs and ACPs data from the APD3 database. After cleaning the data manually and by CD-HIT web server, further data analysis has been done using Python-based packages, modlAMP and Pandas. We have extracted the descriptive statistics of 10 physicochemical parameters of AMPs and ACPs to build a comprehensive dataset containing all major parameters. The global analysis of datasets has been done using modlAMP to find the initial patterns in global data. The subsets of AMPs and ACPs were curated based on the length of the peptides and were analyzed by Pandas package to deduce the graphical profile of AMPs and ACPs. Results: EDA of AMPs and ACPs shows selectivity in the length and amino acid compositions. The distribution of physicochemical parameters in defined quartile ranges was observed in the descriptive statistical and graphical analysis. The preferred length range of AMPs and ACPs was found to be 21-30 amino acids, whereas few outliers in each parameter were evident after EDA analysis. Conclusion: The derived patterns from natural AMPs and ACPs can be used for the rational design of novel peptides. The statistical and graphical data distribution findings will help in combining the different parameters for potent design of novel AMPs and ACPs.
Affiliation(s)
- Sandeep Saini
- Department of Biophysics, Panjab University, Sector 25, Chandigarh 160014, India
- Department of Bioinformatics, Goswami Ganesh Dutta Sanatan Dharma College, Sector 32-C, Chandigarh 160030, India
- Aayushi Rathore
- Institute of Bioinformatics and Applied Biotechnology, Biotech Park, Bengaluru 560100, India
- Sheetal Sharma
- Department of Biophysics, Panjab University, Sector 25, Chandigarh 160014, India
- Avneet Saini
- Department of Biophysics, Panjab University, Sector 25, Chandigarh 160014, India
38
Ye J, Li A, Zheng H, Yang B, Lu Y. Machine Learning Advances in Predicting Peptide/Protein-Protein Interactions Based on Sequence Information for Lead Peptides Discovery. Adv Biol (Weinh) 2023; 7:e2200232. [PMID: 36775876 DOI: 10.1002/adbi.202200232] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 08/24/2022] [Revised: 12/30/2022] [Indexed: 02/14/2023]
Abstract
Peptides have shown increasing advantages and significant clinical value in drug discovery and development. With the development of high-throughput technologies and artificial intelligence (AI), machine learning (ML) methods for discovering new lead peptides have been expanded and incorporated into rational drug design. Predictions of peptide-protein interactions (PepPIs) and protein-protein interactions (PPIs) are both opportunities and challenges in computational biology, which will help to better understand the mechanisms of disease and provide the impetus for the discovery of lead peptides. This paper comprehensively reviews computational models for PepPI and PPI predictions. It begins with an introduction of various databases of peptide ligands and target proteins. Then it discusses data formats and feature representations for proteins and peptides. Furthermore, classical ML methods and emerging deep learning (DL) methods that can be used to train prediction models of PepPI and PPI are classified into four categories, and their advantages and disadvantages are analyzed. To assess the relative performance of different models, different validation protocols and evaluation indexes are discussed. The goal of this review is to help researchers quickly get started to develop computational frameworks using these integrated resources and eventually promote the discovery of lead peptides.
Affiliation(s)
- Jiahao Ye
- School of Medicine, Shanghai University, Shanghai, 200444, China
- An Li
- Department of Critical Care Medicine, Shanghai Tenth People's Hospital, School of Medicine, Tongji University, Shanghai, 200072, China
- Department of Biochemical Pharmacy, School of Pharmacy, Second Military Medical University, Shanghai, 200433, China
- Hao Zheng
- School of Medicine, Shanghai University, Shanghai, 200444, China
- Banghua Yang
- School of Medicine, Shanghai University, Shanghai, 200444, China
- Yiming Lu
- School of Medicine, Shanghai University, Shanghai, 200444, China
- Department of Critical Care Medicine, Shanghai Tenth People's Hospital, School of Medicine, Tongji University, Shanghai, 200072, China
- Department of Biochemical Pharmacy, School of Pharmacy, Second Military Medical University, Shanghai, 200433, China
39
Asensio-Calavia P, González-Acosta S, Otazo-Pérez A, López MR, Morales-delaNuez A, Pérez de la Lastra JM. Teleost Piscidins-In Silico Perspective of Natural Peptide Antibiotics from Marine Sources. Antibiotics (Basel) 2023; 12:antibiotics12050855. [PMID: 37237758 DOI: 10.3390/antibiotics12050855] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/29/2023] [Revised: 04/28/2023] [Accepted: 05/02/2023] [Indexed: 05/28/2023]
Abstract
Fish, like all other animals, are exposed to constant contact with microbes, both on their skin and on the surfaces of their respiratory and digestive systems. Fish have a system of non-specific immune responses that provides them with initial protection against infection and allows them to survive under normal conditions despite the presence of these potential invaders. However, fish are less protected against invading diseases than other marine vertebrates because their epidermal surface, composed primarily of living cells, lacks the keratinized skin that serves as an efficient natural barrier in other marine vertebrates. Antimicrobial peptides (AMPs) are one type of innate immune protection present in all life forms. AMPs have been shown to have a broader range of biological effects than conventional antibiotics, including antibacterial, antiviral, antiprotozoal, and antifungal effects. Although other AMPs, such as defensins and hepcidins, are found in all vertebrates and are relatively well conserved, piscidins are found exclusively in Teleost fish and are not found in any other animal. Therefore, there is less information on the expression and bioactivity of piscidins than on other AMPs. Piscidins are highly effective against Gram-positive and Gram-negative bacteria that cause disease in fish and humans and have the potential to be used as pharmacological anti-infectives in biomedicine and aquaculture. To better understand the potential benefits and limitations of using these peptides as therapeutic agents, we are conducting a comprehensive study of the Teleost piscidins included in the "reviewed" category of the UniProt database using bioinformatics tools. They all have amphipathic alpha-helical structures. The amphipathic architecture of piscidin peptides and positively charged residues influence their antibacterial activity. These alpha-helices are intriguing antimicrobial drugs due to their stability in high-salt and metal environments. New treatments for multidrug-resistant bacteria, cancer, and inflammation may be inspired by piscidin peptides.
Affiliation(s)
- Patricia Asensio-Calavia
- Biotechnology of Macromolecules Research Group, Instituto de Productos Naturales y Agrobiología (IPNA-CSIC), Avda. Astrofísico Francisco Sánchez, 3, 38206 San Cristóbal de La Laguna, Spain
- School of Doctoral and Graduate Studies, Universidad de La Laguna, Avda. Astrofísico Francisco Sánchez, SN. Edificio Calabaza-Apdo. 456, 38200 San Cristóbal de La Laguna, Spain
- Sergio González-Acosta
- Biotechnology of Macromolecules Research Group, Instituto de Productos Naturales y Agrobiología (IPNA-CSIC), Avda. Astrofísico Francisco Sánchez, 3, 38206 San Cristóbal de La Laguna, Spain
- School of Doctoral and Graduate Studies, Universidad de La Laguna, Avda. Astrofísico Francisco Sánchez, SN. Edificio Calabaza-Apdo. 456, 38200 San Cristóbal de La Laguna, Spain
- Andrea Otazo-Pérez
- Biotechnology of Macromolecules Research Group, Instituto de Productos Naturales y Agrobiología (IPNA-CSIC), Avda. Astrofísico Francisco Sánchez, 3, 38206 San Cristóbal de La Laguna, Spain
- School of Doctoral and Graduate Studies, Universidad de La Laguna, Avda. Astrofísico Francisco Sánchez, SN. Edificio Calabaza-Apdo. 456, 38200 San Cristóbal de La Laguna, Spain
- Manuel R López
- Biotechnology of Macromolecules Research Group, Instituto de Productos Naturales y Agrobiología (IPNA-CSIC), Avda. Astrofísico Francisco Sánchez, 3, 38206 San Cristóbal de La Laguna, Spain
- Antonio Morales-delaNuez
- Biotechnology of Macromolecules Research Group, Instituto de Productos Naturales y Agrobiología (IPNA-CSIC), Avda. Astrofísico Francisco Sánchez, 3, 38206 San Cristóbal de La Laguna, Spain
- José Manuel Pérez de la Lastra
- Biotechnology of Macromolecules Research Group, Instituto de Productos Naturales y Agrobiología (IPNA-CSIC), Avda. Astrofísico Francisco Sánchez, 3, 38206 San Cristóbal de La Laguna, Spain
40
Yao L, Li W, Zhang Y, Deng J, Pang Y, Huang Y, Chung CR, Yu J, Chiang YC, Lee TY. Accelerating the Discovery of Anticancer Peptides through Deep Forest Architecture with Deep Graphical Representation. Int J Mol Sci 2023; 24:ijms24054328. [PMID: 36901759 PMCID: PMC10001941 DOI: 10.3390/ijms24054328] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 01/07/2023] [Revised: 02/02/2023] [Accepted: 02/07/2023] [Indexed: 02/24/2023]
Abstract
Cancer is one of the leading diseases threatening human life and health worldwide. Peptide-based therapies have attracted much attention in recent years. Therefore, the precise prediction of anticancer peptides (ACPs) is crucial for discovering and designing novel cancer treatments. In this study, we proposed a novel machine learning framework (GRDF) that incorporates deep graphical representation and deep forest architecture for identifying ACPs. Specifically, GRDF extracts graphical features based on the physicochemical properties of peptides and integrates their evolutionary information along with binary profiles for constructing models. Moreover, we employ the deep forest algorithm, which adopts a layer-by-layer cascade architecture similar to deep neural networks, enabling excellent performance on small datasets but without complicated tuning of hyperparameters. The experiment shows GRDF exhibits state-of-the-art performance on two elaborate datasets (Set 1 and Set 2), achieving 77.12% accuracy and 77.54% F1-score on Set 1, as well as 94.10% accuracy and 94.15% F1-score on Set 2, exceeding existing ACP prediction methods. Our models exhibit greater robustness than the baseline algorithms commonly used for other sequence analysis tasks. In addition, GRDF is well-interpretable, enabling researchers to better understand the features of peptide sequences. The promising results demonstrate that GRDF is remarkably effective in identifying ACPs. Therefore, the framework presented in this study could assist researchers in facilitating the discovery of anticancer peptides and contribute to developing novel cancer treatments.
Affiliation(s)
- Lantian Yao
- Kobilka Institute of Innovative Drug Discovery, School of Medicine, The Chinese University of Hong Kong (Shenzhen), 2001 Longxiang Road, Shenzhen 518172, China
- School of Science and Engineering, The Chinese University of Hong Kong (Shenzhen), 2001 Longxiang Road, Shenzhen 518172, China
- Wenshuo Li
- School of Science and Engineering, The Chinese University of Hong Kong (Shenzhen), 2001 Longxiang Road, Shenzhen 518172, China
- Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong (Shenzhen), 2001 Longxiang Road, Shenzhen 518172, China
- Yuntian Zhang
- Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong (Shenzhen), 2001 Longxiang Road, Shenzhen 518172, China
- School of Medicine, The Chinese University of Hong Kong (Shenzhen), 2001 Longxiang Road, Shenzhen 518172, China
- Junyang Deng
- School of Medicine, The Chinese University of Hong Kong (Shenzhen), 2001 Longxiang Road, Shenzhen 518172, China
- Yuxuan Pang
- School of Science and Engineering, The Chinese University of Hong Kong (Shenzhen), 2001 Longxiang Road, Shenzhen 518172, China
- Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong (Shenzhen), 2001 Longxiang Road, Shenzhen 518172, China
- Yixian Huang
- Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong (Shenzhen), 2001 Longxiang Road, Shenzhen 518172, China
- School of Medicine, The Chinese University of Hong Kong (Shenzhen), 2001 Longxiang Road, Shenzhen 518172, China
- Chia-Ru Chung
- Kobilka Institute of Innovative Drug Discovery, School of Medicine, The Chinese University of Hong Kong (Shenzhen), 2001 Longxiang Road, Shenzhen 518172, China
- Jinhan Yu
- Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong (Shenzhen), 2001 Longxiang Road, Shenzhen 518172, China
- School of Medicine, The Chinese University of Hong Kong (Shenzhen), 2001 Longxiang Road, Shenzhen 518172, China
- Ying-Chih Chiang
- Kobilka Institute of Innovative Drug Discovery, School of Medicine, The Chinese University of Hong Kong (Shenzhen), 2001 Longxiang Road, Shenzhen 518172, China
- Correspondence: (Y.-C.C.); (T.-Y.L.)
- Tzong-Yi Lee
- Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong (Shenzhen), 2001 Longxiang Road, Shenzhen 518172, China
- Correspondence: (Y.-C.C.); (T.-Y.L.)
41
Zhang H, Saravanan KM, Wei Y, Jiao Y, Yang Y, Pan Y, Wu X, Zhang JZH. Deep Learning-Based Bioactive Therapeutic Peptide Generation and Screening. J Chem Inf Model 2023; 63:835-845. [PMID: 36724090 DOI: 10.1021/acs.jcim.2c01485] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Indexed: 02/02/2023]
Abstract
Many bioactive peptides demonstrated therapeutic effects over complicated diseases, such as antiviral, antibacterial, anticancer, etc. It is possible to generate a large number of potentially bioactive peptides using deep learning in a manner analogous to the generation of de novo chemical compounds using the acquired bioactive peptides as a training set. Such generative techniques would be significant for drug development since peptides are much easier and cheaper to synthesize than compounds. Despite the limited availability of deep learning-based peptide-generating models, we have built an LSTM model (called LSTM_Pep) to generate de novo peptides and fine-tuned the model to generate de novo peptides with specific prospective therapeutic benefits. Remarkably, the Antimicrobial Peptide Database has been effectively utilized to generate various kinds of potential active de novo peptides. We proposed a pipeline for screening those generated peptides for a given target and used the main protease of SARS-COV-2 as a proof-of-concept. Moreover, we have developed a deep learning-based protein-peptide prediction model (DeepPep) for rapid screening of the generated peptides for the given targets. Together with the generating model, we have demonstrated that iteratively fine-tuning training, generating, and screening peptides for higher-predicted binding affinity peptides can be achieved. Our work sheds light on developing deep learning-based methods and pipelines to effectively generate and obtain bioactive peptides with a specific therapeutic effect and showcases how artificial intelligence can help discover de novo bioactive peptides that can bind to a particular target.
Affiliation(s)
- Haiping Zhang
- Shenzhen Institute of Synthetic Biology, Faculty of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, Guangdong, China
- Konda Mani Saravanan
- Department of Biotechnology, Bharath Institute of Higher Education and Research, Chennai 600073, Tamil Nadu, India
- Yanjie Wei
- Center for High Performance Computing, Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, Guangdong, China
- Yang Jiao
- Faculty of Computer Science and Control Engineering, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Yang Yang
- Shenzhen Key Laboratory of Pathogen and Immunity, National Clinical Research Center for infectious disease, State Key Discipline of Infectious Disease, Shenzhen Third People's Hospital, Second Hospital Affiliated to Southern University of Science and Technology, Shenzhen 518112, China
- Yi Pan
- Center for High Performance Computing, Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, Guangdong, China
- Faculty of Computer Science and Control Engineering, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Xuli Wu
- School of Medicine, Shenzhen University, Shenzhen 518060, Guangdong, China
- John Z H Zhang
- Shenzhen Institute of Synthetic Biology, Faculty of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, Guangdong, China
- East China Normal University, Shanghai 200062, China
- NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
42
Zhu F. Amaranth proteins and peptides: Biological properties and food uses. Food Res Int 2023; 164:112405. [PMID: 36738021 DOI: 10.1016/j.foodres.2022.112405] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/27/2022] [Revised: 12/16/2022] [Accepted: 12/24/2022] [Indexed: 12/31/2022]
Abstract
Amaranthus grains have attracted great attention due to their attractive health benefits. The grains have processing properties (e.g., starch-related properties) similar to those of common cereals. Amaranth grains are gluten free and protein is a significant component of these grains. Proteins of the grains have been used in various food applications such as formulations of edible films and emulsions for controlled release of bioactive compounds. The proteins have been hydrolyzed using different enzymes to produce peptides and hydrolysates, which showed a range of biological functions including anti-hypertensive and antioxidant activities among others. They have been formulated into staple foods including breads and pastas for improved nutritional quality. This review summarizes the recent advances of the last 5 years in understanding the biological functions and food applications of proteins, protein hydrolysates and peptides from the grains of different Amaranthus species. Limitations in the studies summarized are critically discussed with an aim to improve the efficiency in amaranth grain protein and peptide research.
Affiliation(s)
- Fan Zhu
- School of Chemical Sciences, The University of Auckland, Private Bag 92019, Auckland 1142, New Zealand.
43
Ghaly G, Tallima H, Dabbish E, Badr ElDin N, Abd El-Rahman MK, Ibrahim MAA, Shoeib T. Anti-Cancer Peptides: Status and Future Prospects. Molecules 2023; 28:molecules28031148. [PMID: 36770815 PMCID: PMC9920184 DOI: 10.3390/molecules28031148] [Citation(s) in RCA: 26] [Impact Index Per Article: 13.0] [Received: 11/27/2022] [Revised: 12/26/2022] [Accepted: 01/19/2023] [Indexed: 01/26/2023]
Abstract
The dramatic rise in cancer incidence, alongside treatment deficiencies, has elevated cancer to the second-leading cause of death globally. The increasing morbidity and mortality of this disease can be traced back to a number of causes, including treatment-related side effects, drug resistance, inadequate curative treatment and tumor relapse. Recently, anti-cancer bioactive peptides (ACPs) have emerged as a potential therapeutic choice within the pharmaceutical arsenal due to their high penetration, specificity and fewer side effects. In this contribution, we present a general overview of the literature concerning the conformational structures, modes of action and membrane interaction mechanisms of ACPs, as well as provide recent examples of their successful employment as targeting ligands in cancer treatment. The use of ACPs as a diagnostic tool is summarized, and their advantages in these applications are highlighted. This review expounds on the main approaches for peptide synthesis along with their reconstruction and modification needed to enhance their therapeutic effect. Computational approaches that could predict therapeutic efficacy and suggest ACP candidates for experimental studies are discussed. Future research prospects in this rapidly expanding area are also offered.
Affiliation(s)
- Gehane Ghaly
- Department of Chemistry, The American University in Cairo, New Cairo 11835, Egypt
- Hatem Tallima
- Department of Chemistry, The American University in Cairo, New Cairo 11835, Egypt
| | - Eslam Dabbish
- Department of Chemistry, The American University in Cairo, New Cairo 11835, Egypt
| | - Norhan Badr ElDin
- Analytical Chemistry Department, Faculty of Pharmacy, Cairo University, Kasr-El Aini Street, Cairo 11562, Egypt
| | - Mohamed K. Abd El-Rahman
- Analytical Chemistry Department, Faculty of Pharmacy, Cairo University, Kasr-El Aini Street, Cairo 11562, Egypt
- Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, MA 02138, USA
| | - Mahmoud A. A. Ibrahim
- Computational Chemistry Laboratory, Chemistry Department, Faculty of Science, Minia University, Minia 61519, Egypt
- School of Health Sciences, University of Kwa-Zulu-Natal, Westville, Durban 4000, South Africa
| | - Tamer Shoeib
- Department of Chemistry, The American University in Cairo, New Cairo 11835, Egypt
- Correspondence:
| |
Collapse
|
44
|
Kordi M, Borzouyi Z, Chitsaz S, Asmaei MH, Salami R, Tabarzad M. Antimicrobial peptides with anticancer activity: Today status, trends and their computational design. Arch Biochem Biophys 2023; 733:109484. [PMID: 36473507 DOI: 10.1016/j.abb.2022.109484] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 10/25/2022] [Revised: 11/29/2022] [Accepted: 11/30/2022] [Indexed: 12/12/2022]
Abstract
Some antimicrobial peptides have been shown to inhibit the proliferation of cancer cell lines, and various strategies for treating cancers with active peptides have been pursued. According to published reports, anticancer peptides are important therapeutic peptides that act through two distinct pathways: they either create pores in the cell membrane or act on a vital intracellular target. For this review, publications up to September 2021 were extracted from Scopus and PubMed using "antimicrobial peptide" and "anticancer peptide" as keywords. In a second step, publications related to "computational design" were extracted. Publications with similar scopes were then classified and selected based on mechanism of action and application. The review summarizes the most recent advances in antimicrobial peptides with anticancer activity and discusses freely available webservers such as AntiCP, ACPP, iACP, iACP-GAEnsC, and ACPred. In conclusion, despite limitations of ACPs such as production cost and challenges, short half-life, and toxicity to normal cells, the beneficial properties of AMPs make some of them good therapeutic agents for cancer therapy. Computational methods hold a substantial position in the design of novel ACPs and are being used increasingly.
Affiliation(s)
- Masoumeh Kordi
  - Department of Plant Science and Biotechnology, School of Life Sciences and Biotechnology, Shahid Beheshti University, Tehran, Iran
- Zeynab Borzouyi
  - Department of Agriculture, School of Agriculture and Plant Breeding, Islamic Azad University, Sabzevar, Iran
- Saideh Chitsaz
  - Department of Microbiology, Islamic Azad University, Karaj, Iran
- Robab Salami
  - Department of Plant Science and Biotechnology, School of Life Sciences and Biotechnology, Shahid Beheshti University, Tehran, Iran
- Maryam Tabarzad
  - Protein Technology Research Center, Shahid Beheshti University of Medical Science, Iran
|
45
|
Yan J, Cai J, Zhang B, Wang Y, Wong DF, Siu SWI. Recent Progress in the Discovery and Design of Antimicrobial Peptides Using Traditional Machine Learning and Deep Learning. Antibiotics (Basel) 2022; 11:1451. [PMID: 36290108 PMCID: PMC9598685 DOI: 10.3390/antibiotics11101451] [Citation(s) in RCA: 41] [Impact Index Per Article: 13.7] [Received: 09/20/2022] [Revised: 10/11/2022] [Accepted: 10/13/2022] [Indexed: 11/16/2022]
Abstract
Antimicrobial resistance has become a critical global health problem due to the abuse of conventional antibiotics and the rise of multi-drug-resistant microbes. Antimicrobial peptides (AMPs) are a group of natural peptides that show promise as next-generation antibiotics due to their low toxicity to the host, their broad spectrum of biological activity (including antibacterial, antifungal, antiviral, and anti-parasitic activities), and their great therapeutic potential, such as anticancer and anti-inflammatory activity. Most importantly, AMPs kill bacteria by damaging cell membranes using multiple mechanisms of action rather than targeting a single molecule or pathway, making it difficult for bacterial drug resistance to develop. However, experimental approaches used to discover and design new AMPs are very expensive and time-consuming. In recent years, there has been considerable interest in using in silico methods, including traditional machine learning (ML) and deep learning (DL) approaches, for drug discovery. While a few papers summarize computational AMP prediction methods, none of them focuses on DL methods. In this review, we aim to survey the latest AMP prediction methods based on DL approaches. First, the biological background of AMPs is introduced; then various feature encoding methods used to represent peptide sequences are presented. We explain the most popular DL techniques and highlight recent works that use them to classify AMPs and design novel peptide sequences. Finally, we discuss the limitations and challenges of AMP prediction.
Affiliation(s)
- Jielu Yan
  - PAMI Research Group, Department of Computer and Information Science, University of Macau, Taipa, Macau, China
- Jianxiu Cai
  - Faculty of Applied Sciences, Macao Polytechnic University, Macau, China
  - Institute of Science and Environment, University of Saint Joseph, Estr. Marginal da Ilha Verde, Macau, China
- Bob Zhang
  - PAMI Research Group, Department of Computer and Information Science, University of Macau, Taipa, Macau, China
- Yapeng Wang
  - Faculty of Applied Sciences, Macao Polytechnic University, Macau, China
- Derek F. Wong
  - NLP2CT Lab, Department of Computer and Information Science, University of Macau, Taipa, Macau, China
- Shirley W. I. Siu
  - Institute of Science and Environment, University of Saint Joseph, Estr. Marginal da Ilha Verde, Macau, China
  - School of Pharmaceutical Sciences, Universiti Sains Malaysia, Pulau Pinang 11800, Malaysia
|
46
|
Abstract
The problem of human trust is one of the most fundamental problems in applied artificial intelligence in drug discovery. In silico models have been widely used to accelerate the process of drug discovery in recent years. However, most of these models can only give reliable predictions within a limited chemical space that the training set covers (applicability domain). Predictions of samples falling outside the applicability domain are unreliable and sometimes dangerous for the drug-design decision-making process. Uncertainty quantification accordingly has drawn great attention to enable autonomous drug designing. By quantifying the confidence level of model predictions, the reliability of the predictions can be quantitatively represented to assist researchers in their molecular reasoning and experimental design. Here we summarize the state-of-the-art approaches to uncertainty quantification and underline how they can be used for drug design and discovery projects. Furthermore, we also outline four representative application scenarios of uncertainty quantification in drug discovery.
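To make the applicability-domain idea concrete, the following is a minimal sketch of one simple distance-based heuristic: a query is treated as inside the domain if its nearest training sample lies within a chosen distance. This is only an illustration and is not taken from the review itself; the function names, the Euclidean metric, the toy descriptor vectors, and the threshold value are all assumptions.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_distance(query, training):
    """Distance from the query to its closest training sample."""
    return min(euclidean(query, t) for t in training)


def in_applicability_domain(query, training, threshold):
    """Hypothetical rule: trust a prediction only if the query lies within
    `threshold` of at least one training sample."""
    return nearest_distance(query, training) <= threshold


if __name__ == "__main__":
    # Toy 2-D descriptors; real models would use molecular or peptide features.
    training_set = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    for q in [(0.1, 0.1), (5.0, 5.0)]:
        print(q, in_applicability_domain(q, training_set, threshold=0.5))
```

A hard threshold like this only flags out-of-domain samples; the uncertainty quantification approaches surveyed in the abstract aim to attach a graded confidence level to each prediction instead.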
Affiliation(s)
- Jie Yu
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing 100049, China
- Dingyan Wang
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing 100049, China
- Mingyue Zheng
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing 100049, China
|
47
|
Thi Phan L, Woo Park H, Pitti T, Madhavan T, Jeon YJ, Manavalan B. MLACP 2.0: An updated machine learning tool for anticancer peptide prediction. Comput Struct Biotechnol J 2022; 20:4473-4480. [PMID: 36051870 PMCID: PMC9421197 DOI: 10.1016/j.csbj.2022.07.043] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Received: 04/18/2022] [Revised: 07/25/2022] [Accepted: 07/25/2022] [Indexed: 12/24/2022]
Abstract
We present a novel meta-approach, MLACP 2.0, and implement it as a user-friendly webserver for the accurate identification of ACPs. MLACP 2.0 employs 11 different encoding schemes and eight different classifiers, including convolutional neural networks, to create a stable meta-model. A benchmarking study demonstrated that MLACP 2.0 achieves superior performance in ACP prediction compared to publicly available state-of-the-art predictors.
Anticancer peptides are emerging anticancer drugs that offer fewer side effects and are more effective than chemotherapy and targeted therapy. Predicting anticancer peptides from sequence information is one of the most challenging tasks in immunoinformatics. In the past ten years, machine learning-based approaches have been proposed for identifying ACP activity from peptide sequences. These methods include our previous method MLACP (developed in 2017), which made a significant impact on anticancer research. The MLACP tool has been widely used by the research community; however, its robustness must be improved significantly for its continued practical application. In this study, the first large non-redundant training and independent datasets were constructed for ACP research. Using the training dataset, the study explored a wide range of feature encodings and developed their respective models using seven different conventional classifiers. Subsequently, a subset of encoding-based models was selected for each classifier based on performance; their predicted scores were concatenated and used to train a convolutional neural network (CNN), and the resulting predictor is named MLACP 2.0. Evaluation of MLACP 2.0 on a very diverse independent dataset showed excellent performance, significantly outperforming recent ACP prediction tools. Additionally, MLACP 2.0 exhibits superior performance during cross-validation and independent assessment when compared to CNN-based embedding models and conventional single models. Consequently, we anticipate that MLACP 2.0 will facilitate the design of hypothesis-driven experiments by making it easier to discover novel ACPs. MLACP 2.0 is freely available at https://balalab-skku.org/mlacp2.
|
48
|
Agüero-Chapin G, Galpert-Cañizares D, Domínguez-Pérez D, Marrero-Ponce Y, Pérez-Machado G, Teijeira M, Antunes A. Emerging Computational Approaches for Antimicrobial Peptide Discovery. Antibiotics (Basel) 2022; 11:936. [PMID: 35884190 PMCID: PMC9311958 DOI: 10.3390/antibiotics11070936] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 06/14/2022] [Revised: 07/01/2022] [Accepted: 07/08/2022] [Indexed: 02/05/2023]
Abstract
In the last two decades, many reports have addressed the application of artificial intelligence (AI) to the search for and design of antimicrobial peptides (AMPs). AI has been represented by machine learning (ML) algorithms that use sequence-based features for the discovery of new peptidic scaffolds with promising biological activity. From the AI perspective, evolutionary algorithms have also been applied to the rational generation of peptide libraries aimed at the optimization and design of AMPs. However, the literature has scarcely addressed other emerging, non-conventional in silico approaches for the search for and design of such bioactive peptides. Thus, the first motivation here is to bring up some non-standard peptide features that have been used to build classical ML predictive models. Secondly, it is valuable to highlight emerging ML algorithms and alternative computational tools for predicting and designing AMPs as well as for exploring their chemical space. Another point worth mentioning is the recent application of evolutionary algorithms that actually simulate sequence evolution, both to generate diversity-oriented peptide libraries and to optimize hit peptides. Last but not least, we include some new considerations on proteogenomic analyses currently incorporated into the computational workflow for unravelling AMPs in natural sources.
Affiliation(s)
- Guillermin Agüero-Chapin
  - CIIMAR - Centro Interdisciplinar de Investigação Marinha e Ambiental, Universidade do Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos, s/n, 4450-208 Porto, Portugal
  - Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, Rua do Campo Alegre, 4169-007 Porto, Portugal
  - Correspondence: G.A.-C. and A.A.; Tel.: +351-22-340-1813 (G.A.-C. & A.A.)
- Deborah Galpert-Cañizares
  - Departamento de Ciencia de la Computación, Universidad Central Marta Abreu de Las Villas (UCLV), Santa Clara 54830, Cuba
- Dany Domínguez-Pérez
  - CIIMAR - Centro Interdisciplinar de Investigação Marinha e Ambiental, Universidade do Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos, s/n, 4450-208 Porto, Portugal
  - Proquinorte, Unipessoal, Lda, Avenida 5 de Outubro, 124, 7º Piso, Avenidas Novas, 1050-061 Lisboa, Portugal
- Yovani Marrero-Ponce
  - Universidad San Francisco de Quito (USFQ), Grupo de Medicina Molecular y Translacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas and Instituto de Simulación Computacional (ISC-USFQ), Diego de Robles y vía Interoceánica, Quito 170157, Ecuador
- Gisselle Pérez-Machado
  - EpiDisease S.L - Spin-Off of Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), 46980 Valencia, Spain
- Marta Teijeira
  - Departamento de Química Orgánica, Facultade de Química, Universidade de Vigo, 36310 Vigo, Spain
  - Instituto de Investigación Sanitaria Galicia Sur, Hospital Álvaro Cunqueiro, 36213 Vigo, Spain
- Agostinho Antunes
  - CIIMAR - Centro Interdisciplinar de Investigação Marinha e Ambiental, Universidade do Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos, s/n, 4450-208 Porto, Portugal
  - Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, Rua do Campo Alegre, 4169-007 Porto, Portugal
  - Correspondence: G.A.-C. and A.A.; Tel.: +351-22-340-1813 (G.A.-C. & A.A.)
|
49
|
Otović E, Njirjak M, Kalafatovic D, Mauša G. Sequential Properties Representation Scheme for Recurrent Neural Network-Based Prediction of Therapeutic Peptides. J Chem Inf Model 2022; 62:2961-2972. [PMID: 35704881 DOI: 10.1021/acs.jcim.2c00526] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
The discovery of therapeutic peptides is often accelerated by means of virtual screening supported by machine learning-based predictive models. The predictive performance of such models is sensitive to the choice of data and its representation scheme. While the peptide physicochemical and compositional representations fail to distinguish sequence permutations, the amino acid arrangement within the sequence lacks the important information contained in physicochemical, conformational, topological, and geometrical properties. In this paper, we propose a solution to the identified information gap by implementing a hybrid scheme that complements the best traits from both approaches with the aim of predicting antimicrobial and antiviral activities based on experimental data from DRAMP 2.0, AVPdb, and Uniprot data repositories. Using the Friedman test of statistical significance, we compared our hybrid, sequential properties approach to peptide properties, one-hot vector encoding, and word embedding schemes in the 10-fold cross-validation setting, with respect to the F1 score, Matthews correlation coefficient, geometric mean, recall, and precision evaluation metrics. Moreover, the sequence modeling neural network was employed to gain insight into the synergic effect of both properties- and amino acid order-based predictions. The results suggest that sequential properties significantly (P < 0.01) surpasses the aforementioned state-of-the-art representation schemes. This makes it a strong candidate for increasing the predictive power of screening methods based on machine learning, applicable to any category of peptides.
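The evaluation metrics named in this abstract can all be derived from the four counts of a binary confusion matrix. The sketch below shows one way to compute them. It is an illustration only and not the authors' code: the function name is hypothetical, the counts in the usage line are made up, and the geometric mean is assumed to be the square root of sensitivity times specificity, which the abstract does not spell out.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute precision, recall, F1, MCC and geometric mean from
    confusion-matrix counts (true/false positives and negatives)."""

    def safe_div(num, den):
        # Return 0.0 when a ratio is undefined (zero denominator).
        return num / den if den else 0.0

    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)        # also called sensitivity
    specificity = safe_div(tn, tn + fp)
    f1 = safe_div(2 * precision * recall, precision + recall)
    mcc = safe_div(
        tp * tn - fp * fn,
        math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    )
    # Assumed definition: sqrt(sensitivity * specificity).
    g_mean = math.sqrt(recall * specificity)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
        "g_mean": g_mean,
    }


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    print(binary_metrics(tp=40, fp=10, tn=45, fn=5))
```

In a 10-fold cross-validation setting such as the one described, these values would be computed once per fold and then compared across representation schemes.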
Affiliation(s)
- Erik Otović
  - University of Rijeka, Faculty of Engineering, 51000 Rijeka, Croatia
- Marko Njirjak
  - University of Rijeka, Faculty of Engineering, 51000 Rijeka, Croatia
- Daniela Kalafatovic
  - University of Rijeka, Department of Biotechnology, 51000 Rijeka, Croatia
  - University of Rijeka, Center for Artificial Intelligence and Cybersecurity, 51000 Rijeka, Croatia
- Goran Mauša
  - University of Rijeka, Faculty of Engineering, 51000 Rijeka, Croatia
  - University of Rijeka, Center for Artificial Intelligence and Cybersecurity, 51000 Rijeka, Croatia
|
50
|
Yan J, Zhang B, Zhou M, Kwok HF, Siu SWI. Multi-Branch-CNN: Classification of ion channel interacting peptides using multi-branch convolutional neural network. Comput Biol Med 2022; 147:105717. [PMID: 35752114 DOI: 10.1016/j.compbiomed.2022.105717] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 01/09/2022] [Revised: 05/18/2022] [Accepted: 06/05/2022] [Indexed: 11/03/2022]
Abstract
Ligand peptides that have high affinity for ion channels are critical for regulating ion flux across the plasma membrane. These peptides are now being considered as potential drug candidates for many diseases, such as cardiovascular disease and cancers. In this work, we developed Multi-Branch-CNN, a CNN method with multiple input branches for identifying three types of ion channel peptide binders (sodium, potassium, and calcium) from intra- and inter-feature types. As for its real-world applications, prediction models that are able to recognize novel sequences having high or low similarities to training sequences are required. To this end, we tested our models on two test sets: a general test set including sequences spanning different similarity levels to those of the training set, and a novel-test set consisting of only sequences that bear little resemblance to sequences from the training set. Our experiments showed that the Multi-Branch-CNN method performs better than thirteen traditional ML algorithms (TML13), yielding an improvement in accuracy of 3.2%, 1.2%, and 2.3% on the test sets as well as 8.8%, 14.3%, and 14.6% on the novel-test sets for sodium, potassium, and calcium ion channels, respectively. We confirmed the effectiveness of Multi-Branch-CNN by comparing it to the standard CNN method with one input branch (Single-Branch-CNN) and an ensemble method (TML13-Stack). The data sets, script files to reproduce the experiments, and the final predictive models are freely available at https://github.com/jieluyan/Multi-Branch-CNN.
Affiliation(s)
- Jielu Yan
  - PAMI Research Group, Department of Computer and Information Science, University of Macau, Taipa, Macao Special Administrative Region of China
- Bob Zhang
  - PAMI Research Group, Department of Computer and Information Science, University of Macau, Taipa, Macao Special Administrative Region of China
- Mingliang Zhou
  - School of Computer Science, Chongqing University, Shapingba, Chongqing, China
- Hang Fai Kwok
  - Department of Biomedical Sciences, Faculty of Health Sciences, University of Macau, Taipa, Macao Special Administrative Region of China
- Shirley W I Siu
  - Department of Computer and Information Science, University of Macau, Taipa, Macao Special Administrative Region of China
  - Institute of Science and Environment, University of Saint Joseph, Estr. Marginal da Ilha Verde, Macao Special Administrative Region of China
|