1
Abbas Z, Kim S, Lee N, Kazmi SAW, Lee SW. A robust ensemble framework for anticancer peptide classification using multi-model voting approach. Comput Biol Med 2025; 188:109750. [PMID: 40032410 DOI: 10.1016/j.compbiomed.2025.109750] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2024] [Revised: 01/14/2025] [Accepted: 01/22/2025] [Indexed: 03/05/2025]
Abstract
Anticancer peptides (ACPs) hold great potential for cancer therapeutics, yet accurately identifying them remains a challenging task due to the complexity of peptide sequences and their interactions with biological systems. In this study, we propose a novel machine learning-based framework for ACP classification, integrating multiple feature sets, including sequence composition, physicochemical properties, and embedding features derived from pre-trained language models. We evaluate the performance of various classifiers on benchmark datasets and compare our model against state-of-the-art methods. The results demonstrate that our model outperforms existing methods such as UniDL4BioPep, ACPred-Fuse, and iACP with an accuracy of 75.58%, an AUC of 0.8272, and an MCC of 0.5119. Our approach provides a more balanced sensitivity of 0.7384 and specificity of 0.773, ensuring robust identification of both ACPs and non-ACPs. These findings suggest that incorporating diverse feature sets can significantly enhance ACP classification, potentially facilitating the discovery of novel anticancer peptides for therapeutic applications.
Affiliation(s)
- Zeeshan Abbas
  - Department of Precision Medicine, Sungkyunkwan University School of Medicine, Suwon, Republic of Korea; Department of Artificial Intelligence, Sungkyunkwan University, Suwon 16419, Republic of Korea
- Sunyeup Kim
  - Department of Precision Medicine, Sungkyunkwan University School of Medicine, Suwon, Republic of Korea
- Nangkyeong Lee
  - Department of Precision Medicine, Sungkyunkwan University School of Medicine, Suwon, Republic of Korea
- Seung Won Lee
  - Department of Precision Medicine, Sungkyunkwan University School of Medicine, Suwon, Republic of Korea; Department of Artificial Intelligence, Sungkyunkwan University, Suwon 16419, Republic of Korea; Department of Metabiohealth, Sungkyunkwan University, Suwon 16419, Republic of Korea; Personalized Cancer Immunotherapy Research Center, Sungkyunkwan University School of Medicine, Suwon 16419, Republic of Korea.
2
Asim MN, Asif T, Mehmood F, Dengel A. Peptide classification landscape: An in-depth systematic literature review on peptide types, databases, datasets, predictors architectures and performance. Comput Biol Med 2025; 188:109821. [PMID: 39987697 DOI: 10.1016/j.compbiomed.2025.109821] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2024] [Revised: 02/03/2025] [Accepted: 02/05/2025] [Indexed: 02/25/2025]
Abstract
Peptides are gaining significant attention in diverse fields, and the pharmaceutical market has seen a steady rise in peptide-based therapeutics over the past six decades. Peptides have been utilized in the development of distinct applications, including inhibitors of SARS-CoV-2 and treatments for conditions like cancer and diabetes. Distinct types of peptides possess unique characteristics, and the development of peptide-specific applications requires the discrimination of one peptide type from others. To the best of our knowledge, approximately 230 Artificial Intelligence (AI) driven applications have been developed for 22 distinct types of peptides, yet there remains significant room for the development of new predictors. This comprehensive review addresses this critical gap by providing a consolidated platform for the development of AI-driven peptide classification applications. The paper offers several key contributions. It presents the biological foundations of 22 unique peptide types and categorizes them into four main classes: Regulatory, Therapeutic, Nutritional, and Delivery Peptides. It offers an in-depth overview of 47 databases that have been used to develop peptide classification benchmark datasets. It summarizes details of 288 benchmark datasets that are used in the development of diverse types of AI-driven peptide classification applications. It provides a detailed summary of 197 sequence representation learning methods and 94 classifiers that have been used to develop 230 distinct AI-driven peptide classification applications. Across 22 distinct types of peptide classification tasks related to 288 benchmark datasets, it reports the performance values of 230 AI-driven peptide classification applications. It summarizes experimental settings and various evaluation measures that have been employed to assess the performance of AI-driven peptide classification applications. The primary focus of this manuscript is to consolidate scattered information into a single comprehensive platform. This resource will greatly assist researchers who are interested in developing new AI-driven peptide classification applications.
Affiliation(s)
- Muhammad Nabeel Asim
  - German Research Center for Artificial Intelligence, Kaiserslautern, 67663, Germany; Intelligentx GmbH (intelligentx.com), Kaiserslautern, Germany.
- Tayyaba Asif
  - Department of Computer Science, Rhineland-Palatinate Technical University of Kaiserslautern-Landau, Kaiserslautern, 67663, Germany
- Faiza Mehmood
  - Department of Computer Science, Rhineland-Palatinate Technical University of Kaiserslautern-Landau, Kaiserslautern, 67663, Germany; Institute of Data Sciences, University of Engineering and Technology, Lahore, Pakistan
- Andreas Dengel
  - German Research Center for Artificial Intelligence, Kaiserslautern, 67663, Germany; Department of Computer Science, Rhineland-Palatinate Technical University of Kaiserslautern-Landau, Kaiserslautern, 67663, Germany; Intelligentx GmbH (intelligentx.com), Kaiserslautern, Germany
3
Cao J, Zhou W, Yu Q, Ji J, Zhang J, He S, Zhu Z. MDTL-ACP: Anticancer Peptides Prediction Based on Multi-Domain Transfer Learning. IEEE J Biomed Health Inform 2025; 29:1714-1725. [PMID: 38147420 DOI: 10.1109/jbhi.2023.3347138] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/28/2023]
Abstract
Anticancer peptides (ACPs) have emerged as one of the most promising therapeutic agents for cancer treatment. They are bioactive peptides featuring broad-spectrum activity and low drug-resistance. The discovery of ACPs via traditional biochemical methods is laborious and costly. Accordingly, various computational methods have been developed to facilitate the discovery of ACPs. However, the data resources and knowledge of ACPs are still very scarce, and only a few of them are clinically verified, which limits the competence of computational methods. To address this issue, in this article, we propose an ACP prediction model based on multi-domain transfer learning, namely MDTL-ACP, to discriminate novel ACPs from plentiful inactive peptides. In particular, we collect abundant antimicrobial peptides (AMPs) from four well-studied peptide domains and extract their inherent features as the input of MDTL-ACP. The features learned from multiple source domains of AMPs are then transferred into the target prediction task of ACPs via artificial neural network-based shared-extractor and task-specific classifiers in MDTL-ACP. The knowledge captured in the transferred features enhances the prediction of ACPs in the target domain. Experimental results demonstrate that MDTL-ACP can outperform the traditional and state-of-the-art ACP prediction methods.
4
Wang X, Zhang Z, Liu C. iACP-DFSRA: Identification of Anticancer Peptides Based on a Dual-channel Fusion Strategy of ResCNN and Attention. J Mol Biol 2024; 436:168810. [PMID: 39362624 DOI: 10.1016/j.jmb.2024.168810] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2024] [Revised: 09/10/2024] [Accepted: 09/27/2024] [Indexed: 10/05/2024]
Abstract
Anticancer peptides (ACPs) have been widely applied in the treatment of cancer owing to good safety, rational side effects, and high selectivity. However, the number of ACPs that have been experimentally validated is limited, as identification of ACPs is extremely expensive. Hence, accurate and cost-effective identification methods for ACPs are urgently needed. In this work, we proposed a deep learning-based model, named iACP-DFSRA, for ACP identification. Specifically, we adopted two kinds of sequence encoding, the ProtBert_BFD pre-trained language model and handcrafted features, to encode protein sequences. Then, LightGBM was used for feature selection, and the selected features were input into ResCNN and the Attention mechanism, respectively, to extract local and global features. Finally, the concatenated features were deeply fused using the Attention mechanism, so that the model pays more attention to key features, and predictions were made by a fully connected layer. The results of 10-fold cross-validation demonstrated that the iACP-DFSRA model delivered improved results in most metrics, with Sp of 94.15%, Sn of 95.32%, Acc of 94.74% and MCC of 89.48%, compared to the latest AACFlow model. Indeed, the iACP-DFSRA model is the only model with Acc > 90% and MCC > 80% on this independent test dataset. We further demonstrated the superiority of our model on additional datasets. In addition, t-SNE and SHAP interpretation analysis demonstrated that it is crucial to use two channels for feature extraction and to use the Attention mechanism for deep fusion, which helps iACP-DFSRA to predict ACPs more effectively.
Affiliation(s)
- Xin Wang
  - School of Science, Dalian Maritime University, Dalian 116026, China.
- Zimeng Zhang
  - School of Science, Dalian Maritime University, Dalian 116026, China
- Chang Liu
  - School of Science, Dalian Maritime University, Dalian 116026, China
5
Ullah F, Salam A, Nadeem M, Amin F, AlSalman H, Abrar M, Alfakih T. Extended dipeptide composition framework for accurate identification of anticancer peptides. Sci Rep 2024; 14:17381. [PMID: 39075193 PMCID: PMC11286958 DOI: 10.1038/s41598-024-68475-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2024] [Accepted: 07/24/2024] [Indexed: 07/31/2024] Open
Abstract
The identification of anticancer peptides (ACPs) is crucial, especially in the development of peptide-based cancer therapy. Classical models such as Split Amino Acid Composition (SAAC) and Pseudo Amino Acid Composition (PseAAC) lack the incorporation of feature representation. These advancements improve the predictive accuracy and efficiency of ACP identification. This research therefore aims to develop an advanced framework based on feature extraction. To achieve this objective, we propose an Extended Dipeptide Composition (EDPC) framework. The proposed EDPC framework extends the dipeptide composition by considering the local sequence environment information and reforming the CD-HIT framework to remove noise and redundancy. To measure the accuracy, we performed several experiments using four well-known machine learning (ML) algorithms: Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF), and K Nearest Neighbor (KNN). For comparisons, we used accuracy, specificity, sensitivity, precision, recall, and F1-score as evaluation criteria. The reliability of the proposed framework is further evaluated using statistical significance tests. As a result, the proposed EDPC framework outperformed SAAC and PseAAC, with the SVM model delivering the highest accuracy of 96.6% and significant improvements in specificity, sensitivity, precision, and F1-score over multiple datasets. Owing to its enhanced feature representation and its incorporation of local and global sequence profiles, the proposed EDPC achieves higher classification performance. The proposed framework can handle noise and duplicate features, and it supports a wide range of feature representations. Finally, our proposed framework can be used for clinical applications where ACP identification is essential. Future work will include extending to a larger variety of datasets, incorporating tertiary structural information, and using deep learning techniques to improve the proposed EDPC.
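The EDPC framework builds on the standard dipeptide composition (DPC). As a point of reference, here is a minimal sketch of plain DPC only, under the common definition of a 400-dimensional vector of adjacent-residue pair frequencies. It does not implement the EDPC extensions described in the abstract. The function name `dipeptide_composition` and the choice to skip pairs containing non-standard residues are assumptions.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]  # 400 pairs


def dipeptide_composition(sequence: str) -> list:
    """Return the 400-dimensional dipeptide frequency vector of a peptide."""
    sequence = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    n_pairs = 0
    # Slide a window of width 2 over the sequence.
    for first, second in zip(sequence, sequence[1:]):
        pair = first + second
        if pair in counts:  # assumption: pairs with non-standard residues are skipped
            counts[pair] += 1
            n_pairs += 1
    if n_pairs == 0:
        return [0.0] * len(DIPEPTIDES)
    # Normalize counts so that the vector sums to 1.
    return [counts[pair] / n_pairs for pair in DIPEPTIDES]


vec = dipeptide_composition("AAKA")  # adjacent pairs: AA, AK, KA
print(len(vec), vec[DIPEPTIDES.index("AK")])
```

A vector like this can be fed directly to classifiers such as the SVM, DT, RF, and KNN models named above.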
Affiliation(s)
- Faizan Ullah
  - Department of Computer Science, Bacha Khan University, Charsadda, 24420, Pakistan
- Abdu Salam
  - Department of Computer Science, Abdul Wali Khan University, Mardan, 23200, Pakistan
- Muhammad Nadeem
  - Department of Computer Science and Software Engineering, International Islamic University, Islamabad, 44000, Pakistan
- Farhan Amin
  - School of Computer Science and Engineering, Yeungnam University, Gyeongsan, 38541, Korea.
- Hussain AlSalman
  - Department of Computer Science, College of Computer and Information Sciences, King Saud University, 11543, Riyadh, Saudi Arabia.
- Mohammad Abrar
  - Faculty of Computer Studies, Arab Open University, Muscat, Oman
- Taha Alfakih
  - Department of Information Systems, College of Computer and Information Sciences, King Saud University, 11543, Riyadh, Saudi Arabia
6
Bhattarai S, Tayara H, Chong KT. Advancing Peptide-Based Cancer Therapy with AI: In-Depth Analysis of State-of-the-Art AI Models. J Chem Inf Model 2024; 64:4941-4957. [PMID: 38874445 DOI: 10.1021/acs.jcim.4c00295] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2024]
Abstract
Anticancer peptides (ACPs) play a vital role in selectively targeting and eliminating cancer cells. Evaluating and comparing predictions from various machine learning (ML) and deep learning (DL) techniques is challenging but crucial for anticancer drug research. We conducted a comprehensive analysis of 15 ML and 10 DL models, including the models released after 2022, and found that support vector machines (SVMs) with feature combination and selection significantly enhance overall performance. DL models, especially convolutional neural networks (CNNs) with light gradient boosting machine (LGBM) based feature selection approaches, demonstrate improved characterization. Assessment using a new test data set (ACP10) identifies ACPred, MLACP 2.0, AI4ACP, mACPred, and AntiCP2.0_AAC as successive optimal predictors, showcasing robust performance. Our review underscores current prediction tool limitations and advocates for an omnidirectional ACP prediction framework to propel ongoing research.
Affiliation(s)
- Sadik Bhattarai
  - Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju-si, 54896 Jeollabuk-do, South Korea
- Hilal Tayara
  - School of International Engineering and Science, Jeonbuk National University, Jeonju-si, 54896 Jeollabuk-do, South Korea
- Kil To Chong
  - Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju-si, 54896 Jeollabuk-do, South Korea
  - Advanced Electronics and Information Research Center, Jeonbuk National University, Jeonju-si, 54896 Jeollabuk-do, South Korea
7
Kao HJ, Weng TH, Chen CH, Chen YC, Chi YH, Huang KY, Weng SL. Integrating In Silico and In Vitro Approaches to Identify Natural Peptides with Selective Cytotoxicity against Cancer Cells. Int J Mol Sci 2024; 25:6848. [PMID: 38999958 PMCID: PMC11240926 DOI: 10.3390/ijms25136848] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2024] [Revised: 06/14/2024] [Accepted: 06/18/2024] [Indexed: 07/14/2024] Open
Abstract
Anticancer peptides (ACPs) are bioactive compounds known for their selective cytotoxicity against tumor cells via various mechanisms. Recent studies have demonstrated that in silico machine learning methods are effective in predicting peptides with anticancer activity. In this study, we collected and analyzed over a thousand experimentally verified ACPs, specifically targeting peptides derived from natural sources. We developed a precise prediction model based on their sequence and structural features, and the model's evaluation results suggest its strong predictive ability for anticancer activity. To enhance reliability, we integrated the results of this model with those from other available methods. In total, we identified 176 potential ACPs, some of which were synthesized and further evaluated using the MTT colorimetric assay. All of these putative ACPs exhibited significant anticancer effects and selective cytotoxicity against specific tumor cells. In summary, we present a strategy for identifying and characterizing natural peptides with selective cytotoxicity against cancer cells, which could serve as novel therapeutic agents. Our prediction model can effectively screen new molecules for potential anticancer activity, and the results from in vitro experiments provide compelling evidence of the candidates' anticancer effects and selective cytotoxicity.
Affiliation(s)
- Hui-Ju Kao
  - Department of Medical Research, Hsinchu MacKay Memorial Hospital, Hsinchu City 300, Taiwan
  - Department of Medical Research, Hsinchu Municipal MacKay Children's Hospital, Hsinchu City 300, Taiwan
- Tzu-Han Weng
  - Department of Dermatology, MacKay Memorial Hospital, Taipei City 104, Taiwan
- Chia-Hung Chen
  - Department of Medical Research, Hsinchu MacKay Memorial Hospital, Hsinchu City 300, Taiwan
  - Department of Medical Research, Hsinchu Municipal MacKay Children's Hospital, Hsinchu City 300, Taiwan
- Yu-Chi Chen
  - Department of Medical Research, Hsinchu MacKay Memorial Hospital, Hsinchu City 300, Taiwan
  - Department of Medical Research, Hsinchu Municipal MacKay Children's Hospital, Hsinchu City 300, Taiwan
- Yu-Hsiang Chi
  - National Center for High-Performance Computing, Hsinchu City 300, Taiwan
- Kai-Yao Huang
  - Department of Medical Research, Hsinchu MacKay Memorial Hospital, Hsinchu City 300, Taiwan
  - Department of Medical Research, Hsinchu Municipal MacKay Children's Hospital, Hsinchu City 300, Taiwan
  - Department of Medicine, MacKay Medical College, New Taipei City 252, Taiwan
  - Institute of Biomedical Sciences, MacKay Medical College, New Taipei City 252, Taiwan
- Shun-Long Weng
  - Department of Medicine, MacKay Medical College, New Taipei City 252, Taiwan
  - Department of Obstetrics and Gynecology, Hsinchu MacKay Memorial Hospital, Hsinchu City 300, Taiwan
  - Department of Obstetrics and Gynecology, Hsinchu Municipal MacKay Children's Hospital, Hsinchu City 300, Taiwan
8
Bian J, Liu X, Dong G, Hou C, Huang S, Zhang D. ACP-ML: A sequence-based method for anticancer peptide prediction. Comput Biol Med 2024; 170:108063. [PMID: 38301519 DOI: 10.1016/j.compbiomed.2024.108063] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2023] [Revised: 01/08/2024] [Accepted: 01/27/2024] [Indexed: 02/03/2024]
Abstract
Cancer is a serious malignant tumor and is difficult to cure. Chemotherapy, as a primary treatment for cancer, causes significant harm to normal cells in the body and is often accompanied by serious side effects. Recently, anti-cancer peptides (ACPs), as a type of protein for treating cancers, have dominated research into the development of new anti-tumor drugs because of their ability to specifically target and destroy cancer cells. The screening of proteins with cancer-inhibiting properties from a large pool of proteins is key to the development of anti-tumor drugs. However, it is expensive and inefficient to accurately identify protein functions only through biological experiments due to their complex structure. Therefore, we propose a new prediction model, ACP-ML, to effectively predict ACPs. In terms of feature extraction, DPC, PseAAC, CTDC, CTDT and CS-Pse-PSSM features were used, and the optimal feature set was selected by comparing combinations of these features. Then, a two-step feature selection process using the MRMD and RFE algorithms was performed to determine the most crucial features from the optimal feature set for identifying ACPs. Furthermore, we assessed the classification accuracy of single learning models and of ensemble models based on different strategies through ten-fold cross-validation. Ultimately, a voting-based ensemble learning method was developed to predict ACPs. To validate its effectiveness, two independent test sets were used, achieving accuracies of 90.891% and 92.578%, respectively. Compared with existing anticancer peptide prediction algorithms, the proposed feature processing method is more effective, and the proposed ensemble model ACP-ML exhibits stronger generalization capability and higher accuracy.
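Voting-based ensembles of the kind mentioned in this abstract are commonly built in one of two ways. The sketch below illustrates generic hard (majority) and soft (probability-averaging) voting. It is not the ACP-ML implementation: the abstract does not say which variant was used, and the function names, the 0.5 threshold, and the toy inputs are assumptions.

```python
from collections import Counter


def hard_vote(predictions: list) -> list:
    """Majority vote over per-model label lists (one inner list per model)."""
    voted = []
    # zip(*predictions) groups the labels that all models gave to one sample.
    for labels in zip(*predictions):
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


def soft_vote(probabilities: list, threshold: float = 0.5) -> list:
    """Average per-model positive-class probabilities, then apply a threshold."""
    n_models = len(probabilities)
    return [int(sum(sample) / n_models >= threshold) for sample in zip(*probabilities)]


# Three hypothetical base models scoring the same peptides.
labels = hard_vote([[1, 0, 1], [1, 1, 0], [0, 0, 1]])
soft_labels = soft_vote([[0.9, 0.2], [0.6, 0.4], [0.3, 0.5]])
print(labels, soft_labels)
```

Using an odd number of base models avoids ties in hard voting for a binary task. Soft voting additionally lets a confident model outweigh two uncertain ones.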
Affiliation(s)
- Jilong Bian
  - Northeast Forestry University, College of Computer and Control Engineering, Harbin, Heilongjiang, China.
- Xuan Liu
  - Northeast Forestry University, College of Computer and Control Engineering, Harbin, Heilongjiang, China
- Guanghui Dong
  - Northeast Forestry University, College of Computer and Control Engineering, Harbin, Heilongjiang, China
- Chang Hou
  - Northeast Forestry University, College of Computer and Control Engineering, Harbin, Heilongjiang, China
- Shan Huang
  - Department of Neurology, The Second Affiliated Hospital, Harbin Medical University, Harbin, Heilongjiang, China.
- Dandan Zhang
  - Department of Obstetrics and Gynecology, The First Affiliated Hospital of Harbin Medical University, Harbin, Heilongjiang, China.
9
Medvedeva A, Domakhina S, Vasnetsov C, Vasnetsov V, Kolomeisky A. Physical-Chemical Approach to Designing Drugs with Multiple Targets. J Phys Chem Lett 2024; 15:1828-1835. [PMID: 38330920 DOI: 10.1021/acs.jpclett.3c03624] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/10/2024]
Abstract
Many people simultaneously exhibit multiple diseases, which complicates efficient medical treatments. For example, patients with cancer are frequently susceptible to infections. However, developing drugs that could simultaneously target several diseases is challenging. We present a novel theoretical method to assist in selecting compounds with multiple therapeutic targets. The idea is to find correlations between the physical and chemical properties of drug molecules and their abilities to work against multiple targets. As a first step, we investigated potential drugs against cancer and viral infections. Specifically, we investigated antimicrobial peptides (AMPs), which are short positively charged biomolecules produced by living systems as a part of their immune defense. AMPs show anticancer and antiviral activity. We use chemoinformatics and correlation analysis as a part of the machine-learning method to identify the specific properties that distinguish AMPs with dual anticancer and antiviral activities. Physical-chemical arguments to explain these observations are presented.
Affiliation(s)
- Angela Medvedeva
  - Department of Chemistry, Rice University, Houston, Texas 77005, United States
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
- Sofya Domakhina
  - Department of Chemistry, Rice University, Houston, Texas 77005, United States
- Catherine Vasnetsov
  - Department of Chemistry, Rice University, Houston, Texas 77005, United States
- Victor Vasnetsov
  - Department of Chemistry, Rice University, Houston, Texas 77005, United States
- Anatoly Kolomeisky
  - Department of Chemistry, Rice University, Houston, Texas 77005, United States
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
  - Department of Chemical and Biomolecular Engineering, Rice University, Houston, Texas 77005, United States
  - Department of Physics and Astronomy, Rice University, Houston, Texas 77005, United States
10
Karim T, Shaon MSH, Sultan MF, Hasan MZ, Kafy AA. ANNprob-ACPs: A novel anticancer peptide identifier based on probabilistic feature fusion approach. Comput Biol Med 2024; 169:107915. [PMID: 38171261 DOI: 10.1016/j.compbiomed.2023.107915] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2023] [Revised: 12/28/2023] [Accepted: 12/29/2023] [Indexed: 01/05/2024]
Abstract
Anticancer Peptides (ACPs) offer significant potential as cancer treatment drugs in this modern era. Quickly identifying active compounds from protein sequences is crucial for healthcare and cancer treatment. In this paper, ANNprob-ACPs, a novel and effective model for detecting ACPs, has been implemented based on nine feature encoding techniques: AAC, CC, W2V, DPC, PAAC, QSO, CTDC, CTDT, and CKSAAGP. After analyzing the performance of several machine learning models, the six best models were selected based on their overall performance in every evaluation metric. The probability scores of each model were subsequently aggregated and used as input to our meta-model, called ANNprob-ACPs. Our model outperformed all others, which shows its potential for the identification of ACPs. The results of this study showed notable improvement in 10-fold cross-validation and independent testing, with accuracies of 93.72% and 90.62%, respectively. Our proposed model, ANNprob-ACPs, outperformed existing approaches in terms of accuracy and effectiveness in discovering ACPs. SHAP analysis indicated that the physicochemical properties of QSO and the compositional properties of DPC, AAC, and PAAC are the most impactful for our model's performance; these properties have a major impact on a drug's interactions and future discoveries. Consequently, this model is crucial for the future and has a high probability of detecting ACPs more frequently. We developed a web server of ANNprob-ACPs, which is accessible at ANNprob-ACPs webserver.
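The probabilistic feature fusion step described here, where base-model probability scores become the input of a meta-model, can be sketched generically as follows. This only shows how a meta-feature matrix might be assembled; it is not the ANNprob-ACPs code. The function name `stack_probabilities`, the model names, and the scores are hypothetical, and the meta-model itself (an ANN in the paper) is left out.

```python
def stack_probabilities(model_probs: dict) -> list:
    """Build one meta-feature row per peptide from base-model P(ACP) scores.

    model_probs maps a model name to its list of predicted probabilities,
    with every list ordered by the same peptides.
    """
    names = sorted(model_probs)  # fixed column order, independent of dict order
    n_samples = len(model_probs[names[0]])
    if any(len(model_probs[name]) != n_samples for name in names):
        raise ValueError("all base models must score the same peptides")
    # Row i holds the probabilities that each base model assigned to peptide i.
    return [[model_probs[name][i] for name in names] for i in range(n_samples)]


# Hypothetical scores from three base models for two peptides.
meta_features = stack_probabilities(
    {"svm": [0.81, 0.12], "rf": [0.74, 0.30], "knn": [0.66, 0.25]}
)
print(meta_features)
```

Each row then serves as the input vector of the meta-model. In practice the base-model probabilities for training rows are usually produced out-of-fold to limit leakage, although the abstract does not state how this was handled.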
Affiliation(s)
- Tasmin Karim
  - Department of Computer Science & Engineering, Daffodil International University, Daffodil Smart City, Birulia, Dhaka, 1216, Bangladesh; Health Informatics Research Lab, Department of Computer Science and Engineering, Daffodil International University, Daffodil Smart City, Birulia, Dhaka, 1216, Bangladesh.
- Md Shazzad Hossain Shaon
  - Department of Computer Science & Engineering, Daffodil International University, Daffodil Smart City, Birulia, Dhaka, 1216, Bangladesh; Health Informatics Research Lab, Department of Computer Science and Engineering, Daffodil International University, Daffodil Smart City, Birulia, Dhaka, 1216, Bangladesh.
- Md Fahim Sultan
  - Department of Computer Science & Engineering, Daffodil International University, Daffodil Smart City, Birulia, Dhaka, 1216, Bangladesh; Health Informatics Research Lab, Department of Computer Science and Engineering, Daffodil International University, Daffodil Smart City, Birulia, Dhaka, 1216, Bangladesh.
- Md Zahid Hasan
  - Department of Computer Science & Engineering, Daffodil International University, Daffodil Smart City, Birulia, Dhaka, 1216, Bangladesh; Health Informatics Research Lab, Department of Computer Science and Engineering, Daffodil International University, Daffodil Smart City, Birulia, Dhaka, 1216, Bangladesh.
- Abdulla-Al Kafy
  - Department of Urban & Regional Planning, Rajshahi University of Engineering & Technology (RUET), Rajshahi, 6204, Bangladesh.
11
Li C, Jin K. Chemical Strategies towards the Development of Effective Anticancer Peptides. Curr Med Chem 2024; 31:1839-1873. [PMID: 37170992 DOI: 10.2174/0929867330666230426111157] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2022] [Revised: 01/28/2023] [Accepted: 02/24/2023] [Indexed: 05/13/2023]
Abstract
Cancer is increasingly recognized as one of the primary causes of death and has become a multifaceted global health issue. Modern medical science has made significant advancements in the diagnosis and therapy of cancer over the past decade. The detrimental side effects, lack of efficacy, and multidrug resistance of conventional cancer therapies have created an urgent need for novel anticancer therapeutics or treatments with low cytotoxicity and drug resistance. The pharmaceutical groups have recognized the crucial role that peptide therapeutic agents can play in addressing unsatisfied healthcare demands and how these become great supplements or even preferable alternatives to biological therapies and small molecules. Anticancer peptides, as a vibrant therapeutic strategy against various cancer cells, have demonstrated incredible anticancer potential due to high specificity and selectivity, low toxicity, and the ability to target the surface of traditional "undruggable" proteins. This review will provide the research progression of anticancer peptides, mainly focusing on the discovery and modifications along with the optimization and application of these peptides in clinical practice.
Affiliation(s)
- Cuicui Li
  - Key Laboratory of Chemical Biology (Ministry of Education), Department of Medicinal Chemistry, School of Pharmacy, Cheeloo College of Medicine, Shandong University, Jinan, Shandong, 250012, China
- Kang Jin
  - Key Laboratory of Chemical Biology (Ministry of Education), Department of Medicinal Chemistry, School of Pharmacy, Cheeloo College of Medicine, Shandong University, Jinan, Shandong, 250012, China
12
Wang Y, Wang L, Li C, Pei Y, Liu X, Tian Y. AMP-EBiLSTM: employing novel deep learning strategies for the accurate prediction of antimicrobial peptides. Front Genet 2023; 14:1232117. [PMID: 37554402 PMCID: PMC10405519 DOI: 10.3389/fgene.2023.1232117] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2023] [Accepted: 07/11/2023] [Indexed: 08/10/2023] Open
Abstract
Antimicrobial peptides are present ubiquitously in intra- and extra-biological environments and display considerable antibacterial and antifungal activities. Clinically, they have shown good antibacterial effects in the treatment of diabetic foot and its complications. However, the discovery and screening of antimicrobial peptides primarily rely on wet lab experiments, which are inefficient. This study endeavors to create a precise and efficient method of predicting antimicrobial peptides by incorporating novel machine learning technologies. We proposed a deep learning strategy named AMP-EBiLSTM to accurately predict them, and compared its performance with ensemble learning and baseline models. We utilized Binary Profile Feature (BPF) and Pseudo Amino Acid Composition (PSEAAC) for effective local sequence capture and amino acid information extraction, respectively, in deep learning and ensemble learning. Each model was cross-validated and externally tested independently. The results demonstrate that the Enhanced Bi-directional Long Short-Term Memory (EBiLSTM) deep learning model outperformed the others, with an accuracy of 92.39% and an AUC value of 0.9771 on the test set. On the other hand, the ensemble learning models demonstrated cost-effectiveness in terms of training time on a T4 server equipped with 16 GB of GPU memory and 8 vCPUs, with training durations varying from 0 to 30 s. Therefore, the strategy we propose is expected to predict antimicrobial peptides more accurately in the future.
Affiliation(s)
- Yuanda Wang
- School of Modern Post (School of Automation), Beijing University of Posts and Telecommunications, Beijing, China
| | - Liyang Wang
- School of Clinical Medicine, Tsinghua University, Beijing, China
| | - Chengquan Li
- School of Clinical Medicine, Tsinghua University, Beijing, China
| | - Yilin Pei
- School of Clinical Medicine, Tsinghua University, Beijing, China
| | - Xiaoxiao Liu
- Laboratory Medicine, Guangdong Provincial People’s Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou, China
| | - Yu Tian
- Vascular Surgery Department, Shanxi Bethune Hospital, Shanxi Academy of Medical Sciences, Tongji Shanxi Hospital, Third Hospital of Shanxi Medical University, Taiyuan, China
| |
|
13
|
Mohammed A, Kora R. A Comprehensive Review on Ensemble Deep Learning: Opportunities and Challenges. Journal of King Saud University - Computer and Information Sciences 2023. [DOI: 10.1016/j.jksuci.2023.01.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/04/2023]
|
14
|
Yuan Q, Chen K, Yu Y, Le NQK, Chua MCH. Prediction of anticancer peptides based on an ensemble model of deep learning and machine learning using ordinal positional encoding. Brief Bioinform 2023; 24:6987656. [PMID: 36642410 DOI: 10.1093/bib/bbac630] [Citation(s) in RCA: 42] [Impact Index Per Article: 21.0] [Received: 08/03/2022] [Revised: 12/01/2022] [Accepted: 12/28/2022] [Indexed: 01/17/2023]
Abstract
Anticancer peptides (ACPs) are peptides that have been demonstrated to have anticancer activities. Using ACPs to prevent cancer could be a viable alternative to conventional cancer treatments because they are safer and display higher selectivity. Because ACP identification is highly lab-limited, expensive and lengthy, a computational method is proposed in this study to predict ACPs from sequence information. The process includes the input of the peptide sequences, feature extraction in terms of ordinal encoding with positional information and handcrafted features, and finally feature selection. The whole model comprises two modules, including deep learning and machine learning algorithms. The deep learning module contained two channels: bidirectional long short-term memory (BiLSTM) and convolutional neural network (CNN). Light Gradient Boosting Machine (LightGBM) was used in the machine learning module. Finally, the classification results of the three models were combined by voting to form the model ensemble layer. This study provides insights into ACP prediction utilizing a novel method and presents a promising performance. It used a benchmark dataset for further exploration and improvement compared with previous studies. Our final model has an accuracy of 0.7895, sensitivity of 0.8153 and specificity of 0.7676, an increase of at least 2% over the state-of-the-art studies in all metrics. Hence, this paper presents a novel method that can potentially predict ACPs more effectively and efficiently. The work and source codes are made available to the community of researchers and developers at https://github.com/khanhlee/acp-ope/.
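The voting step described in the abstract above, in which the outputs of the BiLSTM, CNN and LightGBM paths are combined, can be sketched as plain majority (hard) voting. This is an illustration under the assumption that each model emits one class label per peptide; the abstract does not state whether hard or soft voting was used, and the function name `majority_vote` and the example labels are hypothetical.

```python
# Hypothetical sketch of hard majority voting over per-model class labels.
# The abstract does not specify the exact voting rule; this is one option.
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists into one label per sample.

    predictions: list of lists, one inner list of labels per model,
    all of the same length. Returns the most frequent label per sample.
    """
    if not predictions:
        raise ValueError("at least one model's predictions are required")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all models must predict the same number of samples")
    voted = []
    for labels in zip(*predictions):
        # With three binary classifiers a strict majority always exists.
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


if __name__ == "__main__":
    bilstm = [1, 0, 1, 1]    # made-up labels for four peptides
    cnn = [1, 1, 0, 1]
    lightgbm = [0, 0, 1, 1]
    print(majority_vote([bilstm, cnn, lightgbm]))
```

Using an odd number of binary classifiers avoids ties, which is one practical reason a three-path design combines cleanly under this rule.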
Affiliation(s)
- Qitong Yuan
- Institute of Systems Science, National University of Singapore, 25 Heng Mui Keng Terrace, 119615, Singapore, Singapore
| | - Keyi Chen
- Institute of Systems Science, National University of Singapore, 25 Heng Mui Keng Terrace, 119615, Singapore, Singapore
| | - Yimin Yu
- Institute of Systems Science, National University of Singapore, 25 Heng Mui Keng Terrace, 119615, Singapore, Singapore
| | - Nguyen Quoc Khanh Le
- Professional Master Program in Artificial Intelligence in Medicine, College of Medicine, Taipei Medical University, 250 Wuxing St, 106, Taipei, Taiwan.,Research Center for Artificial Intelligence in Medicine, Taipei Medical University, 250 Wuxing St, 106, Taipei, Taiwan.,Translational Imaging Research Center, Taipei Medical University Hospital, 252 Wuxing St, 110, Taipei, Taiwan
| | - Matthew Chin Heng Chua
- Institute of Systems Science, National University of Singapore, 25 Heng Mui Keng Terrace, 119615, Singapore, Singapore
| |
|
15
|
Sharma L, Bisht GS. Short Antimicrobial Peptides: Therapeutic Potential and Recent Advancements. Curr Pharm Des 2023; 29:3005-3017. [PMID: 38018196 DOI: 10.2174/0113816128248959231102114334] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2023] [Revised: 09/28/2023] [Accepted: 10/11/2023] [Indexed: 11/30/2023]
Abstract
There has been a lot of interest in antimicrobial peptides (AMPs) as potential next-generation antibiotics. They are components of the innate immune system. AMPs have broad-spectrum action and are less prone to resistance development. They show potential applications in various fields, including medicine, agriculture, and the food industry. However, despite the good activity and safety profiles, AMPs have had difficulty finding success in the clinic due to their various limitations, such as production cost, proteolytic susceptibility, and oral bioavailability. To overcome these flaws, a number of solutions have been devised, one of which is developing short antimicrobial peptides. Short antimicrobial peptides do have an advantage over longer peptides as they are more stable and do not collapse during absorption. They have generated a lot of interest because of their evolutionary success and advantageous properties, such as low molecular weight, selective targets, cell or organelles with minimal toxicity, and enormous therapeutic potential. This article provides an overview of the development of short antimicrobial peptides with an emphasis on those with ≤ 30 amino acid residues as a potential therapeutic agent to fight drug-resistant microorganisms. It also emphasizes their applications in many fields and discusses their current state in clinical trials.
Affiliation(s)
- Lalita Sharma
- Department of Biotechnology and Bioinformatics, Jaypee University of Information Technology, Waknaghat, Himachal Pradesh, India
| | - Gopal Singh Bisht
- Department of Biotechnology and Bioinformatics, Jaypee University of Information Technology, Waknaghat, Himachal Pradesh, India
| |
|
16
|
Zhou C, Peng D, Liao B, Jia R, Wu F. ACP_MS: prediction of anticancer peptides based on feature extraction. Brief Bioinform 2022; 23:6793775. [PMID: 36326080 DOI: 10.1093/bib/bbac462] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/30/2022] [Revised: 09/10/2022] [Accepted: 09/27/2022] [Indexed: 11/06/2022]
Abstract
Anticancer peptides (ACPs) are bioactive peptides with antitumor activity and have become the most promising drugs in the treatment of cancer. Therefore, the accurate prediction of ACPs is of great significance to the research of cancer diseases. In the paper, we developed a more efficient prediction model called ACP_MS. Firstly, the monoMonoKGap method is used to extract the characteristic of anticancer peptide sequences and form the digital features. Then, the AdaBoost model is used to select the most discriminating features from the digital features. Finally, a stochastic gradient descent algorithm is introduced to identify anticancer peptide sequences. We adopt 7-fold cross-validation and independent test set validation, and the final accuracy of the main dataset reached 92.653% and 91.597%, respectively. The accuracy of the alternate dataset reached 98.678% and 98.317%, respectively. Compared with other advanced prediction models, the ACP_MS model improves the identification ability of anticancer peptide sequences. The data of this model can be downloaded from the public website for free https://github.com/Zhoucaimao1998/Zc.
Affiliation(s)
- Caimao Zhou
- Key Laboratory of Computational Science and Application of Hainan Province, Haikou, China.,Key Laboratory of Data Science and Intelligence Education, Hainan Normal University, Ministry of Education, Haikou, China.,School of Mathematics and Statistics, Hainan Normal University, Haikou, China
| | - Dejun Peng
- Key Laboratory of Computational Science and Application of Hainan Province, Haikou, China.,Key Laboratory of Data Science and Intelligence Education, Hainan Normal University, Ministry of Education, Haikou, China.,School of Mathematics and Statistics, Hainan Normal University, Haikou, China
| | - Bo Liao
- Key Laboratory of Computational Science and Application of Hainan Province, Haikou, China.,Key Laboratory of Data Science and Intelligence Education, Hainan Normal University, Ministry of Education, Haikou, China.,School of Mathematics and Statistics, Hainan Normal University, Haikou, China
| | - Ranran Jia
- Key Laboratory of Computational Science and Application of Hainan Province, Haikou, China.,Key Laboratory of Data Science and Intelligence Education, Hainan Normal University, Ministry of Education, Haikou, China.,School of Mathematics and Statistics, Hainan Normal University, Haikou, China
| | - Fangxiang Wu
- Key Laboratory of Computational Science and Application of Hainan Province, Haikou, China.,Key Laboratory of Data Science and Intelligence Education, Hainan Normal University, Ministry of Education, Haikou, China.,School of Mathematics and Statistics, Hainan Normal University, Haikou, China
| |
|
17
|
ACP-ADA: A Boosting Method with Data Augmentation for Improved Prediction of Anticancer Peptides. Int J Mol Sci 2022; 23:ijms232012194. [PMID: 36293050 PMCID: PMC9603247 DOI: 10.3390/ijms232012194] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/29/2022] [Revised: 10/08/2022] [Accepted: 10/11/2022] [Indexed: 11/30/2022]
Abstract
Cancer is the second-leading cause of death worldwide, and therapeutic peptides that target and destroy cancer cells have received a great deal of interest in recent years. Traditional wet experiments are expensive and inefficient for identifying novel anticancer peptides; therefore, the development of an effective computational approach is essential to recognize ACP candidates before experimental methods are used. In this study, we proposed an Ada-boosting algorithm with random forest as the base learner, called ACP-ADA, which integrates binary profile feature, amino acid index, and amino acid composition with a 210-dimensional feature space vector to represent the peptides. Training samples in the feature space were augmented to increase the sample size and further improve the performance of the model in the case of insufficient samples. Furthermore, we used five-fold cross-validation to find model parameters, and the cross-validation results showed that ACP-ADA outperforms existing methods for this feature combination with data augmentation in terms of performance metrics. Specifically, ACP-ADA recorded an average accuracy of 86.4% and a Matthews correlation coefficient of 74.01% for dataset ACP740 and 90.83% and 81.65% for dataset ACP240; consequently, it can be a very useful tool in drug development and biomedical research.
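Of the three feature types the abstract above lists, amino acid composition is the simplest to show. The sketch below assumes the usual definition, the relative frequency of each of the 20 standard amino acids in the peptide, which yields a 20-dimensional vector. It does not reproduce the paper's full 210-dimensional representation, and the name `amino_acid_composition` is hypothetical.

```python
# Hypothetical sketch of amino acid composition (AAC) under its usual
# definition; it is not the feature pipeline from the cited paper.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid in the sequence.

    Non-standard characters are ignored; an empty or fully non-standard
    sequence yields a vector of zeros.
    """
    counts = {aa: 0 for aa in AMINO_ACIDS}
    valid = 0
    for residue in sequence.upper():
        if residue in counts:
            counts[residue] += 1
            valid += 1
    if valid == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / valid for aa in AMINO_ACIDS]


if __name__ == "__main__":
    aac = amino_acid_composition("AACK")
    print(aac[0], aac[1], aac[8])  # fractions of A, C and K
```

Because the vector length does not depend on peptide length, composition features can be concatenated with position-dependent encodings without any padding.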
|
18
|
Akbar S, Hayat M, Tahir M, Khan S, Alarfaj FK. cACP-DeepGram: Classification of anticancer peptides via deep neural network and skip-gram-based word embedding model. Artif Intell Med 2022; 131:102349. [DOI: 10.1016/j.artmed.2022.102349] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/10/2021] [Revised: 05/24/2022] [Accepted: 07/04/2022] [Indexed: 12/28/2022]
|
19
|
Multi-feature Fusion Method Based on Linear Neighborhood Propagation Predict Plant LncRNA-Protein Interactions. Interdiscip Sci 2022; 14:545-554. [PMID: 35040094 DOI: 10.1007/s12539-022-00501-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/29/2021] [Revised: 12/28/2021] [Accepted: 01/04/2022] [Indexed: 12/31/2022]
Abstract
Long non-coding RNAs (lncRNAs) have attracted extensive attention due to their important roles in various biological processes, among which lncRNA-protein interaction plays an important regulatory role in plant immunity and life activities. Laboratory methods are time consuming and labor-intensive, so that many computational methods have gradually emerged as auxiliary tools to assist relevant research. However, there are relatively few methods to predict lncRNA-protein interaction of plant. Due to the lack of experimentally verified interactions data, there is an imbalance between known and unknown interaction samples in plant data sets. In this study, a multi-feature fusion method based on linear neighborhood propagation is developed to predict plant unobserved lncRNA-protein interaction pairs through known interaction pairs, called MPLPLNP. The linear neighborhood similarity of the feature space is calculated and the results are predicted by label propagation. Meanwhile, multiple feature training is integrated to better explore the potential interaction information in the data. The experimental results show that the proposed multi-feature fusion method can improve the performance of the model, and is superior to other state-of-the-art approaches. Moreover, the proposed approach has better performance and generalization ability on various plant datasets, which is expected to facilitate the related research of plant molecular biology.
|
20
|
ACPNet: A Deep Learning Network to Identify Anticancer Peptides by Hybrid Sequence Information. Molecules 2022; 27:molecules27051544. [PMID: 35268644 PMCID: PMC8912097 DOI: 10.3390/molecules27051544] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 01/03/2022] [Revised: 02/20/2022] [Accepted: 02/23/2022] [Indexed: 12/18/2022]
Abstract
Cancer is one of the most dangerous threats to human health. One of the issues is drug resistance action, which leads to side effects after drug treatment. Numerous therapies have endeavored to relieve the drug resistance action. Recently, anticancer peptides could be a novel and promising anticancer candidate, which can inhibit tumor cell proliferation, migration, and suppress the formation of tumor blood vessels, with fewer side effects. However, it is costly, laborious and time consuming to identify anticancer peptides by biological experiments with a high throughput. Therefore, accurately identifying anti-cancer peptides becomes a key and indispensable step for anticancer peptides therapy. Although some existing computer methods have been developed to predict anticancer peptides, the accuracy still needs to be improved. Thus, in this study, we propose a deep learning-based model, called ACPNet, to distinguish anticancer peptides from non-anticancer peptides (non-ACPs). ACPNet employs three different types of peptide sequence information, peptide physicochemical properties and auto-encoding features linking the training process. ACPNet is a hybrid deep learning network, which fuses fully connected networks and recurrent neural networks. The comparison with other existing methods on ACPs82 datasets shows that ACPNet not only achieves the improvement of 1.2% Accuracy, 2.0% F1-score, and 7.2% Recall, but also gets balanced performance on the Matthews correlation coefficient. Meanwhile, ACPNet is verified on an independent dataset, with 20 proven anticancer peptides, and only one anticancer peptide is predicted as non-ACPs. The comparison and independent validation experiment indicate that ACPNet can accurately distinguish anticancer peptides from non-ACPs.
|
21
|
Dhall A, Jain S, Sharma N, Naorem LD, Kaur D, Patiyal S, Raghava GPS. In silico tools and databases for designing cancer immunotherapy. Advances in Protein Chemistry and Structural Biology 2021; 129:1-50. [PMID: 35305716 DOI: 10.1016/bs.apcsb.2021.11.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Immunotherapy is a rapidly growing therapy for cancer that has numerous benefits over conventional treatments like surgery, chemotherapy, and radiation. Overall survival of cancer patients has improved significantly due to the use of immunotherapy. It acts as a novel pillar for treating different malignancies from their primary to the metastatic stage. Recent advancements in high-throughput sequencing and computational immunology have led to the development of targeted immunotherapy for precision oncology. In the last few decades, several computational methods and resources have been developed for designing immunotherapy against cancer. In this review, we have summarized cancer-associated genomic, transcriptomic, and mutation profile repositories. We have also listed in silico methods for the prediction of vaccine candidates, HLA binders, cytokine-inducing peptides, and potential neoepitopes. Of note, we have incorporated the most important bioinformatics pipelines and resources for the design of cancer immunotherapy. Moreover, to assist the scientific community, we have developed a web portal entitled ImmCancer (https://webs.iiitd.edu.in/raghava/immcancer/), which comprises cancer immunotherapy tools and repositories.
Affiliation(s)
- Anjali Dhall
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, India
| | - Shipra Jain
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, India
| | - Neelam Sharma
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, India
| | - Leimarembi Devi Naorem
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, India
| | - Dilraj Kaur
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, India
| | - Sumeet Patiyal
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, India
| | - Gajendra P S Raghava
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, India.
| |
|
22
|
Cai L, Wang L, Fu X, Zeng X. Active Semisupervised Model for Improving the Identification of Anticancer Peptides. ACS Omega 2021; 6:23998-24008. [PMID: 34568678 PMCID: PMC8459422 DOI: 10.1021/acsomega.1c03132] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 06/15/2021] [Indexed: 06/13/2023]
Abstract
Cancer is one of the most dangerous threats to human health. Accurate identification of anticancer peptides (ACPs) is valuable for the development and design of new anticancer agents. However, most machine-learning algorithms have limited ability to identify ACPs, and their accuracy is sensitive to the amount of labeled data. In this paper, we construct a new method, called ACP-ALPM, that combines active learning (AL) and a label propagation (LP) algorithm to solve this problem. First, we develop an efficient feature representation method based on various descriptor information and coding information of the peptide sequence. Then, an AL strategy is used to filter out the most informative data for model training, and a more powerful LP classifier is built through continuous iterations. Finally, we evaluate the performance of ACP-ALPM and compare it with that of some of the state-of-the-art and classic methods; experimental results show that our method is significantly superior to them. In addition, an experimental comparison of random selection and AL on three public data sets shows that the AL strategy is more effective. Notably, a visualization experiment further verified that AL can utilize unlabeled data to improve the performance of the model. We hope that our method can be extended to other types of peptides and provide more inspiration for other similar work.
Affiliation(s)
- Lijun Cai
- Department of Information Science and Technology, Hunan University, Changsha, Hunan 410000, China
| | - Li Wang
- Department of Information Science and Technology, Hunan University, Changsha, Hunan 410000, China
| | - Xiangzheng Fu
- Department of Information Science and Technology, Hunan University, Changsha, Hunan 410000, China
| | - Xiangxiang Zeng
- Department of Information Science and Technology, Hunan University, Changsha, Hunan 410000, China
| |
|
23
|
Lv Z, Cui F, Zou Q, Zhang L, Xu L. Anticancer peptides prediction with deep representation learning features. Brief Bioinform 2021; 22:bbab008. [PMID: 33529337 DOI: 10.1093/bib/bbab008] [Citation(s) in RCA: 87] [Impact Index Per Article: 21.8] [Received: 10/06/2020] [Revised: 12/20/2020] [Accepted: 01/05/2021] [Indexed: 12/13/2022]
Abstract
Anticancer peptides constitute one of the most promising therapeutic agents for combating common human cancers. Using wet experiments to verify whether a peptide displays anticancer characteristics is time-consuming and costly. Hence, in this study, we proposed a computational method named identify anticancer peptides via deep representation learning features (iACP-DRLF) using the light gradient boosting machine algorithm and deep representation learning features. Two kinds of sequence embedding technologies were used, namely soft symmetric alignment embedding and unified representation (UniRep) embedding, both of which involved deep neural network models based on long short-term memory networks and their derived networks. The results showed that the use of deep representation learning features greatly improved the capability of the models to discriminate anticancer peptides from other peptides. Also, UMAP (uniform manifold approximation and projection for dimension reduction) and SHAP (Shapley additive explanations) analyses indicated that UniRep has an advantage over other features for anticancer peptide identification. The Python script and pretrained models can be downloaded from https://github.com/zhibinlv/iACP-DRLF or from http://public.aibiochem.net/iACP-DRLF/.
Affiliation(s)
- Zhibin Lv
- University of Electronic Science and Technology of China
| | - Feifei Cui
- University of Electronic Science and Technology of China
| | - Quan Zou
- Institute of Fundamental and Frontier Sciences at University of Electronic Science and Technology of China
| | - Lichao Zhang
- School of Intelligent Manufacturing and Equipment, Shenzhen Institute of Information Technology, China
| | - Lei Xu
- School of Electronic and Communication Engineering, Shenzhen Polytechnic, China
| |
|
24
|
Perpetuo L, Klein J, Ferreira R, Guedes S, Amado F, Leite-Moreira A, Silva AMS, Thongboonkerd V, Vitorino R. How can artificial intelligence be used for peptidomics? Expert Rev Proteomics 2021; 18:527-556. [PMID: 34343059 DOI: 10.1080/14789450.2021.1962303] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 01/01/2023]
Abstract
INTRODUCTION: Peptidomics is an emerging field of omics sciences using advanced isolation, analysis, and computational techniques that enable qualitative and quantitative analyses of various peptides in biological samples. Peptides can act as useful biomarkers and as therapeutic molecules for diseases. AREAS COVERED: The use of therapeutic peptides can be predicted quickly and efficiently using data-driven computational methods, particularly artificial intelligence (AI) approaches. Various AI approaches are useful for peptide-based drug discovery, such as support vector machine, random forest, extremely randomized trees, and other more recently developed deep learning methods. AI methods are relatively new to the development of peptide-based therapies, but these techniques have already become essential tools in protein science by dissecting novel therapeutic peptides and their functions (Figure 1). EXPERT OPINION: Researchers have shown that AI models can facilitate the development of peptidomics and selective peptide therapies in the field of peptide science. Biopeptide prediction is important for the discovery and development of successful peptide-based drugs. Due to their ability to predict therapeutic roles based on sequence details, many AI-dependent prediction tools have been developed (Figure 1).
Affiliation(s)
- Luís Perpetuo
- iBiMED, Department of Medical Sciences, University of Aveiro, Aveiro
| | - Julie Klein
- Institut National de la Santé et de la Recherche Médicale (INSERM), U1297, Institute of Cardiovascular and Metabolic Disease, Université Toulouse III, Toulouse, France
| | - Rita Ferreira
- LAQV/REQUIMTE, Department of Chemistry, University of Aveiro, Aveiro
| | - Sofia Guedes
- LAQV/REQUIMTE, Department of Chemistry, University of Aveiro, Aveiro
| | - Francisco Amado
- LAQV/REQUIMTE, Department of Chemistry, University of Aveiro, Aveiro
| | - Adelino Leite-Moreira
- UnIC, Departamento de Cirurgia e Fisiologia, Faculdade de Medicina da Universidade do Porto, Porto
| | - Artur M S Silva
- LAQV/REQUIMTE, Department of Chemistry, University of Aveiro, Aveiro
| | - Visith Thongboonkerd
- Medical Proteomics Unit, Office for Research and Development, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
| | - Rui Vitorino
- iBiMED, Department of Medical Sciences, University of Aveiro, Aveiro.,LAQV/REQUIMTE, Department of Chemistry, University of Aveiro, Aveiro.,UnIC, Departamento de Cirurgia e Fisiologia, Faculdade de Medicina da Universidade do Porto, Porto
| |
|
25
|
Chen XG, Zhang W, Yang X, Li C, Chen H. ACP-DA: Improving the Prediction of Anticancer Peptides Using Data Augmentation. Front Genet 2021; 12:698477. [PMID: 34276801 PMCID: PMC8279753 DOI: 10.3389/fgene.2021.698477] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 04/21/2021] [Accepted: 06/07/2021] [Indexed: 12/09/2022]
Abstract
Anticancer peptides (ACPs) have provided a promising perspective for cancer treatment, and the prediction of ACPs is very important for the discovery of new cancer treatment drugs. It is time consuming and expensive to use experimental methods to identify ACPs, so computational methods for ACP identification are urgently needed. There have been many effective computational methods, especially machine learning-based methods, proposed for such predictions. Most of the current machine learning methods try to find suitable features or design effective feature learning techniques to accurately represent ACPs. However, the performance of these methods can be further improved for cases with insufficient numbers of samples. In this article, we propose an ACP prediction model called ACP-DA (Data Augmentation), which uses data augmentation for insufficient samples to improve the prediction performance. In our method, to better exploit the information of peptide sequences, peptide sequences are represented by integrating binary profile features and AAindex features, and then the samples in the training set are augmented in the feature space. After data augmentation, the samples are used to train the machine learning model, which is used to predict ACPs. The performance of ACP-DA exceeds that of existing methods, and ACP-DA achieves better performance in the prediction of ACPs compared with a method without data augmentation. The proposed method is available at http://github.com/chenxgscuec/ACPDA.
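The abstract above says that training samples are augmented in the feature space but does not say how. The sketch below is therefore only one plausible reading: it assumes a simple additive Gaussian-noise scheme in which each training vector yields a few perturbed copies that keep the original label. The function name `augment_features` and the parameters `copies`, `sigma` and `seed` are hypothetical and do not come from the paper.

```python
# Hypothetical sketch of feature-space data augmentation by additive noise.
# The cited abstract does not specify the scheme; Gaussian noise is an
# assumption made for illustration only.
import random


def augment_features(samples, labels, copies=2, sigma=0.01, seed=0):
    """Return the original samples plus `copies` noisy variants of each.

    samples: list of numeric feature vectors (lists of floats).
    labels:  list of class labels, one per sample.
    Each noisy variant keeps the label of the sample it was derived from.
    """
    rng = random.Random(seed)  # local generator keeps results reproducible
    aug_x = [list(x) for x in samples]
    aug_y = list(labels)
    for x, y in zip(samples, labels):
        for _ in range(copies):
            aug_x.append([v + rng.gauss(0.0, sigma) for v in x])
            aug_y.append(y)
    return aug_x, aug_y


if __name__ == "__main__":
    X = [[0.1, 0.2], [0.3, 0.4]]  # made-up feature vectors
    y = [1, 0]
    X_aug, y_aug = augment_features(X, y)
    print(len(X_aug), y_aug)
```

Augmentation of this kind should be applied to the training split only, so that test performance still reflects unmodified peptides.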
Affiliation(s)
- Xian-Gan Chen
- School of Biomedical Engineering, South-Central University for Nationalities, Wuhan, China.,Hubei Key Laboratory of Medical Information Analysis and Tumor Diagnosis & Treatment, South-Central University for Nationalities, Wuhan, China.,Key Laboratory of Cognitive Science (South-Central University for Nationalities), State Ethnic Affairs Commission, Wuhan, China
| | - Wen Zhang
- College of Informatics, Huazhong Agricultural University, Wuhan, China.,Hubei Engineering Technology Research Center of Agricultural Big Data, Wuhan, China
| | - Xiaofei Yang
- School of Biomedical Engineering, South-Central University for Nationalities, Wuhan, China.,Hubei Key Laboratory of Medical Information Analysis and Tumor Diagnosis & Treatment, South-Central University for Nationalities, Wuhan, China.,Key Laboratory of Cognitive Science (South-Central University for Nationalities), State Ethnic Affairs Commission, Wuhan, China
| | - Chenhong Li
- School of Biomedical Engineering, South-Central University for Nationalities, Wuhan, China.,Hubei Key Laboratory of Medical Information Analysis and Tumor Diagnosis & Treatment, South-Central University for Nationalities, Wuhan, China.,Key Laboratory of Cognitive Science (South-Central University for Nationalities), State Ethnic Affairs Commission, Wuhan, China
| | - Hengling Chen
- School of Biomedical Engineering, South-Central University for Nationalities, Wuhan, China.,Hubei Key Laboratory of Medical Information Analysis and Tumor Diagnosis & Treatment, South-Central University for Nationalities, Wuhan, China.,Key Laboratory of Cognitive Science (South-Central University for Nationalities), State Ethnic Affairs Commission, Wuhan, China
| |
|
26
|
Huang KY, Tseng YJ, Kao HJ, Chen CH, Yang HH, Weng SL. Identification of subtypes of anticancer peptides based on sequential features and physicochemical properties. Sci Rep 2021; 11:13594. [PMID: 34193950 PMCID: PMC8245499 DOI: 10.1038/s41598-021-93124-9] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 11/10/2020] [Accepted: 06/08/2021] [Indexed: 11/25/2022]
Abstract
Anticancer peptides (ACPs) are a kind of bioactive peptide that could be used as a novel type of anticancer drug with several advantages over chemistry-based drugs, including high specificity, strong tumor penetration capacity, and low toxicity to normal cells. As the number of experimentally verified bioactive peptides has increased significantly, various in silico approaches are imperative for investigating the characteristics of ACPs. However, methods for investigating the differences in physicochemical properties of ACPs are lacking. In this study, we compared the N- and C-terminal amino acid composition of each peptide and found three major subtypes of ACPs, defined by the distribution of positively charged residues. For the first time, we were motivated to develop a two-step machine learning model for identification of the subtypes of ACPs, which classifies the input data into the corresponding group before applying the classifier. Further, to improve the predictive power, hybrid feature sets were considered for prediction. Evaluation by five-fold cross-validation showed that the two-step model trained with sequence-based features and physicochemical properties was most effective in discriminating between ACPs and non-ACPs. The two-step model trained with the hybrid features performed well, with a sensitivity of 86.75%, a specificity of 85.75%, an accuracy of 86.08%, and a Matthews Correlation Coefficient value of 0.703. Furthermore, the model also performed consistently on the independent testing set, with a sensitivity of 77.6%, a specificity of 94.74%, an accuracy of 88.99% and an MCC value of 0.75. Finally, the two-step model has been implemented as a web-based tool, namely iDACP, which is now freely available at http://mer.hc.mmh.org.tw/iDACP/.
Affiliation(s)
- Kai-Yao Huang
- Department of Medical Research, Hsinchu Mackay Memorial Hospital, Hsinchu City, 300, Taiwan
- Department of Medicine, Mackay Medical College, New Taipei City, 252, Taiwan
| | - Yi-Jhan Tseng
- Department of Medical Research, Hsinchu Mackay Memorial Hospital, Hsinchu City, 300, Taiwan
| | - Hui-Ju Kao
- Department of Medical Research, Hsinchu Mackay Memorial Hospital, Hsinchu City, 300, Taiwan
| | - Chia-Hung Chen
- Department of Medical Research, Hsinchu Mackay Memorial Hospital, Hsinchu City, 300, Taiwan
| | - Hsiao-Hsiang Yang
- Department of Medical Research, Hsinchu Mackay Memorial Hospital, Hsinchu City, 300, Taiwan
| | - Shun-Long Weng
- Department of Medicine, Mackay Medical College, New Taipei City, 252, Taiwan
- Department of Obstetrics and Gynecology, Hsinchu Mackay Memorial Hospital, Hsinchu City, 300, Taiwan
- Mackay Junior College of Medicine, Nursing and Management College, Taipei City, 112, Taiwan
| |
|
27
|
Liscano Y, Oñate-Garzón J, Delgado JP. Peptides with Dual Antimicrobial-Anticancer Activity: Strategies to Overcome Peptide Limitations and Rational Design of Anticancer Peptides. Molecules 2020; 25:E4245. [PMID: 32947811 PMCID: PMC7570524 DOI: 10.3390/molecules25184245] [Citation(s) in RCA: 55] [Impact Index Per Article: 11.0] [Received: 07/24/2020] [Revised: 09/04/2020] [Accepted: 09/11/2020] [Indexed: 12/31/2022] Open
Abstract
Peptides are naturally produced by all organisms and exhibit a wide range of physiological, immunomodulatory, and wound healing functions. Furthermore, they can provide protection against microorganisms and tumor cells. Their multifaceted performance, high selectivity, and reduced toxicity have positioned them as effective therapeutic agents, representing a positive economic impact for pharmaceutical companies. Currently, efforts are being made to invest in the development of new peptides with antimicrobial and anticancer properties, but the poor stability of these molecules in physiological environments has created a bottleneck. Therefore, tools such as nanotechnology and in silico approaches can be applied as alternatives to try to overcome these obstacles. In silico studies provide a priori knowledge that can lead to the development of new anticancer peptides with enhanced biological activity and improved stability. This review focuses on the current status of research in peptides with dual antimicrobial-anticancer activity, including advances in computational biology using in silico analyses as a powerful tool for the study and rational design of these types of peptides.
Affiliation(s)
- Yamil Liscano
- Research Group of Chemical and Biotechnology, Faculty of Basic Sciences, Universidad Santiago de Cali, 760035 Cali, Colombia
- Research Group of Genetics, Regeneration and Cancer, Institute of Biology, Universidad de Antioquia, 050010 Medellin, Colombia
| | - Jose Oñate-Garzón
- Research Group of Chemical and Biotechnology, Faculty of Basic Sciences, Universidad Santiago de Cali, 760035 Cali, Colombia
| | - Jean Paul Delgado
- Research Group of Genetics, Regeneration and Cancer, Institute of Biology, Universidad de Antioquia, 050010 Medellin, Colombia
| |
|