1
|
Oliveira DF, Coleone AP, Lima FCDA, Batagin-Neto A. Reactivity of amino acids and short peptide sequences: identifying bioactive compounds via DFT calculations. Mol Divers 2025; 29:489-502. [PMID: 38700810 DOI: 10.1007/s11030-024-10868-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/16/2023] [Accepted: 03/31/2024] [Indexed: 02/02/2025]
Abstract
Bioactive peptides are short amino acid sequences that play important roles in various physiological processes, including antioxidant and protective effects. These compounds can be obtained through protein hydrolysis and have a wide range of potential applications in a variety of areas. However, despite the potential of these compounds, more in-depth knowledge is still necessary to better understand details regarding their chemical reactivity and electronic properties. In this study, we used molecular modeling techniques to investigate the electronic structure of isolated amino acids (AA) and short peptide sequences. Details on the relative alignments between the frontier electronic levels, local chemical reactivity and donor-acceptor properties of the 20 primary amino acids and some di- and tripeptides were evaluated in the framework of the density functional theory (DFT). Our results suggest that the electronic properties of isolated amino acids can be used to interpret the reactivity of short sequences. We found that aromatic and charged amino acids, as well as Methionine, play a key role in determining the local reactivity of peptides, in agreement with experimental data. Our analyses also allowed us to identify the influence of the relative position of AA and terminations on the local reactivity of the sequences, which can guide experimental studies and help to propose/evaluate possible mechanisms of action. In summary, our data indicate that the position of active sites of polypeptides can be predicted from short sequences, providing a promising strategy for the synthesis and bioprospection of new optimized compounds.
Collapse
Affiliation(s)
- Daiane F Oliveira
- School of Pharmaceutical Sciences, EBB-MP, São Paulo State University (UNESP), Araraquara, SP, Brazil
| | - Alex P Coleone
- School of Sciences, POSMAT, São Paulo State University (UNESP), Bauru, SP, Brazil
| | - Filipe C D A Lima
- Federal Institute of Education, Science and Technology of São Paulo (IFSP), Matão, SP, Brazil
| | - Augusto Batagin-Neto
- Institute of Sciences and Engineering, São Paulo State University (UNESP), Itapeva, SP, Brazil.
| |
Collapse
|
2
|
Vrbnjak K, Sewduth RN. Recent Advances in Peptide Drug Discovery: Novel Strategies and Targeted Protein Degradation. Pharmaceutics 2024; 16:1486. [PMID: 39598608 PMCID: PMC11597556 DOI: 10.3390/pharmaceutics16111486] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/01/2024] [Revised: 11/19/2024] [Accepted: 11/20/2024] [Indexed: 11/29/2024] Open
Abstract
Recent technological advancements, including computer-assisted drug discovery, gene-editing techniques, and high-throughput screening approaches, have greatly expanded the palette of methods for the discovery of peptides available to researchers. These emerging strategies, driven by recent advances in bioinformatics and multi-omics, have significantly improved the efficiency of peptide drug discovery when compared with traditional in vitro and in vivo methods, cutting costs and improving their reliability. An added benefit of peptide-based drugs is the ability to precisely target protein-protein interactions, which are normally a particularly challenging aspect of drug discovery. Another recent breakthrough in this field is targeted protein degradation through proteolysis-targeting chimeras. These revolutionary compounds represent a noteworthy advancement over traditional small-molecule inhibitors due to their unique mechanism of action, which allows for the degradation of specific proteins with unprecedented specificity. The inclusion of a peptide as a protein-of-interest-targeting moiety allows for improved versatility and the possibility of targeting otherwise undruggable proteins. In this review, we discuss various novel wet-lab and computational multi-omic methods for peptide drug discovery, provide an overview of therapeutic agents discovered through these cutting-edge techniques, and discuss the potential for the therapeutic delivery of peptide-based drugs.
Collapse
Affiliation(s)
- Katarina Vrbnjak
- VIB-KU Leuven Center for Cancer Biology (VIB), 3000 Leuven, Belgium
| | | |
Collapse
|
3
|
Culkins C, Adomanis R, Phan N, Robinson B, Slaton E, Lothrop E, Chen Y, Kimmel BR. Unlocking the Gates: Therapeutic Agents for Noninvasive Drug Delivery Across the Blood-Brain Barrier. Mol Pharm 2024; 21:5430-5454. [PMID: 39324552 DOI: 10.1021/acs.molpharmaceut.4c00604] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 09/27/2024]
Abstract
The blood-brain barrier (BBB) is a highly selective network of various cell types that acts as a filter between the blood and the brain parenchyma. Because of this, the BBB remains a major obstacle for drug delivery to the central nervous system (CNS). In recent years, there has been a focus on developing various modifiable platforms, such as monoclonal antibodies (mAbs), nanobodies (Nbs), peptides, and nanoparticles, as both therapeutic agents and carriers for targeted drug delivery to treat brain cancers and diseases. Methods for bypassing the BBB can be invasive or noninvasive. Invasive techniques, such as transient disruption of the BBB using low pulse electrical fields and intracerebroventricular infusion, lack specificity and have numerous safety concerns. In this review, we will focus on noninvasive transport mechanisms that offer high levels of biocompatibility, personalization, specificity and are regarded as generally safer than their invasive counterparts. Modifiable platforms can be designed to noninvasively traverse the BBB through one or more of the following pathways: passive diffusion through a physio-pathologically disrupted BBB, adsorptive-mediated transcytosis, receptor-mediated transcytosis, shuttle-mediated transcytosis, and somatic gene transfer. Through understanding the noninvasive pathways, new applications, including Chimeric Antigen Receptors T-cell (CAR-T) therapy, and approaches for drug delivery across the BBB are emerging.
Collapse
Affiliation(s)
- Courtney Culkins
- Department of Chemical and Biomolecular Engineering, The Ohio State University, Columbus, Ohio 43210, United States
| | - Roman Adomanis
- Department of Chemical and Biomolecular Engineering, The Ohio State University, Columbus, Ohio 43210, United States
| | - Nathan Phan
- Department of Chemical and Biomolecular Engineering, The Ohio State University, Columbus, Ohio 43210, United States
| | - Blaise Robinson
- Department of Chemical and Biomolecular Engineering, The Ohio State University, Columbus, Ohio 43210, United States
| | - Ethan Slaton
- Department of Chemical and Biomolecular Engineering, The Ohio State University, Columbus, Ohio 43210, United States
| | - Elijah Lothrop
- Department of Chemical and Biomolecular Engineering, The Ohio State University, Columbus, Ohio 43210, United States
| | - Yinuo Chen
- Department of Chemical and Biomolecular Engineering, The Ohio State University, Columbus, Ohio 43210, United States
| | - Blaise R Kimmel
- Department of Chemical and Biomolecular Engineering, The Ohio State University, Columbus, Ohio 43210, United States
- Center for Cancer Engineering, Ohio State University Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio 43210, United States
- Pelotonia Institute for Immuno-Oncology, Ohio State University Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio 43210, United States
| |
Collapse
|
4
|
Schaduangrat N, Khemawoot P, Jiso A, Charoenkwan P, Shoombuatong W. MetaCGRP is a high-precision meta-model for large-scale identification of CGRP inhibitors using multi-view information. Sci Rep 2024; 14:24764. [PMID: 39433940 PMCID: PMC11494111 DOI: 10.1038/s41598-024-75487-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/24/2024] [Accepted: 10/07/2024] [Indexed: 10/23/2024] Open
Abstract
Migraine is considered one of the debilitating primary headache conditions with an estimated worldwide occurrence of approximately 14-15%, contributing highly to factors responsible for global disability. Calcitonin gene-related peptide (CGRP) is a neuropeptide that plays a crucial role in the pathophysiology of migraines and thus, its inhibition can help relieve migraine symptoms. However, conventional process of CGRP drug development has been laborious and time-consuming with incurred costs exceeding one billion dollars. On the other hand, machine learning (ML)-based approaches that are capable of accurately identifying CGRP inhibitors could greatly facilitate in expediting the discovery of novel CGRP drugs. Therefore, this study proposes a novel and high-accuracy meta-model, namely MetaCGRP, that can precisely identify CGRP inhibitors. To the best of our knowledge, MetaCGRP is the first SMILES-based approach that has been developed to identify CGRP inhibitors without the use of 3D structural information. In brief, we initially employed different molecular representation methods coupled with popular ML algorithms to construct a pool of baseline models. Then, all baseline models were optimized and used to generate multi-view features. Finally, we employed the feature selection method to optimize the multi-view features and determine the best feature subset to enable the construction of the meta-model. Both cross-validation and independent tests indicated that MetaCGRP clearly outperforms several conventional ML classifiers, with accuracies of 0.898 and 0.799 on the training and independent test datasets, respectively. In addition, MetaCGRP in conjunction with molecular docking was utilized to identify five potential natural product candidates from Thai herbal pharmacopoeia and analyze their binding affinity and interactions to CGRP. To facilitate community-wide efforts in expediting the discovery of novel CGRP inhibitors, a user-friendly web server for MetaCGRP is freely available at https://pmlabqsar.pythonanywhere.com/MetaCGRP .
Collapse
Affiliation(s)
- Nalini Schaduangrat
- Center for Research Innovation and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, 10700, Thailand
| | - Phisit Khemawoot
- Chakri Naruebodindra Medical Institute, Faculty of Medicine Ramathibodi Hospital, Mahidol University, Samut Prakan, 10540, Thailand
| | - Apisada Jiso
- Chakri Naruebodindra Medical Institute, Faculty of Medicine Ramathibodi Hospital, Mahidol University, Samut Prakan, 10540, Thailand
| | - Phasit Charoenkwan
- Modern Management and Information Technology, College of Arts, Media and Technology, Chiang Mai University, Chiang Mai, 50200, Thailand.
| | - Watshara Shoombuatong
- Center for Research Innovation and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, 10700, Thailand.
| |
Collapse
|
5
|
Yang S, Xu P. LLM4THP: a computing tool to identify tumor homing peptides by molecular and sequence representation of large language model based on two-layer ensemble model strategy. Amino Acids 2024; 56:62. [PMID: 39404804 PMCID: PMC11480143 DOI: 10.1007/s00726-024-03422-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/18/2024] [Accepted: 10/04/2024] [Indexed: 10/19/2024]
Abstract
Tumor homing peptides (THPs) have a distinctive capacity to specifically attach to tumor cells, providing a promising approach for targeted cancer treatment and detection. Although THPs have the potential for significant impact, their detection by conventional methods is both time-consuming and expensive. To tackle this issue, we provide LLM4THP, an innovative computational approach that utilizes large language models (LLMs) to quickly and effectively detect THPs. LLM4THP utilizes two protein LLMs, ESM2 and Prot_T5_XL_UniRef50, to encode peptide sequences. This allows for the capture of complex patterns and relationships within the peptide data. In addition, we utilize inherent sequence characteristics such as Amino Acid Composition (AAC), Pseudo Amino Acid Composition (PAAC), Amphiphilic Pseudo Amino Acid Composition (APAAC), and Composition, Transition, and Distribution (CTD) to improve the representation of peptides. The RDKitDescriptors feature representation approach transforms peptide sequences into molecular objects and computes chemical characteristics, resulting in enhanced THP identification. The LLM4THP ensemble strategy incorporates various features into a two-layer learning architecture. The first layer consists of LightGBM, XGBoost, Random Forest, and Extremely Randomized Trees, which generate a set of meta results. The second layer utilizes Logistic Regression to further refine the identification of sequences as either THP or non-THP. LLM4THP exhibits exceptional performance compared to the most advanced methods, showcasing enhancements in accuracy, Matthew's correlation coefficient, F1 score, area under the curve, and average precision. The source code and dataset can be accessed at the following URL: https://github.com/abcair/LLM4THP.
Collapse
Affiliation(s)
- Sen Yang
- School of Computer Science and Artificial Intelligence Aliyun School of Big Data School of Software, Changzhou University, Changzhou, 213164, China
- The Affiliated Changzhou No. 2 People's Hospital of Nanjing Medical University, Changzhou, 213164, China
| | - Piao Xu
- College of Economics and Management, Nanjing Forestry University, Nanjing, 210037, China.
| |
Collapse
|
6
|
Liu S, Shi T, Yu J, Li R, Lin H, Deng K. Research on Bitter Peptides in the Field of Bioinformatics: A Comprehensive Review. Int J Mol Sci 2024; 25:9844. [PMID: 39337334 PMCID: PMC11432553 DOI: 10.3390/ijms25189844] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/16/2024] [Revised: 09/06/2024] [Accepted: 09/09/2024] [Indexed: 09/30/2024] Open
Abstract
Bitter peptides are small molecular peptides produced by the hydrolysis of proteins under acidic, alkaline, or enzymatic conditions. These peptides can enhance food flavor and offer various health benefits, with attributes such as antihypertensive, antidiabetic, antioxidant, antibacterial, and immune-regulating properties. They show significant potential in the development of functional foods and the prevention and treatment of diseases. This review introduces the diverse sources of bitter peptides and discusses the mechanisms of bitterness generation and their physiological functions in the taste system. Additionally, it emphasizes the application of bioinformatics in bitter peptide research, including the establishment and improvement of bitter peptide databases, the use of quantitative structure-activity relationship (QSAR) models to predict bitterness thresholds, and the latest advancements in classification prediction models built using machine learning and deep learning algorithms for bitter peptide identification. Future research directions include enhancing databases, diversifying models, and applying generative models to advance bitter peptide research towards deepening and discovering more practical applications.
Collapse
Affiliation(s)
| | | | | | | | - Hao Lin
- School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China; (S.L.); (T.S.); (J.Y.); (R.L.)
| | - Kejun Deng
- School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China; (S.L.); (T.S.); (J.Y.); (R.L.)
| |
Collapse
|
7
|
Arif R, Kanwal S, Ahmed S, Kabir M. A Computational Predictor for Accurate Identification of Tumor Homing Peptides by Integrating Sequential and Deep BiLSTM Features. Interdiscip Sci 2024; 16:503-518. [PMID: 38733473 DOI: 10.1007/s12539-024-00628-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/11/2023] [Revised: 03/16/2024] [Accepted: 03/27/2024] [Indexed: 05/13/2024]
Abstract
Cancer remains a severe illness, and current research indicates that tumor homing peptides (THPs) play an important part in cancer therapy. The identification of THPs can provide crucial insights for drug-discovery and pharmaceutical industries as they allow for tailored medication delivery towards cancer cells. These peptides have a high affinity enabling particular receptors present upon tumor surfaces, allowing for the creation of precision medications that reduce off-target consequences and enhance cancer patient treatment results. Wet-lab techniques are considered essential tools for studying THPs; however, they're labor-extensive and time-consuming, therefore making prediction of THPs a challenging task for the researchers. Computational-techniques, on the other hand, are considered significant tools in identifying THPs according to the sequence data. Despite many strategies have been presented to predict new THP, there is still a need to develop a robust method with higher rates of success. In this paper, we developed a novel framework, THP-DF, for accurately identifying THPs on a large-scale. Firstly, the peptide sequences are encoded through various sequential features. Secondly, each feature is passed to BiLSTM and attention layers to extract simplified deep features. Finally, an ensemble-framework is formed via integrating sequential- and deep features which are fed to a support vector machine which with 10-fold cross-validation to carry to validate the efficiency. The experimental results showed that THP-DF worked better on both [Formula: see text] and [Formula: see text] datasets by achieving accuracy of > 95% which are higher than existing predictors both datasets. This indicates that the proposed predictor could be a beneficial tool to precisely and rapidly identify THPs and will contribute to the cutting-edge cancer treatment strategies and pharmaceuticals.
Collapse
Affiliation(s)
- Roha Arif
- School of Systems and Technology, University of Management and Technology, Lahore, 54782, Pakistan
| | - Sameera Kanwal
- School of Systems and Technology, University of Management and Technology, Lahore, 54782, Pakistan
| | - Saeed Ahmed
- School of Systems and Technology, University of Management and Technology, Lahore, 54782, Pakistan
| | - Muhammad Kabir
- School of Systems and Technology, University of Management and Technology, Lahore, 54782, Pakistan.
| |
Collapse
|
8
|
Gu ZF, Hao YD, Wang TY, Cai PL, Zhang Y, Deng KJ, Lin H, Lv H. Prediction of blood-brain barrier penetrating peptides based on data augmentation with Augur. BMC Biol 2024; 22:86. [PMID: 38637801 PMCID: PMC11027412 DOI: 10.1186/s12915-024-01883-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/03/2024] [Accepted: 04/05/2024] [Indexed: 04/20/2024] Open
Abstract
BACKGROUND The blood-brain barrier serves as a critical interface between the bloodstream and brain tissue, mainly composed of pericytes, neurons, endothelial cells, and tightly connected basal membranes. It plays a pivotal role in safeguarding brain from harmful substances, thus protecting the integrity of the nervous system and preserving overall brain homeostasis. However, this remarkable selective transmission also poses a formidable challenge in the realm of central nervous system diseases treatment, hindering the delivery of large-molecule drugs into the brain. In response to this challenge, many researchers have devoted themselves to developing drug delivery systems capable of breaching the blood-brain barrier. Among these, blood-brain barrier penetrating peptides have emerged as promising candidates. These peptides had the advantages of high biosafety, ease of synthesis, and exceptional penetration efficiency, making them an effective drug delivery solution. While previous studies have developed a few prediction models for blood-brain barrier penetrating peptides, their performance has often been hampered by issue of limited positive data. RESULTS In this study, we present Augur, a novel prediction model using borderline-SMOTE-based data augmentation and machine learning. we extract highly interpretable physicochemical properties of blood-brain barrier penetrating peptides while solving the issues of small sample size and imbalance of positive and negative samples. Experimental results demonstrate the superior prediction performance of Augur with an AUC value of 0.932 on the training set and 0.931 on the independent test set. CONCLUSIONS This newly developed Augur model demonstrates superior performance in predicting blood-brain barrier penetrating peptides, offering valuable insights for drug development targeting neurological disorders. This breakthrough may enhance the efficiency of peptide-based drug discovery and pave the way for innovative treatment strategies for central nervous system diseases.
Collapse
Affiliation(s)
- Zhi-Feng Gu
- The Clinical Hospital of Chengdu Brain Science Institute, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, PR China
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 611731, PR China
| | - Yu-Duo Hao
- The Clinical Hospital of Chengdu Brain Science Institute, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, PR China
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 611731, PR China
| | - Tian-Yu Wang
- The Clinical Hospital of Chengdu Brain Science Institute, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, PR China
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 611731, PR China
| | - Pei-Ling Cai
- School of Basic Medical Sciences, Chengdu University, Chengdu, 610106, PR China
| | - Yang Zhang
- Innovative Institute of Chinese Medicine and Pharmacy, Academy for Interdiscipline, Chengdu University of Traditional Chinese Medicine, Chengdu, 610072, PR China
| | - Ke-Jun Deng
- The Clinical Hospital of Chengdu Brain Science Institute, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, PR China
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 611731, PR China
| | - Hao Lin
- The Clinical Hospital of Chengdu Brain Science Institute, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, PR China.
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 611731, PR China.
| | - Hao Lv
- The Clinical Hospital of Chengdu Brain Science Institute, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, PR China.
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 611731, PR China.
| |
Collapse
|
9
|
Yang X, Jin J, Wang R, Li Z, Wang Y, Wei L. CACPP: A Contrastive Learning-Based Siamese Network to Identify Anticancer Peptides Based on Sequence Only. J Chem Inf Model 2024; 64:2807-2816. [PMID: 37252890 DOI: 10.1021/acs.jcim.3c00297] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/01/2023]
Abstract
Anticancer peptides (ACPs) recently have been receiving increasing attention in cancer therapy due to their low consumption, few adverse side effects, and easy accessibility. However, it remains a great challenge to identify anticancer peptides via experimental approaches, requiring expensive and time-consuming experimental studies. In addition, traditional machine-learning-based methods are proposed for ACP prediction mainly depending on hand-crafted feature engineering, which normally achieves low prediction performance. In this study, we propose CACPP (Contrastive ACP Predictor), a deep learning framework based on the convolutional neural network (CNN) and contrastive learning for accurately predicting anticancer peptides. In particular, we introduce the TextCNN model to extract the high-latent features based on the peptide sequences only and exploit the contrastive learning module to learn more distinguishable feature representations to make better predictions. Comparative results on the benchmark data sets indicate that CACPP outperforms all the state-of-the-art methods in the prediction of anticancer peptides. Moreover, to intuitively show that our model has good classification ability, we visualize the dimension reduction of the features from our model and explore the relationship between ACP sequences and anticancer functions. Furthermore, we also discuss the influence of data set construction on model prediction and explore our model performance on the data sets with verified negative samples.
Collapse
Affiliation(s)
- Xuetong Yang
- School of Software, Shandong University, Jinan 250101, China
- Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan 250101, China
| | - Junru Jin
- School of Software, Shandong University, Jinan 250101, China
- Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan 250101, China
| | - Ruheng Wang
- School of Software, Shandong University, Jinan 250101, China
- Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan 250101, China
| | - Zhongshen Li
- School of Software, Shandong University, Jinan 250101, China
- Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan 250101, China
| | - Yu Wang
- School of Software, Shandong University, Jinan 250101, China
- Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan 250101, China
| | - Leyi Wei
- School of Software, Shandong University, Jinan 250101, China
- Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan 250101, China
| |
Collapse
|
10
|
Goshisht MK. Machine Learning and Deep Learning in Synthetic Biology: Key Architectures, Applications, and Challenges. ACS OMEGA 2024; 9:9921-9945. [PMID: 38463314 PMCID: PMC10918679 DOI: 10.1021/acsomega.3c05913] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 08/11/2023] [Revised: 01/19/2024] [Accepted: 01/30/2024] [Indexed: 03/12/2024]
Abstract
Machine learning (ML), particularly deep learning (DL), has made rapid and substantial progress in synthetic biology in recent years. Biotechnological applications of biosystems, including pathways, enzymes, and whole cells, are being probed frequently with time. The intricacy and interconnectedness of biosystems make it challenging to design them with the desired properties. ML and DL have a synergy with synthetic biology. Synthetic biology can be employed to produce large data sets for training models (for instance, by utilizing DNA synthesis), and ML/DL models can be employed to inform design (for example, by generating new parts or advising unrivaled experiments to perform). This potential has recently been brought to light by research at the intersection of engineering biology and ML/DL through achievements like the design of novel biological components, best experimental design, automated analysis of microscopy data, protein structure prediction, and biomolecular implementations of ANNs (Artificial Neural Networks). I have divided this review into three sections. In the first section, I describe predictive potential and basics of ML along with myriad applications in synthetic biology, especially in engineering cells, activity of proteins, and metabolic pathways. In the second section, I describe fundamental DL architectures and their applications in synthetic biology. Finally, I describe different challenges causing hurdles in the progress of ML/DL and synthetic biology along with their solutions.
Collapse
Affiliation(s)
- Manoj Kumar Goshisht
- Department of Chemistry, Natural and
Applied Sciences, University of Wisconsin—Green
Bay, Green
Bay, Wisconsin 54311-7001, United States
| |
Collapse
|
11
|
Wang D, Jin J, Li Z, Wang Y, Fan M, Liang S, Su R, Wei L. StructuralDPPIV: a novel deep learning model based on atom structure for predicting dipeptidyl peptidase-IV inhibitory peptides. Bioinformatics 2024; 40:btae057. [PMID: 38305458 PMCID: PMC10904144 DOI: 10.1093/bioinformatics/btae057] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/17/2023] [Revised: 12/07/2023] [Accepted: 01/30/2024] [Indexed: 02/03/2024] Open
Abstract
MOTIVATION Diabetes is a chronic metabolic disorder that has been a major cause of blindness, kidney failure, heart attacks, stroke, and lower limb amputation across the world. To alleviate the impact of diabetes, researchers have developed the next generation of anti-diabetic drugs, known as dipeptidyl peptidase IV inhibitory peptides (DPP-IV-IPs). However, the discovery of these promising drugs has been restricted due to the lack of effective peptide-mining tools. RESULTS Here, we presented StructuralDPPIV, a deep learning model designed for DPP-IV-IP identification, which takes advantage of both molecular graph features in amino acid and sequence information. Experimental results on the independent test dataset and two wet experiment datasets show that our model outperforms the other state-of-art methods. Moreover, to better study what StructuralDPPIV learns, we used CAM technology and perturbation experiment to analyze our model, which yielded interpretable insights into the reasoning behind prediction results. AVAILABILITY AND IMPLEMENTATION The project code is available at https://github.com/WeiLab-BioChem/Structural-DPP-IV.
Collapse
Affiliation(s)
- Ding Wang
- School of Software, Shandong University, Jinan 250101, China
| | - Junru Jin
- School of Software, Shandong University, Jinan 250101, China
| | - Zhongshen Li
- School of Software, Shandong University, Jinan 250101, China
| | - Yu Wang
- School of Software, Shandong University, Jinan 250101, China
| | - Mushuang Fan
- School of Software, Shandong University, Jinan 250101, China
| | - Sirui Liang
- School of Software, Shandong University, Jinan 250101, China
| | - Ran Su
- College of Intelligence and Computing, Tianjin University, Tianjin 300350, China
| | - Leyi Wei
- Faculty of Applied Sciences, Macao Polytechnic University, Macao 999078, China
- Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan 250101, China
| |
Collapse
|
12
|
Yu Y, Liu S, Zhang X, Yu W, Pei X, Liu L, Jin Y. Identification and prediction of milk-derived bitter taste peptides based on peptidomics technology and machine learning method. Food Chem 2024; 433:137288. [PMID: 37683467 DOI: 10.1016/j.foodchem.2023.137288] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/26/2023] [Revised: 07/19/2023] [Accepted: 08/24/2023] [Indexed: 09/10/2023]
Abstract
Bitter taste peptides (BPs) are vital for drug and nutrition research, but large-scale screening of them is still time-consuming and costly. This study developed a complete workflow for screening BPs based on peptidomics technology and machine learning method. Using an expanded dataset and a new combination of BPs' characteristic factors, a novel classification prediction model (CPM-BP) based on the Light Gradient Boosting Machine algorithm was constructed with an accuracy of 90.3 % for predicting BPs. Among 724 significantly different peptides between spoiled and fresh UHT milk, 180 potential BPs were predicted using CPM-BP and eleven of them were previously reported. One known BP (FALPQYLK) and three predicted potential BPs (FALPQYL, FFVAPFPEVFGKE, EMPFPKYP) were verified by determination of calcium mobilization of HEK293T cells expressing human bitter taste receptor T2R4 (hT2R4). Three potential BPs could activate the hT2R4 and are demonstrated to be BPs, which proved the effectiveness of CPM-BP.
Collapse
Affiliation(s)
- Yang Yu
- CAS Key Laboratory of Separation Science for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, Liaoning 116023, China
| | - Shengchi Liu
- School of Information Science and Engineering, Dalian Polytechnic University, Dalian, Liaoning 116034, China
| | - Xinchen Zhang
- School of Information Science and Engineering, Dalian Polytechnic University, Dalian, Liaoning 116034, China
| | - Wenhao Yu
- CAS Key Laboratory of Separation Science for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, Liaoning 116023, China
| | - Xiaoyan Pei
- Inner Mongolia Yili Industrial Group Co., Ltd., Hohhot, Inner Mongolia 010110, China
| | - Li Liu
- School of Information Science and Engineering, Dalian Polytechnic University, Dalian, Liaoning 116034, China.
| | - Yan Jin
- CAS Key Laboratory of Separation Science for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, Liaoning 116023, China.
| |
Collapse
|
13
|
Jiang J, Pei H, Li J, Li M, Zou Q, Lv Z. FEOpti-ACVP: identification of novel anti-coronavirus peptide sequences based on feature engineering and optimization. Brief Bioinform 2024; 25:bbae037. [PMID: 38366802 PMCID: PMC10939380 DOI: 10.1093/bib/bbae037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/08/2023] [Revised: 12/27/2023] [Accepted: 01/17/2024] [Indexed: 02/18/2024] Open
Abstract
Anti-coronavirus peptides (ACVPs) represent a relatively novel approach of inhibiting the adsorption and fusion of the virus with human cells. Several peptide-based inhibitors showed promise as potential therapeutic drug candidates. However, identifying such peptides in laboratory experiments is both costly and time consuming. Therefore, there is growing interest in using computational methods to predict ACVPs. Here, we describe a model for the prediction of ACVPs that is based on the combination of feature engineering (FE) optimization and deep representation learning. FEOpti-ACVP was pre-trained using two feature extraction frameworks. At the next step, several machine learning approaches were tested in to construct the final algorithm. The final version of FEOpti-ACVP outperformed existing methods used for ACVPs prediction and it has the potential to become a valuable tool in ACVP drug design. A user-friendly webserver of FEOpti-ACVP can be accessed at http://servers.aibiochem.net/soft/FEOpti-ACVP/.
Collapse
Affiliation(s)
- Jici Jiang
- College of Biomedical Engineering, Sichuan University, Chengdu 610065, China
| | - Hongdi Pei
- College of Biomedical Engineering, Sichuan University, Chengdu 610065, China
| | - Jiayu Li
- College of Life Science, Sichuan University, Chengdu 610065, China
| | - Mingxin Li
- College of Biomedical Engineering, Sichuan University, Chengdu 610065, China
| | - Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, China
| | - Zhibin Lv
- College of Biomedical Engineering, Sichuan University, Chengdu 610065, China
| |
Collapse
|
14
|
Grønning AGB, Schéele C. Integrating a Multi-label Deep Learning Approach with Protein Information to Compare Bioactive Peptides in Brain and Plasma. Methods Mol Biol 2024; 2758:179-195. [PMID: 38549014 DOI: 10.1007/978-1-0716-3646-6_9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/02/2024]
Abstract
Peptide therapeutics is gaining momentum. Advances in the field of peptidomics have enabled researchers to harvest vital information from various organisms and tissue types concerning peptide existence, expression and function. The development of mass spectrometry techniques for high-throughput peptide quantitation has paved the way for the identification and discovery of numerous known and novel peptides. Though much has been achieved, scientists are still facing difficulties when it comes to reducing the search space of the large mass spectrometry-generated peptidomics datasets and focusing on the subset of functionally relevant peptides. Moreover, there is currently no straightforward way to analytically compare the distributions of bioactive peptides in distinct biological samples, which may reveal much useful information when seeking to characterize tissue- or fluid-specific peptidomes. In this chapter, we demonstrate how to identify, rank, and compare predicted bioactive peptides and bioactivity distributions from extensive peptidomics datasets. To aid this task, we utilize MultiPep, a multi-label deep learning approach designed for classifying peptide bioactivities, to identify bioactive peptides. The predicted bioactivities are synergistically combined with protein information from the UniProt database, which assist in navigating through the jungle of putative therapeutic peptides and relevant peptide leads.
Collapse
Affiliation(s)
- Alexander G B Grønning
- Novo Nordisk Foundation Center for Basic Metabolic Research, University of Copenhagen, Copenhagen, Denmark.
| | - Camilla Schéele
- Novo Nordisk Foundation Center for Basic Metabolic Research, University of Copenhagen, Copenhagen, Denmark
| |
Collapse
|
15
|
Ahmad B, Achek A, Farooq M, Choi S. Accelerated NLRP3 inflammasome-inhibitory peptide design using a recurrent neural network model and molecular dynamics simulations. Comput Struct Biotechnol J 2023; 21:4825-4835. [PMID: 37854633 PMCID: PMC10579963 DOI: 10.1016/j.csbj.2023.09.038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/24/2023] [Revised: 09/27/2023] [Accepted: 09/27/2023] [Indexed: 10/20/2023] Open
Abstract
Anomalous NLRP3 inflammasome responses have been linked to multiple health issues, including but not limited to atherosclerosis, diabetes, metabolic syndrome, cardiovascular disease, and neurodegenerative disease. Thus, targeting NLRP3 and modulating its associated immune response might be a promising strategy for developing new anti-inflammatory drugs. Herein, we report a computational method for de novo peptide design for targeting NLRP3 inflammasomes. The described method leverages a long-short-term memory (LSTM) network based on a recurrent neural network (RNN) to model a valuable latent space of molecules. The resulting classifiers are utilized to guide the selection of molecules generated by the model based on circular dichroism spectra and physicochemical features derived from high-throughput molecular dynamics simulations. Of the experimentally tested sequences, 60% of the peptides showed NLRP3-mediated inhibition of IL-1β and IL-18. One peptide displayed high potency against NLRP3-mediated IL-1β inhibition. However, NLRC4 and AIM2 inflammasome-mediated IL-1β secretion was uninterrupted by this peptide, demonstrating its selectivity toward the NLRP3 inflammasome. Overall, these results indicate that deep learning and molecular dynamics can accelerate the discovery of NLRP3 inhibitors with potent and selective activity.
Collapse
Affiliation(s)
- Bilal Ahmad
- Department of Molecular Science and Technology, Ajou University, Suwon 16499, South Korea
- S&K Therapeutics, Ajou University, Campus Plaza 418, Worldcup-ro 199, Yeongtong-gu, Suwon 16502, South Korea
| | - Asma Achek
- Department of Molecular Science and Technology, Ajou University, Suwon 16499, South Korea
- Technology Development Platform, Institut Pasteur Korea, Seongnam 13488, Soouth Korea
| | - Mariya Farooq
- Department of Molecular Science and Technology, Ajou University, Suwon 16499, South Korea
- S&K Therapeutics, Ajou University, Campus Plaza 418, Worldcup-ro 199, Yeongtong-gu, Suwon 16502, South Korea
| | - Sangdun Choi
- Department of Molecular Science and Technology, Ajou University, Suwon 16499, South Korea
- S&K Therapeutics, Ajou University, Campus Plaza 418, Worldcup-ro 199, Yeongtong-gu, Suwon 16502, South Korea
| |
Collapse
|
16
|
Ma C, Wolfinger R. A prediction model for blood-brain barrier penetrating peptides based on masked peptide transformers with dynamic routing. Brief Bioinform 2023; 24:bbad399. [PMID: 37985456 DOI: 10.1093/bib/bbad399] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/20/2023] [Revised: 09/26/2023] [Accepted: 10/17/2023] [Indexed: 11/22/2023] Open
Abstract
Blood-brain barrier penetrating peptides (BBBPs) are short peptide sequences that possess the ability to traverse the selective blood-brain interface, making them valuable drug candidates or carriers for various payloads. However, the in vivo or in vitro validation of BBBPs is resource-intensive and time-consuming, driving the need for accurate in silico prediction methods. Unfortunately, the scarcity of experimentally validated BBBPs hinders the efficacy of current machine-learning approaches in generating reliable predictions. In this paper, we present DeepB3P3, a novel framework for BBBPs prediction. Our contribution encompasses four key aspects. Firstly, we propose a novel deep learning model consisting of a transformer encoder layer, a convolutional network backbone, and a capsule network classification head. This integrated architecture effectively learns representative features from peptide sequences. Secondly, we introduce masked peptides as a powerful data augmentation technique to compensate for small training set sizes in BBBP prediction. Thirdly, we develop a novel threshold-tuning method to handle imbalanced data by approximating the optimal decision threshold using the training set. Lastly, DeepB3P3 provides an accurate estimation of the uncertainty level associated with each prediction. Through extensive experiments, we demonstrate that DeepB3P3 achieves state-of-the-art accuracy of up to 98.31% on a benchmarking dataset, solidifying its potential as a promising computational tool for the prediction and discovery of BBBPs.
Collapse
Affiliation(s)
- Chunwei Ma
- JMP Statistical Discovery, LLC, Cary, 27513, NC, USA
- Department of Computer Science and Engineering, University at Buffalo, Buffalo, 14260, NY, USA
| | | |
Collapse
|
17
|
Li Z, Jin J, He W, Long W, Yu H, Gao X, Nakai K, Zou Q, Wei L. CoraL: interpretable contrastive meta-learning for the prediction of cancer-associated ncRNA-encoded small peptides. Brief Bioinform 2023; 24:bbad352. [PMID: 37861173 DOI: 10.1093/bib/bbad352] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/30/2023] [Revised: 08/29/2023] [Accepted: 09/17/2023] [Indexed: 10/21/2023] Open
Abstract
NcRNA-encoded small peptides (ncPEPs) have recently emerged as promising targets and biomarkers for cancer immunotherapy. Therefore, identifying cancer-associated ncPEPs is crucial for cancer research. In this work, we propose CoraL, a novel supervised contrastive meta-learning framework for predicting cancer-associated ncPEPs. Specifically, the proposed meta-learning strategy enables our model to learn meta-knowledge from different types of peptides and train a promising predictive model even with few labeled samples. The results show that our model is capable of making high-confidence predictions on unseen cancer biomarkers with only five samples, potentially accelerating the discovery of novel cancer biomarkers for immunotherapy. Moreover, our approach remarkably outperforms existing deep learning models on 15 cancer-associated ncPEPs datasets, demonstrating its effectiveness and robustness. Interestingly, our model exhibits outstanding performance when extended for the identification of short open reading frames derived from ncPEPs, demonstrating the strong prediction ability of CoraL at the transcriptome level. Importantly, our feature interpretation analysis discovers unique sequential patterns as the fingerprint for each cancer-associated ncPEPs, revealing the relationship among certain cancer biomarkers that are validated by relevant literature and motif comparison. Overall, we expect CoraL to be a useful tool to decipher the pathogenesis of cancer and provide valuable information for cancer research. The dataset and source code of our proposed method can be found at https://github.com/Johnsunnn/CoraL.
Collapse
Affiliation(s)
- Zhongshen Li
- School of Software, Shandong University, Jinan 250101, China
- Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan 250101, China
| | - Junru Jin
- School of Software, Shandong University, Jinan 250101, China
- Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan 250101, China
| | - Wenjia He
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
| | - Wentao Long
- School of Software, Shandong University, Jinan 250101, China
- Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan 250101, China
| | - Haoqing Yu
- School of Software, Shandong University, Jinan 250101, China
- Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan 250101, China
| | - Xin Gao
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
| | - Kenta Nakai
- Department of Computational Biology and Medical Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa-shi, Chiba 277-8562, Japan
- Human Genome Center, The Institute of Medical Science, The University of Tokyo, 4-6-1 Shirokanedai Minato-ku, Tokyo 108-8639, Japan
| | - Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, 610054, China
| | - Leyi Wei
- School of Software, Shandong University, Jinan 250101, China
- Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan 250101, China
| |
Collapse
|
18
|
Li X, Yang Q, Luo G, Xu L, Dong W, Wang W, Dong S, Wang K, Xuan P, Gao X. SAGDTI: self-attention and graph neural network with multiple information representations for the prediction of drug-target interactions. BIOINFORMATICS ADVANCES 2023; 3:vbad116. [PMID: 38282612 PMCID: PMC10818136 DOI: 10.1093/bioadv/vbad116] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 04/21/2023] [Revised: 07/31/2023] [Accepted: 08/24/2023] [Indexed: 01/30/2024]
Abstract
Motivation Accurate identification of target proteins that interact with drugs is a vital step in silico, which can significantly foster the development of drug repurposing and drug discovery. In recent years, numerous deep learning-based methods have been introduced to treat drug-target interaction (DTI) prediction as a classification task. The output of this task is binary identification suggesting the absence or presence of interactions. However, existing studies often (i) neglect the unique molecular attributes when embedding drugs and proteins, and (ii) determine the interaction of drug-target pairs without considering biological interaction information. Results In this study, we propose an end-to-end attention-derived method based on the self-attention mechanism and graph neural network, termed SAGDTI. The aim of this method is to overcome the aforementioned drawbacks in the identification of DTI. SAGDTI is the first method to sufficiently consider the unique molecular attribute representations for both drugs and targets in the input form of the SMILES sequences and three-dimensional structure graphs. In addition, our method aggregates the feature attributes of biological information between drugs and targets through multi-scale topologies and diverse connections. Experimental results illustrate that SAGDTI outperforms existing prediction models, which benefit from the unique molecular attributes embedded by atom-level attention and biological interaction information representation aggregated by node-level attention. Moreover, a case study on severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) shows that our model is a powerful tool for identifying DTIs in real life. Availability and implementation The data and codes underlying this article are available in Github at https://github.com/lixiaokun2020/SAGDTI.
Collapse
Affiliation(s)
- Xiaokun Li
- School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
- Postdoctoral Program of Heilongjiang Hengxun Technology Co., Ltd., Harbin 150090, China
| | - Qiang Yang
- School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
- Postdoctoral Program of Heilongjiang Hengxun Technology Co., Ltd., Harbin 150090, China
| | - Gongning Luo
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
| | - Long Xu
- School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
- Postdoctoral Program of Heilongjiang Hengxun Technology Co., Ltd., Harbin 150090, China
| | - Weihe Dong
- Postdoctoral Program of Heilongjiang Hengxun Technology Co., Ltd., Harbin 150090, China
- College of Computer and Control Engineering, Northeast Forestry University, Harbin 150040, China
| | - Wei Wang
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
| | - Suyu Dong
- College of Computer and Control Engineering, Northeast Forestry University, Harbin 150040, China
| | - Kuanquan Wang
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
| | - Ping Xuan
- School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
- Department of Computer Science, School of Engineering, Shantou University, Shantou 515063, China
| | - Xin Gao
- Computer, Electrical and Mathematical Sciences & Engineering Division, King Abdullah University of Science and Technology, 4700 KAUST, Thuwal 23955, Saudi Arabia
| |
Collapse
|
19
|
Guan J, Yao L, Chung CR, Chiang YC, Lee TY. StackTHPred: Identifying Tumor-Homing Peptides through GBDT-Based Feature Selection with Stacking Ensemble Architecture. Int J Mol Sci 2023; 24:10348. [PMID: 37373494 DOI: 10.3390/ijms241210348] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/25/2023] [Revised: 05/31/2023] [Accepted: 06/02/2023] [Indexed: 06/29/2023] Open
Abstract
One of the major challenges in cancer therapy lies in the limited targeting specificity exhibited by existing anti-cancer drugs. Tumor-homing peptides (THPs) have emerged as a promising solution to this issue, due to their capability to specifically bind to and accumulate in tumor tissues while minimally impacting healthy tissues. THPs are short oligopeptides that offer a superior biological safety profile, with minimal antigenicity, and faster incorporation rates into target cells/tissues. However, identifying THPs experimentally, using methods such as phage display or in vivo screening, is a complex, time-consuming task, hence the need for computational methods. In this study, we proposed StackTHPred, a novel machine learning-based framework that predicts THPs using optimal features and a stacking architecture. With an effective feature selection algorithm and three tree-based machine learning algorithms, StackTHPred has demonstrated advanced performance, surpassing existing THP prediction methods. It achieved an accuracy of 0.915 and a 0.831 Matthews Correlation Coefficient (MCC) score on the main dataset, and an accuracy of 0.883 and a 0.767 MCC score on the small dataset. StackTHPred also offers favorable interpretability, enabling researchers to better understand the intrinsic characteristics of THPs. Overall, StackTHPred is beneficial for both the exploration and identification of THPs and facilitates the development of innovative cancer therapies.
Collapse
Affiliation(s)
- Jiahui Guan
- School of Medicine, The Chinese University of Hong Kong (Shenzhen) 2001 Longxiang Road, Shenzhen 518172, China
| | - Lantian Yao
- Kobilka Institute of Innovative Drug Discovery, School of Medicine, The Chinese University of Hong Kong (Shenzhen), 2001 Longxiang Road, Shenzhen 518172, China
- School of Science and Engineering, The Chinese University of Hong Kong (Shenzhen), 2001 Longxiang Road, Shenzhen 518172, China
| | - Chia-Ru Chung
- Kobilka Institute of Innovative Drug Discovery, School of Medicine, The Chinese University of Hong Kong (Shenzhen), 2001 Longxiang Road, Shenzhen 518172, China
| | - Ying-Chih Chiang
- School of Medicine, The Chinese University of Hong Kong (Shenzhen) 2001 Longxiang Road, Shenzhen 518172, China
- Kobilka Institute of Innovative Drug Discovery, School of Medicine, The Chinese University of Hong Kong (Shenzhen), 2001 Longxiang Road, Shenzhen 518172, China
| | - Tzong-Yi Lee
- Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu 300, Taiwan
| |
Collapse
|
20
|
Dong W, Yang Q, Wang J, Xu L, Li X, Luo G, Gao X. Multi-modality attribute learning-based method for drug-protein interaction prediction based on deep neural network. Brief Bioinform 2023; 24:7145903. [PMID: 37114624 DOI: 10.1093/bib/bbad161] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/05/2022] [Revised: 03/19/2023] [Accepted: 04/02/2023] [Indexed: 04/29/2023] Open
Abstract
Identification of active candidate compounds for target proteins, also called drug-protein interaction (DPI) prediction, is an essential but time-consuming and expensive step, which leads to fostering the development of drug discovery. In recent years, deep network-based learning methods were frequently proposed in DPIs due to their powerful capability of feature representation. However, the performance of existing DPI methods is still limited by insufficiently labeled pharmacological data and neglected intermolecular information. Therefore, overcoming these difficulties to perfect the performance of DPIs is an urgent challenge for researchers. In this article, we designed an innovative 'multi-modality attributes' learning-based framework for DPIs with molecular transformer and graph convolutional networks, termed, multi-modality attributes (MMA)-DPI. Specifically, intermolecular sub-structural information and chemical semantic representations were extracted through an augmented transformer module from biomedical data. A tri-layer graph convolutional neural network module was applied to associate the neighbor topology information and learn the condensed dimensional features by aggregating a heterogeneous network that contains multiple biological representations of drugs, proteins, diseases and side effects. Then, the learned representations were taken as the input of a fully connected neural network module to further integrate them in molecular and topological space. Finally, the attribute representations were fused with adaptive learning weights to calculate the interaction score for the DPIs tasks. MMA-DPI was evaluated in different experimental conditions and the results demonstrate that the proposed method achieved higher performance than existing state-of-the-art frameworks.
Collapse
Affiliation(s)
- Weihe Dong
- College of information and Computer Engineering, Northeast Forestry University, Hexing Road, 150040, Harbin, China
| | - Qiang Yang
- School of Computer Science and Technology, Heilongjiang University, Xuefu Road, 150080, Harbin, China
- Postdoctoral Program of Heilongjiang Hengxun Technology Co., Ltd., Xuefu Road, 150080, Harbin, China
| | - Jian Wang
- College of information and Computer Engineering, Northeast Forestry University, Hexing Road, 150040, Harbin, China
| | - Long Xu
- School of Computer Science and Technology, Heilongjiang University, Xuefu Road, 150080, Harbin, China
- Postdoctoral Program of Heilongjiang Hengxun Technology Co., Ltd., Xuefu Road, 150080, Harbin, China
| | - Xiaokun Li
- School of Computer Science and Technology, Heilongjiang University, Xuefu Road, 150080, Harbin, China
- Postdoctoral Program of Heilongjiang Hengxun Technology Co., Ltd., Xuefu Road, 150080, Harbin, China
| | - Gongning Luo
- Computer, Electrical and Mathematical Sciences & Engineering Division, King Abdullah University of Science and Technology, 4700 KAUST, Thuwal 23955, Saudi Arabia
- School of Computer Science and Technology, Harbin Institute of Technology, West Dazhi Street, 150001, Harbin, China
| | - Xin Gao
- Computer, Electrical and Mathematical Sciences & Engineering Division, King Abdullah University of Science and Technology, 4700 KAUST, Thuwal 23955, Saudi Arabia
| |
Collapse
|
21
|
Jiang J, Li J, Li J, Pei H, Li M, Zou Q, Lv Z. A Machine Learning Method to Identify Umami Peptide Sequences by Using Multiplicative LSTM Embedded Features. Foods 2023; 12:foods12071498. [PMID: 37048319 PMCID: PMC10094688 DOI: 10.3390/foods12071498] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/26/2023] [Revised: 03/24/2023] [Accepted: 03/30/2023] [Indexed: 04/05/2023] Open
Abstract
Umami peptides enhance the umami taste of food and have good food processing properties, nutritional value, and numerous potential applications. Wet testing for the identification of umami peptides is a time-consuming and expensive process. Here, we report the iUmami-DRLF that uses a logistic regression (LR) method solely based on the deep learning pre-trained neural network feature extraction method, unified representation (UniRep based on multiplicative LSTM), for feature extraction from the peptide sequences. The findings demonstrate that deep learning representation learning significantly enhanced the capability of models in identifying umami peptides and predictive precision solely based on peptide sequence information. The newly validated taste sequences were also used to test the iUmami-DRLF and other predictors, and the result indicates that the iUmami-DRLF has better robustness and accuracy and remains valid at higher probability thresholds. The iUmami-DRLF method can aid further studies on enhancing the umami flavor of food for satisfying the need for an umami-flavored diet.
Collapse
Affiliation(s)
- Jici Jiang
- College of Biomedical Engineering, Sichuan University, Chengdu 610065, China
| | - Jiayu Li
- College of Life Science, Sichuan University, Chengdu 610065, China
| | - Junxian Li
- College of Biomedical Engineering, Sichuan University, Chengdu 610065, China
| | - Hongdi Pei
- College of Biomedical Engineering, Sichuan University, Chengdu 610065, China
- Wu Yuzhang Honors College, Sichuan University, Chengdu 610065, China
| | - Mingxin Li
- College of Biomedical Engineering, Sichuan University, Chengdu 610065, China
| | - Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, China
| | - Zhibin Lv
- College of Biomedical Engineering, Sichuan University, Chengdu 610065, China
| |
Collapse
|
22
|
Li Z, Gao E, Zhou J, Han W, Xu X, Gao X. Applications of deep learning in understanding gene regulation. CELL REPORTS METHODS 2023; 3:100384. [PMID: 36814848 PMCID: PMC9939384 DOI: 10.1016/j.crmeth.2022.100384] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/22/2023]
Abstract
Gene regulation is a central topic in cell biology. Advances in omics technologies and the accumulation of omics data have provided better opportunities for gene regulation studies than ever before. For this reason deep learning, as a data-driven predictive modeling approach, has been successfully applied to this field during the past decade. In this article, we aim to give a brief yet comprehensive overview of representative deep-learning methods for gene regulation. Specifically, we discuss and compare the design principles and datasets used by each method, creating a reference for researchers who wish to replicate or improve existing methods. We also discuss the common problems of existing approaches and prospectively introduce the emerging deep-learning paradigms that will potentially alleviate them. We hope that this article will provide a rich and up-to-date resource and shed light on future research directions in this area.
Collapse
Affiliation(s)
- Zhongxiao Li
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
| | - Elva Gao
- The KAUST School, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
| | - Juexiao Zhou
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
| | - Wenkai Han
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
| | - Xiaopeng Xu
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
| | - Xin Gao
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
| |
Collapse
|
23
|
Zhang X, Wei L, Ye X, Zhang K, Teng S, Li Z, Jin J, Kim MJ, Sakurai T, Cui L, Manavalan B, Wei L. SiameseCPP: a sequence-based Siamese network to predict cell-penetrating peptides by contrastive learning. Brief Bioinform 2023; 24:6958502. [PMID: 36562719 DOI: 10.1093/bib/bbac545] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/29/2022] [Revised: 11/07/2022] [Accepted: 11/10/2022] [Indexed: 12/24/2022] Open
Abstract
BACKGROUND Cell-penetrating peptides (CPPs) have received considerable attention as a means of transporting pharmacologically active molecules into living cells without damaging the cell membrane, and thus hold great promise as future therapeutics. Recently, several machine learning-based algorithms have been proposed for predicting CPPs. However, most existing predictive methods do not consider the agreement (disagreement) between similar (dissimilar) CPPs and depend heavily on expert knowledge-based handcrafted features. RESULTS In this study, we present SiameseCPP, a novel deep learning framework for automated CPPs prediction. SiameseCPP learns discriminative representations of CPPs based on a well-pretrained model and a Siamese neural network consisting of a transformer and gated recurrent units. Contrastive learning is used for the first time to build a CPP predictive model. Comprehensive experiments demonstrate that our proposed SiameseCPP is superior to existing baseline models for predicting CPPs. Moreover, SiameseCPP also achieves good performance on other functional peptide datasets, exhibiting satisfactory generalization ability.
Collapse
Affiliation(s)
- Xin Zhang
- Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, China
| | - Lesong Wei
- Department of Computer Science, University of Tsukuba, Tsukuba 3058577, Japan
| | - Xiucai Ye
- Department of Computer Science, University of Tsukuba, Tsukuba 3058577, Japan
| | - Kai Zhang
- Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, China
| | - Saisai Teng
- Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, China
| | - Zhongshen Li
- Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, China
| | - Junru Jin
- Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, China
| | - Min Jae Kim
- Department of integrative Biotechnology, College of Biotechnology & Bioengineering, Sungkyunkwan University, Seobu-ro, Jangan-gu, Suwon-si, Gyeonggi-do 16419, Republic of Korea
| | - Tetsuya Sakurai
- Department of Computer Science, University of Tsukuba, Tsukuba 3058577, Japan
| | - Lizhen Cui
- Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, China
| | - Balachandran Manavalan
- Department of integrative Biotechnology, College of Biotechnology & Bioengineering, Sungkyunkwan University, Seobu-ro, Jangan-gu, Suwon-si, Gyeonggi-do 16419, Republic of Korea
| | - Leyi Wei
- Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, China
| |
Collapse
|
24
|
Sun TJ, Bu HL, Yan X, Sun ZH, Zha MS, Dong GF. LABAMPsGCN: A framework for identifying lactic acid bacteria antimicrobial peptides based on graph convolutional neural network. Front Genet 2022; 13:1062576. [PMID: 36406112 PMCID: PMC9669054 DOI: 10.3389/fgene.2022.1062576] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/06/2022] [Accepted: 10/24/2022] [Indexed: 08/01/2023] Open
Abstract
Lactic acid bacteria antimicrobial peptides (LABAMPs) are a class of active polypeptide produced during the metabolic process of lactic acid bacteria, which can inhibit or kill pathogenic bacteria or spoilage bacteria in food. LABAMPs have broad application in important practical fields closely related to human beings, such as food production, efficient agricultural planting, and so on. However, screening for antimicrobial peptides by biological experiment researchers is time-consuming and laborious. Therefore, it is urgent to develop a model to predict LABAMPs. In this work, we design a graph convolutional neural network framework for identifying of LABAMPs. We build heterogeneous graph based on amino acids, tripeptide and their relationships and learn weights of a graph convolutional network (GCN). Our GCN iteratively completes the learning of embedded words and sequence weights in the graph under the supervision of inputting sequence labels. We applied 10-fold cross-validation experiment to two training datasets and acquired accuracy of 0.9163 and 0.9379 respectively. They are higher that of other machine learning and GNN algorithms. In an independent test dataset, accuracy of two datasets is 0.9130 and 0.9291, which are 1.08% and 1.57% higher than the best methods of other online webservers.
Collapse
Affiliation(s)
- Tong-Jie Sun
- College of Computer and Information Engineering, Inner Mongolia Agricultural University, Hohhot, China
| | - He-Long Bu
- College of Computer and Information Engineering, Inner Mongolia Agricultural University, Hohhot, China
| | - Xin Yan
- College of Computer and Information Engineering, Inner Mongolia Agricultural University, Hohhot, China
| | - Zhi-Hong Sun
- College of Food Science and Engineering, Inner Mongolia Agricultural University, Hohhot, China
| | - Mu-Su Zha
- College of Food Science and Engineering, Inner Mongolia Agricultural University, Hohhot, China
| | - Gai-Fang Dong
- College of Computer and Information Engineering, Inner Mongolia Agricultural University, Hohhot, China
| |
Collapse
|
25
|
Improved prediction and characterization of blood-brain barrier penetrating peptides using estimated propensity scores of dipeptides. J Comput Aided Mol Des 2022; 36:781-796. [DOI: 10.1007/s10822-022-00476-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/03/2022] [Accepted: 09/15/2022] [Indexed: 11/27/2022]
|
26
|
Liu J, Li M, Chen X. AntiMF: A deep learning framework for predicting anticancer peptides based on multi-view feature extraction. Methods 2022; 207:38-43. [PMID: 36100141 DOI: 10.1016/j.ymeth.2022.07.017] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/15/2022] [Revised: 07/20/2022] [Accepted: 07/26/2022] [Indexed: 01/10/2023] Open
Abstract
In recent years, anticancer peptides have emerged as a new viable option in cancer therapy, with the ability to overcome the considerable side effects and poor outcomes of standard cancer therapies. Accurate anticancer peptide identification can facilitate its finding and speed up its application in treating cancer. However, many recent approaches are based on machine learning, which not only restricts the representation ability of the models but also requires a complex hand-crafted feature extraction process. Here, we propose AntiMF, a deep learning model that utilizes multi-view mechanism based on different feature extraction models. Comparative results show that our model has a better performance than the state-of-the-art methods in the prediction of anticancer peptides. By using an ensemble learning framework to extract representation, AntiMF can capture the different dimensional information, which can make model representation more complete. Moreover, we visualize what AntiMF learns on one of its ensemble models to intuitively show the effectivity of our model, indicating that AntiMF has the great potential ability to be an effective and useful model to identify anticancer peptides accurately.
Collapse
Affiliation(s)
- Jingjing Liu
- Eye Hospital, The First Affiliated Hospital of Harbin Medical University, Harbin 150001, China
| | - Minghao Li
- Beidahuang Industry Group General Hospital, Harbin 150001, China
| | - Xin Chen
- Eye Hospital, The First Affiliated Hospital of Harbin Medical University, Harbin 150001, China; Department of Neurosurgical Laboratory, The First Affiliated Hospital of Harbin Medical University, Harbin 150001, China.
| |
Collapse
|
27
|
Identify Bitter Peptides by Using Deep Representation Learning Features. Int J Mol Sci 2022; 23:ijms23147877. [PMID: 35887225 PMCID: PMC9315524 DOI: 10.3390/ijms23147877] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/31/2022] [Revised: 07/01/2022] [Accepted: 07/14/2022] [Indexed: 02/04/2023] Open
Abstract
A bitter taste often identifies hazardous compounds and it is generally avoided by most animals and humans. Bitterness of hydrolyzed proteins is caused by the presence of bitter peptides. To improve palatability, bitter peptides need to be identified experimentally in a time-consuming and expensive process, before they can be removed or degraded. Here, we report the development of a machine learning prediction method, iBitter-DRLF, which is based on a deep learning pre-trained neural network feature extraction method. It uses three sequence embedding techniques, soft symmetric alignment (SSA), unified representation (UniRep), and bidirectional long short-term memory (BiLSTM). These were initially combined into various machine learning algorithms to build several models. After optimization, the combined features of UniRep and BiLSTM were finally selected, and the model was built in combination with a light gradient boosting machine (LGBM). The results showed that the use of deep representation learning greatly improves the ability of the model to identify bitter peptides, achieving accurate prediction based on peptide sequence data alone. By helping to identify bitter peptides, iBitter-DRLF can help research into improving the palatability of peptide therapeutics and dietary supplements in the future. A webserver is available, too.
Collapse
|
28
|
Assessing sequence-based protein-protein interaction predictors for use in therapeutic peptide engineering. Sci Rep 2022; 12:9610. [PMID: 35688894 PMCID: PMC9187631 DOI: 10.1038/s41598-022-13227-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/01/2021] [Accepted: 04/25/2022] [Indexed: 12/01/2022] Open
Abstract
Engineering peptides to achieve a desired therapeutic effect through the inhibition of a specific target activity or protein interaction is a non-trivial task. Few of the existing in silico peptide design algorithms generate target-specific peptides. Instead, many methods produce peptides that achieve a desired effect through an unknown mechanism. In contrast with resource-intensive high-throughput experiments, in silico screening is a cost-effective alternative that can prune the space of candidates when engineering target-specific peptides. Using a set of FDA-approved peptides we curated specifically for this task, we assess the applicability of several sequence-based protein–protein interaction predictors as a screening tool within the context of peptide therapeutic engineering. We show that similarity-based protein–protein interaction predictors are more suitable for this purpose than the state-of-the-art deep learning methods publicly available at the time of writing. We also show that this approach is mostly useful when designing new peptides against targets for which naturally-occurring interactors are already known, and that deploying it for de novo peptide engineering tasks may require gathering additional target-specific training data. Taken together, this work offers evidence that supports the use of similarity-based protein–protein interaction predictors for peptide therapeutic engineering, especially peptide analogs.
Collapse
|
29
|
Charoenkwan P, Schaduangrat N, Lio' P, Moni MA, Manavalan B, Shoombuatong W. NEPTUNE: A novel computational approach for accurate and large-scale identification of tumor homing peptides. Comput Biol Med 2022; 148:105700. [PMID: 35715261 DOI: 10.1016/j.compbiomed.2022.105700] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/14/2022] [Revised: 05/31/2022] [Accepted: 06/04/2022] [Indexed: 11/16/2022]
Abstract
Tumor homing peptides (THPs) play a crucial role in recognizing and specifically binding to cancer cells. Although experimental approaches can facilitate the precise identification of THPs, they are usually time-consuming, labor-intensive, and not cost-effective. However, computational approaches can identify THPs by utilizing sequence information alone, thus highlighting their great potential for large-scale identification of THPs. Herein, we propose NEPTUNE, a novel computational approach for the accurate and large-scale identification of THPs from sequence information. Specifically, we constructed variant baseline models from multiple feature encoding schemes coupled with six popular machine learning algorithms. Subsequently, we comprehensively assessed and investigated the effects of these baseline models on THP prediction. Finally, the probabilistic information generated by the optimal baseline models is fed into a support vector machine-based classifier to construct the final meta-predictor (NEPTUNE). Cross-validation and independent tests demonstrated that NEPTUNE achieved superior performance for THP prediction compared with its constituent baseline models and the existing methods. Moreover, we employed the powerful SHapley additive exPlanations method to improve the interpretation of NEPTUNE and elucidate the most important features for identifying THPs. Finally, we implemented an online web server using NEPTUNE, which is available at http://pmlabstack.pythonanywhere.com/NEPTUNE. NEPTUNE could be beneficial for the large-scale identification of unknown THP candidates for follow-up experimental validation.
Collapse
Affiliation(s)
- Phasit Charoenkwan
- Modern Management and Information Technology, College of Arts, Media and Technology, Chiang Mai University, Chiang Mai, 50200, Thailand
| | - Nalini Schaduangrat
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, 10700, Thailand
| | - Pietro Lio'
- Department of Computer Science and Technology, University of Cambridge, Cambridge, CB3 0FD, UK
| | - Mohammad Ali Moni
- Artificial Intelligence & Digital Health, School of Health and Rehabilitation Sciences, Faculty of Health and Behavioural Sciences, The University of Queensland St Lucia, QLD, 4072, Australia
| | - Balachandran Manavalan
- Computational Biology and Bioinformatics Laboratory, Department of Integrative Biotechnology, College of Biotechnology and Bioengineering, Sungkyunkwan University, Suwon, 16419, Gyeonggi-do, Republic of Korea.
| | - Watshara Shoombuatong
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, 10700, Thailand.
| |
Collapse
|
30
|
Li Y, Li X, Liu Y, Yao Y, Huang G. MPMABP: A CNN and Bi-LSTM-Based Method for Predicting Multi-Activities of Bioactive Peptides. Pharmaceuticals (Basel) 2022; 15:707. [PMID: 35745625 PMCID: PMC9231127 DOI: 10.3390/ph15060707] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/22/2022] [Revised: 05/23/2022] [Accepted: 05/30/2022] [Indexed: 12/30/2022] Open
Abstract
Bioactive peptides are typically small functional peptides with 2-20 amino acid residues and play versatile roles in metabolic and biological processes. Bioactive peptides are multi-functional, so it is vastly challenging to accurately detect all their functions simultaneously. We proposed a convolution neural network (CNN) and bi-directional long short-term memory (Bi-LSTM)-based deep learning method (called MPMABP) for recognizing multi-activities of bioactive peptides. The MPMABP stacked five CNNs at different scales, and used the residual network to preserve the information from loss. The empirical results showed that the MPMABP is superior to the state-of-the-art methods. Analysis on the distribution of amino acids indicated that the lysine preferred to appear in the anti-cancer peptide, the leucine in the anti-diabetic peptide, and the proline in the anti-hypertensive peptide. The method and analysis are beneficial to recognize multi-activities of bioactive peptides.
Collapse
Affiliation(s)
- You Li
- School of Electrical Engineering, Shaoyang University, Shaoyang 422000, China; (Y.L.); (X.L.)
| | - Xueyong Li
- School of Electrical Engineering, Shaoyang University, Shaoyang 422000, China; (Y.L.); (X.L.)
| | - Yuewu Liu
- College of Information and Intelligence, Hunan Agricultural University, Changsha 410128, China;
| | - Yuhua Yao
- School of Mathematics and Statistics, Hainan Normal University, Haikou 571158, China;
| | - Guohua Huang
- School of Electrical Engineering, Shaoyang University, Shaoyang 422000, China; (Y.L.); (X.L.)
| |
Collapse
|
31
|
Guo X, Jiang Y, Zou Q. Structured Sparse Regularized TSK Fuzzy System for predicting therapeutic peptides. Brief Bioinform 2022; 23:6570018. [PMID: 35438149 DOI: 10.1093/bib/bbac135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/24/2022] [Revised: 03/19/2022] [Accepted: 03/22/2022] [Indexed: 11/13/2022] Open
Abstract
Therapeutic peptides act on the skeletal system, digestive system and blood system, have antibacterial properties and help relieve inflammation. In order to reduce the resource consumption of wet experiments for the identification of therapeutic peptides, many computational-based methods have been developed to solve the identification of therapeutic peptides. Due to the insufficiency of traditional machine learning methods in dealing with feature noise. We propose a novel therapeutic peptide identification method called Structured Sparse Regularized Takagi-Sugeno-Kang Fuzzy System on Within-Class Scatter (SSR-TSK-FS-WCS). Our method achieves good performance on multiple therapeutic peptides and UCI datasets.
Collapse
Affiliation(s)
- Xiaoyi Guo
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, P.R.China
| | - Yizhang Jiang
- School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, P.R.China
| | - Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, P.R.China
| |
Collapse
|
32
|
de Oliveira ECL, da Costa KS, Taube PS, Lima AH, Junior CDSDS. Biological Membrane-Penetrating Peptides: Computational Prediction and Applications. Front Cell Infect Microbiol 2022; 12:838259. [PMID: 35402305 PMCID: PMC8992797 DOI: 10.3389/fcimb.2022.838259] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/17/2021] [Accepted: 02/21/2022] [Indexed: 12/14/2022] Open
Abstract
Peptides comprise a versatile class of biomolecules that present a unique chemical space with diverse physicochemical and structural properties. Some classes of peptides are able to naturally cross the biological membranes, such as cell membrane and blood-brain barrier (BBB). Cell-penetrating peptides (CPPs) and blood-brain barrier-penetrating peptides (B3PPs) have been explored by the biotechnological and pharmaceutical industries to develop new therapeutic molecules and carrier systems. The computational prediction of peptides’ penetration into biological membranes has been emerged as an interesting strategy due to their high throughput and low-cost screening of large chemical libraries. Structure- and sequence-based information of peptides, as well as atomistic biophysical models, have been explored in computer-assisted discovery strategies to classify and identify new structures with pharmacokinetic properties related to the translocation through biomembranes. Computational strategies to predict the permeability into biomembranes include cheminformatic filters, molecular dynamics simulations, artificial intelligence algorithms, and statistical models, and the choice of the most adequate method depends on the purposes of the computational investigation. Here, we exhibit and discuss some principles and applications of these computational methods widely used to predict the permeability of peptides into biomembranes, exhibiting some of their pharmaceutical and biotechnological applications.
Collapse
Affiliation(s)
- Ewerton Cristhian Lima de Oliveira
- Institute of Technology, Federal University of Pará, Belém, Brazil
- *Correspondence: Kauê Santana da Costa, ; Ewerton Cristhian Lima de Oliveira,
| | - Kauê Santana da Costa
- Laboratory of Computational Simulation, Institute of Biodiversity, Federal University of Western Pará, Santarém, Brazil
- *Correspondence: Kauê Santana da Costa, ; Ewerton Cristhian Lima de Oliveira,
| | - Paulo Sérgio Taube
- Laboratory of Computational Simulation, Institute of Biodiversity, Federal University of Western Pará, Santarém, Brazil
| | - Anderson H. Lima
- Laboratório de Planejamento e Desenvolvimento de Fármacos, Instituto de Ciências Exatas e Naturais, Universidade Federal do Pará, Belém, Brazil
| | | |
Collapse
|