1
Cai K, Zhang Z, Zhu W, Liu X, Yu T, Liao W. Predicting Antidiabetic Peptide Activity: A Machine Learning Perspective on Type 1 and Type 2 Diabetes. Int J Mol Sci 2024; 25:10020. [PMID: 39337508 PMCID: PMC11432216 DOI: 10.3390/ijms251810020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2024] [Revised: 09/12/2024] [Accepted: 09/15/2024] [Indexed: 09/30/2024] Open
Abstract
Diabetes mellitus (DM) presents a critical global health challenge, characterized by persistent hyperglycemia and associated with substantial economic and health-related burdens. This study employs advanced machine-learning techniques to improve the prediction and classification of antidiabetic peptides, with a particular focus on differentiating those effective against T1DM from those targeting T2DM. We integrate feature selection with analysis methods, including logistic regression, support vector machines (SVM), and adaptive boosting (AdaBoost), to classify antidiabetic peptides based on key features. Feature selection through the Lasso-penalized method identifies critical peptide characteristics that significantly influence antidiabetic activity, thereby establishing a robust foundation for future peptide design. A comprehensive evaluation of logistic regression, SVM, and AdaBoost shows that AdaBoost consistently outperforms the other methods, making it the most effective approach for classifying antidiabetic peptides. This research underscores the potential of machine learning in the systematic evaluation of bioactive peptides, contributing to the advancement of peptide-based therapies for diabetes management.
Affiliation(s)
- Kaida Cai
  - Department of Epidemiology and Biostatistics, School of Public Health, Southeast University, Nanjing 210009, China
  - Department of Statistics and Actuarial Science, School of Mathematics, Southeast University, Nanjing 211189, China
  - Key Laboratory of Environmental Medicine Engineering, Ministry of Education, School of Public Health, Southeast University, Nanjing 210009, China
- Zhe Zhang
  - Department of Statistics and Actuarial Science, School of Mathematics, Southeast University, Nanjing 211189, China
- Wenzhou Zhu
  - Department of Statistics and Actuarial Science, School of Mathematics, Southeast University, Nanjing 211189, China
- Xiangwei Liu
  - Department of Statistics and Actuarial Science, School of Mathematics, Southeast University, Nanjing 211189, China
- Tingqing Yu
  - Key Laboratory of Environmental Medicine Engineering, Ministry of Education, School of Public Health, Southeast University, Nanjing 210009, China
  - Department of Nutrition and Food Hygiene, School of Public Health, Southeast University, Nanjing 210009, China
- Wang Liao
  - Key Laboratory of Environmental Medicine Engineering, Ministry of Education, School of Public Health, Southeast University, Nanjing 210009, China
  - Department of Nutrition and Food Hygiene, School of Public Health, Southeast University, Nanjing 210009, China
2
Li B, Chen H, Huang J, He B. CD47Binder: Identify CD47 Binding Peptides by Combining Next-Generation Phage Display Data and Multiple Peptide Descriptors. Interdiscip Sci 2023; 15:578-589. [PMID: 37389722 DOI: 10.1007/s12539-023-00575-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/09/2023] [Revised: 06/08/2023] [Accepted: 06/09/2023] [Indexed: 07/01/2023]
Abstract
The CD47/SIRPα pathway is a new breakthrough in the field of tumor immunity after PD-1/PD-L1. While current monoclonal antibody therapies targeting CD47/SIRPα have demonstrated some anti-tumor effectiveness, there are several inherent limitations associated with these formulations. In this paper, we developed a predictive model that combines next-generation phage display (NGPD) and traditional machine learning methods to distinguish CD47 binding peptides. First, we utilized NGPD biopanning technology to screen CD47 binding peptides. Second, ten traditional machine learning methods based on multiple peptide descriptors and three deep learning methods were used to build computational models for identifying CD47 binding peptides. Finally, we proposed an integrated model based on a support vector machine. During five-fold cross-validation, the integrated predictor demonstrated specificity, accuracy, and sensitivity of 0.755, 0.764, and 0.772, respectively. Furthermore, an online bioinformatics tool called CD47Binder has been developed for the integrated predictor. This tool is readily accessible at http://i.uestc.edu.cn/CD47Binder/cgi-bin/CD47Binder.pl.
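The specificity, accuracy, and sensitivity quoted in this abstract are all ratios taken from a binary confusion matrix. The following is a minimal sketch of those definitions; the class name `Confusion` and the example counts are hypothetical and are not taken from the CD47Binder study.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Counts from a binary classifier (hypothetical container, for illustration)."""
    tp: int  # binders predicted as binders
    fn: int  # binders predicted as non-binders
    tn: int  # non-binders predicted as non-binders
    fp: int  # non-binders predicted as binders


def sensitivity(c: Confusion) -> float:
    # Fraction of true positives among all actual positives.
    return c.tp / (c.tp + c.fn)


def specificity(c: Confusion) -> float:
    # Fraction of true negatives among all actual negatives.
    return c.tn / (c.tn + c.fp)


def accuracy(c: Confusion) -> float:
    # Fraction of all samples that were classified correctly.
    return (c.tp + c.tn) / (c.tp + c.fn + c.tn + c.fp)


# Illustrative counts only; they do not reproduce the published results.
example = Confusion(tp=80, fn=20, tn=70, fp=30)
print(sensitivity(example), specificity(example), accuracy(example))
```

On a balanced dataset, accuracy is the mean of sensitivity and specificity, which is consistent with the reported 0.764 lying between 0.755 and 0.772.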
Affiliation(s)
- Bowen Li
  - Medical College, Guizhou University, Huaxi District, Guiyang, 550025, Guizhou, China
- Heng Chen
  - Medical College, Guizhou University, Huaxi District, Guiyang, 550025, Guizhou, China
- Jian Huang
  - School of Life Science and Technology, University of Electronic Science and Technology of China, No.2006, Xiyuan Ave, West Hi-Tech Zone, Chengdu, 6173001, Sichuan, China
- Bifang He
  - Medical College, Guizhou University, Huaxi District, Guiyang, 550025, Guizhou, China
  - State Key Laboratory of Public Big Data, Guizhou University, Huaxi District, Guiyang, 550025, Guizhou, China
3
Chen X, Huang J, He B. AntiDMPpred: a web service for identifying anti-diabetic peptides. PeerJ 2022; 10:e13581. [PMID: 35722269 PMCID: PMC9205309 DOI: 10.7717/peerj.13581] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 01/03/2022] [Accepted: 05/23/2022] [Indexed: 01/17/2023] Open
Abstract
Diabetes mellitus (DM) is a chronic metabolic disease that has been a major threat to human health globally, causing great economic and social adversities. The oral administration of anti-diabetic peptide drugs has become a novel route for diabetes therapy. Numerous bioactive peptides have demonstrated potential anti-diabetic properties and are promising as alternative treatment measures to prevent and manage diabetes. The computational prediction of anti-diabetic peptides can help promote peptide-based drug discovery in the search for new, effective therapeutic peptide agents for diabetes treatment. Here, we used random forest to develop a computational model, named AntiDMPpred, for predicting anti-diabetic peptides. A benchmark dataset with 236 anti-diabetic and 236 non-anti-diabetic peptides was first constructed. Four types of sequence-derived descriptors were used to represent the peptide sequences. We then combined four machine learning methods and six feature scoring methods to select the non-redundant features, which were fed into diverse machine learning classifiers to train the models. Experimental results show that AntiDMPpred reached an accuracy of 77.12% and an area under the receiver operating characteristic curve (AUCROC) of 0.8193 in nested five-fold cross-validation, yielding a satisfactory performance and surpassing the other classifiers implemented in the study. The web service is freely accessible at http://i.uestc.edu.cn/AntiDMPpred/cgi-bin/AntiDMPpred.pl. We hope AntiDMPpred can improve the discovery of anti-diabetic bioactive peptides.
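Nested cross-validation, as mentioned in this abstract, keeps each outer test fold out of feature selection and model tuning by running a second, inner cross-validation on the outer training data only. The sketch below shows just the index bookkeeping. The function names are hypothetical, the folds are plain shuffled splits without class stratification (an assumption; the abstract does not say how folds were drawn), and the sample count of 472 is the 236 + 236 peptides of the benchmark dataset.

```python
import random


def k_fold_indices(indices, k, seed=0):
    """Shuffle the given indices and split them into k near-equal folds."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def nested_cv_splits(n_samples, outer_k=5, inner_k=5, seed=0):
    """Yield (outer_train, outer_test, inner_splits) for nested cross-validation.

    inner_splits is a list of (inner_train, inner_valid) pairs drawn only from
    outer_train, so the outer test fold never influences model selection.
    """
    outer_folds = k_fold_indices(range(n_samples), outer_k, seed)
    for i, outer_test in enumerate(outer_folds):
        outer_train = [idx for j, fold in enumerate(outer_folds)
                       if j != i for idx in fold]
        inner_folds = k_fold_indices(outer_train, inner_k, seed + 1 + i)
        inner_splits = []
        for m, inner_valid in enumerate(inner_folds):
            inner_train = [idx for n, fold in enumerate(inner_folds)
                           if n != m for idx in fold]
            inner_splits.append((inner_train, inner_valid))
        yield outer_train, outer_test, inner_splits


# 472 samples: 236 anti-diabetic plus 236 non-anti-diabetic peptides.
splits = list(nested_cv_splits(472))
```

In use, a model would be tuned on each `inner_splits` list and then scored once on the matching `outer_test` fold; the reported metric is the average over the outer folds.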
Affiliation(s)
- Xue Chen
  - Medical College, Guizhou University, Guiyang, China
- Jian Huang
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Bifang He
  - Medical College, Guizhou University, Guiyang, China
4
Chen X, Zhang Q, Li B, Lu C, Yang S, Long J, He B, Chen H, Huang J. BBPpredict: A Web Service for Identifying Blood-Brain Barrier Penetrating Peptides. Front Genet 2022; 13:845747. [PMID: 35656322 PMCID: PMC9152268 DOI: 10.3389/fgene.2022.845747] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 12/30/2021] [Accepted: 03/30/2022] [Indexed: 12/22/2022] Open
Abstract
The blood-brain barrier (BBB) is a major barrier to drug delivery into the brain in the treatment of central nervous system (CNS) diseases. Blood-brain barrier penetrating peptides (BBPs), a class of peptides that can cross the BBB through various mechanisms without damaging it, are effective drug candidates for CNS diseases. However, identification of BBPs by experimental methods is time-consuming and laborious. To discover more BBPs as drugs for CNS diseases, there is an urgent need for computational methods that can quickly and accurately distinguish BBPs from non-BBPs. In the present study, we created a training dataset that consists of 326 BBPs derived from previous databases and published manuscripts and 326 non-BBPs collected from UniProt, to construct a BBP predictor based on sequence information. We also constructed an independent testing dataset with 99 BBPs and 99 non-BBPs. Multiple machine learning methods were compared on the training dataset via nested cross-validation. The final BBP predictor was constructed on the training dataset, and the results showed that the random forest (RF) method outperformed the other classification algorithms on both the training and independent testing datasets. The RF-based predictor, named BBPpredict, performs considerably better than previous state-of-the-art BBP prediction tools. BBPpredict is expected to contribute to the discovery of novel BBPs, or at least to be a useful complement to the existing methods in this area. BBPpredict is freely available at http://i.uestc.edu.cn/BBPpredict/cgi-bin/BBPpredict.pl.
Affiliation(s)
- Xue Chen
  - Medical College, Guizhou University, Guiyang, China
- Bowen Li
  - Medical College, Guizhou University, Guiyang, China
- Chunying Lu
  - Medical College, Guizhou University, Guiyang, China
- Jinjin Long
  - Medical College, Guizhou University, Guiyang, China
- Bifang He
  - Medical College, Guizhou University, Guiyang, China
- Heng Chen
  - Medical College, Guizhou University, Guiyang, China
- Jian Huang
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
5
Zhou Y, Xie S, Yang Y, Jiang L, Liu S, Li W, Abagna HB, Ning L, Huang J. SSH2.0: A Better Tool for Predicting the Hydrophobic Interaction Risk of Monoclonal Antibody. Front Genet 2022; 13:842127. [PMID: 35368659 PMCID: PMC8965096 DOI: 10.3389/fgene.2022.842127] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 12/23/2021] [Accepted: 01/31/2022] [Indexed: 01/11/2023] Open
Abstract
Therapeutic antibodies play a crucial role in the treatment of various diseases. However, the success rate of antibody drug development is low, partly because of unfavourable biophysical properties of antibody drug candidates such as a high aggregation tendency, which is mainly driven by hydrophobic interactions of antibody molecules. Therefore, early screening of the risk of hydrophobic interaction of antibody drug candidates is crucial. Experimental screening is laborious, time-consuming, and costly, warranting the development of efficient and high-throughput computational tools for prediction of hydrophobic interactions of therapeutic antibodies. In the present study, 131 antibodies with hydrophobic interaction experiment data were used to train a new support vector machine-based ensemble model, termed SSH2.0, to predict the hydrophobic interactions of antibodies. Feature selection was performed on CKSAAGP features using the graph-based algorithm MRMD2.0. Based on the antibody sequence, SSH2.0 achieved a sensitivity and accuracy of 100.00% and 83.97%, respectively. This approach eliminates the need for the three-dimensional structure of antibodies and enables rapid screening of therapeutic antibody candidates in the early developmental stage, thereby saving time and cost. In addition, a web server was constructed that is freely available at http://i.uestc.edu.cn/SSH2/.
Affiliation(s)
- Yuwei Zhou
  - Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Shiyang Xie
  - Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Yue Yang
  - Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Lixu Jiang
  - Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Siqi Liu
  - Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Wei Li
  - Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Hamza Bukari Abagna
  - Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Lin Ning
  - School of Healthcare Technology, Chengdu Neusoft University, Chengdu, China
- Jian Huang
  - Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
6
Yang S, Huang J, He B. CASPredict: a web service for identifying Cas proteins. PeerJ 2021; 9:e11887. [PMID: 34395100 PMCID: PMC8327967 DOI: 10.7717/peerj.11887] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 11/26/2020] [Accepted: 07/09/2021] [Indexed: 12/16/2022] Open
Abstract
Clustered regularly interspaced short palindromic repeats (CRISPR) and their associated (Cas) proteins constitute the CRISPR-Cas systems, which play a key role in the prokaryotic adaptive immune system against invasive foreign elements. In recent years, the CRISPR-Cas systems have also been engineered to facilitate target gene editing in eukaryotic genomes. As one of the important components of the CRISPR-Cas system, the Cas protein plays an irreplaceable role. The effector module composed of Cas proteins is used to distinguish the type of CRISPR-Cas system. Effective prediction and identification of Cas proteins can help biologists further infer the type of CRISPR-Cas system. Moreover, the class 2 CRISPR-Cas systems are increasingly applied in the field of genome editing. The discovery of Cas proteins will help provide more candidates for genome editing. In this paper, we describe a web service named CASPredict (http://i.uestc.edu.cn/caspredict/cgi-bin/CASPredict.pl) for identifying Cas proteins. CASPredict first predicts Cas proteins with a support vector machine (SVM) using the optimal dipeptide composition and then annotates the function of Cas proteins with the hmmscan search algorithm. The ten-fold cross-validation results showed that 84.84% of Cas proteins were correctly classified. CASPredict will be a useful tool for the identification of Cas proteins, or at least can play a complementary role to the existing methods in this area.
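Dipeptide composition, the descriptor named in this abstract, represents a protein by the relative frequency of each of the 400 ordered pairs of adjacent standard amino acids. The sketch below is one common way to compute it, under two assumptions that the abstract does not specify: pairs containing non-standard residues are skipped, and frequencies are normalised by the number of valid pairs. The selection of an "optimal" subset of these features is not shown.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return a 400-element list of dipeptide frequencies for a protein sequence."""
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # skip pairs with non-standard residues (assumption)
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


# Toy sequence: the adjacent pairs are AC, CA, AC.
vector = dipeptide_composition("ACAC")
print(vector[DIPEPTIDES.index("AC")], vector[DIPEPTIDES.index("CA")])
```

A vector of this kind could then be passed to any classifier, such as an SVM, after feature selection.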
Affiliation(s)
- Shanshan Yang
  - Medical College, Guizhou University, Guiyang, Guizhou Province, China
- Jian Huang
  - Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, Sichuan Province, China
- Bifang He
  - Medical College, Guizhou University, Guiyang, Guizhou Province, China
  - Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, Sichuan Province, China
7
Ahmadi A, Ayyadevara VSSA, Baudry J, Roh KH. Calcium signaling on Jurkat T cells induced by microbeads coated with novel peptide ligands specific to human CD3ε. J Mater Chem B 2021; 9:1661-1675. [PMID: 33481966 DOI: 10.1039/d0tb02235g] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/19/2022]
Abstract
CD3ε is expressed on T lymphocytes as a part of the T cell receptor (TCR)-CD3 complex. Together with other CD3 molecules, CD3ε is responsible for the activation of T cells by transducing the event of antigen recognition by the TCR into intracellular signaling cascades. The present study first aims to identify a novel peptide ligand that binds to human CD3ε in a specific manner and to perform an initial evaluation of its biological efficacy on the human T cell line, Jurkat cells. We screened a phage-display peptide library against human CD3ε using a subtractive biopanning process, from which we identified 13 phage clones displaying unique peptide sequences. One dominant phage clone displaying the 7 amino acid sequence WSLGYTG, which occupied 90% of tested plaques (18 out of 20) after the 5th round of biopanning, demonstrated binding superior to that of the other clones in binding assays against recombinant CD3ε on microbeads or Jurkat cells. The synthesized peptide also showed specific binding to Jurkat cells in a dose-dependent manner but not to the B cell lymphoma line 2PK3. Molecular modeling and docking simulation confirmed that the selected peptide ligand, in an energetically stable conformation, binds to a pocket of CD3ε that is not hidden by either CD3γ or CD3δ. Lastly, magnetic microbeads conjugated with the synthesized peptide ligands showed a weak but specific association with Jurkat cells and induced calcium flux, a hallmark of proximal T cell receptor signaling, which gave rise to an enhancement of IL-2 secretion and cell proliferation. The novel peptide ligand and its various multivalent forms have great potential in applications related to T cell biology and T cell immunotherapy.
Affiliation(s)
- Armin Ahmadi
  - Department of Chemical & Materials Engineering, University of Alabama in Huntsville, 301 Sparkman Drive NW, Huntsville, AL 35899, USA
- V S S Abhinav Ayyadevara
  - Biotechnology Science and Engineering, University of Alabama in Huntsville, Huntsville, AL 35899, USA
- Jerome Baudry
  - Biotechnology Science and Engineering, University of Alabama in Huntsville, Huntsville, AL 35899, USA
  - Department of Biological Sciences, University of Alabama in Huntsville, Huntsville, AL 35899, USA
- Kyung-Ho Roh
  - Department of Chemical & Materials Engineering, University of Alabama in Huntsville, 301 Sparkman Drive NW, Huntsville, AL 35899, USA
  - Biotechnology Science and Engineering, University of Alabama in Huntsville, Huntsville, AL 35899, USA
8
Li J, Chen H, Fan F, Qiu J, Du L, Xiao J, Duan X, Chen H, Liao W. White-matter functional topology: a neuromarker for classification and prediction in unmedicated depression. Transl Psychiatry 2020; 10:365. [PMID: 33127899 PMCID: PMC7603321 DOI: 10.1038/s41398-020-01053-4] [Citation(s) in RCA: 40] [Impact Index Per Article: 8.0] [Received: 05/03/2020] [Revised: 09/28/2020] [Accepted: 10/09/2020] [Indexed: 02/03/2023] Open
Abstract
Aberrant topological organization of brain connectomes underlies pathological mechanisms in major depressive disorder (MDD). However, accumulating evidence has focused only on functional organization in brain gray matter, ignoring functional information in white matter (WM), which has been confirmed to have reliable and stable topological organization. The present study aimed to characterize the functional pattern disruptions of MDD from a new perspective: the topological organization of the WM functional connectome. A case-control, cross-sectional resting-state functional magnetic resonance imaging study was conducted on both discovery [91 unmedicated MDD patients and 225 healthy controls (HCs)] and replication samples (34 unmedicated MDD patients and 25 HCs). The WM functional networks were constructed in 128 anatomical regions, and their global topological properties (e.g., small-worldness) were analyzed using graph theory-based approaches. At the system level, ubiquitous small-worldness architecture and local information-processing capacity were detectable in unmedicated MDD patients but were less salient than in HCs, implying a shift toward randomization in MDD WM functional connectomes. Consistent results were replicated in an independent sample. For clinical applications, small-world topology of the WM functional connectome showed a predictive effect on disease severity (Hamilton Depression Rating Scale) in the discovery sample (r = 0.34, p = 0.001). Furthermore, the topologically-based classification model could be generalized to discriminate MDD patients from HCs in the replication sample (accuracy, 76%; sensitivity, 74%; specificity, 80%). Our results highlight a reproducible topologically shifted WM functional connectome structure and provide possible clinical applications involving an optimal small-world topology as a potential neuromarker for the classification and prediction of MDD patients.
Affiliation(s)
- Jiao Li
  - The Clinical Hospital of Chengdu Brain Science Institute, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, People's Republic of China
  - MOE Key Lab for Neuroinformation, High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu, 610054, People's Republic of China
- Heng Chen
  - The Clinical Hospital of Chengdu Brain Science Institute, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, People's Republic of China
  - MOE Key Lab for Neuroinformation, High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu, 610054, People's Republic of China
  - School of Medicine, Guizhou University, Guiyang, 550025, People's Republic of China
- Feiyang Fan
  - The Clinical Hospital of Chengdu Brain Science Institute, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, People's Republic of China
  - MOE Key Lab for Neuroinformation, High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu, 610054, People's Republic of China
- Jiang Qiu
  - School of Psychology, Southwest University, Chongqing, 400715, People's Republic of China
- Lian Du
  - Department of Psychiatry, The First Affiliated Hospital of Chongqing Medical University, Chongqing, 400016, People's Republic of China
- Jinming Xiao
  - The Clinical Hospital of Chengdu Brain Science Institute, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, People's Republic of China
  - MOE Key Lab for Neuroinformation, High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu, 610054, People's Republic of China
- Xujun Duan
  - The Clinical Hospital of Chengdu Brain Science Institute, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, People's Republic of China
  - MOE Key Lab for Neuroinformation, High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu, 610054, People's Republic of China
- Huafu Chen
  - The Clinical Hospital of Chengdu Brain Science Institute, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, People's Republic of China
  - MOE Key Lab for Neuroinformation, High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu, 610054, People's Republic of China
- Wei Liao
  - The Clinical Hospital of Chengdu Brain Science Institute, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, People's Republic of China
  - MOE Key Lab for Neuroinformation, High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu, 610054, People's Republic of China
9
Wang Y, Kang J, Li N, Zhou Y, Tang Z, He B, Huang J. NeuroCS: A Tool to Predict Cleavage Sites of Neuropeptide Precursors. Protein Pept Lett 2020; 27:337-345. [PMID: 31721688 DOI: 10.2174/0929866526666191112150636] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 06/01/2019] [Revised: 07/16/2019] [Accepted: 09/24/2019] [Indexed: 11/22/2022]
Abstract
BACKGROUND: Neuropeptides are a class of bioactive peptides produced from neuropeptide precursors through a series of extremely complex processes, mediating many aspects of neuronal regulation. Accurate identification of cleavage sites of neuropeptide precursors is of great significance for the development of neuroscience and brain science. OBJECTIVE: With the explosive growth of neuropeptide precursor data, bioinformatics methods for predicting neuropeptide precursors' cleavage sites quickly and efficiently are much needed. METHODS: We started by processing the neuropeptide precursor data from SwissProt and NeuroPedia into two sets of data, a training dataset and a testing dataset. Subsequently, six feature extraction schemes were applied to generate different feature sets, and feature selection methods were then used to find the optimal feature subset of each. Thereafter, the support vector machine was used to build models for the different feature types. Finally, the performance of the models was evaluated with the independent testing dataset. RESULTS: Six models were built with the support vector machine. Among them, the enhanced amino acid composition-based model reached the highest accuracy, 91.60%, in 5-fold cross-validation. When evaluated with the independent testing dataset, it also showed excellent performance, with an accuracy of 90.37% and an area under the receiver operating characteristic curve of up to 0.9576. CONCLUSION: The performance of the developed model was decent. Moreover, for users' convenience, an online web server called NeuroCS was built, which is freely available at http://i.uestc.edu.cn/NeuroCS/dist/index.html#/. NeuroCS can be used to predict neuropeptide precursors' cleavage sites effectively.
Affiliation(s)
- Ying Wang
  - Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Juanjuan Kang
  - Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Ning Li
  - Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Yuwei Zhou
  - Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Zhongjie Tang
  - Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Bifang He
  - Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
  - Medical College, Guizhou University, Guiyang, China
- Jian Huang
  - Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
10
Tang Q, Nie F, Kang J, Chen W. ncPro-ML: An integrated computational tool for identifying non-coding RNA promoters in multiple species. Comput Struct Biotechnol J 2020; 18:2445-2452. [PMID: 33005306 PMCID: PMC7509369 DOI: 10.1016/j.csbj.2020.09.001] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 07/14/2020] [Revised: 08/30/2020] [Accepted: 09/01/2020] [Indexed: 02/07/2023] Open
Abstract
A computational method for identifying non-coding promoters was proposed for the first time. A high-quality dataset was built to train and test the models for identifying non-coding promoters. A user-friendly web server was developed to recognize non-coding promoters.
The promoter is located near the transcription start site and regulates transcription initiation of the gene. Accurate identification of promoters is essential for understanding the mechanism of gene regulation. Since experimental methods are costly and inefficient, developing efficient and accurate computational tools to identify promoters is necessary. Although a series of methods have been proposed for identifying promoters, none of them is able to identify the promoters of non-coding RNA (ncRNA). In the present work, a new method called ncPro-ML was proposed to identify ncRNA promoters in Homo sapiens and Mus musculus, in which different kinds of sequence encoding schemes were used to convert DNA sequences into feature vectors. To test the effect of sequence length, datasets containing sequences of different lengths were built for each species. The results demonstrated that ncPro-ML achieved the best performance on the dataset with a sequence length of 221 nucleotides for both human and mouse. The performance of ncPro-ML was also satisfactory in both the independent dataset test and the cross-species test. The results indicate that the proposed predictor can serve as a powerful tool for the discovery of ncRNA promoters. In addition, a web server for ncPro-ML was developed, which can be freely accessed at http://www.bio-bigdata.cn/ncPro-ML/.
Affiliation(s)
- Qiang Tang
- Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu 611137, China
| | - Fulei Nie
- Center for Genomics and Computational Biology, School of Life Sciences, North China University of Science and Technology, Tangshan 063210, China
- School of Public Health, North China University of Science and Technology, Tangshan 063210, China
| | - Juanjuan Kang
- Affiliated Foshan Maternity & Child Healthcare Hospital, Southern Medical University (Foshan Maternity & Child Healthcare Hospital), Foshan 528000, China
| | - Wei Chen
- Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu 611137, China
- Center for Genomics and Computational Biology, School of Life Sciences, North China University of Science and Technology, Tangshan 063210, China
- School of Public Health, North China University of Science and Technology, Tangshan 063210, China
- Corresponding author: Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu 611137, China.
|
11
|
SSH: A Tool for Predicting Hydrophobic Interaction of Monoclonal Antibodies Using Sequences. BIOMED RESEARCH INTERNATIONAL 2020; 2020:3508107. [PMID: 32596302 PMCID: PMC7288208 DOI: 10.1155/2020/3508107] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 02/29/2020] [Revised: 04/28/2020] [Accepted: 05/13/2020] [Indexed: 12/31/2022]
Abstract
Therapeutic antibodies are one of the most important parts of the pharmaceutical industry. They are widely used in treating various diseases such as autoimmune diseases, cancer, inflammation, and infectious diseases. Their development, however, is often brought to a standstill, or takes longer and becomes more expensive, because of hydrophobicity problems. Hydrophobic interactions can cause problems with half-life, drug administration, and immunogenicity at all stages of antibody drug development. Some of the most widely accepted and used technologies for determining the hydrophobic interactions of antibodies include standup monolayer adsorption chromatography (SMAC), salt-gradient affinity-capture self-interaction nanoparticle spectroscopy (SGAC-SINS), and hydrophobic interaction chromatography (HIC). However, measuring SMAC, SGAC-SINS, and HIC for hundreds of antibody drug candidates is time-consuming and costly. To save time and money, a predictor called SSH was developed. Based on the antibody's sequence only, it can predict the hydrophobic interactions of monoclonal antibodies (mAbs). Using leave-one-out cross-validation, SSH achieved 91.226% accuracy, 96.396% sensitivity (recall), 84.196% specificity, 87.754% precision, a Matthews correlation coefficient (MCC) of 0.828, an F-score of 0.919, and an area under the receiver operating characteristic (ROC) curve (AUC) of 0.961.
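The threshold-based measures reported above (accuracy, sensitivity, specificity, precision, MCC, F-score) all follow from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the function name `binary_metrics` and the counts in the usage line are hypothetical and are not the SSH results.

```python
from math import sqrt


def binary_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts.

    Assumes both classes are present and at least one positive prediction
    was made, so that none of the ratio denominators is zero.
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)  # also called recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_score": f_score,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    for name, value in binary_metrics(tp=8, fp=1, tn=9, fn=2).items():
        print(f"{name}: {value:.3f}")
```

AUC is not included because it depends on the ranking of prediction scores rather than on a single confusion matrix.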
|
12
|
He B, Dzisoo AM, Derda R, Huang J. Development and Application of Computational Methods in Phage Display Technology. Curr Med Chem 2020; 26:7672-7693. [PMID: 29956612 DOI: 10.2174/0929867325666180629123117] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 12/06/2017] [Revised: 02/08/2018] [Accepted: 03/20/2018] [Indexed: 12/12/2022]
Abstract
BACKGROUND Phage display is a powerful and versatile technology for the identification of peptide ligands binding to multiple targets, which has been successfully employed in various fields, such as diagnostics and therapeutics, drug-delivery and material science. The integration of next generation sequencing technology with phage display makes this methodology more productive. With the widespread use of this technique and the fast accumulation of phage display data, databases for these data and computational methods have become an indispensable part in this community. This review aims to summarize and discuss recent progress in the development and application of computational methods in the field of phage display. METHODS We undertook a comprehensive search of bioinformatics resources and computational methods for phage display data via Google Scholar and PubMed. The methods and tools were further divided into different categories according to their uses. RESULTS We described seven special or relevant databases for phage display data, which provided an evidence-based source for phage display researchers to clean their biopanning results. These databases can identify and report possible target-unrelated peptides (TUPs), thereby excluding false-positive data from peptides obtained from phage display screening experiments. More than 20 computational methods for analyzing biopanning data were also reviewed. These methods were classified into computational methods for reporting TUPs, for predicting epitopes and for analyzing next generation phage display data. CONCLUSION The current bioinformatics archives, methods and tools reviewed here have benefitted the biopanning community. To develop better or new computational tools, some promising directions are also discussed.
Affiliation(s)
- Bifang He
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 611731, China.,School of Medicine, Guizhou University, Guiyang 550025, China
| | - Anthony Mackitz Dzisoo
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 611731, China
| | - Ratmir Derda
- Department of Chemistry, University of Alberta, Edmonton T6G 2G2, Alberta, Canada
| | - Jian Huang
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 611731, China
|
13
|
Malebary SJ, Rehman MSU, Khan YD. iCrotoK-PseAAC: Identify lysine crotonylation sites by blending position relative statistical features according to the Chou's 5-step rule. PLoS One 2019; 14:e0223993. [PMID: 31751380 PMCID: PMC6874067 DOI: 10.1371/journal.pone.0223993] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 07/08/2019] [Accepted: 10/02/2019] [Indexed: 01/22/2023] Open
Abstract
Among the different post-translational modifications (PTMs), one of the most important is lysine crotonylation in proteins. Its importance in different diseases and essential biological processes cannot be underestimated. The key step in uncovering the hidden mechanisms of crotonylation, along with its occurrence sites, is to fully understand the mechanism behind this biological process. In previously reported studies, researchers have used different techniques, such as position weighted matrix (PWM), support vector machine (SVM), k-nearest neighbors (KNN), and many others. However, the maximum prediction accuracy achieved was not very high. To address this, we propose an improved predictor for lysine crotonylation sites named iCrotoK-PseAAC, in which we have incorporated various position- and composition-relative features along with statistical moments into PseAAC. The results of self-consistency testing were 100% accurate, while 10-fold cross-validation gave 99.0% accuracy. Based on the validation and comparison of models, it is concluded that iCrotoK-PseAAC is more accurate than the previously proposed models.
Affiliation(s)
- Sharaf Jameel Malebary
- Department of Information Technology, King Abdul Aziz University, Rabigh, Kingdom of Saudi Arabia
| | - Muhammad Safi ur Rehman
- Department of Computer Science, School of Systems and Technology, University of Management and Technology, Lahore, Pakistan
| | - Yaser Daanial Khan
- Department of Computer Science, School of Systems and Technology, University of Management and Technology, Lahore, Pakistan
|
14
|
Jiang L, Yu M, Zhou Y, Tang Z, Li N, Kang J, He B, Huang J. AGONOTES: A Robot Annotator for Argonaute Proteins. Interdiscip Sci 2019; 12:109-116. [PMID: 31741225 DOI: 10.1007/s12539-019-00349-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 06/25/2019] [Revised: 10/06/2019] [Accepted: 10/30/2019] [Indexed: 12/01/2022]
Abstract
The argonaute protein (Ago) exists in almost all organisms. In eukaryotes, it functions as a regulatory system for gene expression. In prokaryotes, it is a type of defense system against foreign invasive genomes. The Ago system has been engineered for gene silencing and genome editing and plays an important role in biological studies. With an increasing number of genomes and proteomes of various microbes becoming available, computational tools for identifying and annotating argonaute proteins are urgently needed. We introduce AGONOTES (Argonaute Notes), a web service specifically designed for identifying and annotating Ago. AGONOTES uses the BLASTP similarity search algorithm to categorize all submitted proteins into three groups: prokaryotic argonaute protein (pAgo), eukaryotic argonaute protein (eAgo), and non-argonaute protein (non-Ago). Argonaute proteins can then be aligned to the corresponding standard set of Ago sequences using the multiple sequence alignment program MUSCLE. All functional domains of Ago can further be curated from the alignment results and visualized easily through Bio::Graphic modules in the BioPerl bundle. Compared with existing tools such as CD-Search and available databases such as UniProt, AGONOTES showed much better performance on domain annotation, which is fundamental to studying new Ago proteins. AGONOTES can be freely accessed at http://i.uestc.edu.cn/agonotes/. AGONOTES is a user-friendly tool for annotating Ago domains from a proteome or a series of protein sequences.
Affiliation(s)
- Lixu Jiang
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 637111, China
| | - Min Yu
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 637111, China
| | - Yuwei Zhou
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 637111, China
| | - Zhongjie Tang
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 637111, China
| | - Ning Li
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 637111, China
| | - Juanjuan Kang
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 637111, China
| | - Bifang He
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 637111, China.,School of Medicine, Guizhou University, Guiyang, China
| | - Jian Huang
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 637111, China.
|
15
|
Shi C, Chen J, Kang X, Zhao G, Lao X, Zheng H. Deep Learning in the Study of Protein-Related Interactions. Protein Pept Lett 2019; 27:359-369. [PMID: 31538879 DOI: 10.2174/0929866526666190723114142] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 01/05/2019] [Revised: 03/13/2019] [Accepted: 04/05/2019] [Indexed: 11/22/2022]
Abstract
Protein-related interaction prediction is critical to understanding life processes, biological functions, and mechanisms of drug action. Experimental methods used to determine protein-related interactions have always been costly and inefficient. In recent years, advances in biological and medical technology have provided us with an explosion of biological and physiological data, and deep learning-based algorithms have shown great promise in extracting features and learning patterns from complex data. Deep learning has now emerged in protein research. In this review, we provide an introductory overview of deep neural network theory and its unique properties. We focus mainly on the application of this technology to protein-related interaction prediction over the past five years, including protein-protein, protein-RNA/DNA, and protein-drug interaction prediction, among others. Finally, we discuss some of the challenges that deep learning currently faces.
Affiliation(s)
- Cheng Shi
- School of Life Science and Technology, China Pharmaceutical University, Nanjing 210009, China
| | - Jiaxing Chen
- School of Life Science and Technology, China Pharmaceutical University, Nanjing 210009, China
| | - Xinyue Kang
- School of Life Science and Technology, China Pharmaceutical University, Nanjing 210009, China
| | - Guiling Zhao
- School of Life Science and Technology, China Pharmaceutical University, Nanjing 210009, China
| | - Xingzhen Lao
- School of Life Science and Technology, China Pharmaceutical University, Nanjing 210009, China
| | - Heng Zheng
- School of Life Science and Technology, China Pharmaceutical University, Nanjing 210009, China
|
16
|
Chai G, Yu M, Jiang L, Duan Y, Huang J. HMMCAS: A Web Tool for the Identification and Domain Annotations of CAS Proteins. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2019; 16:1313-1315. [PMID: 28186905 DOI: 10.1109/tcbb.2017.2665542] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Indexed: 05/16/2023]
Abstract
The CRISPR-Cas (clustered regularly interspaced short palindromic repeats-CRISPR-associated proteins) adaptive immune systems are found in many bacteria and most archaea. These systems are encoded by cas (CRISPR-associated) operons that have an extremely diverse architecture. The most crucial step in describing the composition of cas operons is the identification of cas genes or Cas proteins. With the continuous increase in newly sequenced archaeal and bacterial genomes, the recognition of new Cas proteins is becoming possible, which not only provides candidates for novel genome editing tools but also helps to understand the prokaryotic immune system better. Here, we describe HMMCAS, a web service for the detection of CRISPR-associated structural and functional domains in protein sequences. HMMCAS uses the hmmscan similarity search algorithm in HMMER3.1 to provide a fast, interactive service based on a comprehensive collection of hidden Markov models of the Cas protein family. It can accurately identify Cas proteins, including fusion proteins, for example the Cas1-Cas4 fusion protein in Candidatus Chloracidobacterium thermophilum B (Cab. thermophilum B). HMMCAS can also find putative cas operons and determine which type they belong to. HMMCAS is freely available at http://i.uestc.edu.cn/hmmcas.
|
17
|
He B, Chen H, Huang J. PhD7Faster 2.0: predicting clones propagating faster from the Ph.D.-7 phage display library by coupling PseAAC and tripeptide composition. PeerJ 2019; 7:e7131. [PMID: 31245183 PMCID: PMC6585900 DOI: 10.7717/peerj.7131] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 12/22/2018] [Accepted: 05/15/2019] [Indexed: 01/08/2023] Open
Abstract
Selection from phage display libraries enables isolation of high-affinity ligands for various targets. However, this method also identifies propagation-related target-unrelated peptides (PrTUPs). These false-positive hits appear because of their amplification advantages. In this report, we present PhD7Faster 2.0 for predicting fast-propagating clones from the Ph.D.-7 phage display library, which was developed based on the support vector machine. Feature selection was performed on PseAAC and tripeptide composition using the incremental feature selection method. Ten-fold cross-validation results show that PhD7Faster 2.0 achieves a decent performance, with an accuracy of 81.84%, a Matthews correlation coefficient of 0.64, and an area under the ROC curve of 0.90. The permutation test with 1,000 shuffles resulted in p < 0.001. We implemented PhD7Faster 2.0 as a publicly accessible web tool (http://i.uestc.edu.cn/sarotup3/cgi-bin/PhD7Faster.pl) and constructed standalone graphical user interface and command-line versions for different systems. The standalone PhD7Faster 2.0 is able to detect PrTUPs within small datasets as well as large-scale datasets. This makes PhD7Faster 2.0 an enhanced and powerful tool for scanning and reporting faster-growing clones from the Ph.D.-7 phage display library.
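A permutation test of the kind mentioned above estimates how often a result at least as good as the observed one arises when the class labels are shuffled. The sketch below is a simplified stand-in: it uses a difference of group means as the test statistic rather than retraining a classifier, and it applies the common add-one correction. Both choices are assumptions, and the function name `permutation_p_value` is hypothetical. Under that correction, 1,000 shuffles with no shuffled statistic reaching the observed one give p = 1/1001, which is consistent with a reported p < 0.001.

```python
import random


def permutation_p_value(values, labels, n_shuffles=1000, seed=0):
    """One-sided permutation test on the difference of group means.

    values: numeric scores, one per sample.
    labels: 0/1 class labels, one per sample (both classes must occur).
    Returns (hits + 1) / (n_shuffles + 1), where hits counts shuffles whose
    statistic is at least as large as the observed one.
    """
    def statistic(lbls):
        pos = [v for v, l in zip(values, lbls) if l == 1]
        neg = [v for v, l in zip(values, lbls) if l == 0]
        return sum(pos) / len(pos) - sum(neg) / len(neg)

    rng = random.Random(seed)
    observed = statistic(labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # shuffling keeps the class sizes unchanged
        if statistic(shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_shuffles + 1)


if __name__ == "__main__":
    # Made-up scores in which class 1 clearly scores higher than class 0.
    scores = list(range(10)) + list(range(100, 110))
    classes = [0] * 10 + [1] * 10
    print(permutation_p_value(scores, classes))
```

For a classifier, the statistic would typically be the cross-validated accuracy obtained after refitting on each shuffled label set, which is far more expensive but follows the same counting logic.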
Affiliation(s)
- Bifang He
- School of Medicine, Guizhou University, Guiyang, Guizhou, China.,Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
| | - Heng Chen
- School of Medicine, Guizhou University, Guiyang, Guizhou, China
| | - Jian Huang
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
|
18
|
He B, Chen H, Li N, Huang J. SAROTUP: a suite of tools for finding potential target-unrelated peptides from phage display data. Int J Biol Sci 2019; 15:1452-1459. [PMID: 31337975 PMCID: PMC6643146 DOI: 10.7150/ijbs.31957] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 12/03/2018] [Accepted: 04/09/2019] [Indexed: 01/13/2023] Open
Abstract
SAROTUP (Scanner And Reporter Of Target-Unrelated Peptides) 3.1 is a significant upgrade to the widely used SAROTUP web server for the rapid identification of target-unrelated peptides (TUPs) in phage display data. At present, SAROTUP has gathered a suite of tools for finding potential TUPs and other purposes. Besides the TUPScan, the motif-based tool, and three tools based on the BDB database, i.e., MimoScan, MimoSearch, and MimoBlast, three predictors based on support vector machine, i.e., PhD7Faster, SABinder and PSBinder, are integrated into SAROTUP. The current version of SAROTUP contains 27 TUP motifs and 823 TUP sequences. We also developed the standalone SAROTUP application with graphical user interface (GUI) and command line versions for processing deep sequencing phage display data and distributed it as an open source package, which can perform perfectly locally on almost all systems that support C++ with little or no modification. The web interfaces of SAROTUP have also been redesigned to be more self-evident and user-friendly. The latest version of SAROTUP is freely available at http://i.uestc.edu.cn/sarotup3.
Affiliation(s)
- Bifang He
- School of Medicine, Guizhou University, Guiyang 550025, China.,Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 611731, China
| | - Heng Chen
- School of Medicine, Guizhou University, Guiyang 550025, China
| | - Ning Li
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 611731, China
| | - Jian Huang
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 611731, China
|
19
|
Lin H, Peng S, Huang J. Special issue on Computational Resources and Methods in Biological Sciences. Int J Biol Sci 2018; 14:807-810. [PMID: 29989106 PMCID: PMC6036761 DOI: 10.7150/ijbs.27554] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 05/30/2018] [Accepted: 06/03/2018] [Indexed: 12/11/2022] Open
Abstract
This special issue covers a wide range of topics in computational biology, such as database construction, sequence analysis and function prediction with machine learning methods, disease-related diagnosis, drug-target and drug discovery, and electronic health record system construction.
Affiliation(s)
- Hao Lin
- Center for Informational Biology, University of Electronic Science and Technology of China (UESTC), Chengdu 611731, China.,School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu 610054, China
| | - Shaoliang Peng
- School of Computer Science, National University of Defense Technology, Changsha 410073, China
| | - Jian Huang
- Center for Informational Biology, University of Electronic Science and Technology of China (UESTC), Chengdu 611731, China.,School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu 610054, China
|
20
|
Tang H, Zhao YW, Zou P, Zhang CM, Chen R, Huang P, Lin H. HBPred: a tool to identify growth hormone-binding proteins. Int J Biol Sci 2018; 14:957-964. [PMID: 29989085 PMCID: PMC6036759 DOI: 10.7150/ijbs.24174] [Citation(s) in RCA: 136] [Impact Index Per Article: 19.4] [Received: 12/04/2017] [Accepted: 01/15/2018] [Indexed: 12/19/2022] Open
Abstract
Hormone-binding protein (HBP) is a kind of soluble carrier protein that can selectively and non-covalently interact with hormones. HBP plays an important role in growth, but its function is still unclear. Correct recognition of HBPs is the first step toward further studying their function and understanding their biological processes. However, it is difficult to correctly recognize HBPs among the growing number of proteins through traditional biochemical experiments because of the high cost and long duration of the experiments. To overcome these disadvantages, we designed a computational method for identifying HBPs accurately in this study. First, we collected HBP data from UniProt to establish a high-quality benchmark dataset. Based on the dataset, the dipeptide composition was extracted from HBP residue sequences. To find the optimal features that provide key clues for HBP identification, analysis of variance (ANOVA) was performed for feature ranking. The optimal features were selected through the incremental feature selection strategy. Subsequently, the features were input into a support vector machine (SVM) to construct the prediction model. Jackknife cross-validation results showed that 88.6% of HBPs and 81.3% of non-HBPs were correctly recognized, suggesting that our proposed model was powerful. This study provides a new strategy to identify HBPs. Moreover, based on the proposed model, we established a web server called HBPred, which can be freely accessed at http://lin-group.cn/server/HBPred.
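ANOVA-based feature ranking scores each feature by the ratio of between-class variance to within-class variance (the F statistic), so that features whose values differ most between HBPs and non-HBPs come first. The sketch below shows that computation for a two-class problem. The function names `anova_f` and `rank_features` and the toy numbers are hypothetical, and the code is not taken from HBPred.

```python
def anova_f(groups):
    """One-way ANOVA F statistic for a single feature.

    groups: list of lists, one list of feature values per class.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Mean square between classes (k - 1 degrees of freedom).
    between = sum(len(g) * (m - grand_mean) ** 2
                  for g, m in zip(groups, means)) / (k - 1)
    # Mean square within classes (n - k degrees of freedom).
    within = sum((x - m) ** 2
                 for g, m in zip(groups, means) for x in g) / (n - k)
    return between / within if within else float("inf")


def rank_features(pos_rows, neg_rows):
    """Return feature indices sorted by decreasing F statistic."""
    n_features = len(pos_rows[0])
    scores = [
        anova_f([[row[j] for row in pos_rows], [row[j] for row in neg_rows]])
        for j in range(n_features)
    ]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


if __name__ == "__main__":
    # Toy data: feature 0 separates the classes, feature 1 does not.
    positives = [[1.0, 5.0], [2.0, 5.1], [3.0, 4.9]]
    negatives = [[4.0, 5.0], [5.0, 5.2], [6.0, 4.8]]
    print(rank_features(positives, negatives))
```

Incremental feature selection would then add features in this ranked order one at a time, keeping the subset that gives the best cross-validated performance.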
Affiliation(s)
- Hua Tang
- Department of Pathophysiology, Southwest Medical University, Luzhou 646000, China
| | - Ya-Wei Zhao
- Key Laboratory for NeuroInformation of Ministry of Education, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
| | - Ping Zou
- Department of Pathophysiology, Southwest Medical University, Luzhou 646000, China
| | - Chun-Mei Zhang
- Department of Pathophysiology, Southwest Medical University, Luzhou 646000, China
| | - Rong Chen
- Department of Pathophysiology, Southwest Medical University, Luzhou 646000, China
| | - Po Huang
- Department of Pathophysiology, Southwest Medical University, Luzhou 646000, China
| | - Hao Lin
- Key Laboratory for NeuroInformation of Ministry of Education, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
|
21
|
Kang J, Fang Y, Yao P, Li N, Tang Q, Huang J. NeuroPP: A Tool for the Prediction of Neuropeptide Precursors Based on Optimal Sequence Composition. Interdiscip Sci 2018. [DOI: 10.1007/s12539-018-0287-2] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Indexed: 11/24/2022]
|
22
|
He B, Tjhung KF, Bennett NJ, Chou Y, Rau A, Huang J, Derda R. Compositional Bias in Naïve and Chemically-modified Phage-Displayed Libraries uncovered by Paired-end Deep Sequencing. Sci Rep 2018; 8:1214. [PMID: 29352178 PMCID: PMC5775325 DOI: 10.1038/s41598-018-19439-2] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 08/14/2017] [Accepted: 01/02/2018] [Indexed: 01/09/2023] Open
Abstract
Understanding the composition of a genetically-encoded (GE) library is instrumental to the success of ligand discovery. In this manuscript, we investigate the bias in GE-libraries of linear, macrocyclic and chemically post-translationally modified (cPTM) tetrapeptides displayed on the M13KE platform, which are produced via trinucleotide cassette synthesis (19 codons) and NNK-randomized codon. Differential enrichment of synthetic DNA {S}, ligated vector {L} (extension and ligation of synthetic DNA into the vector), naïve libraries {N} (transformation of the ligated vector into the bacteria followed by expression of the library for 4.5 hours to yield a "naïve" library), and libraries chemically modified by aldehyde ligation and cysteine macrocyclization {M} characterized by paired-end deep sequencing, detected a significant drop in diversity in {L} → {N}, but only a minor compositional difference in {S} → {L} and {N} → {M}. Libraries expressed at the N-terminus of phage protein pIII censored positively charged amino acids Arg and Lys; libraries expressed between pIII domains N1 and N2 overcame Arg/Lys-censorship but introduced new bias towards Gly and Ser. Interrogation of biases arising from cPTM by aldehyde ligation and cysteine macrocyclization unveiled censorship of sequences with Ser/Phe. Analogous analysis can be used to explore library diversity in new display platforms and optimize cPTM of these libraries.
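To judge compositional bias in an NNK-randomized library, it helps to know the baseline: the amino acid frequencies expected if every NNK codon (N = A/C/G/T, K = G/T) were equally likely. The sketch below derives that baseline from the standard genetic code. The uniform-codon assumption and the function name `nnk_expected_frequencies` are illustrative and are not taken from the paper.

```python
from collections import Counter
from itertools import product

# Standard genetic code, listed with bases in TCAG order (third base fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def nnk_expected_frequencies():
    """Expected residue frequencies for one NNK codon, assuming uniform codons.

    Returns a dict mapping one-letter amino acid codes (and '*' for stop)
    to their fraction among the 32 NNK codons.
    """
    codons = [a + b + c for a in "ACGT" for b in "ACGT" for c in "GT"]
    counts = Counter(CODON_TABLE[codon] for codon in codons)
    return {aa: n / len(codons) for aa, n in counts.items()}


if __name__ == "__main__":
    for aa, freq in sorted(nnk_expected_frequencies().items()):
        print(aa, f"{freq:.4f}")
```

Observed per-position frequencies from deep sequencing can then be divided by these expected values to see which residues are enriched or censored relative to the design.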
Affiliation(s)
- Bifang He
- Department of Chemistry and Alberta Glycomics Centre, University of Alberta, Edmonton, AB T6G 2G2, Canada
- Key Laboratory for NeuroInformation of Ministry of Education, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China
| | - Katrina F Tjhung
- Department of Chemistry and Alberta Glycomics Centre, University of Alberta, Edmonton, AB T6G 2G2, Canada
- The Scripps Research Institute, 10550 N. Torrey Pines Rd., La Jolla, CA, 92037, USA
- The Salk Institute, 10010 N. Torrey Pines Rd., La Jolla, CA, 92037, USA
| | - Nicholas J Bennett
- Department of Chemistry and Alberta Glycomics Centre, University of Alberta, Edmonton, AB T6G 2G2, Canada
| | - Ying Chou
- Department of Chemistry and Alberta Glycomics Centre, University of Alberta, Edmonton, AB T6G 2G2, Canada
| | - Andrea Rau
- GABI, INRA, AgroParisTech, Université Paris-Saclay, 78350, Jouy-en-Josas, France
| | - Jian Huang
- Key Laboratory for NeuroInformation of Ministry of Education, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China
- Center for Information in Biology, University of Electronic Science and Technology of China, Chengdu, 610054, China
| | - Ratmir Derda
- Department of Chemistry and Alberta Glycomics Centre, University of Alberta, Edmonton, AB T6G 2G2, Canada.
|
23
|
He B, Jiang L, Duan Y, Chai G, Fang Y, Kang J, Yu M, Li N, Tang Z, Yao P, Wu P, Derda R, Huang J. Biopanning data bank 2018: hugging next generation phage display. Database (Oxford) 2018; 2018:4955852. [PMID: 29688378 PMCID: PMC7206649 DOI: 10.1093/database/bay032] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 01/08/2018] [Revised: 02/07/2018] [Accepted: 03/07/2018] [Indexed: 12/12/2022]
Abstract
Database URL: The BDB database is available at http://immunet.cn/bdb.
Affiliation(s)
- Bifang He
- Center for Informational Biology, University of Electronic Science and Technology of China, No. 2006, Xiyuan Ave, West Hi-Tech Zone, Chengdu 611731, China
| | - Lixu Jiang
- Center for Informational Biology, University of Electronic Science and Technology of China, No. 2006, Xiyuan Ave, West Hi-Tech Zone, Chengdu 611731, China
| | - Yaocong Duan
- Center for Informational Biology, University of Electronic Science and Technology of China, No. 2006, Xiyuan Ave, West Hi-Tech Zone, Chengdu 611731, China
| | - Guoshi Chai
- Center for Informational Biology, University of Electronic Science and Technology of China, No. 2006, Xiyuan Ave, West Hi-Tech Zone, Chengdu 611731, China
| | - Yewei Fang
- Center for Informational Biology, University of Electronic Science and Technology of China, No. 2006, Xiyuan Ave, West Hi-Tech Zone, Chengdu 611731, China
| | - Juanjuan Kang
- Center for Informational Biology, University of Electronic Science and Technology of China, No. 2006, Xiyuan Ave, West Hi-Tech Zone, Chengdu 611731, China
| | - Min Yu
- Center for Informational Biology, University of Electronic Science and Technology of China, No. 2006, Xiyuan Ave, West Hi-Tech Zone, Chengdu 611731, China
| | - Ning Li
- Center for Informational Biology, University of Electronic Science and Technology of China, No. 2006, Xiyuan Ave, West Hi-Tech Zone, Chengdu 611731, China
| | - Zhongjie Tang
- Center for Informational Biology, University of Electronic Science and Technology of China, No. 2006, Xiyuan Ave, West Hi-Tech Zone, Chengdu 611731, China
| | - Pengcheng Yao
- Center for Informational Biology, University of Electronic Science and Technology of China, No. 2006, Xiyuan Ave, West Hi-Tech Zone, Chengdu 611731, China
| | - Pengcheng Wu
- Center for Informational Biology, University of Electronic Science and Technology of China, No. 2006, Xiyuan Ave, West Hi-Tech Zone, Chengdu 611731, China
| | - Ratmir Derda
- Department of Chemistry, University of Alberta, 11227 Saskatchewan Drive, Edmonton, AB T6G 2G2, Canada
| | - Jian Huang
- Center for Informational Biology, University of Electronic Science and Technology of China, No. 2006, Xiyuan Ave, West Hi-Tech Zone, Chengdu 611731, China
|
24
|
PSBinder: A Web Service for Predicting Polystyrene Surface-Binding Peptides. BIOMED RESEARCH INTERNATIONAL 2017; 2017:5761517. [PMID: 29445741 PMCID: PMC5763211 DOI: 10.1155/2017/5761517] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Received: 07/13/2017] [Accepted: 11/02/2017] [Indexed: 11/18/2022]
Abstract
Polystyrene surface-binding peptides (PSBPs) are useful as affinity tags for building a highly effective ELISA system. However, they are also a quite common type of target-unrelated peptide (TUP) in the panning of phage-displayed random peptide libraries. As a TUP, a PSBP will mislead the analysis of panning results if it is not identified. Therefore, it is necessary to find a quick and easy way to predict whether a peptide is likely to be a PSBP. In this paper, we describe PSBinder, a predictor based on SVM. To our knowledge, it is the first web server for predicting PSBPs. The SVM model was built with optimized dipeptide composition features, and 87.02% (MCC = 0.74; AUC = 0.91) of peptides were correctly classified in fivefold cross-validation. PSBinder can be used to exclude highly probable PSBPs from biopanning results or to find novel candidates for polystyrene affinity tags. Either way, it is valuable for the biotechnology community.
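Dipeptide composition turns a peptide of any length into a 400-dimensional vector holding the relative frequency of each ordered pair of adjacent residues. The sketch below shows the plain, unoptimized version. The feature optimization step mentioned in the abstract is not reproduced, and the function name `dipeptide_composition` is hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, in a fixed order: AA, AC, AD, ...
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(peptide):
    """Return the 400-dimensional dipeptide composition of a peptide.

    Each entry is the count of one dipeptide divided by the number of
    valid overlapping dipeptides in the sequence. Pairs containing
    non-standard residue symbols are ignored.
    """
    peptide = peptide.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(peptide) - 1):
        pair = peptide[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    return [counts[d] / total if total else 0.0 for d in DIPEPTIDES]


if __name__ == "__main__":
    features = dipeptide_composition("FHWTWYW")  # made-up 7-mer
    nonzero = {d: f for d, f in zip(DIPEPTIDES, features) if f > 0}
    print(nonzero)
```

For short peptides such as 7-mers or 12-mers the vector is very sparse, which is one reason a feature selection step is usually applied before training an SVM on it.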
|
25
|
Qiu WR, Sun BQ, Tang H, Huang J, Lin H. Identify and analysis crotonylation sites in histone by using support vector machines. Artif Intell Med 2017; 83:75-81. [DOI: 10.1016/j.artmed.2017.02.007] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.8] [Received: 12/27/2016] [Revised: 01/25/2017] [Indexed: 10/20/2022]
|
26
|
Dao FY, Yang H, Su ZD, Yang W, Wu Y, Hui D, Chen W, Tang H, Lin H. Recent Advances in Conotoxin Classification by Using Machine Learning Methods. Molecules 2017; 22:1057. [PMID: 28672838 PMCID: PMC6152242 DOI: 10.3390/molecules22071057] [Citation(s) in RCA: 47] [Impact Index Per Article: 5.9] [Received: 05/17/2017] [Revised: 06/12/2017] [Accepted: 06/19/2017] [Indexed: 11/16/2022] Open
Abstract
Conotoxins are disulfide-rich small peptides, invaluable as agents that target ion channels and neuronal receptors. Conotoxins have been demonstrated as potent pharmaceuticals in the treatment of a series of diseases, such as Alzheimer's disease, Parkinson's disease, and epilepsy. In addition, conotoxins are ideal molecular templates for the development of new drug lead compounds and play important roles in neurobiological research. Thus, the accurate identification of conotoxin types will provide key clues for biological research and clinical medicine. Generally, conotoxin types are confirmed when their sequence, structure, and function are experimentally validated. However, it is time-consuming and costly to acquire structural and functional information through biochemical experiments. Therefore, it is important to develop computational tools for efficiently and effectively recognizing conotoxin types based on sequence information. In this work, we reviewed the current progress in computational identification of conotoxins in the following aspects: (i) construction of benchmark datasets; (ii) strategies for extracting sequence features; (iii) feature selection techniques; (iv) machine learning methods for classifying conotoxins; (v) the results obtained by these methods and the published tools; and (vi) future perspectives on conotoxin classification. The paper provides a basis for in-depth study of conotoxins and drug therapy research.
Affiliation(s)
- Fu-Ying Dao
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China.
- Hui Yang
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China.
- Zhen-Dong Su
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China.
- Wuritu Yang
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China.
- Development and Planning Department, Inner Mongolia University, Hohhot 010021, China.
- Yun Wu
- College of Computer and Information Engineering, Xiamen University of Technology, Xiamen 361024, China.
- Ding Hui
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China.
- Wei Chen
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China.
- Department of Physics, School of Sciences, and Center for Genomics and Computational Biology, North China University of Science and Technology, Tangshan 063000, China.
- Hua Tang
- Department of Pathophysiology, Southwest Medical University, Luzhou 646000, China.
- Hao Lin
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China.
|
27
|
Zhang Y, He B, Liu K, Ning L, Luo D, Xu K, Zhu W, Wu Z, Huang J, Xu X. A novel peptide specifically binding to VEGF receptor suppresses angiogenesis in vitro and in vivo. Signal Transduct Target Ther 2017; 2:17010. [PMID: 29263914 PMCID: PMC5661615 DOI: 10.1038/sigtrans.2017.10] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Received: 10/09/2016] [Revised: 02/21/2017] [Accepted: 02/21/2017] [Indexed: 12/27/2022] Open
Abstract
Vascular endothelial growth factor (VEGF), one of the most important angiogenic factors, plays an essential role in both physiological and pathological angiogenesis through binding to VEGF receptors (VEGFRs). Here we report a novel peptide designated HRHTKQRHTALH (peptide HRH), which was isolated from the Ph.D.-12 phage display library using VEGFR-Fc fusion protein as the bait. This peptide was found to dose-dependently inhibit the proliferation of human umbilical vein endothelial cells stimulated by VEGF. The anti-angiogenesis effect of the HRH peptide was further confirmed in vivo using the chick chorioallantoic membrane assay, which was also dose-dependent. In addition, peptide HRH was shown to inhibit corneal neovascularization in an alkali-burnt rat corneal model and a suture-induced rat corneal model. Taken together, these findings suggest that the HRH peptide can inhibit angiogenesis both in vitro and in vivo. Consequently, the HRHTKQRHTALH peptide might be a promising lead peptide for the development of potential angiogenic inhibitors.
Affiliation(s)
- Yuan Zhang
- Department of Ophthalmology, Shanghai General Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China
- Bifang He
- Key Laboratory for NeuroInformation of Ministry of Education, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Kun Liu
- Department of Ophthalmology, Shanghai General Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China
- Lin Ning
- Key Laboratory for NeuroInformation of Ministry of Education, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Delun Luo
- Chengdu Nuoen Biotechnologies, LTD, Chengdu, China
- Kai Xu
- Chengdu Nuoen Biotechnologies, LTD, Chengdu, China
- Wenli Zhu
- Chengdu Nuoen Biotechnologies, LTD, Chengdu, China
- Zhigang Wu
- Department of Ophthalmology, Shanghai General Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China; Chengdu Nuoen Biotechnologies, LTD, Chengdu, China
- Jian Huang
- Key Laboratory for NeuroInformation of Ministry of Education, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China; Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Xun Xu
- Department of Ophthalmology, Shanghai General Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China
|