1
Chen L, Qu R, Liu X. Improved multi-label classifiers for predicting protein subcellular localization. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2024; 21:214-236. [PMID: 38303420 DOI: 10.3934/mbe.2024010] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Indexed: 02/03/2024]
Abstract
Protein functions are closely related to their subcellular locations. At present, the prediction of protein subcellular locations is one of the most important problems in protein science. The evident defects of traditional methods make it urgent to design methods with high efficiency and low costs. To date, lots of computational methods have been proposed. However, this problem is far from being completely solved. Recently, some multi-label classifiers have been proposed to identify subcellular locations of human, animal, Gram-negative bacterial and eukaryotic proteins. These classifiers adopted the protein features derived from gene ontology information. Although they provided good performance, they can be further improved by adopting more powerful machine learning algorithms. In this study, four improved multi-label classifiers were set up for identification of subcellular locations of the above four protein types. The random k-labelsets (RAKEL) algorithm was used to tackle proteins with multiple locations, and random forest was used as the basic prediction engine. All classifiers were tested by jackknife test, indicating their high performance. Comparisons with previous classifiers further confirmed the superiority of the proposed classifiers.
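The label-handling idea behind RAKEL can be illustrated with a minimal sketch of its disjoint variant: the label set is randomly split into small labelsets, and each sample's labels within a labelset are encoded as a single class (a label powerset). The location names, `k=2`, and all function names below are illustrative assumptions; the random forest that the abstract uses as the base learner is deliberately left out.

```python
import random

def rakel_partition(labels, k, seed=0):
    """Randomly split the label set into disjoint labelsets of size at most k."""
    rng = random.Random(seed)
    shuffled = list(labels)
    rng.shuffle(shuffled)
    return [tuple(shuffled[i:i + k]) for i in range(0, len(shuffled), k)]

def powerset_class(sample_labels, labelset):
    """Encode which labels of `labelset` a sample carries as one class (a bit tuple)."""
    return tuple(int(label in sample_labels) for label in labelset)

def decode(classes, labelsets):
    """Merge one predicted class per labelset back into a set of labels."""
    predicted = set()
    for bits, labelset in zip(classes, labelsets):
        predicted.update(label for label, bit in zip(labelset, bits) if bit)
    return predicted

# Hypothetical location labels; a real dataset defines its own label set.
locations = ["nucleus", "cytoplasm", "membrane", "mitochondrion", "secreted"]
labelsets = rakel_partition(locations, k=2)

# A protein with two locations is turned into one class per labelset.
protein_labels = {"nucleus", "cytoplasm"}
encoded = [powerset_class(protein_labels, ls) for ls in labelsets]
roundtrip = decode(encoded, labelsets)
```

In a full pipeline, one multi-class classifier would be trained per labelset on these encoded classes, and `decode` would be applied to its predictions.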
Affiliation(s)
- Lei Chen
  - College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
- Ruyun Qu
  - College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
- Xintong Liu
  - College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
2
Chen L, Chen Y. RMTLysPTM: recognizing multiple types of lysine PTM sites by deep analysis on sequences. Brief Bioinform 2023; 25:bbad450. [PMID: 38066710 PMCID: PMC10783864 DOI: 10.1093/bib/bbad450] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 08/01/2023] [Revised: 10/24/2023] [Accepted: 11/15/2023] [Indexed: 12/18/2023]
Abstract
Post-translational modification (PTM) occurs after a protein is translated from ribonucleic acid. It is an important biological phenomenon because it is implicated in almost all cellular processes. Identification of PTM sites from a given protein sequence is a hot topic in bioinformatics. Many computational methods have been proposed, and they provide good performance. However, most previous methods can only tackle one PTM type, and few consider multiple PTM types. In this study, a multi-label classification model, named RMTLysPTM, was developed to recognize four types of lysine (K) PTM sites: acetylation, crotonylation, methylation and succinylation. The sites surrounding a lysine site were selected to constitute a peptide segment, representing the lysine at the center. Deep analysis was conducted to count the distribution of 2-residues with fixed location across the four types of lysine PTM sites. By aggregating the distribution information of 2-residues in one peptide segment, the peptide segment was encoded by informative features. Furthermore, a prediction engine that can precisely capture the traits of the above representations was designed to recognize the types of lysine PTM sites. The cross-validation results on two datasets (Qiu and CPLM training datasets) suggested that the model had extremely high performance, and tests on protein Q16778 and the CPLM testing datasets indicated that RMTLysPTM had strong generalization ability. The model was found to be generally superior to all previous models and to those using popular methods and features. A web server was set up for RMTLysPTM, and it can be accessed at http://119.3.127.138/.
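The first step described above, building a peptide segment centered on each lysine, might look like the following sketch. The flank length, the padding character `X`, and the function name are assumptions made for illustration; the abstract does not state the window size used by RMTLysPTM.

```python
def lysine_windows(sequence, flank, pad="X"):
    """Return (1-based position, segment) for every lysine in the sequence.

    Each segment has length 2 * flank + 1 with the lysine at its center;
    positions beyond the sequence ends are filled with the pad character.
    """
    padded = pad * flank + sequence + pad * flank
    windows = []
    for pos, residue in enumerate(sequence):
        if residue == "K":
            # In the padded string the lysine sits at index pos + flank,
            # so the window starts at index pos.
            windows.append((pos + 1, padded[pos:pos + 2 * flank + 1]))
    return windows

# Toy sequence, not a real protein.
segments = lysine_windows("MKAAK", flank=2)
```

Each segment could then be passed to a feature encoder such as the 2-residue distribution counts mentioned in the abstract.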
Affiliation(s)
- Lei Chen
  - College of Information Engineering, Shanghai Maritime University, Shanghai 201306, People’s Republic of China
- Yuwei Chen
  - College of Information Engineering, Shanghai Maritime University, Shanghai 201306, People’s Republic of China
3
Zou K, Wang S, Wang Z, Zhang Z, Yang F. HAR_Locator: a novel protein subcellular location prediction model of immunohistochemistry images based on hybrid attention modules and residual units. Front Mol Biosci 2023; 10:1171429. [PMID: 37664182 PMCID: PMC10470064 DOI: 10.3389/fmolb.2023.1171429] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2023] [Accepted: 08/04/2023] [Indexed: 09/05/2023]
Abstract
Introduction: Proteins located in subcellular compartments play an indispensable role in the physiological function of eukaryotic organisms. The pattern of protein subcellular localization helps in understanding the mechanism and function of proteins, investigating pathological changes of cells, and providing technical support for targeted drug research on human diseases. Automated systems based on featurization or representation learning and classifier design have attracted interest for predicting the subcellular location of proteins, owing to the considerable rise in the number of proteins. However, general deep learning models are prone to trapping and losing feature information in large-scale, fine-grained protein microscopic images, and the shallow features derived from statistical methods have weak supervision abilities.
Methods: In this work, a novel model called HAR_Locator was developed to predict the subcellular location of proteins by concatenating multi-view abstract features and shallow features. Its advantages are summarized in the following three points. Firstly, to obtain discriminative abstract feature information on protein subcellular location, an abstract feature extractor called HARnet, based on Hybrid Attention modules and Residual units, was proposed to relieve gradient dispersion and focus on protein-target regions. Secondly, concatenating abstract features and shallow features not only improves the supervision ability of image information but also enhances the generalization ability of HAR_Locator. Finally, a multi-category multi-classifier decision system based on an Artificial Neural Network (ANN) was introduced to obtain the final output for each sample by fitting the most representative result from five subset predictors.
Results: To evaluate the model, a collection of 6,778 immunohistochemistry (IHC) images from the Human Protein Atlas (HPA) database was used to present experimental results, and the accuracy, precision, and recall evaluation indicators were significantly increased to 84.73%, 84.77%, and 84.70%, respectively, compared with baseline predictors.
Affiliation(s)
- Kai Zou
  - School of Communications and Electronics, Jiangxi Science and Technology Normal University, Nanchang, China
- Simeng Wang
  - School of Communications and Electronics, Jiangxi Science and Technology Normal University, Nanchang, China
- Ziqian Wang
  - School of Communications and Electronics, Jiangxi Science and Technology Normal University, Nanchang, China
- Zhihai Zhang
  - School of Communications and Electronics, Jiangxi Science and Technology Normal University, Nanchang, China
- Fan Yang
  - School of Communications and Electronics, Jiangxi Science and Technology Normal University, Nanchang, China
  - Artificial Intelligence and Bioinformation Cognition Laboratory, Jiangxi Science and Technology Normal University, Nanchang, China
4
Khojasteh H, Pirgazi J, Ghanbari Sorkhi A. Improving prediction of drug-target interactions based on fusing multiple features with data balancing and feature selection techniques. PLoS One 2023; 18:e0288173. [PMID: 37535616 PMCID: PMC10399861 DOI: 10.1371/journal.pone.0288173] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 02/08/2023] [Accepted: 06/21/2023] [Indexed: 08/05/2023]
Abstract
Drug discovery relies on predicting drug-target interactions (DTIs), which is an important and challenging task. The purpose of DTI prediction is to identify interactions between drug chemical compounds and protein targets. Traditional wet lab experiments are time-consuming and expensive, which is why, in recent years, computational methods based on machine learning have attracted the attention of many researchers. A dry lab environment that focuses on computational interaction prediction can help limit the search space for wet lab experiments. In this paper, a novel multi-stage approach for DTI prediction, called SRX-DTI, is proposed. In the first stage, a combination of various descriptors from protein sequences and an FP2 fingerprint encoded from the drug are extracted as feature vectors. A major challenge in this application is imbalanced data due to the lack of known interactions; therefore, in the second stage, the One-SVM-US technique is proposed to deal with this problem. Next, the FFS-RF algorithm, a forward feature selection algorithm coupled with a random forest (RF) classifier, is developed to maximize the predictive performance. This feature selection algorithm removes irrelevant features to obtain optimal features. Finally, the balanced dataset with optimal features is given to the XGBoost classifier to identify DTIs. The experimental results demonstrate that the proposed approach, SRX-DTI, achieves higher performance than other existing methods in predicting DTIs. The datasets and source code are available at: https://github.com/Khojasteh-hb/SRX-DTI.
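Forward feature selection in general can be sketched as a greedy loop that adds the single feature giving the largest score gain and stops when no candidate improves the score. This is a generic illustration, not the FFS-RF implementation from the paper's repository: the scoring function here is a toy stand-in for a cross-validated random forest score, and the feature names and gain values are hypothetical.

```python
def forward_select(features, score):
    """Greedy forward selection: repeatedly add the best-scoring feature."""
    selected = []
    best = float("-inf")
    remaining = list(features)
    while remaining:
        # Score every candidate extension of the current subset.
        candidate, cand_score = max(
            ((f, score(selected + [f])) for f in remaining),
            key=lambda pair: pair[1],
        )
        if cand_score <= best:
            break  # no remaining feature improves the score
        selected.append(candidate)
        remaining.remove(candidate)
        best = cand_score
    return selected, best

# Toy score (assumption): each feature contributes a fixed gain and every
# selected feature costs a small penalty, so useless features are rejected.
GAIN = {"aac": 30, "fingerprint": 20, "noise": 0}

def toy_score(subset):
    return sum(GAIN[f] for f in subset) - 5 * len(subset)

chosen, chosen_score = forward_select(["noise", "fingerprint", "aac"], toy_score)
```

With a real classifier, `score` would train the model on the candidate subset and return a validation metric, which makes each round considerably more expensive than in this toy.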
Affiliation(s)
- Hakimeh Khojasteh
  - Department of Computer Engineering, University of Zanjan, Zanjan, Iran
  - School of Biological Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran
- Jamshid Pirgazi
  - School of Biological Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran
  - Department of Computer Engineering, University of Science and Technology of Mazandaran, Behshahr, Iran
- Ali Ghanbari Sorkhi
  - Department of Computer Engineering, University of Science and Technology of Mazandaran, Behshahr, Iran
5
Ullah M, Hadi F, Song J, Yu DJ. PScL-2LSAESM: bioimage-based prediction of protein subcellular localization by integrating heterogeneous features with the two-level SAE-SM and mean ensemble method. Bioinformatics 2023; 39:6839969. [PMID: 36413068 PMCID: PMC9947927 DOI: 10.1093/bioinformatics/btac727] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2022] [Revised: 11/02/2022] [Accepted: 11/21/2022] [Indexed: 11/23/2022]
Abstract
MOTIVATION: Over the past decades, a variety of in silico methods have been developed to predict protein subcellular localization within cells. However, a common and major challenge in the design and development of such methods is how to effectively utilize the heterogeneous feature sets extracted from bioimages. In this regard, limited efforts have been undertaken.
RESULTS: We propose a new two-level stacked autoencoder network (termed 2L-SAE-SM) to improve prediction performance by integrating the heterogeneous feature sets. In particular, in the first level of 2L-SAE-SM, each optimal heterogeneous feature set is used to train our designed stacked autoencoder network (SAE-SM). All the trained SAE-SMs in the first level output decision sets based on their respective optimal heterogeneous feature sets, known as 'intermediate decision' sets. Such intermediate decision sets are then ensembled using the mean ensemble method to generate the 'intermediate feature' set for the second-level SAE-SM. Using the proposed framework, we further develop a novel predictor, referred to as PScL-2LSAESM, to characterize image-based protein subcellular localization. Extensive benchmarking experiments on the latest benchmark training and independent test datasets collected from the human protein atlas databank demonstrate the effectiveness of the proposed 2L-SAE-SM framework for the integration of heterogeneous feature sets. Moreover, performance comparison of the proposed PScL-2LSAESM with current state-of-the-art methods further illustrates that PScL-2LSAESM clearly outperforms the existing state-of-the-art methods for the task of protein subcellular localization.
AVAILABILITY AND IMPLEMENTATION: https://github.com/csbio-njust-edu/PScL-2LSAESM.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
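The mean ensemble step can be pictured as an element-wise average of the per-class scores produced by the first-level models for one sample. The sketch below assumes each 'intermediate decision' is a list of class scores of equal length; the numbers and the function name are illustrative and are not taken from the PScL-2LSAESM code.

```python
def mean_ensemble(decision_sets):
    """Average per-class scores from several models for a single sample."""
    n_models = len(decision_sets)
    n_classes = len(decision_sets[0])
    if any(len(d) != n_classes for d in decision_sets):
        raise ValueError("all decision sets must cover the same classes")
    return [sum(d[c] for d in decision_sets) / n_models for c in range(n_classes)]

# Two hypothetical first-level outputs over three location classes.
intermediate_decisions = [
    [0.5, 0.25, 0.25],
    [1.0, 0.0, 0.0],
]
intermediate_feature = mean_ensemble(intermediate_decisions)
predicted_class = max(range(len(intermediate_feature)), key=intermediate_feature.__getitem__)
```

In the two-level design described above, the averaged vector would serve as input to the second-level network rather than being turned into a prediction directly; the `predicted_class` line only shows how such a vector could be read.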
Affiliation(s)
- Matee Ullah
  - School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
- Fazal Hadi
  - School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
- Dong-Jun Yu
  - To whom correspondence should be addressed.
6
Rahardja S, Wang M, Nguyen BP, Fränti P, Rahardja S. A lightweight classification of adaptor proteins using transformer networks. BMC Bioinformatics 2022; 23:461. [PMID: 36333658 PMCID: PMC9635127 DOI: 10.1186/s12859-022-05000-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/03/2022] [Accepted: 09/13/2022] [Indexed: 11/06/2022]
Abstract
BACKGROUND: Adaptor proteins play a key role in intercellular signal transduction, and dysfunctional adaptor proteins result in diseases. Understanding their structure is the first step to tackling the associated conditions, spurring ongoing interest in research into adaptor proteins with bioinformatics and computational biology. Our study aims to introduce a small, new, and superior model for protein classification, pushing the boundaries with new machine learning algorithms.
RESULTS: We propose a novel transformer-based model which includes a convolutional block and a fully connected layer. We input protein sequences from a database, extract PSSM features, then process them via our deep learning model. The proposed model is efficient and highly compact, achieving state-of-the-art performance in terms of area under the receiver operating characteristic curve, Matthews Correlation Coefficient and Receiver Operating Characteristics curve. Despite having merely 20 hidden nodes, which translates to approximately 1% of the complexity of the previous best known methods, the proposed model is still superior in results and computational efficiency.
CONCLUSIONS: The proposed model is the first transformer model used for recognizing adaptor proteins, and it outperforms all existing methods. It takes PSSM profiles as inputs and comprises convolutional blocks, a transformer and fully connected layers for classifying adaptor proteins.
Affiliation(s)
- Sylwan Rahardja
  - School of Computing, University of Eastern Finland, Joensuu, Finland
- Mou Wang
  - School of Marine Science and Technology, Northwestern Polytechnical University and Singapore Institute of Technology, 710072 Xi’an, China
- Binh P. Nguyen
  - School of Mathematics and Statistics, Victoria University of Wellington, Wellington, New Zealand
- Pasi Fränti
  - School of Computing, University of Eastern Finland, Joensuu, Finland
- Susanto Rahardja
  - School of Marine Science and Technology, Northwestern Polytechnical University and Singapore Institute of Technology, 710072 Xi’an, China
  - Singapore Institute of Technology, Singapore, 138683 Singapore
7
Cong H, Liu H, Cao Y, Chen Y, Liang C. Multiple Protein Subcellular Locations Prediction Based on Deep Convolutional Neural Networks with Self-Attention Mechanism. Interdiscip Sci 2022; 14:421-438. [PMID: 35066812 DOI: 10.1007/s12539-021-00496-7] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 08/28/2021] [Revised: 12/06/2021] [Accepted: 12/13/2021] [Indexed: 12/12/2022]
Abstract
As an important research field in bioinformatics, protein subcellular location prediction is critical to reveal the protein functions and provide insightful information for disease diagnosis and drug development. Predicting protein subcellular locations remains a challenging task due to the difficulty of finding representative features and robust classifiers. Many feature fusion methods have been widely applied to tackle the above issues. However, they still suffer from accuracy loss due to feature redundancy. Furthermore, multiple protein subcellular locations prediction is more complicated since it is fundamentally a multi-label classification problem. The traditional binary classifiers or even multi-class classifiers cannot achieve satisfactory results. This paper proposes a novel method for protein subcellular location prediction with both single and multiple sites based on deep convolutional neural networks. Specifically, we first obtain the integrated features by simultaneously considering the pseudo amino acid, amino acid index distribution, and physicochemical property. We then adopt deep convolutional neural networks to extract high-dimensional features from the fused feature, removing the redundant preliminary features and gaining better representations of the raw sequences. Moreover, we use the self-attention mechanism and a customized loss function to ensure that the model is more inclined to positive data. In addition, we use random k-label sets to reduce the number of prediction labels. Meanwhile, we employ a hybrid strategy of over-sampling and under-sampling to tackle the data imbalance problem. We compare our model with three representative classification alternatives. The experiment results show that our model achieves the best performance in terms of accuracy, demonstrating the efficacy of the proposed model.
Affiliation(s)
- Hanhan Cong
  - School of Information Science and Engineering, Shandong Normal University, Jinan, China
  - Shandong Provincial Key Laboratory for Novel Distributed Computer Software Technology, Jinan, China
- Hong Liu
  - School of Information Science and Engineering, Shandong Normal University, Jinan, China
  - Shandong Provincial Key Laboratory for Novel Distributed Computer Software Technology, Jinan, China
- Yi Cao
  - School of Information Science and Engineering, University of Jinan, Jinan, China
  - Shandong Provincial Key Laboratory of Network Based Intelligent Computing, University of Jinan, Jinan, China
- Yuehui Chen
  - School of Information Science and Engineering, University of Jinan, Jinan, China
  - Shandong Provincial Key Laboratory of Network Based Intelligent Computing, University of Jinan, Jinan, China
- Cheng Liang
  - School of Information Science and Engineering, Shandong Normal University, Jinan, China
8
Naseer S, Hussain W, Khan YD, Rasool N. iPhosS(Deep)-PseAAC: Identification of Phosphoserine Sites in Proteins Using Deep Learning on General Pseudo Amino Acid Compositions. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:1703-1714. [PMID: 33242308 DOI: 10.1109/tcbb.2020.3040747] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 06/11/2023]
Abstract
Among all the PTMs, protein phosphorylation is pivotal for various pathological and physiological processes. About 30 percent of eukaryotic proteins undergo phosphorylation, leading to various changes in conformation, function, stability, localization, and so forth. In eukaryotic proteins, phosphorylation occurs on serine (S), threonine (T) and tyrosine (Y) residues. Among these, serine phosphorylation is of particular importance as it is associated with various important biological processes, including energy metabolism, signal transduction pathways, cell cycling, and apoptosis. Thus, its identification is important; however, in vitro, ex vivo and in vivo identification can be laborious, time-consuming and costly. There is a dire need for an efficient and accurate computational model to help researchers and biologists identify these sites easily. Herein, we propose a novel predictor for identification of phosphoserine sites (PhosS) in proteins, by integrating Chou's Pseudo Amino Acid Composition (PseAAC) with deep features. We used well-known DNNs both for learning a feature representation of peptide sequences and for performing classification. Among the different DNNs, the best score was achieved by the Convolutional Neural Network (CNN) based model, which makes the CNN-based prediction model the best for phosphoserine prediction. Based on these results, it is concluded that the proposed model can help to identify PhosS sites in a very efficient and accurate manner, which can help scientists understand the mechanism of this modification in proteins.
9
Ullah M, Han K, Hadi F, Xu J, Song J, Yu DJ. PScL-HDeep: image-based prediction of protein subcellular location in human tissue using ensemble learning of handcrafted and deep learned features with two-layer feature selection. Brief Bioinform 2021; 22:bbab278. [PMID: 34337652 PMCID: PMC8574991 DOI: 10.1093/bib/bbab278] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Received: 04/11/2021] [Revised: 06/30/2021] [Accepted: 07/01/2021] [Indexed: 01/17/2023]
Abstract
Protein subcellular localization plays a crucial role in characterizing the function of proteins and understanding various cellular processes. Therefore, accurate identification of protein subcellular location is an important yet challenging task. Numerous computational methods have been proposed to predict the subcellular location of proteins. However, most existing methods have limited capability in terms of the overall accuracy, time consumption and generalization power. To address these problems, in this study, we developed a novel computational approach based on human protein atlas (HPA) data, referred to as PScL-HDeep, for accurate and efficient image-based prediction of protein subcellular location in human tissues. We extracted different handcrafted and deep learned (by employing pretrained deep learning model) features from different viewpoints of the image. The step-wise discriminant analysis (SDA) algorithm was applied to generate the optimal feature set from each original raw feature set. To further obtain a more informative feature subset, support vector machine-based recursive feature elimination with correlation bias reduction (SVM-RFE + CBR) feature selection algorithm was applied to the integrated feature set. Finally, the classification models, namely support vector machine with radial basis function (SVM-RBF) and support vector machine with linear kernel (SVM-LNR), were learned on the final selected feature set. To evaluate the performance of the proposed method, a new gold standard benchmark training dataset was constructed from the HPA databank. PScL-HDeep achieved the maximum performance on 10-fold cross validation test on this dataset and showed a better efficacy over existing predictors. Furthermore, we also illustrated the generalization ability of the proposed method by conducting a stringent independent validation test.
Affiliation(s)
- Matee Ullah
  - Nanjing University of Science and Technology, China
- Ke Han
  - School of Computer Science and Engineering, Nanjing University of Science and Technology, China
- Fazal Hadi
  - Pakistan Institute of Engineering and Applied Sciences, Islamabad, Pakistan
- Jian Xu
  - School of Computer Science and Engineering, Nanjing University of Science and Technology, China
- Jiangning Song
  - Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Australia
- Dong-Jun Yu
  - School of Computer Science and Engineering, Nanjing University of Science and Technology, China
10
Akbar S, Ahmad A, Hayat M, Rehman AU, Khan S, Ali F. iAtbP-Hyb-EnC: Prediction of antitubercular peptides via heterogeneous feature representation and genetic algorithm based ensemble learning model. Comput Biol Med 2021; 137:104778. [PMID: 34481183 DOI: 10.1016/j.compbiomed.2021.104778] [Citation(s) in RCA: 48] [Impact Index Per Article: 12.0] [Received: 06/13/2021] [Revised: 08/16/2021] [Accepted: 08/17/2021] [Indexed: 11/26/2022]
Abstract
Tuberculosis (TB) is a worldwide illness caused by the bacterium Mycobacterium tuberculosis. Owing to the high prevalence of multidrug-resistant tuberculosis, numerous traditional strategies for developing novel alternative therapies have been presented. The effectiveness and dependability of these procedures are not always consistent. Peptide-based therapy has recently been regarded as a preferable alternative due to its excellent selectivity in targeting specific cells without affecting normal cells. However, due to the rapid growth of the peptide samples, predicting TB accurately has become a challenging task. To effectively identify antitubercular peptides, an intelligent and reliable prediction model is indispensable. An ensemble learning approach was used in this study to improve expected results by compensating for the shortcomings of individual classification algorithms. Initially, three distinct representation approaches were used to formulate the training samples: k-space amino acid composition, composite physiochemical properties, and one-hot encoding. The feature vectors of the applied feature extraction methods were then combined to generate a heterogeneous vector. Finally, utilizing individual and heterogeneous vectors, five classification models of distinct nature were used to evaluate prediction rates. In addition, a genetic algorithm-based ensemble model was used to improve the suggested model's prediction and training capabilities. On the training and independent datasets, the proposed ensemble model achieved an accuracy of 94.47% and 92.68%, respectively. The proposed "iAtbP-Hyb-EnC" model outperformed existing predictors, reporting ~10% higher training accuracy. The "iAtbP-Hyb-EnC" model is suggested to be a reliable tool for scientists and might play a valuable role in academic research and drug discovery. The source code and all datasets are publicly available at https://github.com/Farman335/iAtbP-Hyb-EnC.
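Two of the sequence representations named above can be sketched in a few lines. The sketch assumes the standard 20-letter amino acid alphabet and interprets "k-space amino acid composition" as counts of residue pairs separated by k positions, which is a common reading but is not confirmed by the abstract; the function names and toy peptide are hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def one_hot(peptide):
    """Flatten a peptide into a 20-per-residue binary vector."""
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    vector = []
    for residue in peptide:
        row = [0] * len(AMINO_ACIDS)
        if residue in index:  # unknown residues stay all-zero
            row[index[residue]] = 1
        vector.extend(row)
    return vector

def k_spaced_pairs(peptide, k):
    """Count residue pairs that have exactly k residues between them."""
    gap = k + 1
    return Counter(peptide[i] + peptide[i + gap] for i in range(len(peptide) - gap))

# Toy peptide, not taken from any dataset.
peptide = "ACDA"
encoded = one_hot(peptide)
pairs_k1 = k_spaced_pairs(peptide, k=1)
```

A heterogeneous vector in the sense of the abstract would concatenate several such representations for the same peptide before classification.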
Affiliation(s)
- Shahid Akbar
  - Department of Computer Science, Abdul Wali Khan University, Mardan, KP, 23200, Pakistan
- Ashfaq Ahmad
  - Department of Computer Science, Abdul Wali Khan University, Mardan, KP, 23200, Pakistan
- Maqsood Hayat
  - Department of Computer Science, Abdul Wali Khan University, Mardan, KP, 23200, Pakistan
- Ateeq Ur Rehman
  - Department of Information Technology, The University of Haripur, KP, Pakistan
- Salman Khan
  - Department of Computer Science, Abdul Wali Khan University, Mardan, KP, 23200, Pakistan
- Farman Ali
  - School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, 210094, China
11
Akmal MA, Hussain W, Rasool N, Khan YD, Khan SA, Chou KC. Using CHOU'S 5-Steps Rule to Predict O-Linked Serine Glycosylation Sites by Blending Position Relative Features and Statistical Moment. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021; 18:2045-2056. [PMID: 31985438 DOI: 10.1109/tcbb.2020.2968441] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 06/10/2023]
Abstract
Glycosylation of proteins in eukaryotic cells is an important and complicated post-translational modification due to its pivotal role and association with crucial physiological functions within most proteins. Identification of glycosylation sites in a polypeptide chain is not an easy task due to multiple impediments. Analytical identification of these sites is expensive and laborious. There is a dire need to develop a reliable computational method for precise determination of such sites, which can help researchers save time and effort. Herein, we propose a novel predictor, namely iGlycoS-PseAAC, by integrating Chou's Pseudo Amino Acid Composition (PseAAC) and relative/absolute position-based features. The self-consistency results show that the accuracy of the model on the benchmark dataset for prediction of O-linked glycosylation at serine sites is 98.8 percent. The overall accuracy of the predictor achieved through 10-fold cross validation, combining the positive and negative results, is 97.2 percent. The overall accuracy achieved through the jackknife test is 96.195 percent, aggregating all the prediction results. Thus the proposed predictor can help in predicting O-linked glycosylated serine sites in an efficient and accurate way. The overall results show that the accuracy of iGlycoS-PseAAC is higher than that of the existing tools.
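The jackknife test mentioned above is leave-one-out validation: each sample is held out once, the model is built on the rest, and the held-out sample is predicted. The sketch below uses a one-dimensional nearest-neighbour rule purely as a placeholder classifier; the data, the function names, and the classifier are all assumptions and have nothing to do with the iGlycoS-PseAAC model itself.

```python
def nearest_neighbor_predict(train_x, train_y, query):
    """Placeholder classifier: label of the closest training value."""
    best = min(range(len(train_x)), key=lambda i: abs(train_x[i] - query))
    return train_y[best]

def jackknife_accuracy(xs, ys, predict):
    """Leave each sample out once and predict it from all the others."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)

# Toy one-dimensional feature values with two well-separated classes.
scores = [0.0, 0.1, 0.2, 1.0, 1.1, 1.2]
labels = [0, 0, 0, 1, 1, 1]
accuracy = jackknife_accuracy(scores, labels, nearest_neighbor_predict)
```

Because every sample requires a fresh model, the jackknife costs as many training runs as there are samples, which is why 10-fold cross validation is often reported alongside it.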
12
Khan YD, Khan NS, Naseer S, Butt AH. iSUMOK-PseAAC: prediction of lysine sumoylation sites using statistical moments and Chou's PseAAC. PeerJ 2021; 9:e11581. [PMID: 34430072 PMCID: PMC8349168 DOI: 10.7717/peerj.11581] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 02/18/2021] [Accepted: 05/19/2021] [Indexed: 01/25/2023]
Abstract
Sumoylation is a post-translational modification that is involved in the adaptation of cells and the functional properties of a large number of proteins. Sumoylation has key importance in subcellular concentration, transcriptional synchronization, chromatin remodeling, response to stress, and regulation of mitosis. Sumoylation is associated with developmental defects in many human diseases such as cancer, Huntington's, Alzheimer's, Parkinson's, spinocerebellar ataxia 1, and amyotrophic lateral sclerosis. The covalent bonding of Sumoylation is essential to inheriting part of the operative characteristics of some other proteins. For that reason, the prediction of Sumoylation sites has significance in the scientific community. A novel and efficient technique is proposed to predict the Sumoylation sites in proteins by incorporating Chou's Pseudo Amino Acid Composition (PseAAC) with statistical moments-based features. The outcomes from the proposed system using 10-fold cross-validation testing are 94.51%, 94.24%, 94.79% and 0.8903 for accuracy, sensitivity, specificity and MCC, respectively. The performance of the proposed system is so far the best in comparison to the other state-of-the-art methods. The code for the current study is available on GitHub at: https://github.com/csbioinfopk/iSumoK-PseAAC.
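The four reported metrics all derive from a binary confusion matrix, and the Matthews correlation coefficient (MCC) is a correlation in the range -1 to 1 rather than a percentage. The sketch below uses the standard definitions; the counts are made-up values chosen for easy arithmetic and are not the paper's results.

```python
import math

def binary_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, specificity and MCC from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal count is zero.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return accuracy, sensitivity, specificity, mcc

# Hypothetical counts: 50 positive and 50 negative sites.
acc, sens, spec, mcc = binary_metrics(tp=40, fn=10, tn=40, fp=10)
```

With these counts the first three metrics are 0.8 and the MCC is 0.6, which illustrates that MCC is typically lower than accuracy on the same predictions.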
Affiliation(s)
- Yaser Daanial Khan
- Department of Computer Science, School of Systems and Technology, University of Management and Technology, Lahore, Punjab, Pakistan
| | - Nabeel Sabir Khan
- Department of Computer Science, School of Systems and Technology, University of Management and Technology, Lahore, Punjab, Pakistan
| | - Sheraz Naseer
- Department of Computer Science, School of Systems and Technology, University of Management and Technology, Lahore, Punjab, Pakistan
| | - Ahmad Hassan Butt
- Department of Computer Science, School of Systems and Technology, University of Management and Technology, Lahore, Punjab, Pakistan
|
13
|
Naseer S, Hussain W, Khan YD, Rasool N. NPalmitoylDeep-PseAAC: A Predictor of N-Palmitoylation Sites in Proteins Using Deep Representations of Proteins and PseAAC via Modified 5-Steps Rule. Curr Bioinform 2021. [DOI: 10.2174/1574893615999200605142828] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Indexed: 11/22/2022]
Abstract
Background:
Among all the major post-translational modifications, lipid modifications possess special significance due to their widespread functional importance in eukaryotic cells. There exist multiple types of lipid modification, and palmitoylation is one of the broader types, itself having three different types. N-Palmitoylation is carried out by attachment of palmitic acid to an N-terminal cysteine. Due to the association of N-Palmitoylation with various biological functions and with diseases such as Alzheimer’s and other neurodegenerative diseases, its identification is very important.
Objective:
The in vitro, ex vivo and in vivo identification of palmitoylation is laborious, time-consuming and costly. There is a pressing need for an efficient and accurate computational model to help researchers and biologists identify these sites easily. Herein, we propose a novel prediction model for the identification of N-Palmitoylation sites in proteins.
Method:
The proposed prediction model is developed by combining Chou’s Pseudo Amino Acid Composition (PseAAC) with deep neural networks. We used well-known deep neural networks (DNNs) both for learning a feature representation of peptide sequences and for developing a prediction model to perform classification.
Results:
Among the different DNNs, the Gated Recurrent Unit (GRU) based RNN model showed the highest scores in terms of accuracy and all other computed measures, and it outperforms all the previously reported predictors.
Conclusion:
The proposed GRU-based RNN model can help to identify N-Palmitoylation efficiently and accurately, which can help scientists understand the mechanism of this modification in proteins.
Affiliation(s)
- Sheraz Naseer
- Department of Computer Science, School of Systems and Technology, University of Management and Technology, P.O. Box 10033, C-II, Johar Town, Lahore 54770, Pakistan
| | - Waqar Hussain
- National Center of Artificial Intelligence, Punjab University College of Information Technology, University of the Punjab, Lahore, Pakistan
| | - Yaser Daanial Khan
- Department of Computer Science, School of Systems and Technology, University of Management and Technology, P.O. Box 10033, C-II, Johar Town, Lahore 54770, Pakistan
| | - Nouman Rasool
- Dr Panjwani Center for Molecular Medicine and Drug Research, International Center for Chemical and Biological Sciences, University of Karachi, Karachi, 75270, Pakistan
|
14
|
Dong GF, Zheng L, Huang SH, Gao J, Zuo YC. Amino Acid Reduction Can Help to Improve the Identification of Antimicrobial Peptides and Their Functional Activities. Front Genet 2021; 12:669328. [PMID: 33959153 PMCID: PMC8093877 DOI: 10.3389/fgene.2021.669328] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 02/18/2021] [Accepted: 03/23/2021] [Indexed: 02/03/2023] Open
Abstract
Antimicrobial peptides (AMPs) are considered potential substitutes for antibiotics in the design of new anti-infective drugs. Several machine learning algorithms and web servers exist for identifying AMPs and their functional activities. However, there is still room for improvement in prediction algorithms and feature extraction methods. The reduced amino acid (RAA) alphabet effectively addresses the problems of simplifying protein complexity and recognizing structurally conserved regions. This article evaluates in detail the performance of more than 5,000 reduced amino acid descriptors generated from 74 types of reduced amino acid alphabet, in a first stage and a second stage, to construct a two-stage classifier, Identification of Antimicrobial Peptides by Reduced Amino Acid Cluster (iAMP-RAAC), for identifying AMPs and their functional activities, respectively. The results show that the first-stage AMP classifier achieves accuracies of 97.21% and 97.11% on the training dataset and the independent test dataset, respectively. In the second stage, the classifier still shows good performance: at least three of the four metrics, sensitivity (SN), specificity (SP), accuracy (ACC), and Matthews correlation coefficient (MCC), exceed the results reported in the literature. Further, ANOVA with incremental feature selection (IFS) is used for feature selection, which further improves the prediction performance of each stage. Finally, a user-friendly web server, iAMP-RAAC, is established at http://bioinfor.imu.edu.cn/iampraac.
Affiliation(s)
- Gai-Fang Dong
- Inner Mongolia Autonomous Region Key Laboratory of Big Data Research and Application of Agriculture and Animal Husbandry, College of Computer and Information Engineering, Inner Mongolia Agricultural University, Hohhot, China
| | - Lei Zheng
- The State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, China
| | - Sheng-Hui Huang
- The State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, China
| | - Jing Gao
- Inner Mongolia Autonomous Region Key Laboratory of Big Data Research and Application of Agriculture and Animal Husbandry, College of Computer and Information Engineering, Inner Mongolia Agricultural University, Hohhot, China
| | - Yong-Chun Zuo
- The State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, China
|
15
|
iAmideV-Deep: Valine Amidation Site Prediction in Proteins Using Deep Learning and Pseudo Amino Acid Compositions. Symmetry (Basel) 2021. [DOI: 10.3390/sym13040560] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Indexed: 02/01/2023] Open
Abstract
Amidation is an important post-translational modification where a peptide ends with an amide group (–NH2) rather than a carboxyl group (–COOH). These amidated peptides are less sensitive to proteolytic degradation and have an extended half-life in the bloodstream. Amides are used in different industries like pharmaceuticals, natural products, and biologically active compounds. The in-vivo, ex-vivo, and in-vitro identification of amidation sites is a costly and time-consuming but important task for studying the physicochemical properties of amidated peptides. A less costly and efficient alternative is to supplement wet lab experiments with accurate computational models. Hence, an urgent need exists for efficient and accurate computational models to easily identify amidated sites in peptides. In this study, we present a new predictor, based on deep neural networks (DNN) and Pseudo Amino Acid Compositions (PseAAC), to learn efficient, task-specific, and effective representations for valine amidation site identification. Well-known DNN architectures are used in this contribution to learn peptide sequence representations and classify peptide chains. Of all the DNN-based predictors developed in this study, the convolutional neural network-based model showed the best performance, surpassing all other DNN-based models and previously reported methods. The proposed model will supplement in-vivo methods and help scientists determine valine amidation efficiently and accurately, which in turn will enhance understanding of valine amidation in different biological processes.
|
16
|
Awais M, Hussain W, Khan YD, Rasool N, Khan SA, Chou KC. iPhosH-PseAAC: Identify Phosphohistidine Sites in Proteins by Blending Statistical Moments and Position Relative Features According to the Chou's 5-Step Rule and General Pseudo Amino Acid Composition. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021; 18:596-610. [PMID: 31144645 DOI: 10.1109/tcbb.2019.2919025] [Citation(s) in RCA: 48] [Impact Index Per Article: 12.0] [Indexed: 06/09/2023]
Abstract
Protein phosphorylation is one of the key mechanisms in prokaryotes and eukaryotes and is responsible for various biological functions such as protein degradation, intracellular localization, a multitude of cellular processes, molecular association, cytoskeletal dynamics, and enzymatic inhibition/activation. Phosphohistidine (PhosH) has a key role in a number of biological processes, from central metabolism to signalling, in eukaryotes and bacteria. Thus, identification of phosphohistidine sites in a protein sequence is crucial, and experimental identification can be expensive, time-consuming, and laborious. To address this problem, we propose a novel computational model, iPhosH-PseAAC, for predicting phosphohistidine sites in a given protein sequence using pseudo amino acid composition (PseAAC), statistical moments, and position relative features. The results of the proposed predictor are validated through self-consistency testing, 10-fold cross-validation, and jackknife testing. Self-consistency validation gave 100 percent accuracy, whereas cross-validation gave 94.26 percent accuracy. Moreover, jackknife testing gave 97.07 percent accuracy for the proposed model. Thus, the proposed model iPhosH-PseAAC is well able to predict the PhosH sites in given proteins.
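The three validation schemes named in this abstract differ only in how samples are divided between training and testing. The following is a minimal sketch of the index bookkeeping, assuming samples are addressed by integer position; the helper names `k_fold_splits` and `jackknife_splits` are hypothetical and do not come from the paper. Self-consistency testing, by contrast, trains and tests on the same full dataset, so it needs no split at all.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous, nearly equal folds."""
    # The first (n_samples % k) folds receive one extra sample
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = [i for i in range(n_samples) if i < start or i >= stop]
        yield train_idx, test_idx
        start = stop

def jackknife_splits(n_samples):
    """Jackknife (leave-one-out): every sample is the test set exactly once."""
    return k_fold_splits(n_samples, n_samples)

ten_fold = list(k_fold_splits(100, 10))   # 10 folds of 10 test samples each
jackknife = list(jackknife_splits(5))     # 5 folds of 1 test sample each
```

In practice the sample order is usually shuffled before folding. The jackknife produces a single deterministic result for a given dataset, whereas k-fold results depend on how the folds happen to be drawn.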
|
17
|
Liu GH, Zhang BW, Qian G, Wang B, Mao B, Bichindaritz I. Bioimage-Based Prediction of Protein Subcellular Location in Human Tissue with Ensemble Features and Deep Networks. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2020; 17:1966-1980. [PMID: 31107658 DOI: 10.1109/tcbb.2019.2917429] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Indexed: 06/09/2023]
Abstract
Prediction of protein subcellular location has currently become a hot topic because it has been proven to be useful for understanding both the disease mechanisms and novel drug design. With the rapid development of automated microscopic imaging technology in recent years, classification methods of bioimage-based protein subcellular location have attracted considerable attention for images can describe the protein distribution intuitively and in detail. In the current study, a prediction method of protein subcellular location was proposed based on multi-view image features that are extracted from three different views, including the four texture features of the original image, the global and local features of the protein extracted from the protein channel images after color segmentation, and the global features of DNA extracted from the DNA channel image. Finally, the extracted features were combined together to improve the performance of subcellular localization prediction. From the performance comparison of different combination features under the same classifier, the best ensemble features could be obtained. In this work, a classifier based on Stacked Auto-encoders and the random forest was also put forward. To improve the prediction results, the deep network was combined with the traditional statistical classification methods. Stringent cross-validation and independent validation tests on the benchmark dataset demonstrated the efficacy of the proposed method.
|
18
|
Cong H, Liu H, Chen Y, Cao Y. Self-evoluting framework of deep convolutional neural network for multilocus protein subcellular localization. Med Biol Eng Comput 2020; 58:3017-3038. [PMID: 33078303 DOI: 10.1007/s11517-020-02275-w] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 10/02/2019] [Accepted: 10/14/2020] [Indexed: 12/12/2022]
Abstract
In the present paper, deep convolutional neural network (DCNN) is applied to multilocus protein subcellular localization as it is more suitable for multi-class classification. There are two main problems with this application. First, the appropriate features for correlation between multiple sites are hard to find. Second, the classifier structure is difficult to determine as it is greatly affected by the distribution of classified data. To solve these problems, a self-evoluting framework using DCNNs for multilocus protein subcellular localization is proposed. It has three characteristics that the previous algorithms do not. The first is that it combines the ant colony algorithm with the DCNN to form a self-evoluting algorithm for multilocus protein subcellular localization. The second is that it randomly groups subcellular sites using a limited random k-labelsets multi-label classification method. It also solves complex problems in a divide-and-conquer approach and proposes a flexible expansion model. The third is that it realizes the random selection feature extraction method in the positioning process and avoids the defects in individual feature extraction methods. The algorithm in the present paper is tested on the human database, and the overall correct rate is 67.17%, which is higher than that for the stacked self-encoder (SAE), support vector machine (SVM), random forest classifier (RF), or single deep convolutional neural network.
Graphical abstract: The algorithm mentioned in the present paper mainly includes four parts. They are protein sequence data preprocessing, integrated DCNN model construction, finding optimal DCNN combination by ant colony optimization, and protein subcellular localization for sequences. These parts are sequential relationships and the data obtained in the previous part is the basis for the latter part of the function. In the part of data preprocessing, the limited RAkEL multi-label classification method is used to randomly group subcellular sites. At the same time, the feature fusion of protein sequences is carried out by using multiple feature extraction methods. Each combination including features and sites information corresponds to a DCNN model. In the part of finding optimal DCNN combination by ant colony optimization, the main purpose is to find the best combination of DCNN models through the global optimization ability of the ant colony algorithm. The positioning of sequences is mainly to obtain multilocus subcellular localization by the optimal model combination.
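The random grouping of subcellular sites described above builds on the random k-labelsets (RAkEL) idea: draw small random subsets of the labels, treat each subset as its own multi-class problem, then combine the per-subset predictions by voting. The sketch below illustrates standard RAkEL bookkeeping only. It is not the authors' "limited" variant, whose details the abstract does not give, and the function names and location labels are hypothetical.

```python
import random
from itertools import combinations

def sample_labelsets(labels, k, m, seed=0):
    """Draw up to m distinct k-label subsets (overlap between subsets is allowed)."""
    candidates = list(combinations(sorted(labels), k))
    rng = random.Random(seed)
    return rng.sample(candidates, min(m, len(candidates)))

def project(sample_labels, labelset):
    """Label-powerset class of one sample restricted to one labelset."""
    return tuple(sorted(set(sample_labels) & set(labelset)))

def vote(predictions, threshold=0.5):
    """Combine per-labelset predictions into one final label set.

    predictions: list of (labelset, predicted_subset) pairs. A label is kept
    when the fraction of labelsets containing it that voted for it exceeds
    the threshold.
    """
    seen, votes = {}, {}
    for labelset, predicted in predictions:
        for label in labelset:
            seen[label] = seen.get(label, 0) + 1
            votes[label] = votes.get(label, 0) + (1 if label in predicted else 0)
    return {label for label in seen if votes[label] / seen[label] > threshold}

# Hypothetical location labels, for illustration only
locations = ["nucleus", "cytoplasm", "membrane", "mitochondrion"]
labelsets = sample_labelsets(locations, k=2, m=3)
target = project(["nucleus", "cytoplasm"], ("cytoplasm", "membrane"))
final = vote([(("a", "b"), ("a",)), (("a", "c"), ("a", "c")), (("b", "c"), ())])
```

Each labelset would be paired with its own base classifier (a DCNN in this paper, a random forest in some related work) trained on the projected targets. Only the grouping and voting steps are shown here.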
Affiliation(s)
- Hanhan Cong
- School of Information Science and Engineering, Shandong Normal University, No. 88, Wenhua East Road, Jinan City, China.,Shandong Provincial Key Laboratory for Novel Distributed Computer Software Technology, Shandong Normal University, Jinan, China
| | - Hong Liu
- School of Information Science and Engineering, Shandong Normal University, No. 88, Wenhua East Road, Jinan City, China. .,Shandong Provincial Key Laboratory for Novel Distributed Computer Software Technology, Shandong Normal University, Jinan, China.
| | - Yuehui Chen
- School of Information Science and Engineering, University of Jinan, Jinan, China.,Shandong Provincial Key Laboratory of Network Based Intelligent Computing, University of Jinan, Jinan, China
| | - Yi Cao
- School of Information Science and Engineering, University of Jinan, Jinan, China.,Shandong Provincial Key Laboratory of Network Based Intelligent Computing, University of Jinan, Jinan, China
|
19
|
Gachpazan M, Kashani H, Khazaei M, Hassanian SM, Rezayi M, Asgharzadeh F, Ghayour-Mobarhan M, Ferns GA, Avan A. The Impact of Statin Therapy on the Survival of Patients with Gastrointestinal Cancer. Curr Drug Targets 2020; 20:738-747. [PMID: 30539694 DOI: 10.2174/1389450120666181211165449] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 09/21/2018] [Revised: 10/25/2018] [Accepted: 12/05/2018] [Indexed: 12/13/2022]
Abstract
Statins are 3-hydroxy-3-methylglutaryl coenzyme A (HMG-CoA) reductase inhibitors that may play an important role in the evolution of cancers, due to their effects on cancer cell metabolism. Statins affect several potential pathways, including cell proliferation, angiogenesis, apoptosis and metastasis. The number of trials assessing the putative clinical benefits of statins in cancer is increasing. Currently, there are several trials listed on the global trial identifier website clinicaltrials.gov. Given the compelling evidence from these trials in a variety of clinical settings, there have been calls for a clinical trial of statins in the adjuvant gastrointestinal cancer setting. However, randomized controlled trials on specific cancer types in relation to statin use, as well as studies on populations without a clinical indication for using statins, have elucidated some potential underlying biological mechanisms, and the investigation of different statins is probably warranted. It would be useful for these trials to incorporate the assessment of tumour biomarkers predictive of statin response in their design. This review summarizes the recent preclinical and clinical studies that assess the application of statins in the treatment of gastrointestinal cancers, with particular emphasis on their association with cancer risk.
Affiliation(s)
- Meysam Gachpazan
- Metabolic syndrome Research center, Mashhad University of Medical Sciences, Mashhad, Iran.,Department of Modern Sciences and Technologies; Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Hoda Kashani
- Department of Modern Sciences and Technologies; Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Majid Khazaei
- Metabolic syndrome Research center, Mashhad University of Medical Sciences, Mashhad, Iran.,Student Research Committee, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Seyed Mahdi Hassanian
- Metabolic syndrome Research center, Mashhad University of Medical Sciences, Mashhad, Iran.,Department of Medical Biochemistry; Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Majid Rezayi
- Metabolic syndrome Research center, Mashhad University of Medical Sciences, Mashhad, Iran.,Department of Modern Sciences and Technologies; Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Fereshteh Asgharzadeh
- Student Research Committee, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Majid Ghayour-Mobarhan
- Metabolic syndrome Research center, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Gordon A Ferns
- Brighton & Sussex Medical School, Division of Medical Education, Falmer, Brighton, Sussex BN1 9PH, United Kingdom
| | - Amir Avan
- Metabolic syndrome Research center, Mashhad University of Medical Sciences, Mashhad, Iran.,Department of Modern Sciences and Technologies; Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran.,Cancer Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
|
20
|
Abstract
During the last three decades or so, many efforts have been made to study the protein cleavage sites of disease-causing enzymes, such as HIV (Human Immunodeficiency Virus) protease and SARS (Severe Acute Respiratory Syndrome) coronavirus main proteinase. It has become increasingly clear via this mini-review that the motivation driving the aforementioned studies is quite wise, and that the results acquired through these studies are very rewarding, particularly for developing peptide drugs.
Affiliation(s)
- Kuo-Chen Chou
- Gordon Life Science Institute, Boston, MA 02478, United States
|
21
|
Chou KC. An Insightful 10-year Recollection Since the Emergence of the 5-steps Rule. Curr Pharm Des 2020; 25:4223-4234. [PMID: 31782354 DOI: 10.2174/1381612825666191129164042] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 09/18/2019] [Accepted: 11/25/2019] [Indexed: 11/22/2022]
Abstract
OBJECTIVE: One of the most challenging problems is how to formulate a biological sequence as a vector while largely keeping its sequence order information. METHODS: To address this problem, the approach of Pseudo Amino Acid Components, or PseAAC, has been developed. RESULTS AND CONCLUSION: It has become increasingly clear via the 10-year recollection that the aforementioned proposal has indeed been very powerful.
Affiliation(s)
- Kuo-Chen Chou
- Gordon Life Science Institute, Boston, Massachusetts 02478, United States.,Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
|
22
|
Saikia S, Bordoloi M, Sarmah R. Established and In-trial GPCR Families in Clinical Trials: A Review for Target Selection. Curr Drug Targets 2020; 20:522-539. [PMID: 30394207 DOI: 10.2174/1389450120666181105152439] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 06/18/2018] [Revised: 08/28/2018] [Accepted: 10/22/2018] [Indexed: 12/14/2022]
Abstract
GPCRs (G-protein coupled receptors) constitute the largest family of drug targets in clinical trials, accounting for about 34% of FDA (Food and Drug Administration) approved drugs, which act on 108 unique GPCRs. Factors such as readily identifiable conserved motifs in structures, 127 orphan GPCRs despite various de-orphaning techniques, directed functional antibodies for validation as drug targets, etc. have widened their therapeutic windows. The availability of 44 crystal structures of unique receptors, unexplored non-olfactory GPCRs (encoded by 50% of the human genome) and 205 ligand receptor complexes now presents a strong foundation for structure-based drug discovery and design. The growing impact of polypharmacology for complex diseases like schizophrenia, cancer etc. warrants novel targets, and considering both the undiscriminating nature and the selectivity of GPCRs, they can fulfill this purpose. Again, natural genetic variations within the human genome sometimes confound the therapeutic expectations of some drugs, resulting in medication response differences and ADRs (adverse drug reactions). About 30 billion US dollars are lost annually to poor accounting of ADRs in the US alone. To curb such undesirable reactions, knowledge of established GPCR families and of those currently in clinical trials can greatly inform drug design, including "off-target" effects, and reduce the economic resources and time required. The druggability of GPCR protein families and the critical roles they play in complex diseases are explained. Class A, class B1, class C and class F are the generally established families, and GPCRs in phase I (19%), phase II (29%) and phase III (52%) studies are also reviewed. Frizzled receptors accounted for the most in-trial targets in phase I studies, neuropeptides in phase II and melanocortin in phase III. Also, the bioapplications of nanoparticles, along with future prospects for both nanomedicine and the GPCR drug industry, are discussed. Further, the computational techniques and methods employed for different target validations are reviewed, along with their future potential for GPCR-based drug discovery.
Affiliation(s)
- Surovi Saikia
- Natural Products Chemistry Group, CSIR North East Institute of Science & Technology, Jorhat-785006, Assam, India
| | - Manobjyoti Bordoloi
- Natural Products Chemistry Group, CSIR North East Institute of Science & Technology, Jorhat-785006, Assam, India
| | - Rajeev Sarmah
- Allied Health Sciences, Assam Down Town University, Panikhaiti, Guwahati 781026, Assam, India
|
23
|
Malik N, Dhiman P, Khatkar A. In Silico Design and Synthesis of Targeted Curcumin Derivatives as Xanthine Oxidase Inhibitors. Curr Drug Targets 2020; 20:593-603. [PMID: 30465499 DOI: 10.2174/1389450120666181122100511] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 08/21/2018] [Revised: 09/24/2018] [Accepted: 11/02/2018] [Indexed: 12/11/2022]
Abstract
BACKGROUND Curcumin is a well-known pharmacophore and some of its derivatives have been shown to target xanthine oxidase (XO) to alleviate disorders caused by the excess production of uric acid. OBJECTIVE Curcumin-based derivatives were designed, synthesized and evaluated for their antioxidant and xanthine oxidase inhibitory potential. METHOD In this report, we designed and synthesized two series of curcumin derivatives modified by inserting pyrazole and pyrimidine rings at the central keto group. The synthesized compounds were evaluated for their antioxidant and xanthine oxidase inhibitory potential. RESULTS Results showed that pyrazole analogues of curcumin produced excellent XO inhibitory potency, with IC50 values varying from 6.255 µM to 10.503 µM. Among the pyrimidine derivatives, compound CU3a1, which has an ortho nitro substitution, exhibited more potent xanthine oxidase inhibitory activity than any other curcumin derivative of this series. CONCLUSION Curcumin derivatives CU5b1, CU5b2, CU5b3, and CU3a1 showed potent inhibitory activity against xanthine oxidase along with good antioxidant potential.
Affiliation(s)
- Neelam Malik
- Laboratory for Preservation Technology and Enzyme Inhibition Studies, Department of Pharmaceutical Sciences, M.D.University, Rohtak, Haryana, India
| | - Priyanka Dhiman
- Laboratory for Preservation Technology and Enzyme Inhibition Studies, Department of Pharmaceutical Sciences, M.D.University, Rohtak, Haryana, India
| | - Anurag Khatkar
- Laboratory for Preservation Technology and Enzyme Inhibition Studies, Department of Pharmaceutical Sciences, M.D.University, Rohtak, Haryana, India
|
24
|
Hu Y, Lu Y, Wang S, Zhang M, Qu X, Niu B. Application of Machine Learning Approaches for the Design and Study of Anticancer Drugs. Curr Drug Targets 2020; 20:488-500. [PMID: 30091413 DOI: 10.2174/1389450119666180809122244] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 05/17/2018] [Revised: 06/19/2018] [Accepted: 06/25/2018] [Indexed: 12/14/2022]
Abstract
BACKGROUND Globally the number of cancer patients and deaths are continuing to increase yearly, and cancer has, therefore, become one of the world's highest causes of morbidity and mortality. In recent years, the study of anticancer drugs has become one of the most popular medical topics. OBJECTIVE In this review, in order to study the application of machine learning in predicting anticancer drugs activity, some machine learning approaches such as Linear Discriminant Analysis (LDA), Principal components analysis (PCA), Support Vector Machine (SVM), Random forest (RF), k-Nearest Neighbor (kNN), and Naïve Bayes (NB) were selected, and the examples of their applications in anticancer drugs design are listed. RESULTS Machine learning contributes a lot to anticancer drugs design and helps researchers by saving time and is cost effective. However, it can only be an assisting tool for drug design. CONCLUSION This paper introduces the application of machine learning approaches in anticancer drug design. Many examples of success in identification and prediction in the area of anticancer drugs activity prediction are discussed, and the anticancer drugs research is still in active progress. Moreover, the merits of some web servers related to anticancer drugs are mentioned.
Affiliation(s)
- Yan Hu
- School of Life Sciences, Shanghai University, Shanghai 200444, China
| | - Yi Lu
- School of Life Sciences, Shanghai University, Shanghai 200444, China
| | - Shuo Wang
- School of Life Sciences, Shanghai University, Shanghai 200444, China
| | - Mengying Zhang
- School of Life Sciences, Shanghai University, Shanghai 200444, China
| | - Xiaosheng Qu
- National Engineering Laboratory of Southwest Endangered Medicinal Resources Development, Guangxi Botanical Garden of Medicinal Plants, 530023,Nanning, China
| | - Bing Niu
- School of Life Sciences, Shanghai University, Shanghai 200444, China
|
26
|
Zheng L, Huang S, Mu N, Zhang H, Zhang J, Chang Y, Yang L, Zuo Y. RAACBook: a web server of reduced amino acid alphabet for sequence-dependent inference by using Chou's five-step rule. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2020; 2019:5650975. [PMID: 31802128 PMCID: PMC6893003 DOI: 10.1093/database/baz131] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Received: 06/20/2019] [Revised: 10/16/2019] [Accepted: 10/17/2019] [Indexed: 12/12/2022]
Abstract
By reducing the amino acid alphabet, protein complexity can be significantly simplified, which could improve computational efficiency, decrease information redundancy and reduce the chance of overfitting. Although some reduced alphabets have been proposed, different classification rules could produce distinctive results for protein sequence analysis. Thus, a systematic framework for reduced alphabets is urgently needed. In this work, we constructed a comprehensive web server called RAACBook for protein sequence analysis and machine learning application by integrating reduced alphabets. The web server contains three parts: (i) 74 types of reduced amino acid alphabet were manually extracted to generate 673 reduced amino acid clusters (RAACs) for dealing with unique protein problems. It is easy for users to select desired RAACs from a multilayer browser tool. (ii) An online tool was developed to analyze the primary sequence of a protein. The tool can produce K-tuple reduced amino acid composition by defining three correlation parameters (K-tuple, g-gap, λ-correlation). The results are visualized as sequence alignment, mergence of RAA composition, feature distribution and logo of reduced sequence. (iii) A machine learning server is provided to train protein classification models based on K-tuple RAAC. The optimal model can be selected according to the evaluation indexes (ROC, AUC, MCC, etc.). In conclusion, RAACBook presents a powerful and user-friendly service for protein sequence analysis and computational proteomics. RAACBook is freely available at http://bioinfor.imu.edu.cn/raacbook. Database URL: http://bioinfor.imu.edu.cn/raacbook
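To make the idea of a K-tuple reduced amino acid composition concrete, the sketch below maps a sequence onto a reduced alphabet and counts gapped K-tuples. It is an illustration under stated assumptions: the four-cluster alphabet is hypothetical and is not one of RAACBook's 74 alphabets, the g-gap is read as "skip g residues between consecutive tuple positions", which is one common interpretation rather than the server's documented definition, and the λ-correlation parameter is omitted.

```python
from collections import Counter

# Hypothetical 4-cluster reduction, for illustration only
CLUSTERS = ["AVLIMC", "FWYH", "STNQGP", "DEKR"]
# Every residue is represented by the first letter of its cluster
REDUCE = {aa: cluster[0] for cluster in CLUSTERS for aa in cluster}

def reduce_sequence(seq):
    """Rewrite a protein sequence in the reduced alphabet."""
    return "".join(REDUCE[aa] for aa in seq)

def ktuple_composition(seq, k=2, gap=0):
    """Relative frequency of each K-tuple in the reduced sequence.

    gap is the number of residues skipped between consecutive tuple positions.
    """
    reduced = reduce_sequence(seq)
    step = gap + 1
    span = (k - 1) * step  # distance from first to last tuple position
    tuples = [reduced[i:i + span + 1:step] for i in range(len(reduced) - span)]
    counts = Counter(tuples)
    total = len(tuples)
    return {t: c / total for t, c in counts.items()} if total else {}

reduced = reduce_sequence("MKVLAAGIDE")
composition = ktuple_composition("MKVLAAGIDE", k=2, gap=0)
gapped = ktuple_composition("MKVLAAGIDE", k=2, gap=1)
```

With four clusters there are at most 4^K distinct tuples, against 20^K for the full alphabet, which is the source of the dimensionality reduction the abstract describes. Non-standard residue codes such as X raise a `KeyError` in this sketch and would need explicit handling.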
Affiliation(s)
- Lei Zheng
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Zhaojun Road No.24, Hohhot, 010070, China
| | - Shenghui Huang
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Zhaojun Road No.24, Hohhot, 010070, China
| | - Nengjiang Mu
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Zhaojun Road No.24, Hohhot, 010070, China
| | - Haoyue Zhang
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Zhaojun Road No.24, Hohhot, 010070, China
| | - Jiayu Zhang
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Zhaojun Road No.24, Hohhot, 010070, China
| | - Yu Chang
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Zhaojun Road No.24, Hohhot, 010070, China
| | - Lei Yang
- College of Bioinformatics Science and Technology, Harbin Medical University, Baojian Road No.157, Harbin 150081, China
| | - Yongchun Zuo
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Zhaojun Road No.24, Hohhot, 010070, China
|
27
|
Rehman AU, Olof Olsson P, Khan N, Khan K. Identification of Human Secretome and Membrane Proteome-Based Cancer Biomarkers Utilizing Bioinformatics. J Membr Biol 2020; 253:257-270. [PMID: 32415382 DOI: 10.1007/s00232-020-00122-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/15/2020] [Accepted: 05/02/2020] [Indexed: 12/12/2022]
Abstract
Cellular secreted proteins (secretome), together with cellular membrane proteins, collectively referred to as secretory and membrane proteins (SMPs), are a large potential source of biomarkers as they can be used to indicate cell types and conditions. SMPs have been shown to be ideal candidates for several clinically approved drug regimens including for cancer. This study aimed at performing a functional analysis of SMPs within different cancer subtypes to provide clinical targets for potential prognostic, diagnostic and therapeutic use. Using an innovative majority decision-based algorithm and transcriptomic data spanning 5 cancer types and over 3000 samples, we quantified the relative difference in SMP gene expression compared to normal adjacent tissue. A detailed deep data mining analysis revealed a consistent group of downregulated SMP isoforms, enriched in hematopoietic cell lineages (HCL), in multiple cancer types. HCL-associated genes were frequently downregulated in successive cancer stages and high expression was associated with good patient prognosis. In addition, we suggest a potential mechanism by which cancer cells suppress HCL signaling by reducing the expression of immune-related genes. Our data identified potential biomarkers for cancer immunotherapy. We conclude that our approach may be applicable for the delineation of other types of cancer and illuminate specific targets for therapeutics and diagnostics.
Affiliation(s)
- Adeel Ur Rehman
- Hefei National Laboratory for Physical Sciences at Microscale, The CAS Key Laboratory of Innate Immunity and Chronic Diseases, School of Life Sciences, University of Science and Technology of China, Hefei, 230027, China.
| | | | - Naveed Khan
- Max Planck Partner Institute of Computational Biology, Shanghai Institute of Biological Sciences, Shanghai, 200032, China
| | - Khalid Khan
- Department of Respiratory and Critical Care Medicine, The Second Clinical Medical College (Shenzhen People's Hospital) of Jinan University, Shenzhen Institute of Respiratory Diseases, Shenzhen, China; Integrated Chinese and Western Medicine Postdoctoral Research Station, Jinan University, Guangzhou, China
|
28
|
Wang S, Wang Y, Yu C, Cao Y, Yu Y, Pan Y, Su D, Lu Q, Yang W, Zuo Y, Yang L. Characterization of the relationship between FLI1 and immune infiltrate level in tumour immune microenvironment for breast cancer. J Cell Mol Med 2020; 24:5501-5514. [PMID: 32249526 PMCID: PMC7214163 DOI: 10.1111/jcmm.15205] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/20/2019] [Revised: 01/31/2020] [Accepted: 03/06/2020] [Indexed: 12/24/2022] Open
Abstract
Breast cancer is the most common cancer and the leading cause of cancer death among women in the world. Tumour‐infiltrating lymphocytes are defined as white blood cells that have left the vasculature and localized in tumours. Recently, tumour‐infiltrating lymphocytes were found to be associated with good prognosis and response to immunotherapy in tumours. In this study, to examine the influence of FLI1 on the immune system in breast cancer, we interrogated the relationship between FLI1 expression levels and the infiltration levels of 28 immune cell types. By splitting the breast cancer samples into high and low FLI1 expression subtypes, we found that the high FLI1 expression subtype was enriched in many immune cell types, and the up‐regulated differentially expressed genes between them were enriched in immune system processes, immune‐related KEGG pathways and biological processes. In addition, many important immune‐related features were found to be positively correlated with the FLI1 expression level. Furthermore, we found that FLI1 was correlated with the immune‐related genes. Our findings may help in recognizing the relationship between the tumour immune microenvironment and FLI1, and may unravel clinical outcomes and immunotherapy utility for FLI1 in breast cancer.
Affiliation(s)
- Shiyuan Wang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| | - Yakun Wang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| | - Chunlu Yu
- Public Health College, Harbin Medical University, Harbin, China
| | - Yiyin Cao
- Public Health College, Harbin Medical University, Harbin, China
| | - Yao Yu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| | - Yi Pan
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| | - Dongqing Su
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| | - Qianzi Lu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| | - Wuritu Yang
- The State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, China
| | - Yongchun Zuo
- The State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, China
| | - Lei Yang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
|
29
|
Identifying FL11 subtype by characterizing tumor immune microenvironment in prostate adenocarcinoma via Chou's 5-steps rule. Genomics 2020; 112:1500-1515. [DOI: 10.1016/j.ygeno.2019.08.021] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/28/2019] [Revised: 08/03/2019] [Accepted: 08/26/2019] [Indexed: 12/14/2022]
|
30
|
López Y, Dehzangi A, Reddy HM, Sharma A. C-iSUMO: A sumoylation site predictor that incorporates intrinsic characteristics of amino acid sequences. Comput Biol Chem 2020; 87:107235. [PMID: 32604027 DOI: 10.1016/j.compbiolchem.2020.107235] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/27/2018] [Revised: 12/16/2019] [Accepted: 02/12/2020] [Indexed: 12/13/2022]
Abstract
Post-translational modifications are considered important molecular interactions in protein science. One of these modifications is "sumoylation", whose computational detection has recently become a challenge. In this paper, we propose a new computational predictor which makes use of the sine and cosine of backbone torsion angles and the accessible surface area for predicting sumoylation sites. The aforementioned features were computed for all the proteins in our benchmark dataset, and a training matrix consisting of sumoylation and non-sumoylation sites was ultimately created. This training matrix was balanced by undersampling the majority class (non-sumoylation sites) using the NearMiss method. Finally, an AdaBoost classifier was used for discriminating between sumoylation and non-sumoylation sites. Our predictor was called "C-iSumo" because of its effective use of circular functions. C-iSumo outperformed another predictor in statistical metrics such as sensitivity (0.734), accuracy (0.746) and Matthews correlation coefficient (0.494).
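The use of circular functions described in the abstract can be sketched as follows. Taking the sine and cosine of a torsion angle removes the artificial jump between +180° and -180°, which describe the same conformation. The function name, the feature order and the way the accessible surface area is appended are assumptions for this example, not the published C-iSUMO feature layout.

```python
import math


def circular_features(phi_deg, psi_deg, asa):
    """Encode backbone torsion angles (in degrees) by their sine and cosine
    and append an accessible-surface-area value (hypothetical layout)."""
    phi = math.radians(phi_deg)
    psi = math.radians(psi_deg)
    return [math.sin(phi), math.cos(phi), math.sin(psi), math.cos(psi), asa]


# +180 and -180 degrees are the same angle, so their encodings coincide
# up to floating-point error, unlike the raw angle values.
at_plus_180 = circular_features(180.0, -60.0, 0.4)
at_minus_180 = circular_features(-180.0, -60.0, 0.4)
```

Feature rows of this kind would then be stacked into the training matrix that the abstract describes as being balanced and passed to a classifier.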
Affiliation(s)
- Yosvany López
- Genesis Institute of Genetic Research, Genesis Healthcare Co., Tokyo, Japan.
| | - Abdollah Dehzangi
- Department of Computer Science, Morgan State University, Baltimore, Maryland, USA
| | | | - Alok Sharma
- School of Engineering and Physics, University of the South Pacific, Suva, Fiji; Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, Japan; Institute for Integrated and Intelligent Systems, Griffith University, Queensland, Australia.
|
31
|
Zheng H, Yang H, Gong D, Mai L, Qiu X, Chen L, Su X, Wei R, Zeng Z. Progress in the Mechanism and Clinical Application of Cilostazol. Curr Top Med Chem 2020; 19:2919-2936. [PMID: 31763974 DOI: 10.2174/1568026619666191122123855] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/27/2019] [Revised: 07/27/2019] [Accepted: 08/02/2019] [Indexed: 12/20/2022]
Abstract
Cilostazol is a unique platelet inhibitor that has been used clinically for more than 20 years. As a phosphodiesterase type III inhibitor, cilostazol reversibly inhibits platelet aggregation, induces vasodilation, has antiproliferative effects, and is widely used in the treatment of peripheral arterial disease, cerebrovascular disease, percutaneous coronary intervention, etc. This article briefly reviews the pharmacological mechanisms and clinical application of cilostazol.
Affiliation(s)
- Huilei Zheng
- Department of Medical Examination & Health Management, First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China; Guangxi Key Laboratory of Precision Medicine in Cardio-cerebrovascular Diseases Control and Prevention, Nanning, Guangxi, China; Guangxi Clinical Research Center for Cardio-cerebrovascular Diseases, Nanning, Guangxi, China
| | - Hua Yang
- Guangxi Key Laboratory of Precision Medicine in Cardio-cerebrovascular Diseases Control and Prevention, Nanning, Guangxi, China; Guangxi Clinical Research Center for Cardio-cerebrovascular Diseases, Nanning, Guangxi, China; Department of Critical Care Medicine, Second People's Hospital of Nanning, Nanning, Guangxi, China
| | - Danping Gong
- Guangxi Key Laboratory of Precision Medicine in Cardio-cerebrovascular Diseases Control and Prevention, Nanning, Guangxi, China; Guangxi Clinical Research Center for Cardio-cerebrovascular Diseases, Nanning, Guangxi, China; Elderly Cardiology Ward, First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China
| | - Lanxian Mai
- Guangxi Key Laboratory of Precision Medicine in Cardio-cerebrovascular Diseases Control and Prevention, Nanning, Guangxi, China; Guangxi Clinical Research Center for Cardio-cerebrovascular Diseases, Nanning, Guangxi, China; Disciplinary Construction Office, First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China
| | - Xiaoling Qiu
- Guangxi Key Laboratory of Precision Medicine in Cardio-cerebrovascular Diseases Control and Prevention, Nanning, Guangxi, China; Guangxi Clinical Research Center for Cardio-cerebrovascular Diseases, Nanning, Guangxi, China
| | - Lidai Chen
- Guangxi Key Laboratory of Precision Medicine in Cardio-cerebrovascular Diseases Control and Prevention, Nanning, Guangxi, China; Guangxi Clinical Research Center for Cardio-cerebrovascular Diseases, Nanning, Guangxi, China
| | - Xiaozhou Su
- Guangxi Key Laboratory of Precision Medicine in Cardio-cerebrovascular Diseases Control and Prevention, Nanning, Guangxi, China; Guangxi Clinical Research Center for Cardio-cerebrovascular Diseases, Nanning, Guangxi, China
| | - Ruoqi Wei
- Department of Computer Science and Engineering, University of Bridgeport, 126 Park Ave, Bridgeport, CT 06604, United States
| | - Zhiyu Zeng
- Guangxi Key Laboratory of Precision Medicine in Cardio-cerebrovascular Diseases Control and Prevention, Nanning, Guangxi, China; Guangxi Clinical Research Center for Cardio-cerebrovascular Diseases, Nanning, Guangxi, China; Elderly Cardiology Ward, First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China
|
32
|
Garapati HS, Male G, Mishra K. Predicting subcellular localization of proteins using protein-protein interaction data. Genomics 2020; 112:2361-2368. [PMID: 31945465 DOI: 10.1016/j.ygeno.2020.01.007] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/20/2019] [Revised: 01/01/2020] [Accepted: 01/11/2020] [Indexed: 10/25/2022]
Abstract
The knowledge of subcellular localization of proteins can provide useful clues about their functions. The conventional methods to determine the subcellular localization are unable to keep pace with the rate at which the new data is being generated. Thus, though sequence information is available, the localization and function of a number of proteins remains unknown. In this study, we have developed a script that makes use of the physical interactors of a protein and their localization data to predict the subcellular localization. We used the script to predict the localization of yeast proteins for which there is no localization data. Further, we experimentally verified the predicted localization for six arbitrarily chosen proteins and found our predictions to be correct for five of the proteins.
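The idea of inferring a protein's localization from its physical interactors can be sketched in Python as below. The abstract does not state the scoring rule used by the authors' script, so the simple vote count, the function name and the protein identifiers are all assumptions made for illustration.

```python
from collections import Counter


def predict_localization(protein, interactions, known_locations):
    """Return the compartment most often annotated among a protein's
    interaction partners, or None if no partner has localization data.

    interactions: dict mapping a protein to an iterable of partner names.
    known_locations: dict mapping a protein to an iterable of compartments.
    """
    votes = Counter()
    for partner in interactions.get(protein, ()):
        for location in known_locations.get(partner, ()):
            votes[location] += 1
    if not votes:
        return None
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two of three partners are annotated as nuclear.
interactions = {"YFG1": ["P1", "P2", "P3"]}
known_locations = {
    "P1": ["nucleus"],
    "P2": ["nucleus", "cytoplasm"],
    "P3": ["mitochondrion"],
}
prediction = predict_localization("YFG1", interactions, known_locations)
```

A production script would also need a rule for ties and some weighting for partners with many annotations; neither is specified in the abstract, so both are left out here.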
Affiliation(s)
- Hita Sony Garapati
- Department of Biochemistry, School of Life Sciences, University of Hyderabad, Hyderabad 500046, India
| | - Gurranna Male
- Department of Biochemistry, School of Life Sciences, University of Hyderabad, Hyderabad 500046, India
| | - Krishnaveni Mishra
- Department of Biochemistry, School of Life Sciences, University of Hyderabad, Hyderabad 500046, India.
|
33
|
Some illuminating remarks on molecular genetics and genomics as well as drug development. Mol Genet Genomics 2020; 295:261-274. [PMID: 31894399 DOI: 10.1007/s00438-019-01634-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/21/2019] [Accepted: 12/05/2019] [Indexed: 02/07/2023]
Abstract
Facing the explosive growth of biological sequences unearthed in the post-genomic age, one of the most important but also most difficult problems in computational biology is how to express a biological sequence with a discrete model or a vector while still keeping considerable sequence-order information or its special pattern. To deal with such a challenging problem, the ideas of "pseudo amino acid components" and "pseudo K-tuple nucleotide composition" have been proposed. The ideas and their approaches have further stimulated the birth of the "distorted key theory" and the "wenxing diagram", substantially strengthened the power in treating multi-label systems, and led to the establishment of the famous "5-steps rule". All these logical developments are quite natural and are very useful not only for theoretical scientists but also for experimental scientists in conducting genetics/genomics analysis and drug development. Presented in this review paper are also their future perspectives; i.e., their impacts will become even more significant and profound.
|
34
|
Ju Z, Wang SY. Prediction of lysine formylation sites using the composition of k-spaced amino acid pairs via Chou's 5-steps rule and general pseudo components. Genomics 2020; 112:859-866. [DOI: 10.1016/j.ygeno.2019.05.027] [Citation(s) in RCA: 54] [Impact Index Per Article: 10.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/21/2019] [Revised: 05/13/2019] [Accepted: 05/30/2019] [Indexed: 11/30/2022]
|
35
|
Shao YT, Liu XX, Lu Z, Chou KC. pLoc_Deep-mPlant: Predict Subcellular Localization of Plant Proteins by Deep Learning. Natural Science 2020. [DOI: 10.4236/ns.2020.125021] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022]
|
36
|
Lu Z, Chou KC. pLoc_Deep-mGpos: Predict Subcellular Localization of Gram Positive Bacteria Proteins by Deep Learning. Journal of Biomedical Science and Engineering 2020. [DOI: 10.4236/jbise.2020.135005] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022]
|
37
|
Shao Y, Chou KC. pLoc_Deep-mVirus: A CNN Model for Predicting Subcellular Localization of Virus Proteins by Deep Learning. Natural Science 2020. [DOI: 10.4236/ns.2020.126033] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022]
|
38
|
Shao Y, Chou KC. pLoc_Deep-mEuk: Predict Subcellular Localization of Eukaryotic Proteins by Deep Learning. Natural Science 2020. [DOI: 10.4236/ns.2020.126034] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022]
|
39
|
Khanh Le NQ, Nguyen QH, Chen X, Rahardja S, Nguyen BP. Classification of adaptor proteins using recurrent neural networks and PSSM profiles. BMC Genomics 2019; 20:966. [PMID: 31874633 PMCID: PMC6929330 DOI: 10.1186/s12864-019-6335-4] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/24/2019] [Accepted: 11/25/2019] [Indexed: 12/16/2022] Open
Abstract
BACKGROUND Adaptor proteins are carrier proteins that play a crucial role in signal transduction. They commonly consist of several modular domains, each having its own binding activity and operating by forming complexes with other intracellular-signaling molecules. Many studies determined that the adaptor proteins had been implicated in a variety of human diseases. Therefore, creating a precise model to predict the function of adaptor proteins is one of the vital tasks in bioinformatics and computational biology. Few computational biology studies have been conducted to predict the protein functions, and in most of those studies, position specific scoring matrix (PSSM) profiles had been used as the features to be fed into the neural networks. However, the neural networks could not reach the optimal result because the sequential information in PSSMs has been lost. This study proposes an innovative approach by incorporating recurrent neural networks (RNNs) and PSSM profiles to resolve this problem. RESULTS Compared to other state-of-the-art methods which had been applied successfully in other problems, our method achieves enhancement in all of the common measurement metrics. The area under the receiver operating characteristic curve (AUC) metric in prediction of adaptor proteins in the cross-validation and independent datasets are 0.893 and 0.853, respectively. CONCLUSIONS This study opens a research path that can promote the use of RNNs and PSSM profiles in bioinformatics and computational biology. Our approach is reproducible by scientists that aim to improve the performance results of different protein function prediction problems. Our source code and datasets are available at https://github.com/ngphubinh/adaptors.
Affiliation(s)
- Nguyen Quoc Khanh Le
- Professional Master Program in Artificial Intelligence in Medicine, Taipei Medical University, Keelung Road, Da'an District, Taipei City 106, Taiwan (R.O.C.)
| | - Quang H Nguyen
- School of Information and Communication Technology, Hanoi University of Science and Technology, 1 Dai Co Viet, Hanoi 100000, Vietnam
| | - Xuan Chen
- Beijing Genomics Institute, 21 Hongan 3rd Street, Shenzhen 518083, China
| | - Susanto Rahardja
- School of Marine Science and Technology, Northwestern Polytechnical University, 127 West Youyi Road, Xi'an 710072, China.
| | - Binh P Nguyen
- School of Mathematics and Statistics, Victoria University of Wellington, Gate 7, Kelburn Parade, Wellington 6140, New Zealand
|
40
|
Lu B, Liu XH, Liao SM, Lu ZL, Chen D, Troy II FA, Huang RB, Zhou GP. A Possible Modulation Mechanism of Intramolecular and Intermolecular Interactions for NCAM Polysialylation and Cell Migration. Curr Top Med Chem 2019; 19:2271-2282. [PMID: 31648641 DOI: 10.2174/1568026619666191018094805] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/19/2019] [Revised: 08/01/2019] [Accepted: 08/06/2019] [Indexed: 12/31/2022]
Abstract
Polysialic acid (polySia) is a novel glycan that posttranslationally modifies neural cell adhesion molecules (NCAMs) in mammalian cells. Up-regulation of polySia-NCAM expression or NCAM polysialylation is associated with tumor cell migration and progression in many metastatic cancers and neurocognition. It has been known that two highly homologous mammalian polysialyltransferases (polySTs), ST8Sia II (STX) and ST8Sia IV (PST), can catalyze polysialylation of NCAM, and two polybasic domains, polybasic region (PBR) and polysialyltransferase domain (PSTD) in polySTs play key roles in affecting polyST activity or NCAM polysialylation. However, the molecular mechanisms of NCAM polysialylation and cell migration are still not entirely clear. In this minireview, the recent research results about the intermolecular interactions between the PBR and NCAM, the PSTD and cytidine monophosphate-sialic acid (CMP-Sia), the PSTD and polySia, as well as the intramolecular interaction between the PBR and the PSTD within the polyST, are summarized. Based on these cooperative interactions, we have built a novel model of NCAM polysialylation and cell migration mechanisms, which may be helpful to design and develop new polysialyltransferase inhibitors.
Affiliation(s)
- Bo Lu
- The National Engineering Research Center for Non-Food Biorefinery, Guangxi Academy of Sciences, Nanning, Guangxi 530007, China
| | - Xue-Hui Liu
- Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
| | - Si-Ming Liao
- The National Engineering Research Center for Non-Food Biorefinery, Guangxi Academy of Sciences, Nanning, Guangxi 530007, China
| | - Zhi-Long Lu
- The National Engineering Research Center for Non-Food Biorefinery, Guangxi Academy of Sciences, Nanning, Guangxi 530007, China
| | - Dong Chen
- The National Engineering Research Center for Non-Food Biorefinery, Guangxi Academy of Sciences, Nanning, Guangxi 530007, China
| | - Frederic A Troy II
- Department of Biochemistry and Molecular Medicine, University of California School of Medicine, Davis, CA, 95817, United States
| | - Ri-Bo Huang
- The National Engineering Research Center for Non-Food Biorefinery, Guangxi Academy of Sciences, Nanning, Guangxi 530007, China; Life Science and Biotechnology College, Guangxi University, Nanning, Guangxi 530004, China
| | - Guo-Ping Zhou
- The National Engineering Research Center for Non-Food Biorefinery, Guangxi Academy of Sciences, Nanning, Guangxi 530007, China
|
41
|
Javed F, Hayat M. Predicting subcellular localization of multi-label proteins by incorporating the sequence features into Chou's PseAAC. Genomics 2019; 111:1325-1332. [DOI: 10.1016/j.ygeno.2018.09.004] [Citation(s) in RCA: 56] [Impact Index Per Article: 9.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/02/2018] [Accepted: 09/04/2018] [Indexed: 12/13/2022]
|
42
|
pLoc_bal-mHum: Predict subcellular localization of human proteins by PseAAC and quasi-balancing training dataset. Genomics 2019; 111:1274-1282. [DOI: 10.1016/j.ygeno.2018.08.007] [Citation(s) in RCA: 56] [Impact Index Per Article: 9.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/03/2018] [Revised: 08/14/2018] [Accepted: 08/16/2018] [Indexed: 12/17/2022]
|
43
|
Ju Z, Wang SY. Identify Lysine Neddylation Sites Using Bi-profile Bayes Feature Extraction via the Chou's 5-steps Rule and General Pseudo Components. Curr Genomics 2019; 20:592-601. [PMID: 32581647 PMCID: PMC7290059 DOI: 10.2174/1389202921666191223154629] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/01/2019] [Revised: 10/19/2019] [Accepted: 11/07/2019] [Indexed: 01/06/2023] Open
Abstract
Introduction Neddylation is a highly dynamic and reversible post-translational modification. Abnormal neddylation has previously been shown to be closely related to some human diseases. The detection of neddylation sites is essential for elucidating the regulation mechanisms of protein neddylation. Objective As the detection of lysine neddylation sites by traditional experimental methods is often expensive and time-consuming, it is imperative to design computational methods to identify neddylation sites. Methods In this study, a bioinformatics tool named NeddPred is developed to identify underlying protein neddylation sites. A bi-profile Bayes feature extraction is used to encode neddylation sites and a fuzzy support vector machine model is utilized to overcome the problem of noise and class imbalance in the prediction. Results NeddPred achieved a Matthews correlation coefficient of 0.7082 and an area under the receiver operating characteristic curve of 0.9769. Independent tests show that NeddPred significantly outperforms the existing lysine neddylation site predictor NeddyPreddy. Conclusion Therefore, NeddPred can be a complement to the existing tools for the prediction of neddylation sites. A user-friendly webserver for NeddPred is accessible at 123.206.31.171/NeddPred/.
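For readers unfamiliar with the Matthews correlation coefficient reported above, the following Python sketch computes it from the four cells of a binary confusion matrix using the standard formula. The function name and the example counts are hypothetical and are not taken from the NeddPred evaluation.

```python
import math


def matthews_corrcoef_from_counts(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)),
    returned as 0.0 when the denominator is zero (a common convention).
    """
    denominator = math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical counts for illustration only.
perfect = matthews_corrcoef_from_counts(tp=50, tn=50, fp=0, fn=0)
inverted = matthews_corrcoef_from_counts(tp=0, tn=0, fp=5, fn=5)
```

Because MCC uses all four cells, it stays informative on imbalanced data such as modification-site datasets, where accuracy alone can be misleading.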
Affiliation(s)
- Zhe Ju
- College of Science, Shenyang Aerospace University, Shenyang 110136, P.R. China
| | - Shi-Yun Wang
- College of Science, Shenyang Aerospace University, Shenyang 110136, P.R. China
|
44
|
Guo YH, Kuruganti R, Gao Y. Recent Advances in Ginsenosides as Potential Therapeutics Against Breast Cancer. Curr Top Med Chem 2019; 19:2334-2347. [DOI: 10.2174/1568026619666191018100848] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/11/2018] [Revised: 05/10/2019] [Accepted: 08/16/2019] [Indexed: 12/14/2022]
Abstract
The dried root of ginseng (Panax ginseng C. A. Meyer or Panax quinquefolius L.) is a traditional Chinese medicine widely used to manage cancer symptoms and chemotherapy side effects in Asia. The anti-cancer efficacy of ginseng is attributed mainly to the presence of saponins, which are commonly known as ginsenosides. Ginsenosides were first identified as key active ingredients in Panax ginseng and subsequently found in Panax quinquefolius, both of the same genus. To review the recent advances on anti-cancer effects of ginsenosides against breast cancer, we conducted a literature study of scientific articles published from 2010 through 2018 by searching the major databases including PubMed, SciFinder, Science Direct, Springer, Google Scholar, and CNKI. A total of 50 articles authored in either English or Chinese related to the anti-breast cancer activity of ginsenosides have been reviewed, and the in vitro, in vivo, and clinical studies on ginsenosides are summarized. This review focuses on how ginsenosides exert their anti-breast cancer activities through various mechanisms of action such as modulation of cell growth, modulation of the cell cycle, modulation of cell death, inhibition of angiogenesis, inhibition of metastasis, inhibition of multidrug resistance, and cancer immunomodulation. In summary, recent advances in the evaluation of ginsenosides as therapeutic agents against breast cancer support further pre-clinical and clinical studies to treat primary and metastatic breast tumors.
Affiliation(s)
- Yu-hang Guo
- International Ginseng Institute, School of Agriculture, Middle Tennessee State University, Murfreesboro, TN 37132, United States
| | - Revathimadhubala Kuruganti
- International Ginseng Institute, School of Agriculture, Middle Tennessee State University, Murfreesboro, TN 37132, United States
| | - Ying Gao
- International Ginseng Institute, School of Agriculture, Middle Tennessee State University, Murfreesboro, TN 37132, United States
|
45
|
Chou KC. Impacts of Pseudo Amino Acid Components and 5-steps Rule to Proteomics and Proteome Analysis. Curr Top Med Chem 2019; 19:2283-2300. [DOI: 10.2174/1568026619666191018100141] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/07/2019] [Revised: 08/18/2019] [Accepted: 08/26/2019] [Indexed: 01/27/2023]
Abstract
Stimulated by the 5-steps rule during the last decade or so, computational proteomics has achieved remarkable progress in the following three areas: (1) protein structural class prediction; (2) protein subcellular location prediction; (3) post-translational modification (PTM) site prediction. The results obtained by these predictions are very useful not only for an in-depth study of the functions of proteins and their biological processes in a cell, but also for developing novel drugs against major diseases such as cancers, Alzheimer’s, and Parkinson’s. Moreover, since the targets to be predicted may have the multi-label feature, two sets of metrics are introduced: one is for inspecting the global prediction quality, while the other is for the local prediction quality. All the predictors covered in this review have a user-friendly web-server, through which the majority of experimental scientists can easily obtain their desired data without the need to go through the complicated mathematics.
Affiliation(s)
- Kuo-Chen Chou
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, 610054, China
| |
|
46
|
Xiao X, Chen WJ, Qiu WR. A Novel Prediction of Quaternary Structural Type of Proteins with Gene Ontology. Protein Pept Lett 2019; 27:313-320. [PMID: 31749418 DOI: 10.2174/0929866526666191014144618] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 04/23/2019] [Revised: 05/20/2019] [Accepted: 06/29/2019] [Indexed: 11/22/2022]
Abstract
BACKGROUND Information on the quaternary structure attributes of proteins is very important because it is closely related to the biological functions of proteins. With the rapid development of new generation sequencing technology, we are facing a challenge: how to automatically identify the four-level attributes of new polypeptide chains according to their sequence information (i.e., whether they are formed just as a monomer, or as a hetero-oligomer, or a homo-oligomer). OBJECTIVE In this article, our goal is to find a new way to represent protein sequences, thereby improving the prediction rate of protein quaternary structure. METHODS In this article, we developed a prediction system for protein quaternary structural type in which a protein sequence was expressed by combining Pfam functional-domain and gene ontology information. Protein features are turned into digital sequences, and the prediction of quaternary structure is completed through specific machine learning algorithms and a verification algorithm. RESULTS Our data set contains 5495 protein samples. Through the method provided in this paper, we classify proteins as monomers, hetero-oligomers, or homo-oligomers, and the prediction rate is 74.38%, which is 3.24% higher than that of previous studies. Through this new feature extraction method, we can further classify the four-level structure of proteins, and the results are also correspondingly improved. CONCLUSION After applying the new prediction system, compared with the previous results, we have successfully improved the prediction rate. We have reason to believe that the feature extraction method in this paper has better practicability and can be used as a reference for other protein classification problems.
Affiliation(s)
- Xuan Xiao
- School of Information, Jingdezhen Ceramic Institute, Jingdezhen 333403, China
| | - Wei-Jie Chen
- School of Information, Jingdezhen Ceramic Institute, Jingdezhen 333403, China
| | - Wang-Ren Qiu
- School of Information, Jingdezhen Ceramic Institute, Jingdezhen 333403, China.,Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, 610054, China
| |
|
47
|
Xuan P, Cui H, Shen T, Sheng N, Zhang T. HeteroDualNet: A Dual Convolutional Neural Network With Heterogeneous Layers for Drug-Disease Association Prediction via Chou's Five-Step Rule. Front Pharmacol 2019; 10:1301. [PMID: 31780934 PMCID: PMC6856670 DOI: 10.3389/fphar.2019.01301] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 07/01/2019] [Accepted: 10/11/2019] [Indexed: 11/14/2022] Open
Abstract
Identifying new treatments for existing drugs can help reduce drug development costs and explore novel indications of drugs. The prediction of associations between drugs and diseases is challenging because their similarities and relations are complicated and non-linear. We propose a HeteroDualNet model to address this issue. Firstly, three types of matrices are extracted to represent intra-drug similarities, intra-disease similarities and drug-disease associations. The intra-drug similarities consider three drug features and a newly introduced drug-related disease correlation. Secondly, an embedding mechanism is proposed to integrate these matrices in a heterogeneous drug-disease association layer (hetero-layer). Further, a neighbouring heterogeneous layer (hetero-layer-N) is constructed to incorporate the biological premise that similar drugs can often treat related diseases. Finally, a dual convolutional neural network is built with hetero-layer and hetero-layer-N as two branches to learn from the characteristics of drug-disease associations and the relations of their neighbours simultaneously. HeteroDualNet outperformed the other four methods in comparison over a public dataset of 763 drugs and 681 diseases in terms of Areas Under the Curves of Receiver Operating Characteristics and Precision-Recall, and recall rate at top k. Case studies of five drugs further proved the capacity of HeteroDualNet in finding reliable disease candidates of drugs as validated by database records or literature. Our findings show that the embedded heterogeneous layers of original and neighbouring drug-disease representations in a dual neural network improved the association prediction performance.
Affiliation(s)
- Ping Xuan
- School of Computer Science and Technology, Heilongjiang University, Harbin, China
| | - Hui Cui
- Department of Computer Science and Information Technology, La Trobe University, Bundoora, VIC, Australia
| | - Tonghui Shen
- School of Computer Science and Technology, Heilongjiang University, Harbin, China
| | - Nan Sheng
- School of Computer Science and Technology, Heilongjiang University, Harbin, China
| | - Tiangang Zhang
- School of Mathematical Science, Heilongjiang University, Harbin, China
| |
|
48
|
Shen Y, Ding Y, Tang J, Zou Q, Guo F. Critical evaluation of web-based prediction tools for human protein subcellular localization. Brief Bioinform 2019; 21:1628-1640. [DOI: 10.1093/bib/bbz106] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.8] [Received: 05/07/2019] [Revised: 07/23/2019] [Accepted: 07/27/2019] [Indexed: 11/12/2022] Open
Abstract
Human protein subcellular localization has important research value in biological processes, as well as in elucidating protein functions and identifying drug targets. Over the past decade, a number of protein subcellular localization prediction tools have been designed and made freely available online. The purpose of this paper is to summarize the progress of research on the subcellular localization of human proteins in recent years, including commonly used data sets proposed by predecessors and the performance of all selected prediction tools against the same benchmark data set. We carry out a systematic evaluation of several publicly available subcellular localization prediction methods on various benchmark data sets. Among them, we find that mLASSO-Hum and pLoc-mHum provide a statistically significant improvement in performance, as measured by the value of accuracy, relative to the other methods. Meanwhile, we build a new data set using the latest version of the UniProt database and construct a new GO-based prediction method, HumLoc-LBCI, in this paper. Then, we test all selected prediction tools on the new data set. Finally, we discuss the possible development directions of human protein subcellular localization. Availability: The codes and data are available from http://www.lbci.cn/syn/.
Affiliation(s)
- Yinan Shen
- School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
| | - Yijie Ding
- School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, China
| | - Jijun Tang
- School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
- School of Computational Science and Engineering, University of South Carolina, Columbia, U.S
- Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin, China
| | - Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
| | - Fei Guo
- School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
| |
|
49
|
Xie NZ, Li JX, Huang RB. Biological Production of (S)-acetoin: A State-of-the-Art Review. Curr Top Med Chem 2019; 19:2348-2356. [PMID: 31648637 DOI: 10.2174/1568026619666191018111424] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 10/31/2018] [Revised: 08/28/2019] [Accepted: 09/02/2019] [Indexed: 12/24/2022]
Abstract
Acetoin is an important four-carbon compound that has many applications in foods, chemical synthesis, cosmetics, cigarettes, soaps, and detergents. Its stereoisomer (S)-acetoin, a high-value chiral compound, can also be used to synthesize optically active drugs, which could enhance targeting properties and reduce side effects. Recently, considerable progress has been made in the development of biotechnological routes for (S)-acetoin production. In this review, various strategies for biological (S)-acetoin production are summarized, and their constraints and possible solutions are described. Furthermore, future prospects of biological production of (S)-acetoin are discussed.
Affiliation(s)
- Neng-Zhong Xie
- National Engineering Research Center for Non-Food Biorefinery, State Key Laboratory of Non-Food Biomass and Enzyme Technology, Guangxi Key Laboratory of Bio-refinery, Guangxi Biomass Engineering Technology Research Center, Guangxi Academy of Sciences, 98 Daling Road, Nanning, 530007, China
| | - Jian-Xiu Li
- National Engineering Research Center for Non-Food Biorefinery, State Key Laboratory of Non-Food Biomass and Enzyme Technology, Guangxi Key Laboratory of Bio-refinery, Guangxi Biomass Engineering Technology Research Center, Guangxi Academy of Sciences, 98 Daling Road, Nanning, 530007, China
| | - Ri-Bo Huang
- National Engineering Research Center for Non-Food Biorefinery, State Key Laboratory of Non-Food Biomass and Enzyme Technology, Guangxi Key Laboratory of Bio-refinery, Guangxi Biomass Engineering Technology Research Center, Guangxi Academy of Sciences, 98 Daling Road, Nanning, 530007, China.,State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, College of Life Science and Technology, Guangxi University, 100 Daxue Road, Nanning, 530004, China
| |
|
50
|
Chou KC. Advances in Predicting Subcellular Localization of Multi-label Proteins and its Implication for Developing Multi-target Drugs. Curr Med Chem 2019; 26:4918-4943. [PMID: 31060481 DOI: 10.2174/0929867326666190507082559] [Citation(s) in RCA: 78] [Impact Index Per Article: 13.0] [Received: 09/28/2018] [Revised: 01/29/2019] [Accepted: 01/31/2019] [Indexed: 12/16/2022]
Abstract
The smallest unit of life is a cell, which contains numerous protein molecules. Most of the functions critical to the cell’s survival are performed by these proteins located in its different organelles, usually called “subcellular locations”. Information on the subcellular localization of a protein can provide useful clues about its function. To reveal the intricate pathways at the cellular level, knowledge of the subcellular localization of proteins in a cell is a prerequisite. Therefore, one of the fundamental goals in molecular cell biology and proteomics is to determine the subcellular locations of proteins in an entire cell. It is also indispensable for prioritizing and selecting the right targets for drug development. Unfortunately, it is both time-consuming and costly to determine the subcellular locations of proteins purely based on experiments. With the avalanche of protein sequences generated in the post-genomic age, it is highly desirable to develop computational methods for rapidly and effectively identifying the subcellular locations of uncharacterized proteins based on their sequence information alone. Indeed, considerable progress has been achieved in this regard. This review focuses on those methods that have the capacity to deal with multi-label proteins that may simultaneously exist in two or more subcellular location sites. Protein molecules with this kind of characteristic are vitally important for finding multi-target drugs, a current hot trend in drug development. This review also focuses on those methods that have user-friendly web-servers established so that the majority of experimental scientists can use them to get the desired results without the need to go through the detailed mathematics involved.
Affiliation(s)
- Kuo-Chen Chou
- Gordon Life Science Institute, Boston, MA 02478, United States
| |
|