1
|
Zou K, Wang S, Wang Z, Zhang Z, Yang F. HAR_Locator: a novel protein subcellular location prediction model of immunohistochemistry images based on hybrid attention modules and residual units. Front Mol Biosci 2023; 10:1171429. [PMID: 37664182 PMCID: PMC10470064 DOI: 10.3389/fmolb.2023.1171429] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2023] [Accepted: 08/04/2023] [Indexed: 09/05/2023] Open
Abstract
Introduction: Proteins located in subcellular compartments play an indispensable role in the physiological function of eukaryotic organisms. Knowing the pattern of protein subcellular localization helps in understanding the mechanism and function of proteins, investigating pathological changes in cells, and supporting targeted drug research on human diseases. Because the number of known proteins has risen considerably, automated systems based on featurization or representation learning and classifier design have attracted interest for predicting the subcellular location of proteins. However, feature information from large-scale, fine-grained protein microscopic images is prone to being trapped or lost in general deep learning models, and the shallow features derived from statistical methods have weak supervision abilities.
Methods: In this work, a novel model called HAR_Locator was developed to predict the subcellular location of proteins by concatenating multi-view abstract features and shallow features. Its advantages are summarized in the following three points. Firstly, to obtain discriminative abstract feature information on protein subcellular location, an abstract feature extractor called HARnet, based on Hybrid Attention modules and Residual units, was proposed to relieve gradient dispersion and focus on protein-target regions. Secondly, concatenating abstract features and shallow features not only improves the supervision ability of image information but also enhances the generalization ability of HAR_Locator. Finally, a multi-category multi-classifier decision system based on an Artificial Neural Network (ANN) was introduced to obtain the final output for each sample by fitting the most representative result from five subset predictors.
Results: To evaluate the model, a collection of 6,778 immunohistochemistry (IHC) images from the Human Protein Atlas (HPA) database was used, and the accuracy, precision, and recall were significantly increased to 84.73%, 84.77%, and 84.70%, respectively, compared with baseline predictors.
Affiliation(s)
- Kai Zou
- School of Communications and Electronics, Jiangxi Science and Technology Normal University, Nanchang, China
| | - Simeng Wang
- School of Communications and Electronics, Jiangxi Science and Technology Normal University, Nanchang, China
| | - Ziqian Wang
- School of Communications and Electronics, Jiangxi Science and Technology Normal University, Nanchang, China
| | - Zhihai Zhang
- School of Communications and Electronics, Jiangxi Science and Technology Normal University, Nanchang, China
| | - Fan Yang
- School of Communications and Electronics, Jiangxi Science and Technology Normal University, Nanchang, China
- Artificial Intelligence and Bioinformation Cognition Laboratory, Jiangxi Science and Technology Normal University, Nanchang, China
|
2
|
Li J, Zou Q, Yuan L. A review from biological mapping to computation-based subcellular localization. MOLECULAR THERAPY. NUCLEIC ACIDS 2023; 32:507-521. [PMID: 37215152 PMCID: PMC10192651 DOI: 10.1016/j.omtn.2023.04.015] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Indexed: 05/24/2023]
Abstract
Subcellular localization is crucial to the study of viruses and diseases. Specifically, research on protein subcellular localization can help identify clues linking viruses and host cells that can aid in the design of targeted drugs. Research on RNA subcellular localization is significant for human diseases (such as Alzheimer's disease and colon cancer). To date, the published reviews of protein subcellular localization are outdated, and reviews of RNA subcellular localization are not comprehensive. Therefore, we collated the most up-to-date literature on protein and RNA subcellular localization to help researchers understand changes in the field. Extensive methods for constructing subcellular localization models are also summarized, which can help readers understand how the application of biotechnology and computer science in subcellular localization research has changed and explore how to use biological data to construct improved subcellular localization models. This paper is the first review to cover both protein and RNA subcellular localization. We urge researchers from biology and computational biology to jointly pay attention to the transformation patterns, interrelationships, differences, and causality of protein and RNA subcellular localization.
Affiliation(s)
- Jing Li
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, 1 Chengdian Road, Quzhou, Zhejiang 324000, China
- School of Biomedical Sciences, University of Hong Kong, Hong Kong, China
| | - Quan Zou
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, 1 Chengdian Road, Quzhou, Zhejiang 324000, China
| | - Lei Yuan
- Department of Hepatobiliary Surgery, Quzhou People's Hospital, 100 Minjiang Main Road, Quzhou, Zhejiang 324000, China
|
3
|
Yang R, Liu J, Zhang L. ECAmyloid: An amyloid predictor based on ensemble learning and comprehensive sequence-derived features. Comput Biol Chem 2023; 104:107853. [PMID: 36990028 DOI: 10.1016/j.compbiolchem.2023.107853] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2022] [Revised: 03/17/2023] [Accepted: 03/20/2023] [Indexed: 03/30/2023]
Abstract
Amyloid fibrils formed by the mis-aggregation of amyloid proteins can lead to neuronal degeneration in Alzheimer's disease. Predicting amyloid proteins not only contributes to understanding the physicochemical properties and formation mechanism of amyloid proteins, but also has significant implications for the treatment of amyloid diseases and the development of new uses for amyloid materials. In this study, an ensemble learning model with sequence-derived features, ECAmyloid, is proposed to identify amyloids. The sequence-derived features, including Pseudo Position-Specific Scoring Matrix (Pse-PSSM), Split Amino Acid Composition (SAAC), Solvent Accessibility (SA), and Secondary Structure Information (SSI), are employed to incorporate sequence composition, evolutionary, and structural information. The individual learners of the ensemble learning model are selected by an increment classifier selection strategy. The final prediction is determined by voting over the predictions of the individual learners. Because the benchmark dataset is imbalanced, the Synthetic Minority Over-sampling Technique (SMOTE) is adopted to generate positive samples. To eliminate irrelevant and redundant features, correlation-based feature subset (CFS) selection combined with a heuristic search strategy is performed to obtain the optimal feature subset. Experimental results indicate that the ensemble classifier achieves an accuracy of 98.29%, a sensitivity of 0.992, and a specificity of 0.974 on the training dataset using 10-fold cross-validation, far higher than the results obtained by its individual learners. Compared with the original feature set, the accuracy, sensitivity, specificity, MCC, F1-score, and G-Mean of the ensemble method trained on the optimal feature subset are improved by 1.05%, 0.012, 0.01, 0.021, 0.011, and 0.011, respectively. Moreover, comparison with existing methods on the same two independent test datasets demonstrates that the proposed method is an effective and promising predictor for large-scale determination of amyloid proteins. The data and code used to develop ECAmyloid have been shared on GitHub and can be freely downloaded at https://github.com/KOALA-L/ECAmyloid.git.
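The oversampling and voting steps mentioned in this abstract can be illustrated with a minimal sketch. It is not the ECAmyloid implementation: the function names `smote_samples` and `majority_vote`, the neighbour count `k`, and the tie rule in the vote are assumptions made for illustration only.

```python
import random

def smote_samples(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (Euclidean)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # Rank the other minority points by squared distance to x.
        others = [p for p in minority if p is not x]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbour = rng.choice(others[:k])
        u = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(a + u * (b - a) for a, b in zip(x, neighbour)))
    return synthetic

def majority_vote(predictions):
    """Combine per-learner 0/1 labels; ties go to the positive class
    (the tie rule is an assumption, not taken from the paper)."""
    return 1 if 2 * sum(predictions) >= len(predictions) else 0

minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
new_points = smote_samples(minority, n_new=5)
print(len(new_points), majority_vote([1, 0, 1]))
```

Each synthetic point lies on the segment between two existing minority samples, which is why oversampling of this kind does not simply duplicate rows.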
Affiliation(s)
- Runtao Yang
- School of Mechanical, Electrical and Information Engineering, Shandong University at Weihai, 264209, China
| | - Jiaming Liu
- School of Mechanical, Electrical and Information Engineering, Shandong University at Weihai, 264209, China
| | - Lina Zhang
- School of Mechanical, Electrical and Information Engineering, Shandong University at Weihai, 264209, China.
|
4
|
Perveen G, Alturise F, Alkhalifah T, Daanial Khan Y. Hemolytic-Pred: A machine learning-based predictor for hemolytic proteins using position and composition-based features. Digit Health 2023; 9:20552076231180739. [PMID: 37434723 PMCID: PMC10331097 DOI: 10.1177/20552076231180739] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 12/30/2022] [Accepted: 05/22/2023] [Indexed: 07/13/2023] Open
Abstract
Objective: The objective of this study is to propose a novel in-silico method called Hemolytic-Pred for identifying hemolytic proteins based on their sequences, using statistical moment-based features along with position-relative and frequency-relative information. Methods: Primary sequences were transformed into feature vectors using statistical and position-relative moment-based features. Various machine learning algorithms were employed for classification. Computational models were rigorously evaluated using four different validation tests. The Hemolytic-Pred webserver is available for further analysis at http://ec2-54-160-229-10.compute-1.amazonaws.com/. Results: XGBoost outperformed the other six classifiers, with accuracy values of 0.99, 0.98, 0.97, and 0.98 for the self-consistency test, 10-fold cross-validation, Jackknife test, and independent set test, respectively. The proposed method with the XGBoost classifier is a workable and robust solution for predicting hemolytic proteins efficiently and accurately. Conclusions: The proposed method of Hemolytic-Pred with the XGBoost classifier is a reliable tool for the timely identification of hemolytic cells and diagnosis of various related severe disorders. The application of Hemolytic-Pred can yield profound benefits in the medical field.
Affiliation(s)
- Gulnaz Perveen
- Department of Computer Science, School of Systems and Technology, University of Management and Technology, Lahore, Punjab, Pakistan
| | - Fahad Alturise
- Department of Computer, College of Science and Arts in Ar Rass, Qassim University, Buraidah, Qassim, Saudi Arabia
| | - Tamim Alkhalifah
- Department of Computer, College of Science and Arts in Ar Rass, Qassim University, Buraidah, Qassim, Saudi Arabia
| | - Yaser Daanial Khan
- Department of Computer Science, School of Systems and Technology, University of Management and Technology, Lahore, Punjab, Pakistan
|
5
|
Ensemble of Networks for Multilabel Classification. SIGNALS 2022. [DOI: 10.3390/signals3040054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/15/2022] Open
Abstract
Multilabel learning goes beyond standard supervised learning models by associating a sample with more than one class label. Among the many techniques developed in the last decade to handle multilabel learning, the best approaches are those harnessing the power of ensembles and deep learners. This work proposes merging both methods by combining a set of gated recurrent units, temporal convolutional neural networks, and long short-term memory networks trained with variants of the Adam optimization approach. We examine many Adam variants, each fundamentally based on the difference between present and past gradients, with step size adjusted for each parameter. We also combine Incorporating Multiple Clustering Centers and a bootstrap-aggregated decision trees ensemble, which is shown to further boost classification performance. In addition, we provide an ablation study for assessing the performance improvement that each module of our ensemble produces. Multiple experiments on a large set of datasets representing a wide variety of multilabel tasks demonstrate the robustness of our best ensemble, which is shown to outperform the state of the art.
|
6
|
Yu Y, Yu Y, Cui N, Ma L, Tao R, Ma Z, Meng X, Fan H. Lignin biosynthesis regulated by CsCSE1 is required for Cucumis sativus defence to Podosphaera xanthii. PLANT PHYSIOLOGY AND BIOCHEMISTRY : PPB 2022; 186:88-98. [PMID: 35830761 DOI: 10.1016/j.plaphy.2022.06.030] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/23/2022] [Revised: 06/28/2022] [Accepted: 06/30/2022] [Indexed: 06/15/2023]
Abstract
Lignin is a complex phenolic compound that can enhance the stiffness, hydrophobicity, and antioxidant capacity of the cell wall; it thus provides a critical barrier against pathogen and insect invaders. Caffeoyl shikimate esterase (CSE) is a key novel enzyme involved in lignin biosynthesis that is associated with genetic improvements in lignocellulosic biomass; however, no research thus far has revealed the role of CSE in resistance to pathogenic stress. CsCSE1 (Cucsa.134370) has previously been shown through RNA sequencing to be highly associated with the response of cucumber to attack by Podosphaera xanthii. Here, we determined the exact role of CsCSE1 in the defence of cucumber against P. xanthii infection. Homologous sequence alignment revealed that CsCSE1 contains two highly conserved lyase domains (GXSXG), suggesting that CsCSE1 possesses CSE activity. Subcellular localization analysis showed that CsCSE1 was localized to the plasma membrane and endoplasmic reticulum (ER). Functional analysis demonstrated that transient silencing of CsCSE1 in cucumber dramatically attenuated resistance to P. xanthii, whereas overexpression of CsCSE1 in cucumber markedly increased resistance to P. xanthii. Further investigation of the abundance of lignin in transient transgenic plants revealed that CsCSE1 might actively mediate the disease resistance of cucumber by promoting lignin biosynthesis. CsCSE1 also affects the expression of its downstream lignin biosynthesis-related genes, such as CsLAC, CsCOMT, CsCCR, and CsCAD. The results of this study provide targets for the genetic breeding of tolerant cucumber cultivars as well as new insights that could aid the control of plant diseases.
Affiliation(s)
- Yongbo Yu
- College of Bioscience and Biotechnology, Shenyang Agricultural University, Shenyang, 110866, China
| | - Yang Yu
- College of Bioscience and Biotechnology, Shenyang Agricultural University, Shenyang, 110866, China
| | - Na Cui
- College of Bioscience and Biotechnology, Shenyang Agricultural University, Shenyang, 110866, China; Key Laboratory of Protected Horticulture of Ministry of Education, Shenyang Agricultural University, Shenyang, 110866, China
| | - Lifeng Ma
- College of Bioscience and Biotechnology, Shenyang Agricultural University, Shenyang, 110866, China
| | - Ran Tao
- College of Bioscience and Biotechnology, Shenyang Agricultural University, Shenyang, 110866, China
| | - Zhangtong Ma
- College of Bioscience and Biotechnology, Shenyang Agricultural University, Shenyang, 110866, China
| | - Xiangnan Meng
- College of Bioscience and Biotechnology, Shenyang Agricultural University, Shenyang, 110866, China; Key Laboratory of Protected Horticulture of Ministry of Education, Shenyang Agricultural University, Shenyang, 110866, China.
| | - Haiyan Fan
- College of Bioscience and Biotechnology, Shenyang Agricultural University, Shenyang, 110866, China; Key Laboratory of Protected Horticulture of Ministry of Education, Shenyang Agricultural University, Shenyang, 110866, China.
|
7
|
Ning Q, Zhao X, Ma Z. A Novel Method for Identification of Glutarylation Sites Combining Borderline-SMOTE With Tomek Links Technique in Imbalanced Data. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:2632-2641. [PMID: 34236968 DOI: 10.1109/tcbb.2021.3095482] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 06/13/2023]
Abstract
Glutarylation is a type of post-translational modification that occurs on lysine residues. It plays an irreplaceable role in various cellular functions. Therefore, identification of glutarylation sites is significant for understanding the molecular mechanism of glutarylation. In this study, we proposed a method named DEXGB_Glu to identify lysine glutarylation sites using XGBoost as the classifier, optimized by a differential evolution algorithm. To address the imbalance between positive and negative samples, the Borderline-SMOTE method was employed to synthesize positive samples until their number equaled that of the negative samples. Then, the Tomek links technique was applied to filter out noisy data. Analysis of this method and its results showed that the differential evolution algorithm clearly improved performance and that the combination of Borderline-SMOTE and Tomek links effectively resolved the imbalance between positive and negative samples. Finally, the performance of this method was much better than that of other methods in the prediction of glutarylation sites. The data and code are available at https://github.com/ningq669/DEXGB_Glu.
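The Tomek links step named in this abstract can be sketched as follows. A Tomek link is a pair of samples with different labels that are each other's nearest neighbour; such pairs usually sit on the class boundary or are noise. This is an illustrative sketch only, not the DEXGB_Glu code, and the helper names `nearest` and `tomek_links` are hypothetical.

```python
def nearest(i, points):
    """Index of the point closest to points[i] (squared Euclidean distance)."""
    return min(
        (j for j in range(len(points)) if j != i),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(points[i], points[j])),
    )

def tomek_links(points, labels):
    """Return index pairs (i, j), i < j, that are mutual nearest neighbours
    with different labels."""
    links = []
    for i in range(len(points)):
        j = nearest(i, points)
        if i < j and labels[i] != labels[j] and nearest(j, points) == i:
            links.append((i, j))
    return links

points = [(0.0,), (0.1,), (1.0,), (1.05,), (3.0,)]
labels = [0, 0, 0, 1, 1]
links = tomek_links(points, labels)
print(links)
```

Which member of each link is then discarded (the majority-class sample, or both) is a design choice that the abstract does not specify.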
|
8
|
Wu L, Gao S, Yao S, Wu F, Li J, Dong Y, Zhang Y. Gm-PLoc: A Subcellular Localization Model of Multi-Label Protein Based on GAN and DeepFM. Front Genet 2022; 13:912614. [PMID: 35783287 PMCID: PMC9240597 DOI: 10.3389/fgene.2022.912614] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/04/2022] [Accepted: 05/20/2022] [Indexed: 11/13/2022] Open
Abstract
Identifying the subcellular localization of a given protein is an essential part of biological and medical research, since the protein must be localized in the correct organelle to ensure physiological function. Conventional biological experiments for protein subcellular localization have limitations, such as high cost and low efficiency, so many computational methods have been proposed to solve these problems. However, some of these methods need further improvement for protein subcellular localization with class imbalance. We propose a new model, generating minority samples for protein subcellular localization (Gm-PLoc), to predict the subcellular localization of multi-label proteins. This model includes three steps: using the position-specific scoring matrix to extract distinguishable features of proteins; synthesizing samples of the minority category with revised generative adversarial networks to balance the distribution of categories; and training a classifier with the rebalanced dataset to predict the subcellular localization of multi-label proteins. One benchmark dataset is selected to evaluate the performance of the presented model, and the experimental results demonstrate that Gm-PLoc performs well for multi-label protein subcellular localization.
Affiliation(s)
- Liwen Wu
- Engineering Research Center of Cyberspace, Yunnan University, Kunming, China
- School of Software, Yunnan University, Kunming, China
| | - Song Gao
- Engineering Research Center of Cyberspace, Yunnan University, Kunming, China
- School of Software, Yunnan University, Kunming, China
| | - Shaowen Yao
- Engineering Research Center of Cyberspace, Yunnan University, Kunming, China
- School of Software, Yunnan University, Kunming, China
| | - Feng Wu
- Engineering Research Center of Cyberspace, Yunnan University, Kunming, China
- School of Software, Yunnan University, Kunming, China
| | - Jie Li
- Engineering Research Center of Cyberspace, Yunnan University, Kunming, China
- School of Software, Yunnan University, Kunming, China
| | - Yunyun Dong
- Engineering Research Center of Cyberspace, Yunnan University, Kunming, China
- School of Software, Yunnan University, Kunming, China
| | - Yunqi Zhang
- Engineering Research Center of Cyberspace, Yunnan University, Kunming, China
- School of Software, Yunnan University, Kunming, China
- Yunnan Key Laboratory of Statistical Modeling and Data Analysis, School of Mathematics and Statistics, Yunnan University, Kunming, China
- *Correspondence: Yunqi Zhang,
|
9
|
Yu L, Qiu W, Lin W, Cheng X, Xiao X, Dai J. HGDTI: predicting drug-target interaction by using information aggregation based on heterogeneous graph neural network. BMC Bioinformatics 2022; 23:126. [PMID: 35413800 PMCID: PMC9004085 DOI: 10.1186/s12859-022-04655-5] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 06/01/2021] [Accepted: 03/28/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: In new drug discovery research, traditional wet-lab experiments take a long time. Predicting drug-target interaction (DTI) in silico can greatly narrow the search space of candidate medications. A well-designed algorithmic model may be more effective in revealing the potential connection between drug and target in the bioinformatics network composed of drugs, proteins, and other related data. RESULTS: In this work, we have developed a heterogeneous graph neural network model, named HGDTI, which includes a learning phase of network node embedding and a training phase of DTI classification. This method first obtains the molecular fingerprint information of drugs and the pseudo amino acid composition information of proteins, then extracts the initial features of nodes through Bi-LSTM, and uses the attention mechanism to aggregate heterogeneous neighbors. In several comparative experiments, HGDTI significantly outperforms other state-of-the-art DTI prediction models overall, and negative sampling is employed to further optimize the predictive power of the model. In addition, we have demonstrated the robustness of HGDTI through heterogeneous network content reduction tests, and the rationality of HGDTI through other comparative experiments. These results indicate that HGDTI can utilize heterogeneous information to capture the embedding of drugs and targets, and provide assistance for drug development. CONCLUSIONS: HGDTI, which is based on a heterogeneous graph neural network model, can utilize heterogeneous information to capture the embedding of drugs and targets, and provide assistance for drug development. For the convenience of related researchers, a user-friendly web server has been established at http://bioinfo.jcu.edu.cn/hgdti.
Affiliation(s)
- Liyi Yu
- School of Information Engineering, Jingdezhen Ceramic Institute, Jingdezhen, China
| | - Wangren Qiu
- School of Information Engineering, Jingdezhen Ceramic Institute, Jingdezhen, China
| | - Weizhong Lin
- School of Information Engineering, Jingdezhen Ceramic Institute, Jingdezhen, China
| | - Xiang Cheng
- School of Information Engineering, Jingdezhen Ceramic Institute, Jingdezhen, China
| | - Xuan Xiao
- School of Information Engineering, Jingdezhen Ceramic Institute, Jingdezhen, China.
| | - Jiexia Dai
- School of Foreign Languages, Jingdezhen University, Jingdezhen, China
|
10
|
Xiong E, Cao D, Qu C, Zhao P, Wu Z, Yin D, Zhao Q, Gong F. Multilocation proteins in organelle communication: Based on protein-protein interactions. PLANT DIRECT 2022; 6:e386. [PMID: 35229068 PMCID: PMC8861329 DOI: 10.1002/pld3.386] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 07/10/2021] [Revised: 12/17/2021] [Accepted: 01/18/2022] [Indexed: 05/25/2023]
Abstract
Protein-protein interaction (PPI) plays a crucial role in most biological processes, including signal transduction and cell apoptosis. Importantly, knowledge of PPIs can be useful for identifying multimeric protein complexes and elucidating uncharacterized protein functions. For Arabidopsis thaliana, the best-characterized dicotyledonous plant, the steadily increasing amount of information on its proteome and signaling pathways is progressively enabling more researchers to construct models of the plant's cellular processes, which in turn encourages the generation of more experimental data. In this study, we performed an overview analysis of the 10 major organelles and their associated proteins of the dicotyledonous model plant Arabidopsis thaliana via a PPI network, and found that PPI may play an important role in organelle communication. Further, multilocation proteins, especially phosphorylation-related multilocation proteins, can function as a "needle and thread" via PPIs and play an important role in organelle communication. Similar results were obtained in a monocotyledonous model crop, rice. Furthermore, we provide a research strategy for multilocation proteins using the LOPIT technique, proteomics, and bioinformatics analysis, and also describe their potential role in the field of plant science. The results provide a new view that phosphorylation-related multilocation proteins play an important role in organelle communication and provide new insight into PPIs and novel directions for proteomic research. Research on phosphorylation-related multilocation proteins may advance the study of organelle communication and provide an important theoretical basis for plant responses to external stress.
Affiliation(s)
- Erhui Xiong
- College of Agronomy, Henan Agricultural University, Zhengzhou, China
| | - Di Cao
- College of Agronomy, Henan Agricultural University, Zhengzhou, China
| | - Chengxin Qu
- College of Agronomy, Henan Agricultural University, Zhengzhou, China
| | - Pengfei Zhao
- College of Agronomy, Henan Agricultural University, Zhengzhou, China
| | - Zhaokun Wu
- College of Agronomy, Henan Agricultural University, Zhengzhou, China
| | - Dongmei Yin
- College of Agronomy, Henan Agricultural University, Zhengzhou, China
| | - Quanzhi Zhao
- College of Agronomy, Henan Agricultural University, Zhengzhou, China
| | - Fangping Gong
- College of Agronomy, Henan Agricultural University, Zhengzhou, China
|
11
|
Chang V, Kandadai K, Xu QA, Guan S. Development of a Diabetes Diagnosis System Using Machine Learning Algorithms. INTERNATIONAL JOURNAL OF DISTRIBUTED SYSTEMS AND TECHNOLOGIES 2022. [DOI: 10.4018/ijdst.296246] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Abstract
This paper describes how to develop a diabetes diagnosis system through the combined use of support vector machine, decision tree, naive Bayes, K-nearest neighbors, and random forest (RF) algorithms. These methods are used jointly to predict diabetes. The suitability of ML-based techniques for tackling this problem is demonstrated. This diabetes diagnosis system using machine-learning algorithms is used to review papers. The project was based on developing Python code for machine learning algorithms to perform large-scale diabetes analysis. The stated hardware requirements are 128 GB of DDR4 2133 MHz RAM, a 2 TB hard disk, and a 512 GB SSD. One standard library used is NumPy, which supports multi-dimensional array objects, matrices, and related components. The random forest prediction is presented with a pictorial visualization of the model, and the accuracy of the data analysis using random forest is 76%.
|
12
|
Liao Z, Pan G, Sun C, Tang J. Predicting subcellular location of protein with evolution information and sequence-based deep learning. BMC Bioinformatics 2021; 22:515. [PMID: 34686152 PMCID: PMC8539821 DOI: 10.1186/s12859-021-04404-0] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 09/20/2021] [Accepted: 09/24/2021] [Indexed: 12/31/2022] Open
Abstract
BACKGROUND: Protein subcellular localization prediction plays an important role in biology research. Since traditional methods are laborious and time-consuming, many machine learning-based prediction methods have been proposed. However, most of the proposed methods ignore the evolution information of proteins. To improve prediction accuracy, we present a deep learning-based method to predict protein subcellular locations. RESULTS: Our method utilizes not only the amino acid sequences but also the evolution matrices of proteins. It uses a bidirectional long short-term memory network that processes the entire protein sequence and a convolutional neural network that extracts features from protein sequences. The position-specific scoring matrix is used as a supplement to protein sequences. Our method was trained and tested on two benchmark datasets. The experimental results show that our method yields accurate results on the two datasets, with an average precision of 0.7901, a ranking loss of 0.0758, and a coverage of 1.2848. CONCLUSION: The experimental results show that our method outperforms five currently available methods. Based on these experiments, our method is an acceptable alternative for predicting protein subcellular location.
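The ranking loss and coverage figures quoted in this abstract are multilabel ranking metrics. A minimal sketch of one common definition of each is given below; the exact conventions used in the paper (tie handling, and whether coverage is reported as a rank or as rank minus one) are not stated in the abstract, so the choices here are assumptions.

```python
def ranking_loss(y_true, scores):
    """Average fraction of (relevant, irrelevant) label pairs in which the
    irrelevant label scores at least as high as the relevant one."""
    total = 0.0
    for truth, s in zip(y_true, scores):
        rel = [i for i, t in enumerate(truth) if t]
        irr = [i for i, t in enumerate(truth) if not t]
        if not rel or not irr:
            continue  # no pairs to compare; this sample contributes 0
        bad = sum(1 for r in rel for q in irr if s[r] <= s[q])
        total += bad / (len(rel) * len(irr))
    return total / len(y_true)

def coverage(y_true, scores):
    """Average number of top-ranked labels needed to include every relevant
    label (1-based rank of the lowest-scored relevant label)."""
    total = 0
    for truth, s in zip(y_true, scores):
        rel_scores = [s[i] for i, t in enumerate(truth) if t]
        if not rel_scores:
            continue  # no relevant labels; this sample contributes 0
        lowest = min(rel_scores)
        total += sum(1 for v in s if v >= lowest)
    return total / len(y_true)

y_true = [[1, 0, 0], [0, 1, 1]]
scores = [[0.9, 0.2, 0.1], [0.8, 0.3, 0.6]]
print(ranking_loss(y_true, scores), coverage(y_true, scores))
```

Lower values are better for both metrics: a perfect ranker has a ranking loss of 0 and a coverage equal to the average number of true labels per sample.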
Affiliation(s)
- Zhijun Liao
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fujian Medical University, 1 Xuefu North Road, University Town, Fuzhou, 350122 FJ China
- Department of Computer Science and Engineering, University of South Carolina, 550 Assembly St, Columbia, SC 29208 USA
| | - Gaofeng Pan
- Department of Computer Science and Engineering, University of South Carolina, 550 Assembly St, Columbia, SC 29208 USA
| | - Chao Sun
- Department of Computer Science and Engineering, University of South Carolina, 550 Assembly St, Columbia, SC 29208 USA
| | - Jijun Tang
- Department of Computer Science and Engineering, University of South Carolina, 550 Assembly St, Columbia, SC 29208 USA
- College of Electrical and Power Engineering, Taiyuan University of Technology, No. 79 Yinze West Street, Taiyuan, 030024 SX China
|
13
|
Akbar S, Ahmad A, Hayat M, Rehman AU, Khan S, Ali F. iAtbP-Hyb-EnC: Prediction of antitubercular peptides via heterogeneous feature representation and genetic algorithm based ensemble learning model. Comput Biol Med 2021; 137:104778. [PMID: 34481183 DOI: 10.1016/j.compbiomed.2021.104778] [Citation(s) in RCA: 48] [Impact Index Per Article: 12.0] [Received: 06/13/2021] [Revised: 08/16/2021] [Accepted: 08/17/2021] [Indexed: 11/26/2022]
Abstract
Tuberculosis (TB) is a worldwide illness caused by the bacterium Mycobacterium tuberculosis. Owing to the high prevalence of multidrug-resistant tuberculosis, numerous traditional strategies for developing novel alternative therapies have been presented. The effectiveness and dependability of these procedures are not always consistent. Peptide-based therapy has recently been regarded as a preferable alternative due to its excellent selectivity in targeting specific cells without affecting normal cells. However, due to the rapid growth of the peptide samples, predicting TB accurately has become a challenging task. To effectively identify antitubercular peptides, an intelligent and reliable prediction model is indispensable. An ensemble learning approach was used in this study to improve expected results by compensating for the shortcomings of individual classification algorithms. Initially, three distinct representation approaches were used to formulate the training samples: k-space amino acid composition, composite physicochemical properties, and one-hot encoding. The feature vectors of the applied feature extraction methods are then combined to generate a heterogeneous vector. Finally, utilizing individual and heterogeneous vectors, five classification models of distinct nature were used to evaluate prediction rates. In addition, a genetic algorithm-based ensemble model was used to improve the suggested model's prediction and training capabilities. Using training and independent datasets, the proposed ensemble model achieved an accuracy of 94.47% and 92.68%, respectively. Our proposed "iAtbP-Hyb-EnC" model outperformed existing predictors, reporting ~10% higher training accuracy. The "iAtbP-Hyb-EnC" model is suggested to be a reliable tool for scientists and might play a valuable role in academic research and drug discovery. The source code and all datasets are publicly available at https://github.com/Farman335/iAtbP-Hyb-EnC.
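The combination of feature views described in this abstract can be illustrated with a small sketch. It assumes the common convention for k-spaced amino acid pairs (two residues separated by k positions) and concatenates that composition with a one-hot encoding. The function names, the value of k and the example peptide are hypothetical; the physicochemical descriptors and the genetic algorithm ensemble are not reproduced here.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot(seq):
    """Flattened one-hot encoding: 20 values per residue."""
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    vec = []
    for aa in seq:
        row = [0.0] * len(AMINO_ACIDS)
        row[index[aa]] = 1.0
        vec.extend(row)
    return vec


def k_spaced_pair_composition(seq, k):
    """Frequency of each residue pair separated by k positions (400 values)."""
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    n_pairs = len(seq) - k - 1
    for i in range(n_pairs):
        counts[seq[i] + seq[i + k + 1]] += 1
    total = max(n_pairs, 1)  # avoid division by zero for short peptides
    return [counts[p] / total for p in pairs]


def hybrid_vector(seq, k=1):
    """Concatenate the two views into one heterogeneous feature vector."""
    return k_spaced_pair_composition(seq, k) + one_hot(seq)


features = hybrid_vector("ACDA", k=1)
print(len(features))  # 400 pair frequencies + 4 * 20 one-hot values
```

Because the one-hot part grows with peptide length, a real pipeline would need fixed-length inputs (for example by padding or truncation) before training a classifier on such vectors.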
Affiliation(s)
- Shahid Akbar
- Department of Computer Science, Abdul Wali Khan University, Mardan, KP, 23200, Pakistan.
| | - Ashfaq Ahmad
- Department of Computer Science, Abdul Wali Khan University, Mardan, KP, 23200, Pakistan.
| | - Maqsood Hayat
- Department of Computer Science, Abdul Wali Khan University, Mardan, KP, 23200, Pakistan.
| | - Ateeq Ur Rehman
- Department of Information Technology, The University of Haripur, KP, Pakistan.
| | - Salman Khan
- Department of Computer Science, Abdul Wali Khan University, Mardan, KP, 23200, Pakistan.
| | - Farman Ali
- School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, 210094, China.
|
14
|
Akmal MA, Hussain W, Rasool N, Khan YD, Khan SA, Chou KC. Using CHOU'S 5-Steps Rule to Predict O-Linked Serine Glycosylation Sites by Blending Position Relative Features and Statistical Moment. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021; 18:2045-2056. [PMID: 31985438 DOI: 10.1109/tcbb.2020.2968441] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 06/10/2023]
Abstract
Glycosylation of proteins in eukaryotic cells is an important and complicated post-translational modification due to its pivotal role and association with crucial physiological functions within most proteins. Identification of glycosylation sites in a polypeptide chain is not an easy task due to multiple impediments. Analytical identification of these sites is expensive and laborious. There is a dire need to develop a reliable computational method for precise determination of such sites, which can help researchers save time and effort. Herein, we propose a novel predictor, namely iGlycoS-PseAAC, by integrating Chou's Pseudo Amino Acid Composition (PseAAC) and relative/absolute position-based features. The self-consistency results show that the accuracy of the model on the benchmark dataset for prediction of O-linked glycosylation at serine sites is 98.8 percent. The overall accuracy of the predictor achieved through 10-fold cross-validation by combining the positive and negative results is 97.2 percent. The overall accuracy achieved through the jackknife test is 96.195 percent by aggregating all the prediction results. Thus the proposed predictor can help in predicting O-linked glycosylated serine sites in an efficient and accurate way. The overall results show that the accuracy of iGlycoS-PseAAC is higher than that of the existing tools.
|
15
|
Khan YD, Khan NS, Naseer S, Butt AH. iSUMOK-PseAAC: prediction of lysine sumoylation sites using statistical moments and Chou's PseAAC. PeerJ 2021; 9:e11581. [PMID: 34430072 PMCID: PMC8349168 DOI: 10.7717/peerj.11581] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 02/18/2021] [Accepted: 05/19/2021] [Indexed: 01/25/2023] Open
Abstract
Sumoylation is a post-translational modification that is involved in the adaptation of cells and the functional properties of a large number of proteins. Sumoylation has key importance in subcellular concentration, transcriptional synchronization, chromatin remodeling, response to stress, and regulation of mitosis. Sumoylation is associated with developmental defects in many human diseases such as cancer, Huntington's, Alzheimer's, Parkinson's, spinocerebellar ataxia type 1, and amyotrophic lateral sclerosis. The covalent bonding of sumoylation is essential to inheriting part of the operative characteristics of some other proteins. For that reason, the prediction of sumoylation sites is of significance to the scientific community. A novel and efficient technique is proposed to predict sumoylation sites in proteins by incorporating Chou's Pseudo Amino Acid Composition (PseAAC) with statistical moments-based features. The outcomes from the proposed system using 10-fold cross-validation testing are 94.51%, 94.24%, 94.79% and 0.8903 for accuracy, sensitivity, specificity and MCC, respectively. The performance of the proposed system is so far the best in comparison with other state-of-the-art methods. The code for the current study is available in the GitHub repository at https://github.com/csbioinfopk/iSumoK-PseAAC.
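The four reported measures can all be derived from a binary confusion matrix. The following is a minimal sketch, assuming the standard definitions of accuracy, sensitivity, specificity and the Matthews correlation coefficient (MCC); the function name and the example counts are illustrative and are not taken from the paper.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Standard evaluation measures from confusion-matrix counts."""
    total = tp + tn + fp + fn
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        # MCC is defined as 0 when any marginal count is zero.
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Illustrative counts only: 100 positive and 100 negative sites.
metrics = binary_metrics(tp=90, tn=95, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Unlike the other three measures, MCC ranges from -1 to 1, so it is conventionally reported as a plain coefficient rather than as a percentage.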
Affiliation(s)
- Yaser Daanial Khan
- Department of Computer Science, School of Systems and Technology, University of Management and Technology, Lahore, Punjab, Pakistan
| | - Nabeel Sabir Khan
- Department of Computer Science, School of Systems and Technology, University of Management and Technology, Lahore, Punjab, Pakistan
| | - Sheraz Naseer
- Department of Computer Science, School of Systems and Technology, University of Management and Technology, Lahore, Punjab, Pakistan
| | - Ahmad Hassan Butt
- Department of Computer Science, School of Systems and Technology, University of Management and Technology, Lahore, Punjab, Pakistan
|
16
|
Analysis and Prediction of Adverse Reaction of Drugs with Machine Learning Models for Tracking the Severity. ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING 2021. [DOI: 10.1007/s13369-021-05999-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/26/2022]
|
17
|
Li G, Du X, Li X, Zou L, Zhang G, Wu Z. Prediction of DNA binding proteins using local features and long-term dependencies with primary sequences based on deep learning. PeerJ 2021; 9:e11262. [PMID: 33986992 PMCID: PMC8101451 DOI: 10.7717/peerj.11262] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 08/31/2020] [Accepted: 03/22/2021] [Indexed: 12/12/2022] Open
Abstract
DNA-binding proteins (DBPs) play pivotal roles in many biological functions such as alternative splicing, RNA editing, and methylation. Many traditional machine learning (ML) methods and deep learning (DL) methods have been proposed to predict DBPs. However, these methods either rely on manual feature extraction or fail to capture long-term dependencies in the DNA sequence. In this paper, we propose a method, called PDBP-Fusion, to identify DBPs based on the fusion of local features and long-term dependencies from primary sequences only. We use a convolutional neural network (CNN) to learn local features and a bi-directional long short-term memory network (Bi-LSTM) to capture critical long-term dependencies in context. In addition, we perform feature extraction, model training, and model prediction simultaneously. The PDBP-Fusion approach can predict DBPs with 86.45% sensitivity, 79.13% specificity, 82.81% accuracy, and 0.661 MCC on the PDB14189 benchmark dataset. The MCC of our proposed method is at least 9.1% higher than that of other advanced prediction models. Moreover, PDBP-Fusion also achieves superior performance and model robustness on the PDB2272 independent dataset. This demonstrates that PDBP-Fusion can be used to predict DBPs from sequences accurately and effectively; the online server is at http://119.45.144.26:8080/PDBP-Fusion/.
Affiliation(s)
- Guobin Li
- School of Artificial Intelligence and Big Data, Hefei University, Hefei, China
| | - Xiuquan Du
- School of Computer Science and Technology, Anhui University, Hefei, China
| | - Xinlu Li
- School of Artificial Intelligence and Big Data, Hefei University, Hefei, China
| | - Le Zou
- School of Artificial Intelligence and Big Data, Hefei University, Hefei, China
| | - Guanhong Zhang
- School of Artificial Intelligence and Big Data, Hefei University, Hefei, China
| | - Zhize Wu
- School of Artificial Intelligence and Big Data, Hefei University, Hefei, China
|
18
|
Awais M, Hussain W, Khan YD, Rasool N, Khan SA, Chou KC. iPhosH-PseAAC: Identify Phosphohistidine Sites in Proteins by Blending Statistical Moments and Position Relative Features According to the Chou's 5-Step Rule and General Pseudo Amino Acid Composition. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021; 18:596-610. [PMID: 31144645 DOI: 10.1109/tcbb.2019.2919025] [Citation(s) in RCA: 48] [Impact Index Per Article: 12.0] [Indexed: 06/09/2023]
Abstract
Protein phosphorylation is one of the key mechanisms in prokaryotes and eukaryotes and is responsible for various biological functions such as protein degradation, intracellular localization, a multitude of cellular processes, molecular association, cytoskeletal dynamics, and enzymatic inhibition/activation. Phosphohistidine (PhosH) has a key role in a number of biological processes, ranging from central metabolism to signalling in eukaryotes and bacteria. Thus, identification of phosphohistidine sites in a protein sequence is crucial, and experimental identification can be expensive, time-consuming, and laborious. To address this problem, we propose a novel computational model, namely iPhosH-PseAAC, for prediction of phosphohistidine sites in a given protein sequence using pseudo amino acid composition (PseAAC), statistical moments, and position relative features. The results of the proposed predictor are validated through self-consistency testing, 10-fold cross-validation, and jackknife testing. The self-consistency validation gave 100 percent accuracy, whereas cross-validation achieved 94.26 percent accuracy. Moreover, jackknife testing gave 97.07 percent accuracy for the proposed model. Thus, the proposed model iPhosH-PseAAC shows a strong ability to predict PhosH sites in given proteins.
|
19
|
Aziz AZB, Hasan MAM, Shin J. Identification of RNA pseudouridine sites using deep learning approaches. PLoS One 2021; 16:e0247511. [PMID: 33621235 PMCID: PMC7901771 DOI: 10.1371/journal.pone.0247511] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 09/30/2020] [Accepted: 02/08/2021] [Indexed: 01/05/2023] Open
Abstract
Pseudouridine (Ψ) is widespread among the various RNA modifications and has been confirmed to occur in rRNA, mRNA, tRNA, and nuclear/nucleolar RNA. Hence, identifying Ψ sites has vital significance in academic research, drug development and gene therapies. Several laboratory techniques for Ψ identification have been introduced over the years. Although these techniques produce satisfactory results, they are costly and time-consuming, and they require skilled experience. As the lengths of RNA sequences are getting longer day by day, an efficient method for identifying pseudouridine sites using computational approaches is very important. In this paper, we propose a multi-channel convolutional neural network using binary encoding. We employed k-fold cross-validation and grid search to tune the hyperparameters. We evaluated its performance on independent datasets and found promising results. The results indicate that our method can be used to identify pseudouridine sites for associated purposes. We have also implemented an easily accessible web server at http://103.99.176.239/ipseumulticnn/.
Affiliation(s)
- Abu Zahid Bin Aziz
- Department of Computer Science & Engineering, Rajshahi University of Engineering & Technology, Rajshahi, Bangladesh
| | - Md. Al Mehedi Hasan
- Department of Computer Science & Engineering, Rajshahi University of Engineering & Technology, Rajshahi, Bangladesh
| | - Jungpil Shin
- School of Computer Science and Engineering, University of Aizu, Aizuwakamatsu, Japan
|
20
|
Zhang Q, Zhang Y, Li S, Han Y, Jin S, Gu H, Yu B. Accurate prediction of multi-label protein subcellular localization through multi-view feature learning with RBRL classifier. Brief Bioinform 2021; 22:6127451. [PMID: 33537726 DOI: 10.1093/bib/bbab012] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 10/20/2020] [Revised: 12/12/2020] [Accepted: 01/06/2021] [Indexed: 01/27/2023] Open
Abstract
Multi-label proteins can participate in carrier transportation, enzyme catalysis, hormone regulation and other life activities. They also play a key role in the fields of biopharmaceuticals and gene and cell therapy. This article proposes a prediction method called Mps-mvRBRL to predict the subcellular localization (SCL) of multi-label proteins. Firstly, pseudo position-specific scoring matrix, dipeptide composition, position-specific scoring matrix-transition probability composition, gene ontology and pseudo amino acid composition algorithms are used to obtain numerical information from different views. Based on the contribution of the five individual feature extraction methods, differential evolution is used for the first time to learn the weight of each single feature, and the original features are then fused by a weighted combination method to integrate multi-view information. Secondly, a weighted linear discriminant analysis framework based on a binary weight form is applied to the fused high-dimensional features to eliminate irrelevant information. Finally, the best feature vector is input into the joint ranking support vector machine and binary relevance with robust low-rank learning classifier to predict the SCL. After applying leave-one-out cross-validation, the overall actual accuracy (OAA) and overall location accuracy (OLA) of Mps-mvRBRL on the training set of Gram-positive bacteria are both 99.81%. The OAA values on the test sets of the plant, virus and Gram-negative bacteria datasets are 97.24%, 98.55% and 98.20%, respectively, and the OLA values are 97.16%, 97.62% and 98.28%, respectively. The results show that the model achieves good performance in predicting the SCL of multi-label proteins.
Affiliation(s)
- Qi Zhang
- College of Mathematics and Physics, Qingdao University of Science and Technology, China
| | - Yandan Zhang
- College of Mathematics and Physics, Qingdao University of Science and Technology, China
| | - Shan Li
- School of Mathematics and Statistics, Central South University, China
| | - Yu Han
- College of Mathematics and Physics, Qingdao University of Science and Technology, China
| | - Shuping Jin
- College of Mathematics and Physics, Qingdao University of Science and Technology, China
| | - Haiming Gu
- College of Mathematics and Physics, Qingdao University of Science and Technology, China
| | - Bin Yu
- College of Mathematics and Physics, Qingdao University of Science and Technology, China
|
21
|
Li J, Zhang L, He S, Guo F, Zou Q. SubLocEP: a novel ensemble predictor of subcellular localization of eukaryotic mRNA based on machine learning. Brief Bioinform 2021; 22:6059770. [PMID: 33388743 DOI: 10.1093/bib/bbaa401] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 09/02/2020] [Revised: 11/28/2020] [Accepted: 12/08/2020] [Indexed: 01/23/2023] Open
Abstract
MOTIVATION: mRNA location corresponds to the location of protein translation and contributes to precise spatial and temporal management of protein function. However, current assignment of subcellular localization of eukaryotic mRNA reveals important limitations: (1) turning multiple classifications into multiple dichotomies makes the training process tedious; (2) the majority of the models trained by classical algorithms are based on the extraction of single sequence information; (3) the existing state-of-the-art models have not reached an ideal level in terms of prediction and generalization ability. To achieve better assignment of subcellular localization of eukaryotic mRNA, a better and more comprehensive model must be developed. RESULTS: In this paper, SubLocEP is proposed as a two-layer integrated prediction model for accurate prediction of the location of sequence samples. Unlike the existing models based on limited features, SubLocEP comprehensively considers additional feature attributes and is combined with LightGBM to generate single-feature classifiers. The initial integration model (single-layer model) is generated according to the categories of a feature. Subsequently, two single-layer integration models are weighted (sequence-based: physicochemical properties = 3:2) to produce the final two-layer model. The performance of SubLocEP on independent datasets indicates that it is an accurate and stable prediction model with strong generalization ability. Additionally, an online tool has been developed that contains experimental data and maximizes user convenience for estimating the subcellular localization of eukaryotic mRNA.
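The 3:2 weighting of the two single-layer models can be read as a weighted average of their class probabilities. The sketch below works under that assumption only; the abstract does not say how the weights are applied, and the function names, labels and probabilities here are hypothetical.

```python
def weighted_ensemble(seq_probs, phys_probs, w_seq=3.0, w_phys=2.0):
    """Blend two class-probability vectors with fixed weights (default 3:2)."""
    if len(seq_probs) != len(phys_probs):
        raise ValueError("both models must score the same set of classes")
    total = w_seq + w_phys
    return [(w_seq * s + w_phys * p) / total
            for s, p in zip(seq_probs, phys_probs)]


def predict_location(seq_probs, phys_probs, labels):
    """Return the label with the highest blended probability."""
    combined = weighted_ensemble(seq_probs, phys_probs)
    best = max(range(len(combined)), key=combined.__getitem__)
    return labels[best]


# Hypothetical outputs of the sequence-based and physicochemical models.
labels = ["nucleus", "cytoplasm"]
print(weighted_ensemble([0.2, 0.8], [0.9, 0.1]))
print(predict_location([0.2, 0.8], [0.9, 0.1], labels))
```

With these numbers the physicochemical model strongly favours the first class, yet the larger weight on the sequence-based model keeps the blended prediction on the second class.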
Affiliation(s)
| | - Lichao Zhang
- School of Intelligent Manufacturing and Equipment, Shenzhen Institute of Information Technology
|
22
|
Liu GH, Zhang BW, Qian G, Wang B, Mao B, Bichindaritz I. Bioimage-Based Prediction of Protein Subcellular Location in Human Tissue with Ensemble Features and Deep Networks. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2020; 17:1966-1980. [PMID: 31107658 DOI: 10.1109/tcbb.2019.2917429] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Indexed: 06/09/2023]
Abstract
Prediction of protein subcellular location has become a hot topic because it has proven useful for understanding disease mechanisms and for novel drug design. With the rapid development of automated microscopic imaging technology in recent years, bioimage-based classification methods for protein subcellular location have attracted considerable attention because images can describe the protein distribution intuitively and in detail. In the current study, a prediction method for protein subcellular location was proposed based on multi-view image features extracted from three different views: the four texture features of the original image, the global and local features of the protein extracted from the protein channel images after color segmentation, and the global features of DNA extracted from the DNA channel image. The extracted features were then combined to improve the performance of subcellular localization prediction. The best ensemble features were obtained by comparing the performance of different feature combinations under the same classifier. In this work, a classifier based on stacked auto-encoders and the random forest was also put forward. To improve the prediction results, the deep network was combined with traditional statistical classification methods. Stringent cross-validation and independent validation tests on the benchmark dataset demonstrated the efficacy of the proposed method.
|
23
|
Pandey RP, Kumar S, Ahmad S, Vibhuti A, Raj VS, Verma AK, Sharma P, Leal E. Use Chou's 5-steps rule to evaluate protective efficacy induced by antigenic proteins of Mycobacterium tuberculosis encapsulated in chitosan nanoparticles. Life Sci 2020; 256:117961. [PMID: 32534039 DOI: 10.1016/j.lfs.2020.117961] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 12/11/2022]
Abstract
The study focuses on whether antigenic proteins encapsulated in biopolymeric nanoparticles can augment protective efficacy. Chitosan nanoparticles (ChN) were prepared by the ionic gelation method, and the Culture Filtrate Proteins (CFP) CFP-10 and CFP-21 of Mycobacterium tuberculosis (Mtb) were encapsulated in ChN. The binding efficiency of the nanoparticles with the CFP-10 and CFP-21 proteins was confirmed by UV spectrophotometry. The efficacy of nanoparticle-encapsulated antigenic proteins administered intraperitoneally against Mtb aerosol infection was evaluated in Balb/c mice. The protection study was done by bacterial counts [CFU]. Cells primed with CFP-10 and CFP-21 proteins demonstrated a Th1-biased T cell response in an ex vivo assay. ChN-CFP-10 and ChN-CFP-21 nanoparticles have both protective and therapeutic potential against Mtb. In the group of mice immunized with ChN-CFP-10, the number of colonies reduced significantly from day 15 to day 60. ChN-CFP-21 showed maximum protection in ChN-CFP-21 immunized mice. ChN-CFP-10 and ChN-CFP-21 clearly showed enhanced protection against Mtb.
Affiliation(s)
- Ramendra Pati Pandey
- Centre for Drug Design Discovery and Development (C4D), SRM University, Delhi-NCR, Rajiv Gandhi Education City, Sonepat 131 029, Haryana, India
| | - Santosh Kumar
- ICGEB (International Centre For Genetic Engineering And Biotechnology), New Delhi 110067, India
| | - Saheem Ahmad
- Department of Clinical Laboratory Sciences, College of Applied Medical Sciences, University of Ha'il, Ha'il, 55476, Saudi Arabia
| | - Arpana Vibhuti
- Centre for Drug Design Discovery and Development (C4D), SRM University, Delhi-NCR, Rajiv Gandhi Education City, Sonepat 131 029, Haryana, India.
| | - V Samuel Raj
- Centre for Drug Design Discovery and Development (C4D), SRM University, Delhi-NCR, Rajiv Gandhi Education City, Sonepat 131 029, Haryana, India.
| | - Anita Kamra Verma
- Nano-Biotech Laboratory, Department of Zoology, Kirori Mal College, University of Delhi, New Delhi 110003, India
| | - Pawan Sharma
- ICGEB (International Centre For Genetic Engineering And Biotechnology), New Delhi 110067, India
| | - Elcio Leal
- Institute of Biological Sciences, Federal University of Para, Para 66075-000, Brazil.
|
24
|
Gachpazan M, Kashani H, Khazaei M, Hassanian SM, Rezayi M, Asgharzadeh F, Ghayour-Mobarhan M, Ferns GA, Avan A. The Impact of Statin Therapy on the Survival of Patients with Gastrointestinal Cancer. Curr Drug Targets 2020; 20:738-747. [PMID: 30539694 DOI: 10.2174/1389450120666181211165449] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 09/21/2018] [Revised: 10/25/2018] [Accepted: 12/05/2018] [Indexed: 12/13/2022]
Abstract
Statins are 3-hydroxy-3-methylglutaryl coenzyme A (HMG-CoA) reductase inhibitors that may play an important role in the evolution of cancers, due to their effects on cancer cell metabolism. Statins affect several potential pathways, including cell proliferation, angiogenesis, apoptosis and metastasis. The number of trials assessing the putative clinical benefits of statins in cancer is increasing. Currently, there are several trials listed on the global trial identifier website clinicaltrials.gov. Given the compelling evidence from these trials in a variety of clinical settings, there have been calls for a clinical trial of statins in the adjuvant gastrointestinal cancer setting. However, randomized controlled trials on specific cancer types in relation to statin use, as well as studies on populations without a clinical indication for using statins, have elucidated some potential underlying biological mechanisms, and the investigation of different statins is probably warranted. It would be useful for these trials to incorporate the assessment of tumour biomarkers predictive of statin response in their design. This review summarizes the recent preclinical and clinical studies that assess the application of statins in the treatment of gastrointestinal cancers, with particular emphasis on their association with cancer risk.
Affiliation(s)
- Meysam Gachpazan
- Metabolic syndrome Research center, Mashhad University of Medical Sciences, Mashhad, Iran.,Department of Modern Sciences and Technologies; Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Hoda Kashani
- Department of Modern Sciences and Technologies; Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Majid Khazaei
- Metabolic syndrome Research center, Mashhad University of Medical Sciences, Mashhad, Iran.,Student Research Committee, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Seyed Mahdi Hassanian
- Metabolic syndrome Research center, Mashhad University of Medical Sciences, Mashhad, Iran.,Department of Medical Biochemistry; Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Majid Rezayi
- Metabolic syndrome Research center, Mashhad University of Medical Sciences, Mashhad, Iran.,Department of Modern Sciences and Technologies; Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Fereshteh Asgharzadeh
- Student Research Committee, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Majid Ghayour-Mobarhan
- Metabolic syndrome Research center, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Gordon A Ferns
- Brighton & Sussex Medical School, Division of Medical Education, Falmer, Brighton, Sussex BN1 9PH, United Kingdom
| | - Amir Avan
- Metabolic syndrome Research center, Mashhad University of Medical Sciences, Mashhad, Iran.,Department of Modern Sciences and Technologies; Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran.,Cancer Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
|
25
|
Abstract
During the last three decades or so, many efforts have been made to study the protein cleavage sites by some disease-causing enzyme, such as HIV (Human Immunodeficiency Virus) protease and SARS (Severe Acute Respiratory Syndrome) coronavirus main proteinase. It has become increasingly clear via this mini-review that the motivation driving the aforementioned studies is quite wise, and that the results acquired through these studies are very rewarding, particularly for developing peptide drugs.
Affiliation(s)
- Kuo-Chen Chou
- Gordon Life Science Institute, Boston, MA 02478, United States
|
26
|
Bouziane H, Chouarfia A. Use of Chou's 5-steps rule to predict the subcellular localization of gram-negative and gram-positive bacterial proteins by multi-label learning based on gene ontology annotation and profile alignment. J Integr Bioinform 2020; 18:51-79. [PMID: 32598314 PMCID: PMC8035964 DOI: 10.1515/jib-2019-0091] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 11/08/2019] [Accepted: 04/08/2020] [Indexed: 12/31/2022] Open
Abstract
To date, many proteins generated by large-scale genome sequencing projects are still uncharacterized and subject to intensive investigation by both experimental and computational means. Knowledge of protein subcellular localization (SCL) is of key importance for protein function elucidation. However, it remains a challenging task, especially for multi-site proteins known to shuttle between cell compartments to perform their proper biological functions and for proteins that do not have significant homology to proteins of known subcellular locations. Due to their low cost and reasonable accuracy, machine learning-based methods have gained much attention in this context, with the availability of a plethora of biological databases and annotated proteins for analysis and benchmarking. Various predictive models have been proposed to tackle the SCL problem using different protein sequence features pertaining to subcellular localization; however, the overwhelming majority of them focus on single localization and cover very limited cellular locations. Prediction was basically established on sorting signals, amino acid compositions, and homology. To improve the prediction quality, the focus is currently on knowledge extracted from annotation databases, such as protein-protein interactions and Gene Ontology (GO) functional domain annotation, which has recently become widely adopted and essential information for learning systems. To deal with this problem, in the present study we considered the SCL prediction task as a multi-label learning problem and tried to label both single-site and multi-site unannotated bacterial protein sequences by mining protein homology relationships using both GO terms of protein homologs and PSI-BLAST profiles. The experiments using 5-fold cross-validation tests on the benchmark datasets showed a significant improvement in the results obtained by the proposed consensus multi-label prediction model, which discriminates six compartments for Gram-negative and five compartments for Gram-positive bacterial proteins.
Affiliation(s)
- Hafida Bouziane
- Département d’Informatique, Université des Sciences et de la Technologie d’Oran Mohamed Boudiaf, USTO-MB BP 1505, El M’Naouer, 31000, Oran, Algeria
| | - Abdallah Chouarfia
- Département d’Informatique, Université des Sciences et de la Technologie d’Oran Mohamed Boudiaf, USTO-MB BP 1505, El M’Naouer, 31000, Oran, Algeria
|
27
|
Chou KC. An Insightful 10-year Recollection Since the Emergence of the 5-steps Rule. Curr Pharm Des 2020; 25:4223-4234. [PMID: 31782354 DOI: 10.2174/1381612825666191129164042] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 09/18/2019] [Accepted: 11/25/2019] [Indexed: 11/22/2022]
Abstract
OBJECTIVE: One of the most challenging problems is how to formulate a biological sequence as a vector while still retaining much of its sequence-order information. METHODS: To address this problem, the approach of Pseudo Amino Acid Components (PseAAC) has been developed. RESULTS AND CONCLUSION: This 10-year recollection makes it increasingly clear that the proposal has indeed been very powerful.
Affiliation(s)
- Kuo-Chen Chou
- Gordon Life Science Institute, Boston, Massachusetts 02478, United States; Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
|
28
|
|
29
|
|
30
|
Zheng L, Huang S, Mu N, Zhang H, Zhang J, Chang Y, Yang L, Zuo Y. RAACBook: a web server of reduced amino acid alphabet for sequence-dependent inference by using Chou's five-step rule. Database: The Journal of Biological Databases and Curation 2020; 2019:5650975. [PMID: 31802128 PMCID: PMC6893003 DOI: 10.1093/database/baz131] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Received: 06/20/2019] [Revised: 10/16/2019] [Accepted: 10/17/2019] [Indexed: 12/12/2022]
Abstract
By reducing the amino acid alphabet, protein complexity can be significantly simplified, which could improve computational efficiency, decrease information redundancy and reduce the chance of overfitting. Although some reduced alphabets have been proposed, different classification rules can produce distinct results in protein sequence analysis. Thus, a systematic framework for reduced alphabets is urgently needed. In this work, we constructed a comprehensive web server called RAACBook for protein sequence analysis and machine learning applications by integrating reduced alphabets. The web server contains three parts: (i) 74 types of reduced amino acid alphabet were manually extracted to generate 673 reduced amino acid clusters (RAACs) for dealing with unique protein problems. Users can easily select the desired RAACs from a multilayer browser tool. (ii) An online tool was developed to analyze the primary sequence of a protein. The tool can produce K-tuple reduced amino acid composition by defining three correlation parameters (K-tuple, g-gap, λ-correlation). The results are visualized as a sequence alignment, a merged RAA composition, a feature distribution and a logo of the reduced sequence. (iii) A machine learning server is provided to train protein classification models based on K-tuple RAAC. The optimal model can be selected according to the evaluation indexes (ROC, AUC, MCC, etc.). In conclusion, RAACBook provides a powerful and user-friendly service for protein sequence analysis and computational proteomics. RAACBook is freely available at http://bioinfor.imu.edu.cn/raacbook. Database URL: http://bioinfor.imu.edu.cn/raacbook
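As a rough illustration of what a K-tuple reduced amino acid composition looks like, the sketch below maps a sequence onto a reduced alphabet and counts contiguous K-tuples. It is not RAACBook's implementation: the five-cluster alphabet `CLUSTERS` and all function names are hypothetical, chosen only for this example, and the g-gap and λ-correlation parameters mentioned in the abstract are not modelled.

```python
from collections import Counter
from itertools import product

# Hypothetical 5-cluster alphabet covering the 20 standard amino acids.
# This grouping is an illustrative assumption, not one of RAACBook's RAACs.
CLUSTERS = ["AVLIMC", "FWYH", "STNQ", "KR", "DEGP"]


def build_mapping(clusters):
    """Map every amino acid to the first letter of its cluster."""
    return {aa: cluster[0] for cluster in clusters for aa in cluster}


def reduce_sequence(seq, mapping):
    """Rewrite a sequence in the reduced alphabet, skipping unknown symbols."""
    return "".join(mapping[aa] for aa in seq.upper() if aa in mapping)


def ktuple_composition(seq, clusters=CLUSTERS, k=2):
    """Return the frequency of every contiguous K-tuple of reduced letters."""
    reduced = reduce_sequence(seq, build_mapping(clusters))
    counts = Counter(reduced[i:i + k] for i in range(len(reduced) - k + 1))
    total = sum(counts.values())
    letters = [cluster[0] for cluster in clusters]
    # Emit a fixed-length feature dictionary: one entry per possible K-tuple.
    return {
        "".join(t): (counts["".join(t)] / total if total else 0.0)
        for t in product(letters, repeat=k)
    }


if __name__ == "__main__":
    features = ktuple_composition("AVKR", k=2)
    print(len(features), features["AA"], features["AK"], features["KK"])
```

With five clusters and K = 2 the feature vector has 25 entries instead of the 400 needed for dipeptides over the full 20-letter alphabet, which is the dimensionality reduction the abstract refers to.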
Affiliation(s)
- Lei Zheng
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Zhaojun Road No.24, Hohhot, 010070, China
- Shenghui Huang
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Zhaojun Road No.24, Hohhot, 010070, China
- Nengjiang Mu
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Zhaojun Road No.24, Hohhot, 010070, China
- Haoyue Zhang
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Zhaojun Road No.24, Hohhot, 010070, China
- Jiayu Zhang
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Zhaojun Road No.24, Hohhot, 010070, China
- Yu Chang
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Zhaojun Road No.24, Hohhot, 010070, China
- Lei Yang
- College of Bioinformatics Science and Technology, Harbin Medical University, Baojian Road No.157, Harbin 150081, China
- Yongchun Zuo
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Zhaojun Road No.24, Hohhot, 010070, China
|
31
|
Wang S, Wang Y, Yu C, Cao Y, Yu Y, Pan Y, Su D, Lu Q, Yang W, Zuo Y, Yang L. Characterization of the relationship between FLI1 and immune infiltrate level in tumour immune microenvironment for breast cancer. J Cell Mol Med 2020; 24:5501-5514. [PMID: 32249526 PMCID: PMC7214163 DOI: 10.1111/jcmm.15205] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 08/20/2019] [Revised: 01/31/2020] [Accepted: 03/06/2020] [Indexed: 12/24/2022] Open
Abstract
Breast cancer is the most common cancer and the leading cause of cancer death among women worldwide. Tumour‐infiltrating lymphocytes are defined as white blood cells that have left the vasculature and localized in tumours. Recently, tumour‐infiltrating lymphocytes were found to be associated with good prognosis and response to immunotherapy in tumours. In this study, to examine the influence of FLI1 on the immune system in breast cancer, we interrogated the relationship between FLI1 expression levels and the infiltration levels of 28 immune cell types. By splitting the breast cancer samples into high and low FLI1 expression subtypes, we found that the high FLI1 expression subtype was enriched in many immune cell types, and the up‐regulated differentially expressed genes between the subtypes were enriched in immune system processes, immune‐related KEGG pathways and biological processes. In addition, many important immune‐related features were found to be positively correlated with the FLI1 expression level. Furthermore, we found that FLI1 was correlated with immune‐related genes. Our findings may help in recognizing the relationship between the tumour immune microenvironment and FLI1, and may unravel clinical outcomes and immunotherapy utility for FLI1 in breast cancer.
Affiliation(s)
- Shiyuan Wang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Yakun Wang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Chunlu Yu
- Public Health College, Harbin Medical University, Harbin, China
- Yiyin Cao
- Public Health College, Harbin Medical University, Harbin, China
- Yao Yu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Yi Pan
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Dongqing Su
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Qianzi Lu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Wuritu Yang
- The State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, China
- Yongchun Zuo
- The State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, China
- Lei Yang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
|
32
|
Rashid MM, Shatabda S, Hasan MM, Kurata H. Recent Development of Machine Learning Methods in Microbial Phosphorylation Sites. Curr Genomics 2020; 21:194-203. [PMID: 33071613 PMCID: PMC7521030 DOI: 10.2174/1389202921666200427210833] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 01/16/2020] [Revised: 04/12/2020] [Accepted: 04/13/2020] [Indexed: 01/10/2023] Open
Abstract
A variety of protein post-translational modifications has been identified that control many cellular functions. Phosphorylation studies in mycobacterial organisms have shown critical importance in diverse biological processes, such as intercellular communication and cell division. Recent technical advances in high-precision mass spectrometry have determined a large number of microbial phosphorylated proteins and phosphorylation sites throughout the proteome analysis. Identification of phosphorylated proteins with specific modified residues through experimentation is often labor-intensive, costly and time-consuming. All these limitations could be overcome through the application of machine learning (ML) approaches. However, only a limited number of computational phosphorylation site prediction tools have been developed so far. This work aims to present a complete survey of the existing ML-predictors for microbial phosphorylation. We cover a variety of important aspects for developing a successful predictor, including operating ML algorithms, feature selection methods, window size, and software utility. Initially, we review the currently available phosphorylation site databases of the microbiome, the state-of-the-art ML approaches, working principles, and their performances. Lastly, we discuss the limitations and future directions of the computational ML methods for the prediction of phosphorylation.
Affiliation(s)
- Md. Mehedi Hasan
- Address correspondence to these authors at the Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan, and the Biomedical Informatics R&D Center, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; Tel: +81-948-297-828
- Hiroyuki Kurata
- Address correspondence to these authors at the Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan, and the Biomedical Informatics R&D Center, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; Tel: +81-948-297-828
|
33
|
iPseU-Layer: Identifying RNA Pseudouridine Sites Using Layered Ensemble Model. Interdiscip Sci 2020; 12:193-203. [PMID: 32170573 DOI: 10.1007/s12539-020-00362-y] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 10/09/2019] [Revised: 02/16/2020] [Accepted: 02/19/2020] [Indexed: 01/28/2023]
Abstract
Pseudouridine represents one of the most prevalent post-transcriptional RNA modifications. The identification of pseudouridine sites is an essential step toward understanding RNA functions, RNA structure stabilization, the translation process, and RNA stability; however, high-throughput experimental techniques remain expensive and time-consuming in lab explorations and biochemical processes. Thus, developing an efficient machine learning-based method for identifying pseudouridine sites is very important in both academic research and drug development. Motivated by this, we present an effective layered ensemble model, designated iPseU-Layer, for the identification of RNA pseudouridine sites. The proposed iPseU-Layer approach is based on three machine learning layers: a feature selection layer, a feature extraction and fusion layer, and a prediction layer. The feature selection layer reduces the dimensionality and can be regarded as a data pre-processing stage. The feature extraction and fusion layer uses an ensemble method, implemented through various machine learning algorithms, to generate intermediate outputs. The prediction layer applies a classic random forest to produce the final results. Furthermore, we systematically conduct validation experiments using cross-validation tests and an independent test against the current state-of-the-art models. The proposed iPseU-Layer provides promising predictive performance in terms of sensitivity, specificity, accuracy and Matthews correlation coefficient. Collectively, these findings indicate that the iPseU-Layer framework is a feasible and effective strategy for the prediction of RNA pseudouridine sites.
|
34
|
Identifying FLI1 subtype by characterizing tumor immune microenvironment in prostate adenocarcinoma via Chou's 5-steps rule. Genomics 2020; 112:1500-1515. [DOI: 10.1016/j.ygeno.2019.08.021] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Received: 06/28/2019] [Revised: 08/03/2019] [Accepted: 08/26/2019] [Indexed: 12/14/2022]
|
35
|
Zheng H, Yang H, Gong D, Mai L, Qiu X, Chen L, Su X, Wei R, Zeng Z. Progress in the Mechanism and Clinical Application of Cilostazol. Curr Top Med Chem 2020; 19:2919-2936. [PMID: 31763974 DOI: 10.2174/1568026619666191122123855] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 05/27/2019] [Revised: 07/27/2019] [Accepted: 08/02/2019] [Indexed: 12/20/2022]
Abstract
Cilostazol is a unique platelet inhibitor that has been used clinically for more than 20 years. As a phosphodiesterase type III inhibitor, cilostazol is capable of reversible inhibition of platelet aggregation and vasodilation, has antiproliferative effects, and is widely used in the treatment of peripheral arterial disease, cerebrovascular disease, percutaneous coronary intervention, etc. This article briefly reviews the pharmacological mechanisms and clinical application of cilostazol.
Affiliation(s)
- Huilei Zheng
- Department of Medical Examination & Health Management, First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China; Guangxi Key Laboratory of Precision Medicine in Cardio-cerebrovascular Diseases Control and Prevention, Nanning, Guangxi, China; Guangxi Clinical Research Center for Cardio-cerebrovascular Diseases, Nanning, Guangxi, China
- Hua Yang
- Guangxi Key Laboratory of Precision Medicine in Cardio-cerebrovascular Diseases Control and Prevention, Nanning, Guangxi, China; Guangxi Clinical Research Center for Cardio-cerebrovascular Diseases, Nanning, Guangxi, China; Department of Critical Care Medicine, Second People's Hospital of Nanning, Nanning, Guangxi, China
- Danping Gong
- Guangxi Key Laboratory of Precision Medicine in Cardio-cerebrovascular Diseases Control and Prevention, Nanning, Guangxi, China; Guangxi Clinical Research Center for Cardio-cerebrovascular Diseases, Nanning, Guangxi, China; Elderly Cardiology Ward, First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China
- Lanxian Mai
- Guangxi Key Laboratory of Precision Medicine in Cardio-cerebrovascular Diseases Control and Prevention, Nanning, Guangxi, China; Guangxi Clinical Research Center for Cardio-cerebrovascular Diseases, Nanning, Guangxi, China; Disciplinary Construction Office, First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China
- Xiaoling Qiu
- Guangxi Key Laboratory of Precision Medicine in Cardio-cerebrovascular Diseases Control and Prevention, Nanning, Guangxi, China; Guangxi Clinical Research Center for Cardio-cerebrovascular Diseases, Nanning, Guangxi, China
- Lidai Chen
- Guangxi Key Laboratory of Precision Medicine in Cardio-cerebrovascular Diseases Control and Prevention, Nanning, Guangxi, China; Guangxi Clinical Research Center for Cardio-cerebrovascular Diseases, Nanning, Guangxi, China
- Xiaozhou Su
- Guangxi Key Laboratory of Precision Medicine in Cardio-cerebrovascular Diseases Control and Prevention, Nanning, Guangxi, China; Guangxi Clinical Research Center for Cardio-cerebrovascular Diseases, Nanning, Guangxi, China
- Ruoqi Wei
- Department of Computer Science and Engineering, University of Bridgeport, 126 Park Ave, Bridgeport, CT 06604, United States
- Zhiyu Zeng
- Guangxi Key Laboratory of Precision Medicine in Cardio-cerebrovascular Diseases Control and Prevention, Nanning, Guangxi, China; Guangxi Clinical Research Center for Cardio-cerebrovascular Diseases, Nanning, Guangxi, China; Elderly Cardiology Ward, First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China
|
36
|
Alaa A, Eldeib AM, Metwally AA. Protein Subcellular Localization Prediction Based on Internal Micro-similarities of Markov Chains. Annual International Conference of the IEEE Engineering in Medicine and Biology Society 2020; 2019:1355-1358. [PMID: 31946144 DOI: 10.1109/embc.2019.8857598] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 11/08/2022]
Abstract
Elucidating protein subcellular localization is an essential topic in proteomics research due to its importance in the process of drug discovery. Unfortunately, experimentally uncovering protein subcellular targets is an arduous process that may not result in a successful localization. In contrast, computational methods can rapidly predict protein subcellular targets and are an efficient alternative to experimental methods for unannotated proteins. In this work, we introduce a new method to predict protein subcellular localization which increases the predictive power of generative probabilistic models while preserving their explanatory benefit. Our method exploits Markov models to produce a feature vector that records micro-similarities between the underlying probability distributions of a given sequence and their counterparts in reference models. Compared to ordinary Markov chain inference, we show that our method improves overall accuracy by 10% under 10-fold cross-validation on a dataset consisting of 10 subcellular locations. The source code is publicly available on https://github.com/aametwally/MC MicroSimilarities.
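For orientation, the sketch below shows the kind of "ordinary Markov chain inference" the abstract uses as its baseline: one first-order chain is trained per location and a query sequence is assigned to the chain with the highest log-likelihood. It does not implement the authors' micro-similarity features. The location labels, the toy training sequences, and every function name are assumptions made for this example.

```python
import math

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"


def train_chain(sequences, alphabet=ALPHABET):
    """Estimate first-order transition log-probabilities with add-one smoothing."""
    counts = {a: {b: 1.0 for b in alphabet} for a in alphabet}
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            if a in counts and b in counts[a]:
                counts[a][b] += 1.0
    return {
        a: {b: math.log(c / sum(row.values())) for b, c in row.items()}
        for a, row in counts.items()
    }


def log_likelihood(seq, chain):
    """Sum transition log-probabilities, ignoring symbols outside the alphabet."""
    return sum(
        chain[a][b] for a, b in zip(seq, seq[1:]) if a in chain and b in chain[a]
    )


def predict(seq, chains):
    """Return the label whose chain gives the sequence the highest log-likelihood."""
    return max(chains, key=lambda label: log_likelihood(seq, chains[label]))


# Toy, made-up training data: two hypothetical locations with distinct residue usage.
chains = {
    "membrane": train_chain(["LLLLLLLL", "LLVLLLVL"]),
    "nucleus": train_chain(["KRKRKRKR", "KKRRKKRR"]),
}

if __name__ == "__main__":
    print(predict("LLLL", chains), predict("KRKR", chains))
```

The per-class chains remain inspectable as transition tables, which is the explanatory benefit of generative models that the abstract says the proposed method preserves.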
|
37
|
Some illuminating remarks on molecular genetics and genomics as well as drug development. Mol Genet Genomics 2020; 295:261-274. [PMID: 31894399 DOI: 10.1007/s00438-019-01634-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 11/21/2019] [Accepted: 12/05/2019] [Indexed: 02/07/2023]
Abstract
Facing the explosive growth of biological sequences unearthed in the post-genomic age, one of the most important but also most difficult problems in computational biology is how to express a biological sequence with a discrete model or a vector while still retaining considerable sequence-order information or its special pattern. To deal with this challenging problem, the ideas of "pseudo amino acid components" and "pseudo K-tuple nucleotide composition" have been proposed. These ideas and their approaches have further stimulated the birth of the "distorted key theory" and the "wenxiang diagram", substantially strengthened the power to treat multi-label systems, and led to the establishment of the famous "5-steps rule". All these logical developments are quite natural and are very useful not only for theoretical scientists but also for experimental scientists conducting genetics/genomics analysis and drug development. This review paper also presents their future perspectives; i.e., their impacts will become even more significant and profound.
|
38
|
Chou KC. Impacts of Pseudo Amino Acid Components and 5-steps Rule to Proteomics and Proteome Analysis. Curr Top Med Chem 2019; 19:2283-2300. [DOI: 10.2174/1568026619666191018100141] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Received: 08/07/2019] [Revised: 08/18/2019] [Accepted: 08/26/2019] [Indexed: 01/27/2023]
Abstract
Stimulated by the 5-steps rule during the last decade or so, computational proteomics has achieved remarkable progress in the following three areas: (1) protein structural class prediction; (2) protein subcellular location prediction; (3) post-translational modification (PTM) site prediction. The results obtained by these predictions are very useful not only for an in-depth study of the functions of proteins and their biological processes in a cell, but also for developing novel drugs against major diseases such as cancers, Alzheimer’s, and Parkinson’s. Moreover, since the targets to be predicted may have the multi-label feature, two sets of metrics are introduced: one for inspecting the global prediction quality, the other for the local prediction quality. All the predictors covered in this review have a user-friendly web-server, through which the majority of experimental scientists can easily obtain their desired data without the need to go through the complicated mathematics.
Affiliation(s)
- Kuo-Chen Chou
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, 610054, China
|
39
|
Bonetta R, Valentino G. Machine learning techniques for protein function prediction. Proteins 2019; 88:397-413. [PMID: 31603244 DOI: 10.1002/prot.25832] [Citation(s) in RCA: 76] [Impact Index Per Article: 12.7] [Received: 03/19/2019] [Revised: 07/05/2019] [Accepted: 09/17/2019] [Indexed: 12/17/2022]
Abstract
Proteins play important roles in living organisms, and their function is directly linked with their structure. Due to the growing gap between the number of proteins being discovered and their functional characterization (in particular as a result of experimental limitations), reliable prediction of protein function through computational means has become crucial. This paper reviews the machine learning techniques used in the literature, following their evolution from simple algorithms such as logistic regression to more advanced methods like support vector machines and modern deep neural networks. Hyperparameter optimization methods adopted to boost prediction performance are presented. In parallel, the metamorphosis in the features used by these algorithms from classical physicochemical properties and amino acid composition, up to text-derived features from biomedical literature and learned feature representations using autoencoders, together with feature selection and dimensionality reduction techniques, are also reviewed. The success stories in the application of these techniques to both general and specific protein function prediction are discussed.
Affiliation(s)
- Rosalin Bonetta
- Centre for Molecular Medicine and Biobanking, University of Malta, Msida, Malta
- Gianluca Valentino
- Department of Communications and Computer Engineering, University of Malta, Msida, Malta
|
40
|
Xuan P, Cui H, Shen T, Sheng N, Zhang T. HeteroDualNet: A Dual Convolutional Neural Network With Heterogeneous Layers for Drug-Disease Association Prediction via Chou's Five-Step Rule. Front Pharmacol 2019; 10:1301. [PMID: 31780934 PMCID: PMC6856670 DOI: 10.3389/fphar.2019.01301] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 07/01/2019] [Accepted: 10/11/2019] [Indexed: 11/14/2022] Open
Abstract
Identifying new treatments for existing drugs can help reduce drug development costs and explore novel indications of drugs. The prediction of associations between drugs and diseases is challenging because their similarities and relations are complicated and non-linear. We propose a HeteroDualNet model to address this issue. Firstly, three types of matrices are extracted to represent intra-drug similarities, intra-disease similarity and drug-disease associations. The intra-drug similarities consider three drug features and a newly introduced drug-related disease correlation. Secondly, an embedding mechanism is proposed to integrate these matrices in a heterogenous drug-disease association layer (hetero-layer). Further, a neighbouring heterogeneous layer (hetero-layer-N) is constructed to incorporate the biological premise that similar drugs can often treat related diseases. Finally, a dual convolutional neural network is built with hetero-layer and hetero-layer-N as two branches to learn from characteristics of drug-disease and the relations of their neighbours simultaneously. HeteroDualNet outperformed the other four methods in comparison over a public dataset of 763 drugs and 681 diseases in terms of Areas Under the Curves of Receiver Operating Characteristics and Precision-Recall, and recall rate at top k. Case study of five drugs further proved the capacity of HeteroDualNet in finding reliable disease candidates of drugs as validated by database records or literature. Our findings show that the embedded heterogenous layers of original and neighbouring drug-disease representations in a dual neural network improved the association prediction performance.
Affiliation(s)
- Ping Xuan
- School of Computer Science and Technology, Heilongjiang University, Harbin, China
- Hui Cui
- Department of Computer Science and Information Technology, La Trobe University, Bundoora, VIC, Australia
- Tonghui Shen
- School of Computer Science and Technology, Heilongjiang University, Harbin, China
- Nan Sheng
- School of Computer Science and Technology, Heilongjiang University, Harbin, China
- Tiangang Zhang
- School of Mathematical Science, Heilongjiang University, Harbin, China
|
41
|
Behbahani M, Nosrati M, Moradi M, Mohabatkar H. Using Chou's General Pseudo Amino Acid Composition to Classify Laccases from Bacterial and Fungal Sources via Chou's Five-Step Rule. Appl Biochem Biotechnol 2019; 190:1035-1048. [PMID: 31659712 DOI: 10.1007/s12010-019-03141-8] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 06/14/2019] [Accepted: 09/12/2019] [Indexed: 01/28/2023]
Abstract
Laccases are a group of enzymes with a critical activity in the degradation of both phenolic and non-phenolic compounds. These enzymes are present in a diverse array of species, including fungi and bacteria. Since these enzymes are on the market for uses ranging from industry to medicine, better knowledge of their structures and properties from diverse sources will be useful for selecting the most appropriate candidate for different purposes. In the current study, sequence- and structure-based characteristics of these enzymes from fungi and bacteria, including pseudo amino acid composition (PseAAC), physicochemical characteristics, and secondary structures, are compared and classified. Autodock 4 software was used for docking analysis between these laccases and some phenolic and non-phenolic compounds. The results indicated that features including molecular weight, aliphatic, extinction coefficient, and random coil percentage of these protein groups show high degrees of diversity in most cases. Categorization of these enzymes using PseAAC showed over 96% accuracy. The binding free energy between fungal laccases and their substrates was shown to be considerably higher than that of bacterial ones. According to the outcomes of the current study, data mining methods using different machine learning algorithms, especially neural networks, could provide valuable information for a fair comparison between fungal and bacterial laccases. These results also suggest an association between efficacy and physicochemical features of laccase enzymes from different sources.
Affiliation(s)
- Mandana Behbahani
- Department of Biotechnology, Faculty of Biological Science and Technology, University of Isfahan, Isfahan, Iran
- Mokhtar Nosrati
- Department of Biotechnology, Faculty of Biological Science and Technology, University of Isfahan, Isfahan, Iran
- Mohammad Moradi
- Department of Biotechnology, Faculty of Biological Science and Technology, University of Isfahan, Isfahan, Iran
- Hassan Mohabatkar
- Department of Biotechnology, Faculty of Biological Science and Technology, University of Isfahan, Isfahan, Iran
|
42
|
Chou KC. Advances in Predicting Subcellular Localization of Multi-label Proteins and its Implication for Developing Multi-target Drugs. Curr Med Chem 2019; 26:4918-4943. [PMID: 31060481 DOI: 10.2174/0929867326666190507082559] [Citation(s) in RCA: 78] [Impact Index Per Article: 13.0] [Received: 09/28/2018] [Revised: 01/29/2019] [Accepted: 01/31/2019] [Indexed: 12/16/2022]
Abstract
The smallest unit of life is a cell, which contains numerous protein molecules. Most of the functions critical to the cell’s survival are performed by these proteins located in its different organelles, usually called “subcellular locations”. Information on the subcellular localization of a protein can provide useful clues about its function. To reveal the intricate pathways at the cellular level, knowledge of the subcellular localization of proteins in a cell is a prerequisite. Therefore, one of the fundamental goals in molecular cell biology and proteomics is to determine the subcellular locations of proteins in an entire cell. It is also indispensable for prioritizing and selecting the right targets for drug development. Unfortunately, it is both time-consuming and costly to determine the subcellular locations of proteins purely based on experiments. With the avalanche of protein sequences generated in the post-genomic age, it is highly desirable to develop computational methods for rapidly and effectively identifying the subcellular locations of uncharacterized proteins based on their sequence information alone. Considerable progress has been achieved in this regard. This review focuses on those methods that can deal with multi-label proteins, which may simultaneously exist in two or more subcellular location sites. Protein molecules with this kind of characteristic are vitally important for finding multi-target drugs, a current hot trend in drug development. This review also focuses on those methods that have user-friendly web-servers established so that the majority of experimental scientists can use them to get the desired results without the need to go through the detailed mathematics involved.
Affiliation(s)
- Kuo-Chen Chou
- Gordon Life Science Institute, Boston, MA 02478, United States
|
43
|
|
44
|
Lan J, Liu Z, Liao C, Merkler DJ, Han Q, Li J. A Study for Therapeutic Treatment against Parkinson's Disease via Chou's 5-steps Rule. Curr Top Med Chem 2019; 19:2318-2333. [PMID: 31629395 DOI: 10.2174/1568026619666191019111528] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 05/22/2019] [Revised: 08/05/2019] [Accepted: 08/22/2019] [Indexed: 11/22/2022]
Abstract
The enzyme L-DOPA decarboxylase (DDC), also called aromatic-L-amino-acid decarboxylase, catalyzes the biosynthesis of dopamine, serotonin, and trace amines. Its deficiency or perturbations in its expression result in severe motor dysfunction or a range of neurodegenerative and psychiatric disorders. A DDC substrate, L-DOPA, combined with an inhibitor of the enzyme is still the most effective treatment for symptoms of Parkinson's disease. In this review, we provide an update on the structures, functions, and inhibitors of DDC, particularly with regard to the treatment of Parkinson's disease. This information will provide insight into the pharmacological treatment of Parkinson's disease.
Affiliation(s)
- Jianqiang Lan
  - Key Laboratory of Tropical Biological Resources of Ministry of Education, School of Life and Pharmaceutical Sciences, Hainan University, Haikou, Hainan 570228, China
- Zhongqiang Liu
  - Key Laboratory of Tropical Biological Resources of Ministry of Education, School of Life and Pharmaceutical Sciences, Hainan University, Haikou, Hainan 570228, China
- Chenghong Liao
  - Key Laboratory of Tropical Biological Resources of Ministry of Education, School of Life and Pharmaceutical Sciences, Hainan University, Haikou, Hainan 570228, China
- David J Merkler
  - Department of Chemistry, University of South Florida, Tampa, FL 33620, United States
- Qian Han
  - Key Laboratory of Tropical Biological Resources of Ministry of Education, School of Life and Pharmaceutical Sciences, Hainan University, Haikou, Hainan 570228, China
- Jianyong Li
  - Department of Biochemistry, Virginia Tech, Blacksburg, VA 24061, United States
|
45
|
Liang R, Xie J, Zhang C, Zhang M, Huang H, Huo H, Cao X, Niu B. Identifying Cancer Targets Based on Machine Learning Methods via Chou's 5-steps Rule and General Pseudo Components. Curr Top Med Chem 2019; 19:2301-2317. [PMID: 31622219 DOI: 10.2174/1568026619666191016155543] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 07/14/2019] [Revised: 07/19/2019] [Accepted: 08/26/2019] [Indexed: 01/09/2023]
Abstract
In recent years, the successful implementation of the Human Genome Project has made it clear that genetic, environmental, and lifestyle factors should be studied together to understand cancer, owing to the complexity and varied forms of the disease. The increasing availability and growth rate of 'big data' derived from various omics opens a new window for the study and therapy of cancer. In this paper, we introduce the application of machine learning methods to cancer big data, including artificial neural networks, support vector machines, ensemble learning, and naïve Bayes classifiers.
Affiliation(s)
- Ruirui Liang
  - School of Life Sciences, Shanghai University, Shanghai, 200444, China
- Jiayang Xie
  - School of Life Sciences, Shanghai University, Shanghai, 200444, China
- Chi Zhang
  - Foshan Huaxia Eye Hospital, Huaxia Eye Hospital Group, Foshan 528000, China
- Mengying Zhang
  - School of Life Sciences, Shanghai University, Shanghai, 200444, China
- Hai Huang
  - School of Life Sciences, Shanghai University, Shanghai, 200444, China
- Haizhong Huo
  - Department of General Surgery, Shanghai Ninth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai 200011, China
- Xin Cao
  - Zhongshan Hospital, Institute of Clinical Science, Shanghai Medical College, Fudan University, Shanghai 200032, China
- Bing Niu
  - School of Life Sciences, Shanghai University, Shanghai, 200444, China
|
46
|
Identifying DNase I hypersensitive sites using multi-features fusion and F-score features selection via Chou's 5-steps rule. Biophys Chem 2019; 253:106227. [DOI: 10.1016/j.bpc.2019.106227] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.8] [Received: 04/19/2019] [Revised: 07/04/2019] [Accepted: 07/10/2019] [Indexed: 01/12/2023]
|
47
|
Vishnoi S, Garg P, Arora P. Physicochemical n-Grams Tool: A tool for protein physicochemical descriptor generation via Chou's 5-step rule. Chem Biol Drug Des 2019; 95:79-86. [PMID: 31483930 DOI: 10.1111/cbdd.13617] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 05/07/2019] [Revised: 08/23/2019] [Accepted: 08/26/2019] [Indexed: 12/21/2022]
Abstract
Physicochemical n-Grams Tool (PnGT) is an open-source standalone software package for calculating physicochemical descriptors of proteins. PnGT was developed in the Python scripting language, with a user interface built using Tkinter. The software currently calculates 33 physicochemical descriptors, along with the sequence length, for a given protein primary sequence. The descriptors generated by this tool can be used directly as the feature vector for developing statistical or machine learning predictive models in proteomics.
Affiliation(s)
- Shubham Vishnoi
  - Department of Pharmacoinformatics, National Institute of Pharmaceutical Education and Research, Mohali, India
- Prabha Garg
  - Department of Pharmacoinformatics, National Institute of Pharmaceutical Education and Research, Mohali, India
- Pooja Arora
  - Department of Pharmacoinformatics, National Institute of Pharmaceutical Education and Research, Mohali, India
|
48
|
Liu Z, Dong W, Jiang W, He Z. csDMA: an improved bioinformatics tool for identifying DNA 6 mA modifications via Chou's 5-step rule. Sci Rep 2019; 9:13109. [PMID: 31511570 PMCID: PMC6739324 DOI: 10.1038/s41598-019-49430-4] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 04/26/2019] [Accepted: 08/24/2019] [Indexed: 12/31/2022] Open
Abstract
DNA N6-methyldeoxyadenosine (6 mA) modifications were first found more than 60 years ago but were thought to be widespread only in prokaryotes and unicellular eukaryotes. With the development of high-throughput sequencing technology, 6 mA modifications have been found in various multicellular eukaryotes using experimental methods. However, the experimental methods are time-consuming and costly, which makes it necessary to develop computational methods instead. In this study, a machine learning-based prediction tool, named csDMA, was developed for predicting 6 mA modifications. Firstly, three feature encoding schemes, Motif, Kmer, and Binary, were used to generate the feature matrix. Secondly, different algorithms were evaluated for the prediction model, and the ExtraTrees model achieved the best AUC of 0.878 using 5-fold cross-validation on the training dataset. The ExtraTrees model also achieved the best AUC of 0.893 on the independent testing dataset. Finally, we compared our method with state-of-the-art predictors, and the results showed that our model achieved better performance than existing tools.
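The abstract names the Kmer and Binary encoding schemes without defining them. As a rough illustration only, the sketch below shows one common way such encodings are built for DNA sequences: normalized k-mer frequencies and per-base one-hot vectors. The function names (`kmer_frequencies`, `binary_encoding`, `encode`), the choice of k = 2, and the handling of ambiguous bases are assumptions made for this example, not details taken from csDMA; the Motif scheme is omitted because the abstract gives no description of it.

```python
from itertools import product

ALPHABET = "ACGT"


def kmer_frequencies(seq, k=2):
    """Return normalized counts of all 4**k k-mers, in lexicographic order."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows with ambiguous bases such as N
            counts[window] += 1
    total = sum(counts.values())
    return [counts[m] / total if total else 0.0 for m in kmers]


def binary_encoding(seq):
    """One-hot encode each base as four bits; unknown bases become all zeros."""
    vec = []
    for base in seq.upper():
        vec.extend(1 if base == letter else 0 for letter in ALPHABET)
    return vec


def encode(seq, k=2):
    """Concatenate both encodings into a single feature vector (hypothetical helper)."""
    return kmer_frequencies(seq, k) + binary_encoding(seq)


if __name__ == "__main__":
    features = encode("ACGTAC", k=2)
    print(len(features))  # 16 k-mer frequencies plus 4 bits per base
```

The binary encoding grows with sequence length, so combining it with other features into a fixed-width matrix assumes that all input windows have the same length. Feature vectors of this kind could then be passed to a tree ensemble classifier such as the ExtraTrees model mentioned in the abstract.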
Affiliation(s)
- Ze Liu
  - College of Water Resources and Architectural Engineering, Northwest A&F University, Yangling, 712100, Shaanxi, China; Key Laboratory of Agricultural Soil and Water Engineering in Arid and Semiarid Areas, Ministry of Education, Northwest A&F University, Yangling, 712100, Shaanxi, China
- Wei Dong
  - College of Water Resources and Architectural Engineering, Northwest A&F University, Yangling, 712100, Shaanxi, China; Key Laboratory of Agricultural Soil and Water Engineering in Arid and Semiarid Areas, Ministry of Education, Northwest A&F University, Yangling, 712100, Shaanxi, China
- Wei Jiang
  - College of Water Resources and Architectural Engineering, Northwest A&F University, Yangling, 712100, Shaanxi, China; Key Laboratory of Agricultural Soil and Water Engineering in Arid and Semiarid Areas, Ministry of Education, Northwest A&F University, Yangling, 712100, Shaanxi, China
- Zili He
  - College of Water Resources and Architectural Engineering, Northwest A&F University, Yangling, 712100, Shaanxi, China; Key Laboratory of Agricultural Soil and Water Engineering in Arid and Semiarid Areas, Ministry of Education, Northwest A&F University, Yangling, 712100, Shaanxi, China
|
49
|
Chou KC. Proposing Pseudo Amino Acid Components is an Important Milestone for Proteome and Genome Analyses. Int J Pept Res Ther 2019. [DOI: 10.1007/s10989-019-09910-7] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Indexed: 12/28/2022]
|
50
|
|