1
Khan S, Noor S, Awan HH, Iqbal S, AlQahtani SA, Dilshad N, Ahmad N. Deep-ProBind: binding protein prediction with transformer-based deep learning model. BMC Bioinformatics 2025; 26:88. [PMID: 40121399 PMCID: PMC11929993 DOI: 10.1186/s12859-025-06101-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2024] [Accepted: 03/04/2025] [Indexed: 03/25/2025] Open
Abstract
Binding proteins play a crucial role in biological systems by selectively interacting with specific molecules, such as DNA, RNA, or peptides, to regulate various cellular processes. Their ability to recognize and bind target molecules with high specificity makes them essential for signal transduction, transport, and enzymatic activity. Traditional experimental methods for identifying protein-binding peptides are costly and time-consuming. Current sequence-based approaches often struggle with accuracy, focusing too narrowly on proximal sequence features and ignoring structural data. This study presents Deep-ProBind, a prediction model designed to classify protein binding sites by integrating sequence and structural information. The proposed model encodes peptides with transformer-based and evolutionary-based representations, i.e., Bidirectional Encoder Representations from Transformers (BERT) and the Pseudo Position-Specific Scoring Matrix-Discrete Wavelet Transform (PsePSSM-DWT) approach. The SHapley Additive exPlanations (SHAP) algorithm selects the optimal hybrid features, and a Deep Neural Network (DNN) is then used as the classification algorithm to predict protein-binding peptides. The performance of the proposed model was evaluated in comparison with traditional Machine Learning (ML) algorithms and existing models. Experimental results demonstrate that Deep-ProBind achieved 92.67% accuracy with tenfold cross-validation on benchmark datasets and 93.62% accuracy on independent samples. Deep-ProBind outperforms existing models by 3.57% on training data and 1.52% on independent tests. These results demonstrate Deep-ProBind's reliability and effectiveness, making it a valuable tool for researchers and a potential resource in pharmacological studies, where peptide binding plays a critical role in therapeutic development.
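As an illustration of the evolutionary encoding named in this abstract, the following is a minimal sketch of the commonly used PsePSSM definition: column means of the PSSM followed by lagged mean squared differences. It is not the authors' implementation; the function name `pse_pssm`, the `max_lag` parameter, and the assumption that the matrix has already been normalized are hypothetical choices made for this example, and the DWT step is omitted.

```python
from typing import List


def pse_pssm(pssm: List[List[float]], max_lag: int = 2) -> List[float]:
    """Turn an L x C score matrix into a fixed-length vector of size C * (1 + max_lag).

    Assumption: `pssm` is already normalized; each row is one sequence position.
    """
    length = len(pssm)
    if length == 0:
        raise ValueError("PSSM must have at least one row")
    if max_lag >= length:
        raise ValueError("max_lag must be smaller than the sequence length")
    n_cols = len(pssm[0])

    # Part 1: mean score of each column over all sequence positions.
    features = [sum(row[j] for row in pssm) / length for j in range(n_cols)]

    # Part 2: for each lag, the mean squared difference between rows `lag` apart.
    for lag in range(1, max_lag + 1):
        for j in range(n_cols):
            total = sum(
                (pssm[i][j] - pssm[i + lag][j]) ** 2 for i in range(length - lag)
            )
            features.append(total / (length - lag))
    return features
```

Because the output length depends only on the number of columns and `max_lag`, peptides of different lengths map to vectors of the same size, which is what allows them to be fed to a fixed-input classifier.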
Affiliation(s)
- Salman Khan
  - Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, KPK, Pakistan
- Sumaiya Noor
  - Business and Management Sciences Department, Purdue University, West Lafayette, IN, USA
- Hamid Hussain Awan
  - Department of Computer Science, Rawalpindi Women University, Rawalpindi, 46300, Punjab, Pakistan
- Shehryar Iqbal
  - School of Physics, Engineering and Computer Science, University of Hertfordshire, Hatfield, UK
- Salman A AlQahtani
  - New Emerging Technologies and 5G Network and Beyond Research Chair, Department of Computer Engineering, College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia
- Naqqash Dilshad
  - Department of Computer Science & Engineering, Sejong University, Seoul, 05006, South Korea
- Nijad Ahmad
  - Department of Computer Science, Khurasan University, Jalalabad, Afghanistan
2
Shahid, Hayat M, Alghamdi W, Akbar S, Raza A, Kadir RA, Sarker MR. pACP-HybDeep: predicting anticancer peptides using binary tree growth based transformer and structural feature encoding with deep-hybrid learning. Sci Rep 2025; 15:565. [PMID: 39747941 PMCID: PMC11695694 DOI: 10.1038/s41598-024-84146-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2024] [Accepted: 12/20/2024] [Indexed: 01/04/2025] Open
Abstract
Worldwide, cancer remains a significant health concern due to its high mortality rates. Despite numerous traditional therapies and wet-laboratory methods for treating cancer-affected cells, these approaches often face limitations, including high costs and substantial side effects. Recently, the high selectivity of peptides has garnered significant attention from scientists due to their reliable targeted actions and minimal adverse effects. Building on the significant outcomes of existing computational models, we propose a highly reliable and effective model, namely pACP-HybDeep, for the accurate prediction of anticancer peptides. In this model, training peptides are numerically encoded using an attention-based ProtBERT-BFD encoder to extract semantic features along with CTDT-based structural information. Furthermore, a k-nearest neighbor-based binary tree growth (BTG) algorithm is employed to select an optimal feature set from the multi-perspective vector. The selected feature vector is subsequently used to train a CNN + RNN-based deep learning model. Our proposed pACP-HybDeep model demonstrated a high training accuracy of 95.33% and an AUC of 0.97. To validate its generalization capabilities, the pACP-HybDeep model was tested on the independent datasets Ind-S1, Ind-S2, and Ind-S3, where it achieved accuracies of 94.92%, 92.26%, and 91.16%, respectively. The demonstrated efficacy and reliability of the pACP-HybDeep model on test datasets establish it as a valuable tool for researchers in academia and pharmaceutical drug design.
Affiliation(s)
- Shahid
  - Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, 23200, KP, Pakistan
- Maqsood Hayat
  - Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, 23200, KP, Pakistan
- Wajdi Alghamdi
  - Department of Information Technology, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia
- Shahid Akbar
  - Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, 23200, KP, Pakistan
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, 610054, China
- Ali Raza
  - Department of Computer Science, MY University, Islamabad, 45750, Pakistan
- Rabiah Abdul Kadir
  - Institute of Visual Informatics, Universiti Kebangsaan Malaysia, 43600, Bangi, Selangor, Malaysia
- Mahidur R Sarker
  - Institute of Visual Informatics, Universiti Kebangsaan Malaysia, 43600, Bangi, Selangor, Malaysia
  - Universidad de Diseño, Innovación y Tecnología, UDIT, Av. Alfonso XIII, 97, 28016, Madrid, Spain
3
Guo C, Wang X, Ren H. Databases and computational methods for the identification of piRNA-related molecules: A survey. Comput Struct Biotechnol J 2024; 23:813-833. [PMID: 38328006 PMCID: PMC10847878 DOI: 10.1016/j.csbj.2024.01.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Revised: 12/31/2023] [Accepted: 01/15/2024] [Indexed: 02/09/2024] Open
Abstract
Piwi-interacting RNAs (piRNAs) are a class of small non-coding RNAs (ncRNAs) that play important roles in many biological processes and in the diagnosis and treatment of major cancers, and have thus become a hot research topic. This study aims to provide an in-depth review of computational piRNA-related research, including databases and computational models. Herein, we perform literature analysis and use comparative evaluation methods to summarize and analyze three aspects of computational piRNA-related research: (i) computational models for piRNA-related molecular identification tasks, (ii) computational models for piRNA-disease association prediction tasks, and (iii) computational resources and evaluation metrics for these tasks. This study shows that computational piRNA-related research has progressed significantly and exhibited promising performance in recent years, although it still suffers from the emerging challenges of inconsistent naming systems and a lack of data. Unlike other reviews of piRNA-related identification tasks, which focus on the organization of datasets and computational methods, we pay more attention to the analysis of computational models, algorithms, and performance, aiming to provide valuable references for computational piRNA-related identification tasks. This study will benefit the theoretical development and practical application of piRNAs by improving the understanding of the computational models and resources used to investigate the biological functions and clinical implications of piRNAs.
Affiliation(s)
- Chang Guo
  - Laboratory of Language Engineering and Computing, Guangdong University of Foreign Studies, Guangzhou 510420, China
- Xiaoli Wang
  - Institute of Reproductive Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430030, China
- Han Ren
  - Laboratory of Language Engineering and Computing, Guangdong University of Foreign Studies, Guangzhou 510420, China
  - Laboratory of Language and Artificial Intelligence, Guangdong University of Foreign Studies, Guangzhou 510420, China
4
Noor S, Naseem A, Awan HH, Aslam W, Khan S, AlQahtani SA, Ahmad N. Deep-m5U: a deep learning-based approach for RNA 5-methyluridine modification prediction using optimized feature integration. BMC Bioinformatics 2024; 25:360. [PMID: 39563239 DOI: 10.1186/s12859-024-05978-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2024] [Accepted: 11/06/2024] [Indexed: 11/21/2024] Open
Abstract
BACKGROUND: RNA 5-methyluridine (m5U) modifications play a crucial role in biological processes, making their accurate identification a key focus in computational biology. This paper introduces Deep-m5U, a robust predictor designed to enhance the prediction of m5U modifications. The proposed method utilizes a hybrid pseudo-K-tuple nucleotide composition (PseKNC) for sequence formulation, a SHapley Additive exPlanations (SHAP) algorithm for discriminant feature selection, and a deep neural network (DNN) as the classifier. RESULTS: The model was evaluated using two benchmark datasets, i.e., Full Transcript and Mature mRNA. Deep-m5U achieved overall accuracies of 91.47% and 95.86% for the Full Transcript and Mature mRNA datasets, respectively, with 10-fold cross-validation, and attained 92.94% and 95.17% accuracy on independent samples. CONCLUSION: Compared to existing models, Deep-m5U showed approximately 5.23% and 3.73% higher accuracy on the training data and 3.95% and 3.26% higher accuracy on independent samples for the Full Transcript and Mature mRNA datasets, respectively. The reliability and effectiveness of Deep-m5U make it a valuable tool for scientists and a potential asset in pharmaceutical design and research.
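The sequence formulation step can be pictured with a simplified sketch. PseKNC extends plain k-tuple nucleotide composition with physicochemical correlation terms; the example below covers only the k-tuple frequency part and should be read as an assumption-laden illustration rather than the encoding used in the paper. The function name `kmer_composition` and its parameters are hypothetical.

```python
from itertools import product
from typing import Dict


def kmer_composition(seq: str, k: int = 2, alphabet: str = "ACGU") -> Dict[str, float]:
    """Return the normalized frequency of every k-tuple over `alphabet`.

    Simplification: this is only the composition part of PseKNC; the
    pseudo (correlation) components are not computed here.
    """
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as RNA
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = {kmer: 0 for kmer in kmers}
    windows = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # windows with ambiguous bases such as N are skipped
            counts[kmer] += 1
            windows += 1
    if windows == 0:
        return {kmer: 0.0 for kmer in kmers}
    return {kmer: count / windows for kmer, count in counts.items()}
```

With `k=2` the vector has 16 entries and with `k=3` it has 64, so the feature dimension grows as 4 to the power of k.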
Affiliation(s)
- Sumaiya Noor
  - Business and Management Sciences Department, Purdue University, West Lafayette, IN, USA
- Afshan Naseem
  - Institute of Oceanography and Environment (INOS), Universiti Malaysia Terengganu, 21030, Kuala Nerus, Terengganu, Malaysia
- Hamid Hussain Awan
  - Department of Computer Science, Muslim Youth University, Islamabad, Pakistan
- Wasiq Aslam
  - Department of Computer Science, Muslim Youth University, Islamabad, Pakistan
- Salman Khan
  - New Emerging Technologies and 5G Network and Beyond Research Chair, Department of Computer Engineering, College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia
- Salman A AlQahtani
  - New Emerging Technologies and 5G Network and Beyond Research Chair, Department of Computer Engineering, College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia
- Nijad Ahmad
  - Department of Computer Science, Khurasan University, Jalalabad, Afghanistan
5
Khan S, AlQahtani SA, Noor S, Ahmad N. PSSM-Sumo: deep learning based intelligent model for prediction of sumoylation sites using discriminative features. BMC Bioinformatics 2024; 25:284. [PMID: 39215231 PMCID: PMC11363370 DOI: 10.1186/s12859-024-05917-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2024] [Accepted: 08/27/2024] [Indexed: 09/04/2024] Open
Abstract
Post-translational modifications (PTMs) are fundamental to essential biological processes, exerting significant influence over gene expression, protein localization, stability, and genome replication. Sumoylation, a PTM involving the covalent addition of a chemical group to a specific protein sequence, profoundly impacts the functional diversity of proteins. Notably, identifying sumoylation sites has garnered significant attention due to their crucial roles in proteomic functions and their implications in various diseases, including Parkinson's and Alzheimer's. Although several computational models have been proposed for identifying sumoylation sites, their effectiveness is constrained by the limitations of conventional learning methodologies. In this study, we introduce PSSM-Sumo, a robust computational model based on the pseudo-position-specific scoring matrix (PsePSSM) and designed to predict sumoylation sites accurately using an optimized deep learning algorithm and efficient feature extraction techniques. Moreover, to streamline computation and eliminate irrelevant and noisy features, sequential forward selection using a support vector machine (SFS-SVM) is implemented to identify optimal features. A multi-layer Deep Neural Network (DNN) serves as a robust classifier, facilitating precise sumoylation site prediction. We assess the performance of PSSM-Sumo through tenfold cross-validation, employing various statistical metrics such as the Matthews Correlation Coefficient (MCC), accuracy, sensitivity, specificity, and the Area under the ROC Curve (AUC). Comparative analyses reveal that PSSM-Sumo achieves an average prediction accuracy of 98.71%, surpassing existing models. The robustness and accuracy of the proposed model position it as a promising tool for advancing drug discovery and the diagnosis of diverse diseases linked to sumoylation sites.
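Several of the statistical metrics listed in this abstract follow directly from a binary confusion matrix. The sketch below uses the standard textbook formulas for accuracy, sensitivity, specificity, and MCC; the function name `binary_metrics` and the returned dictionary keys are hypothetical, and AUC is left out because it requires ranked prediction scores rather than counts.

```python
import math
from typing import Dict


def binary_metrics(tp: int, tn: int, fp: int, fn: int) -> Dict[str, float]:
    """Compute accuracy, sensitivity, specificity, and MCC from confusion counts."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one sample is required")

    accuracy = (tp + tn) / total
    # Sensitivity: fraction of true sites that were predicted as sites.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of non-sites that were predicted as non-sites.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # MCC is defined as 0 here when any marginal total is zero.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }
```

MCC is worth reporting alongside accuracy because it stays informative when positive and negative sites are imbalanced, which is common for modification-site datasets.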
Affiliation(s)
- Salman Khan
  - Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, KPK, Pakistan
- Salman A AlQahtani
  - New Emerging Technologies and 5G Network and Beyond Research Chair, Department of Computer Engineering, College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia
- Sumaiya Noor
  - Business and Management Sciences Department, Purdue University, West Lafayette, IN, USA
- Nijad Ahmad
  - Department of Computer Science, Khurasan University Jalalabad, Jalalabad, Afghanistan
6
Khan S, Uddin I, Khan M, Iqbal N, Alshanbari HM, Ahmad B, Khan DM. Sequence based model using deep neural network and hybrid features for identification of 5-hydroxymethylcytosine modification. Sci Rep 2024; 14:9116. [PMID: 38643305 PMCID: PMC11551160 DOI: 10.1038/s41598-024-59777-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2023] [Accepted: 04/15/2024] [Indexed: 04/22/2024] Open
Abstract
RNA modifications are pivotal in the development of newly synthesized structures, showcasing a vast array of alterations across various RNA classes. Among these, 5-hydroxymethylcytosine (5HMC) stands out, playing a crucial role in gene regulation and epigenetic changes, yet its detection through conventional methods proves cumbersome and costly. To address this, we propose Deep5HMC, a robust learning model leveraging machine learning algorithms and discriminative feature extraction techniques for accurate 5HMC sample identification. Our approach integrates seven feature extraction methods and various machine learning algorithms, including Random Forest, Naive Bayes, Decision Tree, and Support Vector Machine. Through K-fold cross-validation, our model achieved a notable 84.07% accuracy rate, surpassing previous models by 7.59%, signifying its potential in early cancer and cardiovascular disease diagnosis. This study underscores the promise of Deep5HMC in offering insights for improved medical assessment and treatment protocols, marking a significant advancement in RNA modification analysis.
Affiliation(s)
- Salman Khan
  - Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, Pakistan
- Islam Uddin
  - Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, Pakistan
- Mukhtaj Khan
  - Department of Information Technology, The University of Haripur, Haripur, Pakistan
- Nadeem Iqbal
  - Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, Pakistan
- Huda M Alshanbari
  - Department of Mathematical Sciences, College of Science, Princess Nourah bint Abdulrahman University, P.O. Box 84428, 11671, Riyadh, Saudi Arabia
- Bakhtiyar Ahmad
  - Higher Education Department Afghanistan, Kabul, Afghanistan
- Dost Muhammad Khan
  - Department of Statistics, Abdul Wali Khan University Mardan, Mardan, 23200, KP, Pakistan
7
Jha UC, Nayyar H, Roychowdhury R, Prasad PVV, Parida SK, Siddique KHM. Non-coding RNAs (ncRNAs) in plant: Master regulators for adapting to extreme temperature conditions. Plant Physiology and Biochemistry: PPB 2023; 205:108164. [PMID: 38008006 DOI: 10.1016/j.plaphy.2023.108164] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/25/2023] [Revised: 10/30/2023] [Accepted: 11/02/2023] [Indexed: 11/28/2023]
Abstract
Unusual daily temperature fluctuations caused by climate change and climate variability adversely impact agricultural crop production. Since plants are immobile and constantly receive external environmental signals, such as extreme high (heat) and low (cold) temperatures, they have developed complex molecular regulatory mechanisms to cope with stressful situations to sustain their natural growth and development. Among these mechanisms, non-coding RNAs (ncRNAs), particularly microRNAs (miRNAs), small-interfering RNAs (siRNAs), and long-non-coding RNAs (lncRNAs), play a significant role in enhancing heat and cold stress tolerance. This review explores the pivotal findings related to miRNAs, siRNAs, and lncRNAs, elucidating how they functionally regulate plant adaptation to extreme temperatures. In addition, this review addresses the challenges associated with uncovering these non-coding RNAs and understanding their roles in orchestrating heat and cold tolerance in plants.
Affiliation(s)
- Uday Chand Jha
  - Sustainable Intensification Innovation Lab, Kansas State University, Department of Agronomy, Manhattan, KS 66506, USA
  - ICAR-Indian Institute of Pulses Research, Kanpur, Uttar Pradesh 208024, India
- Harsh Nayyar
  - Department of Botany, Panjab University, Chandigarh, 160014, India
- Rajib Roychowdhury
  - Department of Plant Pathology and Weed Research, Institute of Plant Protection, Agricultural Research Organization (ARO) - The Volcani Institute, Rishon Lezion 7505101, Israel
- P V Vara Prasad
  - Sustainable Intensification Innovation Lab, Kansas State University, Department of Agronomy, Manhattan, KS 66506, USA
- Swarup K Parida
  - National Institute of Plant Genomic Research, New Delhi, 110067, India
- Kadambot H M Siddique
  - The UWA Institute of Agriculture, The University of Western Australia, Perth, WA 6001, Australia
8
Raza A, Uddin J, Almuhaimeed A, Akbar S, Zou Q, Ahmad A. AIPs-SnTCN: Predicting Anti-Inflammatory Peptides Using fastText and Transformer Encoder-Based Hybrid Word Embedding with Self-Normalized Temporal Convolutional Networks. J Chem Inf Model 2023; 63:6537-6554. [PMID: 37905969 DOI: 10.1021/acs.jcim.3c01563] [Citation(s) in RCA: 42] [Impact Index Per Article: 21.0] [Indexed: 11/02/2023]
Abstract
Inflammation is a biologically resistant response to harmful stimuli, such as infection, damaged cells, toxic chemicals, or tissue injuries. Its purpose is to eradicate pathogenic micro-organisms or irritants and facilitate tissue repair. Prolonged inflammation can result in chronic inflammatory diseases. However, wet-laboratory-based treatments are costly and time-consuming and may have adverse side effects on normal cells. In the past decade, peptide therapeutics have gained significant attention due to their high specificity in targeting affected cells without affecting healthy cells. Motivated by the significance of peptide-based therapies, we developed a highly discriminative prediction model called AIPs-SnTCN to predict anti-inflammatory peptides accurately. The peptide samples are encoded using word embedding techniques such as skip-gram and attention-based Bidirectional Encoder Representations from Transformers (BERT). The conjoint triad feature (CTF) also collects structure-based cluster profile features. The word embedding and sequential features are fused into a single vector to compensate for the limitations of single encoding methods. Support vector machine-based recursive feature elimination (SVM-RFE) is applied to choose the ranking-based optimal feature space. The optimized feature space is used to train an improved self-normalized temporal convolutional network (SnTCN). The AIPs-SnTCN model achieved a predictive accuracy of 95.86% and an AUC of 0.97 on the training samples. On the alternate training data set, our model obtained an accuracy of 92.04% and an AUC of 0.96. The proposed AIPs-SnTCN model outperformed existing models with an ∼19% higher accuracy and an ∼14% higher AUC value. The reliability and efficacy of our AIPs-SnTCN model make it a valuable tool for scientists, and it may play a beneficial role in pharmaceutical design and academic research.
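The conjoint triad feature mentioned in this abstract can be sketched as follows. The seven-class amino acid grouping used here is the one commonly cited for conjoint triad encoding and is an assumption, since the abstract does not list the classes; the name `conjoint_triad` and the plain frequency normalization are likewise hypothetical simplifications, not the paper's exact procedure.

```python
from typing import List

# Assumed grouping of the 20 standard amino acids into 7 classes.
CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
AA_TO_CLASS = {aa: idx for idx, group in enumerate(CLASSES) for aa in group}


def conjoint_triad(seq: str) -> List[float]:
    """Return a 343-dimensional vector of class-triad frequencies (7 * 7 * 7)."""
    # Simplification: non-standard residues are dropped before windowing.
    codes = [AA_TO_CLASS[aa] for aa in seq.upper() if aa in AA_TO_CLASS]
    counts = [0] * 343
    # Slide a window of three consecutive residues over the class codes.
    for a, b, c in zip(codes, codes[1:], codes[2:]):
        counts[a * 49 + b * 7 + c] += 1
    total = sum(counts)
    return [count / total for count in counts] if total else [0.0] * 343
```

Grouping residues into classes before counting triads keeps the vector at 343 dimensions instead of the 8000 that raw amino acid triplets would require.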
Affiliation(s)
- Ali Raza
  - Department of Physical and Numerical Sciences, Qurtuba University of Science and Information Technology, Peshawar, Khyber Pakhtunkhwa 25124, Pakistan
  - Department of Computer Science, MY University, Islamabad 45750, Pakistan
- Jamal Uddin
  - Department of Physical and Numerical Sciences, Qurtuba University of Science and Information Technology, Peshawar, Khyber Pakhtunkhwa 25124, Pakistan
- Abdullah Almuhaimeed
  - Digital Health Institute, King Abdulaziz City for Science and Technology, Riyadh 11442, Saudi Arabia
- Shahid Akbar
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China
  - Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, Khyber Pakhtunkhwa 23200, Pakistan
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, PR China
- Ashfaq Ahmad
  - Department of Computer Science, MY University, Islamabad 45750, Pakistan
9
Khan S, Khan M, Iqbal N, Dilshad N, Almufareh MF, Alsubaie N. Enhancing Sumoylation Site Prediction: A Deep Neural Network with Discriminative Features. Life (Basel) 2023; 13:2153. [PMID: 38004293 PMCID: PMC10672286 DOI: 10.3390/life13112153] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 09/07/2023] [Revised: 10/18/2023] [Accepted: 10/25/2023] [Indexed: 11/26/2023] Open
Abstract
Sumoylation is a post-translational modification (PTM) mechanism that is involved in many critical biological processes, such as gene expression, protein localization and stabilization, and genome replication. Moreover, sumoylation sites are associated with different diseases, including Parkinson's and Alzheimer's. Because of this vital biological role, identifying sumoylation sites in proteins is significant for monitoring protein functions and understanding multiple diseases. Several computational models utilizing conventional ML methods have therefore been introduced in the literature to classify sumoylation sites. However, these models cannot accurately classify the sumoylation sites due to intrinsic limitations associated with the conventional learning methods. This paper proposes a robust computational model (called Deep-Sumo) for predicting sumoylation sites based on a deep-learning algorithm with efficient feature representation methods. The proposed model employs a half-sphere exposure method to represent protein sequences in a feature vector. Principal Component Analysis is applied to extract discriminative features by eliminating noisy and redundant features. The discriminant features are given to a multilayer Deep Neural Network (DNN) model to predict sumoylation sites accurately. The performance of the proposed model is extensively evaluated using a 10-fold cross-validation test with various statistical performance metrics. Initially, the proposed DNN is compared with traditional learning algorithms, and subsequently, the performance of Deep-Sumo is compared with the existing models. The validation results show that the proposed model reports an average accuracy of 96.47%, an improvement over the existing models. It is anticipated that the proposed model can be used as an effective tool for drug discovery and the diagnosis of multiple diseases.
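The 10-fold cross-validation protocol referred to in this abstract can be outlined with a small index-splitting sketch. It is a generic illustration under stated assumptions, not the evaluation code from the paper: the generator name `k_fold_indices`, the fixed `seed`, and the unstratified shuffling are all hypothetical choices.

```python
import random
from typing import Iterator, List, Tuple


def k_fold_indices(
    n_samples: int, k: int = 10, seed: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        # Every fold except the i-th one forms the training set.
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, test_idx
```

Each sample appears in exactly one test fold, so averaging a metric over the k folds gives the kind of "average accuracy" figure that abstracts like this one report.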
Affiliation(s)
- Salman Khan
  - Department of Computer Science, Abdul Wali Khan University, Mardan 23200, Pakistan
- Mukhtaj Khan
  - Department of Information Technology, The University of Haripur, Haripur 22620, Pakistan
- Nadeem Iqbal
  - Department of Computer Science, Abdul Wali Khan University, Mardan 23200, Pakistan
- Naqqash Dilshad
  - Department of Convergence Engineering for Intelligent Drone, Sejong University, Seoul 05006, Republic of Korea
- Maram Fahaad Almufareh
  - Department of Information Systems, College of Computer and Information Sciences, Jouf University, Sakaka 72388, Saudi Arabia
- Najah Alsubaie
  - Department of Computer Sciences, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University (PNU), P.O. Box 84428, Riyadh 11671, Saudi Arabia
10
Wang Y, Zheng L. A Novel Deep Framework for English Communication Based on Educational Psychology Perspective. Front Public Health 2022; 10:916101. [PMID: 35801240 PMCID: PMC9253416 DOI: 10.3389/fpubh.2022.916101] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2022] [Accepted: 05/09/2022] [Indexed: 11/21/2022] Open
Abstract
The impact of verbal reading practices on learning is examined from the perspective of educational psychology, using motivation theory and schema theory. This research aims to enhance learners' English communication abilities in response to the needs of national economic growth and scientific and technological development. To motivate students to improve their English, the research addresses the issue of inadequate opportunities by adding an artificial intelligence (AI) conversation mechanism to the students' spoken English exercises. First, cognitive psychology is analyzed in detail, and a model based on cognitive psychology is implemented to solve the problems in students' English communication. In addition, various measures are presented and used to improve students' oral English communication abilities. Sixty students from North China University of Water Resources and Electric Power were separated into two classes: Class A, the experimental group, and Class B, the control group. A comparison of the outcomes obtained before and after training shows that the experimental group's reading comprehension, question-answering, situational conversation, and subject description scores rose by 13.33%, 15.19%, 17.39%, and 28.3%, respectively. The overall average score of the class climbed by 17.75%, whereas the scores of students in Class B improved only slightly. The results reveal that, following the vocalized reading exercise, students' English grades, self-efficacy, and topic knowledge increased considerably in the experimental group. Moreover, the proposed model employs computer simulation and AI in the English communication teaching system, which can aid in creating an interactive learning environment for students to improve their spoken English and English communication abilities.
Affiliation(s)
- Ying Wang
  - School of Foreign Studies, North China University of Water Resources and Electric Power, Zhengzhou, China
  - Correspondence: Ying Wang
- Liang Zheng
  - School of Foreign Languages, Henan University of Engineering, Zhengzhou, China
11
Zhang T, Chen L, Li R, Liu N, Huang X, Wong G. PIWI-interacting RNAs in human diseases: databases and computational models. Brief Bioinform 2022; 23:6603448. [PMID: 35667080 DOI: 10.1093/bib/bbac217] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 02/28/2022] [Revised: 04/24/2022] [Accepted: 05/09/2022] [Indexed: 11/12/2022] Open
Abstract
PIWI-interacting RNAs (piRNAs) are short 21-35 nucleotide molecules that comprise the largest class of non-coding RNAs and are found in a large diversity of species including yeast, worms, flies, plants and mammals including humans. The most well-understood function of piRNAs is to monitor and protect the genome from transposons, particularly in germline cells. Recent data suggest that piRNAs may have additional functions in somatic cells, although they are expressed there in far lower abundance. Compared with microRNAs (miRNAs), piRNAs have more limited bioinformatics resources available. This review collates 39 piRNA-specific and non-specific databases and bioinformatics resources, describes and compares their utility and attributes, and provides an overview of their place in the field. In addition, we review 33 computational models grouped by function: piRNA prediction, transposon element and mRNA-related piRNA prediction, cluster prediction, signature detection, target prediction and disease association. Based on the collection of databases and computational models, we identify trends and potential gaps in tool development. We further analyze the breadth and depth of piRNA data available in public sources and their contribution to specific human diseases, particularly cancer and neurodegenerative conditions, and highlight a few specific piRNAs that appear to be associated with these diseases. This briefing presents the most recent and comprehensive mapping to date of piRNA bioinformatics resources, including databases, models and tools for disease associations. Such a mapping should facilitate and stimulate further research on piRNAs.
Affiliation(s)
- Tianjiao Zhang
  - Faculty of Health Sciences, University of Macau, Taipa, Macau S.A.R. 999078, China
- Liang Chen
  - Department of Computer Science, School of Engineering, Shantou University, Shantou, China
- Rongzhen Li
  - Faculty of Health Sciences, University of Macau, Taipa, Macau S.A.R. 999078, China
- Ning Liu
  - Faculty of Health Sciences, University of Macau, Taipa, Macau S.A.R. 999078, China
- Xiaobing Huang
  - Faculty of Health Sciences, University of Macau, Taipa, Macau S.A.R. 999078, China
- Garry Wong
  - Faculty of Health Sciences, University of Macau, Taipa, Macau S.A.R. 999078, China
12
|
Ali SD, Alam W, Tayara H, Chong KT. Identification of Functional piRNAs Using a Convolutional Neural Network. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:1661-1669. [PMID: 33119510 DOI: 10.1109/tcbb.2020.3034313] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Indexed: 06/11/2023]
Abstract
Piwi-interacting RNAs (piRNAs) are a distinct sub-class of small non-coding RNAs that are mainly responsible for germline stem cell maintenance, gene stability, and maintaining genome integrity by repression of transposable elements. piRNAs are also expressed aberrantly and associated with various kinds of cancers. To identify piRNAs and their role in guiding target mRNA deadenylation, the currently available computational methods require urgent improvements in performance. To facilitate this, we propose a robust predictor based on a lightweight and simplified deep learning architecture using a convolutional neural network (CNN) to extract significant features from raw RNA sequences without the need for more customized features. The proposed model's performance is comprehensively evaluated using k-fold cross-validation on a benchmark dataset. The proposed model significantly outperforms existing computational methods in the prediction of piRNAs and their role in target mRNA deadenylation. In addition, a user-friendly and publicly-accessible web server is available at http://nsclbio.jbnu.ac.kr/tools/2S-piRCNN/.
|
13
|
Xu D, Yuan W, Fan C, Liu B, Lu MZ, Zhang J. Opportunities and Challenges of Predictive Approaches for the Non-coding RNA in Plants. Frontiers in Plant Science 2022; 13:890663. [PMID: 35498708 PMCID: PMC9048598 DOI: 10.3389/fpls.2022.890663] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/06/2022] [Accepted: 03/28/2022] [Indexed: 06/01/2023]
Affiliation(s)
- Dong Xu: State Key Laboratory of Subtropical Silviculture, College of Forestry and Biotechnology, Zhejiang A&F University, Hangzhou, China; Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Wenya Yuan: State Key Laboratory of Subtropical Silviculture, College of Forestry and Biotechnology, Zhejiang A&F University, Hangzhou, China
- Chunjie Fan: State Key Laboratory of Tree Genetics and Breeding, Research Institute of Tropical Forestry, Chinese Academy of Forestry, Guangzhou, China
- Bobin Liu: Jiangsu Key Laboratory for Bioresources of Saline Soils, Jiangsu Synthetic Innovation Center for Coastal Bio-agriculture, School of Wetlands, Yancheng Teachers University, Yancheng, China
- Meng-Zhu Lu: State Key Laboratory of Subtropical Silviculture, College of Forestry and Biotechnology, Zhejiang A&F University, Hangzhou, China
- Jin Zhang: State Key Laboratory of Subtropical Silviculture, College of Forestry and Biotechnology, Zhejiang A&F University, Hangzhou, China
|
14
|
Nguyen TTD, Ho QT, Le NQK, Phan VD, Ou YY. Use Chou's 5-Steps Rule With Different Word Embedding Types to Boost Performance of Electron Transport Protein Prediction Model. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:1235-1244. [PMID: 32750894 DOI: 10.1109/tcbb.2020.3010975] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 06/11/2023]
Abstract
Living organisms receive necessary energy substances directly from cellular respiration. The completion of electron storage and transportation requires the process of cellular respiration with the aid of electron transport chains. Deciphering electron transport proteins is therefore needed. Identifying these proteins with high performance depends on the choice of feature extraction method and machine learning algorithm. In this study, protein sequences were treated as natural language sentences composed of words. The proposed word embedding-based feature sets, built on word embedding modulation and protein motif frequencies, were useful for feature selection. Five word embedding types and a variety of conjoint features were examined for this feature selection. The support vector machine algorithm was then employed to perform classification. In 5-fold cross-validation, the average accuracy, specificity, sensitivity and MCC all surpass 0.95. In the independent test, these metrics are 96.82, 97.16 and 95.76 percent, and 0.9, respectively. Compared to state-of-the-art predictors, the proposed method performs better on all metrics, indicating its effectiveness in determining electron transport proteins. Furthermore, this study reveals insights about the applicability of various word embeddings for understanding surveyed sequences.
|
15
|
Khan S, Khan M, Iqbal N, Amiruddin Abd Rahman M, Khalis Abdul Karim M. Deep-piRNA: Bi-Layered Prediction Model for PIWI-Interacting RNA Using Discriminative Features. Computers, Materials & Continua 2022; 72:2243-2258. [DOI: 10.32604/cmc.2022.022901] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/22/2021] [Accepted: 11/11/2021] [Indexed: 09/02/2023]
|
16
|
Wang J, Wu S, Zhang J, Chen J. Potential Prognosis and Diagnostic Value of AKT3, LSM12, MEF2C, and RAB30 in Exosomes in Colorectal Cancer on Spark Framework. Journal of Healthcare Engineering 2021; 2021:8218043. [PMID: 34950443 PMCID: PMC8692012 DOI: 10.1155/2021/8218043] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/03/2021] [Accepted: 09/14/2021] [Indexed: 12/09/2022]
Abstract
Colorectal cancer (CRC) is a common malignant tumor and one of the leading causes of cancer-related deaths worldwide. CRC progression is greatly affected by the local microenvironment. In this study, we proposed a deep computational model for the classification of mRNA, lncRNA, and circRNA in exosomes. First, we analyzed mRNA expression levels in CRC tumors and normal tissues. Second, we used GO and KEGG to analyze their functional enrichment. Third, we analyzed the composition of immune cells in all TCGA samples and then evaluated the prognostic value of tumor-infiltrating immune cells in CRC. Lastly, we combined the TCGA datasets, i.e., COAD (N = 449) and ROAD (N = 6), for analysis and found that the expression levels of AKT3, LSM12, MEF2C, and RAB30 in exosomes were significantly correlated with tumor immune infiltration levels. The performance evaluation showed that the proposed neural network-based model performs better than the existing methods. The proposed model can be used as a potential tool for studying immune infiltration levels and their role in cancer metastasis and progression, which can help to explore potential strategies for CRC diagnosis, therapy, and prognosis.
Affiliation(s)
- Jue Wang: Department of Pathology, Jingjiang People's Hospital, Jingjiang, Taizhou, Jiangsu 214500, China
- Sheng Wu: Department of Pathology, Jingjiang People's Hospital, Jingjiang, Taizhou, Jiangsu 214500, China
- Jiuwen Zhang: Shanghai Jiao Tong University School of Medicine, Shanghai 200025, China
- Jing Chen: Department of Pathology, Jingjiang People's Hospital, Jingjiang, Taizhou, Jiangsu 214500, China
|
17
|
Huang S, Yoshitake K, Asakawa S. A Review of Discovery Profiling of PIWI-Interacting RNAs and Their Diverse Functions in Metazoans. Int J Mol Sci 2021; 22:ijms222011166. [PMID: 34681826 PMCID: PMC8538981 DOI: 10.3390/ijms222011166] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 09/08/2021] [Revised: 10/11/2021] [Accepted: 10/14/2021] [Indexed: 12/16/2022] Open
Abstract
PIWI-interacting RNAs (piRNAs) are a class of small non-coding RNAs (sncRNAs) that perform crucial biological functions in metazoans and defend against transposable elements (TEs) in germ lines. Recently, ubiquitously expressed piRNAs were discovered in soma and germ lines using small RNA sequencing (sRNA-seq) in humans and animals, providing new insights into the diverse functions of piRNAs. However, the role of piRNAs has not yet been fully elucidated, and sRNA-seq studies continue to reveal different piRNA activities in the genome. In this review, we summarize a set of simplified processes for piRNA analysis in order to provide a useful guide for researchers to perform piRNA research suitable for their study objectives. These processes can help expand the functional research on piRNAs from previously reported sRNA-seq results in metazoans. Ubiquitously expressed piRNAs have been discovered in the soma and germ lines in Annelida, Cnidaria, Echinodermata, Crustacea, Arthropoda, and Mollusca, but they are limited to germ lines in Chordata. The roles of piRNAs in TE silencing, gene expression regulation, epigenetic regulation, embryonic development, immune response, and associated diseases will continue to be discovered via sRNA-seq.
Affiliation(s)
- Songqian Huang
- Shuichi Asakawa
- Correspondence: (S.H.); (S.A.); Tel.: +81-3-5841-5296 (S.A.); Fax: +81-3-5841-8166 (S.A.)
|
18
|
Asim MN, Ibrahim MA, Imran Malik M, Dengel A, Ahmed S. Advances in Computational Methodologies for Classification and Sub-Cellular Locality Prediction of Non-Coding RNAs. Int J Mol Sci 2021; 22:8719. [PMID: 34445436 PMCID: PMC8395733 DOI: 10.3390/ijms22168719] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 07/17/2021] [Revised: 08/02/2021] [Accepted: 08/03/2021] [Indexed: 02/06/2023] Open
Abstract
Apart from protein-coding ribonucleic acids (RNAs), there exists a variety of non-coding RNAs (ncRNAs) which regulate complex cellular and molecular processes. High-throughput sequencing technologies and bioinformatics approaches have largely promoted the exploration of ncRNAs, which has revealed their crucial roles in gene regulation, miRNA binding, protein interactions, and splicing. Furthermore, ncRNAs are involved in the development of complicated diseases like cancer. Categorization of ncRNAs is essential to understand the mechanisms of diseases and to develop effective treatments. Sub-cellular localization information of ncRNAs demystifies diverse functionalities of ncRNAs. To date, several computational methodologies have been proposed to precisely identify the class as well as sub-cellular localization patterns of RNAs. This paper discusses different types of ncRNAs and reviews computational approaches proposed in the last 10 years to distinguish coding RNA from ncRNA, to identify sub-types of ncRNAs such as piwi-associated RNA, micro RNA, long ncRNA, and circular RNA, and to determine sub-cellular localization of distinct ncRNAs and RNAs. Furthermore, it summarizes diverse ncRNA classification and sub-cellular localization determination datasets along with benchmark performance to aid the development and evaluation of novel computational methodologies. It identifies research gaps, heterogeneity, and challenges in the development of computational approaches for RNA sequence analysis. We consider that our expert analysis will assist Artificial Intelligence researchers with knowing state-of-the-art performance, model selection for various tasks on one platform, dominantly used sequence descriptors, neural architectures, and interpreting inter-species and intra-species performance deviation.
Affiliation(s)
- Muhammad Nabeel Asim: German Research Center for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany; Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany
- Muhammad Ali Ibrahim: German Research Center for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany; Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany
- Muhammad Imran Malik: National Center for Artificial Intelligence (NCAI), National University of Sciences and Technology, Islamabad 44000, Pakistan; School of Electrical Engineering & Computer Science, National University of Sciences and Technology, Islamabad 44000, Pakistan
- Andreas Dengel: German Research Center for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany; Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany
- Sheraz Ahmed: German Research Center for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany; DeepReader GmbH, Trippstadter Str. 122, 67663 Kaiserslautern, Germany
|
19
|
Computational Methods and Online Resources for Identification of piRNA-Related Molecules. Interdiscip Sci 2021; 13:176-191. [PMID: 33886096 DOI: 10.1007/s12539-021-00428-5] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 10/21/2020] [Revised: 03/26/2021] [Accepted: 03/29/2021] [Indexed: 02/07/2023]
Abstract
piRNAs are a class of small non-coding RNA molecules that interact with the PIWI family and have many important and diverse biological functions. The present review aims to provide guidelines and contribute to piRNA research. We focused on four types of identification models for piRNA-related molecules: piRNA, piRNA cluster, piRNA target, and disease-related piRNA. We evaluated the tools for the identification of piRNAs on five aspects: datasets, features, classifiers, performance, and usability. We found that the precision of 2lpiRNApred was the highest on datasets of model organisms, piRNN performed better on datasets of non-model organisms, and 2L-piRNA had the fastest recognition speed of all tools. In addition, we presented an overview of piRNA databases. The databases were divided into six categories: basic annotation, comprehensive annotation, isoform, cluster, target, and disease. We found that piRNA data of non-model organisms, piRNA target data, and piRNA-disease-associated data should be strengthened. Our review might assist researchers in selecting appropriate tools or datasets for their studies, reveal potential problems and shed light on future bioinformatics studies.
|
20
|
Khan F, Khan M, Iqbal N, Khan S, Muhammad Khan D, Khan A, Wei DQ. Prediction of Recombination Spots Using Novel Hybrid Feature Extraction Method via Deep Learning Approach. Front Genet 2020; 11:539227. [PMID: 33093842 PMCID: PMC7527634 DOI: 10.3389/fgene.2020.539227] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 03/10/2020] [Accepted: 08/13/2020] [Indexed: 01/20/2023] Open
Abstract
Meiotic recombination is the driving force of evolutionary development and an important source of genetic variation. Meiotic recombination does not take place randomly in a chromosome but occurs in certain regions of the chromosome. Regions with a higher rate of meiotic recombination events are called hotspots, and regions where recombination events are less frequent are called coldspots. Prediction of meiotic recombination spots provides useful information about the basic functionality of inheritance and genome diversity. This study proposes an intelligent computational predictor called iRSpots-DNN for the identification of recombination spots. The proposed predictor is based on a novel feature extraction method and an optimized deep neural network (DNN). The DNN was employed as the classification engine, whereas the novel feature extraction method was developed to extract meaningful features for the identification of hotspots and coldspots across the yeast genome. Unlike previous algorithms, the proposed feature extraction avoids bias among the selected features and preserves the sequence discriminant properties along with the sequence-structure information simultaneously. This study also considered other effective classifiers, namely support vector machine (SVM), K-nearest neighbor (KNN), and random forest (RF), to predict recombination spots. Experimental results on a benchmark dataset with 10-fold cross-validation showed that iRSpots-DNN achieved the highest accuracy, i.e., 95.81%. Additionally, the performance of the proposed iRSpots-DNN is significantly better than that of the existing predictors on a benchmark dataset. The relevant benchmark dataset and source code are freely available at: https://github.com/Fatima-Khan12/iRspot_DNN/tree/master/iRspot_DNN.
Affiliation(s)
- Fatima Khan: Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, Pakistan
- Mukhtaj Khan: Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, Pakistan
- Nadeem Iqbal: Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, Pakistan
- Salman Khan: Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, Pakistan
- Dost Muhammad Khan: Department of Statistics, Abdul Wali Khan University Mardan, Mardan, Pakistan
- Abbas Khan: Department of Bioinformatics and Biological Statistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
- Dong-Qing Wei: Department of Bioinformatics and Biological Statistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China; State Key Laboratory of Microbial Metabolism, Shanghai-Islamabad-Belgrade Joint Innovation Center on Antibacterial Resistances, Joint Laboratory of International Cooperation in Metabolic and Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Ministry of Education, Shanghai, China; Peng Cheng Laboratory, Shenzhen, China
|
21
|
Abstract
During the last three decades or so, many efforts have been made to study the protein cleavage sites by some disease-causing enzyme, such as HIV (Human Immunodeficiency Virus) protease and SARS (Severe Acute Respiratory Syndrome) coronavirus main proteinase. It has become increasingly clear via this mini-review that the motivation driving the aforementioned studies is quite wise, and that the results acquired through these studies are very rewarding, particularly for developing peptide drugs.
Affiliation(s)
- Kuo-Chen Chou: Gordon Life Science Institute, Boston, MA 02478, United States
|
22
|
Chou KC. An Insightful 10-year Recollection Since the Emergence of the 5-steps Rule. Curr Pharm Des 2020; 25:4223-4234. [PMID: 31782354 DOI: 10.2174/1381612825666191129164042] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 09/18/2019] [Accepted: 11/25/2019] [Indexed: 11/22/2022]
Abstract
OBJECTIVE: One of the most challenging and also the most difficult problems is how to formulate a biological sequence with a vector but considerably keep its sequence order information. METHODS: To address such a problem, the approach of Pseudo Amino Acid Components or PseAAC has been developed. RESULTS AND CONCLUSION: It has become increasingly clear via the 10-year recollection that the aforementioned proposal has been indeed very powerful.
Affiliation(s)
- Kuo-Chen Chou: Gordon Life Science Institute, Boston, Massachusetts 02478, United States; Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
|
23
|
Some illuminating remarks on molecular genetics and genomics as well as drug development. Mol Genet Genomics 2020; 295:261-274. [PMID: 31894399 DOI: 10.1007/s00438-019-01634-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 11/21/2019] [Accepted: 12/05/2019] [Indexed: 02/07/2023]
Abstract
Facing the explosive growth of biological sequences unearthed in the post-genomic age, one of the most important but also most difficult problems in computational biology is how to express a biological sequence with a discrete model or a vector while still keeping considerable sequence-order information or its special pattern. To deal with such a challenging problem, the ideas of "pseudo amino acid components" and "pseudo K-tuple nucleotide composition" have been proposed. These ideas and their approaches have further stimulated the birth of the "distorted key theory" and the "wenxing diagram", substantially strengthened the power in treating multi-label systems, and led to the establishment of the famous "5-steps rule". All these logical developments are quite natural and are very useful not only for theoretical scientists but also for experimental scientists in conducting genetics/genomics analysis and drug development. This review paper also presents their future perspectives; i.e., their impacts will become even more significant and profound.
|
24
|
Liu XX, Chou KC. pLoc_Deep-mGneg: Predict Subcellular Localization of Gram Negative Bacterial Proteins by Deep Learning. ACTA ACUST UNITED AC 2020. [DOI: 10.4236/abb.2020.115011] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 11/20/2022]
|
25
|
Shao YT, Liu XX, Lu Z, Chou KC. pLoc_Deep-mPlant: Predict Subcellular Localization of Plant Proteins by Deep Learning. ACTA ACUST UNITED AC 2020. [DOI: 10.4236/ns.2020.125021] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 11/20/2022]
|
26
|
Shao YT, Liu XX, Lu Z, Chou KC. pLoc_Deep-mHum: Predict Subcellular Localization of Human Proteins by Deep Learning. ACTA ACUST UNITED AC 2020. [DOI: 10.4236/ns.2020.127042] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 11/20/2022]
|
27
|
Lu Z, Chou KC. pLoc_Deep-mGpos: Predict Subcellular Localization of Gram Positive Bacteria Proteins by Deep Learning. ACTA ACUST UNITED AC 2020. [DOI: 10.4236/jbise.2020.135005] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 11/20/2022]
|
28
|
Shao Y, Chou KC. pLoc_Deep-mVirus: A CNN Model for Predicting Subcellular Localization of Virus Proteins by Deep Learning. ACTA ACUST UNITED AC 2020. [DOI: 10.4236/ns.2020.126033] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 11/20/2022]
|
29
|
Shao Y, Chou KC. pLoc_Deep-mEuk: Predict Subcellular Localization of Eukaryotic Proteins by Deep Learning. ACTA ACUST UNITED AC 2020. [DOI: 10.4236/ns.2020.126034] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 11/20/2022]
|
30
|
Chou KC. Proposing Pseudo Amino Acid Components is an Important Milestone for Proteome and Genome Analyses. Int J Pept Res Ther 2019. [DOI: 10.1007/s10989-019-09910-7] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Indexed: 12/28/2022]
|