1
Ahmed S, Schaduangrat N, Chumnanpuen P, Shoombuatong W. GRU4ACE: Enhancing ACE inhibitory peptide prediction by integrating gated recurrent unit with multi-source feature embeddings. Protein Sci 2025; 34:e70026. [PMID: 40371738 PMCID: PMC12079467 DOI: 10.1002/pro.70026] [Received: 10/18/2024] [Revised: 12/12/2024] [Accepted: 12/19/2024] [Indexed: 05/16/2025]
Abstract
Accurate identification of angiotensin-I-converting enzyme (ACE) inhibitory peptides is essential for understanding the primary factor regulating the renin-angiotensin system and guiding the development of new drug candidates. Given the inherent challenges in experimental processes, computational methods for in silico peptide identification can be invaluable for enabling high-throughput characterization of ACE inhibitory peptides. This study introduces GRU4ACE, an innovative deep learning framework based on multi-view information for identifying ACE inhibitory peptides. First, GRU4ACE utilizes multi-source feature encoding methods to capture the information embedded in ACE inhibitory peptides, including sequential information, graphical information, semantic information, and contextual information. Specifically, the feature representations used herein are derived from conventional feature descriptors, natural language processing (NLP)-based embeddings, and pre-trained protein language model (PLM)-based embeddings. Next, multiple feature embeddings were fused, and the elastic net was employed for feature optimization. Finally, the optimal feature subset with strong feature representation was input into a gated recurrent unit (GRU). The proposed GRU4ACE approach demonstrated superior performance over existing methods in terms of the independent test. To be specific, the balanced accuracy, sensitivity, and MCC scores of GRU4ACE reached 0.948, 0.934, and 0.895, which were 6.46%, 8.92%, and 12.51% higher than those of the compared methods, respectively. In addition, when comparing well-regarded feature descriptors, we found that the proposed multi-view features effectively captured crucial information, leading to improved ACE inhibitory peptide prediction performance. These comprehensive results highlight that GRU4ACE enhances prediction accuracy and significantly narrows down the search for new potential antihypertensive drugs.
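The three scores reported in this abstract (balanced accuracy, sensitivity, and MCC) can all be derived from a binary confusion matrix. The following is a minimal sketch of those standard definitions; the function name and the example counts are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Compute standard binary classification scores from confusion counts."""
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    balanced_accuracy = (sensitivity + specificity) / 2
    # Matthews correlation coefficient; defined as 0 when the denominator vanishes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": balanced_accuracy,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only (not the GRU4ACE test set).
metrics = binary_metrics(tp=90, fn=10, tn=95, fp=5)
print(metrics)
```

Balanced accuracy and MCC are commonly preferred over plain accuracy when the positive and negative classes differ in size, which is why they are often reported together.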
Affiliation(s)
- Saeed Ahmed
  - Center for Research Innovation and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, Thailand
  - Department of Computer Science, University of Swabi, Swabi, Pakistan
- Nalini Schaduangrat
  - Center for Research Innovation and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, Thailand
- Pramote Chumnanpuen
  - Department of Zoology, Faculty of Science, Kasetsart University, Bangkok, Thailand
  - Kasetsart University International College (KUIC), Kasetsart University, Bangkok, Thailand
- Watshara Shoombuatong
  - Center for Research Innovation and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, Thailand
2
Jiang J, Zhang C, Ke L, Hayes N, Zhu Y, Qiu H, Zhang B, Zhou T, Wei GW. A review of machine learning methods for imbalanced data challenges in chemistry. Chem Sci 2025; 16:7637-7658. [PMID: 40271022 PMCID: PMC12013631 DOI: 10.1039/d5sc00270b] [Received: 01/13/2025] [Accepted: 04/06/2025] [Indexed: 04/25/2025]
Abstract
Imbalanced data, where certain classes are significantly underrepresented in a dataset, is a widespread machine learning (ML) challenge across various fields of chemistry, yet it remains inadequately addressed. This data imbalance can lead to biased ML or deep learning (DL) models, which fail to accurately predict the underrepresented classes, thus limiting the robustness and applicability of these models. With the rapid advancement of ML and DL algorithms, several promising solutions to this issue have emerged, prompting the need for a comprehensive review of current methodologies. In this review, we examine the prominent ML approaches used to tackle the imbalanced data challenge in different areas of chemistry, including resampling techniques, data augmentation techniques, algorithmic approaches, and feature engineering strategies. Each of these methods is evaluated in the context of its application across various aspects of chemistry, such as drug discovery, materials science, cheminformatics, and catalysis. We also explore future directions for overcoming the imbalanced data challenge and emphasize data augmentation via physical models, large language models (LLMs), and advanced mathematics. The benefit of balanced data in new material design and production and the persistent challenges are discussed. Overall, this review aims to elucidate the prevalent ML techniques applied to mitigate the impacts of imbalanced data within the field of chemistry and offer insights into future directions for research and application.
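Of the strategy families listed in this abstract, resampling is the simplest to illustrate. The sketch below shows plain random oversampling, which duplicates minority-class samples until every class matches the largest one. It is a generic illustration and not a method taken from the review; the function name and toy data are hypothetical.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority samples until all classes are equal in size."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_x, out_y = list(samples), list(labels)
    for cls, n in counts.items():
        # Pool of original samples belonging to this class.
        pool = [x for x, y in zip(samples, labels) if y == cls]
        for _ in range(target - n):
            out_x.append(rng.choice(pool))
            out_y.append(cls)
    return out_x, out_y


# Toy data: four inactive compounds (label 0) and one active compound (label 1).
X = [[0.1], [0.2], [0.3], [0.4], [0.9]]
y = [0, 0, 0, 0, 1]
X_bal, y_bal = random_oversample(X, y)
balanced_counts = Counter(y_bal)
print(balanced_counts)
```

Oversampling should be applied to the training split only; duplicating samples before splitting leaks copies of the same point into the test set.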
Affiliation(s)
- Jian Jiang
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, USA
- Chunhuan Zhang
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Lu Ke
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Nicole Hayes
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, USA
- Yueying Zhu
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Huahai Qiu
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Bengong Zhang
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Tianshou Zhou
  - Key Laboratory of Computational Mathematics, Guangdong Province, School of Mathematics, Sun Yat-sen University, Guangzhou 510006, P. R. China
- Guo-Wei Wei
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, USA
  - Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan 48824, USA
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, USA
3
Cheng Z, Xu D, Ding D, Ding Y. Prediction of Drug-Target Interactions With High-Quality Negative Samples and a Network-Based Deep Learning Framework. IEEE J Biomed Health Inform 2025; 29:1567-1578. [PMID: 38227407 DOI: 10.1109/jbhi.2024.3354953] [Indexed: 01/17/2024]
Abstract
Identification of drug-target interactions (DTIs) plays a crucial role in drug discovery. Compared to traditional experimental methods, computer-based methods for predicting DTIs can significantly reduce the time and financial burdens of drug development. In recent years, numerous machine learning-based methods have been proposed for predicting potential DTIs. However, a common limitation among these methods is the absence of high-quality negative samples. Moreover, the effective extraction of multisource information of drugs and proteins for DTI prediction remains a significant challenge. In this paper, we investigated two aspects: the selection of high-quality negative samples and the construction of a high-performance DTI prediction framework. Specifically, we found two types of hidden biases when randomly selecting negative samples from unlabeled drug-protein pairs and proposed a negative sample selection approach based on complex network theory. Furthermore, we proposed a novel DTI prediction method named HNetPa-DTI, which integrates topological information from the drug-protein-disease heterogeneous network and gene ontology (GO) and pathway annotation information of proteins. Specifically, we extracted topological information of the drug-protein-disease heterogeneous network using heterogeneous graph neural networks, and obtained GO and pathway annotation information of proteins from the GO term semantic similarity networks, GO term-protein bipartite networks, and pathway-protein bipartite network using graph neural networks. Experimental results show that HNetPa-DTI outperforms the baseline methods on four types of prediction tasks, demonstrating the superiority of our method.
4
Lyu B, Gou W, Xu F, Chen L, Wang Z, Ren Z, Liu G, Li Y, Hou W. Target Discovery Driven by Chemical Biology and Computational Biology. Chem Rec 2025; 25:e202400182. [PMID: 39811950 DOI: 10.1002/tcr.202400182] [Received: 09/09/2024] [Revised: 12/06/2024] [Indexed: 01/16/2025]
Abstract
Target identification is crucial for drug screening and development because it can reveal the mechanism of drug action and ensure the reliability and accuracy of the results. Chemical biology, an interdisciplinary field combining chemistry and biology, can assist in this process by studying the interactions between active molecular compounds and proteins and their physiological effects. It can also help predict potential drug targets or candidates, develop new biomarker assays and diagnostic reagents, and evaluate the selectivity and range of active compounds to reduce the risk of off-target effects. Chemical biology can achieve these goals using techniques such as changing protein thermal stability, enzyme sensitivity, and molecular structure and applying probes, isotope labeling and mass spectrometry. Concurrently, computational biology employs a diverse array of computational models to predict drug targets. This approach also offers innovative avenues for repurposing existing drugs. In this paper, we review the reported chemical biology and computational biology techniques for identifying different types of targets that can provide valuable insights for drug target discovery.
Affiliation(s)
- Bohai Lyu
  - Institute of Radiation Medicine, Peking Union Medical College & Chinese Academy of Medical Sciences, Tianjin, 300192, China
  - Tianjin University of Traditional Chinese Medicine, Tianjin, 301617, China
- Wenfeng Gou
  - Institute of Radiation Medicine, Peking Union Medical College & Chinese Academy of Medical Sciences, Tianjin, 300192, China
- Feifei Xu
  - Institute of Radiation Medicine, Peking Union Medical College & Chinese Academy of Medical Sciences, Tianjin, 300192, China
- Leyuan Chen
  - Institute of Radiation Medicine, Peking Union Medical College & Chinese Academy of Medical Sciences, Tianjin, 300192, China
- Zhiyun Wang
  - Institute of Radiation Medicine, Peking Union Medical College & Chinese Academy of Medical Sciences, Tianjin, 300192, China
- Zhonghao Ren
  - Institute of Radiation Medicine, Peking Union Medical College & Chinese Academy of Medical Sciences, Tianjin, 300192, China
  - Department of Pharmacology, Shenyang Pharmaceutical University, 103 Wenhua Road, Shenhe District, Shenyang, 110016, China
- Gaiting Liu
  - Institute of Radiation Medicine, Peking Union Medical College & Chinese Academy of Medical Sciences, Tianjin, 300192, China
  - Tianjin University of Traditional Chinese Medicine, Tianjin, 301617, China
- Yiliang Li
  - Institute of Radiation Medicine, Peking Union Medical College & Chinese Academy of Medical Sciences, Tianjin, 300192, China
- Wenbin Hou
  - Institute of Radiation Medicine, Peking Union Medical College & Chinese Academy of Medical Sciences, Tianjin, 300192, China
5
Liu Y, Liu Y, Yang H, Zhang L, Che K, Xing L. NTMFF-DTA: Prediction of Drug-Target Affinity Based on Network Topology and Multi-feature Fusion. Interdiscip Sci 2025:10.1007/s12539-025-00692-9. [PMID: 39998589 DOI: 10.1007/s12539-025-00692-9] [Received: 10/28/2024] [Revised: 01/20/2025] [Accepted: 01/21/2025] [Indexed: 02/27/2025]
Abstract
Predicting drug-target binding affinity (DTA) is an important step in the complex process of drug discovery or drug repositioning. A large number of computational methods proposed for the task of DTA prediction utilize single features of proteins to measure drug-protein or protein-protein interactions, ignoring multi-feature fusion between protein-related features (e.g., solvent accessibility, protein pockets, secondary structures, and distance maps). To address these constraints, we propose a new network topology and multi-feature fusion based approach for DTA prediction (NTMFF-DTA), which deeply mines multiple types of protein data and propagates drug information across domains. Data in drug-target interactions are often sparse, and multi-feature fusion can enrich data information by integrating multiple features, thus overcoming the data sparsity problem to some extent. The proposed approach offers two main contributions: (1) constructing a relationship-aware GAT that selectively focuses on the connections between nodes and edges in the molecular graph to capture the more central roles of nodes and edges in DTA prediction and (2) constructing an information propagation channel between different feature domains of drug proteins to achieve the sharing of the importance weight of drug atoms and edges, and combining with a multi-head self-attention mechanism to capture residue-enhancing features. The NTMFF-DTA model was comparatively tested against several leading baseline technologies on commonly used datasets. Experimental results show that NTMFF-DTA can effectively and accurately predict DTA and outperforms existing comparative models.
Affiliation(s)
- Yuandong Liu
  - Computer Science and Technology, Shandong University of Technology, Mashang, Zibo, 255000, China
- Youzhi Liu
  - Computer Science and Technology, Shandong University of Technology, Mashang, Zibo, 255000, China
- Haoqin Yang
  - Department of Mechanical Engineering, Shandong University of Technology, Mashang, Zibo, 255000, China
- Longbo Zhang
  - Computer Science and Technology, Shandong University of Technology, Mashang, Zibo, 255000, China
- Kai Che
  - Xi'an Aeronautics Computing Technique Research Institute, AVIC, Xi'an, 710065, China
  - School of Computer Science, Northwestern Polytechnical University, Xi'an, 710072, China
- Linlin Xing
  - Computer Science and Technology, Shandong University of Technology, Mashang, Zibo, 255000, China
6
Zhang X, Liu Q. A graph neural network approach for hierarchical mapping of breast cancer protein communities. BMC Bioinformatics 2025; 26:23. [PMID: 39838298 PMCID: PMC11749236 DOI: 10.1186/s12859-024-06015-x] [Received: 05/26/2024] [Accepted: 12/16/2024] [Indexed: 01/23/2025]
Abstract
BACKGROUND Comprehensively mapping the hierarchical structure of breast cancer protein communities and identifying potential biomarkers from them is a promising way for breast cancer research. Existing approaches are subjective and fail to take information from protein sequences into consideration. Deep learning can automatically learn features from protein sequences and protein-protein interactions for hierarchical clustering. RESULTS Using a large amount of publicly available proteomics data, we created a hierarchical tree for breast cancer protein communities using a novel hierarchical graph neural network, with the supervision of gene ontology terms and assistance of a pre-trained deep contextual language model. Then, a group-lasso algorithm was applied to identify protein communities that are under both mutation burden and survival burden, undergo significant alterations when targeted by specific drug molecules, and show cancer-dependent perturbations. The resulting hierarchical map of protein communities shows how gene-level mutations and survival information converge on protein communities at different scales. Internal validity of the model was established through the convergence on BRCA2 as a breast cancer hotspot. Further overlaps with breast cancer cell dependencies revealed SUPT6H and RAD21, along with their respective protein systems, HOST:37 and HOST:861, as potential biomarkers. Using gene-level perturbation data of the HOST:37 and HOST:861 gene sets, three FDA-approved drugs with high therapeutic value were selected as potential treatments to be further evaluated. These drugs include mercaptopurine, pioglitazone, and colchicine. CONCLUSION The proposed graph neural network approach to analyzing breast cancer protein communities in a hierarchical structure provides a novel perspective on breast cancer prognosis and treatment. 
By targeting entire gene sets, we were able to evaluate the prognostic and therapeutic value of genes (or gene sets) at different levels, from gene-level to system-level biology. Cancer-specific gene dependencies provide additional context for pinpointing cancer-related systems and drug-induced alterations can highlight potential therapeutic targets. These identified protein communities, in conjunction with other protein communities under strong mutation and survival burdens, can potentially be used as clinical biomarkers for breast cancer.
Affiliation(s)
- Xiao Zhang
  - Department of Applied Computer Science, University of Winnipeg, Winnipeg, MB, R3B 2E9, Canada
  - Department of Biochemistry and Medical Genetics, University of Manitoba, Winnipeg, MB, R3E 0W2, Canada
- Qian Liu
  - Department of Applied Computer Science, University of Winnipeg, Winnipeg, MB, R3B 2E9, Canada
  - Department of Biochemistry and Medical Genetics, University of Manitoba, Winnipeg, MB, R3E 0W2, Canada
7
Zhang S, Zhang R, Chen Z, Shao Z, Li A, Li F, Huang F. Neuroinflammation mediates the progression of neonate hypoxia-ischemia brain damage to Alzheimer's disease: a bioinformatics and experimental study. Front Aging Neurosci 2025; 16:1511668. [PMID: 39872979 PMCID: PMC11770030 DOI: 10.3389/fnagi.2024.1511668] [Received: 10/16/2024] [Accepted: 12/05/2024] [Indexed: 01/30/2025]
Abstract
Background Traumatic brain injury (TBI) can generally be divided into focal damage and diffuse damage, and neonate Hypoxia-Ischemia Brain Damage (nHIBD) is one of the causes of diffuse damage. Patients with nHIBD are at an increased risk of developing Alzheimer's disease (AD). However, the shared pathogenesis of patients affected with both neurological disorders has not been fully elucidated. Purpose We here aim to identify the shared molecular signatures between nHIBD and AD. We used an integrated analysis of the cortex gene expression data, targeting differential expression of genes related to the mechanisms of neurodegeneration and cognitive impairment following traumatic brain injury. Methods The gene expression profiles of Alzheimer's disease (GSE203206) and that of Neonate Hypoxia-Ischemia Brain Damage (GSE23317) were obtained from the Gene Expression Omnibus (GEO) database. After identifying the common differentially expressed genes (DEGs) of Alzheimer's disease and neonate Hypoxia-Ischemia Brain Damage by limma package analysis, five kinds of analyses were performed on them, namely Gene Ontology (GO) and pathway enrichment analysis, protein-protein interaction network, DEG-transcription factor interactions and DEG-microRNA interactions, protein-drug interactions and protein-disease association analysis, and gene-inflammation association analysis and protein-inflammation association analysis. Results In total, 12 common DEGs were identified including HSPB1, VIM, MVD, TUBB4A, AACS, ANXA6, DIRAS2, RPH3A, CEND1, KALM, THOP1, AREL1. We also identified 11 hub proteins, three central regulatory transcription factors, and three microRNAs encoded by the DEGs. Protein-drug interaction analysis showed that CYC1 and UQCRFS1 are associated with different drugs. Gene-disease association analysis shows Mammary Neoplasms, Neoplasm Metastasis, Schizophrenia, and Brain Ischemia diseases are the most relevant to the hub proteins we identified. 
Gene-inflammation association analysis shows that the hub gene AREL1 is related to inflammatory response, while the protein-inflammation association analysis shows that the hub proteins AKT1 and MAPK14 are related to inflammatory response. Conclusion This study provides new insights into the shared molecular mechanisms between AD and nHIBD. These common pathways and hub genes could potentially be used to design therapeutic interventions, reducing the likelihood of Alzheimer's disease development in survivors of neonatal Hypoxic-Ischemia brain injury.
Affiliation(s)
- Ruqiu Zhang
  - School of Medicine, Yunnan University, Kunming, China
- Zhaoqin Chen
  - State Key Laboratory for Conservation and Utilization of Bio-resources in Yunnan, Yunnan University, Kunming, China
- Zihan Shao
  - Changxin School, Yunnan University, Kunming, China
- An Li
  - School of Medicine, Yunnan University, Kunming, China
- Fan Li
  - Medical College, Shantou University, Shantou, China
- Fang Huang
  - School of Medicine, Yunnan University, Kunming, China
8
Khorramfard A, Pirgazi J, Ghanbari Sorkhi A. Predicting drug protein interactions based on improved support vector data description in unbalanced data. Bioimpacts 2024; 15:30468. [PMID: 40256215 PMCID: PMC12008248 DOI: 10.34172/bi.30468] [Received: 04/23/2024] [Revised: 07/24/2024] [Accepted: 09/07/2024] [Indexed: 04/22/2025]
Abstract
Introduction Predicting drug-protein interactions is critical in drug discovery, but traditional laboratory methods are expensive and time-consuming. Computational approaches, especially those leveraging machine learning, are increasingly popular. This paper introduces VASVDD, a multi-step method to predict drug-protein interactions. First, it extracts features from amino acid sequences in proteins and drug structures. To address the challenge of unbalanced datasets, a Support Vector Data Description (SVDD) approach is employed, outperforming standard techniques like SMOTE and ENN in balancing data. Subsequently, dimensionality reduction using a Variational Autoencoder (VAE) reduces features from 1074 to 32, improving computational efficiency and predictive performance. Methods The proposed method was evaluated on four datasets related to enzymes, G-protein-coupled receptors, ion channels, and nuclear receptors. Without preprocessing, the Gradient Boosting Classifier showed bias towards the majority class. However, balancing and dimensionality reduction significantly improved accuracy, sensitivity, specificity, and F1 scores. VASVDD demonstrated superior performance compared to other dimensionality reduction methods, such as kernel principal component analysis (kernel PCA) and principal component analysis (PCA), and was validated across multiple classifiers, achieving higher AUROC values than existing techniques. Results The results highlight VASVDD's effectiveness and generalizability in predicting drug-target interactions. The method outperforms state-of-the-art techniques in terms of accuracy, robustness, and efficiency, making it a promising tool in bioinformatics for drug discovery. Conclusion The datasets analyzed during the current study are not publicly available but are available from the corresponding author upon reasonable request, and the source code is available on GitHub: https://github.com/alirezakhorramfard/vasvdd.
Affiliation(s)
- Alireza Khorramfard
  - Department of Electrical and Computer Engineering, University of Science and Technology of Mazandaran, Behshahr, Iran
- Jamshid Pirgazi
  - Department of Electrical and Computer Engineering, University of Science and Technology of Mazandaran, Behshahr, Iran
- Ali Ghanbari Sorkhi
  - Department of Electrical and Computer Engineering, University of Science and Technology of Mazandaran, Behshahr, Iran
9
E U, T M, A V G, D P. A comprehensive survey of drug-target interaction analysis in allopathy and siddha medicine. Artif Intell Med 2024; 157:102986. [PMID: 39326289 DOI: 10.1016/j.artmed.2024.102986] [Received: 10/20/2023] [Revised: 08/13/2024] [Accepted: 09/18/2024] [Indexed: 09/28/2024]
Abstract
Effective drug delivery is the cornerstone of modern healthcare, ensuring therapeutic compounds reach their intended targets efficiently. This paper explores the potential of personalized and holistic healthcare, driven by the synergy between traditional and allopathic medicine systems, with a specific focus on the vast reservoir of medicinal compounds found in plants rooted in the historical legacy of traditional medicine. Motivated by the desire to unlock the therapeutic potential of medicinal plants and bridge the gap between traditional and allopathic medicine, this survey delves into in-silico computational approaches for studying Drug-Target Interactions (DTI) within the contexts of allopathy and siddha medicine. The contributions of this survey are multifaceted: it offers a comprehensive overview of in-silico methods for DTI analysis in both systems, identifies common challenges in DTI studies, provides insights into future directions to advance DTI analysis, and includes a comparative analysis of DTI in allopathy and siddha medicine. The findings of this survey highlight the pivotal role of in-silico computational approaches in advancing drug research and development in both allopathy and siddha medicine, emphasizing the importance of integrating these methods to drive the future of personalized healthcare.
Affiliation(s)
- Uma E
  - Department of Information Science and Technology, College of Engineering Guindy, Chennai, India
- Mala T
  - Department of Information Science and Technology, College of Engineering Guindy, Chennai, India
- Geetha A V
  - Department of Information Science and Technology, College of Engineering Guindy, Chennai, India
- Priyanka D
  - Department of Information Science and Technology, College of Engineering Guindy, Chennai, India
10
Wang Y, Yin Z. Drug-target interaction prediction through fine-grained selection and bidirectional random walk methodology. Sci Rep 2024; 14:18104. [PMID: 39103483 PMCID: PMC11300600 DOI: 10.1038/s41598-024-69186-w] [Received: 05/17/2024] [Accepted: 08/01/2024] [Indexed: 08/07/2024]
Abstract
The study of drug-target interaction plays an important role in the process of drug development. The subject of DTI forecasting has advanced significantly in the last several years, yielding numerous significant research findings and methodologies. Heterogeneous data sources provide richer information and comprehensive perspectives for drug-target interaction prediction, so many existing methods rely on heterogeneous networks, and graph embedding technology becomes an important technology to extract information from heterogeneous networks. These approaches, however, are less concerned with potential noisy information in heterogeneous networks and more focused on the extent of information extraction in those networks. Based on this, a potential DTI predictive network model called FBRWPC is proposed in this paper. It uses a fine-grained similarity selection program to first integrate similarity on similar networks and then a bidirectional random walk graph embedding learning method with restart to obtain an updated drug target interaction matrix. Through the use of similarity selection and fine-grained selection similarity integration, the framework can effectively filter out the noise present in heterogeneous networks and enhance the model's prediction performance. The experimental findings demonstrate that, even after being split up into four distinct types of data sets, FBRWPC can still retain great prediction performance, a sign of the model's resilience and good generalization.
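The abstract builds on random walk with restart. As a point of reference, the sketch below shows the basic single-network form of that idea, iterating p ← (1 − r)·W·p + r·e until convergence, where W is the column-normalized adjacency matrix and e marks the seed node. It is not the bidirectional FBRWPC procedure itself; the function name, the toy graph, and the restart value of 0.3 are assumptions chosen for illustration.

```python
def random_walk_with_restart(adj, seed, restart=0.3, tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities for a walk restarting at `seed`."""
    n = len(adj)
    # Column-normalize so W[i][j] is the probability of stepping from node j to node i.
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    W = [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
         for i in range(n)]
    e = [1.0 if i == seed else 0.0 for i in range(n)]  # restart distribution
    p = e[:]
    for _ in range(max_iter):
        nxt = [(1 - restart) * sum(W[i][j] * p[j] for j in range(n)) + restart * e[i]
               for i in range(n)]
        # Stop once the L1 change between iterations is negligible.
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy undirected graph: nodes 0, 1, 2 form a triangle; node 3 hangs off node 2.
adj = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = random_walk_with_restart(adj, seed=0)
print([round(s, 4) for s in scores])
```

Nodes that are close to the seed in the network receive higher scores, which is the property that makes this kind of walk useful for ranking candidate interactions.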
Affiliation(s)
- YaPing Wang
  - School of Mathematics, Physics and Statistics, Institute for Frontier Medical Technology, Center of Intelligent Computing and Applied Statistics, Shanghai University of Engineering Science, Shanghai, 201620, China
- ZhiXiang Yin
  - School of Mathematics, Physics and Statistics, Institute for Frontier Medical Technology, Center of Intelligent Computing and Applied Statistics, Shanghai University of Engineering Science, Shanghai, 201620, China
11
Ahmed F, Samantasinghar A, Bae MA, Choi KH. Integrated ML-Based Strategy Identifies Drug Repurposing for Idiopathic Pulmonary Fibrosis. ACS Omega 2024; 9:29870-29883. [PMID: 39005763 PMCID: PMC11238209 DOI: 10.1021/acsomega.4c03796] [Received: 04/20/2024] [Revised: 05/30/2024] [Accepted: 06/12/2024] [Indexed: 07/16/2024]
Abstract
Idiopathic pulmonary fibrosis (IPF) affects an estimated global population of around 3 million individuals. IPF is a medical condition with an unknown cause characterized by the formation of scar tissue in the lungs, leading to progressive respiratory disease. Currently, there are only two FDA-approved small molecule drugs specifically for the treatment of IPF, and this has created a demand for the rapid development of drugs for IPF treatment. Moreover, de novo drug development is time- and cost-intensive, with less than a 10% success rate. Drug repurposing is currently the most feasible option for rapidly bringing drugs to market for a rare and sporadic disease. Normally, the repurposing of drugs begins with a screening of FDA-approved drugs using computational tools, which results in a low hit rate. Here, an integrated machine learning-based drug repurposing strategy is developed to significantly reduce false positive outcomes by introducing predock machine-learning-based predictions followed by literature and GSEA-assisted validation and drug pathway prediction. The developed strategy is applied to 1480 FDA-approved drugs and to drugs currently in clinical trials for IPF to screen them against "TGFB1", "TGFB2", "PDGFR-a", "SMAD-2/3", "FGF-2", and more proteins, resulting in 247 total and 27 potentially repurposable drugs. The literature and GSEA validation suggested that 72 of 247 (29.14%) drugs have been tried for IPF, 13 of 247 (5.2%) drugs have already been used for lung fibrosis, and 20 of 247 (8%) drugs have been tested for other fibrotic conditions such as cystic fibrosis and renal fibrosis. Pathway prediction of the remaining 142 drugs was carried out, resulting in 118 distinct pathways. Furthermore, the analysis revealed that 29 of 118 pathways were directly or indirectly involved in IPF and 11 of 29 pathways were directly involved. Moreover, 15 potential drug combinations are suggested as showing a strong synergistic effect in IPF. The drug repurposing strategy reported here will be useful for rapidly developing drugs for treating IPF and other related conditions.
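The breakdown of the 247 hits reported in this abstract can be re-derived with a few lines of arithmetic. The sketch below uses only the counts stated in the abstract; the variable names and the `share` helper are illustrative assumptions, not part of the authors' workflow.

```python
# Counts as stated in the abstract; names are illustrative only.
total_hits = 247
tried_for_ipf = 72
used_for_lung_fibrosis = 13
tested_for_other_fibrosis = 20


def share(count, total):
    """Return count expressed as a percentage of total."""
    return 100.0 * count / total


# Drugs left over for pathway prediction after literature/GSEA validation.
remaining = total_hits - (
    tried_for_ipf + used_for_lung_fibrosis + tested_for_other_fibrosis
)

for label, count in [
    ("tried for IPF", tried_for_ipf),
    ("used for lung fibrosis", used_for_lung_fibrosis),
    ("tested for other fibrotic conditions", tested_for_other_fibrosis),
    ("remaining for pathway prediction", remaining),
]:
    print(f"{label}: {count}/{total_hits} = {share(count, total_hits):.2f}%")
```

The remainder, 247 - 72 - 13 - 20 = 142, agrees with the number of drugs the abstract passes on to pathway prediction.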
Affiliation(s)
- Faheem Ahmed
- Department of Mechatronics Engineering, Jeju National University, Jeju 63243, Republic of Korea
| | - Anupama Samantasinghar
- Department of Mechatronics Engineering, Jeju National University, Jeju 63243, Republic of Korea
| | - Myung Ae Bae
- Therapeutics and Biotechnology Division, Korea Research Institute of Chemical Technology, Daejeon 34114, Korea
| | - Kyung Hyun Choi
- Department of Mechatronics Engineering, Jeju National University, Jeju 63243, Republic of Korea
| |
|
12
|
Zhang Q, Zuo L, Ren Y, Wang S, Wang W, Ma L, Zhang J, Xia B. FMCA-DTI: a fragment-oriented method based on a multihead cross attention mechanism to improve drug-target interaction prediction. Bioinformatics 2024; 40:btae347. [PMID: 38810106 PMCID: PMC11256963 DOI: 10.1093/bioinformatics/btae347] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2023] [Revised: 04/23/2024] [Accepted: 05/28/2024] [Indexed: 05/31/2024] Open
Abstract
MOTIVATION: Identifying drug-target interactions (DTI) is crucial in drug discovery. Fragments are less complex and can accurately characterize local features, which is important in DTI prediction. Recently, deep learning (DL)-based methods have predicted DTI more efficiently. However, two challenges remain in existing DL-based methods: (i) some methods directly encode drugs and proteins into integers, ignoring the substructure representation; (ii) some methods learn the features of the drugs and proteins separately instead of considering their interactions. RESULTS: In this article, we propose a fragment-oriented method based on a multihead cross attention mechanism for predicting DTI, named FMCA-DTI. FMCA-DTI obtains multiple types of fragments of drugs and proteins by branch chain mining and category fragment mining. Importantly, FMCA-DTI utilizes the shared-weight-based multihead cross attention mechanism to learn the complex interaction features between different fragments. Experiments on three benchmark datasets show that FMCA-DTI achieves significantly improved performance compared with four state-of-the-art baselines. AVAILABILITY AND IMPLEMENTATION: The code for this workflow is available at: https://github.com/jacky102022/FMCA-DTI.
Affiliation(s)
- Qi Zhang
- College of Mathematics and Computer Science, Yan'an University, Yan'an 716000, China
| | - Le Zuo
- College of Mathematics and Computer Science, Yan'an University, Yan'an 716000, China
| | - Ying Ren
- College of Mathematics and Computer Science, Yan'an University, Yan'an 716000, China
| | - Siyuan Wang
- College of Mathematics and Computer Science, Yan'an University, Yan'an 716000, China
| | - Wenfa Wang
- College of Mathematics and Computer Science, Yan'an University, Yan'an 716000, China
| | - Lerong Ma
- College of Mathematics and Computer Science, Yan'an University, Yan'an 716000, China
| | - Jing Zhang
- Medical College of Yan'an University, Yan'an University, Yan'an 716000, China
- Medical Research and Experimental Center, The Second Affiliated Hospital of Xi'an Medical University, Xi'an 710021, China
| | - Bisheng Xia
- College of Mathematics and Computer Science, Yan'an University, Yan'an 716000, China
| |
|
13
|
Abubakar ML, Kapoor N, Sharma A, Gambhir L, Jasuja ND, Sharma G. Artificial Intelligence in Drug Identification and Validation: A Scoping Review. Drug Res (Stuttg) 2024; 74:208-219. [PMID: 38830370 DOI: 10.1055/a-2306-8311] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/05/2024]
Abstract
The end-to-end process in the discovery of drugs involves therapeutic candidate identification, validation of identified targets, identification of hit compound series, lead identification and optimization, characterization, and formulation and development. The process is lengthy, expensive, tedious, and inefficient, with a large attrition rate for novel drug discovery. Today, the pharmaceutical industry is focused on improving the drug discovery process. Finding and selecting acceptable drug candidates effectively can significantly impact the price and profitability of new medications. Aside from the cost, there is a need to reduce the end-to-end process time by limiting the number of experiments at various stages. To achieve this, artificial intelligence (AI) has been utilized at various stages of drug discovery. The present study aims to identify the recent work that has developed AI-based models at various stages of drug discovery, identify the stages that need more attention, present the taxonomy of AI methods in drug discovery, and provide research opportunities. The study identified all publications from January 2016 to September 1, 2023 that were cited in electronic databases including Scopus, NCBI PubMed, MEDLINE, Anthropology Plus, Embase, APA PsycInfo, SOCIndex, and CINAHL. Data were extracted using a standardized form, and possible research prospects are presented based on the analysis of the extracted data.
Affiliation(s)
| | - Neha Kapoor
- School of Applied Sciences, Suresh Gyan Vihar University, Jaipur, Rajasthan, India
| | - Asha Sharma
- Department of Zoology, Swargiya P. N. K. S. Govt. PG College, Dausa, Rajasthan, India
| | - Lokesh Gambhir
- School of Basic and Applied Sciences, Shri Guru Ram Rai University, Dehradun, Uttarakhand, India
| | | | - Gaurav Sharma
- School of Applied Sciences, Suresh Gyan Vihar University, Jaipur, Rajasthan, India
| |
|
14
|
Yang X, Wuchty S, Liang Z, Ji L, Wang B, Zhu J, Zhang Z, Dong Y. Multi-modal features-based human-herpesvirus protein-protein interaction prediction by using LightGBM. Brief Bioinform 2024; 25:bbae005. [PMID: 38279649 PMCID: PMC10818167 DOI: 10.1093/bib/bbae005] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/20/2023] [Revised: 12/25/2023] [Accepted: 01/01/2021] [Indexed: 01/28/2024] Open
Abstract
The identification of human-herpesvirus protein-protein interactions (PPIs) is an essential entry point for understanding the mechanisms of viral infection, especially in malignant tumor patients with common herpesvirus infection. While natural language processing (NLP)-based embedding techniques have emerged as powerful approaches, the application of multi-modal embedding feature fusion to predict human-herpesvirus PPIs is still limited. Here, we established a multi-modal embedding feature fusion-based LightGBM method to predict human-herpesvirus PPIs. In particular, we applied document and graph embedding approaches to represent sequence, network and function modal features of human and herpesviral proteins. Training our LightGBM models on our compiled non-rigorous and rigorous benchmarking datasets, we obtained significantly better performance compared to individual-modal features. Furthermore, our model outperformed traditional feature encoding-based machine learning methods and state-of-the-art deep learning-based methods on various benchmarking datasets. In a transfer learning step, we show that our model trained on a human-herpesvirus PPI dataset without cytomegalovirus data can reliably predict human-cytomegalovirus PPIs, indicating that our method can comprehensively capture multi-modal fusion features of protein interactions across various herpesvirus subtypes. The implementation of our method is available at https://github.com/XiaodiYangpku/MultimodalPPI/.
Affiliation(s)
- Xiaodi Yang
- Department of Hematology, Peking University First Hospital, Beijing, China
| | - Stefan Wuchty
- Department of Computer Science, University of Miami, Miami FL, 33146, USA
- Department of Biology, University of Miami, Miami FL, 33146, USA
- Institute of Data Science and Computation, University of Miami, Miami, FL 33146, USA
- Sylvester Comprehensive Cancer Center, University of Miami, Miami, FL 33136, USA
| | - Zeyin Liang
- Department of Hematology, Peking University First Hospital, Beijing, China
| | - Li Ji
- Department of Hematology, Peking University First Hospital, Beijing, China
| | - Bingjie Wang
- Department of Hematology, Peking University First Hospital, Beijing, China
| | - Jialin Zhu
- Department of Hematology, Peking University First Hospital, Beijing, China
| | - Ziding Zhang
- State Key Laboratory of Animal Biotech Breeding, College of Biological Sciences, China Agricultural University, Beijing 100193, China
| | - Yujun Dong
- Department of Hematology, Peking University First Hospital, Beijing, China
| |
|
15
|
Choubey J, Wolkenhauer O, Chatterjee T. Systems Biology Approach to Analyze Microarray Datasets for Identification of Disease-Causing Genes: Case Study of Oral Squamous Cell Carcinoma. Methods Mol Biol 2024; 2719:13-31. [PMID: 37803110 DOI: 10.1007/978-1-0716-3461-5_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/08/2023]
Abstract
The discovery of potential disease-causing genes can aid medical progress. The post-genomic era has made this a more difficult task. Modern high-throughput methods have not solved the problem of identifying disease genes. Conventional methods cannot be used to investigate many rare or lethal diseases. Monitoring gene expression values in different samples using microarray technology is one of the best and most accurate ways to identify disease-causing genes. One of the most recent advances in experimental molecular biology is microarrays, which allow researchers to simultaneously monitor the expression levels of thousands of genes. Statistical analysis of microarray data might aid gene discovery by revealing pathways related to the target gene and facilitating identification of candidate genes. Systems biology, an interdisciplinary approach, has emerged as a crucial analytic tool with the potential to reveal previously unidentified causes and consequences of human illness. Genetic, environmental, immunological, or neurological factors have been implicated in the development of complex disorders like cancer. Because of this, it is important to approach the study of such diseases from a novel perspective. The systems biology approach allows us to rapidly identify disease-causing genes and assess their viability as therapeutic targets. This chapter demonstrates systems biology approaches to identify candidate genes using public databases. Oral squamous cell carcinoma (OSCC) is used as a model disease to show how systems biology can be used successfully to identify and prioritize disease genes.
Affiliation(s)
| | - Olaf Wolkenhauer
- Department of Systems Biology & Bioinformatics, University of Rostock, Rostock, Germany
| | | |
|
16
|
Guo J, Zhang Y, Gao Y, Li S, Xu G, Tian Z, Xu Q, Li X, Li Y, Zhang Y. Systematical analyses of large-scale transcriptome reveal viral infection-related genes and disease comorbidities. ARTIFICIAL CELLS, NANOMEDICINE, AND BIOTECHNOLOGY 2023; 51:453-465. [PMID: 37651591 DOI: 10.1080/21691401.2023.2252477] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2023] [Revised: 08/13/2023] [Accepted: 08/17/2023] [Indexed: 09/02/2023]
Abstract
Perturbation of the transcriptome in patients with viral infection is a recurrent theme impacting symptoms and mortality, yet a detailed understanding of the pertinent transcriptome and the identification of robust biomarkers remain incomplete. In this study, we manually collected 23 datasets related to 6,197 blood transcriptomes across 16 types of respiratory virus infections. We applied a comprehensive systems biology approach starting with whole-blood transcriptomes combined with multilevel bioinformatics analyses to characterize the expression, functional pathways, and protein-protein interaction (PPI) networks to identify robust biomarkers and disease comorbidities. Robust gene markers of infection with different viruses were identified, which can accurately classify normal and infected patients in training and validation cohorts. The biological processes (BP) of different viruses showed great similarity and were enriched in infection and immune response pathways. Network-based analyses revealed that a variety of viral infections were associated with nervous system diseases, neoplasms and metabolic diseases, and significantly correlated with brain tissues. In summary, our manually collected transcriptomes and comprehensive analyses reveal key molecular markers and disease comorbidities in the process of viral infection, which could provide a valuable theoretical basis for the prevention of subsequent public health events for respiratory virus infections.
Affiliation(s)
- Jing Guo
- Key Laboratory of Tropical Translational Medicine of Ministry of Education, College of Biomedical Information and Engineering, Hainan Women and Children's Medical Center, Hainan Medical University, Haikou, Hainan, China
| | - Ya Zhang
- Key Laboratory of Tropical Translational Medicine of Ministry of Education, College of Biomedical Information and Engineering, Hainan Women and Children's Medical Center, Hainan Medical University, Haikou, Hainan, China
| | - Yueying Gao
- Key Laboratory of Tropical Translational Medicine of Ministry of Education, College of Biomedical Information and Engineering, Hainan Women and Children's Medical Center, Hainan Medical University, Haikou, Hainan, China
| | - Si Li
- Key Laboratory of Tropical Translational Medicine of Ministry of Education, College of Biomedical Information and Engineering, Hainan Women and Children's Medical Center, Hainan Medical University, Haikou, Hainan, China
| | - Gang Xu
- Key Laboratory of Tropical Translational Medicine of Ministry of Education, College of Biomedical Information and Engineering, Hainan Women and Children's Medical Center, Hainan Medical University, Haikou, Hainan, China
| | - Zhanyu Tian
- Key Laboratory of Tropical Translational Medicine of Ministry of Education, College of Biomedical Information and Engineering, Hainan Women and Children's Medical Center, Hainan Medical University, Haikou, Hainan, China
| | - Qi Xu
- Key Laboratory of Tropical Translational Medicine of Ministry of Education, College of Biomedical Information and Engineering, Hainan Women and Children's Medical Center, Hainan Medical University, Haikou, Hainan, China
| | - Xia Li
- Key Laboratory of Tropical Translational Medicine of Ministry of Education, College of Biomedical Information and Engineering, Hainan Women and Children's Medical Center, Hainan Medical University, Haikou, Hainan, China
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| | - Yongsheng Li
- Key Laboratory of Tropical Translational Medicine of Ministry of Education, College of Biomedical Information and Engineering, Hainan Women and Children's Medical Center, Hainan Medical University, Haikou, Hainan, China
| | - Yunpeng Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| |
|
17
|
Liyaqat T, Ahmad T, Saxena C. TeM-DTBA: time-efficient drug target binding affinity prediction using multiple modalities with Lasso feature selection. J Comput Aided Mol Des 2023; 37:573-584. [PMID: 37777631 DOI: 10.1007/s10822-023-00533-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/08/2023] [Accepted: 09/07/2023] [Indexed: 10/02/2023]
Abstract
Drug discovery, especially virtual screening and drug repositioning, can be accelerated through deeper understanding and prediction of Drug Target Interactions (DTIs). The advancement of deep learning as well as the time and financial costs associated with conventional wet-lab experiments have made computational methods for DTI prediction more popular. However, the majority of these computational methods handle the DTI problem as a binary classification task, ignoring the quantitative binding affinity that determines the efficacy of drugs against their target proteins. Moreover, the memory footprint and execution time of a model are often neglected in favor of accuracy. To address these challenges, we introduce a novel method, called Time-efficient Multimodal Drug Target Binding Affinity (TeM-DTBA), which predicts the binding affinity between drugs and targets by fusing different modalities based on compound structures and target sequences. We employ the Lasso feature selection method, which lowers the dimensionality of feature vectors and reduces the training time of the proposed model by more than 50%. The results from two benchmark datasets demonstrate that our method outperforms state-of-the-art methods. The mean squared errors of 18.8% and 23.19%, achieved on the KIBA and Davis datasets, respectively, suggest that our method is more accurate in predicting drug-target binding affinity.
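Lasso feature selection, as named in this abstract, keeps only the features whose fitted coefficients are non-zero under an L1 penalty. The following is a minimal sketch of that idea using plain coordinate descent on hypothetical toy data; the function names, the penalty value `alpha`, and the data are all assumptions for illustration and do not reproduce the TeM-DTBA implementation or its features.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the Lasso proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_sweeps=100):
    """Minimise (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1 by cyclic updates."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # mean squared value of feature j
            for i in range(n):
                pred_without_j = sum(w[k] * X[i][k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            w[j] = soft_threshold(rho, alpha) / z if z > 0 else 0.0
    return w


# Hypothetical toy data: three mutually orthogonal features, where the
# target depends strongly on features 0 and 1 and only weakly on feature 2.
X = [[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
y = [2 * a - 3 * b + 0.05 * c for a, b, c in X]

weights = lasso_coordinate_descent(X, y, alpha=0.1)
selected = [j for j, w_j in enumerate(weights) if w_j != 0.0]
print("weights:", weights)
print("selected feature indices:", selected)
```

Because the weak feature's correlation with the target falls below the penalty, its coefficient is set exactly to zero and the feature is dropped, which is how the method reduces dimensionality before model training.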
Affiliation(s)
- Tanya Liyaqat
- Department of Computer Engineering, Jamia Millia Islamia, New Delhi, India.
| | - Tanvir Ahmad
- Department of Computer Engineering, Jamia Millia Islamia, New Delhi, India
| | - Chandni Saxena
- The Chinese University of Hong Kong, Sha Tin, SAR, China
| |
|
18
|
Muneeb Hassan M, Ameeq M, Jamal F, Tahir MH, Mendy JT. Prevalence of covid-19 among patients with chronic obstructive pulmonary disease and tuberculosis. Ann Med 2023; 55:285-291. [PMID: 36594409 PMCID: PMC9815254 DOI: 10.1080/07853890.2022.2160491] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Indexed: 01/04/2023] Open
Abstract
BACKGROUND: Exhaustive information about non-communicable diseases associated with COVID-19 and severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is becoming easier to find in the literature. However, there is a lack of knowledge regarding tuberculosis (TB) and chronic obstructive pulmonary disease (COPD), with numerous infections in COVID-19 patients. OBJECTIVES: Priority is placed on determining the patient's prognosis based on the presence or absence of TB and COPD. Additionally, a comparison is made between the risk of death and the likelihood of recovery in terms of time in COVID-19 patients who have either COPD or TB. METHODOLOGY: At the DHQ Hospital in Muzaffargarh, Punjab, Pakistan, 498 COVID-19 patients with TB and COPD were studied retrospectively. The study ran from February 2022 to August 2022. Kaplan-Meier curves described time-to-death and time-to-recovery stratified by TB and COPD status. The Wilcoxon test compared the survival rates of people with TB and COPD in two matched paired groups and their status differences with their standard of living. RESULTS: The risk of death in COVID-19 patients with TB was 1.476 times higher than in those without (95% CI: 0.949-2.295). The recovery risk in COVID-19 patients with TB was 0.677 times lower than in those without (95% CI: 0.436-1.054). Similarly, patients with TB had a significantly shorter time to death (p=.001) and longer time to recovery (p=.001). CONCLUSIONS: According to the findings, the most significant contributor to an increased risk of morbidity and mortality in TB and COPD patients was COVID-19. KEY MESSAGES: SARS-CoV-2 is a new challenge in terms of prevention and treatment for people with tuberculosis and chronic obstructive pulmonary disease, among other diseases. Propensity score matching was used to control for potential biases. Hospitalized patients with and without TB and COPD had an equivalently higher mortality rate.
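The Kaplan-Meier curves mentioned in the methodology estimate survival as a running product over observed event times, with censored patients leaving the risk set without counting as events. A minimal sketch of the estimator follows; the function name and the follow-up times are hypothetical illustrations and are not data from the study.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    events[i] is True when the event (e.g. death) was observed at
    durations[i], and False when the observation was censored.
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0  # events at time t
        leaving = 0   # events plus censored observations at time t
        while i < len(data) and data[i][0] == t:
            observed += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if observed > 0:
            survival *= 1.0 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up times in days; False marks a censored patient.
durations = [2, 3, 3, 5, 8]
events = [True, True, False, True, False]

curve = kaplan_meier(durations, events)
for time, probability in curve:
    print(f"t={time}: S(t)={probability:.3f}")
```

Stratifying by a covariate such as TB status amounts to calling the function once per group and comparing the resulting curves.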
Affiliation(s)
| | - Muhammad Ameeq
- Department of Statistics, The Islamia University, Bahawalpur, Pakistan
| | - Farrukh Jamal
- Department of Statistics, The Islamia University, Bahawalpur, Pakistan
| | - Muhammad H Tahir
- Department of Statistics, The Islamia University, Bahawalpur, Pakistan
| | - John T Mendy
- Department of Mathematics, School of Arts and Science, University of The Gambia, Serekunda, The Gambia
| |
|
19
|
Wang Y, Pan Z, Mou M, Xia W, Zhang H, Zhang H, Liu J, Zheng L, Luo Y, Zheng H, Yu X, Lian X, Zeng Z, Li Z, Zhang B, Zheng M, Li H, Hou T, Zhu F. A task-specific encoding algorithm for RNAs and RNA-associated interactions based on convolutional autoencoder. Nucleic Acids Res 2023; 51:e110. [PMID: 37889083 PMCID: PMC10682500 DOI: 10.1093/nar/gkad929] [Citation(s) in RCA: 30] [Impact Index Per Article: 15.0] [Received: 05/10/2022] [Revised: 08/01/2023] [Accepted: 10/10/2023] [Indexed: 10/28/2023] Open
Abstract
RNAs play essential roles in diverse physiological and pathological processes by interacting with other molecules (RNA/protein/compound), and various computational methods are available for identifying these interactions. However, the encoding features provided by existing methods are limited and the existing tools do not offer an effective way to integrate the interacting partners. In this study, a task-specific encoding algorithm for RNAs and RNA-associated interactions was therefore developed. This new algorithm was unique in (a) realizing comprehensive RNA feature encoding by introducing a great many novel features and (b) enabling task-specific integration of interacting partners using convolutional autoencoder-directed feature embedding. Compared with existing methods/tools, this novel algorithm demonstrated superior performance in diverse benchmark testing studies. This algorithm together with its source code can be readily accessed by all users at: https://idrblab.org/corain/ and https://github.com/idrblab/corain/.
Affiliation(s)
- Yunxia Wang
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Polytechnic Institute, Zhejiang University, Hangzhou 310058, China
| | - Ziqi Pan
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Polytechnic Institute, Zhejiang University, Hangzhou 310058, China
| | - Minjie Mou
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Polytechnic Institute, Zhejiang University, Hangzhou 310058, China
| | - Weiqi Xia
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Polytechnic Institute, Zhejiang University, Hangzhou 310058, China
| | - Hongning Zhang
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Polytechnic Institute, Zhejiang University, Hangzhou 310058, China
| | - Hanyu Zhang
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Polytechnic Institute, Zhejiang University, Hangzhou 310058, China
| | - Jin Liu
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Polytechnic Institute, Zhejiang University, Hangzhou 310058, China
| | - Lingyan Zheng
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Polytechnic Institute, Zhejiang University, Hangzhou 310058, China
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-ZJU Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
| | - Yongchao Luo
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Polytechnic Institute, Zhejiang University, Hangzhou 310058, China
| | - Hanqi Zheng
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Polytechnic Institute, Zhejiang University, Hangzhou 310058, China
| | - Xinyuan Yu
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Polytechnic Institute, Zhejiang University, Hangzhou 310058, China
| | - Xichen Lian
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Polytechnic Institute, Zhejiang University, Hangzhou 310058, China
| | - Zhenyu Zeng
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-ZJU Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
| | - Zhaorong Li
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-ZJU Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
| | - Bing Zhang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-ZJU Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
| | - Mingyue Zheng
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Polytechnic Institute, Zhejiang University, Hangzhou 310058, China
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
| | - Honglin Li
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Polytechnic Institute, Zhejiang University, Hangzhou 310058, China
- School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
| | - Tingjun Hou
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Polytechnic Institute, Zhejiang University, Hangzhou 310058, China
| | - Feng Zhu
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Polytechnic Institute, Zhejiang University, Hangzhou 310058, China
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-ZJU Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
- Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang, China
| |
|
20
|
Sun M, Hu H, Pang W, Zhou Y. ACP-BC: A Model for Accurate Identification of Anticancer Peptides Based on Fusion Features of Bidirectional Long Short-Term Memory and Chemically Derived Information. Int J Mol Sci 2023; 24:15447. [PMID: 37895128 PMCID: PMC10607064 DOI: 10.3390/ijms242015447] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 08/12/2023] [Revised: 09/10/2023] [Accepted: 10/20/2023] [Indexed: 10/29/2023] Open
Abstract
Anticancer peptides (ACPs) have been proven to possess potent anticancer activities. Although computational methods have emerged for rapid ACP identification, their accuracy still needs improvement. In this study, we propose a model called ACP-BC, a three-channel end-to-end model that utilizes various combinations of data augmentation techniques. In the first channel, features are extracted from the raw sequence using a bidirectional long short-term memory network. In the second channel, the entire sequence is converted into a chemical molecular formula, which is further simplified using Simplified Molecular Input Line Entry System notation to obtain deep abstract features through bidirectional encoder representations from transformers (BERT). In the third channel, we manually selected four effective features according to dipeptide composition, binary profile feature, k-mer sparse matrix, and pseudo amino acid composition. Notably, the application of chemical BERT in predicting ACPs is novel and successfully integrated into our model. To validate the performance of our model, we selected two benchmark datasets, ACPs740 and ACPs240. ACP-BC achieved prediction accuracies of 87% and 90% on these two datasets, respectively, representing improvements of 1.3% and 7% compared to existing state-of-the-art methods on these datasets. Therefore, systematic comparative experiments have shown that ACP-BC can effectively identify anticancer peptides.
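Of the hand-crafted features listed for the third channel, dipeptide composition is the simplest to illustrate: it is the relative frequency of each of the 400 ordered amino-acid pairs in a sequence. The sketch below is one common way to compute it; the function name, the handling of non-standard residues, and the example peptide are assumptions, not details taken from ACP-BC.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return a 400-dimensional vector of dipeptide frequencies.

    Pairs containing a non-standard residue are skipped (an assumption
    made here for simplicity).
    """
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


# Hypothetical peptide used only to exercise the function.
vector = dipeptide_composition("FAKKLAKKLL")
print(len(vector), vector[DIPEPTIDES.index("KK")])
```

The resulting fixed-length vector can be concatenated with other descriptors regardless of peptide length, which is what makes it convenient for feature fusion.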
Affiliation(s)
- Mingwei Sun
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China; (M.S.); (H.H.)
| | - Haoyuan Hu
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China; (M.S.); (H.H.)
| | - Wei Pang
- School of Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh EH14 4AS, UK;
| | - You Zhou
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China; (M.S.); (H.H.)
- College of Software, Jilin University, Changchun 130012, China
| |
|
21
|
Khojasteh H, Pirgazi J, Ghanbari Sorkhi A. Improving prediction of drug-target interactions based on fusing multiple features with data balancing and feature selection techniques. PLoS One 2023; 18:e0288173. [PMID: 37535616 PMCID: PMC10399861 DOI: 10.1371/journal.pone.0288173] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 02/08/2023] [Accepted: 06/21/2023] [Indexed: 08/05/2023] Open
Abstract
Drug discovery relies on predicting drug-target interaction (DTI), which is an important and challenging task. The purpose of DTI prediction is to identify the interaction between drug chemical compounds and protein targets. Traditional wet lab experiments are time-consuming and expensive, which is why the use of computational methods based on machine learning has attracted the attention of many researchers in recent years. A dry lab environment focusing more on computational methods of interaction prediction can be helpful in limiting the search space for wet lab experiments. In this paper, a novel multi-stage approach for DTI prediction, called SRX-DTI, is proposed. In the first stage, a combination of various descriptors from protein sequences and an FP2 fingerprint encoded from the drug are extracted as feature vectors. A major challenge in this application is the imbalanced data due to the lack of known interactions. In this regard, in the second stage, the One-SVM-US technique is proposed to deal with this problem. Next, the FFS-RF algorithm, a forward feature selection algorithm coupled with a random forest (RF) classifier, is developed to maximize the predictive performance. This feature selection algorithm removes irrelevant features to obtain optimal features. Finally, the balanced dataset with optimal features is given to the XGBoost classifier to identify DTIs. The experimental results demonstrate that our proposed approach SRX-DTI achieves higher performance than other existing methods in predicting DTIs. The datasets and source code are available at: https://github.com/Khojasteh-hb/SRX-DTI.
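Forward feature selection, the idea behind the FFS-RF stage described above, greedily adds the feature that most improves a classifier's score and stops when no candidate helps. The sketch below illustrates only that greedy loop: a nearest-centroid scorer is swapped in for the random forest to keep the example dependency-free, and the function names, stopping tolerance, and toy data are all hypothetical rather than taken from SRX-DTI.

```python
def nearest_centroid_accuracy(X, y, subset):
    """Training accuracy of a nearest-centroid classifier on the given features."""
    if not subset:
        # With no features, fall back to the majority-class baseline.
        return max(y.count(c) for c in set(y)) / len(y)
    classes = sorted(set(y))
    centroids = {}
    for c in classes:
        rows = [x for x, label in zip(X, y) if label == c]
        centroids[c] = [sum(r[f] for r in rows) / len(rows) for f in subset]
    correct = 0
    for x, label in zip(X, y):
        def squared_distance(c):
            return sum((x[f] - m) ** 2 for f, m in zip(subset, centroids[c]))
        correct += min(classes, key=squared_distance) == label
    return correct / len(y)


def forward_select(X, y, score=nearest_centroid_accuracy, min_gain=1e-9):
    """Greedily add the feature with the largest score gain until none helps."""
    selected = []
    best = score(X, y, selected)
    remaining = list(range(len(X[0])))
    while remaining:
        candidates = [(score(X, y, selected + [f]), f) for f in remaining]
        top_score, top_feature = max(candidates, key=lambda c: (c[0], -c[1]))
        if top_score - best <= min_gain:
            break
        selected.append(top_feature)
        remaining.remove(top_feature)
        best = top_score
    return selected, best


# Hypothetical toy data: feature 0 separates the classes, 1 and 2 do not.
X = [[0.1, 0.5, 0.3], [0.2, 0.1, 0.3], [0.0, 0.9, 0.3],
     [0.9, 0.4, 0.3], [1.0, 0.2, 0.3], [0.8, 0.8, 0.3]]
y = [0, 0, 0, 1, 1, 1]

selected, best = forward_select(X, y)
print("selected features:", selected, "score:", best)
```

In practice the score would come from cross-validation rather than training accuracy, since scoring on the training data alone tends to favour overfitted feature subsets.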
Affiliation(s)
- Hakimeh Khojasteh
  - Department of Computer Engineering, University of Zanjan, Zanjan, Iran
  - School of Biological Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran
- Jamshid Pirgazi
  - School of Biological Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran
  - Department of Computer Engineering, University of Science and Technology of Mazandaran, Behshahr, Iran
- Ali Ghanbari Sorkhi
  - Department of Computer Engineering, University of Science and Technology of Mazandaran, Behshahr, Iran
|
22
|
Novak R, Salai G, Hrkac S, Vojtusek IK, Grgurevic L. Revisiting the Role of NAG across the Continuum of Kidney Disease. Bioengineering (Basel) 2023; 10:bioengineering10040444. [PMID: 37106631 PMCID: PMC10136202 DOI: 10.3390/bioengineering10040444] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2022] [Revised: 03/28/2023] [Accepted: 03/30/2023] [Indexed: 04/07/2023]
Abstract
Acute and chronic kidney diseases are an evolving continuum for which reliable biomarkers of early disease are lacking. The potential use of glycosidases, enzymes involved in carbohydrate metabolism, in kidney disease detection has been under investigation since the 1960s. N-acetyl-beta-D-glucosaminidase (NAG) is a glycosidase commonly found in proximal tubule epithelial cells (PTECs). Due to its large molecular weight, plasma-soluble NAG cannot pass the glomerular filtration barrier; thus, increased urinary concentration of NAG (uNAG) may suggest injury to the proximal tubule. As the PTECs are the workhorses of the kidney that perform much of the filtration and reabsorption, they are a common starting point in acute and chronic kidney disease. NAG has previously been researched, and it is widely used as a valuable biomarker in both acute and chronic kidney disease, as well as in patients suffering from diabetes mellitus, heart failure, and other chronic diseases leading to kidney failure. Here, we present an overview of the research pertaining to uNAG’s biomarker potential across the spectrum of kidney disease, with an additional emphasis on environmental nephrotoxic substance exposure. In spite of a large body of evidence strongly suggesting connections between uNAG levels and multiple kidney pathologies, focused clinical validation tests and knowledge of the underlying molecular mechanisms are largely lacking.
Affiliation(s)
- Ruder Novak
  - Center for Translational and Clinical Research, Department of Proteomics, School of Medicine, University of Zagreb, 10000 Zagreb, Croatia
- Grgur Salai
  - Department of Pulmonology, University Hospital Dubrava, 10000 Zagreb, Croatia
- Stela Hrkac
  - Department of Clinical Immunology, Allergology and Rheumatology, University Hospital Dubrava, 10000 Zagreb, Croatia
- Ivana Kovacevic Vojtusek
  - Department of Nephrology, Arterial Hypertension, Dialysis and Transplantation, University Hospital Center Zagreb, 10000 Zagreb, Croatia
- Lovorka Grgurevic
  - Center for Translational and Clinical Research, Department of Proteomics, School of Medicine, University of Zagreb, 10000 Zagreb, Croatia
  - Department of Anatomy, “Drago Perovic”, School of Medicine, University of Zagreb, 10000 Zagreb, Croatia
|
23
|
Abbasi Mesrabadi H, Faez K, Pirgazi J. Drug-target interaction prediction based on protein features, using wrapper feature selection. Sci Rep 2023; 13:3594. [PMID: 36869062 PMCID: PMC9984486 DOI: 10.1038/s41598-023-30026-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/19/2022] [Accepted: 02/14/2023] [Indexed: 03/05/2023]
Abstract
Drug-target interaction prediction is a vital stage in drug development, and many methods have been proposed for it. Experimental methods that identify these relationships on the basis of clinical remedies are time-consuming, costly, laborious, and complex, which introduces many challenges. Computational methods form a newer group of approaches; more accurate computational methods can be preferable to experimental methods in terms of total cost and time. In this paper, a new computational model to predict drug-target interaction (DTI) is proposed, consisting of three phases: feature extraction, feature selection, and classification. In the feature extraction phase, different features such as EAAC and PSSM, among others, are extracted from protein sequences, and fingerprint features are extracted from drugs. These extracted features are then combined. In the next step, because of the large amount of extracted data, a wrapper feature selection method named IWSSR is applied. The selected features are then given to a rotation forest classifier for more efficient prediction. The innovation of our work is that we extract different features and then select features using IWSSR. The accuracy of the rotation forest classifier, based on tenfold cross-validation on the gold standard datasets (enzyme, ion channels, G-protein-coupled receptors, nuclear receptors), is 98.12, 98.07, 96.82, and 95.64, respectively. The experimental results indicate that the proposed model has an acceptable rate in DTI prediction and is comparable with the methods proposed in other papers.
Affiliation(s)
- Hengame Abbasi Mesrabadi
  - Faculty of Computer and Information Technology Engineering, Qazvin Branch, Islamic Azad University, Qazvin, Iran
- Karim Faez
  - Department of Electrical Engineering, Amirkabir University of Technology (Tehran Polytechnic), Tehran, Iran
- Jamshid Pirgazi
  - Department of Computer Engineering, University of Science and Technology of Mazandaran, Behshahr, Iran
|
24
|
Identification of Potential Key Genes and Prognostic Biomarkers of Lung Cancer Based on Bioinformatics. BIOMED RESEARCH INTERNATIONAL 2023; 2023:2152432. [PMID: 36714024 PMCID: PMC9876670 DOI: 10.1155/2023/2152432] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/06/2022] [Revised: 10/31/2022] [Accepted: 11/17/2022] [Indexed: 01/19/2023]
Abstract
Objective: To analyze and identify the core genes related to the expression and prognosis of lung cancer, including lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC), by bioinformatics technology, with the aim of providing a reference for clinical treatment. Methods: Five sets of gene chips, GSE7670, GSE151102, GSE33532, GSE43458, and GSE19804, were obtained from the Gene Expression Omnibus (GEO) database. After using GEO2R to analyze the differentially expressed genes (DEGs) between lung cancer and normal tissues online, the common DEGs of the five sets of chips were obtained using a Venn online tool and imported into the Database for Annotation, Visualization, and Integrated Discovery (DAVID) for Gene Ontology (GO) enrichment and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses. The protein-protein interaction (PPI) network was constructed by STRING online software for further study, and the core genes were determined by Cytoscape software and KEGG pathway enrichment analysis. The clustering heat map was drawn by Excel software to verify its accuracy. In addition, we used the University of Alabama at Birmingham Cancer (UALCAN) website to analyze the expression of core genes in P53 mutation status, confirmed the expression of crucial core genes in lung cancer tissues with Gene Expression Profiling Interactive Analysis (GEPIA) and GEPIA2 online software, and evaluated their prognostic value in lung cancer patients with the Kaplan-Meier online plotter tool. Results: CHEK1, CCNB1, CCNB2, and CDK1 were selected. The expression levels of these four genes in lung cancer tissues were significantly higher than those in normal tissues. Their increased expression was negatively correlated with the prognosis and survival rate of lung cancer patients (including LUAD and LUSC). Conclusion: CHEK1, CCNB1, CCNB2, and CDK1 are the critical core genes of lung cancer and are highly expressed in lung cancer. They are negatively correlated with the prognosis of lung cancer patients (including LUAD and LUSC) and closely related to the formation and prediction of lung cancer. They are valuable predictors and may be predictive biomarkers of lung cancer.
|
25
|
Tian Z, Peng X, Fang H, Zhang W, Dai Q, Ye Y. MHADTI: predicting drug-target interactions via multiview heterogeneous information network embedding with hierarchical attention mechanisms. Brief Bioinform 2022; 23:6761042. [PMID: 36242566 DOI: 10.1093/bib/bbac434] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 06/05/2022] [Revised: 08/19/2022] [Accepted: 09/08/2022] [Indexed: 12/14/2022]
Abstract
MOTIVATION: Discovering drug-target interactions (DTIs) is a crucial step in drug development tasks such as the identification of drug side effects and drug repositioning. Since identifying DTIs by wet-lab biological experiments is time-consuming and costly, many computational approaches have been proposed and have become an efficient way to infer potential interactions. Although extensive effort has been invested in this task, the prediction accuracy still needs to be improved. In particular, heterogeneous network-based approaches do not fully consider the complex structure and rich semantic information in these heterogeneous networks. Therefore, it is still a challenge to predict DTIs efficiently. RESULTS: In this study, we develop a novel method via Multiview heterogeneous information network embedding with Hierarchical Attention mechanisms to discover potential Drug-Target Interactions (MHADTI). Firstly, MHADTI constructs different similarity networks for drugs and targets by utilizing their multisource information. Combined with the known DTI network, three drug-target heterogeneous information networks (HINs) with different views are established. Secondly, MHADTI learns embeddings of drugs and targets from multiview HINs with hierarchical attention mechanisms, which include node-level, semantic-level and graph-level attention. Lastly, MHADTI employs a multilayer perceptron to predict DTIs with the learned deep feature representations. The hierarchical attention mechanisms can fully consider the importance of nodes, meta-paths and graphs in learning the feature representations of drugs and targets, which makes their embeddings more comprehensive. Extensive experimental results demonstrate that MHADTI performs better than other SOTA prediction models. Moreover, analysis of prediction results for some drugs and targets of interest further indicates that MHADTI has advantages in discovering DTIs.
AVAILABILITY AND IMPLEMENTATION https://github.com/pxystudy/MHADTI.
Affiliation(s)
- Zhen Tian
  - School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
- Xiangyu Peng
  - School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
- Haichuan Fang
  - School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
- Wenjie Zhang
  - School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
- Qiguo Dai
  - School of Computer Science and Engineering, Dalian Minzu University, Dalian 116600, China
- Yangdong Ye
  - School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
|
26
|
Kurata H, Tsukiyama S. ICAN: Interpretable cross-attention network for identifying drug and target protein interactions. PLoS One 2022; 17:e0276609. [PMID: 36279284 PMCID: PMC9591068 DOI: 10.1371/journal.pone.0276609] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 09/05/2022] [Accepted: 10/10/2022] [Indexed: 11/18/2022]
Abstract
Drug-target protein interaction (DTI) identification is fundamental for drug discovery and drug repositioning, because therapeutic drugs act on disease-causing proteins. However, the DTI identification process often requires expensive and time-consuming tasks, including biological experiments involving large numbers of candidate compounds. Thus, a variety of computation approaches have been developed. Of the many approaches available, chemo-genomics feature-based methods have attracted considerable attention. These methods compute the feature descriptors of drugs and proteins as the input data to train machine and deep learning models to enable accurate prediction of unknown DTIs. In addition, attention-based learning methods have been proposed to identify and interpret DTI mechanisms. However, improvements are needed for enhancing prediction performance and DTI mechanism elucidation. To address these problems, we developed an attention-based method designated the interpretable cross-attention network (ICAN), which predicts DTIs using the Simplified Molecular Input Line Entry System of drugs and amino acid sequences of target proteins. We optimized the attention mechanism architecture by exploring the cross-attention or self-attention, attention layer depth, and selection of the context matrixes from the attention mechanism. We found that a plain attention mechanism that decodes drug-related protein context features without any protein-related drug context features effectively achieved high performance. The ICAN outperformed state-of-the-art methods in several metrics on the DAVIS dataset and first revealed with statistical significance that some weighted sites in the cross-attention weight matrix represent experimental binding sites, thus demonstrating the high interpretability of the results. The program is freely available at https://github.com/kuratahiroyuki/ICAN.
Affiliation(s)
- Hiroyuki Kurata
  - Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, Iizuka, Fukuoka, Japan
- Sho Tsukiyama
  - Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, Iizuka, Fukuoka, Japan
|
27
|
ACP-ADA: A Boosting Method with Data Augmentation for Improved Prediction of Anticancer Peptides. Int J Mol Sci 2022; 23:ijms232012194. [PMID: 36293050 PMCID: PMC9603247 DOI: 10.3390/ijms232012194] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 08/29/2022] [Revised: 10/08/2022] [Accepted: 10/11/2022] [Indexed: 11/30/2022]
Abstract
Cancer is the second-leading cause of death worldwide, and therapeutic peptides that target and destroy cancer cells have received a great deal of interest in recent years. Traditional wet-lab experiments are expensive and inefficient for identifying novel anticancer peptides (ACPs); therefore, the development of an effective computational approach is essential to recognize ACP candidates before experimental methods are used. In this study, we proposed ACP-ADA, an AdaBoost algorithm with random forest as the base learner, which integrates binary profile features, amino acid index, and amino acid composition into a 210-dimensional feature vector to represent the peptides. Training samples in the feature space were augmented to increase the sample size and further improve the performance of the model in the case of insufficient samples. Furthermore, we used five-fold cross-validation to find model parameters, and the cross-validation results showed that ACP-ADA, with this feature combination and data augmentation, outperforms existing methods in terms of performance metrics. Specifically, ACP-ADA recorded an average accuracy of 86.4% and a Matthews correlation coefficient of 74.01% for dataset ACP740, and 90.83% and 81.65% for dataset ACP240; consequently, it can be a very useful tool in drug development and biomedical research.
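Several abstracts in this list report the Matthews correlation coefficient (MCC). As a reminder of what the metric measures, here is a minimal sketch of its standard definition from confusion-matrix counts. The counts used in the example are made up for illustration and are not taken from any of the cited papers. Note that MCC lies in [-1, 1], so a reported value of 74.01% corresponds to 0.7401 on that scale.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC from confusion-matrix counts; returns 0.0 when undefined."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # common convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denominator


# Illustrative counts only (hypothetical balanced test set of 100 peptides).
mcc = matthews_corrcoef(tp=45, tn=45, fp=5, fn=5)
print(f"MCC = {mcc:.2f}")
```

Unlike plain accuracy, MCC uses all four cells of the confusion matrix, which is why it is often reported alongside accuracy when classes may be imbalanced.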
|
28
|
Han X, Wang F, Yang P, Di B, Xu X, Zhang C, Yao M, Sun Y, Lin Y. A Bioinformatic Approach Based on Systems Biology to Determine the Effects of SARS-CoV-2 Infection in Patients with Hypertrophic Cardiomyopathy. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2022; 2022:5337380. [PMID: 36203534 PMCID: PMC9532139 DOI: 10.1155/2022/5337380] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/10/2022] [Revised: 08/26/2022] [Accepted: 09/01/2022] [Indexed: 11/18/2022]
Abstract
Recently, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the causative agent of coronavirus disease 2019 (COVID-19), has infected millions of individuals worldwide. While COVID-19 generally affects the lungs, it also damages other organs, including those of the cardiovascular system. Hypertrophic cardiomyopathy (HCM) is a common genetic cardiovascular disorder. Studies have shown that HCM patients with COVID-19 have a higher mortality rate; however, the reason for this phenomenon is not yet elucidated. Herein, we conducted transcriptomic analyses to identify shared biomarkers between HCM and COVID-19 to bridge this knowledge gap. Differentially expressed genes (DEGs) were obtained using the Gene Expression Omnibus ribonucleic acid (RNA) sequencing datasets, GSE147507 and GSE89714, to identify shared pathways and potential drug candidates. We discovered 30 DEGs that were common between these two datasets. Using a combination of statistical and biological tools, protein-protein interactions were constructed in response to these findings to support hub genes and modules. We discovered that HCM is linked to COVID-19 progression based on a functional analysis under ontology terms. Based on the DEGs identified from the datasets, a coregulatory network of transcription factors, genes, proteins, and microRNAs was also discovered. Lastly, our research suggests that the potential drugs we identified might be helpful for COVID-19 therapy.
Affiliation(s)
- Xiao Han
  - Department of Cardiology, Jiading District Central Hospital Affiliated Shanghai University of Medicine & Health Sciences, Shanghai, China
- Fei Wang
  - Department of Emergency Medicine, Jiading District Central Hospital Affiliated Shanghai University of Medicine & Health Sciences, Shanghai, China
- Ping Yang
  - Department of Pharmacy, Xinhua Hospital Affiliated to Shanghai Jiaotong University School of Medicine, Shanghai, China
- Bin Di
  - Department of Cardiology, Jiading District Central Hospital Affiliated Shanghai University of Medicine & Health Sciences, Shanghai, China
- Xiangdong Xu
  - Department of Cardiology, Jiading District Central Hospital Affiliated Shanghai University of Medicine & Health Sciences, Shanghai, China
- Chunya Zhang
  - Department of Cardiology, Jiading District Central Hospital Affiliated Shanghai University of Medicine & Health Sciences, Shanghai, China
- Man Yao
  - Department of Cardiology, Jiading District Central Hospital Affiliated Shanghai University of Medicine & Health Sciences, Shanghai, China
- Yaping Sun
  - Department of Cardiology, Jiading District Central Hospital Affiliated Shanghai University of Medicine & Health Sciences, Shanghai, China
- Yangyi Lin
  - Department of Pulmonary Vascular Disease, State Key Laboratory of Cardiovascular Disease, Fuwai Hospital, National Center for Cardiovascular Diseases, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
|
29
|
Yuan Y, Yan H, Cui Z, Liu Z, Su W, Zhang R. Quantum Chemical Calculations with Machine Learning for Multipolar Electrostatics Prediction in RNA: An Application to Pentose. J Chem Inf Model 2022; 62:4122-4133. [PMID: 36036609 DOI: 10.1021/acs.jcim.2c00747] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
To develop a realistic electrostatic model that allows for the anisotropy of the atomic electron density, high-rank atomic multipole moments computed by quantum chemical calculations have been studied extensively. However, it is hard to process huge RNA systems relying only on quantum chemical calculations because of their high computational cost. In this study, we employ five machine learning methods, namely Gaussian process regression with automatic relevance determination (ARDGPR), Kriging, radial basis function neural networks, Bagging, and generalized regression neural networks, to predict atomic multipole moments. Atom-atom electrostatic interaction energies are subsequently computed using the predicted atomic multipole moments in the pilot system, the pentose of RNA. Here, the performance of the five methods is compared in terms of both the multipole moment prediction errors and the electrostatic energy prediction errors. For the predicted high-rank multipole moments of the four elements (O, C, N, and H) in capped pentose, ARDGPR and Kriging consistently outperform the other three methods. Therefore, the multipole moments predicted by the two best methods, ARDGPR and Kriging, are then used to predict the electrostatic interaction energy of each pentose. Finally, the absolute average energy errors of ARDGPR and Kriging are 1.83 and 4.33 kJ mol⁻¹, respectively. Compared to Kriging, the ARDGPR method achieves a 58% decrease in the absolute average energy error. These satisfactory results demonstrate that the ARDGPR method, with its strong feature extraction ability, can predict the electrostatic interaction energy of pentose in RNA correctly and reliably.
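The 58% figure quoted above follows directly from the two reported errors. The short check below recomputes it; the helper name `relative_decrease` is ours, and only the two error values come from the abstract.

```python
def relative_decrease(baseline: float, improved: float) -> float:
    """Fractional reduction of `improved` relative to `baseline`."""
    return (baseline - improved) / baseline


kriging_error = 4.33  # kJ/mol, absolute average energy error reported for Kriging
ardgpr_error = 1.83   # kJ/mol, absolute average energy error reported for ARDGPR

decrease = relative_decrease(kriging_error, ardgpr_error)
print(f"{decrease:.1%}")
```

The result is about 57.7%, which rounds to the 58% stated in the abstract.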
Affiliation(s)
- Yongna Yuan
  - School of Information Science & Engineering, Lanzhou University, Lanzhou, China, 730000
- Haoqiu Yan
  - School of Information Science & Engineering, Lanzhou University, Lanzhou, China, 730000
- Zeyang Cui
  - School of Information Science & Engineering, Lanzhou University, Lanzhou, China, 730000
- Zhenyu Liu
  - School of Cyberspace Security, Gansu University of Political Science and Law, Lanzhou, China, 730070
- Wei Su
  - School of Information Science & Engineering, Lanzhou University, Lanzhou, China, 730000
- Ruisheng Zhang
  - School of Information Science & Engineering, Lanzhou University, Lanzhou, China, 730000
|
30
|
El-Behery H, Attia AF, El-Fishawy N, Torkey H. An ensemble-based drug-target interaction prediction approach using multiple feature information with data balancing. J Biol Eng 2022; 16:21. [PMID: 35941686 PMCID: PMC9361677 DOI: 10.1186/s13036-022-00296-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2022] [Accepted: 06/02/2022] [Indexed: 11/16/2022]
Abstract
Background: Recently, drug repositioning has received considerable attention for its advantage to pharmaceutical industries in drug development. Artificial intelligence techniques have greatly enhanced drug reproduction by discovering therapeutic drug profiles, side effects, and new target proteins. However, as the number of drugs increases, their targets and enormous interactions produce imbalanced data that might not be suitable as direct input to a prediction model. Methods: This paper proposes a novel scheme for predicting drug–target interactions (DTIs) based on drug chemical structures and protein sequences. The drug Morgan fingerprint, drug constitutional descriptors, protein amino acid composition, and protein dipeptide composition were employed to extract the characteristics of the drugs and proteins. Then, an approach for extracting negative samples using a one-class support vector machine classifier was developed to tackle the imbalanced data problem in the feature sets from the drug–target dataset. Negative and positive samples were constructed and fed into different prediction algorithms to identify DTIs. A 10-fold cross-validation (CV) procedure was applied to assess the predictability of the proposed method, in addition to studying the effectiveness of the chemical and physical features in the evaluation and discovery of drug–target interactions. Results: Our experimental model outperformed existing techniques in terms of area under the receiver operating characteristic curve (AUC), accuracy, precision, recall, F-score, mean square error, and MCC. The results obtained by the AdaBoost classifier enhanced prediction accuracy by 2.74%, precision by 1.98%, AUC by 1.14%, F-score by 3.53%, and MCC by 4.54% over existing methods.
Affiliation(s)
- Heba El-Behery
  - Department of Computer Science and Engineering, Faculty of Engineering, Kafrelsheikh University, Kafr_El_Sheikh, Egypt
- Abdel-Fattah Attia
  - Department of Computer Science and Engineering, Faculty of Engineering, Kafrelsheikh University, Kafr_El_Sheikh, Egypt
- Nawal El-Fishawy
  - Computer Science & Engineering Department, Faculty of Electronic Engineering, Menoufia University, Menouf, Egypt
- Hanaa Torkey
  - Computer Science & Engineering Department, Faculty of Electronic Engineering, Menoufia University, Menouf, Egypt
|
31
|
Chen X, Huang J, He B. AntiDMPpred: a web service for identifying anti-diabetic peptides. PeerJ 2022; 10:e13581. [PMID: 35722269 PMCID: PMC9205309 DOI: 10.7717/peerj.13581] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 01/03/2022] [Accepted: 05/23/2022] [Indexed: 01/17/2023]
Abstract
Diabetes mellitus (DM) is a chronic metabolic disease that has been a major threat to human health globally, causing great economic and social adversities. The oral administration of anti-diabetic peptide drugs has become a novel route for diabetes therapy. Numerous bioactive peptides have demonstrated potential anti-diabetic properties and are promising as alternative treatment measures to prevent and manage diabetes. The computational prediction of anti-diabetic peptides can help promote peptide-based drug discovery in the search for new, effective therapeutic peptide agents for diabetes treatment. Here, we used random forest to develop a computational model, named AntiDMPpred, for predicting anti-diabetic peptides. A benchmark dataset with 236 anti-diabetic and 236 non-anti-diabetic peptides was first constructed. Four types of sequence-derived descriptors were used to represent the peptide sequences. We then combined four machine learning methods and six feature scoring methods to select the non-redundant features, which were fed into diverse machine learning classifiers to train the models. Experimental results show that AntiDMPpred reached an accuracy of 77.12% and an area under the receiver operating characteristic curve (AUCROC) of 0.8193 in the nested five-fold cross-validation, yielding a satisfactory performance and surpassing other classifiers implemented in the study. The web service is freely accessible at http://i.uestc.edu.cn/AntiDMPpred/cgi-bin/AntiDMPpred.pl. We hope AntiDMPpred can improve the discovery of anti-diabetic bioactive peptides.
Affiliation(s)
- Xue Chen
  - Medical College, Guizhou University, Guiyang, China
- Jian Huang
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Bifang He
  - Medical College, Guizhou University, Guiyang, China
|
32
|
Hosen MF, Mahmud SH, Ahmed K, Chen W, Moni MA, Deng HW, Shoombuatong W, Hasan MM. DeepDNAbP: A deep learning-based hybrid approach to improve the identification of deoxyribonucleic acid-binding proteins. Comput Biol Med 2022; 145:105433. [DOI: 10.1016/j.compbiomed.2022.105433] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2021] [Revised: 03/11/2022] [Accepted: 03/20/2022] [Indexed: 11/03/2022]
|
33
|
DeepMHADTA: Prediction of Drug-Target Binding Affinity Using Multi-Head Self-Attention and Convolutional Neural Network. Curr Issues Mol Biol 2022; 44:2287-2299. [PMID: 35678684 PMCID: PMC9164023 DOI: 10.3390/cimb44050155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2022] [Revised: 05/08/2022] [Accepted: 05/14/2022] [Indexed: 11/17/2022]
Abstract
Drug-target interactions provide insight into drug side effects and drug repositioning. However, wet-lab biochemical experiments are time-consuming and labor-intensive, and are insufficient to meet the pressing demand for drug research and development. With the rapid advancement of deep learning, computational methods are increasingly applied to screen drug-target interactions. Many methods treat this problem as a binary classification task (binding or not) but ignore the quantitative binding affinity. In this paper, we propose a new end-to-end deep learning method called DeepMHADTA, which uses the multi-head self-attention mechanism in a deep residual network to predict drug-target binding affinity. On two benchmark datasets, our method outperformed several current state-of-the-art methods in terms of multiple performance measures, including mean square error (MSE), consistency index (CI), r_m^2, and area under the PR curve (AUPR). The results demonstrated that our method achieved better performance in predicting the drug–target binding affinity.
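Two of the regression metrics named above can be written in a few lines. The sketch below assumes that the "consistency index (CI)" in the abstract refers to the concordance index commonly used in binding-affinity prediction, which is our reading rather than something the abstract states. The affinity and prediction values are hypothetical and chosen only to exercise the functions.

```python
from itertools import combinations
from typing import Sequence


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average squared difference between measured and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def concordance_index(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Fraction of pairs with different true values whose predictions are
    ranked in the same order; tied predictions count as half."""
    concordant = 0.0
    comparable = 0
    for (t1, p1), (t2, p2) in combinations(zip(y_true, y_pred), 2):
        if t1 == t2:
            continue  # pairs with equal true affinity are not comparable
        comparable += 1
        if p1 == p2:
            concordant += 0.5
        elif (p1 > p2) == (t1 > t2):
            concordant += 1.0
    return concordant / comparable if comparable else 0.0


# Hypothetical measured and predicted binding affinities.
affinities = [5.0, 6.2, 7.1, 8.4]
predictions = [5.3, 6.0, 7.5, 7.2]

mse = mean_squared_error(affinities, predictions)
ci = concordance_index(affinities, predictions)
print(f"MSE = {mse:.4f}, CI = {ci:.3f}")
```

MSE penalizes the size of each error, while the concordance index only checks whether pairs are ranked correctly, so the two metrics can disagree about which model is better.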
|
34
|
Shao K, Zhang Y, Wen Y, Zhang Z, He S, Bo X. DTI-HETA: prediction of drug-target interactions based on GCN and GAT on heterogeneous graph. Brief Bioinform 2022; 23:6563180. [PMID: 35380622 DOI: 10.1093/bib/bbac109] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Received: 12/27/2021] [Revised: 02/14/2022] [Accepted: 03/03/2022] [Indexed: 12/19/2022]
Abstract
Drug-target interaction (DTI) prediction plays an important role in drug repositioning, drug discovery and drug design. However, due to the large size of the chemical and genomic spaces and the complex interactions between drugs and targets, experimental identification of DTIs is costly and time-consuming. In recent years, the emerging graph neural network (GNN) has been applied to DTI prediction because DTIs can be represented effectively using graphs. However, some of these methods are only based on homogeneous graphs, and some consist of two decoupled steps that cannot be trained jointly. To further explore GNN-based DTI prediction by integrating heterogeneous graph information, this study regards DTI prediction as a link prediction problem and proposes an end-to-end model based on HETerogeneous graph with Attention mechanism (DTI-HETA). In this model, a heterogeneous graph is first constructed based on the drug-drug and target-target similarity matrices and the DTI matrix. Then, the graph convolutional neural network is utilized to obtain the embedded representation of the drugs and targets. To highlight the contribution of different neighborhood nodes to the central node in aggregating the graph convolution information, a graph attention mechanism is introduced into the node embedding process. Afterward, an inner product decoder is applied to predict DTIs. To evaluate the performance of DTI-HETA, experiments are conducted on two datasets. The experimental results show that our model is superior to the state-of-the-art methods. Also, the identification of novel DTIs indicates that DTI-HETA can serve as a powerful tool for integrating heterogeneous graph information to predict DTIs.
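The final decoding step mentioned above, an inner product decoder, is simple enough to sketch directly. The version below is a generic formulation (sigmoid of the dot product between a drug embedding and a target embedding) and is not taken from the DTI-HETA code. The embeddings are hypothetical two-dimensional vectors; in the described model they would come from the graph convolution and attention layers.

```python
import math
from typing import List, Sequence


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def inner_product_decoder(
    drug_embeddings: Sequence[Sequence[float]],
    target_embeddings: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Return a drugs x targets matrix of interaction probabilities,
    where each entry is sigmoid(drug_vector . target_vector)."""
    return [
        [sigmoid(sum(d * t for d, t in zip(drug, target))) for target in target_embeddings]
        for drug in drug_embeddings
    ]


# Hypothetical learned embeddings for two drugs and two targets.
drugs = [[1.0, 0.0], [1.0, 1.0]]
targets = [[2.0, 0.0], [0.0, 1.0]]

scores = inner_product_decoder(drugs, targets)
for row in scores:
    print([round(s, 3) for s in row])
```

Because the decoder has no parameters of its own, all of the predictive signal has to be carried by the embeddings, which fits the end-to-end training described in the abstract.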
Affiliation(s)
- Yuqi Wen
- Beijing Institute of Radiation Medicine, Beijing, China
- Song He
- Beijing Institute of Radiation Medicine, Beijing, China
- Xiaochen Bo
- Beijing Institute of Radiation Medicine, Beijing, China
|
35
|
Bioinformatics and Network-based Approaches for Determining Pathways, Signature Molecules, and Drug Substances connected to Genetic Basis of Schizophrenia etiology. Brain Res 2022; 1785:147889. [PMID: 35339428 DOI: 10.1016/j.brainres.2022.147889] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/23/2021] [Revised: 02/28/2022] [Accepted: 03/21/2022] [Indexed: 12/12/2022]
Abstract
Knowledge of the heterogeneous etiology and pathophysiology of schizophrenia (SZP) remains inadequate and non-deterministic because of the disorder's inherent complexity and the vast dynamics of its underlying genetic mechanisms. The growth of large-scale transcriptome-wide datasets, together with the development of robust technologies for analyzing them, shows promise for elucidating the genetic basis of disease pathogenesis, predicting early risk, and predicting drug targets for therapeutic intervention. In this research, we scrutinized the genetic basis of SZP through functional annotation and network-based systems biology approaches. We determined 96 overlapping differentially expressed genes (DEGs) from 2 microarray datasets and then identified their interconnecting networks to reveal transcriptome signatures such as hub proteins (FYN, RAD51, SOCS3, XIAP, AKAP13, PIK3C2A, CBX5, GATA3, EIF3K, and CDKN2B), transcription factors, and miRNAs. In addition, we employed gene set enrichment to highlight significant gene ontology terms (e.g., positive regulation of microglial cell activation) and relevant pathways (such as axon guidance and focal adhesion) connected to the genes associated with SZP. Finally, we suggested candidate drug substances such as Luteolin HL60 UP as possible therapeutic agents based on these key molecular signatures.
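The step of finding DEGs shared across datasets can be sketched as a set intersection over per-dataset significance calls. This is a minimal illustration only: the cutoffs (absolute log2 fold change of at least 1, adjusted p-value below 0.05), the function names, and the placeholder gene labels are assumptions, not values reported in the abstract.

```python
def significant_genes(results, lfc_cutoff=1.0, p_cutoff=0.05):
    """Return genes passing the (assumed) fold-change and p-value cutoffs.

    `results` maps gene symbol -> (log2 fold change, adjusted p-value).
    """
    return {
        gene
        for gene, (lfc, p_adj) in results.items()
        if abs(lfc) >= lfc_cutoff and p_adj < p_cutoff
    }

def overlapping_degs(*datasets, lfc_cutoff=1.0, p_cutoff=0.05):
    """Intersect the significant gene sets of all supplied datasets."""
    gene_sets = [significant_genes(d, lfc_cutoff, p_cutoff) for d in datasets]
    return set.intersection(*gene_sets)

# Toy results with placeholder gene names (hypothetical values).
dataset_a = {"GENE_A": (1.8, 0.001), "GENE_B": (-1.2, 0.01), "GENE_C": (0.3, 0.2)}
dataset_b = {"GENE_A": (2.1, 0.004), "GENE_B": (-0.4, 0.3), "GENE_D": (1.5, 0.02)}

shared = overlapping_degs(dataset_a, dataset_b)
```

The sketch does not check that a gene changes in the same direction in every dataset; a stricter overlap would intersect the up-regulated and down-regulated sets separately.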
|
36
|
Hasan MM, Khan Z, Chowdhury MS, Khan MA, Moni MA, Rahman MH. In silico molecular docking and ADME/T analysis of Quercetin compound with its evaluation of broad-spectrum therapeutic potential against particular diseases. INFORMATICS IN MEDICINE UNLOCKED 2022. [DOI: 10.1016/j.imu.2022.100894] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/26/2022] Open
|
37
|
Hasan I, Hossain A, Bhuiyan P, Miah S, Rahman H. A system biology approach to determine therapeutic targets by identifying molecular mechanisms and key pathways for type 2 diabetes that are linked to the development of tuberculosis and rheumatoid arthritis. Life Sci 2022; 297:120483. [DOI: 10.1016/j.lfs.2022.120483] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/10/2022] [Revised: 03/07/2022] [Accepted: 03/09/2022] [Indexed: 12/17/2022]
|
38
|
Mahbub NI, Hasan MI, Rahman MH, Naznin F, Islam MZ, Moni MA. Identifying molecular signatures and pathways shared between Alzheimer's and Huntington's disorders: A bioinformatics and systems biology approach. INFORMATICS IN MEDICINE UNLOCKED 2022. [DOI: 10.1016/j.imu.2022.100888] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022] Open
|
39
|
Determination of molecular signatures and pathways common to brain tissues of autism spectrum disorder: Insights from comprehensive bioinformatics approach. INFORMATICS IN MEDICINE UNLOCKED 2022. [DOI: 10.1016/j.imu.2022.100871] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/13/2022] Open
|
40
|
Systems Biology and Bioinformatics approach to Identify blood based signatures molecules and drug targets of patient with COVID-19. INFORMATICS IN MEDICINE UNLOCKED 2022; 28:100840. [PMID: 34981034 PMCID: PMC8716147 DOI: 10.1016/j.imu.2021.100840] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 12/22/2021] [Accepted: 12/27/2021] [Indexed: 01/08/2023] Open
Abstract
Severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) infection causes a highly contagious respiratory illness known as novel coronavirus disease (COVID-19). Although the prevalence of COVID-19 continues to rise, it is still unclear how people become infected with SARS-CoV-2 and why patients with COVID-19 become so unwell. Detecting biomarkers for COVID-19 in peripheral blood mononuclear cells (PBMCs) may aid drug development and treatment. This research aimed to find blood cell transcripts whose expression levels are associated with COVID-19 progression. Using a bioinformatics pipeline, we analyzed two RNA-Seq transcriptomic datasets and one microarray dataset, all derived from PBMCs, and discovered 102 significant differentially expressed genes (DEGs) shared by the three datasets. To identify the roles of these DEGs, we constructed disease-gene association networks, identified signaling pathways, performed gene ontology (GO) analyses, and identified hub proteins. The significant gene ontology terms and molecular pathways improved our understanding of the pathophysiology of COVID-19, and the blood-based hub proteins TPX2, DLGAP5, NCAPG, CCNB1, KIF11, HJURP, AURKB, BUB1B, TTK, and TOP2A could be used to develop therapeutic interventions. In COVID-19 subjects, we found putative connections between pathological processes in blood cell transcripts, suggesting that blood cells could be used to diagnose and monitor the disease's initiation and progression and to develop drug therapeutics.
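The abstract does not say how hub proteins were selected. One common convention, assumed here purely for illustration, is to rank the nodes of the protein-protein interaction network by degree and keep the top few. The edge list and the node names `P1` to `P5` below are hypothetical.

```python
from collections import Counter

def hub_nodes(edges, top_k=10):
    """Rank nodes of an undirected network by degree and return the top_k.

    Duplicate edges (in either orientation) are counted once and
    self-loops are ignored. Ties are broken alphabetically.
    """
    unique_edges = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    degree = Counter()
    for edge in unique_edges:
        for node in edge:
            degree[node] += 1
    ranked = sorted(degree.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]

# Toy interaction list with placeholder protein names.
toy_edges = [
    ("P1", "P2"),
    ("P1", "P3"),
    ("P1", "P4"),
    ("P2", "P3"),
    ("P3", "P1"),  # duplicate of ("P1", "P3") in reverse order
    ("P5", "P5"),  # self-loop, ignored
]

hubs = hub_nodes(toy_edges, top_k=2)
```

Other centrality measures (betweenness, closeness) would give a different ranking on real networks; degree is used here only because it is the simplest to state.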
|
41
|
Huang X, Zhang KJ, Jiang JJ, Jiang SY, Lin JB, Lou YJ. Identification of Crucial Genes and Key Functions in Type 2 Diabetic Hearts by Bioinformatic Analysis. Front Endocrinol (Lausanne) 2022; 13:801260. [PMID: 35242109 PMCID: PMC8885996 DOI: 10.3389/fendo.2022.801260] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/25/2021] [Accepted: 01/20/2022] [Indexed: 12/16/2022] Open
Abstract
Hospitalized type 2 diabetes (T2D) patients with SARS-CoV-2 infection develop an acute cardiovascular syndrome. It is urgent to elucidate the mechanisms underlying acute cardiac injury in T2D hearts. We performed bioinformatic analysis on the expression profiles of public datasets to identify the pathogenic and prognostic genes in T2D hearts. Cardiac RNA-sequencing datasets from db/db or BKS mice (GSE161931) were uploaded to the NCBI Gene Expression Omnibus (NCBI-GEO) and used for transcriptomic analyses together with public NCBI-GEO datasets of autopsy heart specimens from patients with COVID-19 (5/6 with T2D, GSE150316) or from deceased healthy persons (GSE133054). Differentially expressed genes (DEGs) and overlapping homologous DEGs among the three datasets were identified using DESeq2. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes analyses were conducted for event enrichment through clusterProfiler. The protein-protein interaction (PPI) network of DEGs was established and visualized with Cytoscape. The transcription and functions of crucial genes were further validated in db/db hearts. In total, 542 up-regulated and 485 down-regulated DEGs were identified in mice, and 811 up-regulated and 1399 down-regulated DEGs in humans. There were 74 overlapping homologous DEGs among all datasets. Mitochondrial inner membrane and serine-type endopeptidase activity were further identified among the top-10 GO events for overlapping DEGs. Cardiac CAPNS1 (calpain small subunit 1) was the only crucial gene shared by both enriched events. Its transcriptional level significantly increased in T2D mice but surprisingly decreased in T2D patients with SARS-CoV-2 infection. A PPI network was constructed with 30 interactions among overlapping DEGs, including CAPNS1. The substrates Junctophilin2 (Jp2), Tnni3, and Mybpc3 in the cardiac calpain/CAPNS1 pathway showed less transcriptional change, although Capns1 transcription increased in db/db mice. Instead, cytoplasmic JP2 was significantly reduced, and its hydrolyzed product JP2NT exhibited nuclear translocation in the myocardium. This study suggests that CAPNS1 is a crucial gene in T2D hearts. Its transcriptional upregulation leads to calpain/CAPNS1-associated JP2 hydrolysis and JP2NT nuclear translocation. Therefore, attenuated cardiac CAPNS1 transcription in T2D patients with SARS-CoV-2 infection highlights a novel target in adverse prognostics and comprehensive therapy. CAPNS1 can also be explored for the molecular signaling involved in the onset, progression, and prognosis of T2D patients with SARS-CoV-2 infection.
Affiliation(s)
- Xin Huang
- Cardiovascular Key Laboratory of Zhejiang Province, The 2nd Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China
- Biotherapy Research Center, The 2nd Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China
- *Correspondence: Xin Huang; Yi-jia Lou
- Kai-jie Zhang
- Institute of Pharmacology and Toxicology, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
- Chu Kochen Honors College, Zhejiang University, Hangzhou, China
- Jun-jie Jiang
- Institute of Pharmacology and Toxicology, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
- Chu Kochen Honors College, Zhejiang University, Hangzhou, China
- Shou-yin Jiang
- Department of Emergency Medicine, The 2nd Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China
- Jia-bin Lin
- Clinical Research Center, The 2nd Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China
- Yi-jia Lou
- Institute of Pharmacology and Toxicology, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
- *Correspondence: Xin Huang; Yi-jia Lou
|
42
|
Identification of molecular signatures and pathways common to blood cells and brain tissue based RNA-Seq datasets of bipolar disorder: Insights from comprehensive bioinformatics approach. INFORMATICS IN MEDICINE UNLOCKED 2022. [DOI: 10.1016/j.imu.2022.100881] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 12/13/2022] Open
|
43
|
Gyebi GA, Ogunyemi OM, Ibrahim IM, Ogunro OB, Adegunloye AP, Afolabi SO. SARS-CoV-2 host cell entry: an in silico investigation of potential inhibitory roles of terpenoids. J Genet Eng Biotechnol 2021; 19:113. [PMID: 34351542 PMCID: PMC8339396 DOI: 10.1186/s43141-021-00209-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 06/04/2021] [Accepted: 07/16/2021] [Indexed: 12/26/2022]
Abstract
BACKGROUND: Targeting viral cell entry proteins is an emerging therapeutic strategy for inhibiting the first stage of SARS-CoV-2 infection. In this study, 106 bioactive terpenoids from African medicinal plants were screened through molecular docking analysis against human angiotensin-converting enzyme 2 (hACE2), human transmembrane protease serine 2 (TMPRSS2), and the spike (S) proteins of SARS-CoV-2, SARS-CoV, and MERS-CoV. In silico absorption-distribution-metabolism-excretion-toxicity (ADMET) and drug-likeness prediction, molecular dynamics (MD) simulation, binding free energy calculations, and clustering analysis of MD simulation trajectories were performed on the top docked terpenoids for their respective protein targets. RESULTS: The results revealed eight terpenoids with high binding tendencies to the catalytic residues of different targets. Two pentacyclic terpenoids (24-methylene cycloartenol and isoiguesterin) interacted with the hACE2 binding hotspots for the SARS-CoV-2 spike protein, while the abietane diterpenes were accommodated within the S1-specificity pocket, interacting strongly with the active site residues of TMPRSS2. 3-benzoylhosloppone and cucurbitacin interacted with the RBD and S2 subunit of the SARS-CoV-2 spike protein, respectively. These interactions were preserved in a simulated dynamic environment, thereby demonstrating high structural stability. The MM-GBSA binding free energy calculations corroborated the docking interactions. The top docked terpenoids showed favorable drug-likeness and ADMET properties over a wide range of molecular descriptors. CONCLUSION: The terpenoids identified in this study provide core structures that can be exploited for further lead optimization to design drugs against SARS-CoV-2 cell-mediated entry proteins. They are therefore recommended for further in vitro and in vivo studies towards developing entry inhibitors against the ongoing COVID-19 pandemic.
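For readers unfamiliar with the MM-GBSA step mentioned in this abstract, the quantity it estimates is conventionally written as below. This is the general textbook form, not a statement of the exact protocol used in the study; in particular, whether the entropy term was included is not given in the abstract.

```latex
\Delta G_{\text{bind}} = G_{\text{complex}} - \left( G_{\text{receptor}} + G_{\text{ligand}} \right),
\qquad
G = E_{\text{MM}} + G_{\text{GB}} + G_{\text{SA}} - T S
```

Here $E_{\text{MM}}$ is the molecular mechanics energy (bonded, electrostatic, and van der Waals terms), $G_{\text{GB}}$ is the polar solvation energy from the generalized Born model, $G_{\text{SA}}$ is the nonpolar solvation energy estimated from the solvent-accessible surface area, and $TS$ is the entropic contribution, which is often omitted when ranking similar ligands. A more negative $\Delta G_{\text{bind}}$ indicates more favorable predicted binding.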
Affiliation(s)
- Gideon A Gyebi
- Department of Biochemistry, Faculty of Sciences and Technology, Bingham University, P.M.B 005, Karu, Nasarawa State, Nigeria
- Oludare M Ogunyemi
- Human Nutraceuticals and Bioinformatics Research Unit, Department of Biochemistry, Salem University, Lokoja, Nigeria
- Ibrahim M Ibrahim
- Faculty of Sciences, Department of Biophysics, Cairo University, Giza, Egypt
- Olalekan B Ogunro
- Department of Biological Sciences, KolaDaisi University, Ibadan, Nigeria
- Adegbenro P Adegunloye
- Department of Biochemistry, Faculty of Life Sciences, University of Ilorin, Ilorin, Nigeria
- Saheed O Afolabi
- Department of Pharmacology and Therapeutics, Faculty of Basic Medical Sciences, University of Ilorin, Ilorin, Nigeria
|
44
|
Chen XG, Zhang W, Yang X, Li C, Chen H. ACP-DA: Improving the Prediction of Anticancer Peptides Using Data Augmentation. Front Genet 2021; 12:698477. [PMID: 34276801 PMCID: PMC8279753 DOI: 10.3389/fgene.2021.698477] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 04/21/2021] [Accepted: 06/07/2021] [Indexed: 12/09/2022] Open
Abstract
Anticancer peptides (ACPs) have provided a promising perspective for cancer treatment, and the prediction of ACPs is very important for the discovery of new cancer treatment drugs. It is time consuming and expensive to use experimental methods to identify ACPs, so computational methods for ACP identification are urgently needed. There have been many effective computational methods, especially machine learning-based methods, proposed for such predictions. Most of the current machine learning methods try to find suitable features or design effective feature learning techniques to accurately represent ACPs. However, the performance of these methods can be further improved for cases with insufficient numbers of samples. In this article, we propose an ACP prediction model called ACP-DA (Data Augmentation), which uses data augmentation for insufficient samples to improve the prediction performance. In our method, to better exploit the information of peptide sequences, peptide sequences are represented by integrating binary profile features and AAindex features, and then the samples in the training set are augmented in the feature space. After data augmentation, the samples are used to train the machine learning model, which is used to predict ACPs. The performance of ACP-DA exceeds that of existing methods, and ACP-DA achieves better performance in the prediction of ACPs compared with a method without data augmentation. The proposed method is available at http://github.com/chenxgscuec/ACPDA.
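The two ideas in this abstract, binary profile encoding and augmentation in feature space, can be sketched as follows. This is a hedged illustration rather than the ACP-DA implementation (the authors' code is at the URL given above): the Gaussian-noise augmentation scheme, the window of 10 residues, the parameter names, and the toy peptide strings are all assumptions, and the AAindex features are left out.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def binary_profile(sequence, n_residues=10):
    """One-hot encode the first n_residues of a peptide into a flat vector.

    Positions beyond the end of a short peptide, and non-standard
    residues, are left as all-zero rows.
    """
    profile = np.zeros((n_residues, len(AMINO_ACIDS)))
    for position, residue in enumerate(sequence[:n_residues]):
        index = AMINO_ACIDS.find(residue)
        if index >= 0:
            profile[position, index] = 1.0
    return profile.ravel()

def augment(features, labels, n_copies=2, noise_scale=0.05, seed=0):
    """Append n_copies noisy versions of every training sample.

    Each copy adds zero-mean Gaussian noise to the feature vector and
    keeps the original label (an assumed augmentation scheme).
    """
    rng = np.random.default_rng(seed)
    noisy = [
        features + rng.normal(scale=noise_scale, size=features.shape)
        for _ in range(n_copies)
    ]
    augmented_features = np.vstack([features, *noisy])
    augmented_labels = np.concatenate([labels] * (n_copies + 1))
    return augmented_features, augmented_labels

# Toy peptides (hypothetical sequences) with labels 1 = ACP, 0 = non-ACP.
peptides = ["GLFDIVKKVV", "AAKKW"]
X = np.vstack([binary_profile(p) for p in peptides])
y = np.array([1, 0])
X_aug, y_aug = augment(X, y, n_copies=2)
```

Augmentation should be applied to the training split only, so that the test set still measures performance on unmodified samples.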
Affiliation(s)
- Xian-Gan Chen
- School of Biomedical Engineering, South-Central University for Nationalities, Wuhan, China; Hubei Key Laboratory of Medical Information Analysis and Tumor Diagnosis & Treatment, South-Central University for Nationalities, Wuhan, China; Key Laboratory of Cognitive Science (South-Central University for Nationalities), State Ethnic Affairs Commission, Wuhan, China
- Wen Zhang
- College of Informatics, Huazhong Agricultural University, Wuhan, China; Hubei Engineering Technology Research Center of Agricultural Big Data, Wuhan, China
- Xiaofei Yang
- School of Biomedical Engineering, South-Central University for Nationalities, Wuhan, China; Hubei Key Laboratory of Medical Information Analysis and Tumor Diagnosis & Treatment, South-Central University for Nationalities, Wuhan, China; Key Laboratory of Cognitive Science (South-Central University for Nationalities), State Ethnic Affairs Commission, Wuhan, China
- Chenhong Li
- School of Biomedical Engineering, South-Central University for Nationalities, Wuhan, China; Hubei Key Laboratory of Medical Information Analysis and Tumor Diagnosis & Treatment, South-Central University for Nationalities, Wuhan, China; Key Laboratory of Cognitive Science (South-Central University for Nationalities), State Ethnic Affairs Commission, Wuhan, China
- Hengling Chen
- School of Biomedical Engineering, South-Central University for Nationalities, Wuhan, China; Hubei Key Laboratory of Medical Information Analysis and Tumor Diagnosis & Treatment, South-Central University for Nationalities, Wuhan, China; Key Laboratory of Cognitive Science (South-Central University for Nationalities), State Ethnic Affairs Commission, Wuhan, China
|