1
Saha S, Chatterjee P, Basu S, Nasipuri M. EPI-SF: essential protein identification in protein interaction networks using sequence features. PeerJ 2024; 12:e17010. [PMID: 38495766 PMCID: PMC10944162 DOI: 10.7717/peerj.17010] [Received: 11/24/2023] [Accepted: 02/05/2024] [Indexed: 03/19/2024]
Abstract
Proteins are considered indispensable for facilitating an organism's viability, reproductive capabilities, and other fundamental physiological functions. Conventional biological assays for identifying essential proteins are time-consuming, labor-intensive, and costly. Computational methods are therefore widely regarded as the most expeditious and effective way to identify essential proteins. Although deep learning (DL) is a popular choice in machine learning (ML) applications, it is not suggested for this sequence-feature-based work because high-quality training sets of positive and negative samples are scarce. Some recent DL studies do address limited data, however, and these fall within our future scope of work. Conventional ML techniques are thus used in this work because of their superior performance compared to DL methodologies. Accordingly, a technique called EPI-SF is proposed, which employs ML to identify essential proteins within the protein-protein interaction network (PPIN). The protein sequence is the primary determinant of protein structure and function, so relevant sequence features are first extracted from the proteins within the PPIN. These features are then used as input to various ML models, including the XGBoost classifier, AdaBoost classifier, logistic regression (LR), support vector classification (SVM), decision tree (DT), random forest (RF), and naïve Bayes (NB), with the objective of detecting the essential proteins within the PPIN. The primary investigation, conducted on yeast, compared the performance of these ML models on the yeast PPIN. Among them, the RF model was the most effective, with precision, recall, F1-score, and AUC values of 0.703, 0.720, 0.711, and 0.745, respectively. It also outperformed other state-of-the-art methods, both those based on traditional centrality measures such as betweenness centrality (BC) and closeness centrality (CC) and deep learning methods such as DeepEP, as emphasized in the results section. Given this favorable performance, EPI-SF is subsequently employed to predict novel essential proteins in the human PPIN. Because viruses tend to selectively target essential proteins involved in disease transmission within the human PPIN, the probable involvement of these proteins in COVID-19 and other related severe diseases is also investigated.
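As a quick consistency check on the metrics quoted in this abstract, the F1-score can be recomputed as the harmonic mean of precision and recall. The snippet below is only an illustrative sketch: the helper name `f1_from_precision_recall` is hypothetical and is not taken from the EPI-SF code.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall reported for the RF model in the abstract above.
rf_precision, rf_recall = 0.703, 0.720
print(round(f1_from_precision_recall(rf_precision, rf_recall), 3))
```

The rounded result agrees with the reported F1-score of 0.711.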
Affiliation(s)
- Sovan Saha
  - Department of Computer Science & Engineering (Artificial Intelligence & Machine Learning), Techno Main Salt Lake, Kolkata, West Bengal, India
- Piyali Chatterjee
  - Department of Computer Science & Engineering, Netaji Subhash Engineering College, Kolkata, West Bengal, India
- Subhadip Basu
  - Department of Computer Science & Engineering, Jadavpur University, Kolkata, West Bengal, India
- Mita Nasipuri
  - Department of Computer Science & Engineering, Jadavpur University, Kolkata, West Bengal, India
2
Mobley JA, Molyvdas A, Kojima K, Ahmad I, Jilling T, Li JL, Garantziotis S, Matalon S. The SARS-CoV-2 spike S1 protein induces global proteomic changes in ATII-like rat L2 cells that are attenuated by hyaluronan. Am J Physiol Lung Cell Mol Physiol 2023; 324:L413-L432. [PMID: 36719087 PMCID: PMC10042596 DOI: 10.1152/ajplung.00282.2022] [Received: 09/02/2022] [Revised: 12/29/2022] [Accepted: 01/25/2023] [Indexed: 02/01/2023]
Abstract
The COVID-19 pandemic has continued to have a major impact on global health and the economy since its identification in early 2020, causing significant morbidity and mortality worldwide. Caused by the SARS-CoV-2 virus, along with a growing number of variants, COVID-19 has led to 651,918,402 confirmed cases and 6,656,601 deaths worldwide (as of December 27, 2022; https://covid19.who.int/). Despite advances in our understanding of COVID-19 pathogenesis, the precise mechanism by which SARS-CoV-2 causes epithelial injury is incompletely understood. In this study, robust application of global-discovery proteomics identified highly significant changes induced by the spike S1 protein of SARS-CoV-2 in the proteome of alveolar type II (ATII)-like rat L2 cells that lack ACE2 receptors. Systems biology analysis revealed that the S1-induced proteomic changes were associated with three significant network hubs: E2F1, CREB1/RelA, and ROCK2/RhoA. We also found that pretreatment of L2 cells with high molecular weight hyaluronan (HMW-HA) greatly attenuated the S1 effects on the proteome. Western blotting analysis and cell cycle measurements confirmed the S1 upregulation of E2F1 and ROCK2/RhoA in L2 cells and the protective effects of HMW-HA. Taken as a whole, our studies revealed profound and novel biological changes that contribute to our current understanding of both S1 and hyaluronan biology. These data show that the S1 protein may contribute to epithelial injury induced by SARS-CoV-2. In addition, our work supports the potential benefit of HMW-HA in ameliorating SARS-CoV-2-induced cell injury.
Affiliation(s)
- James A Mobley
  - Division of Molecular and Translational Biomedicine, Department of Anesthesiology and Perioperative Medicine, University of Alabama at Birmingham, Birmingham, Alabama, United States
  - O'Neal Comprehensive Cancer Center, University of Alabama at Birmingham, Birmingham, Alabama, United States
- Adam Molyvdas
  - Division of Molecular and Translational Biomedicine, Department of Anesthesiology and Perioperative Medicine, University of Alabama at Birmingham, Birmingham, Alabama, United States
- Kyoko Kojima
  - O'Neal Comprehensive Cancer Center, University of Alabama at Birmingham, Birmingham, Alabama, United States
- Israr Ahmad
  - Division of Molecular and Translational Biomedicine, Department of Anesthesiology and Perioperative Medicine, University of Alabama at Birmingham, Birmingham, Alabama, United States
- Tamas Jilling
  - Division of Neonatology, Department of Pediatrics, Heersink School of Medicine, University of Alabama at Birmingham, Birmingham, Alabama, United States
- Jian-Liang Li
  - National Institute of Environmental Health Sciences, Durham, North Carolina, United States
- Stavros Garantziotis
  - National Institute of Environmental Health Sciences, Durham, North Carolina, United States
- Sadis Matalon
  - Division of Molecular and Translational Biomedicine, Department of Anesthesiology and Perioperative Medicine, University of Alabama at Birmingham, Birmingham, Alabama, United States
3
Saha S, Chatterjee P, Halder AK, Nasipuri M, Basu S, Plewczynski D. ML-DTD: Machine Learning-Based Drug Target Discovery for the Potential Treatment of COVID-19. Vaccines (Basel) 2022; 10:1643. [PMID: 36298508 PMCID: PMC9607653 DOI: 10.3390/vaccines10101643] [Received: 08/15/2022] [Revised: 09/11/2022] [Accepted: 09/14/2022] [Indexed: 11/05/2022]
Abstract
Recent research has highlighted that a large section of druggable protein targets in the human interactome remains unexplored for various diseases. This observation could motivate drug repurposing studies and aid the in silico prediction of new drug-human protein target interactions. The same applies to the current COVID-19 pandemic. It is highly desirable to identify potential human drug targets for COVID-19 using a machine learning approach, since it saves time and labor compared to traditional experimental methods. Structure-based drug discovery, in which druggability is determined by molecular docking, is appropriate only for proteins whose three-dimensional structures are available. With machine learning algorithms, features that discriminate targets from non-targets can be used for proteins whose 3D structures are unavailable. In this research, a Machine Learning-based Drug Target Discovery (ML-DTD) approach is proposed, in which a machine learning model is first built and tested on a curated dataset of COVID-19 human drug targets and non-targets, formed using the Therapeutic Target Database (TTD) and the human interactome. Several classifiers are used: XGBoost, AdaBoost, Logistic Regression, Support Vector Classification, Decision Tree, Random Forest, Naive Bayes, and K-Nearest Neighbour (KNN). The protein features include Gene Set Enrichment Analysis (GSEA) ranking, properties derived from the protein sequence, and encoded protein network centrality-based measures. Among all these classifiers, the XGBoost, KNN, and Random Forest models are satisfactory and consistent. The model is further used to predict novel COVID-19 human drug targets, which are then validated by target pathway analysis, the emergence of allied repurposed drugs, and their subsequent docking study.
Affiliation(s)
- Sovan Saha
  - Department of Computer Science & Engineering, Institute of Engineering & Management, Salt Lake Electronics Complex, Kolkata 700091, India
- Piyali Chatterjee
  - Department of Computer Science & Engineering, Netaji Subhash Engineering College, Techno City, Panchpota, Garia, Kolkata 700152, India
- Anup Kumar Halder
  - Faculty of Mathematics and Information Sciences, Warsaw University of Technology, Koszykowa 75, 00-662 Warsaw, Poland
  - Laboratory of Functional and Structural Genomics, Centre of New Technologies, University of Warsaw, Banacha 2c Street, 02-097 Warsaw, Poland
- Mita Nasipuri
  - Department of Computer Science & Engineering, Jadavpur University, 188, Raja S.C. Mallick Road, Kolkata 700032, India
- Subhadip Basu
  - Department of Computer Science & Engineering, Jadavpur University, 188, Raja S.C. Mallick Road, Kolkata 700032, India
- Dariusz Plewczynski
  - Faculty of Mathematics and Information Sciences, Warsaw University of Technology, Koszykowa 75, 00-662 Warsaw, Poland
  - Laboratory of Functional and Structural Genomics, Centre of New Technologies, University of Warsaw, Banacha 2c Street, 02-097 Warsaw, Poland
4
Mobley JA, Molyvdas A, Kojima K, Jilling T, Li JL, Garantziotis S, Matalon S. The SARS-CoV-2 Spike S1 Protein Induces Global Proteomic Changes in ATII-Like Rat L2 Cells that are Attenuated by Hyaluronan. bioRxiv 2022:2022.08.31.506023. [PMID: 36093347 PMCID: PMC9460966 DOI: 10.1101/2022.08.31.506023] [Indexed: 02/04/2023]
Abstract
The COVID-19 pandemic has continued to have a major impact on global health and the economy since its identification in early 2020, causing significant morbidity and mortality worldwide. Caused by the SARS-CoV-2 virus, along with a growing number of variants that have been characterized to date, COVID-19 has led to 571,198,904 confirmed cases and 6,387,863 deaths worldwide (as of July 15th, 2022). Despite tremendous advances in our understanding of COVID-19 pathogenesis, the precise mechanism by which SARS-CoV-2 causes epithelial injury is incompletely understood. In this study, robust application of global-discovery proteomics combined with systems biology analysis identified highly significant changes induced by the Spike S1 protein of SARS-CoV-2 in ATII-like rat L2 cells, including three significant network hubs: E2F1, CREB1/RelA, and ROCK2/RhoA. Separately, we found that pre-treatment with High Molecular Weight Hyaluronan (HMW-HA) greatly attenuated the S1 effects. Immuno-targeted studies of E2F1 and ROCK2/RhoA induction and kinase-mediated activation, in addition to cell cycle measurements, validated these observations. Taken as a whole, our discovery proteomics and systems analysis workflow, combined with standard immuno-targeted and cell cycle measurements, revealed profound and novel biological changes that contribute to our current understanding of both Spike S1 and Hyaluronan biology. These data show that the Spike S1 protein may contribute to epithelial injury induced by SARS-CoV-2. In addition, our work supports the potential benefit of HMW-HA in ameliorating SARS-CoV-2-induced cell injury.
5
Basu S, Plewczynski D. Computational methods and strategies for combating COVID-19. Methods 2022; 206:99-100. [PMID: 36028161 PMCID: PMC9398558 DOI: 10.1016/j.ymeth.2022.08.011] [Indexed: 11/30/2022]
Affiliation(s)
- Subhadip Basu
  - Computer Science & Engineering Department, Jadavpur University, Kolkata 700032, India
- Dariusz Plewczynski
  - Centre of New Technologies, University of Warsaw, Warsaw, Poland
  - Faculty of Mathematics and Information Sciences, Warsaw University of Technology, Warsaw, Poland