1
Wu Y, Li K, Li M, Pu X, Guo Y. Attention Mechanism-Based Graph Neural Network Model for Effective Activity Prediction of SARS-CoV-2 Main Protease Inhibitors: Application to Drug Repurposing as Potential COVID-19 Therapy. J Chem Inf Model 2023; 63:7011-7031. [PMID: 37960886 DOI: 10.1021/acs.jcim.3c01280] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2023]
Abstract
Compared to de novo drug discovery, drug repurposing provides a time-efficient way to treat coronavirus disease 2019 (COVID-19), which is caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The SARS-CoV-2 main protease (Mpro) has proven to be an attractive drug target because of its pivotal involvement in viral replication and transcription. Here, we present a graph neural network-based deep-learning (DL) strategy to prioritize existing drugs for their potential therapeutic effects against SARS-CoV-2 Mpro. Mpro inhibitors were represented as molecular graphs for graph attention network (GAT) and graph isomorphism network (GIN) modeling to predict their inhibitory activities. The results show that the GAT model outperforms the GIN and other competitive models and yields satisfactory predictions for unseen Mpro inhibitors, confirming its robustness and generalization. The attention mechanism of the GAT captures the dominant substructures and thus makes the model interpretable. Finally, we applied the optimal GAT model in conjunction with molecular docking simulations to screen the Drug Repurposing Hub (DRH) database. As a result, 18 drug hits with the best consensus prediction scores and binding affinity values were identified as potential therapeutics against COVID-19. Both an extensive literature search and evaluations of absorption, distribution, metabolism, excretion, and toxicity (ADMET) indicate the favorable drug-likeness and pharmacokinetic properties of the drug candidates. Overall, our work not only provides an effective GAT-based DL prediction tool for the inhibitory activity of SARS-CoV-2 Mpro inhibitors but also provides theoretical guidelines for drug discovery in COVID-19 treatment.
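To make the attention mechanism mentioned in this abstract concrete, here is a minimal sketch of how a single graph attention head could weight the neighbors of each atom. It is not the authors' model: the toy graph, the feature vectors, the weight matrix `W`, and the attention vector `a` are all made-up assumptions, and the scoring rule is the generic GAT formulation (LeakyReLU of a learned vector applied to the concatenated, transformed features, followed by a softmax over neighbors).

```python
import math


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def linear(W, h):
    # Multiply weight matrix W (a list of rows) by feature vector h.
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def gat_attention(features, neighbors, W, a, slope=0.2):
    """Return {node: {neighbor: attention weight}} for one attention head."""
    z = {i: linear(W, h) for i, h in features.items()}
    alphas = {}
    for i, nbrs in neighbors.items():
        # Unnormalized score e_ij = LeakyReLU(a . [z_i || z_j]).
        scores = {
            j: leaky_relu(sum(ak * v for ak, v in zip(a, z[i] + z[j])), slope)
            for j in nbrs
        }
        # Softmax over the neighborhood (shifted by the max for stability).
        top = max(scores.values())
        exps = {j: math.exp(s - top) for j, s in scores.items()}
        total = sum(exps.values())
        alphas[i] = {j: e / total for j, e in exps.items()}
    return alphas


# Hypothetical 3-atom chain 0-1-2 with self-loops; all numbers are illustrative.
features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
neighbors = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2]}
W = [[1.0, 0.0], [0.0, 1.0]]
a = [0.5, -0.5, 1.0, 0.25]

alphas = gat_attention(features, neighbors, W, a)
for atom, weights in alphas.items():
    print(atom, {j: round(w, 3) for j, w in weights.items()})
```

Neighbors that receive large weights are the ones that dominate an atom's updated representation, which is the sense in which attention weights can point to influential substructures.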
Affiliation(s)
- Yanling Wu
  - College of Chemistry, Sichuan University, Chengdu 610064, China
- Kun Li
  - College of Chemistry, Sichuan University, Chengdu 610064, China
- Menglong Li
  - College of Chemistry, Sichuan University, Chengdu 610064, China
- Xuemei Pu
  - College of Chemistry, Sichuan University, Chengdu 610064, China
- Yanzhi Guo
  - College of Chemistry, Sichuan University, Chengdu 610064, China
2
Williams AH, Zhan CG. Staying Ahead of the Game: How SARS-CoV-2 has Accelerated the Application of Machine Learning in Pandemic Management. BioDrugs 2023; 37:649-674. [PMID: 37464099 DOI: 10.1007/s40259-023-00611-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 05/28/2023] [Indexed: 07/20/2023]
Abstract
In recent years, machine learning (ML) techniques have garnered considerable interest for their potential use in accelerating the rate of drug discovery. With the emergence of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic, the utilization of ML has become even more crucial in the search for effective antiviral medications. The pandemic has presented the scientific community with a unique challenge, and the rapid identification of potential treatments has become an urgent priority. Researchers have been able to accelerate the process of identifying drug candidates, repurposing existing drugs, and designing new compounds with desirable properties using machine learning in drug discovery. To train predictive models, ML techniques in drug discovery rely on the analysis of large datasets, including both experimental and clinical data. These models can be used to predict the biological activities, potential side effects, and interactions with specific target proteins of drug candidates. This strategy has proven to be an effective method for identifying potential coronavirus disease 2019 (COVID-19) and other disease treatments. This paper offers a thorough analysis of the various ML techniques implemented to combat COVID-19, including supervised and unsupervised learning, deep learning, and natural language processing. The paper discusses the impact of these techniques on pandemic drug development, including the identification of potential treatments, the understanding of the disease mechanism, and the creation of effective and safe therapeutics. The lessons learned can be applied to future outbreaks and drug discovery initiatives.
Affiliation(s)
- Alexander H Williams
  - Molecular Modeling and Biopharmaceutical Center, University of Kentucky, 789 South Limestone Street, Lexington, KY, 40536, USA
  - Department of Pharmaceutical Sciences, College of Pharmacy, University of Kentucky, 789 South Limestone Street, Lexington, KY, 40536, USA
  - GSK Upper Providence, 1250 S. Collegeville Road, Collegeville, PA, 19426, USA
- Chang-Guo Zhan
  - Molecular Modeling and Biopharmaceutical Center, University of Kentucky, 789 South Limestone Street, Lexington, KY, 40536, USA
  - Department of Pharmaceutical Sciences, College of Pharmacy, University of Kentucky, 789 South Limestone Street, Lexington, KY, 40536, USA
3
Qu J, Song Z, Cheng X, Jiang Z, Zhou J. A new integrated framework for the identification of potential virus-drug associations. Front Microbiol 2023; 14:1179414. [PMID: 37675432 PMCID: PMC10478006 DOI: 10.3389/fmicb.2023.1179414] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/04/2023] [Accepted: 07/31/2023] [Indexed: 09/08/2023] Open
Abstract
Introduction: With the increasingly serious problem of antiviral drug resistance, drug repurposing offers a time-efficient and cost-effective way to find potential therapeutic agents for disease. Computational models can quickly predict drug candidates that could be repurposed to treat diseases. Methods: In this study, two matrix decomposition-based methods, i.e., Matrix Decomposition with Heterogeneous Graph Inference (MDHGI) and Bounded Nuclear Norm Regularization (BNNR), were integrated to predict antiviral drugs. Moreover, global leave-one-out cross-validation (LOOCV), local LOOCV, and 5-fold cross-validation were implemented to evaluate the performance of the proposed model on the DrugVirus dataset, which consists of 933 known associations between 175 drugs and 95 viruses. Results: The area under the receiver operating characteristic curve (AUC) for global LOOCV and local LOOCV is 0.9035 and 0.8786, respectively. The average AUC and standard deviation of the 5-fold cross-validation on the DrugVirus dataset are 0.8856 ± 0.0032. We further implemented cross-validation on the MDAD and aBiofilm datasets to evaluate the performance of the model. In particular, the MDAD (aBiofilm) dataset contains 2,470 (2,884) known associations between 1,373 (1,470) drugs and 173 (140) microbes. In addition, two types of case studies were carried out to further verify the effectiveness of the model on the DrugVirus and MDAD datasets. The results of the case studies supported the effectiveness of MHBVDA in identifying potential virus-drug associations as well as predicting potential drugs for new microbes.
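The "mean AUC ± standard deviation over 5 folds" figure reported in this abstract can be illustrated with a small sketch. Everything below is an assumption for illustration: the labels, the scores, the deterministic fold assignment, and the placeholder `toy_fit_predict` model have nothing to do with MDHGI or BNNR. The AUC itself is computed with the standard pairwise-ranking definition (the fraction of positive/negative pairs ranked correctly, ties counting one half).

```python
import math
import statistics


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cross_validated_auc(labels, fit_predict, k=5):
    """Return (mean, sample standard deviation) of the per-fold AUCs."""
    n = len(labels)
    # Deterministic folds: sample i goes to fold i % k (no shuffling here).
    folds = [list(range(i, n, k)) for i in range(k)]
    aucs = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(n) if i not in held_out]
        scores = fit_predict(train_idx, test_idx)
        aucs.append(auc([labels[i] for i in test_idx], scores))
    return statistics.mean(aucs), statistics.stdev(aucs)


# Hypothetical data: 20 candidate associations with one noisy feature each.
labels = [i % 2 for i in range(20)]
x = [labels[i] + 0.8 * math.sin(i) for i in range(20)]


def toy_fit_predict(train_idx, test_idx):
    # Placeholder "model": score relative to the mean of the training negatives.
    neg_mean = statistics.mean(x[i] for i in train_idx if labels[i] == 0)
    return [x[i] - neg_mean for i in test_idx]


mean_auc, std_auc = cross_validated_auc(labels, toy_fit_predict, k=5)
print(f"5-fold AUC: {mean_auc:.4f} ± {std_auc:.4f}")
```

In a real association-prediction setting each fold would mask a subset of known drug-virus pairs before refitting the model; the callback signature here is only one way to express that.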
Affiliation(s)
- Jia Qu
  - School of Computer Science and Artificial Intelligence, Changzhou University, Changzhou, Jiangsu, China
- Zihao Song
  - School of Computer Science and Artificial Intelligence, Changzhou University, Changzhou, Jiangsu, China
- Xiaolong Cheng
  - School of Computer Science and Artificial Intelligence, Changzhou University, Changzhou, Jiangsu, China
- Zhibin Jiang
  - School of Computer Science and Engineering, Shaoxing University, Shaoxing, Zhejiang, China
- Jie Zhou
  - School of Computer Science and Engineering, Shaoxing University, Shaoxing, Zhejiang, China
4
Huang Z, Zhang P, Deng L. DeepCoVDR: deep transfer learning with graph transformer and cross-attention for predicting COVID-19 drug response. Bioinformatics 2023; 39:i475-i483. [PMID: 37387168 DOI: 10.1093/bioinformatics/btad244] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 07/01/2023] Open
Abstract
MOTIVATION The coronavirus disease 2019 (COVID-19) remains a global public health emergency. Although people, especially those with underlying health conditions, could benefit from several approved COVID-19 therapeutics, the development of effective antiviral COVID-19 drugs is still a very urgent problem. Accurate and robust drug response prediction to a new chemical compound is critical for discovering safe and effective COVID-19 therapeutics. RESULTS In this study, we propose DeepCoVDR, a novel COVID-19 drug response prediction method based on deep transfer learning with graph transformer and cross-attention. First, we adopt a graph transformer and feed-forward neural network to mine the drug and cell line information. Then, we use a cross-attention module that calculates the interaction between the drug and cell line. After that, DeepCoVDR combines drug and cell line representation and their interaction features to predict drug response. To solve the problem of SARS-CoV-2 data scarcity, we apply transfer learning and use the SARS-CoV-2 dataset to fine-tune the model pretrained on the cancer dataset. The experiments of regression and classification show that DeepCoVDR outperforms baseline methods. We also evaluate DeepCoVDR on the cancer dataset, and the results indicate that our approach has high performance compared with other state-of-the-art methods. Moreover, we use DeepCoVDR to predict COVID-19 drugs from FDA-approved drugs and demonstrate the effectiveness of DeepCoVDR in identifying novel COVID-19 drugs. AVAILABILITY AND IMPLEMENTATION https://github.com/Hhhzj-7/DeepCoVDR.
Affiliation(s)
- Zhijian Huang
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Pan Zhang
  - Hunan Provincial Key Laboratory of Clinical Epidemiology, Xiangya School of Public Health, Central South University, Changsha 410083, China
- Lei Deng
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China
5
Ghosh N, Saha I, Gambin A. Interactome-Based Machine Learning Predicts Potential Therapeutics for COVID-19. ACS Omega 2023; 8:13840-13854. [PMID: 37163139 PMCID: PMC10084923 DOI: 10.1021/acsomega.3c00030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2023] [Accepted: 02/22/2023] [Indexed: 05/11/2023]
Abstract
COVID-19, the disease caused by SARS-CoV-2, has been disrupting our lives for more than two years now. SARS-CoV-2 interacts with human proteins to pave its way into the human body, thereby wreaking havoc. Moreover, the variants arising from mutations in the SARS-CoV-2 genome are also a cause of concern. Thus, it is very important to understand human-spike protein-protein interactions (PPIs) in order to predict new PPIs and consequently propose drugs that target the human proteins, so as to fight the virus and its variants carrying mutations in the spike protein. This motivated us to develop a complete pipeline in which PPIs and drug-protein interactions can be predicted for human-SARS-CoV-2 interactions. To this end, interacting data sets are first collected from the literature, and noninteracting data sets are subsequently created for human-SARS-CoV-2 by considering only the spike glycoprotein. For drug-protein interactions, both interacting and noninteracting data sets are taken from the DrugBank and ChEMBL databases. Thereafter, the protein sequences of human and spike proteins are encoded with a sequence-based feature, the well-known Moran autocorrelation, while the drugs are encoded with PaDEL descriptors, in order to predict new human-spike PPIs and eventually new drug-protein interactions for the top 20 predicted human proteins interacting with the original spike protein and its mutated variants such as Alpha, Beta, Delta, Gamma, and Omicron. These predictions are carried out by random forest, as it is found to perform better than other predictors, providing an accuracy of 90.53% for human-spike PPIs and 96.15% for drug-protein interactions. Finally, 40 unique drugs, such as eicosapentaenoic acid, doxercalciferol, ciclesonide, dexamethasone, and methylprednisolone, are identified that target 32 human proteins, such as ACACA, DST, and DYNC1H1.
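As a rough illustration of the Moran autocorrelation encoding named in this abstract, the sketch below computes the descriptor for one physicochemical property at one lag. It is a simplified assumption-laden version: it uses raw Kyte-Doolittle hydropathy values as the property, whereas published implementations typically use several standardized amino acid index properties and a range of lags; the function name and the example sequences are hypothetical.

```python
# Kyte-Doolittle hydropathy values, used here as the example property.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def moran_autocorrelation(sequence, lag, prop=KYTE_DOOLITTLE):
    """Moran autocorrelation of one property along a sequence at a given lag."""
    values = [prop[residue] for residue in sequence]
    n = len(values)
    if not 0 < lag < n:
        raise ValueError("lag must satisfy 0 < lag < len(sequence)")
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    if variance == 0:
        return 0.0  # constant property along the sequence
    # Average product of deviations for residues `lag` positions apart.
    covariance = sum(
        (values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag)
    ) / (n - lag)
    return covariance / variance


# Hypothetical peptide; a feature vector is built from several lags.
peptide = "MKTAYIAKQRQISFVKSHFSRQ"
feature_vector = [moran_autocorrelation(peptide, lag) for lag in range(1, 6)]
print([round(v, 3) for v in feature_vector])
```

Concatenating such vectors over several properties gives a fixed-length representation of a protein regardless of its sequence length, which is what makes the encoding usable as input to a classifier such as random forest.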
Affiliation(s)
- Nimisha Ghosh
  - Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, 00-927 Warsaw, Poland
  - Department of Computer Science and Information Technology, Institute of Technical Education and Research, Siksha 'O' Anusandhan, Bhubaneswar, 751030 Odisha, India
- Indrajit Saha
  - Department of Computer Science and Engineering, National Institute of Technical Teachers' Training and Research, Kolkata, 700106 West Bengal, India
- Anna Gambin
  - Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, 00-927 Warsaw, Poland
6
Chakraborty A, Mitra S, Bhattacharjee M, De D, Pal AJ. Determining human-coronavirus protein-protein interaction using machine intelligence. Medicine in Novel Technology and Devices 2023; 18:100228. [PMID: 37056696 PMCID: PMC10077817 DOI: 10.1016/j.medntd.2023.100228] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/24/2022] [Revised: 03/29/2023] [Accepted: 04/01/2023] [Indexed: 04/08/2023] Open
Abstract
The Severe Acute Respiratory Syndrome CoronaVirus 2 (SARS-CoV-2) virus spread the novel CoronaVirus −19 (nCoV-19) pandemic, resulting in millions of fatalities globally. Recent research demonstrated that the Protein-Protein Interaction (PPI) between SARS-CoV-2 and human proteins is accountable for viral pathogenesis. However, many of these PPIs are poorly understood and unexplored, necessitating a more in-depth investigation to find latent yet critical interactions. This article elucidates the host-viral PPI through Machine Learning (ML) lenses and validates the biological significance of the same using web-based tools. ML classifiers are designed based on comprehensive datasets with five sequence-based features of human proteins, namely Amino Acid Composition, Pseudo Amino Acid Composition, Conjoint Triad, Dipeptide Composition, and Normalized Auto Correlation. A majority voting rule-based ensemble method composed of the Random Forest Model (RFM), AdaBoost, and Bagging technique is proposed that delivers encouraging statistical performance compared to other models employed in this work. The proposed ensemble model predicted a total of 111 possible SARS-CoV-2 human target proteins with a high likelihood factor ≥70%, validated by utilizing Gene Ontology (GO) and KEGG pathway enrichment analysis. Consequently, this research can aid in a deeper understanding of the molecular mechanisms underlying viral pathogenesis and provide clues for developing more efficient anti-COVID medications.
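The majority voting rule described in this abstract can be sketched in a few lines. This is only an illustration of hard voting across three classifiers: the prediction lists standing in for the Random Forest, AdaBoost, and Bagging outputs are made up, and the abstract's "likelihood factor ≥70%" threshold is not modeled because its definition is not given here.

```python
def majority_vote(predictions):
    """Combine binary predictions from several classifiers by hard voting.

    `predictions` is a list with one list of 0/1 labels per classifier;
    a sample is labeled 1 when more than half of the classifiers say 1.
    """
    n_models = len(predictions)
    return [1 if 2 * sum(votes) > n_models else 0 for votes in zip(*predictions)]


# Hypothetical per-protein predictions from three already-trained classifiers.
random_forest = [1, 0, 1, 0]
adaboost = [1, 1, 0, 0]
bagging = [0, 1, 1, 0]

ensemble = majority_vote([random_forest, adaboost, bagging])
print(ensemble)
```

Using an odd number of voters, as here, avoids ties for binary labels.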
Affiliation(s)
- Arijit Chakraborty
  - Bachelor of Computer Application Department, The Heritage Academy, Kolkata, India
- Sajal Mitra
  - Department of Computer Science and Engineering, Heritage Institute of Technology, Kolkata, India
- Debashis De
  - Department of Computer Science and Engineering, Maulana Abul Kalam Azad University of Technology, Kolkata, India
7
S S, E R V, Krishnakumar U. Improving miRNA Disease Association Prediction Accuracy Using Integrated Similarity Information and Deep Autoencoders. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:1125-1136. [PMID: 35914051 DOI: 10.1109/tcbb.2022.3195514] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
MicroRNAs (miRNAs) are short endogenous non-coding RNA molecules (22 nt) that have a vital role in many biological and molecular processes inside the human body. Abnormal and dysregulated expression of miRNAs is correlated with many complex disorders. Wet-lab biological experiments are time-consuming, costly, and labour-intensive, so feasible and efficient computational approaches are needed for predicting promising miRNAs associated with diseases. Here, a two-stage feature pruning approach based on miRNA feature similarity fusion, which uses a deep attention autoencoder and recursive feature elimination with cross-validation (RFECV), is proposed for predicting unknown miRNA-disease associations. In the first stage, an attention autoencoder captures highly influential features from the fused feature vector. For further pruning of features, RFECV is applied. The resultant features are given to a Random Forest classifier for association prediction. The highest AUC of 94.41% is attained when all miRNA similarity measures are merged with disease similarities. Case studies were done on two diseases, lymphoma and leukaemia, to examine the reliability of the approach. Comparative analysis shows that the proposed approach outperforms recent methodologies for predicting miRNA-disease associations.
8
Das B, Kutsal M, Das R. A geometric deep learning model for display and prediction of potential drug-virus interactions against SARS-CoV-2. Chemometrics and Intelligent Laboratory Systems: An International Journal Sponsored by the Chemometrics Society 2022; 229:104640. [PMID: 36042844 PMCID: PMC9400382 DOI: 10.1016/j.chemolab.2022.104640] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2022] [Revised: 08/17/2022] [Accepted: 08/19/2022] [Indexed: 05/04/2023]
Abstract
Although the coronavirus epidemic spread rapidly with the Omicron variant, its lethality rate fell with the effect of vaccines and immunity. Hospitalization and intense demand decreased. However, there is no definite information about when this disease will end or how dangerous the different variants could be. In addition, it is not possible to eliminate the risk of variants that will continue to circulate among animals in nature. At this stage, drug-virus interactions should be examined in order to prepare for possible new types of viruses and variants and to rapidly produce drugs or vaccines against them. Unlike experimental methods, which are expensive, laborious, and time-consuming, geometric deep learning (GDL) is an alternative that can make this process faster and cheaper. In this study, we propose a new model based on geometric deep learning for the prediction of drug-virus interactions against COVID-19. First, we use antiviral drug data in the SMILES molecular structure representation to generate a large number of features and better describe the structure of chemical species. The data is then converted into a molecular representation and then into a graph structure that the GDL model can process. The node feature vectors are transferred to a different space with the Message Passing Neural Network (MPNN) for training. We develop a geometric neural network architecture in which the graph embedding values are passed through a fully connected layer to produce the prediction. The results indicate that the proposed method outperforms existing methods, with 97% accuracy in predicting drug-virus interactions.
Affiliation(s)
- Bihter Das
  - Department of Software Engineering, Technology Faculty, Firat University, 23119, Elazig, Turkey
- Mucahit Kutsal
  - Department of Software Engineering, Technology Faculty, Firat University, 23119, Elazig, Turkey
- Resul Das
  - Department of Software Engineering, Technology Faculty, Firat University, 23119, Elazig, Turkey
9
Heidari A, Jafari Navimipour N, Unal M, Toumaj S. Machine learning applications for COVID-19 outbreak management. Neural Comput Appl 2022; 34:15313-15348. [PMID: 35702664 PMCID: PMC9186489 DOI: 10.1007/s00521-022-07424-w] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 11/06/2021] [Accepted: 05/10/2022] [Indexed: 12/29/2022]
Abstract
Recently, the COVID-19 epidemic has resulted in millions of deaths and has impacted practically every area of human life. Several machine learning (ML) approaches are employed in the medical field in many applications, including detecting and monitoring patients, notably in COVID-19 management. Different medical imaging systems, such as computed tomography (CT) and X-ray, offer ML an excellent platform for combating the pandemic. Because of this need, a significant amount of research has been carried out; thus, in this work, we employed a systematic literature review (SLR) to cover all aspects of outcomes from related papers. Imaging methods, survival analysis, forecasting, economic and geographical issues, monitoring methods, medication development, and hybrid apps are the seven key uses of ML applications in the COVID-19 pandemic. Convolutional neural networks (CNNs), long short-term memory networks (LSTM), recurrent neural networks (RNNs), generative adversarial networks (GANs), autoencoders, random forest, and other ML techniques are frequently used in such scenarios. Next, cutting-edge applications related to ML techniques for pandemic medical issues are discussed. Various problems and challenges linked with ML applications for this pandemic are reviewed. It is expected that additional research will be conducted in the future to limit the spread and support catastrophe management. According to the data, most papers are evaluated mainly on characteristics such as flexibility and accuracy, while other factors such as safety are overlooked. Also, Keras was the most often used library in the research studied, accounting for 24.4 percent of the time. Furthermore, medical imaging systems are employed for diagnostic reasons in 20.4 percent of applications.
Affiliation(s)
- Arash Heidari
  - Department of Computer Engineering, Tabriz Branch, Islamic Azad University, Tabriz, Iran
  - Department of Computer Engineering, Shabestar Branch, Islamic Azad University, Shabestar, Iran
- Mehmet Unal
  - Department of Computer Engineering, Nisantasi University, Istanbul, Turkey
- Shiva Toumaj
  - Urmia University of Medical Sciences, Urmia, Iran