1
Alme C, Pirim H, Akbulut Y. Machine learning approaches for predicting craniofacial anomalies with graph neural networks. Comput Biol Chem 2025; 115:108294. [PMID: 39642539 PMCID: PMC11850188 DOI: 10.1016/j.compbiolchem.2024.108294] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2024] [Revised: 10/09/2024] [Accepted: 11/25/2024] [Indexed: 12/09/2024]
Abstract
This study explores the use of machine learning algorithms, including traditional approaches and graph neural networks (GNNs), to predict certain diseases by analyzing protein-protein interactions (PPIs). PPIs are complex, multifaceted, and sometimes dynamic, so analyzing them and making predictions based on them presents significant challenges to traditional computational techniques. Machine learning methods, particularly GNNs, can identify complex patterns within large, convoluted datasets, which makes them compelling tools for unraveling these intricate biological networks. We apply machine learning, aided by SHAP explainability, and GNNs to three networks of distinct sizes, ranging from small to large. While the ML results highlight the higher importance of network features in prediction, GNNs exhibit superior accuracy.
Affiliation(s)
- Colten Alme
  - Mechanical Engineering, North Dakota State University, United States of America
- Harun Pirim
  - Industrial and Manufacturing Engineering, North Dakota State University, United States of America
- Yusuf Akbulut
  - Industrial and Manufacturing Engineering, North Dakota State University, United States of America
2
Liu Y, Yuan H, Hu J, Xu X, Yin S, Hu Y, Liu F. A Complex Network of Obesity-Risk Genes Revealed by Systematic Bioinformatics and Single-Cell Transcriptomic Analyses. J Obes 2025; 2025:7821115. [PMID: 40201036 PMCID: PMC11976034 DOI: 10.1155/jobe/7821115] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2024] [Revised: 08/05/2024] [Accepted: 11/23/2024] [Indexed: 04/10/2025] Open
Abstract
The development of obesity is closely linked to genetic factors. Despite the identification of numerous genes associated with an increased risk of obesity in humans, a comprehensive understanding of their biological roles has not been achieved. In our extensive bioinformatics study, we identified 802 core genes implicated in obesity. Our protein-protein interaction (PPI) network analysis revealed that these genes form a tightly connected functional network primarily involved in neurological and metabolic regulatory processes. Moreover, our in-depth analysis of single-cell transcriptomic datasets from the human hypothalamus, pancreatic islets, adipose tissue, and liver has shed light on the distinct expression profiles of these obesity-linked genes across various tissue and cell types. This analysis also highlighted the biological processes they influence and the upstream transcriptional regulatory networks involved. Our study not only uncovers the complicated regulatory role of genetic factors in the pathogenesis and progression of obesity but also establishes a close link between the expression patterns and functional roles of these obesity-associated genes. This study provides crucial insights for advancing our understanding of the genetic mechanisms underlying obesity.
Affiliation(s)
- Yuenan Liu
  - Department of Otolaryngology Head and Neck Surgery, Shanghai Key Laboratory of Sleep Disordered Breathing, Otolaryngological Institute of Shanghai Jiaotong University, Shanghai Jiao Tong University School of Medicine Affiliated Sixth People's Hospital, Shanghai 200233, China
- Haolin Yuan
  - Department of Otolaryngology Head and Neck Surgery, Shanghai Key Laboratory of Sleep Disordered Breathing, Otolaryngological Institute of Shanghai Jiaotong University, Shanghai Jiao Tong University School of Medicine Affiliated Sixth People's Hospital, Shanghai 200233, China
- Junhui Hu
  - Department of Otolaryngology Head and Neck Surgery, Shanghai Key Laboratory of Sleep Disordered Breathing, Otolaryngological Institute of Shanghai Jiaotong University, Shanghai Jiao Tong University School of Medicine Affiliated Sixth People's Hospital, Shanghai 200233, China
- Xu Xu
  - Department of Otolaryngology Head and Neck Surgery, Shanghai Key Laboratory of Sleep Disordered Breathing, Otolaryngological Institute of Shanghai Jiaotong University, Shanghai Jiao Tong University School of Medicine Affiliated Sixth People's Hospital, Shanghai 200233, China
- Shankai Yin
  - Department of Otolaryngology Head and Neck Surgery, Shanghai Key Laboratory of Sleep Disordered Breathing, Otolaryngological Institute of Shanghai Jiaotong University, Shanghai Jiao Tong University School of Medicine Affiliated Sixth People's Hospital, Shanghai 200233, China
- Yiming Hu
  - Department of Otolaryngology Head and Neck Surgery, Shanghai Key Laboratory of Sleep Disordered Breathing, Otolaryngological Institute of Shanghai Jiaotong University, Shanghai Jiao Tong University School of Medicine Affiliated Sixth People's Hospital, Shanghai 200233, China
- Feng Liu
  - Department of Otolaryngology Head and Neck Surgery, Shanghai Key Laboratory of Sleep Disordered Breathing, Otolaryngological Institute of Shanghai Jiaotong University, Shanghai Jiao Tong University School of Medicine Affiliated Sixth People's Hospital, Shanghai 200233, China
3
Ambreen S, Umar M, Noor A, Jain H, Ali R. Advanced AI and ML frameworks for transforming drug discovery and optimization: With innovative insights in polypharmacology, drug repurposing, combination therapy and nanomedicine. Eur J Med Chem 2025; 284:117164. [PMID: 39721292 DOI: 10.1016/j.ejmech.2024.117164] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2024] [Revised: 11/24/2024] [Accepted: 11/27/2024] [Indexed: 12/28/2024]
Abstract
Artificial Intelligence (AI) and Machine Learning (ML) are transforming drug discovery by overcoming traditional challenges such as high costs, long timelines, and frequent failures. AI-driven approaches streamline key phases, including target identification, lead optimization, de novo drug design, and drug repurposing. Frameworks such as deep neural networks (DNNs), convolutional neural networks (CNNs), and deep reinforcement learning (DRL) models have shown promise in identifying drug targets, optimizing delivery systems, and accelerating drug repurposing. Generative adversarial networks (GANs) and variational autoencoders (VAEs) aid de novo drug design by creating novel drug-like compounds with desired properties. Case studies, such as DDR1 kinase inhibitors designed using generative models and CDK20 inhibitors developed via structure-based methods, highlight AI's ability to produce highly specific therapeutics. Models like SNF-CVAE and DeepDR further advance drug repurposing by uncovering new therapeutic applications for existing drugs. Advanced ML algorithms enhance precision in predicting drug efficacy, toxicity, and ADME-Tox properties, reducing development costs and improving drug-target interactions. AI also supports polypharmacology by optimizing multi-target drug interactions and enhances combination therapy through predictions of drug synergies and antagonisms. In nanomedicine, AI models like CURATE.AI and the Hartung algorithm optimize personalized treatments by predicting toxicological risks and real-time dosing adjustments with high accuracy. Despite this potential, challenges such as data quality, model interpretability, and ethical concerns must be addressed. High-quality datasets, transparent models, and unbiased algorithms are essential for reliable AI applications. As AI continues to evolve, it is poised to revolutionize drug discovery and personalized medicine, advancing therapeutic development and patient care.
Affiliation(s)
- Subiya Ambreen
  - Department of Pharmaceutical Chemistry, Delhi Institute of Pharmaceutical Sciences and Research (DIPSAR), DPSRU, Pushp Vihar, New Delhi, 110017, India
- Mohammad Umar
  - Department of Pharmaceutical Chemistry, Delhi Institute of Pharmaceutical Sciences and Research (DIPSAR), DPSRU, Pushp Vihar, New Delhi, 110017, India
- Aaisha Noor
  - Department of Pharmaceutical Chemistry, Delhi Institute of Pharmaceutical Sciences and Research (DIPSAR), DPSRU, Pushp Vihar, New Delhi, 110017, India
- Himangini Jain
  - Department of Pharmaceutical Chemistry, Delhi Institute of Pharmaceutical Sciences and Research (DIPSAR), DPSRU, Pushp Vihar, New Delhi, 110017, India
- Ruhi Ali
  - Department of Pharmaceutical Chemistry, Delhi Institute of Pharmaceutical Sciences and Research (DIPSAR), DPSRU, Pushp Vihar, New Delhi, 110017, India
4
Akid H, Chennen K, Frey G, Thompson J, Ben Ayed M, Lachiche N. Graph-based machine learning model for weight prediction in protein-protein networks. BMC Bioinformatics 2024; 25:349. [PMID: 39511478 PMCID: PMC11546293 DOI: 10.1186/s12859-024-05973-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2024] [Accepted: 10/31/2024] [Indexed: 11/15/2024] Open
Abstract
Proteins interact with each other in complex ways to perform significant biological functions. These interactions, known as protein-protein interactions (PPIs), can be depicted as a graph where proteins are nodes and their interactions are edges. The development of high-throughput experimental technologies allows for the generation of large amounts of data, which permits increasingly sophisticated PPI models. However, despite significant progress, current PPI networks remain incomplete. Discovering missing interactions through experimental techniques can be costly, time-consuming, and challenging. Therefore, computational approaches have emerged as valuable tools for predicting missing interactions. In such a graph, an edge between two proteins indicates a known interaction, while the absence of an edge means the interaction is unknown or has been missed. However, this binary representation overlooks the reliability of known interactions when predicting new ones. To address this challenge, we propose a novel approach for link prediction in weighted protein-protein networks, where interaction weights denote confidence scores. By leveraging data from the yeast Saccharomyces cerevisiae obtained from the STRING database, we introduce a new model that combines similarity-based algorithms and aggregated confidence score weights for accurate link prediction. Our model significantly improves prediction accuracy, surpassing traditional approaches in terms of Mean Absolute Error, Mean Relative Absolute Error, and Root Mean Square Error. Our proposed approach holds the potential for improved accuracy in predicting PPIs, which is crucial for better understanding the underlying biological processes.
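To make the idea of weight prediction in a confidence-weighted PPI graph concrete, here is a minimal sketch. It is not the authors' model: the weighted common-neighbour score, the toy graph, and all names (`EDGES`, `predict_weight`, `mae`, `rmse`) are assumptions chosen for illustration. It only shows how a similarity index can use edge confidence scores and how a prediction can be scored with MAE and RMSE.

```python
import math

# Toy weighted PPI graph. Weights stand in for confidence scores in [0, 1];
# the protein names and scores are invented for illustration only.
EDGES = {
    ("A", "B"): 0.9,
    ("A", "C"): 0.8,
    ("B", "C"): 0.7,
    ("B", "D"): 0.4,
    ("C", "D"): 0.6,
}


def build_adjacency(edges):
    """Return a symmetric adjacency map {protein: {neighbour: weight}}."""
    adj = {}
    for (u, v), w in edges.items():
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    return adj


def predict_weight(adj, u, v):
    """Weighted common-neighbour score (an assumed, simple similarity index).

    For each shared neighbour z, average the two confidence scores on the
    path u-z-v, then average over all shared neighbours. Returns 0.0 when
    u and v share no neighbour.
    """
    common = set(adj.get(u, {})) & set(adj.get(v, {}))
    if not common:
        return 0.0
    return sum((adj[u][z] + adj[z][v]) / 2 for z in common) / len(common)


def mae(pairs):
    """Mean absolute error over (true, predicted) pairs."""
    return sum(abs(t - p) for t, p in pairs) / len(pairs)


def rmse(pairs):
    """Root mean square error over (true, predicted) pairs."""
    return math.sqrt(sum((t - p) ** 2 for t, p in pairs) / len(pairs))


# Hold out one known edge, predict its weight from the rest, and score it.
held_out = ("B", "C")
train = {e: w for e, w in EDGES.items() if e != held_out}
adj = build_adjacency(train)
results = [(EDGES[held_out], predict_weight(adj, *held_out))]
print(f"MAE={mae(results):.3f} RMSE={rmse(results):.3f}")
```

In practice the held-out set would contain many edges, and the similarity index would be one of several compared against each other.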
Affiliation(s)
- Hajer Akid
  - ICube, University of Strasbourg, 67412, Illkirch Cedex, France
- Kirsley Chennen
  - ICube, University of Strasbourg, 67412, Illkirch Cedex, France
- Gabriel Frey
  - ICube, University of Strasbourg, 67412, Illkirch Cedex, France
- Julie Thompson
  - ICube, University of Strasbourg, 67412, Illkirch Cedex, France
5
Pancino N, Gallegati C, Romagnoli F, Bongini P, Bianchini M. Protein-Protein Interfaces: A Graph Neural Network Approach. Int J Mol Sci 2024; 25:5870. [PMID: 38892057 PMCID: PMC11173158 DOI: 10.3390/ijms25115870] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2024] [Revised: 05/15/2024] [Accepted: 05/24/2024] [Indexed: 06/21/2024] Open
Abstract
Protein-protein interactions (PPIs) are fundamental processes governing cellular functions, crucial for understanding biological systems at the molecular level. Compared to experimental methods for PPI prediction and site identification, computational deep learning approaches represent an affordable and efficient solution to tackle these problems. Since protein structure can be summarized as a graph, graph neural networks (GNNs) represent the ideal deep learning architecture for the task. In this work, PPI prediction is modeled as a node-focused binary classification task using a GNN to determine whether a generic residue is part of the interface. Biological data were obtained from the Protein Data Bank in Europe (PDBe), leveraging the Protein Interfaces, Surfaces, and Assemblies (PISA) service. To gain a deeper understanding of how proteins interact, the data obtained from PISA were assembled into three datasets: Whole, Interface, and Chain, consisting of data on the whole protein, couples of interacting chains, and single chains, respectively. These three datasets correspond to three different nuances of the problem: identifying interfaces between protein complexes, between chains of the same protein, and interface regions in general. The results indicate that GNNs are capable of solving each of the three tasks with very good performance levels.
Affiliation(s)
- Niccolò Pancino
  - Department of Information Engineering and Mathematics, University of Siena, Via Roma, 56, 53100 Siena, Italy; (C.G.); (P.B.); (M.B.)
6
Kim DN, McNaughton AD, Kumar N. Leveraging Artificial Intelligence to Expedite Antibody Design and Enhance Antibody-Antigen Interactions. Bioengineering (Basel) 2024; 11:185. [PMID: 38391671 PMCID: PMC10886287 DOI: 10.3390/bioengineering11020185] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2023] [Revised: 01/30/2024] [Accepted: 02/06/2024] [Indexed: 02/24/2024] Open
Abstract
This perspective sheds light on the transformative impact of recent computational advancements in the field of protein therapeutics, with a particular focus on the design and development of antibodies. Cutting-edge computational methods have revolutionized our understanding of protein-protein interactions (PPIs), enhancing the efficacy of protein therapeutics in preclinical and clinical settings. Central to these advancements is the application of machine learning and deep learning, which offers unprecedented insights into the intricate mechanisms of PPIs and facilitates precise control over protein functions. Despite these advancements, the complex structural nuances of antibodies pose ongoing challenges in their design and optimization. Our review provides a comprehensive exploration of the latest deep learning approaches, including language models and diffusion techniques, and their role in surmounting these challenges. We also present a critical analysis of these methods, offering insights to drive further progress in this rapidly evolving field. The paper includes practical recommendations for the application of these computational techniques, supplemented with independent benchmark studies. These studies focus on key performance metrics such as accuracy and the ease of program execution, providing a valuable resource for researchers engaged in antibody design and development. Through this detailed perspective, we aim to contribute to the advancement of antibody design, equipping researchers with the tools and knowledge to navigate the complexities of this field.
Affiliation(s)
- Neeraj Kumar
  - Pacific Northwest National Laboratory, 902 Battelle Blvd., Richland, WA 99352, USA; (D.N.K.); (A.D.M.)
7
Fu X, Yuan Y, Qiu H, Suo H, Song Y, Li A, Zhang Y, Xiao C, Li Y, Dou L, Zhang Z, Cui F. AGF-PPIS: A protein-protein interaction site predictor based on an attention mechanism and graph convolutional networks. Methods 2024; 222:142-151. [PMID: 38242383 DOI: 10.1016/j.ymeth.2024.01.006] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/21/2023] [Revised: 01/04/2024] [Accepted: 01/13/2024] [Indexed: 01/21/2024] Open
Abstract
Protein-protein interactions play an important role in various biological processes, and knowledge of how proteins interact has a wide range of applications. Therefore, the correct identification of protein-protein interaction sites is crucial. In this paper, we propose a novel predictor for protein-protein interaction sites, AGF-PPIS, which combines a multi-head self-attention mechanism (introducing a graph structure), a graph convolutional network, and a feed-forward neural network. We use the Euclidean distances between protein residues to generate the corresponding protein graph as the input of AGF-PPIS. On the independent test dataset Test_60, AGF-PPIS achieves superior performance over comparative methods in terms of seven different evaluation metrics (ACC, precision, recall, F1-score, MCC, AUROC, AUPRC), which demonstrates the validity of the proposed AGF-PPIS model. The source code and the steps for using AGF-PPIS are available at https://github.com/fxh1001/AGF-PPIS.
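The graph-construction step described above, where residues become nodes and Euclidean distances decide the edges, can be sketched as follows. This is a minimal illustration, not the AGF-PPIS code: the abstract does not state a distance cutoff or which atom represents each residue, so the 8.0 cutoff, the function name `residue_graph`, and the toy coordinates are all assumptions.

```python
import math


def residue_graph(coords, cutoff=8.0):
    """Build a binary adjacency matrix from residue coordinates.

    coords: list of (x, y, z) tuples, one per residue (for example one
            representative atom per residue; which atom is an assumption).
    cutoff: residues closer than or equal to this distance are connected.
            The default is an illustrative value, not taken from the paper.
    """
    n = len(coords)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                adj[i][j] = adj[j][i] = 1  # undirected edge
    return adj


# Toy chain of four residues; the last one sits far away from the others.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
toy_adj = residue_graph(toy_coords)
for row in toy_adj:
    print(row)
```

The resulting matrix is the kind of input a graph convolutional layer expects alongside per-residue feature vectors.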
Affiliation(s)
- Xiuhao Fu
  - School of Computer Science and Technology, Hainan University, Haikou 570228, China
- Ye Yuan
  - Beidahuang Industry Group General Hospital, Harbin 150001, China
- Haoye Qiu
  - School of Computer Science and Technology, Hainan University, Haikou 570228, China
- Haodong Suo
  - School of Computer Science and Technology, Hainan University, Haikou 570228, China
- Yingying Song
  - School of Computer Science and Technology, Hainan University, Haikou 570228, China
- Anqi Li
  - School of Computer Science and Technology, Hainan University, Haikou 570228, China
- Yupeng Zhang
  - School of Computer Science and Technology, Hainan University, Haikou 570228, China
- Cuilin Xiao
  - School of Computer Science and Technology, Hainan University, Haikou 570228, China
- Yazi Li
  - School of Computer Science and Technology, Hainan University, Haikou 570228, China
- Lijun Dou
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland, OH 44106, USA
- Zilong Zhang
  - School of Computer Science and Technology, Hainan University, Haikou 570228, China
- Feifei Cui
  - School of Computer Science and Technology, Hainan University, Haikou 570228, China
8
Cong H, Liu H, Cao Y, Liang C, Chen Y. Protein-protein interaction site prediction by model ensembling with hybrid feature and self-attention. BMC Bioinformatics 2023; 24:456. [PMID: 38053020 DOI: 10.1186/s12859-023-05592-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2022] [Accepted: 11/30/2023] [Indexed: 12/07/2023] Open
Abstract
BACKGROUND: Protein-protein interactions (PPIs) are crucial in various biological functions and cellular processes. Thus, many computational approaches have been proposed to predict PPI sites. Although significant progress has been made, these methods still have limitations in encoding the characteristics of each amino acid in sequences. Many feature extraction methods rely on the sliding window technique, which simply merges all the features of residues into a vector. The importance of some key residues may be weakened in the feature vector, leading to poor performance.
RESULTS: We propose a novel sequence-based method for PPI site prediction. The new network model, PPINet, contains multiple feature processing paths. For a residue, PPINet extracts the features of the targeted residue and its context separately. These two types of features are processed by two paths in the network and combined to form a protein representation, where the two types of features are of relatively equal importance. The model ensembling technique is applied to make use of more features. The base models are trained with different features and then ensembled via stacking. In addition, a data balancing strategy is presented, by which our model achieves significant improvement on highly unbalanced data.
CONCLUSION: The proposed method is evaluated on a fused dataset constructed from Dset_186, Dset_72, and PDBset_164, as well as the public Dset_448 dataset. Our method outperforms current state-of-the-art methods. In the most important metrics, AUPRC and recall, it surpasses the second-best method on the latter dataset by 6.9% and 4.7%, respectively. We also demonstrate that the improvement is essentially due to using the ensemble model and, especially, the hybrid feature. We share our code for reproducibility and future research at https://github.com/CandiceCong/StackingPPINet.
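The contrast drawn above, between merging a whole sliding window into one vector and keeping the target residue apart from its context, can be illustrated with a short sketch. It is not the PPINet implementation: the window size, the zero padding at sequence ends, and the function name `target_and_context` are assumptions made for this example.

```python
def target_and_context(features, index, half_window=1):
    """Split per-residue features into a target part and a context part.

    features:    list of per-residue feature vectors (lists of floats).
    index:       position of the residue being classified.
    half_window: number of neighbours taken on each side (assumed value).

    Returns (target, context), where target is the feature vector of the
    residue itself and context is the concatenation of its neighbours'
    vectors, zero-padded where the window runs past either end.
    """
    dim = len(features[0])
    padding = [0.0] * dim
    target = list(features[index])
    context = []
    for k in range(index - half_window, index + half_window + 1):
        if k == index:
            continue  # the target is handled by its own path
        context.extend(features[k] if 0 <= k < len(features) else padding)
    return target, context


# Toy sequence of three residues with one-dimensional features.
toy_features = [[1.0], [2.0], [3.0]]
print(target_and_context(toy_features, 0))
print(target_and_context(toy_features, 1))
```

Keeping the two parts separate lets a model give each its own processing path before combining them, so the target residue is not diluted by its neighbours.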
Affiliation(s)
- Hanhan Cong
  - School of Information Science and Engineering, Shandong Normal University, Jinan, China
  - Shandong Provincial Key Laboratory for Novel Distributed Computer Software Technology, Jinan, China
- Hong Liu
  - School of Information Science and Engineering, Shandong Normal University, Jinan, China
  - Shandong Provincial Key Laboratory for Novel Distributed Computer Software Technology, Jinan, China
- Yi Cao
  - School of Information Science and Engineering, University of Jinan, Jinan, China
  - Shandong Provincial Key Laboratory of Network Based Intelligent Computing, Jinan, China
- Cheng Liang
  - School of Information Science and Engineering, Shandong Normal University, Jinan, China
- Yuehui Chen
  - School of Information Science and Engineering, University of Jinan, Jinan, China
  - Shandong Provincial Key Laboratory of Network Based Intelligent Computing, Jinan, China
9
Chen J, Gu Z, Lai L, Pei J. In silico protein function prediction: the rise of machine learning-based approaches. MEDICAL REVIEW (2021) 2023; 3:487-510. [PMID: 38282798 PMCID: PMC10808870 DOI: 10.1515/mr-2023-0038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Accepted: 10/11/2023] [Indexed: 01/30/2024]
Abstract
Proteins function as integral actors in essential life processes, rendering the realm of protein research a fundamental domain that possesses the potential to propel advancements in pharmaceuticals and disease investigation. Within the context of protein research, an imperious demand arises to uncover protein functionalities and untangle intricate mechanistic underpinnings. Due to the exorbitant costs and limited throughput inherent in experimental investigations, computational models offer a promising alternative to accelerate protein function annotation. In recent years, protein pre-training models have exhibited noteworthy advancement across multiple prediction tasks. This advancement highlights a notable prospect for effectively tackling the intricate downstream task associated with protein function prediction. In this review, we elucidate the historical evolution and research paradigms of computational methods for predicting protein function. Subsequently, we summarize the progress in protein and molecule representation as well as feature extraction techniques. Furthermore, we assess the performance of machine learning-based algorithms across various objectives in protein function prediction, thereby offering a comprehensive perspective on the progress within this field.
Affiliation(s)
- Jiaxiao Chen
  - Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
- Zhonghui Gu
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
- Luhua Lai
  - Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
  - BNLMS, College of Chemistry and Molecular Engineering, Peking University, Beijing, China
  - Research Unit of Drug Design Method, Chinese Academy of Medical Sciences (2021RU014), Beijing, China
- Jianfeng Pei
  - Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
  - Research Unit of Drug Design Method, Chinese Academy of Medical Sciences (2021RU014), Beijing, China
10
Binothman N, Aljadani M, Alghanem B, Refai MY, Rashid M, Al Tuwaijri A, Alsubhi NH, Alrefaei GI, Khan MY, Sonbul SN, Aljoud F, Alhayyani S, Abdulal RH, Ganash M, Hashem AM. Identification of novel interacts partners of ADAR1 enzyme mediating the oncogenic process in aggressive breast cancer. Sci Rep 2023; 13:8341. [PMID: 37221310 PMCID: PMC10206070 DOI: 10.1038/s41598-023-35517-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2022] [Accepted: 05/19/2023] [Indexed: 05/25/2023] Open
Abstract
Triple-negative breast cancer (TNBC) is characterized by aggressive clinical behavior and poor patient outcomes. Here, we show that ADAR1 is more abundantly expressed in infiltrating breast cancer (BC) tumors than in benign tumors. Further, ADAR1 protein expression is higher in aggressive BC cells (MDA-MB-231). Moreover, we identify a list of novel ADAR1-interacting proteins in MDA-MB-231 cells using an immunoprecipitation assay and mass spectrometry. Using iLoop, a protein-protein interaction prediction server based on structural features, five proteins with high iLoop scores (between 0.6 and 0.8) were identified: Histone H2A.V, Kynureninase (KYNU), 40S ribosomal protein SA, Complement C4-A, and Nebulin. In silico analysis showed that invasive ductal carcinomas had a higher level of KYNU gene expression than the other classifications (p < 0.0001). Moreover, KYNU mRNA expression was considerably higher in TNBC patients (p < 0.0001) and was associated with poor patient outcomes with a high-risk value. Importantly, we found an interaction between ADAR1 and KYNU in the more aggressive BC cells. Altogether, these results propose a new ADAR1-KYNU interaction as a potential therapeutic target in aggressive BC.
Affiliation(s)
- Najat Binothman
  - Department of Chemistry, College of Sciences and Arts, King Abdulaziz University, Rabigh, Saudi Arabia
  - Vaccine and Immunotherapy Unit, King Fahad Medical Research Center, King Abdulaziz University Saudi Arabia, Jeddah, Saudi Arabia
- Majidah Aljadani
  - Department of Chemistry, College of Sciences and Arts, King Abdulaziz University, Rabigh, Saudi Arabia
- Bandar Alghanem
  - Medical Research Core Facility and Platforms (MRCFP), King Abdullah International Medical Research Center/King Saud bin Abdulaziz University for Health Sciences (KSAU-HS), King Abdulaziz Medical City (KAMC), National Guard Health Affairs (NGHA), Riyadh, Saudi Arabia
- Mohammed Y Refai
  - Department of Biochemistry, College of Science, University of Jeddah, Jeddah, Saudi Arabia
- Mamoon Rashid
  - Department of AI and Bioinformatics, King Abdullah International Medical Research Center (KAIMRC), King Saud Bin Abdulaziz University for Health Sciences (KSAU-HS), King Abdulaziz Medical City, Ministry of National Guard Health Affairs, P.O. Box 22490, Riyadh, 11426, Saudi Arabia
- Abeer Al Tuwaijri
  - Medical Genomics Research Department, King Abdullah International Medical Research Center (KAIMRC), Ministry of National Guard Health Affairs (MNGH), Riyadh, Saudi Arabia
  - Clinical Laboratory Sciences Department, College of Applied Medical Sciences, King Saud Bin Abdulaziz University for Health Sciences, Riyadh, Saudi Arabia
- Nouf H Alsubhi
  - Biological Sciences Department, College of Science & Arts, King Abdulaziz University, Rabigh, 21911, Saudi Arabia
- Ghadeer I Alrefaei
  - Department of Biology, College of Science, University of Jeddah, Jeddah, Saudi Arabia
- Muhammad Yasir Khan
  - Vaccine and Immunotherapy Unit, King Fahad Medical Research Center, King Abdulaziz University Saudi Arabia, Jeddah, Saudi Arabia
  - Department of Biology, Faculty of Science, King Abdulaziz University, Jeddah, 21589, Saudi Arabia
- Sultan N Sonbul
  - Biochemistry Department, Faculty of Sciences, King Abdulaziz University, Jeddah, Saudi Arabia
  - Experimental Biochemistry Unit, King Fahd Medical Research Center, King Abdulaziz University, Jeddah, Saudi Arabia
- Fadwa Aljoud
  - Department of Biology, Faculty of Science, King Abdulaziz University, Jeddah, 21589, Saudi Arabia
  - Regenerative Medicine Unit, King Fahd Medical Research Centre, King Abdulaziz University, Jeddah, 21589, Saudi Arabia
- Sultan Alhayyani
  - Department of Chemistry, College of Sciences and Arts, King Abdulaziz University, Rabigh, Saudi Arabia
- Rwaa H Abdulal
  - Vaccine and Immunotherapy Unit, King Fahad Medical Research Center, King Abdulaziz University Saudi Arabia, Jeddah, Saudi Arabia
  - Department of Biology, Faculty of Science, King Abdulaziz University, Jeddah, 21589, Saudi Arabia
- Magdah Ganash
  - Department of Biology, Faculty of Science, King Abdulaziz University, Jeddah, 21589, Saudi Arabia
- Anwar M Hashem
  - Vaccine and Immunotherapy Unit, King Fahad Medical Research Center, King Abdulaziz University Saudi Arabia, Jeddah, Saudi Arabia
  - Department of Medical Microbiology and Parasitology, Faculty of Medicine, King AbdulAziz University, Jeddah, Saudi Arabia
11
Li B, Altelaar M, van Breukelen B. Identification of Protein Complexes by Integrating Protein Abundance and Interaction Features Using a Deep Learning Strategy. Int J Mol Sci 2023; 24:7884. [PMID: 37175590 PMCID: PMC10178578 DOI: 10.3390/ijms24097884] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/23/2023] [Revised: 04/23/2023] [Accepted: 04/24/2023] [Indexed: 05/15/2023] Open
Abstract
Many essential cellular functions are carried out by multi-protein complexes that can be characterized by their protein-protein interactions. The interactions between protein subunits are critically dependent on the strengths of their interactions and their cellular abundances, both of which span orders of magnitude. Despite many efforts devoted to the global discovery of protein complexes by integrating large-scale protein abundance and interaction features, there is still room for improvement. Here, we integrated >7000 quantitative proteomic samples with three published affinity purification/co-fractionation mass spectrometry datasets into a deep learning framework to predict protein-protein interactions (PPIs), followed by the identification of protein complexes using a two-stage clustering strategy. Our deep-learning-technique-based classifier significantly outperformed recently published machine learning prediction models and in the process captured 5010 complexes containing over 9000 unique proteins. The vast majority of proteins in our predicted complexes exhibited low or no tissue specificity, which is an indication that the observed complexes tend to be ubiquitously expressed throughout all cell types and tissues. Interestingly, our combined approach increased the model sensitivity for low abundant proteins, which amongst other things allowed us to detect the interaction of MCM10, which connects to the replicative helicase complex via the MCM6 protein. The integration of protein abundances and their interaction features using a deep learning approach provided a comprehensive map of protein-protein interactions and a unique perspective on possible novel protein complexes.
Affiliation(s)
- Bohui Li
  - Biomolecular Mass Spectrometry and Proteomics, Padualaan 8, 3584 CH Utrecht, The Netherlands
  - Utrecht Institute for Pharmaceutical Sciences (UIPS), Utrecht University, Universiteitsweg 99, 3584 CG Utrecht, The Netherlands
- Maarten Altelaar
  - Biomolecular Mass Spectrometry and Proteomics, Padualaan 8, 3584 CH Utrecht, The Netherlands
  - Utrecht Institute for Pharmaceutical Sciences (UIPS), Utrecht University, Universiteitsweg 99, 3584 CG Utrecht, The Netherlands
  - Mass Spectrometry and Proteomics Facility, The Netherlands Cancer Institute, 1066 CX Amsterdam, The Netherlands
- Bas van Breukelen
  - Biomolecular Mass Spectrometry and Proteomics, Padualaan 8, 3584 CH Utrecht, The Netherlands
  - Utrecht Institute for Pharmaceutical Sciences (UIPS), Utrecht University, Universiteitsweg 99, 3584 CG Utrecht, The Netherlands
|
12
|
Jha K, Karmakar S, Saha S. Graph-BERT and language model-based framework for protein-protein interaction identification. Sci Rep 2023; 13:5663. [PMID: 37024543 PMCID: PMC10079975 DOI: 10.1038/s41598-023-31612-w] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 11/28/2022] [Accepted: 03/14/2023] [Indexed: 04/08/2023] Open
Abstract
Identification of protein-protein interactions (PPIs) is among the critical problems in bioinformatics. As artificial intelligence (AI) techniques have advanced, previous studies have applied different AI-based models to PPI classification. The input to these models is a set of features extracted from different sources of protein information, mainly sequence-derived features. In this work, we present an AI-based PPI identification model that uses a PPI network and protein sequences. The PPI network is represented as a graph in which each node is a protein pair, and an edge is defined between two nodes if they share a common protein. Each node in the graph has a feature vector. We use a language model to extract feature vectors directly from protein sequences. The feature vectors of the two proteins in a pair are concatenated and used as the node feature vector in the PPI network graph. Finally, we use the Graph-BERT model to encode the PPI network graph with sequence-based features and learn a hidden representation of the feature vector for each node. The learned node representations are then fed to a fully connected layer, the output of which is passed to a softmax layer to classify the protein interactions. To assess the efficacy of the proposed PPI model, we performed experiments on several PPI datasets. The experimental results demonstrate that the proposed approach surpasses existing PPI methods and the designed baselines in classifying PPIs.
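The graph construction described in this abstract (nodes are protein pairs, edges join pairs that share a protein, node features are concatenated per-protein vectors) can be illustrated with a minimal Python sketch. The function names, protein identifiers, and two-dimensional embeddings are hypothetical placeholders, not taken from the paper; the language-model embedding and the Graph-BERT encoder are not reproduced here.

```python
from itertools import combinations

def build_pair_graph(pairs):
    """Return (nodes, edges): each node is a protein pair, and two nodes
    are linked when they have at least one protein in common."""
    nodes = [tuple(p) for p in pairs]
    edges = []
    for i, j in combinations(range(len(nodes)), 2):
        if set(nodes[i]) & set(nodes[j]):
            edges.append((i, j))
    return nodes, edges

def node_features(nodes, embeddings):
    """Concatenate the two per-protein vectors of each pair into one node vector."""
    return [embeddings[a] + embeddings[b] for a, b in nodes]

# Hypothetical toy data: names and embedding values are placeholders.
pairs = [("P1", "P2"), ("P2", "P3"), ("P4", "P5")]
embeddings = {
    "P1": [0.1, 0.2], "P2": [0.3, 0.4], "P3": [0.5, 0.6],
    "P4": [0.7, 0.8], "P5": [0.9, 1.0],
}

nodes, edges = build_pair_graph(pairs)
features = node_features(nodes, embeddings)
print(edges)        # pairs 0 and 1 share P2
print(features[0])  # concatenation of the P1 and P2 vectors
```

The pairwise loop is quadratic in the number of pairs, which is fine for a toy example; a real dataset would more likely index pairs by protein to find neighbours.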
Affiliation(s)
- Kanchan Jha
  - Department of Computer Science and Engineering, Indian Institute of Technology Patna, Patna, Bihar, 801103, India
- Sourav Karmakar
  - Department of Computer Science and Engineering, National Institute of Technology Durgapur, Durgapur, West Bengal, 713209, India
- Sriparna Saha
  - Department of Computer Science and Engineering, Indian Institute of Technology Patna, Patna, Bihar, 801103, India
|
13
|
DeepCF-PPI: improved prediction of protein-protein interactions by combining learned and handcrafted features based on attention mechanisms. Appl Intell 2023. [DOI: 10.1007/s10489-022-04387-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/18/2023]
|
14
|
Ortiz-Vilchis P, De-la-Cruz-García JS, Ramirez-Arellano A. Identification of Relevant Protein Interactions with Partial Knowledge: A Complex Network and Deep Learning Approach. Biology 2023; 12:140. [PMID: 36671832 PMCID: PMC9856098 DOI: 10.3390/biology12010140] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2022] [Revised: 01/11/2023] [Accepted: 01/12/2023] [Indexed: 01/18/2023]
Abstract
Protein-protein interactions (PPIs) are the basis for understanding most cellular events in biological systems. Several experimental methods, e.g., biochemical, molecular, and genetic methods, have been used to identify protein-protein associations. However, some of them, such as mass spectrometry, are time-consuming and expensive. Machine learning (ML) techniques have been widely used to characterize PPIs, increasing the number of proteins analyzed simultaneously and optimizing time and resources for identifying and predicting protein-protein functional linkages. Previous ML approaches have focused on well-known networks or specific targets but not on identifying relevant proteins with partial or null knowledge of the interaction networks. The proposed approach aims to generate a relevant protein sequence based on bidirectional Long Short-Term Memory (LSTM) with partial knowledge of interactions. The general framework comprises conducting a scale-free and fractal complex network analysis. The outcome of these analyses is then used to fine-tune the fractal method for the vital protein extraction of PPI networks. The results show that several PPI networks are self-similar or fractal, but that both features cannot coexist. The protein sequences generated by the bidirectional LSTM contain an average of 39.5% of the proteins in the original sequence. The average length of the generated sequences was 17% of the original one. Finally, 95% of the generated sequences were true.
Affiliation(s)
- Pilar Ortiz-Vilchis
  - Sección de Estudios de Posgrado e Investigación, Escuela Superior de Medicina, Instituto Politécnico Nacional, Mexico City 11340, Mexico
- Jazmin-Susana De-la-Cruz-García
  - Sección de Estudios de Posgrado e Investigación, Unidad Profesional Interdisciplinaria de Ingeniería y Ciencias Sociales y Administrativas, Instituto Politécnico Nacional, Mexico City 08400, Mexico
- Aldo Ramirez-Arellano
  - Sección de Estudios de Posgrado e Investigación, Unidad Profesional Interdisciplinaria de Ingeniería y Ciencias Sociales y Administrativas, Instituto Politécnico Nacional, Mexico City 08400, Mexico
|
15
|
Vora DS, Kalakoti Y, Sundar D. Computational Methods and Deep Learning for Elucidating Protein Interaction Networks. Methods Mol Biol 2023; 2553:285-323. [PMID: 36227550 DOI: 10.1007/978-1-0716-2617-7_15] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Protein interactions play a critical role in all biological processes, but experimental identification of protein interactions is a time- and resource-intensive process. The advances in next-generation sequencing and multi-omics technologies have greatly benefited large-scale predictions of protein interactions using machine learning methods. A wide range of tools have been developed to predict protein-protein, protein-nucleic acid, and protein-drug interactions. Here, we discuss the applications, methods, and challenges faced when employing the various prediction methods. We also briefly describe ways to overcome the challenges and prospective future developments in the field of protein interaction biology.
Affiliation(s)
- Dhvani Sandip Vora
  - Department of Biochemical Engineering and Biotechnology, Indian Institute of Technology Delhi, Hauz Khas, New Delhi, India
- Yogesh Kalakoti
  - Department of Biochemical Engineering and Biotechnology, Indian Institute of Technology Delhi, Hauz Khas, New Delhi, India
- Durai Sundar
  - Department of Biochemical Engineering and Biotechnology, Indian Institute of Technology Delhi, Hauz Khas, New Delhi, India
  - School of Artificial Intelligence, Indian Institute of Technology Delhi, Hauz Khas, New Delhi, India
|
16
|
Jha K, Saha S. Analyzing Effect of Multi-Modality in Predicting Protein-Protein Interactions. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:162-173. [PMID: 35259112 DOI: 10.1109/tcbb.2022.3157531] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 05/04/2023]
Abstract
Nowadays, multiple sources of information about proteins are available, such as protein sequences, 3D structures, and Gene Ontology (GO). Most work on protein-protein interaction (PPI) identification has used this information, mainly sequence-based, but only one source at a time. Recent advances in deep learning techniques allow us to leverage multiple sources/modalities of protein information, which complement each other. Some recent works have shown that multi-modal PPI models perform better than uni-modal approaches. This paper investigates whether the performance of multi-modal PPI models is always consistent or depends on other factors, such as the dataset distribution and the algorithms used to learn features. We used three modalities for this study: protein sequence, 3D structure, and GO. Various techniques, including deep learning algorithms, are employed to extract features from these sources. The feature vectors from different modalities are then integrated in several combinations (bi-modal and tri-modal) to predict PPIs. To conduct this study, we used Human and S. cerevisiae PPI datasets. The results demonstrate the potential of a multi-modal approach and deep learning techniques in predicting protein interactions. However, the predictive capability of a PPI model also depends on the feature extraction methods. Moreover, adding modalities does not always improve performance. In this study, the PPI model integrating two modalities outperforms the designed uni-modal and tri-modal PPI models.
|
17
|
Murakami Y, Mizuguchi K. Recent developments of sequence-based prediction of protein-protein interactions. Biophys Rev 2022; 14:1393-1411. [PMID: 36589735 PMCID: PMC9789376 DOI: 10.1007/s12551-022-01038-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Accepted: 12/08/2022] [Indexed: 12/25/2022] Open
Abstract
The identification of protein-protein interactions (PPIs) can lead to a better understanding of cellular functions and biological processes of proteins and contribute to the design of drugs to target disease-causing PPIs. In addition, targeting host-pathogen PPIs is useful for elucidating infection mechanisms. Although several experimental methods have been used to identify PPIs, these methods have yet to draw complete PPI networks. Hence, computational techniques are increasingly required to predict potential PPIs that have never been observed experimentally. Recent high-performance sequence-based methods have contributed to the construction of PPI networks and the elucidation of pathogenetic mechanisms in specific diseases. However, the usefulness of these methods depends on the quality and quantity of the PPI training data. In this brief review, we introduce currently available PPI databases and recent sequence-based methods for predicting PPIs. We also discuss key issues in this field and present future perspectives on sequence-based PPI prediction.
Affiliation(s)
- Yoichi Murakami
  - Tokyo University of Information Sciences, 4-1 Onaridai, Wakaba-Ku, Chiba, 265-8501 Japan
- Kenji Mizuguchi
  - Institute for Protein Research, Osaka University, 3-2 Yamadaoka, Suita-Shi, Osaka, 565-0871 Japan
  - National Institutes of Biomedical Innovation, Health and Nutrition, 7-6-8 Saito Asagi, Ibaraki, Osaka 567-0085 Japan
|
18
|
Zheng Q, Lin R, Chen Y, Lv Q, Zhang J, Zhai J, Xu W, Wang W. SARS-CoV-2 induces "cytokine storm" hyperinflammatory responses in RA patients through pyroptosis. Front Immunol 2022; 13:1058884. [PMID: 36532040 PMCID: PMC9751040 DOI: 10.3389/fimmu.2022.1058884] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 09/30/2022] [Accepted: 11/15/2022] [Indexed: 12/04/2022] Open
Abstract
Background: Coronavirus disease (COVID-19) is a pandemic disease that threatens worldwide public health, and rheumatoid arthritis (RA) is the most common autoimmune disease. COVID-19 and RA are each strong risk factors for the other, but their molecular mechanisms are unclear. This study aims to investigate biomarkers shared by COVID-19 and RA from the perspective of pyroptosis and to find effective disease-targeting drugs.
Methods: We obtained the common genes (Co-genes) shared by COVID-19, RA (GSE55235), and pyroptosis using bioinformatics analysis and then performed principal component analysis (PCA). The Co-genes were evaluated by Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), and ClueGO for functional enrichment; the protein-protein interaction (PPI) network was built with STRING; and the k-means machine learning algorithm was employed for cluster analysis. Modular analysis with Cytoscape was used to identify hub genes, Metascape and GeneMANIA were used for functional enrichment analysis, and NetworkAnalyst was used for gene-drug prediction. Network pharmacology analysis was performed to identify target drug-related genes intersecting with COVID-19, RA, and pyroptosis to acquire Co-hub genes and to construct transcription factor (TF)-hub gene and miRNA-hub gene networks with NetworkAnalyst. The Co-hub genes were validated using GSE55457 and GSE93272 to acquire the Key gene, and their efficacy was assessed using receiver operating characteristic (ROC) curves; SPEED2 was then used to determine the upstream pathway. Immune cell infiltration was analyzed using CIBERSORT and validated with the HPA database. Molecular docking, molecular dynamics simulation, and molecular mechanics-generalized Born surface area (MM-GBSA) calculations were used to explore and validate drug-gene relationships through computer-aided drug design.
Results: COVID-19, RA, and pyroptosis-related genes were enriched in pyroptosis and pro-inflammatory pathways (the NOD-like receptor family pyrin domain containing 3 (NLRP3) inflammasome complex, death-inducing signaling complex, regulation of interleukin production), natural immune pathways (network map of SARS-CoV-2 signaling pathway, activation of NLRP3 inflammasome by SARS-CoV-2), and COVID-19- and RA-related cytokine storm pathways (IL, nuclear factor-kappa B (NF-κB), TNF signaling pathway, and regulation of cytokine-mediated signaling). Of these, CASP1 is the most involved pathway and is closely related to minocycline. YY1, hsa-mir-429, and hsa-mir-34a-5p play an important role in the expression of CASP1. Monocytes are high-caspase-1-expressing sentinel cells. Minocycline can generate a highly stable state for biochemical activity by docking closely with the active region of caspase-1.
Conclusions: Caspase-1 is a common biomarker for COVID-19, RA, and pyroptosis, and it may be an important mediator of the excessive inflammatory response induced by SARS-CoV-2 in RA patients through pyroptosis. Minocycline may counteract cytokine storm inflammation in patients with COVID-19 combined with RA by inhibiting caspase-1 expression.
Affiliation(s)
- Qingcong Zheng
  - Department of Orthopedics, 900th Hospital of Joint Logistics Support Force, Fuzhou, China
- Rongjie Lin
  - Department of Orthopedics, 900th Hospital of Joint Logistics Support Force, Fuzhou, China
- Yuchao Chen
  - Department of Paediatrics, Fujian Provincial Hospital South Branch, Fuzhou, China
- Qi Lv
  - Department of Orthopedics, 900th Hospital of Joint Logistics Support Force, Fuzhou, China
- Jin Zhang
  - Department of Pharmacology and Toxicology, University of Mississippi Medical Center, Jackson, MS, United States
- Jingbo Zhai
  - Key Laboratory of Zoonose Prevention and Control at Universities of Inner Mongolia Autonomous Region, Medical College, Inner Mongolia Minzu University, Tongliao, China
- Weihong Xu (corresponding author)
  - Department of Orthopedics, First Affiliated Hospital of Fujian Medical University, Fuzhou, China
- Wanming Wang (corresponding author)
  - Department of Orthopedics, 900th Hospital of Joint Logistics Support Force, Fuzhou, China
|
19
|
Singh N, Villoutreix BO. A Hybrid Docking and Machine Learning Approach to Enhance the Performance of Virtual Screening Carried out on Protein-Protein Interfaces. Int J Mol Sci 2022; 23:14364. [PMID: 36430841 PMCID: PMC9694378 DOI: 10.3390/ijms232214364] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 09/28/2022] [Revised: 11/11/2022] [Accepted: 11/16/2022] [Indexed: 11/22/2022] Open
Abstract
The modulation of protein-protein interactions (PPIs) by small chemical compounds is challenging. PPIs play a critical role in most cellular processes and are involved in numerous disease pathways. As such, novel strategies that assist the design of PPI inhibitors are of major importance. We previously reported that the knowledge-based DLIGAND2 scoring tool was the best rescoring function for improving receptor-based virtual screening (VS) performed with the Surflex docking engine applied to several PPI targets with experimentally known active and inactive compounds. Here, we extend our investigation by assessing the VS potential of other types of scoring functions, with an emphasis on docking-pose-derived solvent accessible surface area (SASA) descriptors, with or without the use of machine learning (ML) classifiers. First, we explored rescoring strategies of Surflex-generated docking poses with five GOLD scoring functions (GoldScore, ChemScore, ASP, ChemPLP, ChemScore with Receptor Depth Scaling) and with consensus scoring. The top-ranked poses were post-processed to derive a set of protein and ligand SASA descriptors in the bound and unbound states, which were combined to derive descriptors of the docked protein-ligand complexes. Further, eight ML models (tree, bagged forest, random forest, Bayesian, support vector machine, logistic regression, neural network, and neural network with bagging) were trained using the derived SASA descriptors and validated on test sets. The results show that many SASA descriptors are better than Surflex and GOLD scoring functions in terms of overall performance and early recovery success on the dataset used. The ML models were superior to all scoring functions and rescoring approaches for most targets, yielding up to a seven-fold increase in enrichment factors at 1% of the screened collections. In particular, the neural networks and random forest-based ML emerged as the best techniques for this PPI dataset, making them robust and attractive VS tools for hit-finding efforts. The presented results suggest that exploring further docking-pose-derived SASA descriptors could be valuable for structure-based virtual screening projects and, in the present case, for assisting the rational design of small-molecule PPI inhibitors.
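The abstract reports enrichment factors at 1% of the screened collection. As a reminder of what that metric measures, here is a minimal Python sketch using the common definition (hit rate among the top-ranked fraction divided by the hit rate in the whole collection). The function name, the toy data, and the assumption that a higher score means a better-ranked compound are illustrative choices, not details from the paper.

```python
def enrichment_factor(scores, is_active, fraction=0.01):
    """Enrichment factor at a given fraction of the ranked collection.

    EF = (actives in top fraction / size of top fraction)
         / (actives in collection / size of collection)
    Assumes higher scores rank first (an assumption of this sketch).
    """
    if not scores or len(scores) != len(is_active):
        raise ValueError("scores and is_active must be non-empty and equal in length")
    n_total = len(scores)
    n_top = max(1, int(round(n_total * fraction)))
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0], reverse=True)
    hits_top = sum(1 for _, active in ranked[:n_top] if active)
    hits_all = sum(1 for active in is_active if active)
    if hits_all == 0:
        raise ValueError("the collection contains no actives")
    return (hits_top / n_top) / (hits_all / n_total)

# Hypothetical toy collection: 100 compounds, the 10 best-scored ones are active.
toy_scores = list(range(100))
toy_labels = [score >= 90 for score in toy_scores]
ef_1pct = enrichment_factor(toy_scores, toy_labels, fraction=0.01)
print(ef_1pct)
```

With 10% actives overall, a perfect ranking cannot exceed an enrichment factor of 10 at any cutoff, which is why enrichment factors are usually compared between methods on the same dataset.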
|
20
|
Alakus TB, Turkoglu I. Prediction of viral-host interactions of COVID-19 by computational methods. Chemometrics and Intelligent Laboratory Systems: An International Journal Sponsored by the Chemometrics Society 2022; 228:104622. [PMID: 35879939 PMCID: PMC9301933 DOI: 10.1016/j.chemolab.2022.104622] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/07/2022] [Revised: 06/20/2022] [Accepted: 07/10/2022] [Indexed: 06/15/2023]
Abstract
Experimental approaches are currently used to determine viral-host interactions, but these approaches are both time-consuming and costly. For these reasons, computational approaches are recommended. In this study, viral-host interactions between SARS-CoV-2 and human proteins were predicted using computational approaches. The study consists of four stages. In the first stage, viral and host protein sequences were obtained. In the second stage, protein sequences were converted into numerical representations by various protein mapping methods: entropy-based, AVL-tree, FIBHASH, binary encoding, CPNR, PAM250, BLOSUM62, Atchley factors, Meiler parameters, EIIP, AESNN1, Miyazawa energies, Micheletti potentials, Z-scale, and hydrophobicity. In the third stage, a deep learning model based on BiLSTM was designed. In the last stage, the protein sequences were classified, and the viral-host interactions were predicted. The performance of the protein mapping methods was evaluated by accuracy, F1-score, specificity, sensitivity, and AUC. The entropy-based method gave the best classification results, with 94.74% accuracy and an AUC of 0.95. The Z-scale method was next, with 91.23% accuracy and an AUC of 0.96. Although the other protein mapping methods were not as effective as the Z-scale and entropy-based methods, they still classified successfully. The AVL-tree, FIBHASH, binary encoding, CPNR, PAM250, BLOSUM62, Atchley factors, Meiler parameters, and AESNN1 methods showed accuracy, F1-score, and AUC above 80%. The accuracy of the EIIP, Miyazawa energies, Micheletti potentials, and hydrophobicity methods remained below 80%. Overall, the computational approaches were successful in predicting viral-host interactions between SARS-CoV-2 and human proteins.
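The evaluation metrics named in this abstract (accuracy, F1-score, specificity, sensitivity) all follow from the four confusion-matrix counts of a binary classifier. The sketch below shows the standard definitions in Python; the function name and the toy labels are illustrative, and AUC is left out because it needs ranked prediction scores rather than hard labels.

```python
def binary_metrics(y_true, y_pred):
    """Compute standard binary classification metrics from 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall, true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }

# Hypothetical toy labels: 1 = interacting pair, 0 = non-interacting pair.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```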
Affiliation(s)
- Talha Burak Alakus
  - Kirklareli University, Department of Software Engineering, Kirklareli, 39000, Turkey
- Ibrahim Turkoglu
  - Firat University, Department of Software Engineering, Elazig, 23119, Turkey
|
21
|
Zheng Q, Wang D, Lin R, Lv Q, Wang W. IFI44 is an immune evasion biomarker for SARS-CoV-2 and Staphylococcus aureus infection in patients with RA. Front Immunol 2022; 13:1013322. [PMID: 36189314 PMCID: PMC9520788 DOI: 10.3389/fimmu.2022.1013322] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 08/06/2022] [Accepted: 08/29/2022] [Indexed: 12/04/2022] Open
Abstract
Background: Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) caused a global pandemic of severe coronavirus disease 2019 (COVID-19). Staphylococcus aureus is one of the most common pathogenic bacteria in humans, and rheumatoid arthritis (RA) is among the most prevalent autoimmune conditions. RA is a significant risk factor for SARS-CoV-2 and S. aureus infections, although the mechanism of RA and SARS-CoV-2 infection in conjunction with S. aureus infection has not been elucidated. The purpose of this study is to investigate the biomarkers and disease targets shared by RA and SARS-CoV-2 and S. aureus infections using bioinformatics analysis, to search for the molecular mechanisms of SARS-CoV-2 and S. aureus immune escape and potential drug targets in the RA population, and to provide new directions for further analysis and targeted development of clinical treatments.
Methods: The RA dataset (GSE93272) and the S. aureus bacteremia (SAB) dataset (GSE33341) were each used to obtain differentially expressed gene sets, and the common differentially expressed genes (DEGs) were determined through their intersection. Functional enrichment analysis was performed using GO, KEGG, and ClueGO. The PPI network was created using the STRING database, and the top 10 hub genes were identified and further examined for functional enrichment using Metascape and GeneMANIA. The top 10 hub genes were intersected with the SARS-CoV-2 gene pool to identify five hub genes shared by RA, COVID-19, and SAB, and functional enrichment analysis was conducted using Metascape and GeneMANIA. Using the NetworkAnalyst platform, TF-hub gene and miRNA-hub gene networks were built for these five hub genes. The hub gene was verified using GSE17755, GSE55235, and GSE13670, and its effectiveness was assessed using ROC curves. CIBERSORT was applied to examine immune cell infiltration and the link between the hub gene and immune cells.
Results: A total of 199 DEGs were extracted from the GSE93272 and GSE33341 datasets. The pathways enriched in the KEGG analysis were the NLR signaling pathway, cell membrane DNA sensing pathway, oxidative phosphorylation, and viral infection. Positive/negative regulation of the immune system, regulation of the interferon-I (IFN-I; IFN-α/β) pathway, and associated pathways of the immunological response to viruses were enriched in the GO and ClueGO analyses. The PPI network and the Cytoscape platform identified the top 10 hub genes: RSAD2, IFIT3, GBP1, RTP4, IFI44, OAS1, IFI44L, ISG15, HERC5, and IFIT5. The pathways are mainly enriched in response to viral and bacterial infection, IFN signaling, and 1,25-dihydroxy vitamin D3. IFI44, OAS1, IFI44L, ISG15, and HERC5 are the five hub genes shared by RA, COVID-19, and SAB. The pathways are primarily enriched for response to viral and bacterial infections. The TF-hub gene network and miRNA-hub gene network identified YY1 as a key TF and hsa-mir-1-3p and hsa-mir-146a-5p as two important miRNAs related to IFI44. IFI44 was identified as a hub gene by validation with GSE17755, GSE55235, and GSE13670. Immune cell infiltration analysis showed a strong positive correlation between activated dendritic cells and IFI44 expression.
Conclusions: This study identified IFI44 as a shared biomarker and disease target for RA, COVID-19, and SAB. IFI44 negatively regulates the IFN signaling pathway to promote viral replication and bacterial proliferation and is an important molecular target for SARS-CoV-2 and S. aureus immune escape in RA. Dendritic cells play an important role in this process. 1,25-Dihydroxy vitamin D3 may be an important therapeutic agent in treating RA with SARS-CoV-2 and S. aureus infections.
Affiliation(s)
- Qingcong Zheng
  - Department of Orthopedics, 900th Hospital of Joint Logistics Support Force, Fuzhou, China
- Du Wang
  - Arthritis Clinical and Research Center, Peking University People’s Hospital, Beijing, China
- Rongjie Lin
  - Department of Orthopedics, 900th Hospital of Joint Logistics Support Force, Fuzhou, China
- Qi Lv
  - Department of Orthopedics, 900th Hospital of Joint Logistics Support Force, Fuzhou, China
- Wanming Wang
  - Department of Orthopedics, 900th Hospital of Joint Logistics Support Force, Fuzhou, China
|
22
|
Robin V, Bodein A, Scott-Boyer MP, Leclercq M, Périn O, Droit A. Overview of methods for characterization and visualization of a protein-protein interaction network in a multi-omics integration context. Front Mol Biosci 2022; 9:962799. [PMID: 36158572 PMCID: PMC9494275 DOI: 10.3389/fmolb.2022.962799] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2022] [Accepted: 08/16/2022] [Indexed: 11/26/2022] Open
Abstract
Protein-protein interactions (PPIs) are at the heart of the cellular machinery and play a significant role in regulating cellular functions. PPIs can be analyzed with network approaches. All PPIs together form a network, and constructing a PPI network requires predicting the interactions. Different biases, such as lack of data, recurrence of information, and false interactions, make the network unstable. Integrated strategies can address these challenges. These approaches have shown encouraging results for understanding molecular mechanisms and drug action mechanisms and for identifying target genes. To weight an interaction, it is evaluated with different confidence scores. These scores allow the network to be filtered and thus make it easier to represent, both essential steps in identifying and understanding molecular mechanisms. In this review, we discuss the main computational methods for predicting PPIs, including those that confirm an interaction, as well as the integration of PPIs into a network and the visualization of these complex data.
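The idea of filtering a PPI network by confidence score, mentioned in this abstract, can be shown in a few lines of Python. This is only a sketch: the edge list, the score scale (0 to 1), and the 0.7 cutoff are hypothetical and do not come from the review or from any particular database.

```python
def filter_by_confidence(edges, threshold):
    """Keep only interactions whose confidence score is at least `threshold`."""
    return [(a, b, score) for a, b, score in edges if score >= threshold]

def proteins_in(edges):
    """Return the sorted set of proteins that still have at least one interaction."""
    return sorted({p for a, b, _ in edges for p in (a, b)})

# Hypothetical scored interactions: (protein A, protein B, confidence in [0, 1]).
scored_edges = [
    ("P1", "P2", 0.95),
    ("P2", "P3", 0.40),
    ("P1", "P4", 0.70),
]

kept = filter_by_confidence(scored_edges, threshold=0.7)
print(kept)
print(proteins_in(kept))
```

Raising the threshold removes weakly supported edges and may also drop proteins that only had low-confidence interactions, which is the trade-off between network reliability and coverage.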
Affiliation(s)
- Vivian Robin
  - Molecular Medicine Department, CHU de Québec Research Center, Université Laval, Québec, QC, Canada
- Antoine Bodein
  - Molecular Medicine Department, CHU de Québec Research Center, Université Laval, Québec, QC, Canada
- Marie-Pier Scott-Boyer
  - Molecular Medicine Department, CHU de Québec Research Center, Université Laval, Québec, QC, Canada
- Mickaël Leclercq
  - Molecular Medicine Department, CHU de Québec Research Center, Université Laval, Québec, QC, Canada
- Olivier Périn
  - Digital Sciences Department, L'Oréal Advanced Research, Aulnay-sous-bois, France
- Arnaud Droit
  - Molecular Medicine Department, CHU de Québec Research Center, Université Laval, Québec, QC, Canada
|
23
|
Orientation algorithm for PPI networks based on network propagation approach. J Biosci 2022. [DOI: 10.1007/s12038-022-00284-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]
|
24
|
Jiang Y, Wang Y, Shen L, Adjeroh DA, Liu Z, Lin J. Identification of all-against-all protein-protein interactions based on deep hash learning. BMC Bioinformatics 2022; 23:266. [PMID: 35804303 PMCID: PMC9264577 DOI: 10.1186/s12859-022-04811-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2021] [Accepted: 06/17/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Protein-protein interaction (PPI) is vital for life processes, disease treatment, and drug discovery. The computational prediction of PPI is relatively inexpensive and efficient when compared to traditional wet-lab experiments. Given a new protein, one may wish to find whether the protein has any PPI relationship with other existing proteins. Current computational PPI prediction methods usually compare the new protein to existing proteins one by one in a pairwise manner, which is time-consuming. RESULTS In this work, we propose a more efficient model, called deep hash learning protein-and-protein interaction (DHL-PPI), to predict all-against-all PPI relationships in a database of proteins. First, DHL-PPI encodes a protein sequence into a binary hash code based on deep features extracted from the protein sequences using deep learning techniques. This encoding scheme enables us to turn the PPI discrimination problem into a much simpler searching problem. The binary hash code for a protein sequence can be regarded as a number. Thus, in the pre-screening stage of DHL-PPI, the string matching problem of comparing a protein sequence against a database with M proteins can be transformed into a much simpler problem: finding a number inside a sorted array of length M. This pre-screening process narrows down the search to a much smaller set of candidate proteins for further confirmation. As a final step, DHL-PPI uses the Hamming distance to verify the final PPI relationship. CONCLUSIONS The experimental results confirmed that DHL-PPI is feasible and effective. Using a dataset with strictly negative PPI examples of four species, DHL-PPI is shown to be superior or competitive when compared to the other state-of-the-art methods in terms of precision, recall or F1 score. Furthermore, in the prediction stage, the proposed DHL-PPI reduced the time complexity from [Formula: see text] to [Formula: see text] for performing an all-against-all PPI prediction for a database with M proteins. With the proposed approach, a protein database can be preprocessed and stored for later search using the proposed encoding scheme. This can provide a more efficient way to cope with the rapidly increasing volume of protein datasets.
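The search idea in this abstract (treat each binary hash code as a number, look it up in a sorted array, then confirm candidates by Hamming distance) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the 8-bit codes, the `window` of neighbouring entries used for pre-screening, the `max_dist` threshold, and the assumption that a small Hamming distance signals a predicted interaction are all hypothetical choices made only for illustration.

```python
from bisect import bisect_left


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer-encoded hash codes."""
    return bin(a ^ b).count("1")


def build_index(codes):
    """Sort (code, protein_id) pairs once; later lookups are binary searches."""
    return sorted((code, pid) for pid, code in codes.items())


def prescreen(index, query, window=2):
    """Binary-search the query's position and return nearby entries.

    The window size is an assumption; the abstract only states that the
    pre-screening reduces to finding a number in a sorted array.
    """
    pos = bisect_left(index, (query, ""))
    lo, hi = max(0, pos - window), min(len(index), pos + window)
    return index[lo:hi]


def predict_partners(index, query, max_dist=1, window=2):
    """Keep pre-screened candidates within max_dist bits of the query."""
    return [pid for code, pid in prescreen(index, query, window)
            if hamming(code, query) <= max_dist]


# Hypothetical 8-bit codes for four proteins (illustrative values only).
codes = {"P1": 0b10110010, "P2": 0b10110011, "P3": 0b01001100, "P4": 0b11110000}
index = build_index(codes)
print(predict_partners(index, 0b10110010))
```

Sorting costs O(M log M) once, after which each lookup is a binary search. Numerically adjacent codes are not necessarily close in Hamming distance, so the Hamming check acts as the confirming step in this sketch.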
Affiliation(s)
- Yue Jiang
- College of Computer and Cyber Security, Fujian Normal University, Fuzhou, 350108, People's Republic of China
| | - Yuxuan Wang
- No. 2 Thoracic Surgery Department Beijing Chest Hospital, Capital Medical University, Beijing Tuberculosis and Thoracic Tumor Research Institute, Beijing, 101149, People's Republic of China
| | - Lin Shen
- College of Computer and Cyber Security, Fujian Normal University, Fuzhou, 350108, People's Republic of China
| | - Donald A Adjeroh
- Lane Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown, 26506, USA
| | - Zhidong Liu
- No. 2 Thoracic Surgery Department Beijing Chest Hospital, Capital Medical University, Beijing Tuberculosis and Thoracic Tumor Research Institute, Beijing, 101149, People's Republic of China.
| | - Jie Lin
- College of Computer and Cyber Security, Fujian Normal University, Fuzhou, 350108, People's Republic of China.
|
25
|
A Survey on Deep Networks Approaches in Prediction of Sequence-Based Protein–Protein Interactions. SN COMPUTER SCIENCE 2022; 3:298. [PMID: 35611239 PMCID: PMC9119573 DOI: 10.1007/s42979-022-01197-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/09/2021] [Accepted: 05/06/2022] [Indexed: 12/03/2022]
Abstract
Protein–protein interactions (PPIs) have become a prominent topic in systems biology because they are fundamental to predicting the function of a target protein and the druggability of molecules. Numerous studies have been published on predicting PPIs computationally because such methods provide an alternative to laboratory trials and a cost-effective way of predicting the most likely set of interactions at the entire proteome scale. Among recent computational methods, deep learning has become a buzzword across numerous scientific studies. This paper presents, for the first time, a comprehensive survey of sequence-based PPI prediction by three popular deep learning architectures, i.e., deep neural networks, convolutional neural networks, and recurrent neural networks and their variants. The survey draws together the available information thoroughly and can help researchers further explore this area.
|
26
|
Promising perspectives on novel protein food sources combining artificial intelligence and 3D food printing for food industry. Trends Food Sci Technol 2022. [DOI: 10.1016/j.tifs.2022.05.013] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 02/01/2023]
|
27
|
Cha M, Emre EST, Xiao X, Kim JY, Bogdan P, VanEpps JS, Violi A, Kotov NA. Unifying structural descriptors for biological and bioinspired nanoscale complexes. NATURE COMPUTATIONAL SCIENCE 2022; 2:243-252. [PMID: 38177552 DOI: 10.1038/s43588-022-00229-w] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 09/23/2021] [Accepted: 03/17/2022] [Indexed: 01/06/2024]
Abstract
Biomimetic nanoparticles are known to serve as nanoscale adjuvants, enzyme mimics and amyloid fibrillation inhibitors. Their further development requires better understanding of their interactions with proteins. The abundant knowledge about protein-protein interactions can serve as a guide for designing protein-nanoparticle assemblies, but the chemical and biological inputs used in computational packages for protein-protein interactions are not applicable to inorganic nanoparticles. Analysing chemical, geometrical and graph-theoretical descriptors for protein complexes, we found that geometrical and graph-theoretical descriptors are uniformly applicable to biological and inorganic nanostructures and can predict interaction sites in protein pairs with accuracy >80% and classification probability ~90%. We extended the machine-learning algorithms trained on protein-protein interactions to inorganic nanoparticles and found a nearly exact match between experimental and predicted interaction sites with proteins. These findings can be extended to other organic and inorganic nanoparticles to predict their assemblies with biomolecules and other chemical structures forming lock-and-key complexes.
Affiliation(s)
- Minjeong Cha
- Department of Materials Science and Engineering, University of Michigan, Ann Arbor, MI, USA
- Biointerfaces Institute, University of Michigan, Ann Arbor, MI, USA
| | - Emine Sumeyra Turali Emre
- Biointerfaces Institute, University of Michigan, Ann Arbor, MI, USA
- Department of Chemical Engineering, University of Michigan, Ann Arbor, MI, USA
| | - Xiongye Xiao
- Ming Hsieh Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA, USA
| | - Ji-Young Kim
- Biointerfaces Institute, University of Michigan, Ann Arbor, MI, USA
- Department of Chemical Engineering, University of Michigan, Ann Arbor, MI, USA
| | - Paul Bogdan
- Ming Hsieh Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA, USA
| | - J Scott VanEpps
- Biointerfaces Institute, University of Michigan, Ann Arbor, MI, USA
- Department of Biomedical Engineering, University of Michigan, Ann Arbor, MI, USA
- Program in Macromolecular Science and Engineering, University of Michigan, Ann Arbor, MI, USA
- Department of Emergency Medicine, University of Michigan, Ann Arbor, MI, USA
- Michigan Center for Integrative Research in Critical Care, University of Michigan, Ann Arbor, MI, USA
| | - Angela Violi
- Department of Chemical Engineering, University of Michigan, Ann Arbor, MI, USA
- Department of Mechanical Engineering, University of Michigan, Ann Arbor, MI, USA
- Biophysics Program, University of Michigan, Ann Arbor, MI, USA
| | - Nicholas A Kotov
- Department of Materials Science and Engineering, University of Michigan, Ann Arbor, MI, USA.
- Biointerfaces Institute, University of Michigan, Ann Arbor, MI, USA.
- Department of Chemical Engineering, University of Michigan, Ann Arbor, MI, USA.
- Department of Biomedical Engineering, University of Michigan, Ann Arbor, MI, USA.
- Program in Macromolecular Science and Engineering, University of Michigan, Ann Arbor, MI, USA.
|
28
|
Munjal NS, Sapra D, Parthasarathi KTS, Goyal A, Pandey A, Banerjee M, Sharma J. Deciphering the Interactions of SARS-CoV-2 Proteins with Human Ion Channels Using Machine-Learning-Based Methods. Pathogens 2022; 11:pathogens11020259. [PMID: 35215201 PMCID: PMC8874499 DOI: 10.3390/pathogens11020259] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/30/2021] [Revised: 01/31/2022] [Accepted: 02/08/2022] [Indexed: 01/04/2023] Open
Abstract
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is responsible for the protracted COVID-19 pandemic. Its high transmission rate and pathogenicity led to health emergencies and economic crisis. Recent studies of the molecular pathogenesis of SARS-CoV-2 infection have demonstrated the indispensable role of ion channels in viral infection inside the host. Moreover, machine learning (ML)-based algorithms are providing higher accuracy for host-SARS-CoV-2 protein–protein interactions (PPIs). In this study, PPIs of SARS-CoV-2 proteins with human ion channels (HICs) were used to train the PPI-MetaGO algorithm. PPI networks (PPINs) and a signaling pathway map of HICs with SARS-CoV-2 proteins were generated. Additionally, various U.S. Food and Drug Administration (FDA)-approved drugs interacting with the potential HICs were identified. The PPIs were predicted with 82.71% accuracy, 84.09% precision, 84.09% sensitivity, 0.89 AUC-ROC, a Matthews correlation coefficient (MCC) of 65.17%, and an 84.09% F1 score. Several host pathways were found to be altered, including the calcium signaling and taste transduction pathways. Potential HICs could serve as an initial set for experimentalists to validate further. The study also reinforces the drug repurposing approach for the development of host-directed antiviral drugs that may provide a better therapeutic management strategy for infection caused by SARS-CoV-2.
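For readers who want to see how the reported measures relate to one another, the following Python sketch computes accuracy, precision, sensitivity (recall), F1 and MCC from a binary confusion matrix using their standard definitions. The counts in the example are hypothetical and are not taken from the study.

```python
from math import sqrt


def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)  # also called recall
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    mcc = (tp * tn - fp * fn) / sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = binary_metrics(tp=40, fp=10, tn=40, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because F1 is the harmonic mean of precision and sensitivity, it equals both whenever the two are equal, which is consistent with the identical 84.09% values reported for all three measures above.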
Affiliation(s)
- Nupur S. Munjal
- Institute of Bioinformatics, International Technology Park, Bangalore 560066, India; (N.S.M.); (D.S.); (K.T.S.P.); (A.G.)
| | - Dikscha Sapra
- Institute of Bioinformatics, International Technology Park, Bangalore 560066, India; (N.S.M.); (D.S.); (K.T.S.P.); (A.G.)
| | - K. T. Shreya Parthasarathi
- Institute of Bioinformatics, International Technology Park, Bangalore 560066, India; (N.S.M.); (D.S.); (K.T.S.P.); (A.G.)
| | - Abhishek Goyal
- Institute of Bioinformatics, International Technology Park, Bangalore 560066, India; (N.S.M.); (D.S.); (K.T.S.P.); (A.G.)
| | - Akhilesh Pandey
- Center for Molecular Medicine, National Institute of Mental Health and Neurosciences (NIMHANS), Hosur Road, Bangalore 560029, India;
- Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN 55905, USA
- Center for Individualized Medicine, Mayo Clinic, Rochester, MN 55905, USA
| | - Manidipa Banerjee
- Kusuma School of Biological Sciences, Indian Institute of Technology Delhi, Hauz Khas, New Delhi 110016, India;
| | - Jyoti Sharma
- Institute of Bioinformatics, International Technology Park, Bangalore 560066, India; (N.S.M.); (D.S.); (K.T.S.P.); (A.G.)
- Manipal Academy of Higher Education (MAHE), Udupi 576104, India
- Correspondence:
|
29
|
Khetan R, Curtis R, Deane CM, Hadsund JT, Kar U, Krawczyk K, Kuroda D, Robinson SA, Sormanni P, Tsumoto K, Warwicker J, Martin ACR. Current advances in biopharmaceutical informatics: guidelines, impact and challenges in the computational developability assessment of antibody therapeutics. MAbs 2022; 14:2020082. [PMID: 35104168 PMCID: PMC8812776 DOI: 10.1080/19420862.2021.2020082] [Citation(s) in RCA: 48] [Impact Index Per Article: 16.0] [Indexed: 01/03/2023] Open
Abstract
Therapeutic monoclonal antibodies and their derivatives are key components of clinical pipelines in the global biopharmaceutical industry. The availability of large datasets of antibody sequences, structures, and biophysical properties is increasingly enabling the development of predictive models and computational tools for the "developability assessment" of antibody drug candidates. Here, we provide an overview of the antibody informatics tools applicable to the prediction of developability issues such as stability, aggregation, immunogenicity, and chemical degradation. We further evaluate the opportunities and challenges of using biopharmaceutical informatics for drug discovery and optimization. Finally, we discuss the potential of developability guidelines based on in silico metrics that can be used for the assessment of antibody stability and manufacturability.
Affiliation(s)
- Rahul Khetan
- Manchester Institute of Biotechnology, University of Manchester, Manchester, UK
| | - Robin Curtis
- Manchester Institute of Biotechnology, University of Manchester, Manchester, UK
| | | | | | - Uddipan Kar
- Department of Biological Engineering, Massachusetts Institute of Technology (MIT), Cambridge, MA, USA
| | | | - Daisuke Kuroda
- Department of Bioengineering, School of Engineering, The University of Tokyo, Tokyo, Japan.,Medical Device Development and Regulation Research Center, School of Engineering, The University of Tokyo, Tokyo, Japan.,Department of Chemistry and Biotechnology, School of Engineering, The University of Tokyo, Tokyo, Japan
| | | | - Pietro Sormanni
- Chemistry of Health, Yusuf Hamied Department of Chemistry, University of Cambridge
| | - Kouhei Tsumoto
- Department of Bioengineering, School of Engineering, The University of Tokyo, Tokyo, Japan.,Medical Device Development and Regulation Research Center, School of Engineering, The University of Tokyo, Tokyo, Japan.,Department of Chemistry and Biotechnology, School of Engineering, The University of Tokyo, Tokyo, Japan.,The Institute of Medical Science, The University of Tokyo, Tokyo, Japan
| | - Jim Warwicker
- Manchester Institute of Biotechnology, University of Manchester, Manchester, UK
| | - Andrew C R Martin
- Institute of Structural and Molecular Biology, Division of Biosciences, University College London, London, UK
|
30
|
Chatterjee A, Paul S, Bisht B, Bhattacharya S, Sivasubramaniam S, Paul MK. Advances in targeting the WNT/β-catenin signaling pathway in cancer. Drug Discov Today 2022; 27:82-101. [PMID: 34252612 DOI: 10.1016/j.drudis.2021.07.007] [Citation(s) in RCA: 62] [Impact Index Per Article: 20.7] [Received: 02/16/2021] [Revised: 05/27/2021] [Accepted: 07/06/2021] [Indexed: 01/05/2023]
Abstract
WNT/β-catenin signaling orchestrates various physiological processes, including embryonic development, growth, tissue homeostasis, and regeneration. Abnormal WNT/β-catenin signaling is associated with various cancers and its inhibition has shown effective antitumor responses. In this review, we discuss the pathway, potential targets for the development of WNT/β-catenin inhibitors, available inhibitors, and their specific molecular interactions with the target proteins. We also discuss inhibitors that are in clinical trials and describe potential new avenues for therapeutically targeting the WNT/β-catenin pathway. Furthermore, we introduce emerging strategies, including artificial intelligence (AI)-assisted tools and technology-based actionable approaches, to translate WNT/β-catenin inhibitors to the clinic for cancer therapy.
Affiliation(s)
- Avradip Chatterjee
- Department of Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA
| | - Sayan Paul
- Department of Biotechnology, Manonmaniam Sundaranar University, Tirunelveli, Tamil Nadu 627012, India; Centre for Cardiovascular Biology and Disease, Institute for Stem Cell Science and Regenerative Medicine (inStem), Bangalore 560065, India
| | - Bharti Bisht
- Department of Thoracic Surgery, David Geffen School of Medicine, UCLA, Los Angeles, CA 90095, USA
| | - Shelley Bhattacharya
- Environmental Toxicology Laboratory, Department of Zoology (Centre for Advanced Studies), Visva Bharati (A Central University), Santiniketan 731235, India
| | - Sudhakar Sivasubramaniam
- Department of Biotechnology, Manonmaniam Sundaranar University, Tirunelveli, Tamil Nadu 627012, India
| | - Manash K Paul
- Department of Pulmonary and Critical Care Medicine, David Geffen School of Medicine, UCLA, Los Angeles, CA 90095, USA.
|
31
|
Hu X, Feng C, Ling T, Chen M. Deep learning frameworks for protein–protein interaction prediction. Comput Struct Biotechnol J 2022; 20:3223-3233. [PMID: 35832624 PMCID: PMC9249595 DOI: 10.1016/j.csbj.2022.06.025] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 03/15/2022] [Revised: 05/27/2022] [Accepted: 06/12/2022] [Indexed: 11/26/2022] Open
|
32
|
Dong TN, Brogden G, Gerold G, Khosla M. A multitask transfer learning framework for the prediction of virus-human protein-protein interactions. BMC Bioinformatics 2021; 22:572. [PMID: 34837942 PMCID: PMC8626732 DOI: 10.1186/s12859-021-04484-y] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 03/08/2021] [Accepted: 11/15/2021] [Indexed: 12/17/2022] Open
Abstract
BACKGROUND Viral infections are causing significant morbidity and mortality worldwide. Understanding the interaction patterns between a particular virus and human proteins plays a crucial role in unveiling the underlying mechanism of viral infection and pathogenesis. This could further help in the prevention and treatment of virus-related diseases. However, the task of predicting protein-protein interactions between a new virus and human cells is extremely challenging due to scarce data on virus-human interactions and the fast mutation rates of most viruses. RESULTS We developed a multitask transfer learning approach that exploits the information of around 24 million protein sequences and the interaction patterns from the human interactome to counter the problem of small training datasets. Instead of using hand-crafted protein features, we utilize statistically rich protein representations learned by a deep language modeling approach from a massive source of protein sequences. We also employ an additional objective that aims to maximize the probability of observing human protein-protein interactions. This additional task objective acts as a regularizer and also allows domain knowledge to be incorporated into the virus-human protein-protein interaction prediction model. CONCLUSIONS Our approach achieved competitive results on 13 benchmark datasets and the case study for the SARS-CoV-2 virus receptor. Experimental results show that our proposed model works effectively for both virus-human and bacteria-human protein-protein interaction prediction tasks. We share our code for reproducibility and future research at https://git.l3s.uni-hannover.de/dong/multitask-transfer.
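The role of the additional objective described in this abstract can be sketched as a weighted sum of two losses: a main virus-human loss plus an auxiliary human-human PPI loss that acts as a regularizer. The Python example below is a hedged illustration only. The binary cross-entropy form, the function names, and the `aux_weight` parameter are assumptions and are not taken from the authors' code, which is available at the URL given in the abstract.

```python
from math import log


def bce(probs, labels, eps=1e-12):
    """Mean binary cross-entropy (negative log-likelihood) of predictions."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        total -= y * log(p) + (1 - y) * log(1 - p)
    return total / len(labels)


def multitask_loss(vh_probs, vh_labels, hh_probs, hh_labels, aux_weight=0.5):
    """Main virus-human loss plus a weighted auxiliary human-human PPI loss.

    Minimizing the auxiliary term is equivalent to maximizing the likelihood
    of the observed human PPIs; aux_weight is a hypothetical hyperparameter.
    """
    main = bce(vh_probs, vh_labels)
    aux = bce(hh_probs, hh_labels)
    return main + aux_weight * aux


# Hypothetical predicted probabilities and labels for a tiny batch.
loss = multitask_loss(
    vh_probs=[0.9, 0.2], vh_labels=[1, 0],
    hh_probs=[0.7, 0.4], hh_labels=[1, 0],
)
print(f"combined loss: {loss:.4f}")
```

Setting `aux_weight` to zero recovers the single-task loss, which makes the regularizing contribution of the human interactome term easy to isolate in an ablation.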
Affiliation(s)
- Thi Ngan Dong
- L3S Research Center, Leibniz University Hannover, Hannover, Germany.
| | - Graham Brogden
- Institute for Biochemistry, University of Veterinary Medicine, Hannover, Germany.,Institute of Experimental Virology, TWINCORE, Center for Experimental and Clinical Infection Research Hannover, Hannover, Germany
| | - Gisa Gerold
- Institute for Biochemistry, University of Veterinary Medicine, Hannover, Germany.,Institute of Experimental Virology, TWINCORE, Center for Experimental and Clinical Infection Research Hannover, Hannover, Germany.,Department of Clinical Microbiology, Umeå University, Umeå, Sweden.,Wallenberg Centre for Molecular Medicine (WCMM), Umeå University, Umeå, Sweden
| | - Megha Khosla
- L3S Research Center, Leibniz University Hannover, Hannover, Germany
|
33
|
Alakus TB, Turkoglu I. A Novel Protein Mapping Method for Predicting the Protein Interactions in COVID-19 Disease by Deep Learning. Interdiscip Sci 2021; 13:44-60. [PMID: 33433784 PMCID: PMC7801232 DOI: 10.1007/s12539-020-00405-4] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 08/06/2020] [Revised: 11/23/2020] [Accepted: 11/28/2020] [Indexed: 12/11/2022]
Abstract
The novel coronavirus (SARS-CoV-2) that emerged in Wuhan, China has spread rapidly around the world and has become a pandemic. In addition to having a significant impact on daily life, it has affected many areas, including public health and the economy. Currently, there is no vaccine or antiviral drug available to prevent the COVID-19 disease. Therefore, determining the protein interactions of new types of coronavirus is vital in clinical studies, drug therapy, and the identification of preclinical compounds and protein functions. Protein–protein interactions are important for examining protein functions and pathways involved in various biological processes and for determining the cause and progression of diseases. Various high-throughput experimental methods have been used to identify protein–protein interactions in organisms, yet there is still a huge gap in specifying all possible protein interactions in an organism. In addition, since the experimental methods used include cloning, labeling, and affinity purification mass spectrometry, the processes take a long time. Determining these interactions with artificial intelligence-based methods rather than experimental approaches may help to identify protein functions faster. Thus, protein–protein interaction prediction using deep-learning algorithms has been employed in conjunction with experimental methods to explore new protein interactions. However, to predict protein interactions with artificial intelligence techniques, protein sequences need to be mapped. There are various types and numbers of protein-mapping methods in the literature. In this study, we contribute to the literature by proposing a novel protein-mapping method based on the AVL tree. The proposed method was inspired by the fast search performance of the AVL tree's dictionary structure and was used to verify the protein interactions between the SARS-CoV-2 virus and humans. First, protein sequences were mapped by both the proposed method and various protein-mapping methods. Then, the mapped protein sequences were normalized and classified by bidirectional recurrent neural networks. The performance of the proposed method was evaluated with accuracy, f1-score, precision, recall, and AUC scores. Our results indicated that our mapping method predicts the protein interactions between SARS-CoV-2 virus proteins and human proteins at an accuracy of 97.76%, precision of 97.60%, recall of 98.33%, f1-score of 79.42%, and AUC of 89% on average.
Affiliation(s)
- Talha Burak Alakus
- Faculty of Engineering, Department of Software Engineering, Kirklareli University, 39000, Kirklareli, Turkey.
| | - Ibrahim Turkoglu
- Faculty of Technology, Department of Software Engineering, Firat University, 23119, Elazig, Turkey
|
34
|
An Integrative Computational Approach for the Prediction of Human-Plasmodium Protein-Protein Interactions. BIOMED RESEARCH INTERNATIONAL 2021; 2020:2082540. [PMID: 33426052 PMCID: PMC7771252 DOI: 10.1155/2020/2082540] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/20/2020] [Revised: 11/08/2020] [Accepted: 12/04/2020] [Indexed: 12/27/2022]
Abstract
Host-pathogen molecular cross-talks are critical in determining the pathophysiology of a specific infection. Most of these cross-talks are mediated via protein-protein interactions between the host and the pathogen (HP-PPI). Thus, it is essential to know how pathogens interact with their hosts to understand the mechanism of infections. Malaria is a life-threatening disease caused by an obligate intracellular parasite belonging to the Plasmodium genus, of which P. falciparum is the most prevalent. Several previous studies that predicted human-Plasmodium protein-protein interactions using computational methods have demonstrated the utility, accuracy, and efficiency of these methods in identifying interacting partners, thereby complementing experimental efforts to characterize host-pathogen interaction networks. To predict putative HP-PPIs, we use an integrative computational approach based on the combination of multiple OMICS-based methods including human red blood cells (RBC) and Plasmodium falciparum 3D7 strain expressed proteins, domain-domain based PPI, similarity of gene ontology terms, structure similarity method homology identification, and machine learning prediction. Our results reported a set of 716 protein interactions involving 302 human proteins and 130 Plasmodium proteins. This work provides a list of potential human-Plasmodium interacting proteins. These findings will contribute to a better understanding of the mechanisms underlying the molecular determinism of malaria and potentially to the identification of candidate pharmacological targets.
|
35
|
Cady NC, Tokranova N, Minor A, Nikvand N, Strle K, Lee WT, Page W, Guignon E, Pilar A, Gibson GN. Multiplexed detection and quantification of human antibody response to COVID-19 infection using a plasmon enhanced biosensor platform. Biosens Bioelectron 2021; 171:112679. [PMID: 33069957 PMCID: PMC7545244 DOI: 10.1016/j.bios.2020.112679] [Citation(s) in RCA: 69] [Impact Index Per Article: 17.3] [Received: 09/18/2020] [Revised: 09/30/2020] [Accepted: 10/01/2020] [Indexed: 12/18/2022]
Abstract
The 2019 SARS CoV-2 (COVID-19) pandemic has illustrated the need for rapid and accurate diagnostic tests. In this work, a multiplexed grating-coupled fluorescent plasmonics (GC-FP) biosensor platform was used to rapidly and accurately measure antibodies against COVID-19 in human blood serum and dried blood spot samples. The GC-FP platform measures antibody-antigen binding interactions for multiple targets in a single sample, and has 100% selectivity and sensitivity (n = 23) when measuring serum IgG levels against three COVID-19 antigens (spike S1, spike S1S2, and the nucleocapsid protein). The GC-FP platform yielded a quantitative, linear response for serum samples diluted to as low as 1:1600 dilution. Test results were highly correlated with two commercial COVID-19 antibody tests, including an enzyme linked immunosorbent assay (ELISA) and a Luminex-based microsphere immunoassay. To demonstrate test efficacy with other sample matrices, dried blood spot samples (n = 63) were obtained and evaluated with GC-FP, yielding 100% selectivity and 86.7% sensitivity for diagnosing prior COVID-19 infection. The test was also evaluated for detection of multiple immunoglobulin isotypes, with successful detection of IgM, IgG and IgA antibody-antigen interactions. Last, a machine learning approach was developed to accurately score patient samples for prior COVID-19 infection, using antibody binding data for all three COVID-19 antigens used in the test.
Affiliation(s)
- Nathaniel C Cady
- College of Nanoscale Science & Engineering, SUNY Polytechnic Institute, Albany, NY, USA.
| | - Natalya Tokranova
- College of Nanoscale Science & Engineering, SUNY Polytechnic Institute, Albany, NY, USA
| | - Armond Minor
- College of Nanoscale Science & Engineering, SUNY Polytechnic Institute, Albany, NY, USA
| | - Nima Nikvand
- College of Nanoscale Science & Engineering, SUNY Polytechnic Institute, Albany, NY, USA
| | - Klemen Strle
- Wadsworth Center, New York State Department of Health, Albany, NY, USA and School of Public Health, University at Albany, Albany, NY, USA
| | - William T Lee
- Wadsworth Center, New York State Department of Health, Albany, NY, USA and School of Public Health, University at Albany, Albany, NY, USA
| | | | | | | | - George N Gibson
- Ciencia, Inc., East Hartford, CT, USA; University of Connecticut, Storrs, CT, USA
|
36
|
Waiho K, Afiqah‐Aleng N, Iryani MTM, Fazhan H. Protein–protein interaction network: an emerging tool for understanding fish disease in aquaculture. REVIEWS IN AQUACULTURE 2021; 13:156-177. [DOI: 10.1111/raq.12468] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 04/02/2020] [Accepted: 06/11/2020] [Indexed: 01/03/2025]
Abstract
Protein–protein interactions (PPIs) play integral roles in a wide range of biological processes that regulate the overall growth, development, physiology and disease in living organisms. With the advancement of high‐throughput sequencing technologies, increasing numbers of PPI networks are being predicted and annotated, and these contribute greatly towards the understanding of pathogenesis and the discovery of novel drug targets for the treatment of diseases. The use of this tool is gaining popularity in the identification, understanding and treatment of diseases in humans and plants. Because aquaculture is important in tackling the global food crisis by producing cheap and high‐quality protein sources, maintaining the overall health status of aquaculture species is essential. With the increasing omics data on aquaculture species, the PPI network is an emerging tool for fish health maintenance. In this review, we first introduce the concept of PPI networks, how they are discovered and their general application. Then, the current status of aquaculture and disease in aquaculture are discussed. The different applications of PPI networks in aquaculture fish disease management, such as biomarker identification, mechanism prediction, understanding of host–pathogen interaction, understanding of pathogen co‐infection interaction, and potential development of vaccines and treatments, are subsequently highlighted. It is hoped that this emerging tool – the PPI network – will deepen our understanding of the pathogenesis of various diseases and hasten the prevention and treatment processes in aquaculture species.
Affiliation(s)
- Khor Waiho
- Institute of Tropical Aquaculture and Fisheries, Universiti Malaysia Terengganu, Terengganu, Malaysia
| | - Nor Afiqah‐Aleng
- Institute of Marine Biotechnology, Universiti Malaysia Terengganu, Terengganu, Malaysia
| | - Mat Taib Mimi Iryani
- Institute of Marine Biotechnology, Universiti Malaysia Terengganu, Terengganu, Malaysia
| | - Hanafiah Fazhan
- Institute of Tropical Aquaculture and Fisheries, Universiti Malaysia Terengganu, Terengganu, Malaysia
- Guangdong Provincial Key Laboratory of Marine Biotechnology, Shantou University, Guangdong, China
|
37
|
Randhawa V, Pathania S. Advancing from protein interactomes and gene co-expression networks towards multi-omics-based composite networks: approaches for predicting and extracting biological knowledge. Brief Funct Genomics 2020; 19:364-376. [PMID: 32678894 DOI: 10.1093/bfgp/elaa015] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 03/16/2020] [Revised: 05/31/2020] [Accepted: 06/15/2020] [Indexed: 01/17/2023] Open
Abstract
Prediction of biological interaction networks from single-omics data has been extensively implemented to understand various aspects of biological systems. However, more recently, there is a growing interest in integrating multi-omics datasets for the prediction of interactomes that provide a global view of biological systems with higher descriptive capability, as compared to single omics. In this review, we have discussed various computational approaches implemented to infer and analyze two of the most important and well studied interactomes: protein-protein interaction networks and gene co-expression networks. We have explicitly focused on recent methods and pipelines implemented to infer and extract biologically important information from these interactomes, starting from utilizing single-omics data and then progressing towards multi-omics data. Accordingly, recent examples and case studies are also briefly discussed. Overall, this review will provide a proper understanding of the latest developments in protein and gene network modelling and will also help in extracting practical knowledge from them.
Affiliation(s)
- Vinay Randhawa
- Department of Biochemistry, Panjab University, Chandigarh, 160014, India
- Shivalika Pathania
- Department of Biotechnology, Panjab University, Chandigarh, 160014, India
|
38
|
Poot Velez AH, Fontove F, Del Rio G. Protein-Protein Interactions Efficiently Modeled by Residue Cluster Classes. Int J Mol Sci 2020; 21:E4787. [PMID: 32640745 PMCID: PMC7370293 DOI: 10.3390/ijms21134787] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 04/19/2020] [Revised: 06/20/2020] [Accepted: 06/28/2020] [Indexed: 01/22/2023] Open
Abstract
Predicting protein-protein interactions (PPI) represents an important challenge in structural bioinformatics. Current computational methods display different degrees of accuracy when predicting these interactions. Different factors were proposed to help improve these predictions, including choosing the proper descriptors of proteins to represent these interactions, among others. In the current work, we provide a representative protein structure that is amenable to PPI classification using machine learning approaches, referred to as residue cluster classes. Through sampling and optimization, we identified the best algorithm-parameter pair to classify PPI from more than 360 different training sets. We tested these classifiers against PPI datasets that were not included in the training set but shared sequence similarity with proteins in the training set to reproduce the situation of most proteins sharing sequence similarity with others. We identified a model with almost no PPI error (96-99% of correctly classified instances) and showed that residue cluster classes of protein pairs displayed a distinct pattern between positive and negative protein interactions. Our results indicated that residue cluster classes are structural features relevant to model PPI and provide a novel tool to mathematically model the protein structure/function relationship.
Affiliation(s)
- Albros Hermes Poot Velez
- Department of Biochemistry and Structural Biology, Instituto de Fisiología Celular, UNAM, Mexico City 04510, Mexico
- Gabriel Del Rio
- Department of Biochemistry and Structural Biology, Instituto de Fisiología Celular, UNAM, Mexico City 04510, Mexico
|
39
|
Shringari SR, Giannakoulias S, Ferrie JJ, Petersson EJ. Rosetta custom score functions accurately predict ΔΔG of mutations at protein-protein interfaces using machine learning. Chem Commun (Camb) 2020; 56:6774-6777. [PMID: 32441721 DOI: 10.1039/d0cc01959c] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 11/21/2022]
Abstract
Protein-protein interfaces play essential roles in a variety of biological processes and many therapeutic molecules are targeted at these interfaces. However, accurate predictions of the effects of interfacial mutations to identify "hotspots" have remained elusive despite the myriad of modeling and machine learning methods tested. Here, for the first time, we demonstrate that nonlinear reweighting of energy terms from Rosetta, through the use of machine learning, exhibits improved predictability of ΔΔG values associated with interfacial mutations.
Affiliation(s)
- Sumant R Shringari
- Department of Chemistry, University of Pennsylvania, 231 South 34th Street, Philadelphia, PA 19104, USA.
|