1
Shah M, Anum H, Sarfraz A, Aktaruzzaman M, Hasan AR, Khan MU, Fawy KF, Altwaim SA, Alasmari SMN, Ali A, Nishan U, Chen K. Bioinformatics-guided decoding of the Ancylostoma duodenale genome for the identification of potential vaccine targets. BMC Genomics 2025; 26:468. [PMID: 40355819 PMCID: PMC12067957 DOI: 10.1186/s12864-025-11652-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2025] [Accepted: 04/29/2025] [Indexed: 05/15/2025] Open
Abstract
Ancylostoma duodenale, a parasitic nematode, causes infections that lead to intestinal blood loss, protein malnutrition, and anemia. Antimicrobial resistance to the available therapeutics has prompted the search for new drug and vaccine targets against A. duodenale. Despite significant advances in vaccine development against A. duodenale, no commercial, FDA-approved vaccine exists to protect humans from infections caused by this pathogen. In this investigation, a stringent bioinformatics analysis identified 36 unique essential and host-interacting proteins. Based on their subcellular localization, 6 proteins located in the extracellular space and outer membrane were categorized as vaccine targets, while the remaining proteins were predicted to act as potential drug candidates. These vaccine candidates were further assessed for antigenicity, allergenicity, and physicochemical properties to determine their suitability for designing a multi-epitope vaccine. Two candidate proteins were chosen as optimal targets for vaccine design. The identified T- and B-cell epitopes from these proteins were then combined with appropriate linkers and adjuvants to design chimeric vaccine constructs aimed at inducing both cellular and humoral immune responses. Molecular docking, molecular dynamics simulations, PCA analysis, DCCM analysis, and binding free energy calculations indicated stable interactions of the designed vaccine with human immune cell receptors. Within a bacterial cloning system, the vaccine constructs demonstrated the ability to be cloned and expressed. Immunological stimulation analysis indicated significant immune responses to the proposed vaccine. Our investigation identified new therapeutic targets and developed a peptide-based multi-epitope vaccine against A. duodenale infection. Additional experimental verification will open up new therapeutic alternatives for this emerging resistant pathogen.
Affiliation(s)
- Mohibullah Shah
  - Department of Biochemistry, Bahauddin Zakariya University, Multan, Punjab, 66000, Pakistan
  - Department of Animal Science, Federal University of Ceara, Fortaleza, Brazil
- Hira Anum
  - Department of Biochemistry, Bahauddin Zakariya University, Multan, Punjab, 66000, Pakistan
- Asifa Sarfraz
  - Department of Biochemistry, Bahauddin Zakariya University, Multan, Punjab, 66000, Pakistan
- Md Aktaruzzaman
  - Department of Pharmacy, Faculty of Biological Science and Technology, Jashore University of Science and Technology, Jashore, 7408, Bangladesh
- Al Riyad Hasan
  - Department of Pharmacy, Faculty of Biological Science and Technology, Jashore University of Science and Technology, Jashore, 7408, Bangladesh
- Muhammad Umer Khan
  - Institute of Molecular Biology and Biotechnology, The University of Lahore, Lahore, Pakistan
- Khaled Fahmi Fawy
  - Chemistry Department, Faculty of Science, King Khalid University, P.O. Box 9004, Abha, 61413, Saudi Arabia
  - Research Center for Advanced Materials Science (RCAMS), King Khalid University, P.O. Box 960, AlQura'a, Abha, Saudi Arabia
- Sarah A Altwaim
  - Department of Clinical Microbiology and Immunology, Faculty of Medicine, King Abdulaziz University, Jeddah, 21589, Saudi Arabia
  - Special Infectious Agents Unit, King Fahd Medical Research Center, King Abdulaziz University, Jeddah, Saudi Arabia
- Saeed M N Alasmari
  - Department of Biology, Faculty of Science and Arts, Najran University, Najran, 1988, Saudi Arabia
- Abid Ali
  - Department of Zoology, Abdul Wali Khan University, Mardan, Khyber Pakhtunkhwa, 23200, Pakistan
- Umar Nishan
  - Department of Chemistry, Kohat University of Science & Technology, Kohat, Pakistan
- Ke Chen
  - Department of Infectious Diseases, The Affiliated Hospital of Southwest Medical University, Luzhou, 646000, China
2
Teimouri H, Medvedeva A, Kolomeisky AB. Unraveling the role of physicochemical differences in predicting protein-protein interactions. J Chem Phys 2024; 161:045102. [PMID: 39051836 DOI: 10.1063/5.0219501] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2024] [Accepted: 07/09/2024] [Indexed: 07/27/2024] Open
Abstract
The ability to accurately predict protein-protein interactions is critically important for understanding major cellular processes. However, current experimental and computational approaches for identifying them are technically very challenging and still have limited success. We propose a new computational method for predicting protein-protein interactions using only primary sequence information. It utilizes the concept of physicochemical similarity to determine which interactions will most likely occur. In our approach, the physicochemical features of proteins are extracted using bioinformatics tools for different organisms. They are then used in a machine-learning method to identify successful protein-protein interactions via correlation analysis. For all studied organisms, the property that correlates most strongly with protein-protein interactions is the dipeptide amino acid composition (the frequency of specific amino acid pairs in a protein sequence). While current approaches often overlook the specificity of protein-protein interactions in different organisms, our method yields context-specific features that determine protein-protein interactions. The analysis is specifically applied to the bacterial two-component system that includes histidine kinase and transcriptional response regulators, as well as to the barnase-barstar complex, demonstrating the method's versatility across different biological systems. Our approach can be applied to predict protein-protein interactions in any biological system, providing an important tool for investigating the mechanisms of complex biological processes.
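As a rough illustration of the dipeptide composition feature named in this abstract, the sketch below counts ordered amino acid pairs in a sequence and normalizes them to frequencies. The function names, the restriction to the 20 standard residues, and the L1 difference between two compositions are assumptions made for illustration; the abstract does not specify how the authors compute or compare the feature.

```python
from itertools import product

# The 20 standard amino acids; other symbols (X, B, Z, ...) are skipped.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 pairs


def dipeptide_composition(sequence):
    """Return the frequency of each ordered amino acid pair in `sequence`."""
    counts = dict.fromkeys(DIPEPTIDES, 0)
    seq = sequence.upper()
    total = 0
    # Slide a window of width 2 over the sequence (overlapping pairs).
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return dict.fromkeys(DIPEPTIDES, 0.0)
    return {pair: n / total for pair, n in counts.items()}


def composition_difference(seq_a, seq_b):
    """Hypothetical dissimilarity: L1 distance between two compositions."""
    comp_a = dipeptide_composition(seq_a)
    comp_b = dipeptide_composition(seq_b)
    return sum(abs(comp_a[p] - comp_b[p]) for p in DIPEPTIDES)
```

For example, `dipeptide_composition("AAAC")` assigns 2/3 to `AA` and 1/3 to `AC`, since the sequence contains the overlapping pairs AA, AA, and AC.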
Affiliation(s)
- Hamid Teimouri
  - Department of Chemistry, Rice University, Houston, Texas 77005, USA
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
  - Department of Chemical and Biomolecular Engineering, Rice University, Houston, Texas 77005, USA
- Angela Medvedeva
  - Department of Chemistry, Rice University, Houston, Texas 77005, USA
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
  - Department of Chemical and Biomolecular Engineering, Rice University, Houston, Texas 77005, USA
- Anatoly B Kolomeisky
  - Department of Chemistry, Rice University, Houston, Texas 77005, USA
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
  - Department of Chemical and Biomolecular Engineering, Rice University, Houston, Texas 77005, USA
3
Teimouri H, Medvedeva A, Kolomeisky AB. Physical-Chemical Features Selection Reveals That Differences in Dipeptide Compositions Correlate Most with Protein-Protein Interactions. bioRxiv 2024:2024.02.27.582345. [PMID: 38464064 PMCID: PMC10925282 DOI: 10.1101/2024.02.27.582345] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/12/2024]
Abstract
The ability to accurately predict protein-protein interactions is critically important for our understanding of major cellular processes. However, current experimental and computational approaches for identifying them are technically very challenging and still have limited success. We propose a new computational method for predicting protein-protein interactions using only primary sequence information. It utilizes a concept of physical-chemical similarity to determine which interactions will most probably occur. In our approach, the physical-chemical features of proteins are extracted using bioinformatics tools for different organisms, and they are then used in a machine-learning method to identify successful protein-protein interactions via correlation analysis. For all studied organisms, the property that correlates most strongly with protein-protein interactions is the dipeptide amino acid composition. The analysis is specifically applied to the bacterial two-component system that includes histidine kinase and transcriptional response regulators. Our theoretical approach provides a simple and robust method for quantifying the important details of complex mechanisms of biological processes.
Affiliation(s)
- Hamid Teimouri
  - Department of Chemistry, Rice University, Houston, Texas, United States
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas, United States
- Angela Medvedeva
  - Department of Chemistry, Rice University, Houston, Texas, United States
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas, United States
- Anatoly B. Kolomeisky
  - Department of Chemistry, Rice University, Houston, Texas, United States
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas, United States
  - Department of Chemical and Biomolecular Engineering, Rice University, Houston, Texas, United States
  - Department of Physics and Astronomy, Rice University, Houston, TX, United States
4
Idrees S, Paudel KR, Sadaf T, Hansbro PM. How different viruses perturb host cellular machinery via short linear motifs. EXCLI Journal 2023; 22:1113-1128. [PMID: 38054205 PMCID: PMC10694346 DOI: 10.17179/excli2023-6328] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/10/2023] [Accepted: 10/18/2023] [Indexed: 12/07/2023]
Abstract
Viruses interact with their hosts through protein-protein interactions. Most viruses use these interactions to imitate host proteins: a viral protein with the same amino acid sequence or structure as a host protein attaches to that protein's binding partner and interferes with the host protein's pathways. Being opportunistic, viruses have evolved to manipulate host cellular mechanisms by mimicking short linear motifs (SLiMs). In this review, we shed light on the current understanding of mimicry via short linear motifs and focus on mimicry by genetically different viral subtypes, providing recent examples of mimicry evidence and discussing how high-throughput methods can be a reliable source for studying SLiM-mediated viral mimicry.
Affiliation(s)
- Sobia Idrees
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW, Australia
  - Centre for Inflammation, Centenary Institute and the University of Technology Sydney, School of Life Sciences, Faculty of Science, Sydney, New South Wales, Australia
- Keshav Raj Paudel
  - Centre for Inflammation, Centenary Institute and the University of Technology Sydney, School of Life Sciences, Faculty of Science, Sydney, New South Wales, Australia
- Tayyaba Sadaf
  - Centre for Inflammation, Centenary Institute and the University of Technology Sydney, School of Life Sciences, Faculty of Science, Sydney, New South Wales, Australia
- Philip M. Hansbro
  - Centre for Inflammation, Centenary Institute and the University of Technology Sydney, School of Life Sciences, Faculty of Science, Sydney, New South Wales, Australia
5
Chakraborty A, Mitra S, Bhattacharjee M, De D, Pal AJ. Determining human-coronavirus protein-protein interaction using machine intelligence. Medicine in Novel Technology and Devices 2023; 18:100228. [PMID: 37056696 PMCID: PMC10077817 DOI: 10.1016/j.medntd.2023.100228] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/24/2022] [Revised: 03/29/2023] [Accepted: 04/01/2023] [Indexed: 04/08/2023] Open
Abstract
Severe Acute Respiratory Syndrome CoronaVirus 2 (SARS-CoV-2) caused the novel CoronaVirus-19 (nCoV-19) pandemic, resulting in millions of fatalities globally. Recent research demonstrated that Protein-Protein Interactions (PPIs) between SARS-CoV-2 and human proteins are responsible for viral pathogenesis. However, many of these PPIs are poorly understood and unexplored, necessitating a more in-depth investigation to find latent yet critical interactions. This article elucidates host-viral PPIs through the lens of Machine Learning (ML) and validates their biological significance using web-based tools. ML classifiers are designed based on comprehensive datasets with five sequence-based features of human proteins, namely Amino Acid Composition, Pseudo Amino Acid Composition, Conjoint Triad, Dipeptide Composition, and Normalized Auto Correlation. A majority voting rule-based ensemble method composed of the Random Forest Model (RFM), AdaBoost, and Bagging technique is proposed that delivers encouraging statistical performance compared to other models employed in this work. The proposed ensemble model predicted a total of 111 possible SARS-CoV-2 human target proteins with a likelihood factor of ≥70%, validated using Gene Ontology (GO) and KEGG pathway enrichment analysis. Consequently, this research can aid in a deeper understanding of the molecular mechanisms underlying viral pathogenesis and provide clues for developing more efficient anti-COVID medications.
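The majority voting rule mentioned in this abstract can be sketched in a few lines. The version below is a minimal, library-free illustration that takes the label predictions of several already-trained classifiers and returns the most common label per sample; the function name and the toy prediction lists are hypothetical, and the abstract does not say how the authors implemented or weighted the vote.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one list by majority rule.

    `predictions` holds one list of labels per classifier; all lists must
    cover the same samples in the same order. With an odd number of
    classifiers and binary labels, ties cannot occur.
    """
    if not predictions:
        raise ValueError("at least one classifier is required")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all classifiers must predict the same samples")
    combined = []
    # zip(*predictions) yields the votes cast for each sample in turn.
    for votes in zip(*predictions):
        label, _count = Counter(votes).most_common(1)[0]
        combined.append(label)
    return combined


# Toy outputs standing in for random forest, AdaBoost, and bagging models
# (1 = predicted interaction, 0 = predicted non-interaction).
random_forest = [1, 0, 1, 1]
adaboost = [1, 1, 0, 1]
bagging = [0, 0, 1, 1]
ensemble = majority_vote([random_forest, adaboost, bagging])
```

Here each sample receives the label chosen by at least two of the three toy classifiers.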
Affiliation(s)
- Arijit Chakraborty
  - Bachelor of Computer Application Department, The Heritage Academy, Kolkata, India
- Sajal Mitra
  - Department of Computer Science and Engineering, Heritage Institute of Technology, Kolkata, India
- Debashis De
  - Department of Computer Science and Engineering, Maulana Abul Kalam Azad University of Technology, Kolkata, India
6
Xie P, Zhuang J, Tian G, Yang J. Emvirus: An embedding-based neural framework for human-virus protein-protein interactions prediction. Biosafety and Health 2023; 5:152-158. [PMID: 37362223 PMCID: PMC10166638 DOI: 10.1016/j.bsheal.2023.04.003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/23/2023] [Revised: 04/23/2023] [Accepted: 04/23/2023] [Indexed: 06/28/2023] Open
Abstract
Human-virus protein-protein interactions (PPIs) play critical roles in viral infection. For example, the spike protein of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) binds primarily to human angiotensin-converting enzyme 2 (ACE2) protein to infect human cells. Thus, identifying and blocking these PPIs contribute to controlling and preventing viruses. However, wet-lab experiment-based identification of human-virus PPIs is usually expensive, labor-intensive, and time-consuming, which presents the need for computational methods. Many machine-learning methods have been proposed recently and achieved good results in predicting human-virus PPIs. However, most methods are based on protein sequence features and apply manually extracted features, such as statistical characteristics, phylogenetic profiles, and physicochemical properties. In this work, we present an embedding-based neural framework with convolutional neural network (CNN) and bi-directional long short-term memory unit (Bi-LSTM) architecture, named Emvirus, to predict human-virus PPIs (including human-SARS-CoV-2 PPIs). In addition, we conduct cross-viral experiments to explore the generalization ability of Emvirus. Compared to other feature extraction methods, Emvirus achieves better prediction accuracy.
Affiliation(s)
- Pengfei Xie
  - College of Transportation Engineering, Dalian Maritime University, Dalian 116026, China
- Jujuan Zhuang
  - School of Science, Dalian Maritime University, Dalian 116026, China
- Geng Tian
  - Geneis Beijing Co., Ltd., Beijing 100102, China
  - Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao 266000, China
- Jialiang Yang
  - Geneis Beijing Co., Ltd., Beijing 100102, China
  - Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao 266000, China
7
Avila-Bonilla R, Velazquez-Guzman J, Reyes-Zepeda E, Gutierrez-Avila J, Reyes-López C, Cisneros-Sarabia A, Saavedra E, Lopéz-Sandoval A, Ramírez-Moreno E, López-Camarillo C, Marchat L. Comparative genomics and interactomics of polyadenylation factors for the prediction of new parasite targets: Entamoeba histolytica as a working model. Biosci Rep 2023; 43:BSR20221911. [PMID: 36651565 PMCID: PMC9912109 DOI: 10.1042/bsr20221911] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2022] [Revised: 01/05/2023] [Accepted: 01/13/2023] [Indexed: 01/19/2023] Open
Abstract
Protein-protein interactions (PPIs) play a key role in predicting the function of a target protein and the ability of a drug to affect an entire biological system. Prediction of PPI networks greatly contributes to identifying a target protein and the signaling pathways related to its function. Polyadenylation of the mRNA 3'-end is essential for gene expression regulation, and several polyadenylation factors have been shown to be valuable targets for controlling protozoan parasites that affect human health. Here, using a computational strategy based on sequence-based prediction approaches, phylogenetic analyses, and computational prediction of PPI networks, we compared the interactomes of polyadenylation factors in relevant protozoan parasites and the human host to identify key proteins and define potential targets for pathogen control. Then, we used Entamoeba histolytica as a working model to validate our computational results. RT-qPCR assays confirmed the coordinated modulation of connected proteins in the PPI network and showed that silencing of the bottleneck protein EhCFIm25 affects the expression of interacting proteins. In addition, molecular dynamics simulations and docking approaches allowed us to characterize the relationships between EhCFIm25 and Ehnopp34, two connected bottleneck proteins. Interestingly, the experimental identification of the EhCFIm25 interactome confirmed the close relationships among proteins involved in gene expression regulation and revealed new links with moonlighting proteins in E. histolytica, suggesting a connection between RNA biology and metabolism as described in other organisms. Altogether, our results strengthen the relevance of comparative genomics and interactomics of polyadenylation factors for the prediction of new targets for the control of these human pathogens.
Affiliation(s)
- Jorge Antonio Velazquez-Guzman
  - Facultad de Ciencias, Universidad Autónoma del Estado de México. Carretera Toluca-Ixtlahuaca km 15.5 Cerrillo Piedras Blancas 50200 Toluca, Estado de México, Mexico
- Eimy Itzel Reyes-Zepeda
  - Facultad de Ciencias, Universidad Autónoma del Estado de México. Carretera Toluca-Ixtlahuaca km 15.5 Cerrillo Piedras Blancas 50200 Toluca, Estado de México, Mexico
- Jorge Luis Gutierrez-Avila
  - Posgrado en Ciencias Químico-Biológicas; Escuela Nacional de Ciencias Biológicas, Instituto Politécnico Nacional. Mexico City, Mexico
- César A Reyes-López
  - Laboratorio de Bioquímica Estructural, Instituto Politécnico Nacional, Escuela Nacional de Medicina y Homeopatía, Mexico City 07320, Mexico
- Alondra Cisneros-Sarabia
  - Laboratorio de Biomedicina Molecular II, ENMH, Instituto Politécnico Nacional, Mexico City, Mexico
- Emma Saavedra
  - Departamento de Bioquímica, Instituto Nacional de Cardiología, Mexico City 14080, Mexico
- Angel Lopéz-Sandoval
  - Laboratorio de Biomedicina Molecular II, ENMH, Instituto Politécnico Nacional, Mexico City, Mexico
- Esther Ramírez-Moreno
  - Laboratorio de Biomedicina Molecular II, ENMH, Instituto Politécnico Nacional, Mexico City, Mexico
- César López-Camarillo
  - Posgrado en Ciencias Genómicas, Universidad Autónoma de la Ciudad de México (UACM), Mexico City, Mexico
- Laurence A. Marchat
  - Laboratorio de Biomedicina Molecular II, ENMH, Instituto Politécnico Nacional, Mexico City, Mexico
8
Duhan N, Kaundal R. HuCoPIA: An Atlas of Human vs. SARS-CoV-2 Interactome and the Comparative Analysis with Other Coronaviridae Family Viruses. Viruses 2023; 15:492. [PMID: 36851706 PMCID: PMC9962590 DOI: 10.3390/v15020492] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2022] [Revised: 02/01/2023] [Accepted: 02/04/2023] [Indexed: 02/12/2023] Open
Abstract
SARS-CoV-2, a novel betacoronavirus strain, has caused a pandemic that has claimed the lives of nearly 6.7M people worldwide. Vaccines and medicines are being developed around the world to reduce the disease spread, fatality rates, and control the new variants. Understanding the protein-protein interaction mechanism of SARS-CoV-2 in humans, and their comparison with the previous SARS-CoV and MERS strains, is crucial for these efforts. These interactions might be used to assess vaccination effectiveness, diagnose exposure, and produce effective biotherapeutics. Here, we present the HuCoPIA database, which contains approximately 100,000 protein-protein interactions between humans and three strains (SARS-CoV-2, SARS-CoV, and MERS) of betacoronavirus. The interactions in the database are divided into common interactions between all three strains and those unique to each strain. It also contains relevant functional annotation information of human proteins. The HuCoPIA database contains SARS-CoV-2 (41,173), SARS-CoV (31,997), and MERS (26,862) interactions, with functional annotation of human proteins like subcellular localization, tissue-expression, KEGG pathways, and Gene ontology information. We believe HuCoPIA will serve as an invaluable resource to diverse experimental biologists, and will help to advance the research in better understanding the mechanism of betacoronaviruses.
Affiliation(s)
- Naveen Duhan
  - Department of Plants, Soils, and Climate/Center for Integrated BioSystems, College of Agriculture and Applied Sciences, Utah State University, Logan, UT 84322, USA
  - Bioinformatics Facility, Center for Integrated BioSystems, College of Agriculture and Applied Sciences, Utah State University, Logan, UT 84322, USA
- Rakesh Kaundal
  - Department of Plants, Soils, and Climate/Center for Integrated BioSystems, College of Agriculture and Applied Sciences, Utah State University, Logan, UT 84322, USA
  - Bioinformatics Facility, Center for Integrated BioSystems, College of Agriculture and Applied Sciences, Utah State University, Logan, UT 84322, USA
  - Department of Computer Science, College of Science, Utah State University, Logan, UT 84322, USA
9
Karan B, Mahapatra S, Sahu SS, Pandey DM, Chakravarty S. Computational models for prediction of protein-protein interaction in rice and Magnaporthe grisea. Frontiers in Plant Science 2023; 13:1046209. [PMID: 36816487 PMCID: PMC9929577 DOI: 10.3389/fpls.2022.1046209] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/16/2022] [Accepted: 12/28/2022] [Indexed: 06/18/2023]
Abstract
INTRODUCTION Plant-microbe interactions play a vital role in the development of strategies to manage pathogen-induced destructive diseases that cause enormous crop losses every year. Rice blast, caused by the fungus Magnaporthe grisea (M. grisea), is one of the most severe diseases of rice, Oryza sativa (O. sativa). Protein-protein interaction (PPI) between rice and the fungus plays a key role in causing rice blast disease. METHODS In this paper, four genomic information-based models, namely (i) the interolog, (ii) the domain, (iii) the gene ontology, and (iv) the phylogenetic-based model, are developed for predicting the interaction between O. sativa and M. grisea on a whole-genome scale. RESULTS AND DISCUSSION A total of 59,430 interacting pairs between 1,801 rice proteins and 135 blast fungus proteins are obtained from the four models. Furthermore, a machine learning model is developed to assess the predicted interactions. Using amino acid composition (AAC) and conjoint triad (CT) features, accuracies of 88% and 89% are achieved, respectively. When tested on the experimental dataset, the CT feature provides the highest accuracy of 95%. Furthermore, the specificity of the model is verified with other pathogen-host datasets, where lower accuracy is obtained, confirming that the model is specific to O. sativa and M. grisea. Understanding the molecular processes behind rice resistance to blast fungus begins with the identification of PPIs, and these predicted PPIs will be useful for drug design in the plant science community.
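To make the conjoint triad (CT) feature in this abstract more concrete, the sketch below maps each residue to one of seven classes and counts every run of three consecutive classes, giving a 343-dimensional vector. The seven-class grouping is the one commonly used for this descriptor and is assumed here, as is the simple frequency normalization; the abstract does not state which grouping or normalization the authors used, and the function name is hypothetical.

```python
# Seven residue classes commonly used for the conjoint triad descriptor
# (an assumption here; the abstract does not give the grouping).
CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
AA_TO_CLASS = {aa: idx for idx, group in enumerate(CLASSES) for aa in group}
N_CLASSES = len(CLASSES)


def conjoint_triad(sequence):
    """Return a 343-element list of triad frequencies for `sequence`."""
    vector = [0] * (N_CLASSES ** 3)
    # Unknown residues map to None and invalidate any triad they touch.
    classes = [AA_TO_CLASS.get(aa) for aa in sequence.upper()]
    for a, b, c in zip(classes, classes[1:], classes[2:]):
        if a is None or b is None or c is None:
            continue
        vector[a * N_CLASSES ** 2 + b * N_CLASSES + c] += 1
    total = sum(vector)
    if total == 0:
        return [0.0] * len(vector)
    # Simple frequency normalization (a simplification for illustration).
    return [count / total for count in vector]
```

For the toy sequence `"AGVC"`, the two triads are (0, 0, 0) and (0, 0, 6), so positions 0 and 6 of the vector each hold 0.5. A feature vector for a protein pair could then be formed by concatenating the two proteins' vectors, which is one common choice rather than something the abstract specifies.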
Affiliation(s)
- Biswajit Karan
  - Department of Electronics and Communication Engineering, Birla Institute of Technology, Ranchi, India
- Satyajit Mahapatra
  - Department of Electronics and Communication Engineering, Birla Institute of Technology, Ranchi, India
- Sitanshu Sekhar Sahu
  - Department of Electronics and Communication Engineering, Birla Institute of Technology, Ranchi, India
- Dev Mani Pandey
  - Department of Bioengineering and Biotechnology, Birla Institute of Technology, Ranchi, India
- Sumit Chakravarty
  - Department of Electrical and Computer Engineering, Kennesaw State University, Kennesaw, GA, United States
10
Vora DS, Kalakoti Y, Sundar D. Computational Methods and Deep Learning for Elucidating Protein Interaction Networks. Methods Mol Biol 2023; 2553:285-323. [PMID: 36227550 DOI: 10.1007/978-1-0716-2617-7_15] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Protein interactions play a critical role in all biological processes, but experimental identification of protein interactions is a time- and resource-intensive process. The advances in next-generation sequencing and multi-omics technologies have greatly benefited large-scale predictions of protein interactions using machine learning methods. A wide range of tools have been developed to predict protein-protein, protein-nucleic acid, and protein-drug interactions. Here, we discuss the applications, methods, and challenges faced when employing the various prediction methods. We also briefly describe ways to overcome the challenges and prospective future developments in the field of protein interaction biology.
Affiliation(s)
- Dhvani Sandip Vora
  - Department of Biochemical Engineering and Biotechnology, Indian Institute of Technology Delhi, Hauz Khas, New Delhi, India
- Yogesh Kalakoti
  - Department of Biochemical Engineering and Biotechnology, Indian Institute of Technology Delhi, Hauz Khas, New Delhi, India
- Durai Sundar
  - Department of Biochemical Engineering and Biotechnology, Indian Institute of Technology Delhi, Hauz Khas, New Delhi, India
  - School of Artificial Intelligence, Indian Institute of Technology Delhi, Hauz Khas, New Delhi, India
11
Ibrahim AH, Karabulut OC, Karpuzcu BA, Türk E, Süzek BE. A correlation coefficient-based feature selection approach for virus-host protein-protein interaction prediction. PLoS One 2023; 18:e0285168. [PMID: 37130110 PMCID: PMC10153705 DOI: 10.1371/journal.pone.0285168] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2022] [Accepted: 04/17/2023] [Indexed: 05/03/2023] Open
Abstract
Prediction of virus-host protein-protein interactions (PPI) is a broad research area where various machine-learning-based classifiers are developed. Transforming biological data into machine-usable features is a preliminary step in constructing these virus-host PPI prediction tools. In this study, we have adopted a virus-host PPI dataset and a reduced amino acid alphabet to create tripeptide features and introduced a correlation coefficient-based feature selection. We applied feature selection across several correlation coefficient metrics and statistically tested their relevance in a structural context. We compared the performance of feature-selection models against that of the baseline virus-host PPI prediction models created using different classification algorithms without the feature selection. We also tested the performance of these baseline models against the previously available tools to ensure their predictive power is acceptable. Here, the Pearson coefficient provides the best performance with respect to the baseline model as measured by AUPR: a drop of 0.003 in AUPR while achieving a 73.3% (from 686 to 183) reduction in the number of tripeptide features for random forest. The results suggest our correlation coefficient-based feature selection approach, while decreasing the computation time and space complexity, has a limited impact on the prediction performance of virus-host PPI prediction tools.
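One way to picture correlation coefficient-based feature selection is as a redundancy filter: a feature is dropped when its absolute Pearson correlation with a feature already kept exceeds a threshold. The sketch below implements that reading with the standard library only. The greedy keep-first strategy, the 0.9 threshold, and the toy feature columns are assumptions for illustration; the abstract does not describe the authors' exact selection rule.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant column: treat as uncorrelated
    return cov / math.sqrt(var_x * var_y)


def select_features(columns, threshold=0.9):
    """Greedily keep features not strongly correlated with any kept one.

    `columns` maps a feature name to its values across samples; features
    are visited in insertion order (a hypothetical tie-breaking choice).
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Toy tripeptide-count columns over four samples (made-up numbers).
toy_columns = {
    "f1": [1, 2, 3, 4],
    "f2": [2, 4, 6, 8],  # exact multiple of f1, so it is redundant
    "f3": [1, 0, 1, 0],
}
selected = select_features(toy_columns)
```

In this toy case `f2` is removed because it is perfectly correlated with `f1`, while `f3` is kept.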
Affiliation(s)
- Ahmed Hassan Ibrahim
  - Bioinformatics Graduate Program, Graduate School of Natural and Applied Sciences, Muğla Sıtkı Koçman University, Muğla, Turkey
- Onur Can Karabulut
  - Bioinformatics Graduate Program, Graduate School of Natural and Applied Sciences, Muğla Sıtkı Koçman University, Muğla, Turkey
- Betül Asiye Karpuzcu
  - Bioinformatics Graduate Program, Graduate School of Natural and Applied Sciences, Muğla Sıtkı Koçman University, Muğla, Turkey
- Erdem Türk
  - Bioinformatics Graduate Program, Graduate School of Natural and Applied Sciences, Muğla Sıtkı Koçman University, Muğla, Turkey
  - Department of Computer Engineering, Faculty of Engineering, Muğla Sıtkı Koçman University, Muğla, Turkey
- Barış Ethem Süzek
  - Bioinformatics Graduate Program, Graduate School of Natural and Applied Sciences, Muğla Sıtkı Koçman University, Muğla, Turkey
  - Department of Computer Engineering, Faculty of Engineering, Muğla Sıtkı Koçman University, Muğla, Turkey
  - Georgetown University Medical Center, Biochemistry and Molecular & Cellular Biology, Washington DC, United States of America
12
Karpuzcu BA, Türk E, Ibrahim AH, Karabulut OC, Süzek BE. Machine Learning Methods for Virus-Host Protein-Protein Interaction Prediction. Methods Mol Biol 2023; 2690:401-417. [PMID: 37450162 DOI: 10.1007/978-1-0716-3327-4_31] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/18/2023]
Abstract
The attachment of a virion to its cellular receptor on the host organism, which occurs through virus-host protein-protein interactions (PPIs), is a decisive step for viral pathogenicity and infectivity. Therefore, a vast number of wet-lab experimental techniques are used to study virus-host PPIs. Given the great number and enormous variety of virus-host PPIs, as well as the cost and labor of laboratory work, however, computational approaches toward analyzing the available interaction data and predicting previously unidentified interactions have been on the rise. Among them, machine-learning-based models are receiving increasing attention, with a great body of resources and tools proposed recently. In this chapter, we first provide the methodology with major steps toward the development of a virus-host PPI prediction tool. Next, we discuss the challenges involved and evaluate several existing machine-learning-based virus-host PPI prediction tools. Finally, we describe our experience with several ensemble techniques as utilized on available prediction results retrieved from individual PPI prediction tools. Overall, based on our experience, we recognize there is still room for the development of new individual and/or ensemble virus-host PPI prediction tools that leverage existing tools.
Affiliation(s)
- Betül Asiye Karpuzcu
- Bioinformatics Graduate Program, Graduate School of Natural and Applied Sciences, Muğla Sıtkı Koçman University, Muğla, Turkey
- Erdem Türk
- Bioinformatics Graduate Program, Graduate School of Natural and Applied Sciences, Muğla Sıtkı Koçman University, Muğla, Turkey
- Department of Computer Engineering, Faculty of Engineering, Muğla Sıtkı Koçman University, Muğla, Turkey
- Ahmad Hassan Ibrahim
- Bioinformatics Graduate Program, Graduate School of Natural and Applied Sciences, Muğla Sıtkı Koçman University, Muğla, Turkey
- Onur Can Karabulut
- Bioinformatics Graduate Program, Graduate School of Natural and Applied Sciences, Muğla Sıtkı Koçman University, Muğla, Turkey
- Barış Ethem Süzek
- Bioinformatics Graduate Program, Graduate School of Natural and Applied Sciences, Muğla Sıtkı Koçman University, Muğla, Turkey
- Department of Computer Engineering, Faculty of Engineering, Muğla Sıtkı Koçman University, Muğla, Turkey
13
Fang Y, Yang Y, Liu C. New feature extraction from phylogenetic profiles improved the performance of pathogen-host interactions. Front Cell Infect Microbiol 2022; 12:931072. [PMID: 35982784 PMCID: PMC9378789 DOI: 10.3389/fcimb.2022.931072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2022] [Accepted: 07/11/2022] [Indexed: 11/13/2022]
Abstract
Motivation: Understanding pathogen-host interactions (PHIs) is essential and challenging because it can reveal the mechanisms of molecular interaction between different organisms. The experimental exploration of PHIs is time-consuming and labor-intensive, and computational approaches play a crucial role in discovering unknown PHIs between different organisms. Although many machine learning (ML)-based methods have been proposed to predict PHIs, these methods all rely on structure-based information extracted from the sequence. The selection of feature values is critical to improving the performance of ML-based PHI prediction. Results: This work proposes a new method that extracts features from phylogenetic profiles as evolutionary information for predicting PHIs. The performance of our approach is better than that of structure-based and ML-based PHI prediction methods. The five different extraction models proposed by our approach, combined with structure-based information, significantly improved the performance of PHI prediction, suggesting that combining phylogenetic profile features with structure-based methods could be applied to the exploration of PHIs and the discovery of new unknown biological relativity. Availability and implementation: The KPP method is implemented in the Java language and is available at https://github.com/yangfangs/KPP.
Affiliation(s)
- Yang Fang
- Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, China
- Department of Laboratory Medicine, Third Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Yi Yang
- Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, China
- *Correspondence: Chengcheng Liu; Yi Yang
- Chengcheng Liu
- State Key Laboratory of Oral Diseases, Department of Periodontics, National Clinical Research Center for Oral Diseases, West China School & Hospital of Stomatology, Sichuan University, Chengdu, China
- *Correspondence: Chengcheng Liu; Yi Yang
14
Asim MN, Ibrahim MA, Malik MI, Dengel A, Ahmed S. LGCA-VHPPI: A local-global residue context aware viral-host protein-protein interaction predictor. PLoS One 2022; 17:e0270275. [PMID: 35789333 PMCID: PMC9255777 DOI: 10.1371/journal.pone.0270275] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/17/2021] [Accepted: 06/07/2022] [Indexed: 11/19/2022]
Abstract
Viral-host protein-protein interaction (PPI) analysis is essential to decode the molecular mechanisms of viral pathogens and host immunity processes, which eventually helps to control viral diseases and optimize therapeutics. The state-of-the-art viral-host PPI predictor leverages an unsupervised embedding learning technique (doc2vec) to generate statistical representations of viral-host protein sequences and a Random Forest classifier for interaction prediction. However, the doc2vec approach generates the statistical representations of viral-host protein sequences by merely modelling the local context of residues, which only partially captures residue semantics. This paper proposes a novel technique for generating better statistical representations of viral and host protein sequences based on the infusion of comprehensive local and global contextual information of the residues. Local residue context aware encoding captures semantic relatedness and short-range dependencies of residues, while global residue context aware encoding captures comprehensive long-range residue dependencies, positional invariance of residues, and the unique residue combination distribution important for interaction prediction. Using concatenated rich statistical representations of viral and host protein sequences, a robust machine learning framework, "LGCA-VHPPI", is developed, which makes use of a deep forest model to effectively model the complex non-linearity of viral-host PPI sequences. An in-depth performance comparison of the proposed LGCA-VHPPI framework with existing viral-host PPI predictors based on diverse sequence encoding schemes reveals that LGCA-VHPPI outperforms the state-of-the-art predictor by 6%, 2%, and 2% in terms of Matthews correlation coefficient over 3 different benchmark viral-host PPI prediction datasets.
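The Matthews correlation coefficient (MCC) reported as the comparison metric can be computed directly from a binary confusion matrix. The following is a minimal sketch of the standard formula; the function name and the example counts are illustrative assumptions and are not taken from the paper.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0.0 when any marginal sum is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical counts for an interacting / non-interacting classifier.
example_mcc = matthews_corrcoef(tp=90, tn=85, fp=15, fn=10)
```

MCC ranges from -1 (total disagreement) to +1 (perfect prediction), so it can remain informative when interacting and non-interacting pairs are imbalanced.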
Affiliation(s)
- Muhammad Nabeel Asim
- Department of Computer Science, Technical University of Kaiserslautern, Kaiserslautern, Germany
- German Research Center for Artificial Intelligence GmbH, Kaiserslautern, Germany
- Muhammad Ali Ibrahim
- Department of Computer Science, Technical University of Kaiserslautern, Kaiserslautern, Germany
- German Research Center for Artificial Intelligence GmbH, Kaiserslautern, Germany
- Muhammad Imran Malik
- National Center of Artificial Intelligence, National University of Sciences and Technology, Islamabad, Pakistan
- Andreas Dengel
- Department of Computer Science, Technical University of Kaiserslautern, Kaiserslautern, Germany
- German Research Center for Artificial Intelligence GmbH, Kaiserslautern, Germany
- Sheraz Ahmed
- German Research Center for Artificial Intelligence GmbH, Kaiserslautern, Germany
15
Saha S, Halder AK, Bandyopadhyay SS, Chatterjee P, Nasipuri M, Basu S. Computational modeling of human-nCoV protein-protein interaction network. Methods 2022; 203:488-497. [PMID: 34902553 PMCID: PMC8662836 DOI: 10.1016/j.ymeth.2021.12.003] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/27/2021] [Revised: 11/30/2021] [Accepted: 12/06/2021] [Indexed: 01/25/2023]
Abstract
Novel coronavirus (SARS-CoV2) replicates the host cell's genome by interacting with the host proteins. Due to this fact, the identification of virus and host protein-protein interactions could be beneficial in understanding the disease transmission behavior of the virus as well as in potential COVID-19 drug identification. The International Committee on Taxonomy of Viruses (ICTV) has declared that nCoV is highly genetically similar to the SARS-CoV epidemic in 2003 (∼89% similarity). With this hypothesis, the present work focuses on developing a computational model for the nCoV-Human protein interaction network, using the experimentally validated SARS-CoV-Human protein interactions. Initially, level-1 and level-2 human spreader proteins are identified in the SARS-CoV-Human interaction network, using the Susceptible-Infected-Susceptible (SIS) model. These proteins are considered potential human targets for nCoV bait proteins. A gene-ontology-based fuzzy affinity function has been used to construct the nCoV-Human protein interaction network at a ∼99.98% specificity threshold. This also identifies 37 level-1 human spreaders for COVID-19 in the human protein-interaction network. Subsequently, 2474 level-2 human spreaders are identified using the SIS model. The derived host-pathogen interaction network is finally validated using six potential FDA-listed drugs for COVID-19, with significant overlap between the known drug target proteins and the identified spreader proteins.
Affiliation(s)
- Sovan Saha
- Department of Computer Science & Engineering, Institute of Engineering & Management, Salt Lake Electronics Complex, Kolkata 700091, West Bengal, India
- Anup Kumar Halder
- Department of Computer Science & Engineering, University of Engineering & Management, Kolkata 700156, West Bengal, India
- Soumyendu Sekhar Bandyopadhyay
- Department of Computer Science & Engineering, School of Engineering and Technology, Adamas University, Kolkata 700126, West Bengal, India; Department of Computer Science & Engineering, Jadavpur University, Jadavpur, Kolkata, West Bengal 700032, India
- Piyali Chatterjee
- Department of Computer Science & Engineering, Netaji Subhash Engineering College, Garia, Kolkata, West Bengal 700152, India
- Mita Nasipuri
- Department of Computer Science & Engineering, Jadavpur University, Jadavpur, Kolkata, West Bengal 700032, India
- Subhadip Basu
- Department of Computer Science & Engineering, Jadavpur University, Jadavpur, Kolkata, West Bengal 700032, India
16
Hu RS, Hesham AEL, Zou Q. Machine Learning and Its Applications for Protozoal Pathogens and Protozoal Infectious Diseases. Front Cell Infect Microbiol 2022; 12:882995. [PMID: 35573796 PMCID: PMC9097758 DOI: 10.3389/fcimb.2022.882995] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/24/2022] [Accepted: 03/28/2022] [Indexed: 12/24/2022]
Abstract
In recent years, massive attention has been attracted to the development and application of machine learning (ML) in the field of infectious diseases, not only serving as a catalyst for academic studies but also as a key means of detecting pathogenic microorganisms, implementing public health surveillance, exploring host-pathogen interactions, discovering drug and vaccine candidates, and so forth. These applications also include the management of infectious diseases caused by protozoal pathogens, such as Plasmodium, Trypanosoma, Toxoplasma, Cryptosporidium, and Giardia, a class of fatal or life-threatening causative agents capable of infecting humans and a wide range of animals. With the reduction of computational cost, availability of effective ML algorithms, popularization of ML tools, and accumulation of high-throughput data, it is possible to implement the integration of ML applications into increasing scientific research related to protozoal infection. Here, we will present a brief overview of important concepts in ML serving as background knowledge, with a focus on basic workflows, popular algorithms (e.g., support vector machine, random forest, and neural networks), feature extraction and selection, and model evaluation metrics. We will then review current ML applications and major advances concerning protozoal pathogens and protozoal infectious diseases through combination with correlative biology expertise and provide forward-looking insights for perspectives and opportunities in future advances in ML techniques in this field.
Affiliation(s)
- Rui-Si Hu
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
- Abd El-Latif Hesham
- Genetics Department, Faculty of Agriculture, Beni-Suef University, Beni-Suef, Egypt
- Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
- *Correspondence: Quan Zou
17
Yang X, Yang S, Ren P, Wuchty S, Zhang Z. Deep Learning-Powered Prediction of Human-Virus Protein-Protein Interactions. Front Microbiol 2022; 13:842976. [PMID: 35495666 PMCID: PMC9051481 DOI: 10.3389/fmicb.2022.842976] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/24/2021] [Accepted: 03/25/2022] [Indexed: 11/13/2022]
Abstract
Identifying human-virus protein-protein interactions (PPIs) is an essential step for understanding viral infection mechanisms and the antiviral response of the human host. Recent advances in high-throughput experimental techniques have enabled the significant accumulation of human-virus PPI data, which has further fueled the development of machine learning-based human-virus PPI prediction methods. Emerging as a very promising method to predict human-virus PPIs, deep learning shows a powerful ability to integrate large-scale datasets, learn complex sequence-structure relationships of proteins and convert the learned patterns into final prediction models with high accuracy. Focusing on the recent progress of deep learning-powered human-virus PPI predictions, we review technical details of these newly developed methods, including dataset preparation, deep learning architectures, feature engineering, and performance assessment. Moreover, we discuss the current challenges and potential solutions and provide future perspectives of human-virus PPI prediction in the coming post-AlphaFold2 era.
Affiliation(s)
- Xiaodi Yang
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing, China
- Shiping Yang
- State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing, China
- Panyu Ren
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing, China
- Stefan Wuchty
- Department of Computer Science, University of Miami, Miami, FL, United States
- Department of Biology, University of Miami, Miami, FL, United States
- Sylvester Comprehensive Cancer Center, University of Miami, Miami, FL, United States
- Ziding Zhang
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing, China
- *Correspondence: Ziding Zhang
18
OUP accepted manuscript. Brief Funct Genomics 2022; 21:243-269. [DOI: 10.1093/bfgp/elac007] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/02/2021] [Revised: 03/17/2022] [Accepted: 03/18/2022] [Indexed: 11/14/2022]
19
Gupta SK, Ponte-Sucre A, Bencurova E, Dandekar T. An Ebola, Neisseria and Trypanosoma human protein interaction census reveals a conserved human protein cluster targeted by various human pathogens. Comput Struct Biotechnol J 2021; 19:5292-5308. [PMID: 34745452 PMCID: PMC8531761 DOI: 10.1016/j.csbj.2021.09.017] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/19/2021] [Revised: 09/14/2021] [Accepted: 09/15/2021] [Indexed: 12/28/2022]
Abstract
Filovirus ebolavirus (ZE; Zaire ebolavirus, Bundibugyo ebolavirus), Neisseria meningitidis (NM), and Trypanosoma brucei (Tb) are serious infectious pathogens, spanning viruses, bacteria and protists, and all may target the blood and central nervous system during their life cycle. NM and Tb are extracellular pathogens while ZE is obligatory intracellular, targeting immune-privileged sites. By using interactomics and comparative evolutionary analysis we studied whether conserved human proteins are targeted by these pathogens. We examined 2797 unique pathogen-targeted human proteins. The information derived from orthology searches of experimentally validated protein-protein interactions (PPIs) resulted both in unique and shared PPIs for each pathogen. Comparing and analyzing conserved and pathogen-specific infection pathways for NM, Tb and ZE, we identified human proteins predicted to be targeted in at least two of the compared host-pathogen networks. However, four proteins were common to all three host-pathogen interactomes: the elongation factor 1-alpha 1 (EEF1A1), the SWI/SNF complex subunit SMARCC2 (matrix-associated actin-dependent regulator of chromatin subfamily C), the dolichyl-diphosphooligosaccharide--protein glycosyltransferase subunit 1 (RPN1), and the tubulin beta-5 chain (TUBB). These four human proteins are all also involved in the cytoskeleton and its regulation and are often addressed by various human pathogens. Specifically, we found that (i) 56 human pathogenic bacteria and viruses target these four proteins, (ii) the well-researched new pandemic pathogen SARS-CoV-2 targets two of these four human proteins and (iii) nine human pathogenic fungi (yet another evolutionarily distant organism group) target three of the conserved proteins by 130 high-confidence interactions.
Affiliation(s)
- Shishir K Gupta
- Functional Genomics & Systems Biology Group, Department of Bioinformatics, Biocenter, Am Hubland, University of Würzburg, 97074 Würzburg, Germany
- Evolutionary Genomics Group, Center for Computational and Theoretical Biology, University of Würzburg, 97078 Würzburg, Germany
- Alicia Ponte-Sucre
- Laboratorio de Fisiología Molecular, Instituto de Medicina Experimental, Escuela Luis Razetti, Universidad Central de Venezuela, Caracas, Venezuela
- Medical Mission Institute, Hermann-Schell-Str. 7, 97074 Würzburg, Germany
- Elena Bencurova
- Functional Genomics & Systems Biology Group, Department of Bioinformatics, Biocenter, Am Hubland, University of Würzburg, 97074 Würzburg, Germany
- Thomas Dandekar
- Functional Genomics & Systems Biology Group, Department of Bioinformatics, Biocenter, Am Hubland, University of Würzburg, 97074 Würzburg, Germany
- EMBL Heidelberg, BioComputing Unit, Meyerhofstraße 1, 69117 Heidelberg, Germany
20
Pitta JLDLP, Vasconcelos CRDS, Wallau GDL, Campos TDL, Rezende AM. In silico predictions of protein interactions between Zika virus and human host. PeerJ 2021; 9:e11770. [PMID: 34513323 PMCID: PMC8395582 DOI: 10.7717/peerj.11770] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/25/2020] [Accepted: 06/23/2021] [Indexed: 11/20/2022]
Abstract
BACKGROUND The Zika virus (ZIKV) belongs to the Flaviviridae family, was first isolated in the 1940s, and remained underreported until it became a global threat in 2016, when drastic consequences such as Guillain-Barré syndrome and microcephaly in newborns were reported. Understanding the molecular interactions of ZIKV proteins during host infection is important for developing treatments and prophylactic measures; however, the large-scale experimental approaches normally used to detect protein-protein interactions (PPIs) are onerous and labor-intensive. On the other hand, computational methods may overcome these challenges and guide traditional approaches toward one or a few protein molecules. The prediction of PPIs can be used to study host-parasite interactions at the protein level and reveal key pathways that allow viral infection. RESULTS Applying Random Forest and Support Vector Machine (SVM) algorithms, we performed predictions of PPIs between two ZIKV strains and human proteomes. The consensus number of predictions of both algorithms was 17,223 pairs of proteins. Functional enrichment analyses were executed with the predicted networks to assess the biological meanings of the protein interactions. Some pathways related to viral infection and neurological development were found for both ZIKV strains in the enrichment analysis, but the JAK-STAT pathway was observed only for strain PE243 when compared with the FSS13025 strain. CONCLUSIONS The consensus network of PPI predictions made by Random Forest and SVM algorithms allowed an enrichment analysis that corroborates many aspects of ZIKV infection. The enrichment results are mainly related to viral infection, neuronal development, and immune response, and presented differences between the two compared ZIKV strains. Strain PE243 presented more predicted interactions between proteins from the JAK-STAT signaling pathway, which could lead to a more inflammatory immune response when compared with the FSS13025 strain. These results show that the methodology employed in this study can potentially reveal new interactions between ZIKV and human cells.
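One plausible reading of a "consensus" between two classifiers is the set of protein pairs that both predict as interacting. The sketch below assumes exactly that (a set intersection); the abstract does not state how the consensus was computed, and the function name and protein identifiers are placeholders rather than results from the study.

```python
from typing import Set, Tuple

Pair = Tuple[str, str]  # (viral protein ID, human protein ID)


def consensus_pairs(rf_pairs: Set[Pair], svm_pairs: Set[Pair]) -> Set[Pair]:
    """Keep only the pairs predicted as interacting by both classifiers."""
    return rf_pairs & svm_pairs


# Placeholder identifiers, not real predictions from the study.
rf_predicted = {("NS1", "H1"), ("NS5", "H2"), ("E", "H3")}
svm_predicted = {("NS5", "H2"), ("E", "H3"), ("C", "H4")}
consensus = consensus_pairs(rf_predicted, svm_predicted)
```

Requiring agreement trades recall for precision: a pair supported by only one model is dropped, which may be a reasonable choice when the predictions feed a downstream enrichment analysis.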
Affiliation(s)
- Túlio de Lima Campos
- Bioinformatics Platform, Aggeu Magalhães Institute-FIOCRUZ/PE, Recife, PE, Brazil
21
Sudhakar P, Machiels K, Verstockt B, Korcsmaros T, Vermeire S. Computational Biology and Machine Learning Approaches to Understand Mechanistic Microbiome-Host Interactions. Front Microbiol 2021; 12:618856. [PMID: 34046017 PMCID: PMC8148342 DOI: 10.3389/fmicb.2021.618856] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 10/18/2020] [Accepted: 03/19/2021] [Indexed: 12/11/2022]
Abstract
The microbiome, by virtue of its interactions with the host, is implicated in various host functions including its influence on nutrition and homeostasis. Many chronic diseases such as diabetes, cancer, and inflammatory bowel diseases are characterized by a disruption of microbial communities in at least one biological niche/organ system. Various molecular mechanisms between microbial and host components such as proteins, RNAs, and metabolites have recently been identified, thus filling many gaps in our understanding of how the microbiome modulates host processes. Concurrently, high-throughput technologies have enabled the profiling of heterogeneous datasets capturing community-level changes in the microbiome as well as the host responses. However, due to limitations in parallel sampling and analytical procedures, big gaps still exist in terms of how the microbiome mechanistically influences host functions at a system and community level. In the past decade, computational biology and machine learning methodologies have been developed with the aim of filling the existing gaps. Due to the agnostic nature of the tools, they have been applied in diverse disease contexts to analyze and infer the interactions between the microbiome and host molecular components. Some of these approaches allow the identification and analysis of affected downstream host processes. Most of the tools statistically or mechanistically integrate different types of -omic and meta-omic datasets followed by functional/biological interpretation. In this review, we provide an overview of the landscape of computational approaches for investigating mechanistic interactions between individual microbes/microbiome and the host and the opportunities for basic and clinical research. These could include but are not limited to the development of activity- and mechanism-based biomarkers, uncovering mechanisms for therapeutic interventions and generating integrated signatures to stratify patients.
Affiliation(s)
- Padhmanand Sudhakar
- Department of Chronic Diseases, Metabolism and Ageing, Translational Research Center for Gastrointestinal Disorders (TARGID), KU Leuven, Leuven, Belgium
- Earlham Institute, Norwich, United Kingdom
- Quadram Institute Bioscience, Norwich, United Kingdom
- Kathleen Machiels
- Department of Chronic Diseases, Metabolism and Ageing, Translational Research Center for Gastrointestinal Disorders (TARGID), KU Leuven, Leuven, Belgium
- Bram Verstockt
- Department of Chronic Diseases, Metabolism and Ageing, Translational Research Center for Gastrointestinal Disorders (TARGID), KU Leuven, Leuven, Belgium
- Department of Gastroenterology and Hepatology, University Hospitals Leuven, KU Leuven, Leuven, Belgium
- Tamas Korcsmaros
- Earlham Institute, Norwich, United Kingdom
- Quadram Institute Bioscience, Norwich, United Kingdom
- Séverine Vermeire
- Department of Chronic Diseases, Metabolism and Ageing, Translational Research Center for Gastrointestinal Disorders (TARGID), KU Leuven, Leuven, Belgium
- Department of Gastroenterology and Hepatology, University Hospitals Leuven, KU Leuven, Leuven, Belgium
22
Lian X, Yang X, Yang S, Zhang Z. Current status and future perspectives of computational studies on human-virus protein-protein interactions. Brief Bioinform 2021; 22:6161422. [PMID: 33693490 DOI: 10.1093/bib/bbab029] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 11/10/2020] [Revised: 01/14/2021] [Accepted: 01/20/2021] [Indexed: 12/19/2022]
Abstract
The protein-protein interactions (PPIs) between humans and viruses mediate viral infection and host immunity processes. Therefore, the study of human-virus PPIs can help us understand the principles of human-virus relationships and can thus guide the development of highly effective drugs to break the transmission of viral infectious diseases. Recent years have witnessed the rapid accumulation of experimentally identified human-virus PPI data, which provides an unprecedented opportunity for bioinformatics studies revolving around human-virus PPIs. In this article, we provide a comprehensive overview of computational studies on human-virus PPIs, focusing especially on method development for human-virus PPI prediction. We briefly introduce the experimental detection methods and existing database resources of human-virus PPIs, and then discuss the research progress in the development of computational prediction methods. In particular, we elaborate on the machine learning-based prediction methods and highlight the need to embrace state-of-the-art deep-learning algorithms and new feature engineering techniques (e.g. the protein embedding technique derived from natural language processing). To further advance the understanding of this research topic, we also outline the practical applications of the human-virus interactome in fundamental biological discovery and new antiviral therapy development.
Affiliation(s)
- Xianyi Lian
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Xiaodi Yang
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Shiping Yang
- State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Ziding Zhang
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
23
Yang X, Lian X, Fu C, Wuchty S, Yang S, Zhang Z. HVIDB: a comprehensive database for human-virus protein-protein interactions. Brief Bioinform 2021; 22:832-844. [PMID: 33515030 DOI: 10.1093/bib/bbaa425] [Citation(s) in RCA: 42] [Impact Index Per Article: 10.5] [Received: 09/23/2020] [Revised: 11/12/2020] [Accepted: 12/19/2020] [Indexed: 12/22/2022]
Abstract
Viral infectious diseases lead to millions of deaths every year, and their treatment remains a huge public health challenge. Therefore, an in-depth understanding of human-virus protein-protein interactions (PPIs) as the molecular interface between a virus and its host cell is of paramount importance to obtain new insights into the pathogenesis of viral infections and the development of antiviral therapeutic treatments. However, current human-virus PPI database resources are incomplete, lack annotation and usually do not provide the opportunity to computationally predict human-virus PPIs. Here, we present the Human-Virus Interaction DataBase (HVIDB, http://zzdlab.com/hvidb/), which provides comprehensively annotated human-virus PPI data and seamlessly integrates online PPI prediction tools. Currently, HVIDB highlights 48 643 experimentally verified human-virus PPIs covering 35 virus families, 6633 virally targeted host complexes, 3572 host dependency/restriction factors as well as 911 experimentally verified/predicted 3D complex structures of human-virus PPIs. Furthermore, our database resource provides tissue-specific expression profiles of 6790 human genes that are targeted by viruses and 129 Gene Expression Omnibus series of differentially expressed genes post-viral infections. Based on these multifaceted and annotated data, our database allows users to easily obtain reliable information about PPIs of various human viruses and conduct an in-depth analysis of their inherent biological significance. In particular, HVIDB also integrates well-performing machine learning models to predict interactions between the human host and viral proteins that are based on (i) sequence embedding techniques, (ii) interolog mapping and (iii) domain-domain interaction inference. We anticipate that HVIDB will serve as a one-stop knowledge base to further guide hypothesis-driven experimental efforts to investigate human-virus relationships.
Affiliation(s)
- Xiaodi Yang
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Xianyi Lian
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Chen Fu
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Stefan Wuchty
- Institute of Data Science and Sylvester Comprehensive Cancer Center at the University of Miami, Miami, FL, United States
- Shiping Yang
- State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Ziding Zhang
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
24
An Integrative Computational Approach for the Prediction of Human-Plasmodium Protein-Protein Interactions. Biomed Res Int 2021; 2020:2082540. [PMID: 33426052 PMCID: PMC7771252 DOI: 10.1155/2020/2082540] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/20/2020] [Revised: 11/08/2020] [Accepted: 12/04/2020] [Indexed: 12/27/2022]
Abstract
Host-pathogen molecular cross-talks are critical in determining the pathophysiology of a specific infection. Most of these cross-talks are mediated via protein-protein interactions between the host and the pathogen (HP-PPI). Thus, it is essential to know how pathogens interact with their hosts to understand the mechanism of infections. Malaria is a life-threatening disease caused by an obligate intracellular parasite belonging to the Plasmodium genus, of which P. falciparum is the most prevalent. Several previous studies that predicted human-Plasmodium protein-protein interactions using computational methods have demonstrated the utility, accuracy, and efficiency of such methods in identifying interacting partners, thereby complementing experimental efforts to characterize host-pathogen interaction networks. To predict putative HP-PPIs, we use an integrative computational approach based on the combination of multiple OMICS-based methods including human red blood cells (RBC) and Plasmodium falciparum 3D7 strain expressed proteins, domain-domain based PPI, similarity of gene ontology terms, structure similarity method homology identification, and machine learning prediction. Our results reported a set of 716 protein interactions involving 302 human proteins and 130 Plasmodium proteins. This work provides a list of potential human-Plasmodium interacting proteins. These findings will contribute to a better understanding of the mechanisms underlying the molecular determinism of malaria disease and potentially to the identification of candidate pharmacological targets.
|
25
|
Kataria R, Duhan N, Kaundal R. Computational Systems Biology of Alfalfa - Bacterial Blight Host-Pathogen Interactions: Uncovering the Complex Molecular Networks for Developing Durable Disease Resistant Crop. FRONTIERS IN PLANT SCIENCE 2021; 12:807354. [PMID: 35251063 PMCID: PMC8891223 DOI: 10.3389/fpls.2021.807354] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/02/2021] [Accepted: 12/29/2021] [Indexed: 05/04/2023]
Abstract
Medicago sativa (also known as alfalfa), a forage legume, is widely cultivated due to its high yield and high-value hay crop production. Infectious diseases are a major threat to crops, causing huge economic losses to the agriculture industry worldwide. The protein-protein interactions (PPIs) between pathogens and their hosts play a critical role in understanding the molecular basis of pathogenesis. Pseudomonas syringae pv. syringae ALF3 suppresses the plant's innate immune response by secreting type III effector proteins into the host cell, causing bacterial stem blight in alfalfa. Little PPI information is available for the alfalfa-P. syringae system. Thus, to understand the infection mechanism, we elucidated the genome-scale host-pathogen interactions (HPIs) between alfalfa and P. syringae using two computational approaches: interolog-based and domain-based methods. A total of ∼14 M putative PPIs were predicted between 50,629 alfalfa proteins and 2,932 P. syringae proteins by combining these approaches. Additionally, ∼0.7 M consensus PPIs were also predicted. The functional analysis revealed that P. syringae proteins are highly involved in nucleotide binding activity (GO:0000166), intracellular organelle (GO:0043229), and translation (GO:0006412) while alfalfa proteins are involved in cellular response to chemical stimulus (GO:0070887), oxidoreductase activity (GO:0016614), and Golgi apparatus (GO:0005794). According to subcellular localization predictions, most of the pathogen proteins targeted host proteins within the cytoplasm and nucleus. In addition, we discovered a number of new virulence effectors in the predicted HPIs. The current research describes an integrated approach for deciphering genome-scale host-pathogen PPIs between alfalfa and P. syringae, allowing researchers to better understand the pathogen's infection mechanism and develop pathogen-resistant lines.
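The interolog idea mentioned in this abstract can be illustrated with a minimal sketch: a known template interaction is transferred to a host-pathogen pair when each template partner has a homolog in one of the two species. The function names, the dictionary-based homolog tables, and the toy identifiers below are illustrative assumptions and are not taken from the paper; homolog detection itself (for example by sequence search) is assumed to have been done beforehand.

```python
from itertools import product


def predict_interologs(template_ppis, host_homologs, pathogen_homologs):
    """Transfer template interactions to (host, pathogen) pairs.

    template_ppis: iterable of (protein_a, protein_b) known interactions.
    host_homologs: dict mapping a template protein to its host homologs.
    pathogen_homologs: dict mapping a template protein to its pathogen homologs.
    All three inputs are hypothetical, precomputed tables.
    """
    predicted = set()
    for a, b in template_ppis:
        # Template pairs are undirected, so try both orientations.
        for x, y in ((a, b), (b, a)):
            hosts = host_homologs.get(x, ())
            pathogens = pathogen_homologs.get(y, ())
            predicted.update(product(hosts, pathogens))
    return predicted


def consensus(interolog_pairs, domain_pairs):
    """Keep only pairs supported by both prediction approaches."""
    return set(interolog_pairs) & set(domain_pairs)


# Toy data (made up for illustration only).
templates = [("T1", "T2"), ("T3", "T4")]
host = {"T1": {"H1", "H2"}, "T4": {"H3"}}
pathogen = {"T2": {"P1"}, "T3": {"P2"}}
interologs = predict_interologs(templates, host, pathogen)
shared = consensus(interologs, {("H1", "P1"), ("H9", "P9")})
```

Taking the set intersection in `consensus` mirrors the notion of consensus PPIs supported by both the interolog-based and the domain-based approach; how the paper actually combines the two methods is not specified here.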
Affiliation(s)
- Raghav Kataria
  - Department of Plants, Soils, and Climate, College of Agriculture and Applied Sciences, Utah State University, Logan, UT, United States
- Naveen Duhan
  - Department of Plants, Soils, and Climate, College of Agriculture and Applied Sciences, Utah State University, Logan, UT, United States
- Rakesh Kaundal
  - Department of Plants, Soils, and Climate, College of Agriculture and Applied Sciences, Utah State University, Logan, UT, United States
  - Bioinformatics Facility, Center for Integrated Biosystems, Utah State University, Logan, UT, United States
  - Department of Computer Science, College of Science, Utah State University, Logan, UT, United States
  - *Correspondence: Rakesh Kaundal
|
26
|
Malik R, Fazal S, Kamal MA. Computational Analysis of Dynamical Fluctuations of Oncoprotein E7 (HPV 16) for the Hot Spot Residue Identification Using Elastic Network Model. LETT DRUG DES DISCOV 2020. [DOI: 10.2174/1570180817999200606225735] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]
Abstract
Aims:
To identify potential drug targets against HPV E7.
Background:
After invading the human body, oncoprotein E7 of human papillomavirus (HPV-16) alters host protein-protein interaction networks through fluctuations of its amino acid residues. E7 interacts with the Rb protein of the human host with variable residue fluctuations, leading to the progression of cervical cancer.
Objective:
Our study focused on computational analysis of the binding and competing interactions of the HPV E7 protein with the Rb protein.
Methods:
Our study is based on analysis of the dynamic fluctuations of E7 in the host cell and correlation analysis of specific residues in the LxCxE motif, the key region stabilizing the interaction between E7 and Rb.
Results and Discussion:
Cysteine, leucine and glutamic acid were identified as hot spot residues of E7, which can provide a platform for drug design and for understanding the pathogenesis of cervical cancer.
Conclusion:
Our study validates the vital role of the linear binding motif LxCxE of HPV E7 in interacting with Rb, an important event in the propagation of HPV in human cells and the transformation of infection into cervical cancer.
Affiliation(s)
- Rabbiah Malik
  - Capital University of Science and Technology, Islamabad, Pakistan
- Sahar Fazal
  - Capital University of Science and Technology, Islamabad, Pakistan
|
27
|
Khorsand B, Savadi A, Naghibzadeh M. Comprehensive host-pathogen protein-protein interaction network analysis. BMC Bioinformatics 2020; 21:400. [PMID: 32912135 PMCID: PMC7488060 DOI: 10.1186/s12859-020-03706-z] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/01/2020] [Accepted: 07/31/2020] [Indexed: 12/27/2022] Open
Abstract
BACKGROUND Infectious diseases are a cruel assassin with millions of victims around the world each year. Understanding the infection mechanisms of viruses is indispensable for their inhibition. One of the best ways of unveiling these mechanisms is to investigate the host-pathogen protein-protein interaction network. In this paper we try to disclose many properties of this network. We focus on human as the host and integrate 32,859 experimentally determined interactions between human proteins and virus proteins from several databases. We investigate different properties of human proteins targeted by virus proteins and find that most of them have considerably high centrality scores in the human intra-species protein-protein interaction network. Investigating the network properties of human proteins that are targeted by different virus proteins can help us to design multipurpose drugs. RESULTS As the host-pathogen protein-protein interaction network is a bipartite network and centrality measures for this type of network are scarce, we proposed seven new centrality measures for analyzing bipartite networks. Applying them to different virus strains reveals the non-randomness of the attack strategies of virus proteins, which could help us in drug design and hence improve quality of life. They could also be used in detecting host essential proteins. Essential proteins are those whose functions are critical for the survival of the host. One of the proposed centralities, named diversity of predators, outperforms the other existing centralities in detecting essential proteins and could be used as an optimal marker of essential proteins. CONCLUSIONS Different centralities were applied to analyze the human protein-protein interaction network and to detect characteristics of human proteins targeted by virus proteins. Moreover, seven new centralities were proposed to analyze the host-pathogen protein-protein interaction network and to detect pathogens' favorite host protein victims. Comparing different centralities in detecting essential proteins reveals that diversity of predators (one of the proposed centralities) is the best essential protein marker.
Affiliation(s)
- Babak Khorsand
  - Computer Engineering Department, Faculty of Engineering, Ferdowsi University of Mashhad, Mashhad, Iran
- Abdorreza Savadi
  - Computer Engineering Department, Faculty of Engineering, Ferdowsi University of Mashhad, Mashhad, Iran
  - Ferdowsi University of Mashhad, Azadi Square, Mashhad, 9177948974 Iran
|
28
|
Khatun MS, Shoombuatong W, Hasan MM, Kurata H. Evolution of Sequence-based Bioinformatics Tools for Protein-protein Interaction Prediction. Curr Genomics 2020; 21:454-463. [PMID: 33093807 PMCID: PMC7536797 DOI: 10.2174/1389202921999200625103936] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/17/2020] [Revised: 03/19/2020] [Accepted: 05/27/2020] [Indexed: 12/22/2022] Open
Abstract
Protein-protein interactions (PPIs) are the physical connections between two or more proteins via electrostatic forces or hydrophobic effects. Identifying PPIs is pivotal because they underlie many biological processes and inform the study of protein function, disease incidence, and therapy design. The experimental identification of PPIs via high-throughput technology is time-consuming and expensive. Bioinformatics approaches are expected to overcome such restrictions. In this review, our main goal is to provide an inclusive view of the existing sequence-based computational prediction of PPIs. Initially, we briefly introduce the currently available PPI databases and then review the state-of-the-art bioinformatics approaches, working principles, and their performances. Finally, we discuss the caveats and future perspective of the next generation algorithms for the prediction of PPIs.
Affiliation(s)
- Md. Mehedi Hasan
  - Address correspondence to these authors at the Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; Japan Society for the Promotion of Science, 5-3-1 Kojimachi, Chiyoda-ku, Tokyo 102-0083, Japan; Tel: +81-948-297-828; and Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; Biomedical Informatics R&D Center, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; Tel: +81-948-297-828
- Hiroyuki Kurata
  - Address correspondence to these authors at the Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; Japan Society for the Promotion of Science, 5-3-1 Kojimachi, Chiyoda-ku, Tokyo 102-0083, Japan; Tel: +81-948-297-828; and Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; Biomedical Informatics R&D Center, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; Tel: +81-948-297-828
|
29
|
Chen C, Zhang Q, Yu B, Yu Z, Lawrence PJ, Ma Q, Zhang Y. Improving protein-protein interactions prediction accuracy using XGBoost feature selection and stacked ensemble classifier. Comput Biol Med 2020; 123:103899. [DOI: 10.1016/j.compbiomed.2020.103899] [Citation(s) in RCA: 52] [Impact Index Per Article: 10.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/23/2020] [Revised: 06/28/2020] [Accepted: 06/28/2020] [Indexed: 10/23/2022]
|
30
|
Chen H, Li F, Wang L, Jin Y, Chi CH, Kurgan L, Song J, Shen J. Systematic evaluation of machine learning methods for identifying human-pathogen protein-protein interactions. Brief Bioinform 2020; 22:5847611. [PMID: 32459334 DOI: 10.1093/bib/bbaa068] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/31/2020] [Revised: 03/31/2020] [Accepted: 04/01/2020] [Indexed: 12/11/2022] Open
Abstract
In recent years, high-throughput experimental techniques have significantly enhanced the accuracy and coverage of protein-protein interaction identification, including human-pathogen protein-protein interactions (HP-PPIs). Despite this progress, experimental methods are, in general, expensive in terms of both time and labour costs, especially considering the enormous number of potential protein-interacting partners. Developing computational methods to predict interactions between humans and bacterial pathogens has thus become critical and meaningful, both in facilitating the detection of interactions and in mining incomplete interaction maps. In this paper, we present a systematic evaluation of machine learning-based computational methods for human-bacterium protein-protein interactions (HB-PPIs). We first reviewed a vast number of publicly available databases of HP-PPIs and then critically evaluated the availability of these databases. Benefitting from their well-structured nature, we subsequently preprocessed the data and identified six bacterial pathogens that could be used to study bacterial subjects for which human was the host. Additionally, we thoroughly reviewed the literature on 'host-pathogen interactions' and summarized existing models, which we used to jointly study the impact of different feature representation algorithms and evaluate the performance of existing machine learning computational models. Owing to the abundance of sequence information and the limited scale of other protein-related information, we adopted the primary protocol from the literature and dedicated our analysis to a comprehensive assessment of sequence information and machine learning models. A systematic evaluation of machine learning models and a wide range of feature representation algorithms based on sequence information is presented as a comparison survey of prediction performance for HB-PPIs.
|
31
|
Loaiza CD, Duhan N, Lister M, Kaundal R. In silico prediction of host-pathogen protein interactions in melioidosis pathogen Burkholderia pseudomallei and human reveals novel virulence factors and their targets. Brief Bioinform 2020; 22:5842243. [PMID: 32444871 DOI: 10.1093/bib/bbz162] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/12/2019] [Revised: 11/13/2019] [Accepted: 11/20/2019] [Indexed: 12/13/2022] Open
Abstract
The aerobic, Gram-negative motile bacillus, Burkholderia pseudomallei is a facultative intracellular bacterium causing melioidosis, a critical disease of public health importance, which is widely endemic in the tropics and subtropical regions of the world. Melioidosis is associated with high case fatality rates in animals and humans; even with treatment, its mortality is 20-50%. It also infects plants and is designated as a biothreat agent. B. pseudomallei is pathogenic due to its ability to invade, resist factors in serum and survive intracellularly. Despite its importance, to date only a few effector proteins have been functionally characterized, and there is not much information regarding the host-pathogen protein-protein interactions (PPI) of this system, which are important for studying infection mechanisms and thereby developing prevention measures. We explored two computational approaches, the homology-based interolog and the domain-based method, to predict genome-scale host-pathogen interactions (HPIs) between two different strains of B. pseudomallei (prototypical and highly virulent) and human. In total, 76 335 common HPIs (between the two strains) were predicted involving 8264 human and 1753 B. pseudomallei proteins. Among the unique PPIs, 14 131 non-redundant HPIs were unique to the prototypical strain and human, compared to 3043 non-redundant HPIs unique to the highly virulent strain and human. The protein hubs analysis showed that most B. pseudomallei proteins formed a hub with human dnaK complex proteins associated with tuberculosis, a disease similar in symptoms to melioidosis. In addition, drug-binding and carbohydrate-binding mechanisms were found overrepresented within the host-pathogen network, and metabolic pathways were frequently activated according to the pathway enrichment. Subcellular localization analysis showed that most of the pathogen proteins target human proteins inside the cytoplasm and nucleus. We also discovered the host targets of the drug-related pathogen proteins and proteins that form T3SS and T6SS in B. pseudomallei. Additionally, a comparison between the unique PPI patterns present in the prototypical and highly virulent strains was performed. The current study is the first report on developing a genome-scale host-pathogen protein interaction network between human and B. pseudomallei, a critical biothreat agent. We have identified novel virulence factors and their interacting partners in the human proteome. These PPIs can be further validated by high-throughput experiments and may give new insights into how B. pseudomallei interacts with its host, which will help medical researchers in developing better prevention measures.
Affiliation(s)
- Cristian D Loaiza
  - Center for Integrated BioSystems/Department of Plants, Soils, and Climate, College of Agriculture and Applied Sciences, Utah State University, USA
- Naveen Duhan
  - Center for Integrated BioSystems/Department of Plants, Soils, and Climate, College of Agriculture and Applied Sciences, Utah State University, USA
- Matthew Lister
  - Bioinformatics Facility, Center for Integrated BioSystems, Utah State University, USA
- Rakesh Kaundal
  - Department of Plants, Soils, and Climate/Center for Integrated BioSystems, College of Agriculture and Applied Sciences, Utah State University, Logan, UT 84322 USA
|
32
|
Guven-Maiorov E, Hakouz A, Valjevac S, Keskin O, Tsai CJ, Gursoy A, Nussinov R. HMI-PRED: A Web Server for Structural Prediction of Host-Microbe Interactions Based on Interface Mimicry. J Mol Biol 2020; 432:3395-3403. [PMID: 32061934 PMCID: PMC7261632 DOI: 10.1016/j.jmb.2020.01.025] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/31/2019] [Revised: 11/28/2019] [Accepted: 01/14/2020] [Indexed: 02/07/2023]
Abstract
Microbes, both commensals and pathogens, control numerous functions in host cells. They can alter host signaling and modulate immune surveillance by interacting with host proteins. To shed light on the contribution of microbes to health and disease, it is vital to discern how microbial proteins rewire host signaling and through which host proteins they do this. Host-Microbe Interaction PREDictor (HMI-PRED) is a user-friendly web server for structural prediction of protein-protein interactions (PPIs) between the host and a microbial species, including bacteria, viruses, fungi, and protozoa. HMI-PRED relies on "interface mimicry" through which the microbial proteins hijack host binding surfaces. Given the structure of a microbial protein of interest, HMI-PRED will return structural models of potential host-microbe interaction (HMI) complexes, the list of host endogenous and exogenous PPIs that can be disrupted, and tissue expression of the microbe-targeted host proteins. The server also allows users to upload homology models of microbial proteins. Broadly, it aims at large-scale, efficient identification of HMIs. The prediction results are stored in a repository for community access. HMI-PRED is free and available at https://interactome.ku.edu.tr/hmi.
Affiliation(s)
- Emine Guven-Maiorov
  - Computational Structural Biology Section, Basic Science Program, Frederick National Laboratory for Cancer Research, Frederick, MD, 21702, USA.
- Asma Hakouz
  - Department of Computer Engineering, Koc University, Istanbul, 34450, Turkey.
- Sukejna Valjevac
  - Department of Computer Engineering, Koc University, Istanbul, 34450, Turkey.
- Ozlem Keskin
  - Department of Chemical and Biological Engineering, Koc University, Istanbul, 34450, Turkey.
- Chung-Jung Tsai
  - Computational Structural Biology Section, Basic Science Program, Frederick National Laboratory for Cancer Research, Frederick, MD, 21702, USA.
- Attila Gursoy
  - Department of Computer Engineering, Koc University, Istanbul, 34450, Turkey.
- Ruth Nussinov
  - Computational Structural Biology Section, Basic Science Program, Frederick National Laboratory for Cancer Research, Frederick, MD, 21702, USA; Sackler Inst. of Molecular Medicine, Department of Human Genetics and Molecular Medicine, Sackler School of Medicine, Tel Aviv University, Tel Aviv, 69978, Israel.
|
33
|
Khorsand B, Savadi A, Zahiri J, Naghibzadeh M. Alpha influenza virus infiltration prediction using virus-human protein-protein interaction network. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2020; 17:3109-3129. [PMID: 32987519 DOI: 10.3934/mbe.2020176] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
With more than ten million deaths, influenza virus is one of the deadliest viruses in history. About half a million severe illnesses resulting from influenza are reported annually. Influenza is a parasite that needs the host cellular machinery to replicate its genome. To reach the host, viral proteins need to interact with host proteins. Therefore, identification of the host-virus protein interaction network (HVIN) is one of the crucial steps in treating viral diseases. Because experimental identification of the HVIN is expensive, time-consuming and laborious, researchers are forced to use computational methods instead of experimental ones to obtain a better understanding of the HVIN. In this study, several features are extracted from physicochemical properties of amino acids and combined with different centralities of the human protein-protein interaction network (HPPIN) to predict protein-protein interactions between human proteins and Alphainfluenzavirus proteins (HI-PPIs). Ensemble learning methods were used to predict such PPIs. Our model reached 0.93 accuracy, 0.91 sensitivity and 0.95 specificity. Moreover, a database including 694522 new PPIs was constructed from the prediction results of the model. Further analysis showed that HPPIN centralities, gene ontology semantic similarity and conjoint triad of virus proteins are the most important features to predict HI-PPIs.
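The conjoint triad feature named in this abstract can be sketched as follows. Each amino acid is mapped to one of seven classes, and the frequencies of all 7 × 7 × 7 = 343 consecutive class triplets form the feature vector. The seven-class grouping and the min-max scaling used below are assumptions based on the commonly used form of this encoding; the abstract does not state which variant the authors applied, and the function name is hypothetical.

```python
# Assumed seven-class grouping of the 20 standard amino acids
# (the grouping commonly used for conjoint triad encoding).
CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
AA_TO_CLASS = {aa: i for i, group in enumerate(CLASSES) for aa in group}


def conjoint_triad(sequence):
    """Return a 343-dimensional, min-max scaled triad-frequency vector."""
    counts = [0] * (7 ** 3)
    classes = [AA_TO_CLASS.get(aa) for aa in sequence.upper()]
    # Slide a window of three residues along the sequence.
    for c1, c2, c3 in zip(classes, classes[1:], classes[2:]):
        if None in (c1, c2, c3):
            continue  # skip triads containing non-standard residues
        counts[c1 * 49 + c2 * 7 + c3] += 1
    lo, hi = min(counts), max(counts)
    if hi == lo:
        return [0.0] * len(counts)  # no informative triads
    return [(c - lo) / (hi - lo) for c in counts]


vector = conjoint_triad("AGVAGV")  # toy sequence, all residues in class 0
```

A fixed-length vector like this makes sequences of different lengths comparable, which is why such encodings are convenient inputs for ensemble classifiers.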
Affiliation(s)
- Babak Khorsand
  - Computer Engineering Department, Ferdowsi University of Mashhad, Mashhad, Iran
- Abdorreza Savadi
  - Computer Engineering Department, Ferdowsi University of Mashhad, Mashhad, Iran
- Javad Zahiri
  - Faculty of Biological Sciences, Tarbiat Modares University, Tehran, Iran
- Mahmoud Naghibzadeh
  - Computer Engineering Department, Ferdowsi University of Mashhad, Mashhad, Iran
|
34
|
Sen R, Tagore S, De RK. ASAPP: Architectural Similarity-Based Automated Pathway Prediction System and Its Application in Host-Pathogen Interactions. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2020; 17:506-515. [PMID: 30281472 DOI: 10.1109/tcbb.2018.2872527] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
The significance of metabolic pathway prediction is to envision the viable unknown transformations that can occur provided the appropriate enzymes are present. It can facilitate the prediction of the consequences of host-pathogen interactions. In this article, we have proposed a new algorithm Architectural Similarity-based Automated Pathway Prediction (ASAPP) to predict metabolic pathways based on the structural similarity among the metabolites. ASAPP takes two-dimensional structure and molecular weight of metabolites as input, and generates a list of probable transformations without the knowledge of any externally established reactions, with an accuracy of 85.09 percent. ASAPP has also been applied to predict the outcome of pathogen liberated toxins on the carbohydrate and lipid pathways of the hosts. We have analyzed the disruption of host pathways in the presence of toxins, and have found that some metabolites in Glycolysis and the TCA cycle have a high chance of being the breakpoints in the pathway. The tool is available at http://asapp.droppages.com/.
|
35
|
Gupta SK, Srivastava M, Osmanoglu Ö, Dandekar T. Genome-wide inference of the Camponotus floridanus protein-protein interaction network using homologous mapping and interacting domain profile pairs. Sci Rep 2020; 10:2334. [PMID: 32047225 PMCID: PMC7012867 DOI: 10.1038/s41598-020-59344-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/09/2019] [Accepted: 01/22/2020] [Indexed: 12/18/2022] Open
Abstract
Apart from some model organisms, the interactome of most organisms is largely unidentified. High-throughput experimental techniques to determine protein-protein interactions (PPIs) are resource intensive and highly susceptible to noise. Computational methods of PPI determination can accelerate biological discovery by identifying the most promising interacting pairs of proteins and by assessing the reliability of identified PPIs. Here we present a first in-depth study describing a global view of the ant Camponotus floridanus interactome. Although several ant genomes have been sequenced in the last eight years, studies exploring and investigating PPIs in ants are lacking. Our study attempts to fill this gap and the presented interactome will also serve as a template for determining PPIs in other ants in future. Our C. floridanus interactome covers 51,866 non-redundant PPIs among 6,274 proteins, including 20,544 interactions supported by domain-domain interactions (DDIs), 13,640 interactions supported by DDIs and subcellular localization, and 10,834 high confidence interactions mediated by 3,289 proteins. These interactions involve and cover 30.6% of the entire C. floridanus proteome.
Affiliation(s)
- Shishir K Gupta
  - Functional Genomics and Systems Biology Group, Department of Bioinformatics, Biocenter, Am Hubland, D-97074, Würzburg, Germany; Department of Microbiology, Biocenter, Am Hubland, D-97074, Würzburg, Germany
- Mugdha Srivastava
  - Functional Genomics and Systems Biology Group, Department of Bioinformatics, Biocenter, Am Hubland, D-97074, Würzburg, Germany
- Özge Osmanoglu
  - Functional Genomics and Systems Biology Group, Department of Bioinformatics, Biocenter, Am Hubland, D-97074, Würzburg, Germany
- Thomas Dandekar
  - Functional Genomics and Systems Biology Group, Department of Bioinformatics, Biocenter, Am Hubland, D-97074, Würzburg, Germany; EMBL Heidelberg, BioComputing Unit, Meyerhofstraße 1, 69117, Heidelberg, Germany.
|
36
|
Li J, Wang S, Chen Z, Wang Y. A Bipartite Network Module-Based Project to Predict Pathogen-Host Association. Front Genet 2020; 10:1357. [PMID: 32038713 PMCID: PMC6992693 DOI: 10.3389/fgene.2019.01357] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/27/2019] [Accepted: 12/11/2019] [Indexed: 12/23/2022] Open
Abstract
Pathogen-host interactions play an important role in understanding the mechanism by which a pathogen can infect its host. Some approaches for predicting pathogen-host association have been developed, but prediction accuracy is still low. In this paper, we propose a bipartite network module-based approach to improve prediction accuracy. First, a bipartite network with pathogens and hosts is constructed. Next, pathogens and hosts are divided into different modules respectively. Then, modular information on the pathogens and hosts is added into a bipartite network projection model and the association scores between pathogens and hosts are calculated. Finally, leave-one-out cross-validation is used to estimate the performance of the proposed method. Experimental results show that the proposed method performs better in predicting pathogen-host association than other methods, and some potential pathogen-host associations with higher prediction scores are also confirmed by the results of biological experiments in the publicly available literature.
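As a rough illustration of what a bipartite network projection score can look like, the sketch below uses plain two-step resource allocation on a pathogen-host network. This is a generic baseline swapped in for illustration, not the authors' module-based model: the modular information described in the abstract is deliberately omitted, and the function name and toy data are hypothetical.

```python
def projection_scores(assoc):
    """Score every (pathogen, host) pair by two-step resource allocation.

    assoc: dict mapping each pathogen to the set of hosts it is known
    to be associated with (a hypothetical input format).
    Returns a nested dict: scores[pathogen][host] -> float.
    """
    hosts = sorted({h for hs in assoc.values() for h in hs})
    host_degree = {h: sum(h in hs for hs in assoc.values()) for h in hosts}
    scores = {}
    for p, p_hosts in assoc.items():
        # Step 1: each known host of p holds one unit of resource and
        # shares it equally among all pathogens linked to that host.
        pathogen_resource = {}
        for h in p_hosts:
            for q, q_hosts in assoc.items():
                if h in q_hosts:
                    share = 1.0 / host_degree[h]
                    pathogen_resource[q] = pathogen_resource.get(q, 0.0) + share
        # Step 2: each pathogen passes its resource back, split equally
        # among its own hosts; the totals are the association scores.
        host_score = dict.fromkeys(hosts, 0.0)
        for q, resource in pathogen_resource.items():
            for h in assoc[q]:
                host_score[h] += resource / len(assoc[q])
        scores[p] = host_score
    return scores


# Toy network (made up): P1 and P2 share host H2.
toy = {"P1": {"H1", "H2"}, "P2": {"H2", "H3"}}
toy_scores = projection_scores(toy)
```

In this toy network, H3 receives a non-zero score for P1 only because P1 and P2 share H2, which is the kind of indirect evidence a projection model exploits. Leave-one-out cross-validation would remove one known association at a time, recompute the scores, and check how highly the held-out pair ranks.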
Affiliation(s)
- Jie Li
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
|
37
|
Bose T, Venkatesh KV, Mande SS. Investigating host-bacterial interactions among enteric pathogens. BMC Genomics 2019; 20:1022. [PMID: 31881845 PMCID: PMC6935094 DOI: 10.1186/s12864-019-6398-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/19/2019] [Accepted: 12/15/2019] [Indexed: 01/07/2023] Open
Abstract
Background In 2017, the World Health Organization (WHO) published a catalogue of 12 families of antibiotic-resistant “priority pathogens” that pose the greatest threats to human health. Six of these dreaded pathogens are known to infect the human gastrointestinal system. In addition to causing gastrointestinal and systemic infections, these pathogens can also affect the composition of other microbes constituting the healthy gut microbiome. Such aberrations in the gut microbiome can significantly affect human physiology and immunity. Identifying the virulence mechanisms of these enteric pathogens is likely to help in developing newer therapeutic strategies to counter them. Results Using our previously published in silico approach, we have evaluated (and compared) Host-Pathogen Protein-Protein Interaction (HPI) profiles of four groups of enteric pathogens, namely, different species of Escherichia, Shigella, Salmonella and Vibrio. Results indicate that in spite of genus/species-specific variations, most enteric pathogens possess a common repertoire of HPIs. This core set of HPIs is probably responsible for the survival of these pathogens in the harsh nutrient-limiting environment within the gut. Certain genus/species-specific HPIs were also observed. Conclusions The identified bacterial proteins involved in the core set of HPIs are expected to be helpful in understanding the pathogenesis of these dreaded gut pathogens in greater detail. The possible role of genus/species-specific variations in the HPI profiles in the virulence of these pathogens is also discussed. The obtained results are likely to provide an opportunity for development of novel therapeutic strategies against the most dreaded gut pathogens.
Affiliation(s)
- Tungadri Bose
  - Bio-Sciences R&D Division, TCS Innovation Labs, Tata Consultancy Services Limited, Pune, India
  - Department of Chemical Engineering, Indian Institute of Technology Bombay, Mumbai, India
- K V Venkatesh
  - Department of Chemical Engineering, Indian Institute of Technology Bombay, Mumbai, India
- Sharmila S Mande
  - Bio-Sciences R&D Division, TCS Innovation Labs, Tata Consultancy Services Limited, Pune, India
|
38
|
Yang X, Yang S, Li Q, Wuchty S, Zhang Z. Prediction of human-virus protein-protein interactions through a sequence embedding-based machine learning method. Comput Struct Biotechnol J 2019; 18:153-161. [PMID: 31969974 PMCID: PMC6961065 DOI: 10.1016/j.csbj.2019.12.005] [Citation(s) in RCA: 74] [Impact Index Per Article: 12.3] [Received: 10/10/2019] [Revised: 11/29/2019] [Accepted: 12/10/2019] [Indexed: 12/11/2022] Open
Abstract
The identification of human-virus protein-protein interactions (PPIs) is an essential and challenging research topic, potentially providing a mechanistic understanding of viral infection. Given that the experimental determination of human-virus PPIs is time-consuming and labor-intensive, computational methods are playing an important role in providing testable hypotheses, complementing the large-scale determination of interactomes between species. In this work, we applied an unsupervised sequence embedding technique (doc2vec) to represent protein sequences as rich feature vectors of low dimensionality. Training a Random Forest (RF) classifier on a training dataset that covers known PPIs between human and all viruses, we obtained excellent predictive accuracy, outperforming various combinations of machine learning algorithms and commonly used sequence encoding schemes. In a rigorous comparison with three existing human-virus PPI prediction methods, our proposed computational framework further provided very competitive and promising performance, suggesting that the doc2vec encoding scheme effectively captures context information of protein sequences pertaining to the corresponding protein-protein interactions. Our approach is freely accessible through our web server as part of our host-pathogen PPI prediction platform (http://zzdlab.com/InterSPPI/). Taken together, we hope the current work not only contributes a useful predictor to accelerate the exploration of human-virus PPIs, but also provides some meaningful insights into human-virus relationships.
Key Words
- AC, Auto Covariance
- ACC, Accuracy
- AUC, area under the ROC curve
- AUPRC, area under the PR curve
- Adaboost, Adaptive Boosting
- CT, Conjoint Triad
- Doc2vec
- Embedding
- Human-virus interaction
- LD, Local Descriptor
- MCC, Matthews correlation coefficient
- ML, machine learning
- MLP, Multiple Layer Perceptron
- MS, mass spectroscopy
- Machine learning
- PPIs, protein-protein interactions
- PR, Precision-Recall
- Prediction
- Protein-protein interaction
- RBF, radial basis function
- RF, Random Forest
- ROC, Receiver Operating Characteristic
- SGD, stochastic gradient descent
- SVM, Support Vector Machine
- Y2H, yeast two-hybrid
Affiliation(s)
- Xiaodi Yang
  - State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Shiping Yang
  - State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Qinmengge Li
  - National Demonstration Center for Experimental Biological Sciences Education, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Stefan Wuchty
  - Dept. of Computer Science, University of Miami, Miami, FL 33146, USA
  - Dept. of Biology, University of Miami, Miami, FL 33146, USA
  - Center of Computational Science, University of Miami, Miami, FL 33146, USA
  - Sylvester Comprehensive Cancer Center, University of Miami, Miami, FL 33136, USA
- Ziding Zhang
  - State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
|
39
|
Zheng N, Wang K, Zhan W, Deng L. Targeting Virus-host Protein Interactions: Feature Extraction and Machine Learning Approaches. Curr Drug Metab 2019; 20:177-184. [PMID: 30156155 DOI: 10.2174/1389200219666180829121038] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 01/19/2018] [Revised: 05/21/2018] [Accepted: 08/02/2018] [Indexed: 01/15/2023]
Abstract
BACKGROUND: Targeting critical viral-host Protein-Protein Interactions (PPIs) has enormous application prospects for therapeutics. Using experimental methods to evaluate all possible virus-host PPIs is labor-intensive and time-consuming. Recent growth in computational identification of virus-host PPIs provides new opportunities for gaining biological insights, including applications in disease control. We provide an overview of recent computational approaches for studying virus-host PPIs. METHODS: In this review, a variety of computational methods for virus-host PPI prediction have been surveyed. These methods are categorized based on the features they utilize and the machine learning algorithms they employ, including classical and novel methods. RESULTS: We describe the pivotal and representative features extracted from relevant sources of biological data, mainly including sequence signatures, known domain interactions, protein motifs and protein structure information. We focus on state-of-the-art machine learning algorithms that are used to build binary prediction models for the classification of virus-host protein pairs and discuss their strengths, weaknesses and future directions. CONCLUSION: The findings of this review confirm the importance of computational methods for finding potential protein-protein interactions between virus and host. Although there has been significant progress in the prediction of virus-host PPIs in recent years, there is still considerable room for improvement.
Affiliation(s)
- Nantao Zheng
  - School of Software, Central South University, Changsha, 410075, China
- Kairou Wang
  - School of Software, Central South University, Changsha, 410075, China
- Weihua Zhan
  - School of Electronics and Computer Science, Zhejiang Wanli University, Ningbo 315100, China
- Lei Deng
  - School of Software, Central South University, Changsha, 410075, China
  - Shanghai Key Lab of Intelligent Information Processing, Shanghai 200433, China
|
40
|
Ahmed I, Witbooi P, Christoffels A. Prediction of human-Bacillus anthracis protein-protein interactions using multi-layer neural network. Bioinformatics 2019; 34:4159-4164. [PMID: 29945178 PMCID: PMC6289132 DOI: 10.1093/bioinformatics/bty504] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 02/18/2018] [Accepted: 06/24/2018] [Indexed: 12/22/2022] Open
Abstract
Motivation: Triplet amino acids have successfully been included in feature selection to predict human-HPV protein-protein interactions (PPI). The utility of supervised learning methods is curtailed because experimental data are not available in sufficient quantities. Improvements in machine learning techniques and feature selection will enhance the study of PPI between host and pathogen. Results: We present a comparison of a neural network model versus SVM for prediction of host-pathogen PPI based on a combination of features including amino acid quadruplets, pairwise sequence similarity, and human interactome properties. The neural network and SVM were implemented using the Python sklearn library. The neural network model using quadruplet features and other network features outperformed the SVM model. The models are tested against published predictors and then applied to the human-B. anthracis case. Gene ontology term enrichment analysis identifies immunology response and regulation as functions of interacting proteins. For prediction of human-viral PPI, our model (neural network) is a significant improvement in overall performance compared to a predictor using the triplets feature and achieves a good accuracy in predicting human-B. anthracis PPI. Availability and implementation: All code can be downloaded from ftp://ftp.sanbi.ac.za/machine_learning/. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Ibrahim Ahmed
  - South African National Bioinformatics Institute, South African MRC Bioinformatics Unit
- Peter Witbooi
  - Department of Mathematics and Applied Mathematics, University of the Western Cape, Bellville, South Africa
- Alan Christoffels
  - South African National Bioinformatics Institute, South African MRC Bioinformatics Unit
|
41
|
Sudhakar P, Jacomin AC, Hautefort I, Samavedam S, Fatemian K, Ari E, Gul L, Demeter A, Jones E, Korcsmaros T, Nezis IP. Targeted interplay between bacterial pathogens and host autophagy. Autophagy 2019; 15:1620-1633. [PMID: 30909843 PMCID: PMC6693458 DOI: 10.1080/15548627.2019.1590519] [Citation(s) in RCA: 32] [Impact Index Per Article: 5.3] [Received: 05/15/2017] [Revised: 02/21/2019] [Accepted: 03/01/2019] [Indexed: 12/12/2022] Open
Abstract
Due to the critical role played by autophagy in pathogen clearance, pathogens have developed diverse strategies to subvert it. Despite previous key findings of bacteria-autophagy interplay, a systems-level insight into selective targeting by the host and autophagy modulation by the pathogens is lacking. We predicted potential interactions between human autophagy proteins and effector proteins from 56 pathogenic bacterial species by identifying bacterial proteins predicted to have recognition motifs for selective autophagy receptors SQSTM1/p62, CALCOCO2/NDP52 and MAP1LC3/LC3. Using structure-based interaction prediction, we identified bacterial proteins capable of modifying core autophagy components. Our analysis revealed that autophagy receptors in general potentially target mostly genus-specific proteins, and not those present in multiple genera. The complementarity between the predicted SQSTM1/p62 and CALCOCO2/NDP52 targets, which has been shown for Salmonella, Listeria and Shigella, could be observed across other pathogens. This complementarity potentially leaves the host more susceptible to chronic infections upon the mutation of autophagy receptors. Proteins derived from enterotoxigenic and non-toxigenic Bacillus outer membrane vesicles indicated that autophagy targets pathogenic proteins rather than non-pathogenic ones. We also observed a pathogen-specific pattern as to which autophagy phase could be modulated by specific genera. We found intriguing examples of bacterial proteins that could modulate autophagy and in turn be targeted by autophagy as a host defense mechanism. We confirmed experimentally an interplay between a Salmonella protease, YhjJ, and autophagy. Our comparative meta-analysis points out key commonalities and differences in how pathogens could affect autophagy and how autophagy potentially recognizes these pathogenic effectors.
Abbreviations: ATG5: autophagy related 5; CALCOCO2/NDP52: calcium binding and coiled-coil domain 2; GST: glutathione S-transferase; LIR: MAP1LC3/LC3-interacting region; MAP1LC3/LC3: microtubule associated protein 1 light chain 3 alpha; OMV: outer membrane vesicles; SQSTM1/p62: sequestosome 1; SCV: Salmonella containing vesicle; TECPR1: tectonin beta-propeller repeat containing 1; YhjJ: hypothetical zinc-protease.
Affiliation(s)
- Padhmanand Sudhakar
  - Earlham Institute, Norwich Research Park, Norwich, UK
  - Gut Health and Microbes Programme, Quadram Institute, Norwich Research Park, Norwich, UK
  - Department of Chronic Diseases, Metabolism and Ageing, KU Leuven, Leuven, Belgium
- Siva Samavedam
  - School of Life Sciences, University of Warwick, Coventry, UK
- Koorosh Fatemian
  - School of Life Sciences, University of Warwick, Coventry, UK
  - Current affiliation: Exaelements LTD, Coventry, UK
- Eszter Ari
  - Department of Genetics, Eotvos Lorand University, Budapest, Hungary
  - Synthetic and System Biology Unit, Institute of Biochemistry, Biological Research Centre of the Hungarian Academy of Sciences, Szeged, Hungary
- Leila Gul
  - Earlham Institute, Norwich Research Park, Norwich, UK
- Amanda Demeter
  - Earlham Institute, Norwich Research Park, Norwich, UK
  - Gut Health and Microbes Programme, Quadram Institute, Norwich Research Park, Norwich, UK
  - Department of Genetics, Eotvos Lorand University, Budapest, Hungary
- Emily Jones
  - Earlham Institute, Norwich Research Park, Norwich, UK
  - Gut Health and Microbes Programme, Quadram Institute, Norwich Research Park, Norwich, UK
- Tamas Korcsmaros
  - Earlham Institute, Norwich Research Park, Norwich, UK
  - Gut Health and Microbes Programme, Quadram Institute, Norwich Research Park, Norwich, UK
|
42
|
Durães-Carvalho R, Ludwig-Begall LF, Salemi M, Lins RD, Marques ETA. Influence of directional positive Darwinian selection-driven evolution on arboviruses Dengue and Zika virulence and pathogenesis. Mol Phylogenet Evol 2019; 140:106607. [PMID: 31473337 DOI: 10.1016/j.ympev.2019.106607] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 07/15/2019] [Revised: 08/14/2019] [Accepted: 08/29/2019] [Indexed: 12/25/2022]
Abstract
Dengue (DENV) and Zika (ZIKV) viruses are antigenically and evolutionarily related; immunological cross-reactions between them have been associated with both cross-protection and infection-enhanced mechanisms. Here, DENV-1-4 and ZIKV were investigated through Bayesian coalescent-based approaches and selection-driven Darwinian evolution methods using robust datasets. Our findings show that both DENV and ZIKV, driven essentially by directional positive selection, have undergone evolution and diversification and that their entire polyproteins are subject to an intense directional evolution. Interestingly, positively selected codons mapped here are directly associated with DENV-1-2 virulence as well as the ZIKV burgeoning 2015-16 outbreak in the Americas, therefore having an impact on the pathogenesis of these viruses. Biochemical prediction analysis focusing on markers involved in virulence and viral transmission dynamics identified alterations in N-glycosylation, phosphorylation and palmitoylation sites in ZIKV sampled from different countries, hosts and isolation sources. Taking into account DENV-ZIKV co-circulation into and/or out of flavivirus-endemic regions, as well as recombination and quasispecies scenarios, these results indicate the action of a selection-driven evolution affecting the biology, virulence and pathogenesis of these pathogens in a non-randomized environment.
Affiliation(s)
- Ricardo Durães-Carvalho
  - Department of Virology, Aggeu Magalhães Institute, Oswaldo Cruz Foundation (FIOCRUZ), Recife, PE 50740-465, Brazil
- Louisa F Ludwig-Begall
  - Veterinary Virology and Animal Viral Diseases, Department of Infectious and Parasitic Diseases, FARAH Research Centre, Faculty of Veterinary Medicine, University of Liège, Belgium
- Marco Salemi
  - Emerging Pathogens Institute, University of Florida, Gainesville, FL 32608, United States
- Roberto D Lins
  - Department of Virology, Aggeu Magalhães Institute, Oswaldo Cruz Foundation (FIOCRUZ), Recife, PE 50740-465, Brazil
- Ernesto T A Marques
  - Department of Virology, Aggeu Magalhães Institute, Oswaldo Cruz Foundation (FIOCRUZ), Recife, PE 50740-465, Brazil
  - Center for Vaccine Research, University of Pittsburgh, Pittsburgh, PA 15261, United States
|
43
|
Saha S, Sengupta K, Chatterjee P, Basu S, Nasipuri M. Analysis of protein targets in pathogen-host interaction in infectious diseases: a case study on Plasmodium falciparum and Homo sapiens interaction network. Brief Funct Genomics 2019; 17:441-450. [PMID: 29028886 DOI: 10.1093/bfgp/elx024] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Indexed: 12/11/2022] Open
Abstract
Infection and disease progression are the outcome of protein interactions between pathogen and host. Pathogens, the central players in infection, are becoming a severe threat to life because of their adaptability toward drugs and their evolutionary dynamism. Identifying protein targets by analyzing protein interactions between host and pathogen is the key point. Proteins with higher degree and possessing some topologically significant graph theoretical measures are found to be drug targets. On the other hand, exceptional nodes may be involved in the infection mechanism because of some pathway process and biologically unknown factors. In this article, we attempt to investigate characteristics of host-pathogen protein interactions by presenting a comprehensive review of computational approaches applied to different infectious diseases. As an illustration, we have analyzed a case study on the infectious disease malaria, with its causative agent Plasmodium falciparum acting as 'Bait' and the host, Homo sapiens/human, acting as 'Prey'. In this pathogen-host interaction network, based on some interconnectivity and centrality properties, proteins are viewed as central, peripheral, hub and non-hub nodes, and their significance for the infection process is examined. Besides, it is observed that because of the sparseness of the pathogen and host interaction network, there may be some topologically unimportant but biologically significant proteins, which can also act as Bait/Prey. So, functional similarity or gene ontology mapping can help us in this case to identify these proteins.
Affiliation(s)
- Sovan Saha
  - Department of Computer Science and Engineering at Dr Sudhir Chandra Sur Degree Engineering College, India
- Kaustav Sengupta
  - Department of Computer Science and Engineering, Jadavpur University, India
- Piyali Chatterjee
  - Department of Computer Science and Engineering, Netaji Subhash Engineering College, Garia, India
- Subhadip Basu
  - Department of Computer Science and Engineering, Jadavpur University, India
- Mita Nasipuri
  - Department of Computer Science and Engineering, Jadavpur University, India
|
44
|
Lian X, Yang S, Li H, Fu C, Zhang Z. Machine-Learning-Based Predictor of Human–Bacteria Protein–Protein Interactions by Incorporating Comprehensive Host-Network Properties. J Proteome Res 2019; 18:2195-2205. [DOI: 10.1021/acs.jproteome.9b00074] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Indexed: 02/08/2023]
Affiliation(s)
- Xianyi Lian
  - State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Shiping Yang
  - State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Hong Li
  - Key Laboratory of Tropical Biological Resources of Ministry of Education, Hainan University, Haikou, 570228, China
- Chen Fu
  - State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Ziding Zhang
  - State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
|
45
|
Cuesta-Astroz Y, Santos A, Oliveira G, Jensen LJ. Analysis of Predicted Host-Parasite Interactomes Reveals Commonalities and Specificities Related to Parasitic Lifestyle and Tissues Tropism. Front Immunol 2019; 10:212. [PMID: 30815000 PMCID: PMC6381214 DOI: 10.3389/fimmu.2019.00212] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 09/18/2018] [Accepted: 01/24/2019] [Indexed: 01/03/2023] Open
Abstract
The study of molecular host–parasite interactions is essential to understand parasitic infection and adaptation within the host system. Likewise, prevention and treatment of infectious diseases require a clear understanding of the molecular crosstalk between parasites and their hosts. Yet, large-scale experimental identification of host–parasite molecular interactions remains challenging, and the use of computational predictions then becomes necessary. Here, we propose a computational integrative approach to predict host–parasite protein–protein interaction (PPI) networks resulting from the human infection by 15 different eukaryotic parasites. We used an orthology-based approach to transfer high-confidence intraspecies interactions obtained from the STRING database to the corresponding interspecies homolog protein pairs in the host–parasite system. Our approach uses either the parasite's predicted secretome and membrane proteins, or only the secretome, depending on whether the parasite is uni- or multi-cellular, respectively, to reduce the number of false predictions. Moreover, the host proteome is filtered for proteins expressed in selected cellular localizations and tissues supporting the parasite growth. We evaluated the inferred interactions by analyzing the enriched biological processes and pathways in the predicted networks and their association with known parasitic invasion and evasion mechanisms. The resulting PPI networks were compared across parasites to identify common mechanisms that may define a global pathogenic hallmark. We also provided a case study focusing on a closer examination of the human–S. mansoni predicted interactome, detecting central proteins that have relevant roles in the human–S. mansoni network, and identifying tissue-specific interactions with key roles in the life cycle of the parasite. The predicted PPI networks can be visualized and downloaded at http://orthohpi.jensenlab.org.
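The orthology-based transfer described in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name `transfer_interologs`, the dictionary-based ortholog maps, and the toy identifiers are all hypothetical, and confidence scoring and host tissue filtering are omitted.

```python
from itertools import product


def transfer_interologs(ref_ppis, host_orthologs, parasite_orthologs,
                        parasite_filter=None):
    """Infer host-parasite pairs from intraspecies reference interactions.

    ref_ppis: iterable of (protein_a, protein_b) pairs in a reference species.
    host_orthologs / parasite_orthologs: dicts mapping a reference protein to
        the set of its orthologs in the host / parasite (hypothetical format).
    parasite_filter: optional set of parasite proteins to keep, for example
        a predicted secretome.
    """
    predicted = set()
    for a, b in ref_ppis:
        # Reference interactions are undirected, so try both orientations.
        for ref_host, ref_parasite in ((a, b), (b, a)):
            hosts = host_orthologs.get(ref_host, ())
            parasites = parasite_orthologs.get(ref_parasite, ())
            for h, p in product(hosts, parasites):
                if parasite_filter is None or p in parasite_filter:
                    predicted.add((h, p))
    return predicted


# Toy data; all identifiers are made up for illustration.
ref_ppis = [("R1", "R2"), ("R2", "R3")]
host_orthologs = {"R1": {"H1"}, "R3": {"H3"}}
parasite_orthologs = {"R2": {"P2a", "P2b"}}
secretome = {"P2a"}

pairs = transfer_interologs(ref_ppis, host_orthologs, parasite_orthologs,
                            secretome)
```

Restricting the parasite side to a secretome set mirrors the abstract's stated goal of reducing false predictions; without the filter the toy example yields four candidate pairs instead of two.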
Affiliation(s)
- Yesid Cuesta-Astroz
  - Instituto René Rachou, Fundação Oswaldo Cruz - FIOCRUZ, Belo Horizonte, Brazil
- Alberto Santos
  - Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Lars J Jensen
  - Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
|
46
|
Ivan FX, Kwoh CK, Chow VT, Zheng J. Genome Analysis – Identification of Genes Involved in Host-Pathogen Protein-Protein Interaction Networks. ENCYCLOPEDIA OF BIOINFORMATICS AND COMPUTATIONAL BIOLOGY 2019:410-424. [DOI: 10.1016/b978-0-12-809633-8.20124-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/03/2025]
|
47
|
Guven-Maiorov E, Tsai CJ, Ma B, Nussinov R. Interface-Based Structural Prediction of Novel Host-Pathogen Interactions. Methods Mol Biol 2019; 1851:317-335. [PMID: 30298406 PMCID: PMC8192064 DOI: 10.1007/978-1-4939-8736-8_18] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Indexed: 12/26/2022]
Abstract
About 20% of cancer incidence worldwide has been estimated to be associated with infections. However, the molecular mechanisms of exactly how infections contribute to host tumorigenesis are still unknown. To evade host defense, pathogens hijack host proteins at different levels: sequence, structure, motif, and binding surface, i.e., interface. Interface similarity allows pathogen proteins to compete with host counterparts to bind to a target protein, rewire physiological signaling, and result in persistent infections, as well as cancer. Identification of host-pathogen interactions (HPIs), along with their structural details at atomic resolution, may provide mechanistic insight into pathogen-driven cancers and innovate therapeutic intervention. HPI data including structural details are scarce, and large-scale experimental detection is challenging. Therefore, there is an urgent and mounting need for efficient and robust computational approaches to predict HPIs and their complex (bound) structures. In this chapter, we review the first and currently only interface-based computational approach to identify novel HPIs. The concept of interface mimicry promises to identify more HPIs than complete sequence or structural similarity. We illustrate this concept with a case study on Kaposi's sarcoma herpesvirus (KSHV) to elucidate how it subverts host immunity and helps contribute to malignant transformation of the host cells.
Affiliation(s)
- Emine Guven-Maiorov
  - Cancer and Inflammation Program, Leidos Biomedical Research, Inc. Frederick National Laboratory for Cancer Research, National Cancer Institute, Frederick, MD, USA
- Chung-Jung Tsai
  - Cancer and Inflammation Program, Leidos Biomedical Research, Inc. Frederick National Laboratory for Cancer Research, National Cancer Institute, Frederick, MD, USA
- Buyong Ma
  - Cancer and Inflammation Program, Leidos Biomedical Research, Inc. Frederick National Laboratory for Cancer Research, National Cancer Institute, Frederick, MD, USA
- Ruth Nussinov
  - Cancer and Inflammation Program, Leidos Biomedical Research, Inc. Frederick National Laboratory for Cancer Research, National Cancer Institute, Frederick, MD, USA
  - Department of Human Genetics and Molecular Medicine, Sackler Inst. of Molecular Medicine, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
|
48
|
A new sequence based encoding for prediction of host-pathogen protein interactions. Comput Biol Chem 2018; 78:170-177. [PMID: 30553999 DOI: 10.1016/j.compbiolchem.2018.12.001] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 12/23/2017] [Revised: 08/23/2018] [Accepted: 12/01/2018] [Indexed: 12/22/2022]
Abstract
Pathogen-host interactions are very important for understanding the infection process at the molecular level, where pathogen proteins physically bind to human proteins to manipulate critical biological processes in the host cell. Data scarcity and data unavailability are two major problems for computational approaches in the prediction of pathogen-host interactions. Developing a computational method to predict pathogen-host interactions with high accuracy, based on protein sequences alone, is of great importance because it can eliminate these problems. In this study, we propose a novel and robust sequence based feature extraction method, named Location Based Encoding, to predict pathogen-host interactions with machine learning based algorithms. In this context, we use Bacillus anthracis and Yersinia pestis data sets as the pathogen organisms and human proteins as the host model to compare our method with sequence based protein encoding methods that are widely used in the literature, namely amino acid composition, amino acid pair, and conjoint triad. We use these encoding methods with decision tree based (Random Forest, J48), statistical (Bayesian Networks, Naive Bayes), and instance based (kNN) classifiers to predict pathogen-host interactions. We conduct different experiments to evaluate the effectiveness of our method. We obtain the best results among all the experiments with the RF classifier in terms of F1, accuracy, MCC, and AUC.
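Two of the baseline encodings named in this abstract, amino acid composition and conjoint triad, can be illustrated as follows. This is a hedged sketch: the seven-class residue grouping is the one commonly cited for the conjoint triad and is assumed here rather than taken from this paper, frequencies are normalized by simple division rather than any scheme the authors may have used, and the function names are hypothetical. The paper's own Location Based Encoding is not reproduced because the abstract does not define it.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Seven residue classes commonly used for the conjoint triad (assumption).
CT_CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
AA_TO_CLASS = {aa: i for i, group in enumerate(CT_CLASSES) for aa in group}

# Index of each of the 7 * 7 * 7 = 343 possible class triads.
TRIAD_INDEX = {t: i for i, t in enumerate(product(range(7), repeat=3))}


def amino_acid_composition(seq):
    """Return the 20-dimensional relative frequency of each amino acid."""
    residues = [aa for aa in seq.upper() if aa in AA_TO_CLASS]
    n = len(residues) or 1  # avoid division by zero on empty input
    return [residues.count(aa) / n for aa in AMINO_ACIDS]


def conjoint_triad(seq):
    """Return the 343-dimensional frequency vector of class triads."""
    classes = [AA_TO_CLASS[aa] for aa in seq.upper() if aa in AA_TO_CLASS]
    counts = [0] * len(TRIAD_INDEX)
    # Slide a window of three consecutive residues along the sequence.
    for i in range(len(classes) - 2):
        counts[TRIAD_INDEX[tuple(classes[i:i + 3])]] += 1
    total = sum(counts) or 1
    return [c / total for c in counts]


def encode_pair(pathogen_seq, host_seq):
    """Concatenate both proteins' triad vectors into one feature vector."""
    return conjoint_triad(pathogen_seq) + conjoint_triad(host_seq)
```

A pair vector built this way has a fixed length of 686 regardless of sequence length, which is what lets classifiers such as Random Forest or kNN consume protein pairs of arbitrary size.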
|
49
|
Halder AK, Dutta P, Kundu M, Basu S, Nasipuri M. Review of computational methods for virus-host protein interaction prediction: a case study on novel Ebola-human interactions. Brief Funct Genomics 2018; 17:381-391. [PMID: 29028879 PMCID: PMC7109800 DOI: 10.1093/bfgp/elx026] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 12/15/2022] Open
Abstract
Identification of potential virus-host interactions is vital for controlling highly infectious virus-caused diseases and may contribute toward development of new drugs to treat viral infections. Recently, database records of clinically and experimentally validated interactions between a small set of human proteins and Ebola virus (EBOV) have been published. Using the information on the known human interaction partners of EBOV, our main objective is to identify a set of proteins that may interact with EBOV proteins. Here, we first review the state-of-the-art computational methods used for prediction of novel virus-host interactions for infectious diseases, followed by a case study on EBOV-human interactions. The assessment result shows that the predicted human host proteins are highly similar to the known human interaction partners of EBOV in terms of structure and semantics and are responsible for similar biochemical activities, pathways and host-pathogen relationships.
Affiliation(s)
- Anup Kumar Halder
  - Department of Computer Science and Engineering, Jadavpur University, India
- Pritha Dutta
  - Department of Computer Science and Engineering, Jadavpur University, India
- Mahantapas Kundu
  - Department of Computer Science and Engineering, Jadavpur University, India
- Subhadip Basu
  - Department of Computer Science and Engineering, Jadavpur University, India
- Mita Nasipuri
  - Department of Computer Science and Engineering, Jadavpur University, India
|
50
|
Chen J, Sun J, Liu X, Liu F, Liu R, Wang J. Structure-based prediction of West Nile virus-human protein-protein interactions. J Biomol Struct Dyn 2018; 37:2310-2321. [PMID: 30044201 DOI: 10.1080/07391102.2018.1479659] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 12/11/2022]
Abstract
In recent years, West Nile virus (WNV) has posed a great threat to global human health due to its explosive spread. Studying the protein-protein interactions (PPIs) between WNV and humans is beneficial for understanding the pathogenesis of WNV and the human immune response to WNV infection at the molecular level. In this study, we identified the human target proteins that interact with WNV based on protein structure similarity, and the interacting pairs were then filtered by subcellular co-localization information. As a result, a network of 3346 interactions was constructed, involving 6 WNV proteins and 1970 human target proteins. To our knowledge, this is the first predicted interactome for WNV-human. By analyzing the topological properties and evolutionary rates of the human target proteins, it was demonstrated that these proteins tend to be the hub and bottleneck proteins in the human PPI network and are more conserved than the non-target ones. Triplet analysis showed that the target proteins are adjacent to each other in the human PPI network, suggesting that these proteins may have similar biological functions. Further, the functional enrichment analysis indicated that the target proteins are mainly involved in virus process, transcription regulation, cell adhesion, and so on. In addition, the common and specific targets were identified and compared based on the networks between WNV-human and Dengue virus II (DENV2)-human. Finally, by combining topological features and existing drug target information, we identified 30 potential anti-WNV human targets, 11 of which were reported to be associated with WNV infection. Communicated by Ramaswamy H. Sarma.
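The "hub" and "bottleneck" notions in this abstract are usually read as high degree and high betweenness centrality in the PPI network. The sketch below shows one way to compute both on an undirected, unweighted graph using Brandes' algorithm. The adjacency-dict format, the function names, and the top-20% cutoff are assumptions for illustration; the abstract does not state the thresholds the authors used.

```python
from collections import deque


def degree(graph):
    """Number of interaction partners per protein."""
    return {v: len(neighbors) for v, neighbors in graph.items()}


def betweenness(graph):
    """Brandes' algorithm for an undirected, unweighted graph.

    graph: dict mapping every node to the set of its neighbors (symmetric).
    """
    bc = dict.fromkeys(graph, 0.0)
    for s in graph:
        stack = []
        preds = {v: [] for v in graph}
        sigma = dict.fromkeys(graph, 0)  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:  # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(graph, 0.0)
        while stack:  # accumulate dependencies in reverse BFS order
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each undirected shortest path was counted from both endpoints.
    return {v: b / 2 for v, b in bc.items()}


def top_fraction(scores, fraction=0.2):
    """Nodes ranked in the top `fraction` by score (cutoff is an assumption)."""
    k = max(1, int(len(scores) * fraction))
    return set(sorted(scores, key=scores.get, reverse=True)[:k])


# Toy path network A - B - C; B lies on the only A-to-C shortest path.
toy = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}
hubs = top_fraction(degree(toy))
bottlenecks = top_fraction(betweenness(toy))
```

In the toy network, protein B is both the hub and the bottleneck; in a real interactome the two sets overlap only partly, which is why both measures are typically reported.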
Affiliation(s)
- Jing Chen
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, People's Republic of China
- Jun Sun
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, People's Republic of China
- Xiangming Liu
  - Gongqing Institute of Science and Technology, Gongqing, People's Republic of China
- Feng Liu
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, People's Republic of China
- Rong Liu
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, People's Republic of China
- Jia Wang
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, People's Republic of China
|