1
Idrees S, Paudel KR, Banik M, Suwal N, Thapa R, Bashyal S. Predicting Motif-Mediated Interactions Based on Viral Genomic Composition. Int J Mol Sci 2025; 26:3674. [PMID: 40332242 PMCID: PMC12028151 DOI: 10.3390/ijms26083674] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2025] [Revised: 04/10/2025] [Accepted: 04/11/2025] [Indexed: 05/08/2025] Open
Abstract
Viruses manipulate host cellular machinery to propagate their life cycle, with one key strategy being the mimicry of short linear motifs (SLiMs) found in host proteins. While databases continue to expand with virus-host protein-protein interaction (vhPPI) data, accurately predicting viral mimicry remains challenging due to the inherent degeneracy of SLiMs. In this study, we investigate how viral genomic composition influences motif mimicry and the mechanisms through which viruses hijack host cellular functions. We assessed domain-motif interaction (DMI) enrichment differences, and also predicted new DMIs based on known viral motifs with varying stringency levels, using SLiMEnrich v.1.5.1. Our findings reveal that dsDNA viruses capture significantly more known DMIs compared to other viral groups, with dsRNA viruses also exhibiting higher DMI enrichment than ssRNA viruses. Additionally, we identified new vhPPIs mediated via SLiMs, particularly within different viral genomic contexts. Understanding these interactions is vital for elucidating viral strategies to hijack host functions, which could inform the development of targeted antiviral therapies.
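The enrichment idea in this abstract can be illustrated with a generic permutation-style check: count how many virus-host PPIs could be explained by a known motif-domain pair, then compare that count with counts from shuffled partner assignments. This is a minimal sketch under stated assumptions, not SLiMEnrich's actual code; the function names, the toy proteins (`vA`, `h1`, ...), and the motif and domain labels are all hypothetical.

```python
import random

def count_dmis(ppis, motifs_of, domains_of, known_dmis):
    """Count PPIs that at least one known motif-domain pair could explain."""
    hits = 0
    for viral, host in ppis:
        if any((m, d) in known_dmis
               for m in motifs_of.get(viral, ())
               for d in domains_of.get(host, ())):
            hits += 1
    return hits

def enrichment(ppis, motifs_of, domains_of, known_dmis, n_shuffles=1000, seed=0):
    """Compare the observed DMI count with counts from shuffled host partners."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = count_dmis(ppis, motifs_of, domains_of, known_dmis)
    virals = [v for v, _ in ppis]
    hosts = [h for _, h in ppis]
    random_counts = []
    for _ in range(n_shuffles):
        shuffled = hosts[:]
        rng.shuffle(shuffled)  # break the real viral-host pairing
        random_counts.append(
            count_dmis(list(zip(virals, shuffled)), motifs_of, domains_of, known_dmis)
        )
    mean_random = sum(random_counts) / n_shuffles
    # Empirical p-value with the usual +1 correction
    p_value = (1 + sum(c >= observed for c in random_counts)) / (n_shuffles + 1)
    return observed, mean_random, p_value

# Hypothetical toy data: two of three PPIs are explainable by a known DMI.
motifs_of = {"vA": {"M1"}, "vB": {"M2"}, "vC": set()}
domains_of = {"h1": {"D1"}, "h2": {"D2"}, "h3": {"D3"}}
known_dmis = {("M1", "D1"), ("M2", "D2")}
ppis = [("vA", "h1"), ("vB", "h2"), ("vC", "h3")]

observed, mean_random, p_value = enrichment(ppis, motifs_of, domains_of, known_dmis)
print(observed, round(mean_random, 2), round(p_value, 3))
```

An observed count well above the shuffled mean suggests that the PPI set captures more DMIs than random pairing would; a real analysis would need far larger datasets than this toy example.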
Affiliation(s)
- Sobia Idrees
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW 2033, Australia
  - Centre for Inflammation, Centenary Institute and the University of Technology Sydney, School of Life Sciences, Faculty of Science, Sydney, NSW 2007, Australia
- Keshav Raj Paudel
  - Centre for Inflammation, Centenary Institute and the University of Technology Sydney, School of Life Sciences, Faculty of Science, Sydney, NSW 2007, Australia
- Mithila Banik
  - Department of Bioinformatics and Biotechnology, Asian University for Women, Chittagong 4000, Bangladesh
- Newton Suwal
  - Department of Pharmacy, Manmohan Institute of Health Sciences, Tribhuvan University, Kathmandu 44600, Nepal
- Rajan Thapa
  - Department of Pharmacy, Universal College of Medical Sciences, Tribhuvan University, Bhairahawa, Rupendehi 32900, Nepal
- Saroj Bashyal
  - Department of Pharmacy, Manmohan Institute of Health Sciences, Tribhuvan University, Kathmandu 44600, Nepal
2
Idrees S, Paudel KR, Hansbro PM. Prediction of motif-mediated viral mimicry through the integration of host-pathogen interactions. Arch Microbiol 2024; 206:94. [PMID: 38334822 PMCID: PMC10858152 DOI: 10.1007/s00203-024-03832-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/29/2023] [Revised: 01/01/2024] [Accepted: 01/02/2024] [Indexed: 02/10/2024]
Abstract
One of the mechanisms viruses use in hijacking host cellular machinery is mimicking Short Linear Motifs (SLiMs) in host proteins to maintain their life cycle inside host cells. Despite the escalating volume of virus-host protein-protein interactions (vhPPIs) documented in databases, the accurate prediction of molecular mimicry remains a formidable challenge due to the inherent degeneracy of SLiMs. Consequently, there is a pressing need for computational methodologies to predict new instances of viral mimicry. Our present study introduces a DMI-de-novo pipeline, revealing that vhPPIs catalogued in the VirHostNet3.0 database effectively capture domain-motif interactions (DMIs). Notably, both affinity purification coupled with mass spectrometry and yeast two-hybrid assays emerged as good approaches for delineating DMIs. Furthermore, we have identified new vhPPIs mediated by SLiMs across different viruses. Importantly, the de-novo prediction strategy facilitated the recognition of several potential mimicry candidates implicated in the subversion of host cellular proteins. The insights gleaned from this research not only enhance our comprehension of the mechanisms by which viruses co-opt host cellular machinery but also pave the way for the development of novel therapeutic interventions.
Affiliation(s)
- Sobia Idrees
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW, Australia
  - Centre for Inflammation, School of Life Sciences, Faculty of Science, Centenary Institute and the University of Technology Sydney, Sydney, NSW, Australia
- Keshav Raj Paudel
  - Centre for Inflammation, School of Life Sciences, Faculty of Science, Centenary Institute and the University of Technology Sydney, Sydney, NSW, Australia
- Philip M Hansbro
  - Centre for Inflammation, School of Life Sciences, Faculty of Science, Centenary Institute and the University of Technology Sydney, Sydney, NSW, Australia
3
Khan T, Raza S. Exploration of Computational Aids for Effective Drug Designing and Management of Viral Diseases: A Comprehensive Review. Curr Top Med Chem 2023; 23:1640-1663. [PMID: 36725827 DOI: 10.2174/1568026623666230201144522] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2022] [Revised: 11/14/2022] [Accepted: 12/19/2022] [Indexed: 02/03/2023]
Abstract
BACKGROUND Microbial diseases, specifically those originating from viruses, are the major cause of human mortality all over the world. The current COVID-19 pandemic is a case in point, where the dynamics of the viral-human interactions are still not completely understood, making its treatment a case of trial and error. Scientists have been struggling for over a year to devise a strategy to contain the pandemic, and this brings to light the lack of understanding of how the virus grows and multiplies in the human body. METHODS This paper presents the perspective of the authors on the applicability of computational tools for deep learning and understanding of host-microbe interaction, disease progression and management, drug resistance and immune modulation through in silico methodologies which can aid in effective and selective drug development. The paper summarizes advances in the last five years. Studies published and indexed in leading databases have been included in the review. RESULTS Computational systems biology works at the interface of biology and mathematics and intends to unravel the complex mechanisms between biological systems and the inter- and intra-species dynamics using computational tools and high-throughput technologies developed on algorithms, networks and complex connections to simulate cellular biological processes. CONCLUSION Computational strategies and modelling integrate and prioritize microbial-host interactions and may predict the conditions in which the fine-tuning attenuates. These microbial-host interactions and working mechanisms are important for effective drug design and for fine-tuning therapeutic interventions.
Affiliation(s)
- Tahmeena Khan
  - Department of Chemistry, Integral University, Lucknow, 226026, U.P., India
- Saman Raza
  - Department of Chemistry, Isabella Thoburn College, Lucknow, 226007, U.P., India
4
Cheng T, Chin PJ, Cha K, Petrick N, Mikailov M. Profiling the BLAST bioinformatics application for load balancing on high-performance computing clusters. BMC Bioinformatics 2022; 23:544. [PMID: 36526957 PMCID: PMC9758941 DOI: 10.1186/s12859-022-05029-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2022] [Accepted: 10/31/2022] [Indexed: 12/23/2022] Open
Abstract
BACKGROUND The Basic Local Alignment Search Tool (BLAST) is a suite of commonly used algorithms for identifying matches between biological sequences. The user supplies a database file and query file of sequences for BLAST to find identical sequences between the two. The typical millions of database and query sequences make BLAST computationally challenging but also well suited for parallelization on high-performance computing clusters. The efficacy of parallelization depends on the data partitioning, where the optimal data partitioning relies on an accurate performance model. In previous studies, a BLAST job was sped up by 27 times by partitioning the database and query among thousands of processor nodes. However, the optimality of the partitioning method was not studied. Unlike BLAST performance models proposed in the literature that usually have problem size and hardware configuration as the only variables, the execution time of a BLAST job is a function of database size, query size, and hardware capability. In this work, the nucleotide BLAST application BLASTN was profiled using three methods: shell-level profiling with the Unix "time" command, code-level profiling with the built-in "profiler" module, and system-level profiling with the Unix "gprof" program. The runtimes were measured for six node types, using six different database files and 15 query files, on a heterogeneous HPC cluster with 500+ nodes. The empirical measurement data were fitted with quadratic functions to develop performance models that were used to guide the data parallelization for BLASTN jobs. RESULTS Profiling results showed that BLASTN contains more than 34,500 different functions, but a single function, RunMTBySplitDB, takes 99.12% of the total runtime. Among its 53 child functions, five core functions were identified to make up 92.12% of the overall BLASTN runtime. 
Based on the performance models, static load balancing algorithms can be applied to the BLASTN input data to minimize the runtime of the longest job on an HPC cluster. Four test cases being run on homogeneous and heterogeneous clusters were tested. Experiment results showed that the runtime can be reduced by 81% on a homogeneous cluster and by 20% on a heterogeneous cluster by re-distributing the workload. DISCUSSION Optimal data partitioning can improve BLASTN's overall runtime 5.4-fold in comparison with dividing the database and query into the same number of fragments. The proposed methodology can be used in the other applications in the BLAST+ suite or any other application as long as source code is available.
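The combination of quadratic performance models and static load balancing described in this abstract can be sketched as follows. This is an illustration under stated assumptions rather than the authors' algorithm: the per-node coefficients `(a, b, c)` are hypothetical, the model takes a single input size, and a simple greedy "largest chunk first" heuristic stands in for whatever balancing scheme the paper uses.

```python
def predicted_runtime(size, coeffs):
    """Quadratic performance model: a*size^2 + b*size + c for one node type."""
    a, b, c = coeffs
    return a * size * size + b * size + c

def balance(chunk_sizes, node_coeffs):
    """Greedy static balancing: place each chunk, largest first, on the node
    that would finish earliest after receiving it."""
    loads = [0.0] * len(node_coeffs)
    assignment = [[] for _ in node_coeffs]
    for size in sorted(chunk_sizes, reverse=True):
        best = min(
            range(len(node_coeffs)),
            key=lambda n: loads[n] + predicted_runtime(size, node_coeffs[n]),
        )
        loads[best] += predicted_runtime(size, node_coeffs[best])
        assignment[best].append(size)
    # The makespan is the predicted runtime of the longest-running node
    return assignment, max(loads)

# Hypothetical heterogeneous cluster: node 1 is twice as slow as node 0.
nodes = [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
assignment, makespan = balance([4, 3, 2, 1], nodes)
print(assignment, makespan)
```

The greedy heuristic does not guarantee an optimal partition, but it shows how a fitted runtime model can drive the placement of work on unequal nodes.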
Affiliation(s)
- Trinity Cheng
  - Center for Devices and Radiological Health, U.S. Food and Drug Administration, Silver Spring, MD 20993 USA
  - Department of Biomedical Engineering, Whiting School of Engineering, Johns Hopkins University, Baltimore, MD 21218 USA
- Pei-Ju Chin
  - Center for Biologics Evaluation and Research, U.S. Food and Drug Administration, Silver Spring, MD 20993 USA
- Kenny Cha
  - Center for Devices and Radiological Health, U.S. Food and Drug Administration, Silver Spring, MD 20993 USA
- Nicholas Petrick
  - Center for Devices and Radiological Health, U.S. Food and Drug Administration, Silver Spring, MD 20993 USA
- Mike Mikailov
  - Center for Devices and Radiological Health, U.S. Food and Drug Administration, Silver Spring, MD 20993 USA
5
Wang L, Li FL, Ma XY, Cang Y, Bai F. PPI-Miner: A Structure and Sequence Motif Co-Driven Protein-Protein Interaction Mining and Modeling Computational Method. J Chem Inf Model 2022; 62:6160-6171. [PMID: 36448715 DOI: 10.1021/acs.jcim.2c01033] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/02/2022]
Abstract
Protein-protein interactions (PPIs) play important roles in biological processes, and predicting PPIs has become a critical scientific issue. Most PPIs occur through small domains or motifs (fragments), which are challenging and laborious to map by standard biochemical approaches because they generally require the cloning of several truncation mutants. Here, we present a computational method, named PPI-Miner, to fish for potential protein interacting partners using protein motifs as queries. In brief, this work first developed a motif-matching algorithm designed to identify proteins that contain motifs sequentially or structurally similar to the given query motif. After alignment to the query motif, the binding mode of the discovered motif and its receptor protein is initially determined and then used to build PPI complexes accordingly. Eventually, a PPI complex structure can be built and optimized with a designed automatic protocol. Besides discovering PPIs, PPI-Miner can also be applied to other areas, e.g., the rational design of molecular glues and protein vaccines. In this work, PPI-Miner was employed to mine potential cereblon (CRBN) substrates from the human proteome. As a result, 1,739 candidates were predicted, and 16 of them have been experimentally validated in previous studies. The source code of PPI-Miner can be obtained from the GitHub repository (https://github.com/Wang-Lin-boop/PPI-Miner), the webserver is freely available for users (https://bailab.siais.shanghaitech.edu.cn/services/ppi-miner), and the database of predicted CRBN substrates is accessible at https://bailab.siais.shanghaitech.edu.cn/services/crbn-subslib.
Affiliation(s)
- Fang Bai
  - Shanghai Clinical Research and Trial Center, Shanghai 201210, China
6
Jain A, Mittal S, Tripathi LP, Nussinov R, Ahmad S. Host-pathogen protein-nucleic acid interactions: A comprehensive review. Comput Struct Biotechnol J 2022; 20:4415-4436. [PMID: 36051878 PMCID: PMC9420432 DOI: 10.1016/j.csbj.2022.08.001] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 03/30/2022] [Revised: 08/01/2022] [Accepted: 08/01/2022] [Indexed: 12/02/2022] Open
Abstract
Recognition of pathogen-derived nucleic acids by host cells is an effective host strategy to detect pathogenic invasion and trigger immune responses. In the context of pathogen-specific pharmacology, there is a growing interest in mapping the interactions between pathogen-derived nucleic acids and host proteins. Insight into the principles of the structural and immunological mechanisms underlying such interactions and their roles in host defense is necessary to guide therapeutic intervention. Here, we discuss the newest advances in studies of molecular interactions involving pathogen nucleic acids and host factors, including their drug design, molecular structure and specific patterns. We observed that two groups of nucleic acid recognizing molecules, Toll-like receptors (TLRs) and the cytoplasmic retinoic acid-inducible gene (RIG)-I-like receptors (RLRs) form the backbone of host responses to pathogen nucleic acids, with additional support provided by absent in melanoma 2 (AIM2) and DNA-dependent activator of Interferons (IFNs)-regulatory factors (DAI) like cytosolic activity. We review the structural, immunological, and other biological aspects of these representative groups of molecules, especially in terms of their target specificity and affinity and challenges in leveraging host-pathogen protein-nucleic acid interactions (HP-PNI) in drug discovery.
Affiliation(s)
- Anuja Jain
  - School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi 110067, India
- Shikha Mittal
  - School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi 110067, India
  - Department of Biotechnology and Bioinformatics, Jaypee University of Information Technology, Waknaghat, Solan, Himachal Pradesh, 173234, India
- Lokesh P. Tripathi
  - National Institutes of Biomedical Innovation, Health and Nutrition, Ibaraki, Osaka, Japan
  - Riken Center for Integrative Medical Sciences, Tsurumi, Yokohama, Kanagawa, Japan
- Ruth Nussinov
  - Computational Structural Biology Section, Basic Science Program, Frederick National Laboratory for Cancer Research, Frederick, MD 21702, USA
  - Department of Human Molecular Genetics and Biochemistry, Sackler School of Medicine, Tel Aviv University, Israel
- Shandar Ahmad
  - School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi 110067, India
7
Wadie B, Kleshchevnikov V, Sandaltzopoulou E, Benz C, Petsalaki E. Use of viral motif mimicry improves the proteome-wide discovery of human linear motifs. Cell Rep 2022; 39:110764. [PMID: 35508127 DOI: 10.1016/j.celrep.2022.110764] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 06/25/2021] [Revised: 02/09/2022] [Accepted: 04/08/2022] [Indexed: 12/16/2022] Open
Abstract
Linear motifs have an integral role in dynamic cell functions, including cell signaling. However, due to their small size, low complexity, and frequent mutations, identifying novel functional motifs poses a challenge. Viruses rely extensively on the molecular mimicry of cellular linear motifs. In this study, we apply systematic motif prediction combined with functional filters to identify human linear motifs convergently evolved also in viral proteins. We observe an increase in the sensitivity of motif prediction and improved enrichment in known instances. We identify >7,300 non-redundant motif instances at various confidence levels, 99 of which are supported by all functional and structural filters. Overall, we provide a pipeline to improve the identification of functional linear motifs from interactomics datasets and a comprehensive catalog of putative human motifs that can contribute to our understanding of the human domain-linear motif code and the associated mechanisms of viral interference.
Affiliation(s)
- Bishoy Wadie
  - European Molecular Biology Laboratory - European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Vitalii Kleshchevnikov
  - European Molecular Biology Laboratory - European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Elissavet Sandaltzopoulou
  - European Molecular Biology Laboratory - European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Caroline Benz
  - Department of Chemistry - BMC, Uppsala University, Uppsala, Sweden
- Evangelia Petsalaki
  - European Molecular Biology Laboratory - European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
8
Hu RS, Hesham AEL, Zou Q. Machine Learning and Its Applications for Protozoal Pathogens and Protozoal Infectious Diseases. Front Cell Infect Microbiol 2022; 12:882995. [PMID: 35573796 PMCID: PMC9097758 DOI: 10.3389/fcimb.2022.882995] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/24/2022] [Accepted: 03/28/2022] [Indexed: 12/24/2022] Open
Abstract
In recent years, massive attention has been attracted to the development and application of machine learning (ML) in the field of infectious diseases, not only serving as a catalyst for academic studies but also as a key means of detecting pathogenic microorganisms, implementing public health surveillance, exploring host-pathogen interactions, discovering drug and vaccine candidates, and so forth. These applications also include the management of infectious diseases caused by protozoal pathogens, such as Plasmodium, Trypanosoma, Toxoplasma, Cryptosporidium, and Giardia, a class of fatal or life-threatening causative agents capable of infecting humans and a wide range of animals. With the reduction of computational cost, availability of effective ML algorithms, popularization of ML tools, and accumulation of high-throughput data, it is possible to implement the integration of ML applications into increasing scientific research related to protozoal infection. Here, we will present a brief overview of important concepts in ML serving as background knowledge, with a focus on basic workflows, popular algorithms (e.g., support vector machine, random forest, and neural networks), feature extraction and selection, and model evaluation metrics. We will then review current ML applications and major advances concerning protozoal pathogens and protozoal infectious diseases through combination with correlative biology expertise and provide forward-looking insights for perspectives and opportunities in future advances in ML techniques in this field.
Affiliation(s)
- Rui-Si Hu
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
- Abd El-Latif Hesham
  - Genetics Department, Faculty of Agriculture, Beni-Suef University, Beni-Suef, Egypt
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
  - *Correspondence: Quan Zou,
9
Abstract
Since the large-scale experimental characterization of protein–protein interactions (PPIs) is not possible for all species, several computational PPI prediction methods have been developed that harness existing data from other species. While PPI network prediction has been extensively used in eukaryotes, microbial network inference has lagged behind. However, bacterial interactomes can be built using the same principles and techniques; in fact, several methods are better suited to bacterial genomes. These predicted networks allow systems-level analyses in species that lack experimental interaction data. This review describes the current network inference and analysis techniques and summarizes the use of computationally-predicted microbial interactomes to date.
10
Chai H, Gu Q, Hughes J, Robertson DL. In silico prediction of HIV-1-host molecular interactions and their directionality. PLoS Comput Biol 2022; 18:e1009720. [PMID: 35134057 PMCID: PMC8856524 DOI: 10.1371/journal.pcbi.1009720] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2021] [Revised: 02/18/2022] [Accepted: 12/03/2021] [Indexed: 11/18/2022] Open
Abstract
Human immunodeficiency virus type 1 (HIV-1) continues to be a major cause of disease and premature death. As with all viruses, HIV-1 exploits a host cell to replicate. Improving our understanding of the molecular interactions between virus and human host proteins is crucial for a mechanistic understanding of virus biology, infection and host antiviral activities. This knowledge will potentially permit the identification of host molecules for targeting by drugs with antiviral properties. Here, we propose a data-driven approach for the analysis and prediction of the HIV-1 interacting proteins (VIPs) with a focus on the directionality of the interaction: host-dependency versus antiviral factors. Using support vector machine learning models and features encompassing genetic, proteomic and network properties, our results reveal some significant differences between the VIPs and non-HIV-1 interacting human proteins (non-VIPs). As assessed by comparison with the HIV-1 infection pathway data in the Reactome database (sensitivity > 90%, threshold = 0.5), we demonstrate these models have good generalization properties. We find that the ‘direction’ of the HIV-1-host molecular interactions is also predictable due to different characteristics of ‘forward’/pro-viral versus ‘backward’/pro-host proteins. Additionally, we infer the previously unknown direction of the interactions between HIV-1 and 1351 human host proteins. A web server for performing predictions is available at http://hivpre.cvr.gla.ac.uk/.
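The evaluation quoted in this abstract (sensitivity at a decision threshold of 0.5) can be made concrete with a short sketch. The function below is a generic illustration, not the authors' evaluation code; the scores and labels are made-up toy values.

```python
def sensitivity_specificity(scores, labels, threshold=0.5):
    """Classify each score as positive when it is >= threshold, then
    compute sensitivity (TP rate) and specificity (TN rate)."""
    tp = fn = tn = fp = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= threshold
        if label and predicted_positive:
            tp += 1
        elif label and not predicted_positive:
            fn += 1
        elif not label and predicted_positive:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity

# Toy example: three true interactors (label 1) and two non-interactors (label 0).
scores = [0.9, 0.6, 0.4, 0.2, 0.7]
labels = [1, 1, 1, 0, 0]
print(sensitivity_specificity(scores, labels))
```

Lowering the threshold raises sensitivity at the cost of specificity, which is why a reported sensitivity is only meaningful together with the threshold used.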
Affiliation(s)
- Haiting Chai
  - MRC-University of Glasgow Centre for Virus Research, Glasgow, United Kingdom
- Quan Gu
  - MRC-University of Glasgow Centre for Virus Research, Glasgow, United Kingdom
- Joseph Hughes
  - MRC-University of Glasgow Centre for Virus Research, Glasgow, United Kingdom
- David L. Robertson
  - MRC-University of Glasgow Centre for Virus Research, Glasgow, United Kingdom
  - * E-mail:
11
Dynamic, but Not Necessarily Disordered, Human-Virus Interactions Mediated through SLiMs in Viral Proteins. Viruses 2021; 13:2369. [PMID: 34960638 PMCID: PMC8703344 DOI: 10.3390/v13122369] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 10/31/2021] [Revised: 11/15/2021] [Accepted: 11/16/2021] [Indexed: 12/13/2022] Open
Abstract
Most viruses have small genomes that encode proteins needed to perform essential enzymatic functions. Across virus families, primary enzyme functions are under functional constraint; however, secondary functions mediated by exposed protein surfaces that promote interactions with the host proteins may be less constrained. Viruses often form transient interactions with host proteins through conformationally flexible interfaces. Exposed flexible amino acid residues are known to evolve rapidly suggesting that secondary functions may generate diverse interaction potentials between viruses within the same viral family. One mechanism of interaction is viral mimicry through short linear motifs (SLiMs) that act as functional signatures in host proteins. Viral SLiMs display specific patterns of adjacent amino acids that resemble their host SLiMs and may occur by chance numerous times in viral proteins due to mutational and selective processes. Through mimicry of SLiMs in the host cell proteome, viruses can interfere with the protein interaction network of the host and utilize the host-cell machinery to their benefit. The overlap between rapidly evolving protein regions and the location of functionally critical SLiMs suggest that these motifs and their functional potential may be rapidly rewired causing variation in pathogenicity, infectivity, and virulence of related viruses. The following review provides an overview of known viral SLiMs with select examples of their role in the life cycle of a virus, and a discussion of the structural properties of experimentally validated SLiMs highlighting that a large portion of known viral SLiMs are devoid of predicted intrinsic disorder based on the viral SLiMs from the ELM database.
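The point that short motif patterns can occur by chance many times is easy to illustrate with a regular-expression scan. The sketch below is a minimal example under stated assumptions: the patterns are illustrative, loosely modelled on motif-style regular expressions rather than copied from the ELM database, the protein sequence is invented, and the chance estimate assumes uniform use of the 20 amino acids.

```python
import re

def find_motif(sequence, pattern):
    """Return (1-based start, matched text) for every match, including
    overlapping ones, by wrapping the pattern in a lookahead."""
    lookahead = re.compile(f"(?=({pattern}))")
    return [(m.start() + 1, m.group(1)) for m in lookahead.finditer(sequence)]

def chance_hits(allowed_per_position, length, alphabet_size=20):
    """Expected number of matches in a random sequence of the given length,
    assuming every residue is equally likely at every position."""
    p = 1.0
    for allowed in allowed_per_position:
        p *= allowed / alphabet_size
    windows = max(length - len(allowed_per_position) + 1, 0)
    return p * windows

# Hypothetical viral protein fragment scanned for an illustrative P[TS]AP pattern
seq = "MKPTAPSSVPSAPQ"
print(find_motif(seq, "P[TS]AP"))

# A specific pattern is rare by chance; a degenerate one is not.
print(chance_hits([1, 2, 1, 1], 300))   # P[TS]AP in a 300-residue protein
print(chance_hits([2, 20, 6], 300))     # [ST].[ACVILF] in a 300-residue protein
```

Under these assumptions the degenerate three-residue pattern is expected to match several times in a single 300-residue protein, which is why functional filters are needed on top of pattern matching.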
12
Karabulut OC, Karpuzcu BA, Türk E, Ibrahim AH, Süzek BE. ML-AdVInfect: A Machine-Learning Based Adenoviral Infection Predictor. Front Mol Biosci 2021; 8:647424. [PMID: 34026828 PMCID: PMC8139618 DOI: 10.3389/fmolb.2021.647424] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 12/29/2020] [Accepted: 04/22/2021] [Indexed: 01/08/2023] Open
Abstract
Adenoviruses (AdVs) constitute a diverse family with many pathogenic types that infect a broad range of hosts. Understanding the pathogenesis of adenoviral infections is not only clinically relevant but also important to elucidate the potential use of AdVs as vectors in therapeutic applications. For an adenoviral infection to occur, attachment of the viral ligand to a cellular receptor on the host organism is a prerequisite and, in this sense, it is a criterion to decide whether an adenoviral infection can potentially happen. The interaction between any virus and its corresponding host organism is a specific kind of protein-protein interaction (PPI) and several experimental techniques, including high-throughput methods are being used in exploring such interactions. As a result, there has been accumulating data on virus-host interactions including a significant portion reported at publicly available bioinformatics resources. There is not, however, a computational model to integrate and interpret the existing data to draw out concise decisions, such as whether an infection happens or not. In this study, accepting the cellular entry of AdV as a decisive parameter for infectivity, we have developed a machine learning, more precisely support vector machine (SVM), based methodology to predict whether adenoviral infection can take place in a given host. For this purpose, we used the sequence data of the known receptors of AdVs, we identified sets of adenoviral ligands and their respective host species, and eventually, we have constructed a comprehensive adenovirus–host interaction dataset. Then, we committed interaction predictions through publicly available virus-host PPI tools and constructed an AdV infection predictor model using SVM with RBF kernel, with the overall sensitivity, specificity, and AUC of 0.88 ± 0.011, 0.83 ± 0.064, and 0.86 ± 0.030, respectively. 
ML-AdVInfect is the first of its kind as an effective predictor to screen infection capacity and to anticipate cross-species shifts. We anticipate that the approach that led to ML-AdVInfect can be adapted to make predictions for other viral infections.
Affiliation(s)
- Onur Can Karabulut
  - Bioinformatics Graduate Program, Graduate School of Natural and Applied Sciences, Muğla Sıtkı Koçman University, Muğla, Turkey
- Betül Asiye Karpuzcu
  - Bioinformatics Graduate Program, Graduate School of Natural and Applied Sciences, Muğla Sıtkı Koçman University, Muğla, Turkey
- Erdem Türk
  - Department of Computer Engineering, Faculty of Engineering, Muğla Sıtkı Koçman University, Muğla, Turkey
- Ahmad Hassan Ibrahim
  - Department of Computer Engineering, Faculty of Engineering, Muğla Sıtkı Koçman University, Muğla, Turkey
- Barış Ethem Süzek
  - Department of Computer Engineering, Faculty of Engineering, Muğla Sıtkı Koçman University, Muğla, Turkey
  - Georgetown University Medical Center, Biochemistry and Molecular and Cellular Biology, Washington, DC, United States
13
Lo Cascio E, Toto A, Babini G, De Maio F, Sanguinetti M, Mordente A, Della Longa S, Arcovito A. Structural determinants driving the binding process between PDZ domain of wild type human PALS1 protein and SLiM sequences of SARS-CoV E proteins. Comput Struct Biotechnol J 2021; 19:1838-1847. [PMID: 33758649 PMCID: PMC7970798 DOI: 10.1016/j.csbj.2021.03.014] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 01/03/2021] [Revised: 03/13/2021] [Accepted: 03/13/2021] [Indexed: 12/21/2022] Open
Abstract
Short Linear Motifs (SLiMs) are functional protein microdomains that typically mediate interactions between a short linear region in one protein and a globular domain in another. Surface Plasmon Resonance assays have been performed to determine the binding affinity between the PDZ domain of wild type human PALS1 protein and tetradecapeptides representing the SLiM sequences of SARS-CoV-1 and SARS-CoV-2 E proteins (E-SLiMs). SARS-CoV-2 E-SLiM binds to the human target protein with a higher affinity compared to SARS-CoV-1, showing a difference significantly greater than previously reported using the F318W mutant of PALS1 protein and shorter target peptides. Moreover, molecular dynamics simulations have provided clear evidence of the structural determinants driving this binding process. Specifically, the Arginine 69 residue in the SARS-CoV-2 E-SLiM is the key residue able to both enhance the specific polar interaction with negatively charged pockets of the PALS1 PDZ domain and significantly reduce the mobility of the viral peptide. These experimental and computational data are reinforced by the comparison of the interaction between the PALS1 PDZ domain and the natural ligand CRB1, as well as the corresponding E-SLiMs of other coronavirus members such as MERS and OC43. Our results provide a model at the molecular level of the strategies used to mimic the endogenous SLiM peptide in the binding of the tight junctions of the host cell, explaining one of the possible reasons for the severity of the infection and pulmonary inflammation by SARS-CoV-2.
Affiliation(s)
- Ettore Lo Cascio
- Dipartimento di Scienze Biotecnologiche di Base, Cliniche Intensivologiche e Perioperatorie, Università Cattolica del Sacro Cuore, Largo Francesco Vito 1, 00168 Roma, Italy
- Angelo Toto
- Istituto Pasteur-Fondazione Cenci Bolognetti, Dipartimento di Scienze Biochimiche "A. Rossi Fanelli" and Istituto di Biologia e Patologia Molecolari del CNR, Sapienza Università di Roma, 00185 Rome, Italy
- Gabriele Babini
- Dipartimento di Scienze della Salute della Donna, del Bambino e di Sanità Pubblica, Fondazione Policlinico Universitario "A. Gemelli", IRCCS, Largo A. Gemelli 8, 00168 Roma, Italy
- Flavio De Maio
- Dipartimento di Scienze Biotecnologiche di Base, Cliniche Intensivologiche e Perioperatorie, Università Cattolica del Sacro Cuore, Largo Francesco Vito 1, 00168 Roma, Italy; Dipartimento di Scienze di Laboratorio e Infettivologiche, Fondazione Policlinico Universitario "A. Gemelli", IRCCS, Largo A. Gemelli 8, 00168 Roma, Italy
- Maurizio Sanguinetti
- Dipartimento di Scienze Biotecnologiche di Base, Cliniche Intensivologiche e Perioperatorie, Università Cattolica del Sacro Cuore, Largo Francesco Vito 1, 00168 Roma, Italy; Dipartimento di Scienze di Laboratorio e Infettivologiche, Fondazione Policlinico Universitario "A. Gemelli", IRCCS, Largo A. Gemelli 8, 00168 Roma, Italy
- Alvaro Mordente
- Dipartimento di Scienze Biotecnologiche di Base, Cliniche Intensivologiche e Perioperatorie, Università Cattolica del Sacro Cuore, Largo Francesco Vito 1, 00168 Roma, Italy; Dipartimento di Scienze di Laboratorio e Infettivologiche, Fondazione Policlinico Universitario "A. Gemelli", IRCCS, Largo A. Gemelli 8, 00168 Roma, Italy
- Stefano Della Longa
- Department of Life, Health and Environmental Sciences, University of L'Aquila, 67100 L'Aquila, Italy
- Alessandro Arcovito
- Dipartimento di Scienze Biotecnologiche di Base, Cliniche Intensivologiche e Perioperatorie, Università Cattolica del Sacro Cuore, Largo Francesco Vito 1, 00168 Roma, Italy; Fondazione Policlinico Universitario "A. Gemelli", IRCCS, Largo A. Gemelli 8, 00168 Roma, Italy

14
Lian X, Yang X, Yang S, Zhang Z. Current status and future perspectives of computational studies on human-virus protein-protein interactions. Brief Bioinform 2021; 22:6161422. [PMID: 33693490 DOI: 10.1093/bib/bbab029] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 11/10/2020] [Revised: 01/14/2021] [Accepted: 01/20/2021] [Indexed: 12/19/2022] Open
Abstract
The protein-protein interactions (PPIs) between human and viruses mediate viral infection and host immunity processes. Therefore, the study of human-virus PPIs can help us understand the principles of human-virus relationships and can thus guide the development of highly effective drugs to break the transmission of viral infectious diseases. Recent years have witnessed the rapid accumulation of experimentally identified human-virus PPI data, which provides an unprecedented opportunity for bioinformatics studies revolving around human-virus PPIs. In this article, we provide a comprehensive overview of computational studies on human-virus PPIs, especially focusing on the method development for human-virus PPI predictions. We briefly introduce the experimental detection methods and existing database resources of human-virus PPIs, and then discuss the research progress in the development of computational prediction methods. In particular, we elaborate the machine learning-based prediction methods and highlight the need to embrace state-of-the-art deep-learning algorithms and new feature engineering techniques (e.g. the protein embedding technique derived from natural language processing). To further advance the understanding in this research topic, we also outline the practical applications of the human-virus interactome in fundamental biological discovery and new antiviral therapy development.
Affiliation(s)
- Xianyi Lian
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Xiaodi Yang
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Shiping Yang
- State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Ziding Zhang
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China

15
Martínez YA, Guo X, Portales-Pérez DP, Rivera G, Castañeda-Delgado JE, García-Pérez CA, Enciso-Moreno JA, Lara-Ramírez EE. The analysis on the human protein domain targets and host-like interacting motifs for the MERS-CoV and SARS-CoV/CoV-2 infers the molecular mimicry of coronavirus. PLoS One 2021; 16:e0246901. [PMID: 33596252 PMCID: PMC7888644 DOI: 10.1371/journal.pone.0246901] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 11/30/2020] [Accepted: 01/28/2021] [Indexed: 12/14/2022] Open
Abstract
MERS-CoV, SARS-CoV, and SARS-CoV-2 are highly pathogenic viruses that can cause severe pneumonic diseases in humans. Unfortunately, no effective treatment is available to combat these viruses. Domain-motif interactions (DMIs) are an essential means by which viruses mimic and hijack the biological processes of host cells. Disentangling how viruses achieve this can help to develop new rational therapies. Data mining was performed to obtain DMIs stored as regular expressions (regexp) in the 3DID and ELM databases. The mined regexp information was mapped onto the coronaviruses' proteomes. Most motifs on viral proteins that could interact with human proteins are shared across the coronavirus species, indicating that molecular mimicry is a common strategy for coronavirus infection. Enrichment ontology analysis for protein domains showed shared biological process and molecular function terms related to carbon source utilization and potassium channel regulation. Some of the mapped motifs were nested in B and T cell epitopes, suggesting that they could offer an alternative route for reverse vaccinology. The information obtained in this study could be used for further theoretical and experimental explorations of the coronavirus infection mechanism and the development of medicines for treatment.
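The mapping step described in this abstract, scanning proteomes with motifs stored as regular expressions, can be illustrated with a minimal sketch. The motif names, patterns, and sequences below are hypothetical placeholders, not entries taken from ELM or 3DID, and the function `scan_motifs` is an assumed helper rather than the authors' pipeline.

```python
import re

# Hypothetical motif table: names and patterns are illustrative only,
# not copied from the ELM or 3DID databases.
MOTIFS = {
    "PDZ_class1_like": r"[ST].[ACVILF]$",  # C-terminal motif, anchored at the sequence end
    "SH3_PxxP_like": r"P..P",
}


def scan_motifs(proteome, motifs):
    """Return (protein_id, motif_name, 1-based start, matched text) for every hit."""
    hits = []
    for protein_id, sequence in proteome.items():
        for name, pattern in motifs.items():
            # Wrapping the pattern in a lookahead lets finditer report
            # overlapping occurrences as well.
            for match in re.finditer(f"(?=({pattern}))", sequence):
                hits.append((protein_id, name, match.start() + 1, match.group(1)))
    return hits


# Toy sequences, made up for the example.
toy_proteome = {"protA": "MKPAAPLLDTSV", "protB": "MGGSPPLPQRK"}
hits = scan_motifs(toy_proteome, MOTIFS)
for hit in hits:
    print(hit)
```

Because short degenerate patterns match often by chance, hits from a scan like this are candidates only and would normally be filtered further, for example by disorder or conservation.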
Affiliation(s)
- Yamelie A. Martínez
- Unidad de Investigación Biomédica de Zacatecas, Instituto Mexicano Del Seguro Social, Zacatecas, México
- Laboratorio de Inmunología y Biología Celular y Molecular, Facultad de Ciencias Químicas, Universidad Autónoma de San Luis Potosí, San Luis Potosí, México
- Xianwu Guo
- Laboratorio de Biotecnología Genómica, Centro de Biotecnología Genómica, Instituto Politécnico Nacional, Reynosa, México
- Diana P. Portales-Pérez
- Laboratorio de Inmunología y Biología Celular y Molecular, Facultad de Ciencias Químicas, Universidad Autónoma de San Luis Potosí, San Luis Potosí, México
- Gildardo Rivera
- Laboratorio de Biotecnología Farmacéutica, Centro de Biotecnología Genómica, Instituto Politécnico Nacional, Reynosa, México
- Julio E. Castañeda-Delgado
- Unidad de Investigación Biomédica de Zacatecas, Instituto Mexicano Del Seguro Social, Zacatecas, México
- Cátedras-CONACYT, Unidad de Investigación Biomédica de Zacatecas, Instituto Mexicano Del Seguro Social, Zacatecas, México
- Carlos A. García-Pérez
- Information and Communication Technology Department (ICT), Complex Systems, Helmholtz Zentrum München, Neuherberg, Germany
- José A. Enciso-Moreno
- Unidad de Investigación Biomédica de Zacatecas, Instituto Mexicano Del Seguro Social, Zacatecas, México
- Edgar E. Lara-Ramírez
- Unidad de Investigación Biomédica de Zacatecas, Instituto Mexicano Del Seguro Social, Zacatecas, México

16
Yang CW, Shi ZL. Uncovering potential host proteins and pathways that may interact with eukaryotic short linear motifs in viral proteins of MERS, SARS and SARS2 coronaviruses that infect humans. PLoS One 2021; 16:e0246150. [PMID: 33534852 PMCID: PMC7857568 DOI: 10.1371/journal.pone.0246150] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 03/24/2020] [Accepted: 01/14/2021] [Indexed: 12/30/2022] Open
Abstract
A coronavirus pandemic caused by a novel coronavirus (SARS-CoV-2) has spread rapidly worldwide since December 2019. Improved understanding and new strategies to cope with novel coronaviruses are urgently needed. Viruses (especially RNA viruses) encode a limited number and size (length of polypeptide chain) of viral proteins and must interact with the host cell components to control (hijack) the host cell machinery. To achieve this goal, the extensive mimicry of SLiMs in host proteins provides an effective strategy. However, little is known regarding SLiMs in coronavirus proteins and their potential targets in host cells. The objective of this study is to uncover SLiMs in coronavirus proteins that are present within host cells. These SLiMs have a high possibility of interacting with host intracellular proteins and hijacking the host cell machinery for virus replication and dissemination. In total, 1,479 SLiM hits were identified in the 16 proteins of 590 coronaviruses infecting humans. Overall, 106 host proteins were identified that may interact with SLiMs in 16 coronavirus proteins. These SLiM-interacting proteins are composed of many intracellular key regulators, such as receptors, transcription factors and kinases, and may have important contributions to virus replication, immune evasion and viral pathogenesis. A total of 209 pathways containing proteins that may interact with SLiMs in coronavirus proteins were identified. This study uncovers potential mechanisms by which coronaviruses hijack the host cell machinery. These results provide potential therapeutic targets for viral infections.
Affiliation(s)
- Chu-Wen Yang
- Department of Microbiology, Center for Applied Artificial Intelligence Research, Soochow University, Taipei, Taiwan
- Zhi-Ling Shi
- Ocean School of Fuzhou University, Fuzhou University, Fuzhou, China

17
Acharya D, Dutta TK. Elucidating the network features and evolutionary attributes of intra- and interspecific protein-protein interactions between human and pathogenic bacteria. Sci Rep 2021; 11:190. [PMID: 33420198 PMCID: PMC7794237 DOI: 10.1038/s41598-020-80549-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/24/2020] [Accepted: 12/09/2020] [Indexed: 01/08/2023] Open
Abstract
Host–pathogen interaction is one of the most powerful determinants involved in coevolutionary processes covering a broad range of biological phenomena at molecular, cellular, organismal and/or population level. The present study explored host–pathogen interaction from the perspective of human–bacteria protein–protein interaction based on large-scale interspecific and intraspecific interactome data for human and three pathogenic bacterial species, Bacillus anthracis, Francisella tularensis and Yersinia pestis. The network features revealed a preferential enrichment of intraspecific hubs and bottlenecks for both human and bacterial pathogens in the interspecific human–bacteria interaction. Analyses unveiled that these bacterial pathogens interact mostly with human party-hubs that may enable them to affect desired functional modules, leading to pathogenesis. Structural features of pathogen-interacting human proteins indicated an abundance of protein domains, providing opportunities for interspecific domain-domain interactions. Moreover, these interactions do not always occur with high-affinity, as we observed that bacteria-interacting human proteins are rich in protein-disorder content, which correlates positively with the number of interacting pathogen proteins, facilitating low-affinity interspecific interactions. Furthermore, functional analyses of pathogen-interacting human proteins revealed an enrichment in regulation of processes like metabolism, immune system, cellular localization and transport apart from divulging functional competence to bind enzyme/protein, nucleic acids and cell adhesion molecules, necessary for host-microbial cross-talk.
Affiliation(s)
- Debarun Acharya
- Department of Microbiology, Bose Institute, P-1/12, CIT Scheme VII M, Kolkata, West Bengal, 700 054, India
- Tapan K Dutta
- Department of Microbiology, Bose Institute, P-1/12, CIT Scheme VII M, Kolkata, West Bengal, 700 054, India

18
Kumar N, Mishra B, Mehmood A, Athar M, Mukhtar MS. Integrative Network Biology Framework Elucidates Molecular Mechanisms of SARS-CoV-2 Pathogenesis. iScience 2020; 23:101526. [PMID: 32895641 PMCID: PMC7468341 DOI: 10.1016/j.isci.2020.101526] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Received: 04/15/2020] [Revised: 07/30/2020] [Accepted: 08/31/2020] [Indexed: 02/06/2023] Open
Abstract
COVID-19 (coronavirus disease 2019) is a respiratory illness caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Although the pathophysiology of this virus is complex and largely unknown, we employed a network-biology-fueled approach and integrated transcriptome data pertaining to lung epithelial cells with human interactome to generate Calu-3-specific human-SARS-CoV-2 interactome (CSI). Topological clustering and pathway enrichment analysis show that SARS-CoV-2 targets central nodes of the host-viral network, which participate in core functional pathways. Network centrality analyses discover 33 high-value SARS-CoV-2 targets, which are possibly involved in viral entry, proliferation, and survival to establish infection and facilitate disease progression. Our probabilistic modeling framework elucidates critical regulatory circuitry and molecular events pertinent to COVID-19, particularly the host-modifying responses and cytokine storm. Overall, our network-centric analyses reveal novel molecular components, uncover structural and functional modules, and provide molecular insights into the pathogenicity of SARS-CoV-2 that may help foster effective therapeutic design.
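Network centrality analysis of the kind mentioned in this abstract ranks nodes by how connected or how central they are in an interaction graph. The sketch below computes plain degree centrality on a made-up edge list; the node names and the choice of degree centrality are assumptions for illustration, since the abstract does not state which centrality measures were used.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Degree centrality of each node in an undirected simple graph.

    Defined as (number of distinct neighbours) / (n - 1), where n is the
    number of nodes that appear in at least one non-loop edge.
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            neighbours[a].add(b)
            neighbours[b].add(a)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}


# Hypothetical host-virus edge list, made up for the example.
toy_edges = [
    ("viral_X", "host_A"),
    ("viral_X", "host_B"),
    ("host_A", "host_B"),
    ("host_B", "host_C"),
]
scores = degree_centrality(toy_edges)
# Rank by descending score, breaking ties alphabetically for reproducibility.
ranked = sorted(scores, key=lambda node: (-scores[node], node))
print(ranked)
```

In practice several measures (degree, betweenness, closeness) are often combined, and a graph library would replace this hand-rolled function.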
Affiliation(s)
- Nilesh Kumar
- Department of Biology, University of Alabama at Birmingham, 464 Campbell Hall, 1300 University Boulevard, AL 35294, USA
- Bharat Mishra
- Department of Biology, University of Alabama at Birmingham, 464 Campbell Hall, 1300 University Boulevard, AL 35294, USA
- Adeel Mehmood
- Department of Biology, University of Alabama at Birmingham, 464 Campbell Hall, 1300 University Boulevard, AL 35294, USA; Department of Computer Science, University of Alabama at Birmingham, 1402 10th Avenue S., Birmingham, AL 35294, USA
- Mohammad Athar
- Department of Dermatology, School of Medicine, University of Alabama at Birmingham, 1720 University Boulevard, AL 35294, USA
- M Shahid Mukhtar
- Department of Biology, University of Alabama at Birmingham, 464 Campbell Hall, 1300 University Boulevard, AL 35294, USA; Nutrition Obesity Research Center, University of Alabama at Birmingham, 1675 University Boulevard, Birmingham, AL 35294, USA; Department of Surgery, University of Alabama at Birmingham, 1808 7th Avenue S, Birmingham, AL 35294, USA

19
Young F, Rogers S, Robertson DL. Predicting host taxonomic information from viral genomes: A comparison of feature representations. PLoS Comput Biol 2020; 16:e1007894. [PMID: 32453718 PMCID: PMC7307784 DOI: 10.1371/journal.pcbi.1007894] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 09/03/2019] [Revised: 06/22/2020] [Accepted: 04/21/2020] [Indexed: 12/13/2022] Open
Abstract
The rise in metagenomics has led to an exponential growth in virus discovery. However, the majority of these new virus sequences have no assigned host. Current machine learning approaches to predicting virus host interactions have a tendency to focus on nucleotide features, ignoring other representations of genomic information. Here we investigate the predictive potential of features generated from four different ‘levels’ of viral genome representation: nucleotide, amino acid, amino acid properties and protein domains. This more fully exploits the biological information present in the virus genomes. Over a hundred and eighty binary datasets for infecting versus non-infecting viruses at all taxonomic ranks of both eukaryote and prokaryote hosts were compiled. The viral genomes were converted into the four different levels of genome representation and twenty feature sets were generated by extracting k-mer compositions and predicted protein domains. We trained and tested Support Vector Machine, SVM, classifiers to compare the predictive capacity of each of these feature sets for each dataset. Our results show that all levels of genome representation are consistently predictive of host taxonomy and that prediction k-mer composition improves with increasing k-mer length for all k-mer based features. Using a phylogenetically aware holdout method, we demonstrate that the predictive feature sets contain signals reflecting both the evolutionary relationship between the viruses infecting related hosts, and host-mimicry. Our results demonstrate that incorporating a range of complementary features, generated purely from virus genome sequences, leads to improved accuracy for a range of virus host prediction tasks enabling computational assignment of host taxonomic information. 
Elucidating the host of a newly identified virus species is an important challenge, with applications from knowing the source species of a newly emerged pathogen to understanding the bacteriophage-host relationships within the microbiome of any of earth’s ecosystems. Current high throughput methods used to identify viruses within biological or environmental samples have resulted in an unprecedented increase in virus discovery. However, for the majority of these virus genomes the host species/taxonomic classification remains unknown. To address this gap in our knowledge there is a need for fast, accurate computational methods for the assignment of putative host taxonomic information. Machine learning is an ideal approach but to maximise predictive accuracy the viral genomes need to be represented in a format (sets of features) that makes the discriminative information available to the machine learning algorithm. Here, we compare different types of features derived from the same viral genomes for their ability to predict host information. Our results demonstrate that all these feature sets are predictive of host taxonomy and when combined have the potential to improve accuracy over the use of individual feature sets across many virus host prediction applications.
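The k-mer composition features mentioned in this abstract turn a sequence into a fixed-length vector of k-mer frequencies that a classifier such as an SVM can consume. The following is a minimal sketch for nucleotide sequences; the function name `kmer_composition`, the normalisation to relative frequencies, and the handling of ambiguous bases are assumptions for illustration and may differ from the authors' feature extraction.

```python
from collections import Counter
from itertools import product


def kmer_composition(sequence, k, alphabet="ACGT"):
    """Return (kmers, frequencies): relative k-mer frequencies in a fixed order.

    K-mers containing symbols outside the alphabet (for example N) are
    excluded from both the counts and the normalising total.
    """
    sequence = sequence.upper()
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    total = sum(counts[kmer] for kmer in kmers)
    if total == 0:
        return kmers, [0.0] * len(kmers)
    return kmers, [counts[kmer] / total for kmer in kmers]


# Toy sequence, made up for the example; the trailing N is an ambiguous base.
kmers, vector = kmer_composition("ACGTACGN", k=2)
print(dict(zip(kmers, vector))["AC"])
```

The vector length grows as `len(alphabet) ** k`, so longer k-mers give sparser feature vectors, which is one practical constraint on increasing k.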
Affiliation(s)
- Francesca Young
- MRC-University of Glasgow Centre For Virus Research, Glasgow, United Kingdom
- Simon Rogers
- School of Computing Science, University of Glasgow, Glasgow, United Kingdom
- David L. Robertson
- MRC-University of Glasgow Centre For Virus Research, Glasgow, United Kingdom

20
Guven-Maiorov E, Hakouz A, Valjevac S, Keskin O, Tsai CJ, Gursoy A, Nussinov R. HMI-PRED: A Web Server for Structural Prediction of Host-Microbe Interactions Based on Interface Mimicry. J Mol Biol 2020; 432:3395-3403. [PMID: 32061934 PMCID: PMC7261632 DOI: 10.1016/j.jmb.2020.01.025] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 07/31/2019] [Revised: 11/28/2019] [Accepted: 01/14/2020] [Indexed: 02/07/2023]
Abstract
Microbes, commensals, and pathogens, control the numerous functions in the host cells. They can alter host signaling and modulate immune surveillance by interacting with the host proteins. For shedding light on the contribution of microbes to health and disease, it is vital to discern how microbial proteins rewire host signaling and through which host proteins they do this. Host-Microbe Interaction PREDictor (HMI-PRED) is a user-friendly web server for structural prediction of protein-protein interactions (PPIs) between the host and a microbial species, including bacteria, viruses, fungi, and protozoa. HMI-PRED relies on "interface mimicry" through which the microbial proteins hijack host binding surfaces. Given the structure of a microbial protein of interest, HMI-PRED will return structural models of potential host-microbe interaction (HMI) complexes, the list of host endogenous and exogenous PPIs that can be disrupted, and tissue expression of the microbe-targeted host proteins. The server also allows users to upload homology models of microbial proteins. Broadly, it aims at large-scale, efficient identification of HMIs. The prediction results are stored in a repository for community access. HMI-PRED is free and available at https://interactome.ku.edu.tr/hmi.
Affiliation(s)
- Emine Guven-Maiorov
- Computational Structural Biology Section, Basic Science Program, Frederick National Laboratory for Cancer Research, Frederick, MD, 21702, USA
- Asma Hakouz
- Department of Computer Engineering, Koc University, Istanbul, 34450, Turkey
- Sukejna Valjevac
- Department of Computer Engineering, Koc University, Istanbul, 34450, Turkey
- Ozlem Keskin
- Department of Chemical and Biological Engineering, Koc University, Istanbul, 34450, Turkey
- Chung-Jung Tsai
- Computational Structural Biology Section, Basic Science Program, Frederick National Laboratory for Cancer Research, Frederick, MD, 21702, USA
- Attila Gursoy
- Department of Computer Engineering, Koc University, Istanbul, 34450, Turkey
- Ruth Nussinov
- Computational Structural Biology Section, Basic Science Program, Frederick National Laboratory for Cancer Research, Frederick, MD, 21702, USA; Sackler Inst. of Molecular Medicine, Department of Human Genetics and Molecular Medicine, Sackler School of Medicine, Tel Aviv University, Tel Aviv, 69978, Israel

21
Kumar N, Mishra B, Mehmood A, Athar M, Mukhtar MS. Integrative Network Biology Framework Elucidates Molecular Mechanisms of SARS-CoV-2 Pathogenesis. SSRN 2020:3581857. [PMID: 32714115 PMCID: PMC7366800 DOI: 10.2139/ssrn.3581857] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 04/21/2020] [Revised: 05/07/2020] [Indexed: 01/02/2023]
Abstract
COVID-19 (Coronavirus disease 2019) is a respiratory illness caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). While the pathophysiology of this deadly virus is complex and largely unknown, we employ a network biology-fueled approach and integrate multiomics data pertaining to lung epithelial cells-specific co-expression network and human interactome to generate Calu-3-specific human-SARS-CoV-2 Interactome (CSI). Topological clustering and pathway enrichment analysis show that SARS-CoV-2 target central nodes of host-viral network that participate in core functional pathways. Network centrality analyses discover 28 high-value SARS-CoV-2 targets, which are possibly involved in viral entry, proliferation and survival to establish infection and facilitate disease progression. Our probabilistic modeling framework elucidates critical regulatory circuitry and molecular events pertinent to COVID-19, particularly the host modifying responses and cytokine storm. Overall, our network centric analyses reveal novel molecular components, uncover structural and functional modules, and provide molecular insights into SARS-CoV-2 pathogenicity that may foster effective therapeutic design. Funding: This work was supported by the National Science Foundation (IOS-1557796) to M.S.M., and U54 ES 030246 from NIH/NIEHS to M. A. Conflict of Interest: The authors declare no competing interests. The authors also declare no financial interests.
Affiliation(s)
- Nilesh Kumar
- Department of Biology, 464 Campbell Hall, 1300 University Boulevard, University of Alabama at Birmingham, Alabama 35294, USA
- Bharat Mishra
- Department of Biology, 464 Campbell Hall, 1300 University Boulevard, University of Alabama at Birmingham, Alabama 35294, USA
- Adeel Mehmood
- Department of Biology, 464 Campbell Hall, 1300 University Boulevard, University of Alabama at Birmingham, Alabama 35294, USA
- Department of Computer Science, University of Alabama at Birmingham, 1402 10th Ave. S., Birmingham, AL 35294, USA
- Mohammad Athar
- Department of Dermatology, School of Medicine, University of Alabama at Birmingham, Alabama 35294, USA
- M. Shahid Mukhtar
- Department of Biology, 464 Campbell Hall, 1300 University Boulevard, University of Alabama at Birmingham, Alabama 35294, USA
- Nutrition Obesity Research Center, 1675 University Blvd, University of Alabama at Birmingham, Birmingham, AL 35294, USA
- Department of Surgery, 1808 7th Ave S, University of Alabama at Birmingham, Birmingham, AL 35294, USA

22
Dick K, Samanfar B, Barnes B, Cober ER, Mimee B, Tan LH, Molnar SJ, Biggar KK, Golshani A, Dehne F, Green JR. PIPE4: Fast PPI Predictor for Comprehensive Inter- and Cross-Species Interactomes. Sci Rep 2020; 10:1390. [PMID: 31996697 PMCID: PMC6989690 DOI: 10.1038/s41598-019-56895-w] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 07/24/2019] [Accepted: 12/13/2019] [Indexed: 02/06/2023] Open
Abstract
The need for larger-scale and increasingly complex protein-protein interaction (PPI) prediction tasks demands that state-of-the-art predictors be highly efficient and adapted to inter- and cross-species predictions. Furthermore, the ability to generate comprehensive interactomes has enabled the appraisal of each PPI in the context of all predictions leading to further improvements in classification performance in the face of extreme class imbalance using the Reciprocal Perspective (RP) framework. We here describe the PIPE4 algorithm. Adaptation of the PIPE3/MP-PIPE sequence preprocessing step led to upwards of 50x speedup and the new Similarity Weighted Score appropriately normalizes for window frequency when applied to any inter- and cross-species prediction schemas. Comprehensive interactomes for three prediction schemas are generated: (1) cross-species predictions, where Arabidopsis thaliana is used as a proxy to predict the comprehensive Glycine max interactome, (2) inter-species predictions between Homo sapiens-HIV1, and (3) a combined schema involving both cross- and inter-species predictions, where both Arabidopsis thaliana and Caenorhabditis elegans are used as proxy species to predict the interactome between Glycine max (the soybean legume) and Heterodera glycines (the soybean cyst nematode). Comparing PIPE4 with the state-of-the-art resulted in improved performance, indicative that it should be the method of choice for complex PPI prediction schemas.
Affiliation(s)
- Kevin Dick
- Department of Systems and Computer Engineering, Carleton University, Ottawa, Ontario, K1S 5B6, Canada
- Bahram Samanfar
- Agriculture and Agri-Food Canada, Ottawa Research and Development Centre, Ottawa, Ontario, K1A 0C6, Canada
- Department of Biology, Carleton University, Ottawa, K1S 5B6, Ontario, Canada
- Bradley Barnes
- Department of Systems and Computer Engineering, Carleton University, Ottawa, Ontario, K1S 5B6, Canada
- Elroy R Cober
- Agriculture and Agri-Food Canada, Ottawa Research and Development Centre, Ottawa, Ontario, K1A 0C6, Canada
- Benjamin Mimee
- Agriculture and Agri-Food Canada, Saint-Jean-sur-Richelieu Research and Development Centre, Saint-Jean-sur-Richelieu, J3B 3E6, Quebec, Canada
- Le Hoa Tan
- Agriculture and Agri-Food Canada, Ottawa Research and Development Centre, Ottawa, Ontario, K1A 0C6, Canada
- Stephen J Molnar
- Agriculture and Agri-Food Canada, Ottawa Research and Development Centre, Ottawa, Ontario, K1A 0C6, Canada
- Kyle K Biggar
- Department of Biology, Carleton University, Ottawa, K1S 5B6, Ontario, Canada
- Ashkan Golshani
- Department of Biology, Carleton University, Ottawa, K1S 5B6, Ontario, Canada
- Ottawa Institute of Systems Biology, Carleton University, 1125 Colonel By Drive, Ottawa, K1S 5B6, Canada
- Frank Dehne
- School of Computer Science, Carleton University, Ottawa, Ontario, K1S 5B6, Canada
- James R Green
- Department of Systems and Computer Engineering, Carleton University, Ottawa, Ontario, K1S 5B6, Canada

23
Hraber P, O'Maille PE, Silberfarb A, Davis-Anderson K, Generous N, McMahon BH, Fair JM. Resources to Discover and Use Short Linear Motifs in Viral Proteins. Trends Biotechnol 2020; 38:113-127. [PMID: 31427097 PMCID: PMC7114124 DOI: 10.1016/j.tibtech.2019.07.004] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 04/30/2019] [Revised: 07/11/2019] [Accepted: 07/15/2019] [Indexed: 12/23/2022]
Abstract
Viral proteins evade host immune function by molecular mimicry, often achieved by short linear motifs (SLiMs) of three to ten consecutive amino acids (AAs). Motif mimicry tolerates mutations, evolves quickly to modify interactions with the host, and enables modular interactions with protein complexes. Host cells cannot easily coordinate changes to conserved motif recognition and binding interfaces under selective pressure to maintain critical signaling pathways. SLiMs offer potential for use in synthetic biology, such as better immunogens and therapies, but may also present biosecurity challenges. We survey viral uses of SLiMs to mimic host proteins, and information resources available for motif discovery. As the number of examples continues to grow, knowledge management tools are essential to help organize and compare new findings.
Affiliation(s)
- Peter Hraber
- Theoretical Biology and Biophysics, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
- Paul E O'Maille
- Biosciences Division, SRI International, 333 Ravenswood Ave, Menlo Park, CA 94025, USA
- Andrew Silberfarb
- Artificial Intelligence Center, SRI International, 333 Ravenswood Ave, Menlo Park, CA 94025, USA
- Katie Davis-Anderson
- Biosecurity and Public Health, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
- Nicholas Generous
- Global Security Directorate, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
- Benjamin H McMahon
- Theoretical Biology and Biophysics, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
- Jeanne M Fair
- Biosecurity and Public Health, Los Alamos National Laboratory, Los Alamos, NM 87545, USA

24
Zheng N, Wang K, Zhan W, Deng L. Targeting Virus-host Protein Interactions: Feature Extraction and Machine Learning Approaches. Curr Drug Metab 2019; 20:177-184. [PMID: 30156155 DOI: 10.2174/1389200219666180829121038] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 01/19/2018] [Revised: 05/21/2018] [Accepted: 08/02/2018] [Indexed: 01/15/2023]
Abstract
BACKGROUND: Targeting critical virus-host protein-protein interactions (PPIs) has enormous application prospects for therapeutics. Using experimental methods to evaluate all possible virus-host PPIs is labor-intensive and time-consuming. Recent growth in computational identification of virus-host PPIs provides new opportunities for gaining biological insights, including applications in disease control. We provide an overview of recent computational approaches for studying virus-host PPIs. METHODS: In this review, a variety of computational methods for virus-host PPI prediction are surveyed. These methods are categorized by the features they utilize and by the machine learning algorithms they employ, including classical and novel methods. RESULTS: We describe the pivotal and representative features extracted from relevant sources of biological data, mainly sequence signatures, known domain interactions, protein motifs and protein structure information. We focus on state-of-the-art machine learning algorithms used to build binary prediction models for the classification of virus-host protein pairs and discuss their abilities, weaknesses and future directions. CONCLUSION: The findings of this review confirm the importance of computational methods for finding potential protein-protein interactions between virus and host. Although the prediction of virus-host PPIs has progressed significantly in recent years, there remains considerable room for improvement.
Affiliation(s)
- Nantao Zheng
  - School of Software, Central South University, Changsha, 410075, China
- Kairou Wang
  - School of Software, Central South University, Changsha, 410075, China
- Weihua Zhan
  - School of Electronics and Computer Science, Zhejiang Wanli University, Ningbo 315100, China
- Lei Deng
  - School of Software, Central South University, Changsha, 410075, China
  - Shanghai Key Lab of Intelligent Information Processing, Shanghai 200433, China
|
25
|
Unity and diversity among viral kinases. Gene 2019; 723:144134. [PMID: 31589960 DOI: 10.1016/j.gene.2019.144134] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 02/08/2019] [Revised: 09/12/2019] [Accepted: 09/16/2019] [Indexed: 12/27/2022]
Abstract
Viral kinases are known to undergo autophosphorylation and also phosphorylate viral and host substrates. Viral kinases have been implicated in various diseases and are also known to acquire host kinases for mimicking cellular functions and exhibit virulence. Although substantial analyses have been reported in the literature on diversity of viral kinases, there is a gap in the understanding of sequence and structural similarity among kinases from different classes of viruses. In this study, we performed a comprehensive analysis of protein kinases encoded in viral genomes. Homology search methods have been used to identify kinases from 104,282 viral genomic datasets. Serine/threonine and tyrosine kinases are identified only in 390 viral genomes. Out of seven viral classes that are based on nature of genetic material, only viruses having double-stranded DNA and single-stranded RNA retroviruses are found to encode kinases. The 716 identified protein kinases are classified into 63 subfamilies based on their sequence similarity within each cluster, and sequence signatures have been identified for each subfamily. 11 clusters are well represented with at least 10 members in each of these clusters. Kinases from dsDNA viruses, Phycodnaviridae which infect green algae and Herpesvirales that infect vertebrates including human, form a major group. From our analysis, it has been observed that the protein kinases in viruses belonging to same taxonomic lineages form discrete clusters and the kinases encoded in alphaherpesvirus form host-specific clusters. A comprehensive sequence and structure-based analysis enabled us to identify the conserved residues or motifs in kinase catalytic domain regions across all viral kinases. Conserved sequence regions that are specific to a particular viral kinase cluster and the kinases that show close similarity to eukaryotic kinases were identified by using sequence and three-dimensional structural regions of eukaryotic kinases as reference. 
The regions specific to each viral kinase cluster can be used as signatures in the future in classifying uncharacterized viral kinases. We note that kinases from giant viruses Marseilleviridae have close similarity to viral oncogenes in the functional regions and in putative substrate binding regions indicating their possible role in cancer.
|
26
|
Ray S, Alberuni S, Maulik U. Computational Prediction of HCV-Human Protein-Protein Interaction via Topological Analysis of HCV Infected PPI Modules. IEEE Trans Nanobioscience 2019; 17:55-61. [PMID: 29570075 DOI: 10.1109/tnb.2018.2797696] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 11/08/2022]
Abstract
In this paper, we have developed a framework for detecting protein-protein interactions (PPIs) between Hepatitis C virus (HCV) and human proteins based on PPI and Gene Ontology information for the HCV-infected proteins. First, a bipartite interaction network is formed between HCV proteins and human host proteins. Next, we analyzed different topological properties of the interaction network and observed that the degree of HCV-interacting proteins is significantly higher than that of non-interacting host proteins. We also observed that HCV-interacting protein pairs are more functionally similar to each other than non-interacting pairs. Following these observations, we applied an inference mechanism to predict novel interactions between HCV and human proteins. The inference mechanism partitions the network formed by HCV-interacting human proteins and their first neighbors into dense and functionally similar groups using a PPI network clustering algorithm. The groups are then analyzed to predict PPIs. The predicted interaction pairs are validated by a literature search in PubMed, which yields experimental evidence for over 50% of the predicted pairs. A Gene Ontology and pathway-based analysis is also carried out to validate the identified modules biologically.
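The degree comparison described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the toy edge lists, the protein labels, and the choice to measure degree in the human PPI network are not taken from the paper, and no significance test is performed.

```python
from collections import defaultdict
from statistics import mean

def degrees(edges):
    """Count how many edges touch each node of an undirected edge list."""
    deg = defaultdict(int)
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg

# Toy data (hypothetical): human-human PPIs and HCV-human PPIs.
human_ppi = [("H1", "H2"), ("H1", "H3"), ("H1", "H4"),
             ("H2", "H3"), ("H4", "H5"), ("H5", "H6")]
hcv_human = [("NS3", "H1"), ("NS5A", "H1"), ("NS5A", "H2")]

# Human proteins targeted by at least one HCV protein.
targets = {human for _, human in hcv_human}

deg = degrees(human_ppi)
non_targets = set(deg) - targets

# Compare the mean degree of targeted and non-targeted host proteins.
mean_target = mean(deg[h] for h in targets)
mean_non_target = mean(deg[h] for h in non_targets)
print(mean_target, mean_non_target)
```

On real data the two degree distributions would be compared with a statistical test rather than by their means alone.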
|
27
|
Ivan FX, Kwoh CK, Chow VT, Zheng J. Genome Analysis – Identification of Genes Involved in Host-Pathogen Protein-Protein Interaction Networks. ENCYCLOPEDIA OF BIOINFORMATICS AND COMPUTATIONAL BIOLOGY 2019:410-424. [DOI: 10.1016/b978-0-12-809633-8.20124-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/03/2025]
|
28
|
Guven-Maiorov E, Tsai CJ, Ma B, Nussinov R. Interface-Based Structural Prediction of Novel Host-Pathogen Interactions. Methods Mol Biol 2019; 1851:317-335. [PMID: 30298406 PMCID: PMC8192064 DOI: 10.1007/978-1-4939-8736-8_18] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Indexed: 12/26/2022]
Abstract
About 20% of the cancer incidences worldwide have been estimated to be associated with infections. However, the molecular mechanisms of exactly how they contribute to host tumorigenesis are still unknown. To evade host defense, pathogens hijack host proteins at different levels: sequence, structure, motif, and binding surface, i.e., interface. Interface similarity allows pathogen proteins to compete with host counterparts to bind to a target protein, rewire physiological signaling, and result in persistent infections, as well as cancer. Identification of host-pathogen interactions (HPIs), along with their structural details at atomic resolution, may provide mechanistic insight into pathogen-driven cancers and innovate therapeutic intervention. HPI data including structural details is scarce and large-scale experimental detection is challenging. Therefore, there is an urgent and mounting need for efficient and robust computational approaches to predict HPIs and their complex (bound) structures. In this chapter, we review the first and currently only interface-based computational approach to identify novel HPIs. The concept of interface mimicry promises to identify more HPIs than complete sequence or structural similarity. We illustrate this concept with a case study on Kaposi's sarcoma herpesvirus (KSHV) to elucidate how it subverts host immunity and helps contribute to malignant transformation of the host cells.
Affiliation(s)
- Emine Guven-Maiorov
  - Cancer and Inflammation Program, Leidos Biomedical Research, Inc. Frederick National Laboratory for Cancer Research, National Cancer Institute, Frederick, MD, USA
- Chung-Jung Tsai
  - Cancer and Inflammation Program, Leidos Biomedical Research, Inc. Frederick National Laboratory for Cancer Research, National Cancer Institute, Frederick, MD, USA
- Buyong Ma
  - Cancer and Inflammation Program, Leidos Biomedical Research, Inc. Frederick National Laboratory for Cancer Research, National Cancer Institute, Frederick, MD, USA
- Ruth Nussinov
  - Cancer and Inflammation Program, Leidos Biomedical Research, Inc. Frederick National Laboratory for Cancer Research, National Cancer Institute, Frederick, MD, USA
  - Department of Human Genetics and Molecular Medicine, Sackler Institute of Molecular Medicine, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
|
29
|
Halder AK, Dutta P, Kundu M, Basu S, Nasipuri M. Review of computational methods for virus-host protein interaction prediction: a case study on novel Ebola-human interactions. Brief Funct Genomics 2018; 17:381-391. [PMID: 29028879 PMCID: PMC7109800 DOI: 10.1093/bfgp/elx026] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 12/15/2022] Open
Abstract
Identification of potential virus-host interactions is vital for controlling highly infectious virus-caused diseases and may contribute toward the development of new drugs to treat viral infections. Recently, database records of clinically and experimentally validated interactions between a small set of human proteins and Ebola virus (EBOV) have been published. Using the information on the known human interaction partners of EBOV, our main objective is to identify a set of proteins that may interact with EBOV proteins. Here, we first review the state-of-the-art computational methods used for prediction of novel virus-host interactions for infectious diseases, followed by a case study on EBOV-human interactions. The assessment shows that the predicted human host proteins are highly similar to the known human interaction partners of EBOV in structure and semantics, and are responsible for similar biochemical activities, pathways and host-pathogen relationships.
Affiliation(s)
- Anup Kumar Halder
  - Department of Computer Science and Engineering, Jadavpur University, India
- Pritha Dutta
  - Department of Computer Science and Engineering, Jadavpur University, India
- Mahantapas Kundu
  - Department of Computer Science and Engineering, Jadavpur University, India
- Subhadip Basu
  - Department of Computer Science and Engineering, Jadavpur University, India
- Mita Nasipuri
  - Department of Computer Science and Engineering, Jadavpur University, India
|
30
|
Li Y, Maleki M, Carruthers NJ, Stemmer PM, Ngom A, Rueda L. The predictive performance of short-linear motif features in the prediction of calmodulin-binding proteins. BMC Bioinformatics 2018; 19:410. [PMID: 30453876 PMCID: PMC6245490 DOI: 10.1186/s12859-018-2378-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 04/21/2023] Open
Abstract
Background: The prediction of calmodulin-binding (CaM-binding) proteins plays a very important role in biology and biochemistry, because calmodulin binds and regulates a multitude of protein targets affecting different cellular processes. Computational methods that can accurately identify CaM-binding proteins and CaM-binding domains would accelerate research in calcium signaling and calmodulin function. Short-linear motifs (SLiMs), on the other hand, have been effectively used as features for analyzing protein-protein interactions, though their properties have not been utilized in the prediction of CaM-binding proteins. Results: We propose a new method for the prediction of CaM-binding proteins based on both the total and average scores of known and new SLiMs in protein sequences, using a new scoring method called sliding window scoring (SWS) to generate features for the prediction module. A dataset of 194 manually curated human CaM-binding proteins and 193 mitochondrial proteins was obtained and used for testing the proposed model. The motif generation tool, Multiple EM for Motif Elucidation (MEME), was used to obtain new motifs from each of the positive and negative datasets individually (the SM approach) and from the combined negative and positive datasets (the CM approach). Moreover, the wrapper criterion with random forest for feature selection (FS) was applied, followed by classification using different algorithms such as k-nearest neighbors (k-NN), support vector machines (SVM), naive Bayes (NB) and random forest (RF). Conclusions: Our proposed method shows very good prediction results and demonstrates that information contained in SLiMs is highly relevant in predicting CaM-binding proteins. Further, three new CaM-binding motifs have been computationally selected and biologically validated in this study, and these can be used for predicting CaM-binding proteins.
Affiliation(s)
- Yixun Li
  - School of Computer Science, University of Windsor, Windsor, Ontario, Canada
- Mina Maleki
  - School of Computer Science, University of Windsor, Windsor, Ontario, Canada
- Paul M Stemmer
  - Institute of Environmental Health Sciences, Wayne State University, Detroit, MI, USA
- Alioune Ngom
  - School of Computer Science, University of Windsor, Windsor, Ontario, Canada
- Luis Rueda
  - School of Computer Science, University of Windsor, Windsor, Ontario, Canada
|
31
|
Chen J, Sun J, Liu X, Liu F, Liu R, Wang J. Structure-based prediction of West Nile virus-human protein-protein interactions. J Biomol Struct Dyn 2018; 37:2310-2321. [PMID: 30044201 DOI: 10.1080/07391102.2018.1479659] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 12/11/2022]
Abstract
In recent years, West Nile virus (WNV) has posed a great threat to global human health due to its explosive spread. Studying the protein-protein interactions (PPIs) between WNV and humans is beneficial for understanding the pathogenesis of WNV and the human immune response to WNV infection at the molecular level. In this study, we identified the human target proteins that interact with WNV based on protein structure similarity, and then filtered the interacting pairs using subcellular co-localization information. As a result, a network of 3346 interactions was constructed, involving 6 WNV proteins and 1970 human target proteins. To our knowledge, this is the first predicted WNV-human interactome. Analysis of the topological properties and evolutionary rates of the human target proteins demonstrated that these proteins tend to be hub and bottleneck proteins in the human PPI network and are more conserved than non-target proteins. Triplet analysis showed that the target proteins are adjacent to each other in the human PPI network, suggesting that these proteins may have similar biological functions. Further, functional enrichment analysis indicated that the target proteins are mainly involved in viral processes, transcription regulation and cell adhesion, among others. In addition, the common and specific targets were identified and compared based on the WNV-human and Dengue virus II (DENV2)-human networks. Finally, by combining topological features and existing drug target information, we identified 30 potential anti-WNV human targets, 11 of which have been reported to be associated with WNV infection. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Jing Chen
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, People's Republic of China
- Jun Sun
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, People's Republic of China
- Xiangming Liu
  - Gongqing Institute of Science and Technology, Gongqing, People's Republic of China
- Feng Liu
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, People's Republic of China
- Rong Liu
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, People's Republic of China
- Jia Wang
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, People's Republic of China
|
32
|
Soyemi J, Isewon I, Oyelade J, Adebiyi E. Inter-Species/Host-Parasite Protein Interaction Predictions Reviewed. Curr Bioinform 2018; 13:396-406. [PMID: 31496926 PMCID: PMC6691774 DOI: 10.2174/1574893613666180108155851] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 07/13/2017] [Revised: 12/31/2017] [Accepted: 01/02/2018] [Indexed: 01/01/2023]
Abstract
BACKGROUND: Host-parasite protein interactions (HPPIs) are the interactions that occur between a parasite and its host. Studying them improves the understanding of how a parasite infects its host. These interactions play an important role in initiating infection, although not all host-parasite interactions result in infection. Identifying the protein-protein interactions (PPIs) that allow a parasite to infect its host contributes greatly to discovering possible drug targets. Altering such PPIs would prevent the host from being infected by the parasite and, in some cases, leave the parasite unable to complete specific stages of its life cycle, leading to its death. It is therefore important to understand the workings of host-parasite interactions, which are the major causes of most infectious diseases. OBJECTIVE: Many studies in the literature have predicted HPPIs, mostly with computational methods and a few with experimental methods. Computational methods have proved faster and more efficient for manipulating and analyzing real-life data. This study examines the computational methods used in the literature for host-parasite/inter-species PPI prediction, aiming to gain better insight into these methods and to determine whether machine learning approaches have been used extensively for this purpose. METHODS: The methods used to predict host-parasite protein interactions were reviewed along with their individual strengths. Studies that carried out host-parasite/inter-species protein interaction predictions were tabulated and analyzed by predictive method, filters used, potential PPIs discovered and validation measurements used. The commonly used measurement indexes for such studies were highlighted together with their formulas. Finally, future prospects for studies specific to human-Plasmodium falciparum PPI prediction were proposed. RESULT: We found that quite a few of the studies reviewed implemented a machine learning approach for HPPI prediction when compared with methods such as sequence homology search, protein structure and domain-motif analysis. The key challenge noted in HPPI prediction is obtaining relevant information. CONCLUSION: This review presents useful knowledge and future directions on the subject matter.
Affiliation(s)
- Jumoke Soyemi
  - Department of Computer Science, The Federal Polytechnic, Ilaro, Nigeria
  - Covenant University Bioinformatics Research (CUBRe), Ota, Nigeria
- Itunnuoluwa Isewon
  - Department of Computer & Information Sciences, Covenant University, Ota, Nigeria
  - Covenant University Bioinformatics Research (CUBRe), Ota, Nigeria
- Jelili Oyelade
  - Department of Computer & Information Sciences, Covenant University, Ota, Nigeria
  - Covenant University Bioinformatics Research (CUBRe), Ota, Nigeria
- Ezekiel Adebiyi
  - Department of Computer & Information Sciences, Covenant University, Ota, Nigeria
  - Covenant University Bioinformatics Research (CUBRe), Ota, Nigeria
|
33
|
Ding Z, Kihara D. Computational Methods for Predicting Protein-Protein Interactions Using Various Protein Features. CURRENT PROTOCOLS IN PROTEIN SCIENCE 2018; 93:e62. [PMID: 29927082 PMCID: PMC6097941 DOI: 10.1002/cpps.62] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.9] [Indexed: 12/24/2022]
Abstract
Understanding protein-protein interactions (PPIs) in a cell is essential for learning protein functions, pathways, and mechanism of diseases. PPIs are also important targets for developing drugs. Experimental methods, both small-scale and large-scale, have identified PPIs in several model organisms. However, results cover only a part of PPIs of organisms; moreover, there are many organisms whose PPIs have not yet been investigated. To complement experimental methods, many computational methods have been developed that predict PPIs from various characteristics of proteins. Here we provide an overview of literature reports to classify computational PPI prediction methods that consider different features of proteins, including protein sequence, genomes, protein structure, function, PPI network topology, and those which integrate multiple methods. © 2018 by John Wiley & Sons, Inc.
Affiliation(s)
- Ziyun Ding
  - Department of Biological Science, Purdue University, West Lafayette, IN, 47907 USA
- Daisuke Kihara
  - Department of Biological Science, Purdue University, West Lafayette, IN, 47907 USA
  - Department of Computer Science, Purdue University, West Lafayette, IN, 47907 USA
  - Corresponding author: DK; Phone: 1-765-496-2284
|
34
|
García-Pérez CA, Guo X, Navarro JG, Aguilar DAG, Lara-Ramírez EE. Proteome-wide analysis of human motif-domain interactions mapped on influenza A virus. BMC Bioinformatics 2018; 19:238. [PMID: 29940841 PMCID: PMC6019528 DOI: 10.1186/s12859-018-2237-8] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 10/03/2017] [Accepted: 06/07/2018] [Indexed: 01/27/2023] Open
Abstract
Background: The influenza A virus (IAV) is a constant threat to humans worldwide. Understanding motif-domain protein participation is essential to combat the pathogen. Results: In this study, a data mining approach was employed to extract influenza-human protein-protein interactions (PPIs) from the VirusMentha, VirusMINT, IntAct, and Pfam databases, and to mine motif-domain interactions (MDIs) stored as regular expressions (RegExp) in the 3DID database. A total of 107 RegExp related to human MDIs were searched against 51,242 protein fragments from H1N1, H1N2, H2N2, H3N2 and H5N1 strains obtained from the Virus Variation database. A total of 46 MDIs were frequently mapped on the IAV proteins and shared between the different strains. IAV kept host-like MDIs that were associated with virus survival and could be related to essential biological processes in human cells, such as microtubule-based processes, regulation of cell cycle checkpoints, and regulation of DNA replication and transcription. The amino acid motifs were searched for matches in the Immune Epitope Database, and some motifs were found to be part of experimentally determined epitopes on IAV, implying that such interactions exist. Conclusion: The directed data-mining method employed could be used to identify functional motifs in other viruses for envisioning new therapies.
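Searching motifs stored as regular expressions against protein fragments, as this abstract describes, can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the two patterns are generic SLiM-like examples chosen for this sketch, not entries from the 3DID set, and the fragment is made up.

```python
import re

# Illustrative motif patterns (assumptions, not the RegExp used in the study).
MOTIFS = {
    "PDZ_binding_like": r"[ST].[ACVILF]$",  # C-terminal PDZ-binding-like pattern
    "SH3_binding_like": r"P..P",            # minimal proline-rich core
}

def scan_motifs(sequence, motifs):
    """Return (motif_name, start, matched_text) for every match.

    The pattern is wrapped in a zero-width lookahead so that overlapping
    occurrences are reported as well.
    """
    hits = []
    for name, pattern in motifs.items():
        for m in re.finditer(f"(?=({pattern}))", sequence):
            hits.append((name, m.start(), m.group(1)))
    return hits

fragment = "MPPLPAAGSKETSV"  # made-up protein fragment
for hit in scan_motifs(fragment, MOTIFS):
    print(hit)
```

Because such short patterns match by chance very often, hits of this kind are only candidates; the study summarized above additionally required the matching human domain partner to appear in known influenza-human PPIs.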
Affiliation(s)
- Carlos A García-Pérez
  - Centro de Biotecnología Genómica, Instituto Politécnico Nacional, Reynosa, Tamaulipas, Mexico
- Xianwu Guo
  - Centro de Biotecnología Genómica, Instituto Politécnico Nacional, Reynosa, Tamaulipas, Mexico
- Edgar E Lara-Ramírez
  - Unidad de Investigación Biomédica de Zacatecas, Instituto Mexicano del Seguro Social, Interior Alameda # 45, Colonia Centro, CP. 98000, Zacatecas, Zac, Mexico
|
35
|
Predicting Interactions between Virus and Host Proteins Using Repeat Patterns and Composition of Amino Acids. JOURNAL OF HEALTHCARE ENGINEERING 2018; 2018:1391265. [PMID: 29854357 PMCID: PMC5966669 DOI: 10.1155/2018/1391265] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Received: 11/14/2017] [Revised: 03/27/2018] [Accepted: 04/17/2018] [Indexed: 11/29/2022]
Abstract
Previous methods for predicting protein-protein interactions (PPIs) focused mainly on PPIs within a single species, but PPIs across different species have recently emerged as an important issue in areas such as viral infection. The primary focus of this study is to predict PPIs between a virus and its targeted host that are involved in viral infection. We developed a general method that predicts interactions between virus and host proteins using the repeat patterns and composition of amino acids. In independent testing with PPIs of new viruses and hosts, the method showed high performance, comparable to the best performance of other methods for single virus-host PPIs. In a comparison using the same datasets, our method outperformed the others. The repeat patterns and composition of amino acids are simple, yet powerful features for predicting virus-host PPIs. The method developed in this study will help in finding new virus-host PPIs for which little information is available.
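A rough sketch of sequence-derived features of this kind is given below. It is an assumption-laden illustration: the abstract does not define its repeat-pattern feature, so the longest single-residue run is used here as a stand-in, and the function names and example sequences are hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def composition(seq):
    """Fraction of each standard amino acid in the sequence (20 values)."""
    counts = Counter(seq)
    n = len(seq)
    return [counts[a] / n if n else 0.0 for a in AMINO_ACIDS]

def longest_run(seq):
    """Length of the longest run of one repeated residue.

    A simple stand-in for a repeat feature, not the paper's definition.
    """
    best = run = 0
    previous = None
    for residue in seq:
        run = run + 1 if residue == previous else 1
        best = max(best, run)
        previous = residue
    return best

def pair_features(virus_seq, host_seq):
    """Concatenate per-protein features into one vector for a virus-host pair."""
    return (composition(virus_seq) + [longest_run(virus_seq)]
            + composition(host_seq) + [longest_run(host_seq)])

vector = pair_features("MAAKKKLV", "MSTPPPPG")
print(len(vector))
```

A vector like this, computed for labeled interacting and non-interacting pairs, could then be passed to any standard binary classifier.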
|
36
|
The present and the future of motif-mediated protein-protein interactions. Curr Opin Struct Biol 2018; 50:162-170. [PMID: 29730529 DOI: 10.1016/j.sbi.2018.04.005] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 11/22/2017] [Revised: 02/07/2018] [Accepted: 04/11/2018] [Indexed: 01/14/2023]
Abstract
Protein-protein interactions (PPIs) are essential to governing virtually all cellular processes. Of particular importance are the versatile motif-mediated interactions (MMIs), which are thus far underrepresented in available interaction data. This is largely due to technical difficulties inherent in the properties of MMIs, but due to the increasing recognition of the vital roles of MMIs in biology, several systematic approaches have recently been developed to detect novel MMIs. Consequently, rapidly growing numbers of motifs are being identified and pursued further for therapeutic applications. In this review, we discuss the current understanding on the diverse functions and disease-relevance of MMIs, the key methodologies for detection of MMIs, and the potential of MMIs for drug development.
|