1
Identification of the HNSC88 Molecular Signature for Predicting Subtypes of Head and Neck Cancer. Int J Mol Sci 2023; 24:13068. [PMID: 37685875 PMCID: PMC10487792 DOI: 10.3390/ijms241713068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2023] [Revised: 08/14/2023] [Accepted: 08/17/2023] [Indexed: 09/10/2023] Open
Abstract
Head and neck squamous cell carcinoma (HNSC) exhibits genetic heterogeneity in etiologies, tumor sites, and biological processes, which significantly impact therapeutic strategies and prognosis. While the influence of human papillomavirus on clinical outcomes is established, the molecular subtypes determining additional treatment options for HNSC remain unclear and inconsistent. This study aims to identify distinct HNSC molecular subtypes to enhance diagnosis and prognosis accuracy. In this study, we collected three HNSC microarrays (n = 306) from the Gene Expression Omnibus (GEO), and HNSC RNA-Seq data (n = 566) from The Cancer Genome Atlas (TCGA) to identify differentially expressed genes (DEGs) and validate our results. Two scoring methods, representative score (RS) and perturbative score (PS), were developed for DEGs to summarize their possible activation functions and influence in tumorigenesis. Based on the RS and PS scoring, we selected candidate genes to cluster TCGA samples for the identification of molecular subtypes in HNSC. We have identified 289 up-regulated DEGs and selected 88 genes (called HNSC88) using the RS and PS scoring methods. Based on HNSC88 and TCGA samples, we determined three HNSC subtypes, including one HPV-associated subtype, and two HPV-negative subtypes. One of the HPV-negative subtypes showed a relationship to smoking behavior, while the other exhibited high expression in tumor immune response. The Kaplan-Meier method was used to compare overall survival among the three subtypes. The HPV-associated subtype showed a better prognosis compared to the other two HPV-negative subtypes (log rank, p = 0.0092 and 0.0001; hazard ratio, 1.36 and 1.39). Additionally, within the HPV-negative group, the smoking-related subgroup exhibited worse prognosis compared to the subgroup with high expression in immune response (log rank, p = 0.039; hazard ratio, 1.53). 
The HNSC88 not only enables the identification of HPV-associated subtypes, but also proposes two potential HPV-negative subtypes with distinct prognoses and molecular signatures. This study provides valuable strategies for summarizing the roles and influences of genes in tumorigenesis for identifying molecular signatures and subtypes of HNSC.
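The abstract above compares overall survival among subtypes with the Kaplan-Meier method. As a minimal sketch of that estimator (not the authors' code; the function name `kaplan_meier` and the toy input format are assumptions made for illustration), the survival probability is multiplied by (1 - deaths / number at risk) at each observed event time:

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one observed event.

    times  -- follow-up duration of each patient
    events -- 1 if the event (death) was observed, 0 if the patient was censored
    (hypothetical input format, chosen for this sketch)
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)      # everyone who exits the risk set at time t
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: scale by the fraction surviving this time point.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]     # drop both deaths and censored patients
    return curve
```

Censored patients never lower the curve directly; they only shrink the risk set used at later event times. A log-rank test or hazard ratio, as reported in the abstract, would need a separate procedure.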
2
Membrane protein-regulated networks across human cancers. Nat Commun 2019; 10:3131. [PMID: 31311925 PMCID: PMC6635409 DOI: 10.1038/s41467-019-10920-8] [Citation(s) in RCA: 40] [Impact Index Per Article: 8.0] [Received: 01/31/2018] [Accepted: 06/10/2019] [Indexed: 01/01/2023] Open
Abstract
Alterations in membrane proteins (MPs) and their regulated pathways have been established as cancer hallmarks and extensively targeted in clinical applications. However, the analysis of MP-interacting proteins and downstream pathways across human malignancies remains challenging. Here, we present a systematically integrated method to generate a resource of cancer membrane protein-regulated networks (CaMPNets), containing 63,746 high-confidence protein-protein interactions (PPIs) for 1962 MPs, using expression profiles from 5922 tumors with overall survival outcomes across 15 human cancers. Comprehensive analysis of CaMPNets links MP partner communities and regulated pathways to provide MP-based gene sets for identifying prognostic biomarkers and druggable targets. For example, we identify CHRNA9 with 12 PPIs (e.g., ERBB2) can be a therapeutic target and find its anti-metastasis agent, bupropion, for treatment in nicotine-induced breast cancer. This resource is a study to systematically integrate MP interactions, genomics, and clinical outcomes for helping illuminate cancer-wide atlas and prognostic landscapes in tumor homo/heterogeneity.
3
Emerging roles of allosteric modulators in the regulation of protein-protein interactions (PPIs): A new paradigm for PPI drug discovery. Med Res Rev 2019; 39:2314-2342. [PMID: 30957264 DOI: 10.1002/med.21585] [Citation(s) in RCA: 69] [Impact Index Per Article: 13.8] [Received: 08/04/2018] [Revised: 03/12/2019] [Accepted: 03/24/2019] [Indexed: 12/26/2022]
Abstract
Protein-protein interactions (PPIs) are closely implicated in various types of cellular activities and are thus pivotal to health and disease states. Given their fundamental roles in a wide range of biological processes, the modulation of PPIs has enormous potential in drug discovery. However, owing to the general properties of large, flat, and featureless interfaces of PPIs, previous attempts have demonstrated that the generation of therapeutic agents targeting PPI interfaces is challenging, rendering them almost "undruggable" for decades. To date, rapid progress in chemical and structural biology techniques has promoted the exploitation of allostery as a novel approach in drug discovery. By attaching to allosteric sites that are topologically and spatially distinct from PPI interfaces, allosteric modulators can achieve improved physicochemical properties. Thus, allosteric modulators may represent an alternative strategy to target intractable PPIs and have attracted intense pharmaceutical interest. In this review, we first briefly introduce the characteristics of PPIs and then present different approaches for investigating PPIs, as well as the latest methods for modulating PPIs. Importantly, we comprehensively review the recent progress in the development of allosteric modulators to inhibit or stabilize PPIs. Finally, we conclude with future perspectives on the discovery of allosteric PPI modulators, especially the application of computational methods to aid in allosteric PPI drug discovery.
4
A homologous mapping method for three-dimensional reconstruction of protein networks reveals disease-associated mutations. BMC Systems Biology 2018; 12:13. [PMID: 29560828 PMCID: PMC5861491 DOI: 10.1186/s12918-018-0537-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/10/2022]
Abstract
BACKGROUND One of the crucial steps toward understanding the associations among molecular interactions, pathways, and diseases in a cell is to investigate detailed atomic protein-protein interactions (PPIs) in the structural interactome. Despite the availability of large-scale methods for analyzing PPI networks, these methods often focused on PPI networks using genome-scale data and/or known experimental PPIs. However, these methods are unable to provide structurally resolved interaction residues and their conservations in PPI networks. RESULTS Here, we reconstructed a human three-dimensional (3D) structural PPI network (hDiSNet) with the detailed atomic binding models and disease-associated mutations by enhancing our PPI families and 3D-domain interologs from 60,618 structural complexes and complete genome database with 6,352,363 protein sequences across 2274 species. hDiSNet is a scale-free network (γ = 2.05), which consists of 5177 proteins and 19,239 PPIs with 5843 mutations. These 19,239 structurally resolved PPIs not only expanded the number of PPIs compared to present structural PPI network, but also achieved higher agreement with gene ontology similarities and higher co-expression correlation than the ones of 181,868 experimental PPIs recorded in public databases. Among 5843 mutations, 1653 and 790 mutations involved in interacting domains and contacting residues, respectively, are highly related to diseases. Our hDiSNet can provide detailed atomic interactions of human disease and their associated proteins with mutations. Our results show that the disease-related mutations are often located at the contacting residues forming the hydrogen bonds or conserved in the PPI family. In addition, hDiSNet provides the insights of the FGFR (EGFR)-MAPK pathway for interpreting the mechanisms of breast cancer and ErbB signaling pathway in brain cancer. 
CONCLUSIONS Our results demonstrate that hDiSNet can explore structural-based interactions insights for understanding the mechanisms of disease-associated proteins and their mutations. We believe that our method is useful to reconstruct structurally resolved PPI networks for interpreting structural genomics and disease associations.
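The abstract above describes hDiSNet as scale-free with γ = 2.05 but does not say how the exponent was fitted. One common way to estimate the exponent of a degree distribution P(k) ∝ k^(-γ) is the approximate maximum-likelihood formula for discrete data, sketched below. The function name `powerlaw_exponent` and the default `k_min` are assumptions for illustration, and this is not necessarily the procedure used in the paper.

```python
import math


def powerlaw_exponent(degrees, k_min=1):
    """Approximate maximum-likelihood estimate of gamma for P(k) ~ k^(-gamma).

    Uses the discrete-data approximation
        gamma ~= 1 + n / sum(ln(k_i / (k_min - 0.5)))
    over all node degrees k_i >= k_min.
    """
    tail = [k for k in degrees if k >= k_min]
    if not tail:
        raise ValueError("no degrees at or above k_min")
    log_sum = sum(math.log(k / (k_min - 0.5)) for k in tail)
    return 1.0 + len(tail) / log_sum
```

In practice `degrees` would be the number of interaction partners of each protein in the network, and the choice of `k_min` noticeably affects the estimate.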
5
Non-interacting proteins may resemble interacting proteins: prevalence and implications. Sci Rep 2017; 7:40419. [PMID: 28084410 PMCID: PMC5289270 DOI: 10.1038/srep40419] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 01/26/2016] [Accepted: 12/07/2016] [Indexed: 12/13/2022] Open
Abstract
The vast majority of proteins do not form functional interactions under physiological conditions. We have considered several sets of protein pairs from S. cerevisiae with no functional interaction reported, denoted as non-interacting pairs, and compared their 3D structures to available experimental complexes. We identified some non-interacting pairs with significant structural similarity with experimental complexes, indicating that, even though they do not form functional interactions, they have compatible structures. We estimate that up to 8.7% of non-interacting protein pairs could have compatible structures. This number of interactions exceeds the number of functional interactions (around 0.2% of the total interactions) by a factor of 40. Network analysis suggests that the interactions formed by non-interacting pairs with compatible structures could be particularly hazardous to the protein-protein interaction network. From a structural point of view, these interactions display no aberrant structural characteristics, and are even predicted as relatively stable and enriched in potential physical interactors, suggesting a major role of regulation to prevent them.
6
A calmodulin like EF hand protein positively regulates oxalate decarboxylase expression by interacting with E-box elements of the promoter. Sci Rep 2015; 5:14578. [PMID: 26455820 PMCID: PMC4600981 DOI: 10.1038/srep14578] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 03/17/2015] [Accepted: 09/03/2015] [Indexed: 12/02/2022] Open
Abstract
Oxalate decarboxylase (OXDC) enzyme has immense biotechnological applications due to its ability to decompose anti-nutrient oxalic acid. Flammulina velutipes, an edible wood rotting fungus responds to oxalic acid by induction of OXDC to maintain steady levels of pH and oxalate anions outside the fungal hyphae. Here, we report that upon oxalic acid induction, a calmodulin (CaM) like protein-FvCaMLP, interacts with the OXDC promoter to regulate its expression. Electrophoretic mobility shift assay showed that FvCamlp specifically binds to two non-canonical E-box elements (AACGTG) in the OXDC promoter. Moreover, substitutions of amino acids in the EF hand motifs resulted in loss of DNA binding ability of FvCamlp. F. velutipes mycelia treated with synthetic siRNAs designed against FvCaMLP showed significant reduction in FvCaMLP as well as OXDC transcript pointing towards positive nature of the regulation. FvCaMLP is different from other known EF hand proteins. It shows sequence similarity to both CaMs and myosin regulatory light chain (Cdc4), but has properties typical of a calmodulin, like binding of 45Ca2+, heat stability and Ca2+ dependent electrophoretic shift. Hence, FvCaMLP can be considered a new addition to the category of unconventional Ca2+ binding transcriptional regulators.
7
Module organization and variance in protein-protein interaction networks. Sci Rep 2015; 5:9386. [PMID: 25797237 PMCID: PMC4369690 DOI: 10.1038/srep09386] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Received: 08/20/2014] [Accepted: 03/03/2015] [Indexed: 12/13/2022] Open
Abstract
A module is a group of closely related proteins that act in concert to perform specific biological functions through protein–protein interactions (PPIs) that occur in time and space. However, the underlying module organization and variance remain unclear. In this study, we collected module templates to infer respective module families, including 58,041 homologous modules in 1,678 species, and PPI families using searches of complete genomic database. We then derived PPI evolution scores and interface evolution scores to describe the module elements, including core and ring components. Functions of core components were highly correlated with those of essential genes. In comparison with ring components, core proteins/PPIs were conserved across multiple species. Subsequently, protein/module variance of PPI networks confirmed that core components form dynamic network hubs and play key roles in various biological functions. Based on the analyses of gene essentiality, module variance, and gene co-expression, we summarize the observations of module organization and variance as follows: 1) a module consists of core and ring components; 2) core components perform major biological functions and collaborate with ring components to execute certain functions in some cases; 3) core components are more conserved and essential during organizational changes in different biological states or conditions.
8
Reconstructing genome-wide protein-protein interaction networks using multiple strategies with homologous mapping. PLoS One 2015; 10:e0116347. [PMID: 25602759 PMCID: PMC4300222 DOI: 10.1371/journal.pone.0116347] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 06/27/2014] [Accepted: 12/08/2014] [Indexed: 11/19/2022] Open
Abstract
BACKGROUND One of the crucial steps toward understanding the biological functions of a cellular system is to investigate protein-protein interaction (PPI) networks. As an increasing number of reliable PPIs become available, there is a growing need for discovering PPIs to reconstruct PPI networks of interesting organisms. Some interolog-based methods and homologous PPI families have been proposed for predicting PPIs from the known PPIs of source organisms. RESULTS Here, we propose a multiple-strategy scoring method to identify reliable PPIs for reconstructing the mouse PPI network from two well-known organisms: human and fly. We first identified the PPI candidates of target organisms based on homologous PPIs, sharing significant sequence similarities (joint E-value ≤ 1 × 10^-40), from source organisms using generalized interolog mapping. These PPI candidates were evaluated by our multiple-strategy scoring method, combining sequence similarities, normalized ranks, and conservation scores across multiple organisms. According to 106,825 PPI candidates in yeast derived from human and fly, our scoring method can achieve high prediction accuracy and outperform generalized interolog mapping. Experimental results show that our multiple-strategy score can avoid the influence of the protein family size and length to significantly improve PPI prediction accuracy and reflect the biological functions. In addition, the top-ranked and conserved PPIs are often orthologous/essential interactions and share the functional similarity. Based on these reliable predicted PPIs, we reconstructed a comprehensive mouse PPI network, which is a scale-free network and can reflect the biological functions and high connectivity of 292 KEGG modules, including 216 pathways and 76 structural complexes. CONCLUSIONS Experimental results show that our scoring method can improve the prediction accuracy based on the normalized rank and evolutionary conservation from multiple organisms.
Our predicted PPIs share similar biological processes and cellular components, and the reconstructed genome-wide PPI network can reflect network topology and modularity. We believe that our method is useful for inferring reliable PPIs and reconstructing a comprehensive PPI network of an interesting organism.
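The candidate-generation step in the abstract above (generalized interolog mapping filtered by a joint E-value threshold) can be sketched as follows. This is an illustrative sketch only: the abstract does not define the joint E-value, so the geometric mean of the two per-protein E-values is assumed here, and the function names and the `homologs` input format are hypothetical.

```python
import math


def joint_e_value(e_a, e_b):
    """Assumed definition: geometric mean of the two homolog E-values."""
    return math.sqrt(e_a * e_b)


def map_interologs(source_ppis, homologs, threshold=1e-40):
    """Transfer source-organism PPIs to a target organism.

    source_ppis -- iterable of (protein_a, protein_b) pairs known to interact
    homologs    -- dict: source protein -> list of (target protein, E-value)
    Returns a dict mapping each candidate target pair to its best joint E-value.
    """
    candidates = {}
    for a, b in source_ppis:
        for a_t, e_a in homologs.get(a, []):
            for b_t, e_b in homologs.get(b, []):
                je = joint_e_value(e_a, e_b)
                if je <= threshold:
                    pair = tuple(sorted((a_t, b_t)))  # treat pairs as unordered
                    candidates[pair] = min(je, candidates.get(pair, je))
    return candidates
```

The multiple-strategy score described in the abstract (normalized ranks and conservation across organisms) would then be applied to rank these candidates; that scoring is not reproduced here.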
9
Abstract
The past decade has seen a dramatic expansion in the number and range of techniques available to obtain genome-wide information and to analyze this information so as to infer both the functions of individual molecules and how they interact to modulate the behavior of biological systems. Here, we review these techniques, focusing on the construction of physical protein-protein interaction networks, and highlighting approaches that incorporate protein structure, which is becoming an increasingly important component of systems-level computational techniques. We also discuss how network analyses are being applied to enhance our basic understanding of biological systems and their dysregulation, as well as how these networks are being used in drug development.
10
Abstract
Background The adaptive immune response is antigen-specific and triggered by pathogen recognition through T cells. Although the interactions and mechanisms of TCR-peptide-MHC (TCR-pMHC) have been studied over three decades, the biological basis for these processes remains controversial. With an increasing number of high-throughput binding epitopes and available TCR-pMHC complex structures, fast genome-wide structural modelling of TCR-pMHC interactions is an emerging task for understanding immune interactions and developing peptide vaccines. Results We first constructed the PPI matrices and iMatrix, using 621 non-redundant PPI interfaces and 398 non-redundant antigen-antibody interfaces, for modelling the MHC-peptide and TCR-peptide interfaces, respectively. The iMatrix consists of four knowledge-based scoring matrices to evaluate the hydrogen bonds and van der Waals forces between sidechains or backbones, respectively. The predicted energies of iMatrix are highly correlated (Pearson's correlation coefficient is 0.6) with 70 experimental free energies on antigen-antibody interfaces. To further investigate iMatrix and PPI matrices, we inferred 701,897 potential peptide antigens with statistical significance from 389 pathogen genomes and modelled the TCR-pMHC interactions using available TCR-pMHC complex structures. These identified peptide antigens keep hydrogen-bond energies and consensus interactions and our TCR-pMHC models can provide detailed interacting models and crucial binding regions. Conclusions Experimental results demonstrate that our method can achieve high precision for predicting binding affinity and potential peptide antigens. We believe that iMatrix and our template-based method can be useful for the binding mechanisms of TCR-pMHC complexes and peptide vaccine designs.
11
Inferring homologous protein-protein interactions through pair position specific scoring matrix. BMC Bioinformatics 2013; 14 Suppl 2:S11. [PMID: 23367879 PMCID: PMC3549806 DOI: 10.1186/1471-2105-14-s2-s11] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 01/12/2023] Open
Abstract
Background The protein-protein interaction (PPI) is one of the most important features to understand biological processes. For a PPI, the physical domain-domain interaction (DDI) plays the key role for biological functions. In the post-genomic era, rapidly identifying homologous PPIs for analyzing the contact residue pairs of their interfaces within DDIs on a genomic scale is essential to determine PPI networks and the PPI interface evolution across multiple species. Results In this study, we proposed "pair Position Specific Scoring Matrix (pairPSSM)" to identify homologous PPIs. The pairPSSM can successfully distinguish the true protein complexes from unreasonable protein pairs with about 90% accuracy. For the test set including 1,122 representative heterodimers and 2,708,746 non-interacting protein pairs, the mean average precision and mean false positive rate of pairPSSM were 0.42 and 0.31, respectively. Moreover, we applied pairPSSM to identify ~450,000 homologous PPIs with their interacting domains and residues in seven common organisms (e.g. Homo sapiens, Mus musculus, Saccharomyces cerevisiae and Escherichia coli). Conclusions Our pairPSSM is able to provide statistical significance of residue pairs using evolutionary profiles and a scoring system for inferring homologous PPIs. To the best of our knowledge, the pairPSSM is the first method for searching homologous PPIs across multiple species using pair position specific scoring matrix and a 3D dimer as the template to map interacting domain pairs of these PPIs. We believe that pairPSSM is able to provide valuable insights for the PPI evolution and networks across multiple species.
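The abstract above reports a mean average precision of 0.42 for pairPSSM. For readers unfamiliar with the metric, the sketch below shows how average precision is usually computed for one ranked list of candidate pairs; the mean over all query templates gives the mean average precision. The function name and the boolean input format are assumptions for illustration, and the paper's exact evaluation protocol is not given in the abstract.

```python
def average_precision(ranked_labels):
    """Average precision for one ranked candidate list.

    ranked_labels -- booleans ordered from best score to worst,
                     True where the candidate is a true interacting pair
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_true in enumerate(ranked_labels, start=1):
        if is_true:
            hits += 1
            precision_sum += hits / rank  # precision at this true hit
    return precision_sum / hits if hits else 0.0


def mean_average_precision(ranked_lists):
    """Mean of the per-query average precisions."""
    return sum(average_precision(r) for r in ranked_lists) / len(ranked_lists)
```

A list that ranks every true pair ahead of every false pair scores 1.0, so the metric rewards placing true complexes near the top of each ranking.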
12
Protein interactions in 3D: From interface evolution to drug discovery. J Struct Biol 2012; 179:347-58. [DOI: 10.1016/j.jsb.2012.04.009] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Received: 12/16/2011] [Revised: 03/27/2012] [Accepted: 04/18/2012] [Indexed: 11/25/2022]
13
Steric recognition of T-cell receptor contact residues is required to map mutant epitopes by immunoinformatical programmes. Immunology 2012; 136:139-52. [PMID: 22121944 DOI: 10.1111/j.1365-2567.2011.03542.x] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/30/2022] Open
Abstract
MHC class I-restricted CD8 T-lymphocyte epitopes comprise anchor motifs, T-cell receptor (TCR) contact residues and the peptide backbone. Serial variant epitopes with substitution of amino acids at either anchor motifs or TCR contact residues have been synthesized for specific interferon-γ responses to clarify the TCR recognition mechanism as well as to assess the epitope prediction capacity of immunoinformatical programmes. CD8 T lymphocytes recognise the steric configuration of functional groups at the TCR contact side chain with a parallel observation that peptide backbones of various epitopes adapt to the conserved conformation upon binding to the same MHC class I molecule. Variant epitopes with amino acid substitutions at the TCR contact site are not recognised by specific CD8 T lymphocytes without compromising their binding capacity to MHC class I molecules, which demonstrates two discrete antigen presentation events for the binding of peptides to MHC class I molecules and for TCR recognition. The predicted outcome of immunoinformatical programmes is not consistent with the results of epitope identification by laboratory experiments in the absence of information on the interaction with TCR contact residues. Immunoinformatical programmes based on the binding affinity to MHC class I molecules are not sufficient for the accurate prediction of CD8 T-lymphocyte epitopes. The predictive capacity is further improved to distinguish mutant epitopes from the non-mutated epitopes if the peptide-TCR interface is integrated into the computing simulation programme.
14
Abstract
Background Identifying the location of binding sites on proteins is of fundamental importance for a wide range of applications including molecular docking, de novo drug design, structure identification and comparison of functional sites. Structural genomic projects are beginning to produce protein structures with unknown functions. Therefore, efficient methods are required if all these structures are to be properly annotated. Lots of methods for finding binding sites involve 3D structure comparison. Here we design a method to find protein binding sites by direct comparison of protein 3D structures. Results We have developed an efficient heuristic approach for finding similar binding sites from the surface of given proteins. Our approach consists of three steps: local sequence alignment, protein surface detection, and 3D structures comparison. We implement the algorithm and produce a software package that works well in practice. When comparing a complete protein with all complete protein structures in the PDB database, experiments show that the average recall value of our approach is 82% and the average precision value of our approach is also significantly better than the existing approaches. Conclusions Our program has much higher recall values than those existing programs. Experiments show that all the existing approaches have recall values less than 50%. This implies that more than 50% of real binding sites cannot be reported by those existing approaches. The software package is available at http://sites.google.com/site/guofeics/bsfinder.
15
Uncovering Arabidopsis membrane protein interactome enriched in transporters using mating-based split ubiquitin assays and classification models. Frontiers in Plant Science 2012; 3:124. [PMID: 22737156 PMCID: PMC3380418 DOI: 10.3389/fpls.2012.00124] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Received: 02/22/2012] [Accepted: 05/24/2012] [Indexed: 05/18/2023]
Abstract
High-throughput data are a double-edged sword; for the benefit of large amount of data, there is an associated cost of noise. To increase reliability and scalability of high-throughput protein interaction data generation, we tested the efficacy of classification to enrich potential protein-protein interactions. We applied this method to identify interactions among Arabidopsis membrane proteins enriched in transporters. We validated our method with multiple retests. Classification improved the quality of the ensuing interaction network and was effective in reducing the search space and increasing true positive rate. The final network of 541 interactions among 239 proteins (of which 179 are transporters) is the first protein interaction network enriched in membrane transporters reported for any organism. This network has similar topological attributes to other published protein interaction networks. It also extends and fills gaps in currently available biological networks in plants and allows building a number of hypotheses about processes and mechanisms involving signal-transduction and transport systems.
16
MoNetFamily: a web server to infer homologous modules and module-module interaction networks in vertebrates. Nucleic Acids Res 2012; 40:W263-70. [PMID: 22689643 PMCID: PMC3394321 DOI: 10.1093/nar/gks541] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Indexed: 12/29/2022] Open
Abstract
A module is a fundamental unit formed by highly connected proteins that performs a certain kind of biological function. Modules and module–module interaction (MMI) networks are essential for understanding cellular processes and functions. The MoNetFamily web server can identify the modules, homologous modules (called module family) and MMI networks across multiple species for the query protein(s). This server first finds module candidates of the query by using BLASTP to search the module template database (1785 experimental and 1252 structural templates). MoNetFamily then infers the homologous modules of the selected module candidate using protein–protein interaction (PPI) families. According to homologous modules and PPIs, we statistically calculated MMIs and MMI networks across multiple species. For each module candidate, MoNetFamily identifies its neighboring modules and their MMIs in module networks of Homo sapiens, Mus musculus and Danio rerio. Finally, MoNetFamily shows the conserved proteins, PPI profiles and functional annotations of the module family. Our results indicate that the server can be useful for MMI network (e.g. 1818 modules and 9678 MMIs in H. sapiens) visualizations and query annotations using module families and neighboring modules. We believe that the server is able to provide valuable insights to determine homologous modules and MMI networks across multiple species for studying module evolution and cellular processes. The MoNetFamily server is available at http://monetfamily.life.nctu.edu.tw.
17
Abstract
Over the past two decades, computational methods have eased the financial and experimental burden of the early drug discovery process. In silico methods have provided support in terms of databases, data mining of large genomes, network analysis, and systems biology on the bioinformatics front, and structure-activity relationship, similarity analysis, docking, and pharmacophore methods for lead design and optimization. This review highlights some of the applications of bioinformatics and chemoinformatics methods that have enriched the field of drug discovery. In addition, the review provides insights into the use of free energy perturbation methods for efficiently computing binding energy. These in silico methods are complementary and can be easily integrated into the traditional in vitro and in vivo methods to test pharmacological hypotheses.
18
Abstract
Identifying the location of binding sites on proteins is of fundamental importance for a wide range of applications including molecular docking, de novo drug design, structure identification, and comparison of functional sites. In this paper, we develop an efficient approach for finding binding sites between proteins. Our approach consists of four steps: local sequence alignment, protein surface detection, 3D structure comparison, and candidate binding site selection. A comparison of our method with the LSA algorithm shows that the binding sites predicted by our method are somewhat closer to the actual binding sites in the protein-protein complexes. The software package is available at http://sites.google.com/site/guofeics/pro-bs for noncommercial use.
19
IBIS (Inferred Biomolecular Interaction Server) reports, predicts and integrates multiple types of conserved interactions for proteins. Nucleic Acids Res 2011; 40:D834-40. [PMID: 22102591 PMCID: PMC3245142 DOI: 10.1093/nar/gkr997] [Citation(s) in RCA: 70] [Impact Index Per Article: 5.4] [Indexed: 11/13/2022] Open
Abstract
We have recently developed the Inferred Biomolecular Interaction Server (IBIS) and database, which reports, predicts and integrates different types of interaction partners and locations of binding sites in proteins based on the analysis of homologous structural complexes. Here, we highlight several new IBIS features and options. The server's webpage is now redesigned to allow users easier access to data for different interaction types. An entry page is added to give a quick summary of available results and to now accept protein sequence accessions. To elucidate the formation of protein complexes, not just binary interactions, IBIS currently presents an expandable interaction network. Previously, IBIS provided annotations for four different types of binding partners: proteins, small molecules, nucleic acids and peptides; in the current version a new protein-ion interaction type has been added. Several options provide easy downloads of IBIS data for all Protein Data Bank (PDB) protein chains and the results for each query. In this study, we show that about one-third of all RefSeq sequences can be annotated with IBIS interaction partners and binding sites. The IBIS server is available at http://www.ncbi.nlm.nih.gov/Structure/ibis/ibis.cgi and updated biweekly.
|
20
|
Spatial clustering of protein binding sites for template based protein docking. Bioinformatics 2011; 27:2820-7. [DOI: 10.1093/bioinformatics/btr493] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.6] [Indexed: 11/14/2022] Open
|
21
|
PAComplex: a web server to infer peptide antigen families and binding models from TCR-pMHC complexes. Nucleic Acids Res 2011; 39:W254-60. [PMID: 21666259 PMCID: PMC3125798 DOI: 10.1093/nar/gkr434] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.4] [Received: 03/08/2011] [Revised: 04/22/2011] [Accepted: 05/12/2011] [Indexed: 01/04/2023] Open
Abstract
One of the most adaptive immune responses is triggered by specific T-cell receptors (TCR) binding to peptide-major histocompatibility complexes (pMHC). Despite the availability of many prediction servers to identify peptides binding to MHC, these servers are often lacking in peptide-TCR interactions and detailed atomic interacting models. PAComplex is the first web server investigating both pMHC and peptide-TCR interfaces to infer peptide antigens and homologous peptide antigens of a query. This server first identifies significantly similar TCR-pMHC templates (joint Z-value ≥ 4.0) of the query by using antibody-antigen and protein-protein interacting scoring matrices for peptide-TCR and pMHC interfaces, respectively. PAComplex then identifies the homologous peptide antigens of these hit templates from complete pathogen genome databases (≥10^8 peptide candidates from 864,628 protein sequences of 389 pathogens) and experimental peptide databases (80,057 peptides in 2287 species). Finally, the server outputs peptide antigens and homologous peptide antigens of the query and displays detailed interacting models (e.g. hydrogen bonds and steric interactions in two interfaces) of hit TCR-pMHC templates. Experimental results demonstrate that the proposed server can achieve high prediction accuracy and offer potential peptide antigens across pathogens. We believe that the server is able to provide valuable insights for the peptide vaccine and MHC restriction. The PAComplex server is available at http://PAcomplex.life.nctu.edu.tw.
|
22
|
Abstract
Singapore grouper iridovirus (SGIV), a major pathogen of concern for grouper aquaculture, has a double-stranded DNA (dsDNA) genome with 162 predicted open reading frames, for which a total of 62 SGIV proteins have been identified. One of these, ORF158L, bears no sequence homology to any other known protein. Knockdown of orf158L using antisense morpholino oligonucleotides resulted in a significant decrease in virus yield in grouper embryonic cells. ORF158L was observed in nuclei and virus assembly centers of virus-infected cells. This observation led us to study the structure and function of ORF158L. The crystal structure determined at 2.2-Å resolution reveals that ORF158L partially exhibits a structural resemblance to the histone binding region of antisilencing factor 1 (Asf1), a histone H3/H4 chaperon, despite the fact that there is no significant sequence identity between the two proteins. Interactions of ORF158L with the histone H3/H4 complex and H3 were demonstrated by isothermal titration calorimetry (ITC) experiments. Subsequently, the results of ITC studies on structure-based mutants of ORF158L suggested Arg67 and Ala93 were key residues for histone H3 interactions. Moreover, a combination of approaches of ORF158L knockdown and isobaric tags/mass spectrometry for relative and absolute quantifications (iTRAQ) revealed that ORF158L may be involved in both the regulation and the expression of histone H3 and H3 methylation. Our present studies suggest that ORF158L may function as a histone H3 chaperon, enabling it to control host cellular gene expression and to facilitate viral replication.
|
23
|
3D-interologs: an evolution database of physical protein-protein interactions across multiple genomes. BMC Genomics 2010; 11 Suppl 3:S7. [PMID: 21143789 PMCID: PMC2999352 DOI: 10.1186/1471-2164-11-s3-s7] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.5] [Indexed: 11/13/2022] Open
Abstract
Background Comprehensive exploration of protein-protein interactions is a challenging route to understand biological processes. For efficiently enlarging protein interactions annotated with residue-based binding models, we proposed a new concept "3D-domain interolog mapping" with a scoring system to explore all possible protein pairs between the two homolog families, derived from a known 3D-structure dimer (template), across multiple species. Each family consists of homologous proteins which have interacting domains of the template for studying domain interface evolution of two interacting homolog families.
Results The 3D-interologs database records the evolution of protein-protein interactions across multiple species. Based on "3D-domain interolog mapping" and a new scoring function, we infer 173,294 protein-protein interactions by using 1,895 three-dimensional (3D) structure heterodimers to search the UniProt database (4,826,134 protein sequences). The 3D-interologs database comprises 15,124 species and 283,980 protein-protein interactions, including the 173,294 inferred interactions (61%) and 110,686 interactions (39%) summarized from the IntAct database. For a protein-protein interaction, the 3D-interologs database shows functional annotations (e.g. Gene Ontology), interacting domains and binding models (e.g. hydrogen-bond interactions and conserved residues). Additionally, this database provides couple-conserved residues and the interacting evolution by exploring the interologs across multiple species. Experimental results reveal that the proposed scoring function obtains good agreement for the binding affinity of 275 mutated residues from the ASEdb. The precision and recall of our method are 0.52 and 0.34, respectively, by using 563 non-redundant heterodimers to search on the Integr8 database (549 complete genomes).
Conclusions Experimental results demonstrate that the proposed method can infer reliable physical protein-protein interactions and be useful for studying the protein-protein interaction evolution across multiple species. In addition, the top-ranked strategy and template interface score are able to significantly improve the accuracies of identifying protein-protein interactions in a complete genome. The 3D-interologs database is available at http://3D-interologs.life.nctu.edu.tw.
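The precision and recall quoted above can be related to raw prediction counts as follows. This is a minimal sketch: the function names are illustrative, and the F1 summary is an added convenience that the abstract itself does not report.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Return (precision, recall) from true-positive, false-positive
    and false-negative counts."""
    if tp + fp == 0 or tp + fn == 0:
        raise ValueError("precision/recall undefined without predictions or positives")
    return tp / (tp + fp), tp / (tp + fn)


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (not reported in the abstract)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values quoted in the abstract: precision 0.52, recall 0.34.
f1 = f1_score(0.52, 0.34)
print(f"F1 = {f1:.3f}")
```

A precision of 0.52 means roughly half of the predicted interactions were confirmed, while a recall of 0.34 means about a third of the known interactions were recovered.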
|
24
|
Predicting domain-domain interaction based on domain profiles with feature selection and support vector machines. BMC Bioinformatics 2010; 11:537. [PMID: 21034480 PMCID: PMC2989984 DOI: 10.1186/1471-2105-11-537] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.4] [Received: 09/28/2009] [Accepted: 10/29/2010] [Indexed: 11/23/2022] Open
Abstract
Background Protein-protein interaction (PPI) plays essential roles in cellular functions. The cost, time and other limitations associated with the current experimental methods have motivated the development of computational methods for predicting PPIs. As protein interactions generally occur via domains instead of the whole molecules, predicting domain-domain interaction (DDI) is an important step toward PPI prediction. Computational methods developed so far have utilized information from various sources at different levels, from primary sequences, to molecular structures, to evolutionary profiles. Results In this paper, we propose a computational method to predict DDI using support vector machines (SVMs), based on domains represented as interaction profile hidden Markov models (ipHMM) where interacting residues in domains are explicitly modeled according to the three dimensional structural information available at the Protein Data Bank (PDB). Features about the domains are extracted first as the Fisher scores derived from the ipHMM and then selected using singular value decomposition (SVD). Domain pairs are represented by concatenating their selected feature vectors, and classified by a support vector machine trained on these feature vectors. The method is tested by leave-one-out cross validation experiments with a set of interacting protein pairs adopted from the 3DID database. The prediction accuracy has shown significant improvement as compared to InterPreTS (Interaction Prediction through Tertiary Structure), an existing method for PPI prediction that also uses the sequences and complexes of known 3D structure. Conclusions We show that domain-domain interaction prediction can be significantly enhanced by exploiting information inherent in the domain profiles via feature selection based on Fisher scores, singular value decomposition and supervised learning based on support vector machines. 
Datasets and source code are freely available on the web at http://liao.cis.udel.edu/pub/svdsvm. Implemented in Matlab and supported on Linux and MS Windows.
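The evaluation scheme described in this abstract (domain pairs represented by concatenated feature vectors, scored by leave-one-out cross validation) can be sketched as below. This is not the authors' Matlab implementation: the feature values are made up, the Fisher-score and SVD steps are omitted, and a nearest-centroid classifier stands in for the support vector machine purely to keep the example dependency-free.

```python
from math import dist


def pair_vector(a, b):
    """Represent a domain pair by concatenating the two per-domain feature vectors."""
    return list(a) + list(b)


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def predict(train, x):
    """Nearest-centroid stand-in for the SVM: return the label whose
    class centroid is closest to x."""
    by_label = {}
    for vec, label in train:
        by_label.setdefault(label, []).append(vec)
    return min(by_label, key=lambda lab: dist(centroid(by_label[lab]), x))


def leave_one_out_accuracy(samples):
    """Hold out each sample once, train on the rest, and return the
    fraction of held-out samples predicted correctly."""
    correct = 0
    for i, (vec, label) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        if predict(train, vec) == label:
            correct += 1
    return correct / len(samples)


# Hypothetical per-domain features (placeholders for selected Fisher-score features).
domains = {"d1": [1.0, 0.9], "d2": [0.9, 1.0], "d3": [0.0, 0.1], "d4": [0.1, 0.0]}
samples = [
    (pair_vector(domains["d1"], domains["d2"]), 1),  # interacting pair
    (pair_vector(domains["d2"], domains["d1"]), 1),
    (pair_vector(domains["d3"], domains["d4"]), 0),  # non-interacting pair
    (pair_vector(domains["d4"], domains["d3"]), 0),
]
accuracy = leave_one_out_accuracy(samples)
print(accuracy)
```

Leave-one-out is a natural choice when labelled pairs are scarce, since every sample is used for testing exactly once while the rest are used for training.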
|
25
|
Structure and flexibility in cold-adapted iron superoxide dismutases: the case of the enzyme isolated from Pseudoalteromonas haloplanktis. J Struct Biol 2010; 172:343-52. [PMID: 20732427 DOI: 10.1016/j.jsb.2010.08.008] [Citation(s) in RCA: 61] [Impact Index Per Article: 4.4] [Received: 04/23/2010] [Revised: 07/29/2010] [Accepted: 08/18/2010] [Indexed: 10/19/2022]
Abstract
Superoxide dismutases (SODs) are metalloenzymes catalysing the dismutation of superoxide anion radicals into molecular oxygen and hydrogen peroxide. Here, we present the crystal structure of a cold-adapted Fe-SOD from the Antarctic eubacterium Pseudoalteromonas haloplanktis (PhSOD), and that of its complex with sodium azide. The structures were compared with those of the corresponding homologues having a high sequence identity with PhSOD, such as the mesophilic SOD from Escherichia coli (EcSOD) or Pseudomonas ovalis, and the psychrophilic SOD from Aliivibrio salmonicida (AsSOD). These enzymes shared a large structural similarity, such as a conserved tertiary structure and arrangement of the two monomers, and an almost identical total number of inter- and intramolecular hydrogen bonds and salt bridges. However, the two cold-adapted SODs showed an increased flexibility of the active site residues with respect to their mesophilic homologues. Structural information was combined with a characterisation of the chemical and thermal stability performed by CD and fluorescence measurements. Despite its psychrophilic origin, the denaturation temperature of PhSOD was comparable with that of the mesophilic EcSOD, whereas AsSOD showed a lower denaturation temperature. In contrast, the values of the denaturant concentration at the transition midpoint were in line with the psychrophilic/mesophilic origin of the proteins. These data provide additional support to the hypothesis that cold-adapted enzymes achieve efficient catalysis at low temperature by increasing the flexibility of their active site; moreover, our results underline how fine structural modifications can alter enzyme flexibility and/or stability without compromising the overall structure of typical rigid enzymes, such as SODs.
|
26
|
Abstract
The proteins in a cell often assemble into complexes to carry out their functions and play an essential role in biological processes. The PCFamily server identifies template-based homologous protein complexes [called protein complex family (PCF)] and infers functional modules of the query proteins. This server first finds homologous structure complexes of the query using BLASTP to search the structural template database (11 263 complexes). PCFamily then searches the homologous complexes of the templates (query) from a complete genomic database (Integr8 with 6 352 363 protein sequences in 2274 species). According to these homologous complexes across multiple species, this server infers binding models (e.g. hydrogen-bonds and conserved amino acids in the interfaces), functional modules, and the conserved interacting domains and Gene Ontology annotations of the PCF. Experimental results demonstrate that the PCFamily server can be useful for binding model visualizations and annotating the query proteins. We believe that the server is able to provide valuable insights for determining functional modules of biological networks across multiple species. The PCFamily server is available at http://pcfamily.life.nctu.edu.tw.
|
27
|
Abstract
Although protein–peptide interactions are estimated to constitute up to 40% of all protein interactions, relatively little information is available for the structural details of these interactions. Peptide-mediated interactions are a prime target for drug design because they are predominantly present in signaling and regulatory networks. A reliable data set of nonredundant protein–peptide complexes is indispensable as a basis for modeling and design, but current data sets for protein–peptide interactions are often biased towards specific types of interactions or are limited to interactions with small ligands. In PepX (http://pepx.switchlab.org), we have designed an unbiased and exhaustive data set of all protein–peptide complexes available in the Protein Data Bank with peptide lengths up to 35 residues. In addition, these complexes have been clustered based on their binding interfaces rather than sequence homology, providing a set of structurally diverse protein–peptide interactions. The final data set contains 505 unique protein–peptide interface clusters from 1431 complexes. Thorough annotation of each complex with both biological and structural information facilitates searching for and browsing through individual complexes and clusters. Moreover, we provide an additional source of data for peptide design by annotating peptides with naturally occurring backbone variations using fragment clusters from the BriX database.
|
28
|
Inferred Biomolecular Interaction Server--a web server to analyze and predict protein interacting partners and binding sites. Nucleic Acids Res 2009; 38:D518-24. [PMID: 19843613 PMCID: PMC2808861 DOI: 10.1093/nar/gkp842] [Citation(s) in RCA: 68] [Impact Index Per Article: 4.5] [Indexed: 11/12/2022] Open
Abstract
IBIS is the NCBI Inferred Biomolecular Interaction Server. This server organizes, analyzes and predicts interaction partners and locations of binding sites in proteins. IBIS provides annotations for different types of binding partners (protein, chemical, nucleic acid and peptides), and facilitates the mapping of a comprehensive biomolecular interaction network for a given protein query. IBIS reports interactions observed in experimentally determined structural complexes of a given protein, and at the same time IBIS infers binding sites/interacting partners by inspecting protein complexes formed by homologous proteins. Similar binding sites are clustered together based on their sequence and structure conservation. To emphasize biologically relevant binding sites, several algorithms are used for verification in terms of evolutionary conservation, biological importance of binding partners, size and stability of interfaces, as well as evidence from the published literature. IBIS is updated regularly and is freely accessible via http://www.ncbi.nlm.nih.gov/Structure/ibis/ibis.html.
|
29
|
CAPIH: a Web interface for comparative analyses and visualization of host-HIV protein-protein interactions. BMC Microbiol 2009; 9:164. [PMID: 19674441 PMCID: PMC2782265 DOI: 10.1186/1471-2180-9-164] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 03/30/2009] [Accepted: 08/12/2009] [Indexed: 01/27/2023] Open
Abstract
Background The Human Immunodeficiency Virus type one (HIV-1) is the main causative pathogen of the Acquired Immune Deficiency Syndrome (AIDS). A large number of HIV-1-related studies are based on three non-human model animals: chimpanzee, rhesus macaque, and mouse. However, the differences in host-HIV-1 interactions between human and these model organisms have remained unexplored.
Description Here we present CAPIH (Comparative Analysis of Protein Interactions for HIV-1), the first web-based interface to provide comparative information between human and the three model organisms in the context of host-HIV-1 protein interactions. CAPIH identifies genetic changes that occur in HIV-1-interacting host proteins. In a total of 1,370 orthologous protein sets, CAPIH identifies ~86,000 amino acid substitutions, ~21,000 insertions/deletions, and ~33,000 potential post-translational modifications that occur only in one of the four compared species. CAPIH also provides an interactive interface to display the host-HIV-1 protein interaction networks, the presence/absence of orthologous proteins in the model organisms in the networks, the genetic changes that occur in the protein nodes, and the functional domains and potential protein interaction hot sites that may be affected by the genetic changes. The CAPIH interface is freely accessible at http://bioinfo-dbb.nhri.org.tw/capih.
Conclusion CAPIH exemplifies that large divergences exist in disease-associated proteins between human and the model animals. Since all newly developed medications must be tested in model animals before entering clinical trials, it is advisable that comparative analyses be performed to ensure proper translation of animal-based studies. In the case of AIDS, the host-HIV-1 protein interactions apparently differ to a great extent among the compared species. An integrated protein network comparison among the four species will probably shed new light on AIDS studies.
|
30
|
Abstract
‘Protinfo PPC’ (Prediction of Protein Complex) is a web server that predicts atomic level structures of interacting proteins from their amino-acid sequences. It uses the interolog method to search for experimental protein complex structures that are homologous to the input sequences submitted by a user. These structures are then used as starting templates to generate protein complex models, which are returned to the user in Protein Data Bank format via email. The server supports modeling of both homo and hetero multimers and generally produces full atomic level models (including insertion/deletion regions) of protein complexes as long as at least one putative homologous template for the query sequences is found. The modeling pipeline behind Protinfo PPC has been rigorously benchmarked and proven to produce highly accurate protein complex models. The fully automated all atom comparative modeling service for protein complexes provided by Protinfo PPC server offers wide capabilities ranging from prediction of protein complex interactions to identification of possible interaction sites, which will be useful for researchers studying these topics. The Protinfo PPC web server is available at http://protinfo.compbio.washington.edu/ppc/
|
31
|
PPISearch: a web server for searching homologous protein-protein interactions across multiple species. Nucleic Acids Res 2009; 37:W369-75. [PMID: 19417070 PMCID: PMC2703927 DOI: 10.1093/nar/gkp309] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.9] [Indexed: 12/23/2022] Open
Abstract
As an increasing number of reliable protein–protein interactions (PPIs) become available and high-throughput experimental methods provide systematic identification of PPIs, there is a growing need for fast and accurate methods for discovering homologous PPIs of a newly determined PPI. PPISearch is a web server that rapidly identifies homologous PPIs (called PPI family) and infers transferability of interacting domains and functions of a query protein pair. This server first identifies two homologous families of the query, respectively, by using BLASTP to scan an annotated PPIs database (290 137 PPIs in 576 species), which is a collection of five public databases. We determined homologous PPIs from protein pairs of homologous families when these protein pairs were in the annotated database and had significant joint sequence similarity (E ≤ 10^−40) with the query. Using these homologous PPIs across multiple species, this server infers the conserved domain–domain pairs (Pfam and InterPro domains) and function pairs (Gene Ontology annotations). Our results demonstrate that the transferability of conserved domain–domain pairs between homologous PPIs and query pairs is 88% using 103 762 PPI queries, and the transferability of conserved function pairs is 69% based on 106 997 PPI queries. The PPISearch server should be useful for searching homologous PPIs and PPI families across multiple species. The PPISearch server is available through the website at http://gemdock.life.nctu.edu.tw/ppisearch/.
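The selection rule in this abstract (keep a candidate pair only if it is in the annotated database and its joint sequence similarity to the query passes the E-value cutoff) can be sketched as follows. The abstract does not define how the two per-protein E-values are combined, so the geometric mean used here is an assumption, and all names and example values are hypothetical. Working in log10 space avoids floating-point underflow for very small E-values.

```python
from math import inf, log10


def log10_e(e_value: float) -> float:
    """log10 of an E-value, treating a reported 0.0 as minus infinity."""
    return -inf if e_value == 0 else log10(e_value)


def joint_log10_evalue(e_a: float, e_b: float) -> float:
    """ASSUMPTION: joint similarity is the geometric mean of the two
    E-values, i.e. the arithmetic mean of their log10 values."""
    return (log10_e(e_a) + log10_e(e_b)) / 2


def homologous_ppis(hits_a, hits_b, annotated, max_log10=-40.0):
    """hits_a / hits_b map candidate protein IDs to their E-values against
    query proteins A / B; annotated is a set of known interacting pairs.
    Return annotated pairs whose joint E-value is at most 10**max_log10."""
    found = []
    for pa, ea in hits_a.items():
        for pb, eb in hits_b.items():
            known = (pa, pb) in annotated or (pb, pa) in annotated
            if known and joint_log10_evalue(ea, eb) <= max_log10:
                found.append((pa, pb))
    return found


# Hypothetical BLASTP hits and annotated interactions.
hits_a = {"A1": 1e-60, "A2": 1e-30}
hits_b = {"B1": 1e-50, "B2": 1e-45}
annotated = {("A1", "B1"), ("A2", "B2"), ("A1", "B9")}
result = homologous_ppis(hits_a, hits_b, annotated)
print(result)
```

In this toy run only the A1/B1 pair survives: A2/B2 is annotated but its joint E-value (about 10^−37.5 under the assumed combination) misses the cutoff.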
|
32
|
A survey of available tools and web servers for analysis of protein-protein interactions and interfaces. Brief Bioinform 2009; 10:217-32. [PMID: 19240123 DOI: 10.1093/bib/bbp001] [Citation(s) in RCA: 119] [Impact Index Per Article: 7.9] [Indexed: 01/10/2023] Open
Abstract
The unanimous agreement that cellular processes are (largely) governed by interactions between proteins has led to enormous community efforts culminating in overwhelming information relating to these proteins; to the regulation of their interactions, to the way in which they interact and to the function which is determined by these interactions. These data have been organized in databases and servers. However, to make these really useful, it is essential not only to be aware of these, but in particular to have a working knowledge of which tools to use for a given problem; what are the tool advantages and drawbacks; and no less important how to combine these for a particular goal since usually it is not one tool, but some combination of tool-modules that is needed. This is the goal of this review.
|
33
|
HOMCOS: a server to predict interacting protein pairs and interacting sites by homology modeling of complex structures. Nucleic Acids Res 2008; 36:W185-9. [PMID: 18442990 PMCID: PMC2447736 DOI: 10.1093/nar/gkn218] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.8] [Received: 02/02/2008] [Revised: 04/04/2008] [Accepted: 04/09/2008] [Indexed: 11/18/2022] Open
Abstract
As protein-protein interactions are crucial in most biological processes, it is valuable to understand how and where protein pairs interact. We developed a web server HOMCOS (Homology Modeling of Complex Structure, http://biunit.naist.jp/homcos) to predict interacting protein pairs and interacting sites by homology modeling of complex structures. Our server provides three services. The first is modeling heterodimers from two query amino acid sequences posted by users. The server performs BLAST searches to identify homologous templates in the latest representative dataset of heterodimer structures generated from the PQS database. Structure validity is evaluated by the combination of sequence similarity and knowledge-based contact potential energy as previously described. The server generates a sequence-replaced model PDB file and a MODELLER script to build full atomic models of complex structures. The second service is modeling homodimers from one query sequence. The third service is identification of potentially interacting proteins for one query sequence. The server searches the dataset of heterodimer structures for a homologous template and outputs candidate interacting sequences from the UniProt database that are homologous to the template's interacting partner proteins. These features are useful for a wide range of researchers to predict putative interaction sites and interacting proteins.
|