1
Phung DK, Pilotto S, Matelska D, Blombach F, Pinotsis N, Hovan L, Gervasio FL, Werner F. Archaeal NusA2 is the ancestor of ribosomal protein eS7 in eukaryotes. Structure 2025; 33:149-159.e6. [PMID: 39504966 DOI: 10.1016/j.str.2024.10.019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2024] [Revised: 09/06/2024] [Accepted: 10/10/2024] [Indexed: 11/08/2024]
Abstract
N-utilization substance A (NusA) is a regulatory factor with pleiotropic functions in gene expression in bacteria. Archaea encode two conserved small proteins, NusA1 and NusA2, with domains orthologous to the two RNA binding K Homology (KH) domains of NusA. Here, we report the crystal structures of NusA2 from Sulfolobus acidocaldarius and Saccharolobus solfataricus obtained at 3.1 Å and 1.68 Å, respectively. NusA2 comprises an N-terminal zinc finger followed by two KH-like domains lacking the GXXG signature. Despite the loss of the GXXG motif, NusA2 binds single-stranded RNA. Mutations in the zinc finger domain compromise the structural integrity of NusA2 at high temperatures and molecular dynamics simulations indicate that zinc binding provides an energy barrier preventing the domain from reaching unfolded states. A structure-guided phylogenetic analysis of the KH-like domains supports the notion that the NusA2 clade is ancestral to the ribosomal protein eS7 in eukaryotes, implying a potential role of NusA2 in translation.
Affiliation(s)
- Duy Khanh Phung
- RNAP Laboratory, Institute for Structural and Molecular Biology, University College London, London WC1E 6BT, UK
- Simona Pilotto
- RNAP Laboratory, Institute for Structural and Molecular Biology, University College London, London WC1E 6BT, UK
- Dorota Matelska
- RNAP Laboratory, Institute for Structural and Molecular Biology, University College London, London WC1E 6BT, UK
- Fabian Blombach
- RNAP Laboratory, Institute for Structural and Molecular Biology, University College London, London WC1E 6BT, UK
- Nikos Pinotsis
- Institute for Structural and Molecular Biology, Birkbeck College, London WC1E 7HX, UK
- Ladislav Hovan
- Pharmaceutical Sciences, University of Geneva, 1206 Genève, Switzerland
- Francesco Luigi Gervasio
- Pharmaceutical Sciences, University of Geneva, 1206 Genève, Switzerland; Institute of Pharmaceutical Sciences of Western Switzerland (ISPSO), University of Geneva, 1206 Genève, Switzerland; Department of Chemistry, University College London, London WC1E 6BT, UK
- Finn Werner
- RNAP Laboratory, Institute for Structural and Molecular Biology, University College London, London WC1E 6BT, UK.
2
Zhao B, Basu S, Kurgan L. DescribePROT Database of Residue-Level Protein Structure and Function Annotations. Methods Mol Biol 2025; 2867:169-184. [PMID: 39576581 DOI: 10.1007/978-1-0716-4196-5_10] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2024]
Abstract
DescribePROT is a freely available online database of structural and functional descriptors of proteins at the amino acid level. It provides access to 13 diverse descriptors that include sequence conservation, putative secondary structure, solvent accessibility, intrinsic disorder, and signal peptides, and putative annotations of residues that interact with proteins, peptides and nucleic acids. These data can be used to elucidate protein functions, to support efforts to develop therapeutics, and to develop and evaluate future predictors of protein structure and function. DescribePROT includes 7.8 billion predictions for 1.4 million proteins from 83 complete proteomes of popular model organisms. This information can be downloaded at multiple levels of scope (entire database, specific organisms, and individual proteins) and can be interacted with using a graphical interface that simultaneously displays data on multiple descriptors. We describe the contents of this resource, provide directions on how to use its interface, and offer instructions on how to obtain and interact with the underlying data. Moreover, we briefly discuss plans for a future expansion of this database. DescribePROT is available at http://biomine.cs.vcu.edu/servers/DESCRIBEPROT/ .
Affiliation(s)
- Bi Zhao
- Genomics program, College of Public Health, University of South Florida, Tampa, FL, USA
- Sushmita Basu
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
- Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA.
3
Dai Y, Ma S, Guo Y, Zhang X, Liu D, Gao Y, Zhai C, Chen Q, Xiao S, Zhang Z, Yu L. Evolution and Expression of the Meprin and TRAF Homology Domain-Containing Gene Family in Solanaceae. Int J Mol Sci 2023; 24:ijms24108782. [PMID: 37240124 DOI: 10.3390/ijms24108782] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2023] [Revised: 05/01/2023] [Accepted: 05/10/2023] [Indexed: 05/28/2023] Open
Abstract
Meprin and TRAF homology (MATH)-domain-containing proteins are pivotal in modulating plant development and environmental stress responses. To date, members of the MATH gene family have been identified only in a few plant species, including Arabidopsis thaliana, Brassica rapa, maize, and rice, and the functions of this gene family in other economically important crops, especially the Solanaceae family, remain unclear. The present study identified and analyzed 58 MATH genes from three Solanaceae species, including tomato (Solanum lycopersicum), potato (Solanum tuberosum), and pepper (Capsicum annuum). Phylogenetic analysis and domain organization classified these MATH genes into four groups, consistent with those based on motif organization and gene structure. Synteny analysis found that segmental and tandem duplication might have contributed to MATH gene expansion in the tomato and the potato, respectively. Collinearity analysis revealed high conservation among Solanaceae MATH genes. Further cis-regulatory element prediction and gene expression analysis showed that Solanaceae MATH genes play essential roles during development and stress response. These findings provide a theoretical basis for other functional studies on Solanaceae MATH genes.
Affiliation(s)
- Yangshuo Dai
- State Key Laboratory of Biocontrol, Guangdong Key Laboratory of Plant Resources, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, China
- Guangdong Provincial Key Laboratory of High Technology for Plant Protection, Plant Protection Research Institute, Guangdong Academy of Agricultural Sciences, Guangzhou 510640, China
- Sirui Ma
- State Key Laboratory of Biocontrol, Guangdong Key Laboratory of Plant Resources, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, China
- Yixian Guo
- State Key Laboratory of Biocontrol, Guangdong Key Laboratory of Plant Resources, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, China
- Xue Zhang
- State Key Laboratory of Biocontrol, Guangdong Key Laboratory of Plant Resources, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, China
- Di Liu
- State Key Laboratory of Biocontrol, Guangdong Key Laboratory of Plant Resources, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, China
- Yan Gao
- Guangdong Provincial Key Laboratory of High Technology for Plant Protection, Plant Protection Research Institute, Guangdong Academy of Agricultural Sciences, Guangzhou 510640, China
- Chendong Zhai
- Guangdong Provincial Key Laboratory of High Technology for Plant Protection, Plant Protection Research Institute, Guangdong Academy of Agricultural Sciences, Guangzhou 510640, China
- Qinfang Chen
- State Key Laboratory of Biocontrol, Guangdong Key Laboratory of Plant Resources, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, China
- Shi Xiao
- State Key Laboratory of Biocontrol, Guangdong Key Laboratory of Plant Resources, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, China
- Zhenfei Zhang
- Guangdong Provincial Key Laboratory of High Technology for Plant Protection, Plant Protection Research Institute, Guangdong Academy of Agricultural Sciences, Guangzhou 510640, China
- Lujun Yu
- State Key Laboratory of Biocontrol, Guangdong Key Laboratory of Plant Resources, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, China
4
Shen Y, Parks JM, Smith JC. HLA Class I Supertype Classification Based on Structural Similarity. J Immunol 2023; 210:103-114. [PMID: 36453976 DOI: 10.4049/jimmunol.2200685] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2022] [Accepted: 10/31/2022] [Indexed: 12/24/2022]
Abstract
HLA class I proteins, a critical component in adaptive immunity, bind and present intracellular Ags to CD8+ T cells. The extreme polymorphism of HLA genes and associated peptide binding specificities leads to challenges in various endeavors, including neoantigen vaccine development, disease association studies, and HLA typing. Supertype classification, defined by clustering functionally similar HLA alleles, has proven helpful in reducing the complexity of distinguishing alleles. However, determining supertypes via experiments is impractical, and current in silico classification methods exhibit limitations in stability and functional relevance. In this study, by incorporating three-dimensional structures we present a method for classifying HLA class I molecules with improved breadth, accuracy, stability, and flexibility. Critical for these advances is our finding that structural similarity highly correlates with peptide binding specificity. The new classification should be broadly useful in peptide-based vaccine development and HLA-disease association studies.
Affiliation(s)
- Yue Shen
- UT-ORNL Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, TN
- Jerry M Parks
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN
- Jeremy C Smith
- UT-ORNL Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, TN; Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN; Department of Biochemistry & Cellular and Molecular Biology, University of Tennessee, Knoxville, TN
5
Biró B, Zhao B, Kurgan L. Complementarity of the residue-level protein function and structure predictions in human proteins. Comput Struct Biotechnol J 2022; 20:2223-2234. [PMID: 35615015 PMCID: PMC9118482 DOI: 10.1016/j.csbj.2022.05.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2022] [Revised: 05/02/2022] [Accepted: 05/02/2022] [Indexed: 11/24/2022] Open
Abstract
Sequence-based predictors of the residue-level protein function and structure cover a broad spectrum of characteristics including intrinsic disorder, secondary structure, solvent accessibility and binding to nucleic acids. They were catalogued and evaluated in numerous surveys and assessments. However, methods focusing on a given characteristic are studied separately from predictors of other characteristics, while they are typically used on the same proteins. We fill this void by studying complementarity of a representative collection of methods that target different predictions using a large, taxonomically consistent, and low similarity dataset of human proteins. First, we bridge the gap between the communities that develop structure-trained vs. disorder-trained predictors of binding residues. Motivated by a recent study of the protein-binding residue predictions, we empirically find that combining the structure-trained and disorder-trained predictors of the DNA-binding and RNA-binding residues leads to substantial improvements in predictive quality. Second, we investigate whether diverse predictors generate results that accurately reproduce relations between secondary structure, solvent accessibility, interaction sites, and intrinsic disorder that are present in the experimental data. Our empirical analysis concludes that predictions accurately reflect all combinations of these relations. Altogether, this study provides unique insights that support combining results produced by diverse residue-level predictors of protein function and structure.
Affiliation(s)
- Bálint Biró
- Institute of Genetics and Biotechnology, Hungarian University of Agriculture and Life Sciences, Gödöllő, Hungary
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States
- Bi Zhao
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States
- Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States
6
Dynamic, but Not Necessarily Disordered, Human-Virus Interactions Mediated through SLiMs in Viral Proteins. Viruses 2021; 13:v13122369. [PMID: 34960638 PMCID: PMC8703344 DOI: 10.3390/v13122369] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 10/31/2021] [Revised: 11/15/2021] [Accepted: 11/16/2021] [Indexed: 12/13/2022] Open
Abstract
Most viruses have small genomes that encode proteins needed to perform essential enzymatic functions. Across virus families, primary enzyme functions are under functional constraint; however, secondary functions mediated by exposed protein surfaces that promote interactions with the host proteins may be less constrained. Viruses often form transient interactions with host proteins through conformationally flexible interfaces. Exposed flexible amino acid residues are known to evolve rapidly suggesting that secondary functions may generate diverse interaction potentials between viruses within the same viral family. One mechanism of interaction is viral mimicry through short linear motifs (SLiMs) that act as functional signatures in host proteins. Viral SLiMs display specific patterns of adjacent amino acids that resemble their host SLiMs and may occur by chance numerous times in viral proteins due to mutational and selective processes. Through mimicry of SLiMs in the host cell proteome, viruses can interfere with the protein interaction network of the host and utilize the host-cell machinery to their benefit. The overlap between rapidly evolving protein regions and the location of functionally critical SLiMs suggest that these motifs and their functional potential may be rapidly rewired causing variation in pathogenicity, infectivity, and virulence of related viruses. The following review provides an overview of known viral SLiMs with select examples of their role in the life cycle of a virus, and a discussion of the structural properties of experimentally validated SLiMs highlighting that a large portion of known viral SLiMs are devoid of predicted intrinsic disorder based on the viral SLiMs from the ELM database.
7
Zhang J, Ghadermarzi S, Katuwawala A, Kurgan L. DNAgenie: accurate prediction of DNA-type-specific binding residues in protein sequences. Brief Bioinform 2021; 22:6355416. [PMID: 34415020 DOI: 10.1093/bib/bbab336] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 03/29/2021] [Revised: 07/02/2021] [Accepted: 07/28/2021] [Indexed: 01/02/2023] Open
Abstract
Efforts to elucidate protein-DNA interactions at the molecular level rely in part on accurate predictions of DNA-binding residues in protein sequences. While there are over a dozen computational predictors of the DNA-binding residues, they are DNA-type agnostic and significantly cross-predict residues that interact with other ligands as DNA binding. We leverage a custom-designed machine learning architecture to introduce DNAgenie, first-of-its-kind predictor of residues that interact with A-DNA, B-DNA and single-stranded DNA. DNAgenie uses a comprehensive physiochemical profile extracted from an input protein sequence and implements a two-step refinement process to provide accurate predictions and to minimize the cross-predictions. Comparative tests on an independent test dataset demonstrate that DNAgenie outperforms the current methods that we adapt to predict residue-level interactions with the three DNA types. Further analysis finds that the use of the second (refinement) step leads to a substantial reduction in the cross predictions. Empirical tests show that DNAgenie's outputs that are converted to coarse-grained protein-level predictions compare favorably against recent tools that predict which DNA-binding proteins interact with double-stranded versus single-stranded DNAs. Moreover, predictions from the sequences of the whole human proteome reveal that the results produced by DNAgenie substantially overlap with the known DNA-binding proteins while also including promising leads for several hundred previously unknown putative DNA binders. These results suggest that DNAgenie is a valuable tool for the sequence-based characterization of protein functions. The DNAgenie's webserver is available at http://biomine.cs.vcu.edu/servers/DNAgenie/.
Affiliation(s)
- Jian Zhang
- School of Computer and Information Technology at the Xinyang Normal University, No.237, Nanhu Road, Xinyang 464000, Henan Province, P.R. China
- Sina Ghadermarzi
- Department of Computer Science at the Virginia Commonwealth University, 401 West Main Street, Room E4225, Richmond, Virginia 23284, USA
- Akila Katuwawala
- Department of Computer Science at the Virginia Commonwealth University, 401 West Main Street, Room E4225, Richmond, Virginia 23284, USA
- Lukasz Kurgan
- Department of Computer Science at the Virginia Commonwealth University, 401 West Main Street, Room E4225, Richmond, Virginia 23284, USA
8
Zhang J, Ghadermarzi S, Kurgan L. Prediction of protein-binding residues: dichotomy of sequence-based methods developed using structured complexes versus disordered proteins. Bioinformatics 2021; 36:4729-4738. [PMID: 32860044 DOI: 10.1093/bioinformatics/btaa573] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 03/07/2020] [Revised: 05/22/2020] [Accepted: 06/10/2020] [Indexed: 01/08/2023] Open
Abstract
MOTIVATION There are over 30 sequence-based predictors of the protein-binding residues (PBRs). They use either structure-annotated or disorder-annotated training datasets, potentially creating a dichotomy where the structure-/disorder-specific models may not be able to cross-over to accurately predict the other type. Moreover, the structure-trained predictors were shown to substantially cross-predict PBRs among residues that interact with non-protein partners (nucleic acids and small ligands). We address these issues by performing first-of-its-kind comparative study of a representative collection of disorder- and structure-trained predictors using a comprehensive benchmark set with the structure- and disorder-derived annotations of PBRs (to analyze the cross-over) and the protein-, nucleic acid- and small ligand-binding proteins (to study the cross-predictions). RESULTS Three predictors provide accurate results: SCRIBER, ANCHOR and disoRDPbind. Some of the structure-trained methods make accurate predictions on the structure-annotated proteins. Similarly, the disorder-trained predictors predict well on the disorder-annotated proteins. However, the considered predictors generally fail to cross-over, with the exception of SCRIBER. Our study also reveals that virtually all methods substantially cross-predict PBRs, except for SCRIBER for the structure-annotated proteins and disoRDPbind for the disorder-annotated proteins. We formulate a novel hybrid predictor, hybridPBRpred, that combines results produced by disoRDPbind and SCRIBER to accurately predict disorder- and structure-annotated PBRs. HybridPBRpred generates accurate results that cross-over structure- and disorder-annotated proteins and produces relatively low amount of cross-predictions, offering an accurate alternative to predict PBRs. AVAILABILITY AND IMPLEMENTATION HybridPBRpred webserver, benchmark dataset and supplementary information are available at http://biomine.cs.vcu.edu/servers/hybridPBRpred/. 
SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jian Zhang
- School of Computer and Information Technology, Xinyang Normal University, Xinyang 464000, China
- Sina Ghadermarzi
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
- Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
9
Zhao B, Katuwawala A, Oldfield CJ, Dunker AK, Faraggi E, Gsponer J, Kloczkowski A, Malhis N, Mirdita M, Obradovic Z, Söding J, Steinegger M, Zhou Y, Kurgan L. DescribePROT: database of amino acid-level protein structure and function predictions. Nucleic Acids Res 2021; 49:D298-D308. [PMID: 33119734 PMCID: PMC7778963 DOI: 10.1093/nar/gkaa931] [Citation(s) in RCA: 50] [Impact Index Per Article: 12.5] [Received: 08/14/2020] [Revised: 09/11/2020] [Accepted: 10/05/2020] [Indexed: 12/30/2022] Open
Abstract
We present DescribePROT, the database of predicted amino acid-level descriptors of structure and function of proteins. DescribePROT delivers a comprehensive collection of 13 complementary descriptors predicted using 10 popular and accurate algorithms for 83 complete proteomes that cover key model organisms. The current version includes 7.8 billion predictions for close to 600 million amino acids in 1.4 million proteins. The descriptors encompass sequence conservation, position specific scoring matrix, secondary structure, solvent accessibility, intrinsic disorder, disordered linkers, signal peptides, MoRFs and interactions with proteins, DNA and RNAs. Users can search DescribePROT by the amino acid sequence and the UniProt accession number and entry name. The pre-computed results are made available instantaneously. The predictions can be accessed via an interactive graphical interface that allows simultaneous analysis of multiple descriptors and can also be downloaded in structured formats at the protein, proteome and whole database scale. The putative annotations included in DescribePROT are useful for a broad range of studies, including investigations of protein function, applied projects focusing on therapeutics and diseases, and the development of predictors for other protein sequence descriptors. Future releases will expand the coverage of DescribePROT. DescribePROT can be accessed at http://biomine.cs.vcu.edu/servers/DESCRIBEPROT/.
Affiliation(s)
- Bi Zhao
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
- Akila Katuwawala
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
- A Keith Dunker
- Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN, USA
- Eshel Faraggi
- Battelle Center for Mathematical Medicine at the Nationwide Children's Hospital, and Department of Pediatrics, The Ohio State University, Columbus, OH, USA
- Jörg Gsponer
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada
- Andrzej Kloczkowski
- Battelle Center for Mathematical Medicine at the Nationwide Children's Hospital, and Department of Pediatrics, The Ohio State University, Columbus, OH, USA
- Nawar Malhis
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada
- Milot Mirdita
- Quantitative and Computational Biology, Max Planck Institute for Biophysical Chemistry, Göttingen, Germany
- Zoran Obradovic
- Department of Computer and Information Sciences, Temple University, Philadelphia, PA, USA
- Johannes Söding
- Quantitative and Computational Biology, Max Planck Institute for Biophysical Chemistry, Göttingen, Germany
- Martin Steinegger
- School of Biological Sciences and Institute of Molecular Biology & Genetics, Seoul National University, Seoul, Republic of Korea
- Yaoqi Zhou
- Institute for Glycomics, Griffith University, Gold Coast, Queensland, Australia
- Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
10
Zhang F, Shi W, Zhang J, Zeng M, Li M, Kurgan L. PROBselect: accurate prediction of protein-binding residues from proteins sequences via dynamic predictor selection. Bioinformatics 2020; 36:i735-i744. [DOI: 10.1093/bioinformatics/btaa806] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Accepted: 09/07/2020] [Indexed: 12/13/2022] Open
Abstract
Motivation
Knowledge of protein-binding residues (PBRs) improves our understanding of protein−protein interactions, contributes to the prediction of protein functions and facilitates protein−protein docking calculations. While many sequence-based predictors of PBRs were published, they offer modest levels of predictive performance and most of them cross-predict residues that interact with other partners. One unexplored option to improve the predictive quality is to design consensus predictors that combine results produced by multiple methods.
Results
We empirically investigate predictive performance of a representative set of nine predictors of PBRs. We report substantial differences in predictive quality when these methods are used to predict individual proteins, which contrast with the dataset-level benchmarks that are currently used to assess and compare these methods. Our analysis provides new insights for the cross-prediction concern, dissects complementarity between predictors and demonstrates that predictive performance of the top methods depends on unique characteristics of the input protein sequence. Using these insights, we developed PROBselect, first-of-its-kind consensus predictor of PBRs. Our design is based on the dynamic predictor selection at the protein level, where the selection relies on regression-based models that accurately estimate predictive performance of selected predictors directly from the sequence. Empirical assessment using a low-similarity test dataset shows that PROBselect provides significantly improved predictive quality when compared with the current predictors and conventional consensuses that combine residue-level predictions. Moreover, PROBselect informs the users about the expected predictive quality for the prediction generated from a given input protein.
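The protein-level dynamic predictor selection described above can be illustrated with a minimal Python sketch. This is not the PROBselect implementation. The function `select_and_predict` and the toy dictionaries are hypothetical names, and the toy estimators only stand in for the regression-based models that, according to the abstract, estimate each predictor's expected performance directly from the sequence.

```python
from typing import Callable, Dict, List, Tuple

Predictor = Callable[[str], List[float]]   # sequence -> per-residue binding propensities
Estimator = Callable[[str], float]         # sequence -> expected predictive quality


def select_and_predict(
    sequence: str,
    predictors: Dict[str, Predictor],
    estimators: Dict[str, Estimator],
) -> Tuple[str, List[float], float]:
    """Run only the predictor whose estimated quality is highest for this protein."""
    expected = {name: estimators[name](sequence) for name in predictors}
    best = max(expected, key=expected.get)
    return best, predictors[best](sequence), expected[best]


# Toy stand-ins (assumptions for illustration only, not real predictors).
toy_predictors: Dict[str, Predictor] = {
    "method_a": lambda seq: [0.9 if aa in "KR" else 0.1 for aa in seq],
    "method_b": lambda seq: [0.5 for _ in seq],
}
toy_estimators: Dict[str, Estimator] = {
    "method_a": lambda seq: seq.count("K") / len(seq),
    "method_b": lambda seq: 0.3,
}

name, scores, quality = select_and_predict("MKKRA", toy_predictors, toy_estimators)
```

Returning the estimated quality alongside the prediction mirrors the abstract's point that the user is told what predictive quality to expect for a given input protein.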
Availability and implementation
PROBselect is available at http://bioinformatics.csu.edu.cn/PROBselect/home/index.
Supplementary information
Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Fuhao Zhang
- Hunan Provincial Key Laboratory on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Wenbo Shi
- Hunan Provincial Key Laboratory on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Jian Zhang
- School of Computer and Information Technology, Xinyang Normal University, Xinyang 464000, China
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
- Min Zeng
- Hunan Provincial Key Laboratory on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Min Li
- Hunan Provincial Key Laboratory on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
11
Salmanian S, Pezeshk H, Sadeghi M. Inter-protein residue covariation information unravels physically interacting protein dimers. BMC Bioinformatics 2020; 21:584. [PMID: 33334319 PMCID: PMC7745481 DOI: 10.1186/s12859-020-03930-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 08/12/2020] [Accepted: 12/09/2020] [Indexed: 01/04/2023] Open
Abstract
BACKGROUND Predicting physical interaction between proteins is one of the greatest challenges in computational biology. There is a considerable variety of protein interactions and a huge number of protein sequences and synthetic peptides with unknown interacting counterparts. Most co-evolutionary methods discover a combination of physical interplays and functional associations. However, there are only a handful of approaches which specifically infer physical interactions. Hybrid co-evolutionary methods exploit inter-protein residue coevolution to unravel specific physically interacting proteins. In this study, we introduce a hybrid co-evolutionary-based approach to predict physical interplays between pairs of protein families, starting from protein sequences only. RESULTS In the present analysis, pairs of multiple sequence alignments are constructed for each dimer and the covariation between residues in those pairs is calculated by CCMpred (Contacts from Correlated Mutations predicted) and three mutual information based approaches for ten accessible surface area threshold groups. Then, whole residue couplings between proteins of each dimer are unified into a single Frobenius norm value. Norms of residue contact matrices of all dimers in different accessible surface area thresholds are fed into a support vector machine as single or multiple feature models. The results of training the classifiers by single features show no apparent differences in accuracy among methods for different accessible surface area thresholds. Nevertheless, the mutual information product and context likelihood of relatedness procedures may roughly have overall higher and lower performance, respectively, than the other two methods for different accessible surface area cut-offs. The results also demonstrate that training the support vector machine with multiple norm features for several accessible surface area thresholds leads to a considerable improvement of prediction performance.
In this context, CCMpred roughly achieves an overall better performance than mutual information based approaches. The best accuracy, sensitivity, specificity, precision and negative predictive value for that method are 0.98, 1, 0.962, 0.96, and 0.962, respectively. CONCLUSIONS In this paper, by feeding norm values of protein dimers into support vector machines in different accessible surface area thresholds, we demonstrate that even a small number of proteins in pairs of multiple alignments could allow one to accurately discriminate between positive and negative dimers.
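The step that unifies all inter-protein residue couplings of a dimer into a single Frobenius norm can be sketched in Python as follows. This is only an illustration of the norm itself, under the assumption that the couplings are stored as a rectangular matrix (rows for residues of one protein, columns for residues of the other). The function name `frobenius_norm` and the example scores are hypothetical and do not come from the paper.

```python
import math
from typing import List, Sequence


def frobenius_norm(coupling: Sequence[Sequence[float]]) -> float:
    """Square root of the sum of squared entries of a coupling matrix."""
    return math.sqrt(sum(value * value for row in coupling for value in row))


# Hypothetical coupling scores between 2 residues of protein A (rows)
# and 3 residues of protein B (columns).
example: List[List[float]] = [
    [0.0, 3.0, 0.0],
    [4.0, 0.0, 0.0],
]
norm = frobenius_norm(example)
```

One such scalar per dimer and per accessible surface area threshold would then serve as a feature for the classifier that the abstract describes.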
Affiliation(s)
- Sara Salmanian
- Department of Bioinformatics, Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
| | - Hamid Pezeshk
- School of Mathematics, Statistics and Computer Science, College of Science, University of Tehran, Tehran, Iran
- Present Address: Department of Mathematics and Statistics, Concordia University, Montreal, Canada
- School of Biological Sciences, Institute for Research in Fundamental Sciences, Tehran, Iran
| | - Mehdi Sadeghi
- National Institute of Genetic Engineering and Biotechnology, Tehran, Iran
|
12
|
Zhang J, Kurgan L. SCRIBER: accurate and partner type-specific prediction of protein-binding residues from proteins sequences. Bioinformatics 2020; 35:i343-i353. [PMID: 31510679 PMCID: PMC6612887 DOI: 10.1093/bioinformatics/btz324] [Citation(s) in RCA: 76] [Impact Index Per Article: 15.2] [Indexed: 01/11/2023]
Abstract
Motivation Accurate prediction of protein-binding residues (PBRs) enhances understanding of the molecular-level rules governing protein–protein interactions, helps protein–protein docking and facilitates annotation of protein functions. Recent studies show that current sequence-based predictors of PBRs severely cross-predict residues that interact with other types of partners (e.g. RNA and DNA) as PBRs. Moreover, these methods are relatively slow, prohibiting genome-scale use. Results We propose a novel, accurate and fast sequence-based predictor of PBRs that minimizes the cross-predictions. Our SCRIBER (SeleCtive pRoteIn-Binding rEsidue pRedictor) method takes advantage of three innovations: a comprehensive dataset that covers multiple types of binding residues, novel types of inputs that are relevant to the prediction of PBRs, and an architecture that is tailored to reduce the cross-predictions. The dataset includes complete protein chains and offers improved coverage of binding annotations that are transferred from multiple protein–protein complexes. We utilize an innovative two-layer architecture in which the first layer generates predictions of protein-binding, RNA-binding, DNA-binding and small ligand-binding residues. The second layer re-predicts PBRs by reducing the overlap between PBRs and the other types of binding residues produced in the first layer. Empirical tests on an independent test dataset reveal that SCRIBER significantly outperforms current predictors and that all three innovations contribute to its high predictive performance. SCRIBER reduces cross-predictions by between 41% and 69%, and our conservative estimates show that it is at least 3 times faster. We provide putative PBRs produced by SCRIBER for the entire human proteome and use these results to hypothesize that about 14% of currently known human protein domains bind proteins. Availability and implementation The SCRIBER webserver is available at http://biomine.cs.vcu.edu/servers/SCRIBER/. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jian Zhang
- School of Computer and Information Technology, Xinyang Normal University, Xinyang, China; Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
|
13
|
Andreani J, Quignot C, Guerois R. Structural prediction of protein interactions and docking using conservation and coevolution. Wiley Interdisciplinary Reviews: Computational Molecular Science 2020. [DOI: 10.1002/wcms.1470] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 12/12/2022]
Affiliation(s)
- Jessica Andreani
- Université Paris‐Saclay CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC) Gif‐sur‐Yvette France
| | - Chloé Quignot
- Université Paris‐Saclay CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC) Gif‐sur‐Yvette France
| | - Raphael Guerois
- Université Paris‐Saclay CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC) Gif‐sur‐Yvette France
|
14
|
Reille S, Garnier M, Robert X, Gouet P, Martin J, Launay G. Identification and visualization of protein binding regions with the ArDock server. Nucleic Acids Res 2019; 46:W417-W422. [PMID: 29905873 PMCID: PMC6031020 DOI: 10.1093/nar/gky472] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 01/31/2018] [Accepted: 05/28/2018] [Indexed: 12/21/2022]
Abstract
ArDock (ardock.ibcp.fr) is a structural bioinformatics web server for the prediction and visualization of potential interaction regions at protein surfaces. ArDock ranks the surface residues of a protein according to their tendency to form interfaces in a set of predefined docking experiments between the query protein and a set of arbitrary protein probes. The ArDock methodology is derived from large-scale cross-docking studies in which it was observed that randomly chosen proteins tend to dock in a non-random way at protein surfaces. The method predicts the interaction site of the protein, or alternate interfaces in the case of proteins with multiple interaction modes. The server takes a protein structure as input and computes a score for each surface residue. Its output focuses on the interactive visualization of results and on interoperability with other services.
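The idea of ranking surface residues by how often they fall into docking interfaces can be sketched as a simple frequency count. This is only an illustration under stated assumptions and not ArDock's actual scoring: each docking pose is assumed to be given as the set of query residues found at the interface, and the function name and residue labels are hypothetical.

```python
from collections import Counter


def rank_surface_residues(surface_residues, docking_poses):
    """Rank residues by the fraction of poses in which they lie at the interface."""
    surface = set(surface_residues)
    counts = Counter()
    for interface in docking_poses:
        # Count each surface residue at most once per pose.
        counts.update(set(interface) & surface)
    n_poses = len(docking_poses)
    scores = {
        residue: (counts[residue] / n_poses if n_poses else 0.0)
        for residue in surface_residues
    }
    # Highest score first; ties are broken by residue label for reproducibility.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical surface residues and interfaces from four toy docking poses.
surface = ["A10", "A11", "A42", "A77"]
poses = [{"A10", "A11"}, {"A10", "A42"}, {"A10", "A11"}, {"A77"}]
ranking = rank_surface_residues(surface, poses)
```

In this toy input, residue `A10` appears in three of four interfaces and therefore ranks first.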
Affiliation(s)
- Sébastien Reille
- Molecular Microbiology and Structural Biochemistry, Unité Mixte de Recherche, Université Claude Bernard Lyon 1, Centre National de la Recherche Scientifique, 69367 Lyon Cedex 07, France
| | - Mélanie Garnier
- Molecular Microbiology and Structural Biochemistry, Unité Mixte de Recherche, Université Claude Bernard Lyon 1, Centre National de la Recherche Scientifique, 69367 Lyon Cedex 07, France
| | - Xavier Robert
- Molecular Microbiology and Structural Biochemistry, Unité Mixte de Recherche, Université Claude Bernard Lyon 1, Centre National de la Recherche Scientifique, 69367 Lyon Cedex 07, France
| | - Patrice Gouet
- Molecular Microbiology and Structural Biochemistry, Unité Mixte de Recherche, Université Claude Bernard Lyon 1, Centre National de la Recherche Scientifique, 69367 Lyon Cedex 07, France
| | - Juliette Martin
- Molecular Microbiology and Structural Biochemistry, Unité Mixte de Recherche, Université Claude Bernard Lyon 1, Centre National de la Recherche Scientifique, 69367 Lyon Cedex 07, France
| | - Guillaume Launay
- Molecular Microbiology and Structural Biochemistry, Unité Mixte de Recherche, Université Claude Bernard Lyon 1, Centre National de la Recherche Scientifique, 69367 Lyon Cedex 07, France
- To whom correspondence should be addressed. Tel: +33 437 652 936; Fax: +33 472 722 601;
|
15
|
e Silva KSF, Lima RM, Baeza LC, Lima PDS, Cordeiro TDM, Charneau S, da Silva RA, Soares CMDA, Pereira M. Interactome of Glyceraldehyde-3-Phosphate Dehydrogenase Points to the Existence of Metabolons in Paracoccidioides lutzii. Front Microbiol 2019; 10:1537. [PMID: 31338083 PMCID: PMC6629890 DOI: 10.3389/fmicb.2019.01537] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 04/16/2019] [Accepted: 06/20/2019] [Indexed: 11/13/2022]
Abstract
Paracoccidioides is a dimorphic fungus, the causative agent of paracoccidioidomycosis. The disease is endemic within Latin America and prevalent in Brazil. The treatment is based on azoles, sulfonamides and amphotericin B. The search for new treatment approaches is a real necessity for neglected infections. Glyceraldehyde-3-phosphate dehydrogenase (GAPDH) is an essential glycolytic enzyme, well known for its multitude of functions within cells and therefore categorized as a moonlighting protein. To our knowledge, this is the first study in the Paracoccidioides genus to describe PPIs having GAPDH as a target. Here, we show an overview of the experimental GAPDH interactome in different phases of Paracoccidioides lutzii and an in silico analysis of 18 protein partners. GAPDH interacted with 207 proteins in P. lutzii. Several proteins bound to GAPDH in the mycelium, transition and yeast phases are common to important pathways such as glycolysis and TCA. We performed a co-immunoprecipitation assay to validate the complex formed by GAPDH with triose phosphate isomerase, enolase, isocitrate lyase and 2-methylcitrate synthase. We found GAPDH participating in complexes with proteins of specific pathways, indicating the existence of a glycolytic and a TCA metabolon in P. lutzii. GAPDH interacted with several proteins that undergo regulation by nitrosylation. In addition, we modeled the GAPDH 3-D structure and performed molecular dynamics and molecular docking in order to identify the interface between GAPDH and the interacting proteins. Despite the large number of interacting proteins, GAPDH has only four main regions of contact with interacting proteins, reflecting its ancestrality and conservation over evolution.
Affiliation(s)
| | - Raisa Melo Lima
- Laboratório de Biologia Molecular, Instituto de Ciências Biológicas, Universidade Federal de Goiás, Goiânia, Brazil
| | - Lilian Cristiane Baeza
- Laboratório de Biologia Molecular, Instituto de Ciências Biológicas, Universidade Federal de Goiás, Goiânia, Brazil
| | - Patrícia de Sousa Lima
- Laboratório de Biologia Molecular, Instituto de Ciências Biológicas, Universidade Federal de Goiás, Goiânia, Brazil
| | - Thuany de Moura Cordeiro
- Laboratório de Bioquímica e Química de Proteínas, Departamento de Biologia Celular, Universidade de Brasília, Brasília, Brazil
| | - Sébastien Charneau
- Laboratório de Bioquímica e Química de Proteínas, Departamento de Biologia Celular, Universidade de Brasília, Brasília, Brazil
| | - Roosevelt Alves da Silva
- Núcleo Colaborativo de Biossistemas, Instituto de Ciências Exatas, Universidade Federal de Jataí, Goiás, Brazil
| | | | - Maristela Pereira
- Laboratório de Biologia Molecular, Instituto de Ciências Biológicas, Universidade Federal de Goiás, Goiânia, Brazil
|
16
|
Wong ETC, Gsponer J. Predicting Protein-Protein Interfaces that Bind Intrinsically Disordered Protein Regions. J Mol Biol 2019; 431:3157-3178. [PMID: 31207240 DOI: 10.1016/j.jmb.2019.06.010] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 03/08/2019] [Revised: 06/01/2019] [Accepted: 06/04/2019] [Indexed: 12/18/2022]
Abstract
A long-standing goal in biology is the complete annotation of function and structure on all protein-protein interactions, a large fraction of which is mediated by intrinsically disordered protein regions (IDRs). However, knowledge derived from experimental structures of such protein complexes is disproportionately small due, in part, to challenges in studying interactions of IDRs. Here, we introduce IDRBind, a computational method that by combining gradient boosted trees and conditional random field models predicts binding sites of IDRs with performance approaching state-of-the-art globular interface predictions, making it suitable for proteome-wide applications. Although designed and trained with a focus on molecular recognition features, which are long interaction-mediating-elements in IDRs, IDRBind also predicts the binding sites of short peptides more accurately than existing specialized predictors. Consistent with IDRBind's specificity, a comparison of protein interface categories uncovered uniform trends in multiple physicochemical properties, positioning molecular recognition feature interfaces between peptide and globular interfaces.
Affiliation(s)
- Eric T C Wong
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada; Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, BC, Canada
| | - Jörg Gsponer
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada; Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, BC, Canada.
|
17
|
Zeng B, Hönigschmid P, Frishman D. Residue co-evolution helps predict interaction sites in α-helical membrane proteins. J Struct Biol 2019; 206:156-169. [DOI: 10.1016/j.jsb.2019.02.009] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 11/06/2018] [Revised: 01/30/2019] [Accepted: 02/13/2019] [Indexed: 11/29/2022]
|
18
|
Sharmeen N, Sulea T, Whiteway M, Wu C. The adaptor protein Ste50 directly modulates yeast MAPK signaling specificity through differential connections of its RA domain. Mol Biol Cell 2019; 30:794-807. [PMID: 30650049 PMCID: PMC6589780 DOI: 10.1091/mbc.e18-11-0708] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 12/18/2022]
Abstract
Discriminating among diverse environmental stimuli is critical for organisms to ensure their proper development, homeostasis, and survival. Saccharomyces cerevisiae regulates mating, osmoregulation, and filamentous growth using three different MAPK signaling pathways that share common components and therefore must ensure specificity. The adaptor protein Ste50 activates Ste11p, the MAP3K of all three modules. Its Ras association (RA) domain acts in both hyperosmolar and filamentous growth pathways, but its connection to the mating pathway is unknown. Genetically probing the domain, we found mutants that specifically disrupted mating or HOG-signaling pathways or both. Structurally, these residues clustered on the RA domain, forming distinct surfaces with a propensity for protein–protein interactions. GFP fusions of wild-type (WT) and mutant Ste50p show that WT is localized to the shmoo structure and accumulates at the growing shmoo tip. The mutants that are specifically defective in pheromone response are severely impaired in shmoo formation and fail to localize Ste50p, suggesting a failure of association and function of Ste50 mutants in the pheromone-signaling complex. Our results suggest that yeast cells can use differential protein interactions with the Ste50p RA domain to provide specificity of signaling during MAPK pathway activation.
Affiliation(s)
- Nusrat Sharmeen
- Division of Experimental Medicine, Department of Medicine, McGill University, Montreal, QC H4A 3J1, Canada
| | - Traian Sulea
- Human Health Therapeutics Research Centre, National Research Council Canada, Montreal, QC H4P 2R2, Canada; Institute of Parasitology, McGill University, Sainte-Anne-de-Bellevue, H9X 3V9 QC, Canada
| | - Malcolm Whiteway
- Department of Biology, Concordia University, Montreal, QC H4B 1R6, Canada
| | - Cunle Wu
- Division of Experimental Medicine, Department of Medicine, McGill University, Montreal, QC H4A 3J1, Canada; Human Health Therapeutics Research Centre, National Research Council Canada, Montreal, QC H4P 2R2, Canada
|
19
|
Nadalin F, Carbone A. Protein-protein interaction specificity is captured by contact preferences and interface composition. Bioinformatics 2018; 34:459-468. [PMID: 29028884 PMCID: PMC5860360 DOI: 10.1093/bioinformatics/btx584] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Received: 10/31/2016] [Accepted: 09/18/2017] [Indexed: 12/24/2022]
Abstract
Motivation Large-scale computational docking will be increasingly used in future years to discriminate protein–protein interactions at residue resolution. Complete cross-docking experiments make in silico reconstruction of protein–protein interaction networks a feasible goal. They call for efficient and accurate screening of the millions of structural conformations produced by the calculations. Results We propose CIPS (Combined Interface Propensity for decoy Scoring), a new pair potential combining interface composition with residue–residue contact preference. CIPS outperforms several other methods on screening docking solutions obtained with either all-atom or coarse-grain rigid docking. Further testing on 28 CAPRI targets corroborates the predictive power of CIPS over existing methods. By combining CIPS with atomic potentials, discrimination of correct conformations in all-atom structures reaches optimal accuracy. The drastic reduction of candidate solutions produced by thousands of proteins docked against each other makes large-scale docking accessible to analysis. Availability and implementation CIPS source code is freely available at http://www.lcqb.upmc.fr/CIPS. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Francesca Nadalin
- Sorbonne Universités, UPMC-Univ P6, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative-UMR 7238, 75005 Paris, France
| | - Alessandra Carbone
- Sorbonne Universités, UPMC-Univ P6, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative-UMR 7238, 75005 Paris, France; Institut Universitaire de France, 75005 Paris, France
|
20
|
Wong AKC, Sze-To HY, Johanning GL. Pattern to Knowledge: Deep Knowledge-Directed Machine Learning for Residue-Residue Interaction Prediction. Sci Rep 2018; 8:14841. [PMID: 30287904 PMCID: PMC6172270 DOI: 10.1038/s41598-018-32834-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 05/31/2018] [Accepted: 09/17/2018] [Indexed: 11/21/2022]
Abstract
Residue-residue close contact (R2R-C) data procured from three-dimensional protein-protein interaction (PPI) experiments are currently used for predicting residue-residue interaction (R2R-I) in PPI. However, due to complex physicochemical environments, R2R-I incidences, facilitated by multiple factors, are usually entangled in the source environment and masked in the acquired data. Here we present a novel method, P2K (Pattern to Knowledge), to disentangle R2R-I patterns and render much more succinct discriminative information expressed in different specific R2R-I statistical/functional spaces. Since such knowledge is not visible in the data acquired, we refer to it as deep knowledge. We leveraged the deep knowledge discovered to construct machine learning models for sequence-based R2R-I prediction, without trial-and-error combination of features over external knowledge of sequences. Our R2R-I predictor was validated under stringent leave-one-complex-out-alone cross-validation on a benchmark dataset and was found to perform better than an existing sequence-based R2R-I predictor by 28% (p: 1.9E-08). P2K is accessible via our web server at https://p2k.uwaterloo.ca.
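The leave-one-complex-out validation scheme named in this abstract can be sketched generically: every residue pair from one complex is held out together, so no complex contributes to both training and testing in the same fold. The sketch below is an assumption about the general technique, not the authors' code; the function name, complex identifiers, and sample labels are hypothetical.

```python
def leave_one_complex_out(samples):
    """Yield (held_out, train, test), holding out one whole complex per fold.

    Each sample is a (complex_id, payload) tuple.
    """
    complex_ids = sorted({complex_id for complex_id, _ in samples})
    for held_out in complex_ids:
        train = [sample for sample in samples if sample[0] != held_out]
        test = [sample for sample in samples if sample[0] == held_out]
        yield held_out, train, test


# Hypothetical residue-pair samples grouped by the complex they come from.
samples = [
    ("complex_1", "pair_a"),
    ("complex_1", "pair_b"),
    ("complex_2", "pair_c"),
    ("complex_3", "pair_d"),
]
folds = list(leave_one_complex_out(samples))
```

Splitting by complex rather than by individual residue pair avoids leaking information between pairs that share the same structure.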
Affiliation(s)
- Andrew K C Wong
- Department of Systems Design Engineering, University of Waterloo, 200 University Avenue West, Waterloo, N2L 3G1, Ontario, Canada.
| | - Ho Yin Sze-To
- Department of Systems Design Engineering, University of Waterloo, 200 University Avenue West, Waterloo, N2L 3G1, Ontario, Canada
| | - Gary L Johanning
- Biosciences Division, SRI International, 333 Ravenswood Ave, Menlo Park, CA, USA
|
21
|
Geometric and amino acid type determinants for protein-protein interaction interfaces. Quantitative Biology 2018. [DOI: 10.1007/s40484-018-0138-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/16/2022]
|
22
|
Daberdaku S, Ferrari C. Exploring the potential of 3D Zernike descriptors and SVM for protein-protein interface prediction. BMC Bioinformatics 2018; 19:35. [PMID: 29409446 PMCID: PMC5802066 DOI: 10.1186/s12859-018-2043-3] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Received: 08/29/2017] [Accepted: 01/24/2018] [Indexed: 12/22/2022]
Abstract
Background The correct determination of protein–protein interaction interfaces is important for understanding disease mechanisms and for rational drug design. To date, several computational methods for the prediction of protein interfaces have been developed, but the interface prediction problem is still not fully understood. Experimental evidence suggests that the location of binding sites is imprinted in the protein structure, but there are major differences among the interfaces of the various protein types: the characterising properties can vary a lot depending on the interaction type and function. The selection of an optimal set of features characterising the protein interface and the development of an effective method to represent and capture the complex protein recognition patterns are of paramount importance for this task. Results In this work we investigate the potential of a novel local surface descriptor based on 3D Zernike moments for the interface prediction task. Descriptors invariant to roto-translations are extracted from circular patches of the protein surface enriched with physico-chemical properties from the HQI8 amino acid index set, and are used as samples for a binary classification problem. Support Vector Machines are used as a classifier to distinguish interface local surface patches from non-interface ones. The proposed method was validated on 16 classes of proteins extracted from the Protein–Protein Docking Benchmark 5.0 and compared to other state-of-the-art protein interface predictors (SPPIDER, PrISE and NPS-HomPPI). Conclusions The 3D Zernike descriptors are able to capture the similarity among patterns of physico-chemical and biochemical properties mapped on the protein surface arising from the various spatial arrangements of the underlying residues, and their usage can be easily extended to other sets of amino acid properties. 
The results suggest that the choice of a proper set of features characterising the protein interface is crucial for the interface prediction task, and that optimality strongly depends on the class of proteins whose interface we want to characterise. We postulate that different protein classes should be treated separately and that it is necessary to identify an optimal set of features for each protein class. Electronic supplementary material The online version of this article (10.1186/s12859-018-2043-3) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Sebastian Daberdaku
- Department of Information Engineering, University of Padova, via Gradenigo 6/A, Padova, 35131, Italy.
| | - Carlo Ferrari
- Department of Information Engineering, University of Padova, via Gradenigo 6/A, Padova, 35131, Italy
|
23
|
Meyer MJ, Beltrán JF, Liang S, Fragoza R, Rumack A, Liang J, Wei X, Yu H. Interactome INSIDER: a structural interactome browser for genomic studies. Nat Methods 2018; 15:107-114. [PMID: 29355848 PMCID: PMC6026581 DOI: 10.1038/nmeth.4540] [Citation(s) in RCA: 112] [Impact Index Per Article: 16.0] [Received: 01/24/2017] [Accepted: 10/22/2017] [Indexed: 02/07/2023]
Abstract
We present Interactome INSIDER, a tool to link genomic variant information with structural protein-protein interactomes. Underlying this tool is the application of machine learning to predict protein interaction interfaces for 185,957 protein interactions with previously unresolved interfaces in human and seven model organisms, including the entire experimentally determined human binary interactome. Predicted interfaces exhibit functional properties similar to those of known interfaces, including enrichment for disease mutations and recurrent cancer mutations. Through 2,164 de novo mutagenesis experiments, we show that mutations of predicted and known interface residues disrupt interactions at a similar rate and much more frequently than mutations outside of predicted interfaces. To spur functional genomic studies, Interactome INSIDER (http://interactomeinsider.yulab.org) enables users to identify whether variants or disease mutations are enriched in known and predicted interaction interfaces at various resolutions. Users may explore known population variants, disease mutations, and somatic cancer mutations, or they may upload their own set of mutations for this purpose.
Affiliation(s)
- Michael J. Meyer
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, 14853, USA
- Tri-Institutional Training Program in Computational Biology and Medicine, New York, New York, 10065, USA
| | - Juan Felipe Beltrán
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, 14853, USA
| | - Siqi Liang
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, 14853, USA
| | - Robert Fragoza
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, 14853, USA
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA
| | - Aaron Rumack
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, 14853, USA
| | - Jin Liang
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, 14853, USA
| | - Xiaomu Wei
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, 14853, USA
- Department of Medicine, Weill Cornell College of Medicine, New York, New York, 10065, USA
| | - Haiyuan Yu
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, 14853, USA
|
24
|
Structure-based prediction of ligand-protein interactions on a genome-wide scale. Proc Natl Acad Sci U S A 2017; 114:13685-13690. [PMID: 29229851 DOI: 10.1073/pnas.1705381114] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.8] [Indexed: 11/18/2022]
Abstract
We report a template-based method, LT-scanner, which scans the human proteome using protein structural alignment to identify proteins that are likely to bind ligands that are present in experimentally determined complexes. A scoring function that rapidly accounts for binding site similarities between the template and the proteins being scanned is a crucial feature of the method. The overall approach is first tested based on its ability to predict the residues on the surface of a protein that are likely to bind small-molecule ligands. The algorithm that we present, LBias, is shown to compare very favorably to existing algorithms for binding site residue prediction. LT-scanner's performance is evaluated based on its ability to identify known targets of Food and Drug Administration (FDA)-approved drugs and it too proves to be highly effective. The specificity of the scoring function that we use is demonstrated by the ability of LT-scanner to identify the known targets of FDA-approved kinase inhibitors based on templates involving other kinases. Combining sequence with structural information further improves LT-scanner performance. The approach we describe is extendable to the more general problem of identifying binding partners of known ligands even if they do not appear in a structurally determined complex, although this will require the integration of methods that combine protein structure and chemical compound databases.
|
25
|
Different protein-protein interface patterns predicted by different machine learning methods. Sci Rep 2017; 7:16023. [PMID: 29167570 PMCID: PMC5700192 DOI: 10.1038/s41598-017-16397-z] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 07/05/2017] [Accepted: 11/13/2017] [Indexed: 12/02/2022]
Abstract
Different types of protein-protein interactions produce different protein-protein interface patterns, and different machine learning methods are suited to different types of data. Does it follow that different machine learning methods favor different interface patterns in prediction? Here, four different machine learning methods were employed to predict protein-protein interface residue pairs on different interface patterns. The performance of the methods differs across types of proteins, which suggests that different machine learning methods tend to predict different protein-protein interface patterns. We used ANOVA and variable selection to support this result. Our proposed methods, which combine the advantages of the individual methods, also achieved good prediction results compared with the single methods. In addition to the prediction of protein-protein interactions, this idea can be extended to other research areas such as protein structure prediction and design.
|
26
|
Yan J, Kurgan L. DRNApred, fast sequence-based method that accurately predicts and discriminates DNA- and RNA-binding residues. Nucleic Acids Res 2017; 45:e84. [PMID: 28132027 PMCID: PMC5449545 DOI: 10.1093/nar/gkx059] [Citation(s) in RCA: 86] [Impact Index Per Article: 10.8] [Received: 06/10/2016] [Accepted: 01/24/2017] [Indexed: 01/18/2023]
Abstract
Protein-DNA and protein-RNA interactions are part of many diverse and essential cellular functions and yet most of them remain to be discovered and characterized. Recent research shows that sequence-based predictors of DNA-binding residues accurately find these residues but also cross-predict many RNA-binding residues as DNA-binding, and vice versa. Most of these methods are also relatively slow, prohibiting applications on the whole-genome scale. We describe a novel sequence-based method, DRNApred, which accurately and in high-throughput predicts and discriminates between DNA- and RNA-binding residues. DRNApred was designed using a new dataset with both DNA- and RNA-binding proteins, regression that penalizes cross-predictions, and a novel two-layered architecture. DRNApred outperforms state-of-the-art predictors of DNA- or RNA-binding residues on a benchmark test dataset by substantially reducing the cross predictions and predicting arguably higher quality false positives that are located nearby the native binding residues. Moreover, it also more accurately predicts the DNA- and RNA-binding proteins. Application on the human proteome confirms that DRNApred reduces the cross predictions among the native nucleic acid binders. Also, novel putative DNA/RNA-binding proteins that it predicts share similar subcellular locations and residue charge profiles with the known native binding proteins. Webserver of DRNApred is freely available at http://biomine.cs.vcu.edu/servers/DRNApred/.
Affiliation(s)
- Jing Yan
- Department of Electrical and Computer Engineering, University of Alberta, Edmonton T6G 2V4, Canada
- Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, 23284, USA
|
27
|
Murakami Y, Tripathi LP, Prathipati P, Mizuguchi K. Network analysis and in silico prediction of protein-protein interactions with applications in drug discovery. Curr Opin Struct Biol 2017; 44:134-142. [PMID: 28364585 DOI: 10.1016/j.sbi.2017.02.005] [Citation(s) in RCA: 71] [Impact Index Per Article: 8.9] [Received: 11/26/2016] [Revised: 02/05/2017] [Accepted: 02/23/2017] [Indexed: 11/29/2022]
Abstract
Protein-protein interactions (PPIs) are vital to maintaining cellular homeostasis. Several PPI dysregulations have been implicated in the etiology of various diseases and hence PPIs have emerged as promising targets for drug discovery. Surface residues and hotspot residues at the interface of PPIs form the core regions, which play a key role in modulating cellular processes such as signal transduction and are used as starting points for drug design. In this review, we briefly discuss how PPI networks (PPINs) inferred from experimentally characterized PPI data have been utilized for knowledge discovery and how in silico approaches to PPI characterization can contribute to PPIN-based biological research. Next, we describe the principles of in silico PPI prediction and survey the existing PPI and PPI site prediction servers that are useful for drug discovery. Finally, we discuss the potential of in silico PPI prediction in drug discovery.
Affiliation(s)
- Yoichi Murakami
- National Institutes of Biomedical Innovation, Health and Nutrition, 7-6-8 Saito Asagi, Ibaraki, Osaka 567-0085, Japan
- Lokesh P Tripathi
- National Institutes of Biomedical Innovation, Health and Nutrition, 7-6-8 Saito Asagi, Ibaraki, Osaka 567-0085, Japan
- Philip Prathipati
- National Institutes of Biomedical Innovation, Health and Nutrition, 7-6-8 Saito Asagi, Ibaraki, Osaka 567-0085, Japan
- Kenji Mizuguchi
- National Institutes of Biomedical Innovation, Health and Nutrition, 7-6-8 Saito Asagi, Ibaraki, Osaka 567-0085, Japan
|
28
|
Zhang J, Kurgan L. Review and comparative assessment of sequence-based predictors of protein-binding residues. Brief Bioinform 2017; 19:821-837. [DOI: 10.1093/bib/bbx022] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.6] [Received: 12/05/2016] [Indexed: 12/31/2022] Open
Affiliation(s)
- Jian Zhang
- School of Computer and Information Technology, Xinyang Normal University
- Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
|
29
|
Integrating computational methods and experimental data for understanding the recognition mechanism and binding affinity of protein-protein complexes. Progress in Biophysics and Molecular Biology 2017; 128:33-38. [PMID: 28069340 DOI: 10.1016/j.pbiomolbio.2017.01.001] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 03/30/2016] [Revised: 01/04/2017] [Accepted: 01/05/2017] [Indexed: 01/09/2023]
Abstract
Protein-protein interactions perform several functions inside the cell. Understanding the recognition mechanism and binding affinity of protein-protein complexes is a challenging problem in experimental and computational biology. In this review, we focus on two aspects (i) understanding the recognition mechanism and (ii) predicting the binding affinity. The first part deals with computational techniques for identifying the binding site residues and the contribution of important interactions for understanding the recognition mechanism of protein-protein complexes in comparison with experimental observations. The second part is devoted to the methods developed for discriminating high and low affinity complexes, and predicting the binding affinity of protein-protein complexes using three-dimensional structural information and just from the amino acid sequence. The overall view enhances our understanding of the integration of experimental data and computational methods, recognition mechanism of protein-protein complexes and the binding affinity.
|
30
|
Computational Approaches for Predicting Binding Partners, Interface Residues, and Binding Affinity of Protein-Protein Complexes. Methods Mol Biol 2017; 1484:237-253. [PMID: 27787830 DOI: 10.1007/978-1-4939-6406-2_16] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 12/20/2022]
Abstract
Studying protein-protein interactions leads to a better understanding of the underlying principles of several biological pathways. Cost and labor-intensive experimental techniques suggest the need for computational methods to complement them. Several such state-of-the-art methods have been reported for analyzing diverse aspects such as predicting binding partners, interface residues, and binding affinity for protein-protein complexes with reliable performance. However, there are specific drawbacks for different methods that indicate the need for their improvement. This review highlights various available computational algorithms for analyzing diverse aspects of protein-protein interactions and endorses the necessity for developing new robust methods for gaining deep insights about protein-protein interactions.
|
31
|
Bai F, Morcos F, Cheng RR, Jiang H, Onuchic JN. Elucidating the druggable interface of protein-protein interactions using fragment docking and coevolutionary analysis. Proc Natl Acad Sci U S A 2016; 113:E8051-E8058. [PMID: 27911825 PMCID: PMC5167203 DOI: 10.1073/pnas.1615932113] [Citation(s) in RCA: 60] [Impact Index Per Article: 6.7] [Indexed: 12/28/2022] Open
Abstract
Protein-protein interactions play a central role in cellular function. Improving the understanding of complex formation has many practical applications, including the rational design of new therapeutic agents and the mechanisms governing signal transduction networks. The generally large, flat, and relatively featureless binding sites of protein complexes pose many challenges for drug design. Fragment docking and direct coupling analysis are used in an integrated computational method to estimate druggable protein-protein interfaces. (i) This method explores the binding of fragment-sized molecular probes on the protein surface using a molecular docking-based screen. (ii) The energetically favorable binding sites of the probes, called hot spots, are spatially clustered to map out candidate binding sites on the protein surface. (iii) A coevolution-based interface interaction score is used to discriminate between different candidate binding sites, yielding potential interfacial targets for therapeutic drug design. This approach is validated for important, well-studied disease-related proteins with known pharmaceutical targets, and also identifies targets that have yet to be studied. Moreover, therapeutic agents are proposed by chemically connecting the fragments that are strongly bound to the hot spots.
Affiliation(s)
- Fang Bai
- Center for Theoretical Biological Physics, Rice University, Houston, TX 77005
- Faruck Morcos
- Department of Biological Sciences, University of Texas at Dallas, Dallas, TX 75080
- Department of Bioengineering, University of Texas at Dallas, Dallas, TX 75080
- Center for Systems Biology, University of Texas at Dallas, Dallas, TX 75080
- Ryan R Cheng
- Center for Theoretical Biological Physics, Rice University, Houston, TX 77005
- Hualiang Jiang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- José N Onuchic
- Center for Theoretical Biological Physics, Rice University, Houston, TX 77005
- Department of Physics and Astronomy, Rice University, Houston, TX 77005
- Department of Chemistry, Rice University, Houston, TX 77005
- Department of Biosciences, Rice University, Houston, TX 77005
|
32
|
Tonddast-Navaei S, Skolnick J. Are protein-protein interfaces special regions on a protein's surface? J Chem Phys 2016; 143:243149. [PMID: 26723634 DOI: 10.1063/1.4937428] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Indexed: 01/09/2023] Open
Abstract
Protein-protein interactions (PPIs) are involved in many cellular processes. Experimentally obtained protein quaternary structures provide the location of protein-protein interfaces, the surface region of a given protein that interacts with another. These regions are termed half-interfaces (HIs). Canonical HIs cover roughly one third of a protein's surface and were found to have more hydrophobic residues than the non-interface surface region. In addition, the classical view of protein HIs was that there are a few (if not one) HIs per protein that are structurally and chemically unique. However, on average, a given protein interacts with at least a dozen others. This raises the question of whether they use the same or other HIs. By copying HIs from monomers with the same folds in solved quaternary structures, we introduce the concept of geometric HIs (HIs whose geometry has a significant match to other known interfaces) and show that on average they cover three quarters of a protein's surface. We then demonstrate that in some cases, these geometric HI could result in real physical interactions (which may or may not be biologically relevant). The composition of the new HIs is on average more charged compared to most known ones, suggesting that the current protein interface database is biased towards more hydrophobic, possibly more obligate, complexes. Finally, our results provide evidence for interface fuzziness and PPI promiscuity. Thus, the classical view of unique, well defined HIs needs to be revisited as HIs are another example of coarse-graining that is used by nature.
Affiliation(s)
- Sam Tonddast-Navaei
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, 250 14th Street N.W., Atlanta, Georgia 30318, USA
- Jeffrey Skolnick
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, 250 14th Street N.W., Atlanta, Georgia 30318, USA
|
33
|
Maheshwari S, Brylinski M. Template-based identification of protein–protein interfaces using eFindSitePPI. Methods 2016; 93:64-71. [DOI: 10.1016/j.ymeth.2015.07.017] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 03/31/2015] [Revised: 07/12/2015] [Accepted: 07/29/2015] [Indexed: 11/26/2022] Open
|
34
|
Local Geometry and Evolutionary Conservation of Protein Surfaces Reveal the Multiple Recognition Patches in Protein-Protein Interactions. PLoS Comput Biol 2015; 11:e1004580. [PMID: 26690684 PMCID: PMC4686965 DOI: 10.1371/journal.pcbi.1004580] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Received: 03/10/2015] [Accepted: 10/04/2015] [Indexed: 11/19/2022] Open
Abstract
Protein-protein interactions (PPIs) are essential to all biological processes and they represent increasingly important therapeutic targets. Here, we present a new method for accurately predicting protein-protein interfaces, understanding their properties, origins and binding to multiple partners. Contrary to machine learning approaches, our method combines in a rational and very straightforward way three sequence- and structure-based descriptors of protein residues: evolutionary conservation, physico-chemical properties and local geometry. The implemented strategy yields very precise predictions for a wide range of protein-protein interfaces and discriminates them from small-molecule binding sites. Beyond its predictive power, the approach permits to dissect interaction surfaces and unravel their complexity. We show how the analysis of the predicted patches can foster new strategies for PPIs modulation and interaction surface redesign. The approach is implemented in JET2, an automated tool based on the Joint Evolutionary Trees (JET) method for sequence-based protein interface prediction. JET2 is freely available at www.lcqb.upmc.fr/JET2.
|
35
|
Maheshwari S, Brylinski M. Predicted binding site information improves model ranking in protein docking using experimental and computer-generated target structures. BMC Structural Biology 2015; 15:23. [PMID: 26597230 PMCID: PMC4657198 DOI: 10.1186/s12900-015-0050-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 05/21/2015] [Accepted: 10/30/2015] [Indexed: 01/10/2023]
Abstract
Background: Protein-protein interactions (PPIs) mediate the vast majority of biological processes; therefore, significant efforts have been directed to investigate PPIs to fully comprehend cellular functions. Predicting complex structures is critical to reveal molecular mechanisms by which proteins operate. Despite recent advances in the development of new methods to model macromolecular assemblies, most current methodologies are designed to work with experimentally determined protein structures. However, because only computer-generated models are available for a large number of proteins in a given genome, computational tools should tolerate structural inaccuracies in order to perform the genome-wide modeling of PPIs. Results: To address this problem, we developed eRankPPI, an algorithm for the identification of near-native conformations generated by protein docking using experimental structures as well as protein models. The scoring function implemented in eRankPPI employs multiple features including interface probability estimates calculated by eFindSitePPI and a novel contact-based symmetry score. In comparative benchmarks using representative datasets of homo- and hetero-complexes, we show that eRankPPI consistently outperforms state-of-the-art algorithms, improving the success rate by ~10%. Conclusions: eRankPPI was designed to bridge the gap between the volume of sequence data, the evidence of binary interactions, and the atomic details of pharmacologically relevant protein complexes. Tolerating structure imperfections in computer-generated models opens up a possibility to conduct the exhaustive structure-based reconstruction of PPI networks across proteomes. The methods and datasets used in this study are available at www.brylinski.org/erankppi.
Affiliation(s)
- Surabhi Maheshwari
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, 70803, USA
- Michal Brylinski
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, 70803, USA; Center for Computation & Technology, Louisiana State University, Baton Rouge, LA, 70803, USA
|
36
|
Hwang H, Petrey D, Honig B. A hybrid method for protein-protein interface prediction. Protein Sci 2015; 25:159-65. [PMID: 26178156 DOI: 10.1002/pro.2744] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.8] [Received: 04/16/2015] [Revised: 07/02/2015] [Accepted: 07/06/2015] [Indexed: 12/31/2022]
Abstract
The growing structural coverage of proteomes is making structural comparison a powerful tool for function annotation. Such template-based approaches are based on the observation that structural similarity is often sufficient to infer similar function. However, it seems clear that, in addition to structural similarity, the specific characteristics of a given protein should also be taken into account in predicting function. Here we describe PredUs 2.0, a method to predict regions on a protein surface likely to bind other proteins, that is, interfacial residues. PredUs 2.0 is based on the PredUs method that is entirely template-based and uses known binding sites in structurally similar proteins to predict interfacial residues. PredUs 2.0 uses a Bayesian approach to combine the template-based scoring of PredUs with a score that reflects the propensities of individual amino acids to be in interfaces. PredUs 2.0 includes a novel protein size dependent metric to determine the number of residues that should be reported as interfacial. PredUs 2.0 significantly outperforms PredUs as well as other published interface prediction methods.
Affiliation(s)
- Howook Hwang
- Department of Systems Biology, Department of Biochemistry and Molecular Biophysics, Center for Computational Biology and Bioinformatics, Howard Hughes Medical Institute, Columbia University, 1130 St. Nicholas Ave., Room 815, New York, NY, 10032
- Donald Petrey
- Department of Systems Biology, Department of Biochemistry and Molecular Biophysics, Center for Computational Biology and Bioinformatics, Howard Hughes Medical Institute, Columbia University, 1130 St. Nicholas Ave., Room 815, New York, NY, 10032
- Barry Honig
- Department of Systems Biology, Department of Biochemistry and Molecular Biophysics, Center for Computational Biology and Bioinformatics, Howard Hughes Medical Institute, Columbia University, 1130 St. Nicholas Ave., Room 815, New York, NY, 10032
|
37
|
Cooper DE, Young PA, Klett EL, Coleman RA. Physiological Consequences of Compartmentalized Acyl-CoA Metabolism. J Biol Chem 2015; 290:20023-31. [PMID: 26124277 DOI: 10.1074/jbc.r115.663260] [Citation(s) in RCA: 66] [Impact Index Per Article: 6.6] [Indexed: 12/16/2022] Open
Abstract
Meeting the complex physiological demands of mammalian life requires strict control of the metabolism of long-chain fatty acyl-CoAs because of the multiplicity of their cellular functions. Acyl-CoAs are substrates for energy production; stored within lipid droplets as triacylglycerol, cholesterol esters, and retinol esters; esterified to form membrane phospholipids; or used to activate transcriptional and signaling pathways. Indirect evidence suggests that acyl-CoAs do not wander freely within cells, but instead, are channeled into specific pathways. In this review, we will discuss the evidence for acyl-CoA compartmentalization, highlight the key modes of acyl-CoA regulation, and diagram potential mechanisms for controlling acyl-CoA partitioning.
Affiliation(s)
- Eric L Klett
- From the Departments of Nutrition and Medicine, University of North Carolina, Chapel Hill, North Carolina 27599
|