301
Landau KS, Na I, Schenck RO, Uversky VN. Unfoldomics of prostate cancer: on the abundance and roles of intrinsically disordered proteins in prostate cancer. Asian J Androl 2017; 18:662-72. [PMID: 27453073 PMCID: PMC5000786 DOI: 10.4103/1008-682x.184999] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 02/02/2023]
Abstract
Prostatic diseases such as prostate cancer and benign prostatic hyperplasia are highly prevalent among men. The number of studies focused on the abundance and roles of intrinsically disordered proteins in prostate cancer is rather limited. The goal of this study is to analyze the prevalence and degree of disorder in proteins that were previously associated with the prostate cancer pathogenesis and to compare these proteins to the entire human proteome. The analysis of these datasets provides means for drawing conclusions on the roles of disordered proteins in this common male disease. We also hope that the results of our analysis can potentially lead to future experimental studies of these proteins to find novel pathways associated with this disease.
Affiliation(s)
- Kevin S Landau
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA
- Insung Na
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA
- Ryan O Schenck
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA
- Vladimir N Uversky
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA; USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, Florida 33612, USA; Laboratory of Structural Dynamics, Stability and Folding of Proteins, Institute of Cytology, Russian Academy of Sciences, St. Petersburg, Russia
302
Intrinsically Disordered Regions in Serum Albumin: What Are They For? Cell Biochem Biophys 2017; 76:39-57. [PMID: 28281231 DOI: 10.1007/s12013-017-0785-6] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Received: 11/04/2016] [Accepted: 02/13/2017] [Indexed: 12/16/2022]
Abstract
Serum albumin is a major plasma protein in mammalian blood. The importance of this protein lies in its roles in both bioregulation and transport phenomena. Serum albumin binds various metal ions and participates in the transport and storage of fatty acids, bilirubin, steroids, amino acids, and many other ligands, usually with regions of hydrophobic surface. Although the primary role of serum albumin is to transport various ligands, its versatile binding capacities and high concentration mean that it can assume a number of additional functions. The major goal of this article is to show how intrinsic disorder is encoded in the amino acid sequence of serum albumin, and how intrinsic disorder is related to the functions of this important serum protein.
303
Hanson J, Yang Y, Paliwal K, Zhou Y. Improving protein disorder prediction by deep bidirectional long short-term memory recurrent neural networks. Bioinformatics 2017; 33:685-692. [PMID: 28011771 DOI: 10.1093/bioinformatics/btw678] [Citation(s) in RCA: 109] [Impact Index Per Article: 13.6] [Received: 06/07/2016] [Accepted: 10/26/2016] [Indexed: 11/12/2022]
Abstract
Motivation: Capturing long-range interactions between structural but not sequence neighbors of proteins is a long-standing challenging problem in bioinformatics. Recently, long short-term memory (LSTM) networks have significantly improved the accuracy of speech and image classification problems by remembering useful past information in long sequential events. Here, we have implemented deep bidirectional LSTM recurrent neural networks in the problem of protein intrinsic disorder prediction. Results: The new method, named SPOT-Disorder, has steadily improved over a similar method using a traditional, window-based neural network (SPINE-D) in all datasets tested without separate training on short and long disordered regions. Independent tests on four other datasets, including the datasets from critical assessment of structure prediction (CASP) techniques and >10 000 annotated proteins from MobiDB, confirmed SPOT-Disorder as one of the best methods in disorder prediction. Moreover, initial studies indicate that the method is more accurate in predicting functional sites in disordered regions. These results highlight the usefulness of combining LSTM with deep bidirectional recurrent neural networks in capturing non-local, long-range interactions for bioinformatics applications. Availability and Implementation: SPOT-Disorder is available as a web server and as a standalone program at: http://sparks-lab.org/server/SPOT-disorder/index.php . Contact: j.hanson@griffith.edu.au or yuedong.yang@griffith.edu.au or yaoqi.zhou@griffith.edu.au. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jack Hanson
- Signal Processing Laboratory, Griffith University, Brisbane 4122, Australia
- Yuedong Yang
- Institute for Glycomics, Griffith University, Gold Coast 4215, Australia
- Kuldip Paliwal
- Signal Processing Laboratory, Griffith University, Brisbane 4122, Australia
- Yaoqi Zhou
- Institute for Glycomics, Griffith University, Gold Coast 4215, Australia
304
Wu W, Wang Z, Cong P, Li T. Accurate prediction of protein relative solvent accessibility using a balanced model. BioData Min 2017; 10:1. [PMID: 28127402 PMCID: PMC5259893 DOI: 10.1186/s13040-016-0121-5] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 03/31/2016] [Accepted: 12/27/2016] [Indexed: 01/19/2023]
Abstract
BACKGROUND: Protein relative solvent accessibility provides insight into understanding protein structure and function. Prediction of protein relative solvent accessibility is often the first stage of predicting other protein properties. Recent predictors of relative solvent accessibility discriminate against exposed regions as compared with buried regions, resulting in higher prediction accuracy associated with buried regions relative to exposed regions. METHODS: Here, we propose a more accurate and balanced predictor of protein relative solvent accessibility. First, we collected known proteins in three subsets according to sequence length and constructed a balanced dataset after reducing redundancy within each subset. Next, we measured the performance associated with different variables and variable combinations to determine the best variable combination. Finally, a predictor called BMRSA was constructed for modelling and prediction, which used the balanced set as the training set; the position-specific scoring matrix, predicted secondary structure, buried-exposed profile, and length of a query sequence as variables; and the conditional random field as the machine-learning method. RESULTS: BMRSA performance on test sets confirmed that our approach improved prediction accuracy relative to state-of-the-art approaches and was balanced in its comparison of buried and exposed regions. Our method is valuable when higher levels of accuracy in predicting exposed-residue states are required. BMRSA is available at: http://cheminfo.tongji.edu.cn:8080/BMRSA/.
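As a rough illustration of the quantities this abstract refers to, the sketch below computes relative solvent accessibility as the ratio of a residue's absolute accessible surface area to a reference maximum for its residue type, assigns buried or exposed states with a cutoff, and scores a prediction with balanced accuracy so that neither state dominates the score. It is not the BMRSA method; the function names, the 0.25 cutoff, and the numbers in the usage lines are assumptions chosen only for illustration.

```python
def relative_accessibility(asa, max_asa):
    """Return RSA = absolute ASA / reference maximum ASA, capped at 1.0."""
    if max_asa <= 0:
        raise ValueError("max_asa must be positive")
    return min(asa / max_asa, 1.0)


def two_state(rsa, cutoff=0.25):
    """Label a residue exposed ('E') or buried ('B'); the cutoff is an assumed value."""
    return "E" if rsa >= cutoff else "B"


def balanced_accuracy(true_states, predicted_states):
    """Average the per-state recall over the states present in true_states."""
    if len(true_states) != len(predicted_states):
        raise ValueError("state lists must have the same length")
    recalls = []
    for state in ("B", "E"):
        positions = [i for i, s in enumerate(true_states) if s == state]
        if not positions:
            continue  # a state absent from the reference contributes nothing
        hits = sum(1 for i in positions if predicted_states[i] == state)
        recalls.append(hits / len(positions))
    if not recalls:
        raise ValueError("no buried or exposed residues in true_states")
    return sum(recalls) / len(recalls)


# Hypothetical numbers: a residue with 30 A^2 exposed out of an assumed 120 A^2 maximum.
rsa = relative_accessibility(30.0, 120.0)
state = two_state(rsa)

# A predictor that labels everything buried looks acceptable on plain accuracy
# (3 of 4 correct) but scores only 0.5 on balanced accuracy.
score = balanced_accuracy(list("BBBE"), list("BBBB"))
```

Scoring with a per-state average is one simple way to expose the bias toward buried residues that the abstract describes; other balanced metrics would serve the same purpose.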
Affiliation(s)
- Wei Wu
- Department of Chemistry, Tongji University, Shanghai, China
- Zhiheng Wang
- Department of Chemistry, Tongji University, Shanghai, China
- Peisheng Cong
- Department of Chemistry, Tongji University, Shanghai, China
- Tonghua Li
- Department of Chemistry, Tongji University, Shanghai, China
305
Pancsa R, Raimondi D, Cilia E, Vranken WF. Early Folding Events, Local Interactions, and Conservation of Protein Backbone Rigidity. Biophys J 2017; 110:572-583. [PMID: 26840723 DOI: 10.1016/j.bpj.2015.12.028] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 08/27/2015] [Revised: 12/21/2015] [Accepted: 12/29/2015] [Indexed: 01/20/2023]
Abstract
Protein folding is in its early stages largely determined by the protein sequence and complex local interactions between amino acids, resulting in lower energy conformations that provide the context for further folding into the native state. We compiled a comprehensive data set of early folding residues based on pulsed labeling hydrogen deuterium exchange experiments. These early folding residues have corresponding higher backbone rigidity as predicted by DynaMine from sequence, an effect also present when accounting for the secondary structures in the folded protein. We then show that the amino acids involved in early folding events are not more conserved than others, but rather, early folding fragments and the secondary structure elements they are part of show a clear trend toward conserving a rigid backbone. We therefore propose that backbone rigidity is a fundamental physical feature conserved by proteins that can provide important insights into their folding mechanisms and stability.
Affiliation(s)
- Rita Pancsa
- Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
- Daniele Raimondi
- Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
- Elisa Cilia
- Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
- Wim F Vranken
- Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
306
Wang Y, Guo Y, Pu X, Li M. A sequence-based computational method for prediction of MoRFs. RSC Adv 2017. [DOI: 10.1039/c6ra27161h] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 11/21/2022]
Abstract
Molecular recognition features (MoRFs) are relatively short segments (10–70 residues) within intrinsically disordered regions (IDRs) that can undergo disorder-to-order transitions during binding to partner proteins.
Affiliation(s)
- Yu Wang
- College of Chemistry, Sichuan University, Chengdu, People's Republic of China
- Yanzhi Guo
- College of Chemistry, Sichuan University, Chengdu, People's Republic of China
- Xuemei Pu
- College of Chemistry, Sichuan University, Chengdu, People's Republic of China
- Menglong Li
- College of Chemistry, Sichuan University, Chengdu, People's Republic of China
307
Wu Z, Hu G, Wang K, Kurgan L. Exploratory Analysis of Quality Assessment of Putative Intrinsic Disorder in Proteins. ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING 2017. [DOI: 10.1007/978-3-319-59063-9_65] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 12/29/2022]
308
Abstract
Currently available computational tools, which are many, provide a researcher with a multitude of options for predicting intrinsic disorder in a protein of interest and for finding at least some of its disorder-based functions. This chapter provides a highly subjective guideline on how not to get lost in the "dark forest" of available tools for the analysis of intrinsic disorder. By no means does it give a unique pathway through this forest; it simply presents some of the tools the author uses in his everyday research.
Affiliation(s)
- Vladimir N Uversky
- Department of Molecular Medicine and USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, 33612, USA.
- Institute for Biological Instrumentation, Russian Academy of Sciences, 142290 Pushchino, Moscow Region, Russian Federation.
- Laboratory of Structural Dynamics, Stability and Folding of Proteins, Institute of Cytology, Russian Academy of Sciences, St. Petersburg, Russian Federation.
309
Data on evolution of intrinsically disordered regions of the human kinome and contribution of FAK1 IDRs to cytoskeletal remodeling. Data Brief 2016; 10:315-324. [PMID: 28004021 PMCID: PMC5157709 DOI: 10.1016/j.dib.2016.11.099] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 10/04/2016] [Revised: 11/21/2016] [Accepted: 11/30/2016] [Indexed: 11/23/2022]
Abstract
We present data on the evolution of intrinsically disordered regions (IDRs) across the entire human protein kinome. The evolutionary data for the IDRs with respect to the kinase domains (KDs) and the kinases as whole proteins (WP) are reported. Further, we report the post-translational modifications of FAK1 IDRs and their contribution to cytoskeletal remodeling. We also report the data used to build a protein-protein interaction (PPI) network of primary and secondary FAK1-interacting hybrid proteins. A detailed analysis of the data and its effect on FAK1-related functions is described in “Structural pliability adjacent to the kinase domain highlights contribution of FAK1 IDRs to cytoskeletal remodeling” (Kathiriya et al., 2016) [1].
310
Fitzsimmons R, Amin N, Uversky VN. Understanding the roles of intrinsic disorder in subunits of hemoglobin and the disease process of sickle cell anemia. INTRINSICALLY DISORDERED PROTEINS 2016; 4:e1248273. [PMID: 28232898 DOI: 10.1080/21690707.2016.1248273] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 09/14/2016] [Accepted: 10/10/2016] [Indexed: 12/14/2022]
Abstract
One of the common genetic disorders is sickle cell anemia, in which 2 recessive alleles must meet to allow for destruction and alteration in the morphology of red blood cells. This usually leads to loss of proper binding of oxygen to hemoglobin and to curved, sickle-shaped erythrocytes. The mutation causing this disease occurs in the 6th codon of the HBB gene encoding the hemoglobin subunit β (β-globin), a protein that is an integral part of adult hemoglobin A (HbA), a heterotetramer of 2 α chains and 2 β chains that is responsible for binding oxygen in the blood. This mutation changes a charged glutamic acid to a hydrophobic valine residue and disrupts the tertiary structure and stability of the hemoglobin molecule. Since, in the field of protein intrinsic disorder, charged and polar residues are typically considered disorder-promoting, as opposed to the order-promoting non-polar hydrophobic residues, in this study we attempted to answer the question of whether intrinsic disorder might have a role in the pathogenesis of sickle cell anemia. To this end, several disorder predictors were utilized to evaluate the presence of intrinsically disordered regions in all subunits of human hemoglobin: α, β, δ, ε, ζ, γ1, and γ2. Then, structural analysis was completed by using the SWISS-MODEL Repository to visualize the outputs of the disorder predictors. Finally, UniProt, STRING, and D2P2 were used to determine the biochemical interactome and protein partners for each hemoglobin subunit, along with analyzing their posttranslational modifications. All these properties were used to determine any differences between the 6 different types of subunits of hemoglobin and to correlate the mutation leading to sickle cell anemia with intrinsic disorder propensity.
Affiliation(s)
- Reis Fitzsimmons
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
- Narmin Amin
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
- Vladimir N Uversky
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL, USA; USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, USA; Laboratory of Structural Dynamics, Stability and Folding of Proteins, Institute of Cytology, Russian Academy of Sciences, St. Petersburg, Russia
311
Lieutaud P, Ferron F, Uversky AV, Kurgan L, Uversky VN, Longhi S. How disordered is my protein and what is its disorder for? A guide through the "dark side" of the protein universe. INTRINSICALLY DISORDERED PROTEINS 2016; 4:e1259708. [PMID: 28232901 DOI: 10.1080/21690707.2016.1259708] [Citation(s) in RCA: 80] [Impact Index Per Article: 8.9] [Received: 09/22/2016] [Revised: 11/03/2016] [Accepted: 11/04/2016] [Indexed: 12/18/2022]
Abstract
In the last 2 decades it has become increasingly evident that a large number of proteins are either fully or partially disordered. Intrinsically disordered proteins lack a stable 3D structure, are ubiquitous and fulfill essential biological functions. Their conformational heterogeneity is encoded in their amino acid sequences, thereby allowing intrinsically disordered proteins or regions to be recognized based on properties of these sequences. The identification of disordered regions facilitates the functional annotation of proteins and is instrumental for delineating boundaries of protein domains amenable to structural determination with X-ray crystallization. This article discusses a comprehensive selection of databases and methods currently employed to disseminate experimental and putative annotations of disorder, predict disorder and identify regions involved in induced folding. It also provides a set of detailed instructions that should be followed to perform computational analysis of disorder.
Affiliation(s)
- Philippe Lieutaud
- Aix-Marseille Université, AFMB UMR, Marseille, France; CNRS, AFMB UMR, Marseille, France
- François Ferron
- Aix-Marseille Université, AFMB UMR, Marseille, France; CNRS, AFMB UMR, Marseille, France
- Alexey V Uversky
- Center for Data Analytics and Biomedical Informatics, Department of Computer and Information Sciences, College of Science and Technology, Temple University, Philadelphia, PA, USA
- Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
- Vladimir N Uversky
- Department of Molecular Medicine and USF Health Byrd Alzheimer Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, USA; Laboratory of Structural Dynamics, Stability and Folding of Proteins, Institute of Cytology, Russian Academy of Sciences, St. Petersburg, Russia
- Sonia Longhi
- Aix-Marseille Université, AFMB UMR, Marseille, France; CNRS, AFMB UMR, Marseille, France
312
Peng Z, Uversky VN, Kurgan L. Genes encoding intrinsic disorder in Eukaryota have high GC content. INTRINSICALLY DISORDERED PROTEINS 2016; 4:e1262225. [PMID: 28232902 DOI: 10.1080/21690707.2016.1262225] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 09/20/2016] [Revised: 11/03/2016] [Accepted: 11/15/2016] [Indexed: 10/20/2022]
Abstract
We analyze the correlation between the GC content in genes of 12 eukaryotic species and the level of intrinsic disorder in their corresponding proteins. Comprehensive computational analysis has revealed that the disordered regions in eukaryotes are encoded by GC-enriched gene regions and that this enrichment is correlated with the amount of disorder and is present across proteins and species characterized by varying amounts of disorder. The GC enrichment is a result of a higher rate of amino acids coded by GC-rich codons in the disordered regions. Individual amino acids have the same GC-content profile across different species. Eukaryotic proteins with disordered regions encoded by GC-enriched gene segments carry out important biological functions, including interactions with RNAs, DNAs, and nucleotides and binding of calcium and metal ions; are involved in transcription, transport, cell division, and certain signaling pathways; and are localized primarily in the nucleus, cytosol, and cytoplasm. We also investigate a possible relationship between GC content, intrinsic disorder, and protein evolution. Analysis of a devised "age" of amino acids, their disorder-promoting capacity, and the GC enrichment of their codons suggests that the early amino acids are mostly disorder-promoting and their codons are GC-rich, while most of the late amino acids are order-promoting.
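The basic quantity behind this analysis, GC content, can be sketched in a few lines. The example below is only an illustration and not the authors' pipeline; the function names and the short test sequence are assumptions. It computes the fraction of G and C bases in a nucleotide sequence and a per-codon GC profile for a coding sequence.

```python
def gc_content(sequence):
    """Return the fraction of bases in the sequence that are G or C."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(1 for base in seq if base in "GC") / len(seq)


def codon_gc_profile(coding_sequence):
    """Return the GC fraction of each complete codon, in reading order."""
    seq = coding_sequence.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return [gc_content(seq[i:i + 3]) for i in range(0, usable, 3)]


# Hypothetical 9-base fragment: codons GGC, AAA, GCT.
profile = codon_gc_profile("GGCAAAGCT")
overall = gc_content("GGCAAAGCT")
```

Averaging such per-codon values over the codons that map to predicted disordered residues, and separately over those that map to ordered residues, would give the kind of region-level comparison the abstract describes.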
Affiliation(s)
- Zhenling Peng
- Center for Applied Mathematics, Tianjin University, Tianjin, China
- Vladimir N Uversky
- Department of Molecular Medicine and Byrd Alzheimer Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, USA; Laboratory of Structural Dynamics, Stability and Folding of Proteins, Institute of Cytology, Russian Academy of Sciences, St. Petersburg, Russia
- Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
313
Alowolodu O, Johnson G, Alashwal L, Addou I, Zhdanova IV, Uversky VN. Intrinsic disorder in spondins and some of their interacting partners. INTRINSICALLY DISORDERED PROTEINS 2016; 4:e1255295. [PMID: 28232900 DOI: 10.1080/21690707.2016.1255295] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 10/10/2016] [Revised: 10/22/2016] [Accepted: 10/27/2016] [Indexed: 12/28/2022]
Abstract
Spondins, which are proteins that inhibit and promote adherence of embryonic cells so as to aid axonal growth, are part of the thrombospondin-1 family. Spondins function in several important biological processes, such as apoptosis and angiogenesis. Spondins constitute a thrombospondin subfamily that includes F-spondin, a protein that interacts with Aβ precursor protein and inhibits its proteolytic processing; R-spondins, a 4-membered group of proteins that regulate the Wnt pathway and have other functions, such as regulation of kidney proliferation, induction of epithelial proliferation, and tumor suppression; M-spondin, which mediates mechanical linkage between the muscles and apodemes; and SCO-spondin, a protein important for neuronal development. In this study, we investigated the intrinsic disorder status of human spondins and their interacting partners, such as members of the LRP family, LGR family, and Frizzled family, and several other binding partners. Our aim was to establish the existence and importance of disordered regions in spondins and their interacting partners by conducting a detailed analysis of their sequences, finding disordered regions, and establishing a correlation between their structure and biological functions.
Affiliation(s)
- Oluwole Alowolodu
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
- Gbemisola Johnson
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
- Lamis Alashwal
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
- Iqbal Addou
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
- Irina V Zhdanova
- Department of Anatomy & Neurobiology, Boston University School of Medicine, Boston, MA, USA
- Vladimir N Uversky
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL, USA; USF Health Byrd Alzheimer Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, USA; Laboratory of Structural Dynamics, Stability and Folding of Proteins, Institute of Cytology, Russian Academy of Sciences, St. Petersburg, Russia
314
Srivastava A, Mazzocco G, Kel A, Wyrwicz LS, Plewczynski D. Detecting reliable non interacting proteins (NIPs) significantly enhancing the computational prediction of protein-protein interactions using machine learning methods. MOLECULAR BIOSYSTEMS 2016; 12:778-85. [PMID: 26738778 DOI: 10.1039/c5mb00672d] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 11/21/2022]
Abstract
Protein-protein interactions (PPIs) play a vital role in most biological processes. Hence their comprehension can promote a better understanding of the mechanisms underlying living systems. However, besides the cost and the time limitation involved in the detection of experimentally validated PPIs, the noise in the data is still an important issue to overcome. In the last decade several in silico PPI prediction methods using both structural and genomic information were developed for this purpose. Here we introduce a unique validation approach aimed to collect reliable non interacting proteins (NIPs). Thereafter the most relevant protein/protein-pair related features were selected. Finally, the prepared dataset was used for PPI classification, leveraging the prediction capabilities of well-established machine learning methods. Our best classification procedure displayed specificity and sensitivity values of 96.33% and 98.02%, respectively, surpassing the prediction capabilities of other methods, including those trained on gold standard datasets. We showed that the PPI/NIP predictive performances can be considerably improved by focusing on data preparation.
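For readers unfamiliar with the two metrics quoted in this abstract, the sketch below shows how sensitivity and specificity follow from a confusion matrix for a binary PPI/NIP classifier. It is an illustration only, not the authors' evaluation code; the function names, the label convention (1 for an interacting pair, 0 for a non-interacting pair), and the toy labels are assumptions.

```python
def confusion_counts(true_labels, predicted_labels):
    """Count (tp, fp, tn, fn) with 1 = interacting pair, 0 = non-interacting pair."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    tp = fp = tn = fn = 0
    for truth, guess in zip(true_labels, predicted_labels):
        if truth == 1 and guess == 1:
            tp += 1
        elif truth == 0 and guess == 1:
            fp += 1
        elif truth == 0 and guess == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def sensitivity(tp, fn):
    """Fraction of true interacting pairs that were predicted as interacting."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of true non-interacting pairs that were predicted as non-interacting."""
    return tn / (tn + fp)


# Toy labels, not data from the study.
truth = [1, 1, 1, 1, 0, 0, 0, 0]
guess = [1, 1, 1, 0, 0, 0, 0, 1]
tp, fp, tn, fn = confusion_counts(truth, guess)
sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
```

Because specificity depends entirely on the negative class, the quality of the non-interacting set directly affects how trustworthy that number is, which is the motivation the abstract gives for curating reliable NIPs.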
Affiliation(s)
- A Srivastava
- Maria Sklodowska-Curie Memorial Cancer Center and Institute of Oncology, Warsaw, Poland
- G Mazzocco
- Centre of New Technologies, University of Warsaw, Banacha 2c Str., 02-097 Warsaw, Poland; Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland
- A Kel
- GeneXplain GmbH, Am Exer 10b, D-38302, Wolfenbüttel, Germany
- L S Wyrwicz
- Maria Sklodowska-Curie Memorial Cancer Center and Institute of Oncology, Warsaw, Poland
- D Plewczynski
- Centre of New Technologies, University of Warsaw, Banacha 2c Str., 02-097 Warsaw, Poland
315
Zambelli B, Uversky VN, Ciurli S. Nickel impact on human health: An intrinsic disorder perspective. BIOCHIMICA ET BIOPHYSICA ACTA-PROTEINS AND PROTEOMICS 2016; 1864:1714-1731. [DOI: 10.1016/j.bbapap.2016.09.008] [Citation(s) in RCA: 82] [Impact Index Per Article: 9.1] [Received: 08/05/2016] [Revised: 08/31/2016] [Accepted: 09/14/2016] [Indexed: 01/26/2023]
316
Finn RD, Attwood TK, Babbitt PC, Bateman A, Bork P, Bridge AJ, Chang HY, Dosztányi Z, El-Gebali S, Fraser M, Gough J, Haft D, Holliday GL, Huang H, Huang X, Letunic I, Lopez R, Lu S, Marchler-Bauer A, Mi H, Mistry J, Natale DA, Necci M, Nuka G, Orengo CA, Park Y, Pesseat S, Piovesan D, Potter SC, Rawlings ND, Redaschi N, Richardson L, Rivoire C, Sangrador-Vegas A, Sigrist C, Sillitoe I, Smithers B, Squizzato S, Sutton G, Thanki N, Thomas PD, Tosatto SCE, Wu CH, Xenarios I, Yeh LS, Young SY, Mitchell AL. InterPro in 2017-beyond protein family and domain annotations. Nucleic Acids Res 2016; 45:D190-D199. [PMID: 27899635 PMCID: PMC5210578 DOI: 10.1093/nar/gkw1107] [Citation(s) in RCA: 1073] [Impact Index Per Article: 119.2] [Received: 10/24/2016] [Accepted: 10/27/2016] [Indexed: 02/07/2023]
Abstract
InterPro (http://www.ebi.ac.uk/interpro/) is a freely available database used to classify protein sequences into families and to predict the presence of important domains and sites. InterProScan is the underlying software that allows both protein and nucleic acid sequences to be searched against InterPro's predictive models, which are provided by its member databases. Here, we report recent developments with InterPro and its associated software, including the addition of two new databases (SFLD and CDD), and the functionality to include residue-level annotation and prediction of intrinsic disorder. These developments enrich the annotations provided by InterPro, increase the overall number of residues annotated and allow more specific functional inferences.
Affiliation(s)
- Robert D Finn
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Patricia C Babbitt
- Department of Bioengineering & Therapeutic Sciences, University of California, San Francisco, CA 94143, USA
- Alex Bateman
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Peer Bork
- European Molecular Biology Laboratory, Biocomputing, Meyerhofstasse 1, 69117 Heidelberg, Germany
- Alan J Bridge
- Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, CMU, 1 rue Michel-Servet, CH-1211 Geneva 4, Switzerland
- Hsin-Yu Chang
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Zsuzsanna Dosztányi
- MTA-ELTE Lendület Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Pázmány Péter sétány 1/c, Budapest, Hungary
- Sara El-Gebali
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Matthew Fraser
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Julian Gough
- Computer Science department, University of Bristol, Woodland Road, Bristol BS8 1UB, UK
- David Haft
- Bioinformatics Department, J. Craig Venter Institute, 9714 Medical Center Drive, Rockville, MD 20850, USA
- Gemma L Holliday
- Department of Bioengineering & Therapeutic Sciences, University of California, San Francisco, CA 94143, USA
- Hongzhan Huang
- Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE 19711, USA
- Xiaosong Huang
- Division of Bioinformatics, Department of Preventive Medicine, University of Southern California, Los Angeles, CA 90033, USA
- Ivica Letunic
- Biobyte Solutions GmbH, Bothestr. 142, 69126 Heidelberg, Germany
- Rodrigo Lopez
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Shennan Lu
- National Center for Biotechnology Information, National Library of Medicine, NIH Bldg, 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
- Aron Marchler-Bauer
- National Center for Biotechnology Information, National Library of Medicine, NIH Bldg, 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
- Huaiyu Mi
- Division of Bioinformatics, Department of Preventive Medicine, University of Southern California, Los Angeles, CA 90033, USA
- Jaina Mistry
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Darren A Natale
- Georgetown University Medical Center, 3300 Whitehaven St, NW, Washington, DC 20007, USA
- Marco Necci
- Department of Biomedical Sciences and CRIBI Biotech Center, University of Padua, via U. Bassi 58/b, 35131 Padua, Italy
- Gift Nuka
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Christine A Orengo
- Structural and Molecular Biology, University College London, Darwin Building, London WC1E 6BT, UK
- Youngmi Park
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Sebastien Pesseat
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Damiano Piovesan
- Department of Biomedical Sciences and CRIBI Biotech Center, University of Padua, via U. Bassi 58/b, 35131 Padua, Italy
- Simon C Potter
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Neil D Rawlings
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Nicole Redaschi
- Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, CMU, 1 rue Michel-Servet, CH-1211 Geneva 4, Switzerland
- Lorna Richardson
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Catherine Rivoire
- Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, CMU, 1 rue Michel-Servet, CH-1211 Geneva 4, Switzerland
- Amaia Sangrador-Vegas
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Christian Sigrist
- Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, CMU, 1 rue Michel-Servet, CH-1211 Geneva 4, Switzerland
| | - Ian Sillitoe
- Structural and Molecular Biology, University College London, Darwin Building, London WC1E 6BT, UK
| | - Ben Smithers
- Computer Science department, University of Bristol, Woodland Road, Bristol BS8 1UB, UK
| | - Silvano Squizzato
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Granger Sutton
- Bioinformatics Department, J. Craig Venter Institute, 9714 Medical Center Drive, Rockville, MD 20850, USA
| | - Narmada Thanki
- National Center for Biotechnology Information, National Library of Medicine, NIH Bldg, 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
| | - Paul D Thomas
- Division of Bioinformatics, Department of Preventive Medicine, University of Southern California, Los Angeles, CA 90033, USA
| | - Silvio C E Tosatto
- Department of Biomedical Sciences and CRIBI Biotech Center, University of Padua, via U. Bassi 58/b, 35131 Padua, Italy; CNR Institute of Neuroscience, via U. Bassi 58/b, 35131 Padua, Italy
| | - Cathy H Wu
- Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE 19711, USA
| | - Ioannis Xenarios
- Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, CMU, 1 rue Michel-Servet, CH-1211 Geneva 4, Switzerland
| | - Lai-Su Yeh
- Georgetown University Medical Center, 3300 Whitehaven St, NW, Washington, DC 20007, USA
| | - Siew-Yit Young
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Alex L Mitchell
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| |
|
317
|
p53 Proteoforms and Intrinsic Disorder: An Illustration of the Protein Structure-Function Continuum Concept. Int J Mol Sci 2016; 17:ijms17111874. [PMID: 27834926 PMCID: PMC5133874 DOI: 10.3390/ijms17111874] [Citation(s) in RCA: 140] [Impact Index Per Article: 15.6] [Received: 09/19/2016] [Revised: 10/27/2016] [Accepted: 11/03/2016] [Indexed: 01/10/2023] Open
Abstract
Although it is one of the most studied proteins, p53 continues to be an enigma. This protein has numerous biological functions, possesses intrinsically disordered regions crucial for its functionality, can form both homo-tetramers and isoform-based hetero-tetramers, and is able to interact with many binding partners. It contains numerous posttranslational modifications, has several isoforms generated by alternative splicing, alternative promoter usage or alternative initiation of translation, and is commonly mutated in different cancers. Therefore, p53 serves as an important illustration of the protein structure–function continuum concept, where the generation of multiple proteoforms by various mechanisms defines the ability of this protein to have a multitude of structurally and functionally different states. Considering p53 in the light of a proteoform-based structure–function continuum represents a non-canonical and conceptually new contemplation of structure, regulation, and functionality of this important protein.
|
318
|
Necci M, Piovesan D, Tosatto SCE. Large-scale analysis of intrinsic disorder flavors and associated functions in the protein sequence universe. Protein Sci 2016; 25:2164-2174. [PMID: 27636733 DOI: 10.1002/pro.3041] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.1] [Received: 08/17/2016] [Revised: 09/12/2016] [Accepted: 09/12/2016] [Indexed: 12/22/2022]
Abstract
Intrinsic disorder (ID) in proteins has been extensively described for the last decade; a large-scale classification of ID in proteins is mostly missing. Here, we provide an extensive analysis of ID in the protein universe on the UniProt database derived from sequence-based predictions in MobiDB. Almost half the sequences contain an ID region of at least five residues. About 9% of proteins have a long ID region of over 20 residues which are more abundant in Eukaryotic organisms and most frequently cover less than 20% of the sequence. A small subset of about 67,000 (out of over 80 million) proteins is fully disordered and mostly found in Viruses. Most proteins have only one ID, with short ID evenly distributed along the sequence and long ID overrepresented in the center. The charged residue composition of Das and Pappu was used to classify ID proteins by structural propensities and corresponding functional enrichment. Swollen Coils seem to be used mainly as structural components and in biosynthesis in both Prokaryotes and Eukaryotes. In Bacteria, they are confined in the nucleoid and in Viruses provide DNA binding function. Coils & Hairpins seem to be specialized in ribosome binding and methylation activities. Globules & Tadpoles bind antigens in Eukaryotes but are involved in killing other organisms and cytolysis in Bacteria. The Undefined class is used by Bacteria to bind toxic substances and mediate transport and movement between and within organisms in Viruses. Fully disordered proteins behave similarly, but are enriched for glycine residues and extracellular structures.
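The length criteria quoted in this abstract (an ID region of at least five residues, a long ID region of over 20 residues, and the fraction of the sequence covered) can be illustrated with a minimal sketch. It assumes a per-residue boolean disorder mask as input; the function names `id_regions` and `summarize` and the mask format are hypothetical and are not taken from MobiDB or from the paper.

```python
from itertools import groupby

MIN_ID_LEN = 5    # abstract: "ID region of at least five residues"
LONG_ID_LEN = 20  # abstract: "long ID region of over 20 residues"


def id_regions(disorder_mask, min_len=MIN_ID_LEN):
    """Return half-open (start, end) spans of disordered runs of length >= min_len."""
    regions = []
    pos = 0
    for is_disordered, run in groupby(disorder_mask):
        length = sum(1 for _ in run)
        if is_disordered and length >= min_len:
            regions.append((pos, pos + length))
        pos += length
    return regions


def summarize(disorder_mask):
    """Summarize ID regions: spans, presence of a long region, and sequence coverage."""
    regions = id_regions(disorder_mask)
    covered = sum(end - start for start, end in regions)
    return {
        "regions": regions,
        "has_long_id": any(end - start > LONG_ID_LEN for start, end in regions),
        "coverage": covered / len(disorder_mask) if disorder_mask else 0.0,
    }
```

Calling `summarize` on a mask gives the quantities the abstract reports per protein; runs shorter than five residues are ignored, which is an assumption about how the counting was done.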
Affiliation(s)
- Marco Necci
- Department of Biomedical Sciences and CRIBI Biotech Center, University of Padua, Padua, Italy
| | - Damiano Piovesan
- Department of Biomedical Sciences and CRIBI Biotech Center, University of Padua, Padua, Italy
| | - Silvio C E Tosatto
- Department of Biomedical Sciences and CRIBI Biotech Center, University of Padua, Padua, Italy; CNR Institute of Neuroscience, Padua, Italy
| |
|
319
|
Pancsa R, Tompa P. Essential functions linked with structural disorder in organisms of minimal genome. Biol Direct 2016; 11:45. [PMID: 27608806 PMCID: PMC5016991 DOI: 10.1186/s13062-016-0149-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 07/25/2016] [Accepted: 09/03/2016] [Indexed: 12/13/2022] Open
Abstract
Intrinsically disordered regions (IDRs) of proteins fulfill important regulatory roles in most organisms. However, the proteins of certain endosymbiont and intracellular pathogenic bacteria with extremely reduced genomes contain disproportionately small amounts of IDRs, consisting almost entirely of folded domains. As their genomes co-evolving with their hosts have been reduced in unrelated lineages, the proteomes of these bacteria represent independently evolved minimal protein sets. We systematically analyzed structural disorder in a representative set of such minimal organisms to see which types of functionally relevant longer IDRs are invariably retained in them. We found that a few characteristic functions are consistently linked with conformational disorder: ribosomal proteins, key components of the protein production machinery, a central coordinator of DNA metabolism and certain housekeeping chaperones seem to strictly rely on structural disorder even in genome-reduced organisms. We propose that these functions correspond to the most essential and probably also the most ancient ones fulfilled by structural disorder in cellular organisms. Reviewers: This article was reviewed by Michael Gromiha, Zoltan Gaspari and Sandor Pongor. Electronic supplementary material: The online version of this article (doi:10.1186/s13062-016-0149-y) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Rita Pancsa
- Structural Biology Research Center (SBRC), Flanders Institute for Biotechnology (VIB), Vrije Universiteit Brussel (VUB), 1050 Pleinlaan 2, Brussels, Belgium
| | - Peter Tompa
- Structural Biology Research Center (SBRC), Flanders Institute for Biotechnology (VIB), Vrije Universiteit Brussel (VUB), 1050 Pleinlaan 2, Brussels, Belgium; Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, 1117 Budapest, Magyar Tudósok körútja 2., Budapest, Hungary
| |
|
320
|
Iqbal S, Hoque MT. Estimation of Position Specific Energy as a Feature of Protein Residues from Sequence Alone for Structural Classification. PLoS One 2016; 11:e0161452. [PMID: 27588752 PMCID: PMC5010294 DOI: 10.1371/journal.pone.0161452] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 06/15/2016] [Accepted: 08/06/2016] [Indexed: 11/20/2022] Open
Abstract
A set of features computed from the primary amino acid sequence of proteins is crucial in the process of inducing a machine learning model that is capable of accurately predicting three-dimensional protein structures. Solutions for existing protein structure prediction problems are in need of features that can capture the complexity of molecular level interactions. With a view to this, we propose a novel approach to estimate position specific estimated energy (PSEE) of a residue using contact energy and predicted relative solvent accessibility (RSA). Furthermore, we demonstrate PSEE can be reasonably estimated based on sequence information alone. PSEE is useful in identifying the structured as well as the unstructured, or intrinsically disordered, regions of a protein by computing favorable and unfavorable energy respectively, characterized by an appropriate threshold. The most intriguing finding, verified empirically, is the indication that the PSEE feature can effectively classify disordered versus ordered residues and can segregate different secondary structure type residues by computing the constituent energies. PSEE values for each amino acid strongly correlate with the hydrophobicity value of the corresponding amino acid. Further, PSEE can be used to detect the existence of critical binding regions that essentially undergo disorder-to-order transitions to perform crucial biological functions. Towards an application of disorder prediction using the PSEE feature, we have rigorously tested and found that a support vector machine model informed by a set of features including PSEE consistently outperforms a model with an identical set of features with PSEE removed. In addition, the new disorder predictor, DisPredict2, shows competitive performance in predicting protein disorder when compared with six existing disordered protein predictors.
Affiliation(s)
- Sumaiya Iqbal
- Department of Computer Science, University of New Orleans, New Orleans, LA, United States of America
| | - Md Tamjidul Hoque
- Department of Computer Science, University of New Orleans, New Orleans, LA, United States of America
| |
|
321
|
DeForte S, Uversky VN. Order, Disorder, and Everything in Between. Molecules 2016; 21:molecules21081090. [PMID: 27548131 PMCID: PMC6274243 DOI: 10.3390/molecules21081090] [Citation(s) in RCA: 70] [Impact Index Per Article: 7.8] [Received: 07/21/2016] [Revised: 08/10/2016] [Accepted: 08/11/2016] [Indexed: 02/04/2023] Open
Abstract
In addition to the “traditional” proteins characterized by the unique crystal-like structures needed for unique functions, it is increasingly recognized that many proteins or protein regions (collectively known as intrinsically disordered proteins (IDPs) and intrinsically disordered protein regions (IDPRs)), being biologically active, do not have a specific 3D-structure in their unbound states under physiological conditions. There are also subtler categories of disorder, such as conditional (or dormant) disorder and partial disorder. The ability of a protein/region either to fold into a well-ordered functional unit or to stay intrinsically disordered but functional is encoded in the amino acid sequence. Structurally, IDPs/IDPRs are characterized by high spatiotemporal heterogeneity and exist as dynamic structural ensembles. It is important to remember, however, that although structure and disorder are often treated as binary states, they actually sit on a structural continuum.
Affiliation(s)
- Shelly DeForte
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA.
| | - Vladimir N Uversky
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA.
- USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA.
- Laboratory of Structural Dynamics, Stability and Folding of Proteins, Institute of Cytology, Russian Academy of Sciences, St. Petersburg 194064, Russia.
| |
|
322
|
Thieulin-Pardo G, Schramm A, Lignon S, Lebrun R, Kojadinovic M, Gontero B. The intriguing CP12-like tail of adenylate kinase 3 from Chlamydomonas reinhardtii. FEBS J 2016; 283:3389-407. [DOI: 10.1111/febs.13814] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 04/18/2016] [Revised: 06/14/2016] [Accepted: 07/13/2016] [Indexed: 01/09/2023]
Affiliation(s)
| | - Antoine Schramm
- Aix Marseille Univ; CNRS; BIP, UMR 7281, IMM; Marseille Cedex 20 France
| | - Sabrina Lignon
- Plate-forme Protéomique; Marseille Protéomique (MaP); Institut de Microbiologie de la Méditerranée; CNRS, FR 3479 Marseille Cedex 20 France
| | - Régine Lebrun
- Plate-forme Protéomique; Marseille Protéomique (MaP); Institut de Microbiologie de la Méditerranée; CNRS, FR 3479 Marseille Cedex 20 France
| | - Mila Kojadinovic
- Aix Marseille Univ; CNRS; BIP, UMR 7281, IMM; Marseille Cedex 20 France
| | - Brigitte Gontero
- Aix Marseille Univ; CNRS; BIP, UMR 7281, IMM; Marseille Cedex 20 France
| |
|
323
|
Upadhyay AK, Sowdhamini R. Genome-Wide Prediction and Analysis of 3D-Domain Swapped Proteins in the Human Genome from Sequence Information. PLoS One 2016; 11:e0159627. [PMID: 27467780 PMCID: PMC4965083 DOI: 10.1371/journal.pone.0159627] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 11/02/2015] [Accepted: 07/06/2016] [Indexed: 11/19/2022] Open
Abstract
3D-domain swapping is one of the mechanisms of protein oligomerization, and the proteins exhibiting this phenomenon have many biological functions. These proteins, which undergo domain swapping, have attracted much attention owing to their involvement in human diseases, such as conformational diseases, amyloidosis, serpinopathies, proteinopathies, etc. Early recognition of proteins in the whole human genome that retain a tendency to domain swap would support many aspects of disease control and management. Predictive models were developed by using machine learning approaches with an average accuracy of 78% (85.6% of sensitivity, 87.5% of specificity and an MCC value of 0.72) to predict putative domain swapping in protein sequences. These models were applied to many complete genomes with special emphasis on the human genome. Nearly 44% of the protein sequences in the human genome were predicted positive for domain swapping. Enrichment analysis was performed on the positively predicted sequences from the human genome for their domain distribution, disease association and functional importance based on Gene Ontology (GO). Enrichment analysis was also performed to infer a better understanding of the functional importance of these sequences. Finally, we developed hinge region prediction, in the given putative domain swapped sequence, by using important physicochemical properties of amino acids.
Affiliation(s)
- Atul Kumar Upadhyay
- National Centre for Biological Sciences (TIFR), GKVK Campus, Bellary Road, Bangalore 560 065, India
| | - Ramanathan Sowdhamini
- National Centre for Biological Sciences (TIFR), GKVK Campus, Bellary Road, Bangalore 560 065, India
| |
|
324
|
The Widespread Prevalence and Functional Significance of Silk-Like Structural Proteins in Metazoan Biological Materials. PLoS One 2016; 11:e0159128. [PMID: 27415783 PMCID: PMC4944945 DOI: 10.1371/journal.pone.0159128] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 02/23/2016] [Accepted: 06/28/2016] [Indexed: 01/05/2023] Open
Abstract
In nature, numerous mechanisms have evolved by which organisms fabricate biological structures with an impressive array of physical characteristics. Some examples of metazoan biological materials include the highly elastic byssal threads by which bivalves attach themselves to rocks, biomineralized structures that form the skeletons of various animals, and spider silks that are renowned for their exceptional strength and elasticity. The remarkable properties of silks, which are perhaps the best studied biological materials, are the result of the highly repetitive, modular, and biased amino acid composition of the proteins that compose them. Interestingly, similar levels of modularity/repetitiveness and similar bias in amino acid compositions have been reported in proteins that are components of structural materials in other organisms; however, the exact nature and extent of this similarity, and its functional and evolutionary relevance, are unknown. Here, we investigate this similarity and use sequence features common to silks and other known structural proteins to develop a bioinformatics-based method to identify similar proteins from large-scale transcriptome and whole-genome datasets. We show that a large number of proteins identified using this method have roles in biological material formation throughout the animal kingdom. Despite the similarity in sequence characteristics, most of the silk-like structural proteins (SLSPs) identified in this study appear to have evolved independently and are restricted to a particular animal lineage. Although the exact function of many of these SLSPs is unknown, the apparent independent evolution of proteins with similar sequence characteristics in divergent lineages suggests that these features are important for the assembly of biological materials. The identification of these characteristics enables the generation of testable hypotheses regarding the mechanisms by which these proteins assemble and direct the construction of biological materials with diverse morphologies. The SilkSlider predictor software developed here is available at https://github.com/wwood/SilkSlider.
|
325
|
Gu Y, Li DW, Brüschweiler R. Decoding the Mobility and Time Scales of Protein Loops. J Chem Theory Comput 2016; 11:1308-14. [PMID: 26579776 DOI: 10.1021/ct501085y] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.2] [Indexed: 01/09/2023]
Abstract
The flexible nature of protein loops and the time scales of their dynamics are critical for many biologically important events at the molecular level, such as protein interaction and recognition processes. In order to obtain a predictive understanding of the dynamic properties of loops, 500 ns molecular dynamics (MD) computer simulations of 38 different proteins were performed and validated using NMR chemical shifts. A total of 169 loops were analyzed and classified into three types, namely fast loops with correlation times <10 ns, slow loops with correlation times between 10 and 500 ns, and loops that are static over the course of the whole trajectory. Chemical and biophysical loop descriptors, such as amino-acid sequence, average 3D structure, charge distribution, hydrophobicity, and local contacts were used to develop and parametrize the ToeLoop algorithm for the prediction of the flexibility and motional time scale of every protein loop, which is also implemented as a public Web server (http://spin.ccic.ohio-state.edu/index.php/loop). The results demonstrate that loop dynamics with their time scales can be predicted rapidly with reasonable accuracy, which will allow the screening of average protein structures to help better understand the various roles loops can play in the context of protein-protein interactions and binding.
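The three loop classes described in this abstract can be written down as a small decision rule. The sketch below only encodes the thresholds stated in the text (fast below 10 ns, slow between 10 and 500 ns, static over the whole 500 ns trajectory); it is not the ToeLoop algorithm, and the names `LoopClass` and `classify_loop`, as well as the use of `None` for a loop with no resolved motion, are illustrative assumptions.

```python
from enum import Enum
from typing import Optional


class LoopClass(Enum):
    FAST = "fast"      # correlation time < 10 ns
    SLOW = "slow"      # correlation time between 10 and 500 ns
    STATIC = "static"  # no motion over the whole trajectory


FAST_CUTOFF_NS = 10.0   # from the abstract
TRAJECTORY_NS = 500.0   # simulation length given in the abstract


def classify_loop(correlation_time_ns: Optional[float]) -> LoopClass:
    """Assign a loop class from its correlation time in nanoseconds.

    None (an assumption of this sketch) marks a loop whose motion was not
    resolved within the trajectory; it is treated as static.
    """
    if correlation_time_ns is None or correlation_time_ns > TRAJECTORY_NS:
        return LoopClass.STATIC
    if correlation_time_ns < FAST_CUTOFF_NS:
        return LoopClass.FAST
    return LoopClass.SLOW
```

How the boundary values (exactly 10 ns or 500 ns) are assigned is not stated in the abstract; here both fall into the slow class.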
Affiliation(s)
- Yina Gu
- Department of Chemistry and Biochemistry and Campus Chemical Instrument Center, The Ohio State University, Columbus, Ohio 43210, United States
| | - Da-Wei Li
- Department of Chemistry and Biochemistry and Campus Chemical Instrument Center, The Ohio State University, Columbus, Ohio 43210, United States
| | - Rafael Brüschweiler
- Department of Chemistry and Biochemistry and Campus Chemical Instrument Center, The Ohio State University, Columbus, Ohio 43210, United States
| |
|
326
|
Pastor A, Singh AK, Shukla PK, Equbal MJ, Malik ST, Singh TP, Chaudhuri TK. Role of N-terminal region of Escherichia coli maltodextrin glucosidase in folding and function of the protein. Biochimica et Biophysica Acta - Proteins and Proteomics 2016; 1864:1138-1151. [PMID: 27317979 DOI: 10.1016/j.bbapap.2016.06.008] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 01/14/2016] [Revised: 06/10/2016] [Accepted: 06/14/2016] [Indexed: 01/06/2023]
Abstract
Maltodextrin glucosidase (MalZ) hydrolyses short malto-oligosaccharides from the reducing end, releasing glucose and maltose in Escherichia coli. MalZ is a highly aggregation-prone protein, and the molecular chaperonins GroEL and GroES assist in the folding of this protein to a substantial level. The N-terminal region of this enzyme appears to be a unique domain, as seen in sequence comparison studies with other amylases as well as through homology modelling. The sequence and homology model analysis show a probability of disorder in the N-terminal region of MalZ. The crystal structure of this enzyme is reported in the present communication. Based on the crystallographic structure, it has been interpreted that the N-terminal region of the enzyme (Met1-Phe131) might be unstructured or flexible. To understand the role of the N-terminal region of MalZ in its enzymatic activity and overall stability, a truncated version (Ala111-His616) of MalZ was created. The truncated version failed to fold into an active enzyme both in E. coli cytosol and in vitro, even with the assistance of chaperonins GroEL and GroES. Furthermore, the refolding of N-truncated MalZ in the presence of the isolated N-terminal domain did not succeed. Our studies suggest that while the structural rigidity or orientation of the N-terminal region of the MalZ protein may not be essential for its stability and function, this domain is likely to play an important role in the formation of the native structure of the protein when present as an integral part of the protein.
Affiliation(s)
- Ashutosh Pastor
- Kusuma School of Biological Sciences, Indian Institute of Technology Delhi, New Delhi 110016, India
| | - Amit K Singh
- Kusuma School of Biological Sciences, Indian Institute of Technology Delhi, New Delhi 110016, India
| | - Prakash K Shukla
- Department of Biophysics, All India Institute of Medical Sciences, New Delhi 110029, India
| | - Md Javed Equbal
- Kusuma School of Biological Sciences, Indian Institute of Technology Delhi, New Delhi 110016, India
| | - Shikha T Malik
- Kusuma School of Biological Sciences, Indian Institute of Technology Delhi, New Delhi 110016, India
| | - Tej P Singh
- Department of Biophysics, All India Institute of Medical Sciences, New Delhi 110029, India
| | - Tapan K Chaudhuri
- Kusuma School of Biological Sciences, Indian Institute of Technology Delhi, New Delhi 110016, India.
| |
|
327
|
Sarkar D, Patra P, Ghosh A, Saha S. Computational Framework for Prediction of Peptide Sequences That May Mediate Multiple Protein Interactions in Cancer-Associated Hub Proteins. PLoS One 2016; 11:e0155911. [PMID: 27218803 PMCID: PMC4878775 DOI: 10.1371/journal.pone.0155911] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 09/18/2015] [Accepted: 05/08/2016] [Indexed: 01/26/2023] Open
Abstract
A considerable proportion of protein-protein interactions (PPIs) in the cell are estimated to be mediated by very short peptide segments that approximately conform to specific sequence patterns known as linear motifs (LMs), often present in the disordered regions of eukaryotic proteins. These peptides have been found to interact with low affinity and are able to bind to multiple interactors, thus playing an important role in the PPI networks involving date hubs. In this work, PPI data and a de novo motif identification method (MEME) were used to identify such peptides in three cancer-associated hub proteins: MYC, APC and MDM2. The peptides corresponding to the significant LMs identified for each hub protein were aligned, the overlapping regions across these peptides being termed overlapping linear peptides (OLPs). These OLPs were thus predicted to be responsible for multiple PPIs of the corresponding hub proteins, and a scoring system was developed to rank them. We predicted six OLPs in MYC and five OLPs in MDM2 that scored higher than OLP predictions from randomly generated protein sets. Two OLP sequences from the C-terminus of MYC were predicted to bind with FBXW7, a component of an E3 ubiquitin-protein ligase complex involved in proteasomal degradation of MYC. Similarly, we identified peptides in the C-terminus of MDM2 interacting with FKBP3, which has a specific role in auto-ubiquitinylation of MDM2. The peptide sequences predicted in MYC and MDM2 look promising for designing orthosteric inhibitors against possible disease-associated PPIs. Since these OLPs can interact with other proteins as well, these inhibitors should be specific to the targeted interactor to prevent undesired side-effects. This computational framework has been designed to predict and rank the peptide regions that may mediate multiple PPIs and can be applied to other disease-associated date hub proteins for prediction of novel therapeutic targets of small molecule PPI modulators.
Affiliation(s)
| | - Piya Patra
- Maulana Abdul Kalam Azad University of Technology, Kolkata, India
| | - Abhirupa Ghosh
- Maulana Abdul Kalam Azad University of Technology, Kolkata, India
| | - Sudipto Saha
- Bioinformatics Centre, Bose Institute, Kolkata, India
| |
|
328
|
Malhis N, Jacobson M, Gsponer J. MoRFchibi SYSTEM: software tools for the identification of MoRFs in protein sequences. Nucleic Acids Res 2016; 44:W488-93. [PMID: 27174932 PMCID: PMC4987941 DOI: 10.1093/nar/gkw409] [Citation(s) in RCA: 117] [Impact Index Per Article: 13.0] [Received: 02/29/2016] [Accepted: 05/03/2016] [Indexed: 11/13/2022] Open
Abstract
Molecular recognition features, MoRFs, are short segments within longer disordered protein regions that bind to globular protein domains in a process known as disorder-to-order transition. MoRFs have been found to play a significant role in signaling and regulatory processes in cells. High-confidence computational identification of MoRFs remains an important challenge. In this work, we introduce MoRFchibi SYSTEM that contains three MoRF predictors: MoRFCHiBi, a basic predictor best suited as a component in other applications; MoRFCHiBi_Light, ideal for high-throughput predictions; and MoRFCHiBi_Web, slower than the other two but best for high accuracy predictions. Results show that MoRFchibi SYSTEM provides more than double the precision of other predictors. MoRFchibi SYSTEM is available in three different forms: as HTML web server, RESTful web server and downloadable software at: http://www.chibi.ubc.ca/faculty/joerg-gsponer/gsponer-lab/software/morf_chibi/.
Affiliation(s)
- Nawar Malhis
- Michael Smith Laboratories-Centre for High-Throughput Biology, The University of British Columbia, Vancouver, BC V6T 1Z4, Canada
| | - Matthew Jacobson
- Michael Smith Laboratories-Centre for High-Throughput Biology, The University of British Columbia, Vancouver, BC V6T 1Z4, Canada
| | - Jörg Gsponer
- Michael Smith Laboratories-Centre for High-Throughput Biology, The University of British Columbia, Vancouver, BC V6T 1Z4, Canada; Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, BC V6T 1Z3, Canada
| |
|
329
|
Al-Jiffri OH, Al-Sharif FM, Al-Jiffri EH, Uversky VN. Intrinsic disorder in biomarkers of insulin resistance, hypoadiponectinemia, and endothelial dysfunction among the type 2 diabetic patients. Intrinsically Disordered Proteins 2016; 4:e1171278. [PMID: 28232897 DOI: 10.1080/21690707.2016.1171278] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 12/02/2015] [Revised: 03/15/2016] [Accepted: 03/17/2016] [Indexed: 02/06/2023]
Abstract
Type 2 diabetes mellitus (T2DM) is a chronic and progressive disease that is strongly associated with various complications including cardiovascular diseases and related mortality. The present study aimed to analyze the abundance and functionality of intrinsically disordered regions in several biomarkers of insulin resistance, adiponectin, and endothelial dysfunction found in the T2DM patients. In fact, in comparison to controls, obese T2DM patients are known to have significantly higher levels of inter-cellular adhesion molecule (iCAM-1), vascular cell adhesion molecule (vCAM-1), and E-selectin, whereas their adiponectin levels are relatively low. Bioinformatics analysis revealed that these selected biomarkers (iCAM-1, vCAM-1, E-selectin, and adiponectin) are characterized by the noticeable levels of intrinsic disorder propensity and high binding promiscuity, which are important features expected for proteins serving as biomarkers. Within the limit of studied groups, there is an association between insulin resistance and both hypoadiponectinemia and endothelial dysfunction.
Affiliation(s)
- Osama H Al-Jiffri
- Department of Medical Laboratory Technology, Faculty of Applied Medical Sciences, King Abdulaziz University, Jeddah, Saudi Arabia
| | - Fadwa M Al-Sharif
- Department of Medical Laboratory Technology, Faculty of Applied Medical Sciences, King Abdulaziz University, Jeddah, Saudi Arabia
| | - Essam H Al-Jiffri
- Department of Medical Laboratory Technology, Faculty of Applied Medical Sciences, King Abdulaziz University, Jeddah, Saudi Arabia
| | - Vladimir N Uversky
- Faculty of Science, Department of Biological Science, King Abdulaziz University, Jeddah, Saudi Arabia; Department of Molecular Medicine and USF Health Byrd Alzheimer Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, USA; Laboratory of Structural Dynamics, Stability and Folding of Proteins, Institute of Cytology, Russian Academy of Sciences, St. Petersburg, Russia
| |
|
330
|
Banerjee S, Chakraborty S, De RK. Deciphering the cause of evolutionary variance within intrinsically disordered regions in human proteins. J Biomol Struct Dyn 2016; 35:233-249. [PMID: 26790343 DOI: 10.1080/07391102.2016.1143877] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Indexed: 02/07/2023]
Abstract
Why intrinsically disordered regions evolve within the human proteome has been an interesting question for a decade. To date, it remains an unsolved yet intriguing issue why some disordered regions evolve rapidly while the rest are highly conserved across mammalian species. Identifying the key biological factors responsible for the variation in the conservation rate of different disordered regions within the human proteome may help to address this issue. We emphasized that among the biological features (multifunctionality, gene essentiality, protein connectivity, number of unique domains, gene expression level and expression breadth) considered in our study, the number of unique protein domains acts as a strong determinant that negatively influences the conservation of disordered regions. In this context, we justified that proteins having fewer types of domains preferably need to conserve their disordered regions to enhance their structural flexibility, which in turn will facilitate their molecular interactions. In contrast, the selection pressure acting on the stretches of disordered regions is not so strong in the case of multi-domain proteins. Therefore, we reasoned that the presence of conserved disordered stretches may compensate for the functions of multiple domains within a single-domain protein. Interestingly, we noticed that the unique domain number and expression level influence the evolution of disordered regions differently from that of well-structured ones.
Affiliation(s)
- Sanghita Banerjee
- Machine Intelligence Unit, Indian Statistical Institute, 203 Barrackpore Trunk Road, Kolkata 700108, India
| | | | - Rajat K De
- Machine Intelligence Unit, Indian Statistical Institute, 203 Barrackpore Trunk Road, Kolkata 700108, India
| |
|
331
|
Stojanovski BM, Breydo L, Uversky VN, Ferreira GC. Murine erythroid 5-aminolevulinate synthase: Truncation of a disordered N-terminal extension is not detrimental for catalysis. BIOCHIMICA ET BIOPHYSICA ACTA-PROTEINS AND PROTEOMICS 2016; 1864:441-52. [DOI: 10.1016/j.bbapap.2016.02.002] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 10/19/2015] [Revised: 01/19/2016] [Accepted: 02/03/2016] [Indexed: 11/16/2022]
|
332
|
Wang C, Uversky VN, Kurgan L. Disordered nucleiome: Abundance of intrinsic disorder in the DNA- and RNA-binding proteins in 1121 species from Eukaryota, Bacteria and Archaea. Proteomics 2016; 16:1486-98. [DOI: 10.1002/pmic.201500177] [Citation(s) in RCA: 70] [Impact Index Per Article: 7.8] [Received: 05/11/2015] [Revised: 02/26/2016] [Accepted: 03/29/2016] [Indexed: 12/12/2022]
Affiliation(s)
- Chen Wang
- Department of Computer Science; Virginia Commonwealth University; Richmond VA USA
- Department of Electrical and Computer Engineering; University of Alberta; Edmonton Canada
| | - Vladimir N. Uversky
- Department of Molecular Medicine and USF Health Byrd Alzheimer's Research Institute; Morsani College of Medicine; University of South Florida; Tampa FL USA
- Institute for Biological Instrumentation; Russian Academy of Sciences; Pushchino Moscow Region Russian Federation
- Department of Biology; Faculty of Science; King Abdulaziz University; Jeddah Kingdom of Saudi Arabia
| | - Lukasz Kurgan
- Department of Computer Science; Virginia Commonwealth University; Richmond VA USA
- Department of Electrical and Computer Engineering; University of Alberta; Edmonton Canada
| |
|
333
|
Yacoub HA, Al-Maghrabi OA, Ahmed ES, Uversky VN. Abundance and functional roles of intrinsic disorder in the antimicrobial peptides of the NK-lysin family. J Biomol Struct Dyn 2016; 35:836-856. [DOI: 10.1080/07391102.2016.1164077] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Indexed: 10/22/2022]
Affiliation(s)
- Haitham A. Yacoub
- Faculty of Science, Department of Biological Sciences, University of Jeddah, Jeddah, Saudi Arabia
- Department of Cell Biology, Genetic Engineering and Biotechnology Division, National Research Centre, P.O. Box 12622, Gizza, Egypt
| | - Omar A. Al-Maghrabi
- Faculty of Science, Department of Biological Sciences, University of Jeddah, Jeddah, Saudi Arabia
| | - Ekram S. Ahmed
- Department of Cell Biology, Genetic Engineering and Biotechnology Division, National Research Centre, P.O. Box 12622, Gizza, Egypt
| | - Vladimir N. Uversky
- Faculty of Sciences, Department of Biological Sciences, King Abdulaziz University, P.O. Box 80203, Jeddah, Saudi Arabia
- Laboratory of Structural Dynamics, Stability and Folding of Proteins, Institute of Cytology, Russian Academy of Sciences, St. Petersburg, Russia
- Department of Molecular Medicine and USF Health Byrd Alzheimer’s Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
| |
|
334
|
Espinoza-Fonseca LM, Kelekar A. High-resolution structural characterization of Noxa, an intrinsically disordered protein, by microsecond molecular dynamics simulations. MOLECULAR BIOSYSTEMS 2016; 11:1850-6. [PMID: 25855872 DOI: 10.1039/c5mb00170f] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Indexed: 01/29/2023]
Abstract
High-resolution characterization of the structure and dynamics of intrinsically disordered proteins (IDPs) remains a challenging task. Consequently, a detailed understanding of the structural and functional features of IDPs remains limited, as very few full-length disordered proteins have been structurally characterized. We have performed microsecond-long molecular dynamics (MD) simulations of Noxa, the smallest member of the large Bcl-2 family of apoptosis regulating proteins, to characterize in atomic-level detail the structural features of a disordered protein. A 2.5 μs MD simulation starting from an unfolded state of the protein revealed the formation of a central antiparallel β-sheet structure flanked by two disordered segments at the N- and C-terminal ends. This topology is in reasonable agreement with protein disorder predictions and available experimental data. We show that this fold plays an essential role in the intracellular function and regulation of Noxa. We demonstrate that unbiased MD simulations in combination with a modern force field reveal structural and functional features of disordered proteins at atomic-level resolution.
Affiliation(s)
- L Michel Espinoza-Fonseca
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN 55455, USA.
| | | |
|
335
|
He CL, Bian YY, Xue Y, Liu ZX, Zhou KQ, Yao CF, Lin Y, Zou HF, Luo FX, Qu YY, Zhao JY, Ye ML, Zhao SM, Xu W. Pyruvate Kinase M2 Activates mTORC1 by Phosphorylating AKT1S1. Sci Rep 2016; 6:21524. [PMID: 26876154 PMCID: PMC4753445 DOI: 10.1038/srep21524] [Citation(s) in RCA: 99] [Impact Index Per Article: 11.0] [Received: 09/14/2015] [Accepted: 01/26/2016] [Indexed: 02/05/2023]
Abstract
In cancer cells, the mammalian target of rapamycin complex 1 (mTORC1), which requires hormonal and nutrient signals for its activation, is constitutively activated. We found that overexpression of pyruvate kinase M2 (PKM2) activates mTORC1 signaling by phosphorylating the mTORC1 inhibitor AKT1 substrate 1 (AKT1S1). An unbiased quantitative phosphoproteomic survey identified 974 PKM2 substrates, including serine202 and serine203 (S202/203) of AKT1S1, in the proteome of renal cell carcinoma (RCC). Phosphorylation of S202/203 of AKT1S1 by PKM2 released AKT1S1 from raptor and facilitated its binding to 14-3-3, which resulted in hormonal- and nutrient-signal-independent activation of mTORC1 signaling and led to accelerated oncogenic growth and autophagy inhibition in cancer cells. Decreasing S202/203 phosphorylation by TEPP-46 treatment reversed these effects. In RCCs and breast cancers, PKM2 overexpression was correlated with elevated S202/203 phosphorylation, activated mTORC1 and inhibited autophagy. Our results provided the first phosphorylome of PKM2 and revealed a constitutive mTORC1 activating mechanism in cancer cells.
Affiliation(s)
- Chang-Liang He
- State Key Lab of Genetic Engineering, Obstetrics & Gynecology Hospital of Fudan University and School of Life Sciences, Shanghai 200090, P.R. China
- Institutes of Biomedical Sciences and Collaborative Innovation Center for Genetics and Development Biology, Fudan University, Shanghai 200032, P.R. China
- Collaborative Innovation Center for Biotherapy, West China Hospital, Sichuan University, Chengdu, 610041, P.R. China
| | - Yang-Yang Bian
- Chinese Academy of Sciences, Dalian Institute Chemical Physics, National Chromatography R&A Center, Key Lab Separation Science Analytic Chemistry, Dalian 116023, P.R. China
| | - Yu Xue
- Department of Medical Engineering, College of Life Sciences and Technology, Huazhong University of Science and Technology, Wuhan 430074, P.R. China
| | - Ze-Xian Liu
- Department of Medical Engineering, College of Life Sciences and Technology, Huazhong University of Science and Technology, Wuhan 430074, P.R. China
| | - Kai-Qiang Zhou
- State Key Lab of Genetic Engineering, Obstetrics & Gynecology Hospital of Fudan University and School of Life Sciences, Shanghai 200090, P.R. China
- Institutes of Biomedical Sciences and Collaborative Innovation Center for Genetics and Development Biology, Fudan University, Shanghai 200032, P.R. China
| | - Cui-Fang Yao
- State Key Lab of Genetic Engineering, Obstetrics & Gynecology Hospital of Fudan University and School of Life Sciences, Shanghai 200090, P.R. China
- Institutes of Biomedical Sciences and Collaborative Innovation Center for Genetics and Development Biology, Fudan University, Shanghai 200032, P.R. China
| | - Yan Lin
- State Key Lab of Genetic Engineering, Obstetrics & Gynecology Hospital of Fudan University and School of Life Sciences, Shanghai 200090, P.R. China
- Institutes of Biomedical Sciences and Collaborative Innovation Center for Genetics and Development Biology, Fudan University, Shanghai 200032, P.R. China
| | - Han-Fa Zou
- Chinese Academy of Sciences, Dalian Institute Chemical Physics, National Chromatography R&A Center, Key Lab Separation Science Analytic Chemistry, Dalian 116023, P.R. China
| | - Fang-Xiu Luo
- Department of Pathology, Affiliated Ruijin Hospital of Shanghai Jiaotong University, Shanghai, 201821 P.R. China
| | - Yuan-Yuan Qu
- Department of Urology, Fudan University Shanghai Cancer Center, Shanghai 200032, P.R. China
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai 200032, P.R. China
| | - Jian-Yuan Zhao
- State Key Lab of Genetic Engineering, Obstetrics & Gynecology Hospital of Fudan University and School of Life Sciences, Shanghai 200090, P.R. China
- Collaborative Innovation Center for Biotherapy, West China Hospital, Sichuan University, Chengdu, 610041, P.R. China
| | - Ming-Liang Ye
- Chinese Academy of Sciences, Dalian Institute Chemical Physics, National Chromatography R&A Center, Key Lab Separation Science Analytic Chemistry, Dalian 116023, P.R. China
| | - Shi-Min Zhao
- State Key Lab of Genetic Engineering, Obstetrics & Gynecology Hospital of Fudan University and School of Life Sciences, Shanghai 200090, P.R. China
- Institutes of Biomedical Sciences and Collaborative Innovation Center for Genetics and Development Biology, Fudan University, Shanghai 200032, P.R. China
- Collaborative Innovation Center for Biotherapy, West China Hospital, Sichuan University, Chengdu, 610041, P.R. China
| | - Wei Xu
- State Key Lab of Genetic Engineering, Obstetrics & Gynecology Hospital of Fudan University and School of Life Sciences, Shanghai 200090, P.R. China
- Institutes of Biomedical Sciences and Collaborative Innovation Center for Genetics and Development Biology, Fudan University, Shanghai 200032, P.R. China
- Collaborative Innovation Center for Biotherapy, West China Hospital, Sichuan University, Chengdu, 610041, P.R. China
| |
|
336
|
Banerjee S, De RK. Structural disorder: a tool for housekeeping proteins performing tissue-specific interactions. J Biomol Struct Dyn 2016; 34:1930-45. [DOI: 10.1080/07391102.2015.1095115] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 10/23/2022]
Affiliation(s)
- Sanghita Banerjee
- Machine Intelligence Unit, Indian Statistical Institute, 203 Barrackpore Trunk Road, Kolkata 700108, India
| | - Rajat K. De
- Machine Intelligence Unit, Indian Statistical Institute, 203 Barrackpore Trunk Road, Kolkata 700108, India
| |
|
337
|
DeForte S, Uversky VN. Resolving the ambiguity: Making sense of intrinsic disorder when PDB structures disagree. Protein Sci 2016; 25:676-88. [PMID: 26683124 DOI: 10.1002/pro.2864] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.4] [Received: 10/30/2015] [Revised: 12/14/2015] [Accepted: 12/15/2015] [Indexed: 12/25/2022]
Abstract
Missing regions in X-ray crystal structures in the Protein Data Bank (PDB) have played a foundational role in the study of intrinsically disordered protein regions (IDPRs), especially in the development of in silico predictors of intrinsic disorder. However, a missing region is only a weak indication of intrinsic disorder, and this uncertainty is compounded by the presence of ambiguous regions, where more than one structure of the same protein sequence "disagrees" in terms of the presence or absence of missing residues. The question is this: are these ambiguous regions intrinsically disordered, or are they the result of static disorder that arises from experimental conditions, ensembles of structures, or domain wobbling? A novel way of looking at ambiguous regions in terms of the pattern between multiple PDB structures has been demonstrated. It was found that the propensity for intrinsic disorder increases as the level of ambiguity decreases. However, it is also shown that ambiguity is more likely to occur as the protein region is placed within different environmental conditions, and even the most ambiguous regions as a set display compositional bias that suggests flexibility. The results suggested that ambiguity is a natural result for many IDPRs crystallized under different conditions and that static disorder and wobbling domains are relatively rare. Instead, it is more likely that ambiguity arises because many of these regions were conditionally or partially disordered.
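The notion of an ambiguous region can be illustrated with a minimal sketch. It assumes each PDB structure of the same sequence has already been reduced to a per-residue mask of missing residues; the function name, the three labels, and the input layout are hypothetical and are not taken from the paper, which analyzes the pattern of disagreement in more detail.

```python
from typing import List


def classify_residues(missing_masks: List[List[bool]]) -> List[str]:
    """Label each residue by how the structures agree on it.

    missing_masks[s][i] is True when residue i has no coordinates in
    structure s (an assumed, pre-computed input).
    """
    if not missing_masks:
        raise ValueError("need at least one structure")
    length = len(missing_masks[0])
    if any(len(mask) != length for mask in missing_masks):
        raise ValueError("all masks must cover the same sequence")

    labels = []
    for i in range(length):
        n_missing = sum(1 for mask in missing_masks if mask[i])
        if n_missing == 0:
            labels.append("observed")    # present in every structure
        elif n_missing == len(missing_masks):
            labels.append("missing")     # absent from every structure
        else:
            labels.append("ambiguous")   # the structures disagree
    return labels


# Two structures of a three-residue sequence (toy data).
labels = classify_residues([[False, True, True], [False, False, True]])
```

In this toy input the second residue is resolved in one structure and missing in the other, so it is the only one labelled as ambiguous.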
Affiliation(s)
- Shelly DeForte
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, Florida, 33612
| | - Vladimir N Uversky
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, Florida, 33612; USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, Florida, 33612; Department of Biological Science, Faculty of Science, King Abdulaziz University, PO Box 80203, Jeddah, Jeddah 21589, Saudi Arabia; Institute for Biological Instrumentation, Russian Academy of Sciences, Pushchino, Moscow Region, 142290, Russian Federation; Laboratory of Structural Dynamics, Stability and Folding of Proteins, Institute of Cytology, Russian Academy of Sciences, St. Petersburg, Russian Federation
| |
|
338
|
Abstract
In the last two decades, it has become increasingly evident that a large number of proteins are either fully or partially disordered. Intrinsically disordered proteins are ubiquitous proteins that fulfill essential biological functions while lacking a stable 3D structure. Their conformational heterogeneity is encoded at the amino acid sequence level, thereby allowing intrinsically disordered proteins or regions to be recognized based on their sequence properties. The identification of disordered regions facilitates the functional annotation of proteins and is instrumental for delineating boundaries of protein domains amenable to crystallization. This chapter focuses on the methods currently employed for predicting disorder and identifying regions involved in induced folding.
Affiliation(s)
- Philippe Lieutaud
- AFMB UMR 7257, Aix-Marseille Université, 163, avenue de Luminy, Case 932, 13288, Marseille Cedex 09, France
- AFMB UMR 7257, CNRS, 163, avenue de Luminy, Case 932, 13288, Marseille Cedex 09, France
| | - François Ferron
- AFMB UMR 7257, Aix-Marseille Université, 163, avenue de Luminy, Case 932, 13288, Marseille Cedex 09, France
- AFMB UMR 7257, CNRS, 163, avenue de Luminy, Case 932, 13288, Marseille Cedex 09, France
| | - Sonia Longhi
- AFMB UMR 7257, Aix-Marseille Université, 163, avenue de Luminy, Case 932, 13288, Marseille Cedex 09, France.
- AFMB UMR 7257, CNRS, 163, avenue de Luminy, Case 932, 13288, Marseille Cedex 09, France.
| |
|
339
|
Yan J, Dunker AK, Uversky VN, Kurgan L. Molecular recognition features (MoRFs) in three domains of life. MOLECULAR BIOSYSTEMS 2016; 12:697-710. [DOI: 10.1039/c5mb00640f] [Citation(s) in RCA: 103] [Impact Index Per Article: 11.4] [Indexed: 12/15/2022]
Abstract
MoRFs are widespread intrinsically disordered protein-binding regions that have similar abundance and amino acid composition across the three domains of life.
Affiliation(s)
- Jing Yan
- Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Canada
| | - A. Keith Dunker
- Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, USA
- Indiana University School of Informatics
| | - Vladimir N. Uversky
- Department of Molecular Medicine and USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, USA
| | - Lukasz Kurgan
- Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Canada
- Department of Computer Science
| |
|
340
|
Meng F, Na I, Kurgan L, Uversky VN. Compartmentalization and Functionality of Nuclear Disorder: Intrinsic Disorder and Protein-Protein Interactions in Intra-Nuclear Compartments. Int J Mol Sci 2015; 17:ijms17010024. [PMID: 26712748 PMCID: PMC4730271 DOI: 10.3390/ijms17010024] [Citation(s) in RCA: 92] [Impact Index Per Article: 9.2] [Received: 10/11/2015] [Revised: 11/23/2015] [Accepted: 12/18/2015] [Indexed: 01/12/2023]
Abstract
The cell nucleus contains a number of membrane-less organelles or intra-nuclear compartments. These compartments are dynamic structures representing liquid-droplet phases which are only slightly denser than the bulk intra-nuclear fluid. They possess different functions, have diverse morphologies, and are typically composed of RNA (or, in some cases, DNA) and proteins. We analyzed 3005 mouse proteins localized in specific intra-nuclear organelles, such as nucleolus, chromatin, Cajal bodies, nuclear speckles, promyelocytic leukemia (PML) nuclear bodies, nuclear lamina, nuclear pores, and perinuclear compartment and compared them with ~29,863 non-nuclear proteins from mouse proteome. Our analysis revealed that intrinsic disorder is enriched in the majority of intra-nuclear compartments, except for the nuclear pore and lamina. These compartments are depleted in proteins that lack disordered domains and enriched in proteins that have multiple disordered domains. Moonlighting proteins found in multiple intra-nuclear compartments are more likely to have multiple disordered domains. Protein-protein interaction networks in the intra-nuclear compartments are denser and include more hubs compared to the non-nuclear proteins. Hubs in the intra-nuclear compartments (except for the nuclear pore) are enriched in disorder compared with non-nuclear hubs and non-nuclear proteins. Therefore, our work provides support to the idea of the functional importance of intrinsic disorder in the cell nucleus and shows that many proteins associated with sub-nuclear organelles in nuclei of mouse cells are enriched in disorder. This high level of disorder in the mouse nuclear proteins defines their ability to serve as very promiscuous binders, possessing both large quantities of potential disorder-based interaction sites and the ability of a single such site to be involved in a large number of interactions.
Affiliation(s)
- Fanchi Meng
- Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB T6G 2V4, Canada.
| | - Insung Na
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA.
| | - Lukasz Kurgan
- Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB T6G 2V4, Canada.
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23219, USA.
| | - Vladimir N Uversky
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA.
- University of South Florida Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA.
- Institute for Biological Instrumentation, Russian Academy of Sciences, Pushchino, Moscow Region 142292, Russia.
- Biology Department, Faculty of Science, King Abdulaziz University, P.O. Box 80203, Jeddah 21589, Saudi Arabia.
- Laboratory of Structural Dynamics, Stability and Folding of Proteins, Institute of Cytology, Russian Academy of Sciences, Saint Petersburg 194064, Russia.
| |
|
341
|
El-Baky NA, Uversky VN, Redwan EM. Human consensus interferons: Bridging the natural and artificial cytokines with intrinsic disorder. Cytokine Growth Factor Rev 2015; 26:637-45. [DOI: 10.1016/j.cytogfr.2015.07.012] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 06/28/2015] [Revised: 07/01/2015] [Accepted: 07/02/2015] [Indexed: 12/13/2022]
|
342
|
Peyro M, Soheilypour M, Lee BL, Mofrad MRK. Evolutionarily Conserved Sequence Features Regulate the Formation of the FG Network at the Center of the Nuclear Pore Complex. Sci Rep 2015; 5:15795. [PMID: 26541386 PMCID: PMC4635341 DOI: 10.1038/srep15795] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 05/27/2015] [Accepted: 09/29/2015] [Indexed: 12/29/2022]
Abstract
The nuclear pore complex (NPC) is the portal for bidirectional transportation of cargos between the nucleus and the cytoplasm. While most of the structural elements of the NPC, i.e. nucleoporins (Nups), are well characterized, the exact transport mechanism is still under much debate. Many of the functional Nups are rich in phenylalanine-glycine (FG) repeats and are believed to play the key role in nucleocytoplasmic transport. We present a bioinformatics study conducted on more than a thousand FG Nups across 252 species. Our results reveal the regulatory role of polar residues and specific sequences of charged residues, named 'like charge regions' (LCRs), in the formation of the FG network at the center of the NPC. Positively charged LCRs prepare the environment for negatively charged cargo complexes and regulate the size of the FG network. The low number density of charged residues in these regions prevents FG domains from forming a relaxed coil structure. Our results highlight the significant role of polar interactions in FG network formation at the center of the NPC and demonstrate that the specific localization of LCRs, FG motifs, charged residues, and polar residues regulates the formation of the FG network at the center of the NPC.
Affiliation(s)
- M Peyro
- Molecular Cell Biomechanics Laboratory, Departments of Bioengineering and Mechanical Engineering, University of California, Berkeley, CA 94720
| | - M Soheilypour
- Molecular Cell Biomechanics Laboratory, Departments of Bioengineering and Mechanical Engineering, University of California, Berkeley, CA 94720
| | - B L Lee
- Molecular Cell Biomechanics Laboratory, Departments of Bioengineering and Mechanical Engineering, University of California, Berkeley, CA 94720
| | - M R K Mofrad
- Molecular Cell Biomechanics Laboratory, Departments of Bioengineering and Mechanical Engineering, University of California, Berkeley, CA 94720
| |
|
343
|
DisPredict: A Predictor of Disordered Protein Using Optimized RBF Kernel. PLoS One 2015; 10:e0141551. [PMID: 26517719 PMCID: PMC4627842 DOI: 10.1371/journal.pone.0141551] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Received: 01/04/2015] [Accepted: 10/09/2015] [Indexed: 12/02/2022]
Abstract
Intrinsically disordered proteins or regions perform important biological functions through their dynamic conformations during binding. Thus, accurate identification of these disordered regions has significant implications for proper annotation of function, induced fold prediction and drug design to combat critical diseases. We introduce DisPredict, a disorder predictor that employs a single support vector machine with RBF kernel and novel features for reliable characterization of protein structure. DisPredict yields effective performance. In addition to 10-fold cross validation, training and testing of DisPredict was conducted with independent test datasets. The results were consistent, with both the training and test errors minimal. The use of multiple data sources makes the predictor generic. The datasets used in developing the model include disordered regions of various lengths, categorized as short and long, with different compositions and different types of disorder, ranging from fully to partially disordered regions as well as completely ordered regions. Through comparison with other state-of-the-art approaches and case studies, DisPredict is found to be a useful tool with competitive performance. DisPredict is available at https://github.com/tamjidul/DisPredict_v1.0.
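For readers unfamiliar with the RBF kernel mentioned above, the following minimal sketch shows the standard Gaussian kernel that such a support vector machine evaluates between two feature vectors. The feature vectors and the `gamma` value here are placeholders chosen for illustration; DisPredict's actual features and optimized kernel parameters are not given in this abstract.

```python
import math
from typing import Sequence


def rbf_kernel(x: Sequence[float], y: Sequence[float], gamma: float) -> float:
    """Standard RBF kernel: K(x, y) = exp(-gamma * ||x - y||^2)."""
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Hypothetical per-residue feature vectors (illustrative values only).
residue_a = [0.2, 0.7, 0.1]
residue_b = [0.3, 0.6, 0.1]
similarity = rbf_kernel(residue_a, residue_b, gamma=0.5)
```

The kernel equals 1 for identical vectors and decays toward 0 as the vectors move apart; `gamma` controls how quickly that decay happens, which is why it is usually tuned together with the SVM regularization parameter.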
|
344
|
Malhis N, Wong ETC, Nassar R, Gsponer J. Computational Identification of MoRFs in Protein Sequences Using Hierarchical Application of Bayes Rule. PLoS One 2015; 10:e0141603. [PMID: 26517836 PMCID: PMC4627796 DOI: 10.1371/journal.pone.0141603] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Received: 05/12/2015] [Accepted: 10/09/2015] [Indexed: 01/24/2023]
Abstract
Motivation: Intrinsically disordered regions of proteins play an essential role in the regulation of various biological processes. Key to their regulatory function is often the binding to globular protein domains via sequence elements known as molecular recognition features (MoRFs). Development of computational tools for the identification of candidate MoRF locations in amino acid sequences is an important task and an area of growing interest. Given the relative sparseness of MoRFs in protein sequences, the accuracy of the available MoRF predictors is often inadequate for practical usage, which leaves a significant need and room for improvement. In this work, we introduce MoRFCHiBi_Web, which predicts MoRF locations in protein sequences with higher accuracy compared to current MoRF predictors. Methods: Three distinct and largely independent property scores are computed with component predictors and then combined to generate the final MoRF propensity scores. The first score reflects the likelihood of sequence windows to harbour MoRFs and is based on amino acid composition and sequence similarity information. It is generated by MoRFCHiBi using small windows of up to 40 residues in size. The second score identifies long stretches of protein disorder and is generated by ESpritz with the DisProt option. Lastly, the third score reflects residue conservation and is assembled from PSSM files generated by PSI-BLAST. These propensity scores are processed and then hierarchically combined using Bayes rule to generate the final MoRFCHiBi_Web predictions. Results: MoRFCHiBi_Web was tested on three datasets. Results show that MoRFCHiBi_Web outperforms previously developed predictors by generating less than half the false positive rate for the same true positive rate at practical threshold values. This level of accuracy paired with its relatively high processing speed makes MoRFCHiBi_Web a practical tool for MoRF prediction. Availability: http://morf.chibi.ubc.ca:8080/morf/.
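The idea of combining scores hierarchically with Bayes rule can be sketched as follows. This is a generic illustration, not the authors' implementation: it assumes each score has already been converted to a per-residue probability, that the scores are conditionally independent, and that the prior is uniform (0.5). The function names are hypothetical, and the processing the abstract says is applied before combination is not reproduced.

```python
def bayes_combine(p: float, q: float) -> float:
    """Combine two independent probabilities under a uniform prior.

    The posterior odds are the product of the individual odds, giving
    p*q / (p*q + (1-p)*(1-q)).
    """
    for value in (p, q):
        if not 0.0 <= value <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    numerator = p * q
    denominator = numerator + (1.0 - p) * (1.0 - q)
    if denominator == 0.0:
        raise ValueError("scores are in total contradiction (one is 0, the other 1)")
    return numerator / denominator


def hierarchical_combine(morf: float, disorder: float, conservation: float) -> float:
    """Combine the first two scores, then fold in the third."""
    return bayes_combine(bayes_combine(morf, disorder), conservation)


# Illustrative per-residue scores (made-up values).
final_score = hierarchical_combine(0.8, 0.8, 0.5)
```

Under these assumptions a score of 0.5 carries no information and leaves the combined value unchanged, while two agreeing scores above 0.5 reinforce each other.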
Affiliation(s)
- Nawar Malhis
  - Centre for High-Throughput Biology, University of British Columbia, Vancouver, BC, Canada
  - * E-mail: (NM); (JG)
- Eric T. C. Wong
  - Centre for High-Throughput Biology, University of British Columbia, Vancouver, BC, Canada
  - Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, BC, Canada
- Roy Nassar
  - Centre for High-Throughput Biology, University of British Columbia, Vancouver, BC, Canada
- Jörg Gsponer
  - Centre for High-Throughput Biology, University of British Columbia, Vancouver, BC, Canada
  - Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, BC, Canada
  - * E-mail: (NM); (JG)
345
Li J, Feng Y, Wang X, Li J, Liu W, Rong L, Bao J. An Overview of Predictors for Intrinsically Disordered Proteins over 2010-2014. Int J Mol Sci 2015; 16:23446-62. [PMID: 26426014 PMCID: PMC4632708 DOI: 10.3390/ijms161023446] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.5] [Received: 05/31/2015] [Revised: 08/25/2015] [Accepted: 08/31/2015] [Indexed: 02/05/2023] Open
Abstract
The sequence-structure-function paradigm of proteins has been changed by the occurrence of intrinsically disordered proteins (IDPs). Benefiting from the structural disorder, IDPs are of particular importance in biological processes like regulation and signaling. IDPs are associated with human diseases, including cancer, cardiovascular disease, neurodegenerative diseases, amyloidoses, and several other maladies. IDPs attract a high level of interest and a substantial effort has been made to develop experimental and computational methods. So far, more than 70 prediction tools have been developed since 1997, within which 17 predictors were created in the last five years. Here, we presented an overview of IDPs predictors developed during 2010-2014. We analyzed the algorithms used for IDPs prediction by these tools and we also discussed the basic concept of various prediction methods for IDPs. The comparison of prediction performance among these tools is discussed as well.
Affiliation(s)
- Jianzong Li
  - College of Life Sciences & Key Laboratory of Ministry of Education for Bio-Resources and Bio-Environment, Sichuan University, Chengdu 610064, China.
- Yu Feng
  - College of Life Sciences & Key Laboratory of Ministry of Education for Bio-Resources and Bio-Environment, Sichuan University, Chengdu 610064, China.
- Xiaoyun Wang
  - College of Life Sciences & Key Laboratory of Ministry of Education for Bio-Resources and Bio-Environment, Sichuan University, Chengdu 610064, China.
- Jing Li
  - College of Life Sciences & Key Laboratory of Ministry of Education for Bio-Resources and Bio-Environment, Sichuan University, Chengdu 610064, China.
  - State Key Laboratory of Biotherapy/Collaborative Innovation Center for Biotherapy, West China Hospital, Sichuan University, Chengdu 610041, China.
- Wen Liu
  - College of Life Sciences & Key Laboratory of Ministry of Education for Bio-Resources and Bio-Environment, Sichuan University, Chengdu 610064, China.
- Li Rong
  - College of Life Sciences & Key Laboratory of Ministry of Education for Bio-Resources and Bio-Environment, Sichuan University, Chengdu 610064, China.
- Jinku Bao
  - College of Life Sciences & Key Laboratory of Ministry of Education for Bio-Resources and Bio-Environment, Sichuan University, Chengdu 610064, China.
  - State Key Laboratory of Biotherapy/Collaborative Innovation Center for Biotherapy, West China Hospital, Sichuan University, Chengdu 610041, China.
  - State Key Laboratory of Oral Diseases, West China College of Stomatology, Sichuan University, Chengdu 610041, China.
346
Pellegrini M. Tandem Repeats in Proteins: Prediction Algorithms and Biological Role. Front Bioeng Biotechnol 2015; 3:143. [PMID: 26442257 PMCID: PMC4585158 DOI: 10.3389/fbioe.2015.00143] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Received: 06/12/2015] [Accepted: 09/07/2015] [Indexed: 12/30/2022] Open
Abstract
Tandem repetitions in protein sequence and structure is a fascinating subject of research which has been a focus of study since the late 1990s. In this survey, we give an overview on the multi-faceted aspects of research on protein tandem repeats (PTR for short), including prediction algorithms, databases, early classification efforts, mechanisms of PTR formation and evolution, and synthetic PTR design. We also touch on the rather open issue of the relationship between PTR and flexibility (or disorder) in proteins. Detection of PTR either from protein sequence or structure data is challenging due to inherent high (biological) signal-to-noise ratio that is a key feature of this problem. As early in silico analytic tools have been key enablers for starting this field of study, we expect that current and future algorithmic and statistical breakthroughs will have a high impact on the investigations of the biological role of PTR.
Affiliation(s)
- Marco Pellegrini
  - Laboratory for Integrative Systems Medicine (LISM), Istituto di Informatica e Telematica, and Istituto di Fisiologia Clinica, Consiglio Nazionale delle Ricerche, Pisa, Italy
347
Permyakov SE, Permyakov EA, Uversky VN. Intrinsically disordered caldesmon binds calmodulin via the "buttons on a string" mechanism. PeerJ 2015; 3:e1265. [PMID: 26417545 PMCID: PMC4582948 DOI: 10.7717/peerj.1265] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 07/24/2015] [Accepted: 09/03/2015] [Indexed: 01/27/2023] Open
Abstract
We show here that chicken gizzard caldesmon (CaD) and its C-terminal domain (residues 636–771, CaD136) are intrinsically disordered proteins. The computational and experimental analyses of the wild type CaD136 and series of its single tryptophan mutants (W674A, W707A, and W737A) and a double tryptophan mutant (W674A/W707A) suggested that although the interaction of CaD136 with calmodulin (CaM) can be driven by the non-specific electrostatic attraction between these oppositely charged molecules, the specificity of CaD136-CaM binding is likely to be determined by the specific packing of important CaD136 tryptophan residues at the CaD136-CaM interface. It is suggested that this interaction can be described as the “buttons on a charged string” model, where the electrostatic attraction between the intrinsically disordered CaD136 and the CaM is solidified in a “snapping buttons” manner by specific packing of the CaD136 “pliable buttons” (which are the short segments of fluctuating local structure condensed around the tryptophan residues) at the CaD136-CaM interface. Our data also show that all three “buttons” are important for binding, since mutation of any of the tryptophans affects CaD136-CaM binding and since CaD136 remains CaM-buttoned even when two of the three tryptophans are mutated to alanines.
Affiliation(s)
- Sergei E Permyakov
  - Protein Research Group, Institute for Biological Instrumentation, Russian Academy of Sciences, Pushchino, Moscow Region, Russia
- Eugene A Permyakov
  - Protein Research Group, Institute for Biological Instrumentation, Russian Academy of Sciences, Pushchino, Moscow Region, Russia
- Vladimir N Uversky
  - Protein Research Group, Institute for Biological Instrumentation, Russian Academy of Sciences, Pushchino, Moscow Region, Russia
  - Department of Molecular Medicine, University of South Florida, Tampa, FL, USA
348
Baraldi E, Coller E, Zoli L, Cestaro A, Tosatto SCE, Zambelli B. Unfoldome variation upon plant-pathogen interactions: strawberry infection by Colletotrichum acutatum. Plant Mol Biol 2015; 89:49-65. [PMID: 26245354 DOI: 10.1007/s11103-015-0353-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 07/14/2015] [Accepted: 07/26/2015] [Indexed: 06/04/2023]
Abstract
Intrinsically disordered proteins (IDPs) are proteins that lack secondary and/or tertiary structure under physiological conditions. These proteins are very abundant in eukaryotic proteomes and play crucial roles in all molecular mechanisms underlying the response to environmental challenges. In plants, different IDPs involved in stress response have been identified and characterized. Nevertheless, a comprehensive evaluation of protein disorder in plant proteomes under abiotic or biotic stresses is not available so far. In the present work the transcriptome dataset of strawberry (Fragaria × ananassa) fruits interacting with the fungal pathogen Colletotrichum acutatum was actualized onto the woodland strawberry (Fragaria vesca) genome. The obtained cDNA sequences were translated into protein sequences, which were subsequently subjected to disorder analysis. The results, providing the first estimation of disorder abundance associated with plant infection, showed that the proteome activated in the strawberry red fruit during the active fungal propagation is remarkably depleted in disorder. On the other hand, in the resistant white fruit, no significant disorder reduction is observed in the proteins expressed in response to fungal infection. Four representative proteins, FvSMP, FvPRKRIP, FvPCD-4 and FvFAM32A-like, predicted as mainly disordered and never experimentally characterized before, were isolated, and the absence of structure was validated at the secondary and tertiary level using circular dichroism and differential scanning fluorimetry. Their quaternary structure was also established using light scattering. The results are discussed considering the role of protein disorder in plant defense.
Affiliation(s)
- Elena Baraldi
  - Department of Agricultural Sciences, University of Bologna, Bologna, Italy
- Emanuela Coller
  - Research and Innovation Centre, Foundation Edmund Mach (FEM), San Michele all'Adige, Trento, Italy
  - Department of Biomedical Sciences, University of Padova, Padua, Italy
- Lisa Zoli
  - Department of Agricultural Sciences, University of Bologna, Bologna, Italy
- Alessandro Cestaro
  - Research and Innovation Centre, Foundation Edmund Mach (FEM), San Michele all'Adige, Trento, Italy
- Barbara Zambelli
  - Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy.
349
Cong Q, Borek D, Otwinowski Z, Grishin NV. Skipper genome sheds light on unique phenotypic traits and phylogeny. BMC Genomics 2015; 16:639. [PMID: 26311350 PMCID: PMC4551732 DOI: 10.1186/s12864-015-1846-0] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.0] [Received: 05/18/2015] [Accepted: 08/14/2015] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND: Butterflies and moths are emerging as model organisms in genetics and evolutionary studies. The family Hesperiidae (skippers) was traditionally viewed as a sister to other butterflies based on its moth-like morphology and darting flight habits with fast wing beats. However, DNA studies suggest that the family Papilionidae (swallowtails) may be the sister to other butterflies including skippers. The moth-like features and the controversial position of skippers in Lepidoptera phylogeny make them valuable targets for comparative genomics.
RESULTS: We obtained the 310 Mb draft genome of the Clouded Skipper (Lerema accius) from a wild-caught specimen using a cost-effective strategy that overcomes the high (1.6%) heterozygosity problem. Comparative analysis of Lerema accius and the highly heterozygous genome of Papilio glaucus revealed differences in patterns of SNP distribution, but similarities in functions of genes that are enriched in non-synonymous SNPs. Comparison of Lepidoptera genomes revealed possible molecular bases for unique traits of skippers: a duplication of electron transport chain components could result in efficient energy supply for their rapid flight; a diversified family of predicted cellulases might allow them to feed on cellulose-enriched grasses; an expansion of pheromone-binding proteins and enzymes for pheromone synthesis implies a more efficient mate-recognition system, which compensates for the lack of clear visual cues due to the similarities in wing colors and patterns of many species of skippers. Phylogenetic analysis of several Lepidoptera genomes suggested that the position of Hesperiidae remains uncertain as the tree topology varied depending on the evolutionary model.
CONCLUSION: Completion of the first genome from the family Hesperiidae allowed comparative analyses with other Lepidoptera that revealed potential genetic bases for the unique phenotypic traits of skippers. This work lays the foundation for future experimental studies of skippers and provides a rich dataset for comparative genomics and phylogenetic studies of Lepidoptera.
Affiliation(s)
- Qian Cong
  - Department of Biophysics and Biochemistry, University of Texas Southwestern Medical Center, 5323 Harry Hines Boulevard, Dallas, TX, 75390-8816, USA.
- Dominika Borek
  - Department of Biophysics and Biochemistry, University of Texas Southwestern Medical Center, 5323 Harry Hines Boulevard, Dallas, TX, 75390-8816, USA.
- Zbyszek Otwinowski
  - Department of Biophysics and Biochemistry, University of Texas Southwestern Medical Center, 5323 Harry Hines Boulevard, Dallas, TX, 75390-8816, USA.
- Nick V Grishin
  - Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, 5323 Harry Hines Boulevard, Dallas, TX, 75390-9050, USA.
  - Department of Biophysics and Biochemistry, University of Texas Southwestern Medical Center, 5323 Harry Hines Boulevard, Dallas, TX, 75390-8816, USA.
350
Volpato V, Alshomrani B, Pollastri G. Accurate Ab Initio and Template-Based Prediction of Short Intrinsically-Disordered Regions by Bidirectional Recurrent Neural Networks Trained on Large-Scale Datasets. Int J Mol Sci 2015; 16:19868-85. [PMID: 26307973 PMCID: PMC4581330 DOI: 10.3390/ijms160819868] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 06/01/2015] [Revised: 07/28/2015] [Accepted: 07/29/2015] [Indexed: 12/02/2022] Open
Abstract
Intrinsically-disordered regions lack a well-defined 3D structure, but play key roles in determining the function of many proteins. Although predictors of disorder have been shown to achieve relatively high rates of correct classification of these segments, improvements over the years have been slow, and accurate methods are needed that are capable of accommodating the ever-increasing amount of structurally-determined protein sequences to try to boost predictive performances. In this paper, we propose a predictor for short disordered regions based on bidirectional recurrent neural networks and tested by rigorous five-fold cross-validation on a large, non-redundant dataset collected from MobiDB, a new comprehensive source of protein disorder annotations. The system exploits sequence and structural information in the forms of frequency profiles, predicted secondary structure and solvent accessibility and direct disorder annotations from homologous protein structures (templates) deposited in the Protein Data Bank. The contributions of sequence, structure and homology information result in large improvements in predictive accuracy. Additionally, the large scale of the training set leads to low false positive rates, making our systems a robust and efficient way to address high-throughput disorder prediction.
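The five-fold cross-validation mentioned in the abstract can be sketched as follows. This is an illustrative assumption about how such a split might be organised, not the authors' protocol: the function name `k_fold_splits`, the protein identifiers and the fixed seed are hypothetical, and redundancy reduction of the dataset is assumed to have happened upstream. Splitting by whole proteins, rather than by residues, keeps all residues of one chain on the same side of the train/test boundary.

```python
import random
from typing import Iterator, List, Sequence, Tuple


def k_fold_splits(protein_ids: Sequence[str], k: int = 5,
                  seed: int = 0) -> Iterator[Tuple[List[str], List[str]]]:
    """Yield (train, test) lists of protein IDs for k-fold cross-validation."""
    ids = list(protein_ids)
    # Shuffle with a local, seeded generator so the split is reproducible.
    random.Random(seed).shuffle(ids)
    # Deal the shuffled IDs into k folds of near-equal size.
    folds = [ids[i::k] for i in range(k)]
    for held_out in range(k):
        test = folds[held_out]
        train = [p for i, fold in enumerate(folds) if i != held_out for p in fold]
        yield train, test


if __name__ == "__main__":
    # Hypothetical identifiers standing in for entries of a disorder dataset.
    ids = [f"P{i:03d}" for i in range(23)]
    for train, test in k_fold_splits(ids):
        print(len(train), len(test))
```

Each protein appears in exactly one test fold, so every prediction used for evaluation comes from a model that never saw that protein during training.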
Affiliation(s)
- Viola Volpato
  - School of Computer Science, University College Dublin, Belfield, Dublin 4, Ireland.
  - Adaptive and Complex Systems Laboratory, University College Dublin, Belfield, Dublin 4, Ireland.
- Badr Alshomrani
  - School of Computer Science, University College Dublin, Belfield, Dublin 4, Ireland.
  - Adaptive and Complex Systems Laboratory, University College Dublin, Belfield, Dublin 4, Ireland.
- Gianluca Pollastri
  - School of Computer Science, University College Dublin, Belfield, Dublin 4, Ireland.
  - Adaptive and Complex Systems Laboratory, University College Dublin, Belfield, Dublin 4, Ireland.