51. Zhernov I, Diez S, Braun M, Lansky Z. Intrinsically Disordered Domain of Kinesin-3 Kif14 Enables Unique Functional Diversity. Curr Biol 2020; 30:3342-3351.e5. [PMID: 32649913 DOI: 10.1016/j.cub.2020.06.039] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 01/23/2020] [Revised: 05/06/2020] [Accepted: 06/11/2020] [Indexed: 12/12/2022]
Abstract
In addition to their force-generating motor domains, kinesin motor proteins feature various accessory domains enabling them to fulfill a variety of functions in the cell. Human kinesin-3, Kif14, localizes to the midbody of the mitotic spindle and is involved in the progression of cytokinesis. The specific motor properties enabling Kif14's cellular functions, however, remain unknown. Here, we show in vitro that the intrinsically disordered N-terminal domain of Kif14 enables unique functional diversity of the kinesin. Using single-molecule TIRF microscopy, we found that Kif14 exists either as a diffusible monomer or as a processive dimer and that the disordered domain (1) enables diffusibility of the monomeric Kif14, (2) renders the dimeric Kif14 super-processive and enables the kinesin to pass through highly crowded areas, (3) enables robust, autonomous Kif14 tracking of growing microtubule tips, independent of microtubule end-binding (EB) proteins, and (4) is sufficient to enable crosslinking of parallel microtubules and necessary to enable Kif14-driven sliding of antiparallel ones. We explain these features of Kif14 by the observed diffusible interaction of the disordered domain with the microtubule lattice and the observed increased affinity of the disordered domain for GTP-bound tubulin. We suggest that the disordered domain tethers the motor domain to the microtubule, providing a diffusible foothold and a regulatory hub that tunes the kinesin's interaction with microtubules. Our findings thus exemplify pliable protein tethering as a fundamental mechanism of molecular motor regulation.
Affiliation(s)
- Ilia Zhernov
  - Institute of Biotechnology of the Czech Academy of Sciences, BIOCEV, Prumyslova 595, 252 50 Vestec, Prague West, Czech Republic; Faculty of Mathematics and Physics, Charles University, Ke Karlovu 3, 121 16 Prague, Czech Republic
- Stefan Diez
  - B CUBE - Center for Molecular Bioengineering, TU Dresden, Tatzberg 41, 01307 Dresden, Germany; Cluster of Excellence Physics of Life, TU Dresden, Tatzberg 47/49, 01307 Dresden, Germany; Max Planck Institute of Molecular Cell Biology and Genetics, Pfotenhauerstr. 108, Dresden 01307, Germany
- Marcus Braun
  - Institute of Biotechnology of the Czech Academy of Sciences, BIOCEV, Prumyslova 595, 252 50 Vestec, Prague West, Czech Republic
- Zdenek Lansky
  - Institute of Biotechnology of the Czech Academy of Sciences, BIOCEV, Prumyslova 595, 252 50 Vestec, Prague West, Czech Republic

52. Guo Y, Qiu W, Roche TE, Hackert ML. Crystal structure of the catalytic subunit of bovine pyruvate dehydrogenase phosphatase. Acta Crystallogr F Struct Biol Commun 2020; 76:292-301. [PMID: 32627744 DOI: 10.1107/s2053230x20007943] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 03/02/2020] [Accepted: 06/11/2020] [Indexed: 11/11/2022]
Abstract
Mammalian pyruvate dehydrogenase (PDH) activity is tightly regulated by phosphorylation and dephosphorylation, which are catalyzed by PDH kinase isomers and PDH phosphatase isomers, respectively. PDH phosphatase isomer 1 (PDP1) is a heterodimer consisting of a catalytic subunit (PDP1c) and a regulatory subunit (PDP1r). Here, the crystal structure of bovine PDP1c determined at 2.1 Å resolution is reported. The crystals belonged to space group P3₂21, with unit-cell parameters a = b = 75.3, c = 173.2 Å. The structure was solved by molecular-replacement methods and refined to a final R factor of 21.9% (Rfree = 24.7%). The final model consists of 402 of a possible 467 amino-acid residues of the PDP1c monomer, two Mn²⁺ ions in the active site, an additional Mn²⁺ ion coordinated by His410 and His414, two MnSO₄ ion pairs at special positions near the crystallographic twofold symmetry axis and 226 water molecules. Several new features of the PDP1c structure are revealed. The requirements are described and plausible bases are deduced for the interaction of PDP1c with PDP1r and other components of the pyruvate dehydrogenase complex.
Affiliation(s)
- Youzhong Guo
  - Department of Medicinal Chemistry, Virginia Commonwealth University, Richmond, VA 23298, USA
- Weihua Qiu
  - Department of Medicinal Chemistry, Virginia Commonwealth University, Richmond, VA 23298, USA
- Thomas E Roche
  - Department of Biochemistry and Molecular Biophysics, Kansas State University, Manhattan, KS 66506, USA
- Marvin L Hackert
  - Department of Molecular Biosciences, The University of Texas at Austin, Austin, TX 78712, USA

53. Paci V, Krasteva I, Orsini M, Di Febo T, Luciani M, Perletta F, Di Pasquale A, Mattioli M, Tittarelli M. Proteomic analysis of Brucella melitensis and Brucella ovis for identification of virulence factor using bioinformatics approachs. Mol Cell Probes 2020; 53:101581. [PMID: 32428653 DOI: 10.1016/j.mcp.2020.101581] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 01/15/2020] [Revised: 04/10/2020] [Accepted: 04/18/2020] [Indexed: 11/15/2022]
Abstract
The genus Brucella includes several genetically monomorphic species that nevertheless differ in phenotypic and virulence characteristics. In this study, proteins of two Brucella species, B. melitensis type strain 16M and B. ovis REO198, were compared using a proteomics approach in order to explain the phenotypic and pathophysiological differences among Brucella species and correlate them with virulence factors. Protein extracts from the two Brucella species were separated by SDS-PAGE, and 5 areas that differed qualitatively and quantitatively were analyzed by nLC-MS/MS. A total of 880 proteins (274 proteins of B. melitensis and 606 proteins of B. ovis) were identified; their functional and structural features were analyzed with bioinformatics tools. Four unique peptides belonging to 3 proteins for B. ovis and 10 peptides derived from 7 proteins for B. melitensis were chosen for their high number of predicted solvent-exposed B-cell epitopes. Among these proteins, outer-membrane immunogenic protein (N8LTS7) and 25 kDa outer-membrane immunogenic protein (Q45321), of B. ovis and B. melitensis, respectively, could be interesting candidates for improving diagnostic tests and vaccines. Moreover, 8 and 13 outer and periplasmic non-homologous proteins of B. ovis and B. melitensis were identified to screen the phenotypic differences between the two Brucella strains. These proteins will be used to unravel pathogenesis and improve current diagnostic assays.
Affiliation(s)
- Valentina Paci
  - Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise (IZSAM), Teramo, Italy; University of Teramo, Faculty of Bioscience and Agro-Food and Environmental Technology, Teramo, Italy
- Ivanka Krasteva
  - Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise (IZSAM), Teramo, Italy
- Massimiliano Orsini
  - Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise (IZSAM), Teramo, Italy
- Tiziana Di Febo
  - Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise (IZSAM), Teramo, Italy
- Mirella Luciani
  - Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise (IZSAM), Teramo, Italy
- Fabrizia Perletta
  - Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise (IZSAM), Teramo, Italy
- Adriano Di Pasquale
  - Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise (IZSAM), Teramo, Italy
- Mauro Mattioli
  - University of Teramo, Faculty of Bioscience and Agro-Food and Environmental Technology, Teramo, Italy
- Manuela Tittarelli
  - Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise (IZSAM), Teramo, Italy

54. The First 3D Model of the Full-Length KIT Cytoplasmic Domain Reveals a New Look for an Old Receptor. Sci Rep 2020; 10:5401. [PMID: 32214210 PMCID: PMC7096506 DOI: 10.1038/s41598-020-62460-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 11/25/2019] [Accepted: 03/02/2020] [Indexed: 11/18/2022] Open
Abstract
Receptor tyrosine kinases (RTKs) are key regulators of normal cellular processes and have a critical role in the development and progression of many diseases. RTK ligand-induced stimulation leads to activation of the cytoplasmic kinase domain that controls the intracellular signalling. Although the kinase domain of RTKs has been extensively studied using X-ray analysis, the kinase insert domain (KID) and the C-terminal are partially or fully missing in all reported structures. We communicate the first structural model of the full-length RTK KIT cytoplasmic domain, a crucial target for cancer therapy. This model was achieved by integration of ab initio KID and C-terminal probe models into an X-ray structure, and by their further exploration through molecular dynamics (MD) simulation. An extended (2-µs) MD simulation of the proper model provided insight into the structure and conformational dynamics of the full-length cytoplasmic domain of KIT, which can be exploited in the description of the KIT transduction processes.

55. Hanson J, Paliwal KK, Litfin T, Zhou Y. SPOT-Disorder2: Improved Protein Intrinsic Disorder Prediction by Ensembled Deep Learning. Genomics Proteomics Bioinformatics 2020; 17:645-656. [PMID: 32173600 PMCID: PMC7212484 DOI: 10.1016/j.gpb.2019.01.004] [Citation(s) in RCA: 101] [Impact Index Per Article: 20.2] [Received: 10/26/2018] [Revised: 01/18/2019] [Accepted: 02/15/2019] [Indexed: 01/13/2023]
Abstract
Intrinsically disordered or unstructured proteins (or regions in proteins) have been found to be important in a wide range of biological functions and implicated in many diseases. Due to the high cost and low efficiency of experimental determination of intrinsic disorder and the exponential increase of unannotated protein sequences, developing complementary computational prediction methods has been an active area of research for several decades. Here, we employed an ensemble of deep Squeeze-and-Excitation residual inception and long short-term memory (LSTM) networks for predicting protein intrinsic disorder with input from evolutionary information and predicted one-dimensional structural properties. The method, called SPOT-Disorder2, offers substantial and consistent improvement not only over our previous technique based on LSTM networks alone, but also over other state-of-the-art techniques in three independent tests with different ratios of disordered to ordered amino acid residues, and for sequences with either rich or limited evolutionary information. More importantly, semi-disordered regions predicted in SPOT-Disorder2 are more accurate in identifying molecular recognition features (MoRFs) than methods directly designed for MoRFs prediction. SPOT-Disorder2 is available as a web server and as a standalone program at https://sparks-lab.org/server/spot-disorder2/.
Affiliation(s)
- Jack Hanson
  - Signal Processing Laboratory, Griffith University, Brisbane 4111, Australia
- Kuldip K Paliwal
  - Signal Processing Laboratory, Griffith University, Brisbane 4111, Australia
- Thomas Litfin
  - School of Information and Communication Technology, Griffith University, Gold Coast 4222, Australia
- Yaoqi Zhou
  - School of Information and Communication Technology, Griffith University, Gold Coast 4222, Australia; Institute for Glycomics, Griffith University, Gold Coast 4222, Australia

56. Oltrogge LM, Chaijarasphong T, Chen AW, Bolin ER, Marqusee S, Savage DF. Multivalent interactions between CsoS2 and Rubisco mediate α-carboxysome formation. Nat Struct Mol Biol 2020; 27:281-287. [PMID: 32123388 PMCID: PMC7337323 DOI: 10.1038/s41594-020-0387-7] [Citation(s) in RCA: 98] [Impact Index Per Article: 19.6] [Received: 07/18/2019] [Accepted: 01/24/2020] [Indexed: 11/23/2022]
Abstract
Carboxysomes are bacterial microcompartments that function as the centerpiece of the bacterial CO2-concentrating mechanism by facilitating high CO2 concentrations near the carboxylase Rubisco. The carboxysome self-assembles from thousands of individual proteins into icosahedral-like particles with a dense enzyme cargo encapsulated within a proteinaceous shell. In the case of the α-carboxysome, there is little molecular insight into protein-protein interactions that drive the assembly process. Here, studies on the α-carboxysome from Halothiobacillus neapolitanus demonstrate that Rubisco interacts with the N-terminus of CsoS2, a multivalent, intrinsically disordered protein. X-ray structural analysis of the CsoS2 interaction motif bound to Rubisco reveals a series of conserved electrostatic interactions that are only made with properly assembled hexadecameric Rubisco. Although biophysical measurements indicate this single interaction is weak, its implicit multivalency induces high-affinity binding through avidity. Taken together, our results indicate CsoS2 acts as an interaction hub to condense Rubisco and enable efficient α-carboxysome formation.
Affiliation(s)
- Luke M Oltrogge
  - Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, CA, USA
- Thawatchai Chaijarasphong
  - Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, CA, USA; Department of Biotechnology, Faculty of Science, Mahidol University, Bangkok, Thailand
- Allen W Chen
  - Department of Chemistry, University of California Berkeley, Berkeley, CA, USA
- Eric R Bolin
  - Biophysics Graduate Program, University of California Berkeley, Berkeley, CA, USA; California Institute for Quantitative Biosciences, University of California Berkeley, Berkeley, CA, USA
- Susan Marqusee
  - Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, CA, USA; Department of Chemistry, University of California Berkeley, Berkeley, CA, USA; California Institute for Quantitative Biosciences, University of California Berkeley, Berkeley, CA, USA
- David F Savage
  - Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, CA, USA

57. Changes in hydrophobicity mainly promotes the aggregation tendency of ALS associated SOD1 mutants. Int J Biol Macromol 2020; 145:904-913. [PMID: 31669277 DOI: 10.1016/j.ijbiomac.2019.09.181] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 06/30/2019] [Revised: 09/24/2019] [Accepted: 09/26/2019] [Indexed: 12/19/2022]
Abstract
Protein misfolding and aggregation due to mutations are associated with fatal neurodegenerative disorders. Mutations in Cu/Zn superoxide dismutase (SOD1) that cause its misfolding and aggregation are linked to the motor neuron disorder amyotrophic lateral sclerosis (ALS). Since the mutations are scattered throughout the SOD1 structure, the exact molecular mechanism underlying ALS pathology remains unresolved. In this study, we have investigated the major molecular factors that contribute to SOD1 destabilization, intrinsic disorder, and misfolding using sequence and structural information. We have analysed 153 ALS-causing SOD1 point mutants for aggregation tendency using four different aggregation prediction tools, viz., Aggrescan3D (A3D), CamSol, GAP and Zyggregator. Our results suggest that 74-79 mutants are susceptible to aggregation due to distorted native interactions originating at the mutation site. The majority of the aggregation-prone mutants are located in the buried regions of the SOD1 molecule. Further, mutations at hydrophobic amino acids primarily promote the aggregation tendency of the SOD1 protein through different destabilizing mechanisms, including changes in hydrophobic free energy, loss of electrostatic interactions on the protein's surface and loss of hydrogen bonds that bridge the protein core and surface.

58. Oldfield CJ, Fan X, Wang C, Dunker AK, Kurgan L. Computational Prediction of Intrinsic Disorder in Protein Sequences with the disCoP Meta-predictor. Methods Mol Biol 2020; 2141:21-35. [PMID: 32696351 DOI: 10.1007/978-1-0716-0524-0_2] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 12/12/2022]
Abstract
Intrinsically disordered proteins are either entirely disordered or contain disordered regions in their native state. These proteins and regions function without the prerequisite of a stable structure and were found to be abundant across all kingdoms of life. Experimental annotation of disorder lags behind the rapidly growing number of sequenced proteins, motivating the development of computational methods that predict disorder in protein sequences. DisCoP is a user-friendly webserver that provides accurate sequence-based prediction of protein disorder. It relies on a meta-architecture in which the outputs generated by multiple disorder predictors are combined to improve predictive performance. The architecture of disCoP is presented, and its accuracy relative to several other disorder predictors is briefly discussed. We describe usage of the web interface and explain how to access and read results generated by this computational tool. We also provide an example of prediction results and interpretation. The disCoP webserver is publicly available at http://biomine.cs.vcu.edu/servers/disCoP/.
Affiliation(s)
- Xiao Fan
  - Department of Pediatrics, Columbia University, New York, NY, USA
- Chen Wang
  - Department of Medicine, Columbia University, New York, NY, USA
- A Keith Dunker
  - Department of Biochemistry and Molecular Biology, Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN, USA
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA

59.
Abstract
Intrinsically disordered regions (IDRs) are estimated to be highly abundant in nature. While only several thousand proteins are annotated with experimentally derived IDRs, computational methods can be used to predict IDRs for the millions of currently uncharacterized protein chains. Several dozen disorder predictors were developed over the last few decades. While some of these methods provide accurate predictions, unavoidably they also make some mistakes. Consequently, one of the challenges facing users of these methods is how to decide which predictions can be trusted and which are likely incorrect. This practical problem can be solved using quality assessment (QA) scores that predict correctness of the underlying (disorder) predictions at a residue level. We motivate and describe a first-of-its-kind toolbox of QA methods, QUARTER (QUality Assessment for pRotein inTrinsic disordEr pRedictions), which provides the scores for a diverse set of ten disorder predictors. QUARTER is available to the end users as a free and convenient webserver at http://biomine.cs.vcu.edu/servers/QUARTER/. We briefly describe the predictive architecture of QUARTER and provide detailed instructions on how to use the webserver. We also explain how to interpret results produced by QUARTER with the help of a case study.

60. Katuwawala A, Oldfield CJ, Kurgan L. DISOselect: Disorder predictor selection at the protein level. Protein Sci 2020; 29:184-200. [PMID: 31642118 PMCID: PMC6933862 DOI: 10.1002/pro.3756] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 07/31/2019] [Revised: 10/16/2019] [Accepted: 10/17/2019] [Indexed: 12/27/2022]
Abstract
The intense interest in the intrinsically disordered proteins in the life science community, together with the remarkable advancements in predictive technologies, have given rise to the development of a large number of computational predictors of intrinsic disorder from protein sequence. While the growing number of predictors is a positive trend, we have observed a considerable difference in predictive quality among predictors for individual proteins. Furthermore, variable predictor performance is often inconsistent between predictors for different proteins, and the predictor that shows the best predictive performance depends on the unique properties of each protein sequence. We propose a computational approach, DISOselect, to estimate the predictive performance of 12 selected predictors for individual proteins based on their unique sequence-derived properties. This estimation informs the users about the expected predictive quality for a selected disorder predictor and can be used to recommend methods that are likely to provide the best quality predictions. Our solution does not depend on the results of any disorder predictor; the estimations are made based solely on the protein sequence. Our solution significantly improves predictive performance, as judged with a test set of 1,000 proteins, when compared to other alternatives. We have empirically shown that by using the recommended methods the overall predictive performance for a given set of proteins can be improved by a statistically significant margin. DISOselect is freely available for non-commercial users through the webserver at http://biomine.cs.vcu.edu/servers/DISOselect/.
Affiliation(s)
- Akila Katuwawala
  - Department of Computer Science, Virginia Commonwealth University, Richmond, Virginia
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, Richmond, Virginia

61. Katuwawala A, Oldfield CJ, Kurgan L. Accuracy of protein-level disorder predictions. Brief Bioinform 2019; 21:1509-1522. [DOI: 10.1093/bib/bbz100] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 05/06/2019] [Revised: 06/22/2019] [Accepted: 07/15/2019] [Indexed: 01/15/2023] Open
Abstract
Experimental annotations of intrinsic disorder are available for 0.1% of the 147 000 000 currently sequenced proteins. Over 60 sequence-based disorder predictors were developed to help bridge this gap. Current benchmarks of these methods assess predictive performance on datasets of proteins; however, predictions are often interpreted for individual proteins. We demonstrate that the protein-level predictive performance varies substantially from the dataset-level benchmarks. Thus, we perform a first-of-its-kind protein-level assessment for 13 popular disorder predictors using 6200 disorder-annotated proteins. We show that the protein-level distributions are substantially skewed toward high predictive quality while having long tails of poor predictions. Consequently, between 57% and 75% of proteins secure higher predictive performance than the currently used dataset-level assessment suggests, but as many as 30% of proteins that are located in the long tails suffer low predictive performance. These proteins typically have relatively high amounts of disorder, in contrast to the mostly structured proteins that are predicted accurately by all 13 methods. Interestingly, each predictor provides the most accurate results for some number of proteins, while the method that performs best at the dataset level is in fact the best for only about 30% of proteins. Moreover, the majority of proteins are predicted more accurately than the dataset-level performance of the most accurate tool by at least four disorder predictors. While these results suggest that disorder predictors outperform their current benchmark performance for the majority of proteins and that they complement each other, novel tools that accurately identify the hard-to-predict proteins and that make accurate predictions for these proteins are needed.
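The gap between dataset-level and protein-level assessment described in this abstract can be illustrated with a minimal sketch. The three toy proteins, their labels, and the use of plain per-residue accuracy as the score are assumptions made purely for illustration; they are not the study's data or its metrics.

```python
# Hypothetical toy data: per-residue labels, 1 = disordered, 0 = ordered.
# Each entry maps a made-up protein name to (predicted, annotated) labels.
proteins = {
    "protA": ([0] * 10, [0] * 10),
    "protB": ([0] * 8 + [1, 0], [0] * 8 + [1, 1]),
    "protC": ([0, 0, 1, 0, 0], [1, 1, 1, 1, 0]),
}


def accuracy(pred, truth):
    """Fraction of residues whose predicted label matches the annotation."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


# Dataset-level: pool every residue from every protein, then score once.
pooled_pred = [p for pred, _ in proteins.values() for p in pred]
pooled_true = [t for _, truth in proteins.values() for t in truth]
dataset_level = accuracy(pooled_pred, pooled_true)

# Protein-level: score each protein separately.
protein_level = {
    name: accuracy(pred, truth) for name, (pred, truth) in proteins.items()
}

# Proteins that individually score above the pooled, dataset-level value.
above = [name for name, acc in protein_level.items() if acc > dataset_level]

print(f"dataset-level accuracy: {dataset_level:.2f}")
for name, acc in protein_level.items():
    print(f"{name}: {acc:.2f}")
print(f"proteins above the dataset-level score: {len(above)} of {len(protein_level)}")
```

In this toy case two of the three proteins score above the pooled value, while the mostly disordered one falls well below it, which mirrors the qualitative pattern the abstract reports.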
Affiliation(s)
- Akila Katuwawala
  - Department of Computer Science, Virginia Commonwealth University, USA
- Christopher J Oldfield
  - Department of Computer Science, Virginia Commonwealth University, USA
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, USA

62. Cannon JF. Novel phosphorylation-dependent regulation in an unstructured protein. Proteins 2019; 88:366-384. [PMID: 31512287 DOI: 10.1002/prot.25812] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 05/08/2019] [Revised: 07/15/2019] [Accepted: 09/04/2019] [Indexed: 12/15/2022]
Abstract
This work explores how phosphorylation of an unstructured protein region in inhibitor-2 (I2) regulates protein phosphatase-1 (PP1) enzyme activity using molecular dynamics (MD). Free I2 is largely unstructured; however, when bound to PP1, three segments adopt a stable structure. In particular, an I2 helix (i-helix) blocks the PP1 active site and inhibits phosphatase activity. I2 phosphorylation in the PP1-I2 complex activates phosphatase activity without I2 dissociation. The I2 Thr74 regulatory phosphorylation site is in an unstructured domain in PP1-I2. PP1-I2 MD demonstrated that I2 phosphorylation promotes early steps of PP1-I2 activation in explicit solvent models. Moreover, phosphorylation-dependent activation occurred in PP1-I2 complexes derived from I2 orthologs with diverse sequences from human, yeast, worm, and protozoa. This system allowed exploration of features of the 73-residue unstructured human I2 domain critical for phosphorylation-dependent activation. These studies revealed that components of I2 unstructured domain are strategically positioned for phosphorylation responsiveness including a transient α-helix. There was no evidence that electrostatic interactions of I2 phosphothreonine74 influenced PP1-I2 activation. Instead, phosphorylation altered the conformation of residues around Thr74. Phosphorylation uncurled the distance between I2 residues Glu71 to Tyr76 to promote PP1-I2 activation, whereas reduced distances reduced activation. This I2 residue Glu71 to Tyr76 distance distribution, independently from Thr74 phosphorylation, controls I2 i-helix displacement from the PP1 active site leading to PP1-I2 activation.
Affiliation(s)
- John F Cannon
  - Department of Molecular Microbiology and Immunology, University of Missouri, Columbia, Missouri

63. Liu Y, Wang X, Liu B. A comprehensive review and comparison of existing computational methods for intrinsically disordered protein and region prediction. Brief Bioinform 2019; 20:330-346. [PMID: 30657889 DOI: 10.1093/bib/bbx126] [Citation(s) in RCA: 95] [Impact Index Per Article: 15.8] [Received: 07/19/2017] [Indexed: 01/06/2023] Open
Abstract
Intrinsically disordered proteins and regions are widely distributed in proteins, which are associated with many biological processes and diseases. Accurate prediction of intrinsically disordered proteins and regions is critical for both basic research (such as protein structure and function prediction) and practical applications (such as drug development). During the past decades, many computational approaches have been proposed, which have greatly facilitated the development of this important field. Therefore, a comprehensive and updated review is highly required. In this regard, we give a review on the computational methods for intrinsically disordered protein and region prediction, especially focusing on the recent development in this field. These computational approaches are divided into four categories based on their methodologies, including physicochemical-based method, machine-learning-based method, template-based method and meta method. Furthermore, their advantages and disadvantages are also discussed. The performance of 40 state-of-the-art predictors is directly compared on the target proteins in the task of disordered region prediction in the 10th Critical Assessment of protein Structure Prediction. A more comprehensive performance comparison of 45 different predictors is conducted based on seven widely used benchmark data sets. Finally, some open problems and perspectives are discussed.
Affiliation(s)
- Yumeng Liu
  - School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, China
- Xiaolong Wang
  - School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, China
- Bin Liu
  - School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, China

64. Coskuner O, Uversky VN. Intrinsically disordered proteins in various hypotheses on the pathogenesis of Alzheimer's and Parkinson's diseases. Prog Mol Biol Transl Sci 2019; 166:145-223. [PMID: 31521231 DOI: 10.1016/bs.pmbts.2019.05.007] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Indexed: 02/06/2023]
Abstract
Amyloid-β (Aβ) and α-synuclein (αS) are two intrinsically disordered proteins (IDPs) at the centers of the pathogenesis of Alzheimer's and Parkinson's diseases, respectively. Different hypotheses have been proposed to explain the molecular mechanisms of the pathogenesis of these two diseases, with these two IDPs being involved in many of these hypotheses. Currently, we do not know which of these hypotheses is more accurate. Experiments studying these two IDPs in detail face challenges due to rapid conformational changes, fast aggregation processes, and solvent and paramagnetic effects. Furthermore, pathological modifications impact their structures and energetics. Theoretical studies using computational chemistry and computational biology have been utilized to understand the structures and energetics of Aβ and αS. In this chapter, we introduce Aβ and αS in light of various hypotheses, and discuss different experimental and theoretical techniques that are used to study these two proteins along with their weaknesses and strengths. We suggest that a promising solution for studying Aβ and αS at the center of varying hypotheses could be provided by developing new techniques that link quantum mechanics, statistical mechanics, thermodynamics, and bioinformatics to machine learning. Such new developments could also lead to advances in experimental techniques.
Affiliation(s)
- Orkid Coskuner
- Turkish-German University, Molecular Biotechnology, Istanbul, Turkey.
- Vladimir N Uversky
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL, United States; Laboratory of New Methods in Biology, Institute for Biological Instrumentation, Russian Academy of Sciences, Moscow, Russia.

65
Katuwawala A, Peng Z, Yang J, Kurgan L. Computational Prediction of MoRFs, Short Disorder-to-order Transitioning Protein Binding Regions. Comput Struct Biotechnol J 2019; 17:454-462. [PMID: 31007871 PMCID: PMC6453775 DOI: 10.1016/j.csbj.2019.03.013] [Citation(s) in RCA: 47] [Impact Index Per Article: 7.8] [Received: 02/06/2019] [Revised: 03/22/2019] [Accepted: 03/23/2019] [Indexed: 12/28/2022]
Abstract
Molecular recognition features (MoRFs) are short protein-binding regions that undergo disorder-to-order transitions (induced folding) upon binding protein partners. These regions are abundant in nature and can be predicted from protein sequences based on their distinctive sequence signatures. This first-of-its-kind survey covers 14 MoRF predictors and six related methods for the prediction of short protein-binding linear motifs, disordered protein-binding regions and semi-disordered regions. We show that the development of MoRF predictors has accelerated in the recent years. These predictors depend on machine learning-derived models that were generated using training datasets where MoRFs are annotated using putative disorder. Our analysis reveals that they generate accurate predictions. We identified eight methods that offer area under the ROC curve (AUC) ≥ 0.7 on experimentally-validated test datasets. We show that modern MoRF predictors accurately find experimentally annotated MoRFs even though they were trained using the putative disorder annotations. They are relatively highly-cited, particularly the methods available as webservers that on average secure three times more citations than methods without this option. MoRF predictions contribute to the experimental discovery of protein-protein interactions, annotation of protein functions and computational analysis of a variety of proteomes, protein families, and pathways. We outline future development and application directions for these tools, stressing the importance to develop novel tools that would target interactions of disordered regions with other types of partners.
Affiliation(s)
- Akila Katuwawala
- Department of Computer Science, Virginia Commonwealth University, USA
- Zhenling Peng
- Center for Applied Mathematics, Tianjin University, Tianjin, China
- Jianyi Yang
- School of Mathematical Sciences, Nankai University, Tianjin, China
- Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, USA

66
Nielsen JT, Mulder FAA. Quality and bias of protein disorder predictors. Sci Rep 2019; 9:5137. [PMID: 30914747 PMCID: PMC6435736 DOI: 10.1038/s41598-019-41644-w] [Citation(s) in RCA: 64] [Impact Index Per Article: 10.7] [Received: 12/04/2018] [Accepted: 03/13/2019] [Indexed: 02/03/2023]
Abstract
Disorder in proteins is vital for biological function, yet it is challenging to characterize. Therefore, methods for predicting protein disorder from sequence are fundamental. Currently, predictors are trained and evaluated using data from X-ray structures or from various biochemical or spectroscopic data. However, the prediction accuracy of disordered predictors is not calibrated, nor is it established whether predictors are intrinsically biased towards one of the extremes of the order-disorder axis. We therefore generated and validated a comprehensive experimental benchmarking set of site-specific and continuous disorder, using deposited NMR chemical shift data. This novel experimental data collection is fully appropriate and represents the full spectrum of disorder. We subsequently analyzed the performance of 26 widely-used disorder prediction methods and found that these vary noticeably. At the same time, a distinct bias for over-predicting order was identified for some algorithms. Our analysis has important implications for the validity and the interpretation of protein disorder, as utilized, for example, in assessing the content of disorder in proteomes.
Affiliation(s)
- Jakob T Nielsen
- Interdisciplinary Nanoscience Center (iNANO), Aarhus University, Gustav Wieds Vej 14, 8000, Aarhus C, Denmark.
- Department of Chemistry, Aarhus University, Langelandsgade 140, 8000, Aarhus C, Denmark.
- Frans A A Mulder
- Interdisciplinary Nanoscience Center (iNANO), Aarhus University, Gustav Wieds Vej 14, 8000, Aarhus C, Denmark.
- Department of Chemistry, Aarhus University, Langelandsgade 140, 8000, Aarhus C, Denmark.

67
Shimomura T, Nishijima K, Kikuchi T. A new technique for predicting intrinsically disordered regions based on average distance map constructed with inter-residue average distance statistics. BMC Struct Biol 2019; 19:3. [PMID: 30727987 PMCID: PMC6366092 DOI: 10.1186/s12900-019-0101-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 09/08/2018] [Accepted: 01/23/2019] [Indexed: 01/03/2023]
Abstract
Background: It had long been thought that a protein exhibits its specific function through its own specific 3D-structure under physiological conditions. However, subsequent research has shown that there are many proteins without specific 3D-structures under physiological conditions, so-called intrinsically disordered proteins (IDPs). This study presents a new technique for predicting intrinsically disordered regions in a protein, based on our average distance map (ADM) technique. The ADM technique was developed to predict compact regions or structural domains in a protein. In a protein containing partially disordered regions, a domain region is likely to be ordered, thus it is unlikely that a disordered region would be part of any domain. Therefore, the ADM technique is expected to also predict a disordered region between domains.
Results: The results of our new technique are comparable to the top three performing techniques in the community-wide CASP10 experiment. We further discuss the case of p53, a tumor-suppressor protein, which is the most significant protein among cell cycle regulatory proteins. This protein exhibits a disordered character as a monomer but an ordered character when two p53s form a dimer.
Conclusion: Our technique can predict the location of an intrinsically disordered region in a protein with an accuracy comparable to the best techniques proposed so far. Furthermore, it can also predict a core region of IDPs forming definite 3D structures through interactions, such as dimerization. The technique in our study may also serve as a means of predicting a disordered region which would become an ordered structure when binding to another protein.
Affiliation(s)
- Takumi Shimomura
- Department of Bioinformatics, College of Life Sciences, Ritsumeikan University, 1-1-1 Nojihigashi, Kusatsu, Shiga, 525-8577, Japan
- Kohki Nishijima
- Department of Bioinformatics, College of Life Sciences, Ritsumeikan University, 1-1-1 Nojihigashi, Kusatsu, Shiga, 525-8577, Japan
- Takeshi Kikuchi
- Department of Bioinformatics, College of Life Sciences, Ritsumeikan University, 1-1-1 Nojihigashi, Kusatsu, Shiga, 525-8577, Japan.

68
Oldfield CJ, Chen K, Kurgan L. Computational Prediction of Secondary and Supersecondary Structures from Protein Sequences. Methods Mol Biol 2019; 1958:73-100. [PMID: 30945214 DOI: 10.1007/978-1-4939-9161-7_4] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 12/12/2022]
Abstract
Many new methods for the sequence-based prediction of the secondary and supersecondary structures have been developed over the last several years. These and older sequence-based predictors are widely applied for the characterization and prediction of protein structure and function. These efforts have produced countless accurate predictors, many of which rely on state-of-the-art machine learning models and evolutionary information generated from multiple sequence alignments. We describe and motivate both types of predictions. We introduce concepts related to the annotation and computational prediction of the three-state and eight-state secondary structure as well as several types of supersecondary structures, such as β hairpins, coiled coils, and α-turn-α motifs. We review 34 predictors focusing on recent tools and provide detailed information for a selected set of 14 secondary structure and 3 supersecondary structure predictors. We conclude with several practical notes for the end users of these predictive methods.
Affiliation(s)
- Christopher J Oldfield
- Department of Computer Science, College of Engineering, Virginia Commonwealth University, Richmond, VA, USA
- Ke Chen
- School of Computer Science and Software Engineering, Tianjin Polytechnic University, Tianjin, People's Republic of China
- Lukasz Kurgan
- Department of Computer Science, College of Engineering, Virginia Commonwealth University, Richmond, VA, USA.

69
Oldfield CJ, Uversky VN, Dunker AK, Kurgan L. Introduction to intrinsically disordered proteins and regions. Proteins 2019. [DOI: 10.1016/b978-0-12-816348-1.00001-6] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 12/03/2022]

70
Zhao B, Xue B. Decision-Tree Based Meta-Strategy Improved Accuracy of Disorder Prediction and Identified Novel Disordered Residues Inside Binding Motifs. Int J Mol Sci 2018; 19:E3052. [PMID: 30301243 PMCID: PMC6213717 DOI: 10.3390/ijms19103052] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 08/20/2018] [Revised: 09/24/2018] [Accepted: 10/04/2018] [Indexed: 02/06/2023]
Abstract
Using computational techniques to identify intrinsically disordered residues is practical and effective in biological studies. Therefore, designing novel high-accuracy strategies is always preferable when existing strategies have a lot of room for improvement. Among many possibilities, a meta-strategy that integrates the results of multiple individual predictors has been broadly used to improve the overall performance of predictors. Nonetheless, a simple and direct integration of individual predictors may not effectively improve the performance. In this project, dual-threshold two-step significance voting and neural networks were used to integrate the predictive results of four individual predictors, including: DisEMBL, IUPred, VSL2, and ESpritz. The new meta-strategy has improved the prediction performance of intrinsically disordered residues significantly, compared to all four individual predictors and another four recently-designed predictors. The improvement was validated using five-fold cross-validation and in independent test datasets.
Affiliation(s)
- Bi Zhao
- Department of Cell Biology, Microbiology and Molecular Biology, School of Natural Sciences and Mathematics, College of Arts and Sciences, University of South Florida, Tampa, FL 33620, USA.
- Bin Xue
- Department of Cell Biology, Microbiology and Molecular Biology, School of Natural Sciences and Mathematics, College of Arts and Sciences, University of South Florida, Tampa, FL 33620, USA.

71
Mann K, Cerveau N, Gummich M, Fritz M, Mann M, Jackson DJ. In-depth proteomic analyses of Haliotis laevigata (greenlip abalone) nacre and prismatic organic shell matrix. Proteome Sci 2018; 16:11. [PMID: 29983641 PMCID: PMC6003135 DOI: 10.1186/s12953-018-0139-3] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 02/23/2018] [Accepted: 05/25/2018] [Indexed: 01/12/2023]
Abstract
Background: The shells of various Haliotis species have served as models of invertebrate biomineralization and physical shell properties for more than 20 years. A focus of this research has been the nacreous inner layer of the shell with its conspicuous arrangement of aragonite platelets, resembling in cross-section a brick-and-mortar wall. In comparison, the outer, less stable, calcitic prismatic layer has received much less attention. One of the first molluscan shell proteins to be characterized at the molecular level was Lustrin A, a component of the nacreous organic matrix of Haliotis rufescens. This was soon followed by the C-type lectin perlucin and the growth factor-binding perlustrin, both isolated from H. laevigata nacre, and the crystal growth-modulating AP7 and AP24, isolated from H. rufescens nacre. Mass spectrometry-based proteomics was subsequently applied to Haliotis biomineralization research with the analysis of the H. asinina shell matrix and yielded 14 different shell-associated proteins. That study was the most comprehensive for a Haliotis species to date.
Methods: The shell proteomes of the nacre and prismatic layer of the marine gastropod Haliotis laevigata were analyzed by combining mass spectrometry-based proteomics and next generation sequencing.
Results: We identified 297 proteins from the nacreous shell layer and 350 proteins from the prismatic shell layer of the greenlip abalone H. laevigata. Considering the overlap between the two sets, we identified a total of 448 proteins. Fifty-one nacre proteins and 43 prismatic layer proteins were defined as major proteins based on their abundance at more than 0.2% of the total. The remaining proteins occurred at low abundance and may not play any significant role in shell fabrication. The overlap of major proteins between the two shell layers was 17, amounting to a total of 77 major proteins.
Conclusions: The H. laevigata shell proteome shares moderate sequence similarity at the protein level with other gastropod, bivalve and more distantly related invertebrate biomineralising proteomes. Features conserved in H. laevigata and other molluscan shell proteomes include short repetitive sequences of low complexity predicted to lack intrinsic three-dimensional structure, and domains such as tyrosinase, chitin-binding, and carbonic anhydrase. This catalogue of H. laevigata shell proteins represents the most comprehensive for a haliotid and should support future efforts to elucidate the molecular mechanisms of shell assembly.
Affiliation(s)
- Karlheinz Mann
- Abteilung Proteomics und Signaltransduktion, Max-Planck-Institut für Biochemie, Am Klopferspitz 18, D-82152 Martinsried, Germany
- Nicolas Cerveau
- Department of Geobiology, Georg-August University of Göttingen, Goldschmidstr. 3, 37077 Göttingen, Germany
- Meike Gummich
- Universität Bremen, Institut für Biophysik, Otto Hahn Allee NW1, D-28334 Bremen, Germany
- Monika Fritz
- Universität Bremen, Institut für Biophysik, Otto Hahn Allee NW1, D-28334 Bremen, Germany
- Matthias Mann
- Abteilung Proteomics und Signaltransduktion, Max-Planck-Institut für Biochemie, Am Klopferspitz 18, D-82152 Martinsried, Germany
- Daniel J Jackson
- Department of Geobiology, Georg-August University of Göttingen, Goldschmidstr. 3, 37077 Göttingen, Germany

72
Dosztányi Z. Prediction of protein disorder based on IUPred. Protein Sci 2017; 27:331-340. [PMID: 29076577 DOI: 10.1002/pro.3334] [Citation(s) in RCA: 119] [Impact Index Per Article: 14.9] [Received: 08/25/2017] [Revised: 10/25/2017] [Accepted: 10/25/2017] [Indexed: 12/19/2022]
Abstract
Many proteins contain intrinsically disordered regions (IDRs), functional polypeptide segments that in isolation adopt a highly flexible conformational ensemble instead of a single, well-defined structure. Disorder prediction methods, which can discriminate ordered and disordered regions from the amino acid sequence, have contributed significantly to our current understanding of the distinct properties of intrinsically disordered proteins by enabling the characterization of individual examples as well as large-scale analyses of these protein regions. One popular method, IUPred, provides a robust prediction of protein disorder based on an energy estimation approach that captures the fundamental difference between the biophysical properties of ordered and disordered regions. This paper reviews the energy estimation method underlying IUPred and the basic properties of the web server. Through an example, it also illustrates how the prediction output can be interpreted in a more complex case by taking into account the heterogeneous nature of IDRs. Various applications that benefited from IUPred to provide improved disorder predictions, complementing domain annotations and aiding the identification of functional short linear motifs are also described here. IUPred is freely available for noncommercial users through the web server (http://iupred.enzim.hu and http://iupred.elte.hu). The program can also be downloaded and installed locally for large-scale analyses.
Affiliation(s)
- Zsuzsanna Dosztányi
- MTA-ELTE Lendület Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Budapest, H-1117, Hungary

73
Necci M, Piovesan D, Dosztányi Z, Tompa P, Tosatto SCE. A comprehensive assessment of long intrinsic protein disorder from the DisProt database. Bioinformatics 2017; 34:445-452. [DOI: 10.1093/bioinformatics/btx590] [Citation(s) in RCA: 44] [Impact Index Per Article: 5.5] [Received: 05/11/2017] [Accepted: 09/15/2017] [Indexed: 12/30/2022]
Affiliation(s)
- Marco Necci
- Department of Biomedical Sciences, University of Padua, Padova, Italy
- Damiano Piovesan
- Department of Biomedical Sciences, University of Padua, Padova, Italy
- Zsuzsanna Dosztányi
- Agricultural Sciences, University of Udine, Udine, Italy
- MTA-ELTE Lendület Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Budapest, Hungary
- Peter Tompa
- Fondazione Edmund Mach, S. Michele all'Adige, Italy
- Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary
- Structural Biology Brussels, Vrije Universiteit Brussel (VUB), and Center for Structural Biology (CSB), Flanders Institute for Biotechnology (VIB), Brussels, Belgium
- Silvio C E Tosatto
- Department of Biomedical Sciences, University of Padua, Padova, Italy
- CNR Institute of Neuroscience, Padova, Italy

74
Meng F, Uversky VN, Kurgan L. Comprehensive review of methods for prediction of intrinsic disorder and its molecular functions. Cell Mol Life Sci 2017; 74:3069-3090. [PMID: 28589442 PMCID: PMC11107660 DOI: 10.1007/s00018-017-2555-4] [Citation(s) in RCA: 130] [Impact Index Per Article: 16.3] [Received: 05/18/2017] [Accepted: 06/01/2017] [Indexed: 12/19/2022]
Abstract
Computational prediction of intrinsic disorder in protein sequences dates back to the late 1970s and has flourished in the last two decades. We provide a brief historical overview, and we review over 30 recent predictors of disorder. We are the first to also cover predictors of molecular functions of disorder, including 13 methods that focus on disordered linkers and disordered protein-protein, protein-RNA, and protein-DNA binding regions. We overview their predictive models, usability, and predictive performance. We highlight the newest methods and the predictors that offer strong predictive performance as measured in recent comparative assessments. We conclude that the modern predictors are relatively accurate, enjoy widespread use, and many of them are fast. Their predictions are conveniently accessible to the end users, via web servers and databases that store pre-computed predictions for millions of proteins. However, research into methods that predict the many functions of intrinsic disorder that are not yet addressed remains an outstanding challenge.
Affiliation(s)
- Fanchi Meng
- Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Canada
- Vladimir N Uversky
- Department of Molecular Medicine, USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
- Institute for Biological Instrumentation, Russian Academy of Sciences, Pushchino, Moscow Region, Russian Federation
- Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, USA.

75
Meng F, Uversky V, Kurgan L. Computational Prediction of Intrinsic Disorder in Proteins. Curr Protoc Protein Sci 2017; 88:2.16.1-2.16.14. [DOI: 10.1002/cpps.28] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.8] [Indexed: 12/31/2022]
Affiliation(s)
- Fanchi Meng
- Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Canada
- Vladimir Uversky
- Department of Molecular Medicine and USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
- Laboratory of Structural Dynamics, Stability and Folding of Proteins, Institute of Cytology, Russian Academy of Sciences, St. Petersburg, Russia
- Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, USA

76
Hanson J, Yang Y, Paliwal K, Zhou Y. Improving protein disorder prediction by deep bidirectional long short-term memory recurrent neural networks. Bioinformatics 2017; 33:685-692. [PMID: 28011771 DOI: 10.1093/bioinformatics/btw678] [Citation(s) in RCA: 109] [Impact Index Per Article: 13.6] [Received: 06/07/2016] [Accepted: 10/26/2016] [Indexed: 11/12/2022]
Abstract
Motivation: Capturing long-range interactions between structural but not sequence neighbors of proteins is a long-standing challenging problem in bioinformatics. Recently, long short-term memory (LSTM) networks have significantly improved the accuracy of speech and image classification problems by remembering useful past information in long sequential events. Here, we have implemented deep bidirectional LSTM recurrent neural networks in the problem of protein intrinsic disorder prediction.
Results: The new method, named SPOT-Disorder, has steadily improved over a similar method using a traditional, window-based neural network (SPINE-D) in all datasets tested without separate training on short and long disordered regions. Independent tests on four other datasets, including the datasets from critical assessment of structure prediction (CASP) techniques and >10 000 annotated proteins from MobiDB, confirmed SPOT-Disorder as one of the best methods in disorder prediction. Moreover, initial studies indicate that the method is more accurate in predicting functional sites in disordered regions. These results highlight the usefulness of combining LSTM with deep bidirectional recurrent neural networks in capturing non-local, long-range interactions for bioinformatics applications.
Availability and Implementation: SPOT-Disorder is available as a web server and as a standalone program at: http://sparks-lab.org/server/SPOT-disorder/index.php.
Contact: j.hanson@griffith.edu.au or yuedong.yang@griffith.edu.au or yaoqi.zhou@griffith.edu.au.
Supplementary information: Supplementary data is available at Bioinformatics online.
Affiliation(s)
- Jack Hanson
- Signal Processing Laboratory, Griffith University, Brisbane 4122, Australia
- Yuedong Yang
- Institute for Glycomics, Griffith University, Gold Coast 4215, Australia
- Kuldip Paliwal
- Signal Processing Laboratory, Griffith University, Brisbane 4122, Australia
- Yaoqi Zhou
- Institute for Glycomics, Griffith University, Gold Coast 4215, Australia

77
Newman CE, Toxopeus J, Udaka H, Ahn S, Martynowicz DM, Graether SP, Sinclair BJ, Percival-Smith A. CRISPR-induced null alleles show that Frost protects Drosophila melanogaster reproduction after cold exposure. J Exp Biol 2017; 220:3344-3354. [DOI: 10.1242/jeb.160176] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 03/29/2017] [Accepted: 07/09/2017] [Indexed: 12/20/2022]
Abstract
The ability to survive and reproduce after cold exposure is important in all kingdoms of life. However, even in a sophisticated genetic model system like Drosophila melanogaster, few genes have been identified as functioning in cold tolerance. The accumulation of the Frost (Fst) gene transcript increases after cold exposure, making it a good candidate for a gene that has a role in cold tolerance. However, despite extensive RNAi knockdown analysis, no role in cold tolerance has been assigned to Fst. CRISPR is an effective technique for completely knocking down genes, and less likely to produce off-target effects than GAL4-UAS RNAi systems. We have used CRISPR-mediated homologous recombination to generate Fst null alleles, and these Fst alleles uncovered a requirement for FST protein in maintaining female fecundity following cold exposure. However, FST does not have a direct role in survival following cold exposure. FST mRNA accumulates in the Malpighian tubules, and the FST protein is a highly disordered protein with a putative signal peptide for export from the cell. Future work is needed to determine whether FST is exported from the Malpighian tubules and directly interacts with female reproductive tissues post-cold exposure, or if it is required for other repair/recovery functions that indirectly alter energy allocation to reproduction.
Affiliation(s)
- Claire E. Newman
- Department of Biology, University of Western Ontario, London, ON, Canada
- Jantina Toxopeus
- Department of Biology, University of Western Ontario, London, ON, Canada
- Hiroko Udaka
- Department of Biology, University of Western Ontario, London, ON, Canada
- Present Address: Department of Zoology, Kyoto University, Kyoto, Japan
- Soohyun Ahn
- Department of Biology, University of Western Ontario, London, ON, Canada
- Present Address: Melbourne Dental School, University of Melbourne, Melbourne, VIC, Australia
- David M. Martynowicz
- Department of Molecular and Cellular Biology, University of Guelph, Guelph, ON, Canada
- Steffen P. Graether
- Department of Molecular and Cellular Biology, University of Guelph, Guelph, ON, Canada
- Brent J. Sinclair
- Department of Biology, University of Western Ontario, London, ON, Canada

78
Iqbal S, Hoque MT. Estimation of Position Specific Energy as a Feature of Protein Residues from Sequence Alone for Structural Classification. PLoS One 2016; 11:e0161452. [PMID: 27588752 PMCID: PMC5010294 DOI: 10.1371/journal.pone.0161452] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 06/15/2016] [Accepted: 08/06/2016] [Indexed: 11/20/2022]
Abstract
A set of features computed from the primary amino acid sequence of proteins is crucial in the process of inducing a machine learning model that is capable of accurately predicting three-dimensional protein structures. Solutions for existing protein structure prediction problems are in need of features that can capture the complexity of molecular level interactions. With a view to this, we propose a novel approach to estimate the position specific estimated energy (PSEE) of a residue using contact energy and predicted relative solvent accessibility (RSA). Furthermore, we demonstrate that PSEE can be reasonably estimated based on sequence information alone. PSEE is useful in identifying the structured as well as the unstructured, or intrinsically disordered, regions of a protein by computing favorable and unfavorable energy, respectively, characterized by an appropriate threshold. The most intriguing finding, verified empirically, is the indication that the PSEE feature can effectively classify disordered versus ordered residues and can segregate residues of different secondary structure types by computing the constituent energies. PSEE values for each amino acid strongly correlate with the hydrophobicity value of the corresponding amino acid. Further, PSEE can be used to detect the existence of critical binding regions that essentially undergo disorder-to-order transitions to perform crucial biological functions. Towards an application of disorder prediction using the PSEE feature, we have rigorously tested and found that a support vector machine model informed by a set of features including PSEE consistently outperforms a model with an identical set of features with PSEE removed. In addition, the new disorder predictor, DisPredict2, shows competitive performance in predicting protein disorder when compared with six existing disordered protein predictors.
Affiliation(s)
- Sumaiya Iqbal
- Department of Computer Science, University of New Orleans, New Orleans, LA, United States of America
- Md Tamjidul Hoque
- Department of Computer Science, University of New Orleans, New Orleans, LA, United States of America

79
DisPredict: A Predictor of Disordered Protein Using Optimized RBF Kernel. PLoS One 2015; 10:e0141551. [PMID: 26517719 PMCID: PMC4627842 DOI: 10.1371/journal.pone.0141551] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Received: 01/04/2015] [Accepted: 10/09/2015] [Indexed: 12/02/2022]
Abstract
Intrinsically disordered proteins or regions perform important biological functions through their dynamic conformations during binding. Thus, accurate identification of these disordered regions has significant implications for proper annotation of function, induced fold prediction and drug design to combat critical diseases. We introduce DisPredict, a disorder predictor that employs a single support vector machine with RBF kernel and novel features for reliable characterization of protein structure. DisPredict yields effective performance. In addition to 10-fold cross validation, training and testing of DisPredict was conducted with independent test datasets. The results were consistent, with both the training and test error minimal. The use of multiple data sources makes the predictor generic. The datasets used in developing the model include disordered regions of various lengths, categorized as short and long, with different compositions and different types of disorder, ranging from fully to partially disordered regions, as well as completely ordered regions. Through comparison with other state of the art approaches and case studies, DisPredict is found to be a useful tool with competitive performance. DisPredict is available at https://github.com/tamjidul/DisPredict_v1.0.
|
80
|
Li J, Feng Y, Wang X, Li J, Liu W, Rong L, Bao J. An Overview of Predictors for Intrinsically Disordered Proteins over 2010-2014. Int J Mol Sci 2015; 16:23446-62. [PMID: 26426014 PMCID: PMC4632708 DOI: 10.3390/ijms161023446] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.5] [Received: 05/31/2015] [Revised: 08/25/2015] [Accepted: 08/31/2015] [Indexed: 02/05/2023] Open
Abstract
The sequence-structure-function paradigm of proteins has been changed by the occurrence of intrinsically disordered proteins (IDPs). Benefiting from their structural disorder, IDPs are of particular importance in biological processes like regulation and signaling. IDPs are associated with human diseases, including cancer, cardiovascular disease, neurodegenerative diseases, amyloidoses, and several other maladies. IDPs attract a high level of interest, and substantial effort has been made to develop experimental and computational methods. More than 70 prediction tools have been developed since 1997, of which 17 were created in the last five years. Here, we present an overview of IDP predictors developed during 2010-2014. We analyze the algorithms these tools use for IDP prediction and discuss the basic concepts of the various prediction methods. The comparison of prediction performance among these tools is discussed as well.
Affiliation(s)
- Jianzong Li
- College of Life Sciences & Key Laboratory of Ministry of Education for Bio-Resources and Bio-Environment, Sichuan University, Chengdu 610064, China.
- Yu Feng
- College of Life Sciences & Key Laboratory of Ministry of Education for Bio-Resources and Bio-Environment, Sichuan University, Chengdu 610064, China.
- Xiaoyun Wang
- College of Life Sciences & Key Laboratory of Ministry of Education for Bio-Resources and Bio-Environment, Sichuan University, Chengdu 610064, China.
- Jing Li
- College of Life Sciences & Key Laboratory of Ministry of Education for Bio-Resources and Bio-Environment, Sichuan University, Chengdu 610064, China.
- State Key Laboratory of Biotherapy/Collaborative Innovation Center for Biotherapy, West China Hospital, Sichuan University, Chengdu 610041, China.
- Wen Liu
- College of Life Sciences & Key Laboratory of Ministry of Education for Bio-Resources and Bio-Environment, Sichuan University, Chengdu 610064, China.
- Li Rong
- College of Life Sciences & Key Laboratory of Ministry of Education for Bio-Resources and Bio-Environment, Sichuan University, Chengdu 610064, China.
- Jinku Bao
- College of Life Sciences & Key Laboratory of Ministry of Education for Bio-Resources and Bio-Environment, Sichuan University, Chengdu 610064, China.
- State Key Laboratory of Biotherapy/Collaborative Innovation Center for Biotherapy, West China Hospital, Sichuan University, Chengdu 610041, China.
- State Key Laboratory of Oral Diseases, West China College of Stomatology, Sichuan University, Chengdu 610041, China.
|
81
|
Singh GP. Association between intrinsic disorder and serine/threonine phosphorylation in Mycobacterium tuberculosis. PeerJ 2015; 3:e724. [PMID: 25648268 PMCID: PMC4304846 DOI: 10.7717/peerj.724] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 10/29/2014] [Accepted: 12/21/2014] [Indexed: 01/28/2023] Open
Abstract
Serine/threonine phosphorylation is an important mechanism that is involved in the regulation of protein function. In eukaryotes, phosphorylation occurs predominantly in intrinsically disordered regions of proteins. Though serine/threonine phosphorylation and protein disorder are much less prevalent in prokaryotes, some bacteria have high levels of serine/threonine phosphorylation and disorder, including the medically important M. tuberculosis. Here I show that serine/threonine phosphorylation sites in M. tuberculosis are highly enriched in intrinsically disordered regions, indicating similarity in the substrate recognition mechanisms of eukaryotic and M. tuberculosis kinases. Serine/threonine phosphorylation has been linked to the pathogenicity and survival of M. tuberculosis. Thus, a better understanding of how its kinases recognize their substrates could have important implications in understanding and controlling the biology of this deadly pathogen. These results also indicate that the association between serine/threonine phosphorylation and disorder is not a feature restricted to eukaryotes.
Affiliation(s)
- Gajinder Pal Singh
- School of Biotechnology, KIIT University, Patia, Bhubaneswar, Odisha, India
|
82
|
Kershner AM, Shin H, Hansen TJ, Kimble J. Discovery of two GLP-1/Notch target genes that account for the role of GLP-1/Notch signaling in stem cell maintenance. Proc Natl Acad Sci U S A 2014; 111:3739-44. [PMID: 24567412 PMCID: PMC3956202 DOI: 10.1073/pnas.1401861111] [Citation(s) in RCA: 77] [Impact Index Per Article: 7.0] [Indexed: 11/18/2022] Open
Abstract
A stem cell's immediate microenvironment creates an essential "niche" to maintain stem cell self-renewal. Many niches and their intercellular signaling pathways are known, but for the most part, the key downstream targets of niche signaling remain elusive. Here, we report the discovery of two GLP-1/Notch target genes, lst-1 (lateral signaling target) and sygl-1 (synthetic Glp), that function redundantly to maintain germ-line stem cells (GSCs) in the nematode Caenorhabditis elegans. Whereas lst-1 and sygl-1 single mutants appear normal, lst-1 sygl-1 double mutants are phenotypically indistinguishable from glp-1/Notch mutants. Multiple lines of evidence demonstrate that GLP-1/Notch signaling activates lst-1 and sygl-1 expression in GSCs within the niche. Therefore, these two genes fully account for the role of GLP-1/Notch signaling in GSC maintenance. Importantly, lst-1 and sygl-1 are not required for GLP-1/Notch signaling per se. We conclude that lst-1 and sygl-1 forge a critical link between Notch signaling and GSC maintenance.
Affiliation(s)
- Heaji Shin
- Department of Biochemistry, University of Wisconsin, Madison, WI 53706
- Tyler J. Hansen
- Department of Biochemistry, University of Wisconsin, Madison, WI 53706
- Judith Kimble
- Howard Hughes Medical Institute and
- Department of Biochemistry, University of Wisconsin, Madison, WI 53706
|
83
|
Mizianty MJ, Uversky V, Kurgan L. Prediction of intrinsic disorder in proteins using MFDp2. Methods Mol Biol 2014; 1137:147-62. [PMID: 24573480 DOI: 10.1007/978-1-4939-0366-5_11] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.8] [Indexed: 12/12/2022]
Abstract
Intrinsically disordered proteins (IDPs) are either entirely disordered or contain disordered regions in their native state. IDPs were found to be abundant across all kingdoms of life, particularly in eukaryotes, and are implicated in numerous cellular processes. Experimental annotation of disorder lags behind the rapidly growing sizes of the protein databases, and thus computational methods are used to close this gap and to investigate the disorder. MFDp2 is a novel webserver for accurate sequence-based prediction of protein disorder which also outputs well-described sequence-derived information that allows profiling the predicted disorder. We conveniently visualize sequence conservation, predicted secondary structure, relative solvent accessibility, and alignments to chains with annotated disorder. The webserver allows predictions for multiple proteins at the same time, includes help pages and a tutorial, and the results can be downloaded as a text-based (parsable) file. MFDp2 is freely available at http://biomine.ece.ualberta.ca/MFDp2/.
Affiliation(s)
- Marcin J Mizianty
- Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Canada
|