1
Datta RR, Akdogan D, Tezcan EB, Onal P. Versatile roles of disordered transcription factor effector domains in transcriptional regulation. FEBS J 2025. [PMID: 39888268 DOI: 10.1111/febs.17424] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2024] [Revised: 11/25/2024] [Accepted: 01/21/2025] [Indexed: 02/01/2025]
Abstract
Transcription, a crucial step in the regulation of gene expression, is tightly controlled and involves several essential processes, such as chromatin organization, recognition of the specific genomic sequences, DNA binding, and ultimately recruiting the transcriptional machinery to facilitate transcript synthesis. At the center of this regulation are transcription factors (TFs), which comprise at least one DNA-binding domain (DBD) and an effector domain (ED). Although the structure and function of DBDs have been well studied, our knowledge of the structure and function of effector domains is limited. EDs are of particular importance in generating distinct transcriptional responses between protein members of the same TF family that have similar DBDs and specificities. The study of transcriptional activity conferred by effector domains has traditionally been conducted through examining protein-protein interactions. However, recent research has uncovered alternative mechanisms by which EDs regulate gene expression, such as the formation of condensates that increase the local concentration of transcription factors, cofactors, and coregulated genes, as well as DNA binding. Here, we provide a comprehensive overview of the known roles of transcription factor EDs, with a specific focus on disordered regions. Additionally, we emphasize the significance of intrinsically disordered regions (IDRs) during transcriptional regulation. We examine the mechanisms underlying the establishment and maintenance of transcriptional specificity through the structural properties of predominantly disordered EDs. We then provide a comprehensive overview of the current understanding of these domains, including their physical and chemical characteristics, as well as their functional roles.
Affiliation(s)
- Dilan Akdogan
  - Molecular Biology and Genetics Department, Ihsan Dogramaci Bilkent University, Ankara, Turkey
- Elif B Tezcan
  - Molecular Biology and Genetics Department, Ihsan Dogramaci Bilkent University, Ankara, Turkey
- Pinar Onal
  - Molecular Biology and Genetics Department, Ihsan Dogramaci Bilkent University, Ankara, Turkey
2
Erdős G, Dosztányi Z. Deep learning for intrinsically disordered proteins: From improved predictions to deciphering conformational ensembles. Curr Opin Struct Biol 2024; 89:102950. [PMID: 39522439 DOI: 10.1016/j.sbi.2024.102950] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2024] [Revised: 09/19/2024] [Accepted: 10/16/2024] [Indexed: 11/16/2024]
Abstract
Intrinsically disordered proteins (IDPs) lack a stable three-dimensional structure under physiological conditions, challenging traditional structure-based prediction methods. This review explores how modern deep learning approaches, which have revolutionized structure prediction for globular proteins, have impacted protein disorder predictions. We highlight the role of community-driven efforts in curating data and assessing the state of the art, which have been crucial in advancing the field. We then review state-of-the-art methods that use deep learning techniques, highlighting innovative approaches, and address advancements in characterizing protein conformational ensembles directly from sequence data using novel machine learning methods.
Affiliation(s)
- Gábor Erdős
  - Department of Biochemistry, Eötvös Loránd University, Pázmány Péter stny 1/c, Budapest H-1117, Hungary
- Zsuzsanna Dosztányi
  - Department of Biochemistry, Eötvös Loránd University, Pázmány Péter stny 1/c, Budapest H-1117, Hungary
3
Dziadek ŁJ, Sieradzan AK, Czaplewski C, Zalewski M, Banaś F, Toczek M, Nisterenko W, Grudinin S, Liwo A, Giełdoń A. Assessment of Four Theoretical Approaches to Predict Protein Flexibility in the Crystal Phase and Solution. J Chem Theory Comput 2024; 20:7667-7681. [PMID: 39171852 PMCID: PMC11391579 DOI: 10.1021/acs.jctc.4c00754] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/23/2024]
Abstract
In this paper, we evaluated the ability of four coarse-grained methods to predict flexible protein regions of potential biological importance: UNRES-flex and UNRES-DSSP-flex (based on the united residue model of polypeptide chains without and with secondary structure restraints, respectively), CABS-flex (based on the C-α, C-β, and side chain model), and nonlinear rigid block normal mode analysis (NOLB). The test set comprised 100 protein structures determined by NMR spectroscopy or X-ray crystallography and covering all secondary structure types. End regions with high fluctuations were excluded from analysis. The Pearson and Spearman correlation coefficients were used to quantify the conformity between the calculated and experimental fluctuation profiles, the latter determined from NMR ensembles for NMR structures and from B-factors for X-ray structures. For X-ray structures (corresponding to proteins in a crowded environment), NOLB gave the best agreement between the predicted and experimental fluctuation profiles, while for NMR structures (corresponding to proteins in solution), the performance ranking was CABS-flex > UNRES-DSSP-flex > UNRES-flex > NOLB; however, CABS-flex sometimes exaggerated the extent of small fluctuations, unlike UNRES-DSSP-flex.
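As an illustration of the comparison described in this abstract, the sketch below computes Pearson and Spearman correlation coefficients between a predicted and an experimental per-residue fluctuation profile. It is a minimal, standard-library sketch, not the authors' pipeline; the function names and the example profiles are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))


# Hypothetical per-residue fluctuation profiles (arbitrary units).
predicted = [0.5, 0.7, 1.9, 2.4, 1.1]
experimental = [0.4, 0.9, 1.5, 3.0, 1.0]
print(pearson(predicted, experimental), spearman(predicted, experimental))
```

Pearson measures linear agreement in fluctuation magnitude, while Spearman only checks that residues are ordered the same way, which is why reporting both can separate "right shape" from "right scale".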
Affiliation(s)
- Ł J Dziadek
  - Faculty of Chemistry, University of Gdansk, ul. Wita-Stwosza 63, 80-308 Gdańsk, Poland
- A K Sieradzan
  - Faculty of Chemistry, University of Gdansk, ul. Wita-Stwosza 63, 80-308 Gdańsk, Poland
- C Czaplewski
  - Faculty of Chemistry, University of Gdansk, ul. Wita-Stwosza 63, 80-308 Gdańsk, Poland
  - School of Computational Sciences, Korea Institute for Advanced Study, 85 Hoegiro, Dongdaemun-gu, Seoul 02455, Republic of Korea
- M Zalewski
  - Faculty of Chemistry, University of Gdansk, ul. Wita-Stwosza 63, 80-308 Gdańsk, Poland
- F Banaś
  - Faculty of Chemistry, University of Gdansk, ul. Wita-Stwosza 63, 80-308 Gdańsk, Poland
- M Toczek
  - Faculty of Chemistry, University of Gdansk, ul. Wita-Stwosza 63, 80-308 Gdańsk, Poland
- W Nisterenko
  - Faculty of Chemistry, University of Gdansk, ul. Wita-Stwosza 63, 80-308 Gdańsk, Poland
- S Grudinin
  - LJK, University Grenoble Alpes, CNRS, Grenoble INP, F-38000 Grenoble, France
- A Liwo
  - Faculty of Chemistry, University of Gdansk, ul. Wita-Stwosza 63, 80-308 Gdańsk, Poland
- A Giełdoń
  - Faculty of Chemistry, University of Gdansk, ul. Wita-Stwosza 63, 80-308 Gdańsk, Poland
4
Kurgan L, Hu G, Wang K, Ghadermarzi S, Zhao B, Malhis N, Erdős G, Gsponer J, Uversky VN, Dosztányi Z. Tutorial: a guide for the selection of fast and accurate computational tools for the prediction of intrinsic disorder in proteins. Nat Protoc 2023; 18:3157-3172. [PMID: 37740110 DOI: 10.1038/s41596-023-00876-x] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 12/13/2022] [Accepted: 06/21/2023] [Indexed: 09/24/2023]
Abstract
Intrinsic disorder is instrumental for a wide range of protein functions, and its analysis, using computational predictions from primary structures, complements secondary and tertiary structure-based approaches. In this Tutorial, we provide an overview and comparison of 23 publicly available computational tools with complementary parameters useful for intrinsic disorder prediction, partly relying on results from the Critical Assessment of protein Intrinsic Disorder prediction experiment. We consider factors such as accuracy, runtime, availability and the need for functional insights. The selected tools are available as web servers and downloadable programs, offer state-of-the-art predictions and can be used in a high-throughput manner. We provide examples and instructions for the selected tools to illustrate practical aspects related to the submission, collection and interpretation of predictions, as well as the timing and their limitations. We highlight two predictors for intrinsically disordered proteins, flDPnn as accurate and fast and IUPred as very fast and moderately accurate, while suggesting ANCHOR2 and MoRFchibi as two of the best-performing predictors for intrinsically disordered region binding. We link these tools to additional resources, including databases of predictions and web servers that integrate multiple predictive methods. Altogether, this Tutorial provides a hands-on guide to comparatively evaluating multiple predictors, submitting and collecting their own predictions, and reading and interpreting results. It is suitable for experimentalists and computational biologists interested in accurately and conveniently identifying intrinsic disorder, facilitating the functional characterization of the rapidly growing collections of protein sequences.
Affiliation(s)
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
- Gang Hu
  - School of Statistics and Data Science, LPMC and KLMDASR, Nankai University, Tianjin, China
- Kui Wang
  - School of Statistics and Data Science, LPMC and KLMDASR, Nankai University, Tianjin, China
- Sina Ghadermarzi
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
- Bi Zhao
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
- Nawar Malhis
  - Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
- Gábor Erdős
  - MTA-ELTE Momentum Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Budapest, Hungary
- Jörg Gsponer
  - Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
- Vladimir N Uversky
  - Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
  - Byrd Alzheimer's Center and Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
- Zsuzsanna Dosztányi
  - MTA-ELTE Momentum Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Budapest, Hungary
5
Computational prediction of disordered binding regions. Comput Struct Biotechnol J 2023; 21:1487-1497. [PMID: 36851914 PMCID: PMC9957716 DOI: 10.1016/j.csbj.2023.02.018] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 10/26/2022] [Revised: 02/08/2023] [Accepted: 02/08/2023] [Indexed: 02/12/2023]
Abstract
One of the key features of intrinsically disordered regions (IDRs) is their ability to interact with a broad range of partner molecules. Multiple types of interacting IDRs have been identified, including molecular recognition fragments (MoRFs), short linear sequence motifs (SLiMs), and protein-, nucleic acid- and lipid-binding regions. Prediction of binding IDRs in protein sequences has gained momentum in recent years. We survey 38 predictors of binding IDRs that target interactions with a diverse set of partners, such as peptides, proteins, RNA, DNA and lipids. We offer a historical perspective and highlight key events that fueled efforts to develop these methods. These tools rely on a diverse range of predictive architectures that include scoring functions, regular expressions, traditional and deep machine learning and meta-models. Recent efforts focus on the development of deep neural network-based architectures and on extending coverage to RNA-, DNA- and lipid-binding IDRs. We analyze the availability of these methods and show that providing implementations and webservers results in much higher rates of citation and use. We also make several recommendations: to take advantage of modern deep network architectures, to develop tools that bundle predictions of multiple and different types of binding IDRs, and to work on algorithms that model structures of the resulting complexes.
6
Han B, Ren C, Wang W, Li J, Gong X. Computational Prediction of Protein Intrinsically Disordered Region Related Interactions and Functions. Genes (Basel) 2023; 14:432. [PMID: 36833360 PMCID: PMC9956190 DOI: 10.3390/genes14020432] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/31/2022] [Revised: 02/02/2023] [Accepted: 02/05/2023] [Indexed: 02/11/2023]
Abstract
Intrinsically disordered proteins (IDPs) and regions (IDRs) are widespread. Although they lack well-defined structures, they participate in many important biological processes. They are also widely associated with human diseases and have become potential targets in drug discovery. However, there is a large gap between the number of experimentally annotated IDPs/IDRs and their actual number. In recent decades, computational methods for IDPs/IDRs have been developed actively, covering several distinct tasks: predicting IDPs/IDRs themselves, their binding modes, their binding sites, and their molecular functions. In view of the correlation between these predictors, we review these prediction methods in a unified manner for the first time, summarize their computational approaches and predictive performance, and discuss some open problems and perspectives.
Affiliation(s)
- Bingqing Han
  - Mathematical Intelligence Application Lab, Institute for Mathematical Sciences, Renmin University of China, Beijing 100872, China
- Chongjiao Ren
  - Mathematical Intelligence Application Lab, Institute for Mathematical Sciences, Renmin University of China, Beijing 100872, China
- Wenda Wang
  - Mathematical Intelligence Application Lab, Institute for Mathematical Sciences, Renmin University of China, Beijing 100872, China
- Jiashan Li
  - Mathematical Intelligence Application Lab, Institute for Mathematical Sciences, Renmin University of China, Beijing 100872, China
- Xinqi Gong
  - Mathematical Intelligence Application Lab, Institute for Mathematical Sciences, Renmin University of China, Beijing 100872, China
  - Beijing Academy of Intelligence, Beijing 100083, China
7
Chen L, Zhao B, Palomo A, Sun Y, Cheng Z, Zhang M, Xia Y. Micron-scale biogeography reveals conservative intra anammox bacteria spatial co-associations. Water Research 2022; 220:118640. [PMID: 35661503 DOI: 10.1016/j.watres.2022.118640] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 03/24/2022] [Revised: 05/17/2022] [Accepted: 05/18/2022] [Indexed: 06/15/2023]
Abstract
Micron-scale resolution can help to reliably identify true taxon-taxon interactions in complex microbial communities. Despite widespread recognition of the critical role of metabolic interactions in anaerobic ammonium oxidation (anammox) system performance, no studies have examined microbial interactions at the micron scale in anammox consortia. To fill this gap, we extensively sampled (242 samples in total) the consortia of a lab-scale anammox reactor at different length scales, including bulk-scale (∼cm), macro-scale (300-500 µm) and micron-scale (70-100 µm). We first observed evident micron-scale heterogeneity in anammox consortia, with the relative abundance of anammox bacteria fluctuating greatly across individual clusters (2.0%-79.3%), indicating that biotic interactions play a significant role in the assembly of anammox communities under well-controlled and well-mixed conditions. Importantly, by mapping the spatial associations in anammox consortia at the micron scale, we demonstrated that the conserved co-associations for anammox bacteria were restricted to three different Brocadia species over time, and that their co-associations with heterotrophs were random, implying that there was no statistically significant symbiotic interaction between anammox bacteria and other heterotrophic populations. Further metagenomic binning revealed that quorum sensing via the secondary messenger c-di-GMP potentially sustains the conserved metabolic cooperation among Brocadia species. These results shed new light on the social behavior of the anammox community. Overall, delineating biological structures at the micron scale opens a new way of monitoring microbial spatial structure and interactions, paving the way for improved community engineering of biotreatment systems.
Affiliation(s)
- Liming Chen
  - School of Environmental Science and Engineering, College of Engineering, Southern University of Science and Technology, Shenzhen 518055, China
- Bixi Zhao
  - School of Environmental Science and Engineering, College of Engineering, Southern University of Science and Technology, Shenzhen 518055, China
- Alejandro Palomo
  - School of Environmental Science and Engineering, College of Engineering, Southern University of Science and Technology, Shenzhen 518055, China
- Yuhong Sun
  - School of Environmental Science and Engineering, College of Engineering, Southern University of Science and Technology, Shenzhen 518055, China
- Zhanwen Cheng
  - School of Environmental Science and Engineering, College of Engineering, Southern University of Science and Technology, Shenzhen 518055, China
- Miao Zhang
  - School of Environmental Science and Engineering, College of Engineering, Southern University of Science and Technology, Shenzhen 518055, China
- Yu Xia
  - School of Environmental Science and Engineering, College of Engineering, Southern University of Science and Technology, Shenzhen 518055, China
  - State Environmental Protection Key Laboratory of Integrated Surface Water-Groundwater Pollution Control, School of Environmental Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, China
  - Guangdong Provincial Key Laboratory of Soil and Groundwater Pollution Control, School of Environmental Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, China
8
Compositional Bias of Intrinsically Disordered Proteins and Regions and Their Predictions. Biomolecules 2022; 12:888. [PMID: 35883444 PMCID: PMC9313023 DOI: 10.3390/biom12070888] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 05/27/2022] [Revised: 06/10/2022] [Accepted: 06/10/2022] [Indexed: 11/17/2022]
Abstract
Intrinsically disordered regions (IDRs) carry out many cellular functions and vary in length and placement in protein sequences. This diversity leads to variations in the underlying compositional biases, which have been demonstrated for short vs. long IDRs. We analyze compositional biases across four classes of disorder: fully disordered proteins; short IDRs; long IDRs; and binding IDRs. We identify three distinct biases: one for the fully disordered proteins, one for the short IDRs, and one for the long and binding IDRs combined. We also investigate compositional bias for putative disorder produced by leading disorder predictors and find that it is similar to the bias of the native disorder. Interestingly, the accuracy of disorder predictions across different methods is correlated with the correctness of the compositional bias of their predictions, highlighting the importance of the compositional bias. The predictive quality is relatively low for the disorder classes whose compositional bias differs most from the “generic” disorder bias, and much higher for the classes with the most similar bias. We discover that different predictors perform best across different classes of disorder. This suggests that no single predictor is universally best and motivates the development of new architectures that combine models that target specific disorder classes.
9
Avramov M, Schád É, Révész Á, Turiák L, Uzelac I, Tantos Á, Drahos L, Popović ŽD. Identification of Intrinsically Disordered Proteins and Regions in a Non-Model Insect Species Ostrinia nubilalis (Hbn.). Biomolecules 2022; 12:592. [PMID: 35454181 PMCID: PMC9029825 DOI: 10.3390/biom12040592] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/07/2022] [Revised: 04/06/2022] [Accepted: 04/11/2022] [Indexed: 12/29/2022]
Abstract
Research in previous decades has shown that intrinsically disordered proteins (IDPs) and regions in proteins (IDRs) are as ubiquitous as highly ordered proteins. Despite this, research on IDPs and IDRs still has many gaps left to fill. Here, we present an approach that combines wet lab methods with bioinformatics tools to identify and analyze intrinsically disordered proteins in a cold-hardy non-model insect species. Because IDPs are known to be resilient to the effects of extreme temperatures, these proteins likely play important roles in this insect's adaptive mechanisms to sub-zero temperatures. The approach involves IDP enrichment by sample heating and double-digestion of proteins, followed by peptide and protein identification. Next, proteins are bioinformatically analyzed for disorder content, presence of long disordered regions, amino acid composition, and the processes they are involved in. Finally, IDP detection is validated with an in-house 2D PAGE. In total, 608 unique proteins were identified, with 39 being mostly disordered, 100 partially disordered, 95 nearly ordered, and 374 ordered. One-third contain at least one long disordered segment. Functional information was available for only 90 proteins with intrinsic disorder out of 312 characterized proteins. Around half of the 90 proteins are cytoskeletal elements or involved in translational processes.
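The "disorder content" and "long disordered region" measures mentioned in this abstract can be sketched as follows. This is an illustrative sketch only: the 0.5 score cutoff and the 30-residue minimum for a long region are common conventions assumed here, not values stated in the abstract, and the function name and example scores are hypothetical.

```python
def disorder_summary(scores, cutoff=0.5, min_long=30):
    """Summarize per-residue disorder scores for one protein.

    Returns (disorder_content, longest_run, has_long_region), where
    disorder_content is the fraction of residues scoring >= cutoff and
    longest_run is the longest stretch of consecutive such residues.
    The cutoff and min_long defaults are assumptions for illustration.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    flags = [s >= cutoff for s in scores]
    content = sum(flags) / len(flags)
    longest = run = 0
    for is_disordered in flags:
        # Grow the current run on a disordered residue, otherwise reset it.
        run = run + 1 if is_disordered else 0
        longest = max(longest, run)
    return content, longest, longest >= min_long


# Hypothetical 100-residue protein with a 35-residue disordered N-terminus.
example_scores = [0.9] * 35 + [0.1] * 65
print(disorder_summary(example_scores))
```

Binning proteins into categories such as "mostly disordered" or "nearly ordered" would then be a matter of applying thresholds to the returned disorder content; the abstract does not give those thresholds, so none are assumed here.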
Affiliation(s)
- Miloš Avramov
  - Department of Biology and Ecology, Faculty of Sciences, University of Novi Sad, 21000 Novi Sad, Serbia
- Éva Schád
  - Institute of Enzymology, Research Centre for Natural Sciences, 1117 Budapest, Hungary
- Ágnes Révész
  - Institute of Organic Chemistry, Research Centre for Natural Sciences, 1117 Budapest, Hungary
- Lilla Turiák
  - Institute of Organic Chemistry, Research Centre for Natural Sciences, 1117 Budapest, Hungary
- Iva Uzelac
  - Department of Biology and Ecology, Faculty of Sciences, University of Novi Sad, 21000 Novi Sad, Serbia
- Ágnes Tantos
  - Institute of Enzymology, Research Centre for Natural Sciences, 1117 Budapest, Hungary
- László Drahos
  - Institute of Organic Chemistry, Research Centre for Natural Sciences, 1117 Budapest, Hungary
- Željko D. Popović
  - Department of Biology and Ecology, Faculty of Sciences, University of Novi Sad, 21000 Novi Sad, Serbia
10
Zhao B, Kurgan L. Deep learning in prediction of intrinsic disorder in proteins. Comput Struct Biotechnol J 2022; 20:1286-1294. [PMID: 35356546 PMCID: PMC8927795 DOI: 10.1016/j.csbj.2022.03.003] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 01/10/2022] [Revised: 03/04/2022] [Accepted: 03/04/2022] [Indexed: 12/12/2022]
Abstract
Intrinsic disorder prediction is an active area that has produced over 100 predictors. We identify and investigate a recent trend towards the development of deep neural network (DNN)-based methods. The first DNN-based method was released in 2013, and since 2019 deep learners have accounted for the majority of new disorder predictors. We find that the 13 currently available DNN-based predictors are diverse in their topologies, the sizes of their networks and the inputs that they utilize. We empirically show that the deep learners are statistically more accurate than other types of disorder predictors using the blind test dataset from the recent community assessment of intrinsic disorder predictions (CAID). We also identify several well-rounded DNN-based predictors that are accurate, fast and/or conveniently available. The popularity, favorable predictive performance and architectural flexibility suggest that deep networks are likely to fuel the development of future disorder predictors. Novel hybrid designs of deep networks could be used to adequately accommodate the diversity of types and flavors of intrinsic disorder. We also discuss the scarcity of DNN-based methods for the prediction of disordered binding regions and the need to develop more accurate methods for this prediction.
Affiliation(s)
- Bi Zhao
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
11
Abstract
INTRODUCTION: The intrinsic disorder prediction field develops, assesses, and deploys computational predictors of disorder in protein sequences, and constructs and disseminates databases of these predictions. Over 40 years of research has resulted in the release of numerous resources. AREAS COVERED: We identify and briefly summarize the most comprehensive collection to date of over 100 disorder predictors. We focus on their predictive models, availability and predictive performance. We categorize and study them from a historical point of view to highlight informative trends. EXPERT OPINION: We find a consistent trend of improvements in predictive quality as newer and more advanced predictors are developed. The original focus on machine learning methods shifted to meta-predictors in the early 2010s, followed by a recent transition to deep learning. The use of deep learners will continue in the foreseeable future given the recent and convincing success of these methods. Moreover, a broad range of resources that facilitate convenient collection of accurate disorder predictions is available to users. They include web servers and standalone programs for disorder prediction, servers that combine prediction of disorder and disorder functions, and large databases of pre-computed predictions. We also point to the need to address the shortage of accurate methods that predict disordered binding regions.
Affiliation(s)
- Bi Zhao
  - Department of Computer Science, Virginia Commonwealth University, Richmond, Virginia, USA
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, Richmond, Virginia, USA
12
Nayak C, Singh SK. In silico identification of natural product inhibitors against Octamer-binding transcription factor 4 (Oct4) to impede the mechanism of glioma stem cells. PLoS One 2021; 16:e0255803. [PMID: 34613998 PMCID: PMC8494328 DOI: 10.1371/journal.pone.0255803] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 01/04/2021] [Accepted: 07/23/2021] [Indexed: 02/07/2023]
Abstract
Octamer-binding transcription factor 4 (Oct4) is a core regulator in the retention of stemness, invasive, and self-renewal properties in glioma-initiating cells (GSCs), and its overexpression inhibits the differentiation of glioma cells, promoting tumor cell proliferation. The Pit-Oct-Unc (POU) domain, comprising the POU-specific domain (POUS) and POU-type homeodomain (POUHD) subdomains, is the most critical part of Oct4 for the generation of induced pluripotent stem cells from somatic cells that lead to tumor initiation, invasion, posttreatment relapse, and therapeutic resistance. Therefore, the present investigation searches for natural product inhibitors (NPIs) against the POUHD domain of Oct4 by employing receptor-based virtual screening (RBVS) followed by binding free energy calculation and molecular dynamics simulation (MDS). RBVS provided 13 compounds with acceptable ranges of pharmacokinetic properties and good docking scores that have key interactions with the POUHD domain. More specifically, conformational and interaction stability analysis of the 13 compounds through MDS unveiled two compounds, ZINC02145000 and ZINC32124203, which stabilized the protein backbone even in the presence of the linker and the POUS domain. Additionally, ZINC02145000 and ZINC32124203 exhibited stable and strong interactions with key residues W277, R242, and R234 of the POUHD domain even in dynamic conditions. Interestingly, ZINC02145000 and ZINC32124203 established communication not only with the POUHD domain but also with the POUS domain, indicating their potency toward thwarting the function of Oct4. ZINC02145000 and ZINC32124203 also reduced the flexibility and increased the correlations between the amino acid residues of Oct4, as evidenced by PCA and DCCM analysis. Finally, our examination proposed two NPIs that can impede Oct4 function and may help to improve overall survival, diminish tumor relapse, and achieve a cure with minimal side effects, not only in the deadly disease GBM but also in other cancers.
Affiliation(s)
- Chirasmita Nayak
  - Computer-Aided Drug Design and Molecular Modeling Lab, Department of Bioinformatics, Alagappa University, Karaikudi, Tamil Nadu, India
- Sanjeev Kumar Singh
  - Computer-Aided Drug Design and Molecular Modeling Lab, Department of Bioinformatics, Alagappa University, Karaikudi, Tamil Nadu, India
13
Katuwawala A, Ghadermarzi S, Hu G, Wu Z, Kurgan L. QUARTERplus: Accurate disorder predictions integrated with interpretable residue-level quality assessment scores. Comput Struct Biotechnol J 2021; 19:2597-2606. [PMID: 34025946 PMCID: PMC8122155 DOI: 10.1016/j.csbj.2021.04.066] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 03/09/2021] [Revised: 04/24/2021] [Accepted: 04/24/2021] [Indexed: 12/13/2022]
Abstract
A recent advance in the disorder prediction field is the development of quality assessment (QA) scores. QA scores complement the propensities produced by the disorder predictors by identifying regions where these predictions are more likely to be correct. We develop, empirically test and release a new QA tool, QUARTERplus, that addresses several key drawbacks of the current QA method, QUARTER. QUARTERplus is the first solution that utilizes QA scores and the associated input disorder predictions to produce very accurate disorder predictions with the help of a modern deep learning meta-model. The deep neural network utilizes the QA scores to identify and fix the regions where the original/input disorder predictions are poor. More importantly, the accurate QUARTERplus predictions are accompanied by easy-to-interpret residue-level QA scores that reliably quantify their residue-level predictive quality. We provide these interpretable QA scores for QUARTERplus and 10 other popular disorder predictors. Empirical tests on a large and independent (low similarity) test dataset show that QUARTERplus predictions secure AUC = 0.93 and are statistically more accurate than the results of twelve state-of-the-art disorder predictors. We also demonstrate that the new QA scores produced by QUARTERplus are highly correlated with the actual predictive quality and that they can be effectively used to identify regions of correct disorder predictions. This feature empowers users to easily identify which parts of the predictions generated by modern disorder predictors are more trustworthy. QUARTERplus is available as a convenient webserver at http://biomine.cs.vcu.edu/servers/QUARTERplus/.
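For readers unfamiliar with the AUC figure quoted above, the following sketch shows one standard way to compute the area under the ROC curve from per-residue labels and propensity scores, using the pairwise (Mann-Whitney) formulation. It is a minimal illustration with hypothetical data and a hypothetical function name, not the evaluation code used for QUARTERplus.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for disordered residues, 0 for ordered residues.
    scores: predicted disorder propensities, higher meaning more disordered.
    The AUC equals the probability that a randomly chosen disordered residue
    scores higher than a randomly chosen ordered one (ties count as half).
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one residue of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labels and propensities for six residues.
example_labels = [1, 1, 0, 0, 1, 0]
example_scores = [0.9, 0.8, 0.3, 0.6, 0.5, 0.1]
print(roc_auc(example_labels, example_scores))
```

The pairwise form is quadratic in the number of residues, so it suits small illustrations; a rank-based computation gives the same value more efficiently on large test sets.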
Affiliation(s)
- Akila Katuwawala
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
| | - Sina Ghadermarzi
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
| | - Gang Hu
- School of Statistics and Data Science, LPMC and KLMDASR, Nankai University, Tianjin 300071, China
| | - Zhonghua Wu
- School of Mathematical Sciences and LPMC, Nankai University, Tianjin 300071, China
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
| |
|
14
|
Lyngdoh DL, Nag N, Uversky VN, Tripathi T. Prevalence and functionality of intrinsic disorder in human FG-nucleoporins. Int J Biol Macromol 2021; 175:156-170. [PMID: 33548309 DOI: 10.1016/j.ijbiomac.2021.01.218] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 11/18/2020] [Revised: 01/19/2021] [Accepted: 01/31/2021] [Indexed: 11/27/2022]
Abstract
The nuclear-cytoplasmic transport of biomolecules is assisted by the nuclear pores composed of evolutionarily conserved proteins termed nucleoporins (Nups). The central Nups, characterized by multiple FG-repeats, are highly dynamic and contain a high level of intrinsically disordered protein regions (IDPRs). FG-Nups bind several protein partners and play critical roles in molecular interactions and the regulation of cellular functions through their IDPRs. In the present study, we performed a multiparametric bioinformatics analysis to characterize the prevalence and functionality of IDPRs in human FG-Nups. These analyses revealed that the sequences of all FG-Nups except Nup54 and Nup358 contained >50% IDPRs. Nup98, Nup153, and POM121 were extremely disordered with ~80% IDPRs. The functional disorder-based binding regions in the FG-Nups were identified. The phase separation behavior of FG-Nups indicated that all FG-Nups have the potential to undergo liquid-liquid phase separation that could stabilize their liquid state. The inherent structural flexibility in FG-Nups is mechanistically and functionally advantageous. Since certain FG-Nups interact with disease-relevant protein aggregates, their complexes can be exploited for drug design. Furthermore, consideration of the FG-Nups from the intrinsic disorder perspective provides critical information that can guide future experimental studies to uncover novel pathways associated with diseases linked with protein misfolding and aggregation.
Affiliation(s)
- Denzelle Lee Lyngdoh
- Molecular and Structural Biophysics Laboratory, Department of Biochemistry, North-Eastern Hill University, Shillong 793022, India
| | - Niharika Nag
- Molecular and Structural Biophysics Laboratory, Department of Biochemistry, North-Eastern Hill University, Shillong 793022, India
| | - Vladimir N Uversky
- Department of Molecular Medicine and Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL 33620, United States
| | - Timir Tripathi
- Molecular and Structural Biophysics Laboratory, Department of Biochemistry, North-Eastern Hill University, Shillong 793022, India.
| |
|
15
|
Pancsa R, Vranken W, Mészáros B. Computational resources for identifying and describing proteins driving liquid-liquid phase separation. Brief Bioinform 2021; 22:6124912. [PMID: 33517364 PMCID: PMC8425267 DOI: 10.1093/bib/bbaa408] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Received: 08/26/2020] [Revised: 11/23/2020] [Accepted: 12/12/2020] [Indexed: 01/06/2023]
Abstract
One of the most intriguing fields emerging in current molecular biology is the study of membraneless organelles formed via liquid–liquid phase separation (LLPS). These organelles perform crucial functions in cell regulation and signalling, and recent years have also brought about the understanding of the molecular mechanism of their formation. The LLPS field is continuously developing and optimizing dedicated in vitro and in vivo methods to identify and characterize these non-stoichiometric molecular condensates and the proteins able to drive or contribute to LLPS. Building on these observations, several computational tools and resources have emerged in parallel to serve as platforms for the collection, annotation and prediction of membraneless organelle-linked proteins. In this survey, we showcase recent advancements in LLPS bioinformatics, focusing on (i) available databases and ontologies that are necessary to describe the studied phenomena and the experimental results in an unambiguous way and (ii) prediction methods to assess the potential LLPS involvement of proteins. Through hands-on application of these resources on example proteins and representative datasets, we give a practical guide to show how they can be used in conjunction to provide in silico information on LLPS.
Affiliation(s)
- Rita Pancsa
- Enzymology Institute of the Research Centre for Natural Sciences, Budapest, Hungary
| | - Wim Vranken
- Computer Science, chemistry and biomedical sciences at the Vrije Universiteit Brussel
| | - Bálint Mészáros
- Structural and Computational Biology Unit at the European Molecular Biology Laboratory, Heidelberg 69117, Germany
| |
|
16
|
Orosz F. On the TPPP-like proteins of flagellated fungi. Fungal Biol 2020; 125:357-367. [PMID: 33910677 DOI: 10.1016/j.funbio.2020.12.001] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 07/18/2020] [Revised: 12/02/2020] [Accepted: 12/06/2020] [Indexed: 12/12/2022]
Abstract
TPPP-like proteins, exhibiting microtubule stabilizing function, constitute a eukaryotic superfamily, characterized by the presence of the p25alpha domain. TPPPs in the strict sense are present in animals except Trichoplax adhaerens, which instead contains apicortin where a part of the p25alpha domain is combined with a DCX domain. Apicortin is absent in other animals and occurs mostly in the protozoan phylum, Apicomplexa. A strong correlation between the occurrence of p25alpha domain and that of the eukaryotic cilium/flagellum was suggested. Species of the deeper branching clades of Fungi possess a flagellum, but others have lost it; thus, investigation of fungal genomes can help test this suggestion. Indeed, these proteins are present in early branching Fungi. Both TPPP and apicortin are present in Rozellomycota (Cryptomycota) and Chytridiomycota, TPPP in Blastocladiomycota, apicortin in Neocallimastigomycota, Monoblepharomycota and the non-flagellated Mucoromycota. Besides the "normal" TPPP occurring in animals, a special, fungal-type TPPP is also present in Fungi, in which a part of the p25alpha domain is duplicated. Dikarya, the most developed subkingdom of Fungi, lacks both flagellum and TPPPs. This strengthens the view that each ciliated/flagellated organism contains p25alpha domain-containing proteins, while the p25alpha domain is found in very few non-flagellated ones.
Affiliation(s)
- Ferenc Orosz
- Institute of Enzymology, Research Centre for Natural Sciences, Magyar Tudósok Körútja 2, 1117, Budapest, Hungary.
| |
|
17
|
Katuwawala A, Kurgan L. Comparative Assessment of Intrinsic Disorder Predictions with a Focus on Protein and Nucleic Acid-Binding Proteins. Biomolecules 2020; 10:E1636. [PMID: 33291838 PMCID: PMC7762010 DOI: 10.3390/biom10121636] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 10/17/2020] [Revised: 11/26/2020] [Accepted: 12/03/2020] [Indexed: 01/18/2023]
Abstract
With over 60 disorder predictors, users need help navigating the predictor selection task. We review 28 surveys of disorder predictors, showing that only 11 include assessment of predictive performance. We identify and address a few drawbacks of these past surveys. To this end, we release a novel benchmark dataset with reduced similarity to the training sets of the considered predictors. We use this dataset to perform a first-of-its-kind comparative analysis that targets two large functional families of disordered proteins that interact with proteins and with nucleic acids. We show that limiting sequence similarity between the benchmark and the training datasets has a substantial impact on predictive performance. We also demonstrate that predictive quality is sensitive to the use of the well-annotated order and inclusion of the fully structured proteins in the benchmark datasets, both of which should be considered in future assessments. We identify three predictors that provide favorable results using the new benchmark set. While we find that VSL2B offers the most accurate and robust results overall, ESpritz-DisProt and SPOT-Disorder perform particularly well for disordered proteins. Moreover, we find that predictions for the disordered protein-binding proteins suffer low predictive quality compared to generic disordered proteins and the disordered nucleic acids-binding proteins. This can be explained by the high disorder content of the disordered protein-binding proteins, which makes it difficult for the current methods to accurately identify ordered regions in these proteins. This finding motivates the development of a new generation of methods that would target these difficult-to-predict disordered proteins. We also discuss resources that support users in collecting and identifying high-quality disorder predictions.
Affiliation(s)
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA;
| |
|
18
|
Folding and structural polymorphism of p53 C-terminal domain: One peptide with many conformations. Arch Biochem Biophys 2020; 684:108342. [DOI: 10.1016/j.abb.2020.108342] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 01/16/2020] [Revised: 02/20/2020] [Accepted: 03/11/2020] [Indexed: 11/19/2022]
|
19
|
Oldfield CJ, Fan X, Wang C, Dunker AK, Kurgan L. Computational Prediction of Intrinsic Disorder in Protein Sequences with the disCoP Meta-predictor. Methods Mol Biol 2020; 2141:21-35. [PMID: 32696351 DOI: 10.1007/978-1-0716-0524-0_2] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 12/12/2022]
Abstract
Intrinsically disordered proteins are either entirely disordered or contain disordered regions in their native state. These proteins and regions function without the prerequisite of a stable structure and were found to be abundant across all kingdoms of life. Experimental annotation of disorder lags behind the rapidly growing number of sequenced proteins, motivating the development of computational methods that predict disorder in protein sequences. DisCoP is a user-friendly webserver that provides accurate sequence-based prediction of protein disorder. It relies on a meta-architecture in which the outputs generated by multiple disorder predictors are combined to improve predictive performance. The architecture of disCoP is presented, and its accuracy relative to several other disorder predictors is briefly discussed. We describe usage of the web interface and explain how to access and read results generated by this computational tool. We also provide an example of prediction results and interpretation. The disCoP webserver is publicly available at http://biomine.cs.vcu.edu/servers/disCoP/.
Affiliation(s)
| | - Xiao Fan
- Department of Pediatrics, Columbia University, New York, NY, USA
| | - Chen Wang
- Department of Medicine, Columbia University, New York, NY, USA
| | - A Keith Dunker
- Department of Biochemistry and Molecular Biology, Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN, USA
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA.
| |
|
20
|
Katuwawala A, Oldfield CJ, Kurgan L. DISOselect: Disorder predictor selection at the protein level. Protein Sci 2020; 29:184-200. [PMID: 31642118 PMCID: PMC6933862 DOI: 10.1002/pro.3756] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 07/31/2019] [Revised: 10/16/2019] [Accepted: 10/17/2019] [Indexed: 12/27/2022]
Abstract
The intense interest in intrinsically disordered proteins in the life science community, together with the remarkable advancements in predictive technologies, has given rise to the development of a large number of computational predictors of intrinsic disorder from protein sequence. While the growing number of predictors is a positive trend, we have observed a considerable difference in predictive quality among predictors for individual proteins. Furthermore, variable predictor performance is often inconsistent between predictors for different proteins, and the predictor that shows the best predictive performance depends on the unique properties of each protein sequence. We propose a computational approach, DISOselect, to estimate the predictive performance of 12 selected predictors for individual proteins based on their unique sequence-derived properties. This estimation informs the users about the expected predictive quality for a selected disorder predictor and can be used to recommend methods that are likely to provide the best quality predictions. Our solution does not depend on the results of any disorder predictor; the estimations are made based solely on the protein sequence. Our solution significantly improves predictive performance, as judged with a test set of 1,000 proteins, when compared to other alternatives. We have empirically shown that by using the recommended methods the overall predictive performance for a given set of proteins can be improved by a statistically significant margin. DISOselect is freely available for non-commercial users through the webserver at http://biomine.cs.vcu.edu/servers/DISOselect/.
Affiliation(s)
- Akila Katuwawala
- Department of Computer Science, Virginia Commonwealth University, Richmond, Virginia
| | | | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, Virginia
| |
|
21
|
Barik A, Katuwawala A, Hanson J, Paliwal K, Zhou Y, Kurgan L. DEPICTER: Intrinsic Disorder and Disorder Function Prediction Server. J Mol Biol 2019; 432:3379-3387. [PMID: 31870849 DOI: 10.1016/j.jmb.2019.12.030] [Citation(s) in RCA: 45] [Impact Index Per Article: 7.5] [Received: 10/07/2019] [Revised: 12/07/2019] [Accepted: 12/15/2019] [Indexed: 01/06/2023]
Abstract
Computational predictions of the intrinsic disorder and its functions are instrumental to facilitate annotation for the millions of unannotated proteins. However, access to these predictors is fragmented and requires substantial effort to find them and to collect and combine their results. The DEPICTER (DisorderEd PredictIon CenTER) server provides first-of-its-kind centralized access to 10 popular disorder and disorder function predictions that cover protein and nucleic acids binding, linkers, and moonlighting regions. It automates the prediction process, runs user-selected methods on the server side, visualizes the results, and outputs all predictions in a consistent and easy-to-parse format. DEPICTER also includes two accurate consensus predictors of disorder and disordered protein binding. Empirical tests on an independent (low similarity) benchmark dataset reveal that the computational tools included in DEPICTER generate accurate predictions that are significantly better than the results secured using sequence alignment. The DEPICTER server is freely available at http://biomine.cs.vcu.edu/servers/DEPICTER/.
Affiliation(s)
- Amita Barik
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, 23284, USA; Department of Biotechnology, National Institute of Technology, Durgapur, India
| | - Akila Katuwawala
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, 23284, USA
| | - Jack Hanson
- Signal Processing Laboratory, Griffith University, Brisbane, QLD, 4122, Australia
| | - Kuldip Paliwal
- Signal Processing Laboratory, Griffith University, Brisbane, QLD, 4122, Australia
| | - Yaoqi Zhou
- School of Information and Communication Technology, Griffith University, Gold Coast, QLD, 4222, Australia; Institute for Glycomics, Griffith University, Gold Coast, QLD, 4222, Australia
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, 23284, USA.
| |
|
22
|
Sequential, Structural and Functional Properties of Protein Complexes Are Defined by How Folding and Binding Intertwine. J Mol Biol 2019; 431:4408-4428. [DOI: 10.1016/j.jmb.2019.07.034] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 02/28/2019] [Revised: 07/10/2019] [Accepted: 07/29/2019] [Indexed: 12/15/2022]
|
23
|
Katuwawala A, Oldfield CJ, Kurgan L. Accuracy of protein-level disorder predictions. Brief Bioinform 2019; 21:1509-1522. [DOI: 10.1093/bib/bbz100] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 05/06/2019] [Revised: 06/22/2019] [Accepted: 07/15/2019] [Indexed: 01/15/2023]
Abstract
Experimental annotations of intrinsic disorder are available for 0.1% of the 147 000 000 currently sequenced proteins. Over 60 sequence-based disorder predictors were developed to help bridge this gap. Current benchmarks of these methods assess predictive performance on datasets of proteins; however, predictions are often interpreted for individual proteins. We demonstrate that the protein-level predictive performance varies substantially from the dataset-level benchmarks. Thus, we perform a first-of-its-kind protein-level assessment for 13 popular disorder predictors using 6200 disorder-annotated proteins. We show that the protein-level distributions are substantially skewed toward high predictive quality while having long tails of poor predictions. Consequently, between 57% and 75% of proteins secure higher predictive performance than the currently used dataset-level assessment suggests, but as many as 30% of proteins that are located in the long tails suffer low predictive performance. These proteins typically have relatively high amounts of disorder, in contrast to the mostly structured proteins that are predicted accurately by all 13 methods. Interestingly, each predictor provides the most accurate results for some number of proteins, while the method that performs best at the dataset level is in fact the best for only about 30% of proteins. Moreover, the majority of proteins are predicted more accurately than the dataset-level performance of the most accurate tool by at least four disorder predictors. While these results suggest that disorder predictors outperform their current benchmark performance for the majority of proteins and that they complement each other, novel tools that accurately identify the hard-to-predict proteins and that make accurate predictions for these proteins are needed.
Affiliation(s)
- Akila Katuwawala
- Department of Computer Science, Virginia Commonwealth University, USA
| | - Christopher J Oldfield
- Department of Computer Science, Virginia Commonwealth University, USA
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, USA
| |
|
24
|
Kuriata A, Gierut AM, Oleniecki T, Ciemny MP, Kolinski A, Kurcinski M, Kmiecik S. CABS-flex 2.0: a web server for fast simulations of flexibility of protein structures. Nucleic Acids Res 2019; 46:W338-W343. [PMID: 29762700 PMCID: PMC6031000 DOI: 10.1093/nar/gky356] [Citation(s) in RCA: 228] [Impact Index Per Article: 38.0] [Received: 02/16/2018] [Accepted: 04/27/2018] [Indexed: 11/13/2022]
Abstract
Classical simulations of protein flexibility remain computationally expensive, especially for large proteins. A few years ago, we developed a fast method for predicting protein structure fluctuations that uses a single protein model as the input. The method has been made available as the CABS-flex web server and applied in numerous studies of protein structure-function relationships. Here, we present a major update of the CABS-flex web server to version 2.0. The new features include: extension of the method to significantly larger and multimeric proteins, customizable distance restraints and simulation parameters, contact maps and a new, enhanced web server interface. CABS-flex 2.0 is freely available at http://biocomp.chem.uw.edu.pl/CABSflex2.
Affiliation(s)
- Aleksander Kuriata
- University of Warsaw, Biological and Chemical Research Center, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
| | - Aleksandra Maria Gierut
- University of Warsaw, Biological and Chemical Research Center, Faculty of Chemistry, University of Warsaw, Warsaw, Poland.,Faculty of Physics, Astronomy and Applied Computer Science, Jagiellonian University, Krakow, Poland
| | - Tymoteusz Oleniecki
- University of Warsaw, Biological and Chemical Research Center, Faculty of Chemistry, University of Warsaw, Warsaw, Poland.,College of Inter-Faculty Individual Studies in Mathematics and Natural Sciences, University of Warsaw, Warsaw, Poland
| | - Maciej Pawel Ciemny
- University of Warsaw, Biological and Chemical Research Center, Faculty of Chemistry, University of Warsaw, Warsaw, Poland.,Faculty of Physics, University of Warsaw, Warsaw, Poland
| | - Andrzej Kolinski
- University of Warsaw, Biological and Chemical Research Center, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
| | - Mateusz Kurcinski
- University of Warsaw, Biological and Chemical Research Center, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
| | - Sebastian Kmiecik
- University of Warsaw, Biological and Chemical Research Center, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
| |
|
25
|
Mészáros B, Erdos G, Dosztányi Z. IUPred2A: context-dependent prediction of protein disorder as a function of redox state and protein binding. Nucleic Acids Res 2019; 46:W329-W337. [PMID: 29860432 PMCID: PMC6030935 DOI: 10.1093/nar/gky384] [Citation(s) in RCA: 965] [Impact Index Per Article: 160.8] [Received: 02/26/2018] [Accepted: 05/11/2018] [Indexed: 01/31/2023]
Abstract
The structural states of proteins include ordered globular domains as well as intrinsically disordered protein regions that exist as highly flexible conformational ensembles in isolation. Various computational tools have been developed to discriminate ordered and disordered segments based on the amino acid sequence. However, properties of IDRs can also depend on various conditions, including binding to globular protein partners or environmental factors, such as redox potential. These cases provide further challenges for the computational characterization of disordered segments. In this work we present IUPred2A, a combined web interface that generates energy-estimation-based predictions for ordered and disordered residues by IUPred2 and for disordered binding regions by ANCHOR2. The updated web server retains the robustness of the original programs but offers several new features. While only minor bug fixes are implemented for IUPred, the next version of ANCHOR is significantly improved through a new architecture and parameters optimized on novel datasets. In addition, redox-sensitive regions can also be highlighted through a novel experimental feature. The web server offers graphical and text outputs, a RESTful interface, access to software download and extensive help, and can be accessed at a new location: http://iupred2a.elte.hu.
Affiliation(s)
- Bálint Mészáros
- MTA-ELTE Momentum Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Budapest H-1117, Hungary
| | - Gábor Erdos
- MTA-ELTE Momentum Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Budapest H-1117, Hungary
| | - Zsuzsanna Dosztányi
- MTA-ELTE Momentum Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Budapest H-1117, Hungary
| |
|
26
|
Liu Y, Wang X, Liu B. A comprehensive review and comparison of existing computational methods for intrinsically disordered protein and region prediction. Brief Bioinform 2019; 20:330-346. [PMID: 30657889 DOI: 10.1093/bib/bbx126] [Citation(s) in RCA: 95] [Impact Index Per Article: 15.8] [Received: 07/19/2017] [Indexed: 01/06/2023]
Abstract
Intrinsically disordered proteins and regions are widely distributed and are associated with many biological processes and diseases. Accurate prediction of intrinsically disordered proteins and regions is critical for both basic research (such as protein structure and function prediction) and practical applications (such as drug development). During the past decades, many computational approaches have been proposed, which have greatly facilitated the development of this important field. Therefore, a comprehensive and updated review is much needed. In this regard, we review the computational methods for intrinsically disordered protein and region prediction, focusing especially on recent developments in this field. These computational approaches are divided into four categories based on their methodologies: physicochemical-based methods, machine-learning-based methods, template-based methods and meta methods. Furthermore, their advantages and disadvantages are also discussed. The performance of 40 state-of-the-art predictors is directly compared on the target proteins in the task of disordered region prediction in the 10th Critical Assessment of protein Structure Prediction. A more comprehensive performance comparison of 45 different predictors is conducted based on seven widely used benchmark data sets. Finally, some open problems and perspectives are discussed.
Affiliation(s)
- Yumeng Liu
- School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, China
| | - Xiaolong Wang
- School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, China
| | - Bin Liu
- School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, China
| |
|
27
|
Lyngdoh D, Shukla H, Sonkar A, Anupam R, Tripathi T. Portrait of the Intrinsically Disordered Side of the HTLV-1 Proteome. ACS Omega 2019; 4:10003-10018. [PMID: 31460093 PMCID: PMC6648719 DOI: 10.1021/acsomega.9b01017] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 04/09/2019] [Accepted: 05/28/2019] [Indexed: 05/07/2023]
Abstract
Intrinsically disordered proteins (IDPs) lack an ordered 3D structure. These proteins contain one or more intrinsically disordered protein regions (IDPRs). IDPRs interact promiscuously with other proteins, which leads to their structural transition from a disordered to an ordered state. Such interaction-prone regions of IDPs are known as molecular recognition features. Recent studies suggest that IDPs provide structural plasticity and functional diversity to viral proteins that are involved in rapid replication and immune evasion within the host cells. In the present study, we evaluated the prevalence of IDPs and IDPRs in human T lymphotropic virus type 1 (HTLV-1) proteome. We also investigated the presence of MoRF regions in the structural and nonstructural proteins of HTLV-1. We found abundant IDPRs in HTLV-1 bZIP factor, p30, Rex, and structural nucleocapsid p15 proteins, which are involved in diverse functions such as virus proliferation, mRNA export, and genomic RNA binding. Our study analyzed the HTLV-1 proteome with the perspective of intrinsic disorder identification. We propose that the intrinsic disorder analysis of HTLV-1 proteins may form the basis for the development of protein disorder-based drugs.
Affiliation(s)
- Denzelle L. Lyngdoh
- Molecular and Structural Biophysics Laboratory, Department of Biochemistry, North-Eastern Hill University, Shillong 793022, India
| | - Harish Shukla
- Molecular and Structural Biophysics Laboratory, Department of Biochemistry, North-Eastern Hill University, Shillong 793022, India
| | - Amit Sonkar
- Molecular and Structural Biophysics Laboratory, Department of Biochemistry, North-Eastern Hill University, Shillong 793022, India
| | - Rajaneesh Anupam
- Department of Biotechnology, Dr. Harisingh Gour Central University, Sagar 470003, India
| | - Timir Tripathi
- Molecular and Structural Biophysics Laboratory, Department of Biochemistry, North-Eastern Hill University, Shillong 793022, India
- E-mail: , . Phone: +91-364-2722141. Fax: +91-364-2550108
| |
|
28
|
Namdev P, Lyngdoh DL, Dar HY, Chaurasiya SK, Srivastava R, Tripathi T, Anupam R. Intrinsically Disordered Human T Lymphotropic Virus Type 1 p30 Protein: Experimental and Computational Evidence. AIDS Res Hum Retroviruses 2019; 35:477-487. [PMID: 30618266 DOI: 10.1089/aid.2018.0196] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 12/16/2022]
Abstract
Human T lymphotropic virus type 1 (HTLV-1) causes adult T cell leukemia and lymphoma and other neuroinflammatory diseases. The pX region of the HTLV-1 genome encodes an accessory protein p30 that is required for viral persistence and spread in the host. p30 regulates viral gene expression at the transcription level by competing with Tax for p300 binding and at the posttranscriptional level by nuclear retention of tax/rex messenger RNA (mRNA). In addition, p30 modulates the host cellular environment by binding to various host proteins such as ATM, REGγ, and PRMT5. However, the low expression levels of p30 have been a major hurdle in studying its structure-function relationship in the context of HTLV-1 pathobiology, which is most likely due to its intrinsically disordered nature. To investigate the unstable nature of p30, flow cytometric analysis of p30-GFP fusion protein expressed in Escherichia coli was conducted and bioinformatics analysis of p30 was performed. The bacterial cells were green fluorescent protein (GFP) positive, indicating that p30-GFP was in the soluble fraction. Induction, particularly at higher temperature, reduced the expression of p30-GFP. Moreover, p30-GFP was detected exclusively in the insoluble fraction upon cell lysis, suggesting its unstable and disordered nature. The bioinformatics analysis of p30 protein sequence and amino acid content revealed that p30 has highly disordered regions from amino acids 75-155 and 197-241. Furthermore, p30 has regions for macromolecular interactions that could stabilize it and these regions coincide with the unstable regions. Collectively, the study indicates that HTLV-1 p30 is an intrinsically disordered protein.
Affiliation(s)
- Priyanka Namdev
- Department of Biotechnology, Dr. Harisingh Gour University, Sagar, India
- Denzelle Lee Lyngdoh
- Molecular and Structural Biophysics Laboratory, Department of Biochemistry, North Eastern Hill University, Shillong, India
- Hamid Y. Dar
- Department of Zoology, Dr. Harisingh Gour University, Sagar, India
- Shivendra K. Chaurasiya
- Host-Pathogen Interaction and Signal Transduction Laboratory, Department of Microbiology, Dr. Harisingh Gour University, Sagar, India
- Timir Tripathi
- Molecular and Structural Biophysics Laboratory, Department of Biochemistry, North Eastern Hill University, Shillong, India
- Rajaneesh Anupam
- Department of Biotechnology, Dr. Harisingh Gour University, Sagar, India
|
29
|
Falahati H, Haji-Akbari A. Thermodynamically driven assemblies and liquid-liquid phase separations in biology. Soft Matter 2019; 15:1135-1154. [PMID: 30672955 DOI: 10.1039/c8sm02285b] [Citation(s) in RCA: 65] [Impact Index Per Article: 10.8] [Indexed: 06/09/2023]
Abstract
The sustenance of life depends on the high degree of organization that prevails through different levels of living organisms, from subcellular structures such as biomolecular complexes and organelles to tissues and organs. The physical origin of such organization is not fully understood, and even though it is clear that cells and organisms cannot maintain their integrity without consuming energy, there is growing evidence that individual assembly processes can be thermodynamically driven and occur spontaneously due to changes in thermodynamic variables such as intermolecular interactions and concentration. Understanding the phase separation in vivo requires a multidisciplinary approach, integrating the theory and physics of phase separation with experimental and computational techniques. This paper aims at providing a brief overview of the physics of phase separation and its biological implications, with a particular focus on the assembly of membraneless organelles. We discuss the underlying physical principles of phase separation from its thermodynamics to its kinetics. We also overview the wide range of methods utilized for experimental verification and characterization of phase separation of membraneless organelles, as well as the utility of molecular simulations rooted in thermodynamics and statistical physics in understanding the governing principles of thermodynamically driven biological self-assembly processes.
Affiliation(s)
- Hanieh Falahati
- Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA
|
30
|
Misprediction of Structural Disorder in Halophiles. Molecules 2019; 24:molecules24030479. [PMID: 30699990 PMCID: PMC6384707 DOI: 10.3390/molecules24030479] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 01/15/2019] [Revised: 01/25/2019] [Accepted: 01/26/2019] [Indexed: 12/01/2022]
Abstract
Whereas the concept of intrinsic disorder derives from biophysical observations of the lack of structure of proteins or protein regions under native conditions, many of our respective concepts rest on proteome-scale bioinformatics predictions. It is established that most predictors work reliably on proteins commonly encountered, but it is often neglected that we know very little about their performance on proteins of microorganisms that thrive in environments of extreme temperature, pH, or salt concentration, which may cause adaptive sequence composition bias. To address this issue, we predicted structural disorder for the complete proteomes of different extremophile groups by popular prediction methods and compared them to those of the reference mesophilic group. While significant deviations from mesophiles could be explained by a lack or gain of disordered regions in hyperthermophiles and radiotolerants, respectively, we found systematic overprediction in the case of halophiles. Additionally, examples were collected from the Protein Data Bank (PDB) to demonstrate misprediction and to help understand the underlying biophysical principles, i.e., halophilic proteins maintain a highly acidic and hydrophilic surface to avoid aggregation in high salt conditions. Although sparseness of data on disordered proteins from extremophiles precludes the development of dedicated general predictors, we do formulate recommendations for how to address their disorder with current bioinformatics tools.
|
31
|
Fichó E, Reményi I, Simon I, Mészáros B. MFIB: a repository of protein complexes with mutual folding induced by binding. Bioinformatics 2018; 33:3682-3684. [PMID: 29036655 PMCID: PMC5870711 DOI: 10.1093/bioinformatics/btx486] [Citation(s) in RCA: 57] [Impact Index Per Article: 8.1] [Received: 03/22/2017] [Accepted: 08/02/2017] [Indexed: 12/02/2022]
Abstract
Motivation: It is commonplace that intrinsically disordered proteins (IDPs) are involved in crucial interactions in the living cell. However, the study of protein complexes formed exclusively by IDPs is hindered by the lack of data and such analyses remain sporadic. Systematic studies benefited other types of protein–protein interactions paving a way from basic science to therapeutics; yet these efforts require reliable datasets that are currently lacking for synergistically folding complexes of IDPs. Results: Here we present the Mutual Folding Induced by Binding (MFIB) database, the first systematic collection of complexes formed exclusively by IDPs. MFIB contains an order of magnitude more data than any dataset used in corresponding studies and offers a wide coverage of known IDP complexes in terms of flexibility, oligomeric composition and protein function from all domains of life. The included complexes are grouped using a hierarchical classification and are complemented with structural and functional annotations. MFIB is backed by a firm development team and infrastructure, and together with possible future community collaboration it will provide the cornerstone for structural and functional studies of IDP complexes. Availability and implementation: MFIB is freely accessible at http://mfib.enzim.ttk.mta.hu/. The MFIB application is hosted by Apache web server and was implemented in PHP. To enrich querying features and to enhance backend performance a MySQL database was also created. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Erzsébet Fichó
- Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest H-1117, Hungary
- István Reményi
- Institute of Enzymology, RCNS, Hungarian Academy of Sciences, 'Momentum' Membrane Protein Bioinformatics Research Group, Budapest H-1117, Hungary
- István Simon
- Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest H-1117, Hungary
- Bálint Mészáros
- Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest H-1117, Hungary
|
32
|
Sonkar A, Lyngdoh DL, Shukla R, Shukla H, Tripathi T, Ahmed S. Point mutation A394E in the central intrinsic disordered region of Rna14 leads to chromosomal instability in fission yeast. Int J Biol Macromol 2018; 119:785-791. [PMID: 30076928 DOI: 10.1016/j.ijbiomac.2018.07.193] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2018] [Revised: 07/31/2018] [Accepted: 07/31/2018] [Indexed: 12/01/2022]
Abstract
Accurate chromosomal segregation is crucial for the maintenance of genomic integrity. Rna14 is a major component of the yeast pre-mRNA 3'-end processing factor, the cleavage factor IA complex, and is involved in cleavage and polyadenylation of mRNA in the nucleus. Rna14 is also essential for the maintenance of genomic integrity in the fission yeast Schizosaccharomyces pombe. In the present study, we report that a non-homologous mutation, A394E, present in the central intrinsic disordered region of Rna14, leads to chromosomal instability in fission yeast. This mutation was shown to disrupt chromosome segregation and 3'-end maturation, and it also affects pre-mRNA splicing in vivo at non-permissive temperatures. We observed that a significant part of Rna14 is intrinsically disordered, including the N- and C-termini of Rna14 as well as the central region containing the HAT repeats and the mutation within amino acid residues 372-435. These regions are crucial for the function of Rna14 as they are involved in the interaction of Rna14 with other proteins.
Affiliation(s)
- Amit Sonkar
- Molecular and Structural Biophysics Laboratory, Department of Biochemistry, North-Eastern Hill University, Shillong 793022, India
- Denzelle Lee Lyngdoh
- Molecular and Structural Biophysics Laboratory, Department of Biochemistry, North-Eastern Hill University, Shillong 793022, India
- Rohit Shukla
- Molecular and Structural Biophysics Laboratory, Department of Biochemistry, North-Eastern Hill University, Shillong 793022, India
- Harish Shukla
- Molecular and Structural Biophysics Laboratory, Department of Biochemistry, North-Eastern Hill University, Shillong 793022, India
- Timir Tripathi
- Molecular and Structural Biophysics Laboratory, Department of Biochemistry, North-Eastern Hill University, Shillong 793022, India
- Shakil Ahmed
- Molecular and Structural Biology Division, CSIR-Central Drug Research Institute, Sector 10, Jankipuram Extension, Lucknow 226031, India
|
33
|
How Do We Study the Dynamic Structure of Unstructured Proteins: A Case Study on Nopp140 as an Example of a Large, Intrinsically Disordered Protein. Int J Mol Sci 2018; 19:ijms19020381. [PMID: 29382046 PMCID: PMC5855603 DOI: 10.3390/ijms19020381] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Received: 01/04/2018] [Revised: 01/22/2018] [Accepted: 01/23/2018] [Indexed: 02/04/2023]
Abstract
Intrinsically disordered proteins (IDPs) represent approximately 30% of the human genome and play key roles in cell proliferation and cellular signaling by modulating the function of target proteins via protein-protein interactions. In addition, IDPs are involved in various human disorders, such as cancer, neurodegenerative diseases, and amyloidosis. To understand the underlying molecular mechanism of IDPs, it is important to study their structural features during their interactions with target proteins. However, conventional biochemical and biophysical methods for analyzing proteins, such as X-ray crystallography, have difficulty in characterizing the features of IDPs because they lack an ordered three-dimensional structure. Here, we present biochemical and biophysical studies on nucleolar phosphoprotein 140 (Nopp140), which mostly consists of disordered regions, during its interaction with casein kinase 2 (CK2), which plays a central role in cell growth. Surface plasmon resonance and electron paramagnetic resonance studies were performed to characterize the interaction between Nopp140 and CK2. A single-molecule fluorescence resonance energy transfer study revealed conformational change in Nopp140 during its interaction with CK2. These studies on Nopp140 can provide a good model system for understanding the molecular function of IDPs.
|
34
|
Uversky VN. The roles of intrinsic disorder-based liquid-liquid phase transitions in the "Dr. Jekyll-Mr. Hyde" behavior of proteins involved in amyotrophic lateral sclerosis and frontotemporal lobar degeneration. Autophagy 2017; 13:2115-2162. [PMID: 28980860 DOI: 10.1080/15548627.2017.1384889] [Citation(s) in RCA: 42] [Impact Index Per Article: 5.3] [Indexed: 12/11/2022]
Abstract
Pathological developments leading to amyotrophic lateral sclerosis (ALS) and frontotemporal lobar degeneration (FTLD) are associated with misbehavior of several key proteins, such as SOD1 (superoxide dismutase 1), TARDBP/TDP-43, FUS, C9orf72, and dipeptide repeat proteins generated as a result of the translation of the intronic hexanucleotide expansions in the C9orf72 gene, PFN1 (profilin 1), GLE1 (GLE1, RNA export mediator), PURA (purine rich element binding protein A), FLCN (folliculin), RBM45 (RNA binding motif protein 45), SS18L1/CREST, HNRNPA1 (heterogeneous nuclear ribonucleoprotein A1), HNRNPA2B1 (heterogeneous nuclear ribonucleoprotein A2/B1), ATXN2 (ataxin 2), MAPT (microtubule associated protein tau), and TIA1 (TIA1 cytotoxic granule associated RNA binding protein). Although these proteins are structurally and functionally different and have rather different pathological functions, they all possess some levels of intrinsic disorder and are either directly engaged in or are at least related to the physiological liquid-liquid phase transitions (LLPTs) leading to the formation of various proteinaceous membrane-less organelles (PMLOs), both normal and pathological. This review describes the normal and pathological functions of these ALS- and FTLD-related proteins, describes their major structural properties, glances at their intrinsic disorder status, and analyzes the involvement of these proteins in the formation of normal and pathological PMLOs, with the ultimate goal of better understanding the roles of LLPTs and intrinsic disorder in the "Dr. Jekyll-Mr. Hyde" behavior of those proteins.
Affiliation(s)
- Vladimir N Uversky
- Department of Molecular Medicine and USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
- Institute for Biological Instrumentation of the Russian Academy of Sciences, Pushchino, Moscow region, Russia
|
35
|
Dosztányi Z. Prediction of protein disorder based on IUPred. Protein Sci 2017; 27:331-340. [PMID: 29076577 DOI: 10.1002/pro.3334] [Citation(s) in RCA: 119] [Impact Index Per Article: 14.9] [Received: 08/25/2017] [Revised: 10/25/2017] [Accepted: 10/25/2017] [Indexed: 12/19/2022]
Abstract
Many proteins contain intrinsically disordered regions (IDRs), functional polypeptide segments that in isolation adopt a highly flexible conformational ensemble instead of a single, well-defined structure. Disorder prediction methods, which can discriminate ordered and disordered regions from the amino acid sequence, have contributed significantly to our current understanding of the distinct properties of intrinsically disordered proteins by enabling the characterization of individual examples as well as large-scale analyses of these protein regions. One popular method, IUPred, provides a robust prediction of protein disorder based on an energy estimation approach that captures the fundamental difference between the biophysical properties of ordered and disordered regions. This paper reviews the energy estimation method underlying IUPred and the basic properties of the web server. Through an example, it also illustrates how the prediction output can be interpreted in a more complex case by taking into account the heterogeneous nature of IDRs. Various applications that benefited from IUPred, including providing improved disorder predictions, complementing domain annotations, and aiding the identification of functional short linear motifs, are also described here. IUPred is freely available for noncommercial users through the web server (http://iupred.enzim.hu and http://iupred.elte.hu). The program can also be downloaded and installed locally for large-scale analyses.
Affiliation(s)
- Zsuzsanna Dosztányi
- MTA-ELTE Lendület Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Budapest, H-1117, Hungary
|
36
|
Meng F, Uversky VN, Kurgan L. Comprehensive review of methods for prediction of intrinsic disorder and its molecular functions. Cell Mol Life Sci 2017; 74:3069-3090. [PMID: 28589442 PMCID: PMC11107660 DOI: 10.1007/s00018-017-2555-4] [Citation(s) in RCA: 130] [Impact Index Per Article: 16.3] [Received: 05/18/2017] [Accepted: 06/01/2017] [Indexed: 12/19/2022]
Abstract
Computational prediction of intrinsic disorder in protein sequences dates back to the late 1970s and has flourished in the last two decades. We provide a brief historical overview, and we review over 30 recent predictors of disorder. We are the first to also cover predictors of molecular functions of disorder, including 13 methods that focus on disordered linkers and disordered protein-protein, protein-RNA, and protein-DNA binding regions. We overview their predictive models, usability, and predictive performance. We highlight the newest methods and the predictors that offer strong predictive performance as measured in recent comparative assessments. We conclude that the modern predictors are relatively accurate, enjoy widespread use, and many of them are fast. Their predictions are conveniently accessible to the end users, via web servers and databases that store pre-computed predictions for millions of proteins. However, research into methods that predict the many not yet addressed functions of intrinsic disorder remains an outstanding challenge.
Affiliation(s)
- Fanchi Meng
- Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Canada
- Vladimir N Uversky
- Department of Molecular Medicine, USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
- Institute for Biological Instrumentation, Russian Academy of Sciences, Pushchino, Moscow Region, Russian Federation
- Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, USA
|
37
|
Gao M, Yang F, Zhang L, Su Z, Huang Y. Exploring the sequence-structure-function relationship for the intrinsically disordered βγ-crystallin Hahellin. J Biomol Struct Dyn 2017; 36:1171-1181. [PMID: 28393629 DOI: 10.1080/07391102.2017.1316519] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 10/19/2022]
Abstract
βγ-Crystallins are a superfamily of proteins containing crystallin-type Greek key motifs. Some βγ-crystallin domains have been shown to bind Ca2+. Hahellin is a newly identified intrinsically disordered βγ-crystallin domain from Hahella chejuensis. It folds into a typical βγ-crystallin structure upon Ca2+ binding and acts as a Ca2+-regulated conformational switch. Besides Hahellin, another two putative βγ-crystallins from Caulobacter crescentus and Yersinia pestis are shown to be partially disordered in their apo-form and undergo large conformational changes upon Ca2+ binding, although whether they acquire a βγ-crystallin fold is not known. The extent of conformational disorder/order of a protein is determined by its amino acid sequence. To date how this sequence-structure relationship is reflected in the βγ-crystallin superfamily has not been investigated. In this work, we comparatively studied the sequence and structure of Hahellin with those of Protein S, an ordered βγ-crystallin, via various computational biophysical techniques. We found that several factors, including presence of a C-terminal disorder prone region, high content of energetic frustrations, and low contact density, may promote the formation of the disordered state of apo-Hahellin. We also analyzed the disorder propensities for other putative disordered βγ-crystallin domains. This study provides new clues for further understanding the sequence-structure-function relationship of βγ-crystallins.
Affiliation(s)
- Meng Gao
- Department of Biological Engineering and Institute of Biomedical and Pharmaceutical Sciences, Hubei University of Technology, Wuhan, Hubei 430068, China
- Fei Yang
- Department of Biological Engineering and Institute of Biomedical and Pharmaceutical Sciences, Hubei University of Technology, Wuhan, Hubei 430068, China
- Lei Zhang
- Department of Biological Engineering and Institute of Biomedical and Pharmaceutical Sciences, Hubei University of Technology, Wuhan, Hubei 430068, China
- Zhengding Su
- Department of Biological Engineering and Institute of Biomedical and Pharmaceutical Sciences, Hubei University of Technology, Wuhan, Hubei 430068, China
- Yongqi Huang
- Department of Biological Engineering and Institute of Biomedical and Pharmaceutical Sciences, Hubei University of Technology, Wuhan, Hubei 430068, China
|
38
|
Becerra A, Bucheli VA, Moreno PA. Prediction of virus-host protein-protein interactions mediated by short linear motifs. BMC Bioinformatics 2017; 18:163. [PMID: 28279163 PMCID: PMC5345135 DOI: 10.1186/s12859-017-1570-7] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.3] [Received: 10/16/2016] [Accepted: 02/24/2017] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Short linear motifs in host organism proteins can be mimicked by viruses to create protein-protein interactions that disable or control metabolic pathways. Given that viral linear motif instances of host motif regular expressions can be found by chance, it is necessary to develop methods for filtering functional linear motifs. We conduct a systematic comparison of linear motif filtering methods to develop a computational approach for predicting motif-mediated protein-protein interactions between human and the human immunodeficiency virus 1 (HIV-1). RESULTS: We implemented three filtering methods to obtain linear motif sets: 1) conserved in viral proteins (C), 2) located in disordered regions (D) and 3) rare or scarce in a set of randomized viral sequences (R). The sets C, D, R are united and intersected. The resulting sets are compared by the number of protein-protein interactions correctly inferred with them, with experimental validation. The comparison is done with HIV-1 sequences and interactions from the National Institute of Allergy and Infectious Diseases (NIAID). The number of correctly inferred interactions allows us to rank the interactions by the sets used to deduce them: D∪R and C. The ordering of the sets is descending on the probability of capturing functional interactions. With respect to HIV-1, the sets C∪R, D∪R, C∪D∪R infer all known interactions between HIV-1 and human proteins mediated by linear motifs. We found that the majority of conserved linear motifs in the virus are located in disordered regions. CONCLUSION: We have developed a method for predicting protein-protein interactions mediated by linear motifs between HIV-1 and human proteins. The method uses only protein sequences as inputs. We can extend the software developed to any other eukaryotic virus and host in order to find and rank candidate interactions. In future work we will use it to explore possible viral attack mechanisms based on linear motif mimicry.
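The set-combination step in this abstract (uniting and intersecting the filtered motif sets C, D and R, then comparing them by how many validated interactions each recovers) can be illustrated with a minimal Python sketch. This is not the authors' software: the motif identifiers, the `validated` set, and both function names are hypothetical, and the toy data are invented purely to show the mechanics.

```python
from itertools import combinations

# Toy motif identifiers passing each filter (hypothetical, not from the paper).
C = {"m1", "m2", "m3"}   # conserved in viral proteins
D = {"m2", "m3", "m4"}   # located in disordered regions
R = {"m3", "m5"}         # rare in randomized viral sequences

# Hypothetical motifs that underlie experimentally validated interactions.
validated = {"m2", "m3", "m5"}

def candidate_sets(filters):
    """Return the base sets plus every union ('|') and intersection ('&') of 2+ filters."""
    out = dict(filters)
    names = sorted(filters)
    for k in range(2, len(names) + 1):
        for combo in combinations(names, k):
            members = [filters[n] for n in combo]
            out["|".join(combo)] = set().union(*members)
            out["&".join(combo)] = set.intersection(*members)
    return out

def rank_by_recovered(sets, validated):
    """Sort set names by the number of validated motifs they contain, descending."""
    scores = {name: len(s & validated) for name, s in sets.items()}
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

sets = candidate_sets({"C": C, "D": D, "R": R})
ranking = rank_by_recovered(sets, validated)
for name, score in ranking:
    print(f"{name}: {score} validated motifs recovered")
```

Counting recovered validated motifs is only one possible scoring choice; the abstract does not specify the exact ranking criterion beyond the number of correctly inferred interactions.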
Affiliation(s)
- Andrés Becerra
- Escuela de ingeniería de sistemas y computación, Universidad del Valle, Calle 13 # 100-00, A. A. 25360, Cali, Colombia
- Victor A Bucheli
- Escuela de ingeniería de sistemas y computación, Universidad del Valle, Calle 13 # 100-00, A. A. 25360, Cali, Colombia
- Pedro A Moreno
- Escuela de ingeniería de sistemas y computación, Universidad del Valle, Calle 13 # 100-00, A. A. 25360, Cali, Colombia
|
39
|
Abstract
Intrinsically disordered proteins and regions (IDPs and IDRs) are involved in a wide range of cellular functions and they often facilitate interactions with RNAs, DNAs, and proteins. Although many computational methods can predict IDPs and IDRs in protein sequences, only a few methods predict their functions and these functions primarily concern protein binding. We describe how to use DisoRDPbind, the first computational method for high-throughput prediction of multiple functions of disordered regions. Our method predicts the RNA-, DNA-, and protein-binding residues located in IDRs in the input protein sequences. DisoRDPbind provides accurate predictions and is sufficiently fast to make predictions for full genomes. Our method is implemented as a user-friendly webserver that is freely available at http://biomine.ece.ualberta.ca/DisoRDPbind/. We overview our predictor, discuss how to run the webserver, and show how to interpret the corresponding results. We also demonstrate the utility of our method based on two case studies: human BRCA1 protein, which binds various proteins and DNA, and yeast 60S ribosomal protein L4, which interacts with proteins and RNA.
|
40
|
Vojisavljevic V, Pirogova E. Prediction of intrinsically disordered regions in proteins using signal processing methods: application to heat-shock proteins. Med Biol Eng Comput 2016; 54:1831-1844. [PMID: 27037818 DOI: 10.1007/s11517-016-1477-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 08/20/2015] [Accepted: 02/25/2016] [Indexed: 10/22/2022]
Abstract
Heat-shock protein (HSP)-based immunotherapy is believed to be a promising area of development for cancer treatment as such therapy is characterized by a unique approach to every tumour. It was shown that by inhibition of HSPs it is possible to induce apoptotic cell death in cancer cells. Interestingly, there are a great number of disordered regions in proteins associated with cancer, cardiovascular and neurodegenerative diseases, signalling, and diabetes. HSPs and some specific enzymes were shown to have these disordered regions in their primary structures. The experimental studies of HSPs confirmed that their intrinsically disordered (ID) regions are of functional importance. These ID regions play crucial roles in regulating the specificity of interactions between dimer complexes and their interacting partners. Because HSPs are overexpressed in cancer, predicting the locations of ID regions and binding sites in these proteins will be important for developing novel cancer therapeutics. In our previous studies, signal processing methods have been successfully used for protein structure-function analysis (i.e. for determining functionally important amino acids and the locations of protein active sites). In this paper, we present and discuss a novel approach for predicting the locations of ID regions in the selected cancer-related HSPs.
Affiliation(s)
- Vuk Vojisavljevic
- Biomedical Engineering, School of Engineering, RMIT University, Melbourne, VIC, 3001, Australia
- Elena Pirogova
- Biomedical Engineering, School of Engineering, RMIT University, Melbourne, VIC, 3001, Australia
|
41
|
Pearson H, Daouda T, Granados DP, Durette C, Bonneil E, Courcelles M, Rodenbrock A, Laverdure JP, Côté C, Mader S, Lemieux S, Thibault P, Perreault C. MHC class I-associated peptides derive from selective regions of the human genome. J Clin Invest 2016; 126:4690-4701. [PMID: 27841757 DOI: 10.1172/jci88590] [Citation(s) in RCA: 140] [Impact Index Per Article: 15.6] [Received: 05/13/2016] [Accepted: 09/30/2016] [Indexed: 12/24/2022]
Abstract
MHC class I-associated peptides (MAPs) define the immune self for CD8+ T lymphocytes and are key targets of cancer immunosurveillance. Here, the goals of our work were to determine whether the entire set of protein-coding genes could generate MAPs and whether specific features influence the ability of discrete genes to generate MAPs. Using proteogenomics, we have identified 25,270 MAPs isolated from the B lymphocytes of 18 individuals who collectively expressed 27 high-frequency HLA-A,B allotypes. The entire MAP repertoire presented by these 27 allotypes covered only 10% of the exomic sequences expressed in B lymphocytes. Indeed, 41% of expressed protein-coding genes generated no MAPs, while 59% of genes generated up to 64 MAPs, often derived from adjacent regions and presented by different allotypes. We next identified several features of transcripts and proteins associated with efficient MAP production. From these data, we built a logistic regression model that predicts with good accuracy whether a gene generates MAPs. Our results show preferential selection of MAPs from a limited repertoire of proteins with distinctive features. The notion that the MHC class I immunopeptidome presents only a small fraction of the protein-coding genome for monitoring by the immune system has profound implications in autoimmunity and cancer immunology.
|
42
|
Abstract
Repeats are ubiquitous elements of proteins and they play important roles in cellular function and during evolution. Repeats are, however, also notoriously difficult to capture computationally, and large-scale studies have so far had difficulty linking genetic causes, structural properties and evolutionary trajectories of protein repeats. Here we apply recently developed methods for repeat detection and analysis to a large dataset comprising over a hundred metazoan genomes. We find that repeats in larger protein families generally experience very few insertions or deletions (indels) of repeat units, but there is also a significant fraction of noteworthy volatile outliers with very high indel rates. Analysis of structural data indicates that repeats with an open structure and independently folding units are more volatile and more likely to be intrinsically disordered. Such disordered repeats are also significantly enriched in sites with a high functional potential such as linear motifs. Furthermore, the most volatile repeats have a high sequence similarity between their units. Since many volatile repeats also show signs of recombination, we conclude they are often shaped by concerted evolution. Intriguingly, many of these conserved yet volatile repeats are involved in host-pathogen interactions where they might foster fast but subtle adaptation in biological arms races. KEY WORDS: protein evolution, domain rearrangements, protein repeats, concerted evolution.
Affiliation(s)
- Andreas Schüler
- Institute for Evolution and Biodiversity, Westfalian Wilhelms University, Huefferstrasse 1, Muenster, Germany
- Erich Bornberg-Bauer
- Institute for Evolution and Biodiversity, Westfalian Wilhelms University, Huefferstrasse 1, Muenster, Germany
|
43
|
Mannige RV, Kundu J, Whitelam S. The Ramachandran Number: An Order Parameter for Protein Geometry. PLoS One 2016; 11:e0160023. [PMID: 27490241 PMCID: PMC4973960 DOI: 10.1371/journal.pone.0160023] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Received: 01/11/2016] [Accepted: 07/12/2016] [Indexed: 11/18/2022]
Abstract
Three-dimensional protein structures usually contain regions of local order, called secondary structure, such as α-helices and β-sheets. Secondary structure is characterized by the local rotational state of the protein backbone, quantified by two dihedral angles called ϕ and ψ. Particular types of secondary structure can generally be described by a single (diffuse) location on a two-dimensional plot drawn in the space of the angles ϕ and ψ, called a Ramachandran plot. By contrast, a recently-discovered nanomaterial made from peptoids, structural isomers of peptides, displays a secondary-structure motif corresponding to two regions on the Ramachandran plot [Mannige et al., Nature 526, 415 (2015)]. In order to describe such ‘higher-order’ secondary structure in a compact way we introduce here a means of describing regions on the Ramachandran plot in terms of a single Ramachandran number, R, which is a structurally meaningful combination of ϕ and ψ. We show that the potential applications of R are numerous: it can be used to describe the geometric content of protein structures, and can be used to draw diagrams that reveal, at a glance, the frequency of occurrence of regular secondary structures and disordered regions in large protein datasets. We propose that R might be used as an order parameter for protein geometry for a wide range of applications.
Affiliation(s)
- Ranjan V. Mannige
  - Molecular Foundry, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, United States of America
  - * E-mail: (RVM); (SW)
- Joyjit Kundu
  - Molecular Foundry, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, United States of America
- Stephen Whitelam
  - Molecular Foundry, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, United States of America
  - * E-mail: (RVM); (SW)
|
44
|
Nielsen JT, Mulder FAA. There is Diversity in Disorder-"In all Chaos there is a Cosmos, in all Disorder a Secret Order". Front Mol Biosci 2016; 3:4. [PMID: 26904549 PMCID: PMC4749933 DOI: 10.3389/fmolb.2016.00004] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Received: 11/27/2015] [Accepted: 01/25/2016] [Indexed: 11/13/2022] Open
Abstract
The protein universe consists of a continuum of structures ranging from full order to complete disorder. As the structured part of the proteome has been intensively studied, stably folded proteins are increasingly well documented and understood. However, proteins that are fully, or in large part, disordered are much less well characterized. Here we collected NMR chemical shifts in a small database for 117 protein sequences that are known to contain disorder. We demonstrate that NMR chemical shift data can be brought to bear as an exquisite judge of protein disorder at the residue level, and help in validation. With the help of secondary chemical shift analysis we demonstrate that the proteins in the database span the full spectrum of disorder but still largely segregate into two classes: disordered with small segments of order scattered along the sequence, and structured with small segments of disorder inserted between the different structured regions. A detailed analysis reveals that the distribution of order and disorder along the sequence is complex, asymmetric, and highly protein-dependent. Access to ratified training data further suggests an avenue to improving prediction of disorder from sequence.
Affiliation(s)
- Jakob T Nielsen
  - Department of Chemistry and Interdisciplinary Nanoscience Center, University of Aarhus, Aarhus, Denmark
- Frans A A Mulder
  - Department of Chemistry and Interdisciplinary Nanoscience Center, University of Aarhus, Aarhus, Denmark
|
45
|
Jandrlić DR, Lazić GM, Mitić NS, Pavlović MD. Software tools for simultaneous data visualization and T cell epitopes and disorder prediction in proteins. J Biomed Inform 2016; 60:120-31. [PMID: 26851400 DOI: 10.1016/j.jbi.2016.01.016] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 07/19/2015] [Revised: 01/15/2016] [Accepted: 01/28/2016] [Indexed: 11/16/2022]
Abstract
We have developed EpDis and MassPred, extendable open source software tools that support bioinformatic research and enable parallel use of different methods for the prediction of T cell epitopes, disorder and disordered binding regions, as well as hydropathy calculation. These tools offer a semi-automated installation of chosen sets of external predictors and an interface allowing for easy application of the prediction methods, which can be applied either to individual proteins or to large protein datasets. In addition to access to prediction methods, the tools also provide visualization of the obtained results, calculation of a consensus from the results of different methods, as well as import of experimental data and their comparison with results obtained with different predictors. The tools also offer a graphical user interface and the possibility to store data and the results obtained using all of the integrated methods in a relational database or flat file for further analysis. The MassPred part enables a massively parallel application of all integrated predictors to a set of proteins. Both tools can be downloaded from http://bioinfo.matf.bg.ac.rs/home/downloads.wafl?cat=Software. Appendix A includes the technical description of the created tools and a list of supported predictors.
Affiliation(s)
- Davorka R Jandrlić
  - University of Belgrade, Faculty of Mechanical Engineering, Kraljice Marije 16, Belgrade, Serbia.
- Goran M Lazić
  - University of Belgrade, Faculty of Mathematics, P.O.B. 550, Studentski trg 16/IV, Belgrade, Serbia.
- Nenad S Mitić
  - University of Belgrade, Faculty of Mathematics, P.O.B. 550, Studentski trg 16/IV, Belgrade, Serbia.
- Mirjana D Pavlović
  - University of Belgrade, Institute of General and Physical Chemistry, Studentski trg 12/V, Belgrade, Serbia.
|
46
|
Peng Z, Kurgan L. High-throughput prediction of RNA, DNA and protein binding regions mediated by intrinsic disorder. Nucleic Acids Res 2015; 43:e121. [PMID: 26109352 PMCID: PMC4605291 DOI: 10.1093/nar/gkv585] [Citation(s) in RCA: 117] [Impact Index Per Article: 11.7] [Received: 01/23/2015] [Revised: 04/24/2015] [Accepted: 05/24/2015] [Indexed: 01/05/2023] Open
Abstract
Intrinsically disordered proteins and regions (IDPs and IDRs) lack stable 3D structure under physiological conditions in vitro, are common in eukaryotes, and facilitate interactions with RNA, DNA and proteins. Current methods for prediction of IDPs and IDRs do not provide insights into their functions, except for a handful of methods that address predictions of protein-binding regions. We report a first-of-its-kind computational method, DisoRDPbind, for high-throughput prediction of RNA-, DNA- and protein-binding residues located in IDRs from protein sequences. DisoRDPbind is implemented using a runtime-efficient multi-layered design that utilizes information extracted from physicochemical properties of amino acids, sequence complexity, putative secondary structure and disorder, and sequence alignment. Empirical tests demonstrate that it provides accurate predictions that are competitive with other predictors of disorder-mediated protein-binding regions and complementary to the methods that predict RNA- and DNA-binding residues annotated based on crystal structures. Application to the Homo sapiens, Mus musculus, Caenorhabditis elegans and Drosophila melanogaster proteomes reveals that RNA- and DNA-binding proteins predicted by DisoRDPbind complement and overlap with the corresponding known binding proteins collected from several sources. Also, the number of putative protein-binding regions predicted with DisoRDPbind correlates with the promiscuity of proteins in the corresponding protein-protein interaction networks. Webserver: http://biomine.ece.ualberta.ca/DisoRDPbind/.
Affiliation(s)
- Zhenling Peng
  - Center for Applied Mathematics, Tianjin University, Tianjin, 300072, P.R. China; Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Alberta, T6G 2V4, Canada
- Lukasz Kurgan
  - Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Alberta, T6G 2V4, Canada
|
47
|
Li J, Feng Y, Wang X, Li J, Liu W, Rong L, Bao J. An Overview of Predictors for Intrinsically Disordered Proteins over 2010-2014. Int J Mol Sci 2015; 16:23446-62. [PMID: 26426014 PMCID: PMC4632708 DOI: 10.3390/ijms161023446] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.5] [Received: 05/31/2015] [Revised: 08/25/2015] [Accepted: 08/31/2015] [Indexed: 02/05/2023] Open
Abstract
The sequence-structure-function paradigm of proteins has been changed by the occurrence of intrinsically disordered proteins (IDPs). Benefiting from their structural disorder, IDPs are of particular importance in biological processes like regulation and signaling. IDPs are associated with human diseases, including cancer, cardiovascular disease, neurodegenerative diseases, amyloidoses, and several other maladies. IDPs attract a high level of interest, and a substantial effort has been made to develop experimental and computational methods. More than 70 prediction tools have been developed since 1997, of which 17 were created in the last five years. Here, we present an overview of IDP predictors developed during 2010-2014. We analyze the algorithms these tools use for IDP prediction and discuss the basic concepts of the various prediction methods. The comparison of prediction performance among these tools is discussed as well.
Affiliation(s)
- Jianzong Li
  - College of Life Sciences & Key Laboratory of Ministry of Education for Bio-Resources and Bio-Environment, Sichuan University, Chengdu 610064, China.
- Yu Feng
  - College of Life Sciences & Key Laboratory of Ministry of Education for Bio-Resources and Bio-Environment, Sichuan University, Chengdu 610064, China.
- Xiaoyun Wang
  - College of Life Sciences & Key Laboratory of Ministry of Education for Bio-Resources and Bio-Environment, Sichuan University, Chengdu 610064, China.
- Jing Li
  - College of Life Sciences & Key Laboratory of Ministry of Education for Bio-Resources and Bio-Environment, Sichuan University, Chengdu 610064, China.
  - State Key Laboratory of Biotherapy/Collaborative Innovation Center for Biotherapy, West China Hospital, Sichuan University, Chengdu 610041, China.
- Wen Liu
  - College of Life Sciences & Key Laboratory of Ministry of Education for Bio-Resources and Bio-Environment, Sichuan University, Chengdu 610064, China.
- Li Rong
  - College of Life Sciences & Key Laboratory of Ministry of Education for Bio-Resources and Bio-Environment, Sichuan University, Chengdu 610064, China.
- Jinku Bao
  - College of Life Sciences & Key Laboratory of Ministry of Education for Bio-Resources and Bio-Environment, Sichuan University, Chengdu 610064, China.
  - State Key Laboratory of Biotherapy/Collaborative Innovation Center for Biotherapy, West China Hospital, Sichuan University, Chengdu 610041, China.
  - State Key Laboratory of Oral Diseases, West China College of Stomatology, Sichuan University, Chengdu 610041, China.
|
48
|
Volpato V, Alshomrani B, Pollastri G. Accurate Ab Initio and Template-Based Prediction of Short Intrinsically-Disordered Regions by Bidirectional Recurrent Neural Networks Trained on Large-Scale Datasets. Int J Mol Sci 2015; 16:19868-85. [PMID: 26307973 PMCID: PMC4581330 DOI: 10.3390/ijms160819868] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 06/01/2015] [Revised: 07/28/2015] [Accepted: 07/29/2015] [Indexed: 12/02/2022] Open
Abstract
Intrinsically-disordered regions lack a well-defined 3D structure, but play key roles in determining the function of many proteins. Although predictors of disorder have been shown to achieve relatively high rates of correct classification of these segments, improvements over the years have been slow, and accurate methods are needed that are capable of accommodating the ever-increasing amount of structurally-determined protein sequences to try to boost predictive performance. In this paper, we propose a predictor for short disordered regions based on bidirectional recurrent neural networks and tested by rigorous five-fold cross-validation on a large, non-redundant dataset collected from MobiDB, a new comprehensive source of protein disorder annotations. The system exploits sequence and structural information in the form of frequency profiles, predicted secondary structure and solvent accessibility, and direct disorder annotations from homologous protein structures (templates) deposited in the Protein Data Bank. The contributions of sequence, structure and homology information result in large improvements in predictive accuracy. Additionally, the large scale of the training set leads to low false positive rates, making our systems a robust and efficient way to address high-throughput disorder prediction.
Affiliation(s)
- Viola Volpato
  - School of Computer Science, University College Dublin, Belfield, Dublin 4, Ireland.
  - Adaptive and Complex Systems Laboratory, University College Dublin, Belfield, Dublin 4, Ireland.
- Badr Alshomrani
  - School of Computer Science, University College Dublin, Belfield, Dublin 4, Ireland.
  - Adaptive and Complex Systems Laboratory, University College Dublin, Belfield, Dublin 4, Ireland.
- Gianluca Pollastri
  - School of Computer Science, University College Dublin, Belfield, Dublin 4, Ireland.
  - Adaptive and Complex Systems Laboratory, University College Dublin, Belfield, Dublin 4, Ireland.
|
49
|
Tusnády GE, Dobson L, Tompa P. Disordered regions in transmembrane proteins. Biochimica et Biophysica Acta - Biomembranes 2015; 1848:2839-48. [PMID: 26275590 DOI: 10.1016/j.bbamem.2015.08.002] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Received: 05/21/2015] [Revised: 07/28/2015] [Accepted: 08/09/2015] [Indexed: 11/18/2022]
Abstract
The functions of transmembrane proteins in living cells are widespread; they range from various transport processes to energy production, from cell-cell adhesion to communication. Structurally, they are highly ordered in their membrane-spanning regions, but may contain disordered regions in the cytosolic and extra-cytosolic parts. In this study, we have investigated the disordered regions in transmembrane proteins by applying a stringent definition of disordered residues to the largest currently available experimental dataset, and show a significant correlation between the spatial distributions of positively charged residues and disordered regions. This finding suggests a new role of disordered regions in transmembrane proteins: providing structural flexibility for stabilizing interactions with negatively charged head groups of the lipid molecules. We also find a preference for structural disorder in the terminal (as opposed to loop) regions of transmembrane proteins, and survey the respective functions involved in recruiting other proteins or mediating allosteric signaling effects. Finally, we critically compare disorder prediction methods on our transmembrane protein set. While there are no major differences between these methods in the usual statistics, such as per-residue accuracies and Matthews correlation coefficients, substantial differences can be found regarding the spatial distribution of the predicted disordered regions. We conclude that a predictor optimized for transmembrane proteins would be of high value to the field of structural disorder.
Affiliation(s)
- Gábor E Tusnády
  - Institute of Enzymology, RCNS, HAS, Magyar Tudósok körútja 2, 1117 Budapest, Hungary.
- László Dobson
  - Institute of Enzymology, RCNS, HAS, Magyar Tudósok körútja 2, 1117 Budapest, Hungary
- Peter Tompa
  - Institute of Enzymology, RCNS, HAS, Magyar Tudósok körútja 2, 1117 Budapest, Hungary; VIB Structural Biology Research Center, VUB, Building E, Pleinlaan 2, 1050 Brussels, Belgium
|
50
|
Brain expressed and X-linked (Bex) proteins are intrinsically disordered proteins (IDPs) and form new signaling hubs. PLoS One 2015; 10:e0117206. [PMID: 25612294 PMCID: PMC4303428 DOI: 10.1371/journal.pone.0117206] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Received: 10/10/2014] [Accepted: 12/20/2014] [Indexed: 11/19/2022] Open
Abstract
Intrinsically disordered proteins (IDPs) are abundant in complex organisms. Due to their promiscuous nature and their ability to adopt several conformations, IDPs constitute important points of network regulation. The family of Brain Expressed and X-linked (Bex) proteins consists of 5 members in humans (Bex1-5). Recent reports have implicated Bex proteins in transcriptional regulation and signaling pathways involved in neurodegeneration, cancer, cell cycle and tumor growth. However, structural and biophysical data for this protein family are almost non-existent. We used bioinformatics analyses to show that Bex proteins contain long regions of intrinsic disorder which are conserved across all members. Moreover, we confirmed the intrinsic disorder by circular dichroism spectroscopy of Bex1 after expression and purification in E. coli. These observations strongly suggest that Bex proteins constitute a new group of IDPs. Based on these findings, together with the demonstrated promiscuity of Bex proteins and their involvement in different signaling pathways, we propose that Bex family members play important roles in the formation of protein network hubs.
|