151
Quiroga IY, Ahn JH, Wang GG, Phanstiel D. Oncogenic fusion proteins and their role in three-dimensional chromatin structure, phase separation, and cancer. Curr Opin Genet Dev 2022; 74:101901. [PMID: 35427897 PMCID: PMC9156545 DOI: 10.1016/j.gde.2022.101901] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 01/03/2022] [Revised: 02/17/2022] [Accepted: 03/05/2022] [Indexed: 11/27/2022]
Abstract
Three-dimensional (3D) chromatin structure plays a critical role in development, gene regulation, and cellular identity. Alterations to this structure can have profound effects on cellular phenotypes and have been associated with a variety of diseases including multiple types of cancer. One of several forces that help shape 3D chromatin structure is liquid-liquid phase separation, a form of self-association between biomolecules that can sequester regions of chromatin into subnuclear droplets or even membraneless organelles like nucleoli. This review focuses on a class of oncogenic fusion proteins that appear to exert their oncogenic function via phase-separation-driven alterations to 3D chromatin structure. Here, we review what is known about the mechanisms by which these oncogenic fusion proteins phase separate in the nucleus and their role in shaping the 3D chromatin structure. We discuss the potential for this phenomenon to be a more widespread mechanism of oncogenesis.
Affiliation(s)
- Ivana Y Quiroga
  - Thurston Arthritis Research Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Jeong Hyun Ahn
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill School of Medicine, Chapel Hill, NC, USA
  - Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill School of Medicine, Chapel Hill, NC, USA
- Gang Greg Wang
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill School of Medicine, Chapel Hill, NC, USA
  - Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill School of Medicine, Chapel Hill, NC, USA
- Douglas Phanstiel
  - Thurston Arthritis Research Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill School of Medicine, Chapel Hill, NC, USA
  - Department of Cell Biology and Physiology, University of North Carolina at Chapel Hill School of Medicine, Chapel Hill, NC, USA
152
Ibuprofen Favors Binding of Amyloid-β Peptide to Its Depot, Serum Albumin. Int J Mol Sci 2022; 23:6168. [PMID: 35682848 PMCID: PMC9181795 DOI: 10.3390/ijms23116168] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/15/2022] [Revised: 05/24/2022] [Accepted: 05/28/2022] [Indexed: 12/15/2022] [Open Access]
Abstract
The deposition of amyloid-β peptide (Aβ) in the brain is a critical event in the progression of Alzheimer’s disease (AD). This Aβ deposition could be prevented by directed enhancement of Aβ binding to its natural depot, human serum albumin (HSA). Previously, we revealed that specific endogenous ligands of HSA improve its affinity to monomeric Aβ. We show here that an exogenous HSA ligand, ibuprofen (IBU), exerts the analogous effect. Plasmon resonance spectroscopy data provide evidence that a therapeutic IBU level increases HSA affinity to monomeric Aβ40/Aβ42 by a factor of 3–5. Using a thioflavin T fluorescence assay and transmission electron microscopy, we show that IBU favors the suppression of Aβ40 fibrillation by HSA. Molecular docking data indicate partial overlap between the IBU/Aβ40-binding sites of HSA. The revealed enhancement of the HSA–Aβ interaction by IBU and the strengthened inhibition of Aβ fibrillation by HSA in the presence of IBU could contribute to the neuroprotective effects of the latter, previously observed in mouse and human studies of AD.
153
Rezaei S, Pereira F, Uversky VN, Sefidbakht Y. Molecular dynamics and intrinsic disorder analysis of the SARS-CoV-2 Nsp1 structural changes caused by substitution and deletion mutations. Molecular Simulation 2022. [DOI: 10.1080/08927022.2022.2075546] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/18/2022]
Affiliation(s)
- Shokouh Rezaei
  - Protein Research Center, Shahid Beheshti University, G.C., Tehran, Iran
- Filipe Pereira
  - Centre for Functional Ecology, Department of Life Sciences, University of Coimbra, Coimbra, Portugal
  - IDENTIFICA genetic testing, Maia, Portugal
- Vladimir N. Uversky
  - Department of Molecular Medicine, University of South Florida, Tampa, FL, USA
- Yahya Sefidbakht
  - Protein Research Center, Shahid Beheshti University, G.C., Tehran, Iran
154
Biró B, Zhao B, Kurgan L. Complementarity of the residue-level protein function and structure predictions in human proteins. Comput Struct Biotechnol J 2022; 20:2223-2234. [PMID: 35615015 PMCID: PMC9118482 DOI: 10.1016/j.csbj.2022.05.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2022] [Revised: 05/02/2022] [Accepted: 05/02/2022] [Indexed: 11/24/2022] [Open Access]
Abstract
Sequence-based predictors of the residue-level protein function and structure cover a broad spectrum of characteristics including intrinsic disorder, secondary structure, solvent accessibility and binding to nucleic acids. They were catalogued and evaluated in numerous surveys and assessments. However, methods focusing on a given characteristic are studied separately from predictors of other characteristics, while they are typically used on the same proteins. We fill this void by studying complementarity of a representative collection of methods that target different predictions using a large, taxonomically consistent, and low similarity dataset of human proteins. First, we bridge the gap between the communities that develop structure-trained vs. disorder-trained predictors of binding residues. Motivated by a recent study of the protein-binding residue predictions, we empirically find that combining the structure-trained and disorder-trained predictors of the DNA-binding and RNA-binding residues leads to substantial improvements in predictive quality. Second, we investigate whether diverse predictors generate results that accurately reproduce relations between secondary structure, solvent accessibility, interaction sites, and intrinsic disorder that are present in the experimental data. Our empirical analysis concludes that predictions accurately reflect all combinations of these relations. Altogether, this study provides unique insights that support combining results produced by diverse residue-level predictors of protein function and structure.
Affiliation(s)
- Bálint Biró
  - Institute of Genetics and Biotechnology, Hungarian University of Agriculture and Life Sciences, Gödöllő, Hungary
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States
- Bi Zhao
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States
155
Micsonai A, Moussong É, Murvai N, Tantos Á, Tőke O, Réfrégiers M, Wien F, Kardos J. Disordered-Ordered Protein Binary Classification by Circular Dichroism Spectroscopy. Front Mol Biosci 2022; 9:863141. [PMID: 35591946 PMCID: PMC9110821 DOI: 10.3389/fmolb.2022.863141] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 01/26/2022] [Accepted: 03/24/2022] [Indexed: 12/31/2022] [Open Access]
Abstract
Intrinsically disordered proteins lack a stable tertiary structure and form dynamic conformational ensembles due to their characteristic physicochemical properties and amino acid composition. They are abundant in nature and responsible for a large variety of cellular functions. While numerous bioinformatics tools have been developed for in silico disorder prediction in the last decades, there is a need for experimental methods to verify the disordered state. CD spectroscopy is widely used for protein secondary structure analysis. It is usable in a wide concentration range under various buffer conditions. Even without providing high-resolution information, it is especially useful when NMR, X-ray, or other techniques are problematic or one simply needs a fast technique to verify the structure of proteins. Here, we propose an automated binary disorder-order classification method based on the analysis of far-UV CD spectroscopy data. The method needs CD data at only three wavelength points, making high-throughput data collection possible. The mathematical analysis applies the k-nearest neighbor algorithm with a cosine distance function, which is independent of the spectral amplitude and thus free of concentration determination errors. Moreover, the method can be used even for strongly absorbing samples, as in the case of crowded environmental conditions, if the spectrum can be recorded down to the wavelength of 212 nm. We believe the classification method will be useful in identifying disorder and will also facilitate the growth of experimental data in IDP databases. The method is implemented on a webserver and freely available for academic users.
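The classification idea described in this abstract, k-nearest neighbors with a cosine distance over CD signals at three wavelengths, can be illustrated with a minimal sketch. This is not the authors' implementation: the reference values, the labels, the choice of k = 3, and the function names `cosine_distance` and `knn_classify` are all hypothetical, and the three wavelength points are left unspecified because the abstract does not name them. The sketch only demonstrates why a cosine distance makes the result independent of spectral amplitude.

```python
import math
from collections import Counter


def cosine_distance(a, b):
    """Return 1 - cos(angle) between two signal vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def knn_classify(query, references, k=3):
    """Majority vote among the k reference spectra closest to the query."""
    ranked = sorted(references, key=lambda item: cosine_distance(query, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical reference set: CD signal at three (unspecified) wavelengths.
# The numbers are invented for illustration and are not measured data.
REFERENCES = [
    ((-2.0, -1.0, -18.0), "disordered"),
    ((-3.0, -2.0, -20.0), "disordered"),
    ((-1.5, -0.5, -15.0), "disordered"),
    ((-20.0, -22.0, 40.0), "ordered"),
    ((-18.0, -19.0, 35.0), "ordered"),
    ((-15.0, -10.0, 20.0), "ordered"),
]

query = (-2.5, -1.5, -19.0)
label = knn_classify(query, REFERENCES)

# Halving every value mimics a concentration error; the direction of the
# vector is unchanged, so the cosine distance and the vote are unchanged.
scaled_label = knn_classify(tuple(0.5 * v for v in query), REFERENCES)
```

Because only the direction of the three-point vector matters, a sample whose concentration is misjudged by a constant factor receives the same label in this sketch.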
Affiliation(s)
- András Micsonai
  - ELTE NAP Neuroimmunology Research Group, Department of Biochemistry, Institute of Biology, ELTE Eötvös Loránd University, Budapest, Hungary
- Éva Moussong
  - ELTE NAP Neuroimmunology Research Group, Department of Biochemistry, Institute of Biology, ELTE Eötvös Loránd University, Budapest, Hungary
- Nikoletta Murvai
  - Department of Biochemistry, Institute of Biology, ELTE Eötvös Loránd University, Budapest, Hungary
  - Institute of Enzymology, Research Centre for Natural Sciences, Budapest, Hungary
- Ágnes Tantos
  - Institute of Enzymology, Research Centre for Natural Sciences, Budapest, Hungary
- Orsolya Tőke
  - Laboratory for NMR Spectroscopy, Research Centre for Natural Sciences, Budapest, Hungary
- Matthieu Réfrégiers
  - Synchrotron SOLEIL, Gif-sur-Yvette, France
  - Centre de Biophysique Moléculaire, CNRS UPR4301, Orléans, France
- Frank Wien
  - Synchrotron SOLEIL, Gif-sur-Yvette, France
- József Kardos
  - ELTE NAP Neuroimmunology Research Group, Department of Biochemistry, Institute of Biology, ELTE Eötvös Loránd University, Budapest, Hungary
156
What Is Parvalbumin for? Biomolecules 2022; 12:656. [PMID: 35625584 PMCID: PMC9138604 DOI: 10.3390/biom12050656] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 04/05/2022] [Revised: 04/25/2022] [Accepted: 04/28/2022] [Indexed: 12/28/2022] [Open Access]
Abstract
Parvalbumin (PA) is a small, acidic, mostly cytosolic Ca2+-binding protein of the EF-hand superfamily. Structural and physical properties of PA are well studied but recently two highly conserved structural motifs consisting of three amino acids each (clusters I and II), which contribute to the hydrophobic core of the EF-hand domains, have been revealed. Despite several decades of studies, physiological functions of PA are still poorly known. Since no target proteins have been revealed for PA so far, it is believed that PA acts as a slow calcium buffer. Numerous experiments on various muscle systems have shown that PA accelerates the relaxation of fast skeletal muscles. It has been found that oxidation of PA by reactive oxygen species (ROS) is conformation-dependent and one more physiological function of PA in fast muscles could be a protection of these cells from ROS. PA is thought to regulate calcium-dependent metabolic and electric processes within the population of gamma-aminobutyric acid (GABA) neurons. Genetic elimination of PA results in changes in GABAergic synaptic transmission. Mammalian oncomodulin (OM), the β isoform of PA, is expressed mostly in cochlear outer hair cells and in vestibular hair cells. OM knockout mice lose their hearing after 3–4 months. It was suggested that, in sensory cells, OM maintains auditory function, most likely affecting outer hair cells’ motility mechanisms.
157
Wilson CJ, Choy WY, Karttunen M. AlphaFold2: A Role for Disordered Protein/Region Prediction? Int J Mol Sci 2022; 23:4591. [PMID: 35562983 PMCID: PMC9104326 DOI: 10.3390/ijms23094591] [Citation(s) in RCA: 72] [Impact Index Per Article: 24.0] [Received: 03/27/2022] [Revised: 04/18/2022] [Accepted: 04/19/2022] [Indexed: 01/27/2023] [Open Access]
Abstract
The development of AlphaFold2 marked a paradigm shift in the structural biology community. Herein, we assess the ability of AlphaFold2 to predict disordered regions against traditional sequence-based disorder predictors. We find that AlphaFold2 performs well at discriminating disordered regions, but also note that the disorder predictor one constructs from an AlphaFold2 structure determines accuracy. In particular, a naïve but non-trivial assumption that residues assigned to helices, strands, and H-bond stabilized turns are likely ordered and all other residues are disordered results in a dramatic overestimation of disorder; conversely, the predicted local distance difference test (pLDDT) provides an excellent measure of residue-wise disorder. Furthermore, by employing molecular dynamics (MD) simulations, we note an interesting relationship between the pLDDT and secondary structure that may explain our observations and suggests a broader application of the pLDDT for characterizing the local dynamics of intrinsically disordered proteins and regions (IDPs/IDRs).
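The two ways of deriving a disorder predictor from an AlphaFold2 model that this abstract contrasts can be sketched as follows. This is an illustration under stated assumptions, not the authors' protocol: the pLDDT cutoff of 50, the set of secondary-structure codes treated as ordered, the function names, and the ten-residue toy protein are all hypothetical.

```python
def disorder_from_plddt(plddt, threshold=50.0):
    """Flag residues as disordered when pLDDT is below a cutoff.

    The default cutoff is an illustrative assumption, not a value
    taken from the paper.
    """
    return [score < threshold for score in plddt]


def disorder_from_secondary_structure(ss, ordered_codes="HGIEBT"):
    """Naive rule: helix, strand, and turn codes are ordered; the rest disordered.

    The one-letter codes follow DSSP-style notation; which codes count as
    ordered is an assumption made for this sketch.
    """
    return [code not in ordered_codes for code in ss]


def disorder_fraction(flags):
    """Fraction of residues flagged as disordered."""
    return sum(flags) / len(flags)


# Toy ten-residue example with invented values.
plddt = [95.1, 92.3, 88.0, 81.5, 76.2, 45.0, 38.7, 33.2, 72.4, 90.6]
ss = "HHHH--SS-E"

by_plddt = disorder_from_plddt(plddt)
by_ss = disorder_from_secondary_structure(ss)

fraction_plddt = disorder_fraction(by_plddt)
fraction_ss = disorder_fraction(by_ss)
```

In this toy case the secondary-structure rule flags more residues than the pLDDT rule, because confidently placed loop residues are counted as disordered. That mirrors the direction of the overestimation the abstract reports, though the numbers here carry no evidential weight.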
Affiliation(s)
- Carter J. Wilson
  - Department of Mathematics, The University of Western Ontario, 1151 Richmond Street, London, ON N6A 5B7, Canada
  - Centre for Advanced Materials and Biomaterials Research, The University of Western Ontario, 1151 Richmond Street, London, ON N6A 5B7, Canada
- Wing-Yiu Choy
  - Department of Biochemistry, The University of Western Ontario, 1151 Richmond Street, London, ON N6A 5C1, Canada
- Mikko Karttunen
  - Centre for Advanced Materials and Biomaterials Research, The University of Western Ontario, 1151 Richmond Street, London, ON N6A 5B7, Canada
  - Department of Physics and Astronomy, The University of Western Ontario, 1151 Richmond Street, London, ON N6A 5B7, Canada
  - Department of Chemistry, The University of Western Ontario, 1151 Richmond Street, London, ON N6A 3K7, Canada
158
Avramov M, Schád É, Révész Á, Turiák L, Uzelac I, Tantos Á, Drahos L, Popović ŽD. Identification of Intrinsically Disordered Proteins and Regions in a Non-Model Insect Species Ostrinia nubilalis (Hbn.). Biomolecules 2022; 12:592. [PMID: 35454181 PMCID: PMC9029825 DOI: 10.3390/biom12040592] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/07/2022] [Revised: 04/06/2022] [Accepted: 04/11/2022] [Indexed: 12/29/2022] [Open Access]
Abstract
Research in previous decades has shown that intrinsically disordered proteins (IDPs) and regions in proteins (IDRs) are as ubiquitous as highly ordered proteins. Despite this, research on IDPs and IDRs still has many gaps left to fill. Here, we present an approach that combines wet lab methods with bioinformatics tools to identify and analyze intrinsically disordered proteins in a non-model insect species that is cold-hardy. Due to their known resilience to the effects of extreme temperatures, these proteins likely play important roles in this insect's adaptive mechanisms to sub-zero temperatures. The approach involves IDP enrichment by sample heating and double-digestion of proteins, followed by peptide and protein identification. Next, proteins are bioinformatically analyzed for disorder content, presence of long disordered regions, amino acid composition, and processes they are involved in. Finally, IDP detection is validated with an in-house 2D PAGE. In total, 608 unique proteins were identified, with 39 being mostly disordered, 100 partially disordered, 95 nearly ordered, and 374 ordered. One-third contain at least one long disordered segment. Functional information was available for only 90 proteins with intrinsic disorders out of 312 characterized proteins. Around half of the 90 proteins are cytoskeletal elements or involved in translational processes.
Affiliation(s)
- Miloš Avramov
  - Department of Biology and Ecology, Faculty of Sciences, University of Novi Sad, 21000 Novi Sad, Serbia
- Éva Schád
  - Institute of Enzymology, Research Centre for Natural Sciences, 1117 Budapest, Hungary
- Ágnes Révész
  - Institute of Organic Chemistry, Research Centre for Natural Sciences, 1117 Budapest, Hungary
- Lilla Turiák
  - Institute of Organic Chemistry, Research Centre for Natural Sciences, 1117 Budapest, Hungary
- Iva Uzelac
  - Department of Biology and Ecology, Faculty of Sciences, University of Novi Sad, 21000 Novi Sad, Serbia
- Ágnes Tantos
  - Institute of Enzymology, Research Centre for Natural Sciences, 1117 Budapest, Hungary
- László Drahos
  - Institute of Organic Chemistry, Research Centre for Natural Sciences, 1117 Budapest, Hungary
- Željko D. Popović
  - Department of Biology and Ecology, Faculty of Sciences, University of Novi Sad, 21000 Novi Sad, Serbia
159
Aderinwale T, Bharadwaj V, Christoffer C, Terashi G, Zhang Z, Jahandideh R, Kagaya Y, Kihara D. Real-time structure search and structure classification for AlphaFold protein models. Commun Biol 2022; 5:316. [PMID: 35383281 PMCID: PMC8983703 DOI: 10.1038/s42003-022-03261-8] [Citation(s) in RCA: 39] [Impact Index Per Article: 13.0] [Received: 11/09/2021] [Accepted: 03/11/2022] [Indexed: 11/17/2022] [Open Access]
Abstract
Last year saw a breakthrough in protein structure prediction, where the AlphaFold2 method showed a substantial improvement in modeling accuracy. Following the software release of AlphaFold2, structures predicted by AlphaFold2 for proteins in 21 species were made publicly available via the AlphaFold Database. Here, to facilitate structural analysis and application of AlphaFold2 models, we provide the infrastructure, 3D-AF-Surfer, which allows real-time structure-based search for the AlphaFold2 models. In 3D-AF-Surfer, structures are represented with 3D Zernike descriptors (3DZD), a rotationally invariant mathematical representation of 3D shapes. We developed a neural network that takes 3DZDs of proteins as input and retrieves proteins of the same fold more accurately than direct comparison of 3DZDs. Using 3D-AF-Surfer, we report structure classifications of AlphaFold2 models and discuss the correlation between confidence levels of AlphaFold2 models and intrinsically disordered regions.
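The baseline that this abstract calls "direct comparison of 3DZDs" can be pictured as a nearest-neighbor search over fixed-length descriptor vectors. The sketch below assumes a Euclidean distance between descriptors, uses invented four-component vectors in place of real 3DZDs, and introduces hypothetical names (`euclidean`, `search`, `DATABASE`, `model_A` and so on). It does not reproduce 3D-AF-Surfer or its neural network.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two descriptor vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def search(query, database, top_n=2):
    """Return the names of the top_n database entries closest to the query."""
    ranked = sorted(database.items(), key=lambda kv: euclidean(query, kv[1]))
    return [name for name, _ in ranked[:top_n]]


# Hypothetical descriptors. Real 3DZDs are much longer vectors; these short
# ones are invented purely to make the ranking easy to follow.
DATABASE = {
    "model_A": (0.9, 0.1, 0.3, 0.0),
    "model_B": (0.8, 0.2, 0.3, 0.1),
    "model_C": (0.1, 0.9, 0.7, 0.5),
}

query = (0.88, 0.12, 0.3, 0.02)
hits = search(query, DATABASE)
```

Because the descriptor is rotationally invariant, no structural superposition is needed before comparing two models, which is what makes this kind of vector lookup fast enough for real-time search.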
Affiliation(s)
- Tunde Aderinwale
  - Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
- Vijay Bharadwaj
  - Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
- Charles Christoffer
  - Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
- Genki Terashi
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
- Zicong Zhang
  - Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
- Yuki Kagaya
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
- Daisuke Kihara
  - Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
160
Chakrabarti P, Chakravarty D. Intrinsically disordered proteins/regions and insight into their biomolecular interactions. Biophys Chem 2022; 283:106769. [DOI: 10.1016/j.bpc.2022.106769] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 07/29/2021] [Revised: 01/26/2022] [Accepted: 01/26/2022] [Indexed: 12/20/2022]
161
Orlando G, Raimondi D, Codice F, Tabaro F, Vranken W. Prediction of disordered regions in proteins with recurrent Neural Networks and protein dynamics. J Mol Biol 2022; 434:167579. [DOI: 10.1016/j.jmb.2022.167579] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/09/2021] [Revised: 03/21/2022] [Accepted: 03/31/2022] [Indexed: 10/18/2022]
162
Hassan SS, Kodakandla V, Redwan EM, Lundstrom K, Pal Choudhury P, Abd El-Aziz TM, Takayama K, Kandimalla R, Lal A, Serrano-Aroca Á, Azad GK, Aljabali AA, Palù G, Chauhan G, Adadi P, Tambuwala M, Brufsky AM, Baetas-da-Cruz W, Barh D, Azevedo V, Bazan NG, Andrade BS, Santana Silva RJ, Uversky VN. An issue of concern: unique truncated ORF8 protein variants of SARS-CoV-2. PeerJ 2022; 10:e13136. [PMID: 35341060 PMCID: PMC8944340 DOI: 10.7717/peerj.13136] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/16/2021] [Accepted: 02/27/2022] [Indexed: 01/12/2023] [Open Access]
Abstract
Open reading frame 8 (ORF8) shows one of the highest levels of variability among accessory proteins in Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), the causative agent of Coronavirus Disease 2019 (COVID-19). It was previously reported that the ORF8 protein inhibits the presentation of viral antigens by the major histocompatibility complex class I (MHC-I), which interacts with host factors involved in pulmonary inflammation. The ORF8 protein assists SARS-CoV-2 in evading immunity and plays a role in SARS-CoV-2 replication. Among many contributing mutations, Q27STOP, a mutation in the ORF8 protein, defines the B.1.1.7 lineage of SARS-CoV-2, engendering the second wave of COVID-19. In the present study, 47 unique truncated ORF8 proteins (T-ORF8) with the Q27STOP mutations were identified among 49,055 available B.1.1.7 SARS-CoV-2 sequences. The results show that only one of the 47 T-ORF8 variants spread to over 57 geo-locations in North America, and other continents, which include Africa, Asia, Europe and South America. Based on various quantitative features, such as amino acid homology, polar/non-polar sequence homology, Shannon entropy conservation, and other physicochemical properties of all specific 47 T-ORF8 protein variants, nine possible T-ORF8 unique variants were defined. The question as to whether T-ORF8 variants function similarly to the wild type ORF8 is yet to be investigated. A positive response to the question could exacerbate future COVID-19 waves, necessitating severe containment measures.
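One of the quantitative features named in this abstract is Shannon entropy. A minimal sketch of a per-sequence Shannon entropy over amino acid composition is given below. Whether the authors compute entropy per sequence or per alignment position is not stated in the abstract, so the per-sequence form is an assumption, and the function name and test strings are hypothetical rather than real ORF8 sequences.

```python
import math
from collections import Counter


def shannon_entropy(sequence):
    """Shannon entropy (in bits) of the residue composition of a sequence."""
    counts = Counter(sequence)
    total = len(sequence)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Invented strings used only to show the behavior of the measure.
uniform = shannon_entropy("ACDE")      # four residues, equally frequent
two_letter = shannon_entropy("AACC")   # two residues, equally frequent
single = shannon_entropy("AAAA")       # one residue only
```

The measure is largest when all residue types are equally frequent and zero when a sequence consists of a single residue type, so it gives a simple scalar for comparing the compositional variability of truncated variants.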
Affiliation(s)
- Sk. Sarif Hassan
  - Department of Mathematics, Pingla Thana Mahavidyalaya, Maligram, India
- Vaishnavi Kodakandla
  - Department of Life Sciences, Sophia College For Women, University of Mumbai, Mumbai, India
- Elrashdy M. Redwan
  - Faculty of Science, Department of Biological Science, King Abdulaziz University, Jeddah, Saudi Arabia
- Tarek Mohamed Abd El-Aziz
  - Department of Cellular and Integrative Physiology, University of Texas Health Science Center at San Antonio, San Antonio, TX, United States
- Kazuo Takayama
  - Center for iPS Cell Research and Application (CiRA), Kyoto University, Kyoto, Japan
- Ramesh Kandimalla
  - Applied Biology, CSIR-Indian Institute of Chemical Technology, Hyderabad, India
- Amos Lal
  - Division of Pulmonary and Critical Care Medicine, Mayo Clinic Rochester, Rochester, MN, United States
- Ángel Serrano-Aroca
  - Biomaterials and Bioengineering Lab, Centro de Investigacion Traslacional San Alberto Magno, Universidad Catolica de Valencia San Vicente Martir, Valencia, Spain
- Alaa A.A. Aljabali
  - Department of Pharmaceutics and Pharmaceutical, Yarmouk University, Irbid, Jordan
- Giorgio Palù
  - Department of Molecular Medicine, University of Padova, Padova, Italy
- Gaurav Chauhan
  - School of Engineering and Sciences, Tecnologico de Monterrey, Monterrey, Mexico
- Parise Adadi
  - Department of Food Science, University of Otago, Dunedin, New Zealand
- Murtaza Tambuwala
  - School of Pharmacy and Pharmaceutical Science, Ulster University, Coleraine, UK
- Adam M. Brufsky
  - Department of Medicine, Division of Hematology/Oncology, University of Pittsburgh School of Medicine, Pittsburgh, PA, United States
- Wagner Baetas-da-Cruz
  - Translational Laboratory in Molecular Physiology, Centre for Experimental Surgery, College of Medicine, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
- Debmalya Barh
  - Centre for Genomics and Applied Gene Technology, Institute of Integrative Omics and Applied Biotechnology (IIOAB), Nonakuri, India
- Vasco Azevedo
  - Departamento de Genetica, Ecologia e Evolucao, Instituto de Ciencias Biologicas, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
- Nikolas G. Bazan
  - Neuroscience Center of Excellence, School of Medicine, LSU Health New Orleans, New Orleans, LA, United States
- Bruno Silva Andrade
  - Laboratório de Bioinformática e Química Computacional, Departamento de Ciências Biológicas, Universidade Estadual do Sudoeste da Bahia, Jequié, Brazil
- Raner José Santana Silva
  - Departamento de Ciencias Biologicas (DCB), Programa de Pos-Graduacao em Genetica e Biologia Molecular (PPGGBM), Universidade Estadual de Santa Cruz (UESC), Ilheus, Brazil
- Vladimir N. Uversky
  - Department of Molecular Medicine, University of South Florida, Tampa, FL, United States
163
Pei H, Guo W, Peng Y, Xiong H, Chen Y. Targeting key proteins involved in transcriptional regulation for cancer therapy: Current strategies and future prospective. Med Res Rev 2022; 42:1607-1660. [PMID: 35312190 DOI: 10.1002/med.21886] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 01/12/2022] [Revised: 02/10/2022] [Accepted: 02/22/2022] [Indexed: 12/14/2022]
Abstract
The key proteins involved in transcriptional regulation play convergent roles in cellular homeostasis, and their dysfunction mediates aberrant gene expression that underlies the hallmarks of tumorigenesis. As tumor progression depends on such abnormal regulation of transcription, it is important to discover novel chemical entities as antitumor drugs that target key tumor-associated proteins involved in transcriptional regulation. Although most key proteins (especially transcription factors) involved in transcriptional regulation have historically been regarded as undruggable targets, multiple targeting approaches at diverse levels of transcriptional regulation, such as epigenetic intervention, inhibition of DNA binding by transcription factors, and inhibition of protein-protein interactions (PPIs), have been established in preclinical or clinical studies. In addition, several new approaches have recently been described, such as targeting proteasomal degradation and eliciting synthetic lethality. This review highlights these developing therapeutic approaches and provides a thorough overview of drug development targeting key proteins involved in transcriptional regulation and their impact on future oncotherapy.
Affiliation(s)
- Haixiang Pei
  - Institute for Advanced Study, Shenzhen University and Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Shenzhen University Health Science Center, Shenzhen, China
  - Shanghai Key Laboratory of Regulatory Biology, The Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai, China
- Weikai Guo
  - Shanghai Key Laboratory of Regulatory Biology, The Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai, China
  - Joint National Laboratory for Antibody Drug Engineering, School of Basic Medical Science, Henan University, Kaifeng, China
- Yangrui Peng
  - Shanghai Key Laboratory of Regulatory Biology, The Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai, China
- Hai Xiong
  - Institute for Advanced Study, Shenzhen University and Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Shenzhen University Health Science Center, Shenzhen, China
- Yihua Chen
  - Shanghai Key Laboratory of Regulatory Biology, The Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai, China
164
Poboinev VV, Khrustalev VV, Khrustaleva TA, Kasko TE, Popkov VD. The PentUnFOLD algorithm as a tool to distinguish the dark and the light sides of the structural instability of proteins. Amino Acids 2022; 54:1155-1171. [PMID: 35294674 PMCID: PMC8924573 DOI: 10.1007/s00726-022-03153-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/05/2021] [Accepted: 02/14/2022] [Indexed: 12/12/2022]
Abstract
Intrinsically disordered proteins are frequently involved in important regulatory processes in the cell thanks to their ability to bind several different targets, sometimes performing even opposite functions. The PentUnFOLD algorithm is a physicochemical method that is based on new propensity scales for disordered, nonstable and stable elements of secondary structure and on the counting of stabilizing and destabilizing intraprotein contacts. Unlike other methods, it works with a PDB file, and it can determine not only those fragments of alpha helices, beta strands, and random coils that can turn into a disordered state (the “dark” side of the disorder), but also nonstable regions of alpha helices and beta strands which are able to turn into random coils (the “light” side), and vice versa (H ↔ C, E ↔ C). The scales have been obtained from structural data on disordered regions from the middle parts of amino acid sequences only, and not on their expectedly disordered N- and C-termini. Among other tendencies, we have found that regions of both alpha helices and beta strands that can turn into the disordered state are relatively enriched in residues of Ala, Met, Asp, and Lys, while regions of both alpha helices and beta strands that can turn into random coil are relatively enriched in hydrophilic residues, and Cys, Pro, and Gly. Moreover, PentUnFOLD has the option to determine the effect of secondary structure transitions on the stability of a given region of a protein. The PentUnFOLD algorithm is freely available at http://3.17.12.213/pent-un-fold and http://chemres.bsmu.by/PentUnFOLD.htm.
Affiliation(s)
| | | | - Tatyana Aleksandrovna Khrustaleva
- Biochemical Group of the Multidisciplinary Diagnostic Laboratory, Institute of Physiology of the National Academy of Sciences of Belarus, Minsk, Belarus
| | - Tihon Evgenyevich Kasko
- Department of General Chemistry, Belarusian State Medical University, Dzerzinskogo 83, Minsk, Belarus
| | - Vadim Dmitrievich Popkov
- Department of General Chemistry, Belarusian State Medical University, Dzerzinskogo 83, Minsk, Belarus
|
165
|
Tenchov R, Zhou QA. Intrinsically Disordered Proteins: Perspective on COVID-19 Infection and Drug Discovery. ACS Infect Dis 2022; 8:422-432. [PMID: 35196007 PMCID: PMC8887652 DOI: 10.1021/acsinfecdis.2c00031] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 01/13/2022] [Indexed: 12/23/2022]
Abstract
Since the beginning of the COVID-19 pandemic caused by SARS-CoV-2, millions of patients have been diagnosed and many of them have died from the disease worldwide. The identification of novel therapeutic targets is of utmost significance for prevention and treatment of COVID-19. SARS-CoV-2 is a single-stranded RNA virus with a 30 kb genome packaged into a membrane-enveloped virion, encoding several tens of proteins. The belief that the amino acid sequence of proteins determines their 3D structure which, in turn, determines their function has been a central principle of molecular biology for a long time. Recently, it has been increasingly realized, however, that there is a large group of proteins that lack a fixed or ordered 3D structure, yet they exhibit important biological activities: the so-called intrinsically disordered proteins and protein regions (IDPs/IDRs). Disordered regions in viral proteins are generally associated with viral infectivity and pathogenicity because they endow the viral proteins with the ability to easily and promiscuously bind to host proteins; therefore, the proteome of SARS-CoV-2 has been thoroughly examined for intrinsic disorder. It has been recognized that, in fact, the SARS-CoV-2 proteome exhibits significant levels of structural order, with only the nucleocapsid (N) structural protein and two of the nonstructural proteins being highly disordered. The spike (S) protein of SARS-CoV-2 exhibits significant levels of structural order, yet its predicted percentage of intrinsic disorder is still higher than that of the spike protein of SARS-CoV. Notably, even though IDPs/IDRs are not common in the SARS-CoV-2 proteome, the existing ones play major roles in the functioning and virulence of the virus and are thus promising drug targets for rational antiviral drug design. Presented here is a COVID-19 perspective on the intrinsically disordered proteins, summarizing recent results on the SARS-CoV-2 proteome disorder features, their physiological and pathological relevance, and their prominence as prospective drug target sites.
Affiliation(s)
- Rumiana Tenchov
- CAS, a division of the American Chemical Society, Columbus, Ohio 43210, United States
|
166
|
Ahmed SS, Rifat ZT, Lohia R, Campbell AJ, Dunker AK, Rahman MS, Iqbal S. Characterization of intrinsically disordered regions in proteins informed by human genetic diversity. PLoS Comput Biol 2022; 18:e1009911. [PMID: 35275927 PMCID: PMC8942211 DOI: 10.1371/journal.pcbi.1009911] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 07/21/2021] [Revised: 03/23/2022] [Accepted: 02/10/2022] [Indexed: 01/21/2023] Open
Abstract
All proteomes contain both proteins and polypeptide segments that don’t form a defined three-dimensional structure yet are biologically active—called intrinsically disordered proteins and regions (IDPs and IDRs). Most of these IDPs/IDRs lack useful functional annotation, limiting our understanding of their importance for organism fitness. Here we characterized IDRs using protein sequence annotations of functional sites and regions available in the UniProt knowledgebase (“UniProt features”: active site, ligand-binding pocket, regions mediating protein-protein interactions, etc.). By measuring the statistical enrichment of twenty-five UniProt features in 981 IDRs of 561 human proteins, we identified eight features that are commonly located in IDRs. We then collected the genetic variant data from the general population and patient-based databases and evaluated the prevalence of population and pathogenic variations in IDPs/IDRs. We observed that some IDRs tolerate 2 to 12 times more single amino acid-substituting missense mutations than synonymous changes in the general population. However, we also found that 37% of all germline pathogenic mutations are located in disordered regions of 96 proteins. Based on the observed-to-expected frequency of mutations, we categorized 34 IDRs in 20 proteins (DDX3X, KIT, RB1, etc.) as intolerant to mutation. Finally, using statistical analysis and a machine learning approach, we demonstrate that mutation-intolerant IDRs carry a distinct signature of functional features. Our study presents a novel approach to assign functional importance to IDRs by leveraging the wealth of available genetic data, which will aid in a deeper understanding of the role of IDRs in biological processes and disease mechanisms.
Affiliation(s)
- Shehab S. Ahmed
- Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, ECE Building, West Palashi, Dhaka-1205, Bangladesh
| | - Zaara T. Rifat
- Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, ECE Building, West Palashi, Dhaka-1205, Bangladesh
| | - Ruchi Lohia
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
| | - Arthur J. Campbell
- Center for the Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
| | - A. Keith Dunker
- Center for Computational Biology and Bioinformatics, Department of Biochemistry and Molecular Biology, Indiana University School of Medicine, Indianapolis, Indiana, United States of America
| | - M. Sohel Rahman
- Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, ECE Building, West Palashi, Dhaka-1205, Bangladesh
- * E-mail: (MSR); (SI)
| | - Sumaiya Iqbal
- Center for the Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts, United States of America
- * E-mail: (MSR); (SI)
|
167
|
Zhao B, Kurgan L. Deep learning in prediction of intrinsic disorder in proteins. Comput Struct Biotechnol J 2022; 20:1286-1294. [PMID: 35356546 PMCID: PMC8927795 DOI: 10.1016/j.csbj.2022.03.003] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 01/10/2022] [Revised: 03/04/2022] [Accepted: 03/04/2022] [Indexed: 12/12/2022] Open
Abstract
Intrinsic disorder prediction is an active area that has developed over 100 predictors. We identify and investigate a recent trend towards the development of deep neural network (DNN)-based methods. The first DNN-based method was released in 2013 and since 2019 deep learners account for the majority of the new disorder predictors. We find that the 13 currently available DNN-based predictors are diverse in their topologies, sizes of their networks and the inputs that they utilize. We empirically show that the deep learners are statistically more accurate than other types of disorder predictors using the blind test dataset from the recent community assessment of intrinsic disorder predictions (CAID). We also identify several well-rounded DNN-based predictors that are accurate, fast and/or conveniently available. The popularity, favorable predictive performance and architectural flexibility suggest that deep networks are likely to fuel the development of future disorder predictors. Novel hybrid designs of deep networks could be used to adequately accommodate the diversity of types and flavors of intrinsic disorder. We also discuss the scarcity of DNN-based methods for the prediction of disordered binding regions and the need to develop more accurate methods for this prediction.
Affiliation(s)
- Bi Zhao
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
|
168
|
Kurgan L. Resources for computational prediction of intrinsic disorder in proteins. Methods 2022; 204:132-141. [DOI: 10.1016/j.ymeth.2022.03.018] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 01/28/2022] [Revised: 03/25/2022] [Accepted: 03/29/2022] [Indexed: 12/26/2022] Open
|
169
|
Djulbegovic M, Uversky VN. The aqueous humor proteome is intrinsically disordered. Biochem Biophys Rep 2022; 29:101202. [PMID: 35128080 PMCID: PMC8808082 DOI: 10.1016/j.bbrep.2022.101202] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2021] [Revised: 01/03/2022] [Accepted: 01/04/2022] [Indexed: 11/14/2022] Open
Abstract
Our study demonstrated that intrinsic disorder is abundant in the aqueous humor. The 749 aqueous proteins analyzed were enriched with disorder-promoting residues. 208 aqueous humor proteins were predicted to be highly intrinsically disordered. Misregulation of IDPs may promote pathology in the aqueous humor. IDPs in aqueous humor may serve as future targets for novel therapeutics.
|
170
|
Piovesan D, Monzon AM, Quaglia F, Tosatto SCE. Databases for intrinsically disordered proteins. Acta Crystallogr D Struct Biol 2022; 78:144-151. [PMID: 35102880 PMCID: PMC8805306 DOI: 10.1107/s2059798321012109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2021] [Accepted: 11/12/2021] [Indexed: 11/28/2022] Open
Abstract
Intrinsically disordered regions (IDRs) lacking a fixed three-dimensional protein structure are widespread and play a central role in cell regulation. Only a small fraction of IDRs have been functionally characterized, with heterogeneous experimental evidence that is largely buried in the literature. Predictions of IDRs are still difficult to estimate and are poorly characterized. Here, an overview of the publicly available knowledge about IDRs is reported, including manually curated resources, deposition databases and prediction repositories. The types, scopes and availability of the various resources are analyzed, and their complementarity and overlap are highlighted. The volume of information included and the relevance to the field of structural biology are compared.
Affiliation(s)
- Damiano Piovesan
- Department of Biomedical Sciences, University of Padova, Padova, Italy
| | | | - Federica Quaglia
- Department of Biomedical Sciences, University of Padova, Padova, Italy
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR–IBIOM), Bari, Italy
|
171
|
Erythropoietin Interacts with Specific S100 Proteins. Biomolecules 2022; 12:biom12010120. [PMID: 35053268 PMCID: PMC8773746 DOI: 10.3390/biom12010120] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 12/01/2021] [Revised: 01/10/2022] [Accepted: 01/10/2022] [Indexed: 02/06/2023] Open
Abstract
Erythropoietin (EPO) is a clinically significant four-helical cytokine, exhibiting erythropoietic, cytoprotective, immunomodulatory, and cancer-promoting activities. Despite vast knowledge on its signaling pathways and physiological effects, extracellular factors regulating EPO activity remain underexplored. Here we show by surface plasmon resonance spectroscopy that, among eighteen Ca2+-binding proteins of the S100 protein family studied, only S100A2, S100A6 and S100P proteins specifically recognize EPO with equilibrium dissociation constants ranging from 81 nM to 0.5 µM. The interactions occur exclusively under calcium excess. Bioinformatics analysis showed that the EPO-S100 interactions could be relevant to progression of neoplastic diseases, including cancer, and other diseases. The detailed knowledge of distinct physiological effects of the EPO-S100 interactions could favor development of more efficient clinical implications of EPO. Summing up our data with previous findings, we conclude that S100 proteins are potentially able to directly affect functional activities of specific members of all families of four-helical cytokines, and cytokines of other structural superfamilies.
|
172
|
Quaglia F, Mészáros B, Salladini E, Hatos A, Pancsa R, Chemes LB, Pajkos M, Lazar T, Peña-Díaz S, Santos J, Ács V, Farahi N, Fichó E, Aspromonte M, Bassot C, Chasapi A, Davey N, Davidović R, Dobson L, Elofsson A, Erdős G, Gaudet P, Giglio M, Glavina J, Iserte J, Iglesias V, Kálmán Z, Lambrughi M, Leonardi E, Longhi S, Macedo-Ribeiro S, Maiani E, Marchetti J, Marino-Buslje C, Mészáros A, Monzon A, Minervini G, Nadendla S, Nilsson JF, Novotný M, Ouzounis C, Palopoli N, Papaleo E, Pereira P, Pozzati G, Promponas V, Pujols J, Rocha AS, Salas M, Sawicki LR, Schad E, Shenoy A, Szaniszló T, Tsirigos K, Veljkovic N, Parisi G, Ventura S, Dosztányi Z, Tompa P, Tosatto SCE, Piovesan D. DisProt in 2022: improved quality and accessibility of protein intrinsic disorder annotation. Nucleic Acids Res 2022; 50:D480-D487. [PMID: 34850135 PMCID: PMC8728214 DOI: 10.1093/nar/gkab1082] [Citation(s) in RCA: 107] [Impact Index Per Article: 35.7] [Received: 09/15/2021] [Revised: 10/15/2021] [Accepted: 10/20/2021] [Indexed: 02/03/2023] Open
Abstract
The Database of Intrinsically Disordered Proteins (DisProt, URL: https://disprot.org) is the major repository of manually curated annotations of intrinsically disordered proteins and regions from the literature. We report here recent updates of DisProt version 9, including a restyled web interface, refactored Intrinsically Disordered Proteins Ontology (IDPO), improvements in the curation process and significant content growth of around 30%. Higher quality and consistency of annotations is provided by a newly implemented reviewing process and training of curators. The increased curation capacity is fostered by the integration of DisProt with APICURON, a dedicated resource for the proper attribution and recognition of biocuration efforts. Better interoperability is provided through the adoption of the Minimum Information About Disorder (MIADE) standard, an active collaboration with the Gene Ontology (GO) and Evidence and Conclusion Ontology (ECO) consortia and the support of the ELIXIR infrastructure.
Affiliation(s)
- Federica Quaglia
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR-IBIOM), Bari, Italy
- Department of Biomedical Sciences, University of Padova, Padova, Italy
| | - Bálint Mészáros
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg 69117, Germany
| | - Edoardo Salladini
- Department of Biomedical Sciences, University of Padova, Padova, Italy
| | - András Hatos
- Department of Biomedical Sciences, University of Padova, Padova, Italy
| | - Rita Pancsa
- Institute of Enzymology, Research Centre for Natural Sciences, Budapest 1117, Hungary
| | - Lucía B Chemes
- Instituto de Investigaciones Biotecnológicas (IIBiO-CONICET), Universidad Nacional de San Martín, Av. 25 de Mayo y Francia, CP1650 Buenos Aires, Argentina
| | - Mátyás Pajkos
- Department of Biochemistry, Eötvös Loránd University, Pázmány Péter stny 1/c, Budapest H-1117, Hungary
| | - Tamas Lazar
- VIB-VUB Center for Structural Biology, Vlaams Instituut voor Biotechnology, Brussels, Belgium
- Structural Biology Brussels (SBB), Bioengineering Sciences Department, Vrije Universiteit Brussel (VUB), Brussels, Belgium
| | - Samuel Peña-Díaz
- Institut de Biotecnologia i Biomedicina, Universitat Autònoma de Barcelona, Barcelona, Spain
- Departament de Bioquímica i Biologia Molecular, Universitat Autònoma de Barcelona, Barcelona, Spain
| | - Jaime Santos
- Institut de Biotecnologia i Biomedicina, Universitat Autònoma de Barcelona, Barcelona, Spain
- Departament de Bioquímica i Biologia Molecular, Universitat Autònoma de Barcelona, Barcelona, Spain
| | - Veronika Ács
- Institute of Enzymology, Research Centre for Natural Sciences, Budapest 1117, Hungary
| | - Nazanin Farahi
- VIB-VUB Center for Structural Biology, Vlaams Instituut voor Biotechnology, Brussels, Belgium
- Structural Biology Brussels (SBB), Bioengineering Sciences Department, Vrije Universiteit Brussel (VUB), Brussels, Belgium
| | - Erzsébet Fichó
- Institute of Enzymology, Research Centre for Natural Sciences, Budapest 1117, Hungary
- Cytocast Kft., Vecsés, Hungary
| | - Maria Cristina Aspromonte
- Department of Woman and Child Health, University of Padova, Padova, Italy
- Pediatric Research Institute, Città della Speranza, Padova, Italy
| | - Claudio Bassot
- Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, 171 21 Solna, Sweden
| | - Anastasia Chasapi
- Biological Computation & Process Laboratory, Chemical Process & Energy Resources Institute, Centre for Research & Technology Hellas, Thermi, Thessalonica 57001, Greece
| | - Norman E Davey
- Institute of Cancer Research, Chester Beatty Laboratories, 237 Fulham Rd, Chelsea, London, UK
| | - Radoslav Davidović
- Laboratory for Bioinformatics and Computational Chemistry, Vinča Institute of Nuclear Sciences, National Institute of the Republic of Serbia, University of Belgrade, 11000 Belgrade, Serbia
| | - Laszlo Dobson
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg 69117, Germany
- Institute of Enzymology, Research Centre for Natural Sciences, Budapest 1117, Hungary
| | - Arne Elofsson
- Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, 171 21 Solna, Sweden
| | - Gábor Erdős
- Department of Biochemistry, Eötvös Loránd University, Pázmány Péter stny 1/c, Budapest H-1117, Hungary
| | - Pascale Gaudet
- Swiss-Prot group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland
| | - Michelle Giglio
- Institute for Genome Sciences, University of Maryland School of Medicine 670 W. Baltimore St., Baltimore, MD 21201, USA
| | - Juliana Glavina
- Instituto de Investigaciones Biotecnológicas (IIBiO-CONICET), Universidad Nacional de San Martín, Av. 25 de Mayo y Francia, CP1650 Buenos Aires, Argentina
| | - Javier Iserte
- Bioinformatics Unit, Fundación Instituto Leloir, Buenos Aires, C1405BWE, Argentina
| | - Valentín Iglesias
- Institut de Biotecnologia i Biomedicina, Universitat Autònoma de Barcelona, Barcelona, Spain
- Departament de Bioquímica i Biologia Molecular, Universitat Autònoma de Barcelona, Barcelona, Spain
| | - Zsófia Kálmán
- Faculty of Information Technology and Bionics, Pázmány Péter Catholic University, Práter u. 50/A, 1083 Budapest, Hungary
| | - Matteo Lambrughi
- Cancer Structural Biology, Danish Cancer Society Research Center, Strandboulevarden 49, 2100 Copenhagen, Denmark
| | - Emanuela Leonardi
- Department of Woman and Child Health, University of Padova, Padova, Italy
- Pediatric Research Institute, Città della Speranza, Padova, Italy
| | - Sonia Longhi
- Lab. Architecture et Fonction des Macromolécules Biologiques (AFMB), UMR 7257, Aix Marseille University and Centre National de la Recherche Scientifique (CNRS), 163 Avenue de Luminy, Case 932, 13288, Marseille, France
| | - Sandra Macedo-Ribeiro
- Instituto de Biologia Molecular e Celular (IBMC), Universidade do Porto, 4200-135 Porto, Portugal
- Instituto de Investigação e Inovação em Saúde (i3S), Universidade do Porto, 4200-135 Porto, Portugal
| | - Emiliano Maiani
- Cancer Structural Biology, Danish Cancer Society Research Center, Strandboulevarden 49, 2100 Copenhagen, Denmark
| | - Julia Marchetti
- Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes - CONICET, Bernal, Buenos Aires B1876BXD, Argentina
| | | | - Attila Mészáros
- VIB-VUB Center for Structural Biology, Vlaams Instituut voor Biotechnology, Brussels, Belgium
- Structural Biology Brussels (SBB), Bioengineering Sciences Department, Vrije Universiteit Brussel (VUB), Brussels, Belgium
| | | | | | - Suvarna Nadendla
- Institute for Genome Sciences, University of Maryland School of Medicine 670 W. Baltimore St., Baltimore, MD 21201, USA
| | - Juliet F Nilsson
- Lab. Architecture et Fonction des Macromolécules Biologiques (AFMB), UMR 7257, Aix Marseille University and Centre National de la Recherche Scientifique (CNRS), 163 Avenue de Luminy, Case 932, 13288, Marseille, France
| | - Marian Novotný
- Dep. of Cell Biology, Faculty of Science, Vinicna 7, 128 43, Prague, Czech Republic
| | - Christos A Ouzounis
- Biological Computation & Process Laboratory, Chemical Process & Energy Resources Institute, Centre for Research & Technology Hellas, Thermi, Thessalonica 57001, Greece
- Biological Computation & Computational Biology Group, Artificial Intelligence & Information Analysis Lab, Department of Computer Science, Aristotle University of Thessalonica, Thessalonica 54124, Greece
| | - Nicolás Palopoli
- Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes - CONICET, Bernal, Buenos Aires B1876BXD, Argentina
| | - Elena Papaleo
- Cancer Structural Biology, Danish Cancer Society Research Center, Strandboulevarden 49, 2100 Copenhagen, Denmark
- Cancer Systems Biology, Section for Bioinformatics, Department of Health and Technology, Technical University of Denmark, Lyngby, Denmark
| | - Pedro José Barbosa Pereira
- Instituto de Biologia Molecular e Celular (IBMC), Universidade do Porto, 4200-135 Porto, Portugal
- Instituto de Investigação e Inovação em Saúde (i3S), Universidade do Porto, 4200-135 Porto, Portugal
| | - Gabriele Pozzati
- Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, 171 21 Solna, Sweden
| | - Vasilis J Promponas
- Bioinformatics Research Laboratory, Department of Biological Sciences, University of Cyprus, Nicosia, Cyprus
| | - Jordi Pujols
- Institut de Biotecnologia i Biomedicina, Universitat Autònoma de Barcelona, Barcelona, Spain
- Departament de Bioquímica i Biologia Molecular, Universitat Autònoma de Barcelona, Barcelona, Spain
| | | | - Martin Salas
- Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes - CONICET, Bernal, Buenos Aires B1876BXD, Argentina
| | - Luciana Rodriguez Sawicki
- Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes - CONICET, Bernal, Buenos Aires B1876BXD, Argentina
| | - Eva Schad
- Institute of Enzymology, Research Centre for Natural Sciences, Budapest 1117, Hungary
| | - Aditi Shenoy
- Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, 171 21 Solna, Sweden
| | - Tamás Szaniszló
- Department of Biochemistry, Eötvös Loránd University, Pázmány Péter stny 1/c, Budapest H-1117, Hungary
| | - Konstantinos D Tsirigos
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK
| | - Nevena Veljkovic
- Laboratory for Bioinformatics and Computational Chemistry, Vinča Institute of Nuclear Sciences, National Institute of the Republic of Serbia, University of Belgrade, 11000 Belgrade, Serbia
| | - Gustavo Parisi
- Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes - CONICET, Bernal, Buenos Aires B1876BXD, Argentina
| | - Salvador Ventura
- Institut de Biotecnologia i Biomedicina, Universitat Autònoma de Barcelona, Barcelona, Spain
- Departament de Bioquímica i Biologia Molecular, Universitat Autònoma de Barcelona, Barcelona, Spain
- ICREA, Barcelona, Spain
| | - Zsuzsanna Dosztányi
- Department of Biochemistry, Eötvös Loránd University, Pázmány Péter stny 1/c, Budapest H-1117, Hungary
| | - Peter Tompa
- Institute of Enzymology, Research Centre for Natural Sciences, Budapest 1117, Hungary
- VIB-VUB Center for Structural Biology, Vlaams Instituut voor Biotechnology, Brussels, Belgium
- Structural Biology Brussels (SBB), Bioengineering Sciences Department, Vrije Universiteit Brussel (VUB), Brussels, Belgium
| | | | - Damiano Piovesan
- Department of Biomedical Sciences, University of Padova, Padova, Italy
|
173
|
Hassan SS, Kodakandla V, Redwan EM, Lundstrom K, Pal Choudhury P, Abd El-Aziz TM, Takayama K, Kandimalla R, Lal A, Serrano-Aroca Á, Azad GK, Aljabali AAA, Palù G, Chauhan G, Adadi P, Tambuwala M, Brufsky AM, Baetas-da-Cruz W, Barh D, Azevedo V, Bazan NG, Andrade BS, Santana Silva RJ, Uversky VN. An issue of concern: unique truncated ORF8 protein variants of SARS-CoV-2. PeerJ 2022. [PMID: 35341060 DOI: 10.1101/2021.05.25.445557] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 04/15/2023] Open
Abstract
Open reading frame 8 (ORF8) shows one of the highest levels of variability among accessory proteins in Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), the causative agent of Coronavirus Disease 2019 (COVID-19). It was previously reported that the ORF8 protein inhibits the presentation of viral antigens by the major histocompatibility complex class I (MHC-I), which interacts with host factors involved in pulmonary inflammation. The ORF8 protein assists SARS-CoV-2 in evading immunity and plays a role in SARS-CoV-2 replication. Among many contributing mutations, Q27STOP, a mutation in the ORF8 protein, defines the B.1.1.7 lineage of SARS-CoV-2, engendering the second wave of COVID-19. In the present study, 47 unique truncated ORF8 proteins (T-ORF8) with the Q27STOP mutations were identified among 49,055 available B.1.1.7 SARS-CoV-2 sequences. The results show that only one of the 47 T-ORF8 variants spread to over 57 geo-locations in North America, and other continents, which include Africa, Asia, Europe and South America. Based on various quantitative features, such as amino acid homology, polar/non-polar sequence homology, Shannon entropy conservation, and other physicochemical properties of all specific 47 T-ORF8 protein variants, nine possible T-ORF8 unique variants were defined. The question as to whether T-ORF8 variants function similarly to the wild type ORF8 is yet to be investigated. A positive response to the question could exacerbate future COVID-19 waves, necessitating severe containment measures.
Affiliation(s)
- Sk Sarif Hassan
- Department of Mathematics, Pingla Thana Mahavidyalaya, Maligram, India
| | - Vaishnavi Kodakandla
- Department of Life sciences, Sophia College For Women, University of Mumbai, Mumbai, India
| | - Elrashdy M Redwan
- Faculty of Science, Department of Biological Science, King Abdulaziz University, Jeddah, Saudi Arabia
| | | | | | - Tarek Mohamed Abd El-Aziz
- Department of Cellular and Integrative Physiology, University of Texas Health Science Center at San Antonio, San Antonio, TX, United States
| | - Kazuo Takayama
- Center for iPS Cell Research and Application (CiRA), Kyoto University, Kyoto, Japan
| | - Ramesh Kandimalla
- Applied Biology, CSIR-Indian Institute of Chemical Technology, Hyderabad, India
| | - Amos Lal
- Division of Pulmonary and Critical Care Medicine, Mayo Clinic Rochester, Rochester, MN, United States
| | - Ángel Serrano-Aroca
- Biomaterials and Bioengineering Lab, Centro de Investigacion Traslacional San Alberto Magno, Universidad Catolica de Valencia San Vicente Martir, Valencia, Spain
| | | | - Alaa A A Aljabali
- Department of Pharmaceutics and Pharmaceutical, Yarmouk University, Irbid, Jordan
| | - Giorgio Palù
- Department of Molecular Medicine, University of Padova, Padova, Italy
| | - Gaurav Chauhan
- School of Engineering and Sciences, Tecnologico de Monterrey, Monterrey, Mexico
| | - Parise Adadi
- Department of Food Science, University of Otago, Dunedin, New Zealand
| | - Murtaza Tambuwala
- School of Pharmacy and Pharmaceutical Science, Ulster University, Coleraine, UK
| | - Adam M Brufsky
- Department of Medicine, Division of Hematology/Oncology, University of Pittsburgh School of Medicine, Pittsburgh, PA, United States
| | - Wagner Baetas-da-Cruz
- Translational Laboratory in Molecular Physiology, Centre for Experimental Surgery, College of Medicine, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
| | - Debmalya Barh
- Centre for Genomics and Applied Gene Technology, Institute of Integrative Omics and Applied Biotechnology (IIOAB), Nonakuri, India
| | - Vasco Azevedo
- Departamento de Genetica, Ecologia e Evolucao, Instituto de Ciencias Biologicas, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
| | - Nikolas G Bazan
- Neuroscience Center of Excellence, School of Medicine, LSU Health New Orleans, New Orleans, LA, United States
| | - Bruno Silva Andrade
- Laboratório de Bioinformática e Química Computacional, Departamento de Ciências Biológicas, Universidade Estadual do Sudoeste da Bahia, Jequié, Brazil
| | - Raner José Santana Silva
- Departamento de Ciencias Biologicas (DCB), Programa de Pos-Graduacao em Genetica e Biologia Molecular (PPGGBM), Universidade Estadual de Santa Cruz (UESC), Ilheus, Brazil
| | - Vladimir N Uversky
- Department of Molecular Medicine, University of South Florida, Tampa, FL, United States
|
174
|
Robinson SL. Artificial intelligence for microbial biotechnology: beyond the hype. Microb Biotechnol 2022; 15:65-69. [PMID: 34606686 PMCID: PMC8719820 DOI: 10.1111/1751-7915.13943] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 09/25/2021] [Accepted: 09/25/2021] [Indexed: 11/30/2022] Open
Abstract
It has been a landmark year for artificial intelligence (AI) and biotechnology. Perhaps the most noteworthy of these advances was Google DeepMind's AlphaFold2 algorithm which smashed records in protein structure prediction (Jumper et al., 2021, Nature, 596, 583) complemented by progress made by other research groups around the globe (Baek et al., 2021, Science, 373, 871; Zheng et al., 2021, Proteins). For the first time in history, AI achieved protein structure models rivalling the accuracy of experimentally determined structures. The power of accurate protein structure prediction at our fingertips has countless implications for drug discovery, de novo protein design and fundamental research in chemical biology. While acknowledging the significance of these breakthroughs, this perspective aims to cut through the hype and examine some key limitations using AlphaFold2 as a lens to consider the broader implications of AI for microbial biotechnology for the next 15 years and beyond.
Affiliation(s)
- Serina L. Robinson
- Department of Environmental Microbiology, Eawag ‐ Swiss Federal Institute for Aquatic Science and Technology, Dübendorf, Switzerland
| |
|
175
|
Tamburrini KC, Pesce G, Nilsson J, Gondelaud F, Kajava AV, Berrin JG, Longhi S. Predicting Protein Conformational Disorder and Disordered Binding Sites. Methods Mol Biol 2022; 2449:95-147. [PMID: 35507260 DOI: 10.1007/978-1-0716-2095-3_4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
In the last two decades it has become increasingly evident that a large number of proteins adopt either a fully or a partially disordered conformation. Intrinsically disordered proteins are ubiquitous proteins that fulfill essential biological functions while lacking a stable 3D structure. Their conformational heterogeneity is encoded by the amino acid sequence, thereby allowing intrinsically disordered proteins or regions to be recognized based on their sequence properties. The identification of disordered regions facilitates the functional annotation of proteins and is instrumental for delineating boundaries of protein domains amenable to crystallization. This chapter focuses on the methods currently employed for predicting protein disorder and identifying intrinsically disordered binding sites.
Affiliation(s)
- Ketty C Tamburrini
- Aix Marseille Univ, CNRS, Architecture et Fonction des Macromolécules Biologiques, AFMB, UMR 7257, Marseille, France
- INRAE, Aix Marseille Univ, Biodiversité et Biotechnologie Fongiques (BBF), UMR 1163, Marseille, France
| | - Giulia Pesce
- Aix Marseille Univ, CNRS, Architecture et Fonction des Macromolécules Biologiques, AFMB, UMR 7257, Marseille, France
| | - Juliet Nilsson
- Aix Marseille Univ, CNRS, Architecture et Fonction des Macromolécules Biologiques, AFMB, UMR 7257, Marseille, France
| | - Frank Gondelaud
- Aix Marseille Univ, CNRS, Architecture et Fonction des Macromolécules Biologiques, AFMB, UMR 7257, Marseille, France
| | - Andrey V Kajava
- Centre de Recherche en Biologie cellulaire de Montpellier, UMR 5237, CNRS, Université Montpellier, Montpellier, France
| | - Jean-Guy Berrin
- INRAE, Aix Marseille Univ, Biodiversité et Biotechnologie Fongiques (BBF), UMR 1163, Marseille, France
| | - Sonia Longhi
- Aix Marseille Univ, CNRS, Architecture et Fonction des Macromolécules Biologiques, AFMB, UMR 7257, Marseille, France.
| |
|
176
|
Iglesias V, Pintado-Grima C, Santos J, Fornt M, Ventura S. Prediction of the Effect of pH on the Aggregation and Conditional Folding of Intrinsically Disordered Proteins with SolupHred and DispHred. Methods Mol Biol 2022; 2449:197-211. [PMID: 35507264 DOI: 10.1007/978-1-0716-2095-3_8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Proteins' microenvironments modulate their structures. Binding partners, organic molecules, or dissolved ions can alter a protein's compaction, inducing aggregation or order-disorder conformational transitions. Surprisingly, bioinformatic platforms often disregard the protein context in their modeling. In a recent work, we proposed that modeling how pH affects protein net charge and hydrophobicity might allow us to forecast pH-dependent aggregation and conditional disorder in intrinsically disordered proteins (IDPs). As these approaches showed remarkable success in recapitulating the available bibliographical data, we made these prediction methods available to the scientific community as two user-friendly web servers. SolupHred is the first dedicated software to predict pH-dependent aggregation, and DispHred is the first pH-dependent predictor of protein disorder. Here we dissect the features of these two software applications to train and assist scientists in studying pH-dependent conformational changes in IDPs.
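As a rough illustration of how pH can change a protein's net charge, the following sketch applies the Henderson-Hasselbalch relation to each ionizable group. It is not the SolupHred or DispHred method; the function name `net_charge` and all pKa values are assumptions chosen for illustration (approximate textbook values), and real predictors may use different parameters and additional terms such as hydrophobicity.

```python
# Approximate, textbook-style pKa values (assumed for illustration only;
# not taken from SolupHred or DispHred).
PKA_POSITIVE = {"K": 10.5, "R": 12.5, "H": 6.0}             # protonated form is +1
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}   # deprotonated form is -1
PKA_N_TERMINUS = 9.0
PKA_C_TERMINUS = 2.0


def net_charge(sequence: str, ph: float) -> float:
    """Estimate the net charge of a peptide at a given pH.

    Each ionizable group contributes its fractional charge according to
    the Henderson-Hasselbalch relation.
    """
    # Free N-terminus (positive when protonated) and C-terminus (negative
    # when deprotonated).
    charge = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERMINUS))
    charge -= 1.0 / (1.0 + 10 ** (PKA_C_TERMINUS - ph))
    for residue in sequence.upper():
        if residue in PKA_POSITIVE:
            charge += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[residue]))
        elif residue in PKA_NEGATIVE:
            charge -= 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[residue] - ph))
    return charge


if __name__ == "__main__":
    # Scan a hypothetical peptide across a pH range.
    for ph in (2.0, 4.0, 7.0, 10.0, 12.0):
        print(f"pH {ph:4.1f}: net charge {net_charge('MKDEHRYC', ph):+.2f}")
```

Under these assumptions the estimated charge decreases monotonically as pH rises, which is the kind of pH dependence the abstract refers to.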
Affiliation(s)
- Valentín Iglesias
- Institut de Biotecnologia i de Biomedicina and Departament de Bioquímica i Biologia Molecular, Universitat Autònoma de Barcelona, Bellaterra, Spain
| | - Carlos Pintado-Grima
- Institut de Biotecnologia i de Biomedicina and Departament de Bioquímica i Biologia Molecular, Universitat Autònoma de Barcelona, Bellaterra, Spain
| | - Jaime Santos
- Institut de Biotecnologia i de Biomedicina and Departament de Bioquímica i Biologia Molecular, Universitat Autònoma de Barcelona, Bellaterra, Spain
| | - Marc Fornt
- Institut de Biotecnologia i de Biomedicina and Departament de Bioquímica i Biologia Molecular, Universitat Autònoma de Barcelona, Bellaterra, Spain
| | - Salvador Ventura
- Institut de Biotecnologia i de Biomedicina and Departament de Bioquímica i Biologia Molecular, Universitat Autònoma de Barcelona, Bellaterra, Spain.
- Institut de Biotecnologia i de Biomedicina, Campus Universitari de Bellaterra, Cerdanyola, Barcelona, Spain.
| |
|
177
|
Lewkowicz E, Gursky O. Dynamic protein structures in normal function and pathologic misfolding in systemic amyloidosis. Biophys Chem 2022; 280:106699. [PMID: 34773861 PMCID: PMC9416430 DOI: 10.1016/j.bpc.2021.106699] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 09/22/2021] [Revised: 10/08/2021] [Accepted: 10/08/2021] [Indexed: 02/08/2023]
Abstract
Dynamic and disordered regions in native proteins are often critical for their function, particularly in ligand binding and signaling. In certain proteins, however, such regions can contribute to misfolding and pathologic deposition as amyloid fibrils in vivo. For example, dynamic and disordered regions can promote amyloid formation by destabilizing the native structure, by directly triggering aggregation, by promoting protein condensation, or by acting as sites of early proteolytic cleavage that favor the release of aggregation-prone fragments or facilitate fibril maturation. At the same time, enhanced dynamics in the native protein state accelerates proteolytic degradation that counteracts amyloid accumulation in vivo. Therefore, the functional need for dynamic protein regions must be balanced against their inherently labile nature. How exactly is this balance achieved, and how is it shifted upon amyloidogenic mutations or post-translational modifications? To illustrate possible scenarios, here we review the beneficial and pathologic roles of dynamic and disordered regions in the native states of three families of human plasma proteins that form amyloid precursors in systemic amyloidoses: immunoglobulin light chain, apolipoproteins, and serum amyloid A. Analysis of the structure, stability and local dynamics of these diverse proteins and their amyloidogenic variants exemplifies how disordered/dynamic regions can provide a functional advantage as well as an Achilles heel in pathologic amyloid formation.
|
178
|
Porta-Pardo E, Ruiz-Serra V, Valentini S, Valencia A. The structural coverage of the human proteome before and after AlphaFold. PLoS Comput Biol 2022; 18:e1009818. [PMID: 35073311 PMCID: PMC8812986 DOI: 10.1371/journal.pcbi.1009818] [Citation(s) in RCA: 72] [Impact Index Per Article: 24.0] [Received: 08/20/2021] [Revised: 02/03/2022] [Accepted: 01/07/2022] [Indexed: 12/12/2022]
Abstract
The protein structure field is experiencing a revolution, from the increased throughput of techniques to determine experimental structures, to developments such as cryo-EM that allow us to find the structures of large protein complexes or, more recently, the development of artificial intelligence tools, such as AlphaFold, that can predict with high accuracy the folding of proteins for which the availability of homology templates is limited. Here we quantify the effect of the recently released AlphaFold database of protein structural models on our knowledge of human proteins. Our results indicate that our current baseline for structural coverage of 48%, considering experimentally-derived or template-based homology models, rises to 76% when AlphaFold predictions are included. At the same time, the fraction of dark proteome is reduced from 26% to just 10% when AlphaFold models are considered. Furthermore, although the coverage of disease-associated genes and mutations was near complete before the AlphaFold release (69% of ClinVar pathogenic mutations and 88% of oncogenic mutations), AlphaFold models still provide an additional coverage of 3% to 13% of these critically important sets of biomedical genes and mutations. Finally, we show how the contribution of AlphaFold models to the structural coverage of non-human organisms, including important pathogenic bacteria, is significantly larger than that of the human proteome. Overall, our results show that the sequence-structure gap of human proteins has almost disappeared, an outstanding success with direct consequences for our knowledge of the human genome and the derived medical applications.
Protein structures are key to understanding many biological phenomena at the molecular scale: from the effects of genetic variation to how different proteins interact with each other to create molecular pathways that, together, have a biological function. Obtaining experimental structures, however, is extremely costly in terms of both time and resources. For this and other reasons, scientists have long worked to develop computational approaches that predict the structure of a protein using only its sequence as input. Recently, a group of scientists at DeepMind have developed AlphaFold2, a computational tool that is extremely accurate at this task. Moreover, they have used this tool to predict the structures of all human proteins. In this manuscript we provide an overview of the structural coverage of the human proteome before AlphaFold models were released and how much we have gained thanks to these models. We also show how the gain affects our understanding of human pathogenic variants, both germline and somatic. Finally, we provide evidence suggesting that the gain in non-human organisms is larger than for the human proteome, particularly in the case of bacteria.
Affiliation(s)
- Eduard Porta-Pardo
- Barcelona Supercomputing Center (BSC), Barcelona, Spain
- Josep Carreras Leukaemia Research Institute (IJC), Badalona, Spain
| | - Victoria Ruiz-Serra
- Barcelona Supercomputing Center (BSC), Barcelona, Spain
- Josep Carreras Leukaemia Research Institute (IJC), Badalona, Spain
| | - Samuel Valentini
- Department of Cellular, Computational and Integrative Biology (CIBIO), University of Trento, Trento, Italy
| | - Alfonso Valencia
- Barcelona Supercomputing Center (BSC), Barcelona, Spain
- Institució Catalana de Recerca Avançada (ICREA), Barcelona, Spain
| |
|
179
|
Abstract
INTRODUCTION The intrinsic disorder prediction field develops, assesses, and deploys computational predictors of disorder in protein sequences, and constructs and disseminates databases of these predictions. Over 40 years of research has resulted in the release of numerous resources. AREAS COVERED We identify and briefly summarize the most comprehensive collection to date of over 100 disorder predictors. We focus on their predictive models, availability and predictive performance. We categorize and study them from a historical point of view to highlight informative trends. EXPERT OPINION We find a consistent trend of improvements in predictive quality as newer and more advanced predictors are developed. The original focus on machine learning methods shifted to meta-predictors in the early 2010s, followed by a recent transition to deep learning. The use of deep learners will continue in the foreseeable future given the recent and convincing success of these methods. Moreover, a broad range of resources that facilitate convenient collection of accurate disorder predictions is available to users. They include web servers and standalone programs for disorder prediction, servers that combine prediction of disorder and disorder functions, and large databases of pre-computed predictions. We also point to the need to address the shortage of accurate methods that predict disordered binding regions.
Affiliation(s)
- Bi Zhao
- Department of Computer Science, Virginia Commonwealth University, Richmond, Virginia, USA
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, Virginia, USA
| |
|
180
|
Katuwawala A, Zhao B, Kurgan L. DisoLipPred: accurate prediction of disordered lipid-binding residues in protein sequences with deep recurrent networks and transfer learning. Bioinformatics 2021; 38:115-124. [PMID: 34487138 DOI: 10.1093/bioinformatics/btab640] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 05/31/2021] [Revised: 08/05/2021] [Accepted: 09/02/2021] [Indexed: 02/03/2023]
Abstract
MOTIVATION Intrinsically disordered protein regions interact with proteins, nucleic acids and lipids. Regions that bind lipids are implicated in a wide spectrum of cellular functions and several human diseases. Motivated by the growing amount of experimental data for these interactions and the lack of tools that can predict them from the protein sequence, we develop DisoLipPred, the first predictor of disordered lipid-binding residues (DLBRs). RESULTS DisoLipPred relies on a deep bidirectional recurrent network that implements three innovative features: transfer learning, a bypass module that sidesteps predictions for putative structured residues, and expanded inputs that cover physicochemical properties associated with protein-lipid interactions. Ablation analysis shows that these features drive the predictive quality of DisoLipPred. Tests on an independent test dataset and the yeast proteome reveal that DisoLipPred generates accurate results and that none of the related existing tools can be used to indirectly identify DLBRs. We also show that DisoLipPred's predictions complement the results generated by predictors of transmembrane regions. Altogether, we conclude that DisoLipPred provides high-quality predictions of DLBRs that complement the currently available methods. AVAILABILITY AND IMPLEMENTATION DisoLipPred's webserver is available at http://biomine.cs.vcu.edu/servers/DisoLipPred/. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Akila Katuwawala
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
| | - Bi Zhao
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
| |
|
181
|
Zhang F, Zhao B, Shi W, Li M, Kurgan L. DeepDISOBind: accurate prediction of RNA-, DNA- and protein-binding intrinsically disordered residues with deep multi-task learning. Brief Bioinform 2021; 23:6461158. [PMID: 34905768 DOI: 10.1093/bib/bbab521] [Citation(s) in RCA: 41] [Impact Index Per Article: 10.3] [Received: 08/02/2021] [Revised: 10/30/2021] [Accepted: 11/14/2021] [Indexed: 12/14/2022]
Abstract
Proteins with intrinsically disordered regions (IDRs) are common among eukaryotes. Many IDRs interact with nucleic acids and proteins. Annotation of these interactions is supported by computational predictors, but to date, only one tool that predicts interactions with nucleic acids was released, and recent assessments demonstrate that current predictors offer modest levels of accuracy. We have developed DeepDISOBind, an innovative deep multi-task architecture that accurately predicts deoxyribonucleic acid (DNA)-, ribonucleic acid (RNA)- and protein-binding IDRs from protein sequences. DeepDISOBind relies on an information-rich sequence profile that is processed by an innovative multi-task deep neural network, where subsequent layers are gradually specialized to predict interactions with specific partner types. The common input layer links to a layer that differentiates protein- and nucleic acid-binding, which further links to layers that discriminate between DNA and RNA interactions. Empirical tests show that this multi-task design provides statistically significant gains in predictive quality across the three partner types when compared to a single-task design and a representative selection of the existing methods that cover both disorder- and structure-trained tools. Analysis of the predictions on the human proteome reveals that DeepDISOBind predictions can be encoded into protein-level propensities that accurately predict DNA- and RNA-binding proteins and protein hubs. DeepDISOBind is available at https://www.csuligroup.com/DeepDISOBind/.
Affiliation(s)
- Fuhao Zhang
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, 410083, China
| | - Bi Zhao
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, 23284, USA
| | - Wenbo Shi
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, 410083, China
| | - Min Li
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, 410083, China
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, 23284, USA
| |
|
182
|
Peng J, Svetec N, Zhao L. Intermolecular interactions drive protein adaptive and co-adaptive evolution at both species and population levels. Mol Biol Evol 2021; 39:6456312. [PMID: 34878126 PMCID: PMC8789070 DOI: 10.1093/molbev/msab350] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/13/2022]
Abstract
Proteins are the building blocks for almost all the functions in cells. Understanding the molecular evolution of proteins and the forces that shape protein evolution is essential in understanding the basis of function and evolution. Previous studies have shown that adaptation frequently occurs at the protein surface, such as in genes involved in host–pathogen interactions. However, it remains unclear whether adaptive sites are distributed randomly or at regions associated with particular structural or functional characteristics across the genome, since many proteins lack structural or functional annotations. Here, we seek to tackle this question by combining large-scale bioinformatic prediction, structural analysis, phylogenetic inference, and population genomic analysis of Drosophila protein-coding genes. We found that protein sequence adaptation is more relevant to function-related rather than structure-related properties. Interestingly, intermolecular interactions contribute significantly to protein adaptation. We further showed that intermolecular interactions, such as physical interactions, may play a role in the coadaptation of fast-adaptive proteins. We found that strongly differentiated amino acids across geographic regions in protein-coding genes are mostly adaptive, which may contribute to the long-term adaptive evolution. This strongly indicates that a number of adaptive sites tend to be repeatedly mutated and selected throughout evolution in the past, present, and maybe future. Our results highlight the important roles of intermolecular interactions and coadaptation in the adaptive evolution of proteins both at the species and population levels.
Affiliation(s)
- Junhui Peng
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, 10065, USA
| | - Nicolas Svetec
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, 10065, USA
| | - Li Zhao
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, 10065, USA
| |
|
183
|
Hassan SS, Lundstrom K, Barh D, Silva RJS, Andrade BS, Azevedo V, Choudhury PP, Palu G, Uhal BD, Kandimalla R, Seyran M, Lal A, Sherchan SP, Azad GK, Aljabali AAA, Brufsky AM, Serrano-Aroca Á, Adadi P, Abd El-Aziz TM, Redwan EM, Takayama K, Rezaei N, Tambuwala M, Uversky VN. Implications derived from S-protein variants of SARS-CoV-2 from six continents. Int J Biol Macromol 2021; 191:934-955. [PMID: 34571123 PMCID: PMC8462006 DOI: 10.1016/j.ijbiomac.2021.09.080] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/15/2021] [Revised: 09/13/2021] [Accepted: 09/13/2021] [Indexed: 01/19/2023]
Abstract
The spike (S) protein is a critical determinant of the infectivity and antigenicity of SARS-CoV-2. Several mutations in the S protein of SARS-CoV-2 have already been detected, and their effect in immune system evasion and enhanced transmission as a cause of increased morbidity and mortality are being investigated. From pathogenic and epidemiological perspectives, S proteins are of prime interest to researchers. This study focused on the unique variants of S proteins from six continents: Asia, Africa, Europe, Oceania, South America, and North America. In comparison to the other five continents, Africa had the highest percentage of unique S proteins (29.1%). The phylogenetic relationship implies that unique S proteins from North America are significantly different from those of the other five continents. They are most likely to spread to the other geographic locations through international travel or naturally by emerging mutations. It is suggested that restriction of international travel should be considered, and massive vaccination as an utmost measure to combat the spread of the COVID-19 pandemic. It is also further suggested that the efficacy of existing vaccines and future vaccine development must be reviewed with careful scrutiny, and if needed, further re-engineered based on requirements dictated by new emerging S protein variants.
Affiliation(s)
- Sk Sarif Hassan
- Department of Mathematics, Pingla Thana Mahavidyalaya, Maligram, Paschim Medinipur 721140, West Bengal, India.
| | | | - Debmalya Barh
- Centre for Genomics and Applied Gene Technology, Institute of Integrative Omics and Applied Biotechnology (IIOAB), Nonakuri, Purba Medinipur, WB, India; Department of Genetics, Ecology and Evolution, Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte 31270-901, Brazil.
| | - Raner José Santana Silva
- Department of Biological Sciences (DCB), Graduate Program in Genetics and Molecular Biology (PPGGBM), State University of Santa Cruz (UESC), Rodovia Ilheus-Itabuna, km 16, 45662-900 Ilheus, BA, Brazil
| | - Bruno Silva Andrade
- Laboratory of Bioinformatics and Computational Chemistry, Department of Biological Sciences, State University of Southwest Bahia (UESB), Jequié 45206-190, Brazil.
| | - Vasco Azevedo
- Laboratório de Genética Celular e Molecular, Departamento de Genética, Ecologia e Evolução, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte CEP 31270-901, Brazil.
| | - Pabitra Pal Choudhury
- Applied Statistics Unit, Indian Statistical Institute, 203 B T Road, Kolkata 700108, India
| | - Giorgio Palu
- Department of Molecular Medicine, University of Padova, Via Gabelli 63, 35121 Padova, Italy.
| | - Bruce D Uhal
- Department of Physiology, Michigan State University, East Lansing, MI 48824, USA
| | - Ramesh Kandimalla
- Applied Biology, CSIR-Indian Institute of Chemical Technology, Uppal Road, Tarnaka, Hyderabad 500007, India; Department of Biochemistry, Kakatiya Medical College, Warangal, Telangana, India
| | - Murat Seyran
- Doctoral Studies in Natural and Technical Sciences (SPL 44), University of Vienna, Währinger Straße, A-1090 Vienna, Austria
| | - Amos Lal
- Division of Pulmonary and Critical Care Medicine, Mayo Clinic, Rochester, MN, USA
| | - Samendra P Sherchan
- Department of Environmental Health Sciences, Tulane University, New Orleans, LA 70112, USA.
| | | | - Alaa A A Aljabali
- Department of Pharmaceutics and Pharmaceutical Technology, Yarmouk University, Faculty of Pharmacy, Irbid 566, Jordan.
| | - Adam M Brufsky
- University of Pittsburgh School of Medicine, Department of Medicine, Division of Hematology/Oncology, UPMC Hillman Cancer Center, Pittsburgh, PA, USA.
| | - Ángel Serrano-Aroca
- Biomaterials and Bioengineering Lab, Centro de Investigación Traslacional San Alberto Magno, Universidad Católica de Valencia San Vicente Mártir, c/Guillem de Castro, 94, 46001 Valencia, Spain.
| | - Parise Adadi
- Department of Food Science, University of Otago, Dunedin 9054, New Zealand
| | - Tarek Mohamed Abd El-Aziz
- Zoology Department, Faculty of Science, Minia University, El-Minia 61519, Egypt; Department of Cellular and Integrative Physiology, University of Texas Health Science Center at San Antonio, San Antonio, TX 78229-3900, USA.
| | - Elrashdy M Redwan
- Faculty of Science, Department of Biological Science, King Abdulaziz University, Jeddah 21589, Saudi Arabia; Therapeutic and Protective Proteins Laboratory, Protein Research Department, Genetic Engineering and Biotechnology Research Institute, City for Scientific Research and Technology Applications, New Borg El-Arab, Alexandria 21934, Egypt.
| | - Kazuo Takayama
- Center for iPS Cell Research and Application (CiRA), Kyoto University, Kyoto 606-8507, Japan.
| | - Nima Rezaei
- Research Center for Immunodeficiencies, Pediatrics Center of Excellence, Children's Medical Center, Tehran University of Medical Sciences, Tehran, Iran; Network of Immunity in Infection, Malignancy and Autoimmunity (NIIMA), Universal Scientific Education and Research Network (USERN), Stockholm, Sweden.
| | - Murtaza Tambuwala
- School of Pharmacy and Pharmaceutical Science, Ulster University, Coleraine BT52 1SA, Northern Ireland, UK.
| | - Vladimir N Uversky
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA; Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Institutskiy pereulok, 9, Dolgoprudny, 141700, Russia.
| |
|
184
|
Harrison PM. fLPS 2.0: rapid annotation of compositionally-biased regions in biological sequences. PeerJ 2021; 9:e12363. [PMID: 34760378 PMCID: PMC8557692 DOI: 10.7717/peerj.12363] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 07/23/2021] [Accepted: 09/30/2021] [Indexed: 12/12/2022]
Abstract
Compositionally-biased (CB) regions in biological sequences are enriched for a subset of sequence residue types. These can be shorter regions with a concentrated bias (i.e., those termed ‘low-complexity’), or longer regions that have a compositional skew. These regions comprise a prominent class of the uncharacterized ‘dark matter’ of the protein universe. Here, I report the latest version of the fLPS package for the annotation of CB regions, which includes added consideration of DNA sequences, to label the eight possible biased regions of DNA. In this version, the user is now able to restrict analysis to a specified subset of residue types, and also to filter for previously annotated domains to enable detection of discontinuous CB regions. A ‘thorough’ option has been added which enables the labelling of subtler biases, typically made from a skew for several residue types. In the output, protein CB regions are now labelled with bias classes reflecting the physico-chemical character of the biasing residues. The fLPS 2.0 package is available from: https://github.com/pmharrison/flps2 or in a Supplemental File of this paper.
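To make the idea of a compositionally biased region concrete, the sketch below scans fixed-length windows and scores the enrichment of one residue type with a binomial tail probability. This is a simplified illustration and not the fLPS 2.0 algorithm; the function names, the window length of 20, and the background frequency of 0.05 are all assumptions chosen for the example.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def most_biased_window(sequence: str, residue: str,
                       window: int = 20, background: float = 0.05):
    """Find the window most enriched for `residue`.

    Returns (start, count, p_value), where p_value is the probability of
    seeing at least `count` occurrences in the window if residues were
    drawn independently at the assumed background frequency.
    """
    best = None
    last_start = max(1, len(sequence) - window + 1)
    for start in range(last_start):
        segment = sequence[start:start + window]
        count = segment.count(residue)
        p_value = binomial_tail(count, len(segment), background)
        if best is None or p_value < best[2]:
            best = (start, count, p_value)
    return best


if __name__ == "__main__":
    # Hypothetical sequence with a glutamine-rich stretch in the middle.
    seq = "ACDEFGHIKL" + "Q" * 20 + "MNPRSTVWYA"
    start, count, p_value = most_biased_window(seq, "Q")
    print(f"start={start} count={count} p={p_value:.2e}")
```

A lower tail probability indicates a stronger bias; short, concentrated biases and longer compositional skews would correspond to different window lengths in this simplified picture.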
Affiliation(s)
- Paul M Harrison
- Department of Biology, McGill University, Montreal, QC, Canada
| |
|
185
|
Dobson L, Tusnády GE. MemDis: Predicting Disordered Regions in Transmembrane Proteins. Int J Mol Sci 2021; 22:12270. [PMID: 34830151 PMCID: PMC8623522 DOI: 10.3390/ijms222212270] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 10/08/2021] [Revised: 11/02/2021] [Accepted: 11/09/2021] [Indexed: 11/16/2022]
Abstract
Transmembrane proteins (TMPs) play important roles in cells, ranging from transport processes and cell adhesion to communication. Many of these functions are mediated by intrinsically disordered regions (IDRs), flexible protein segments without a well-defined structure. Although a variety of methods are available for predicting IDRs, their accuracy is very limited on TMPs due to the special physico-chemical properties of these proteins. We prepared a dataset containing membrane proteins exclusively, using X-ray crystallography data. MemDis is a novel prediction method that uses convolutional neural networks and long short-term memory networks to predict disordered regions in TMPs. In addition to attributes commonly used in IDR predictors, we defined several TMP-specific features to further enhance the accuracy of our method. MemDis achieved the highest prediction accuracy on a TMP-specific dataset among popular IDR prediction methods.
Affiliation(s)
| | - Gábor E. Tusnády
- Institute of Enzymology, Research Centre for Natural Sciences, Magyar Tudósok Körútja 2, 1117 Budapest, Hungary
| |
|
186
|
Tamburrini KC, Terrapon N, Lombard V, Bissaro B, Longhi S, Berrin JG. Bioinformatic Analysis of Lytic Polysaccharide Monooxygenases Reveals the Pan-Families Occurrence of Intrinsically Disordered C-Terminal Extensions. Biomolecules 2021; 11:1632. [PMID: 34827630 PMCID: PMC8615602 DOI: 10.3390/biom11111632] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 09/29/2021] [Revised: 10/26/2021] [Accepted: 10/30/2021] [Indexed: 01/17/2023]
Abstract
Lytic polysaccharide monooxygenases (LPMOs) are monocopper enzymes secreted by many organisms and viruses. LPMOs catalyze the oxidative cleavage of different types of polysaccharides and are today divided into eight families (AA9-11, AA13-17) within the Auxiliary Activity enzyme class of the CAZy database. The minimal architecture of LPMOs encompasses a catalytic domain, to which a carbohydrate-binding module can be appended. Intriguingly, we observed that some LPMO sequences also display a C-terminal extension of varying length not associated with any known function or fold. Here, we analyzed 27,060 sequences from different LPMO families and show that 60% have a C-terminal extension predicted to be intrinsically disordered. Our analysis shows that these disordered C-terminal regions (dCTRs) are widespread in all LPMO families (except AA13) and differ in terms of sequence length and amino-acid composition. Notably, these dCTRs have so far only been observed in LPMOs. LPMO-dCTRs share a common polyampholytic nature and an enrichment in serine and threonine residues, suggesting that they undergo post-translational modifications. Interestingly, dCTRs from AA11 and AA15 are enriched in redox-sensitive, conditionally disordered regions. The widespread occurrence of dCTRs in LPMOs from evolutionarily very divergent organisms hints at a possible functional role and opens new prospects in the field of LPMOs.
Affiliation(s)
- Ketty C. Tamburrini
- Architecture et Fonction des Macromolécules Biologiques (AFMB), Centre National de la Recherche Scientifique (CNRS), Aix-Marseille Université (AMU), UMR 7257, 13288 Marseille, France
- Biodiversité et Biotechnologie Fongiques (BBF), French National Institute for Agriculture, Food, and Environment (INRAE), Aix-Marseille Université (AMU), UMR 1163, 13288 Marseille, France
- Nicolas Terrapon
- Architecture et Fonction des Macromolécules Biologiques (AFMB), Centre National de la Recherche Scientifique (CNRS), Aix-Marseille Université (AMU), UMR 7257, 13288 Marseille, France
- Architecture et Fonction des Macromolécules Biologiques (AFMB), French National Institute for Agriculture, Food, and Environment (INRAE), USC 1408, 13288 Marseille, France
- Vincent Lombard
- Architecture et Fonction des Macromolécules Biologiques (AFMB), Centre National de la Recherche Scientifique (CNRS), Aix-Marseille Université (AMU), UMR 7257, 13288 Marseille, France
- Architecture et Fonction des Macromolécules Biologiques (AFMB), French National Institute for Agriculture, Food, and Environment (INRAE), USC 1408, 13288 Marseille, France
- Bastien Bissaro
- Biodiversité et Biotechnologie Fongiques (BBF), French National Institute for Agriculture, Food, and Environment (INRAE), Aix-Marseille Université (AMU), UMR 1163, 13288 Marseille, France
- Sonia Longhi
- Architecture et Fonction des Macromolécules Biologiques (AFMB), Centre National de la Recherche Scientifique (CNRS), Aix-Marseille Université (AMU), UMR 7257, 13288 Marseille, France
- Jean-Guy Berrin
- Biodiversité et Biotechnologie Fongiques (BBF), French National Institute for Agriculture, Food, and Environment (INRAE), Aix-Marseille Université (AMU), UMR 1163, 13288 Marseille, France
187
Pintado-Grima C, Iglesias V, Santos J, Uversky VN, Ventura S. DispHScan: A Multi-Sequence Web Tool for Predicting Protein Disorder as a Function of pH. Biomolecules 2021; 11:1596. [PMID: 34827596 PMCID: PMC8616002 DOI: 10.3390/biom11111596] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 10/06/2021] [Revised: 10/22/2021] [Accepted: 10/26/2021] [Indexed: 11/16/2022]
Abstract
Proteins are exposed to fluctuating environmental conditions in their cellular context and during their biotechnological production. Disordered regions are susceptible to these fluctuations and may experience solvent-dependent conformational switches that affect their local dynamism and activity. In a recent study, we modeled the influence of pH on the conformational state of IDPs by exploiting a charge-hydrophobicity diagram that considered the effect of solution pH on both variables. However, it was not possible to predict context-dependent transitions for multiple sequences, precluding proteome-wide analysis or the screening of collections of mutants. In this article, we present DispHScan, the first computational tool dedicated to predicting pH-induced disorder-order transitions in large protein datasets. The DispHScan web server allows users to run pH-dependent disorder predictions of multiple sequences and identify context-dependent conformational transitions. It might provide new insights on the role of pH-modulated conditional disorder in the physiology and pathology of different organisms. The DispHScan web server is freely available for academic users; it is platform-independent and does not require previous registration.
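The idea of a charge-hydrophobicity diagram evaluated at a given pH can be sketched as follows. This is not the DispHScan implementation. It is an illustration under stated assumptions: the side-chain pKa values are approximate textbook numbers, hydropathy is the Kyte-Doolittle scale rescaled to 0-1 and kept fixed (whereas the abstract says the tool also adjusts hydrophobicity with pH), termini are ignored, and the boundary line is the classic charge-hydropathy relation ⟨R⟩ = 2.785⟨H⟩ − 1.151. All function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
    "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
    "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
    "Y": -1.3, "V": 4.2,
}
# Assumption: approximate side-chain pKa values; real values depend on context.
PKA_POSITIVE = {"K": 10.5, "R": 12.5, "H": 6.0}
PKA_NEGATIVE = {"D": 3.9, "E": 4.3}


def mean_net_charge(seq: str, ph: float) -> float:
    """Absolute mean net charge per residue (Henderson-Hasselbalch)."""
    total = 0.0
    for aa in seq:
        if aa in PKA_POSITIVE:
            total += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[aa]))
        elif aa in PKA_NEGATIVE:
            total -= 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[aa] - ph))
    return abs(total) / len(seq)


def mean_hydropathy(seq: str) -> float:
    """Mean Kyte-Doolittle hydropathy rescaled to the 0-1 range."""
    return sum((KD[aa] + 4.5) / 9.0 for aa in seq) / len(seq)


def predicted_disordered(seq: str, ph: float) -> bool:
    """True if the sequence lies on the disordered side of the boundary."""
    boundary = 2.785 * mean_hydropathy(seq) - 1.151
    return mean_net_charge(seq, ph) > boundary


# Hypothetical glutamate-rich repeat: charged near neutral pH, neutral in acid.
peptide = "EAEL" * 5
for ph in (2.0, 7.0):
    print(ph, predicted_disordered(peptide, ph))
```

Only the 20 standard residues are handled; any other letter raises a `KeyError`, which is a deliberate simplification.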
Affiliation(s)
- Carlos Pintado-Grima
- Institut de Biotecnologia i Biomedicina, Departament de Bioquímica i Biologia Molecular, Universitat Autònoma de Barcelona, Bellaterra, 08193 Barcelona, Spain
- Valentín Iglesias
- Institut de Biotecnologia i Biomedicina, Departament de Bioquímica i Biologia Molecular, Universitat Autònoma de Barcelona, Bellaterra, 08193 Barcelona, Spain
- Jaime Santos
- Institut de Biotecnologia i Biomedicina, Departament de Bioquímica i Biologia Molecular, Universitat Autònoma de Barcelona, Bellaterra, 08193 Barcelona, Spain
- Vladimir N. Uversky
- Department of Molecular Medicine, USF Health Byrd Alzheimer’s Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA
- Salvador Ventura
- Institut de Biotecnologia i Biomedicina, Departament de Bioquímica i Biologia Molecular, Universitat Autònoma de Barcelona, Bellaterra, 08193 Barcelona, Spain
188
Emenecker RJ, Griffith D, Holehouse AS. Metapredict: a fast, accurate, and easy-to-use predictor of consensus disorder and structure. Biophys J 2021; 120:4312-4319. [PMID: 34480923 PMCID: PMC8553642 DOI: 10.1016/j.bpj.2021.08.039] [Citation(s) in RCA: 128] [Impact Index Per Article: 32.0] [Received: 05/27/2021] [Revised: 08/08/2021] [Accepted: 08/30/2021] [Indexed: 01/02/2023]
Abstract
Intrinsically disordered proteins and protein regions make up a substantial fraction of many proteomes in which they play a wide variety of essential roles. A critical first step in understanding the role of disordered protein regions in biological function is to identify those disordered regions correctly. Computational methods for disorder prediction have emerged as a core set of tools to guide experiments, interpret results, and develop hypotheses. Given the multiple different predictors available, consensus scores have emerged as a popular approach to mitigate biases or limitations of any single method. Consensus scores integrate the outcome of multiple independent disorder predictors and provide a per-residue value that reflects the number of tools that predict a residue to be disordered. Although consensus scores help mitigate the inherent problems of using any single disorder predictor, they are computationally expensive to generate. They also necessitate the installation of multiple different software tools, which can be prohibitively difficult. To address this challenge, we developed a deep-learning-based predictor of consensus disorder scores. Our predictor, metapredict, utilizes a bidirectional recurrent neural network trained on the consensus disorder scores from 12 proteomes. By benchmarking metapredict using two orthogonal approaches, we found that metapredict is among the most accurate disorder predictors currently available. Metapredict is also remarkably fast, enabling proteome-scale disorder prediction in minutes. Importantly, metapredict is fully open source and is distributed as a Python package, a collection of command-line tools, and a web server, maximizing the potential practical utility of the predictor. We believe metapredict offers a convenient, accessible, accurate, and high-performance predictor for single proteins and proteomes alike.
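The consensus score described in this abstract, a per-residue value reflecting how many tools call a residue disordered, can be sketched in a few lines. This is an illustration of the general idea only, not metapredict's code or training data: the function name `consensus_scores`, the 0.5 cutoff applied to every tool, and the three predictor tracks are assumptions.

```python
def consensus_scores(predictions: dict, threshold: float = 0.5) -> list:
    """Fraction of predictors that call each residue disordered.

    predictions maps a predictor name to its per-residue scores in [0, 1].
    Assumption: one shared cutoff is applied to every predictor.
    """
    if not predictions:
        raise ValueError("at least one predictor is required")
    lengths = {len(scores) for scores in predictions.values()}
    if len(lengths) != 1:
        raise ValueError("all predictors must cover the same residues")
    length = lengths.pop()
    n_tools = len(predictions)
    return [
        sum(scores[i] >= threshold for scores in predictions.values()) / n_tools
        for i in range(length)
    ]


# Hypothetical per-residue outputs from three unnamed predictors.
example = {
    "tool_a": [0.9, 0.6, 0.2, 0.1],
    "tool_b": [0.8, 0.4, 0.3, 0.7],
    "tool_c": [0.7, 0.55, 0.1, 0.2],
}
print(consensus_scores(example))
```

Generating such a track requires running every underlying predictor, which is the cost the abstract says a single learned model is meant to avoid.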
Affiliation(s)
- Ryan J Emenecker
- Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, Missouri; Center for Science and Engineering Living Systems (CSELS), St. Louis, Missouri; Center for Engineering Mechanobiology, Washington University, St. Louis, Missouri
- Daniel Griffith
- Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, Missouri; Center for Science and Engineering Living Systems (CSELS), St. Louis, Missouri
- Alex S Holehouse
- Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, Missouri; Center for Science and Engineering Living Systems (CSELS), St. Louis, Missouri.
189
Emenecker RJ, Griffith D, Holehouse AS. Metapredict: a fast, accurate, and easy-to-use predictor of consensus disorder and structure. Biophys J 2021; 120:4312-4319. [PMID: 34480923 DOI: 10.1101/2021.05.30.446349] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/27/2021] [Revised: 08/08/2021] [Accepted: 08/30/2021] [Indexed: 05/28/2023]
190
Pajkos M, Dosztányi Z. Functions of intrinsically disordered proteins through evolutionary lenses. Prog Mol Biol Transl Sci 2021; 183:45-74. [PMID: 34656334 DOI: 10.1016/bs.pmbts.2021.06.017] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 11/29/2022]
Abstract
Protein sequences are the result of an evolutionary process that involves the balancing act of experimenting with novel mutations and selecting out those that have an undesirable functional outcome. In the case of globular proteins, the function relies on a well-defined conformation, therefore, there is a strong evolutionary pressure to preserve the structure. However, different evolutionary rules might apply for the group of intrinsically disordered regions and proteins (IDR/IDPs) that exist as an ensemble of fluctuating conformations. The function of IDRs can directly originate from their disordered state or arise through different types of molecular recognition processes. There is an amazing variety of ways IDRs can carry out their functions, and this is also reflected in their evolutionary properties. In this chapter we give an overview of the different types of evolutionary behavior of disordered proteins and associated functions in normal and disease settings.
Affiliation(s)
- Mátyás Pajkos
- Department of Biochemistry, ELTE Eötvös Loránd University, Budapest, Hungary
- Zsuzsanna Dosztányi
- Department of Biochemistry, ELTE Eötvös Loránd University, Budapest, Hungary.
191
Zhao B, Katuwawala A, Oldfield CJ, Hu G, Wu Z, Uversky VN, Kurgan L. Intrinsic Disorder in Human RNA-Binding Proteins. J Mol Biol 2021; 433:167229. [PMID: 34487791 DOI: 10.1016/j.jmb.2021.167229] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 06/17/2021] [Revised: 08/30/2021] [Accepted: 08/31/2021] [Indexed: 12/24/2022]
Abstract
Although RNA-binding proteins (RBPs) are known to be enriched in intrinsic disorder, no previous analysis has focused on RBPs interacting with specific RNA types. We fill this gap with a comprehensive analysis of the putative disorder in RBPs binding to six common RNA types: messenger RNA (mRNA), transfer RNA (tRNA), small nuclear RNA (snRNA), non-coding RNA (ncRNA), ribosomal RNA (rRNA), and internal ribosome RNA (irRNA). We also analyze the amount of putative intrinsic disorder in the RNA-binding domains (RBDs) and non-RNA-binding-domain regions (non-RBD regions). Consistent with previous studies, we show that in comparison with the human proteome, RBPs are significantly enriched in disorder. However, closer examination finds significant enrichment in predicted disorder for the mRNA-, rRNA- and snRNA-binding proteins, while the proteins that interact with ncRNA and irRNA are not enriched in disorder, and the tRNA-binding proteins are significantly depleted in disorder. We show a consistent pattern of significant disorder enrichment in the non-RBD regions coupled with low levels of disorder in RBDs, which suggests that disorder is relatively rarely utilized in the RNA-binding regions. Our analysis of the non-RBD regions suggests that disorder harbors posttranslational modification sites and is involved in the putative interactions with DNA. Importantly, we utilize experimental data from DisProt and independent data from Pfam to validate the above observations that rely on the disorder predictions. This study provides new insights into the distribution of disorder across proteins that bind different RNA types and the functional role of disorder in the regions where it is enriched.
Affiliation(s)
- Bi Zhao
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
- Akila Katuwawala
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
- Christopher J Oldfield
- Department of Microbiology and Immunology, Virginia Commonwealth University, Richmond, VA 23284, USA
- Gang Hu
- School of Statistics and Data Science, LPMC and KLMDASR, Nankai University, Tianjin 300071, China
- Zhonghua Wu
- School of Mathematical Sciences and LPMC, Nankai University, Tianjin 300071, China
- Vladimir N Uversky
- Department of Molecular Medicine and USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA
- Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA.
192
Morris OM, Torpey JH, Isaacson RL. Intrinsically disordered proteins: modes of binding with emphasis on disordered domains. Open Biol 2021; 11:210222. [PMID: 34610267 PMCID: PMC8492171 DOI: 10.1098/rsob.210222] [Citation(s) in RCA: 46] [Impact Index Per Article: 11.5] [Indexed: 01/04/2023]
Abstract
Our notions of protein function have long been determined by the protein structure-function paradigm. However, the idea that protein function is dictated by a prerequisite complementarity of shapes at the binding interface is becoming increasingly challenged. Interactions involving intrinsically disordered proteins (IDPs) have indicated a significant degree of disorder present in the bound state, ranging from static disorder to complete disorder, termed 'random fuzziness'. This review assesses the anatomy of an IDP and relates how its intrinsic properties permit promiscuity and allow for the various modes of interaction. Furthermore, a mechanistic overview of the types of disordered domains is detailed, while also relating to a recent example and the kinetic and thermodynamic principles governing its formation.
Affiliation(s)
- Owen Michael Morris
- Department of Chemistry, Faculty of Natural, Mathematical and Engineering Sciences, King's College London, Britannia House, 7 Trinity Street, London SE1 1DB, UK
- James Hilary Torpey
- Department of Chemistry, Faculty of Natural, Mathematical and Engineering Sciences, King's College London, Britannia House, 7 Trinity Street, London SE1 1DB, UK
- Rivka Leah Isaacson
- Department of Chemistry, Faculty of Natural, Mathematical and Engineering Sciences, King's College London, Britannia House, 7 Trinity Street, London SE1 1DB, UK
193
Ruff KM, Pappu RV. AlphaFold and Implications for Intrinsically Disordered Proteins. J Mol Biol 2021; 433:167208. [PMID: 34418423 DOI: 10.1016/j.jmb.2021.167208] [Citation(s) in RCA: 313] [Impact Index Per Article: 78.3] [Received: 08/02/2021] [Revised: 08/11/2021] [Accepted: 08/12/2021] [Indexed: 10/20/2022]
Abstract
Accurate predictions of the three-dimensional structures of proteins from their amino acid sequences have come of age. AlphaFold, a deep learning-based approach to protein structure prediction, shows remarkable success in independent assessments of prediction accuracy. A significant epoch in structural bioinformatics was the structural annotation of over 98% of protein sequences in the human proteome. Interestingly, many predictions feature regions of very low confidence, and these regions largely overlap with intrinsically disordered regions (IDRs). That over 30% of regions within the proteome are disordered is congruent with estimates that have been made over the past two decades, as intense efforts have been undertaken to generalize the structure-function paradigm to include the importance of conformational heterogeneity and dynamics. With structural annotations from AlphaFold in hand, there is the temptation to draw inferences regarding the "structures" of IDRs and their interactomes. Here, we offer a cautionary note regarding the misinterpretations that might ensue and highlight efforts that provide concrete understanding of sequence-ensemble-function relationships of IDRs. This perspective is intended to emphasize the importance of IDRs in sequence-function relationships (SERs) and to highlight how one might go about extracting quantitative SERs to make sense of how IDRs function.
Affiliation(s)
- Kiersten M Ruff
- Department of Biomedical Engineering and Center for Science & Engineering of Living Systems (CSELS), Washington University in St. Louis, Campus Box 1097, St. Louis, MO 63130, USA
- Rohit V Pappu
- Department of Biomedical Engineering and Center for Science & Engineering of Living Systems (CSELS), Washington University in St. Louis, Campus Box 1097, St. Louis, MO 63130, USA.
194
Bondos SE, Dunker AK, Uversky VN. On the roles of intrinsically disordered proteins and regions in cell communication and signaling. Cell Commun Signal 2021; 19:88. [PMID: 34461937 PMCID: PMC8404256 DOI: 10.1186/s12964-021-00774-3] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Indexed: 12/20/2022]
Abstract
For proteins, the sequence → structure → function paradigm applies primarily to enzymes, transmembrane proteins, and signaling domains. This paradigm is not universal, but rather, in addition to structured proteins, intrinsically disordered proteins and regions (IDPs and IDRs) also carry out crucial biological functions. For these proteins, the sequence → IDP/IDR ensemble → function paradigm applies primarily to signaling and regulatory proteins and regions. Often, in order to carry out function, IDPs or IDRs cooperatively interact, either intra- or inter-molecularly, with structured proteins or other IDPs or intermolecularly with nucleic acids. In this IDP/IDR thematic collection published in Cell Communication and Signaling, thirteen articles are presented that describe IDP/IDR signaling molecules from a variety of organisms from humans to fruit flies and tardigrades ("water bears") and that describe how these proteins and regions contribute to the function and regulation of cell signaling. Collectively, these papers exhibit the diverse roles of disorder in responding to a wide range of signals so as to orchestrate an array of organismal processes. They also show that disorder contributes to signaling in a broad spectrum of species, ranging from micro-organisms to plants and animals.
Affiliation(s)
- Sarah E Bondos
- Department of Molecular and Cellular Medicine, Texas A&M Health Science Center, College Station, TX, 77843, USA.
- A Keith Dunker
- Center for Computational Biology and Bioinformatics, Department of Biochemistry and Molecular Biology, Indiana University School of Medicine, Indianapolis, IN, 46202, USA.
- Vladimir N Uversky
- Department of Molecular Medicine and USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, USA.
- Institute for Biological Instrumentation of the Russian Academy of Sciences, Federal Research Center "Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences", Pushchino, Russia.
195
Hołubowicz R, Ożyhar A, Dobryszycki P. Natural Mutations Affect Structure and Function of gC1q Domain of Otolin-1. Int J Mol Sci 2021; 22:9085. [PMID: 34445792 PMCID: PMC8396674 DOI: 10.3390/ijms22169085] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 07/27/2021] [Revised: 08/13/2021] [Accepted: 08/19/2021] [Indexed: 12/29/2022]
Abstract
Otolin-1 is a scaffold protein of otoliths and otoconia, calcium carbonate biominerals from the inner ear. It contains a gC1q domain responsible for trimerization and binding of Ca2+. Knowledge of a structure-function relationship of gC1q domain of otolin-1 is crucial for understanding the biology of balance sensing. Here, we show how natural variants alter the structure of gC1q otolin-1 and how Ca2+ are able to revert some effects of the mutations. We discovered that natural substitutions: R339S, R342W and R402P negatively affect the stability of apo-gC1q otolin-1, and that Q426R has a stabilizing effect. In the presence of Ca2+, R342W and Q426R were stabilized at higher Ca2+ concentrations than the wild-type form, and R402P was completely insensitive to Ca2+. The mutations affected the self-association of gC1q otolin-1 by inducing detrimental aggregation (R342W) or disabling the trimerization (R402P) of the protein. Our results indicate that the natural variants of gC1q otolin-1 may have a potential to cause pathological changes in otoconia and otoconial membrane, which could affect sensing of balance and increase the probability of occurrence of benign paroxysmal positional vertigo (BPPV).
Affiliation(s)
- Rafał Hołubowicz
- Correspondence: (R.H.); (P.D.); Tel.: +48-71-320-63-34 (R.H.); +48-71-320-63-32 (P.D.)
- Piotr Dobryszycki
- Correspondence: (R.H.); (P.D.); Tel.: +48-71-320-63-34 (R.H.); +48-71-320-63-32 (P.D.)
196
Tunyasuvunakool K, Adler J, Wu Z, Green T, Zielinski M, Žídek A, Bridgland A, Cowie A, Meyer C, Laydon A, Velankar S, Kleywegt GJ, Bateman A, Evans R, Pritzel A, Figurnov M, Ronneberger O, Bates R, Kohl SAA, Potapenko A, Ballard AJ, Romera-Paredes B, Nikolov S, Jain R, Clancy E, Reiman D, Petersen S, Senior AW, Kavukcuoglu K, Birney E, Kohli P, Jumper J, Hassabis D. Highly accurate protein structure prediction for the human proteome. Nature 2021; 596:590-596. [PMID: 34293799 PMCID: PMC8387240 DOI: 10.1038/s41586-021-03828-1] [Citation(s) in RCA: 1756] [Impact Index Per Article: 439.0] [Received: 05/11/2021] [Accepted: 07/16/2021] [Indexed: 02/07/2023]
Abstract
Protein structures can provide invaluable information, both for reasoning about biological processes and for enabling interventions such as structure-based drug development or targeted mutagenesis. After decades of effort, 17% of the total residues in human protein sequences are covered by an experimentally determined structure1. Here we markedly expand the structural coverage of the proteome by applying the state-of-the-art machine learning method, AlphaFold2, at a scale that covers almost the entire human proteome (98.5% of human proteins). The resulting dataset covers 58% of residues with a confident prediction, of which a subset (36% of all residues) have very high confidence. We introduce several metrics developed by building on the AlphaFold model and use them to interpret the dataset, identifying strong multi-domain predictions as well as regions that are likely to be disordered. Finally, we provide some case studies to illustrate how high-quality predictions could be used to generate biological hypotheses. We are making our predictions freely available to the community and anticipate that routine large-scale and high-accuracy structure prediction will become an important tool that will allow new questions to be addressed from a structural perspective.
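The coverage figures in this abstract are fractions of residues above per-residue confidence cutoffs, and low-confidence stretches are the ones flagged as likely disordered. A minimal sketch of both calculations follows. The cutoffs used here (70 for "confident", 90 for "very high", below 50 for candidate disorder) are the commonly quoted pLDDT bands and are stated as assumptions, as are the function names and the example values; none of this is the authors' code.

```python
def confidence_summary(plddt: list) -> tuple:
    """Return (fraction confident, fraction very high confidence).

    Assumption: pLDDT >= 70 counts as confident and >= 90 as very high.
    """
    n = len(plddt)
    if n == 0:
        raise ValueError("empty pLDDT list")
    confident = sum(value >= 70 for value in plddt)
    very_high = sum(value >= 90 for value in plddt)
    return confident / n, very_high / n


def low_confidence_segments(plddt: list, cutoff: float = 50.0,
                            min_len: int = 3) -> list:
    """Return (start, end) index pairs of runs with pLDDT below cutoff.

    Indices are 0-based and inclusive; runs shorter than min_len are dropped.
    """
    segments = []
    start = None
    for i, value in enumerate(plddt):
        if value < cutoff:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                segments.append((start, i - 1))
            start = None
    if start is not None and len(plddt) - start >= min_len:
        segments.append((start, len(plddt) - 1))
    return segments


# Hypothetical per-residue pLDDT values for a ten-residue toy protein.
scores = [95, 92, 88, 75, 45, 40, 30, 35, 72, 91]
print(confidence_summary(scores))
print(low_confidence_segments(scores))
```

A low-confidence run is only a candidate disordered region; low pLDDT can also reflect a region that folds only in a complex or one that is poorly covered by homologous sequences.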
Affiliation(s)
- Sameer Velankar
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
- Gerard J Kleywegt
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
- Alex Bateman
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
- Ewan Birney
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
197
Hu G, Katuwawala A, Wang K, Wu Z, Ghadermarzi S, Gao J, Kurgan L. flDPnn: Accurate intrinsic disorder prediction with putative propensities of disorder functions. Nat Commun 2021; 12:4438. [PMID: 34290238 PMCID: PMC8295265 DOI: 10.1038/s41467-021-24773-7] [Citation(s) in RCA: 182] [Impact Index Per Article: 45.5] [Received: 04/07/2021] [Accepted: 07/06/2021] [Indexed: 01/05/2023]
Abstract
Identification of intrinsic disorder in proteins relies in large part on computational predictors, which demands that their accuracy should be high. Since intrinsic disorder carries out a broad range of cellular functions, it is desirable to couple the disorder and disorder function predictions. We report a computational tool, flDPnn, that provides accurate, fast and comprehensive disorder and disorder function predictions from protein sequences. The recent Critical Assessment of protein Intrinsic Disorder prediction (CAID) experiment and results on other test datasets demonstrate that flDPnn offers accurate predictions of disorder, fully disordered proteins and four common disorder functions. These predictions are substantially better than the results of the existing disorder predictors and methods that predict functions of disorder. Ablation tests reveal that the high predictive performance stems from innovative ways used in flDPnn to derive sequence profiles and encode inputs. flDPnn's webserver is available at http://biomine.cs.vcu.edu/servers/flDPnn/.
Affiliation(s)
- Gang Hu
- School of Statistics and Data Science, LPMC and KLMDASR, Nankai University, Tianjin, China
- Akila Katuwawala
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
- Kui Wang
- School of Mathematical Sciences and LPMC, Nankai University, Tianjin, China
- Zhonghua Wu
- School of Mathematical Sciences and LPMC, Nankai University, Tianjin, China
- Sina Ghadermarzi
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
- Jianzhao Gao
- School of Mathematical Sciences and LPMC, Nankai University, Tianjin, China
- Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA.
198
Kursula P. Small-angle X-ray scattering for the proteomics community: current overview and future potential. Expert Rev Proteomics 2021; 18:415-422. [PMID: 34210208 DOI: 10.1080/14789450.2021.1951242] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/21/2022]
Abstract
Introduction: Proteins are biological nanoparticles. For structural proteomics and hybrid structural biology, complementary methods are required that allow both high throughput and accurate automated data analysis. Small-angle X-ray scattering (SAXS) is a method for observing the size and shape of particles, such as proteins and complexes, in solution. SAXS data can be used to model both the structure, oligomeric state, conformational changes, and flexibility of biomolecular samples. Areas covered: The key principles of SAXS, its sample requirements, and its current and future applications for structural proteomics are briefly reviewed. Recent technical developments in SAXS experiments are discussed, and future potential of the method in structural proteomics is evaluated. Expert opinion: SAXS is a method suitable for several aspects of integrative structural proteomics, with current technical developments allowing for higher throughput and time-resolved studies, as well as the analysis of complex samples, such as membrane proteins. Increasing automation and streamlined data analysis are expected to equip SAXS for structure-based screening workflows. Originally, structural genomics had a heavy focus on folded, crystallizable proteins and complexes - SAXS is a method allowing an expansion of this focus to flexible and disordered systems.
Affiliation(s)
- Petri Kursula
- Department of Biomedicine, University of Bergen, Bergen, Norway; Biocenter Oulu & Faculty of Biochemistry and Molecular Medicine, University of Oulu, Oulu, Finland
199
Affiliation(s)
- Benjamin Lang
- Department of Structural Biology and the Center for Data Driven Discovery, St. Jude Children's Research Hospital, Memphis, TN, USA.
| | - M Madan Babu
- Department of Structural Biology and the Center for Data Driven Discovery, St. Jude Children's Research Hospital, Memphis, TN, USA.
200
|
Erdős G, Pajkos M, Dosztányi Z. IUPred3: prediction of protein disorder enhanced with unambiguous experimental annotation and visualization of evolutionary conservation. Nucleic Acids Res 2021; 49:W297-W303. [PMID: 34048569 PMCID: PMC8262696 DOI: 10.1093/nar/gkab408] [Citation(s) in RCA: 316] [Impact Index Per Article: 79.0] [Received: 03/15/2021] [Revised: 04/21/2021] [Accepted: 05/14/2021] [Indexed: 12/22/2022]
Abstract
Intrinsically disordered proteins and protein regions (IDPs/IDRs) exist without a single well-defined conformation. They carry out important biological functions with multifaceted roles, which is also reflected in their evolutionary behavior. Computational methods play important roles in the characterization of IDRs. One of the commonly used disorder prediction methods is IUPred, which relies on an energy estimation approach. The IUPred web server takes an amino acid sequence or a UniProt ID/accession as input and predicts the tendency for each amino acid to be in a disordered region, with an option to also predict context-dependent disordered regions. In this new iteration of IUPred, we added multiple novel features to enhance the prediction capabilities of the server. First, learning from the latest evaluation of disorder prediction methods, we introduced multiple new smoothing functions to the prediction that decrease noise and increase the performance of the predictions. We constructed a dataset consisting of experimentally verified ordered/disordered regions with unambiguous annotations, which were added to the prediction. We also introduced a novel tool that enables the exploration of the evolutionary conservation of protein disorder coupled to sequence conservation in model organisms. The web server is freely available to users and accessible at https://iupred3.elte.hu.
Affiliation(s)
- Gábor Erdős
- Department of Biochemistry, Eötvös Loránd University, Pázmány Péter stny 1/c, Budapest H-1117, Hungary
- Mátyás Pajkos
- Department of Biochemistry, Eötvös Loránd University, Pázmány Péter stny 1/c, Budapest H-1117, Hungary
- Zsuzsanna Dosztányi
- Department of Biochemistry, Eötvös Loránd University, Pázmány Péter stny 1/c, Budapest H-1117, Hungary