1
Basu S, Zhao B, Biró B, Faraggi E, Gsponer J, Hu G, Kloczkowski A, Malhis N, Mirdita M, Söding J, Steinegger M, Wang D, Wang K, Xu D, Zhang J, Kurgan L. DescribePROT in 2023: more, higher-quality and experimental annotations and improved data download options. Nucleic Acids Res 2024; 52:D426-D433. [PMID: 37933852 PMCID: PMC10767971 DOI: 10.1093/nar/gkad985] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2023] [Revised: 10/12/2023] [Accepted: 10/16/2023] [Indexed: 11/08/2023] Open
Abstract
The DescribePROT database of amino acid-level descriptors of protein structures and functions has been substantially expanded since its release in 2020. This expansion includes a substantial increase in the size, scope, and quality of the underlying data, the addition of experimental structural information, the inclusion of new data download options, and an upgraded graphical interface. DescribePROT currently covers 19 structural and functional descriptors for proteins in 273 reference proteomes generated by 11 accurate and complementary predictive tools. Users can search our resource in multiple ways, interact with the data using the graphical interface, and download data at various scales including individual proteins, entire proteomes, and the whole database. The annotations in DescribePROT are useful for a broad spectrum of studies that include investigations of protein structure and function, development and validation of predictive tools, and efforts to understand the molecular underpinnings of diseases and to develop therapeutics. DescribePROT can be freely accessed at http://biomine.cs.vcu.edu/servers/DESCRIBEPROT/.
Affiliation(s)
- Sushmita Basu
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
- Bi Zhao
  - Genomics Program, College of Public Health, University of South Florida, Tampa, FL, USA
- Bálint Biró
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
  - Department of Animal Biotechnology, Hungarian University of Agriculture and Life Sciences, Gödöllő, Hungary
- Eshel Faraggi
  - Physics Department, Indiana University, Indianapolis, IN, USA
- Jörg Gsponer
  - Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
- Gang Hu
  - School of Statistics and Data Science, LPMC and KLMDASR, Nankai University, Tianjin, P.R. China
- Andrzej Kloczkowski
  - The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, USA
- Nawar Malhis
  - Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
- Milot Mirdita
  - School of Biological Sciences, Seoul National University, Seoul, Republic of Korea
- Johannes Söding
  - Quantitative and Computational Biology, Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
- Martin Steinegger
  - School of Biological Sciences, Seoul National University, Seoul, Republic of Korea
  - Institute of Molecular Biology & Genetics, Seoul National University, Seoul, Republic of Korea
  - Artificial Intelligence Institute, Seoul National University, Seoul, South Korea
- Duolin Wang
  - Department of Electrical Engineering and Computer Science, Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, USA
- Kui Wang
  - School of Statistics and Data Science, LPMC and KLMDASR, Nankai University, Tianjin, P.R. China
- Dong Xu
  - Department of Electrical Engineering and Computer Science, Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, USA
- Jian Zhang
  - School of Computer and Information Technology, Xinyang Normal University, Xinyang, P.R. China
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA

2
Chakraborty A, Hussain A, Sabnam N. Uncovering the structural stability of Magnaporthe oryzae effectors: a secretome-wide in silico analysis. J Biomol Struct Dyn 2023:1-22. [PMID: 38109060 DOI: 10.1080/07391102.2023.2292795] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2023] [Accepted: 11/23/2023] [Indexed: 12/19/2023]
Abstract
Rice blast, caused by the ascomycete fungus Magnaporthe oryzae, is a deadly disease and a major threat to global food security. The pathogen secretes small proteinaceous effectors, virulence factors, inside the host to manipulate and perturb the host immune system, allowing the pathogen to colonize and establish a successful infection. While the molecular functions of several effectors are characterized, very little is known about the structural stability of these effectors. We analyzed a total of 554 small secretory proteins (SSPs) from the M. oryzae secretome to decipher key features of intrinsic disorder (ID) and the structural dynamics of the selected putative effectors through thorough and systematic in silico studies. Our results suggest that out of the total SSPs, 66% were predicted as effector proteins, released either into the apoplast or cytoplasm of the host cell. Of these, 68% were found to be intrinsically disordered effector proteins (IDEPs). Among the six distinct classes of disordered effectors, we observed peculiar relationships between the localization of several effectors in the apoplast or cytoplasm and the degree of disorder. We determined the degree of structural disorder and its impact on protein foldability across all the putative small secretory effector proteins from the blast pathogen, further validated by molecular dynamics simulation studies. This study provides definite clues toward unraveling the mystery behind the importance of structural distortions in effectors and their impact on plant-pathogen interactions. The study of these dynamical segments may help identify new effectors as well. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Afzal Hussain
  - Department of Bioinformatics, Maulana Azad National Institute of Technology, Bhopal, India
- Nazmiara Sabnam
  - Department of Life Sciences, Presidency University, Kolkata, India

3
Longfield SF, Mollazade M, Wallis TP, Gormal RS, Joensuu M, Wark JR, van Waardenberg AJ, Small C, Graham ME, Meunier FA, Martínez-Mármol R. Tau forms synaptic nano-biomolecular condensates controlling the dynamic clustering of recycling synaptic vesicles. Nat Commun 2023; 14:7277. [PMID: 37949856 PMCID: PMC10638352 DOI: 10.1038/s41467-023-43130-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/02/2022] [Accepted: 11/01/2023] [Indexed: 11/12/2023] Open
Abstract
Neuronal communication relies on the release of neurotransmitters from various populations of synaptic vesicles. Despite displaying vastly different release probabilities and mobilities, the reserve and recycling pools of vesicles co-exist within a single cluster, suggesting that small synaptic biomolecular condensates could regulate their nanoscale distribution. Here, we performed a large-scale activity-dependent phosphoproteome analysis of hippocampal neurons in vitro and identified Tau as a highly phosphorylated and disordered candidate protein. Single-molecule super-resolution microscopy revealed that Tau undergoes liquid-liquid phase separation to generate presynaptic nanoclusters whose density and number are regulated by activity. This activity-dependent diffusion process allows Tau to translocate into the presynapse, where it forms biomolecular condensates to selectively control the mobility of recycling vesicles. Tau, therefore, forms presynaptic nano-biomolecular condensates that regulate the nanoscale organization of synaptic vesicles in an activity-dependent manner.
Affiliation(s)
- Shanley F Longfield
  - Clem Jones Centre for Ageing Dementia Research (CJCADR), Queensland Brain Institute (QBI), The University of Queensland; St Lucia Campus, Brisbane, QLD, 4072, Australia
- Mahdie Mollazade
  - Clem Jones Centre for Ageing Dementia Research (CJCADR), Queensland Brain Institute (QBI), The University of Queensland; St Lucia Campus, Brisbane, QLD, 4072, Australia
- Tristan P Wallis
  - Clem Jones Centre for Ageing Dementia Research (CJCADR), Queensland Brain Institute (QBI), The University of Queensland; St Lucia Campus, Brisbane, QLD, 4072, Australia
- Rachel S Gormal
  - Clem Jones Centre for Ageing Dementia Research (CJCADR), Queensland Brain Institute (QBI), The University of Queensland; St Lucia Campus, Brisbane, QLD, 4072, Australia
- Merja Joensuu
  - Clem Jones Centre for Ageing Dementia Research (CJCADR), Queensland Brain Institute (QBI), The University of Queensland; St Lucia Campus, Brisbane, QLD, 4072, Australia
  - Australian Institute for Bioengineering and Nanotechnology, The University of Queensland; St Lucia Campus, Brisbane, QLD, 4072, Australia
- Jesse R Wark
  - Synapse Proteomics, Children's Medical Research Institute (CMRI), The University of Sydney, 214 Hawkesbury Road, Westmead, NSW, 2145, Australia
- Christopher Small
  - Clem Jones Centre for Ageing Dementia Research (CJCADR), Queensland Brain Institute (QBI), The University of Queensland; St Lucia Campus, Brisbane, QLD, 4072, Australia
- Mark E Graham
  - Synapse Proteomics, Children's Medical Research Institute (CMRI), The University of Sydney, 214 Hawkesbury Road, Westmead, NSW, 2145, Australia
- Frédéric A Meunier
  - Clem Jones Centre for Ageing Dementia Research (CJCADR), Queensland Brain Institute (QBI), The University of Queensland; St Lucia Campus, Brisbane, QLD, 4072, Australia
  - School of Biomedical Science, The University of Queensland; St Lucia Campus, Brisbane, QLD, 4072, Australia
- Ramón Martínez-Mármol
  - Clem Jones Centre for Ageing Dementia Research (CJCADR), Queensland Brain Institute (QBI), The University of Queensland; St Lucia Campus, Brisbane, QLD, 4072, Australia

4
Kurgan L, Hu G, Wang K, Ghadermarzi S, Zhao B, Malhis N, Erdős G, Gsponer J, Uversky VN, Dosztányi Z. Tutorial: a guide for the selection of fast and accurate computational tools for the prediction of intrinsic disorder in proteins. Nat Protoc 2023; 18:3157-3172. [PMID: 37740110 DOI: 10.1038/s41596-023-00876-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/13/2022] [Accepted: 06/21/2023] [Indexed: 09/24/2023]
Abstract
Intrinsic disorder is instrumental for a wide range of protein functions, and its analysis, using computational predictions from primary structures, complements secondary and tertiary structure-based approaches. In this Tutorial, we provide an overview and comparison of 23 publicly available computational tools with complementary parameters useful for intrinsic disorder prediction, partly relying on results from the Critical Assessment of protein Intrinsic Disorder prediction experiment. We consider factors such as accuracy, runtime, availability and the need for functional insights. The selected tools are available as web servers and downloadable programs, offer state-of-the-art predictions and can be used in a high-throughput manner. We provide examples and instructions for the selected tools to illustrate practical aspects related to the submission, collection and interpretation of predictions, as well as the timing and their limitations. We highlight two predictors for intrinsically disordered proteins, flDPnn as accurate and fast and IUPred as very fast and moderately accurate, while suggesting ANCHOR2 and MoRFchibi as two of the best-performing predictors for intrinsically disordered region binding. We link these tools to additional resources, including databases of predictions and web servers that integrate multiple predictive methods. Altogether, this Tutorial provides a hands-on guide to comparatively evaluating multiple predictors, submitting and collecting their own predictions, and reading and interpreting results. It is suitable for experimentalists and computational biologists interested in accurately and conveniently identifying intrinsic disorder, facilitating the functional characterization of the rapidly growing collections of protein sequences.
Affiliation(s)
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
- Gang Hu
  - School of Statistics and Data Science, LPMC and KLMDASR, Nankai University, Tianjin, China
- Kui Wang
  - School of Statistics and Data Science, LPMC and KLMDASR, Nankai University, Tianjin, China
- Sina Ghadermarzi
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
- Bi Zhao
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
- Nawar Malhis
  - Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
- Gábor Erdős
  - MTA-ELTE Momentum Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Budapest, Hungary
- Jörg Gsponer
  - Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
- Vladimir N Uversky
  - Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
  - Byrd Alzheimer's Center and Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
- Zsuzsanna Dosztányi
  - MTA-ELTE Momentum Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Budapest, Hungary

5
Pang Y, Liu B. IDP-LM: Prediction of protein intrinsic disorder and disorder functions based on language models. PLoS Comput Biol 2023; 19:e1011657. [PMID: 37992088 DOI: 10.1371/journal.pcbi.1011657] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/31/2022] [Revised: 12/06/2023] [Accepted: 11/03/2023] [Indexed: 11/24/2023] Open
Abstract
Intrinsically disordered proteins (IDPs) and regions (IDRs) are a class of functionally important proteins and regions that lack stable three-dimensional structures under native physiologic conditions. They participate in critical biological processes and thus are associated with the pathogenesis of many severe human diseases. Identifying the IDPs/IDRs and their functions will be helpful for a comprehensive understanding of protein structures and functions, and inform studies of rational drug design. Over the past decades, the exponential growth in the number of proteins with sequence information has deepened the gap between uncharacterized and annotated disordered sequences. Protein language models have recently demonstrated their powerful abilities to capture complex structural and functional information from the enormous quantity of unlabelled protein sequences, providing opportunities to apply protein language models to uncover intrinsic disorder and its biological properties from the amino acid sequences. In this study, we proposed a computational predictor called IDP-LM for predicting intrinsic disorder and disorder functions by leveraging the pre-trained protein language models. IDP-LM takes the embeddings extracted from three pre-trained protein language models as the exclusive inputs, including ProtBERT, ProtT5 and a disorder specific language model (IDP-BERT). The ablation analysis showed that IDP-BERT provided fine-grained feature representations of disorder, and that the combination of three language models is the key to the performance improvement of IDP-LM. The evaluation results on independent test datasets demonstrated that IDP-LM provided high-quality prediction results for intrinsic disorder and four common disordered functions.
Affiliation(s)
- Yihe Pang
  - School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
- Bin Liu
  - School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
  - Advanced Research Institute of Multidisciplinary Science, Beijing Institute of Technology, Beijing, China

6
Velasco-Carneros L, Cuéllar J, Dublang L, Santiago C, Maréchal JD, Martín-Benito J, Maestro M, Fernández-Higuero JÁ, Orozco N, Moro F, Valpuesta JM, Muga A. The self-association equilibrium of DNAJA2 regulates its interaction with unfolded substrate proteins and with Hsc70. Nat Commun 2023; 14:5436. [PMID: 37670029 PMCID: PMC10480186 DOI: 10.1038/s41467-023-41150-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2022] [Accepted: 08/24/2023] [Indexed: 09/07/2023] Open
Abstract
J-domain proteins tune the specificity of Hsp70s, engaging them in precise functions. Despite their essential role, the structure and function of many J-domain proteins remain largely unknown. We explore human DNAJA2, finding that it reversibly forms highly-ordered, tubular structures that can be dissociated by Hsc70, the constitutively expressed Hsp70 isoform. Cryoelectron microscopy and mutational studies reveal that different domains are involved in self-association. Oligomer dissociation into dimers potentiates its interaction with unfolded client proteins. The J-domains are accessible to Hsc70 within the tubular structure. They allow binding of closely spaced Hsc70 molecules that could be transferred to the unfolded substrate for its cooperative remodelling, explaining the efficient recovery of DNAJA2-bound clients. The disordered C-terminal domain, comprising the last 52 residues, regulates its holding activity and productive interaction with Hsc70. These in vitro findings suggest that the association equilibrium of DNAJA2 could regulate its interaction with client proteins and Hsc70.
Affiliation(s)
- Lorea Velasco-Carneros
  - Biofisika Institute (CSIC, UPV/EHU), University of the Basque Country, 48940, Leioa, Spain
  - Department of Biochemistry and Molecular Biology, Faculty of Science and Technology, University of the Basque Country (UPV/EHU), 48940, Leioa, Spain
- Jorge Cuéllar
  - Department of Macromolecular Structure, National Centre for Biotechnology (CNB-CSIC), 28049, Madrid, Spain
- Leire Dublang
  - Biofisika Institute (CSIC, UPV/EHU), University of the Basque Country, 48940, Leioa, Spain
  - Department of Biochemistry and Molecular Biology, Faculty of Science and Technology, University of the Basque Country (UPV/EHU), 48940, Leioa, Spain
- César Santiago
  - Department of Macromolecular Structure, National Centre for Biotechnology (CNB-CSIC), 28049, Madrid, Spain
- Jean-Didier Maréchal
  - Insilichem, Departament de Química, Universitat Autònoma de Barcelona, (UAB), 08193, Bellaterra (Barcelona), Spain
- Jaime Martín-Benito
  - Department of Macromolecular Structure, National Centre for Biotechnology (CNB-CSIC), 28049, Madrid, Spain
- Moisés Maestro
  - Department of Macromolecular Structure, National Centre for Biotechnology (CNB-CSIC), 28049, Madrid, Spain
- José Ángel Fernández-Higuero
  - Biofisika Institute (CSIC, UPV/EHU), University of the Basque Country, 48940, Leioa, Spain
  - Department of Biochemistry and Molecular Biology, Faculty of Science and Technology, University of the Basque Country (UPV/EHU), 48940, Leioa, Spain
- Natalia Orozco
  - Biofisika Institute (CSIC, UPV/EHU), University of the Basque Country, 48940, Leioa, Spain
- Fernando Moro
  - Biofisika Institute (CSIC, UPV/EHU), University of the Basque Country, 48940, Leioa, Spain
  - Department of Biochemistry and Molecular Biology, Faculty of Science and Technology, University of the Basque Country (UPV/EHU), 48940, Leioa, Spain
- José María Valpuesta
  - Department of Macromolecular Structure, National Centre for Biotechnology (CNB-CSIC), 28049, Madrid, Spain
- Arturo Muga
  - Biofisika Institute (CSIC, UPV/EHU), University of the Basque Country, 48940, Leioa, Spain
  - Department of Biochemistry and Molecular Biology, Faculty of Science and Technology, University of the Basque Country (UPV/EHU), 48940, Leioa, Spain

7
Arrías PN, Monzon AM, Clementel D, Mozaffari S, Piovesan D, Kajava AV, Tosatto SCE. The repetitive structure of DNA clamps: An overlooked protein tandem repeat. J Struct Biol 2023; 215:108001. [PMID: 37467824 DOI: 10.1016/j.jsb.2023.108001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2023] [Revised: 07/12/2023] [Accepted: 07/16/2023] [Indexed: 07/21/2023]
Abstract
Structured tandem repeats proteins (STRPs) are a specific kind of tandem repeat proteins characterized by a modular and repetitive three-dimensional structure arrangement. The majority of STRPs adopt solenoid structures, but with the increasing availability of experimental structures and high-quality predicted structural models, more STRP folds can be characterized. Here, we describe "Box repeats", an overlooked STRP fold present in the DNA sliding clamp processivity factors, which has eluded classification although structural data has been available since the late 1990s. Each Box repeat is a βαβββ module of about 60 residues, which forms a class V "beads-on-a-string" type STRP. The number of repeats present in processivity factors is organism dependent. Monomers of PCNA proteins in both Archaea and Eukarya have 4 repeats, while the monomers of bacterial beta-sliding clamps have 6 repeats. This new repeat fold has been added to the RepeatsDB database, which now provides structural annotation for 66 Box repeat proteins belonging to different organisms, including viruses.
Affiliation(s)
- Paula Nazarena Arrías
  - Department of Biomedical Sciences, University of Padova, via U. Bassi 58/b, 35121 Padova, Italy
- Alexander Miguel Monzon
  - Department of Information Engineering, University of Padova, via Giovanni Gradenigo 6/B, 35131 Padova, Italy
- Damiano Clementel
  - Department of Biomedical Sciences, University of Padova, via U. Bassi 58/b, 35121 Padova, Italy
- Soroush Mozaffari
  - Department of Biomedical Sciences, University of Padova, via U. Bassi 58/b, 35121 Padova, Italy
- Damiano Piovesan
  - Department of Biomedical Sciences, University of Padova, via U. Bassi 58/b, 35121 Padova, Italy
- Andrey V Kajava
  - Centre de Recherche en Biologie cellulaire de Montpellier (CRBM), UMR 5237 CNRS, Université Montpellier, 1919 Route de Mende, Cedex 5, 34293 Montpellier, France
- Silvio C E Tosatto
  - Department of Biomedical Sciences, University of Padova, via U. Bassi 58/b, 35121 Padova, Italy

8
Ullah MI, Rehman Z, Dad R, Alsrhani A, Shakil M, Ghanem HB, Alameen AAM, Elsadek MF, Eltayeb LB, Ullah S, Atif M. Identification and Functional Characterization of Mutation in FYCO1 in Families with Congenital Cataract. Life (Basel) 2023; 13:1788. [PMID: 37629644 PMCID: PMC10456301 DOI: 10.3390/life13081788] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Revised: 08/10/2023] [Accepted: 08/18/2023] [Indexed: 08/27/2023] Open
Abstract
Congenital cataract (CC) causes a third of the cases of treatable childhood blindness worldwide. CC is a disorder of the crystalline lens which is established as clinically divergent and has complex heterogeneity. This study aimed to determine the genetic basis of CC. Whole blood was obtained from four consanguineous families with CC. Genomic DNA was extracted from the blood, and the combination of targeted and Sanger sequencing was used to identify the causative gene. The mutations detected were analyzed in silico for structural and protein-protein interactions to predict their impact on protein activities. The sequencing found a known FYCO1 mutation (c.2206C>T; p.Gln736Term) in autosomal recessive mode in families with CC. Co-segregation analysis showed affected individuals as homozygous and carriers as heterozygous for the mutation and the unaffected as wild-type. Bioinformatics tools uncovered the loss of the Znf domain and structural compactness of the mutant protein. In conclusion, a previously reported nonsense mutation was identified in four consanguineous families with CC. Structural analysis predicted the protein as disordered and coordinated with other structural proteins. The autophagy process was found to be significant for the development of the lens and maintenance of its transparency. The identification of these markers expands the scientific knowledge of CC; the future goal should be to understand the mechanism of disease severity. Ascertaining the genetic etiology of CC in a family member facilitates establishing a molecular diagnosis, unlocks the prospect of prenatal diagnosis in pregnancies, and guides the successive generations by genetic counseling.
Affiliation(s)
- Muhammad Ikram Ullah
  - Department of Clinical Laboratory Sciences, College of Applied Medical Sciences, Jouf University, Sakaka 72388, Saudi Arabia
- Zaira Rehman
  - Department of Pathology, Indus Hospital & Health Network, Karachi 75190, Pakistan
- Rubina Dad
  - Structure Biology Research Centre, Human Technopole, 20157 Milan, Italy
- Abdullah Alsrhani
  - Department of Clinical Laboratory Sciences, College of Applied Medical Sciences, Jouf University, Sakaka 72388, Saudi Arabia
- Muhammad Shakil
  - Department of Biochemistry, King Edward Medical University, Lahore 54600, Pakistan
  - Department of Biochemistry, University of Health Sciences, Lahore 54600, Pakistan
- Heba Bassiony Ghanem
  - Department of Clinical Laboratory Sciences, College of Applied Medical Sciences, Jouf University, Sakaka 72388, Saudi Arabia
- Ayman Ali Mohammed Alameen
  - Department of Clinical Laboratory Sciences, College of Applied Medical Sciences, Jouf University, Sakaka 72388, Saudi Arabia
- Mohamed Farouk Elsadek
  - Department of Community Health Sciences, College of Applied Medical Sciences, King Saud University, Riyadh 11433, Saudi Arabia
- Lienda Bashier Eltayeb
  - Department of Medical Laboratory Sciences, College of Applied Medical Sciences, Prince Sattam Bin Abdul-Aziz University, Al-Kharj, Riyadh 11942, Saudi Arabia
- Sajjad Ullah
  - University Institute of Medical Laboratory Technology, Faculty of Allied Health Sciences, The University of Lahore, Lahore 54600, Pakistan
- Muhammad Atif
  - Department of Clinical Laboratory Sciences, College of Applied Medical Sciences, Jouf University, Sakaka 72388, Saudi Arabia

9
Won SY, Soundararajan P, Irulappan V, Kim JS. In-silico, evolutionary, and functional analysis of CHUP1 and its related proteins in Bienertia sinuspersici - a comparative study across C3, C4, CAM, and SCC4 model plants. PeerJ 2023; 11:e15696. [PMID: 37456874 PMCID: PMC10348308 DOI: 10.7717/peerj.15696] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/08/2022] [Accepted: 06/14/2023] [Indexed: 07/18/2023] Open
Abstract
Single-cell C4 (SCC4) plants with bienertioid anatomy carry out photosynthesis in a single cell. Chloroplast movement is the underlying phenomenon, where chloroplast unusual positioning 1 (CHUP1) plays a key role. This study aimed to characterize CHUP1 and CHUP1-like proteins in an SCC4 photosynthetic plant, Bienertia sinuspersici. Also, a comparative analysis of SCC4 CHUP1 was made with C3, C4, and CAM model plants including an extant basal angiosperm, Amborella. The CHUP1 gene exists as a single copy from the basal angiosperms to SCC4 plants. Our analysis identified that Chenopodium quinoa, a recently duplicated allotetraploid, has two copies of CHUP1. In addition, the numbers of CHUP1-like and its associated proteins such as CHUP1-like_a, CHUP1-like_b, HPR, TPR, and ABP varied between the species. Hidden Markov Model analysis showed that the gene sizes of CHUP1-like_a and CHUP1-like_b in the SCC4 species Bienertia and Suaeda were larger than in other plants. Also, we identified that CHUP1-like_a and CHUP1-like_b are absent in Arabidopsis and Amborella, respectively. Motif analysis identified several conserved and variable motifs based on the orders (monocot and dicot) as well as photosynthetic pathways. For instance, CAM plants such as pineapple and cactus shared certain motifs of CHUP1-like_a irrespective of their distant phylogenetic relationship. The free ratio model showed that CHUP1 maintained purifying selection, whereas CHUP1-like_a and CHUP1-like_b have adaptive functions between SCC4 plants and quinoa. Similarly, rice and maize branches displayed functional diversification on CHUP1-like_b. Relative gene expression data showed that during the subcellular compartmentalization process of Bienertia, CHUP1 and actin-binding protein (ABP) genes showed a similar pattern of expression. Altogether, the results of this study provide insight into the evolutionary and functional details of CHUP1 and its associated proteins in the development of the SCC4 system in comparison with other C3, C4, and CAM model plants.
Affiliation(s)
- So Youn Won
  - Department of Agricultural Biotechnology, National Institute of Agricultural Sciences, Rural Development Administration, Jeonju-si, Jeollabuk-do, South Korea
- Prabhakaran Soundararajan
  - Department of Agricultural Biotechnology, National Institute of Agricultural Sciences, Rural Development Administration, Jeonju-si, Jeollabuk-do, South Korea
- Vadivelmurugan Irulappan
  - Department of Agricultural Biotechnology, National Institute of Agricultural Sciences, Rural Development Administration, Jeonju-si, Jeollabuk-do, South Korea
- Jung Sun Kim
  - Department of Agricultural Biotechnology, National Institute of Agricultural Sciences, Rural Development Administration, Jeonju-si, Jeollabuk-do, South Korea

10
Computational prediction of disordered binding regions. Comput Struct Biotechnol J 2023; 21:1487-1497. [PMID: 36851914 PMCID: PMC9957716 DOI: 10.1016/j.csbj.2023.02.018] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 10/26/2022] [Revised: 02/08/2023] [Accepted: 02/08/2023] [Indexed: 02/12/2023] Open
Abstract
One of the key features of intrinsically disordered regions (IDRs) is their ability to interact with a broad range of partner molecules. Multiple types of interacting IDRs have been identified, including molecular recognition features (MoRFs), short linear sequence motifs (SLiMs), and protein-, nucleic acid- and lipid-binding regions. Prediction of binding IDRs in protein sequences has been gaining momentum in recent years. We survey 38 predictors of binding IDRs that target interactions with a diverse set of partners, such as peptides, proteins, RNA, DNA and lipids. We offer a historical perspective and highlight key events that fueled efforts to develop these methods. These tools rely on a diverse range of predictive architectures that include scoring functions, regular expressions, traditional and deep machine learning and meta-models. Recent efforts focus on the development of deep neural network-based architectures and extending coverage to RNA, DNA and lipid-binding IDRs. We analyze the availability of these methods and show that providing implementations and webservers results in much higher rates of citations/use. We also make several recommendations to take advantage of modern deep network architectures, develop tools that bundle predictions of multiple and different types of binding IDRs, and work on algorithms that model structures of the resulting complexes.
|
11
|
Peng Z, Li Z, Meng Q, Zhao B, Kurgan L. CLIP: accurate prediction of disordered linear interacting peptides from protein sequences using co-evolutionary information. Brief Bioinform 2023; 24:6858950. [PMID: 36458437 DOI: 10.1093/bib/bbac502] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/14/2022] [Revised: 09/30/2022] [Accepted: 10/24/2022] [Indexed: 12/04/2022] Open
Abstract
One of the key features of intrinsically disordered regions (IDRs) is facilitation of protein-protein and protein-nucleic acid interactions. These disordered binding regions include molecular recognition features (MoRFs), short linear motifs (SLiMs) and longer binding domains. The vast majority of current predictors of disordered binding regions target MoRFs, with a handful of methods that predict SLiMs and disordered protein-binding domains. A new and broader class of disordered binding regions, linear interacting peptides (LIPs), was introduced recently and applied in the MobiDB resource. LIPs are segments in protein sequences that undergo a disorder-to-order transition upon binding to a protein or a nucleic acid, and they cover MoRFs, SLiMs and disordered protein-binding domains. Although current predictors of MoRFs and disordered protein-binding regions could be used to identify some LIPs, there are no dedicated sequence-based predictors of LIPs. To this end, we introduce CLIP, a new predictor of LIPs that utilizes a robust logistic regression model to combine three complementary types of inputs: co-evolutionary information derived from multiple sequence alignments, physicochemical profiles and disorder predictions. Ablation analysis suggests that the co-evolutionary information is particularly useful for this prediction and that combining the three inputs provides substantial improvements when compared to using these inputs individually. Comparative empirical assessments using low-similarity test datasets reveal that CLIP secures an area under the receiver operating characteristic curve (AUC) of 0.8 and substantially improves over the results produced by the closest current tools that predict MoRFs and disordered protein-binding regions. The webserver of CLIP is freely available at http://biomine.cs.vcu.edu/servers/CLIP/ and the standalone code can be downloaded from http://yanglab.qd.sdu.edu.cn/download/CLIP/.
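The abstract above describes a logistic regression model that combines three types of inputs. As an illustration only, and not the authors' implementation, the following Python sketch shows how a logistic model could map three per-residue scores to a single propensity. The function name, the example weights, and the bias are hypothetical placeholders, not values taken from CLIP.

```python
from math import exp


def logistic_combine(features, weights, bias):
    """Combine per-residue input scores into a probability-like value.

    features: scores for one residue (here three hypothetical inputs:
              co-evolution, physicochemical profile, disorder prediction)
    weights, bias: hypothetical model parameters, not taken from CLIP
    """
    # Linear combination of the inputs
    z = bias + sum(w * x for w, x in zip(weights, features))
    # Logistic (sigmoid) function maps z to the interval (0, 1)
    return 1.0 / (1.0 + exp(-z))


if __name__ == "__main__":
    # Hypothetical scores for a single residue
    residue_scores = [0.7, 0.2, 0.9]
    # Hypothetical weights; a real model would learn these from training data
    example_weights = [1.5, 0.5, 1.0]
    print(logistic_combine(residue_scores, example_weights, bias=-1.0))
```

In a trained model the weights would be fitted to annotated residues, and the output would be thresholded to label residues as LIP or non-LIP.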
Affiliation(s)
- Zhenling Peng
- Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, 266237, China; Frontier Science Center for Nonlinear Expectations, Ministry of Education, Qingdao, 266237, China
| | - Zixia Li
- Center for Applied Mathematics, Tianjin University, Tianjin, 300072, China
| | - Qiaozhen Meng
- College of Intelligence and Computing, Tianjin University, Tianjin, 300072, China
| | - Bi Zhao
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
| |
|
12
|
Yu K, Liu Z, Cheng H, Li S, Zhang Q, Liu J, Ju HQ, Zuo Z, Zhao Q, Kang S, Liu ZX. dSCOPE: a software to detect sequences critical for liquid-liquid phase separation. Brief Bioinform 2023; 24:6927233. [PMID: 36528388 DOI: 10.1093/bib/bbac550] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/05/2022] [Revised: 10/26/2022] [Accepted: 11/12/2022] [Indexed: 12/23/2022] Open
Abstract
Membrane-based cells are the fundamental structural and functional units of organisms, while evidence demonstrates that liquid-liquid phase separation (LLPS) is associated with the formation of membraneless organelles, such as P-bodies, nucleoli and stress granules. Many studies have been undertaken to explore the functions of protein phase separation (PS), but these studies lacked an effective tool to identify the sequence segments that are critical for LLPS. In this study, we present novel software called dSCOPE (http://dscope.omicsbio.info) to predict the PS-driving regions. To develop the predictor, we curated experimentally identified sequence segments that can drive LLPS from published literature. Then, physiological, biochemical, structural and coding features based on a sliding sequence window were integrated by a random forest algorithm to perform prediction. Through rigorous evaluation, dSCOPE was demonstrated to achieve satisfactory performance. Furthermore, large-scale analysis of the human proteome based on dSCOPE showed that the predicted PS-driving regions were enriched in various protein post-translational modifications and cancer mutations, and that the proteins which contain predicted PS-driving regions were enriched in critical cellular signaling pathways. Taken together, dSCOPE precisely predicted the protein sequence segments critical for LLPS, with various helpful information visualized in the webserver to facilitate LLPS-related research.
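The abstract above mentions features computed over a sliding sequence window. The following Python sketch is a generic illustration of that idea, not the dSCOPE feature set: it assumes a window centered on each residue, truncated at the termini, and a single hypothetical feature (the fraction of charged residues in the window). The function names, window size, and residue set are assumptions made for this example.

```python
def sliding_windows(sequence, size):
    """Yield (center_index, window) for each residue.

    Windows are centered on the residue and truncated at the termini.
    """
    half = size // 2
    for i in range(len(sequence)):
        yield i, sequence[max(0, i - half): i + half + 1]


def window_features(sequence, size=5, residues="DEKR"):
    """Per-residue feature: fraction of window residues found in `residues`.

    The default set (D, E, K, R) is a hypothetical choice of charged residues.
    """
    return [
        sum(aa in residues for aa in window) / len(window)
        for _, window in sliding_windows(sequence, size)
    ]


if __name__ == "__main__":
    # Toy sequence; each value is one feature for one residue position
    print(window_features("AKDEA", size=3))
```

A per-residue feature vector of this kind is the sort of input a classifier such as a random forest could be trained on.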
Affiliation(s)
- Kai Yu
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou 510060, China
| | - Zekun Liu
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou 510060, China
| | - Haoyang Cheng
- Department of Computer Science, The University of Hong Kong, Hong Kong 999077, China
| | - Shihua Li
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou 510060, China
| | - Qingfeng Zhang
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou 510060, China
| | - Jia Liu
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou 510060, China
| | - Huai-Qiang Ju
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou 510060, China
| | - Zhixiang Zuo
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou 510060, China
| | - Qi Zhao
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou 510060, China
| | - Shiyang Kang
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou 510060, China
| | - Ze-Xian Liu
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou 510060, China
| |
|
13
|
PANAGOPOULOS IOANNIS, ANDERSEN KRISTIN, GORUNOVA LUDMILA, HOGNESTAD HANNEREGINE, PEDERSEN THOMASDAHL, LOBMAIER INGVILD, MICCI FRANCESCA, HEIM SVERRE. Chromosome Translocation t(10;19)(q26;q13) in a CIC-sarcoma. In Vivo 2023; 37:57-69. [PMID: 36593014 PMCID: PMC9843759 DOI: 10.21873/invivo.13054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/17/2022] [Revised: 11/05/2022] [Accepted: 11/10/2022] [Indexed: 01/04/2023]
Abstract
BACKGROUND/AIM: CIC-sarcomas are characterized by rearrangements of the capicua transcriptional repressor (CIC) gene on chromosome subband 19q13.2, generating chimeras in which CIC is the 5'-end partner. Most reported CIC-sarcomas have been detected using PCR amplifications together with Sanger sequencing, high throughput sequencing, and fluorescence in situ hybridization (FISH). Only a few CIC-rearranged tumors have been characterized cytogenetically. Here, we describe the cytogenetic and molecular genetic features of a CIC-sarcoma carrying a t(10;19)(q26;q13), a chromosomal rearrangement not previously detected in such neoplasms. MATERIALS AND METHODS: A round cell sarcoma removed from the right thigh of a 57-year-old man was investigated by G-banding cytogenetics, FISH, PCR and Sanger sequencing. RESULTS: The tumor cells had three cytogenetically related clones with the translocations t(9;18)(q22;q21) and t(10;19)(q26;q13) common to all of them. FISH with a BAC probe containing the CIC gene hybridized to the normal chromosome 19, to der(10)t(10;19), and to der(19)t(10;19). PCR using tumor cDNA as template together with Sanger sequencing detected two CIC::DUX4 fusion transcripts which both had a stop TAG codon immediately after the fusion point. Both transcripts are predicted to encode truncated CIC polypeptides lacking the carboxy terminal part of the native protein. This missing part is crucial for CIC's DNA binding capacity and interaction with other proteins. CONCLUSION: In addition to demonstrating that CIC rearrangement in sarcomas can occur via the microscopically visible translocation t(10;19)(q26;q13), the findings in the present case provide evidence that the missing part in CIC-truncated proteins has important functions whose loss may be important in tumorigenesis.
Affiliation(s)
- IOANNIS PANAGOPOULOS
- Section for Cancer Cytogenetics, Institute for Cancer Genetics and Informatics, The Norwegian Radium Hospital, Oslo University Hospital, Oslo, Norway
| | - KRISTIN ANDERSEN
- Section for Cancer Cytogenetics, Institute for Cancer Genetics and Informatics, The Norwegian Radium Hospital, Oslo University Hospital, Oslo, Norway
| | - LUDMILA GORUNOVA
- Section for Cancer Cytogenetics, Institute for Cancer Genetics and Informatics, The Norwegian Radium Hospital, Oslo University Hospital, Oslo, Norway
| | | | | | | | - FRANCESCA MICCI
- Section for Cancer Cytogenetics, Institute for Cancer Genetics and Informatics, The Norwegian Radium Hospital, Oslo University Hospital, Oslo, Norway
| | - SVERRE HEIM
- Section for Cancer Cytogenetics, Institute for Cancer Genetics and Informatics, The Norwegian Radium Hospital, Oslo University Hospital, Oslo, Norway
| |
|
14
|
Scietti L, Forneris F. Modeling of Protein Complexes. Methods Mol Biol 2023; 2627:349-371. [PMID: 36959458 DOI: 10.1007/978-1-0716-2974-1_20] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/25/2023]
Abstract
The recent advances in structural biology, combined with continuously increasing computational capabilities and development of advanced software, have drastically simplified the workflow for protein homology modeling. Modeling of individual proteins is nowadays quick and straightforward for a large variety of protein targets, thanks to guided pipelines relying on advanced computational tools and user-friendly interfaces, which have extended and promoted the use of modeling also to scientists not focusing on molecular structures of proteins. Nevertheless, construction of models of multi-protein complexes remains quite challenging for non-experts, often due to the usage of specific procedures depending on the system under investigation and the need for experimental validation approaches to strengthen the generated output. In this chapter, we provide a brief overview of the approaches enabling generation of multi-protein complex models starting from homology models of individual protein components. We include two real-life examples to guide the reader in the generation of homomeric and heteromeric protein models.
Affiliation(s)
- Luigi Scietti
- Department of Biology and Biotechnology, The Armenise-Harvard Laboratory of Structural Biology, University of Pavia, Pavia, Italy.
| | - Federico Forneris
- Department of Biology and Biotechnology, The Armenise-Harvard Laboratory of Structural Biology, University of Pavia, Pavia, Italy.
| |
|
15
|
Sun C, Feng Y, Fan G. IDPsBind: a repository of binding sites for intrinsically disordered proteins complexes with known 3D structures. BMC Mol Cell Biol 2022; 23:33. [PMID: 35883018 PMCID: PMC9327236 DOI: 10.1186/s12860-022-00434-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/28/2021] [Accepted: 07/14/2022] [Indexed: 11/10/2022] Open
Abstract
Background
Intrinsically disordered proteins (IDPs) lack a stable three-dimensional structure under physiological conditions but play crucial roles in many biological processes. Intrinsically disordered proteins perform various biological functions by interacting with other ligands.
Results
Here, we present a database, IDPsBind, which displays interacting sites between IDPs and their ligands, identified by applying the distance threshold method to IDP complexes with known 3D structures from the PDB database. IDPsBind contains 9626 IDP complexes and 880 experimentally verified intrinsically disordered proteins. The current release of the IDPsBind database is defined as version 1.0. IDPsBind is freely accessible at http://www.s-bioinformatics.cn/idpsbind/home/.
Conclusions
IDPsBind provides more comprehensive interaction sites for IDP complexes with known 3D structures. It not only helps subsequent studies of the interaction mechanisms of intrinsically disordered proteins but also provides a suitable background for developing algorithms that predict the interaction sites of intrinsically disordered proteins.
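The abstract above refers to a distance threshold method for identifying interacting sites. A minimal Python sketch of that general idea follows, assuming that a residue counts as interacting when any of its atoms lies within a cutoff distance of any ligand atom. The function name, the data layout, and the 4.0 angstrom cutoff are assumptions for illustration; the abstract does not state the threshold or the implementation used by IDPsBind.

```python
from math import dist


def interacting_residues(residue_atoms, ligand_atoms, cutoff=4.0):
    """Return ids of residues with at least one atom within `cutoff` of a ligand atom.

    residue_atoms: dict mapping residue id -> list of (x, y, z) coordinates
    ligand_atoms:  list of (x, y, z) coordinates
    cutoff:        distance threshold in angstroms (hypothetical default)
    """
    hits = []
    for res_id, atoms in residue_atoms.items():
        # A single atom pair under the cutoff is enough to flag the residue
        if any(dist(a, b) <= cutoff for a in atoms for b in ligand_atoms):
            hits.append(res_id)
    return hits


if __name__ == "__main__":
    # Toy coordinates, not taken from any PDB entry
    residue_atoms = {
        1: [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
        2: [(10.0, 0.0, 0.0)],
        3: [(0.0, 3.0, 0.0)],
    }
    ligand_atoms = [(0.0, 0.0, 3.5), (0.0, 6.0, 0.0)]
    print(interacting_residues(residue_atoms, ligand_atoms))  # prints [1, 3]
```

For real structures the coordinates would come from a parsed PDB file, and a spatial index would be preferable to this all-pairs loop.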
|
16
|
Piovesan D, Del Conte A, Clementel D, Monzon A, Bevilacqua M, Aspromonte M, Iserte J, Orti FE, Marino-Buslje C, Tosatto SE. MobiDB: 10 years of intrinsically disordered proteins. Nucleic Acids Res 2022; 51:D438-D444. [PMID: 36416266 PMCID: PMC9825420 DOI: 10.1093/nar/gkac1065] [Citation(s) in RCA: 39] [Impact Index Per Article: 19.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/22/2022] [Revised: 10/11/2022] [Accepted: 10/25/2022] [Indexed: 11/24/2022] Open
Abstract
The MobiDB database (URL: https://mobidb.org/) is a knowledge base of intrinsically disordered proteins. MobiDB aggregates disorder annotations derived from the literature and from experimental evidence along with predictions for all known protein sequences. MobiDB generates new knowledge and captures the functional significance of disordered regions by processing and combining complementary sources of information. Since its first release 10 years ago, the MobiDB database has evolved in order to improve the quality and coverage of protein disorder annotations and its accessibility. MobiDB has now reached its maturity in terms of data standardization and visualization. Here, we present a new release which focuses on the optimization of user experience and database content. The major advances compared to the previous version are the integration of AlphaFoldDB predictions and the re-implementation of the homology transfer pipeline, which expands manually curated annotations by two orders of magnitude. Finally, the entry page has been restyled in order to provide an overview of the available annotations along with two separate views that highlight structural disorder evidence and functions associated with different binding modes.
Affiliation(s)
- Damiano Piovesan
- Department of Biomedical Sciences, University of Padova, Padova, Italy
| | - Alessio Del Conte
- Department of Biomedical Sciences, University of Padova, Padova, Italy
| | - Damiano Clementel
- Department of Biomedical Sciences, University of Padova, Padova, Italy
| | | | | | | | - Javier A Iserte
- Bioinformatics Unit, Fundación Instituto Leloir, Buenos Aires, Argentina
| | - Fernando E Orti
- Bioinformatics Unit, Fundación Instituto Leloir, Buenos Aires, Argentina
| | | | | |
|
17
|
Mitrea DM, Mittasch M, Gomes BF, Klein IA, Murcko MA. Modulating biomolecular condensates: a novel approach to drug discovery. Nat Rev Drug Discov 2022; 21:841-862. [PMID: 35974095 PMCID: PMC9380678 DOI: 10.1038/s41573-022-00505-4] [Citation(s) in RCA: 65] [Impact Index Per Article: 32.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 06/08/2022] [Indexed: 12/12/2022]
Abstract
In the past decade, membraneless assemblies known as biomolecular condensates have been reported to play key roles in many cellular functions by compartmentalizing specific proteins and nucleic acids in subcellular environments with distinct properties. Furthermore, growing evidence supports the view that biomolecular condensates often form by phase separation, in which a single-phase system demixes into a two-phase system consisting of a condensed phase and a dilute phase of particular biomolecules. Emerging understanding of condensate function in normal and aberrant cellular states, and of the mechanisms of condensate formation, is providing new insights into human disease and revealing novel therapeutic opportunities. In this Perspective, we propose that such insights could enable a previously unexplored drug discovery approach based on identifying condensate-modifying therapeutics (c-mods), and we discuss the strategies, techniques and challenges involved.
|
18
|
Fang M, He Y, Du Z, Uversky VN. DeepCLD: An Efficient Sequence-Based Predictor of Intrinsically Disordered Proteins. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:3154-3159. [PMID: 34727037 DOI: 10.1109/tcbb.2021.3124273] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
Intrinsic disorder is common in proteins, plays important roles in protein functionality, and is commonly associated with various human diseases. To provide an accurate tool for the annotation of intrinsic disorder in proteins, this paper proposes a novel algorithm, DeepCLD, for sequence-based prediction of intrinsically disordered proteins. This algorithm uses the amino acid position-specific scoring matrix (PSSM) to capture the intrinsic variability characteristic of sequence patterns, ResNet to preserve feature space structure, and a bidirectional CudnnLSTM as the recurrent layer to further improve the efficiency. Furthermore, DeepCLD also utilizes an attention mechanism to address the vanishing gradient problem in deep networks. Comparative analyses show that DeepCLD has faster training speed and higher prediction accuracy than comparable methods.
|
19
|
Piersimoni L, Abd El Malek M, Bhatia T, Bender J, Brankatschk C, Calvo Sánchez J, Dayhoff GW, Di Ianni A, Figueroa Parra JO, Garcia-Martinez D, Hesselbarth J, Köppen J, Lauth LM, Lippik L, Machner L, Sachan S, Schmidt L, Selle R, Skalidis I, Sorokin O, Ubbiali D, Voigt B, Wedler A, Wei AAJ, Zorn P, Dunker AK, Köhn M, Sinz A, Uversky VN. Lighting up Nobel Prize-winning studies with protein intrinsic disorder. Cell Mol Life Sci 2022; 79:449. [PMID: 35882686 PMCID: PMC11072364 DOI: 10.1007/s00018-022-04468-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/31/2022] [Revised: 06/18/2022] [Accepted: 07/04/2022] [Indexed: 11/03/2022]
Abstract
Intrinsically disordered proteins and regions (IDPs and IDRs) and their importance in biology are becoming increasingly recognized in biology, biochemistry, molecular biology and chemistry textbooks, as well as in current protein science and structural biology curricula. We argue that the sequence → dynamic conformational ensemble → function principle is as important as the classical sequence → structure → function paradigm. To highlight this point, we describe the IDPs and/or IDRs behind the discoveries associated with 17 Nobel Prizes, 11 in Physiology or Medicine and 6 in Chemistry. The Nobel Laureates themselves did not always mention that the proteins underlying the phenomena investigated in their award-winning studies are in fact IDPs or contain IDRs. In several cases, IDP- or IDR-based molecular functions have been elucidated, while in other instances, it is recognized that the respective protein(s) contain IDRs, but the specific IDR-based molecular functions have yet to be determined. To highlight the importance of IDPs and IDRs as a general principle in biology, we present here illustrative examples of IDPs/IDRs in Nobel Prize-winning mechanisms and processes.
Affiliation(s)
- Lolita Piersimoni
- Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
| | - Marina Abd El Malek
- Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
| | - Twinkle Bhatia
- Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
| | - Julian Bender
- Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
| | - Christin Brankatschk
- Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
| | - Jaime Calvo Sánchez
- Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
| | - Guy W Dayhoff
- Department of Chemistry, College of Art and Sciences, University of South Florida, Tampa, FL, 33620, USA
| | - Alessio Di Ianni
- Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
| | | | - Dailen Garcia-Martinez
- Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
| | - Julia Hesselbarth
- Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
| | - Janett Köppen
- Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
| | - Luca M Lauth
- Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
| | - Laurin Lippik
- Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
| | - Lisa Machner
- Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
| | - Shubhra Sachan
- Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
| | - Lisa Schmidt
- Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
| | - Robin Selle
- Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
| | - Ioannis Skalidis
- Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
| | - Oleksandr Sorokin
- Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
| | - Daniele Ubbiali
- Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
| | - Bruno Voigt
- Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
| | - Alice Wedler
- Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
| | - Alan An Jung Wei
- Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
| | - Peter Zorn
- Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
| | - Alan Keith Dunker
- Department of Biochemistry and Molecular Biology, Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN, 46202, USA
| | - Marcel Köhn
- Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany.
| | - Andrea Sinz
- Research Training Group RTG2467, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany.
| | - Vladimir N Uversky
- Department of Molecular Medicine, USF Health Byrd Alzheimer's Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, 33612, USA.
| |
|
20
|
Vicioso-Mantis M, Aguirre S, Martínez-Balbás MA. JmjC Family of Histone Demethylases Form Nuclear Condensates. Int J Mol Sci 2022; 23:ijms23147664. [PMID: 35887017 PMCID: PMC9319511 DOI: 10.3390/ijms23147664] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/06/2022] [Revised: 07/08/2022] [Accepted: 07/08/2022] [Indexed: 12/16/2022] Open
Abstract
The Jumonji-C (JmjC) family of lysine demethylases (KDMs) (JMJC-KDMs) plays an essential role in controlling gene expression and chromatin structure. In most cases, their function has been attributed to the demethylase activity. However, accumulating evidence demonstrates that these proteins play roles distinct from histone demethylation. This raises the possibility that they might share domains that contribute to their functional outcome. Here, we show that the JMJC-KDMs contain low-complexity domains and intrinsically disordered regions (IDRs), which in some cases make up 70% of the protein. Our data revealed that plant homeodomain finger protein (PHF2), KDM2A, and KDM4B cluster by phase separation. Moreover, our molecular analysis implies that the PHF2 IDR contributes to transcription regulation. These data suggest that clustering via phase separation is a common feature that JMJC-KDMs utilize to facilitate their functional responses. Our study uncovers a novel potential function for the JMJC-KDM family that sheds light on the mechanisms to achieve the competent concentration of molecules in time and space within the cell nucleus.
|
21
|
Quaglia F, Hatos A, Salladini E, Piovesan D, Tosatto SCE. Exploring Manually Curated Annotations of Intrinsically Disordered Proteins with DisProt. Curr Protoc 2022; 2:e484. [PMID: 35789137 DOI: 10.1002/cpz1.484] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]
Abstract
DisProt is the major repository of manually curated data for intrinsically disordered proteins collected from the literature. Although lacking a stable three-dimensional structure under physiological conditions, intrinsically disordered proteins carry out a plethora of biological functions, some of them directly arising from their flexible nature. A growing number of scientific studies have been published during the last few decades to shed light on their unstructured state, their binding modes, and their functions. DisProt makes use of a team of expert biocurators to provide up-to-date annotations of intrinsically disordered proteins from the literature, making them available to the scientific community. Here we present a comprehensive description of how to use DisProt in different contexts and provide a detailed explanation of how to explore and interpret manually curated annotations of intrinsically disordered proteins. We describe how to search DisProt annotations, both using the web interface and the API for programmatic access. Finally, we explain how to visualize and interpret a DisProt entry, the SARS-CoV-2 Nucleoprotein, characterized by the presence of unstructured N-terminal and C-terminal regions and a flexible linker. © 2022 The Authors. Current Protocols published by Wiley Periodicals LLC. Basic Protocol 1: Performing a search in DisProt; Support Protocol 1: Downloading options; Support Protocol 2: Programmatic access with DisProt REST API; Basic Protocol 2: Exploring the DisProt Ontology page; Basic Protocol 3: Visualizing and interpreting DisProt entries, using the SARS-CoV-2 Nucleoprotein use case.
Affiliation(s)
- Federica Quaglia
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR-IBIOM), Bari, Italy
- Department of Biomedical Sciences, University of Padova, Padova, Italy
| | - András Hatos
- Department of Biomedical Sciences, University of Padova, Padova, Italy
| | - Edoardo Salladini
- Department of Biomedical Sciences, University of Padova, Padova, Italy
| | - Damiano Piovesan
- Department of Biomedical Sciences, University of Padova, Padova, Italy
| | | |
|
22
|
Getting Closer to Decrypting the Phase Transitions of Bacterial Biomolecules. Biomolecules 2022; 12:biom12070907. [PMID: 35883463 PMCID: PMC9312465 DOI: 10.3390/biom12070907] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/16/2022] [Revised: 06/24/2022] [Accepted: 06/26/2022] [Indexed: 12/31/2022] Open
Abstract
Liquid–liquid phase separation (LLPS) of biomolecules has emerged as a new paradigm in cell biology, and the process is one proposed mechanism for the formation of membraneless organelles (MLOs). Bacterial cells have only recently drawn strong interest in terms of studies on both liquid-to-liquid and liquid-to-solid phase transitions. It seems that these processes drive the formation of prokaryotic cellular condensates that resemble eukaryotic MLOs. In this review, we present an overview of the key microbial biomolecules that undergo LLPS, as well as the formation and organization of biomacromolecular condensates within the intracellular space. We also discuss the current challenges in investigating bacterial biomacromolecular condensates. Additionally, we highlight a summary of recent knowledge about the participation of bacterial biomolecules in a phase transition and provide some new in silico analyses that can be helpful for further investigations.
|
23
|
Vicioso-Mantis M, Fueyo R, Navarro C, Cruz-Molina S, van Ijcken WFJ, Rebollo E, Rada-Iglesias Á, Martínez-Balbás MA. JMJD3 intrinsically disordered region links the 3D-genome structure to TGFβ-dependent transcription activation. Nat Commun 2022; 13:3263. [PMID: 35672304 PMCID: PMC9174158 DOI: 10.1038/s41467-022-30614-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/06/2020] [Accepted: 05/05/2022] [Indexed: 12/13/2022] Open
Abstract
Enhancers are key regulatory elements that govern gene expression programs in response to developmental signals. However, how multiple enhancers arrange in the 3D-space to control the activation of a specific promoter remains unclear. To address this question, we exploited our previously characterized TGFβ-response model, the neural stem cells, focusing on a ~374 kb locus where enhancers abound. Our 4C-seq experiments reveal that the TGFβ pathway drives the assembly of an enhancer-cluster and precise gene activation. We discover that the TGFβ pathway coactivator JMJD3 is essential to maintain these structures. Using live-cell imaging techniques, we demonstrate that an intrinsically disordered region contained in JMJD3 is involved in the formation of phase-separated biomolecular condensates, which are found in the enhancer-cluster. Overall, in this work we uncover novel functions for the coactivator JMJD3, and we shed light on the relationships between the 3D-conformation of the chromatin and the TGFβ-driven response during mammalian neurogenesis. Here the authors demonstrate that TGFβ drives multi-enhancer contacts and ultimately gene activation during neuronal commitment, and that this requires the intrinsically disordered region (IDR) of the histone demethylase JMJD3 likely through its role in promoting phase-separated biomolecular condensates.
|
24
|
Predicting protein intrinsically disordered regions by applying natural language processing practices. Soft Comput 2022. [DOI: 10.1007/s00500-022-07085-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
|
25
|
French-Pacheco L, Rosas-Bringas O, Segovia L, Covarrubias AA. Intrinsically disordered signaling proteins: Essential hub players in the control of stress responses in Saccharomyces cerevisiae. PLoS One 2022; 17:e0265422. [PMID: 35290420 PMCID: PMC8923507 DOI: 10.1371/journal.pone.0265422] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2021] [Accepted: 03/01/2022] [Indexed: 11/24/2022] Open
Abstract
Cells have developed diverse mechanisms to monitor changes in their surroundings. This allows them to establish effective responses to cope with adverse environments. Some of these mechanisms have been well characterized in the budding yeast Saccharomyces cerevisiae, an excellent experimental model to explore and elucidate some of the strategies selected in eukaryotic organisms to adjust their growth and development in stressful conditions. The relevance of structural disorder in proteins and the impact on their functions has been uncovered for proteins participating in different processes. This is the case of some transcription factors (TFs) and other signaling hub proteins, where intrinsically disordered regions (IDRs) play a critical role in their function. In this work, we present a comprehensive bioinformatic analysis to evaluate the significance of structural disorder in those TFs (170) recognized in S. cerevisiae. Our findings show that 85.2% of these TFs contain at least one IDR, whereas ~30% exhibit a higher disorder level and thus were considered as intrinsically disordered proteins (IDPs). We also found that TFs contain a higher number of IDRs compared to the rest of the yeast proteins, and that intrinsically disordered TFs (IDTFs) have a higher number of protein-protein interactions than those with low structural disorder. The analysis of different stress response pathways showed a high content of structural disorder not only in TFs but also in other signaling proteins. The propensity of yeast proteome to undergo a liquid-liquid phase separation (LLPS) was also analyzed, showing that a significant proportion of IDTFs may undergo this phenomenon. Our analysis is a starting point for future research on the importance of structural disorder in yeast stress responses.
Affiliation(s)
- Leidys French-Pacheco
- Departamento de Biología Molecular de Plantas, Instituto de Biotecnología, Universidad Nacional Autónoma de México, Cuernavaca, Morelos, México
| | - Omar Rosas-Bringas
- Departamento de Biología Molecular de Plantas, Instituto de Biotecnología, Universidad Nacional Autónoma de México, Cuernavaca, Morelos, México
| | - Lorenzo Segovia
- Departamento de Ingeniería Celular y Biocatálisis, Instituto de Biotecnología, Universidad Nacional Autónoma de México, Cuernavaca, Morelos, México
| | - Alejandra A. Covarrubias
- Departamento de Biología Molecular de Plantas, Instituto de Biotecnología, Universidad Nacional Autónoma de México, Cuernavaca, Morelos, México
|
26
|
Ahmed SS, Rifat ZT, Lohia R, Campbell AJ, Dunker AK, Rahman MS, Iqbal S. Characterization of intrinsically disordered regions in proteins informed by human genetic diversity. PLoS Comput Biol 2022; 18:e1009911. [PMID: 35275927 PMCID: PMC8942211 DOI: 10.1371/journal.pcbi.1009911] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 07/21/2021] [Revised: 03/23/2022] [Accepted: 02/10/2022] [Indexed: 01/21/2023] Open
Abstract
All proteomes contain both proteins and polypeptide segments that don’t form a defined three-dimensional structure yet are biologically active; these are called intrinsically disordered proteins and regions (IDPs and IDRs). Most of these IDPs/IDRs lack useful functional annotation, limiting our understanding of their importance for organism fitness. Here we characterized IDRs using protein sequence annotations of functional sites and regions available in the UniProt knowledgebase (“UniProt features”: active site, ligand-binding pocket, regions mediating protein-protein interactions, etc.). By measuring the statistical enrichment of twenty-five UniProt features in 981 IDRs of 561 human proteins, we identified eight features that are commonly located in IDRs. We then collected the genetic variant data from the general population and patient-based databases and evaluated the prevalence of population and pathogenic variations in IDPs/IDRs. We observed that some IDRs tolerate 2 to 12 times more single amino acid-substituting missense mutations than synonymous changes in the general population. However, we also found that 37% of all germline pathogenic mutations are located in disordered regions of 96 proteins. Based on the observed-to-expected frequency of mutations, we categorized 34 IDRs in 20 proteins (DDX3X, KIT, RB1, etc.) as intolerant to mutation. Finally, using statistical analysis and a machine learning approach, we demonstrate that mutation-intolerant IDRs carry a distinct signature of functional features. Our study presents a novel approach to assign functional importance to IDRs by leveraging the wealth of available genetic data, which will aid in a deeper understanding of the role of IDRs in biological processes and disease mechanisms.
Affiliation(s)
- Shehab S. Ahmed
- Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, ECE Building, West Palashi, Dhaka-1205, Bangladesh
| | - Zaara T. Rifat
- Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, ECE Building, West Palashi, Dhaka-1205, Bangladesh
| | - Ruchi Lohia
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
| | - Arthur J. Campbell
- Center for the Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
| | - A. Keith Dunker
- Center for Computational Biology and Bioinformatics, Department of Biochemistry and Molecular Biology, Indiana University School of Medicine, Indianapolis, Indiana, United States of America
| | - M. Sohel Rahman
- Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, ECE Building, West Palashi, Dhaka-1205, Bangladesh
- * E-mail: (MSR); (SI)
| | - Sumaiya Iqbal
- Center for the Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts, United States of America
- * E-mail: (MSR); (SI)
|
27
|
Kurgan L. Resources for computational prediction of intrinsic disorder in proteins. Methods 2022; 204:132-141. [DOI: 10.1016/j.ymeth.2022.03.018] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/28/2022] [Revised: 03/25/2022] [Accepted: 03/29/2022] [Indexed: 12/26/2022] Open
|
28
|
D3PM: a comprehensive database for protein motions ranging from residue to domain. BMC Bioinformatics 2022; 23:70. [PMID: 35164668 PMCID: PMC8845362 DOI: 10.1186/s12859-022-04595-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2020] [Accepted: 02/01/2022] [Indexed: 11/24/2022] Open
Abstract
Background: Knowledge of protein motions is important for understanding protein function. While currently available databases for protein motions mostly focus on overall domain motions, little attention is paid to local residue motions. Although relatively small in scale, local residue motions, especially those of residues in binding pockets, may play crucial roles in protein function and ligand binding. Results: A comprehensive protein motion database, D3PM, was constructed in this study to facilitate the analysis of protein motions. The protein motions in D3PM range from overall structural changes of macromolecules to local flip motions of binding pocket residues. Currently, D3PM contains 7679 proteins with overall motions and 3513 proteins with pocket residue motions. The motion patterns are classified into 4 types of overall structural changes and 5 types of pocket residue motions. We found that less than 15% of protein pairs have obvious overall conformational adaptations induced by ligand binding, while more than 50% of protein pairs have significant structural changes in ligand binding sites, indicating that ligand-induced conformational changes are drastic and mainly confined around ligand binding sites. Based on residue preference in binding pockets, we classified amino acids into “pocketphilic” and “pocketphobic” residues, which should be helpful for pocket prediction and drug design. Conclusion: D3PM is a comprehensive database of protein motions ranging from residue to domain, which should be useful for exploring diverse protein motions and for understanding protein function and drug design. D3PM is available at www.d3pharma.com/D3PM/index.php. Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-022-04595-0.
|
29
|
Medvedev KE, Pei J, Grishin NV. DisEnrich: database of enriched regions in human dark proteome. Bioinformatics 2022; 38:1870-1876. [PMID: 35094056 PMCID: PMC8963327 DOI: 10.1093/bioinformatics/btac051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2021] [Revised: 12/22/2021] [Accepted: 01/25/2022] [Indexed: 02/01/2023] Open
Abstract
MOTIVATION: Intrinsically disordered proteins (IDPs) are involved in numerous processes crucial for living organisms. Bias in the amino acid composition of these proteins determines their unique biophysical and functional features. Distinct intrinsically disordered regions (IDRs) with compositional bias play different important roles in various biological processes. IDRs enriched in particular amino acids in the human proteome have not been described consistently. RESULTS: We developed DisEnrich, the database of human proteome IDRs that are significantly enriched in particular amino acids. Each human protein is described using Gene Ontology (GO) function terms, disorder prediction for the full-length sequence using three methods, enriched IDR composition and ranks of human proteins with similar enriched IDRs. Distribution analysis of enriched IDRs among broad functional categories revealed significant overrepresentation of R- and Y-enriched IDRs in metabolic and enzymatic activities and F-enriched IDRs in transport. About 75% of functional categories contain IDPs with IDRs significantly enriched in hydrophobic residues that are important for protein-protein interactions. AVAILABILITY AND IMPLEMENTATION: The database is available at http://prodata.swmed.edu/DisEnrichDB/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online.
Affiliation(s)
| | - Jimin Pei
- McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
| | - Nick V Grishin
- Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Department of Biochemistry, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
|
30
|
Abstract
INTRODUCTION: The intrinsic disorder prediction field develops, assesses, and deploys computational predictors of disorder in protein sequences and constructs and disseminates databases of these predictions. Over 40 years of research resulted in the release of numerous resources. AREAS COVERED: We identify and briefly summarize the most comprehensive collection to date of over 100 disorder predictors. We focus on their predictive models, availability and predictive performance. We categorize and study them from a historical point of view to highlight informative trends. EXPERT OPINION: We find a consistent trend of improvements in predictive quality as newer and more advanced predictors are developed. The original focus on machine learning methods shifted to meta-predictors in the early 2010s, followed by a recent transition to deep learning. The use of deep learners will continue in the foreseeable future given the recent and convincing success of these methods. Moreover, a broad range of resources that facilitate convenient collection of accurate disorder predictions is available to users. They include web servers and standalone programs for disorder prediction, servers that combine prediction of disorder and disorder functions, and large databases of pre-computed predictions. We also point to the need to address the shortage of accurate methods that predict disordered binding regions.
Affiliation(s)
- Bi Zhao
- Department of Computer Science, Virginia Commonwealth University, Richmond, Virginia, USA
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, Virginia, USA
|
31
|
Han S, Lee H, Lee AJ, Kim SK, Jung I, Koh GY, Kim TK, Lee D. CHD4 Conceals Aberrant CTCF-Binding Sites at TAD Interiors by Regulating Chromatin Accessibility in Mouse Embryonic Stem Cells. Mol Cells 2021; 44:805-829. [PMID: 34764232 PMCID: PMC8627837 DOI: 10.14348/molcells.2021.0224] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2021] [Accepted: 09/06/2021] [Indexed: 11/27/2022] Open
Abstract
CCCTC-binding factor (CTCF) critically contributes to 3D chromatin organization by determining topologically associated domain (TAD) borders. Although CTCF primarily binds at TAD borders, there also exist putative CTCF-binding sites within TADs, which are spread throughout the genome by retrotransposition. However, the detailed mechanism responsible for masking the putative CTCF-binding sites remains largely elusive. Here, we show that the ATP-dependent chromatin remodeler, chromodomain helicase DNA-binding 4 (CHD4), regulates chromatin accessibility to conceal aberrant CTCF-binding sites embedded in H3K9me3-enriched heterochromatic B2 short interspersed nuclear elements (SINEs) in mouse embryonic stem cells (mESCs). Upon CHD4 depletion, these aberrant CTCF-binding sites become accessible and aberrant CTCF recruitment occurs within TADs, resulting in disorganization of local TADs. RNA-binding intrinsically disordered domains (IDRs) of CHD4 are required to prevent this aberrant CTCF binding, and CHD4 is critical for the repression of B2 SINE transcripts. These results collectively reveal that a CHD4-mediated mechanism ensures appropriate CTCF binding and associated TAD organization in mESCs.
Affiliation(s)
- Sungwook Han
- Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon 34141, Korea
| | - Hosuk Lee
- Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon 34141, Korea
- Center for Vascular Research, Institute for Basic Sciences, Daejeon 34141, Korea
| | - Andrew J. Lee
- Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon 34141, Korea
| | - Seung-Kyoon Kim
- Department of Life Sciences, Pohang University of Science and Technology, Pohang 37673, Korea
| | - Inkyung Jung
- Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon 34141, Korea
| | - Gou Young Koh
- Center for Vascular Research, Institute for Basic Sciences, Daejeon 34141, Korea
| | - Tae-Kyung Kim
- Department of Life Sciences, Pohang University of Science and Technology, Pohang 37673, Korea
| | - Daeyoup Lee
- Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon 34141, Korea
|
32
|
Sequence, structural and functional conservation among the human and fission yeast ELL and EAF transcription elongation factors. Mol Biol Rep 2021; 49:1303-1320. [PMID: 34807377 DOI: 10.1007/s11033-021-06958-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2021] [Accepted: 11/11/2021] [Indexed: 10/19/2022]
Abstract
BACKGROUND: Transcription elongation is a dynamic and tightly regulated step of gene expression in eukaryotic cells. The Eleven-nineteen Lysine-rich Leukemia (ELL) and ELL-Associated Factor (EAF) families of conserved proteins are required for efficient RNA polymerase II-mediated transcription elongation. Orthologs of these proteins have been identified in different organisms, including fission yeast and humans. METHODS AND RESULTS: In the present study, we have examined the sequence, structural and functional conservation between the fission yeast and human ELL and EAF orthologs. Our computational analysis revealed that these proteins share some sequence characteristics and are predominantly disordered in both organisms. Our functional complementation assays revealed that both human ELL and EAF proteins could complement the lack of ell1+ or eaf1+, respectively, in Schizosaccharomyces pombe. Furthermore, our domain mapping experiments demonstrated that both the amino- and carboxyl-terminal domains of human EAF proteins could functionally complement the S. pombe eaf1 deletion phenotypes. However, only the carboxyl-terminal domain of human ELL was able to partially rescue the phenotypes associated with lack of ell1+ in S. pombe. CONCLUSIONS: Collectively, our work adds ELL-EAF to the increasing list of human-yeast complementation gene pairs, wherein the simpler fission yeast can be used to further enhance our understanding of the role of these proteins in transcription elongation and human disease.
|
33
|
Dobson L, Tusnády GE. MemDis: Predicting Disordered Regions in Transmembrane Proteins. Int J Mol Sci 2021; 22:12270. [PMID: 34830151 PMCID: PMC8623522 DOI: 10.3390/ijms222212270] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 10/08/2021] [Revised: 11/02/2021] [Accepted: 11/09/2021] [Indexed: 11/16/2022] Open
Abstract
Transmembrane proteins (TMPs) play important roles in cells, ranging from transport processes and cell adhesion to communication. Many of these functions are mediated by intrinsically disordered regions (IDRs), flexible protein segments without a well-defined structure. Although a variety of methods are available for predicting IDRs, their accuracy is very limited on TMPs due to the special physico-chemical properties of these proteins. We prepared a dataset containing membrane proteins exclusively, using X-ray crystallography data. MemDis is a novel prediction method that uses convolutional neural networks and long short-term memory networks to predict disordered regions in TMPs. In addition to attributes commonly used in IDR predictors, we defined several TMP-specific features to further enhance the accuracy of our method. MemDis achieved the highest prediction accuracy on a TMP-specific dataset among popular IDR prediction methods.
Affiliation(s)
| | - Gábor E. Tusnády
- Institute of Enzymology, Research Centre for Natural Sciences, Magyar Tudósok Körútja 2, 1117 Budapest, Hungary
|
34
|
Unipolar Peptidoglycan Synthesis in the Rhizobiales Requires an Essential Class A Penicillin-Binding Protein. mBio 2021; 12:e0234621. [PMID: 34544272 PMCID: PMC8546619 DOI: 10.1128/mbio.02346-21] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 12/23/2022] Open
Abstract
Members of the Rhizobiales are polarly growing bacteria that lack homologs of the canonical Rod complex. To investigate the mechanisms underlying polar cell wall synthesis, we systematically probed the function of cell wall synthesis enzymes in the plant pathogen Agrobacterium tumefaciens. The development of fluorescent D-amino acid dipeptide (FDAAD) probes, which are incorporated into peptidoglycan by penicillin-binding proteins in A. tumefaciens, enabled us to monitor changes in growth patterns in the mutants. Use of these fluorescent cell wall probes and peptidoglycan compositional analysis demonstrate that a single class A penicillin-binding protein is essential for polar peptidoglycan synthesis. Furthermore, we find evidence of an additional mode of cell wall synthesis that requires LD-transpeptidase activity. Genetic analysis and cell wall targeting antibiotics reveal that the mechanism of unipolar growth is conserved in Sinorhizobium and Brucella. This work provides insights into unipolar peptidoglycan biosynthesis employed by the Rhizobiales during cell elongation.
|
35
|
Emenecker RJ, Griffith D, Holehouse AS. Metapredict: a fast, accurate, and easy-to-use predictor of consensus disorder and structure. Biophys J 2021; 120:4312-4319. [PMID: 34480923 PMCID: PMC8553642 DOI: 10.1016/j.bpj.2021.08.039] [Citation(s) in RCA: 67] [Impact Index Per Article: 22.3] [Received: 05/27/2021] [Revised: 08/08/2021] [Accepted: 08/30/2021] [Indexed: 01/02/2023] Open
Abstract
Intrinsically disordered proteins and protein regions make up a substantial fraction of many proteomes in which they play a wide variety of essential roles. A critical first step in understanding the role of disordered protein regions in biological function is to identify those disordered regions correctly. Computational methods for disorder prediction have emerged as a core set of tools to guide experiments, interpret results, and develop hypotheses. Given the multiple different predictors available, consensus scores have emerged as a popular approach to mitigate biases or limitations of any single method. Consensus scores integrate the outcome of multiple independent disorder predictors and provide a per-residue value that reflects the number of tools that predict a residue to be disordered. Although consensus scores help mitigate the inherent problems of using any single disorder predictor, they are computationally expensive to generate. They also necessitate the installation of multiple different software tools, which can be prohibitively difficult. To address this challenge, we developed a deep-learning-based predictor of consensus disorder scores. Our predictor, metapredict, utilizes a bidirectional recurrent neural network trained on the consensus disorder scores from 12 proteomes. By benchmarking metapredict using two orthogonal approaches, we found that metapredict is among the most accurate disorder predictors currently available. Metapredict is also remarkably fast, enabling proteome-scale disorder prediction in minutes. Importantly, metapredict is fully open source and is distributed as a Python package, a collection of command-line tools, and a web server, maximizing the potential practical utility of the predictor. We believe metapredict offers a convenient, accessible, accurate, and high-performance predictor for single proteins and proteomes alike.
Affiliation(s)
- Ryan J Emenecker
- Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, Missouri
- Center for Science and Engineering Living Systems (CSELS), St. Louis, Missouri
- Center for Engineering Mechanobiology, Washington University, St. Louis, Missouri
| | - Daniel Griffith
- Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, Missouri
- Center for Science and Engineering Living Systems (CSELS), St. Louis, Missouri
| | - Alex S Holehouse
- Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, Missouri
- Center for Science and Engineering Living Systems (CSELS), St. Louis, Missouri
|
36
|
Emenecker RJ, Griffith D, Holehouse AS. Metapredict: a fast, accurate, and easy-to-use predictor of consensus disorder and structure. Biophys J 2021; 120:4312-4319. [PMID: 34480923 DOI: 10.1101/2021.05.30.446349] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/27/2021] [Revised: 08/08/2021] [Accepted: 08/30/2021] [Indexed: 05/28/2023] Open
|
37
|
Zhao B, Katuwawala A, Oldfield CJ, Hu G, Wu Z, Uversky VN, Kurgan L. Intrinsic Disorder in Human RNA-Binding Proteins. J Mol Biol 2021; 433:167229. [PMID: 34487791 DOI: 10.1016/j.jmb.2021.167229] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 06/17/2021] [Revised: 08/30/2021] [Accepted: 08/31/2021] [Indexed: 12/24/2022]
Abstract
Although RNA-binding proteins (RBPs) are known to be enriched in intrinsic disorder, no previous analysis has focused on RBPs interacting with specific RNA types. We fill this gap with a comprehensive analysis of the putative disorder in RBPs binding to six common RNA types: messenger RNA (mRNA), transfer RNA (tRNA), small nuclear RNA (snRNA), non-coding RNA (ncRNA), ribosomal RNA (rRNA), and internal ribosome RNA (irRNA). We also analyze the amount of putative intrinsic disorder in the RNA-binding domains (RBDs) and non-RNA-binding-domain regions (non-RBD regions). Consistent with previous studies, we show that, in comparison with the human proteome, RBPs are significantly enriched in disorder. However, closer examination finds significant enrichment in predicted disorder for the mRNA-, rRNA- and snRNA-binding proteins, while the proteins that interact with ncRNA and irRNA are not enriched in disorder, and the tRNA-binding proteins are significantly depleted in disorder. We show a consistent pattern of significant disorder enrichment in the non-RBD regions coupled with low levels of disorder in RBDs, which suggests that disorder is relatively rarely utilized in the RNA-binding regions. Our analysis of the non-RBD regions suggests that disorder harbors posttranslational modification sites and is involved in the putative interactions with DNA. Importantly, we utilize experimental data from DisProt and independent data from Pfam to validate the above observations that rely on the disorder predictions. This study provides new insights into the distribution of disorder across proteins that bind different RNA types and the functional role of disorder in the regions where it is enriched.
Affiliation(s)
- Bi Zhao
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
| | - Akila Katuwawala
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
| | - Christopher J Oldfield
- Department of Microbiology and Immunology, Virginia Commonwealth University, Richmond, VA 23284, USA
| | - Gang Hu
- School of Statistics and Data Science, LPMC and KLMDASR, Nankai University, Tianjin 300071, China
| | - Zhonghua Wu
- School of Mathematical Sciences and LPMC, Nankai University, Tianjin 300071, China
| | - Vladimir N Uversky
- Department of Molecular Medicine and USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA.
|
38
|
Maron MI, Lehman SM, Gayatri S, DeAngelo JD, Hegde S, Lorton BM, Sun Y, Bai DL, Sidoli S, Gupta V, Marunde MR, Bone JR, Sun ZW, Bedford MT, Shabanowitz J, Chen H, Hunt DF, Shechter D. Independent transcriptomic and proteomic regulation by type I and II protein arginine methyltransferases. iScience 2021; 24:102971. [PMID: 34505004 PMCID: PMC8417332 DOI: 10.1016/j.isci.2021.102971] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 06/25/2020] [Revised: 06/21/2021] [Accepted: 08/09/2021] [Indexed: 12/22/2022] Open
Abstract
Protein arginine methyltransferases (PRMTs) catalyze the post-translational monomethylation (Rme1), asymmetric (Rme2a), or symmetric (Rme2s) dimethylation of arginine. To determine the cellular consequences of type I (Rme2a) and II (Rme2s) PRMTs, we developed and integrated multiple approaches. First, we determined total cellular dimethylarginine levels, revealing that Rme2s was ∼3% of total Rme2 and that this percentage was dependent upon cell type and PRMT inhibition status. Second, we quantitatively characterized in vitro substrates of the major enzymes and expanded upon PRMT substrate recognition motifs. We also compiled our data with publicly available methylarginine-modified residues into a comprehensive database. Third, we inhibited type I and II PRMTs and performed proteomic and transcriptomic analyses to reveal their phenotypic consequences. These experiments revealed both overlapping and independent PRMT substrates and cellular functions. Overall, this study expands upon PRMT substrate diversity, the arginine methylome, and the complex interplay of type I and II PRMTs.
Affiliation(s)
- Maxim I. Maron
- Department of Biochemistry, Albert Einstein College of Medicine, Bronx, NY 10461, USA
| | - Stephanie M. Lehman
- Department of Chemistry, University of Virginia, Charlottesville, VA 22904, USA
| | - Sitaram Gayatri
- Department of Epigenetics and Molecular Carcinogenesis, The University of Texas MD Anderson Cancer Center, Smithville, TX, USA
- Center for Cancer Epigenetics, The University of Texas MD Anderson Cancer Center, Smithville, TX, USA
- Graduate Program in Genetics and Epigenetics, The University of Texas MD Anderson UT Health Graduate School of Biomedical Sciences, Houston, TX 77030, USA
| | - Joseph D. DeAngelo
- Department of Biochemistry, Albert Einstein College of Medicine, Bronx, NY 10461, USA
| | - Subray Hegde
- Department of Biochemistry, Albert Einstein College of Medicine, Bronx, NY 10461, USA
| | - Benjamin M. Lorton
- Department of Biochemistry, Albert Einstein College of Medicine, Bronx, NY 10461, USA
| | - Yan Sun
- Department of Biochemistry, Albert Einstein College of Medicine, Bronx, NY 10461, USA
| | - Dina L. Bai
- Department of Chemistry, University of Virginia, Charlottesville, VA 22904, USA
| | - Simone Sidoli
- Department of Biochemistry, Albert Einstein College of Medicine, Bronx, NY 10461, USA
| | - Varun Gupta
- Department of Cell Biology, Albert Einstein College of Medicine, Bronx, NY 10461, USA
| | | | - James R. Bone
- EpiCypher, Inc., Research Triangle Park, NC 27709, USA
| | - Zu-Wen Sun
- EpiCypher, Inc., Research Triangle Park, NC 27709, USA
| | - Mark T. Bedford
- Department of Epigenetics and Molecular Carcinogenesis, The University of Texas MD Anderson Cancer Center, Smithville, TX, USA
- Center for Cancer Epigenetics, The University of Texas MD Anderson Cancer Center, Smithville, TX, USA
- Graduate Program in Genetics and Epigenetics, The University of Texas MD Anderson UT Health Graduate School of Biomedical Sciences, Houston, TX 77030, USA
| | - Jeffrey Shabanowitz
- Department of Chemistry, University of Virginia, Charlottesville, VA 22904, USA
| | - Hongshan Chen
- Department of Biochemistry, Albert Einstein College of Medicine, Bronx, NY 10461, USA
| | - Donald F. Hunt
- Departments of Chemistry and Pathology, University of Virginia, Charlottesville, VA 22904, USA
| | - David Shechter
- Department of Biochemistry, Albert Einstein College of Medicine, Bronx, NY 10461, USA
|
39
|
Fei X, Li Q, Olsen JE, Jiao X. Duo: A Signature Based Method to Batch-Analyze Functional Similarities of Proteins. Front Microbiol 2021; 12:698322. [PMID: 34475860 PMCID: PMC8406696 DOI: 10.3389/fmicb.2021.698322] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2021] [Accepted: 07/22/2021] [Indexed: 11/16/2022] Open
Abstract
With the rapid advancement of sequencing technology, handling large sequencing datasets to analyze the protein-coding capacity and functionality of predicted proteins has become an urgent demand. There is a lack of simple and effective tools to functionally annotate large numbers of unknown proteins in a personalized and customized workflow. To address this, we developed Duo, which batch-analyzes functional similarities of predicted proteins. Duo can screen query proteins with specific characteristics based on highly flexible and customizable reference inputs from the user. In the current study, Duo was applied to screen for virulence-associated proteins in the genome sequence of Salmonella Typhimurium. Based on the analysis, we give recommendations for choosing a Seed_database that yields a reasonable number of predicted proteins for further analysis, and for preparing a Reference_proteins set for Duo. Delta-bitscore analysis was shown to be a useful tool for focusing the follow-up on predicted proteins. A successful screen for virulence proteins in the bacterial genome sequence was further performed in a selection of 32 pathogenic bacteria, documenting the ability of Duo to work on a broad collection of bacteria. We anticipate that Duo will be a useful auxiliary tool for personalized and customized protein function research in the future.
Affiliation(s)
- Xiao Fei
- Key Laboratory of Prevention and Control of Biological Hazard Factors (Animal Origin) for Agri-food Safety and Quality, Ministry of Agriculture of China, Yangzhou University, Yangzhou, China; Jiangsu Key Lab of Zoonosis/Jiangsu Co-Innovation Center for Prevention and Control of Important Animal Infectious Diseases and Zoonoses, Yangzhou University, Yangzhou, China; Joint International Research Laboratory of Agriculture and Agri-Product Safety, Yangzhou University, Yangzhou, China; Department of Veterinary and Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
| | - Qiuchun Li
- Key Laboratory of Prevention and Control of Biological Hazard Factors (Animal Origin) for Agri-food Safety and Quality, Ministry of Agriculture of China, Yangzhou University, Yangzhou, China; Jiangsu Key Lab of Zoonosis/Jiangsu Co-Innovation Center for Prevention and Control of Important Animal Infectious Diseases and Zoonoses, Yangzhou University, Yangzhou, China; Joint International Research Laboratory of Agriculture and Agri-Product Safety, Yangzhou University, Yangzhou, China
| | - John Elmerdahl Olsen
- Department of Veterinary and Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
| | - Xinan Jiao
- Key Laboratory of Prevention and Control of Biological Hazard Factors (Animal Origin) for Agri-food Safety and Quality, Ministry of Agriculture of China, Yangzhou University, Yangzhou, China; Jiangsu Key Lab of Zoonosis/Jiangsu Co-Innovation Center for Prevention and Control of Important Animal Infectious Diseases and Zoonoses, Yangzhou University, Yangzhou, China; Joint International Research Laboratory of Agriculture and Agri-Product Safety, Yangzhou University, Yangzhou, China
|
40
|
Mier P, Paladin L, Tamana S, Petrosian S, Hajdu-Soltész B, Urbanek A, Gruca A, Plewczynski D, Grynberg M, Bernadó P, Gáspári Z, Ouzounis CA, Promponas VJ, Kajava AV, Hancock JM, Tosatto SCE, Dosztanyi Z, Andrade-Navarro MA. Disentangling the complexity of low complexity proteins. Brief Bioinform 2021; 21:458-472. [PMID: 30698641 PMCID: PMC7299295 DOI: 10.1093/bib/bbz007] [Citation(s) in RCA: 51] [Impact Index Per Article: 17.0] [Received: 11/12/2018] [Revised: 12/19/2018] [Accepted: 01/07/2019] [Indexed: 12/31/2022] Open
Abstract
There are multiple definitions for low complexity regions (LCRs) in protein sequences, with all of them broadly considering LCRs as regions with fewer amino acid types compared to an average composition. Following this view, LCRs can also be defined as regions showing composition bias. In this critical review, we focus on the definition of sequence complexity of LCRs and their connection with structure. We present statistics and methodological approaches that measure low complexity (LC) and related sequence properties. Composition bias is often associated with LC and disorder, but repeats, while compositionally biased, might also induce ordered structures. We illustrate this dichotomy, and more generally the overlaps between different properties related to LCRs, using examples. We argue that statistical measures alone cannot capture all structural aspects of LCRs and recommend the combined usage of a variety of predictive tools and measurements. While the methodologies available to study LCRs are already very advanced, we foresee that a more comprehensive annotation of sequences in the databases will enable the improvement of predictions and a better understanding of the evolution and the connection between structure and function of LCRs. This will require the use of standards for the generation and exchange of data describing all aspects of LCRs.
Affiliation(s)
- Pablo Mier
- Institute of Organismic and Molecular Evolution, Johannes Gutenberg University of Mainz, Mainz, Germany
| | - Lisanna Paladin
- Department of Biomedical Science, University of Padova, Padova, Italy
| | - Stella Tamana
- Bioinformatics Research Laboratory, Department of Biological Sciences, University of Cyprus, Nicosia, Cyprus
| | - Sophia Petrosian
- Biological Computation and Process Laboratory, Chemical Process & Energy Resources Institute, Centre for Research & Technology Hellas, Thessalonica, Greece
| | - Borbála Hajdu-Soltész
- MTA-ELTE Lendület Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Budapest, Hungary
| | - Annika Urbanek
- Centre de Biochimie Structurale, INSERM, CNRS, Université de Montpellier, Montpellier, France
| | - Aleksandra Gruca
- Institute of Informatics, Silesian University of Technology, Gliwice, Poland
| | - Dariusz Plewczynski
- Center of New Technologies, University of Warsaw, Warsaw, Poland; Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland
| | | | - Pau Bernadó
- Centre de Biochimie Structurale, INSERM, CNRS, Université de Montpellier, Montpellier, France
| | - Zoltán Gáspári
- Faculty of Information Technology and Bionics, Pázmány Péter Catholic University, Budapest, Hungary
| | - Christos A Ouzounis
- Biological Computation and Process Laboratory, Chemical Process & Energy Resources Institute, Centre for Research & Technology Hellas, Thessalonica, Greece
| | - Vasilis J Promponas
- Bioinformatics Research Laboratory, Department of Biological Sciences, University of Cyprus, Nicosia, Cyprus
| | - Andrey V Kajava
- Centre de Recherche en Biologie Cellulaire de Montpellier, CNRS-UMR, Institut de Biologie Computationnelle, Université de Montpellier, Montpellier, France; Institute of Bioengineering, University ITMO, St. Petersburg, Russia
| | - John M Hancock
- Earlham Institute, Norwich, UK; ELIXIR Hub, Wellcome Genome Campus, Hinxton, UK
| | - Silvio C E Tosatto
- Department of Biomedical Science, University of Padova, Padova, Italy; CNR Institute of Neuroscience, Padova, Italy
| | - Zsuzsanna Dosztanyi
- MTA-ELTE Lendület Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Budapest, Hungary
| | - Miguel A Andrade-Navarro
- Institute of Organismic and Molecular Evolution, Johannes Gutenberg University of Mainz, Mainz, Germany
|
41
|
Dorone Y, Boeynaems S, Flores E, Jin B, Hateley S, Bossi F, Lazarus E, Pennington JG, Michiels E, De Decker M, Vints K, Baatsen P, Bassel GW, Otegui MS, Holehouse AS, Exposito-Alonso M, Sukenik S, Gitler AD, Rhee SY. A prion-like protein regulator of seed germination undergoes hydration-dependent phase separation. Cell 2021; 184:4284-4298.e27. [PMID: 34233164 PMCID: PMC8513799 DOI: 10.1016/j.cell.2021.06.009] [Citation(s) in RCA: 74] [Impact Index Per Article: 24.7] [Received: 08/16/2020] [Revised: 03/22/2021] [Accepted: 06/04/2021] [Indexed: 12/22/2022]
Abstract
Many organisms evolved strategies to survive desiccation. Plant seeds protect dehydrated embryos from various stressors and can lie dormant for millennia. Hydration is the key trigger to initiate germination, but the mechanism by which seeds sense water remains unresolved. We identified an uncharacterized Arabidopsis thaliana prion-like protein we named FLOE1, which phase separates upon hydration and allows the embryo to sense water stress. We demonstrate that biophysical states of FLOE1 condensates modulate its biological function in vivo in suppressing seed germination under unfavorable environments. We find intragenic, intraspecific, and interspecific natural variation in FLOE1 expression and phase separation and show that intragenic variation is associated with adaptive germination strategies in natural populations. This combination of molecular, organismal, and ecological studies uncovers FLOE1 as a tunable environmental sensor with direct implications for the design of drought-resistant crops, in the face of climate change.
Affiliation(s)
- Yanniv Dorone
- Department of Plant Biology, Carnegie Institution for Science, Stanford, CA 94305, USA; Department of Biology, Stanford University, Stanford, CA 94305, USA
| | - Steven Boeynaems
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Eduardo Flores
- Department of Chemistry and Chemical Biology, UC Merced, Merced, CA 95340, USA
| | - Benjamin Jin
- Department of Plant Biology, Carnegie Institution for Science, Stanford, CA 94305, USA
| | - Shannon Hateley
- Department of Plant Biology, Carnegie Institution for Science, Stanford, CA 94305, USA
| | - Flavia Bossi
- Department of Plant Biology, Carnegie Institution for Science, Stanford, CA 94305, USA
| | - Elena Lazarus
- Department of Plant Biology, Carnegie Institution for Science, Stanford, CA 94305, USA
| | - Janice G Pennington
- Center for Quantitative Cell Imaging, University of Wisconsin, Madison, WI 53706, USA
| | - Emiel Michiels
- EM-platform@VIB Bio Imaging Core and VIB Center for Brain and Disease Research, KU Leuven, 3000 Leuven, Belgium; Switch Laboratory, Department of Cellular and Molecular Medicine, KU Leuven, 3000 Leuven, Belgium
| | - Mathias De Decker
- EM-platform@VIB Bio Imaging Core and VIB Center for Brain and Disease Research, KU Leuven, 3000 Leuven, Belgium; KU Leuven - University of Leuven, Department of Neurosciences, Experimental Neurology, and Leuven Brain Institute (LBI), 3000 Leuven, Belgium
| | - Katlijn Vints
- EM-platform@VIB Bio Imaging Core and VIB Center for Brain and Disease Research, KU Leuven, 3000 Leuven, Belgium
| | - Pieter Baatsen
- EM-platform@VIB Bio Imaging Core and VIB Center for Brain and Disease Research, KU Leuven, 3000 Leuven, Belgium
| | - George W Bassel
- School of Life Sciences, University of Warwick, Coventry CV4 7AL, UK
| | - Marisa S Otegui
- Center for Quantitative Cell Imaging, University of Wisconsin, Madison, WI 53706, USA; Department of Botany, University of Wisconsin, Madison, WI 53706, USA
| | - Alex S Holehouse
- Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, MO 63110, USA; Center for Science and Engineering of Living Systems (CSELS), Washington University in St. Louis, St. Louis, MO 63130, USA
| | - Moises Exposito-Alonso
- Department of Plant Biology, Carnegie Institution for Science, Stanford, CA 94305, USA; Department of Biology, Stanford University, Stanford, CA 94305, USA
| | - Shahar Sukenik
- Department of Chemistry and Chemical Biology, UC Merced, Merced, CA 95340, USA
| | - Aaron D Gitler
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA.
| | - Seung Y Rhee
- Department of Plant Biology, Carnegie Institution for Science, Stanford, CA 94305, USA.
|
42
|
Orti F, Navarro AM, Rabinovich A, Wodak SJ, Marino-Buslje C. Insight into membraneless organelles and their associated proteins: Drivers, Clients and Regulators. Comput Struct Biotechnol J 2021; 19:3964-3977. [PMID: 34377363 PMCID: PMC8318826 DOI: 10.1016/j.csbj.2021.06.042] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 03/14/2021] [Revised: 06/26/2021] [Accepted: 06/27/2021] [Indexed: 02/06/2023] Open
Abstract
In recent years, attention has been devoted to proteins forming immiscible liquid phases within the liquid intracellular medium, commonly referred to as membraneless organelles (MLO). These organelles enable the spatiotemporal associations of cellular components that exchange dynamically with the cellular milieu. The dysregulation of these liquid-liquid phase separation processes (LLPS) may cause various diseases including neurodegenerative pathologies and cancer, among others. Until very recently, databases containing information on proteins forming MLOs, as well as tools and resources facilitating their analysis, were missing. This has recently changed with the publication of 4 databases that focus on different types of experiments, sets of proteins, inclusion criteria, and levels of annotation or curation. In this study we integrate and analyze the information across these databases, complement their records, and produce a consolidated set of proteins that enables the investigation of the LLPS phenomenon. To gain insight into the features that characterize different types of MLOs and the roles of their associated proteins, they were grouped into categories: High Confidence MLO associated (including Drivers and reviewed proteins), Potential Clients and Regulators, according to their annotated functions. We show that none of the databases taken alone covers the data sufficiently to enable meaningful analysis, validating our integration effort as essential for gaining better understanding of phase separation and laying the foundations for the discovery of new proteins potentially involved in this important cellular process. Lastly, we developed a server, enabling customized selections of different sets of proteins based on MLO location, database, disorder content, among other attributes (https://forti.shinyapps.io/mlos/).
Affiliation(s)
- Fernando Orti
- Bioinformatics Unit, Fundación Instituto Leloir. Avda. Patricias Argentinas 435, Buenos Aires B1405WE, Argentina
| | - Alvaro M. Navarro
- Bioinformatics Unit, Fundación Instituto Leloir. Avda. Patricias Argentinas 435, Buenos Aires B1405WE, Argentina
| | - Andres Rabinovich
- Bioinformatics Unit, Fundación Instituto Leloir. Avda. Patricias Argentinas 435, Buenos Aires B1405WE, Argentina
| | - Shoshana J. Wodak
- VIB-VUB Center for Structural Biology, Flemish Institute for Biotechnology, Brussels, Belgium
- Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
| | - Cristina Marino-Buslje
- Bioinformatics Unit, Fundación Instituto Leloir. Avda. Patricias Argentinas 435, Buenos Aires B1405WE, Argentina
|
43
|
Emenecker RJ, Holehouse AS, Strader LC. Biological Phase Separation and Biomolecular Condensates in Plants. Annu Rev Plant Biol 2021; 72:17-46. [PMID: 33684296 PMCID: PMC8221409 DOI: 10.1146/annurev-arplant-081720-015238] [Citation(s) in RCA: 40] [Impact Index Per Article: 13.3] [Indexed: 05/04/2023]
Abstract
A surge in research focused on understanding the physical principles governing the formation, properties, and function of membraneless compartments has occurred over the past decade. Compartments such as the nucleolus, stress granules, and nuclear speckles have been designated as biomolecular condensates to describe their shared property of spatially concentrating biomolecules. Although this research has historically been carried out in animal and fungal systems, recent work has begun to explore whether these same principles are relevant in plants. Effectively understanding and studying biomolecular condensates require interdisciplinary expertise that spans cell biology, biochemistry, and condensed matter physics and biophysics. As such, some involved concepts may be unfamiliar to any given individual. This review focuses on introducing concepts essential to the study of biomolecular condensates and phase separation for biologists seeking to carry out research in this area and further examines aspects of biomolecular condensates that are relevant to plant systems.
Affiliation(s)
- Ryan J Emenecker
- Department of Biology, Washington University, St. Louis, Missouri 63130, USA
- Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- Center for Science and Engineering of Living Systems, Washington University, St. Louis, Missouri 63130, USA
- Center for Engineering MechanoBiology, Washington University, St. Louis, Missouri 63130, USA
| | - Alex S Holehouse
- Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- Center for Science and Engineering of Living Systems, Washington University, St. Louis, Missouri 63130, USA
| | - Lucia C Strader
- Center for Science and Engineering of Living Systems, Washington University, St. Louis, Missouri 63130, USA
- Center for Engineering MechanoBiology, Washington University, St. Louis, Missouri 63130, USA
- Department of Biology, Duke University, Durham, North Carolina 27708, USA
|
44
|
Chen YF, Xia Y. Structural Profiling of Bacterial Effectors Reveals Enrichment of Host-Interacting Domains and Motifs. Front Mol Biosci 2021; 8:626600. [PMID: 34012977 PMCID: PMC8126662 DOI: 10.3389/fmolb.2021.626600] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/06/2020] [Accepted: 04/21/2021] [Indexed: 11/13/2022] Open
Abstract
Effector proteins are bacterial virulence factors secreted directly into host cells and, through extensive interactions with host proteins, rewire host signaling pathways to the advantage of the pathogen. Despite the crucial role of globular domains as mediators of protein-protein interactions (PPIs), previous structural studies of bacterial effectors are primarily focused on individual domains, rather than domain-mediated PPIs, which limits their ability to uncover systems-level molecular recognition principles governing host-bacteria interactions. Here, we took an interaction-centric approach and systematically examined the potential of structural components within bacterial proteins to engage in or target eukaryote-specific domain-domain interactions (DDIs). Our results indicate that: 1) effectors are about six times as likely as non-effectors to contain host-like domains that mediate DDIs exclusively in eukaryotes; 2) the average domain in effectors is about seven times as likely as that in non-effectors to co-occur with DDI partners in eukaryotes rather than in bacteria; and 3) effectors are about nine times as likely as non-effectors to contain bacteria-exclusive domains that target host domains mediating DDIs exclusively in eukaryotes. Moreover, in the absence of host-like domains or among pathogen proteins without domain assignment, effectors harbor a higher variety and density of short linear motifs targeting host domains that mediate DDIs exclusively in eukaryotes. Our study lends novel quantitative insight into the structural basis of effector-induced perturbation of host-endogenous PPIs and may aid in the design of selective inhibitors of host-pathogen interactions.
Affiliation(s)
| | - Yu Xia
- Department of Bioengineering, McGill University, Montreal, QC, Canada
|
45
|
Sponga A, Arolas JL, Schwarz TC, Jeffries CM, Rodriguez Chamorro A, Kostan J, Ghisleni A, Drepper F, Polyansky A, De Almeida Ribeiro E, Pedron M, Zawadzka-Kazimierczuk A, Mlynek G, Peterbauer T, Doto P, Schreiner C, Hollerl E, Mateos B, Geist L, Faulkner G, Kozminski W, Svergun DI, Warscheid B, Zagrovic B, Gautel M, Konrat R, Djinović-Carugo K. Order from disorder in the sarcomere: FATZ forms a fuzzy but tight complex and phase-separated condensates with α-actinin. Sci Adv 2021; 7:eabg7653. [PMID: 34049882 PMCID: PMC8163081 DOI: 10.1126/sciadv.abg7653] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 02/02/2021] [Accepted: 04/13/2021] [Indexed: 05/03/2023]
Abstract
In sarcomeres, α-actinin cross-links actin filaments and anchors them to the Z-disk. FATZ (filamin-, α-actinin-, and telethonin-binding protein of the Z-disk) proteins interact with α-actinin and other core Z-disk proteins, contributing to myofibril assembly and maintenance. Here, we report the first structure and its cellular validation of α-actinin-2 in complex with a Z-disk partner, FATZ-1, which is best described as a conformational ensemble. We show that FATZ-1 forms a tight fuzzy complex with α-actinin-2 and propose an interaction mechanism via main molecular recognition elements and secondary binding sites. The obtained integrative model reveals a polar architecture of the complex which, in combination with FATZ-1 multivalent scaffold function, might organize interaction partners and stabilize α-actinin-2 preferential orientation in Z-disk. Last, we uncover FATZ-1 ability to phase-separate and form biomolecular condensates with α-actinin-2, raising the question whether FATZ proteins can create an interaction hub for Z-disk proteins through membraneless compartmentalization during myofibrillogenesis.
Affiliation(s)
- Antonio Sponga
- Department of Structural and Computational Biology, Max Perutz Labs, University of Vienna, Campus Vienna Biocenter 5, A-1030 Vienna, Austria
| | - Joan L Arolas
- Department of Structural and Computational Biology, Max Perutz Labs, University of Vienna, Campus Vienna Biocenter 5, A-1030 Vienna, Austria
| | - Thomas C Schwarz
- Department of Structural and Computational Biology, Max Perutz Labs, University of Vienna, Campus Vienna Biocenter 5, A-1030 Vienna, Austria
| | - Cy M Jeffries
- European Molecular Biology Laboratory (EMBL), Hamburg Unit, Hamburg, Germany
| | - Ariadna Rodriguez Chamorro
- Department of Structural and Computational Biology, Max Perutz Labs, University of Vienna, Campus Vienna Biocenter 5, A-1030 Vienna, Austria
| | - Julius Kostan
- Department of Structural and Computational Biology, Max Perutz Labs, University of Vienna, Campus Vienna Biocenter 5, A-1030 Vienna, Austria
| | - Andrea Ghisleni
- King's College London BHF Centre for Research Excellence, Randall Centre for Cell and Molecular Biophysics, London SE1 1UL, UK
| | - Friedel Drepper
- Biochemistry and Functional Proteomics, Institute of Biology II, Faculty of Biology, University of Freiburg, 79104 Freiburg, Germany
- Signalling Research Centres BIOSS and CIBSS, University of Freiburg, 79104 Freiburg, Germany
| | - Anton Polyansky
- Department of Structural and Computational Biology, Max Perutz Labs, University of Vienna, Campus Vienna Biocenter 5, A-1030 Vienna, Austria
- National Research University Higher School of Economics, Moscow 101000, Russia
| | - Euripedes De Almeida Ribeiro
- Department of Structural and Computational Biology, Max Perutz Labs, University of Vienna, Campus Vienna Biocenter 5, A-1030 Vienna, Austria
| | - Miriam Pedron
- Department of Structural and Computational Biology, Max Perutz Labs, University of Vienna, Campus Vienna Biocenter 5, A-1030 Vienna, Austria
| | - Anna Zawadzka-Kazimierczuk
- Biological and Chemical Research Centre, Faculty of Chemistry, University of Warsaw, Zwirki i Wigury 101, 02-089 Warsaw, Poland
| | - Georg Mlynek
- Department of Structural and Computational Biology, Max Perutz Labs, University of Vienna, Campus Vienna Biocenter 5, A-1030 Vienna, Austria
| | - Thomas Peterbauer
- Department of Biochemistry and Cell Biology, Max Perutz Labs, University of Vienna, Dr. BohrGasse 9, A-1030 Vienna, Austria
| | - Pierantonio Doto
- Department of Structural and Computational Biology, Max Perutz Labs, University of Vienna, Campus Vienna Biocenter 5, A-1030 Vienna, Austria
| | - Claudia Schreiner
- Department of Structural and Computational Biology, Max Perutz Labs, University of Vienna, Campus Vienna Biocenter 5, A-1030 Vienna, Austria
| | - Eneda Hollerl
- Department of Structural and Computational Biology, Max Perutz Labs, University of Vienna, Campus Vienna Biocenter 5, A-1030 Vienna, Austria
| | - Borja Mateos
- Department of Structural and Computational Biology, Max Perutz Labs, University of Vienna, Campus Vienna Biocenter 5, A-1030 Vienna, Austria
| | - Leonhard Geist
- Department of Structural and Computational Biology, Max Perutz Labs, University of Vienna, Campus Vienna Biocenter 5, A-1030 Vienna, Austria
| | | | - Wiktor Kozminski
- Biological and Chemical Research Centre, Faculty of Chemistry, University of Warsaw, Zwirki i Wigury 101, 02-089 Warsaw, Poland
| | - Dmitri I Svergun
- King's College London BHF Centre for Research Excellence, Randall Centre for Cell and Molecular Biophysics, London SE1 1UL, UK
| | - Bettina Warscheid
- Biochemistry and Functional Proteomics, Institute of Biology II, Faculty of Biology, University of Freiburg, 79104 Freiburg, Germany
- Signalling Research Centres BIOSS and CIBSS, University of Freiburg, 79104 Freiburg, Germany
| | - Bojan Zagrovic
- Department of Structural and Computational Biology, Max Perutz Labs, University of Vienna, Campus Vienna Biocenter 5, A-1030 Vienna, Austria
| | - Mathias Gautel
- King's College London BHF Centre for Research Excellence, Randall Centre for Cell and Molecular Biophysics, London SE1 1UL, UK
| | - Robert Konrat
- Department of Structural and Computational Biology, Max Perutz Labs, University of Vienna, Campus Vienna Biocenter 5, A-1030 Vienna, Austria
| | - Kristina Djinović-Carugo
- Department of Structural and Computational Biology, Max Perutz Labs, University of Vienna, Campus Vienna Biocenter 5, A-1030 Vienna, Austria.
- Department of Biochemistry, Faculty of Chemistry and Chemical Technology, University of Ljubljana, Večna pot 113, SI-1000 Ljubljana, Slovenia
|
46
|
Carinci M, Testa B, Bordi M, Milletti G, Bonora M, Antonucci L, Ferraina C, Carro M, Kumar M, Ceglie D, Eck F, Nardacci R, le Guerroué F, Petrini S, Soriano ME, Caruana I, Doria V, Manifava M, Peron C, Lambrughi M, Tiranti V, Behrends C, Papaleo E, Pinton P, Giorgi C, Ktistakis NT, Locatelli F, Nazio F, Cecconi F. TFG binds LC3C to regulate ULK1 localization and autophagosome formation. EMBO J 2021; 40:e103563. [PMID: 33932238 PMCID: PMC8126910 DOI: 10.15252/embj.2019103563] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 09/26/2019] [Revised: 02/17/2021] [Accepted: 03/01/2021] [Indexed: 12/14/2022] Open
Abstract
The early secretory pathway and autophagy are two essential and evolutionarily conserved endomembrane processes that are finely interlinked. Although growing evidence suggests that intracellular trafficking is important for autophagosome biogenesis, the molecular regulatory network involved is still not fully defined. In this study, we demonstrate a crucial effect of the COPII vesicle-related protein TFG (Trk-fused gene) on ULK1 puncta number and localization during autophagy induction. This, in turn, affects formation of the isolation membrane, as well as the correct dynamics of association between LC3B and early ATG proteins, leading to the proper formation of both omegasomes and autophagosomes. Consistently, fibroblasts derived from a hereditary spastic paraparesis (HSP) patient carrying mutated TFG (R106C) show defects in both autophagy and ULK1 puncta accumulation. In addition, we demonstrate that TFG activity in autophagy depends on its interaction with the ATG8 protein LC3C through a canonical LIR motif, thereby favouring LC3C-ULK1 binding. Altogether, our results uncover a link between TFG and autophagy and identify TFG as a molecular scaffold linking the early secretion pathway to autophagy.
Affiliation(s)
- Marianna Carinci
- Department of Pediatric Hemato-Oncology and Cell and Gene Therapy, IRCCS Bambino Gesù Children's Hospital, Rome, Italy; Department of Medical Sciences, University of Ferrara, Laboratory of Technologies for Advanced Therapy (LTTA), Technopole of Ferrara, Ferrara, Italy
| | - Beatrice Testa
- Department of Biology, University of Rome Tor Vergata, Rome, Italy
| | - Matteo Bordi
- Department of Pediatric Hemato-Oncology and Cell and Gene Therapy, IRCCS Bambino Gesù Children's Hospital, Rome, Italy; Department of Biology, University of Rome Tor Vergata, Rome, Italy
| | - Giacomo Milletti
- Department of Pediatric Hemato-Oncology and Cell and Gene Therapy, IRCCS Bambino Gesù Children's Hospital, Rome, Italy; Department of Biology, University of Rome Tor Vergata, Rome, Italy
| | - Massimo Bonora
- Department of Medical Sciences, University of Ferrara, Laboratory of Technologies for Advanced Therapy (LTTA), Technopole of Ferrara, Ferrara, Italy
| | - Laura Antonucci
- Department of Pediatric Hemato-Oncology and Cell and Gene Therapy, IRCCS Bambino Gesù Children's Hospital, Rome, Italy
| | - Caterina Ferraina
- Department of Pediatric Hemato-Oncology and Cell and Gene Therapy, IRCCS Bambino Gesù Children's Hospital, Rome, Italy
| | - Marta Carro
- Department of Biology, University of Padua, Padua, Italy
| | - Mukesh Kumar
- Computational Biology Laboratory, Danish Cancer Society Research Center, Copenhagen, Denmark
| | - Donatella Ceglie
- Department of Pediatric Hemato-Oncology and Cell and Gene Therapy, IRCCS Bambino Gesù Children's Hospital, Rome, Italy
| | - Franziska Eck
- Munich Cluster for Systems Neurology (SyNergy), Ludwig-Maximilians-Universität (LMU), München, Germany
| | - Roberta Nardacci
- Department of Epidemiology and Preclinical Research, National Institute for Infectious Diseases IRCCS "L. Spallanzani", Rome, Italy
| | - Francois le Guerroué
- Munich Cluster for Systems Neurology (SyNergy), Ludwig-Maximilians-Universität (LMU), München, Germany
| | - Stefania Petrini
- Confocal Microscopy Core Facility, Research Laboratories, IRCCS Bambino Gesù Children's Hospital, Rome, Italy
| | | | - Ignazio Caruana
- Department of Pediatric Hemato-Oncology and Cell and Gene Therapy, IRCCS Bambino Gesù Children's Hospital, Rome, Italy
| | - Valentina Doria
- Confocal Microscopy Core Facility, Research Laboratories, IRCCS Bambino Gesù Children's Hospital, Rome, Italy
| | | | - Camille Peron
- UO Medical Genetics and Neurogenetics, Fondazione IRCCS Istituto Neurologico C. Besta, Milan, Italy
| | - Matteo Lambrughi
- Computational Biology Laboratory, Danish Cancer Society Research Center, Copenhagen, Denmark
| | - Valeria Tiranti
- UO Medical Genetics and Neurogenetics, Fondazione IRCCS Istituto Neurologico C. Besta, Milan, Italy
| | - Christian Behrends
- Munich Cluster for Systems Neurology (SyNergy), Ludwig-Maximilians-Universität (LMU), München, Germany
| | - Elena Papaleo
- Computational Biology Laboratory, Danish Cancer Society Research Center, Copenhagen, Denmark; Translational Disease Systems Biology, Faculty of Health and Medical Sciences, Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen, Denmark
| | - Paolo Pinton
- Department of Medical Sciences, University of Ferrara, Laboratory of Technologies for Advanced Therapy (LTTA), Technopole of Ferrara, Ferrara, Italy
| | - Carlotta Giorgi
- Department of Medical Sciences, University of Ferrara, Laboratory of Technologies for Advanced Therapy (LTTA), Technopole of Ferrara, Ferrara, Italy
| | | | - Franco Locatelli
- Department of Pediatric Hemato-Oncology and Cell and Gene Therapy, IRCCS Bambino Gesù Children's Hospital, Rome, Italy; Department of Gynecology/Obstetrics and Pediatrics, Sapienza University, Rome, Italy
| | - Francesca Nazio
- Department of Pediatric Hemato-Oncology and Cell and Gene Therapy, IRCCS Bambino Gesù Children's Hospital, Rome, Italy
| | - Francesco Cecconi
- Department of Pediatric Hemato-Oncology and Cell and Gene Therapy, IRCCS Bambino Gesù Children's Hospital, Rome, Italy; Department of Biology, University of Rome Tor Vergata, Rome, Italy; Unit of Cell Stress and Survival, Danish Cancer Society Research Center, Copenhagen, Denmark
| |
|
47
|
Katuwawala A, Ghadermarzi S, Hu G, Wu Z, Kurgan L. QUARTERplus: Accurate disorder predictions integrated with interpretable residue-level quality assessment scores. Comput Struct Biotechnol J 2021; 19:2597-2606. [PMID: 34025946 PMCID: PMC8122155 DOI: 10.1016/j.csbj.2021.04.066] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/09/2021] [Revised: 04/24/2021] [Accepted: 04/24/2021] [Indexed: 12/13/2022] Open
Abstract
A recent advance in the disorder prediction field is the development of quality assessment (QA) scores. QA scores complement the propensities produced by the disorder predictors by identifying regions where these predictions are more likely to be correct. We develop, empirically test and release a new QA tool, QUARTERplus, that addresses several key drawbacks of the current QA method, QUARTER. QUARTERplus is the first solution that utilizes QA scores and the associated input disorder predictions to produce very accurate disorder predictions with the help of a modern deep learning meta-model. The deep neural network utilizes the QA scores to identify and fix the regions where the original/input disorder predictions are poor. More importantly, the accurate QUARTERplus predictions are accompanied by easy-to-interpret residue-level QA scores that reliably quantify their residue-level predictive quality. We provide these interpretable QA scores for QUARTERplus and 10 other popular disorder predictors. Empirical tests on a large and independent (low similarity) test dataset show that QUARTERplus predictions secure AUC = 0.93 and are statistically more accurate than the results of twelve state-of-the-art disorder predictors. We also demonstrate that the new QA scores produced by QUARTERplus are highly correlated with the actual predictive quality and that they can be effectively used to identify regions of correct disorder predictions. This feature empowers the users to easily identify which parts of the predictions generated by the modern disorder predictors are more trustworthy. QUARTERplus is available as a convenient webserver at http://biomine.cs.vcu.edu/servers/QUARTERplus/.
Affiliation(s)
- Akila Katuwawala
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
| | - Sina Ghadermarzi
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
| | - Gang Hu
- School of Statistics and Data Science, LPMC and KLMDASR, Nankai University, Tianjin 300071, China
| | - Zhonghua Wu
- School of Mathematical Sciences and LPMC, Nankai University, Tianjin 300071, China
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
| |
|
48
|
Nagaratnam N, Delker SL, Jernigan R, Edwards TE, Snider J, Thifault D, Williams D, Nannenga BL, Stofega M, Sambucetti L, Hsieh JJ, Flint AJ, Fromme P, Martin-Garcia JM. Structural insights into the function of the catalytically active human Taspase1. Structure 2021; 29:873-885.e5. [PMID: 33784495 DOI: 10.1016/j.str.2021.03.008] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/28/2020] [Revised: 02/07/2021] [Accepted: 03/10/2021] [Indexed: 12/15/2022]
Abstract
Taspase1 is an Ntn-hydrolase overexpressed in primary human cancers, coordinating cancer cell proliferation, invasion, and metastasis. Loss of Taspase1 activity disrupts proliferation of human cancer cells in vitro and in mouse models of glioblastoma. Taspase1 is synthesized as an inactive proenzyme, becoming active upon intramolecular cleavage. The activation process changes the conformation of a long fragment at the C-terminus of the α subunit, for which no full-length structural information exists and whose function is poorly understood. We present a cloning strategy to generate a circularly permuted form of Taspase1 to determine the crystallographic structure of active Taspase1. We discovered that this region forms a long helix and is indispensable for the catalytic activity of Taspase1. Our study highlights the importance of this element for the enzymatic activity of Ntn-hydrolases, suggesting that it could be a potential target for the design of inhibitors with potential to be developed into anticancer therapeutics.
Affiliation(s)
- Nirupa Nagaratnam
- Center for Applied Structural Discovery, Biodesign Institute, Arizona State University, Tempe, AZ 85287, USA
| | - Silvia L Delker
- Beryllium Discovery Corp., with present address of UCB Biosciences, Bedford, MA 01730, USA
| | - Rebecca Jernigan
- Center for Applied Structural Discovery, Biodesign Institute, Arizona State University, Tempe, AZ 85287, USA
| | - Thomas E Edwards
- Beryllium Discovery Corp., with present address of UCB Biosciences, Bedford, MA 01730, USA
| | - Janey Snider
- Division of Biosciences, SRI International Menlo Park, Menlo Park, CA 94025, USA
| | - Darren Thifault
- Center for Applied Structural Discovery, Biodesign Institute, Arizona State University, Tempe, AZ 85287, USA
| | - Dewight Williams
- Eyring Materials Center, Arizona State University, Tempe, AZ 85257, USA
| | - Brent L Nannenga
- Center for Applied Structural Discovery, Biodesign Institute, Arizona State University, Tempe, AZ 85287, USA; Chemical Engineering, School for Engineering of Matter, Transport, and Energy, Arizona State University, Tempe, AZ 85287, USA
| | - Mary Stofega
- Division of Biosciences, SRI International Menlo Park, Menlo Park, CA 94025, USA
| | - Lidia Sambucetti
- Division of Biosciences, SRI International Menlo Park, Menlo Park, CA 94025, USA
| | - James J Hsieh
- Molecular Oncology, Division of Oncology, Department of Medicine, Washington University, St. Louis, MO 63110, USA
| | - Andrew J Flint
- Frederick National Lab for Cancer Research, National Cancer Institute, National Institutes of Health, Frederick, MD 21702, USA
| | - Petra Fromme
- Center for Applied Structural Discovery, Biodesign Institute, Arizona State University, Tempe, AZ 85287, USA.
| | - Jose M Martin-Garcia
- Center for Applied Structural Discovery, Biodesign Institute, Arizona State University, Tempe, AZ 85287, USA; Department of Crystallography and Structural Biology, Institute of Physical-Chemistry "Rocasolano", Spanish National Research Council (CSIC), Madrid 28006, Spain.
| |
|
49
|
In Silico Analysis of Huntingtin Homologs in Lower Eukaryotes. Int J Mol Sci 2021; 22:ijms22063214. [PMID: 33809947 PMCID: PMC8004120 DOI: 10.3390/ijms22063214] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/08/2021] [Revised: 03/09/2021] [Accepted: 03/17/2021] [Indexed: 12/11/2022] Open
Abstract
Huntington’s disease (HD) is a rare neurodegenerative and autosomal dominant disorder. HD is caused by a mutation in the gene coding for huntingtin (Htt). The result is the production of a mutant Htt with an abnormally long polyglutamine repeat that leads to pathological Htt aggregates. Although the structure of human Htt has been determined, albeit at low resolution, its functions and how they are performed are largely unknown. Moreover, there is little information on the structure and function of Htt in other organisms. Comparing Htt homologs can help to establish whether the function of its domains has been conserved during the evolution of Htt in eukaryotes. In this work, through a computational approach, Htt homologs from lower eukaryotes have been analysed, identifying ordered domains and modelling their structure. Based on the structural models, a putative function for most of the domains has been predicted. A putative C. elegans Htt-like protein has also been analysed following the same approach. The results obtained support the notion that this protein is an orthologue of human Htt.
|
50
|
Mészáros B, Hajdu-Soltész B, Zeke A, Dosztányi Z. Mutations of Intrinsically Disordered Protein Regions Can Drive Cancer but Lack Therapeutic Strategies. Biomolecules 2021; 11:biom11030381. [PMID: 33806614 PMCID: PMC8000335 DOI: 10.3390/biom11030381] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 01/25/2021] [Revised: 02/22/2021] [Accepted: 02/24/2021] [Indexed: 12/22/2022] Open
Abstract
Many proteins contain intrinsically disordered regions (IDRs) which carry out important functions without relying on a single well-defined conformation. IDRs are increasingly recognized as critical elements of regulatory networks and have also been associated with cancer. However, it is unknown whether mutations targeting IDRs represent a distinct class of driver events associated with specific molecular and system-level properties, cancer types and treatment options. Here, we used an integrative computational approach to explore the direct role of intrinsically disordered protein regions in driving cancer. We showed that around 20% of cancer drivers are primarily targeted through a disordered region. These IDRs can function in multiple ways which are distinct from the functional mechanisms of ordered drivers. Disordered drivers play a central role in context-dependent interaction networks and are enriched in specific biological processes such as transcription, gene expression regulation and protein degradation. Furthermore, their modulation represents an alternative mechanism for the emergence of all known cancer hallmarks. Importantly, in certain cancer patients, mutations of disordered drivers represent key driving events. However, treatment options for such patients are currently severely limited. The presented study highlights a largely overlooked class of cancer drivers associated with specific cancer types that need novel therapeutic options.
Affiliation(s)
- Bálint Mészáros
- Department of Biochemistry, ELTE Eötvös Loránd University, H-1117 Budapest, Hungary
- EMBL Heidelberg, Meyerhofstraße 1, 69117 Heidelberg, Germany
| | - Borbála Hajdu-Soltész
- Department of Biochemistry, ELTE Eötvös Loránd University, H-1117 Budapest, Hungary
| | - András Zeke
- Institute of Enzymology, RCNS, P.O. Box 7, H-1518 Budapest, Hungary
| | - Zsuzsanna Dosztányi
- Department of Biochemistry, ELTE Eötvös Loránd University, H-1117 Budapest, Hungary
- Correspondence: Tel.: +36-1-372 2500/8537
| |
|